When models begin to influence markets: The Zillow conundrum

by Laurence Hamilton , Chief Commercial Officer , Consilient

At the end of 2025, Zillow, the largest residential real estate listing platform in the United States, quietly removed its climate risk scores. The decision reportedly followed pushback from real estate agents and homeowners who complained that the scores were distorting the value of the properties and ultimately harming sales.

The platform had been displaying climate scores directly on individual property listings, rating exposure to risks such as flooding and wildfire. Buyers saw the scores. Sales dropped. 

Here’s just one example: last year, a Florida mansion was listed for $295m, the highest asking price in the US at the time. Situated in one of the country’s highest flood-risk areas, the property saw multiple price cuts before being taken off the market.

The climate risk had not changed. The property hadn’t moved. What changed was the presence of the climate score on the listing that made buyers factor it into their decision.

Banks know this issue well. When models determine who gets credit, at what rate, or who gets shut out entirely, those models operate under formal governance and review. Decisions have to be explainable and challengeable at an individual level. 

On the face of it, Zillow operated under a different validation and governance context from regulated financial institutions. As the scores were applied at the individual-property level, their market impact raised questions that financial-services governance frameworks are specifically designed to address.

At that point, the model stopped being an informational tool and became a market actor. When model outputs are applied at the individual-asset level and surfaced directly to decision-makers, they don’t just influence single outcomes. At scale, they shape pricing, liquidity, and market behaviour.

That is precisely the threshold at which financial services applies formal model governance. It’s not that the models are wrong; it’s that their consequences are real.

TL;DR
Key takeaways from this article:
🟣Once climate risk scores were displayed at the individual-listing level, they became a visible input into buyer behaviour and pricing.
🟣Scale level model accuracy does not protect against individual level harm at scale that can then have the knock on effect for the whole market.
🟣Financial institutions govern these models because they have learned the cost and consequences of not doing so.
🟣As models increasingly influence asset values outside finance, these governance principles are increasingly becoming critical.

How regulated financial institutions approach comparable model deployments

If a climate risk score with material impact on asset values were deployed inside a regulated financial institution, it would be subject to governance that extends far beyond technical validation. The withdrawal of Zillow’s climate risk scores highlights the types of controls that regulated financial institutions typically apply when models begin to influence market behaviour.

#1. Assessment of material impact

A regulated institution would begin by asking: whether the model could influence credit decisions, collateral valuation, pricing, or customer behavior? Any model capable of affecting property values gets classified as high impact.

That classification would trigger senior ownership, formal risk sign-off, and enhanced oversight before deployment.

#2. Validation beyond predictive performance

Independent review would extend beyond accuracy metrics to examine conceptual soundness, sensitivity to assumptions, and the risks of presenting a single score for complex and uncertain phenomena. Climate models, in particular, would be expected to show how uncertainty is handled and to avoid outputs that could be interpreted as definitive or precise at the individual-property level.

A single, context-free score carries risk when uncertainty is high and consequences are concentrated.

#3. Explanation and challenge at the individual level

Where model outputs affect valuation or lending decisions, institutions are expected to explain how outcomes are reached in a way individuals can understand and contest.

That means surfacing the drivers behind the score, such as elevation, proximity to flood pathways, or other contributing factors, alongside a clear route for review or correction when the outcome is disputed.

#4. Monitoring for unintended consequences

After deployment, regulated institutions monitor how models operate in practice.

That includes tracking unintended effects, like disproportionate value impacts across regions or communities. Evidence of consumer damage or market distortion would normally trigger escalation, recalibration, or suspension well before withdrawal became the only option.

Zillow’s experience illustrates how governance expectations change once models begin to shape economic outcomes at the individual level. The issue was not model quality in isolation, but the absence of formal mechanisms to address uncertainty, consumer impact, explanation, and redress once the scores were presented at the individual-asset level.

None of these controls is novel. They reflect sthe tandards that regulated financial institutions have been operating under for years.

Why models that move asset values require individual-level defensibility

Once a model determines outcomes for individuals, the standard changes: can this specific decision be defended for this specific person, using the information available at the time?

In the United States, that standard is explicit. The Fair Credit Reporting Act gives individuals the right to see the information used about them and to correct inaccuracies. Under the Equal Credit Opportunity Act, adverse decisions must be accompanied by reasons explaining the key factors involved. And enforcement bodies, including the Consumer Financial Protection Bureau, have confirmed these obligations apply even where outcomes are driven by complex or opaque models.

Comparable expectations apply elsewhere. In the UK, the Financial Conduct Authority’s principles-based regime focuses on fair outcomes for individuals. In the EU, the incoming AI Act classifies many financial services use cases as high risk and attaches governance and redress requirements.

And these requirements exist because once an AI model influences value or access at the individual level, aggregate performance no longer cuts it. Zillow’s climate risk scores crossed into this territory once they started shaping liquidity and valuation for individual properties.

That’s when individual-level defensibility becomes mandatory, not after harm occurs, but as a condition of operating models that shape economic outcomes.

Why notification and complaints fail once models shape market behaviour

Notification and complaint processes address error after harm has already landed. And in high-volume, model-driven decisioning, that delay creates real governance problems.

Explanations are often too abstract to support meaningful challenge, particularly where complex models are involved. Data correction is reactive. Individual errors rarely get reassessed automatically, and corrections don’t consistently feed back into model behavior. As a results, models that perform well on average can generate harmful outcomes at the margins.

For regulated institutions, the pattern is familiar. Tail errors at the individual level accumulate into supervisory findings, remediation programs, and legal exposure.

But once a score sits directly in front of the market, the impact is immediate. Liquidity shifted and valuations moved before any practical mechanism existed for an individual to question or correct the underlying assessment.

And where models have this kind of effect, institutions are expected to demonstrate that outcomes can be defended at the level of the individual, reviewed by a human where required, and corrected before harm scales. Notification and complaints alone don’t meet that standard.

Why these lessons extend beyond finance

Financial services learned these lessons early because it was among the first sectors to deploy models as decision-making tools rather than analytical aids. Once models began determining access to credit, pricing risk, or restricting accounts, errors translated directly into consumer harm and institutional exposure. The consequences were immediate, visible, and politically sensitive.

Other sectors are now encountering the same issues, just later. As models in real estate, climate risk, insurance, and digital platforms move from informing judgment to determining outcomes, they’re hitting the same failure modes finance faced years earlier:

  • ➡false precision presented as certainty
  • ➡opaque decisioning with limited ways to challenge
  • ➡individual harm hidden behind strong aggregate performance

And in many of these domains, model outputs sit directly in front of consumers. Scores are treated as authoritative signals with little qualification. Because of that, the distance between prediction and consequence collapses. Error moves fast – through prices, access, behavior.

Insurance shows a partial exception. It adopted models early, but underwriting, pricing, and claims decisions kept human discretion. Models functioned as decision support. Accountability and override stayed with people. That structure absorbed some model error before it reached the consumer.

But as automation increases and pressure grows to remove human intervention, insurance is starting to face similar governance pressures. Human oversight doesn’t eliminate risk, but removing it without putting safeguards in place makes things worse, faster.

The bottom line? Real estate platforms, climate scoring systems, and other market-facing models are now in the same position finance once occupied. Financial crime controls, including AML, have operated under these expectations for years. So, they don’t need new principles. They require those principles to be applied wherever model now shape economic outcomes.

A governance blueprint for high-stakes models

Financial services learned its governance lessons the hard way, and decades ago. Strong performance at scale never removed the need for explanation, correction, and accountability when decisions affect individuals. 

Zillow’s climate risk scores hit the same challenges. Once the score started influencing liquidity and valuation, the absence of explanation, challenge, and uncertainty handling became impossible to ignore.

As models move deeper into real estate, insurance, and other market-facing domains, similar pressures will follow. Regulation may arrive unevenly, but expectations around accountability are already forming.

That’s why finance provides a working reference. It shows how high-impact models can operate at scale while remaining defensible for the individuals affected. High-stakes models need governance that recognizes where error lands and how outcomes can be examined when they carry lasting consequences. Talk to us about building explanation, correction, and accountability into high-impact decisioning.

Media Contact Email: enquiry@consilient.com

February 3, 2026 | Blog