Zillow Users Will Lose with “Winning” Million-Dollar Contest’s Home Valuation Model
Zillow recently named a million-dollar winner in a Kaggle contest that challenged entrants to boost the accuracy of its home valuation algorithm, and the prize went to the wrong team. Zillow users will lose with this “winning” home valuation model, because accuracy isn’t impact.
While the winning model had a 0.5% increase in accuracy, it still had an average error of 4%. Zillow misses one fundamental point: Every stakeholder of Zillow is affected differently by accuracy.
The home pricing game is one of negotiation and asymmetric costs. If Zillow underestimates the correct price, the seller may lose significant money today. For example, if I am selling a $500k home, Zillow’s improved model can still underestimate by enough that I lose the value of a new Mercedes. On the other hand, if Zillow overestimates the correct price, the buyer may pay more, but much of that is usually paid out over 30 years, so the immediate impact is less than it is on the seller. The realtors, of course, make more money if the house sells for more. Clearly, overestimates and underestimates have very different impacts on the different Zillow stakeholders. Simple accuracy measures don’t distinguish between overestimates and underestimates. Thus, accuracy just doesn’t measure the impact the AI model has on the buyers, the sellers, or the market.
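The asymmetry above can be made concrete with a minimal sketch. The prices and penalty weights below are hypothetical illustrations, not Zillow's actual metric: two models that a symmetric error measure scores identically can look very different once underestimates are penalized for the seller's immediate loss.

```python
# Hypothetical illustration: two models with the same absolute error
# can have very different impacts on sellers vs. buyers.
true_price = 500_000

# Model A underestimates; Model B overestimates. Both are off by 4%.
model_a_estimate = true_price * 0.96   # seller may lose $20k today
model_b_estimate = true_price * 1.04   # buyer overpays, spread over 30 years

def mean_absolute_error(truth, estimate):
    return abs(truth - estimate)

# A symmetric accuracy metric scores both models identically...
assert mean_absolute_error(true_price, model_a_estimate) == \
       mean_absolute_error(true_price, model_b_estimate)

# ...but an impact-weighted loss can penalize underestimates more heavily,
# reflecting the seller's immediate, concentrated cost (weights are assumed).
def asymmetric_loss(truth, estimate, under_weight=2.0, over_weight=1.0):
    error = truth - estimate
    return under_weight * error if error > 0 else over_weight * -error

print(asymmetric_loss(true_price, model_a_estimate))  # → 40000.0
print(asymmetric_loss(true_price, model_b_estimate))  # → 20000.0
```

Picking the weights is exactly the stakeholder-balancing question a pure accuracy contest never asks.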
By measuring only accuracy, Zillow has silently put in place an “improved” model that favors either buyers or sellers, but it has not told us which group it just helped.
Let’s take a simpler example: another famous million-dollar Kaggle prize, awarded by Netflix. If Netflix recommends a Real Housewives show to me, I would wonder how it could possibly think I would like that show. I may even stop trusting Netflix recommendations after such a bad experience. This is called a False Positive: the AI predicts I would like the show, but is wrong. If Netflix fails to recommend the latest Jackie Chan movie to me, that is a False Negative: the AI predicts I won’t like a movie that I actually would have enjoyed. However, I may not even notice that the AI failed to recommend this movie. If I do notice, I may think that I should have rated more Jackie Chan movies, or that this Jackie Chan movie is not that great after all. Either way, the negative impact of the False Negative is much less than that of the False Positive. Once again, the benefits and costs of different kinds of errors are clearly not symmetric.
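The same idea can be sketched for a recommender. The cost numbers below are assumptions chosen only to illustrate the argument; the point is that two models with the same error count diverge sharply once False Positives are weighted for their trust-eroding effect.

```python
# Sketch of cost-sensitive evaluation for a recommender (costs are assumed).
# A False Positive (bad recommendation shown) erodes trust more than a
# False Negative (good movie silently missed), so it gets a heavier weight.

FP_COST = 5.0   # assumed: user sees a bad recommendation and loses trust
FN_COST = 1.0   # assumed: user never even notices the missed recommendation

def expected_cost(false_positives, false_negatives):
    """Total impact of a model's errors, not just their count."""
    return FP_COST * false_positives + FN_COST * false_negatives

# Two models with identical "accuracy" (10 total errors each):
model_x = expected_cost(false_positives=8, false_negatives=2)   # → 42.0
model_y = expected_cost(false_positives=2, false_negatives=8)   # → 18.0

# Accuracy sees them as equal; an impact measure clearly prefers model_y.
assert model_y < model_x
```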
How, then, can we justify simple accuracy measures that don’t consider the impact of different kinds of errors on the end user? We need to start every AI project by clearly understanding the cost of different types of errors and balancing the impact on the different stakeholders. If we skip this step, we risk unintentionally harming the very people we were trying to help by building a better AI model. This is the core of ethics in AI: understanding the implications of an AI system before deploying it.
If we can’t evaluate AI on accuracy alone, then what is the right approach? The answer is impact. Dozens of losing, lower-accuracy models would have been better for Zillow users if we actually considered how buyers, sellers, and realtors are each impacted differently by a sale. Unless a model is adjusted to balance the unique needs of each group, accuracy can prove misleading and dangerous.
The Zillow Prize should have gone to the team whose model took the interests of both the buyer and the seller into account. Such a contestant would have been further down the leaderboard, with a lower accuracy rate but a fairer model. The ivory-tower approach of measuring AI only for accuracy makes silent judgments that directly impact markets, businesses, and people. AI is no longer just the domain of experts; it affects all of us once it starts determining which movie we watch and the value of our house. It’s time for us to recognize that model accuracy simply does not measure how AI affects stakeholders. Stop competing on accuracy. It’s just wrong.