A Toronto data scientist is sharing a million-dollar prize won in the obscure world of competitive machine-learning derbies. In this case, the task was to help online real estate listings service Zillow Group make a more accurate “Zestimate” to predict the sales price of a home.
Zillow’s Stan Humphries, chief analytics officer and creator of the Zestimate, presented an oversized novelty cheque in Toronto to Nima Shahbazi, a graduate of Iran’s Sharif University of Technology who holds a PhD in computer science from York University and has studied at the Rotman School of Management. “It’s surreal,” Mr. Shahbazi said. He’s one-third of an international team calling themselves ChaNJestimate that formed midway through the competition; the other members are Chahhou Mohamed of Morocco and Jordan Meyer of the United States.
ChaNJestimate’s winning submission was 13.3 per cent more accurate than Zillow’s existing Benchmark Model score, based on the ultimate selling price of the property.
Zillow is one of the biggest online real estate portals in the United States and said that in the third quarter of 2018 it had an average of more than 186 million unique visits to its sites every month. Last October, it said it would begin adding Canadian properties to its site in partnership with Century 21 Canada, Right at Home Realty, Re/Max Ultimate Realty Inc., Exit Realty Corp International, Better Homes and Gardens Real Estate Signature Service, Core Assets Real Estate and Greater Property Group.
Zillow claims its Zestimate calculator, introduced in 2006, has an error rate of only about 4.5 per cent in its home price predictions. The goal of the competition was to help the company get that error rate below 4 per cent.
It’s not the first time Mr. Shahbazi has won money in machine-learning competitions. He has entered several contests on Kaggle (owned by Alphabet Inc.), where some of the world’s best computer scientists compete to crack difficult big data and artificial intelligence problems.
“I competed for the first time in 2015 in the ICDM competition,” he said, referring to the IEEE International Conference on Data Mining’s 2015 contest to develop ways to track people’s digital identity across devices, “improving marketers’ ability to identify individual users” and target them with “relevant” ads. “It was so cool to work with real data on a real problem and to have my solution ranked against top entries from around the world. In academia there is much less feedback. I ended up placing seventh in that competition and I was hooked.”
As of 2018, Kaggle has more than 2.5 million members and more than 14,000 public datasets to download (plus another 78,000 private datasets). It launched 52 new competitions in 2018, and 181,000 users participated in them. At one time Mr. Shahbazi was ranked the 19th-best Kaggler (despite this win he’s fallen to 64th over all), and he is one of 131 “grandmasters” of the data-science competitions.
All told, Mr. Shahbazi has competed in 20 Kaggle events, coming second five times and finishing in the top 15 another five times. He finished in the top five per cent of challengers in contests for everything from a Grupo Bimbo (the Mexican bakery multinational that owns the Dempster’s, Hostess and Vachon brands in Canada) inventory algorithm to a Home Depot online search relevancy challenge, a drug store sales volume prediction engine, a music recommendation system and even the 2017 Data Bowl, which looked at lung-cancer detection.