KDD 2020: Data Science for the Real Estate Industry

In August 2020, Cherre’s CTO Ron Bekkerman presented a tutorial on Data Science for the Real Estate Industry in collaboration with Foster Provost (NYU/Compass), Ali Rauh (Airbnb), and Vanja Josifovski (Airbnb) at the virtual KDD-2020 event.

Tutorial Overview

The multi-trillion-dollar real estate industry has been slow to enter the twenty-first century. It is one of the oldest and largest industries in the world, and is notoriously resistant to change. As a matter of fact, it has not changed dramatically over the past 3,000 years. Real estate investments are still made based on gut feelings. Market research is still done through legwork, which makes real estate investment strategies extremely local and undiversified. A company can be one of the largest real estate investors in the world while owning just a few (landmark) properties in one neighborhood. If something goes wrong with the neighborhood, the company gets in trouble. On the other hand, diversification is risky too, as the company would have to enter the “uncharted waters” of a different neighborhood.

Change has come. Real estate data has become increasingly available, allowing extensive market research, predictive modeling, trend and anomaly detection, visualization, and more. Many traditional players have found this modernization difficult to adjust to, which creates rare opportunities for industry disruption. For example, Airbnb has already completely changed the short-term rental market, and data analytics plays no small role. Zillow has changed how consumers perceive the residential real estate market, primarily through data-driven estimates of property value. A few traditional real estate players, including large brokerages, lenders, property insurers, and others, have taken the first steps towards adopting data-driven methodologies. Over the next decade, data scientists will bring massive change to this huge industry.

Surprisingly, the data mining community has paid very little attention to the problems being tackled in real estate. This tutorial’s primary goal was to fill this gap by introducing the world of real estate data science to data scientists outside the industry. The presenters took the first steps toward preparing data scientists to make substantive contributions to the business and science of real estate, and invited researchers to work on real estate problems.

The tutorial began with a short “Real Estate 101” course, introducing basic concepts, terminology, and the many different real estate businesses. Next, the tutorial covered a variety of real estate opportunities and challenges that can be addressed by data scientists. The presenters then suggested general problems for which data science methods are well suited. The tutorial highlighted opportunities for data science methods that are vital but have received less attention from the KDD community. The real estate industry provides an exciting complement to traditional areas like ad tech: instead of very high-volume, low-confidence estimates, real estate often requires very high-confidence estimates from only moderate amounts of data.

The presenters then dug deeper into a few specific areas to provide a more technical introduction to the breadth and depth of the opportunity. For example, property valuation and pricing are vital across much of the real estate industry. The presenters covered several different approaches to real estate valuation, including replacement cost assessment, comparables assessment, the income valuation model, the repeated sales model, hedonic models, etc. They illustrated how these approaches align with data science methods, and how modern data science techniques add value to traditional approaches.
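Two of the approaches named above can be illustrated with a minimal Python sketch. This is not taken from the tutorial: the function names, the use of a median price per square foot for comparables, and all figures are assumptions chosen for illustration. The income approach is shown in its simplest direct-capitalization form, where value equals annual net operating income divided by a capitalization rate.

```python
from statistics import median


def income_value(net_operating_income: float, cap_rate: float) -> float:
    """Income approach (direct capitalization): value = annual NOI / cap rate.

    Hypothetical helper; cap_rate is a fraction, e.g. 0.06 for 6%.
    """
    if cap_rate <= 0:
        raise ValueError("cap_rate must be positive")
    return net_operating_income / cap_rate


def comparables_value(subject_sqft: float,
                      comps: list[tuple[float, float]]) -> float:
    """Comparables approach, simplified: median price per square foot of
    recent comparable sales, multiplied by the subject property's size.

    comps is a list of (sale_price, square_feet) pairs. Real comparables
    assessments adjust for many more attributes; this is an assumption-laden
    simplification.
    """
    if not comps:
        raise ValueError("at least one comparable sale is required")
    price_per_sqft = [price / sqft for price, sqft in comps]
    return median(price_per_sqft) * subject_sqft


if __name__ == "__main__":
    # Illustrative numbers only.
    print(income_value(120_000, 0.06))
    print(comparables_value(1100, [(500_000, 1000),
                                   (660_000, 1200),
                                   (450_000, 1000)]))
```

A hedonic model generalizes the comparables idea by regressing sale price on many property attributes at once rather than on size alone.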

During the KDD event, the presenters introduced some data science concepts that were not broadly familiar to the KDD audience, but should be. For example, they covered methods for producing ranges/intervals for model-based valuation estimates, an important topic well beyond real estate that has seen relatively little treatment in data mining research.
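The source does not say which interval methods the presenters covered. As one well-known option, the sketch below uses split conformal prediction: hold out a calibration set of properties with known sale prices, compute the absolute errors of an existing valuation model on that set, and use a high quantile of those errors as the half-width of the interval around any new estimate. The function names and numbers are hypothetical, and the coverage guarantee assumes calibration and new properties are exchangeable.

```python
import math


def conformal_halfwidth(actual, predicted, alpha=0.1):
    """Half-width of a split-conformal interval with target coverage 1 - alpha.

    actual and predicted are sale prices and model estimates for a held-out
    calibration set that was not used to fit the valuation model.
    """
    residuals = sorted(abs(a - p) for a, p in zip(actual, predicted))
    n = len(residuals)
    # Rank of the conformal quantile: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        raise ValueError("too few calibration points for this alpha")
    return residuals[k - 1]


def valuation_interval(point_estimate, halfwidth):
    """Symmetric interval around a model's point estimate."""
    return (point_estimate - halfwidth, point_estimate + halfwidth)


if __name__ == "__main__":
    # Illustrative prices in thousands of dollars.
    predicted = [100] * 9
    actual = [101, 98, 103, 96, 105, 94, 107, 92, 110]
    hw = conformal_halfwidth(actual, predicted, alpha=0.2)
    print(valuation_interval(500, hw))
```

A fixed half-width is the simplest choice. Intervals that scale with the price level or with model uncertainty are usually more realistic for real estate, where errors on expensive properties tend to be larger in absolute terms.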

Finally, they introduced the real estate industry-wide knowledge graph and got into the details of its construction process, its characteristics, its challenges, and its role in developing novel methodologies for making macro- and micro-level market predictions.

Due to current events, the presenters also included a special section on challenges and opportunities in crisis situations, such as helping to house first responders, open house programs in the age of social distancing, and accommodation utilization/availability.

Presenters

RON BEKKERMAN is the Chief Technology Officer of Cherre Inc., an AI-powered real estate data integration platform. From 2013 to 2018, Ron was Assistant Professor and Director of the Big Data Science Lab at the University of Haifa, Israel. Prior to that, he was the Chief Data Officer of Viola Ventures, a founding member of the Data Science team at LinkedIn, and a Research Scientist at HP Labs in the Bay Area. He received his B.Sc. and M.Sc. in Computer Science from the Technion – Israel Institute of Technology, and his Ph.D. in Machine Learning from the University of Massachusetts, Amherst.

FOSTER PROVOST is a Distinguished Scientist for real-estate tech unicorn Compass, and Professor of Data Science, Professor of Information Systems, Andre Meyer Faculty Fellow at NYU’s Stern School of Business, Director of the NYU Stern Fubon Center’s Data Analytics and AI Initiative, and former Director of NYU’s Center for Data Science. Foster previously was Program coChair for KDD, General coChair for IEEE DSAA, Editor-in-Chief of the journal Machine Learning, and founder/organizer for many workshops, including some that have achieved notable long-term success (such as HCOMP).

ALI RAUH leads the marketplace dynamics data science team at Airbnb. Her team works on delivering maximum value to Airbnb’s customers, community and stakeholders through pricing, monetization, cancellation policies, competitive intelligence, and supply and demand intelligence. Prior to joining Airbnb, Ali worked at Cornerstone Research where she applied economic analyses to complex business litigation involving antitrust, labor, consumer fraud and product liability, intellectual property, and other matters. Ali holds a PhD in Economics from the University of Chicago.

VANJA JOSIFOVSKI is the Chief Technology Officer, Homes at Airbnb, where he leads efforts around developing technical vision and direction across the Homes business. He leads the Engineering, Data Science, Marketplace Dynamics, and Search Ranking functions. Vanja was most recently CTO at Pinterest, where he set the technical strategy in areas like machine learning and search. Prior to this role, he held positions as the Head of Discovery, Ads Engineering, and Growth Engineering. Before joining Pinterest, Vanja worked on large-scale machine learning and information extraction as a Technical Lead at Google Research. His career began with roles at Yahoo Research and IBM Research. Vanja holds a PhD in large-scale database systems.