KDD 2020: Data Science for the Real Estate Industry

On August 23, 2020, Cherre’s CTO Ron Bekkerman will co-present a KDD-2020 tutorial on Data Science for the Real Estate Industry, in collaboration with Foster Provost (NYU/Compass), Ali Rauh (Airbnb), and Vanja Josifovski (Airbnb).

Tutorial Overview

The multi-trillion-dollar real estate industry has been slow to enter the twenty-first century. It is one of the oldest and largest industries in the world, and is notoriously resistant to change. As a matter of fact, it has not changed dramatically over the past 3,000 years. Real estate investments are still made based on gut feelings. Market research is still done through legwork, which makes real estate investment strategies extremely local and undiversified. A company can be one of the largest real estate investors in the world while owning just a few (landmark) properties in one neighborhood. If something goes wrong with the neighborhood, the company gets in trouble. On the other hand, diversification is risky too, as the company would have to enter the “uncharted waters” of a different neighborhood.

Change has come. Real estate data has become increasingly available, allowing extensive market research, predictive modeling, trend and anomaly detection, visualization, and more. Many traditional players have found this modernization difficult to adjust to, creating rare opportunities for industry disruption. For example, Airbnb has already completely changed the short-term rental market, and data analytics plays no small role. Zillow has changed how consumers perceive the residential real estate market, primarily through data-driven estimates of property value. A few traditional real estate players, including large brokerages, lenders, property insurers, and others, have taken the first steps towards adopting data-driven methodologies. Over the next decade, data scientists will bring massive change to this huge industry.

Surprisingly, very little attention has been paid by the data mining community to the problems being tackled in real estate. This tutorial’s primary goal is to fill this gap by introducing the world of real estate data science to non-real estate data scientists. We take the first steps toward preparing data scientists to make substantive contributions to the business and science of real estate, and invite researchers to work on real estate problems. 

The tutorial will start with a short “Real Estate 101” course, introducing basic concepts, terminology, and the many different real estate businesses. Then, we will familiarize the audience with a variety of real estate opportunities and challenges that can be addressed by data scientists. We will suggest general problems for which data science methods are well suited. We will also highlight opportunities for data science methods that are vital but have received less attention from the KDD community. The real estate industry provides an exciting complement to traditional areas like ad tech: instead of very high-volume, low-confidence estimates, real estate often requires very high-confidence estimates from only moderate amounts of data.

The tutorial will then dig into one or two specific areas to provide a more technical introduction to the breadth and depth of the opportunity. For example, property valuation and pricing are vital across much of the real estate industry. We will present several different approaches to real estate valuation, including replacement cost assessment, comparables assessment, the income valuation model, the repeated sales model, hedonic models, etc. We will illustrate how these approaches align with data science methods, and how modern data science techniques can add value to traditional approaches.
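To make two of the approaches named above concrete, here is a minimal Python sketch of a comparables assessment and an income valuation. It is an illustration under simplifying assumptions, not the presenters' method: the function names, the price-per-square-foot heuristic, and all numbers are hypothetical, and real appraisals adjust for many more property attributes.

```python
from statistics import median


def income_value(net_operating_income: float, cap_rate: float) -> float:
    """Income approach (direct capitalization): annual NOI divided by cap rate."""
    if cap_rate <= 0:
        raise ValueError("cap_rate must be positive")
    return net_operating_income / cap_rate


def comparables_value(subject_sqft: float, comps: list) -> float:
    """Simplified comparables approach.

    `comps` is a list of (sale_price, square_feet) pairs for similar,
    recently sold properties. The median price per square foot is
    scaled to the subject property's size.
    """
    if not comps:
        raise ValueError("at least one comparable sale is required")
    price_per_sqft = [price / sqft for price, sqft in comps]
    return median(price_per_sqft) * subject_sqft


# Hypothetical inputs, for illustration only.
example_comps = [(500_000.0, 1_000.0), (660_000.0, 1_200.0), (450_000.0, 1_000.0)]
print(comparables_value(1_100.0, example_comps))
print(income_value(60_000.0, 0.05))
```

A hedonic model generalizes the comparables idea by regressing sale price on property attributes (size, location, age, and so on) instead of relying on a single price-per-square-foot figure.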

We will introduce some data science concepts that are not broadly familiar to the KDD audience, but should be. For example, we will discuss methods for producing ranges/intervals for model-based valuation estimates, an important topic well beyond real estate that has seen relatively little treatment in data mining research.
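As one illustration of how an interval can be attached to a point valuation, the sketch below uses split conformal prediction: absolute errors on a held-out calibration set determine a half-width that is added to and subtracted from each new estimate. This is an assumed example technique, not necessarily one the tutorial covers, and the function names and data are hypothetical.

```python
import math


def conformal_halfwidth(calib_actual, calib_pred, alpha=0.1):
    """Half-width of a split conformal interval at miscoverage level alpha.

    Uses the k-th smallest absolute calibration residual, where
    k = ceil((n + 1) * (1 - alpha)). If k exceeds n, the calibration set
    is too small for the requested coverage and the width is infinite.
    """
    residuals = sorted(abs(a - p) for a, p in zip(calib_actual, calib_pred))
    n = len(residuals)
    if n == 0:
        raise ValueError("calibration set must not be empty")
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return residuals[k - 1]


def valuation_interval(point_estimate, halfwidth):
    """Symmetric interval around a model's point estimate."""
    return (point_estimate - halfwidth, point_estimate + halfwidth)


# Hypothetical calibration data: actual sale prices vs. model estimates.
actual = [410_000, 525_000, 390_000, 610_000, 480_000,
          555_000, 430_000, 700_000, 365_000, 590_000]
predicted = [400_000, 540_000, 395_000, 580_000, 470_000,
             560_000, 450_000, 660_000, 370_000, 600_000]
width = conformal_halfwidth(actual, predicted, alpha=0.2)
print(valuation_interval(500_000, width))
```

The interval is only as good as the calibration data: it assumes new properties resemble the calibration properties, and it has the same width for every property, which more refined methods relax.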

Finally, we will introduce the real estate industry-wide knowledge graph and get into the details of its construction process, its characteristics, its challenges, and its role in developing novel methodologies for making macro- and micro-level market predictions.

Given current events, we will also include a special section on challenges and opportunities in crisis situations, such as helping to house first responders, running open house programs in the age of social distancing, and tracking accommodation utilization and availability.

This tutorial will whet the appetite of the data mining research community for real estate problems. As more and more data scientists turn their attention to real estate challenges, and we see increasing conference submissions and real-world applications, this tutorial will have prepared KDD attendees to understand that work more deeply, and to contribute themselves.

Presenters

RON BEKKERMAN is the Chief Technology Officer of Cherre Inc., an AI-powered real estate data integration platform. From 2013 to 2018, Ron was Assistant Professor and Director of the Big Data Science Lab at the University of Haifa, Israel. Prior to that, he was the Chief Data Officer of Viola Ventures, a founding member of the Data Science team at LinkedIn, and a Research Scientist at HP Labs in the Bay Area. He received his B.Sc. and M.Sc. in Computer Science from the Technion – Israel Institute of Technology, and his Ph.D. in Machine Learning from the University of Massachusetts, Amherst.

FOSTER PROVOST is a Distinguished Scientist for real-estate tech unicorn Compass, and Professor of Data Science, Professor of Information Systems, Andre Meyer Faculty Fellow at NYU’s Stern School of Business, Director of the NYU Stern Fubon Center’s Data Analytics and AI Initiative, and former Director of NYU’s Center for Data Science. Foster previously was Program Co-Chair for KDD, General Co-Chair for IEEE DSAA, Editor-in-Chief of the journal Machine Learning, and founder/organizer for many workshops, including some that have achieved notable long-term success (such as HCOMP).

ALI RAUH leads the marketplace dynamics data science team at Airbnb. Her team works on delivering maximum value to Airbnb’s customers, community and stakeholders through pricing, monetization, cancellation policies, competitive intelligence, and supply and demand intelligence. Prior to joining Airbnb, Ali worked at Cornerstone Research where she applied economic analyses to complex business litigation involving antitrust, labor, consumer fraud and product liability, intellectual property, and other matters. Ali holds a PhD in Economics from the University of Chicago.

VANJA JOSIFOVSKI is the Chief Technology Officer, Homes at Airbnb, where he leads efforts around developing technical vision and direction across the Homes business. He leads the Engineering, Data Science, Marketplace Dynamics, and Search Ranking functions. Vanja was most recently CTO at Pinterest, where he set the technical strategy in areas like machine learning and search. Prior to this role, he held positions as the Head of Discovery, Ads Engineering, and Growth Engineering. Before joining Pinterest, Vanja worked on large-scale machine learning and information extraction as a Technical Lead at Google Research. His career began with roles at Yahoo Research and IBM Research. Vanja holds a PhD in large-scale database systems.