By the end of 2017, Yelp had amassed more than 140 million reviews of local businesses. While the company’s mission focuses on helping people find local businesses more easily, this wealth of data has the potential to serve other purposes. For instance, Yelp data might help restaurants understand which markets they should consider entering, or whether to add a bar. It can help real estate investors understand where gentrification might occur. And it might help private equity firms with an interest in coffee decide whether to invest in Philz or Blue Bottle.
The potential value of the large data sets being amassed by private companies raises new opportunities and challenges for managers making strategic data decisions. While there are plenty of well-publicized examples of data repurposing gone wrong, we think it would be a shame for companies to decide the only option is to hoard their data. Before you decide that your data can’t be put to a new use, consider how it might help augment public data sources.
For example, in a recent paper, we explored the potential for Yelp data to measure local economic change and augment the official data, often from the U.S. census, which has long been the bread and butter of economic analyses. Our motivation was simple: Census data is valuable but can be slow-moving and coarse. Public-facing census data can tell you whether more restaurants are opening in a ZIP code, but only after several years. Yelp data can tell you, almost in real time, not only whether restaurants are opening in a ZIP code but even whether more-affordable restaurants are opening on a specific block. We found that Yelp data can help to meaningfully predict trends in the local economy well before census data becomes available, especially in more urban, more educated, and wealthier parts of the country.
This speaks to the broader potential for data from online platforms to improve our understanding of all of America. Just as Yelp can shed light on local economic changes, Zillow could inform our understanding of housing markets, LinkedIn could provide insight about labor markets, and Glassdoor could teach us about the quality of employment options in an area. Companies increasingly recognize the possibility of repurposing their data in these ways for the public good. But repurposing data can have benefits for a company far beyond the warm glow of having done some good. As researchers work with the data, new insights about their data and platform design choices may surface. As policy makers rely on the insights from the data, new relationships can form and facilitate valuable collaborations. Public-facing data efforts can also increase awareness of a company’s brand — allowing companies to do well by doing good.
Of course, there are times when repurposing data is not an option, because the data is either sensitive or not that useful. But we often see examples in which a potentially successful use of a new data source fails to deliver because of poor execution.
Drawing on our academic research assessing repurposed data sources, as well as our work with organizations, we see three simple guiding principles that can help companies repurpose their data successfully.
Principle 1: Understand your unique perspective. When deciding whether and how to use your data, it’s crucial to take the time to understand whether it has real value relative to the information people already have access to. Start by looking for the best data available. Choose a broadly accepted benchmark, and set a narrow goal to see whether and where you can meaningfully add value.
When looking at Yelp data, for example, we treated census data as the benchmark, since it is commonly used in research and policy work. And we set the narrow goal of understanding whether Yelp data can augment existing data points with additional variables and provide more up-to-date information (since it’s updated in real time, while the census happens every 10 years). This kind of incremental improvement can, paradoxically, lead to the largest gains, because it ensures that you are going down the right path.
Principle 2: Develop credible analyses. For every exciting new use of digital data that we’ve come across, we’ve seen countless others fail to deliver. Successfully repurposing data requires taking benchmarks seriously and cross-validating against them. If your data doesn’t match existing benchmarks, then you have to understand why. If the differences are irreconcilable, then you might reconsider the value of your data on that dimension. And if you do go forward with using the data, it’s important to think through the best approach to analysis, taking the mismatch into account.
Credible analytics also requires understanding — and being transparent about — the strengths and limitations of your data. Returning to the Yelp example, we highlighted the strengths above. One limitation is that Yelp coverage varies over time and across places. Maintaining credibility and making the most of the data requires understanding and factoring this and other limitations into the analysis and conclusions drawn from the data.
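As a rough illustration of what cross-validating against a benchmark could look like in practice, here is a minimal Python sketch. It is not the method used in our paper. The function names, the 0.5 coverage threshold, and the per-area counts are all hypothetical and made up for illustration. The sketch compares a platform's business counts with benchmark counts for the same areas, reports how closely the two track each other, and flags areas where the platform appears to cover only a small share of what the benchmark records.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)


def cross_validate(platform, benchmark, min_coverage=0.5):
    """Compare platform counts with benchmark counts, area by area.

    Both arguments map an area label to a count of businesses.
    min_coverage is an assumed, illustrative threshold, not a standard.
    """
    # Only areas present in both sources can be compared.
    areas = sorted(set(platform) & set(benchmark))
    xs = [platform[a] for a in areas]
    ys = [benchmark[a] for a in areas]
    # Coverage: share of benchmark businesses that the platform also lists.
    coverage = {a: platform[a] / benchmark[a] for a in areas if benchmark[a] > 0}
    low = [a for a in areas if a in coverage and coverage[a] < min_coverage]
    return {
        "areas": areas,
        "correlation": pearson(xs, ys),
        "coverage": coverage,
        "low_coverage": low,
    }


# Made-up counts for five hypothetical areas (not real Yelp or census data).
platform_counts = {"A": 80, "B": 45, "C": 10, "D": 120}
benchmark_counts = {"A": 100, "B": 50, "C": 40, "D": 130, "E": 20}

report = cross_validate(platform_counts, benchmark_counts)
print(round(report["correlation"], 3))
print(report["low_coverage"])  # area "C": the platform lists 10 of 40 businesses
```

A high overall correlation alongside a short list of low-coverage areas would suggest the data tracks the benchmark well in most places but should be used cautiously, or reweighted, where coverage is thin. A low correlation is a prompt to understand why the two sources disagree before going further.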
Principle 3: Build partnerships. Even a company that has a great internal data team may not have the right skills to produce public-facing data that will have a real impact. Working with outside researchers and policy makers can help you gauge general interest, build a product that will have credibility, and develop insights that will create value for a broader audience.
There is no such thing as a perfect data set. This is both why new data sources are valuable and why repurposing data can be hard. Tech companies are now collecting unprecedented amounts of data, and they have the potential to greatly improve our understanding of the economy and policy. Yelp ratings are now being used for a variety of purposes, from predicting which restaurants are most likely to have health code violations, to helping understand which businesses are going to be impacted by increases to the minimum wage, to shedding light on how gentrifying neighborhoods are evolving. Other platforms have similar potential. And when done carefully and incrementally, each platform adds one piece to the puzzle, leading to a deeper and more nuanced understanding of the economy — all the while harvesting benefits for the company.