Zillow Orders Data Company to Stop Scraping Its Content

Zillow sent a cease-and-desist letter this week to a data startup that, the letter argues, was improperly “harvesting” data from the online real estate giant.

The letter went out Monday to a company called Datafiniti. The Texas-based company offers “instant access to every data point on the web,” and claims to have data sets that include millions of people, products and businesses.

It provides this data to firms in a variety of industries, including real estate. Its website touts clients such as cable TV channel Nickelodeon and marketing firm Wishpond, and the company’s property section specifically promises to let clients “access every real estate listing on the web instantly.”

However, Zillow claims that Datafiniti was “systematically harvesting” data and content such as photos from Zillow’s website in violation of the portal’s terms of use. The cease-and-desist letter — which Inman has reviewed — specifically states that Zillow’s terms of use don’t allow companies to use robots, web crawlers and other automated tools to access its site. Zillow also doesn’t let other parties reproduce, display or otherwise use Zillow content.

Datafiniti’s alleged content harvesting, the letter indicates, violates these terms and amounts to copyright infringement.

“Zillow Group has worked persistently over the past 16 years to establish goodwill in the real estate industry and your use of data from the Zillow Sites for your commercial purposes is a blatant misrepresentation of a business relationship with Zillow and is causing Zillow reputational harm,” the letter continues.

The letter concludes by ordering Datafiniti to stop using and delete Zillow data.

In a phone conversation Tuesday, Datafiniti founder and CEO Shion Deysarkar characterized the situation as a misunderstanding. He said that his company is crawling the web the way Google does as it compiles and creates search results.

“Instead of consumer search we’re providing more structured search,” Deysarkar said.

He went on to say that Datafiniti clients receive “raw information” about properties, and that the information is publicly available. Deysarkar also said Datafiniti is not building a competitor to Zillow. Instead, Datafiniti is trying to work with companies that focus on things like fraud prevention and pricing analytics, and which might need property data.

While exploring Datafiniti’s website, Inman was able to find photos with watermarks from Redfin, the California Regional Multiple Listing Service (CRMLS) and the Central Panhandle Association of Realtors (CPAR), among others.

A screenshot from Datafiniti’s website showing a photo with a Redfin watermark in the lower left corner. Credit: Datafiniti

In an email, a Redfin spokesperson told Inman the company is “currently investigating this issue.”

“We don’t have specifics to share at this time, but we can reiterate that our terms of use clearly prohibit scraping and scrubbing any websites owned and operated by Redfin,” the spokesperson also said.

CRMLS did not immediately respond to Inman’s requests for comment. In the case of CPAR, a spokesperson told Inman that Datafiniti does not subscribe to its IDX or MLS, and as a result CPAR couldn’t confirm if the company was using its data.

A screenshot from Datafiniti’s website showing a photo with a CRMLS watermark in the lower right corner. Credit: Datafiniti

Asked about the presence of watermarked photos on Datafiniti’s website, Deysarkar told Inman “that’s something that we should change.” Deysarkar also said the images are “not actually something we’ve downloaded,” and that “we’re linking to the Zillow site.”

Deysarkar said he hopes to resolve the situation by explaining to Zillow what his company is doing.

Whether that is enough to satisfy Zillow and potentially other organizations remains to be seen, but the case does highlight the chaotic world of real estate data, as well as the ease with which that data can migrate across virtual locations.

Eric Stegemann, CEO of real estate data firm Tribus, is among those in the industry who have watched the Datafiniti situation unfold. He told Inman that property-oriented technology startups often run the risk of misusing data because they don’t understand the complicated licensing procedures involved in the space. For example, Stegemann said his own company, Tribus, has spent 12 years piecing together agreements with over 300 MLSs.

By contrast, some people or startups mistakenly assume they’re free to use online content as they see fit.

“I think the problem in this space is that there’s an expectation that if something is on the internet then the data should be free,” Stegemann said.

Such situations typically end with cease-and-desist letters, or with demands for compensation.

A high-profile example is the conflict between Zillow and the satirical architecture blog McMansion Hell. In 2017, Zillow demanded that writer Kate Wagner stop using photos from the portal to mock oversized homes on her blog. Zillow eventually backed down, and Wagner said she wouldn’t use the company’s images. But the incident highlights how, even in a relatively low-stakes situation, companies have learned to be protective of the content on their sites.

Stegemann said he was familiar with other situations in which companies appeared to have improperly gathered data. One such situation happened in 2019, when would-be Zillow competitor HouseCanary abruptly turned off its public-facing portal after multiple MLS executives said the company didn’t have permission to use their data.

However, Stegemann called the current situation with Datafiniti “one of the more egregious ones that I’ve seen” because “they’re selling data to third parties.”

“So there are potentially thousands of third parties that have this data set in their hands that shouldn’t have it,” he added.

Given the fragmented landscape, Stegemann expects conflicts over data use to continue being an issue, comparing the situation to a game of whack-a-mole. As a result, he advised organizations, particularly MLSs that might have public-facing listing sites, to actively check if other companies are scooping up their data. He also said real estate organizations should proactively seek out partners that will protect their content.

“Make sure those companies you’re working with,” Stegemann said, “have systems and tools in place to check and block these sorts of scrapers.”

Email Jim Dalrymple II
