
Pete Warden, Facebook info harvester, explains why he deleted all his data

In recent days, media outlets nationwide have told the story of Boulder-based Pete Warden, who created a database of information about Facebook's approximately 215 million account holders. Facebook accused Warden of violating the rules of its site and threatened him with legal action -- a prospect he took seriously enough to destroy all the info he'd gleaned to date.

Turns out that Warden's project had raised eyebrows in the tech community for months: Check out this February post from MichaelZimmer.org, which challenges the ethics of the database. "Just because these Facebook users made their profiles publicly available does not mean they are fair game for scraping for research purposes," Zimmer argues, adding that the approach "poses a serious privacy threat to the subjects in the dataset, their friends, and perhaps unknown others."

Warden doesn't see it that way.

On his personal blog, PeteSearch, Warden has posted a long explanation of what went down, headlined "How I Got Sued by Facebook."

Here's how he explains his basic concept:

I scratched my head a bit and thought "well, how hard can it be to build my own search engine?". As it turned out, it was very easy. Checking Facebook's robot.txt, they welcome the web crawlers that search engines use to gather their data, so I wrote my own in PHP (very similar to this Google Profile crawler I open-sourced) and left it running for about 6 months. Initially all I wanted to gather was people's names and locations so I could search on those to find public profiles. Talking to a few other startups they also needed the same sort of service so I started looking into either exposing a search API or sharing that sort of 'phone book for the internet' information with them.
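For readers wondering what "checking robots.txt" involves in practice, here is a minimal sketch in Python (Warden says he wrote his crawler in PHP; this is not his code). It uses the standard library's urllib.robotparser to ask whether a given crawler may fetch a given URL. The sample rules, the "ExampleBot" user agent and the example.com URLs are all invented for illustration and are not Facebook's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for illustration only.
# It is NOT a copy of any real site's file.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and the URLs below are placeholders.
    for url in (
        "https://example.com/people/jane-doe",
        "https://example.com/private/settings",
    ):
        verdict = "allowed" if is_allowed(SAMPLE_ROBOTS_TXT, "ExampleBot", url) else "blocked"
        print(url, "->", verdict)
```

A real crawler would download the site's live robots.txt before fetching anything else. Keep in mind that robots.txt is a voluntary convention for crawlers; passing this check says nothing about a site's terms of service, which is where Warden's trouble began.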

Warden subsequently set up a website called FanPageAnalytics.com, which he saw as having commercial applications. But in early February, after publishing "How to Split Up the US," an article drawn from some of his findings, he got a call from a Facebook attorney.

After contacting a lawyer of his own, Warden came to the conclusion that while he could fight Facebook's demand that he call a halt to his project, "the legal costs alone of being a test case would bankrupt me."

Hence, his decision to destroy his database. He concedes that he's "just glad that the whole process is over." However, he adds:

I'm bummed that Facebook are taking a legal position that would cripple the web if it was adopted (how many people would Google need to hire to write letters to every single website they crawled?), and a bit frustrated that people don't understand that the data I was planning to release is already in the hands of lots of commercial marketing firms, but mostly I'm just looking forward to leaving the massive distraction of a legal threat behind and getting on with building my startup. I really appreciate everyone's support, stay tuned for my next project!

No doubt plenty of folks will be watching his next moves closely.

Michael Roberts has written for Westword since October 1990, serving stints as music editor and media columnist. He currently covers everything from breaking news and politics to sports and stories that defy categorization.