Patreon Unhacked

A brief history of the Patron Watch project.

January 27, 2015

Today I started researching Patreon.

I was talking with my colleague Christina Xu about patterns in crowdfunding and support for creative work over time, and we decided to dig in and find some answers.

Our plan is to monitor the behavior of project funding over time, interview people about their experiences running a project, and mash our findings together into an enlightening story.

The first step was to get our hands on some data. Patreon doesn’t publish a comprehensive index of projects, but I discovered a URL from a previous version of the website that was still working:

https://www.patreon.com/discoverNext?srt=2&p=1

That p parameter is page; by incrementing it, you can scroll through a comprehensive list of projects. srt is probably sort, and 2 appears to indicate reverse chronological order.
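
As a rough sketch of that pagination, the snippet below only builds the index URLs and does not fetch anything. The function names are hypothetical, and the meaning of srt=2 is the guess described above, not something Patreon documented.

```python
from urllib.parse import urlencode

# Index endpoint quoted above; it has since been removed by Patreon.
BASE_URL = "https://www.patreon.com/discoverNext"


def index_page_url(page, sort=2):
    """Build the URL for one page of the project index.

    `sort=2` is assumed to mean reverse chronological order.
    """
    return f"{BASE_URL}?{urlencode({'srt': sort, 'p': page})}"


def index_page_urls(first, last, sort=2):
    """Build URLs for pages `first` through `last`, inclusive."""
    return [index_page_url(page, sort) for page in range(first, last + 1)]


for url in index_page_urls(1, 3):
    print(url)
```

A crawler would walk these pages in order and stop at the first page that comes back empty.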

So there’s the index of projects we need. Add a web scraper for individual project pages, slap together a data model, and we’re in business!

February 3, 2015

Christina and I interviewed Zach Weinersmith about his experiences using Patreon to fund webcomics.

One tidbit we learned was that most of his fans don’t collect their rewards or expect anything in return—they simply want to support him to keep doing what he’s already doing.

His insight was much more interesting (and fun!) than any of the numbers we gathered thus far.

March 16, 2015

Patreon acquired their competitor Subbable and announced a “matching” program to placate Subbable users, offering up to $100,000 in matching funds for the first 45 days of the transition.

March 31, 2015

There was a sudden, sustained spike in all levels of funding. Is it related to the matching program or the acquisition, or is it pure venture capital black magic?

I didn’t touch the crawler.

The median pledge (only counting projects with at least one patron) jumped from $0.42 two days ago to $3.60 today.
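
For reference, here is a minimal sketch of that filter-then-median step. The record layout (a patron count and a pledged amount per project) and the sample values are made up for illustration; they are not our actual data model or crawl results.

```python
from statistics import median


def median_pledge(projects):
    """Median pledged amount over projects with at least one patron.

    Returns None when no project has any patrons.
    """
    funded = [p["pledged"] for p in projects if p["patrons"] >= 1]
    return median(funded) if funded else None


# Hypothetical sample records, not real crawl data.
sample = [
    {"patrons": 0, "pledged": 0.0},   # excluded: no patrons
    {"patrons": 3, "pledged": 12.0},
    {"patrons": 1, "pledged": 1.0},
    {"patrons": 10, "pledged": 55.0},
]

print(median_pledge(sample))  # 12.0
```

Dropping the zero-patron projects matters: a long tail of unfunded projects would otherwise drag the median toward zero.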

May 5, 2015

There was a small but sharp decline in median pledges: down to $3.50 from a steady $4.00. I suppose this was the end of the matching program.

[Figure: Median monthly pledge per project between February and August 2015. Monthly x-axis (March through August), dollar y-axis ($0 to $140), with series for the bottom 25%, median, and top 75%.]

I should note that our numbers don’t always agree with the figures quoted by Patreon. Their framing can be a little weird if you want to reason about a single project, but major discrepancies are most likely due to an incomplete crawl and not any statistical funny business.

July 21, 2015

I shut off Yahoo! Pipes, which had been our main page fetcher, and reluctantly ordered Google Compute Engine (previously the understudy) to pick up the slack.

Pipes had some issues with excessive politeness: it cached aggressively, timed out early, and obeyed robot directives strictly. But the main issue was Yahoo’s announcement that Pipes was shutting down at the end of September.
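
To illustrate what obeying robot directives strictly looks like, here is a small sketch using Python's standard urllib.robotparser. The robots.txt content, the example.com URLs, and the user agent name are invented for the example; this is not how Pipes itself was implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, parsed offline for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def make_checker(robots_txt):
    """Parse robots.txt text and return a parser that answers can_fetch()."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


checker = make_checker(ROBOTS_TXT)

# A strictly polite fetcher skips any URL the rules disallow.
print(checker.can_fetch("example-bot", "https://example.com/private/page"))  # False
print(checker.can_fetch("example-bot", "https://example.com/public/page"))   # True
```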

Pipes was the only way I knew of to make a major search engine scrape the web for you.

August 13, 2015

I checked up on the crawler logs and noticed that Patreon had removed their /discoverNext URL. Good for ’em, tightening up security!

Small parts of the site layout also changed, and this broke the corresponding scraping code.

With the comprehensive feed of new projects gone, and no candidate for a substitute, it seems that the data collection portion of this research project is winding down.

September 30, 2015

Patreon got hacked.

Suddenly I can easily obtain the original database, not just incomplete snapshots. All that web scraping is obsolete in the shadow of this shiny new hi-fi artifact.

But I can’t bring myself to load the database dump. Using hacked data feels morally questionable, while crawling the public web does not.

October 5, 2015

I wonder who is combing through the leaked database. Researchers, business competitors, curious creative types, snoopers, scammers, stalkers, and malicious attackers can all take a peek.

It’s not just the leak that’s making me rethink things, though. The index we were using to find newly created projects is gone. We’ve lost momentum as our attention shifts to other projects. And this feels like a rotten time to interview, study, and audit Patreon’s users, who now have harassment, fraud, and theft to worry about.

We are left with quite a few unanswered questions and unfinished inquiries. These include:

But rather than use the cheat code (or is it the teacher’s edition?), we are letting this project go.