Reddit co-founder, mass downloader of scholarly work, faces 13 felonies

Aaron Swartz was charged under a 1984 Internet law


(Fred Benenson/Flickr/Wikimedia Commons)

And now the curious case of 25-year-old Aaron Swartz.

By any measure, he is a brilliant kid. He co-wrote the specs for RSS 1.0 (really simple syndication) when he was 14. When he hit 20, Swartz sold Reddit, the social news site he co-founded, to Condé Nast for some small fortune, and then dedicated himself to non-profit work for the public good.

His philanthropy didn’t involve gala fundraisers. Swartz became an aggressive activist, an agitator, and depending on your view of things, a data radical.

One major project was Demand Progress, his PAC (political action committee). The group pioneered online activism tactics and was a major player in the successful campaign against SOPA and PIPA, proposed legislation that would have curbed civil rights online. But this isn’t what got Aaron Swartz into trouble.

Somewhere along the way he began using data analysis research to track corruption. He crunched a data set of over 400,000 law review articles to determine the source of their funding.  This led to a fellowship at the Harvard Ethics Center Lab on Institutional Corruption. That gave him access to MIT’s libraries, which in turn had access to JSTOR, a non-profit online database of journal articles.

Aaron Swartz allegedly set out to liberate these articles. JSTOR might be a non-profit, but using it costs money. It has a massive trove of searchable, high-quality peer-reviewed academic work that it separates from the public with a paywall.

Swartz is accused of writing a program that began automatically scraping JSTOR’s 38 million pages of text, one at a time, and dumping them onto a removable hard drive hooked up to an Acer laptop that Swartz allegedly hid in an MIT library closet. He allegedly returned to the library periodically to switch full hard drives with empty ones, covering his face with his bike helmet as he passed surveillance cameras.

Why might he have done this? Swartz has a history of acting as a sort of data Robin Hood. In 2009 PACER, a walled database of public court documents, was made available for free at 17 public libraries. Access to it would have normally cost 8 cents a page. Swartz installed a scraper app that sent the pages directly to the cloud, where they were released into the wild, free for anyone to access. When I asked him about it in an interview, he told me that “it’s our job for those of us on the outside to keep pushing… and showing how important it is to get things on the web”.

The FBI investigated Swartz for releasing the PACER documents, but pressed no charges. He hasn’t been so lucky with the JSTOR data. He now faces 13 felony charges—9 were laid last year, and an additional 4 just last week. He could spend decades in prison, and face fines of up to $1 million. Some of the charges have to do with the damage he supposedly did to MIT’s servers, but others seem far more quotidian. Swartz is accused of using a false email address, a made-up guest login name, and violating JSTOR’s clickable terms of service agreement. How are these infractions considered felonies?  Because of the 1984 Computer Fraud and Abuse Act, a law crafted well before the Internet was in common use, which was intended to provide stiff penalties for those who hacked into corporate or government systems to steal data or destroy them.

A federal appeals court found that if the CFAA was applied to things like violations of websites’ terms of service agreements, then “millions of unsuspecting individuals would find that they are engaging in criminal conduct.” It ruled that these kinds of cases were better left to civil courts. But this ruling only covers certain states in the West. In Massachusetts, where Swartz is accused of ignoring JSTOR’s terms, clicking “I agree” without reading and obeying could land you in a federal prison, a convicted felon.

Swartz pleaded not guilty to the original charges and has promised to do the same with regard to the new felonies. His activism, it seems, will now move on to challenging the validity of an almost 30-year-old computer law that makes criminals out of anyone who has ever used a fake name on a website, and which threatens his own freedom.

Follow Jesse Brown on Twitter @JesseBrown



  1. Jesse, this is a little weak. While Swartz did apparently provide a fake email and ID, he also intentionally undermined the financial model JSTOR used to pay for its activities. Given that JSTOR appears to be an entity whose purpose is to provide a public service at cost, there is clearly a case for a criminal prosecution. After all, the end game might be that JSTOR ceases to function and the public’s access to future work is diminished. The distinction between Swartz’s destructive (to JSTOR) activities and someone who benignly doesn’t follow all of the rules should be clear enough to everyone, including the courts.

  2. “Aaron Swartz allegedly set out to liberate these articles.”

    That’s one way to look at it. Another way to look at it is he set out to steal these articles.

    • But the way Aaron Swartz looked at it, it was liberation. Which is probably why the author chose that word in telling the story. It’s called writing.

      • It is called writing – I believe the ‘That’s one way to look at it’ covers that point.

        The ‘Another way to look at it is he set out to steal these articles’ illustrates my opinion as opposed to the author’s (or Swartz’s) – it is called reading.

    • You clearly have no knowledge of the law or your rights. You do not understand the difference between criminal and civil offenses, and you do not understand the definition of “theft”. What you are referring to is “copyright infringement” and the law is very clear (or at least once was) that copyright infringement is a civil offense.

      Now here’s a little thought exercise: how is Google cache any different?

      • I did not realize the word ‘liberate’ was a technical, legal term in copyright law, such that my reply could have been construed as espousing a technical, legal term.

        However, in the USA copyright infringement can be a criminal offense. Here is an overview of US criminal copyright law.

        I doubt he meets the criteria for criminal copyright prosecution but it is silly to proclaim such a thing doesn’t exist.

        I don’t know enough about Google cache to comment on what differences may or may not exist.

  3. JSTOR did not press charges. These are criminal charges filed by the US Attorney for the District of Massachusetts. Subsequent to the events at issue, JSTOR has established a limited public access program. JSTOR operates only by the leave of major copyright holders and with funding from (initially) foundations and most universities, large public libraries and well-funded secondary school libraries. JSTOR has a history of working with text-mining and other “non-consumptive” research users. Their chief complaint here was related to server loads and bandwidth, at least according to their contemporaneous press release.

  4. Robin Hood (even in digital form) robs the rich to give to the poor. He doesn’t rob non-profit companies to give to, well, whoever he was giving the files to.

  5. First of all, for the millionth time, copyright infringement is not stealing. Just like saying the earth is round is not blasphemy, or abortion is not murder. Don’t confuse, or reinforce artificial confusion of, distinct concepts. Had this copyright infringement = stealing thing worked, there would be no piratebay by now would there? And in the same vein, Swartz is no Robin Hood.

    Second (this one is based on a specific ethical point of view, rather than a legal one), the majority of JSTOR’s content, unlike other academic archives, is electronic copies of works which are actually in the public domain. There are 18th century journals in JSTOR’s archives. All JSTOR does is to make them accessible in a new format. I’m not claiming that this is a cheap or simple operation. But JSTOR should have recouped most of the expenses since 1995, shouldn’t it? Furthermore, indirect copyrights such as those JSTOR exerts over the public domain works it makes accessible are not morally as defensible as the direct ones the authors had. For the minority of JSTOR’s content, which is newer, this objection does not hold though.

    • So.. once you recoup any expenses you’ve incurred, you’re not entitled to make any more money off of the work you’ve done?

      • Not if you’re a non-profit.

        • Come back when you understand what a non-profit is.

          Hint: “making money” does not have to be “profit”
          Hint 2: “how do they continue their work?”

  6. He knew what he was doing was wrong. He hid the hard drive and covered his face from the security cameras. He went to elaborate efforts to knowingly break the law and to take something he knew he was not entitled to take. That is NOT simply clicking on an “I agree” button.

    • He didn’t “take” anything. When you download from the web you’re not “taking” anything. At best he was copying, posting to the cloud to provide unencumbered access. The original electronic files still sit wherever they were stored. As far as it goes, people who normally access those files have the option to go to the source, pay and go about their work. People who choose not to, say a student, have an alternative, less expensive means. 8 cents a page may not seem like much. DL a couple of thousand pages and you’re talking big cash for a person of limited means.

      • And how much did the company pay getting them digitized? You think they should do that for free?

        If he was willing to duplicate their work.. transcribe those articles himself.. there’d be no problem. He wasn’t. He was taking their work. The material itself he may have simply copied, but their work, their effort, their labours, he took that.

  7. For those who say Swartz’s “liberation” of scholarly articles is stealing, stealing from whom? The content of Jstor consists of articles written by academics for free. Publishing helps academics win tenure and promotion but we don’t get money for writing them. I sure don’t get a payment if someone downloads the article, nor does the journal that published the articles. Much of the content in Jstor is historic and the authors are long dead and the articles are out of copyright. Academic institutions pay millions of dollars every year to Jstor to access the work created by the academy for free. They pay billions of dollars every year to for-profit vendors like Thomson-Reuters, Lexis-Nexis and Science Direct for mostly academic and government authored information. Academia passes these costs on to students in ever rising tuition costs.

    A key issue here is that Jstor is not pressing charges or pursuing any civil charges against Swartz. Jstor has now moved to make some of its content freely accessible. The Feds are driving this witch hunt, and it is shameful.