In January this year, young programmer and open society evangelist Aaron Swartz killed himself at the age of 26. As someone who had helped create RSS, developed Reddit, and been a huge force in recent Internet campaigns against the US SOPA and PIPA bills, his death was a huge blow to digital rights supporters.
His death followed his involvement in a campaign to open academic archives. His actions to ‘liberate’ academic journals led him into trouble with the United States Attorney, whose attempt to make an example of him ultimately led to him taking his own life.
The core of his campaign is accepted by many academics. Academia by and large has come round to the idea that publicly funded research should be published without copyright restrictions, so that it is available to everyone.
A great deal of research is still published in expensive journals that are paid for by subscriptions. If you or I want access, we have to pay per article – usually around £25 a go.
While this made sense in the paper age, it is difficult to justify in the digital era. Nearly all of the cost is in the production of the research; only a small amount is in the editorial, (usually unpaid) peer review and publication costs.
Academic publishers like Reed Elsevier make 30% margins from academic journals, as universities subscribe to the journals that publish their state-funded research. Carrying on with these restrictive publishing practices doesn’t make sense, and state funders in the UK and the USA want it to change. They are moving to require ‘open access’ publishing, either simultaneously with, or up to two years after, initial publication.
Change is sometimes slow to come, unfortunately, as other forces are at work, such as the need to publish in prestigious journals, which often remain closed.
But Aaron Swartz’s campaigning friends wanted something more: they asked, if we accept that academic knowledge should be available to everyone in the future, why are we accepting that older pre-Internet publications remain unavailable? His Guerilla Open Access Manifesto said:
“even under the best scenarios, their [academic] work will only apply to things published in the future. Everything up until now will have been lost.
“That is too high a price to pay. Forcing academics to pay money to read the work of their colleagues? Scanning entire libraries but only allowing the folks at Google to read them? Providing scientific articles to those at elite universities in the First World, but not to children in the Global South? It’s outrageous and unacceptable.”
His manifesto advocated deliberately circumventing copyright restrictions:
“We need to take information, wherever it is stored, make our copies and share them with the world. We need to take stuff that's out of copyright and add it to the archive. We need to buy secret databases and put them on the Web. We need to download scientific journals and upload them to file sharing networks. We need to fight for Guerilla Open Access.
“With enough of us, around the world, we’ll not just send a strong message opposing the privatization of knowledge — we’ll make it a thing of the past. Will you join us?”
Swartz was being pursued because he had apparently been secretly downloading academic papers from JSTOR. After he was noticed, he was arrested, and United States Attorney Carmen Ortiz decided to pursue him to send a message against copyright infringement, even though he hadn’t distributed any files. Because statutory damages are extremely high in the USA, Swartz faced large fines and prison, even from a limited selection of accusations, and succumbed to the pressure.
Swartz wasn’t the only person to take action. Greg Maxwell torrented 18,592 out-of-copyright papers from the Philosophical Transactions of the Royal Society. The Royal Society had digitised this wealth of historic academic knowledge – but was charging for access. After Maxwell’s actions, it relented and released the files for free access. This was no small victory: some of the great scientific discoveries of the past are contained in those documents.
The logic of publishing material freely, without legal restrictions, goes beyond academic publications and public domain material. Swartz was involved in other direct action campaigns to open up public information. He circumvented controls on court records to allow them to be freely published.
In the USA, anything the government produces is assumed to be out of the scope of copyright. All government documents, films, photos from NASA; they are for everyone’s benefit. So trying to restrict access to electronic court documents seemed to be out of step with US practice.
In the UK, court records aren’t routinely published, although they are semi-public. Government data isn’t automatically public domain.
Change is coming, but slowly. Our government accepts that “Open Data” is a good idea, and it publishes most government material under a licence that allows you to reuse the material.
In fact, the government has great hopes for Open Data. They expect information about government performance, social changes and routine services to help democracy and grow the economy. So far so good: reusing information that exists and costs little to publish could lead to all kinds of benefit.
Unfortunately, the government has missed some of the big opportunities. You may have noticed that you can download Ordnance Survey maps, for instance – but not the most detailed information. You can get lists of post codes – but not lists of addresses relating to post codes.
The reason you can’t is that Ordnance Survey traditionally makes money out of selling map data, and the Royal Mail makes a small amount of cash from licensing post code data. Similarly, weather data from the Met Office used to be something you had to pay for – although they have decided to release nearly all of their records to the public.
Why is releasing data good for the economy, when people will pay for it? Firstly, it means more people will use it, which means more jobs and more tax. Secondly, it means more competition. Thirdly, more use should mean more innovation and efficiency.
That’s all a bit abstract, so here’s a couple of examples. With post codes, most online businesses need to check addresses with post codes for billing or delivery. Charging for that data makes it harder and more costly for new entrants.
Buses and trains – publicly funded but privately run – should be aiming to get passengers onto their services, so timetables and live departures can improve people’s journeys. Live traffic data makes your road travel easier and helps you avoid jams.
Maps, post codes, Companies House data and weather data are also useful because people can build other services on top of them. This is often called “core reference data” because it is a kind of data infrastructure, so it is disappointing that only some of it has been released for everyone to use.
The lesson Aaron Swartz asked us to face is that, if information is power, and more access to information leads to a wider distribution of power and benefits, then we should be very reluctant to restrict the flow of information. Copyright can be abused to restrict public information: academic publications, court records or government data. In a digital world, that really doesn’t make sense, and it also places people at a serious disadvantage. Sometimes that is a moral concern, as in the case of scientific and educational information. Other times, it is just dumb.
Either way, the remarkable thing is that, twenty years into the digital revolution, we’re still having these arguments, and political forces can still resist to the point of pushing someone like Swartz over the edge.
Guerilla Open Access Manifesto