We’ve been asked to explain what’s “new” in the most recent update to the data in our ongoing study of the impact of peer-to-peer (P2P) file sharing on paid book sales. At BookExpo America, we presented an update that included sales data for 21 O’Reilly Media 2008 front-list titles that we found on one or more P2P sites. This is an increase of 13 titles over the 8 that had been found when we first presented at Tools of Change in February 2009. It is still fewer than a third of all O’Reilly titles first published in 2008.
In trying to assess the impact of digital piracy on paid sales, we have been measuring paid sales four weeks before and four weeks after a title is first seeded. In our initial data set (eight titles), sales in the four weeks after a file was first seeded increased 6.5%; in the most recent report (all 21 titles), sales decreased 4.8% in the four weeks after seeding first occurred. The average lag time between first paid sale and first instance of seeding on a P2P site remained relatively constant at about 19 weeks.
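To make the before-and-after comparison concrete, here is a minimal Python sketch of the calculation for a single title. The function name, the choice to count the seeding week as the first "after" week, and the sample numbers are all assumptions made for illustration; they are not taken from the study's data or its exact method.

```python
def pct_change_around_seeding(weekly_sales, seed_week, window=4):
    """Percent change in paid sales: `window` weeks after seeding vs. `window` weeks before.

    weekly_sales: unit sales per week, index 0 = first week on sale.
    seed_week: index of the week the title was first seeded
               (assumed here to count as the first "after" week).
    """
    if seed_week < window or seed_week + window > len(weekly_sales):
        raise ValueError("not enough weeks of data around the seeding date")
    before = sum(weekly_sales[seed_week - window:seed_week])
    after = sum(weekly_sales[seed_week:seed_week + window])
    if before == 0:
        raise ValueError("no paid sales in the window before seeding")
    return 100.0 * (after - before) / before


# Made-up weekly unit sales for one hypothetical title (not study data).
sales = [100, 100, 100, 100, 110, 110, 100, 100]
change = pct_change_around_seeding(sales, seed_week=4)
print(f"{change:+.1f}%")  # prints "+5.0%"
```

How the per-title figures are combined into a single number (pooling units across titles versus averaging each title's percentage) changes the result, so any aggregate should state which approach it uses.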
With a larger data set, we tried plotting the average paid sales of pirated and unpirated content using a common starting point (that is, we plotted sales data week-by-week after publication). The results of the week-by-week and four-week rolling averages are shown on slides 28 and 29 of the BEA presentation. Both pirated and unpirated titles showed similar growth in sales in the first few weeks after a title is published, followed by a decline after the peak. Average sales for unpirated content start higher and peak later, although this may reflect the specific nature of titles in a small sample.
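The alignment and smoothing described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code behind the slides: the function names are hypothetical, the rolling average is assumed to be a trailing four-week mean, and the sample figures are invented.

```python
def average_by_week(titles):
    """Average weekly sales across titles, aligned on weeks since publication.

    titles: list of per-title weekly sales lists, index 0 = publication week.
    Titles with shorter sales histories drop out of the later weeks.
    """
    longest = max(len(t) for t in titles)
    averages = []
    for week in range(longest):
        values = [t[week] for t in titles if week < len(t)]
        averages.append(sum(values) / len(values))
    return averages


def rolling_mean(series, window=4):
    """Trailing rolling average; the first window-1 weeks produce no value."""
    return [
        sum(series[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(series))
    ]


# Invented weekly unit sales for two hypothetical titles (not study data).
titles = [
    [10, 20, 30, 20, 10],
    [30, 40, 50, 40],
]
weekly = average_by_week(titles)   # [20.0, 30.0, 40.0, 30.0, 10.0]
smoothed = rolling_mean(weekly)    # [30.0, 27.5]
print(weekly, smoothed)
```

Note that the final week in this toy example is an average of only one title. The same thinning happens in any real data set as fewer titles have been on sale long enough to contribute to the later weeks.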
The primary difference between sales of pirated and unpirated content appeared in weeks 19 through 25, when sales for pirated content peaked a second time at a level higher than that seen in the first, sell-in period. This second peak followed the time (19 weeks) at which the average pirated O’Reilly front-list title was first seeded on a P2P site.
We stress that this is correlation, not causality, but the difference in the sales profile is notable and persists even when using rolling averages. Data after about week 40 is not as reliable because the number of titles on sale for that length of time or more drops significantly. We will continue to monitor the data on an ongoing basis to establish a more complete profile. A download of the full research paper, which is published as a Rough Cut that includes access to any future updates, is now available for purchase ($99).