In July 2005, S.N. Dorogovtsev, J.F.F. Mendes and J.G. Oliveira published the results of an experiment in which they entered every number from 0 to 100,000 into Google and noted the number of results returned for each. The paper states that this research was (amazingly to Sci7) partially supported by grant money. Citation: Physica A 360 (2006) 548–556.
The findings are largely unsurprising: the popularity of numbers generally decreases with size, powers of ten are more common than their neighbours, and various “special” numbers are particularly common. The groups of special numbers were identified as:
- Powers of 10
- Multiples of 10 and 5
- Easy to remember or symmetric numbers, e.g. 666 and 131313
- Powers of 2
- Numbers with strong associations, e.g. 666
- Popular zip codes, e.g. 78701
- Toll-free telephone number prefixes, e.g. 866, 877
- Important historical dates, e.g. 1812
- Serial numbers of popular products, e.g. 747, 8086
- Initial parts of mathematical constants, e.g. 314159
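Only some of these groups can be recognised from the digits alone; zip codes, dates and product numbers need outside knowledge. As a rough illustration, the purely arithmetic groups could be tagged with something like the Python sketch below. The function names and labels are made up for this post and are not taken from the paper, and the "symmetric or repeating" test is only a crude stand-in for "easy to remember".

```python
def is_power_of(n: int, base: int) -> bool:
    """Return True if n == base**k for some integer k >= 0."""
    if n < 1:
        return False
    while n % base == 0:
        n //= base
    return n == 1


def arithmetic_categories(n: int) -> list[str]:
    """Tag n with the digit-based groups from the list above (hypothetical labels)."""
    labels = []
    if is_power_of(n, 10):
        labels.append("power of 10")
    if n > 0 and n % 10 == 0:
        labels.append("multiple of 10")
    elif n > 0 and n % 5 == 0:
        labels.append("multiple of 5")
    if is_power_of(n, 2):
        labels.append("power of 2")
    digits = str(n)
    # Palindromes (e.g. 666) or a short block repeated (e.g. 131313).
    is_palindrome = len(digits) > 1 and digits == digits[::-1]
    is_repeated_block = digits in (digits + digits)[1:-1]
    if is_palindrome or is_repeated_block:
        labels.append("symmetric or repeating digits")
    return labels


for number in (1000, 1024, 666, 131313):
    print(number, arithmetic_categories(number))
```

A number such as 78701 or 1812 comes back with no labels, which is the point: those groups depend on what the number means, not on how it is written.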
The data was collected in the second week of December 2004. The number “2004” was present at a particularly high frequency (3,030,000,000 pages), with a rapid fall-off in popularity for future years.
At the beginning of the second week of December 2005, Sci7 obtained a page count for each of the years 1990-2015:
| Year | Google page count |
| --- | --- |
| 1990 | 268,000,000 |
| 1991 | 213,000,000 |
| 1992 | 246,000,000 |
| 1993 | 239,000,000 |
| 1994 | 359,000,000 |
| 1995 | 588,000,000 |
| 1996 | 562,000,000 |
| 1997 | 566,000,000 |
| 1998 | 658,000,000 |
| 1999 | 795,000,000 |
| 2000 | 1,440,000,000 |
| 2001 | 1,250,000,000 |
| 2002 | 1,340,000,000 |
| 2003 | 1,610,000,000 |
| 2004 | 2,140,000,000 |
| 2005 | 6,680,000,000 |
| 2006 | 1,020,000,000 |
| 2007 | 157,000,000 |
| 2008 | 103,000,000 |
| 2009 | 52,600,000 |
| 2010 | 93,400,000 |
| 2011 | 24,500,000 |
| 2012 | 39,300,000 |
| 2013 | 16,200,000 |
| 2014 | 13,500,000 |
| 2015 | 29,100,000 |
The top result for every year to date is the Wikipedia article for that year; the top results for the years to come vary, and include the official london2012 site for 2012, and 2015.com. The popularity of the current year and the fall-off in future years can also be seen in the table above. It is interesting to note that the page count for “2004” in December 2005 is 0.7 of what it was in December 2004. This could be interpreted as suggesting that the web, as reflected in Google’s index, is being purged of outdated information, or it could reflect the number of sites that display the current year for various reasons. 1992 is slightly anomalous in that there are currently more pages on Google referring to it than to 1993.
Surprisingly, the paper discusses neither the accuracy of Google’s reported page counts, nor Google’s progress as of 2004 towards its stated aim of making the entirety of the world’s information searchable, nor the bias of the current, incomplete index of human knowledge that Google holds.
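For anyone wanting to check the arithmetic, here is a small Python sketch that recomputes the 2004 ratio and looks for past years that have more pages than the year after them. The figures are copied from the table and from the December 2004 count quoted earlier; the variable names are just for illustration.

```python
# Google page counts from the table (second week of December 2005).
counts_2005 = {
    1990: 268_000_000, 1991: 213_000_000, 1992: 246_000_000,
    1993: 239_000_000, 1994: 359_000_000, 1995: 588_000_000,
    1996: 562_000_000, 1997: 566_000_000, 1998: 658_000_000,
    1999: 795_000_000, 2000: 1_440_000_000, 2001: 1_250_000_000,
    2002: 1_340_000_000, 2003: 1_610_000_000, 2004: 2_140_000_000,
    2005: 6_680_000_000, 2006: 1_020_000_000, 2007: 157_000_000,
    2008: 103_000_000, 2009: 52_600_000, 2010: 93_400_000,
    2011: 24_500_000, 2012: 39_300_000, 2013: 16_200_000,
    2014: 13_500_000, 2015: 29_100_000,
}

# Count for "2004" reported in the paper (second week of December 2004).
count_2004_in_dec_2004 = 3_030_000_000

# Share of the December 2004 count that "2004" still has a year later.
ratio = counts_2005[2004] / count_2004_in_dec_2004
print(f"2004 now vs. a year ago: {ratio:.2f}")

# Past years with a higher count than the year that follows them.
dips = [
    year for year in range(1990, 2005)
    if counts_2005[year] > counts_2005[year + 1]
]
print("Years outnumbering their successor:", dips)
```

The ratio comes out at roughly 0.71, which is the 0.7 quoted above. The second check picks up 1992, along with a few other years where the count dips.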
Sci7 is able to produce datasets such as those used for the research discussed here (and those with much greater complexity) from a wide variety of sources.
A free full text PDF of the original article is available:
http://arxiv.org/pdf/physics/0504185