## A Trillion Hours


The web is pretty big. Researchers at Google won’t say how many pages Google indexes, but they recently said that their inspection of the web reveals that it has more than one trillion unique urls. It’s difficult to know what to count as a unique page because, as they explain, some sites, such as a web calendar, can generate an infinite number of pages if you keep clicking on the “next day” link. The very first public web page was created in August, 1991. So we (the collective YOU) have created 1 trillion pages of content in 6,200 days.

I can assure you that 6,000 days ago, way back in 1991 — or even as recently as ten years ago in 1998 — no one would have believed that we could create 1 trillion web pages so fast. The obvious question then was, who would pay for all this? Who has the time?

To create one trillion pages takes a LOT of time. If we conservatively say that on average each url takes one hour of research, composition, design and programming to produce, then the web entails 1 trillion hours of labor, at the minimum. We could probably safely double that to include the other trillion web pages that have disappeared in the last 15 years, but let’s not.

One trillion hours equals 114 million years. If there were only one person working on the web, he or she would have had to begin back in the Cretaceous Period to get where we are today. But running in parallel, it takes 114 million people working around the clock for one year to produce one trillion urls. Since even the most maniacal webmaster sleeps every now and then, if we reduce the work day to only 8 hours, or one third of the day, then it will take 114 million folks 3 years of full-time work to produce the web. That is equal to 342 million worker-years.
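For anyone who wants to check the arithmetic, here is a minimal sketch of the conversion. The variable names are made up for illustration, and the 365-day year is an assumption; the hour-per-url figure and the 8-hour day come from the paragraph above.

```python
# Rough check of the hours-to-years conversion.
# Assumption: a year is 365 days (8,760 hours); names are illustrative only.
TOTAL_HOURS = 1_000_000_000_000   # one hour per url, one trillion urls
HOURS_PER_YEAR = 24 * 365         # an around-the-clock year

# One person (or 114 million people for one year) working nonstop.
person_years = TOTAL_HOURS / HOURS_PER_YEAR

# An 8-hour day is one third of the clock, so the same work takes 3x as long.
worker_years = person_years * 3

print(f"{person_years / 1e6:.0f} million around-the-clock years")
print(f"{worker_years / 1e6:.0f} million worker-years")
```

Both results round to the figures quoted in the text.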

But since we’ve had 15 years to construct this great work, we needed only 22.8 million webworkers working full-time for the past 15 years. At first glance that seems far more than I think we actually had.
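The division is simple enough to sketch. The function name below is hypothetical, and the 17-year variant is only an assumption derived from the 6,200 days mentioned at the top of the post, not a figure used in the estimate itself.

```python
# Sketch: how many full-time workers a given span of years implies.
# WORKER_YEARS is the rounded 342 million figure from the text.
WORKER_YEARS = 342_000_000

def workers_needed(worker_years, calendar_years):
    """Average full-time workforce needed to finish in calendar_years."""
    return worker_years / calendar_years

fifteen = workers_needed(WORKER_YEARS, 15)
print(f"{fifteen / 1e6:.1f} million workers over 15 years")

# Assumption: 6,200 days is roughly 17 years, which lowers the headcount a bit.
seventeen = workers_needed(WORKER_YEARS, 17)
print(f"{seventeen / 1e6:.1f} million workers over 17 years")
```

Stretching the window only trims the number modestly; the order of magnitude stays in the tens of millions.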

A very large portion of this immense load of work has been done for free. In the past I calculated that 40% of the web was done non-commercially. But that included government and non-profits, which do pay their workers. I would guess that 80% of the web is produced for pay. (If you have a better figure, post a comment.) So we take 80% of 342 million worker-years to get 273 million paid worker-years. What are their salaries worth? In other words, what is the replacement cost of the web today? If all the back-up hard drives disappeared and we had to reconstruct the trillion urls of the web, what would it cost? Or, put another way, how much would it cost to refill the content of the One Machine?

At the bargain rate of \$20,000 per year, it would cost \$5.4 trillion. Multiply or divide that by whatever factor you think is necessary to make the salary more realistic.  Maybe it takes less than an hour on average to create the content of a web page; maybe it takes more. But I suspect the order of magnitude is close.
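Since the salary is the factor most open to argument, here is a small sketch that treats it as a parameter. The function name and the alternative salaries are assumptions for illustration; the 342 million worker-years, the 80% paid share, and the \$20,000 rate come from the text.

```python
# Sketch of the replacement-cost estimate with an adjustable salary.
def replacement_cost(worker_years, paid_fraction, salary):
    """Cost of re-creating the paid share of the work at a flat yearly salary."""
    return worker_years * paid_fraction * salary

# Figures from the text.
cost = replacement_cost(342_000_000, 0.80, 20_000)
print(f"${cost / 1e12:.2f} trillion at $20,000 per year")

# Hypothetical alternative salaries, to see how the total scales.
for salary in (40_000, 60_000):
    scaled = replacement_cost(342_000_000, 0.80, salary)
    print(f"${scaled / 1e12:.2f} trillion at ${salary:,} per year")
```

Without the intermediate rounding to 273 million, the total lands a little above the \$5.4 trillion quoted, which does not change the order of magnitude.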

In the first 6,000 days of the web we’ve put in a trillion hours, a trillion pages, and 5 trillion dollars. What can we expect in the next 6,000 days?

• David Smith

And presumably Google’s trillion URLs don’t include all those hidden on intranets behind corporate firewalls.

• It would be interesting to have access to historical records from Google. If we trended the growth acceleration of the last 6000 days, we could probably make a fairly good guess at the next 6000.

It’s a very cool idea.

• Isn’t 2.73e8 times 2.0e4 equal to 5.46e12, which is trillions and not billions? Really interesting way of looking at it. And I guess the next 6000 days will be very interesting.

• “If we conservatively say that on average each url takes one hour of research, composition, design and programming to produce”

That’s not a conservative estimate at all. I have a command-line tool that automatically uploads a file to Amazon S3 for me and copies the resulting URL into my clipboard. I created it so I can create unique URLs without having to do it by hand. The command-line tool runs in the space of maybe a minute, max, with bad bandwidth – usually only a few seconds – and I created it because doing the process by hand could take as much as three minutes if I got distracted halfway through.

I’ve created at least two or three hundred unique URLs myself with this tool just uploading images to my blog. Not one of them took as long as five minutes. I’ve also created a bunch of tiny, one-off web pages, like this one:

http://s3.amazonaws.com/giles/example_080508/example.html

Which I wrote, uploaded, and got back a unique URL for, all in less than a minute.

I know there are other people at the opposite end of the curve, but I don’t think your estimate is conservative at all.

• This is a cool topic but the estimates seem to imagine that all pages are built and/or populated by hand.

But that’s not the case! In fact it seems to me the vast majority of pages at this point are surely generated at least in part by feeds or script.

In fact, theoretically, you could generate infinite unique pages via Google searches alone, merely by entering infinitely variable search terms.

The idea of measuring the web by pages is flawed, because the web isn’t a static entity. We should think of it more verbishly, than nounishly, and measure not the pages, but the flow. We don’t measure rivers in cups and litres!

• Do the 1 trillion pages include blog entries? Some can take only a couple of minutes to compose.

What about html code produced by other software? Does it count MS Word documents saved to html and posted to the web?

I also think that you underestimated the number of people producing web content. Even my 7 year old is posting web content.

• I am a web developer and I think your estimate of 1 hour per page on average is way off. For example, I created one health insurance brokerage web site that currently has over 154 static pages (not including pages that I have deleted due to expired information).

It took me about 40 hours to create the basic design, layout, and coding for the site. After that it takes less than a minute to create each page. Even if you include the time for writing the copy of the page, it would not even be close to an hour. I add about 2-3 pages each week to this site and it only takes about 1 minute per new page.

Now consider dynamic sites like Amazon. A web site like Amazon can churn out a page with just a few data entry keystrokes. The larger a site is, the less average time it takes to create each new page.
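The amortization argument in this comment can be put into numbers. This is only a sketch: the function name is hypothetical, and it assumes the 40 hours of setup is spread evenly across pages that each take about one minute, as described above.

```python
# Sketch: average time per page when a one-time setup cost is amortized.
def average_minutes_per_page(setup_hours, pages, minutes_per_page):
    """Total minutes (setup plus per-page work) divided by page count."""
    total_minutes = setup_hours * 60 + pages * minutes_per_page
    return total_minutes / pages

# Figures from the comment: 40 hours of setup, 154 pages, ~1 minute each.
site_average = average_minutes_per_page(40, 154, 1)
print(f"{site_average:.1f} minutes per page for 154 pages")

# Hypothetical larger sites: the average falls toward the per-page minimum.
for pages in (1_000, 100_000):
    avg = average_minutes_per_page(40, pages, 1)
    print(f"{avg:.2f} minutes per page for {pages:,} pages")
```

On these assumptions the 154-page site averages well under an hour per page, and the average keeps shrinking as the page count grows.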

• Archos

At Giles and others:
By the average URL, Kevin means the creation of a somewhat meaningful web page, of course. Not the simple creation of a URL without content on it. That’s not an average URL. That (noise, 1 sec to create) is the opposite of a page like this (signal; 10 hours, including collaborative pages on which a lot of people spend time thinking/ researching). The average is in the middle and I think his estimation of one hour is correct.

• When things were “broadcast” to a billion people, a much smaller number of people were involved (sum of all broadcast media employees?). When things are “interacted with” by a billion people, all one billion are by definition involved. Active, not passive. My impression from a few thousand days of roaming the web is that it is very difficult to comprehend what one billion people can do and what one billion people can know.

I think I unwittingly still have the gut feel that the web is one billion people “listening”, rather than one billion people “sharing”. That’s why a trillion sounds so big.