There is a big switch afoot. Computation is moving from your PC to the One Machine. Until recently the most interesting and apparent computational cycles have been the ones on the laptops and desktops in front of us, while the boring enterprise cycles have happened in large data centers. But the data center servers have been getting better, faster, cheaper and they offer many benefits to ordinary users that PCs don’t. In this new paradigm, instead of working on your laptop, you touch the “cloud” of other machines, usually in the form of the web.
There is a tendency to think of cloud computing as centralized, or even a throwback to timesharing a mainframe. But in fact the cloud is the belated arrival of massively parallel computing. The computation is not centralized at all. It is distributed over a massive number of servers (hundreds of thousands) in many locations around the world, running massive numbers of programs in parallel. There are different ways to do this, but currently Google’s parallel architecture called MapReduce is gaining ascendance as a template that works.
MapReduce (a name hammered together from two Lisp programming functions) works a little bit like packet switching does in telecommunications. In packet switching, the key distinguishing feature of internet communications, a message is broken up into tiny parcels and the packets are sent independently through many different routes through the network. As the parcels are received at their final destination they are reconstituted into the whole. If some part of the network is down and one (or more) of the packets doesn’t get through, it is re-routed along different paths until it does. In the MapReduce architecture a programming chore is broken into parcels and those parcels are sent out to various nodes in the cloud of all participating servers to be computed. If a node is down or the computation fails, the parcel is reassigned until it is completed.
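The reassign-until-done idea can be sketched in a few lines of Python. This is a toy simulation and not how Google or Hadoop actually schedule work: the names `split_into_parcels`, `run_on_node`, `run_job` and `NodeDown` are made up for illustration, node failure is faked with a random number, and the parcels run one after another rather than in parallel.

```python
import random


class NodeDown(Exception):
    """Raised when a simulated node fails to finish its parcel."""


def split_into_parcels(items, parcel_size):
    """Break a list of work items into parcels of at most parcel_size items."""
    return [items[i:i + parcel_size] for i in range(0, len(items), parcel_size)]


def run_on_node(node_id, parcel, failure_rate, rng):
    """Pretend to compute one parcel on one node, failing at random."""
    if rng.random() < failure_rate:
        raise NodeDown(f"node {node_id} failed")
    return sum(parcel)  # stand-in for the real computation


def run_job(items, nodes, parcel_size=3, failure_rate=0.3, seed=0):
    """Hand out parcels to nodes, reassigning any parcel whose node fails."""
    rng = random.Random(seed)
    results = {}
    attempts = 0
    for index, parcel in enumerate(split_into_parcels(items, parcel_size)):
        while index not in results:
            node_id = rng.choice(nodes)  # may pick the same node again
            attempts += 1
            try:
                results[index] = run_on_node(node_id, parcel, failure_rate, rng)
            except NodeDown:
                continue  # the parcel goes back out until some node finishes it
    return [results[i] for i in sorted(results)], attempts


partials, attempts = run_job(list(range(10)), nodes=["a", "b", "c"])
```

Whatever the number of failures along the way, `partials` ends up holding one result per parcel, which is the point of the analogy with re-routed packets.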
Keeping track of all these flying parts is what MapReduce does. It is an algorithmic scheme that divides a job into intermediate portions (the “map” step), applying functions that it has not met before. It processes these fragments a zillion times in parallel and then “reduces” the parallel processing into the unified answer for the client. What’s astounding is that MapReduce runs software functions defined and written by others, who need to know nothing about how MapReduce spreads the work across machines.
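To make the two steps concrete, here is a minimal single-machine sketch in Python, using word counting as the chore. It is only an illustration under stated assumptions: the function names (`map_words`, `reduce_counts`, `map_reduce`) are hypothetical, a thread pool stands in for the cloud of servers, and the grouping step in the middle is a simplification. In Google's published design, as I understand it, the programmer supplies the map and reduce functions and the framework splits the input and distributes the work.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_words(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_counts(word, counts):
    """Reduce step: collapse all the counts for one word into a total."""
    return word, sum(counts)


def map_reduce(chunks, mapper, reducer, workers=4):
    """Run mapper over the chunks in parallel, group by key, then reduce."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(mapper, chunks))
    # Group the intermediate values by key before reducing.
    grouped = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            grouped[key].append(value)
    return dict(reducer(key, values) for key, values in grouped.items())


chunks = ["the cloud is one machine", "the one machine is the cloud"]
totals = map_reduce(chunks, map_words, reduce_counts)
```

Here `totals` maps each word to its count across both chunks. Note that `map_reduce` itself knows nothing about words: any mapper that returns key-value pairs and any reducer that accepts a key and a list of values would slot in the same way.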
MapReduce has spawned open source implementations, such as Hadoop, which is written in Java and runs on Linux. Amazon Web Services already offers ways to harness its cloud of servers using Hadoop. A New York Times techie created PDFs for hundreds of thousands of archived pages of the Times using Hadoop and the Amazon cloud.
Eventually we’ll have the intercloud, the cloud of clouds. This intercloud will have the dimensions of one machine composed of all servers and attendant cloudbooks on the planet.
A few benefits of working with the Cloud:
1) Cloud is more reliable. Now that your entire working life and life in general is on your computer, this is important.
2) Cloud remembers to back up.
3) Cloud is available 24/7/365 from any terminal in the world.
4) Cloud can hold infinite apps, infinite storage.
5) Cloud permits seamless sharing and collaboration.
The New York Times ran an article today announcing IBM’s entrance into the cloud marketplace, an initiative IBM is calling Blue Cloud. They will be selling servers and software tailored for parallel cloud computing. Their software will be based on Hadoop. This may prove to be a large-scale shift for all computer makers. As the Times reports:
“In some ways, the cloud is a natural next step from the grid-utility model,” said Frank Gens, an analyst at the research firm IDC.
The gadget that connects you to the cloud can be rather minimal. It doesn’t need large disk storage, for instance. John Markoff, technology reporter for the New York Times, damaged his laptop while on a reporting trip and out of desperation stumbled on his own makeshift cloudbook, a stripped down computer with wireless connection but no disk. He writes:
What I discovered was that – with the caveat of a necessary network connection – life is just fine without a disk. Between the Firefox Web browser, Google’s Gmail and the search engine company’s Docs Web-based word processor, it was possible to carry on quite nicely without local data during my trip.
Bouncing between hotel rooms to Wi-Fi-enabled lobbies and conference rooms, I was easily able to stay online and file my stories without incident. Afterwards it made me wonder why there aren’t more wireless, Web-connected ultralight portables for business travelers.
Nick Carr, who has just written a book alerting the world to “The Big Switch” (a title so appropriate I wrote a blurb for the book) from PC to the cloud, has also coined a neat term for this gadget: the CloudBook. On his blog he lists the attributes he imagines for the CloudBook, which he fantasizes could be designed by Apple and run on Google’s servers. The device is primarily a well-designed screen with powerful batteries and an embedded wifi card, but no disk. To paraphrase Carr’s specifications for a CloudBook:
1) Cheap. $99-$199. Free storage and apps for personal use. $50/month for premium business.
2) Energy efficient. Uses LED screen, flash drive, low-power chip.
3) Durable. Few moving parts. Think iPod nano.
4) Mobile and flexible. No syncing, no backups.
But why wait for Apple? There are at least two prototypes of the cloudbook on the market now. From Taiwan there is the eeePC, a $300 rugged, super lightweight, wifi mobile notebook. Also made in Taiwan is Nicholas Negroponte’s One Laptop per Child XO, which is a rugged, lightweight, wifi mobile, diskless notebook that should in theory cost only $100, but will cost about $200 at first. For the next two weeks a limited number of the XOs will be available from the One Laptop foundation in the following deal: contribute $400 and you get one, and they give one.
I just made my contribution. I’ll review the XO when it arrives.