The Technium

Four Stages in the Internet of Things

[Translations: Japanese]

I’m making no bets whether the new buzz word “graph” will replace the still useful but overexposed terms of network and web. (I would have bet against the word “blog” ever being uttered by someone not smirking.) But it doesn’t matter. This short posting entitled the Giant Global Graph by Tim Berners-Lee, which blesses the hip use of “graph,” is the best summation of the semantic web I’ve seen yet — in part because he doesn’t talk about semantics. 

The points he makes are worth paraphrasing. The following riff on his essay doubles as my summation of what I think the Semantic Web, or even Web 3.0 is about. 


The first stage of this current communication evolution was linking up computers. We called that link-up the network of networks, or the internet. It was useful and boring. It was sort of like the phone system without phones. If you wanted to book an airplane flight the best you could do was connect to the airline’s computers. Active participants on this new system had to take one step toward openness. Computers on the internet had to be willing to forward other folk’s packets of data, and in the larger scheme an originator of bits did not have full control of its own packets (unlike the phone system). 


The second stage of digital communication was linking up documents and pages. That’s the web. Now you could connect to a page or document about your airline flight rather than just the company. There was a refinement in resolution that make the system far more useful. And to play in this arena, a player had to be open to sharing their page. Hiding pages behind passwords was generally self-defeating. Neither could you restrict who linked to your documents, as many neophytes mistakenly try to do. You also have to be open to your content being copied and pasted a bit, and to be copied in full as an index by search engines. This was another step in letting go.  But the value of connecting was generally widely appreciated.


We are now at the end of the beginning of the third stage. What happens here is that after linking and sharing computers, then linking and sharing documents, we are linking and sharing data in those documents. We are sharing and linking the subjects and meaning of what those documents are about. Instead of connecting to the airline’s computer, or later to the flight page, we can connect directly to the flight’s information itself. The data is unbundled and in a form that can be read by any device on the web. Indeed, when done correctly it can be comprehended by the web itself since it is not rendered in English but in a general semantic form. That universal form is something that will live in a database. In fact, you could think of this stage as the World Wide Database. 

Wwide Database

Having access to the data of our world is liberating and made possible by lots of 3-letter technologies: XML, RSS, API, RDF, OWL. These are standards of communicating and sharing data on the web. It’s been amazing how much can be accomplished just by zipping content between websites via RSS. Another tremendously unappreciated enabling technology is the API. This gateway allows controlled sharing of vast archives of data, unleashing the data’s power via the usual network effects — the more that use it the more valuable it becomes.  And just as in the first two stages, folks struggle to overcome their fear of sharing. Sharing data feels closer to us than sharing either our computers or our documents. But learning how to share data is what the next web will be about. Those who are able to let go and understand that all the real value in this next stage will be built on the emergent value that comes from deep interlinking, deep interconnecting, and freely (as is reasonable) releasing your precious data, those will be the technologies and organizations who gain the most. 

Here’s how Tim Berners-Lee puts it himself: 

The less inviting side of sharing is losing some control. Indeed, at each layer — Net, Web, or Graph — we have ceded some control for greater benefits… Letting your data connect to other people’s data is a bit about letting go in that sense. It is still not about giving to people data which they don’t have a right to. It is about letting it be connected to data from peer sites. It is about letting it be joined to data from other applications. 


BTW, I don’t think this third stage is the end of history, or the end of the story. (I have learned to be suspicious of any history that comes in threes). I think we can see a fourth stage beyond. That fourth stage is the drift towards linking up the things themselves. You want all the data about a thing to be embedded into the thing. You want location information embedded at, or in, the location itself. You actually want to connect not to the airline’s computer, nor to the airline’s flight page, nor to the flight data, but to the flight itself. Ideally, we would connect to the embedded processing and raw information in the airplane, in your particular seat, at  the airport’s slot — the entire complex of items and services we call “the flight.” What we ultimately want is an internet of things. 


That internet of things, where everything we make contains a sliver of connection, is still a ways off, although I believe we will create it. The internet of data — the world wide database — is quickening right now. As far as I can tell, this is what people mean by the Semantic Web. Because in order to be shared, information is extracted from natural language, reduced to its distinct informational elements, and tagged into a database. In this foundational form it can then be re-assembled into meaningful (semantic) informational molecules in thousands of new ways that is not possible to do when it remains in a flat un-annotated primitive document.

I believe this shareable extraction of data is also what people mean by Web 3.0. In this version of the webosphere data surges, flows, and expands across websites as if it were acting within one large database, or within one large machine. My site solicits a steady stream of data from Alice and Bob; it then adds value by structuring the data in a new (semantic) way, and then I issue my own streams of organized data, to be consumed by others as raw data. This ecosystem of data runs on an open transport system, and consensual protocols, even though not all data is shared or public.

An operational Semantic Web, or World Wide Database, or Giant Global Graph, or Web  3.0, will make possible millions of seemingly smarter services. I  won’t have to re-tell each website who my friends are; once will be enough. If my name shows up in text, it will know it’s me. My town will be a town on the web — a place with definable characters — and not just another word. That ubiquity enables any references to my town to link to the actual information about the town.  The apparent smarter nature of the web will be due to the fact that the web will “know” more. Not in a conscious way, but in a programatic way. Concepts and items represented on the web will point to each other and know about each other — in a fundamental way they do not right now.

The Social You

The details matter. Which protocol will triumph, which standard prevail, which company maintain the majority? — these are the unknowns. Policy details matter too. Documents and computers as property are a lot less problematic than data as property. Data is notoriously hard to own. And then there is the slippery notion of identity. If the web knows you are always you, who are you? If the price of total personal service is total personal transparency, is that any different than total personal surveillance?

The smartness of this thing will unnerve many people. Even  though it will be miles from anything human, the fact that it will know anything at all, and know anything about them, will make many folks jump back. And push back. I’m counting on the fact that kids will love it.

  • Before this kind of technology can become useful, humanity will need to overcome the notion that “because the computer said so it must be true”. Each of the semantic databases will be full of data that at it’s root was entered by a human being; that means it’ll contain errors. Some accidental errors, some errors of omission and / or falsification. Much misery will be caused by people assuming that data equals truth…

  • The ultimate direction of the semantic web is artificial intelligence. Future computersystems, search engines want to connect different sources, read entire web sites, blogs etc., and want to “get it” the way that human beings do.

  • I don’t think that inaccurate negative information about people will be much of a problem, for long. The fear people have of being negatively described made sense in the old low information bandwidth world: If someone heard something bad about you, that would be all they’d hear about you all day, and all they’d end up knowing about you. Understandings of things were formed based on whatever trickle of information existed, and individual pieces of information could wildly swing portrayals and impressions.

    The character of a high information bandwidth society is fundamentally different. Instead of an occasional trickle of information, the Graph’s understanding of each person will be formed by a huge flow of data from numerous sources. More importantly, the increasingly structured quality of the data will mean that new facts & ideas will fall directly into rich, established contexts. You won’t so often encounter isolated inaccurate facts without a situational awareness of where those facts come from & how they relate to the totality of information about a person.

    Not only will there be more safeguards to keep inaccurate negative information about people from being taken seriously, the flipside of that same process will be that genuine facts & warnings will be able to find corroboration and substantiation. We will hear negative things about people that are true, in a context that proves them to us. The punishments and rewards of having accurate detailed information about everything you do in your public life very readily available will almost entirely supplant old ideas of criminality and justice– and the injustices and minor crimes of daily life, the many (it must be most!) which are now beyond the arm of the law or the possibility of public awareness, will be righted for the first time in history. The end result will be a world which is dramatically more just, fair, above board, genuine, trusting and connected.

  • If step 4 involves putting a computer chip in every OBJECT, will step 5 be putting a computer chip in every LIVING THING?

    Identity Manegment will be a huge business for steps 3 and 4, but will step 5 finally be the solution to identity theft? The chip being part of you may be the final solution to the computer knowing you are ALWAYS you.

  • as someone who is often on the wrong side of public opinion, and someone who is even more often on the wrong side of governmental opinion, I find this kind of talk terrifying.

    total transparency means mob rule…or big brother. I do not see this as a positive thing.

    There may be positive aspects of this but I don’t like the implications.

    and yes, I’ll be right there with “the kids”, using it anyway.

  • Nick Desbarats

    I agree with most of Mungojelly’s comments below, with the exception of the quasi-Utopian society he/she predicts will emerge when the developments Kevin predicts materialize.

    Anyone has been able to publish erroneous, slanderous information about anyone else online for over 15 years. Putting it in a shared database won’t create a reputation problem any more than such a problem already exists.

    It’s a safe bet that databases that routinely include erroneous, slanderous information will be less used/shared than those that contain accurate, reliable information, just as sites that tend to have reliable, accurate information get more traffic and Google juice than aimless flame sites today.

  • ted

    How will we insure that we are honestly represented? A perceived slight can generate an attack with real consequenses,and no accountability. Once it’s out there…