The initial paradox

Aaron Swartz

This book is dedicated to the memory of Aaron Swartz.

I met Aaron Swartz in 2001, in Cambridge, Mass., at a meeting organized by the World Wide Web Consortium. I didn't know who he was. He was 14 years old then, and looked like a kid. I thought he was the son of Tim Berners-Lee, who had brought him to the meeting because he couldn't find a baby-sitter that day. Then, when he started talking, I realized he was the most articulate person in the room. He was describing his vision of access to updates to online content in a standardized, computer-readable format [1], as well as the design principles of the Semantic Web, which was intended to provide meaningful navigation between web sites.

The causes that Aaron Swartz defended were, for me, no-brainers. Of course, everybody should be able to access scientific knowledge and enrich our culture. Of course, as law-abiding citizens, we should all know the law, and therefore be able to access it for free. Of course, we should facilitate the ways in which we talk to each other by easing the technical hurdles to transmission. Of course, computer technology should be used to empower us, not to enslave us. Of course, we should all have equal access to the Internet. Of course, we should be able to publicly denounce things that we consider to be morally abject, as did whistleblower Chelsea Manning.

So when Aaron Swartz was investigated for downloading court documents from the PACER system, which ironically stands for "Public Access to Court Electronic Records" [2], and then prosecuted and threatened with years in jail for downloading academic articles from JSTOR [3], an online archive that he accessed through the network of the Massachusetts Institute of Technology, I was surprised to see how narrow-minded the institutions were. Small immediate profits mattered more than securing a long-term future for all of us, one in which everyone has access to the law and to scientific knowledge. In the criminal case that followed, a deal was offered: if Aaron Swartz pleaded guilty to some of the charges, he would spend only 6 months in a low-security prison instead of facing up to 50 years behind bars. He refused, because he was convinced his cause was just, but the pressure from the trial was too much for him to bear. He committed suicide in January 2013, at the age of 26.

I have still not recovered from the feeling that something was deeply wrong here, and this was the starting point for this book. Sometimes society as a whole has mechanisms that turn against one individual, and because all institutions converge, each giving many apparently good reasons for its behavior, the whole affair settles down and we are back to "business as usual". But I just couldn't take it. On the contrary, the example of Aaron Swartz shows how many things are going in the wrong direction, and will backfire. Only years later will we realize how badly things were headed, and wish we had done things differently.

The contrast between the positive changes brought by technology, which we have all seen in our own lives, and its dark side is striking. It seems as if we still don't fully grasp the ramifications of the changes that are occurring before our eyes.

This book is about the battle between humans and technology. But that is an unfair characterization. Technology is driven by humans, whose interests may at some point diverge from the interests of society as a whole. This book aims to denounce the abuses that have led us to a mentality of submission, in which we are often told that our human interests have become irrelevant in the face of the impersonal, devouring needs of the technological behemoth. But that's just window dressing. This way of seeing things is just a way for some people to exploit the credulity of others, by leading them to believe that they have become powerless.

The widening gap between these interests and those of society can be measured by one specific aspect of the information society: accountability, or more accurately, the absence of accountability and its consequences. I present a diagnosis of some of the most visible and outrageous signs of this lack of accountability, and propose a way to start thinking about what it would look like if we made our information accountable. I try to avoid being too technical in this book, and present a conceptual vision instead. The approach I develop consists in doing for information exchanges what accounting does for money exchanges.

Lost in Technology

I once had a dream. Or rather a nightmare.

I was stuck, having forgotten all my codes, passwords, and telephone numbers, and I couldn't move in any direction. I was supposed to meet my wife, who was with a friend, at an elevated train station where she would be arriving by train. I couldn't remember whether we had said we would meet on the platform or at the ticket booth. I was in the street, downstairs, trying to enter the station, but I couldn't get in because I had to buy a ticket from a machine and I couldn't remember the Personal Identification Number of my debit card. I had simply forgotten it. Then I thought: maybe my wife is upstairs, but why wouldn't she call me on my cell phone? Anyway, that would also be useless, because I had just gotten a new cell phone, and couldn't figure out which key to press to answer it when it rang.

I didn't know how much time had elapsed. I was in such a panic that I couldn't say if this lasted three minutes or three hours. Eventually the employee in the booth offered to help me get in, but I told him I couldn't get in because I needed a password and I couldn't figure it out. It was hopeless, until I remembered that I had some cash in my wallet. Back to the old, pre-digitized world. I asked the employee: if I give you two dollars, would you then sell me a ticket and let me go through? Sure, he said. I suddenly felt relieved. That at least sounded familiar, but I had almost forgotten I could still do it. My wife showed up and said: we found you! It was like being rescued from drowning.

Suddenly I came to realize how dependent I had become on information technologies. But I am far from alone. We all are. If the Internet were out of order for a while, the disruption to the world economy would be enormous.

We live in an information society driven by mechanisms that are recent and therefore not yet completely mastered. They have a tremendous impact on the way we live. Politics, commerce, social networks, work-related activities, public services, the media, religion, and more have been radically transformed from what they were even as recently as twenty years ago. New opportunities have emerged, and at the same time new challenges that force us to rethink the fundamental values on which we stand. Democracy in the digital age is one such example. Privacy is difficult to preserve. Our actions, movements, readings, and purchases are now systematically tracked. Some settings can help limit our visibility online, but we have no guarantee that they are effective. If we read the terms of service of every website we use, and cared about keeping some of our personal opinions private, we would realize that there is no real way to achieve that goal.

There seems to be no standard of accountability as to what private or governmental entities can do with the information emanating from us that they acquire and consider their own, and sometimes resell for marketing purposes. This industry is mostly unregulated, especially in the United States. New forms of criminal behavior have emerged, including identity theft, theft of money from bank accounts, and various kinds of espionage. Since everybody can publish on the Internet, it becomes more difficult to distinguish information we can trust from information that is biased, misleading, or simply erroneous. Electronic commerce flourishes nevertheless, and much valuable information is now available online, so we also benefit greatly from this age of pervasive information.

There are many reasons for the situation we are experiencing. Obviously, there are businesses benefitting from the trove of information they gather, and they have no incentive to refrain from gathering it. But more importantly, there seems to be a general unspoken consensus that technology is overwhelmingly complex, so we give up on trying to understand what needs to be done to fix this.

The problems are hidden behind layers of technical intricacies, making us believe that only highly specialized technical people are able to understand what is going on. As a consequence, policy makers and executives often defer decisions to information technology experts, claiming that the learning curve is much too steep and not worth their time. This is not a healthy situation. This book presents concepts that can help us figure out ways to change it, by providing non-experts with the intellectual tools needed to understand the invisible layers of how information works under the hood.

Several misunderstandings contribute to sending us down the wrong track. Communicating meaningful information between humans using computers seems trivially simple, but is in fact exceedingly difficult. And the reason for that difficulty lies not only in the intricacies of the technologies used, but also in the very nature of communication, which is inherently ambiguous. Despite the constant progress made in artificial intelligence and machine learning, the problem is that computers are not very good at handling ambiguity. Computers mostly require information to be stable, properly structured, and straightforward, and many technology experts make others believe that this is the only way to go. They insist on presenting information in a way that is digestible for computers, even if that means distorting it from what it is supposed to mean, or ignoring the various ways it could be interpreted.

Furthermore, the principles underlying the design of a new information system, more often implicit than explicit, are usually not transmitted to the subsequent generation of users, those who were not present when the system was initiated. These new users therefore tend to use information in ways that differ from the original intent, not because they are mischievous or badly trained, but because these principles have not been expressed clearly. As a result, the meaning of the information conveyed drifts from the original design, and without warning, the information systems become progressively obsolete. Information systems are usually not able to dynamically reconfigure themselves to accommodate new or unheard-of information. And when the situation becomes untenable, the information systems need to be replaced, often at high cost.

Another paradox is the belief that there should be no loss of information between the emitter and the receiver. In fact, it is often desirable to filter out unwanted information in order to focus on what really matters. Specific filters need to be designed in order to prepare information to be received by those at whom it is aimed. The same information sources can be viewed using different filters. The classification levels of information are examples of such filters, hiding or displaying content to selected groups of users depending on their credentials.

This book is called the "Digital Messenger" because it focuses on what happens to information when it gets transmitted electronically. It is not about the technicalities of messaging protocols, but about what information means, and the variety of uses to which it can be put. There is quite a discrepancy between the information we create and the information we receive. In his seminal article on information theory, Claude Shannon writes: "The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem." [4]

The book is made up of three parts: the first describes what goes wrong, the second proposes remedies, and the third presents the state of the art in information management from a conceptual point of view.


  1. RDF Site Summary (RSS) 1.0 

  2. The reason this was considered illegal was that PACER charged 8 cents per page, a fee that Carl Malamud said is open to dispute.

  3. "Journal Storage". JSTOR gives access to many documents that are in the public domain, but charges for them. 

  4. C.E. Shannon, "A Mathematical Theory of Communication", Bell System Technical Journal, vol. 27, pp. 379-423, 623-656, July, October, 1948.