NEH Chairman Bruce Cole talked recently with one of the Fathers of the Internet, Vinton G. Cerf. A recipient of the National Medal of Technology, Cerf is vice president and "chief Internet evangelist" for Google.
Bruce Cole: It's hard to think back before the computer age. The Internet intersects so much in our lives and has changed the way we communicate and the way we do research and the way, actually, we think.
Vinton Cerf: I find it rather interesting to consider the history of human communication.
Remember, we started out with oral traditions. Writing comes along. Writing is a way of preserving information and sharing it with future generations. Then comes printing and the ability to replicate material so that lots of people have access to it.
It's a story you know by heart. What I think happens now is information in text form is transforming in several dimensions. The one that I think is most interesting is that computers can learn to understand text. They do better at that than they do oral exchange. That means that more and more of our exchanges with each other can become archival in value.
Cole: Preserving them is an issue that we are interested in. But, you are right, the record is there.
Cerf: Let me give you an example, though, of a problem that has arisen. In the early stages of the ARPANET development, my friend Steve Crocker initiated a series of documents that he called requests for comments, RFCs. The RFCs were very informal communications, but they document the early thinking of networking.
As the Internet has evolved, more and more of the exchanges have been by e-mail and e-mail is typically not an archival medium. The history of thinking--the problems and the solutions and so on--is less clear than it was when people deliberately wrote something expecting it to become part of an archive. The earliest exchanges may have been composed electronically, but they were distributed on paper. I found this a conundrum.
Cole: That's very interesting.
Cerf: As all of our communication becomes more digital, it doesn't necessarily mean that all of our communication is as coherent. That's why the Google work is so interesting. I don't care what form the text takes. As long as the computer can get access to it, I can try to do some kind of indexing.
Cole: I think about the invention of writing and then manuscripts and the invention of printing and the replication and dissemination of the written word. Then I think about the Internet and I think we are not thinking anymore in this linear way. I am interested in the interactive nature of the Internet, the ability for collaboration and research. We are in a different world.
Cerf: The network allows for a broad range of communication and collaboration and sometimes concurrent, parallel lives. E-mail is one example. Instant messaging is another. In fact, the younger crowd prefers instant messaging to e-mail because they think of e-mail as too slow. You can carry voice over the network, as well, by digitizing it.
I had a conference call yesterday with people scattered all around the world. By joining a chat room and effectively instant messaging as a group I can see people who want to raise their hands to say something, which I could not see in a regular voice conference call.
Second, there can be ideas injected in the written communication that inform and affect the voice communication. You have immediate reference to a document that you might wish to pick up in the course of the conversation. It is not uncommon for people to be interacting using the network within different modalities all concurrently.
We even get one other peculiar phenomenon where there is a group conference call and people are instant messaging in small groups, maybe a pair or two or three people talking about what is going on in the conference call. Sometimes they use that as a way of, let's say, planning tactics.
Cole: Right. I see this on a reduced scale in large meetings with people using their BlackBerry devices to talk to each other across the table and figuring out various strategies. These messages are being sent up to some satellite and then being sent down where people are sitting three feet away from each other.
Cerf: My wife and I found a very peculiar phenomenon. We both use instant messaging and e-mail a lot. When we are at home, we find ourselves continuing our discussions using e-mail, and the reason we do that is that it maintains a record so that we remember where we ended. It's a little weird sitting on sofas facing each other and sending e-mails to each other.
Cole: I am very interested in what I call the digital humanities. I'm interested in using the power of digitizing and the Internet to be able to analyze data to create new questions, which, in turn, will create new knowledge. I think this is the frontier of humanities scholarship and access in the coming years.
Cerf: I think we all agree that getting more information into people's hands has to be a good thing. You probably know that Google has been working with Jim Billington at the Library of Congress on several dimensions.
Cole: We are partnering with the Library in the National Digital Newspaper Program where we are digitizing millions of pages of newspapers. Newspapers are the first draft of our history, and the files are going to be on the Library of Congress Web site. I'm really interested in projects that are born digitally, that use all the wonderful tools that we have now to analyze metadata in new ways. This is going to be very, very exciting. I compare it to the way computation has made the genome project possible. We want to be in the forefront of the digital humanities, and we have been talking to a number of people about this.
I'd like to find out how you got started in this. I read that you read The Boy Scientist when you were young.
Cerf: I was an inveterate reader. I can remember at age eight or so eagerly awaiting the next volume of The Wizard of Oz. Of course, these books had all been published before I had been born, but I didn't have copies in my library and so I got a present from my traveling father who would bring one home for me.
I read a lot and a lot of it was science and science fiction like George Gamow's One Two Three . . . Infinity. From a fairly early stage I was pretty sure I wanted to be a scientist of some kind. I thought I was going to be a nuclear physicist. Later on, I discovered that physics and I didn't get along too well. I drifted off into mathematics and wound up at Stanford studying math and taking every course in computing I possibly could.
I have to confess to a certain amount of envy when I meet kids, you know, eight years old, who are busy building their Web pages on the Internet, and I keep thinking I had to wait until I was twenty-eight because we had to invent it first.
Stanford led to IBM. After two years at IBM, I realized that I needed to go back to school and learn a lot more about computing and computer science than I had gotten as an undergraduate. So I went back to graduate school at UCLA, and, within a year or so, I was introduced to this project that the Defense Department was pursuing, called the ARPANET. It was the first wide-area demonstration of a technology called packet switching. I was mesmerized by the thought of linking computers together over long distances.
Cole: Which is packet switching, right?
Cerf: Yes, packet switching is the technology that drives almost all computer communications today. It's an alternative to the telephone. The telephone does what is called circuit switching. When I dial you, a circuit is set up between the two of us with dedicated capacities so that our voices can be exchanged, and no one else can use that capacity. Even if neither of us is talking, the capacity is dedicated to that conversation until one of us hangs up.
In the packet-switching world, the capacity is shared. These little bursts of data come out, kind of like electronic postcards with a "to" address and a "from" address and they get switched through the system. As soon as a postcard goes by, the circuit is available to other traffic between other parties. It's an extremely dynamic way of sharing communications capacity.
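As a rough illustration of the postcard analogy, the sketch below splits two messages into small addressed packets, interleaves them on one shared link, and reassembles each message at its destination. The names (`Packet`, `packetize`, `share_link`, `reassemble`), the addresses, and the four-character packet size are hypothetical choices made for this example; it is a toy model of capacity sharing, not the ARPANET or Internet protocols themselves.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass(frozen=True)
class Packet:
    src: str      # the "from" address on the postcard
    dst: str      # the "to" address on the postcard
    seq: int      # position of this piece within the original message
    payload: str  # a small burst of data


def packetize(src, dst, message, size=4):
    """Cut a message into addressed packets of at most `size` characters."""
    return [
        Packet(src, dst, seq, message[offset:offset + size])
        for seq, offset in enumerate(range(0, len(message), size))
    ]


def share_link(*flows):
    """Interleave packets from several senders onto one shared link.

    Round-robin is an assumption made for simplicity; the point is only
    that no sender holds the link for the whole conversation.
    """
    link = []
    for slot in zip_longest(*flows):
        link.extend(p for p in slot if p is not None)
    return link


def reassemble(link, dst):
    """Collect the packets addressed to `dst` and rebuild each sender's message."""
    by_sender = {}
    for packet in link:
        if packet.dst == dst:
            by_sender.setdefault(packet.src, []).append(packet)
    return {
        src: "".join(p.payload for p in sorted(packets, key=lambda p: p.seq))
        for src, packets in by_sender.items()
    }


flow_a = packetize("alice", "bob", "hello bob")
flow_b = packetize("carol", "dave", "hi dave, it's carol")
link = share_link(flow_a, flow_b)
print(reassemble(link, "bob"))
print(reassemble(link, "dave"))
```

In a circuit-switched model, by contrast, the link would be reserved for one pair of parties until they hung up, even during silence.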
After I finished my Ph.D. at UCLA, I went up to Stanford and I worked with Bob Kahn, who was at ARPA, the Advanced Research Projects Agency. Kahn realized that if the military was going to make use of these networking ideas, it would need to have computers in the field as well as on board ships. These couldn't be interconnected by wires because they were mobile systems--tanks and the like. You had to use radio.
We had to figure out how to get them all interconnected to make this computer networking idea useful to the military. Bob and I designed the Internet, basically, in 1973 to try to solve that problem.
Cole: I know you are often called "The Father of the Internet."
Cerf: It's a misnomer for any one person to be given that label. Many people contributed to the ARPANET and then, of course, many contributed subsequently to the creation of the Internet.
Cole: When you were talking about the need for these various kinds of communications, were you already envisioning how the Internet was going to change everything?
Cerf: I have to admit that we certainly didn't quite have the vision of what the Internet is today. But it is fair to say that the research programs that ARPA was funding revealed very much, actually, about what the world could be like. It was a man named Douglas Engelbart, at SRI International, who was credited with inventing the mouse, who also invented a system called the Online System that had hyperlinking in it, which is a key element of the World Wide Web. He promoted the use of portrait format displays as opposed to landscape format and black-on-white presentation. This was back in the mid to late 1960s.
On top of that, Xerox Palo Alto Research Center was formed in 1972, and out of that came the Ethernet that Bob Metcalfe invented and something called the Alto, which is for all practical purposes the first personal computer. They cost fifty thousand dollars each. There were two hundred and fifty people in the lab; every one of them had one of these Alto personal computers connected to what was then a three-megabit Ethernet. They were working with black-on-white displays, portrait-mode displays, using Windows-like presentations that Alan Kay came up with. They were working in 1972 in the world of 1992.
A lot of us could see much of the potential. The problem was that nobody could afford to do it. It was not until people like Steve Jobs came up with ways to decrease the cost of personal computing, and networking became more widespread and cheaper, that many of the manifestations that we see today were possible. After Tim Berners-Lee does the original work in 1989, the World Wide Web shows up in the form of Netscape Communications, and the world changes in 1994.
Cole: I want to talk to you a little bit about some of the downsides of the Internet. But let me just pause for a minute. We have talked about young people a couple of times. I have just finished Steven Johnson's book Everything Bad Is Good for You.
Johnson talks about, basically, games and the Internet. His idea is that these are very complex, demanding games that create new kinds of cognition and skills and that it's not entirely bad. While it doesn't make you a better person or more moral, it does create a kind of complexity and sophistication of thought that is really good. It's a clever book. He also extends this to more complex television plots and movie plots. He talks about how the mind really wants to be challenged.
Cerf: The theme that you are suggesting reminds me of some descriptions I heard of the younger set, let's say in the twelve to twenty range. You come home and you see them with ten or eleven instant messaging windows running. They've got Google searches going because they are doing research for their homework. There is television playing in the background. They have got a headset on, and they are listening to downloaded MP3s.
Cole: That's exactly what he says, yes. They are multitasking, multiplexing.
Cerf: Exactly. So, it may very well be that the games that are being offered are deliberately designed around complexity and that these folks like the multitasking environment.
Cole: He's careful to say that there is a downside to these games as well. As I said, they don't make you a better person or more moral or lucid and the like, but he's not quick to dismiss them. It's counterintuitive and a very interesting book.
Cerf: I'll make a point of getting ahold of a copy.
Cole: I love the Internet and I think it's absolutely revolutionary, but for many people, unless you know how to use the Internet and you have some discernment, it's a thousand miles wide and sometimes an inch deep. How do we get people to be intelligent users of the Internet?
Cerf: Let me suggest a couple of things. First of all, it's clear that when Tim Berners-Lee launched the World Wide Web and Netscape Communications popularized it, there was a rapid growth in knowledge about how to create Web pages. Everybody showed everyone else how to do HTML.
Even better, the browser has a little tool called "Show Source." If you liked what you saw, you could ask the browser to show you how that Web page was composed and then you could go off and edit it and do your own. So there was this rapid proliferation of ability to produce Web-based content that led to an avalanche of content on the Net. A side effect of that is that some of it was absolutely worthless, some of it was incredibly valuable, and everything in between.
The first challenge, in my view--and maybe this is your thousand-miles-wide and an inch-deep observation--is figuring out what information on the network is actually useful and valuable--and credible--and what information is not. We have got to teach people to expect that not everything is on the network. If you do a Google search or some other search on the Net, you're not guaranteed that you've done a thorough search. We also have to get them to think critically about the material that they're getting because it may not be of equal quality everywhere. Teaching people to expect that there will be that variation and to understand how to make the evaluation is very important. We have to do that for all the media that presents us with information, whether it's television, radio, newspapers, magazines, or a friend.
It seems to me that the Internet poses the challenge which we probably should have taken up more thoroughly in the older media. Otherwise, you make the mistake of assuming that everything you read on the network is correct, and, of course, we know it isn't.
Cole: I guess it's the ability to access so much material. If you think about looking things up in books, you'd have to go from one book to another. You have to have some kind of target. On the Internet, your net is a lot wider and you drag in a lot of other things, which is often good.
Cerf: Part of our job at Google has been to try to find ways of helping people find the relevant information in a particular query. One of the things that excites me is this book search effort that is under way to try to capture the contents of books--not necessarily to present a book in online form, but to help people discover which books have content that they care about.
Cole: That's wonderful.
Cerf: I stand in the middle of my personal library trying to remember which book it is that had a particular phrase and I have no way of physically going through these books in the library to figure out which one it was. Even if I couldn't read them online--I mean we're concerned about intellectual property protection, as we should be--but even if I couldn't read them online, if I could just find which one it was and what page it was on, I would be happy to go turn to my personal library or the public library or the bookstore to get the copy of the book. Having things online is vitally important for making knowledge accessible to everybody.
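The "which book, which page" lookup described here can be sketched with a small inverted index: each word maps to the set of (title, page) pairs on which it appears, and a phrase query intersects those sets before confirming that the words occur in order. The function names, the sample titles, and the word-splitting rule are all hypothetical choices for this illustration; this is not a description of how Google's book search is implemented.

```python
import re
from collections import defaultdict


def words(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(library):
    """Map each word to the set of (title, page_number) pairs containing it.

    `library` is a dict of title -> list of page texts; pages are numbered from 1.
    """
    index = defaultdict(set)
    for title, pages in library.items():
        for page_no, text in enumerate(pages, start=1):
            for word in words(text):
                index[word].add((title, page_no))
    return index


def find_phrase(library, index, phrase):
    """Return the sorted (title, page_number) pairs where the phrase appears."""
    target = words(phrase)
    if not target:
        return []
    # Pages that contain every word of the phrase, in any order.
    candidates = set.intersection(*(index.get(w, set()) for w in target))
    hits = []
    n = len(target)
    for title, page_no in sorted(candidates):
        page_words = words(library[title][page_no - 1])
        # Keep only pages where the words are adjacent and in order.
        if any(page_words[i:i + n] == target
               for i in range(len(page_words) - n + 1)):
            hits.append((title, page_no))
    return hits


# Made-up sample library for the example.
library = {
    "Book A": ["The quick brown fox.", "Packets travel like postcards."],
    "Book B": ["Postcards from the edge.", "Like packets, postcards travel."],
}
index = build_index(library)
print(find_phrase(library, index, "like postcards"))
```

The query returns only a location, so the reader can then pull the physical book off the shelf; nothing in the sketch requires showing the full text online.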
Cole: I take very seriously our role of dissemination of the humanities and the democratization of knowledge. Take something like our presidential papers, which I hope will be all digitized as quickly as possible, or our digital newspaper program. I think about not only scholars using that, but an eleventh grader doing a paper, or a newspaper reporter using it. About seventy million pages of newspapers have been microfilmed, but the sheer size of the material was a barrier. Now, with the word-searching ability of digitizing, you will be able to scan across this massive amount of material.
These newspapers will be displayed with optical character recognition so you see the whole page. You not only find what you are looking for, but you also serendipitously see all sorts of other things like advertising and editorials and columns and the like. That to me is not only an access issue, but it's also the ability to analyze this huge amount of important data, as I said, and to create new questions which will, in turn, lead to new knowledge. That to me is one of the most exciting things going on in the humanities today.
Cerf: I fully subscribe to the perspective that you're suggesting. I get excited about these efforts to digitize information retrospectively. It's an opportunity to preserve history, language, and culture in a way that is more accessible. And we're not limited to textual material. We can be recording sound, we can be recording images and so on. People who worry that the Internet will somehow homogenize our social structure, I think, are missing the point. In flipping through newspapers or old magazines, the articles are of interest historically, but so are the advertisements. They tell us something about what life was like back in that period.
So I get very excited about making sure we capture all of it, including the commercial material. It may be that we have to show contemporary advertising in order to make all this stuff pay for itself, but I wouldn't want to lose the understanding that commercial speech gives us about what day-to-day life might have been like a hundred years ago or two hundred years ago.
Cole: Besides working with the Library of Congress on the newspapers, we are working with the National Science Foundation to document endangered languages. There is a very heavy digital component to this--the preservation of the language digitally, and also the creation of a sound vocabulary. The rescue of these languages, I think, is of key importance. They are really the DNA of civilization.
Cerf: I love the idea of trying to create a framework in which new material is properly structured so that we don't have to go through all the pain and agony of a retrospective process to get it in place.
Cole: Instead of putting it in an archive eventually, you build into it all the tools and analytical equipment as you are doing it so you have that ability to begin with. It seems to me that this is really the leading edge. To be able to provide geographic information systems and census data and to layer these things and to not even think of it as linearly constructed and to have this kind of plasticity of knowledge and interactivity, it just seems to me to be tremendously exciting.
Cerf: Let me suggest that there are some real problems here which need to be addressed. First of all, I'm sure that people have come to you at one time or another waving CDs and saying, look how much material I can store on this thing.
One reaction you might have is, well, that's all fine, but how long will it last? It isn't just the medium. It's also the format and the ability to read it. I've been confronted by librarians who look at these things with a jaundiced eye and go out into the stacks and come back with this 1000 AD vellum manuscript which is still readable and magnificently illustrated and they say, "now tell me how long this little CD is going to last." I worry about that. A digital program needs to factor in the physics of all of it: the changing physical capabilities, the changing mechanical readability.
Cole: This is an interoperability problem, right?
Cerf: You have a serious problem in moving the archive from one technology to another. You have to factor that in. Worse, you also have to worry about software that is capable of interpreting the digital material. Let's say somebody hands you a document that was written in 1985 with a word-processing program that is no longer in use. You either have to move the documents into new formats or preserve the software and the operating systems under which they worked.
Brewster Kahle, for example, is doing the Internet Archive in San Francisco. What programs, what browsers, were used to view it and are they still compatible or not? This is not an easy matter.
Cole: I know the British Library is very concerned with collecting original digital manuscripts, writers' manuscripts. I've read somewhere they have been going out and finding old Osborne computers and the like. Ultimately, I think that people like you are going to figure this out.
Cerf: I hope someday we really get down deep into it. One thing I would suggest is that visual presentations may turn out to be very, very powerful tools for commonality over long, long periods of time. That which we can read, as human beings, may also be readable by computer. Optical character recognition may turn out to be an important tool for maintaining our ability to archive.
The second observation I make is that paper is actually pretty good stuff when it comes to longevity. Imagine putting digital information in digital form on some kind of printed structure and then being able to scan that. You've seen bar codes. There are two-dimensional bar codes that are quite advanced in their ability to record digital material. You could imagine taking material that's to be archived, having bitmap storage of it so that it's visually presentable and could be rescanned in that form, and also having a digital representation of it in, for example, XML, as an alternative. Even if you no longer could read a CD, if you could get the presentation on paper in digital form and then scan it again digitally and interpret the digital information, you could map it into a more modern storage medium.
Cole: A kind of matrix, right?
Cerf: Exactly. So there may very well be a role, oddly enough, for paper in the digital world as a long-term storage medium. I know that the density of paper is not as good as the density of electronic media. But if we can't find a way of mapping from older media to new ones, it could be that printed materials could be a backup.
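The paper-as-backup idea can be sketched as a round trip: turn a document's bytes into a printable grid of dark and light cells, then read the grid back and check that nothing was damaged. The grid layout, the `#` and `.` cell symbols, the 32-cell row width, and the length-plus-checksum header below are all assumptions invented for this illustration; real two-dimensional bar codes use their own standardized layouts and much stronger error correction.

```python
import zlib


def to_grid(data: bytes, width: int = 32) -> list[str]:
    """Render bytes as rows of '#' (1) and '.' (0) cells suitable for printing."""
    # Hypothetical header: 4-byte length, then 4-byte CRC-32 of the data.
    framed = (len(data).to_bytes(4, "big")
              + zlib.crc32(data).to_bytes(4, "big")
              + data)
    bits = "".join(f"{byte:08b}" for byte in framed)
    bits += "0" * (-len(bits) % width)  # pad the last row with light cells
    cells = bits.replace("1", "#").replace("0", ".")
    return [cells[i:i + width] for i in range(0, len(cells), width)]


def from_grid(rows: list[str]) -> bytes:
    """Recover the original bytes from a scanned grid, verifying the checksum."""
    bits = "".join(rows).replace("#", "1").replace(".", "0")
    usable = len(bits) - len(bits) % 8
    raw = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    length = int.from_bytes(raw[:4], "big")
    checksum = int.from_bytes(raw[4:8], "big")
    data = raw[8:8 + length]
    if len(data) != length or zlib.crc32(data) != checksum:
        raise ValueError("scanned grid does not match its stored checksum")
    return data


document = b"<doc>archived text</doc>"
rows = to_grid(document)
for row in rows:
    print(row)
print(from_grid(rows) == document)
```

The density trade-off mentioned above shows up directly: 24 bytes of text take eight printed rows here, which is the price paid for a medium that can be rescanned long after the original reader hardware is gone.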
Cole: I'm interested in formulating some ideas which would encourage the collaboration of engineers and computer scientists and scholars. The potential for a collaborative effort in both research and dissemination seems important to me. We want to encourage that kind of forward thinking.
Of course, one of the problems is that digital humanities research on campuses has not been encouraged as much as it could be. How do online publications factor in tenure decisions and the like? Providing some support for that, getting the Endowment's imprimatur on that, I think, would be important.
So many other fields of inquiry are really ahead of the curve. How are the humanities doing?
Cerf: You mentioned the word "tenure," and one of the things that continues to dog this process is the focus on published papers in printed journals. There is a kind of disdain for anything that is published online even though, oddly enough, the same quality of filtering, evaluation, review, and the like can be applied to a paper that is published online as can be applied to one which is printed on paper.
It seems to me that the Endowment could foster the creation of online journals with the same quality of review and therefore the same authority that the printed journals have. I'm sure that will cause some dislocations here and there, but I've become increasingly unhappy about the cost of the physical journals and the practice which, at least in my disciplines, has the authors paying page charges to get their papers published and then has libraries paying incredibly high fees for access to the printed journals, thereby limiting access by the researchers to this material, which is, in a sense, impeding research progress.
We see prepublication distribution online as an alternative and a faster alternative than waiting for the printed journals. The problem is that when tenure decisions come up, it seems the referencing of the printed journal is a necessity. I would love to see a change in attitude about that.
Cole: Do you think it's just generational? If you do an article online, you get comments, corrections, suggestions from all over the world. The whole interactive nature of the work adds a dimension that you don't find in, say, print. Why is that?
Cerf: I'm not sure, to be dead honest with you, because it depends on what the evaluation practices are at the universities on tenure. Maybe it would be worthwhile having a poll to find out what are acceptable measures of performance in the academic realm and whether or not anyone gets any credit for online publications.
Cole: I know this is definitely an issue.
Cerf: This whole thing is bound up in economics. I think that if there were a few transforming examples of online publication where the economics are so different, we might actually start an avalanche.
I have been a longtime member of the Association for Computing Machinery, and if I want an online version of their printed journals, I have to pay a hundred dollars a year extra to do that. My reactions are probably not printable. They should be "born digital," to use your phrase, in the first place.
Cole: You were talking about the democratization of knowledge. You indicate that there is maybe more of a cultural divide in the print world than in the Internet world. Is the Internet giving more people access to history? To culture?
Cerf: I believe that the Internet offers that opportunity. It takes some effort, of course, to make the materials available, but I see an increasing number of retrospective projects to try to bring to the online world that which up to now has not been available conveniently. So that I'm excited about.
I also think that it's a way of preserving cultural artifacts that otherwise might be lost. I remember being at the National Museum of American Art back in the Clinton Administration, and the move to show some art that had been at the White House to a larger audience. At MCI, where I was at the time, we sponsored a Web site that had information about all of these beautiful art objects. We got three-dimensional views of them. We had interviews with the artists. We had a weekly clip showing the latest artistic efforts by that artist and how a particular piece was evolving. It was quite elaborate. In the course of the proceedings, there was a question period and some guy jumped up and said that it was terrible that we were putting all this stuff up on the Web and that it was somehow destroying the important tactile and visual experience you get by going into the museum. I remember thinking, where is this guy coming from? I wanted to respond but I didn't have to because one of the artists was there and he got up and he said, "Listen, you idiot. Anything that gets people interested in art is a good thing."
Cole: Absolutely. It's like one of the purists who say you should never record any music. You have to see it in person. If you record it, it's ossified.
I see, by the way, at Google you have an intriguing new title, "chief Internet evangelist for Google." What does the chief evangelist do?
Cerf: First of all, that wasn't the title I chose. That was something that Larry Page and Sergey Brin and Eric Schmidt thought was an accurate description of what I had been doing for the last thirty years. Originally, when they said, what title do you want, I said, "How about archduke?" (Laughter.)
Cole: I like that.
Cerf: That didn't work out, so I'm the chief evangelist.
I have three roles in the title. One of them is public outreach. The second part is to be on the lookout for new technologies and maybe even new companies that could improve Google's ability to deliver new products and services to the people who use it. Third, I make it a point to get around to all of our engineering offices, which are at this point starting to pop up all around the world.
We're taking advantage wherever we can of the smart people in the world who are interested in making information available and accessible. We are particularly conscious of the fact that cultures and languages vary and we want technical expertise which is embedded in those cultures to help us in this work.
I play the role of intellectual bumblebee, trying to share information and make sure people are aware of what other people are doing. It's a full-time job and a half, and I love it.
Cole: What's the most exciting thing the Internet has delivered in your long association with it?
Cerf: To be fair, I think the World Wide Web, at least up to now, is the most exciting thing that's happened. It's unleashed such an avalanche of sharing of human knowledge. The fact that it has varied in quality from absolute nonsense and junk to unbelievably good quality is part of the side effect of allowing everyone to make a contribution. That part has been very exciting.
The other part, of course, is search engines like Google which have made it possible to at least search through all that material, to find things that you think are relevant.
My guess is that, honestly, 99 percent of all the really cool applications on the Internet haven't been invented yet. They are being invented as we speak by young twelve- or thirteen-year-olds.
The edge of the Net is essentially open to anything you want to try. So for me it's not just the unleashing of the content, but it is the unleashing of human creativity to try things out.
Cole: This has been fascinating. Thank you.
Cerf: I've enjoyed it.