Google digitizing all books

Tue Mar 10 03:16:35 UTC 2009

There's a contradiction here. On one hand:

> The copyright and patent systems are part of many ways our society  
> offers authors and inventors an incentive to produce while also  
> pursuing the public good of making the information available to  
> society as a whole.

And on the other:

> If Google is really doing society such a big favor by making all this  
> information available, they should do it on a nonprofit basis. If they  
> stole it for nothing, they should give to their customers for nothing.

The first argument tells us that the public good is represented by  
granting commercial privilege to the distributor of some quantity of  
information, but the second tells us that public good only arises where  
a different distributor is denied its commercial privileges. What  
gives?

Second, at what point did finding tools cease to qualify as public  
good?  I myself use the ordinary Google search engine at least a dozen  
times a day -- without question the most powerful research tool I use  
-- and yet I have never paid a dime in user fees (I'm aware that the  
operating costs for search vendors like Google and Yahoo are hidden in  
the advertising expenses of the companies who do sell me stuff, but how  
is that new?). How many people on this listserv have ever mailed a  
cheque to Google or Yahoo for research services rendered? How much  
would you have to pay a grad student to collect the same information  
manually?

And back to that matter of profit: what cosmic law decrees that public  
good must be achieved "on a nonprofit basis"? You may disagree with me  
on this, but I'd argue that the explosion of accessible information  
that has been occurring over the last thirty-odd years has been the  
leading edge of a renaissance that has yet to peak. As Paul has said,  
there will always be plenty of information ("reproduced indefinitely at  
virtually no cost"), but the sheer level of precision with which we can  
use that information is completely unprecedented, and it's  
accelerating. Incidentally, ask any visitor to the Russian archives how  
useful a colossal repository of information is if you can't access it.  
Probably the single largest contributor to that explosion in  
information accessibility is the internet search engine, and in  
particular, Google. How odd, then, that Google has been turning a  
profit all along.

Third, which information is Google really selling? Here's the first  
sentence from the Google Library Book Search intro page:

	We're working with several major libraries to include their
	collections in Google Book Search and, like a card catalog, show
	users information about the book, and in many cases, a few
	snippets – a few sentences to display the search term in context.

How is this different from any number of bibliographic search tools  
that have been in production for centuries? The information trading  
hands is not the content of the work in question, but rather a  
*reference* between the information input by a search user and the  
information catalogued in a library. The actual information exchanged  
is not the copyrighted work, "snippets" notwithstanding.

Lastly, the question was raised in another message whether Google isn't  
building an information monopoly. No doubt Google is building the sort  
of information infrastructure that would give it an advantageous  
position for exerting a lot of force over some sort of eBook pricing  
scheme, but a monopoly is not really possible when you can stroll down  
to your local library and check the card catalogue for free. It's also  
worth noting that academic database vendors have been charging  
considerable fees to search journals for years, and the uproar is yet  
to come. Besides, Google (and a great many others) are on to something  
much larger than the ability to search for a book, fee or no fee (more  
on that shortly).

But one more thing to consider with respect to the prospect of  
monopoly: even now Google releases a staggering amount of its own  
intellectual property for free use. Not only can you use the search  
engine without any sort of licence cost, but Google offers a  
considerable number of programming tools to repurpose its data as you  
see fit. Here, for instance, is the full documentation on "stealing"  
the functionality of the maps service:

http://code.google.com/apis/maps/documentation/

And here is a page where Google has gone to the trouble of compiling as  
much historical stock price information on IBM as you care to view,  
complete with a download option so you can keep the data for yourself,  
permanently.

http://www.google.com/finance/historical?q=NYSE:IBM

By the way, banks and investment counselling services still charge a  
lot of money for information of this quality. Make no mistake, there  
was a cost to both compiling the data and writing the software to  
deliver it. When you include the rest of Google's many projects (GMail,  
Android, image analysis), that figure easily reaches into the hundreds  
of millions of dollars, probably more. Yet no user fee.

The mechanism behind Google's success at this has been slowly rolling  
right over the traditional publishing model for a decade or so and has  
already had a vast effect on the future of research and academic  
communication. This change started in the software industry and is now  
just beginning to hit the more "artistic" cultural spheres. Paul was  
close to it when he said:

> Many people fall into the trap of thinking that because information is  
> intangible, it is worthless. Nothing could be further from the truth.  
> If you are about to jump off a cliff and I tell you that you will die  
> as a result, how much is that worth to you? Perhaps I should make the  
> information a gift on ethical grounds, but that still doesn't make it  
> worthless, as evidenced by the gratitude most people feel when someone  
> saves their lives. What makes information so different from tangible  
> goods is that it can be reproduced indefinitely at virtually no cost.  
> But here, too, we face a logical trap: the /value/ of a thing is not  
> the same as its /cost/ of production. The value of a thing is what  
> people will pay for it.

But that's not quite it, because if the value of information is what we  
pay for it, then where are those user fees? Shouldn't a one-to-one  
relationship between data and fees be the most efficient model?  
Wouldn't ProQuest be a multi-billion-dollar juggernaut rather than  
Google? Why is Google (and Sun, and dozens of free software vendors)  
giving away "intellectual property", and what does Google really have  
if it's only repackaging information "owned" by others?

So let's ask Paul's question differently: imagine I tell you that you  
will die if you jump, and you jump anyway. What was that information  
worth? It turns out the value of information is based on how that  
information is used. Again, that's what makes archives so problematic:  
the simple existence of information is irrelevant if the information  
cannot be applied.

That's the key to what Google is all about. Ironically, Google's most  
valuable asset is that empty search box where we type all of our  
requests, and our constant willingness to tell the box what we want to  
know. I'd hazard a guess that Google dedicates 10 times more resources  
to analyzing and cataloguing what you type into that box than they do  
to analyzing digital books, because the data passed through that search  
box is critical to understanding how the information in the repository  
will be applied, and therefore to analyzing its value. It seems that  
information is so valuable that Google can subsist primarily on selling  
summaries of it to advertisers, and thus not even charge the user a  
cent.

Which brings us back to the amazing social contract that could one day  
replace the concept of copyright. In the past, "artistic production"  
has been a one-way street, with a constant flow of information traffic  
from authors/artists/creators to audiences. But now there is a strong  
precedent in traditionally "non-artistic" (though, having written a  
program or two, I'd debate the point) technical disciplines for a  
system that reverses the information flow and vastly increases its  
value. There's some sort of lesson in that change. As cultural  
"producers" we stand to gain immensely from learning it, intellectually  
and commercially.

Kevin Bray

On 09 Mar 2009, at 00:09, Paul B. Gallagher wrote:
>
> There are obviously public goods to be had from making information  
> available -- most obviously because others can build on it. But if all  
> information were freely available, there would be less incentive for  
> people to invest their time and money in developing it. You may be an  
> altruist who freely gives his property away, but that doesn't mean  
> others should do so as well. A gift is something given freely and  
> willingly, not something taken against the owner's will. And at least  
> some of what Google proposes is just that -- theft. It doesn't matter  
> whether they make profit out of it -- if I take your Rembrandt off  
> your wall and hang it on my own, never to earn a penny for me, it is  
> still theft because I took your property against your will.
>
> The copyright and patent systems are part of many ways our society  
> offers authors and inventors an incentive to produce while also  
> pursuing the public good of making the information available to  
> society as a whole. It's a balance between the rights of the  
> individual and the good of society. If society values a work so highly  
> that it is willing to steal it, well then the author/inventor has  
> produced something so valuable that s/he deserves to be fairly  
> rewarded.
>
> Many people fall into the trap of thinking that because information is  
> intangible, it is worthless. Nothing could be further from the truth.  
> If you are about to jump off a cliff and I tell you that you will die  
> as a result, how much is that worth to you? Perhaps I should make the  
> information a gift on ethical grounds, but that still doesn't make it  
> worthless, as evidenced by the gratitude most people feel when someone  
> saves their lives. What makes information so different from tangible  
> goods is that it can be reproduced indefinitely at virtually no cost.  
> But here, too, we face a logical trap: the /value/ of a thing is not  
> the same as its /cost/ of production. The value of a thing is what  
> people will pay for it. This is стоимость in the sense of ценность,  
> not стоимость производства.
>
> If Google is really doing society such a big favor by making all this  
> information available, they should do it on a nonprofit basis. If they  
> stole it for nothing, they should give to their customers for nothing.  
> And I'll bet you that kind of ROI would put a stop to it in a New York  
> minute.
>
> --  
> War doesn't determine who's right, just who's left.
> --
> Paul B. Gallagher
> pbg translations, inc.
> "Russian Translations That Read Like Originals"
> http://pbg-translations.com
>
> -------------------------------------------------------------------------
> Use your web browser to search the archives, control your subscription
>  options, and more.  Visit and bookmark the SEELANGS Web Interface at:
>                    http://seelangs.home.comcast.net/
> -------------------------------------------------------------------------
