Google digitizing all books

Tue Mar 10 05:20:42 UTC 2009

Kevin Bray makes several valuable contributions to our discussion:

> There's a contradiction here. On one hand:

[quoting me without attribution:]
>> The copyright and patent systems are part of many ways our society
>> offers authors and inventors an incentive to produce while also 
>> pursuing the public good of making the information available to 
>> society as a whole.
> 
> And on the other:
[quoting me without attribution:]
>> If Google is really doing society such a big favor by making all
>> this information available, they should do it on a nonprofit basis.
>> If they stole it for nothing, they should give it to their customers
>> for nothing.
> 
> The first argument tells us that the public good is represented by 
> granting commercial privilege to the distributor of some quantity of
> information, but the second tells us that public good only arises
> where a different distributor is denied its commercial privileges.
> What gives?

You're right to point out the inconsistency. I don't really want Google 
stealing our works and giving them away for nothing. That was a 
sarcastic way of saying, "if they're so all-fired proud of their 
altruistic contribution to the public good, then isn't it hypocritical of
them to be making a profit, and doesn't that prove that they're not 
really being so altruistic after all?"

> Second, at what point did finding tools cease to qualify as public 
> good? I myself use the ordinary Google search engine at least a dozen
> times a day -- without question the most powerful research tool I
> use -- and yet I have never paid a dime in user fees (I'm aware that
> the operating costs for search vendors like Google and Yahoo are
> hidden in the advertising expenses of the companies who do sell me
> stuff, but how is that new?). How many people on this listserv have
> ever mailed a cheque to Google or Yahoo for research services
> rendered? How much would you have to pay a grad student to collect
> the same information manually?

Far too much, of course.

A traditional business model is based on the sale of goods, information, 
etc., where revenue is derived from user fees, sales proceeds, etc. 
Google's model doesn't work that way, but it works nevertheless because 
Google's income (from the sale of ads) exceeds its expenses (from 
acquiring information, maintaining its servers, etc.). There are two 
21st-century innovations here: 1) Google pays next to nothing to acquire
its product, and 2) it derives revenue almost entirely from the sale of 
advertising and not from the sale of the product itself. The latter is 
not really an innovation -- there have been lots of examples of this 
type of behavior in publishing before -- but Google has taken the 
principle to an extreme. And the former is also not an entirely new 
concept -- think of the various extractive industries that go out and 
harvest trees, mine gold, etc. for a pittance; colonialism was built 
largely on such theft.

> And back to that matter of profit: what cosmic law decrees that
> public good must be achieved "on a nonprofit basis"? You may disagree
> with me on this, but I'd argue that the explosion of accessible
> information that has been occurring over the last thirty-odd years
> has been the leading edge of a renaissance that has yet to peak. As
> Paul has said, there will always be plenty of information
> ("reproduced indefinitely at virtually no cost"), but the sheer level
> of precision with which we can use that information is completely
> unprecedented, and it's accelerating. Incidentally, ask any visitor
> to the Russian archives how useful a colossal repository of
> information is if you can't access it. Probably the single largest
> contributor to that explosion in information accessibility is the
> internet search engine, and in particular, Google. How odd, then,
> that Google has been turning a profit all along.

I would never deny that Google provides a useful service; I use it on an 
hourly basis and would be severely handicapped without it or other 
engines like it. The issue is not whether they are entitled to provide 
information that the owners have placed in the public domain, but 
whether they are entitled to take information against the owners' will 
and publish that for profit as well.

> Third, which information is Google really selling? Here's the first 
> sentence from the Google Library Book Search intro page:
> 
> 	We're working with several major libraries to include their 
> 	collections in Google Book Search and, like a card catalog, 
> 	show users information about the book, and in many cases, 
> 	a few snippets – a few sentences to display the search term 
> 	in context.
> 
> How is this different from any number of bibliographic search tools 
> that have been in production for centuries? The information trading 
> hands is not the content of the work in question, but rather a 
> *reference* between the information input by a search user and the 
> information catalogued in a library. The actual information exchanged
> is not the copyrighted work, "snippets" notwithstanding.

Surely the intent of Google's statement here is to claim that they are 
remaining within the bounds of "fair use" -- the objective of which (as 
a matter of public policy) is to tease the user into buying the work if 
he finds it interesting/useful enough. But what Google is actually 
proposing to do, publishing entire works, is not fair use, not even 
close. Obviously, certain works (those in the public domain, public 
records such as laws, etc.) are excluded from the restriction, but if I 
write a new symphony, Google has no right to take and publish it without 
my permission, and I will not grant that permission without fair 
compensation.

Now, if the government should decide that there is a public good in 
publishing my symphony, they are certainly welcome to pay my fee, too. 
But if they decide to take it as some kind of perverted "eminent
domain," you can bet they'll hear from my attorneys and those of 
everyone else who thinks he's next in line.

> Lastly, the question was raised in another message whether Google
> isn't building an information monopoly. No doubt Google is building
> the sort of information infrastructure that would give it an
> advantageous position for exerting a lot of force over some sort of
> eBook pricing scheme, but a monopoly is not really possible when you
> can stroll down to your local library and check the card catalogue
> for free. It's also worth noting that academic database vendors have
> been charging considerable fees to search journals for years, and the
> uproar is yet to come. Besides, Google (and a great many others) are
> on to something much larger than the ability to search for a book,
> fee or no fee (more on that shortly).
> 
> But one more thing to consider with respect to the prospect of 
> monopoly: even now Google releases a staggering amount of its own 
> intellectual property for free use. Not only can you use the search
> engine without any sort of licence cost, but Google offers a 
> considerable number of programming tools to repurpose its data as you
> see fit. Here, for instance, is the full documentation on "stealing"
> the functionality of the maps service:
> 
> <http://code.google.com/apis/maps/documentation/>
> 
> And here is a page where Google has gone to the trouble of compiling
> as much historical stock price information on IBM as you care to
> view, complete with a download option so you can keep the data for
> yourself, permanently.
> 
> <http://www.google.com/finance/historical?q=NYSE:IBM>

As I said previously, it's not stealing if the owner gives it away. You 
can look it up. Or ask Abbie Hoffman. ;-)

> By the way, banks and investment counselling services still charge a
> lot of money for information of this quality. Make no mistake, there
> was a cost to both compiling the data and writing the software to 
> deliver it. When you include the rest of Google's many projects
> (GMail, Android, image analysis), that figure easily reaches into the
> hundreds of millions of dollars, probably more. Yet no user fee.

Banks and investment companies are in business to make a profit, too, 
and if they can find suckers who will pay for information instead of 
getting it free on Google, well, they're entitled to do so.

> The mechanism behind Google's success at this has been slowly rolling
> right over the traditional publishing model for a decade or so and
> has already had a vast effect on the future of research and academic
> communication. This change started in the software industry and is
> now just beginning to hit the more "artistic" cultural spheres. Paul
> was close to it when he said:
> 
>> Many people fall into the trap of thinking that because information
>> is intangible, it is worthless. Nothing could be further from the
>> truth. If you are about to jump off a cliff and I tell you that
>> you will die as a result, how much is that worth to you? Perhaps I
>> should make the information a gift on ethical grounds, but that
>> still doesn't make it worthless, as evidenced by the gratitude
>> most people feel when someone saves their lives. What makes
>> information so different from tangible goods is that it can be
>> reproduced indefinitely at virtually no cost. But here, too, we
>> face a logical trap: the /value/ of a thing is not the same as its
>> /cost/ of production. The value of a thing is what people will pay
>> for it.
> 
> But that's not quite it, because if the value of information is what
> we pay for it, then where are those user fees? Shouldn't a one-to-one
> relationship between data and fees be the most efficient model? 
> Wouldn't ProQuest be a multi-billion-dollar juggernaut rather than 
> Google? Why is Google (and Sun, and dozens of free software vendors)
> giving away "intellectual property", and what does Google really
> have if it's only repackaging information "owned" by others?

Consider the case of Manhattan Island. What was it worth to the Lenape 
Indians, and what was it worth to the Dutch settlers? Sixty guilders 
seems low, but that's because we had an ignorant seller who didn't 
realize what the thing was worth (there were also questions of clear 
title, but let's ignore those). I suspect the price would've been a lot 
higher if the seller had done due diligence, had an army to enforce its
rights, and so forth.

Similarly, the same piece of information can have different values to 
different buyers. I don't much care what your platelet count is, but you 
might be willing to pay money for it. Some people would see a yellowish 
piece of rock and throw it away, while others would pay hundreds of 
dollars an ounce for the gold it contains. And so forth. Market value is 
what you can get for it, and it only takes one generous buyer to make 
the thing valuable.

> So let's ask Paul's question differently: imagine I tell you that you
> will die if you jump, and you jump anyway. What was that information
> worth? It turns out the value of information is based in how that 
> information is used. Again, that's what makes archives so
> problematic: the simple existence of information is irrelevant if
> the information cannot be applied.

This is a good example of what I meant -- if the buyer doesn't realize
the thing's value, he won't pay. And you're right that application has 
an awful lot to do with value.

> That's the key to what Google is all about. Ironically, Google's most
> valuable asset is that empty search box where we type all of our 
> requests, and our constant willingness to tell the box what we want
> to know. I'd hazard a guess that Google dedicates 10 times more
> resources to analyzing and cataloguing what you type into that box
> than they do to analyzing digital books, because the data passed
> through that search box is critical to understanding how the
> information in the repository will be applied, and therefore to
> analyzing its value. It seems that information is so valuable that
> Google can subsist primarily on selling summaries of it to
> advertisers, and thus not even charge a cent from the user.

Yes, that's a major part of their business model.

> Which brings us back to the amazing social contract that could one
> day replace the concept of copyright. In the past, "artistic
> production" has been a one-way street, with a constant flow of
> information traffic from authors/artists/creators to audiences. But
> now there is a strong precedent in traditionally "non-artistic"
> (though, having written a program or two, I'd debate the point)
> technical disciplines for a system that reverses the information flow
> and vastly increases its value. There's some sort of lesson in that
> change. As cultural "producers" we stand to gain immensely from
> learning it, intellectually and commercially.

There has always been a feedback loop between artists and their 
audiences, and you can easily find plenty of literature on the tension 
an artist feels between pursuing his personal artistic vision and making 
his work commercially successful. Many artists who do the former end up 
being unappreciated until late in life or even afterward. What's really 
wonderful about today's information age is that artists can stay closer 
to their vision and target very specific niches in their audiences 
(because it's much easier to find and connect with those few potential 
clients), and conversely a connoisseur can find those few artists who 
offer just what he wants to see/hear. This connectivity will inevitably 
lead to a blossoming of the arts, provided we can maintain a fair 
balance and not allow rampant piracy to keep starving artists from their 
share of the pie.

-- 
War doesn't determine who's right, just who's left.
--
Paul B. Gallagher
pbg translations, inc.
"Russian Translations That Read Like Originals"
http://pbg-translations.com
