by Michael S. Hart, Executive Director, Project Gutenberg
Today is the six-month anniversary of the huge multi-million dollar media blitz announcing Google Print, the most important event since Gutenberg’s invention of the printing press.
To commemorate this event I have been interviewing people from wide ranges of the Internet about their Google Print experiences, but it seems to be harder than either I or they expected.
Several problems are being reported, both in terms of the usage for the user and some apparent changes of horses in midstream, in terms of the actual purpose of Google Print.
One of the MAJOR CHANGES in Google Print seems to be that they have decided they’re NOT GOING TO PROMOTE READING of those 10-15 million books they mentioned in their worldwide press releases. Instead it seems their first recommendation is going to be to click on some of the online bookstores they are promoting, and secondly they send us off to search in libraries.
From: Google Print Help
“Google Print helps you discover books, not read them online.”
“To read the whole book, we encourage you to use the ‘Buy this book’ link to purchase it online …”
[These comments are neatly buried in the very middle of the help section, just about exactly half way through. #5 & #7]
Six months ago, we all heard that Google was going to revolutionize the entire concept of libraries with their new project at a rate of 10,000 to 15,000 books per week which would yield over a quarter of a million books in their first 26 weeks at the lower rate, and a figure nearer half a million at the higher rate, even presuming the total was zero on December 14th.
However, now it appears that Google has changed horses in midstream and replaced much of the actual “library” qualities of Google Print with “catalog” properties. . .they are now actually saying in their official publications to “discover books, not read them online.”
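The scanning-rate arithmetic quoted above is easy to check. The following is only a minimal sketch: the function and constant names are made up for illustration, and it assumes a constant weekly rate and a starting total of zero on December 14th, as the post itself presumes.

```python
# Hypothetical names; the rates and the 26-week window come from the post.
WEEKS = 26
LOW_RATE = 10_000   # books per week, lower announced rate
HIGH_RATE = 15_000  # books per week, higher announced rate


def projected_total(books_per_week, weeks=WEEKS, starting_total=0):
    """Books scanned after `weeks` at a constant weekly rate."""
    return starting_total + books_per_week * weeks


low = projected_total(LOW_RATE)
high = projected_total(HIGH_RATE)
print(f"Lower rate:  {low:,} books")   # 260,000
print(f"Higher rate: {high:,} books")  # 390,000
```

At the lower rate this gives 260,000 books, just over a quarter of a million; at the higher rate it gives 390,000, which the post rounds toward half a million.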
I’ve asked Michael to perform a study of Google’s stock price before and after the announcement of Google Print. MIT Technology Review recently ran an article on Google Print but did not scrutinize the concept further than to ask whether it spells the end of paper libraries. Sadly, they did not stop to consider two very important facts:
a) The digitization of text involves a very protracted and human-resource-heavy process of destructive scanning, modern OCR technology and intense proofreading. A million eBooks in as short a time as Google predicts is nice in concept, but ask these folks how it’s done and what it takes.
b) Hello, people of the world, eBooks have been around longer than you think; it’s not a novel Google concept. Hart, Kahle and Eldred for starters???
It’s easy to understand why Google is promoting the idea of “discover the books here and then go buy them” because if they didn’t do that, they would face the wrath of publishers who would never allow them to digitize copyrighted content. Sure, it is within their ability to digitize and make available all out-of-copyright stuff, but if they do that they will probably never receive permission to put up current works, which defeats the point of having Google Print.
IMHO, what they have is great for the following reasons:
1. They are following their motto of “organizing the world’s information” by extending search from webpages to printed books. They are, after all, in the business of ‘search’.
2. They intend to be non-threatening to existing businesses. Even now people are talking about “driving your google to the google to buy some google for your google”, and they do not intend to be the evil guy who ruins the print business model by giving away books for free online. Remember that even though Gutenberg does the same thing, it does not really threaten the sales of books. If Google did the same, it would be on a totally different scale and publishers would be up in arms.
That’s what they’re telling the publishers now, but that sure wasn’t their original intent. As recently as a quarter ago, MIT’s TechReview and other publishing journals were questioning the viability of paper books within a Google Print model. As Michael himself says:
“The only purpose of my remarks has been to report what the actual Google Print project has done differently than what they announced six months ago.
1. Google Print is no longer touted as an online library, but rather as more of an online catalog: not meant to provide reading materials, but rather to provide clues to buying those reading materials or finding them at a selection of libraries.
2. The Google Print books have been separated, each book, into two camps, plain searchable text, or page images, but neither one is now intended for actual reading.
3. The plain text files have not been proofread to any of the standards that have ever been in place for eBooks, and thus their search engines had to be modified to do an acceptable job with these via “fuzzy searches.” These text files are not directly accessible, even when book contents are obviously in the public domain. . .all we get from those files are “snippets,” which might be an appropriate solution for copyrighted works, or not, as the legal wrangling over these has yet to be settled.
4. The image files have been overlaid with an “invisible file,” so that if you try to save what you see, it has the effect of giving you a blank screen, even in those cases where the pages are obviously public domain.
5. I have seen nothing in terms of any reports similar to those initial December 14th announcements permeating an extremely wide range of media concerning the progress of the various aspects of Google Print.
6. I have received no reports from anyone reporting their satisfactory use of Google Print, even though I should expect that my requests for such have reached hundreds or thousands or even tens of thousands of readers.
7. Another reply to my message suggested that I look from the point of view of what the big PR blitz did to sell Google stock at ever increasing prices.
8. I got one reply that stated the reader could find ONLY ONE public domain book on Google Print. . .could it be that Google Print is going to be pretty much ONLY from the copyrighted works point of view?
THAT would certainly be a HUGE DEPARTURE from what the press blitz was saying 6 months ago!”
===
Really, I need to get off my behind and make Michael his own WordPress blog. Yaay, all Maitri needs — more projects!!!
I hear ya. I was pretty bummed myself when I searched for books that have been long out of copyright (example) only to find that you can browse through only a couple of pages before and after the current page. That’s when I realized that the only thing this will be good for is finding books with the particular text – even good old classics are marked ‘copyrighted material’. Heck, even Da Vinci’s drawings are ‘copyrighted’..
However, I doubt if Google did that to inflate their stock price.. most probably publishers convinced them to change their strategy (I do remember reading at that time that there were representations from publishers to Google)
I looked up the historical chart of Google stock prices.. If the 2nd and 3rd weeks of December are the period in question, Google’s stock *fell* during those weeks..
What the PR blitz did is apparently thoroughly consistent with how stock prices should behave.
a. If the PR blitz advertised that Google would offer a free service to visitors with no way to convert eyeballs to revenues, while investing their shareholders’ money in doing that – obviously sensible shareholders started selling off shares.. sending a signal to management to stop screwing around with shareholder money..
b. Management got the message and rolled back from their personal goals of Improving The World, and got back to their stated task of making profitable investments.
c. So, management refocused themselves on hawking products to their customers. That is what they earn their pay for – anything else is a personal goal and vanity.
d. If they really are “changing horses mid-stream” then my respect for them goes way up because they have the good sense to admit they were wrong in their initial goals and have refocused. If all executives were this sensible.. well, I’d be out of a job :-).
———–
So from what I can see, Google Print searches within the text of a book and shows you what book you were looking for. And then, making the entirely sensible assumption that you wish to possess the same, it offers to redirect you to someone who sells it. So… where is the problem here? Why is this considered incipient corporate malfeasance?
In fact, if Google Print became a free service which offered up the full-text of books that are out of copyright, I predict Google’s stock price will fall. And I’ll be right up there, shorting their stock.
The problem is that they didn’t tell that to the libraries and publishing concerns that they originally got all riled up with this prospect.
> One of the MAJOR CHANGES in Google Print seems to be that they have
> decided they’re NOT GOING TO PROMOTE READING of those 10-15 million
> books they mentioned in their worldwide press releases.
So, apparently in the worldview of this person, people simply don’t read books if they cost money to buy… Amazon, B&N, Borders, Powell’s etc., discourage reading… Schools and colleges are against the spirit of literacy because.. you know.. they charge tuition.
> Instead it seems their first recommendation is going to be to click
> on some of the online bookstores they are promoting, and secondly
> they send us off to search in libraries.
It costs twenty bucks on average to buy and have a book delivered to the concierge at your doorstep ! If you can’t afford that, you can buy used books on pennies to the dollar, and have them shipped for about ten bucks for a half a dozen.
The incredibly hard task of needing to search in a library is so daunting that it forms a sufficiently high barrier to prevent reading…. Now, that’s the spirit that conquered the west..
What next? “The need to turn pages forms a high barrier, and is the leading cause of people not reading any more..”
Sri, obviously you don’t understand the premise under which Google Print started and the state of reading in this world today. Moreover, we are talking about books that are out of copyright (one of a kind that are usually very hard to find) that Google promised to get scanned into eBook form as part of the original plan. Working for Project Gutenberg, I understand how difficult it is to find the books we want to generate in electronic form, especially if they are in old paper format.
The original plan had nothing to do with sending people to online booksellers of popular literature. In fact, this was about creating a “dead book” repository.
Also, don’t be so narrow as to think that every move on someone else’s part (even Google’s) is anti-capitalist. There is nothing wrong with keeping public domain or hard-to-access books in circulation for the world, for those who can afford and can’t. In addition to that, just because something’s free doesn’t make it bad or illegitimate.
Stop obsessing over your money.
Right.. shoot the messenger.
Think about the logic of what you are saying..
“Some books are out of copyright” and “are very hard to find”.
Well, something wrong with that scenario, right? If they are out of copyright, then they should be easier to find..
The fact that they are hard to find means, there is not sufficient demand to make it worthwhile for a business to “keep it in circulation”.
Maybe Google promised, in version 1 of their business plan, to have such books converted into eBook form. Then they went back and realised it isn’t worth their time and effort, or for their most important customers.. (hint: the ones off whom Google can make a profit, and hence continue to remain a going concern..)
Now, if other people.. oh, let’s call them digital rights activists.. believe such books should be made available for free in eBook formats, then shouldn’t they invest their personal time and resources in finding these books, scanning them, and doing what it takes to make them available? Why should Google get called names because they refuse to participate in this activity after a cost-benefit analysis as a corporation?
I don’t understand how anything I said can be interpreted to mean I accuse anyone of being anti-capitalist, so I’m going to let that pass. And who said anything about something free being illegitimate?
Here’s what gets my goat. This assumption that if books are ‘expensive’ or if they are ‘hard to find’ then people won’t read them.
In the real world, people don’t read books because said books add no value to their lives in as much as people measure value. If they felt books add value to their lives, they would do what it takes to read. The entry barrier is not money, nor access, it’s motivation.
Who’s obsessing over money? I take mine for granted.
The fact that quite a few public domain books are hard to find (for destructive scanning, which is the most efficient manner in which to generate an eBook) means that they are one of a kind, rare, collector’s items and/or buried in the libraries of a private or university collection. The last kind are the ones that Google promised to put in circulation in the form of 10 – 15,000 eBooks.
There are people who “invest their personal time and resources in finding these books, scanning them, and doing what it takes to make them available” — we’re known as Project Gutenberg. We approached these same universities just a year before the announcement of Google Print and were turned down and/or No Reply. Then, they turned around and made a pact with Google Print, also exclusively turning over digital rights to these books to Google. If only you knew the huge deal Google made about this in the publishing arena (of course, they had a completely untenable model for mass-generating eBooks from print copies, but that’s beside the point here).
What of those books that were promised to be made available? Google gets called out because of “changing horses” without notifying the same people using the same voice, pitch and fervor with which they made the initial announcement.
More to get and keep your goat: when people scrimp and save, they don’t spend money on books that are more and more expensive. True, if reading were their real focus, they would spend any amount of money to get at these books. But, books shouldn’t have to cost that much (their prices have gone up way beyond what standard inflation calls for). And before you get all balled up over it, I was told by more than one book publisher that the mere fact that they (i.e. the publishing middlepeople, not the books) exist demands more money, even if they pay the author the same amount. Talk about non-value-adding, self-fulfilling crap. Significantly more on your bottom line without a significant value add on your part. If you call that capitalism, please stop stealing the language.
If books are hard to find, how will you even know they are there in order to read them? There is nothing wrong with providing access and accessibility — whether the general public pays for it or not, whether it wants it or not.
(I’m continuing this discussion in the hope of finding the ‘Truth’ if there is such.. I hope you don’t believe I’m being sheer bloody-minded. And this post is long… )
I know little to nothing about the technical details of what happens in destructive scanning and will defer to your statement that it is the most efficient. (I do have the question why transcription isn’t an option. I know that’s laborious but we’re talking about an average of less than 17c per page when outsourced to some sweatshop in India.. could that be the win-win we’re looking for? Just a thought..)
Anyway – I presume you are referring to this press release from Dec 04 when you said Google made a big deal of this service:
http://www.google.com/press/pressrel/print_library.html
That said, Google has, consistently, over and over again insisted that this will be a for-profit initiative.. For example, here:
http://print.google.com/googleprint/library.html
and here,
http://print.google.com/googleprint/about.html
apart from the press release linked above.
Now, apart from Larry Page saying over and over again about “monetizing the results” here’s what their original press release said:
“Today’s announcement is an expansion of the Google Print™ program, which assists publishers in making books and other offline information searchable online. Google is now working with libraries to digitally scan books from their collections, and over time will integrate this content into the Google index, to make it searchable for users worldwide.”
And also,
“Google is now working with libraries to digitally scan books from their collections, and over time will integrate this content into the Google index, to make it searchable for users, worldwide.”
Emphasis mine.
So, unless they went back and changed their press release (who knows, after all they are a Corporation, part of the military-industrial ‘Big Search’ Establishment..) I don’t see where they said they would ‘PROMOTE READING’ which Mr. Hart believes is a promise they’ve gone back on.
In fact, the closest they came to saying anything about encouraging reading in their press release is:
“this expansion of the Google Print program will increase the visibility of in and out of print books,”
which they craftily follow up with..
“and generate book sales via “Buy this Book” links and advertising.”
And as for users,
“For users, Google’s library program will make it possible to search across library collections including out of print books and titles that weren’t previously available anywhere but on a library shelf.”
Emphasis mine.
Now, they do make some books available online, for example :
http://print.google.com/print?id=yGZZXIrbUKQC&lpg=7&pg=5
I don’t know how relevant these books are, nor how rare. I am not their target market anyway.
As for Mr. Hart, I believe he is mixing two ideas up.
Google’s definition of “increasing access” is to help people reach a clearer identification of a book they are seeking to find.
Mr. Hart’s objective is completely different – actual physical access to the entire book. From one perspective he may be more laudable.. But he seems to believe everyone must share his goals. I respectfully disagree with him that Google has ‘gone back’ on anything. Not from what I can see.
If there is some other press release, material, emails etc which will show me why I’m wrong, please feel free to share them with me.
As for your comment about Capitalism.. you seem to think there is some ‘right price’ for books which can be set by the right individual. The right price is whatever clears the market. In other words, the right price is that where the owners’ profit is maximized. How and why inflation matters here, I don’t understand.
I know you already know this, but let me remind you again..
Capitalism does not mean that the only gains/earnings/profits permissible are what directly accrue from direct physical labor. That is merely ‘wages’.
The three sources of income in a capitalist system are:
1. Wages
2. Profits (as payment for business risk)
3. Rents
You know how the first and second work. Rents are when the owner of a resource charges ‘rent’ for usage of that resource by the user. It is a perfectly legitimate means of earning, nothing dishonorable whatsoever about it. If the price is too high for the customer, then the customer can walk and the renter loses a sale.
It’s only when the customer knows that his utility from the resource is higher than the cost that he pays for it. Sure, Brealey & Myers shouldn’t cost $180/- in a just world. But without what it teaches him, Dan has no hope of becoming an investment banker. Right?
So, why sneer at intermediaries for what is perfectly legitimate? Think of the services publishers provide to authors as well as to you. Would you have met even one of your favorite authors without the publisher’s mediation? If that is not ‘significant value add’, I don’t know what is. You seem to believe we still live in a hunter-gatherer world where we all consume commodities. Not !
How would one find books in a world overrun by capitalists…? Good question. Here’s how:
The rent-seeker knows that he/she can extract the appropriate rent. So, he/she has the *incentive* to go out there, find the best scanning technology and / or typists/transcribers to convert the dead capital lying in their collection into a format which they can then sell to as many people as possible and hence make maximum profits. The prospect of making a profit concentrates the mind extremely well in most people.
Why is this not happening? Because the libraries have no idea of how to estimate what the market size is.. They have no idea if the investment in time and effort is worth it financially. They have other pressing duties which will provide a more certain prospect of profit *which they can measure with more certainty today*. Sure, scholars and open-source enthusiasts keep pestering them to convert everything into digital media and allow access. But, what is that effort worth?
Google walks in and shows them how to measure. “Allow our users to search your libraries. Over 1 year, you know exactly how many users searched for the specific books you have in your collection. Based on the sales we generate, you know precisely how many books you are likely to sell if you re-print another batch. You’ll know in a year’s time if it’s worth it. And you can re-print only those books which have a genuine demand, not to satisfy some scholar’s fetish to preserve every single frickin book ever written. Some books should die, and the sooner they die, the better. But let’s get back to business here..”
If I was a librarian faced with declining donations and increasing demand for services who knows I’m sitting on a potential goldmine, but don’t know precisely where it is.. what do you think I’d do? What would *you* do if you’re the librarian?