Google Book Search and orphan works

1 11 2008

“In Google Book Settlement, Business Trumps Ideals,” reports Juan Perez in an insightful business article in PC World. Here’s the quote that sums up the deal’s novel approach to orphan works:

Of the 7 million books Google has scanned, 1 million are in full preview mode as part of formal publisher agreements. Another 1 million are public domain works. Most of the other 5 million aren’t in print or commercially available. Google today can only show snippets of their text. The agreement opens up those books for broader preview and potential paid access via individual purchase or institutional subscriptions.

“Together, we’re igniting a new market for these books that have been held in libraries but not available commercially,” Google’s Smith said.

So, what’s so new? Everything.

This isn’t the Congressional approach to problem solving (shove the parties into a room and lock the door until they have reached an agreement, let the strongest interest obliterate the weaker, and call it a compromise in the public interest). This is the publishers’ and Google’s no-nonsense business approach: “Hey, let’s just start selling all the books and if there’s money to be made, the owners will either show up to claim it, or the money will lie there for 5 years while we give everyone time to wake up and smell the coffee. At the end of 5 years, we’ll pretty much know what’s orphan and what’s not. What’s not to like?”

At first I was appalled, especially because the settlement terms provided that the information about who claimed what was going to be kept secret between Google and the publishers/authors (i.e., the Registry). Equally bad, if no one came forward as copyright owner to claim a book, the Registry would essentially keep the money. There are provisions for the Registry to use it for x, y and z, and *if* any is left, it goes to a reading-oriented charity or some such. But I’m not thinking there’s going to be any left… What do you think?

Further, Google clearly understood and accepted that this plan was based on an idea I found repugnant: if orphan works don’t have owners, by definition, then why is it that the Registry should keep the money that comes in for books that ultimately no one claims? The publishers and authors just don’t see orphans as really belonging to everyone in the absence of an owner. They see them as belonging to all the other authors and publishers, but not the public. That really rubbed me the wrong way. After all, it’s not the publishers and authors who have collected these books, maintained them, preserved them, and are now making it POSSIBLE for anyone to even have the potential to find them and buy them by partnering with Google to make them a part of Book Search. Where do they get off claiming that they are entitled to keep unearned, undeserved revenues to the exclusion of everyone else in the world?

“Ah, Georgia, uh, this is a rather innovative and practical approach to orphan works, probably better than anyone has come up with. Come down off the ceiling and think it through,” said Alex. Well, not in those terms. He was just honest and straightforward (as he always is) and explained that a deal with publishers and authors that started from the premise I favored (that orphans don’t belong to anyone, so if they generate revenue, it should go back to those who paid once it’s clear the work is orphan) was simply not possible. So Google started where the publishers were willing to start and worked for a good outcome: the practical effect of the proposal on the availability of orphans and, ultimately, the availability of information about which ones *were* orphans. Google focused on the fifth of those five years.

That’s why the secrecy thing had to be fixed. And it was fixed, but in my opinion, it’s still not as good as it needs to be. I’m happy that in five years (from the approval of the settlement and implementation of the business model) there will (we take on faith) be some way to pull together which books have not been claimed, and so to know more or less which works published in the 20th century are orphaned. But the process by which a book is claimed needs to be transparent. If the public cannot know whether claimants meet rigorous or absurdly simple criteria for proving their claims, confidence in the outcome of the process will fail. This has the potential to be very powerful, or a joke. Maybe the court won’t accept this aspect of the deal unless the process through which claimants come forward and their claims are vetted becomes more transparent. Imagine if the process of registering a copyright at the Copyright Office were secret and only the result, that a copyright was registered, were available. No actual registration, no basis for disputing whether a claim is valid.

Many people anticipate a slew of murky claims to be disputed by various claimants (where, for example, no one is sure whether rights reverted, or sales of assets were not accompanied by clear copyright titles, etc.), but the whole idea of orphan works is that there’s no one around to claim the work. This could make spurious claims easy to perpetrate because of the likelihood that there’s no one to take you to task for fraudulently claiming. This worries me.

I want this process to work. I think it has a much better chance of working than that piece of, uh, than that piece of legislation that nearly passed earlier this fall. It doesn’t give us an answer today and it *only* deals with books, so it’s not a comprehensive solution, but it might serve as an example of what works, assuming it does work. But libraries can still do their own research on individual titles that they think may be orphans while we wait for this deal’s market incentives to do their job, and for it to become clear that transparency is in the owners’ best interests as well as the public’s.

For example, I believe that the OCLC’s Copyright Evidence Registry is just as important today as it was 5 days ago before Google announced this deal. Although the publisher/author Registry has the potential to be definitive, there will be a need for multiple sources of information about the copyright status of works until the publisher/author Registry earns its keep. No source that wants to be definitive can be so if it can’t be trusted. In the absence of trust, we will absolutely need to view it as just one source of information, to be weighed alongside other, hopefully more trustworthy sources, before we decide, based on our own risk tolerance levels, what we’re comfortable treating as orphan and what we’re not.

Speculation is fun. But this deal offers a real living, breathing experiment for bringing orphan works to a new audience, and for bringing information about what works are orphans to light as well. The settlement is not written in stone. I know from working with Google as a Book Search Partner that Google doesn’t work at the level of its contractual commitments. It sees those commitments as starting points and works up from there. If there are aspects of the settlement that threaten its value, they will be addressed. I think the transparency of the Registry process and outcomes is one of those elements.


