Sunday, June 17, 2007

Google Book Search vs Live Search Books

Google Book Search and Live Search Books are the largest databases of full-text books on the internet. Most of their holdings are books in the public domain, but both now include more recent copyrighted titles as well. Google claims over a million records, while Live had over 800,000 last I heard.

The Shifting Sands of Google Book Search

For the past several days, I have entered the same words and phrases into Google Book Search. Every day, it spat back a different number for each search. Maybe they added some books, so the number is higher on each succeeding day? Not so--the number has actually been going down every day! Perhaps publishers are insisting that Google remove their books? Or could it be that the search engine is inaccurate, whether unintentionally or sinisterly?

I tried the phrase "Next Attack" today and Google returned a result of 904 items. But what does this number mean? 904 books? When I went through all the pages and tried to get to that 904th book, Google stopped at number 484! Where are the other 400-plus books I was promised? Number 904 doesn't exist when I try to find it. Well, it's just a glitch that wouldn't be repeated with another phrase, right?

Next I tried the phrase "Homeland Security," which gave me 5,123 results. I decided to go through all the pages, but that would take some time, wouldn't it? But wait! To my chagrin, there are really only 155 books! Where are those other 5,000 books on homeland security Google assured me were in this database? False advertising? (In fairness, I just tried the search again--5,161 results! But of course, on the last page is book number "162 of 172.")

How about a rare word? "Sapajous" garners 645 results, but after going through all the pages it gives me only "416 of 436"! Google rarely gives as much as it promises!

If Google has a display limit, why does it differ for each search? Of course, the same thing happens when you search the open internet at google.com. As you sift through the results of your search, the numbers often pull back once you go through all the result pages. But surely with a much smaller, finite database, such as Google Books, this shouldn't happen. Can't I receive an honest number when I'm only searching a mere 1 million records?
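To put the gap in one place, here is a rough Python sketch that tallies the Google numbers quoted above. The function name shortfall is just my own label, and for "Sapajous" I am assuming that "416 of 436" means 436 books were reachable in total; Google does not say what that figure means.

```python
# Counts quoted in this post: (reported by Google, reachable by paging to the end).
google_counts = {
    "Next Attack": (904, 484),
    "Homeland Security": (5123, 155),
    "Sapajous": (645, 436),  # assumption: "416 of 436" means 436 reachable in total
}


def shortfall(reported, reachable):
    """Return (number of missing results, percent of reported results reachable)."""
    missing = reported - reachable
    pct_reachable = round(100 * reachable / reported, 1)
    return missing, pct_reachable


for phrase, (reported, reachable) in google_counts.items():
    missing, pct = shortfall(reported, reachable)
    print(f"{phrase}: {missing} missing, {pct}% of the promised results reachable")
```

By this arithmetic, the share I could actually reach ranges from roughly two thirds for the rare word down to about 3 percent for "Homeland Security."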

The Brick Wall of Live Search Books


The numbers at Microsoft's Live Search Books don't move. Same thing every day I try the same search. And there is a limit to how many of your results you can view. That doesn't change either.

The phrase "homeland security" draws 749 results (the search process is slower than Google's, but that's okay as long as the results are honest!). But wait! When I try to see book number 749 by moving the scroll button all the way down to the bottom, it stops on book number 250! I can go no further.

Let's try "Next attack." 732 results. The same story. The results stop at book 250 and won't give me the rest, although I suspect they really are there. I take this to mean that if I want to see all results for any search, I must get the result number down to 250 or fewer. That's easily accomplished. My rare word "Sapajous" draws 43 results and Live gives me all of them.

"Jesus Mohammed Buddha Moses Krishna" gives me 260 results. Will Live give me 251-260? No way. 250 is the absolute limit.
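Live's behavior, at least as I observed it, boils down to a simple cap. The sketch below is only my model of what I saw, not anything Microsoft documents; the names LIVE_DISPLAY_CAP and viewable are my own.

```python
LIVE_DISPLAY_CAP = 250  # limit observed in my searches, not a documented figure


def viewable(reported, cap=LIVE_DISPLAY_CAP):
    """Number of results that can actually be scrolled to under the observed cap."""
    return min(reported, cap)


# Result counts Live reported for the searches described in this post.
live_counts = {
    "homeland security": 749,
    "Next attack": 732,
    "Sapajous": 43,
    "Jesus Mohammed Buddha Moses Krishna": 260,
}

for phrase, reported in live_counts.items():
    shown = viewable(reported)
    print(f"{phrase}: {shown} viewable, {reported - shown} out of reach")
```

Only the 43-result search comes through whole; every other search loses whatever lies past result 250.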

The answer for using both search engines, I suppose, is that you need to get your results down to no more than 250 so you can be sure to see everything. But I'm the sort of person who wants to sift through all results on a topic and pluck out those that are useful. This often involves going through literally thousands of records. As far as I can see, that just isn't possible with either of these databases.

Live's book result numbers are usually much lower than Google's. The only exception seems to be Religion, where Live often beats Google. I'm not sure what this really means anyway, since Google's numbers seem to be inflated.

Unlike Google, Live Books has no advanced search page. The "intitle" limiter works (although that isn't advertised anywhere) but "inauthor" doesn't.

Conclusion: I'm not satisfied at all with either book search product. Both refuse to give me what they promise! If I can't see 750 books, don't promise that many!

Google will have a booth at the ALA conference in Washington in a couple of weeks. I'll be sure to go there and ask about the Book Search engine.
