Newsworthy

The Secret Of Google's Book Scanning Machine Revealed

The other day my colleague Kee Malesky turned me on to an incredibly interesting article from the New Scientist website about the granting of patent 7508978. What's so important about patent 7508978, you ask? It's the patent that explains how Google's proprietary book scanning technology works.

Image of Google's infrared camera technology (United States Patent and Trademark Office)

Before Google came on the scene, book scanning was a tedious process that sometimes resulted in the death of a book. The software used to read text from the scanned pages, called Optical Character Recognition software, or OCR for short, required each page of the book to be flat. Now, anyone who's ever opened a book knows it's next to impossible to get one to lie flat without some sort of device. One solution was to use glass plates to flatten each page individually, but this method wasn't very efficient. The other was to chop off the book's binding, but that destroyed the book. How was one to go about scanning a book quickly and efficiently without destroying it? It was a problem that vexed book scanners for years, until Google came up with a solution.

Turns out, Google created some seriously nifty infrared camera technology that detects the three-dimensional shape and angle of book pages when the book is placed in the scanner. This information is passed to the OCR software, which uses it to correct for the distortions and read the text more accurately. No more broken bindings, no more inefficient glass plates. Google has finally figured out a way to digitize books en masse. For all those who've pondered "How'd They Do That?" you finally have an answer.
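
For the curious, here is a rough sketch of why knowing a page's shape helps. It is not taken from the patent and is not Google's method. It only illustrates the general idea: when a page curves up toward the binding, text photographed from above looks squeezed, and measuring the page's height lets you work out where each point would sit if the page were flat. The function name and the idea of a list of height readings are assumptions made up for this example.

```python
import math


def flattened_positions(xs, heights):
    """Map positions seen from above to positions on the flattened page.

    xs:      horizontal positions as seen by a camera looking straight down
    heights: page height at each position (hypothetical depth readings)
    Returns the distance along the page surface from the first sample.
    """
    if len(xs) != len(heights):
        raise ValueError("xs and heights must have the same length")
    if not xs:
        return []

    positions = [0.0]
    for i in range(1, len(xs)):
        dx = xs[i] - xs[i - 1]            # horizontal step seen by the camera
        dz = heights[i] - heights[i - 1]  # rise or fall of the page surface
        # The true distance along the paper is the hypotenuse of the two.
        positions.append(positions[-1] + math.hypot(dx, dz))
    return positions
```

On a page that is already flat, the positions come back unchanged. Where the page slopes, the flattened positions spread further apart than the camera saw them, which is the stretch needed to undo the squeezing.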
