Google launches search for scanned documents

by yinyin on 2008-10-31 18:26:50

According to foreign media reports, Google recently announced that it will make scanned documents searchable. The feature requires substantial computing power as well as advanced image recognition technology.

Unlike ordinary documents, scanned files contain no text data that Google's crawler can index. Google therefore uses optical character recognition (OCR) technology to convert images of text into text data.

Google had already indexed many such scans, but only by file title and metadata, not by the contents of the files themselves. Now Google can search the contents of a scanned document and show matches in its regular search results. Clicking a search result opens the scanned document in its original format, such as PDF. Users can also click "View as HTML" to display the converted text.
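The difference between indexing only titles and indexing recognized text can be shown with a toy inverted index. This is a minimal sketch and not Google's implementation: the file names, the `ocr_text` field, and the helper functions are hypothetical, and the OCR step is replaced by hard-coded strings standing in for its output.

```python
from collections import defaultdict


def tokenize(text):
    """Yield lowercase alphanumeric words from text."""
    word = []
    for ch in text.lower():
        if ch.isalnum():
            word.append(ch)
        elif word:
            yield "".join(word)
            word = []
    if word:
        yield "".join(word)


def build_index(docs, fields):
    """Map each word found in the chosen fields to the documents containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for field in fields:
            for token in tokenize(doc.get(field, "")):
                index[token].add(doc_id)
    return index


def search(index, query):
    """Return the documents that contain every word of the query."""
    tokens = list(tokenize(query))
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical scanned files; "ocr_text" stands in for what an OCR engine would return.
docs = {
    "scan-001.pdf": {
        "title": "Annual report 1998",
        "ocr_text": "Revenue grew in the fourth quarter",
    },
    "scan-002.pdf": {
        "title": "Meeting minutes",
        "ocr_text": "The committee approved the annual budget",
    },
}

metadata_index = build_index(docs, ["title"])          # titles only
full_index = build_index(docs, ["title", "ocr_text"])  # titles plus recognized text

print(search(metadata_index, "revenue"))  # no match: the word appears only inside the scan
print(search(full_index, "revenue"))      # matches scan-001.pdf
```

With the title-only index, a query for a word that appears only inside the scanned page finds nothing. Once the recognized text is indexed as well, the same query returns the document.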

Sina Technology