Friday, January 25, 2019

How Deep Learning Deciphers Historical Documents

Deep learning researchers are hitting the books. By building AI tools to transcribe historical texts in antiquated scripts letter by letter, they’re creating an invaluable resource for researchers who study centuries-old documents. Many old documents have been digitized as scans or photographs of physical pages. But while obsolete scripts like Greek minuscule or German Fraktur may be readable by experts, the text on these scanned pages is neither legible to a broad audience nor searchable by computers.

Hiring transcribers to turn manuscripts into typed text is a lengthy and expensive process. So developers have built digital tools for optical character recognition, the process of converting printed or written characters into machine-readable form. And deep learning dramatically increases the accuracy of these tools.
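
To make "accuracy" concrete: one common way to score a transcription is the character error rate, the number of single-character edits needed to turn the tool's output into a human transcription, divided by the length of that human transcription. The sketch below is only an illustration of that metric, not the method used by any particular tool mentioned here; the function names and the sample strings are assumptions made up for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb, or keep a match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalized by the length of the reference transcription."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: an OCR tool confuses two look-alike letters in a word.
    reference = "Wasser"   # what a human transcriber typed (assumed)
    hypothesis = "Waffer"  # what the OCR tool produced (assumed)
    # Two substitutions out of six reference characters.
    print(character_error_rate(reference, hypothesis))
```

A lower rate means fewer corrections for a human to make afterward, which is what matters when the alternative is transcribing every page by hand.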