Page:Code Swaraj - Carl Malamud - Sam Pitroda.djvu/51

Access to Knowledge in India and America, Remarks of Carl Malamud

June 14, 2017, The Internet Archive, San Francisco

Thank you, Sam. I had the great pleasure of tagging along with Sam in October as he went barnstorming through India. We spoke at the Sabarmati Ashram on Gandhi-ji’s birthday. There were speeches to the Indian Institution of Engineers, at the Mayo Boy’s College, and at Rajasthan Central University, and everywhere he was mobbed with admirers. When we got out of the car at Gandhi’s Ashram, there were at least 100 people who surrounded him taking selfies.

His contributions to India for over 50 years, from bringing telephones to every village to his more recent work advising Prime Ministers, creating food banks, and so many other things, have been immense. Thank you for joining us tonight.

I have a few closing thoughts, but before I get to those I would be remiss if I did not thank some of the people on whose shoulders we stand tonight. The Digital Library of India would never have been possible without the visionary efforts of Carnegie Mellon University and the Million Books Project pioneered by Professor Raj Reddy and Dean Gloria St. Clair.

In India, the Digital Library of India project has been headed by a distinguished computer scientist, Professor Narayanaswamy Balakrishnan. The Digital Library of India is now a project of the government of India with 25 scan centers throughout the country, and it is a huge undertaking.

The library has 550,000 books scanned, and we have over 400,000 of those spinning and available today here at the Internet Archive. We’re delighted to be working closely with the project.

It truly is a remarkable collection, particularly when it comes to Indian languages. There are over 45,000 books in Hindi, 33,000 in Sanskrit, 30,000 in Bengali, and many more. Overall, there are 50 different languages represented.

When books are ingested here at the Internet Archive, you’ll see that in addition to the basic PDF file, they are run through Optical Character Recognition.

In addition to OCR, you’ll see that the books are transformed into formats that work with your e-book reader, your Kindle, and your tablet. You can search
