Practical OCR solution for converting a large book to a digital format?
Full question
https://superuser.com/questions/41620/pr...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#ocr
ACCEPTED ANSWER
Score 9
I came across this on Lifehacker quite some time back, and it has been one of my top DIY projects ever since.
Replace the iPhone with any camera or imaging device, and you get a stack of nice high-res JPEGs ready for you to OCR with any software, even (urks!) MS Office... ;)
Cheap. Effective. DIY. You can't beat an idea like this.
EDIT: Comments raised some points about shadows, page curling, etc. These are quite easily resolved by anyone who has literally photocopied library texts.
Add multiple light sources to illuminate the book and eliminate the shadows.
Slant the book open at 90 degrees so the pages don't curl toward the binding in the middle. This also preserves the binding.
I'll see if I can give an example and set one up myself.
EDIT 2: Uploaded a sample of how you should hold the book; also notice the light source from the left.
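The "stack of JPEGs, then OCR with any software" step could be scripted roughly as follows. This is only a sketch under assumptions the answer does not make: it swaps in the open-source Tesseract command-line tool as the OCR software (it must be installed separately and be on the PATH), and the function names and folder layout are hypothetical.

```python
import re
import subprocess
from pathlib import Path


def natural_key(path):
    """Sort key so that page_2.jpg comes before page_10.jpg."""
    parts = re.split(r"(\d+)", path.name)
    return [int(p) if p.isdigit() else p.lower() for p in parts]


def build_ocr_command(image, out_base, lang="eng"):
    """Build a Tesseract CLI call: tesseract <image> <outputbase> -l <lang>."""
    return ["tesseract", str(image), str(out_base), "-l", lang]


def ocr_folder(photo_dir, text_dir, lang="eng", run=subprocess.run):
    """OCR every .jpg in photo_dir, writing one .txt per page into text_dir."""
    photo_dir, text_dir = Path(photo_dir), Path(text_dir)
    text_dir.mkdir(parents=True, exist_ok=True)
    pages = sorted(photo_dir.glob("*.jpg"), key=natural_key)
    for image in pages:
        # Tesseract appends ".txt" to the output base name itself.
        run(build_ocr_command(image, text_dir / image.stem, lang), check=True)
    return [text_dir / (image.stem + ".txt") for image in pages]
```

Calling `ocr_folder("photos", "text")` would process the pages in page order, which matters when the camera numbers files without zero padding.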
ANSWER 2
Score 3
From what I know, ABBYY makes the best OCR software, but it's not free. You could try the trial version of ABBYY FineReader; maybe it will help you.
ANSWER 3
Score 1
You will need to capture the image somehow. Various services exist to do this for you. You will also need someone who is familiar with the content of the text to proofread it, as OCR is not perfect yet, especially with anything handwritten.
Others are discussing your question here: http://ask.metafilter.com/92506/scan-my-books
Some companies will do this for you:
http://www.scandexsystems.com/BookScanning2.html
http://www.kirtas.com/index.php?option=com_content&view=article&id=13&Itemid=48
http://www.ristech.ca/product.html
Some Free Software: http://download.cnet.com/Image-To-PDF-OCR-Converter-PDF-E-Book-Maker/3000-6675_4-10392924.html
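On the proofreading point above, a small script might help the proofreader find likely trouble spots first. The heuristic below is an assumption of this sketch, not something from the answer: it flags tokens that mix letters and digits, a common sign of OCR confusion such as "l" read as "1" or "O" read as "0". The function name is hypothetical.

```python
import re

# A token containing at least one letter and at least one digit.
MIXED = re.compile(r"^(?=.*[A-Za-z])(?=.*\d)[A-Za-z\d]+$")


def suspicious_tokens(text):
    """Return (line_number, token) pairs worth a human look."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for token in re.findall(r"[A-Za-z\d]+", line):
            if MIXED.match(token):
                hits.append((line_no, token))
    return hits
```

Legitimate tokens such as "4th" or "mp3" will also be flagged, so this only narrows down where to look; it does not replace a reader who knows the text.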
ANSWER 4
Score 1
For a project this large and this important to you and your family, a DIY book scanner may be the way to go; some designs even sport page turners: http://www.diybookscanner.org/ This one doesn't natively support OCR, but it does shoot 600 pages an hour, and you can run the images through OCR after the fact: http://hackaday.com/2011/07/18/diy-book-scanner-processes-600-pageshour/
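To get a feel for what 600 pages an hour means for a given book, the arithmetic can be sketched as below. It assumes a steady capture rate with no pauses for setup or reshoots, so treat the result as a lower bound; the function name is hypothetical.

```python
def scanning_hours(page_count, pages_per_hour=600):
    """Estimated capture time in hours at a steady rate."""
    if pages_per_hour <= 0:
        raise ValueError("pages_per_hour must be positive")
    return page_count / pages_per_hour


# A 1,200-page book at the quoted rate: 1200 / 600 = 2 hours of capture.
print(scanning_hours(1200))
```

OCR and proofreading time come on top of this and are likely to dominate the project.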
ANSWER 5
Score 0
You may want to see if a university near you has a whole book scanner and then beg/bribe a student to put your book through it.