Image Capture and OCR Apps for the iPhone

Now that many current journals, magazines, newspapers, and books are available in electronic formats, there is less and less need to scan print material and apply optical character recognition (OCR) to it. However, many older texts and references in libraries are still available only in printed form. Scanning pages from bound books is always problematic: it is difficult to get the pages completely flat, and the text nearest the binding is usually distorted in the scan.
I hate having to bring my laptop and scanner with me to the library to scan pages from books, so I decided to test three popular and highly recommended image capture and/or OCR apps for the iPhone. These apps are generally designed to get a quick electronic version of printed materials, including handwritten notes, but I thought they might be useful for scanning pages from books. The apps varied in how intuitive their design was, but all were generally easy to use after a short time.

General Observations

Flat pages worked well with all of the apps, but since I was interested in how these programs handled bound pages, I chose a page from a thick bound book printed in a serif font (similar to Times) and containing a variety of text styles, including italicized text. As a comparison, I also scanned the same page using a flatbed scanner and used ClaroRead’s OCR to convert the page to an electronic document. The document produced using this method had nearly 99% accuracy. All of the apps were affected by lighting, and the lay of the open book caused some gradation of shadow across the page. Sometimes the flash on the iPhone 4 helped; other times it created a tunnel effect, with the text in the center quite bright and crisp and the surrounding text gradually less distinct. The best results came when I could get a light in front of me to shine right on the page. But even in the best-case scenario, the resulting jpg images, PDFs, and OCR-captured documents generated by these apps varied greatly.
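The accuracy figures in this post are my own estimates, and I have not spelled out a formula for them. For anyone who wants a repeatable number, here is a minimal Python sketch of one possible approach, assuming you have a hand-corrected copy of the page to compare against. The function name `word_accuracy` and the sample sentences are made up for illustration; the score is simply the share of reference words that turn up, in order, in the OCR output.

```python
from difflib import SequenceMatcher


def word_accuracy(reference, ocr_output):
    """Return the fraction of reference words the OCR output reproduced in order."""
    ref_words = reference.split()
    ocr_words = ocr_output.split()
    if not ref_words:
        # Nothing to recognize: perfect only if the OCR also produced nothing.
        return 1.0 if not ocr_words else 0.0
    # Compare word lists (not characters); autojunk=False keeps long pages honest.
    matcher = SequenceMatcher(None, ref_words, ocr_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref_words)


# Hypothetical example: the OCR misread "m" as "rn" in two of eight words.
reference = "Scanning pages from bound books is always problematic"
ocr_output = "Scanning pages frorn bound books is always problernatic"
print(f"{word_accuracy(reference, ocr_output):.0%}")  # prints "75%"
```

A word-level score like this is stricter than a character-level one, since a single misread letter counts the whole word as wrong.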

App Review

Genius Scan (available in a free version and a $2.99 ad-free version with the option to upload files to Dropbox, Evernote, and Google Docs). This app only scans documents (no OCR) and was easy to use. Because it has no OCR, you need a separate OCR program if you want to convert the image to electronic text. I used the app to scan a page from a book and email myself a jpeg and a PDF version of the page. I then ran ClaroRead on the two files to see how well it could recognize the text. I was never able to get an image of a bound page good enough for accurate OCR using this app, and text recognition was always quite poor (at times nearly 0%).
Perfect OCR ($3.99). The app is easy to use and intuitive, with good tools for eliminating uneven lighting and shadows, improving the contrast, and reducing the effect of movement or jitter while using the camera. This app on its own produced electronic documents that were about 80% accurate in text recognition.
SayText (Free) actually got the best results of all at 90%+ accuracy. SayText utilizes the iPhone’s built-in VoiceOver, so you can instantly have the OCR-captured document read out loud. SayText has no option for saving documents on your iPhone, which is a bit annoying, but you can email the OCR-captured text document to yourself. The other two apps have some document management for storing documents for access at a later date.
Of course, these 3 apps aren’t the only scanning apps one can find. There are several other options with similar functionality, and more are added to the app library all the time. Most just take a picture of the document and convert it to a jpeg or PDF (like Genius Scan), whereas a few others include OCR for converting the picture of the document to electronic text (like Perfect OCR or SayText). All of those I have tested to date have produced results similar to the 3 described in this post. If you discover or know of a scanning app that you find does the trick, let me know!
Until that perfect scanning app comes along, I would use the free SayText to grab text and have it read out loud or to email it to myself for reference later. But for efficiency and accuracy, I won’t be abandoning my trusty scanner and ClaroRead anytime soon.
Scanning Software, Text-to-Speech, and Text-to-Audio File

It is Monday, and your English teacher just gave you a short story to read out of a book. Your biology teacher just uploaded a 10-page electronic PDF for you to read. You are expected to read both assignments by the end of the week for discussion in class. Reading isn’t easy for you. If only these documents were provided in an accessible text format, you could use your handy assistive technology (AT) software with text-to-speech to help you work your way through them.
I didn’t have any AT options in school, so I was rarely able to read fast enough to complete a reading assignment on time and failed many tests because I couldn’t keep up. I had to rely on peer discussion groups and teacher lectures to cover the reading material to actually learn what was in those various texts. By college I had become a human tape recorder, memorizing practically everything said in conversations, in lectures, and in study groups.
Today, what AT do I use? If I need to read printed material, I can set up my flatbed scanner and use a feature in ClaroRead called “Scan from Paper.” It takes what is essentially a photograph of a page and then applies a slick piece of software known as optical character recognition, or OCR, to it. Of course, the better the printed copy, the better the program will be able to recognize text on the page. Students and their support team using any program with a built-in OCR, like ClaroRead, will need to be able to find places where the OCR didn’t do such a good job and correct the mistakes before having a “clean” copy. Text-to-speech (TTS) can certainly help with the clean-up of a scanned document by aiding students in finding incorrect words.
What about the electronic PDF reading assignment? ClaroRead has a nifty “Scan from PDF/File” feature which applies OCR to the electronic PDF rather than first having to scan a printed copy. Of course, it would make a student’s life far easier if these documents were already in a good electronic format, but that is a whole other discussion.
ClaroRead takes this all one step further. You can also convert electronic text to an audio file (mp3) that you could then listen to on an iPod, Zune, or any other mp3 player. The “Save as Audio” feature is handy for short documents, but with anything too long it becomes difficult to find where you left off in your listening. Books and magazines are often available in other electronic formats, such as the DAISY format, and these will be discussed on our blog in later posts.
Speech Feedback and Word Prediction featuring WordQ

I have great difficulty reading and recognizing words, and I only see them as little pictures. I can recognize words in context, but out of context I may not always know which word I’m seeing, since many words look nearly identical to me and I don’t recognize the individual letters. And please don’t ask me to spell a word; I may know how to spell it from memory, but I couldn’t tell you just by looking at it. I might be able to decipher the first letter and maybe the last letter, but everything in between is often just a mash of curved and straight lines. People often ask me which letters look the same to me, and I reply, “all of them.”
WordQ has been my favorite reading and writing support program for many years. Simplicity is the key to WordQ. The interface features a floating toolbar with just 4 buttons: Options, Words, Speech, and Read. Users can access any of the last 3 functions either by direct selection or by hotkeys; the latter make it easy to turn features on and off, allowing one to minimize the program menu bar while working.
As I discussed in the previous blogs, speech feedback, commonly called text-to-speech or TTS, and word prediction are important tools for aiding students in reading and writing and these are the cornerstone features of WordQ.
WordQ comes with several high-quality, natural-sounding voices in 4 languages: English, Spanish, French, and German. It works seamlessly with many office suites, including Microsoft Office, and most internet browsers and mail handling programs. I can easily highlight a block of text and press the F11 key (Read). Provided Speech is turned on (easily done with the F10 key), I will hear the text spoken out loud in the voice and speed I have selected in the speech feedback options. Using the Read feature for proofreading is very important for catching missing or incorrect words and for detecting run-on sentences.
The longer you use WordQ’s word prediction, the more useful the suggested words become, and those you use frequently, including word combinations, turn up higher in your prediction list. It will suggest synonyms as you type, helping you think outside of your usual vocabulary. WordQ will even suggest words taking into consideration possible spelling and typing mistakes, including words spelled incorrectly but phonetically (WordQ calls this “creative spelling”). Homophone support is robust: the word prediction box displays usage examples which, when combined with speech feedback, can help students distinguish between commonly confused words, such as “there,” “their,” and “they’re.” The latest WordQ version also makes abbreviation/expansion easy to set up, and your abbreviations can be added to your user dictionary so they will appear in the prediction box. And like the Speech and Read features, if you don’t need word prediction all the time, you can turn it on and off easily using the F9 key.
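To give a feel for how a prediction list can improve with use, here is a toy Python sketch of frequency-based word prediction. This is not WordQ’s actual method, which I don’t know; the class `WordPredictor` and its methods are hypothetical names for illustration. It simply counts the words you have typed and offers the most frequent ones that start with what you have typed so far.

```python
from collections import Counter


class WordPredictor:
    """Toy predictor: suggests previously typed words, most frequent first."""

    def __init__(self):
        self.counts = Counter()

    def learn(self, text):
        # Strip common punctuation and ignore case so "The" and "the" count together.
        words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
        self.counts.update(w for w in words if w)

    def predict(self, prefix, limit=5):
        prefix = prefix.lower()
        candidates = [(w, n) for w, n in self.counts.items() if w.startswith(prefix)]
        # Most frequent first; ties are broken alphabetically for a stable list.
        candidates.sort(key=lambda item: (-item[1], item[0]))
        return [w for w, _ in candidates[:limit]]


predictor = WordPredictor()
predictor.learn("The scanner scanned the page. The scanner is slow, so the student waited.")
print(predictor.predict("sc"))  # prints ['scanner', 'scanned']
```

Because “scanner” was typed twice and “scanned” only once, “scanner” comes first in the list. A real program would go much further, for example by handling word combinations and phonetic misspellings.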
Speech Recognition: The Writing Magic Bullet?

Wanting to try speech recognition software is a popular request from students at our AT Demonstration and Lending Library. These software programs have come a long way from when they were first introduced in the 1980s. The recognition accuracy has significantly increased and at the same time, the amount of time and effort to train a speech recognition program for a specific user has decreased. But there are considerations to be made if you believe speech recognition is for you.
Once a voice file has been created, the student needs a good-quality microphone, and the computer needs to be sufficiently powerful to run both the speech recognition program and a word processor simultaneously. A student should also be aware that every word that is recognized is correctly spelled, but a correctly spelled word is not necessarily the correct word. The student needs to be able to detect errors and make corrections. This is where reading with text-to-speech (TTS) comes in. A student can listen for the wrong words and make the correction by voice in some programs (preferred, since this improves recognition over time) or by using the mouse and keyboard. Additionally, some programs can play back a recording of the dictation so the student can actually hear what they said.
If hands-free computer control and navigation is your goal, some dictation programs can make this a reality, but only with a lot of training and technical assistance from qualified professionals. The cognitive load is very high, since making corrections would involve learning and remembering a large number of verbal commands. However, using a speech recognition program to be totally hands-free isn’t important or necessary for everyone.
Besides the popular, reliable, and powerful Dragon NaturallySpeaking and the new Dragon Dictate for Mac, all Microsoft Windows operating systems since Vista have very good speech recognition built into Ease of Access. This built-in option has fewer navigation controls and a smaller vocabulary than a stand-alone program. However, it is a good option for many users and has the benefit of being free.
Finally, SpeakQ is a speech recognition add-on to WordQ that is specifically targeted at students who have difficulty with writing. It is especially useful for students who cannot fluently dictate at a natural speaking rate, remember verbal commands, and/or get through the initial training. It is not meant to be a full-featured speech recognition tool, as it lacks navigation and editing commands. However, it works seamlessly with WordQ’s word prediction, combining the benefits of both of these features, and is especially useful in picking the correct homophone.
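To illustrate why a correctly spelled word can still be the wrong word, here is a small Python sketch that flags homophones in dictated text so a student knows where to listen closely. It is not taken from any of the programs mentioned here; the short `HOMOPHONE_SETS` list and the function `flag_homophones` are assumptions made up for the example, and a real tool would need a much larger list.

```python
import re

# Tiny illustrative list; a real tool would need far more entries.
HOMOPHONE_SETS = [
    {"there", "their", "they're"},
    {"to", "too", "two"},
    {"your", "you're"},
]
LOOKUP = {word: group for group in HOMOPHONE_SETS for word in group}


def flag_homophones(text):
    """Return (word, position, alternatives) for each word with sound-alikes."""
    flagged = []
    # Only straight apostrophes are handled; curly ones would need normalizing first.
    for match in re.finditer(r"[A-Za-z']+", text):
        word = match.group().lower()
        if word in LOOKUP:
            alternatives = sorted(LOOKUP[word] - {word})
            flagged.append((match.group(), match.start(), alternatives))
    return flagged


for word, position, alternatives in flag_homophones("Their going to the library"):
    print(f"{word!r} at {position}: could also be {', '.join(alternatives)}")
```

The sketch cannot tell which choice is right. It only points to the spots worth checking, which is where TTS and usage examples help.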
