OCR of handwriting: why yes via stylus, not so much via scan?

I love to write with an actual pen and paper, and I also love computers and the ability to search for information.

There are a number of apps that do a decent job of recognizing handwriting as it is created with a stylus on a tablet or a computer, but there is barely a hint of an attempt to recognize handwriting that has been scanned in.

Is there any particular reason? Is there a difference in the underlying programming or is it just the software people have chosen to create?

Writing with a stylus gives the software time-dependent information, which makes decoding overlapping or joined-together characters much easier. Also, most of those apps require printing, so each character is input separately, which makes deciphering them a lot easier.
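As a rough illustration of what that timing information buys you, here is a minimal sketch that groups pen strokes into characters purely by the pause between them. The `Stroke` type, the `group_strokes` helper, and the 0.3 second threshold are all made up for this example; real recognizers are far more sophisticated.

```python
from dataclasses import dataclass


@dataclass
class Stroke:
    start: float  # pen-down time, in seconds
    end: float    # pen-up time, in seconds


def group_strokes(strokes, max_gap=0.3):
    """Group strokes into characters: a long pause starts a new character.

    max_gap is an assumed threshold, not a value taken from any real app.
    """
    groups = []
    for stroke in sorted(strokes, key=lambda s: s.start):
        if groups and stroke.start - groups[-1][-1].end <= max_gap:
            groups[-1].append(stroke)  # short pause: same character
        else:
            groups.append([stroke])    # long pause: next character
    return groups


# Two quick strokes (say, the two bars of a "t"), a pause, then one more stroke.
strokes = [Stroke(0.0, 0.2), Stroke(0.3, 0.5), Stroke(1.2, 1.5)]
groups = group_strokes(strokes)
print([len(g) for g in groups])
```

A scanned page has no timestamps at all, so overlapping strokes have to be pulled apart from the pixels alone.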

I don’t know the state of the art of OCR software, but I see two major possibilities (not mutually exclusive):

  1. Scanning isn’t perfect, so you get letters with odd holes and fading that throw off the algorithms. Using a darker pen, such as a felt tip, and scanning at a higher resolution may help.

  2. Stylus-based OCR may be using techniques to augment the recognition. For instance, if you’re using a stylus, the computer has access to things like stroke order and direction. This gives it more features, so it can predict the character better. Consider that a lot of people’s “r” and “v” end up looking the same, but the way they produce the characters is different. Same with “R” and “A”. Another example may be “O” and “Q”, where a short stubby slash on the Q may not be picked up visually, but the stroke can be detected even if it’s not especially visible.
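To make the stroke-direction point concrete, here is a small sketch, assuming a stroke arrives as a time-ordered list of (x, y) points with y pointing up. The `direction_codes` function and the 8-direction quantization are my own illustration, not how any particular app does it. The same "v" shape drawn in opposite directions leaves identical ink on the page but yields different features.

```python
import math


def direction_codes(points):
    """Quantize each pen movement into one of 8 compass directions.

    0 = east, counting counter-clockwise (1 = northeast, ..., 7 = southeast).
    Consecutive repeats are collapsed so only changes of direction remain.
    """
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = math.atan2(y1 - y0, x1 - x0)
        code = round(angle / (math.pi / 4)) % 8
        if not codes or codes[-1] != code:
            codes.append(code)
    return codes


v_left_to_right = [(0, 2), (1, 0), (2, 2)]
v_right_to_left = list(reversed(v_left_to_right))

print(direction_codes(v_left_to_right))  # down-right, then up-right
print(direction_codes(v_right_to_left))  # down-left, then up-left
```

Both strokes cover exactly the same points, so a scanner would see the same image, while the stylus sees two different direction sequences.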

MyScript, a program used with certain Livescribe pens and notebooks, does a pretty decent job of converting written words into printed text. I’m not thrilled that it sometimes thinks my “a” is a “9”, and I don’t know what a Doke is or why it would choose that instead of Duke, but it’s pretty accurate most of the time.

Graffiti, the original stylus alphabet for PalmOS, was based entirely on stroke direction. It also had special strokes for certain letters like K and F, so they could be written in a single stroke. That was enough to work back in 1996. You don’t get stroke information when scanning text visually, and recognizing handwritten characters is therefore much harder.
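For a feel of how little is needed once you have stroke direction, here is a toy single-stroke recognizer. To be clear, this is not Graffiti's actual algorithm or alphabet; the `TEMPLATES` table and the function names are invented for the example, and points are assumed to be in screen coordinates (y grows downward).

```python
def stroke_signature(points):
    """Reduce one stroke to a string of dominant directions: U, D, L, R."""
    signature = ""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # pen did not move between samples
        if abs(dx) >= abs(dy):
            step = "R" if dx > 0 else "L"
        else:
            step = "D" if dy > 0 else "U"
        if not signature.endswith(step):
            signature += step  # record only changes of direction
    return signature


# Toy templates invented for this sketch, not the real Graffiti strokes.
TEMPLATES = {"D": "I", "DR": "L", "R": " "}


def recognize(points):
    """Look the stroke's direction signature up in the template table."""
    return TEMPLATES.get(stroke_signature(points), "?")


print(recognize([(0, 0), (0, 5), (0, 10), (6, 10)]))  # down, then right
print(recognize([(6, 10), (0, 10), (0, 0)]))          # same shape, drawn backwards
```

The second call traces the same shape in reverse and no longer matches, which is exactly the kind of information a scan throws away.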

By the way, when I say “hand writing” I don’t necessarily mean cursive, just writing with my hand… I generally print.