[Bug-ocrad] Re: Feature request: numeric charset


From: Antonio Diaz Diaz
Subject: [Bug-ocrad] Re: Feature request: numeric charset
Date: Wed, 08 Jun 2005 15:46:57 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.7.3) Gecko/20040913

Hello Manfred.

Yes, ocrad will someday have options like "--charset=numeric" or, for texts without numbers, "--charset=alphabetic". A user-defined charset will probably be implemented as well.

Of course, it will be implemented sooner if someone offers to sponsor it. ;-)


Regards,
Antonio.


Manfred Schwarb wrote:
Trying to recognize numbers in tables, I stumbled across
the usual OCR hassle:
Zero is recognized as "O" or "o", One is recognized as lowercase "L" or uppercase "i".
I think ocrad is doing its best, and the results are great.
Nevertheless, such misrecognitions do occur, and I think they are inevitable.

This could be avoided if there were a "--charset=numbers" or similar option,
which restricts the charset to [0123456789], and perhaps [+-].

Alternatively, one could even think of an option
  --charset="0123456789", i.e. a list of characters taken from the
ASCII character set.

What do you think?
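
Until such an option exists, the effect of a restricted charset can be approximated by post-processing the recognized text. The following is a minimal Python sketch, not part of ocrad: the names CONFUSABLE and restrict_to_charset are hypothetical, and the substitution table covers only the confusions mentioned above ("O"/"o" for zero, lowercase "L"/uppercase "i" for one).

```python
# Hypothetical post-processing sketch; this is NOT an ocrad feature.
# Characters named in the thread as commonly confused with digits.
CONFUSABLE = {"O": "0", "o": "0", "l": "1", "I": "1"}


def restrict_to_charset(text, charset="0123456789+-", keep=" \t\n"):
    """Map look-alike characters into `charset` and drop everything else.

    `keep` lists layout characters (spaces, tabs, newlines) that are
    passed through so that table columns stay separated.
    """
    out = []
    for ch in text:
        if ch not in charset and ch not in keep:
            # Try the look-alike table; unknown characters become "".
            ch = CONFUSABLE.get(ch, "")
        # The explicit `ch and` guard matters: "" is "in" every string.
        if ch and (ch in charset or ch in keep):
            out.append(ch)
    return "".join(out)
```

For example, restrict_to_charset("1O2 l3I") yields "102 131". Characters outside the charset, such as a decimal point, are dropped unless they are added to the charset argument.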



