RE: [Bug-ocrad] Re: Feature request: numeric charset

bug-ocrad



From:	Tobias Andersson
Subject:	RE: [Bug-ocrad] Re: Feature request: numeric charset
Date:	Fri, 17 Jun 2005 11:10:30 +0000


The quick fix is of course to make a "soft sub" of the result if you know you can expect numeric characters only. Simply make a "search-and-replace" for common letters that you know should be digits: replace every I, l, i with 1 (one), every O, o, Q with 0 (zero), etc.
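
A minimal sketch of that search-and-replace in Python, assuming the recognized text is already available as a string. The helper name force_numeric and the mapping table are illustrative only (they are not part of ocrad), and the table covers just the substitutions listed above:

```python
# Hypothetical post-processing helper; not part of ocrad.
# The mapping holds only the letter-to-digit substitutions named above
# and would need extending for other confusions.
LETTER_TO_DIGIT = str.maketrans({
    "I": "1", "l": "1", "i": "1",
    "O": "0", "o": "0", "Q": "0",
})


def force_numeric(text: str) -> str:
    """Replace letters commonly confused with digits in numeric-only OCR output."""
    return text.translate(LETTER_TO_DIGIT)


if __name__ == "__main__":
    print(force_numeric("l2O5 Io.o"))  # prints "1205 10.0"
```

This is only safe when the input is known to contain no real letters; on mixed text it would corrupt genuine words.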

The best solution (I think) would be to introduce different classifiers in the recognition engine... but that's a project for the distant future, I suppose...


/Tobias A

From: Antonio Diaz Diaz <address@hidden>
To: address@hidden
CC: Manfred Schwarb <address@hidden>
Subject: [Bug-ocrad] Re: Feature request: numeric charset
Date: Wed, 08 Jun 2005 15:46:57 +0200

Hello Manfred.

Yes, ocrad will someday have options like "--charset=numeric" or, for texts without numbers, "--charset=alphabetic". A user-defined charset will probably be implemented as well.

Of course, it will be implemented sooner if someone offers to sponsor it. ;-)



Regards,
Antonio.


Manfred Schwarb wrote:

Trying to recognize numbers in tables, I stumbled across
the usual OCR hassle:

Zero is recognized as "O" or "o", One is recognized as lowercase "L" or uppercase "i".

I think ocrad is doing its best, and the results are great.
Nevertheless, such misrecognitions do occur; they are inevitable, I think.

This could be avoided if there were a "--charset=numbers" option or similar,
which restricts the charset to [0123456789], and perhaps [+-].

Alternatively, one could even think of an option
  --charset="0123456789", i.e. a list of characters taken from the
ASCII character set.

What do you think?



_______________________________________________
Bug-ocrad mailing list
address@hidden
http://lists.gnu.org/mailman/listinfo/bug-ocrad




Current Thread


[Bug-ocrad] Freature request: numeric charset, Manfred Schwarb, 2005/06/07
- [Bug-ocrad] Re: Feature request: numeric charset, Antonio Diaz Diaz, 2005/06/08
  - RE: [Bug-ocrad] Re: Feature request: numeric charset, Tobias Andersson <=
