[Bug-ocrad] Re: OCRAD project: library is needed
From: Dmitry Katsubo
Subject: [Bug-ocrad] Re: OCRAD project: library is needed
Date: Tue, 6 Jul 2010 14:18:06 +0200
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.9) Gecko/20100502 Shredder/3.0.5pre
Hi Antonio!
I have briefly discussed this point with Igor: you do not need to worry
about how to compile OSRA. What I need to do is develop a test case that
demonstrates the problem.
So, here is the test case.
The results I get when using OCRAD_bitmap in line 128:
Test 1: width x height = 10x11
+ recognised via Character::recognize1(): N
+ recognised via OCRAD_result_first_character():
Test 2: width x height = 10x11
+ recognised via Character::recognize1(): N
+ recognised via OCRAD_result_first_character(): w
Test 3: width x height = 11x11
+ recognised via Character::recognize1(): _
+ recognised via OCRAD_result_first_character(): r
Test 4: width x height = 18x7
+ recognised via Character::recognize1(): _
+ recognised via OCRAD_result_first_character(): r
Test 5: width x height = 9x10
+ recognised via Character::recognize1(): _
+ recognised via OCRAD_result_first_character(): r
Test 6: width x height = 9x10
+ recognised via Character::recognize1(): /
+ recognised via OCRAD_result_first_character(): t
I expect the API to recognize "N" in test cases 1 and 2 no worse than
Character::recognize1() does. Also, the API recognizes "r" and "t" where
Character::recognize1() finds no letter, which may trigger false positives.
When I change to OCRAD_greymap, I get the following result:
Test 1: width x height = 10x11
+ recognised via Character::recognize1(): N
+ recognised via OCRAD_result_first_character(): o
Test 2: width x height = 10x11
+ recognised via Character::recognize1(): N
+ recognised via OCRAD_result_first_character(): o
Test 3: width x height = 11x11
+ recognised via Character::recognize1(): _
+ recognised via OCRAD_result_first_character(): o
Test 4: width x height = 18x7
+ recognised via Character::recognize1(): _
+ recognised via OCRAD_result_first_character(): ~
Test 5: width x height = 9x10
+ recognised via Character::recognize1(): _
+ recognised via OCRAD_result_first_character():
Test 6: width x height = 9x10
+ recognised via Character::recognize1(): /
+ recognised via OCRAD_result_first_character(): n
I would expect the results to be the same as above, at least for consistency :)
There is probably some obvious mistake in the test code...
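One thing that might be worth checking, offered as a sketch rather than a diagnosis: as far as I understand the ocradlib manual, the two one-byte-per-pixel modes use opposite polarity (OCRAD_bitmap: 0 = white, 1 = black; OCRAD_greymap: 0 = black, 255 = white). If that reading is right, handing the same buffer to the library under both modes would not be expected to give the same result. The helpers below are hypothetical (bitmap_to_greymap and greymap_to_bitmap are not part of ocradlib) and only show the conversion under that assumption.

```cpp
#include <cstddef>
#include <vector>

// Assumed pixel conventions (my reading of the ocradlib manual, please verify):
//   bitmap : 1 byte per pixel, 0 = white, 1 = black
//   greymap: 1 byte per pixel, 0 = black, 255 = white

// Hypothetical helper, not part of ocradlib.
// Converts a bitmap buffer into the equivalent greymap buffer.
std::vector<unsigned char> bitmap_to_greymap(const std::vector<unsigned char>& bitmap)
{
  std::vector<unsigned char> grey(bitmap.size());
  for (std::size_t i = 0; i < bitmap.size(); ++i)
    grey[i] = bitmap[i] ? 0 : 255;   // ink -> 0 (black), background -> 255 (white)
  return grey;
}

// Hypothetical helper, not part of ocradlib.
// Pixels darker than 'threshold' are treated as ink.
std::vector<unsigned char> greymap_to_bitmap(const std::vector<unsigned char>& grey,
                                             unsigned char threshold = 128)
{
  std::vector<unsigned char> bitmap(grey.size());
  for (std::size_t i = 0; i < grey.size(); ++i)
    bitmap[i] = grey[i] < threshold ? 1 : 0;
  return bitmap;
}
```

If the test data is stored in one convention, converting it before switching the mode should make the two runs comparable.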
Thank you in advance!
On 18.06.2010 19:34, Antonio Diaz Diaz wrote:
> Hello Igor,
>
> Igor Filippov [Contr] wrote:
>>> As explained here[1], Blob is an internal type of ocrad. It is
>>> neither documented nor guaranteed to remain stable. Moreover, Blob
>>> has some requirements, like pixel connectivity, that could make osra
>>> unstable if your data does not meet them.
>>>
>>> [1] http://lists.gnu.org/archive/html/bug-ocrad/2009-12/msg00003.html
>>>
>>> Lines 195-220 of osra_ocr.cpp already use the public API from
>>> ocradlib.h. I think they are commented out because Igor Filippov found
>>> they produce worse results than directly using Blob. I have to
>>> investigate this, and it would be useful if someone could provide me
>>> with an image showing the bug (correctly recognized as Blob, but not
>>> recognized by new API).
>>
>> I have sent you such an image on January 8th of this year with the
>> description of what I'm getting by using Blob vs. what I'm getting (or
>> rather not getting) by using standard API.
>>
>> This was your reply back then:
>>
>> From: Antonio Diaz Diaz <address@hidden>
>> To: Igor Filippov <address@hidden>
>> Subject: Re: Version 0.19-pre1 of GNU Ocrad released
>> Date: 01/09/2010 02:26:38 PM
>>
>> Igor Filippov wrote:
>>>> Attached are output from the old version and the new version along with
>>>> the image. Note that the new version did not get "N" nitrogen label -
>>>> it seems to have either "w" or "r" instead.
>>
>> I have noticed two things about this image; the letters are a little
>> small for ocrad (8x9), and it is a greymap with more than two pixel
>> values.
>>
>> I'll try to solve the size problem as soon as I can (I am now busy
>> working on lzlib).
>>
>> ===================================================================
>
> What you sent me (apodaca.png) was a complete image of a molecule with 6
> rings, four "N", one "O" and one "HO". What I need is the raw data fed
> to ocrad as Blob or through the OCRAD_set_image function. Maybe some
> disconnected pixels are being ignored by Blob functions but are
> interfering with character formation when using the API.
>
> Unfortunately I was unable to compile osra last time I tried because of
> some dependency not installed. I'll try again ASAP.
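Regarding the hypothesis above that disconnected pixels may be ignored by the Blob functions but interfere when using the API: one quick way to check a test glyph is to count its connected groups of ink pixels. The sketch below is plain C++ and independent of ocrad. count_components is a hypothetical helper, the choice of 8-connectivity is my assumption, and any non-zero byte is treated as ink (as in the bitmap mode).

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of ocrad.
// Counts 8-connected groups of ink pixels (non-zero bytes) in a row-major
// buffer with one byte per pixel. Assumes pixels.size() == width * height.
int count_components(const std::vector<unsigned char>& pixels, int width, int height)
{
  std::vector<bool> seen(pixels.size(), false);
  int components = 0;
  for (int row = 0; row < height; ++row)
    for (int col = 0; col < width; ++col)
    {
      const std::size_t start = static_cast<std::size_t>(row) * width + col;
      if (!pixels[start] || seen[start]) continue;
      ++components;
      // Depth-first flood fill starting from this unvisited ink pixel.
      std::vector<std::pair<int, int>> stack;
      stack.push_back(std::make_pair(row, col));
      seen[start] = true;
      while (!stack.empty())
      {
        const std::pair<int, int> p = stack.back();
        stack.pop_back();
        for (int dr = -1; dr <= 1; ++dr)
          for (int dc = -1; dc <= 1; ++dc)
          {
            const int r = p.first + dr, c = p.second + dc;
            if (r < 0 || r >= height || c < 0 || c >= width) continue;
            const std::size_t idx = static_cast<std::size_t>(r) * width + c;
            if (pixels[idx] && !seen[idx])
            {
              seen[idx] = true;
              stack.push_back(std::make_pair(r, c));
            }
          }
      }
    }
  return components;
}
```

For a single-stroke glyph such as "N", a count greater than 1 would suggest stray pixels in the test data; glyphs like "i" legitimately have more than one component.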
--
With best regards,
Dmitry
Attachments:
osra_ocr.cpp (text document)
Makefile (text document)