
[pdf-devel] New option in pdf_text_get_unicode


From: Aleksander Morgado
Subject: [pdf-devel] New option in pdf_text_get_unicode
Date: Tue, 11 Nov 2008 22:20:55 +0100
User-agent: Thunderbird 2.0.0.17 (X11/20080925)

Hi Jose,

I added a new option to the `pdf_text_get_unicode' function that returns the Unicode string NUL-terminated (1-byte NUL for UTF-8, 2-byte NUL for UTF-16, and 4-byte NUL for UTF-32).

This may be useful, for example, when calling Windows API functions that expect NUL-terminated UTF-16LE strings.
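
As a rough illustration of what the NUL suffix means for each encoding, here is a minimal standalone sketch. It is not the code in the patch: the `sketch_*' names and the three-value enum are made up for this example, and it uses plain malloc instead of the library allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoding tags for this sketch; not the library's enum. */
enum sketch_encoding { SKETCH_UTF8, SKETCH_UTF16, SKETCH_UTF32 };

/* Width in bytes of one NUL code unit in the given encoding. */
static size_t
sketch_nul_width (enum sketch_encoding enc)
{
  switch (enc)
    {
    case SKETCH_UTF8:  return 1;
    case SKETCH_UTF16: return 2;
    case SKETCH_UTF32: return 4;
    }
  return 0;
}

/* Return a newly allocated copy of DATA (LEN bytes) followed by one NUL
   code unit, and store the total length, suffix included, in *OUT_LEN.
   Returns NULL if the allocation fails.  The caller frees the result. */
static unsigned char *
sketch_append_nul (const unsigned char *data, size_t len,
                   enum sketch_encoding enc, size_t *out_len)
{
  size_t nul = sketch_nul_width (enc);
  unsigned char *out;

  assert (out_len != NULL);
  out = malloc (len + nul);
  if (out == NULL)
    return NULL;
  if (len > 0)
    memcpy (out, data, len);
  memset (out + len, 0, nul);   /* N-byte NUL trailer */
  *out_len = len + nul;
  return out;
}
```

Note that the reported length counts the suffix, which matches the success conditions listed in the TSD for the new test cases.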

I also added 54 new test cases to the TSD and implemented 7 of them. I will open a new task in Flyspray for the remaining ones. The tests are really simple, as they are based on the previous 54 tests available for the `pdf_text_get_unicode' function, so if anyone wants to take them, please feel free to do so.
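
For anyone picking up the remaining tests, the expected lengths for the cases without lang/country info can be worked out as in the sketch below. The `tsd_*' names are hypothetical and not part of the library or the test suite. The BOM sizes are the standard ones: 3 bytes in UTF-8, 2 in UTF-16 and 4 in UTF-32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical encoding families for this sketch; not the library's enum. */
enum tsd_encoding { TSD_UTF8, TSD_UTF16, TSD_UTF32 };

/* Size in bytes of the BOM (U+FEFF) in each encoding. */
static size_t
tsd_bom_size (enum tsd_encoding enc)
{
  switch (enc)
    {
    case TSD_UTF8:  return 3;   /* EF BB BF */
    case TSD_UTF16: return 2;   /* FE FF or FF FE */
    case TSD_UTF32: return 4;   /* 00 00 FE FF or FF FE 00 00 */
    }
  return 0;
}

/* Size in bytes of the NUL suffix in each encoding. */
static size_t
tsd_nul_size (enum tsd_encoding enc)
{
  switch (enc)
    {
    case TSD_UTF8:  return 1;
    case TSD_UTF16: return 2;
    case TSD_UTF32: return 4;
    }
  return 0;
}

/* Expected output length for CONTENT_LEN bytes of encoded text, with an
   optional BOM in front and an optional NUL suffix at the end. */
static size_t
tsd_expected_length (size_t content_len, enum tsd_encoding enc,
                     bool with_bom, bool with_nul)
{
  size_t total = content_len;

  if (with_bom)
    total += tsd_bom_size (enc);
  if (with_nul)
    total += tsd_nul_size (enc);
  return total;
}

/* Empty-text cases: the output holds only the BOM (if any) and the NUL. */
static void
tsd_check_empty_text_cases (void)
{
  assert (tsd_expected_length (0, TSD_UTF8, false, true) == 1);
  assert (tsd_expected_length (0, TSD_UTF8, true, true) == 3 + 1);
  assert (tsd_expected_length (0, TSD_UTF16, true, true) == 2 + 2);
  assert (tsd_expected_length (0, TSD_UTF32, true, true) == 4 + 4);
}
```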

Attached is a patch with the source code changes, the documentation updates (API and TSD), and the 7 implemented unit tests.

Cheers,
-Aleksander
# Bazaar merge directive format 2 (Bazaar 0.90)
# revision_id: address@hidden
#   gh0tl1e12xvx88hk
# target_branch: file:///home/aleksander/Development/gnu/libgnupdf\
#   /libgnupdf-repo/trunk/
# testament_sha1: 825def469327cf90b90bf3d8ce5d2d02e8876132
# timestamp: 2008-11-11 22:11:22 +0100
# base_revision_id: address@hidden
# 
# Begin patch
=== modified file 'ChangeLog'
--- ChangeLog   2008-11-03 23:51:44 +0000
+++ ChangeLog   2008-11-11 21:10:14 +0000
@@ -1,3 +1,20 @@
+2008-11-11  Aleksander Morgado  <address@hidden>
+
+       * src/base/pdf-text.c (pdf_text_get_unicode): Correct out_length when
+       using PDF_TEXT_UNICODE_WITH_NUL_SUFFIX.
+
+       * torture/unit/base/text/pdf-text-get-unicode.c
+       (test_pdf_text_get_unicode): Added implementation of new test cases.
+
+       * doc/gnupdf-tsd.texi (pdf_text_get_unicode): Added new test cases to
+       test the new option in `pdf_text_get_unicode' function to get the string
+       with a NUL suffix.
+
+2008-11-10  Aleksander Morgado  <address@hidden>
+
+       * doc/gnupdf.texi (Text Data Types): Added new unicode option, to append
+       NUL suffix to the unicode string.
+
 2008-11-04  Aleksander Morgado  <address@hidden>
 
        * utils/pdf-filter.c (process_stream): We should check return value of

=== modified file 'doc/gnupdf-tsd.texi'
--- doc/gnupdf-tsd.texi 2008-11-02 20:30:16 +0000
+++ doc/gnupdf-tsd.texi 2008-11-11 19:21:15 +0000
@@ -2488,6 +2488,553 @@
 @end deffn
 
 
+@deffn Test pdf_text_get_unicode_055
+Get the contents of a non-empty text object in UTF-8 without BOM and with NUL suffix. The contents of the text object include characters that are encoded in UTF-8 using 8-bit, 16-bit, 24-bit and 32-bit.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_056
+Get the contents of a non-empty text object in UTF-8 with BOM and with NUL suffix. The contents of the text object include characters that are encoded in UTF-8 using 8-bit, 16-bit, 24-bit and 32-bit.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the BOM in UTF-8 and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_057
+Get the contents of an empty text object in UTF-8 without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must not be NULL, and should include the last NUL.
+
+3. The returned length must be equal to 1.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_058
+Get the contents of an empty text object in UTF-8 with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the BOM in UTF-8, NUL terminated.
+
+3. The returned length must be equal to the length of the BOM in UTF-8 plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_059
+Get the contents of a non-empty text object with lang/country info, in UTF-8 without BOM, with lang/country information embedded (which should not be supported in UTF-8) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_060
+Get the contents of a non-empty text object with lang/country info, in UTF-8 with BOM, with lang/country information embedded (which should not be supported in UTF-8) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+
+@deffn Test pdf_text_get_unicode_061
+Get the contents of a non-empty text object in UTF-16BE without BOM and with NUL suffix. The contents of the text object include characters that are encoded in UTF-16 using 16-bit and 32-bit.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_062
+Get the contents of a non-empty text object in UTF-16BE with BOM and with NUL suffix. The contents of the text object include characters that are encoded in UTF-16 using 16-bit and 32-bit.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the BOM in UTF-16 and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_063
+Get the contents of an empty text object in UTF-16BE without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must not be NULL.
+
+3. The returned length must be equal to the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_064
+Get the contents of an empty text object in UTF-16BE with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the BOM in UTF-16BE, NUL terminated.
+
+3. The returned length must be equal to the length of the BOM in UTF-16 plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_065
+Get the contents of a non-empty text object with lang/country info, in UTF-16BE without BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, including the lang/country information embedded, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the lang/country info and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_066
+Get the contents of a non-empty text object with language info (no country info), in UTF-16BE without BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, including the language information embedded, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the language info and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_067
+Get the contents of an empty text object with lang/country info, in UTF-16BE without BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only include the lang/country information embedded, NUL terminated.
+
+3. The returned length must be equal to the length of the lang/country info plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_068
+Get the contents of an empty text object with language info (no country info), in UTF-16BE without BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the language information embedded, NUL terminated.
+
+3. The returned length must be equal to the length of the language info plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_069
+Get the contents of a non-empty text object with lang/country info, in UTF-16BE with BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, including the BOM and the lang/country information embedded, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the lang/country info and the BOM and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_070
+Get the contents of a non-empty text object with language info (no country info), in UTF-16BE with BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, including the BOM and the language information embedded, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the language info and the BOM and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_071
+Get the contents of an empty text object with lang/country info, in UTF-16BE with BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only include the BOM and lang/country information embedded, NUL terminated.
+
+3. The returned length must be equal to the length of the lang/country info plus the length of the BOM and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_072
+Get the contents of an empty text object with language info (no country info), in UTF-16BE with BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the BOM and the language information embedded, NUL terminated.
+
+3. The returned length must be equal to the length of the language info plus the length of the BOM and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_073
+Get the contents of a non-empty text object in UTF-16LE without BOM and with NUL suffix. The contents of the text object include characters that are encoded in UTF-16 using 16-bit and 32-bit.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_074
+Get the contents of a non-empty text object in UTF-16LE with BOM and with NUL suffix. The contents of the text object include characters that are encoded in UTF-16 using 16-bit and 32-bit.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the BOM in UTF-16 and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_075
+Get the contents of an empty text object in UTF-16LE without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must not be NULL.
+
+3. The returned length must be equal to the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_076
+Get the contents of an empty text object in UTF-16LE with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the BOM in UTF-16LE, NUL terminated.
+
+3. The returned length must be equal to the length of the BOM in UTF-16 plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_077
+Get the contents of a non-empty text object with lang/country info, in UTF-16LE without BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_078
+Get the contents of a non-empty text object with language info (no country info), in UTF-16LE without BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_079
+Get the contents of an empty text object with lang/country info, in UTF-16LE without BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_080
+Get the contents of an empty text object with language info (no country info), in UTF-16LE without BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_081
+Get the contents of a non-empty text object with lang/country info, in UTF-16LE with BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_082
+Get the contents of a non-empty text object with language info (no country info), in UTF-16LE with BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_083
+Get the contents of an empty text object with lang/country info, in UTF-16LE with BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_084
+Get the contents of an empty text object with language info (no country info), in UTF-16LE with BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_085
+Get the contents of a non-empty text object in UTF-32BE without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_086
+Get the contents of a non-empty text object in UTF-32BE with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the BOM in UTF-32 and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_087
+Get the contents of an empty text object in UTF-32BE without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must not be NULL.
+
+3. The returned length must be equal to the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_088
+Get the contents of an empty text object in UTF-32BE with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the BOM in UTF-32BE, NUL terminated.
+
+3. The returned length must be equal to the length of the BOM in UTF-32 plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_089
+Get the contents of a non-empty text object with lang/country info, in UTF-32BE without BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_090
+Get the contents of a non-empty text object with language info (no country info), in UTF-32BE without BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_091
+Get the contents of an empty text object with lang/country info, in UTF-32BE without BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_092
+Get the contents of an empty text object with language info (no country info), in UTF-32BE without BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_093
+Get the contents of a non-empty text object with lang/country info, in UTF-32BE with BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_094
+Get the contents of a non-empty text object with language info (no country info), in UTF-32BE with BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_095
+Get the contents of an empty text object with lang/country info, in UTF-32BE with BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_096
+Get the contents of an empty text object with language info (no country info), in UTF-32BE with BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_097
+Get the contents of a non-empty text object in UTF-32LE without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_098
+Get the contents of a non-empty text object in UTF-32LE with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the BOM in UTF-32 and the length of the NUL suffix.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_099
+Get the contents of an empty text object in UTF-32LE without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must not be NULL.
+
+3. The returned length must be equal to the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_100
+Get the contents of an empty text object in UTF-32LE with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the BOM in UTF-32LE, NUL terminated.
+
+3. The returned length must be equal to the length of the BOM in UTF-32 plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_101
+Get the contents of a non-empty text object with lang/country info, in UTF-32LE without BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_102
+Get the contents of a non-empty text object with language info (no country info), in UTF-32LE without BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_103
+Get the contents of an empty text object with lang/country info, in UTF-32LE without BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_104
+Get the contents of an empty text object with language info (no country info), in UTF-32LE without BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_105
+Get the contents of a non-empty text object with lang/country info, in UTF-32LE with BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_106
+Get the contents of a non-empty text object with language info (no country info), in UTF-32LE with BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_107
+Get the contents of an empty text object with lang/country info, in UTF-32LE with BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_108
+Get the contents of an empty text object with language info (no country info), in UTF-32LE with BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+
+
 @node pdf_text_get_hex
 @subsubsection pdf_text_get_hex
 

=== modified file 'doc/gnupdf.texi'
--- doc/gnupdf.texi     2008-10-07 17:28:21 +0000
+++ doc/gnupdf.texi     2008-11-11 19:21:15 +0000
@@ -3857,7 +3857,10 @@
 @item PDF_TEXT_UTF16BE_WITH_LANGCODE
 Insert language/country code information between the BOM (if required) and the
 data. This option is ONLY applicable to UTF16BE. If specified in any other
-encoding, the function will fail. 
+encoding, the function will fail.
+@item PDF_TEXT_UNICODE_WITH_NUL_SUFFIX
+Append a NUL suffix to the Unicode string (1-byte NUL for UTF-8, 2-byte NUL for
+UTF-16 and 4-byte NUL for UTF-32).
 @end table
 @end deftp
 

=== modified file 'src/base/pdf-text.c'
--- src/base/pdf-text.c 2008-09-09 01:41:01 +0000
+++ src/base/pdf-text.c 2008-11-11 21:10:14 +0000
@@ -709,22 +709,55 @@
     {
       pdf_char_t header[PDF_TEXT_USHMAXL];
       pdf_size_t header_size = 0;
-      /* Clear header array */
-      memset(&(header[0]), 0, PDF_TEXT_USHMAXL);
-      /* Get requested header (BOM and/or lang/country info) */
-      pdf_text_get_unicode_string_header(header,
-                                         &header_size,
-                                         new_enc,
-                                         options,
-                                         pdf_text_get_language(text),
-                                         pdf_text_get_country(text));
+      pdf_size_t trailer_size = 0;
+
+      /* Compute header if needed */
+      if((options &  PDF_TEXT_UNICODE_WITH_BOM) || \
+         (options &  PDF_TEXT_UTF16BE_WITH_LANGCODE))
+        {
+          /* Clear header array */
+          memset(&(header[0]), 0, PDF_TEXT_USHMAXL);
+          /* Get requested header (BOM and/or lang/country info) */
+          pdf_text_get_unicode_string_header(header,
+                                             &header_size,
+                                             new_enc,
+                                             options,
+                                             pdf_text_get_language(text),
+                                             pdf_text_get_country(text));
+        }
+      /* Compute trailer if needed */
+      if(options & PDF_TEXT_UNICODE_WITH_NUL_SUFFIX)
+        {
+          switch(new_enc)
+            {
+              case PDF_TEXT_UTF8:
+                  trailer_size = 1;
+                  break;
+              case PDF_TEXT_UTF16_BE:
+              case PDF_TEXT_UTF16_LE:
+              case PDF_TEXT_UTF16_HE:
+                  trailer_size = 2;
+                  break;
+              case PDF_TEXT_UTF32_BE:
+              case PDF_TEXT_UTF32_LE:
+              case PDF_TEXT_UTF32_HE:
+                  trailer_size = 4;
+                  break;
+              default:
+                  trailer_size = 0;
+                  break;
+            }
+        }
       
-      if(header_size > 0)
+      if((header_size > 0) || \
+         (trailer_size > 0))
         {
           pdf_char_t *new_out_data = NULL;
           
           /* Allocate memory for new string */
-          new_out_data = (pdf_char_t *)pdf_alloc(out_length + header_size);
+          new_out_data = (pdf_char_t *)pdf_alloc(out_length + \
+                                                 header_size + \
+                                                 trailer_size);
           if(new_out_data == NULL)
             {
               return PDF_ENOMEM;
@@ -740,9 +773,16 @@
               /* Reset output data array, if any */
               pdf_dealloc(out_data);
             }
+
+          /* Store trailer (N-byte NUL) */
+          if(trailer_size > 0)
+            {
+              memset(&new_out_data[out_length+header_size],0,trailer_size);
+            }
+
           out_data = new_out_data;
-          out_length += (header_size);
-        }  
+          out_length += (header_size + trailer_size);
+        }
       else
         {
           PDF_DEBUG_BASE("Invalid unicode option requested (%u)",

=== modified file 'src/base/pdf-text.h'
--- src/base/pdf-text.h 2008-09-09 01:41:01 +0000
+++ src/base/pdf-text.h 2008-11-11 19:21:15 +0000
@@ -60,9 +60,10 @@
  *  Each of these enumerations is a Mask of Bits, so that multiple options 
  *  can be set at the same time */
 enum pdf_text_unicode_options_e {
-  PDF_TEXT_UNICODE_NO_OPTION     = 0x00,
-  PDF_TEXT_UNICODE_WITH_BOM      = 0x01, /* UTF-8, UTF-16(L/B)E, UTF-32(L/B)E */
-  PDF_TEXT_UTF16BE_WITH_LANGCODE = 0x02  /* UTF16BE */
+  PDF_TEXT_UNICODE_NO_OPTION       = 0x00,
+  PDF_TEXT_UNICODE_WITH_BOM        = 0x01, /* UTF-8, UTF-16(L/B)E, UTF-32(L/B)E */
+  PDF_TEXT_UTF16BE_WITH_LANGCODE   = 0x02, /* UTF16BE */
+  PDF_TEXT_UNICODE_WITH_NUL_SUFFIX = 0x04  /* UTF-8, UTF-16(L/B)E, UTF-32(L/B)E */
 };
 
 

=== modified file 'torture/unit/base/text/pdf-text-get-unicode.c'
--- torture/unit/base/text/pdf-text-get-unicode.c       2008-09-09 01:41:01 +0000
+++ torture/unit/base/text/pdf-text-get-unicode.c       2008-11-11 21:10:14 +0000
@@ -32,6 +32,8 @@
 
 #define INTERACTIVE_DEBUG   0
 
+
+
 /*
  * Test: pdf_text_get_unicode_001
  * Description:
@@ -2833,6 +2835,477 @@
 END_TEST
 
 
+
+
+/*
+ * Test: pdf_text_get_unicode_055
+ * Description:
+ *   Get the contents of a non-empty text object in UTF-8 without BOM and with
+ *   NUL suffix. The contents of the text object include characters that are
+ *   encoded in UTF-8 using 8-bit, 16-bit, 24-bit and 32-bit. 
+ * Success conditions:
+ *    1. The call to  pdf_text_get_unicode should return PDF_OK.
+ *    2. The returned string must be the expected one, NUL terminated.
+ *    3. The returned length must be the expected one, including the length of
+ *        the last NUL.
+ */
+START_TEST(pdf_text_get_unicode_055)
+{
+  extern const test_string_t utf8_strings[];
+  int i;
+  
+  /* Always INIT! Check runs each test in a different process */
+  fail_if(pdf_text_init() != PDF_OK);
+  
+  i = 0;
+  while(utf8_strings[i].data != NULL)
+    {
+      pdf_text_t text;
+      pdf_char_t *data = NULL;
+      pdf_size_t size = 0;
+      pdf_char_t *expected_data;
+      pdf_size_t expected_size;
+      
+      /* Set expected data and size (without BOM, with last NUL) */
+      expected_size = 1 + utf8_strings[i].size -3;
+      expected_data = pdf_alloc(expected_size);
+      fail_if(expected_data == NULL);
+      memcpy(expected_data, &(utf8_strings[i].data[3]), utf8_strings[i].size - 3);
+      expected_data[expected_size-1] = '\0';      
+      
+      fail_if(pdf_text_new_from_unicode((pdf_char_t *) \
+                                        utf8_strings[i].utf32be_data,
+                                        (pdf_size_t) \
+                                        utf8_strings[i].utf32be_size,
+                                        PDF_TEXT_UTF32_BE,
+                                        &text) != PDF_OK);
+      
+      
+      /* 1. The call to  pdf_text_get_unicode should return PDF_OK. */
+      fail_unless(pdf_text_get_unicode(&data,
+                                       &size,
+                                       text,
+                                       PDF_TEXT_UTF8,
+                                       PDF_TEXT_UNICODE_WITH_NUL_SUFFIX) == PDF_OK);
+      
+      
+      if(INTERACTIVE_DEBUG)
+        {
+          pdf_char_t *internal_hex = NULL;
+          pdf_char_t *expected_hex = NULL;
+          internal_hex = pdf_text_test_get_hex(text->data,text->size,':');
+          expected_hex = pdf_text_test_get_hex(expected_data,expected_size,':');
+          fail_if(expected_hex == NULL);
+          fail_if(internal_hex == NULL);
+          printf("pdf_text_get_unicode_055:%d:Internal> '%s'\n",
+                 i, internal_hex);
+          printf("pdf_text_get_unicode_055:%d:Expected> '%s'\n",
+                 i, expected_hex);
+          pdf_dealloc(internal_hex);
+          pdf_dealloc(expected_hex);
+        }
+      
+      /* 2. The returned string must be the expected one, NUL terminated */
+      fail_if(data == NULL);
+      fail_unless(memcmp(expected_data, data, size) == 0);
+      fail_unless(data[size-1] == '\0');
+
+      /* 3. The returned length must be the expected one. */
+      fail_unless(size == expected_size);
+
+      pdf_text_destroy(text);
+      pdf_dealloc(data);
+      pdf_dealloc(expected_data);
+      
+      ++i;
+    }
+  
+}
+END_TEST
+
+
+
+
+
+
+
+/*
+ * Test: pdf_text_get_unicode_056
+ * Description:
+ *   Get the contents of a non-empty text object in UTF-8 with BOM and with NUL
+ *   suffix. The contents of the text object include characters that are encoded
+ *   in UTF-8 using 8-bit, 16-bit, 24-bit and 32-bit.
+ * Success conditions:
+ *    1. The call to  pdf_text_get_unicode should return PDF_OK.
+ *    2. The returned string must be the expected one, NUL terminated.
+ *    3. The returned length must be the expected one, including the length of
+ *       the BOM in UTF-8 and the length of the NUL suffix.
+ */
+START_TEST(pdf_text_get_unicode_056)
+{
+  extern const test_string_t utf8_strings[];
+  int i;
+  
+  /* Always INIT! Check runs each test in a different process */
+  fail_if(pdf_text_init() != PDF_OK);
+  
+  i = 0;
+  while(utf8_strings[i].data != NULL)
+    {
+      pdf_text_t text;
+      pdf_char_t *data = NULL;
+      pdf_size_t size = 0;
+      pdf_char_t *expected_data;
+      pdf_size_t expected_size;
+      
+      /* Set expected data and size (with BOM, with last NUL) */
+      expected_size = 1 + utf8_strings[i].size;
+      expected_data = pdf_alloc(expected_size);
+      fail_if(expected_data == NULL);
+      memcpy(expected_data, &(utf8_strings[i].data[0]), utf8_strings[i].size);
+      expected_data[expected_size-1] = '\0';
+
+      
+      fail_if(pdf_text_new_from_unicode((pdf_char_t *) \
+                                        utf8_strings[i].utf32be_data,
+                                        (pdf_size_t) \
+                                        utf8_strings[i].utf32be_size,
+                                        PDF_TEXT_UTF32_BE,
+                                        &text) != PDF_OK);
+      
+      
+      /* 1. The call to  pdf_text_get_unicode should return PDF_OK. */
+      fail_unless(pdf_text_get_unicode(&data,
+                                       &size,
+                                       text,
+                                       PDF_TEXT_UTF8,
+                                       (PDF_TEXT_UNICODE_WITH_BOM | \
+                                        PDF_TEXT_UNICODE_WITH_NUL_SUFFIX)) == PDF_OK);
+      
+      
+      if(INTERACTIVE_DEBUG)
+        {
+          pdf_char_t *internal_hex = NULL;
+          pdf_char_t *expected_hex = NULL;
+          pdf_char_t *output_hex = NULL;
+          internal_hex = pdf_text_test_get_hex(text->data,text->size,':');
+          expected_hex = pdf_text_test_get_hex(expected_data,expected_size,':');
+          output_hex = pdf_text_test_get_hex(data,size,':');
+          fail_if(expected_hex == NULL);
+          fail_if(internal_hex == NULL);
+          fail_if(output_hex == NULL);
+          printf("pdf_text_get_unicode_056:%d:Internal> '%s'\n",
+                 i, internal_hex);
+          printf("pdf_text_get_unicode_056:%d:Expected> '%s'\n",
+                 i, expected_hex);
+          printf("pdf_text_get_unicode_056:%d:Output> '%s'\n",
+                 i, output_hex);
+          pdf_dealloc(internal_hex);
+          pdf_dealloc(expected_hex);
+          pdf_dealloc(output_hex);
+        }
+      
+      /* 2. The returned string must be the expected one, NUL terminated */
+      fail_if(data == NULL);
+      fail_unless(memcmp(expected_data, data, size) == 0);
+
+      /* 3. The returned length must be the expected one, including the length
+       *       of the BOM in UTF-8 and the length of the NUL suffix. */
+      fail_unless(size == expected_size);
+      
+      pdf_text_destroy(text);
+      pdf_dealloc(data);
+      pdf_dealloc(expected_data);
+      
+      ++i;
+    }
+  
+}
+END_TEST
+
+
+/*
+ * Test: pdf_text_get_unicode_057
+ * Description:
+ *   Get the contents of an empty text object in UTF-8 without BOM and with NUL
+ *    suffix.
+ * Success conditions:
+ *    1. The call to  pdf_text_get_unicode should return PDF_OK.
+ *    2. The returned string must not be NULL.
+ *    3. The returned length must be equal to the length of the last NUL.
+ */
+START_TEST(pdf_text_get_unicode_057)
+{
+  /* Always INIT! Check runs each test in a different process */
+  fail_if(pdf_text_init() != PDF_OK);
+  
+  pdf_text_t text;
+  pdf_char_t *data = NULL;
+  pdf_size_t size = 0;
+  
+  fail_if(pdf_text_new (&text) != PDF_OK);
+  
+  /* 1. The call to  pdf_text_get_unicode should return PDF_OK. */
+  fail_unless(pdf_text_get_unicode(&data,
+                                   &size,
+                                   text,
+                                   PDF_TEXT_UTF8,
+                                   PDF_TEXT_UNICODE_WITH_NUL_SUFFIX) == PDF_OK);
+  
+  /* 2. The returned string must not be NULL */
+  fail_unless(data != NULL);
+  fail_unless(memcmp(data, "\x00", 1) == 0);
+  /* 3. The returned length must be equal to the length of the last NUL. */
+  fail_unless(size == 1);
+  
+  pdf_text_destroy(text);
+}
+END_TEST
+
+
+/*
+ * Test: pdf_text_get_unicode_058
+ * Description:
+ *   Get the contents of an empty text object in UTF-8 with BOM and with NUL
+ *    suffix.
+ * Success conditions:
+ *    1. The call to  pdf_text_get_unicode should return PDF_OK.
+ *    2. The returned string must only contain the BOM in UTF-8, NUL
+ *       terminated.
+ *    3. The returned length must be equal to the length of the BOM in UTF-8
+ *        plus the length of the last NUL.
+ */
+START_TEST(pdf_text_get_unicode_058)
+{
+  /* Always INIT! Check runs each test in a different process */
+  fail_if(pdf_text_init() != PDF_OK);
+  
+  pdf_text_t text;
+  pdf_char_t *data = NULL;
+  pdf_size_t size = 0;
+  
+  fail_if(pdf_text_new (&text) != PDF_OK);
+  
+  /* 1. The call to  pdf_text_get_unicode should return PDF_OK. */
+  fail_unless(pdf_text_get_unicode(&data,
+                                   &size,
+                                   text,
+                                   PDF_TEXT_UTF8,
+                                   (PDF_TEXT_UNICODE_WITH_BOM | \
+                                    PDF_TEXT_UNICODE_WITH_NUL_SUFFIX)) == PDF_OK);
+  
+  /* 2. The returned string must only contain the BOM in UTF-8, NUL
+   *       terminated. */
+  fail_unless(data != NULL);
+  fail_unless(memcmp(&data[0], "\xEF\xBB\xBF\x00", 4) == 0);
+
+  /* 3. The returned length must be equal to the length of the BOM in UTF-8 plus
+   *      the length of the last NUL. */
+  fail_unless(size == 4);
+  
+  pdf_text_destroy(text);
+  pdf_dealloc(data);
+}
+END_TEST
+
+
+/*
+ * Test: pdf_text_get_unicode_059
+ * Description:
+ *   Get the contents of a non-empty text object with lang/country info, in
+ *   UTF-8 without BOM, with lang/country information embedded (which should
+ *   not be supported in UTF-8) and with NUL suffix.
+ * Success conditions:
+ *    1. The call to  pdf_text_get_unicode should NOT return PDF_OK.
+ */
+START_TEST(pdf_text_get_unicode_059)
+{
+  /* Always INIT! Check runs each test in a different process */
+  fail_if(pdf_text_init() != PDF_OK);
+  
+  pdf_text_t text;
+  pdf_char_t *data = NULL;
+  pdf_size_t size = 0;
+  pdf_char_t *utf8data = (pdf_char_t *)"GNU's not Unix";
+  pdf_size_t utf8size = strlen((char *)utf8data);
+  const pdf_char_t *language = (pdf_char_t *)"en";
+  const pdf_char_t *country = (pdf_char_t *)"GB";
+  
+  fail_if(pdf_text_new_from_unicode(utf8data, utf8size,
+                                    PDF_TEXT_UTF8,
+                                    &text) != PDF_OK);
+  fail_if(pdf_text_set_language(text, language) != PDF_OK);
+  fail_if(pdf_text_set_country(text, country) != PDF_OK);
+  
+
+  
+  /* 1. The call to  pdf_text_get_unicode should NOT return PDF_OK. */
+  fail_unless(pdf_text_get_unicode(&data,
+                                   &size,
+                                   text,
+                                   PDF_TEXT_UTF8,
+                                   (PDF_TEXT_UTF16BE_WITH_LANGCODE | \
+                                    PDF_TEXT_UNICODE_WITH_NUL_SUFFIX)) != PDF_OK);
+  pdf_text_destroy(text);
+}
+END_TEST
+
+
+/*
+ * Test: pdf_text_get_unicode_060
+ * Description:
+ *   Get the contents of a non-empty text object with lang/country info, in
+ *   UTF-8 with BOM, with lang/country information embedded (which should
+ *   not be supported in UTF-8) and with NUL suffix.
+ * Success conditions:
+ *    1. The call to  pdf_text_get_unicode should NOT return PDF_OK.
+ */
+START_TEST(pdf_text_get_unicode_060)
+{
+  /* Always INIT! Check runs each test in a different process */
+  fail_if(pdf_text_init() != PDF_OK);
+  
+  pdf_text_t text;
+  pdf_char_t *data = NULL;
+  pdf_size_t size = 0;
+  pdf_char_t *utf8data = (pdf_char_t *)"GNU's not Unix";
+  pdf_size_t utf8size = strlen((char *)utf8data);
+  const pdf_char_t *language = (pdf_char_t *)"en";
+  const pdf_char_t *country = (pdf_char_t *)"GB";
+  
+  fail_if(pdf_text_new_from_unicode(utf8data, utf8size,
+                                    PDF_TEXT_UTF8,
+                                    &text) != PDF_OK);
+  fail_if(pdf_text_set_language(text, language) != PDF_OK);
+  fail_if(pdf_text_set_country(text, country) != PDF_OK);
+
+  /* 1. The call to  pdf_text_get_unicode should NOT return PDF_OK. */
+  fail_unless(pdf_text_get_unicode(&data,
+                                   &size,
+                                   text,
+                                   PDF_TEXT_UTF8,
+                                   (PDF_TEXT_UNICODE_WITH_BOM | \
+                                    PDF_TEXT_UTF16BE_WITH_LANGCODE | \
+                                    PDF_TEXT_UNICODE_WITH_NUL_SUFFIX)) != PDF_OK);
+  pdf_text_destroy(text);
+}
+END_TEST
+
+
+/*
+ * Test: pdf_text_get_unicode_061
+ * Description:
+ *   Get the contents of a non-empty text object in UTF-16BE without BOM and
+ *   with NUL suffix. The contents of the text object include characters that
+ *   are encoded in UTF-16 using 16-bit and 32-bit.
+ * Success conditions:
+ *    1. The call to  pdf_text_get_unicode should return PDF_OK.
+ *    2. The returned string must be the expected one, NUL terminated.
+ *    3. The returned length must be the expected one, including the length of
+ *        the last NUL.
+ */
+START_TEST(pdf_text_get_unicode_061)
+{
+  extern const test_string_t utf16be_strings[];
+  int i;
+  
+  /* Always INIT! Check runs each test in a different process */
+  fail_if(pdf_text_init() != PDF_OK);
+  
+  i = 0;
+  while(utf16be_strings[i].data != NULL)
+    {
+      pdf_text_t text;
+      pdf_char_t *data = NULL;
+      pdf_size_t size = 0;
+      pdf_char_t *expected_data;
+      pdf_size_t expected_size;
+      
+      /* Set expected data and size (without BOM) and with last NUL */
+      expected_size = utf16be_strings[i].size -2 + 2;
+      expected_data = pdf_alloc(expected_size);
+      fail_if(expected_data == NULL);
+      memcpy(expected_data, &(utf16be_strings[i].data[2]), utf16be_strings[i].size -2);
+      memset(&expected_data[expected_size-2], 0, 2);
+
+      
+      fail_if(pdf_text_new_from_unicode((pdf_char_t *) \
+                                        utf16be_strings[i].utf32be_data,
+                                        (pdf_size_t) \
+                                        utf16be_strings[i].utf32be_size,
+                                        PDF_TEXT_UTF32_BE,
+                                        &text) != PDF_OK);
+      
+      
+      /* 1. The call to  pdf_text_get_unicode should return PDF_OK. */
+      fail_unless(pdf_text_get_unicode(&data,
+                                       &size,
+                                       text,
+                                       PDF_TEXT_UTF16_BE,
+                                       PDF_TEXT_UNICODE_WITH_NUL_SUFFIX) == PDF_OK);
+      
+      
+      if(INTERACTIVE_DEBUG)
+        {
+          pdf_char_t *internal_hex = NULL;
+          pdf_char_t *expected_hex = NULL;
+          internal_hex = pdf_text_test_get_hex(text->data,text->size,':');
+          expected_hex = pdf_text_test_get_hex(expected_data,expected_size,':');
+          fail_if(expected_hex == NULL);
+          fail_if(internal_hex == NULL);
+          printf("pdf_text_get_unicode_061:%d:Internal> '%s'\n",
+                 i, internal_hex);
+          printf("pdf_text_get_unicode_061:%d:Expected> '%s'\n",
+                 i, expected_hex);
+          pdf_dealloc(internal_hex);
+          pdf_dealloc(expected_hex);
+        }
+      
+      /* 2. The returned string must be the expected one, NUL terminated */
+      fail_if(data == NULL);
+      fail_unless(memcmp(expected_data, data, size) == 0);
+
+      /* 3. The returned length must be the expected one. */
+      fail_unless(size == expected_size);
+      
+      pdf_text_destroy(text);
+      pdf_dealloc(data);
+      pdf_dealloc(expected_data);
+      
+      ++i;
+    }
+  
+}
+END_TEST
+
+
+/*
+
+CONTINUE HERE WITH TESTS FROM #62 to #108.
+
+Check:
+http://gnupdf.org/Lib:Test_Specification_Document#pdf_text_get_unicode
+
+And the corresponding Flyspray task.
+
+Tests from #55 to #108 are EQUIVALENT to tests from #1 to #54, with the only
+difference that the new PDF_TEXT_UNICODE_WITH_NUL_SUFFIX option is used. 
+
+Check the pass/fail criteria from the TSD for each test.
+
+The next test to be done is the #62, which is equivalent to #8.
+
+Also, take into account that the last NUL has different sizes depending on the
+encoding:
+  - 1 NUL byte for UTF-8
+  - 2 NUL bytes for all the UTF-16 encodings 
+  - 4 NUL bytes for all the UTF-32 encodings 
+
+*/
+
+
+
+
 /*
  * Test case creation function
  */
@@ -2895,6 +3368,15 @@
   tcase_add_test(tc, pdf_text_get_unicode_052);
   tcase_add_test(tc, pdf_text_get_unicode_053);
   tcase_add_test(tc, pdf_text_get_unicode_054);
+
+  tcase_add_test(tc, pdf_text_get_unicode_055);
+  tcase_add_test(tc, pdf_text_get_unicode_056);
+  tcase_add_test(tc, pdf_text_get_unicode_057);
+  tcase_add_test(tc, pdf_text_get_unicode_058);
+  tcase_add_test(tc, pdf_text_get_unicode_059);
+  tcase_add_test(tc, pdf_text_get_unicode_060);
+  tcase_add_test(tc, pdf_text_get_unicode_061);
+
   
   
   return tc;

# Begin bundle
IyBCYXphYXIgcmV2aXNpb24gYnVuZGxlIHY0CiMKQlpoOTFBWSZTWSVtx00AHCR/gH9zAAD/////
f+//zv////5gH762eh7sdFmaB86aShrcwXYa40xu3I1La1ihVKD7m9vLCLYHo64J2oySKnsN7GqB
FSlBUqpSaWGUqgIKUgSUTU0gY0gZDQ0ehpAaeoAGgNAABpoAACUCAgE0REbSHqZD1A0ADQAAAAAA
A5p+VPTKGowABMAAAAAmAAAAAACTUk1MhBPSnkmTanqD2pGCMmnkmnqGgeoNAAfqgGEARJEJoE0N
JgBCnjQEymjApkNM00epo1NPRHpMRsgFSRAgTTRpMEaE9AEp+mjJTMmU2kGTJo9QB6nqNGjRtgPY
JE1QClCqBSRECRACI++FJnPatdjgziFoVJFOwSmIefK39MalmMKz1yk+gSgyrIQqIiEfV1UdMtO4
XXyr8BVFVKKcNvG2m/Lub35jytT+N5VTwatUBNTraB62zTcsFEQoLU2TaMaYFDxyzLCmzG5/d/hm
meSfuMKygmcZw6rBeIGDJWRU5RS1PJJQWCiBnBCCsGJD+n578XTsiaKpzDk06zELtLgFgohBpoAh
CBjSGhecTS/x1QuvF2n9e7xuidHd3dXVs8XA8UfEsXjA5hRiLqgL4hpOqqYNiGikR3CjkqPhFGCj
Y/5w3beDtl64e3bs0s7kaMP3+KoMtTgCBx4kERQA4himERAoFIY8oAY48SCImJatoiOhEyP3UQTm
RFQRf8lyzC+bHRmPpq+CWk8O/Bat1s5HZuONae0Ioi83LKRKgorsFIhGCCBzqFIYQUGp76D3xsgg
YEWIJCH0WAoYhEQswiu7vLbt1AXPcai0woFkkWIwC5IhSQf1hugnmLB/TFmzotcB4nFsWO1f+H7j
x9x/O/FoAKW3t8oIbHUnEoSSCJ3iaTm4zv0PUMLeSYG9cMYOa5qnd6NhxFbS0jBDwH5BOEDqWEJE
ixYqgvSnBC52aAOiB1cdzAo2znyw6fpSWp2+Ct2EKh1WoEi8iyFWBYFWiszHu1riWshSnQgdWr4O
wy9u3iP1KmKVx0zxpOcYFSCTWWIYTbJavK4Qw5j7TxHmOokQwcPZZLv3rFi05eRKLjdbJYegUYKP
H63M6F6sAIeEIsWEahCqRFWCCDBBiifDJhM1qkvs1W0L3966QPOGsVsA9dyTV28vdtb7zEsRAdcG
5oUZeIBOwQgxUPGCBi+qen108edOgYTjDIUYv12zFGy+pBQ22LxdBEUqFG+hEkRziSJUciTDLacn
+F6R85o5t9L+JAOvYUC+3Uh6eq3Fxugs9/VMrlwq9BwAsCBEkSMGS+eM5IOuLeCr7XkBBZUN7kiS
Zlp6IYSQZhvfv55aUkTThJXU/y0vJybBC4283PdDLfLCNs8cvM5yUgJIDYVaH1L3fNOI5w+Spyty
3jFM0cg1F5wVCFypRYXuuUouLxdcpRcXuuUouLpjaYGFFWMDjgG5hikaBeSSSRRCEIZNmtp0INaI
GmwC6LNAUMltVXQYJChJZDyoUhSlIaNak7WLDK97VywNTg1ybmSyQbJhEXYUiNIXATRgp4/TLWz4
No35XA1hCI2gvH330VVVVVVVVVVVVRVPmFHAMl7MidNK8LMb4H2klFK2khuR0Urtk4sH5kGGNiwL
YSzoXczLuwMBQnAdSC8MH0gbQHAMRR09WS5IvKbV2SBXHig5oQELrwgHOhV+q1V0zYcQ2IEIEY9/
hNWaK46AE3ZBVs7BWcte4LlLBpIWIRkMChhRUdnZmd2FFB1d3d3dmqZ83YeqBPMKZCPu5/VouqO6
tx4PZggTo4zHzcttnWnYGPXsomiSijwV4Zv69HrSqVQUtETCkzIyIUFrF7iKRQBE1zrHmSZQJv5n
u+b7Ore4seYzGU7nLcz9/geUccdCtHe0Wdd1B3dq7l6JhmzV0OnJoOpAn1ljuMtGJRiHKHgKVKIh
9SxLIHXFD1r7VpwjEP7GVi3vD9jR9Tc8kXZpXxrRZ7l/kepeDQu2jE3mBg3W5Z5jlLbjdK3Gsyec
NfLe7d+BwcG7cfZ8+eeDg2zTxUAYfqxxks3Xlclo8y4iFk4l/Zwb4hDgcddtsIQ0qh0fCz+D8nD0
+g112Ovw9feN86UjOgJuG72nkudl06sUe7nRNR20ZImBtITk0VCh7OQAqdrI6Fze3UsfezMBtNhD
upD7Y47xz2OLh3PJjrGBLAVJCX0YmkLmijm5+nm6c8s4e0OY54Q+gCgT3xUpRixILH8AUD9EVA/U
zz/I/OusOANgcgUQs0eG7dpu0+qeXp7t+Phj3dtsdmsMtlBajVsCvlCyQIBjYwOLE2p35rmkREVV
VVVFRVM5IWcCity4uHyfDobm7vRDzaV4sF3cwatQTyQ5DjLv9RJwTZNGhwYMJSx2OezGfa+FpUSe
Y03ew2oTbvdE4z1SHpmUDemLH1SlK4li6L7P14dsDNjI2+UHKnQ8bXOE+hM6an8z3dPm9YJPP49q
AgknAe7o14wvmZiXEspgKJutUFCihV6Wk23RMVKKIythPC14XM2hBnDlSHUkNOjy+G+2VRTdv704
MZ/AeSFxrhIwofYonpP2xOc9TCMCMX91EjKfywTlDp7vx8bVUnn4cOOIAcd/dLH2HHDLRfRJcbHk
7Id8nk6vFWHEr1xygF4tRDthWQftbWZG1HwOXGXHymYwmQCkhHTepxmOJYhpmup0RlKhmShwWECZ
FyZmTqzCOEJI/lASFV1xr3ViGIECrUHgL3XEmR7NFTAl6QJJBUNk2yxyKyU/bTfrPp8lkbOcn75v
bWxtLOl8dT29KaOMh8yNppIbokwy/vlrXxFtnJhya8TLCUsn58cMxzwkyyN/5bN/3kIkT0ijnnzZ
u2rthKkkjIR2HDZF0bdFQ2I5p0mKLSL1Dy9BioWdGIzcYYF6L6SruWRxE4C5+DApdeQJmMeuhBh2
GENNNRCcI+bgouRrzyRAwCOrVrRxQ08JWniPjxFIERIHuXU6lhQrcCQd9lCehyJbLUTHE4mhXTiI
VSpoKcSpkdAhgOSNOPPjVUbOnKvNE1znwrI5IT5xkel5dlNiaBM6ROpgUxKiVVBtzAogI4rr4dut
0CcEhub5vWbt2c6d7hKLbl2Bg8bM3KmWEOs4OZHxXcoX15VygTcFJkQvWF8O+9EEg0EEOnXZOqUn
W1up0O1nGuRrw5HHp4YYuN9zZlqdOeZo9FoHRVJUCqaFotDl5eI1vAZ8mvXgcIGlDFc2XqGCI7k4
phM2TgbGHFTWmwtz1fJD6mRJdyDRTDjIdO9JjMYjlzDbnqbkhjIxLGqdoN08YyTHSesb4ywVmagc
MXNnTMH0EHWBcDiTHEKC37xkrHZvpRJ6GaozaA5lAZo6XGcsuuKdzJtY49PX683DkKdaNMFmpuU5
29qnGcpthR9YZuG9pmiqbzd61bVlYUlET4AwyIFjYligXOGGRQwGRMh7g5v03KHLXrhQkTY6wpqK
QKUumBAxCc4oOUNzJDckZqUGY3Ugnw/vgXNy5Y1DQUYuYnW6UoiBry5sqvZrbYzykLLdIrVZgpFZ
51Wml2i3uTZgdjqWm51N7Fhr51GZxSLmic3N2nIKT6+cjA3m5OS5EGKPO2Y/EhrJwkaETi4NTFhJ
0sHQ/hOltU/NId05S85tmWV76tKmulKNSN4oahFzUYA6EQHGwyxHvcYMUVLgbijURhsSDdHks2kL
MqStEhnst3yJ4KlXmtrbGbe1xujIlF2Q5JdqZPFLbt7GhONbvjWUZvQVRApSyK4L4ADCSJh3rOua
oBY2GBAv2TMYdZ9YYxIaG0KknkZ1UsQ5wLHAxlsbjdfIn34h3ksr8p4kc4iYmY5oYkhIKGYqaIhh
2Zhz3rvxKtlxwpDpDuQDCbMKkhQmv3IgZjclSiKmBozmw1VsY1dalh7LM0JAxQcX4wHiIX6twI5T
SnRSnJZOyElKdfT2saywlSgSvDrXi4ktu3wzukEgN+kTlRw2muOEQ9TR3lkLMQ6GGVarI8Lk0Pa/
E0ARkCK8QjhbOWIoHdS4UHsEhgHRQzkHNx8gNRTpEA2CH12OdkqCE+XwUFkXZKhEokQogkYMaR35
vWSiSQhNgUUfhKNrtYqYVLqKVKKjOJPK532Q+13rXZrmppPnm3XMWTN703AxSJxQRZv+pLyRNMUR
oUf7P3dQX+pf1V0wkNhopYe83hushBII97sa1YIKrBCIHvA1RQWrKAHgYQqIsIHbnI6Q2EDR4puG
MlhD8vIGiJA8skAPskuBh+3a3HZ1POwFVQQQQRVVQQQQuMCgGRAphS3Gl/gwm3vxcNPS4xBepyHQ
EB/Q3UBNYP+4OjHlyVALYO84QhbZIskkIIySIGfCW32ROPIQuvM3KAsv3imm7tB4HwIqXHjigRiJ
PwSFMEANAUSbYcPCsGCsGKFSVVnZlITu7mshOvgW2dMmhoh4pMhR04WcuwrAX7ICCYg/cKRFTq7o
IDje/MqPJkqptYn2/SU/bG8LRsxuELE4eDcnPZejaBOPAXsWcAObIKvYukUd2a2FpwFwgiaFATAU
v2cTllvXrESkV37N9kUKiKGbkDyP32VVNqhjiogUqQNgSkSgeLhET0fkDzsYIQyV+USlPJbdFa8O
00bDtiDoTRAiJyYyoiatNznA1rngdiKnh84P4vUvkXARIImSJqX6UQxD40LMVqwsX9KwVE+a5Xyt
VGmMKkWpKYVKJNyWciEKgVZVi0jBcSXOn2rAcwhxBnGLFYJlGoFgK0gGFo/HJCQhGQkLoBQllmti
YrZcRUGCfhFRINxHjsJSipOuoFkh+x+pyEdz9ja/S83IrBwb+5ZguxfraNxNrFo//0Ykib8x+jf0
Q0fc9smROk/CSTsM4nI66kJYm6GZWSHah/8D9z4/nfLFRQqLH+z7/J4HkINyOeVKr2zcjeyJOVQJ
zogeJADmGKs6/0mFjT1UInMNxOoUluBexIhsw1AhhlBIHaA+eIjuVbApkW1CRNKiVlRz3yYRlUtx
hGRFU6ycY+M4oD5Rqjh9Qtboex7TyPIkGLR3prk8JI5nyInYkHtvQSFIACA/AgCSEpWTVSO5h5sL
TEhyxJYh53nkswTz2YzyOh0qeB4GKeJzMGbucxBvk5JLO7yP2m574knb3Oi/PJviSd7F69kQ0G7U
8CYd7ufc90Dmhs10M1rakHljCJJLQPcKNCjQdZkKIbv/TYlgCArqaM3xe1baxUTYhqd55mr/AVSC
RShWUwVKCWtYRZQikhYskKiBQ8Sok0L9L1MHis8Tmd7F62B5mqRzTytZOC5/dc+cEgK7l2ms53pP
wHP0rGzHnY03J/VbNQbtym3xNOWMKC/r8x5TecaG4EP9QgK8fphSFSojkdq0RG1JXEhmpYUjU6uZ
InekStvWUUCqQUnLrk1EifGJ1KiENWg02VvSMy0STabDzsx5cdFeMUe8+BD4yINhRsi39DRPUvet
gQ0hUkF8RSFhGQytjbQT8N+MhHyQxgqEFkNAUh32sAkNOeFlM3S0Ua2mKmkUIJrZRZMALmjhPtUL
aO8TsdQvmjnSSxoz13fY8QqTO4iB45I7YxoUOAKbiyk5FOlG07+A6VLsc0I5pN08AxiWmFSYh0OD
sHUTqM0mrmeZueTOH4oNKIXkhNIhyylFPDUWBUo4pCywiyQsLSJzVSy0STgSOd4cmtyORoijwCgb
w0qjtHhBTkAwK+AkkkJIqioiCqqqqrFVVVEbTe9Bi8zdlDvU6HoJErKQTEVqslT6U3JBcOE5dSF6
tnSVUg7clohdKopz9cLYdVRJNozsvetz3H5ZNvaQbSOU1Q+wUYJAIpR37Fs3+h+sVenaxhAIRYRs
CHySkjxcUO1Eax4JEfknzST4Sj8NP9G6YEmDGn+b+hs8WYoGqCiBZhyHEkT3gdivSoDGS6QwiWk4
PUyetgnsj2Pa0eNZgxE9zGCDvyan1cvgTpktC0oVSpRUpUw1tj0I2KhwkPkQMiRLfnHsGsZJIjpS
SWiUSPWWntjmR9K8+Sc+dlAK7FSW7S5dL0qAo4fCXuoo+MklFFlRKNPe2pIjsWR61+M6fdCSSfDz
qGlU7yCAGagcQERWIMARIDDRVUUekSL6kT9hQcMBTdkpwREPMCBEViEREJGAoRi84ezqrKnJ/xsh
+jNEUI9iOSENkDx/kQU2NH91gl/K/I4Ot9ghvPo8n4UQt0IinCRsNT2xclVIfL6ZDt79FRE2DZKC
xPBsNiILw9MklkH1xPBaETe6YbqoUKQZIEIicC3fhdz1DZwWV6TMsZm4iUoUiqFQGUl5E+tI+EOc
VBXD1yRMrQhN8STg9HFmoDaQn5a7vWOgNYoHkfRD2O78Sh8T1nYhEhrlPjrtTb3RPoSDY+jmgypI
lQWPR4Uc24nLMkMpCeGA8qhNspIdcIaEiYyJ6/sNED6XjaalJJFMMyeIi1qEjW5UOY13+Vg6EG2R
jJqJ+nuapIlD32TfBKJHzAobpjSPTHsnrr1cM1fB7HE2yD1NZQ8abDyCYmlFcSPc84KHkbd4o4uk
EWSJaIGa8okaiSi8Kn/UmRYQ3j6KUchDSJkq4ugoQKR4WWBioxIKMOeIoXRgoGV4KD0Q0Apo9vct
J1/heOzx47ZETUkSiQ2wFRRiprUBpV3HAgfviAJRWC2ogJiJtDQJ55eEmElDSpM5qGFC8KlCgfU3
VLDLgKUKJ2oFh6Rx83N7ChKWCwkop3hSxRFRDbO0+7ZN6QedDUdcn3W8qRJL4hH7qCfBmdU+Vxci
Jso9BvFHiQCIvddFdmsBNxmKliZQGKRHXhDA7J4+iT010pPqh+JeCcuzfEMpOWSdn0QhjCRUDvbu
/f/L774MOM419p4UgrzqHBY5sl8B6z1g2t+OR8q2enaJR7WBHcNEhTI0saIwkLNNBc/hSz/rycCW
i5xKaiUxYsWLQySGSSDYQ0NKaTAkk0m0T8/zqg9R5+oD6fVkan+sX5efNEOr4dqIX41yXhOcIbQ8
ywNiK9Qw9OckeaQOdnWDohDJNQhUFqiqaCB6sSJdEXiayd7zdHPkmdIz45mfjZQySOOFBDi+pYdC
ku907UbZKiko1TXvQN8nEWZQT+cykk6E4EFoiiQzky4ikiari0nDUzhRCilHJJgupESBo3ijsI71
615wHKyK56cVwsYP7yCgeBC+kTCs/1rpBL5xroI2WxmEK1ol8D7kgxkubialSbMXru8U5y6KR8/P
BDITcKU3PcEpJCJ4id6RswLxPmDAjQfrfppPdAxiDlt4QeB+LsMIob99qAUy/VLFiHMiS0KQ6iyR
KlA6ImUCa0GH5RRupSe8iqmHozyQCZoB6+l5VIKdkMRaS6FFIjZM5unnhMJIhjS6BQIpaCLdccVr
C7out8EVywq1XHIELXcVwXJcWAFgIjcvZCwknFZaXsLGYasMmjECywUEpssMxstmwUt4CRgtkQTA
VDaKO4UckKxJR1LS6uNNJpM3a5+EEKASwLpoFdCo0KBylaRH54zNwEMUNRgh/KcSyLtoegCCgdi2
Q+0PxtB8Me59KegJD15rqAlBSTsYOWTqiHM7eilJCKSSKpog6KIH0UqiI+NBjA7u0udpI/kU7bVQ
q2SRLlSeZGKIclCd0T42unj4Jefedbqv16iDSoklCRRSIb8XgXJufX3sqCFOWJJTEdMknzocIV9f
SIFtggHnUK5kENpy8xT1m+CyIOd1KWhB3URwk1QkcIbohnBDkkakeb2EhnA40qKKYdan4vQllyiw
fOfMXLh9mhOl6+XhGyp+Z5XyifWN0LYDSOeTZtBVFHRUS1JIcaEi1IV9/RN7SI3lRHbLAfqeg/sP
6UXn3mEhmfOYqn2MBUG4nDy9pSEIRATtLRFe1twLBsQ/2DS/XkJouu0ehv5eIUHo4C/yPtT/B+Zc
BR/wda3wMWBWK4ij/kXckU4UJAlbcdNA
