[pdf-devel] New option in pdf_text_get_unicode
From: Aleksander Morgado
Subject: [pdf-devel] New option in pdf_text_get_unicode
Date: Tue, 11 Nov 2008 22:20:55 +0100
User-agent: Thunderbird 2.0.0.17 (X11/20080925)
Hi Jose,

I added a new option to the `pdf_text_get_unicode' function that returns
the Unicode string NUL-terminated (1-byte NUL for UTF-8, 2-byte NUL for
UTF-16, and 4-byte NUL for UTF-32).

This may be useful, for example, when calling Windows API functions that
expect NUL-terminated UTF-16LE strings.
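
As a rough illustration of the suffix widths described above, here is a
standalone sketch. It is not the code from the attached patch: `enc_t',
`nul_suffix_size' and `append_nul_suffix' are hypothetical names used only
for this example, and it uses plain malloc instead of the library allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoding tags for this sketch; not the library's own enum. */
typedef enum { ENC_UTF8, ENC_UTF16, ENC_UTF32 } enc_t;

/* Width in bytes of the NUL terminator for each encoding form. */
static size_t nul_suffix_size(enc_t enc)
{
  switch (enc)
    {
    case ENC_UTF8:  return 1;
    case ENC_UTF16: return 2;
    case ENC_UTF32: return 4;
    }
  return 0; /* unknown encoding: no suffix */
}

/* Return a newly allocated copy of data[0..len) followed by the NUL suffix.
 * The total size (payload plus suffix) is stored in *out_len.
 * Returns NULL if the allocation fails; the caller frees with free(). */
static unsigned char *append_nul_suffix(const unsigned char *data, size_t len,
                                        enc_t enc, size_t *out_len)
{
  size_t trailer = nul_suffix_size(enc);
  unsigned char *out;

  assert(out_len != NULL);
  assert(data != NULL || len == 0);

  out = malloc(len + trailer);
  if (out == NULL)
    return NULL;
  if (len > 0)
    memcpy(out, data, len);
  memset(out + len, 0, trailer); /* N-byte NUL */
  *out_len = len + trailer;
  return out;
}
```

With this sketch, an empty UTF-16 payload yields two zero bytes and a
reported length of 2, which is the behaviour the new TSD cases for empty
text objects without BOM ask for (length equal to the length of the NUL).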

I also added 54 new test cases to the TSD and implemented 7 of them. I
will open a new task in Flyspray for the remaining ones. The tests are
really simple, as they are based on the 54 tests that already existed
for the `pdf_text_get_unicode' function, so if anyone wants to take
them, please feel free to do so.

Please find attached a patch that includes the source code changes, the
documentation updates (API and TSD) and the 7 implemented unit tests.

Cheers,
-Aleksander
# Bazaar merge directive format 2 (Bazaar 0.90)
# revision_id: address@hidden
# gh0tl1e12xvx88hk
# target_branch: file:///home/aleksander/Development/gnu/libgnupdf\
# /libgnupdf-repo/trunk/
# testament_sha1: 825def469327cf90b90bf3d8ce5d2d02e8876132
# timestamp: 2008-11-11 22:11:22 +0100
# base_revision_id: address@hidden
#
# Begin patch
=== modified file 'ChangeLog'
--- ChangeLog 2008-11-03 23:51:44 +0000
+++ ChangeLog 2008-11-11 21:10:14 +0000
@@ -1,3 +1,20 @@
+2008-11-11 Aleksander Morgado <address@hidden>
+
+ * src/base/pdf-text.c (pdf_text_get_unicode): Correct out_length when
+ using PDF_TEXT_UNICODE_WITH_NUL_SUFFIX.
+
+ * torture/unit/base/text/pdf-text-get-unicode.c
+ (test_pdf_text_get_unicode): Added implementation of new test cases.
+
+ * doc/gnupdf-tsd.texi (pdf_text_get_unicode): Added new test cases to
+ test the new option in `pdf_text_get_unicode' function to get the string
+ with a NUL suffix.
+
+2008-11-10 Aleksander Morgado <address@hidden>
+
+ * doc/gnupdf.texi (Text Data Types): Added new unicode option, to append
+ NUL suffix to the unicode string.
+
2008-11-04 Aleksander Morgado <address@hidden>
* utils/pdf-filter.c (process_stream): We should check return value of
=== modified file 'doc/gnupdf-tsd.texi'
--- doc/gnupdf-tsd.texi 2008-11-02 20:30:16 +0000
+++ doc/gnupdf-tsd.texi 2008-11-11 19:21:15 +0000
@@ -2488,6 +2488,553 @@
@end deffn
+@deffn Test pdf_text_get_unicode_055
+Get the contents of a non-empty text object in UTF-8 without BOM and with NUL suffix. The contents of the text object include characters that are encoded in UTF-8 using 8-bit, 16-bit, 24-bit and 32-bit.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_056
+Get the contents of a non-empty text object in UTF-8 with BOM and with NUL suffix. The contents of the text object include characters that are encoded in UTF-8 using 8-bit, 16-bit, 24-bit and 32-bit.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the BOM in UTF-8 and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_057
+Get the contents of an empty text object in UTF-8 without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must not be NULL, and should include the last NUL.
+
+3. The returned length must be equal to 1.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_058
+Get the contents of an empty text object in UTF-8 with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the BOM in UTF-8, NUL terminated.
+
+3. The returned length must be equal to the length of the BOM in UTF-8 plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_059
+Get the contents of a non-empty text object with lang/country info, in UTF-8 without BOM, with lang/country information embedded (which should not be supported in UTF-8) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_060
+Get the contents of a non-empty text object with lang/country info, in UTF-8 with BOM, with lang/country information embedded (which should not be supported in UTF-8) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+
+@deffn Test pdf_text_get_unicode_061
+Get the contents of a non-empty text object in UTF-16BE without BOM and with NUL suffix. The contents of the text object include characters that are encoded in UTF-16 using 16-bit and 32-bit.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_062
+Get the contents of a non-empty text object in UTF-16BE with BOM and with NUL suffix. The contents of the text object include characters that are encoded in UTF-16 using 16-bit and 32-bit.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the BOM in UTF-16 and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_063
+Get the contents of an empty text object in UTF-16BE without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must not be NULL.
+
+3. The returned length must be equal to the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_064
+Get the contents of an empty text object in UTF-16BE with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the BOM in UTF-16BE, NUL terminated.
+
+3. The returned length must be equal to the length of the BOM in UTF-16 plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_065
+Get the contents of a non-empty text object with lang/country info, in UTF-16BE without BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, including the lang/country information embedded, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the lang/country info and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_066
+Get the contents of a non-empty text object with language info (no country info), in UTF-16BE without BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, including the language information embedded, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the language info and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_067
+Get the contents of an empty text object with lang/country info, in UTF-16BE without BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only include the lang/country information embedded, NUL terminated.
+
+3. The returned length must be equal to the length of the lang/country info plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_068
+Get the contents of an empty text object with language info (no country info), in UTF-16BE without BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the language information embedded, NUL terminated.
+
+3. The returned length must be equal to the length of the language info plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_069
+Get the contents of a non-empty text object with lang/country info, in UTF-16BE with BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, including the BOM and the lang/country information embedded, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the lang/country info and the BOM and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_070
+Get the contents of a non-empty text object with language info (no country info), in UTF-16BE with BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, including the BOM and the language information embedded, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the language info and the BOM and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_071
+Get the contents of an empty text object with lang/country info, in UTF-16BE with BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only include the BOM and lang/country information embedded, NUL terminated.
+
+3. The returned length must be equal to the length of the lang/country info plus the length of the BOM and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_072
+Get the contents of an empty text object with language info (no country info), in UTF-16BE with BOM, with lang/country information embedded (which IS supported in UTF-16BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the BOM and the language information embedded, NUL terminated.
+
+3. The returned length must be equal to the length of the language info plus the length of the BOM and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_073
+Get the contents of a non-empty text object in UTF-16LE without BOM and with NUL suffix. The contents of the text object include characters that are encoded in UTF-16 using 16-bit and 32-bit.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_074
+Get the contents of a non-empty text object in UTF-16LE with BOM and with NUL suffix. The contents of the text object include characters that are encoded in UTF-16 using 16-bit and 32-bit.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the BOM in UTF-16 and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_075
+Get the contents of an empty text object in UTF-16LE without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must not be NULL.
+
+3. The returned length must be equal to the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_076
+Get the contents of an empty text object in UTF-16LE with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the BOM in UTF-16LE, NUL terminated.
+
+3. The returned length must be equal to the length of the BOM in UTF-16 plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_077
+Get the contents of a non-empty text object with lang/country info, in UTF-16LE without BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_078
+Get the contents of a non-empty text object with language info (no country info), in UTF-16LE without BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_079
+Get the contents of an empty text object with lang/country info, in UTF-16LE without BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_080
+Get the contents of an empty text object with language info (no country info), in UTF-16LE without BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_081
+Get the contents of a non-empty text object with lang/country info, in UTF-16LE with BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_082
+Get the contents of a non-empty text object with language info (no country info), in UTF-16LE with BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_083
+Get the contents of an empty text object with lang/country info, in UTF-16LE with BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_084
+Get the contents of an empty text object with language info (no country info), in UTF-16LE with BOM, with lang/country information embedded (which is NOT supported in UTF-16LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_085
+Get the contents of a non-empty text object in UTF-32BE without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_086
+Get the contents of a non-empty text object in UTF-32BE with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the BOM in UTF-32 and the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_087
+Get the contents of an empty text object in UTF-32BE without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must not be NULL.
+
+3. The returned length must be equal to the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_088
+Get the contents of an empty text object in UTF-32BE with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the BOM in UTF-32BE, NUL terminated.
+
+3. The returned length must be equal to the length of the BOM in UTF-32 plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_089
+Get the contents of a non-empty text object with lang/country info, in UTF-32BE without BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_090
+Get the contents of a non-empty text object with language info (no country info), in UTF-32BE without BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_091
+Get the contents of an empty text object with lang/country info, in UTF-32BE without BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_092
+Get the contents of an empty text object with language info (no country info), in UTF-32BE without BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_093
+Get the contents of a non-empty text object with lang/country info, in UTF-32BE with BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_094
+Get the contents of a non-empty text object with language info (no country info), in UTF-32BE with BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_095
+Get the contents of an empty text object with lang/country info, in UTF-32BE with BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_096
+Get the contents of an empty text object with language info (no country info), in UTF-32BE with BOM, with lang/country information embedded (which is NOT supported in UTF-32BE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_097
+Get the contents of a non-empty text object in UTF-32LE without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_098
+Get the contents of a non-empty text object in UTF-32LE with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must be the expected one, NUL terminated.
+
+3. The returned length must be the expected one, including the length of the BOM in UTF-32 and the length of the NUL suffix.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_099
+Get the contents of an empty text object in UTF-32LE without BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must not be NULL.
+
+3. The returned length must be equal to the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_100
+Get the contents of an empty text object in UTF-32LE with BOM and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should return PDF_OK.
+
+2. The returned string must only contain the BOM in UTF-32LE, NUL terminated.
+
+3. The returned length must be equal to the length of the BOM in UTF-32 plus the length of the last NUL.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_101
+Get the contents of a non-empty text object with lang/country info, in UTF-32LE without BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_102
+Get the contents of a non-empty text object with language info (no country info), in UTF-32LE without BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_103
+Get the contents of an empty text object with lang/country info, in UTF-32LE without BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_104
+Get the contents of an empty text object with language info (no country info), in UTF-32LE without BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_105
+Get the contents of a non-empty text object with lang/country info, in UTF-32LE with BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_106
+Get the contents of a non-empty text object with language info (no country info), in UTF-32LE with BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_107
+Get the contents of an empty text object with lang/country info, in UTF-32LE with BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+@deffn Test pdf_text_get_unicode_108
+Get the contents of an empty text object with language info (no country info), in UTF-32LE with BOM, with lang/country information embedded (which is NOT supported in UTF-32LE) and with NUL suffix.
+@table @strong
+@item Success conditions
+1. The call to @code{ pdf_text_get_unicode} should NOT return PDF_OK.
+@end table
+@end deffn
+
+
+
@node pdf_text_get_hex
@subsubsection pdf_text_get_hex
=== modified file 'doc/gnupdf.texi'
--- doc/gnupdf.texi 2008-10-07 17:28:21 +0000
+++ doc/gnupdf.texi 2008-11-11 19:21:15 +0000
@@ -3857,7 +3857,10 @@
@item PDF_TEXT_UTF16BE_WITH_LANGCODE
Insert language/country code information between the BOM (if required) and the
data. This option is ONLY applicable to UTF16BE. If specified in any other
-encoding, the function will fail.
+encoding, the function will fail.
+@item PDF_TEXT_UNICODE_WITH_NUL_SUFFIX
+Append a NUL suffix to the Unicode string (1-byte NUL for UTF-8, 2-byte NUL for
+UTF-16 and 4-byte NUL for UTF-32).
@end table
@end deftp
=== modified file 'src/base/pdf-text.c'
--- src/base/pdf-text.c 2008-09-09 01:41:01 +0000
+++ src/base/pdf-text.c 2008-11-11 21:10:14 +0000
@@ -709,22 +709,55 @@
{
pdf_char_t header[PDF_TEXT_USHMAXL];
pdf_size_t header_size = 0;
- /* Clear header array */
- memset(&(header[0]), 0, PDF_TEXT_USHMAXL);
- /* Get requested header (BOM and/or lang/country info) */
- pdf_text_get_unicode_string_header(header,
- &header_size,
- new_enc,
- options,
- pdf_text_get_language(text),
- pdf_text_get_country(text));
+ pdf_size_t trailer_size = 0;
+
+ /* Compute header if needed */
+ if((options & PDF_TEXT_UNICODE_WITH_BOM) || \
+ (options & PDF_TEXT_UTF16BE_WITH_LANGCODE))
+ {
+ /* Clear header array */
+ memset(&(header[0]), 0, PDF_TEXT_USHMAXL);
+ /* Get requested header (BOM and/or lang/country info) */
+ pdf_text_get_unicode_string_header(header,
+ &header_size,
+ new_enc,
+ options,
+ pdf_text_get_language(text),
+ pdf_text_get_country(text));
+ }
+ /* Compute trailer if needed */
+ if(options & PDF_TEXT_UNICODE_WITH_NUL_SUFFIX)
+ {
+ switch(new_enc)
+ {
+ case PDF_TEXT_UTF8:
+ trailer_size = 1;
+ break;
+ case PDF_TEXT_UTF16_BE:
+ case PDF_TEXT_UTF16_LE:
+ case PDF_TEXT_UTF16_HE:
+ trailer_size = 2;
+ break;
+ case PDF_TEXT_UTF32_BE:
+ case PDF_TEXT_UTF32_LE:
+ case PDF_TEXT_UTF32_HE:
+ trailer_size = 4;
+ break;
+ default:
+ trailer_size = 0;
+ break;
+ }
+ }
- if(header_size > 0)
+ if((header_size > 0) || \
+ (trailer_size > 0))
{
pdf_char_t *new_out_data = NULL;
/* Allocate memory for new string */
- new_out_data = (pdf_char_t *)pdf_alloc(out_length + header_size);
+ new_out_data = (pdf_char_t *)pdf_alloc(out_length + \
+ header_size + \
+ trailer_size);
if(new_out_data == NULL)
{
return PDF_ENOMEM;
@@ -740,9 +773,16 @@
/* Reset output data array, if any */
pdf_dealloc(out_data);
}
+
+ /* Store trailer (N-byte NUL) */
+ if(trailer_size > 0)
+ {
+ memset(&new_out_data[out_length+header_size],0,trailer_size);
+ }
+
out_data = new_out_data;
- out_length += (header_size);
- }
+ out_length += (header_size + trailer_size);
+ }
else
{
PDF_DEBUG_BASE("Invalid unicode option requested (%u)",
=== modified file 'src/base/pdf-text.h'
--- src/base/pdf-text.h 2008-09-09 01:41:01 +0000
+++ src/base/pdf-text.h 2008-11-11 19:21:15 +0000
@@ -60,9 +60,10 @@
* Each of these enumerations is a Mask of Bits, so that multiple options
* can be set at the same time */
enum pdf_text_unicode_options_e {
- PDF_TEXT_UNICODE_NO_OPTION = 0x00,
- PDF_TEXT_UNICODE_WITH_BOM = 0x01, /* UTF-8, UTF-16(L/B)E, UTF-32(L/B)E */
- PDF_TEXT_UTF16BE_WITH_LANGCODE = 0x02 /* UTF16BE */
+ PDF_TEXT_UNICODE_NO_OPTION = 0x00,
+ PDF_TEXT_UNICODE_WITH_BOM = 0x01, /* UTF-8, UTF-16(L/B)E, UTF-32(L/B)E */
+ PDF_TEXT_UTF16BE_WITH_LANGCODE = 0x02, /* UTF16BE */
+ PDF_TEXT_UNICODE_WITH_NUL_SUFFIX = 0x04 /* UTF-8, UTF-16(L/B)E, UTF-32(L/B)E */
};
=== modified file 'torture/unit/base/text/pdf-text-get-unicode.c'
--- torture/unit/base/text/pdf-text-get-unicode.c 2008-09-09 01:41:01 +0000
+++ torture/unit/base/text/pdf-text-get-unicode.c 2008-11-11 21:10:14 +0000
@@ -32,6 +32,8 @@
#define INTERACTIVE_DEBUG 0
+
+
/*
* Test: pdf_text_get_unicode_001
* Description:
@@ -2833,6 +2835,477 @@
END_TEST
+
+
+/*
+ * Test: pdf_text_get_unicode_055
+ * Description:
+ * Get the contents of a non-empty text object in UTF-8 without BOM and with
+ * NUL suffix. The contents of the text object include characters that are
+ * encoded in UTF-8 using 8-bit, 16-bit, 24-bit and 32-bit.
+ * Success conditions:
+ * 1. The call to pdf_text_get_unicode should return PDF_OK.
+ * 2. The returned string must be the expected one, NUL terminated.
+ * 3. The returned length must be the expected one, including the length of
+ * the last NUL.
+ */
+START_TEST(pdf_text_get_unicode_055)
+{
+ extern const test_string_t utf8_strings[];
+ int i;
+
+ /* Always INIT! Check runs each test in a different process */
+ fail_if(pdf_text_init() != PDF_OK);
+
+ i = 0;
+ while(utf8_strings[i].data != NULL)
+ {
+ pdf_text_t text;
+ pdf_char_t *data = NULL;
+ pdf_size_t size = 0;
+ pdf_char_t *expected_data;
+ pdf_size_t expected_size;
+
+ /* Set expected data and size (without BOM, with last NUL) */
+ expected_size = 1 + utf8_strings[i].size -3;
+ expected_data = pdf_alloc(expected_size);
+ fail_if(expected_data == NULL);
+ memcpy(expected_data, &(utf8_strings[i].data[3]), utf8_strings[i].size - 3);
+ expected_data[expected_size-1] = '\0';
+
+ fail_if(pdf_text_new_from_unicode((pdf_char_t *) \
+ utf8_strings[i].utf32be_data,
+ (pdf_size_t) \
+ utf8_strings[i].utf32be_size,
+ PDF_TEXT_UTF32_BE,
+ &text) != PDF_OK);
+
+
+ /* 1. The call to pdf_text_get_unicode should return PDF_OK. */
+ fail_unless(pdf_text_get_unicode(&data,
+ &size,
+ text,
+ PDF_TEXT_UTF8,
+ PDF_TEXT_UNICODE_WITH_NUL_SUFFIX) == PDF_OK);
+
+
+ if(INTERACTIVE_DEBUG)
+ {
+ pdf_char_t *internal_hex = NULL;
+ pdf_char_t *expected_hex = NULL;
+ internal_hex = pdf_text_test_get_hex(text->data,text->size,':');
+ expected_hex = pdf_text_test_get_hex(expected_data,expected_size,':');
+ fail_if(expected_hex == NULL);
+ fail_if(internal_hex == NULL);
+ printf("pdf_text_get_unicode_055:%d:Internal> '%s'\n",
+ i, internal_hex);
+ printf("pdf_text_get_unicode_055:%d:Expected> '%s'\n",
+ i, expected_hex);
+ pdf_dealloc(internal_hex);
+ pdf_dealloc(expected_hex);
+ }
+
+ /* 2. The returned string must be the expected one, NUL terminated */
+ fail_if(data == NULL);
+ fail_unless(memcmp(expected_data, data, size) == 0);
+ fail_unless(data[size-1] == '\0');
+
+ /* 3. The returned length must be the expected one. */
+ fail_unless(size == expected_size);
+
+ pdf_text_destroy(text);
+ pdf_dealloc(data);
+ pdf_dealloc(expected_data);
+
+ ++i;
+ }
+
+}
+END_TEST
+
+
+
+
+
+
+
+/*
+ * Test: pdf_text_get_unicode_056
+ * Description:
+ * Get the contents of a non-empty text object in UTF-8 with BOM and with NUL
+ * suffix. The contents of the text object include characters that are encoded
+ * in UTF-8 using 8-bit, 16-bit, 24-bit and 32-bit.
+ * Success conditions:
+ * 1. The call to pdf_text_get_unicode should return PDF_OK.
+ * 2. The returned string must be the expected one, NUL terminated.
+ * 3. The returned length must be the expected one, including the length of
+ * the BOM in UTF-8 and the length of the NUL suffix.
+ */
+START_TEST(pdf_text_get_unicode_056)
+{
+ extern const test_string_t utf8_strings[];
+ int i;
+
+ /* Always INIT! Check runs each test in a different process */
+ fail_if(pdf_text_init() != PDF_OK);
+
+ i = 0;
+ while(utf8_strings[i].data != NULL)
+ {
+ pdf_text_t text;
+ pdf_char_t *data = NULL;
+ pdf_size_t size = 0;
+ pdf_char_t *expected_data;
+ pdf_size_t expected_size;
+
+ /* Set expected data and size (with BOM, with last NUL) */
+ expected_size = 1 + utf8_strings[i].size;
+ expected_data = pdf_alloc(expected_size);
+ fail_if(expected_data == NULL);
+ memcpy(expected_data, &(utf8_strings[i].data[0]), utf8_strings[i].size);
+ expected_data[expected_size-1] = '\0';
+
+
+ fail_if(pdf_text_new_from_unicode((pdf_char_t *) \
+ utf8_strings[i].utf32be_data,
+ (pdf_size_t) \
+ utf8_strings[i].utf32be_size,
+ PDF_TEXT_UTF32_BE,
+ &text) != PDF_OK);
+
+
+ /* 1. The call to pdf_text_get_unicode should return PDF_OK. */
+ fail_unless(pdf_text_get_unicode(&data,
+ &size,
+ text,
+ PDF_TEXT_UTF8,
+ (PDF_TEXT_UNICODE_WITH_BOM | \
+ PDF_TEXT_UNICODE_WITH_NUL_SUFFIX)) == PDF_OK);
+
+
+ if(INTERACTIVE_DEBUG)
+ {
+ pdf_char_t *internal_hex = NULL;
+ pdf_char_t *expected_hex = NULL;
+ pdf_char_t *output_hex = NULL;
+ internal_hex = pdf_text_test_get_hex(text->data,text->size,':');
+ expected_hex = pdf_text_test_get_hex(expected_data,expected_size,':');
+ output_hex = pdf_text_test_get_hex(data,size,':');
+ fail_if(expected_hex == NULL);
+ fail_if(internal_hex == NULL);
+ fail_if(output_hex == NULL);
+ printf("pdf_text_get_unicode_056:%d:Internal> '%s'\n",
+ i, internal_hex);
+ printf("pdf_text_get_unicode_056:%d:Expected> '%s'\n",
+ i, expected_hex);
+ printf("pdf_text_get_unicode_056:%d:Output> '%s'\n",
+ i, output_hex);
+ pdf_dealloc(internal_hex);
+ pdf_dealloc(expected_hex);
+ pdf_dealloc(output_hex);
+ }
+
+ /* 2. The returned string must be the expected one, NUL terminated */
+ fail_if(data == NULL);
+ fail_unless(memcmp(expected_data, data, size) == 0);
+
+ /* 3. The returned length must be the expected one, including the length
+ * of the BOM in UTF-8 and the length of the NUL suffix. */
+ fail_unless(size == expected_size);
+
+ pdf_text_destroy(text);
+ pdf_dealloc(data);
+ pdf_dealloc(expected_data);
+
+ ++i;
+ }
+
+}
+END_TEST
+
+
+/*
+ * Test: pdf_text_get_unicode_057
+ * Description:
+ * Get the contents of an empty text object in UTF-8 without BOM and with NUL
+ * suffix.
+ * Success conditions:
+ * 1. The call to pdf_text_get_unicode should return PDF_OK.
+ * 2. The returned string must not be NULL.
+ * 3. The returned length must be equal to the length of the last NUL.
+ */
+START_TEST(pdf_text_get_unicode_057)
+{
+ /* Always INIT! Check runs each test in a different process */
+ fail_if(pdf_text_init() != PDF_OK);
+
+ pdf_text_t text;
+ pdf_char_t *data = NULL;
+ pdf_size_t size = 0;
+
+ fail_if(pdf_text_new (&text) != PDF_OK);
+
+ /* 1. The call to pdf_text_get_unicode should return PDF_OK. */
+ fail_unless(pdf_text_get_unicode(&data,
+ &size,
+ text,
+ PDF_TEXT_UTF8,
+ PDF_TEXT_UNICODE_WITH_NUL_SUFFIX) == PDF_OK);
+
+ /* 2. The returned string must not be NULL */
+ fail_unless(data != NULL);
+ fail_unless(memcmp(data, "\x00", 1) == 0);
+ /* 3. The returned length must be equal to the length of the last NUL. */
+ fail_unless(size == 1);
+ pdf_text_destroy(text);
+ pdf_dealloc(data);
+}
+END_TEST
+
+
+/*
+ * Test: pdf_text_get_unicode_058
+ * Description:
+ * Get the contents of an empty text object in UTF-8 with BOM and with NUL
+ * suffix.
+ * Success conditions:
+ * 1. The call to pdf_text_get_unicode should return PDF_OK.
+ * 2. The returned string must only contain the BOM in UTF-8, NUL
+ * terminated.
+ * 3. The returned length must be equal to the length of the BOM in UTF-8
+ * plus the length of the last NUL.
+ */
+START_TEST(pdf_text_get_unicode_058)
+{
+ /* Always INIT! Check runs each test in a different process */
+ fail_if(pdf_text_init() != PDF_OK);
+
+ pdf_text_t text;
+ pdf_char_t *data = NULL;
+ pdf_size_t size = 0;
+
+ fail_if(pdf_text_new (&text) != PDF_OK);
+
+ /* 1. The call to pdf_text_get_unicode should return PDF_OK. */
+ fail_unless(pdf_text_get_unicode(&data,
+ &size,
+ text,
+ PDF_TEXT_UTF8,
+ (PDF_TEXT_UNICODE_WITH_BOM | \
+ PDF_TEXT_UNICODE_WITH_NUL_SUFFIX)) == PDF_OK);
+
+ /* 2. The returned string must only contain the BOM in UTF-8, NUL
+ * terminated. */
+ fail_unless(data != NULL);
+ fail_unless(memcmp(&data[0], "\xEF\xBB\xBF\x00", 4) == 0);
+
+ /* 3. The returned length must be equal to the length of the BOM in UTF-8 plus
+ * the length of the last NUL. */
+ fail_unless(size == 4);
+
+ pdf_text_destroy(text);
+ pdf_dealloc(data);
+}
+END_TEST
+
+
+/*
+ * Test: pdf_text_get_unicode_059
+ * Description:
+ * Get the contents of a non-empty text object with lang/country info, in
+ * UTF-8 without BOM, with lang/country information embedded (which should
+ * not be supported in UTF-8) and with NUL suffix.
+ * Success conditions:
+ * 1. The call to pdf_text_get_unicode should NOT return PDF_OK.
+ */
+START_TEST(pdf_text_get_unicode_059)
+{
+ /* Always INIT! Check runs each test in a different process */
+ fail_if(pdf_text_init() != PDF_OK);
+
+ pdf_text_t text;
+ pdf_char_t *data = NULL;
+ pdf_size_t size = 0;
+ pdf_char_t *utf8data = (pdf_char_t *)"GNU's not Unix";
+ pdf_size_t utf8size = strlen((char *)utf8data);
+ const pdf_char_t *language = (pdf_char_t *)"en";
+ const pdf_char_t *country = (pdf_char_t *)"GB";
+
+ fail_if(pdf_text_new_from_unicode(utf8data, utf8size,
+ PDF_TEXT_UTF8,
+ &text) != PDF_OK);
+ fail_if(pdf_text_set_language(text, language) != PDF_OK);
+ fail_if(pdf_text_set_country(text, country) != PDF_OK);
+
+
+
+ /* 1. The call to pdf_text_get_unicode should NOT return PDF_OK. */
+ fail_unless(pdf_text_get_unicode(&data,
+ &size,
+ text,
+ PDF_TEXT_UTF8,
+ (PDF_TEXT_UTF16BE_WITH_LANGCODE | \
+ PDF_TEXT_UNICODE_WITH_NUL_SUFFIX)) != PDF_OK);
+ pdf_text_destroy(text);
+}
+END_TEST
+
+
+/*
+ * Test: pdf_text_get_unicode_060
+ * Description:
+ * Get the contents of a non-empty text object with lang/country info, in
+ * UTF-8 with BOM, with lang/country information embedded (which should
+ * not be supported in UTF-8) and with NUL suffix.
+ * Success conditions:
+ * 1. The call to pdf_text_get_unicode should NOT return PDF_OK.
+ */
+START_TEST(pdf_text_get_unicode_060)
+{
+ /* Always INIT! Check runs each test in a different process */
+ fail_if(pdf_text_init() != PDF_OK);
+
+ pdf_text_t text;
+ pdf_char_t *data = NULL;
+ pdf_size_t size = 0;
+ pdf_char_t *utf8data = (pdf_char_t *)"GNU's not Unix";
+ pdf_size_t utf8size = strlen((char *)utf8data);
+ const pdf_char_t *language = (pdf_char_t *)"en";
+ const pdf_char_t *country = (pdf_char_t *)"GB";
+
+ fail_if(pdf_text_new_from_unicode(utf8data, utf8size,
+ PDF_TEXT_UTF8,
+ &text) != PDF_OK);
+ fail_if(pdf_text_set_language(text, language) != PDF_OK);
+ fail_if(pdf_text_set_country(text, country) != PDF_OK);
+
+ /* 1. The call to pdf_text_get_unicode should NOT return PDF_OK. */
+ fail_unless(pdf_text_get_unicode(&data,
+ &size,
+ text,
+ PDF_TEXT_UTF8,
+ (PDF_TEXT_UNICODE_WITH_BOM | \
+ PDF_TEXT_UTF16BE_WITH_LANGCODE | \
+ PDF_TEXT_UNICODE_WITH_NUL_SUFFIX)) != PDF_OK);
+ pdf_text_destroy(text);
+}
+END_TEST
+
+
+/*
+ * Test: pdf_text_get_unicode_061
+ * Description:
+ * Get the contents of a non-empty text object in UTF-16BE without BOM and
+ * with NUL suffix. The contents of the text object include characters that
+ * are encoded in UTF-16 using 16-bit and 32-bit.
+ * Success conditions:
+ * 1. The call to pdf_text_get_unicode should return PDF_OK.
+ * 2. The returned string must be the expected one, NUL terminated.
+ * 3. The returned length must be the expected one, including the length of
+ * the last NUL.
+ */
+START_TEST(pdf_text_get_unicode_061)
+{
+ extern const test_string_t utf16be_strings[];
+ int i;
+
+ /* Always INIT! Check runs each test in a different process */
+ fail_if(pdf_text_init() != PDF_OK);
+
+ i = 0;
+ while(utf16be_strings[i].data != NULL)
+ {
+ pdf_text_t text;
+ pdf_char_t *data = NULL;
+ pdf_size_t size = 0;
+ pdf_char_t *expected_data;
+ pdf_size_t expected_size;
+
+ /* Set expected data and size (without BOM, with last NUL) */
+ expected_size = utf16be_strings[i].size -2 + 2;
+ expected_data = pdf_alloc(expected_size);
+ fail_if(expected_data == NULL);
+ memcpy(expected_data, &(utf16be_strings[i].data[2]), utf16be_strings[i].size -2);
+ memset(&expected_data[expected_size-2], 0, 2);
+
+
+ fail_if(pdf_text_new_from_unicode((pdf_char_t *) \
+ utf16be_strings[i].utf32be_data,
+ (pdf_size_t) \
+ utf16be_strings[i].utf32be_size,
+ PDF_TEXT_UTF32_BE,
+ &text) != PDF_OK);
+
+
+ /* 1. The call to pdf_text_get_unicode should return PDF_OK. */
+ fail_unless(pdf_text_get_unicode(&data,
+ &size,
+ text,
+ PDF_TEXT_UTF16_BE,
+ PDF_TEXT_UNICODE_WITH_NUL_SUFFIX) == PDF_OK);
+
+
+ if(INTERACTIVE_DEBUG)
+ {
+ pdf_char_t *internal_hex = NULL;
+ pdf_char_t *expected_hex = NULL;
+ internal_hex = pdf_text_test_get_hex(text->data,text->size,':');
+ expected_hex = pdf_text_test_get_hex(expected_data,expected_size,':');
+ fail_if(expected_hex == NULL);
+ fail_if(internal_hex == NULL);
+ printf("pdf_text_get_unicode_061:%d:Internal> '%s'\n",
+ i, internal_hex);
+ printf("pdf_text_get_unicode_061:%d:Expected> '%s'\n",
+ i, expected_hex);
+ pdf_dealloc(internal_hex);
+ pdf_dealloc(expected_hex);
+ }
+
+ /* 2. The returned string must be the expected one, NUL terminated */
+ fail_if(data == NULL);
+ fail_unless(memcmp(expected_data, data, size) == 0);
+
+ /* 3. The returned length must be the expected one. */
+ fail_unless(size == expected_size);
+
+ pdf_text_destroy(text);
+ pdf_dealloc(data);
+ pdf_dealloc(expected_data);
+
+ ++i;
+ }
+
+}
+END_TEST
+
+
+/*
+
+CONTINUE HERE WITH TESTS FROM #62 to #108.
+
+Check:
+http://gnupdf.org/Lib:Test_Specification_Document#pdf_text_get_unicode
+
+And the corresponding Flyspray task.
+
+Tests from #55 to #108 are EQUIVALENT to tests from #1 to #54, with the only
+difference that the new PDF_TEXT_UNICODE_WITH_NUL_SUFFIX option is used.
+
+Check the pass/fail criteria from the TSD for each test.
+
+The next test to implement is #62, which is equivalent to #8.
+
+Also, take into account that the last NUL has a different size depending on
+the encoding:
+ - 1 NUL byte for UTF-8
+ - 2 NUL bytes for all the UTF-16 encodings
+ - 4 NUL bytes for all the UTF-32 encodings
+
+*/
+
+
+
+
/*
* Test case creation function
*/
@@ -2895,6 +3368,15 @@
tcase_add_test(tc, pdf_text_get_unicode_052);
tcase_add_test(tc, pdf_text_get_unicode_053);
tcase_add_test(tc, pdf_text_get_unicode_054);
+
+ tcase_add_test(tc, pdf_text_get_unicode_055);
+ tcase_add_test(tc, pdf_text_get_unicode_056);
+ tcase_add_test(tc, pdf_text_get_unicode_057);
+ tcase_add_test(tc, pdf_text_get_unicode_058);
+ tcase_add_test(tc, pdf_text_get_unicode_059);
+ tcase_add_test(tc, pdf_text_get_unicode_060);
+ tcase_add_test(tc, pdf_text_get_unicode_061);
+
return tc;
# Begin bundle
IyBCYXphYXIgcmV2aXNpb24gYnVuZGxlIHY0CiMKQlpoOTFBWSZTWSVtx00AHCR/gH9zAAD/////
f+//zv////5gH762eh7sdFmaB86aShrcwXYa40xu3I1La1ihVKD7m9vLCLYHo64J2oySKnsN7GqB
FSlBUqpSaWGUqgIKUgSUTU0gY0gZDQ0ehpAaeoAGgNAABpoAACUCAgE0REbSHqZD1A0ADQAAAAAA
A5p+VPTKGowABMAAAAAmAAAAAACTUk1MhBPSnkmTanqD2pGCMmnkmnqGgeoNAAfqgGEARJEJoE0N
JgBCnjQEymjApkNM00epo1NPRHpMRsgFSRAgTTRpMEaE9AEp+mjJTMmU2kGTJo9QB6nqNGjRtgPY
JE1QClCqBSRECRACI++FJnPatdjgziFoVJFOwSmIefK39MalmMKz1yk+gSgyrIQqIiEfV1UdMtO4
XXyr8BVFVKKcNvG2m/Lub35jytT+N5VTwatUBNTraB62zTcsFEQoLU2TaMaYFDxyzLCmzG5/d/hm
meSfuMKygmcZw6rBeIGDJWRU5RS1PJJQWCiBnBCCsGJD+n578XTsiaKpzDk06zELtLgFgohBpoAh
CBjSGhecTS/x1QuvF2n9e7xuidHd3dXVs8XA8UfEsXjA5hRiLqgL4hpOqqYNiGikR3CjkqPhFGCj
Y/5w3beDtl64e3bs0s7kaMP3+KoMtTgCBx4kERQA4himERAoFIY8oAY48SCImJatoiOhEyP3UQTm
RFQRf8lyzC+bHRmPpq+CWk8O/Bat1s5HZuONae0Ioi83LKRKgorsFIhGCCBzqFIYQUGp76D3xsgg
YEWIJCH0WAoYhEQswiu7vLbt1AXPcai0woFkkWIwC5IhSQf1hugnmLB/TFmzotcB4nFsWO1f+H7j
x9x/O/FoAKW3t8oIbHUnEoSSCJ3iaTm4zv0PUMLeSYG9cMYOa5qnd6NhxFbS0jBDwH5BOEDqWEJE
ixYqgvSnBC52aAOiB1cdzAo2znyw6fpSWp2+Ct2EKh1WoEi8iyFWBYFWiszHu1riWshSnQgdWr4O
wy9u3iP1KmKVx0zxpOcYFSCTWWIYTbJavK4Qw5j7TxHmOokQwcPZZLv3rFi05eRKLjdbJYegUYKP
H63M6F6sAIeEIsWEahCqRFWCCDBBiifDJhM1qkvs1W0L3966QPOGsVsA9dyTV28vdtb7zEsRAdcG
5oUZeIBOwQgxUPGCBi+qen108edOgYTjDIUYv12zFGy+pBQ22LxdBEUqFG+hEkRziSJUciTDLacn
+F6R85o5t9L+JAOvYUC+3Uh6eq3Fxugs9/VMrlwq9BwAsCBEkSMGS+eM5IOuLeCr7XkBBZUN7kiS
Zlp6IYSQZhvfv55aUkTThJXU/y0vJybBC4283PdDLfLCNs8cvM5yUgJIDYVaH1L3fNOI5w+Spyty
3jFM0cg1F5wVCFypRYXuuUouLxdcpRcXuuUouLpjaYGFFWMDjgG5hikaBeSSSRRCEIZNmtp0INaI
GmwC6LNAUMltVXQYJChJZDyoUhSlIaNak7WLDK97VywNTg1ybmSyQbJhEXYUiNIXATRgp4/TLWz4
No35XA1hCI2gvH330VVVVVVVVVVVVRVPmFHAMl7MidNK8LMb4H2klFK2khuR0Urtk4sH5kGGNiwL
YSzoXczLuwMBQnAdSC8MH0gbQHAMRR09WS5IvKbV2SBXHig5oQELrwgHOhV+q1V0zYcQ2IEIEY9/
hNWaK46AE3ZBVs7BWcte4LlLBpIWIRkMChhRUdnZmd2FFB1d3d3dmqZ83YeqBPMKZCPu5/VouqO6
tx4PZggTo4zHzcttnWnYGPXsomiSijwV4Zv69HrSqVQUtETCkzIyIUFrF7iKRQBE1zrHmSZQJv5n
u+b7Ore4seYzGU7nLcz9/geUccdCtHe0Wdd1B3dq7l6JhmzV0OnJoOpAn1ljuMtGJRiHKHgKVKIh
9SxLIHXFD1r7VpwjEP7GVi3vD9jR9Tc8kXZpXxrRZ7l/kepeDQu2jE3mBg3W5Z5jlLbjdK3Gsyec
NfLe7d+BwcG7cfZ8+eeDg2zTxUAYfqxxks3Xlclo8y4iFk4l/Zwb4hDgcddtsIQ0qh0fCz+D8nD0
+g112Ovw9feN86UjOgJuG72nkudl06sUe7nRNR20ZImBtITk0VCh7OQAqdrI6Fze3UsfezMBtNhD
upD7Y47xz2OLh3PJjrGBLAVJCX0YmkLmijm5+nm6c8s4e0OY54Q+gCgT3xUpRixILH8AUD9EVA/U
zz/I/OusOANgcgUQs0eG7dpu0+qeXp7t+Phj3dtsdmsMtlBajVsCvlCyQIBjYwOLE2p35rmkREVV
VVVFRVM5IWcCity4uHyfDobm7vRDzaV4sF3cwatQTyQ5DjLv9RJwTZNGhwYMJSx2OezGfa+FpUSe
Y03ew2oTbvdE4z1SHpmUDemLH1SlK4li6L7P14dsDNjI2+UHKnQ8bXOE+hM6an8z3dPm9YJPP49q
AgknAe7o14wvmZiXEspgKJutUFCihV6Wk23RMVKKIythPC14XM2hBnDlSHUkNOjy+G+2VRTdv704
MZ/AeSFxrhIwofYonpP2xOc9TCMCMX91EjKfywTlDp7vx8bVUnn4cOOIAcd/dLH2HHDLRfRJcbHk
7Id8nk6vFWHEr1xygF4tRDthWQftbWZG1HwOXGXHymYwmQCkhHTepxmOJYhpmup0RlKhmShwWECZ
FyZmTqzCOEJI/lASFV1xr3ViGIECrUHgL3XEmR7NFTAl6QJJBUNk2yxyKyU/bTfrPp8lkbOcn75v
bWxtLOl8dT29KaOMh8yNppIbokwy/vlrXxFtnJhya8TLCUsn58cMxzwkyyN/5bN/3kIkT0ijnnzZ
u2rthKkkjIR2HDZF0bdFQ2I5p0mKLSL1Dy9BioWdGIzcYYF6L6SruWRxE4C5+DApdeQJmMeuhBh2
GENNNRCcI+bgouRrzyRAwCOrVrRxQ08JWniPjxFIERIHuXU6lhQrcCQd9lCehyJbLUTHE4mhXTiI
VSpoKcSpkdAhgOSNOPPjVUbOnKvNE1znwrI5IT5xkel5dlNiaBM6ROpgUxKiVVBtzAogI4rr4dut
0CcEhub5vWbt2c6d7hKLbl2Bg8bM3KmWEOs4OZHxXcoX15VygTcFJkQvWF8O+9EEg0EEOnXZOqUn
W1up0O1nGuRrw5HHp4YYuN9zZlqdOeZo9FoHRVJUCqaFotDl5eI1vAZ8mvXgcIGlDFc2XqGCI7k4
phM2TgbGHFTWmwtz1fJD6mRJdyDRTDjIdO9JjMYjlzDbnqbkhjIxLGqdoN08YyTHSesb4ywVmagc
MXNnTMH0EHWBcDiTHEKC37xkrHZvpRJ6GaozaA5lAZo6XGcsuuKdzJtY49PX683DkKdaNMFmpuU5
29qnGcpthR9YZuG9pmiqbzd61bVlYUlET4AwyIFjYligXOGGRQwGRMh7g5v03KHLXrhQkTY6wpqK
QKUumBAxCc4oOUNzJDckZqUGY3Ugnw/vgXNy5Y1DQUYuYnW6UoiBry5sqvZrbYzykLLdIrVZgpFZ
51Wml2i3uTZgdjqWm51N7Fhr51GZxSLmic3N2nIKT6+cjA3m5OS5EGKPO2Y/EhrJwkaETi4NTFhJ
0sHQ/hOltU/NId05S85tmWV76tKmulKNSN4oahFzUYA6EQHGwyxHvcYMUVLgbijURhsSDdHks2kL
MqStEhnst3yJ4KlXmtrbGbe1xujIlF2Q5JdqZPFLbt7GhONbvjWUZvQVRApSyK4L4ADCSJh3rOua
oBY2GBAv2TMYdZ9YYxIaG0KknkZ1UsQ5wLHAxlsbjdfIn34h3ksr8p4kc4iYmY5oYkhIKGYqaIhh
2Zhz3rvxKtlxwpDpDuQDCbMKkhQmv3IgZjclSiKmBozmw1VsY1dalh7LM0JAxQcX4wHiIX6twI5T
SnRSnJZOyElKdfT2saywlSgSvDrXi4ktu3wzukEgN+kTlRw2muOEQ9TR3lkLMQ6GGVarI8Lk0Pa/
E0ARkCK8QjhbOWIoHdS4UHsEhgHRQzkHNx8gNRTpEA2CH12OdkqCE+XwUFkXZKhEokQogkYMaR35
vWSiSQhNgUUfhKNrtYqYVLqKVKKjOJPK532Q+13rXZrmppPnm3XMWTN703AxSJxQRZv+pLyRNMUR
oUf7P3dQX+pf1V0wkNhopYe83hushBII97sa1YIKrBCIHvA1RQWrKAHgYQqIsIHbnI6Q2EDR4puG
MlhD8vIGiJA8skAPskuBh+3a3HZ1POwFVQQQQRVVQQQQuMCgGRAphS3Gl/gwm3vxcNPS4xBepyHQ
EB/Q3UBNYP+4OjHlyVALYO84QhbZIskkIIySIGfCW32ROPIQuvM3KAsv3imm7tB4HwIqXHjigRiJ
PwSFMEANAUSbYcPCsGCsGKFSVVnZlITu7mshOvgW2dMmhoh4pMhR04WcuwrAX7ICCYg/cKRFTq7o
IDje/MqPJkqptYn2/SU/bG8LRsxuELE4eDcnPZejaBOPAXsWcAObIKvYukUd2a2FpwFwgiaFATAU
v2cTllvXrESkV37N9kUKiKGbkDyP32VVNqhjiogUqQNgSkSgeLhET0fkDzsYIQyV+USlPJbdFa8O
00bDtiDoTRAiJyYyoiatNznA1rngdiKnh84P4vUvkXARIImSJqX6UQxD40LMVqwsX9KwVE+a5Xyt
VGmMKkWpKYVKJNyWciEKgVZVi0jBcSXOn2rAcwhxBnGLFYJlGoFgK0gGFo/HJCQhGQkLoBQllmti
YrZcRUGCfhFRINxHjsJSipOuoFkh+x+pyEdz9ja/S83IrBwb+5ZguxfraNxNrFo//0Ykib8x+jf0
Q0fc9smROk/CSTsM4nI66kJYm6GZWSHah/8D9z4/nfLFRQqLH+z7/J4HkINyOeVKr2zcjeyJOVQJ
zogeJADmGKs6/0mFjT1UInMNxOoUluBexIhsw1AhhlBIHaA+eIjuVbApkW1CRNKiVlRz3yYRlUtx
hGRFU6ycY+M4oD5Rqjh9Qtboex7TyPIkGLR3prk8JI5nyInYkHtvQSFIACA/AgCSEpWTVSO5h5sL
TEhyxJYh53nkswTz2YzyOh0qeB4GKeJzMGbucxBvk5JLO7yP2m574knb3Oi/PJviSd7F69kQ0G7U
8CYd7ufc90Dmhs10M1rakHljCJJLQPcKNCjQdZkKIbv/TYlgCArqaM3xe1baxUTYhqd55mr/AVSC
RShWUwVKCWtYRZQikhYskKiBQ8Sok0L9L1MHis8Tmd7F62B5mqRzTytZOC5/dc+cEgK7l2ms53pP
wHP0rGzHnY03J/VbNQbtym3xNOWMKC/r8x5TecaG4EP9QgK8fphSFSojkdq0RG1JXEhmpYUjU6uZ
InekStvWUUCqQUnLrk1EifGJ1KiENWg02VvSMy0STabDzsx5cdFeMUe8+BD4yINhRsi39DRPUvet
gQ0hUkF8RSFhGQytjbQT8N+MhHyQxgqEFkNAUh32sAkNOeFlM3S0Ua2mKmkUIJrZRZMALmjhPtUL
aO8TsdQvmjnSSxoz13fY8QqTO4iB45I7YxoUOAKbiyk5FOlG07+A6VLsc0I5pN08AxiWmFSYh0OD
sHUTqM0mrmeZueTOH4oNKIXkhNIhyylFPDUWBUo4pCywiyQsLSJzVSy0STgSOd4cmtyORoijwCgb
w0qjtHhBTkAwK+AkkkJIqioiCqqqqrFVVVEbTe9Bi8zdlDvU6HoJErKQTEVqslT6U3JBcOE5dSF6
tnSVUg7clohdKopz9cLYdVRJNozsvetz3H5ZNvaQbSOU1Q+wUYJAIpR37Fs3+h+sVenaxhAIRYRs
CHySkjxcUO1Eax4JEfknzST4Sj8NP9G6YEmDGn+b+hs8WYoGqCiBZhyHEkT3gdivSoDGS6QwiWk4
PUyetgnsj2Pa0eNZgxE9zGCDvyan1cvgTpktC0oVSpRUpUw1tj0I2KhwkPkQMiRLfnHsGsZJIjpS
SWiUSPWWntjmR9K8+Sc+dlAK7FSW7S5dL0qAo4fCXuoo+MklFFlRKNPe2pIjsWR61+M6fdCSSfDz
qGlU7yCAGagcQERWIMARIDDRVUUekSL6kT9hQcMBTdkpwREPMCBEViEREJGAoRi84ezqrKnJ/xsh
+jNEUI9iOSENkDx/kQU2NH91gl/K/I4Ot9ghvPo8n4UQt0IinCRsNT2xclVIfL6ZDt79FRE2DZKC
xPBsNiILw9MklkH1xPBaETe6YbqoUKQZIEIicC3fhdz1DZwWV6TMsZm4iUoUiqFQGUl5E+tI+EOc
VBXD1yRMrQhN8STg9HFmoDaQn5a7vWOgNYoHkfRD2O78Sh8T1nYhEhrlPjrtTb3RPoSDY+jmgypI
lQWPR4Uc24nLMkMpCeGA8qhNspIdcIaEiYyJ6/sNED6XjaalJJFMMyeIi1qEjW5UOY13+Vg6EG2R
jJqJ+nuapIlD32TfBKJHzAobpjSPTHsnrr1cM1fB7HE2yD1NZQ8abDyCYmlFcSPc84KHkbd4o4uk
EWSJaIGa8okaiSi8Kn/UmRYQ3j6KUchDSJkq4ugoQKR4WWBioxIKMOeIoXRgoGV4KD0Q0Apo9vct
J1/heOzx47ZETUkSiQ2wFRRiprUBpV3HAgfviAJRWC2ogJiJtDQJ55eEmElDSpM5qGFC8KlCgfU3
VLDLgKUKJ2oFh6Rx83N7ChKWCwkop3hSxRFRDbO0+7ZN6QedDUdcn3W8qRJL4hH7qCfBmdU+Vxci
Jso9BvFHiQCIvddFdmsBNxmKliZQGKRHXhDA7J4+iT010pPqh+JeCcuzfEMpOWSdn0QhjCRUDvbu
/f/L774MOM419p4UgrzqHBY5sl8B6z1g2t+OR8q2enaJR7WBHcNEhTI0saIwkLNNBc/hSz/rycCW
i5xKaiUxYsWLQySGSSDYQ0NKaTAkk0m0T8/zqg9R5+oD6fVkan+sX5efNEOr4dqIX41yXhOcIbQ8
ywNiK9Qw9OckeaQOdnWDohDJNQhUFqiqaCB6sSJdEXiayd7zdHPkmdIz45mfjZQySOOFBDi+pYdC
ku907UbZKiko1TXvQN8nEWZQT+cykk6E4EFoiiQzky4ikiari0nDUzhRCilHJJgupESBo3ijsI71
615wHKyK56cVwsYP7yCgeBC+kTCs/1rpBL5xroI2WxmEK1ol8D7kgxkubialSbMXru8U5y6KR8/P
BDITcKU3PcEpJCJ4id6RswLxPmDAjQfrfppPdAxiDlt4QeB+LsMIob99qAUy/VLFiHMiS0KQ6iyR
KlA6ImUCa0GH5RRupSe8iqmHozyQCZoB6+l5VIKdkMRaS6FFIjZM5unnhMJIhjS6BQIpaCLdccVr
C7out8EVywq1XHIELXcVwXJcWAFgIjcvZCwknFZaXsLGYasMmjECywUEpssMxstmwUt4CRgtkQTA
VDaKO4UckKxJR1LS6uNNJpM3a5+EEKASwLpoFdCo0KBylaRH54zNwEMUNRgh/KcSyLtoegCCgdi2
Q+0PxtB8Me59KegJD15rqAlBSTsYOWTqiHM7eilJCKSSKpog6KIH0UqiI+NBjA7u0udpI/kU7bVQ
q2SRLlSeZGKIclCd0T42unj4Jefedbqv16iDSoklCRRSIb8XgXJufX3sqCFOWJJTEdMknzocIV9f
SIFtggHnUK5kENpy8xT1m+CyIOd1KWhB3URwk1QkcIbohnBDkkakeb2EhnA40qKKYdan4vQllyiw
fOfMXLh9mhOl6+XhGyp+Z5XyifWN0LYDSOeTZtBVFHRUS1JIcaEi1IV9/RN7SI3lRHbLAfqeg/sP
6UXn3mEhmfOYqn2MBUG4nDy9pSEIRATtLRFe1twLBsQ/2DS/XkJouu0ehv5eIUHo4C/yPtT/B+Zc
BR/wda3wMWBWK4ij/kXckU4UJAlbcdNA