[GNUnet-SVN] r23156 - in Extractor: . doc src/main src/plugins src/plugi


From: gnunet
Subject: [GNUnet-SVN] r23156 - in Extractor: . doc src/main src/plugins src/plugins/testdata test
Date: Tue, 7 Aug 2012 21:56:09 +0200

Author: grothoff
Date: 2012-08-07 21:56:09 +0200 (Tue, 07 Aug 2012)
New Revision: 23156

Added:
   Extractor/src/plugins/test_exiv2.c
   Extractor/src/plugins/test_jpeg.c
   Extractor/src/plugins/test_wav.c
   Extractor/src/plugins/testdata/jpeg_image.jpg
   Extractor/src/plugins/testdata/wav_alert.wav
   Extractor/src/plugins/testdata/wav_noise.wav
Removed:
   Extractor/src/plugins/translitextractor.c
   Extractor/test/test.jpg
Modified:
   Extractor/INSTALL
   Extractor/configure.ac
   Extractor/doc/extract.1
   Extractor/doc/extractor.texi
   Extractor/doc/libextractor.3
   Extractor/doc/version.texi
   Extractor/src/main/extractor.c
   Extractor/src/plugins/Makefile.am
   Extractor/src/plugins/exiv2_extractor.cc
   Extractor/src/plugins/jpeg_extractor.c
   Extractor/src/plugins/mime_extractor.c
   Extractor/src/plugins/template_extractor.c
   Extractor/src/plugins/wav_extractor.c
   Extractor/src/plugins/xm_extractor.c
Log:
train hacking

Modified: Extractor/INSTALL
===================================================================
--- Extractor/INSTALL   2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/INSTALL   2012-08-07 19:56:09 UTC (rev 23156)
@@ -1,8 +1,8 @@
 Installation Instructions
 *************************
 
-Copyright (C) 1994, 1995, 1996, 1999, 2000, 2001, 2002, 2004, 2005,
-2006, 2007, 2008, 2009 Free Software Foundation, Inc.
+Copyright (C) 1994-1996, 1999-2002, 2004-2011 Free Software Foundation,
+Inc.
 
    Copying and distribution of this file, with or without modification,
 are permitted in any medium without royalty provided the copyright
@@ -226,6 +226,11 @@
 
 and if that doesn't work, install pre-built binaries of GCC for HP-UX.
 
+   HP-UX `make' updates targets which have the same time stamps as
+their prerequisites, which makes it generally unusable when shipped
+generated files such as `configure' are involved.  Use GNU `make'
+instead.
+
    On OSF/1 a.k.a. Tru64, some versions of the default C compiler cannot
 parse its `<wchar.h>' header file.  The option `-nodtk' can be used as
 a workaround.  If GNU CC is not installed, it is therefore recommended

Modified: Extractor/configure.ac
===================================================================
--- Extractor/configure.ac      2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/configure.ac      2012-08-07 19:56:09 UTC (rev 23156)
@@ -293,6 +293,13 @@
          AM_CONDITIONAL(HAVE_MPEG2, false))],
          AM_CONDITIONAL(HAVE_MPEG2, false))
 
+AC_CHECK_LIB(jpeg, jpeg_std_error,
+        [AC_CHECK_HEADERS([jpeglib.h],
+           AM_CONDITIONAL(HAVE_JPEG, true)
+           AC_DEFINE(HAVE_JPEG,1,[Have libjpeg]),
+         AM_CONDITIONAL(HAVE_JPEG, false))],
+         AM_CONDITIONAL(HAVE_JPEG, false))
+
 AC_CHECK_LIB(poppler, _ZTI9MemStream,
         [AC_CHECK_HEADERS([poppler/goo/gmem.h],
            AM_CONDITIONAL(HAVE_POPPLER, true)

Modified: Extractor/doc/extract.1
===================================================================
--- Extractor/doc/extract.1     2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/doc/extract.1     2012-08-07 19:56:09 UTC (rev 23156)
@@ -1,4 +1,4 @@
-.TH EXTRACT 1 "Dec 20, 2009" "libextractor 0.6.0"
+.TH EXTRACT 1 "Aug 7, 2012" "libextractor 0.7.0"
 .\" $Id
 .SH NAME
 extract
@@ -6,16 +6,9 @@
 .SH SYNOPSIS
 .B extract
 [
-.B \-bghLnvV
+.B \-bgihLmnvV
 ]
 [
-.B \-H
-.I hash\-algorithm
-]
-[
-.B \-i
-]
-[
 .B \-l
 .I library
 ]
@@ -31,7 +24,7 @@
 \&...
 .br
 .SH DESCRIPTION
-This manual page documents version 0.6.0 of the
+This manual page documents version 0.7.0 of the
 .B extract
 command.
 .PP
@@ -63,6 +56,9 @@
 .B \-L
 Print a list of all known keyword types.
 .TP 8
+.B \-m
+Load the file into memory and perform extraction from memory (for debugging).
+.TP 8
 .B \-n
 Do not use the default set of extractors (typically all standard extractors, currently mp3, ogg, jpg, gif, png, tiff, real, html, pdf and mime\-types), use only the extractors specified with the .B \-l option.
 .TP

Modified: Extractor/doc/extractor.texi
===================================================================
--- Extractor/doc/extractor.texi        2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/doc/extractor.texi        2012-08-07 19:56:09 UTC (rev 23156)
@@ -158,15 +158,9 @@
 @verb{|man extract|}).
 
 @cindex license
-@gnule{} is licensed under the GNU General Public License.  The
-developers have frequently received requests to license GNU
-libextractor under alternative terms.  However, @gnule{}
-borrows plenty of GPL-licensed code from various other projects.
-Hence we cannot change the license (even if we wanted to)address@hidden
-maybe possible to switch to GPLv3 in the future.  For this, an audit
-of the license status of our dependencies would be required.  The new
-code that was developed specifically for @gnule{} has
-always been licensed under GPLv2 @emph{or any later version}.}
+@gnule{} is licensed under the GNU General Public License,
+specifically, since version 0.7, @gnule{} is licensed under GPLv3
+@emph{or any later version}.
 
 @node Preparation
 @chapter Preparation
@@ -181,7 +175,7 @@
 autotools build process, read the @file{INSTALL} file and query
 @verb{|./configure --help|} for additional options.  
 
-@gnule{} has various dependencies, some of which are optional. 
+@gnule{} has various dependencies, most of which are optional. 
 Instead of specifying the names of the software packages, we
 will give the list in terms of the names of the respective
 Debian (unstable) packages that should be installed.
@@ -199,38 +193,34 @@
 g++ 
 @item
 libltdl7-dev
-@item
-zlib1g-dev
-@item
-libbz2-dev
 @end itemize
 
 Recommended dependencies are:
 @itemize @bullet
 @item
-libgtk2.0-dev
+zlib1g-dev
 @item
+libbz2-dev
+@item
+libgif-dev
+@item
 libvorbis-dev
 @item
 libflac-dev
 @item
-libgsf-1-dev
-@item
 libmpeg2-4-dev
 @item
-libqt4-dev
-@item
 librpm-dev
 @item
+libgtk2.0-dev
+@item
+libgsf-1-dev
+@item
+libqt4-dev
+@item
 libpoppler-dev
 @item
 libexiv2-dev
-@end itemize
-
-Optional dependencies (you would need to additionally specify 
-the configure option @code{--enable-ffmpeg}) to make use of these
-are:
-@itemize @bullet
 @item
 libavformat-dev
 @item
@@ -355,7 +345,8 @@
 // hello.c
 #include <Extractor/extractor.h>
 
-int main()
+int
+main (int argc, char **argv)
 {
   struct EXTRACTOR_PluginList *el;
   el = EXTRACTOR_plugin_load_defaults (EXTRACTOR_OPTION_DEFAULT_POLICY);
@@ -408,9 +399,7 @@
 @section Note to package maintainers
 
 The suggested way to package GNU libextractor is to split it into
-roughly the following binary packages:@footnote{Debian policy
-furthermore requires a @file{-dev} (meta) package that would depend on
-all of the above packages.}
+roughly the following binary packages:
 
 @itemize @bullet
 @item
@@ -491,7 +480,10 @@
 
 @verbatim
 #include <extractor.h>
-int main(int argc, char ** argv) {
+
+int 
+main (int argc, char ** argv) 
+{
   struct EXTRACTOR_PluginList *plugins
     = EXTRACTOR_plugin_add_defaults (EXTRACTOR_OPTION_DEFAULT_POLICY);
   EXTRACTOR_extract (plugins, argv[1],
@@ -740,7 +732,7 @@
 
 @section Mono
 
-This binding is undocumented at this point.
+his binding is undocumented at this point.
 
 @section Perl
 
@@ -802,77 +794,28 @@
 
 @itemize @bullet
 @item
-APPLEFILE
+EXIV2 (using libexiv2)
+@item 
+FLAC (using libFLAC)
 @item
-ASF
+GIF (using libgif)
 @item
-DEB
-@item
-DVI
-@item
-ELF
-@item
-EXIV2
-@item
-FLAC
-@item
-FLV
-@item
-GIF
-@item
-HTML
-@item
-ID3 (v2.0, v2.3, v2.4)
-@item
-IT
-@item
 JPEG
 @item
-OLE2
+MIME (using libmagic)
 @item
-thumbnail (GTK, QT or FFMPEG-based)
-@item
-MAN
-@item
-MIME
-@item
 MP3 (ID3v1)
 @item
-MPEG
+MPEG (using libmpeg2)
 @item
-NSF and NSFE
-@item
-ODF
-@item
 PNG
 @item
-PS (PostScript)
-@item
-QT (QuickTime)
-@item
-REAL
-@item
-RIFF
-@item
-RPM
-@item 
-S3M
-@item
-SID
-@item
-TAR
-@item
-TIFF
-@item
-WAV
-@item
-XM
-@item
-ZIP
+RPM (using librpm)
 @end itemize
 
 @file{gzip} and @file{bzip2} compressed versions of these formats are 
-also supported (as well as meta data embedded by @file{gzip} itself).
+also supported (as well as meta data embedded by @file{gzip} itself)
+if zlib or libbz2 are available.
 
 @node Writing new Plugins
 @chapter Writing new Plugins
@@ -891,29 +834,21 @@
 
 The plugin library must be called libextractor_XXX.so, where XXX 
 denotes the file format of the plugin. The library must export a 
-method @verb{|libextractor_XXX_extract|}, with the following 
+method @verb{|libextractor_XXX_extract_method|}, with the following 
 signature:
 @verbatim
-int
-EXTRACTOR_XXX_extract
-   (const char *data,
-    size_t data_size,
-    EXTRACTOR_MetaDataProcessor proc,
-    void *proc_cls,
-    const char * options);
+void
+EXTRACTOR_XXX_extract_method (struct EXTRACTOR_ExtractContext *ec);
 @end verbatim
 
-@samp{data} is a pointer to the typically memory mapped contents of
-the file.  Note that plugins cannot ignore the @verb{|const|}
-annotation since the memory mapping may have been done read-only (and
-thus writes to this page will result in an error).  The @samp{data_size}
-argument specifies the size of the @samp{data} buffer in bytes.
address@hidden contains various information the plugin may need for its
+execution.  Most importantly, it contains functions for reading
+(``read'') and seeking (``seek'') the input data and for returning
+extracted data (``proc'').  The ``config'' member can contain
+additional configuration options.  ``proc'' should be called on
+each meta data item found.  If ``proc'' returns non-zero,
+processing should be aborted (if possible).
 
-@samp{proc} should be called on each meta data item found.  If @samp{proc} 
-returns non-zero, processing should be aborted and the @code{extract}
-function must return 1.  Otherwise @code{extract} should always return zero.
-
-
 In order to test new plugins, the @file{extract} command can be run
 with the options ``-ni'' and ``-l XXX'' .  This will run the plugin
 in-process (making it easier to debug) and without any of the other
@@ -926,111 +861,32 @@
 a file.
 @example
 @verbatim
-int
-EXTRACTOR_mymime_extract
-   (const char *data,
-    size_t data_size,
-    EXTRACTOR_MetaDataProcessor proc,
-    void *proc_cls,
-    const char * options)
+void
+EXTRACTOR_mymime_extract (struct EXTRACTOR_ExtractContext *ec)
 {
+  void *data;
+  ssize_t data_size,
+
+  if (-1 == (data_size = ec->read (ec->cls, &data, 4)))
+    return; /* read error */
   if (data_size < 4)
-    return 0;
+    return; /* file too small */
   if (0 != memcmp (data, "\177ELF", 4))
-    return 0;
-  if (0 != proc (proc_cls, 
-                 "mymime",
-                 EXTRACTOR_METATYPE_MIMETYPE,
-                 EXTRACTOR_METAFORMAT_UTF8,
-                 "text/plain",
-                 "application/x-executable",
-                 1 + strlen("application/x-executable")))
-    return 1;
+    return; /* not ELF */
+  if (0 != ec->proc (ec->cls, 
+                     "mymime",
+                     EXTRACTOR_METATYPE_MIMETYPE,
+                     EXTRACTOR_METAFORMAT_UTF8,
+                     "text/plain",
+                     "application/x-executable",
+                     1 + strlen("application/x-executable")))
+    return;
   /* more calls to 'proc' here as needed */
-  return 0;
 }
 @end verbatim
 @end example
 
address@hidden Plugin execution options
 
-Plugins can request that their execution be done in a particular way.
-For this, the plugin defines a function with the following signature:
-
-@verbatim
-const char *
-EXTRACTOR_XXX_options (void);
-@end verbatim
-
-The function should return a string with the execution options.
-Individual options in this string should be separated by semicolons.
-Options that are included in the string but not known to the library
-are ignored.  The following options are supported:
-
-@itemize @bullet
-@item
address@hidden ensures that the plugin is only run out-of-process; if
-this is not possible, the plugin will not be executed at all if this
-option is set.
-
-@item
address@hidden ensures that @code{stderr} is closed during the
-execution of the plugin.  This is useful if the plugin uses libraries
-that write (error) messages to @code{stderr} and where this behavior cannot be 
-turned off.  This option only works if the plugin is executed out-of-process.
-
-@item
address@hidden ensures that @code{stdout} is closed during the
-execution of the plugin.  This is useful if the plugin uses libraries
-that write messages to @code{stdout} and where this behavior cannot be 
-turned off.  This option only works if the plugin is executed out-of-process.
-
-@item
address@hidden kills and restarts the plugin process for each
-file that is being analyzed.  This is useful if the plugin uses
-libraries that keep global state between runs that is problematic or
-if the plugin uses libraries that are known to have serious resource
-leaks (such as memory leaks).
-
-@item
address@hidden 
-In order to limit memory consumption, limit the amount if reading from
-disk and to keep the API simple, the @samp{data} argument passed to
-the @code{EXTRACTOR_XXX_extract} method bounded (to 32 MB of normal
-data; for compressed data, a limit of 16 MB is imposed)address@hidden
address@hidden was given a pointer to an existing, uncompressed block of
-data in memory, no bound is imposed for plugins executing in-process;
-for out-of-process plugins, a 32 MB limit is still imposed.}  Since
-some file formats contain meta data at the end of the file, this option
-provides a way for plugins to access not the first 16--32 MB of a file
-but instead the last (roughly) 32 MB. 
-
-Note that even for files larger than 32 MB, @samp{size} is not
-guaranteed to be 32 MB since @samp{data} will be aligned to the page
-size of the operating system.  However, the last byte of @samp{data}
-is guaranteed to be the last byte of the file.  Furthermore, if the
-file was large and compressed, unlike in the case of meta data
-extraction from the header, the end of the file will not be
-automatically decompressed by @gnule{}.  
-
-@end itemize
-
-Note that using options other than @code{want-tail} is pretty much
-always a kludge and should thus be avoided.
-
address@hidden Example for an options method
-
-The following example shows how a plugin can set some of the options listed above:
-@example
-@verbatim
-const char *
-EXTRACTOR_id3_options ()
-{
-  return "close-stderr;want-tail";
-}
-@end verbatim
-@end example
-
 @node Internal utility functions
 @chapter Internal utility functions
 
@@ -1055,7 +911,7 @@
 @file{convert.h} provides a function for character set conversion described
 below.
 
-@deftypefun {char *} EXTRACTOR_common_convert_to_utf8 (const char *input, size_t len, const char * charset)
+@deftypefun {char *} EXTRACTOR_common_convert_to_utf8 (const char *input, size_t len, const char *charset)
 @cindex UTF-8
 @cindex character set
 @findex EXTRACTOR_common_convert_to_utf8
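
Note on the plugin interface documented in the extractor.texi hunk above: the following is a minimal, compilable sketch of a plugin body written against a context structure, together with an in-memory harness for exercising it. It is not the library's code. `struct ExtractContext`, `MetaDataProcessor`, `struct MemFile`, `mem_read`, `record_item` and `detect_mime` are stand-ins invented for this sketch; only the member names `cls`, `read` and `proc` and the call shapes are taken from the documentation in this commit, and the real declarations in `extractor.h` should be consulted before relying on any detail (for example, the real callback takes enum arguments where the sketch uses `int`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>          /* ssize_t (POSIX) */

/* Stand-in for the meta data callback type (assumption, simplified). */
typedef int (*MetaDataProcessor) (void *cls, const char *plugin_name,
                                  int type, int format,
                                  const char *data_mime_type,
                                  const char *data, size_t data_len);

/* Stand-in for the extract context; only the members used below. */
struct ExtractContext
{
  void *cls;
  ssize_t (*read) (void *cls, void **data, size_t size);
  MetaDataProcessor proc;
};

/* Plugin body: report a MIME type if the input starts with the ELF magic. */
static void
mymime_extract_method (struct ExtractContext *ec)
{
  static const char mime[] = "application/x-executable";
  void *data;
  ssize_t n;

  n = ec->read (ec->cls, &data, 4);
  if (n < 4)
    return;                     /* read error (-1) or file too small */
  if (0 != memcmp (data, "\177ELF", 4))
    return;                     /* not ELF */
  /* 0 stands in for the meta type and format constants. */
  (void) ec->proc (ec->cls, "mymime", 0, 0, "text/plain",
                   mime, sizeof (mime));
}

/* Hypothetical in-memory "file" that also records the last item found. */
struct MemFile
{
  const char *buf;
  size_t size;
  size_t off;
  char found[64];
};

/* Hand out a pointer into the buffer, as the documented read() does. */
static ssize_t
mem_read (void *cls, void **data, size_t size)
{
  struct MemFile *mf = cls;
  size_t left = mf->size - mf->off;

  if (size > left)
    size = left;
  *data = (void *) (mf->buf + mf->off);
  mf->off += size;
  return (ssize_t) size;
}

/* Copy the reported value; returning 0 means "continue extracting". */
static int
record_item (void *cls, const char *plugin_name, int type, int format,
             const char *data_mime_type, const char *data, size_t data_len)
{
  struct MemFile *mf = cls;

  (void) plugin_name; (void) type; (void) format; (void) data_mime_type;
  assert (data_len <= sizeof (mf->found));
  memcpy (mf->found, data, data_len);
  return 0;
}

/* Run the plugin over a buffer; returns the MIME type found, or "". */
const char *
detect_mime (const char *buf, size_t size)
{
  static struct MemFile mf;
  struct ExtractContext ec;

  memset (&mf, 0, sizeof (mf));
  mf.buf = buf;
  mf.size = size;
  ec.cls = &mf;
  ec.read = &mem_read;
  ec.proc = &record_item;
  mymime_extract_method (&ec);
  return mf.found;
}
```

The sketch folds the "read error" and "file too small" cases into a single `n < 4` test, which behaves the same as the two separate checks in the documentation example because a failed read returns -1.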

Modified: Extractor/doc/libextractor.3
===================================================================
--- Extractor/doc/libextractor.3        2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/doc/libextractor.3        2012-08-07 19:56:09 UTC (rev 23156)
@@ -1,32 +1,32 @@
-.TH LIBEXTRACTOR 3 "Dec 14, 2009"
+.TH LIBEXTRACTOR 3 "Aug 8, 2012" "libextractor 0.7.0"
 .SH NAME
-libextractor \- meta\-information extraction library 0.6.0
+libextractor \- meta\-information extraction library 0.7.0
 .SH SYNOPSIS
 
 \fB#include <extractor.h>
 
-\fBconst char *EXTRACTOR_metatype_to_string(enum EXTRACTOR_MetaType \fItype\fB);
+\fBconst char *EXTRACTOR_metatype_to_string (enum EXTRACTOR_MetaType \fItype\fB);
 
-\fBconst char *EXTRACTOR_metatype_to_description(enum EXTRACTOR_MetaType \fItype\fB);
+\fBconst char *EXTRACTOR_metatype_to_description (enum EXTRACTOR_MetaType \fItype\fB);
 
 \fBenum EXTRACTOR_MetaTypeEXTRACTOR_metatype_get_max (void);
 
-\fBstruct EXTRACTOR_PluginList *EXTRACTOR_plugin_add_defaults(enum EXTRACTOR_Options \fIflags\fB);
+\fBstruct EXTRACTOR_PluginList *EXTRACTOR_plugin_add_defaults (enum EXTRACTOR_Options \fIflags\fB);
 
-\fBstruct EXTRACTOR_PluginList *EXTRACTOR_plugin_add (struct EXTRACTOR_PluginList * \fIprev\fB, const char * \fIlibrary\fB, const char * \fIoptions\fB, enum EXTRACTOR_Options \fIflags\fB);
+\fBstruct EXTRACTOR_PluginList *EXTRACTOR_plugin_add (struct EXTRACTOR_PluginList *\fIprev\fB, const char *\fIlibrary\fB, const char *\fIoptions\fB, enum EXTRACTOR_Options \fIflags\fB);
 
 
-\fBstruct EXTRACTOR_PluginList *EXTRACTOR_plugin_add_last(struct EXTRACTOR_PluginList *\fIprev\fB, const char *\fIlibrary\fB, const char *\fIoptions\fB, enum EXTRACTOR_Options \fIflags\fB);
+\fBstruct EXTRACTOR_PluginList *EXTRACTOR_plugin_add_last (struct EXTRACTOR_PluginList *\fIprev\fB, const char *\fIlibrary\fB, const char *\fIoptions\fB, enum EXTRACTOR_Options \fIflags\fB);
 
-\fBstruct EXTRACTOR_PluginList *EXTRACTOR_plugin_add_config (struct EXTRACTOR_PluginList * \fIprev\fB, const char *\fIconfig\fB, enum EXTRACTOR_Options \fIflags\fB);
+\fBstruct EXTRACTOR_PluginList *EXTRACTOR_plugin_add_config (struct EXTRACTOR_PluginList *\fIprev\fB, const char *\fIconfig\fB, enum EXTRACTOR_Options \fIflags\fB);
                
-\fBstruct EXTRACTOR_PluginList *EXTRACTOR_plugin_remove(struct EXTRACTOR_PluginList * \fIprev\fB, const char * \fIlibrary\fB);
+\fBstruct EXTRACTOR_PluginList *EXTRACTOR_plugin_remove (struct EXTRACTOR_PluginList *\fIprev\fB, const char *\fIlibrary\fB);
 
-\fBvoid EXTRACTOR_plugin_remove_all(struct EXTRACTOR_PluginList *\fIplugins\fB);
+\fBvoid EXTRACTOR_plugin_remove_all (struct EXTRACTOR_PluginList *\fIplugins\fB);
 
-\fBvoid EXTRACTOR_extract(struct EXTRACTOR_PluginList *\fIplugins\fB, const char *\fIfilename\fB, const void *\fIdata\fB, size_t \fIsize\fB, EXTRACTOR_MetaDataProcessor \fIproc\fB, void *\fIproc_cls\fB);
+\fBvoid EXTRACTOR_extract (struct EXTRACTOR_PluginList *\fIplugins\fB, const char *\fIfilename\fB, const void *\fIdata\fB, size_t \fIsize\fB, EXTRACTOR_MetaDataProcessor \fIproc\fB, void *\fIproc_cls\fB);
 
-\fBint EXTRACTOR_meta_data_print(void * \fIhandle\fB, const char *\fIplugin_name\fB, enum EXTRACTOR_MetaType \fItype\fB, enum EXTRACTOR_MetaFormat \fIformat\fB, const char *\fIdata_mime_type\fB, const char *\fIdata\fB, size_t \fIdata_len\fB);
+\fBint EXTRACTOR_meta_data_prin t(void *\fIhandle\fB, const char *\fIplugin_name\fB, enum EXTRACTOR_MetaType \fItype\fB, enum EXTRACTOR_MetaFormat \fIformat\fB, const char *\fIdata_mime_type\fB, const char *\fIdata\fB, size_t \fIdata_len\fB);
 
 \fBEXTRACTOR_VERSION
 
@@ -34,16 +34,16 @@
 .P
 GNU libextractor is a simple library for keyword extraction.  libextractor does not support all formats but supports a simple plugging mechanism such that you can quickly add extractors for additional formats, even without recompiling libextractor.  libextractor typically ships with dozens of plugins that can be used to obtain meta data from common file-types.  If you want to write your own plugin for some filetype, all you need to do is write a little library that implements a single method with this signature:
 
- \fBint EXTRACTOR_name_extract(const char *\fIdata\fB, size_t \fIdatasize\fB, EXTRACTOR_MetaDataProcessor \fIproc\fB, void *\fIproc_cls\fB, const char *\fIoptions\fB);
+ \fBvoid EXTRACTOR_XXX_extract_method (struct EXTRACTOR_ExtractContext *ec);
 
 .P
-Data is a pointer to the contents of the file and datasize is the size of data.  The extract method must call proc for meta data that it finds.  The interpretation of options is up to the plugin.  The function should return 0 if 'proc' always returned 0, otherwise 1.  After 'proc' returned a non-zero value, proc should not be called again. An example implementation can be found in \fIhtml_extractor.c\fP.  Plugins should be automatically found and used once they are installed in the respective directory (typically something like /usr/lib/libextractor/).
+'ec' contains function pointers for reading, seeking, getting the overall file size and returning meta data.  There is also a field with options for the plugin.  New plugins will be automatically located and used once they are installed in the respective directory (typically something like /usr/lib/libextractor/).
 .P
-The application extract gives an example how to use libextractor.
+The application 'extract' gives an example how to use libextractor.
 .P
 The basic use of libextractor is to load the plugins (for example with \fBEXTRACTOR_plugin_add_defaults\fP), then to extract the keyword list using \fBEXTRACTOR_extract\fP, and finally unloading the plugins (with \fBEXTRACTOR_plugin_remove_all\fP).
 .P
-Textual meta data obtained from libextractor is supposed to be UTF-8 encoded if the text encoding is known.  Plugins are supposed to convert meta-data to UTF-8 if necessary.    The EXTRACTOR_meta_data_print function converts the UTF-8 keywords to the character set from the current locale before printing them.
+Textual meta data obtained from libextractor is supposed to be UTF-8 encoded if the text encoding is known.  Plugins are supposed to convert meta\-data to UTF\-8 if necessary.    The \fBEXTRACTOR_meta_data_print\fP function converts the UTF-8 keywords to the character set from the current locale before printing them.
 .P
 .SH "SEE ALSO"
 extract(1)

Modified: Extractor/doc/version.texi
===================================================================
--- Extractor/doc/version.texi  2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/doc/version.texi  2012-08-07 19:56:09 UTC (rev 23156)
@@ -1,4 +1,4 @@
-@set UPDATED 29 January 2012
-@set UPDATED-MONTH January 2012
+@set UPDATED 7 August 2012
+@set UPDATED-MONTH August 2012
 @set EDITION 0.6.3
 @set VERSION 0.6.3

Modified: Extractor/src/main/extractor.c
===================================================================
--- Extractor/src/main/extractor.c      2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/src/main/extractor.c      2012-08-07 19:56:09 UTC (rev 23156)
@@ -636,16 +636,19 @@
     {
       /* need to create shared memory segment */
       shm = EXTRACTOR_IPC_shared_memory_create_ (DEFAULT_SHM_SIZE);
-      for (pos = plugins; NULL != pos; pos = pos->next)
-       if ( (NULL == pos->shm) &&
-            (EXTRACTOR_OPTION_IN_PROCESS != pos->flags) )
+    }
+  for (pos = plugins; NULL != pos; pos = pos->next)
+    if ( (NULL == pos->channel) &&
+        (EXTRACTOR_OPTION_IN_PROCESS != pos->flags) )
+      {
+       if (NULL == pos->shm)
          {
            pos->shm = shm;
            (void) EXTRACTOR_IPC_shared_memory_change_rc_ (shm, 1);
-           pos->channel = EXTRACTOR_IPC_channel_create_ (pos,
-                                                         shm);
          }
-    }
+       pos->channel = EXTRACTOR_IPC_channel_create_ (pos,
+                                                     shm);
+      }
   do_extract (plugins, shm, datasource, proc, proc_cls);
   EXTRACTOR_datasource_destroy_ (datasource);
 }
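
Note on the extractor.c hunk above: the log message only says "train hacking", so the following reading is inferred from the diff rather than stated by the author. After the change, the loop over the plugin list runs whether or not a new shared memory segment was just created, and an out-of-process plugin without a channel gets one even when its `shm` pointer is already set. A minimal sketch of that control flow, using stand-in types (`struct Shm`, `struct Channel`, `struct Plugin`, `channel_create`, `setup_channels` and the `OPTION_*` values are all invented here and are not the library's definitions):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; the real enum lives in extractor.h. */
enum { OPTION_DEFAULT_POLICY = 0, OPTION_IN_PROCESS = 2 };

struct Shm { int rc; };                 /* reference-counted segment */
struct Channel { struct Shm *shm; };    /* IPC channel to a plugin process */

struct Plugin
{
  struct Plugin *next;
  struct Shm *shm;
  struct Channel *channel;
  struct Channel channel_storage;       /* avoids malloc in this sketch */
  int flags;
};

/* Stand-in for the channel constructor used in the diff. */
static struct Channel *
channel_create (struct Plugin *plugin, struct Shm *shm)
{
  plugin->channel_storage.shm = shm;
  return &plugin->channel_storage;
}

/* Give every out-of-process plugin that lacks a channel a channel,
   attaching the shared segment first if the plugin has none yet.
   Returns the number of channels created. */
int
setup_channels (struct Plugin *plugins, struct Shm *shm)
{
  struct Plugin *pos;
  int created = 0;

  assert (NULL != shm);
  for (pos = plugins; NULL != pos; pos = pos->next)
    {
      if ( (NULL != pos->channel) ||
           (OPTION_IN_PROCESS == pos->flags) )
        continue;               /* already connected, or runs in-process */
      if (NULL == pos->shm)
        {
          pos->shm = shm;
          shm->rc++;            /* plugin now holds a reference */
        }
      pos->channel = channel_create (pos, shm);
      created++;
    }
  return created;
}
```

Under the old structure, a plugin whose `shm` was already set but whose channel was missing would have been skipped; in the sketch it is picked up, and a second call creates nothing.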

Modified: Extractor/src/plugins/Makefile.am
===================================================================
--- Extractor/src/plugins/Makefile.am   2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/src/plugins/Makefile.am   2012-08-07 19:56:09 UTC (rev 23156)
@@ -1,4 +1,7 @@
-INCLUDES = -I$(top_srcdir)/src/include -I$(top_srcdir)/src/common -I$(top_srcdir)/src/main
+INCLUDES = \
+ -I$(top_srcdir)/src/include \
+ -I$(top_srcdir)/src/common \
+ -I$(top_srcdir)/src/main
 
 # install plugins under:
 plugindir = $(libdir)/@RPLUGINDIR@
@@ -14,10 +17,13 @@
 EXTRA_DIST = template_extractor.c \
   testdata/ogg_courseclear.ogg \
   testdata/gif_image.gif \
+  testdata/jpeg_image.jpg \
   testdata/rpm_test.rpm \
   testdata/flac_kraftwerk.flac \
   testdata/mpeg_alien.mpg \
-  testdata/mpeg_melt.mpg
+  testdata/mpeg_melt.mpg \
+  testdata/wav_noise.wav \
+  testdata/wav_alert.wav
 
 if HAVE_VORBISFILE
 PLUGIN_OGG=libextractor_ogg.la
@@ -49,26 +55,43 @@
 TEST_MPEG=test_mpeg
 endif
 
+if HAVE_JPEG
+PLUGIN_JPEG=libextractor_jpeg.la
+TEST_JPEG=test_jpeg
+endif
 
+if HAVE_POPPLER
+PLUGIN_EXIV2=libextractor_exiv2.la
+TEST_EXIV2=test_exiv2
+endif
+
+
 plugin_LTLIBRARIES = \
+  libextractor_xm.la \
+  libextractor_wav.la \
   $(PLUGIN_OGG) \
   $(PLUGIN_MIME) \
   $(PLUGIN_GIF) \
   $(PLUGIN_RPM) \
   $(PLUGIN_FLAC) \
-  $(PLUGIN_MPEG)
+  $(PLUGIN_MPEG) \
+  $(PLUGIN_JPEG) \
+  $(PLUGIN_EXIV2)
 
 if HAVE_ZZUF
   fuzz_tests=fuzz_default.sh 
 endif
 
 check_PROGRAMS = \
+  test_wav \
   $(TEST_OGG) \
   $(TEST_MIME) \
   $(TEST_GIF) \
   $(TEST_RPM) \
   $(TEST_FLAC) \
-  $(TEST_MPEG)
+  $(TEST_MPEG) \
+  $(TEST_JPEG) \
+  $(TEST_EXIV2)
 
 TESTS = \
   $(fuzz_tests) \
@@ -84,7 +107,23 @@
   $(top_builddir)/src/main/libextractor.la
 
 
+libextractor_xm_la_SOURCES = \
+  xm_extractor.c
+libextractor_xm_la_LDFLAGS = \
+  $(PLUGINFLAGS)
 
+
+libextractor_wav_la_SOURCES = \
+  wav_extractor.c
+libextractor_wav_la_LDFLAGS = \
+  $(PLUGINFLAGS)
+
+test_wav_SOURCES = \
+  test_wav.c
+test_wav_LDADD = \
+  $(top_builddir)/src/plugins/libtest.la
+
+
 libextractor_ogg_la_SOURCES = \
   ogg_extractor.c
 libextractor_ogg_la_LDFLAGS = \
@@ -163,3 +202,29 @@
   $(top_builddir)/src/plugins/libtest.la
 
 
+libextractor_jpeg_la_SOURCES = \
+  jpeg_extractor.c
+libextractor_jpeg_la_LDFLAGS = \
+  $(PLUGINFLAGS)
+libextractor_jpeg_la_LIBADD = \
+  -ljpeg
+
+test_jpeg_SOURCES = \
+  test_jpeg.c
+test_jpeg_LDADD = \
+  $(top_builddir)/src/plugins/libtest.la
+
+
+libextractor_exiv2_la_SOURCES = \
+  exiv2_extractor.cc
+libextractor_exiv2_la_LDFLAGS = \
+  $(PLUGINFLAGS)
+libextractor_exiv2_la_LIBADD = \
+  -lexiv2
+
+test_exiv2_SOURCES = \
+  test_exiv2.c
+test_exiv2_LDADD = \
+  $(top_builddir)/src/plugins/libtest.la
+
+

Modified: Extractor/src/plugins/exiv2_extractor.cc
===================================================================
--- Extractor/src/plugins/exiv2_extractor.cc    2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/src/plugins/exiv2_extractor.cc    2012-08-07 19:56:09 UTC (rev 23156)
@@ -2,7 +2,7 @@
 /*
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License
- * as published by the Free Software Foundation; either version 2
+ * as published by the Free Software Foundation; either version 3
  * of the License, or (at your option) any later version.
  *
  * This program is distributed in the hope that it will be useful,
@@ -14,74 +14,588 @@
  * along with this program; if not, write to the Free Software
  * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
  */
-/*!
-  @file    exiv2_extractor.cc
-  @brief   libextractor plugin for Exif using exiv2
-  @version $Rev$
-  @author  Andreas Huggel (ahu)
-  <a href="mailto:address@hidden";>address@hidden</a>
-  @date    30-Jun-05, ahu: created; 15-Dec-09, cg: updated
-*/
-
+/**
+ * @file plugins/exiv2_extractor.cc
+ * @brief libextractor plugin for Exif using exiv2
+ * @author Andreas Huggel (ahu)
+ * @author Christian Grothoff
+ */
+#include "platform.h"
+#include "extractor.h"
 #include <iostream>
 #include <iomanip>
 #include <cassert>
 #include <cstring>
 #include <math.h>
-
-#include "platform.h"
-#include "extractor.h"
-
 #include <exiv2/exif.hpp>
+#include <exiv2/error.hpp>
 #include <exiv2/image.hpp>
 #include <exiv2/futils.hpp>
 
+/**
+ * Should we suppress exiv2 warnings?
+ */
 #define SUPPRESS_WARNINGS 1
 
-#define ADD(s, type) do { if (0!=proc(proc_cls, "exiv2", type, EXTRACTOR_METAFORMAT_UTF8, "text/plain", s, strlen(s)+1)) return 1; } while (0)
 
-static int
-addExiv2Tag(const Exiv2::ExifData& exifData,
-           const std::string& key,
-           enum EXTRACTOR_MetaType type,
-           EXTRACTOR_MetaDataProcessor proc,
-           void *proc_cls)
+/**
+ * Implementation of EXIV2's BasicIO interface based
+ * on the 'struct EXTRACTOR_ExtractContext.
+ */
+class ExtractorIO:Exiv2::BasicIo
 {
-  const char * str;
+private:
+
+  /**
+   * Extract context we are using.
+   */
+  struct EXTRACTOR_ExtractContext *ec;
+
+public:
+
+  /**
+   * Constructor.
+   * 
+   * @param s_ec extract context to wrap
+   */
+  ExtractorIO (struct EXTRACTOR_ExtractContext *s_ec)
+  {
+    ec = s_ec;
+  }
+
+  /**
+   * Destructor.
+   */
+  virtual ~ExtractorIO ()
+  {
+    /* nothing to do */
+  }
+
+  /**
+   * Open stream.
+   * 
+   * @return 0 (always successful)
+   */
+  virtual int open ();
+
+  /**
+   * Close stream.
+   * 
+   * @return 0 (always successful)
+   */
+  virtual int close ();
   
-  Exiv2::ExifKey ek(key);
-  Exiv2::ExifData::const_iterator md = exifData.findKey(ek);
-  if (md != exifData.end()) {
-    std::string ccstr = Exiv2::toString(*md);
-    str = ccstr.c_str();
-    while ( (strlen(str) > 0) && isspace((unsigned char) str[0])) str++;
-    if (strlen(str) > 0)
-      ADD (str, type);
-    md++;
-  }
+  /**
+   * Read up to 'rcount' bytes into a buffer
+   *
+   * @param rcount number of bytes to read
+   * @return buffer with data read, empty buffer (!) on failure (!)
+   */
+  virtual Exiv2::DataBuf read (long rcount);
+
+  /**
+   * Read up to 'rcount' bytes into 'buf'.
+   *
+   * @param buf buffer to fill
+   * @param rcount size of 'buf'
+   * @return number of bytes read successfully, 0 on failure (!)
+   */
+  virtual long read (Exiv2::byte *buf,
+                    long rcount);
+
+  /**
+   * Read a single character.
+   *
+   * @return the character
+   * @throw exception on errors
+   */
+  virtual int getb ();
+
+  /**
+   * Write to stream.
+   *
+   * @param data data to write
+   * @param wcount how many bytes to write
+   * @return -1 (always fails)
+   */ 
+  virtual long write (const Exiv2::byte *data,
+                     long wcount);
+
+  /**
+   * Write to stream.
+   *
+   * @param src stream to copy
+   * @return -1 (always fails)
+   */ 
+  virtual long write (Exiv2::BasicIo &src);
+
+  /**
+   * Write a single byte.
+   *
+   * @param data byte to write
+   * @return -1 (always fails)
+   */
+  virtual int putb (Exiv2::byte data);
+
+  /**
+   * Not supported.
+   *
+   * @throws error
+   */
+  virtual void transfer (Exiv2::BasicIo& src);
+
+  /**
+   * Seek to the given offset.
+   *
+   * @param offset desired offset
+   * @parma pos offset is relative to where?
+   * @return -1 on failure, 0 on success
+   */
+  virtual int seek (long offset,
+                   Exiv2::BasicIo::Position pos);
+
+  /**
+   * Not supported.
+   *
+   * @throws error
+   */
+  virtual Exiv2::byte* mmap (bool isWritable);
+  
+  /**
+   * Not supported.
+   *
+   * @return -1 (error)
+   */
+  virtual int munmap ();
+
+  /**
+   * Return our current offset in the file.
+   *
+   * @return -1 on error
+   */
+  virtual long int tell (void) const;
+
+  /**
+   * Return overall size of the file.
+   *
+   * @return -1 on error
+   */
+  virtual long int size (void) const;
+
+  /**
+   * Check if file is open.
+   * 
+   * @return true (always).
+   */
+  virtual bool isopen () const;
+
+  /**
+   * Check if this file source is in error mode.
+   *
+   * @return 0 (always all is fine).
+   */
+  virtual int error () const;
+
+  /**
+   * Check if current position of the file is at the end
+   * 
+   * @return true if at EOF, false if not.
+   */
+  virtual bool eof () const;
+
+  /**
+   * Not supported.
+   *
+   * @throws error
+   */
+  virtual std::string path () const;
+
+#ifdef EXV_UNICODE_PATH
+  /**
+   * Not supported.
+   *
+   * @throws error
+   */
+  virtual std::wstring wpath () const;
+#endif
+  
+  /**
+   * Not supported.
+   *
+   * @throws error
+   */
+  virtual Exiv2::BasicIo::AutoPtr temporary () const;
+
+};
+  
+
+/**
+ * Open stream.
+ * 
+ * @return 0 (always successful)
+ */
+int 
+ExtractorIO::open ()
+{
   return 0;
 }
 
+/**
+ * Close stream.
+ * 
+ * @return 0 (always successful)
+ */
+int 
+ExtractorIO::close ()
+{
+  return 0;
+}
 
-static int
-addIptcData(const Exiv2::IptcData& iptcData,
-           const std::string& key,
-           enum EXTRACTOR_MetaType type,
-           EXTRACTOR_MetaDataProcessor proc,
-           void *proc_cls)
+
+/**
+ * Read up to 'rcount' bytes into a buffer
+ *
+ * @param rcount number of bytes to read
+ * @return buffer with data read, empty buffer (!) on failure (!)
+ */
+Exiv2::DataBuf
+ExtractorIO::read (long rcount)
 {
-  const char * str;
+  void *data;
+  ssize_t ret;
   
-  Exiv2::IptcKey ek(key);
-  Exiv2::IptcData::const_iterator md = iptcData.findKey(ek);
-  while (md != iptcData.end()) 
+  if (-1 == (ret = ec->read (ec->cls, &data, rcount)))
+    return Exiv2::DataBuf (NULL, 0);
+  return Exiv2::DataBuf ((const Exiv2::byte *) data, ret);
+}
+
+
+/**
+ * Read up to 'rcount' bytes into 'buf'.
+ *
+ * @param buf buffer to fill
+ * @param rcount size of 'buf'
+ * @return number of bytes read successfully, 0 on failure (!)
+ */
+long 
+ExtractorIO::read (Exiv2::byte *buf,
+                  long rcount)
+{
+  void *data;
+  ssize_t ret;
+  
+  if (-1 == (ret = ec->read (ec->cls, &data, rcount)))
+    return 0;
+  memcpy (buf, data, ret);
+  return ret;
+}
+
+
+/**
+ * Read a single character.
+ *
+ * @return the character
+ * @throw exception on errors
+ */
+int 
+ExtractorIO::getb ()
+{
+  void *data;
+  unsigned char *r;
+  
+  if (1 != ec->read (ec->cls, &data, 1))
+    throw Exiv2::BasicError<char> (42 /* error code */);
+  /* read as unsigned so that a 0xFF byte is not sign-extended to -1 */
+  r = (unsigned char *) data;
+  return *r;
+}
+
+
+/**
+ * Write to stream.
+ *
+ * @param data data to write
+ * @param wcount how many bytes to write
+ * @return -1 (always fails)
+ */ 
+long 
+ExtractorIO::write (const Exiv2::byte *data,
+                   long wcount)
+{
+  return -1;
+}
+
+
+/**
+ * Write to stream.
+ *
+ * @param src stream to copy
+ * @return -1 (always fails)
+ */ 
+long 
+ExtractorIO::write (Exiv2::BasicIo &src)
+{
+  return -1;
+}
+
+
+/**
+ * Write a single byte.
+ *
+ * @param data byte to write
+ * @return -1 (always fails)
+ */
+int 
+ExtractorIO::putb (Exiv2::byte data)
+{
+  return -1;
+}
+
+
+/**
+ * Not supported.
+ *
+ * @throws error
+ */
+void
+ExtractorIO::transfer (Exiv2::BasicIo& src)
+{
+  throw Exiv2::BasicError<char> (42 /* error code */);
+}
+
+
+/**
+ * Seek to the given offset.
+ *
+ * @param offset desired offset
+ * @param pos what the offset is relative to (beginning, current position or end)
+ * @return -1 on failure, 0 on success
+ */
+int 
+ExtractorIO::seek (long offset,
+                  Exiv2::BasicIo::Position pos)
+{
+  int rel;
+  
+  switch (pos)
     {
-      if (0 != strcmp (Exiv2::toString(md->key()).c_str(), key.c_str()))
+    case beg: // Exiv2::BasicIo::beg:
+      rel = SEEK_SET;
+      break;
+    case cur:
+      rel = SEEK_CUR;
+      break;
+    case end:
+      rel = SEEK_END;
+      break;
+    default:
+      abort ();
+    }
+  if (-1 == ec->seek (ec->cls, offset, rel))
+    return -1;
+  return 0;
+}
+
+
+/**
+ * Not supported.
+ *
+ * @throws error
+ */
+Exiv2::byte *
+ExtractorIO::mmap (bool isWritable)
+{
+  throw Exiv2::BasicError<char> (42 /* error code */);
+}
+
+
+/**
+ * Not supported.
+ *
+ * @return -1 error
+ */
+int
+ExtractorIO::munmap ()
+{
+  return -1;
+}
+
+
+/**
+ * Return our current offset in the file.
+ *
+ * @return -1 on error
+ */
+long int
+ExtractorIO::tell (void) const
+{
+  return (long) ec->seek (ec->cls, 0, SEEK_CUR);
+}
+
+
+/**
+ * Return overall size of the file.
+ *
+ * @return -1 on error
+ */
+long int 
+ExtractorIO::size (void) const
+{
+  return (long) ec->get_size (ec->cls);
+}
+
+
+/**
+ * Check if file is open.
+ * 
+ * @return true (always).
+ */
+bool 
+ExtractorIO::isopen () const
+{
+  return true;
+}
+
+
+/**
+ * Check if this file source is in error mode.
+ *
+ * @return 0 (always all is fine).
+ */
+int
+ExtractorIO::error () const
+{
+  return 0;
+}
+
+
+/**
+ * Check if current position of the file is at the end
+ * 
+ * @return true if at EOF, false if not.
+ */
+bool 
+ExtractorIO::eof () const
+{
+  return size () == tell ();
+}
+
+
+/**
+ * Not supported.
+ *
+ * @throws error
+ */
+std::string
+ExtractorIO::path () const
+{
+  throw Exiv2::BasicError<char> (42 /* error code */);
+}
+
+
+#ifdef EXV_UNICODE_PATH
+/**
+ * Not supported.
+ *
+ * @throws error
+ */
+std::wstring
+ExtractorIO::wpath () const
+{
+  throw Exiv2::BasicError<char> (42 /* error code */);
+}
+#endif
+
+
+/**
+ * Not supported.
+ *
+ * @throws error
+ */
+Exiv2::BasicIo::AutoPtr
+ExtractorIO::temporary () const
+{
+  throw Exiv2::BasicError<char> (42 /* error code */);
+}
+
+
+/**
+ * Pass the given UTF-8 string to the 'proc' callback using
+ * the given type.  Uses 'return 1' if 'proc' returns non-0.
+ *
+ * @param s 0-terminated UTF8 string value with the meta data
+ * @param type libextractor type for the meta data
+ */
+#define ADD(s, type) do { if (0 != proc (proc_cls, "exiv2", type, EXTRACTOR_METAFORMAT_UTF8, "text/plain", s, strlen (s) + 1)) return 1; } while (0)
+
+
+/**
+ * Try to find a given key in the exifData and if a value is
+ * found, pass it to 'proc'.
+ *
+ * @param exifData metadata set to inspect
+ * @param key key to lookup in exifData
+ * @param type extractor type to use
+ * @param proc function to call with results
+ * @param proc_cls closure for proc
+ * @return 0 to continue extracting, 1 to abort
+ */
+static int
+addExiv2Tag (const Exiv2::ExifData& exifData,
+            const std::string& key,
+            enum EXTRACTOR_MetaType type,
+            EXTRACTOR_MetaDataProcessor proc,
+            void *proc_cls)
+{
+  const char *str;
+  Exiv2::ExifKey ek (key);
+  Exiv2::ExifData::const_iterator md = exifData.findKey (ek);
+
+  if (exifData.end () == md) 
+    return 0; /* not found */
+  std::string ccstr = Exiv2::toString(*md);
+  str = ccstr.c_str();
+  /* skip over whitespace */
+  while ( (strlen (str) > 0) && isspace ((unsigned char) str[0]))
+    str++;
+  if (strlen (str) > 0)
+    ADD (str, type);
+  md++;
+  return 0;
+}
+
+
+/**
+ * Try to find a given key in the iptcData and if a value is
+ * found, pass it to 'proc'.
+ *
+ * @param iptcData metadata set to inspect
+ * @param key key to lookup in iptcData
+ * @param type extractor type to use
+ * @param proc function to call with results
+ * @param proc_cls closure for proc
+ * @return 0 to continue extracting, 1 to abort
+ */
+static int
+addIptcData (const Exiv2::IptcData& iptcData,
+            const std::string& key,
+            enum EXTRACTOR_MetaType type,
+            EXTRACTOR_MetaDataProcessor proc,
+            void *proc_cls)
+{
+  const char *str;
+  Exiv2::IptcKey ek (key);
+  Exiv2::IptcData::const_iterator md = iptcData.findKey (ek);
+
+  while (iptcData.end () !=  md) 
+    {
+      if (0 != strcmp (Exiv2::toString (md->key ()).c_str (), key.c_str ()))
          break;
-      std::string ccstr = Exiv2::toString(*md);
-      str = ccstr.c_str();
-      while ( (strlen(str) > 0) && isspace( (unsigned char) str[0])) str++;
-      if (strlen(str) > 0)
+      std::string ccstr = Exiv2::toString (*md);
+      str = ccstr.c_str ();
+      /* skip over whitespace */
+      while ((strlen (str) > 0) && isspace ((unsigned char) str[0])) 
+       str++;
+      if (strlen (str) > 0)
        ADD (str, type);
       md++;
     }
@@ -89,6 +603,17 @@
 }
 
 
+/**
+ * Try to find a given key in the xmpData and if a value is
+ * found, pass it to 'proc'.
+ *
+ * @param xmpData metadata set to inspect
+ * @param key key to lookup in xmpData
+ * @param type extractor type to use
+ * @param proc function to call with results
+ * @param proc_cls closure for proc
+ * @return 0 to continue extracting, 1 to abort
+ */
 static int
 addXmpData(const Exiv2::XmpData& xmpData,
           const std::string& key,
@@ -97,149 +622,163 @@
           void *proc_cls)
 {
   const char * str;
-  
-  Exiv2::XmpKey ek(key);
-  Exiv2::XmpData::const_iterator md = xmpData.findKey(ek);
-  while (md != xmpData.end()) 
+  Exiv2::XmpKey ek (key);
+  Exiv2::XmpData::const_iterator md = xmpData.findKey (ek);
+
+  while (xmpData.end () != md) 
     {
-      if (0 != strcmp (Exiv2::toString(md->key()).c_str(), key.c_str()))
+      if (0 != strcmp (Exiv2::toString (md->key ()).c_str (), key.c_str ()))
        break;
-      std::string ccstr = Exiv2::toString(*md);
-      str = ccstr.c_str();
-      while ( (strlen(str) > 0) && isspace( (unsigned char) str[0])) str++;
-      if (strlen(str) > 0)
+      std::string ccstr = Exiv2::toString (*md);
+      str = ccstr.c_str ();
+      while ( (strlen (str) > 0) && isspace ((unsigned char) str[0])) str++;
+      if (strlen (str) > 0)
        ADD (str, type);
       md++;
     }
   return 0;
 }
 
-#define ADDEXIV(s,t) do { if (0 != addExiv2Tag (exifData, s, t, proc, proc_cls)) return 1; } while (0)
-#define ADDIPTC(s,t) do { if (0 != addIptcData (iptcData, s, t, proc, proc_cls)) return 1; } while (0)
-#define ADDXMP(s,t)  do { if (0 != addXmpData  (xmpData,  s, t, proc, proc_cls)) return 1; } while (0)
 
+/**
+ * Call 'addExiv2Tag' for the given key-type combination.
+ * Uses 'return' if addExiv2Tag returns non-0.
+ *
+ * @param s key to lookup
+ * @param t libextractor type to use for the meta data found under the given key
+ */
+#define ADDEXIV(s,t) do { if (0 != addExiv2Tag (exifData, s, t, ec->proc, ec->cls)) return; } while (0)
 
-extern "C" {
-  
-  int 
-  EXTRACTOR_exiv2_extract (const char *data,
-                          size_t size,
-                          EXTRACTOR_MetaDataProcessor proc,
-                          void *proc_cls,
-                          const char *options)
-  {
-    try 
-      {            
-       Exiv2::Image::AutoPtr image = Exiv2::ImageFactory::open( (Exiv2::byte*) data,
-                                                                size);
-       if (image.get() == 0)
-         return 0;
-       image->readMetadata();
-       Exiv2::ExifData &exifData = image->exifData();
-       if (!exifData.empty()) 
-         {                     
-           Exiv2::ExifData::const_iterator md;
-           
-           /* FIXME: this should be a loop over data,
-              not a looooong block of code */
-           ADDEXIV ("Exif.Image.Copyright", EXTRACTOR_METATYPE_COPYRIGHT);
-           ADDEXIV ("Exif.Photo.UserComment", EXTRACTOR_METATYPE_COMMENT);
-           ADDEXIV ("Exif.GPSInfo.GPSLatitudeRef", EXTRACTOR_METATYPE_GPS_LATITUDE_REF);
-           ADDEXIV ("Exif.GPSInfo.GPSLatitude", EXTRACTOR_METATYPE_GPS_LATITUDE);
-           ADDEXIV ("Exif.GPSInfo.GPSLongitudeRef", EXTRACTOR_METATYPE_GPS_LONGITUDE_REF);
-           ADDEXIV ("Exif.GPSInfo.GPSLongitude", EXTRACTOR_METATYPE_GPS_LONGITUDE);
-           ADDEXIV ("Exif.Image.Make", EXTRACTOR_METATYPE_CAMERA_MAKE);    
-           ADDEXIV ("Exif.Image.Model", EXTRACTOR_METATYPE_CAMERA_MODEL);
-           ADDEXIV ("Exif.Image.Orientation", EXTRACTOR_METATYPE_ORIENTATION);
-           ADDEXIV ("Exif.Photo.DateTimeOriginal", EXTRACTOR_METATYPE_CREATION_DATE);
-           ADDEXIV ("Exif.Photo.ExposureBiasValue", EXTRACTOR_METATYPE_EXPOSURE_BIAS);
-           ADDEXIV ("Exif.Photo.Flash", EXTRACTOR_METATYPE_FLASH);
-           ADDEXIV ("Exif.CanonSi.FlashBias", EXTRACTOR_METATYPE_FLASH_BIAS);
-           ADDEXIV ("Exif.Panasonic.FlashBias", EXTRACTOR_METATYPE_FLASH_BIAS);
-           ADDEXIV ("Exif.Olympus.FlashBias", EXTRACTOR_METATYPE_FLASH_BIAS);
-           ADDEXIV ("Exif.Photo.FocalLength", EXTRACTOR_METATYPE_FOCAL_LENGTH);
-           ADDEXIV ("Exif.Photo.FocalLengthIn35mmFilm", EXTRACTOR_METATYPE_FOCAL_LENGTH_35MM);
-           ADDEXIV ("Exif.Photo.ISOSpeedRatings", EXTRACTOR_METATYPE_ISO_SPEED);
-           ADDEXIV ("Exif.CanonSi.ISOSpeed", EXTRACTOR_METATYPE_ISO_SPEED);
-           ADDEXIV ("Exif.Nikon1.ISOSpeed", EXTRACTOR_METATYPE_ISO_SPEED);
-           ADDEXIV ("Exif.Nikon2.ISOSpeed", EXTRACTOR_METATYPE_ISO_SPEED);
-           ADDEXIV ("Exif.Nikon3.ISOSpeed", EXTRACTOR_METATYPE_ISO_SPEED);
-           ADDEXIV ("Exif.Photo.ExposureProgram", EXTRACTOR_METATYPE_EXPOSURE_MODE);
-           ADDEXIV ("Exif.CanonCs.ExposureProgram", EXTRACTOR_METATYPE_EXPOSURE_MODE);
-           ADDEXIV ("Exif.Photo.MeteringMode", EXTRACTOR_METATYPE_METERING_MODE);
-           ADDEXIV ("Exif.CanonCs.Macro", EXTRACTOR_METATYPE_MACRO_MODE);
-           ADDEXIV ("Exif.Fujifilm.Macro", EXTRACTOR_METATYPE_MACRO_MODE);
-           ADDEXIV ("Exif.Olympus.Macro", EXTRACTOR_METATYPE_MACRO_MODE);
-           ADDEXIV ("Exif.Panasonic.Macro", EXTRACTOR_METATYPE_MACRO_MODE);
-           ADDEXIV ("Exif.CanonCs.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
-           ADDEXIV ("Exif.Fujifilm.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
-           ADDEXIV ("Exif.Sigma.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
-           ADDEXIV ("Exif.Nikon1.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
-           ADDEXIV ("Exif.Nikon2.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
-           ADDEXIV ("Exif.Nikon3.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
-           ADDEXIV ("Exif.Olympus.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
-           ADDEXIV ("Exif.Panasonic.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
-           ADDEXIV ("Exif.CanonSi.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
-           ADDEXIV ("Exif.Fujifilm.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
-           ADDEXIV ("Exif.Sigma.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
-           ADDEXIV ("Exif.Nikon1.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
-           ADDEXIV ("Exif.Nikon2.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
-           ADDEXIV ("Exif.Nikon3.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
-           ADDEXIV ("Exif.Olympus.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
-           ADDEXIV ("Exif.Panasonic.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
 
-           ADDEXIV ("Exif.Photo.FNumber", EXTRACTOR_METATYPE_APERTURE);
-           md = exifData.findKey(Exiv2::ExifKey("Exif.Photo.ApertureValue"));
-           if (md != exifData.end()) {
+/**
+ * Call 'addIptcData' for the given key-type combination.
+ * Uses 'return' if addIptcData returns non-0.
+ *
+ * @param s key to lookup
+ * @param t libextractor type to use for the meta data found under the given key
+ */
+#define ADDIPTC(s,t) do { if (0 != addIptcData (iptcData, s, t, ec->proc, ec->cls)) return; } while (0)
+
+
+/**
+ * Call 'addXmpData' for the given key-type combination.
+ * Uses 'return' if addXmpData returns non-0.
+ *
+ * @param s key to lookup
+ * @param t libextractor type to use for the meta data found under the given key
+ */
+#define ADDXMP(s,t)  do { if (0 != addXmpData  (xmpData,  s, t, ec->proc, ec->cls)) return; } while (0)
+
+
+/**
+ * Main entry method for the 'exiv2' extraction plugin.  
+ *
+ * @param ec extraction context provided to the plugin
+ */
+extern "C" void
+EXTRACTOR_exiv2_extract_method (struct EXTRACTOR_ExtractContext *ec)
+{
+  try
+    {      
+      std::auto_ptr<Exiv2::BasicIo> eio (new ExtractorIO (ec));
+      Exiv2::Image::AutoPtr image = Exiv2::ImageFactory::open (eio);
+      if (0 == image.get ())
+       return;
+      image->readMetadata ();
+      Exiv2::ExifData &exifData = image->exifData ();
+      if (! exifData.empty ()) 
+       {                                 
+         ADDEXIV ("Exif.Image.Copyright", EXTRACTOR_METATYPE_COPYRIGHT);
+         ADDEXIV ("Exif.Photo.UserComment", EXTRACTOR_METATYPE_COMMENT);
+         ADDEXIV ("Exif.GPSInfo.GPSLatitudeRef", EXTRACTOR_METATYPE_GPS_LATITUDE_REF);
+         ADDEXIV ("Exif.GPSInfo.GPSLatitude", EXTRACTOR_METATYPE_GPS_LATITUDE);
+         ADDEXIV ("Exif.GPSInfo.GPSLongitudeRef", EXTRACTOR_METATYPE_GPS_LONGITUDE_REF);
+         ADDEXIV ("Exif.GPSInfo.GPSLongitude", EXTRACTOR_METATYPE_GPS_LONGITUDE);
+         ADDEXIV ("Exif.Image.Make", EXTRACTOR_METATYPE_CAMERA_MAKE);    
+         ADDEXIV ("Exif.Image.Model", EXTRACTOR_METATYPE_CAMERA_MODEL);
+         ADDEXIV ("Exif.Image.Orientation", EXTRACTOR_METATYPE_ORIENTATION);
+         ADDEXIV ("Exif.Photo.DateTimeOriginal", EXTRACTOR_METATYPE_CREATION_DATE);
+         ADDEXIV ("Exif.Photo.ExposureBiasValue", EXTRACTOR_METATYPE_EXPOSURE_BIAS);
+         ADDEXIV ("Exif.Photo.Flash", EXTRACTOR_METATYPE_FLASH);
+         ADDEXIV ("Exif.CanonSi.FlashBias", EXTRACTOR_METATYPE_FLASH_BIAS);
+         ADDEXIV ("Exif.Panasonic.FlashBias", EXTRACTOR_METATYPE_FLASH_BIAS);
+         ADDEXIV ("Exif.Olympus.FlashBias", EXTRACTOR_METATYPE_FLASH_BIAS);
+         ADDEXIV ("Exif.Photo.FocalLength", EXTRACTOR_METATYPE_FOCAL_LENGTH);
+         ADDEXIV ("Exif.Photo.FocalLengthIn35mmFilm", EXTRACTOR_METATYPE_FOCAL_LENGTH_35MM);
+         ADDEXIV ("Exif.Photo.ISOSpeedRatings", EXTRACTOR_METATYPE_ISO_SPEED);
+         ADDEXIV ("Exif.CanonSi.ISOSpeed", EXTRACTOR_METATYPE_ISO_SPEED);
+         ADDEXIV ("Exif.Nikon1.ISOSpeed", EXTRACTOR_METATYPE_ISO_SPEED);
+         ADDEXIV ("Exif.Nikon2.ISOSpeed", EXTRACTOR_METATYPE_ISO_SPEED);
+         ADDEXIV ("Exif.Nikon3.ISOSpeed", EXTRACTOR_METATYPE_ISO_SPEED);
+         ADDEXIV ("Exif.Photo.ExposureProgram", EXTRACTOR_METATYPE_EXPOSURE_MODE);
+         ADDEXIV ("Exif.CanonCs.ExposureProgram", EXTRACTOR_METATYPE_EXPOSURE_MODE);
+         ADDEXIV ("Exif.Photo.MeteringMode", EXTRACTOR_METATYPE_METERING_MODE);
+         ADDEXIV ("Exif.CanonCs.Macro", EXTRACTOR_METATYPE_MACRO_MODE);
+         ADDEXIV ("Exif.Fujifilm.Macro", EXTRACTOR_METATYPE_MACRO_MODE);
+         ADDEXIV ("Exif.Olympus.Macro", EXTRACTOR_METATYPE_MACRO_MODE);
+         ADDEXIV ("Exif.Panasonic.Macro", EXTRACTOR_METATYPE_MACRO_MODE);
+         ADDEXIV ("Exif.CanonCs.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
+         ADDEXIV ("Exif.Fujifilm.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
+         ADDEXIV ("Exif.Sigma.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
+         ADDEXIV ("Exif.Nikon1.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
+         ADDEXIV ("Exif.Nikon2.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
+         ADDEXIV ("Exif.Nikon3.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
+         ADDEXIV ("Exif.Olympus.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
+         ADDEXIV ("Exif.Panasonic.Quality", EXTRACTOR_METATYPE_IMAGE_QUALITY);
+         ADDEXIV ("Exif.CanonSi.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
+         ADDEXIV ("Exif.Fujifilm.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
+         ADDEXIV ("Exif.Sigma.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
+         ADDEXIV ("Exif.Nikon1.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
+         ADDEXIV ("Exif.Nikon2.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
+         ADDEXIV ("Exif.Nikon3.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
+         ADDEXIV ("Exif.Olympus.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
+         ADDEXIV ("Exif.Panasonic.WhiteBalance", EXTRACTOR_METATYPE_WHITE_BALANCE);
+         ADDEXIV ("Exif.Photo.FNumber", EXTRACTOR_METATYPE_APERTURE);
+         ADDEXIV ("Exif.Photo.ExposureTime", EXTRACTOR_METATYPE_EXPOSURE);
+
+#if FIXME
+         /* FIXME: the 'ADD' macro below won't work as we don't have 'proc' and 'proc_cls' in
+            this scope... */
+         Exiv2::ExifData::const_iterator md = exifData.findKey(Exiv2::ExifKey("Exif.Photo.ApertureValue"));
+         if (exifData.end() != md) 
+           {
              std::ostringstream os;
              os << std::fixed << std::setprecision(1)
                 << "F" << exp(log(2.0) * md->toFloat() / 2);
              ADD (os.str().c_str(), EXTRACTOR_METATYPE_APERTURE);
            }
-    
-           ADDEXIV ("Exif.Photo.ExposureTime", EXTRACTOR_METATYPE_EXPOSURE);
-           md = exifData.findKey(Exiv2::ExifKey("Exif.Photo.ShutterSpeedValue"));
-           if (md != exifData.end()) {
+         
+         md = exifData.findKey(Exiv2::ExifKey("Exif.Photo.ShutterSpeedValue"));
+         if (exifData.end() != md) 
+           {
              double tmp = exp(log(2.0) * md->toFloat()) + 0.5;
              std::ostringstream os;
-             if (tmp > 1) {
-               os << "1/" << static_cast<long>(tmp) << " s";
-             }
-             else {
-               os << static_cast<long>(1/tmp) << " s";
-             }
+             if (tmp > 1) 
+               {
+                 os << "1/" << static_cast<long>(tmp) << " s";
+               }
+             else 
+               {
+                 os << static_cast<long>(1/tmp) << " s";
+               }
              ADD (os.str().c_str(), EXTRACTOR_METATYPE_EXPOSURE);
            }
-
-           /* this can sometimes be wrong (corrupt exiv2 data?).
-              Either way, we should get the data directly from
-              the specific file format parser (i.e. jpeg, tiff). */
-           // Exif Resolution
-           unsigned long xdim = 0;
-           unsigned long ydim = 0;
-           md = exifData.findKey(Exiv2::ExifKey("Exif.Photo.PixelXDimension"));
-           if (md != exifData.end()) xdim = md->toLong();
-           md = exifData.findKey(Exiv2::ExifKey("Exif.Photo.PixelYDimension"));
-           if (md != exifData.end()) ydim = md->toLong();
-           if ( (xdim != 0) && (ydim != 0))
-             {
-               std::ostringstream os;
-               os << xdim << "x" << ydim;
-               ADD (os.str().c_str(), EXTRACTOR_METATYPE_IMAGE_DIMENSIONS);
-             }     
-         } 
-
-       Exiv2::IptcData &iptcData = image->iptcData();
-       if (! iptcData.empty()) {
+#endif
+       } 
+      
+      Exiv2::IptcData &iptcData = image->iptcData();
+      if (! iptcData.empty()) 
+       {
          ADDIPTC ("Iptc.Application2.Keywords", EXTRACTOR_METATYPE_KEYWORDS);
          ADDIPTC ("Iptc.Application2.City", EXTRACTOR_METATYPE_LOCATION_CITY);
          ADDIPTC ("Iptc.Application2.SubLocation", EXTRACTOR_METATYPE_LOCATION_SUBLOCATION);
          ADDIPTC ("Iptc.Application2.CountryName", EXTRACTOR_METATYPE_LOCATION_COUNTRY);
          ADDIPTC ("Xmp.photoshop.Country", EXTRACTOR_METATYPE_RATING);
        }
-
-       Exiv2::XmpData &xmpData = image->xmpData();
-       if (! xmpData.empty()) {
+      
+      Exiv2::XmpData &xmpData = image->xmpData();
+      if (! xmpData.empty()) 
+       {
          ADDXMP ("Xmp.photoshop.City", EXTRACTOR_METATYPE_LOCATION_CITY);
          ADDXMP ("Xmp.xmp.Rating", EXTRACTOR_METATYPE_RATING);
          ADDXMP ("Xmp.MicrosoftPhoto.Rating", EXTRACTOR_METATYPE_RATING);
@@ -248,14 +787,12 @@
          ADDXMP ("Xmp.lr.hierarchicalSubject", EXTRACTOR_METATYPE_SUBJECT);
        }       
       }
-    catch (const Exiv2::AnyError& e) {
+  catch (const Exiv2::AnyError& e) 
+    {
 #ifndef SUPPRESS_WARNINGS
       std::cout << "Caught Exiv2 exception '" << e << "'\n";
 #endif
     }
-    
-    return 0;
-  }
+}
 
-
-}
+/* end of exiv2_extractor.cc */
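
Editor's note on the hunk above: the ExtractorIO class wraps the plugin's read/seek/size callbacks in a read-only stream, where writes always fail, tell() is a zero-offset seek relative to the current position, and eof() is size() == tell(). The following is a minimal, self-contained sketch of that pattern only. It does not use Exiv2 or libextractor; MemoryContext, ctx_read, ctx_seek and ReadOnlyIo are hypothetical names introduced for illustration, and the in-memory buffer stands in for whatever the real extraction context reads from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>   // SEEK_SET, SEEK_CUR, SEEK_END
#include <cstring>  // std::memcpy

// Hypothetical stand-in for the extraction context: a read-only buffer
// plus a current position.
struct MemoryContext {
  const unsigned char *data;
  long size;
  long pos;
};

// Hand out a pointer into the buffer; returns the number of bytes
// available (possibly fewer than requested), or -1 on error.
static long ctx_read(MemoryContext *c, const void **out, long count) {
  if (count < 0)
    return -1;
  long avail = c->size - c->pos;
  if (count > avail)
    count = avail;
  *out = c->data + c->pos;
  c->pos += count;
  return count;
}

// Reposition; returns the new absolute offset, or -1 on error.
static long ctx_seek(MemoryContext *c, long offset, int whence) {
  long base;
  switch (whence) {
    case SEEK_SET: base = 0; break;
    case SEEK_CUR: base = c->pos; break;
    case SEEK_END: base = c->size; break;
    default: return -1;
  }
  long target = base + offset;
  if (target < 0 || target > c->size)
    return -1;
  c->pos = target;
  return target;
}

// Read-only adapter in the spirit of the class in the diff.
class ReadOnlyIo {
public:
  enum Position { beg, cur, end };

  explicit ReadOnlyIo(MemoryContext *c) : ctx(c) {}

  // Copy up to rcount bytes into buf; 0 signals failure or end of data.
  long read(unsigned char *buf, long rcount) {
    const void *data;
    long ret = ctx_read(ctx, &data, rcount);
    if (ret <= 0)
      return 0;
    std::memcpy(buf, data, static_cast<std::size_t>(ret));
    return ret;
  }

  // The stream is read-only, so writing always fails.
  long write(const unsigned char *, long) { return -1; }

  // Map the stream's own position enum onto SEEK_* constants.
  int seek(long offset, Position pos) {
    int rel;
    switch (pos) {
      case beg: rel = SEEK_SET; break;
      case cur: rel = SEEK_CUR; break;
      case end: rel = SEEK_END; break;
      default: return -1;
    }
    return (-1 == ctx_seek(ctx, offset, rel)) ? -1 : 0;
  }

  // A zero-offset relative seek reports the current position.
  long tell() const { return ctx_seek(ctx, 0, SEEK_CUR); }
  long size() const { return ctx->size; }
  bool eof() const { return size() == tell(); }

private:
  MemoryContext *ctx;
};
```

In this sketch an unknown position value makes seek() return -1, whereas the code in the diff calls abort (); which of the two is preferable depends on how much the caller is trusted.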

Modified: Extractor/src/plugins/jpeg_extractor.c
===================================================================
--- Extractor/src/plugins/jpeg_extractor.c      2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/src/plugins/jpeg_extractor.c      2012-08-07 19:56:09 UTC (rev 23156)
@@ -1,10 +1,10 @@
 /*
      This file is part of libextractor.
-     (C) 2002, 2003, 2004 Vidyut Samanta and Christian Grothoff
+     (C) 2002, 2003, 2004, 2012 Vidyut Samanta and Christian Grothoff
 
      libextractor is free software; you can redistribute it and/or modify
      it under the terms of the GNU General Public License as published
-     by the Free Software Foundation; either version 2, or (at your
+     by the Free Software Foundation; either version 3, or (at your
      option) any later version.
 
      libextractor is distributed in the hope that it will be useful, but
@@ -17,261 +17,154 @@
      Free Software Foundation, Inc., 59 Temple Place - Suite 330,
      Boston, MA 02111-1307, USA.
  */
-
+/**
+ * @file plugins/jpeg_extractor.c
+ * @brief plugin to support JPEG files
+ * @author Christian Grothoff
+ */
 #include "platform.h"
 #include "extractor.h"
+#include <jpeglib.h>
+#include <setjmp.h>
 
 
-#define M_SOI   0xD8            /* Start Of Image (beginning of datastream) */
-#define M_EOI   0xD9            /* End Of Image (end of datastream) */
-#define M_SOS   0xDA            /* Start Of Scan (begins compressed data) */
-#define M_APP12        0xEC
-#define M_COM   0xFE            /* COMment */
-#define M_APP0  0xE0
-
 /**
- * Get the next character in the sequence and advance
- * the pointer *data to the next location in the sequence.
- * If we're at the end, return -1.
+ * Context for custom functions.
  */
-#define NEXTC(data,end) ((*(data)<(end))?*((*(data))++):-1)
+struct Context
+{
+  /**
+   * Environment for longjmp from within error_exit handler.
+   */
+  jmp_buf env;
+};
 
-/* The macro does:
-unsigned int NEXTC(unsigned char ** data, char *  end) {
-  if (*data < end) {
-    char result = **data;
-    (*data)++;
-    return result;
-  } else
-    return -1;
-}
-*/
 
 /**
- * Read length, convert to unsigned int.
- * All 2-byte quantities in JPEG markers are MSB first
- * @return -1 on error
+ * Function used to avoid having libjpeg write error messages to the console.
  */
-static int
-readLength (const unsigned char **data, const unsigned char *end)
+static void
+no_emit (j_common_ptr cinfo, int msg_level)
 {
-  int c1;
-  int c2;
-
-  c1 = NEXTC (data, end);
-  if (c1 == -1)
-    return -1;
-  c2 = NEXTC (data, end);
-  if (c2 == -1)
-    return -1;
-  return ((((unsigned int) c1) << 8) + ((unsigned int) c2)) - 2;
+  /* do nothing */
 }
 
+
 /**
- * @return the next marker or -1 on error.
+ * Function used to avoid having libjpeg write error messages to the console.
  */
-static int
-next_marker (const unsigned char **data, const unsigned char *end)
+static void
+no_output (j_common_ptr cinfo)
 {
-  int c;
-  c = NEXTC (data, end);
-  while ((c != 0xFF) && (c != -1))
-    c = NEXTC (data, end);
-  do
-    {
-      c = NEXTC (data, end);
-    }
-  while (c == 0xFF);
-  return c;
+  /* do nothing */
 }
 
+
+/**
+ * Function used to avoid having libjpeg kill our process.
+ */
 static void
-skip_variable (const unsigned char **data, const unsigned char *end)
+no_exit (j_common_ptr cinfo)
 {
-  int length;
+  struct Context *ctx = cinfo->client_data;
 
-  length = readLength (data, end);
-  if (length < 0)
-    {
-      (*data) = end;            /* skip to the end */
-      return;
-    }
-  /* Skip over length bytes */
-  (*data) += length;
+  /* we're not allowed to return (by API definition),
+     and we don't want to abort/exit.  So we longjmp
+     to our cleanup code instead. */
+  longjmp (ctx->env, 1);
 }
 
-static char *
-process_COM (const unsigned char **data, const unsigned char *end)
+
+/**
+ * Main entry method for the 'image/jpeg' extraction plugin.  
+ *
+ * @param ec extraction context provided to the plugin
+ */
+void 
+EXTRACTOR_jpeg_extract_method (struct EXTRACTOR_ExtractContext *ec)
 {
-  unsigned int length;
-  int ch;
-  int pos;
-  char *comment;
+  struct jpeg_decompress_struct jds;
+  struct jpeg_error_mgr em;
+  void *buf;
+  ssize_t size;
+  int is_jpeg;
+  unsigned int rounds;
+  char format[128];
+  struct jpeg_marker_struct *mptr;
+  struct Context ctx;
 
-  length = readLength (data, end);
-  if (length <= 0)
-    return NULL;
-  comment = malloc (length + 1);
-  if (comment == NULL)
-    return NULL;
-  pos = 0;
-  while (length > 0)
+  is_jpeg = 0;
+  rounds = 0; /* used to avoid going on forever for non-jpeg files */
+  jpeg_std_error (&em);
+  em.emit_message = &no_emit;
+  em.output_message = &no_output;
+  em.error_exit = &no_exit;
+  jds.client_data = &ctx;
+  if (1 == setjmp (ctx.env)) 
+    goto EXIT; /* we get here if libjpeg calls 'no_exit' because it wants to die */
+  jds.err = &em;
+  jpeg_create_decompress (&jds);
+  jpeg_save_markers (&jds, JPEG_COM, 1024 * 8);
+  while ( (1 == is_jpeg) || (rounds++ < 8) )
     {
-      ch = NEXTC (data, end);
-      if ((ch == '\r') || (ch == '\n'))
-        comment[pos++] = '\n';
-      else if (isprint ((unsigned char) ch))
-        comment[pos++] = ch;
-      length--;
+      if (-1 == (size = ec->read (ec->cls,
+                                 &buf,
+                                 16 * 1024)))
+       break;
+      if (0 == size)
+       break;
+      jpeg_mem_src (&jds, buf, size);
+      if (0 == is_jpeg)
+       {      
+         if (JPEG_HEADER_OK == jpeg_read_header (&jds, 1))
+           is_jpeg = 1; /* ok, really a jpeg, keep going until the end */
+         continue;
+       }
+      jpeg_consume_input (&jds);
     }
-  comment[pos] = '\0';
-  return comment;
-}
 
-
-int 
-EXTRACTOR_jpeg_extract (const unsigned char *data,
-                       size_t size,
-                       EXTRACTOR_MetaDataProcessor proc,
-                       void *proc_cls,
-                       const char *options)
-{
-  int c1;
-  int c2;
-  int marker;
-  const unsigned char *end;
-  char *tmp;
-  char val[128];
-
-  if (size < 0x12)
-    return 0;
-  end = &data[size];
-  c1 = NEXTC (&data, end);
-  c2 = NEXTC (&data, end);
-  if ((c1 != 0xFF) || (c2 != M_SOI))
-    return 0;              /* not a JPEG */
-  if (0 != proc (proc_cls, 
-                "jpeg",
-                EXTRACTOR_METATYPE_MIMETYPE,
-                EXTRACTOR_METAFORMAT_UTF8,
-                "text/plain",
-                "image/jpeg",
-                strlen ("image/jpeg")+1))
-    return 1;
-  while (1)
+  if (1 != is_jpeg)
+    goto EXIT;
+  if (0 !=
+      ec->proc (ec->cls,
+               "jpeg",
+               EXTRACTOR_METATYPE_MIMETYPE,
+               EXTRACTOR_METAFORMAT_UTF8,
+               "text/plain",
+               "image/jpeg",
+               strlen ("image/jpeg") + 1))
+    goto EXIT;
+  snprintf (format,
+           sizeof (format),
+           "%ux%u",
+           (unsigned int) jds.image_width,
+           (unsigned int) jds.image_height);
+  if (0 !=
+      ec->proc (ec->cls,
+               "jpeg",
+               EXTRACTOR_METATYPE_IMAGE_DIMENSIONS,
+               EXTRACTOR_METAFORMAT_UTF8,
+               "text/plain",
+               format,
+               strlen (format) + 1))
+    goto EXIT;
+  for (mptr = jds.marker_list; NULL != mptr; mptr = mptr->next)
     {
-      marker = next_marker (&data, end);
-      switch (marker)
-        {
-        case -1:               /* end of file */
-        case M_SOS:
-        case M_EOI:
-          goto RETURN;
-        case M_APP0:
-          {
-            int len = readLength (&data, end);
-            if (len < 0x8)
-              goto RETURN;
-            if (0 == strncmp ((char *) data, "JFIF", 4))
-              {
-                switch (data[0x4])
-                  {
-                  case 1:      /* dots per inch */
-                    snprintf (val, 
-                             sizeof (val),
-                              _("%ux%u dots per inch"),
-                              (data[0x8] << 8) + data[0x9],
-                              (data[0xA] << 8) + data[0xB]);
-                   if (0 != proc (proc_cls, 
-                                  "jpeg",
-                                  EXTRACTOR_METATYPE_IMAGE_RESOLUTION,
-                                  EXTRACTOR_METAFORMAT_UTF8,
-                                  "text/plain",
-                                  val,
-                                  strlen (val)+1))
-                     return 1;
-                    break;
-                  case 2:      /* dots per cm */
-                    snprintf (val, 
-                             sizeof (val),
-                              _("%ux%u dots per cm"),
-                              (data[0x8] << 8) + data[0x9],
-                              (data[0xA] << 8) + data[0xB]);
-                   if (0 != proc (proc_cls, 
-                                  "jpeg",
-                                  EXTRACTOR_METATYPE_IMAGE_RESOLUTION,
-                                  EXTRACTOR_METAFORMAT_UTF8,
-                                  "text/plain",
-                                  val,
-                                  strlen (val)+1))
-                     return 1;
-                    break;
-                  case 0:      /* no unit given */
-                    snprintf (val, 
-                             sizeof (val),
-                              _("%ux%u dots per inch?"),
-                              (data[0x8] << 8) + data[0x9],
-                              (data[0xA] << 8) + data[0xB]);
-                   if (0 != proc (proc_cls, 
-                                  "jpeg",
-                                  EXTRACTOR_METATYPE_IMAGE_RESOLUTION,
-                                  EXTRACTOR_METAFORMAT_UTF8,
-                                  "text/plain",
-                                  val,
-                                  strlen (val)+1))
-                     return 1;
-                    break;
-                  default:     /* unknown unit */
-                    break;
-                  }
-              }
-            data = &data[len];
-            break;
-          }
-        case 0xC0:
-          {
-            int len = readLength (&data, end);
-            if (len < 0x9)
-              goto RETURN;
-            snprintf (val, 
-                     sizeof (val),
-                      "%ux%u",
-                      (data[0x3] << 8) + data[0x4],
-                      (data[0x1] << 8) + data[0x2]);
-           if (0 != proc (proc_cls, 
-                          "jpeg",
-                          EXTRACTOR_METATYPE_IMAGE_DIMENSIONS,
-                          EXTRACTOR_METAFORMAT_UTF8,
-                          "text/plain",
-                          val,
-                          strlen (val)+1))
-             return 1;
-            data = &data[len];
-            break;
-          }
-        case M_COM:
-        case M_APP12:
-          tmp = process_COM (&data, end);
-         if (NULL == tmp)
-           break;
-         if (0 != proc (proc_cls, 
-                        "jpeg",
-                        EXTRACTOR_METATYPE_COMMENT,
-                        EXTRACTOR_METAFORMAT_UTF8,
-                        "text/plain",
-                        tmp,
-                        strlen (tmp)+1))
-           {
-             free (tmp);
-             return 1;
-           }
-         free (tmp);
-          break;
-        default:
-          skip_variable (&data, end);
-          break;
-        }
+      if (JPEG_COM != mptr->marker)
+       continue;
+      if (0 !=
+         ec->proc (ec->cls,
+                   "jpeg",
+                   EXTRACTOR_METATYPE_COMMENT,
+                   EXTRACTOR_METAFORMAT_C_STRING,
+                   "text/plain",
+                   (const char *) mptr->data,
+                   mptr->data_length))
+       goto EXIT;
     }
-RETURN:
-  return 0;
+  
+ EXIT:
+  jpeg_destroy_decompress (&jds);
 }
+
+/* end of jpeg_extractor.c */
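
Note on the new comment loop above: the saved markers form a singly linked list, only comment markers are reported, and the payload is passed with an explicit length because marker data is not NUL-terminated. A minimal standalone sketch of that pattern follows. `struct sketch_marker`, `SKETCH_COM`, `sketch_report_fn`, `sketch_sum_lengths` and `sketch_report_comments` are hypothetical stand-ins for illustration, not libjpeg or libextractor definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for one node of a saved-marker list. */
struct sketch_marker
{
  struct sketch_marker *next;
  int marker;                 /* marker code */
  const unsigned char *data;  /* payload, NOT NUL-terminated */
  size_t data_length;         /* number of payload bytes */
};

/* Marker code of a JPEG comment (COM) segment. */
#define SKETCH_COM 0xFE

/* Hypothetical consumer callback; a non-zero return means "stop". */
typedef int (*sketch_report_fn) (void *cls, const char *data, size_t len);

/* Example consumer: adds each payload length to the size_t behind cls. */
static int
sketch_sum_lengths (void *cls, const char *data, size_t len)
{
  size_t *total = cls;

  (void) data;
  *total += len;
  return 0;
}

/* Walk the list and report every comment marker.  Returns the number
   of comments handed to the callback; stops early if it returns non-zero. */
static unsigned int
sketch_report_comments (const struct sketch_marker *head,
                        sketch_report_fn report,
                        void *cls)
{
  unsigned int count = 0;

  for (const struct sketch_marker *m = head; NULL != m; m = m->next)
    {
      if (SKETCH_COM != m->marker)
        continue;               /* skip everything but comments */
      count++;
      if (0 != report (cls, (const char *) m->data, m->data_length))
        break;                  /* consumer asked us to stop */
    }
  return count;
}
```

Passing the length explicitly is why the comment entry in the new test case is expected without the trailing NUL byte.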

Modified: Extractor/src/plugins/mime_extractor.c
===================================================================
--- Extractor/src/plugins/mime_extractor.c      2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/src/plugins/mime_extractor.c      2012-08-07 19:56:09 UTC (rev 23156)
@@ -80,12 +80,8 @@
       else
        magic_path = NULL;
     }
-  mime = magic_buffer (magic, buf, ret);
-  if (NULL == mime)
-    {
-      magic_close (magic);
-      return;
-    }
+  if (NULL == (mime = magic_buffer (magic, buf, ret)))
+    return;
   ec->proc (ec->cls,
            "mime",
            EXTRACTOR_METATYPE_MIMETYPE,
@@ -116,8 +112,11 @@
 void __attribute__ ((destructor)) 
 mime_ltdl_fini () 
 {
-  magic_close (magic);
-  magic = NULL;
+  if (NULL != magic)
+    {
+      magic_close (magic);
+      magic = NULL;
+    }
   if (NULL != magic_path)
     {
       free (magic_path);

Modified: Extractor/src/plugins/template_extractor.c
===================================================================
--- Extractor/src/plugins/template_extractor.c  2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/src/plugins/template_extractor.c  2012-08-07 19:56:09 UTC (rev 23156)
@@ -1,10 +1,10 @@
 /*
      This file is part of libextractor.
-     (C) 2002, 2003, 2004, 2009 Vidyut Samanta and Christian Grothoff
+     (C) 2002, 2003, 2004, 2009, 2012 Vidyut Samanta and Christian Grothoff
 
      libextractor is free software; you can redistribute it and/or modify
      it under the terms of the GNU General Public License as published
-     by the Free Software Foundation; either version 2, or (at your
+     by the Free Software Foundation; either version 3, or (at your
      option) any later version.
 
      libextractor is distributed in the hope that it will be useful, but
@@ -17,19 +17,28 @@
      Free Software Foundation, Inc., 59 Temple Place - Suite 330,
      Boston, MA 02111-1307, USA.
  */
-
+/**
+ * @file plugins/template_extractor.c
+ * @brief example code for writing your own plugin
+ * @author add your own name here
+ */
 #include "platform.h"
 #include "extractor.h"
 
-#include "extractor_plugins.h"
-#include "le_architecture.h"
 
-int
-EXTRACTOR_template_extract_method (struct EXTRACTOR_PluginList *plugin,
-    EXTRACTOR_MetaDataProcessor proc, void *proc_cls)
+/**
+ * This will be the main method of your plugin.
+ * Describe a bit what it does here.
+ *
+ * @param ec extraction context, here you get the API
+ *   for accessing the file data and for returning
+ *   meta data
+ */
+void
+EXTRACTOR_template_extract_method (struct EXTRACTOR_ExtractContext *ec)
 {
   int64_t offset;
-  unsigned char *data;
+  void *data;
 
   /* temporary variables are declared here */
 
@@ -37,50 +46,22 @@
     return 1;
 
   /* initialize state here */
+  
+  /* Call ec->seek (ec->cls, POSITION, WHENCE) to seek (if you know where
+   * data starts):
+   */
+  // ec->seek (ec->cls, POSITION, SEEK_SET);
 
-  /* Call pl_seek (plugin, POSITION, WHENCE) to seek (if you know where
-   * data starts.
+  /* Call ec->read (ec->cls, &data, COUNT) to read COUNT bytes
    */
-  /* Call pl_read (plugin, &data, COUNT) to read COUNT bytes (will be stored
-   * as data[0]..data[COUNT-1], no need to allocate data or free it; but it
-   * "goes away" when you make another read call, so store interesting values
-   * somewhere once you find them).
-   */
-  /* If you need to search for a magic id that is not at the beginning of the
-   * file, do pl_read() calls, reading sizable (1 megabyte or so) chunks,
-   * then use memchr() on them to find first byte of the magic sequence,
-   * then compare the rest of the sequence, if found.
-   * Mind the fact that you need to iterate over COUNT - SEQUENCE_LENGTH chars,
-   * and seek to POS + COUNT - SEQUENCE_LENGTH once you run out of bytes,
-   * otherwise you'd have a chance to skip bytes at chunk boundaries.
-   */
-  /* Do try to make a reasonable assumption about the amount of data you're
-   * going to search through. Iterating over the whole file, byte-by-byte is
-   * NOT a good idea, if the search itself is slow. Try to make the search as
-   * efficient as possible.
-   */
-  /* Avoid making long seeks backwards (for performance reasons)
-   */
-  /* pl_get_pos (plugin) will return current offset from the beginning of
-   * the file (i.e. index of the data[0] in the file, if you call pl_read
-   * at that point). You might need it do calculate forward-searches, if
-   * there are offsets stored within the file.
-   * pl_get_fsize (plugin) will return file size OR -1 if it is not known
-   * yet (file is not decompressed completely). Don't rely on fsize.
-   */
-  /* Seeking forward is safe
-   */
-  /* If you asked to read X bytes, but got less - it's EOF
-   */
-  /* Seeking backward a bit shouldn't hurt performance (i.e. read 4 bytes,
-   * then immediately seek 4 bytes back).
-   */
-  /* Don't read too much (you can't read more than MAX_READ from extractor.c,
-   * which is 32MB at the moment) in one call.
-   */
+
+
   /* Once you find something, call proc(). If it returns non-0 - you're done.
    */
-  /* Return 1 to indicate that you're done. */
+  // if (0 != ec->proc (ec->cls, ...)) return;
+
   /* Don't forget to free anything you've allocated before returning! */
-  return 1;
+  return;
 }
+
+/* end of template_extractor.c */
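
The trimmed template above leaves only the seek / read / proc outline. As a rough illustration of how those three steps fit together, here is a standalone sketch. It does not include extractor.h: `struct sketch_context` is a hypothetical stand-in whose callback signatures are assumptions modelled on the calls visible in this diff, and the "DEMO" file format, `sketch_extract_method` and the `mem_*` helpers are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-in for the extraction context (assumed signatures). */
struct sketch_context
{
  void *cls;
  ssize_t (*read) (void *cls, void **data, size_t size);
  int64_t (*seek) (void *cls, int64_t pos, int whence);
  int (*proc) (void *cls, const char *plugin_name,
               const char *data, size_t data_len);
};

/* Invented format: the magic "DEMO" followed by a one-byte version. */
static void
sketch_extract_method (struct sketch_context *ec)
{
  void *data;
  char version[16];

  if (0 != ec->seek (ec->cls, 0, SEEK_SET))
    return;                     /* could not seek to the start */
  if (5 != ec->read (ec->cls, &data, 5))
    return;                     /* short read, treat as end of file */
  if (0 != memcmp (data, "DEMO", 4))
    return;                     /* not our format */
  snprintf (version, sizeof (version), "v%u",
            (unsigned int) ((const unsigned char *) data)[4]);
  if (0 != ec->proc (ec->cls, "sketch", version, strlen (version) + 1))
    return;                     /* consumer asked us to stop */
}

/* In-memory backing store so the sketch can run without a real file. */
struct mem_file
{
  const unsigned char *buf;
  size_t size;
  size_t pos;
  char last[16];                /* last item handed to mem_proc */
};

/* Hand out a pointer into the buffer; returns the bytes available. */
static ssize_t
mem_read (void *cls, void **data, size_t size)
{
  struct mem_file *mf = cls;
  size_t avail = mf->size - mf->pos;

  if (size > avail)
    size = avail;
  *data = (void *) (mf->buf + mf->pos);
  mf->pos += size;
  return (ssize_t) size;
}

/* Only SEEK_SET is supported; returns the new position or -1. */
static int64_t
mem_seek (void *cls, int64_t pos, int whence)
{
  struct mem_file *mf = cls;

  if ( (SEEK_SET != whence) || (pos < 0) || ((uint64_t) pos > mf->size) )
    return -1;
  mf->pos = (size_t) pos;
  return pos;
}

/* Remember the reported item; non-zero return would stop the plugin. */
static int
mem_proc (void *cls, const char *plugin_name,
          const char *data, size_t data_len)
{
  struct mem_file *mf = cls;

  (void) plugin_name;
  if (data_len > sizeof (mf->last))
    return 1;
  memcpy (mf->last, data, data_len);
  return 0;
}
```

As in the template, the plugin never allocates or frees the buffer it reads; it only copies out the values it wants to keep before reporting them.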

Added: Extractor/src/plugins/test_exiv2.c
===================================================================
--- Extractor/src/plugins/test_exiv2.c                          (rev 0)
+++ Extractor/src/plugins/test_exiv2.c  2012-08-07 19:56:09 UTC (rev 23156)
@@ -0,0 +1,61 @@
+/*
+     This file is part of libextractor.
+     (C) 2012 Vidyut Samanta and Christian Grothoff
+
+     libextractor is free software; you can redistribute it and/or modify
+     it under the terms of the GNU General Public License as published
+     by the Free Software Foundation; either version 3, or (at your
+     option) any later version.
+
+     libextractor is distributed in the hope that it will be useful, but
+     WITHOUT ANY WARRANTY; without even the implied warranty of
+     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+     General Public License for more details.
+
+     You should have received a copy of the GNU General Public License
+     along with libextractor; see the file COPYING.  If not, write to the
+     Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+     Boston, MA 02111-1307, USA.
+*/
+/**
+ * @file plugins/test_exiv2.c
+ * @brief testcase for exiv2 plugin
+ * @author Christian Grothoff
+ */
+#include "platform.h"
+#include "test_lib.h"
+
+
+/**
+ * Main function for the EXIV2 testcase.
+ *
+ * @param argc number of arguments (ignored)
+ * @param argv arguments (ignored)
+ * @return 0 on success
+ */
+int
+main (int argc, char *argv[])
+{
+  struct SolutionData exiv2_image_sol[] =
+    {
+      { 
+       EXTRACTOR_METATYPE_IMAGE_DIMENSIONS,
+       EXTRACTOR_METAFORMAT_UTF8,
+       "text/plain",
+       "3x3",
+       strlen ("3x3") + 1,
+       0 
+      },
+      { 0, 0, NULL, NULL, 0, -1 }
+    };
+  struct ProblemSet ps[] =
+    {
+      { NULL, NULL },
+      { "testdata/exiv2_image.jpg",
+       exiv2_image_sol },
+      { NULL, NULL }
+    };
+  return ET_main ("exiv2", ps);
+}
+
+/* end of test_exiv2.c */

Added: Extractor/src/plugins/test_jpeg.c
===================================================================
--- Extractor/src/plugins/test_jpeg.c                           (rev 0)
+++ Extractor/src/plugins/test_jpeg.c   2012-08-07 19:56:09 UTC (rev 23156)
@@ -0,0 +1,77 @@
+/*
+     This file is part of libextractor.
+     (C) 2012 Vidyut Samanta and Christian Grothoff
+
+     libextractor is free software; you can redistribute it and/or modify
+     it under the terms of the GNU General Public License as published
+     by the Free Software Foundation; either version 3, or (at your
+     option) any later version.
+
+     libextractor is distributed in the hope that it will be useful, but
+     WITHOUT ANY WARRANTY; without even the implied warranty of
+     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+     General Public License for more details.
+
+     You should have received a copy of the GNU General Public License
+     along with libextractor; see the file COPYING.  If not, write to the
+     Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+     Boston, MA 02111-1307, USA.
+*/
+/**
+ * @file plugins/test_jpeg.c
+ * @brief testcase for jpeg plugin
+ * @author Christian Grothoff
+ */
+#include "platform.h"
+#include "test_lib.h"
+
+
+
+/**
+ * Main function for the JPEG testcase.
+ *
+ * @param argc number of arguments (ignored)
+ * @param argv arguments (ignored)
+ * @return 0 on success
+ */
+int
+main (int argc, char *argv[])
+{
+  struct SolutionData jpeg_image_sol[] =
+    {
+      { 
+       EXTRACTOR_METATYPE_MIMETYPE,
+       EXTRACTOR_METAFORMAT_UTF8,
+       "text/plain",
+       "image/jpeg",
+       strlen ("image/jpeg") + 1,
+       0 
+      },
+      { 
+       EXTRACTOR_METATYPE_IMAGE_DIMENSIONS,
+       EXTRACTOR_METAFORMAT_UTF8,
+       "text/plain",
+       "3x3",
+       strlen ("3x3") + 1,
+       0 
+      },
+      { 
+       EXTRACTOR_METATYPE_COMMENT,
+       EXTRACTOR_METAFORMAT_C_STRING,
+       "text/plain",
+       "(C) 2001 by Christian Grothoff, using gimp 1.2 1",
+       strlen ("(C) 2001 by Christian Grothoff, using gimp 1.2 1"),
+       0 
+      },
+      { 0, 0, NULL, NULL, 0, -1 }
+    };
+  struct ProblemSet ps[] =
+    {
+      { "testdata/jpeg_image.jpg",
+       jpeg_image_sol },
+      { NULL, NULL }
+    };
+  return ET_main ("jpeg", ps);
+}
+
+/* end of test_jpeg.c */
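
The test cases in this commit describe expected results as arrays that end in a sentinel entry whose last field is -1. test_lib.h is not part of this diff, so how it matches results is not shown here; the following is only one plausible sketch of checking extracted items against such a table. `struct sketch_solution`, `sketch_match` and `sketch_all_solved` are hypothetical names, and the struct is a reduced version of the six-field initializers used above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced expected-result entry. */
struct sketch_solution
{
  int type;              /* meta data type of the expected item */
  const char *data;      /* expected value */
  size_t data_len;       /* expected length in bytes */
  int solved;            /* 0: not seen yet, 1: seen, -1: end of table */
};

/* Mark the first still-unsolved entry that matches the reported item.
   Returns 0 if an entry matched, 1 if the item was unexpected. */
static int
sketch_match (struct sketch_solution *sd,
              int type,
              const char *data,
              size_t data_len)
{
  for (unsigned int i = 0; -1 != sd[i].solved; i++)
    {
      if ( (0 != sd[i].solved) ||
           (sd[i].type != type) ||
           (sd[i].data_len != data_len) ||
           (0 != memcmp (sd[i].data, data, data_len)) )
        continue;
      sd[i].solved = 1;
      return 0;
    }
  return 1;
}

/* Returns 1 if every entry before the sentinel has been seen, else 0. */
static int
sketch_all_solved (const struct sketch_solution *sd)
{
  for (unsigned int i = 0; -1 != sd[i].solved; i++)
    if (0 == sd[i].solved)
      return 0;
  return 1;
}
```

Comparing the length as well as the bytes is what makes the "+ 1" in the expected lengths matter: an item reported without its trailing NUL would not match an entry that counts it.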

Added: Extractor/src/plugins/test_wav.c
===================================================================
--- Extractor/src/plugins/test_wav.c                            (rev 0)
+++ Extractor/src/plugins/test_wav.c    2012-08-07 19:56:09 UTC (rev 23156)
@@ -0,0 +1,91 @@
+/*
+     This file is part of libextractor.
+     (C) 2012 Vidyut Samanta and Christian Grothoff
+
+     libextractor is free software; you can redistribute it and/or modify
+     it under the terms of the GNU General Public License as published
+     by the Free Software Foundation; either version 3, or (at your
+     option) any later version.
+
+     libextractor is distributed in the hope that it will be useful, but
+     WITHOUT ANY WARRANTY; without even the implied warranty of
+     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+     General Public License for more details.
+
+     You should have received a copy of the GNU General Public License
+     along with libextractor; see the file COPYING.  If not, write to the
+     Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+     Boston, MA 02111-1307, USA.
+*/
+/**
+ * @file plugins/test_wav.c
+ * @brief testcase for wav plugin
+ * @author Christian Grothoff
+ */
+#include "platform.h"
+#include "test_lib.h"
+
+
+
+/**
+ * Main function for the WAV testcase.
+ *
+ * @param argc number of arguments (ignored)
+ * @param argv arguments (ignored)
+ * @return 0 on success
+ */
+int
+main (int argc, char *argv[])
+{
+  struct SolutionData wav_noise_sol[] =
+    {
+      { 
+       EXTRACTOR_METATYPE_MIMETYPE,
+       EXTRACTOR_METAFORMAT_UTF8,
+       "text/plain",
+       "audio/x-wav",
+       strlen ("audio/x-wav") + 1,
+       0 
+      },
+      { 
+       EXTRACTOR_METATYPE_RESOURCE_TYPE,
+       EXTRACTOR_METAFORMAT_UTF8,
+       "text/plain",
+       "1000 ms, 48000 Hz, mono",
+       strlen ("1000 ms, 48000 Hz, mono") + 1,
+       0 
+      },
+      { 0, 0, NULL, NULL, 0, -1 }
+    };
+  struct SolutionData wav_alert_sol[] =
+    {
+      { 
+       EXTRACTOR_METATYPE_MIMETYPE,
+       EXTRACTOR_METAFORMAT_UTF8,
+       "text/plain",
+       "audio/x-wav",
+       strlen ("audio/x-wav") + 1,
+       0 
+      },
+      { 
+       EXTRACTOR_METATYPE_RESOURCE_TYPE,
+       EXTRACTOR_METAFORMAT_UTF8,
+       "text/plain",
+       "525 ms, 22050 Hz, mono",
+       strlen ("525 ms, 22050 Hz, mono") + 1,
+       0 
+      },
+      { 0, 0, NULL, NULL, 0, -1 }
+    };
+  struct ProblemSet ps[] =
+    {
+      { "testdata/wav_noise.wav",
+       wav_noise_sol },
+      { "testdata/wav_alert.wav",
+       wav_alert_sol },
+      { NULL, NULL }
+    };
+  return ET_main ("wav", ps);
+}
+
+/* end of test_wav.c */

Copied: Extractor/src/plugins/testdata/jpeg_image.jpg (from rev 23141, Extractor/test/test.jpg)
===================================================================
(Binary files differ)

Added: Extractor/src/plugins/testdata/wav_alert.wav
===================================================================
(Binary files differ)


Property changes on: Extractor/src/plugins/testdata/wav_alert.wav
___________________________________________________________________
Added: svn:mime-type
   + application/octet-stream

Added: Extractor/src/plugins/testdata/wav_noise.wav
===================================================================
(Binary files differ)


Property changes on: Extractor/src/plugins/testdata/wav_noise.wav
___________________________________________________________________
Added: svn:mime-type
   + application/octet-stream

Deleted: Extractor/src/plugins/translitextractor.c
===================================================================
--- Extractor/src/plugins/translitextractor.c   2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/src/plugins/translitextractor.c   2012-08-07 19:56:09 UTC (rev 23156)
@@ -1,1076 +0,0 @@
-/*
-     This file is part of libextractor.
-     (C) 2002 - 2005 Vidyut Samanta and Christian Grothoff
-
-     libextractor is free software; you can redistribute it and/or modify
-     it under the terms of the GNU General Public License as published
-     by the Free Software Foundation; either version 2, or (at your
-     option) any later version.
-
-     libextractor is distributed in the hope that it will be useful, but
-     WITHOUT ANY WARRANTY; without even the implied warranty of
-     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-     General Public License for more details.
-
-     You should have received a copy of the GNU General Public License
-     along with libextractor; see the file COPYING.  If not, write to the
-     Free Software Foundation, Inc., 59 Temple Place - Suite 330,
-     Boston, MA 02111-1307, USA.
- */  
-  
-/**
- * @brief Transliterate keywords that contain international characters
- * @author Nils Durner
- */ 
-  
-#include "platform.h"
-#include "extractor.h"
-#include "convert.h"
-  
-/* Language independent chars were taken from glibc's locale/C-translit.h.in
- * 
- * This extractor uses two tables: one contains the Unicode
- * characters and the other one contains the transliterations (since
- * transliterations are often used more than once: ä -> ae, æ -> ae).
- * The first table points to an appropriate transliteration stored in the
- * second table.
- * 
- * To generate the two tables, a relational database was prepared:
- *  create table TBL(UNI varchar(20), TRANSL varchar(10), TRANSLID integer);
- *  create table TRANSL (TRANSL varchar(20) primary key, TRANSLID integer);
- * 
- * After that, the data from glibc was converted to a SQL script using
- * "awk -F '\t'":
- *   {
- *     transl = $2;
- *     gsub(/'/, "''", transl);
- *     print "insert into TBL(UNI, TRANSL) values ('0x" substr($3, 6, index($3, ">") - 6) "', '" transl "');";
- *     print "insert into TRANSL(TRANSL, TRANSLID) values ('" transl "', (Select count(*) from TRANSL));";
- *   }
- * 
- * Then the SQL script was executed, "commit"ted and the relation between the
- * two tables established using:
- *   update TBL Set TRANSLID = (Select TRANSLID from TRANSL where TRANSL.TRANSL = TBL.TRANSL);
- *   commit;
- * 
- * The C arrays were then created with:
- *   Select '{' || UNI || ', ' || TRANSLID || '},' from TBL order by UNI;
- *   Select TRANSL || ', '  from TRANSL order by TRANSLID;
- * and reformatted with:
- *   {
- *     a = $0;
- *     getline;
- *     b = $0;
- *     getline;
- *     c = $0;
- *     getline;
- *     printf("%s %s %s %s\n", a, b, c, $0);
- *   }
- * 
- * The unicode values for the other characters were taken from
- *   http://bigfield.ddo.jp/unicode/unicode0.html
- */ 
-
-unsigned int chars[][2] = { 
-    {0x00C4, 444}, {0x00D6, 445}, {0x00DC, 446}, {0x00DF, 13},
-  /* Ä, Ö, Ü, ß */ 
-{0x00E4, 14}, {0x00F6, 19}, {0x00FC, 447}, {0x00C5, 448}, /* ä, ö, ü, Å */ 
-{0x00E5, 449}, {0x00C6, 444}, {0x00E6, 14}, {0x00D8, 445}, /* å, Æ, æ, Ø */ 
-{0x00F8, 19}, {0x00C0, 419}, {0x00C8, 77}, {0x00D9, 426}, /* ø, À, È, Ù */ 
-{0x00E0, 431}, {0x00E8, 76}, {0x00F9, 5}, {0x00C9, 77}, /* à, è, ù, É */ 
-{0x00E9, 76}, {0x00C2, 419}, {0x00CA, 77}, {0x00CE, 63}, /* é, Â, Ê, Î */ 
-{0x00D4, 423}, {0x00DB, 426}, {0x00E2, 431}, {0x00EA, 76}, /* Ô, Û, â, ê */ 
-{0x00EE, 80}, {0x00F4, 41}, {0x00FB, 5}, {0x00CB, 77}, /* î, ô, û, Ë */ 
-{0x00CF, 63}, {0x00EB, 76}, {0x00EF, 80}, {0x00C7, 57}, /* Ï, ë, ï, Ç */ 
-{0x00E7, 118}, {0x0152, 445}, {0x0053, 19}, {0x0080, 66}, /* �, �, �, � */ 
-  
-  /* Language independent */ 
-{0xFB00, 391}, {0xFB01, 392}, {0xFB02, 393}, {0xFB03, 394}, 
-  {0xFB04, 395}, {0xFB06, 396}, {0xFB29, 40}, {0xFEFF, 36}, 
-  {0xFE4D, 33}, {0xFE4E, 33}, {0xFE4F, 33}, {0xFE5A, 401}, 
-  {0xFE5B, 402}, {0xFE5C, 403}, {0xFE5F, 404}, {0xFE50, 6}, 
-  {0xFE52, 42}, {0xFE54, 397}, {0xFE55, 34}, {0xFE56, 398}, 
-  {0xFE57, 399}, {0xFE59, 400}, {0xFE6A, 407}, {0xFE6B, 408}, 
-  {0xFE60, 405}, {0xFE61, 128}, {0xFE62, 40}, {0xFE63, 3}, 
-  {0xFE64, 47}, {0xFE65, 48}, {0xFE66, 262}, {0xFE68, 127}, 
-  {0xFE69, 406}, {0xFF0A, 128}, {0xFF0B, 40}, {0xFF0C, 6}, 
-  {0xFF0D, 3}, {0xFF0E, 42}, {0xFF0F, 126}, {0xFF01, 399}, 
-  {0xFF02, 38}, {0xFF03, 404}, {0xFF04, 406}, {0xFF05, 407}, 
-  {0xFF06, 405}, {0xFF07, 30}, {0xFF08, 400}, {0xFF09, 401}, 
-  {0xFF1A, 34}, {0xFF1B, 397}, {0xFF1C, 47}, {0xFF1D, 262}, 
-  {0xFF1E, 48}, {0xFF1F, 398}, {0xFF10, 409}, {0xFF11, 410}, 
-  {0xFF12, 411}, {0xFF13, 412}, {0xFF14, 413}, {0xFF15, 414}, 
-  {0xFF16, 415}, {0xFF17, 416}, {0xFF18, 417}, {0xFF19, 418}, 
-  {0xFF2A, 421}, {0xFF2B, 422}, {0xFF2C, 64}, {0xFF2D, 79}, 
-  {0xFF2E, 66}, {0xFF2F, 423}, {0xFF20, 408}, {0xFF21, 419}, 
-  {0xFF22, 75}, {0xFF23, 57}, {0xFF24, 81}, {0xFF25, 77}, 
-  {0xFF26, 78}, {0xFF27, 420}, {0xFF28, 61}, {0xFF29, 63}, 
-  {0xFF3A, 73}, {0xFF3B, 429}, {0xFF3C, 127}, {0xFF3D, 430}, 
-  {0xFF3E, 31}, {0xFF3F, 33}, {0xFF30, 68}, {0xFF31, 69}, 
-  {0xFF32, 70}, {0xFF33, 424}, {0xFF34, 425}, {0xFF35, 426}, 
-  {0xFF36, 100}, {0xFF37, 427}, {0xFF38, 105}, {0xFF39, 428}, 
-  {0xFF4A, 83}, {0xFF4B, 434}, {0xFF4C, 65}, {0xFF4D, 119}, 
-  {0xFF4E, 435}, {0xFF4F, 41}, {0xFF40, 32}, {0xFF41, 431}, 
-  {0xFF42, 432}, {0xFF43, 118}, {0xFF44, 82}, {0xFF45, 76}, 
-  {0xFF46, 433}, {0xFF47, 60}, {0xFF48, 62}, {0xFF49, 80}, 
-  {0xFF5A, 442}, {0xFF5B, 402}, {0xFF5C, 129}, {0xFF5D, 403}, 
-  {0xFF5E, 35}, {0xFF50, 436}, {0xFF51, 437}, {0xFF52, 438}, 
-  {0xFF53, 20}, {0xFF54, 439}, {0xFF55, 5}, {0xFF56, 111}, 
-  {0xFF57, 440}, {0xFF58, 12}, {0xFF59, 441}, {0x00AB, 2}, 
-  {0x00AD, 3}, {0x00AE, 4}, {0x00A0, 0}, {0x00A9, 1}, 
-  {0x00BB, 7}, {0x00BC, 8}, {0x00BD, 9}, {0x00BE, 10}, 
-  {0x00B5, 5}, {0x00B8, 6}, {0x00C6, 11}, {0x00DF, 13}, 
-  {0x00D7, 12}, {0x00E6, 14}, {0x0001D4AA, 423}, {0x0001D4AB, 68}, 
-  {0x0001D4AC, 69}, {0x0001D4AE, 424}, {0x0001D4AF, 425}, {0x0001D4A2, 420}, 
-  {0x0001D4A5, 421}, {0x0001D4A6, 422}, {0x0001D4A9, 66}, {0x0001D4BB, 433}, 
-  {0x0001D4BD, 62}, {0x0001D4BE, 80}, {0x0001D4BF, 83}, {0x0001D4B0, 426}, 
-  {0x0001D4B1, 100}, {0x0001D4B2, 427}, {0x0001D4B3, 105}, {0x0001D4B4, 428},
-  
-{0x0001D4B5, 73}, {0x0001D4B6, 431}, {0x0001D4B7, 432}, {0x0001D4B8, 118},
-  
-{0x0001D4B9, 82}, {0x0001D4CA, 5}, {0x0001D4CB, 111}, {0x0001D4CC, 440},
-  
-{0x0001D4CD, 12}, {0x0001D4CE, 441}, {0x0001D4CF, 442}, {0x0001D4C0, 434},
-  
-{0x0001D4C2, 119}, {0x0001D4C3, 435}, {0x0001D4C5, 436}, {0x0001D4C6, 437},
-  
-{0x0001D4C7, 438}, {0x0001D4C8, 20}, {0x0001D4C9, 439}, {0x0001D4DA, 422},
-  
-{0x0001D4DB, 64}, {0x0001D4DC, 79}, {0x0001D4DD, 66}, {0x0001D4DE, 423},
-  
-{0x0001D4DF, 68}, {0x0001D4D0, 419}, {0x0001D4D1, 75}, {0x0001D4D2, 57},
-  
-{0x0001D4D3, 81}, {0x0001D4D4, 77}, {0x0001D4D5, 78}, {0x0001D4D6, 420},
-  
-{0x0001D4D7, 61}, {0x0001D4D8, 63}, {0x0001D4D9, 421}, {0x0001D4EA, 431},
-  
-{0x0001D4EB, 432}, {0x0001D4EC, 118}, {0x0001D4ED, 82}, {0x0001D4EE, 76},
-  
-{0x0001D4EF, 433}, {0x0001D4E0, 69}, {0x0001D4E1, 70}, {0x0001D4E2, 424},
-  
-{0x0001D4E3, 425}, {0x0001D4E4, 426}, {0x0001D4E5, 100}, {0x0001D4E6, 427},
-  
-{0x0001D4E7, 105}, {0x0001D4E8, 428}, {0x0001D4E9, 73}, {0x0001D4FA, 437},
-  
-{0x0001D4FB, 438}, {0x0001D4FC, 20}, {0x0001D4FD, 439}, {0x0001D4FE, 5},
-  
-{0x0001D4FF, 111}, {0x0001D4F0, 60}, {0x0001D4F1, 62}, {0x0001D4F2, 80},
-  
-{0x0001D4F3, 83}, {0x0001D4F4, 434}, {0x0001D4F5, 65}, {0x0001D4F6, 119},
-  
-{0x0001D4F7, 435}, {0x0001D4F8, 41}, {0x0001D4F9, 436}, {0x0001D40A, 422},
-  
-{0x0001D40B, 64}, {0x0001D40C, 79}, {0x0001D40D, 66}, {0x0001D40E, 423},
-  
-{0x0001D40F, 68}, {0x0001D400, 419}, {0x0001D401, 75}, {0x0001D402, 57},
-  
-{0x0001D403, 81}, {0x0001D404, 77}, {0x0001D405, 78}, {0x0001D406, 420},
-  
-{0x0001D407, 61}, {0x0001D408, 63}, {0x0001D409, 421}, {0x0001D41A, 431},
-  
-{0x0001D41B, 432}, {0x0001D41C, 118}, {0x0001D41D, 82}, {0x0001D41E, 76},
-  
-{0x0001D41F, 433}, {0x0001D410, 69}, {0x0001D411, 70}, {0x0001D412, 424},
-  
-{0x0001D413, 425}, {0x0001D414, 426}, {0x0001D415, 100}, {0x0001D416, 427},
-  
-{0x0001D417, 105}, {0x0001D418, 428}, {0x0001D419, 73}, {0x0001D42A, 437},
-  
-{0x0001D42B, 438}, {0x0001D42C, 20}, {0x0001D42D, 439}, {0x0001D42E, 5},
-  
-{0x0001D42F, 111}, {0x0001D420, 60}, {0x0001D421, 62}, {0x0001D422, 80},
-  
-{0x0001D423, 83}, {0x0001D424, 434}, {0x0001D425, 65}, {0x0001D426, 119},
-  
-{0x0001D427, 435}, {0x0001D428, 41}, {0x0001D429, 436}, {0x0001D43A, 420},
-  
-{0x0001D43B, 61}, {0x0001D43C, 63}, {0x0001D43D, 421}, {0x0001D43E, 422},
-  
-{0x0001D43F, 64}, {0x0001D430, 440}, {0x0001D431, 12}, {0x0001D432, 441},
-  
-{0x0001D433, 442}, {0x0001D434, 419}, {0x0001D435, 75}, {0x0001D436, 57},
-  
-{0x0001D437, 81}, {0x0001D438, 77}, {0x0001D439, 78}, {0x0001D44A, 427},
-  
-{0x0001D44B, 105}, {0x0001D44C, 428}, {0x0001D44D, 73}, {0x0001D44E, 431},
-  
-{0x0001D44F, 432}, {0x0001D440, 79}, {0x0001D441, 66}, {0x0001D442, 423},
-  
-{0x0001D443, 68}, {0x0001D444, 69}, {0x0001D445, 70}, {0x0001D446, 424},
-  
-{0x0001D447, 425}, {0x0001D448, 426}, {0x0001D449, 100}, {0x0001D45A, 119},
-  
-{0x0001D45B, 435}, {0x0001D45C, 41}, {0x0001D45D, 436}, {0x0001D45E, 437},
-  
-{0x0001D45F, 438}, {0x0001D450, 118}, {0x0001D451, 82}, {0x0001D452, 76},
-  
-{0x0001D453, 433}, {0x0001D454, 60}, {0x0001D456, 80}, {0x0001D457, 83},
-  
-{0x0001D458, 434}, {0x0001D459, 65}, {0x0001D46A, 57}, {0x0001D46B, 81},
-  
-{0x0001D46C, 77}, {0x0001D46D, 78}, {0x0001D46E, 420}, {0x0001D46F, 61},
-  
-{0x0001D460, 20}, {0x0001D461, 439}, {0x0001D462, 5}, {0x0001D463, 111},
-  
-{0x0001D464, 440}, {0x0001D465, 12}, {0x0001D466, 441}, {0x0001D467, 442},
-  
-{0x0001D468, 419}, {0x0001D469, 75}, {0x0001D47A, 424}, {0x0001D47B, 425},
-  
-{0x0001D47C, 426}, {0x0001D47D, 100}, {0x0001D47E, 427}, {0x0001D47F, 105},
-  
-{0x0001D470, 63}, {0x0001D471, 421}, {0x0001D472, 422}, {0x0001D473, 64},
-  
-{0x0001D474, 79}, {0x0001D475, 66}, {0x0001D476, 423}, {0x0001D477, 68},
-  
-{0x0001D478, 69}, {0x0001D479, 70}, {0x0001D48A, 80}, {0x0001D48B, 83},
-  
-{0x0001D48C, 434}, {0x0001D48D, 65}, {0x0001D48E, 119}, {0x0001D48F, 435},
-  
-{0x0001D480, 428}, {0x0001D481, 73}, {0x0001D482, 431}, {0x0001D483, 432},
-  
-{0x0001D484, 118}, {0x0001D485, 82}, {0x0001D486, 76}, {0x0001D487, 433},
-  
-{0x0001D488, 60}, {0x0001D489, 62}, {0x0001D49A, 441}, {0x0001D49B, 442},
-  
-{0x0001D49C, 419}, {0x0001D49E, 57}, {0x0001D49F, 81}, {0x0001D490, 41},
-  
-{0x0001D491, 436}, {0x0001D492, 437}, {0x0001D493, 438}, {0x0001D494, 20},
-  
-{0x0001D495, 439}, {0x0001D496, 5}, {0x0001D497, 111}, {0x0001D498, 440},
-  
-{0x0001D499, 12}, {0x0001D5AA, 422}, {0x0001D5AB, 64}, {0x0001D5AC, 79},
-  
-{0x0001D5AD, 66}, {0x0001D5AE, 423}, {0x0001D5AF, 68}, {0x0001D5A0, 419},
-  
-{0x0001D5A1, 75}, {0x0001D5A2, 57}, {0x0001D5A3, 81}, {0x0001D5A4, 77},
-  
-{0x0001D5A5, 78}, {0x0001D5A6, 420}, {0x0001D5A7, 61}, {0x0001D5A8, 63},
-  
-{0x0001D5A9, 421}, {0x0001D5BA, 431}, {0x0001D5BB, 432}, {0x0001D5BC, 118},
-  
-{0x0001D5BD, 82}, {0x0001D5BE, 76}, {0x0001D5BF, 433}, {0x0001D5B0, 69},
-  
-{0x0001D5B1, 70}, {0x0001D5B2, 424}, {0x0001D5B3, 425}, {0x0001D5B4, 426},
-  
-{0x0001D5B5, 100}, {0x0001D5B6, 427}, {0x0001D5B7, 105}, {0x0001D5B8, 428},
-  
-{0x0001D5B9, 73}, {0x0001D5CA, 437}, {0x0001D5CB, 438}, {0x0001D5CC, 20},
-  
-{0x0001D5CD, 439}, {0x0001D5CE, 5}, {0x0001D5CF, 111}, {0x0001D5C0, 60},
-  
-{0x0001D5C1, 62}, {0x0001D5C2, 80}, {0x0001D5C3, 83}, {0x0001D5C4, 434},
-  
-{0x0001D5C5, 65}, {0x0001D5C6, 119}, {0x0001D5C7, 435}, {0x0001D5C8, 41},
-  
-{0x0001D5C9, 436}, {0x0001D5DA, 420}, {0x0001D5DB, 61}, {0x0001D5DC, 63},
-  
-{0x0001D5DD, 421}, {0x0001D5DE, 422}, {0x0001D5DF, 64}, {0x0001D5D0, 440},
-  
-{0x0001D5D1, 12}, {0x0001D5D2, 441}, {0x0001D5D3, 442}, {0x0001D5D4, 419},
-  
-{0x0001D5D5, 75}, {0x0001D5D6, 57}, {0x0001D5D7, 81}, {0x0001D5D8, 77},
-  
-{0x0001D5D9, 78}, {0x0001D5EA, 427}, {0x0001D5EB, 105}, {0x0001D5EC, 428},
-  
-{0x0001D5ED, 73}, {0x0001D5EE, 431}, {0x0001D5EF, 432}, {0x0001D5E0, 79},
-  
-{0x0001D5E1, 66}, {0x0001D5E2, 423}, {0x0001D5E3, 68}, {0x0001D5E4, 69},
-  
-{0x0001D5E5, 70}, {0x0001D5E6, 424}, {0x0001D5E7, 425}, {0x0001D5E8, 426},
-  
-{0x0001D5E9, 100}, {0x0001D5FA, 119}, {0x0001D5FB, 435}, {0x0001D5FC, 41},
-  
-{0x0001D5FD, 436}, {0x0001D5FE, 437}, {0x0001D5FF, 438}, {0x0001D5F0, 118},
-  
-{0x0001D5F1, 82}, {0x0001D5F2, 76}, {0x0001D5F3, 433}, {0x0001D5F4, 60},
-  
-{0x0001D5F5, 62}, {0x0001D5F6, 80}, {0x0001D5F7, 83}, {0x0001D5F8, 434},
-  
-{0x0001D5F9, 65}, {0x0001D50A, 420}, {0x0001D50D, 421}, {0x0001D50E, 422},
-  
-{0x0001D50F, 64}, {0x0001D500, 440}, {0x0001D501, 12}, {0x0001D502, 441},
-  
-{0x0001D503, 442}, {0x0001D504, 419}, {0x0001D505, 75}, {0x0001D507, 81},
-  
-{0x0001D508, 77}, {0x0001D509, 78}, {0x0001D51A, 427}, {0x0001D51B, 105},
-  
-{0x0001D51C, 428}, {0x0001D51E, 431}, {0x0001D51F, 432}, {0x0001D510, 79},
-  
-{0x0001D511, 66}, {0x0001D512, 423}, {0x0001D513, 68}, {0x0001D514, 69},
-  
-{0x0001D516, 424}, {0x0001D517, 425}, {0x0001D518, 426}, {0x0001D519, 100},
-  
-{0x0001D52A, 119}, {0x0001D52B, 435}, {0x0001D52C, 41}, {0x0001D52D, 436},
-  
-{0x0001D52E, 437}, {0x0001D52F, 438}, {0x0001D520, 118}, {0x0001D521, 82},
-  
-{0x0001D522, 76}, {0x0001D523, 433}, {0x0001D524, 60}, {0x0001D525, 62},
-  
-{0x0001D526, 80}, {0x0001D527, 83}, {0x0001D528, 434}, {0x0001D529, 65},
-  
-{0x0001D53B, 81}, {0x0001D53C, 77}, {0x0001D53D, 78}, {0x0001D53E, 420},
-  
-{0x0001D530, 20}, {0x0001D531, 439}, {0x0001D532, 5}, {0x0001D533, 111},
-  
-{0x0001D534, 440}, {0x0001D535, 12}, {0x0001D536, 441}, {0x0001D537, 442},
-  
-{0x0001D538, 419}, {0x0001D539, 75}, {0x0001D54A, 424}, {0x0001D54B, 425},
-  
-{0x0001D54C, 426}, {0x0001D54D, 100}, {0x0001D54E, 427}, {0x0001D54F, 105},
-  
-{0x0001D540, 63}, {0x0001D541, 421}, {0x0001D542, 422}, {0x0001D543, 64},
-  
-{0x0001D544, 79}, {0x0001D546, 423}, {0x0001D55A, 80}, {0x0001D55B, 83},
-  
-{0x0001D55C, 434}, {0x0001D55D, 65}, {0x0001D55E, 119}, {0x0001D55F, 435},
-  
-{0x0001D550, 428}, {0x0001D552, 431}, {0x0001D553, 432}, {0x0001D554, 118},
-  
-{0x0001D555, 82}, {0x0001D556, 76}, {0x0001D557, 433}, {0x0001D558, 60},
-  
-{0x0001D559, 62}, {0x0001D56A, 441}, {0x0001D56B, 442}, {0x0001D56C, 419},
-  
-{0x0001D56D, 75}, {0x0001D56E, 57}, {0x0001D56F, 81}, {0x0001D560, 41},
-  
-{0x0001D561, 436}, {0x0001D562, 437}, {0x0001D563, 438}, {0x0001D564, 20},
-  
-{0x0001D565, 439}, {0x0001D566, 5}, {0x0001D567, 111}, {0x0001D568, 440},
-  
-{0x0001D569, 12}, {0x0001D57A, 423}, {0x0001D57B, 68}, {0x0001D57C, 69},
-  
-{0x0001D57D, 70}, {0x0001D57E, 424}, {0x0001D57F, 425}, {0x0001D570, 77},
-  
-{0x0001D571, 78}, {0x0001D572, 420}, {0x0001D573, 61}, {0x0001D574, 63},
-  
-{0x0001D575, 421}, {0x0001D576, 422}, {0x0001D577, 64}, {0x0001D578, 79},
-  
-{0x0001D579, 66}, {0x0001D58A, 76}, {0x0001D58B, 433}, {0x0001D58C, 60},
-  
-{0x0001D58D, 62}, {0x0001D58E, 80}, {0x0001D58F, 83}, {0x0001D580, 426},
-  
-{0x0001D581, 100}, {0x0001D582, 427}, {0x0001D583, 105}, {0x0001D584, 428},
-  
-{0x0001D585, 73}, {0x0001D586, 431}, {0x0001D587, 432}, {0x0001D588, 118},
-  
-{0x0001D589, 82}, {0x0001D59A, 5}, {0x0001D59B, 111}, {0x0001D59C, 440},
-  
-{0x0001D59D, 12}, {0x0001D59E, 441}, {0x0001D59F, 442}, {0x0001D590, 434},
-  
-{0x0001D591, 65}, {0x0001D592, 119}, {0x0001D593, 435}, {0x0001D594, 41},
-  
-{0x0001D595, 436}, {0x0001D596, 437}, {0x0001D597, 438}, {0x0001D598, 20},
-  
-{0x0001D599, 439}, {0x0001D6A0, 440}, {0x0001D6A1, 12}, {0x0001D6A2, 441},
-  
-{0x0001D6A3, 442}, {0x0001D60A, 57}, {0x0001D60B, 81}, {0x0001D60C, 77},
-  
-{0x0001D60D, 78}, {0x0001D60E, 420}, {0x0001D60F, 61}, {0x0001D600, 20},
-  
-{0x0001D601, 439}, {0x0001D602, 5}, {0x0001D603, 111}, {0x0001D604, 440},
-  
-{0x0001D605, 12}, {0x0001D606, 441}, {0x0001D607, 442}, {0x0001D608, 419},
-  
-{0x0001D609, 75}, {0x0001D61A, 424}, {0x0001D61B, 425}, {0x0001D61C, 426},
-  
-{0x0001D61D, 100}, {0x0001D61E, 427}, {0x0001D61F, 105}, {0x0001D610, 63},
-  
-{0x0001D611, 421}, {0x0001D612, 422}, {0x0001D613, 64}, {0x0001D614, 79},
-  
-{0x0001D615, 66}, {0x0001D616, 423}, {0x0001D617, 68}, {0x0001D618, 69},
-  
-{0x0001D619, 70}, {0x0001D62A, 80}, {0x0001D62B, 83}, {0x0001D62C, 434},
-  
-{0x0001D62D, 65}, {0x0001D62E, 119}, {0x0001D62F, 435}, {0x0001D620, 428},
-  
-{0x0001D621, 73}, {0x0001D622, 431}, {0x0001D623, 432}, {0x0001D624, 118},
-  
-{0x0001D625, 82}, {0x0001D626, 76}, {0x0001D627, 433}, {0x0001D628, 60},
-  
-{0x0001D629, 62}, {0x0001D63A, 441}, {0x0001D63B, 442}, {0x0001D63C, 419},
-  
-{0x0001D63D, 75}, {0x0001D63E, 57}, {0x0001D63F, 81}, {0x0001D630, 41},
-  
-{0x0001D631, 436}, {0x0001D632, 437}, {0x0001D633, 438}, {0x0001D634, 20},
-  
-{0x0001D635, 439}, {0x0001D636, 5}, {0x0001D637, 111}, {0x0001D638, 440},
-  
-{0x0001D639, 12}, {0x0001D64A, 423}, {0x0001D64B, 68}, {0x0001D64C, 69},
-  
-{0x0001D64D, 70}, {0x0001D64E, 424}, {0x0001D64F, 425}, {0x0001D640, 77},
-  
-{0x0001D641, 78}, {0x0001D642, 420}, {0x0001D643, 61}, {0x0001D644, 63},
-  
-{0x0001D645, 421}, {0x0001D646, 422}, {0x0001D647, 64}, {0x0001D648, 79},
-  
-{0x0001D649, 66}, {0x0001D65A, 76}, {0x0001D65B, 433}, {0x0001D65C, 60},
-  
-{0x0001D65D, 62}, {0x0001D65E, 80}, {0x0001D65F, 83}, {0x0001D650, 426},
-  
-{0x0001D651, 100}, {0x0001D652, 427}, {0x0001D653, 105}, {0x0001D654, 428},
-  
-{0x0001D655, 73}, {0x0001D656, 431}, {0x0001D657, 432}, {0x0001D658, 118},
-  
-{0x0001D659, 82}, {0x0001D66A, 5}, {0x0001D66B, 111}, {0x0001D66C, 440},
-  
-{0x0001D66D, 12}, {0x0001D66E, 441}, {0x0001D66F, 442}, {0x0001D660, 434},
-  
-{0x0001D661, 65}, {0x0001D662, 119}, {0x0001D663, 435}, {0x0001D664, 41},
-  
-{0x0001D665, 436}, {0x0001D666, 437}, {0x0001D667, 438}, {0x0001D668, 20},
-  
-{0x0001D669, 439}, {0x0001D67A, 422}, {0x0001D67B, 64}, {0x0001D67C, 79},
-  
-{0x0001D67D, 66}, {0x0001D67E, 423}, {0x0001D67F, 68}, {0x0001D670, 419},
-  
-{0x0001D671, 75}, {0x0001D672, 57}, {0x0001D673, 81}, {0x0001D674, 77},
-  
-{0x0001D675, 78}, {0x0001D676, 420}, {0x0001D677, 61}, {0x0001D678, 63},
-  
-{0x0001D679, 421}, {0x0001D68A, 431}, {0x0001D68B, 432}, {0x0001D68C, 118},
-  
-{0x0001D68D, 82}, {0x0001D68E, 76}, {0x0001D68F, 433}, {0x0001D680, 69},
-  
-{0x0001D681, 70}, {0x0001D682, 424}, {0x0001D683, 425}, {0x0001D684, 426},
-  
-{0x0001D685, 100}, {0x0001D686, 427}, {0x0001D687, 105}, {0x0001D688, 428},
-  
-{0x0001D689, 73}, {0x0001D69A, 437}, {0x0001D69B, 438}, {0x0001D69C, 20},
-  
-{0x0001D69D, 439}, {0x0001D69E, 5}, {0x0001D69F, 111}, {0x0001D690, 60},
-  
-{0x0001D691, 62}, {0x0001D692, 80}, {0x0001D693, 83}, {0x0001D694, 434},
-  
-{0x0001D695, 65}, {0x0001D696, 119}, {0x0001D697, 435}, {0x0001D698, 41},
-  
-{0x0001D699, 436}, {0x0001D7CE, 409}, {0x0001D7CF, 410}, {0x0001D7DA, 411},
-  
-{0x0001D7DB, 412}, {0x0001D7DC, 413}, {0x0001D7DD, 414}, {0x0001D7DE, 415},
-  
-{0x0001D7DF, 416}, {0x0001D7D0, 411}, {0x0001D7D1, 412}, {0x0001D7D2, 413},
-  
-{0x0001D7D3, 414}, {0x0001D7D4, 415}, {0x0001D7D5, 416}, {0x0001D7D6, 417},
-  
-{0x0001D7D7, 418}, {0x0001D7D8, 409}, {0x0001D7D9, 410}, {0x0001D7EA, 417},
-  
-{0x0001D7EB, 418}, {0x0001D7EC, 409}, {0x0001D7ED, 410}, {0x0001D7EE, 411},
-  
-{0x0001D7EF, 412}, {0x0001D7E0, 417}, {0x0001D7E1, 418}, {0x0001D7E2, 409},
-  
-{0x0001D7E3, 410}, {0x0001D7E4, 411}, {0x0001D7E5, 412}, {0x0001D7E6, 413},
-  
-{0x0001D7E7, 414}, {0x0001D7E8, 415}, {0x0001D7E9, 416}, {0x0001D7FA, 413},
-  
-{0x0001D7FB, 414}, {0x0001D7FC, 415}, {0x0001D7FD, 416}, {0x0001D7FE, 417},
-  
-{0x0001D7FF, 418}, {0x0001D7F0, 413}, {0x0001D7F1, 414}, {0x0001D7F2, 415},
-  
-{0x0001D7F3, 416}, {0x0001D7F4, 417}, {0x0001D7F5, 418}, {0x0001D7F6, 409},
-  
-{0x0001D7F7, 410}, {0x0001D7F8, 411}, {0x0001D7F9, 412}, {0x01CA, 24},
-  
-{0x01CB, 25}, {0x01CC, 26}, {0x01C7, 21}, {0x01C8, 22}, 
-{0x01C9, 23},
-  {0x01F1, 27}, {0x01F2, 28}, {0x01F3, 29}, 
-{0x0132, 15}, {0x0133, 16},
-  {0x0149, 17}, {0x0152, 18}, 
-{0x0152, 18}, {0x0153, 19}, {0x0153, 19},
-  {0x017F, 20}, 
-{0x02BC, 30}, {0x02CB, 32}, {0x02CD, 33}, {0x02C6, 31},
-  
-{0x02C8, 30}, {0x02DC, 35}, {0x02D0, 34}, {0x2A74, 259}, 
-{0x2A75, 260},
-  {0x2A76, 261}, {0x20AC, 54}, {0x20A8, 53}, 
-{0x200A, 0}, {0x200B, 36},
-  {0x2002, 0}, {0x2003, 0}, 
-{0x2004, 0}, {0x2005, 0}, {0x2006, 0}, {0x2008,
-                                                                     0},
-  
-{0x2009, 0}, {0x201A, 6}, {0x201B, 30}, {0x201C, 38}, 
-{0x201D, 38},
-  {0x201E, 39}, {0x201F, 38}, {0x2010, 3}, 
-{0x2011, 3}, {0x2012, 3}, {0x2013,
-                                                                       3},
-  {0x2014, 37}, 
-{0x2015, 3}, {0x2018, 30}, {0x2019, 30}, {0x202F, 0},
-  
-{0x2020, 40}, {0x2022, 41}, {0x2024, 42}, {0x2025, 43}, 
-{0x2026, 44},
-  {0x203A, 48}, {0x203C, 49}, {0x2035, 32}, 
-{0x2036, 45}, {0x2037, 46},
-  {0x2039, 47}, {0x2047, 50}, 
-{0x2048, 51}, {0x2049, 52}, {0x205F, 0},
-  {0x2060, 36}, 
-{0x2061, 36}, {0x2062, 36}, {0x2063, 36}, {0x21D0, 123},
-  
-{0x21D2, 124}, {0x21D4, 125}, {0x210A, 60}, {0x210B, 61}, 
-{0x210C, 61},
-  {0x210D, 61}, {0x210E, 62}, {0x2100, 55}, 
-{0x2101, 56}, {0x2102, 57},
-  {0x2105, 58}, {0x2106, 59}, 
-{0x211A, 69}, {0x211B, 70}, {0x211C, 70},
-  {0x211D, 70}, 
-{0x2110, 63}, {0x2111, 63}, {0x2112, 64}, {0x2113, 65},
-  
-{0x2115, 66}, {0x2116, 67}, {0x2119, 68}, {0x212C, 75}, 
-{0x212D, 57},
-  {0x212E, 76}, {0x212F, 76}, {0x2121, 71}, 
-{0x2122, 72}, {0x2124, 73},
-  {0x2126, 74}, {0x2128, 73}, 
-{0x2130, 77}, {0x2131, 78}, {0x2133, 79},
-  {0x2134, 41}, 
-{0x2139, 80}, {0x2145, 81}, {0x2146, 82}, {0x2147, 76},
-  
-{0x2148, 80}, {0x2149, 83}, {0x215A, 91}, {0x215B, 92}, 
-{0x215C, 93},
-  {0x215D, 94}, {0x215E, 95}, {0x215F, 96}, 
-{0x2153, 84}, {0x2154, 85},
-  {0x2155, 86}, {0x2156, 87}, 
-{0x2157, 88}, {0x2158, 89}, {0x2159, 90},
-  {0x216A, 106}, 
-{0x216B, 107}, {0x216C, 64}, {0x216D, 57}, {0x216E, 81},
-  
-{0x216F, 79}, {0x2160, 63}, {0x2161, 97}, {0x2162, 98}, 
-{0x2163, 99},
-  {0x2164, 100}, {0x2165, 101}, {0x2166, 102}, 
-{0x2167, 103}, {0x2168, 104},
-  {0x2169, 105}, {0x217A, 116}, 
-{0x217B, 117}, {0x217C, 65}, {0x217D, 118},
-  {0x217E, 82}, 
-{0x217F, 119}, {0x2170, 80}, {0x2171, 108}, {0x2172, 109},
-  
-{0x2173, 110}, {0x2174, 111}, {0x2175, 112}, {0x2176, 113}, 
-{0x2177, 114},
-  {0x2178, 115}, {0x2179, 12}, {0x2190, 120}, 
-{0x2192, 121}, {0x2194, 122},
-  {0x22D8, 131}, {0x22D9, 132}, 
-{0x2212, 3}, {0x2215, 126}, {0x2216, 127},
-  {0x2217, 128}, 
-{0x2223, 129}, {0x223C, 35}, {0x2236, 34}, {0x226A, 2},
-  
-{0x226B, 7}, {0x2264, 123}, {0x2265, 130}, {0x24AA, 222}, 
-{0x24AB, 223},
-  {0x24AC, 224}, {0x24AD, 225}, {0x24AE, 226}, 
-{0x24AF, 227}, {0x24A0, 212},
-  {0x24A1, 213}, {0x24A2, 214}, 
-{0x24A3, 215}, {0x24A4, 216}, {0x24A5, 217},
-  {0x24A6, 218}, 
-{0x24A7, 219}, {0x24A8, 220}, {0x24A9, 221}, {0x24BA, 237},
-  
-{0x24BB, 238}, {0x24BC, 239}, {0x24BD, 240}, {0x24BE, 241}, 
-{0x24BF, 242},
-  {0x24B0, 228}, {0x24B1, 229}, {0x24B2, 230}, 
-{0x24B3, 231}, {0x24B4, 232},
-  {0x24B5, 233}, {0x24B6, 234}, 
-{0x24B7, 235}, {0x24B8, 1}, {0x24B9, 236},
-  {0x24CA, 252}, 
-{0x24CB, 253}, {0x24CC, 254}, {0x24CD, 255}, {0x24CE, 256},
-  
-{0x24CF, 257}, {0x24C0, 243}, {0x24C1, 244}, {0x24C2, 245}, 
-{0x24C3, 246},
-  {0x24C4, 247}, {0x24C5, 248}, {0x24C6, 249}, 
-{0x24C7, 4}, {0x24C8, 250},
-  {0x24C9, 251}, {0x24DA, 218}, 
-{0x24DB, 219}, {0x24DC, 220}, {0x24DD, 221},
-  {0x24DE, 222}, 
-{0x24DF, 223}, {0x24D0, 208}, {0x24D1, 209}, {0x24D2, 210},
-  
-{0x24D3, 211}, {0x24D4, 212}, {0x24D5, 213}, {0x24D6, 214}, 
-{0x24D7, 215},
-  {0x24D8, 216}, {0x24D9, 217}, {0x24EA, 258}, 
-{0x24E0, 224}, {0x24E1, 225},
-  {0x24E2, 226}, {0x24E3, 227}, 
-{0x24E4, 228}, {0x24E5, 229}, {0x24E6, 230},
-  {0x24E7, 231}, 
-{0x24E8, 232}, {0x24E9, 233}, {0x240A, 143}, {0x240B, 144},
-  
-{0x240C, 145}, {0x240D, 146}, {0x240E, 147}, {0x240F, 148}, 
-{0x2400, 133},
-  {0x2401, 134}, {0x2402, 135}, {0x2403, 136}, 
-{0x2404, 137}, {0x2405, 138},
-  {0x2406, 139}, {0x2407, 140}, 
-{0x2408, 141}, {0x2409, 142}, {0x241A, 159},
-  {0x241B, 160}, 
-{0x241C, 161}, {0x241D, 162}, {0x241E, 163}, {0x241F, 164},
-  
-{0x2410, 149}, {0x2411, 150}, {0x2412, 151}, {0x2413, 152}, 
-{0x2414, 153},
-  {0x2415, 154}, {0x2416, 155}, {0x2417, 156}, 
-{0x2418, 157}, {0x2419, 158},
-  {0x2420, 165}, {0x2421, 166}, 
-{0x2423, 33}, {0x2424, 167}, {0x246A, 178},
-  {0x246B, 179}, 
-{0x246C, 180}, {0x246D, 181}, {0x246E, 182}, {0x246F, 183},
-  
-{0x2460, 168}, {0x2461, 169}, {0x2462, 170}, {0x2463, 171}, 
-{0x2464, 172},
-  {0x2465, 173}, {0x2466, 174}, {0x2467, 175}, 
-{0x2468, 176}, {0x2469, 177},
-  {0x247A, 174}, {0x247B, 175}, 
-{0x247C, 176}, {0x247D, 177}, {0x247E, 178},
-  {0x247F, 179}, 
-{0x2470, 184}, {0x2471, 185}, {0x2472, 186}, {0x2473, 187},
-  
-{0x2474, 168}, {0x2475, 169}, {0x2476, 170}, {0x2477, 171}, 
-{0x2478, 172},
-  {0x2479, 173}, {0x248A, 190}, {0x248B, 191}, 
-{0x248C, 192}, {0x248D, 193},
-  {0x248E, 194}, {0x248F, 195}, 
-{0x2480, 180}, {0x2481, 181}, {0x2482, 182},
-  {0x2483, 183}, 
-{0x2484, 184}, {0x2485, 185}, {0x2486, 186}, {0x2487, 187},
-  
-{0x2488, 188}, {0x2489, 189}, {0x249A, 206}, {0x249B, 207}, 
-{0x249C, 208},
-  {0x249D, 209}, {0x249E, 210}, {0x249F, 211}, 
-{0x2490, 196}, {0x2491, 197},
-  {0x2492, 198}, {0x2493, 199}, 
-{0x2494, 200}, {0x2495, 201}, {0x2496, 202},
-  {0x2497, 203}, 
-{0x2498, 204}, {0x2499, 205}, {0x25E6, 41}, {0x250C, 40},
-  
-{0x2500, 3}, {0x2502, 129}, {0x251C, 40}, {0x2510, 40}, 
-{0x2514, 40},
-  {0x2518, 40}, {0x252C, 40}, {0x2524, 40}, 
-{0x253C, 40}, {0x2534, 40},
-  {0x30A0, 262}, {0x3000, 0}, 
-{0x32BA, 287}, {0x32BB, 288}, {0x32BC, 289},
-  {0x32BD, 290}, 
-{0x32BE, 291}, {0x32BF, 292}, {0x32B1, 278}, {0x32B2, 279},
-  
-{0x32B3, 280}, {0x32B4, 281}, {0x32B5, 282}, {0x32B6, 283}, 
-{0x32B7, 284},
-  {0x32B8, 285}, {0x32B9, 286}, {0x325A, 272}, 
-{0x325B, 273}, {0x325C, 274},
-  {0x325D, 275}, {0x325E, 276}, 
-{0x325F, 277}, {0x3251, 263}, {0x3252, 264},
-  {0x3253, 265}, 
-{0x3254, 266}, {0x3255, 267}, {0x3256, 268}, {0x3257, 269},
-  
-{0x3258, 270}, {0x3259, 271}, {0x33AA, 341}, {0x33AB, 342}, 
-{0x33AC, 343},
-  {0x33AD, 344}, {0x33AE, 345}, {0x33AF, 346}, 
-{0x33A0, 331}, {0x33A1, 332},
-  {0x33A2, 333}, {0x33A3, 334}, 
-{0x33A4, 335}, {0x33A5, 336}, {0x33A6, 337},
-  {0x33A7, 338}, 
-{0x33A8, 339}, {0x33A9, 340}, {0x33BA, 357}, {0x33BB, 358},
-  
-{0x33BC, 359}, {0x33BD, 360}, {0x33BE, 361}, {0x33BF, 362}, 
-{0x33B0, 347},
-  {0x33B1, 348}, {0x33B2, 349}, {0x33B3, 350}, 
-{0x33B4, 351}, {0x33B5, 352},
-  {0x33B6, 353}, {0x33B7, 354}, 
-{0x33B8, 355}, {0x33B9, 356}, {0x33CA, 371},
-  {0x33CB, 372}, 
-{0x33CC, 373}, {0x33CD, 374}, {0x33CE, 375}, {0x33CF, 376},
-  
-{0x33C2, 363}, {0x33C3, 364}, {0x33C4, 365}, {0x33C5, 366}, 
-{0x33C6, 367},
-  {0x33C7, 368}, {0x33C8, 369}, {0x33C9, 370}, 
-{0x33DA, 387}, {0x33DB, 388},
-  {0x33DC, 389}, {0x33DD, 390}, 
-{0x33D0, 377}, {0x33D1, 378}, {0x33D2, 379},
-  {0x33D3, 380}, 
-{0x33D4, 381}, {0x33D5, 382}, {0x33D6, 383}, {0x33D7, 384},
-  
-{0x33D8, 385}, {0x33D9, 386}, {0x3371, 293}, {0x3372, 294}, 
-{0x3373, 295},
-  {0x3374, 296}, {0x3375, 297}, {0x3376, 298}, 
-{0x338A, 309}, {0x338B, 310},
-  {0x338C, 311}, {0x338D, 312}, 
-{0x338E, 313}, {0x338F, 314}, {0x3380, 299},
-  {0x3381, 300}, 
-{0x3382, 301}, {0x3383, 302}, {0x3384, 303}, {0x3385, 304},
-  
-{0x3386, 305}, {0x3387, 306}, {0x3388, 307}, {0x3389, 308}, 
-{0x339A, 325},
-  {0x339B, 326}, {0x339C, 327}, {0x339D, 328}, 
-{0x339E, 329}, {0x339F, 330},
-  {0x3390, 315}, {0x3391, 316}, 
-{0x3392, 317}, {0x3393, 318}, {0x3394, 319},
-  {0x3395, 320}, 
-{0x3396, 321}, {0x3397, 322}, {0x3398, 323}, {0x3399, 324},
-  
-{0, 0}
-};
-
-
-char *translit[] =
-  { 
-" ", "(C)", "<<", "-", 
-"(R)", "u", ",", ">>", 
-" 1/4 ", " 1/2 ",
-" 3/4 ", "AE", 
-"x", "ss", "ae", "IJ", 
-"ij", "'n", "OE", "oe", 
-"s", "LJ", "Lj", "lj",
-
-"NJ", "Nj", "nj", "DZ", 
-"Dz", "dz", "'", "^", 
-"`", "_", ":", "~", 
-"", "--", "\"", ",,",
-
-"+", "o", ".", "..", 
-"...", "``", "```", "<", 
-">", "!!", "??", "?!", 
-"!?", "Rs", "EUR",
-"a/c", 
-"a/s", "C", "c/o", "c/u", 
-"g", "H", "h", "I", 
-"L", "l", "N", "No", 
-"P", "Q", "R",
-"TEL", 
-"(TM)", "Z", "Ohm", "B", 
-"e", "E", "F", "M", 
-"i", "D", "d", "j", 
-" 1/3 ", " 2/3 ",
-" 1/5 ", " 2/5 ", 
-" 3/5 ", " 4/5 ", " 1/6 ", " 5/6 ", 
-" 1/8 ", " 3/8 ", " 5/8 ", " 7/8 ",
-
-" 1/", "II", "III", "IV", 
-"V", "VI", "VII", "VIII", 
-"IX", "X", "XI", "XII", 
-"ii", "iii",
-"iv", "v", 
-"vi", "vii", "viii", "ix", 
-"xi", "xii", "c", "m", 
-"<-", "->", "<->", "<=",
-
-"=>", "<=>", "/", "\\", 
-"*", "|", ">=", "<<<", 
-">>>", "NUL", "SOH", "STX", 
-"ETX", "EOT",
-"ENQ", "ACK", 
-"BEL", "BS", "HT", "LF", 
-"VT", "FF", "CR", "SO", 
-"SI", "DLE", "DC1", "DC2",
-
-"DC3", "DC4", "NAK", "SYN", 
-"ETB", "CAN", "EM", "SUB", 
-"ESC", "FS", "GS", "RS", 
-"US",
-"SP", "DEL", "NL", 
-"(1)", "(2)", "(3)", "(4)", 
-"(5)", "(6)", "(7)", "(8)", 
-"(9)", "(10)",
-"(11)", "(12)", 
-"(13)", "(14)", "(15)", "(16)", 
-"(17)", "(18)", "(19)", "(20)", 
-"1.",
-"2.", "3.", "4.", 
-"5.", "6.", "7.", "8.", 
-"9.", "10.", "11.", "12.", 
-"13.", "14.", "15.",
-"16.", 
-"17.", "18.", "19.", "20.", 
-"(a)", "(b)", "(c)", "(d)", 
-"(e)", "(f)", "(g)", "(h)",
-
-"(i)", "(j)", "(k)", "(l)", 
-"(m)", "(n)", "(o)", "(p)", 
-"(q)", "(r)", "(s)", "(t)",
-
-"(u)", "(v)", "(w)", "(x)", 
-"(y)", "(z)", "(A)", "(B)", 
-"(D)", "(E)", "(F)", "(G)",
-
-"(H)", "(I)", "(J)", "(K)", 
-"(L)", "(M)", "(N)", "(O)", 
-"(P)", "(Q)", "(S)", "(T)",
-
-"(U)", "(V)", "(W)", "(X)", 
-"(Y)", "(Z)", "(0)", "::=", 
-"==", "===", "=", "(21)", 
-"(22)",
-"(23)", "(24)", "(25)", 
-"(26)", "(27)", "(28)", "(29)", 
-"(30)", "(31)", "(32)", "(33)",
-
-"(34)", "(35)", "(36)", "(37)", 
-"(38)", "(39)", "(40)", "(41)", 
-"(42)", "(43)", "(44)",
-"(45)", 
-"(46)", "(47)", "(48)", "(49)", 
-"(50)", "hPa", "da", "AU", 
-"bar", "oV", "pc",
-"pA", 
-"nA", "uA", "mA", "kA", 
-"KB", "MB", "GB", "cal", 
-"kcal", "pF", "nF", "uF", 
-"ug",
-"mg", "kg", "Hz", 
-"kHz", "MHz", "GHz", "THz", 
-"ul", "ml", "dl", "kl", 
-"fm", "nm", "um",
-"mm", 
-"cm", "km", "mm^2", "cm^2", 
-"m^2", "km^2", "mm^3", "cm^3", 
-"m^3", "km^3", "m/s",
-"m/s^2", 
-"Pa", "kPa", "MPa", "GPa", 
-"rad", "rad/s", "rad/s^2", "ps", 
-"ns", "us", "ms",
-"pV", 
-"nV", "uV", "mV", "kV", 
-"MV", "pW", "nW", "uW", 
-"mW", "kW", "MW", "a.m.", 
-"Bq",
-"cc", "cd", "C/kg", 
-"Co.", "dB", "Gy", "ha", 
-"HP", "in", "KK", "KM", 
-"kt", "lm", "ln",
-"log", 
-"lx", "mb", "mil", "mol", 
-"PH", "p.m.", "PPM", "PR", 
-"sr", "Sv", "Wb", "ff", 
-"fi",
-"fl", "ffi", "ffl", 
-"st", ";", "?", "!", 
-"(", ")", "{", "}", 
-"#", "&", "$", "%", 
-"@",
-"0", "1", "2", 
-"3", "4", "5", "6", 
-"7", "8", "9", "A", 
-"G", "J", "K", "O", 
-"S", "T", "U",
-"W", 
-"Y", "[", "]", "a", 
-"b", "f", "k", "n", 
-"p", "q", "r", "t", 
-"w", "y", "z", "z", 
-    /* German */ "Ae", "Oe", "Ue", "ue", 
-    /* Scandinavian */ "Aa", "aa" 
-};
-
-
-static void
-addKeyword (struct EXTRACTOR_Keywords **list, 
-char *keyword,
-            
-EXTRACTOR_KeywordType type)
-{
-  
-EXTRACTOR_KeywordList * next;
-  
-next = malloc (sizeof (EXTRACTOR_KeywordList));
-  
-next->next = *list;
-  
-next->keyword = strdup (keyword);
-  
-next->keywordType = type;
-  
-*list = next;
-
-}
-
-
-struct EXTRACTOR_Keywords *
-libextractor_translit_extract (const char *filename, 
-const char *data,
-                               
-size_t size, 
-struct EXTRACTOR_Keywords *prev)
-{
-  
-struct EXTRACTOR_Keywords *pos;
-  
-unsigned int mem, src, dest, len;
-  
-char *transl;
-  
-
-pos = prev;
-  
-
-mem = 256;
-  
-transl = malloc (mem + 1);
-  
-
-
-while (pos != NULL)
-    
-    {
-      
-int charlen = 0;
-      
-char *srcdata = pos->keyword;
-      
-
-len = strlen (pos->keyword);
-      
-
-for (src = 0, dest = 0; src <= len; src += charlen)
-        {
-          
-char c;
-          
-int trlen;
-          
-long long unicode;
-          
-int idx;
-          
-char *tr;
-          
-
-            /* Get length of character */ 
-            c = srcdata[src];
-          
-if ((c & 0xC0) == 0xC0)
-            
-              /* UTF-8 char */ 
-              if ((c & 0xE0) == 0xE0)
-              
-if ((c & 0xF0) == 0xF0)
-                
-charlen = 4;
-          
-              else
-                
-charlen = 3;
-          
-            else
-              
-charlen = 2;
-          
-          else
-            
-charlen = 1;
-          
-
-if (src + charlen - 1 > len)
-            {
-              
-                /* incomplete UTF-8 */ 
-                src = len;
-              
-continue;
-            
-}
-          
-            /* Copy character to destination */ 
-            if (charlen > 1)
-            {
-              
-unicode = 0;
-              
-
-if (charlen == 2)
-                {
-                  
-                    /* 5 bits from the first byte and 6 bits from the second.
-                       64 = 2^6 */ 
-                    unicode =
-                    ((srcdata[src] & 0x1F) * 64) | (srcdata[src + 1] & 0x3F);
-                
-}
-              
-              else if (charlen == 3)
-                {
-                  
-                    /* 4 bits from the first byte and 6 bits from the second and third
-                       byte. 4096 = 2^12 */ 
-                    unicode = ((srcdata[src] & 0xF) * 4096) | 
-                    ((srcdata[src + 1] & 0x3F) *
-                     64) | (srcdata[src + 2] & 0x3F);
-                
-}
-              
-              else if (charlen == 4)
-                {
-                  
-                    /* 3 bits from the first byte and 6 bits from the second, third
-                       and fourth byte. 262144 = 2^18 */ 
-                    unicode = ((srcdata[src] & 7) * 262144) | 
-                    ((srcdata[src] & 0xF) * 4096) | 
-                    ((srcdata[src + 1] & 0x3F) *
-                     64) | (srcdata[src + 2] & 0x3F);
-                
-}
-              
-                /* Look it up */ 
-                idx = 0;
-              
-tr = srcdata + src;
-              
-trlen = charlen;
-              
-while (chars[idx][0])
-                {
-                  
-if (unicode == chars[idx][0])
-                    {
-                      
-                        /* Found it */ 
-                        tr = translit[chars[idx][1]];
-                      
-trlen = strlen (tr);
-                      
-break;
-                    
-}
-                  
-idx++;
-                
-}
-            
-}
-          
-          else
-            
-trlen = 1;
-          
-
-if (dest + trlen > mem)
-            {
-              
-mem = dest + trlen;
-              
-transl = (char *) realloc (transl, mem + 1);
-            
-}
-          
-
-if (charlen > 1)
-            {
-              
-                /* Copy character to destination string */ 
-                memcpy (transl + dest, tr, trlen);
-            
-}
-          
-          else
-            
-transl[dest] = c;
-          
-
-dest += trlen;
-        
-}
-      
-
-transl[dest] = 0;
-      
-
-if (strcmp (pos->keyword, transl) != 0)
-        
-addKeyword (&prev, transl, EXTRACTOR_UNKNOWN);
-      
-
-pos = pos->next;
-    
-}
-  
-
-free (transl);
-  
-
-return prev;
-
-}
-
-
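
Note on the removed decoder: its comments describe the standard UTF-8 layout (5+6, 4+6+6 and 3+6+6+6 payload bits), but the 4-byte branch reuses srcdata[src] and never reads srcdata[src + 3]. For reference, a minimal standalone sketch of the decoding those comments describe could look as follows. The name utf8_decode and its signature are hypothetical and were never part of libextractor; the sketch does not reject overlong encodings or surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not libextractor API): decode one UTF-8 sequence
   starting at s[0], reading at most len bytes.  Stores the code point in
   *cp and returns the number of bytes consumed, or 0 if the sequence is
   truncated or malformed.  Overlong forms and surrogates are NOT rejected. */
static size_t
utf8_decode (const unsigned char *s, size_t len, uint32_t *cp)
{
  size_t n;
  size_t i;
  uint32_t v;

  if (0 == len)
    return 0;
  if (s[0] < 0x80)
    {
      *cp = s[0];               /* plain ASCII */
      return 1;
    }
  if ((s[0] & 0xE0) == 0xC0)
    {
      n = 2;                    /* 5 payload bits in the lead byte */
      v = s[0] & 0x1F;
    }
  else if ((s[0] & 0xF0) == 0xE0)
    {
      n = 3;                    /* 4 payload bits in the lead byte */
      v = s[0] & 0x0F;
    }
  else if ((s[0] & 0xF8) == 0xF0)
    {
      n = 4;                    /* 3 payload bits in the lead byte */
      v = s[0] & 0x07;
    }
  else
    return 0;                   /* stray continuation or invalid lead byte */
  if (n > len)
    return 0;                   /* incomplete sequence */
  for (i = 1; i < n; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0;               /* not a continuation byte */
      v = (v << 6) | (s[i] & 0x3F);     /* 6 payload bits each */
    }
  *cp = v;
  return n;
}
```

Each continuation byte contributes six bits, so a 4-byte sequence needs all three bytes after the lead byte.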

Modified: Extractor/src/plugins/wav_extractor.c
===================================================================
--- Extractor/src/plugins/wav_extractor.c       2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/src/plugins/wav_extractor.c       2012-08-07 19:56:09 UTC (rev 23156)
@@ -1,10 +1,10 @@
 /*
      This file is part of libextractor.
-     (C) 2004, 2009 Vidyut Samanta and Christian Grothoff
+     (C) 2004, 2009, 2012 Vidyut Samanta and Christian Grothoff
 
      libextractor is free software; you can redistribute it and/or modify
      it under the terms of the GNU General Public License as published
-     by the Free Software Foundation; either version 2, or (at your
+     by the Free Software Foundation; either version 3, or (at your
      option) any later version.
 
      libextractor is distributed in the hope that it will be useful, but
@@ -24,101 +24,116 @@
      Please see file COPYING or http://bitzi.com/publicdomain
      for more info.
 */
+/**
+ * @file plugins/wav_extractor.c
+ * @brief plugin to support WAV files
+ * @author Christian Grothoff
+ */
 
-
 #include "platform.h"
 #include "extractor.h"
 
+
 #if BIG_ENDIAN_HOST
-static short
-toLittleEndian16 (short in)
+static uint16_t
+little_endian_to_host16 (uint16_t in)
 {
-  char *ptr = (char *) &in;
+  unsigned char *ptr = (unsigned char *) &in;
 
   return ((ptr[1] & 0xFF) << 8) | (ptr[0] & 0xFF);
 }
 
-static unsigned int
-toLittleEndian32 (unsigned int in)
+
+static uint32_t
+little_endian_to_host32 (uint32_t in)
 {
-  char *ptr = (char *) &in;
+  unsigned char *ptr = (unsigned char *) &in;
 
-  return ((ptr[3] & 0xFF) << 24) | ((ptr[2] & 0xFF) << 16) | ((ptr[1] & 0xFF)
-                                                              << 8) | (ptr[0]
-                                                                       &
-                                                                       0xFF);
+  return ((ptr[3] & 0xFF) << 24) | ((ptr[2] & 0xFF) << 16) | 
+    ((ptr[1] & 0xFF) << 8) | (ptr[0] & 0xFF);
 }
 #endif
 
 
-/*
-  16      4 bytes  0x00000010     // Length of the fmt data (16 bytes)
-  20      2 bytes  0x0001         // Format tag: 1 = PCM
-  22      2 bytes  <channels>     // Channels: 1 = mono, 2 = stereo
-  24      4 bytes  <sample rate>  // Samples per second: e.g., 44100
-*/
-int 
-EXTRACTOR_wav_extract (const unsigned char *buf,
-                      size_t bufLen,
-                      EXTRACTOR_MetaDataProcessor proc,
-                      void *proc_cls,
-                      const char *options)
+/**
+ * Extract information from WAV files.
+ *
+ * @param ec extraction context
+ *
+ * @detail
+ * A WAV header looks as follows:
+ *
+ * Offset  Value    meaning
+ * 16      4 bytes  0x00000010     // Length of the fmt data (16 bytes)
+ * 20      2 bytes  0x0001         // Format tag: 1 = PCM
+ * 22      2 bytes  <channels>     // Channels: 1 = mono, 2 = stereo
+ * 24      4 bytes  <sample rate>  // Samples per second: e.g., 44100
+ */
+void
+EXTRACTOR_wav_extract_method (struct EXTRACTOR_ExtractContext *ec)
 {
-  unsigned short channels;
-  unsigned short sampleSize;
-  unsigned int sampleRate;
-  unsigned int dataLen;
-  unsigned int samples;
+  void *data;
+  const unsigned char *buf;
+  uint16_t channels;
+  uint16_t sample_size;
+  uint32_t sample_rate;
+  uint32_t data_len;
+  uint32_t samples;
   char scratch[256];
 
-  if ((bufLen < 44) ||
-      (buf[0] != 'R' || buf[1] != 'I' ||
+  if (44 > 
+      ec->read (ec->cls,  &data, 44))
+    return;
+  buf = data;
+  if ((buf[0] != 'R' || buf[1] != 'I' ||
        buf[2] != 'F' || buf[3] != 'F' ||
        buf[8] != 'W' || buf[9] != 'A' ||
        buf[10] != 'V' || buf[11] != 'E' ||
        buf[12] != 'f' || buf[13] != 'm' || buf[14] != 't' || buf[15] != ' '))
-    return 0;                /* not a WAV file */
+    return;                /* not a WAV file */
 
-  channels = *((unsigned short *) &buf[22]);
-  sampleRate = *((unsigned int *) &buf[24]);
-  sampleSize = *((unsigned short *) &buf[34]);
-  dataLen = *((unsigned int *) &buf[40]);
+  channels = *((uint16_t *) &buf[22]);
+  sample_rate = *((uint32_t *) &buf[24]);
+  sample_size = *((uint16_t *) &buf[34]);
+  data_len = *((uint32_t *) &buf[40]);
 
 #if BIG_ENDIAN_HOST
-  channels = toLittleEndian16 (channels);
-  sampleSize = toLittleEndian16 (sampleSize);
-  sampleRate = toLittleEndian32 (sampleRate);
-  dataLen = toLittleEndian32 (dataLen);
+  channels = little_endian_to_host16 (channels);
+  sample_size = little_endian_to_host16 (sample_size);
+  sample_rate = little_endian_to_host32 (sample_rate);
+  data_len = little_endian_to_host32 (data_len);
 #endif
 
-  if (sampleSize != 8 && sampleSize != 16)
-    return 0;                /* invalid sample size found in wav file */
-  if (channels == 0)
-    return 0;                /* invalid channels value -- avoid division by 0! */
-  samples = dataLen / (channels * (sampleSize >> 3));
+  if ( (8 != sample_size) &&
+       (16 != sample_size) )
+    return;                /* invalid sample size found in wav file */
+  if (0 == channels)
+    return;                /* invalid channels value -- avoid division by 0! */
+  samples = data_len / (channels * (sample_size >> 3));
 
   snprintf (scratch,
             sizeof (scratch),
             "%u ms, %d Hz, %s",
-            (samples < sampleRate)
-            ? (samples * 1000 / sampleRate)
-            : (samples / sampleRate) * 1000,
-            sampleRate, channels == 1 ? _("mono") : _("stereo"));
-  if (0 != proc (proc_cls, 
-                "wav",
-                EXTRACTOR_METATYPE_RESOURCE_TYPE,
-                EXTRACTOR_METAFORMAT_UTF8,
-                "text/plain",
-                scratch,
-                strlen (scratch) +1))
-    return 1;
-  if (0 != proc (proc_cls, 
-                "wav",
-                EXTRACTOR_METATYPE_MIMETYPE,
-                EXTRACTOR_METAFORMAT_UTF8,
-                "text/plain",
-                "audio/x-wav",
-                strlen ("audio/x-wav") +1))
-    return 1;
-  return 0;
+            (samples < sample_rate)
+            ? (samples * 1000 / sample_rate)
+            : (samples / sample_rate) * 1000,
+            sample_rate, (1 == channels) ? _("mono") : _("stereo"));
+  if (0 != ec->proc (ec->cls, 
+                    "wav",
+                    EXTRACTOR_METATYPE_RESOURCE_TYPE,
+                    EXTRACTOR_METAFORMAT_UTF8,
+                    "text/plain",
+                    scratch,
+                    strlen (scratch) + 1))
+    return;
+  if (0 != ec->proc (ec->cls, 
+                    "wav",
+                    EXTRACTOR_METATYPE_MIMETYPE,
+                    EXTRACTOR_METAFORMAT_UTF8,
+                    "text/plain",
+                    "audio/x-wav",
+                    strlen ("audio/x-wav") +1 ))
+    return;
 }
+
+/* end of wav_extractor.c */
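
For readers following the header layout documented in the comment above (channels at offset 22, sample rate at 24, sample size at 34, data length at 40), here is a minimal standalone sketch of the same parse. It is not the plugin code: struct wav_info, read_le16, read_le32 and parse_wav_header are hypothetical names, and the sketch takes a plain buffer instead of going through ec->read. It decodes the little-endian fields byte by byte, so it needs neither aligned input nor a BIG_ENDIAN_HOST switch. As choices of this sketch only, it rejects a zero sample rate and computes the duration in 64-bit arithmetic.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for this sketch; not part of libextractor. */
struct wav_info
{
  uint16_t channels;
  uint16_t sample_size;         /* bits per sample */
  uint32_t sample_rate;         /* samples per second */
  uint64_t duration_ms;
};

/* Decode little-endian integers byte by byte; independent of host
   endianness and of the alignment of p. */
static uint16_t
read_le16 (const unsigned char *p)
{
  return (uint16_t) (p[0] | (p[1] << 8));
}

static uint32_t
read_le32 (const unsigned char *p)
{
  return (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
    ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
}

/* Returns 0 on success, -1 if buf is not a header this sketch accepts. */
static int
parse_wav_header (const unsigned char *buf, size_t len, struct wav_info *out)
{
  uint32_t data_len;
  uint32_t samples;

  if (len < 44)
    return -1;                  /* too short for the fields read below */
  if ( (0 != memcmp (buf, "RIFF", 4)) ||
       (0 != memcmp (&buf[8], "WAVEfmt ", 8)) )
    return -1;                  /* not a WAV file */
  out->channels = read_le16 (&buf[22]);
  out->sample_rate = read_le32 (&buf[24]);
  out->sample_size = read_le16 (&buf[34]);
  data_len = read_le32 (&buf[40]);
  if ( (8 != out->sample_size) && (16 != out->sample_size) )
    return -1;                  /* same sample-size check as the plugin */
  if ( (0 == out->channels) || (0 == out->sample_rate) )
    return -1;                  /* avoid division by 0 */
  samples = data_len / ((uint32_t) out->channels * (out->sample_size >> 3));
  out->duration_ms = (uint64_t) samples * 1000 / out->sample_rate;
  return 0;
}
```

With a 44-byte header describing one second of 16-bit stereo audio at 44100 Hz (data length 176400), the sketch reports 2 channels and a duration of 1000 ms.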

Modified: Extractor/src/plugins/xm_extractor.c
===================================================================
--- Extractor/src/plugins/xm_extractor.c        2012-08-07 17:46:44 UTC (rev 23155)
+++ Extractor/src/plugins/xm_extractor.c        2012-08-07 19:56:09 UTC (rev 23156)
@@ -4,7 +4,7 @@
  *
  * libextractor is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published
- * by the Free Software Foundation; either version 2, or (at your
+ * by the Free Software Foundation; either version 3, or (at your
  * option) any later version.
  *
  * libextractor is distributed in the hope that it will be useful, but
@@ -18,13 +18,19 @@
  * Boston, MA 02111-1307, USA.
  *
  */
-
+/**
+ * @file plugins/xm_extractor.c
+ * @brief plugin to support XM files
+ * @author Toni Ruottu
+ * @author Christian Grothoff
+ */
 #include "platform.h"
 #include "extractor.h"
-#include "convert.h"
 
-#define HEADER_SIZE  64
 
+/**
+ * Header of an XM file.
+ */
 struct header
 {
   char magicid[17];
@@ -34,40 +40,48 @@
   char version[2];
 };
 
-#define ADD(s,t) do { if (0 != proc (proc_cls, "xm", t, EXTRACTOR_METAFORMAT_UTF8, "text/plain", s, strlen(s)+1)) return 1; } while (0)
 
+/**
+ * Give meta data to LE.
+ *
+ * @param s utf-8 string meta data value
+ * @param t type of the meta data
+ */
+#define ADD(s,t) do { if (0 != ec->proc (ec->cls, "xm", t, EXTRACTOR_METAFORMAT_UTF8, "text/plain", s, strlen (s) + 1)) return; } while (0)
 
-/* "extract" keyword from an Extended Module
+
+/**
+ * "extract" metadata from an Extended Module
  *
  * The XM module format description for XM files
  * version $0104 that was written by Mr.H of Triton
  * in 1994 was used, while this piece of software
  * was originally written.
  *
+ * @param ec extraction context
  */
-int 
-EXTRACTOR_xm_extract (const unsigned char *data,
-                     size_t size,
-                     EXTRACTOR_MetaDataProcessor proc,
-                     void *proc_cls,
-                     const char *options)
+void
+EXTRACTOR_xm_extract_method (struct EXTRACTOR_ExtractContext *ec)
 {
+  void *data;
+  const struct header *head;
   char title[21];
   char tracker[21];
   char xmversion[8];
-  const struct header *head;
 
-  /* Check header size */
-  if (size < HEADER_SIZE)
-    return 0;
-  head = (const struct header *) data;
+  if (sizeof (struct header) >
+      ec->read (ec->cls,
+               &data,
+               sizeof (struct header)))
+    return;
+  head = data;
   /* Check "magic" id bytes */
   if (memcmp (head->magicid, "Extended Module: ", 17))
-    return 0;
+    return;
   ADD("audio/x-xm", EXTRACTOR_METATYPE_MIMETYPE);
   /* Version of Tracker */
   snprintf (xmversion, 
-           sizeof(xmversion),
+           sizeof (xmversion),
            "%d.%d", 
            head->version[1],
            head->version[0]);
@@ -80,5 +94,7 @@
   memcpy (&tracker, head->tracker, 20);
   tracker[20] = '\0';
   ADD (tracker, EXTRACTOR_METATYPE_CREATED_BY_SOFTWARE);
-  return 0;
+  return;
 }
+
+/* end of xm_extractor.c */
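
The hunks above show only part of struct header, so here is a minimal standalone sketch of the same checks against a plain buffer. It is not the plugin code. The byte offsets are an assumption taken from the XM format description (17-byte magic, 20-byte module name, one 0x1a byte, 20-byte tracker name, 2-byte version with the minor number first), and struct xm_info, parse_xm_header and the XM_* macros are hypothetical names that do not exist in libextractor.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Offsets assumed from the XM format description; not taken from the diff. */
#define XM_MAGIC        "Extended Module: "
#define XM_MAGIC_LEN    17
#define XM_TITLE_OFF    17      /* 20 bytes, module name */
#define XM_TRACKER_OFF  38      /* 20 bytes, tracker name (after 0x1a) */
#define XM_VERSION_OFF  58      /* 2 bytes, minor then major */
#define XM_HEADER_MIN   60

/* Hypothetical result type for this sketch. */
struct xm_info
{
  char title[21];
  char tracker[21];
  char version[8];              /* "255.255" plus terminator fits */
};

/* Returns 0 on success, -1 if buf does not start with an XM header. */
static int
parse_xm_header (const unsigned char *buf, size_t len, struct xm_info *out)
{
  if (len < XM_HEADER_MIN)
    return -1;
  if (0 != memcmp (buf, XM_MAGIC, XM_MAGIC_LEN))
    return -1;                  /* "magic" id bytes do not match */
  /* The fixed-width fields are not NUL-terminated in the file. */
  memcpy (out->title, &buf[XM_TITLE_OFF], 20);
  out->title[20] = '\0';
  memcpy (out->tracker, &buf[XM_TRACKER_OFF], 20);
  out->tracker[20] = '\0';
  /* Same "major.minor" order as the plugin: version[1], then version[0]. */
  snprintf (out->version, sizeof (out->version), "%d.%d",
            buf[XM_VERSION_OFF + 1], buf[XM_VERSION_OFF]);
  return 0;
}
```

Copying each 20-byte field into a 21-byte buffer and terminating it explicitly mirrors what the plugin does with its title and tracker arrays.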

Deleted: Extractor/test/test.jpg
===================================================================
(Binary files differ)



