gnash-commit
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Gnash-commit] gnash ChangeLog server/button_character_instanc...


From: Benjamin Wolsey
Subject: [Gnash-commit] gnash ChangeLog server/button_character_instanc...
Date: Tue, 05 Feb 2008 12:01:52 +0000

CVSROOT:        /sources/gnash
Module name:    gnash
Changes by:     Benjamin Wolsey <bwy>   08/02/05 12:01:52

Modified files:
        .              : ChangeLog 
        server         : button_character_instance.cpp 
                         edit_text_character.cpp edit_text_character.h 
                         event_id.h gnash.h movie_root.cpp 
        libbase        : utf8.cpp utf8.h 

Log message:
        2008-02-05 Benjamin Wolsey <address@hidden>
        
                * server/movie_root.cpp: send unique gnash::key::code not 
SWFCode
                  to event_id; edit_text_character relies on this code to work 
out
                  which character was passed and SWFcode is bogus.
                * server/event_id.h: store gnash::key::code not SWFCode; 
setKeyCode
                  converts SWFCode to a corresponding gnash::key::code.
                * server/gnash.h: add enum for different key code types.
                * server/button_character_instance.cpp: lookup SWFCode from
                  gnash::key::code.
                * libbase/utf8.{cpp,h}: add {decode,encode}CanonicalString 
methods
                  for converting between UTF-8 encoded std::string and 
std::wstring.
                  Fix some bugs in character decoding.
                * server/edit_text_character.{cpp.h}: use std::wstring 
internally,
                  as once we receive multi-byte characters the cursor position 
and
                  std::string position are different. Wide characters mean 
string
                  manipulation can stay as before. Interfaces still use 
std::string
                  with conversions when necessary. Make wstring methods 
private. 
                  set_text_value(const char*) -> setTextValue(const wstring&).
                  Handles characters like ö or ¶.
        
        Sorry for the large commit. It was just a small bug with a lot of 
implications.
        
        It passes the testsuite and I think I've adapted button key events 
correctly, but I hope someone who knows more about them can check.

CVSWeb URLs:
http://cvs.savannah.gnu.org/viewcvs/gnash/ChangeLog?cvsroot=gnash&r1=1.5557&r2=1.5558
http://cvs.savannah.gnu.org/viewcvs/gnash/server/button_character_instance.cpp?cvsroot=gnash&r1=1.79&r2=1.80
http://cvs.savannah.gnu.org/viewcvs/gnash/server/edit_text_character.cpp?cvsroot=gnash&r1=1.145&r2=1.146
http://cvs.savannah.gnu.org/viewcvs/gnash/server/edit_text_character.h?cvsroot=gnash&r1=1.65&r2=1.66
http://cvs.savannah.gnu.org/viewcvs/gnash/server/event_id.h?cvsroot=gnash&r1=1.15&r2=1.16
http://cvs.savannah.gnu.org/viewcvs/gnash/server/gnash.h?cvsroot=gnash&r1=1.113&r2=1.114
http://cvs.savannah.gnu.org/viewcvs/gnash/server/movie_root.cpp?cvsroot=gnash&r1=1.157&r2=1.158
http://cvs.savannah.gnu.org/viewcvs/gnash/libbase/utf8.cpp?cvsroot=gnash&r1=1.7&r2=1.8
http://cvs.savannah.gnu.org/viewcvs/gnash/libbase/utf8.h?cvsroot=gnash&r1=1.5&r2=1.6

Patches:
Index: ChangeLog
===================================================================
RCS file: /sources/gnash/gnash/ChangeLog,v
retrieving revision 1.5557
retrieving revision 1.5558
diff -u -b -r1.5557 -r1.5558
--- ChangeLog   5 Feb 2008 10:50:14 -0000       1.5557
+++ ChangeLog   5 Feb 2008 12:01:50 -0000       1.5558
@@ -1,3 +1,24 @@
+2008-02-05 Benjamin Wolsey <address@hidden>
+
+       * server/movie_root.cpp: send unique gnash::key::code not SWFCode
+         to event_id; edit_text_character relies on this code to work out
+         which character was passed and SWFcode is bogus.
+       * server/event_id.h: store gnash::key::code not SWFCode; setKeyCode
+         converts SWFCode to a corresponding gnash::key::code.
+       * server/gnash.h: add enum for different key code types.
+       * server/button_character_instance.cpp: lookup SWFCode from
+         gnash::key::code.
+       * libbase/utf8.{cpp,h}: add {decode,encode}CanonicalString methods
+         for converting between UTF-8 encoded std::string and std::wstring.
+         Fix some bugs in character decoding.
+       * server/edit_text_character.{cpp.h}: use std::wstring internally,
+         as once we receive multi-byte characters the cursor position and
+         std::string position are different. Wide characters mean string
+         manipulation can stay as before. Interfaces still use std::string
+         with conversions when necessary. Make wstring methods private. 
+         set_text_value(const char*) -> setTextValue(const wstring&).
+         Handles characters like ö or ¶.
+
 2008-02-05 Bastiaan Jacques <address@hidden>
 
        * libmedia/gst/VideoDecoderGst.cpp: Use a different ffdec

Index: server/button_character_instance.cpp
===================================================================
RCS file: /sources/gnash/gnash/server/button_character_instance.cpp,v
retrieving revision 1.79
retrieving revision 1.80
diff -u -b -r1.79 -r1.80
--- server/button_character_instance.cpp        29 Jan 2008 12:31:09 -0000      
1.79
+++ server/button_character_instance.cpp        5 Feb 2008 12:01:51 -0000       
1.80
@@ -331,7 +331,7 @@
 button_character_instance::on_event(const event_id& id)
 {
 
-       if( (id.m_id==event_id::KEY_PRESS) && (id.m_key_code == key::INVALID) )
+       if( (id.m_id==event_id::KEY_PRESS) && (id.keyCode == key::INVALID) )
        {
                // onKeypress only responds to valid key code
                return false;
@@ -346,8 +346,10 @@
                button_action& ba = *(m_def->m_button_actions[i]);
 
                int keycode = (ba.m_conditions & 0xFE00) >> 9;
-               event_id key_event(event_id::KEY_PRESS, (key::code) keycode);
-               if (key_event == id)
+               
+               // Test match between button action conditions and the SWF code
+               // that maps to id.keyCode (the gnash unique key code). 
+               if (id.m_id == event_id::KEY_PRESS && 
gnash::key::codeMap[id.keyCode][key::SWF] == keycode)
                {
                        // Matching action.
                        VM::get().getRoot().pushAction(ba.m_actions, 
boost::intrusive_ptr<character>(this));

Index: server/edit_text_character.cpp
===================================================================
RCS file: /sources/gnash/gnash/server/edit_text_character.cpp,v
retrieving revision 1.145
retrieving revision 1.146
diff -u -b -r1.145 -r1.146
--- server/edit_text_character.cpp      4 Feb 2008 15:16:54 -0000       1.145
+++ server/edit_text_character.cpp      5 Feb 2008 12:01:51 -0000       1.146
@@ -17,8 +17,6 @@
 // Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
 //
 
-/* $Id: edit_text_character.cpp,v 1.145 2008/02/04 15:16:54 bwy Exp $ */
-
 #ifdef HAVE_CONFIG_H
 #include "gnashconfig.h"
 #endif
@@ -373,6 +371,7 @@
                edit_text_character_def* def, int id)
        :
        character(parent, id),
+       _text(L""),
        m_def(def),
        _font(0),
        m_has_focus(false),
@@ -408,7 +407,8 @@
        // set default text *before* calling registerTextVariable
        // (if the textvariable already exist and has a value
        //  the text will be replaced with it)
-       set_text_value(m_def->get_default_text().c_str());
+       
+       setTextValue(utf8::decodeCanonicalString(m_def->get_default_text()));
 
        m_dummy_style.push_back(fill_style());
 
@@ -590,21 +590,22 @@
 
                case event_id::KEY_PRESS:
                {
-                       std::string s(_text);
-                       std::string c;
-                       c = (char) id.m_key_code;
+                       std::wstring s = _text;
+
+                       // id.keyCode is the unique gnash::key::code for a 
character
+                       uint32_t c = (uint32_t) id.keyCode;
 
-                       // may be _text is changed in ActionScript
+                       // maybe _text is changed in ActionScript
                        m_cursor = imin(m_cursor, _text.size());
 
-                       switch (c[0])
+                       switch (c)
                        {
                                case key::BACKSPACE:
                                        if (m_cursor > 0)
                                        {
                                                s.erase(m_cursor - 1, 1);
                                                m_cursor--;
-                                               set_text_value(s.c_str());
+                                               setTextValue(s);
                                        }
                                        break;
 
@@ -612,7 +613,7 @@
                                        if (s.size() > m_cursor)
                                        {
                                                s.erase(m_cursor, 1);
-                                               set_text_value(s.c_str());
+                                               setTextValue(s);
                                        }
                                        break;
 
@@ -644,12 +645,14 @@
                                        break;
 
                                default:
+                                       wchar_t t = (wchar_t) 
gnash::key::codeMap[c][key::ASCII];
+                                       if (t != 0)
                                {
-                                       s.insert(m_cursor, c);
+                                               s.insert(m_cursor, 1, t);
                                        m_cursor++;
-                                       set_text_value(s.c_str());
-                                       break;
                                }
+                                       setTextValue(s);
+                                       break;
                        }
                        onChanged();
                }
@@ -692,11 +695,19 @@
 }
 
 void
-edit_text_character::updateText(const std::string& new_text)
+edit_text_character::updateText(const std::string& str)
+{
+       std::wstring wstr = utf8::decodeCanonicalString(str);
+       updateText(wstr);
+}
+
+
+void
+edit_text_character::updateText(const std::wstring& wstr)
 {
        unsigned int maxLen = m_def->get_max_length();
 
-       std::string newText = new_text; // copy needed for eventual resize
+       std::wstring newText = wstr; // copy needed for eventual resize
        if (maxLen && newText.length() > maxLen )
        {
                newText.resize(maxLen);
@@ -718,12 +729,10 @@
 }
 
 void
-edit_text_character::set_text_value(const char* new_text_cstr)
+edit_text_character::setTextValue(const std::wstring& wstr)
 {
-       std::string newText;
-       if ( new_text_cstr ) newText = new_text_cstr;
 
-       updateText(newText);
+       updateText(wstr);
 
        if ( ! _variable_name.empty() && _text_variable_registered )
        {
@@ -732,12 +741,12 @@
                as_object* tgt = ref.first;
                if ( tgt )
                {
-                       tgt->set_member(ref.second, newText); // we shouldn't 
truncate, right ?
+                       tgt->set_member(ref.second, 
utf8::encodeCanonicalString(wstr)); // we shouldn't truncate, right ?
                }
                else    
                {
                        // nothing to do (too early ?)
-                       log_debug("set_text_value: variable name %s points to 
an unexisting target, I guess we would not be registered in this was true, or 
the sprite we've registered our variable name has been unloaded", 
_variable_name.c_str());
+                       log_debug("setTextValue: variable name %s points to an 
unexisting target, I guess we would not be registered in this was true, or the 
sprite we've registered our variable name has been unloaded", 
_variable_name.c_str());
                }
        }
 }
@@ -747,13 +756,13 @@
 {
        // we need the const_cast here because registerTextVariable
        // *might* change our text value, calling the non-const
-       // set_text_value().
+       // setTextValue().
        // This happens if the TextVariable has not been already registered
        // and during registration comes out to name an existing variable
        // with a pre-existing value.
        const_cast<edit_text_character*>(this)->registerTextVariable();
 
-       return _text;
+       return utf8::encodeCanonicalString(_text);
 }
 
 void
@@ -774,14 +783,14 @@
                //if (name == "text")
        {
                int version = 
get_parent()->get_movie_definition()->get_version();
-               set_text_value(val.to_string_versioned(version).c_str());
+               
setTextValue(utf8::decodeCanonicalString(val.to_string_versioned(version)));
                return;
        }
        case NSV::PROP_HTML_TEXT:
                //if (name == "htmlText")
        {
                int version = 
get_parent()->get_movie_definition()->get_version();
-               set_text_value(val.to_string_versioned(version).c_str());
+               
setTextValue(utf8::decodeCanonicalString(val.to_string_versioned(version)));
                format_text();
                return;
        }
@@ -1172,12 +1181,12 @@
 
        assert(! _text.empty() );
 
-       std::string::const_iterator it = _text.begin();
+       std::wstring::const_iterator it = _text.begin();
        
        // decodeNextUnicodeCharacter(std::string::const_iterator &it) works,
        // but unfortunately nothing is encoded in utf8.
        
-       while (boost::uint32_t code = utf8::decodeNextUnicodeCharacter(it))
+       while (boost::uint32_t code = *it++)
        {
                if ( _embedFonts )
                {
@@ -1262,7 +1271,7 @@
 
                        // HTML tag, just skip it...
                        bool closingTagFound = false;
-                       while ( (code = utf8::decodeNextUnicodeCharacter(it)) )
+                       while ( (code = *it++) )
                        {
                                if (code == '>')
                                {
@@ -1369,7 +1378,7 @@
                                        //log_debug(" autoSize=NONE!");
                                        // truncate long line, but keep 
expanding text box
                                        bool newlinefound = false;
-                                       while ( (code = 
utf8::decodeNextUnicodeCharacter(it)) )
+                                       while ( (code = *it++ ) )
                                        {
                                                if ( _embedFonts )
                                                {
@@ -1585,14 +1594,14 @@
 #endif
                // TODO: pass environment to to_string ?
                // as_environment& env = get_environment();
-               set_text_value(val.to_string().c_str());
+               setTextValue(utf8::decodeCanonicalString(val.to_string()));
        }
        else
        {
 #ifdef DEBUG_DYNTEXT_VARIABLES
                log_msg(_("target sprite (%p) does NOT have a member named %s 
(no problem, we'll add it)"), (void*)sprite, 
_vm.getStringTable().value(key).c_str());
 #endif
-               target->set_member(key, as_value(_text));
+               target->set_member(key, 
as_value(utf8::encodeCanonicalString(_text)));
        }
 
        sprite_instance* sprite = target->to_movie();
@@ -1618,7 +1627,7 @@
        {
                _variable_name = newname;
                _text_variable_registered = false;
-               //set_text_value(m_def->get_default_text().c_str());
+               //setTextValue(m_def->get_default_text());
 #ifdef DEBUG_DYNTEXT_VARIABLES
                log_debug("Calling updateText after change of variable name");
 #endif

Index: server/edit_text_character.h
===================================================================
RCS file: /sources/gnash/gnash/server/edit_text_character.h,v
retrieving revision 1.65
retrieving revision 1.66
diff -u -b -r1.65 -r1.66
--- server/edit_text_character.h        21 Jan 2008 20:55:50 -0000      1.65
+++ server/edit_text_character.h        5 Feb 2008 12:01:51 -0000       1.66
@@ -90,17 +90,10 @@
        /// 
        void set_variable_name(const std::string& newname);
 
-       /// Set our text to the given string.
-       //
-       /// This function will also update any registered variable
-       ///
-       void    set_text_value(const char* new_text);
-
        /// Set our text to the given string by effect of an update of a 
registered variable name
        //
        /// This cal only updates the text and is only meant to be called by 
ourselves
        /// or by sprite_instance when a registered TextVariable is updated.
-       ///
        void updateText(const std::string& s);
 
        /// Return value of our text.
@@ -285,6 +278,19 @@
 
 private:
 
+       /// Set our text to the given string.
+       //
+       /// This function will also update any registered variable
+       ///
+       void    setTextValue(const std::wstring& wstr);
+
+       /// Set our text to the given string by effect of an update of a 
registered variable name
+       //
+       /// This cal only updates the text and is only meant to be called by 
ourselves
+       /// or by sprite_instance when a registered TextVariable is updated.
+       ///
+       void updateText(const std::wstring& s);
+
        /// Set focus 
        void setFocus();
 
@@ -300,8 +306,10 @@
        /// Call this function when willing to invoke the onKillFocus event 
handler
        void onKillFocus();
 
-       /// The actual text
-       std::string _text;
+       /// The actual text. Because we have to deal with non-ascii characters 
(129-255)
+       /// this is a wide string; the cursor position and the position within 
the string
+       /// are then the same, which makes manipulating the string much easier.
+       std::wstring _text;
 
        /// immutable definition of this object, as read
        /// from the SWF stream. Assured to be not-NULL

Index: server/event_id.h
===================================================================
RCS file: /sources/gnash/gnash/server/event_id.h,v
retrieving revision 1.15
retrieving revision 1.16
diff -u -b -r1.15 -r1.16
--- server/event_id.h   21 Jan 2008 20:55:50 -0000      1.15
+++ server/event_id.h   5 Feb 2008 12:01:51 -0000       1.16
@@ -29,11 +29,8 @@
 
 #include "gnash.h" // for gnash::key namespace
 
-#include <cwchar>
-
 namespace gnash {
 
-
 /// For keyDown and stuff like that.
 //
 /// Implementation is currently in action.cpp
@@ -95,30 +92,48 @@
        };
 
        id_code m_id;
-       unsigned char   m_key_code;
 
-       event_id() : m_id(INVALID), m_key_code(key::INVALID) {}
+       // keyCode must be the unique gnash key identifier
+       // gnash::key::code.
+       // edit_text_character has to be able to work out the
+       // ASCII value from keyCode, while other users need 
+       // the SWF code or the Flash key code.
+       key::code keyCode;
+
+       event_id() : m_id(INVALID), keyCode(key::INVALID) {}
 
-       event_id(id_code id, unsigned char c = key::INVALID)
+       event_id(id_code id, key::code c = key::INVALID)
                :
                m_id(id),
-               m_key_code(c)
+               keyCode(c)
        {
                // you must supply a key code for KEY_PRESS event
                // 
-               // we do have a testcase with m_id == KEY_PRESS, and 
m_key_code==0(KEY_INVALID)
+               // we do have a testcase with m_id == KEY_PRESS, and 
keyCode==0(KEY_INVALID)
                // see key_event_test.swf(produced by Ming)
                // 
-               //assert((m_key_code == key::INVALID && (m_id != KEY_PRESS))
-               //      || (m_key_code != key::INVALID && (m_id == KEY_PRESS)));
+               //assert((keyCode == key::INVALID && (m_id != KEY_PRESS))
+               //      || (keyCode != key::INVALID && (m_id == KEY_PRESS)));
        }
 
-       void setKeyCode(unsigned char key)
+       ///
+       /// @param SWFKey The SWF code matched to the event. This
+       /// must be converted to a unique gnash::key::code.
+       void setKeyCode(boost::uint8_t SWFkey)
        {
-               m_key_code = key;
+               // Lookup the SWFcode in the gnash::key::code table.
+               // Some are not unique (keypad numbers are the
+               // same as normal numbers), so we take the first match.
+               // As long as we can work out the SWFCode from the
+               // gnash::key::code it's all right.
+               int i = 0;
+               while (key::codeMap[i][key::SWF] != SWFkey && i < 
key::KEYCOUNT) i++;
+
+               if (i == key::KEYCOUNT) keyCode = key::INVALID;
+               else keyCode = (key::code)i;
        }
 
-       bool    operator==(const event_id& id) const { return m_id == id.m_id 
&& m_key_code == id.m_key_code; }
+       bool    operator==(const event_id& id) const { return m_id == id.m_id 
&& keyCode == id.keyCode; }
 
        bool operator< (const event_id& id) const
        {
@@ -126,7 +141,7 @@
                if ( m_id > id.m_id ) return false;
 
                // m_id are equal, check key code
-               if ( m_key_code < id.m_key_code ) return true;
+               if ( keyCode < id.keyCode ) return true;
                return false;
        }
 

Index: server/gnash.h
===================================================================
RCS file: /sources/gnash/gnash/server/gnash.h,v
retrieving revision 1.113
retrieving revision 1.114
diff -u -b -r1.113 -r1.114
--- server/gnash.h      21 Jan 2008 20:55:51 -0000      1.113
+++ server/gnash.h      5 Feb 2008 12:01:51 -0000       1.114
@@ -637,7 +637,15 @@
   KEYCOUNT
 };
 
-const unsigned char codeMap[KEYCOUNT][3] = {
+enum type
+{
+       SWF,
+       KEY,
+       ASCII,
+       TYPES
+};
+
+const unsigned char codeMap[KEYCOUNT][TYPES] = {
 //{swfKeyCode, keycode, asciiKeyCode}
   {0,   0,   0}, // INVALID = 0
   {0,   0,   0}, // UNKNOWN1

Index: server/movie_root.cpp
===================================================================
RCS file: /sources/gnash/gnash/server/movie_root.cpp,v
retrieving revision 1.157
retrieving revision 1.158
diff -u -b -r1.157 -r1.158
--- server/movie_root.cpp       31 Jan 2008 13:41:32 -0000      1.157
+++ server/movie_root.cpp       5 Feb 2008 12:01:51 -0000       1.158
@@ -1133,7 +1133,8 @@
                        {
                                // KEY_UP and KEY_DOWN events are unrelated to 
any key!
                                ch->on_event(event_id(event_id::KEY_DOWN, 
key::INVALID)); 
-                               ch->on_event(event_id(event_id::KEY_PRESS, 
key::codeMap[k][0]));
+                               // Pass the unique Gnash key code!
+                               ch->on_event(event_id(event_id::KEY_PRESS, k));
                        }
                        else
                        {

Index: libbase/utf8.cpp
===================================================================
RCS file: /sources/gnash/gnash/libbase/utf8.cpp,v
retrieving revision 1.7
retrieving revision 1.8
diff -u -b -r1.7 -r1.8
--- libbase/utf8.cpp    4 Feb 2008 15:16:54 -0000       1.7
+++ libbase/utf8.cpp    5 Feb 2008 12:01:51 -0000       1.8
@@ -1,15 +1,58 @@
-// utf8.cpp    -- Thatcher Ulrich 2004   -*- coding: utf-8;-*-
-
-// This source code has been donated to the Public Domain.  Do
-// whatever you want with it.  THE AUTHOR DOES NOT WARRANT THIS CODE.
-
-// Utility code for dealing with UTF-8 encoded text.
+// utf8.cpp: utilities for converting to and from UTF-8
+// 
+//   Copyright (C) 2008 Free Software Foundation, Inc.
+// 
+// This program is free software; you can redistribute it and/or modify
+// it under the terms of the GNU General Public License as published by
+// the Free Software Foundation; either version 3 of the License, or
+// (at your option) any later version.
+// 
+// This program is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+// 
+// You should have received a copy of the GNU General Public License
+// along with this program; if not, write to the Free Software
+// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+//
+// Based on the public domain work of Thatcher Ulrich <address@hidden> 2004
 //
 // Much useful info at "UTF-8 and Unicode FAQ" 
http://www.cl.cam.ac.uk/~mgk25/unicode.html
 
 
 #include "utf8.h"
 
+std::wstring
+utf8::decodeCanonicalString(const std::string& str)
+{
+       std::wstring wstr = L"";
+       
+       std::string::const_iterator it = str.begin();
+       while (boost::uint32_t code = utf8::decodeNextUnicodeCharacter(it))
+       {
+               wstr.push_back((wchar_t) code);
+       }
+       
+       return wstr;
+
+}
+
+std::string
+utf8::encodeCanonicalString(const std::wstring& wstr)
+{
+       std::string str = "";
+       
+       std::wstring::const_iterator it = wstr.begin();
+       while ( it != wstr.end())
+       {
+               str.append(utf8::encodeUnicodeCharacter(*it++));
+       }
+       
+       return str;
+
+}
+
 boost::uint32_t        
utf8::decodeNextUnicodeCharacter(std::string::const_iterator& it)
 {
        boost::uint32_t uc;
@@ -104,6 +147,7 @@
        else
        {
                // Invalid.
+               it++;
                return INVALID;
        }
 }
@@ -119,47 +163,47 @@
        if (ucs_character <= 0x7F)
        {
                // Plain single-byte ASCII.
-               text += (char) ucs_character;
+               text.push_back(ucs_character);
        }
        else if (ucs_character <= 0x7FF)
        {
                // Two bytes.
-               text += 0xC0 | (ucs_character >> 6);
-               text += 0x80 | ((ucs_character >> 0) & 0x3F);
+               text.push_back(0xC0 | (ucs_character >> 6));
+               text.push_back(0x80 | ((ucs_character >> 0) & 0x3F));
        }
        else if (ucs_character <= 0xFFFF)
        {
                // Three bytes.
-               text += 0xE0 | (ucs_character >> 12);
-               text += 0x80 | ((ucs_character >> 6) & 0x3F);
-               text += 0x80 | ((ucs_character >> 0) & 0x3F);
+               text.push_back(0xE0 | (ucs_character >> 12));
+               text.push_back(0x80 | ((ucs_character >> 6) & 0x3F));
+               text.push_back(0x80 | ((ucs_character >> 0) & 0x3F));
        }
        else if (ucs_character <= 0x1FFFFF)
        {
                // Four bytes.
-               text += 0xF0 | (ucs_character >> 18);
-               text += 0x80 | ((ucs_character >> 12) & 0x3F);
-               text += 0x80 | ((ucs_character >> 6) & 0x3F);
-               text += 0x80 | ((ucs_character >> 0) & 0x3F);
+               text.push_back(0xF0 | (ucs_character >> 18));
+               text.push_back(0x80 | ((ucs_character >> 12) & 0x3F));
+               text.push_back(0x80 | ((ucs_character >> 6) & 0x3F));
+               text.push_back(0x80 | ((ucs_character >> 0) & 0x3F));
        }
        else if (ucs_character <= 0x3FFFFFF)
        {
                // Five bytes.
-               text += 0xF8 | (ucs_character >> 24);
-               text += 0x80 | ((ucs_character >> 18) & 0x3F);
-               text += 0x80 | ((ucs_character >> 12) & 0x3F);
-               text += 0x80 | ((ucs_character >> 6) & 0x3F);
-               text += 0x80 | ((ucs_character >> 0) & 0x3F);
+               text.push_back(0xF8 | (ucs_character >> 24));
+               text.push_back(0x80 | ((ucs_character >> 18) & 0x3F));
+               text.push_back(0x80 | ((ucs_character >> 12) & 0x3F));
+               text.push_back(0x80 | ((ucs_character >> 6) & 0x3F));
+               text.push_back(0x80 | ((ucs_character >> 0) & 0x3F));
        }
        else if (ucs_character <= 0x7FFFFFFF)
        {
                // Six bytes.
-               text += 0xFC | (ucs_character >> 30);
-               text += 0x80 | ((ucs_character >> 24) & 0x3F);
-               text += 0x80 | ((ucs_character >> 18) & 0x3F);
-               text += 0x80 | ((ucs_character >> 12) & 0x3F);
-               text += 0x80 | ((ucs_character >> 6) & 0x3F);
-               text += 0x80 | ((ucs_character >> 0) & 0x3F);
+               text.push_back(0xFC | (ucs_character >> 30));
+               text.push_back(0x80 | ((ucs_character >> 24) & 0x3F));
+               text.push_back(0x80 | ((ucs_character >> 18) & 0x3F));
+               text.push_back(0x80 | ((ucs_character >> 12) & 0x3F));
+               text.push_back(0x80 | ((ucs_character >> 6) & 0x3F));
+               text.push_back(0x80 | ((ucs_character >> 0) & 0x3F));
        }
        else
        {
@@ -169,199 +213,6 @@
        return text;
 }
 
-
-#ifdef UTF8_UNIT_TEST
-
-// Compile this test case with something like:
-//
-// c++ utf8.cpp -g -I.. -DUTF8_UNIT_TEST -o utf8_test
-//
-//    or
-//
-// cl utf8.cpp -Zi -Od -DUTF8_UNIT_TEST -I..
-//
-// If possible, try running the test program with the first arg
-// pointing at the file:
-//
-// http://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
-// 
-// and examine the results by eye to make sure they are acceptable to
-// you.
-
-
-#include "utility.h"
-#include <cstdio>
-
-
-bool   check_equal(const char* utf8_in, const boost::uint32_t* ucs_in)
-{
-       for (;;)
-       {
-               boost::uint32_t next_ucs = *ucs_in++;
-               boost::uint32_t next_ucs_from_utf8 = 
utf8::decode_next_unicode_character(&utf8_in);
-               if (next_ucs != next_ucs_from_utf8)
-               {
-                       return false;
-               }
-               if (next_ucs == 0)
-               {
-                       assert(next_ucs_from_utf8 == 0);
-                       break;
-               }
-       }
-
-       return true;
-}
-
-
-void   log_ascii(const char* line)
-{
-       for (;;)
-       {
-               unsigned char   c = (unsigned char) *line++;
-               if (c == 0)
-               {
-                       // End of line.
-                       return;
-               }
-               else if (c != '\n'
-                        && (c < 32 || c > 127))
-               {
-                       // Non-printable as plain ASCII.
-                       printf("<0x%02X>", (int) c);
-               }
-               else
-               {
-                       printf("%c", c);
-               }
-       }
-}
-
-
-void   log_ucs(const boost::uint32_t* line)
-{
-       for (;;)
-       {
-               boost::uint32_t uc = *line++;
-               if (uc == 0)
-               {
-                       // End of line.
-                       return;
-               }
-               else if (uc != '\n'
-                        && (uc < 32 || uc > 127))
-               {
-                       // Non-printable as plain ASCII.
-                       printf("<U-%04X>", uc);
-               }
-               else
-               {
-                       printf("%c", (char) uc);
-               }
-       }
-}
-
-
-int main(int argc, const char* argv[])
-{
-       // Simple canned test.
-       {
-               const char*     test8 = "Ignacio Castaño";
-               const boost::uint32_t   test32[] =
-                       {
-                               0x49, 0x67, 0x6E, 0x61, 0x63,
-                               0x69, 0x6F, 0x20, 0x43, 0x61,
-                               0x73, 0x74, 0x61, 0xF1, 0x6F,
-                               0x00
-                       };
-
-               assert(check_equal(test8, test32));
-       }
-
-       // If user passed an arg, try reading the file as UTF-8 encoded text.
-       if (argc > 1)
-       {
-               const char*     filename = argv[1];
-               FILE*   fp = fopen(filename, "rb");
-               if (fp == NULL)
-               {
-                       printf("Can't open file '%s'\n", filename);
-                       return 1;
-               }
-
-               // Read lines from the file, encode/decode them, and highlight 
discrepancies.
-               const int LINE_SIZE = 200;      // max line size
-               char    line_buffer_utf8[LINE_SIZE];
-               char    reencoded_utf8[6 * LINE_SIZE];
-               boost::uint32_t line_buffer_ucs[LINE_SIZE];
-
-               int     byte_counter = 0;
-               for (;;)
-               {
-                       int     c = fgetc(fp);
-                       if (c == EOF)
-                       {
-                               // Done.
-                               break;
-                       }
-                       line_buffer_utf8[byte_counter++] = c;
-                       if (c == '\n' || byte_counter >= LINE_SIZE - 2)
-                       {
-                               // End of line.  Process the line.
-                               line_buffer_utf8[byte_counter++] = '\0';        
// terminate.
-
-                               // Decode into UCS.
-                               const char*     p = line_buffer_utf8;
-                               boost::uint32_t*        q = line_buffer_ucs;
-                               for (;;)
-                               {
-                                       boost::uint32_t uc = 
utf8::decode_next_unicode_character(&p);
-                                       *q++ = uc;
-
-                                       assert(q < line_buffer_ucs + LINE_SIZE);
-                                       assert(p < line_buffer_utf8 + 
LINE_SIZE);
-
-                                       if (uc == 0) break;
-                               }
-
-                               // Encode back into UTF-8.
-                               q = line_buffer_ucs;
-                               int     index = 0;
-                               for (;;)
-                               {
-                                       boost::uint32_t uc = *q++;
-                                       assert(index < LINE_SIZE * 6 - 6);
-                                       int     last_index = index;
-                                       
utf8::encode_unicode_character(reencoded_utf8, &index, uc);
-                                       assert(index <= last_index + 6);
-                                       if (uc == 0) break;
-                               }
-
-// This can be useful for debugging.
-#if 0
-                               // Show the UCS and the re-encoded UTF-8.
-                               log_ucs(line_buffer_ucs);
-                               log_ascii(reencoded_utf8);
-#endif // 0
-
-                               assert(check_equal(line_buffer_utf8, 
line_buffer_ucs));
-                               assert(check_equal(reencoded_utf8, 
line_buffer_ucs));
-
-                               // Start next line.
-                               byte_counter = 0;
-                       }
-               }
-
-               fclose(fp);
-       }
-
-       return 0;
-}
-
-
-#endif // UTF8_UNIT_TEST
-
-
 // Local Variables:
 // mode: C++
 // c-basic-offset: 8 

Index: libbase/utf8.h
===================================================================
RCS file: /sources/gnash/gnash/libbase/utf8.h,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -b -r1.5 -r1.6
--- libbase/utf8.h      4 Feb 2008 15:16:54 -0000       1.5
+++ libbase/utf8.h      5 Feb 2008 12:01:52 -0000       1.6
@@ -1,28 +1,46 @@
-// utf8.h      -- Thatcher Ulrich <address@hidden> 2004
-
-// This source code has been donated to the Public Domain.  Do
-// whatever you want with it.  THE AUTHOR DOES NOT WARRANT THIS CODE.
-
-// Utility code for dealing with UTF-8 encoded text.
-
+// utf8.h: utilities for converting to and from UTF-8
+// 
+//   Copyright (C) 2008 Free Software Foundation, Inc.
+// 
+// This program is free software; you can redistribute it and/or modify
+// it under the terms of the GNU General Public License as published by
+// the Free Software Foundation; either version 3 of the License, or
+// (at your option) any later version.
+// 
+// This program is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+// 
+// You should have received a copy of the GNU General Public License
+// along with this program; if not, write to the Free Software
+// Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+//
+// Based on the public domain work of Thatcher Ulrich <address@hidden> 2004
 
 #ifndef UTF8_H
 #define UTF8_H
 
-#include "tu_config.h" // needed ?
+#include "tu_config.h" // For DSOEXPORT
 #include <string>
-
 #include <boost/cstdint.hpp> // for boost::?int??_t
 
 
 namespace utf8
 {
+       // Converts a UTF-8 encoded std::string with multibyte characters into
+       // a std::wstring.
+       DSOEXPORT std::wstring decodeCanonicalString(const std::string& str);
+
+       // Converts a std::wstring into a UTF-8 encoded std::string.
+       DSOEXPORT std::string encodeCanonicalString(const std::wstring& wstr);
+
        // Return the next Unicode character in the UTF-8 encoded
        // string.  Invalid UTF-8 sequences produce a U+FFFD character
        // as output.  Advances string iterator past the character
        // returned, unless the returned character is '\0', in which
        // case the iterator does not advance.
-       DSOEXPORT boost::uint32_t 
decodeNextUnicodeCharacter(std::string::const_iterator& it);
+       boost::uint32_t decodeNextUnicodeCharacter(std::string::const_iterator& 
it);
 
        // Encodes the given UCS character into the given UTF-8
        // buffer.  Writes the data starting at buffer[offset], and
@@ -30,7 +48,7 @@
        //
        // May write up to 6 bytes, so make sure there's room in the
        // buffer!
-       DSOEXPORT std::string encodeUnicodeCharacter(boost::uint32_t 
ucs_character);
+       std::string encodeUnicodeCharacter(boost::uint32_t ucs_character);
 }
 
 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]