[Phpgroupware-developers] SV: utf-8 vs iso-8859-1
From: Sigurd Nes
Subject: [Phpgroupware-developers] SV: utf-8 vs iso-8859-1
Date: Thu, 23 Feb 2006 14:43:55 +0100 (CET)
> On Thu, 2006-02-23 at 12:50 +0100, Sigurd Nes wrote:
> > > On Thu, 2006-02-23 at 09:08 +0100, Sigurd Nes wrote:
> > > > >
> > > > > The conversion to utf-8 is giving me problems.
> > > > > I have a database with more than 5000 dwellings, 35000 work orders ...
> > > > > The language is Norwegian, and I would really like to keep the
> > > > > character set (at least for Norwegian). That way I can use whatever
> > > > > query tool I like (such as M$ Access) for analysis, without the need
> > > > > for postprocessing.
> > > > > Please enlighten me if I am missing something.
> > > > >
> > >
> > > There are several reasons for the switch to utf-8. The main one is that
> > > from the db to the user interface we know that we are always dealing
> > > with utf-8. We can then remove things like lang('charset').
> > >
> > > Unicode also means we can have multilingual installs. For example, if a
> > > company has operations across Europe they cannot use a single phpgw
> > > install, as we currently use at least 3 different charsets for
> > > translations. I would also like to hardcode utf-8 into stuff instead of
> > > having to keep track of charsets, which potentially causes problems. It
> > > is also easier if everyone knows to use utf-8 compliant tools.
> > >
> > > I haven't used M$ Access since O2k days, but I know that OO.o2 Base
> > > allows you to specify the charset for the database connection. Maybe M$
> > > Access has the same option tucked away somewhere.
> > >
> > > What are the problems you have? I am happy to see if we can find a way
> > > of fixing the problems instead of switching back to encoding soup :)
> > >
> > > Cheers
> > >
> > > Dave
> > >
> >
> > I'm not sure I grasp all the consequences - this is from some testing:
> >
> > It seems that postgres has a unicode odbc-driver, so that "should" be
> > ok - but it doesn't seem to work (if there are any converted characters
> > I get 'ODBC -- called failed').
> >
>
> I am not sure what the issue is here. Is it when the db contains
> unicode chars or iso-8859-1 ?
>
> > I will need to convert all the characters in the database to unicode -
> > I figure I can dump the database, convert the characters (is there a
> > tool?) and reload the data into an empty database. At this point I
> > will most certainly run into problems, because the fields will be too
> > short in many cases.
> >
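As a rough illustration of the field-length concern above: each of the Norwegian letters æ, ø and å takes one byte in iso-8859-1 but two bytes in utf-8, so a value grows by one byte per such letter. The sketch below only measures that growth; the function name and the sample word are made up for illustration, and whether the growth matters depends on whether the database counts a field's limit in characters or in bytes.

```python
def encoded_lengths(text):
    """Return (iso-8859-1 byte count, utf-8 byte count) for the same text."""
    return len(text.encode("iso-8859-1")), len(text.encode("utf-8"))


# Hypothetical sample value containing three non-ASCII letters (å, æ, ø).
sample = "Blåbærsyltetøy"
latin1_len, utf8_len = encoded_lengths(sample)
print(f"{len(sample)} characters: {latin1_len} bytes as iso-8859-1, "
      f"{utf8_len} bytes as utf-8")
```

Running something like this over the longest values in each column would show how much slack is actually needed.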
>
> Check out iconv. That is what I used to convert the lang files. It is
> pretty simple. You should be able to convert a full db dump on the
> command line, then reimport it. On average, how many non-ASCII
> characters do you have in a field? How much slack do your fields have?
>
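On the command line the conversion described above would look something like `iconv -f ISO-8859-1 -t UTF-8 dump.sql > dump.utf8.sql` (the file names are placeholders). For anyone without iconv at hand, here is a minimal Python sketch of the same re-encoding step. It assumes the dump is a plain-text file that really is iso-8859-1 throughout; the function name and paths are hypothetical. If the dump itself declares a client encoding, that line would have to be updated separately, which this sketch does not do.

```python
import shutil


def convert_dump(src_path, dst_path, src_encoding="iso-8859-1"):
    """Re-encode a text dump to utf-8, streaming so large dumps are fine."""
    # newline="" leaves the line endings exactly as they are in the dump.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        shutil.copyfileobj(src, dst)
```

Usage would be along the lines of `convert_dump("dump.sql", "dump.utf8.sql")`, followed by reloading the new file into an empty database.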
> > Writing lang-files will be somewhat more difficult?
> > When saving a file with gedit as unicode it is ok when reopened in
> > gedit and TextPad (my favorite) but not in emacs.
> >
>
> What? You don't use vi? ;) Someone suggested trying "C-x RET f utf-8 RET"
> in emacs, but I have no idea when it comes to emacs.
>
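When one editor shows a lang file correctly and another does not, it can help to first check whether the bytes on disk are valid utf-8 at all, independent of any editor. A small sketch of such a check follows; the function name is made up for illustration. Note that it can only say the bytes are well-formed utf-8, not that they were intended as utf-8 (pure ASCII passes either way).

```python
def is_valid_utf8(data):
    """Return True if the given bytes decode cleanly as utf-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same word encoded both ways: only the utf-8 bytes pass the check.
print(is_valid_utf8("blåbær".encode("utf-8")))
print(is_valid_utf8("blåbær".encode("iso-8859-1")))
```

For a file, the bytes could be read with `open(path, "rb").read()` and passed to the function.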
> > When inserting new values into the database - do I need to filter the
> > values through a converter?
> > I certainly cannot edit records with webmin.
> >
>
> You mean manual inserts? For that I use phpmyadmin or mysql query
> browser as I use mysql not pgsql. Does webmin set the charset based on
> a language?
>
> > I thought that the lang-table combined with the users preferences took
> > care of multilanguage issue.
> >
>
> Not completely. AFAIK, unless we use unicode we can't use, say, different
> charsets in 1 install. For example, with unicode we can list each language
> in its own local language and script.
>
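To make the point about mixing charsets in one install concrete, here is a small sketch that checks which charsets can represent a list of language names written in their own scripts. The language names and the candidate charsets are chosen purely for illustration and are not taken from phpgw; the helper names are hypothetical.

```python
def can_encode(text, charset):
    """Return True if every character of text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def charsets_covering(texts, charsets):
    """Return the charsets that can represent all of the given texts."""
    return [c for c in charsets if all(can_encode(t, c) for t in texts)]


# Illustrative language names in Latin, Greek and Cyrillic script.
language_names = ["Norsk bokmål", "Ελληνικά", "Русский"]
candidates = ["iso-8859-1", "iso-8859-5", "iso-8859-7", "utf-8"]
print(charsets_covering(language_names, candidates))  # ['utf-8']
```

Each iso-8859 variant covers one of the names, but only utf-8 covers all of them on a single page.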
> > If there are special functions in the api that require unicode - I'm
> > more than willing to convert the input to those functions to unicode on
> > demand.
> >
> > All in all, as I see it, there are a number of limitations compared
> > to allowing iso-8859-1 for the xsl:stylesheet.
> >
>
> What limitations are there for the stylesheets? From what I understand
> it is best to use utf-8 for xml.
>
> Cheers
>
> Dave
>
OK - converting and importing the database went well (I saved the dump as
unicode from gedit).
The unicode ODBC driver is working, which means all is well in M$ Access.
phpPgAdmin seems to have no problem with the converted database.
The (test) phpgroupware install seems to behave (and look) as usual.
Regards
Sigurd