[Phpgroupware-developers] Re: utf-8 vs iso-8859-1
From: Dave Hall
Subject: [Phpgroupware-developers] Re: utf-8 vs iso-8859-1
Date: Fri, 24 Feb 2006 00:08:00 +1100

On Thu, 2006-02-23 at 12:50 +0100, Sigurd Nes wrote:
> > On Thu, 2006-02-23 at 09:08 +0100, Sigurd Nes wrote:
> > > >
> > > > The conversion to utf-8 is giving me problems.
> > > > I have a database with more than 5000 dwellings, 35000 workorders ...
> > > > The language is norwegian - and I really would like to keep the
> > > > character set (at least for norwegian) - this way I can use whatever
> > > > query tool (such as M$ Access) I like to do analysis without the need
> > > > for postprocessing.
> > > > Please enlighten me if I am missing something.
> > > >
> >
> > There are several reasons for the switch to utf-8. The main one is that
> > from db to the user interface we can know that we are always dealing
> > with utf-8. We can then remove things like lang('charset').
> >
> > Unicode also means we can have multilingual installs. For example, if a
> > company has operations across Europe they cannot use a single phpgw
> > install, as we currently use at least 3 different charsets for
> > translations. I would also like to hardcode utf-8 into stuff instead of
> > having to keep track of charsets, which potentially causes problems. It
> > is also easier if everyone knows to use utf-8 compliant tools.
> >
> > I haven't used M$ Access since O2k days, but I know that OO.o2 Base
> > allows you to specify the charset for the database connection. Maybe M$
> > Access has the same option tucked away somewhere.
> >
> > What are the problems you have? I am happy to see if we can find a way
> > of fixing the problems instead of switching back to encoding soup :)
> >
> > Cheers
> >
> > Dave
> >
>
> I'm not sure I grasp all the consequences - this is from some testing:
>
> It seems that postgres has a unicode odbc-driver so that "should" be
> ok - but it doesn't seem to work (if there are any converted characters
> - I got 'ODBC -- called failed').
>
I am not sure what the issue is here. Does it fail when the db contains
unicode chars, or when it contains iso-8859-1 chars?
> I will need to convert all the characters in the database to unicode -
> I figure I can dump the database, convert the characters (is there a
> tool?) and reload the data into an empty database. At this point I
> will most certainly run into problems - 'cause the fields will be too
> short in many cases.
>
Check out iconv. That is what I used to convert the lang files. It is
pretty simple. You should be able to convert a full db dump on the
command line, then reimport it. On average, how many non-ASCII
characters do you have in a field? How much slack do your fields have?
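
For anyone who would rather script this than run iconv by hand, here is a
minimal Python sketch of the same idea. It is an illustration, not
something from the phpgw tree: the function names convert_dump and
utf8_growth and the sample value are made up, and it assumes the dump is
plain text in iso-8859-1. The first function re-encodes a dump file to
utf-8. The second counts the non-ASCII characters in a field value and
how many bytes the value takes as utf-8, which is one way to put numbers
on the two questions above.

```python
def convert_dump(src_path, dst_path, src_encoding="iso-8859-1", chunk_size=1 << 20):
    """Re-encode a text dump from src_encoding to UTF-8, reading in chunks."""
    # newline="" keeps the line endings of the dump exactly as they are.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)


def utf8_growth(value):
    """Return (non_ascii_chars, utf8_bytes) for one field value."""
    non_ascii = sum(1 for ch in value if ord(ch) > 127)
    return non_ascii, len(value.encode("utf-8"))


if __name__ == "__main__":
    # 13 characters, 2 of them non-ASCII, so 15 bytes as UTF-8.
    print(utf8_growth("Blåbærveien 5"))  # (2, 15)
```

Whether the extra bytes matter depends on whether the database counts a
column's length in characters or in bytes, so that is worth checking
before widening any fields. If the dump itself declares an encoding,
that declaration would presumably need updating as well.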
> Will writing lang-files be somewhat more difficult?
> When saving a file with gedit as unicode it is ok when reopened in
> gedit and TextPad (my favorite) but not in emacs.
>
What? You don't use vi? ;) Someone suggested trying "C-x RET f utf-8
RET" in emacs, but I have no idea when it comes to emacs.
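
When editors disagree about a file, it can help to check the bytes on
disk independently of any editor. The Python sketch below is only an
illustration (describe_encoding is a made-up helper, and it does not
claim to explain the emacs behaviour): it reports whether the raw bytes
are plain ASCII, valid utf-8, utf-8 with a leading byte order mark, or
not valid utf-8 at all.

```python
import codecs


def describe_encoding(data):
    """Classify raw bytes: 'ascii', 'utf-8', 'utf-8 with BOM' or 'not utf-8'."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "not utf-8"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 with BOM"
    if all(byte < 0x80 for byte in data):
        return "ascii"
    return "utf-8"


if __name__ == "__main__":
    print(describe_encoding("æøå".encode("utf-8")))       # utf-8
    print(describe_encoding("æøå".encode("iso-8859-1")))  # not utf-8
```

To check a lang file, read it with open(path, "rb").read() and pass the
result in.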
> When inserting new values into the database - do I need to filter the
> values through a converter?
> I certainly cannot edit records with webmin.
>
You mean manual inserts? For that I use phpMyAdmin or MySQL Query
Browser, as I use mysql not pgsql. Does webmin set the charset based on
a language?
> I thought that the lang-table combined with the users preferences took
> care of multilanguage issue.
>
Not completely. AFAIK unless we use unicode we can't mix different
charsets in one install. With unicode we can, for example, list each
language by its native name in its own script.
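
A small Python sketch of why a single 8-bit charset falls short here.
The language list is made-up sample data, not the phpgw lang table, and
unrepresentable is a hypothetical helper. It tries to encode each native
language name in a given charset and reports which ones fail.

```python
# Illustrative sample data only: native names of a few languages.
LANGUAGE_NAMES = {
    "no": "Norsk",
    "de": "Deutsch",
    "cs": "Čeština",
    "el": "Ελληνικά",
    "ru": "Русский",
}


def encodable(text, charset):
    """True if every character of text exists in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def unrepresentable(names, charset):
    """Language codes whose native name cannot be written in charset."""
    return sorted(code for code, name in names.items()
                  if not encodable(name, charset))


if __name__ == "__main__":
    print(unrepresentable(LANGUAGE_NAMES, "iso-8859-1"))  # ['cs', 'el', 'ru']
    print(unrepresentable(LANGUAGE_NAMES, "utf-8"))       # []
```

With iso-8859-1 the Czech, Greek and Russian names cannot be stored on
the same page as the Norwegian one; with utf-8 they all can.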
> If there are special functions in the api that require unicode - I'm
> more than willing to convert the input to those functions to unicode on
> demand.
>
> All in all - as I see it - there are a number of limitations compared
> to allowing iso-8859-1 for the xsl:stylesheet
>
What limitations are there for the stylesheets? From what I understand
it is best to use utf-8 for xml.
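
One concrete reason, sketched in Python with the standard library's
ElementTree parser rather than phpgw's XSLT setup (so treat it as an
analogy; parse_text and the sample documents are made up): an XML
document with no encoding declaration is read as utf-8, so iso-8859-1
bytes are only accepted when the document says so explicitly.

```python
import xml.etree.ElementTree as ET


def parse_text(xml_bytes):
    """Return the root element's text, or None if the bytes are not well-formed XML."""
    try:
        return ET.fromstring(xml_bytes).text
    except ET.ParseError:
        return None


# The same document in three forms (sample data for illustration).
utf8_doc = "<name>Blåbærveien</name>".encode("utf-8")
latin1_undeclared = "<name>Blåbærveien</name>".encode("iso-8859-1")
latin1_declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + latin1_undeclared

if __name__ == "__main__":
    print(parse_text(utf8_doc))           # parsed as utf-8 by default
    print(parse_text(latin1_undeclared))  # rejected: not valid utf-8
    print(parse_text(latin1_declared))    # accepted thanks to the declaration
```

So iso-8859-1 stylesheets can work, but every file then has to carry a
correct declaration, whereas utf-8 works with or without one.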
Cheers
Dave