emacs-orgmode

Re: [O] [PATCH] Table continuation strings


From: Yasushi SHOJI
Subject: Re: [O] [PATCH] Table continuation strings
Date: Mon, 23 Dec 2013 01:13:34 +0900
User-agent: Wanderlust/2.15.9

Hi Nicolas,

At Sun, 22 Dec 2013 09:20:57 +0100,
Nicolas Goaziou wrote:
> 
> Yasushi SHOJI <address@hidden> writes:
> 
> > Ah, OK.  Those coding keys are for the back-ends to select proper
> > strings, not for the string encoding.
> 
> This is also related to string encoding. You will get garbage if you
> insert a string containing characters outside the encoding you use to
> save the file, won't you?

Right.

However, as you described below, the output file's encoding is not
determined by the language option, but by the current buffer coding
system, org-export-coding-system, or a back-end-specific variable,
e.g. org-html-coding-system.

That means that whenever your-choice-of-coding-system can handle the
"characters" for the translation string, meaning that the coding
system has code points for all of the characters of the translation
string and Emacs can convert between them, it is free to use any
character for the output, right?
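
To make that condition concrete, here is a minimal Python sketch (not
Org code). The function name and the sample strings are made up for
illustration; they are not the actual Org translation strings.

```python
def can_encode(text: str, coding_system: str) -> bool:
    """Return True if every character of `text` can be represented
    in `coding_system`, i.e. encoding it raises no error."""
    try:
        text.encode(coding_system)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical sample strings, NOT the real Org translation strings.
french = "précédente"
japanese = "日本語"

print(can_encode(french, "ascii"))      # the accented letters do not fit in ASCII
print(can_encode(french, "latin-1"))
print(can_encode(japanese, "euc_jp"))
print(can_encode(japanese, "latin-1"))  # kanji do not fit in latin-1
```

If the check succeeds for the coding system used to save the file, the
string can be written out as-is.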

If one wants to use French, one sets the current buffer coding system
to any coding system which can handle French and sets the language
option to "fr".  In that case, the org buffer should already have
French characters in it, so there is no need for the translation
string to be strictly ASCII only when you export with plain / ascii, no?

I just don't see any use case. I must have missed something here.
Please enlighten me.

BTW, here is part of a quick test I've done.

   source  lang  exporter     o-e-c-s  o-h-c-s  target buffer                          target file
  ----------------------------------------------------------------------------------------------------------------------------
   euc-jp  ja    plain/ascii  nil      -        euc-jp                                 euc-jp
   euc-jp  ja    plain/utf-8  nil      -        euc-jp                                 euc-jp
   euc-jp  ja    plain/ascii  utf-8    -        euc-jp                                 utf-8
   euc-jp  ja    plain/utf-8  utf-8    -        euc-jp                                 utf-8
   euc-jp  ja    html         nil      utf-8    euc-jp w/ charset=utf-8                utf-8
   euc-jp  ja    html         nil      euc-jp   euc-jp w/ charset=euc-jp               euc-jp w/ charset=euc-jp
  ----------------------------------------------------------------------------------------------------------------------------
   euc-jp  fr    plain/ascii  nil      -        euc-jp w/ fr trans                     euc-jp w/ fr translation
   euc-jp  fr    plain/utf-8  nil      -        euc-jp w/ fr trans & utf-8 decoration  euc-jp w/ fr trans & utf-8 decoration

   (o-e-c-s = org-export-coding-system, o-h-c-s = org-html-coding-system)
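
For readers unfamiliar with what "source euc-jp, target file utf-8"
means at the byte level, here is a small Python sketch. It only
illustrates re-encoding the same text; it does not reproduce what the
Org exporter does internally, and the helper name and sample text are
assumptions for the example.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode `data` from the `source` encoding, then re-encode the
    resulting text in the `target` encoding."""
    return data.decode(source).encode(target)


# Hypothetical sample text standing in for the buffer contents.
text = "日本語"
euc_bytes = text.encode("euc_jp")                     # as in a euc-jp source buffer
utf8_bytes = transcode(euc_bytes, "euc_jp", "utf-8")  # as in a utf-8 target file

# The bytes differ, but both decode to the same characters.
print(euc_bytes != utf8_bytes)
print(utf8_bytes.decode("utf-8") == text)
```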

All the major encodings for Japanese (euc-jp, iso2022, shift-jis, and
utf-8) can handle the current translation strings without problems. So
I'm assuming that encodings for other languages must have some problem.

> > Then, is there any restriction with HTML back-ends? Why does it need
> > numeric character reference instead of just plain characters, if the
> > coding system is not a concern?
> 
> See above. You may want to save your html file in a different encoding
> than UTF-8. IIUC, numeric character reference are more generic.

I agree that numeric references are more generic.  As I've just
checked, HTML even allows us to put characters outside of the
document's declared charset with numeric references!

# Italian text exported as HTML with the "ja" language option: even if
# the HTML has iso-8859-1 as its charset, the web browser shows the
# Japanese characters.
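
The effect of numeric character references can be sketched in Python
(again only an illustration, not what the Org HTML back-end does; the
helper name and sample string are assumptions). Python's standard
"xmlcharrefreplace" error handler replaces any character the target
charset cannot represent with a reference of the form &#NNNN;.

```python
import html


def encode_with_ncr(text: str, charset: str) -> bytes:
    """Encode `text` in `charset`, replacing characters the charset
    cannot represent with numeric character references."""
    return text.encode(charset, errors="xmlcharrefreplace")


# Hypothetical sample: one character inside iso-8859-1, one outside.
sample = "é日"
encoded = encode_with_ncr(sample, "iso-8859-1")
print(encoded)  # é stays a single byte, 日 becomes &#26085;

# A browser resolves the references; html.unescape mimics that step.
decoded = html.unescape(encoded.decode("iso-8859-1"))
print(decoded == sample)
```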

> > If my understanding is ok, all entries of Japanese translation should
> > have :default instead of :utf-8.
> 
> :default instead of :utf-8 means Org will use these translations also
> for LaTeX, HTML and ASCII export. If you think that is correct, then we
> can switch to :default, indeed.

Since I don't use LaTeX, I have no idea about it. I hope some LaTeX
user can help me here.

I'm checking the exporters I use, including plain text and html, and
nothing seems to go wrong. But I really need some help with the other
back-ends. I'll post a patch for testing if anyone's interested.

Thanks,
-- 
           yashi




