Re: [AUCTeX] composed characters in LaTeX source code

From: jfbu
Subject: Re: [AUCTeX] composed characters in LaTeX source code
Date: Thu, 8 Nov 2018 21:50:42 +0100
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:52.0) Gecko/20100101 Thunderbird/52.9.1

Le 08/11/2018 à 21:10, Stephen Berman a écrit :
On Thu, 8 Nov 2018 18:58:02 +0100 jfbu <address@hidden> wrote:

Le 08/11/2018 à 18:45, jfbu a écrit :
Le 08/11/2018 à 18:07, Stephen Berman a écrit :
I have a file containing such composed characters that I've imported to
a LaTeX file but in the output of pdflatex, the circumflex is displayed
over the character following the one it is composed with, e.g., the
sequence 'b^a' (where '^' means U+0302, the combining circumflex accent)
is displayed in Emacs with the circumflex over 'b' but in the PDF output
the circumflex is over 'a'.


I wanted to test your problem but I have another issue, which is that when I type

'xb C-x 8 <RET> 302 <RET>ac'

the 'a' gets superimposed on top of the 'b' in my Emacs buffer:

(see attached image)

That's strange, and doesn't happen for me (see the screenshot in my
followup to Joost Kremers).  Does it also look like that in emacs -Q?
And what version of Emacs?

You are right: the problem does not show up when using

 /Applications/ -Q

It is GNU Emacs 26.1,

which is a binary build I got from


There is no problem either in the Emacs 22 that I launch in a Terminal window (Mac OS X).

As for your issue, it is going to be very hard in LaTeX to get the accent on
top of the previous letter (I think, but my knowledge of Unicode is
limited). With LuaLaTeX it might be possible.

Haven't tried LuaLaTeX yet, but XeLaTeX does work, though with
suboptimal display (see my other followup again).


If the combining accent really is supposed to be typed *after* the letter
(which sounds strange to me, but again, I am no Unicode guy).

OK, I have since learned that this is indeed the way.

As per your original question

has a comment by D. Carlisle who said in 2012: I think the answer is "No".

A further comment by the same, when asked about "peek at previous character"

yes but you can't go back, you can in simple case write a macro that parses
the entire text stream re-ordering tokens when it sees a combining character,
but it would be very fragile and likely break most other package commands. If
your accented letters are single characters in Unicode form NFC then
normalising the input before passing to TeX will be a lot more robust.

(quote from D. C.)

The available answer recommends using Perl to massage the file and normalize the
Unicode characters.

Thanks for the URL and quotes; in fact, I had also found that before
posting, but I don't think it would work for me anyway, because in my
case it's about composed characters for which there are no corresponding
single Unicode characters, so nothing to normalize to.
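
To make the point about normalization concrete, here is a minimal sketch in Python (rather than the Perl of the linked answer, so an assumption on my part) using the standard `unicodedata` module. It only illustrates that NFC has a single code point to offer for 'a' + U+0302, but none for 'b' + U+0302; the helper name `nfc` is hypothetical.

```python
import unicodedata

def nfc(text):
    """Return text in Unicode Normalization Form C (composed form)."""
    return unicodedata.normalize("NFC", text)

# 'a' + COMBINING CIRCUMFLEX ACCENT has a precomposed form, U+00E2.
composed_a = nfc("a\u0302")

# 'b' + COMBINING CIRCUMFLEX ACCENT has no precomposed form,
# so NFC leaves it as two code points.
composed_b = nfc("b\u0302")

print(len(composed_a), len(composed_b))
```

So normalizing before running pdflatex helps only for those sequences that have a precomposed equivalent.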

Yes, sorry that I sort of lost sight of that crucial point,
but then the follow-up idea would still be to pre-process
your file via a Perl script, or with Emacs Lisp itself,

but now either to output LaTeX mark-up such as the \^{b}a you mentioned,

or to keep the combining diacritic, use \DeclareUnicodeCharacter{0302}{\^},
and move the combining diacritic before the letter,
with the drawback that it will not look nice in your Emacs buffer.

But you have already considered those options.
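
For what it is worth, the two pre-processing variants could be sketched as follows. This is written in Python rather than Perl or Emacs Lisp, handles only U+0302, and the function names `to_latex_accent` and `accent_first` are made up for the illustration; treat it as a rough sketch, not a tested tool.

```python
import re

COMBINING_CIRCUMFLEX = "\u0302"

# A single letter (no digit, no underscore) followed by U+0302.
# Combining marks themselves are not matched by \w, so the captured
# group is always a base letter.
_LETTER_THEN_ACCENT = re.compile(r"([^\W\d_])" + COMBINING_CIRCUMFLEX)

def to_latex_accent(text):
    """Variant 1: rewrite <letter><U+0302> as the mark-up \\^{<letter>}."""
    return _LETTER_THEN_ACCENT.sub(lambda m: "\\^{" + m.group(1) + "}", text)

def accent_first(text):
    """Variant 2: keep U+0302 but move it in front of its base letter."""
    return _LETTER_THEN_ACCENT.sub(
        lambda m: COMBINING_CIRCUMFLEX + m.group(1), text
    )

sample = "xb\u0302ac"
print(to_latex_accent(sample))
print(accent_first(sample))
```

The second variant is meant to be combined with \DeclareUnicodeCharacter{0302}{\^}, and I would expect it to be reliable only when the base letter is a plain ASCII character.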

I tried XeLaTeX; the result was not good with the default fonts, but with

\setmainfont{Times New Roman}

it seems to work fine.

When I copy the text back from the PDF I get the same combining characters,
but the details depend on which PDF viewer I copy from.

- From I get like this (where ^ represents U+0302)

xb̂ ac

i.e. there is a space after the U+0302

- But from Adobe Reader I get


Of course the first case is bad, because when recompiling to PDF
the space does show up.

Sorry for straying so far from Emacs matters.

