Perhaps. But I suggest postponing further discussion on this matter because the problem is only in the emacs-unicode-2 branch for the moment. How about putting these entries in TODO? * Make the rol
At least you now understand it's not trivial. Why do you think it's worth doing at this stage even if it requires nontrivial work? How about just asking users to use emacs-mule coding system for *.el
FYI, Emacs' mule feature has its own version number and name (C-h v mule-version). Those of the CVS HEAD are "5.0 (SAKAKI)". Those of emacs-unicode are "6.0 (HANACHIRUSATO)". They are named after t
First, the coding system (utf-8) doesn't replace unsupported characters with '?' on decoding. It preserves the original byte sequence and attaches a special text property to display it the Unicode rep
Thank you for the report. I've just installed this change to fix the error at line 2671. Index: font.h == RCS file: /cvsroot/emacs/emacs/src/Attic/font.h,v retrieving revision 1.1.2.6 retrieving revi
I can't reproduce it on GNU/Linux. So, could you please use gdb to find exactly where (and how) Emacs causes segfault (see etc/DEBUG if you are not familiar with gdb). -- Kenichi Handa address@hidden
Sorry for the late response on this matter. Please check which libfontconfig is used by emacs and fc-list. Does this work on Mac OS? % ldd THE_ABSOLUTE_PATH_OF_EMACS_BINARY | grep fontconfig % ldd TH
[...] Ummm, GC error! This is very difficult to debug. What happens if you run Emacs like this: (gdb) run -Q --enable-font-backend -- Kenichi Handa address@hidden
No. That's simply because I didn't want to design a soon-to-be-obsolete new interface for that. I am going to delete all legacy font-handling code and thus make "--enable-font-backend" the default.
I have no idea about the intention of the code. If Dave doesn't respond, I suggest installing the attached patch. -- Kenichi Handa address@hidden 2007-01-04 Kenichi Handa <address@hidden> * quail/un
I've just fixed titdic-cnv.el so that it always detects the EOL format when reading files and always writes files with Unix-like EOL format. So, now CXTERM-DIC/* and MISC-DIC/* can be checked out both by
They contain CRLF originally. So I dared not change it. It seems this change (for tsang-quick-converter) by Jason already takes care of such a case. ;; Handle double CR line ends, which result when c
Please find a reproducible test case for that bug. If you can't, please keep on using Emacs started from gdb like this: % cd EMACS_BUILD_TREE/src % gdb emacs (gdb) run and when it crashes, show us the
As my working directory was a little bit old, when I did "cvs update", many files were updated. But perhaps that is because of your last sync, and it seems that there's no problem. Thank you. -- Keni
I think it's a good change, but I think unnecessary multiple inclusion should also be deleted. And, I want to postpone any cosmetic changes in emacs-unicode-2. I don't object to adding "#ifdef" in buffe
Yes. I implemented it as add-on code (in a slightly tricky way). Once it is found to work well, I'll merge the code into mule-cmds.el while cleaning it up. Please try the attached cod
I can't reproduce it. Please tell me exactly what happened. For instance, how did you see that complaint, what happened after that complaint? -- Kenichi Handa address@hidden
map-charset-chars calls FUNCTION (modify-category-entry in the above case) on all characters in CHARSET. But, to know which characters belong to CHARSET (chinese-gb2312 in the above case), we must co
I think so, but perhaps it is better that we just set inhibit-load-charsets-map to t before loading mule-conf, and set it to nil just before dumping. And signal an error when it is found that loading
I'm now adding many missing entries for the default fontset. Please wait for a while. What is shown when you type C-u C-x = while putting the cursor on that character? -- Kenichi Handa address@hidden
It may be possible to automatically fall back to the current way of building a full list in such a case. Yes, that's one solution. ?? It simply filters out elements that don't match STR from NA
I have no objection. But, please take care about a newly added property value (if any). If there's such, unidata-gen.el may need adjustment. -- address@hidden PS. Sorry for the late response. I'm now
Right. By the way, Takahashi-san has just given me a very short UTF-8 encoder/decoder. It decodes UTF-8 into ascii, latin-1, mule-unicode-0100-24ff, mule-unicode-2500-33ff, mule-unicode-e000-ffff. Wit
Sure. How about putting this change in the current emacs-unicode? Could you please improve the English text? ** startup.el.~1.290.~ Fri May 10 11:13:48 2002 -- startup.el Wed Jul 17 21:09:59 2002 **
Not necessarily. If we put the text property `charset' (the value is a charset) to a text on decoding, and check it on encoding, we can preserve the same byte sequence. Putting that text property to
Oops, as I've been testing emacs-unicode on GNU/Linux which defines GC_MARK_STACK to GC_MAKE_GCPROS_NOOPS, I have not paid much attention to GC cleanness. I think we can test if c_functions is GC cl
I agree that it's possible to grasp the problem in that way, but I'm not sure which is the better way. Could you explain WHY yours is better? [...] I think you mean string-make-unibyte/multibyte, but
I see. Apart from the design itself, I agree that it's difficult to introduce a new type. But, when I discussed the Character type object with Richard a few years ago, he was not that negative p
It seems that you keep saying that "A does B, thus it's nonsense". But, I'm arguing that "A does C". It doesn't make sense because you treat the result as "a unibyte string encoded in Latin-1". It
Thank you for finding this bug. It seems that I made a mistake when I merged a change in trunk. I've just installed a fix. -- Ken'ichi HANDA address@hidden
Sorry for not responding on this matter earlier. I remember that I implemented code to preserve the original designation information of iso-2022 fairly long ago and made sure that it worked at that
I think the correct software repository is: $ cvs -z3 -d:ext:address@hidden:/cvsroot/emacs ... ^^^^^^^ I've just tried it and succeeded. Please see this page: https://savannah.gnu.org/cvs/?group=emac
I've installed several fixes to emacs-unicode-2. I made a tarball by ./make-dist --snapshot, and from that tarball, I could build emacs simply by "configure; make" and also by "configure; make bootst
[...] First of all, is it safe to call Lisp program in read_escape? Don't we have to care about GC and buffer/string-data relocation? -- Kenichi Handa address@hidden
I don't see any strong reason for not following utf-fragment-on-decoding in read_escape, leaving aside the question of the usefulness of this option. -- Kenichi Handa address@hidden
unify-8859-on-decoding-mode affects iso-8859-* coding systems. If it is on, characters in a file of those coding systems are decoded into iso-8859-1 or mule-unicode-0100-24FF. That's the meaning "uni
At least, you can use the input method "ucs" (leim/quail/uni-input.el). Input method: ucs (mode line indicator:U) Input as Unicode: U<hex> or u<hex>, where <hex> is a four-digit hex number. -- Kenich
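
The U<hex> / u<hex> input form described above can be sketched in Emacs Lisp as follows. This is only an illustration of the format, not the code in leim/quail/uni-input.el; the function name `my-parse-ucs-input' is hypothetical.

```elisp
;; Hypothetical helper, NOT the implementation in uni-input.el.
;; Parse INPUT of the form U<hex> or u<hex>, where <hex> is exactly
;; four hex digits, and return the code point as an integer.
;; Return nil when INPUT does not have that form.
(defun my-parse-ucs-input (input)
  (let ((case-fold-search nil))
    (when (string-match "\\`[Uu]\\([0-9A-Fa-f]\\{4\\}\\)\\'" input)
      ;; Read the four captured digits as a base-16 number.
      (string-to-number (match-string 1 input) 16))))
```

For example, (my-parse-ucs-input "U266D") evaluates to #x266D, and an input with fewer than four digits such as "U12" gives nil.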
[...] As Jan wrote, it was a bug of unicode-xft branch and is already fixed. emacs-unicode-2 branch should not have that bug. -- Kenichi Handa address@hidden
I've just reached my todo-list entry about this matter. insert_1 can be called with the arg PREPARE zero, but as insert_from_buffer and insert_from_buffer_1 don't have such an arg, prepare_to_modi
Thank you for the patches and sorry for the late response. I'm going to apply them in emacs-unicode-2 branch. But, as your changes are more than what we can record as "tiny change", FSF requires you
On Windows and Mac, we can't turn it on. If someone would like to work on changing the default only for Unix/GNU-Linux and on providing a way to disable font-backend on such systems, please go ahead.
The font-backend for Mac is now under development. I hope it's committed soon. But, when it's committed, I don't know whether it will be stable enough to enable font-backend by default. I don't have time to
I did "cvs update", "make distclean", "configure", and "make bootstrap" on Debian, but it worked without any problem. So, perhaps it is a Mac-specific problem. -- Kenichi Handa address@hidden
It seems that no one is working on fixing it for the Mac port (Carbon) now, sorry. On the other hand, Adrian is working on Cocoa port at <http://emacs-app.sf.net>. Just recently the new version was r
The variable mule-version of the trunk is "5.0 (SAKAKI)", and that of emacs-unicode-2 is "6.0 (HANACHIRUSATO)". How about using it to distinguish them? -- Kenichi Handa address@hidden
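
A minimal sketch of such a check, assuming only the two version strings quoted above. The function name `my-emacs-unicode-p' is hypothetical; it relies on `string-to-number' reading the leading number of the string and ignoring the rest.

```elisp
;; Hypothetical helper: return non-nil when VERSION (a string shaped
;; like the value of `mule-version') has a major number of 6 or more,
;; i.e. the emacs-unicode-2 side of the two values quoted above.
(defun my-emacs-unicode-p (version)
  (>= (string-to-number version) 6))

;; In a running Emacs, pass the real variable:
;;   (my-emacs-unicode-p mule-version)
```

Taking the version string as an argument keeps the check testable outside the branch being probed.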
I have a few questions about how to reorganize ChangeLog.unicode for merging. (1) As the emacs-unicode-2 branch has a very long history, changes have been made not only by me. If the same function ha
I have not yet calculated them. Each CJK charset defined by a map needs a char-table for encoding and a vector for decoding. They are surely loaded on demand. And, at the end of the dumping process, by
Yes. Not yet, but such a change of encoding is easy. The problem is that lisp/international/characters.el sets up the syntax-table and category-table for many characters by map-charset-chars. Ex: (map-cha
I'm now trying to make the completion of unicode character name (used by read-char-by-name) faster, at least fast enough for interactive use. Now it's very slow at the first time and consumes so much
Attached is the first version for that. It provides two C functions (excerpt from chartab.c). /* Unicode character property This section provides a convenient and efficient way to get a Unicode chara
Sorry for not joining this important topic much earlier. But, we have font-lock (jit-lock), and a Lisp program is called while redisplaying. I think jit-lock/font-lock is very fast even if we run Ema
Yes. On reading and writing iso-2022 files, Emacs-unicode may designate different charsets. I still can't find time to fix it. But, is it a big problem? Even in Emacs 21, iso-2022 files may be chan
It seems that the phrase "get rid of" is confusing. Emacs-unicode still keeps all latin-iso8859-X charsets. We can't get rid of them. Those charsets carry such information as how to map their code poin
I agree with you. Currently, I can think of these methods: (1) Perhaps the easiest way. Check `default-enable-multibyte-characters' or a newly introduced variable `byte-as-byte' to decide whether a
Yes. But I thought generic or not is not the point here. My examples show that we can't use encode-coding-string. How can we use encode-coding-string without knowing what coding system to use? I haven
Thank you for the report. I've just committed a change adding the coding system aliases unix, dos, and mac in the emacs-unicode-2 branch. -- Ken'ichi HANDA address@hidden
I agree. Decode-char doesn't support unify-8859-on-*coding-mode but supports utf-fragment-on-decoding and utf-translate-cjk-mode. I think, at least, CJK characters should be decoded into one of CJK c
Actually, emacs-unicode already contains various data (including names) extracted from UnicodeData.txt, and get-char-code-property is extended to information about a character that is provided by Uni
Sorry for the late response on this thread. [...] Emacs 22 still doesn't support Unicode characters beyond the BMP. If you really need to handle them, please use the CVS branch emacs-unicode-2. This is writ
To my understanding, Carbon port should be treated as dead, and the merging of Cocoa port into the trunk (or to emacs-unicode-2 branch) isn't that far. -- Kenichi Handa address@hidden
I've just installed that change. Sorry, but you did that on the character #x1D12A. What I wanted to know is the result when you put the cursor on #x266D or #x266F. And what exactly does "unifont" mean? apt-
Emacs-unicode unifies, for instance, character C1 of charset CS1 and character C2 of charset CS2. So, even if an original iso-2022-7bit file uses different byte sequences to represent them, when ema
Thank you. For the moment, I see no problem. It seems that it is the branch Stephen mentioned in "Well, you'll have to create an Arch branch first.", right? -- Ken'ichi HANDA address@hidden
Yes, I know about that. I'm going to implement it in emacs-unicode-2 because it supports all Korean syllables in Unicode, which makes this easier to implement. -- Kenichi Handa address@hidden
utf-translate-cjk-mode also plays a role on decoding utf-*. The latter. Just setting those variables doesn't work; they should be customized. In addition, the default value of utf-translate-cjk-mode
Ah, I found that this bug happens when you insert non-ASCII characters, and I found the reason why that causes the problem only in emacs-unicode-2 (difference in REPLACE handling in Finsert_file_cont
If Adrian's work is useful, why do you prefer stock emacs-unicode-2 build? Unfortunately, I don't have a Mac. Anyway, the compile error you showed is very strange. The macro CHARSET_8_BIT_CONTROL was
You can use the input method "ucs" to input any Unicode character by typing its codepoint. A method by typing Unicode name is not yet implemented. We can make it obsolete now. Yes. -- Kenichi Handa a
It's not that simple. This is the strategy of the charset map loading mechanism. I took that approach expecting that char-tables that are garbage-collected before dumping are not in the dumped file.
I agree with Eli, defining proper categories provides more flexible usage, but FYI, something like this defines the coding system utf-8-mes-2. (let ((repertory '((0 . #x017f) #x018f #x0192 #x01b4 (#
Each charset has different reason. IPA: Some characters are not in Unicode. Korean: It contains Chinese characters too. But, by default, it is unified with Unicode. (unify-charset 'korean-ksc5601 nil
About a month ago, I made a branch emacs-unicode-2 from HEAD and started to work on synchronizing the code of the emacs-unicode branch with HEAD in that new branch. I've just finished the work and committed t
Almost all changes are about character and fontset handling; charset.[ch], coding.[ch], fontset.c are mostly re-written, character.[ch] and chartab.c are newly created. So, if your change depends on
The mailing-list address@hidden is still alive. If you want to join the list, please ask Eli <address@hidden>. I don't remember UCS-E or UTF-E well now. Anyway, in the latest code of the branch emacs
Ah! But, then why does fontconfig give higher priority to Bitstream fonts? In my environment (Debian etch), /etc/fonts/conf.avail/60-latin.conf has these lines: <family>monospace</family> <prefer> <fam
No. Not completely obsolete, but should be modified somehow. At first, #x0..#x3FFFFF are all valid Emacs character codes. Some of U+NNNNNN are valid Unicode code points for "noncharacter" (e.g. U+FFF
Sorry for not responding on this matter. It seems that I missed your original mail. Mostly yes. The exception is in the case that x_encode_char is called on an element of composition glyph. In that c
One possibility is that the difference is because of big CJK charset maps loaded while creating emacs from temacs. Just before dumping, loadup.el calls `clear-charset-maps' which sets internal vector
I think displaying all available Unicode information is too much (see the list shown when you customize describe-char-unidata-list). Perhaps, name, general-category, and decomposition are the best fo