help-gnu-emacs
Re: Multiple encodings in one file


From: Lambert, Joshua D
Subject: Re: Multiple encodings in one file
Date: Tue, 30 Apr 2024 17:17:37 +0000

Thank you. Your suggestion of editing one record at a time is how most MARC 
editors work and that was my first thought as well. It will require record by 
record redisplay of some sort to make it human readable to begin with. That 
said, I'm thinking of multiple interfaces depending on the user's goal.

MARC is a file transmission format from 1968 (which in part explains the odd 
encoding) and can include any number of records. It has no line breaks or 
carriage returns but Emacs' longlines-break-chars seems to work well enough 
that I can edit files with 10,000 typically sized records, including some 
fontification.

Thanks to you and all who contribute to Emacs.
Joshua
________________________________
From: help-gnu-emacs-bounces+jlambert=missouristate.edu@gnu.org 
<help-gnu-emacs-bounces+jlambert=missouristate.edu@gnu.org> on behalf of Stefan 
Monnier via Users list for the GNU Emacs text editor <help-gnu-emacs@gnu.org>
Sent: Monday, April 29, 2024 9:02 PM
To: help-gnu-emacs@gnu.org <help-gnu-emacs@gnu.org>
Subject: Re: Multiple encodings in one file


>> Thank you for the time. What you said gives me some hope but I have
>> a follow-up question. If I visit a file literally, make a change, and
>> save it, the file seems to be different only where I changed it. Is
>> that true?
>
> If you save it while binding coding-system-for-write to no-conversion,
> yes.  IOW, you need to disable encoding while saving.

Also, if you open the file as if it were all utf-8, then the utf-8
parts of the file should look just fine (and the MARC-8 parts may look
screwy), and if you edit it and save the result, it *should* produce
a valid file where only the part you changed was modified.
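
The byte-preservation idea can be illustrated outside Emacs. The following is a minimal Python sketch, not a description of how Emacs implements it: it assumes Python's `surrogateescape` error handler as a stand-in for an editor that keeps undecodable bytes around and writes them back unchanged. The sample bytes and the `load`/`save` names are hypothetical.

```python
# Hypothetical sample: valid UTF-8 text, then a byte sequence that is not
# valid UTF-8 (standing in for MARC-8 data), then plain ASCII.
original = "T\u00edtulo".encode("utf-8") + b" \xe2A " + b"plain ASCII"


def load(data: bytes) -> str:
    """Decode as UTF-8, keeping undecodable bytes as escape code points."""
    return data.decode("utf-8", errors="surrogateescape")


def save(text: str) -> bytes:
    """Encode as UTF-8, turning the escape code points back into raw bytes."""
    return text.encode("utf-8", errors="surrogateescape")


text = load(original)

# An edit confined to the ASCII part of the data.
edited = text.replace("plain", "edited")
result = save(edited)

print(save(text) == original)  # round trip without any edit
print(result)
```

The non-UTF-8 byte never passes through a lossy conversion, so it comes back out exactly as it went in, and only the edited region differs.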

>> If so, then does the following seem reasonable.
>>
>> 1 Find a file literally.
>> 2 The user will accept that some characters will show octal codes or
>>   something similar.
>> 3 Edit the records where understandable and possible.
>> 4 Save file.
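
One way to check the outcome of step 4, namely that the saved file differs from the original only where it was edited, is to compare the two files byte by byte. A minimal Python sketch follows; the function name `changed_span` and the sample data are hypothetical and not taken from any MARC tool.

```python
def changed_span(before: bytes, after: bytes):
    """Return (start, end_before, end_after) for the single region that
    differs between the two byte strings, or None if they are identical."""
    if before == after:
        return None
    limit = min(len(before), len(after))
    # Length of the common prefix.
    start = 0
    while start < limit and before[start] == after[start]:
        start += 1
    # Length of the common suffix, not overlapping the prefix.
    tail = 0
    while tail < limit - start and before[-1 - tail] == after[-1 - tail]:
        tail += 1
    return start, len(before) - tail, len(after) - tail


# Hypothetical data: one non-UTF-8 byte surrounded by ASCII.
before = b"title \xe2A one"
after = b"title \xe2A two"

span = changed_span(before, after)
print(span)
if span is not None:
    start, end_before, end_after = span
    print(before[start:end_before], after[start:end_after])
```

If the reported span is wider than the intended edit, something other than the edit (for example an encoding conversion on save) has altered the file.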

As a quick&dirty solution, that should work as long as you're making
limited changes and only in parts that are mostly ASCII.

If you're designing a major mode, maybe a better approach would look
like: read the file literally (i.e. as bytes) and treat it as a kind of
directory or archive (think tar-mode, dired, archive-mode, Rmail) so
only show a summary of the contents, then let the users "open" a record
which is then extracted (and decoded) into another buffer.
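
The "archive" view described above can be sketched at the byte level. This is a minimal Python illustration, not a major mode: it assumes the MARC 21 layout in which each record starts with a 24-byte leader, leader positions 00-04 hold the record length as five ASCII digits, and leader position 09 is "a" for Unicode records and blank for MARC-8. The function names are hypothetical, and `toy_record` builds deliberately simplified records with an empty directory, only to have something to split.

```python
LEADER_LENGTH = 24
RECORD_TERMINATOR = b"\x1d"
FIELD_TERMINATOR = b"\x1e"


def split_records(data: bytes) -> list:
    """Cut a MARC file into records using the length stored in each leader."""
    records = []
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 5])  # leader positions 00-04
        if length < LEADER_LENGTH:
            raise ValueError(f"implausible record length at byte {pos}")
        records.append(data[pos:pos + length])
        pos += length
    return records


def summarize(record: bytes) -> dict:
    """Return the few leader facts a summary listing might show."""
    leader = record[:LEADER_LENGTH]
    return {
        "length": int(leader[0:5]),
        # Leader position 09: "a" means Unicode, blank means MARC-8.
        "encoding": "utf-8" if leader[9:10] == b"a" else "marc-8",
        "terminated": record.endswith(RECORD_TERMINATOR),
    }


def toy_record(coding: bytes, payload: bytes) -> bytes:
    """Hypothetical helper: a simplified record with an empty directory."""
    body = FIELD_TERMINATOR + payload + FIELD_TERMINATOR + RECORD_TERMINATOR
    length = LEADER_LENGTH + len(body)
    leader = b"%05dnam %s22%05d   4500" % (length, coding, LEADER_LENGTH + 1)
    return leader + body


# Two toy records with different encodings in one "file".
data = toy_record(b"a", "T\u00edtulo".encode("utf-8")) + toy_record(b" ", b"T\xe2itulo")

for number, record in enumerate(split_records(data), start=1):
    print(number, summarize(record))
```

Because the records are sliced out as bytes and never decoded during splitting, joining them back together reproduces the original file exactly; decoding would happen only for the one record a user chooses to open.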


        Stefan

