bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#54154: Emacs commit ae963e80a79f5a9184daabfc8197f211a39b136d (sieve-


From: Eli Zaretskii
Subject: bug#54154: Emacs commit ae963e80a79f5a9184daabfc8197f211a39b136d (sieve-manage)
Date: Thu, 19 Jan 2023 16:08:58 +0200

> From: Kai Tetzlaff <emacs+bug@tetzco.de>
> Cc: herbert@gojira.at,  larsi@gnus.org,  54154@debbugs.gnu.org
> Date: Thu, 19 Jan 2023 13:38:13 +0100
> 
> Eli Zaretskii <eliz@gnu.org> writes:
> > The duplicate use of with-current-buffer is sub-optimal,
> > IMO.  What about the simpler code below:
> >
> >   (when sieve-manage-log
> >     (let* ((existing-log-buffer (get-buffer sieve-manage-log))
> >            (log-buffer (or existing-log-buffer
> >                            (get-buffer-create sieve-manage-log))))
> >       (with-current-buffer log-buffer
> >         (unless existing-log-buffer
> >           ;; Do this only once, when creating the log buffer.
> >           (set-buffer-multibyte nil)
> >           (buffer-disable-undo))
> >         (goto-char (point-max))
> >         (apply #'insert args)))))
> 
> Yes, that provides more insight into what the code intends to do. Here's
> the patch (with additional updated doc string):

Thanks, installed on the emacs-29 branch.

> > This still uses a less-than-elegant implementation that calls
> > with-current-buffer twice.
> 
> Yes, true. But since `sieve-manage--set-internal-buffer-properties' is
> used in two different places, the more elegant solution you suggested
> above would require duplicating the body of the function in those
> places. I just didn't see a better way.

I'm not sure why you need to force the encoding of the process buffer,
when you already set the coding-system to be used for decoding stuff
from the process.  Is that really needed?

But if you really need this, then just make the insertion of the text
into the buffer you create optional: then for the process-buffer pass
nil as the text to insert, and you can do the with-current-buffer
dance only inside that function.

> >>  (defun sieve-manage-encode (utf8-string)
> >>    "Convert UTF8-STRING to managesieve protocol octets."
> >> -  (encode-coding-string utf8-string 'raw-text t))
> >> +  (encode-coding-string utf8-string sieve-manage--coding-system t))
> >
> > Why is the argument called utf8-string?  If it's indeed a string
> > encoded in UTF-8, why do you need to encode it again with
> > raw-text-unix? it should be a no-op in that case.  So please tell more
> > about the underlying issue.
> 
> I chose the name as a hint to the user that the incoming string should
> be UTF-8 encoded. But that is probably misleading since the string
> itself doesn't have an encoding? So let's change the function to:
> 
>     (defun sieve-manage-encode (str)
>       "Convert STR to managesieve protocol octets."
>       (encode-coding-string str sieve-manage--coding-system t))
> 
> Regarding the potential double encoding: When sending data over the
> network connection, `sieve-manage-encode' intends to make sure that
> `utf8-string' data is converted to a byte/octet representation. I tried
> to explain that in the doc string of `sieve-manage--coding-system':
> 
>     (defconst sieve-manage--coding-system 'raw-text-unix
>       "Use 'raw-text-unix coding system for (network) communication.
>     
>     Sets the coding system used for the internal (process, log)
>     buffers and the network stream created to communicate with the
>     managesieve server.  Using 'raw-text encoding enables unibyte
>     mode and makes sure that sent/received octets (bytes) remain
>     untouched by the coding system.  The explicit use of `-unix`
>     avoids EOL conversions (and thus keeps CRLF (\"\\r\\n\") intact).")
> 
> The original problem was that when communicating with the sievemanage
> server, we need to handle length elements where we need make sure that
> calculated values take into account that UTF-8 characters may comprise
> multiple octets.
> 
> Even after reading the relevant sections of the documentation multiple
> times I was (and am still) not sure what exactly the various coding
> system settings do and how they interact with buffers and networking
> functions. So forgive me if what I'm doing there looks weird to your
> expert eyes.
> 
> When working on the original patch, I had several uhoh moments where
> data sent to or received from the network seemed to have been
> automatically modified by the coding system (unfortunately, I don't
> remember the exact details). So I tried to eliminate any such automatic
> modifications by using 'binary or 'raw-text encodings on code paths
> which handle network data. Basically, my thinking was: 'better do things
> twice/thrice/... before introducing new points of failure'.

Since you seem to be encoding and decoding to/from UTF-8 by hand in
sieve-manage-encode/decode, you should use 'binary' as the
process-codings-system for the network connection to the server, and
that's it.  I see no reason to encode again using raw-text-unix.

What you should do is call sieve-manage-encode inside
sieve-manage-send, and count the bytes there after encoding the
payload.

> >> @@ -244,8 +267,7 @@ sieve-manage-open-server
> >>                   (open-network-stream
> >>                    "SIEVE" buffer server port
> >>                    :type stream
> >> -                  ;; eol type unix is required to preserve "\r\n"
> >> -                  :coding 'raw-text-unix
> >> +                  :coding `(binary . ,sieve-manage--coding-system)
> >
> > Same question about encoding with raw-text-unix here: using it means
> > some other code will need to encode the text with a real encoding,
> > which in this case is UTF-8 (AFAIU the managesieve protocol RFC).  So
> > why not use utf-8-unix here?
> 
> Same as above: I'm just not sure that this is the right thing.

See above.

> But after thinking about it some more, I made the following changes (as
> an experiment):
> 
> 1. set `sieve-manage--coding-system' to 'utf-8-unix and
> 2. changed the call to `open-network-stream' in the patch above to
> >> @@ -244,8 +267,7 @@ sieve-manage-open-server
> >>                   (open-network-stream
> >>                    "SIEVE" buffer server port
> >>                    :type stream
> >> -                  ;; eol type unix is required to preserve "\r\n"
> >> -                  :coding 'raw-text-unix
> >> +                  :coding sieve-manage--coding-system
> instead of the previous, asymmetric mix of `binary and
> `sieve-manage--coding-system'.

This could work, but AFAIU, you need to specify the content length in
bytes for the PUTSCRIPT command, so you must encode the content
yourself.  Thus my suggestion to use 'binary' in the :coding attribute
of the process, and instead encode/decode using
sieve-manage-encode/decode to/from UTF-8 inside sieve-manage-send and
sieve-manage-parse-* functions.

> > Should the addition of BYE support be mentioned in NEWS?
> 
> I can certainly do that if you think that this is useful. It just seems
> to be more of an internal detail which probably doesn't mean much to
> most users.

Isn't BYE provides some capabilities that user/callers would like to
know about?





reply via email to

[Prev in Thread] Current Thread [Next in Thread]