bug-gnu-emacs

bug#54591: 29.0.50; sqlite-select returns blob result as multibyte string


From: Lars Ingebrigtsen
Subject: bug#54591: 29.0.50; sqlite-select returns blob result as multibyte string
Date: Sat, 02 Apr 2022 14:59:21 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

Johannes Grødem <fjas@grdm.no> writes:

> I might be misunderstanding the issue, but SQLite column types are more
> like documentation than actual rules to be enforced, unless STRICT
> tables are enabled.

Yeah, you can put anything you want into TEXT and BLOB columns.  What
I'd like to see happening is that the Emacs interface here is
predictable and convenient, and that makes my brain hurt a bit here.
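
To illustrate the point about column types being advisory, here is a
minimal sketch using Python's stdlib sqlite3 module rather than the Emacs
interface.  The table `foo' and its columns are made up for the example.
It deliberately puts bytes into the TEXT column and a string into the
BLOB column, then asks SQLite what it actually stored:

```python
import sqlite3

def stored_types():
    """Return the storage class SQLite reports for each inserted value."""
    db = sqlite3.connect(":memory:")
    # Hypothetical table with one TEXT and one BLOB column (not STRICT).
    db.execute("create table foo (t text, b blob)")
    # Deliberately swap the types: bytes into TEXT, a string into BLOB.
    db.execute("insert into foo values (?, ?)", (b"f\xc3\xb3o", "fóo"))
    row = db.execute("select typeof(t), typeof(b) from foo").fetchone()
    db.close()
    return row

print(stored_types())  # ('blob', 'text')
```

Neither insert is rejected, and the storage class follows the value that
was bound, not the declared column type.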

Let's take a TEXT column first.  Currently, if you have the multibyte
string "fóo" and insert with "insert into ... (?)", we encode to utf-8
and put the bytes #x66#xc3#xb3#x6f into the database.  Selecting from
the database, we get the bytes #x66#xc3#xb3#x6f back, decode and return
the string "fóo".

If you have a unibyte string containing the bytes #x66#xc3#xb3#x6f, we
don't do anything with that, but insert the bytes as is.  When
selecting, we decode and return "fóo", which is not what the user
inserted.  In this case, it would be nice to signal an error, but we
can't, because we don't know that it's a TEXT column in the first place.
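
The ambiguity can be sketched outside Emacs, again with Python's sqlite3
module.  The helper name `select_as_text' is hypothetical, and `cast(...
as text)' stands in for the "decode on select" step described above; it
is not how the Emacs code does it.

```python
import sqlite3

def select_as_text(value):
    """Insert VALUE, then read it back forced to text (UTF-8 decoded)."""
    db = sqlite3.connect(":memory:")
    db.execute("create table foo (t text)")
    db.execute("insert into foo values (?)", (value,))
    # CAST interprets the stored bytes as text in the database encoding,
    # which is UTF-8 by default.
    result = db.execute("select cast(t as text) from foo").fetchone()[0]
    db.close()
    return result

from_string = select_as_text("fóo")                           # a string
from_bytes = select_as_text(bytes([0x66, 0xC3, 0xB3, 0x6F]))  # raw bytes
print(from_string == from_bytes == "fóo")  # True
```

Once the value has been decoded, nothing distinguishes the caller who
inserted a string from the one who inserted raw bytes.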

Conversely, with BLOB columns, we would prefer to signal an error on
multibyte strings, but we can't, because we don't know that it's a BLOB
column.  But we do the right thing with unibyte strings -- if you give
it #x66#xc3#xb3#x6f, it'll put those bytes into the BLOB column, and
when selecting, we do know that it's a BLOB column, so we could return
the unibyte string #x66#xc3#xb3#x6f, and everything's fine.  However, if
the user wanted to insert the string "fóo", they'll be getting
#x66#xc3#xb3#x6f back and will probably be sad.

Today, the semantics are at least predictable: We insert everything
encoded to utf-8 (no matter whether using bound parameters or inside the
string), and if the user wanted something binary in the BLOB they
selected, they just have to call `encode-coding-string BLOB-RESULT
'utf-8' to get the binary data.

Which I understand is confusing, because it's very confusing indeed.
But it's consistent, at least.

If we knew the type of the column we were inserting into, we could
be more helpful in the interface, but there doesn't seem to be a way to
get at that information?
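
One possible avenue, offered only as a sketch: SQLite's `pragma
table_info' reports the declared type of each column.  The helper
`declared_types' below is hypothetical, and it's written against Python's
sqlite3 module, not the Emacs interface.  It only returns what the schema
declares; it doesn't say which column a given `?' in a statement binds
to, and the declared type still isn't enforced on non-STRICT tables.

```python
import sqlite3

def declared_types(db, table):
    """Map column name -> declared type, as reported by PRAGMA table_info."""
    # PRAGMA arguments can't be bound parameters, so the table name is
    # interpolated; only pass trusted identifiers here.
    rows = db.execute(f"pragma table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    # Upper-case the type so the result doesn't depend on schema spelling.
    return {row[1]: row[2].upper() for row in rows}

db = sqlite3.connect(":memory:")
db.execute("create table foo (t TEXT, b BLOB)")
print(declared_types(db, "foo"))  # {'t': 'TEXT', 'b': 'BLOB'}
```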

> By the way, if you want to insert BLOBs in the query itself you can do
> it like this, but I guess this doesn't need Emacs support, except maybe
> a helper function for the conversion:
>
>   INSERT INTO foo VALUES (X'deadcafe');

Yes, but that leaves the issue to the caller, and the issue about what
to do when selecting is still unclear.
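
For reference, the conversion helper mentioned in the quoted text could
look something like this.  It's a sketch in Python; `blob_literal' is a
made-up name, not an existing function in Emacs or SQLite.

```python
import sqlite3

def blob_literal(data: bytes) -> str:
    """Format DATA as an SQL blob literal, e.g. X'deadcafe'."""
    # Hex digits are the only characters emitted, so no quoting is needed.
    return "X'" + data.hex() + "'"

db = sqlite3.connect(":memory:")
db.execute("create table foo (b blob)")
# The literal goes into the statement text itself, not a bound parameter.
db.execute(f"insert into foo values ({blob_literal(bytes.fromhex('deadcafe'))})")
stored = db.execute("select b from foo").fetchone()[0]
print(stored == bytes.fromhex("deadcafe"))  # True
```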

-- 
(domestic pets only, the antidote for overdose, milk.)
   bloggy blog: http://lars.ingebrigtsen.no




