lynx-dev

LYNX-DEV Re: Encoding: x-gzip


From: Klaus Weide
Subject: LYNX-DEV Re: Encoding: x-gzip
Date: Tue, 10 Dec 1996 12:16:43 -0600 (CST)

[This is longish, but maybe somebody finds it useful.  There is a patch
at the end.]

Here is a little empirical study about Lynx2.6's behaviour when 
encountering a MIME type / file suffix, possibly combined with
compression.  I use a non-standard type about which Lynx has no
built-in knowledge to simplify things, and use only .mailcap and
.mime.types to let Lynx know about it (or not, see below).

There are two files, f1.blah and f2.blah.gz, which I access with Lynx
either via file: URLs (equivalent to just giving the filenames in this
case) or via http: URLs from an HTTP server.  When served via HTTP,
the server is configured to send them with the following headers:

# http:[...]/f1.blah:
Content-type: application/x-blah
# (No Content-encoding)

# http:[...]/f2.blah.gz:
Content-type: application/x-blah
Content-encoding: gzip

The four cases I test are combinations of:

MCAP-: no relevant line in .mailcap
MCAP+: The following line in .mailcap defines a viewer:  
application/x-blah; less %s; needsterminal

MTYP-: no relevant line in .mime.types
MTYP+: Following line in .mime.types associates suffix .blah w/ the type:
application/x-blah      blah

I observe four possible results:
DWNL-1: Lynx offers "application/x-blah D)ownload, or C)ancel" prompt
DWNL-2: Lynx offers "application/gzip D)ownload, or C)ancel" prompt
VIEW: Lynx invokes viewer (in this case, less) (for f2: uncompressing)
TEXT: Lynx displays file as if it were text/plain (for f2: after uncompressing)

Now here are the results:  [URLs only partially given to fit on line]

             file:/f1.blah  file:/f2.blah.gz  http:/f1.blah  http:/f2.blah.gz 
MCAP-/MTYP-   TEXT    [1]    TEXT    [1]       DWNL-1 [2]     DWNL-2 [2]
MCAP-/MTYP+   DWNL-1  [2]    DWNL-2  [2]       DWNL-1 [2]     DWNL-2 [2]
MCAP+/MTYP-   TEXT    [1]    TEXT    [1]       VIEW           TEXT   [3]
MCAP+/MTYP+   VIEW           VIEW              VIEW           VIEW

Remarks/Annotations/Interpretations:
[1] If the MIME type is not known, a local file is treated like text/plain.
[2] If the MIME type is known but Lynx doesn't know what to do with it,
    it offers a download prompt.
[3] If the MIME type is known and the file was compressed, but no suffix
    is associated with this MIME type, Lynx treats the file like text/plain.
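
For readers who prefer code to tables, here is one way to write the table
above as a small C function.  It is only a model of what I observed, not
Lynx code; the function name, the parameter names and the enum are made up
for this illustration.

```c
#include <stdbool.h>

/* The four observed results from the table. */
enum outcome { TEXT, VIEW, DWNL_1, DWNL_2 };

/*
 * Model of the observed behaviour (hypothetical helper, not part of Lynx).
 *   http: true for http: URLs, false for file: URLs
 *   gz:   true for f2.blah.gz, false for f1.blah
 *   mcap: true for MCAP+ (viewer defined in .mailcap)
 *   mtyp: true for MTYP+ (suffix defined in .mime.types)
 */
enum outcome observed_result(bool http, bool gz, bool mcap, bool mtyp)
{
    /* The type is known from the HTTP header, or from the suffix mapping. */
    bool type_known = http || mtyp;

    if (!type_known)
        return TEXT;                    /* remark [1] */
    if (!mcap)
        return gz ? DWNL_2 : DWNL_1;    /* remark [2] */
    if (http && gz && !mtyp)
        return TEXT;                    /* remark [3], the questionable case */
    return VIEW;
}
```

The third test is the only one that depends on compression once a viewer
is defined, which is what makes [3] stand out.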

The behaviour noted in [3] looks wrong to me.  It should not be necessary
to have a suffix defined for a MIME type in order to invoke the defined
viewer for it, if that MIME type is already known from the HTTP Content-type
header.  After all it works fine without a suffix definition if the file
doesn't arrive in compressed form.

The behaviour in [3] is an artifact of the way Lynx handles compressed
files:  first the compressed file is written to a temp file with a suffix
derived from the MIME type (_IF_ a suffix can be derived from the type).
[A compressor-specific suffix is also appended, in my case .gz.]  Then
that file is decompressed by a system call.  Then Lynx forgets everything
it knew about the MIME type from the HTTP header, and tries to derive
a MEDIA type from the suffix.
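
The round trip just described can be sketched in a few lines of C.  This
is a simplified illustration under my own assumptions, not the actual Lynx
code: struct suffix_map, the three function names and the "tmpfile" base
name are all invented here, and the compressor suffix is left out because
it is gone again after decompression.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the suffix table built from .mime.types. */
struct suffix_map {
    const char *suffix;   /* e.g. ".blah" */
    const char *type;     /* e.g. "application/x-blah" */
};

/* MIME type -> suffix, or NULL if no suffix is associated with the type. */
const char *suffix_for_type(const struct suffix_map *map, size_t n,
                            const char *type)
{
    assert(type != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(map[i].type, type) == 0)
            return map[i].suffix;
    return NULL;
}

/* File name -> MIME type by suffix; unknown suffixes become text/plain. */
const char *type_for_name(const struct suffix_map *map, size_t n,
                          const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);
    for (size_t i = 0; i < n; i++) {
        size_t slen = strlen(map[i].suffix);
        if (len >= slen && strcmp(name + len - slen, map[i].suffix) == 0)
            return map[i].type;
    }
    return "text/plain";
}

/* Header type -> temp file name -> type re-derived from the name alone. */
const char *type_after_decompression(const struct suffix_map *map, size_t n,
                                     const char *header_type)
{
    char tmpname[64] = "tmpfile";   /* hypothetical temp file base name */
    const char *suffix = suffix_for_type(map, n, header_type);

    if (suffix != NULL)             /* only _IF_ a suffix can be derived */
        strncat(tmpname, suffix, sizeof tmpname - strlen(tmpname) - 1);

    /* header_type is not consulted again from here on. */
    return type_for_name(map, n, tmpname);
}
```

With a table that maps .blah to application/x-blah the type survives the
round trip; with an empty table the same header type comes back as
text/plain, which is case [3].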

Worse is (IMHO) that character set information from HTTP headers is
also lost this way.  The behaviour wrt content-type can be changed
by associating a suffix with a MIME type, but this won't work for
charset info.  [Note: this is about the standard Lynx, not specifically
about Lynx+chartrans.]

With the patch below, Lynx will use MIME type and charset info for
temporary files if it already has learnt it from somewhere else.
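
The rule the patch applies can be shown in isolation as follows.  This is
a minimal sketch, not the patched HTFile.c: struct anchor_info is a cut-down
stand-in I made up for the anchor object (only the content_type and charset
fields that the patch tests), and the function names and fallback
parameters are assumptions for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the anchor object. */
struct anchor_info {
    const char *content_type;   /* NULL or "" if not yet known */
    const char *charset;        /* NULL or "" if not yet known */
};

/* Nonzero for a non-empty string; same test as `p && *p` in the patch. */
int is_set(const char *s)
{
    return s != NULL && *s != '\0';
}

/* Prefer the type already recorded in the anchor over the suffix mapping. */
const char *effective_type(const struct anchor_info *a,
                           const char *suffix_type)
{
    assert(a != NULL);
    return is_set(a->content_type) ? a->content_type : suffix_type;
}

/* Likewise for the charset: keep a known value, else use the fallback. */
const char *effective_charset(const struct anchor_info *a,
                              const char *fallback_charset)
{
    assert(a != NULL);
    return is_set(a->charset) ? a->charset : fallback_charset;
}
```

An anchor with nothing recorded behaves exactly as before, since both
functions then return whatever the suffix mapping or the default supplies.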

Fote, is there any way this would break something else?  I couldn't think
of any, but maybe you will...
 
   Klaus

*** ../lynxoffi/lynx2-6/WWW/Library/Implementation/HTFile.c~    Sun Nov 10 13:13:56 1996
--- lynx2-6/WWW/Library/Implementation/HTFile.c Tue Dec 10 11:53:23 1996
***************
*** 1282,1291 ****
--- 1282,1308 ----
  #endif /* VMS */
  
      /*
+     ** If the anchor object passed in already has content_type set,
+     ** *trust that* instead of using suffix mapping.  This will be the
+     ** case when HTLoadFile is called on the temporary file which is
+     ** the result of decompressing a compressed file.  - kw
+     */
+     if (anchor->content_type && *anchor->content_type) {
+        format = HTAtom_for(anchor->content_type);
+     }
+     else
+     /*
      ** Determine the format and encoding mapped to any suffix.
      */
      format = HTFileFormat(filename, &encoding);
  
+     /*
+     ** If the anchor object passed in already has charset set,
+     ** *trust that* instead of using suffix mapping.  This will be the
+     ** case when HTLoadFile is called on the temporary file which is
+     ** the result of decompressing a compressed file.  - kw
+     */
+     if (!(anchor->charset && *anchor->charset))
      /*
      **  Check the format for an extended MIME charset value, and
      **  act on it if present.  Otherwise, assume the ISO-8859-1
