lynx-dev
Re: LYNX-DEV Another fotemods.zip update


From: Foteos Macrides
Subject: Re: LYNX-DEV Another fotemods.zip update
Date: Wed, 16 Apr 1997 13:06:34 -0500 (EST)

Klaus Weide <address@hidden> wrote:
>On Tue, 15 Apr 1997, Foteos Macrides wrote:
>
>>      Another update of tonight's fotemods.zip (this should be it for
>> tonight 8-) is available in:
>> 
>>      http://www.slcc.edu/lynx/fote/patches
>> 
>> 1997-04-15
>> * Miscellaneous additional tweaks in HTML.c for more robust error recovery
>>   from bad HTML involving emphasis or style elements (B, BLINK, CITE, EM,
>>   FONT, I, STRONG, and U), or HREF-less NAME-ed Anchors without matching
>>   end tags. - FM
>> * Modified the declarations in HTMLDTD.c and code in SGML.c, HTML.c, and
>>   GridText.c to handle A, B, BLINK, CITE, EM, FONT, I, STRONG, and U
>>   container elements homologously to the modified handling of FORM (see
>>   1997-04-05 mods) so that if they are invalidly interdigitated or have
>>   spurious end tags in the markup, substitutions of the "expected" end
>>   tags by the SGML.c stack-based parser will not be made, and without
>>   messing up the HTML.c stack-based parser.  Appears to work reliably
>>   for all of the elements, and to be reasonably crash safe (hopefully
>>   as safe as the vanilla v2.7.1), but there are no guarantees. - FM
>
>This sounds scary...
>If you go on at that pace, there won't be anything left to do for
>the stack-based parsing !?

        I did it with the current API because it should still be
compatible with what can and should be done when Rob's color/style
stuff is worked into the next formal release (as you apparently
intend to do).  However, v2.7.1+FOTEMODS still is treating all the
"emphasis" elements as if they were synonyms, for underlining.  The
color/style stuff should allow them to be treated individually, but
with a hash table design still be able to cope equivalently with
bad HTML (that's purely "theoretical" at this point, though 8-).


>                                 I wonder whether there is anything which
>cannot be subjected to the same treatment for reasons of principle...

        You should only do this kind of thing with the current API
for elements which do not have styles registered in DefaultStyle.c,
nor have ALIGN attributes.  They inherit the registered style of any
element which contains them (and its, or a P's, CENTER's, or DIV's,
current alignment setting), or the Normal style if they aren't nested.
Thus, there is no need to put them in the HTML.c stack, because their
entries simply will have the styles info for the preceding element in
the stack, reiterated.  So you can just look at the "containing"
element's (or Normal) style in the stack, and also save the memory
allocations that would go into duplicating that entry for A, FORM, and
the emphasis/style container elements and pushing those duplicates
onto the HTML.c stack.  I'm not certain yet that the mods are fully immune
from inherited alignment glitches.  The alignment handling is too
complicated to be sure one has thought it all through correctly, but
empirically, so far, that seems to be OK too.
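
        The idea above can be sketched in C roughly as follows.  This
is only an illustration under stated assumptions, not the actual HTML.c
code: the Parser and StackEntry types, the styleless_open counter, and
the function names are all hypothetical.  Elements with no registered
style of their own are counted rather than pushed, so the style in
effect is always that of the top stack entry (or Normal), and a spurious
or interdigitated end tag for one of them cannot disturb the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names here are hypothetical; this is not the Lynx source. */
#define MAX_DEPTH 64

typedef struct {
    const char *tag;    /* element that owns this stack entry */
    const char *style;  /* style registered for that element  */
} StackEntry;

typedef struct {
    StackEntry entries[MAX_DEPTH];
    int depth;          /* number of pushed entries                    */
    int styleless_open; /* open A/B/I/... start tags: counted only     */
} Parser;

/* Containers assumed to have no registered style of their own. */
static const char *const STYLELESS[] = {
    "A", "B", "BLINK", "CITE", "EM", "FONT", "I", "STRONG", "U"
};

int is_styleless(const char *tag)
{
    size_t i;
    for (i = 0; i < sizeof(STYLELESS) / sizeof(STYLELESS[0]); i++) {
        if (strcmp(tag, STYLELESS[i]) == 0)
            return 1;
    }
    return 0;
}

/* Style in effect: the containing element's, or Normal if none. */
const char *current_style(const Parser *p)
{
    return p->depth > 0 ? p->entries[p->depth - 1].style : "Normal";
}

void start_element(Parser *p, const char *tag, const char *style)
{
    if (is_styleless(tag)) {
        p->styleless_open++;    /* no push, no duplicated style entry */
        return;
    }
    assert(p->depth < MAX_DEPTH);
    p->entries[p->depth].tag = tag;
    p->entries[p->depth].style = style;
    p->depth++;
}

void end_element(Parser *p, const char *tag)
{
    if (is_styleless(tag)) {
        /* A spurious end tag is simply ignored; the stack is untouched. */
        if (p->styleless_open > 0)
            p->styleless_open--;
        return;
    }
    /* This sketch pops only on an exact match; a real parser does more. */
    if (p->depth > 0 && strcmp(p->entries[p->depth - 1].tag, tag) == 0)
        p->depth--;
}
```

With this arrangement, markup such as <H1><B><I>text</B></I></H1>
leaves a single H1 entry on the stack throughout, and current_style()
keeps returning the H1 style until the H1 end tag is seen.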


>I would appreciate reports to lynx-dev from people who have tried Fote's
>latest code: does it improve the handling of invalid pages?  Does it make
>more sense of the typical invalid HTML than standard Lynx 2.7.1 (or
>devel)?

        I tested it with http://www.businesswire.com/headlines.shtml
and it handles that perfectly now, despite all the horrible non-HTML
in it.  That page has a number of link names which are 3 or 4 lines
long, and Lynx highlights only their first two lines when making
them the current link.  I looked again at what would be involved in
modifying the code to highlight more of the current link lines, or
ideally all of them no matter how many, but again said "Ugh!" and
moved that to the bottom of my TODO list. :) :)

        What *is* unrealistic, and people dreaming about it should
"get real", is *fully* reproducing Netscape's and MSIE's non-HTML
handling, without access to either's source code.

                                Fote

=========================================================================
 Foteos Macrides            Worcester Foundation for Biomedical Research
 address@hidden         222 Maple Avenue, Shrewsbury, MA 01545
=========================================================================
