discuss-gnustep

Re: GNUstep Web browser (was Re: WebKit Bounty)


From: address@hidden
Subject: Re: GNUstep Web browser (was Re: WebKit Bounty)
Date: 5 Mar 2007 02:53:09 -0800
User-agent: G2/1.0

> or pass html through html tidy first.

That route seems unnecessary to me: html tidy first parses the HTML
into a tree, fixes some things and writes HTML out again, just so that
we can parse it a second time...

I have read through the rules html tidy uses, and in most cases the
following rules should give the same or a very similar result (this
still needs more testing with badly designed pages):
* if a closing tag does not match the innermost open tag, search
outwards through the enclosing open tags until you find a match (if
there is none, ignore the closing tag)
* tolerate missing quotes around attribute values
* convert all tag names and attribute names to upper case
* ignore <html>, <head>, <body> (except for attributes)
* some tags always go to the HEAD section (e.g. <title>, <meta>)
wherever they appear
* ignore unknown tags
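
A minimal sketch of how the rules above could fit together, written in
Python for brevity and not taken from the actual GNUstep code. All
names in it (Node, parse, parse_attrs, unquote, KNOWN and so on) are
hypothetical, the list of known tags is only a small sample, and
comments, doctypes and entities are not handled at all:

```python
import re

# Assumption: a tiny sample of "known" tags; a real parser needs the full list.
KNOWN = {"P", "B", "I", "A", "DIV", "SPAN", "UL", "LI", "H1", "BR", "IMG",
         "TITLE", "META"}
HEAD_ONLY = {"TITLE", "META"}        # always attached to HEAD
WRAPPERS = {"HTML", "HEAD", "BODY"}  # ignored, except for their attributes
VOID = {"BR", "IMG", "META"}         # never have children

TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^>]*)>")
# Attribute value may be double-quoted, single-quoted, or not properly quoted.
ATTR_RE = re.compile(
    r"""([A-Za-z_:][-\w:.]*)(?:\s*=\s*("[^"]*"|'[^']*'|[^\s>]+))?""")


class Node:
    def __init__(self, name, attrs=None):
        self.name = name
        self.attrs = attrs or {}
        self.children = []  # Node objects or text strings


def unquote(value):
    # Drop a leading quote and, if present, the matching trailing quote.
    if value[:1] in ('"', "'"):
        quote, value = value[0], value[1:]
        if value.endswith(quote):
            value = value[:-1]
    return value


def parse_attrs(raw):
    # Attribute names are converted to upper case.
    return {name.upper(): unquote(value)
            for name, value in ATTR_RE.findall(raw)}


def parse(html):
    root, head, body = Node("HTML"), Node("HEAD"), Node("BODY")
    root.children = [head, body]
    wrappers = {"HTML": root, "HEAD": head, "BODY": body}
    stack = [body]  # open elements; body itself is never popped
    pos = 0
    for m in TAG_RE.finditer(html):
        text = html[pos:m.start()]
        if text.strip():
            stack[-1].children.append(text)
        pos = m.end()
        closing, name, raw = m.group(1), m.group(2).upper(), m.group(3)
        if name in WRAPPERS:
            # Keep the attributes, ignore the tag as structure.
            if not closing:
                wrappers[name].attrs.update(parse_attrs(raw))
            continue
        if name not in KNOWN:
            continue  # unknown tags are ignored; their text content is kept
        if closing:
            # Search outwards for a matching open tag; ignore if none.
            for i in range(len(stack) - 1, 0, -1):
                if stack[i].name == name:
                    del stack[i:]
                    break
            continue
        node = Node(name, parse_attrs(raw))
        parent = head if name in HEAD_ONLY else stack[-1]
        parent.children.append(node)
        if name not in VOID:
            stack.append(node)
    tail = html[pos:]
    if tail.strip():
        stack[-1].children.append(tail)
    return root
```

For input such as `<p class=intro>Hello <b>world</p>`, the unquoted
attribute is accepted and the `</p>` closes the still-open `<b>` as a
side effect of the outward search. A `</b>` arriving later finds no
matching open tag and is dropped.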

As soon as I have new, more or less stable code, I will upload a
snapshot so that you can look into it.

-- hns


