lynx-dev

Re: lynx-dev megapatch to dev.10 - and a rant


From: Vlad Harchev
Subject: Re: lynx-dev megapatch to dev.10 - and a rant
Date: Thu, 14 Oct 1999 21:32:34 +0500 (SAMST)

On Thu, 14 Oct 1999, Klaus Weide wrote:

 Very pleased to get your reply, Klaus (you ignored a lot of my private email,
though its content was always the same: 'when do you plan to release TRST?').

> On Thu, 14 Oct 1999, Vlad Harchev wrote:
> 
> >  I tried to apply your patch to dev10. It fails to compile since
> > 'src/TRSTable.c' includes non-existing file 'AC.c'. Can you make it
> > available (if you make that file available separately, I will be even
> > more thankful to you).
> 
> Very sorry about that. It is only one function that I moved to a separate
> file for testing, and then completely forgot about.  The contents are
> appended, it should just replace the `#include "AC.c"' in TRSTable.c.

  Thanks for the quick response. I compiled it, and it works rather well.

> >  As for table rendering - I started thinking about implementing it
> > completely, using something like w3m algorithms.
> 
> Have you looked at w3m's algorithms?  (I haven't looked at any.)
> If yes a summary of the approach would be interesting - just out of
> curiosity.

 The description of the algorithm used is supplied with w3m in doc/STORY.html.
Here is the _entire_ relevant part (note: it was rendered with justification
and hyphenation enabled :)

Table rendering algorithm in w3m

   HTML table rendering is difficult. Tabular environment of LaTeX is not
   very  difficult,  which makes the width of a column either a specified
   value  or  the  maximum width to put items into it. On the other hand,
   HTML  table  renderer  has to decide the width of a column so that the
   entire table can fit into the display appropriately, and fold the con-
   tents of the table according to the column width. Inappropriate column
   width  decision  makes  the table ugly. Moreover, table can be nested,
   which makes the algorithm more complicated.
    1. First, calculate the maximum and minimum width of each column. The
       maximum  width is the width required to display the column without
       folding the contents. Generally, it is the length of paragraph de-
       limited  by  <BR>  or <P>. The minimum width is the lower limit to
       display  the  contents.  If the column contains the word `interna-
       tionalization',  the  minimum width will be 20. If the column con-
       tains  <pre>..</pre>,  the  maximum width of the preformatted text
       will be the minimum width of the column.
    2. If  the  width  of the column is specified by WIDTH attribute, fix
       the  column  width  using  that  value.  If the specified width is
       smaller than the minimum width of the column, fix the column width
       to the minimum width.
    3. Calculate  the  sum  of the maximum width (or fixed width) of each
       column  and  check  if  the sum exceeds the screen width. If it is
       smaller than screen width, these values are used for width of each
       column.
    4. If  the  sum is larger than the screen width, determine the widths
       of each column according to the following steps.
         1. Let  W be the screen width subtracted by the sum of widths of
            fixed-width columns.
         2. Distribute W into the columns whose width are not decided, in
            proportion to the logarithm of the maximum width of each col-
            umn.
         3. If the distributed width of a column is smaller than the min-
            imum  width,  then fix the width of the column to the minimum
            width, and do the distribution again.
       
   In this process, distributed width is proportion to logarithm of maxi-
   mum  width,  but I am not sure that this heuristic is the best. It can
   be, for example, square root of the maximum width.
   
   The  algorithm above assumes that the screen width is known. But it is
   not  true  for nested table. According the algorithm above, the column
   width  of  the outer table have to be known to render the inner table,
   while the total width of the inner table have to be known to determine
   the  column  width of the outer table. If WIDTH attribute exists there
   are  no  problems.  Otherwise, w3m assumes that the inner table is 0.8
   times  as wide as the outer table. It works fine, but if there are two
   tables  side  by  side in an outer table, the width of the outer table
   always  exceeds  the  screen  width. To render this kind of table cor-
   rectly,  one have to render the table once, check the width of outmost
   table,  and  then render the entire table again. Netscape might employ
   this kind of algorithm.
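
A minimal sketch in C of steps 2-4 of the description above, for a single
non-nested table. This is not w3m's code: the names Column, layout_columns and
ln_approx are hypothetical, the logarithm is written out by hand only so that
the sketch needs no math library, and the weight log(max_width + 1) uses a +1
that is an added guard (an assumption) against a zero weight for one-character
columns.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-column record; the field names are not taken from w3m. */
typedef struct {
    int min_width;    /* lower limit, e.g. the longest unbreakable word  */
    int max_width;    /* width needed to show the column without folding */
    int fixed_width;  /* value of the WIDTH attribute, 0 if absent       */
    int width;        /* result                                          */
    int decided;      /* scratch flag: width is final                    */
} Column;

/* Natural logarithm for x >= 1, written out so the sketch needs no libm. */
static double ln_approx(double x)
{
    int k = 0;
    while (x >= 2.0) { x /= 2.0; k++; }   /* x = m * 2^k with 1 <= m < 2 */
    double t = (x - 1.0) / (x + 1.0);     /* ln m = 2 * atanh(t)         */
    double t2 = t * t, term = t, sum = 0.0;
    for (int i = 1; i < 40; i += 2) {
        sum += term / i;
        term *= t2;
    }
    return 2.0 * sum + k * 0.6931471805599453;   /* + k * ln 2 */
}

static void layout_columns(Column *cols, size_t n, int screen_width)
{
    long total = 0;
    size_t i;

    /* Steps 2 and 3: apply WIDTH (never below the minimum), else try max. */
    for (i = 0; i < n; i++) {
        assert(cols[i].min_width >= 1 && cols[i].max_width >= cols[i].min_width);
        if (cols[i].fixed_width > 0) {
            cols[i].width = cols[i].fixed_width < cols[i].min_width
                          ? cols[i].min_width : cols[i].fixed_width;
            cols[i].decided = 1;
        } else {
            cols[i].width = cols[i].max_width;
            cols[i].decided = 0;
        }
        total += cols[i].width;
    }
    if (total <= screen_width)
        return;                           /* everything fits unfolded */

    /* Step 4.1: W = screen width minus the fixed-width columns. */
    int avail = screen_width;
    for (i = 0; i < n; i++)
        if (cols[i].decided)
            avail -= cols[i].width;

    /* Steps 4.2 and 4.3: share W by log weight; pin any column that would
       fall below its minimum and distribute again. */
    for (;;) {
        double weight_sum = 0.0;
        int changed = 0;
        for (i = 0; i < n; i++)
            if (!cols[i].decided)
                weight_sum += ln_approx(cols[i].max_width + 1.0);
        if (weight_sum == 0.0)
            break;                        /* no undecided columns left */
        for (i = 0; i < n; i++) {
            if (cols[i].decided)
                continue;
            double w = ln_approx(cols[i].max_width + 1.0);
            int share = (int)(avail * (w / weight_sum));
            if (share < cols[i].min_width) {
                cols[i].width = cols[i].min_width;
                cols[i].decided = 1;
                avail -= cols[i].min_width;
                changed = 1;
                break;                    /* restart the distribution */
            }
            cols[i].width = share;
        }
        if (!changed)
            break;
    }
}
```

The sketch follows the quoted steps literally: it does not cap a share at the
column's maximum width, and it does nothing about nested tables or the 0.8
guess mentioned in the last paragraph.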


  I quickly looked in the sources (less than 5 minutes). Interestingly, it
uses a library for solving systems of linear equations for something. It uses
garbage collection extensively (the author said it helped a lot). But w3m
differs from lynx in that it allows shifting the screen (i.e. the rendered
text can be wider than the screen, as in 'less'). I can note that w3m works
much slower than lynx and takes much more memory (approx. 3-4 times more).
  
>    ------------
> 
> Btw Vlad, while doing that merging of my older changes I found
> that there are some pieces of code that have become IMO completely
> unreadable, and some of your changes are foremost...

 What do you mean by 'foremost'?

> Some examples
> 
> (1) in HText_appendCharacter
> 
>     if (IsSpecialAttrChar(ch) && ch != LY_SOFT_NEWLINE) {
> #if !defined(USE_COLOR_STYLE) || !defined(NO_DUMP_WITH_BACKSPACES)
>         if (line->size >= (MAX_LINE-1)) return;
> #if defined(USE_COLOR_STYLE) && !defined(NO_DUMP_WITH_BACKSPACES)
>         if (with_backspaces && HTCJK==NOCJK && !text->T.output_utf8) {
> #endif
>         if (ch == LY_UNDERLINE_START_CHAR) {
>             .....
> 
> Does *anybody* get a clue, by looking at this, under which circumstances 
> the following section gets executed and under which it doesn't?
> With the double negatives combined with other flags that then control
> other conditions etc., I can stare at it for 20 minutes and still not get
> it.

  I hoped that writing cpp conditions that resemble human language (i.e.
without boolean expression optimization) would be more understandable.
Probably indenting could help. And as I remember, this way (in this particular
case) required fewer changes to the source code. And anybody can ask me what
something means.
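
One way to answer the question for example (1) is to write the two
preprocessor conditions out as a truth table. The sketch below does that in C
with the build switches passed as ordinary integers; section_status and the
enum names are hypothetical, and it assumes that the outer #if (whose #endif
is elided in the quoted fragment) covers the whole section.

```c
#include <assert.h>

/* What happens to the section under a given build configuration. */
typedef enum {
    NOT_COMPILED,       /* the section is removed by the outer #if        */
    ALWAYS,             /* compiled, with no extra runtime test           */
    ONLY_IF_RUNTIME_OK  /* compiled, guarded by the with_backspaces test  */
} SectionStatus;

/* Arguments are 1 if the corresponding macro is defined, 0 otherwise. */
static SectionStatus section_status(int use_color_style,
                                    int no_dump_with_backspaces)
{
    /* outer: #if !defined(USE_COLOR_STYLE) || !defined(NO_DUMP_WITH_BACKSPACES) */
    if (!(!use_color_style || !no_dump_with_backspaces))
        return NOT_COMPILED;

    /* inner: #if defined(USE_COLOR_STYLE) && !defined(NO_DUMP_WITH_BACKSPACES) */
    if (use_color_style && !no_dump_with_backspaces)
        return ONLY_IF_RUNTIME_OK;

    return ALWAYS;
}
```

Read this way, the section runs unconditionally without USE_COLOR_STYLE, runs
only when the runtime test passes with USE_COLOR_STYLE alone, and disappears
when both macros are defined.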

> (2) in form_getstr
> 
>             case LTARROW:       /* 1999/04/14 (Wed) 15:01:33 */
>                 if (MyEdit.pos == 0 && repeat == -1) {
>                     int c = YES;    /* Go back immediately if no changes */
> #ifndef NO_NONSTICKY_INPUTS
>                     if (sticky_inputs
>                      && !textfield_stop_at_left_edge)
> #endif
>                     if (strcmp(MyEdit.buffer, value)) {
>                         c = HTConfirmDefault(PREV_DOC_QUERY, NO);
>                     }
>                     if (c == YES) {
> #ifndef NO_NONSTICKY_INPUTS
>                         if (textfield_stop_at_left_edge)
>                             goto again;
> #endif
>                         return(ch);
>                     } else {
> 
> Again, I don't get it.  In fact it seems to be wrong, and seems to
> have the result that now - with the set of macros I have compiled
> with *which do not include anything about `STICKY'* - I get *trapped*
> in form fields without a prompt to escape.  But the logic is too
> convoluted to see easily what's supposed to happen.

 What does 'get trapped' mean? Does it mean that you are completely locked
in the field? Here is some background: STICKY_INPUTS controls whether text
inputs must be activated before editing (if TRUE, they must be activated; if
FALSE, the inputs are already "activated"). STICKY_FIELDS controls whether you
can leave the current document by pressing arrow-left in an already-activated
input (if TRUE, the user cannot leave an activated text input with arrow-left;
if FALSE, the user can).

 I compiled lss-enabled dev10 with your patch applied and tested all 4
combinations of settings - everything works as expected. You can leave a text
input by pressing ENTER or 'arrow-up|down'. So the existence of other links on
the page guarantees an exit from the text input (the user will be trapped only
on a page that contains just one input field and no hyperlinks).
 
 Again, in this case, indentation/comments could help. Sorry.
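
As an illustration of what a commented version could look like, here is a
small C sketch of the LTARROW decision from example (2), read from the quoted
fragment with NO_NONSTICKY_INPUTS undefined. The function left_arrow_action,
the enum and the buffer_changed parameter are hypothetical names; only
sticky_inputs and textfield_stop_at_left_edge come from the quoted code, and
the branch taken when the user declines the prompt is not shown there, so it
is not modelled.

```c
#include <assert.h>

typedef enum {
    STAY_IN_FIELD,      /* the `goto again' branch                       */
    ASK_BEFORE_LEAVING, /* HTConfirmDefault(PREV_DOC_QUERY, NO) is shown */
    LEAVE_AT_ONCE       /* return(ch) without a prompt                   */
} LeftArrowAction;

/* Left arrow pressed at position 0 of a text field (repeat == -1). */
static LeftArrowAction left_arrow_action(int sticky_inputs,
                                         int textfield_stop_at_left_edge,
                                         int buffer_changed)
{
    /* With stop-at-left-edge set, the prompt is skipped and the key is
       swallowed: the user stays in the field. */
    if (textfield_stop_at_left_edge)
        return STAY_IN_FIELD;

    /* Only sticky inputs ask for confirmation, and only after an edit. */
    if (sticky_inputs && buffer_changed)
        return ASK_BEFORE_LEAVING;

    return LEAVE_AT_ONCE;
}
```

Under this reading, a non-sticky input whose text was changed is left without
any prompt, and a field with textfield_stop_at_left_edge set can never be left
with arrow-left.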

> (3) a lot in SGML.c - 'nuff said.

 I don't see any other way of implementing a rather powerful syntax
highlighter (one that uses the same rules as lynx when parsing) without a lot
of effort/pain. Klaus, you understood how it works, you updated it, so why the
question? I can explain how it works to any lynx developer.

> This proliferation of new preprocessor macros to control some
> micro-feature is A Very Bad Thing.  Particularly when they are
> logical negatives and are used in complex logical combinations
> with other conditions, preprocessor and runtime.

  I use macro conditions for things like NO_DUMP_WITH_BACKSPACES mostly to
mark the places of changes (it's much easier to grep for the code responsible
for that thing, isn't it?). As I remember, I posted a cpp condition simplifier
(an awk script that removes 'conditionals with constant value': just tell it
that NO_DUMP_WITH_BACKSPACES is always 0 and pass the code through it, though
it can't handle expressions like x || y). As for negatives, Tom is making
substitutions, as I understand it (when patches are applied).

> I was hesitant to write this, since no doubt I have done my share to
> make lynx code harder to understand.  Glass house and stones etc.
> Still it has to be said.  If you find I have done the same you can
> rant about it, too. :)

 I can advise you and anybody else to use cpp macros to make hierarchical code
dependencies: if some block of code is repeated and it can't be made a
function, don't cut and paste it, make a macro out of it (so when the logic
has to be changed, only the macro definition has to be changed, rather than
all occurrences of the block body). Do so even for reducing text size: it's
visually easier to compare a several-letter macro with one parameter (like
PUTC(ch)) than HText_appendCharacter(text,ch). This is a machine's work.
Example: PSRCSTART in SGML.c.
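
A minimal sketch of the PUTC idea in C. Text, Text_appendCharacter and
emit_tag are hypothetical stand-ins written for this example; they are not
lynx's HText structure or HText_appendCharacter, and the macro assumes that a
variable named `text' is in scope at every call site.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a text buffer; lynx's real one is far larger. */
typedef struct {
    char   data[64];
    size_t len;
} Text;

static void Text_appendCharacter(Text *t, char ch)
{
    if (t->len + 1 < sizeof t->data) {   /* keep room for the terminator */
        t->data[t->len++] = ch;
        t->data[t->len] = '\0';
    }
}

/* Short alias: call sites show only what varies. If the append logic
   changes, only this one definition has to be edited. */
#define PUTC(ch) Text_appendCharacter(text, (ch))

static void emit_tag(Text *text, const char *name)
{
    PUTC('<');
    while (*name)
        PUTC(*name++);
    PUTC('>');
}
```

The cost of the shorter call sites is the hidden dependency on the `text'
variable, which a reader has to know about.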

 
> So can you please do something to make your additions more
> understandable.  Particularly the examples above.  (And check
> whether that sticky stuff really does what it's supposed to,
> not just in your configuration.)  
> 
> Some of those alternatives are just unnecessary.  You have to decide
> whether a code alternative should be in the code or not.  If you
> can't decide, and feel the need to include several variants, then
> maybe the change isn't worth it at all.  I'm not talking about
> features that it might make sense to disable or enable at compile
> time, but those where it really doesn't make sense to provide
> configurability (and I would count NO_DUMP_WITH_BACKSPACES, 
> NO_NONSTICKY_INPUTS, and OPT,OPT1 in SGML.c here.  Either the code
> savings if someone chooses not to compile it in are negligible,
> or it's obvious that one alternative is _meant_ to be the better
> one and should be used.)

 As I said, I use cpp conditions to denote places with changes. And I'm sorry
for OPT, OPT1 - these were an attempt at optimization. I hope I will clean
them up one day.

>[...] 
 
 As for psrc mode - did you fix it to handle XML-like input?

 Best regards,
  -Vlad

