[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re-working yyitems

From: Valentin Tolmer
Subject: Re-working yyitems
Date: Tue, 7 May 2019 11:23:44 +0900


I'm still in the progress of refactoring glr.cc to be more c++-friendly and
eventually introduce variants. Right now, I'm trying to eliminate the last
use of YYMALLOC: yyitems.

yyitems is an (optionally) expandable array of yyGLRStackItem, which
contains all the items in all the glr stacks. It's currently handled as an
arena-style array where allocating a new item is just bumping up the
yynextFree pointer. When it needs to grow, reallocation is a bit tricky
since we have to update all the cross-referencing fields of the elements.
Same for compression (IIUC, it's compressed when we get back to a single
stack after removing the others). The main advantage is that
allocating/deleting items on the stack is near-instantaneous, but it makes
the code complex.

If we want to avoid re-writing a vector-like container, the grow operation
should be a simple copy. However, the elements have internal pointers to
each other. The possibilities are thus:
- Keep the current implementation: fast, but ugly and complex
- Switch to indexes instead of pointers for internal references, and switch
the container to vector<item>: extra care must be taken when deleting
elements to update the indexes, but otherwise growing, pushing and popping
should be painless
- Switch to a vector<unique_ptr<item>>: This way the pointers are stable.
We're counting on malloc to handle the arena for us, and
growing/pushing/popping/deleting is painless. However, we lose access to
the index of an element (we can't work it out from a pointer), which is
only used in debug. If the index itself is not important and can be
replaced by a UID, it would be easy enough to add one to the item. This is
the approach I propose.

Detailed summary of the operations:


yyGLRStackItem* yyitems; -> raw array of items.
item.yystate/yyoption -> state or option (union field)
item.yystate.yypred/item.yyoption.yystate -> raw pointer to another state.
- when growing:
  - if state:
    - reloc yystate.yypred
    - reloc yystate.yysemantics.yyfirstVal
  - if option:
    - reloc yyoption.yystate
    - reloc yyoption.yynext
  - reloc yysplitPoint
  - reloc all the yystacks[].yystate
- when compressing:
  - update the yypred
  - null the split point and last deleted
  - after split point:
    - update yystate
    - update yystacks[0].yystate
- when creating an element:
  - essentially no-op
  - bump nextFree
  - callers should call reserve_glrstack to make sure there is headroom.
- when accessing an element:
  - faster (?), element is inlined in the vector.
- when dumping:
  - easy access to the element's index (&item - yyitems)


std::vector<std::unique_ptr<yyGLRStackItem>> yyitems; -> array of
unique_ptr of
items. Overall, much simpler code, removing several fields, simplifying the
growing/compressing functions, guaranteed memory safety, ...
- when growing:
  - small vector, cheaper to move
  - simple push_back
- when compressing:
  - std::remove_if or equivalent
- when creating an element:
  - cost of allocation
- when accessing an element:
  - cost of dereference (but anyway the elements were not often accessed
    sequentially (?))
- when dumping:
  - no access to the element's index
    - Is it needed?
    - Do they need to be dense? Or can we assign a UID to each element?
Need to
      store the UID, though.
Valentin Tolmer

reply via email to

[Prev in Thread] Current Thread [Next in Thread]