Re: [SUGGESTION] Pretty-printing custom unit types
From: apache2
Subject: Re: [SUGGESTION] Pretty-printing custom unit types
Date: Mon, 11 Jul 2022 16:39:26 +0200
User-agent: Mutt/1.9.3 (2018-01-21)

On Mon, Jul 11, 2022 at 03:53:14PM +0200, Jose E. Marchesi wrote:
>
> > On Fri, Jul 08, 2022 at 08:43:06PM +0200, Jose E. Marchesi wrote:
> >>
> >> > Here it prints #32 instead of #U32bits.
> >> >
> >> > If how_many was an offset<int, B> it would however print how_many=0x0#B
> >> >
> >>
> > Hmm, maybe I'm not understanding it correctly, but would this correctly
> > handle
> > multiple units with the same size being interleaved in a struct?
>
> No, it would not.
>
> > I would have thought that to cover this comprehensively we would have to
> > add a
> > tag field to the unit type struct, but I'm happy to stand corrected if you
> > have a more elegant solution. :-)
>
> Hmm, so you are suggesting to expand both the boxed offset PVM values
> _and_ the boxed offset types PVM values in order to hold an unit name?
>
Yes, kind of. I don't know how unit types are currently represented in the
runtime.

Unit types:
Units are only allowed to be initialized with a constant integer literal.
Their names are also constant, and I'd argue that nominal typing of units
is fine given that both names and values are constant; I don't see the
point of actually keeping track of scoped unit type declarations when
they are for all intents and purposes equivalent as far as I can tell.
This would let us "intern" the (unit name * bit size) tuples, deduplicating
the allocations.
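
To make the interning idea concrete, here is a minimal sketch in C. It is not
based on the actual PVM sources: struct unit_desc, unit_intern and MAX_UNITS
are hypothetical names, the table cap is arbitrary, and a linear search
stands in for whatever hash table a real implementation would use. It only
shows that equal (name, size) pairs end up sharing one allocation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical interned unit descriptor; not poke's real representation. */
struct unit_desc
{
  const char *name;   /* unit name as written in the source, e.g. "U32bits" */
  uint64_t bits;      /* unit size in bits */
};

#define MAX_UNITS 4096   /* arbitrary cap chosen for this sketch */

static struct unit_desc unit_table[MAX_UNITS];
static size_t unit_count;

/* Return the canonical descriptor for (name, bits), adding it on first
   use.  Returns NULL when the table is full.  The name pointer is stored
   as-is, so the caller must keep the string alive.  */
const struct unit_desc *
unit_intern (const char *name, uint64_t bits)
{
  assert (name != NULL);

  for (size_t i = 0; i < unit_count; i++)
    if (unit_table[i].bits == bits
        && strcmp (unit_table[i].name, name) == 0)
      return &unit_table[i];

  if (unit_count == MAX_UNITS)
    return NULL;

  unit_table[unit_count].name = name;
  unit_table[unit_count].bits = bits;
  return &unit_table[unit_count++];
}
```

With this, two declarations of the same unit compare equal by pointer, while
two units with the same size but different names stay distinct.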

Offset types are trickier because I'm guessing that the "container"
type in offset<CONTAINER,UNIT> is parameterizable/scope-dependent
(which causes the need for the allocations alluded to in your comment?).
If that is the case, I agree that adding new fields would be unfortunate
because it would increase memory consumption and thrash our cache.

Some ideas I think are worth considering in that case:

1) We could keep a table in the environment mapping from offset type
   pointer to unit/unit name. This would let us keep the pointer in the
   offset type (to keep the actual allocation small) while still letting
   us access the unit name when pretty-printing/enumerating/complaining
   about errors. The runtime cost would be increased memory usage and
   increased bookkeeping when allocating/deallocating offsets.

2) Interning offset types, too, would reduce the size of such a table.
   I'm not sure how practical this is / how prone to changes offsets are
   from changing variables etc?

3) Another idea, that I like more, would be to limit the maximum unit size
   (currently uint64_t?) in favor of storing a [unit name tag] (an offset
   into a global unit name string table) in the upper bits. Again, since
   unit names are constant strings that would need to be loaded from
   source code, I think it would be "enough for everybody" with a global
   limit of e.g. 4096 distinct unit names, limiting our unit sizes to
   64-12 = 52 bits. Do we have a practical use for units larger than
   2^52 bits (2^49 bytes, ~512 TiB)? Then we wouldn't need extra fields,
   and we'd still be able to access the size without chasing pointers.
   We'd need a bit mask on access, and a tiny bit of hash table
   bookkeeping on allocation, but that seems reasonable?
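
Idea 3 could be sketched roughly like this in C. This assumes the unit size
really is stored in a uint64_t, which I have not checked; the macro and
function names (UNIT_TAG_BITS, unit_pack, unit_size_bits, unit_name_tag) are
made up for illustration and do not exist in the code base. The tag would
index into a global unit name string table, which is left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: upper 12 bits hold the unit name tag, lower 52
   bits hold the unit size in bits.  */
#define UNIT_TAG_BITS   12
#define UNIT_SIZE_BITS  (64 - UNIT_TAG_BITS)
#define UNIT_SIZE_MASK  ((UINT64_C (1) << UNIT_SIZE_BITS) - 1)
#define UNIT_MAX_TAGS   (1u << UNIT_TAG_BITS)   /* 4096 distinct names */

/* Combine a unit size and a name tag into a single 64-bit word.  */
uint64_t
unit_pack (uint64_t size_in_bits, unsigned tag)
{
  assert (size_in_bits <= UNIT_SIZE_MASK);
  assert (tag < UNIT_MAX_TAGS);
  return ((uint64_t) tag << UNIT_SIZE_BITS) | size_in_bits;
}

/* Recover the size: one mask, no pointer chasing.  */
uint64_t
unit_size_bits (uint64_t packed)
{
  return packed & UNIT_SIZE_MASK;
}

/* Recover the index into the (not shown) global unit name table.  */
unsigned
unit_name_tag (uint64_t packed)
{
  return (unsigned) (packed >> UNIT_SIZE_BITS);
}
```

Arithmetic on offsets would only ever look at unit_size_bits(); the tag is
consulted when printing or reporting errors.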