Re: [Monotone-devel] Why is utf8 type _NOVERIFY, and other vocab stuff.

From: Timothy Brownawell
Subject: Re: [Monotone-devel] Why is utf8 type _NOVERIFY, and other vocab stuff.
Date: Fri, 16 Feb 2007 14:55:08 -0600

On Thu, 2007-02-15 at 01:11 -0800, Nathaniel Smith wrote:
> On Wed, Feb 14, 2007 at 07:08:09PM -0600, Timothy Brownawell wrote:
> > Is there any particular reason that our utf8 type is ATOMIC_NOVERIFY()
> > instead of ATOMIC()?
> With, presumably, the verify() function verifying that the string in
> question was in fact valid utf8?  The problem is just that I suspect
> if we did that now, everything would start crashing, because we're
> really kind of fast and loose with charsets.  Lapo made this a little
> better at the summit, starting to add _strict and _best_effort
> conversion functions, but lots more work is definitely needed.  (Also,
> there were reports that the _best_effort code didn't actually work
> with lots of broken iconv's found in the wild...)

Ah, ok.
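
For reference, the kind of check a verify() for utf8 would have to make can be sketched as below. This is only an illustration: is_valid_utf8 is a hypothetical helper name, not anything in monotone's vocab code, and how it would be hooked into ATOMIC() is left out.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not monotone's code): returns true if s is
// well-formed UTF-8, rejecting overlong forms, surrogates, truncated
// sequences and code points above U+10FFFF.
bool is_valid_utf8(std::string const & s)
{
  std::size_t i = 0;
  std::size_t const n = s.size();
  while (i < n)
    {
      unsigned char const c = static_cast<unsigned char>(s[i]);
      std::size_t len;
      unsigned long cp, min;
      if (c < 0x80) { ++i; continue; }  // plain ASCII byte
      else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; min = 0x80; }
      else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; min = 0x800; }
      else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; min = 0x10000; }
      else return false;                // stray continuation or bad lead byte

      if (len > n - i) return false;    // sequence runs off the end
      for (std::size_t k = 1; k < len; ++k)
        {
          unsigned char const cc = static_cast<unsigned char>(s[i + k]);
          if ((cc & 0xC0) != 0x80) return false;  // not a continuation byte
          cp = (cp << 6) | (cc & 0x3F);
        }
      if (cp < min) return false;                     // overlong encoding
      if (cp >= 0xD800 && cp <= 0xDFFF) return false; // UTF-16 surrogate
      if (cp > 0x10FFFF) return false;                // beyond Unicode range
      i += len;
    }
  return true;
}
```

A strict check like this is exactly what would start rejecting strings that currently slip through, which is the concern raised in the quoted text.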

> > Also, does anyone have any thoughts about reorganizing vocab somewhat?
> > In particular, DECORATE() seems kinda backwards -- if the
> > 'revision'/'roster'/whatever was the template argument instead of the
> > outermost template, then a number of our transform functions could be
> > templatized themselves instead of manually defining however many copies.
> Whatever works... but what about the places where we use raw data, id,
> etc.?

Presumably, we'd have generic_whatever types for that. But then, this
still leaves us with two things for each transform, the generic_whatever
version and an (inline?) templatized wrapper for type safety... I guess
what I'm thinking is that the transform functions shouldn't have to care
how many distinct versions of each type they operate on there are, or
what those types are called.
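
A rough sketch of that shape, with the tag as the template argument, might look like the following. All of the names here (revision_tag, roster_tag, tagged_id, tagged_hex, encode_hex_raw, encode_hex) are made up for illustration and are not the existing vocab types or transform functions.

```cpp
#include <string>

// Hypothetical tags standing in for 'revision', 'roster', etc.
struct revision_tag {};
struct roster_tag {};

// Hypothetical vocab types with the tag as the template argument.
template <typename Tag>
class tagged_id
{
  std::string s;
public:
  explicit tagged_id(std::string const & str) : s(str) {}
  std::string const & get() const { return s; }
};

template <typename Tag>
class tagged_hex
{
  std::string s;
public:
  explicit tagged_hex(std::string const & str) : s(str) {}
  std::string const & get() const { return s; }
};

// The untyped worker is written exactly once.
std::string encode_hex_raw(std::string const & in)
{
  static char const digits[] = "0123456789abcdef";
  std::string out;
  out.reserve(in.size() * 2);
  for (std::string::size_type i = 0; i < in.size(); ++i)
    {
      unsigned char const c = static_cast<unsigned char>(in[i]);
      out += digits[c >> 4];
      out += digits[c & 0x0F];
    }
  return out;
}

// One templatized wrapper covers every tag and keeps the tag intact,
// so a revision id can only become a revision hex string.
template <typename Tag>
inline tagged_hex<Tag> encode_hex(tagged_id<Tag> const & in)
{
  return tagged_hex<Tag>(encode_hex_raw(in.get()));
}
```

With this layout, encode_hex on a tagged_id<revision_tag> yields a tagged_hex<revision_tag>, and adding a new tag needs no new copy of the transform.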

> > Would there be objections to deriving vocab types from each other? We
> > seem to be using utf8 for a lot of things, and it might be nice to have
> > distinct types for these uses while preserving that they're still in
> > utf-8 format.
> Again, whatever works... I guess I'd want to see the use cases in the
> code before expressing an opinion about whether C++ inheritance gives
> anything useful?

Partly that the verify() functions could automatically be shared,
although this could also be achieved in a similar way to how _NOVERIFY
works. Mostly though, that the charset conversion functions would work
on all vocab types while only needing to be defined for whatever base
types we have (utf8, external, ace, ...). I guess I can't really say how
useful this would actually be until I'm further along in checking that
we use our existing type system well.
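
To make the inheritance idea concrete, here is a minimal sketch. The utf8 class below is a simplified stand-in rather than the real vocab type, branch_name and cert_name are hypothetical derived types, and count_code_points is a placeholder for a real charset conversion function.

```cpp
#include <cstddef>
#include <string>

// Simplified stand-in for the utf8 vocab type.
class utf8
{
  std::string s;
public:
  explicit utf8(std::string const & str) : s(str) {}
  std::string const & operator()() const { return s; }
};

// Hypothetical distinct types that are still known to hold utf-8 data.
class branch_name : public utf8
{
public:
  explicit branch_name(std::string const & str) : utf8(str) {}
};

class cert_name : public utf8
{
public:
  explicit cert_name(std::string const & str) : utf8(str) {}
};

// Defined once against the base type, usable with every derived type.
// It counts code points by skipping continuation bytes (10xxxxxx) and
// assumes the input is already valid utf-8.
std::size_t count_code_points(utf8 const & u)
{
  std::size_t count = 0;
  std::string const & s = u();
  for (std::string::size_type i = 0; i < s.size(); ++i)
    if ((static_cast<unsigned char>(s[i]) & 0xC0) != 0x80)
      ++count;
  return count;
}

// Accepts only branch names; a cert_name or a bare utf8 will not compile here.
std::size_t branch_name_length(branch_name const & b)
{
  return count_code_points(b);
}
```

Because the constructors are explicit, a plain utf8 never silently turns into a branch_name, while anything derived from utf8 can still be passed to the shared functions by const reference.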

