lmi
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [lmi] Delegating ctor uncalled for?


From: Greg Chicares
Subject: Re: [lmi] Delegating ctor uncalled for?
Date: Thu, 9 Feb 2017 03:50:01 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.6.0

On 2017-02-04 15:23, Vadim Zeitlin wrote:
> On Sat, 4 Feb 2017 12:58:39 +0000 Greg Chicares <address@hidden> wrote:
> 
> GC> If you have any specific questions, I'll try to answer them.
> 
>  The basic question I have is why do we have these 4 different ctors and
> when each of them is supposed to be used. The first one is relatively
> clear, it's used to parse the "input expression" string, but the other ones
> are not obvious to me.

There were three others, but now there are only two, and I've added a
comment above each of those two explaining why it exists.

> And even the first one is not totally obvious
> because I'm not sure what exactly is the separation line between it and
> SequenceParser.

The motive for separating them was to improve the physical design by
increasing encapsulation. There had been too many member variables and
functions in the combined class's namespace: e.g., realize_intervals()
could have called validate_duration() or modified current_interval_,
either of which might seem plausible judging only by their names, but
that would have been a mistake because those two members are morally
owned by the parser. Now that morality is physically enforced.

Oh, and BTW, realize_intervals() did call mark_diagnostic_context()
until a few days ago, resulting in diagnostics that seemed to say that
a parser error had been detected after the end of parser input. That
wasn't confusing unless one tried to figure out its meaning.

I'm convinced that any refactoring embarked upon for this sound purpose
and faithfully executed necessarily improves the design. That such a
change is actually an improvement, though, is not necessarily obvious.
I hope to make it clearer by rewriting the top-level documentation as
a final step. Is there anything else that would help?

> E.g. why not have SequenceParser::sequence() as a factory
> function for making InputSequences?

Well...we could hide its public ctors and add a function
  std::whatever_ptr<something> make_a_parsed_thing();
but IMO that's just adding a candy coating. And if we change the class
name to ParserFactory, maybe that makes everything clear to fans of
the GoF book. OTOH, IMO, I could say as much in a narrative comment
instead of contorting the physical design. After all, SequenceParser
is used by exactly one function in exactly one way that I think can
be readily understood as is. (Maybe I haven't convinced you, but I
don't see why not.)

And I have a violent personal objection to pointers, so I hope you
weren't going to ask me to write a function returning a pointer.
If instead you were thinking of showing me how to do something cool
with move semantics, then I'm quite willing to be persuaded that
it serves a substantial purpose.

Aside from that, given that InputSequence and SequenceParser are now
separate, I imagine it would be good to put them in separate files.
Smaller, self-contained source files are easier to understand.

I did think of making SequenceParser all private and extending its
friendship only to InputSequence, but that seems convoluted and
unnecessary. If you understand SequenceParser, you won't wish to use
it alone; if you don't, then you shouldn't try to use it. Or am I
being too cavalier?

> GC> I think you're looking for the "canonical representation", alluded to in
> GC> one of the remaining "TODO" comments, which has not yet been defined.
> 
>  I was indeed thinking of a canonical representation, but at C++ types,
> not necessarily string/textual, level. Of course, if we had such a
> representation at C++ level it would be trivial to convert it to a string.

Okay, then this is what you seek:

std::vector<ValueInterval> const& SequenceParser::intervals() const
{
    return intervals_;
}

We might call that the "plenary representation". To me, "canonical"
connotes "simplest form", i.e., most compact and universal; I hope to
enhance mathematical_representation() so that it becomes canonical.
It's already the right thing for xml (where we wouldn't dream of
deserializing a std::vector<ValueInterval> into some xml structure
emulating a vector of struct). Its present flaw is that round-trip
conversion
  plenary -> "mathematical" -> plenary
loses information, but that can be fixed.

> GC> > 2. The "numerous arguments" ctor is IMO unwieldy and difficult to both
> GC> >    use and read. I would strongly consider using named arguments idiom
> GC> >    in order to avoid having a function with 9 parameters which is about
> GC> >    twice greater than optimal.
> GC> 
> GC> I'm not sure whether you mean boost::parameter or this:
> GC>   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4172.htm
> 
>  Sorry, I should have been more clear: I meant the approximation of the
> latter available in the current C++. It is described (not very well) at
> https://en.wikibooks.org/wiki/More_C++_Idioms/Named_Parameter and used in
> wxWidgets with wxSizerFlags or wxFontInfo, for example.

Maybe the wikibooks author hasn't been an effective advocate,
but I'd be inclined to wait and see what happens with n4172.

> GC> But there are only a few combinations of defaulted parameters that
> GC> actually make sense.
> 
>  This is actually a big red flag for me, IMO the class shouldn't provide a
> ctor allowing to pass it incompatible values of the arguments.

The red flag belongs on classes derived from datum_sequence.

These arguments of InputSequence::InputSequence():

    ,std::vector<std::string> const& a_allowed_keywords
    ,bool                            a_keywords_only
    ,std::string const&              a_default_keyword

are simply taken from class datum_sequence's derived classes, e.g.:

class mode_sequence : public datum_sequence ...
    bool numeric_values_are_allowable() const override {return false;}
    bool keyword_values_are_allowable() const override {return true;}
    std::string const default_keyword() const override;
    std::map<std::string,std::string> const allowed_keywords() const override;

That's where an invariant such as "default must be allowable" must
be established: only there can the contract be fulfilled.

It is appropriate for a downstream client like class InputSequence
to verify that the contract was fulfilled, by asserting that as a
precondition. But if that precondition is violated, InputSequence
can do nothing about it except throw an exception.

An intermediary class to ensure argument consistency is not wanted
because the invariant is not the relationship among (hypothetically)
freely-chosen arguments. The arguments are dictated, with no freedom,
by an upstream datum_sequence derivative.

> With the
> named parameters idiom it would be simple enough to ensure that this can't
> happen, e.g. by having two separate methods such as allowed_keywords() and
> allowed_keywords_with_default() (which could specify the default as an
> index in the provided keywords array).

That would trade one invariant for another:
  default is_element_of allowed, vs.
  index < allowed.size()
yet I do agree that the latter invariant is simpler--e.g., it's more
immune to spelling errors.

I'm not opposed to the named-arguments *idea*. In C, e.g., if we have
a lot of parameters to pass around, we might combine them into a
struct, with members set individually, e.g.:
  settimeofday(const struct timeval *tv, const struct timezone *tz)

Indeed, we could do exactly that here; and we already do, in my
personal mental model of the code, which unfortunately makes it
difficult for me to see any physical difficulty...

> GC> The number of arguments per se doesn't bother me because I know
> GC> that the five 'int' arguments are a unitary, interdependent set
> GC> extracted from class Input, so notionally the arguments are
> GC>   std::string const& expression
> GC>   Input const&
> GC> and the three optional ones.

...but that's only five components of class Input, which somehow isn't
so many that I personally would feel a strong urge to combine them
(probably because I'm so deeply invested in the problem domain).

And combining them would erect a barrier to comprehension: no one could
begin to understand class InputSequence without first learning that
intermediate struct. OTOH, I guess such a struct would be transparent
enough to me, so if you think that using a struct an a temporary, poor
substitute for named arguments would help, I guess I wouldn't object.

It could be a worse idea if the struct (unnecessarily, I argue above)
were to become a class with invariants.

If a C++1X library lets us write this more clearly, that's great...

>  It's still simple enough to make a mistake with the order of these 5
> arguments. Maybe I just make too many mistakes, but I really appreciate
> APIs which prevent me from making them, so I'd strongly prefer to be able
> to write
> 
>       InputSequence sequence
>               (expression
>               ,InputParameters()
>                       .years_to_maturity(y2m)
>                       .issue_age(ia)
>                       .retirement_age(ra)
>                       .inforce_duration(id)
>                       .effective_year(ey)
>               );

but introducing a dependency on yet another library to accomplish this
is not so good, because third-party libraries introduce flaws (whereas
standard libraries have a much better reliability record).

>  FWIW I definitely didn't mean to propose to start using boost::parameters,
> I don't really like it as it seems too complicated and "magic". It's
> impressive that such a library could be written, but I'd probably only use
> it if I had hundreds of functions requiring its use, not just a few.

I'm certainly not going to argue against your conclusions there.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]