monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] basic_io inventory


From: Stephen Leake
Subject: Re: [Monotone-devel] basic_io inventory
Date: Thu, 26 Apr 2007 20:43:19 -0400
User-agent: Gnus/5.1006 (Gnus v5.10.6) Emacs/21.3 (windows-nt)

Thomas Keller <address@hidden> writes:

> Stephen Leake schrieb:
>> Stephen Leake <address@hidden> writes:
>> 
>>>>> I can help with the tests and documentation.
>> 
>> I've taken a first stab at updating
>> tests/automate_inventory/__driver__.lua, and adding some comments in
>> automate.cc.
>
> Wow... you've been busy! 

Well, this is kind of fun :). It's nice to find a project that has
such a strong test suite.

> You might want to provide a patch for everything if you don't have
> write access to venge.net.

I don't have access to venge.net. I'm looking into other server
options. I'll post a patch if that takes to long.

> For the tests, I think you should implement the actual matching after
> you've parsed the basic_io properly. For this there exists a function
> named parse_basic_io which takes the basic_io string as input and
> returns a lua table. The automate_keys test includes an example how to
> cope with that.

Hmm. I find that test very hard to understand. 

Since part of my goal is to understand the basic_io output in order to
help implement a parser for Emacs DVC, I'd rather have the monotone
test document the actual text format. Unless there's a strong
objection to that.

>> //    old_node "<old_node_id>" "<old_node path::status>"
>> //    new_node "<new_node_id>" "<new_node path::status>"
>> //     fs_type "<current path::status>"
>
> path::status could be simply resolved to "file|directory", and for
> fs_type "file|directory|none" (none if the file is missing). This makes
> it easier to read.

Partly I was documenting what types are used, so I could understand
the type structure.

I guess it depends on who is reading this doc; for developers, the
types probably make more sense. If there is a tool that pulls this
comment into the help output, then your suggestions are definitely
better.

>> //      status "<status>"
>> //     changes "{content|attrs}..."
>
> changes can be one of "content", "content attrs" or "attrs". 

That was the intent of my sytax; think Backus-Naur form. But the
complete set is small enough that they can just be listed.

>> // 'fs_type' gives the state of the item in the workspace filesystem.
>> //
>> // 'old_node' and 'new_node' give the status of the item in the manifest.
>> // The 'old_node' and 'new_node' lines are not output if the corresponding
>> // node state is 'nonexistent'. FIXME: better to output 'none'?
>> //
>> // '<status>' is one of 'known', 'missing', 'unknown'
>
> There is also the 'ignored' state.

Ah, right.

What do you think about outputing 'none' explicitly, rather than
leaving out 'old_node' and 'new_node'? It would make the parser more
regular. But maybe there's a standard basic_io style?

>> // Error conditions: If no workspace book keeping _MTN directory is found,
>> //   prints an error message to stderr, and exits with status 1.
>> 
>> Is this correct so far?
>
> You might want to add a short note about the possible PATH argument and
> the options, but this is nit-picking here and could also just go into
> monotone.texi.

PATH is mentioned as an argument; it seems to be a standard argument,
so I didn't think it needed describing.

Are there options? the declaration says 'options::opts::none'; so I
assumed there were none for 'inventory'; there are standard options
for 'automate'.

>> Here's the first set of checks in
>> tests/automate_inventory/__driver__.lua. I left the original output
>> for comparison. I added FIXME: comments where I think the output could
>> be improved.
>
> Now since the format is more complete, you may want to add checks for
> corner cases as well which weren't properly supported by the old format
> in the past, f.e. renames which change the type of a node (drop a dir,
> add a file with the same name and vice versa, stuff like that).

Right.

>> --  AP 0 0 added
>> --  FIXME: status should indicate 'added, needs commit'
>> check(grep('^    path "added"$', "stdout"), 0, false, false)
>> check(grep('^new_node "2147483648" "file"$', "stdout"), 0, false, false)
>> check(grep('^ fs_type "file"$', "stdout"), 0, false, false)
>> check(grep('^  status "known"$', "stdout"), 0, false, false)
>> check(grep('^$', "stdout"), 0, false, false)
>
> The new_node ids for added files are unique, but I've forgot how they're
> calculated (MAX_INTEGER - ... maybe), anyways
>
> a) its not a good idea to test for specific values here

Why not? You seem to be implying that running this test on different
machines might produce different results.

What is the alternative? I guess using parse_basic_io I could simply
not test the node_id; but then there is a feature that is not tested,
which isn't good.

Ah; I could use a regular expression that matches any integer. That
would be a good compromise, if the node id really can change.

> b) the explicit note that this file is added is indeed missing; as
> long as there is no other node outputted which has the new_node id
> set in its old_node id (which practically makes the add a rename) we
> can however assume that this is an add

That places the burden of computing "added" on the external tool.
Clearly 'automate inventory' could do that same computation; it is
already walking the entire tree. Then the computation would be
standard; different tools would share a common definition of "added".

>> --   I 0 0 ignored
>> --  FIXME: need an option to not output these
>> check(grep('^   path "ignored~"$', "stdout"), 0, false, false)
>> check(grep('^fs_type "file"$', "stdout"), 0, false, false)
>> check(grep('^ status "ignored"$', "stdout"), 0, false, false)
>> check(grep('^$', "stdout"), 0, false, false)
>
> You like to have an option to hide ignored files from basic_io output?

Yes. Because in the most common case, those files are ignored for good
reason. 

Consider reviewing the status of a build directory; it has two or
three versioned files (the Makefile, a compiler project file), and
zillions of ignored files (the object files and executables). You
don't want to wade thru those ignored files every time, so Emacs DVC
will not display them (there probably will be an option to display
them, since sometimes you want to see what's ignored). So you might as
well save time at the source, and have monotone completely ignore them
as well.

> What if a user explicitely wants to add an ignored file? (Maybe it
> was ignored by accident, f.e. a wrong .mtn-ignore entry?)

Then they can turn off the option. Or call add directly. It will be up
to them to realize they've used a name that is normally ignored; I
consider that an excellent trade for the speed of not processing
ignored files.

>> -- not in original output (why not?)
>> check(grep('^   path "inventory_hooks.lua"$', "stdout"), 0, false, false)
>> check(grep('^fs_type "file"$', "stdout"), 0, false, false)
>> check(grep('^ status "unknown"$', "stdout"), 0, false, false)
>> check(grep('^$', "stdout"), 0, false, false)
>
> You don't need to check things more than once, right? The complete test
> suite already takes a long time to finish, no need to make it take even
> longer =)

I'm not clear what you are saying.

What if a bug causes the text I skip to change in a bad way? I
prefer to explicitly check all output.

I guess it would be easier to relate the checks to the operations on
the workspace if these extra files were simply ignored.

Ah; I just ran tests/automate_inventory in the monotone branch. Those
"test" files are in fact present in the 'automate inventory' output;
the test simply ignores them. I forgot what '^' means; it matches any
line beginning, not just the next one.

>> 1) clearly indicate whether the item needs committing or not. 
>> 2) clearly indicate renamed and dropped.
>
> If we'd bring the additional "added", "dropped", "renamed_from" and
> "renamed_to" states back, I guess it should be fairly easy to determine
> if an item needs to be committed or not (either it has one of those
> states or the changes stanza is present).

Right; having those states explicit would be enough.

>> 3) option to not output unchanged and ignored files (to speed processing)
>
> I guess restriction support alone should speed up processing a lot, 

Well, yes, if you process a subset of the workspace it will be faster.

But I rely on the version control tool to remind me of what I've
changed. Every time I run it on a subset "to save time; I _know_ I
didn't change anything over there", I get burned :).

> if there are still speed issues one could think about adding an
> "--show-tracked-only" option or something along this line.

I guess I need to set up a monotone database for my main project,
that is currently under CVS, and do a fair timing comparison. That
will add real data to this discussion.

>> I'm not clear what the rationale is for the node ids; what subsequent
>> commands would use them?
>
> F.e. automate get_file (obviously there is a version available now which
> takes the file path and revision_id).

I guess 'get_file' would be required if the user requested a diff, to
review the changes.

I also realized that if the external tool needs to calculate
'renamed', it needs the node id.

-- 
-- Stephe




reply via email to

[Prev in Thread] Current Thread [Next in Thread]