[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Monotone-devel] "" is invalid file_path?
Re: [Monotone-devel] "" is invalid file_path?
Mon, 08 Sep 2008 21:12:41 -0400
Gnus/5.11 (Gnus v5.11) Emacs/22.2 (windows-nt)
"Nathaniel Smith" <address@hidden> writes:
> On Mon, Sep 8, 2008 at 6:04 AM, Stephen Leake
> <address@hidden> wrote:
>>> Why do you want to use the _external-handling functions for parsing
>>> path strings generated programatically by mtn?
>> Because those paths are intended to be used (and in some cases
>> generated) by user tools; I thought that was the intent of
> I guess it depends -- external mostly means "UI" and internal means
> "data structures", ATM; neither of those is exactly the same as "used
> programmatically by mtn and user tools and the user directly in a text
>> Since the intent of the conflicts file is to let external tools edit
>> it to provide resolutions, those strings "come from the user". We can
>> easily add a restriction that "relative paths must be relative to
>> workspace root" in the definition of the 'automate show_conflicts'
>> basic_io format.
> I would tend to think that show_conflicts, like the rest of basic_io,
> should provide the easy-to-parse standard-rules-using internal-style
"easy-to-parse" is only true for mtn code. External code will assume it
is in the system charset and syntax; that's the real point here, I think.
Although many programs do convert to a unix-like canonical path
representation, for the same reasons mtn does. But I don't think we
can _require_ that.
> Windows (and DOS) have always accepted / as a path separator in their
> system API.
Ok, that makes sense.
And file_path.as_external() does _not_ convert back to \.
>> More importantly, there is also a character set difference;
>> file_path.as_external() calls charset.cc utf8_to_system_strict(),
>> which converts to the system charset. I can't find the definition of
>> utf8 right now (where is that class defined?); I assume its
>> constructor does the system_to_utf8 conversion.
> Actually it's assumed that strings passed into the _external
> constructor are given in utf8 (this is a hole in our usage of the type
> system). They usually come from command line arguments, which we
> convert to utf8 in the command line processing code.
Right. And anything that calls file_path_external must convert to utf8
> Note that "as_external" is basically "as a string that's sensible to
> display to the user in the UI" right now, though -- the name is a lie,
> it's not the inverse of from_external.
Does as_external() convert to "relative to current user directory"? I
haven't checked. If so, the only thing missing from being a true
inverse of from_external is the Windows directory separator
conversion; close enough.
> There isn't any guarantee that going external -> file_path ->
> external will roundtrip, and similarly for file_path -> external ->
This is/will be a problem, I think.
>> I believe we should assume external tools expect file paths in the
>> system character set; they should be external.
> That seems logical.
>> So I'm left with a problem; if I use external paths, and a user
>> invokes 'automate show_conflicts' when the user current directory is
>> _not_ the workspace root, all the file paths will be wrong. Or at
>> least, the user tool will have to be invoked from the same working
>> directory; that's not good, since the intent of the conflicts file is
>> that it be used asynchronously; paths should be relative to the
>> workspace root.
>> I could add a restriction that 'automate show_conflicts' (and any
>> similar commands) must be invoked from the workspace root, but that
>> seems like a kludge.
I'm glad we agree :).
>> On the other hand, if I use internal paths, and the user is on a
>> system where the system character set is not utf8, all the file paths
>> will be wrong. Unless external tools are prepared to translate from
>> utf8 to system.
> That might be the best solution. I haven't followed exactly how this
> conflicts-handling workflow is intended to work, though, so dunno.
I don't think the issue is restricted to conflicts handling; it's
relevant for any file paths output by mtn that are intended to be used
in any way by a program on the system. Pasting into a text editor
"file open" dialog counts here.
The various 'report_*_conflicts' use P(F("... %s ...") % file_path);
that uses any_path.as_internal().
I'd like to hear from someone using a system where the system charset
is not utf8, according to mtn. I'm not familiar with
internationalization in general (and I'm not sure that's really the
issue); would a Japanese system use a different charset? Or a Mac?
If there are not any such systems, this is a moot point.
We could add a new constructor and extractor for file_path:
// path is assumed to be relative to workspace root, in system charset
// result is relative to workspace root, in system charset
That would satisfy the automate show_conflicts use case, and be
consistent with everything else.