monotone-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Monotone-devel] "" is invalid file_path?


From: Stephen Leake
Subject: Re: [Monotone-devel] "" is invalid file_path?
Date: Mon, 08 Sep 2008 21:12:41 -0400
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.2 (windows-nt)

"Nathaniel Smith" <address@hidden> writes:

> On Mon, Sep 8, 2008 at 6:04 AM, Stephen Leake
> <address@hidden> wrote:
>
>>> Why do you want to use the _external-handling functions for parsing
>>> path strings generated programatically by mtn?
>>
>> Because those paths are intended to be used (and in some cases
>> generated) by user tools; I thought that was the intent of
>> file_path.as_external().
>
> I guess it depends -- external mostly means "UI" and internal means
> "data structures", ATM; neither of those is exactly the same as "used
> programmatically by mtn and user tools and the user directly in a text
> editor".
>
>> Since the intent of the conflicts file is to let external tools edit
>> it to provide resolutions, those strings "come from the user". We can
>> easily add a restriction that "relative paths must be relative to
>> workspace root" in the definition of the 'automate show_conflicts'
>> basic_io format.
>
> I would tend to think that show_conflicts, like the rest of basic_io,
> should provide the easy-to-parse standard-rules-using internal-style
> paths.

"easy-to-parse" is only true for mtn code. External code will assume it
is in the system charset and syntax; that's the real point here, I think.

Although many programs do convert to a unix-like canonical path
representation, for the same reasons mtn does. But I don't think we
can _require_ that.

> Windows (and DOS) have always accepted / as a path separator in their
> system API.

Ok, that makes sense.

And file_path.as_external() does _not_ convert back to \.

>> More importantly, there is also a character set difference;
>> file_path.as_external() calls charset.cc utf8_to_system_strict(),
>> which converts to the system charset. I can't find the definition of
>> utf8 right now (where is that class defined?); I assume its
>> constructor does the system_to_utf8 conversion.
>
> Actually it's assumed that strings passed into the _external
> constructor are given in utf8 (this is a hole in our usage of the type
> system).  They usually come from command line arguments, which we
> convert to utf8 in the command line processing code.

Right. And anything that calls file_path_external must convert to utf8
first also.

> Note that "as_external" is basically "as a string that's sensible to
> display to the user in the UI" right now, though -- the name is a lie,
> it's not the inverse of from_external.  

Does as_external() convert to "relative to current user directory"? I
haven't checked. If so, the only thing missing from being a true
inverse of from_external is the Windows directory separator
conversion; close enough.

> There isn't any guarantee that going external -> file_path ->
> external will roundtrip, and similarly for file_path -> external ->
> external.

This is/will be a problem, I think.

>> I believe we should assume external tools expect file paths in the
>> system character set; they should be external.
>
> That seems logical.
>
>> So I'm left with a problem; if I use external paths, and a user
>> invokes 'automate show_conflicts' when the user current directory is
>> _not_ the workspace root, all the file paths will be wrong. Or at
>> least, the user tool will have to be invoked from the same working
>> directory; that's not good, since the intent of the conflicts file is
>> that it be used asynchronously; paths should be relative to the
>> workspace root.
>>
>> I could add a restriction that 'automate show_conflicts' (and any
>> similar commands) must be invoked from the workspace root, but that
>> seems like a kludge.
>
> Ew.

I'm glad we agree :).

>> On the other hand, if I use internal paths, and the user is on a
>> system where the system character set is not utf8, all the file paths
>> will be wrong. Unless external tools are prepared to translate from
>> utf8 to system.
>
> That might be the best solution.  I haven't followed exactly how this
> conflicts-handling workflow is intended to work, though, so dunno.

I don't think the issue is restricted to conflicts handling; it's
relevant for any file paths output by mtn that are intended to be used
in any way by a program on the system. Pasting into a text editor
"file open" dialog counts here.

The various 'report_*_conflicts' use P(F("... %s ...") % file_path);
that uses any_path.as_internal().

I'd like to hear from someone using a system where the system charset
is not utf8, according to mtn. I'm not familiar with
internationalization in general (and I'm not sure that's really the
issue); would a Japanese system use a different charset? Or a Mac?

If there are not any such systems, this is a moot point.

We could add a new constructor and extractor for file_path:

// path is assumed to be relative to workspace root, in system charset
file_path_external_ws(std::string path);

// result is relative to workspace root, in system charset
std::string file_path::as_external_ws();

That would satisfy the automate show_conflicts use case, and be
consistent with everything else.

-- 
-- Stephe




reply via email to

[Prev in Thread] Current Thread [Next in Thread]