[Monotone-devel] i18n and portability


From: graydon hoare
Subject: [Monotone-devel] i18n and portability
Date: 12 Oct 2003 14:34:03 -0400
User-agent: Gnus/5.09 (Gnus v5.9.0) Emacs/21.2

hi,

I've been considering portability and i18n requirements recently, as I
want to get as many of these "broad but shallow" issues -- not really
related to VC but affecting many parts of the software -- into code
and debugged as early as possible.

so I'm trying to make a list of such issues. I'd like feedback on the
decisions, notes, and additions to the list:

- international messages: 

  - for messages intended for a human reader, the program should use
    gettext(). I think it's pretty straightforward.

  - since some "messages" coming out of monotone may actually be
    interpreted as "program interface" by another program running
    monotone (vc-monotone.el springs to mind), I think either we need
    a command line switch called "--nolocale", or we need to categorize
    each message as to whether it's "for humans" or "for programs" and
    only gettext() the "for human" messages. 
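
a minimal sketch of the second option, combined with the switch from the first, might look like this. it is only an illustration in python, not monotone code: `render`, the `for_human` flag, the `nolocale` parameter and the `ShoutingCatalog` stand-in are all hypothetical names, and the fake catalog just upper-cases text so the effect of translation is visible without a real message catalog.

```python
import gettext


class ShoutingCatalog(gettext.NullTranslations):
    """Stand-in for a real message catalog: 'translates' by upper-casing."""

    def gettext(self, message):
        return message.upper()


def render(message, for_human, catalog, nolocale=False):
    """Return the text to print for one message.

    Only messages tagged as being for humans go through the catalog,
    and the hypothetical nolocale switch disables translation entirely,
    so a driving program always sees the untranslated text.
    """
    if for_human and not nolocale:
        return catalog.gettext(message)
    return message
```

with this shape, something like vc-monotone.el could parse the "for programs" messages unchanged in any locale, and could also pass the switch to silence translation of everything else.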

- international pathnames:

  - I'm not totally averse to supporting bigger or more complex
    filenames (eg. with spaces, wide chars, etc), so long as we are
    very careful about their security properties. breaking out of a
    monotone directory and wandering over to /home/user/.bashrc is
    something I want to remain impossible.

  - unfortunately, as near as I can tell this can't be done "portably"
    across many older unices, though it's more likely to work on newer
    ones (and newer windows). linux appears to accept UTF-8 filenames,
    though I can't find a spec for it. likewise with windows: no spec
    in sight. can anyone find one? is it ok to break old unices to
    make things right for international users on linux/windows/macos,
    say?

  - we probably should not support localized filename *behavior*,
    though: namely it's important that all the entries in a manifest
    sort in the same order on any platform, not using the current
    localized collation order. I think lexicographic byte sorting is
    the right thing to do, since manifests aren't *really* intended for
    users.

  - boost::filesystem is designed to be possibly "more portable" than we
    want; in particular it doesn't seem to like the notion of non-POSIX
    names (UTF-8 is well beyond POSIX).
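
two of the points above can be sketched in a few lines: refusing any path that could escape the working directory, and sorting manifest entries by raw bytes rather than by locale. this is a python illustration under stated assumptions, not monotone's implementation: `is_safe_relative_path` and `manifest_order` are hypothetical names, paths are assumed to use "/" as the separator and to be encoded as UTF-8, and the check is deliberately strict (it also rejects "." and empty components). it says nothing about symlinks or windows drive letters.

```python
def is_safe_relative_path(path):
    """True if the path cannot name anything outside the working directory.

    Assumes '/'-separated components. Spaces and non-ASCII characters
    are allowed; absolute paths, NUL bytes, and any '', '.' or '..'
    component are rejected.
    """
    if not path or "\0" in path:
        return False
    if path.startswith("/"):
        return False
    return all(part not in ("", ".", "..") for part in path.split("/"))


def manifest_order(paths):
    """Sort paths by their UTF-8 byte sequences, ignoring the locale."""
    return sorted(paths, key=lambda p: p.encode("utf-8"))
```

because the sort key is the encoded bytes, the order is the same on every platform: upper-case ASCII sorts before lower-case, and non-ASCII names sort after both.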

- newlines:

  - the line merger rips files apart into lines ("\r", "\n" or "\r\n")
    and then glues them back together using "\n". I guess this isn't
    right, but on the other hand I'm not sure I want to include the
    line-ending convention in the text if I'm going to get a merge
    conflict when two lines differ in end of line format. yuck. shall
    I, instead, always glue lines back together using a
    platform-specific line ender, like std::endl?

  - njs has pointed out a horrible concrete case: if windows and unix
    users are collaborating on something, their programs tend to cope
    badly with line endings from the other platform. some VC systems
    will "massage" text files from one platform when checked out on
    another. njs suggests perhaps adding a persistent path attribute
    to activate such behavior. personally, I don't like this one bit.
    sha1sum will no longer confirm version identity if there's a
    "washing" stage after checkout. but maybe it's really important...
    is it? I suppose if it's an explicit choice by the user to wash
    the file before reads and after writes .. it just seems really
    icky.
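
the split-and-glue behavior, and the reason "washing" worries me, can be shown in a small python sketch. the helper names (`split_lines`, `join_lines`, `sha1_hex`) are hypothetical and this is not the line merger's actual code; it only assumes the three line enders named above and text encoded as UTF-8.

```python
import hashlib
import re

# "\r\n" comes first so a CRLF pair is treated as one line ender,
# not as a "\r" followed by a separate "\n".
_LINE_END = re.compile(r"\r\n|\r|\n")


def split_lines(text):
    """Rip text apart into lines on "\\r\\n", "\\r" or "\\n"."""
    return _LINE_END.split(text)


def join_lines(lines, ender="\n"):
    """Glue lines back together with a single chosen line ender."""
    return ender.join(lines)


def sha1_hex(text):
    """SHA-1 of the UTF-8 encoding of the text, as a hex string."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()
```

splitting "a\r\nb\rc\n" and rejoining with "\n" gives "a\nb\nc\n", so the content survives but the bytes do not: the sha1 of a CRLF file differs from the sha1 of its washed form, which is exactly why a checkout-time washing stage breaks identity checks by hash.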

-graydon




