Re: Module name mangling

From: Marius Vollmer
Subject: Re: Module name mangling
Date: 30 Jan 2001 01:52:01 +0100
User-agent: Gnus/5.0803 (Gnus v5.8.3) Emacs/20.7

Dirk Herrmann <address@hidden> writes:

> On 28 Jan 2001, Marius Vollmer wrote:
> > Keeping the filenames readable is a nice goal. But, as you say, the
> > main problem is to decide just _what_ characters to encode.  I
> > thought that the URL method would specify that list of characters.
> > Is that not the case?  Once we have that list, we can of course
> > use a prettier encoding than hex numbers.
> IMO, if we actually agree about preferring a 'readable encoding', we don't
> have to know about all those characters yet.

I don't agree (yet).  We can certainly extend the list as we learn
about problematic cases, but each time we change the encoding of
module names as file names, everybody would have to check and possibly
rename their *.scm files.  This would be no good.

We could resort to an expensive search process where many encoding
variations of the module name are tried, but I don't want to go this
route.  On the other hand, it would probably be worthwhile to think
about trying just two variations: one without any encoding, and one with
full encoding.  If people care about portability to non-Unix systems,
they can use encoded module names; if they don't, they get pretty ones.
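
A minimal sketch of the two-variation lookup, written in Python rather
than Guile purely for illustration.  Everything here is an assumption:
the names `encode_component`, `candidate_paths` and `find_module_file`
are hypothetical, and "full encoding" is taken to mean that every byte
outside ASCII letters and digits becomes a `%XX` hex escape.

```python
import os
import string

# Assumption: only ASCII letters and digits are considered safe everywhere.
SAFE = set(string.ascii_letters + string.digits)

def encode_component(name):
    """Fully encode one module-name component as %XX hex escapes."""
    out = []
    for byte in name.encode("utf-8"):
        ch = chr(byte)
        out.append(ch if ch in SAFE else "%{:02X}".format(byte))
    return "".join(out)

def candidate_paths(module, root):
    """Return the file names to try: plain first, then fully encoded.

    `module` is a list of component strings, e.g. ["ice-9", "popen"].
    The plain variant is only usable where the components are already
    valid file names on the host system.
    """
    plain = os.path.join(root, *module) + ".scm"
    encoded = os.path.join(root, *[encode_component(c) for c in module]) + ".scm"
    return [plain] if plain == encoded else [plain, encoded]

def find_module_file(module, root, exists=os.path.isfile):
    """Try each candidate in order; return the first that exists, else None."""
    for path in candidate_paths(module, root):
        if exists(path):
            return path
    return None
```

With only two candidates per module, the search stays cheap: at most
two file-system probes per directory on the load path.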

> We could try to find a final solution now, but it is unlikely to
> find one anyway,

I thought that the URL encoding would be `final' in the sense that we
can safely assume that any odd-ball system that is not catered for by
this encoding can be ignored.  I would still like to know whether the
URL encoding can do this for us.
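
As a rough way to see what URL-style percent-encoding would and would
not touch, here is a small Python sketch using the standard library's
`urllib.parse.quote`.  It is not a statement about what Guile does; the
helper name `url_encode_component` and the sample names are
assumptions chosen for illustration.

```python
from urllib.parse import quote, unquote

def url_encode_component(name):
    """Percent-encode one module-name component.

    safe="" means that "/" is encoded as well, so a component can never
    introduce an extra directory level.
    """
    return quote(name, safe="")

# Print how a few sample component names would be encoded.
for sample in ["ice-9", "list*", "string<?", "a/b", "a.b"]:
    print(sample, "->", url_encode_component(sample))
```

In this implementation letters, digits and the characters `_`, `.`, `-`
and `~` are never escaped, and letter case is preserved.  So "." passes
through unchanged, which means URL encoding by itself would not help on
systems that restrict the number of dots in a file name.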

In any case, there ought to be enough experience out there with this
problem to find a good solution from the start.

> because there are so many other problems that we have not yet
> dealt with:  Unix allows an arbitrary number of '.' characters in a
> filename, while for other well known systems there may be only one '.',
> which may only be followed by a limited number of characters.  This
> cannot simply be put into an encoding as with the suggested scheme,
> because we would like the '.scm' postfix to stay unencoded :-)

Why?  We can encode "." as "%dot%" and then tack on the ".scm" suffix
with no problem.
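
A minimal sketch of that idea, in Python for illustration only.  The
"%dot%" token is from the text above; escaping a literal "%" as
"%percent%" is my own assumption, added so that decoding stays
unambiguous.  The function names are hypothetical.

```python
import re

# "." -> "%dot%" is the proposal; the "%" entry is an added assumption.
NAMES = {".": "dot", "%": "percent"}
CHARS = {token: char for char, token in NAMES.items()}

def encode(name):
    """Replace each special character with a readable %token%."""
    return "".join("%" + NAMES[c] + "%" if c in NAMES else c for c in name)

def to_filename(name):
    """Encode the name, then tack on the unencoded ".scm" suffix."""
    return encode(name) + ".scm"

def from_filename(filename):
    """Strip the ".scm" suffix and turn %token% sequences back into characters."""
    if not filename.endswith(".scm"):
        raise ValueError("not a .scm file name: " + filename)
    stem = filename[:-len(".scm")]
    return re.sub(r"%([a-z]+)%", lambda m: CHARS[m.group(1)], stem)
```

Because the encoded stem never contains a ".", the only dot left in the
file name is the one that introduces the suffix.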

> Thus, a 'perfect' module-name->encoded-filename function would be
> much more complex than the current solution.  However, if we were
> trying to take all this into account, we would spend a lot of time
> solving a problem of very limited relevance.

Maybe, but we should still try to find a more exhaustive list of
characters to encode anyway.  That should be easy enough.  I don't
think we can afford to change the encoding frequently.
