DogCows or Polymorphism in the Hurd

From: Marcus Brinkmann
Subject: DogCows or Polymorphism in the Hurd
Date: Thu, 02 Feb 2006 13:56:50 -0500


Here is a small issue to ponder.  I'd like input on this.

Consider an object system.  Sometimes, it is very clear which interface
an object should implement.  Sometimes, it is not so clear, and objects
can be viewed from different angles; these objects have multiple facets.

Take for example a tar.gz file object.  You can look at it as a binary
file.  But you can also look at it as an archive, which provides many
files in a directory structure.  In this case, the tar.gz object itself
is the root directory of the archive.

In the object oriented paradigm, this problem can be elegantly solved by
polymorphism.  The tar.gz class can be derived from the file class on
the one hand, and the directory class on the other hand.  In Java, such
abstract base classes as file and directory in this example are called
interfaces.

The Hurd (on Mach) takes a similar approach.  Although MiG does not
support type hierarchies, the Hurd objects are organized in a hierarchy.
And in this hierarchy, there is a single class "fs", which is derived
from "io" and which contains the file _and_ the directory methods.  So,
in effect, all objects you encounter in the filesystem implement both
the file and directory operations.  The "fs" class constitutes a
"DirFile" type, a type that is the result of lumping together two
different types.

To make this more abstract, you can think of the "DirFile" Type as a
"DogCow" type, from which you can derive Dogs, Cows, or arbitrary
hybrids.  In this case, you don't even need polymorphism, as the Dog and
Cow interfaces are not separated at the lowest level.

The problem that now occurs is that a user of the object needs to
make a decision about which facet of the object it wants to consider.
This question is simple to answer if the user only knows one of the
supported facets.  For example, a program that only knows about Files
will treat all "DirFiles" in the Hurd as Files.  A program that only
knows about Directories will treat all "DirFiles" as Directories.

However, it happens that in Unix, Directories and Files are not only
very distinct objects, but they are also understood by a wide range of
applications simultaneously.  I.e., many applications look at a node in
the filesystem, decide if it is a file _or_ a directory, and then take
an appropriate action.  All applications that can traverse a filesystem
belong to this group, for example ls, rm, grep, find, etc.  This is
the most prominent group, but I would expect there to be isolated cases
of other applications that do this (maybe Apache?  Input welcome here).

These applications face a problem in the Hurd: They will see objects
that look like Directories _and_ like Files.  This causes erratic
behaviour.  For example, "grep *" will search through the binary content
of directories (because it treats them as files).  One suggestion was
that we add extra options to such programs to control how hybrid types
should be treated by the application.
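
To illustrate the decision such traversal tools make, here is a minimal
C sketch using the POSIX stat() call.  The names node_kind_t and
classify_node are hypothetical, chosen only for this illustration; real
tools such as grep or find have their own, more involved logic.  The
point is that st_mode encodes a single file type, so each node ends up
in exactly one branch, whatever else the object behind it can do.

```c
#include <sys/stat.h>

/* Hypothetical result type for this illustration only. */
typedef enum { NODE_ERROR, NODE_FILE, NODE_DIRECTORY, NODE_OTHER } node_kind_t;

/* Decide how a traversal tool would treat the node at `path`.
   The file type bits in st_mode hold exactly one type, so a hybrid
   node is forced into one of these branches. */
node_kind_t classify_node(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return NODE_ERROR;      /* node missing or not accessible */
    if (S_ISDIR(st.st_mode))
        return NODE_DIRECTORY;  /* caller would recurse into it */
    if (S_ISREG(st.st_mode))
        return NODE_FILE;       /* caller would read its content */
    return NODE_OTHER;          /* device, socket, fifo, ... */
}
```

A caller would dispatch on the result, for example recursing into
NODE_DIRECTORY and reading NODE_FILE.  A hybrid node gets whichever
treatment its reported mode selects, which may or may not match what
the user intended.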

So, here is the deal: Either we convince ourselves that such erratic
behaviour is isolated and can be fixed by proper defaults and making
minor modifications to existing programs.  Or we find out that in
general this problem is too hard to fix: How an application should treat
a DirFile may be context-sensitive and depend on the exact type of the
object, the intent of the application ("find" used for backup vs "find"
used for locatedb), or even the intent of the user.  In this case, it
may be better to drop the notion of "DogCows", and make the _current_
facet of the object explicit in its type.  Here is how this can be done:

All objects are derived from a polytype class, which provides the
following interfaces:

error_t poly_get_supported_types (obj_t obj, type_spec_t types[]);
error_t poly_get_facet (obj_t, type_spec_t new_type, out: obj_t

The function poly_get_supported_types returns a list of types which this
object can be viewed as.  In other words, these are the facets provided by
this object.  poly_get_facet is a bit like a "cast": It returns a new object
with a new type, but the object is, at the server side, intimately
related to the original object with the original type.  For example,
changes to the one object may become visible in the other object in some

What would this mean for our tar.gz example?  It would mean that a
translator implementing a tar.gz feature would either be seen as a
tar.gz file, _or_ as a directory to the root of the archive, but never
both with the same object.  In particular, the tar.gz file and the tar
root directory would get different names in the directory hierarchy.
For example, the tar.gz node could be called "foo.tar.gz" and the
archive root dir could be called "foo".
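
The tar.gz case can be modelled with a small in-process C sketch.  This
is only an illustration under assumptions: in the proposal the two calls
would be requests to the server implementing the object, whereas here
they are plain functions.  The struct layouts, the error codes, the
poly_err_t type (standing in for error_t), and the extra max/count
parameters are all hypothetical and not part of the interface above.

```c
#include <stddef.h>

/* Hypothetical in-process model of the poly-type interface. */
typedef int poly_err_t;                       /* stands in for error_t */
enum { POLY_OK = 0, POLY_ENOTSUP = 1, POLY_ERANGE = 2 };

typedef enum { TYPE_FILE, TYPE_DIRECTORY } type_spec_t;

#define POLY_MAX_TYPES 4

/* Server-side state shared by all facets of one underlying object. */
struct server_node {
    type_spec_t supported[POLY_MAX_TYPES];
    size_t n_supported;
};

/* A client-visible object: exactly one type, plus the shared state. */
typedef struct {
    type_spec_t type;
    struct server_node *node;
} obj_t;

/* Copy the facets this object can be viewed as into `types`. */
poly_err_t poly_get_supported_types(const obj_t *obj, type_spec_t types[],
                                    size_t max, size_t *count)
{
    size_t n = obj->node->n_supported;

    if (n > max)
        return POLY_ERANGE;     /* caller's array is too small */
    for (size_t i = 0; i < n; i++)
        types[i] = obj->node->supported[i];
    *count = n;
    return POLY_OK;
}

/* "Cast": hand out a new object of type `new_type` that refers to the
   same server-side state as `obj`. */
poly_err_t poly_get_facet(const obj_t *obj, type_spec_t new_type, obj_t *out)
{
    for (size_t i = 0; i < obj->node->n_supported; i++) {
        if (obj->node->supported[i] == new_type) {
            out->type = new_type;
            out->node = obj->node;
            return POLY_OK;
        }
    }
    return POLY_ENOTSUP;        /* the object has no such facet */
}
```

In this model, "foo.tar.gz" would be an obj_t of type TYPE_FILE and
"foo" the obj_t of type TYPE_DIRECTORY obtained from it through
poly_get_facet.  Each handle has a single type, yet both point at the
same server_node.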

Using the poly-type approach would remove all ambiguities: Applications
would either see a file or a directory, but not both.  Applications that
_know_ about hybrid types can use the new functions to switch facets
explicitly.  If a user wants to use an application with a hybrid type,
he will have to make his intent explicit by providing the node with the
right facet type to the application.

It sounds a bit lame to leave the problem of which facet is the right one
to use in the context of POSIX applications to the user.  But it may be
that this is the best we can do.  If it is the best we can do, I think
the poly-type approach provides a very clear and flexible solution to
this problem, while preserving the ability to implement and use hybrid

Some questions:

* What are compelling use cases for hybrid types?
* How severe are the confusions in the POSIX world introduced by hybrid
types?
* How complex is the problem of how hybrid types should be perceived by
legacy applications?  Is it multi-dimensional and/or context-sensitive?
Are there reasonable default resolutions?

