
[hfdb] Re: Grand Unified Hardware Database


From: James K. Lowden
Subject: [hfdb] Re: Grand Unified Hardware Database
Date: Wed, 21 Jul 2004 02:21:48 -0400

On Tue, 20 Jul 2004 Richard Stallman <address@hidden> wrote:

First of all, thank you for taking the time, Richard.  It's clear that you
and Zen (at least) have different visions of what this database needs to
hold.  I'm sure you've seen many a project founder because its eyes were
bigger than its stomach.  I'm sensitive to that, too, and I absolutely
agree we shouldn't over-design it.  An empty baroque model is nothing next
to a full primitive one.  

That said, since you took me seriously, I'll take you seriously and
address your points.  We disagree on the relative ease/simplicity of XML
vs. RDBMS.  I don't know that I'll convince you, but at least (I hope)
we'll understand each other.  

>     2.  Zen & Co. massage data into a form consistent with what we
>     already have, and consistent with a data model we devise.  He edits
>     this into the XML files and checks them into the public CVS
>     repository.
> 
> That is much simpler.

No, it isn't.  

1.  Someone has to type all that XML markup, and verify it, and make it
consistent with the other XML.  

2.  There's no way to verify the underlying relationships or make them
consistent.  

A vendor makes many devices, and a driver works in many OSes.  How will
XML ensure that the vendor and the OS are referred to consistently?  More
important, why should they?  In a relational model, the vendors are in one
table and the OSes in another.  Each has an ID.  The Driver table need
merely refer to the appropriate IDs to capture the referent completely and
accurately.  
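
A minimal sketch of that arrangement, assuming SQLite through Python's sqlite3 module.  The table and column names are hypothetical, and giving each driver a single vendor and OS is a simplification for illustration only.  It shows the machine, not a human reader, rejecting a reference to a vendor that does not exist.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE vendor (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE opsys  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE driver (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    vendor_id INTEGER NOT NULL REFERENCES vendor(id),
    opsys_id  INTEGER NOT NULL REFERENCES opsys(id)
);
"""

def make_db():
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign-key checks off unless asked.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn

def try_insert_driver(conn, name, vendor_id, opsys_id):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO driver (name, vendor_id, opsys_id) VALUES (?, ?, ?)",
            (name, vendor_id, opsys_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False

conn = make_db()
conn.execute("INSERT INTO vendor (id, name) VALUES (1, 'FOOBAR')")
conn.execute("INSERT INTO opsys (id, name) VALUES (1, 'NetBSD')")
ok = try_insert_driver(conn, "frobozz", 1, 1)   # both IDs exist
bad = try_insert_driver(conn, "mumble", 99, 1)  # vendor 99 does not exist
```

Here `ok` is accepted and `bad` is refused, because the second row names a vendor ID that is not in the vendor table.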

3.  Once we've got all those XML files in our CVS repository, how are we
to discover how many of each device type we have?  Which devices by a
given vendor are supported?  How many printers can be used with a given
OS?  

Whatever answers you come up with, I'll have better questions waiting. 
There's no way a set of hierarchical files can match an RDBMS.  
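
To make the three questions in point 3 concrete, here is a sketch in Python with sqlite3.  The schema, the ACME vendor, and every row of data are made-up placeholders, not anything the project has collected.  Each question becomes a single query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and placeholder rows, for illustration only.
conn.executescript("""
CREATE TABLE device  (id INTEGER PRIMARY KEY, vendor TEXT, model TEXT, type TEXT);
CREATE TABLE support (device_id INTEGER REFERENCES device(id), opsys TEXT);
INSERT INTO device VALUES (1, 'FOOBAR', '33', 'modem'),
                          (2, 'FOOBAR', '40', 'printer'),
                          (3, 'ACME',   'P1', 'printer');
INSERT INTO support VALUES (1, 'GNU/Linux'), (2, 'GNU/Linux'), (3, 'NetBSD');
""")

# How many of each device type do we have?
counts = dict(conn.execute("SELECT type, COUNT(*) FROM device GROUP BY type"))

# Which devices by a given vendor are supported?
supported = [row[0] for row in conn.execute(
    """SELECT DISTINCT d.model
       FROM device d JOIN support s ON s.device_id = d.id
       WHERE d.vendor = ? ORDER BY d.model""", ("FOOBAR",))]

# How many printers can be used with a given OS?
printers = conn.execute(
    """SELECT COUNT(DISTINCT d.id)
       FROM device d JOIN support s ON s.device_id = d.id
       WHERE d.type = 'printer' AND s.opsys = ?""", ("GNU/Linux",)).fetchone()[0]
```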

But surely you know that.  Where we seem to disagree is over whether it's
"easier" to edit some XML or load an RDBMS.  One is a lot of markup and a
lot of eyeballing for consistency.  The other is straight data in flat
files, possibly with some scripts to load the data, and the machine does
the vetting.  It's really no contest.  

Even if I'm wrong, I'm right.  If the XML is any good, we can parse and
load it into the RDBMS anyway.  
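
A sketch of such a load, using Python's xml.etree.ElementTree and sqlite3.  The XML layout (a `devices` root holding `device` elements with `vendor`, `model`, and `type` attributes) is an assumption made for this example, as is the table it loads into.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical XML layout; element and attribute names are assumptions.
XML_TEXT = """
<devices>
  <device vendor="FOOBAR" model="33" type="modem"/>
  <device vendor="FOOBAR" model="40" type="printer"/>
</devices>
"""

def load_devices(conn, xml_text):
    """Insert one row per <device> element and return how many were loaded."""
    root = ET.fromstring(xml_text)
    rows = [(d.get("vendor"), d.get("model"), d.get("type"))
            for d in root.iter("device")]
    conn.executemany(
        "INSERT INTO device (vendor, model, type) VALUES (?, ?, ?)", rows)
    return len(rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE device (vendor TEXT NOT NULL, model TEXT NOT NULL, type TEXT NOT NULL)")
loaded = load_devices(conn, XML_TEXT)
```

Because the columns are declared NOT NULL, an element missing one of the attributes would be rejected at load time rather than slipping through unnoticed.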

>     For hardware, is it not necessary to describe the thing precisely? 
>     It's well known that vendors often sell radically different things
>     with the same model number.
> 
> This problem, if and when it matters to us, is not trivial.  

Acknowledged. 

> My idea would be to invent two separate models, FOO model 69 type A
> and FOO model 69 type B, and list each one.  Then we could add a
> listing for FOO model 69 which explains in hand written text how to
> tell whether you got type A or type B.  Or maybe explaining how to
> make sure you get type B if you order one now.
> 
> I don't see that an RDBMS has anything to do with this.

If you were to look at the NetBSD Ethernet drivers, you'd find an M:N
relationship between vendors and chipsets.  Where you'd just have a note
saying "get the one with the red LED", I'd add the chipset ID to each
device entry.  The driver would be attached (relationally) to the chipset.
Each vendor's device, if correctly specified, would thereby automatically
be related to its driver.  
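
Sketched in Python with sqlite3, under stated assumptions: the schema is hypothetical, "chip-x" and "chip-y" are invented chipset names, and the placeholder device and driver names are borrowed from this thread.  The point is only that no row pairs a device with a driver directly; the pairing falls out of the shared chipset ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and placeholder rows, for illustration only.
conn.executescript("""
CREATE TABLE chipset (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE device  (vendor TEXT, model TEXT, variant TEXT,
                      chipset_id INTEGER REFERENCES chipset(id));
CREATE TABLE driver  (name TEXT, opsys TEXT,
                      chipset_id INTEGER REFERENCES chipset(id));
INSERT INTO chipset VALUES (1, 'chip-x'), (2, 'chip-y');
INSERT INTO device  VALUES ('FOO', '69', 'A', 1), ('FOO', '69', 'B', 2);
INSERT INTO driver  VALUES ('mumble', 'GNU/Linux', 1), ('frobozz', 'BSD', 1);
""")

def drivers_for(conn, vendor, model, variant):
    """Return (driver, OS) pairs reachable from a device through its chipset."""
    return conn.execute(
        """SELECT dr.name, dr.opsys
           FROM device d JOIN driver dr ON dr.chipset_id = d.chipset_id
           WHERE d.vendor = ? AND d.model = ? AND d.variant = ?
           ORDER BY dr.name""", (vendor, model, variant)).fetchall()

type_a = drivers_for(conn, "FOO", "69", "A")  # chipset 1 has two drivers
type_b = drivers_for(conn, "FOO", "69", "B")  # chipset 2 has none yet
```

Two devices sharing a model number resolve to different answers because they carry different chipset IDs.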

You might say, yeah, but we don't know the chipset, much less its ID, so
let's stuff in a Post-it note and leave it at that.  That won't carry very
far, because it will leave out the crucial technical information that the
software actually relies on.  

> If we want it to fly, we have to lighten the load.  We
> have to look for what we can throw out, not what we can add.

We're in complete agreement here.  I know that must sound odd, having just
read the previous paragraphs.  For me, for us both, it's a matter of
getting the crucial central useful stuff up and running.  You know what
people want to use this database for.  I know how to create databases.  We
should focus our energies on combining our strengths.  

>     If the database is just going to say "hardware made by Company Y is
>     supported in some form or fashion by Driver X" then I don't think it
>     will have any value at all.
> 
> The main purpose of this data base is so that people can determine
> whether hardware model N made by company Y is ok to buy.  For that,
> what we need is exactly the info that you've said is of no value.  Our
> first goal is to include this much information.

To know if "model N made by company Y is ok to buy", it's much better to
know *how* that's true, and better still to know to what degree it's true.
Drivers aren't written to model numbers, they're written to hardware
specifications.  It's much better if the database captures those
relationships.  

Still, I see your point.  You're saying it's more important that we
capture the fact of the support, regardless of causality, because it's
easier a/k/a possible.  What separates me from your run-of-the-mill DBA
(one thing, anyway) is that I won't let perfection be the enemy of the
good.  We can record only what we have, and we have only what we produce. 
The model will reflect those constraints.  

> By using an extensible file format such as XML, we don't need to
> design the data in advance.  

That's a straw man, my good friend.  Relational models evolve, too, and I
said so at the outset.  I'll further assert they're more tractable than
XML files.   
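
As a small illustration of a relational model evolving, here is a sketch in Python with sqlite3.  The table and the `max_speed` column are hypothetical.  A column is added after data has already been loaded, and the existing row survives untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical starting model: just vendor and model.
conn.execute("CREATE TABLE device (vendor TEXT NOT NULL, model TEXT NOT NULL)")
conn.execute("INSERT INTO device VALUES ('FOOBAR', '33')")

# Later the model grows; existing rows simply get NULL in the new column.
conn.execute("ALTER TABLE device ADD COLUMN max_speed TEXT")
before = conn.execute("SELECT vendor, model, max_speed FROM device").fetchall()

# Fill in the new fact once it is known.
conn.execute("UPDATE device SET max_speed = '28k' WHERE model = '33'")
after = conn.execute("SELECT vendor, model, max_speed FROM device").fetchall()
```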

>     I suggest this:  Show me a mock
>     tabular report you'd like to see, something you imagine would
>     benefit someone we're trying to help, and tell me who that person
>     is.
> 
> Here's what minimal data would look like.
> 
>    PCMCIA Modems:
> 
>      FOOBAR 33  max speed 28k
>        GNU/Linux driver mumble   BSD driver frobozz
>        * Note: to make this work on GNU/Linux,
>          see http://helpme.gnu.org/discuss/foobar33.

Right, thank you again.  This is very helpful to me.  

You have a few placeholders there.  FOOBAR is really Vendor+Device. 
"driver" is really driver+revision.  It cries out for a Vendor table, at
least.  

The real question is: how will we learn that mumble supports FOOBAR?  Is
someone going to enumerate all the driver-device pairs?  I don't think so.
I think driver writers will state what chipsets (or similar) they
support, and manufacturers will state what chipsets they use (sometimes,
or driver writers will discover it).  Only by relating through the shared
information will it be possible to derive the driver-device pairs.  

If you say that such a primitive report would be useful, and isn't already
available, I could accept that.  My understanding from other posts on the
list is the opposite: that such information as can be found today is
insufficiently detailed, sparse, and idiosyncratically organized.  I got
involved because Zen said he wanted to tackle a hard data problem.  If it
turns out we have "staff" for only a simple problem, OK, then my job is
easy and uninteresting.  I'll still do it, but you don't really need me,
and you're right that lesser technologies could solve the problem.  

Are we of one mind now, you and I, or close enough to call it a consensus?


Regards, 

--jkl



