Re: [hfdb] Scope (Was Re: Grand Unified Hardware Database)

From:	Zenaan Harkness
Subject:	Re: [hfdb] Scope (Was Re: Grand Unified Hardware Database)
Date:	Tue, 27 Jul 2004 07:22:46 +1000

> Relational theory says you represent optional relationships in separate
> tables and disallow nulls.  That makes for a lot of joins that, in
> practice, often perform poorly.  Hence nulls.  

Ahah! Simply poor performance.

This relates to my earlier question about lookup tables, e.g. a
lookup for "title" (mr, mrs, ms, dr, sri swami :). Such a lookup table
is not an _optional_ relationship: it's not a "separate relation"
table, just a lookup from the original table into a sub-table
(sorry if I'm not using the proper terms here).

Am I correct in assuming that joins to lookup tables do not have the
performance problems that joins with optional data do?
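
To make the distinction concrete, here is a minimal sketch using Python's
built-in sqlite3 module. All table and column names (title, person,
person_fax) are made up for illustration and are not part of any hfdb
schema. It only shows the structural difference between a mandatory lookup
(inner join, every row matches) and an optional relationship kept in a
separate table (outer join, a missing row shows up as NULL in the result);
it says nothing about how either performs.

```python
import sqlite3

# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE title (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
    CREATE TABLE person (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        title_id INTEGER NOT NULL REFERENCES title(id)  -- mandatory lookup
    );
    -- Optional data lives in its own table, so person needs no NULL column.
    CREATE TABLE person_fax (
        person_id INTEGER PRIMARY KEY REFERENCES person(id),
        fax TEXT NOT NULL
    );
    INSERT INTO title VALUES (1, 'mr'), (2, 'dr');
    INSERT INTO person VALUES (1, 'Alice', 2), (2, 'Bob', 1);
    INSERT INTO person_fax VALUES (1, '555-0100');
""")

# Lookup join: every person has exactly one title, so an inner join suffices.
lookup_rows = conn.execute(
    "SELECT p.name, t.label FROM person p "
    "JOIN title t ON t.id = p.title_id ORDER BY p.id"
).fetchall()

# Optional relationship: a LEFT JOIN is needed to keep people without a fax.
optional_rows = conn.execute(
    "SELECT p.name, f.fax FROM person p "
    "LEFT JOIN person_fax f ON f.person_id = p.id ORDER BY p.id"
).fetchall()

print(lookup_rows)
print(optional_rows)
```

Note that the stored tables contain no NULLs at all; the NULL for Bob's fax
only appears in the output of the outer join.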

> > However for storage, version control, etc, it should be easy to generate
> > XML from the db. WRT Till's linuxprinting database, it will make most
> > sense to auto-import the data, assuming it is already in a reasonably
> > standard format. As mentioned above though, we must maintain continuity
> > for existing users of that data by being able to re-generate the
> > format(s) they are using. So that's kind of like a precondition to
> > incorporating an external set of data I guess - we must be able to
> > readily export back to the currently used formats.
> 
> This is an interesting problem.  I agree, if we're to be useful to
> linuxprinting, it has to be possible to produce their files.  We could
> create round-trip processing: read their files, store in database,
> regenerate their files.  I'm not sure that's where we should focus our

Well, this is world domination we are out to achieve here! We hope to
become the Grand Unified Hardware Database (pretty guhd stuff huh? :)

> attention.  It depends if their files are the canonical source of
> information.  That is, if we're going to incorporate other information
> about printing that they'll want, then we'll stand to enhance their data. 
> In that case, we wouldn't necessarily ever re-load from their files,
> although they'd want us to regenerate them (from the combined data).  

Till, please jump in at any point here. The way I see it, we aim to
merge. It might take a little while, and to do so we will have to ensure
complete import of the existing data set, and then auto-generation of
the required output formats. I imagine that means at least PPDs and some
XML for the HAL project; if the latter is not already generated, it's
naturally a lower priority and not on the critical path.

This is an interesting problem - I just don't think I will personally be
able to look at hacking up scripts in the next three to six weeks.

If there are any reasons to not merge with a particular "currently
external" device data set, then consistent round-tripping might be more
important in the short term for such data.

> Generating their XML might be a combined effort.  We'd write the initial
> query and Perl script to generate the file.  They'd maintain the script as
> their needs change, and we'd update as the schema changes.  

Well, Till's the man, and he's here with us as I understand it, and
we're all aiming in the same direction too (World Domination (TM), just
for those who'd forgotten :)

Each such scripting-or-otherwise task will get done as soon as the first
of us has enough of an itch and the free time to do it. It's us, we, a
common team working within the free software community. I don't think
there is any utility in seeing things as "us" and "them". Well, that's
probably enough drilling the point home ... I could get a dentist in for
anyone if they really want it though...

> You asked elsewhere about the ease of XML generation from an SQL query
> output.  I've done some HTML-from-SQL work (as has half the www, if PHP
> and cgi scripts are any guide).  It's not hard.  If the XML tags directly
> correspond to the column names in the database, it seems to me it should
> be possible to write a generalized SQL-to-XML formatter.

Good to hear. For HAL, for example, we might create some "views" for
particular join/table combinations, to match their data needs.
Likewise for other projects. Then the output munging will be pretty
straightforward.
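
As a rough sketch of that idea (not an agreed design), the following uses
Python's sqlite3 and xml.etree.ElementTree modules. The vendor and device
tables, the device_export view and the query_to_xml helper are all
hypothetical names invented for this example. The view renames columns to
the tag names the consumer wants, and the formatter simply turns each column
name into an XML element, which assumes column names are valid XML names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_to_xml(conn, sql, root_tag="rows", row_tag="row"):
    """Render any query result as XML, one element per column name."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description]
    root = ET.Element(root_tag)
    for record in cur:
        row = ET.SubElement(root, row_tag)
        for col, value in zip(columns, record):
            if value is None:
                continue  # omit the element entirely for NULL values
            ET.SubElement(row, col).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vendor (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE device (
        id INTEGER PRIMARY KEY,
        vendor_id INTEGER NOT NULL REFERENCES vendor(id),
        model TEXT NOT NULL
    );
    -- The view shapes the join to match what one consumer wants to see.
    CREATE VIEW device_export AS
        SELECT v.name AS vendor, d.model AS model
        FROM device d JOIN vendor v ON v.id = d.vendor_id;
    INSERT INTO vendor VALUES (1, 'ExampleCorp');
    INSERT INTO device VALUES (1, 1, 'LaserThing 100');
""")

xml_text = query_to_xml(conn, "SELECT * FROM device_export",
                        root_tag="devices", row_tag="device")
print(xml_text)
```

With one view per consuming project, the same formatter could be reused
unchanged; only the view definitions would differ.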

>  In fact, I'd be surprised if that hasn't been done already.
> Once we're that far, we'd need some XSLT expertise to transform
> the "standard" XML into the desired form.  

Like HTML? Or just a transform of the XML itself?

> > I think you're right. XML schema, SQL schema, the data issues will be
> > the same - either we will address them properly, or we'll address them
> > properly :)
> 
> Ultimately, those are the choices, yes.  ;-)

Nothing like consensus on an issue.

cheers
zen
