Re: [hfdb] Re: Grand Unified Hardware Database


From: James K. Lowden
Subject: Re: [hfdb] Re: Grand Unified Hardware Database
Date: Mon, 26 Jul 2004 09:55:15 -0400

On Mon, 26 Jul 2004, Zenaan Harkness <address@hidden> wrote:
> On Mon, 2004-07-26 at 09:56, James K. Lowden wrote:
> > My idea is to define stored procedures to insert the data.  You call
> 
> Here's another question - do you have experience with stored procedures
> with more than one RDBMS backend, such that you can say how "specific to
> RDBMS backend" stored procedures can be?

I've worked mostly with Sybase and Microsoft RDBMSs.  They share a common
heritage, and even there it's not hard to write a query that won't execute
in both systems.  

From reading manuals (because the question is interesting to me), it looks
like it's impossible to write RDBMS-independent procedures, and difficult
to write even a one-way conversion for them.  A human being can read one
and get the gist of it, and convert it by hand, of course.  

One example of a problem is that some databases have the notion of a
"temporary table" that exists only for the session and self-destructs when
the session, or the procedure that created it, ends.  The paradigm breaks
down on systems that lack that feature. ;-)

> > 'PrintersInsert' passing it a bunch of parameters, and it does the
> > rest, doling out various parts in the related tables that define a
> > printer.  
> 
> And checking constraints? Or are constraints specified in some other
> manner (or a combination)?

Yes.  
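
As a rough illustration of what a procedure like 'PrintersInsert' might do,
here is a minimal sketch in Python, with SQLite standing in for the real
RDBMS.  SQLite has no stored procedures, so an ordinary function plays that
role.  The table and column names (manufacturer, printer, dpi) are
hypothetical and are not the hfdb schema; the point is only that one call
doles out its parameters to related tables while the declared constraints
do the checking.

```python
import sqlite3

# Hypothetical two-table schema; constraints are declared with the tables.
SCHEMA = """
CREATE TABLE manufacturer (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE printer (
    id              INTEGER PRIMARY KEY,
    manufacturer_id INTEGER NOT NULL REFERENCES manufacturer(id),
    model           TEXT NOT NULL,
    dpi             INTEGER CHECK (dpi > 0),
    UNIQUE (manufacturer_id, model)
);
"""

def open_db():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def printers_insert(conn, manufacturer, model, dpi):
    """Stand-in for a 'PrintersInsert' procedure: one call, several tables."""
    with conn:  # commits on success, rolls back if any statement fails
        # Reuse the manufacturer row if it already exists.
        conn.execute(
            "INSERT OR IGNORE INTO manufacturer (name) VALUES (?)",
            (manufacturer,),
        )
        (mfr_id,) = conn.execute(
            "SELECT id FROM manufacturer WHERE name = ?", (manufacturer,)
        ).fetchone()
        # A constraint violation here raises sqlite3.IntegrityError.
        conn.execute(
            "INSERT INTO printer (manufacturer_id, model, dpi) VALUES (?, ?, ?)",
            (mfr_id, model, dpi),
        )
```

A rejected record raises an exception and leaves no partial rows behind,
which is what lets a caller log the failure and move on to the next record.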

> > I imagine your "Microsoft Natural Keyboard" to be a record in a file. 
> > A little DBI Perl script will read the records, taking the name of the
> > stored procedure as a parameter.  Note we need only 1 such script,
> > because it calls any procedure with any number of parameters.  If the
> > database rejects a record, it can note that in a log file and proceed
> > to the next one.  
> 
> For me-as-end-user submitting my single keyboard record, that file will
> be a plain text file with just the data for my keyboard - or now that I
> write that I imagine a little more structure will need to be defined,
> and me-as-hfdb-team-member will have to make sure that file (/email) is
> in the "correct" format?

Someone entering a single device would want a form, probably a web form. 
I'd suggest our CGI script append the data to a file that we subsequently
apply to the database after vetting.  You don't want random people adding
Bogon devices.  
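
The append step could look something like the following sketch, written in
Python rather than as an actual CGI script.  It assumes one pending file per
stored procedure, holding one tab-delimited line of arguments per
submission; the function name and the file layout are assumptions for
illustration.  File locking between concurrent submissions is not handled
here.

```python
def append_submission(path, fields):
    """Append one submitted record to a pending file as a tab-delimited line."""
    values = [str(f) for f in fields]
    for value in values:
        # Tabs and newlines would corrupt the one-record-per-line format.
        if "\t" in value or "\n" in value or "\r" in value:
            raise ValueError("field contains a tab or newline: %r" % value)
    with open(path, "a", encoding="utf-8") as pending:
        pending.write("\t".join(values) + "\n")
```

Nothing reaches the database at this point.  The pending file is reviewed
by hand and only then applied.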

> > Tell you what I'll do, just for you, special offer, today only.  I'll
> > write the aforementioned script to read a flat file.  It will treat
> > each line as a tab-delimited set of arguments to a stored procedure. 
> > It will call that procedure once per line in the file, writing errors
> > to stderr. At a bare minimum, we need that, it seems to me.
> 
> Is it necessary to go as far as having multiple entries in such a file?
> If I "submit" my keyboard data, it's only one keyboard.
> 
> I imagine that for importing pccids, printer db, etc, a custom script
> will be needed for each set of data we import, right? But I see that as
> a different task to entering a new device. Not sure though...

There are two steps, as I see it.  One is to transform the data into a
shape suitable for loading.  Second is to load it.  I'm just providing the
second step.  The first one is harder, because, as you guessed, each data
source has its own format and its own peculiarities.  It's not the
scripting that's so very hard, though.  The difficult part -- why you get
paid the big bucks ;-) -- is the data analysis:  How do this external data
source's attributes get mapped onto our data model?  That's where the real
work lies.  
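
The load step is simple enough to sketch.  The script proposed above is a
Perl DBI script; this is only a Python illustration of the same loop, and
the names load_file and call_procedure are hypothetical.  call_procedure is
whatever function actually invokes the stored procedure on the database; it
is passed in so the loop itself stays independent of the RDBMS.

```python
import sys

def load_file(lines, call_procedure, procedure, log=sys.stderr):
    """Call `procedure` once per tab-delimited line; log rejects and go on.

    Returns a (loaded, rejected) pair of counts.
    """
    loaded = rejected = 0
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        args = line.split("\t")
        try:
            call_procedure(procedure, args)
            loaded += 1
        except Exception as exc:
            # A rejected record is noted and the load proceeds to the next.
            print("line %d: %s rejected: %s" % (lineno, procedure, exc),
                  file=log)
            rejected += 1
    return loaded, rejected
```

Because the procedure name and the argument list are both just data, one
loop of this kind can serve any procedure with any number of parameters.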

> Whatever works/ makes sense for getting data stored. As long as I can
> enter data in a not overly obtuse way, it doesn't really matter how I'd
> think.

I see stored procedures as the basic building block of data entry. 
Scripts to call them repeatedly can automate imports.  Web forms that
build "import" sources can make it easier to do the data entry.  CGI
scripts (for project members) that call them once per 'submit' can
simplify one-off entries.  

> The other part is of course getting the data back out.

Sure.  I'll set up a nightly export.  We have to discuss whether or not
the export should be in CVS or managed more like rotating logs.  

> > There are several one-time steps before you get to #1, though:
> > 
> > 0.  Understand data.  
> > 1.  Devise schema (moi).  
> > 2.  Write procedure(s).  
> > 3.  Install tools.
> 
> That step 3 is the thing it would be nice to be able to minimize for
> people who join our data-entry team.

I can set up a mail/ssh gateway at schemamania.org, for example.  OTOH,
the toolchain I have in mind isn't that bad.  Much less than Apache, for
instance.  

--jkl