
From: Syan Tan
Subject: Re: [Gnumed-devel] Transferring/importing/bootstrapping from legacy systems
Date: Fri, 20 Jan 2006 19:05:30 +0800

What's needed is an AI tool that can read schemas in different formats. There are tools that assume a tree-structured organization of relations between tables, or a forest of trees, but none of them make guesses at what kinds of data the tables contain.

If the tool could form a hypothesis, much as a query optimizer does, it might guess that a table called "person" contains demographic info because it has the fields surname and firstname, and that, because it also has street, city, and zip, it might hold denormalized demographic info; it could then apply a pattern of extracting denormalized demographic information into GnuMed. It would assign probabilities to which fields actually contain what with respect to the GnuMed dem. schema, much like the probabilistic data-mining tools that do data cleaning, such as Dr Church's open source tool. In fact I can't remember its name, but if I google "open source hidden markov model data cleaning", guess what the first match is.

No wonder the share market loves Google.
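Such a guessing heuristic could be sketched, in miniature, as a weighted match of column names against a demographic vocabulary. The field names, weights, and threshold below are purely illustrative — they are not taken from any GnuMed schema:

```python
# Illustrative sketch: score a legacy table's column names against a
# small vocabulary of demographic field names, to hypothesize whether
# the table holds (possibly denormalized) demographic data.

DEMOGRAPHIC_HINTS = {
    "surname": 0.9, "lastname": 0.9, "firstname": 0.8, "givenname": 0.8,
    "dob": 0.7, "birthdate": 0.7, "sex": 0.5, "gender": 0.5,
}
ADDRESS_HINTS = {"street": 0.8, "city": 0.8, "zip": 0.8, "postcode": 0.8}

def guess_table_role(columns):
    """Return a (hypothesis, score) pair for a list of column names."""
    cols = [c.lower() for c in columns]
    demo = sum(DEMOGRAPHIC_HINTS.get(c, 0.0) for c in cols)
    addr = sum(ADDRESS_HINTS.get(c, 0.0) for c in cols)
    if demo and addr:
        # name fields plus address fields in one table suggests
        # denormalized demographics
        return ("denormalized demographics", demo + addr)
    if demo:
        return ("demographics", demo)
    return ("unknown", 0.0)
```

A real tool would of course also look at data values, not just names, and would emit probabilities rather than a single guess.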



On Fri Jan 20 1:06, J Busser sent:

Transferring/importing/bootstrapping from legacy systems
To better help any reading I will want to do, could I ask some overview of options or best methods of getting data from a legacy system into gnumed?

Presently, I maintain some patient information in a program used for billing, with the data stored in FoxBASE .dbf (dBase-compatible) tables. It is organized loosely as follows:

- patient table, with fields including:
  - surname + given names, date of birth, sex, health insurance number, demographics
  - foreign key (administrative identifier) of other providers (family & referral doctors)

- link table (patients' diagnoses)

- icd9 table (into which I had created some extra, "custom" codes)

I am not sure whether there exist any tools for Postgres that permit it to "read" the native .dbf format in which my data is stored, or whether the data would first need to be exported into tab-, comma-, or otherwise-delimited format.
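For what it's worth, the .dbf header layout is simple enough that a small script can list a table's fields and record count directly, without going through the billing program. A minimal sketch in Python for dBase III-style files (it ignores memo files, code pages, deleted-record flags, and later dBase variants):

```python
import struct

def read_dbf_fields(data: bytes):
    """Parse the header of a dBase III-style .dbf file and return
    (record_count, [(field_name, field_type, length), ...])."""
    # bytes 4-7: record count; 8-9: header length; 10-11: record length
    record_count, header_len, record_len = struct.unpack_from("<IHH", data, 4)
    fields = []
    offset = 32  # field descriptors start at byte 32, 32 bytes each
    while data[offset] != 0x0D:  # descriptor area ends with 0x0D
        name = data[offset:offset + 11].split(b"\x00", 1)[0].decode("ascii")
        ftype = chr(data[offset + 11])   # C=char, N=numeric, D=date, L=logical
        flen = data[offset + 16]         # field length in bytes
        fields.append((name, ftype, flen))
        offset += 32
    return record_count, fields
```

The actual record data then follows as fixed-width rows, so exporting to a delimited file that Postgres' COPY can read is straightforward once the field widths are known.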

But even with my data converted into a readable format, I did wonder what kind of scripting I would need to get someone (e.g. local) to write for me.

I noticed that the bootstrap files seem to specify within them both the fields and the actual values of the data, rather than a statement like "append from fields ". Is the latter approach both supported and reasonable, or must all data to be imported be specifically spelt out inside a file that also contains the import syntax?
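The two styles differ roughly as below: values embedded in the SQL itself, versus a COPY statement (Postgres' closest analogue to FoxPro's APPEND FROM) that pulls rows from a separate delimited file. The table and column names here are invented for illustration, not taken from the GnuMed schema:

```python
import csv, io

rows = [("Smith", "Jane", "1960-03-04"), ("Ng", "Lee", "1972-11-30")]

# (a) bootstrap style: fields AND values spelt out in the SQL file itself
inserts = [
    "INSERT INTO identity (lastnames, firstnames, dob) "
    f"VALUES ('{ln}', '{fn}', '{dob}');"
    for ln, fn, dob in rows
]

# (b) "append from" style: values kept in a separate tab-delimited
# file, loaded with a single COPY statement
buf = io.StringIO()
csv.writer(buf, delimiter="\t").writerows(rows)
copy_stmt = "COPY identity (lastnames, firstnames, dob) FROM 'patients.tsv';"
```

(A real import script would use parameterized queries rather than string formatting, to avoid quoting problems in names like O'Brien.)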

Also, the first time this is done on an "empty" gnumed database, all records that survive the data requirements/constraints will get appended. But if any records are rejected (fail the constraints), is the entire file import rejected? And if "no" (meaning the valid subset of records *is* imported) is there any automatic or standard practice that captures the records that were rejected?
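On the all-or-nothing question: a plain COPY, or any batch of inserts run inside a single transaction, is rejected as a whole if any row violates a constraint, so a common practice is to validate (or insert) row by row and capture the failures for review. A minimal sketch of that pattern — the validation rules and field names are invented, and this is not an existing GnuMed facility:

```python
def partition_records(rows, validate):
    """Split rows into (accepted, rejected) before import, so rejects
    can be written to a review file instead of aborting the load.
    `validate` returns an error message, or None if the row is OK."""
    accepted, rejected = [], []
    for row in rows:
        err = validate(row)
        if err is None:
            accepted.append(row)
        else:
            rejected.append((row, err))  # keep the reason with the row
    return accepted, rejected

def require_name_and_dob(row):
    """Example constraint check mirroring NOT NULL requirements."""
    if not row.get("surname"):
        return "missing surname"
    if not row.get("dob"):
        return "missing date of birth"
    return None
```

The accepted rows can then be loaded in one transaction, while the rejected list is dumped to a file for manual correction and re-import.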

Has anything like this already been programmed / submitted into the CVS? The only information that we have so far on the wiki is:

