gnumed-devel

Re: [Gnumed-devel] Approaches to maintain clinical data uptime


From: Tim Churches
Subject: Re: [Gnumed-devel] Approaches to maintain clinical data uptime
Date: Mon, 01 May 2006 09:26:12 +1000
User-agent: Thunderbird 1.5.0.2 (Windows/20060308)

Syan Tan wrote:
> You would think they would do it for free, considering the free
> advertising: "our system was used to help save lives in a flu epidemic".

I am going to start banging the philanthropic drum in an attempt to get
more resources as soon as we have version 1.0 RC1 available, by the end
of June 2006 I hope. Although the multi-master distributed database
stuff does have application in rich countries like Australia, especially
in rural areas where connectivity can be very poor (but also in many
hospitals, alas), where it is *really* important is in developing and
transitional countries, where Internet access is increasingly available,
but usually only via dial-up modems (hence the "frequent network
partition" requirements).

Now, back to functional testing of the latest version (I hate writing
tests!).

Tim C

> On Sun Apr 30 17:32, Tim Churches sent:
> 
>     Syan Tan wrote:
>      > couldn't you file a request for an academic replication system, like a
>      > gossip architecture system?
>
>     Um, file a request with whom? Academics don't do anything without being
>     paid for it, these days.
>
>      > BTW, I'm not quite clear about why Lamport clocks as opposed to vector
>      > clocks are used; a Lamport clock is just one sequence number for one
>      > site, which is kept ordered whenever sites send messages to each other.
>      > Vector clocks are sequence numbers kept at every site about every site,
>      > so when messages are received, changes can be causally ordered between
>      > more than one other site. What sort of ordering is being aimed for in
>      > the netepi multi-site application, and why?
> 
>     Sorry - I said "some variation on Lamport clocks" by which I meant a
>     vector or logical clock, as you describe - they all grew out of the
>     original Lamport idea, I believe. Causal ordering is the aim. Multiple
>     flu clinics during a flu pandemic - a person may present to more than
>     one clinic, and clinics may have intermittent or unreliable connections.
> 
>     Tim C
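
To make the distinction discussed above concrete, here is a minimal Python sketch of a Lamport clock next to a vector clock. It is not NetEpi code: the class and function names (`LamportClock`, `VectorClock`, `happened_before`, `concurrent`) and the clinic names are assumptions made up for illustration. The point it tries to show is that a vector timestamp can tell "A happened before B" apart from "A and B were concurrent", which a single Lamport counter cannot.

```python
from typing import Dict


class LamportClock:
    """One counter per site; gives an ordering but cannot detect concurrency."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # Local event: advance our own counter.
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        # On receipt, jump past the sender's timestamp.
        self.time = max(self.time, remote_time) + 1
        return self.time


class VectorClock:
    """Each site keeps a counter for every site it has heard about."""

    def __init__(self, site: str) -> None:
        self.site = site
        self.counts: Dict[str, int] = {site: 0}

    def tick(self) -> Dict[str, int]:
        # Local event: advance only our own entry, return a snapshot.
        self.counts[self.site] = self.counts.get(self.site, 0) + 1
        return dict(self.counts)

    def receive(self, remote: Dict[str, int]) -> Dict[str, int]:
        # Take the element-wise maximum, then count the receive as an event.
        for site, n in remote.items():
            self.counts[site] = max(self.counts.get(site, 0), n)
        return self.tick()


def happened_before(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """True if timestamp a causally precedes timestamp b."""
    sites = set(a) | set(b)
    no_greater = all(a.get(s, 0) <= b.get(s, 0) for s in sites)
    some_less = any(a.get(s, 0) < b.get(s, 0) for s in sites)
    return no_greater and some_less


def concurrent(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """True if neither timestamp causally precedes the other."""
    sites = set(a) | set(b)
    differ = any(a.get(s, 0) != b.get(s, 0) for s in sites)
    return differ and not happened_before(a, b) and not happened_before(b, a)


# Two clinics record a visit independently, then clinic_b hears from clinic_a.
clinic_a, clinic_b = VectorClock("clinic_a"), VectorClock("clinic_b")
visit_a = clinic_a.tick()
visit_b = clinic_b.tick()
merged_b = clinic_b.receive(visit_a)
print(concurrent(visit_a, visit_b))        # the two visits are unordered
print(happened_before(visit_a, merged_b))  # the merge is causally later
```

With Lamport clocks the two independent visits would both carry timestamp 1, so nothing in the timestamps themselves says whether one clinic knew about the other's record.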
> 
>      > On Sun Apr 30 9:06, Tim Churches sent:
>      >
>      > James Busser wrote:
>      > > On Apr 29, 2006, at 4:35 AM, Tim Churches wrote:
>      > >
>      > >> (I keep wondering whether we should have used an EAV pattern for
>      > >> storage
>      > >
>      > > Educated myself (just a bit) here
>      > >
>      > > http://www.health-itworld.com/newsitems/2006/march/03-22-06-news-hitw-dynamic-data
>      > > http://www.pubmedcentral.gov/articlerender.fcgi?artid=61439
>      > > https://tspace.library.utoronto.ca/handle/1807/4677
>      > > http://www.jamia.org/cgi/content/abstract/7/5/475
>      >
>      > Thanks - we have copies of the latter three papers but I hadn't seen
>      > the first article. Of course, PostgreSQL muddies the waters, because
>      > the way it works under the bonnet (hood, engine cover) is rather
>      > similar (but not identical) to the EAV model - but all that is hidden
>      > behind the SQL interface, which is not easy to bypass.
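
For readers who have not met the EAV (entity-attribute-value) pattern mentioned above, the following is a minimal sketch of the idea. It uses Python's standard `sqlite3` module with an in-memory database purely so that it is self-contained; the thread itself is about PostgreSQL. The table name `eav`, the column names, and the sample attributes are all hypothetical, not taken from NetEpi or GNUmed.

```python
import sqlite3

# One narrow table holds every fact as an (entity, attribute, value) row,
# instead of one wide table with a column per attribute.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eav ("
    " entity INTEGER NOT NULL,"
    " attribute TEXT NOT NULL,"
    " value TEXT,"
    " PRIMARY KEY (entity, attribute))"
)

rows = [
    (1, "surname", "Smith"),
    (1, "temperature_c", "38.9"),
    (2, "surname", "Jones"),
]
conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", rows)

# A new kind of observation needs no ALTER TABLE, only another row.
conn.execute("INSERT INTO eav VALUES (?, ?, ?)", (2, "cough", "yes"))


def load_entity(db: sqlite3.Connection, entity: int) -> dict:
    """Reassemble one entity's attributes into a dict (a 'pivot' in code)."""
    cur = db.execute(
        "SELECT attribute, value FROM eav WHERE entity = ?", (entity,)
    )
    return dict(cur.fetchall())


print(load_entity(conn, 1))
print(load_entity(conn, 2))
```

The trade-off this sketch hints at is that schema changes become plain data changes, at the price of losing per-column types and constraints and of having to pivot rows back into records.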
>      >
>      > We really wanted to use openEHR when we started in 2003 - openEHR can
>      > be seen as a very sophisticated metadata layer which can be used with
>      > an EAV-like back-end storage schema - but no openEHR storage engines
>      > were available then, and when I asked again earlier this year, there
>      > were still none available (as open source or closed source on a
>      > commercial basis) in a production-ready form.
>      >
>      > Anyway, plain old PostgreSQL tables work rather well, and are fast and
>      > reliable for large datasets - but we will need to build our own
>      > replication engine, I now think. What we really need is multi-master DB
>      > replication which can cope with slow and unreliable networks (hence it
>      > has to use asynchronous updates, not tightly-coupled synchronous
>      > updates such as multi-phase commits) and with frequent "network
>      > partition". If we are funded to do that, then we'll write it in Python,
>      > probably using a stochastic "epidemic" model for the data propagation
>      > algorithm and some variation on Lamport logical clocks for data
>      > synchronisation. It also needs to propagate schema changes. Hopefully
>      > we can make it sufficiently general that it might have utility for
>      > GNUmed, e.g. when a copy of a clinic database is taken away on a laptop
>      > for use in the field (e.g. at a nursing home or a satellite clinic) and
>      > network connection and synchronisation only occur occasionally.
>      > However, we need the replication to scale to 200 to 300 sites.
>      > Interestingly, most of the commercial multi-master database replication
>      > products just gloss over the issue of data integrity, or leave it up to
>      > the application - but research in the 1990s showed that that is not
>      > good enough in more complex situations with more than a few master DB
>      > instances.
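
As a rough illustration of the stochastic "epidemic" propagation idea described above, here is a small Python simulation. It is only a sketch under stated assumptions, not the proposed replication engine: the names (`merge`, `gossip_round`, `converged`), the record layout, and the 50% link-failure rate are invented for the example, and conflict handling is a deliberately naive highest-version-wins rule, which is the part the message says needs far more care in practice.

```python
import random
from typing import Dict, List, Tuple

Record = Tuple[int, str]   # (version, value); hypothetical layout
Store = Dict[str, Record]  # key -> newest record known at one site


def merge(local: Store, remote: Store) -> None:
    """Copy into `local` every remote record that compares greater."""
    for key, record in remote.items():
        if key not in local or record > local[key]:
            local[key] = record


def gossip_round(sites: List[Store], rng: random.Random,
                 p_fail: float = 0.5) -> None:
    """Each site picks one random peer and, if the link is up, syncs both ways."""
    for i, store in enumerate(sites):
        j = rng.choice([k for k in range(len(sites)) if k != i])
        if rng.random() < p_fail:
            continue  # simulate a dropped dial-up connection
        merge(store, sites[j])
        merge(sites[j], store)


def converged(sites: List[Store]) -> bool:
    """True once every site holds identical data."""
    return all(store == sites[0] for store in sites)


# Five sites; two of them hold different versions of the same record.
rng = random.Random(2006)
sites: List[Store] = [dict() for _ in range(5)]
sites[0]["patient:17"] = (1, "seen at clinic 0")
sites[3]["patient:17"] = (2, "seen at clinic 3")

rounds = 0
while not converged(sites) and rounds < 1000:
    gossip_round(sites, rng)
    rounds += 1

print("converged:", converged(sites), "after", rounds, "rounds")
```

No site needs to know the full topology or stay connected; updates spread like an infection, and a site that was partitioned simply catches up on its next successful exchange.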
>      >
>      > >> - Slony would have worked with that..).
>      >
>      > There is a Slony-2 project, being done here in Sydney, but it is
>      > focussing on multi-master synchronous updates ie multiple servers in a
>      > single data centre, for load-balancing of write tasks as well as read
>      > tasks (for which Slony-1 can be used to facilitate load-balancing)
>      >
>      > Sorry to rave on, but don't let anyone tell you that there are no
>      > fundamental data management issues yet to be addressed by open source
>      > or commercial software.
>      >
>      > Tim C