Re: [Gnumed-devel] LDAP vs. SQL - RT comments

From: Horst Herb
Subject: Re: [Gnumed-devel] LDAP vs. SQL - RT comments
Date: Fri, 1 Aug 2003 08:18:34 +1000
User-agent: KMail/1.5.9

On Thu, 31 Jul 2003 10:50, Ian Haywood wrote:
> I've been doing some research into LDAP, we may be able to get equivalent
> functionality using LDAP alone, remaining compatible with other services,
> depending on what's in Richard's schema.
> If not, this is a good solution.

It is, and it can be done.
Together with Peter Schloeffel from GEHR/OpenEHR we made a joint submission 
for funding of a "demographic service" which is not only badly needed by 
gnumed but by everybody else also.
I have attached the proposal paper which outlines a few alternatives.
I am currently adapting my "person" business objects and the data 
entry/display panels to reflect future availability of such service.

Peter and I are confident that the funding will get through, and I am 
absolutely confident that the project specifications will require GPL 
licensing and use of gnumed compatible programming languages and tools.

On another positive note: it looks like OpenEHR is about to get serious funding 
from Sun - who of course made it a requirement that it be written in Java. 
The prospect of bolting our frontend and perhaps business objects onto an 
OpenEHR server/service becomes more realistic too. This will not (yet) stop us 
from continuing development of our own current backend, but it is a nice 
prospect for the future and an encouragement to go ahead with the business 
object abstraction layer in any case.


The Need For Demographic Information Services (DIS) in Health Care


Demographic information is the pivotal data on which virtually every other information segment in health care depends. Even anonymous study results will at least initially require some form of demographic information to be collected and administered before the data is de-identified. In the domain of electronic health records (EHR), the presence of demographic information is the common denominator.

Accepting that most if not all medical information management will deal with demographic information, it is astounding to find that there are no de facto standards in place to store, administer and access demographic information.

Out of the several dozen EHR packages available in Australia, there are no two packages that can reliably and completely share demographic data. The situation appears no better in the realm of hospital information systems.

In the domain where data exchange does already happen, as in the case of pathology results, mismatching demographic information and failure to identify individuals correctly remain a major problem, making expensive human intervention a mandatory requirement whenever data is exchanged.

The purpose of this document is to

  • raise awareness that Australian health care needs unified and consistent demographic information services

  • suggest in what way existing solutions can be used or integrated

  • propose that the GPCG investigate how a reference implementation of a DIS could be developed and funded

What does a DIS do?

When we need the phone number of a person or company, we look it up in the "white pages" or the "yellow pages", unless we have that number stored in our own personal address book. We might also choose to ring an information service for that purpose.

A demographic information service typically works in a similar way: frequently needed information is cached locally, and information not available locally is queried in a hierarchical manner until the information is found or all sources are exhausted.
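The lookup behaviour described above can be sketched in a few lines. This is an illustration only, not part of any existing DIS: the class name `CachingResolver`, the two example directories and the record layout are all assumptions made for this sketch.

```python
from typing import Callable, Dict, List, Optional

# A "source" is any callable that takes a search key and returns a record or None.
# (Hypothetical interface; a real service would query a network directory.)
Record = Dict[str, str]
Source = Callable[[str], Optional[Record]]


class CachingResolver:
    """Answer from the local cache first, then ask each source in order."""

    def __init__(self, sources: List[Source]) -> None:
        self.sources = sources            # ordered from nearest to most remote
        self.cache: Dict[str, Record] = {}

    def lookup(self, key: str) -> Optional[Record]:
        if key in self.cache:             # often needed data is held locally
            return self.cache[key]
        for source in self.sources:       # walk up the hierarchy
            record = source(key)
            if record is not None:
                self.cache[key] = record  # remember it for next time
                return record
        return None                       # all sources exhausted


# Two made-up directories standing in for a practice-level and a regional service.
practice_directory = {"smith-j": {"name": "Jane Smith", "phone": "555-0100"}}
regional_directory = {"jones-a": {"name": "Alan Jones", "phone": "555-0199"}}

resolver = CachingResolver([practice_directory.get, regional_directory.get])
found = resolver.lookup("jones-a")
```

A real service would also need cache expiry and access control, which this sketch deliberately leaves out.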

In health care, we do not only need information regarding a person's name, address and phone number. We also want to know other basic demographic attributes like gender and date of birth. Administration might need to know about the role of a person or entity, and doctors will be interested in relationships between people, both social and genetic. Some of this information will of course be confidential, and could not be accessed as openly as the "white pages". Some information will change over time (like the surnames of people who remarry), and yet both versions might still be of interest to epidemiologists or forensic specialists decades after the change. Stored information will of course not be limited to actual people, but will also include institutions, companies and other "entities" in the widest sense.

Information of this kind can be administered by a demographic information service (DIS). Since almost every single software application in health care will need to access such demographic information, it is pointless to reimplement that effort again and again. It is pointless not only in the sense of unnecessary multiplication of effort and waste of resources, but also with regard to non-interactive data processing and exchange.

If a single widely accepted demographic service model could be built, software developers could simply forget about the tedious task of collecting and administering demographic data. Once implemented, savings would materialize with every single new software project in the health domain, and possibly even in other domains. If demands for demographic data change (like the sudden availability of a nationwide or even worldwide PIN, or changes in privacy legislation), only a single code base would have to be changed, instead of hundreds if not thousands of software packages needing substantial modifications!

When would a DIS be useful?

  • Whenever person-related data is exchanged between systems (roaming patient, change of health care provider)

  • Whenever person-related data is shared between systems (billing/scheduling/practice management packages and EHR systems, discharge referrals)

  • Whenever person-related electronic forms need to be filled in (WorkCover, Centrelink, scientific or clinical studies, etc.)

  • Whenever person-related services are requested (prescriptions, referrals, requests)

  • Whenever person-related software applications are not integrated, or cannot feasibly be integrated, into an existing system (study-specific applets etc.)

What should we expect from a DIS specification and implementation?

  1. Searchable and updateable set of demographic data covering all needs of at least clinical and administrative subdomains within health care, including roles of and relationships between demographic entities

  2. Royalty free use, open standard or standard proposal. Any royalties or restrictions of use will make universal acceptance highly unlikely

  3. Platform and language independent access to demographic information: essential requirement for any communication standard.

  4. Unambiguous identification of individual entities - only achievable with either a centralized registration server or legislative support, but highly desirable to avoid the current need for human interaction with potentially every single data exchange

  5. Granular access privileges and reliable authentication: part of the demographic data (like biological relationships) might be highly confidential and should only be made accessible to authorized users.

  6. Identity mapping: allows different institutions (health care organizations, scientific studies etc.) to allocate internal identifiers for demographic entities, and still use the demographic server for unambiguous data exchange

  7. Versioning: demographic data can change over time (like for example the residential address or the maiden name). There are many circumstances (medicolegal or epidemiological, for example) where past versions of stored information need to be accessed.

  8. Ease of implementation (client side): to allow small ad-hoc software programs as often written in the context of pilot studies or proof-of-concept prototype software to access demographic services without an overly high entry threshold.

  9. Ease of implementation (server side): to allow existing software vendors to write wrappers around their existing demographic system in order to expose the data as a demographic service without need for unreasonable effort

  10. Acceptable performance: to allow even single-user single-process applications to use the same service interface as large distributed systems without responsiveness of the user interface suffering from performance degradation.

  11. Replication (allows duplication of the service and the underlying database without service disruption), synchronization (allows loss-free and duplication-free merging of information between servers) and load balancing (allows sharing of processing load between more than one server in order to handle multiple simultaneous requests without delay).
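To make requirements 6 (identity mapping) and 7 (versioning) more concrete, here is a minimal sketch of one possible data structure. It is not taken from CORBAMED PIDS, HL7 or openEHR; the class names, the institution names and the identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class VersionedAttribute:
    """Keeps every value an attribute has had, with the date it became valid."""
    history: List[Tuple[date, str]] = field(default_factory=list)

    def set(self, value: str, valid_from: date) -> None:
        # Nothing is overwritten: a change only appends a new version.
        self.history.append((valid_from, value))
        self.history.sort(key=lambda entry: entry[0])

    def value_on(self, when: date) -> Optional[str]:
        current = None
        for valid_from, value in self.history:
            if valid_from <= when:
                current = value
        return current


@dataclass
class Entity:
    entity_id: str                       # identifier used by the service itself
    surname: VersionedAttribute = field(default_factory=VersionedAttribute)


class Registry:
    """Maps (institution, local identifier) pairs onto one shared entity."""

    def __init__(self) -> None:
        self.entities: Dict[str, Entity] = {}
        self.local_ids: Dict[Tuple[str, str], str] = {}

    def add(self, entity: Entity) -> None:
        self.entities[entity.entity_id] = entity

    def map_identity(self, institution: str, local_id: str, entity_id: str) -> None:
        if entity_id not in self.entities:
            raise KeyError(entity_id)
        self.local_ids[(institution, local_id)] = entity_id

    def resolve(self, institution: str, local_id: str) -> Optional[Entity]:
        entity_id = self.local_ids.get((institution, local_id))
        return self.entities.get(entity_id) if entity_id is not None else None


registry = Registry()
person = Entity("e-001")
person.surname.set("Miller", date(1970, 5, 1))
person.surname.set("Nguyen", date(1998, 3, 14))   # surname changed on marriage
registry.add(person)
registry.map_identity("pathology-lab", "P-4471", "e-001")
registry.map_identity("gp-clinic", "00123", "e-001")
```

Both institutions keep their own identifiers, yet `resolve` leads them to the same entity, and `value_on` can still answer what the surname was at any past date.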

Present state of demographic information management in health care

As mentioned in the introduction, demographic services are still lacking in Australian health care. The dominant market player in Australian EHR software uses a simplistic denormalized flat file schema to store demographic information, and most of the competition follows suit. Pathology, the main entity currently deploying electronic data exchange between health care providers, depends largely on the openly defined but not standardized PIT format or HL7 V2.x - both methods creating substantial problems due to their inherent ambiguity.

The following non-exhaustive listing gives a brief overview of existing models that might be suitable as a basis for demographic services. Suggestions range from access protocols / APIs like LDAP to concrete service implementations with remote object access like CORBAMED PIDS - while not directly comparable, these different efforts emphasize how broad the scope can be.


The OMG has specified a Person Identification Service (PIDS) as part of the CORBAMED specifications [i]. The primary goal is to provide unique patient identification across heterogeneous systems requiring interoperability [ii]. A number of PIDS implementations have been in use for several years [iii]. However, the complexity of CORBA in general and CORBAMED PIDS specifically (the interface definition alone consists of more than 900 lines of code!) as well as possible compliance costs (branding of CORBAMED PIDS) appear to have prevented more widespread use. There are also concerns regarding granularity of access privileges, as CORBAMED PIDS relies solely on the "CORBA Security Services" for confidentiality enforcement [iv]. Versioning of demographic information (non-destructive administration of information changes) is not specified. Identity mapping (one demographic entity can have multiple identifiers) is supported.

Development of at least one free and open source (FOSS) PIDS compliant service has been started with a project called CIRCARE [v], but the project appears to have stalled since 2001. Nevertheless, the still available code base providing a functional partial CORBAMED PIDS implementation would be a good starting point for other projects.


The "entity" and "role" subject areas within the HL7 reference information model (RIM) deal with basic demographic information [vi]. Biological relationships appear to be poorly supported. Versioning of information appears possible, but I could not find detailed specifications. Unambiguous identification of demographic entities appears problematic. Identity mapping is possible, but I could not find detailed specifications. Implementation as a service is not specified.

While there are several free implementations of HL7 V2.x libraries available, there is no specific implementation of the demographic part in itself, and no implementation as a service. There is no free implementation of the HL7 V3 specification available.


openEHR specifies a demographic reference model [vii], but there has as yet been no reference implementation. The model appears to be extensible through the use of archetypes. Implementation of the demographic reference model as a service is planned but not yet covered in the currently available documents, though the design clearly suggests implementation as a service. The openEHR demographic model gives more consideration to relationships between demographic entities than the HL7 RIM or CORBAMED PIDS - the latter models appear more business and administration focused than clinical. Versioning is specified in the openEHR demographic model, but from the documentation it is not clear how demographic access privileges are handled. Identity mapping is possible, but not yet clearly specified.


While not limited to the medical domain, the Lightweight Directory Access Protocol (LDAP) has established itself as a widely accepted standard for access to demographic information on the Internet. The design goal was to lower the threshold for using X.500 directory services. However, LDAP has become increasingly independent of X.500, so that LDAP can be mapped onto any other directory system so long as the X.500 data and service model as used in LDAP is not violated in the LDAP interface [viii]. Authentication is still not standardized, but a number of standard proposals have been submitted to the IETF (Internet Engineering Task Force) [ix]. Since LDAP follows a "white pages" paradigm, there are no standardized ways of expressing relationships between demographic entities. There are no standardized ways of versioning information. Identity mapping is possible, but no standards for this are in place. Most LDAP servers nowadays have replication, synchronization and load balancing features available.

At least one freely available reference implementation exists [x]. Implementation of the client side is trivial, and a number of freely available interfaces to many popular programming languages exist. Using such a domain-independent standard would also facilitate integration with domains other than health care.

At least one Australian health care application uses LDAP for demographic information - Argus [xi], "a suite of programs developed by the Collaborative Centre for eHealth that provides a secure mail-exchange system for the dissemination of documents between health service providers".

Implementation suggestions

  1. To agree on core data requirements; most if not all the required work has been done independently by CORBAMED PIDS, HL7 and openEHR - it is just a matter of choosing which model suits best or which way to modify (if necessary) each model to make all three suitable.

  2. To agree on one or more access protocols and APIs. There is no reason not to expose the agreed core data through a variety of interfaces to suit everybody’s needs. There is probably no good case for developing yet another API and protocol besides the following four examples:

  • A CORBAMED PIDS interface has the advantage of excellent performance, but the disadvantage of rather high implementation costs

  • An LDAP interface can be quick and easy to implement, and demographic data exposed that way could be used immediately by the majority of email clients - further lowering the acceptance threshold. The disadvantage is the need for non-standardized enhancements to better cover biological relationships.

  • HL7 is already widely used - for applications that do not need real time performance, message exchange via HL7 might be viable if the ambiguity problem is solved.

  • The openEHR demographic model appears to cater best for the clinician’s need to know about biological relationships between people. It would be useful to get the opinion of the developers involved regarding the feasibility of implementing their model as a service.

In order to get started with a DIS implementation, it might be practical simply to start out with plain LDAP and the most basic demographic information (name, title, contact details, gender, date of birth). It has the significant advantage that many existing "address book" applications such as those integrated into email clients would support this approach out of the box, providing users with user interfaces to their demographic data they are already used to.
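As a rough illustration of such a starting point, the sketch below renders one person as an LDIF entry that a stock LDAP server could import. The attributes `cn`, `sn`, `givenName`, `title`, `telephoneNumber` and `mail` belong to the standard `inetOrgPerson` object class. Gender and date of birth are not part of that class, so the `x-demographicPerson` class and its two attributes are hypothetical placeholders for a local schema extension, and the base DN and sample data are made up.

```python
import base64


def ldif_value(attr: str, value: str) -> str:
    """Render one LDIF line, base64-encoding values that are not plain safe ASCII."""
    safe = (
        value != ""
        and value.isascii()
        and value[0] not in " :<"            # characters LDIF forbids at the start
        and not value.endswith(" ")
        and not any(ch in value for ch in "\0\r\n")
    )
    if safe:
        return f"{attr}: {value}"
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return f"{attr}:: {encoded}"             # double colon marks a base64 value


def person_to_ldif(person: dict, base_dn: str = "ou=people,dc=example,dc=org") -> str:
    # Assumption: the name contains no characters that need DN escaping (e.g. commas).
    cn = f"{person['given_name']} {person['surname']}"
    lines = [
        ldif_value("dn", f"cn={cn},{base_dn}"),
        "objectClass: inetOrgPerson",
        "objectClass: x-demographicPerson",  # hypothetical auxiliary class
        ldif_value("cn", cn),
        ldif_value("sn", person["surname"]),
        ldif_value("givenName", person["given_name"]),
        # The standard schema defines 'title' as a job title; using it for
        # honorifics is a modelling decision, not a given.
        ldif_value("title", person["title"]),
        ldif_value("telephoneNumber", person["phone"]),
        ldif_value("mail", person["email"]),
        ldif_value("x-gender", person["gender"]),                # hypothetical
        ldif_value("x-dateOfBirth", person["date_of_birth"]),    # hypothetical
    ]
    return "\n".join(lines) + "\n"


entry = person_to_ldif({
    "given_name": "Jane", "surname": "Smith", "title": "Dr",
    "phone": "+61 2 5550 0100", "email": "jane.smith@example.org",
    "gender": "female", "date_of_birth": "1965-04-12",
})
```

The sketch does not fold long lines, which LDIF permits but a production exporter would normally do. Everything except the two `x-` attributes should already be understood by ordinary address book clients.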

The next step could be to map the openEHR demographic model onto the LDAP protocol, and finally to bolt a CORBAMED PIDS compatible interface onto that evolving service as an option for performance hungry (or already PIDS compliant) applications. Advanced features such as synchronization, replication and load balancing would be supported right out of the box, and impressive scalability has already been demonstrated in real life.

Since implementation of LDAP client facilities is trivial and most software vendors can make use of a plethora of tested and readily (often even freely) available client libraries for most if not all programming languages in use, I would expect that early adoption would be high.

The nature of LDAP would enable us to use existing server software (like OpenLDAP) without any modification; information not already covered, for example regarding personal relationships, can be specified in server configuration schemas as needed.






















