Re: [Gnumed-devel] Country zones and i18n

From: Busser, Jim
Subject: Re: [Gnumed-devel] Country zones and i18n
Date: Thu, 17 Nov 2011 18:20:20 +0000

On 2011-11-17, at 4:35 AM, Karsten Hilbert wrote:

> What I don't yet comprehend is - exactly when do you need to
>       using a SELECT, to figure out the referent record in
>       dem.urb where that urb's name (in dem.urb) is 'Corumba'
> ?

I am still working through the scenarios for data packs that load streets,
cities and states, and I imagine I should link to this thread from the wiki.

The data packer (in this case me), before committing a data pack, will want to 
check the suitability of the data to be imported.

1) The first check is whether the address records are, in each case, 
sufficiently complete and valid. Take, for example, a database of doctors whose 
contact information could be loaded by considering each doctor to be an org 
having one or more org_units:

        - some may have only a name and phone number, but no address
                (say, semi-retired doctors whose home address is not included)
                --> these records can have their empty address excluded
                        from the import

        - some may have an address, but which is incomplete
                (a minimum of number, street, city, postal are required)
                (a state/province, if unknown, can be "stubbed" '??')
                (a country, where appropriate, can be assigned-for-all)
                --> these conditions can be tested-for and
                        suitable records imported in
                        (a) multiple passes, or
                        (b) by using the CASE WHEN THEN END construct

        - some may have address information that is malformed
                (GNUmed may have insufficient knowledge of what is valid)
                the data packer should at minimum
                - check if candidate 'additional countries' conform to ISO
                - check if candidate 'additional prov/states' conform to
                        official values
                - where new source values should NOT be added
                        --> fix them and / or accept that they will not be imported
                        --> otherwise provide insert statements
                                to import each of not-yet existing countries, 

        - note that AFAIK there exist no agreed abbreviations (codes) for 
                (just for airports) 
                - distinct (but ignoring-lettercase) lists of city/urb names
                        should be viewed
                - distinct (but ignoring-lettercase) lists of street names
                        should be viewed
                and a decision made whether to accept them "as is" or first
                        sanitize them
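
The checks under 1) could be sketched roughly as follows. This is only an illustration under stated assumptions: the staging table `staging_address`, its columns, the sample rows and the default country 'BR' are all hypothetical, and SQLite (via Python's `sqlite3`) merely stands in for the PostgreSQL database that GNUmed uses. The CASE / COALESCE / NULLIF expressions themselves are standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical staging table; not part of the GNUmed schema.
conn.execute("""
    CREATE TABLE staging_address (
        name TEXT, number TEXT, street TEXT, urb TEXT,
        postcode TEXT, state TEXT, country TEXT
    )
""")
rows = [
    ("Dr A", "12", "Rua Frei Mariano", "Corumba", "79300-000", "MS", "BR"),
    ("Dr B", "7", "Rua Delamare", "CORUMBA", "79300-000", "", None),
    ("Dr C", None, None, None, None, None, None),              # name only
    ("Dr D", "3", "Rua Cuiaba", "Corumba", None, "MS", "BR"),  # no postcode
]
conn.executemany("INSERT INTO staging_address VALUES (?,?,?,?,?,?,?)", rows)

# Single pass: classify each record, stub an unknown state as '??' and
# assign one country for all records that lack it (assumed 'BR' here).
# NULLIF(TRIM(x), '') treats empty or blank strings like NULL.
CLASSIFY = """
    SELECT name,
           CASE
               WHEN COALESCE(NULLIF(TRIM(number), ''),
                             NULLIF(TRIM(street), ''),
                             NULLIF(TRIM(urb), ''),
                             NULLIF(TRIM(postcode), '')) IS NULL
                   THEN 'no address'
               WHEN NULLIF(TRIM(number), '') IS NULL
                 OR NULLIF(TRIM(street), '') IS NULL
                 OR NULLIF(TRIM(urb), '') IS NULL
                 OR NULLIF(TRIM(postcode), '') IS NULL
                   THEN 'incomplete'
               ELSE 'importable'
           END AS status,
           COALESCE(NULLIF(TRIM(state), ''), '??') AS state,
           COALESCE(NULLIF(TRIM(country), ''), 'BR') AS country
    FROM staging_address
    ORDER BY name
"""
classified = conn.execute(CLASSIFY).fetchall()

# Distinct city names ignoring lettercase, with the number of spellings
# found for each, so the packer can decide whether to sanitize first.
URB_VARIANTS = """
    SELECT LOWER(urb) AS urb_key, COUNT(DISTINCT urb) AS spellings
    FROM staging_address
    WHERE urb IS NOT NULL
    GROUP BY LOWER(urb)
    ORDER BY urb_key
"""
urb_variants = conn.execute(URB_VARIANTS).fetchall()

for row in classified:
    print(row)
print(urb_variants)
```

Note that SQLite's `LOWER()` only folds ASCII letters, so accented names would need different handling there. The same view for street names would follow the pattern of `URB_VARIANTS`.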

2) Next comes the question of whether the pack contains staging values that 
represent *accented or language variants*, and how to manage them. This is 
where I need more input. A data packer needs to consider

        - when a set or subset of city names does not yet exist in GNUmed
                then -- before adding them to dem -- 
                        where these names may represent variants for which
                        translations exist
                        --> look for matches on language in i18n.translation
                                where .lang = 'specify a language'

        - failing which, the only choices are to
                (a) add them to dem. even though they are accented etc., or
                (b) hold off adding them, while searching for a key to first 

For Brazil, the source seems to have translations for cities but not for 
streets. So it should be
- feasible to import unaccented city names into dem.urb
- feasible to populate i18n.keys and i18n.translations with an accented form; 
however, should that form be available only to users whose environment or 
language is set to Portuguese?
- not feasible to populate dem.street with unaccented values (unless or until 
people can point me to a tool)
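
One possible tool for deriving unaccented forms is Unicode decomposition, sketched below with Python's standard `unicodedata` module. This is an illustration under assumptions, not an agreed method: the helper names `strip_accents` and `split_for_import`, the language tag 'pt_BR', the sample names, and the shape of the (lang, key, translation) tuples are all hypothetical and are not taken from the GNUmed schema.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def split_for_import(names, lang="pt_BR"):
    """Split source names into unaccented base names and translations.

    Returns (urbs, translations):
      urbs          unaccented names, candidates for dem.urb
      translations  (lang, unaccented key, accented form) tuples,
                    candidates for the i18n tables
    """
    urbs = []
    translations = []
    for name in names:
        plain = strip_accents(name)
        if plain not in urbs:
            urbs.append(plain)
        if plain != name:
            # Only names that actually carry accents need a translation.
            translations.append((lang, plain, name))
    return urbs, translations


urbs, translations = split_for_import(["Corumbá", "São Paulo", "Campo Grande"])
print(urbs)
print(translations)
```

The same function could be run over street names, although it only removes diacritics and cannot judge whether the result is an acceptable spelling. On the database side, PostgreSQL ships an `unaccent` contrib module that may be worth evaluating for the same purpose.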

I have a separate question about suitable SQL methods by which to build a query 
result that employs conditions / cases / maybe coalesce and nullif. However, 
that question would better await comment on the above, and direction as to 
whether the Brazilian pack should insert accented streets into dem.street or 
whether there exists a better alternative.

-- Jim
