Re: [Bug-gnupedia] Content Format


From: Jean-Daniel Fekete
Subject: Re: [Bug-gnupedia] Content Format
Date: Sat, 27 Jan 2001 09:46:12 +0100

This thread discusses two separate issues: the final storage format and the
submission format.

The "storage" format must be able to express all the information that GNE
(or any similar project) needs, which means both meta-data AND text markup.
The submission format matters less, as long as it can be translated into the
storage format, mainly so that meta-data can be inserted and kept consistent.

If GNE documents are to be transferred from a repository to people's
machines, the documents themselves should carry meta-data specifying the
original author(s), the copyright, the original source if one exists, and
further bibliographic information such as any category the document has been
classified in (this does not always make sense, but it sometimes does).
If this meta-data is NOT inside the document itself, then there is no way to
maintain the copyright, and the document can be stolen.
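
As a minimal sketch of what "meta-data inside the document" could look like,
the Python below wraps a text in an XML document whose header carries the
author, copyright and source. The function name build_document and the keys
of the meta dictionary are hypothetical, and the element names only follow
the general layout of a TEI header (teiHeader, fileDesc, titleStmt,
publicationStmt, sourceDesc); the exact structure is an assumption and should
be checked against the TEI guidelines.

```python
import xml.etree.ElementTree as ET


def build_document(meta, body_text):
    """Return an XML string whose header carries the meta-data (hypothetical helper)."""
    root = ET.Element("TEI")  # root element name is an assumption
    header = ET.SubElement(root, "teiHeader")
    file_desc = ET.SubElement(header, "fileDesc")

    # Title and original author(s)
    title_stmt = ET.SubElement(file_desc, "titleStmt")
    ET.SubElement(title_stmt, "title").text = meta["title"]
    for name in meta["authors"]:
        ET.SubElement(title_stmt, "author").text = name

    # Copyright statement travels with the document
    pub_stmt = ET.SubElement(file_desc, "publicationStmt")
    availability = ET.SubElement(pub_stmt, "availability")
    ET.SubElement(availability, "p").text = meta["copyright"]

    # Original source, if one exists
    source_desc = ET.SubElement(file_desc, "sourceDesc")
    ET.SubElement(source_desc, "p").text = meta.get("source", "Original submission")

    # The text itself follows the header
    text = ET.SubElement(root, "text")
    body = ET.SubElement(text, "body")
    ET.SubElement(body, "p").text = body_text
    return ET.tostring(root, encoding="unicode")
```

Because the header and the text live in one file, any copy of the document
carries its author and copyright statement with it.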

Indeed, as I have been trying to explain for a long time, TEI was designed
for exactly this issue, and more. According to the following document:
http://www.umdl.umich.edu/workshops/teidlf/
and more precisely
http://www.indiana.edu/~letrs/tei/
TEI documents are classified into 4 levels of encoding, depending on how
much structural encoding is used.
Translating from Word or RTF to TEI would be considered level 2 (if
divisions are kept).
Translating from ASCII or OCR would be level 1.
Translating from LaTeX could be level 2 and YES, it will keep the math!

So, the easiest way to manage a large repository of documents is to have
one storage format, several translators, and a form-based interface to
enforce the definition of meta-data. When a document is submitted in any
format except TEI itself, an HTML form can ask for all the important
meta-data: name, date of document, ownership, copyright (from a list of
acceptable copyrights), etc.
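
The pipeline above (one storage format, several translators, a form that
enforces meta-data) could be sketched as follows. This is only an
illustration under stated assumptions: every name in it (REQUIRED_FIELDS,
ACCEPTED_COPYRIGHTS, TRANSLATORS, check_form, submit) is hypothetical, the
list of acceptable copyrights is a placeholder, the "stored" form is a plain
dictionary standing in for TEI, and only a trivial ASCII translator is shown.

```python
# Fields the submission form must supply (taken from the list above).
REQUIRED_FIELDS = ("name", "date", "ownership", "copyright")

# Placeholder list of acceptable copyrights; the real list is an open question.
ACCEPTED_COPYRIGHTS = {"GNU FDL", "Public domain"}


def translate_ascii(raw):
    """Hypothetical translator: plain text split into paragraphs (level 1)."""
    paragraphs = [p.strip() for p in raw.split("\n\n") if p.strip()]
    return {"level": 1, "paragraphs": paragraphs}


# One translator per submission format; others (Word, RTF, LaTeX) would be added here.
TRANSLATORS = {"ascii": translate_ascii}


def check_form(form):
    """Return a list of problems with the submitted meta-data (empty if none)."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if not form.get(field, "").strip()
    ]
    licence = form.get("copyright", "").strip()
    if licence and licence not in ACCEPTED_COPYRIGHTS:
        problems.append(f"copyright not accepted: {licence}")
    return problems


def submit(raw, fmt, form):
    """Refuse incomplete meta-data, then translate into the storage form."""
    problems = check_form(form)
    if problems:
        raise ValueError("; ".join(problems))
    if fmt not in TRANSLATORS:
        raise ValueError(f"no translator for format: {fmt}")
    stored = TRANSLATORS[fmt](raw)
    stored["meta"] = dict(form)  # meta-data is kept with the stored document
    return stored
```

The point of the design is that the meta-data check happens once, at the
form, regardless of which translator handles the content.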

Is this proposal a way to solve the content format problem?
  Jean-Daniel Fekete
  Ecole des Mines de Nantes, 4 rue Alfred Kastler, La Chantrerie,
  BP 20722, 44307 Nantes Cedex 03, France
  Voice: +33-2-51-85-82-08  | Fax: +33-2-51-85-82-49
  address@hidden | http://www.emn.fr/fekete/


----- Original Message -----
From: "Mike Warren" <address@hidden>
To: <address@hidden>
Sent: Friday, January 26, 2001 11:58 PM
Subject: Re: [Bug-gnupedia] Content Format


> "Imran Ghory" <address@hidden> writes:
> > On 26 Jan 2001, at 15:02, Mike Warren wrote:
> > > Bob Dodd <address@hidden> writes:
>
> > > > People should be able to submit the content (not the header info
> > > > necessarily) in any well-known, well supported content
> > > > format. If particular catalogs/filters prefer certain file
> > > > formats for presentation, [..]  I would argue that you *must*
> > > > preserve the original submission though: the author (and others)
> > > > may need to update the submission.
>
> > > If people would like to make use of converters on their Word
> > > documents before submission, that's fine. If you're seriously
> > > advocating that the GNE actually store Word documents, then I
> > > can't agree. I also don't think we should accept Word documents as
> > > a submission format.
>
> > GNE could just accept Word documents and convert them into whatever
> > format we wish to use and store them in that format.
>
> We could. I don't think that's a good idea.
>
> --
> address@hidden
> <URL:http://www.mike-warren.com>
> GPG: 0x579911BD :: 87F2 4D98 BDB0 0E90 EE2A  0CF9 1087 0884 5799 11BD
>
>



