[open-cobol-list] Sort utility with OC

gnucobol-users

From:	Moenck, Robert
Subject:	[open-cobol-list] Sort utility with OC
Date:	Thu, 7 Jun 2007 17:05:40 -0400

David,
This conversation seems to have tailed off, so I thought I would give it
another stir.

I took a look at CSORT and a further look at the GNU sort.
My feeling is that the GNU sort is a much better platform to build out from.
It contains a lot of very useful infrastructure,
e.g. code for memory buffer management, process management,
intermediate sort file management, an efficient sort algorithm, 
sort, merge or check capability, etc.

Agreed, it is targeted at LF(CR) delimited variable length records,
but that corresponds to LINE SEQUENTIAL in the Open-Cobol environment.
I think it would be straightforward to enhance it to support Open-Cobol
SEQUENTIAL (i.e. records prefixed with record length in binary)
or RELATIVE files.
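
As a rough illustration of what reading length-prefixed records involves,
here is a minimal Python sketch. The prefix layout is an assumption (a 2-byte
big-endian length, with nothing else in the header); the layout Open-Cobol
actually writes may differ, and the names read_prefixed_records and
write_prefixed_records are made up for this example.

```python
import io
import struct


def read_prefixed_records(stream, prefix_format=">H"):
    """Yield record payloads from a stream of length-prefixed records.

    prefix_format is a struct format for the ASSUMED length prefix
    (default: 2-byte big-endian unsigned integer).
    """
    prefix_size = struct.calcsize(prefix_format)
    while True:
        header = stream.read(prefix_size)
        if not header:
            return  # clean end of file
        if len(header) < prefix_size:
            raise ValueError("truncated length prefix")
        (length,) = struct.unpack(prefix_format, header)
        payload = stream.read(length)
        if len(payload) < length:
            raise ValueError("truncated record")
        yield payload


def write_prefixed_records(records, prefix_format=">H"):
    """Serialize records using the same assumed prefix layout."""
    return b"".join(struct.pack(prefix_format, len(r)) + r for r in records)


# Round trip: build a small file image, read it back, sort the payloads.
data = write_prefixed_records([b"pear", b"fig", b"apple"])
records = sorted(read_prefixed_records(io.BytesIO(data)))
```

Once records arrive as whole payloads like this, the rest of a sort does not
need to know how they were delimited on disk.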

Agreed, it is designed for variable length records and delimited keys,
but that is a more difficult problem than fixed offset fixed size keys.
UN*X sort has to scan through each record looking for the keys.
Again, I think it would be straightforward to enhance it to support
fixed-offset, fixed-size keys for the Open Cobol data types.
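
To show why fixed-offset keys are the easier case, here is a small Python
sketch: the key is a plain byte slice, so no scanning for delimiters is
needed. The function name sort_fixed_records and the sample layout are
hypothetical, and plain byte comparison is only appropriate for unsigned
character keys in a single encoding; numeric COBOL types would need their
own comparison.

```python
def sort_fixed_records(data, record_length, keys):
    """Sort fixed-length records by one or more (offset, length) keys.

    data          -- bytes holding whole records back to back
    record_length -- size of each record in bytes
    keys          -- list of (offset, length) pairs, most significant first
    """
    if record_length <= 0 or len(data) % record_length != 0:
        raise ValueError("data is not a whole number of records")
    records = [
        data[start:start + record_length]
        for start in range(0, len(data), record_length)
    ]
    # Each key is a fixed slice of the record; tuples compare key by key.
    records.sort(key=lambda rec: tuple(rec[o:o + n] for o, n in keys))
    return records


# Hypothetical layout: 4-byte number at offset 0, 6-byte name at offset 4.
data = b"0003CHERRY" b"0001BANANA" b"0002APPLE "
by_number = sort_fixed_records(data, 10, [(0, 4)])
by_name = sort_fixed_records(data, 10, [(4, 6)])
```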

It also includes all the GNU mechanisms including a suite of test cases
that could be built on.
The GNU mechanisms do provide some rigor and portability,
but working within them may be the biggest challenge.

I am still leaning towards Open Cobol's "-std=<dialect>" mechanism
to specify different forms of sort parameters.
I don't know much about IP issues, but Open Cobol seems to get away
with this approach.
One of the <dialect>s could be 'unix' to continue with the existing
capabilities.
You say you have a YACC grammar for SORT syntax.
If you want to post it somewhere that I can pick up,
then I will take a look at how much can be easily implemented.

Regards,
Bob Moenck


-----Original Message-----
From: David Essex [mailto:address@hidden]
Sent: May 29, 2007 12:57 AM
To: open-cobol-list
Cc: Moenck, Robert
Subject: Re: [open-cobol-list] Sort utility within OC


Robert Moenck wrote:

 > There seems to be a lot of interest in this issue.
 > So much, that I am not sure which E-mail to respond to.
 > I arbitrarily picked this one to add my two cents to
 > the pile.

Well, for lack of a better forum, perhaps you should post your views on
the OC mailing list, to which I took the liberty of forwarding this reply.


 > In random order:
 > 1)
 > I agree with the modular concept you suggest.
 > My inclination is to use as many tools as are available
 > for a project like this.
 > Since we are talking about Open Cobol, we are thinking about
 > *NIX environments.
 > That being the case we should be able to leverage the *NIX
 > tool set where appropriate.
 > For example, a Mainframe Sort utility has a lot of reformatting,
 > filtering and post-processing capability.
 > I would look to the *NIX tool set to provide this.
 > It may be that you need some sort of wrapper to build shell
 > pipelines, and reformatting or filtering scripts to run behind
 > the scenes (for folks not too comfortable in the *NIX environment),
 > but I would try to "stand on other people's shoulders" for
 > something like this.
 > This wrapper could be part 1) that you identified.

The UN*X tools are designed to work on LF(CR) delimited variable-length
records, and fields are usually delimited by white-space (tabs, spaces,
etc).

Mainframe tools (COBOL) are designed to work on fixed-length records and
on variable-length records with a binary length prefix.

So the two methodologies are not really compatible.
It can be done, but it would be easier to change the data from
main-frame like data types to UN*X like data types.
This can be done with a simple COBOL program.
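
For illustration only, the same kind of conversion can be sketched in Python
instead of COBOL. This assumes character-only fields at fixed offsets; the
function name fixed_to_delimited and the sample layout are hypothetical, and
binary or packed fields would need decoding first.

```python
def fixed_to_delimited(data, record_length, fields, encoding="ascii"):
    """Convert fixed-length records to tab-delimited, LF-terminated text.

    fields is a list of (offset, length) pairs describing character fields.
    Trailing padding spaces are stripped from each field.
    """
    if record_length <= 0 or len(data) % record_length != 0:
        raise ValueError("data is not a whole number of records")
    lines = []
    for start in range(0, len(data), record_length):
        record = data[start:start + record_length]
        values = [record[o:o + n].decode(encoding).rstrip() for o, n in fields]
        lines.append("\t".join(values))
    return "".join(line + "\n" for line in lines)


# Hypothetical layout: 4-byte number at offset 0, 6-byte name at offset 4.
data = b"0003CHERRY" b"0001BANANA" b"0002APPLE "
text = fixed_to_delimited(data, 10, [(0, 4), (4, 6)])
```

The resulting text can be fed to the standard UN*X tools, which split on
white-space and newlines.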


 > 2)
 > That said, then the biggest challenge would be the core sort
 > utility.
 > I took a quick skim of the GNU sort.c code.
 > It seems to me that adding comparison routines for COBOL data
 > types (e.g. packed decimal, etc.) would be less painful than
 > reinventing all the management functions (e.g. temp file
 > handling, etc.).
 > Perhaps some code could be scarfed from Open-Cobol's libcob.
 > This in turn uses the GNU MP multi-precision arithmetic package,
 > so some analysis is required.

The core of the SORT utility would require comparison and sum
functions for COBOL data types. Both of these functions are available in
the OC run-time library.
So in essence all you would need to do is create the RT compatible
structures, use the OC RT functions, and then move the data.
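
To give a feel for what comparing one COBOL data type involves, here is a
Python sketch that decodes packed decimal and uses the result as a sort key.
It does not call the OC run-time; unpack_comp3 is a hypothetical helper, it
assumes the common convention of two digits per byte with the sign in the
final nibble (0xB or 0xD negative, other values from 0xA to 0xF
non-negative), and it ignores any implied decimal point.

```python
def unpack_comp3(field):
    """Decode a packed-decimal field (bytes) to an int, ignoring scale."""
    if not field:
        raise ValueError("empty field")
    digits = []
    for byte in field[:-1]:
        digits.append(byte >> 4)      # high nibble is one digit
        digits.append(byte & 0x0F)    # low nibble is the next digit
    digits.append(field[-1] >> 4)     # last byte: one digit ...
    sign_nibble = field[-1] & 0x0F    # ... followed by the sign nibble
    if any(d > 9 for d in digits) or sign_nibble < 0x0A:
        raise ValueError("not a valid packed-decimal field")
    value = 0
    for d in digits:
        value = value * 10 + d
    return -value if sign_nibble in (0x0B, 0x0D) else value


# +123, -45 and unsigned 7, each packed into two bytes.
fields = [b"\x12\x3C", b"\x04\x5D", b"\x00\x7F"]
values = [unpack_comp3(f) for f in fields]
in_order = sorted(fields, key=unpack_comp3)
```

Comparing the raw bytes would put these fields in the wrong order, which is
why a type-aware comparison is the critical piece.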

I don't know how useful the GNU sort code would be under the
circumstances. But I suspect it would be easier to adapt basic core sort
algorithms. These sources are available on the NET.
You could have multiple sort algorithms if you wanted to do so.

Personally, I think a simple merge (tape) sort would be a good start.
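
The general shape of such a merge sort can be sketched in Python as follows:
cut the input into small sorted runs, then merge the runs. This is not the
CSORT code; the names make_runs and merge_sort are hypothetical, and the runs
are kept in memory here where a real tape-style sort would write each run to
a work file.

```python
import heapq
from itertools import islice


def make_runs(records, run_size, key=None):
    """Cut the input into sorted runs of at most run_size records."""
    if run_size < 1:
        raise ValueError("run_size must be at least 1")
    it = iter(records)
    while True:
        run = sorted(islice(it, run_size), key=key)
        if not run:
            return
        yield run


def merge_sort(records, run_size=2, key=None):
    """Sort by building runs, then doing a k-way merge of those runs."""
    runs = list(make_runs(records, run_size, key))
    # heapq.merge repeatedly takes the smallest head among the sorted runs.
    return list(heapq.merge(*runs, key=key))


result = merge_sort([5, 3, 9, 1, 4], run_size=2)
```

The run size stands in for the amount of memory available; only one run has
to be sorted at a time.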

 > ...
 > 4)
 > With regard to what syntax should be supported, my inclination
 > is to be as useful/general as possible.
 > Perhaps something like Open Cobol's -std=<dialect> parm could
 > be used, and various syntaxes or capabilities could be supported.

Yes, I think that could be done.
But there are some IP issues which need to be considered with
this option.


 > 5)
 > I confess to being a Perl fan and am sorry that you have had
 > bad experiences with it.
 > I would propose it for the wrapper in an initial implementation
 > because it is part of the *NIX tool set and comes with a coral
 > reef of enhancements.
 > For example, there are parser modules that could process the
 > SORT syntax given a grammar for them.
 > Using Perl a prototype Sort utility could be put together quickly.
 > Given a prototype, others might be tempted to try it out and provide
 > feedback which could lead to improvements.
 > 6)
 > In the long run, once ideas and design have gelled, things could
 > be rewritten in C (say).
 > 7)
 > To my mind an important first step is to identify a "minimal useful
 > capability" for a sort utility.
 > This would be a collection of enough features that others would use
 > the sort utility as opposed to hand coding something.
 > This "minimal useful capability" could be provided as a prototype
 > and we could start getting feedback from users.
 > Maybe Roger's comments apply here.

I have implemented a SORT syntax, using YACC, which is mostly complete.
The critical part however, in whatever language is used, is the ability to
compare and sum COBOL data types.

This functionality could be implemented in a Perl module.

In my view however, it would be less work to use the OC run-time and
write it in C.

BTW, there is a minimal SORT utility called CSORT (1).
It uses a tape sort, and it was originally implemented for DOS, but it
does run on UN*X and Win32, with minor modifications.

Cheers

(1) CSORT - Saga of a Sort
by Al Stevens
Dr. Dobb's Journal - April 1990
http://slawa.homeip.net/books/start.php/Dr.Dobbs/Dr.%20Dobbs%20Journal%20-%201988-2004.16Years/articles/1990/9004/9004j/9004j.htm




