[open-cobol-list] OpenCOBOL TODO

From:	Keisuke Nishida
Subject:	[open-cobol-list] OpenCOBOL TODO
Date:	Wed Feb 18 18:05:10 2004
User-agent:	Wanderlust/2.10.0 (Venus) SEMI/1.14.3 (Ushinoya) FLIM/1.14.3 (Unebigoryōmae) APEL/10.3 Emacs/21.2 (i386-redhat-linux-gnu) MULE/5.0 (SAKAKI)
At Mon, 16 Feb 2004 18:47:19 +0100,
Thomas Biehler wrote:
> 
> What i also would like to see is your OPEN-COBOL TODO list in CVS!

OK, I have summarized my todo list (in no particular order) and added
it to CVS.

My priority (other than bug fixes) is "Intermediate data storage"
and "Single file output".  After these, I would like to implement
intrinsic functions and nested subprograms.

> The "sort table feature" i have requested some time ago, 
> is already in the TODO list, yes ?!   
> If not,  this is a reminder. ;-)

Wow, I had completely forgotten about it ;)

Let me know if there is anything to be added in this list.

Keisuke

------------------------------------------------------------------------
OpenCOBOL TODO                                          -*- outline -*-

* Pending requests

** Check /usr/local for library and header files in configure

** PROCEDURE DIVISION USING BY CONTENT etc.

** Parameter passing to the main program

If a main program has a USING clause, how should we pass parameters
to it?  Command line arguments or environment variables?  What other
do other compilers do?

** Handling of EBCDIC files

** Full implementation of COMP-5 and COMP-X

We need to implement binary truncation, and possibly other things.
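
As a rough sketch of what binary truncation means here: a binary item
described as PIC 9(n) keeps only the low-order n decimal digits when a
value is stored, while a COMP-5 item keeps whatever fits in the native
storage.  The helper names and the 16-bit storage size below are
assumptions for illustration only; they are not part of libcob.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 10^digits for 0 <= digits <= 18. */
static int64_t pow10_i64(int digits)
{
    int64_t p = 1;

    assert(digits >= 0 && digits <= 18);
    while (digits-- > 0)
        p *= 10;
    return p;
}

/* Store with binary truncation: keep only the low-order `digits`
   decimal digits, as for a binary item described as PIC 9(digits). */
int64_t store_truncated(int64_t value, int digits)
{
    return value % pow10_i64(digits);
}

/* Store without decimal truncation (COMP-5 style): keep the native
   binary value.  Assumes the value fits in 16 bits. */
int16_t store_native16(int64_t value)
{
    assert(value >= INT16_MIN && value <= INT16_MAX);
    return (int16_t) value;
}
```

With these definitions, storing 12345 into a four-digit item yields
2345 with truncation, but 12345 in the native 16-bit form.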

** Temporary SORT file and TMPDIR

SORT should be done by using a temporary file in the directory
specified by the environment variable TMPDIR.
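
A minimal sketch of how the temporary file could be created, assuming
a POSIX system.  The function names, the "cobsort" prefix, and the
fallback to /tmp when TMPDIR is unset are all assumptions, not
existing run-time code.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: create a temporary sort file in $TMPDIR,
   falling back to /tmp.  The generated path is written to `path`.
   Returns an open file descriptor, or -1 on error. */
int open_sort_tempfile(char *path, size_t size)
{
    const char *dir = getenv("TMPDIR");
    int n;

    if (dir == NULL || dir[0] == '\0')
        dir = "/tmp";
    n = snprintf(path, size, "%s/cobsortXXXXXX", dir);
    if (n < 0 || (size_t) n >= size)
        return -1;              /* path did not fit in the buffer */
    return mkstemp(path);       /* replaces XXXXXX and opens the file */
}

/* Hypothetical helper: close and delete the temporary sort file.
   Returns 0 only if both steps succeed. */
int close_sort_tempfile(int fd, const char *path)
{
    int rc = close(fd);

    if (unlink(path) != 0)
        rc = -1;
    return rc;
}
```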

** Table sort

** The default action of file exception

If a file does not have FILE STATUS, and if there is no USE AFTER
ERROR in the program, then the program should stop on exception.

On the other hand, if a file does have FILE STATUS, maybe we should
not report an error on exception.
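
The rule above could be sketched as a small decision function.  The
enum, the function name, and the choice to give USE AFTER ERROR
priority over FILE STATUS are assumptions made for illustration.

```c
/* Hypothetical outcomes of a failed file operation. */
enum file_error_action {
    FILE_ERROR_RUN_DECLARATIVE, /* perform the USE AFTER ERROR procedure */
    FILE_ERROR_CONTINUE,        /* set FILE STATUS only; the program checks it */
    FILE_ERROR_STOP             /* report the error and stop the run */
};

/* Decide what to do when a file operation fails.  Both arguments are
   flags: nonzero means the program declares the corresponding item. */
enum file_error_action
file_error_action_for(int has_use_after_error, int has_file_status)
{
    if (has_use_after_error)
        return FILE_ERROR_RUN_DECLARATIVE;
    if (has_file_status)
        return FILE_ERROR_CONTINUE;
    return FILE_ERROR_STOP;
}
```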

** Check the record size

The READ statement should check the record size.

** Better ACCEPT
(2003-11-15: possibly bug: accept into numeric field)

Accepting a numeric data item should be more user-friendly.

** COBOL run program
(2003-11-17: Open Cobol Optimization)

Instead of building an executable from a COBOL source file, we
could always build modules (*.so files) and run them through
a boot program.  Roger While posted a sample program and some
observations about it to the list.

** Partial word replacement
(2004-02-03 minor preprocessor bug)

Partial word replacement by the COPY statement.  I have heard about
another compiler having a form COPY REPLACING //..// BY //..//,
which does partial word replacement?

** Listing file
(2004-02-13: BUG: "COPY REPLACING ..."  and  REQUEST for "mini-listing")

** And many bugs reported recently

* Other features to be implemented

** Intrinsic functions

Before implementing intrinsic functions, we need to implement
"Intermediate data storage" described below.

** Nested subprograms

Before implementing nested subprograms, I would like to review
the way of generating C variables.  See "Single file output" below.

** SCREEN SECTION

Too many things to do.  TinyCOBOL has better support of this.

** Embedded SQL

Frank Polscheit posted his implementation of SQL preprocessor.
(2004-02-17: SQL pre-processor for OPEN-COBOL)

Firebird (firebird.sourceforge.net) has a SQL preprocessor
for their database.

* Improvement of compiler internals

** Error checking

*** If the VALUE clause does not match the picture,
maybe we should print warnings.

*** Type checking with each statement

Most statements do not check the type of their parameters.
We should do it at the beginning of each cb_emit_* function.

*** Strict error checking depending on the standard

*** Use `error' token in the parser for better error recovery

** Intermediate data storage

Currently, the following statement

  MOVE A(B) TO B, C(B).

is converted into

  MOVE A(B) TO B.
  MOVE A(B) TO C(B).

which does not work correctly.  We should instead convert it into

  MOVE A(B) TO T.
  MOVE T TO B.
  MOVE T TO C(B).

where `T' is an intermediate data item allocated by the compiler.

More generally, all identifiers must be identified where described
in the standard.  Thus, the above statement should be converted
into the following internal code:

  t1 := A(B)
  t2 := B
  t3 := C(B)
  cob_move (t1, t2)
  cob_move (t1, t3)
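
To see why the current expansion goes wrong, here is a small C model of
the two expansions.  The struct layout, the int-valued items, and the
function names are assumptions; real data items are of course not C
ints.  Index 0 of each table is unused so that subscripts start at 1.

```c
/* Working storage: tables A and C, and B, which is both a subscript
   and a receiving item. */
struct items {
    int A[4];
    int B;
    int C[4];
};

/* Current expansion:  MOVE A(B) TO B.  MOVE A(B) TO C(B).
   B changes after the first move, so the second move uses the
   wrong element of A and the wrong element of C. */
void move_naive(struct items *w)
{
    w->B = w->A[w->B];
    w->C[w->B] = w->A[w->B];
}

/* Proposed expansion: identify every operand first, then move. */
void move_with_temporaries(struct items *w)
{
    int *t1 = &w->A[w->B];      /* t1 := A(B) */
    int *t2 = &w->B;            /* t2 := B    */
    int *t3 = &w->C[w->B];      /* t3 := C(B) */

    *t2 = *t1;                  /* cob_move (t1, t2) */
    *t3 = *t1;                  /* cob_move (t1, t3) */
}
```

Starting from A = (3, 7, 9) and B = 1, the naive version stores 9 into
C(3), while the version with temporaries stores 3 into C(1).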

** Single file output

Currently, cobc generates two C files.  I want to integrate them
into a single file.

** Better exception handling of file I/O

Currently, all file I/O statements (OPEN, CLOSE, READ, WRITE, etc.)
produce implicit PERFORM statements in order to run USE EXCEPTION.
This produces lots of redundant code and should be improved.

** Using fcntl instead of flock for file locking
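
A minimal sketch of whole-file locking with fcntl, assuming a POSIX
system.  The helper names are hypothetical.  Unlike flock, fcntl locks
can also cover a byte range, by setting l_start and l_len.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: try to lock the whole file without blocking.
   A nonzero `exclusive` requests a write lock (fd must be open for
   writing); otherwise a shared read lock (fd must be open for reading).
   Returns 0 on success, -1 if the lock is held elsewhere or on error. */
int lock_whole_file(int fd, int exclusive)
{
    struct flock lk = {0};

    lk.l_type = exclusive ? F_WRLCK : F_RDLCK;
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0;               /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &lk);
}

/* Hypothetical helper: release a lock taken by lock_whole_file. */
int unlock_whole_file(int fd)
{
    struct flock lk = {0};

    lk.l_type = F_UNLCK;
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0;
    return fcntl(fd, F_SETLK, &lk);
}
```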

* Optimization

** More inlining of run-time functions

Especially, cob_move, cob_get_int, and cob_cmp should be inlined for
each data type.  A good way of doing this would be to define macros
for each type and use them in both the run time and the generated code.
For example, cob_get_int might be defined as follows:

#define cob_get_int_display_u1(b) ((int) ((b)[0] - '0'))
#define cob_get_int_display_u2(b) ((int) (((b)[0] - '0') * 10 + ((b)[1] - '0')))
...

#define cob_get_int_binary_u1(b) ((int) *(unsigned char *)(b))
#define cob_get_int_binary_u2(b) ((int) *(unsigned short *)(b))
...

int cob_get_int (cob_field *f)
{
  switch (f->type)
    {
    case COB_TYPE_NUMERIC_DISPLAY:
      if (!COB_FIELD_HAVE_SIGN (f))
        switch (f->size)
          {
          case 1:
            return cob_get_int_display_u1 (f->data);
          case 2:
            return cob_get_int_display_u2 (f->data);
          ...
          }
      else
        ...
    case COB_TYPE_NUMERIC_BINARY:
      if (!COB_FIELD_HAVE_SIGN (f))
        switch (f->size)
          {
          case 1:
            return cob_get_int_binary_u1 (f->data);
          case 2:
            return cob_get_int_binary_u2 (f->data);
          ...
          }
      else
        ...
    }
}

After doing this, the compiler may directly generate one of the
macros instead of the general purpose function.
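
A small self-contained sketch of the difference, for an unsigned
two-digit DISPLAY item.  The macro follows the proposal above; the
general function, the sample item N, and the two "generated" functions
are simplified stand-ins, not actual cobc output.

```c
#include <stddef.h>

/* Type-specific macro, as proposed above. */
#define cob_get_int_display_u2(b) \
    ((int) (((b)[0] - '0') * 10 + ((b)[1] - '0')))

/* Simplified stand-in for the general purpose function: convert any
   unsigned DISPLAY item by looping over its digits. */
int get_int_display(const unsigned char *data, size_t size)
{
    int value = 0;
    size_t i;

    for (i = 0; i < size; i++)
        value = value * 10 + (data[i] - '0');
    return value;
}

/* Storage for a hypothetical item:  01 N PIC 99 VALUE 42. */
static unsigned char b_N[2] = {'4', '2'};

/* What the generated code does today, roughly: a function call. */
int generated_general(void)
{
    return get_int_display(b_N, 2);
}

/* What it could do instead: the macro expands in place, with no call
   and no loop, because the type and size are known at compile time. */
int generated_inlined(void)
{
    return cob_get_int_display_u2(b_N);
}
```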

** Constant field allocation

When a run-time function is called, the field information is packed
into a C structure and passed to the function.  For example, the
following COBOL code

  01 X PIC X.

  DISPLAY X.

produces something like this:

  unsigned char b_X[1];
  cob_field_attr a_X = {COB_TYPE_ALPHANUMERIC, ...};
  cob_field f_X;

  f_X = (cob_field) {1, b_X, &a_X};
  cob_display (&f_X);

Now, there are four ways of doing this:

 1. Temporary assignment

      cob_field f_temp;
      f_temp = (cob_field) {1, b_X, &a_X};
      cob_display (&f_temp);

    Set a field when necessary.

 2. Compound literals

      cob_display (&(cob_field){1, b_X, &a_X});

    In this case, the variable f_X is unnecessary, and
    the C compiler seems to produce better code than 1.

    This syntax is supported by recent GCC and C99, but
    not all C compilers support this.  We should use this
    instead of 1 when possible.

 3. Local variable

      int func ()
      {
        const cob_field f_X = {1, b_X, &a_X};

        ...

        cob_display (&f_X);

        ...
      }

    If X is used several times, it might be better to
    allocate a variable once at the beginning and use it
    where necessary.  This way you can save the time of
    assignment and reduce the code size.

 4. Static variable

      static const cob_field f_X = {1, b_X, &a_X};

      int func ()
      {
        ...

        cob_display (&f_X);

        ...
      }

    Similar to 3 but allocate it as a static variable.
    If the data address of the field (`b_X' in this case)
    is static, you can also allocate the field statically.

If an identifier contains subscripts and reference modification,
or if the data item is variable length, then you have to use 1 or 2.
Otherwise, if the data item is defined in working-storage section,
you can use 4.  Otherwise, you can use 3.

Currently, cobc uses 1 and 4.  We should also implement 2 and 3.
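
Here is a compilable sketch that contrasts ways 2 and 4.  The `field'
and `field_attr' types are simplified stand-ins for cob_field and
cob_field_attr, `shown_bytes' stands in for a run-time function such as
cob_display, and the item T with reference modification is a made-up
example.  Compound literals require C99 or GCC.

```c
#include <stddef.h>

/* Simplified stand-ins for the run-time types. */
typedef struct { int type; } field_attr;
typedef struct {
    size_t size;
    unsigned char *data;
    const field_attr *attr;
} field;

/* Stand-in for a run-time function; it just reports the field size. */
size_t shown_bytes(const field *f)
{
    return f->size;
}

static unsigned char b_X[1];    /* 01 X PIC X.     */
static unsigned char b_T[10];   /* 01 T PIC X(10). */
static const field_attr a_X = {1};

/* Way 4: size and address are fixed, so the field can be a static
   constant that is initialized once at load time. */
static const field f_X = {1, b_X, &a_X};

size_t demo(size_t len)
{
    size_t n = shown_bytes(&f_X);

    /* Way 2: for something like T(1:len) the size is known only at
       run time, so the field is built in place as a compound literal. */
    n += shown_bytes(&(field){len, b_T, &a_X});
    return n;
}
```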

** Flow analysis and function separation

Currently, cobc translates a COBOL program into a single huge
C function.  There are two problems with doing this:

 - C compilation is very slow, especially when optimization
   is enabled.  It seems that compiling a single huge C function
   is slower than compiling the same code split into several
   functions.

 - Debugging the generated COBOL program with gdb is hard
   because you cannot skip a PERFORM statement with the 'next'
   command.  Currently PERFORM is implemented by C's goto
   statement, so you have to go there.

To solve these problems, we could separate COBOL sections into
multiple C functions, and use C function calls to execute each
section.  However, this does not work for all cases.  Consider
the following example:

  SAMPLE-1 SECTION.
    PERFORM SAMPLE-2.
    PERFORM SAMPLE-3.
  SAMPLE-2 SECTION.
    GO TO SAMPLE-3.
  SAMPLE-3 SECTION.
    EXIT.

You might want to generate three functions SAMPLE_1, SAMPLE_2,
and SAMPLE_3.  SAMPLE_1 might be defined as follows:

  void SAMPLE_1 ()
  {
    SAMPLE_2 ();
    SAMPLE_3 ();
  }

But you cannot define SAMPLE_2 because you cannot jump from
one function to another function.  SAMPLE_2 and SAMPLE_3 must
be defined within the same function, and thus you cannot call
them separately.

Maybe this is not a good example, but it illustrates a problem
that can happen.  To detect and avoid this kind of problem,
we will need control flow analysis of COBOL programs.

If a portion of program is used only through a PERFORM
statement, and if there is no GO TO statement that jumps
to outside of the portion, then we can safely separate the
portion as a C function.
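
The GO TO part of this condition could be checked roughly as follows.
This is only a sketch under simplifying assumptions: sections are
numbered in source order, each section has at most one GO TO, and
fall-through between sections and PERFORM entry points are not modeled.
The names are hypothetical.

```c
#define NO_GOTO (-1)

/* goto_target[i] is the index of the section that section i jumps to
   with GO TO, or NO_GOTO.  Returns 1 if no GO TO crosses the boundary
   of the range first..last in either direction, 0 otherwise. */
int gotos_stay_inside(const int *goto_target, int nsections,
                      int first, int last)
{
    int i;

    for (i = 0; i < nsections; i++) {
        int target = goto_target[i];
        int src_inside = (i >= first && i <= last);
        int dst_inside = (target >= first && target <= last);

        if (target != NO_GOTO && src_inside != dst_inside)
            return 0;
    }
    return 1;
}
```

For the example above (SAMPLE-1, SAMPLE-2, SAMPLE-3 numbered 0, 1, 2,
with SAMPLE-2 jumping to SAMPLE-3), neither SAMPLE-2 nor SAMPLE-3
passes the check alone, but the range covering both does.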

* Debugging support

** Better line directive

When the line directive is enabled, the following COBOL code

  DISPLAY "Hello" "world".

produces something like this:

  #line 1 "source.cob"
    cob_display ("Hello");
    cob_display ("world");
    cob_newline ();

Suppose you are debugging the COBOL program using gdb.
You set a breakpoint at the beginning of DISPLAY and
type the 'next' command.  Then you would see nothing.

This is because only the first `cob_display' call in the C code
was executed.  We could instead produce the following code:

  #line 1 "source.cob"
    cob_display ("Hello"); cob_display ("world"); cob_newline ();

In this case, you'll get all three function calls executed at once.

** Data access method

We should generate the whole data hierarchy defined in a COBOL program
with all relevant information, including data names, picture clauses,
and source locations.  We should also define a debugging function
that receives a data name and displays its value, using the generated
data hierarchy.  By calling the function from gdb, we can easily
access the COBOL data at debugging time.
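
A minimal sketch of what the generated table and the debugging function
might look like.  The struct, the function names, and the sample item
are all assumptions; only alphanumeric items are handled, and the
hierarchy is flattened to a simple list.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical per-item debugging record emitted by the compiler. */
struct debug_item {
    const char *name;           /* data name           */
    const char *picture;        /* picture clause      */
    int line;                   /* source line number  */
    const unsigned char *data;  /* address of the item */
    size_t size;                /* size in bytes       */
};

/* Example of generated data for:  01 X PIC X(2) VALUE "OK". */
static unsigned char b_X[2] = {'O', 'K'};
static const struct debug_item debug_items[] = {
    {"X", "X(2)", 12, b_X, 2},
};

/* Look up an item by data name; returns NULL if it is not found. */
const struct debug_item *debug_find(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof debug_items / sizeof debug_items[0]; i++)
        if (strcmp(debug_items[i].name, name) == 0)
            return &debug_items[i];
    return NULL;
}

/* Display an item's description and value.  Returns 0, or -1 if the
   name is unknown. */
int debug_show(const char *name)
{
    const struct debug_item *it = debug_find(name);

    if (it == NULL)
        return -1;
    printf("%s PIC %s (line %d) = \"%.*s\"\n",
           it->name, it->picture, it->line, (int) it->size,
           (const char *) it->data);
    return 0;
}
```

From gdb, such a function could then be invoked with something like
`call debug_show("X")'.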

* Documentation

** Better web page

We should at least state the goal of open-cobol, what open-cobol
can and cannot do now, and what is going on now.

** Better user manual