octave-maintainers

Single/Double precision NA values


From: John W. Eaton
Subject: Single/Double precision NA values
Date: Tue, 29 Jul 2008 11:53:16 -0400

On 28-Jul-2008, David Bateman wrote:

| I got no response from the R maintainers to the attached e-mail that I
| sent twice to their mailing list. I therefore assume that the R
| maintainers don't really care too much about the compatibility of the NA
| values between R and Octave, and that Octave must make its own choice
| about how to handle this issue.
| 
| I previously sent a patch that addressed the issue of Single/Double
| precision NA values
| 
| http://www.nabble.com/-Changeset--Re%3A-Single-Precision-versus-double-precision-NA-p17679409.html
| 
| that replaced the double NA value with one that can be converted from
| double to single precision without change, but that is incompatible with
| the value from R. The patch also included code such that saved files
| that contained the old NA value would have this NA value changed
| internally to the new representation.
| 
| John's comment on this was "What about always writing values in the old
| format when writing binary data, and converting them when reading?".
| Sure we can do this, but there are a few caveats
| 
| * Linking R and Octave libraries together in a single application
| becomes complex as the Octave/R NA representation is different, though
| is this really used that much?

Some people would like to do this, but it hasn't really happened yet
in any reliable way, and I don't expect it will happen any time soon.
In addition to NA differences, there are probably other trouble spots
for coping with differences in the value types for the two
interpreters.  So I'm not sure this reason is sufficient to prevent
doing what is best for Octave.

| * The conversion of the NA values will require either an additional copy
| of the array to be written or a slow write function that checks every
| value before writing to see if it's an NA value.

Checks could be done in blocks to limit the actual number of "write"
system calls.
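
As a rough illustration of the block-wise idea, here is a minimal
sketch in C.  Everything in it is an assumption made for the example:
the function names, the block length, and both bit patterns (the
"old" one uses a low word of 1954, the value R is commonly documented
to use for NA; the "new" one is invented).  It is not the code from
the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical bit patterns, for illustration only.  */
#define OLD_NA_BITS UINT64_C (0x7FF00000000007A2) /* low word 1954 */
#define NEW_NA_BITS UINT64_C (0x7FF8400000000000) /* invented value */

enum { BLOCK_LEN = 1024 };

/* Replace every new-style NA in BUF with the old-style pattern.  */
static void
na_new_to_old (uint64_t *buf, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if (buf[i] == NEW_NA_BITS)
      buf[i] = OLD_NA_BITS;
}

/* Write N doubles to FP one block at a time.  Each block is copied
   into a small fixed-size buffer, converted there, and written with
   a single fwrite call, so the caller's array is never modified and
   never copied as a whole.  Returns the number of elements written.  */
size_t
write_with_old_na (const double *data, size_t n, FILE *fp)
{
  uint64_t buf[BLOCK_LEN];
  size_t done = 0;

  assert (sizeof (double) == sizeof (uint64_t));

  while (done < n)
    {
      size_t len = (n - done < BLOCK_LEN) ? n - done : BLOCK_LEN;

      memcpy (buf, data + done, len * sizeof (double));
      na_new_to_old (buf, len);

      size_t written = fwrite (buf, sizeof (uint64_t), len, fp);
      done += written;
      if (written != len)
        break;
    }

  return done;
}
```

The comparison is done on the raw bits rather than on the double
values, since any floating-point comparison involving a NaN is false.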

I don't know the best thing to do, but here are some additional
thoughts.

First, I made NA compatible with R so that we could exchange binary
data and preserve the meaning of NA between the systems.  But if no
one really cares about that, then I don't think it matters if this
feature stops working.

After applying your patch, what will happen when reading old binary
files that contain NA?  Will I get the new NA values internally?  That
would be helpful.  However, I don't think it is as important for the
newer Octave to be able to write out NA in a way that older versions
can read.

Aside from compatibility with R, the other main reason for adding NA
to Octave was so that we could have a way to represent NA separate
from NaN.  I still think that is a good thing, but I'm not sure that the
current implementation really does the job.  For example, we know that
some library functions (in the C library, for example) on some systems
do things like

  if (isnan (x))
    return NaN;

which will convert an NA value to a NaN.  Operations like this cause
trouble because the NA marker is lost and the result can no longer be
distinguished from a generic NaN value.
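
To make the failure concrete, here is a small sketch in C.  It assumes
that NA is stored as a NaN whose low 32-bit word holds the marker 1954
(the scheme R is commonly documented to use); the bit pattern, the
helper names, and the "careless" function are all hypothetical
stand-ins, not code from Octave or from any particular C library.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical NA pattern: a quiet NaN with low word 1954 (0x7A2).  */
#define NA_BITS UINT64_C (0x7FF80000000007A2)

static uint64_t
bits_of (double x)
{
  uint64_t b;
  assert (sizeof b == sizeof x);
  memcpy (&b, &x, sizeof b);
  return b;
}

double
make_na (void)
{
  uint64_t b = NA_BITS;
  double x;
  memcpy (&x, &b, sizeof x);
  return x;
}

/* NA is a NaN whose low word carries the marker.  */
int
is_na (double x)
{
  return isnan (x) && (uint32_t) (bits_of (x) & 0xFFFFFFFFu) == 1954;
}

/* Stand-in for a library routine that treats every NaN alike and
   returns a fresh generic NaN instead of passing its argument on.  */
double
careless_lib_func (double x)
{
  if (isnan (x))
    return NAN;
  return 2.0 * x;
}
```

With these definitions, is_na (make_na ()) is true, but
is_na (careless_lib_func (make_na ())) is false even though the result
is still a NaN: the marker in the low word is gone.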

I had also hoped that over time, people would modify existing
functions in Octave to be NA-aware, but I don't think that has really
happened.  I only see NA used in the following functions:

  interp1  interp2  interp3  interpn  interp1q  imshow

and isna is only used in

  interpn  __go_draw_axes__  assert

Given the limited usage and the surprising, system dependent, and, to
the end user, apparently random problems due to library functions,
should we even bother to keep the current definition of NA?  Or should
we just remove it and overload NaN to handle missing values?

jwe