Re: Difference between NaN and NA?
From: Jaroslav Hajek
Subject: Re: Difference between NaN and NA?
Date: Thu, 8 Apr 2010 13:31:43 +0200

On Thu, Apr 8, 2010 at 1:15 PM, John W. Eaton <address@hidden> wrote:
> On 8-Apr-2010, Jaroslav Hajek wrote:
>
> | On Thu, Apr 8, 2010 at 9:03 AM, David Bateman <address@hidden> wrote:
> | > Jaroslav Hajek wrote:
> |
> | > Exact.. NaN is flagged by a particular exponent in IEEE754, so there are
> | > many values of the mantissa that still give a NaN.
>
> Right, I added this feature to Octave specifically for compatibility
> with NA in R. The idea was to be able to exchange binary data with R
> and preserve the special NaN value that R uses for NA.
>
> | In the current situation, NA's do no harm in Octave, but they're also
> | not much useful, because most functions ignore them, for instance,
> | mean ([1,2,3,NA]) is NA etc.
> | On the contrary, inserting checks for NA's everywhere is a good idea
> | either, because it will slow things down for everyone.
>
> Yes, I assume R has a lot of special checks for NA (and that you meant
> "is not a good idea"). Unfortunately, no one was ever motivated to
> make Octave functions NA-aware. So the value exists, but it is not
> really used for anything.
>
> Also, there is no guarantee that NA will be preserved across a
> function call. For example, on some platforms, you might see
> something like
>
> sin (NA) ==> NaN
>
> because it might be implemented with something like
>
>   double sin (double x)
>   {
>     if (isnan (x))
>       return NaN;
>     ...
>   }
>
> instead of
>
>   double sin (double x)
>   {
>     if (isnan (x))
>       return x;
>     ...
>   }
>
> So even ensuring that NA is preserved would add overhead...
>
> But I still do like the idea of having NA separate from NaN so that
> you can have a way to express "missing data" that is separate from
> "failed calculation". But I don't know of any really great way to
> implement it.
>
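To make the quoted point concrete, here is a minimal C sketch of an NA
that is just a NaN carrying a tag in its mantissa. The names make_na,
is_na, lossy and preserving are hypothetical, and the bit pattern (a
quiet NaN whose low 32 bits hold 1954) is an assumption chosen to echo
what R is understood to use; it is not taken from the Octave sources.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assumed tag stored in the low 32 bits of the NaN mantissa. */
#define NA_TAG UINT32_C(1954)

/* Build the hypothetical NA: all-ones exponent, quiet bit set, tag in the low word. */
double make_na(void)
{
    uint64_t bits = UINT64_C(0x7FF8000000000000) | NA_TAG;
    double x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* NA is a NaN whose low word matches the tag; isnan() alone cannot tell. */
int is_na(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    return isnan(x) && (uint32_t)(bits & 0xFFFFFFFFu) == NA_TAG;
}

/* Returns a fresh NaN on NaN input, so the tag is thrown away. */
double lossy(double x)
{
    return isnan(x) ? NAN : x + 1.0;
}

/* Returns the argument itself on NaN input, so the tag survives. */
double preserving(double x)
{
    return isnan(x) ? x : x + 1.0;
}
```

Both functions return a NaN for an NA argument, but only preserving
hands back a value that is_na still recognizes.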
But why make it built-in? I think that even with the current limited
OOP capabilities, it is possible to build a class that maintains the
NA's as a separate mask, making even the elementary operations
NA-aware (i.e. so that x + NA is NA and never NaN). Overloading the
statistics functions like mean etc. would also be quite simple.
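
As a rough illustration of the mask idea, sketched in C rather than as
an actual Octave class: the type na_vec and the functions na_add and
na_mean are hypothetical names, and the choice to skip missing entries
in the mean is just one possible policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical container: the values plus a parallel "is missing" mask. */
typedef struct {
    double *val;
    bool *na;
    size_t n;
} na_vec;

/* Elementwise r = a + b; an element is NA if it is NA in either operand. */
void na_add(const na_vec *a, const na_vec *b, na_vec *r)
{
    assert(a->n == b->n && a->n == r->n);
    for (size_t i = 0; i < a->n; i++) {
        r->na[i] = a->na[i] || b->na[i];
        /* The stored value of a missing element is a don't-care; use 0. */
        r->val[i] = r->na[i] ? 0.0 : a->val[i] + b->val[i];
    }
}

/* Mean over the non-missing elements; *all_na is set when none remain. */
double na_mean(const na_vec *v, bool *all_na)
{
    double sum = 0.0;
    size_t count = 0;
    for (size_t i = 0; i < v->n; i++) {
        if (!v->na[i]) {
            sum += v->val[i];
            count++;
        }
    }
    *all_na = (count == 0);
    return count ? sum / (double) count : 0.0;
}
```

The missingness lives entirely in the mask, so the arithmetic on the
plain doubles needs no NaN-payload checks, and code that never uses the
container pays nothing for it.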

Of course, as a performance zealot, I would oppose making Octave's
built-in operations NA-aware by inserting special checks, because in
general I don't like to pay for something I never use. Neither do R
users, I suppose, so even in R, skipping the NA's is, in general,
optional:

  > mean (c(1, 1, NA))
  [1] NA
  > mean (c(1, 1, NA), na.rm=1)
  [1] 1

Yes, we could do something similar in Octave, but isn't an extra class
taking care of this more elegant?
It seems quite inconvenient to have to pass the flag everywhere.

--
RNDr. Jaroslav Hajek, PhD
computing expert & GNU Octave developer
Aeronautical Research and Test Institute (VZLU)
Prague, Czech Republic
url: www.highegg.matfyz.cz