[kepler-dev] Null/nil/missing values

Thu Dec 22 15:41:50 PST 2005

Hi All,

    There has been some discussion on how to handle null/nil/missing 
values in Kepler. I thought I would find out how this is handled 
internally in 'R'. The response I got from Duncan Temple Lang (one 
of R's developers) is below. Apparently, for each of the float, integer, 
and character types they just reserve one value to represent an 'NA' (as 
R calls a missing value). For example, a missing integer is just the 
smallest possible integer (INT_MIN).

    For what it's worth, we could take the same approach in Kepler and 
not have to change any token handling. Actors that care could then just 
check for missing values (which is what R functions do).

----

  Indeed I do know and have to as one of the developers of R. It
would be nice if it were simpler, but alas....

  This is handled in the C code with a collection of constants.
See R_ext/Arith.h and Rmath.h in the include/ directory.

  For reals, we have a constant R_NaReal or its macro equivalent NA_REAL
which is preferred.

  For logicals and integers, we have R_NaInt and the preferred macros
  NA_LOGICAL and NA_INTEGER.

  For these three types of NAs, they are regular elements within the
  int * or double * arrays.
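
  [Editor's sketch] The in-band approach described above can be
illustrated with a small, self-contained C example. It does not use R's
headers. The names MY_NA_INTEGER, my_na_real, my_is_na_real and
mean_int_skip_na are hypothetical, and the NaN payload value is an
assumption chosen for illustration. It also assumes IEEE 754 doubles.
The point is only that the missing value is an ordinary array element
that code has to check for explicitly.

```c
#include <assert.h>
#include <limits.h>
#include <math.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel: one reserved int stands for "missing". */
#define MY_NA_INTEGER INT_MIN

/* Assumed payload used to tell a "missing" NaN from an ordinary NaN. */
#define MY_NA_PAYLOAD UINT64_C(1954)

/* Build a quiet NaN whose low 32 bits carry the marker payload. */
double my_na_real(void) {
    uint64_t bits = UINT64_C(0x7FF8000000000000) | MY_NA_PAYLOAD;
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* True only for NaNs that carry the marker payload. */
int my_is_na_real(double x) {
    uint64_t bits;
    if (!isnan(x)) return 0;
    memcpy(&bits, &x, sizeof bits);
    return (bits & UINT64_C(0xFFFFFFFF)) == MY_NA_PAYLOAD;
}

/* Mean of the non-missing elements; *n_used reports how many counted.
   Returns the "missing" double when every element is missing. */
double mean_int_skip_na(const int *values, size_t n, size_t *n_used) {
    assert(values != NULL && n_used != NULL);
    double sum = 0.0;
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        if (values[i] == MY_NA_INTEGER) continue;  /* skip the sentinel */
        sum += values[i];
        used++;
    }
    *n_used = used;
    return used ? sum / (double)used : my_na_real();
}
```

  The cost of this design is that the sentinel can no longer be used as
an ordinary integer, and every consumer has to remember to test for it.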

  For strings, the internal representation is different.  In the R
language, there is no string type, just character vectors.
Each character vector is really a vector of internal
string objects, which are CHARSXP's in the C code.
There is a predefined value, R_NaString, and its macro
NA_STRING, which identifies a missing string.
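
  [Editor's sketch] One way to picture a predefined "missing" string
value is a single distinguished object that is recognised by its
address rather than by its text. The example below is a rough
illustration of that idea only. The type my_string and the names
MY_NA_STRING, my_is_na_string and count_missing are hypothetical and
are not R's CHARSXP machinery.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an internal string object. */
typedef struct {
    const char *chars;
} my_string;

/* One distinguished object represents "missing"; its text is irrelevant. */
const my_string my_na_string_obj = { "NA" };
#define MY_NA_STRING (&my_na_string_obj)

/* Missingness is decided by identity, not by comparing characters. */
int my_is_na_string(const my_string *s) {
    return s == MY_NA_STRING;
}

/* Count the missing elements in a vector of string objects. */
size_t count_missing(const my_string *const *vec, size_t n) {
    assert(vec != NULL);
    size_t missing = 0;
    for (size_t i = 0; i < n; i++) {
        if (my_is_na_string(vec[i])) missing++;
    }
    return missing;
}
```

  Because the test is on identity, an ordinary string whose text happens
to be "NA" is not treated as missing.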

  So that is how you can deal with the NAs in the various formats
when working with the C-level representations of the R objects.

 Hope that helps. Let me know if it is not clear or if there is
anything else I can point you to.

 D.

Dan Higgins wrote:

>> Hi Duncan
>> 
>>    Some discussions have come up in our Kepler/SEEK project about how to
>> handle null/nil/missing values. So I wondered how R stores NA values
>> internally. I can see that IEEE NaNs might be used for floating point
>> values, but what about strings or integers? I realize that you may not
>> know or remember this, but could you point me to some reference?
>> 
>> Merry Christmas/Happy Holidays (take your pick of PC greetings ;-) )
>> 
>> Thanks,
>> 
>> Dan
>> 

-- 
Duncan Temple Lang                    duncan at wald.ucdavis.edu
Department of Statistics              work:  (530) 752-4782
4210 Mathematical Sciences Building   fax:   (530) 752-7099
One Shields Ave.
University of California at Davis
Davis,
CA 95616,
USA

-- 
*******************************************************************
Dan Higgins                                  higgins at nceas.ucsb.edu
http://www.nceas.ucsb.edu/    Ph: 805-893-5127
National Center for Ecological Analysis and Synthesis (NCEAS)
Marine Science Building - Room 3405
Santa Barbara, CA 93195
*******************************************************************