[LTER-im] measurmentScale/precision - what definition? how to handle?
Tim Bergsma
tbergsma at kbs.msu.edu
Mon Aug 4 07:47:42 PDT 2003
Peter, others,
regarding the previous comment...
> However we go, it's obvious that we need to rewrite the definition of
> precision since, as David points out, it doesn't define the term
> precision. Is it significant digits or an interval? And does that
> refer only to the minimum reported digit or interval, or is it a
> statement of accuracy?
I agree. We need to rewrite the definition. It is an interval, not
significant digits, because significant digits is just a special case of
interval, and we need to stay general. It is not a statement of
accuracy OR statistically qualified "precision", because there are no
universally accepted definitions for these things, and no way to attach
a custom definition to the precision element.
Tim
P.S. This doesn't solve Barbara's two-fish-scales problem. Alas! I
fear it is unsolvable. Perhaps she should just report the worst of the
two relevant precisions, or split the table.
>
>
>
> Peter McCartney (peter.mccartney at asu.edu)
> Center for Environmental-Studies
> Arizona State University
>
>
> -----Original Message-----
> From: Wade Sheldon [mailto:sheldon at uga.edu]
> Sent: Friday, August 01, 2003 7:11 AM
> To: dblankman at lternet.edu; Matt Jones
> Cc: im at lternet.edu; eml-dev at ecoinformatics.org
> Subject: Re: [LTER-im] measurmentScale/precision - what
> definition? how to handle?
>
> David and all,
>
> This is an important point to nail down, because it
> bears on both statistical analysis and display of data
> set values by EML-savvy software (i.e. when the data are
> stored in an RDBMS field or program variable using a
> single- or double-precision floating point storage type that
> supports arbitrary scale and precision).
>
> In my experience, most researchers use "precision" to
> reflect the number of significant decimal places to display
> based on the stated or perceived accuracy of the analytical
> procedure, or instrument readability if that information is
> not known. In other words this is used as a surrogate
> for significant digits, which is generally a more accurate
> way of conveying this information but poorly supported in
> most computational software (i.e. without resorting to
> scientific notation).
>
> When I read the EML spec I interpreted "precision" to
> be what I more commonly see described as "accuracy", or the
> smallest difference between two measurements that can be
> resolved using the stated analytical method. This is closely
> related to the significant digits concept but allows values
> that are not exact powers of 10 (e.g. 0.005).
>
> At GCE we store precision information for all numerical
> attributes in data sets as integers indicating the number of
> significant decimal places to display (i.e. our approach is
> most consistent with your mathematics definition below).
> This value is based on the accuracy/readability reported by
> the investigators on metadata forms, or is determined by
> instrument specifications or value inspection if the
> investigator didn't provide the information and couldn't be
> contacted. For data that span many orders of magnitude (e.g.
> bacterial abundances ranging from 10^4 to 10^8) we use an
> exponential data storage type and report precision as
> significant digits. This precision information is used to
> generate input masks for data editing forms and output
> format commands when data sets are exported in ASCII format.
> It is also used to (optionally) round or truncate values
> following calculations of derived attributes to remove
> spurious trailing decimal places. To support EML precision I
> am just using the inverse power of 10 of my precision values
> (i.e. 10^-x, so GCE precision = 2 becomes EML precision =
> 0.01), and software writers will presumably have to reverse
> this process (using common logs and rounding) when integer
> decimal place tokens are needed for formatted output
> statement arguments.
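
The round trip Wade describes can be sketched roughly as follows. This is
only an illustration, not GCE's code: the function names are hypothetical,
and rounding the log *up* (so that an interval such as 0.005 yields three
decimal places) is an assumption about what "rounding" should mean for
values that are not exact powers of 10.

```python
import math


def decimals_to_eml_precision(decimal_places: int) -> float:
    """Turn a decimal-place count (e.g. 2) into an interval (e.g. 0.01)."""
    return 10.0 ** -decimal_places


def eml_precision_to_decimals(precision: float) -> int:
    """Recover a decimal-place count from an interval using a common log."""
    if precision <= 0:
        raise ValueError("precision must be a positive interval")
    # The small tolerance absorbs floating-point noise in log10();
    # ceil() rounds up so 0.005 needs 3 places; max() clamps intervals >= 1 to 0.
    return max(0, math.ceil(-math.log10(precision) - 1e-9))


def format_value(value: float, precision: float) -> str:
    """Format a value with the number of decimal places the interval implies."""
    places = eml_precision_to_decimals(precision)
    return f"{value:.{places}f}"
```

With these helpers, a GCE precision of 2 maps to an interval of 0.01, and
that interval maps back to 2 decimal places for an output format string.
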
>
> I am interested to hear other comments on this, but in the
> absence of reported precision I think using 0 would be worse
> than nothing as it could definitely lead to inappropriate
> data handling and analysis. I think the only legitimate way
> to "fudge" precision in the absence of contributor feedback
> is value inspection for flat files (i.e. look up maximum
> number of digits past the decimal point) or maximum number
> of "used" decimal places for RDBMS entries. It appears to me
> that precision and units-dictionary compliance are clearly
> going to be the make-or-break issues in the decision to
> provide attribute-level metadata for legacy data sets, and
> where the most effort and resources will be required.
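
For the flat-file case, value inspection might look something like the
sketch below. It assumes the values are read as text (so trailing zeros
survive) and are already clean decimal strings; `inferred_precision` is a
hypothetical name. Note that it reports how the values were written, not
how well they were measured.

```python
from decimal import Decimal


def inferred_precision(raw_values):
    """Return the smallest interval implied by the text of the values."""
    max_places = 0
    for text in raw_values:
        number = Decimal(text.strip())
        if not number.is_finite():
            continue  # skip NaN / Infinity tokens
        # The exponent is negative when there are digits past the decimal point.
        exponent = number.as_tuple().exponent
        max_places = max(max_places, -exponent)
    # 10 ** -max_places, kept as an exact Decimal
    return Decimal(1).scaleb(-max_places)
```

For the three values in David's first example (1.75, 10.6, 11.765) this
returns 0.001.
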
>
> Wade Sheldon
> GCE-LTER Information Manager
>
>
> ----- Original Message -----
>
> From: David Blankman
> To: Matt Jones
> Cc: im at lternet.edu ; eml-dev at ecoinformatics.org
> Sent: Thursday, July 31, 2003 9:38 PM
> Subject: [LTER-im] measurmentScale/precision -
> what definition? how to handle?
>
> Matt & IMs & EML-Dev
>
> How to Handle Missing Precision Information
> Most of the metadata files that I have been
> working with and most of those from sites like NTL
> do not have precision information. While XML Spy
> seems to validate empty elements, the EML
> Validator service does a better job and will not
> allow empty elements.
>
> Because many, if not most, of the LTER Information
> Managers have told me that they need to check with
> researchers to get precision information, it may be
> some time before we are able to get precision
> information.
>
> Initially I thought that we could handle precision
> by just using empty elements, but that does not seem
> possible.
>
> It seems to me that we have two alternatives:
>
> 1. Use a precision of "0" to indicate that
> precision is missing.
> 2. Put in metadata without dataTable.
>
> Perhaps the problem with precision is that
> different people are interpreting precision
> differently.
>
> The EML documentation states:
> <doc:description>The precision element represents the precision
> of the measurement, in the same unit as the measurement. For
> example, for an attribute with unit "meter", a precision of "0.1"
> would be interpreted as precise to the nearest 1/10th of a
> meter, and a precision of "1" would be interpreted as precise
> to the nearest 1 meter.
> </doc:description>
>
> This description does not help since it does not
> define precision, but rather assumes that you
> know what precision means. I remember that we
> discussed the definition, but I cannot remember
> what definition we decided to use.
>
> Some definitions:
> b. The number of significant digits to which a
> value has been reliably measured.
>
> precision: 1. The degree of mutual agreement among
> a series of individual measurements, values, or
> results; often, but not necessarily, expressed by
> the standard deviation. 2. With respect to a set
> of independent devices of the same design, the
> ability of these devices to produce the same value
> or result, given the same input conditions and
> operating in the same environment. 3. With respect
> to a single device, put into operation repeatedly
> without adjustments, the ability to produce the
> same value or result, given the same input
> conditions and operating in the same environment.
> Synonym (for defs. 1, 2, and 3) reproducibility.
> 4. In computer science, a measure of the ability
> to distinguish between nearly equal values. (188)
> 5. The degree of discrimination with which a
> quantity is stated; for example, a three-digit
> numeral to the base 10 discriminates among 1000
> possibilities.
>
> <mathematics> The number of decimal places to
> which a number is computed.
>
> What concept are we trying to capture?
>
> Can the precision be simply a statement of the
> number of decimal places in the data? For example:
>
> unit = meter
> DATA
> 1.75
> 10.6
> 11.765
>
> Can we say that the precision is .001 without
> knowing anything about the source of the data?
>
> Or are we making a statement about the number of
> significant digits, for example, a data logger can
> record 4 digits, e.g.
>
> The following can be recorded:
>
> 12.75
> 127.5
> 1.275
> 1275
>
> but NOT 127.53
>
> Is the precision here also .001?
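
One way to look at the logger example: with a fixed budget of four
significant digits, the interval of the last recorded digit depends on the
magnitude of each value. A rough sketch, with a hypothetical function name
and the values from the list above read as text:

```python
from decimal import Decimal


def interval_for(text, sig_digits=4):
    """Interval of the last significant digit, given a fixed digit budget."""
    value = Decimal(text)
    # adjusted() is the exponent of the most significant digit,
    # e.g. 1 for 12.75 and 3 for 1275.
    return Decimal(1).scaleb(value.adjusted() - sig_digits + 1)


intervals = {v: interval_for(v) for v in ["12.75", "127.5", "1.275", "1275"]}
```

For 12.75, 127.5, 1.275 and 1275 this gives 0.01, 0.1, 0.001 and 1, so
under the significant-digits reading no single interval describes the
whole column.
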
>
> If the data are derived, is the precision
> dependent on the precision of the original data?
> For example, an instrument can only discriminate to 0.1
> meter, but the data involve some statistical
> operation and are reported with additional
> decimal places.
>
> unit = meter
>
> Original Data
>
> 12.1
> 11.5
> 26.4
>
> Reported/Derived DATA
> 11.75
> 10.6
> 21.765
>
> Is the precision 0.1 or 0.001?
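
If the answer is that derived values inherit the source interval, the
rounding step could be sketched as below. The mean is used purely as a
stand-in for "some statistical operation" (the example does not say which
operation produced the reported values), and `round_to_interval` is a
hypothetical helper.

```python
from decimal import Decimal


def round_to_interval(value: Decimal, interval: Decimal) -> Decimal:
    """Round value to the same number of decimal places as interval.

    quantize() uses only the exponent of `interval`, so this handles
    power-of-10 intervals such as 0.1 or 0.001, not ones like 0.005.
    """
    return value.quantize(interval)


# Original data from the example, resolved to 0.1 meter.
original = [Decimal("12.1"), Decimal("11.5"), Decimal("26.4")]
mean = sum(original) / len(original)  # carries many spurious decimal places
reported = round_to_interval(mean, Decimal("0.1"))
```

Here the mean of the three original values is rounded back to 16.7, i.e.
reported at the 0.1 interval of the instrument rather than with the extra
decimal places produced by the calculation.
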
>
> David
>
> --
> David Blankman
> EML Integration Developer
> LTER Network Office
> 801 University, SE #104
> Albuquerque, NM 87106
> (505) 272-7346
--
Tim Bergsma
LTER Information Manager
W.K. Kellogg Biological Station
Michigan State University
Hickory Corners, MI 49060
269/671-2337
tbergsma at kbs.msu.edu
http://lter.kbs.msu.edu