[LTER-im] measurementScale/precision - what definition? how to handle?

Matt Jones jones at nceas.ucsb.edu
Mon Aug 4 10:32:16 PDT 2003


Hey,

OK, I confess to writing the text surrounding precision in EML.  Sorry 
if it has been confusing.  I agree that we need to rewrite it to clarify 
our intent.  This is a long note (sorry), but it deals with two basic 
things: 1) how do we define precision, and 2) should it be required?

1) How do we define precision
-----------------------------
Earlier versions of EML defined precision as the number of significant 
digits, but our discussions on this revealed that this was inadequate 
for two reasons (iirc): 1) you can only express precision to the nearest 
order of magnitude, and 2) it is exceedingly difficult to compare the 
precision of measurements made in two different units.  For example, body 
size in centimeters with precision of 3 significant digits is very 
different from body size in meters with precision of 3 significant 
digits.  That was the basic motivation for an expression involving the 
unit of measurement, which is the common way it is done in physics.
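
As a rough illustration of both problems, here is a minimal Python sketch. 
The function name implied_resolution and the sample body sizes are 
hypothetical, made up only for this example; they are not part of EML. 
The sketch computes the smallest step that a given number of significant 
digits can express for a recorded value.

```python
from decimal import Decimal


def implied_resolution(value: str, sig_digits: int) -> Decimal:
    """Smallest step expressible when `value` keeps `sig_digits` significant digits."""
    # adjusted() gives the power of ten of the leading digit (e.g. 1 for 12.3).
    exponent = Decimal(value).adjusted()
    # Shift 1 to the position of the last significant digit.
    return Decimal(1).scaleb(exponent - sig_digits + 1)


# Hypothetical body sizes, each recorded with 3 significant digits.
small_cm = implied_resolution("1.23", 3)  # a value recorded in centimeters
large_m = implied_resolution("12.3", 3)   # a value recorded in meters

print(small_cm, "cm")
print(large_m, "m")
```

For these inputs the sketch gives 0.01 cm and 0.1 m, so the same "3 
significant digits" describes resolutions a factor of 1000 apart, and the 
result is always a power of ten.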

So, here's what I think we are trying to capture: a measure of how 
repeatable a measurement is (as opposed to how close a measurement is to 
its true value, which is accuracy). A common expression of precision is 
standard deviation. Precision can be calculated without reference to a 
known standard, while accuracy cannot. Precision is generally a 
property of the measuring device.
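
The distinction can be sketched in a few lines of Python. This is only an 
illustration under stated assumptions: the readings and the reference 
value are invented, and the helper names precision_sd and bias are 
hypothetical. Precision is computed from the repeated readings alone, 
while the accuracy estimate needs an externally known reference.

```python
from statistics import mean, stdev


def precision_sd(measurements):
    """Sample standard deviation: needs only the repeated readings."""
    return stdev(measurements)


def bias(measurements, reference):
    """Mean offset from a known standard: cannot be computed without `reference`."""
    return mean(measurements) - reference


# Hypothetical repeated readings of one object, in centimeters.
readings_cm = [17.1, 16.9, 17.0, 17.2, 16.8]
# Hypothetical accepted value for the same object, in centimeters.
reference_cm = 17.5

print("precision (sd):", round(precision_sd(readings_cm), 3), "cm")
print("bias:", round(bias(readings_cm, reference_cm), 3), "cm")
```

With these made-up numbers the readings scatter by well under half a 
centimeter yet sit about half a centimeter below the reference, so the 
instrument would be more precise than it is accurate.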

Here's an explanation of precision and accuracy from a high school 
physics tutorial:

 From 
http://www.carlton.paschools.pa.sk.ca/chemical/Sigfigs/accuracy_and_precision.htm:
----------
Precision indicates how close together or how repeatable the results 
are.  A precise measuring instrument will give very nearly the same 
result each time it is used.

Accuracy indicates how close a measurement is to the accepted value.
----------

 From http://www.angelfire.com/stars/dhsphysics/mathskills.html:
----------
Precision is how fine of a measurement that the measuring instrument is 
marked off for.  A typical meter stick or metric ruler has millimeter 
marks as its smallest marking. A yard stick’s smallest marks may be 1/8 
inches.  By estimating between the marks we can measure to half 
millimeters and to 1/16 of  an inch with these tools.  This is the limit 
of our precision with meter sticks and yard sticks.  But if we use these 
measuring tools in a sloppy fashion, say by letting the position of the 
end slip, the measurements will not be accurate to anywhere near a 
millimeter or an 1/8 of an inch.

Accuracy is how correct or true that a measurement is.  If I estimate 
the width of a room to be 4 meters, and it is, then that is an accurate, 
but not precise measurement.  If I let the ends slip on my meterstick, 
and measure the width of the room to be 4.2165 meters, that is a precise 
but inaccurate measurement.  Only if I am very careful, I can get an 
accuracy that equals my precision when using a meter stick.
----------

or, from http://webphysics.iupui.edu/NH/Projects/TEAMS%5B2%5D/err6.htm:
----------
Precision is the degree to which several measurements provide answers 
very close to each other. It is an indicator of the scatter in the 
data. The lesser the scatter, higher the precision.

Accuracy describes the nearness of a measurement to the standard or true 
value, i.e., a highly accurate measuring device will provide 
measurements very close to the standard, true or known values.
----------

Links to many more definitions like these can be found at:
http://www.chemistrycoach.com/math_skills_for_chemistry_tutori.htm#Accuracy%20and%20Precision

So from these definitions I surmise that precision is a measure of 
repeatability, and has nothing to do with accuracy (i.e., a measurement 
can be highly precise yet extremely inaccurate).

So, our current "precision" field in EML is a straightforward interval 
showing the spread of the measurements in the unit of measurement. 
This is common in physics (e.g., 17 +/- 1 cm).  I don't think we need to 
change the fundamental usage in EML, but we do need to clarify its 
definition.  Unfortunately, our definition doesn't state how the 
interval was calculated. A more exact measure of precision would be a 
standard deviation, in that we specify exactly how the precision is 
calculated. I think this is, however, overly prescriptive, and that the 
looser sense of precision as an interval in which measurements will fall 
is better.

Proposed definition for precision in EML:
Precision indicates how close together or how repeatable measurements 
are.  A precise measuring instrument will give very nearly the same 
result each time it is used. This means that someone interpreting the 
data should expect that if a measurement were repeated, most measured 
values would fall within the interval specified by the precision. The 
value of precision should be expressed in the same unit as the 
measurement. For example, for an attribute with unit "meter", a 
precision of "0.1" would be interpreted to mean that most repeat 
measurements would fall within an interval of 1/10th of a meter.
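
One way to read this definition is sketched below in Python. It is not 
part of the proposed definition and rests on loud assumptions: the 
interval is taken as a total width equal to the precision value, it is 
centered on the mean of the repeats, and the data are invented. The 
helper name fraction_within_precision is hypothetical.

```python
from statistics import mean


def fraction_within_precision(values, precision):
    """Fraction of repeats inside an interval of total width `precision`.

    Assumption: the interval is centered on the mean of the repeats.
    """
    center = mean(values)
    half_width = precision / 2
    inside = [v for v in values if abs(v - center) <= half_width]
    return len(inside) / len(values)


# Hypothetical repeat measurements of one length, unit "meter".
repeats_m = [2.31, 2.34, 2.29, 2.33, 2.45, 2.30]

# Stated precision of 0.1 in the same unit as the measurement.
share = fraction_within_precision(repeats_m, 0.1)
print(f"{share:.0%} of repeats fall within the stated precision")
```

Here five of the six invented repeats fall inside the interval, which 
fits the loose "most repeat measurements" wording without fixing a 
particular statistic such as a standard deviation.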

2) Should precision be required?
--------------------------------
Originally we felt that precision was fundamental to the expression of 
measured values, so much so that it should be required.  It really is 
fundamental information about the measurement, but it is clear from 
David's experience that it is commonly not available.  By making it 
required, we essentially prohibit people from providing other attribute 
metadata if precision isn't available, because their EML documents will 
not validate.

Thus, I propose we change precision to be optional, so that we capture 
more metadata now than we did before.  However, this does not mean that 
precision is not fundamental -- it just means that we would rather have 
the other metadata about an attribute than require precision.  We should 
strongly recommend that a measure of precision be provided without 
requiring it for valid EML.  This change would be backwards compatible 
with EML 2.0.0 because all existing EML 2.0.0 documents would still be 
valid under the new cardinality rule.
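
For software that reads EML, an optional precision mostly means handling 
its absence. The Python sketch below shows one way a reader might do 
that. The XML fragments are simplified, hypothetical attribute 
descriptions written for this example and are not validated against the 
EML schema; the function name read_precision is also an assumption.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical attribute fragments (not schema-validated).
WITH_PRECISION = """
<attribute>
  <attributeName>bodyLength</attributeName>
  <measurementScale>
    <ratio>
      <unit><standardUnit>meter</standardUnit></unit>
      <precision>0.1</precision>
    </ratio>
  </measurementScale>
</attribute>
"""

WITHOUT_PRECISION = """
<attribute>
  <attributeName>bodyLength</attributeName>
  <measurementScale>
    <ratio>
      <unit><standardUnit>meter</standardUnit></unit>
    </ratio>
  </measurementScale>
</attribute>
"""


def read_precision(attribute_xml):
    """Return the precision as a float, or None when it is not provided."""
    root = ET.fromstring(attribute_xml)
    node = root.find(".//precision")
    if node is None or node.text is None:
        return None
    return float(node.text)


print(read_precision(WITH_PRECISION))
print(read_precision(WITHOUT_PRECISION))
```

A document that supplies precision is read exactly as before, which 
mirrors the backwards-compatibility argument: only consumers that assumed 
the element is always present need to add the None case.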

Comments?

In order to track this issue and its resolution I am going to open a new 
bug in bugzilla for the precision field.

Matt

-- 
-------------------------------------------------------------------
Matt Jones                                     jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/    Fax: 425-920-2439    Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
University of California Santa Barbara
Interested in ecological informatics? http://www.ecoinformatics.org
-------------------------------------------------------------------



