Measurement scale in EML

Fri Feb 25 16:25:50 PST 2005

Dear Matt and Peter:

I have seen a lot of discussion recently about measurement scale and
temporal coverage.  It has been very helpful for our understanding of
EML.  Below are the questions and concerns that came up during my work
on our EML-based metadata.

1. About the Measurement scale

The measurementScale element is a little confusing.  I spent a lot of
time working on the measurementScale for nominal data.  Here is an
example of how I use measurementScale to describe nominal data in our
dataset, so you can tell me whether my implementation reflects a
correct understanding of this element.

We have a data table with four columns (attributes): recordID,
variable_name, variable_unit, and variable_value.  The values in the
variable_name column are the chemical and physical properties of sea
water that were measured, such as temperature, salinity, and nitrate.
The following is a sample from my EML file for this dataset.

<attribute>
  <attributeName>varName</attributeName>
  <attributeDefinition>Name of chemical or physical property measured</attributeDefinition>
  <storageType>String</storageType>
  <measurementScale>
    <nominal>
      <nonNumericDomain>
        <enumeratedDomain>
          <codeDefinition>
            <code>T</code>
            <definition>Temperature, unit: C</definition>
          </codeDefinition>
          <codeDefinition>
            <code>S</code>
            <definition>Salinity, unit: PPT</definition>
          </codeDefinition>
          <codeDefinition>
            <code>ST</code>
            <definition>Sigma-T, unit: KG/M**3</definition>
          </codeDefinition>
        </enumeratedDomain>
      </nonNumericDomain>
    </nominal>
  </measurementScale>
</attribute>
<attribute>
  <attributeName>varUnit</attributeName>
  <attributeDefinition>Unit of chemical or physical property measured</attributeDefinition>
  <storageType>String</storageType>
  <measurementScale>
    <nominal>
      <nonNumericDomain>
        <textDomain>
          <definition>*</definition>
        </textDomain>
      </nonNumericDomain>
    </nominal>
  </measurementScale>
</attribute>
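
To show what I expect the enumeratedDomain to be used for, here is a
minimal Python sketch that reads the codes from a trimmed copy of the
varName attribute above and reports column values that are not declared.
The helper names (enumerated_codes, unknown_values) and the sample
column values are my own assumptions for illustration, not part of EML
or of any EML tool.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the varName attribute from the sample above.
EML_FRAGMENT = """
<attribute>
  <attributeName>varName</attributeName>
  <measurementScale>
    <nominal>
      <nonNumericDomain>
        <enumeratedDomain>
          <codeDefinition>
            <code>T</code>
            <definition>Temperature, unit: C</definition>
          </codeDefinition>
          <codeDefinition>
            <code>S</code>
            <definition>Salinity, unit: PPT</definition>
          </codeDefinition>
          <codeDefinition>
            <code>ST</code>
            <definition>Sigma-T, unit: KG/M**3</definition>
          </codeDefinition>
        </enumeratedDomain>
      </nonNumericDomain>
    </nominal>
  </measurementScale>
</attribute>
"""


def enumerated_codes(attribute_xml):
    """Return {code: definition} for every codeDefinition in the fragment."""
    root = ET.fromstring(attribute_xml)
    return {
        cd.findtext("code").strip(): cd.findtext("definition").strip()
        for cd in root.iter("codeDefinition")
    }


def unknown_values(values, codes):
    """Return the distinct values that the enumerated domain does not declare."""
    return sorted(set(values) - set(codes))


codes = enumerated_codes(EML_FRAGMENT)

# Hypothetical column values; "O2" is deliberately not declared above.
sample_column = ["T", "S", "ST", "T", "O2"]
print(unknown_values(sample_column, codes))  # prints ['O2']
```

If this is the intended use, every value that can appear in the column
would need its own codeDefinition entry.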

My questions and concerns are:
(1) Is it appropriate to use the enumeratedDomain element to describe
varName?

(2) For varUnit, I don't think it is necessary to include a
measurementScale element.  However, since measurementScale is a
required field, I have to put something there in order to pass EML
validation, so I put a "*" in the definition element.  I have seen
other cases in which EML metadata developers use a "*" for the
definition element in the same way.  Obviously, the measurementScale
content given here provides no useful information about varUnit.

2. About information on the metadata itself

Based on my understanding of the EML schemas, the only information
associated with the metadata itself is the information about the
metadata provider(s).  However, my supervisors and I think that it is
important to provide other information about the metadata: when the
metadata document was created, whether further updates are needed and,
if so, the update frequency and the date of the last update.  These
pieces of information are particularly important when the endDate value
for a dataset from an ongoing project is going to change.  First, they
remind metadata providers and developers when they should update their
metadata.  Second, they tell metadata users whether the metadata
document provides the most current information about the dataset
described.
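
To make the idea concrete, here is a minimal Python sketch of how such
information could be used to flag a metadata document that may be out of
date.  The field names (created, last_update, update_interval_days) and
the dates are hypothetical; they are not EML elements, and I am only
assuming a fixed update interval in days.

```python
from datetime import date, timedelta

# Hypothetical "metadata about the metadata"; these field names are
# illustrative only and do not correspond to EML elements.
metadata_info = {
    "created": date(2004, 11, 1),
    "last_update": date(2005, 1, 15),
    "update_interval_days": 30,  # None would mean no further updates planned
}


def next_update_due(info):
    """Return the date the next update is due, or None if none is planned."""
    interval = info["update_interval_days"]
    if interval is None:
        return None
    return info["last_update"] + timedelta(days=interval)


def is_possibly_stale(info, today):
    """True when an update was due before `today` and has not been recorded."""
    due = next_update_due(info)
    return due is not None and today > due


print(next_update_due(metadata_info))                       # prints 2005-02-14
print(is_possibly_stale(metadata_info, date(2005, 2, 25)))  # prints True
```

A provider could run a check like this as a reminder, and a user could
run the same check to judge how current the metadata is.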

3. About the temporal coverage

We have many metadata records with an uncertain endDate because new
data are continuously being loaded into the dataset.  Whenever new data
are loaded, we have to change the values for the end date, the number
of records, and/or the size of the table.  I am wondering whether, and
when, you can provide a solution for this issue.
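
For reference, the manual step we repeat after each load looks roughly
like the following Python sketch.  The fragment is heavily simplified,
the dates and record counts are placeholders rather than real values,
and refresh_after_load is a hypothetical helper name of my own.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment; a real EML file has many more
# elements, and the values below are placeholders.
EML_FRAGMENT = """
<dataset>
  <coverage>
    <temporalCoverage>
      <rangeOfDates>
        <beginDate><calendarDate>2000-01-01</calendarDate></beginDate>
        <endDate><calendarDate>2004-12-31</calendarDate></endDate>
      </rangeOfDates>
    </temporalCoverage>
  </coverage>
  <dataTable>
    <numberOfRecords>1000</numberOfRecords>
  </dataTable>
</dataset>
"""


def refresh_after_load(eml_xml, new_end_date, new_record_count):
    """Return the XML with endDate and numberOfRecords set to the new values."""
    root = ET.fromstring(eml_xml)
    end = root.find(".//temporalCoverage/rangeOfDates/endDate/calendarDate")
    end.text = new_end_date
    root.find(".//dataTable/numberOfRecords").text = str(new_record_count)
    return ET.tostring(root, encoding="unicode")


updated = refresh_after_load(EML_FRAGMENT, "2005-02-25", 1250)
print(updated)
```

This only automates the edit.  It does not address the underlying
question of how to describe a dataset whose end date is still open.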

In addition, I learned from John's email that you held a KNB data
management workshop early this year.  I am very interested in this kind
of workshop, particularly workshops on the use of Metacat.  If you hold
this type of workshop in the future, please let me know.

Thank you very much for your support!

Xiaoping Wang

PMEL /NOAA