EML questions at HFR
David Blankman
dblankman at lternet.edu
Fri Apr 4 12:51:08 PST 2003
Hi Emery,
I am planning to be out in Boston sometime in the next 30 - 45 days.
In the meantime, I'll answer your questions as best I can. (Answers
follow each question.)
Emery R. Boose wrote:
> Hi David, James & Peter,
>
> I'm writing for guidance on a few (very simple, I suspect) EML questions.
>
> We're currently revising the Harvard Forest online Data Archive in
> preparation for our site review next summer. I'd like to incorporate
> project-level EML as part of this revision. At our annual LTER
> meeting in mid-February I circulated a metadata survey form to all of
> our researchers, and now have most of the necessary content in hand.
> I'd like to create a project-level EML template into which I can cut &
> paste using an XML editor (see attached hf001.xml for a first
> attempt). Though not very elegant, I think this approach will work
> fine for now and give us more time to think about long-term plans for
> managing our metadata and EML.
>
>
> The technical documents on ecoinformatics.org are quite helpful but
> I'm still puzzled on several basic points:
>
> (1) Dataset vs. project. Our data & metadata are organized according
> to (what we call) "project," which appears to correspond most closely
> to "dataset" in the world of EML. Is the "project" category in EML
> intended to provide broader information for a specific dataset, or is
> the project category really a broader entity that might encompass more
> than one dataset?
An EML <dataset> comprises one or more data entities. While there
are no specific standards for determining the contents of a dataset, the
basic guideline is that a dataset is composed of data entities that are
clearly related. For example, if the title is "Effects of Hurricanes on
Primary Productivity in New England and Puerto Rico", then the dataset
might include data entities like "NEWeather", "NEProductivity",
"PRWeather", and "PRProductivity".
Project is a somewhat fuzzier area. The primary intent of project is to
allow for the documentation of a broader research context than
just a dataset. Continuing the previous example, suppose that you have
datasets with the following titles: "Effects of
Hurricanes on Primary Productivity in New England and Puerto Rico",
"Effects of Hurricanes on Biodiversity in New England and Puerto
Rico", and "Effects of Hurricanes on Water Quality in New England and
Puerto Rico". Then a project might be: "Ecological Effects of Hurricanes
in New England and Puerto Rico".
Continuing on, there might be other similar datasets, like "Effects of
Hurricanes on Primary Productivity in Andrews Experimental Forest and
Florida Coastal Everglades" and
"Effects of Hurricanes on Biodiversity in Andrews Experimental Forest
and Florida Coastal Everglades", with a corresponding project of
"Ecological Effects of Hurricanes in Andrews Experimental Forest and
Florida Coastal Everglades".
This might produce something like:
<eml>
  <dataset>
    <title>Effects of Hurricanes on Primary Productivity in New
      England and Puerto Rico</title>
    <project>
      <title>Ecological Effects of Hurricanes in New England and
        Puerto Rico</title>
      <relatedProject>
        <title>Ecological Effects of Hurricanes in Andrews Experimental
          Forest and Florida Coastal Everglades</title>
      </relatedProject>
    </project>
  </dataset>
</eml>
or alternatively
<eml>
  <dataset>
    <title>Effects of Hurricanes on Primary Productivity in New
      England and Puerto Rico</title>
    <project>
      <title>Ecological Effects of Hurricanes</title>
      <relatedProject>
        <title>Ecological Effects of Hurricanes in Andrews Experimental
          Forest and Florida Coastal Everglades</title>
      </relatedProject>
      <relatedProject>
        <title>Ecological Effects of Hurricanes in New England and
          Puerto Rico</title>
      </relatedProject>
    </project>
  </dataset>
</eml>
A project could be something like an LTER Core Area or even the whole
LTER site research project. The project module is intended to place a
given dataset into a broader research context. Project is optional, but
if you have the information, it makes the metadata richer.
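The project sketches above leave out everything except titles. As a rough
illustration only, a slightly fuller <project> might look like the
fragment below. It assumes, as I read the EML 2.0 project module, that
each project (and each related project) carries at least one <personnel>
entry with a <role>. The person named here is a made-up placeholder, and
the fragment is not a complete, schema-valid document.

```xml
<project>
  <title>Ecological Effects of Hurricanes in New England and
    Puerto Rico</title>
  <!-- Placeholder person; substitute the real project personnel. -->
  <personnel>
    <individualName>
      <givenName>Jane</givenName>
      <surName>Researcher</surName>
    </individualName>
    <role>principalInvestigator</role>
  </personnel>
  <relatedProject>
    <title>Ecological Effects of Hurricanes in Andrews Experimental
      Forest and Florida Coastal Everglades</title>
    <!-- Placeholder person for the related project as well. -->
    <personnel>
      <individualName>
        <givenName>John</givenName>
        <surName>Researcher</surName>
      </individualName>
      <role>principalInvestigator</role>
    </personnel>
  </relatedProject>
</project>
```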
>
>
> (2) Scope of identifiers. I'd like to place personnel and
> publications information into separate EML files that are referenced
> from the dataset files (see attached hfpers.xml & hfpubs.xml). Is it
> necessary to wrap the personnel, publications, and dataset files into
> a single EML file (where scope = document)? *YES *Or can I implement
> these as separate EML files on the same directory (where scope =
> system and system = URL)?
References in EML 2.0 are internal to a document. An EML 2.0 document
knows only what is inside it. The only time that you can reference
something outside the EML document is when there is a <citation> element
or something like <dataset>/<distribution>/<online>/<url>.
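To make the document-scoped reference concrete, here is a minimal sketch
of one element pointing at another inside the same EML file. The id value
"person.1" and the person's name are made-up placeholders, and other
required elements are left out for brevity, so treat this as an
illustration under those assumptions and not as a complete, valid document.

```xml
<eml>
  <dataset>
    <title>Effects of Hurricanes on Primary Productivity in New
      England and Puerto Rico</title>
    <!-- The person is described once and given a document-scoped id. -->
    <creator id="person.1" scope="document">
      <individualName>
        <givenName>Jane</givenName>
        <surName>Researcher</surName>
      </individualName>
    </creator>
    <!-- Elsewhere in the same document, point back at that id. -->
    <contact>
      <references>person.1</references>
    </contact>
  </dataset>
</eml>
```

The <references> value has to match an id that appears in the same
document, which is why a separate personnel file on the same directory
cannot be the target.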
>
>
> (3) Multiple study sites. Many of our projects are comparative
> studies (e.g., hurricane impacts in New England and Puerto Rico). Is
> it possible to include spatial coverage information for two distinct
> sites at the project (dataset) level? Or is it necessary to move the
> spatial coverage information to the data entity level and repeat it
> (as appropriate) for each data entity?
You can do <geographicCoverage> at the dataset level. This approach
would be best if your data are integrated, that is, a single data file
includes data from both New England and Puerto Rico. Let's say a data
file looked like:
SITE  DATE        PRECIPITATION
NE    2002-12-12  .75
PR    2002-12-12  1.1
The geographic coverage might be something like:
<eml>
  <dataset>
    <geographicCoverage>
      <geographicDescription>New England</geographicDescription>
      <boundingCoordinates></boundingCoordinates>
    </geographicCoverage>
    <geographicCoverage>
      <geographicDescription>Puerto Rico</geographicDescription>
      <boundingCoordinates></boundingCoordinates>
    </geographicCoverage>
  </dataset>
</eml>
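The <boundingCoordinates> elements above are left empty to keep the sketch
short. Filled in, a single coverage entry might look roughly like the
fragment below. The coordinate values are rough placeholders in decimal
degrees (west and south negative), not surveyed bounds, and the enclosing
<coverage> element reflects my reading of where <geographicCoverage> sits
in the schema.

```xml
<dataset>
  <coverage>
    <geographicCoverage>
      <geographicDescription>Puerto Rico</geographicDescription>
      <!-- Placeholder bounding box; replace with the real site bounds. -->
      <boundingCoordinates>
        <westBoundingCoordinate>-67.3</westBoundingCoordinate>
        <eastBoundingCoordinate>-65.2</eastBoundingCoordinate>
        <northBoundingCoordinate>18.6</northBoundingCoordinate>
        <southBoundingCoordinate>17.9</southBoundingCoordinate>
      </boundingCoordinates>
    </geographicCoverage>
  </coverage>
</dataset>
```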
On the other hand, if the New England data were recorded in a
separate table from the Puerto Rico data, for example:
TABLE 1  NEweather
SITE  DATE        PRECIPITATION
NE1   2002-12-12  .75
NE2   2002-12-12  1.1
TABLE 2  PuertoRicoWeather
SITE  DATE        PRECIPITATION
PR1   2002-12-12  .75
PR2   2002-12-12  1.1
it might be better to do the following:
<eml>
  <dataset>
    <dataTable>
      <entityName>NEweather</entityName>
      <geographicCoverage>
        <geographicDescription>New England</geographicDescription>
        <boundingCoordinates></boundingCoordinates>
      </geographicCoverage>
    </dataTable>
    <dataTable>
      <entityName>PuertoRicoWeather</entityName>
      <geographicCoverage>
        <geographicDescription>Puerto Rico</geographicDescription>
        <boundingCoordinates></boundingCoordinates>
      </geographicCoverage>
    </dataTable>
  </dataset>
</eml>
A third alternative, and arguably the ideal solution (assuming that the
NE & PR data are in separate tables), would be to combine the two
approaches as follows:
<eml>
  <dataset>
    <geographicCoverage>
      <geographicDescription>New England</geographicDescription>
      <boundingCoordinates></boundingCoordinates>
    </geographicCoverage>
    <geographicCoverage>
      <geographicDescription>Puerto Rico</geographicDescription>
      <boundingCoordinates></boundingCoordinates>
    </geographicCoverage>
    <dataTable>
      <entityName>NEweather</entityName>
      <geographicCoverage>
        <geographicDescription>New England</geographicDescription>
        <boundingCoordinates></boundingCoordinates>
      </geographicCoverage>
    </dataTable>
    <dataTable>
      <entityName>PuertoRicoWeather</entityName>
      <geographicCoverage>
        <geographicDescription>Puerto Rico</geographicDescription>
        <boundingCoordinates></boundingCoordinates>
      </geographicCoverage>
    </dataTable>
  </dataset>
</eml>
>
> (4) Examples. Most or all of my questions could be answered by
> looking at a few well-chosen examples (many in fact have been answered
> by looking at the NTL web page). Are there other examples available
> for study? Perhaps a preliminary draft of the core EML specification?
I'm working on the core EML specification.
Also, we are working on "EML for Mere Mortals", which will provide
examples and guidance.
David Blankman
>
>
> Many thanks,
>
> Emery
--
David E. Blankman
Database Integration Developer
Long Term Ecological Research Network Office
University of New Mexico
801 University, SE #104
Albuquerque, NM 87106
(505) 272-7346 / (505) 272-7080 FAX