temporal coverage tags, alternativeTimeScale

Peter McCartney peter.mccartney at asu.edu
Fri Oct 22 14:43:11 PDT 2004


I'm one of the ones who disagrees with Matt, in that I think a dataset
can indeed be "ongoing", or actively accumulating new records, and that
users can simply use this information as a signal that they cannot make
the assumptions about coverage that he describes. Any user is of course
free to query such a dataset, "fix" its coverage at that point in
time and, with permission, archive the subset as a derived dataset that
is not "ongoing". However, I do agree with Matt that the recommendation
to enter "ongoing" in a field that otherwise should receive a date-typed
entry is a bad solution. To me a far better solution (though not one
that addresses all of Matt's objections) would be to simply leave off
the ending date element and use only the beginning date when selecting a
range of dates. This is what we do internally in CAP's catalog, but
unfortunately it is invalid in EML, as both are required. I agree it's a
bit awkward to query, but it's not that hard to write some code that
searches for
"beginDate <= 1988 and (endDate >= 1988 or endDate == null)"

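The query above can be sketched in Python. This is only an illustration,
not how CAP's catalog is implemented: the record layout (dicts with
begin_year and end_year keys), the sample records, and the function names
are all assumptions, with end_year set to None when the ending date has
been left off.

```python
from typing import Optional


def covers_year(begin_year: int, end_year: Optional[int], year: int) -> bool:
    """Return True if the coverage interval includes `year`.

    A missing end year (None) is treated as open-ended, which mirrors
    "beginDate <= year and (endDate >= year or endDate == null)".
    """
    return begin_year <= year and (end_year is None or end_year >= year)


# Hypothetical catalog records, for illustration only.
catalog = [
    {"id": "closed-1985-1988", "begin_year": 1985, "end_year": 1988},
    {"id": "open-since-1985", "begin_year": 1985, "end_year": None},
    {"id": "closed-1990-1995", "begin_year": 1990, "end_year": 1995},
]


def find_datasets(records, year):
    """Return the ids of all records whose coverage includes `year`."""
    return [
        r["id"]
        for r in records
        if covers_year(r["begin_year"], r["end_year"], year)
    ]


matches_1988 = find_datasets(catalog, 1988)
matches_2004 = find_datasets(catalog, 2004)
```

Note that the open-ended record matches a search for 2004 as well as for
1988, which is exactly the behavior Matt objects to; the sketch only
shows that the query itself is not hard to write.
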
There is a maintenance section in which one can convey the information
that the dataset is dynamic, and I do remember the discussion where it
was agreed that this was an appropriate spot to put that. Sticking things
like this in typed fields destroys the utility of strong typing in the
schema.

Until a better solution is made, why not just set endDate for all your
active datasets to the current year? Worst case scenario is you have to
update all your metadata annually to reflect the extended coverage.
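
One way such an annual update might be scripted is sketched below. It
assumes a simplified, namespace-free coverage fragment in which the end
date is stored as endDate/calendarDate; the sample document and the
extend_end_date helper are hypothetical, and a real EML file would need
to be checked against the schema after editing.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical coverage fragment (not a complete EML document).
SAMPLE = """<coverage>
  <temporalCoverage>
    <rangeOfDates>
      <beginDate><calendarDate>1985</calendarDate></beginDate>
      <endDate><calendarDate>1988</calendarDate></endDate>
    </rangeOfDates>
  </temporalCoverage>
</coverage>"""


def extend_end_date(xml_text: str, year: int) -> str:
    """Set every endDate/calendarDate earlier than `year` to `year`.

    Only the leading four digits of the existing value are compared, so
    both "1988" and "1988-06-01" are read as 1988. The replacement is a
    bare year, which drops any month and day that were present.
    """
    root = ET.fromstring(xml_text)
    for end in root.iter("endDate"):
        cal = end.find("calendarDate")
        if cal is None or not cal.text:
            continue
        if int(cal.text.strip()[:4]) < year:
            cal.text = str(year)
    return ET.tostring(root, encoding="unicode")


# A fixed year keeps the example deterministic.
updated = extend_end_date(SAMPLE, 2004)
unchanged = extend_end_date(SAMPLE, 1980)
```

In practice the year would presumably come from something like
datetime.date.today().year, and the script would only be run against
datasets that are known to still be active.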

Peter McCartney (peter.mccartney at asu.edu)
Center for Environmental-Studies
Arizona State University
 


> -----Original Message-----
> From: eml-dev-admin at ecoinformatics.org 
> [mailto:eml-dev-admin at ecoinformatics.org] On Behalf Of Matt Jones
> Sent: Friday, October 22, 2004 12:57 PM
> To: Margaret O'Brien
> Cc: eml-dev at ecoinformatics.org
> Subject: Re: temporal coverage tags, alternativeTimeScale
> 
> 
> Hi Margaret,
> 
> AlternativeTimeScale was written (originally by me for the BDP, and 
> modified in the EML2 process) to accommodate various stratigraphic and 
> geologic time scales.
> 
> The documentation stating to put "ongoing" in those fields (or
> anywhere) is a hack, and a bad one at that, that incorrectly uses
> alternativeTimeScale.
> 
> The problem is this: a data set that is ongoing isn't really ongoing
> from the data user's perspective, because at any given point in time
> the temporal coverage of the data returned is a fixed interval.  Many
> data managers don't want to deal with the fact that when their data
> sets change the metadata describing those data should also be updated.
> One way to get around updating is to say that data collection is
> "ongoing", but this is meaningless to a search engine.  Let me
> illustrate with an example.
> 
> Let's say I collect data in 1985, 86, 87, and 88.  I create a metadata
> document that says my temporal coverage is 1985-1988.  Someone queries
> the metadata to find all data collected in 1988, and my data set is
> returned in the list of matching results.  Perfect. Now, let's say I
> really intend to collect annually, so instead I change my metadata to
> say the temporal coverage is 1985-ongoing.  Someone does a search for
> data collected in 1988 -- should the metadata search engine return
> the data set as a match?  What if I search for 2004, should it return a
> match even though the only data are actually collected up to 1988?
> 
> This comes down to an issue of metadata maintenance.  Putting ongoing
> into any field is a horrible practice because 1) it reduces the
> information about the actual temporal coverage of the data that is
> currently available, and 2) it is highly prone to error because people
> WILL forget to update their metadata record when their data collection
> actually stops.  So from a best practices perspective, it's a horrible
> practice in my opinion.  If you want to indicate to someone that you
> intend to collect more data, by all means indicate that in the
> samplingDesign and other methods sections.  But not in
> temporalCoverage, which describes the data you have already collected,
> not what you plan to collect.
> 
> We've had this discussion in the EML group before.  I lost this battle
> before, and thus the hack was placed in the EML documentation.  Some
> people probably still support this practice.  I don't.
> 
> Hope this has been helpful.
> Matt
> 
> Margaret O'Brien wrote:
> > Hi -
> > I'd like someone to clear up some confusion about the temporal
> > coverage module, specifically, the alternativeTimeScale.  This has
> > an impact on a recommendation which will be made by the LTER EML
> > best practices group.
> > Since sites conduct time series, many data sets can be considered
> > 'ongoing', i.e., their data tables will be appended at some interval -
> > perhaps regularly, but not necessarily.
> > 
> > In the introduction to the module documentation for eml-coverage,
> > you state:
> > " In order to express an "ongoing" time frame, the end date in the
> > range would likely use the alternate time scale fields with a value
> > of "ongoing", whereas the begin date would use the specific calendar
> > date fields. "
> > 
> > Should we extend this statement to mean that for our time-series
> > datasets, an EML creator can populate the <timeScaleName> tag with a
> > value of "ongoing" and a <timeScaleAgeEstimate> tag with a
> > description of the update frequency for the data?
> > 
> > When we investigate the element definitions later in the document,
> > it seems that these tags are intended for stratigraphic data and
> > geologic time scales, which makes them inappropriate for an ongoing
> > time series dataset.
> > 
> > Can you clarify 1) how you intended the alternativeTimeScale tree to
> > be used, and 2) how you recommend we alert a reader that a dataset
> > is a time series and will most definitely be updated.
> > Thanks-
> > Margaret O'Brien
> 
> -- 
> -------------------------------------------------------------------
> Matt Jones                                     jones at nceas.ucsb.edu
> http://www.nceas.ucsb.edu/    Fax: 425-920-2439    Ph: 907-789-0496
> National Center for Ecological Analysis and Synthesis (NCEAS)
> University of California Santa Barbara
> Interested in ecological informatics? http://www.ecoinformatics.org
> -------------------------------------------------------------------
> _______________________________________________
> eml-dev mailing list
> eml-dev at ecoinformatics.org 
> http://www.ecoinformatics.org/mailman/listinfo/eml-dev
> 


