Calendar dates are interval scale
Peter McCartney
peter.mccartney at asu.edu
Thu Oct 31 07:33:50 PST 2002
grrrrr....this is not a very extensible solution, to come up with a new
element every time we come up with something different that stmml can't
handle. It tells me more about the inadequacies of stmml than anything else.
A unit is either one of a standard set of unit enumerations or it's not. Just put
these in the dictionary and extend stmml.xsd to accommodate them.
-----Original Message-----
From: Matt Jones [mailto:jones at nceas.ucsb.edu]
Sent: Wednesday, October 30, 2002 1:16 PM
To: Tim Bergsma
Cc: Eml-Dev (E-mail)
Subject: Re: Calendar dates are interval scale
Tim and other date-time fanatics,
Thanks for the comments. I was having similar problems myself. I agree
date-times are probably interval scale in some contorted way. I
discussed this on IRC with Chad, and we decided to try out your
recommendation, with some slight changes. Here's what we ended up with:
1) a bunch of new units for expressing durations according to their
nominal length (e.g., nominalMinute = 60 seconds, nominalHour = 3600
seconds, nominalDay = 86,400 seconds). People should use these for
attributes that contain durations like "18 minutes" or "4.56 days".
2) A new way of expressing unit and domain for date-time values.
Basically, now unit is a choice of standardUnit, customUnit, and
formattedDateTimeUnit. The formattedDateTimeUnit takes as content a
format representation of the date-time value that complies with the ISO
8601 format string rules (e.g., YYYY-MM-DD). This should be sufficient
information to allow software that understands the Gregorian calendar
and all of its idiosyncrasies to calculate differences between date-time
values. The precision for these values should always be 1 (it's sort of
implied by the format string). The domain of interval scale attributes
can now be of type DateTimeDomainType, which allows one to use date-time
values in the expression of the domain min and max.
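
As a minimal sketch of how software might use the two pieces above: the
nominal units reduce to fixed second counts, and a format string such as
YYYY-MM-DD can be translated into a parser pattern so that differences
between date-time values can be computed. The token table and the function
names below are hypothetical, chosen only for illustration; they are not
taken from eml-attribute.xsd or from ISO 8601.

```python
from datetime import datetime

# Nominal duration units from item 1, expressed in seconds.
NOMINAL_SECONDS = {
    "nominalMinute": 60,
    "nominalHour": 3600,
    "nominalDay": 86400,
}

# Assumed mapping from format-string tokens to strptime directives.
# Longer tokens come first so "YYYY" is matched before "YY".
TOKEN_TO_DIRECTIVE = [
    ("YYYY", "%Y"),
    ("CCYY", "%Y"),
    ("YY", "%y"),
    ("MM", "%m"),
    ("DD", "%d"),
    ("hh", "%H"),
    ("mm", "%M"),
    ("ss", "%S"),
]


def duration_in_seconds(value, unit):
    """Convert a duration in a nominal unit to seconds."""
    return value * NOMINAL_SECONDS[unit]


def to_strptime(fmt):
    """Translate a format string like 'YYYY-MM-DD' into a strptime pattern."""
    out = []
    i = 0
    while i < len(fmt):
        for token, directive in TOKEN_TO_DIRECTIVE:
            if fmt.startswith(token, i):
                out.append(directive)
                i += len(token)
                break
        else:
            # Separators pass through unchanged; a literal '%' is escaped.
            out.append("%%" if fmt[i] == "%" else fmt[i])
            i += 1
    return "".join(out)


def days_between(first, second, fmt):
    """Whole days from `first` to `second`, both written in format `fmt`."""
    pattern = to_strptime(fmt)
    delta = datetime.strptime(second, pattern) - datetime.strptime(first, pattern)
    return delta.days
```

For example, days_between("2000-02-28", "2000-03-01", "YYYY-MM-DD") accounts
for the leap day in 2000, which is the kind of calendar idiosyncrasy the
format string lets software handle.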
Take a look at eml-attribute.xsd and let me know what you think. The
lib/sample/eml-sample.xml has an example of the use of these structures
that I think would be common in datasets.
We're still cleaning up loose ends, but we're close.
Matt
Tim Bergsma wrote:
> I'm not on IRC, so if you want to hash this there, call me at
> 269-671-2337.
>
> We can't rehash forever, but this is a usability issue of the first
> order.
>
> There are two problems with yesterday's conference call consensus
> regarding datetime: 1) we provide no mechanism for handling durations;
> 2) calendar dates are interval scale not ordinal scale.
>
> Regarding durations, one might argue that we provide xs:duration in the
> kludge of the ordinal measurementScale. But I looked at the
> representation of xs:duration
> (http://www.w3.org/TR/xmlschema-2/#duration), and quite frankly, no one
> has duration data in that format! EML has to handle data like this:
>
> Watershed    YearOfClearCut  YearsToReforestation
> W3           1887            40
> LittleCreek  1910            35
> JasperRidge  1950            52
>
> -or-
>
> EggMass      DateOfLaying  DaysTillHatching
> DuckPond     5-15-2000     30
> LittleCreek  4-31-2000     18
> GullLake     6-1-2000      16
>
> Recommendation: we should provide categories in the unitDictionary such
> as nominalYears, nominalDays, nominalMonths, nominalHours, etc. (or
> YearsDuration, DaysDuration, etc) and define them in conventional terms,
> explicitly acknowledging lack of precision. For instance, a
> nominalMinute is 60 seconds, +/- 1 second. A nominalYear is 365
> nominalDays, +/- 1 day. xs:gYear is fine for YearOfClearCut, but
> xs:duration will not be adequate for YearsToReforestation.
>
> Regarding scale: I'm convinced that ordinal scales are simply ranked
> categories. You don't do math on ranked categories, other than to test
> for order relations. But we do lots of math on CalendarDates, such as
> taking the difference between two dates, or adding a duration to a
> date. The objection is raised that the duration of sub-units of the
> Calendar are not constant. True, but we do the math all the same, so
> it must be an interval scale. Actually, it is a deeply nested
> concatenation of interval scales of varying domain. But the scale is
> completely determined, and even naive calculations are valid, albeit
> with qualified precision, while sophisticated calculations are exact. I
> found one webpage that explicitly assigns calendar dates to interval
> scale: http://www.rattlesnake.com/notions/guttman-scales.html.
>
> So, modeling DateTime etc. under ordinal is wrong. But if we provide
> DateTime etc. under interval MeasurementScale, what are the units?
> DateTime does have units (year-month-day-hour-min-sec), but they are
> concatenated. The concatenation is a mechanism for traversing the
> nested tree of (arbitrary, often non-periodic) interval scales that
> comprise the calendar. I think, as someone suggested yesterday, we will
> have to provide a notation for indicating date format, such as
> CCYY-MM-DD or MM-DD-YY, etc. Applications will need the notation as a
> key for digesting date strings. We can't expect EML authors to change
> their data to conform to some format. Given the ubiquity of date/time
> data, we either have to enumerate some common formats (unit
> concatenations) or provide a notation for describing formats.
>
> And this just in...Campbell data loggers everywhere are storing dates as
> a field pair: Year and DayOfYear. This just proves that there are
> alternate ways of traversing a nested interval scale. This is perhaps
> our last opportunity to trap DayOfYear and do something meaningful with
> it. It is not a duration. It has exactly the same properties as
> xs:gMonthDay:
>
> "[Definition:] gMonthDay is a gregorian date that recurs, specifically
> a day of the year such as the third of May. Arbitrary recurring dates
> are not supported by this datatype. The ·value space· of gMonthDay is
> the set of calendar dates, as defined in § 3 of [ISO 8601].
> Specifically, it is a set of one-day long, annually periodic instances."
>
> Solutions welcome.
>
> Tim.
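
The date arithmetic described in the quoted message (adding a duration to a
date, and reading a Year/DayOfYear field pair as a calendar date) can be
sketched as follows. This is only an illustration under stated assumptions:
the function names are hypothetical, and the M-D-YYYY pattern is inferred
from the EggMass example table rather than from any EML notation.

```python
from datetime import date, datetime, timedelta


def from_year_and_day_of_year(year, day_of_year):
    """Turn a (Year, DayOfYear) field pair into a calendar date."""
    days_in_year = (date(year + 1, 1, 1) - date(year, 1, 1)).days
    if not 1 <= day_of_year <= days_in_year:
        raise ValueError(f"day {day_of_year} does not exist in {year}")
    # Day 1 is January 1, so the offset is day_of_year - 1.
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


def hatching_date(date_of_laying, days_till_hatching):
    """Add a duration in days to a date written as M-D-YYYY."""
    laid = datetime.strptime(date_of_laying, "%m-%d-%Y").date()
    return laid + timedelta(days=days_till_hatching)
```

Day 60 of a leap year and day 60 of a common year land on different calendar
dates, which is why DayOfYear only makes sense paired with its Year. Note
also that a strict parser rejects the 4-31-2000 value in the example table
with a ValueError, since April has 30 days.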
--
*******************************************************************
Matt Jones jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/ Fax: 425-920-2439 Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
Interested in ecological informatics? http://www.ecoinformatics.org
*******************************************************************