dateTime is a coordinate system

Tim Bergsma tbergsma at kbs.msu.edu
Fri Nov 1 10:32:13 PST 2002


Okay, RC3 is out, so this discussion is oriented more toward long-term
thinking than EML revision.

It's hard to imagine saying anything original about dateTime data at
this point.  But it struck me yesterday that a Calendar is really a
coordinate system.  It is a markup (in units) of a real, shared entity. 
By analogy, consider the shared surface of the earth.  You geospatial
people will appreciate better than I that attempts to impose a cartesian
grid (meters, miles, whatever) on a spherical object entail arbitrary
accommodations, which is why county baseline roads occasionally have
those bizarre doglegs.  Also note that WGS84 zone 16 has an arbitrary
maximum value for its Eastings domain (now I'm way out of my field).

Calendars have all the properties of a coordinate system.  There is a
real, shared entity...the vector of time.  There are attempts to mark up
time using standard units.  There are arbitrary adjustments built in to
these attempts, due to underlying inconsistency between the thing itself
and our model of it (e.g., a year is not a whole number of days).  Best
of all, there are "projection" algorithms for converting between
different coordinate systems (calendars, timezones, SAS, etc.).  Just as
lat/long seems to be the most robust model of the earth's surface,
"endless string of seconds" seems to be the most robust model of time. 
It's just terribly impractical.
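
As a rough illustration of "projection" between coordinate systems, here
is a minimal Python sketch.  It cites the posting time of this message in
two UTC offsets and projects both onto seconds since 1970.  The fixed
-08:00 offset and the variable names are assumptions made for the sketch,
and POSIX-style seconds ignore leap seconds, so this is only an
approximation of a true "endless string of seconds".

```python
from datetime import datetime, timedelta, timezone

# Two "coordinate systems" for the same time line. The fixed -08:00
# offset is an assumption for this sketch, not a full time-zone rule set.
utc = timezone.utc
pst = timezone(timedelta(hours=-8), "PST")

# One point in time, cited in PST and then re-projected into UTC.
posted_pst = datetime(2002, 11, 1, 10, 32, 13, tzinfo=pst)
posted_utc = posted_pst.astimezone(utc)

# Project both citations onto seconds since 1970-01-01T00:00:00 UTC.
seconds_pst = posted_pst.timestamp()
seconds_utc = posted_utc.timestamp()

print(posted_pst.isoformat())      # 2002-11-01T10:32:13-08:00
print(posted_utc.isoformat())      # 2002-11-01T18:32:13+00:00
print(seconds_pst == seconds_utc)  # True
```

The two labels differ, but they land on the same second, much as two map
projections can name the same spot on the ground.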

The point of this musing is to try to clarify the relationship between
dateTimes and units.  When you measure the length of a blade of grass,
you report its physical extent, in comparison to the physical extent of
some standard called a unit.  When you measure the duration of a bird
song, you report its temporal extent, in comparison to the temporal
extent of some standard called a unit.  But when you report a dateTime,
what have you measured?  Nothing.  You are citing a point on the
coordinate system.  It is the coordinate system itself that is a markup,
in standard units, of the shared reality called time.  So there is a
sense in which a dateTime has no units, and an indirect sense in which
it does have units, which is why we keep hearing both answers.

Viewing dateTime as a coordinate system also clarifies (?) assignment of
measurement scale.  Durations are obviously ratio, because they have a
natural zero.  10 years old is twice as old as 5 years old.  But
dateTimes, using (?) the same units, hardly seem to qualify for
interval, what with their arbitrary leaps and transitions.  Again, a
dateTime doesn't measure anything:  it specifies a location on a
coordinate system (in that sense, dateTimes are clearly ordinal). 
Useful measurements can be made, however, from comparing two points on a
coordinate system.  Arbitrariness is handled by projecting the points
onto an evenly divided, "true" interval scale, such as a string of named
seconds starting in 1970.  So dateTime doesn't have a measurement scale,
because it isn't a measurement.  But the underlying entity (time) can be
measured on an interval scale, so dateTimes can be used to make
(indirect) measurements of temporal extent.  Similarly, boats don't have
odometers, but you can know how far you've sailed by comparing two GPS
points.  The points themselves do not measure extent, but extent can be
calculated if the coordinate system is known.  Since the experience of
unique points in time cannot be repeated, it is often more practical to
measure temporal extent by comparison within a coordinate system rather
than directly using an appropriate chronometer.
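
The GPS analogy can be sketched in Python: take two points, project each
onto seconds since 1970, and subtract.  The helper name seconds_between
and the sample dates are hypothetical, chosen only to show that two spans
that both read "one year" on the calendar have different extents.

```python
from datetime import datetime, timezone

def seconds_between(start, end):
    """Project both points onto seconds since 1970, then subtract."""
    return end.timestamp() - start.timestamp()

utc = timezone.utc
a = datetime(1999, 3, 1, tzinfo=utc)
b = datetime(2000, 3, 1, tzinfo=utc)
c = datetime(2001, 3, 1, tzinfo=utc)

# Each pair is "one year" apart on the calendar, but the first span
# contains 29 February 2000 and so covers one more day.
days_ab = seconds_between(a, b) / 86400
days_bc = seconds_between(b, c) / 86400

print(days_ab)  # 366.0
print(days_bc)  # 365.0
```

Neither point measures anything by itself; the extent only appears once
both are placed on the evenly divided scale.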

So, the question in EML has been whether to represent dateTime according
to its true nature (ordinal) or its expected use (interval).  I think it
was justified to provide special handling for dateTime.
Tim.




 
David Blankman wrote:
> 
> I participated in only part of the dateTime discussion. At first it
> seemed to me that treating dates as ordinal made sense, but I think
> that Tim has a valid point: use the guiding principle of usability
> rather than philosophical correctness.  While it is true that time is
> very problematic when looked at from a deep-structure philosophical
> perspective, it is also true that people add and subtract dates all the
> time. Most of ecology functions in a Newtonian world, free from quantum
> mechanical and relativistic paradoxes and fluidity of time and space.
> 
> Many measurements done in ecology have limited precision. A 1-meter plot
> is probably 1-meter plus or minus some centimeters (2 - 3 cm maybe).
> Coverage estimates have even less precision. Why should we be concerned
> about an even smaller lack of precision in dates?
> 
> Processing systems can be built to factor in the complex calendar rules.
> EML doesn't have to do that.
> 
> I think that Tim's suggestion of units makes great sense, as long as the
> person documenting a dataset does not imply greater precision than is
> appropriate.
> 
> David
> 
> Tim Bergsma wrote:
> 
> >I'm not on IRC, so if you want to hash this there, call me at
> >269-671-2337.
> >
> >We can't rehash forever, but this is a usability issue of the first
> >order.
> >
> >There are two problems with yesterday's conference call consensus
> >regarding datetime:  1) we provide no mechanism for handling durations;
> >2) calendar dates are interval scale not ordinal scale.
> >
> >Regarding durations, one might argue that we provide xs:duration in the
> >kludge of the ordinal measurementScale.  But I looked at the
> >representation of xs:duration
> >(http://www.w3.org/TR/xmlschema-2/#duration), and quite frankly, no one
> >has duration data in that format!  EML has to handle data like this:
> >
> >Watershed      YearOfClearCut  YearsToReforestation
> >W3             1887            40
> >LittleCreek    1910            35
> >JasperRidge    1950            52
> >
> >-or-
> >
> >EggMass                DateOfLaying    DaysTillHatching
> >DuckPond       5-15-2000       30
> >LittleCreek    4-31-2000       18
> >GullLake       6-1-2000        16
> >
> >Recommendation:  we should provide categories in the unitDictionary such
> >as nominalYears, nominalDays, nominalMonths, nominalHours, etc. (or
> >YearsDuration, DaysDuration, etc) and define them in conventional terms,
> >explicitly acknowledging lack of precision.  For instance, a
> >nominalMinute is 60 seconds, +/- 1 second. A nominalYear is 365
> >nominalDays, +/- 1 day.  xs:gYear is fine for YearOfClearCut, but
> >xs:duration will not be adequate for YearsToReforestation.
> >
> >Regarding scale: I'm convinced that ordinal scales are simply ranked
> >categories.  You don't do math on ranked categories, other than to test
> >for order relations.  But we do lots of math on CalendarDates, such as
> >taking the difference between two dates, or adding a duration to a
> >date.  The objection is raised that the durations of sub-units of the
> >Calendar are not constant.  True, but we do the math all the same, so
> >it must be an interval scale.  Actually, it is a deeply nested
> >concatenation of interval scales of varying domain.  But the scale is
> >completely determined, and even naive calculations are valid, albeit
> >with qualified precision, while sophisticated calculations are exact.  I
> >found one webpage that explicitly assigns calendar dates to interval
> >scale: http://www.rattlesnake.com/notions/guttman-scales.html.
> >
> >So, modeling DateTime etc. under ordinal is wrong.  But if we provide
> >DateTime etc. under interval MeasurementScale, what are the units?
> >DateTime does have units (year-month-day-hour-min-sec) , but they are
> >concatenated.  The concatenation is a mechanism for traversing the
> >nested tree of (arbitrary, often non-periodic) interval scales that
> >comprise the calendar.  I think, as someone suggested yesterday, we will
> >have to provide a notation for indicating date format, such as
> >CCYY-MM-DD or MM-DD-YY, etc.  Applications will need the notation as a
> >key for digesting date strings.  We can't expect eml authors to change
> >their data to conform to some format. Given the ubiquity of date/time
> >data, we either have to enumerate some common formats (unit
> >concatenations) or provide a notation for describing formats.
> >
> >And this just in...Campbell data loggers everywhere are storing dates as
> >a field pair:  Year and DayOfYear.  This just proves that there are
> >alternate ways of traversing a nested interval scale.  This is perhaps
> >our last opportunity to trap DayOfYear and do something meaningful with
> >it.  It is not a duration.  It has exactly the same properties as
> >xs:gMonthDay:
> >
> >"[Definition:]   gMonthDay is a gregorian date that recurs, specifically
> >a day of the year such as the third of May. Arbitrary recurring dates
> >are not supported by this datatype. The ·value space· of gMonthDay is
> >the set of calendar dates, as defined in § 3 of [ISO 8601].
> >Specifically, it is a set of one-day long, annually periodic instances."
> >
> >Solutions welcome.
> >
> >Tim.
> >
> >
> 
> --
> David E. Blankman
> Database Integration Developer
> Long Term Ecological Research  (LTER) Network Office
> University of New Mexico
> 801 University, SE #104
> Albuquerque, NM 87106
> Phone 505/272-7346  fax 505/272-7080
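
To make the quoted point about format notation concrete, here is a small
Python sketch of how an application might use a declared notation as the
key for digesting date strings, including a Year/DayOfYear pair.  The
mapping table, the "CCYY DDD" spelling for day of year, and the function
names are all assumptions for illustration, not anything defined by EML.
The rows come from the EggMass table quoted above.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from a date-format notation to strptime directives.
# "CCYY DDD" (year plus day of year) is a spelling invented for this sketch.
NOTATION_TO_STRPTIME = {
    "CCYY-MM-DD": "%Y-%m-%d",
    "MM-DD-CCYY": "%m-%d-%Y",
    "MM-DD-YY": "%m-%d-%y",
    "CCYY DDD": "%Y %j",
}

def parse_date(text, notation):
    """Digest a date string using the declared notation as the key."""
    return datetime.strptime(text, NOTATION_TO_STRPTIME[notation]).date()

def hatch_date(date_of_laying, days_till_hatching):
    """Add a duration in days to a date cited as MM-DD-CCYY."""
    laid = parse_date(date_of_laying, "MM-DD-CCYY")
    return laid + timedelta(days=days_till_hatching)

# Rows from the EggMass example: (EggMass, DateOfLaying, DaysTillHatching).
rows = [
    ("DuckPond", "5-15-2000", 30),
    ("LittleCreek", "4-31-2000", 18),
    ("GullLake", "6-1-2000", 16),
]

for name, laid, days in rows:
    try:
        print(name, hatch_date(laid, days).isoformat())
    except ValueError:
        # strptime rejects strings that are not points on the calendar.
        print(name, "not a calendar date:", laid)

# A Campbell-style Year and DayOfYear pair names the same kind of point.
print(parse_date("2000 136", "CCYY DDD").isoformat())  # 2000-05-15
```

Note that the LittleCreek row as written (4-31-2000) does not name a point
on the Gregorian calendar, so the parser rejects it rather than guessing.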

-- 
Tim Bergsma
LTER Information Manager
W.K. Kellogg Biological Station
Michigan State University
Hickory Corners, MI   49060
616/671-2337
tbergsma at kbs.msu.edu
http://lter.kbs.msu.edu


