[Bug 544] - issues about storageType and attributeDomain
bugzilla-daemon@ecoinformatics.org
Mon Sep 2 10:29:37 PDT 2002
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=544
jones at nceas.ucsb.edu changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |eml-dev at ecoinformatics.org
------- Additional Comments From jones at nceas.ucsb.edu 2002-09-02 10:29 -------
Thanks for the comments on these data typing issues, Dan. There are two
distinct issues you raised, which I will address separately:
1) Enumerated domain doesn't allow a simple list without definitions
This is true, and intentional. When data are distributed, it is critical to
know the definitions for the string values that are present in the data
entity. The string values in an enumerated list are generally codes that
represent some type of measurement (e.g., HIGH, MEDIUM, LOW), or names of
sampling locations (e.g., SUBPLOT4).
In either case, it is critical to have the definition. From a data re-use
or data preservation perspective, can you show a case where it would be
acceptable to not have a definition of an enumerated value? If so, I would
agree that we should consider relaxing this requirement, but for now I think
it is a fundamental part of the definition of an enumerated attribute.
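As a rough illustration of this requirement (not part of EML itself), an
enumerated domain can be thought of as a mapping from each code to its
definition, where a code without a definition is flagged. The function name,
the variable names, and the example definitions below are hypothetical; only
the codes HIGH, MEDIUM, and LOW come from the discussion above.

```python
# Hypothetical sketch: an enumerated domain modeled as code -> definition.
# None of these names come from the EML schema.

def missing_definitions(enumerated_domain):
    """Return the codes whose definition is absent or blank."""
    return sorted(
        code
        for code, definition in enumerated_domain.items()
        if definition is None or not definition.strip()
    )

# Codes are the examples from this message; the definitions are invented.
cover_class = {
    "HIGH": "More than 66% ground cover",
    "MEDIUM": "33% to 66% ground cover",
    "LOW": "",  # a bare code with no definition
}

undefined_codes = missing_definitions(cover_class)
print(undefined_codes)
```

A data consumer who receives only the bare code LOW cannot tell what was
measured, which is the situation the definition requirement is meant to
prevent.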
2) XML Schema data types used in storageType overlap with attributeDomain
Also true, but the two fields serve different purposes.
The storageType of an attribute is an indication of the type that might be
used to represent the value in a data management system, such as
a database or programming language. It is not actually an
expression of the true domain, as it may in fact be defined slightly
differently than the attributeDomain (e.g., storageType might be "character"
while the domain might be a restricted list of character values).
That we recommend XML Schema Datatypes (which allow restrictions) for the
storageType does not change the need for an independent specification of the
domain. If someone were to use a different type system for the storageType,
especially one which lacks the restriction capabilities that XML Schema
Datatypes have, then eliminating attributeDomain would be problematic.
So, basically, attributeDomain is a required expression of the domain, while
storageType is an optional expression of the likely type from some
(hopefully common) type system (e.g., Oracle datatypes, Java datatypes,
XML Schema data types). One might think of storageType as a hint to
automated processing systems as to how one might represent the values of
the attribute. storageType was originally repeatable, and one might
argue that it should be repeatable again so that the types from multiple
systems can be indicated. I think that would be a positive change.
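The distinction can be sketched as follows, assuming a simplified model in
which the domain is required and the storage types are an optional,
repeatable list of hints. The class and field names are hypothetical and are
not taken from the EML schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Attribute:
    """Hypothetical model of an attribute; not the EML schema itself."""
    name: str
    # Required: the true domain, here an enumerated list (code -> definition).
    domain: Dict[str, str]
    # Optional and repeatable: (type system, type name) hints for processors.
    storage_types: List[Tuple[str, str]] = field(default_factory=list)

    def accepts(self, value: str) -> bool:
        """A value is valid only if it belongs to the domain."""
        return value in self.domain


attr = Attribute(
    name="cover_class",
    domain={
        "HIGH": "More than 66% ground cover",    # invented definition
        "MEDIUM": "33% to 66% ground cover",     # invented definition
        "LOW": "Less than 33% ground cover",     # invented definition
    },
    storage_types=[
        ("XML Schema Datatypes", "string"),
        ("Java", "String"),
    ],
)

# "VERY HIGH" would fit either storage type, but it is outside the domain.
print(attr.accepts("HIGH"), attr.accepts("VERY HIGH"))
```

The storage types only say how a value might be held; whether a value is
legitimate is decided by the domain alone.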
In summary, although you make cogent points, I don't think that we should make
substantial changes to the model at this time. I will, however, revise the
schemas to try to clarify the documentation with respect to these issues, and to
make storageType repeatable. Comments? In the absence of further comments,
I'll close this bug this week. Thanks.