[Bug 484] - eml-attribute changes needed
Chad Berkley
berkley at nceas.ucsb.edu
Thu May 30 14:20:43 PDT 2002
See my comments inline below:
On Thu, 2002-05-30 at 13:09, bugzilla-daemon at ecoinformatics.org wrote:
> http://bugzilla.ecoinformatics.org/show_bug.cgi?id=484
> ------- Additional Comments From peter.mccartney at asu.edu 2002-05-30 13:09 -------
> Here are some comments.
>
> 1) under storageType, the prefix xs: should not be expected since prefixes
> for content models are defined by the individual schemas that import them and
> this is not the context in which these will be used. so i would expect people
> to type in "string" and not "xs:string", or "xsd:string"
>
Agreed, as long as the exact type name that follows the namespace prefix
is used, including case. I think this is an application issue.
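A minimal sketch of how an application might handle this, assuming it simply
drops whatever prefix was typed and then requires an exact, case-sensitive
match. The function name and the short list of type names are hypothetical and
only illustrative; they are not part of eml-attribute.

```python
# Illustrative subset of XML Schema built-in type names (not a complete list).
XSD_TYPE_NAMES = {
    "string", "integer", "decimal", "float",
    "double", "boolean", "date", "dateTime",
}

def normalize_storage_type(value):
    """Drop any namespace prefix (e.g. 'xs:' or 'xsd:') and return the local
    name, or None if the local name is not an exact, case-sensitive match."""
    local_name = value.strip().rsplit(":", 1)[-1]
    return local_name if local_name in XSD_TYPE_NAMES else None
```

With this approach "string", "xs:string" and "xsd:string" all resolve to the
same storageType, while "String" is rejected because the case differs.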
> 2) I like the spirit behind your proposed unit field, but i feel the same about
> it as i do about connection URLs - i don't believe people will understand it
> well enough to use it. Typing in "http://ecoinformatics.org/unitDictionary?"
> for every entry seems a bit awkward and unnecessary. Just like with connection
> URLs, users will require both a wizard processor to help them construct it as
> well as processing code to interpret it, and if the dictionary isn't shipped
> with eml, then you've now created a dependency on a web address that we can't
> guarantee will always be there. What i thought was going to come out of the
> sevilleta discussion was an element that was either free choice or an
> enumeration based on a list of stmml type names that are taken from this
> directory.
Do you really think that someone who doesn't know EML well is going to
sit down with a text editor and fill out these fields by hand? I
seriously don't. I wouldn't even do that. Like I said in the previous
note on this bug, this would be filled out by some application with a
hash that maps common unit names to URIs. The user should never need to
know that the URI exists. When you sit down with XMLSpy and create a
schema, it creates URIs just like this. You never need to know they are
there, but the XMLSchema namespace depends on them.
I think the first part of the URI is necessary so that other unit
dictionaries can be used if need be.
I would argue that the dictionary should be shipped with EML so that it
can be parsed and used by any application that uses EML. Also note
that the URI is not a true active URI. You are not actually linking to
a live web page where the dictionary exists. It is merely a way to
create a unique identifier for the dialect that you are using.
The problem with having this filled in free form is that you can't keep
people from filling in all sorts of different things for the units,
like meters/s2 or m/s2 or met/sec^2. All of those mean the same thing, but
how would an automated system know that? If we want to do any automated
processing using this metadata, we must have a standard vocabulary for
describing units. This is fundamental to most if not all automation
engines. There is also no way to successfully integrate two datasets if
the units cannot be compared.
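A rough sketch of the kind of hash an application could carry, so that the
user types a familiar label and never sees the URI. The base string is the one
from the example in this thread; the canonical name "metersPerSecondSquared",
the way it is appended to the base, and both function names are assumptions
made only for illustration.

```python
# Base taken from the example in this thread; it is an identifier, not a live page.
UNIT_DICTIONARY = "http://ecoinformatics.org/unitDictionary?"

# Hypothetical table: free-form spellings a user might type -> one canonical name.
CANONICAL_UNITS = {
    "meters/s2": "metersPerSecondSquared",
    "m/s2": "metersPerSecondSquared",
    "met/sec^2": "metersPerSecondSquared",
}

def unit_uri(label, dictionary=UNIT_DICTIONARY):
    """Return the URI for a user-typed unit label, or None if it is unknown."""
    canonical = CANONICAL_UNITS.get(label.strip())
    return dictionary + canonical if canonical else None

def same_unit(a, b):
    """True only when both labels resolve to the same dictionary entry."""
    uri_a, uri_b = unit_uri(a), unit_uri(b)
    return uri_a is not None and uri_a == uri_b
```

Two datasets that recorded "m/s2" and "met/sec^2" would then compare as the
same unit, and an unrecognized label is reported as unknown instead of being
silently accepted.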
>
>
> The problem seems similar to me to the spatial reference module, in which
> projections that people frequently refer to by a name "UTM zone 12" actually
> require a fairly complex set of terms and references to standard algorithms. In
> eml-spatialReference, these are encoded as complex types that define which
> parameters need to be filled in. I understand that part of what you want to do
> is allow a syntax for people to build their own data types using the
> established ontology, but i think they will not respond well to the URI model
> for doing that. I'm afraid of them simply not filling it in if they can either
> type in the name they use or pick it from a controlled list.
>
> Related to this section, i have a question regarding some fields from ISO that
> i am trying to eliminate on the grounds that we cover them elsewhere. for
> raster cells, ISO defines cellattributedescription, cellvalueunits,
> tonegradation, scalefactor and offset. the first, and perhaps all of these are
> covered in eml-attribute.xsd. cellvalueunits has a list of codes that i think
> could be identified as an externalcodeset domain if that is brought back (see
> below). tone gradation is the number of colors(64 colors, 256 colors, etc). i
> think this could be gotten from storageType, but maybe we need something in
> enumeratedDomain for numberOfUniqueValues?.
I'm not sure that I know what you mean. I think you want to map these
ISO fields to eml fields, right? If you do, I don't know if this
information belongs in attribute. Is
rasterImage/cellattributedescription semantically equivalent to
attribute/attributeDescription? What do you propose as a map of ISO
fields -> eml-attribute fields?
> scale factor and offset are for any
> scale multipliers or delta constants that have been applied to the values. i
> think they mean transformations done to allow expression of values that are
> either larger or have a broader range than can be accommodated by the storage
> type used (ie using a byte data type for annual accumulation in thousands of
> inches). does stmml have a way of dealing with this or do we need to leave these
> in?
>
I don't believe that stmml handles this. See if any of the examples in
http://www.xml-cml.org/stmml look right.
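For reference, a small sketch of what scale factor and offset would mean to a
processing application, assuming the common convention that the real value is
stored * scaleFactor + offset. That convention, the function names, and the
0-255 byte range are assumptions for illustration; the ISO definition is not
quoted in this thread.

```python
def decode(stored, scale_factor=1.0, offset=0.0):
    """Recover the real value from the stored value (assumed convention)."""
    return stored * scale_factor + offset

def encode(value, scale_factor=1.0, offset=0.0, low=0, high=255):
    """Pack a real value into an unsigned byte, rejecting values that do not fit."""
    stored = round((value - offset) / scale_factor)
    if not low <= stored <= high:
        raise ValueError("value does not fit in the storage type")
    return stored
```

For example, with a scale factor of 0.5 and an offset of 100.0, the value 110.0
is stored as 20 and decodes back to 110.0, while 500.0 is rejected because it
does not fit in a byte.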
>
> 3) the sequence portion of enumeratedDomain needs to repeat so that multiple
> codes can be entered. I don't think each code needs a separate source, but
> that's a minor point. My recollection from sevilleta was that we were going to
> let enumerated domain include a choice between providing a value list,
> providing a reference to an external codeset (codeSetName,
> codeSetURI?,codeSetCitation?) or a reference to an entity within the dataset
> whose data define the domain (entity, codeAttribute, codeDefinitionAttribute,
Agreed. The original note from the Sevilleta meeting was:
8) move "textDomain" and "enumeratedDomain" up so that they are siblings
of numeric domain, remove the choice
It looks like the only thing that didn't happen was removing the
choice.
chad