status of EML 2.0
Matt Jones
jones at nceas.ucsb.edu
Tue Aug 20 09:11:57 PDT 2002
EML-dev'ers,
Thanks for the agenda, Chad. A couple of suggestions. I'm sure that
everyone is aware that there is far too much material for us to review
in a single conference call, much less to resolve the issues. So, I
think the main goals of the conference call should be to 1) identify the
major blocking issues, 2) identify people who will propose a concrete
solution to those and other issues, and 3) establish a timeline for
releases so we have some target dates.
I would suggest that everybody spend some time reviewing their notes and
make sure that every outstanding issue is entered into Bugzilla. That
way, we can assign the Bugzilla tasks and track the
resolution of each issue there. Personally, I find that long
email threads are not particularly effective at helping us reach
resolution because it is difficult to read the history, so I prefer
the Bugzilla approach. Many of the issues that have been brought up are
already in Bugzilla, and you might want to add some notes there so that
everyone's needs are clear. If an issue that is important to you is not
in Bugzilla, then please enter it: this can be as simple as cutting and
pasting the text from an earlier email message. The current list of EML
issues can be seen in Bugzilla at:
http://bugzilla.ecoinformatics.org/buglist.cgi?product=EML&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED
Anybody who is interested in the resolution of a particular issue should
feel free to create an account and add themselves to the cc list for
that bug so that they get all of the email traffic about the issue.
Cheers,
Matt
Chad Berkley wrote:
>>
>>------------------------------------------------------------------------
>>
>>EML Conference Call Agenda
>>Wednesday Aug. 21, 2002 9:00 am.
>>Participant list as of Monday Aug. 19 at 14:00: Tim Bergsma, Peter McCartney,
>>Ken Ramsey, David Blankman, Matt Jones, Scott Chapal, Owen Eddins, Chad Berkley
>>--------------------------------------------------------------------------------
>>
>>Bugs:
>>-----
>>please see: http://bugzilla.ecoinformatics.org/buglist.cgi?product=EML&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED
>>1) missing recursive link within project to a related project description
>>2) missing recursive link within protocol to reference an existing protocol
>>3) the ascii fixed section of physical doesn't work, nor does it support records
>> with multiple physical lines. We've already defined a structure that does
>> this.
>>4) improper distinction between precision and accuracy in eml-attribute
>> (bug 484)
>>5) spelling errors in eml-constraint. *Occurances should be *Occurences
>> (bug 486)
>>6) bug 485 re: eml-physical:
>> a.I still suggest changing dataFormat. FormatName is only needed if you
>> are NOT providing the parsing information inline. This structure is
>> confusing because someone could enter ascii fixed info but also enter
>> dbase under format.
>> b.Distribution element repeats and contains a repeating choice. This
>> doesn't make sense unless there's something about the inline element that might
>> occur more than once per entity.
>> c.Ascii fixed won't work as it is. Start column needs to repeat with field
>> length and you need to add physical record information.
>> d. drop genericBinary unless someone has a definition for it
>>7) eml-dev posting from Tim: <distribution> is defined
>> somewhat differently under <resourceGroup> vs. <physical>, i.e. no
>> <inline> option.
>>8) using facets for storageType and attributeDomain in eml-attribute (bug 544)
>> **the description for this is long so please just look at the bug:
>> http://bugzilla.ecoinformatics.org/show_bug.cgi?id=544
>>9) We need to change the namespaces in EML 2 to include the text
>> 'ecoinformatics.org'. This will ensure uniqueness (bug 556).
>>
>>Structural Issues:
>>------------------
>>1) eml-literature proposed changes (peter's comments bug 492):
>> a.Drop section from book
>> b.Drop editors from edited book
>> c.Add bookChapter or bookSection
>> d.Drop conference proceedings. If these are published, then they are a
>> book. The information about the event and venue is part of the title.
>> e.Drop publicationPlace? The locational information is already in
>> publisher.
>> f.Drop presentationPlace. Move the proceedings information from
>> conferenceProceedings to this module.
>> g.Drop institution from report. Institutional affiliation of authors is
>> already in the RP information of the authors.
>> h.Make report number optional. This may be part of the title or
>> nonexistent.
>> i.Drop publisher from thesis. If it is published then it is a book.
>> j.Drop software package. This is covered under eml-software
>> k.Drop the unnecessary sequence element containing access and project.
>>2) citation is imported "weirdly" into eml-coverage (bug 491)
>>3) comments from Tim (bug 482):
>> <dataset> should have an optional <protocol> child. Currently it
>> does not. <project>, <dataTable>, and <attribute> have the <protocol>
>> option, but not <dataset>. Peter McCartney defined 'dataset' as "the
>> product of a discrete research activity" (6-20-2002). It is very
>> natural to suppose that a discrete research activity has a protocol. As
>> things are now, dataset protocols must be associated with their
>> entities, which makes it awkward to represent a protocol which
>> effectively corresponds to several entities. For instance, a bird
>> survey protocol could generate a table of weather conditions and a table
>> of sightings, maybe even a set of audio recordings. Such a protocol is
>> represented more naturally at the dataset level than at the entity
>> level.
>>4) Paragraph tag needs internal structural formatting (bug 557). There are
>> several approaches to this:
>> a. Decompose structured text into a series of <paragraph>.
>> b. Inject structured text, with its native markup, as a CDATA block in
>> <paragraph>.
>> c. Make <paragraph> nestable.
>>5) comments from Tim on eml-dataTable (bug 539)
>> a. An <alternateIdentifier> for <dataTable> would be useful.
>> b. An <additionalMetadata> for <dataTable> would be useful. Several
>> of us have a 'Comments' field associated with our dataTables, but no
>> natural place to put them.
>>6) eml-protocol changes needed. Please see Tim's comments in bug 489 and
>> Peter's comments in his recent email (item 6).
>> http://bugzilla.ecoinformatics.org/show_bug.cgi?id=489
>>7) eml-physical changes needed. There seems to be a disagreement between
>> the way that eml-physical and eml-protocol model the real world. Should
>> they attempt to do so? What changes need to be made to them? See Peter's
>> email and Scott's response (item 6).
>>8) eml-protocol dataSourceUsed element: should it exist?
>>
>>Content Issues:
>>---------------
>>1) online distribution notation--how do we identify online distributors
>> in a way that will allow both human readability and machine parsability?
>> Why should connection details be necessary in EML, at all?
>>
>>Other Issues:
>>-------------
>>1) technical problems with the identifier and keyref statements
>> which prevent any instance file from validating. I don't understand this
>> aspect of XML very well so I can't really suggest how to fix it or where
>> the problem lies, but I assume it is just a technical matter and not a
>> fundamental problem with what we are trying to do with references.
>>2) normalization issues: pointer normalization (with references), triples,
>> no normalization, etc. Why normalize EML metadata? sort out the issues
>> surrounding the use of references for normalization.
>
--
*******************************************************************
Matt Jones jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/ Fax: 425-920-2439 Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)
Interested in ecological informatics? http://www.ecoinformatics.org
*******************************************************************