protocol, methods, and project
Peter McCartney
peter.mccartney at asu.edu
Thu Aug 29 09:52:27 PDT 2002
Peter McCartney (peter.mccartney at asu.edu)
Center for Environmental Studies
Arizona State University
480-965-6791
> -----Original Message-----
> From: Matt Jones [mailto:jones at nceas.ucsb.edu]
> Sent: Wednesday, August 28, 2002 10:36 AM
> To: Peter McCartney
> Cc: Eml-Dev (E-mail)
> Subject: protocol, methods, and project
>
>
> Hi Peter,
>
> So I looked over your protocol and methods changes as well. Excellent
> start. Some comments...
>
> I like the idea of distinguishing a method that is ad hoc from a
> protocol that is a resource-level version of a method. And I agree that
> protocol should be modified to import eml-method so that they are
> identical underneath in structure (except that protocol would have the
> resource fields). Will you do that as well?
Yes.
>
> I noticed that you removed the recursion from methodStep, although
> methodSteps are repeatable. This has some interaction with the
> "procedure/step/substep" debate that's been happening in the eml-text
> thread. Personally, I think it would be better if EML defined the
> methodStep as repeatable and recursive, and left
> "procedure/step/substep" docbook tags out of the eml-text module. Or I
> could be persuaded to leave it out of eml-method and put it in eml-text.
> What I think would be bad is putting it in both places. That would
> just make it confusing to end users who are trying to document their
> procedures.
OK, I hadn't clued in to the step/substep thing, but it makes sense. I was
assuming that some aspects of how this information gets entered depended
partly on the related bug on inlining structured text, since some of the
formatting tags we were discussing there can also be used to nest steps and
substeps. I agree that we should pick one or the other depending on how the
other bug is resolved.
> I also was a bit confused about the inclusion of "protocol" in method.
> If a user has a protocol to describe, should they use the methodStep
> block under method, or under method/protocol? How do they decide?
What I wanted to get across was that the user has basically three choices
for describing their procedures with respect to this dataset: enter a
description of those procedures, indicate via a protocol reference that they
followed procedures described there, or indicate via a citation that they
followed procedures described in that publication. It seemed to me that the
problem generated by making protocol a resource was that every
methodological description now had to have a title, a creator, and
potentially a system-persistent identifier. In the model I proposed,
eml-methods contains methods relevant to this dataset alone - it's not meant
to be used as standalone information. In that description, you might say (by
including a protocol or citation element) that you followed general
procedures described in _____. The difference between using protocol vs
citation is that with protocol the information is actually included in
eml-protocol, whereas eml-citation merely points to where you can find it.
But in neither case would the person filling in the metadata be likely to be
editing either of those segments - they would probably be referencing them
from an existing metadata document. So I'm not sure I see the confusion about
where to enter the information: if you have information to enter, it goes
into eml-methods. You would only use eml-protocol to point to some existing
methods that have some formal persistence independent of this dataset.
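As a rough illustration of the three choices described above, here is a minimal Python sketch. It is not EML syntax and not part of the proposal: the class names (Text, ProtocolRef, CitationRef, MethodStep) and the function where_described are hypothetical, chosen only to show that a step's description can mix inline text, a protocol reference, and a citation.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Text:
    """Free-text description entered for this dataset alone."""
    body: str


@dataclass
class ProtocolRef:
    """Reference to an existing protocol document (hypothetical id field)."""
    protocol_id: str


@dataclass
class CitationRef:
    """Pointer to a publication where the procedures are described."""
    citation: str


# A description part is exactly one of the three choices.
DescriptionPart = Union[Text, ProtocolRef, CitationRef]


@dataclass
class MethodStep:
    """One step whose description is any combination of the three choices."""
    description: List[DescriptionPart] = field(default_factory=list)


def where_described(step: MethodStep) -> List[str]:
    """Label each description part by which of the three choices it uses."""
    labels = []
    for part in step.description:
        if isinstance(part, Text):
            labels.append("inline text")
        elif isinstance(part, ProtocolRef):
            labels.append("protocol reference")
        else:
            labels.append("citation")
    return labels
```

Under this reading, someone with new information to enter only ever writes a Text part; the other two parts point at material that already exists elsewhere.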
>
> The single most frequent question from our EML/Morpho users in our
> seminars, and the one that frustrates users to no end, is: "I have a piece
> of information, in which of these X places should I put it in eml?".
> Basically, they have a very difficult time distinguishing the intent in
> eml of reused element trees. For example, they might say "data was
> collected in 1999", and want to know if they should put that piece of
> information in dataset/coverage or dataset/dataTable/coverage.
> Legitimate question, subtle distinction. There are hundreds of these
> subtleties in EML, which add up to massive confusion to users.
>
> So how does this relate to protocol? By sprinkling "protocol/method"
> elements throughout eml, users will have lots of choices of where to put
> things, and almost no understanding of why things should go in one place
> over another. In general, if we can, I would like to see us come up
> with a solid rationale and use case for each use of "method" in eml. If
> we can avoid it, I'd prefer to not have two or more locations for
> information that can be interpreted identically (e.g., method in dataset
> and project). So... can we consolidate the method references somewhat?
> I'm happy to move it to dataset if that's where people want to see it,
> but I propose that we should not have it in both dataset and project.
>
Method, like coverage, was sprinkled in multiple places because we
recognized that within each of those categories there can be information
whose scope is limited to that place. My understanding was that something
entered under dataset applied to the whole dataset, and something under
attribute applied to just that attribute. So the way you handle it in Morpho
is to make the context of the information clear: "Methodological information
about this dataset" vs "any additional methodological information about this
attribute".
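The scoping rule described above can be sketched in a few lines of Python. This is only an illustration under assumed names: the dictionary layout and the function methods_for_attribute are hypothetical and do not reflect how EML or Morpho actually stores this information.

```python
def methods_for_attribute(dataset, attribute_name):
    """Collect the method notes that apply to one attribute.

    Dataset-level notes apply to everything in the dataset; attribute-level
    notes add detail for that attribute only.
    """
    notes = list(dataset.get("methods", []))
    for attribute in dataset.get("attributes", []):
        if attribute["name"] == attribute_name:
            notes.extend(attribute.get("methods", []))
    return notes


# Hypothetical example data, for illustration only.
example_dataset = {
    "methods": ["Plots were sampled monthly."],
    "attributes": [
        {"name": "soil_ph", "methods": ["Measured with a handheld meter."]},
        {"name": "plot_id"},
    ],
}

soil_ph_notes = methods_for_attribute(example_dataset, "soil_ph")
plot_id_notes = methods_for_attribute(example_dataset, "plot_id")
```

Here soil_ph_notes holds both the dataset-wide note and the attribute-specific one, while plot_id_notes holds only the dataset-wide note.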
> I'm also a little confused about the whole method structure itself -- it
> seems overly complex for what it does. This is not information that we
> intend to machine process at this point. So why are instrumentation and
People have requested places to put instrumentation and qualityControl,
although they've pretty obviously resisted syntax to structure that
information. Is it there merely for the traditional practice of putting a
box for it on the form so that people remember to provide it? I don't
know, maybe. Or is it there as a placeholder for when we hope to have better
descriptors of this information?
I'm glad you didn't include dataSourceUsed in that list - this is absolutely
essential for establishing the lineage of a dataset.
> software broken out from the description? Can't methodStep just be a
> structured text block, maybe called "method"? Is protocol needed, or
> can it be referenced under citation if it is a published
I'm reaching this same conclusion, Matt. If we reduce the structure of
protocol to a text document, then it's just another piece of gray literature
and we can simply cite it as a lit item. I'm all for this; I don't see
protocol as a resource and never have. It's a document.
> protocol? Is qualityControl so different from methodStep in its machine
> processing that it needs to be a separate structure? Can't we just
> include quality control methods as a methodStep (aka method in my earlier
> terms)?
>
> Finally, in project, some of the cardinality rules don't make sense.
> designDescription is 0 to many, and contains a child choice that is 1 to
> many, which contains an optional citation. I think it should be:
> designDescription is 0 to 1, and contains a choice that is 1 to many,
> where both elements in the choice are 1 to 1. An analogous argument
> could be applied to studyAreaDescription. This approach allows these
> elements to be optional, but if they are present, requires that they
> have at least one child for content. In general having a child of a
> choice be optional is unneeded.
This is probably all true - I don't see the point in sweating the details
when we have so much debate over the big picture. If, by chance, there's
agreement to go in this direction, then it would be worth cleaning up the
syntax.
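For reference, the cardinality rule Matt proposes (designDescription 0 to 1, containing a 1-to-many choice of required children) could be checked with a sketch like the following. It uses Python's standard xml.etree.ElementTree. The child names "description" and "citation" and the function check_design_description are assumptions made for illustration; the message only says the choice has two elements, one of them a citation.

```python
import xml.etree.ElementTree as ET

# Assumed names for the two elements of the choice (not confirmed above).
ALLOWED_CHILDREN = {"description", "citation"}


def check_design_description(project_xml):
    """Return a list of violations of the proposed cardinality rule."""
    project = ET.fromstring(project_xml)
    errors = []

    # designDescription is optional but may appear at most once (0 to 1).
    designs = project.findall("designDescription")
    if len(designs) > 1:
        errors.append("designDescription may appear at most once")

    for design in designs:
        children = list(design)
        # If present, it must carry at least one child (choice is 1 to many).
        if not children:
            errors.append("designDescription needs at least one child")
        for child in children:
            if child.tag not in ALLOWED_CHILDREN:
                errors.append("unexpected child: " + child.tag)
    return errors
```

A project with no designDescription passes, while an empty designDescription is reported, which matches the stated intent: the element is optional, but if present it must have content.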
> Well, that's probably enough of a tome for now. Thanks for your
> excellent effort on this...
>
> Matt
>
> Peter McCartney wrote:
> > 2) my proposed changes to address the LTER IM complaints about
> > protocol/project: these involved defining a new module called
> > eml-methods which includes elements for quality control, sampling
> > description and methodStep(s). MethodSteps includes pointers to related
> > software, datasets, and instruments and a description which is any
> > combination of text, eml-protocol, or eml-citation.
> >
> > methods is imported optionally into dataset, entityBase and attribute
> > and any previous links to protocol are removed.
> >
> > This solution leaves eml-protocol to be used only for generic protocols
> > that we wish to publish as resources. There is still a flaw here in that
> > protocol also defines a methodStep element that differs from the one in
> > method, so the name should be changed or perhaps protocol should just
> > import eml-method so that all it does is add the resourceBase elements.
> >
>
> --
> *******************************************************************
> Matt Jones jones at nceas.ucsb.edu
> http://www.nceas.ucsb.edu/ Fax: 425-920-2439 Ph: 907-789-0496
> National Center for Ecological Analysis and Synthesis (NCEAS)
>
> Interested in ecological informatics? http://www.ecoinformatics.org
> *******************************************************************
>