[Fwd: protocol, methods, and project]

Wed Aug 28 12:25:03 PDT 2002

Matt, Peter,

I'm scrambling to keep up here, but two comments:

I think project needs its own method.  In fact, the best reason for
declaring a project is to document methods that form the context for
multiple datasets.  Advice to the user:  if a method is dataset
specific, associate it with dataset.  If datasets share an experimental
context, declare a project and document the activities that created the
experimental context as project/method.  For instance, at KBS, our
mainsite layout is a randomized complete block agricultural experiment,
installed in 1986.  That's a project, with a method, and created no data
per se.  All of our main datasets, however, each with their own sampling
techniques (methods/protocols) implicitly rely on the project method.
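
To make the project/method idea concrete, here is a minimal sketch in
Python, not EML itself.  The class names, the methods_for helper, and the
soil dataset are hypothetical and exist only for illustration; the KBS
mainsite description is taken from the paragraph above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Project:
    title: str
    method: str  # activities that created the shared experimental context


@dataclass
class Dataset:
    title: str
    method: str  # dataset-specific sampling technique
    project: Optional[Project] = None  # None for a standalone dataset


def methods_for(dataset: Dataset) -> List[str]:
    """Project context first (if any), then the dataset's own method."""
    methods = []
    if dataset.project is not None:
        methods.append(dataset.project.method)
    methods.append(dataset.method)
    return methods


mainsite = Project(
    title="KBS mainsite layout",
    method="Randomized complete block agricultural experiment, installed in 1986",
)
# Hypothetical dataset that implicitly relies on the project method.
soil = Dataset("Hypothetical soil dataset", "Hypothetical soil sampling protocol", mainsite)
```

Under these assumptions, methods_for(soil) lists the project method
before the dataset's own, while a dataset with no project yields only
its own method.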

I have no objection to simplifying method by dropping software,
instrumentation, etc. and just making it a textType.  It solves some of
the problems I mentioned in my last email.  I have hesitated to suggest
this, because I don't know the history of why they are there.

regards,

Tim.

-------- Original Message --------
Subject: protocol, methods, and project
Date: Wed, 28 Aug 2002 09:35:59 -0800
From: Matt Jones <jones at nceas.ucsb.edu>
To: Peter McCartney <peter.mccartney at asu.edu>
CC: "Eml-Dev (E-mail)" <eml-dev at ecoinformatics.org>
References: <11232E890694F74893375BC7AD88A62DAAB20E at MAINEX4.ASU.EDU>

Hi Peter,

So I looked over your protocol and methods changes as well.  Excellent 
start. Some comments...

I like the idea of distinguishing between a method that is ad-hoc from a 
protocol that is a resource-level version of a method.  And I agree that 
protocol should be modified to import eml-method so that they are 
identical underneath in structure (except that protocol would have the 
resource fields).  Will you do that as well?

I noticed that you removed the recursion from methodStep, although 
methodSteps are repeatable.  This has some interaction with the 
"procedure/step/substep" debate that's been happening in the eml-text 
thread.  Personally, I think it would be better if EML defined the 
methodStep as repeatable and recursive, and left 
"procedure/step/substep" docbook tags out of the eml-text module.  Or I 
could be persuaded to leave it out of eml-method and put it in eml-text.
What I think would be bad is putting it in both places.  That would 
just make it confusing to end users who are trying to document their 
procedures.
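
As a rough illustration of what "repeatable and recursive" would mean
for methodStep, here is a minimal Python sketch.  It is not the EML
schema; the MethodStep class, the outline helper, and the example steps
are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MethodStep:
    description: str
    # Recursive: a step may contain any number of nested steps.
    substeps: List["MethodStep"] = field(default_factory=list)


def outline(steps: List[MethodStep], prefix: str = "") -> List[str]:
    """Flatten a repeatable, nested list of steps into numbered lines."""
    lines = []
    for i, step in enumerate(steps, start=1):
        number = f"{prefix}{i}"
        lines.append(f"{number} {step.description}")
        lines.extend(outline(step.substeps, prefix=number + "."))
    return lines


# Hypothetical procedure with one level of nesting.
steps = [
    MethodStep("Collect samples", [MethodStep("Mark plots"), MethodStep("Take cores")]),
    MethodStep("Analyze samples"),
]
```

With a single recursive element like this, one structure covers what
"procedure/step/substep" would otherwise need three tags for, which is
the argument for defining it in only one module.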

I also was a bit confused about the inclusion of "protocol" in method. 
If a user has a protocol to describe, should they use the methodStep 
block under method, or under method/protocol?  How do they decide?

The single most frequent question from our EML/Morpho users in our 
seminars, and the one that frustrates users to no end, is: "I have a piece 
of information; in which of these X places should I put it in EML?" 
Basically, they have a very difficult time distinguishing the intent in 
EML of reused element trees.  For example, they might say "data was 
collected in 1999", and want to know if they should put that piece of 
information in dataset/coverage or dataset/dataTable/coverage. 
Legitimate question, subtle distinction.  There are hundreds of these 
subtleties in EML, which add up to massive confusion for users.

So how does this relate to protocol?  By sprinkling "protocol/method" 
elements throughout EML, users will have lots of choices of where to put 
things, and almost no understanding of why things should go in one place 
over another.  In general, if we can, I would like to see us come up 
with a solid rationale and use case for each use of "method" in EML.  If 
we can avoid it, I'd prefer to not have two or more locations for 
information that can be interpreted identically (e.g., method in dataset 
and project).  So... can we consolidate the method references somewhat? 
I'm happy to move it to dataset if that's where people want to see it, 
but I propose that we should not have it in both dataset and project.

I'm also a little confused about the whole method structure itself -- it 
seems overly complex for what it does.  This is not information that we 
intend to machine process at this point.  So why are instrumentation and 
software broken out from the description?  Can't methodStep just be a 
structured text block, maybe called "method"?  Is protocol needed, or 
can it be referenced under citation if it is a published protocol?  Is 
qualityControl so different from methodStep in its machine processing 
that it needs to be a separate structure?  Can't we just include quality 
control methods as a methodStep (aka method in my earlier terms)?

Finally, in project, some of the cardinality rules don't make sense. 
designDescription is 0 to many, and contains a child choice that is 1 to 
many, which contains an optional citation.  I think it should be: 
designDescription is 0 to 1, and contains a choice that is 1 to many, 
where both elements in the choice are 1 to 1.  An analogous argument 
could be applied to studyAreaDescription.  This approach allows these 
elements to be optional, but if they are present, requires that they 
have at least one child for content.  In general having a child of a 
choice be optional is unneeded.
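
The proposed cardinality can be sketched as a small check in Python.
This is only an illustration under stated assumptions: the dict layout,
the check_design_description helper, and the choice element names
"description" and "citation" are hypothetical, since the actual names
of the choice children are not given here.

```python
from typing import Dict, List

# Hypothetical names for the two elements inside the choice.
CHOICE_ELEMENTS = {"description", "citation"}


def check_design_description(project: Dict) -> List[str]:
    """Check: designDescription 0..1, containing a 1..many choice."""
    errors = []
    designs = project.get("designDescription", [])
    if len(designs) > 1:
        errors.append("designDescription may appear at most once")
    for design in designs:
        children = design.get("children", [])
        if not children:
            # Present but empty: the choice must occur at least once.
            errors.append("designDescription needs at least one child")
        for name in children:
            if name not in CHOICE_ELEMENTS:
                errors.append(f"unexpected child: {name}")
    return errors
```

A project with no designDescription passes, as does one with a single
designDescription holding one or more children; an empty
designDescription or a second designDescription is reported.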

Well, that's probably enough of a tome for now.  Thanks for your 
excellent effort on this...

Matt

Peter McCartney wrote:
> 2) my proposed changes to address the LTER IM complaints about 
> protocol/project: these involved defining a new module called 
> eml-methods which includes elements for quality control, sampling 
> description and methodStep(s). MethodSteps includes pointers to related 
> software, datasets, and instruments, and a description which is any 
> combination of text, eml-protocol, or eml-citation.
> 
> methods is imported optionally into dataset, entityBase and attribute 
> and any previous links to protocol are removed.
> 
> This solution leaves eml-protocol to be used only for generic protocols 
> that we wish to publish as resources. There is still a flaw here in that 
> protocol also defines a methodStep element that differs from the one in 
> method, so the name should be changed or perhaps protocol should just 
> import eml-method so that all it does is add the resourceBase elements.
>

-- 
*******************************************************************
Matt Jones                                    jones at nceas.ucsb.edu
http://www.nceas.ucsb.edu/    Fax: 425-920-2439   Ph: 907-789-0496
National Center for Ecological Analysis and Synthesis (NCEAS)

Interested in ecological informatics? http://www.ecoinformatics.org
*******************************************************************
