Procedure, method, protocol: consensus?
Tim Bergsma
tbergsma at kbs.msu.edu
Wed Sep 4 06:48:07 PDT 2002
Peter,
I agree with your earlier intuition that procedure/method/protocol needs
to be solved concurrently with the TextType/paragraph issue. I think
Matt needs to jump in here when he has a chance, because he's taking the
technical lead on TextType. Frankly, I consider both problems solved,
or nearly so. I really liked Matt's proposal for TextType, which
answered all of the concerns I had regarding structural richness of
text. I also really like what you are doing with
procedure/method/protocol, which answers nearly all the concerns I had
with those. (For that reason, I hesitate to jump right in to CVS at
this late date and start doing my own thing. I will if you see a good
reason.)
DocBook is really pretty simple. The original issue was NOT that we
need more liberal markup of free text, but that we need structural
elements in eml text so that we may express the structure of
(structured) text correctly. The DocBook people have already solved this
problem: we could just adopt a subset of their solution (leaving the
technicalities of 'adopt' to...Matt?). Any user will still be able to
dump an ASCII stream culled from a paper, but the DocBook solution gives
them a way to indicate, for example, that some subset of the text is
actually a title for the next two paragraphs. This is not formatting;
it is hierarchical structure.
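
To make the distinction concrete, here is a minimal sketch in Python
(standard library only) of a title that governs two paragraphs. The
element names section, title, and para are DocBook's; which subset eml
would actually adopt is still open, and the helper build_section and the
sample text are hypothetical placeholders.

```python
import xml.etree.ElementTree as ET


def build_section(title, paragraphs):
    """Wrap a title and its paragraphs in one DocBook-style <section>.

    The nesting alone records that the title applies to exactly these
    paragraphs; nothing here says how the text should be displayed.
    Hypothetical helper, not part of eml.
    """
    section = ET.Element("section")
    ET.SubElement(section, "title").text = title
    for text in paragraphs:
        ET.SubElement(section, "para").text = text
    return section


# Placeholder content, for illustration only.
section = build_section(
    "A title for the next two paragraphs",
    ["First paragraph under the title.", "Second paragraph under the title."],
)
markup = ET.tostring(section, encoding="unicode")
print(markup)
```

A plain ASCII dump would still fit in a single para; the section wrapper
is only needed when the author wants to state the hierarchy.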
Probably my unfamiliarity with FGDC explains my under-appreciation for
the value of associating elements with steps. No problem. If it's
clear to you that steps need children, then I'm guessing (someone help?)
that it's smarter to define our own procedural steps rather than try to
extend DocBook, for forward-compatibility reasons.
At your prompt, I went back and reread James' comments. With apologies
to James (and hoping we're talking about the same thing), I think
creation of Method was a great idea, precisely because it lets you
choose between associating child elements with steps vs. associating
them with the method as a whole. From an object-oriented point of view,
the method itself is a discrete object (which is probably why I prefer
'method' to 'methods') regardless of whether it contains one step, many
steps, or just a block of text.
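
As a rough sketch of that object-oriented view (Python, with hypothetical
class and field names that do not come from any eml schema), a method can
carry children of its own, and each step can carry children too. The two
query functions show what is gained by step-level attachment: a search can
return the individual steps that use a piece of software, not just the
methods.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    # Hypothetical model of a step with its own attached software.
    description: str
    software: List[str] = field(default_factory=list)


@dataclass
class Method:
    # A method is one object whether it holds one step, many steps,
    # or only a block of descriptive text.
    description: str = ""
    steps: List[Step] = field(default_factory=list)
    software: List[str] = field(default_factory=list)


def methods_using(methods, name):
    """Methods that reference the software at method level or in any step."""
    return [
        m for m in methods
        if name in m.software or any(name in s.software for s in m.steps)
    ]


def steps_using(methods, name):
    """Individual steps that reference the software (needs step-level children)."""
    return [s for m in methods for s in m.steps if name in s.software]


# Placeholder data, for illustration only.
prose_only = Method(description="One block of text.", software=["toolA"])
stepped = Method(steps=[Step("first", ["toolA"]), Step("second", ["toolB"])])
catalog = [prose_only, stepped]
```

With this shape, methods_using(catalog, "toolA") finds both methods, while
steps_using(catalog, "toolA") finds only the one step that names it.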
Which brings me to my only remaining concern: we should explicitly
allow authors of procedures (methods or protocols) to provide a single
block of text (however structured) as an alternative to a series of
steps, with children. Granted, a formal procedure implies the existence
of steps; but there is a lot of written metadata already in existence
that clearly sounds more like protocol or method than any other eml
elements, yet cannot easily (or cannot at all) be broken into a clean
sequence of steps. A flowchart is a protocol, but is not a sequence of
steps (other examples available if you don't like this one). A
flowchart may have a descriptive equivalent, and eml should give us an
obvious place to put it.
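
A minimal sketch of the rule I have in mind, written as a Python check
rather than as schema (in the schema itself this would presumably be
something like an xs:choice). The element names method, text, and step
are placeholders I made up for the example, not the names in
eml-method.xsd.

```python
import xml.etree.ElementTree as ET


def procedure_form(procedure):
    """Classify a procedure as one text block or a series of steps.

    Returns "text" or "steps"; raises ValueError when the element holds
    neither form, more than one text block, or a mix of both.
    Element names are hypothetical.
    """
    text_blocks = procedure.findall("text")
    steps = procedure.findall("step")
    if len(text_blocks) == 1 and not steps:
        return "text"
    if steps and not text_blocks:
        return "steps"
    raise ValueError("expected exactly one text block OR one or more steps")


# Placeholder documents, for illustration only.
prose = ET.fromstring("<method><text>A flowchart, described in prose.</text></method>")
stepped = ET.fromstring("<method><step>First.</step><step>Second.</step></method>")
mixed = ET.fromstring("<method><text>Prose.</text><step>First.</step></method>")

print(procedure_form(prose))
print(procedure_form(stepped))
```

The author who cannot break a procedure into steps uses the text form
directly, instead of creating one artificial step to hold everything.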
best regards,
Tim.
> Peter McCartney wrote:
>
> I don't really know much about DocBook, so I don't really have an answer
> for some of this. As I understand it, DocBook is a spec that is
> available in a variety of ways, including an XML DTD. There does not
> seem to be a schema, so in order to include a namespace we have to
> create one based on a subset. The last time I did that with content
> from FGDC, I wound up spending a day changing all the element names
> from the FGDC DTD to make them more readable, which to me defeats the
> purpose of trying to leverage existing work. All I can say is that our
> ONLY model for this is the FGDC processing step, which provides a way
> to associate source data with each step in a procedure. Beyond that, I
> would be most happy to have a clean break with formatting and allow
> people to paste in anything they wanted (HTML seems the easiest to me)
> and have our parsers treat it as opaque text. I don't see how to
> incorporate DocBook without making it part of EML the way other external
> schemas have become. If the whole community can learn to create DocBook
> text and then paste it in, or our editors can have DocBook plugins, then
> that would be cool, but it seems like the content model for a segment in
> EML would be xs:anyType and that the DocBook markup that gets pasted
> in lies outside the EML schema. But as I said, I probably haven't
> looked at DocBook enough to understand how it would work here.
>
> I think we should probably try to address James's comments, which are
> at a higher level and seem to go against some of the recent
> suggestions I've made. The things I did recently came directly in
> response to LTER data manager requests, but maybe that was a knee-jerk
> reaction just as I'd originally feared.
>
> Peter McCartney (peter.mccartney at asu.edu)
> Center for Environmental Studies
> Arizona State University
> 480-965-6791
>
> > -----Original Message-----
> > From: Tim Bergsma [mailto:tbergsma at kbs.msu.edu]
> > Sent: Tuesday, September 03, 2002 8:13 AM
> > To: Peter McCartney
> > Cc: Eml-Dev (E-mail)
> > Subject: Re: protocol and method: radical proposal
> >
> >
> > Peter,
> >
> > I like your idea of explicitly creating a base procedure type. It
> > allows a formal representation of the similarities and differences
> > between protocol and method.
> >
> > May I assume that procedureType has 'step' as well as 'substep'?
> >
> > I'm beginning to appreciate your comment about 'clean line'. In a
> > certain sense, though, we don't want ANY formatting tags in eml -- only
> > structural tags. Even if procedure/step/substep emulates DocBook, I
> > would still consider them eml.
> >
> > Both you and Matt and I have indicated that it would be tolerable to see
> > procedure/step/substep be nested under textType (DocBook style) or under
> > a native EML procedural type (method/protocol etc), but not both. If we
> > don't need to attach elements to steps, it would be much simpler to nest
> > procedure/step/substep under textType (Proposal A). If we definitely
> > want to attach elements to steps (or preserve the option) then we should
> > exclude procedure/step/substep from textType and declare them under
> > procedureType or equivalent, as you are doing (Proposal B).
> >
> > One thing that I think has evolved without my realizing it, is a
> > container for methodSteps. Apparently the viable proposals have this,
> > but the version we saw at the last Phoenix meeting didn't (methodStep
> > was just repeatable). I apologize for rehashing a discussion I probably
> > missed, but it is now possible to attach elements to a method as a
> > whole, or to individual steps, where before it was only possible to
> > attach elements to steps. For instance, in the last eml-method.xsd you
> > sent out, instrumentation and software are attached to methodStep, but
> > sampling, quality control, citation, and protocol are attached to the
> > MethodType as a whole. I don't understand the choices. Why, for
> > instance, shouldn't some specific step (but not the whole method) simply
> > defer to some protocol?
> >
> > ACTION ITEM: If we go with Proposal B, we should force the user to
> > choose between writing a set of steps versus writing one TextType
> > block. As it currently stands, the user who won't or can't break a
> > procedure into discrete steps is forced to create a single step and dump
> > all content into its attached TextType block.
> >
> > regards,
> >
> > Tim.
> >
> > > Peter McCartney wrote:
> > >
> > > Hi Tim....I'm afraid I'm not familiar enough with DocBook to evaluate
> > > your radical proposal other than to say that it will start getting
> > > confusing if we don't have a clean line between where EML tags leave
> > > off and formatting tags begin. The step/substep is a good example - if
> > > we don't care to attach tags to steps and substeps then it probably
> > > doesn't matter if that aspect of structure becomes more freeform;
> > > otherwise, we might need to define the steps as EML elements and use
> > > DocBook/HTML/whatever for internal structure of the text portions.
> > >
> > > I've done some more experimenting and checked the results into the
> > > branch of CVS that Chad created, "EML_PROSPECTIVE_CHANGES". This
> > > version has a base procedure type that contains little more than
> > > substep, description and citation. Protocol imports this. Method
> > > defines a methodProcedure which extends this with specifics that are
> > > contextually meaningful like instrumentation, datasources, etc. Method
> > > then includes this element plus things that people didn't want to see
> > > under a "step" element like sampling and qualityControl. Of course, as
> > > I've said elsewhere, I'm now in favor of dropping protocol as a resource
> > > - I don't think it is a distinct kind of resource unless it is going to
> > > take on enough structure to be machine readable; otherwise it's a
> > > document.
> > >
> > > Peter McCartney (peter.mccartney at asu.edu)
> > > Center for Environmental Studies
> > > Arizona State University
> > > 480-965-6791
> > >
> > > > -----Original Message-----
> > > > From: Tim Bergsma [mailto:tbergsma at kbs.msu.edu]
> > > > Sent: Friday, August 30, 2002 8:30 AM
> > > > To: Eml-Dev (E-mail)
> > > > Subject: protocol and method: radical proposal
> > > >
> > > >
> > > > Peter et al.,
> > > >
> > > > Recent talk about importing method into protocol confuses me, both
> > > > because I don't know the technical implications of 'import' and more
> > > > importantly because semantically it blurs the distinction between the
> > > > two.
> > > >
> > > > For clarity, I suggest language that sees method and protocol as two
> > > > kinds of procedures: method is a descriptive procedure and protocol is
> > > > a prescriptive procedure. Thus, protocol is an abstraction, albeit an
> > > > important one; method is concrete, representing activity that actually
> > > > occurred, which may or may not be consistent with (validate against?)
> > > > some protocol. This is pretty much how we've all been using the terms.
> > > >
> > > > For importing, then, theoretically there should be a procedure base
> > > > type that both method and protocol import. But there is already a
> > > > procedure element in DocBook, which leads me to a radical proposal...
> > > >
> > > > Let's put procedure/step/substep in TextType (a DocBook subset). Then,
> > > > define protocol and method independently, but identically, as...
> > > >
> > > > exactly one TextType
> > > > optional instrumentation
> > > > optional software
> > > > optional sampling
> > > > optional qualitycontrol
> > > > optional protocol
> > > >
> > > > leaving cardinality etc. to the experts.
> > > >
> > > > Notes:
> > > >
> > > > 1. method would still be available at project, dataset, entity, and
> > > > attribute levels, but that's a different discussion. (I wouldn't mind
> > > > if it were only available at the dataset level).
> > > >
> > > > 2. Both method and protocol can reference other protocols, which makes
> > > > sense.
> > > >
> > > > 3. Authors of both granular (stepped) procedures and casual (prose)
> > > > procedures have a simple, common container for their procedure:
> > > > TextType. procedure/step/substep is there for those who plan to use
> > > > it. No contorted paths.
> > > >
> > > > 4. Nothing is lost except the ability to formally associate instrument
> > > > and software with steps, rather than with method (or protocol) as a
> > > > whole. The association could still be authored in a human readable form
> > > > in the step itself, but a search engine couldn't return all the *steps*
> > > > that use a piece of software, just all the *methods* (or protocols).
> > > > How bad is that?
> > > >
> > > > 5. I keep using method where eml uses methods, sorry. I like method
> > > > (singular) better, but I don't plan to debate it.
> > > >
> > > > Regards,
> > > >
> > > > Tim.
> > > >
> > > >
> > > > Tim Bergsma
> > > > LTER Information Manager
> > > > W.K. Kellogg Biological Station
> > > > Michigan State University
> > > > Hickory Corners, MI 49060
> > > > 616/671-2337
> > > > tbergsma at kbs.msu.edu
> > > > http://lter.kbs.msu.edu
> >
--
Tim Bergsma
LTER Information Manager
W.K. Kellogg Biological Station
Michigan State University
Hickory Corners, MI 49060
616/671-2337
tbergsma at kbs.msu.edu
http://lter.kbs.msu.edu