[eml-dev] EML access control, bug 1132
Margaret O'Brien
mob at icess.ucsb.edu
Thu Aug 28 16:37:36 PDT 2008
Recently, Chris and I talked over some of the problems surrounding
access rules in EML, since the issue has always been slated for EML-2.1
(outlined and discussed at:
http://bugzilla.ecoinformatics.org/show_bug.cgi?id=1132). Generally, to
create fine-grained access control, EML 2.0.1 uses access rules in
multiple places in a document: either as <access> elements in datasets,
citations, etc., or by reference (to any node with an id) from an
<access> tree in <additionalMetadata>. However, rule combinations can
arise which are confusing; e.g., an access tree at the dataset level
could have more restrictive permissions than one applied by reference to
its child node. We tried to come up with a solution that didn't deviate
too far from the current model, and what is below is somewhat similar to
what Sid put into the EML-2.1 branch long ago. An image of the schema is
attached.
In this model, the <access> tree appears only at the top level, and no
longer under dataset, citation, software and protocol. The
AccessRuleType now has a 0..many child, <describes>, for holding the id
of the node that the rule applies to. If the <describes> is absent, then
the rule applies to the whole document. An instance would look like this:
<eml>
  <access authSystem="knb" order="allowFirst">
    <allow>
      <describes>table.1.1</describes>
      <principal ... >
      <permission ...>
    </allow>
    <deny>
      <describes>table.2.1</describes>
      <principal ... >
      <permission ...>
    </deny>
  </access>
  <dataset>
    ...dataset markup...
  </dataset>
</eml>
We should encourage use of the order attribute (should it be required?)
so that authors will be fully aware of the rules they create. Once the
order attribute has determined whether allow or deny rules go first,
rules should be applied in the order they appear. Presumably, if no
order attribute is included, then the rules are simply applied as they
appear. Keeping the access tree in
one area at the top of the document makes maintenance simpler, and the
<describes> element acts as it does under <additionalMetadata>.
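To make the ordering discussion concrete, here is a minimal sketch of how a
consumer might evaluate rules under this model. It is not taken from metacat or
the eml-parser; the Rule class and is_permitted function are hypothetical names.
It assumes that "allowFirst" means allow rules are processed before deny rules
(so a later matching deny wins), that "denyFirst" is the reverse, that a missing
order attribute means plain document order, and that access is denied when no
rule matches.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """One <allow> or <deny> rule (hypothetical in-memory form)."""
    kind: str         # "allow" or "deny"
    principal: str
    permission: str
    # ids from <describes>; an empty list means the whole document
    describes: list = field(default_factory=list)


def is_permitted(rules, order, principal, permission, node_id):
    """Return True if the last matching rule, after ordering, is an allow."""
    allows = [r for r in rules if r.kind == "allow"]
    denies = [r for r in rules if r.kind == "deny"]
    if order == "allowFirst":
        ordered = allows + denies      # assumption: denies are applied last
    elif order == "denyFirst":
        ordered = denies + allows      # assumption: allows are applied last
    else:
        ordered = list(rules)          # no order attribute: document order

    decision = False                   # assumption: deny when nothing matches
    for rule in ordered:
        in_scope = not rule.describes or node_id in rule.describes
        if in_scope and rule.principal == principal and rule.permission == permission:
            decision = (rule.kind == "allow")
    return decision


rules = [
    Rule("deny", "public", "read", ["table.2.1"]),
    Rule("allow", "public", "read"),   # no <describes>: whole document
]
print(is_permitted(rules, "allowFirst", "public", "read", "table.2.1"))
print(is_permitted(rules, "denyFirst", "public", "read", "table.2.1"))
```

With the same two rules, the answer for table.2.1 depends entirely on the order
attribute, which is one argument for making it required.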
It would have to be decided if access rules should still be allowed in
<additionalMetadata>. These could be 1) not recognized as EML access
trees since node-level control can be described in eml/access, or 2)
discouraged for the same reason, but applied, or 3) allowed and applied
after the eml/access rules.
The model itself doesn't catch conflicting access rules, but it does
simplify descriptions, making it easier for authors to see potential
hang-ups. One way to control some basic conflicts might be to embed a
rule-based schema, like Schematron. Conflict detection could also be
added to the eml-parser.
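As a rough illustration of the kind of conflict detection meant here, the
sketch below flags allow/deny pairs that name the same principal and permission
and whose scopes overlap. The find_conflicts function and the tuple layout are
hypothetical, not part of the eml-parser. It only compares <describes> ids
directly; catching a parent/child conflict (e.g. dataset vs. one of its tables)
would also need the document tree.

```python
from itertools import combinations


def find_conflicts(rules):
    """Return pairs of rules that allow and deny the same thing.

    Each rule is a tuple (kind, principal, permission, describes), where
    describes is a tuple of node ids and an empty tuple means the whole
    document.
    """
    conflicts = []
    for a, b in combinations(rules, 2):
        if a[0] == b[0]:
            continue                    # two allows or two denies
        if a[1] != b[1] or a[2] != b[2]:
            continue                    # different principal or permission
        scope_a, scope_b = set(a[3]), set(b[3])
        # A document-wide rule overlaps everything; otherwise look for a
        # shared id.
        if not scope_a or not scope_b or scope_a & scope_b:
            conflicts.append((a, b))
    return conflicts


rules = [
    ("allow", "public", "read", ()),
    ("deny", "public", "read", ("table.2.1",)),
    ("deny", "jones", "write", ("table.1.1",)),
]
for first, second in find_conflicts(rules):
    print("possible conflict:", first, second)
```

A flagged pair is not necessarily an error, since the order attribute resolves
it, but it marks a spot where the author should check the intent.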
Since the access tree would be available only at the <eml> level, this
would end the use of dataset, citation, etc. as root-level elements for
some purposes (e.g., metacat), since metacat's default behavior is to
allow access only to the logged-in owner if no access tree is present.
In order to specify that a doc was publicly readable, an author would
have to wrap the dataset in <eml>...</eml> to include those access
instructions.
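The wrapping step could be automated along these lines. This is only a sketch:
wrap_with_public_read is a hypothetical helper, the element names are left
un-namespaced as in the instance above rather than matching a real schema, and
the "public" principal and "read" permission values are assumptions chosen for
illustration.

```python
import xml.etree.ElementTree as ET


def wrap_with_public_read(dataset_xml):
    """Wrap a stand-alone <dataset> document in <eml> with a public-read rule."""
    dataset = ET.fromstring(dataset_xml)

    eml = ET.Element("eml")
    access = ET.SubElement(
        eml, "access", {"authSystem": "knb", "order": "allowFirst"}
    )
    # No <describes> child, so the rule applies to the whole document.
    allow = ET.SubElement(access, "allow")
    ET.SubElement(allow, "principal").text = "public"   # assumed value
    ET.SubElement(allow, "permission").text = "read"    # assumed value

    eml.append(dataset)   # the access tree comes first, then the dataset
    return ET.tostring(eml, encoding="unicode")


print(wrap_with_public_read("<dataset><title>Example</title></dataset>"))
```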
Access is a major issue, and it would be good to get some discussion
going. Here, we've only addressed simplifying the location of the access
tree; dealing with conflicts is another issue entirely. Access control
has always been slated for 2.1, but we may need to discuss that, too.
thanks -
Margaret and Chris
--
========================
Margaret O'Brien
Information Management
Santa Barbara Coastal LTER
Marine Science Institute
University of California
Santa Barbara, CA 93106-6150
805-893-2071
mob at icess.ucsb.edu
http://sbc.lternet.edu
========================
-------------- next part --------------
A non-text attachment was scrubbed...
Name: eml_with_access_proposed.png
Type: image/png
Size: 8922 bytes
Desc: not available
URL: <http://mercury.nceas.ucsb.edu/ecoinformatics/pipermail/eml-dev/attachments/20080828/396819e0/attachment-0001.png>