[kepler-dev] seek/kepler conference call notes

Shawn Bowers bowers at sdsc.edu
Mon Aug 23 15:29:58 PDT 2004


Edward, thanks for the responses. My comments are interspersed.


Edward A Lee wrote:

> At 11:39 AM 8/23/2004 -0700, Shawn Bowers wrote:
> 
>> I guess my biggest thing when it comes to semantically annotating MoML 
>> directly is the following.
>>
>> There are two main uses that have been proposed in SEEK for semantic 
>> annotations on actors/workflows: (1) to search for and discover actors 
>> via their semantic markup and (2) to use the semantic markup to help 
>> compose heterogeneous actors (i.e., actors that don't have compatible 
>> input/output types).
>>
>> For (1), if we were to store the annotations directly in MoML, then 
>> searching based on semantic types would require obtaining, loading, 
>> and parsing *every* MoML file that describes/represents an 
>> actor/workflow. (Note also that the majority of time parsing MoML in 
>> this case would be spent on non-semantic markup.) We could 
>> alternatively create some form of semantic-annotation "index." But 
>> then it seems the index would end up just being the proposed external 
>> semantic annotation.
> 
> 
> Hmm... First, it seems that if you keep the annotations together with the
> actor/workflows, then _any_ scheme requires some level of parsing
> the file, regardless of what the format is.

I was thinking we would basically have some way to uniquely identify an 
actor or workflow, e.g., through LSIDs or some such mechanism. Then, we 
would externally (i.e., in some index) associate the id with the 
semantic annotation.  Thus, the annotation wouldn't be stored with the 
actor or workflow, but in some other location.
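
A minimal sketch of such an external index, assuming actors are identified by 
LSID-style strings and annotated with ontology class names. The class 
AnnotationIndex, its methods, and the example ids are hypothetical and are 
not part of Kepler, Ptolemy, or the EcoGrid:

```python
from collections import defaultdict


class AnnotationIndex:
    """Hypothetical external index: actor id <-> ontology concepts, kept apart from MoML."""

    def __init__(self):
        self._concepts_by_actor = defaultdict(set)
        self._actors_by_concept = defaultdict(set)

    def annotate(self, actor_id, concept):
        # Record the association in both directions so lookups need no scan.
        self._concepts_by_actor[actor_id].add(concept)
        self._actors_by_concept[concept].add(actor_id)

    def find_actors(self, concept):
        # Search touches only the index; no MoML file is loaded or parsed.
        return sorted(self._actors_by_concept.get(concept, ()))


# Made-up identifiers, for illustration only.
index = AnnotationIndex()
index.annotate("urn:lsid:example.org:actor:niche-1", "EcoNicheModel")
index.annotate("urn:lsid:example.org:actor:regress-2", "LogisticRegressionModel")
index.annotate("urn:lsid:example.org:actor:regress-2", "EcoNicheModel")
print(index.find_actors("EcoNicheModel"))
```

The point of the sketch is only that discovery becomes a lookup by concept, 
with the id as the sole link back to the actor or workflow.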


> Second, the parsing could be greatly speeded up if, for example,
> you first searched the MoML file for a full class name that matched
> the class used to specify the semantic markup.

In general, a semantic annotation isn't a class name, and could be as 
complex as a query.

> Third (much more interestingly), if you use a custom class (subclass
> of Attribute) to specify semantic markup, then you could have actors
> that transparently (in the background) update a dynamically maintained
> peer-to-peer index of actors/workflows ... a Napster of actors.
> Yang Zhao has, in fact, prototyped a mechanism like this...  This
> index could be updated whenever an instance of this custom attribute
> is instantiated, for example.

I was thinking that the index for semantic annotations might be stored 
in a separate location through the EcoGrid, which I suppose acts sort of 
like a P2P framework.  Actually, using P2P technology for Ptolemy sounds 
very sexy, and perhaps the EcoGrid folks too have thought about this.

>
> I think that if you have a separate file that is the semantic markup,
> separately maintained from the actor/workflow source, then keeping
> the two consistent will be very challenging...
> 
> 
>> A smaller issue is that we would need to change/extend the existing 
>> parser for MoML. Whereas using external semantic annotations, we could 
>> build our own, and possibly keep this parser at the location of the 
>> semantic annotations (e.g., on the "EcoGrid").  It also seems that 
>> building our own parser for this is probably just quicker and easier 
>> to get going.
> 
> 
> I don't see why you would need to extend the parser for MoML (?).
> Can you give me an example of what it might look like?  I'll then
> show how it can be done in MoML with no changes to the parser 
> (hopefully)...
> 

I guess I was thinking that you would have to at least extend the parser 
to extract the semantic annotation part and forward it to some other 
service to start up annotation-handler code.
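
To make the extraction step concrete, here is a rough sketch that assumes 
the annotation is stored as an ordinary MoML property whose class attribute 
names a dedicated annotation class. The class name 
org.example.SemanticAnnotation, the actor class, and the MoML fragment are 
invented for illustration; the sketch uses a generic XML parser rather than 
the actual MoML parser:

```python
import xml.etree.ElementTree as ET

# Hypothetical class name marking a property as a semantic annotation.
ANNOTATION_CLASS = "org.example.SemanticAnnotation"

# Invented MoML fragment: one annotation property and one ordinary parameter.
MOML = """<entity name="NicheModel" class="org.example.NicheModelActor">
  <property name="semanticType" class="org.example.SemanticAnnotation" value="EcoNicheModel"/>
  <property name="iterations" class="ptolemy.data.expr.Parameter" value="10"/>
</entity>"""


def extract_annotations(moml_text):
    """Return (name, value) pairs for properties tagged with the annotation class."""
    root = ET.fromstring(moml_text)
    return [
        (prop.get("name"), prop.get("value"))
        for prop in root.iter("property")
        if prop.get("class") == ANNOTATION_CLASS
    ]


# The extracted pairs could then be forwarded to an annotation handler.
print(extract_annotations(MOML))
```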

There are two types of annotations for an actor. The simplest form is to 
just say that an actor represents some instance of a particular ontology 
class (e.g., EcoNicheModel). This type of annotation could also be made 
more specific, e.g., by giving more details on how it instantiates a 
class (which by the way, essentially creates a new class "on-the-fly"). 
For example, we might say it is an instance of an EcoNicheModel that 
uses a specific type of LogisticRegressionModel, etc.  An actor might 
also be an instance of multiple ontological concepts (it's an instance of 
an EcoNicheModel and something else ... which doesn't really fit this 
example, but anyway).
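
As a sketch of this first kind of annotation, assume a toy subclass 
relation and a refinement expressed as (property, class) pairs. The parent 
class EcologicalModel, the property name "uses", and the helper names are 
assumptions made up for the example:

```python
from dataclasses import dataclass

# Toy subclass relation (child -> parent); "EcologicalModel" is invented here.
SUBCLASS_OF = {"EcoNicheModel": "EcologicalModel"}


@dataclass(frozen=True)
class ConceptAnnotation:
    concept: str              # ontology class the actor is an instance of
    restrictions: tuple = ()  # optional (property, class) refinements


def is_a(concept, target):
    """Walk up the toy hierarchy to test whether concept is target or a subclass of it."""
    while concept is not None:
        if concept == target:
            return True
        concept = SUBCLASS_OF.get(concept)
    return False


def satisfies(annotation, target, required=()):
    """True if the annotation is at least as specific as the requested class and refinements."""
    return is_a(annotation.concept, target) and set(required) <= set(annotation.restrictions)


plain = ConceptAnnotation("EcoNicheModel")
# Refining the annotation effectively defines a new, more specific class on the fly.
refined = ConceptAnnotation("EcoNicheModel", (("uses", "LogisticRegressionModel"),))

print(satisfies(plain, "EcologicalModel"))
print(satisfies(refined, "EcoNicheModel", (("uses", "LogisticRegressionModel"),)))
```

An actor that instantiates several concepts would simply carry several such 
annotations.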

For input/output types, semantic annotations are similar to the unit 
stuff, but generally resemble views (queries), mainly because we need 
finer levels of detail. For example, an actor might take as input a list 
of records:

[{lt1=double, lt2=double, ln1=double, ln2=double, s=int, n=int}]

and the input annotation might be (where the record attributes denote 
values from the same tuple in the list):

Community(C), LatLonPt(NW), lat(NW, lt1), lon(NW, ln1), LatLonPt(SE), 
lat(SE, lt2), lon(SE, ln2), BBox(B), nwCorner(B, NW), seCorner(B, SE), 
communityLocation(C, B), communitySpeciesCount(C, s), 
communityPopulation(C, n)
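
One way to read the annotation above is as a conjunctive query whose atoms 
tie record fields to ontology properties. The following sketch encodes the 
atoms as tuples and checks that every field of the record type is mentioned 
by some atom. The tuple encoding and the helper names are assumptions, not 
an existing SEEK format:

```python
# Structural type of one record in the input list, as given above.
RECORD_FIELDS = {"lt1": "double", "lt2": "double", "ln1": "double",
                 "ln2": "double", "s": "int", "n": "int"}

# The annotation as atoms: (predicate, arg) for classes, (predicate, arg, arg) for properties.
ANNOTATION = [
    ("Community", "C"),
    ("LatLonPt", "NW"), ("lat", "NW", "lt1"), ("lon", "NW", "ln1"),
    ("LatLonPt", "SE"), ("lat", "SE", "lt2"), ("lon", "SE", "ln2"),
    ("BBox", "B"), ("nwCorner", "B", "NW"), ("seCorner", "B", "SE"),
    ("communityLocation", "C", "B"),
    ("communitySpeciesCount", "C", "s"),
    ("communityPopulation", "C", "n"),
]


def unbound_fields(fields, atoms):
    """Record fields that no atom mentions, i.e. fields left without semantics."""
    used = {arg for atom in atoms for arg in atom[1:]}
    return sorted(set(fields) - used)


def field_roles(fields, atoms):
    """Map each record field to the (property, subject variable) pairs that mention it."""
    return {
        f: [(a[0], a[1]) for a in atoms if len(a) == 3 and a[2] == f]
        for f in fields
    }


print(unbound_fields(RECORD_FIELDS, ANNOTATION))
print(field_roles(RECORD_FIELDS, ANNOTATION)["lt1"])
```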


>> Perhaps there are other ways around these issues for MoML that I am 
>> unaware of. But generally speaking, it seems like it is reasonable to 
>> store "extra" information in MoML so long as it isn't used as the 
>> mechanism to query and search for actors.
> 
> I'm thinking that the information should be in MoML, but that your
> annotation classes could generate rapidly searchable indexes.  This way,
> the "source" is all together, but the search is quick...

At one point I was calling the result of computing what you call the 
"annotation class" the "context" of the annotation. In general, it is an 
overestimate (generalization) of the annotation. It seems reasonable to 
compute such a thing to speed up searching either for MoML-embedded or 
external annotations.  It sounds reasonable to use it as an index 
mechanism as well, which would allow MoML to include the actual 
annotation and not bog down search too much.
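
A small sketch of the idea, assuming the context is taken to be the set of 
ontology class names mentioned in an annotation. That definition, the tuple 
encoding of atoms, and the actor ids are all assumptions made for 
illustration. Because the context drops detail, it can only produce false 
positives, never false negatives, so it works as a cheap pre-filter before 
the full annotation is examined:

```python
def context(atoms):
    """Overestimate of an annotation: just the class names of its unary atoms."""
    return frozenset(atom[0] for atom in atoms if len(atom) == 2)


def candidates(context_index, query_atoms):
    """Actors whose stored context covers every class the query mentions."""
    needed = context(query_atoms)
    return sorted(actor for actor, ctx in context_index.items() if needed <= ctx)


# Made-up actors and annotations.
annotations = {
    "actor-A": [("Community", "C"), ("BBox", "B"), ("communityLocation", "C", "B")],
    "actor-B": [("Community", "C"), ("communityPopulation", "C", "n")],
}
context_index = {actor: context(atoms) for actor, atoms in annotations.items()}

query = [("Community", "X"), ("BBox", "Y"), ("communityLocation", "X", "Y")]
print(candidates(context_index, query))
```

Only the candidates returned here would need their full annotations checked 
against the query.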


shawn



> 
> Edward
> 
> 
> 
> ------------
> Edward A. Lee, Professor
> 518 Cory Hall, UC Berkeley, Berkeley, CA 94720
> phone: 510-642-0455, fax: 510-642-2739
> eal at eecs.Berkeley.EDU, http://ptolemy.eecs.berkeley.edu/~eal



