[seek-kr] RE: [seek-dev] data typing in ptolemy
Amy Sundermier
Amy.Sundermier at asu.edu
Fri Oct 10 07:49:11 PDT 2003
Hi Chad (and mailing list recipients),
Do you have the source for Ptolemy? It seems to me that one way to solve your "missing value" problem is to have a boolean in the base class of the hierarchy (Token) called "nullValue" that all subclasses inherit. Any token could then declare itself null by setting that boolean to true. Your code would have to check "if (token.isNullValue())" instead of "if (token == null)" but that's pretty self-documenting.
As long as the Ptolemy infrastructure code is not using reflection to set and get values and making assumptions about what it will find, you should be able to modify a superclass in this way without breaking their code.
This seems like an obvious solution so I wonder if you've made a policy decision not to modify the Ptolemy source?
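As a rough illustration of the null-flag idea above, here is a minimal Java sketch. The Token and DoubleToken classes below are simplified, hypothetical stand-ins written for this example, not the real Ptolemy classes; the missing() factory and the sumPresent() helper are likewise assumptions.

```java
// Hypothetical, simplified stand-ins; NOT the real Ptolemy token classes.
class Token {
    // Flag inherited by every subclass; false means the token carries a real value.
    private boolean nullValue = false;

    public boolean isNullValue() { return nullValue; }
    public void setNullValue(boolean nullValue) { this.nullValue = nullValue; }
}

class DoubleToken extends Token {
    private final double value;

    DoubleToken(double value) { this.value = value; }

    // Hypothetical factory for a token that represents a missing value.
    static DoubleToken missing() {
        DoubleToken t = new DoubleToken(Double.NaN);
        t.setNullValue(true);
        return t;
    }

    double doubleValue() { return value; }
}

class NullFlagDemo {
    // Sums only the tokens that carry a value, skipping those flagged as null.
    static double sumPresent(DoubleToken[] tokens) {
        double sum = 0.0;
        for (DoubleToken token : tokens) {
            if (token.isNullValue()) {
                continue; // self-documenting check instead of (token == null)
            }
            sum += token.doubleValue();
        }
        return sum;
    }

    public static void main(String[] args) {
        DoubleToken[] column = {
            new DoubleToken(1.5), DoubleToken.missing(), new DoubleToken(2.5)
        };
        System.out.println(sumPresent(column)); // prints 4.0
    }
}
```

Note that code which never calls isNullValue() would still see whatever payload the null token holds (NaN in this sketch), so existing actors would need the check added to benefit.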
I'd be interested in learning more about the semantic typing issues alluded to by these emails. Looking forward to meeting you all.
Amy Sundermier
Arizona State University
-----Original Message-----
From: Chad Berkley [mailto:berkley at nceas.ucsb.edu]
Sent: Thu 10/9/2003 10:46 AM
To: seek-dev at ecoinformatics.org; seek-kr at ecoinformatics.org
Cc:
Subject: [seek-dev] data typing in ptolemy
Hi,
Matt and I had a conversation on IRC the other day that we thought might
be of interest to those on these lists.
Basically, I am now dealing with typing issues within ptolemy. Problems
arise when you get missing values in the data. Ptolemy's type hierarchy
does not allow missing values in data tokens, so Matt and I were
talking about extending the ptolemy typing system to allow missing
values. It occurred to us that the typing system will also need to be
extended to allow for semantic typing in the future.
The type class hierarchy currently looks like the following:
                      Token
                        |
           --------------------------
           |                        |
      ScalarToken       AbstractConvertableToken
           |                        |
   ---------------...*      -----------------
   |             |          |               |
DoubleToken   IntToken  BooleanToken   StringToken
*Note that ScalarToken also includes LongToken and ComplexToken.
In addition to this Token hierarchy (Tokens are the means by which you
pass data between actors over ports) there is also a port typing
hierarchy implemented in the class BaseType. BaseType is the means by
which you actually specify a port's type. It looks like this:
                          BaseType
                             |
  ---------------------------------------------------------....
  |            |            |           |          |
BooleanType ComplexType GeneralType  IntType  DoubleType ....*
* BaseType also includes EventType, LongType, NumericalType, ObjectType,
ScalarType, StringType, UnknownType, UnsignedByteType
Basically, in order to extend this typing system, we must extend both of
these hierarchies since Tokens are the means by which data is transferred
between ports and BaseTypes are the means by which you allow (or
disallow) a port to accept different types of data.
Extending the hierarchy
-----------------------
There are two different ways that I see to extend the hierarchy. The
first is to extend the base class Token with our own tree of token types
extending from the root of the tree. This will probably allow us the
most flexibility in implementing types the way we need to. However, the
main drawback I see to doing this is that we would not be able to use
most existing actors because their ports are typed according to the
current hierarchy. I think that one fact pretty much eliminates this
approach from the options.
The second approach I see is to extend each of the leaf token types.
For example, extend DoubleToken to ExtendedDoubleToken and add our
additional functionality there. This keeps our type system within the
bounds of the current ptolemy hierarchy but limits our flexibility in
extension. we are basically limited to the hierarchy that already
exists. It is still unclear to me what the affects of doing this will
be on existing actors. For instance, if we extend DoubleToken to allow
missing values, and an actor with a port of BaseType.DoubleType gets an
ExtendedDoubleToken, it would still need to be able to handle whatever
value we assign as a missing value code. This is problematic, because
we are then restricted to using an actual double value as a missing
value code (e.g. -999.999), which we've always maintained was bad data
practice. This could also cause problems because the actor cannot
differentiate -999.999 from a normal value and will operate on it
normally.
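To make the sentinel problem concrete, here is a minimal Java sketch. The classes are simplified, hypothetical stand-ins rather than the real Ptolemy classes, and the MISSING_CODE constant, the missing() factory, and the mean() "actor" are assumptions made for illustration only.

```java
// Hypothetical, simplified stand-ins; NOT the real Ptolemy token classes.
class DoubleToken {
    private final double value;

    DoubleToken(double value) { this.value = value; }

    double doubleValue() { return value; }
}

// Leaf extension that has to encode "missing" as an ordinary double.
class ExtendedDoubleToken extends DoubleToken {
    static final double MISSING_CODE = -999.999;

    ExtendedDoubleToken(double value) { super(value); }

    static ExtendedDoubleToken missing() {
        return new ExtendedDoubleToken(MISSING_CODE);
    }

    // Cannot tell a genuine measurement of -999.999 from a missing value.
    boolean isMissing() { return doubleValue() == MISSING_CODE; }
}

class SentinelDemo {
    // Stand-in for an existing actor: it only knows about DoubleToken,
    // so it averages the missing value code like any other number.
    static double mean(DoubleToken[] inputs) {
        double sum = 0.0;
        for (DoubleToken t : inputs) {
            sum += t.doubleValue();
        }
        return sum / inputs.length;
    }

    public static void main(String[] args) {
        DoubleToken[] inputs = {
            new DoubleToken(10.0), ExtendedDoubleToken.missing(), new DoubleToken(20.0)
        };
        // The result is roughly -323.3 instead of 15.0 for the two real values.
        System.out.println(mean(inputs));
    }
}
```

Because ExtendedDoubleToken is still a DoubleToken, the existing actor accepts it without complaint, which is exactly why the bad value goes unnoticed.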
This same problem comes up (but to a lesser extent) when you think about
extending this system for semantics. What does an existing actor do
with semantic information stored in the token? It can ignore it, but
that may be detrimental to the analysis.
Another possible option
-----------------------
The other possible solution to the missing value problem is to simply
not send any data over the port when a missing value is encountered. I
have modified the EML ingestion actor to dynamically create one typed
port for each attribute in the data package. These ports can then be
hooked up to other actors. The data is sent asynchronously and depends
on the receiving ports to queue the data until all the input data is
present to run the analysis.
If I simply do not send a token when a missing value comes up, I foresee
major timing problems. For instance, port A and port B are mapped to
input ports X and Y (respectively) of a plotter. Port A sends a token to X,
then B gets a missing value. It sends nothing. The plotter is then
waiting for its second input. The next record is then iterated into. Port A
sends another token to X. This causes an exception. The other scenario
is, on the second iteration, A is a missing value but B is not. Then we
are plotting two values from different records when Y receives data from
B in the second record. This would be a nightmare to deal with given
the current directors.
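The misalignment scenario can be simulated with two plain queues. This is only a sketch under stated assumptions: it does not use any Ptolemy director or receiver, the plot() method and the "A0:B0" point labels are invented for the example, and it models a consumer that queues inputs and fires whenever both queues hold a token (so it shows the mismatched-record case, not the exception case).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class SkipMissingDemo {
    // Pairs tokens the way a queueing two-input consumer would.
    // A null entry models a missing value, for which nothing is sent.
    static List<String> plot(Double[] columnA, Double[] columnB) {
        Queue<String> x = new ArrayDeque<>(); // tokens waiting on input X
        Queue<String> y = new ArrayDeque<>(); // tokens waiting on input Y
        List<String> points = new ArrayList<>();
        for (int record = 0; record < columnA.length; record++) {
            if (columnA[record] != null) {
                x.add("A" + record); // label the token with its record number
            }
            if (columnB[record] != null) {
                y.add("B" + record);
            }
            // Fire whenever both inputs have something queued.
            while (!x.isEmpty() && !y.isEmpty()) {
                points.add(x.remove() + ":" + y.remove());
            }
        }
        return points;
    }

    public static void main(String[] args) {
        Double[] a = {1.0, null, 3.0}; // A is missing in record 1
        Double[] b = {1.0, 2.0, 3.0};
        System.out.println(plot(a, b)); // prints [A0:B0, A2:B1]
    }
}
```

The second point pairs A from record 2 with B from record 1, and B from record 2 is left stranded in the queue: once a single token is skipped, every later pairing is off by one record.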
So, does anyone see something that I'm missing here? What are the needs
of the semantic typing going to be as far as ptolemy goes? Anyone have
a better solution than the three that I've laid out? This is a complex
issue that I need to deal with before I can continue moving forward with
AMS. I don't want to do anything that will hinder the future semantic
extensions of ptolemy and this is just too much of a basic
infrastructure item to try to hack. If anyone wants to have an IRC chat
about this, I'm on #seek.
chad
--
-----------------------
Chad Berkley
National Center for
Ecological Analysis
and Synthesis (NCEAS)
berkley at nceas.ucsb.edu
-----------------------