[seek-dev] data typing in ptolemy

Fri Oct 10 08:33:14 PDT 2003

Hey Amy, Rich and Shawn (and others),

Thanks for all your comments.  Shawn and I had a pretty in-depth 
discussion about this yesterday afternoon on IRC and we hashed out a few 
ideas.  It seems we all have slightly different ideas about what needs 
to be done, so I'd like to propose having a conference call on this 
issue.  Would sometime Monday work?  I'll propose 9:00 am PDT.  If you 
have a problem with that time, propose a different one.  I can set up a 
VNC connection so we can have a whiteboard to share ideas on.  Shawn 
or Amy, can your phones do 3-way calling?  Mine does, so if one of your 
phones can do it, we can chain together all 4 of us without having to 
spend $.60/minute/participant on a conf. call.  If anyone else wants to 
engage in an extremely technical discussion about this issue, you are 
more than welcome to join in (Bertram?  Matt?).

Amy, to answer your question about the Ptolemy source, we do have the 
ptolemy source code, but I've been trying my best not to alter it. 
We've realized that we may have to do so, but I'd like to avoid that if 
it is at all possible.  This would be a relatively simple thing to fix 
if we just hacked the existing ptolemy source.  (If you want the source, 
by the way, it's freely available at ptolemy.eecs.berkeley.edu.)

Rich and Amy, you should look at the class 
ptolemy.data.type.TypeLattice.  Shawn was under the impression that we 
could extend that class to subvert the entire class hierarchy within 
ptolemy.  We kind of went back and forth about this for a while, but it 
looks interesting.  I'm going to try just hacking it to see if changing 
the TypeLattice will indeed allow us to change the hierarchy without 
changing the actual java class hierarchy (something I'm still not 
convinced of... but Shawn found documentation to that effect).
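
To make the idea concrete, here is a minimal sketch of a type ordering that 
is kept as data rather than expressed through java inheritance.  Everything 
in it is hypothetical: SketchTypeLattice, addEdge and isAtMost are made-up 
names and not the ptolemy.data.type.TypeLattice API, and the "missing" type 
and its position in the ordering are only assumptions for illustration.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a type lattice; NOT the real Ptolemy class.
class SketchTypeLattice {
    // Maps a type name to the names of the types directly above it.
    private final Map<String, Set<String>> above = new HashMap<>();

    // Declares that values of 'lower' are acceptable wherever 'higher' is expected.
    void addEdge(String lower, String higher) {
        above.computeIfAbsent(lower, k -> new HashSet<>()).add(higher);
        above.computeIfAbsent(higher, k -> new HashSet<>());
    }

    // True if 'from' equals 'to', or 'to' can be reached by following edges upward.
    boolean isAtMost(String from, String to) {
        Deque<String> pending = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        pending.push(from);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (current.equals(to)) {
                return true;
            }
            if (seen.add(current)) {
                for (String next : above.getOrDefault(current, Collections.<String>emptySet())) {
                    pending.push(next);
                }
            }
        }
        return false;
    }
}

class SketchTypeLatticeDemo {
    public static void main(String[] args) {
        SketchTypeLattice lattice = new SketchTypeLattice();
        lattice.addEdge("int", "double");
        lattice.addEdge("double", "general");
        lattice.addEdge("missing", "int"); // assumed placement of a new type
        System.out.println(lattice.isAtMost("missing", "double")); // prints true
        System.out.println(lattice.isAtMost("double", "int"));     // prints false
    }
}
```

The point of the sketch is only that the ordering lives in a table that can 
be edited, so adding a type does not require a new java superclass.  Whether 
the real TypeLattice can be changed this way is exactly what still needs to 
be tested.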

chad

Amy Sundermier wrote:
> Hi Chad (and mailing list recipients),
> 
> Do you have the source for Ptolemy?  It seems to me that one way to solve your "missing value" problem is to have a boolean in the base class of the hierarchy (Token) called "nullValue" that all subclasses inherit.  Any token could then declare itself null by setting that boolean to true.  Your code would have to check "if (token.isNullValue())" instead of "if (token == null)" but that's pretty self-documenting.  
> 
> As long as the Ptolemy infrastructure code is not using reflection to set and get values and making assumptions about what it will find, you should be able to modify a superclass in this way without breaking their code.
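
A minimal sketch of the flag described above, using simplified stand-in 
classes.  FlaggedToken and FlaggedDoubleToken are hypothetical names and are 
not the real ptolemy.data.Token classes; only the nullValue flag and the 
isNullValue() check come from the suggestion itself.

```java
// Hypothetical, simplified stand-ins; these are not the real ptolemy.data classes.
class FlaggedToken {
    private boolean nullValue = false;

    boolean isNullValue() {
        return nullValue;
    }

    void setNullValue(boolean nullValue) {
        this.nullValue = nullValue;
    }
}

class FlaggedDoubleToken extends FlaggedToken {
    private final double value;

    FlaggedDoubleToken(double value) {
        this.value = value;
    }

    // Builds a token that represents a missing value.
    static FlaggedDoubleToken missing() {
        FlaggedDoubleToken token = new FlaggedDoubleToken(Double.NaN);
        token.setNullValue(true);
        return token;
    }

    double doubleValue() {
        return value;
    }
}

class NullFlagDemo {
    // Adds up the tokens, skipping any that declare themselves null.
    static double sum(FlaggedDoubleToken[] tokens) {
        double total = 0.0;
        for (FlaggedDoubleToken token : tokens) {
            if (token.isNullValue()) {
                continue;
            }
            total += token.doubleValue();
        }
        return total;
    }

    public static void main(String[] args) {
        FlaggedDoubleToken[] tokens = {
            new FlaggedDoubleToken(1.5),
            FlaggedDoubleToken.missing(),
            new FlaggedDoubleToken(2.5)
        };
        System.out.println(sum(tokens)); // prints 4.0
    }
}
```

Note that only code that checks isNullValue() benefits.  An existing actor 
that never calls it would still read whatever value the token holds.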
> 
> This seems like an obvious solution so I wonder if you've made a policy decision not to modify the Ptolemy source?  
> 
> I'd be interested in learning more about the semantic typing issues alluded to by these emails.  Looking forward to meeting you all.
> 
> Amy Sundermier
> Arizona State University  
> 
> 
> -----Original Message-----
> From:	Chad Berkley [mailto:berkley at nceas.ucsb.edu]
> Sent:	Thu 10/9/2003 10:46 AM
> To:	seek-dev at ecoinformatics.org; seek-kr at ecoinformatics.org
> Cc:	
> Subject:	[seek-dev] data typing in ptolemy
> 
> Hi,
> 
> Matt and I had a conversation on IRC the other day that we thought might 
> be of interest to those on these lists.
> 
> Basically, I am now dealing with typing issues within ptolemy.  Problems 
> arise when you get missing values in the data.  Ptolemy's type hierarchy 
> does not allow missing values in data tokens, so Matt and I were 
> talking about extending the ptolemy typing system to allow missing 
> values.  It occurred to us that the typing system will need to be 
> extended to allow for semantic typing in the future.
> 
> The type class hierarchy currently looks like the following:
> 
>                Token
>                  |
>       --------------------------
>       |                        |
> ScalarToken             AbstractConvertibleToken
>         |                               |
> ---------------...*              -----------------
>   |            |                 |               |
> DoubleToken  IntToken         BooleanToken  StringToken
> 
> *Note that ScalarToken also includes LongToken and ComplexToken.
> 
> In addition to this Token hierarchy (Tokens are the means by which you 
> pass data between actors over ports) there is also a port typing 
> hierarchy implemented in the class BaseType.  BaseType is the means by 
> which you actually specify a port's type.  It looks like this:
> 
>                                BaseType
>                                   |
>       ---------------------------------------------------------....
>       |             |             |          |          |
> BooleanType   ComplexType   GeneralType   IntType   DoubleType ....*
> 
> * BaseType also includes EventType, LongType, NumericalType, ObjectType, 
> ScalarType, StringType, UnknownType, UnsignedByteType
> 
> Basically, in order to extend this typing system, we must extend both of 
> these hierarchies since Tokens are the means by which data is transferred 
> between ports and BaseTypes are the means by which you allow (or 
> disallow) a port to accept different types of data.
> 
> Extending the hierarchy
> -----------------------
> There are two different ways that I see to extend the hierarchy.  The 
> first is to extend the base class Token with our own tree of token types 
> extending from the root of the tree.  This will probably allow us the 
> most flexibility in implementing types the way we need to, however, the 
> main drawback I see to doing this is that we would not be able to use 
> most existing actors because their ports are typed according to the 
> current hierarchy.  I think that one fact pretty much eliminates this 
> approach from the options.
> 
> The second approach I see is to extend each of the leaf token types. 
> For example, extend DoubleToken to ExtendedDoubleToken and add our 
> additional functionality there.  This keeps our type system within the 
> bounds of the current ptolemy hierarchy but limits our flexibility in 
> extension.  We are basically limited to the hierarchy that already 
> exists.  It is still unclear to me what the effects of doing this will 
> be on existing actors.  For instance, if we extend DoubleToken to allow 
> missing values, and an actor with a port of BaseType.DoubleType gets an 
> ExtendedDoubleToken, it would still need to be able to handle whatever 
> value we assign as a missing value code.  This is problematic, because 
> we are then restricted to using an actual double value as a missing 
> value code (i.e. -999.999) which we've always maintained was bad data 
> practice.  This could also cause problems because the actor cannot 
> differentiate -999.999 from a normal value and will operate on it 
> normally.
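
The sentinel problem can be sketched with stand-in classes.  All names here 
(PlainDoubleToken, ExtendedDoubleTokenSketch, legacyMean, awareMean) are 
hypothetical and are not Ptolemy classes; the sketch only assumes that an 
existing actor reads the double value without knowing about the subclass.

```java
// Hypothetical stand-ins; these are not the real ptolemy.data classes.
class PlainDoubleToken {
    private final double value;

    PlainDoubleToken(double value) {
        this.value = value;
    }

    double doubleValue() {
        return value;
    }
}

class ExtendedDoubleTokenSketch extends PlainDoubleToken {
    // Sentinel taken from the example in the text.
    static final double MISSING_CODE = -999.999;

    private final boolean missing;

    private ExtendedDoubleTokenSketch(double value, boolean missing) {
        super(value);
        this.missing = missing;
    }

    static ExtendedDoubleTokenSketch of(double value) {
        return new ExtendedDoubleTokenSketch(value, false);
    }

    static ExtendedDoubleTokenSketch missingValue() {
        return new ExtendedDoubleTokenSketch(MISSING_CODE, true);
    }

    boolean isMissing() {
        return missing;
    }
}

class SentinelDemo {
    // Stands in for an existing actor that only knows the base token type.
    static double legacyMean(PlainDoubleToken[] tokens) {
        double total = 0.0;
        for (PlainDoubleToken token : tokens) {
            total += token.doubleValue(); // the sentinel is treated as real data
        }
        return total / tokens.length;
    }

    // Stands in for a new actor that knows about the extended type.
    static double awareMean(PlainDoubleToken[] tokens) {
        double total = 0.0;
        int count = 0;
        for (PlainDoubleToken token : tokens) {
            if (token instanceof ExtendedDoubleTokenSketch
                    && ((ExtendedDoubleTokenSketch) token).isMissing()) {
                continue;
            }
            total += token.doubleValue();
            count++;
        }
        return count == 0 ? Double.NaN : total / count;
    }
}
```

Given the values 10.0, a missing value, and 20.0, awareMean returns 15.0 
while legacyMean returns a large negative number, because the unaware code 
folds -999.999 into the result.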
> 
> This same problem comes up (but to a lesser extent) when you think about 
> extending this system for semantics.  What does an existing actor do 
> with semantic information stored in the token?  It can ignore it, but 
> that may be detrimental to the analysis.
> 
> Another possible option
> -----------------------
> The other possible solution to the missing value problem is to simply 
> not send any data over the port when a missing value is encountered.  I 
> have modified the EML ingestion actor to dynamically create one typed 
> port for each attribute in the data package.  These ports can then be 
> hooked up to other actors.  The data is sent asynchronously and depends 
> on the receiving ports to queue the data until all the input data is 
> present to run the analysis.
> 
> If I simply do not send a token when a missing value comes up, I foresee 
> major timing problems.  For instance, port A and port B are mapped to 
> input ports X and Y (respectively) of a plotter.  Port A sends a token to X, 
> then B gets a missing value.  It sends nothing.  The plotter is then 
> waiting for its second input.  The next record is then read.  Port A 
> sends another token to X.  This causes an exception.  The other scenario 
> is that, on the second iteration, A is a missing value but B is not.  Then we 
> are plotting two values from different records when Y receives data from 
> B in the second record.  This would be a nightmare to deal with given 
> the current directors.
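
The record mix-up can be simulated without Ptolemy.  This sketch assumes the 
receiving side simply keeps one FIFO queue per input and fires whenever both 
queues hold a value; that is a simplification, and the real behavior 
(including the exception mentioned above) depends on the director.  The 
class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class SkippedTokenDemo {
    // Each record is {valueForX, valueForY}; null models a missing value,
    // for which nothing is sent at all.
    static List<String> plotPairs(Double[][] records) {
        Queue<Double> x = new ArrayDeque<>();
        Queue<Double> y = new ArrayDeque<>();
        List<String> plotted = new ArrayList<>();
        for (Double[] record : records) {
            if (record[0] != null) {
                x.add(record[0]);
            }
            if (record[1] != null) {
                y.add(record[1]);
            }
            // The consumer fires as soon as both inputs have data.
            while (!x.isEmpty() && !y.isEmpty()) {
                plotted.add("(" + x.remove() + ", " + y.remove() + ")");
            }
        }
        return plotted;
    }

    public static void main(String[] args) {
        Double[][] records = {
            {1.0, 10.0},
            {2.0, null}, // second record has no value for Y
            {3.0, 30.0}
        };
        // Prints [(1.0, 10.0), (2.0, 30.0)]
        System.out.println(plotPairs(records));
    }
}
```

The second plotted pair combines X from record 2 with Y from record 3, and 
the X value from record 3 is left stranded in its queue.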
> 
> So, does anyone see something that I'm missing here?  What are the needs 
> of the semantic typing going to be as far as ptolemy goes?  Anyone have 
> a better solution than the three that I've laid out?  This is a complex 
> issue that I need to deal with before I can continue moving forward with 
> AMS.  I don't want to do anything that will hinder the future semantic 
> extensions of ptolemy and this is just too much of a basic 
> infrastructure item to try to hack.  If anyone wants to have an IRC chat 
> about this, I'm on #seek.
> 
> chad
> 

-- 
-----------------------
Chad Berkley
National Center for
Ecological Analysis
and Synthesis (NCEAS)
berkley at nceas.ucsb.edu
-----------------------