[seek-kr] Re: [seek-dev] data typing in ptolemy
Bertram Ludaescher
ludaesch at sdsc.edu
Thu Oct 9 11:19:04 PDT 2003
Chad:
This is a very interesting issue and is directly related to the semantic
typing issues Shawn has been working on recently.
Shawn:
Can you directly work with Chad and Matt on this to make sure the
extensions will be compatible with what we've been discussing?
I can't join today's IRC session, but maybe we can have a phone
conf. on this next week.
Bertram
>>>>> "CB" == Chad Berkley <berkley at nceas.ucsb.edu> writes:
CB>
CB> Hi,
CB> Matt and I had a conversation on IRC the other day that we thought might
CB> be of interest to those on these lists.
CB>
CB> Basically, I am now dealing with typing issues within ptolemy. Problems
CB> arise when you get missing values in the data. Ptolemy's type hierarchy
CB> does not allow missing values in data tokens, so Matt and I were
CB> talking about extending the ptolemy typing system to allow missing
CB> values. It occurred to us that the typing system will also need to be
CB> extended to allow for semantic typing in the future.
CB>
CB> The type class hierarchy currently looks like the following:
CB>
CB>                        Token
CB>                          |
CB>            -----------------------------
CB>            |                           |
CB>       ScalarToken           AbstractConvertibleToken
CB>            |                           |
CB>      --------------...*         ----------------
CB>      |            |             |              |
CB> DoubleToken   IntToken    BooleanToken    StringToken
CB>
CB> *Note that ScalarToken also includes LongToken and ComplexToken.
CB>
CB> In addition to this Token hierarchy (Tokens are the means by which you
CB> pass data between actors over ports) there is also a port typing
CB> hierarchy implemented in the class BaseType. BaseType is the means by
CB> which you actually specify a port's type. It looks like this:
CB>
CB>                          BaseType
CB>                              |
CB>      -------------------------------------------------....
CB>      |            |            |          |          |
CB> BooleanType  ComplexType  GeneralType  IntType  DoubleType ....*
CB>
CB> * BaseType also includes EventType, LongType, NumericalType, ObjectType,
CB> ScalarType, StringType, UnknownType, UnsignedByteType
CB>
CB> Basically, in order to extend this typing system, we must extend both of
CB> these hierarchies, since Tokens are the means by which data is transferred
CB> between ports and BaseTypes are the means by which you allow (or
CB> disallow) a port to accept different types of data.
CB>
CB> Extending the hierarchy
CB> -----------------------
CB> There are two different ways that I see to extend the hierarchy. The
CB> first is to extend the base class Token with our own tree of token types
CB> extending from the root of the tree. This will probably allow us the
CB> most flexibility in implementing types the way we need to. However, the
CB> main drawback I see is that we would not be able to use most existing
CB> actors, because their ports are typed according to the current
CB> hierarchy. I think that one fact pretty much eliminates this approach
CB> from the options.
CB>
CB> The second approach I see is to extend each of the leaf token types.
CB> For example, extend DoubleToken to ExtendedDoubleToken and add our
CB> additional functionality there. This keeps our type system within the
CB> bounds of the current ptolemy hierarchy but limits our flexibility in
CB> extension. We are basically limited to the hierarchy that already
CB> exists. It is still unclear to me what the effects of doing this will
CB> be on existing actors. For instance, if we extend DoubleToken to allow
CB> missing values, and an actor with a port of BaseType.DoubleType gets an
CB> ExtendedDoubleToken, it would still need to be able to handle whatever
CB> value we assign as a missing value code. This is problematic, because
CB> we are then restricted to using an actual double value as a missing
CB> value code (e.g. -999.999), which we've always maintained was bad data
CB> practice. This could also cause problems because the actor cannot
CB> differentiate -999.999 from a normal value and will operate on it
CB> normally.
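
To make the sentinel concern above concrete, here is a minimal sketch in plain Java. It does not use the Ptolemy classes: SentinelSketch, naiveMean and awareMean are hypothetical names, and OptionalDouble merely stands in for "a double that may be missing". The only value taken from the message is the -999.999 code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalDouble;

// Hypothetical stand-ins for illustration; these are NOT Ptolemy II classes.
class SentinelSketch {
    // Missing-value code used as the example in the message above.
    static final double MISSING_CODE = -999.999;

    // Models an existing actor that only understands plain doubles:
    // it averages everything it receives, the sentinel included.
    static double naiveMean(List<Double> values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.size();
    }

    // Models an actor that knows about missing values: absent entries
    // are skipped instead of being treated as data.
    static OptionalDouble awareMean(List<OptionalDouble> values) {
        return values.stream()
                .filter(OptionalDouble::isPresent)
                .mapToDouble(OptionalDouble::getAsDouble)
                .average();
    }

    public static void main(String[] args) {
        List<Double> coded = Arrays.asList(10.0, MISSING_CODE, 20.0);
        List<OptionalDouble> flagged = Arrays.asList(
                OptionalDouble.of(10.0),
                OptionalDouble.empty(),
                OptionalDouble.of(20.0));
        System.out.println("sentinel treated as data: " + naiveMean(coded));
        System.out.println("missing value skipped:    " + awareMean(flagged));
    }
}
```

The naive version silently folds the sentinel into the result, which is the failure mode described above; the aware version only works if the receiving actor is written to check for absence, which existing actors are not.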
CB>
CB> This same problem comes up (but to a lesser extent) when you think about
CB> extending this system for semantics. What does an existing actor do
CB> with semantic information stored in the token? It can ignore it, but
CB> that may be detrimental to the analysis.
CB>
CB> Another possible option
CB> -----------------------
CB> The other possible solution to the missing value problem is to simply
CB> not send any data over the port when a missing value is encountered. I
CB> have modified the EML ingestion actor to dynamically create one typed
CB> port for each attribute in the data package. These ports can then be
CB> hooked up to other actors. The data is sent asynchronously and depends
CB> on the receiving ports to queue the data until all the input data is
CB> present to run the analysis.
CB>
CB> If I simply do not send a token when a missing value comes up, I foresee
CB> major timing problems. For instance, port A and port B are mapped to
CB> input ports X and Y (resp.) of a plotter. Port A sends a token to X,
CB> then B gets a missing value. It sends nothing. The plotter is then
CB> waiting for its second input. The next record is iterated into. Port A
CB> sends another token to X. This causes an exception. The other scenario
CB> is, on the second iteration, A is a missing value but B is not. Then we
CB> are plotting two values from different records when Y receives data from
CB> B in the second record. This would be a nightmare to deal with given
CB> the current directors.
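
The record-misalignment scenario above can be sketched with two plain queues. This is only an illustration under stated assumptions: SkippedTokenSketch and pairUp are hypothetical names, the queues are unbounded (so the exception case is not modeled), and no Ptolemy director or receiver semantics are reproduced. Each token is labeled with the record it came from so the mismatch is visible.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration; not Ptolemy II code.
class SkippedTokenSketch {
    // records[i] = {valueForA, valueForB}; null marks a missing value.
    // Returns the pairs a two-input consumer would see if missing values
    // are simply not sent and each input queues what it receives.
    static List<String> pairUp(Double[][] records) {
        Deque<String> x = new ArrayDeque<>(); // input X, fed by port A
        Deque<String> y = new ArrayDeque<>(); // input Y, fed by port B
        List<String> plotted = new ArrayList<>();
        for (int i = 0; i < records.length; i++) {
            if (records[i][0] != null) {
                x.add("A@rec" + i);
            }
            if (records[i][1] != null) {
                y.add("B@rec" + i);
            }
            // The consumer fires whenever both inputs have a token.
            while (!x.isEmpty() && !y.isEmpty()) {
                plotted.add(x.poll() + "+" + y.poll());
            }
        }
        return plotted;
    }

    public static void main(String[] args) {
        Double[][] records = {
            {1.0, null}, // record 0: B is missing, nothing sent to Y
            {2.0, 3.0},  // record 1: both present
        };
        System.out.println(pairUp(records));
    }
}
```

With the two records shown, the value of A from record 0 ends up paired with the value of B from record 1, and A from record 1 is left waiting in the queue.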
CB>
CB> So, does anyone see something that I'm missing here? What are the needs
CB> of the semantic typing going to be as far as ptolemy goes? Anyone have
CB> a better solution than the three that I've laid out? This is a complex
CB> issue that I need to deal with before I can continue moving forward with
CB> AMS. I don't want to do anything that will hinder the future semantic
CB> extensions of ptolemy, and this is just too much of a basic
CB> infrastructure item to try to hack. If anyone wants to have an IRC chat
CB> about this, I'm on #seek.
CB>
CB> chad
CB>
CB> --
CB> -----------------------
CB> Chad Berkley
CB> National Center for
CB> Ecological Analysis
CB> and Synthesis (NCEAS)
CB> berkley at nceas.ucsb.edu
CB> -----------------------
CB>
CB> _______________________________________________
CB> seek-dev mailing list
CB> seek-dev at ecoinformatics.org
CB> http://www.ecoinformatics.org/mailman/listinfo/seek-dev