[kepler-dev] Re: question on typing processes and higher-order functions a la map

Stephen Andrew Neuendorffer neuendor at eecs.berkeley.edu
Wed May 26 13:45:50 PDT 2004


I think Edward understood your original question better than I did...

The way I would perhaps summarize this is that, despite our efforts
to orthogonalize computation from communication, Ptolemy still
results in the construction of actors that are explicit about how data
is packaged into tokens.  From the point of view of describing the functions
that some actors perform, however, there is no apparent reason why this
needs to be explicit in the definition of how to compute factors.

Maybe a better way to think of it is to separate the dataflow interface
from the definition of the behavior.  Define factors using a function
closure, then wrap this function closure in different dataflow
behaviors...  One that maps collections onto different ports.  One that
maps them onto a sequence of tokens from a single port.  One that maps
them onto a single data token from a single port.
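As a rough sketch of that separation (plain Python, not the Ptolemy II
API; all names here are made up for illustration):

```python
# Rough sketch only: separate the behavior of "factors" from the
# dataflow interface that decides how results are packaged into tokens.

def factors(n):
    """Behavior only: the prime factors of n, e.g. factors(20) = [2, 2, 5]."""
    result, d = [], 2
    while d * d <= n:
        while n % d == 0:
            result.append(d)
            n //= d
        d += 1
    if n > 1:
        result.append(n)
    return result

# Wrapper 1: one collection token per firing (factors :: int --> [int]).
def as_collection_token(f, token):
    return f(token)

# Wrapper 2: a sequence of plain tokens on a single port
# (the "flattening" behavior discussed below).
def as_token_sequence(f, token):
    yield from f(token)

# Wrapper 3: distribute the collection across a fixed set of output
# ports; only well-defined when the number of factors is fixed.
def onto_ports(f, token, num_ports):
    outputs = f(token)
    assert len(outputs) == num_ports, "collection size must match port count"
    return tuple(outputs)
```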

In terms of a type system to select automatically between the above
options, I think one issue comes down to "How do I select between
different possible implementations?"  If this is left implicit, is the
behavior of a model still well-defined?  What are the criteria for
making such a selection?  What information is needed to evaluate the
criteria?  Can this information be extracted automatically?

Note that for your example, the behavior of the system is more
statically analyzable (and more implementation freedom is probably
available) if the number of factors is fixed.  Hence, part of the
original specification needs to give properties of the collections
(such as fixed size) that are important for making the decision.  The
difficulty with inferring the size of arrays in Java is that I can
write:

new int[x];

where the value of x cannot be determined statically...  In Java, the
sizes of array allocations are not decidable in general.  Note that
there is nothing to prevent building an analysis that says "these
arrays have this size, and these arrays I can't tell, so you have to
give me more information before I will let you put an SDF wrapper
around your function".  Edward: the "dependent type problem" you refer
to is a red herring...  It just says that for languages with dynamic
allocation of arrays, it is undecidable to determine the size of every
array object...  Language implementations (e.g. Java compilers) often
analyze array bounds anyway (e.g. to remove bounds checks).  In
languages where memory management is very important (e.g. loop nest
optimization for Fortran 77 on supercomputers), such array size
analysis is crucial to efficient performance.

Steve

At 12:59 PM 5/26/2004, Bertram Ludaescher wrote:

>Hi:
>
>sorry for replying late.
>
>Here is again the issue (see earlier posting):
>A factors actor can be conceived of as consuming one input token I (an int)
>and producing multiple output tokens O_1, ..., O_n, the prime factors
>of I:   factors(I) = O_1, ... O_n
>
>One way to assign a type to factors is:
>         factors :: int --> <int>
>where <a> stands for a stream (or sequence, possibly infinite) of
>items of type 'a'.
>
>If such a type is indeed assigned, it leads to an interesting typing
>issue, when applying the higher-order function
>         map :: (a-->b) --> <a> --> <b>
>to factors. According to the obvious type inference, the type of the
>"pipelined" version of factors is
>         Factors :: <int> --> < <int> >
>where Factors is defined as being the same as map(factors).
>
>If we treat <a> just as a list [a], all is well.
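For instance, treating streams as ordinary Python lists, map(factors)
behaves exactly as the typing above predicts (factors here is a
throwaway illustration, not a Ptolemy actor):

```python
def factors(n):
    """Throwaway illustration: prime factors of n as a list."""
    result, d = [], 2
    while d * d <= n:
        while n % d == 0:
            result.append(d)
            n //= d
        d += 1
    if n > 1:
        result.append(n)
    return result

# factors :: int --> [int], so map(factors) :: [int] --> [[int]]:
# the grouping of factors per input is preserved.
print(list(map(factors, [10, 20])))   # prints [[2, 5], [2, 2, 5]]
```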
>
>On the other hand, on Ptolemy one may look at factors as follows:
>         factors :: int --> int
>indicating that
>
>(i) the input and output ports of factors, respectively, are int (not
>array of int, or sequence or list of int...), and
>(ii) stating that the token consumption rate/pattern is
>         factors :: 1 x  int  --> n x int
>(1 input token is consumed and n output tokens are produced)
>
>Indeed a straightforward modeling in Ptolemy using the PN director
>(not SDF as I indicated earlier) would now produce
>         Factors <10, 20, ...> = <2, 5, 2, 2, 5, ...>
>and thus lose the information about which factors go with which number.
>
>There are different ways to handle it:
>(a) not at all (sometimes this behavior might be desired -- examples
>please!).
>
>(b) use a collection type (array in Java, list, etc) and type
>         factors :: int --> [int]
>resulting in a process of the form
>         Factors :: <int> --> < [int] >
>Example
>         Factors <10,20,...> = < [2,5], [2,2,5], ... >
>
>(c) use a special blend (a variant of (a)!?)
>         factors :: int --> <int>
>that when viewed as a process gives
>         Factors :: <int> --> <int>_;
>
>where the subscript ";" indicates a punctuated stream:
>         Factors <10, 20, ...> = < 2, 5 ; 2, 2, 5 ; ... >
>in which special control tokens are inserted into the data streams
>that have the purpose of preserving the nesting structure of variant (b)
>while having the streaming features of variant (a).
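A minimal sketch of the three variants, using finite Python lists as
stand-ins for stream prefixes and a sentinel object as the punctuation
token (all names hypothetical, not Ptolemy II code):

```python
def factors(n):
    """Prime factors of n, e.g. factors(20) = [2, 2, 5]."""
    result, d = [], 2
    while d * d <= n:
        while n % d == 0:
            result.append(d)
            n //= d
        d += 1
    if n > 1:
        result.append(n)
    return result

SEP = object()  # control token standing in for the ";" punctuation

def variant_a(stream):
    # (a) flat: Factors :: <int> --> <int>; grouping is lost.
    return [t for n in stream for t in factors(n)]

def variant_b(stream):
    # (b) collection type: Factors :: <int> --> <[int]>; grouping kept.
    return [factors(n) for n in stream]

def variant_c(stream):
    # (c) punctuated stream: flat tokens, but the grouping is
    # recoverable from the inserted control tokens.
    out = []
    for n in stream:
        out.extend(factors(n))
        out.append(SEP)
    return out

variant_a([10, 20])  # [2, 5, 2, 2, 5]
variant_b([10, 20])  # [[2, 5], [2, 2, 5]]
variant_c([10, 20])  # [2, 5, SEP, 2, 2, 5, SEP]
```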
>
>Does that make sense?
>
>I'm interested in a type system that can do all of the above in a neat
>way (and then we add semantic types)
>
>Bertram
>
>
>
>
>
>
> >>>>> "SAN" == Stephen Andrew Neuendorffer <neuendor at eecs.berkeley.edu> writes:
>SAN>
>SAN> At 01:08 PM 5/25/2004, Bertram Ludaescher wrote:
> >> Hi:
> >>
> >> Liying Sui and I recently came across the following typing problem:
> >>
> >> Consider an actor, say "factors" which computes for a given int I, all
> >> the prime factors of I. For example factors(20) = [2,2,5]
> >>
> >> Thus, the signature of factors is:
> >>
> >> factors :: int --> [int]
> >>
> >> Now assume factors is to be applied on a stream <x1, x2, x3, ...> of
> >> integers, denoted <int>
> >>
> >> It seems tempting to view the *process* Factors that is so created as
> >> applying the higher-order map function to factors, i.e.,
> >> Factors = map(factors)
> >>
> >> There are some interesting typing issues. Let's say map has the
> >> following type on streams:
> >>
> >> map :: (a-->b) --> <a> --> <b>
> >>
> >> That is, map takes a function of type (a-->b) and a stream of a's
> >> (denoted <a>) and returns a stream of b's.
> >> Therefore the type of the Factors process can be determined to be
> >>
> >> Factors :: <int> --> < [int] >
> >>
> >> Example: Factors( <4, 6, 10, ... > ) = < [2,2], [2,3], [2,5], ... >
> >>
> >> So far so good -- no information is lost.
> >>
> >> It seems, however, that in practice sometimes another process is
> >> created:
> >> Factors'( <4, 6, 10, ... > ) = < 2,2,2,3,2,5, ... >
> >>
> >> Clearly this process Factors' does lose some information (the grouping
> >> of result tuples into list of prime factors). While for this specific
> >> example, Factors' is not desirable, such a "flattening" behavior seems
> >> to be used in practice:
>SAN>
>SAN> I'm confused: Are you saying that this is what Ptolemy does and
>SAN> you don't like it, or that Ptolemy does not do this, and you
>SAN> would like it to?
>SAN>
>SAN> Could you consider this to be another higher-order function that takes
>SAN> expandArray :: <[int]> -> <int>?
>SAN>
>SAN>
> >> Let say we change our original function factors to produce not a list
> >> of ints, but a stream of them:
> >>
> >> factors' :: int --> <int>
> >>
> >> This corresponds to a token consumption/production pattern of "1:*"
> >> (for each input token, we might get multiple output tokens).
> >>
> >> Is it correct that in Ptolemy II using factors' with an SDF director
> >> produces a process Factors' on streams that has the signature:
> >>
> >> Factors' :: <int> --> <int>
> >>
> >> In order to make this behavior type-correct it seems we may have to
> >> say that <<int>> = <int>, because we get
> >>
> >> map(factors') = Factors'
> >>
> >> and the former has the type
> >>
> >> map(factors') ::  <int> --> < <int> >
> >>
> >> Note that the type of map(factors') is obtained by using the general
> >> type of map above:
> >>
> >> map :: (a-->b) --> <a> --> <b>
> >>
> >> and the applying this to factors' :: int --> <int>
> >> (hence a = int and b = <int>)
> >>
> >> So if indeed Factors' is of the said type we must accept (whether we
> >> like it or not ;-) that <<int>> = <int> (or in general, nesting
> >> streams in this way yields a "flat" stream).
> >>
> >> Comments??
> >> Does Ptolemy II with an SDF director and an actor of type
> >> myactor :: a --> <b>
>SAN>
>SAN> I don't think this makes sense...  SDF actor functions don't have
>SAN> access to the whole stream...  they have access to a fixed-length
>SAN> prefix of the stream.
>SAN>
> >> produce a process
> >> MYACTOR :: <a> --> <b>
> >> which thus can be explained as a "special map" over streams with the
> >> type identity <<b>> = <b> ??
>SAN>
>SAN> why not another HOF: expandStream :: <<b>> -> <b> ?
>SAN> I think that the advantage of expandArray over expandStream is
>SAN> that arrays are generally finite, while streams are not, and
>SAN> hence it is more likely that the computation I've specified
>SAN> actually processes the data being created...  Note that there are
>SAN> two ways to implement expandStream (does it produce an infinite
>SAN> stream consisting of the first element of each input stream, or
>SAN> does it produce an infinite stream that begins with the first
>SAN> infinite input stream, and then never gets to the other ones?)
>SAN>
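The two readings of expandStream mentioned above can be sketched with
Python generators standing in for streams (hypothetical helper names,
not Ptolemy II actors):

```python
# Two possible readings of expandStream :: <<b>> --> <b>, with Python
# generators standing in for (possibly infinite) streams.
from itertools import count, islice

def expand_concat(streams):
    # Reading 1: exhaust each inner stream in turn.  If the first
    # inner stream is infinite, the later ones are never reached.
    for s in streams:
        yield from s

def expand_interleave(streams):
    # Reading 2: round-robin over a finite list of (possibly infinite)
    # streams, so the output starts with the first element of each.
    iters = [iter(s) for s in streams]
    while iters:
        for it in list(iters):
            try:
                yield next(it)
            except StopIteration:
                iters.remove(it)

evens, odds = count(0, 2), count(1, 2)
print(list(islice(expand_interleave([evens, odds]), 6)))  # [0, 1, 2, 3, 4, 5]
```

With expand_concat on the same pair of streams, only the even stream is
ever observed, which is exactly the hazard described above.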
> >> Thanks in advance!
> >>
> >> Bertram and Liying
> >>
> >>
> >> 
> ----------------------------------------------------------------------------
> >> Posted to the ptolemy-hackers mailing list.  Please send administrative
> >> mail for this list to: ptolemy-hackers-request at ptolemy.eecs.berkeley.edu
>SAN>




