[seek-dev] RE: pgm_cut program

Bing Zhu bzhu at sdsc.edu
Tue Jul 13 11:51:25 PDT 2004


I believe it is currently hard to say which approach is the right one in terms
of being 'generic'.
Based on my experience with remote file transfer software, there is currently
no software that can handle the requirement from 'pgm_cut' the way SRB does in
a distributed environment.

Here is an example of 'pgm_cut'.

aaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaa
aaaaBBBBBBBBaaaaa
aaaaBBBBBBBBaaaaa
aaaaBBBBBBBBaaaaa
aaaaaaaaaaaaaaaaaaa

We want to get the partial image labeled by 'B'.

Although grid-ftp provides partial file transfer, the above case cannot be done
with a single function call. The client side (the grid-ftp client) has to
repeatedly calculate offsets and then make grid-ftp calls with the new offsets
and byte counts for each transfer. In this case, you end up creating a fat,
customized client (or Kepler actor).
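
For illustration, here is a minimal sketch (in Java) of the per-row offset
arithmetic such a fat client would have to perform. The readRange() call is a
hypothetical stand-in for whatever partial-transfer call the client library
actually offers, and the assumption is a row-major 8-bit raster with the pixel
data starting at a known byte offset:

  import java.io.IOException;

  // Hypothetical sketch only: pulling a width x height window out of a
  // row-major 8-bit image with imageWidth columns, one partial read per row.
  public class SubRegionFetcher {

      // Minimal interface standing in for the remote partial-read capability.
      interface PartialReader {
          byte[] readRange(long offset, int length) throws IOException;
      }

      static byte[] fetchRegion(PartialReader reader, long dataStart,
                                int imageWidth, int left, int top,
                                int width, int height) throws IOException {
          byte[] region = new byte[width * height];
          for (int row = 0; row < height; row++) {
              // byte offset of the first requested pixel in this image row
              long offset = dataStart + (long) (top + row) * imageWidth + left;
              byte[] chunk = reader.readRange(offset, width); // one remote call per row
              System.arraycopy(chunk, 0, region, row * width, width);
          }
          return region;
      }
  }

Even for a small clip this means one remote call per row, plus all of the
offset bookkeeping sitting on the client side.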

Another approach would be to download the whole file to a local machine (e.g.
in a Kepler workflow) and clip the image locally. I don't believe anyone would
consider this a good solution in terms of performance, since we only need
partial data.

So extending SRB functionality by pushing the clipping function(s) into the
server side is worth experimenting with. Our idea is to find a 'generic'
solution rather than just a solution for SRB. That's why I want it to be
implemented in Ecogrid.

Thanks for all the discussion. I am currently thinking about a plug-in model
which allows different clipping software to be implemented on the server side
in Ecogrid, whether the node uses SRB or grid-ftp (we don't yet have a
grid-ftp implementation of 'get', but we can add one later with the same
syntax). In this approach, Ecogrid is configurable through an XML configuration
file which specifies the different clipping methods and their implementation
code. With this in place, for instance, an Ecogrid client would use the
following syntax to get data for 'pgm_cut' (a sketch of such a configuration
file follows the example):

  $ java org.ecoinformatics.ecogrid.client.EcogridQueryClient get  \
    srb://../home/whywhere/.../n00an1.10.pgm?method=pgm_cut&left=10&top=10&width=25&height=30 \
    http://orion.sdsc.edu:8080/ogsa/services/org/ecoinformatics/ecogrid/EcoGridLevelI
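
For illustration, a minimal sketch of what such an XML configuration file might
look like (the element and attribute names, and the handler class, are my
assumptions for illustration only, not an existing Ecogrid schema):

  <!-- hypothetical Ecogrid clipping configuration; names are illustrative only -->
  <clipping-methods>
    <method name="pgm_cut"
            class="org.ecoinformatics.ecogrid.clip.PgmCutHandler">
      <param name="left"/>
      <param name="top"/>
      <param name="width"/>
      <param name="height"/>
    </method>
    <!-- a GDAL-based window/reproject handler could be registered the same way -->
  </clipping-methods>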

Bing





-----Original Message-----
From: Bertram Ludaescher [mailto:ludaesch at sdsc.edu]
Sent: Tuesday, July 13, 2004 11:18 AM
To: Matt Jones
Cc: David Stockwell; Dave Vieglais; bzhu at sdsc.edu; Arcot Rajasekar;
Seek-Dev; hwliu
Subject: Re: [seek-dev] RE: pgm_cut program



Matt:

I agree that we should have a "preferred Kepler data handling"
strategy. At the same time, for the time being I think it doesn't hurt
to experiment with various approaches. For one, maybe we don't exactly
know what we want/need. Also, some Kepler spin-offs (non-SEEK) might
want to use their own approach.

One thing I had been thinking about, and I wonder how it relates to
the current SEEK approach, is to use a "handle/reference type" in
Kepler actors or as an extension to web services. This would allow
efficient transfer of handles, while still supporting large data
transfers on demand (via the reference). It's simple, it solves the
third-party transfer problem right away, and best of all it works nicely
with the underlying Ptolemy dataflow style of programming.
Bertram

>>>>> "MJ" == Matt Jones <jones at nceas.ucsb.edu> writes:
MJ>
MJ> Hi,
MJ> Also, Chad has implemented several important GIS functions by wrapping
MJ> GRASS libraries behind web services rather than using GDAL directly.
MJ> These services can be called by Kepler and other clients.  The idea is
MJ> that a workflow could be constructed that would first query to locate
MJ> data, then do the various clipping and reprojections needed directly at
MJ> the source, then transfer the clipped data to whatever analytical
MJ> service needs it in the workflow.  This use of what we've been calling
MJ> 'third party data transfer' and sometimes 'passing data by reference' is
MJ> going to be an important component of data handling. Ultimately, by
MJ> exposing the actual data query and manipulation plan to the workflow
MJ> engine, we will be able to design a scheduler and optimizer that can
MJ> figure out what the most efficient way to execute the workflow is (where
MJ> to do what).  The workflow engine becomes a controller for a variety of
MJ> distributed data access, data manipulation, and analysis tasks.
MJ>
MJ> So, I think we should be careful to plan out an end-to-end strategy for
MJ> all data manipulation, rather than doing it piece by piece.  Dave's
MJ> suggestion to use GDAL I think is the right approach in that it is more
MJ> generic and more broadly useful than just adding on a specific SRB
MJ> extension.  Dave -- could Chad's web-service wrappers around GRASS serve
MJ> the same function?
MJ>
MJ> Matt
MJ>
MJ> David Stockwell wrote:
>> Hi,
>> pgm_cut.c is a client program that just implements the srb library
>> fread and fseek routines. I thought about implementing more
>> functions on the server side but that is a much bigger project.
>>
>> There is an argument for having just a cut function on the
>> server side, as a cut (or crop) must reduce the data to be
>> transferred. Some other operations like rescaling may
>> increase the size of the data to be transferred, and may also
>> put a lot of load on the server.
>>
>> GDAL seems useful, but it may make sense to modify
>> it so it could be used on the client side to make simple
>> SRB (or ecogrid) calls. I have been hoping for something
>> like this on the server side but  I am not in a position to
>> modify the SRB.
>>
>> Cheers
>>
>>
>> Dave Vieglais wrote:
>>
>>> Hi Bing and Dave,
>>> I was wondering if you had considered using tools more appropriate for
>>> handling geospatial data?  The pgm libraries handle image data ok, but
>>> spatial raster data sets have a number of nuances that require
>>> additional processing on top of merely extracting pixel values.  One
>>> library in particular that is applicable to a wide variety of raster
>>> data formats is GDAL (geospatial data abstraction library).
>>> Integrating this with the SRB and as an EcoGrid service would provide
>>> access to a huge variety and volume of spatial data available from
>>> various sources, as well as the pgm images.  That library also has
>>> numerous methods and examples for windowing, scaling and reprojecting
>>> raster data, and so should be fairly easy to integrate in the same
>>> manner as you have described below.  It would also have the
>>> significant advantage of providing the foundation for a more
>>> standardized mechanism for accessing raster spatial data through the
>>> EcoGrid - something that will be useful to a much larger audience
>>> compared with the very limited data accessible for the "Why Where"
>>> system.
>>>
>>> cheers,
>>> Dave V.
>>>
>>> Bing Zhu wrote:
>>>
>>>> David,
>>>>
>>>> I am glad that the SRB implementation works fine for getting partial
>>>> images from the Niche Modeling pgm data.
>>>>
>>>> This is a good step. I just had a meeting with Raja discussing moving
>>>> this approach into our Ecogrid. Raja suggested modifying the SRB code
>>>> in the Ecogrid node to handle the parameters currently implemented in
>>>> pgm_cut.
>>>>
>>>> With this approach, the URL used by Niche Modeling when calling our
>>>> Ecogrid's 'get' function to retrieve a partial pgm image will have
>>>> syntax similar to a CGI request, e.g.
>>>>
>>>> $ java org.ecoinformatics.ecogrid.client.EcogridQueryClient get  \
>>>>   srb://../home/whywhere/.../n00an1.10.pgm?left=10&top=10&width=25&height=30 \
>>>>   http://orion.sdsc.edu:8080/ogsa/services/org/ecoinformatics/ecogrid/EcoGridLevelI
>>>>
>>>> One of my next tasks in SEEK will be to modify the SEEK SRB code to
>>>> handle this for the Niche Modeling data. This approach is also
>>>> applicable to other applications that request partial data transfer.
>>>>
>>>> Bing
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: hwliu [mailto:hwliu at sdsc.edu]
>>>> Sent: Monday, July 12, 2004 12:59 PM
>>>> To: bzhu at sdsc.edu
>>>> Cc: davids at sdsc.edu
>>>> Subject: RE: pgm_cut program
>>>>
>>>>
>>>> Hi, Bing,
>>>> Thanks for your help. I've just tried the program and it worked well.
>>>> But I don't understand the need to use the S-commands. I just copied
>>>> pgm_cut to my local directory on landscape and it worked even though I
>>>> did not copy the S-command directory. By the way, if I want to rebuild
>>>> the program on Windows, do I need to download the whole SRB1.2.1
>>>> directory?
>>>>
>>>> Thanks,
>>>>
>>>> Haowei
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Bing Zhu [mailto:bzhu at sdsc.edu]
>>>> Sent: Monday, July 12, 2004 10:56 AM
>>>> To: hwliu
>>>> Cc: davids at sdsc.edu
>>>> Subject: RE: pgm_cut program
>>>>
>>>> Hi Haowei,
>>>>
>>>> I changed the permission for some directories and files.
>>>> Please try it again and let me know if the permission problem still
>>>> exists.
>>>>
>>>> The files are in the directory, /export/home/bzhu/pgm_cut.
>>>>
>>>> You might also need to use the SRB software, which is installed in
>>>> /export/home/bzhu/SRB2.1.2. Basically, you just need the
>>>> SRB client programs (the S-commands), which can be found
>>>> in /export/home/bzhu/SRB2.1.2/utilities/bin. To use them, you need
>>>> an SRB account; currently we can use the user 'whywhere'.
>>>> I will be glad to show you how to use the SRB software so that
>>>> you can test it with different pgm files stored under 'whywhere'.
>>>>
>>>> Sincerely,
>>>> Bing
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: hwliu [mailto:hwliu at sdsc.edu]
>>>> Sent: Monday, July 12, 2004 10:05 AM
>>>> To: bzhu at sdsc.edu
>>>> Cc: davids at sdsc.edu
>>>> Subject: RE: pgm_cut program
>>>>
>>>>
>>>> Hi, Bing:
>>>> I will test the new version of pgm_cut, but I don't have the
>>>> permission
>>>> to access the directory. Is it possible for you to mail me pgm_cut?
>>>>
>>>> Thanks,
>>>>
>>>> Haowei
>>>>
>>>> -----Original Message-----
>>>> From: Bing Zhu [mailto:bzhu at sdsc.edu]
>>>> Sent: Friday, July 09, 2004 5:48 PM
>>>> To: davids at sdsc.edu
>>>> Cc: Seek-Dev; Arcot Rajasekar; Matt Jones
>>>> Subject: pgm_cut program
>>>>
>>>> David,
>>>>
>>>> I modified the pgm_cut.c code to read pgm files stored in SRB storage.
>>>>
>>>> The modified pgm_cut.c can be found in the directory
>>>> /export/home/bzhu/pgm_cut on the landscape machine.
>>>>
>>>> Here is an example of running 'pgm_cut' to get a partial image from a
>>>> pgm file in SRB.
>>>>
>>>> $ pgm_cut 0 0 20 20
>>>> /home/whywhere.seek/ei/Data/Terrestrial/n00an1.10.pgm
>>>>
>>>> Note that I have re-uploaded the whywhere data using a new SRB account,
>>>> 'whywhere', as we discussed last time. The newly uploaded data, along
>>>> with its metadata, is in the collection
>>>> /home/whywhere.seek/ei/Data/Terrestrial.
>>>>
>>>> Here is the whywhere account info.
>>>> user name: whywhere
>>>> password: I will send you in another mail.
>>>> SRB server: srb.sdsc.edu
>>>> SRB domain: seek
>>>> SRB port: 6613
>>>>
>>>> I'd like to test it next Monday. Will you be in office next Monday?
>>>>
>>>> Sincerely,
>>>> Bing
>>>>
>>>>
>>>>
>>>>
>>
MJ>
MJ> --
MJ> -------------------------------------------------------------------
MJ> Matt Jones                                     jones at nceas.ucsb.edu
MJ> http://www.nceas.ucsb.edu/    Fax: 425-920-2439    Ph: 907-789-0496
MJ> National Center for Ecological Analysis and Synthesis (NCEAS)
MJ> University of California Santa Barbara
MJ> Interested in ecological informatics? http://www.ecoinformatics.org
MJ> -------------------------------------------------------------------
MJ> _______________________________________________
MJ> seek-dev mailing list
MJ> seek-dev at ecoinformatics.org
MJ> http://www.ecoinformatics.org/mailman/listinfo/seek-dev



