[kepler-dev] RE: [seek-dev] SQL db candidates for data query

Bertram Ludaescher ludaesch at sdsc.edu
Tue Jun 29 00:35:38 PDT 2004


Bing Zhu writes:
 > Jing,
 > 
 > I didn't see any advantage of choosing hsqldb or Mckoi over
 > mysql and postgres. Both mysql and postgres have the features
 > listed in your mail.
 > 
 > My experience with mysql and postgres is very positive: they
 > are easy to install and to use. I bet most of us in the SEEK project
 > have more or less used one of them.
 > 
 > Since we already have a JDBC actor in Kepler, accessing data
 > in either mysql or postgres should be pretty easy and straightforward.

yep. With the caveat that the JDBC actor is "schema unaware" and a
query is just some SQL string against an unknown schema. In a more
sophisticated future version, a DB actor could "instantiate itself"
with the relational schema of the underlying database to which it has
been connected and thus facilitate query construction. There are
probably a lot of different ways this could be done.
Overall the idea would be a bit similar to the EML ingestor and to the
web service actor, both of which can expose the "schema" of their
underlying component at runtime.
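
A minimal sketch of that idea in Java, assuming nothing beyond the
standard JDBC metadata interface. SchemaSketch, discover and select are
hypothetical names for illustration only; they are not part of Kepler
or of the existing JDBC actor, and a real actor would probably need
more (column types, keys, schema/catalog filtering).

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not Kepler code): reads a schema so a query can be built against it. */
final class SchemaSketch {

    /** Maps each table name to its column names, using only standard JDBC metadata calls. */
    static Map<String, List<String>> discover(Connection conn) throws SQLException {
        Map<String, List<String>> schema = new LinkedHashMap<>();
        DatabaseMetaData meta = conn.getMetaData();
        // "%" matches every table; the type filter skips views and system tables.
        try (ResultSet tables = meta.getTables(null, null, "%", new String[] {"TABLE"})) {
            while (tables.next()) {
                schema.put(tables.getString("TABLE_NAME"), new ArrayList<>());
            }
        }
        for (Map.Entry<String, List<String>> entry : schema.entrySet()) {
            // The table argument is a LIKE pattern, so names containing "_" may over-match.
            try (ResultSet cols = meta.getColumns(null, null, entry.getKey(), "%")) {
                while (cols.next()) {
                    entry.getValue().add(cols.getString("COLUMN_NAME"));
                }
            }
        }
        return schema;
    }

    /** Builds a SELECT only from table and column names that the discovered schema contains. */
    static String select(Map<String, List<String>> schema, String table, List<String> columns) {
        List<String> known = schema.get(table);
        if (known == null) {
            throw new IllegalArgumentException("unknown table: " + table);
        }
        for (String column : columns) {
            if (!known.contains(column)) {
                throw new IllegalArgumentException("unknown column: " + column);
            }
        }
        return "SELECT " + String.join(", ", columns) + " FROM " + table;
    }
}
```

The point of the second method is only that, once the schema is known,
the actor can offer and validate table and column names instead of
accepting an arbitrary SQL string.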

This is another example where the borders between runtime and design
time are blurred: while designing a workflow involving a web service
or a DB, one interacts with those resources, discovers their schema,
then specializes the actor to a specific "task" (the selected web
service operation, or a specific query).

Bertram

 > In Ecogrid, we can provide a similar implementation treating mysql
 > or postgres as a data source in Ecogrid. Or we can simply install
 > OGSA-DAI in Ecogrid.
 > 
 > (I might have misunderstood the goal of the task. If so, would you
 > send me some info? Thanks.)
 > 
 > Bing
 > 
 > -----Original Message-----
 > From: seek-dev-admin at ecoinformatics.org
 > [mailto:seek-dev-admin at ecoinformatics.org]On Behalf Of Jing Tao
 > Sent: Wednesday, June 23, 2004 2:17 PM
 > To: Peter McCartney
 > Cc: Bertram Ludaescher; Matt Jones; seek-dev at ecoinformatics.org
 > Subject: RE: [seek-dev] SQL db candidates for data query
 > 
 > 
 > Hi, all:
 > 
 > 
 > 
 > On Wed, 23 Jun 2004, Peter McCartney wrote:
 > 
 > > Date: Wed, 23 Jun 2004 10:18:10 -0700
 > > From: Peter McCartney <peter.mccartney at asu.edu>
 > > To: Bertram Ludaescher <ludaesch at sdsc.edu>,
 > >     Matt Jones <jones at nceas.ucsb.edu>
 > > Cc: Jing Tao <tao at nceas.ucsb.edu>, seek-dev at ecoinformatics.org
 > > Subject: RE: [seek-dev] SQL db candidates for data query
 > >
 > > This thread has listed a number of cool products that vary in features,
 > > but it's not clear to me that everyone's contribution is motivated by
 > > exactly the same intended use. I understood Jing's original question to
 > > be about a suitable tool for dynamically loading data that are normally
 > > stored as ascii files into a relational database so that they may be
 > > queried. For that I think products like hsqldb (or PointBase, which
 > > was a commercial java db shipped with forte for a while) are ideal
 > > because they are exposed as jdbc connections and thus will work with any
 > > code you've already written to work with existing sql data. Exist is an
 > > xpath/xquery engine and Berkeley DB seems to be somewhat proprietary
 > > (although I didn't really look at it). Thus with those tools, you don't
 > > get the immediate benefit of your existing sql code.
 > >
 > > We had been thinking we would do this in our project using mysql or
 > > postgres, but both of those involve an installation and configuration
 > > step in order to make them accessible. A java-based db avoids that
 > > neatly, albeit at the expense of performance.
 > >
 > > Peter McCartney (peter.mccartney at asu.edu)
 > > Center for Environmental-Studies
 > > Arizona State University
 > >
 > >
 > >
 > > > -----Original Message-----
 > > > From: seek-dev-admin at ecoinformatics.org
 > > > [mailto:seek-dev-admin at ecoinformatics.org] On Behalf Of
 > > > Bertram Ludaescher
 > > > Sent: Wednesday, June 23, 2004 1:41 AM
 > > > To: Matt Jones
 > > > Cc: Jing Tao; seek-dev at ecoinformatics.org
 > > > Subject: Re: [seek-dev] SQL db candidates for data query
 > > >
 > > >
 > > >
 > > > Hi all:
 > > >
 > > > Sorry that I might have missed the beginning of this thread..
 > > >
 > > > There is also  Sparrow DB ;-)
 > > >
 > > > We have done some experiments with storing a simple
 > > > relational query engine close to the data. It's a 100KB
 > > > runtime overhead and gives you relational and recursive
 > > > queries, possibly in the future some XML querying
 > > > capabilities as well. Right now, not much is available or
 > > > checked in, but the local SMSers will provide more info once
 > > > we're back in town and can actually work on this =B-)
 > > >
 > > > Bertram
 > > >
 > > > PS I don't want to get into an XML vs. relational debate right
 > > > now. The short answer: there are good arguments for each of them..
 > > >
 > > >
 > > >
 > > > >>>>> "MJ" == Matt Jones <jones at nceas.ucsb.edu> writes:
 > > > MJ>
 > > > MJ> Hi Jing,
 > > > MJ> Also, you might consider this Java version of Berkeley DB from
 > > > MJ> Sleepycat.
 > > > MJ>
 > > > MJ> http://www.sleepycat.com/products/je.php?src=javaed
 > > > MJ>
 > > > MJ> I'm not sure about its features, particularly sql support, but it
 > > > MJ> seems like a good potential system given the excellence of the
 > > > MJ> underlying berkeley db product.
 > > > MJ>
 > > > MJ> Matt
 > > > MJ>
 > > > MJ>
 > > > MJ> Jing Tao wrote:
 > > > >> Hi, Serguei:
 > > > >>
 > > > >> Actually the query is based on sql. Now we are thinking about the
 > > > >> case where a user doesn't want an entire data object (i.e. data
 > > > >> tables or text files) but only the part of it that matches a sql
 > > > >> query. One approach is to load the text files into a relational
 > > > >> db, where it is easy to run a sql query against them. We think
 > > > >> this approach can be done on both the ecogrid server side and the
 > > > >> kepler client side.
 > > > >> Of course, postgresql, oracle and others are good candidates as a
 > > > >> sql engine. But they are too large to redistribute with kepler.
 > > > >> So we are looking for a lightweight java relational db.
 > > > >>
 > > > >> Thanks.
 > > > >>
 > > > >> Jing
 > > > >>
 > > > >> On Thu, 17 Jun 2004, Serguei Krivov wrote:
 > > > >>
 > > > >>
 > > > >>> Date: Thu, 17 Jun 2004 22:10:39 -0400
 > > > >>> From: Serguei Krivov <Serguei.Krivov at uvm.edu>
 > > > >>> To: 'Jing Tao' <tao at nceas.ucsb.edu>, seek-dev at ecoinformatics.org
 > > > >>> Subject: RE: [seek-dev] SQL db candidates for data query
 > > > >>>
 > > > >>> Hi All,
 > > > >>> I did not attend the last meeting and I do not know much about the
 > > > >>> requirements for ql. Yet, before opting for a sql db it is good to
 > > > >>> know if sql support (not XQuery and friends) is really the main
 > > > >>> requirement. In fact, should we abandon the world of well
 > > > >>> established sql rdbms (e.g. postgresql, oracle) and switch to new
 > > > >>> java databases, then we shall have a wide vista of options that
 > > > >>> include native xml databases and a lot of other things. Ferdinando
 > > > >>> has installed one here at
 > > > >>> http://ecoinformatics.uvm.edu:8080/exist/index.xml
 > > > >>> There are a lot of others as well, see:
 > > > >>> http://www.garshol.priv.no/download/xmltools/cat_ix.html#SC_XMLDBMS
 > > > >>>
 > > > >>> In fact I wonder if there is a DB specifically designed for DL (or
 > > > >>> maybe we can write one ;-) ). But surely, if the target query
 > > > >>> language is not sql, then why not consider non-sql dbs?
 > > > >>> Ciao,
 > > > >>> serguei
 > > > >>>
 > > > >>>
 > > > >>>
 > > > >>>
 > > > >>>
 > > > >>> -----Original Message-----
 > > > >>> From: seek-dev-admin at ecoinformatics.org
 > > > >>> [mailto:seek-dev-admin at ecoinformatics.org] On Behalf Of Jing Tao
 > > > >>> Sent: Wednesday, June 16, 2004 6:46 PM
 > > > >>> To: seek-dev at ecoinformatics.org
 > > > >>> Subject: [seek-dev] SQL db candidates for data query
 > > > >>>
 > > > >>> Hi, everyone:
 > > > >>>
 > > > >>> I am evaluating the sql db candidates for data query. It turns out
 > > > >>> that the following ones are pretty good: hsqldb and Mckoi.
 > > > >>>
 > > > >>> Here are the features both of them share:
 > > > >>> 1) Open source.
 > > > >>> 2) Written in pure java, and everything is in jar files.
 > > > >>> 3) Have server/client and stand-alone modes.
 > > > >>> 4) Have a JDBC implementation.
 > > > >>> 5) Support Linux and Windows.
 > > > >>>
 > > > >>> Moreover, hsqldb has a good feature: it supports CSV (Comma
 > > > >>> Separated Value) or other delimited text files as the source of
 > > > >>> table data. So the user doesn't need to use sql commands to insert
 > > > >>> data into the db, and only has to give the text file location and
 > > > >>> the separator. It can even omit the first line when it holds the
 > > > >>> column names. This matches eml semantics pretty well.
 > > > >>> Besides pipe (|), comma (,) and period (.), HSQLDB also recognises
 > > > >>> the following special indicators for separators:
 > > > >>> \semi - semicolon
 > > > >>> \quote - quote
 > > > >>> \space - space character
 > > > >>> \apos - apostrophe
 > > > >>> \n - newline - Used as an end anchor (like $ in regular expressions)
 > > > >>> \r - carriage return
 > > > >>> \t - tab
 > > > >>> \\ - backslash
 > > > >>> \u#### - a Unicode character specified in hexadecimal
 > > > >>>
 > > > >>> This feature is very good for loading data into the db, so I
 > > > >>> prefer to use hsqldb.
 > > > >>>
 > > > >>> Any comments and suggestions are appreciated.
 > > > >>>
 > > > >>> Jing
 > > > >>>
 > > > >>>
 > > > >>
 > > > >>
 > > > MJ>
 > > > MJ> --
 > > > MJ> -------------------------------------------------------------------
 > > > MJ> Matt Jones                                     jones at nceas.ucsb.edu
 > > > MJ> http://www.nceas.ucsb.edu/    Fax: 425-920-2439    Ph: 907-789-0496
 > > > MJ> National Center for Ecological Analysis and Synthesis (NCEAS)
 > > > MJ> University of California Santa Barbara
 > > > MJ> Interested in ecological informatics? http://www.ecoinformatics.org
 > > > MJ> -------------------------------------------------------------------
 > > > MJ> _______________________________________________
 > > > MJ> seek-dev mailing list
 > > > MJ> seek-dev at ecoinformatics.org
 > > > MJ> http://www.ecoinformatics.org/mailman/listinfo/seek-dev
 > > >
 > >
 > 
 > --
 > Jing Tao
 > National Center for Ecological
 > Analysis and Synthesis (NCEAS)
 > 735 State St. Suite 204
 > Santa Barbara, CA 93101
 > 
 > _______________________________________________
 > seek-dev mailing list
 > seek-dev at ecoinformatics.org
 > http://www.ecoinformatics.org/mailman/listinfo/seek-dev
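
For reference, the HSQLDB text-table feature described in Jing's
original message could be driven over JDBC roughly as follows. This is
a sketch under assumptions: it relies on the CREATE TEXT TABLE and
SET TABLE ... SOURCE statements and the fs= and ignore_first= options
from the HSQLDB documentation, whose exact spelling and quoting should
be checked against the version in use. TextTableSketch and its methods
are hypothetical names, and the table, column and file names in the
usage note are made up.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper: exposes a delimited text file as an HSQLDB text table. */
final class TextTableSketch {

    /**
     * Builds the statement that binds a text table to its file.
     * separator is a literal such as "," or "|", or an HSQLDB indicator such as "\\t".
     */
    static String sourceStatement(String table, String file, String separator, boolean skipHeader) {
        String spec = file + ";fs=" + separator + (skipHeader ? ";ignore_first=true" : "");
        return "SET TABLE " + table + " SOURCE \"" + spec + "\"";
    }

    /** Creates the text table and attaches the file; conn is assumed to be an HSQLDB connection. */
    static void attach(Connection conn, String table, String columnDefs,
                       String file, String separator, boolean skipHeader) throws SQLException {
        try (Statement st = conn.createStatement()) {
            // No INSERT statements are needed: the rows come straight from the file.
            st.execute("CREATE TEXT TABLE " + table + " (" + columnDefs + ")");
            st.execute(sourceStatement(table, file, separator, skipHeader));
        }
    }
}
```

A call such as attach(conn, "plots", "id INTEGER, site VARCHAR(64)",
"plots.csv", ",", true) would then make the file queryable with
ordinary SELECT statements, which is what lets existing JDBC code work
against it unchanged.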


