[seek-dev] SQL db candidates for data query

Jing Tao tao at nceas.ucsb.edu
Wed Jun 23 16:13:24 PDT 2004


Hi, Dave:

Thanks for the information. I took a look at http://www.csv-jdbc.com/ and 
it is cool. But it is not free :(

Thanks!

Jing 

On Thu, 24 Jun 2004, Dave Vieglais wrote:

> Date: Thu, 24 Jun 2004 10:29:10 +1200
> From: Dave Vieglais <vieglais at KU.EDU>
> To: Jing Tao <tao at nceas.ucsb.edu>
> Cc: Peter McCartney <peter.mccartney at asu.edu>,
>      Bertram Ludaescher <ludaesch at sdsc.edu>,
>      Matt Jones <jones at nceas.ucsb.edu>, seek-dev at ecoinformatics.org
> Subject: Re: [seek-dev] SQL db candidates for data query
> 
> Oh ok.  That makes more sense.  Basically you want to have an SQL 
> interface to an existing data file.  On windows this is pretty easy 
> through ODBC and the text file driver.  Not need to load the data or 
> anything, just create a config file.  The sql support is rather, umm, 
> tenuous though...  I'm not sure if a similar driver would be available 
> for *nix ODBC implementations or even if there's an equivalent JDBC 
> driver (though http://www.csv-jdbc.com/ looks pretty sweet).  Good luck.
> 
> Dave V.
> 
> Jing Tao wrote:
> 
> > Hi, Dave:
> > 
> > I am sorry I didn't specify our task very clearly.
> > Acturally our tasks includes those parts: 
> > 1)Find a relational db which can store data.
> > 2)Create tables in the db after parsing the metadata corresponding to the 
> >   data file.
> > 3)Find a way to load the data file into the tables in db.
> > 4)Run a sql query against the tables in the db.
> > 
> > So first we need a simple db which can store data. The feature which user 
> > can set a text file as data source in hsqldb makes it as a good candidate 
> > for our solution because this will make the task 3 very easy.
> > 
> > Thanks.
> > 
> > Jing
> > 
> > On Thu, 24 Jun 2004, Dave Vieglais wrote:
> > 
> > 
> >>Date: Thu, 24 Jun 2004 09:48:19 +1200
> >>From: Dave Vieglais <vieglais at KU.EDU>
> >>To: Jing Tao <tao at nceas.ucsb.edu>
> >>Cc: Peter McCartney <peter.mccartney at asu.edu>,
> >>     Bertram Ludaescher <ludaesch at sdsc.edu>,
> >>     Matt Jones <jones at nceas.ucsb.edu>, seek-dev at ecoinformatics.org
> >>Subject: Re: [seek-dev] SQL db candidates for data query
> >>
> >>Jing,
> >>Perhaps I missed an important part of the discussion, but from the 
> >>description of your task, it seems like you might be better off looking 
> >>for tools that load text data into relational databases rather than 
> >>trying to find a database that happens to support direct loading of text.
> >>
> >>FYI Berkeley DB is not a relational database.  It provides a transaction 
> >>aware data store that may be used as the basis for building a relational 
> >>database.
> >>
> >>Dave V.
> >>
> >>Jing Tao wrote:
> >>
> >>
> >>>Hi, everyone:
> >>>
> >>>Sorry for previous email which I sent out accidently :)
> >>>
> >>>As Peter mentioned, my purpose to start this thread is found a suitable 
> >>>tool for dynamic loading data which normally stored in text file into 
> >>>relational db and user can query it through sql command.
> >>>
> >>>So far, I am thinking hsql is a best candidate. Because it doesn't need 
> >>>installation (just put a jar file into class path), it supports sql 
> >>>command and JDBC, it supports CSV or other delimited text file as data 
> >>>source.
> >>>
> >>>I also took a look to Berkeley DB Java Edition from sleepycat. It is good 
> >>>db engine too. But it seems it doesn't support sql at all(at least, I 
> >>>didn't find it in the documentation). In this db, records are stored as 
> >>>pairs - key/value, both key and value will be stored as byte array. User 
> >>>search a db through the key value. It works like a hashtable.
> >>>
> >>>I didn't find much info about Sparrow DB. Bertram, would you please point 
> >>>out a url to it?
> >>>
> >>>Thanks.
> >>>
> >>>Jing
> >>>
> >>>
> >>>On Wed, 23 Jun 2004, Peter McCartney wrote:
> >>>
> >>>
> >>>
> >>>>Date: Wed, 23 Jun 2004 10:18:10 -0700
> >>>>From: Peter McCartney <peter.mccartney at asu.edu>
> >>>>To: Bertram Ludaescher <ludaesch at sdsc.edu>, Matt Jones <jones at nceas.ucsb.edu>
> >>>>Cc: Jing Tao <tao at nceas.ucsb.edu>, seek-dev at ecoinformatics.org
> >>>>Subject: RE: [seek-dev] SQL db candidates for data query
> >>>>
> >>>>This thread has listed a number of cool products that vary in features,
> >>>>but its not clear to me that everyone's contributiion is motivated by
> >>>>exactly the same intended use. I understood Jing's original queston to
> >>>>be about a suitable tool for dynamically loading data that are normally
> >>>>stored as ascii files into a relational database so that they may be
> >>>>queryied. For that I think products like th hsqldb (or PointBase which
> >>>>was a commercial java db shipped with forte for a while), are ideal
> >>>>because they are exposed as jdbc connections and thus will work with any
> >>>>code you've already written to work with existing sql data. Exist is an
> >>>>xpath/xquery engin and berkelydb seems to be somewhat proprietary
> >>>>(although I didn't really look at it). Thus with those tools, you don't
> >>>>get the immediate benefit of yourexisting sql code.
> >>>>
> >>>>We had been thinking we would do this in our project using mysql or
> >>>>postgres, but both of those involve an installation and configuration
> >>>>step inorder to make them accessible. A javabased db avoides that
> >>>>neatly, albeit at the expense of performance.
> >>>>
> >>>>Peter McCartney (peter.mccartney at asu.edu)
> >>>>Center for Environmental-Studies
> >>>>Arizona State University
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>
> >>>>>-----Original Message-----
> >>>>>From: seek-dev-admin at ecoinformatics.org 
> >>>>>[mailto:seek-dev-admin at ecoinformatics.org] On Behalf Of 
> >>>>>Bertram Ludaescher
> >>>>>Sent: Wednesday, June 23, 2004 1:41 AM
> >>>>>To: Matt Jones
> >>>>>Cc: Jing Tao; seek-dev at ecoinformatics.org
> >>>>>Subject: Re: [seek-dev] SQL db candidates for data query
> >>>>>
> >>>>>
> >>>>>
> >>>>>Hi all:
> >>>>>
> >>>>>Sorry that I might have missed the beginning of this thread..
> >>>>>
> >>>>>There is also  Sparrow DB ;-) 
> >>>>>
> >>>>>We have done some experiments with storing a simple 
> >>>>>relational query engine close to the data. It's a 100KB 
> >>>>>runtime overhead and gives you relational and recursive 
> >>>>>queries, possibly in the future some XML querying 
> >>>>>capabilities as well. Right now, not much is available or 
> >>>>>checked in, but the local SMSers will provide more info once 
> >>>>>we're back in town and can actually work on this =B-)
> >>>>>
> >>>>>Bertram
> >>>>>
> >>>>>PS I don't want to get into a XML vs. relational debate right 
> >>>>>now. The 
> >>>>>short answer: there a good arguments for each of them.. 
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>>>>>>>"MJ" == Matt Jones <jones at nceas.ucsb.edu> writes:
> >>>>>
> >>>>>MJ> 
> >>>>>MJ> Hi Jing,
> >>>>>MJ> Also, you might consider this Java version of Berkeley DB from 
> >>>>>MJ> Sleepycat.
> >>>>>MJ> 
> >>>>>MJ> http://www.sleepycat.com/products/je.php?src=javaed
> >>>>>MJ> 
> >>>>>MJ> I'm not sure about its features, particularly sql support, but it 
> >>>>>MJ> seems
> >>>>>MJ> like a good potential system given the excellence of the 
> >>>>>underlying 
> >>>>>MJ> berkeley db product.
> >>>>>MJ> 
> >>>>>MJ> Matt
> >>>>>MJ> 
> >>>>>MJ> 
> >>>>>MJ> Jing Tao wrote:
> >>>>>
> >>>>>
> >>>>>>>Hi, Serguei:
> >>>>>>>
> >>>>>>>Actually the query is base on sql. Now we are thinking about the 
> >>>>>>>issue
> >>>>>>>that user don't want a entire data object(i.e. data tables 
> >>>>>
> >>>>>or text files) 
> >>>>>
> >>>>>
> >>>>>>>but part of this data object which match a sql query.
> >>>>>>>One approach to achieve this purpose is to load text files into a 
> >>>>>>>relational db and it is easy to run a sql query against 
> >>>>>
> >>>>>the db. We are 
> >>>>>
> >>>>>
> >>>>>>>think this approach can be done in both ecogrid server 
> >>>>>
> >>>>>side and kepler 
> >>>>>
> >>>>>
> >>>>>>>client side.
> >>>>>>>Of course, postsql, oracle and other one are good 
> >>>>>
> >>>>>candidates as a sql 
> >>>>>
> >>>>>
> >>>>>>>engine. But they are too huge to redistribution with 
> >>>>>
> >>>>>kepler. So we are looking for a light 
> >>>>>
> >>>>>
> >>>>>>>weight java relational db.
> >>>>>>>
> >>>>>>>Thanks.
> >>>>>>>
> >>>>>>>Jing
> >>>>>>>
> >>>>>>>On Thu, 17 Jun 2004, Serguei Krivov wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>Date: Thu, 17 Jun 2004 22:10:39 -0400
> >>>>>>>>From: Serguei Krivov <Serguei.Krivov at uvm.edu>
> >>>>>>>>To: 'Jing Tao' <tao at nceas.ucsb.edu>, seek-dev at ecoinformatics.org
> >>>>>>>>Subject: RE: [seek-dev] SQL db candidates for data query
> >>>>>>>>
> >>>>>>>>Hi All,
> >>>>>>>>I did not attend the last meeting and I do not know much 
> >>>>>
> >>>>>about the 
> >>>>>
> >>>>>
> >>>>>>>>requirements for ql. Yet , before opting for sql db it is good to 
> >>>>>>>>know if sql support (not XQuery and friends) is really the main  
> >>>>>>>>requirement. In fact, should we abandon the world of well 
> >>>>>>>>established sql rdbms (e.g postgresql, oracle) and switch to new 
> >>>>>>>>java  databases, then we shall have a wide vistas of options that 
> >>>>>>>>include native xml databases and a lot of other things. 
> >>>>>
> >>>>>Ferdinando 
> >>>>>
> >>>>>
> >>>>>>>>has  installed one here at 
> >>>>>>>>http://ecoinformatics.uvm.edu:8080/exist/index.xml
> >>>>>>>>There are a lot of others as well, see:
> >>>>>>>>
> >>>>>
> >>>>>http://www.garshol.priv.no/download/xmltools/cat_ix.html#SC_XMLDBMS
> >>>>>
> >>>>>
> >>>>>>>>In fact I wonder if there is a DB specifically designed 
> >>>>>
> >>>>>for DL( or 
> >>>>>
> >>>>>
> >>>>>>>>may be we can write one ;-)  ) But surely, if the target  query 
> >>>>>>>>language is not  sql, then why do not to consider non sql 
> >>>>>
> >>>>>dbs? Ciao,
> >>>>>
> >>>>>
> >>>>>>>>serguei
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>-----Original Message-----
> >>>>>>>>From: seek-dev-admin at ecoinformatics.org 
> >>>>>>>>[mailto:seek-dev-admin at ecoinformatics.org] On Behalf Of Jing Tao
> >>>>>>>>Sent: Wednesday, June 16, 2004 6:46 PM
> >>>>>>>>To: seek-dev at ecoinformatics.org
> >>>>>>>>Subject: [seek-dev] SQL db candidates for data query
> >>>>>>>>
> >>>>>>>>Hi, everyone:
> >>>>>>>>
> >>>>>>>>I am eveluating the sql db candidates for data query. It 
> >>>>>
> >>>>>turns out 
> >>>>>
> >>>>>
> >>>>>>>>that
> >>>>>>>>the following ones are pretty good: hsqldb and Mckoi.
> >>>>>>>>
> >>>>>>>>Here is the features both of them share:
> >>>>>>>>1)Open source
> >>>>>>>>2)Write in pure java and everything is in jar files.
> >>>>>>>>3)Have server/client and stand-alone mode.
> >>>>>>>>4)Have JDBC implementation.
> >>>>>>>>5)Support Linux, Windows.
> >>>>>>>>
> >>>>>>>>Moreover, hsqldb has a good feature that support CSV (Comma 
> >>>>>>>>Separated
> >>>>>>>>Value) or other delimited text file as the source of 
> >>>>>
> >>>>>their data. So user
> >>>>>
> >>>>>
> >>>>>>>>don't need use sql command to insert data into db and 
> >>>>>
> >>>>>only tell the 
> >>>>>
> >>>>>
> >>>>>>>>text
> >>>>>>>>
> >>>>>>>>file location and the sperator. It even can ommit the first line 
> >>>>>>>>when it
> >>>>>>>>
> >>>>>>>>is a column name. It pretty matches eml semantic.
> >>>>>>>>Except pipe(|), comma(,) and period(.), HSQLDB also recognises the
> >>>>>>>>following special indicators for separators:
> >>>>>>>>\semi - semicolon
> >>>>>>>>\quote - quote
> >>>>>>>>\space - space character
> >>>>>>>>\apos - apostrophe
> >>>>>>>>\n - newline - Used as an end anchor (like $ in regular 
> >>>>>
> >>>>>expressions)
> >>>>>
> >>>>>
> >>>>>>>>\r - carriage return
> >>>>>>>>\t - tab
> >>>>>>>>\\ - backslash
> >>>>>>>>\u#### - a Unicode character specified in hexadecimal
> >>>>>>>>
> >>>>>>>>This feature is every good for us to load data into db. 
> >>>>>
> >>>>>So I prefer 
> >>>>>
> >>>>>
> >>>>>>>>to
> >>>>>>>>use hsqldb. 
> >>>>>>>>
> >>>>>>>>Any comments, suggestions are apprecaited.
> >>>>>>>>
> >>>>>>>>Jing
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>MJ> 
> >>>>>MJ> --
> >>>>>MJ> 
> >>>>>-------------------------------------------------------------------
> >>>>>MJ> Matt Jones                                     
> >>>>>jones at nceas.ucsb.edu
> >>>>>MJ> http://www.nceas.ucsb.edu/    Fax: 425-920-2439    Ph: 
> >>>>>907-789-0496
> >>>>>MJ> National Center for Ecological Analysis and Synthesis (NCEAS)
> >>>>>MJ> University of California Santa Barbara
> >>>>>MJ> Interested in ecological informatics? 
> >>>>>http://www.ecoinformatics.org
> >>>>>MJ> 
> >>>>>-------------------------------------------------------------------
> >>>>>MJ> _______________________________________________
> >>>>>MJ> seek-dev mailing list
> >>>>>MJ> seek-dev at ecoinformatics.org
> >>>>>MJ> http://www.ecoinformatics.org/mailman/listinfo/seek-dev
> >>>>>_______________________________________________
> >>>>>seek-dev mailing list
> >>>>>seek-dev at ecoinformatics.org 
> >>>>>http://www.ecoinformatics.org/mailman/listinfo> /seek-dev
> >>>>>
> >>>>
> >>_______________________________________________
> >>seek-dev mailing list
> >>seek-dev at ecoinformatics.org
> >>http://www.ecoinformatics.org/mailman/listinfo/seek-dev
> >>
> > 
> > 
> _______________________________________________
> seek-dev mailing list
> seek-dev at ecoinformatics.org
> http://www.ecoinformatics.org/mailman/listinfo/seek-dev
> 

-- 
Jing Tao
National Center for Ecological
Analysis and Synthesis (NCEAS)
735 State St. Suite 204
Santa Barbara, CA 93101




More information about the Seek-dev mailing list