[kepler-dev] Consolidating Datasources

Efrat Frank efrat at sdsc.edu
Fri Mar 11 17:11:09 PST 2005

Dear all,

Matt, Jing, Kai and I just had a discussion about consolidating the different data source binding approaches in Kepler into a single user interface. Our conclusions are below:

Federate metadata across different communities: Create a unified metadata object called DataProxy. The DataProxy retrieves the metadata (probably using a DataSystem class, as described below) and parses it with an interpreter for the relevant metadata format, such as EML, Darwin Core, ADN, FGDC, etc. After parsing, the DataProxy has the information needed to download the data object described by the metadata, and it passes that information to the appropriate DataSystem class, which downloads the data. The API will include the following functionality:
InputStream getFullMetadata(String id, String endPoints);
DataSystem parseMetadata(InputStream metadata);
void downloadData(DataSystem object);
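
To make the intended flow concrete, here is a minimal sketch of how the three calls might fit together. Only the three method signatures come from this message. The toy "key=value" metadata format, the ToyDataProxy and DataProxyDemo classes, the in-memory DataSystem stand-in and the example.org endpoint are all assumptions made for illustration; a real proxy would delegate to EML, Darwin Core, ADN or FGDC interpreters. It assumes Java 9 or later for InputStream.readAllBytes().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Minimal stand-in for the DataSystem class (assumption: one method is enough here).
interface DataSystem {
    InputStream getData(String identifier, String endPoints) throws IOException;
}

// The three operations from the message, written as a Java interface.
interface DataProxy {
    InputStream getFullMetadata(String id, String endPoints) throws IOException;
    DataSystem parseMetadata(InputStream metadata) throws IOException;
    void downloadData(DataSystem object) throws IOException;
}

// Hypothetical implementation that works entirely in memory.
class ToyDataProxy implements DataProxy {
    private String dataId;        // filled in by parseMetadata
    private String dataEndPoint;  // filled in by parseMetadata
    private byte[] downloaded = new byte[0];

    @Override
    public InputStream getFullMetadata(String id, String endPoints) {
        // Toy metadata record; a real call would contact the endpoint.
        String record = "dataId=" + id + "\nendPoint=" + endPoints + "\n";
        return new ByteArrayInputStream(record.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public DataSystem parseMetadata(InputStream metadata) throws IOException {
        String text = new String(metadata.readAllBytes(), StandardCharsets.UTF_8);
        for (String line : text.split("\n")) {
            String[] kv = line.split("=", 2);
            if (kv.length != 2) continue;
            if (kv[0].equals("dataId")) dataId = kv[1];
            if (kv[0].equals("endPoint")) dataEndPoint = kv[1];
        }
        // A real proxy would choose EcoGridDataSystem, JDBCDataSystem, etc. here.
        return (identifier, endPoints) -> new ByteArrayInputStream(
                ("data for " + identifier + " from " + endPoints)
                        .getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public void downloadData(DataSystem object) throws IOException {
        // The proxy hands the parsed identifier and endpoint to the data system.
        try (InputStream in = object.getData(dataId, dataEndPoint)) {
            downloaded = in.readAllBytes();
        }
    }

    String downloadedText() {
        return new String(downloaded, StandardCharsets.UTF_8);
    }
}

class DataProxyDemo {
    // Runs the full sequence: fetch metadata, parse it, download the data.
    static String run(String id, String endPoints) {
        try {
            ToyDataProxy proxy = new ToyDataProxy();
            DataSystem system = proxy.parseMetadata(proxy.getFullMetadata(id, endPoints));
            proxy.downloadData(system);
            return proxy.downloadedText();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run("pkg.1", "http://example.org/ecogrid"));
    }
}
```

Note that downloadData takes only the DataSystem, so in this sketch the proxy itself remembers the identifier and endpoint found in the metadata. Whether that state lives in the proxy or in the DataSystem object is a design question the signatures leave open.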

The DataSystem class: a generic class that retrieves data objects (including metadata objects) from different data sources (data systems).
API requirements:
InputStream getData(String identifier, String endPoints);
InputStream getData(other signatures).
Extending classes: EcoGridDataSystem, MetacatDataSystem (for Metacat servers that don't implement the EcoGrid interface), JDBCDataSystem, etc.
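
As a rough illustration of the class hierarchy, the sketch below shows an abstract DataSystem with the three extending classes named above. The getData signature and the class names come from this message. The stub bodies, the string-keyed registry and the DataSystemDemo class are assumptions, since the message does not say how each system actually talks to its source. It assumes Java 9 or later for Map.of and InputStream.readAllBytes().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;

abstract class DataSystem {
    // Signature taken from the API requirements in the message.
    abstract InputStream getData(String identifier, String endPoints) throws IOException;

    // Helper for the stubs below (assumption: real subclasses would open a
    // network or database stream instead of returning canned bytes).
    protected static InputStream stub(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }
}

class EcoGridDataSystem extends DataSystem {
    @Override
    InputStream getData(String identifier, String endPoints) {
        return stub("ecogrid:" + identifier + "@" + endPoints);
    }
}

// For Metacat servers that do not implement the EcoGrid interface.
class MetacatDataSystem extends DataSystem {
    @Override
    InputStream getData(String identifier, String endPoints) {
        return stub("metacat:" + identifier + "@" + endPoints);
    }
}

class JDBCDataSystem extends DataSystem {
    @Override
    InputStream getData(String identifier, String endPoints) {
        return stub("jdbc:" + identifier + "@" + endPoints);
    }
}

class DataSystemDemo {
    // Hypothetical registry: callers pick a data system by a short type name.
    static final Map<String, DataSystem> SYSTEMS = Map.of(
            "ecogrid", new EcoGridDataSystem(),
            "metacat", new MetacatDataSystem(),
            "jdbc", new JDBCDataSystem());

    // Fetches the data object and returns its content as text.
    static String fetch(String type, String identifier, String endPoints) {
        DataSystem system = SYSTEMS.get(type);
        if (system == null) {
            throw new IllegalArgumentException("unknown data system: " + type);
        }
        try (InputStream in = system.getData(identifier, endPoints)) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fetch("ecogrid", "pkg.1", "http://example.org/ecogrid"));
    }
}
```

Because callers only see the abstract getData method, adding a new data source would mean adding one subclass and leaving the calling code unchanged.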

Certificate authority: Create a single centralized certificate authority that provides a shared infrastructure for accessing and maintaining the certificate authorities (CAs) of the different sites, e.g., the GEON portal's CA and the CAs of the different SEEK sites. Follow up with Karan for more information about the Grid Account Management Architecture (GAMA) used for GEON portal authentication.

Unified web service access to the datasources:
- To let clients other than the Kepler interface access the various datasources.
- Web service access to the datasources with no additional requirements (such as registration). Communities can benefit from accessing each other's datasources directly.
The GEON and SEEK datasource access architectures are very similar. A follow-up meeting on consolidating datasource access through a unified web service is needed with Kai, Ashraf, Karan, Sandeep, Efrat, Chaitan from GEON and folks from SEEK.

Unified query for data sources in Kepler: either add more datasource query classes (besides EMLDataSource and DCDataSource) or, once unified web service access exists, use a generic web service actor to query all the data sources.

Comments are welcome,