[kepler-dev] [eml-dev] Question about EML-based file access in Kepler

Matthew Jones jones at nceas.ucsb.edu
Tue Mar 18 14:06:45 PDT 2008


I agree, I don't think Kepler handles this, but in my opinion that is a 
bug.  It should distinguish these URL types.  The intention of the 
"function" attribute in EML was to handle exactly what Wade is trying to 
do, so Kepler should look for it and only really try to parse and 
download data from 'download' URLs.  If a "function" attribute has not 
been provided on the URL, then maybe it should try to download it as 
well, but that is open to discussion.  I've been looking for the query 
specification in Kepler -- but to no avail.  Any idea how hard this 
would be to implement in our query, Jing?
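
As a rough illustration of the behavior described above, here is a minimal 
Java sketch that reads an EML document and keeps only the online/url values 
whose "function" attribute is "download". The class and method names 
(EmlUrlFilter, downloadUrls) are hypothetical and are not taken from the 
Kepler code base. Treating a missing "function" attribute as "download" is 
an assumption, since that case is still open to discussion.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Kepler: selects downloadable data URLs from EML.
final class EmlUrlFilter {

    /** Returns the text of each online/url element whose function is "download". */
    static List<String> downloadUrls(String emlXml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            emlXml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse EML document", e);
        }

        List<String> result = new ArrayList<>();
        NodeList urls = doc.getElementsByTagName("url");
        for (int i = 0; i < urls.getLength(); i++) {
            Element url = (Element) urls.item(i);
            // Only consider url elements that sit directly under an online element.
            if (!"online".equals(url.getParentNode().getNodeName())) {
                continue;
            }
            // Assumption: a url without a function attribute is treated as "download".
            String function = url.hasAttribute("function")
                    ? url.getAttribute("function")
                    : "download";
            if ("download".equals(function)) {
                result.add(url.getTextContent().trim());
            }
        }
        return result;
    }
}
```

With this approach, a url marked function="information" (such as a 
registration or data request page) is skipped, and only the remaining 
urls would be handed to whatever code fetches and parses the data.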

Matt

Jing Tao wrote:
> Hi, Wade:
> 
> Based on my knowledge, I don't think Kepler distinguishes between the 
> "information" and "download" values of the "function" attribute. It will 
> grab the content of the given url.
> 
> Hope this is helpful.
> 
> Jing
> 
> Jing Tao
> National Center for Ecological
> Analysis and Synthesis (NCEAS)
> 735 State St. Suite 204
> Santa Barbara, CA 93101
> 
> On Tue, 18 Mar 2008, Wade Sheldon wrote:
> 
>> Hi Matt,
>>
>> I'm in the process of rolling out a new GCE website so I've been 
>> reviewing and updating web application code for xml/xhtml 
>> compatibility, etc. As part of this process I'm also making some minor 
>> changes to the GCE EML implementation, including how data access urls 
>> are encoded for data sets that aren't yet publicly downloadable. I 
>> just wanted to run these changes by you to check for potential impact 
>> on Kepler users accessing our docs via Metacat.
>>
>> In our original implementation I omitted the 
>> dataTable/physical/distribution node entirely for unreleased data 
>> sets, but as a consequence users viewing an outdated metadata document 
>> would not easily be able to find the data object after it becomes 
>> publicly accessible. This is particularly an issue for the EcoTrends 
>> project, because we're providing pre-release data and EML for the 
>> static web page and book they are producing, and the legacy metadata 
>> will be retained and potentially accessed in the future (i.e. outside 
>> of Metacat).
>>
>> In the new implementation, I will still include direct pass-through 
>> links to data objects in EML in Metacat for public data sets, but I 
>> will now include urls for private datasets as well. These private data 
>> urls will point to a web page that will either allow the user to 
>> register and download the data after it is public, or will inform them 
>> of the private status and allow them to fill out a form to request the 
>> data in advance of the release date. In order to distinguish between 
>> these different endpoints I am explicitly setting the 
>> distribution/online/url function attribute to "download" or 
>> "information" as appropriate for data or a web page.
>>
>> My question for you is how does Kepler handle dataTable distribution 
>> urls in EML with the function="information" attribute? Because I 
>> differentially generate EML for Metacat I could revert to the old 
>> practice to prevent problems, but I'd prefer to use the same approach 
>> for both GCE-centric and KNB-centric metadata to prevent confusion.
>>
>> Here's a link to an example document with the new implementation for a 
>> private data set:
>> http://gce-nas.marsci.uga.edu/public/app/send_eml.asp?detail=full&missing=NaN&delimiter=tab&metacat=yes&accession=INV-GCEM-0705c2 
>>
>>
>> Thanks in advance for any input.
>>
>> -Wade Sheldon
>>
>>
>> -- 
>> ______________________________________________________________________________ 
>>
>>
>> Wade M. Sheldon
>> GCE-LTER Information Manager/SIMO Database Administrator
>> School of Marine Programs
>> University of Georgia
>> Athens, GA 30602-3636
>> Email: sheldon at uga.edu
>> WWW: 
>> http://gce-lter.marsci.uga.edu/public/app/personnel_bios.asp?id=wsheldon
>>
> 

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Matthew B. Jones
Director of Informatics Research and Development
National Center for Ecological Analysis and Synthesis (NCEAS)
UC Santa Barbara
jones at nceas.ucsb.edu                       Ph: 1-907-523-1960
http://www.nceas.ucsb.edu/ecoinfo
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

