distribution element issues
Peter McCartney
peter.mccartney at asu.edu
Wed May 15 16:10:36 PDT 2002
Well, this was the very reason why I proposed providing a choice of parameter
models that were specific to schemes. I appreciate the ambiguity of your
"database" example, but only because I don't recognize the scheme. By asking
you to pick the scheme (MS SQL Server) from a controlled list that is
documented in EML, I could then force you to enter version=7.0,
host=maricopa, port=1433, networkProtocol=named pipes, database=arthropods.
There would be no ambiguity as to semantics, and you would not have to know
the exact syntax of how to build the URL string for whatever driver I wish
to have available to use.
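
A rough sketch of what such a scheme-specific parameter model could look like, using the values from the example above. The scheme names, the field lists, and the CONTROLLED_SCHEMES table are assumptions for illustration only; nothing here is defined by EML.

```python
# Hypothetical controlled list: scheme name -> parameters that scheme requires.
# These names are illustrative and not taken from any EML schema.
CONTROLLED_SCHEMES = {
    "MS SQL Server": ("version", "host", "port", "networkProtocol", "database"),
    "FTP": ("host", "path", "filename"),
}


def validate_connection(scheme, params):
    """Return params unchanged if they match the scheme's parameter model."""
    if scheme not in CONTROLLED_SCHEMES:
        raise ValueError(f"unknown scheme: {scheme!r}")
    required = CONTROLLED_SCHEMES[scheme]
    missing = [name for name in required if name not in params]
    extra = [name for name in params if name not in required]
    if missing or extra:
        raise ValueError(f"{scheme}: missing={missing}, unexpected={extra}")
    return params


# The connection from the example: every value has an unambiguous meaning
# because the scheme fixes which parameters exist.
arthropods = validate_connection(
    "MS SQL Server",
    {
        "version": "7.0",
        "host": "maricopa",
        "port": 1433,
        "networkProtocol": "named pipes",
        "database": "arthropods",
    },
)
```

An unrecognized scheme such as "database" is rejected outright, which is the point of picking from a controlled list.
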
To blow this problem off in favor of just using URLs I think puts us
(almost) back where we were a year ago. I do see more utility in URLs after
our discussion of providing URLs for a specific driver. But there is still a
problem for users who want to use my data and neither have that particular
driver nor know how to rewrite the URL string into an equivalent one for an
alternate driver (although I can mitigate that somewhat by trying to provide
URLs for as many different connection protocols as I can
anticipate). There is also a problem in that I do not see evidence that ALL
online connections can indeed be described by a structured URL string (I
can't find one for an SDE connection, although I did find one for modem
dialups). I'm also not convinced that URLs carry enough information. If I
give you file://maricopa.asu.edu/proj/lter/filename.txt, it's a crapshoot
whether it will work for you because I haven't told you that maricopa.asu.edu
is an NT server located in the LTER domain. Similarly, with JDBC there is a
keyword for the driver in the URL string, but JDBC isn't smart enough to
parse the URL and figure out what driver to use - you need to separately
provide the class name of the driver that the URL is for. Finally, I really
question whether users can be expected to know the proper structure for
providing a URL string for most service connections - we will have to
provide wizards to help them with that. Those wizards will have to be based
on content models of parameters for each known scheme, so why the heck don't
we make them part of EML in the first place?
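
To make the "not enough information" point concrete, here is a small sketch of what a generic parser can recover from the file URL above. It uses Python's standard urllib.parse; the field names in not_in_url are hypothetical and only stand for the details mentioned in the text.

```python
from urllib.parse import urlsplit

parts = urlsplit("file://maricopa.asu.edu/proj/lter/filename.txt")

# Everything the URL string itself carries.
recovered = {
    "scheme": parts.scheme,
    "host": parts.netloc,
    "path": parts.path,
}

# What a user would also need to know, none of which is encoded in the URL.
# These field names are made up for illustration.
not_in_url = {"serverOS": "NT", "domain": "LTER"}

print(recovered)
```

The parser yields a scheme, a host, and a path, and nothing about the server type or the network domain the host lives in.
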
Part of the problem I think we're having here is the difference between
connection info we share with the world versus connection information we want
to use locally. I need a metadata format that allows me to generate a
display in our data catalog for local users (or my local web application) to
know how to find a file while they are sitting in the lab (e.g. network
protocol: MS windows networking, domain: LTER, server: maricopa, folder:
proj\lter\po10\, filename: xxxx). Perhaps one solution is to make URL a
required connection type but provide some form of parameter model as an
option. Editors could generate the URL version from the parameters, but the
parameters would remain in the metadata for local applications.
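
A minimal sketch of that idea, assuming an editor holds the local parameters listed above and derives a file URL from them. The dictionary keys and the to_file_url helper are hypothetical; "xxxx" is the placeholder filename from the example, left as is.

```python
from urllib.parse import quote

# Local parameters, kept in the metadata for local applications.
local_params = {
    "networkProtocol": "MS windows networking",
    "domain": "LTER",
    "server": "maricopa",
    "folder": "proj\\lter\\po10\\",
    "filename": "xxxx",  # placeholder, as in the message
}


def to_file_url(params):
    """Derive a file URL from the server, folder, and filename parameters."""
    folder = params["folder"].replace("\\", "/").strip("/")
    path = "/".join(part for part in (folder, params["filename"]) if part)
    return f"file://{params['server']}/{quote(path)}"


public_url = to_file_url(local_params)
print(public_url)
```

Note that the network protocol and domain have no place in the generated URL, which is why the parameters would still need to be kept alongside it.
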
I certainly agree that if there is an unambiguous way of describing a
connection as a URL, that should be preferred. But I'm pretty sure that if this is
the only way of defining a connection in EML, many sites using server
connections or local file system addresses (myself included) will wind up
extending EML with their own locally defined connection description schemas
to solve some of the problems I mention above. If I'm on my own on this, then
I'm likely to just locally use my original content models for each kind of
connection scheme we use at CAP and simply build URLs in XSL when generating
valid EML documents. Now maybe this isn't so bad if I am not inclined to show
that detailed info to the public anyway. I guess it all depends on how much
we want EML to set standards for managing metadata at the internal site
level, but I see some advantages to a solution that is itself part of EML so
that we don't have a bazillion different solutions to the same problem.
Before we drop this, has anyone looked at how the SRB MCAT stores connection
information? It seems like it has a similar problem in having to deal with a
lot of different kinds of connections. Does it manage to do all this with a
single URL field?
On a totally separate note, I like the idea of token substitutions for
defining URLs in such a way that they can be used more generically - this
neatly allows you to define the host and path of an FTP connection once, and
then substitute the filename for datasets that have several files on one FTP
site. So I say add that feature, regardless of how we resolve the
URL/parameter debate.
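
For illustration, token substitution along these lines could be sketched with Python's string.Template. The $host/$path/$filename token syntax, the host name, and the file names are all assumptions, not a proposed EML syntax.

```python
from string import Template

# Host and path are defined once; only the filename varies per file.
# ftp.example.org and the file names are made-up values.
ftp_template = Template("ftp://$host/$path/$filename")
site = {"host": "ftp.example.org", "path": "pub/data"}

urls = [
    ftp_template.substitute(site, filename=name)
    for name in ("site1.txt", "site2.txt")
]

for url in urls:
    print(url)
```
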
But this feature raises another question. For web apps that don't expose their
form parameters in the URL via GET, the token substitution trick still won't
help us automate running these applications. How do we reference an online
application for which further interactive user input cannot be avoided in
order to get the data? Do we enter these under "connections" or is an
onlineApplicationURL different from an onlineURL?
Peter McCartney (peter.mccartney at asu.edu)
Center for Environmental Studies
Arizona State University
480-965-6791
-----Original Message-----
From: Chad Berkley [mailto:berkley at nceas.ucsb.edu]
Sent: Wednesday, May 15, 2002 1:35 PM
To: Matt Jones
Cc: eml-dev at ecoinformatics.org
Subject: Re: distribution element issues
I think we should eliminate the parameters altogether. I don't see the
point of them since all of the information that they can encode can be
more precisely encoded in a URL.
chad
On Wed, 2002-05-15 at 12:58, Matt Jones wrote:
> Hey --
>
> I pointed out some problems with the "distribution" element that I am
> trying to resolve in my second comment on bug 480:
> http://bugzilla.ecoinformatics.org/show_bug.cgi?id=480#c2
>
> I could really use some feedback on this to see what others think before
> I finalize the changes. This is a plea for help! Thanks.
>
> Matt
>
> --
> *******************************************************************
> Matt Jones jones at nceas.ucsb.edu
> http://www.nceas.ucsb.edu/ Fax: 425-920-2439 Ph: 907-789-0496
> National Center for Ecological Analysis and Synthesis (NCEAS)
>
> Interested in ecological informatics? http://www.ecoinformatics.org
> *******************************************************************
>
> _______________________________________________
> eml-dev mailing list
> eml-dev at ecoinformatics.org
> http://www.ecoinformatics.org/mailman/listinfo/eml-dev