'In-Line' data in EML2 using base64 encoding
Peter McCartney
peter.mccartney at asu.edu
Wed May 28 11:46:22 PDT 2003
That's right... it's a package-relative address that assumes the eml file and
data file are in the same package. If you separate them you're on your own,
but then you probably know where they are and don't need the eml to tell
you.
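
As a rough illustration of the package-relative idea, the sketch below resolves
the value found in online/url against the directory that holds the eml file,
not against the current working directory. The function name, the file names,
and the check that rejects references leaving the package are assumptions made
for this example; nothing here is part of EML itself.

```python
from pathlib import Path


def resolve_data_path(eml_path, relative_url):
    """Resolve a package-relative data reference against the EML file's folder."""
    eml_dir = Path(eml_path).resolve().parent
    candidate = (eml_dir / relative_url).resolve()
    # Assumption for this sketch: a reference that climbs out of the package
    # directory (for example "../other/data.csv") is treated as an error.
    if candidate != eml_dir and eml_dir not in candidate.parents:
        raise ValueError(f"{relative_url!r} points outside the package")
    return candidate
```

Called as resolve_data_path("unzipped/pkg/metadata.xml", "data.csv"), this
returns the absolute path of data.csv next to the metadata file, whatever the
working directory happens to be. It cannot help once the data file has been
moved out of the package.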
We don't use the zip solution for communicating between services. For that
we use a drop-box - a system-visible location that is pointed to in the EML
that is embedded in the xylopia reply. We originally made it optional to
either use a drop box or encode the data in xml, but based on our first
experience with writing a SOAP message for SDSC's climdb system and running
out of memory trying to populate the dom, we quickly dropped the inline XML
option! So I agree - for inlining the data, some sort of encoding of the
native schema would be better.
But the problem I have is how do you decide when the data are small enough
to inline versus package separately? And how do you warn the recipient how
much they are about to get so that they can determine whether they can
swallow that much in an XML message? If the xml file is too big, their
parser will probably blow up before they get to the part that tells them the
physical size of the stream.
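
One possible way to frame both questions is sketched below. The sender
estimates the base64-encoded size up front (base64 emits 4 characters for every
3 input bytes, rounded up) and compares it with a cutoff. The recipient uses an
incremental parser to read a declared size before the payload arrives. The
1,000,000-byte cutoff, the function names, and the assumption that a <size>
element precedes the inline data in the stream are all illustrative choices,
not anything the EML spec prescribes.

```python
import os
import xml.etree.ElementTree as ET

INLINE_LIMIT_BYTES = 1_000_000  # hypothetical cutoff; tune for the recipient


def base64_size(raw_size):
    """Length of the base64 text for raw_size bytes (padded, no line breaks)."""
    return 4 * ((raw_size + 2) // 3)


def choose_distribution(data_path, limit=INLINE_LIMIT_BYTES):
    """Return "inline" if the encoded file fits under the limit, else "separate"."""
    encoded = base64_size(os.path.getsize(data_path))
    return "inline" if encoded <= limit else "separate"


def declared_size(xml_chunks):
    """Read the first <size> element from a stream of XML text chunks.

    Chunks are consumed lazily, so nothing after the chunk that closes
    <size> is pulled from the source or parsed.
    """
    parser = ET.XMLPullParser(events=("end",))
    for chunk in xml_chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            if elem.tag == "size":
                return int(elem.text)
    return None
```

This only helps if the size is written ahead of the data and the recipient
does not build a full DOM first, which is the failure described above.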
Peter McCartney (peter.mccartney at asu.edu)
Center for Environmental-Studies
Arizona State University
-----Original Message-----
From: Dan Higgins [mailto:higgins at nceas.ucsb.edu]
Sent: Tuesday, May 27, 2003 4:17 PM
To: Peter McCartney
Cc: Eml-Dev (eml-dev at ecoinformatics.org); Christopher Jones
Subject: Re: 'In-Line' data in EML2 using base64 encoding
Peter,
There is nothing bad about your suggestion, but it is only one of a
variety of ways data can be 'related' to the metadata in eml2. Since we
have an 'in-line' option, we need to figure out how to handle it.
There are some difficulties with your zip file method. The url reference
in the 'online/url' would probably have to be a relative file url since
one cannot know the complete path on a new machine. Presumably the
relative location is relative to the eml metadata file, but that may not
be the current working directory. And, once unzipped, a user can always
move the data file and mess up the relative file url. Also, I don't
think zip files can be directly added to a SOAP container if we go that
way in SEEK.
Dan
---
Peter McCartney wrote:
> What's so bad about putting it in a separate file and sending both in a
> zip package? just put the filename in the distribution/online/url element.
>
> Peter McCartney (peter.mccartney at asu.edu)
> Center for Environmental-Studies
> Arizona State University
>
>
> -----Original Message-----
> From: Dan Higgins [mailto:higgins at nceas.ucsb.edu]
> Sent: Friday, May 23, 2003 10:01 AM
> To: Eml-Dev (eml-dev at ecoinformatics.org); Christopher Jones
> Subject: 'In-Line' data in EML2 using base64 encoding
>
> Hi All,
>
> We at NCEAS have had several discussions about how to 'in-line'
> data in eml2 documents. One method is to encode arbitrary data in
> base64 format. The EML2 spec even says "encode the data using a
> text encoding algorithm such as base64, and then include that in a
> CDATA section" in the 'inline' documentation.
>
> In looking into the subject, I discovered the reference below
> which points out that base64 encoded data DOES NOT need to be
> inside a CDATA section since it contains no special XML
> characters!
>
> Dan
> -------------------------
>
>
> from: http://www.xml.com/axml/notes/CDprob.html
>
>
> CDATA Sections and Binary Data
>
> A lot of people would like a way to package up any old binary data
> and include it in an XML file. The conventional XML answer to this
> would be to store it separately and point at it with an unparsed
> entity <http://www.xml.com/axml/target.html#dt-unparsed>. Which is
> fine, but that's not what people want; they want to include the
> data right in the file, which is a reasonable way to go if you're
> going to transmit it over the network.
>
> When you look at CDATA, you might get the impression that you
> could maybe jam your binary data in a CDATA section. You'd be
> right, but you'd have to guarantee that it never included a byte
> sequence that looks like ]]>. There is a trick you can use to get
> around that, but it's awkward:
>
> <![CDATA[Use *two* CDATA sections when you need to embed a
> "]]]]><![CDATA[>" in the data ]]>
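
The two-section trick quoted above can be sketched in a few lines. The helper
name to_cdata is made up for this example. It applies to text only: raw binary
can contain bytes, such as NUL, that are not legal anywhere in an XML document,
so CDATA alone does not make arbitrary binary safe.

```python
import xml.etree.ElementTree as ET


def to_cdata(text):
    """Wrap text in CDATA, splitting every "]]>" across two sections."""
    # "]]" closes out one section and ">" opens the next, so the
    # terminator sequence never appears intact inside a section.
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


# A parser joins the adjacent sections back into the original text.
sample = "a ]]> b"
assert ET.fromstring("<d>" + to_cdata(sample) + "</d>").text == sample
```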
>
> Another way to go would be to encode the binary data in base64 or
> some other technique that's guaranteed never to contain a <; but
> if you're going to do that, you don't need a CDATA section; any
> old element would do. Perhaps this is a good use for XML's
> notation attributes <http://www.xml.com/axml/target.html#notatn>.
>
> ----------------------
> from: http://lists.xml.org/archives/xml-dev/199910/msg00388.html
>
>>As Tim says in his annotation, if you use Base64 then you don't need
>>the CDATA section as _none_ of the XML reserved chars appear in the
>>Base64 character set (A-Za-z0-9/+).
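
A minimal sketch of the point made in both references: base64 text can sit
directly in an ordinary element with no CDATA section. The element name
"inline" follows the EML2 wording, while the helper names are invented for this
example. The "=" padding character, which the quoted character set leaves out,
is not an XML markup character either.

```python
import base64
import xml.etree.ElementTree as ET


def encode_inline(data):
    """Return an <inline> element, serialized as text, holding data as base64."""
    elem = ET.Element("inline")
    elem.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(elem, encoding="unicode")


def decode_inline(xml_text):
    """Recover the original bytes from the text produced by encode_inline."""
    return base64.b64decode(ET.fromstring(xml_text).text or "")


# Every byte value, including "<" and "&", survives the round trip, and
# the encoded text contains no XML markup characters.
payload = bytes(range(256))
assert not set(base64.b64encode(payload).decode("ascii")) & set("<>&")
assert decode_inline(encode_inline(payload)) == payload
```

The whole payload is held in memory here, so this suits small data only.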
>
>
>
>--
>*******************************************************************
>Dan Higgins higgins at nceas.ucsb.edu
>http://www.nceas.ucsb.edu/ Ph: 805-892-2531
>National Center for Ecological Analysis and Synthesis (NCEAS)
>735 State Street - Room 205
>Santa Barbara, CA 93195
>*******************************************************************
>