'In-Line' data in EML2 using base64 encoding

Dan Higgins higgins at nceas.ucsb.edu
Fri May 23 10:01:09 PDT 2003


Hi All,
 
We at NCEAS have had several discussions about how to 'in-line' data in 
EML2 documents. One method is to encode arbitrary data in base64 format. 
The EML2 spec even says "encode the data using a text encoding algorithm 
such as base64, and then include that in a CDATA section" in the 
'inline' documentation.

In looking into the subject, I found the two references below, which 
point out that base64 encoded data DOES NOT need to be inside a CDATA 
section, since it contains no special XML characters!
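
As a rough illustration of the point (a minimal sketch, not taken from the 
EML2 spec or from either reference), the Python below base64-encodes some 
bytes that deliberately look like XML markup, drops the result into a 
plain element with no CDATA and no escaping, and reads it back. The 
wrapper element name "inline" and the sample payload are assumptions made 
up for the example.

```python
import base64
import xml.etree.ElementTree as ET

# Arbitrary binary payload, including bytes that look like XML markup
# and a CDATA terminator. (Sample data invented for this sketch.)
raw = b"\x00\x01<tag>&amp;]]>\xff\xfe"

# base64 output is plain ASCII.
encoded = base64.b64encode(raw).decode("ascii")

# Build the document by simple string concatenation: no CDATA, no escaping.
# "inline" is a hypothetical wrapper element used only for illustration.
doc = "<inline>" + encoded + "</inline>"

# A standard XML parser accepts the document and returns the text unchanged.
root = ET.fromstring(doc)
decoded = base64.b64decode(root.text)

print(decoded == raw)
```

If the parse and the comparison both succeed, the encoded text survived 
the trip through the XML parser without any CDATA wrapper.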

Dan
-------------------------


from: http://www.xml.com/axml/notes/CDprob.html

CDATA Sections and Binary Data

A lot of people would like a way to package up any old binary data and 
include it in an XML file. The conventional XML answer to this would be 
to store it separately and point at it with an unparsed entity 
<http://www.xml.com/axml/target.html#dt-unparsed>. Which is fine, but 
that's not what people want; they want to include the data right in the 
file, which is a reasonable way to go if you're going to transmit it 
over the network.

When you look at CDATA, you might get the impression that you could 
maybe jam your binary data in a CDATA section. You'd be right, but you'd 
have to guarantee that it never included a byte sequence that looks like 
]]>. There is a trick you can use to get around that, but it's awkward:

<![CDATA[Use *two* CDATA sections when you need to embed a 
"]]]]><![CDATA[>" in the data ]]>
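
The two-section trick above can be sketched in Python as follows. This is 
only an illustration under stated assumptions: the helper name to_cdata 
and the wrapper element "data" are hypothetical, and the sample is 
ordinary text, since XML 1.0 does not allow most control characters even 
inside a CDATA section.

```python
import xml.etree.ElementTree as ET

def to_cdata(text: str) -> str:
    """Wrap text in CDATA, splitting every "]]>" across two sections.

    Hypothetical helper for illustration: "]]" closes one section and
    ">" opens the next, so the terminator never appears intact.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

sample = "a ]]> b"
wrapped = to_cdata(sample)

# "data" is a made-up wrapper element for the example.
root = ET.fromstring("<data>" + wrapped + "</data>")

# The parser joins the adjacent CDATA sections back into the original text.
print(root.text == sample)
```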

Another way to go would be to encode the binary data in base64 or some 
other technique that's guaranteed never to contain a <; but if you're 
going to do that, you don't need a CDATA section; any old element would 
do. Perhaps this is a good use for XML's notation attributes 
<http://www.xml.com/axml/target.html#notatn>.

----------------------
from: http://lists.xml.org/archives/xml-dev/199910/msg00388.html

>As Tim says in his annotation, if you use Base64 then you don't need the
>CDATA section as _none_ of the XML reserved chars appear in the Base64
>character set (A-Za-z0-9/+).


-- 
*******************************************************************
Dan Higgins                                  higgins at nceas.ucsb.edu
http://www.nceas.ucsb.edu/    Ph: 805-892-2531
National Center for Ecological Analysis and Synthesis (NCEAS) 
735 State Street - Room 205
Santa Barbara, CA 93195
*******************************************************************
