Re: [xml] OS/390 compatibility (EBCDIC)



Thank you for your comments.

I found the first problem: the correct code page must be used to
interpret the header. Patch attached.

But libxml still does not work, and cannot work, in an EBCDIC
environment. It converts the stream to UTF-8 and then parses it by
comparing against native character literals.

For example:
---
if (c == 'a')
---

This will not work, because 'a' is compiled as EBCDIC while c holds
UTF-8. On ASCII-based platforms the two values coincide, but EBCDIC
assigns these characters different values than UTF-8 (latin1) does.

I tried #pragma convert("ISO8859-1") and also the
-qconvlit=ISO8859-1 compiler option, but both have too wide an effect
to solve this.

A correct solution is to use:
#define UTF8_CHARACTER_A '\x41'
#define UTF8_CHARACTER_GT '\x3e'

And use these in the parsers.
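
To illustrate the idea, here is a minimal sketch of a parser helper
that compares UTF-8 bytes against numeric code points instead of
native literals. All names (XML_CP_*, xml_is_lower, xml_find_lt) are
hypothetical and are not taken from libxml2; plain integer constants
are used here, which is one possible variation on the '\x..' defines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants: code points written as numbers, so their
 * value does not depend on the compiler's execution character set. */
#define XML_CP_LT      0x3C  /* '<' */
#define XML_CP_GT      0x3E  /* '>' */
#define XML_CP_UPPER_A 0x41  /* 'A' */
#define XML_CP_LOWER_A 0x61  /* 'a' */
#define XML_CP_LOWER_Z 0x7A  /* 'z' */

/* Returns 1 if the UTF-8 byte c is an ASCII lowercase letter, else 0.
 * The range test is valid because a-z are contiguous in ASCII/UTF-8
 * (they are not contiguous in EBCDIC). */
static int xml_is_lower(unsigned char c)
{
    return c >= XML_CP_LOWER_A && c <= XML_CP_LOWER_Z;
}

/* Returns the offset of the first '<' in a UTF-8 buffer, or -1. */
static long xml_find_lt(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == XML_CP_LT)
            return (long)i;
    }
    return -1;
}
```

Because nothing above is a character literal, the comparisons should
give the same result whether the compiler's native character set is
EBCDIC or ASCII.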

Any other solution is welcome. But it looks like I need to find an
alternative to libxml2 so that I can do cross-platform XML.

Thanks,
Alon.

On Wed, May 12, 2010 at 11:34 AM, Tim Van Holder
<tim vanholder anubex com> wrote:
On 2010-05-11 23:05, Roumen Petrov wrote:
Hi, Alon,
Alon Bar-Lev wrote:
Hello,

Trying to compile on OS/390, it should work, right?

I am getting:
---
$ ../../dep/bin/xmllint --version
../../dep/bin/xmllint: using libxml version 20707
  compiled with: Iconv ISO8859X
So far so good, but what about iconv?
Is it from GNU libc or the external (standalone) libiconv? The second
didn't support EBCDIC{xxx}.

I remember building libxml on OMVS (on z/OS 1.8?) quite some time ago,
and it mostly worked. I do remember making some portability changes,
possibly to things like the IS_SPACE stuff Roumen mentions below.
I'm pretty sure OMVS had iconv, but maybe the encoding names it supports
do not include "ebcdic" (after all "ebcdic" is ambiguous - there are
several EBCDIC codepages).
Try "iconv -l" to get a list of valid encoding names.

Based on http://itc.virginia.edu/mss/OpenSSH.html it looks like you may
have more luck specifying "IBM-1047" as encoding name.

$ ../../dep/bin/xmllint sample.xml
sample.xml:1: parser error : encoding not supported EBCDIC
<?xml version="1.0" encoding="ebcdic"?>

So if libiconv supports EBCDIC, then I have no idea.

Another point is that libxml contains some macros similar to IS_SPACE
that check for ASCII codes, and this may break something in the
library. No idea how to find all of them. Maybe with a command like
"grep -i space.*32" or "grep -i space.*20", and then look near the
lines found for other macros.

^
sample.xml:1: parser error : Input is not proper UTF-8, indicate
encoding !
Bytes: 0xA7 0x94 0x93 0x40
<?xml version="1.0" encoding="ebcdic"?>
---
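
The bytes in that error look like EBCDIC text reaching the UTF-8
parser unconverted. As a rough check, the sketch below decodes them
with a partial EBCDIC table (space and lowercase letters only, which
are the same in the common code pages such as IBM-037 and IBM-1047).
The function names are hypothetical and the table is deliberately
incomplete.

```c
#include <assert.h>
#include <stddef.h>

/* Maps one EBCDIC byte to its ASCII value. Only space and a-z are
 * handled; every other byte becomes '?' (0x3F). Results are numeric
 * so the code behaves the same on EBCDIC and ASCII compilers. */
static char ebcdic_to_ascii(unsigned char b)
{
    if (b == 0x40)
        return 0x20;                            /* space */
    if (b >= 0x81 && b <= 0x89)
        return (char)(0x61 + (b - 0x81));       /* a-i */
    if (b >= 0x91 && b <= 0x99)
        return (char)(0x6A + (b - 0x91));       /* j-r */
    if (b >= 0xA2 && b <= 0xA9)
        return (char)(0x73 + (b - 0xA2));       /* s-z */
    return 0x3F;                                /* unknown */
}

/* Decodes n EBCDIC bytes into out, which must hold n + 1 chars. */
static void ebcdic_decode(const unsigned char *in, size_t n, char *out)
{
    size_t i;

    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++)
        out[i] = ebcdic_to_ascii(in[i]);
    out[n] = 0;
}
```

Feeding it 0xA7 0x94 0x93 0x40 gives the ASCII bytes for "xml "
followed by a space, i.e. the text right after "<?" in the
declaration. The two preceding bytes would presumably be 0x4C 0x6F
("<?" in EBCDIC), which happen to be valid ASCII, so 0xA7 is likely
just the first byte the UTF-8 check rejects.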

Attachment: libxml2-2.7.7-ebcdic.patch
Description: Binary data


