[ipac] Illegal XML char in items out?

Natasha Stephan stephann at lindahall.org
Mon Mar 26 12:21:43 EDT 2007


Hi Jonathan,
I suspect the "illegal XML character" is in the bibliographic record
whose citation is stored in that patron's HIP list.  We have not had
this exact problem, but we have had other issues with HIP where there
are incorrect codes for diacritics lurking in our database.  We get an
XSLT error when attempting to look at the brief or full bib.
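
One way to test that suspicion outside HIP is to scan the text of a suspect record for characters that XML 1.0 does not allow. The following is a minimal Python sketch, not anything shipped with HIP or Saxon: the function name and the idea of feeding it text copied from a record are assumptions for illustration. The character ranges are those of the XML 1.0 "Char" production.

```python
import re

# Anything outside the XML 1.0 "Char" production is illegal in a document:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_illegal_xml_chars(text):
    """Return (offset, 'U+XXXX') pairs for each character XML 1.0 forbids.

    Hypothetical helper: 'text' is whatever field content you have
    exported or pasted from the suspect bibliographic record.
    """
    return [
        (match.start(), "U+%04X" % ord(match.group()))
        for match in ILLEGAL_XML_CHARS.finditer(text)
    ]
```

A control character such as U+0006, the one named in the Saxon error quoted below, would be reported together with its offset in the string, which may help narrow down which field to repair.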

We found that there were hundreds of records in our catalog containing
bad codes (perhaps from a bad tape load way back when).  Not all had the
problem in the title or author where it caused HIP to throw up an error
- bad codes were found in the physical description and in the notes.
There were several varieties of incorrect codes, but all began with
"<U+008".  Each required a different replacement character.  Cataloging
staff were asked to fix problem bibliographic records individually.
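
Since each variety needs its own replacement, it can help to tally which codes occur in which MARC tags before handing records to cataloging staff. Below is a rough Python sketch under stated assumptions: the rows are (bib number, tag, text) tuples exported by some means of your own, and each bad code is assumed to run from "<U+008" up to a closing ">" if one is present. Neither the export format nor the closing ">" is confirmed for every system.

```python
import re
from collections import Counter

# Assumption: a bad code starts with the literal text "<U+008" and runs
# up to the next ">" (if any). Adjust the pattern to match your own data.
BAD_CODE = re.compile(r"<U\+008[^>\s]*>?")


def tally_bad_codes(rows):
    """Count occurrences of each bad code per tag.

    'rows' is a hypothetical iterable of (bib_number, tag, text) tuples,
    for example the output of a query against the bib table saved to a file.
    """
    counts = Counter()
    for _bib_number, tag, text in rows:
        for code in BAD_CODE.findall(text):
            counts[(tag, code)] += 1
    return counts
```

Sorting the resulting counter by tag gives a quick picture of whether the damage sits mostly in titles and authors, where HIP fails visibly, or in notes and physical descriptions.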

Here is the relevant portion of the SQL query we used to find those
records.  You'll probably want to change "<U+008" to something specific
to your situation.  And your mileage may vary... you may want to inquire
of SirsiDynix whether there is a better solution.

/*
Run this script to find records containing diacritics or non-keyboard
characters with invalid UCS/Unicode.  Fix by entering ALA code or
significant digits of Unicode.
Certain characters may not print in results; replacements vary.
*/
SELECT distinct b1.bib#, b2.text, b1.tag, b1.text, i.location
FROM bib b1, bib b2, item i
WHERE b1.bib# = b2.bib#
AND b1.bib# = i.bib#
/* Use this to find char stored as UCS/Unicode text */
/* enter the text in the sequence as below */
AND b1.text like "%<U+008%"

/* Use this to get OCLC#, if 001 exists */
AND b2.tag="001"

/* Use this to test */
AND b1.bib# < 5000

Good luck,
Natasha

> -----Original Message-----
> From: ipac-bounces at lists.tblc.org [mailto:ipac-bounces at lists.tblc.org] On Behalf Of Jonathan Rochkind
> Sent: Monday, March 26, 2007 10:43 AM
> To: Dynix's Horizon Information Portal,formerly iPac (discussion)
> Subject: [ipac] Illegal XML char in items out?
> 
> When a particular patron clicks on their 'items out' link in HIP (3.08),
> they get only the message "There is a problem with the XSLT. Check XSL
> Server for errors."
> 
> I have now reproduced this in my test server so I can try and debug.
> Checking the xsl.log tells me:
> 
> Mar 26 10:36:07 [XSLProcessor:transformUsingSaxon]
> javax.xml.transform.TransformerException: org.xml.sax.SAXParseException: illegal XML character U+6
> 
> 
> Hmm, is this a char that Saxon doesn't like in my actual item title or
> something? I'm going to look at the actual items to try and identify it.
> 
> But has anyone run into this before? Any workaround?
> 
> I need to find a pristine copy of whatever the relevant XML file is, to
> put in and make sure the error is reproducible with that, instead of our
> customized searchresponse.xml, before reporting this to SD. I have no
> idea what is the relevant XML file, or if I even still have it around.
> Sigh. If anyone wants to send me a .zip'd copy of pristine HIP 3.08 xml/
> directory, that would be very helpful.
> 
> Jonathan


