[ipac] HIP keyword indexes; accessing directly?
Tod Olson
tod at uchicago.edu
Wed Jun 13 14:59:25 EDT 2007
On Jun 13, 2007, at 10:57 AM, Jonathan Rochkind wrote:
> Hey, anyone know what format the HIP keyword indexes are stored in? Are
> they in interbase/firebird, or are they in a proprietary format?
ProIndex is, I believe, the name of the full-text indexing software
that runs the keyword indexes.
> I am writing some software that wants to, behind the scenes, query
> Horizon/HIP on ISSN/ISBN (maybe some other identifiers too), and get
> back a list of bib#s.
>
> I am doing this now by 'screen scraping' the HIP XML. It works. But it's
> SO slow. So very slow. So I'm wondering if I can talk to the HIP indexes
> directly maybe. I guess I could talk to Horizon indexes too (via the
> rdbms?), but if there's a way to talk to HIP keyword indexes, I think
> that might work better.
Talking to the search server directly is probably preferable. It is
unclear to me whether ProIndex provides its own on-the-wire protocol or
whether the HIP implementers had to devise one. You might be able to
capture some of the traffic between JBoss and hznsearchserver and make
some deductions from it. Alternatively, you could figure out which class
in the HIP software submits searches to ProIndex and reuse it from a new
servlet. If the on-the-wire protocol is not binary, querying the search
server directly may be manageable. In any case, it seems like a lot of
effort.
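
As a first pass at the "is it binary?" question, something like the
following rough Python sketch could be run over payloads pulled out of
a capture. It is only a heuristic (share of printable ASCII bytes, plus
a check for NUL bytes), and the function name, the threshold, and the
sample payloads are all made up for illustration; none of them come
from HIP, hznsearchserver, or ProIndex.

```python
import string

# Bytes treated as "text": printable ASCII, including common whitespace.
_TEXT_BYTES = frozenset(string.printable.encode("ascii"))


def looks_like_text(payload: bytes, threshold: float = 0.95) -> bool:
    """Heuristic: True if nearly all bytes in a captured payload are printable.

    The 0.95 threshold is an arbitrary assumption, not a known property
    of any real protocol.
    """
    if not payload:
        return False  # nothing captured, nothing to conclude
    if b"\x00" in payload:
        return False  # NUL bytes strongly suggest binary framing
    printable = sum(1 for b in payload if b in _TEXT_BYTES)
    return printable / len(payload) >= threshold


if __name__ == "__main__":
    # Hypothetical payloads, NOT real hznsearchserver traffic.
    samples = {
        "text-like": b"SEARCH isbn=0123456789\r\n",
        "binary-like": bytes([0x00, 0x01, 0xFE, 0xFF, 0x10, 0x00, 0x7F, 0x80]),
    }
    for name, data in samples.items():
        print(name, looks_like_text(data))
```

If most captured payloads come back text-like, reading the requests by
eye and replaying them is probably feasible; if not, the servlet route
may be the less painful one.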
Tod Olson <tod at uchicago.edu>
Systems Librarian
University of Chicago Library