[ipac] HIP keyword indexes; accessing directly?

Tod Olson tod at uchicago.edu
Wed Jun 13 14:59:25 EDT 2007


On Jun 13, 2007, at 10:57 AM, Jonathan Rochkind wrote:

> Hey, anyone know what format the HIP keyword indexes are stored in?
> Are they in interbase/firebird, or are they in a proprietary format?

ProIndex is, I believe, the name of the full-text indexing software
that runs the keyword indexes.

> I am writing some software that wants to, behind the scenes, query
> Horizon/HIP on ISSN/ISBN (maybe some other identifiers too), and get
> back a list of bib#s.
>
> I am doing this now by 'screen scraping' the HIP XML. It works. But
> it's SO slow. So very slow. So I'm wondering if I can talk to the HIP
> indexes directly maybe.  I guess I could talk to Horizon indexes too
> (via the rdbms?), but if there's a way to talk to HIP keyword indexes,
> I think that might work better.

Talking to the search server directly is probably preferable. It is
unclear whether ProIndex provides its own on-the-wire protocol, or
whether the HIP implementers had to devise one. But you might be able
to capture some of the JBoss/hznsearchserver traffic and deduce the
protocol from it. Or figure out which class in the HIP software submits
searches to ProIndex, and reuse it from a new servlet. If the
on-the-wire protocol is not binary, querying the search server directly
may be manageable. Either way, it seems like a lot of effort.
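
If you do capture some of that traffic, a first pass could just check
whether the payload looks like text or binary. Below is a rough sketch
of that check, not anything taken from HIP or ProIndex: the function
names, the 0.95 threshold, and the sample bytes are all my own
assumptions, and it presumes you have already saved the raw payload
bytes by some other means.

```python
import string

# Bytes counted as "text": printable ASCII plus common whitespace.
_TEXT_BYTES = frozenset(string.printable.encode("ascii"))


def printable_ratio(payload: bytes) -> float:
    """Return the fraction of bytes in payload that are printable ASCII."""
    if not payload:
        return 0.0
    return sum(b in _TEXT_BYTES for b in payload) / len(payload)


def looks_textual(payload: bytes, threshold: float = 0.95) -> bool:
    """Heuristic guess only: True if payload is mostly printable ASCII.

    The threshold is an arbitrary assumption, not a documented value.
    """
    return printable_ratio(payload) >= threshold


def hexdump(payload: bytes, width: int = 16) -> str:
    """Format payload as offset, hex bytes, and an ASCII column."""
    hex_width = width * 3 - 1  # two hex digits per byte plus separators
    lines = []
    for offset in range(0, len(payload), width):
        chunk = payload[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{hex_width}}  {text_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Invented sample bytes for illustration; NOT real HIP/ProIndex traffic.
    sample = b"QUERY isbn 0131103628\r\n"
    print("textual" if looks_textual(sample) else "binary")
    print(hexdump(sample))
```

If most captured payloads come back as textual, reading the hex dump
by eye may be enough to guess at the request format. If not, going
through the HIP class that submits the searches is probably the less
painful route.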


Tod Olson <tod at uchicago.edu>
Systems Librarian
University of Chicago Library



