<HTML>
<HEAD>
<!-- @(#) $Revision: 4.43 $ $Source: /cvsroot/judy/doc/ext/JudyHS_3.htm,v $ --->
<TITLE>JudyHS(3)</TITLE>
</HEAD>
<BODY>
<TABLE border=0 width="100%"><TR>
<TD width="40%" align="left">JudyHS(3)</TD>
<TD width="10%" align="center"> </TD>
<TD width="40%" align="right">JudyHS(3)</TD>
</TR></TABLE>
<P>
<!----------------->
<DT><B>NAME</B></DT>
<DD>
JudyHS macros - C library for creating and accessing a dynamic array,
using an array-of-bytes of <B>Length</B> as an <B>Index</B> and a word
as a <B>Value</B>.
<!----------------->
<P>
<DT><B>SYNOPSIS</B></DT>
<DD>
<B><PRE>
cc [flags] <I>sourcefiles</I> -lJudy
#include <Judy.h>
Word_t * PValue; // JudyHS array element
int Rc_int; // return flag
Word_t Rc_word; // full word return value
Pvoid_t PJHSArray = (Pvoid_t) NULL; // initialize JudyHS array
uint8_t * Index; // array-of-bytes pointer
Word_t Length; // number of bytes in Index
<A href="#JHSI" >JHSI</A>( PValue, PJHSArray, Index, Length); // <A href="JudyHS_funcs_3.htm#JudyHSIns">JudyHSIns()</A>
<A href="#JHSD" >JHSD</A>( Rc_int, PJHSArray, Index, Length); // <A href="JudyHS_funcs_3.htm#JudyHSDel">JudyHSDel()</A>
<A href="#JHSG" >JHSG</A>( PValue, PJHSArray, Index, Length); // <A href="JudyHS_funcs_3.htm#JudyHSGet">JudyHSGet()</A>
<A href="#JHSFA">JHSFA</A>(Rc_word, PJHSArray); // <A href="JudyHS_funcs_3.htm#JudyHSFreeArray">JudyHSFreeArray()</A>
</PRE></B>
<!----------------->
<DT><B>DESCRIPTION</B></DT>
<DD>
A JudyHS array is the equivalent of an array of word-sized
value/pointers. An <B>Index</B> is a pointer to an array-of-bytes of
specified length: <B>Length</B>. Rather than using a null terminated
string, this difference from <A href="JudySL_3.htm">JudySL(3)</A>
allows strings to contain all bits (specifically the null character).
This new addition (May 2004) to Judy arrays is a hybird using the best
capabilities of hashing and Judy methods. <B>JudyHS</B> does not have a
poor performance case where knowledge of the hash algorithm can be used
to degrade the performance.
<P>
Since JudyHS is based on a hash method, <B>Indexes</B> are not stored in
any particular order. Therefore the JudyHSFirst(), JudyHSNext(),
JudyHSPrev() and JudyHSLast() neighbor search functions are not
practical. The <B>Length</B> of each array-of-bytes can be from 0 to
the limits of <I>malloc()</I> (about 2GB).
<P>
The hallmark of <B>JudyHS</B> is speed with scalability, but memory
efficiency is excellent. The speed is very competitive with the best
hashing methods. The memory efficiency is similar to a linked list of
the same <B>Indexes</B> and <B>Values</B>. <B>JudyHS</B> is designed to
scale from 0 to billions of <B>Indexes</B>.
<P>
A JudyHS array is allocated with a <B>NULL</B> pointer
<PRE>
Pvoid_t PJHSArray = (Pvoid_t) NULL;
</PRE>
<P>
Because the macro forms of the API have a simpler error handling
interface than the equivalent
<A href="JudyHS_funcs_3.htm">functions</A>,
they are the preferred way to use JudyHS.
<P>
<DT>
<A name="JHSI"><B>JHSI(PValue, PJHSArray, Index, Length)</B></A> // <A href="JudyHS_funcs_3.htm#JudyHSIns">JudyHSIns()</A></DT>
<DD>
Given a pointer to a JudyHS array (<B>PJHSArray</B>), insert an
<B>Index</B> string of length: <B>Length</B> and a <B>Value</B> into the
JudyHS array: <B>PJHSArray</B>. If the <B>Index</B> is successfully
inserted, the <B>Value</B> is initialized to 0. If the <B>Index</B> was
already present, the <B>Value</B> is not modified.
<P>
Return <B>PValue</B> pointing to <B>Value</B>. Your program should use
this pointer to read or modify the <B>Value</B>, for example:
<PRE>
Value = *PValue;
*PValue = 1234;
</PRE>
<P>
<B>Note</B>:
<B>JHSI()</B> and <B>JHSD</B> can reorganize the JudyHS array.
Therefore, pointers returned from previous <B>JudyHS</B> calls become
invalid and must be re-acquired (using <B>JHSG()</B>).
<P>
<DT><A name="JHSD"><B>JHSD(Rc_int, PJHSArray, Index, Length)</B></A> // <A href="JudyHS_funcs_3.htm#JudyHSDel">JudyHSDel()</A></DT>
<DD>
Given a pointer to a JudyHS array (<B>PJHSArray</B>), delete the
specified <B>Index</B> along with the <B>Value</B> from the JudyHS
array.
<P>
Return <B>Rc_int</B> set to 1 if successfully removed from the array.
Return <B>Rc_int</B> set to 0 if <B>Index</B> was not present.
<P>
<DT><A name="JHSG"><B>JHSG(PValue, PJHSArray, Index, Length)</B></A> // <A href="JudyHS_funcs_3.htm#JudyHSGet">JudyHSGet()</A></DT>
<DD>
Given a pointer to a JudyHS array (<B>PJHSArray</B>),
find <B>Value</B> associated with <B>Index</B>.
<P>
Return <B>PValue</B> pointing to <B>Index</B>'s <B>Value</B>.
Return <B>PValue</B> set to <B>NULL</B> if the <B>Index</B> was not present.
<P>
<DT><A name="JHSFA"><B>JHSFA(Rc_word, PJHSArray)</B></A> // <A href="JudyHS_funcs_3.htm#JudyHSFreeArray">JudyHSFreeArray()</A></DT>
<DD>
Given a pointer to a JudyHS array (<B>PJHSArray</B>), free the entire array.
<P>
Return <B>Rc_word</B> set to the number of bytes freed and <B>PJHSArray</B> set to NULL.
<!----------------->
<P>
<DT><A name="ERRORS"><B>ERRORS:</B> See: </A><A href="Judy_3.htm#ERRORS">Judy_3.htm#ERRORS</A></DT>
<DD>
<P>
<DT><B>EXAMPLES</B></DT>
<DD>
Show how to program with the JudyHS macros. This program will print
duplicate lines and their line number from <I>stdin</I>.
<P><PRE>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <Judy.h>
// Compiled:
// cc -O PrintDupLines.c -lJudy -o PrintDupLines
#define MAXLINE 1000000 /* max fgets length of line */
uint8_t Index[MAXLINE]; // string to check
int // Usage: PrintDupLines < file
main()
{
Pvoid_t PJArray = (PWord_t)NULL; // Judy array.
PWord_t PValue; // Judy array element pointer.
Word_t Bytes; // size of JudyHS array.
Word_t LineNumb = 0; // current line number
Word_t Dups = 0; // number of duplicate lines
while (fgets(Index, MAXLINE, stdin) != (char *)NULL)
{
LineNumb++; // line number
// store string into array
JHSI(PValue, PJArray, Index, strlen(Index));
if (PValue == PJERR) // See ERRORS section
{
fprintf(stderr, "Out of memory -- exit\n");
exit(1);
}
if (*PValue == 0) // check if duplicate
{
Dups++;
printf("Duplicate lines %lu:%lu:%s", *PValue, LineNumb, Index);
}
else
{
*PValue = LineNumb; // store Line number
}
}
printf("%lu Duplicates, free JudyHS array of %lu Lines\n",
Dups, LineNumb - Dups);
JHSFA(Bytes, PJArray); // free JudyHS array
printf("JudyHSFreeArray() free'ed %lu bytes of memory\n", Bytes);
return (0);
}
</PRE>
<!----------------->
<P>
<DT><B>AUTHOR</B></DT>
<DD>
JudyHS was invented and implemented by Doug Baskins after retiring from Hewlett-Packard.
<!----------------->
<P>
<DT><B>SEE ALSO</B></DT>
<DD>
<A href="Judy_3.htm">Judy(3)</A>,
<A href="Judy1_3.htm">Judy1(3)</A>,
<A href="JudyL_3.htm">JudyL(3)</A>,
<A href="JudySL_3.htm">JudySL(3)</A>,
<BR>
<I>malloc()</I>,
<BR>
the Judy website,
<A href="http://judy.sourceforge.net">
http://judy.sourceforge.net</A>,
for further information and Application Notes.
</BODY>
</HTML>