1. Xref text

I have a large (5 MB) document for which I would like to provide word
searching with pattern matching.

Am I talking about hashing here?  If so, I really don't know how to do a
good-quality hash. I've been writing software for over 10 years. I have always
used binary searching or some other system for indexing.  But since I have
been reading this list, I have noticed a lot of hashing.  I did study that
in school, and even did a project, but I cannot remember how to begin.

I would like to provide the user with a list of locations. Speed would be
very important. Even words like 'a', 'the' and 'they' would be candidates.

This is a static document, so I can store the xref after it is built.
Therefore, the build does not have to be fast.



 Joe Phillips, Assistant Director
 University Computing and Telecommunications
 Texas Wesleyan University     817-531-4284

2. Re: Xref text

At 01:04 PM 5/12/98 -0500, Joe Phillips wrote:
>Am I talking about hashing here?  If so, I really don't know how to do a
>good-quality hash. I've been writing software for over 10 years. I have always
>used binary searching or some other system for indexing.  But since I have
>been reading this list, I have noticed a lot of hashing.  I did study that
>in school, and even did a project, but I cannot remember how to begin.
>
>I would like to provide the user with a list of locations. Speed would be
>very important. Even words like 'a', 'the' and 'they' would be candidates.
>
>This is a static document, so I can store the xref after it is built.
>Therefore, the build does not have to be fast.

Have you looked at Junko Miura's hash.ex? I think you could
modify this to do what you want.

Irv

3. Re: Xref text

At 01:04 PM 5/12/98 -0500, Joe Phillips wrote:

>Am I talking about hashing here?  I cannot remember how to begin.

After suggesting Junko's hash program, I decided to see if
I could modify it to do what you want. Surprise! It took less
than a minute. I have attached the modified program.
It produces a list formatted as:
word: count loc1 loc2 loc3....
where count is the number of times the word appears in the
doc, and locn is where the word appears in the doc.

Irv
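
The attached program is not reproduced in this archive. As a rough
illustration of the output format described above, here is a minimal sketch
in Python. It is not the modified hash.ex; it uses Python's built-in dict
(itself a hash table) in place of a hand-written hash. The function names
build_xref, format_entry and find_pattern are hypothetical, and it assumes
a "location" means the character offset where a word starts, which the
thread does not specify.

```python
import re
from collections import defaultdict
from fnmatch import fnmatchcase


def build_xref(text):
    """Map each lowercased word to the character offsets where it starts.

    Assumption: a "location" is a character offset into the document.
    """
    xref = defaultdict(list)
    for match in re.finditer(r"[A-Za-z']+", text):
        xref[match.group().lower()].append(match.start())
    return dict(xref)


def format_entry(word, locations):
    """Format one entry as 'word: count loc1 loc2 loc3...'."""
    return f"{word}: {len(locations)} " + " ".join(str(p) for p in locations)


def find_pattern(xref, pattern):
    """Return the entries whose word matches a shell-style wildcard pattern."""
    return {w: locs for w, locs in xref.items() if fnmatchcase(w, pattern)}


if __name__ == "__main__":
    sample = "The cat saw the other cat."
    xref = build_xref(sample)
    for word in sorted(xref):
        print(format_entry(word, xref[word]))
```

Looking up one exact word is a single dict access, so common words such as
'a' or 'the' cost no more than rare ones. The wildcard search scans every
distinct word, which is usually far fewer than the words in the document.
Because the document is static, the finished dict could be written to disk
once and reloaded for later searches.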
