Re: Spell checker


Robert B Pilkington wrote:

> I'm trying to write a spell checker, but there is one problem, and I
> don't know if anybody can answer it, but I figured I'd ask anyway:
>
> How do I find the words in the dictionary that are close enough to the
> misspelled word to be put into the "Replace" list?
>
> (And it would be nice if it were fast, and if it checked for words that
> might be two words but the spacebar wasn't pressed hard enough, and if it
> loaded faster than the current version I have.....)

well, there are quite a number of ways you could do this, and the way i'm going
to describe is just the way i would do it (i've only thought about this for
about 5 seconds, so this is the first thing that popped into my head)

okay first of all you need a function which returns the list of possible
replacements for that word (i'm assuming you want to return a list of a number
of choices, not just /THE/ best choice).  let's call it replace_word

function replace_word(string word, sequence word_list,
    integer num_recursions, integer max_recursions)

btw a good idea is to lower-case 'word' and 'word_list' first (i believe the
routine is lower() in wildcard.e).  you'll also have to define type string if
you're using my example literally.
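
a minimal sketch of that setup, assuming euphoria 2.x/3.x where lower() comes
from wildcard.e (newer releases may keep it elsewhere).  the definition of
'string' here is only one plausible reading of "a sequence of bytes", and
lower_all is a made-up helper name, not something from the post:

```euphoria
include wildcard.e  -- lower() lives here in Euphoria 2.x/3.x

-- assumption: a "string" is a flat sequence of integer character codes
type string(object x)
    if not sequence(x) then
        return 0
    end if
    for i = 1 to length(x) do
        if not integer(x[i]) then
            return 0
        end if
    end for
    return 1
end type

-- hypothetical helper: lower-case every dictionary entry once, up front
function lower_all(sequence word_list)
    for i = 1 to length(word_list) do
        word_list[i] = lower(word_list[i])
    end for
    return word_list
end function

sequence dict
dict = lower_all({"Hello", "HELP", "halo"})
-- dict is now {"hello", "help", "halo"}
```

lower-casing the dictionary once when it is loaded, rather than on every
lookup, keeps the per-word check cheap.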

when you call this routine, pass 0 for num_recursions and a value >= 1 for
max_recursions (the higher max_recursions is, the slower the routine runs, but
the more words it returns).  word_list is a list of an unknown number of
strings (a string being a sequence of an unknown number of bytes) which
contains all the words in the dictionary.

the first thing to do is check to see if the word is just spelled wrong (i.e.
there's not just a space missing).  forgive me if i use some functions wrong in
the following example, as i don't have the euphoria documentation in front of me
right now.  i'm also using a euphoria pre-processor, so if you don't use the
pre-processor, you'll just have to guess what this code means :)

function replace_word(string word, sequence word_list,
    integer num_recursions, integer max_recursions)

    integer     good
    sequence    ret
    string      w1, w2

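    ret = {}
    -- swap each letter of 'word' for "*" in turn and keep every dictionary
    -- word that matches the resulting pattern
    for c = 1 to length(word) do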
        with each dic_word in word_list do
            if c = 1 then
                good = wildcard_match("*" & word[2..end], dic_word)
            elsif c = length(word) then
                good = wildcard_match(word[1..c - 1] & "*", dic_word)
            else
                good = wildcard_match(word[1..c - 1] & "*" & word[c + 1..end],
                    dic_word)
            end if
            if good then
                ret = append(ret, dic_word)
            end if
        end with
    end for

-- check for missing spaces: split 'word' at every position and look up both
-- halves, going one recursion level deeper on each call
    if num_recursions < max_recursions then
        for c = 2 to length(word) do
            w1 = word[1..c - 1]
            w2 = word[c..end]   -- keep letter c; a missing space drops nothing
            ret =& replace_word(w1, word_list, num_recursions + 1,
                max_recursions)
            ret =& replace_word(w2, word_list, num_recursions + 1,
                max_recursions)
        end for
    end if

    return ret

end function
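
to make the first loop concrete, here is a small sketch in plain euphoria (no
pre-processor) of the patterns it builds.  'patterns' is a hypothetical helper
used only for illustration.  as far as i know, "*" in wildcard_match stands for
any run of characters, including none, so a pattern like "h*lo" would accept
"hello" and "halo" as well as "helo" itself:

```euphoria
-- hypothetical helper, not part of the routine above: returns every
-- pattern obtained by replacing one letter of 'word' with "*"
function patterns(sequence word)
    sequence ret
    integer len
    len = length(word)
    ret = {}
    for c = 1 to len do
        -- word[1..0] and word[len+1..len] are legal empty slices,
        -- so the first and last letters need no special case
        ret = append(ret, word[1..c - 1] & "*" & word[c + 1..len])
    end for
    return ret
end function

-- patterns("helo") gives {"*elo", "h*lo", "he*o", "hel*"}
```

since a dictionary word can match more than one of these patterns, the same
suggestion may turn up several times in the result; filtering with find()
before appending would be one way to avoid that.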

if you're the slightest bit foggy on anything mentioned, don't hesitate to
re-post with your questions.  keep in mind i haven't tried this (or thought too
much about it heh) so it's bound not to work exactly as it is.  i suppose i
should have documented a bit but i think it's pretty self-explanatory (and you
can always post again with your questions).  the only problem is that it doesn't
just break the word into two pieces to see if the user was missing a space; it
also spell checks each of the pieces, so the list can fill up with near-matches
for the fragments.
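
one possible way around that, sketched in plain euphoria under the assumption
that a split should only count when both halves are exact dictionary words.
this is a swapped-in alternative to the recursive check, not the routine from
the post, and split_words is a made-up name:

```euphoria
-- hypothetical alternative to the recursive missing-space check:
-- returns {w1, w2} pairs where both halves are exact dictionary words
function split_words(sequence word, sequence word_list)
    sequence ret, w1, w2
    ret = {}
    for c = 2 to length(word) do
        w1 = word[1..c - 1]
        w2 = word[c..length(word)]
        -- find() returns 0 when the half is not in the dictionary
        if find(w1, word_list) and find(w2, word_list) then
            ret = append(ret, {w1, w2})
        end if
    end for
    return ret
end function

-- split_words("spellchecker", {"spell", "checker", "check"})
-- gives {{"spell", "checker"}}
```

find() is a linear scan, so on a large dictionary this would be slow; a sorted
list with a binary search would be the obvious next step.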

if you're really serious about this (and really stuck) tell me and i'll poke
around a bit in euphoria and see if i can't get a (semi-)working model.

--

    .  m   i   k   e       b   u   r   r   e   l   l  .

   . h t t p : / / m i k p o s . h o m e . m l . o r g .

. m  i  k  p  o  s  @  s  o  f  t  h  o  m  e  .  n  e  t .

     . ftp://ftp.scene.org /pub/music/artists/mikpos/ .
