Re: lost link

Kat wrote:

> i was looking for the method it determined what was
> pronounceable, or how the generated words were
> ranked for pronounceability. Does anyone know about it?

It's easier to create a list of rules for euphonious letter combinations
than it is to generate words randomly and rank them. For example, here's a
list of consonants and blends:

b|bl|br|c|cl|cr|d|dr|f|fl|fr|ft|g|gl|gr|h|j|k|kl|kr|l|m|n|p|pl|pr|qu|r|rt|s|
sc|sh|sl|sp|spr|st|str|sw

and here's a list of vowels and blends (substitute any vowel for 'a'):

a|ab|ac|ad|af|ag|ai|ain|ak|al|am|an|ap|ar|as|at|aw|ax|ay|az
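The "substitute any vowel" step can be done mechanically. Here is a minimal sketch, assuming the patterns are stored as strings that all begin with 'a'. The names APatterns, expand_vowels and VowelBlends are made up for this example:

```euphoria
-- The 'a' patterns from the list above.
constant APatterns = { "a", "ab", "ac", "ad", "af", "ag", "ai", "ain", "ak", "al",
                       "am", "an", "ap", "ar", "as", "at", "aw", "ax", "ay", "az" }

-- Hypothetical helper: copy every pattern once per vowel,
-- swapping the leading 'a' for that vowel.
function expand_vowels( sequence patterns, sequence vowels )
   sequence list, item
   list = {}
   for v = 1 to length( vowels ) do
      for p = 1 to length( patterns ) do
         item = patterns[p]
         item[1] = vowels[v]
         list = append( list, item )
      end for
   end for
   return list
end function

constant VowelBlends = expand_vowels( APatterns, "aeiou" )
```

A blind substitution also produces odd entries such as "ii", so the result would probably need pruning by hand.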

This can be encoded in Euphoria easily:

   constant Blends = { "b", "bl", "br" ... },
            Vowel = { "a", "ae", "ai", "ao", "au", "e" ... },
            Follows = { "", "b", "c" ... }
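If typing out every quoted string gets tedious, one alternative is to keep the '|'-separated form and split it when the program starts. This is only a sketch: split_bar is a hypothetical helper written for this example, not a library routine, and the data is the consonant list from earlier in this message.

```euphoria
-- Hypothetical helper: split a '|'-separated string into a sequence of strings.
function split_bar( sequence text )
   sequence list, item
   list = {}
   item = ""
   for i = 1 to length( text ) do
      if text[i] = '|' then
         -- a bar closes the current item
         list = append( list, item )
         item = ""
      else
         item &= text[i]
      end if
   end for
   -- the last item has no trailing bar
   return append( list, item )
end function

constant Blends = split_bar(
   "b|bl|br|c|cl|cr|d|dr|f|fl|fr|ft|g|gl|gr|h|j|k|kl|kr|l|m|n|p|pl|pr|qu|r|rt|s|" &
   "sc|sh|sl|sp|spr|st|str|sw" )
```

The same helper would work for the vowel and follower tables.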

The last list, Follows, contains the letter(s) that can follow a vowel. Given
a follower, you can collect the blends that start with it:

   function get_follower( integer char )
      -- collect every blend whose first letter is char
      sequence list
      list = {}
      for i = 1 to length( Blends ) do
         if Blends[i][1] = char then
            list = append( list, Blends[i] )
         end if
      end for
      return list
   end function

Here's a routine to pick a random item from a list:

   function pick( sequence list )
      return list[ rand(length(list)) ]
   end function

So the actual routine might be something like:

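   function make_word()
      sequence word, follows, picks
      word = ""
      picks = Blends
      for i = 1 to rand( 3 ) + 2 do
         if i = 1 and rand( 2 ) = 1 then
            -- randomly don't start with consonant
         else
            -- after the first syllable, picks holds only the
            -- blends that start with the previous follower
            word &= pick( picks )
         end if

         word &= pick( Vowel )

         follows = pick( Follows )
         if length( follows ) = 0 then
            -- an empty follower ends the word on the vowel
            exit
         end if

         picks = get_follower( follows[1] )
         if length( picks ) = 0 then
            -- no blend starts with the follower, so end the word with it
            word &= follows
            exit
         end if

      end for
      return word
   end function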

It's not hard to elaborate on this. Determining whether a word "sounds good"
is the same process in reverse: treat the blends, vowels, and followers as a
graph and see whether the word traces a path through it. After a bit of
tuning, it should be easy for the algorithm to determine that:

   "foozebar"

is easy to say, while

   "qzezlhg"

is less so.
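The reverse check could be sketched roughly like this. Everything here is an assumption made for illustration: the tables are small hand-picked subsets, "z" and "oo" are added as the sort of tuning mentioned above, Finals is a made-up list of endings allowed after the last vowel, and starts_at, sayable_from and sounds_good are hypothetical names.

```euphoria
constant Blends = { "b", "bl", "br", "f", "fl", "fr", "s", "st", "str", "z" },
         Vowel  = { "a", "e", "i", "o", "u", "ai", "oo" },
         Finals = { "", "b", "n", "r", "t" }

-- Does part occur in word beginning at index at?
function starts_at( sequence word, integer at, sequence part )
   if at + length( part ) - 1 > length( word ) then
      return 0
   end if
   return equal( word[at .. at + length( part ) - 1], part )
end function

-- Try to walk the rest of the word, alternating blend and vowel.
-- want_vowel says which table the next piece must come from.
function sayable_from( sequence word, integer at, integer want_vowel )
   sequence list, rest
   integer after
   if want_vowel then
      list = Vowel
   else
      list = Blends
   end if
   for i = 1 to length( list ) do
      if starts_at( word, at, list[i] ) then
         after = at + length( list[i] )
         rest = word[after .. length( word )]
         if want_vowel and find( rest, Finals ) then
            return 1   -- a vowel plus an allowed ending finishes the word
         end if
         if sayable_from( word, after, not want_vowel ) then
            return 1
         end if
      end if
   end for
   return 0   -- no table entry fits here, so back up
end function

function sounds_good( sequence word )
   -- a word may open with a blend or go straight to a vowel
   return sayable_from( word, 1, 0 ) or sayable_from( word, 1, 1 )
end function
```

With these particular tables, sounds_good( "foozebar" ) returns 1 and sounds_good( "qzezlhg" ) returns 0. Different tables will of course give different verdicts.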

-- David Cuny
