1. $100 Contest Question

For the contests, what is "standard input?"

When it says, "Write a program in Euphoria that reads from standard 
input a "cipher" line containing the 26 letters of the alphabet, 
rearranged somehow, e.g.
PQRSTUVZABCDEFGHWXYIJKLMNO"

does the "standard input" indicate a file?

Or is standard input the keyboard?

I know I could probably find this in the docs, but this is much more 
reliable. :)

Also, will the cipher line necessarily be sequential? For instance, 
might we encounter:

ABCDEFGHIJKLMNOPQRSTUVWXYZ
PRSTWUVZQMABCDEFGHXYIJKLNO <-- cipher line

This is a simple one-to-one replacement, but the cipher line, you'll 
note, is not sequential.
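
The "one-to-one replacement" described above is a plain monoalphabetic substitution. A minimal sketch in Python (for illustration only; the contest itself required Euphoria), using the non-alphabetical cipher line from the example:

```python
# One-to-one letter substitution: each plain letter maps to the cipher
# letter at the same position, so enciphering is one translation pass.
plain  = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
cipher = "PRSTWUVZQMABCDEFGHXYIJKLNO"  # the example cipher line

enc = str.maketrans(plain, cipher)   # encipher table
dec = str.maketrans(cipher, plain)   # decipher table (mapping is invertible)

def encipher(text):
    return text.translate(enc)

def decipher(text):
    return text.translate(dec)

print(encipher("HELLO"))  # ZWBBE
```

Characters outside the 26 letters (spaces, punctuation) pass through untouched, which matches the "lines of English text" input described later in the thread.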


2. Re: $100 Contest Question

On 3 Mar 2002, at 0:22, C. K. Lester wrote:

> 
> For the contests, what is "standard input?"
> 
> When it says, " Write a program in Euphoria that reads from standard 
> input a "cipher" line containing the 26 letters of the alphabet, 
> rearranged somehow, e.g.
> PQRSTUVZABCDEFGHWXYIJKLMNO"
> 
> does the "standard input" indicate a file?
> 
> Or is standard input the keyboard?
> 
> I know I could probably find this in the docs, but this is much more 
> reliable. :)

I was wondering this too. File input, DOS box, windoze gui, or wxwindows, or
what?

> Also, will the cipher line necessarily be sequential? For instance, 
> might we encounter:
> 
> ABCDEFGHIJKLMNOPQRSTUVWXYZ
> PRSTWUVZQMABCDEFGHXYIJKLNO <-- cipher line
> 
> This is a simple one-to-one replacement, but the cipher line, you'll 
> note, is not sequential.

It *is* sequential, but it's not alphabetical.

My new question: It's taking me 143 seconds to load a dictionary with the 
following code:

dfilename = "D:\\Gertie\\decypher\\dfile.txt"
readfile = open(dfilename,"r")
readline = get(readfile)   -- get() returns {error_code, value}
readline = readline[2]     -- keep just the value
close(readfile)

And since i don't see how i can make that any faster, can we *not* count 
load times for this contest? My puter is a K6-2-233 running win95, so it's 
prolly the slowest one here (not counting Igor's 386's), but still, i don't see 
the load times improving off the hd with faster cpus..

Kat


3. Re: $100 Contest Question

> And since i don't see how i can make that any faster, can we *not* count 
> load times for this contest? My puter is a K6-2-233 running win95, so it's 
> prolly the slowest one here (not counting Igor's 386's), but still, i don't
> see
> the load times improving off the hd with faster cpus..
> 
> Kat

If anything, you and I are at an advantage because we have slow machines,
so our algorithms will probably be faster just because we have to make 'em
that way just to see results on them.

blink

Euman


4. Re: $100 Contest Question

----- Original Message -----
From: "Kat" <gertie at PELL.NET>
To: "EUforum" <EUforum at topica.com>
Subject: Re: $100 Contest Question


>
> <snip>
>
> My new question: It's taking me 143 seconds to load a dictionary with the
> following code:
>
> dfilename = "D:\\Gertie\\decypher\\dfile.txt"
> readfile = open(dfilename,"r")
> readline = get(readfile)
> readline = readline[2]
> close(readfile)
>
> And since i don't see how i can make that any faster, can we *not* count
> load times for this contest? My puter is a K6-2-233 running win95, so it's
> prolly the slowest one here (not counting Igor's 386's), but still, i don't
> see the load times improving off the hd with faster cpus..
>
>
Hi Kat,
I would advise people not to use get() unless there is a really good reason
to do so. It is a very slow way of reading in text. Even so, 143 seconds is
amazingly slow.

Here is a test I whipped up, and it takes 0.07 seconds to run on my machine
(Pentium III, 550MHz, 256Meg, Windows Me).

 -------------
 include get.e
 include file.e
 atom flen
 atom s
 sequence dfilename, readline
 integer readfile
 s = time()
 dfilename = "f:\\euphoria\\words.txt"
 readfile = open(dfilename,"rb") -- NB, binary for speed
 flen = seek(readfile, -1)       -- seek to end of file
 flen = where(readfile)          -- current position = file length
 if seek(readfile,0) then end if -- back to the start
 readline = repeat(0,flen)
 for i = 1 to flen do
    readline[i] = getc(readfile)
 end for
 close(readfile)
 printf(1, "%f\n", time()-s)
 --------------
Derek


5. Re: $100 Contest Question

----- Original Message -----
From: "Kat" <gertie at PELL.NET>
To: "EUforum" <EUforum at topica.com>
Subject: RE: $100 Contest Question


>
> On 3 Mar 2002, at 2:52, C. K. Lester wrote:
>
> >
> > Kat wrote:
> > >
> > > My new question: It's taking me 143 seconds to load a
> > > dictionary with the following code:
> >
> > Kat, I'm loading Junko's 50,000-word dictionary in less than a second.
> > What dictionary are YOU using?! :D
>
> One i threw together and premunged just for this. Unfortunately, it's
> making a 4.8Meg file out of the dictionaries, even after deleting all the
> trailing spaces, CR, LF, and duplicates. Using:
>
>   writefile = open(dfilename,"wb")
>   print(writefile,dictionary)
>   close(writefile)
>
> puts this stuff into the file:
>
> {{{50},{67},{71},{72},{75},{77},{78},{84},{88},{65},{73},{65},{66},{66},{67},{68},{68},
> etc, which is just a listing of the letters of the alphabet, which is what
> i wanted at that point. But i didn't need it represented that way, i wanted
> it done in a way Eu could reload it instantly, like, umm, in
> D:\Euphoria\DEMO\MYDATA.EX. As it is saved tho, it is taking up 5x as
> much space and load time as needed. Any ideas?

That is exactly why print() and get() should rarely be used. They are
extremely inefficient. In my version of reformatting Junko's WORDS.TXT, I
went from the original 508,190 bytes to 531,828 bytes. In speed differences,
it takes me about 6 seconds to use WORDS.TXT to build the internal
dictionary, and about 0.5 seconds to build it using the reformatted
words.txt.

For speed, use binary reads and writes, and just use getc() and puts().
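
Derek's point - stream raw bytes instead of printing and re-parsing a nested sequence - is language-independent. A hedged Python analogue (the file name and word list are made up for illustration):

```python
import os
import tempfile

words = ["APPLE", "BANANA", "CHERRY"]

# puts()-style save: flat newline-delimited bytes, with no per-element
# structure that has to be re-parsed on load.
path = os.path.join(tempfile.mkdtemp(), "dict.dat")
with open(path, "wb") as f:
    f.write("\n".join(words).encode("ascii"))

# getc()-style load: one binary read and one split -- no parser involved,
# unlike a get()-style load that must interpret a printed representation.
with open(path, "rb") as f:
    loaded = f.read().decode("ascii").split("\n")

assert loaded == words
```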

I'm not sure if I'm allowed to give any more details of the algorithms I
used etc..., but I did strip out unnecessary delimiters between words.

----------
Derek


6. Re: $100 Contest Question

I meant to ask a question that I don't think anyone's asked.

Can I have text-length files? What I mean is word files that are
separated by the length of their text?

Oh yeah, I'm not in this for fun anymore. I'm in it to win it!

Euman
euman at bellsouth.net

Q: Are we monetarily insane?
A: YES
----- Original Message ----- 
From: <bensler at mail.com>
To: "EUforum" <EUforum at topica.com>
Sent: Sunday, March 03, 2002 12:17 AM
Subject: RE: $100 Contest Question


> 
> Derek, are you using a preformatted version of words.txt, or are you 
> formatting it within your program? It sounds like the latter, which I 
> don't think is allowed.
> 
> Chris
> 
> 
> Derek Parnell wrote:
> > <snip>
>


7. Re: $100 Contest Question

On 3 Mar 2002, at 14:37, Derek Parnell wrote:

>
> <snip>
>
> Hi Kat,
> I would advise people not to use get() unless there is a really good
> reason to do so. It is a very slow way of reading in text. Even so, 143
> seconds is amazingly slow.

But it reads the whole file at one time, no optimisation behind the scenes?

What about saving the file? Looks to me like saving it with
print(file,HeavilyNestedSequence), using 5x as much space as the
original, is overkill.
 
> Here is a test I whipped up, and it takes 0.07 seconds to run on my machine
> (Pentium III, 550MHz, 256Meg, Windows Me).

I ran it; my best run was .67 sec, though they generally ran 1.4 sec. So i
need to figure my runs of 50 minutes will meet the 5-minute runtime limit of
the contest? <blink><blink>

>  -------------
>  include get.e
>  include file.e
>  atom flen
>  atom s
>  sequence dfilename, readline
>  integer readfile
>  s = time()
>  dfilename = "f:\\euphoria\\words.txt"
>  readfile = open(dfilename,"rb") -- NB, binary for speed
>  flen = seek(readfile, -1)
>  flen = where(readfile)
>  if seek(readfile,0) then end if
>  readline = repeat(0,flen)
>  for i = 1 to flen do
>     readline[i] = getc(readfile)

This doesn't allow for any premunging of the dictionary, or entry of it into 
nested sequences. Does it?

>  end for
>  close(readfile)
>  printf(1, "%f\n", time()-s)

As Euman pointed out, i have to squeeze as much out of the 5-minute
contest timeframe as i can, so i premunged the dictionary. I have no
concrete factor to figure the difference in exec times, but loading the
premunged, nested-seq dictionary is way faster than munging it: loading 2
dictionaries and munging, then writing out, took 19 minutes.

Kat


8. Re: $100 Contest Question

----- Original Message -----
From: <bensler at mail.com>
To: "EUforum" <EUforum at topica.com>
Subject: RE: $100 Contest Question


>
> Derek, are you using a preformatted version of words.txt, or are you
> formatting it within your program? It sounds like the latter, which I
> don't think is allowed.
>
> Chris
>
>

The short answer is both.

For comp#2, the algorithm I used when the matching routine is called was:
   If the internal dictionary is not set up,
       look for a file called 'dict.dat'.
       If that is present, use its contents to
            set up the internal dictionary.
       Otherwise look for 'words.txt'.
       If that is present, use its contents to
            set up the internal dictionary, then
            write out the internal dictionary to
            'dict.dat' using a special format.

In either case, there is a small delay the first time the routine is called
while it initialises the internal dictionary. With the dict.dat file, though,
this delay is a lot smaller than with words.txt. Once the dictionary is set
up, finding the matching words is lightning fast.
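
Derek's lazy-initialisation scheme can be sketched like this (Python for illustration; `pickle` stands in for his unspecified "special format", and the file locations are hypothetical, placed in a temp directory so the sketch is self-contained):

```python
import os
import pickle
import tempfile

# Hypothetical file locations mirroring Derek's description.
DATA_DIR  = tempfile.mkdtemp()
WORDS_TXT = os.path.join(DATA_DIR, "words.txt")
DICT_DAT  = os.path.join(DATA_DIR, "dict.dat")

_dictionary = None

def get_dictionary():
    """Lazily initialise the word list, preferring the fast cache file."""
    global _dictionary
    if _dictionary is not None:
        return _dictionary                      # already set up
    if os.path.exists(DICT_DAT):
        with open(DICT_DAT, "rb") as f:         # fast path: pre-built cache
            _dictionary = pickle.load(f)
    else:
        with open(WORDS_TXT) as f:              # slow path: parse raw list...
            _dictionary = sorted(set(f.read().split()))
        with open(DICT_DAT, "wb") as f:         # ...then write the cache
            pickle.dump(_dictionary, f)
    return _dictionary
```

The first call pays the parse cost once and writes the cache; every later run (or call) hits the fast path.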

When I submitted my program to Robert yesterday, the wording of the
competition did not say "You must use words.txt contained in Junko's spell
checker in the Archive." So I guess the rules have changed after my
submission! Oh well. Of course, in one sense, I did use Junko's file - to
create a reformatted one - and I can use Junko's file if the dict.dat file
is not present.

I wrote the program as if it were to be used in the real world, not just some
artificial competition environment. Thus the routine that uses words.txt is
not hyper-optimised, as I was only going to use it once to create the
dict.dat file. That file is the optimised one.

If Robert rules against this concept, I guess I can submit another version
of the program.
-----
Derek.


9. Re: $100 Contest Question

----- Original Message -----
From: <bensler at mail.com>
To: "EUforum" <EUforum at topica.com>
Subject: RE: $100 Contest Question


> about hyphens, and apostrophes?
>

I documented my assumptions in the source code. For this one, it was that
any character in the pattern text whose ASCII value is higher than SPACE
is meant to be an exact-match character, meaning that the integers 0 to 32
are reserved for inexact matches.

On a similar point I made the assumption that a pattern of {4,6,9} is
equivalent to {1,2,3}. In other words, the actual value of the pattern
characters is not important, only that they represent a unique character in
the target word(s).
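
Derek's two assumptions - placeholder values canonicalise by order of first appearance, and each placeholder stands for one unique letter - might look like this in Python (a sketch only; the actual contest entries were Euphoria, and the function names here are made up):

```python
def canonical(pattern):
    """Renumber placeholders (values 0-32) in order of first appearance,
    so {4,6,9} and {1,2,3} come out identical; literals (> 32) pass through."""
    seen = {}
    out = []
    for c in pattern:
        if c <= 32:                      # placeholder value
            seen.setdefault(c, len(seen) + 1)
            out.append(seen[c])
        else:                            # ASCII > 32: exact-match literal
            out.append(c)
    return out

def matches(pattern, word):
    """True if word fits the pattern: literals must match exactly; equal
    placeholders map to equal letters, distinct placeholders to distinct."""
    if len(pattern) != len(word):
        return False
    fwd, rev = {}, {}
    for c, ch in zip(pattern, word):
        if c > 32:                       # literal: compare the character
            if chr(c) != ch:
                return False
        elif fwd.setdefault(c, ch) != ch:
            return False                 # same placeholder, different letter
        elif rev.setdefault(ch, c) != c:
            return False                 # same letter, different placeholder
    return True
```

For example, the pattern `[1,2,1]` fits "EYE" but not "TEA", and `[4,6,9]` canonicalises to `[1,2,3]` exactly as Derek describes.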

--------
Derek.


10. Re: $100 Contest Question

Chris Bensler writes:
> What platform will be used to test with? To be fair, 
> it would have to be tested on all 3.

I'll use DOS. It's a bit inconvenient to boot into Linux.

> What if one entry only works on a specific platform, 
> but is faster than all others for that platform?

It should be easy to make your program work on all platforms,
but if a program fails on DOS, but works on Linux or Windows,
I'll measure its time on a system it works on (on the same machine)
and decide if there is any unfairness. A program like that
can still win, but won't be eligible for the $5 bonus.

> What are the valid match characters for Contest#2? A-Z and a-z? 
> What about hyphens, and apostrophes?

As Derek suggested,
if a '-' or '\'' is supplied (or some character greater than ASCII 32),
it should be treated as a literal character to be matched. Values
from 0 to 32 represent "meta" characters, or placeholders for
unspecified characters in the pattern. I'll only give you upper case
literal characters, A, B, C, ...

Euman writes:
> Can I have text-length files? What I mean is word files that are
> separated by the length of their text?

I'm not sure what you mean.
On problem #2 you must use Junko's dictionary (word list),
because I will be determining correctness based on
the words in that particular dictionary. If you want to reformat
her file into a different file, that's fine, but the time that it
takes to do that will be included in your total time. 

You can assume that there is enough memory to store
the 51802 words in memory (unless you store them in a 
bizarrely inefficient way). 
I'll be using a machine with 64 Mb of RAM.

On problem #3 you can use any dictionary, formatted
any way you like, but I think Junko's is quite reasonable.

Martin Stachon writes (privately):
> After the load of the wordlist, how many times you will 
> call the function? 

In problem #2, assume that I will make 1000
calls to your function.

Derek Parnell writes:
> On a similar point I made the assumption that a pattern of {4,6,9} is
> equivalent to {1,2,3}. In other words, the actual value of the pattern
> characters is not important, only that they represent a unique character in
> the target word(s).

Yes, that's correct.

Aku writes:
> (Problem #1) How is the time calculated?
> How many iteration (loops) will it be tested?

I'm planning to run each program once,
with a few megabytes of input text.
I'll actually do it a few times each, 
and ignore the first run,
since the first time, the data won't be in
the operating system's memory cache.

Aku writes:
> (Problem #1)  Will the input contain byte 0 ?

No. It will consist of the cipher line plus 
many lines of English text - no weird characters,
no huge (over 1000 characters) lines.

I'll add these points to the Web page later today.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


11. Re: $100 Contest Question

On 3 Mar 2002, at 13:44, Robert Craig wrote:

<snippage happened>

> > What are the valid match characters for Contest#2? A-Z and a-z? 
> > What about hyphens, and apostrophes?
> 
> As Derek suggested,
> if a '-' or '\'' is supplied (or some character greater than ASCII 32),
> it should be treated as a literal character to be matched. Values
> from 0 to 32 represent "meta" characters, or placeholders for
> unspecified characters in the pattern. I'll only give you upper case
> literal characters, A, B, C, ...

So the input file will be all upper()'d already?

> On problem #3 you can use any dictionary, formatted
> any way you like, but I think Junko's is quite reasonable.
> 
> Martin Stachon writes (privately):
> > After the load of the wordlist, how many times you will 
> > call the function? 
> 
> In problem #2, assume that I will make 1000
> calls to your function.

You realise, at 5 minutes runtime per iteration, that's 83 hours? If you get 
100 such entries, that's 345 DAYS of runtime for testing problem #2 
programs.
 
> Derek Parnell writes:
> > On a similar point I made the assumption that a pattern of {4,6,9} is
> > equivalent to {1,2,3}. In other words, the actual value of the pattern
> > characters is not important, only that they represent a unique character in
> > the target word(s).
> 
> Yes, that's correct.
> 
> Aku writes:
> > (Problem #1) How is the time calculated?
> > How many iteration (loops) will it be tested?
> 
> I'm planning to run each program once,
> with a few megabytes of input text.

*megabytes*? Of the same few sentences repeated ad nauseam, with different 
keys? Or will you be feeding it an online book text with unique sentences and 
the same key for all sentences? 
 
Kat


12. Re: $100 Contest Question

----- Original Message -----
From: "Kat" <gertie at PELL.NET>
To: "EUforum" <EUforum at topica.com>
Subject: Re: $100 Contest Question


> > if a '-' or '\'' is supplied (or some character greater than ASCII 32),
> > it should be treated as a literal character to be matched. Values
> > from 0 to 32 represent "meta" characters, or placeholders for
> > unspecified characters in the pattern. I'll only give you upper case
> > literal characters, A, B, C, ...
>
> So the input file will be all upper()'d already?

Here, Robert is only talking about the characters in the pattern text used
to test comp#2, not the word list or any file used in any competition. If I
were you, I'd expect mixed-case text for the input files in comp#1 and
comp#3.


> Robert Craig wrote:
> > In problem #2, assume that I will make 1000
> > calls to your function.
>
> You realise, at 5 minutes runtime per iteration, that's 83 hours? If you
> get 100 such entries, that's 345 DAYS of runtime for testing problem #2
> programs.

There is something really wrong if a single iteration takes that long. I'm
sure Robert will pull the plug on any program that runs longer than 5
minutes. Currently I'm doing 1280 iterations in 1.48 seconds, and I haven't
finished optimising it yet.

> Robert Craig wrote:
> > I'm planning to run each program once,
> > with a few megabytes of input text.
>
> *megabytes*? Of the same few sentences repeated ad nauseam, with different
> keys? Or will you be feeding it an online book text with unique sentences
> and the same key for all sentences?

In comp#1, we are required to *encipher* some English text. As a trial, I
combined all the Euphoria\DOC files into a single file, then copied that
file to itself a few times until I had a file that was 4,267,462 bytes long.
My program takes about 4 seconds to encipher this file.

--------
Derek


13. Re: $100 Contest Question

Hi Ray,
----- Original Message -----
From: "Ray Smith" <smithr at ix.net.au>
To: "EUforum" <EUforum at topica.com>
Subject: RE: $100 Contest Question


>
>
> Robert Craig wrote:
>
> > On problem #2 you must use Junko's dictionary (word list),
> > because I will be determining correctness based on
> > the words in that particular dictionary. If you want to reformat
> > her file into a different file, that's fine, but the time that it
> > takes to do that will be included in your total time.
>
> I've just wasted a weekend of programming!!!!
>

All Robert is saying here, is that at some point in the execution of the
comp#2 program, you must use Junko's file 'as is' to get the list of words
that will be used to test the program against. You are perfectly free to
reformat that DATA in your program to help speed things up. I think what
Robert is getting at is that for comp#2, you cannot use a different file as
the SOURCE of the word list. Junko's file is the only SOURCE of words
permitted. If you want to create a new file based on hers, you can. The
time to do that will be included in the test time though.

> Robert Craig wrote in message:
> http://www.topica.com/lists/EUforum/read/message.html?mid=1709760866&sort=d&start=10864
>
>
> For problem #3, you can change the input word list if you want.
> For #2 you must use Junko's list as is.
>
> NOTE THE "as is"  !!!!
>
> after I specifically asked the question:
> >Is the time to load the database part of the calculated time to
> >decipher the sentence? If it is can we re-organise the
> >database in a particular order to help the load time.
> >ie. if we create some new hash scheme can we re-order the table to
> >make the load time faster and therefore reduce total decipher time.
>
> I don't mind the rules being manipulated for cases that haven't
> already been discussed but it's frustrating when the rules change
> from something that has already been said!
>

Comp#2 is NOT a deciphering exercise; that is comp#3. Comp#2 is a
pattern-matching exercise, and it must use Junko's file. For comp#3 you do
not have to use Junko's file in any way, shape or form.

I can understand your frustration though, because my first attempt at comp#2
also used a reformatted version of Junko's file. I did that because the
original competition rules didn't prohibit it. Only after I submitted it to
Robert did the rules change. But I figure that's pretty real-world
behaviour - we have customers changing the system specifications constantly
while we are developing the system.

-----
Derek.


14. Re: $100 Contest Question

----- Original Message ----- 
From: "Derek Parnell" <ddparnell at bigpond.com>
To: "EUforum" <EUforum at topica.com>

> You are perfectly free to
> reformat that DATA in your program to help speed things up. I think what
> Robert is getting at is that for comp#2, you cannot use a different file as
> the SOURCE of the word list. Junko's file is the only SOURCE of words
> permitted. If you want to create a new file based on her's, you can. The
> time to do that will be included in the test time though.

Derek, this is how I read it:

From Robert:
"If you want to reformat
her file into a different file, that's fine, but the time that it
takes to do that will be included in your total time."

This means to me that you must actually read her file and 
reformat it if you wish (making another file), but this must be
part of your program.

Which to me would be kicking a dead horse (so to speak)!

Read her file, format the text, write it out, only to be read by your
program again... hmmm, a hell of a lot of work it seems to me.

Euman


15. Re: $100 Contest Question

----- Original Message -----
From: <euman at bellsouth.net>
To: "EUforum" <EUforum at topica.com>
Subject: Re: $100 Contest Question


>
> ----- Original Message -----
> From: "Derek Parnell" <ddparnell at bigpond.com>
> To: "EUforum" <EUforum at topica.com>
>
> > You are perfectly free to
> > reformat that DATA in your program to help speed things up. I think what
> > Robert is getting at is that for comp#2, you cannot use a different file
> > as the SOURCE of the word list. Junko's file is the only SOURCE of words
> > permitted. If you want to create a new file based on hers, you can. The
> > time to do that will be included in the test time though.
>
> Derek this is how I read this:
>
> From Robert:
> "If you want to reformat
> her file into a different file, that's fine, but the time that it
> takes to do that will be included in your total time."
>
> This means to me that you must actually read her file and
> reformat it if you wish (making another file) but this must be
> part of your program.
>
> Which to me would be kicking a dead horse (so to speak)!
>
> Read her file, format the text, write it out, only to be read by your
> program again...hmmm hell of alot of work it seem like to me.
>

Oh, I agree with you. It would be a silly thing to do. But you still can if
you want to.

In my first attempt I had a variation on this. For the first ever run of the
program I used the words.txt file, reformatted it, and wrote out another
file. In second and subsequent runs of the program, I used my reformatted
file. But apparently this is no longer allowed. I still haven't had an
official ruling on this case, but I'm going to assume that I have to change
the way I did the program.

--------
Derek.


16. Re: $100 Contest Question

----- Original Message -----
From: "Ray Smith" <smithr at ix.net.au>
To: "EUforum" <EUforum at topica.com>
Subject: RE: $100 Contest Question


>
>
> Derek Parnell wrote:
>
> > In comp#1, we are required to *encipher* some English text. As a trial,
> > I combined all the Euphoria\DOC files into a single file then copied
> > that file to itself a few times until I had a file that was 4,267,462
> > bytes long. My program takes about 4 seconds to encipher this file.
>
> What type of PC did you run this on Derek?
> My PC at work (PIII 650 Windows 2000 Professional) does this
> in 2.36 seconds (for a 4,121,548 bytes sized file)
>
> I haven't tried to optimise it yet (not sure what I could do really)
>
> Note: I won't be submitting my entry to contest 1 (I assume Derek
> won't either)
>

A PIII 550 Windows Me, 256Meg RAM, 13GB HD.

I only guessed at the time, as I didn't actually put any timing code in it.
Also, I haven't tried to optimise it either. Once I found out it ran so
fast, I couldn't see the point in doing any more tweaking.

And no, I won't be submitting any entries into any of the three
competitions. I will be submitting programs to Robert just for fun though.
More of a personal challenge than anything else.
----
Derek.


17. Re: $100 Contest Question

----- Original Message -----
From: "Ray Smith" <smithr at ix.net.au>
To: "EUforum" <EUforum at topica.com>
Subject: RE: $100 Contest Question


> My PC at work (PIII 650 Windows 2000 Professional) does this
> in 2.36 seconds (for a 4,121,548 bytes sized file)
>

Okay, so I took the bait. I've now got it down from 4.33 seconds to 1.7
seconds for that 4,267,462 byte file.

---------
Derek.


18. Re: $100 Contest Question

Hello all,

Talking about speed!

I found a much better way to implement my hash table (below) for the
"words.txt" file, so I thought I'd post this old piece.

-- hashing the dictionary takes @3.5 sec on my 233mhz laptop
-- 26 x 26 In alphabetical and length of text order.

The routine here is pretty fast; believe it or not, I figured out a way to
speed this up 100-fold and still have the same accuracy and distribution.
I'll share that one in a month or so...

Maybe I could get some people to tell me how fast a PIII 550 or higher
might run this piece.

<snip.....>

without type_check
without trace

include get.e
include file.e
include print.e

atom t, junk
     t = time()

integer fnA, len
fnA = open("words.txt","r")

integer len_text
function EumsHash( sequence text) -- a more unique hash 
    atom h, g 
    integer short
    h = 0
    len = length(text)
    for i = 1 to len do 
        short = text[i]
        if short = 32 or short = 10 then
           len_text = i - 1  -- word length, excluding the terminator
           exit
        end if
        if h > #0FFFFFFF or h < 0 then -- (overflow)
           h = and_bits(h, #0FFFFFFF)
        end if
        h *= 16
        h += short
        g = and_bits(floor(h / #1000000), #F0)
        if g != 0 then
           h = xor_bits(h, g)
        end if
        h = and_bits(h, not_bits(g))
    end for
    return h
end function

sequence hash_table
hash_table = repeat(repeat({}, 26),26) -- 26 -> 676 buckets

object line
atom hashed

while 1 do -- hashing the dictionary takes @3.5 sec on my 233mhz laptop
    line = gets(fnA)
    if atom(line) then
       exit   
    end if
    hashed = EumsHash(line)
    hash_table[line[1] - 64][len_text] &= hashed
end while

close(fnA)

? time() - t

if getc(0) then
end if

<snip>
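For anyone who wants to experiment with the algorithm outside Euphoria, here is a rough Python mirror of the routine above (function names are my own and this is an illustrative sketch, not the contest code):

```python
def eums_hash(word):
    """Python mirror of EumsHash above: shift in one character at a
    time, folding the bits that overflow back in via XOR."""
    h = 0
    for ch in word:
        if h > 0x0FFFFFFF:        # clamp to 28 bits before shifting
            h &= 0x0FFFFFFF
        h = h * 16 + ord(ch)
        g = (h >> 24) & 0xF0      # the bits that just shifted too high
        if g:
            h ^= g
        h &= ~g
    return h

# 26 first-letter buckets x 26 word-length buckets, as in the post
hash_table = [[[] for _ in range(26)] for _ in range(26)]

def add_word(word):
    """File the hash of a word under (first letter, length)."""
    word = word.strip().upper()
    if word and word[0].isalpha() and 1 <= len(word) <= 26:
        hash_table[ord(word[0]) - 65][len(word) - 1].append(eums_hash(word))
```

The overflow-folding step is what keeps the hash bounded, the same trick used by the classic ELF hash.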


Later yall.

Euman
euman at bellsouth.net

----- Original Message ----- 
From: "Ray Smith" <smithr at ix.net.au>
To: "EUforum" <EUforum at topica.com>

> Derek Parnell wrote:
> 
> > Okay, so I took the bait. I've now got it down from 4.33 seconds to 1.7
> > seconds for that 4,267,462 byte file.
> 
> DAMN!
> 
> I can only get 2.8 seconds.
> I'll have to wait a whole month to see how u did it!
> 
> Ray Smith
> http://rays-web.com
> 
> P.S. Is it possible your PIII 550 is faster than my PIII 650???
> P.S.S I can't hope can't I!

new topic     » goto parent     » topic index » view message » categorize

19. Re: $100 Contest Question

On 4 Mar 2002, at 10:20, Derek Parnell wrote:

> 
> ----- Original Message -----
> From: "Ray Smith" <smithr at ix.net.au>
> To: "EUforum" <EUforum at topica.com>
> Sent: Monday, March 04, 2002 9:47 AM
> Subject: RE: $100 Contest Question
> 
> 
> > My PC at work (PIII 650 Windows 2000 Professional) does this
> > in 2.36 seconds (for a 4,121,548 bytes sized file)
> >
> 
> Okay, so I took the bait. I've now got it down from 4.33 seconds to 1.7
> seconds for that 4,267,462 byte file.

Ok, i bow out of the contest. It takes me 434 seconds (over 7 minutes) just 
to load a dictionary into memory. Building the dictionary from 3 files takes 72 
minutes. I took the premunge time to sort words by length, and i load only 
words corresponding in length to words present in the cyphered line. You are 
getting times 10x to 140x faster than i am. I concede.

Kat

new topic     » goto parent     » topic index » view message » categorize

20. Re: $100 Contest Question

Kat did you try my posted routine yet?

This builds an alphabetical 'A' to 'Z' sequence
that allows for word lengths up to 26 letters

example

"A," -- length (1)
"AD,AH,AL,AM,AN,AS,AT,AU,AW,AX,AY," -- length (2)
"ABC,ABE,ABO,ACE,ACT,ADA,ADD,ADO,ADS,ADZ,AFT,AGE, etc..etc" -- length (3)
"ABBA,ABBE,ABBY,ABED,ABEL,ABET,ABLE,ABLY,ABUT,ACES,ACHE,ACID, etc...etc" --
length (4)

down to length (26)
then starts over with the next beginning letter (B,C,D out to Z)

3.5 sec on my 233MHz, but instead of the text I presented, it will store UNIQUE
numerical values that represent the text.
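The layout Euman describes, grouping by first letter and then by word length, can be sketched in Python like so (the helper name is mine, and this is only an illustration of the grouping, not his hashing code):

```python
from collections import defaultdict

def bucket_words(words):
    """Group words by (first letter, length), mirroring the A..Z,
    length 1..26 layout described above."""
    buckets = defaultdict(list)
    for w in words:
        w = w.strip().upper()
        if w and "A" <= w[0] <= "Z" and len(w) <= 26:
            buckets[(w[0], len(w))].append(w)
    return buckets

b = bucket_words(["ad", "ah", "abc", "bed"])
# b[("A", 2)] -> ["AD", "AH"]; b[("B", 3)] -> ["BED"]
```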

Later,

Euman
euman at bellsouth.net

Q: Are we monetarily insane?
A: YES
----- Original Message ----- 
From: "Kat" <gertie at PELL.NET>
To: "EUforum" <EUforum at topica.com>
Sent: Sunday, March 03, 2002 8:02 PM
Subject: Re: $100 Contest Question


> 
> On 4 Mar 2002, at 10:20, Derek Parnell wrote:
> 
> > 
> > ----- Original Message -----
> > From: "Ray Smith" <smithr at ix.net.au>
> > To: "EUforum" <EUforum at topica.com>
> > Sent: Monday, March 04, 2002 9:47 AM
> > Subject: RE: $100 Contest Question
> > 
> > 
> > > My PC at work (PIII 650 Windows 2000 Professional) does this
> > > in 2.36 seconds (for a 4,121,548 bytes sized file)
> > >
> > 
> > Okay, so I took the bait. I've now got it down from 4.33 seconds to 1.7
> > seconds for that 4,267,462 byte file.
> 
> Ok, i bow out of the contest. It takes me 434 seconds (over 7 minutes) just 
> to load a dictionary into memory. Building the dictionary from 3 files takes
> 72
> minutes. I took the premunge time to sort words by length, and i load only 
> words corresponding in length to words present in the cyphered line. You are 
> getting times 10x to 140x faster than i am. I concede.
> 
> Kat

new topic     » goto parent     » topic index » view message » categorize

21. Re: $100 Contest Question

I'm at my office now, so I'll try your code tonight when I get home.

On my office machine (P6 800MHz, Windows 2000) this runs in 1.16 seconds.


4/03/2002 11:59:20 AM, euman at bellsouth.net wrote:

>Hello all,
>
>Talking about speed!
>
> [rest of quoted code snipped - identical to message 18 above]

---------
Cheers,
Derek Parnell

new topic     » goto parent     » topic index » view message » categorize

22. Re: $100 Contest Question

Hey Kat,
who says that any of my code would be faster than yours when run on your
machine? For that matter, if your code was run on my machine, who says that
it would be slower than my code?

Don't concede on the basis of your own hardware platform. The contest will be
run on Robert's platform anyhow!

The idea is to improve our Euphoria skills, not to see who has the beefiest
machine.

4/03/2002 12:02:36 PM, Kat <gertie at PELL.NET> wrote:

>
>On 4 Mar 2002, at 10:20, Derek Parnell wrote:
>
>> 
>> ----- Original Message -----
>> From: "Ray Smith" <smithr at ix.net.au>
>> To: "EUforum" <EUforum at topica.com>
>> Sent: Monday, March 04, 2002 9:47 AM
>> Subject: RE: $100 Contest Question
>> 
>> 
>> > My PC at work (PIII 650 Windows 2000 Professional) does this
>> > in 2.36 seconds (for a 4,121,548 bytes sized file)
>> >
>> 
>> Okay, so I took the bait. I've now got it down from 4.33 seconds to 1.7
>> seconds for that 4,267,462 byte file.
>
>Ok, i bow out of the contest. It takes me 434 seconds (over 7 minutes) just 
>to load a dictionary into memory. Building the dictionary from 3 files takes 72
>
>minutes. I took the premunge time to sort words by length, and i load only 
>words corresponding in length to words present in the cyphered line. You are 
>getting times 10x to 140x faster than i am. I concede.
>
>Kat
---------
Cheers,
Derek Parnell

new topic     » goto parent     » topic index » view message » categorize

23. Re: $100 Contest Question

On 3 Mar 2002, at 20:55, euman at bellsouth.net wrote:

> 
> Kat did you try my posted routine yet?

No, was i supposed to? I did try code Derek posted 3 days ago; it ran 10x to 
20x slower on my computer than on his.

Kat

new topic     » goto parent     » topic index » view message » categorize

24. Re: $100 Contest Question

4/03/2002 1:09:21 PM, Kat <gertie at PELL.NET> wrote:

>
>On 3 Mar 2002, at 20:55, euman at bellsouth.net wrote:
>
>> 
>> Kat did you try my posted routine yet?
>
>No, was i supposed to? I did try code Derek posted 3days ago, it ran 10x to 
>20x slower on my computer than on his.
>

Kat,
the purpose of that post was to see if it would run faster on your machine
than the code you were currently using, not to see if your machine was slower
than mine or not. By the sounds of it, the code I posted (using getc() rather
than get()) is heaps faster for you.

--------
Derek.

new topic     » goto parent     » topic index » view message » categorize

25. Re: $100 Contest Question

Ray Smith writes:
> OK, I see your point. (my weekend wasn't wasted .. phew!)
> 
> Can this be clarified by Rob as it wasn't to clear me (and probably 
> others) when I read the previous thread.

I think Derek summed it up correctly already, but I'll try again.

In problem #2, I want everyone to start by reading the same
set of words (Junko's), formatted the same way on disk.
Certainly, if everyone used a different set of words I'd have
a hard time verifying correctness, and a hard time comparing
performance.

In problem #3 it doesn't matter. You can use whatever dictionary
file you like, formatted however you like. If you don't use Junko's
you'll have to send me your dictionary file. 

In both #2 and #3 I'm imagining that people will load
the words into memory (i.e. a sequence of some kind,
although poking the data into allocated memory would be legal),
so they can make many dictionary queries at high speed. 
I have a 64 Mb RAM machine (as I've noted now on the Web page),
so there shouldn't be any problem keeping the whole
51802-word dictionary in memory. Writing it out to disk
in some optimized format is legal, but unlikely to be as fast
as accessing the words in memory.

In #2 you will first load the words into memory,
maybe with top-level code in your include file,
or with an init() routine that you call, or maybe your pattern
matching routine will do it on the first call (only) 
to that routine.
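Rob's "load on the first call" option can be sketched like this (a minimal Python illustration; the file name, cache variable, and helper name are my own assumptions, not contest API):

```python
_words = None  # module-level cache, empty until the first call

def get_words(path="words.txt"):
    """Load the word list on the first call only; every later call
    reuses the in-memory copy, as Rob describes above."""
    global _words
    if _words is None:
        with open(path) as f:
            _words = [line.strip().upper() for line in f if line.strip()]
    return _words
```

A pattern-matching routine would then call `get_words()` at the top, so the load cost is paid once across the 1000 timed calls.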

Derek Parnell writes:
> For the first ever run of the program I used words.txt file, 
> reformatted it and wrote out another file. In second and 
> subsequent uses of the program, I used my reformatted file. 
> But apparently this is no longer allowed. I still haven't had an 
> official ruling on this case, but I'm going to assume that 
> I have to change the way I did the program.

This is interesting, but for this competition it won't
be useful because I want to time one run of each program,
starting from scratch with Junko's file. The time will
include the word loading time plus the time to perform
1000 pattern matching calls.

(In reality, I'll probably run programs more than once
to get a consistent time based on everything being in the
operating system cache, but I don't want programs to 
take advantage of anything they did in a previous run.)

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com

new topic     » goto parent     » topic index » view message » categorize

26. Re: $100 Contest Question

On 4 Mar 2002, at 13:51, Derek Parnell wrote:

> 
> 4/03/2002 1:09:21 PM, Kat <gertie at PELL.NET> wrote:
> 
> >
> >On 3 Mar 2002, at 20:55, euman at bellsouth.net wrote:
> >
> >> 
> >> Kat did you try my posted routine yet?
> >
> >No, was i supposed to? I did try code Derek posted 3days ago, it ran 10x to
> >20x
> >slower on my computer than on his.
> >
> 
> Kat,
> the purpose of that post was to see if it would run faster on your machine
> than
> the code you were currently using, not to see if your machine was slower than
> mine or not. By the sounds of it, the code I posted (using getc() rather than
> get()) is heaps faster for you. 

You'd think so, but code Jiri posted last month ran *much slower* using 
getc() than gets() or get(). 

Kat

new topic     » goto parent     » topic index » view message » categorize

27. Re: $100 Contest Question

On 3 Mar 2002, at 19:59, euman at bellsouth.net wrote:

> 
> Hello all,
> 
> Talking about speed!
> 
> I found a much better way to implement my Hash table (below) for the 
> "words.txt" file so I thought Id post this old piece.
> 
> -- hashing the dictionary takes @3.5 sec on my 233mhz laptop
> -- 26 x 26 In alphabetical and length of text order.

It reported 6.5 sec. I have not had a positive virus scan in 3 years. 
Taskinfo2000 consistently reports Eu gets 90% or better CPU time to itself, 
typically 97%.

Kat

new topic     » goto parent     » topic index » view message » categorize

28. Re: $100 Contest Question

On 4 Mar 2002, at 13:39, Derek Parnell wrote:

> 
> Hey Kat,
> who says that any of my code would be faster than yours when run on your
> machine? For that matter, if your code was run on my machine, who says that it
> would be slower than my code?
> 
> Don't concede on the basis of your own hardware platform. The contest will be
> run on Robert's platform anyhow!
> 
> The idea is to improve our Euphoria skills, not to see who has the beefiest
> machine.

I know, but i can't account for some speed differences of 140x between us. 
Or why me loading a 1meg file uses 5megs of memory, or when i save it, it 
uses 6megs. Granted, you have better hardware, and even Euman's laptop 
is probably newer, but even considering you two running PIO mode 4+ and/or 
wide SCSI harddrives, all i can guess is you are better programmers. My 
cap's off to you.

Kat

> 4/03/2002 12:02:36 PM, Kat <gertie at PELL.NET> wrote:
> 
> >
> >On 4 Mar 2002, at 10:20, Derek Parnell wrote:
> >
> >> 
> >> ----- Original Message -----
> >> From: "Ray Smith" <smithr at ix.net.au>
> >> To: "EUforum" <EUforum at topica.com>
> >> Sent: Monday, March 04, 2002 9:47 AM
> >> Subject: RE: $100 Contest Question
> >> 
> >> 
> >> > My PC at work (PIII 650 Windows 2000 Professional) does this
> >> > in 2.36 seconds (for a 4,121,548 bytes sized file)
> >> >
> >> 
> >> Okay, so I took the bait. I've now got it down from 4.33 seconds to 1.7
> >> seconds for that 4,267,462 byte file.
> >
> >Ok, i bow out of the contest. It takes me 434 seconds (over 7 minutes) just
> >to
> >load a dictionary into memory. Building the dictionary from 3 files takes 72
> >minutes. I took the premunge time to sort words by length, and i load only
> >words corresponding in length to words present in the cyphered line. You are
> >getting times 10x to 140x faster than i am. I concede.
> >
> >Kat
> >
> >
> ---------
> Cheers,
> Derek Parnell 

new topic     » goto parent     » topic index » view message » categorize

29. Re: $100 Contest Question

----- Original Message ----- 
From: "Kat" <gertie at PELL.NET>
To: "EUforum" <EUforum at topica.com>
Subject: Re: $100 Contest Question


> all i can guess is you are better programmers. My 
> cap's off to you.

Kat, are you having a bad night?

You're far better than I am, I'm sure of it.

Maybe you need an aspirin to come back to reality...
Just kiddin'

Euman

new topic     » goto parent     » topic index » view message » categorize

30. Re: $100 Contest Question

Hi Chris,
I'm home now, so here are the details I promised.

>
> What kind of bench test are you using Derek? I get nowhere near those
> kinds of results.

This is the group of 16 tests I repeat 80 times.

    Test({1,1,2,3,4})
    Test({'E', 1, 2, 3, 1, 3})
    Test({})
    Test({1,1})
    Test({1})
    Test({1,2,3})
    Test({4,'i','e'})
    Test({1,'E',2, 'E', 3, 'e'})
    Test({"ABC"&'\'','s'})
    Test("rabbit")
    Test("r" & {1,1,2})
    Test({'t',1,2})
    Test({1,2,3,'s'})
    Test({1,1,2,2})
    Test({1,2,2,1})
    Test({1,2,3,4,'e','d'})


> I use..
>   match_pattern({1,2,3,1,4,3,3,2,5,3}) -- ZIGZAGGING
> ..in an iterated loop
>
> I don't think I can even iterate 50x in 1.48 seconds
> And that pattern is on the faster side of the avergage.
>
> I don't understand how you could possibly get it that fast, unless your
> pattern is {1}, only then is my program comparable to your results.
>
> I've still got some tweaking to do, but I don't see it getting much
> faster.
>
> Try...
> -- find all 7 letter words with unique letters
> match_pattern({1,2,3,4,5,6,7})
>
> I'd like to know what your iteration time is for that one.

This takes between 0.99 and 1.04 seconds to do 10,000 iterations.

  sequence VOID
  atom ts
  ts=time()
  for i = 1 to 10000 do
    Test({1,2,3,4,5,6,7})
  end for
  printf(1, "%f\n", time()-ts)


For the ZIGZAGGING test, it takes between 2.97 and 3.04 seconds to do 10,000
iterations.


Thanks for getting me to do this test 'cos I can see a further optimisation
now *blink*

----------
cheers,
Derek

new topic     » goto parent     » topic index » view message » categorize

31. Re: $100 Contest Question

Kat wrote:
> My new question: It's taking me 143 seconds to load a dictionary with the 
> following code:
> 
> dfilename = "D:\\Gertie\\decypher\\dfile.txt"
> readfile = open(dfilename,"r")
> readline = get(readfile)
> readline = readline[2]
> close(readfile)
> 
> And since i don't see how i can make that any faster, can we *not* count 
> load times for this contest? My puter is a K6-2-233 running win95, so it's 
> prolly the slowest one here (not counting Igor's 386's), but still, i don't
> see
> the load times improving off the hd with faster cpus..

I have a WinChip C6 200MHz machine and I load the 51,802-word list in 5 secs,
including building several special tables and calculations. I have DMA33
(can do UDMA66) and a 5200 RPM, 2 MB cache hard disk, but this is mainly a
matter of the algorithm. Before tweaking, the time was about 50 secs...

P.S. I think that either David Cuny or Jiri Babor will win #3. Your bets?
P.S.2 Now I know what time_profiling is good for! I wish I was registered!

    Martin

new topic     » goto parent     » topic index » view message » categorize

32. Re: $100 Contest Question

From: <bensler at mail.com>
> Can someone give me a benchmark for problem #2?
> 
> Total run time, number of iterations, and the filter/s used
> I need to know if I'm in the ball park, or if I need to reconsider my 
> implementation.

Using this benchmark :
-----------------------------------------------
    sequence words
    load_words()
    ? time() - t
    constant pats =
        {
            {1,2,3,4,5,4,3,2,1},
            {1,2,'X'},
            {'M',1,2,3,4,5},
            {1,2,3,4,5,6,7,8,9,10},
            {'E',1,1,2},
            {1,2,2,1,3},
            {1,1,2},
            {1,2,1},
            {'M',1,2,1,'M'},
            "MARTIN",
            {1,2,'X',2,1},
            {1,2,3,'B',3,4},
            {1,2,'M',2,1},
            {'E',1,2,3,1,3}
        }
    for p=1 to length(pats) do
        words = get_words_by_pattern(pats[p])
    end for
    ? time()-t
-----------------------------------------------
I get this output:
    5.06
    5.5
    
On my Winchip @200Mhz, Win98. And you?

    Martin

new topic     » goto parent     » topic index » view message » categorize

33. Re: $100 Contest Question

Maybe someone is interested in this: I happened to find the word BBUFFALOES
instead of BUFFALOES in Junko's list. I don't know if there are other
errors.
----- Original Message -----
From: "Robert Craig" <rds at RapidEuphoria.com>
To: "EUforum" <EUforum at topica.com>
Subject: Re: $100 Contest Question


>
> Chris Bensler writes:
> > What platform will be used to test with? To be fair,
> > it would have to be tested on all 3.
>
> I'll use DOS. It's a bit inconvenient to boot into Linux.
>
> > What if one entry only works on a specific platform,
> > but is faster than all others for that platform?
>
> It should be easy to make your program work on all platforms,
> but if a program fails on DOS, but works on Linux or Windows,
> I'll measure its time on a system it works on (on the same machine)
> and decide if there is any unfairness. A program like that
> can still win, but won't be eligible for the $5 bonus.
>
> > What are the valid match characters for Contest#2? A-Z and a-z?
> > What about hyphens, and apostrophes?
>
> As Derek suggested,
> if a '-' or '\'' is supplied (or some character greater than ASCII 32),
> it should be treated as a literal character to be matched. Values
> from 0 to 32 represent "meta" characters, or placeholders for
> unspecified characters in the pattern. I'll only give you upper case
> literal characters, A, B, C, ...
>
> Euman writes:
> > Can I have text length files? what I mean is word files that are
> > seperated by the length of their text?
>
> I'm not sure what you mean.
> On problem #2 you must use Junko's dictionary (word list),
> because I will be determining correctness based on
> the words in that particular dictionary. If you want to reformat
> her file into a different file, that's fine, but the time that it
> takes to do that will be included in your total time.
>
> You can assume that there is enough memory to store
> the 51802 words in memory (unless you store them in a
> bizarrely inefficient way).
> I'll be using a machine with 64 Mb of RAM.
>
> On problem #3 you can use any dictionary, formatted
> any way you like, but I think Junko's is quite reasonable.
>
> Martin Stachon writes (privately):
> > After the load of the wordlist, how many times you will
> > call the function?
>
> In problem #2, assume that I will make 1000
> calls to your function.
>
> Derek Parnell writes:
> > On a similar point I made the assumption that a pattern of {4,6,9} is
> > equivalent to {1,2,3}. In other words, the actual value of the pattern
> > characters is not important, only that they represent a unique character
in
> > the target word(s).
>
> Yes, that's correct.
>
> Aku writes:
> > (Problem #1) How is the time calculated?
> > How many iteration (loops) will it be tested?
>
> I'm planning to run each program once,
> with a few megabytes of input text.
> I'll actually do it a few times each,
> and ignore the first run,
> since the first time, the data won't be in
> the operating system's memory cache.
>
> Aku writes:
> > (Problem #1)  Will the input contain byte 0 ?
>
> No. It will consist of the cipher line plus
> many lines of English text - no weird characters,
> no huge (over 1000 characters) lines.
>
> I'll add these points to the Web page later today.
>
> Regards,
>    Rob Craig
>    Rapid Deployment Software
>    http://www.RapidEuphoria.com

new topic     » goto parent     » topic index » view message » categorize

34. Re: $100 Contest Question

Martin Stachon wrote:
>             {1,2,3,4,5,4,3,2,1},
>             {1,2,'X'},
>             {'M',1,2,3,4,5},
>             {1,2,3,4,5,6,7,8,9,10},
>             {'E',1,1,2},
>             {1,2,2,1,3},
>             {1,1,2},
>             {1,2,1},
>             {'M',1,2,1,'M'},
>             "MARTIN",
>             {1,2,'X',2,1},
>             {1,2,3,'B',3,4},
>             {1,2,'M',2,1},
>             {'E',1,2,3,1,3}
> -----------------------------------------------
> I get this output:
>     5.06
>     5.5
>
> On my Winchip @200Mhz, Win98. And you?
>

Euman's hash table runs at 1.16, but I tweaked that a lot and got it to run
at 0.22.

For Comp#2 program:
 I get this output for Martin's benchmark list
     5.38  (load time)
     5.98  (total time after 100x14 tests)
    -----
     0.60  Time for 100x14 tests, thus
     0.0004 (per test)


This is my benchmark list now and I added further optimisations last night.
    {1,1,2,3,4}
    {'E', 1, 2, 3, 1, 3}
    {1,'e',2,3,4,'s'}
    {1,1}
    {1}
    {1,2,3}
    {4,'i','e'}
    {1,'E',2, 'E', 3, 'e'}
    {"ABC" & '\'','s'}
    "rabbit"
    "r" & {1,1,2}
    {'t',1,2}
    {1,2,3,'s'}
    {1,1,2,2}
    {1,2,2,1}
    {1,2,3,4,'e','d'}
    {1,2,3,4,5,6,7}
    {1,2,3,1,4,3,3,2,5,3}


 I get this output for my benchmark list
     5.38  (load time)
     6.32  (total time after 100x18 tests )
    -----
     0.94  time for 100x18 tests, thus
     0.0005 (per test)

 On my Intel P3 @550Mhz, WinMe.

Comparing these with Martin's and Chris Bensler's times, it seems that I
spend most of the time in loading in the word list, but once loaded, finding
matches is very fast.

---------------
Derek

new topic     » goto parent     » topic index » view message » categorize

35. Re: $100 Contest Question

----- Original Message -----
From: "Robert Craig" <rds at RapidEuphoria.com>
To: "EUforum" <EUforum at topica.com>
Subject: Re: $100 Contest Question

Rob Craig writes:
 <snip>
 > In problem #2, assume that I will make 1000
> calls to your function.
<snip>

It seems unfair to call the function only 1000 times, considering that 50,000
words had to be processed somehow beforehand.

new topic     » goto parent     » topic index » view message » categorize

36. Re: $100 Contest Question

----- Original Message ----- 
From: "Derek Parnell" <ddparnell at bigpond.com>
To: "EUforum" <EUforum at topica.com>
> 
> Euman's hashtable runs at 1.16 but I tweaked that a lot and got it to run at
> 0.22

Yes you did, thanks BIG D

When "I" changed this: 
h *= 16

to this:
h *= 3 -- which doesn't give as unique a value as before, but is still very
effective.

The routine was @250% faster, bringing the time from 3.5 sec on my 233MHz down
to 1.1 sec.
This is the routine I was talking about sharing in a month or so.

But then BIG D convinced me that getc() would shave another 0.25 sec off the
load time. I was impressed. He also went a few steps further, and now EumsHash
runs at 0.60 sec on my 233MHz laptop (at 200MHz desk). I can imagine figures
on a PIII 500 or higher machine being (0.0something) or better now.

Prolly could beat Junko's spellchecker; I haven't coded this, so I'm not sure.

sure is BLAZING FAST!

Euman

new topic     » goto parent     » topic index » view message » categorize

37. Re: $100 Contest Question

Chris are you sure the output is the same (hash table values)?

If not I would say that the table lookup speed might differ.

Right now, the table lookup is hundreds of times faster than reading in
"words.txt" and creating the table.

Reading and creating the table is 0.60 sec (233MHz) - FAST.
Imagine what the lookup will be on, say, a 15,000-word .txt file
like the one Junko's spell checker is timed against (2 sec).

I'm saying the spell checker could be @1 sec for the same, using
EumsHash(), but this is just a guesstimate.

Euman
euman at bellsouth.net

Q: Are we monetarily insane?
A: YES
----- Original Message ----- 
From: <bensler at mail.com>
To: "EUforum" <EUforum at topica.com>
Sent: Monday, March 04, 2002 5:47 PM
Subject: RE: $100 Contest Question


> 
> I got it down to 0.11 with minor tweaks, I think I can get it even 
> faster. :)
> 
> Chris
> 
> euman at bellsouth.net wrote:
> > ----- Original Message ----- 
> > From: "Derek Parnell" <ddparnell at bigpond.com>
> > To: "EUforum" <EUforum at topica.com>
> > > 
> > > Euman's hashtable runs at 1.16 but I tweaked that a lot and got it to 
> > > run at
> > > 0.22
> > 
> > Yes you did, thanks BIG D
> > 
> > When "I" changed this: 
> > h *= 16
> > 
> > to this:
> > h *= 3 -- which doesnt give as Unique a value as before but is still 
> > very effective.
> > 
> > The routine was @250% faster bringing the time from 3.5 sec on my 233mhz 
> > to 1.1 sec 
> > This is the routine I was talking about sharing in a month or so.
> > 
> > But when BIG D convinced me that getc( ) would shave another 0.25 sec 
> > off the load time.
> > I was impressed. He also went a few steps further and now EumsHash runs 
> > in at 0.60 sec
> > on my 233mhz laptop ( at 200mhz desk). I can imagine figures on PIII 500 or 
> > higher machine 
> > being (0.0something) or better now. 
> > 
> > Prolly could beat Junko's Spellchecker I havent coded this so Im not 
> > sure.
> > 
> > sure is BLAZING FAST!
> > 
> > Euman

new topic     » goto parent     » topic index » view message » categorize

38. Re: $100 Contest Question

On Mon, 4 Mar 2002 16:45:58 -0300, rforno at tutopia.com wrote:

>
>Maybe someone is interested in that I casually found the word BBUFFALOES
>instead of BUFFALOES in Junko's list. I don't know if there are other
>errors.

"AGENCY" and "AGENCY " (agency+space),
"DYNAMICALLY" and "DYNAMICALLY " (ditto)
"REWRITING" and "REWRITING "(ditto)

"ST." (ST+fullstop)

Pete

new topic     » goto parent     » topic index » view message » categorize

39. Re: $100 Contest Question

----- Original Message ----- 
From: <bensler at mail.com>
To: "EUforum" <EUforum at topica.com>
> 
> Grr, I DID test before I said anything, to be sure. But I double 
> checked, and the values aren't the same :(
> I'll have to see if I can still do it.
> 
> Like Derek said though, I don't see the purpose of using that table.
> What does it matter what the first letter of each word is?
> Only a minute set of circumstances would benefit from it.
> 
> Chris
> 

As long as the values for each word are unique, and as long as the two
single-character words are 'A' and 'I', then you're probably doing OK
with the speed-up you've done.

> I don't see the purpose of using that table

I can see randomly seeding a word till I find a match, then applying
the seeds to other words for consideration, being fairly fast.
This is my mindset for the project; I'm sure I'll be beaten to death
on this, but I'll still give it a good shot.

Euman

new topic     » goto parent     » topic index » view message » categorize

40. Re: $100 Contest Question

I fixed the errors in the dictionary, reported by
Pete Lomax and Ricardo Forno.

BBUFFALOES    -> BUFFALOES
AGENCY + space    -> duplicate, deleted
DYNAMICALLY + space  -> duplicate, deleted
REWRITING + space   -> duplicate, deleted
ST. (ST+fullstop) -> deleted, "ST" already exists, no other words have "."

So the dictionary now has 51798 words.
You can assume that it's sorted and is all upper case.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com
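The cleanup Rob describes (trailing spaces stripped, duplicates removed, the "ST." entry dropped) can be sketched like this; note that a genuine misspelling such as BBUFFALOES still needs a manual fix, since no automatic rule can catch it (the helper name is mine, not part of any contest code):

```python
def clean_wordlist(words):
    """Strip stray whitespace, drop entries containing non-letters
    (e.g. "ST."), remove duplicates, and keep the list sorted."""
    cleaned = set()
    for w in words:
        w = w.strip().upper()
        if w and w.isalpha():   # "ST." fails isalpha() and is dropped
            cleaned.add(w)
    return sorted(cleaned)

# "AGENCY " collapses into "AGENCY"; "ST." is dropped, "ST" survives
clean_wordlist(["AGENCY", "AGENCY ", "ST.", "ST"])
```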

new topic     » goto parent     » topic index » view message » categorize

41. Re: $100 Contest Question

Derek:
It seems to me I don't understand the rules. Being that Junko's words.txt
contains only upper case, what is the point of:
Test({1,2,3,4,'e','d'})  ?
Thanks.


----- Original Message -----
From: "Derek Parnell" <ddparnell at bigpond.com>
To: "EUforum" <EUforum at topica.com>
Subject: Re: $100 Contest Question


>
> Hi Chris,
> I'm home now, so here are the details I promised.
>
> >
> > What kind of bench test are you using Derek? I get nowhere near those
> > kinds of results.
>
> This is the group of 16 tests I repeat 80 times.
>
>     Test({1,1,2,3,4})
>     Test({'E', 1, 2, 3, 1, 3})
>     Test({})
>     Test({1,1})
>     Test({1})
>     Test({1,2,3})
>     Test({4,'i','e'})
>     Test({1,'E',2, 'E', 3, 'e'})
>     Test({"ABC"&'\'','s'})
>     Test("rabbit")
>     Test("r" & {1,1,2})
>     Test({'t',1,2})
>     Test({1,2,3,'s'})
>     Test({1,1,2,2})
>     Test({1,2,2,1})
>     Test({1,2,3,4,'e','d'})
>
>
> > I use..
> >   match_pattern({1,2,3,1,4,3,3,2,5,3}) -- ZIGZAGGING
> > ..in an iterated loop
> >
> > I don't think I can even iterate 50x in 1.48 seconds
> > And that pattern is on the faster side of the average.
> >
> > I don't understand how you could possibly get it that fast, unless your
> > pattern is {1}, only then is my program comparable to your results.
> >
> > I've still got some tweaking to do, but I don't see it getting much
> > faster.
> >
> > Try...
> > -- find all 7 letter words with unique letters
> > match_pattern({1,2,3,4,5,6,7})
> >
> > I'd like to know what your iteration time is for that one.
>
> This takes between 0.99 and 1.04 seconds to do 10,000 iterations.
>
>   sequence VOID
>   atom ts
>   ts=time()
>   for i = 1 to 10000 do
>     Test({1,2,3,4,5,6,7})
>   end for
>   printf(1, "%f\n", time()-ts)
>
>
> For the ZIGZAGGING test, it takes between 2.97 and 3.04 seconds to do
10,000
> iterations.
>
>
> Thanks for getting me to do this test 'cos I can see a further
optimisation
> now blink
>
> ----------
> cheers,
> Derek

new topic     » goto parent     » topic index » view message » categorize

42. Re: $100 Contest Question

5/03/2002 7:43:18 AM, rforno at tutopia.com wrote:

>
>Derek:
>It seems to me I don't understand the rules. Being that Junko's words.txt
>contains only upper case, what is the point of:
>Test({1,2,3,4,'e','d'})  ?
>Thanks.
>
>

Strictly speaking, you're correct. If we read the rules, they do imply that the
pattern is case-sensitive.

"Assume that the numbers 0 to 32 are used to indicate the pattern, while the
numbers above 32 are the ASCII codes of literal characters to be matched
(including apostrophe and hyphen)."
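For anyone implementing it, the quoted rule can be sketched in Python. This is a hypothetical matcher, not any contestant's code, and the requirement that two different slot numbers map to two different letters is my reading of the usual pattern convention, not something the quoted rule states:

```python
def matches(word, pattern):
    """Check a word against a contest-style pattern: values 0..32 are
    placeholder slots (same value = same letter; here, different values
    are assumed to mean different letters); values above 32 are literal
    ASCII character codes that must match exactly."""
    if len(word) != len(pattern):
        return False
    slot_to_char = {}   # slot number -> letter it stands for
    char_to_slot = {}   # letter -> slot number (enforces distinctness)
    for ch, p in zip(word, pattern):
        if p > 32:                  # literal character
            if ord(ch) != p:
                return False
        else:                       # placeholder slot
            if slot_to_char.setdefault(p, ch) != ch:
                return False
            if char_to_slot.setdefault(ch, p) != p:
                return False
    return True

print(matches("SILLY", [1, 2, 3, 3, 4]))                    # True
print(matches("SOARED", [1, 2, 3, 4, ord('E'), ord('D')]))  # True
```

Under this reading, Test({1,2,3,4,'E','D'}) asks for six-letter words ending in a literal "ED" whose first four letters are all different.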

And thus applying lowercase characters in the pattern to WORDS.TXT would be a
waste of time.

Because the other competitions stressed case-insensitivity, I assumed it would
apply to this comp as well. Thanks for pointing out my error.

---------
Derek.


43. Re: $100 Contest Question

For those not too bored with this contest yet, after more tweaking in
comp#2, I now get this output for my benchmark list:

      1.43  (load time)
      3.02  (total time after 100x32 tests )
     -----
      1.59  time for 100x32 tests, thus
      0.0005 (per test)

  On my Intel P3 @550Mhz, WinMe.

The suite of 32 tests I'm using is:

{1,1,2,3,4}
{'E', 1, 2, 3, 1, 3}
{1,'E',2,3,4,'S'}
{1,1}
{1}
{1,2,3}
{4,'I','E'}
{1,'E',2, 'E', 3, 'E'}
{1,2,2,3,'S'}
"RABBIT"
"R" & {1,1,2}
{'T',1,2}
{1,2,3,'S'}
{1,1,2,2}
{1,2,2,1}
{1,2,3,4,'E','D'}
{1,2,3,4,5,6,7}
{1,2,3,1,4,3,3,2,5,3}
{1,2,3,4,5,4,3,2,1}
{1,2,'X'}
{'M',1,2,3,4,5}
{1,2,3,4,5,6,7,8,9,10}
{'E',1,1,2}
{1,2,2,1,3}
{1,1,2}
{1,2,1}
{'M',1,2,1,'M'}
"MARTIN"
{1,2,'X',2,1}
{1,2,3,'B',3,4}
{1,2,'M',2,1}
{'E',1,2,3,1,3}

which I repeat 100 times.

-------
Derek


44. Re: $100 Contest Question

On Sun, 3 Mar 2002 23:37:20 -0500, Robert Craig
<rds at RapidEuphoria.com> wrote:

>but I don't want programs to 
>take advantage of anything they did in a previous run.)

Everyone knows that there are 51798 words in the dictionary.
I've also worked out that one of my arrays will need 122545 entries.

I assume it is ok to code these constants into the program.

In contest #2, you specify a library routine. Therefore I expect you
to invoke my code with

include myfile
...
select(pattern)

in a standard test rig rather than the dos prompt redirection thingy.
Is that right? If not, do you want a library routine and a loading
program, or do you want them munged into a single standalone program?

You also specify "a sequence containing all the words". Should this be
in alphabetical order? I'm going to assume not unless otherwise told.

Pete Lomax


45. Re: $100 Contest Question

Pete Lomax writes:
> Everyone knows that there are 51798 words in the dictionary.
> I've also worked out that one of my arrays will need 122545 entries.
> 
> I assume it is ok to code these constants into the program.

I'm not judging based on coding style, but
the dictionary has changed in size once.
It could happen again. If I have to go into your code
and change a bunch of hard-coded numbers to make it work,
I won't be too happy. If you have a single constant defined
for the dictionary size, from which you compute other constants,
I guess I can change that single constant for you.

> In contest #2, you specify a library routine. Therefore I expect you
> to invoke my code with
>
> include myfile
> ...
> select(pattern)

> in a standard test rig rather than the dos prompt redirection thingy.
> Is that right? 

Yes.

> If not, do you want a library routine and a loading
> program, or do you want them munged into a single 
> standalone program?

See 
http://www.RapidEuphoria.com/contest.htm

I recently clarified a bunch of those issues.

> You also specify "a sequence containing all the words". Should this be
> in alphabetical order? I'm going to assume not unless otherwise told.

The result sequence does not have to be in alphabetical order.

Ray Smith writes:
> Just playing with input redirection and I can't get it to work with 
> the windows interpreter (exw).

It works fine for me with Windows ME, but apparently it won't
work on NT/2000/XP systems. So just use ex.exe instead.
You can still get the $5 bonus for "working on all platforms".

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


46. Re: $100 Contest Question

No, I tested my program with some patterns and then it appeared.
----- Original Message -----
From: "C. K. Lester" <cklester at yahoo.com>
To: "EUforum" <EUforum at topica.com>
Subject: RE: $100 Contest Question


>
> rforno at tutopia.com wrote:
> > Maybe someone is interested to know that I casually found the
> > word BBUFFALOES instead of BUFFALOES in Junko's list.
> > I don't know if there are other errors.
>
> No wonder my program kept crashing!!! heheh
>
> How'd you find that, btw? Just reading the list?
>
>
>
>


47. Re: $100 Contest Question

Chris Bensler writes:
> I'm assuming we cannot assume anything about words.txt except that it
> will be sorted, and all words are capitalized.
>
> Are there other specifications of the file that we CAN count on Rob? 
> like number of words, or number of words beginning with each letter.
> Or number of words of each length. What about maximum word length?

You can assume that if there are further changes to words.txt,
they will only be minor changes, perhaps deleting a word here
or there. Three words were removed recently because they were
duplicates of previous words, except that they also had a 
blank character on the end. Another word was a duplicate,
but it had a period on the end, and it was the only word 
in the dictionary like that ("ST." - abbreviation for "saint"). 
I didn't want to leave that sort of corruption in the file, 
lest it trip someone up unfairly. I ran a little program 
yesterday to confirm that all words are in alphabetical order, 
and there are no duplicates, and no blank characters or periods.

So go ahead and assume what you like,
but for safety you might want to allow for some slight changes,
i.e. create sequences slightly larger than they need to be.
In general, this is the kind of coding decision that C/C++
forces on you, not Euphoria, unless you are obsessed with speed,
which I guess you are. smile

> Should the routine handle lowercase parameters?

In problem #2, I won't give you any lower case letters
as part of the pattern. 

In problem #3 you will have to accept upper and lower
case text as the input, and you should preserve it in the output.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


48. Re: $100 Contest Question

Chris Bensler wrote:
> On my PIII 600mhz/64MB :
> 
>   Euman's hashtable runs at 0.8s

On my machine it takes 6-8 secs...

Damn the machine {1,2,'C','K',1} ...

Using TestCPU

                   MOV memory  MOVSD memory  Calcul.score
PIII    @550Mhz :  429 MB/s    1126 MB/s     220
Winchip @200Mhz :   60 MB/s     142 MB/s      35

          score -  Dhrystones  Whetstones    MIPS   MFLOPS
                :  203         195           200    215
                :   45          24            17     22

So I should divide my times by 10 ...

Also I can't tell whether an algorithm which is faster on my
machine will be faster on Rob's machine, because of the different
cache sizes (actually I have a larger L1 cache than the PIII, but the PIII
has a faster L2 cache) and the PIII's advanced features such as branch
prediction, SSE, etc.

P.S.: I've just fixed yet another bottleneck in my routine;
#2 will be really tough... a bit of luck will decide it, too...
Maybe I should move to #3, but I do not dare to challenge David Cuny.

    Martin


49. Re: $100 Contest Question

On Tue, 5 Mar 2002 12:50:35 -0500, Robert Craig
<rds at RapidEuphoria.com> wrote:

> If I have to go into your code
>and change a bunch of hard-coded numbers to make it work,
>I won't be too happy.

I assumed you'd immediately disqualify it!

>If you have a single constant defined
>for the dictionary size, from which you compute other constants,
>I guess I can change that single constant for you.

I have several, but, like 122545, they're mainly empirical results. Without the
program that blatted that number out (or including all that code in
my submission), there's no simple calculation for it. Hence, if my program
doesn't work because that number is wrong, don't change it, just
disqualify my entry.

>See 
>http://www.RapidEuphoria.com/contest.htm
>
Thanks, I'd forgotten to make select() global!
A couple of extra test cases too, I see, thanks also.

>The result sequence does not have to be in alphabetical order.

UGH. I was sneakily hoping to put a few competitors back a step or two
on that one, sly devil that I am. (grin)

Cheers,
Pete


50. Re: $100 Contest Question

On Wed, 6 Mar 2002 00:57:33 +1100, Derek Parnell
<ddparnell at bigpond.com> wrote:

>
>For those not too bored with this contest yet, after more tweaking in
>comp#2, I now get this output for my benchmark list:
>
>      1.43  (load time)
>      3.02  (total time after 100x32 tests )
>     -----
>      1.59  time for 100x32 tests, thus
>      0.0005 (per test)
>
>  On my Intel P3 @550Mhz, WinMe.

Load time 3.170000, test 15.480000, total 18.650000, iteration 0.009675

Let's call it a draw.

Pete


51. Re: $100 Contest Question

Martin wrote:

> Maybe I should move to #3 but I do not dare to 
> challenge David Cuny.

I'm not the one to worry about. Doesn't C. K. Lester have his "White Rabbit" 
program solving cryptograms in about 5 seconds? I'm using a fairly naive 
genetic algorithm (GA) as the core of my solver. Even with a number of 
optimizations, it took about an hour to crunch through:

  king nero prvby khnc avcl fhyyvy cavrdlk chfn
  deaf tfengy ic dhuly erbv ychtl
  riby xhff ertfikerd fhyyel

before finally settling on:

  dumb minx knots damp rope lassos pronged palm
  girl climbs up gazes into space
  nuts fall including lassie

There may be room in the code for improvement - profiling will tell - but a 
10:1 improvement? I suspect not, especially when "White Rabbit" is apparently 
churning out the solution in seconds.

The real problem for me seems to be the algorithm. GA is dead simple to 
implement, and will eventually hit on the solution. But it's not especially 
efficient. I'm considering using trigraphs instead; they look *very* 
successful. Take a look at:

   http://www.muth.org/Robert/Cipher/substitution
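For the curious, the trigraph approach boils down to scoring a candidate plaintext by how English-like its three-letter sequences are, then letting the search (GA, hill-climbing, whatever) maximise that score. A minimal Python sketch, with made-up frequency numbers purely for illustration:

```python
# Toy log-frequency table; a real solver would tabulate counts for all
# trigraphs from a large English corpus. These values are illustrative only.
TRIGRAPH_LOGFREQ = {"THE": -2.0, "ING": -2.3, "AND": -2.4}
FLOOR = -12.0   # penalty for trigraphs never seen in the corpus

def score(plaintext):
    """Higher score = more English-like. Sums the log-frequencies of all
    overlapping three-letter windows in the letters of the text."""
    text = "".join(c for c in plaintext.upper() if c.isalpha())
    return sum(TRIGRAPH_LOGFREQ.get(text[i:i + 3], FLOOR)
               for i in range(len(text) - 2))

print(score("THE KING") > score("XQZ VVKJ"))  # True
```

The floor value matters: without it, a single unseen trigraph would have no defined cost, and with a zero-probability model it would veto an otherwise good decipherment.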

Because the topic of solving cryptograms is so well researched, I suspect the 
winners will all select the same general algorithm. So the prize will go to 
the person who bothers to do a little research and optimizes the heck out of 
their code. That sounds like jiri, C. K. Lester or Derek... but certainly not 
me. I don't take my code down to that sort of optimization.

Thanks anyway for the vote!

-- David Cuny


52. Re: $100 Contest Question

http://www.muth.org/Robert/Cipher/substitution

-- David Cuny

I think we've studied the same material DC, I posted this when
the contest started.

http://www.cs.arizona.edu/http/html/people/muth/Cipher/

Euman


53. Re: $100 Contest Question

Euman wrote:

> I think we've studied the same material DC, I posted this when
> the contest started.

Sorry about not giving credit. I ran across the site after doing a Google 
search for a cryptogram you posted.

-- David Cuny


54. Re: $100 Contest Question

I hesitate to mention this, not only because it's the only optimization my 
pathetic code has, but because it's so obvious...

People have been talking about a single dictionary, but you actually need two 
hash tables: one hashed based on the word, and the other based on the pattern 
the word generates. For example:

   SILLY

would be placed in the "normal" dictionary under the key "SILLY" and under 
the "pattern" dictionary as "ABCCD".

If you read your cryptogram before you read the dictionary, you don't have to 
put every word into the pattern dictionary - just those that match 
patterns found in the cryptogram. That is, if the cryptogram were:

  IG EES SWWM

then your dictionary would only need to hold words that match the patterns:

  AB
  AAB
  ABBC

And since cryptogram words always create the same pattern, if you store the 
hash key to the pattern with the cryptogram words, you don't need to 
recalculate it.
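David's two-dictionary scheme is easy to sketch in Python. The helper name and the tiny word list are purely illustrative stand-ins for the real words.txt:

```python
from collections import defaultdict

def pattern_key(word):
    """Canonical letter pattern: first new letter -> 'A', next -> 'B', ...
    e.g. SILLY -> ABCCD."""
    mapping = {}
    out = []
    for ch in word:
        if ch not in mapping:
            mapping[ch] = chr(ord('A') + len(mapping))
        out.append(mapping[ch])
    return "".join(out)

# Read the cryptogram first, then keep only dictionary words whose
# pattern actually occurs in it.
crypto_words = "IG EES SWWM".split()
wanted = {pattern_key(w) for w in crypto_words}      # {'AB', 'AAB', 'ABBC'}

by_pattern = defaultdict(list)
for word in ["SILLY", "SEES", "ALL", "TO", "LOOK"]:  # stand-in word list
    key = pattern_key(word)
    if key in wanted:
        by_pattern[key].append(word)

print(pattern_key("SILLY"))   # ABCCD
print(dict(by_pattern))       # {'AB': ['TO'], 'ABBC': ['LOOK']}
```

SILLY, SEES and ALL never enter the pattern dictionary because nothing in the cryptogram has their shapes, which is exactly the filtering David describes.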

Yeah, all obvious stuff.

-- David Cuny

