1. $100 Contest Question
- Posted by "C. K. Lester" <cklester at yahoo.com> Mar 02, 2002
- 485 views
For the contests, what is "standard input?" When it says, " Write a program in Euphoria that reads from standard input a "cipher" line containing the 26 letters of the alphabet, rearranged somehow, e.g. PQRSTUVZABCDEFGHWXYIJKLMNO" does the "standard input" indicate a file? Or is standard input the keyboard? I know I could probably find this in the docs, but this is much more reliable. :) Also, will the cipher line necessarily be sequential? For instance, might we encounter: ABCDEFGHIJKLMNOPQRSTUVWXYZ PRSTWUVZQMABCDEFGHXYIJKLNO <-- cipher line This is a simple one-to-one replacement, but the cipher line, you'll note, is not sequential.
2. Re: $100 Contest Question
- Posted by Kat <gertie at PELL.NET> Mar 02, 2002
- 461 views
On 3 Mar 2002, at 0:22, C. K. Lester wrote: > > For the contests, what is "standard input?" > > When it says, " Write a program in Euphoria that reads from standard > input a "cipher" line containing the 26 letters of the alphabet, > rearranged somehow, e.g. > PQRSTUVZABCDEFGHWXYIJKLMNO" > > does the "standard input" indicate a file? > > Or is standard input the keyboard? > > I know I could probably find this in the docs, but this is much more > reliable. :) I was wondering this too. File input, dox box, windoze gui, or wxwindows, or what? > Also, will the cipher line necessarily be sequential? For instance, > might we encounter: > > ABCDEFGHIJKLMNOPQRSTUVWXYZ > PRSTWUVZQMABCDEFGHXYIJKLNO <-- cipher line > > This is a simple one-to-one replacement, but the cipher line, you'll > note, is not sequential. It *is* sequential, but it's not alphabetical. My new question: It's taking me 143 seconds to load a dictionary with the following code: dfilename = "D:\\Gertie\\decypher\\dfile.txt" readfile = open(dfilename,"r") readline = get(readfile) readline = readline[2] close(readfile) And since i don't see how i can make that any faster, can we *not* count load times for this contest? My puter is a K6-2-233 running win95, so it's prolly the slowest one here (not counting Igor's 386's), but still, i don't see the load times improving off the hd with faster cpus.. Kat
3. Re: $100 Contest Question
- Posted by euman at bellsouth.net Mar 02, 2002
- 477 views
> And since i don't see how i can make that any faster, can we *not* count > load times for this contest? My puter is a K6-2-233 running win95, so it's > prolly the slowest one here (not counting Igor's 386's), but still, i don't > see > the load times improving off the hd with faster cpus.. > > Kat If anything, you and me are at an advantage because we have slow machines so our algorythms will prolly be faster just cause we have to make'em that way just to see results on them. Euman
4. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 02, 2002
- 488 views
----- Original Message ----- From: "Kat" <gertie at PELL.NET> To: "EUforum" <EUforum at topica.com> Subject: Re: $100 Contest Question > > On 3 Mar 2002, at 0:22, C. K. Lester wrote: > > > > > For the contests, what is "standard input?" > > > > When it says, " Write a program in Euphoria that reads from standard > > input a "cipher" line containing the 26 letters of the alphabet, > > rearranged somehow, e.g. > > PQRSTUVZABCDEFGHWXYIJKLMNO" > > > > does the "standard input" indicate a file? > > > > Or is standard input the keyboard? > > > > I know I could probably find this in the docs, but this is much more > > reliable. :) > > I was wondering this too. File input, dox box, windoze gui, or wxwindows, or > what? > > > Also, will the cipher line necessarily be sequential? For instance, > > might we encounter: > > > > ABCDEFGHIJKLMNOPQRSTUVWXYZ > > PRSTWUVZQMABCDEFGHXYIJKLNO <-- cipher line > > > > This is a simple one-to-one replacement, but the cipher line, you'll > > note, is not sequential. > > It *is* sequential, but it's not alphabetical. > > My new question: It's taking me 143 seconds to load a dictionary with the > following code: > > dfilename = "D:\\Gertie\\decypher\\dfile.txt" > readfile = open(dfilename,"r") > readline = get(readfile) > readline = readline[2] > close(readfile) > > And since i don't see how i can make that any faster, can we *not* count > load times for this contest? My puter is a K6-2-233 running win95, so it's > prolly the slowest one here (not counting Igor's 386's), but still, i don't see > the load times improving off the hd with faster cpus.. > Hi Kat, I would advise people not to use get() unless there was a really good reason to do so. It is a very slow way of reading in text. Even so, 143 seconds is amazingly slow. Here is a test I whipped up and in takes 0.07 seconds to run on my machine (Pentium III,550MHz, 256Meg, Windows Me). ------------- include get.e include file.e atom flen atom s sequence dfilename, readline integer readfile s = time() dfilename = "f:\\euphoria\\words.txt" readfile = open(dfilename,"rb") -- NB, binary for speed flen = seek(readfile, -1) flen = where(readfile) if seek(readfile,0) then end if readline = repeat(0,flen) for i = 1 to flen do readline[i] = getc(readfile) end for close(readfile) printf(1, "%f\n", time()-s) -------------- Derek
5. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 02, 2002
- 482 views
----- Original Message ----- From: "Kat" <gertie at PELL.NET> To: "EUforum" <EUforum at topica.com> Subject: RE: $100 Contest Question > > On 3 Mar 2002, at 2:52, C. K. Lester wrote: > > > > > Kat wrote: > > > > > > My new question: It's taking me 143 seconds to load a > > > dictionary with the following code: > > > > Kat, I'm loading Junko's 50,000-word dictionary in less than a second. > > What dictionary are YOU using?! :D > > One i threw together and premunged just for this. Unfortunately, it's making a > 4.8Meg file out of the dictionaries, even after deleting all the trailing spaces, > CR, LF, and duplicates. Using: > > writefile = open(dfilename,"wb") > print(writefile,dictionary) > close(writefile) > > puts this stuff into the file: > > {{{50},{67},{71},{72},{75},{77},{78},{84},{88},{65},{73},{65},{66},{66},{67} ,{68},{68} > , etc, which is just a listing of the letters of the alphabet, which is what i > wanted at that point. But i didn't need it represented that way, i wanted it > done in a way Eu could reload it instantly,, like,, umm, in > D:\Euphoria\DEMO\MYDATA.EX. As it is saved tho, it is taking up 5x as > much space and load time as needed. Any ideas? That is exactly why print() and get() should rarely be used. They are extremely inefficient. In my version of reformatting Junko's WORDS.TXT, I went from the original 508,190 bytes to 531,828 bytes. In speed differences, it takes my about 6 seconds to use WORDS.TXT to build the internal dictionary, and about 0.5 seconds to build it using the reformatted words.txt. For speed, use binary reads and writes, and just use getc() and puts(). I'm not sure if I'm allowed to give any more details of the algorithms I used etc..., but I did strip out unnecessary delimiters between words. ---------- Derek
6. Re: $100 Contest Question
- Posted by euman at bellsouth.net Mar 02, 2002
- 462 views
I meant to ask this question that I dont think anyones asked. Can I have text length files? what I mean is word files that are seperated by the length of their text? oh yeah, Im not in this for fun anymore. Im in it to win it! Euman euman at bellsouth.net Q: Are we monetarily insane? A: YES ----- Original Message ----- From: <bensler at mail.com> To: "EUforum" <EUforum at topica.com> Sent: Sunday, March 03, 2002 12:17 AM Subject: RE: $100 Contest Question > > Derek, are you using a preformatted version of words.txt, or are you > formatting it within your program? It sounds like the latter, which I > don't think is allowed. > > Chris > > > Derek Parnell wrote: > > ----- Original Message ----- > > From: "Kat" <gertie at PELL.NET> > > To: "EUforum" <EUforum at topica.com> > > Sent: Sunday, March 03, 2002 2:39 PM > > Subject: RE: $100 Contest Question > > > > > > > On 3 Mar 2002, at 2:52, C. K. Lester wrote: > > > > > > > > > > > Kat wrote: > > > > > > > > > > My new question: It's taking me 143 seconds to load a > > > > > dictionary with the following code: > > > > > > > > Kat, I'm loading Junko's 50,000-word dictionary in less than a second. > > > > What dictionary are YOU using?! :D > > > > > > One i threw together and premunged just for this. Unfortunately, it's > > making a > > > 4.8Meg file out of the dictionaries, even after deleting all the > > > trailing > > spaces, > > > CR, LF, and duplicates. Using: > > > > > > writefile = open(dfilename,"wb") > > > print(writefile,dictionary) > > > close(writefile) > > > > > > puts this stuff into the file: > > > > > > > > {{{50},{67},{71},{72},{75},{77},{78},{84},{88},{65},{73},{65},{66},{66},{67} > > > > ,{68},{68} > > > , etc, which is just a listing of the letters of the alphabet, which is > > what i > > > wanted at that point. But i didn't need it represented that way, i > > > wanted > > it > > > done in a way Eu could reload it instantly,, like,, umm, in > > > D:\Euphoria\DEMO\MYDATA.EX. As it is saved tho, it is taking up 5x as > > > much space and load time as needed. Any ideas? > > > > That is exactly why print() and get() should rarely be used. They are > > extremely inefficient. In my version of reformatting Junko's WORDS.TXT, > > I > > went from the original 508,190 bytes to 531,828 bytes. In speed > > differences, > > it takes my about 6 seconds to use WORDS.TXT to build the internal > > dictionary, and about 0.5 seconds to build it using the reformatted > > words.txt. > > > > For speed, use binary reads and writes, and just use getc() and puts(). > > > > I'm not sure if I'm allowed to give any more details of the algorithms I > > used etc..., but I did strip out unnecessary delimiters between words. > > > > ---------- > > Derek > > > > > > >
7. Re: $100 Contest Question
- Posted by Kat <gertie at PELL.NET> Mar 02, 2002
- 469 views
On 3 Mar 2002, at 14:37, Derek Parnell wrote: > > ----- Original Message ----- > From: "Kat" <gertie at PELL.NET> > To: "EUforum" <EUforum at topica.com> > Sent: Sunday, March 03, 2002 1:15 PM > Subject: Re: $100 Contest Question > > > > On 3 Mar 2002, at 0:22, C. K. Lester wrote: > > > > > > > > For the contests, what is "standard input?" > > > > > > When it says, " Write a program in Euphoria that reads from standard > > > input a "cipher" line containing the 26 letters of the alphabet, > > > rearranged somehow, e.g. > > > PQRSTUVZABCDEFGHWXYIJKLMNO" > > > > > > does the "standard input" indicate a file? > > > > > > Or is standard input the keyboard? > > > > > > I know I could probably find this in the docs, but this is much more > > > reliable. :) > > > > I was wondering this too. File input, dox box, windoze gui, or wxwindows, > or > > what? > > > > > Also, will the cipher line necessarily be sequential? For instance, > > > might we encounter: > > > > > > ABCDEFGHIJKLMNOPQRSTUVWXYZ > > > PRSTWUVZQMABCDEFGHXYIJKLNO <-- cipher line > > > > > > This is a simple one-to-one replacement, but the cipher line, you'll > > > note, is not sequential. > > > > It *is* sequential, but it's not alphabetical. > > > > My new question: It's taking me 143 seconds to load a dictionary with the > > following code: > > > > dfilename = "D:\\Gertie\\decypher\\dfile.txt" > > readfile = open(dfilename,"r") > > readline = get(readfile) > > readline = readline[2] > > close(readfile) > > > > And since i don't see how i can make that any faster, can we *not* count > > load times for this contest? My puter is a K6-2-233 running win95, so it's > > prolly the slowest one here (not counting Igor's 386's), but still, i > don't see > > the load times improving off the hd with faster cpus.. > > > Hi Kat, > I would advise people not to use get() unless there was a really good reason > to > do so. It is a very slow way of reading in text. Even so, 143 seconds is > amazingly slow. But it reads the whole file at one time,, no optimisation behind the scenes? What about saving the file? Looks to me like saving it with the prints(file,HeavilyNestedSequence) by using 5x as much space as the original is overkill. > Here is a test I whipped up and in takes 0.07 seconds to run on my machine > (Pentium III,550MHz, 256Meg, Windows Me). I ran it, my best run was .67sec,, they generally ran 1.4sec. So i need to figure my runs of 50minutes will meet the 5minute runtime limits of the contest? <blink><blink> > ------------- > include get.e > include file.e > atom flen > atom s > sequence dfilename, readline > integer readfile > s = time() > dfilename = "f:\\euphoria\\words.txt" > readfile = open(dfilename,"rb") -- NB, binary for speed > flen = seek(readfile, -1) > flen = where(readfile) > if seek(readfile,0) then end if > readline = repeat(0,flen) > for i = 1 to flen do > readline[i] = getc(readfile) This doesn't allow for any premunging of the dictionary, or entry of it into nested sequences. Does it? > end for > close(readfile) > printf(1, "%f\n", time()-s) As Euman pointed out, i haveto squeeze as much out of the 5 minute contest timeframe as i can, so i premunged the dictionary. I have no concrete factor to figure the difference in exec times, but loading the premunged, nested seq dictionary is way faster than munging it, loading 2 dictionaries and munging, then writing out, took 19 minutes. Kat
8. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 02, 2002
- 471 views
----- Original Message ----- From: <bensler at mail.com> To: "EUforum" <EUforum at topica.com> Subject: RE: $100 Contest Question > > Derek, are you using a preformatted version of words.txt, or are you > formatting it within your program? It sounds like the latter, which I > don't think is allowed. > > Chris > > The short answer is both. For comp#2, the algorithm I used when the matching routine is called was: If the internal dictionary is not set up, look for a file called 'dict.dat'. If that is present, use it's contents to set up the internal dictionary. otherwise look for 'words.txt'. If that is present, use it's contents to set up the internal dictionary, then write out the internal dictionary to 'dict.dat' using a special format. In either case, there is a small delay the first time the routine is called while it initialises the internal dictionary. Only with the dict.dat file, this delay is a lot smaller than with words.txt. Once the dictionary is set up, find the matching words is lightening fast. When I submitted my program to Robert yesterday, the wording of the competition did not say "You must use words.txt contained in Junko's spell checker in the Archive." So I guess the rules have changed after my submission! Oh well. Of course, in one sense. I did use Junko's file - to create a reformatted one - and I can use Junko's file if the dict.dat file is not present. I wrote the program as if it was to be used in the real world, not just some artificial competetion environment. Thus the routine that uses words.txt is not hyper-optimised as I was only going to use it once to create the dict.dat file. That file is the optimised one. If Robert rules against this concept, I guess I can submit another version of the program. ----- Derek.
9. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 02, 2002
- 486 views
----- Original Message ----- From: <bensler at mail.com> To: "EUforum" <EUforum at topica.com> Subject: RE: $100 Contest Question > about hyphens, and apostrophes? > I documented my assumptions in the source code. For this one, it was than any character in the pattern text, whose ASCII value is higher than SPACE, is meant to be an exact-match character. Meaning that the integers 0 to 32 are reserved for inexact matches. On a similar point I made the assumption that a pattern of {4,6,9} is equivalent to {1,2,3}. In other words, the actual value of the pattern characters is not important, only that they represent a unique character in the target word(s). -------- Derek.
10. Re: $100 Contest Question
- Posted by Robert Craig <rds at RapidEuphoria.com> Mar 03, 2002
- 463 views
Chris Bensler writes: > What platform will be used to test with? To be fair, > it would have to be tested on all 3. I'll use DOS. It's a bit inconvenient to boot into Linux. > What if one entry only works on a specific platform, > but is faster than all others for that platform? It should be easy to make your program work on all platforms, but if a program fails on DOS, but works on Linux or Windows, I'll measure its time on a system it works on (on the same machine) and decide if there is any unfairness. A program like that can still win, but won't be eligible for the $5 bonus. > What are the valid match characters for Contest#2? A-Z and a-z? > What about hyphens, and apostrophes? As Derek suggested, if a '-' or '\'' is supplied (or some character greater than ASCII 32), it should be treated as a literal character to be matched. Values from 0 to 32 represent "meta" characters, or placeholders for unspecified characters in the pattern. I'll only give you upper case literal characters, A, B, C, ... Euman writes: > Can I have text length files? what I mean is word files that are > seperated by the length of their text? I'm not sure what you mean. On problem #2 you must use Junko's dictionary (word list), because I will be determining correctness based on the words in that particular dictionary. If you want to reformat her file into a different file, that's fine, but the time that it takes to do that will be included in your total time. You can assume that there is enough memory to store the 51802 words in memory (unless you store them in a bizarrely inefficient way). I'll be using a machine with 64 Mb of RAM. On problem #3 you can use any dictionary, formatted any way you like, but I think Junko's is quite reasonable. Martin Stachon writes (privately): > After the load of the wordlist, how many times you will > call the function? In problem #2, assume that I will make 1000 calls to your function. Derek Parnell writes: > On a similar point I made the assumption that a pattern of {4,6,9} is > equivalent to {1,2,3}. In other words, the actual value of the pattern > characters is not important, only that they represent a unique character in > the target word(s). Yes, that's correct. Aku writes: > (Problem #1) How is the time calculated? > How many iteration (loops) will it be tested? I'm planning to run each program once, with a few megabytes of input text. I'll actually do it a few times each, and ignore the first run, since the first time, the data won't be in the operating system's memory cache. Aku writes: > (Problem #1) Will the input contain byte 0 ? No. It will consist of the cipher line plus many lines of English text - no weird characters, no huge (over 1000 characters) lines. I'll add these points to the Web page later today. Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
11. Re: $100 Contest Question
- Posted by Kat <gertie at PELL.NET> Mar 03, 2002
- 473 views
On 3 Mar 2002, at 13:44, Robert Craig wrote: <snippage happened> > > What are the valid match characters for Contest#2? A-Z and a-z? > > What about hyphens, and apostrophes? > > As Derek suggested, > if a '-' or '\'' is supplied (or some character greater than ASCII 32), > it should be treated as a literal character to be matched. Values > from 0 to 32 represent "meta" characters, or placeholders for > unspecified characters in the pattern. I'll only give you upper case > literal characters, A, B, C, ... So the input file will be all upper()'d already? > On problem #3 you can use any dictionary, formatted > any way you like, but I think Junko's is quite reasonable. > > Martin Stachon writes (privately): > > After the load of the wordlist, how many times you will > > call the function? > > In problem #2, assume that I will make 1000 > calls to your function. You realise, at 5 minutes runtime per iteration, that's 83 hours? If you get 100 such entries, that's 345 DAYS of runtime for testing problem #2 programs. > Derek Parnell writes: > > On a similar point I made the assumption that a pattern of {4,6,9} is > > equivalent to {1,2,3}. In other words, the actual value of the pattern > > characters is not important, only that they represent a unique character in > > the target word(s). > > Yes, that's correct. > > Aku writes: > > (Problem #1) How is the time calculated? > > How many iteration (loops) will it be tested? > > I'm planning to run each program once, > with a few megabytes of input text. *megabytes*? of the same few sentences repeated ad naseum, with different keys? Or will you be feeding it a online book text with unique sentences and the same key for all sentences? Kat
12. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 03, 2002
- 461 views
----- Original Message ----- From: "Kat" <gertie at PELL.NET> To: "EUforum" <EUforum at topica.com> Subject: Re: $100 Contest Question > > if a '-' or '\'' is supplied (or some character greater than ASCII 32), > > it should be treated as a literal character to be matched. Values > > from 0 to 32 represent "meta" characters, or placeholders for > > unspecified characters in the pattern. I'll only give you upper case > > literal characters, A, B, C, ... > > So the input file will be all upper()'d already? Here, Robert is only talking about the characters in the pattern text used to test comp#2. Not the word list or any file used in any competition. If I was you, I'd expect mixed case text for the input files in comp#1 and comp#3. > Robert Craig wrote: > > In problem #2, assume that I will make 1000 > > calls to your function. > > You realise, at 5 minutes runtime per iteration, that's 83 hours? If you get > 100 such entries, that's 345 DAYS of runtime for testing problem #2 > programs. There is something really wrong if a single iteration takes that long. I'm sure Robert will pull the plug on any program runs longer than 5 minutes. Currently I'm doing 1280 iterations in 1.48 seconds, and I haven't finished optimising it yet. > Robert Craig wrote: > > I'm planning to run each program once, > > with a few megabytes of input text. > > *megabytes*? of the same few sentences repeated ad naseum, with different > keys? Or will you be feeding it a online book text with unique sentences and > the same key for all sentences? In comp#1, we are required to *encipher* some English text. As a trial, I combined all the Euphoria\DOC files in to a single file then copied that file to itself a few times until I had a file that was 4,267,462 bytes long. My program takes about 4 seconds to encipher this file. -------- Derek
13. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 03, 2002
- 484 views
Hi Ray, ----- Original Message ----- From: "Ray Smith" <smithr at ix.net.au> To: "EUforum" <EUforum at topica.com> Subject: RE: $100 Contest Question > > > Robert Craig wrote: > > > On problem #2 you must use Junko's dictionary (word list), > > because I will be determining correctness based on > > the words in that particular dictionary. If you want to reformat > > her file into a different file, that's fine, but the time that it > > takes to do that will be included in your total time. > > I've just wasted a weekend of programming!!!! > All Robert is saying here, is that at some point in the execution of the comp#2 program, you must use Junko's file 'as is' to get the list of words that will be used to test the program against. You are perfectly free to reformat that DATA in your program to help speed things up. I think what Robert is getting at is that for comp#2, you cannot use a different file as the SOURCE of the word list. Junko's file is the only SOURCE of words permitted. If you want to create a new file based on her's, you can. The time to do that will be included in the test time though. > Robert Craig wrote in message: > http://www.topica.com/lists/EUforum/read/message.html?mid=1709760866&sort=d& start=10864 > > > For problem #3, you can change the input word list if you want. > For #2 you must use Junko's list as is. > > NOTE THE "as is" !!!! > > after I specifically asked the question: > >Is the time to load the database part of the calculated time to > >decipher the sentence? If it is can we re-organise the > >database in a particular order to help the load time. > >ie. if we create some new hash scheme can we re-order the table to > >make the load time faster and therefore reduce total decipher time. > > I don't mind the rules being manipulated for cases that haven't > already been discussed but it's frustrating when the rules change > from something that has already been said! > Comp #2 is NOT a deciphering exercise. That is for comp#3. In comp#2 it is a pattern matching exercise and that must use Junko's file. For comp#3 you do not have to use Junko's file in any way, shape or form. I can understand your frustration though, because my first attempt at comp#2 also used a reformatted version of Junko's file. I did that because in the original competition rules, it didn't prohibit doing that. Only after I submitted it to Robert did the rules change. But I figure that's pretty real-world behaviour - we have customers changing the system specifications constantly while we are developing the system. ----- Derek.
14. Re: $100 Contest Question
- Posted by euman at bellsouth.net Mar 03, 2002
- 481 views
----- Original Message ----- From: "Derek Parnell" <ddparnell at bigpond.com> To: "EUforum" <EUforum at topica.com> > You are perfectly free to > reformat that DATA in your program to help speed things up. I think what > Robert is getting at is that for comp#2, you cannot use a different file as > the SOURCE of the word list. Junko's file is the only SOURCE of words > permitted. If you want to create a new file based on her's, you can. The > time to do that will be included in the test time though. Derek this is how I read this: >From Robert "If you want to reformat her file into a different file, that's fine, but the time that it takes to do that will be included in your total time." This means to me that you must actually read her file and reformat it if you wish (making another file) but this must be part of your program. Which to me would be kicking a dead horse (so to speak)! Read her file, format the text, write it out, only to be read by your program again...hmmm hell of alot of work it seem like to me. Euman
15. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 03, 2002
- 462 views
----- Original Message ----- From: <euman at bellsouth.net> To: "EUforum" <EUforum at topica.com> Subject: Re: $100 Contest Question > > ----- Original Message ----- > From: "Derek Parnell" <ddparnell at bigpond.com> > To: "EUforum" <EUforum at topica.com> > > > You are perfectly free to > > reformat that DATA in your program to help speed things up. I think what > > Robert is getting at is that for comp#2, you cannot use a different file as > > the SOURCE of the word list. Junko's file is the only SOURCE of words > > permitted. If you want to create a new file based on her's, you can. The > > time to do that will be included in the test time though. > > Derek this is how I read this: > > >From Robert > "If you want to reformat > her file into a different file, that's fine, but the time that it > takes to do that will be included in your total time." > > This means to me that you must actually read her file and > reformat it if you wish (making another file) but this must be > part of your program. > > Which to me would be kicking a dead horse (so to speak)! > > Read her file, format the text, write it out, only to be read by your > program again...hmmm hell of alot of work it seem like to me. > Oh I agree with you. I would be a silly thing to do. But you still can if you want to though. In my first attempt I had a variation on this. For the first ever run of the program I used words.txt file, reformatted it and wrote out another file. In second and subsequent uses of the program, I used my reformatted file. But apparently this is no longer allowed. I still haven't had an official ruling on this case, but I'm going to assume that I have to change the way I did the program. -------- Derek.
16. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 03, 2002
- 473 views
----- Original Message ----- From: "Ray Smith" <smithr at ix.net.au> To: "EUforum" <EUforum at topica.com> Subject: RE: $100 Contest Question > > > Derek Parnell wrote: > > > In comp#1, we are required to *encipher* some English text. As a trial, > > I > > combined all the Euphoria\DOC files in to a single file then copied that > > file to itself a few times until I had a file that was 4,267,462 bytes > > long. > > My program takes about 4 seconds to encipher this file. > > What type of PC did you run this on Derek? > My PC at work (PIII 650 Windows 2000 Professional) does this > in 2.36 seconds (for a 4,121,548 bytes sized file) > > I haven't tried to optimise it yet (not sure what I could do really) > > Note: I won't be submitting my entry to contest 1 (I assume Derek > won't either) > A PIII 550 Windows Me, 256Meg RAM, 13GB HD. I only guessed at the time as I didn't actually put any timing coding in it. Also, I have tried to optimise it either. Once I found out it ran so fast, I couldn't see the point in doing any more tweaking. And no, I won't be submitting any entries into any of the three competitions. I will be submitting programs to Robert just for fun though. More of a personal challenge than anything else. ---- Derek.
17. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 03, 2002
- 479 views
----- Original Message ----- From: "Ray Smith" <smithr at ix.net.au> To: "EUforum" <EUforum at topica.com> Subject: RE: $100 Contest Question > My PC at work (PIII 650 Windows 2000 Professional) does this > in 2.36 seconds (for a 4,121,548 bytes sized file) > Okay, so I took the bait. I've now got it down from 4.33 seconds to 1.7 seconds for that 4,267,462 byte file. --------- Derek.
18. Re: $100 Contest Question
- Posted by euman at bellsouth.net Mar 03, 2002
- 480 views
Hello all, Talking about speed! I found a much better way to implement my Hash table (below) for the "words.txt" file so I thought Id post this old piece. -- hashing the dictionary takes @3.5 sec on my 233mhz laptop -- 26 x 26 In alphabetical and length of text order. The routine here is pretty fast, believe it or not I figured out a way to speed this up 100 fold and still have the same accuracy and distribution. I'll share that one in a month or so... Maybe I could get some people to tell me how fast a PIII 550 or higher might run this piece. <snip.....> without type_check without trace include get.e include file.e include print.e atom t, junk t = time() integer fnA, len fnA = open("words.txt","r") integer len_text function EumsHash( sequence text) -- a more unique hash atom h, g integer short h = 0 len = length(text) for i = 1 to len do short = text[i] if short = 32 or short = 10 then len_text = i - 1 exit end if if h > #0FFFFFFF or h < 0 then -- (overflow) h = and_bits(h, #0FFFFFFF) end if h *= 16 h += short g = and_bits(floor(h / #1000000), #F0) if g != 0 then h = xor_bits(h, g) end if h = and_bits(h, not_bits(g)) end for return h end function sequence hash_table hash_table = repeat(repeat({}, 26),26) -- 26 -> 676 buckets object line integer letter atom hashed while 1 do -- hashing the dictionary takes @3.5 sec on my 233mhz laptop line = gets(fnA) if atom(line) then exit end if hashed = EumsHash(line) hash_table[line[1] - 64][len_text] &= hashed end while close(fnA) ? time() - t if getc(0) then end if <snip> Later yall. Euman euman at bellsouth.net ----- Original Message ----- From: "Ray Smith" <smithr at ix.net.au> To: "EUforum" <EUforum at topica.com> > Derek Parnell wrote: > > > Okay, so I took the bait. I've now got it down from 4.33 seconds to 1.7 > > seconds for that 4,267,462 byte file. > > DAMN! > > I can only get 2.8 seconds. > I'll have to wait a whole month to see how u did it! > > Ray Smith > http://rays-web.com > > P.S. Is it possible your PIII 550 is faster than my PIII 650??? > P.S.S I can't hope can't I!
19. Re: $100 Contest Question
- Posted by Kat <gertie at PELL.NET> Mar 03, 2002
- 470 views
On 4 Mar 2002, at 10:20, Derek Parnell wrote: > > ----- Original Message ----- > From: "Ray Smith" <smithr at ix.net.au> > To: "EUforum" <EUforum at topica.com> > Sent: Monday, March 04, 2002 9:47 AM > Subject: RE: $100 Contest Question > > > > My PC at work (PIII 650 Windows 2000 Professional) does this > > in 2.36 seconds (for a 4,121,548 bytes sized file) > > > > Okay, so I took the bait. I've now got it down from 4.33 seconds to 1.7 > seconds for that 4,267,462 byte file. Ok, i bow out of the contest. It takes me 434 seconds (over 7 minutes) just to load a dictionary into memory. Building the dictionary from 3 files takes 72 minutes. I took the premunge time to sort words by length, and i load only words corresponding in length to words present in the cyphered line. You are getting times 10x to 140x faster than i am. I concede. Kat
20. Re: $100 Contest Question
- Posted by euman at bellsouth.net Mar 03, 2002
- 497 views
Kat did you try my posted routine yet? This builds a alphabetical 'A' to 'Z' sequence that allows for text length upto 26 letters example "A," -- length (1) "AD,AH,AL,AM,AN,AS,AT,AU,AW,AX,AY," -- length (2) "ABC,ABE,ABO,ACE,ACT,ADA,ADD,ADO,ADS,ADZ,AFT,AGE, etc..etc" -- length (3) "ABBA,ABBE,ABBY,ABED,ABEL,ABET,ABLE,ABLY,ABUT,ACES,ACHE,ACID, etc...etc" -- length (4) down to length (26) then starts over with the next beginning letter (B,C,D out to Z) 3.5 sec on my 233mhz but instead of the text I presented it will be UNIQUE numerical values that represent the text. Later, Euman euman at bellsouth.net Q: Are we monetarily insane? A: YES ----- Original Message ----- From: "Kat" <gertie at PELL.NET> To: "EUforum" <EUforum at topica.com> Sent: Sunday, March 03, 2002 8:02 PM Subject: Re: $100 Contest Question > > On 4 Mar 2002, at 10:20, Derek Parnell wrote: > > > > > ----- Original Message ----- > > From: "Ray Smith" <smithr at ix.net.au> > > To: "EUforum" <EUforum at topica.com> > > Sent: Monday, March 04, 2002 9:47 AM > > Subject: RE: $100 Contest Question > > > > > > > My PC at work (PIII 650 Windows 2000 Professional) does this > > > in 2.36 seconds (for a 4,121,548 bytes sized file) > > > > > > > Okay, so I took the bait. I've now got it down from 4.33 seconds to 1.7 > > seconds for that 4,267,462 byte file. > > Ok, i bow out of the contest. It takes me 434 seconds (over 7 minutes) just > to load a dictionary into memory. Building the dictionary from 3 files takes > 72 > minutes. I took the premunge time to sort words by length, and i load only > words corresponding in length to words present in the cyphered line. You are > getting times 10x to 140x faster than i am. I concede. > > Kat > > > >
21. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 03, 2002
- 462 views
I'm at my office now, so I'll try your code tomight when I get home. On my office machine (P6 800Mhz, Windows 2000) this runs in 1.16 seconds. 4/03/2002 11:59:20 AM, euman at bellsouth.net wrote: > >Hello all, > >Talking about speed! > >I found a much better way to implement my Hash table (below) for the >"words.txt" file so I thought Id post this old piece. > >-- hashing the dictionary takes @3.5 sec on my 233mhz laptop >-- 26 x 26 In alphabetical and length of text order. > >The routine here is pretty fast, believe it or not I figured out a way to speed >this up 100 fold and still have the same accuracy and distribution. >I'll share that one in a month or so... > >Maybe I could get some people to tell me how fast a PIII 550 or higher >might run this piece. > ><snip.....> > >without type_check >without trace > >include get.e >include file.e >include print.e > >atom t, junk > t = time() > >integer fnA, len >fnA = open("words.txt","r") > >integer len_text >function EumsHash( sequence text) -- a more unique hash > atom h, g > integer short > h = 0 > len = length(text) > for i = 1 to len do > short = text[i] > if short = 32 or short = 10 then > len_text = i - 1 > exit > end if > if h > #0FFFFFFF or h < 0 then -- (overflow) > h = and_bits(h, #0FFFFFFF) > end if > h *= 16 > h += short > g = and_bits(floor(h / #1000000), #F0) > if g != 0 then > h = xor_bits(h, g) > end if > h = and_bits(h, not_bits(g)) > end for > return h >end function > >sequence hash_table >hash_table = repeat(repeat({}, 26),26) -- 26 -> 676 buckets > >object line >integer letter >atom hashed > >while 1 do -- hashing the dictionary takes @3.5 sec on my 233mhz laptop > line = gets(fnA) > if atom(line) then > exit > end if > hashed = EumsHash(line) > hash_table[line[1] - 64][len_text] &= hashed >end while > >close(fnA) > >? time() - t > >if getc(0) then >end if --------- Cheers, Derek Parnell
22. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 03, 2002
- 445 views
Hey Kat, who says that any of my code would be faster than yours when run on your machine? For that matter, if your code was run on my machine, who says that it would be slower than my code? Don't concede on the basis of your own hardware platform. The contest will be run on Robert's platform anyhow! The idea is to improve our Euphoria skills, not to see who has the beefiest machine. 4/03/2002 12:02:36 PM, Kat <gertie at PELL.NET> wrote: > >On 4 Mar 2002, at 10:20, Derek Parnell wrote: > >> >> ----- Original Message ----- >> From: "Ray Smith" <smithr at ix.net.au> >> To: "EUforum" <EUforum at topica.com> >> Sent: Monday, March 04, 2002 9:47 AM >> Subject: RE: $100 Contest Question >> >> >> > My PC at work (PIII 650 Windows 2000 Professional) does this >> > in 2.36 seconds (for a 4,121,548 bytes sized file) >> > >> >> Okay, so I took the bait. I've now got it down from 4.33 seconds to 1.7 >> seconds for that 4,267,462 byte file. > >Ok, i bow out of the contest. It takes me 434 seconds (over 7 minutes) just >to load a dictionary into memory. Building the dictionary from 3 files takes 72 > >minutes. I took the premunge time to sort words by length, and i load only >words corresponding in length to words present in the cyphered line. You are >getting times 10x to 140x faster than i am. I concede. > >Kat > > > > --------- Cheers, Derek Parnell
23. Re: $100 Contest Question
- Posted by Kat <gertie at PELL.NET> Mar 03, 2002
- 467 views
On 3 Mar 2002, at 20:55, euman at bellsouth.net wrote: > > Kat did you try my posted routine yet? No, was i supposed to? I did try code Derek posted 3days ago, it ran 10x to 20x slower on my computer than on his. Kat
24. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 03, 2002
- 476 views
4/03/2002 1:09:21 PM, Kat <gertie at PELL.NET> wrote: > >On 3 Mar 2002, at 20:55, euman at bellsouth.net wrote: > >> >> Kat did you try my posted routine yet? > >No, was i supposed to? I did try code Derek posted 3days ago, it ran 10x to >20x slower on my computer than on his. > Kat, the purpose of that post was to see if it would run faster on your machine than the code you were currently using, not to see if your machine was slower than mine or not. By the sounds of it, the code I posted (using getc() rather than get()) is heaps faster for you. -------- Derek.
25. Re: $100 Contest Question
- Posted by Robert Craig <rds at RapidEuphoria.com> Mar 03, 2002
- 471 views
Ray Smith writes: > OK, I see your point. (my weekend wasn't wasted .. phew!) > > Can this be clarified by Rob as it wasn't to clear me (and probably > others) when I read the previous thread. I think Derek summed it up correctly already, but I'll try again. In problem #2, I want everyone to start by reading the same set of words (Junko's), formatted the same way on disk. Certainly, if everyone used a different set of words I'd have a hard time verifying correctness, and a hard time comparing performance. In problem #3 it doesn't matter. You can use whatever dictionary file you like, formatted however you like. If you don't use Junko's you'll have to send me your dictionary file. In both #2 and #3 I'm imagining that people will load the words into memory (i.e. a sequence of some kind, although poking the data into allocated memory would be legal), so they can make many dictionary queries at high speed. I have a 64 Mb RAM machine (as I've noted now on the Web page), so there shouldn't be any problem keeping the whole 51802-word dictionary in memory. Writing it out to disk in some optimized format is legal, but unlikely to be as fast as accessing the words in memory. In #2 you will first load the words into memory, maybe with top-level code in your include file, or with an init() routine that you call, or maybe your pattern matching routine will do it on the first call (only) to that routine. Derek Parnell writes: > For the first ever run of the program I used words.txt file, > reformatted it and wrote out another file. In second and > subsequent uses of the program, I used my reformatted file. > But apparently this is no longer allowed. I still haven't had an > official ruling on this case, but I'm going to assume that > I have to change the way I did the program. This is interesting, but for this competition it won't be useful because I want to time one run of each program, starting from scratch with Junko's file. The time will include the word loading time plus the time to perform 1000 pattern matching calls. (In reality, I'll probably run programs more than once to get a consistent time based on everything being in the operating system cache, but I don't want programs to take advantage of anything they did in a previous run.) Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
26. Re: $100 Contest Question
- Posted by Kat <gertie at PELL.NET> Mar 03, 2002
- 463 views
On 4 Mar 2002, at 13:51, Derek Parnell wrote: > > 4/03/2002 1:09:21 PM, Kat <gertie at PELL.NET> wrote: > > > > >On 3 Mar 2002, at 20:55, euman at bellsouth.net wrote: > > > >> > >> Kat did you try my posted routine yet? > > > >No, was i supposed to? I did try code Derek posted 3days ago, it ran 10x to > >20x > >slower on my computer than on his. > > > > Kat, > the purpose of that post was to see if it would run faster on your machine > than > the code you were currently using, not to see if your machine was slower than > mine or not. By the sounds of it, the code I posted (using getc() rather than > get()) is heaps faster for you. You'd think so, but code Jiri posted last month ran *much slower* using getc() than gets() or get(). Kat
27. Re: $100 Contest Question
- Posted by Kat <gertie at PELL.NET> Mar 03, 2002
- 472 views
On 3 Mar 2002, at 19:59, euman at bellsouth.net wrote: > > Hello all, > > Talking about speed! > > I found a much better way to implement my Hash table (below) for the > "words.txt" file so I thought Id post this old piece. > > -- hashing the dictionary takes @3.5 sec on my 233mhz laptop > -- 26 x 26 In alphabetical and length of text order. It reported 6.5sec. I have not had a positive virus scan in 3 years. Taskinfo2000 consistantly reports Eu gets 90% or better cpu time to itself, typically 97%. Kat
28. Re: $100 Contest Question
- Posted by Kat <gertie at PELL.NET> Mar 03, 2002
- 462 views
On 4 Mar 2002, at 13:39, Derek Parnell wrote: > > Hey Kat, > who says that any of my code would be faster than yours when run on your > machine? For that matter, if your code was run on my machine, who says that it > would be slower than my code? > > Don't concede on the basis of your own hardware platform. The contest will be > run on Robert's platform anyhow! > > The idea is to improve our Euphoria skills, not to see who has the beefiest > machine. I know, but i can't account for some speed differences of 140x between us. Or why me loading a 1meg file uses 5megs of memory, or when i save it, it uses 6megs. Granted, you have better hardware, and even Eumans's laptop is probably newer, but even considering you two running pio mode 4+ and/or wide scuzzy harddrives, all i can guess is you are better programmers. My cap's off to you. Kat > 4/03/2002 12:02:36 PM, Kat <gertie at PELL.NET> wrote: > > > > >On 4 Mar 2002, at 10:20, Derek Parnell wrote: > > > >> > >> ----- Original Message ----- > >> From: "Ray Smith" <smithr at ix.net.au> > >> To: "EUforum" <EUforum at topica.com> > >> Sent: Monday, March 04, 2002 9:47 AM > >> Subject: RE: $100 Contest Question > >> > >> > >> > My PC at work (PIII 650 Windows 2000 Professional) does this > >> > in 2.36 seconds (for a 4,121,548 bytes sized file) > >> > > >> > >> Okay, so I took the bait. I've now got it down from 4.33 seconds to 1.7 > >> seconds for that 4,267,462 byte file. > > > >Ok, i bow out of the contest. It takes me 434 seconds (over 7 minutes) just > >to > >load a dictionary into memory. Building the dictionary from 3 files takes 72 > >minutes. I took the premunge time to sort words by length, and i load only > >words corresponding in length to words present in the cyphered line. You are > >getting times 10x to 140x faster than i am. I concede. > > > >Kat > > > > > --------- > Cheers, > Derek Parnell > > > >
29. Re: $100 Contest Question
- Posted by euman at bellsouth.net Mar 03, 2002
- 498 views
----- Original Message ----- From: "Kat" <gertie at PELL.NET> To: "EUforum" <EUforum at topica.com> Subject: Re: $100 Contest Question > all i can guess is you are better programmers. My > cap's off to you. Kat are you having a bad night? Your far better than I am, Im sure of it. Maybe you need an aspirin to come back to reality... Just kiddin' Euman
30. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 04, 2002
- 485 views
Hi Chris, I'm now so here is the details I promised. > > What kind of bench test are you using Derek? I get nowhere near those > kinds of results. This is the group of 16 tests I repeat 80 times. Test({1,1,2,3,4}) Test({'E', 1, 2, 3, 1, 3}) Test({}) Test({1,1}) Test({1}) Test({1,2,3}) Test({4,'i','e'}) Test({1,'E',2, 'E', 3, 'e'}) Test({"ABC"&'\'','s'}) Test("rabbit") Test("r" & {1,1,2}) Test({'t',1,2}) Test({1,2,3,'s'}) Test({1,1,2,2}) Test({1,2,2,1}) Test({1,2,3,4,'e','d'}) > I use.. > match_pattern({1,2,3,1,4,3,3,2,5,3}) -- ZIGZAGGING > ..in an iterated loop > > I don't think I can even iterate 50x in 1.48 seconds > And that pattern is on the faster side of the avergage. > > I don't understand how you could possibly get it that fast, unless your > pattern is {1}, only then is my program comparable to your results. > > I've still got some tweaking to do, but I don't see it getting much > faster. > > Try... > -- find all 7 letter words with unique letters > match_pattern({1,2,3,4,5,6,7}) > > I'd like to know what your iteration time is for that one. This takes between 0.99 and 1.04 seconds to do 10,000 iterations. sequence VOID atom ts ts=time() for i = 1 to 10000 do Test({1,2,3,4,5,6,7}) end for printf(1, "%f\n", time()-ts) For the ZIGZAGGING test, it takes between 2.97 and 3.04 seconds to do 10,000 iterations. Thanks for getting me to do this test 'cos I can see a further optimisation now ---------- cheers, Derek
31. Re: $100 Contest Question
- Posted by Martin Stachon <martin.stachon at worldonline.cz> Mar 04, 2002
- 478 views
Kat wrote: > My new question: It's taking me 143 seconds to load a dictionary with the > following code: > > dfilename = "D:\\Gertie\\decypher\\dfile.txt" > readfile = open(dfilename,"r") > readline = get(readfile) > readline = readline[2] > close(readfile) > > And since i don't see how i can make that any faster, can we *not* count > load times for this contest? My puter is a K6-2-233 running win95, so it's > prolly the slowest one here (not counting Igor's 386's), but still, i don't > see > the load times improving off the hd with faster cpus.. I have a Winchip C6 200Mhz machine and I load 51,802 wordlist in 5 secs including building several special tables and calculations. I have DMA33 (can do UDMA66), 5200 spin/s, 2MB cache harddisk but this is mainly the matter of the algorithm. Before tweaking, the time was about 50 secs... P.S. I think that either David Cuny or Jiri Babor will win #3. Your bets? P.S.2 Now I know what time_profiling is good for ! I wish I was registered ! Martin
32. Re: $100 Contest Question
- Posted by Martin Stachon <martin.stachon at worldonline.cz> Mar 04, 2002
- 462 views
From: <bensler at mail.com> > Can someone give me a benchmark for problme#2? > > Total run time, number of iterations, and the filter/s used > I need to know if I'm in the ball park, or if I need to reconsider my > implementation. Using this benchmark : ----------------------------------------------- sequence words load_words() ? time() - t constant pats = { {1,2,3,4,5,4,3,2,1}, {1,2,'X'}, {'M',1,2,3,4,5}, {1,2,3,4,5,6,7,8,9,10}, {'E',1,1,2}, {1,2,2,1,3}, {1,1,2}, {1,2,1}, {'M',1,2,1,'M'}, "MARTIN", {1,2,'X',2,1}, {1,2,3,'B',3,4}, {1,2,'M',2,1}, {'E',1,2,3,1,3} } for p=1 to length(pats) do words = get_words_by_pattern(pats[p]) end for ? time()-t ----------------------------------------------- I get this output: 5.06 5.5 On my Winchip @200Mhz, Win98. And you? Martin
33. Re: $100 Contest Question
- Posted by rforno at tutopia.com Mar 04, 2002
- 593 views
Maybe someone is interested in that I casually found the word BBUFFALOES instead of BUFFALOES in Junko´s list. I don't know if there are other errors. ----- Original Message ----- From: "Robert Craig" <rds at RapidEuphoria.com> To: "EUforum" <EUforum at topica.com> Subject: Re: $100 Contest Question > > Chris Bensler writes: > > What platform will be used to test with? To be fair, > > it would have to be tested on all 3. > > I'll use DOS. It's a bit inconvenient to boot into Linux. > > > What if one entry only works on a specific platform, > > but is faster than all others for that platform? > > It should be easy to make your program work on all platforms, > but if a program fails on DOS, but works on Linux or Windows, > I'll measure its time on a system it works on (on the same machine) > and decide if there is any unfairness. A program like that > can still win, but won't be eligible for the $5 bonus. > > > What are the valid match characters for Contest#2? A-Z and a-z? > > What about hyphens, and apostrophes? > > As Derek suggested, > if a '-' or '\'' is supplied (or some character greater than ASCII 32), > it should be treated as a literal character to be matched. Values > from 0 to 32 represent "meta" characters, or placeholders for > unspecified characters in the pattern. I'll only give you upper case > literal characters, A, B, C, ... > > Euman writes: > > Can I have text length files? what I mean is word files that are > > seperated by the length of their text? > > I'm not sure what you mean. > On problem #2 you must use Junko's dictionary (word list), > because I will be determining correctness based on > the words in that particular dictionary. If you want to reformat > her file into a different file, that's fine, but the time that it > takes to do that will be included in your total time. > > You can assume that there is enough memory to store > the 51802 words in memory (unless you store them in a > bizarrely inefficient way). > I'll be using a machine with 64 Mb of RAM. > > On problem #3 you can use any dictionary, formatted > any way you like, but I think Junko's is quite reasonable. > > Martin Stachon writes (privately): > > After the load of the wordlist, how many times you will > > call the function? > > In problem #2, assume that I will make 1000 > calls to your function. > > Derek Parnell writes: > > On a similar point I made the assumption that a pattern of {4,6,9} is > > equivalent to {1,2,3}. In other words, the actual value of the pattern > > characters is not important, only that they represent a unique character in > > the target word(s). > > Yes, that's correct. > > Aku writes: > > (Problem #1) How is the time calculated? > > How many iteration (loops) will it be tested? > > I'm planning to run each program once, > with a few megabytes of input text. > I'll actually do it a few times each, > and ignore the first run, > since the first time, the data won't be in > the operating system's memory cache. > > Aku writes: > > (Problem #1) Will the input contain byte 0 ? > > No. It will consist of the cipher line plus > many lines of English text - no weird characters, > no huge (over 1000 characters) lines. > > I'll add these points to the Web page later today. > > Regards, > Rob Craig > Rapid Deployment Software > http://www.RapidEuphoria.com > > > >
34. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 04, 2002
- 474 views
Martin Stachon wrote: > {1,2,3,4,5,4,3,2,1}, > {1,2,'X'}, > {'M',1,2,3,4,5}, > {1,2,3,4,5,6,7,8,9,10}, > {'E',1,1,2}, > {1,2,2,1,3}, > {1,1,2}, > {1,2,1}, > {'M',1,2,1,'M'}, > "MARTIN", > {1,2,'X',2,1}, > {1,2,3,'B',3,4}, > {1,2,'M',2,1}, > {'E',1,2,3,1,3} > ----------------------------------------------- > I get this output: > 5.06 > 5.5 > > On my Winchip @200Mhz, Win98. And you? > Euman's hashtable runs at 1.16 but I tweaked that a lot and got it to run at 0.22 For Comp#2 program: I get this output for Martin's benchmark list 5.38 (load time) 5.98 (total time after 100x14 tests) ----- 0.60 Time for 100x14 tests, thus 0.0004 (per test) This is my benchmark list now and I added further optimisations last night. {1,1,2,3,4} {'E', 1, 2, 3, 1, 3} {1,'e',2,3,4,'s'} {1,1} {1} {1,2,3} {4,'i','e'} {1,'E',2, 'E', 3, 'e'} {"ABC" & '\'','s'} "rabbit" "r" & {1,1,2} {'t',1,2} {1,2,3,'s'} {1,1,2,2} {1,2,2,1} {1,2,3,4,'e','d'} {1,2,3,4,5,6,7} {1,2,3,1,4,3,3,2,5,3} I get this output for my's benchmark list 5.38 (load time) 6.32 (total time after 100x18 tests ) ----- 0.94 time for 100x18 tests, thus 0.0005 (per test) On my Intel P3 @550Mhz, WinMe. Comparing these with Martin's and Chris Bensler's times, it seems that I spend most of the time in loading in the word list, but once loaded, finding matches is very fast. --------------- Derek
35. Re: $100 Contest Question
- Posted by rforno at tutopia.com Mar 04, 2002
- 453 views
----- Original Message ----- From: "Robert Craig" <rds at RapidEuphoria.com> To: "EUforum" <EUforum at topica.com> Subject: Re: $100 Contest Question Rob Craig writes: <snip> > In problem #2, assume that I will make 1000 > calls to your function. <snip> It seems unfair to call only 1000 times the function, regarding that 50,000 words had to be somehow processed before.
36. Re: $100 Contest Question
- Posted by euman at bellsouth.net Mar 04, 2002
- 486 views
----- Original Message ----- From: "Derek Parnell" <ddparnell at bigpond.com> To: "EUforum" <EUforum at topica.com> > > Euman's hashtable runs at 1.16 but I tweaked that a lot and got it to run at > 0.22 Yes you did, thanks BIG D When "I" changed this: h *= 16 to this: h *= 3 -- which doesnt give as Unique a value as before but is still very effective. The routine was @250% faster bringing the time from 3.5 sec on my 233mhz to 1.1 sec This is the routine I was talking about sharing in a month or so. But when BIG D convinced me that getc( ) would shave another 0.25 sec off the load time. I was impressed. He also went a few steps further and now EumsHash runs in at 0.60 sec on my 233mhz laptop ( at 200mhz desk). I can imagine figures on PIII 500 or higher machine being (0.0something) or better now. Prolly could beat Junko's Spellchecker I havent coded this so Im not sure. sure is BLAZING FAST! Euman
37. Re: $100 Contest Question
- Posted by euman at bellsouth.net Mar 04, 2002
- 471 views
Chris are you sure the output is the same (hash table values)? If not I would say that the table lookup speed might differ. Right now, the table lookup is 100's time faster than reading in the "words.txt" and creating the table. reading and creating the table is 0.60 sec (233mhz) FAST imagine what the lookup will be on say a 15000 word .txt file like Junkos spell checker is timed at (2 sec) Im saying the spellchecker could be @ 1 sec for the same using EumsHash( ) but this is just a guestament. Euman euman at bellsouth.net Q: Are we monetarily insane? A: YES ----- Original Message ----- From: <bensler at mail.com> To: "EUforum" <EUforum at topica.com> Sent: Monday, March 04, 2002 5:47 PM Subject: RE: $100 Contest Question > > I got it down to 0.11 with minor tweaks, I think I can get it even > faster. :) > > Chris > > euman at bellsouth.net wrote: > > ----- Original Message ----- > > From: "Derek Parnell" <ddparnell at bigpond.com> > > To: "EUforum" <EUforum at topica.com> > > > > > > Euman's hashtable runs at 1.16 but I tweaked that a lot and got it to > > > run at > > > 0.22 > > > > Yes you did, thanks BIG D > > > > When "I" changed this: > > h *= 16 > > > > to this: > > h *= 3 -- which doesnt give as Unique a value as before but is still > > very effective. > > > > The routine was @250% faster bringing the time from 3.5 sec on my 233mhz > > to 1.1 sec > > This is the routine I was talking about sharing in a month or so. > > > > But when BIG D convinced me that getc( ) would shave another 0.25 sec > > off the load time. > > I was impressed. He also went a few steps further and now EumsHash runs > > in at 0.60 sec > > on my 233mhz laptop ( at 200mhz desk). I can imagine figures on PIII 500 or > > higher machine > > being (0.0something) or better now. > > > > Prolly could beat Junko's Spellchecker I havent coded this so Im not > > sure. > > > > sure is BLAZING FAST! > > > > Euman > > > > > > >
38. Re: $100 Contest Question
- Posted by petelomax at blueyonder.co.uk Mar 04, 2002
- 454 views
On Mon, 4 Mar 2002 16:45:58 -0300, rforno at tutopia.com wrote: > >Maybe someone is interested in that I casually found the word BBUFFALOES >instead of BUFFALOES in Junko´s list. I don't know if there are other >errors. "AGENCY" and "AGENCY " (agency+space), "DYNAMICALLY" and "DYNAMICALLY " (ditto) "REWRITING" and "REWRITING "(ditto) "ST." (ST+fullstop) Pete
39. Re: $100 Contest Question
- Posted by euman at bellsouth.net Mar 04, 2002
- 468 views
----- Original Message ----- From: <bensler at mail.com> To: "EUforum" <EUforum at topica.com> > > Grr, I DID test before I said anything, to be sure. But I double > checkled, and the values aren't the same :( > I'll have to see if I can still do it. > > Like Derek said though, I don't see the purpose of using that table. > What does it matter what the first letter of each word is? > Only a minute set of circumstances would benefit from it. > > Chris > As long as the values for each word are unique values and as long as the two single characters are 'A' and 'I' then your probably doing OK with the speed up youve done. > I don't see the purpose of using that table I can see randomly seeding a word till I find a match and applying the seeds to other words for consideration being fairly fast. This is my mind set for the project, Im sure I'll be beaten to death on this but I'll still give it good shot. Euman
40. Re: $100 Contest Question
- Posted by Robert Craig <rds at RapidEuphoria.com> Mar 04, 2002
- 462 views
I fixed the errors in the dictionary, reported by Pete Lomax and Ricardo Forno. BBUFFALOES -> BUFFALOES AGENCY + space -> duplicate, deleted DYNAMICALLY + space -> duplicate, deleted REWRITING + space -> duplicate, deleted ST. (ST+fullstop) -> deleted, "ST" already exists, no other words have "." So the dictionary now has 51798 words. You can assume that it's sorted and is all upper case. Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
41. Re: $100 Contest Question
- Posted by rforno at tutopia.com Mar 04, 2002
- 475 views
Derek: It seems to me I don't understand the rules. Being that Junko's words.txt contains only upper case, what is the point of: Test({1,2,3,4,'e','d'}) ? Thanks. ----- Original Message ----- From: "Derek Parnell" <ddparnell at bigpond.com> To: "EUforum" <EUforum at topica.com> Subject: Re: $100 Contest Question > > Hi Chris, > I'm now so here is the details I promised. > > > > > What kind of bench test are you using Derek? I get nowhere near those > > kinds of results. > > This is the group of 16 tests I repeat 80 times. > > Test({1,1,2,3,4}) > Test({'E', 1, 2, 3, 1, 3}) > Test({}) > Test({1,1}) > Test({1}) > Test({1,2,3}) > Test({4,'i','e'}) > Test({1,'E',2, 'E', 3, 'e'}) > Test({"ABC"&'\'','s'}) > Test("rabbit") > Test("r" & {1,1,2}) > Test({'t',1,2}) > Test({1,2,3,'s'}) > Test({1,1,2,2}) > Test({1,2,2,1}) > Test({1,2,3,4,'e','d'}) > > > > I use.. > > match_pattern({1,2,3,1,4,3,3,2,5,3}) -- ZIGZAGGING > > ..in an iterated loop > > > > I don't think I can even iterate 50x in 1.48 seconds > > And that pattern is on the faster side of the avergage. > > > > I don't understand how you could possibly get it that fast, unless your > > pattern is {1}, only then is my program comparable to your results. > > > > I've still got some tweaking to do, but I don't see it getting much > > faster. > > > > Try... > > -- find all 7 letter words with unique letters > > match_pattern({1,2,3,4,5,6,7}) > > > > I'd like to know what your iteration time is for that one. > > This takes between 0.99 and 1.04 seconds to do 10,000 iterations. > > sequence VOID > atom ts > ts=time() > for i = 1 to 10000 do > Test({1,2,3,4,5,6,7}) > end for > printf(1, "%f\n", time()-ts) > > > For the ZIGZAGGING test, it takes between 2.97 and 3.04 seconds to do 10,000 > iterations. > > > Thanks for getting me to do this test 'cos I can see a further optimisation > now > > ---------- > cheers, > Derek > > > >
42. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 04, 2002
- 436 views
5/03/2002 7:43:18 AM, rforno at tutopia.com wrote: > >Derek: >It seems to me I don't understand the rules. Being that Junko's words.txt >contains only upper case, what is the point of: >Test({1,2,3,4,'e','d'}) ? >Thanks. > > Strictly you're correct. If we read the rules it does imply that the pattern is case sensitive. "Assume that the numbers 0 to 32 are used to indicate the pattern, while the numbers above 32 are the ASCII codes of literal characters to be matched (including apostrophe and hyphen)." And thus applying lowercase characters in the pattern to WORDS.TXT would be a waste of time. Because the other competition's stressed case-insensitivity, I assumed it would apply to this comp as well. Thank's for pointing out my error. --------- Derek.
43. Re: $100 Contest Question
- Posted by Derek Parnell <ddparnell at bigpond.com> Mar 05, 2002
- 453 views
For those not too bored with this contest yet, after more tweaking in comp#2, I now get this output for my benchmark list: 1.43 (load time) 3.02 (total time after 100x32 tests ) ----- 1.59 time for 100x32 tests, thus 0.0005 (per test) On my Intel P3 @550Mhz, WinMe. The suite of 32 tests I'm using is: {1,1,2,3,4} {'E', 1, 2, 3, 1, 3} {1,'E',2,3,4,'S'} {1,1} {1} {1,2,3} {4,'I','E'} {1,'E',2, 'E', 3, 'E'} {1,2,2,3,'S'} "RABBIT" "R" & {1,1,2} {'T',1,2} {1,2,3,'S'} {1,1,2,2} {1,2,2,1} {1,2,3,4,'E','D'} {1,2,3,4,5,6,7} {1,2,3,1,4,3,3,2,5,3} {1,2,3,4,5,4,3,2,1} {1,2,'X'} {'M',1,2,3,4,5} {1,2,3,4,5,6,7,8,9,10} {'E',1,1,2} {1,2,2,1,3} {1,1,2} {1,2,1} {'M',1,2,1,'M'} "MARTIN" {1,2,'X',2,1} {1,2,3,'B',3,4} {1,2,'M',2,1} {'E',1,2,3,1,3} which I repeat 100 times. ------- Derek
44. Re: $100 Contest Question
- Posted by petelomax at blueyonder.co.uk Mar 05, 2002
- 446 views
On Sun, 3 Mar 2002 23:37:20 -0500, Robert Craig <rds at RapidEuphoria.com> wrote: >but I don't want programs to >take advantage of anything they did in a previous run.) Everyone knows that there are 51798 words in the dictionary. I've also worked out that one of my arrays will need 122545 entries. I assume it is ok to code these constants into the program. In contest #2, you specify a library routine. Therefore I expect you to invoke my code with include myfile ... select(pattern) in a standard test rig rather than the dos prompt redirection thingy. Is that right? If not, do you want a library routine and a loading program, or do you want them munged into a single standalone program? You also specify "a sequence containing all the words". Should this be in alphabetical order? I'm going to assume not unless otherwise told. Pete Lomax
45. Re: $100 Contest Question
- Posted by Robert Craig <rds at RapidEuphoria.com> Mar 05, 2002
- 468 views
Pete Lomax writes: > Everyone knows that there are 51798 words in the dictionary. > I've also worked out that one of my arrays will need 122545 entries. > > I assume it is ok to code these constants into the program. I'm not judging based on coding style, but the dictionary has changed in size once. It could happen again. If I have to go into your code and change a bunch of hard-coded numbers to make it work, I won't be too happy. If you have a single constant defined for the dictionary size, from which you compute other constants, I guess I can change that single constant for you. > In contest #2, you specify a library routine. Therefore I expect you > to invoke my code with > > include myfile > ... > select(pattern) > in a standard test rig rather than the dos prompt redirection thingy. > Is that right? Yes. > If not, do you want a library routine and a loading > program, or do you want them munged into a single > standalone program? See http://www.RapidEuphoria.com/contest.htm I recently clarified a bunch of those issues. > You also specify "a sequence containing all the words". Should this be > in alphabetical order? I'm going to assume not unless otherwise told. The result sequence does not have to be in alphabetical order. Ray Smith writes: > Just playing with input redirection and I can't get it to work with > the windows interpreter (exw). It works fine for me with Windows ME, but apparently it won't work on NT/2000/XP systems. So just use ex.exe instead. You can still get the $5 bonus for "working on all platforms". Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
46. Re: $100 Contest Question
- Posted by rforno at tutopia.com Mar 05, 2002
- 481 views
No, I tested my program with some patterns and then it appeared. ----- Original Message ----- From: "C. K. Lester" <cklester at yahoo.com> To: "EUforum" <EUforum at topica.com> Subject: RE: $100 Contest Question > > rforno at tutopia.com wrote: > > Maybe someone is interested in that I casually found the > > word BBUFFALOES instead of BUFFALOES in Junko´s list. > > I don't know if there are other errors. > > No wonder my program kept crashing!!! heheh > > How'd you find that, btw? Just reading the list? > > > >
47. Re: $100 Contest Question
- Posted by Robert Craig <rds at RapidEuphoria.com> Mar 05, 2002
- 458 views
Chris Bensler writes: > I'm asuming, we cannot asume anything about words.txt except that it > will be sorted, and all words are capitalized. > > Are there other specifications of the file that we CAN count on Rob? > like number of words, or number of words beginning with each letter. > Or number of words of each length. What about maximum word length? You can assume that if there are further changes to words.txt, they will only be minor changes, perhaps deleting a word here or there. Three words were removed recently because they were duplicates of previous words, except that they also had a blank character on the end. Another word was a duplicate, but it had a period on the end, and it was the only word in the dictionary like that ("ST." - abbreviation for "saint"). I didn't want to leave that sort of corruption in the file, lest it trip someone up unfairly. I ran a little program yesterday to confirm that all words are in alphabetical order, and there are no duplicates, and no blank characters or periods. So go ahead and assume what you like, but for safety you might want to allow for some slight changes, i.e. create sequences slightly larger than they need to be. In general, this is the kind of coding decision that C/C++ forces on you, not Euphoria, unless you are obsessed with speed, which I guess you are. > Should the routine handle lowercase parameters? In problem #2, I won't give you any lower case letters as part of the pattern. In problem #3 you will have to accept upper and lower case text as the input, and you should preserve it in the output. Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
48. Re: $100 Contest Question
- Posted by Martin Stachon <martin.stachon at worldonline.cz> Mar 05, 2002
- 466 views
Chris Bensler wrote: > On my PIII 600mhz/64MB : > > Euman's hashtable runs at 0.8s On my machine it takes 6-8 secs... Damn the machine {1,2,'C','K',1} ... Using TestCPU MOV memory MOVSD memory Calcul.score PIII @550Mhz : 429 MB/s 1126 MB/s 220 Winchip @200Mhz : 60 MB/s 142 MB/s 35 score - Dhrystones Whetstones MIPS MFLOPS : 203 195 200 215 : 45 24 17 22 So I should divide my times by 10 ... Also I can't tell wheter an algorithm which is faster on my machine will be faster on Rob's machine because of different cache size (actually I have larger L1 cache than PIII but PIII has faster L2 cache) and PIII's advanced features such is branch prediction, SSE etc. P.S. : I've just fixed yet another bottleneck in my routine the #2 will be really tough...also a bit of luck will decide... Maybe I should move to #3 but I do not dare to challenge David Cuny. Martin
49. Re: $100 Contest Question
- Posted by petelomax at blueyonder.co.uk Mar 05, 2002
- 444 views
On Tue, 5 Mar 2002 12:50:35 -0500, Robert Craig <rds at RapidEuphoria.com> wrote: > If I have to go into your code >and change a bunch of hard-coded numbers to make it work, >I won't be too happy. I assumed you'd immediately disqualify it! >If you have a single constant defined >for the dictionary size, from which you compute other constants, >I guess I can change that single constant for you. I have several, but like 122545 mainly empirical results. Without the program that blatted that number out, (or including all that code in my submission), there's no simple calc'n for it. Hence, If my program doesn't work because that number is wrong, don't change it, just disqualify my entry. >See >http://www.RapidEuphoria.com/contest.htm > Thanks, I'd forgotten to make select() global! A couple of extra test cases too, I see, thanks also. >The result sequence does not have to be in alphabetical order. UGH. I was sneakily hoping to put a few competitors back a step or two on that one, sly devil that I am. (grin) Cheers, Pete
50. Re: $100 Contest Question
- Posted by petelomax at blueyonder.co.uk Mar 05, 2002
- 482 views
On Wed, 6 Mar 2002 00:57:33 +1100, Derek Parnell <ddparnell at bigpond.com> wrote: > >For those not too bored with this contest yet, after more tweaking in >comp#2, I now get this output for my benchmark list: > > 1.43 (load time) > 3.02 (total time after 100x32 tests ) > ----- > 1.59 time for 100x32 tests, thus > 0.0005 (per test) > > On my Intel P3 @550Mhz, WinMe. Load time 3.170000, test 15.480000, total 18.650000, iteration 0.009675 Lets call it a draw. Pete
51. Re: $100 Contest Question
- Posted by David Cuny <dcuny at LANSET.COM> Mar 05, 2002
- 485 views
Martin wrote: > Maybe I should move to #3 but I do not dare to > challenge David Cuny. I'm not the one to worry about. Doesn't C. K. Lester have his "White Rabbit" program solving crytograms in about 5 seconds? I'm using a fairly naive genetic algorithm (GA) as the core of my solver. Even with a number of optimizations, it took about an hour to crunch through: king nero prvby khnc avcl fhyyvy cavrdlk chfn deaf tfengy ic dhuly erbv ychtl riby xhff ertfikerd fhyyel before finally settling on: dumb minx knots damp rope lassos pronged palm girl climbs up gazes into space nuts fall including lassie There may be room in the code for improvement - profiling will tell - but a 10:1 improvement? I suspect not, especially when "White Rabbit" is apparently churning out the solution in seconds. The real problem for me seems to be the algorithm. GA is dead simple to implement, and will eventually hit on the solution. But it's not especially efficient. I'm considering using trigraphs instead; they look *very* successful. Take a look at: http://www.muth.org/Robert/Cipher/substitution Because the topic of solving crytograms is so well researched, I suspect the winners will all select the same general algorithm. So the prize will go to the person who bothers to do a little research, and optimizes the heck out of their code. That sounds like jiri, C.K. Lester or Derek... but certainly not me. I'll don't take my code down to that sort of optimization. Thanks anyway for the vote! -- David Cuny
52. Re: $100 Contest Question
- Posted by euman at bellsouth.net Mar 05, 2002
- 451 views
http://www.muth.org/Robert/Cipher/substitution -- David Cuny I think we've studied the same material DC, I posted this when the contest started. http://www.cs.arizona.edu/http/html/people/muth/Cipher/ Euman
53. Re: $100 Contest Question
- Posted by David Cuny <dcuny at LANSET.COM> Mar 06, 2002
- 462 views
Euman wrote: > I think we've studied the same material DC, I posted this when > the contest started. Sorry about not giving credit. I ran across the site after doing a Google search for a cryptogram you posted. -- David Cuny
54. Re: $100 Contest Question
- Posted by David Cuny <dcuny at LANSET.COM> Mar 06, 2002
- 505 views
I hesitate to mention this, not only because it's the only optimization my pathetic code has, but because it's so obvious... People have been talking about a single dictionary, but you actually need two hash tables: one hashed based on the word, and the other based on the pattern the word generates. For example: SILLY would be placed in the "normal" dictionary under the key "SILLY" and under the "pattern" dictionary as "ABCCD". If you read your cryptogram before you read the dictionary, you don't have to create a hash table for the pattern dictionary - just those that match patterns found in the cryptogram. That is, if the cryptogram were: IG EES SWWM then your dictionary would only need to hold words that match the patterns: AB AAB ABBC And since crytogram words always create the same pattern, if you store the hash key to the pattern with the cryptogram words, you don't need to recalculate it. Yeah, all obvious stuff. -- David Cuny