1. contest #2 derek
- Posted by tone.skoda at siol.net Apr 04, 2002
- 462 views
This is a multi-part message in MIME format. ------=_NextPart_000_0013_01C1DBF1.D776E860 charset="iso-8859-2" Derek I still haven't seen your code for contest 2 which is suppposed to be the fastest? Can you send it? I'm interested to see how u did it. If anybody wantz to download mine: http://www10.brinkster.com/tskoda/euphoria.asp It doesn't work entirely correct though, a few small modifications would be needed (I forgot '-' character ...). ------=_NextPart_000_0013_01C1DBF1.D776E860 charset="iso-8859-2" <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <HTML><HEAD> <META http-equiv=Content-Type content="text/html; charset=iso-8859-2"> <META content="MSHTML 6.00.2462.0" name=GENERATOR> <STYLE></STYLE> </HEAD> <BODY bgColor=#ffffff> <DIV><FONT face=Arial size=2>Derek I still haven't seen your code for contest 2 which is suppposed to be the fastest? Can you send it? I'm interested to see how u did it.</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>If anybody wantz to download mine:</FONT></DIV> <DIV><FONT face=Arial size=2><A href="http://www10.brinkster.com/tskoda/euphoria.asp">http://www10.brinkster.com/tskoda/euphoria.asp</A></FONT></DIV> <DIV><FONT face=Arial size=2>It doesn't work entirely correct though, a few small modifications would be needed (I forgot '-' character ------=_NextPart_000_0013_01C1DBF1.D776E860--
2. Re: contest #2 derek
- Posted by petelomax at blueyonder.co.uk Apr 05, 2002
- 472 views
On Fri, 5 Apr 2002 20:25:48 +0200, Tone Skoda <tone.skoda at siol.net> wrote: > >----- Original Message ----- >From: "Pete Lomax" <petelomax at blueyonder.co.uk> > >> You might be interested in mine then. It is 12X faster than the >> winning entry (2X faster than yours) and 100% correct. > >If it works 100% correct (it does as I checked it) why didn't you win then? >(not that I have anything against Jason Mirwald) > Talking about shooting oneself in the foot! I yanked it at the 11th hour, entered compe#3. Mind you, I did actually post the best algorithm in #3 too, even though a stupid linefeed buglette made me judged second. I actually put a rackload of code in there to start panicing every 30 seconds; in fact the test set given I solved 4 of the five entries in under 10 seconds (yes, for four, ie 2.5 seconds each on average) Even the hard beast (#3) ran in 50seconds & progressed well enough to avoid invoking the panic button... OH, what the heck, I was going to post this private, but anyone who has my code (Rob already posted it), uncomment: if length(wschk)=2 then rank+=32*32*1024 end if if length(wschk)=3 then rank+=32*1024 end if in procedure pick_best_comment(). WOW! Two tiny changes: 97.42% Sure C.K (well done mate!) won, rightly too, my bugs, my bad! But that's not the point. 2nd in the "hard" competition is damn good & I'm rightly proud. I've already grumbled publicly about the test conditions (2+ weeks ago). They are on record. No doubt I won't let it lie. Expect me to repeatedly remark on the unrealistic/ unfair aspects of this competition. Please give credit:- I ain't grumblin for not bein first, just voicin my opinion! Pete big oaks from little acorns grow big aches from little toe corns grow quick folks by fakes can hoax the slow crypt jokes from such ideas grow
3. Re: contest #2 derek
- Posted by Derek Parnell <ddparnell at bigpond.com> Apr 05, 2002
- 457 views
This is a multi-part message in MIME format. ------=_NextPart_000_02D9_01C1DD4B.86A795A0 charset="iso-8859-1" Attached is the code I sent to Robert. comp1.ex is not optimised for speed. I was just trying for correctness. It could certainly be made to run faster. comp2sub.e is fast but it is biased towards patterns with zero or few known letters. After submitting this to Robert, I continued to experiment with optimisations and got it running about twice as fast again. I didn't submit anything for comp#3 as I ran out of time with other commitments. ---------- Derek. ------=_NextPart_000_02D9_01C1DD4B.86A795A0 Content-Type: application/octet-stream; name="comp1.ex" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="comp1.ex" include get.e include wildcard.e include sort.e sequence vCipherMap, vFullAlphabet, vFrequencies constant kAlphabet =3D "ABCDEFGHIJKLMNOPQRSTUVWXYZ" procedure AbortWithError(sequence pText) printf(2, "ERROR DETECTED!\n%s\n\n", {pText}) abort(1) end procedure procedure Main() object lTextIn integer lPosn integer lChar integer lFH sequence lLetterFreqs ---------------- -- Get cipher map ---------------- lTextIn =3D gets(0) if atom(lTextIn) then AbortWithError("No cipher map provided") end if -- strip off LF lTextIn =3D lTextIn[1..length(lTextIn)-1] if length(lTextIn) !=3D 26 then AbortWithError("The cipher map must be exactly 26 chars long") end if -- Make room for upper and lowercase mapping vCipherMap =3D repeat(0, 26 * 2) vFullAlphabet =3D repeat(0, 26 * 2) vFrequencies =3D repeat(0, 255) for i =3D 1 to 26 do lPosn =3D find(lTextIn[i], vCipherMap) if lPosn !=3D 0 then AbortWithError(sprintf("The cipher map at char %d '%s' has = been repeated", {i,lTextIn[i]})) end if lPosn =3D find(lTextIn[i], kAlphabet) if lPosn =3D 0 then AbortWithError(sprintf("The cipher map at char %d '%s' is = not A-Z", {i,lTextIn[i]})) end if vCipherMap[i] =3D lTextIn[i] vCipherMap[i + 26] =3D lower(lTextIn[i]) vFullAlphabet[i] =3D kAlphabet[i] vFullAlphabet[i + 26] =3D lower(kAlphabet[i]) end for ---------------- -- Write out the Inverse cipher map ---------------- for i =3D 1 to 26 do lPosn =3D find(vFullAlphabet[i], vCipherMap) lChar =3D kAlphabet[lPosn] puts(1, lChar) end for puts(1, '\n') ---------------- -- encode text ---------------- lTextIn =3D gets(0) while not atom(lTextIn) do for i =3D 1 to length(lTextIn) do lPosn =3D find(lTextIn[i], vFullAlphabet) if lPosn =3D 0 then lChar =3D lTextIn[i] else lChar =3D vCipherMap[lPosn] end if puts(1, lChar) vFrequencies[lTextIn[i]] +=3D 1 end for lTextIn =3D gets(0) end while ---------------- -- record frequencies ---------------- lLetterFreqs =3D repeat(0,26) for i =3D 1 to 26 do -- Lower and upper cases are considered to be the same letter. lLetterFreqs[i] =3D {vFrequencies[vCipherMap[i]] + vFrequencies[lower(vCipherMap[i])], vCipherMap[i]} end for lLetterFreqs =3D sort(lLetterFreqs) lFH =3D open("letters.txt","w") if lFH < 0 then AbortWithError("Failed to create the letters.txt file.") end if for i =3D 26 to 1 by -1 do printf(lFH, "%s %6d\n", {lLetterFreqs[i][2], = lLetterFreqs[i][1]}) end for close(lFH) end procedure Main() ------=_NextPart_000_02D9_01C1DD4B.86A795A0 Content-Type: application/octet-stream; name="comp2sub.e" Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment; filename="comp2sub.e" -- comp2.e -- 5/Mar/2002 Derek Parnell ------------------ -- Specifications: ------------------ -- Read in a list of 51,798 English words and organize it somehow in -- memory. You must use words.txt contained in Junko's spell checker in = the -- Archive. In Junko's dictionary, the words are in alphabetical order, = one -- per line, and are all in upper case. -- Write a library routine that will return a sequence containing all = words -- in the dictionary that match a given pattern, e.g. given {1,1,2,3,4} = as -- the argument to your function, you'd return a sequence of all = 5-letter -- words where the first 2 letters are the same and the next 3 are -- different. The user can also specify certain specific letters, -- e.g. {'E', 1, 2, 3, 1, 3} would get all 6-letter words beginning with -- 'E' that have that pattern. Note in this case that 1, 2 and 3 = represent -- three distinct and different characters, none of which is 'E', and = none -- of which is the same as one of the others. -- -- Assume that the numbers 0 to 32 are used to indicate the pattern, = while -- the numbers above 32 are the ASCII codes of literal characters to be -- matched (including apostrophe and hyphen). -- -- Correctness, and then speed decides the winner. Your program's total -- time will include the time to read Junko's file, plus the time to -- process 1000 different calls to your pattern-matching subroutine. The -- test set will be created using random numbers, but everyone will be -- given the same set. --------------------- -- Assumptions: --------------------- -- 1) Each element in the supplied pattern must be an integer. -- 2) The routine can return error results. I choose the form {{-ve = int,"msg"}} -- 3) Pattern elements whose ASCII value is higher than SPACE will -- be used as exact character matches. without trace without type_check include get.e include wildcard.e include file.e include misc.e include machine.e sequence vDictionary vDictionary =3D {} sequence vKnownPatterns sequence vDictLen constant kNumBins =3D 9824*5-1 sequence vHash vHash =3D repeat(0, kNumBins) atom vConvertPattern vConvertPattern =3D allocate(100) -- Note, this is = never freed up. sequence vBinVals vBinVals =3D {vConvertPattern,6} sequence vMatrix vMatrix =3D {1,3,5,7,11,13} atom hash1, hash2 sequence hashraw -- The returns a standardised pattern sequence for any word. -- eg. "CARROT" becomes {1,2,3,3,4,5} -- "THERE" becomes {1,2,3,4,3} -- "HAPPY" becomes {1,2,3,3,4} ------------------------------------------------------ global function MakePattern(sequence pWord) ------------------------------------------------------ sequence lWordPattern integer lPatternSet integer lPosn -- Validate and standardize the parameter. lWordPattern =3D repeat(0,length(pWord)) lPatternSet =3D 0 for i =3D 1 to length(pWord) do -- Each element must be an integer. if not integer(pWord[i]) then return {{-1, sprintf("Bad element in pattern at pos %d",i)}} = -- Bad element in pattern end if lPosn =3D find(pWord[i], pWord) if lPosn < i then lWordPattern[i] =3D lWordPattern[lPosn] else lPatternSet +=3D 1 lWordPattern[i] =3D lPatternSet end if end for return lWordPattern end function integer vPosn integer vFH integer vPatMax sequence vWordPattern integer vFileLen integer vEOLChar,vEOLSize sequence vFileBuf sequence vEmpty integer vWordStart, vWordEnd --------------------------- -- Initialize the dictionary. --------------------------- if platform() =3D LINUX then vEOLChar=3D10 vEOLSize=3D1 else vEOLChar=3D13 vEOLSize=3D2 end if vFH =3D open("words.txt", "rb") if vFH > 0 then -- Pull the whole file into RAM vFileLen =3D seek(vFH, -1) vFileLen =3D where(vFH) vFileBuf =3D repeat(0, vFileLen+1) if seek(vFH, 0) then end if for i =3D 1 to vFileLen do vFileBuf[i] =3D getc(vFH) end for close(vFH) vFileBuf[vFileLen+1]=3DvEOLChar vWordStart =3D 1 vWordEnd =3D 2 vEmpty =3D repeat(0,30) vKnownPatterns =3D repeat(0,10000) vDictionary =3D repeat(vEmpty,10000) vDictLen =3D repeat(0,10000) vPatMax =3D 0 while vWordEnd < vFileLen do -- get next word while vFileBuf[vWordEnd] !=3D vEOLChar do vWordEnd+=3D1 end while -- Find the pattern for this word. vWordPattern =3D MakePattern(vFileBuf[vWordStart..vWordEnd-1]) -- BIG ASSUMPTION: word file has no errors. -- See if I've recorded this pattern before, and -- if not, add it to the list of known patterns. poke(vConvertPattern, repeat(0, 24)) poke(vConvertPattern, vWordPattern) hashraw =3D peek4u(vBinVals) hashraw *=3D vMatrix hash1 =3D remainder(hashraw[1] + hashraw[2] + hashraw[3] + hashraw[4] + hashraw[5] + hashraw[6] , kNumBins)+1 if equal(vHash[hash1], 0) then vPatMax +=3D 1 vPosn =3D vPatMax vKnownPatterns[vPosn] =3D vWordPattern vHash[hash1] =3D {vPosn, vWordPattern} else if equal(vWordPattern, vHash[hash1][2]) then vPosn =3D vHash[hash1][1] else hash2 =3D hash1 hash1 +=3D 1 vPosn =3D 0 while hash1 <=3D kNumBins and not equal(vHash[hash1],0) = do if equal(vWordPattern, vHash[hash1][2]) then vPosn =3D vHash[hash1][1] exit end if hash1 +=3D 1 end while if vPosn =3D 0 and hash1 > kNumBins then hash1 =3D 1 while hash1 < hash2 and not equal(vHash[hash1],0) do if equal(vWordPattern, vHash[hash1][2]) then vPosn =3D vHash[hash1][1] exit end if hash1 +=3D 1 end while end if if vPosn =3D 0 then vPatMax +=3D 1 vPosn =3D vPatMax vKnownPatterns[vPosn] =3D vWordPattern vHash[hash1] =3D {vPosn, vWordPattern} end if end if end if -- At this point I should know the pattern group and -- pattern subindex. The word is inserted into the -- dictionary in a sublist. The sublist is referenced -- by pattern group (ie length) and pattern subindex. -- Thus all words that have the same pattern are put -- into the dictionary in the same sublist. -- No attempt is made to detect duplicated words. vDictLen[vPosn] +=3D 1 hash1 =3D vDictLen[vPosn] if hash1 > length(vDictionary[vPosn]) then vDictionary[vPosn] &=3D vEmpty end if vDictionary[vPosn][hash1] =3D vFileBuf[vWordStart..vWordEnd-1] vWordStart =3D vWordEnd + vEOLSize vWordEnd =3D vWordStart+1 end while -- Adjust to actual lengths used. vKnownPatterns =3D vKnownPatterns[1..vPatMax] vDictionary =3D vDictionary[1..vPatMax] for i =3D 1 to vPatMax do vDictionary[i] =3D vDictionary[i][1..vDictLen[i]] end for -- Make the buffer memory for reuse. vFileBuf =3D {} end if -- This returns a sequence containing all the words in the dictionary -- that match the supplied pattern. ------------------------------------------------------ global function MatchedWords(sequence pPattern) ------------------------------------------------------ sequence lFinds sequence lWordPattern sequence lExactChars integer lPosn integer lCnt sequence lWord -- If we still haven't got a dictionary, we are screwed. if length(vDictionary) =3D 0 then return {{-2, "No dictionary is available."}} end if ------------------- -- Start searching... ------------------- -- Validate and standardize the parameter. -- Create a standardized pattern, and check for errors. lWordPattern =3D MakePattern(pPattern) if length(lWordPattern) =3D 0 or (length(lWordPattern) =3D 1 and sequence(lWordPattern[1]) and lWordPattern[1][1] < 0) then return lWordPattern end if -- Build a list of characters to be matched later. lExactChars =3D {} for i =3D 1 to length(pPattern) do -- Is this also an 'exact' char? if pPattern[i] > ' ' then lExactChars =3D append(lExactChars, {pPattern[i], i}) end if end for -- At this point, lWordPattern is the pattern in standard form, -- lExactChars is a list of positions we have -- to check for exact character matches -- Scan through the patterns in this group, looking for a matching = one. poke(vConvertPattern, repeat(0, 24)) poke(vConvertPattern, lWordPattern) hashraw =3D peek4u(vBinVals) hashraw *=3D vMatrix hash1 =3D remainder(hashraw[1] + hashraw[2] + hashraw[3] + hashraw[4] + hashraw[5] + hashraw[6] , kNumBins)+1 if equal(vHash[hash1], 0) then return {} -- No matching words in dictionary. end if if equal(lWordPattern, vHash[hash1][2]) then lPosn =3D vHash[hash1][1] else hash2 =3D hash1 hash1 +=3D 1 lPosn =3D 0 while hash1 <=3D kNumBins and not equal(vHash[hash1],0) do if equal(lWordPattern, vHash[hash1][2]) then lPosn =3D vHash[hash1][1] exit end if hash1 +=3D 1 end while if lPosn =3D 0 and hash1 > kNumBins then hash1 =3D 1 while hash1 < hash2 and not equal(vHash[hash1],0) do if equal(lWordPattern, vHash[hash1][2]) then lPosn =3D vHash[hash1][1] exit end if hash1 +=3D 1 end while end if if lPosn =3D 0 then return {} -- We don't have any words for this pattern. end if end if -- No exact character matching to do, so just return the whole = sublist -- of words. if length(lExactChars) =3D 0 then return vDictionary[lPosn] end if -- I need to scan through each word, checking for exact character -- matches, according to the supplied pattern. lFinds =3D {} for i =3D 1 to length(vDictionary[lPosn]) do lCnt =3D 0 lWord =3D vDictionary[lPosn][i] for j =3D 1 to length(lExactChars) do if lWord[lExactChars[j][2]] =3D lExactChars[j][1] then lCnt +=3D 1 else exit -- Leave as soon as one match fails. end if end for -- If every required character is in its required position, -- then this is a word that I want. if lCnt =3D length(lExactChars) then lFinds =3D append(lFinds, vDictionary[lPosn][i]) end if end for return lFinds end function ------=_NextPart_000_02D9_01C1DD4B.86A795A0--
4. Re: contest #2 derek
- Posted by petelomax at blueyonder.co.uk Apr 06, 2002
- 449 views
On Fri, 5 Apr 2002 21:38:53 +0000, "C. K. Lester" <cklester at yahoo.com> wrote: > > >petelomax at blueyonder.co.uk wrote: > >> I did actually post the best algorithm in #3... > >What was your approach? I've not looked at your code yet, but I'd be >interested in how you approached it. Have you seen my code? Any tweak >suggestions? :) > >> WOW! Two tiny changes: 97.42% > >For me, one tiny change: 100%. I'll agree that's a nice tweak, but, 100%? - you really ought to have bit the bullet and picked just one answer (he says, ducking >From what has happened to both you and me, I begin to think the competition should have allowed a chance to bug fix, especially looking at one of the quotes Rob himself picked! Anyway, congratulations, a fine program sir. Pete
5. Re: contest #2 derek
- Posted by petelomax at blueyonder.co.uk Apr 06, 2002
- 440 views
On Sat, 6 Apr 2002 09:14:59 +1000, Derek Parnell <ddparnell at bigpond.com> wrote: >Attached is the code I sent to Robert. > >After submitting this to Robert, I continued to experiment with >optimisations and got it running about twice as fast again. I'de definitely like to see that, since my program runs in 1.98, the one you attached 5.17, so those optimisations would make it very close >comp2sub.e is fast but it is biased towards patterns with zero or few known >letters. Thats exactly what I thought about my program. Pete
6. Re: contest #2 derek
- Posted by rforno at tutopia.com Apr 06, 2002
- 451 views
It its strange, but when I developed my code, I found that, for the special case of the dictionary, it was faster to append than to replace items in repeat(0, length). ----- Original Message ----- From: "George Orr" <georgeorr at donet.com> To: "EUforum" <EUforum at topica.com> Subject: RE: contest #2 derek > > Derek - > > Can you get Rob to post your code in the archives? I only access this > list through the web interface, and don't get attachments. I was trying > an approach that sounds a lot like yours, and am really interested in > how you implemented it. I think I overused appends, etc. and this cost > me too much time in building the "dictionary". > > Thanks > > George > > Derek Parnell wrote: > > Attached is the code I sent to Robert. > > ---------- > > Derek. > > > > > > >