OpenEuphoria: Forum: contest #2 derek

1. contest #2 derek

Posted by tone.skoda at siol.net Apr 04, 2002
568 views

This is a multi-part message in MIME format.

------=_NextPart_000_0013_01C1DBF1.D776E860
	charset="iso-8859-2"

Derek I still haven't seen your code for contest 2 which is suppposed to be the
fastest? Can you send it? I'm interested to see how u did it.

If anybody wantz to download mine:
http://www10.brinkster.com/tskoda/euphoria.asp
It doesn't work entirely correct though, a few small modifications would be
needed (I forgot '-' character ...).

------=_NextPart_000_0013_01C1DBF1.D776E860
	charset="iso-8859-2"

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-2">
<META content="MSHTML 6.00.2462.0" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2>Derek I still haven't seen your code for contest 
2&nbsp;which is suppposed to be the&nbsp;fastest? Can you send it? I'm 
interested to see how u did it.</FONT></DIV>
<DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
<DIV><FONT face=Arial size=2>If anybody wantz to download mine:</FONT></DIV>
<DIV><FONT face=Arial size=2><A 
href="http://www10.brinkster.com/tskoda/euphoria.asp">http://www10.brinkster.com/tskoda/euphoria.asp</A></FONT></DIV>
<DIV><FONT face=Arial size=2>It doesn't work entirely correct though, a few 
small modifications would be needed (I forgot '-' character 

------=_NextPart_000_0013_01C1DBF1.D776E860--

new topic » topic index » view message » categorize

2. Re: contest #2 derek

Posted by petelomax at blueyonder.co.uk Apr 05, 2002
564 views

On Fri, 5 Apr 2002 20:25:48 +0200, Tone Skoda <tone.skoda at siol.net>
wrote:

>
>----- Original Message -----
>From: "Pete Lomax" <petelomax at blueyonder.co.uk>
>
>> You might be interested in mine then. It is 12X faster than the
>> winning entry (2X faster than yours) and 100% correct.
>
>If it works 100% correct (it does as I checked it) why didn't you win then?
>(not that I have anything against Jason Mirwald)
>
Talking about shooting oneself in the foot! I yanked it at the 11th
hour, entered compe#3. Mind you, I did actually post the best
algorithm in #3 too, even though a stupid linefeed buglette made me
judged second.

I actually put a rackload of code in there to start panicing every 30
seconds; in fact the test set given I solved 4 of the five entries in
under 10 seconds (yes, for four, ie 2.5 seconds each on average)

Even the hard beast (#3) ran in 50seconds & progressed well enough to
avoid invoking the panic button...

OH, what the heck, I was going to post this private, but anyone who
has my code (Rob already posted it), uncomment:

	if length(wschk)=2 then rank+=32*32*1024 end if 
	if length(wschk)=3 then rank+=32*1024 end if

in procedure pick_best_comment().

WOW!   Two tiny changes: 97.42%

Sure C.K (well done mate!) won, rightly too, my bugs, my bad!

But that's not the point. 2nd in the "hard" competition is damn good &
I'm rightly proud.

I've already grumbled publicly about the test conditions (2+ weeks
ago). They are on record. No doubt I won't let it lie.

Expect me to repeatedly remark on the unrealistic/ unfair aspects of
this competition. Please give credit:- I ain't grumblin for not bein
first, just voicin my opinion!

Pete
big oaks from little acorns grow 
big aches from little toe corns grow
quick folks by fakes can hoax the slow 
crypt jokes from such ideas grow

new topic » goto parent » topic index » view message » categorize

3. Re: contest #2 derek

Posted by Derek Parnell <ddparnell at bigpond.com> Apr 05, 2002
550 views

This is a multi-part message in MIME format.

------=_NextPart_000_02D9_01C1DD4B.86A795A0
	charset="iso-8859-1"

Attached is the code I sent to Robert.

comp1.ex is not optimised for speed. I was just trying for correctness. It
could certainly be made to run faster.

comp2sub.e is fast but it is biased towards patterns with zero or few known
letters. After submitting this to Robert, I continued to experiment with
optimisations and got it running about twice as fast again.

I didn't submit anything for comp#3 as I ran out of time with other
commitments.

----------
Derek.

------=_NextPart_000_02D9_01C1DD4B.86A795A0
Content-Type: application/octet-stream;
	name="comp1.ex"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="comp1.ex"

include get.e
include wildcard.e
include sort.e

sequence vCipherMap, vFullAlphabet, vFrequencies
constant kAlphabet =3D "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

procedure AbortWithError(sequence pText)
    printf(2, "ERROR DETECTED!\n%s\n\n", {pText})
    abort(1)
end procedure


procedure Main()
    object lTextIn
    integer lPosn
    integer lChar
    integer lFH
    sequence lLetterFreqs

    ----------------
    -- Get cipher map
    ----------------
    lTextIn =3D gets(0)
    if atom(lTextIn) then
        AbortWithError("No cipher map provided")
    end if
        -- strip off LF
    lTextIn =3D lTextIn[1..length(lTextIn)-1]
    if length(lTextIn) !=3D 26 then
        AbortWithError("The cipher map must be exactly 26 chars long")
    end if
        -- Make room for upper and lowercase mapping
    vCipherMap =3D repeat(0, 26 * 2)
    vFullAlphabet =3D repeat(0, 26 * 2)
    vFrequencies =3D repeat(0, 255)

    for i =3D 1 to 26 do
        lPosn =3D find(lTextIn[i], vCipherMap)
        if lPosn !=3D 0 then
            AbortWithError(sprintf("The cipher map at char %d '%s' has =
been repeated",
                         {i,lTextIn[i]}))
        end if

        lPosn =3D find(lTextIn[i], kAlphabet)
        if lPosn =3D 0 then
            AbortWithError(sprintf("The cipher map at char %d '%s' is =
not A-Z",
                         {i,lTextIn[i]}))
        end if

        vCipherMap[i] =3D lTextIn[i]
        vCipherMap[i + 26] =3D lower(lTextIn[i])
        vFullAlphabet[i] =3D kAlphabet[i]
        vFullAlphabet[i + 26] =3D lower(kAlphabet[i])

    end for

    ----------------
    -- Write out the Inverse cipher map
    ----------------
    for i =3D 1 to 26 do
        lPosn =3D find(vFullAlphabet[i], vCipherMap)
        lChar =3D kAlphabet[lPosn]
        puts(1, lChar)
    end for
    puts(1, '\n')

    ----------------
    -- encode text
    ----------------
    lTextIn =3D gets(0)
    while not atom(lTextIn) do
        for i =3D 1 to length(lTextIn) do
            lPosn =3D find(lTextIn[i], vFullAlphabet)
            if lPosn =3D 0 then
                lChar =3D lTextIn[i]
            else
                lChar =3D vCipherMap[lPosn]
            end if
            puts(1, lChar)

            vFrequencies[lTextIn[i]] +=3D 1
        end for
        lTextIn =3D gets(0)
    end while

    ----------------
    -- record frequencies
    ----------------
    lLetterFreqs =3D repeat(0,26)
    for i =3D 1 to 26 do
        -- Lower and upper cases are considered to be the same letter.
        lLetterFreqs[i] =3D {vFrequencies[vCipherMap[i]] +
                           vFrequencies[lower(vCipherMap[i])],
                           vCipherMap[i]}
    end for
    lLetterFreqs =3D sort(lLetterFreqs)
    lFH =3D open("letters.txt","w")
    if lFH < 0 then
        AbortWithError("Failed to create the letters.txt file.")
    end if

    for i =3D 26 to 1 by -1 do
        printf(lFH, "%s %6d\n", {lLetterFreqs[i][2], =
lLetterFreqs[i][1]})
    end for

    close(lFH)


end procedure

Main()

------=_NextPart_000_02D9_01C1DD4B.86A795A0
Content-Type: application/octet-stream;
	name="comp2sub.e"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment;
	filename="comp2sub.e"

-- comp2.e
-- 5/Mar/2002 Derek Parnell
------------------
-- Specifications:
------------------
-- Read in a list of 51,798 English words and organize it somehow in
-- memory. You must use words.txt contained in Junko's spell checker in =
the
-- Archive. In Junko's dictionary, the words are in alphabetical order, =
one
-- per line, and are all in upper case.
-- Write a library routine that will return a sequence containing all =
words
-- in the dictionary that match a given pattern, e.g. given {1,1,2,3,4} =
as
-- the argument to your function, you'd return a sequence of all =
5-letter
-- words where the first 2 letters are the same and the next 3 are
-- different. The user can also specify certain specific letters,
-- e.g. {'E', 1, 2, 3, 1, 3} would get all 6-letter words beginning with
-- 'E' that have that pattern. Note in this case that 1, 2 and 3 =
represent
-- three distinct and different characters, none of which is 'E', and =
none
-- of which is the same as one of the others.
--
-- Assume that the numbers 0 to 32 are used to indicate the pattern, =
while
-- the numbers above 32 are the ASCII codes of literal characters to be
-- matched (including apostrophe and hyphen).
--
-- Correctness, and then speed decides the winner. Your program's total
-- time will include the time to read Junko's file, plus the time to
-- process 1000 different calls to your pattern-matching subroutine. The
-- test set will be created using random numbers, but everyone will be
-- given the same set.



---------------------
-- Assumptions:
---------------------
-- 1) Each element in the supplied pattern must be an integer.
-- 2) The routine can return error results. I choose the form {{-ve =
int,"msg"}}
-- 3) Pattern elements whose ASCII value is higher than SPACE will
--    be used as exact character matches.

without trace
without type_check
include get.e
include wildcard.e
include file.e
include misc.e
include machine.e

sequence vDictionary vDictionary =3D {}
sequence vKnownPatterns
sequence vDictLen
constant kNumBins =3D 9824*5-1
sequence vHash vHash =3D repeat(0, kNumBins)

atom vConvertPattern vConvertPattern =3D allocate(100) -- Note, this is =
never freed up.
sequence vBinVals vBinVals =3D {vConvertPattern,6}
sequence vMatrix vMatrix =3D {1,3,5,7,11,13}
atom hash1, hash2
sequence hashraw
-- The returns a standardised pattern sequence for any word.
-- eg.  "CARROT" becomes {1,2,3,3,4,5}
--      "THERE" becomes {1,2,3,4,3}
--      "HAPPY" becomes {1,2,3,3,4}
------------------------------------------------------
global function MakePattern(sequence pWord)
------------------------------------------------------
    sequence lWordPattern
    integer lPatternSet
    integer lPosn

    -- Validate and standardize the parameter.
    lWordPattern =3D repeat(0,length(pWord))
    lPatternSet =3D 0
    for i =3D 1 to length(pWord) do
        -- Each element must be an integer.
        if not integer(pWord[i]) then
            return {{-1, sprintf("Bad element in pattern at pos %d",i)}} =
-- Bad element in pattern
        end if

        lPosn =3D find(pWord[i], pWord)
        if lPosn < i then
            lWordPattern[i] =3D lWordPattern[lPosn]
        else
            lPatternSet +=3D 1
            lWordPattern[i] =3D lPatternSet
        end if

    end for

    return lWordPattern
end function

integer vPosn
integer vFH
integer vPatMax
sequence vWordPattern
integer vFileLen
integer vEOLChar,vEOLSize
sequence vFileBuf
sequence vEmpty
integer vWordStart, vWordEnd

---------------------------
-- Initialize the dictionary.
---------------------------
if platform() =3D LINUX then
    vEOLChar=3D10
    vEOLSize=3D1
else
    vEOLChar=3D13
    vEOLSize=3D2
end if

vFH =3D open("words.txt", "rb")
if vFH > 0 then
    -- Pull the whole file into RAM
    vFileLen =3D seek(vFH, -1)
    vFileLen =3D where(vFH)
    vFileBuf =3D repeat(0, vFileLen+1)
    if seek(vFH, 0) then end if
    for i =3D 1 to vFileLen do
        vFileBuf[i] =3D getc(vFH)
    end for
    close(vFH)
    vFileBuf[vFileLen+1]=3DvEOLChar

    vWordStart =3D 1
    vWordEnd =3D 2
    vEmpty =3D repeat(0,30)
    vKnownPatterns =3D repeat(0,10000)
    vDictionary =3D repeat(vEmpty,10000)
    vDictLen =3D repeat(0,10000)

    vPatMax =3D 0

    while vWordEnd < vFileLen do
        -- get next word
        while vFileBuf[vWordEnd] !=3D vEOLChar do
            vWordEnd+=3D1
        end while
        -- Find the pattern for this word.
        vWordPattern =3D MakePattern(vFileBuf[vWordStart..vWordEnd-1])
        -- BIG ASSUMPTION: word file has no errors.

        -- See if I've recorded this pattern before, and
        -- if not, add it to the list of known patterns.
        poke(vConvertPattern, repeat(0, 24))
        poke(vConvertPattern, vWordPattern)
        hashraw =3D peek4u(vBinVals)
        hashraw *=3D vMatrix
        hash1 =3D remainder(hashraw[1] + hashraw[2] +
                          hashraw[3] + hashraw[4] +
                          hashraw[5] + hashraw[6]
                        , kNumBins)+1

        if equal(vHash[hash1], 0) then
            vPatMax +=3D 1
            vPosn =3D vPatMax
            vKnownPatterns[vPosn] =3D vWordPattern
            vHash[hash1] =3D {vPosn, vWordPattern}
        else
            if equal(vWordPattern, vHash[hash1][2]) then
                vPosn =3D vHash[hash1][1]
            else
                hash2 =3D hash1
                hash1 +=3D 1
                vPosn =3D 0
                while hash1 <=3D kNumBins and not equal(vHash[hash1],0) =
do
                    if equal(vWordPattern, vHash[hash1][2]) then
                        vPosn =3D vHash[hash1][1]
                        exit
                    end if
                    hash1 +=3D 1
                end while
                if vPosn =3D 0 and hash1 > kNumBins then
                    hash1 =3D 1
                    while hash1 < hash2 and not equal(vHash[hash1],0) do
                        if equal(vWordPattern, vHash[hash1][2]) then
                            vPosn =3D vHash[hash1][1]
                            exit
                        end if
                        hash1 +=3D 1
                    end while
                end if
                if vPosn =3D 0 then
                    vPatMax +=3D 1
                    vPosn =3D vPatMax
                    vKnownPatterns[vPosn] =3D vWordPattern
                    vHash[hash1] =3D {vPosn, vWordPattern}
                end if
            end if
        end if

        -- At this point I should know the pattern group and
        -- pattern subindex. The word is inserted into the
        -- dictionary in a sublist. The sublist is referenced
        -- by pattern group (ie length) and pattern subindex.
        -- Thus all words that have the same pattern are put
        -- into the dictionary in the same sublist.
        -- No attempt is made to detect duplicated words.
        vDictLen[vPosn] +=3D 1
        hash1 =3D vDictLen[vPosn]
        if hash1 > length(vDictionary[vPosn]) then
            vDictionary[vPosn] &=3D vEmpty
        end if

        vDictionary[vPosn][hash1] =3D vFileBuf[vWordStart..vWordEnd-1]
        vWordStart =3D vWordEnd + vEOLSize
        vWordEnd =3D vWordStart+1

    end while

    -- Adjust to actual lengths used.
    vKnownPatterns =3D vKnownPatterns[1..vPatMax]
    vDictionary =3D vDictionary[1..vPatMax]
    for i =3D 1 to vPatMax do
        vDictionary[i] =3D vDictionary[i][1..vDictLen[i]]
    end for

    -- Make the buffer memory for reuse.
    vFileBuf =3D {}
end if


-- This returns a sequence containing all the words in the dictionary
-- that match the supplied pattern.
------------------------------------------------------
global function MatchedWords(sequence pPattern)
------------------------------------------------------
    sequence lFinds
    sequence lWordPattern
    sequence lExactChars
    integer lPosn
    integer lCnt
    sequence lWord

    -- If we still haven't got a dictionary, we are screwed.
    if length(vDictionary) =3D 0 then
        return {{-2, "No dictionary is available."}}
    end if

    -------------------
    -- Start searching...
    -------------------

    -- Validate and standardize the parameter.

    -- Create a standardized pattern, and check for errors.
    lWordPattern =3D MakePattern(pPattern)
    if length(lWordPattern) =3D 0 or
        (length(lWordPattern) =3D 1 and
         sequence(lWordPattern[1]) and
         lWordPattern[1][1] < 0) then
        return lWordPattern
    end if

    -- Build a list of characters to be matched later.
    lExactChars =3D {}
    for i =3D 1 to length(pPattern) do
        -- Is this also an 'exact' char?
        if pPattern[i] > ' ' then
            lExactChars =3D append(lExactChars, {pPattern[i], i})
        end if
    end for

    -- At this point, lWordPattern is the pattern in standard form,
    --                lExactChars is a list of positions we have
    --                to check for exact character matches

    -- Scan through the patterns in this group, looking for a matching =
one.
    poke(vConvertPattern, repeat(0, 24))
    poke(vConvertPattern, lWordPattern)
    hashraw =3D peek4u(vBinVals)
    hashraw *=3D vMatrix
    hash1 =3D remainder(hashraw[1] + hashraw[2] +
                      hashraw[3] + hashraw[4] +
                      hashraw[5] + hashraw[6]
                    , kNumBins)+1
    if equal(vHash[hash1], 0) then
        return {} -- No matching words in dictionary.
    end if

    if equal(lWordPattern, vHash[hash1][2]) then
        lPosn =3D vHash[hash1][1]
    else
        hash2 =3D hash1
        hash1 +=3D 1
        lPosn =3D 0
        while hash1 <=3D kNumBins and not equal(vHash[hash1],0) do
            if equal(lWordPattern, vHash[hash1][2]) then
                lPosn =3D vHash[hash1][1]
                exit
            end if
            hash1 +=3D 1
        end while
        if lPosn =3D 0 and hash1 > kNumBins then
            hash1 =3D 1
            while hash1 < hash2 and not equal(vHash[hash1],0) do
                if equal(lWordPattern, vHash[hash1][2]) then
                    lPosn =3D vHash[hash1][1]
                    exit
                end if
                hash1 +=3D 1
            end while
        end if
        if lPosn =3D 0 then
            return {}  -- We don't have any words for this pattern.
        end if
    end if


    -- No exact character matching to do, so just return the whole =
sublist
    -- of words.
    if length(lExactChars)  =3D 0 then
        return vDictionary[lPosn]
    end if

    -- I need to scan through each word, checking for exact character
    -- matches, according to the supplied pattern.
    lFinds =3D {}
    for i =3D 1 to length(vDictionary[lPosn]) do
        lCnt =3D 0
        lWord =3D vDictionary[lPosn][i]
        for j =3D 1 to length(lExactChars) do
            if lWord[lExactChars[j][2]] =3D lExactChars[j][1] then
                lCnt +=3D 1
            else
                exit -- Leave as soon as one match fails.
            end if
        end for
        -- If every required character is in its required position,
        -- then this is a word that I want.
        if lCnt =3D length(lExactChars) then
            lFinds =3D append(lFinds, vDictionary[lPosn][i])
        end if
    end for

    return lFinds
end function


------=_NextPart_000_02D9_01C1DD4B.86A795A0--

new topic » goto parent » topic index » view message » categorize

4. Re: contest #2 derek

Posted by petelomax at blueyonder.co.uk Apr 06, 2002
537 views

On Fri,  5 Apr 2002 21:38:53 +0000, "C. K. Lester"
<cklester at yahoo.com> wrote:

>
>
>petelomax at blueyonder.co.uk wrote:
>
>> I did actually post the best algorithm in #3...
>
>What was your approach? I've not looked at your code yet, but I'd be 
>interested in how you approached it. Have you seen my code? Any tweak 
>suggestions? :)
>
>> WOW!   Two tiny changes: 97.42%
>
>For me, one tiny change: 100%.

I'll agree that's a nice tweak, but, 100%? - you really ought to have
bit the bullet and picked just one answer (he says, ducking 

>From what has happened to both you and me, I begin to think the
competition should have allowed a chance to bug fix, especially
looking at one of the quotes Rob himself picked!

Anyway, congratulations, a fine program sir.

Pete

new topic » goto parent » topic index » view message » categorize

5. Re: contest #2 derek

Posted by petelomax at blueyonder.co.uk Apr 06, 2002
522 views

On Sat, 6 Apr 2002 09:14:59 +1000, Derek Parnell
<ddparnell at bigpond.com> wrote:

>Attached is the code I sent to Robert.
>
>After submitting this to Robert, I continued to experiment with
>optimisations and got it running about twice as fast again.

I'de definitely like to see that, since my program runs in 1.98, the
one you attached 5.17, so those optimisations would make it very close

>comp2sub.e is fast but it is biased towards patterns with zero or few known
>letters.
Thats exactly what I thought about my program.

Pete

new topic » goto parent » topic index » view message » categorize

6. Re: contest #2 derek

Posted by rforno at tutopia.com Apr 06, 2002
519 views

It its strange, but when I developed my code, I found that, for the special
case of the dictionary, it was faster to append than to replace items in
repeat(0, length).
----- Original Message -----
From: "George Orr" <georgeorr at donet.com>
To: "EUforum" <EUforum at topica.com>
Subject: RE: contest #2 derek


>
> Derek -
>
> Can you get Rob to post your code in the archives?  I only access this
> list through the web interface, and don't get attachments.  I was trying
> an approach that sounds a lot like yours, and am really interested in
> how you implemented it.  I think I overused appends, etc. and this cost
> me too much time in building the "dictionary".
>
> Thanks
>
>   George
>
> Derek Parnell wrote:
> > Attached is the code I sent to Robert.
> > ----------
> > Derek.
> >
> >
>
>
>

OpenEuphoria

1. contest #2 derek

2. Re: contest #2 derek

3. Re: contest #2 derek

4. Re: contest #2 derek

5. Re: contest #2 derek

6. Re: contest #2 derek

Search

Include:

Quick Links

User menu

Misc Menu