1. Contest

Derek:
I tried reading the words.txt file with getc() on a file opened "rb", instead
of gets() on a file opened "r", or else making only one of those two changes
and adjusting the routine slightly to suit the read mode, but no matter
what I did I got worse timings. I am testing in a DOS window under
Windows 98 SE.
Thanks for your help.


2. Contest

In the spirit of having fun, I have created a contest for us. It was
inspired by the recent discussion of such a contest, plus it's something
I wanted to do anyway.

The contest description can be found at 

  http://www.users.bigpond.com/ddparnell/contest1/rules.htm

and it doesn't start until Nov 1st, so the rules are not set in concrete
just yet. I invite discussion, clarifications and improvements until then.

The idea is to have a bit of fun and learn some good stuff along the way.

-- 
Derek Parnell
Melbourne, Australia


3. Re: Contest

Sounds like fun! Count me in...

ASCII 00-7F here:
http://www.cdrummond.qc.ca/cegep/informat/Professeurs/Alain/files/ascii.htm

Given all the discussion on lower() and upper() recently... the only
'word characters' are A-Z, a-z ? Just clarifying...

Any idea when you'll release a text file for us to play with?
-- 
MrTrick


4. Re: Contest

Looks like he already has, I got this link off the Rules page:

http://www.users.bigpond.com/ddparnell/contest1/file1.txt

~Greg


On Thu, 28 Oct 2004 14:51:42 +1000, Patrick Barnes <mrtrick at gmail.com> wrote:
> 
> Sounds like fun! Count me in...
> 
> ASCII 00-7F here:
> http://www.cdrummond.qc.ca/cegep/informat/Professeurs/Alain/files/ascii.htm
> 
> Given all the discussion on lower() and upper() recently... the only
> 'word characters' are A-Z, a-z ? Just clarifying...
> 
> Any idea when you'll release a text file for us to play with?
> --
> MrTrick
> 
>


5. Re: Contest

Oh dear... Derek, you certainly do have a sense of humour.


On Thu, 28 Oct 2004 01:23:32 -0400, Greg Haberek <ghaberek at gmail.com> wrote:
> 
> Looks like he already has, I got this link off the Rules page:
> 
> http://www.users.bigpond.com/ddparnell/contest1/file1.txt
> 
> ~Greg
> 
> 
> On Thu, 28 Oct 2004 14:51:42 +1000, Patrick Barnes <mrtrick at gmail.com>
> wrote:
> >
> > Sounds like fun! Count me in...
> >
> > ASCII 00-7F here:
> > http://www.cdrummond.qc.ca/cegep/informat/Professeurs/Alain/files/ascii.htm
> >
> > Given all the discussion on lower() and upper() recently... the only
> > 'word characters' are A-Z, a-z ? Just clarifying...
> >
> > Any idea when you'll release a text file for us to play with?
> > --
> > MrTrick
> >
> >


-- 
MrTrick


6. Re: Contest

Can any assumptions be made about maximum word length?
-- 
MrTrick
-------------------------------------------------------------------------------------
magnae clunes mihi placent, nec possum de hac re mentiri.


7. Re: Contest

Patrick Barnes wrote:
> 
> Sounds like fun! Count me in...
> 
> ASCII 00-7F here:
> http://www.cdrummond.qc.ca/cegep/informat/Professeurs/Alain/files/ascii.htm

Without quibbling over what is truly ASCII or not, you can rest assured
that the files only contain bytes in the range 0 to 127 inclusively.

> Given all the discussion on lower() and upper() recently... the only
> 'word characters' are A-Z, a-z ? Just clarifying...

Have another read of the 'rules' about what is a word. If you still 
want further clarification, ask again.

> Any idea when you'll release a text file for us to play with?

The web site now includes a link to download the first text file. It
is a copy of Shakespeare's Hamlet.
 
-- 
Derek Parnell
Melbourne, Australia


8. Re: Contest

Patrick Barnes wrote:
> 
> Can any assumptions be made about maximum word length?

Not really. One of the test files could be a C++ source code file.

-- 
Derek Parnell
Melbourne, Australia


9. Re: Contest

re:
(c) 10 points for getting the correct words in the most-frequent list.

since the contest is case insensitive, can the correct words be printed out in
any case?

Kat


10. Re: Contest

Kat wrote:
> 
> re:
> (c) 10 points for getting the correct words in the most-frequent list.
> 
> since the contest is case insensitive, can the correct words be printed out in
> any case?

Yes. I don't care about how you display the words, except that for ease of
judging, I'd like them formatted as per the rules. And I do need to be 
able to read them. I suggest that all uppercase or all lowercase would 
be the best option, but that's a call for the code author.

 
-- 
Derek Parnell
Melbourne, Australia


11. Re: Contest

Can we be assured that some stated minimum amount of RAM will be available to
the program?


12. Re: Contest

Andy Serpa wrote:
> 
> Can we be assured that some stated minimum amount of RAM will be available to
> the program?

It has to run on my machine at home and that has 512KB RAM 
and 2GB swap space.

This means that I can easily create arrays of 8MB, for example.

So I guess you can assume that I should be able to run most code, but
if I can't I'll let you know and we can discuss solutions then. 

If your program needs to use its own temporary files I can pretty well
tell you now that it won't win on speed.

-- 
Derek Parnell
Melbourne, Australia


13. Re: Contest

Derek Parnell wrote:
> It has to run on my machine at home and that has 512KB RAM 
> and 2GB swap space.

And how old is that machine? Can you even run Windows with only
512KB RAM? Now this is a real challenge!

--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com


14. Re: Contest

512 *KB*?  Under Windows?  That can't be right.  You can't even turn on Windows
with less than 64MB these days.  It is hard to imagine any system having less
than 256MB.

What OS, by the way?  Euphoria 2.4 under 98/ME has a number of idiosyncrasies
with memory allocation/deallocation that don't exist (much) under XP.  This was
caused when Rob switched to using the Windows malloc instead of the Watcom
malloc...


15. Re: Contest

Tommy Carlier wrote:
> 
> Derek Parnell wrote:
> > It has to run on my machine at home and that has 512KB RAM 
> > and 2GB swap space.
> 
> And how old is that machine? Can you even run Windows with only
> 512KB RAM? Now this is a real challenge!

The BIOS is 1998. It's a P-III 550MHz running Windows XP SP2.
It runs Windows apps just fine, with no noticeable slowness.

I can't run many of the newer games on it, but that's not what
this contest is about.

-- 
Derek Parnell
Melbourne, Australia


16. Re: Contest

Andy Serpa wrote:
> 
> 
> 512 *KB*?  

LOL....  Yes, I did really mean megabytes. 

>Under Windows?  That can't be right.  You can't even turn on Windows with
> less than 64MB these days.  It is hard to imagine any system having less than
> 256MB.
> 
> What OS, by the way?  Euphoria 2.4 under 98/ME has a number of idiosyncrasies
> with memory allocation/deallocation that don't exist (much) under XP.  This was
> caused when Rob switched to using the Windows malloc instead of the Watcom
> malloc...

Windows XP

-- 
Derek Parnell
Melbourne, Australia


17. Re: Contest

I've updated the 'rules' page with some further clarifications and I've
included the first results of my program.
 
  http://www.users.bigpond.com/ddparnell/contest1/rules.htm

-- 
Derek Parnell
Melbourne, Australia


18. Re: Contest

Derek Parnell wrote:
> Tommy Carlier wrote:
> > Derek Parnell wrote:
> > > It has to run on my machine at home and that has 512KB RAM 
> > > and 2GB swap space.
> > And how old is that machine? Can you even run Windows with only
> > 512KB RAM? Now this is a real challenge!
> The BIOS is 1998. It's a P-III 550MHz running Windows XP SP2.
> It runs Windows apps just fine, with no noticeable slowness.
> 
> I can't run many of the newer games on it, but that's not what
> this contest is about.

Well, there goes my OpenGL version.

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


19. Re: Contest

Derek Parnell wrote:
> 
> I've updated the 'rules' page with some further clarifications and I've
> included the first results of my program.

How do you know your results are accurate? Did you do a manual count? Are
those numbers published elsewhere?

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


20. Re: Contest

cklester wrote:
> 
> Derek Parnell wrote:
> > 
> > I've updated the 'rules' page with some further clarifications and I've
> > included the first results of my program.
> 
> How do you know your results are accurate? Did you do a manual count? Are
> those numbers published elsewhere?
> 

I don't know. I've run the program over some specific files where I had
done a manual count and the program displayed the correct values. So I
hope these are correct. If other people get different numbers then I'll
work out where I might have gone wrong and update the benchmark values
and any scores if required.

-- 
Derek Parnell
Melbourne, Australia


21. Re: Contest

Hi Derek,

>The next set of lines must list the most frequent words. 
>These are to be displayed in descending frequency count order 
>such that the most frequent is displayed first. 

What should we do if two or more words have the same frequency count?

Regards,

Phil


22. Re: Contest

Derek Parnell wrote:
> 
> Andy Serpa wrote:
> > 
> > Can we be assured that some stated minimum amount of RAM will be available
> > to the program?
> 
> It has to run on my machine at home and that has 512KB RAM 
> and 2GB swap space.
> 
> This means that I can easily create arrays of 8MB, for example.
> 
> So I guess you can assume that I should be able to run most code, but
> if I can't I'll let you know and we can discuss solutions then. 
> 
> If your program needs to use its own temporary files I can pretty well
> tell you now that it won't win on speed.
> 
> -- 
> Derek Parnell
> Melbourne, Australia
> 

Windows 95 is the oldest OS I can think of that supports 32 bit
apps without a huge upgrade. And Euphoria is 32 bit duh :P

I do believe Windows 95 requires at least 4 MB of RAM to run at all.
Also if your Windows 95 version is older than OSR2 then your file
system is limited to FAT16 thus 2 GB MAX size partitions.
I think you mean 512 MB on an older Pentium 3 system. Which is fine.
Not for games but just dandy for Euphoria. :)


23. Re: Contest

Derek Parnell wrote:
> 
> cklester wrote:
> > 
> > Derek Parnell wrote:
> > > 
> > > I've updated the 'rules' page with some further clarifications and I've
> > > included the first results of my program.
> > 
> > How do you know your results are accurate? Did you do a manual count? Are
> > those numbers published elsewhere?
> > 
> 
> I don't know. I've run the program over some specific files where I had
> done a manual count and the program displayed the correct values. So I
> hope these are correct. If other people get different numbers then I'll
> work out where I might have gone wrong and update the benchmark values
> and any scores if required.

I'm getting the same results as Derek posted on his page.

Matt Lewis


24. Re: Contest

On Thu, 28 Oct 2004 07:02:41 -0700, cklester <guest at RapidEuphoria.com>
wrote:

>How do you know your results are accurate? Did you do a manual count? Are
>those numbers published elsewhere?
I have independently verified them ;-))

Regards,
Pete


25. Re: Contest

Some questions:
1. How far can we go in using machine-code?
2. How far can we go in calling external DLLs/routines?

To make the contest interesting and competitive for everybody,
I think it would be a good idea to disallow machine-code, and
make it a pure Euphoria-contest.
Calling external DLLs/routines also means that it's not pure
Euphoria.
I think it would be a big challenge for everybody, if only
the use of Euphoria-code was allowed: no call, open_dll,
define_c_proc, define_c_func, c_proc or c_func.

--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com


26. Re: Contest

There might be a problem with using exw.exe:
If I open a Command Prompt, and run 'exw contest.exw file1.txt',
a new console is opened and closed very quickly. When I redirect
the output to a file, the file is empty. If I run it with ex.exe,
there is no problem.

--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com


27. Re: Contest

Do this:
    C:\EUPHORIA\BIN> copy exw.exe exwc.exe
    C:\EUPHORIA\BIN> exw makecon.exw exwc.exe

Voila! You have a console version of exw.exe!

OR

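include get.e   -- wait_key(), prompt_string()
include misc.e  -- sleep()
object junk
-- Add ONE of the following at the end of your program so the
-- console window stays open long enough to read the output.
-- use this (pause for 5 seconds):
    sleep(5)
-- or this (wait for any key press):
    junk = wait_key()
-- or this (wait for a character from standard input):
    junk = getc(0)
-- or this (wait for the user to press Enter):
    junk = prompt_string("Press Enter...")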


~Greg

On Thu, 28 Oct 2004 10:27:20 -0700, Tommy Carlier
<guest at rapideuphoria.com> wrote:
> 
> posted by: Tommy Carlier <tommy.carlier at telenet.be>
> 
> There might be a problem with using exw.exe:
> If I open a Command Prompt, and run 'exw contest.exw file1.txt',
> a new console is opened and closed very quickly. When I redirect
> the output to a file, the file is empty. If I run it with ex.exe,
> there is no problem.
> 
> 
> --
> tommy online: http://users.telenet.be/tommycarlier
> tommy.blog: http://tommycarlier.blogspot.com
> Euphoria Message Board: http://uboard.proboards32.com
> Empire for Euphoria: http://empire.iwireweb.com
> 
> 
> 
>


28. Re: Contest

Derek, I think "it's" and "its" should count for TWO words, because despite
sharing the same letters, they are certainly NOT the same word.

Might we encounter something like this:

"... Arbitrar-
ily..."

I'm getting 29548 words and 4984 unique, so I wonder what I'm missing.

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


29. Re: Contest

Found a problem paragraph:

   Ham. Then can each Actor on his Asse-
  Polon. The best Actors in the world, either for Tragedie,
Comedie, Historie, Pastorall:
Pastoricall-Comicall-Historicall-Pastorall:
Tragicall-Historicall: Tragicall-Comicall-Historicall-Pastorall:
Scene indiuidible: or Poem
vnlimited. Seneca cannot be too heauy, nor Plautus
too light, for the law of Writ, and the Liberty. These are
the onely men

Specifically, should we really count

   "Pastoricall-Comicall-Historicall-Pastorall"

as one word?

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


30. Re: Contest

One of the rules is: 'Any string longer than 20 characters, excluding
any quotes, is not a word.'

But when someone previously asked about the maximum word length, Derek
answered that 'no assumptions can be made about maximum word length'.

Isn't this a contradiction? Or did you mean that the files can contain
words of more than 20 characters, but these words aren't valid words?

--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com


31. Re: Contest

cklester wrote:
> Specifically, should we really count
> 
>    "Pastoricall-Comicall-Historicall-Pastorall"
> 
> as one word?

I would say no -- rules state any string longer than 20 chars excluding 
any quotes is not a word and that words containing hyphens are a single word.


32. Re: Contest

Jason Gade wrote:
> 
> cklester wrote:
> > Specifically, should we really count
> > 
> >    "Pastoricall-Comicall-Historicall-Pastorall"
> > 
> > as one word?
> 
> I would say no -- rules state any string longer than 20 chars excluding 
> any quotes is not a word and that words containing hyphens are a single word.

You're right, Jason. Thanks! I read that... I just forgot about it! :D

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


33. Re: Contest

cklester wrote:
> I'm getting 29548 words and 4984 unique, so I wonder what I'm missing.

I'm only getting 27873 words :(
Can a word span over multiple lines?
I'm only processing 1 line at a time, so if a word can span over
2 lines, I'm in trouble.

--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com


34. Re: Contest

Tommy Carlier wrote:
> 
> cklester wrote:
> > I'm getting 29548 words and 4984 unique, so I wonder what I'm missing.
> 
> I'm only getting 27873 words :(
> Can a word span over multiple lines?
> I'm only processing 1 line at a time, so if a word can span over
> 2 lines, I'm in trouble.

Rules state no line spanning: "A word cannot extend over line a boundary,"
meaning, "A word cannot extend over a line boundary." I guess! :)

I'm getting 29545 words and 4788 unique. I have 5 fewer 1-letter words
than the [supposedly accurate :) ] target, 3 fewer words in the 3-letter
words category, one more 4-letter word, and two more 5-letter words. That
makes up the '5' I'm missing from the total, but doesn't explain my
unique numbers.

Derek! Could you post an alphabetized list of ALL words and UNIQUE words
for the test file?

Thanks! :)

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


35. Re: Contest

cklester wrote:
> 
> 
> I'm getting 29545 words and 4788 unique.

P.S. I am getting the same counts for the top five words... :/

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


36. Re: Contest

Since I'm at work and I don't have Euphoria here, I've been playing with the 
test file and UNIX command line tools.

Using 'tr', 'wc', 'sort', and 'uniq' I get the following totals:

29555 words
4793 unique

Derek's:
29550 words
4794 unique

I can account for the difference in total words by counting
very-long-words-that-have-hyphens
and words that begin or end with a hyphen.

I don't know how to account for the 1 off discrepancy in unique words, though.
I would expect it to be at least 3 to 4 higher instead of 1 less.
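
A rough illustration of why a naive tally can over-count relative to the contest rules. It is written in Python rather than Euphoria and is not a reconstruction of the 'tr'/'sort'/'uniq' commands mentioned above; the function name tally and the character class used to split tokens are assumptions made only for this sketch.

```python
import re
from collections import Counter

def tally(text):
    """Return (total, unique) for a naive, rule-free tokenization."""
    # Assumption: a token is any run of letters, digits, apostrophes or hyphens.
    tokens = re.findall(r"[a-z0-9'-]+", text.lower())
    counts = Counter(tokens)
    return sum(counts.values()), len(counts)

# The lone "-" is counted as a token here, which the contest rules would reject.
print(tally("To be, or not to be - that is the question"))  # (11, 9)
```

Because nothing is filtered, stray hyphens and over-long hyphenated strings all land in the totals, which is the kind of difference described above.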


37. Re: Contest

I'm getting the same as Derek:

> 29550 words
> 4794 unique

Phil


38. Re: Contest

Phil Russell wrote:
> 
> I'm getting the same as Derek:
> 
> > 29550 words
> > 4794 unique

It's a conspiracy!

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


39. Re: Contest

Tommy Carlier wrote:
> 
> Some questions:
> 1. How far can we go in using machine-code?
> 2. How far can we go in calling external DLLs/routines?
> 
> To make the contest interesting and competitive for everybody,
> I think it would be a good idea to disallow machine-code, and
> make it a pure Euphoria-contest.
> Calling external DLLs/routines also means that it's not pure
> Euphoria.
> I think it would be a big challenge for everybody, if only
> the use of Euphoria-code was allowed: no call, open_dll,
> define_c_proc, define_c_func, c_proc or c_func.

Good point. I'll make this a stipulation. I did want to encourage Euphoria
code.

-- 
Derek Parnell
Melbourne, Australia


40. Re: Contest

cklester wrote:
> 
> Derek, I think "it's" and "its" should count for TWO words, because despite
> sharing the same letters, they are certainly NOT the same word.

If I were an English major or similar I'd agree. However, for the purposes of
this contest, quotes are valid word characters but they are ignored when
counting word length and when comparing words.

Thus "the kid's dogs" and "the kids, dogs" both yield the same three
words - "THE", "KIDS" and "DOGS".
 

> Might we encounter something like this:
> 
> "... Arbitrar-
> ily..."

Yes. And if so it would count as two words "ARBITRAR" and "ILY".
 
> I'm getting 29548 words and 4984 unique, so I wonder what I'm missing.

The correct algorithm ;-)

-- 
Derek Parnell
Melbourne, Australia
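
The quote-handling rule described above can be sketched as follows. This is a minimal illustration in Python, not Euphoria and not Derek's program. It assumes that "quotes" means the apostrophe character only, and the names normalize and word_length are hypothetical.

```python
def normalize(word):
    """Comparison form of a word: apostrophes dropped, then upper-cased."""
    # Assumption: only the ASCII apostrophe counts as a "quote".
    return word.replace("'", "").upper()

def word_length(word):
    """Length used for the length rules: apostrophes are not counted."""
    return len(word.replace("'", ""))

print(normalize("kid's") == normalize("kids"))  # True
print(normalize("it's"), word_length("it's"))   # ITS 3
```

Under this reading, "it's" and "its" fall into the same bucket and both count as three-letter words.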


41. Re: Contest

cklester wrote:
> 
> Found a problem paragraph:
> 
>    Ham. Then can each Actor on his Asse-
>   Polon. The best Actors in the world, either for Tragedie,
> Comedie, Historie, Pastorall:
> Pastoricall-Comicall-Historicall-Pastorall:
> Tragicall-Historicall: Tragicall-Comicall-Historicall-Pastorall:
> Scene indiuidible: or Poem
> vnlimited. Seneca cannot be too heauy, nor Plautus
> too light, for the law of Writ, and the Liberty. These are
> the onely men
> 
> Specifically, should we really count
> 
>    "Pastoricall-Comicall-Historicall-Pastorall"
> 
> as one word?

It is one potential word; however, because it is longer than 20 characters,
it is deemed not to be a word. Embedded hyphens are part of a word; they
are not word delimiters.

-- 
Derek Parnell
Melbourne, Australia


42. Re: Contest

Tommy Carlier wrote:
> 
> One of the rules is: 'Any string longer than 20 characters, excluding
> any quotes, is not a word.'
> 
> But when someone previously asked about the maximum word length, Derek
> answered that 'no assumptions can be made about maximum word length'.
> 
> Isn't this a contradiction? Or did you mean that the files can contain
> words of more than 20 characters, but these words aren't valid words?

The question about word length was asked before I made the clarification
in the rules. Initially I made no restriction but after that question
I decided to make a 20-character limit. 

Yes, a file may contain very large strings but if > 20 characters we
are calling them "not words".

The rules are still in flux until the contest starts. So don't get too
stuck with how they appear today as there could still be refinements
to come. And don't anyone get annoyed about that either because I've
given all a fair warning. This is close to real-world in which the
bosses often make changes to the specification after coding has started.
I will not change the rules (unless you all say otherwise) once
the contest starts.

-- 
Derek Parnell
Melbourne, Australia


43. Re: Contest

Tommy Carlier wrote:
> 
> cklester wrote:
> > I'm getting 29548 words and 4984 unique, so I wonder what I'm missing.
> 
> I'm only getting 27873 words :(
> Can a word span over multiple lines?
> I'm only processing 1 line at a time, so if a word can span over
> 2 lines, I'm in trouble.

A word, in the definition of this contest, can never span multiple lines.
An end-of-line marker is a word delimiter.

-- 
Derek Parnell
Melbourne, Australia


44. Re: Contest

cklester wrote:
> 
> Tommy Carlier wrote:
> > 
> > cklester wrote:
> > > I'm getting 29548 words and 4984 unique, so I wonder what I'm missing.
> > 
> > I'm only getting 27873 words :(
> > Can a word span over multiple lines?
> > I'm only processing 1 line at a time, so if a word can span over
> > 2 lines, I'm in trouble.
> 
> Rules state no line spanning: "A word cannot extend over line a boundary,"
> meaning, "A word cannot extend over a line boundary." I guess! :)
> 
> I'm getting 29545 words and 4788 unique. I have 5 fewer 1-letter words
> than the [supposedly accurate :) ] target, 3 fewer words in the 3-letter
> words category, one more 4-letter word, and two more 5-letter words. That
> makes up the '5' I'm missing from the total, but doesn't explain my
> unique numbers.

Obviously you haven't quite got the right algo yet.  Double check each 
of the rules again. 

You might be tripping up with the leading-trailing hyphen rule(?) Or
with words with mixed digits and alphabetics.

> Derek! Could you post an alphabetized list of ALL words and UNIQUE words
> for the test file?

Yes, but that might be too much of a help. If you really get stuck by
this time tomorrow I'll consider this request more sympathetically.

-- 
Derek Parnell
Melbourne, Australia


45. Re: Contest

cklester wrote:
> 
> Phil Russell wrote:
> > 
> > I'm getting the same as Derek:
> > 
> > > 29550 words
> > > 4794 unique
> 
> It's a conspiracy!

You could be right. All the people who have the same count as me are 
obviously all wrong too ;-)

-- 
Derek Parnell
Melbourne, Australia


46. Re: Contest

I can say that the other four files I'll test against are very much
larger than the one 'calibration' file you have now. They range from
1.2MB to 4.4MB.

-- 
Derek Parnell
Melbourne, Australia


47. Re: Contest

On 28 Oct 2004, at 16:10, Derek Parnell wrote:

> 
> 
> posted by: Derek Parnell <ddparnell at bigpond.com>
> 
> Tommy Carlier wrote:
> > 
> > cklester wrote:
> > > I'm getting 29548 words and 4984 unique, so I wonder what I'm missing.
> > 
> > I'm only getting 27873 words :(
> > Can a word span over multiple lines?
> > I'm only processing 1 line at a time, so if a word can span over
> > 2 lines, I'm in trouble.
> 
> A word, in the definition of this contest, can never span multiple lines.
> An end-of-line marker is a word delimiter.

Given:

...arbit-
trarily...

is the first word "arbit-" or "arbit" ?

Kat


48. Re: Contest

Phil Russell wrote:
> 
> Hi Derek,
> 
> >The next set of lines must list the most frequent words. 
> >These are to be displayed in descending frequency count order 
> >such that the most frequent is displayed first. 
> 
> What should we do if two or more words have the same frequency count?

Ok, I should clarify this.

The output is to be sorted in Frequency Count (descending) and Word Text
(ascending). You must only display the required number of output lines. 
Sorting of Word Text is to ignore case and ignore quotes.

Thus if you calculate that you need to display the top 15 frequencies and
after sorting the 15th and 16th have the same frequency count, you only
display the 15th and you do not display the 16th.

-- 
Derek Parnell
Melbourne, Australia
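
The ordering and cut-off rule above can be sketched like this. It is a Python illustration only (not Euphoria, and not the judging program). The names top_words and sort_key are hypothetical, and "quotes" is assumed to mean apostrophes.

```python
def top_words(counts, n):
    """Return the n most frequent (word, count) pairs.

    Sorted by count descending, then by word text ascending,
    ignoring case and apostrophes. Ties at the cut-off are simply dropped.
    """
    def sort_key(item):
        word, count = item
        return (-count, word.replace("'", "").upper())

    return sorted(counts.items(), key=sort_key)[:n]

counts = {"the": 10, "and": 7, "Ham": 7, "it's": 7, "of": 3}
print(top_words(counts, 3))  # [('the', 10), ('and', 7), ('Ham', 7)]
```

Here "it's" ties with "and" and "Ham" on count but sorts after them, so it falls outside the three requested lines and is not displayed.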


49. Re: Contest

Kat wrote:
> 
> On 28 Oct 2004, at 16:10, Derek Parnell wrote:
> 
> > 
> > posted by: Derek Parnell <ddparnell at bigpond.com>
> > 
> > Tommy Carlier wrote:
> > > 
> > > cklester wrote:
> > > > I'm getting 29548 words and 4984 unique, so I wonder what I'm missing.
> > > 
> > > I'm only getting 27873 words :(
> > > Can a word span over multiple lines?
> > > I'm only processing 1 line at a time, so if a word can span over
> > > 2 lines, I'm in trouble.
> > 
> > A word, in the definition of this contest, can never span multiple lines.
> > An end-of-line marker is a word delimiter.
> 
> Given:
> 
> ...arbit-
> trarily...
> 
> is the first word "arbit-" or "arbit" ?

I'm sorry for the confusion. In a previous reply I gave the wrong
information.

The rule is that words that end or start with a hyphen are deemed
as not being words. Thus in your example "arbit-" is not a word 
and is thus ignored. However, "trarily" would be deemed a word.

Is everyone happy with this situation or do you wish to be 
more respecting of English typography?

-- 
Derek Parnell
Melbourne, Australia


50. Re: Contest

To ensure that I am playing fair, I'll be posting my program's source code
on the web page at the start of the contest.

It will be encrypted but you are welcome to download it as I will
be disclosing the pass phrase at the end of the contest. So you can 
then decrypt it and inspect it against the source code revealed at the
end of the contest.

You can also try to crack the encryption, for extra bonus points ;-)

-- 
Derek Parnell
Melbourne, Australia


51. Re: Contest

On Thu, 28 Oct 2004 16:02:07 -0700, Derek Parnell wrote:
> The question about word length was asked before I made the clarification
> in the rules. Initially I made no restriction but after that question
> I decided to make a 20-character limit.
> 
> Yes, a file may contain very large strings but if > 20 characters we
> are calling them "not words".

So, if I passed a file containing only:
"Pastoricall-Comicall-Historicall-Pastorall is not a word"

Does it contain only 4 words? ( 'is', 'not', 'a', 'word' )


Also, if I pass a file containing only:
"test1-     -test2      t-e-s-t--3"

Does it contain ('test1', 'test2', 't-e-s-t--3' ) ? 
Does a leading or trailing '-' invalidate the entire word? 
Are sequential multiples of  '-' ok, as long as they're not leading or trailing?

-- 
MrTrick


52. Re: Contest

Patrick Barnes wrote:
> 
> On Thu, 28 Oct 2004 16:02:07 -0700, Derek Parnell wrote:
> > The question about word length was asked before I made the clarification
> > in the rules. Initially I made no restriction but after that question
> > I decided to make a 20-character limit.
> > 
> > Yes, a file may contain very large strings but if > 20 characters we
> > are calling them "not words".
> 
> So, if I passed a file containing only:
> "Pastoricall-Comicall-Historicall-Pastorall is not a word"
> 
> Does it contain only 4 words? ( 'is', 'not', 'a', 'word' )

Yes, it contains just those four words. 

> 
> Also, if I pass a file containing only:
> "test1-     -test2      t-e-s-t--3"
> 
> Does it contain ('test1', 'test2', 't-e-s-t--3' ) ? 

No. It only contains one word, namely 't-e-s-t--3'.

> Does a leading or trailing '-' invalidate the entire word? 

Yes.

> Are sequential multiples of  '-' ok, as long as they're not leading or
> trailing?

Yes. Embedded hyphens are treated as word characters and not word
delimiters. Note that a string such as "-te-st-wo--rd" is not a word
because of the leading hyphen. 

If people wish me to change this rule, I'm happy to do it, but I need
a "show of hands" first.

-- 
Derek Parnell
Melbourne, Australia

new topic     » goto parent     » topic index » view message » categorize

53. Re: Contest

Derek Parnell wrote:

> Embedded hyphens are treated as word characters and not word
> delimiters. Note that a string such as "-te-st-wo--rd" is not a word
> because of the leading hyphen. 

Would "a-m" be a three letter word, or do we filter out the dash?

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

new topic     » goto parent     » topic index » view message » categorize

54. Re: Contest

On 28 Oct 2004, at 19:32, cklester wrote:

> 
> 
> posted by: cklester <cklester at yahoo.com>
> 
> Derek Parnell wrote:
> 
> > Embedded hyphens are treated as word characters and not word
> > delimiters. Note that a string such as "-te-st-wo--rd" is not a word
> > because of the leading hyphen. 
> 
> Would "a-m" be a three letter word, or do we filter out the dash?

3 letters. You filter out the ' in a'm. Or not: it's filtered as a word, to
become am, but I don't know if we are to display that we found am or a'm.

Kat

new topic     » goto parent     » topic index » view message » categorize

55. Re: Contest

> > Would "a-m" be a three letter word, or do we filter out the dash?

Further, the rules say "'A-12-section' is a word." Is it a 12-letter word
or a 10-letter word? Do we keep the dashes in the word, or are "a-m" and 
"a--m" and "a---m" the same word, all counted as length 2?

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

new topic     » goto parent     » topic index » view message » categorize

56. Re: Contest

cklester wrote:
> 
> Derek Parnell wrote:
> 
> > Embedded hyphens are treated as word characters and not word
> > delimiters. Note that a string such as "-te-st-wo--rd" is not a word
> > because of the leading hyphen. 
> 
> Would "a-m" be a three letter word, or do we filter out the dash?

What don't you understand by "embedded hyphens are treated as word
characters and not word delimiters"?  Is the "-" in "a-m" an embedded
hyphen? Yes. Therefore it is a word character. Thus the string "a-m" is a
3-character word.

You never filter out hyphens. The only strings that contain hyphens and
are *NOT* words are those strings that begin with a hyphen, strings that
end with a hyphen, and strings that only contain digits and hyphens.

"a-m" does not fall into any of those three groups.

-- 
Derek Parnell
Melbourne, Australia

new topic     » goto parent     » topic index » view message » categorize
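
A rough sketch of the hyphen and length rules as Derek states them in this thread, written in Python rather than Euphoria purely for illustration. The names TOKEN, MAX_LEN, is_word and words are made up for this sketch. It assumes tokens are split on anything that is not a letter, digit, hyphen or single quote, and it does not try to cover rules that are not spelled out here (for example digit-only strings, or whether quotes count toward the 20-character limit).

```python
import re

# Word characters per the quoted rules: A-Z, a-z, 0-9, hyphen and single quote.
# Everything else is treated as a delimiter (an assumption of this sketch).
TOKEN = re.compile(r"[A-Za-z0-9'-]+")

MAX_LEN = 20  # the 20-character limit mentioned in the thread


def is_word(token):
    """Apply only the length and hyphen rules discussed in this thread."""
    if len(token) > MAX_LEN:
        return False  # longer strings are "not words"
    if token.startswith("-") or token.endswith("-"):
        return False  # a leading or trailing hyphen invalidates the whole string
    if "-" in token and all(c.isdigit() or c == "-" for c in token):
        return False  # only digits and hyphens
    return True  # embedded hyphens are kept as word characters


def words(text):
    """Split text into candidate tokens and keep the ones that pass is_word."""
    return [t for t in TOKEN.findall(text) if is_word(t)]
```

With this sketch, words("test1-     -test2      t-e-s-t--3") keeps only 't-e-s-t--3', and the "Pastoricall-..." example keeps just 'is', 'not', 'a' and 'word', which matches Derek's answers above.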

57. Re: Contest

cklester wrote:
> 
> > > Would "a-m" be a three letter word, or do we filter out the dash?
> 
> Further, the rules say "'A-12-section' is a word." Is it a 12-letter word
> or a 10-letter word? 

It is a 12-letter word.

>Do we keep the dashes in the word, or are "a-m" and 
> "a--m" and "a---m" the same word, all counted as length 2?

Yes, keep the dashes. These are three different words, of lengths 3,4,
and 5 respectively.

-- 
Derek Parnell
Melbourne, Australia

new topic     » goto parent     » topic index » view message » categorize

58. Re: Contest

Derek Parnell wrote:
> 
> cklester wrote:
> > 
> > Derek Parnell wrote:
> > 
> > > Embedded hyphens are treated as word characters and not word
> > > delimiters. Note that a string such as "-te-st-wo--rd" is not a word
> > > because of the leading hyphen. 
> > 
> > Would "a-m" be a three letter word, or do we filter out the dash?
> 
> What don't you understand by "embedded hyphens are treated as word
> characters and not word delimiters"?

Because you say this in the rules:

"A word is defined as a string of adjacent characters composed entirely
from the alphabetic characters A-Z and a-z, the digits 0-9, and
punctuation characters hyphen '-' and single quote "'"."

Okay, '-' and "'" are valid string characters.

Then you say:

"For the purposes of comparison and display, QUOTES ARE IGNORED in any
word. For example "it's" and "its" are the same, "'heaven'" and "heaven"
are the same word. A word's length does not include any quotes in the
count."

Okay, so despite the "'" being valid, we're supposed to strip it for
purposes of counting word length. However, we DON'T strip "-" for
purposes of counting word length.

So "it's" and "a-m" are words of the same length.

Doesn't seem consistent.

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

new topic     » goto parent     » topic index » view message » categorize

59. Re: Contest

cklester wrote:
> 
> Derek Parnell wrote:
> > 
> > cklester wrote:
> > > 
> > > Derek Parnell wrote:
> > > 
> > > > Embedded hyphens are treated as word characters and not word
> > > > delimiters. Note that a string such as "-te-st-wo--rd" is not a word
> > > > because of the leading hyphen. 
> > > 
> > > Would "a-m" be a three letter word, or do we filter out the dash?
> > 
> > What don't you understand by "embedded hyphens are treated as word
> > characters and not word delimiters"?
> 
> Because you say this in the rules:
> 
> "A word is defined as a string of adjacent characters composed entirely
> from the alphabetic characters A-Z and a-z, the digits 0-9, and
> punctuation characters hyphen '-' and single quote "'"."
> 
> Okay, '-' and "'" are valid string characters.
>

Well, technically they are valid *word* characters. That is to say, they
do not delimit one word from another, such as spaces and commas do, for
example.

> Then you say:
> 
> "For the purposes of comparison and display, QUOTES ARE IGNORED in any
> word. For example "it's" and "its" are the same, "'heaven'" and "heaven"
> are the same word. A word's length does not include any quotes in the
> count."
> 
> Okay, so despite the "'" being valid, we're supposed to strip it for
> purposes of counting word length. However, we DON'T strip "-" for
> purposes of counting word length.
> 
> So "it's" and "a-m" are words of the same length.
> 
> Doesn't seem consistent.

Oh well. You'll get over it. No-one told me I had to be consistent with
respect to quotes and hyphens. I had to throw in a few quirks to make it
a little more than a toy application. How easy did you want it to be ;)

-- 
Derek Parnell
Melbourne, Australia

new topic     » goto parent     » topic index » view message » categorize

60. Re: Contest

On 28 Oct 2004, at 20:28, Derek Parnell wrote:

> 
> 
> posted by: Derek Parnell <ddparnell at bigpond.com>
> 
> cklester wrote:
> > 
> > > > Would "a-m" be a three letter word, or do we filter out the dash?
> > 
> > Further, the rules say "'A-12-section' is a word." Is it a 12-letter word or
> > a 10-letter word? 
> 
> It is a 12-letter word.
> 
> >Do we keep the dashes in the word, or are "a-m" and 
> > "a--m" and "a---m" the same word, all counted as length 2?
> 
> Yes, keep the dashes. These are three different words, of lengths 3,4,
> and 5 respectively.

Ok, but given :
am
a'm

they are equal, but is one a 2 letter and one a 3 letter word? I guess what i 
am asking is which sort/parse is to occur first? Convert "a'm" to 2 letters, 
then count it as "am", or count it as 3 letters for the purpose of word freq, 
and then remove the ' so it matches "am" from then on?

Kat

new topic     » goto parent     » topic index » view message » categorize

61. Re: Contest

Kat wrote:
> 
> On 28 Oct 2004, at 20:28, Derek Parnell wrote:
> 
> > 
> > posted by: Derek Parnell <ddparnell at bigpond.com>
> > 
> > cklester wrote:
> > > 
> > > > > Would "a-m" be a three letter word, or do we filter out the dash?
> > > 
> > > Further, the rules say "'A-12-section' is a word." Is it a 12-letter word
> > > or a 10-letter word? 
> > 
> > It is a 12-letter word.
> > 
> > >Do we keep the dashes in the word, or are "a-m" and 
> > > "a--m" and "a---m" the same word, all counted as length 2?
> > 
> > Yes, keep the dashes. These are three different words, of lengths 3,4,
> > and 5 respectively.
> 
> Ok, but given :
> am
> a'm
> 
> they are equal, but is one a 2 letter and one a 3 letter word? I guess what i 
> am asking is which sort/parse is to occur first? Convert "a'm" to 2 letters, 
> then count it as "am", or count it as 3 letters for the purpose of word freq, 
> and then remove the ' so it matches "am" from then on?

"am" and "a'm" are exactly equivalent to "AM". It's up to you to
work out how to implement that contest fact. That's the 'contest'
part.


-- 
Derek Parnell
Melbourne, Australia

new topic     » goto parent     » topic index » view message » categorize

62. Re: Contest

Kat wrote:

> Ok, but given :
> am
> a'm
> 
> they are equal, but is one a 2 letter and one a 3 letter word? I guess what i 
> am asking is which sort/parse is to occur first? Convert "a'm" to 2 letters, 
> then count it as "am", or count it as 3 letters for the purpose of word freq, 
> and then remove the ' so it matches "am" from then on?
> 
> Kat

From the rules:

4. For the purposes of comparison and display, quotes are ignored in any word.
For example "it's" and "its" are the same ... A word's length does not include
any quotes in the count.

It took me a few readings of the rules to get it all down as well.

new topic     » goto parent     » topic index » view message » categorize
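
One possible reading of rule 4, sketched in Python for illustration only: strip the quotes and fold case before counting, so the length is taken from the normalized form. The function names normalize and tally are hypothetical, and using lower case as the display form is an assumption based on the sample output posted in this thread.

```python
from collections import Counter


def normalize(word):
    """Drop single quotes and fold case; hyphens are left in place."""
    return word.replace("'", "").lower()


def tally(words):
    """Return (frequency per normalized word, occurrence count per word length)."""
    freq = Counter(normalize(w) for w in words)
    by_length = Counter()
    for word, count in freq.items():
        # Length is measured after the quotes are removed, so "it's" counts as 3.
        by_length[len(word)] += count
    return freq, by_length
```

Under this reading "am", "a'm" and "AM" all land in one bucket of length 2, while "it's" and "a-m" both have length 3, which is exactly the inconsistency cklester pointed out.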

63. Re: Contest

I'm apparently misunderstanding something because, for instance, you show
five more one-letter words in the count than I do. My unique one-letter
words are i, a, o, t, k, y.

Do you have a longer list than that?

Here's my output (the items in parentheses are how much I'm off of yours):

c.k.lester file1.txt
Total:   29545, Unique:    4788
01 the                  992 (0)
02 and                  862 (0)
03 to                   683 (0)
04 of                   608 (0)
05 i                    547 (0)
01 1088 (-5)
02 5266 (0)
03 6941 (-3)
04 6113 (1)
05 3759 (2)
06 2554 (0)
07 1633 (0)
08 1004 (0)
09 588 (0)
10 331 (0)
11 151 (0)
12 63 (0)
13 40 (0)
14 9 (0)
15 3 (0)
16 2 (0)
Elapsed time: 0.240000

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

new topic     » goto parent     » topic index » view message » categorize

64. Re: Contest

On 28 Oct 2004, at 21:33, cklester wrote:

<snip>

> Elapsed time: 0.240000

I can't get the word file loaded in that time. Or in ten times that time.

Kat,
still clusterless,
still with 1996 technology.

new topic     » goto parent     » topic index » view message » categorize

65. Re: Contest

Derek, 
"ASCII text characters #00 - #7F." ... Might there be embedded nulls
in the text files?
-- 
MrTrick
-------------------------------------------------------------------------------------
magnae clunes mihi placent, nec possum de hac re mentiri.

new topic     » goto parent     » topic index » view message » categorize

66. Re: Contest

On Fri, 29 Oct 2004 00:29:15 -0500, Kat <gertie at visionsix.com> wrote:
> > Elapsed time: 0.240000
> 
> I can't get the word file loaded in that time. Or in ten times that time.

Don't worry... cklester may just have a Quad Xeon system to your pentium 1.
In the interests of the contest though, how about none of us post
times until after the contest is over?
They're not going to be official, because it depends on the machine
it's run on. And there are some of us who don't want to be shown up
before we can even submit our own code getlost

-- 
MrTrick

new topic     » goto parent     » topic index » view message » categorize

67. Re: Contest

Derek Parnell wrote:

> To ensure that I am playing fair, I'll be posting my program's source code
> on the web page at the start of the contest.
>
> It will be encrypted but you are welcome to download it as I will
> be disclosing the pass phrase at the end of the contest. So you can
> then decrypt it and inspect it against the source code revealed at the
> end of the contest.
>
> You can also try to crack the encryption, for extra bonus points ;)

Please use ROT-13 for the encryption! ;o)

Regards,
   Juergen

-- 
 /"\  ASCII ribbon campaign |  This message has been ROT-13 encrypted
 \ /  against HTML in       |  twice for higher security.
  X   e-mail and news,      |
 / \  and unneeded MIME     |  http://home.arcor.de/luethje/prog/

new topic     » goto parent     » topic index » view message » categorize

68. Re: Contest

Derek Parnell wrote:

<snip>

>> So "it's" and "a-m" are words of the same length.
>>
>> Doesn't seem consistent.
>
> Oh well. You'll get over it. No-one told me I had to be consistent with
> respect to quotes and hyphens. I had to throw in a few quirks to make it
> a little more than a toy application. How easy did you want it to be ;)

Keep the rules as they are. They are well designed.

Regards,
   Juergen

new topic     » goto parent     » topic index » view message » categorize

69. Re: Contest

Patrick Barnes wrote:
> 
> Derek, 
> "ASCII text characters #00 - #7F." ... Might there be embedded nulls
> in the text files?

Yes.

-- 
Derek Parnell
Melbourne, Australia

new topic     » goto parent     » topic index » view message » categorize

70. Re: Contest

Patrick Barnes wrote:
> 
> On Fri, 29 Oct 2004 00:29:15 -0500, Kat <gertie at visionsix.com> wrote:
> > > Elapsed time: 0.240000
> > 
> > I can't get the word file loaded in that time. Or in ten times that time.
> 
> Don't worry... cklester may just have a Quad Xeon system to your pentium 1.
> In the interests of the contest though, how about none of us post
> times until after the contest is over?
> They're not going to be official, because it depends on the machine
> it's run on. And there are some of us who don't want to be shown up
> before we can even submit our own code getlost

It is almost irrelevant how fast or slow it runs on your machine, as
all the scoring is done by running on my machine. Any times you
publish based on running it on your machine are meaningless from
the point of view of the contest.


-- 
Derek Parnell
Melbourne, Australia

new topic     » goto parent     » topic index » view message » categorize

71. Re: Contest

On Fri, 29 Oct 2004 00:10:13 -0700, Derek Parnell
<guest at rapideuphoria.com> wrote:
> It is almost irrelevant how fast or slow it runs on your machine, as
> all the scoring is done by running on my machine. 

It would be interesting if (not affecting the contest) you had access
to another computer, to see if the relationship between the different
submissions performance was constant, no matter what speed the
hardware ran at.

> Any times you
> publish based on running it on your machine are meaningless from
> the point of view of the contest.

From the point of view of the contest, I know... but not so much from
the social point of view. I'd much rather not know that someone else
managed to get their code running 10x faster than mine.

-- 
MrTrick

new topic     » goto parent     » topic index » view message » categorize

72. Re: Contest

Patrick Barnes wrote:
> 
> On Fri, 29 Oct 2004 00:10:13 -0700, Derek Parnell
> <guest at rapideuphoria.com> wrote:
> > It is almost irrelevant how fast or slow it runs on your machine, as
> > all the scoring is done by running on my machine. 
> 
> It would be interesting if (not affecting the contest) you had access
> to another computer, to see if the relationship between the different
> submissions performance was constant, no matter what speed the
> hardware ran at.
> 

Yeah, I could do that.

-- 
Derek Parnell
Melbourne, Australia

new topic     » goto parent     » topic index » view message » categorize

73. Re: Contest

Kat wrote:
> 
> 
> > Elapsed time: 0.240000
> 
> I can't get the word file loaded in that time. Or in ten times that time.

LOL! Yeah, I'm on an Athlon XP CPU so take it with a grain of salt. Whatever
that means. :)

Derek said he'd be releasing his code so we could run it on our own
machines for a more relative comparison. I think he said that. :)

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

new topic     » goto parent     » topic index » view message » categorize

74. Re: Contest

Patrick Barnes wrote:
> 
> In the interests of the contest though, how about none of us post
> times until after the contest is over?

I agree. That was my slip-up. Sorry!

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

new topic     » goto parent     » topic index » view message » categorize

75. Re: Contest

cklester wrote:
> 
> Kat wrote:
> > 
> > 
> > > Elapsed time: 0.240000
> > 
> > I can't get the word file loaded in that time. Or in ten times that time.
> 
> LOL! Yeah, I'm on an Athlon XP CPU so take it with a grain of salt. Whatever
> that means. :)
> 
> Derek said he'd be releasing his code so we could run it on our own
> machines for a more relative comparison. I think he said that. :)
> 

Yes I will. At the start of the contest you can grab my encrypted source,
and at the end of the contest you can grab my pass-phrase and/or my
unencrypted source.

-- 
Derek Parnell
Melbourne, Australia

new topic     » goto parent     » topic index » view message » categorize

76. Re: Contest

cklester wrote:
> 
> Patrick Barnes wrote:
> > 
> > In the interests of the contest though, how about none of us post
> > times until after the contest is over?
> 
> I agree. That was my slip-up. Sorry!

Just to clarify, I intend to publish the run-times that I get for
each submission. I'll do this as soon as possible after receiving
the submission. 

My thinking behind this is that everyone can then
see how they are doing in a relative way and then they can decide 
whether or not to enter a new submission. Everyone can enter as 
many programs as they want during the contest period.

If anyone does *not* want me to publish their program's run-times,
please let me know at the time you submit it.

-- 
Derek Parnell
Melbourne, Australia

new topic     » goto parent     » topic index » view message » categorize

77. Re: Contest

Derek Parnell wrote:
> 
> Just to clarify, I intend to publish the run-times that I get for
> each submission. I'll do this as soon as possible after receiving
> the submission. 

That's appropriate because they will all be relative and based on the
same PC. My publishing my time here ON MY PC was not useful.

> My thinking behind this is that everyone can then
> see how they are doing in a relative way and then they can decide 
> whether or not to enter a new submission. Everyone can enter as 
> many programs as they want during the contest period.

If we just make a modification to an already submitted program, will you
remove the prior version?

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

P.S. Where's that word list? :)

new topic     » goto parent     » topic index » view message » categorize

78. Re: Contest

cklester wrote:
> 
> Derek Parnell wrote:
> > 
> > Just to clarify, I intend to publish the run-times that I get for
> > each submission. I'll do this as soon as possible after receiving
> > the submission. 
> 
> That's appropriate because they will all be relative and based on the
> same PC. My publishing my time here ON MY PC was not useful.
> 
> > My thinking behind this is that everyone can then
> > see how they are doing in a relative way and then they can decide 
> > whether or not to enter a new submission. Everyone can enter as 
> > many programs as they want during the contest period.
> 
> If we just make a modification to an already submitted program, will you
> remove the prior version?

If that's what you'd personally like, yes. But that's not what I was
intending to do. I thought it would be nice to see people improving
on their personal best times.

> P.S. Where's that word list? :)

Are you still having trouble getting your program to give you
the figures I published?

-- 
Derek Parnell
Melbourne, Australia

new topic     » goto parent     » topic index » view message » categorize

79. Re: Contest

Derek Parnell wrote:
> cklester wrote:
> > P.S. Where's that word list? :)
> Are you still having trouble getting your program to give you
> the figures I published?

Unfortunately, yes. :/

But, as you can see from my current results, I'm not that far off! :)

I'm also going to download the test file again, in case that's gotten
corrupted.

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

new topic     » goto parent     » topic index » view message » categorize

80. Re: Contest

I like the rules. But that's probably because I've already re-written
my code 3 times, and each time I hit spot on to Derek's results. :)

I guess this contest is more of an exercise in logic than anything.
It's kinda like those assignments we got in elementary school that say
"Read all directions first before doing anything!" And I always found
kids asking me for a purple crayon because step number 7 says "draw a
circle on the back of this paper with purple crayon" when the last
step says "do nothing but write your name on the top and turn this in"
and I'm done in 2 minutes.

Read the directions first people! Don't question it, and apply the
logic in the order it is presented.

~Greg


On Fri, 29 Oct 2004 09:03:53 +0200, Juergen Luethje <j.lue at gmx.de> wrote:
> 
> Derek Parnell wrote:
> 
> <snip>
> 
> >> So "it's" and "a-m" are words of the same length.
> >>
> >> Doesn't seem consistent.
> >
> > Oh well. You'll get over it. No-one told me I had to be consistent with
> > respect to quotes and hyphens. I had to throw in a few quirks to make it
> > a little more than a toy application. How easy did you want it to be ;)
> 
> Keep the rules as they are. They are well designed.
> 
> Regards,
>   Juergen

new topic     » goto parent     » topic index » view message » categorize

81. Re: Contest

Derek Parnell wrote:
> cklester wrote:
> > If we just make a modification to an already submitted program, will you
> > remove the prior version?
> 
> If that's what you'd personally like, yes. But that's not what I was
> intending to do. I thought it would be nice to see people improving
> on their personal best times.

I figured it would be something like that. I agree that it would be neat
to see how everybody improves.

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

new topic     » goto parent     » topic index » view message » categorize

82. Re: Contest

I thought the competitors might appreciate this:

http://www.gutenberg.org/etext/2600 <-- War and Peace, by Leo Tolstoy

It'll be a bit of a stress-test for y'all.
I'm still working on my program, but I'll put in my word results when
I have them.

I did a quick check - it doesn't have any extended ascii characters in
it, either.
-- 
MrTrick

new topic     » goto parent     » topic index » view message » categorize

83. Re: Contest

Patrick Barnes wrote:
> 
> I thought the competitors might appreciate this:
> 
> http://www.gutenberg.org/etext/2600 <-- War and Peace, by Leo Tolstoy
> 
> It'll be a bit of a stress-test for y'all.
> I'm still working on my program, but I'll put in my word results when
> I have them.
> 

Here's mine (less the time):

Matt Lewis wrnpc11.txt
Total:  564137, Unique:   18471
01 the                  34629
02 and                  22254
03 to                   16738
04 of                   14930
05 a                    10560
06 he                   9864
07 in                   8926
08 his                  7983
09 that                 7899
10 was                  7352
11 with                 5671
12 had                  5364
13 it                   5210
14 her                  4706
15 not                  4687
16 him                  4574
17 at                   4538
18 i                    4147
19 but                  4053
01 14887
02 93460
03 141927
04 97813
05 57861
06 48118
07 42138
08 29705
09 17444
10 10811
11 4584
12 2925
13 1512
14 572
15 201
16 106
17 46
18 14
19 2
20 11

new topic     » goto parent     » topic index » view message » categorize
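
A cheap sanity check on output like this, sketched in Python: if the second list is the number of counted words of each length (an assumption, since the output format is not described in this post), the per-length counts should add up to the reported total. The names below are made up for the sketch; the numbers are copied from the listing above.

```python
# Figures copied from the War and Peace listing above.
REPORTED_TOTAL = 564137
LENGTH_COUNTS = [
    14887, 93460, 141927, 97813, 57861,   # lengths 1-5
    48118, 42138, 29705, 17444, 10811,    # lengths 6-10
    4584, 2925, 1512, 572, 201,           # lengths 11-15
    106, 46, 14, 2, 11,                   # lengths 16-20
]


def histogram_matches_total(counts, total):
    """True when the per-length counts account for every counted word."""
    return sum(counts) == total


if __name__ == "__main__":
    print(histogram_matches_total(LENGTH_COUNTS, REPORTED_TOTAL))
```

The check passes for these figures, so a mismatch against someone else's results is more likely to come from which strings are accepted as words than from the length bookkeeping.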

84. Re: Contest

In my opinion, a score based on processing time is unfair, even though you
are running the tests on your machine 25 times or whatever. The coder of an
application may be inclined to use certain techniques that run better on,
say, one of the latest processors with hyperthreading, or any of a myriad
of other possibilities. How is someone who is not running the equivalent of
your testing machine supposed to account for this when testing and tuning
the speed of their algorithm for this contest?



new topic     » goto parent     » topic index » view message » categorize

85. Re: Contest

Besides, try this. I don't know if it's just my machine, but I find
this strange. Make a new file with the code below, run it 5 times, and
note the different times.

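--<EuCode>--

include get.e   -- provides wait_key()

constant START = time()   -- taken just before the read

-- gets(0) blocks until a full line has been typed at the keyboard,
-- so the elapsed time below includes the typing itself
sequence fn
fn = gets(0)

printf(1, "Elapsed time: %f\n", time() - START)

-- wait for a key press so the result stays on screen
integer junk
junk = wait_key()

--</EuCode>--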

Type hello or some other word, and make sure it is the same word each of
the 5 times you test it. Notice the different times. I know this is
normal, because the machine is handling different amounts of processing
from other apps, but come on: I was getting differences of at least half
a second for just 1 line (the gets(0) call). So in theory, if there is an
average difference of half a second over 1 line of code in 5 runs, how do
you think this will carry over to some of the larger algorithms written?





new topic     » goto parent     » topic index » view message » categorize

86. Re: Contest

spent memory wrote:
> Type hello or some other word, and make sure it is the same word each of
> the 5 times you test it. Notice the different times. I know this is
> normal, because the machine is handling different amounts of processing
> from other apps, but come on: I was getting differences of at least half
> a second for just 1 line (the gets(0) call). So in theory, if there is an
> average difference of half a second over 1 line of code in 5 runs, how do
> you think this will carry over to some of the larger algorithms written?

Derek will run each program 5 times per input file, and ignore the
slowest 2 runs. This probably means that each program will run slower
the first time, and perhaps even also the second time, but caching will
make the next runs faster. Since this occurs for each program, no program
will have an advantage above the other programs.

--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com

new topic     » goto parent     » topic index » view message » categorize
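
The timing scheme tommy describes (5 runs per input file, slowest 2 ignored) could be sketched like this in Python. How the three remaining times are combined is not stated in this thread; averaging them is an assumption of this sketch, and the names timed_runs and score are hypothetical.

```python
import time


def timed_runs(func, runs=5):
    """Call func the given number of times and return each elapsed time in seconds."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        times.append(time.perf_counter() - start)
    return times


def score(times, drop_slowest=2):
    """Discard the slowest runs and average the rest (averaging is an assumption)."""
    kept = sorted(times)[:len(times) - drop_slowest]
    return sum(kept) / len(kept)
```

Dropping the slowest runs means a cold first run, before the file is cached, does not count against a program. For example, score([0.9, 0.5, 0.3, 0.31, 0.29]) only looks at the three times near 0.3.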

87. Re: Contest

spent memory wrote:
> 
> besides try this : i don't know if it's just my machine but i find
> this strange. make a new file with the below code, run it 5 times and
> note the different times
> 
> --<EuCode>--
> 
> include get.e
> constant START = time()
> sequence fn  fn = gets(0)
> printf(1, "Elapsed time: %f\n", time() - START)
> integer junk    junk = wait_key()
> 
> --</EuCode>--
> 
> type hello or some other word, make sure it is the same word each of
> the 5 times u test it. Notice the different times. I know this is
> normal because the machine is handling diff amounts of processing from
> other apps but common, i was getting at least half a seconf
> differences for just 1 line -- sequence fn  fn = gets(0) -- so in
> theory if there is an average of a half second difference over 1 line
> of code in 5 instances how do you think this will implement to some of
> the larger algorithms written. ??

I think the algorithms in the contest won't rely on human input (i.e.,
how fast do you type and how fast/consistent are your reactions).

> On Sat, 30 Oct 2004 02:49:56 +1000, spent memory <spent.memory at gmail.com>
> wrote:
> > 
> > In my opinion, basing the score on processing time is unfair, even if
> > you run the tests on your machine 25 times or whatever. The coder of
> > the application may be inclined to use certain techniques that run
> > better on, say, one of the latest processors with hyperthreading, or
> > a myriad of other possibilities. How is someone who is not running
> > the equivalent of your testing machine supposed to account for this
> > when tuning an algorithm's speed to score in this contest?

How else would you judge the contest?  Also, the differences that you're
mentioning here are probably fairly minimal, since they'll all be running
from the same executable (or similar enough not to matter--PD vs Complete,
console vs. normal Win32).

I think that some of those things could conceivably make a slight 
difference for this competition, but I suspect that the relative speeds
of the submissions will be similar across platforms.

The stakes are pretty low, too, although maybe there should be something
for the winner(s).  What if people pledge some Micro Economy money--this
was how Rob 'financed' the last contest (although I think he said that
someone had donated some money to cover it)?  Since the contest period 
is November, we could pledge our $3 for that month.  I'd volunteer my 
bucks.

Matt Lewis


88. Re: Contest

Matt Lewis wrote:
> Patrick Barnes wrote:
> > I thought the competitors might appreciate this:
> > http://www.gutenberg.org/etext/2600 <-- War and Peace, by Leo Tolstoy
> > 
> > It'll be a bit of a stress-test for y'all.
> > I'm still working on my program, but I'll put in my word results when
> > I have them.
> 
> Here's mine (less the time):
> 
> Matt Lewis wrnpc11.txt
> Total:  564137, Unique:   18471
> 01 the                  34629
> 02 and                  22254
> 03 to                   16738
> 04 of                   14930
> 05 a                    10560
> 06 he                   9864
> 07 in                   8926
> 08 his                  7983
> 09 that                 7899
> 10 was                  7352
> 11 with                 5671
> 12 had                  5364
> 13 it                   5210
> 14 her                  4706
> 15 not                  4687
> 16 him                  4574
> 17 at                   4538
> 18 i                    4147
> 19 but                  4053
> 01 14887
> 02 93460
> 03 141927
> 04 97813
> 05 57861
> 06 48118
> 07 42138
> 08 29705
> 09 17444
> 10 10811
> 11 4584
> 12 2925
> 13 1512
> 14 572
> 15 201
> 16 106
> 17 46
> 18 14
> 19 2
> 20 11

My program outputs the following result for wrnpc11.txt:

Tommy Carlier wrnpc11.txt
Total:  564138, Unique:   18491
01 THE                  34629
02 AND                  22254
03 TO                   16738
04 OF                   14930
05 A                    10559
06 HE                   9864
07 IN                   8926
08 HIS                  7983
09 THAT                 7899
10 WAS                  7352
11 WITH                 5671
12 HAD                  5364
13 IT                   5210
14 HER                  4706
15 NOT                  4687
16 HIM                  4574
17 AT                   4538
18 I                    4147
19 BUT                  4053
01 14882
02 93439
03 141939
04 97826
05 57854
06 48118
07 42144
08 29706
09 17444
10 10811
11 4584
12 2925
13 1512
14 572
15 201
16 106
17 46
18 14
19 2
20 13
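
The two sets of token-length counts above can be compared mechanically. The short Python sketch below (Python is used only for illustration; the helper name is made up) checks that each list of counts adds up to its reported total and prints the lengths at which the two programs disagree. The numbers are copied from the two posts.

```python
# Token-length counts for lengths 1..20, copied from the two posts.
matt = [14887, 93460, 141927, 97813, 57861, 48118, 42138, 29705, 17444, 10811,
        4584, 2925, 1512, 572, 201, 106, 46, 14, 2, 11]
tommy = [14882, 93439, 141939, 97826, 57854, 48118, 42144, 29706, 17444, 10811,
         4584, 2925, 1512, 572, 201, 106, 46, 14, 2, 13]

def length_diffs(a, b):
    """Return {length: b_count - a_count} for every length where the counts differ."""
    return {i + 1: y - x for i, (x, y) in enumerate(zip(a, b)) if x != y}

# Each list of counts should add up to the total reported in its post.
print(sum(matt), sum(tommy))
for length, delta in length_diffs(matt, tommy).items():
    print(f"{length:02d} {delta:+d}")
```

Both lists sum to their reported totals, and the net difference of one token is spread over eight different lengths. That pattern suggests the two programs split some input differently, rather than one of them simply missing a token.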


--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com


89. Re: Contest

Tommy Carlier wrote:
> 
> Matt Lewis wrote:
> > Patrick Barnes wrote:
> > > War and Peace, by Leo Tolstoy
> > > 
> > 
> > Here's mine (less the time):
> > 
> > Matt Lewis wrnpc11.txt
> > Total:  564137, Unique:   18471
 
> My program outputs the following result for wrnpc11.txt:
> 
> Tommy Carlier wrnpc11.txt
> Total:  564138, Unique:   18491

Does your code agree with Derek's figures? (Mine does.)

Matt Lewis


90. Re: Contest

Matt Lewis wrote:
> 
> Tommy Carlier wrote:
> > 
> > Matt Lewis wrote:
> > > Patrick Barnes wrote:
> > > > War and Peace, by Leo Tolstoy
> > > > 
> > > 
> > > Here's mine (less the time):
> > > 
> > > Matt Lewis wrnpc11.txt
> > > Total:  564137, Unique:   18471
>  
> > My program outputs the following result for wrnpc11.txt:
> > 
> > Tommy Carlier wrnpc11.txt
> > Total:  564138, Unique:   18491
> 
> Does your code agree with Derek's figures? (Mine does.)
> 
> Matt Lewis
> 

FWIW, My totals match Tommy's.  I got the same figures as Derek on the 
calibration file.

Phil


91. Re: Contest

On Fri, 29 Oct 2004 09:25:51 -0700, Matt Lewis
<guest at RapidEuphoria.com> wrote:

<snip>
>20 11

As well as some other differences, I got
20 13

On investigation, I discovered there are 7 "gentlemen-in-waiting" and
6 "gentleman-in-waiting" (less the quotes) in that file.

Just for a laugh, I tried words.txt (omitting most frequent!):
Total:   51792, Unique:   51682
01 2
02 128
03 824
04 2823
05 5059
06 7544
07 9182
08 7627
09 6525
10 4900
11 3176
12 1932
13 1135
14 522
15 242
16 90
17 52
18 22
19 4
20 3
Elapsed time: 1.700000

Regards,
Pete


92. Re: Contest

On Fri, 29 Oct 2004 12:48:24 -0700, Phil Russell
<guest at RapidEuphoria.com> wrote:

>FWIW, My totals match Tommy's.  I got the same figures as Derek on the 
>calibration file.
Ditto.

Pete


93. Re: Contest

Pete Lomax wrote:
> 
> On Fri, 29 Oct 2004 12:48:24 -0700, Phil Russell
> <guest at RapidEuphoria.com> wrote:
> 
> >FWIW, My totals match Tommy's.  I got the same figures as Derek on the 
> >calibration file.
> Ditto.
> 

Hmm.  I may have messed something up while optimizing...

Matt Lewis


94. Re: Contest

I'd like to see at least the list of unique 1-letter words from the
"official" count by Derek. Pleeeeease?!

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


95. Re: Contest

Pete Lomax wrote:
> 
> Just for a laugh, I tried words.txt (omitting most frequent!):
> Total:   51792, Unique:   51682
> 01 2
> 02 128
> 03 824
> 04 2823
> 05 5059
> 06 7544
> 07 9182
> 08 7627
> 09 6525
> 10 4900
> 11 3176
> 12 1932
> 13 1135
> 14 522
> 15 242
> 16 90
> 17 52
> 18 22
> 19 4
> 20 3

Phil Russell words.txt
Total:   51799, Unique:   51684
<most frequent omitted>
01 2
02 129
03 824
04 2823
05 5060
06 7545
07 9183
08 7628
09 6525
10 4901
11 3177
12 1932
13 1135
14 522
15 242
16 90
17 52
18 22
19 4
20 3

Sigh.

Phil


96. Re: Contest

Pete Lomax wrote:
> Just for a laugh, I tried words.txt (omitting most frequent!):
> Total:   51792, Unique:   51682

Doh! I forgot that I had slightly amended my copy of words.txt 
for my own nefarious purposes a few months ago. 

On a freshly-downloaded copy I get the same results as Pete. 

I'll get me coat...

Phil


97. Re: Contest

spent memory wrote:
> 
> In my opinion, basing the score on processing time is unfair, even if
> you run the tests on your machine 25 times or whatever. The coder of
> the application may be inclined to use certain techniques that run
> better on, say, one of the latest processors with hyperthreading, or
> a myriad of other possibilities. How is someone who is not running
> the equivalent of your testing machine supposed to account for this
> when tuning an algorithm's speed to score in this contest?

Fair enough. But I guess you either don't use such 'advanced' techniques,
don't enter the contest, or just get over it.

It's the speed on my machine that will be used. If you don't like that
then don't enter the contest.

-- 
Derek Parnell
Melbourne, Australia


98. Re: Contest

spent memory wrote:
> 
> Besides, try this: I don't know if it's just my machine, but I find
> this strange. Make a new file with the code below, run it 5 times, and
> note the different times.
> 
> --<EuCode>--
> include get.e
> constant START = time()
> sequence fn  fn = gets(0)
> printf(1, "Elapsed time: %f\n", time() - START)
> integer junk    junk = wait_key()
> --</EuCode>--
> 
> Type hello or some other word, and make sure it is the same word each of
> the 5 times you test it. Notice the different times. I know this is
> normal because the machine is handling different amounts of processing
> from other apps, but come on, I was getting differences of at least half
> a second for just 1 line -- sequence fn  fn = gets(0) -- so in theory,
> if there is an average half-second difference over 1 line of code in
> 5 instances, how do you think this will carry over to some of the
> larger algorithms written?

Your code is also timing the speed it takes to type in the word. That
can vary greatly. In the contest, there is no human data entry during
the timing period.
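
The difference can be shown with a small sketch, written in Python purely for illustration. Everything here is a stand-in: slow_typist simulates a person typing with a made-up 0.2 second delay, and count_letters stands in for the real processing.

```python
import time

def timed(step):
    """Return the wall-clock seconds taken by one call of `step`."""
    start = time.perf_counter()
    step()
    return time.perf_counter() - start

def slow_typist():
    """Stand-in for a person typing a word; the 0.2 s delay is made up."""
    time.sleep(0.2)
    return "hello"

def count_letters(word="hello"):
    """Stand-in for the real processing work."""
    return len(word)

# Timer wrapped around input and processing: dominated by the typing delay.
with_input = timed(lambda: count_letters(slow_typist()))
# Timer wrapped around processing only: no human data entry inside the timed span.
processing_only = timed(count_letters)
print(with_input > processing_only)
```

Any run-to-run variation in the first measurement mostly reflects the typist, not the program.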

-- 
Derek Parnell
Melbourne, Australia


99. Re: Contest

Matt Lewis wrote:

[snip]

> The stakes are pretty low, too, although maybe there should be something
> for the winner(s).  What if people pledge some Micro Economy money--this
> was how Rob 'financed' the last contest (although I think he said that
> someone had donated some money to cover it)?  Since the contest period 
> is November, we could pledge our $3 for that month.  I'd volunteer my 

Thanks Matt. I'll donate $30ME to the pot. I think that Win32lib can
afford that ;)

Robert Craig: Is this sort of thing do-able in your Micro Economy database?

-- 
Derek Parnell
Melbourne, Australia


100. Re: Contest

cklester wrote:
> 
> I'd like to see at least the list of unique 1-letter words from the
> "official" count by Derek. Pleeeeease?!
> 

Word               Frequency
-------------------------------
 K                    1
 L                    1
 Y                    1
 M                    2
 S                    2
 T                    4
 O                    38
 A                    497
 I                    547

-- 
Derek Parnell
Melbourne, Australia


101. Re: Contest

On Fri, 29 Oct 2004 15:13:50 -0700, Phil Russell
<guest at RapidEuphoria.com> wrote:

>Doh! I forgot that I had slightly amended my copy of words.txt 
>for my own nefarious purposes a few months ago. 
>
>On a freshly-downloaded copy I get the same results as Pete. 
>
>I'll get me coat...
LOL, I very nearly posted the results from my three-year-old copy,
only at the very last moment did I think to d/l a fresh one.

BTW, I'd managed four entries in that time:
<BBUFFALOES
>BUFFALOES
<GRUYERE
<ZEALOTS

/me runs off and deletes the strange one.
Pete


102. Re: Contest

cklester wrote:
> 
> I'd like to see at least the list of unique 1-letter words from the
> "official" count by Derek. Pleeeeease?!
> 

Okay. For those that really must see it, the full analysis of the
calibration file can be found at ...

  http://www.users.bigpond.com/ddparnell/contest1/word.lst

-- 
Derek Parnell
Melbourne, Australia


103. Re: Contest

But I thought the rules of the contest state that the timing functions
are at the start and the end of the file respectively. That would put
the typing of the name of the file to read within the timed period.

This did not occur to me in the early hours of last night. Yes, I see
a lot of points have been posted in reply. It was not something I was
terribly concerned with or anything; I just thought I would offer my
opinion. As for the times all evening out over 5 tests, that would
hold true provided Derek didn't check his email or something during
testing, lol, and probably even more so if he shut down all processes
in his task manager during testing except for Explorer and Systray
(for 98 anyway; I'm not too sure what the equivalent is on XP). The
fewer processes running in the background of the testing O/S, the
lower your time will be, I imagine.


On Fri, 29 Oct 2004 16:04:16 -0700, Derek Parnell
<guest at rapideuphoria.com> wrote:
> 
> posted by: Derek Parnell <ddparnell at bigpond.com>
> 
> spent memory wrote:
> >
> > Besides, try this: I don't know if it's just my machine, but I find
> > this strange. Make a new file with the code below, run it 5 times, and
> > note the different times.
> >
> > --<EuCode>--
> > include get.e
> > constant START = time()
> > sequence fn  fn = gets(0)
> > printf(1, "Elapsed time: %f\n", time() - START)
> > integer junk    junk = wait_key()
> > --</EuCode>--
> >
> > Type hello or some other word, and make sure it is the same word each of
> > the 5 times you test it. Notice the different times. I know this is
> > normal because the machine is handling different amounts of processing
> > from other apps, but come on, I was getting differences of at least half
> > a second for just 1 line -- sequence fn  fn = gets(0) -- so in theory,
> > if there is an average half-second difference over 1 line of code in
> > 5 instances, how do you think this will carry over to some of the
> > larger algorithms written?
> 
> Your code is also timing the speed it takes to type in the word. That
> can vary greatly. In the contest, there is no human data entry during
> the timing period.
> 
> 
> --
> Derek Parnell
> Melbourne, Australia
> 
> 
> 
>


104. Re: Contest

Derek Parnell wrote:
> 
> Matt Lewis wrote:
> 
> [snip]
> 
> > The stakes are pretty low, too, although maybe there should be something
> > for the winner(s).  What if people pledge some Micro Economy money--this
> > was how Rob 'financed' the last contest (although I think he said that
> > someone had donated some money to cover it)?  Since the contest period 
> > is November, we could pledge our $3 for that month.  I'd volunteer my 
> 
> Thanks Matt. I'll donate $30ME to the pot. I think that Win32lib can
> afford that ;)
> 
> Robert Craig: Is this sort of thing do-able in your Micro Economy database?

Sure. 
Tell me who's donating what, and I'll credit that to the winner.

(As long as I don't have to be the judge. That's a lot of work.
Thanks Derek for taking this on.)

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


105. Re: Contest

spent memory wrote:
> 
> But I thought the rules of the contest state that the timing functions
> are at the start and the end of the file respectively. That would put
> the typing of the name of the file to read within the timed period.

They also say that the file name is to be provided on the command line
and thus NOT typed in by the user.

> This did not occur to me in the early hours of last night. Yes, I see
> a lot of points have been posted in reply. It was not something I was
> terribly concerned with or anything; I just thought I would offer my
> opinion. As for the times all evening out over 5 tests, that would
> hold true provided Derek didn't check his email or something during
> testing, lol, and probably even more so if he shut down all processes
> in his task manager during testing except for Explorer and Systray
> (for 98 anyway; I'm not too sure what the equivalent is on XP). The
> fewer processes running in the background of the testing O/S, the
> lower your time will be, I imagine.

I will quiesce (look that up in your Funk & Wagnalls) my system prior
to each test and ensure that only the minimum of other processes
are running at the same time.

Remember though that the comparisons will be relative, so as long as all
the programs are run in the same environment, the timings will have a
good measure of credibility.

After the contest is over, anyone can run the series of programs in their
own environment to validate the relative timings.

-- 
Derek Parnell
Melbourne, Australia


106. Re: Contest

Robert Craig wrote:
> 
> Tell me who's donating what, and I'll credit that to the winner.
> 

Please add my ME$ for Oct. and Nov. to the pot.

May the best person win!


Regards,

Marc


107. Re: Contest

OK, sounds like a fair contest. I was not trying to imply it was unfair
or anything, just trying to offer ideas to make it as fair as possible
:). Besides, who can argue with the Win32lib commander ;)

cheers Joe


On Fri, 29 Oct 2004 17:39:34 -0700, Derek Parnell
<guest at rapideuphoria.com> wrote:
> 
> posted by: Derek Parnell <ddparnell at bigpond.com>
> 
> spent memory wrote:
> >
> > But I thought the rules of the contest state that the timing functions
> > are at the start and the end of the file respectively. That would put
> > the typing of the name of the file to read within the timed period.
> 
> They also say that the file name is to be provided on the command line
> and thus NOT typed in by the user.
> 
> > This did not occur to me in the early hours of last night. Yes, I see
> > a lot of points have been posted in reply. It was not something I was
> > terribly concerned with or anything; I just thought I would offer my
> > opinion. As for the times all evening out over 5 tests, that would
> > hold true provided Derek didn't check his email or something during
> > testing, lol, and probably even more so if he shut down all processes
> > in his task manager during testing except for Explorer and Systray
> > (for 98 anyway; I'm not too sure what the equivalent is on XP). The
> > fewer processes running in the background of the testing O/S, the
> > lower your time will be, I imagine.
> 
> I will quiesce (look that up in your Funk & Wagnalls) my system prior
> to each test and ensure that only the minimum of other processes
> are running at the same time.
> 
> Remember though that the comparisons will be relative, so as long as all
> the programs are run in the same environment, the timings will have a
> good measure of credibility.
> 
> After the contest is over, anyone can run the series of programs in their
> own environment to validate the relative timings.
> 
> --
> Derek Parnell
> Melbourne, Australia
> 
> 
> 
> 
>


108. Re: Contest

Derek Parnell wrote:
> 
> cklester wrote:
> > 
> > I'd like to see at least the list of unique 1-letter words from the
> > "official" count by Derek. Pleeeeease?!
> > 
> 
> Word               Frequency
> -------------------------------
>  K                    1
>  L                    1
>  Y                    1
>  M                    2
>  S                    2
>  T                    4
>  O                    38
>  A                    497
>  I                    547

I've checked the entire file. I've looked at every occurrence of
's'. Where does 's' appear as a word?!

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


109. Re: Contest

Line 3511:
To morrow is S[aint]. Valentines day, all in the morning betime,

Line 3519:
By gis, and by S[aint]. Charity,

You are incorrectly identifying 'S' here as a single word. My
program does not. Or is the "new rule" that anything in square brackets
does not count as a word?

(I knew I was right. ;) )

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


110. Re: Contest

cklester wrote:
> 
> Line 3511:
> To morrow is S[aint]. Valentines day, all in the morning betime,
> 
> Line 3519:
> By gis, and by S[aint]. Charity,
> 
> You are incorrectly identifying 'S' here as a single word. My
> program does not. Or is the "new rule" that anything in square brackets
> does not count as a word?

When I include "[" and "]" as word delimiters, I get the same results
as Derek. There are a few X[..Y..] "words" in the file. Whereas Derek
identifies that as 'X,' I identify it as "X..Y.." (having stripped the
square brackets).

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


111. Re: Contest

On 29 Oct 2004, at 19:33, cklester wrote:

> 
> 
> posted by: cklester <cklester at yahoo.com>
> 
> Line 3511:
> To morrow is S[aint]. Valentines day, all in the morning betime,
> 
> Line 3519:
> By gis, and by S[aint]. Charity,
> 
> You are incorrectly identifying 'S' here as a single word. My
> program does not. Or is the "new rule" that anything in square brackets
> does not count as a word?

The brackets are not in the list of valid word components, ergo, they are 
delimiters.
 
> (I knew I was right. ;) )

Oh..... i take it back then, my apologies!

Kat


112. Re: Contest

Kat wrote:
> 
> On 29 Oct 2004, at 19:33, cklester wrote:
> > 
> > posted by: cklester <cklester at yahoo.com>
> > 
> > Line 3511:
> > To morrow is S[aint]. Valentines day, all in the morning betime,
> > 
> > Line 3519:
> > By gis, and by S[aint]. Charity,
> > 
> > You are incorrectly identifying 'S' here as a single word. My
> > program does not. Or is the "new rule" that anything in square brackets
> > does not count as a word?
> 
> The brackets are not in the list of valid word components, ergo, they are 
> delimiters.

You're right! I knew something wasn't right. ;)

(I figured it was an oversight by Derek... I mean, why shouldn't
"L[ord]" be considered "Lord" instead of "L?" Ah well. His rules.) ;)

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


113. Re: Contest

> > > > > War and Peace, by Leo Tolstoy
> > > > Matt Lewis wrnpc11.txt
> > > > Total:  564137, Unique:   18471
> > > Tommy Carlier wrnpc11.txt
> > > Total:  564138, Unique:   18491

c.k.lester wrnpc11.txt
Total:  564032, Unique:   18499

Uh oh. :/

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/


114. Re: Contest

cklester wrote:
> 
> Line 3511:
> To morrow is S[aint]. Valentines day, all in the morning betime,
> 
> Line 3519:
> By gis, and by S[aint]. Charity,
> 
> You are incorrectly identifying 'S' here as a single word. My
> program does not. Or is the "new rule" that anything in square brackets
> does not count as a word?
> 
> (I knew I was right. ;) )
> 


I don't think you are reading the rules correctly. Maybe the problem
is that I use the word "word" to name the valid set of strings. You
seem to be fixated on the idea that "word" means an English Language word.
In this contest I'm not referring to English Language Words. Maybe
it would help if I called the strings "ValidStrings" instead of "words".

The line ..

To morrow is S[aint]. Valentines day, all in the morning betime,

has 12 ValidStrings in it ...

TO
MORROW
IS
S
AINT
VALENTINES
DAY
ALL
IN
THE
MORNING
BETIME
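
The split shown above can be reproduced with a few lines of code. The sketch below is in Python for illustration only and is not the official reference program. It assumes, for this one line, that only the letters A-Z and a-z are token characters, that every other character is a delimiter, and that tokens are upper-cased. The full contest rules cover more cases than this.

```python
import re

def tokens(line):
    """Split on anything that is not an ASCII letter and upper-case the pieces."""
    return [t.upper() for t in re.findall(r"[A-Za-z]+", line)]

line = "To morrow is S[aint]. Valentines day, all in the morning betime,"
result = tokens(line)
print(len(result))
print(result)
```

Because "[" and "]" are delimiters under this reading, "S[aint]" yields the two tokens S and AINT rather than SAINT.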

-- 
Derek Parnell
Melbourne, Australia


115. Re: Contest

Derek, you said readability counts too.... so if my program reads itself out 
loud to you, what language/accent do you prefer?

Kat


116. Re: Contest

On 29 Oct 2004, at 20:02, cklester wrote:

> 
> 
> posted by: cklester <cklester at yahoo.com>
> 
> > > > > > War and Peace, by Leo Tolstoy
> > > > > Matt Lewis wrnpc11.txt
> > > > > Total:  564137, Unique:   18471
> > > > Tommy Carlier wrnpc11.txt
> > > > Total:  564138, Unique:   18491
> 
> c.k.lester wrnpc11.txt
> Total:  564032, Unique:   18499
> 
> Uh oh. :/

Yes, same here. On the official test file, i am getting 4 more words than 
Derek did. I did get the same top 5 words. It's going to take another 20 
minutes to print them to a file, to run a file compare on them.

Kat


117. Re: Contest

On Fri, 29 Oct 2004 22:30:50 -0500, Kat <gertie at visionsix.com> wrote:
> 
> Derek, you said readability counts too.... so if my program reads itself out
> loud to you, what language/accent do you prefer?

LMAO....
How about Klingon?

-- 
MrTrick


118. Re: Contest

On 29 Oct 2004, at 20:39, Derek Parnell wrote:

> 
> 
> posted by: Derek Parnell <ddparnell at bigpond.com>
> 
> cklester wrote:
> > 
> > Line 3511:
> > To morrow is S[aint]. Valentines day, all in the morning betime,
> > 
> > Line 3519:
> > By gis, and by S[aint]. Charity,
> > 
> > You are incorrectly identifying 'S' here as a single word. My
> > program does not. Or is the "new rule" that anything in square brackets
> > does not count as a word?
> > 
> > (I knew I was right. ;) )
> > 
> 
> I don't think you are reading the rules correctly. Maybe the problem
> is that I use the word "word" to name the valid set of strings. You
> seem to be fixated on the idea that "word" means an English Language word.
> In this contest I'm not referring to English Language Words. Maybe
> it would help if I called the strings "ValidStrings" instead of "words".

TOKENS!!!

Kat


119. Re: Contest

Kat wrote:
> 
> On 29 Oct 2004, at 20:39, Derek Parnell wrote:
> > Maybe it would help if I called the strings "ValidStrings" instead 
> > of "words".
> 
> TOKENS!!!

D'oh! Of course.  I've updated the rules accordingly. Thanks Kat.

-- 
Derek Parnell
Melbourne, Australia


120. Re: Contest

Hi Derek,

according to your rules,
   2004    is not a word, but
    '04    is a word.

Even
   '''     is a word (of length 0).

Is this right?

Regards,
   Juergen

PS: Thanks for organizing the contest!


121. Re: Contest

Robert Craig wrote:

<snip>

> Tell me who's donating what, and I'll credit that to the winner.
>
> (As long as I don't have to be the judge. That's a lot of work.
> Thanks Derek for taking this on.)

Please add my November $3.00 to the prize pool. Thanks.

Regards,
   Juergen


122. Re: Contest

Derek Parnell wrote:

<snip>

> I'll donate $30ME to the pot. I think that Win32lib can afford that ;)

<snip>

Very nice!
Maybe the 2nd and the 3rd places should also get a piece of the cake,
rather than that the winner gets the whole pool? For instance the pool
could be divided like this:

1st: 40%
2nd: 33%
3rd: 27%

Regards,
   Juergen


123. Re: Contest

On Sat, 30 Oct 2004 10:19:49 +0200, Juergen Luethje <j.lue at gmx.de> wrote:
> could be divided like this:
> 
> 1st: 40%
> 2nd: 33%
> 3rd: 27%

50% 30% 20% is my vote, that way the winner gets a bit more...

-- 
MrTrick


124. Re: Contest

Juergen Luethje wrote:
> 
> Hi Derek,
> 
> according to your rules,
>    2004    is not a word, but
>     '04    is a word.
> 
> Even
>    '''     is a word (of length 0).
> 
> Is this right?

Yep, you got it right.

By the way, I've adopted Kat's suggestion and I'm calling them tokens
now and not words.
-- 
Derek Parnell
Melbourne, Australia


125. Re: Contest

Please add my $3 as well.

-- Mike Nelson


126. Re: Contest

Derek,

Might I suggest that a token composed entirely of single quotes not be
counted as a token? (The most common case would be a lone single quote.)
This would resolve to zero length and your specifications don't include a
frequency count for 0 length tokens.
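
To make the effect of this suggestion concrete, here is a small sketch in Python (for illustration only). The tokenizing rules in it are a guess reconstructed from this thread, not the official rules: letters, digits and the apostrophe are treated as token characters, a run made up only of digits is rejected, and apostrophes are stripped from whatever remains. The keep_empty flag is a made-up name for switching the suggestion on and off.

```python
import re

def tokens(text, keep_empty=True):
    """Tokenize under the reading discussed in this thread (an assumption)."""
    result = []
    # Assumed token characters: ASCII letters, digits and the apostrophe.
    for run in re.findall(r"[A-Za-z0-9']+", text):
        if run.isdigit():
            continue  # "2004": all digits, so not a token
        token = run.replace("'", "").upper()  # apostrophes are stripped
        if token or keep_empty:
            result.append(token)
    return result

sample = "In 2004 or '04 he said ''' twice"
print(tokens(sample))
print(tokens(sample, keep_empty=False))
```

With keep_empty=False, a run consisting only of apostrophes no longer adds to the total, so no frequency count for length 0 is needed.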

-- Mike Nelson


127. Re: Contest

On Sat, 30 Oct 2004 06:22:26 -0700, Derek Parnell
<guest at rapideuphoria.com> wrote:
> > according to your rules,
> >    2004    is not a word, but
> >     '04    is a word.
> >
> > Even
> >    '''     is a word (of length 0).
> >
> > Is this right?
> 
> Yep, you got it right.

Eep... time to redo my parser again (which unfortunately worked until
you informed us of this)

So an apostrophe surrounded by delimiters is counted as a token, even
though it is stripped out of the token, so the token has no length?
* Does it affect the total number of words?
* The number of occurrences of the empty token (caused by a single
apostrophe) needs to be stored, too.


-- 
MrTrick

128. Re: Contest

Derek Parnell wrote:
> 
> I don't think you are reading the rules correctly. Maybe the problem
> is that I use the word "word" to name the valid set of strings. You
> seem to be fixated on the idea that "word" means an English Language word.
> In this contest I'm not referring to English Language Words. Maybe
> it would help if I called the strings "ValidStrings" instead of "words".

You're right. I thought we were counting words, not letter groups.

But I'm all clear now. I hope! ;)

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

129. Re: Contest

Derek Parnell wrote:
> 
> Juergen Luethje wrote:
> > 
> > Hi Derek,
> > 
> > according to your rules,
> >    2004    is not a word, but
> >     '04    is a word.

Would this be a token of "04," despite that "all numbers" is not
considered a valid token?

> > Even
> >    '''     is a word (of length 0).
> > 
> > Is this right?
> 
> Yep, you got it right.

Are we supposed to count these zero-length tokens?!

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

130. Re: Contest

cklester wrote:
> Derek Parnell wrote:
> > Juergen Luethje wrote:
> > > Hi Derek,
> > > 
> > > according to your rules,
> > >    2004    is not a word, but
> > >     '04    is a word.
> 
> Would this be a token of "04," despite that "all numbers" is not
> considered a valid token?
> 
> > > Even
> > >    '''     is a word (of length 0).
> > > 
> > > Is this right?
> > 
> > Yep, you got it right.
> 
> Are we supposed to count these zero-length tokens?!

This is kind of a problem, I think. One solution would be to
say in the rules, that a token should have at least 1 letter in
it to be a valid token. A simple rule, I think, that captures the
essence of a valid token.

--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com

131. Re: Contest

Patrick Barnes wrote:
> 
> On Sat, 30 Oct 2004 06:22:26 -0700, Derek Parnell
> <guest at rapideuphoria.com> wrote:
> > > according to your rules,
> > >    2004    is not a word, but
> > >     '04    is a word.
> > >
> > > Even
> > >    '''     is a word (of length 0).
> > >
> > > Is this right?
> > 
> > Yep, you got it right.


Arrgghhhh!! I made that reply after coming home from a party at the
neighbours' house. I guess it was the wine talking ;-)

No, we can't have zero-length tokens, so a string of only quotes cannot
be a valid token. You guys are right and I was wrong.

> Eep... time to redo my parser again (which unfortunately worked until
> you informed us of this)

Sorry. Of course this happens with real-life specs too. The customer swears
that 'X' is required right up to the day you are about to deliver the
application to them. Then it is "No, 'X' is wrong. I need 'Y' instead".
 
-- 
Derek Parnell
Melbourne, Australia

132. Re: Contest

cklester wrote:
> 
> Derek Parnell wrote:
> > 
> > Juergen Luethje wrote:
> > > 
> > > Hi Derek,
> > > 
> > > according to your rules,
> > >    2004    is not a word, but
> > >     '04    is a word.
> 
> Would this be a token of "04," despite that "all numbers" is not
> considered a valid token?

'04     Is a three-character token

04,     Is a two-character digit string (which is not a token) and
        a one character delimiter. The net result is a 3-character
        delimiter.

> > > Even
> > >    '''     is a word (of length 0).
> > > 
> > > Is this right?
> > 
> > Yep, you got it right.
> 
> Are we supposed to count these zero-length tokens?!

No. I got that wrong.

-- 
Derek Parnell
Melbourne, Australia

133. Re: Contest

Tommy Carlier wrote:
> 
> cklester wrote:
> > Derek Parnell wrote:
> > > Juergen Luethje wrote:
> > > > Hi Derek,
> > > > 
> > > > according to your rules,
> > > >    2004    is not a word, but
> > > >     '04    is a word.
> > 
> > Would this be a token of "04," despite that "all numbers" is not
> > considered a valid token?
> > 
> > > > Even
> > > >    '''     is a word (of length 0).
> > > > 
> > > > Is this right?
> > > 
> > > Yep, you got it right.
> > 
> > Are we supposed to count these zero-length tokens?!
> 
> This is kind of a problem, I think. One solution would be to
> say in the rules, that a token should have at least 1 letter in
> it to be a valid token. A simple rule, I think, that captures the
> essence of a valid token.

Thanks Tommy. Consider it done.

-- 
Derek Parnell
Melbourne, Australia

134. Re: Contest

On Sat, 30 Oct 2004 16:28:49 -0700, Derek Parnell
<guest at rapideuphoria.com> wrote:
> > > > Would this be a token of "04," despite that "all numbers" is not
> > > > considered a valid token?
> > >
> > > '04     Is a three-character token
> > >
<SNIP>
> LOL! I've only just got up and I'm a bit "fuzzy".
> 
> The length of  '04  is TWO even though it uses up three characters.

Ok, hangon...

10-4 is not a token - it has no letters in it.
Are you saying that '10-4' is a token of length 4? It has no letters!

IMO, the rule should be something like this:

Rule 3, part 1: Strings that DO NOT CONTAIN ANY LETTERS are delimiters.

-- 
MrTrick

135. Re: Contest

Derek Parnell wrote:

> '04     Is a three-character token

But "it's" is not a four-character token?

So "'04" != "04" but "it's" = "its?"

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

136. Re: Contest

On Sat, 30 Oct 2004 15:44:35 -0700, Derek Parnell
<guest at RapidEuphoria.com> wrote:

>Arrgghhhh!! I made that reply after coming home from a party at the 
>neighbours house. I guess it was the wine talking blink
>
>No, we can't have zero-length tokens, so a string of only quotes cannot
>be a valid token. You guys are right and I was wrong.
>
>> Eep... time to redo my parser again (which unfortunately worked until
>> you informed us of this)
>
>Sorry. Of course this happens with real-life specs too. <snip>

Arrgghhhh!! I just added +1 a hundred times ;-))
I'll forgive ya!

Can we have an absolute ruling on the '10-4' query?
I feel certain you will agree it *is* a 4 character token.

Regards,
Pete

137. Re: Contest

On Sat, 30 Oct 2004 15:51:25 -0700, Derek Parnell
<guest at RapidEuphoria.com> wrote:

>Tommy Carlier wrote:
>> This is kind of a problem, I think. One solution would be to
>> say in the rules, that a token should have at least 1 letter in
>> it to be a valid token. A simple rule, I think, that captures the
>> essence of a valid token.
>
>Thanks Tommy. Consider it done.

It might then clarify things to place the last sentence of rule 3,
	"A token must have a minimum length of 1"
after point 7
	"A token's length does not include any quotes in the count".

Maybe,
Pete

PS: I seriously doubt the grammar in rule 5 helps anyone.

<Sorry, rant:>
	, including you. Either it is *absolutely* correct, and possibly 
	contains some obscure hint, or it is(/is it) just a misleading red
	herring(?) As I see it, there is just no way that says "12" is not
	a word? I know you said "don't get pedantic", yet I cannot but ask
	why! Who is going to understand that, to any degree, anyway?
<OK, rant over!>

PPS: If (point 11) there is no command line argument, will anyone get
penalised for defaulting? (I had to ask ;-))

138. Re: Contest

Pete Lomax wrote:
> 

> 
> Can we have an absolute ruling on the '10-4' query?

It's already in the rules. Not a token. :)

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

139. Re: Contest

> > Can we have an absolute ruling on the '10-4' query?
>
> It's already in the rules. Not a token. :)
>

I would say that this is implicit in the rules (as was the rule that a string
composed solely of quotes is not a token). I would like it explicit.

The rule is that single quotes are disregarded for comparison and display,
but it does not explicitly state that single quotes are disregarded for
determining whether a string is a token or a delimiter.

Based on Derek's ruling about the 0-length tokens, I believe that consistency
indicates that '10-4' is not a token. By one rule '10-4' is the same token
as 10-4, but by another rule 10-4 is not a token.

In the real world, I have had clients impose contradictory specifications. I
try to argue them out of it, but often I fail. Perhaps this inconsistency is
more realistic than a consistent rule would be. I'm fine with it either way.

When all details are ironed out, my program is ready to fly. I intend to
submit it at the first legal moment.

-- Mike Nelson

140. Re: Contest

On 30 Oct 2004, at 19:59, cklester wrote:

> 
> 
> posted by: cklester <cklester at yahoo.com>
> 
> Pete Lomax wrote:
> > 
> 
> > Can we have an absolute ruling on the '10-4' query?
> 
> It's already in the rules. Not a token. :)

Which is a lil odd, because "a-b" is!

Kat

141. Re: Contest

Patrick Barnes wrote:
> 
> On Sat, 30 Oct 2004 16:28:49 -0700, Derek Parnell
> <guest at rapideuphoria.com> wrote:
> > > > > Would this be a token of "04," despite that "all numbers" is not
> > > > > considered a valid token?
> > > >
> > > > '04     Is a three-character token
> > > >
> <SNIP>
> > LOL! I've only just got up and I'm a bit "fuzzy".
> > 
> > The length of  '04  is TWO even though it uses up three characters.
> 
> Ok, hangon...
> 
> 10-4 is not a token - it has no letters in it.
> Are you saying that '10-4' is a token of length 4? It has no letters!

The four bytes that go to make up 10-4 consist of only digits and
hyphens. A string composed only of digits and/or hyphens is not a token.
That's in the rules already.

The six bytes that go to make up '10-4' contain quotes, which are token
characters, but they are considered zero-length characters. Thus the
effective length of the six-byte string is 4. As this is >= 1 and <= 20
the string is deemed a token. That's in the rules already.

> IMO, the rule should be something like this:
> 
> Rule 3, part 1: Strings that DO NOT CONTAIN ANY LETTERS are delimiters.

However, strings that contain digits and/or embedded hyphens, *and*
contain quotes are real tokens. Weird but true (for this contest).

'-'    is a token
'1'    is a token
'1-    is a delimiter (ends in a hyphen)
-'1    is a delimiter (starts with a hyphen)
a-     is a delimiter
a'     is a token
10     is a delimiter
10-4   is a delimiter
'10-4' is a token
'''''' is a delimiter (effective length = 0)
'''1'' is a token (effective length = 1)
-- 
Derek Parnell
Melbourne, Australia

142. Re: Contest

cklester wrote:
> 
> Derek Parnell wrote:
> 
> > '04     Is a three-character token
> 
> But "it's" is not a four-character token?
> 
> So "'04" != "04" but "it's" = "its?"

Correct. The difference is that the first pair are digit-type strings.

"'04" contains a quote so it is a potential token with an effective length
of 2; "04" only contains digits so it's not a token.
"it's" and "its" are both tokens with an effective length of 3.

-- 
Derek Parnell
Melbourne, Australia

143. Re: Contest

Pete Lomax wrote:
> 
> On Sat, 30 Oct 2004 15:44:35 -0700, Derek Parnell
> <guest at RapidEuphoria.com> wrote:
> 
> >Arrgghhhh!! I made that reply after coming home from a party at the 
> >neighbours house. I guess it was the wine talking blink
> >
> >No, we can't have zero-length tokens, so a string of only quotes cannot
> >be a valid token. You guys are right and I was wrong.
> >
> >> Eep... time to redo my parser again (which unfortunately worked until
> >> you informed us of this)
> >
> >Sorry. Of course this happens with real-life specs too. <snip>
> 
> Arrgghhhh!! I just added +1 a hundred times blink)
> I'll forgive ya!
> 
> Can we have an absolute ruling on the '10-4' query?
> I feel certain you will agree it *is* a 4 character token.

'10-4' is a six byte string that forms a token with an 
effective length of 4.
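
To make the difference between byte count and effective length concrete, here is a minimal Python sketch. It is not code from the contest; the function name effective_length is made up for illustration. The only rule it assumes is the one stated in this thread: every token character counts as 1, except the single quote, which counts as 0.

```python
def effective_length(s: str) -> int:
    """Length of s with every single quote counted as zero (hypothetical helper)."""
    return len(s) - s.count("'")


# Strings taken from the discussion above.
for s in ["'10-4'", "it's", "its", "'04", "''''''"]:
    print(f"{s}: {len(s)} bytes, effective length {effective_length(s)}")
```

Note that the string itself is never modified; the quotes are simply left out of the count.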

-- 
Derek Parnell
Melbourne, Australia

144. Re: Contest

Pete Lomax wrote:
> 
> On Sat, 30 Oct 2004 15:51:25 -0700, Derek Parnell
> <guest at RapidEuphoria.com> wrote:
> 
> >Tommy Carlier wrote:
> >> This is kind of a problem, I think. One solution would be to
> >> say in the rules, that a token should have at least 1 letter in
> >> it to be a valid token. A simple rule, I think, that captures the
> >> essence of a valid token.
> >
> >Thanks Tommy. Consider it done.
> 
> It might then clarify to somehow place last sentence rule 3,
> 	"A token must have a minimum length of 1"
> after point 7
> 	"A token's length does not include any quotes in the count".

Okay, I've reworded this and expanded it. Hope it reads better now.

> 
> PS: I seriously doubt the grammar in rule 5 helps anyone.

Me too. You're right, it's stupid and misleading. It's gone.


 
> PPS If, point 11, there is no command line argument, will anyone get
> penalised for defaulting? (I had to ask blink)

However, whenever I run your program it will have a command line
argument. No defaulting will be useful. (I had to respond ;0)

-- 
Derek Parnell
Melbourne, Australia

145. Re: Contest

cklester wrote:
> 
> Pete Lomax wrote:
> > 
> 
> > Can we have an absolute ruling on the '10-4' query?
> 
> It's already in the rules. Not a token. :)

'10-4' is a token.
10-4 is not a token.


-- 
Derek Parnell
Melbourne, Australia

146. Re: Contest

On 30 Oct 2004, at 23:25, Derek Parnell wrote:

> 
> 
> posted by: Derek Parnell <ddparnell at bigpond.com>
> 
> cklester wrote:
> > 
> > Pete Lomax wrote:
> > > 
> > 
> > > Can we have an absolute ruling on the '10-4' query?
> > 
> > It's already in the rules. Not a token. :)

This is getting so complicated that in a go/no-go contest, it's going to be 
plain luck if anyone gets all the correct answers.
 
> '10-4' is a token.

Ok, because it has the two ' in it. I understand. But we are to ignore and strip
them, making what was a token into:

> 10-4 is not a token.

But now this "is not a token" item is a 4 char token of length 6?

What about 5'8 ?

Which is why i asked 2 days ago which order the parsing was to happen.

Kat

147. Re: Contest

Mike Nelson wrote:
> 
> > > Can we have an absolute ruling on the '10-4' query?
> >
> > It's already in the rules. Not a token. :)
> >
> 
> I would say that this is implicit in the rules (as was the rule that a string
> composed solely of quotes is not a token). I would like it explicit.
> 
> The rule is that single quotes are disregarded for comparison and display,
> but it does not explicitly state that single quotes are disregarded for
> determining whether a string is a token or a delimiter.

Well, I'm not sure if this helps but it does say that a quote is a token
character. 

> Based on Derek's ruling about the 0-length tokens, I believe that consistency
> indicates that '10-4' is not a token. By one rule  '10-4' is the same token
> as 10-4, but by another rule 10-4 is not a token.

'10-4' is a six-byte string. Due to the presence of quotes, the six-byte
string is deemed a token whose effective length is 4.

10-4 is a four-byte string. Due to the four-byte string being only composed
of digits and hyphens, it is deemed to be a delimiter.


> In the real world, I have had clients impose contradictory specifications. I
> try to argue them out of it, but often I fail. Perhaps this inconsistency is
> more realistic than a consistent rule would be. I'm fine with it either way.
> 
> When all details are ironed out, my program is ready to fly. I intend to
> submit it at the first legal moment.

I can't wait to see the submissions. I'm getting excited about this, even
though it's going to mean a lot of work for me.

Be patient with me as I'm bound to be too slow and I will make
some mistakes about the results I post.  Please just politely point
out anything that might be wrong and I'll check it out, and fix it
if required.

-- 
Derek Parnell
Melbourne, Australia

148. Re: Contest

Kat wrote:
> 
> On 30 Oct 2004, at 19:59, cklester wrote:
> 
> > 
> > posted by: cklester <cklester at yahoo.com>
> > 
> > Pete Lomax wrote:
> > > 
> > 
> > > Can we have an absolute ruling on the '10-4' query?
> > 
> > It's already in the rules. Not a token. :)
> 
> Which is a lil odd, because "a-b" is!
> 

Yes, it is a little odd, but that's one of the quirks I put in. 
 
-- 
Derek Parnell
Melbourne, Australia

149. Re: Contest

I'm a little confused:

I said:
> This is kind of a problem, I think. One solution would be to
> say in the rules, that a token should have at least 1 letter in
> it to be a valid token. A simple rule, I think, that captures the
> essence of a valid token.

Derek said:
> Thanks Tommy. Consider it done.

I think that means that Derek agrees that a token is only a valid token
if it contains at least 1 letter (letter = character between A-Z or a-z).
So, if a token doesn't contain any letters, it's not a valid token.
This means that '10-4' is not a token, because it doesn't contain any
letters.

--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com

150. Re: Contest

On 31 Oct 2004, at 1:12, Tommy Carlier wrote:

> 
> 
> posted by: Tommy Carlier <tommy.carlier at telenet.be>
> 
> I'm a little confused:
> 
> I said:
> > This is kind of a problem, I think. One solution would be to
> > say in the rules, that a token should have at least 1 letter in
> > it to be a valid token. A simple rule, I think, that captures the
> > essence of a valid token.
> 
> Derek said:
> > Thanks Tommy. Consider it done.
> 
> I think that means that Derek agrees that a token is only a valid token
> if it contains at least 1 letter (letter = character between A-Z or a-z).
> So, if a token doesn't contain any letters, it's not a valid token.
> This means that '10-4' is not a token, because it doesn't contain any
> letters.

If Derek meant for this to be a "real life exercise", he's succeeded! This
reminds me of when I received firm specs for a project, picked out hardware,
did the programming, built the boxes to house it all, demo'd the thing, and
they immediately said they wanted another 150 input lines and the sliding
window thru time had to run not an hour but the entire shift *in addition to the
hour window*, and have an end of shift printout in addition to the hourlies.
And the existing signal lines were not as spec'd when i tied into them in the
factory. I ended up building it the way i had originally wanted. The fun part
was HP and IBM had said it couldn't be done (but they wanted $50K to look
at it), and i did it on a C64.

Kat

151. Re: Contest

Kat wrote:
> 
> On 30 Oct 2004, at 23:25, Derek Parnell wrote:
> 
> > 
> > posted by: Derek Parnell <ddparnell at bigpond.com>
> > 
> > cklester wrote:
> > > 
> > > Pete Lomax wrote:
> > > > 
> > > 
> > > > Can we have an absolute ruling on the '10-4' query?
> > > 
> > > It's already in the rules. Not a token. :)
> 
> This is getting so complicated that in a go/no-go contest, it's going to be 
> plain luck if anyone gets all the correct answers.

No, I'm getting the correct answers. ;-)

This is NOT complicated, really. You guys are just reading too much into
simple rules. My code that does all this tokenizing runs to about 20 lines.
It's not so hard, honestly.

> > '10-4' is a token.
> 
> Ok, because it has the two ' in it. I understand. 

Good. 

>But we are to ignore and strip them, making what was a token into:

Who said anything about stripping off characters? I talked about
not counting quotes when determining the length, but never about
removing them. Ignoring is not removing.

'10-4' is a SIX-BYTE string. Because it has a MIXTURE of quotes and other
token characters it is a token. It has an EFFECTIVE length of 4.

> > 10-4 is not a token.

True, but why are you converting '10-4' into 10-4 ? The specs do not talk
about removing bytes from strings. If you find these 4 bytes surrounded
by spaces rather than quotes then it is a delimiter, in fact the spaces
would also be a part of the same delimiter, but that's not what we are talking
about either.

> But now this "is not a token" item is a 4 char token of length 6?
> 
> What about 5'8 ?

Again, it's a token because it is a MIXTURE of quotes and token characters.

> Which is why i asked 2 days ago which order the parsing was to happen.

I may have misunderstood this question. Sorry. Can you ask it again for me?
-- 
Derek Parnell
Melbourne, Australia

152. Re: Contest

Tommy Carlier wrote:
> 
> I'm a little confused:
> 
> I said:
> > This is kind of a problem, I think. One solution would be to
> > say in the rules, that a token should have at least 1 letter in
> > it to be a valid token. A simple rule, I think, that captures the
> > essence of a valid token.
> 
> Derek said:
> > Thanks Tommy. Consider it done.

What I meant by "Consider it done" was that I will update the rules to
further clarify what is a token and what is not. And try to keep it
simple. I'm obviously failing miserably to do this.


> I think that means that Derek agrees that a token is only a valid token
> if it contains at least 1 letter (letter = character between A-Z or a-z).

Not quite. A token character can be letters, digits, hyphen, or a quote.
The list of valid token characters has been unchanged since day one. The
problem seems to be that certain combinations of these token characters
are not real tokens. These are the *exceptions* to the rule. I wanted to
put in some exceptions so that you'd have to really think about your
approach, and it gives various different opportunities for optimization.

> So, if a token doesn't contain any letters, it's not a valid token.
> This means that '10-4' is not a token, because it doesn't contain any
> letters.


Read the rules again. I am *not* saying that at all.

-- 
Derek Parnell
Melbourne, Australia

153. Re: Contest

On 31 Oct 2004, at 1:40, Derek Parnell wrote:

> 
> 
> posted by: Derek Parnell <ddparnell at bigpond.com>
> 
> Kat wrote:
> > 
> > On 30 Oct 2004, at 23:25, Derek Parnell wrote:
> > 
> > > 
> > > posted by: Derek Parnell <ddparnell at bigpond.com>
> > > 
> > > cklester wrote:
> > > > 
> > > > Pete Lomax wrote:
> > > > > 
> > > > 
> > > > > Can we have an absolute ruling on the '10-4' query?
> > > > 
> > > > It's already in the rules. Not a token. :)
> > 
> > This is getting so complicated that in a go/no-go contest, it's going to be
> > plain luck if anyone gets all the correct answers.
> 
> No, I'm gettting the correct answers. blink 
> 
> This is NOT complicated, really. You guys are just reading too much into
> simple rules. My code that does all this tokenizing runs to about 20 lines.
> It's not so hard, honestly.
> 
> > > '10-4' is a token.
> > 
> > Ok, because it has the two ' in it. I understand. 
> 
> Good. 
> 
> >But we are to ignore and strip them, making what was a token into:
> 
> Who said anything about stripping off characters? I talked about
> not counting quotes when determining the length, but never about
> removing them. Ignoring is not removing.

You have said :
it's = its
if that's not stripping or ignoring them, i don't know the meaning of the words.

> '10-4' is a SIX-BYTE string. Because it has a MIXTURE of quotes and other
> token characters it is a token. It has an EFFECTIVE length of 4.
> 
> > > 10-4 is not a token.
> 
> True, but why are you converting '10-4' into 10-4 ? 

Again, because its = it's.

> The specs do not talk
> about removing bytes from strings. 

And that is confusing.

> If you find these 4 bytes surrounded
> by spaces rather than quotes then it is a delimiter, 

or with leading or trailing -

> in fact the spaces
> would also be a part of the same delimiter, but that's not we are talking
> about either. 


 
> > But now this "is not a token" item is a 4 char token of length 6?
> > 
> > What about 5'8 ?
> 
> Again, its a token because it is a MIXTURE of quotes and token characters.
> 
> > Which is why i asked 2 days ago which order the parsing was to happen.
> 
> I may have misunderstood this question. Sorry. Can you ask it again for me? --

Ok........

if we strip the ' out of it's, then check its length, it's 3 bytes long.
if we check length, and then remove the ', it's 4 bytes long.

something must be done with the ' to make it's equal its, and keeping it 
means both its and it's will be listed in the list of valid tokens.

Kat

154. Re: Contest

> > > > 10-4 is not a token.
> >
> > True, but why are you converting '10-4' into 10-4 ?
> 
> Again, because its = it's.
> 
> > The specs do not talk
> > about removing bytes from strings.
> 
> And that is confusing.

Derek, I'd say the principal source of confusion comes from:
"For the purposes of comparison and display, quotes are ignored in any token."

So, we should compare and display our tokens without quotes, but we
shouldn't remove the quote bytes from tokens when we store them
internally? Why *wouldn't* we?

"Thus the tokens Ma'am, maam and 'MAAM' are considered to be the same
4-character token." So, the quote mark should be ignored, but what...
we should still store a copy of "ma'am" and "maam" separately, but
combine them when we go to calculate statistics?

You can't specify the internal workings of the entries... only the
external effects.

-- 
MrTrick

155. Re: Contest

Kat wrote:
> 
> On 31 Oct 2004, at 1:40, Derek Parnell wrote:
> 
> > 
> > posted by: Derek Parnell <ddparnell at bigpond.com>
> > 
> > Kat wrote:
> > > 
> > > On 30 Oct 2004, at 23:25, Derek Parnell wrote:
> > > 
> > > > 
> > > > posted by: Derek Parnell <ddparnell at bigpond.com>
> > > > 
> > > > cklester wrote:
> > > > > 
> > > > > Pete Lomax wrote:
> > > > > > 
> > > > > 
> > > > > > Can we have an absolute ruling on the '10-4' query?
> > > > > 
> > > > > It's already in the rules. Not a token. :)
> > > 
> > > This is getting so complicated that in a go/no-go contest, it's going to
> > > be plain luck if anyone gets all the correct answers.
> > 
> > No, I'm gettting the correct answers. blink 
> > 
> > This is NOT complicated, really. You guys are just reading too much into
> > simple rules. My code that does all this tokenizing runs to about 20 lines.
> > It's not so hard, honestly.
> > 
> > > > '10-4' is a token.
> > > 
> > > Ok, because it has the two ' in it. I understand. 
> > 
> > Good. 
> > 
> > >But we are to ignore and strip them, making what was a token into:
> > 
> > Who said anything about stripping off characters? I talked about
> > not counting quotes when determining the length, but never about
> > removing them. Ignoring is not removing.
> 
> You have said :
> it's = its
> if that's not stripping or ignoring them, i don't know the meaning of the
> words.

Kat, if I say anything hurtful in this reply, please excuse me. I'm not
intending to do that. I'm frustrated at myself for not being a great
explainer. Given that, here we go ...

The actual quotation from the rules is ...

For the purposes of comparison and display, quotes are ignored in any
token. You can think of a quote as a zero-length token character. When
determining the effective length of a token string, it is the sum of
the lengths of each token character, and all token characters except
quote have a length of 1. For example it's and its are the same, 
'heaven' and heaven are the same token. A token's length does not
include any quotes in the count. Thus the tokens Ma'am, maam and
'MAAM' are considered to be the same 4-character token.

...
Notice the context. I'm talking about the effective length of a token.
I'm sorry (again) that I'm failing to clearly explain *my* rules. 

Would it help if I said "it's and its are *equivalent* tokens"?
They both have an effective length of 3. They compare as equals.

Obviously they are not the same strings. But FOR THE PURPOSES OF THIS
CONTEST, AND THIS CONTEST ONLY, they are deemed to be equivalent. 

That's the rule. Just get over it. You might like it to be a different
rule, but it isn't. 

Above you say "if that's not stripping or ignoring them ...". I believe 
that stripping them off and ignoring them are two different things.
I am telling you that the rules say they are to be ignored when
counting the length of a token and when comparing them. Please just take
this as a requirement. Don't question it. It just is. 

I agree that the phrase "purposes of comparison and display" in the rules
is confusing. Hopefully the 'comparison' part isn't confusing, but
what I meant about the display is that I don't care if you display
the token with or without the quotes; however, don't go displaying
BOTH "it's" and "its". Pick one of the variants when displaying
the token. I don't really care which variant you choose.
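
One possible way to get "equivalent for comparison, one variant for display" is sketched below in Python. This is only an illustration under stated assumptions, not the contest's reference code: the names token_key and count_tokens are invented here, the input is assumed to be strings already accepted as tokens, and case is folded because the rules treat Ma'am, maam and 'MAAM' as the same token. The first spelling seen is the one kept for display, which is an arbitrary choice.

```python
from collections import Counter


def token_key(token: str) -> str:
    """Comparison key: ignore quotes and letter case (hypothetical helper)."""
    return token.replace("'", "").lower()


def count_tokens(tokens):
    """Count equivalent tokens together, keeping the first-seen spelling for display."""
    counts = Counter()
    display = {}
    for tok in tokens:
        key = token_key(tok)
        counts[key] += 1
        display.setdefault(key, tok)  # remember only the first variant
    return {display[key]: n for key, n in counts.items()}


result = count_tokens(["it's", "its", "Ma'am", "maam", "'MAAM'", "'10-4'"])
print(result)
```

The key is used only for grouping. Whether a string is a token at all is decided earlier, on the raw bytes, so '10-4' stays a token even though its key is 10-4.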

> 
> > '10-4' is a SIX-BYTE string. Because it has a MIXTURE of quotes and other
> > token characters it is a token. It has an EFFECTIVE length of 4.
> > 
> > > > 10-4 is not a token.
> > 
> > True, but why are you converting '10-4' into 10-4 ? 
> 
> Again, because its = it's.

No, no, no.  Because of the mixture of quotes and digits, it's a token.
So you don't have to go and re-examine it after ignoring the quotes. 
You have already determined that it is a token. 

After finding '10-4' you ask yourself - is this a token? Yes it is, so
move to find the next token. Don't strip off the quotes and then say, well
is it still a token? Instead, ignore the quotes, save it in your token
store, and move on to the next one.

> > The specs do not talk
> > about removing bytes from strings. 
> 
> And that is confusing.

Sorry. But hopefully the above helps de-confuse the rules.

> > If you find these 4 bytes surrounded
> > by spaces rather than quotes then it is a delimiter, 
> 
> or with leading or trailing -

Yes, but that's not the topic. Stay with the context.

> > in fact the spaces
> > would also be a part of the same delimiter, but that's not we are talking
> > about either. 
> 
> 
> > > But now this "is not a token" item is a 4 char token of length 6?
> > > 
> > > What about 5'8 ?
> > 
> > Again, it's a token because it is a MIXTURE of quotes and token characters.
> > 
> > > Which is why i asked 2 days ago which order the parsing was to happen.
> > 
> > I may have misunderstood this question. Sorry. Can you ask it again for me?
> > --
> 
> Ok........
> 
> if we strip the ' out of it's, then check it's length, it's 3 bytes long.
> if we check length, and then remove the ', it's 4 bytes long.

Don't strip off the quotes then. Just ignore them.

PSEUDO CODE ::: 
DO NOT ATTEMPT THIS IN YOUR PROGRAM AS THERE ARE MUCH BETTER WAYS TO DO IT.

  grab bytes until you get to a non-token character.
  for each byte in potential_token
     if byte is not "'" then
        add 1 to effective_length
     end if
  end for
  if effective_length > 0 and effective_length <= 20 then
     if first byte is "-" or last byte is "-" then
        mark this as a delimiter
     otherwise
        if any byte in potential_token is alphabetic or "'" then
            mark this as a real_token
        otherwise
            mark this as a delimiter
        end if
     end if
  otherwise
     mark this as a delimiter.
  end if
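
For readers who prefer something runnable, the pseudo code above can be
sketched in Python roughly as follows. This is only an illustration of the
same steps, not a contest entry: the names effective_length, classify and
MAX_EFFECTIVE_LENGTH are made up for this example, "alphabetic" is assumed
to mean ASCII A-Z and a-z, and the run of token characters is assumed to
have been grabbed already.

```python
MAX_EFFECTIVE_LENGTH = 20  # upper bound taken from the pseudo code


def effective_length(run: str) -> int:
    """Count the bytes in the run, ignoring (not removing) quotes."""
    return sum(1 for ch in run if ch != "'")


def classify(run: str) -> str:
    """Return "token" or "delimiter" for a run of token characters."""
    n = effective_length(run)
    if not (0 < n <= MAX_EFFECTIVE_LENGTH):
        return "delimiter"
    # A leading or trailing hyphen turns the whole run into a delimiter.
    if run[0] == "-" or run[-1] == "-":
        return "delimiter"
    # Any letter or any quote anywhere in the run makes it a real token.
    if any(ch == "'" or (ch.isascii() and ch.isalpha()) for ch in run):
        return "token"
    return "delimiter"
```

With this sketch, classify("'10-4'") gives "token" with an effective length
of 4, while a bare 10-4 comes out as a delimiter. The run itself is never
modified; the quotes are only skipped while counting.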


> something must be done with the ' to make it's equal its, and keeping it 
> means both its and it's will be listed in the list of valid tokens.

That is part of the puzzle you must work out how to implement. It is 
possible to do because I've done it, and it looks like others have too.

-- 
Derek Parnell
Melbourne, Australia

156. Re: Contest

Patrick Barnes wrote:
> 
> > > > > 10-4 is not a token.
> > >
> > > True, but why are you converting '10-4' into 10-4 ?
> > 
> > Again, because its = it's.
> > 
> > > The specs do not talk
> > > about removing bytes from strings.
> > 
> > And that is confusing.
> 
> Derek, I'd say the principal source of confusion comes from:
> "For the purposes of comparison and display, quotes are ignored in any token."

Agreed. I'll fix that up tonight.

> So, we should compare and display our tokens without quotes, but we
> shouldn't remove the quotes byte from tokens when we store it
> internally? Why *wouldn't* we?

Frankly, I don't care whether you internally remove quotes or not. That
could be an implementation strategy. *ALL* I'm saying, and no more than
this, is that when you compare strings, disregard any quotes. How you
do that is your concern. For example, one might remove quotes *after*
determining that you've got a token. Some other people might keep
alternate lists... there are many variations on this theme. Some work
better than others. 

As for displaying, I don't care if you display quotes or not really, 
but don't go and display both variants.


> "Thus the tokens Ma'am, maam and 'MAAM' are considered to be the same
> 4-character token." So, the quote mark should be ignored, but what...
> we should still store a copy of "ma'am" and "maam" separately, but
> combine them when we go to calculate statistics?
> 
> You can't specify the internal workings of the entries... only the
> external effects.

I'm *not* specifying the implementation, just the requirements. I don't
care if you store "ma'am" and "maam" separately or not. What I care about
is the results. I expect that every submission will solve these weird
and cantankerous rules in their own unique manner.
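
As one illustration of the "many variations" mentioned above, here is a
minimal Python sketch of a single strategy: compare tokens through a key
that disregards quotes, and remember one variant per key for display. It
assumes case is ignored too, as the Ma'am / maam / 'MAAM' example quoted
from the rules suggests. Keeping the first-seen variant is an arbitrary
choice, and the names compare_key and tally are hypothetical.

```python
from collections import Counter


def compare_key(token: str) -> str:
    """Key used only for comparison: quotes dropped, case folded."""
    return token.replace("'", "").upper()


def tally(tokens):
    """Count tokens by comparison key and keep one display variant each."""
    counts = Counter()
    display = {}
    for tok in tokens:
        key = compare_key(tok)
        counts[key] += 1
        # Remember only the first spelling seen, so "it's" and "its"
        # are never both displayed.
        display.setdefault(key, tok)
    return counts, display
```

Other approaches, such as dropping the quotes once a token has been
accepted, should give the same counts; only the displayed spelling would
differ.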

-- 
Derek Parnell
Melbourne, Australia

157. Re: Contest

From the rules page:
> Final results will probably be posted by December 6th, 2004.

Is it a coincidence, or does this date have a meaning for you?
The 6th of December happens to be a holiday in Belgium. It's the
day that Sinterklaas (our version of Santa Claus) and Zwarte
Piet (Black Pete) cross the country at night, and leave presents
for the children.

--
tommy online: http://users.telenet.be/tommycarlier
tommy.blog: http://tommycarlier.blogspot.com
Euphoria Message Board: http://uboard.proboards32.com
Empire for Euphoria: http://empire.iwireweb.com

158. Re: Contest

On 31 Oct 2004, at 9:06, Tommy Carlier wrote:

> 
> 
> posted by: Tommy Carlier <tommy.carlier at telenet.be>
> 
> From the rules page:
> > Final results will probably be posted by December 6th, 2004.
> 
> Is it a coincidence, or does this date have a meaning for you?
> The 6th of December happens to be a holiday in Belgium. It's the
> day that Sinterklaas (our version of Santa Claus) and Zwarte
> Piet (Black Pete) cross the country at night, and leave presents
> for the children.

Are you implying you have been a good little girl and/or boy? 

Kat

159. Re: Contest

Here are my results for win32lib.ew, v0.60.6:

Mike Nelson win32lib.ew
Total:  124680, Unique:    7530
01 THE                  4213
02 IF                   3878
03 ID                   2818
04 END                  2556
05 THEN                 2216
06 I                    1764
07 A                    1655
08 TO                   1603
01 6136
02 20045
03 17633
04 20540
05 11108
06 12580
07 11247
08 9005
09 5024
10 3847
11 2605
12 1480
13 1382
14 766
15 558
16 265
17 127
18 178
19 96
20 58
