Re: Contest Update
- Posted by Jason Gade <jaygade at yahoo.com> Nov 07, 2004
Derek Parnell wrote:
>
> Jason Gade wrote:
> > I also don't understand how it failed on file8 (War and Peace) since I've
> > been testing this file (or one similar) all through the development.
>
> Don't know either. I'm not with the machine I tested the submission on
> at the moment so I can't inspect the detailed results. However, because
> you got all the top-used tokens correct, I'm going to guess that its the
> 'funny' tokens you are having problems with. My copy of file8 has been
> doctored somewhat to include some odd looking tokens.
>
> Check again for tokens that might contain quotes and/or hyphens,
> especially at the start or end of a string. Also, strings just made
> up of quotes gave my program problems at first.
>
> > Hmmm... Difficult to troubleshoot when I don't know what my program is doing
> > wrong.
> >
> > Any troubleshooting hints anyone?
>
> I'll give a hint that some people may have tripped up on.
>
> A file opened as "text" will appear to prematurely end if it contains
> the End-Of-File marker for text files.
>
> --
> Derek Parnell
> Melbourne, Australia
>

Okay. So in my testing I made a file that contained edge cases identified in the rules and they were counted correctly. Also I do open the file in binary mode, so... hmm. I may need to think of some new edge cases to test for.

Currently, the program follows these rules:
-- words consist of upper and lower case letters, digits 0-9, single quote and dash;
-- for the purposes of comparison, case does not matter and quotes are not counted;
-- words consisting of only digits, or digits and dashes, are not counted as words unless they are quoted;
-- words of zero length after quotes are removed are not counted.

If I am interpreting the rules correctly, I will try to come up with a new (short) test file to validate with.

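To make the four rules above concrete, here is a minimal sketch of them. It is written in Python purely for illustration and is not the actual submission. The names `TOKEN_RE`, `normalize`, `count_words` and `count_file` are made up for this example. Two points are assumptions on my part: "quoted" is taken to mean that the token contains at least one single quote, and a token made up only of dashes (no digits) is still counted as a word, since the rules as written only exclude digits and digit/dash mixes. The file is read in binary mode, so an End-Of-File marker byte (0x1A) in the middle of the data just acts as a separator.

```python
import re
from collections import Counter

# Rule 1: a word is a run of letters, digits, single quotes, and dashes.
# Every other byte (including 0x1A, the text-mode EOF marker) separates words.
TOKEN_RE = re.compile(rb"[A-Za-z0-9'-]+")


def normalize(raw: bytes):
    """Return the comparison key for a raw token, or None if it is not counted."""
    text = raw.decode("ascii")           # safe: TOKEN_RE only matches ASCII bytes
    quoted = "'" in text                 # ASSUMPTION: "quoted" = contains a quote
    key = text.replace("'", "").lower()  # rule 2: ignore case and quotes
    if not key:
        return None                      # rule 4: nothing left after removing quotes
    only_digits_and_dashes = all(ch.isdigit() or ch == "-" for ch in key)
    has_digit = any(ch.isdigit() for ch in key)
    if only_digits_and_dashes and has_digit and not quoted:
        return None                      # rule 3: unquoted numbers are not words
    return key


def count_words(data: bytes) -> Counter:
    """Count words in raw bytes according to the four rules."""
    counts = Counter()
    for raw in TOKEN_RE.findall(data):
        key = normalize(raw)
        if key is not None:
            counts[key] += 1
    return counts


def count_file(path: str) -> Counter:
    """Read the whole file in binary mode so an embedded EOF byte cannot truncate it."""
    with open(path, "rb") as handle:
        return count_words(handle.read())
```

Feeding `count_words` a short edge-case string such as `b"The the THE don't '1984' 1984 12-34 ''"` should count "the" three times and "dont" and "1984" once each, with the unquoted numbers and the quote-only token dropped. If the judge's results differ on a case like this, the interpretation of "quoted" is the first thing I would suspect.
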
I wish now that I had saved the version of your web page that had your unique counts and total counts for each file posted -- at least then it would be easier to compare with. It kind of sucks that the calibration file works perfectly but the others do not!! ;^)

------------------------------------
Too many freaks, not enough circuses.
j.