Re: Bug in new get()?
- Posted by CChris <christian.cuvier at ag?icu?ture.gouv.fr> Aug 17, 2007
- 534 views
I was in a hurry when I wrote the first reply, so here are some clarifications.

Jason Gade wrote:
> Okay, I'm writing a stub to test your new get.e with regards to sanity.ex, so
> I can fix sanity.ex.
>
> And I'm looking at your docs. I don't quite understand them and I have a
> change suggestion.
>
> Your docs say:
> get() returns a 4 element sequence, like value() does:
>   a status code (success/error/end of file),
>   the value just read (meaningful only when the status code is GET_SUCCESS),
>   the number of characters read,
>   the number of leading whitespace characters.
>
> I'm getting a bunch of 1s, -1s, or 0s for the last two elements. In fact I'm
> not getting any other values.

The run of 0s in the third position was a bug, now corrected; value() did not have this bug. The run of -1s in the last position should have been a run of 0s, which is hardly more useful. This is because nearly all the test values start without any leading whitespace. With leading whitespace, you'd get more useful output. I'll add some.

> Are these really useful return values? Maybe a different function to read
> those last two numbers would be better (state kept in a file variable of
> course)?

What do you mean by "file variable"? As I said in my previous post, I don't like the idea of having too many routine names that do "almost the same, but ..." things. C's stdlib went that way; there are indeed some good reasons for doing so, but the drawbacks are obvious.

> Plus your comment at the top of Get2() doesn't seem to reflect reality: you
> say that Get2() will return a 2-element sequence unless record_whitespace flag
> is set. Get2() seems to return 4 elements regardless.

Actually, the comments of Get() and Get2() need to be swapped; I will do that. Bear in mind that they are local routines anyway. It is quite exceptional for local routines to be commented in the standard files, which is perhaps a bad thing; I'm not sure.
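To make the four-element result discussed above concrete, here is a toy sketch in Python (not Euphoria, and not the get.e code). The function name value4, the numeric status values, and the choice to include leading whitespace in the character count are all assumptions made for illustration only. It parses a single unsigned integer and shows why the last element is 0 whenever the input has no leading whitespace.

```python
# Toy stand-in, NOT the get.e implementation: parses one unsigned integer.
# The numeric status values below are assumed for illustration.
GET_SUCCESS, GET_FAIL, GET_EOF = 0, 1, -1

WHITESPACE = " \t\r\n"
DIGITS = "0123456789"

def value4(text):
    """Return (status, value, chars_read, leading_whitespace) for `text`."""
    i = 0
    # Count and skip leading whitespace.
    while i < len(text) and text[i] in WHITESPACE:
        i += 1
    leading = i
    if i == len(text):
        # Nothing but whitespace: treat as end of input.
        return (GET_EOF, 0, i, leading)
    start = i
    while i < len(text) and text[i] in DIGITS:
        i += 1
    if i == start:
        # First non-blank character is not a digit.
        return (GET_FAIL, 0, i, leading)
    # Assumption: chars_read counts the leading whitespace too.
    return (GET_SUCCESS, int(text[start:i]), i, leading)

print(value4("42"))       # no leading whitespace, so the last element is 0
print(value4("   42 x"))  # the three leading blanks are counted and reported
```

With test data that never starts with blanks, the fourth element can only ever be 0, which matches the uninformative output described above.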
> I would see more value in returning (in this order) "success or failure",
> "sequence of interest", "sequence containing invalid characters until next
> valid character (not read)" and "total characters read". And probably not
> even that, as I would prefer to see it separated from get(). get()'s pointer,
> of course, would point to the next valid character.
>
> So the optimal situation in my opinion would be to leave get() returning a
> two-element sequence as now, but add get_leading_whitespace_count(),
> get_chars_read(), and get_last_invalid(). Obviously, those names suck, but
> the point still stands.
>
> Same for value().

I answered the previous two points in my previous post.

Actually, I don't think that the symmetry between get() and value() is very deep. With value(), you can still inspect the input string easily. With get(), it is not as easy: you need to seek() back to where you were, then get_bytes() the same number of characters you read, which lets you inspect the string that was just read and brings you back to the same file position. On some devices, this may be slow or impossible. So the idea of returning the whole string at some point (it will stop right before the first invalid character anyway) may make sense for get(), but not as much for value().

Conversely, while value_from() can be just as useful as find_from() is (and easier to implement), get_from() would just be seek() followed by get(), so it would hardly be useful.

CChris

> --
> A complex system that works is invariably found to have evolved from a simple
> system that works.
> --John Gall's 15th law of Systemantics.
>
> "Premature optimization is the root of all evil in programming."
> --C.A.R. Hoare
>
> j.
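A footnote on the get() versus value() point above: the "seek back, then re-read the same amount" step can be sketched as follows. This is Python rather than Euphoria, using an in-memory stream as a stand-in for a file; the helper name reread_last is hypothetical, and Python's seek()/read() merely play the roles that seek() and get_bytes() play in the text.

```python
import io

def reread_last(f, chars_read):
    """Return the chars_read bytes just consumed, leaving the position unchanged."""
    end = f.tell()                # position right after the read
    f.seek(end - chars_read)      # go back to where the read began
    raw = f.read(chars_read)      # re-read the same amount; lands back at `end`
    return raw

f = io.BytesIO(b"  123 abc")
f.read(5)                         # pretend a get()-style reader consumed "  123"
print(reread_last(f, 5))          # the bytes that were just consumed
print(f.tell())                   # position is the same as before the re-read
```

On a stream that cannot seek (a pipe, for instance), the seek() call fails, which is the "slow or impossible" case; a reader that hands back the consumed string itself would avoid the round trip.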