Re: Bug in new get()?
- Posted by Jason Gade <jaygade at yah?o.?om> Aug 17, 2007
CChris wrote:
> I was in a hurry when writing the first reply, some precisions now.
>
> Jason Gade wrote:
> > Okay, I'm writing a stub to test your new get.e with regards to
> > sanity.ex, so I can fix sanity.ex.
> >
> > And I'm looking at your docs. I don't quite understand them and I have
> > a change suggestion.
> >
> > Your docs say:
> > get() returns a 4 element sequence, like value() does:
> > a status code (success/error/end of file),
> > the value just read (meaningful only when the status code is GET_SUCCESS),
> > the number of characters read,
> > the number of leading whitespace characters.
> >
> > I'm getting a bunch of 1s, -1s, or 0s for the last two elements. In fact
> > I'm not getting any other values.
>
> The bunch of 0s in 3rd position is a bug, corrected. value() didn't have
> this bug.
> The bunch of -1s in the last position should be a bunch of 0s, hardly more
> useful.
> This is because nearly all the test values start without any leading
> whitespace. Otherwise, you'd get more useful output - I'll add some.

Okay. I don't know if I'll get a chance to test tonight... Actually I can probably just download get.e and sanity.ex and test them here at work.

> > Are these really useful return values? Maybe a different function to read
> > those last two numbers would be better (state kept in a file variable of
> > course)?
>
> What do you mean by this "file variable"?

"file variable" -- a variable defined at the top level of a file. It can hold state but is not visible to routines outside of that file. Also known as a "static variable".

> As I said in my previous post, I don't like the idea of having too many
> routine names to do "almost similar but ..." things. C's stdlib went that
> way, there are some good reasons to do this indeed, but the drawbacks are
> obvious.

Right.

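To make the "file variable" idea concrete, here is a minimal sketch of a reader that returns only a status and a value, and keeps the extra counts in file-level state that separate accessor routines expose. It is written in Python purely for illustration, not in Euphoria, and it is not the get.e implementation: get_int(), the two accessor names, the status-code values, and the integer-only parsing are all assumptions made for the example.

```python
import io

# Status codes named after the ones in the thread; the values are placeholders.
GET_SUCCESS, GET_EOF, GET_FAIL = 0, -1, 1

# "File variables": top-level state, private to this file by convention.
_chars_read = 0
_leading_whitespace = 0


def get_int(stream):
    """Hypothetical reader: return (status, value) for the next integer token."""
    global _chars_read, _leading_whitespace
    _chars_read = 0
    _leading_whitespace = 0

    # Skip and count leading whitespace.
    ch = stream.read(1)
    while ch != "" and ch.isspace():
        _leading_whitespace += 1
        _chars_read += 1
        ch = stream.read(1)
    if ch == "":
        return (GET_EOF, 0)

    # Collect characters up to the next whitespace or end of input.
    token = ""
    while ch != "" and not ch.isspace():
        token += ch
        _chars_read += 1
        ch = stream.read(1)
    if ch != "":
        # Un-read the delimiter so the next call sees it (works for StringIO).
        stream.seek(stream.tell() - 1)

    try:
        return (GET_SUCCESS, int(token))
    except ValueError:
        return (GET_FAIL, 0)


def get_chars_read():
    """Characters consumed by the most recent get_int() call."""
    return _chars_read


def get_leading_whitespace_count():
    """Leading whitespace characters skipped by the most recent get_int() call."""
    return _leading_whitespace


source = io.StringIO("  42 x")
status, value = get_int(source)
print(status, value, get_chars_read(), get_leading_whitespace_count())
```

The trade-off is the one raised in the thread: callers who only want a status and a value never see the extra numbers, but the interface grows by two routine names, and the stored counts are overwritten by every subsequent call.
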
I remember discussing on the list whether value() and get() should read and ignore comments like they do whitespace but I don't remember discussing the extra return values. I might have missed the discussion. I don't really have a problem with it -- it just seems like extraneous information to me. get() and value() should return a status and a value. Maybe some other functions should return extended state information. But I don't really care either way.

> > Plus your comment at the top of Get2() doesn't seem to reflect reality:
> > you say that Get2() will return a 2-element sequence unless
> > record_whitespace flag is set. Get2() seems to return 4 elements
> > regardless.
>
> Actually the comments of Get() and Get2() need to be swapped. Will do that.
> Bear in mind they are local routines anyway. It is quite exceptional for
> local routines to be commented in the standard files - perhaps a bad thing,
> not sure.

No, comments are a good thing! As long as they accurately reflect the code. So -- you mean to say that Get2() (and by extension get() since it returns Get2()) will now always return a length 4 sequence?

> > I would see more value in returning (in this order) "success or failure",
> > "sequence of interest", "sequence containing invalid characters until next
> > valid character (not read)" and "total characters read". And probably not
> > even that, as I would prefer to see it separated from get(). get()'s
> > pointer, of course, would point to the next valid character.
> >
> > So the optimal situation in my opinion would be to leave get() returning
> > a two-element sequence as now, but add get_leading_whitespace_count(),
> > get_chars_read(), and get_last_invalid(). Obviously, those names suck,
> > but the point still stands.
> >
> > Same for value().
>
> Answered the previous two points in previous post.
>
> Actually, I don't think that the symmetry between get() and value() is very
> deep.
> With value(), you still can inspect the input string easily. With get(),
> it is not as easy - you need to seek() back where you were, and then
> get_bytes() the same amount you read so as to inspect the string that was
> just read and return to file position. On some devices, this may be slow
> or impossible.
> So, the idea of returning at some point the whole string, which will stop
> right before the first invalid character anyway, may make sense for get(),
> but not as much for value(). Conversely, while value_from() can be just as
> useful as find_from() is (and easier to implement), get_from() would just
> be seek() followed by get(), so it would hardly be useful.
>
> CChris

--
A complex system that works is invariably found to have evolved from a simple system that works.
--John Gall's 15th law of Systemantics.

"Premature optimization is the root of all evil in programming."
--C.A.R. Hoare

j.