Re: Bug in new get()?

I was in a hurry when writing the first reply, so here are some clarifications.


Jason Gade wrote:
> 
> Okay, I'm writing a stub to test your new get.e with regards to sanity.ex, so
> I can fix sanity.ex.
> 
> And I'm looking at your docs. I don't quite understand them and I have a
> change suggestion.
> 
> Your docs say:
>     get() returns a 4 element sequence, like value() does:
>     a status code (success/error/end of file),
>     the value just read (meaningful only when the status code is
>     GET_SUCCESS), 
>     the number of characters read, 
>     the number of leading whitespace characters. 
> 
> I'm getting a bunch of 1s, -1s, or 0s for the last two elements. In fact I'm
> not getting any other values.
> 

The run of 0s in the third position was a bug, now corrected. value() didn't
have this bug.
The run of -1s in the last position should have been a run of 0s, which is
hardly more useful. This is because nearly all the test values start without
any leading whitespace. With leading whitespace you'd get more useful output,
so I'll add some test values that have it.
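
To make the four elements concrete, here is a minimal sketch in Python (not the actual get.e code). It assumes the status codes are 0 for success, -1 for end of input and 1 for failure, it only handles decimal integers, and the name value_sketch is made up for the illustration.

```python
# Illustrative only: a value()-like parser that reports the same four elements.
# The constants below are assumptions chosen for this sketch.
GET_SUCCESS, GET_EOF, GET_FAIL = 0, -1, 1

def value_sketch(text):
    """Parse one decimal integer from text.

    Returns [status, value, chars_read, leading_whitespace].
    """
    pos = 0
    # Count leading whitespace before the value starts.
    while pos < len(text) and text[pos] in " \t\r\n":
        pos += 1
    leading_ws = pos
    if pos == len(text):
        return [GET_EOF, 0, pos, leading_ws]
    start = pos
    if text[pos] in "+-":
        pos += 1
    digits_start = pos
    while pos < len(text) and text[pos] in "0123456789":
        pos += 1
    if pos == digits_start:
        # No digits found: the value element is not meaningful here.
        return [GET_FAIL, 0, pos, leading_ws]
    return [GET_SUCCESS, int(text[start:pos]), pos, leading_ws]

print(value_sketch("17"))      # no leading whitespace, so the last element is 0
print(value_sketch("  42,"))   # two leading blanks, four characters consumed
```

With input that has no leading whitespace the last element is always 0, which is why the test output looked so uniform.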

> Are these really useful return values? Maybe a different function to read
> those last two numbers would be better (state kept in a file variable of
> course)?
> 

What do you mean by this "file variable"?
As I said in my previous post, I don't like the idea of having too many routine
names to do "almost similar but ..." things. C's stdlib went that way. There
are indeed some good reasons for doing so, but the drawbacks are obvious.

> Plus your comment at the top of Get2() doesn't seem to reflect reality: you
> say that Get2() will return a 2-element sequence unless record_whitespace flag
> is set. Get2() seems to return 4 elements regardless.
> 

Actually, the comments of Get() and Get2() need to be swapped. I will do that.
Bear in mind that they are local routines anyway. It is quite exceptional for
local routines to be commented in the standard files - perhaps a bad thing, not
sure.

> I would see more value in returning (in this order) "success or failure",
> "sequence of interest", "sequence containing invalid characters until next
> valid character (not read)" and "total characters read". And probably not
> even that, as I would prefer to see it separated from get(). get()'s pointer,
> of course, would point to the next valid character.
> 
> So the optimal situation in my opinion would be to leave get() returning a
> two-element sequence as now, but add get_leading_whitespace_count(),
> get_chars_read(), and get_last_invalid(). Obviously, those names suck, but
> the point still stands.
> 
> Same for value().
> 

I answered the previous two points in my previous post.

Actually, I don't think that the symmetry between get() and value() is very
deep. With value(), you can still inspect the input string easily. With get(),
it is not as easy: you need to seek() back to where you were, and then
get_bytes() the same number of characters you read, so as to inspect the string
that was just read and end up back at the current file position. On some
devices, this may be slow or impossible.
So the idea of returning at some point the whole string, which will stop right
before the first invalid character anyway, may make sense for get(), but not as
much for value(). Conversely, while value_from() can be just as useful as
find_from() is (and easier to implement), get_from() would just be seek()
followed by get(), so it would hardly be useful.
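
The asymmetry can be sketched as follows. This is Python used as illustration, not Euphoria and not the get.e implementation: get_sketch and reread_last are hypothetical names, only unsigned decimal integers are handled, and the status codes (0, -1, 1) are assumed for the example. Python's tell(), seek() and read() stand in for the file-position, seek() and get_bytes() steps described above.

```python
import io

# Assumed status codes for this sketch.
GET_SUCCESS, GET_EOF, GET_FAIL = 0, -1, 1

def get_sketch(stream):
    """Read one unsigned integer from a seekable binary stream.

    Returns [status, value, chars_read, leading_whitespace] and leaves the
    stream positioned right before the first invalid byte.
    """
    chars_read = 0
    leading_ws = 0
    digits = b""
    hit_eof = False
    while True:
        ch = stream.read(1)
        if ch == b"":
            hit_eof = True
            break
        if not digits and ch in b" \t\r\n":
            leading_ws += 1
            chars_read += 1
        elif ch in b"0123456789":
            digits += ch
            chars_read += 1
        else:
            stream.seek(-1, io.SEEK_CUR)  # leave the invalid byte unread
            break
    if digits:
        return [GET_SUCCESS, int(digits), chars_read, leading_ws]
    return [GET_EOF if hit_eof else GET_FAIL, 0, chars_read, leading_ws]

def reread_last(stream, chars_read):
    """Seek back over what was just consumed and read it again as raw bytes."""
    end = stream.tell()
    stream.seek(end - chars_read)
    # Reading chars_read bytes puts the position back at `end`.
    return stream.read(chars_read)

text = "  123,rest"
f = io.BytesIO(text.encode("ascii"))
result = get_sketch(f)
raw_from_file = reread_last(f, result[2])   # needs tell + seek + read
raw_from_string = text[:result[2]]          # with a string, a slice is enough
```

The file case costs two extra positioning operations and a second read, and it only works when the underlying device supports seeking; the string case is a plain slice.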

CChris

> --
> A complex system that works is invariably found to have evolved from a simple
> system that works.
> --John Gall's 15th law of Systemantics.
> 
> "Premature optimization is the root of all evil in programming."
> --C.A.R. Hoare
> 
> j.
