Re: Bug in get() and value(): embedded comments
- Posted by CChris <christian.cuvier at agriculture.g?uv.?r> Jul 26, 2007
Matt Lewis wrote:
>
> CChris wrote:
> >
> > Jason Gade wrote:
> > >
> > > CChris wrote:
> > > > * In the comments, it reads:
> > > > "After reading one valid representation of a Euphoria object, ..."
> > > > So now there is the possibility of several valid representations
> > > > for an object; otherwise, "the" would have been used again instead.
> > > >
> > > > And indeed, value() can cope with spaces inside a string that
> > > > represents a sequence, so that the interpretation from the Comments
> > > > section would seem to be the one to take into account.
> > >
> > > BTW, I couldn't find this comment in the source of get.e.
> > >
> > > I would assume, though, that the word "one" above was used as opposed
> > > to the word "two" or "three" or whatever. I really don't think that it
> > > was intended to imply more than one valid representation of an object.
> > >
> >
> > The number of valid representations of a sequence is almost infinite.
> > Add spaces and tabs wherever you want, it's still valid. So "one" really
> > stands for "one of many".
>
> I think you're parsing the sentence incorrectly (and it's somewhat vague).
>
> I believe that the meaning of "one valid representation" was in the sense
> of, "it will only read one object, and then it will stop." It really has
> nothing to do with the number of ways that you could alter the
> representation using different combinations of whitespace.
>

While you may be right, it is still true that adding syntactically unnecessary whitespace keeps the string/sequence of bytes valid and readable by get()/value(). Since comments are exactly as (un)necessary as whitespace, they should be treated the same, and currently they are not.

> While what you're doing is sort of interesting, I think you're getting away
> from the purpose of the function. It definitely is *not* meant to handle
> comments.
>

This is debatable.
Comments are not explicitly mentioned, but did the author mean they were excluded? I explained right above why I think they are not, but feedback from the author is needed here. Rob?

> FTFM: "This works the same as get(),..."
>
> And following the link to get:
> "Multiple "top-level" objects in the input stream must be separated from
> each other with one or more "whitespace" characters (blank, tab, \r or \n).
> Whitespace is not necessary within a top-level object. A call to get()
> will read one entire top-level object, plus one additional (whitespace)
> character."

That is beside the point, since I'm concerned with embedded comments appearing between non-top-level objects.

BTW, did you try testing the code in the cchris_get branch? It fixes the extra space issue, and I couldn't measure any performance penalty. If this is confirmed, then it could be a good idea to merge that code, but this needs external confirmation. Executables are available at http://oedoc.free.fr/get_fixed/exw.exe (and ex.exe).

>
> Note: no mention of comments. This would definitely fall under the category
> of enhancement, or feature request, and not bugs, AFAICT. It certainly
> sounds useful, but without analyzing the impact (speed, etc), I don't think
> we should necessarily include it. Others may have other reasons to
> exclude this, but I'll let them speak for themselves.
>

Since the comment mark is two characters long, there will have to be a 1-character lookahead buffer somewhere. If it cannot be done without impacting performance, then we should leave the functions as they are and document the fact that comments are not covered. But this has to be assessed from actual code, which isn't written yet.

CChris

> Matt
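
To make the one-character lookahead concrete, here is a minimal sketch. It is written in Python for illustration rather than Euphoria, it is not taken from get.e or from the cchris_get branch, and the names `skip_blanks` and `WHITESPACE` are hypothetical. It assumes a comment starts with `--` and runs to the end of the line, and it treats a lone `-` as significant, since it may start a negative number.

```python
# Illustrative only: not the get.e implementation.
WHITESPACE = " \t\r\n"


def skip_blanks(text, pos=0):
    """Return the index of the first character at or after pos that is
    neither whitespace nor part of a '--' end-of-line comment."""
    n = len(text)
    while pos < n:
        ch = text[pos]
        if ch in WHITESPACE:
            pos += 1
        elif ch == "-" and pos + 1 < n and text[pos + 1] == "-":
            # One character of lookahead confirmed the comment mark,
            # so skip everything up to (not including) the line end.
            while pos < n and text[pos] not in "\r\n":
                pos += 1
        else:
            # Significant character. A lone '-' stops here too,
            # because it may be the sign of a number.
            break
    return pos


if __name__ == "__main__":
    sample = "{1, -- first element\n -2}"
    start = skip_blanks(sample, 3)
    print(sample[start:])  # prints "-2}"
```

On an in-memory string, as with value(), the lookahead is just an index comparison. On a file, as with get(), the peeked character would have to be kept in a one-character buffer so it can be handed back when it turns out not to be a second `-`; whether that costs anything measurable is exactly what would need timing.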