Re: Bug in get() and value(): embedded comments
- Posted by CChris <christian.cuvier at agri??lture.gouv.fr> Jul 26, 2007
Robert Craig wrote:
>
> CChris wrote:
> > While you may be right, it is still true that adding syntactically unnecessary
> > whitespace keeps the string/sequence of bytes valid and readable by get()/value().
> > Since comments are exactly as (un)necessary as whitespace, they should be treated
> > the same, and currently are not.
> >
> > > While what you're doing is sort of interesting, I think you're getting away
> > > from the purpose of the function. It definitely is *not* meant to handle
> > > comments.
> >
> > This is debatable. Comments are not explicitly mentioned, but did the author
> > mean they were excluded? I explained right above why I think they are not, but
> > feedback from the author is needed here. Rob?
>
> The fact that comments are not currently supported is not
> a bug or oversight on my part, but I would not object to their
> being supported, if most people think it would be useful,
> and if there is no significant performance issue.
>
> > Since the comment mark is two characters long, there will have to be a 1-character
> > lookahead buffer somewhere. If it cannot be done without impacting performance,
> > then we should leave the functions as they are and document the fact that comments
> > are not covered. But this will be assessed from actual code, which isn't written
> > yet.
>
> You should test the performance.
> While optimizing an application a long time ago,
> I noticed that seek() can be more expensive
> than you might imagine.
>

Given that any performance loss would apply in a lot of cases that have nothing to do with embedded comments, this mod is not allowed to fail the test, as far as I'm concerned. As things stand, I won't code this before next week. Likewise, the code in the cchris_get branch on SVN has been isolated there because I would like several people to test performance on different machines and platforms before it can be considered harmless, and the quirk removal worthwhile.
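To make the lookahead idea concrete, here is a minimal sketch of how a reader could skip "--" comments with a one-character pushback buffer held in memory, so that no seek() call is needed. It is written in Python purely for illustration and is not the code in get.e or in the cchris_get branch; the names `Lookahead` and `skip_blanks_and_comments` are hypothetical, and the sketch ignores the case of "--" appearing inside a quoted string.

```python
import io


class Lookahead:
    """Wrap a text stream with a one-character pushback buffer (hypothetical helper)."""

    def __init__(self, stream):
        self.stream = stream
        self.pending = ""  # holds at most one pushed-back character

    def read(self):
        # Serve the pushed-back character first, if there is one.
        if self.pending:
            ch, self.pending = self.pending, ""
            return ch
        return self.stream.read(1)  # returns "" at end of input

    def unread(self, ch):
        self.pending = ch


def skip_blanks_and_comments(src):
    """Consume whitespace and "--" comments.

    Returns the first significant character, or "" at end of input.
    """
    while True:
        ch = src.read()
        if ch == "":
            return ""
        if ch.isspace():
            continue
        if ch == "-":
            nxt = src.read()  # the one character of lookahead
            if nxt == "-":
                # A comment runs to the end of the line (or of the input).
                while src.read() not in ("", "\n"):
                    pass
                continue
            # A lone "-" is significant, e.g. a minus sign: push the lookahead back.
            if nxt != "":
                src.unread(nxt)
            return "-"
        return ch


src = Lookahead(io.StringIO("  -- a comment\n  -5"))
first = skip_blanks_and_comments(src)
print(repr(first), repr(src.read()))
```

The extra work on the common path is one comparison against "-" per significant character; whether that is measurable in practice is exactly what the performance test would have to show.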
No feedback so far. Now that I think about it, if the proposed mod doesn't hurt performance, then I can probably remove the get() quirk at the same time at no additional cost. That needs actual coding.

> If you want to improve the docs, pointing out that
> there can be many different string representations of the same
> object, please go ahead. e.g. 3.0 +3.0 3.00 3.000 3e0 are all
> the same object, and are considered by the language definition
> to be *exactly* the same as 3. The implementation
> *might* choose to store 3 differently than 3.0, internally,
> but it doesn't have to.
>

Either this, or emphasizing that sprint() returns a shortest form, which is not unique. I think I'll add a sentence about it to cross the t's.

> Regards,
> Rob Craig
> Rapid Deployment Software
> http://www.RapidEuphoria.com

CChris
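As an illustration of the point about string representations, the sketch below parses the spellings Rob lists and formats the result back to text. It uses Python's `float()` as a stand-in for value() and a `%.10g` format as a stand-in for sprint(); both substitutions are assumptions made for the example, not a description of how Euphoria implements either routine.

```python
# Several spellings of the same number, taken from the list quoted above.
forms = ["3", "3.0", "+3.0", "3.00", "3.000", "3e0"]

# Parsing each spelling yields the same value.
values = [float(text) for text in forms]
all_same = all(v == values[0] for v in values)

# Formatting goes the other way: many inputs, one short output string.
shortest = "%.10g" % values[0]

print(all_same, shortest)
```

Reading is many-to-one and printing picks a single short spelling, so a value written out and read back compares equal even though the text may differ from what was originally read.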