Re: strtok


> From: gertie at visionsix.com
> Subject: Re: strtok
> 
> 
> On 10 Jun 2003, at 12:52, Christian.CUVIER at agriculture.gouv.fr wrote:
> 
> 
>> > Date: Mon, 9 Jun 2003 21:01:10 -0500
>> > From: gertie at visionsix.com
>> > Subject: strtok
>> > 
>> > 
>> > Anyone using strtok v2 should upgrade to v2.1 asap. I was running an app
>> > that used v2, and it was missing random data in parse(). I changed to v2.1
>> > and have experienced no problems since.
> 
> 
> Lemme point out that v2 is old, and v2.1 is compatible with v2. V2.1 was the
> last "official" release; i just happened to have a few apps i never got
> around to changing the include line in.
>  
> 
>> > Kat
> 
>> 
>>  Actually, this answers another recent post about strtok.
>>  Why don't these generic text/sequence handling routines get much 
>> attention? I think EuRegExp got even less feedback.
> 
> 
> The regexp stuff confuses me, mostly because it's intrinsically able to
> happily provide an answer, even if it's totally wrong. And it's possible to
> get the same answer in multiple ways. This tells me if you feed it different
> source text, or an out-of-whack database, it will provide weird output
> instead of failing on the bad data.
> 

	Agreed. That's why my library provides tools to report every pattern 
match that can be found, so that external code can pick out the 
right one, if there is one in the target string. It also has extended 
conditional groups for advanced filtering of bogus answers.
	PentaText Tools (DOS-based free regexp tools) provide interactive, 
incremental regexp search/replace and additional features which are a 
better safety net. Sorry, can't compete with this. :] And the dll is not 
free. :(
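
	To make the "report every match" idea concrete, here is a minimal sketch in Python's standard re module. It is not EuRegExp or PentaText code; the helper name all_matches and the sample text are assumptions made up for illustration. It contrasts the single greedy answer a plain search returns with the full list of spans the pattern could match, which the calling code can then filter.

```python
import re


def all_matches(pattern, text):
    """Return every (start, end, substring) span of text the pattern matches in full.

    Hypothetical helper: brute force over all spans, fine for short strings only.
    """
    compiled = re.compile(pattern)
    found = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            # fullmatch(text, pos, endpos) tests exactly the slice text[start:end]
            if compiled.fullmatch(text, start, end):
                found.append((start, end, text[start:end]))
    return found


text = "a=1;b=2"
pattern = r"\w=.*\d"

# A plain search happily returns one greedy answer spanning both assignments.
greedy = re.search(pattern, text).group()
print(greedy)  # a=1;b=2

# Reporting every candidate lets the caller decide which ones make sense.
candidates = all_matches(pattern, text)
print(candidates)  # [(0, 3, 'a=1'), (0, 7, 'a=1;b=2'), (4, 7, 'b=2')]

# Caller-side filter: reject any candidate that crosses a ';' separator.
plausible = [c for c in candidates if ";" not in c[2]]
print(plausible)  # [(0, 3, 'a=1'), (4, 7, 'b=2')]
```

	The brute-force scan is quadratic in the length of the text, so this only shows the principle, not a practical implementation.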

> 
>>  My tentative answer is: actually, these routines are very powerful, 
>> since you can do just about everything with them by passing the right 
>> params. And most people, including myself, find it easier to code their own 
>> stuff for the specific need they have.
>> 
>>  For instance: I needed to separate off the args of a generic routine 
>> call. Including ugly things like:
>> 
>> MyProc({3,5,s2},sort(f(x,y)),(x=0))
> 
> 
> Need a loop. I considered such a thing in strtok lib, but i couldn't really
> figure what level people would want to parse, and only 3 or 4 people have
> ever provided feedback on strtok.
> 
> Since i have added parms to the other functions, like case insensitivity and
> "return the separators also", i could add a "parse these levels only" in
> nested sequences. Rather like sorttok() and sortntok() do.
> 

	Just the sort of thing I was hinting at. When you fully understand the 
code, you can tweak it, or its use, your way with hardly any 
limitations. See below.

> 
>>  Just parsing on ',' doesn't help, right? So I devised a level function 
>> (seq=level(sequence source, atom open_del, atom close_del)) to get the 
>> parentheses and brace nesting level of all chars in source sequence, and 
>> a split function which parses for ',' in the 0-level part of source and 
>> returns tokens of the original string (seq=split(sequence to_parse, 
>> sequence on_what, sequence token_source)). to_parse and token_source must 
>> have the same length. The return is a sequence of {token,start_pos}.
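
The level/split approach described above can be sketched as follows. This is a Python illustration, not the original Euphoria routines, and several details are assumptions: level() here takes strings of opening and closing delimiters instead of single atoms, the helper mask_nested() is hypothetical (the post does not say how to_parse is built), and start positions are 0-based where Euphoria would count from 1.

```python
def level(source, open_dels, close_dels):
    """Return the nesting depth of every character in source.

    A delimiter is reported at the depth of the group it belongs to, so only
    characters outside every pair end up at depth 0.
    """
    depths = []
    depth = 0
    for ch in source:
        if ch in open_dels:
            depth += 1
            depths.append(depth)
        elif ch in close_dels:
            depths.append(depth)
            depth -= 1
        else:
            depths.append(depth)
    return depths


def mask_nested(source, depths, filler="\0"):
    """Hypothetical helper: blank out every character that is not at depth 0."""
    return "".join(ch if d == 0 else filler for ch, d in zip(source, depths))


def split(to_parse, on_what, token_source):
    """Find on_what in to_parse, but cut the tokens out of token_source.

    Both strings must have the same length. Returns (token, start_pos) pairs.
    """
    if len(to_parse) != len(token_source):
        raise ValueError("to_parse and token_source must have the same length")
    if not on_what:
        raise ValueError("on_what must not be empty")
    tokens = []
    start = 0
    while True:
        hit = to_parse.find(on_what, start)
        if hit == -1:
            tokens.append((token_source[start:], start))
            return tokens
        tokens.append((token_source[start:hit], start))
        start = hit + len(on_what)


# The argument list of the MyProc(...) call quoted earlier in the thread.
args = "{3,5,s2},sort(f(x,y)),(x=0)"
depths = level(args, "({", ")}")
tokens = split(mask_nested(args, depths), ",", args)
print(tokens)  # [('{3,5,s2}', 0), ('sort(f(x,y))', 9), ('(x=0)', 22)]
```

Masking the nested characters before searching is one way to satisfy the "same length" requirement: the commas inside braces and parentheses disappear from to_parse, while the tokens are still sliced from the untouched original.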
> 
> 
> You lost me. Is this for a computer language parser? Did you look at
> DavidC's Yacc and Ox and Py and ..... arg,, i can't think of it atm,, the
> early Eu interpreter clone?
> 

	No, a preprocessor that would make Eu programming a lot easier, and 
would prove that many useful features can be added without problems or 
breaking code (hint, hint...). Implementing PBR and such stuff requires 
a deep analysis of the source, however convoluted it may be.

> 
>>  Could I do it using strtok? Likely, given its genericity and sheer 
>> power, but for this I'd need to study the docs to find the right 
>> combination, and possibly have to understand your code. That's an 
>> overhead compared to writing my own routines (they didn't take too long 
>> to debug).
> 
> 
> I was hoping the docs were enough that you wouldn't need to know the 
> internals of the code. What's missing?
> 
	Nothing at first glance, but I didn't pore over them seriously. I was 
missing the lib and the docs altogether when I first tried to address 
this advanced parsing issue (my home dev machine doesn't have access to 
the Internet). And I found it simple enough to devise my own tools on the 
fly. And when they appeared generic enough to solve other problems I was 
experiencing, I went straight ahead with my own stuff.
	I had downloaded 2.1, and it is here on my puter at the office. I may 
spare some time to review the docs, if you don't mind the feedback.

> 
>>  Just my own experience. Some people may well feel the other way.
> 
> 
> Sometimes a particular situation needs a custom solution. I tried to make 
> strtok simple to start with, but i added more functionality later on, as i 
> needed it; it's quite reusable. I think the lack of use may be related to 
> literacy. Simply calling the words of a sentence "meaning tokens", like the 
> data elements of a database, may be throwing people off. Strtok.e is 
> definitely not the C lib either.
> 
> Kat

	And the C lib is so obfuscated because of strong typing that it is 
about as easy to use as a generic routine with tons of options and 
parms. You can't have it both ways: addressing a myriad of slightly 
different individual issues can't be achieved easily, but you can choose 
the sort of difficulty you'd be better off with. C went one way and Eu 
another way about it.
	"Tokens" may throw ordinary people off base. Eu users are generally 
more involved in computers than I am (I have no computer science 
background), so I'd be a bit surprised if this was an issue. But perhaps...

	Have a nice day.

CChris
