1. Str-Kat
- Posted by Shawn Pringle <shawn.pringle at ?mail.c?m> May 30, 2008
- 747 views
Kat,

A EUPHORIA string is a sequence that contains integer values that each represent a character value. EUPHORIA has a string type as much as C does.

A string is not dealt with separately as a special type and it doesn't need to be.

Unlike both BASIC and C though there are no special string ops that concatenate, determine the length, and copy. In EUPHORIA you manipulate strings and arrays the same way because they are the same. They are all sequences. And since sequence manipulation is straightforward so is string manipulation. I wouldn't want it to be any other way.

Shawn Pringle
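A rough sketch of the "strings are just sequences" idea, written in Python rather than Euphoria on the assumption that a Python list of integers is a fair stand-in for a sequence. The helper names `to_seq` and `to_text` are invented for this illustration and are not part of any Euphoria library.

```python
def to_seq(text):
    """Turn text into a list of integer character codes (a stand-in for a sequence)."""
    return [ord(ch) for ch in text]


def to_text(seq):
    """Turn a list of integer character codes back into text."""
    return "".join(chr(code) for code in seq)


greeting = to_seq("Hello, ") + to_seq("Kat")   # concatenation uses the ordinary list operator
numbers = [1, 2, 3] + [4, 5]                   # the same operator works on plain numeric data
copy_of_greeting = list(greeting)              # copying needs nothing string-specific

print(len(greeting), to_text(greeting))        # length is the ordinary len()
```

The same three operations (concatenate, measure, copy) serve text and numeric data alike, which is the property the post describes.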
2. Re: Str-Kat
- Posted by Kat <KAT12 at c?osahs.n?t> May 30, 2008
- 732 views
Shawn Pringle wrote:
>
> Kat,
>
> A EUPHORIA string is a sequence that contains integer values that each represent a character value. EUPHORIA has a string type as much as C does.
>
> A string is not dealt with separately as a special type and it doesn't need to be.
>
> Unlike both BASIC and C though there are no special string ops that concatenate, determine the length, and copy. In EUPHORIA you manipulate strings and arrays the same way because they are the same. They are all sequences. And since sequence manipulation is straightforward so is string manipulation. I wouldn't want it to be any other way.

Really? Since when does EUPHORIA use only 8 bits for each CHAR in the SEQUENCE? Since when can you load a 500mbyte STRING into EUPHORIA and not have the OS kill the application with "too much memory used" error (windoze allows each app to have only 2 gigabytes)?

Kat,
forgetting she wrote STRING-TOKENS lib in archives.
3. Re: Str-Kat
- Posted by Shawn Pringle <shawn.pringle at ??ail.com> May 30, 2008
- 740 views
Kat wrote:
>
> Shawn Pringle wrote:
> >
> > Kat,
> >
> > A EUPHORIA string is a sequence that contains integer values that each represent a character value. EUPHORIA has a string type as much as C does.
> >
> > A string is not dealt with separately as a special type and it doesn't need to be.
> >
> > Unlike both BASIC and C though there are no special string ops that concatenate, determine the length, and copy. In EUPHORIA you manipulate strings and arrays the same way because they are the same. They are all sequences. And since sequence manipulation is straightforward so is string manipulation. I wouldn't want it to be any other way.
>
> Really? Since when does EUPHORIA use only 8 bits for each CHAR in the SEQUENCE? Since when can you load a 500mbyte STRING into EUPHORIA and not have the OS kill the application with "too much memory used" error (windoze allows each app to have only 2 gigabytes)?
>
> Kat,
> forgetting she wrote STRING-TOKENS lib in archives.

I didn't say 8 bits for each character. Normally they are 7-bit but for those who are doing non-English they could be 18 bit. Frankly, I don't give a damn.

Why would you want to load a 500 MB string into memory at once anyway?

Shawn
4. Re: Str-Kat
- Posted by Derek Parnell <ddparnell at b?gpond.co?> May 30, 2008
- 705 views
Kat wrote:
>
> Shawn Pringle wrote:
> >
> > Kat,
> >
> > A EUPHORIA string is a sequence that contains integer values that each represent a character value.

> Really?

OK, it's not quite accurate. It should read more like ...

"A EUPHORIA string is a sequence that ONLY contains POSITIVE integer values that each represent a character value."

> Since when does EUPHORIA use only 8 bits for each CHAR in the SEQUENCE?

Since when is the definition of "string" :: An array of 8-bit unsigned integers?

> Since when can you load a 500mbyte STRING into EUPHORIA and not have the OS kill the application with "too much memory used" error (windoze allows each app to have only 2 gigabytes)?

It doesn't. I bet a Commodore 64 couldn't do that either. Since when must you absolutely, positively have all those 500 mega BYTES in RAM at the same time? Are you saying that your task can only be achieved if all those bytes are in RAM simultaneously?

--
Derek Parnell
Melbourne, Australia
Skype name: derek.j.parnell
5. Re: Str-Kat
- Posted by Kat <KAT12 at c?osahs.?et> May 30, 2008
- 735 views
Shawn Pringle wrote:
>
> Kat wrote:
> >
> > Shawn Pringle wrote:
> > >
> > > Kat,
> > >
> > > A EUPHORIA string is a sequence that contains integer values that each represent a character value. EUPHORIA has a string type as much as C does.
> > >
> > > A string is not dealt with separately as a special type and it doesn't need to be.
> > >
> > > Unlike both BASIC and C though there are no special string ops that concatenate, determine the length, and copy. In EUPHORIA you manipulate strings and arrays the same way because they are the same. They are all sequences. And since sequence manipulation is straightforward so is string manipulation. I wouldn't want it to be any other way.
> >
> > Really? Since when does EUPHORIA use only 8 bits for each CHAR in the SEQUENCE? Since when can you load a 500mbyte STRING into EUPHORIA and not have the OS kill the application with "too much memory used" error (windoze allows each app to have only 2 gigabytes)?
> >
> > Kat,
> > forgetting she wrote STRING-TOKENS lib in archives.
>
> I didn't say 8 bits for each character. Normally they are 7-bit but for those who are doing non-English they could be 18 bit. Frankly, I don't give a damn.

There's the problem. Doesn't explain why you brought it up tho.

> Why would you want to load a 500 MB string into memory at once anyway?

You going to criticise the "why", instead of finding the "how"? OK, try loading 250 megabytes and *using* it. The 250 will become a gigabyte, and almost any way you use it will copy it, making it 2 gigabytes, and the OS will kill it.

Oh, for non-ascii chars in non-usa places, you could still have UTF-8 or UTF-16 strings and still save memory. Your example still used 32bits/char.

Kat
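The arithmetic behind that estimate can be sketched as a back-of-envelope calculation. This is only an illustration: the 4 bytes per element comes from the "32bits/char" remark, the 2 GB limit from the earlier post, and the function names are made up here. Any per-sequence overhead in the interpreter is ignored.

```python
BYTES_PER_ELEMENT = 4        # assumption from the thread: one 32-bit element per character
PROCESS_LIMIT_MB = 2048      # assumption from the thread: 2 gigabytes per application


def sequence_footprint_mb(text_mb, copies=1):
    """Approximate MB needed to hold `copies` in-memory copies of `text_mb` MB of 8-bit text."""
    return text_mb * BYTES_PER_ELEMENT * copies


def headroom_mb(text_mb, copies=1, limit_mb=PROCESS_LIMIT_MB):
    """MB left under the process limit once the sequences are in memory (negative if over)."""
    return limit_mb - sequence_footprint_mb(text_mb, copies)


print(sequence_footprint_mb(250))       # one loaded copy of a 250 MB file
print(sequence_footprint_mb(250, 2))    # after a single full copy is made
print(headroom_mb(250, 2))              # what remains for everything else in the process
```

Under these assumptions, one copy of a 250 MB file takes about 1000 MB and a second copy leaves only a few dozen MB of address space for the rest of the program.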
6. Re: Str-Kat
- Posted by CChris <christian.cuvier at agric?lture.gouv.f?> May 30, 2008
- 714 views
Shawn Pringle wrote:
>
> Kat,
>
> A EUPHORIA string is a sequence that contains integer values that each represent a character value. EUPHORIA has a string type as much as C does.
>
> A string is not dealt with separately as a special type and it doesn't need to be.
>
> Unlike both BASIC and C though there are no special string ops that concatenate, determine the length, and copy. In EUPHORIA you manipulate strings and arrays the same way because they are the same. They are all sequences. And since sequence manipulation is straightforward so is string manipulation. I wouldn't want it to be any other way.
>
> Shawn Pringle

Compare the memory overhead and performance hit that general sequences take when compared to raw arrays of bytes/words/dwords in memory, and you will want to have different operators and types.

The flexibility of sequences is wonderful. However, sequences of bytes/dwords are a fairly common special case, and there is room for the processing speed and memory footprint to be much, much more optimised.

CChris
7. Re: Str-Kat
- Posted by Shawn Pringle <shawn.pringle at g?ail.c?m> May 30, 2008
- 711 views
Derek Parnell wrote:
>
> Kat wrote:
> >
> > Shawn Pringle wrote:
> > >
> > > Kat,
> > >
> > > A EUPHORIA string is a sequence that contains integer values that each represent a character value.
> >
> > Really?
>
> OK, it's not quite accurate. It should read more like ...
>
> "A EUPHORIA string is a sequence that ONLY contains POSITIVE integer values that each represent a character value."
>
> > Since when does EUPHORIA use only 8 bits for each CHAR in the SEQUENCE?
>
> Since when is the definition of "string" :: An array of 8-bit unsigned integers?
>

I just do not accept the COBOL definition of a string. Eventually we all will be using Unicode in one form or another. Sure, utf-8 would fit in that definition but one could also use 16-bit unsigned integers because the Windows API either does that or returns chars using an unportable codepage.

To me, there are only two types of strings I work with: 16-bit and 7-bit ASCII. In spite of having over 65,000 characters to work with the ANSI Unicode committee wasn't able to fit everything in that space. I have to blame their waste on things like special codes for Roman numerals(!) and other needless homoglyphs. So, they have a scheme for encoding an 18-bit space in multiple 16-bit characters.

Don't worry about EUPHORIA strings, the UNICODE committee will eventually use four billion different code points anyway. Just wait a few years for them to catch up. ;)

Shawn

> Derek Parnell
> Melbourne, Australia
> Skype name: derek.j.parnell
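The scheme for packing larger code points into pairs of 16-bit units is UTF-16's surrogate pairs. For reference, the Unicode code space runs up to U+10FFFF, which takes 21 bits rather than 18. A minimal sketch of the encoding step follows; the function name `to_utf16_units` is invented for this illustration.

```python
def to_utf16_units(code_point):
    """Encode one Unicode code point as a list of one or two 16-bit code units."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point out of Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate values are not valid code points")
    if code_point < 0x10000:
        return [code_point]                # fits in a single 16-bit unit
    offset = code_point - 0x10000          # 20 bits remain after subtracting the offset
    high = 0xD800 + (offset >> 10)         # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits go into the low surrogate
    return [high, low]


# One unit for a basic character, two units for a character beyond U+FFFF.
print([hex(u) for u in to_utf16_units(ord("A"))])
print([hex(u) for u in to_utf16_units(0x1F600)])
```

Either form still fits comfortably in a sequence of integers, since each element only has to hold a value up to 0xFFFF (as UTF-16 units) or 0x10FFFF (as whole code points).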
8. Re: Str-Kat
- Posted by ken mortenson <kenneth_john at yah?o.c?m> May 30, 2008
- 740 views
I've been reading this string thread and I'm quite astonished because of two simple facts.

Strings are just data and we are supposed to be programmers.

There are different types of strings of course, like BSTRings, unicode strings and every C programmer's favorite, null terminated strings. All of these strings are referenced by address via pointer. The last fact is we already have a euphoria word that produces a string...

allocate_string.

Perhaps this needs to be promoted into the core, but we have it already.

I reiterate, we are programmers. We are free to manipulate such strings in any way a programmer can, which is to say any way at all. Limited only by the programmer's imagination.

So where is the problem? Unicode, same deal.

There may be a lot of things that are difficult for programmers to do mainly because of undocumented interfaces, but manipulation of data should not be one of them.

Are we not all programmers? Stand up and be counted.
9. Re: Str-Kat
- Posted by Ricardo Forno <ricardoforno at tutop?a.?om> May 30, 2008
- 714 views
My two cents on the subject:

Since it seems that something will be done regarding strings, a solution I envision is to add a field to the internal sequence descriptor telling how many bits each atom has for this particular sequence. The programmer will in principle be unaware of the atom length, and EU will automatically take care of this attribute.

For example, assume an empty sequence is being filled with chars via &= or append(). This sequence will contain only 8-bit atoms. If a Unicode char is added, then all the previous elements will be transformed to 16-bit format. If an integer is added (having a negative or high value), then all the previous elements will be upgraded to the corresponding size.

This will even allow for 1-bit elements, or to have integers up to 128 using a single byte, so increasing both space and time efficiency. After all, this is already done when you append a fraction to a sequence that is only composed of integers.

Regards.
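The widening idea can be sketched in a few lines. This is only a toy model in Python, not how the Euphoria interpreter stores sequences: the class name `PackedSeq`, the choice of 1, 2 and 4 byte widths, and the restriction to non-negative values are all assumptions made for the illustration.

```python
class PackedSeq:
    """Integer sequence stored at the narrowest width (1, 2 or 4 bytes) that fits every element."""

    WIDTHS = (1, 2, 4)

    def __init__(self):
        self.width = 1              # bytes per element; starts at the narrowest width
        self.data = bytearray()     # packed little-endian elements

    def __len__(self):
        return len(self.data) // self.width

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError("PackedSeq index out of range")
        start = index * self.width
        return int.from_bytes(self.data[start:start + self.width], "little")

    def _width_for(self, value):
        """Smallest supported width that can hold `value` (non-negative values only)."""
        for width in self.WIDTHS:
            if 0 <= value < 1 << (8 * width):
                return width
        raise ValueError("value does not fit in 32 unsigned bits")

    def _widen(self, new_width):
        """Re-pack every existing element at the wider size."""
        values = [self[i] for i in range(len(self))]
        self.data = bytearray()
        self.width = new_width
        for value in values:
            self.data += value.to_bytes(new_width, "little")

    def append(self, value):
        needed = self._width_for(value)
        if needed > self.width:
            self._widen(needed)     # the upgrade happens without the caller noticing
        self.data += value.to_bytes(self.width, "little")


seq = PackedSeq()
for ch in "abc":
    seq.append(ord(ch))             # stays at 1 byte per element
seq.append(0x20AC)                  # a 16-bit character forces a re-pack to 2 bytes each
print(seq.width, len(seq), len(seq.data))
```

The trade-off is visible in `_widen`: appending one wide element costs a pass over everything stored so far, in exchange for a smaller footprint while the data stays narrow.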
10. Re: Str-Kat
- Posted by ChrisBurch2 <crylex at freeuk??o.uk> May 30, 2008
- 706 views
ken mortenson wrote:
<snip>
>
> Are we not all programmers? Stand up and be counted.

Are you not yet entertained?

(sorry, wine and gladiatorial reference sprang to mind)

Chris
11. Re: Str-Kat
- Posted by Kat <KAT12 at coos?hs.net> May 30, 2008
- 720 views
- Last edited May 31, 2008
ken mortenson wrote:
>
> I've been reading this string thread and I'm quite astonished because of two simple facts.
>
> Strings are just data and we are supposed to be programmers.
>
> There are different types of strings of course, like BSTRings, unicode strings and every C programmer's favorite, null terminated strings. All of these strings are referenced by address via pointer. The last fact is we already have a euphoria word that produces a string...
>
> allocate_string.

Yeasbut, all the sequence operators will need to be recoded for the string type of your choosing. How exactly the string is implemented then becomes irrelevant (or irreverent for the language purists). Namespacing will make this easier, i suspect:

compare(seq,seq)
strings:compare(string,string)

but it still must be coded up.

It would also be easier if Eu returned unused memory to the OS via some command or other, but an outboard strings.euu (a la .tpu) managing its memory thru api calls, wluod mkae it wrok too, in a very non-euphorian way. I had considered making this sorta thing, but like working on eunet, eubot, etc, i just couldn't do it with humans in the way objecting to it, i gave up.

Kat
12. Re: Str-Kat
- Posted by Lucius L. Hilley III <euphoria at unk?ar.com> May 30, 2008
- 706 views
- Last edited May 31, 2008
Kat wrote:
>
> It would also be easier if Eu returned unused memory to the OS via some command or other, but an outboard strings.euu (a la .tpu) managing its memory thru api calls, wluod mkae it wrok too, in a very non-euphorian way. I had considered making this sorta thing, but like working on eunet, eubot, etc, i just couldn't do it with humans in the way objecting to it, i gave up.
>
> Kat

That is a clever workaround. Load a euphoria program that shares memory space for the explicit reason of being able to close out and reload said program to manage the memory.

Lucius L. Hilley III - Unkmar
13. Re: Str-Kat
- Posted by ken mortenson <kenneth_john at y?hoo.co?> May 30, 2008
- 702 views
- Last edited May 31, 2008
Kat wrote:

> It would also be easier if Eu returned unused memory to the OS via some command or other

I believe it does Kat. I believe I remember seeing a free() routine?
14. Re: Str-Kat
- Posted by Kat <KAT12 at coosahs.??t> May 30, 2008
- 699 views
- Last edited May 31, 2008
ken mortenson wrote:
>
> Kat wrote:
>
> > It would also be easier if Eu returned unused memory to the OS via some command or other
>
> I believe it does Kat. I believe I remember seeing a free() routine?

So it's only a matter of using allocate() and free()? No wonder it's not been done yet! Where is the magic spell that makes compare() and equal() and string[2] and length(string) work, after you get allocate() and free() typed out?

Kat
15. Re: Str-Kat
- Posted by Kat <KAT12 at c?osa?s.net> May 30, 2008
- 750 views
- Last edited May 31, 2008
ken mortenson wrote:
<snip>

Ken, you gotta realise, there's been substantial resistance to actual strings in Eu. Sequences are great things, wonderful things. But most of the world is strings, like this sentence. Or this paragraph. And the world is a big place, there's a looooot of strings out there. And most people would rather fight having strings in Eu in any form, especially if i write about it, or write the code. Just ask CK or JBrown. And i just can't justify anything i say, well enough to get anything to change. If i had a code block that did all i believe should be done in Eu, i'd be ashamed to say so.

Kat
16. Re: Str-Kat
- Posted by Jim Brown <jbrown105 at l?nuxbu?dhist.net> May 30, 2008
- 720 views
- Last edited May 31, 2008
Kat wrote:
>
> ken mortenson wrote:
> >
> > Kat wrote:
> >
> > > It would also be easier if Eu returned unused memory to the OS via some command or other
> >
> > I believe it does Kat. I believe I remember seeing a free() routine?
>
> So it's only a matter of using allocate() and free()? No wonder it's not been done yet! Where is the magic spell that makes compare() and equal() and string[2] and length(string) work, after you get allocate() and free() typed out?
>
> Kat

--string.e
---------------------
namespace string

global function compare(atom a, atom b)
    integer i, ac, bc
    i = 0
    ac = peek(a)
    bc = peek(b)
    while 1 do
        if ac > bc then
            return 1
        elsif ac < bc then
            return -1
        elsif (ac = bc) and (ac = 0) then
            return 0
        else
            i = i + 1
            ac = peek(a+i)
            bc = peek(b+i)
        end if
    end while
end function

global function equal(atom a, atom b)
    return compare(a,b) = 0
end function

global function length(atom a)
    integer i
    i = 0
    while peek(a+i) != 0 do
        i = i + 1
    end while
    return i
end function

--string[2] is impossible to do currently, but there is this workaround
global function slice(atom a, integer i)
    return peek(a+i)
end function
17. Re: Str-Kat
- Posted by c.k.lester <euphoric at ??lester.com> May 30, 2008
- 720 views
- Last edited May 31, 2008
Kat wrote:
>
> And most people would rather fight having strings in Eu in any form, especially if i write about it, or write the code. Just ask CK or JBrown.

I've never fought against string.

Why do I keep getting caught up in your insanity? Please leave already.
18. Re: Str-Kat
- Posted by ken mortenson <kenneth_john at y?hoo.com> May 30, 2008
- 702 views
- Last edited May 31, 2008
Kat wrote:

> So it's only a matter of using allocate() and free()? No wonder it's not been done yet! Where is the magic spell that makes compare() and equal() and string[2] and length(string) work, after you get allocate() and free() typed out?

Kat, please don't be snarky. You could write that stuff couldn't you? I would agree that classes would make it simpler, but again it's just data and Euphoria gives you all the tools you need.

Could it more efficiently be written into the core? Absolutely. There is where perhaps you could make your case.

But your case would be a lot stronger if you had written a string library and already found its performance limited.

I haven't checked the archive but if someone there has written a string library you might find an ally to adding it to the core? Something to think about anyway.

I hope I've given you some helpful ideas.
19. Re: Str-Kat
- Posted by Kat <KAT12 at coosa?s.n?t> May 30, 2008
- 714 views
- Last edited May 31, 2008
c.k.lester wrote:
>
> Kat wrote:
> >
> > And most people would rather fight having strings in Eu in any form, especially if i write about it, or write the code. Just ask CK or JBrown.
>
> I've never fought against string.

"especially if i write about it, or write the code."

> Why do I keep getting caught up in your insanity? Please leave already.

Proves my point.

Kat
20. Re: Str-Kat
- Posted by ken mortenson <kenneth_john at yaho?.co?> May 30, 2008
- 763 views
- Last edited May 31, 2008
Kat wrote:

> Ken, you gotta realise, there's been substantial resistance to actual strings in Eu. Sequences are great things, wonderful things. But most of the world is strings, like this sentence. Or this paragraph. And the world is a big place, there's a looooot of strings out there. And most people would rather fight having strings in Eu in any form, especially if i write about it, or write the code. Just ask CK or JBrown. And i just can't justify anything i say, well enough to get anything to change. If i had a code block that did all i believe should be done in Eu, i'd be ashamed to say so.

I understand your point Kat. These guys must be some kind of minimalists or something, eh?

What you need are allies, Kat. Instead of beating heads and walls (one of my favorite pastimes btw) find out if others have the same need you do.

It would make a lot stronger case if you had others to champion the cause. It is unfortunate, but people do resist ideas for lots of human reasons which have little to do with the ideas themselves.

Personally, I don't see a compelling case for strings in Euphoria because I really like how sequences have been implemented (particularly with regard to allocation/deallocation, no garbage collection or memory leaks.) They provide a way to send and receive strings from foreign DLLs.

Now if you add strings, you open up a whole lot of potential memory issues that Euphoria thankfully doesn't have. That isn't a show stopper in my mind. I think strings can be handled well. C has the cleanest strings (just a pointer and a null terminator) but it makes the programmer do all the management tasks. VB has a slightly more complicated BSTRing type and does the management for you, but it's a really slow implementation.

I wish you well Kat. Any champians out there? Anyone know how to spell champian?
21. Re: Str-Kat
- Posted by Matt Lewis <matthewwalkerlewis at gm?il.c?m> May 30, 2008
- 742 views
- Last edited May 31, 2008
Kat wrote:
>
> ken mortenson wrote:
> <snip>
>
> Ken, you gotta realise, there's been substantial resistance to actual strings in Eu. Sequences are great things, wonderful things. But most of the world is strings, like this sentence. Or this paragraph. And the world is a big place, there's a looooot of strings out there.

Yes, and euphoria does a pretty good job with most string jobs I've run across. We all agree that sequences with millions of elements don't work well with today's hardware, which happens to be your use case.

> And most people would rather fight having strings in Eu in any form, especially if i write about it, or write the code. Just ask CK or JBrown.

Yeah, some people disagree with you. Some don't. But it doesn't matter what anyone wants if no one can figure out a good way to implement them, which happens to be the case with strings.

> And i just can't justify anything i say, well enough to get anything to change. If i had a code block that did all i believe should be done in Eu, i'd be ashamed to say so.

Few people have. It *does* help if you have some code that does it. Asking others to do the work on stuff you're interested in works less well in a volunteer setting.

The only code of yours I've seen really has been strtok stuff, and it seemed good enough to me. I haven't really seen the assaults on your code, but I've sure heard you talk about them a lot. I certainly sympathize with your RL issues.

Matt
22. Re: Str-Kat
- Posted by Kat <KAT12 at coo?ahs.?et> May 30, 2008
- 732 views
- Last edited May 31, 2008
ken mortenson wrote:
<snip>

> But your case would be a lot stronger if you had written a string library and already found its performance limited.

Did, done, found it so. Try winxp on a computer with 512megs memory, and load and use strtok's parse on a 100kbyte string. 100K byte isn't hard to find, some webpages are over 100Kbytes, NOT counting the css, js, and pics. You'll get bogged down with drive swapping the memory back and forth.

Kat
23. Re: Str-Kat
- Posted by Kat <KAT12 at c?o?ahs.net> May 30, 2008
- 844 views
- Last edited May 31, 2008
ken mortenson wrote:
>
> Kat wrote:
>
> > Ken, you gotta realise, there's been substantial resistance to actual strings in Eu. Sequences are great things, wonderful things. But most of the world is strings, like this sentence. Or this paragraph. And the world is a big place, there's a looooot of strings out there. And most people would rather fight having strings in Eu in any form, especially if i write about it, or write the code. Just ask CK or JBrown. And i just can't justify anything i say, well enough to get anything to change. If i had a code block that did all i believe should be done in Eu, i'd be ashamed to say so.
>
> I understand your point Kat. These guys must be some kind of minimalists or something, eh?
>
> What you need are allies, Kat. Instead of beating heads and walls (one of my favorite pastimes btw) find out if others have the same need you do.
>
> It would make a lot stronger case if you had others to champion the cause. It is unfortunate, but people do resist ideas for lots of human reasons which have little to do with the ideas themselves.
>
> Personally, I don't see a compelling case for strings in Euphoria because I really like how sequences have been implemented (particularly with regard to allocation/deallocation, no garbage collection or memory leaks.) They provide a way to send and receive strings from foreign DLLs.
>
> Now if you add strings, you open up a whole lot of potential memory issues that Euphoria thankfully doesn't have. That isn't a show stopper in my mind. I think strings can be handled well. C has the cleanest strings (just a pointer and a null terminator) but it makes the programmer do all the management tasks. VB has a slightly more complicated BSTRing type and does the management for you, but it's a really slow implementation.
>
> I wish you well Kat. Any champians out there? Anyone know how to spell champian?

You have spelled champion before.

I can write the code as an include, and it might be better left at that. I used a lot of pointers to strings in TurboPascal (making PowerBasic catch my eye) and lots of pchars, so doing the same in Eu would be fairly easy. Btw, people rallied against pointers too. Pointers seem real un-Eu-like, apparently, and i wouldn't release code that has them, i been flamed for my code enough already, even recently.

I even left #Euphoria for CK's pleasure. He still isn't satisfied, as you can see.

Kat
24. Re: Str-Kat
- Posted by ken mortenson <kenneth_john at ?ahoo.?om> May 31, 2008
- 730 views
Kat wrote:
>
> ken mortenson wrote:
> > But your case would be a lot stronger if you had written a string library and already found its performance limited.
>
> Did, done, found it so. Try winxp on a computer with 512megs memory, and load and use strtok's parse on a 100kbyte string. 100K byte isn't hard to find, some webpages are over 100Kbytes, NOT counting the css, js, and pics. You'll get bogged down with drive swapping the memory back and forth.

I put together a computer from the junk I've got around the house that only had 32mb of RAM. I had to search all over the internet for utils and a browser that would perform on such a limited machine (The power switch died on it and I haven't replaced it, so I took another machine out of storage which is a bit better.)

Anyway, you're probably better able than some of the younger folk (assuming there are younger folk here, I really have no idea of the age demographic) to remember when we had to do tape sorts? I'm talking millions of records on 9 track tape. We made it work.

You're always going to find applications where there isn't enough memory. Having four times the memory (or having Euphoria use real strings instead of sequences) isn't going to change that much.

I would take a careful look at what you're doing with the data and try to manipulate it in a way that doesn't fill memory so much which is putting you in a disk churning situation because of memory swaps.

I usually only deal with a subset of my data. If I had to fill memory with an application I'd probably have to get a machine that allowed me to add enough memory to do the job (I can't afford my dream machine, but that's what others have done.)

When I say subset of data, that doesn't mean I'm not processing all of it. It just means I try to deal with it in manageable chunks.

It sounds to me like you really don't have a string issue. The issue seems to be more about what algorithms you're choosing. If you are a bit more detailed in your description, perhaps someone will have some ideas.

Best to ya.
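One way to read "manageable chunks" is a search that never holds more than one chunk, plus a small overlap, in memory at a time. The sketch below is in Python and purely illustrative: `find_in_stream`, the default chunk size, and the in-memory `io.BytesIO` stand-in for a large file are all assumptions, not anything proposed in the thread.

```python
import io


def find_in_stream(stream, pattern, chunk_size=64 * 1024):
    """Return the 0-based offset of the first `pattern` in a binary stream, or -1 if absent.

    Only one chunk plus len(pattern) - 1 carried-over bytes are held in memory at a time.
    """
    if not pattern:
        raise ValueError("pattern must not be empty")
    overlap = len(pattern) - 1
    tail = b""      # end of the previous window, kept so a match can span two chunks
    base = 0        # stream offset of the first byte of the current window
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return -1                       # end of stream, nothing found
        window = tail + chunk
        hit = window.find(pattern)
        if hit != -1:
            return base + hit
        keep = window[-overlap:] if overlap else b""
        base += len(window) - len(keep)     # advance past the bytes being discarded
        tail = keep


# Usage: a small in-memory stream stands in for a large file opened in binary mode.
data = b"x" * 10 + b"needle" + b"y" * 5
print(find_in_stream(io.BytesIO(data), b"needle", chunk_size=4))
```

Whether this beats loading the whole file depends on the workload: a single pass is cheap on memory, but repeating the search many times per second means re-reading the data each time.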
25. Re: Str-Kat
- Posted by ken mortenson <kenneth_john at y??oo.com> May 31, 2008
- 749 views
Kat wrote:

> I can write the code as an include, and it might be better left at that. I used a lot of pointers to strings in TurboPascal (making PowerBasic catch my eye) and lots of pchars, so doing the same in Eu would be fairly easy. Btw, people rallied against pointers too. Pointers seem real un-Eu-like, apparently, and i wouldn't release code that has them, i been flamed for my code enough already, even recently.
>
> I even left #Euphoria for CK's pleasure. He still isn't satisfied, as you can see.

The asbestos underwear does come in handy at times.

I never really understood why pointers are so bad, they're just an address in memory. You can get carried away with pointers to pointers to pointers and so forth and I never did like the asterisk as a choice of symbol, as C uses. I always thought @ said address to me, but it's not so much better either.

PowerBasic even adds a few more wrinkles because they have more than pass by value and pass by address (I can't think of what it is right now but I did find it interesting.)

In my O.T.L. (trademark and EEEVIL patent pending) everything is a function where the return value can be ignored (doesn't have to be assigned to a junk variable.) Every parameter is passed by address and subs pass back an address. As with Euphoria, this allows passing back multiple values. Which is funny when I think about it cuz I'm really more of a pass by value kind of guy. Go figure!
26. Re: Str-Kat
- Posted by Kat <KAT12 at coo?ahs.?et> May 31, 2008
- 703 views
ken mortenson wrote:
>
> Kat wrote:
> >
> > ken mortenson wrote:
> > > But your case would be a lot stronger if you had written a string library and already found its performance limited.
> >
> > Did, done, found it so. Try winxp on a computer with 512megs memory, and load and use strtok's parse on a 100kbyte string. 100K byte isn't hard to find, some webpages are over 100Kbytes, NOT counting the css, js, and pics. You'll get bogged down with drive swapping the memory back and forth.
>
> I put together a computer from the junk I've got around the house that only had 32mb of RAM. I had to search all over the internet for utils and a browser that would perform on such a limited machine (The power switch died on it and I haven't replaced it, so I took another machine out of storage which is a bit better.)
>
> Anyway, you're probably better able than some of the younger folk (assuming there are younger folk here, I really have no idea of the age demographic) to remember when we had to do tape sorts? I'm talking millions of records on 9 track tape. We made it work.

Yes, my first NLP program involved 2 5inch floppy drives and 2 to 3 hours of manually swapping disks as the computer requested each disc. I had just escaped using 8 inch floppy drives, still have the drives tho.

> You're always going to find applications where there isn't enough memory. Having four times the memory (or having Euphoria use real strings instead of sequences) isn't going to change that much.
>
> I would take a careful look at what you're doing with the data and try to manipulate it in a way that doesn't fill memory so much which is putting you in a disk churning situation because of memory swaps.

Been there, 20K was a huge amount of ram to have. I once paid $100 for a single byte of solid state static ram in the late 60's. It's how i know how to manage to get Eu to load a 37meg file so it can be match()'d thru several times per second, something i couldn't do with the file out on the drive. But please, if it makes you happy, continue to question my experience rather than the size of the problem.

> I usually only deal with a subset of my data. If I had to fill memory with an application I'd probably have to get a machine that allowed me to add enough memory to do the job (I can't afford my dream machine, but that's what others have done.)
>
> When I say subset of data, that doesn't mean I'm not processing all of it. It just means I try to deal with it in manageable chunks.

Like one short line at a time out of a 100k text file? Is that what you are seriously telling me i should do on a modern computer running Euphoria??

> It sounds to me like you really don't have a string issue. The issue seems to be more about what algorithms you're choosing. If you are a bit more detailed in your description, perhaps someone will have some ideas.

So parsing a single webpage's html is unreasonable, and i should write it to disk, then handle the file only one line at a time and writing it back to the drive? What's the difference there vs letting the OS do it? Why is munging even a 100K byte file too excessive??

> Best to ya.

You too.

Kat
27. Re: Str-Kat
- Posted by c.k.lester <euphoric at ckle?ter.co?> May 31, 2008
- 718 views
Kat wrote:
>
> I even left #Euphoria for CK's pleasure. He still isn't satisfied, as you can see.

My initial comment in the channel was light-hearted, given our relationship going so far back. I tried to pad it with smilies or misspellings enough to indicate that I wasn't being totally serious...

We've both been Euphoria programmers for some time and have shared good conversation in the past. I might even have the logs to prove it. No doubt you do.

I've always sympathized with your plight(s), even when the stories become somewhat unbelievable. But then something clicks in your brain and all of a sudden you're this paranoid delusional freak that invites everybody to her pity party and then lashes out at friend and foe. Others know what I'm talking about because it's quite abrupt sometimes... and sad.

Kat, I've never bad-mouthed your code, and I've never seen or heard anybody else say anything bad about your code. I suspect that nobody really has and that you've taken criticism out of context or you're so lacking in self-confidence that you consider all negative words to be direct attacks against poor little you.

As far as I'm concerned, you have good ideas and are very skilled at getting computers to do what you want. I used strtok for a long time until improvements were made and certain funcs and procs were sped up. At that time, I think you were on sabbatical from Euphoria. It was a peaceful time. :)

So, I don't care if you stick around or not. My wish is that you would leave the delusional paranoia behind, the victim-mentality, or whatever psychosis/neurosis is driving you these days, and become a mature member of this Euphoria community, who understands that not everybody is going to see things the same way, and that some ideas just won't be implemented. Heck, I want a few things still (see the requested features list) that I don't think will be implemented unless I do them myself... and right now, until that lottery ticket hits, I don't have the time or skills to touch the interpreter.

However, I shouldn't say never because there's grumbling about adding GOTO to Euphoria. Will you live to see it?!!? I pray you're not rotting in jail subsisting on bread, water, and daily beatings when that day comes.
28. Re: Str-Kat
- Posted by ken mortenson <kenneth_john at yahoo?co?> May 31, 2008
- 718 views
Kat wrote:
> Like one short line at a time out of a 100k text file? Is that what you are
> seriously telling me i should do on a modern computer running Euphoria??

If it works.

> So parsing a single webpage's html is unreasonable

I don't know your application. I can't say one way or the other. But I can say, there are cats and there are ways to skin 'em. When you come up with a solution, I'd be happy to hear how you did it.
29. Re: Str-Kat
- Posted by c.k.lester <euphoric at ckleste?.co?> May 31, 2008
- 723 views
ken mortenson wrote:
>
> ...there are cats and there are ways to skin 'em.

Oooooh! Never say that to a Kat!!!
30. Re: Str-Kat
- Posted by Kat <KAT12 at coosa?s.ne?> May 31, 2008
- 711 views
Wow, deja vu. CK labeled me with more psychobabble (again), and my response is tied up or deleted in moderation (again).

Kat
31. Re: Str-Kat
- Posted by CChris <christian.cuvier at ??riculture.gouv.fr> May 31, 2008
- 699 views
Jim Brown wrote:
>
> Kat wrote:
> >
> > ken mortenson wrote:
> > >
> > > Kat wrote:
> > > >
> > > > It would also be easier if Eu returned unused
> > > > memory to the OS via some command or other
> > >
> > > I believe it does Kat. I believe I remember seeing free() routine?
> >
> > So it's only a matter of using allocate() and free()? No wonder it's not been
> > done yet! Where is the magic spell that makes compare() and equal() and string[2]
> > and length(string) work, after you get allocate() and free() typed out?
> >
> > Kat
>
> --string.e
> ---------------------
> namespace string
>
> global function compare(atom a, atom b)
>     integer i, ac, bc
>     i = 0
>     ac = peek(a)
>     bc = peek(b)
>     while 1 do
>         if ac > bc then
>             return 1
>         elsif ac < bc then
>             return -1
>         elsif (ac = bc) and (ac = 0) then
>             return 0
>         else
>             i = i + 1
>             ac = peek(a+i)
>             bc = peek(b+i)
>         end if
>     end while
> end function
>
> global function equal(atom a, atom b)
>     return compare(a,b) = 0
> end function
>
> global function length(atom a)
>     integer i
>     i = 0
>     while peek(a+i) != 0 do
>         i = i + 1
>     end while
>     return i
> end function
>
> --string[2] is impossible to do currently, but there is this workaround
> global function slice(atom a, integer i)
>     return peek(a+i)
> end function

On an Intel CPU, the length function need only be this:

    push edi
    push ecx
    xor eax,eax
    xor ecx,ecx
    mov edi,[esp+4]
    cld
    repnz scasb
    jecxz ret
    sub eax,ecx
    pop ecx
    pop edi
    ret

A whopping 20 bytes. You can shave 2 more if ecx is discardable. The jecxz is optional too, because machines with 4 GB of RAM are still hard to find. That comes to 16 bytes, which nicely fits into a cache line.

Oh, and the string address need not be on the stack if ecx is discardable. Can still shave some cycles.

How much slower would the string: code above be? I'd bet between 10 and 30 times.

CChris
32. Re: Str-Kat
- Posted by Jim Brown <jbrown105 at linu?buddhist.ne?> May 31, 2008
- 710 views
CChris wrote:
>
> Jim Brown wrote:
> >
> > Kat wrote:
> > >
> > > ken mortenson wrote:
> > > >
> > > > Kat wrote:
> > > > >
> > > > > It would also be easier if Eu returned unused
> > > > > memory to the OS via some command or other
> > > >
> > > > I believe it does Kat. I believe I remember seeing free() routine?
> > >
> > > So it's only a matter of using allocate() and free()? No wonder it's not been
> > > done yet! Where is the magic spell that makes compare() and equal() and string[2]
> > > and length(string) work, after you get allocate() and free() typed out?
> > >
> > > Kat
> >
> > --string.e
> > ---------------------
> > namespace string
> >
> > global function compare(atom a, atom b)
> >     integer i, ac, bc
> >     i = 0
> >     ac = peek(a)
> >     bc = peek(b)
> >     while 1 do
> >         if ac > bc then
> >             return 1
> >         elsif ac < bc then
> >             return -1
> >         elsif (ac = bc) and (ac = 0) then
> >             return 0
> >         else
> >             i = i + 1
> >             ac = peek(a+i)
> >             bc = peek(b+i)
> >         end if
> >     end while
> > end function
> >
> > global function equal(atom a, atom b)
> >     return compare(a,b) = 0
> > end function
> >
> > global function length(atom a)
> >     integer i
> >     i = 0
> >     while peek(a+i) != 0 do
> >         i = i + 1
> >     end while
> >     return i
> > end function
> >
> > --string[2] is impossible to do currently, but there is this workaround
> > global function slice(atom a, integer i)
> >     return peek(a+i)
> > end function
>
> On an Intel CPU, the length function need only be this:
>
>     push edi
>     push ecx
>     xor eax,eax
>     xor ecx,ecx
>     mov edi,[esp+4]
>     cld
>     repnz scasb
>     jecxz ret
>     sub eax,ecx
>     pop ecx
>     pop edi
>     ret
>
> A whopping 20 bytes. You can shave 2 more if ecx is discardable. The jecxz is
> optional too, because machines with 4 GB of RAM are still hard to find. That comes
> to 16 bytes, which nicely fits into a cache line.
>
> Oh, and the string address need not be on the stack if ecx is discardable.
> Can still shave some cycles.
>
> How much slower would the string: code above be? I'd bet between 10 and 30
> times.
>
> CChris

I was going for simplicity, not speed. I would have just done define_c_func(open_dll(""), "strlen", ....) and let your OS vendor take care of the speed hack. Both of them would be a lot faster than a pure Eu length function.

In fact, you can define_c_func() strcmp for compare(), and also get a speed boost for compare() and equal().

But Kat doesn't do C (and who can blame her?)
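
For anyone who wants to try the define_c_func() route, here is a minimal sketch of how it might look. It is untested and rests on several assumptions: a 32-bit interpreter, that open_dll("") exposes the C library's symbols on your platform (on Windows you would probably have to name a C runtime DLL instead), and that declaring the result of strlen as C_INT is acceptable for strings of ordinary size. The wrapper names cstr_length and cstr_compare are made up for illustration, chosen so they do not clash with the built-in length() and compare().

```euphoria
-- Sketch only: link the C library's strlen and strcmp and wrap them.
include dll.e       -- open_dll, define_c_func, C_POINTER, C_INT
include machine.e   -- allocate_string, free

atom lib
integer c_strlen, c_strcmp

-- Assumption: "" gives access to the already-loaded C library (as in the
-- post above). On Windows a DLL name would be needed here instead.
lib = open_dll("")

c_strlen = define_c_func(lib, "strlen", {C_POINTER}, C_INT)
c_strcmp = define_c_func(lib, "strcmp", {C_POINTER, C_POINTER}, C_INT)
if c_strlen = -1 or c_strcmp = -1 then
    puts(2, "could not link strlen/strcmp\n")
    abort(1)
end if

-- Length of a 0-terminated string stored at address a.
function cstr_length(atom a)
    return c_func(c_strlen, {a})
end function

-- strcmp only promises a negative, zero, or positive result,
-- so normalise it to -1, 0, or 1 like the built-in compare().
function cstr_compare(atom a, atom b)
    return compare(c_func(c_strcmp, {a, b}), 0)
end function

atom s1, s2
s1 = allocate_string("hello")
s2 = allocate_string("help")
printf(1, "%d %d\n", {cstr_length(s1), cstr_compare(s1, s2)})
free(s1)
free(s2)
```

The strings still live outside Euphoria's sequences, so they have to be created with allocate_string() and released with free() by hand, which is the trade-off being argued about in this thread.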