1. Strings
- Posted by Michael Nelson <mike-nelson-ODAAT at WORLDNET.ATT.NET> Dec 09, 1999
- 708 views
- Last edited Dec 10, 1999
I really want to stay out of the war over strings, but I thought I'd share the string type I use. It is easier to read than the others I've seen in these posts, and should be comparably efficient:

    global type string(object theObject)
        object temp
        -- anything that is not a sequence cannot be a string
        if not sequence(theObject) then return 0 end if
        for i = 1 to length(theObject) do
            temp = theObject[i]
            -- every element must be an integer in the byte range
            if not integer(temp) or temp < 0 or temp > 255 then return 0 end if
        end for
        return 1
    end type

--Mike Nelson
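A minimal usage sketch of the type above, for readers who want to try it. The definition is repeated (with `then` after the sequence test) so the block runs on its own; the variable name and the test values are illustrative assumptions, not part of the original post.

```euphoria
-- Sketch: the string type from the post, followed by a few checks.
global type string(object theObject)
    object temp
    if not sequence(theObject) then return 0 end if
    for i = 1 to length(theObject) do
        temp = theObject[i]
        if not integer(temp) or temp < 0 or temp > 255 then return 0 end if
    end for
    return 1
end type

string name              -- every assignment to name is now type-checked
name = "Euphoria"

? string("abc")          -- 1: all elements are integers in 0..255
? string({65, 66.5})     -- 0: 66.5 is not an integer
? string({65, 300})      -- 0: 300 is out of range
? string({"nested"})     -- 0: the element is itself a sequence
? string(65)             -- 0: an atom is not a sequence
```

Note that the check walks the whole sequence on every assignment, so the cost grows with the length of the text.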
2. Re: Strings
- Posted by Kat <KSMiTH at PELL.NET> Dec 10, 1999
- 684 views
There is a war over strings?

----- Original Message -----
From: Michael Nelson <mike-nelson-ODAAT at WORLDNET.ATT.NET>
To: <EUPHORIA at LISTSERV.MUOHIO.EDU>
Sent: Friday, December 10, 1999 1:04 AM
Subject: Strings

I really want to stay out of the war over strings, but I thought I'd share the string type I use. It is easier to read than the others I've seen in these posts, and should be comparably efficient:

    global type string(object theObject)
        object temp
        if not sequence(theObject) then return 0 end if
        for i = 1 to length(theObject) do
            temp = theObject[i]
            if not integer(temp) or temp < 0 or temp > 255 then return 0 end if
        end for
        return 1
    end type

--Mike Nelson
3. Strings
- Posted by Jason Gade <jaygade at yahoo.com> Feb 18, 2006
- 657 views
Okay, so I'm going to propose two things with regards to strings, even though I said that I wouldn't propose new stuff. Plus, I suppose it could be handled by ESL (once we get around to it) if it isn't implemented internally (preferred).

A string built-in data type with:
- Byte-size ASCII strings. For Kat, since she can't have goto.
- Unicode UTF-8 strings.

One built-in type should be able to handle both.

Atom has integer as a subclass for efficiency. I think that sequence can have string as a subclass as well, since strings are a "basic" type in most programming projects. Strings can be up-cast to sequences, like integers can be up-cast to atoms.

One question: are string constants stored as byte-strings or as DWORD-strings in the Euphoria interpreter?

--
"Any programming problem can be solved by adding a level of indirection." --anonymous
"Any performance problem can be solved by removing a level of indirection." --M. Haertel
"Premature optimization is the root of all evil in programming." --C.A.R. Hoare
j.
4. Re: Strings
- Posted by Ryan W. Johnson <ryanj at fluidae.com> Feb 18, 2006
- 668 views
Jason Gade wrote:
> A string built-in data type with:
> Byte-size ASCII strings. For Kat, since she can't have goto.
> Unicode UTF-8 strings.
> One built-in type should be able to handle both.
>
> Atom has integer as a subclass for efficiency. I think that sequence can have
> string as a subclass as well, since strings are a "basic" type in most
> programming projects. Strings can be up-cast to sequences, like integers can
> be up-cast to atoms.

I would be very happy if this was implemented! Is there any reason to not have built-in strings?

~Ryan W. Johnson

Fluid Application Environment
http://www.fluidae.com/

[cool quote here, if i ever think of one...]
5. Re: Strings
- Posted by Jason Gade <jaygade at yahoo.com> Feb 18, 2006
- 665 views
Ryan W. Johnson wrote:
> I would be very happy if this was implemented! Is there any
> reason to not have built-in strings?

I admit I was tired and bored last night when I posted that. I've been thinking about it all morning.

One reason for *not* having built-in strings is that sequences handle 99% of the functionality of strings already.

This proposal would get more complicated when you want sequences of strings as well. The string type would only be able to apply to a single-level sequence.

But a question that occurs to me is what percentage of sequences in any given Euphoria application represent text strings?

I think it really only matters for efficiency when working with large amounts of text data, because sequence elements are 4 bytes each.

So, basically, I retract my proposal.

But it was helpful for reminding me of what features I want to see in a Euphoria Standard Library string module.

j.
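A rough back-of-the-envelope sketch of the memory argument, assuming 4 bytes per sequence element as stated in the post and ignoring any per-sequence overhead. The 1,000,000-character figure is an arbitrary example, not a measurement.

```euphoria
-- Sketch: storage for the same text as a sequence vs. as raw bytes.
integer chars, as_sequence, as_bytes
chars = 1000000              -- arbitrary example size
as_sequence = chars * 4      -- 4 bytes per element, per the post
as_bytes = chars             -- 1 byte per character in a byte string
printf(1, "%d chars: %d bytes as a sequence, %d bytes as a byte string\n",
       {chars, as_sequence, as_bytes})
```

The factor of four only becomes noticeable for large bodies of text, which is the point made above.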
6. Re: Strings
- Posted by Al Getz <Xaxo at aol.com> Feb 18, 2006
- 664 views
Jason Gade wrote:
> I think it really only matters for efficiency when working with large amounts
> of text data. Because sequence elements are 4-bytes each.
>
> So, basically, I retract my proposal.

Hi again,

You're right in that the main advantage of having a string type would be mostly in making a text editor, where there is so much text that the memory savings would be great, but yes, quite a few apps won't benefit much from *that* kind of definition of 'string'.

There is, however, another definition of 'string' where it's actually a memory element:

    string s
    s = "My Window"

where internally s is a pointer to the string, so it can be passed to a C function like:

    x = CreateWindow(s, ...)

without having the bother of s = allocate_string("My Window").

This would mean other functions such as 'printf(..)' would also have to support this new kind of data type:

    printf(1, "%s\n", {s})

where Euphoria would recognize 's' as a memory string object and make the necessary call to print that type of object rather than, say, a sequence string.

Take care,
Al

And, good luck with your Euphoria programming!

My bumper sticker: "I brake for LED's"
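For comparison, a minimal sketch of the manual route that exists today, assuming the allocate_string() and free() routines from machine.e (std/machine.e in 4.0). The actual C call is left out, since its declaration depends on the library being wrapped.

```euphoria
include machine.e   -- allocate_string() and free()

atom p
-- copy the text, plus a 0 terminator, into allocated memory
p = allocate_string("My Window")

-- p is now an address that could be handed to a C routine
? peek(p)        -- 77, the code for 'M'
? peek(p + 9)    -- 0, the terminator after the 9 characters

free(p)          -- the caller is responsible for releasing the memory
```

The allocate/free pair is the bookkeeping that a pointer-backed string type would hide.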
7. Re: Strings
- Posted by Jason Gade <jaygade at yahoo.com> Feb 18, 2006
- 667 views
Al Getz wrote:
> There is, however, another definition of 'string' where it's actually
> a memory element:
>
> string s
> s="My Window"
>
> where internally s is a pointer to the string, so it can be passed
> to a C function like:
>
> x=CreateWindow(s,...)
>
> without having the bother of s=allocate_string("My Window").

Well, using allocate_string doesn't seem like *too* much work to me. But you lose a lot of the dynamics of sequences with manual allocation.

> This would mean other function such as 'printf(..)' would
> have to also support this new kind of data type:
>
> printf(1,"%s\n",{s})
>
> where Euphoria would recognize 's' as a memory string object and
> make the necessary call to print that type of object rather than
> say a sequence string.

Well, you could always use C for stuff like that...

But routines that convert between static strings in memory and sequences would be useful. So if 's' was a pointer to a string then you could do:

    printf(1, "%s\n", {stringz(s)})

This is just a mental exercise, but something else occurs to me. Euphoria uses bit-flags to determine the type of data that it is working with: a pointer to a sequence or a double, or a 31-bit integer. See
http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=A&fromYear=A&toMonth=A&toYear=A&postedBy=Robert+Craig&keywords=%22bit+fiddling%22

Euphoria *could* use bit flags to say "pointer to string".

j.
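A minimal sketch of the kind of conversion routine described above. stringz() is the name used in the post and is a hypothetical helper here, not an existing library routine; the sketch assumes allocate_string() and free() from machine.e (std/machine.e in 4.0). The result is wrapped in braces for printf() so that %s receives the whole string rather than its first element.

```euphoria
include machine.e   -- allocate_string() and free()

-- Hypothetical helper, named after the call in the post: read bytes
-- starting at addr up to (not including) the 0 terminator.
function stringz(atom addr)
    sequence s
    integer c
    s = ""
    c = peek(addr)
    while c != 0 do
        s = append(s, c)
        addr = addr + 1
        c = peek(addr)
    end while
    return s
end function

atom p
p = allocate_string("My Window")
printf(1, "%s\n", {stringz(p)})   -- prints My Window
free(p)
```

Appending one byte at a time keeps the sketch short; a real routine would probably find the terminator first and read the bytes in one peek({addr, len}) call.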
8. Re: Strings
- Posted by Bernie Ryan <xotron at bluefrog.com> Feb 18, 2006
- 671 views
All of my libraries already support memory-based "C" type string handling, which is written in Euphoria assembler.

Bernie

My files in archive: WMOTOR, XMOTOR, W32ENGIN, MIXEDLIB, EU_ENGIN, WIN32ERU, WIN32API

Can be downloaded here:
http://www.rapideuphoria.com/cgi-bin/asearch.exu?dos=on&win=on&lnx=on&gen=on&keywords=bernie+ryan
9. Re: Strings
- Posted by Jason Gade <jaygade at yahoo.com> Feb 18, 2006
- 683 views
Jason Gade wrote:
> This is just a mental exercise, but something else occurs to me. Euphoria uses
> bit-flags to determine the type of data that it is working with -- a pointer
> to a sequence or a double, or a 31-bit integer. See
> http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=A&fromYear=A&toMonth=A&toYear=A&postedBy=Robert+Craig&keywords=%22bit+fiddling%22
>
> Euphoria *could* use bit flags to say "pointer to string".

Oh, and it looks like Euphoria *could* offer the programmer a way to check whether a variable has been initialized.

j.
10. Re: Strings
- Posted by "Greg Haberek" <ghaberek at gmail.com> Feb 18, 2006
- 685 views
> Oh, and it looks like Euphoria *could* offer the programmer a way to check whether a variable has been initialized.

I proposed this a while ago. I believe object() should return true for initialized and false for uninitialized, like this:

    sequence s
    if not object( s ) then
        -- not initialized!
        s = {}
    end if
~Greg
11. Strings
- Posted by Matt Lewis <matthewwalkerlewis at ?mail.c?m> May 29, 2008
- 685 views
Kat wrote:
> Eu doesn't have a string type, there is no string in Eu. They are sequences,
> where every 8bit character takes up 32bits. And you are adding a function that
> replaces 3 whole lines of code that do the same thing, a trivial thing, while
> important stuff doesn't get added.

This is something that seems like it should be trivial, but isn't, given the design of Euphoria. Adding a new primitive would require changes all over the place (hundreds of places, easily). We'd probably have to redo some of the test macros, and I suspect that we'd lose a lot of the current speed of Euphoria.

If someone can figure out an easy way to do this, I suspect it'd get into the language pretty quickly. Of course, that ignores the prospect of Unicode, which is a whole 'nother can of worms, and also something that will require some drastic recoding in the back end, though due to the way that sequences are implemented, probably none in the front end.

Matt
12. Re: Strings
- Posted by Shawn Pringle <shawn.pringle at gma?l.c?m> May 30, 2008
- 697 views
Matt Lewis wrote:
> If someone can figure out an easy way to do this, I suspect it'd get into
> the language pretty quickly. Of course, that ignores the prospect of
> Unicode, which is a whole 'nother can of worms, and also something that
> will require some drastic recoding in the back end, though due to the
> way that sequences are implemented, probably none in the front end.

It seems you can do Unicode well enough already:

    sequence shawn
    shawn = { 's', 'h', #0430, 'w', 'n' } -- #0430 is Cyrillic a

You can use poke2 from words.e, or poke2 from my own pokpeek2.e, and pass the result to a Unicode C routine. All manipulation works like any other sequence.

Shawn Pringle
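A minimal sketch of what such a 16-bit poke amounts to, using only the built-in poke() so it does not depend on words.e or pokpeek2.e. alloc_utf16() is a hypothetical helper name; the sketch assumes little-endian byte order, code units no larger than #FFFF, and allocate() and free() from machine.e (std/machine.e in 4.0).

```euphoria
include machine.e   -- allocate() and free()

-- Hypothetical helper (not a library routine): copy a sequence of
-- 16-bit code units into memory, low byte first, followed by a
-- 16-bit 0 terminator. Returns the address of the buffer.
function alloc_utf16(sequence s)
    atom mem
    integer c
    mem = allocate(2 * (length(s) + 1))
    for i = 1 to length(s) do
        c = s[i]
        poke(mem + 2 * (i - 1), {and_bits(c, #FF), floor(c / #100)})
    end for
    poke(mem + 2 * length(s), {0, 0})
    return mem
end function

sequence shawn
atom p
shawn = {'s', 'h', #0430, 'w', 'n'}   -- #0430 is Cyrillic a
p = alloc_utf16(shawn)
? peek(p + 4)    -- 48 (#30), low byte of #0430
? peek(p + 5)    -- 4  (#04), high byte of #0430
free(p)
```

The sequence itself stays an ordinary Euphoria sequence; only the copy handed to C is packed into two bytes per element.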
13. Re: Strings
- Posted by Matt Lewis <matthewwalkerlewis at ?mail.c?m> May 30, 2008
- 708 views
Shawn Pringle wrote:
> It seems you can do Unicode well enough already:
>
> sequence shawn
> shawn = { 's', 'h', #0430, 'w','n' } -- a is cyrillic a.

Yes, wxEuphoria does this currently.

> You can use poke2 from words.e or poke2 from my own pokpeek2.e
> and use with a unicode C routine. All manipulation works
> like any another sequence

Or you can poke2 with the built-in in 4.0. But then you'll also have to wrap all of your I/O functions, too. That's where all the work will be.

Matt
14. Strings
- Posted by Mathew Hounsell <mfh03 at UOW.EDU.AU> Aug 05, 1999
- 726 views
Yes, I'm going to wade into this again.

1st) Memory conservation is good, especially when there is a large amount of waste.

2nd) A speed decrease is not nice, but acceptable, because most importantly ***type checking reduces errors***. It also reduces the need for manual error detection, which can get quite complex, slow and time-consuming with Euphoria. The alternative is to let the routine die from an obscure error and let the finger be pointed at the routine it dies in rather than the one that started the problem.

I will again say strings are good. Consider the standard indexing on a sequence: it must require quite complex operations, whereas indexing on a string would require a simple bounds check, an offset calculation and a peek.

Intel chips align to 4 bytes, so unless the string has a length which is a multiple of 4 there is a gap of up to 3 bytes to append into, reducing the time for that.

There are many other benefits, but I'm going to dinner.

-------------------------
Sincerely,
Mathew Hounsell
mat.hounsell at excite.com
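The alignment arithmetic can be sketched as follows, assuming a hypothetical byte string whose storage is rounded up to a multiple of 4 bytes, as the post describes. slack() is an illustrative name, not an existing routine.

```euphoria
-- Hypothetical helper: unused bytes left in the last 4-byte unit
-- of a byte string holding len characters.
function slack(integer len)
    return remainder(4 - remainder(len, 4), 4)
end function

? slack(9)     -- 3: storage rounds up to 12, so 3 bytes are free
? slack(10)    -- 2
? slack(12)    -- 0: already a multiple of 4, no gap
```

So an append of up to slack(len) characters could, under that assumption, reuse the existing allocation.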