Re: Strings
- Posted by Jason Gade <jaygade at yahoo.com> Feb 18, 2006
Al Getz wrote:
>
> Jason Gade wrote:
> >
> > Ryan W. Johnson wrote:
> > >
> > > Jason Gade wrote:
> > > >
> > > > Okay, so I'm going to propose two things with regards to strings, even
> > > > though I said that I wouldn't propose new stuff. Plus, I suppose it could
> > > > be handled by ESL (once we get around to it) if it isn't implemented
> > > > internally (preferred).
> > > >
> > > > A string built-in data type with:
> > > > Byte-size ASCII strings. For Kat, since she can't have goto.
> > > > Unicode UTF-8 strings.
> > > > One built-in type should be able to handle both.
> > > >
> > > > Atom has integer as a subclass for efficiency. I think that sequence can
> > > > have string as a subclass as well, since strings are a "basic" type in
> > > > most programming projects. Strings can be up-cast to sequences, like
> > > > integers can be up-cast to atoms.
> > >
> > > I would be very happy if this was implemented! Is there any
> > > reason to not have built-in strings?
> >
> > I admit I was tired and bored last night when I posted that. I've been
> > thinking about it all morning.
> >
> > One reason for *not* having built-in strings is that sequences handle 99% of
> > the functionality of strings already.
> >
> > This proposal would get more complicated when you want sequences of strings
> > as well. The string type would only be able to apply to a single-level
> > sequence.
> >
> > But a question that occurs to me is what percentage of sequences in any
> > given Euphoria application represent text strings?
> >
> > I think it really only matters for efficiency when working with large
> > amounts of text data, because sequence elements are 4 bytes each.
> >
> > So, basically, I retract my proposal.
> >
> > But it was helpful for reminding me of what features I want to see in a
> > Euphoria Standard Library string module.
> >
> > > ~Ryan W. Johnson
> > >
> > > Fluid Application Environment
> > > http://www.fluidae.com/
> > >
> > > [cool quote here, if i ever think of one...]
> >
> > --
> > "Any programming problem can be solved by adding a level of indirection."
> > --anonymous
> > "Any performance problem can be solved by removing a level of indirection."
> > --M. Haertel
> > "Premature optimization is the root of all evil in programming."
> > --C.A.R. Hoare
> > j.
>
> Hi again,
>
> You're right in that the main advantage to having a string type
> would be mostly in making a text editor, where there is so much
> text the memory savings would be great, but yes, quite a few
> apps won't benefit much with *that* kind of definition of 'string'.
> There is, however, another definition of 'string' where it's actually
> a memory element:
>
>   string s
>   s="My Window"
>
> where internally s is a pointer to the string, so it can be passed
> to a C function like:
>
>   x=CreateWindow(s,...)
>
> without having the bother of s=allocate_string("My Window").

Well, using allocate_string doesn't seem like *too* much work to me. But you lose a lot of the dynamics of sequences with manual allocation.

> This would mean other functions such as 'printf(..)' would
> have to also support this new kind of data type:
>
>   printf(1,"%s\n",{s})
>
> where Euphoria would recognize 's' as a memory string object and
> make the necessary call to print that type of object rather than
> say a sequence string.

Well, you could always use C for stuff like that... But routines that convert between static strings in memory and sequences would be useful. So if 's' was a pointer to a string then you could do:

  printf(1, "%s\n", {stringz(s)})

> Al
>
> My bumper sticker: "I brake for LED's"

This is just a mental exercise, but something else occurs to me. Euphoria uses bit-flags to determine the type of data that it is working with -- a pointer to a sequence or a double, or a 31-bit integer.
See http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=A&fromYear=A&toMonth=A&toYear=A&postedBy=Robert+Craig&keywords=%22bit+fiddling%22

Euphoria *could* use bit flags to say "pointer to string".

--
"Any programming problem can be solved by adding a level of indirection."
--anonymous
"Any performance problem can be solved by removing a level of indirection."
--M. Haertel
"Premature optimization is the root of all evil in programming."
--C.A.R. Hoare
j.
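
For what it's worth, here is a minimal sketch of the kind of conversion routine mentioned above: reading a zero-terminated string in memory back into an ordinary sequence. It assumes the classic machine.e include (for allocate_string and free) and the built-in peek. The name stringz is hypothetical, used only for this discussion; it is not an existing library routine.

```euphoria
include machine.e  -- assumed: provides allocate_string() and free()

-- Hypothetical helper: copy a zero-terminated byte string at 'addr'
-- into a regular Euphoria sequence, one byte per element.
function stringz(atom addr)
    sequence s
    integer c
    s = ""
    c = peek(addr)
    while c != 0 do
        s = s & c          -- append the byte just read
        addr = addr + 1    -- step to the next byte
        c = peek(addr)
    end while
    return s
end function

atom p
p = allocate_string("My Window")   -- C-style string in allocated memory
printf(1, "%s\n", {stringz(p)})    -- convert back to a sequence to print it
free(p)                            -- manual allocation means manual cleanup
```

The braces around the argument matter: printf treats a bare sequence as a list of separate values, so the converted string has to be wrapped as a single element to be printed whole with %s.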