1. Euphoria 2.5 Features..... ??
- Posted by Mario Steele <eumario at trilake.net> Dec 05, 2003
- 725 views
Hello all, I'm back!

All right, I wanted to pass along a couple of ideas, and maybe run one of them into the ground like someone determined to wear it out. I seriously think these would be good ideas for Euphoria 2.5.

First off, the worn-out idea/problem: a string type variable. Yeah, yeah, I realize we can define custom string types in Euphoria, but there's still one major problem with that. Even if it's a string type variable, Euphoria still allocates 4 bytes of memory to store a single character in the range 0 to 255 (or 0 to 65,535 for two-byte characters). For little baby strings, okay, that's cool, no big loss. But when we get into big files, for example Win32lib _BEFORE_ the breakup into separate libraries (and still somewhat now), we get a very big memory waste. We all know the interpreter doesn't allocate 4 bytes of memory for each character when it reads plain-text Euphoria code into memory for parsing and execution. So my question is: why should we have to waste memory to read a file into memory, especially a big file, which could double or even triple in size when loaded into a simple Euphoria sequence?

Yes, Euphoria slicing currently works beautifully on sequences, but who says slicing can't be used on string variables as well? It's not that much different. The only real difference is that sequences allow embedding, while a string is a straight flat line with nothing embedded in it. We would probably even get faster slicing on string variables than on sequences that hold nothing but single-byte characters. I'm not saying get rid of the sequence, because that's one of the things that makes Euphoria so powerful, but allocating 4 bytes of memory to store a single character seems to be a plain waste of memory. But then again, that's just my two cents.
Next, an actual new idea. I noticed that binding uses the Public Edition of Euphoria as its base, which is all fine and dandy, because you can remove the source and have the original Public Edition interpreter back with no problems. But I see two major problems with this. The first is file size. Yes, compared with other interpreters Euphoria is a lot smaller, but there is always room for improvement. When Euphoria source code is bound to the interpreter, nobody is going to strip the source back out to recover the interpreter unless they know exactly how to do that without destroying it.

What I propose is that you can probably lose about 100k or more from the interpreter used for binding if you divide the interpreter into two parts in the C code. Consider them modules, if you will. The first module would read clear, un-objectized, plain-text Euphoria code. This is really only needed by the interpreter, not by a bound executable. The second module would handle the objectized, shrouded code, which only Euphoria knows how to decipher, whether it is an external file or bound to the executable.

Since the objectized code would be more of a "binary script" file, native only to Euphoria, it would greatly improve load time and parsing time for bound Euphoria programs, which would then rely strictly on the code being purely objectized, bound or not. Since there is no need for the plain-text module, this would probably reduce the size of bound programs by another 100kb or more. This alone would make Euphoria even more desirable, with executables smaller still than what other interpreters produce.

These are only ideas, which I think would greatly enhance Euphoria, and I hope some, if not all, of you agree with me on this. Sorry Rob, I'm not trying to tell you how to do your job, just putting my two cents in. That's all for now.
L8ers, EuMario -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Mario Steele (EuMario) Tuscan Chat Client http://www.tuscanchat.com -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
2. Re: Euphoria 2.5 Features..... ??
- Posted by Mario Steele <eumario at trilake.net> Dec 05, 2003
- 680 views
Hey Al,

Thanks for your response. As for the Unicode issue, I do know of an easy way to make that work too, while still allowing a single string type definition. All you would have to do is a quick scan through the input stream to see whether each character is in the 0 to 255 range or in the two-byte 0 to 65,535 range. An example of this would be:

    function need_2_bytes(sequence stream)
        for x = 1 to length(stream) do
            if stream[x] > #FFFF then
                return -1   -- does not fit in two bytes
            elsif stream[x] > 255 then
                return 1    -- needs two bytes per character
            end if
        end for
        return 0            -- every character fits in one byte
    end function

Obviously, as soon as it runs into the first Unicode character we don't need to check any further: we assume the stream is Unicode and allocate the string as such. If it returns -1, we have a type_check error, which means someone threw in something bigger than 65,535. And if we get 0 back, the stream can be put into single-byte character holders. I'm sure there are faster algorithms out there that can scan byte-wise much faster than this.

The problem is that people don't want to deal with the memory routines themselves unless it's explicitly needed by Windows. They'd rather use sequences, and their programs get bloated. That's why sequences are so popular in Euphoria: they get away from the pointers that C/C++ depends on.

But again, this is just a simple programmer writing his two cents in. LOL

L8ers,
EuMario

Al Getz wrote:
>
>Hello Mario,
>
>I'd like to comment on your string type observations mainly.
>
>I too noticed that opening a standard text file in an editor
>written in Euphoria, using sequences to store the characters
>line by line, requires four times as much memory as there are
>bytes of characters. Such a waste, but if there was to be
>a string type that operates like sequences but only stores
>one byte per element (that's such a good idea!) it would
>have to be convertible to 2 bytes per character also,
>with the trend toward using Unicode, which would require
>two bytes per character.
>Currently, opening a file that contains 1,000,000 characters
>takes up 4,000,000 bytes of memory just for the characters alone.
>
>The only other way I can think of to do it is to manage your
>own memory block. You'd have to do a comparison to see if
>it took that much longer to peek/poke than to simply get
>a line of text from a sequence. In any case it would be
>more code to think about.
>
>I'm all for a string type, but as I was saying it would have
>to be expandable to 2 bytes per element, or possibly have
>a 'string2' type that uses 2 bytes instead of 1 per char.
>
>I've found that most of the time when I'm using a sequence
>a certain way...such as for a string of chars...I seldom
>change this later in the program to store, say, an integer or
>atom (that is, an integer in the form of a number, not a char).
>It seems easier to follow the code flow if the sequences
>don't change too much. Even when I'm returning a variable
>from a function that 'sometimes' has to return a sequence
>rather than an integer, I try to keep the basic format of
>the returned structure the same:
>return {0,""}
>for an error condition rather than just
>return 0
>which would have meant it returns an object rather than a sequence
>all the time.
>
>It would introduce some confusion, however, because you would
>have to know when a sequence was used as a string rather than
>a full-blown sequence.
>
>Most text applications don't use that much text anyway, but
>on the other hand, if you think about it, there are an AWFUL
>lot of blocks of RAM that look something like:
>
>00
>00
>00
>65
>00
>00
>00
>66
>
>etc
>
>when you open a text file in an editor that uses sequences to
>store text.
>
>I would imagine that using 'allocate_string' over and over
>as a line of text changes would take lots more time to
>do, but if the program was done with this in mind it might
>actually be faster to draw on screen if the text is already
>in memory...
>
>Take care,
>Al
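Al's suggestion of managing your own memory block can be sketched with the routines in machine.e plus the built-in peek() and poke(). This is only an illustration, not code from the thread: the sample text and variable names are made up, and because it stores one byte per character it only suits characters in the 0 to 255 range.

```euphoria
-- Sketch only: keep text in a raw memory block (1 byte per character)
-- instead of a sequence (4 bytes per element, as discussed above).
include machine.e   -- allocate(), free()

sequence line
atom addr
integer len

line = "Hello, Euphoria!"   -- made-up sample text
len = length(line)

addr = allocate(len)        -- one byte per character
poke(addr, line)            -- copy the characters into the block

-- Read the whole text back as an ordinary sequence when it is needed.
puts(1, peek({addr, len}))
puts(1, '\n')

-- Read a "slice" (characters 8 to 15) without copying the rest.
puts(1, peek({addr + 7, 8}))
puts(1, '\n')

free(addr)                  -- raw blocks are not garbage collected
```

The trade-off is the one Al describes: every access goes through peek/poke, and the program has to track the address and length itself and remember to call free().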
3. Re: Euphoria 2.5 Features..... ??
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Dec 05, 2003
- 657 views
On Fri, 05 Dec 2003 06:37:24 -0600, Mario Steele <eumario at trilake.net> wrote:

>this would probably reduce
>the size of bound programs by another 100kb or more in size.

I like this idea!
I tested a simple program:
original source: 884 bytes
shrouded: 273 bytes
bound: 74,015 bytes.
Now, if we could save 100kb or more in size, that works out at about -26K for every file I bind, so I can just bind everything on my hard disk and create loads of free space, potentially more than the original manufacturer's spec. :)

TeeHee,
Pete
4. Re: Euphoria 2.5 Features..... ??
- Posted by Mario Steele <eumario at trilake.net> Dec 05, 2003
- 645 views
*bonks himself*

Okay, maybe not 100k, but instead of 73k, more like 54k, or smaller. You have to think about how much of that is the routines for reading source and translating it into objectized code, as opposed to actual execution. Especially in defining keywords, instead of their tokenized, objectized counterparts. :P If that makes any sense.

Anywho, L8ers,
EuMario
5. Re: Euphoria 2.5 Features..... ??
- Posted by "Daniel Kluss" <codepilot at netzero.net> Dec 05, 2003
- 699 views
I would be cool if I could bind a hello world program that would only be a couple of K, like say 5 or 10

Daniel Kluss
6. Re: Euphoria 2.5 Features..... ??
- Posted by euphoric <euphoric at cklester.com> Dec 05, 2003
- 655 views
Daniel Kluss wrote:

> I would be cool if I could bind a hello world program that would only
> be a couple of K, like say 5 or 10
> Daniel Kluss

Daniel, I think you're already cool, regardless of how large a hello world program is after binding.
7. Re: Euphoria 2.5 Features..... ??
- Posted by "Hayden McKay" <hmck1 at dodo.com.au> Dec 06, 2003
- 638 views
Yes 'cause small proggies are always ~190k.
8. Re: Euphoria 2.5 Features..... ??
- Posted by "Igor Kachan" <kinz at peterlink.ru> Dec 09, 2003
- 640 views
Hello Mario again,

> From: Mario Steele <eumario at trilake.net>
> Subject: Euphoria 2.5 Features..... ??
>
> First off, The wearing out Idea/Problem. A String Type Variable.
> Yeah Yeah, I realize we can do custom string type variables in Euphoria,
> but there's still 1 major problem with it all.

[[snipped the solved problem]]

I do not know if anyone has used the trick below, but it seems it could be useful for some new text routines in Euphoria without a new char type.

--- code
    include machine.e   -- bytes_to_int(), int_to_bytes()

    sequence text

    text = "test of packing the text into the atoms "

    ? text

    puts(1,text)

    puts(1, "\n----- chars (integers) - ")
    ? length(text)
    puts(1, "\n\n\n\n")

    sequence TEXT TEXT = {}

    -- pack each group of 4 characters into a single atom
    for i=1 to length(text) by 4 do
        TEXT &= bytes_to_int(text[i..i+3])
    end for

    ? TEXT

    -- unpack each atom back into its 4 characters
    for i=1 to length(TEXT) do
        puts(1,int_to_bytes(TEXT[i]))
    end for

    puts(1, "\n----- atoms - ")
    ? length(TEXT)
--- end of code

Please try it, maybe it is not too bad ...

Regards,
Igor Kachan
kinz at peterlink.ru
9. Re: Euphoria 2.5 Features..... ??
- Posted by "Igor Kachan" <kinz at peterlink.ru> Dec 09, 2003
- 624 views
Hello Al,

> From: Al Getz <Xaxo at aol.com>
> Subject: RE: Euphoria 2.5 Features..... ??
>
> Hello there Igor,
>
> First off, that algorithm probably doesn't work for sequences
> that don't have lengths that are multiples of 4.

Yes, I was just in too much of a hurry to append {0}, {0,0} or {0,0,0} to texts whose length is not a multiple of 4. That part is clear to me.

> It's a neat idea though

Thanks.

> Second, why would you want to go through all that trouble
> when you can use allocate_text()/free() pairs to manage
> the storage of text if you are THAT worried about the
> waste of memory due to character sequence storage?

These allocate_text()/free() pairs are low-level, C-like stuff, and I am not *too* worried in this case. But why not use an atom as just a sequence of bytes? It is a somewhat artificial trick for Euphoria, but EU atoms have elementary particles too, same as real atoms. If you need the particles, get them right away: this is EU. But you see, this need only becomes really pressing sometimes, doesn't it? As some extra feature?

The Euphoria language has incredible flexibility. And with a separate front end and back end, this incredible flexibility may become fantastic, I think.

> Take care,
> Al

Good luck!

Regards,
Igor Kachan
kinz at peterlink.ru
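The padding Igor mentions for texts whose length is not a multiple of 4 could look roughly like the sketch below. pack_text() and unpack_text() are hypothetical helper names, not library routines, and the original length has to be kept separately so the padding can be dropped again. One caveat worth checking before relying on the trick: Euphoria integers are 31-bit, so any packed value above 1,073,741,823 is held as a floating-point atom, which may cancel out some or all of the memory saving.

```euphoria
include machine.e   -- bytes_to_int(), int_to_bytes()

-- Hypothetical helper: pack 4 characters per atom, zero-padding the tail.
function pack_text(sequence text)
    sequence packed
    integer pad
    pad = remainder(4 - remainder(length(text), 4), 4)
    text &= repeat(0, pad)          -- length is now a multiple of 4
    packed = {}
    for i = 1 to length(text) by 4 do
        packed &= bytes_to_int(text[i..i+3])
    end for
    return packed
end function

-- Hypothetical inverse: len is the original (unpadded) length.
function unpack_text(sequence packed, integer len)
    sequence text
    text = {}
    for i = 1 to length(packed) do
        text &= int_to_bytes(packed[i])
    end for
    return text[1..len]             -- drop the padding
end function

sequence s, p
s = "not a multiple of 4"
p = pack_text(s)
? length(s)                         -- number of characters
? length(p)                         -- number of atoms after packing
puts(1, unpack_text(p, length(s)) & '\n')
```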
10. Re: Euphoria 2.5 Features..... ??
- Posted by Isaac Raway <isaac-topica at blueapples.org> Dec 12, 2003
- 742 views
Concerning strings: Paul Graham wrote an excellent article a few months ago that has a section very pertinent to this idea of adding a string type. By the way, my final vote is *don't add strings*, just make the interpreter optimize sequences that are made only of characters. We don't need a string data type. We just need the sequence data type to be a bit smarter.

***
Most data structures exist because of speed. For example, many languages today have both strings and lists. Semantically, strings are more or less a subset of lists in which the elements are characters. So why do you need a separate data type? You don't, really. Strings only exist for efficiency. But it's lame to clutter up the semantics of the language with hacks to make programs run faster. Having strings in a language seems to be a case of premature optimization.

If we think of the core of a language as a set of axioms, surely it's gross to have additional axioms that add no expressive power, simply for the sake of efficiency. Efficiency is important, but I don't think that's the right way to get it.

The right way to solve that problem, I think, is to separate the meaning of a program from the implementation details. Instead of having both lists and strings, have just lists, with some way to give the compiler optimization advice that will allow it to lay out strings as contiguous bytes if necessary.
***

The full text of this article is available at http://www.paulgraham.com/hundred.html

Mario Steele wrote:
[snip]
> First off, The wearing out Idea/Problem. A String Type Variable.
[snip]
11. Re: Euphoria 2.5 Features..... ??
- Posted by "Derek Parnell" <ddparnell at bigpond.com> Dec 12, 2003
- 688 views
----- Original Message -----
>From: Isaac Raway
>Subject: Re: Euphoria 2.5 Features..... ??
>
>Concerning strings; Paul Graham Wrote an excellent
>article a few months ago that has a section very
>pertinent to this concept of adding a string type.
>By the way, my final vote is *don't add strings*,
>just make the interpreter optimize sequences that
>are only made of characters. We don't need a
>string data type. We just need the sequence data
>type to be a bit smarter.

Well, I tend to have a different opinion. That's because I view the current Euphoria idiom of a sequence as a variable-length list of objects, whereas a string is a variable-length list of characters. And we all know that an object is not the same as a character. An object is an entity whose value can be a sequence, an atom, or an integer, with no additional restrictions on its value domain. A character, on the other hand, is an integer constrained to the value domain from zero to some upper limit that depends on the encoding in use. For example, single-byte encodings (and the individual bytes of UTF-8) have an upper limit of 255, UTF-16 code units have an upper limit of 2^16 - 1, etc...

The problem we have with Euphoria is that there is no efficient way to tell it that a sequence CAN only contain characters. The 'type' mechanism can be used, but it has some overhead.

    constant EncodingMaxLimit = 255  -- assumed here: a single-byte encoding

    type string(sequence x)
        for i = 1 to length(x) do
            if not integer(x[i]) then
                return 0
            end if
            if x[i] < 0 or x[i] > EncodingMaxLimit then
                return 0
            end if
        end for
        return 1
    end type

Now we can force Euphoria to support strings, but still not waste RAM.

    string Name
    Name = "Derek"   -- okay, this works.
    Name[3] = 2.345  -- This should now fail.

But without this 'hint', Euphoria would be quite happy to turn a 'string' back into an ordinary sequence again.

Thus, I would argue that a string type needs to be built into Euphoria before we gain efficiency in both speed and RAM usage.

--
Derek
12. Re: Euphoria 2.5 Features..... ??
- Posted by Isaac Raway <isaac-topica at blueapples.org> Dec 12, 2003
- 651 views
I think perhaps you misunderstand me, so let me try again with more of my own words. I agree that Euphoria would benefit from having strings, however I believe that they should be defined in terms of a sequence. I strongly disagree that a character is not an object. In the most global sense of the word "object" (which is the sense we should employ when discussing any change like this to a language), *everything* is an object. Besides, a character can be easily defined in terms of the type construct. I believe that the correct way to go about adding "strings" to Euphoria is to modify the way a variable is define. So, if you wanted to define a string as in the example you used, when you write string Name you are really saying to the language sequence of char Name The langauge would then make optimizations for the variable Name so that it is stored in a way specific to the data type char. (Note that there may need to be the additon of a construct specifying this storage method.) Using this system allows much more felxibility and therefore would be much more valuable if implemented than the addition of strings as their own data type. It is good to keep the the basic parts of a language as simple as possible. If we only slightly modify the definition of sequence to by either (a) an ordered set of arbitrary objects, or (b) an ordered set of a given object type, then we have made a very valuable addition to the language. Peace and Euphoria, ~ Isaac Derek Parnell wrote: > > >----- Original Message ----- > >>From: Isaac Raway >>To: EUforum at topica.com >>Sent: Friday, December 12, 2003 4:56 PM >>Subject: Re: Euphoria 2.5 Features..... ?? >> >> >>Concerning strings; Paul Graham Wrote an excellent >>article a few months ago that has a section very >>pertinent to this concept of adding a string type. >>By the way, my final vote is *don't add strings*, >>just make the interpreter optimize sequences that >>are only made of characters. We don't need a >>string data type. 
>>We just need the sequence data type to be a bit smarter.
>>
>
>Well, I tend to have a different opinion. That's because I view the current
>Euphoria idiom of a sequence as being a variable length list of objects, but a
>string is a variable list of characters. And we all know that an object is not
>the same as a character. An object is an entity whose value's datatype can be a
>sequence, atom, or integer, with no additional restrictions on their value
>domain. But a character is an integer and is constrained to the value domain zero
>to some upper limit, depending on the encoding method in use. For example, ASCII
>and UTF-8 have the upper limit of 255, UTF-16 has the upper limit of 2^16, etc...
>
>The problem we have with Euphoria is that there is no way that we can tell it,
>in an efficient manner, that a sequence CAN only contain characters. The 'type'
>mechanism can be used, but it has some overheads.
>
> type string(sequence x)
>     for i = 1 to length(x) do
>         if not integer(x[i]) then
>             return 0
>         end if
>         if x[i] < 0 or x[i] > EncodingMaxLimit then
>             return 0
>         end if
>     end for
>     return 1
> end type
>
>Now we can force Euphoria to support strings, but still not waste RAM.
>
>string Name
>Name = "Derek" -- okay, this works.
>Name[3] = 2.345 -- This should now fail.
>
>But without this 'hint', Euphoria would be quite happy to turn a 'string' back
>into an ordinary sequence again.
>
>Thus, I would argue that a string type needs to be built into Euphoria before
>we gain both efficiency in speed and in RAM usage.
>
13. Re: Euphoria 2.5 Features..... ??
- Posted by "Derek Parnell" <ddparnell at bigpond.com> Dec 13, 2003
- 661 views
----- Original Message -----
From: "Isaac Raway" <isaac-topica at blueapples.org>
To: <EUforum at topica.com>
Subject: Re: Euphoria 2.5 Features..... ??

> I think perhaps you misunderstand me, so let me try again with more of
> my own words.
>
> I agree that Euphoria would benefit from having strings, however I
> believe that they should be defined in terms of a sequence.

LOL. Isaac, we are in complete agreement. This is exactly what I proposed for Euphoria about a year ago.

> I strongly disagree that a character is not an object. In the most
> global sense of the word "object" (which is the sense we should employ
> when discussing any change like this to a language), *everything* is an
> object. Besides, a character can be easily defined in terms of the type
> construct.

Yes, you are correct, but I was using the term 'object' in the sense that Euphoria uses it rather than the standard English meaning. From the Euphoric point of view, an object is not a character. And Euphoria does not have a character datatype, even though (as you say) the type system could be used to define one.

> I believe that the correct way to go about adding "strings" to Euphoria
> is to modify the way a variable is defined. So, if you wanted to define a
> string as in the example you used, when you write
>
> string Name
>
> you are really saying to the language
>
> sequence of char Name
>
> The language would then make optimizations for the variable Name so that it is
> stored in a way specific to the data type char. (Note that there may need to be
> the addition of a construct specifying this storage method.)

Yes! This is exactly what I proposed. And in the same manner we should be able to say ...

    sequence of integer Scores

which would cause Euphoria to ensure that only integers were stored in the sequence.

> Using this system allows much more flexibility and therefore would be
> much more valuable if implemented than the addition of strings as their
> own data type.
>
> It is good to keep the basic parts of a language as simple as
> possible. If we only slightly modify the definition of sequence to be
> either (a) an ordered set of arbitrary objects, or (b) an ordered set of
> a given object type, then we have made a very valuable addition to the
> language.

Yes it would.

--
Derek
14. Re: Euphoria 2.5 Features..... ??
- Posted by "Euman" <euman at bellsouth.net> Dec 13, 2003
- 663 views
----- Original Message -----
From: "Derek Parnell" <ddparnell at bigpond.com>

> Yes! This is exactly what I proposed. And in the same manner we should be able
> to say ...
>
> sequence of integer Scores

function sofi(sequence s, integer i)
    s &= repeat(i, 1)
    return s
end function

hehehehe just kidding!

Euman
15. Re: Euphoria 2.5 Features..... ??
- Posted by "Lucius L. Hilley III" <L3Euphoria at bellsouth.net> Dec 13, 2003
- 662 views
----- Original Message -----
From: "Derek Parnell" <ddparnell at bigpond.com>
To: <EUforum at topica.com>
Subject: Re: Euphoria 2.5 Features..... ??

<SNIP by Unkmar>
> sequence of integer Scores
>
> which would cause Euphoria to ensure that only integers were stored in the sequence.
<SNIP by Unkmar>
> Derek
>

global type seq_of_int(object x)
    sequence mask
    if atom(x) then
        return 0 -- an atom isn't a sequence
    else
        mask = repeat(0, length(x))
        if compare(mask, x * 0) then
            return 0 -- a sequence is embedded somewhere
        elsif compare(x, floor(x)) then
            return 0 -- a float is somewhere in the sequence
        else
            -- mask[i] is 1 where x[i] lies outside the integer range
            mask = (-1073741824 > x) or (x > 1073741823)
            return not find(1, mask) -- valid only if no 1 is found
        end if
    end if
end type

Lucius L. Hilley III - Unkmar

PS: This doesn't mean I am against a built-in system. In fact, I'm for one. I don't like the above because it has a horrible speed penalty.
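To make the speed penalty mentioned above concrete, here is a minimal sketch that counts how often a user-defined type routine runs while a sequence is built up. The names counted_seq_int and checks are invented for this illustration, and the sketch assumes the interpreter calls the type routine after each assignment to the typed variable (that is, type checking has not been switched off with "without type_check"). Treat it as a rough demonstration, not a benchmark.

```euphoria
-- Hypothetical demo: count how many times a user-defined type check runs.
integer checks
checks = 0

type counted_seq_int(object x)
    checks += 1                  -- side effect used only for measuring
    if not sequence(x) then
        return 0                 -- an atom is not a sequence
    end if
    for i = 1 to length(x) do
        if not integer(x[i]) then
            return 0             -- a float or an embedded sequence was found
        end if
    end for
    return 1
end type

counted_seq_int scores
scores = {}
for i = 1 to 1000 do
    -- every assignment to scores may trigger a walk over the whole sequence
    scores = append(scores, i)
end for

printf(1, "length = %d, type checks = %d\n", {length(scores), checks})
```

If the check does run on every assignment, each call walks the entire sequence, so building a sequence of n elements this way costs on the order of n*n element tests. A built-in "sequence of integer" would only need to test the one element being stored.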
16. Re: Euphoria 2.5 Features..... ??
- Posted by "Hayden McKay" <hmck1 at dodo.com.au> Dec 13, 2003
- 747 views
----- Original Message -----
From: "Lucius Hilley" <l3euphoria at bellsouth.net>
To: <EUforum at topica.com>
Subject: Re: Euphoria 2.5 Features..... ??

> ----- Original Message -----
> From: "Derek Parnell" <ddparnell at bigpond.com>
> To: <EUforum at topica.com>
> Sent: Friday, December 12, 2003 05:43 PM
> Subject: Re: Euphoria 2.5 Features..... ??
>
> <SNIP by Unkmar>
> > sequence of integer Scores
> >
> > which would cause Euphoria to ensure that only integers were stored in the
> > sequence.
> <SNIP by Unkmar>
> > Derek
>
> <Snip by Hayden>
> Lucius L. Hilley III - Unkmar
>
> PS: This doesn't mean I am against a built in system.
> In fact. I'm for one.
> I don't like the above because it has a horrible speed penalty.
>

Hayden

global type seq_int(object x)
    if sequence(x) then
        for i = 1 to length(x) do
            if not integer(x[i]) then
                return 0
            end if
        end for
    else
        return 0
    end if
    return 1
end type

This also checks that the sequence has no embedded sequences and contains integers only.
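As a quick illustration of how a type like this behaves, the sketch below repeats the seq_int definition so that it is self-contained and then calls it directly as a function on a few sample values. The sample values and the variable name Scores (borrowed from the "sequence of integer Scores" example earlier in the thread) are chosen only for illustration.

```euphoria
global type seq_int(object x)
    if sequence(x) then
        for i = 1 to length(x) do
            if not integer(x[i]) then
                return 0         -- a float or an embedded sequence
            end if
        end for
    else
        return 0                 -- an atom is not a sequence
    end if
    return 1
end type

-- A type can be called like a function to test a value.
? seq_int({1, 2, 3})     -- prints 1: flat sequence of integers
? seq_int("Derek")       -- prints 1: a string is a flat sequence of integers
? seq_int({1, {2}, 3})   -- prints 0: contains an embedded sequence
? seq_int({1, 2.5})      -- prints 0: contains a non-integer atom
? seq_int(5)             -- prints 0: an atom, not a sequence

-- Declaring a variable with the type makes Euphoria check assignments to it.
seq_int Scores
Scores = {90, 85, 77}
```

Because the check is an ordinary loop over the whole sequence, it costs time proportional to the length of the sequence on every checked assignment, which is the overhead a built-in "sequence of integer" declaration would be meant to avoid.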