1. Strings, again

I know this topic has been done to death, but if I may resurrect it?

I would like to have a pre-defined type of 'string', which would be a single
sequence (not nested) of atoms with numeric value between 0 and 256, without a
final delimiter. All operations would be allowed, as for any other sequence.

The advantage would be that when coding a program which deals with a lot of
human readable text the use of the 'string' type would speed development and
catch bugs sooner. User defined procedures to output human readable text would be
easier to code.

I believe that this would be a low impact change to make, giving 'string'
devotees a good start in writing their own string based routines, but maintain
the purity of Euphoria sequences. This would be a logical extension to the
inputting of text already catered for in the permitted form < a = "text string" >

Bunjo

2. Re: Strings, again

A.C.Harper wrote:

> I would like to have a pre-defined type of 'string', which would be a single
> sequence

"C" null terminated strings ?
basic type strings ?
EBCDIC type strings ?
bstr type string ?
unicode type strings
What do you propose to be the standard ?

A sequence can handle all of these types.

Bernie

3. Re: Strings, again

I think it should be like this:
"abcd"    --string
{0,1,2,3} --string(non null-terminated so this would work)
{-1,256}  --not a string: -1 and 256 wouldn't be allowed(meaning
            strings would be 8-bit, and therefore use less RAM)
{1,2,3,{}}--not a string:contains a sequence

Using this method, I would have many uses for strings.
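
The four cases above can already be checked with a user-defined type. This is only a minimal sketch: the type name `string` and the 0..255 range are assumptions taken from this thread, not anything built into Euphoria, and a type like this validates values without making them use less RAM.

```euphoria
-- Hypothetical user-defined type: a flat sequence of integers in 0..255.
type string(object x)
    if not sequence(x) then
        return 0
    end if
    for i = 1 to length(x) do
        if not integer(x[i]) then
            return 0        -- nested sequence or fractional atom
        elsif x[i] < 0 or x[i] > 255 then
            return 0        -- outside the 8-bit range
        end if
    end for
    return 1
end type

? string("abcd")        -- 1
? string({0,1,2,3})     -- 1
? string({-1,256})      -- 0
? string({1,2,3,{}})    -- 0
```

Declaring a variable as `string name` would then make the interpreter report a type-check failure as soon as something like {1,2,3,{}} is assigned to it (as long as type checking is not turned off).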

Bernie Ryan wrote:
> 
> 
> A.C.Harper wrote:
> 
> > I would like to have a pre-defined type of 'string', which would be a single
> > sequence
> 
> "C" null terminated strings ?
> basic type strings ?
> EBCDIC type strings ?
> bstr type string ?
> unicode type strings
> What do you propose to be the standard ?
> 
> A sequence can handle all of these types.
> 
> Bernie
>

4. Re: Strings, again

On Sat, 05 Jun 2004 16:00:06 -0700, CoJaBo <guest at RapidEuphoria.com>
wrote:

(rudely top-)posted by: CoJaBo <cojabo at suscom.net>
>
>I think it should be like this:
>"abcd"    --string
>{0,1,2,3} --string(non null-terminated so this would work)
I disagree there: {} clearly denote a sequence.
>{-1,256}  --not a string: -1 and 256 wouldn't be allowed(meaning
>            strings would be 8-bit, and therefore use less RAM)
>{1,2,3,{}}--not a string:contains a sequence
>
>Using this method, I would have many uses for strings.
>
>Bernie Ryan wrote:
>> 
>> 
>> A.C.Harper wrote:
>> 
>> > I would like to have a pre-defined type of 'string', which would be a
>> > single sequence
>> 
>> "C" null terminated strings ?
>> basic type strings ?
>> EBCDIC type strings ?
>> bstr type string ?
>> unicode type strings
>> What do you propose to be the standard ?
C-style, utf8 would be my choice. Admittedly the others, if you need
them, would have to be handled via sequences.
>> 
>> A sequence can handle all of these types.
From the application side, indeed, no gain.
From the programmer side, especially ? and trace(), much benefit.

HMMM...
How can I hammer this home?

POINT 1: A string type will make nothing possible that cannot be done
using sequences. Nothing, and I mean nothing. Some fairly hideous
syntax might be necessary to overcome the total failure to distinguish
between currently ambiguous strings and sequences (eg {tStr,"ABC"} vs
{tSeq,{65,66,67}}), but it CAN be done. (and if you never yet needed
to, bully for you).

POINT 2: Adding a string type will not stop sequences working.
(Yes, it has got to the point where I feel the need to state that)

POINT 3: A string type will make the interpreter slower.

POINT 4: A string type will make programmers more productive,
at least when tracking down bugs via trace() and/or examining ex.err
that involve a fair bit of textual stuff. (and if you never yet went
down that path, bully for you)

POINT 5: It has nothing to do with the programmer knowing what a
variable contains, but the infrastructure knowing it. This discussion
appears to have gone seriously off track regarding printing. That is
not, and in my view never has been a problem when the programmer is
writing code, but it certainly IS when the programmer is relying on
already written code (from RDS) they CANNOT change - specifically the
trace window and the ex.err file.

POINT 6: There *ARE* benefits to having strings, EVEN IF THEY DO NOT
APPLY TO YOU. (And as pt 3, a downside too).

POINT 7: Rob won't do this, so please stop rubbing salt in the open
wound which is: I can't have my beloved string type!!!!!

OK?
Pete
PS If Rob's promised Eu in Eu, which is already promised to be far
slower than the real deal, could handle strings (and be even slower),
I still think that would be pretty cool!
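
To illustrate points 1 and 5: once stored, a string literal and the equivalent sequence of character codes are the same object, so code that did not create the value cannot tell which one was intended. A minimal sketch (the variable names are only for illustration):

```euphoria
sequence a, b

a = "ABC"
b = {65, 66, 67}

if equal(a, b) then
    puts(1, "a and b are the same object\n")
end if

print(1, a)     -- {65,66,67}
puts(1, "\n")
puts(1, b)      -- ABC
puts(1, "\n")
```

print() and puts() each display both variables the same way; only the programmer's choice of routine decides whether text or numbers appear.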

5. Re: Strings, again

Pete wrote:

<snip>

> HMMM...
> How can I hammer this home?
>
> POINT 1: A string type will make nothing possible that cannot be done
> using sequences. Nothing, and I mean nothing. Some fairly hideous
> syntax might be necessary to overcome the total failure to distinguish
> between currently ambiguous strings and sequences (eg {tStr,"ABC"} vs
> {tSeq,{65,66,67}}), but it CAN be done. (and if you never yet needed
> to, bully for you).

I agree. Using such a syntax will be awkward, but not a real problem
when only my own program is concerned.
But things look different, when a program written by me writes some data
to a file, and a program written by someone else reads the data.

Quoting "lib_e_g.htm#get" (Eu 2.4):
"get() can read arbitrarily complicated Euphoria objects. You could have
a long sequence of values in braces and separated by commas, e.g.
{23, {49, 57}, 0.5, -1, 99, 'A', "john"}."

"The combination of print() and get() can be used to save a Euphoria
object to disk and later read it back. This technique could be used to
implement a database as one or more large Euphoria sequences stored in
disk files."

Now let's see what happens when I use this technique to implement a
database:
include get.e
sequence data
integer fn

data = {23, {49, 57}, 0.5, -1, 99, 'A', "john"}

-- save the object to disk with print()
fn = open("test.dat", "w")
print(fn, data)
close(fn)

-- read it back; get() returns {error_status, value}
fn = open("test.dat", "r")
data = get(fn)
close(fn)

print(1, data)


As we already would have expected, the program prints
   {0, {23,{49,57},0.5,-1,99,65,{106,111,104,110}}}
on the screen.

The information that "john" is a string got lost here, by print()ing the
data to the file. Even when I change the output command, or change the
content of the file manually, so that "test.dat" actually contains
   {23,{49,57},0.5,-1,99,65,"john"}

the following program
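include get.e
sequence data
integer fn

-- "test.dat" now holds the quoted literal "john"
fn = open("test.dat", "r")
data = get(fn)    -- still yields "john" as a sequence of character codes
close(fn)
print(1, data)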


also prints
   {0, {23,{49,57},0.5,-1,99,65,{106,111,104,110}}}
on the screen.

To make a long story short:
In order to make data exchange via print() and get() unambiguous, the
writing program must use a syntax like {tStr,"ABC"}, and the reading
program (possibly written by someone else), must use THE SAME SYNTAX,
too. In other words, some kind of standard will be required.
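
Such a convention might look like the following. This is a sketch only: the tag names tStr and tSeq come from Pete's post, but their numeric values, the file name "tagged.dat" and the {tag, value} layout are assumptions that writer and reader would have to agree on.

```euphoria
include get.e

-- Hypothetical tags; both programs must use the same values.
constant tStr = 1, tSeq = 2

sequence record, result
integer fn

-- Writer: store the value together with its tag.
record = {tStr, "john"}
fn = open("tagged.dat", "w")
print(fn, record)
close(fn)

-- Reader: get() returns {error_status, value}.
fn = open("tagged.dat", "r")
result = get(fn)
close(fn)

if result[1] = GET_SUCCESS then
    record = result[2]
    if record[1] = tStr then
        puts(1, record[2])      -- tagged as text: john
        puts(1, "\n")
    elsif record[1] = tSeq then
        print(1, record[2])     -- tagged as numbers: show the values
        puts(1, "\n")
    end if
end if
```

The tag survives the round trip through print() and get() because it is ordinary data, which is exactly why it only helps if both sides follow the same standard.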

> POINT 2: Adding a string type will not stop sequences working.
> (Yes, it has got to the point where I feel the need to state that)
>
> POINT 3: A string type will make the interpreter slower.
>
> POINT 4: A string type will make programmers more productive,
> at least when tracking down bugs via trace() and/or examining ex.err
> that involve a fair bit of textual stuff. (and if you never yet went
> down that path, bully for you)
>
> POINT 5: It has nothing to do with the programmer knowing what a
> variable contains, but the infrastructure knowing it.

Thanks for pointing this out again.

> This discussion
> appears to have gone seriously off track regarding printing. That is
> not, and in my view never has been a problem when the programmer is
> writing code, but it certainly IS when the programmer is relying on
> already written code (from RDS) they CANNOT change - specifically the
> trace window and the ex.err file.

And it IS also a problem when exchanging data, as I tried to show above.
I, the programmer of the writing program, know that "john" is a string.
What I want is this: Just write the data to a file, send the file to
someone else, and then the recipient should be able to *simply* and
*reliably* retrieve the information, that I sent him.
What I do not want is to call her/him by phone and say: "Hey, variable #1
is a sequence, variable #2 is ...".

> POINT 6: There *ARE* benefits to having strings, EVEN IF THEY DO NOT
> APPLY TO YOU. (And as pt 3, a downside too).
>
> POINT 7: Rob won't do this, so please stop rubbing salt in the open
> wound which is: I can't have my beloved string type!!!!!
>
> OK?

Yes, you wrote a well-balanced summary. Thanks.

> Pete
> PS If Rob's promised Eu in Eu, which is already promised to be far
> slower than the real deal, could handle strings (and be even slower),
> I still think that would be pretty cool!

In principle I think so too, but then it wouldn't be completely
compatible with the official version any more. :-(

Regards,
   Juergen

6. Re: Strings, again

Pete Lomax wrote:
> 
> On Sat, 05 Jun 2004 16:00:06 -0700, CoJaBo <guest at RapidEuphoria.com>
> wrote:
> 
> (rudely top-)posted by: CoJaBo <cojabo at suscom.net>
I do not see why this is considered "rude".
There are many other posts posted like this,
and I have never seen any rule against it anywhere.
In fact, the only rule I could find was
"Try to stay on the topic of Euphoria or closely-related issues.".

> >
> >I think it should be like this:
> >"abcd"    --string
> >{0,1,2,3} --string(non null-terminated so this would work)
> I disagree there: {} clearly denote a sequence.
> >{-1,256}  --not a string: -1 and 256 wouldn't be allowed(meaning
> >            strings would be 8-bit, and therefore use less RAM)
> >{1,2,3,{}}--not a string:contains a sequence
> >
> >Using this method, I would have many uses for strings.
> >
> >Bernie Ryan wrote:
> >> 
> >> 
> >> A.C.Harper wrote:
> >> 
> >> > I would like to have a pre-defined type of 'string', which would be a
> >> > single sequence
> >> 
> >> "C" null terminated strings ?
> >> basic type strings ?
> >> EBCDIC type strings ?
> >> bstr type string ?
> >> unicode type strings
> >> What do you propose to be the standard ?
> C-style, utf8 would be my choice. Admittedly the others, if you need
> them, would have to be handled via sequences.
> >> 
> >> A sequence can handle all of these types.
> From the application side, indeed, no gain.
> From the programmer side, especially ? and trace(), much benefit.
> 
> HMMM...
> How can I hammer this home?
> 
> POINT 1: A string type will make nothing possible that cannot be done
> using sequences. Nothing, and I mean nothing. Some fairly hideous
> syntax might be necessary to overcome the total failure to distinguish
> between currently ambiguous strings and sequences (eg {tStr,"ABC"} vs
> {tSeq,{65,66,67}}), but it CAN be done. (and if you never yet needed
> to, bully for you).
> 
> POINT 2: Adding a string type will not stop sequences working.
> (Yes, it has got to the point where I feel the need to state that)
> 
> POINT 3: A string type will make the interpreter slower.
> 
> POINT 4: A string type will make programmers more productive,
> at least when tracking down bugs via trace() and/or examining ex.err
> that involve a fair bit of textual stuff. (and if you never yet went
> down that path, bully for you)
> 
> POINT 5: It has nothing to do with the programmer knowing what a
> variable contains, but the infrastructure knowing it. This discussion
> appears to have gone seriously off track regarding printing. That is
> not, and in my view never has been a problem when the programmer is
> writing code, but it certainly IS when the programmer is relying on
> already written code (from RDS) they CANNOT change - specifically the
> trace window and the ex.err file.
> 
> POINT 6: There *ARE* benefits to having strings, EVEN IF THEY DO NOT
> APPLY TO YOU. (And as pt 3, a downside too).
> 
> POINT 7: Rob won't do this, so please stop rubbing salt in the open
> wound which is: I can't have my beloved string type!!!!!
> 
> OK?
> Pete
> PS If Rob's promised Eu in Eu, which is already promised to be far
> slower than the real deal, could handle strings (and be even slower),
> I still think that would be pretty cool!
> 
>

7. Re: Strings, again

CoJaBo wrote:
> 
> 
> Pete Lomax wrote:
> > 
> > On Sat, 05 Jun 2004 16:00:06 -0700, CoJaBo <guest at RapidEuphoria.com>
> > wrote:
> > 
> > (rudely top-)posted by: CoJaBo <cojabo at suscom.net>
> I do not see why this is considered "rude".
> There are many other posts posted like this,
> and I have never seen any rule against it anywhere.
> In fact, the only rule I could find was
> "Try to stay on the topic of Euphoria or closely-related issues.".

I'd have to agree with CoJaBo here.  When I'm at work and a discussion goes
on via email for a while, I don't want to have to scroll to the bottom every
time a new comment comes up... If I've been following the discussion, then 
I just want to see what the new comment is without having to scroll to find
it.

My general rule here is:  if the previous discussion is long and my comment 
is short, then I'll top-post.  if the previous discussion is short, or I 
decide to snip and comment on one point, then I'm more likely to reply 
under it (little or no scrolling required).

Either way, I wouldn't consider anybody's preference to be "rude".

Just my opinion,
-- Brian

8. Re: Strings, again

On Sun, 06 Jun 2004 20:36:51 -0700, CoJaBo <guest at RapidEuphoria.com>
wrote:

>Pete Lomax wrote:
>> (rudely top-)posted by: CoJaBo <cojabo at suscom.net>
>I do not see why this is considered "rude".
Of course, I meant rude in the gentler sense of mildly impolite; I did
not mean to imply you were being offensive or insulting.
It is a general internet courtesy, which granted is not in Rob's
rulebook, and is not often observed here. If you are replying to a
specific point, it is helpful to indicate which. If you are not, then
you should cut the unnecessary text from the bottom of the post.

I should know, I've done far ruder things than you ;-)
Regards,
Pete

9. Re: Strings, again

Pete Lomax wrote:
> 
> On Sun, 06 Jun 2004 20:36:51 -0700, CoJaBo <guest at RapidEuphoria.com>
> wrote:
> 
> >Pete Lomax wrote:
> >> (rudely top-)posted by: CoJaBo <cojabo at suscom.net>
> >I do not see why this is considered "rude".
> Of course, I meant rude in the gentler sense of mildly impolite; I did
> not mean to imply you were being offensive or insulting.
> It is a general internet courtesy, which granted is not in Rob's
> rulebook, and is not often observed here. If you are replying to a
> specific point, it is helpful to indicate which. If you are not, then
> you should cut the unnecessary text from the bottom of the post.
> 
> I should know, I've done far ruder things than you ;-)
> Regards,
> Pete

I generally regard Pete's opinion to be close to infallible ;-)
(or at least extraordinarily well considered), but I admit to being
puzzled here. I do not see how a given choice of formatting to emphasize
and focus on a specific point of view is even "impolite", much less rude.
In fact, I think a cogent argument could be made that failing to take time
and give thought to such formatting (for emphasis and focus) is
inconsiderate (but probably not rude nor impolite).

Allen

10. Re: Strings, again

On 7 Jun 2004, at 6:28, Allen V Robnett wrote:

> 
> 
> posted by: Allen V Robnett <alrobnett at alumni.princeton.edu>
> 
> Pete Lomax wrote:
> > 
> > On Sun, 06 Jun 2004 20:36:51 -0700, CoJaBo <guest at RapidEuphoria.com>
> > wrote:
> > 
> > >Pete Lomax wrote:
> > >> (rudely top-)posted by: CoJaBo <cojabo at suscom.net>
> > >I do not see why this is considered "rude".
> > Of course, I meant rude in the gentler sense of mildly impolite; I did
> > not mean to imply you were being offensive or insulting.
> > It is a general internet courtesy, which granted is not in Rob's
> > rulebook, and is not often observed here. If you are replying to a
> > specific point, it is helpful to indicate which. If you are not, then
> > you should cut the unnecessary text from the bottom of the post.
> > 
> > I should know, I've done far ruder things than you ;-)
> > Regards,
> > Pete
> 
> I generally regard Pete's opinion to be close to infallible ;-)
> (or at least extraordinarily well considered), but I admit to being
> puzzled here. I do not see how a given choice of formatting to emphasize
> and focus on a specific point of view is even "impolite", much less rude.
> In fact, I think a cogent argument could be made that failing to take time
> and give thought to such formatting (for emphasis and focus) is
> inconsiderate (but probably not rude nor impolite).

Ditto. Erudite.

Kat
