OpenEuphoria: Forum: Big String Idea for v2.6

1. Big String Idea for v2.6

Posted by Al Getz <Xaxo at aol.com> Jul 18, 2005
467 views

Hello all,

This could be very, very useful if it was implemented...

Take for example, a C function call:

String="ThisName"
pString=allocate_string(String)
retv=c_func(xSetName,{pString})
free(pString)

Granted, we could have saved one line here by not declaring the string
sequence 'String', but calls like this often take a string
that originates outside the block of code that actually calls
the C function anyway, so this isn't a bad example.

If we take a close look at this, we can see we had to do quite a
bit of work just to pass the string to c_func, didn't we?
We had to allocate a string, then free it.  This is not only
additional work, it also takes more processor time.
All one would need is a simple function that gets the pointer
to the sequence element, and the above code would reduce to something
like this:

String="ThisName"
retv=c_func(xSetName,{pointer(String)})

pointer() of course returns the pointer to the first element
of String, and String would have to be a one-dimensional sequence.
A variant would look like this:

retv=c_func(xSetName,{pointer(String[1])})

indicating the pointer should be the pointer to the first element
of String.

Of course sub-elements would still be indicated by indexes like:

pointer(MyStrings[3][1])


Now I know this brings up a few problems, like the fact that
sequence elements are stored as four-byte C integers rather than
single bytes (as in a C string), so here is a way around
that...

  pointer4a(String[1])
would take every 4th byte (the low byte of each element) and poke
it into memory to create a C string.  Upon return from the function
call, the memory would be freed.  This wouldn't take any additional
memory even if the call didn't return right away...because right now
pokes do the same thing.
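
The byte-by-byte copy described above is roughly what allocate_string() in machine.e is understood to do already. A minimal sketch of that copy, assuming Euphoria 2.x and using a made-up name (make_c_string is not a library routine):

```euphoria
include machine.e   -- allocate(), free()

-- Hypothetical helper: copy a flat character sequence into newly
-- allocated RAM as a 0-terminated C string. Returns 0 if the
-- allocation fails. The caller must free() the result.
function make_c_string(sequence s)
    atom mem
    mem = allocate(length(s) + 1)   -- one byte per character, plus terminator
    if mem then
        poke(mem, s)                -- poke stores the low 8 bits of each element
        poke(mem + length(s), 0)    -- trailing 0 byte ends the C string
    end if
    return mem
end function
```

The point of the sketch is that the C side always sees a separate one-byte-per-character copy, never the sequence itself.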

As a side effect, strings could be indexed to suit:

String="C:\\DirA\\Filename.txt"
c_proc(xSetFilename,{pointer(String[9])})
c_proc(xSetDirname,{pointer(String[1..8])})
--done
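
For comparison, the same split can be written today with ordinary slices. This is only a sketch: xSetFilename and xSetDirname are the routine ids assumed in the example above, so the calls are left commented out. Note that a slice such as String[1..8] needs its own copy in any case, because C expects a 0 byte right after the last character.

```euphoria
include machine.e   -- allocate_string(), free()

sequence String
atom pFile, pDir

String = "C:\\DirA\\Filename.txt"
pFile = allocate_string(String[9..length(String)])  -- file name part
pDir  = allocate_string(String[1..8])               -- directory part

-- c_proc(xSetFilename, {pFile})   -- routine ids assumed to have been
-- c_proc(xSetDirname, {pDir})     -- set up with define_c_proc()

free(pFile)
free(pDir)
```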


BTW, in the preceding, the name 'pointer' was arbitrary...
even 'allocate_string' would work as long as the function call
took note that the memory had to be freed upon completion.  This
wouldn't even add any additional keywords to the language.
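
Something close to this can already be approximated with an ordinary wrapper routine. A minimal sketch, assuming a C function that takes a single char* argument and does not keep the pointer after it returns; c_func_str is a hypothetical name, not an existing routine:

```euphoria
include machine.e   -- allocate_string(), free()

-- Hypothetical helper: call a C function taking one char* argument,
-- creating and releasing the temporary C string around the call.
-- rid is a routine id previously obtained from define_c_func().
function c_func_str(integer rid, sequence s)
    atom p
    object retv
    p = allocate_string(s)      -- copy s into RAM as a 0-terminated string
    retv = c_func(rid, {p})
    free(p)                     -- release the copy once the C routine returns
    return retv
end function

-- Usage, assuming xSetName was defined elsewhere with define_c_func():
--     retv = c_func_str(xSetName, "ThisName")
```

This saves the two extra lines at each call site, though not the processor time for the copy.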


Take care,
Al

And, good luck with your Euphoria programming!

My bumper sticker: "I brake for LED's"


2. Re: Big String Idea for v2.6

Posted by Jason Gade <jaygade at yahoo.com> Jul 18, 2005
459 views

Hey, Al.

When we first started discussing this Standard Euphoria Library project, this is
a problem that I wanted to address.  While I like the almost-typeless design of
Euphoria there are times when types are needed, especially when interfacing with
C code or OS routines.

I would like to see routines that can make structures, records, packed strings,
Unicode strings, fixed-type arrays and lists, etc.  Routines to check the
values/formats and make sure they are correct, and routines to allocate memory
and return a pointer to the user.

Fixed arrays and lists could probably use the built-in type system but for
structures and records and other types I would make explicit type-validation
routines.
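
As a rough illustration of how far the built-in type system goes, here is a sketch of a user-defined type for sequences that are safe to pass as C strings. The name c_string and the exact rule (flat sequence, every element an integer from 1 to 255) are assumptions made for this example only.

```euphoria
-- Hypothetical type: a flat sequence of non-zero, byte-sized integers,
-- i.e. something that can be poked into memory as a C string.
type c_string(object x)
    if not sequence(x) then
        return 0
    end if
    for i = 1 to length(x) do
        if not integer(x[i]) then
            return 0            -- nested sequences and fractions fail here
        end if
        if x[i] < 1 or x[i] > 255 then
            return 0            -- 0 would end the C string early
        end if
    end for
    return 1
end type

c_string name
name = "ThisName"       -- passes the check
-- name = {1, 2.5, 3}   -- would fail the type check at run time
```

Structures and records would need more than this, which is where explicit validation routines come in.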

These new types would be as transparent to the user as possible.  That is,
hopefully the object returned would be usable as a normal Euphoria sequence with
expected results.

=====================================
Too many freaks, not enough circuses.

j.


3. Re: Big String Idea for v2.6

Posted by Robert Craig <rds at RapidEuphoria.com> Jul 18, 2005
471 views
Last edited Jul 19, 2005

Al Getz wrote:
> All one would need is a simple function that gets the pointer
> to the sequence element and the above code would reduce to something
> like this:
> 
> String="ThisName"
> retv=c_func(xSetName,{pointer(String)})

As I've said before, I don't intend to ever give you a 
function that will tell you the address of a Euphoria 
variable in memory. Of course you'd also need to know 
the exact bit and byte layout of the variable, and this
would have to be defined in the manual.

I firmly believe this would be a very bad thing to do:

* I wouldn't be able to move a variable to a new location
  in memory (e.g. garbage collection).

* I might not be able to store two variable values in the same
  place in memory.

* I wouldn't be able to change the internal representation
  of Euphoria values in memory.

* Ugly corruption bugs and hard-to-read code would result.

You could assume that all elements of a string sequence are 4-byte
values in Intel byte order, but this is not necessarily true 
today (some could be floating-point values), and could change drastically 
in the future, e.g. on a non-Intel machine. Suppose I (or some other 
implementer of Euphoria) wanted to use one-byte or one-bit per element 
in the future in some cases?

There's just no way that I want to expose this information
to your program. Euphoria is a language of values, not bits,
bytes and storage locations. You can peek and poke, but only
with your own blocks of memory, not with Euphoria variables.
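
A small sketch of that distinction, assuming Euphoria 2.x with machine.e: the program owns the block returned by allocate() and may poke and peek it freely, while the sequence that comes back from peek is an ordinary Euphoria value with no visible address.

```euphoria
include machine.e   -- allocate(), free()

atom buf
sequence bytes

buf = allocate(4)           -- a block of RAM this program owns
poke(buf, "abc" & 0)        -- write the bytes 'a', 'b', 'c', 0 into it
bytes = peek({buf, 3})      -- read three bytes back into a new sequence
free(buf)                   -- the block is gone; bytes is unaffected
```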

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


4. Re: Big String Idea for v2.6

Posted by Derek Parnell <ddparnell at bigpond.com> Jul 19, 2005
467 views

I don't do this often, but when it's due, it's due: I fully support Robert's stance
here. Euphoria is *not* and never will be a low-level language.

I know that we will need to interface to non-Euphoria programs and libraries
from time to time, and the current Euphoria already gives us the tools to do
this. At worst, you could say that the tools are not as 'easy' as the rest of
Euphoria to use, so there might be a case for some more syntax sugar to be
sprinkled.

For example, a number of people have developed libraries that assist in
describing RAM layouts (structs) and moving data to and from those. Because of
the flexibility of Euphoria, each library created for this purpose has a
different 'syntax' and usage. So, since this is a common activity for many
programmers, it would be nice to have some built-in (i.e. RDS-supplied) method of
doing this, so that some standardization could be put in place. That could
either be a library or built-in keywords.

Simple things like ASCII strings formatted in the C style don't really need much
more assistance. However, the general case of RAM allocation and deallocation
could be assisted a bit. For example, if RAM is allocated in a routine, Euphoria
could be told to automatically deallocate it when the routine exits. This is
a simple safety measure that would help many coders.

e.g.

  procedure SetCursor(integer x, integer y)
      atom pointA
      atom pointB
      pointA = auto_allocate(SIZEOF_point)
      . . . 
      pointB = auto_allocate(SIZEOF_point)
      . . .
      -- At this point, all RAM obtained via auto_allocate() in this
      -- routine would be deallocated.
  end procedure

But certainly, we should never need to know where in RAM Euphoria is managing
its own data items.
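
Until something like that is built in, a library can only approximate it. A minimal sketch, assuming Euphoria 2.x: auto_allocate() here is a hypothetical library routine that records each block, and auto_release() (also hypothetical) has to be called by hand before the routine returns. It uses a single list, so it does not handle nested routines that each allocate.

```euphoria
include machine.e   -- allocate(), free()

sequence auto_blocks    -- addresses handed out since the last release
auto_blocks = {}

-- Hypothetical library version of auto_allocate(): remember each block.
function auto_allocate(integer size)
    atom p
    p = allocate(size)
    if p then
        auto_blocks = append(auto_blocks, p)
    end if
    return p
end function

-- Free everything recorded so far. Without interpreter support this
-- must be called explicitly at the end of the routine.
procedure auto_release()
    for i = 1 to length(auto_blocks) do
        free(auto_blocks[i])
    end for
    auto_blocks = {}
end procedure
```

The missing piece is exactly the automatic call on routine exit, which only the interpreter can provide.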

-- 
Derek Parnell
Melbourne, Australia
irc://irc.sorcery.net:9000/euphoria


5. Re: Big String Idea for v2.6

Posted by Al Getz <Xaxo at aol.com> Jul 19, 2005
439 views

Hi Rob and Derek,

Rob:
I see what you mean now...and I see you're going for the most timeless
solution possible.  That makes a lot of sense.  Thanks much for your
rather detailed explanation, and I see others have already taken note too.

Derek:
Yes :)  As you can see from the above, I can't help but agree with Rob, and
also with your reply, especially with...
  "Simple things like ASCII strings formatted in the C style don't really
   need much more assistance."
Thanks for your added comments.


Take care,
Al


And, good luck with your Euphoria programming!

My bumper sticker: "I brake for LED's"
