OpenEuphoria: Forum: Sequence Ops

1. Sequence Ops

Posted by Robert Craig <rds at Ra?idEuphoria.com> Jul 15, 2007
632 views

Here's a bit of background on sequence operations
in Euphoria.

When I started designing the Euphoria language
(18 years ago!), I believed that the basic cycle time 
of the interpreter, i.e. the time required to process 
the simplest IL operation, such as adding two integers, 
would be similar to other interpreters around at that time, 
and Euphoria might even be slower, since Euphoria was going 
to have great flexibility in its data structures, as well
as lots of run-time error checking.

I knew from my experience with APL, that an interpreted
language could gain speed by supporting operations on
large aggregates of data (vectors, matrices, lists etc.), 
thereby reducing the number of statements that need to be interpreted, 
and shifting the workload to a fast routine written in C or 
assembly language. In APL this was very true. Everyone tried
hard to write code that used APL "vectors" 
(somewhat similar to Euphoria sequences, but more restricted), 
rather than APL scalars (atoms), since you could often 
speed up a program by a factor of 10x to 100x that way.

APL also had a richer set of primitive operations on vectors
than Euphoria has. e.g. it had things like rotate-right, 
matrix transpose etc. It also had a feature that let you
apply an operation to a whole vector. e.g. +/x would add
up all the elements in x. It also let you select elements
from a vector by providing another vector of 1's and 0's
which typically was created by relational ops (< > = etc.).
x = y, where x and y are vectors (sequences) was thus 
very useful in APL. Less so, in Euphoria.
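
For readers who have not used APL, the ideas above can be roughly sketched in Euphoria. This is only an illustration: sum_all and select_by_mask are made-up names, not built-in routines and not something from the original post.

<eucode>
-- Hypothetical helper: add up all elements of a flat sequence of atoms
-- (the job APL's +/x does in one primitive).
function sum_all(sequence x)
    atom total
    total = 0
    for i = 1 to length(x) do
        total += x[i]
    end for
    return total
end function

-- Hypothetical helper: keep the elements of x whose mask entry is non-zero.
function select_by_mask(sequence x, sequence mask)
    sequence result
    result = {}
    for i = 1 to length(x) do
        if mask[i] then
            result = append(result, x[i])
        end if
    end for
    return result
end function

sequence x, mask
x = {5, 12, 7, 20}
mask = x > 6                 -- element-wise relational op gives {0,1,1,1}
? mask
? sum_all(x)                 -- 44
? select_by_mask(x, mask)    -- {12,7,20}
</eucode>

The relational op builds the mask in one step, but Euphoria has no primitive that consumes it, so the selection still needs a loop.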

The first working version of the Euphoria interpreter
was about 40 times slower than today's interpreter
at simple integer:integer ops. 

sequence:sequence ops were not much slower than today however, 
since they are handled by run-time routines written in C,
so it was usually much faster to use sequence ops to do something, 
than to write a Euphoria loop, and do the job one atom at a time.

However, I soon became obsessed with speed, and surprised
even myself with how much faster I was able to
make the interpreter, eventually reducing the overhead 
per IL op down to just a few machine instructions.
I always compared Euphoria against C, never benchmarking
against other interpreters until just before v1.0 when
I set up some benchmarks against QBasic, and was very surprised
at how much faster Euphoria was. I later compared against
other popular interpreters and was also surprised. It seemed
like other developers just didn't care much about interpreter speed,
or they added features to their language that were incompatible with
fast execution.

Many years later, the Euphoria to C translator boosted the speed
of simple integer:integer ops even more.

So the situation today is that it's often a bit faster to
write a Euphoria loop, than it is to use sequence ops,
mainly because sequence ops will have some storage
allocation/deallocation overhead.

Euphoria sequence ops are not used as much as 
I originally expected. 

Back in my APL days, I worked (as a summer job in university) 
for a large stock broker's research department. I was constantly 
doing analysis of time series data, such as the closing 
daily stock price of some company going back many years. 
We had various theories to test, computing correlation coefficients 
on the data etc. It was very convenient to use APL vectors
(Euphoria sequences). It eliminated a lot of loops,
and made the code very concise, not to mention faster.

Later in my career I found myself working with a powerful
parallel SIMD (single-instruction stream, multiple data stream) 
computer used in analyzing sonar signals bouncing off submarines. 
It had 8 CPUs working in lock-step, executing the same
instruction on 8 streams of data. It was the fastest
machine in the world at computing FFTs (Fast Fourier Transforms).
I worked on a language for that machine, where, naturally
SIMD (i.e. sequence) operations predominated.

I think the people on this list who are interested
in language design may not have as much use for SIMD operations
as other application writers who work in the areas of business
data processing, statistics, or scientific computing, where 
the algorithms are often fairly simple (i.e. boring), 
but you are applying those algorithms to large amounts of 
real-world data. Hobbyists are not usually interested in 
applications like that, and don't have access to large streams
of data that they care about.

So what should we do?
That's up to you!

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


2. Re: Sequence Ops

Posted by Ricardo Forno <ricardoforno at tutopia.?om> Jul 15, 2007
611 views
Last edited Jul 16, 2007

Hi Rob.
I think it may be possible to speed up sequence operations enough to beat
loops. At least, this is how another APL-derived language (J) works. In J,
sequence / array operations are much faster than their loop counterparts.
Regards.


3. Re: Sequence Ops

Posted by CChris <christian.cuvier at agricultu?e.gouv.fr> Jul 15, 2007
618 views
Last edited Jul 16, 2007

Let me state what I think we should do. Because "sequence operations" are
diverse, I don't think an all-encompassing solution would work, whatever it is.
The contexts where the various operations are used hardly overlap, so that
forcing a uniform meaning across them just doesn't make sense to me.

1/  [arithmetic op_]assignments: lhs=, +=, -=, *=, /= rhs

I have always worked in mathematics and statistics. As a result, I completely
share Rob's views on these operations as extremely useful to keep the source
files readable as well as providing much-needed flexibility.

Note for Pete: I'd probably stop using Euphoria altogether if 1+{1,2,3} ever
started to return anything other than {2,3,4}, even without an error, because
that's one of its current strengths. Using sq_add() and friends just means more
typing for no benefit. That's the single construct I use most when doing
scientific/statistical computing.

However, because CPUs don't work exactly as they used to 18 years ago, a good
thing to do would be to assess whether, when the rhs is an atom, the above
sequence ops are indeed faster than loops. If the benchmarks say no, then we
might have to reconsider their being in the language. If the answer is no when
the rhs is a sequence, I don't think it is much of a problem, because it is a
less typical use.
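
A minimal sketch of such a benchmark, for the atom-rhs case only, might look like this. The sizes N and REPS are arbitrary assumptions, and the timings will depend on the machine, the Euphoria version, and whether the interpreter or the translator is used, so no result is claimed here.

<eucode>
constant N = 100000, REPS = 100
sequence s, r
atom t0

s = repeat(1, N)

-- Sequence op: builds a new N-element result on every pass.
t0 = time()
for k = 1 to REPS do
    r = s + 1
end for
printf(1, "sequence op: %.2f s\n", {time() - t0})

-- Explicit loop: writes into a sequence that already exists.
r = repeat(0, N)
t0 = time()
for k = 1 to REPS do
    for i = 1 to N do
        r[i] = s[i] + 1
    end for
end for
printf(1, "loop:        %.2f s\n", {time() - t0})
</eucode>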

There is a problem, however, with the semantics of "s[p..q] = x". When x is an
atom, this is clear, and assigns a few contiguous elements the same value. But
when x is a sequence, the semantics change to mean "set a subsequence of s to x",
and the lengths had better match. Perhaps this point should be emphasized more
in the docs, and the awkward, error-prone
<eucode>s[p..q]=repeat(x,q-p+1)</eucode>
be mentioned as the way to do the possibly more natural thing: assigning a
sequence value to contiguous elements.

I had suggested s[p..q][]=x a few years ago, see the corresponding thread.
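
To make the two meanings of slice assignment concrete, here is a small sketch of how it behaves in current Euphoria (the values are arbitrary):

<eucode>
sequence s
s = repeat(0, 6)

s[2..4] = 9                   -- atom: every element of the slice becomes 9
? s                           -- {0,9,9,9,0,0}

s[2..4] = {1, 2, 3}           -- sequence: its length must equal the slice length
? s                           -- {0,1,2,3,0,0}

-- To store the same sequence value in each slot, repeat it explicitly.
s[2..4] = repeat({7, 8}, 3)
? s                           -- {0,{7,8},{7,8},{7,8},0,0}
</eucode>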

2/ The &= operator.

This one is a special case, because it doesn't apply a transformation to each
element of a sequence as the above do, but takes the lhs as a whole. I wish the
former could be done too, and it has been requested a couple of times on the list.

Since "seq[p..q] &= x" is always illegal (except when x is "", in which case seq
remains unchanged), why not use this idiom to apply a "&" to each element of a
sequence?
We'd have
<eucode>
sequence s
s={1,2,3}
? s & 0 -- {1,2,3,0}
? s[1..3] & 0 -- proposed result: {{1,0},{2,0},{3,0}} (today "s[1..3] &= 0" errors out because lengths don't match)
</eucode>
Alternatively, as above, s[]&=x could be used too. That syntax would also allow
some useful stuff, like
<eucode>s=atom(x[])</eucode>
returning a sequence of 1s and 0s, more useful than the well-known 0.
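
Until something like that exists, the per-element "&" can be written with a loop. This is only a sketch; amp_each is a made-up name, not a library routine.

<eucode>
-- Hypothetical helper: concatenate x onto every element of s.
function amp_each(sequence s, object x)
    for i = 1 to length(s) do
        s[i] = s[i] & x
    end for
    return s
end function

sequence s
s = {1, 2, 3}
? s & 0              -- {1,2,3,0}: & treats s as a whole
? amp_each(s, 0)     -- {{1,0},{2,0},{3,0}}
</eucode>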

3/ Relational ops:  (lhs =, !=, <,<=,>,>= rhs).

Here comes the highly debated stuff.

Once again, I fully agree with the vector interpretation being by far the most
useful.

However, inside an if or while block header, this interpretation wouldn't make
sense. This is what Pete is griping about, with some reason, because the result
cannot be a vector, contrary to what one would expect in the general case.

I don't see any issue in having the relational operators behave differently
depending on whether they appear in a rhs or not. Hence:
<eucode>
if "Pete" = "Lomax" then
</eucode>
would be optimised out because the conditional clause is known to be the FALSE
constant, while
<eucode>
s = ("Pete" = "Lomax")
</eucode>
won't work because lengths don't match. compare() and equal() can still be used
if you need the boolean in a rhs. Note that appearing in a routine argument is
the same as being a rhs, since the actual value is being used.
So, if you really wish to crash your program, you could still do this:
<eucode>
function id(object x) return x end function
if id("Pete "="Lomax") then -- ...
</eucode>

Here, because the = operator defines part of a rhs, id() will return a sequence,
and this triggers a run time error because an atom is expected.
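
For comparison, this is a small sketch of what the existing operators and routines return today, using equal-length strings so that nothing crashes:

<eucode>
sequence a, b
a = "Pete "                   -- trailing space, so both strings have length 5
b = "Lomax"

? a = b                       -- {0,0,0,0,0}: element-wise comparison
? equal(a, b)                 -- 0
? compare(a, b)               -- 1, because 'P' sorts after 'L'

-- equal() returns an atom, so it is safe in a condition
-- even when the lengths differ.
if equal("Pete", "Lomax") then
    puts(1, "same\n")
else
    puts(1, "different\n")
end if
</eucode>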

That's how I view sequence operations after using them a fair amount over the
years, and experiencing the strengths and weaknesses of the current scheme.
There are improvements to make. But removing sequence operations from the
language would certainly cripple it beyond repair, and Eu doesn't need that.

CChris


4. Re: Sequence Ops

Posted by jxliv7 <jxliv7 at hotm?il.com> Jul 15, 2007
597 views
Last edited Jul 16, 2007

to Rob and all the numerous others debating Euphoria:

i'm wondering if these sequence operations described might be useful in multiple
processor applications. that leads to some interesting parallel processing code
that might make Euphoria ahead of and simpler than what i see out there. i say
"see" because it's been too many years since i have wrapped my mind around
machine code, assembler, and all those details to "do".

i also look at the few 64-bit changes that are at the machine level/compiler
level and again see where Euphoria could be more useful to more programmers.

i'm going to have to adjust my time so i spend less time painting canvases and
get back into coding. Aside from sex, computers were my first love.

i do have a question: instead of changing the core syntax/operation of Euphoria,
wouldn't it be better/easier to put together include/DLL/API apps that do the
particular things you want...? couldn't enhancements (like == or +/x) be
handled at that level, or would they have to be done at some deeper
level...?
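
As a rough illustration of the include-level route, something like APL's +/x can be approximated with an ordinary function built on routine_id() and call_func(). The names fold and add are made up for this sketch, and a real library would put fold in its own include file.

<eucode>
-- Hypothetical helper: combine the elements of x from left to right,
-- using the two-argument function whose routine id is rid.
function fold(integer rid, sequence x, object init)
    for i = 1 to length(x) do
        init = call_func(rid, {init, x[i]})
    end for
    return init
end function

function add(object a, object b)
    return a + b
end function

? fold(routine_id("add"), {1, 2, 3, 4}, 0)    -- 10
</eucode>

New operator syntax such as == or +/ cannot be added this way, because the parser would have to recognise it. Only the behaviour can live in a library.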

regards, etc.


--
jon


5. Re: Sequence Ops

Posted by George Walters <gwalters at s?.rr.com> Jul 16, 2007
612 views

Geez, I never knew anyone who programmed in APL. I used to do that on a big
IBM many years ago!!


6. Re: Sequence Ops

Posted by Ricardo Forno <ricardoforno at tutopia.co?> Jul 16, 2007
567 views

George Walters wrote:
> 
> Geez, I never knew anyone who programmed in APL. I used to do that on a big
> IBM many years ago!!

From about 1965 up to the mid-80s, many of my programs were
written in APL.
Regards.


7. Re: Sequence Ops

Posted by Al Getz <Xaxo at ao?.?om> Jul 16, 2007
617 views

Hello,

I think i agree that 1+{1,2,3} should stay the same too, as changing
it now would break code, but i'm not really sure just how much i
actually used this in the past.  I think a few times but that's it.
The reason is that i would seldom have the need to add two things
whose types were not the same.  Usually, when i add two things it's
because they originated from two other sources, and both sources
would be basically the same.
Still, it is very handy when you want to act on every element in
the sequence.


Take care,
Al

E boa sorte com sua programacao Euphoria!


My bumper sticker: "I brake for LED's"

 From "Black Knight":
"I can live with losing the good fight,
 but i can not live without fighting it".
"Well on second thought, maybe not."
