Sequence Ops


Here's a bit of background on sequence operations
in Euphoria.

When I started designing the Euphoria language
(18 years ago!), I believed that the basic cycle time 
of the interpreter, i.e. the time required to process 
the simplest IL operation, such as adding two integers, 
would be similar to other interpreters around at that time, 
and Euphoria might even be slower, since Euphoria was going 
to have great flexibility in its data structures, as well
as lots of run-time error checking.

I knew from my experience with APL that an interpreted
language could gain speed by supporting operations on
large aggregates of data (vectors, matrices, lists etc.), 
thereby reducing the number of statements that need to be interpreted, 
and shifting the workload to a fast routine written in C or 
assembly language. In APL this was very true. Everyone tried
hard to write code that used APL "vectors" 
(somewhat similar to Euphoria sequences, but more restricted), 
rather than APL scalars (atoms), since you could often 
speed up a program by a factor of 10 to 100 that way.

APL also had a richer set of primitive operations on vectors
than Euphoria has, e.g. it had things like rotate-right, 
matrix transpose etc. It also had a feature that let you
apply an operation across a whole vector, e.g. +/x would add
up all the elements in x. It also let you select elements
from a vector by providing another vector of 1's and 0's
which typically was created by relational ops (< > = etc.).
x = y, where x and y are vectors (sequences) was thus 
very useful in APL. Less so in Euphoria.
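
For readers who have not used APL, here is a rough sketch of the
two features just described: a reduction like +/x, and selection
with a vector of 1's and 0's produced by a relational op. It is
written in Python purely as a stand-in, not in APL or Euphoria,
and the sample values in x and y are made up for illustration.

```python
from functools import reduce
from itertools import compress
from operator import add

# Hypothetical sample vectors, chosen only for illustration.
x = [3, 8, 1, 9, 4]
y = [3, 5, 1, 2, 4]

# Reduction: apply "+" across the whole vector, in the spirit of APL's +/x.
total = reduce(add, x)

# Element-wise relational op: one 1 or 0 per position, like x = y on vectors.
mask = [int(a == b) for a, b in zip(x, y)]

# Selection: keep only the elements of x whose mask entry is 1.
selected = list(compress(x, mask))
```

The mask is only worth building when the language gives you a
cheap way to use it for selection, which is the point made above
about x = y being more useful in APL.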

The first working version of the Euphoria interpreter
was about 40 times slower than today's interpreter
at simple integer:integer ops. 

Sequence:sequence ops, however, were not much slower than today,
since they are handled by run-time routines written in C,
so it was usually much faster to use sequence ops to do something
than to write a Euphoria loop and do the job one atom at a time.
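
As a rough model of the two styles being compared, the sketch
below shows a single routine that applies a binary op to whole
sequences (element by element, recursing into nested sequences,
with an atom applied to every element) next to a loop that does
the job one atom at a time. It is Python standing in for the idea
only: seq_op and add_loop are hypothetical names, and this is not
the interpreter's actual C routine.

```python
from operator import add, mul

def seq_op(op, x, y):
    """Apply a binary op to atoms or (possibly nested) lists, element-wise."""
    x_is_seq = isinstance(x, list)
    y_is_seq = isinstance(y, list)
    if not x_is_seq and not y_is_seq:
        return op(x, y)                      # atom : atom
    if x_is_seq and y_is_seq:
        if len(x) != len(y):
            raise ValueError("sequence lengths are not the same")
        return [seq_op(op, a, b) for a, b in zip(x, y)]
    if x_is_seq:
        return [seq_op(op, a, y) for a in x]  # sequence : atom
    return [seq_op(op, x, b) for b in y]      # atom : sequence

def add_loop(x, y):
    """The same addition done one atom at a time in an explicit loop."""
    result = [0] * len(x)
    for i in range(len(x)):
        result[i] = x[i] + y[i]
    return result

whole = seq_op(add, [1, 2, 3], [10, 20, 30])
scaled = seq_op(mul, [1, [2, 3]], 2)
looped = add_loop([1, 2, 3], [10, 20, 30])
```

Both styles produce the same result. The difference is in how
many statements the interpreter itself has to execute.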

However, I soon became obsessed with speed, and surprised
even myself with how much faster I was able to
make the interpreter, eventually reducing the overhead 
per IL op down to just a few machine instructions.
I always compared Euphoria against C, never benchmarking
against other interpreters until just before v1.0 when
I set up some benchmarks against QBasic, and was very surprised
at how much faster Euphoria was. I later compared against
other popular interpreters and was also surprised. It seemed
like other developers just didn't care much about interpreter speed,
or they added features to their language that were incompatible with
fast execution.

Many years later, the Euphoria to C translator boosted the speed
of simple integer:integer ops even more.

So the situation today is that it's often a bit faster to
write a Euphoria loop than it is to use sequence ops,
mainly because sequence ops will have some storage
allocation/deallocation overhead.
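
To illustrate where that storage overhead comes from, here is a
small sketch of a dot product written both ways. The whole-sequence
version has to build a temporary holding every product before it
can add them up, while the loop keeps only a running total. This
is Python used as an analogy, with hypothetical function names,
and it makes no claim about actual timings in either language.

```python
def dot_sequence_style(a, b):
    """Multiply whole sequences first, then reduce the result."""
    products = [x * y for x, y in zip(a, b)]  # temporary allocated here
    return sum(products)                      # temporary is discarded after this

def dot_loop_style(a, b):
    """Accumulate one atom at a time; no intermediate sequence is created."""
    total = 0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total

seq_result = dot_sequence_style([1, 2, 3], [4, 5, 6])
loop_result = dot_loop_style([1, 2, 3], [4, 5, 6])
```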

Euphoria sequence ops are not used as much as 
I originally expected. 

Back in my APL days, I worked (as a summer job in university) 
for a large stock broker's research department. I was constantly 
doing analysis of time series data, such as the closing 
daily stock price of some company going back many years. 
We had various theories to test, computing correlation coefficients 
on the data etc. It was very convenient to use APL vectors
(Euphoria sequences). It eliminated a lot of loops,
and made the code very concise, not to mention faster.

Later in my career I found myself working with a powerful
parallel SIMD (single instruction stream, multiple data streams) 
computer used in analyzing sonar signals bouncing off submarines. 
It had 8 CPUs working in lock-step, executing the same
instruction on 8 streams of data. It was the fastest
machine in the world at computing FFTs (Fast Fourier Transforms).
I worked on a language for that machine, where, naturally,
SIMD (i.e. sequence) operations predominated.

I think the people on this list who are interested
in language design may not have as much use for SIMD operations
as other application writers who work in the areas of business
data processing, statistics, or scientific computing, where 
the algorithms are often fairly simple (i.e. boring), 
but you are applying those algorithms to large amounts of 
real-world data. Hobbyists are not usually interested in 
applications like that, and don't have access to large streams
of data that they care about.

So what should we do?
That's up to you! :-)

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com
