An interesting staff paper from MTS on fast interpreters


I sent the following paper to the language department of S.AI.L back in 1998; it's about a fast interpreter for AI scripts.
It might be of some use to Rob? Maybe Pete or David??

Anyways, this was translated from Dutch half a year
ago, so don't mind the spelling errors and weird-ass
commabominations (please don't kill me!).

"
The theory of velocity
-*-*-*-*-*-*-*-*-*-*-*-


By Mike Smith




There are many programming languages out there, ranging from assemblers to 32-bit multi-platform RAD interpreters. But what is the most important aspect of a program? Speed. It doesn't matter if you are writing a database application or a 3D game engine, it always comes down to displaying information on the screen as fast and effectively as possible.

Now what *is* the best programming language out there? The language that is effective, easy to use, and extremely fast? Believe it or not, it exists, and its name is "EUPHORIA". It is an interpreted language available for 32-bit DOS and 32-bit Windows. Although it is interpreted, it's extremely fast! It runs circles around classic interpreters like QBASIC and Java, and it does more run-time checks than either of those languages.

This paper covers the theory of velocity in programming languages, and features a step-by-step guide to creating your very own interpreter!

Passage 1: Compiler or Interpreter?
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-

Interpreter.
If you want to create a programming language that is available on a variety of platforms, is extremely fast, and takes very little development time, you should create an Interpreter rather than a Compiler.
What? Isn't an interpreted program slower than a compiled program? Well, yes. To date, we still haven't seen an Interpreter that beats compiled and optimized programs, that is to say, programs compiled with a recent commercial compiler. But that is just because most Interpreter architects are not qualified enough to create a fast interpreter.
Rapid Deployment Software, the creators of the Euphoria language, have made a very fast Interpreter that beats the very slow ones, but even they, with their expertise, have not come close to the speeds that can be achieved by compiled programs.
The key to creating a fast Interpreter lies in the way you execute the so-called "internal" code, the code that the Interpreter generates from the source program that you are interpreting. This internal code is most of the time referred to as "P-Code", or "Pre-Compiled" code.
Most Interpreters prefer to store 1-byte "Op-Codes" in an array in memory, and then evaluate each byte code with the following algorithm:

if this byte code equals "X", then
   do X
else if this byte code equals "Y", then
   do Y
..

And so on.
Now this is a very, very slow process and results in the slowest program execution known to man.
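To make that concrete, here is a minimal sketch in C of such a byte-code dispatch loop. The op-codes (OP_PUSH, OP_ADD, OP_PRINT, OP_HALT) and the little stack machine are made up for illustration, not taken from any real Interpreter:

#include <stdio.h>

enum { OP_PUSH, OP_ADD, OP_PRINT, OP_HALT };

void run(const unsigned char *code)
{
    int stack[256], sp = 0, pc = 0;
    for (;;) {
        unsigned char op = code[pc++];
        if (op == OP_PUSH) {            /* operand byte follows the op-code */
            stack[sp++] = code[pc++];
        } else if (op == OP_ADD) {      /* pop two values, push their sum */
            sp--;
            stack[sp - 1] += stack[sp];
        } else if (op == OP_PRINT) {
            printf("%d\n", stack[--sp]);
        } else {                        /* OP_HALT */
            return;
        }
    }
}

int main(void)
{
    /* computes 2 + 3 and prints the result */
    const unsigned char program[] =
        { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PRINT, OP_HALT };
    run(program);
    return 0;
}

Every single operation pays for the whole chain of comparisons before anything useful happens, which is exactly why this approach is so slow.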
Other Interpreters prefer to store function pointers in memory, and simply call those functions at run-time. Each pointer points to an internal function that the Interpreter provides. This method can result in faster program execution, but it still will be very, very slow.
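A sketch of that variant, again in C with made-up names: the translator fills an array with pointers to the Interpreter's built-in functions, and the run loop just calls them one after another.

#include <stdio.h>

static int stack[256], sp;

static void push2(void) { stack[sp++] = 2; }
static void push3(void) { stack[sp++] = 3; }
static void add(void)   { sp--; stack[sp - 1] += stack[sp]; }
static void print(void) { printf("%d\n", stack[--sp]); }

int main(void)
{
    /* the "internal code": one pointer per source operation */
    void (*code[])(void) = { push2, push3, add, print, NULL };

    /* the run loop: indirect calls until the NULL sentinel */
    for (int pc = 0; code[pc] != NULL; pc++)
        code[pc]();    /* prints 5 */
    return 0;
}

The chain of comparisons is gone, but every operation still costs an indirect call into the Interpreter.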

The best way to do it, and the way that can make an Interpreter beat a compiler in speed by far, is called (by me) "Run-Time Compilation" or "CPU Feeding".
If you are experienced with ASM, or assembly language, you might have heard about Machine Code. Now Machine Code is *not* just another name for ASM, it is an entirely different language. In ASM, you use verbs to do a low-level operation on memory or "registers". Machine Code, on the other hand, is a collection of bytes, i.e. raw data, that the CPU can process immediately.
A "register" is some sort of variable, there are a
fixed
number of registers in a computer system, each
register having
an ID. You can use a register to store data
temporarily.
There is one "special" register called the IP
register,
or Instruction Pointer.
What the Instruction Pointer does is hold the address
(in memory)
of the next Machine Instruction that should be
executed.
A compiler turns your source to Machine Code once, and
then
stores it on disk as a .com or .exe file.
The operating system reads the machine instructions
in the .com or .exe file into memory,and points the
CPU
to the next Machine Code to execute.
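For example, here are the ASM verb and the Machine Code bytes for one and the same instruction (32-bit x86; the encoding is standard, but the snippet itself is just an illustration):

/* in ASM you would write the verb:  mov eax, 42 */
static const unsigned char mov_eax_42[5] = {
    0xB8,                       /* op-code: "load register EAX with the next 4 bytes" */
    0x2A, 0x00, 0x00, 0x00     /* the value 42, stored low byte first */
};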

This is what we should do. We should write an Interpreter that translates your source into snippets of Machine Code, stores them in memory, and keeps setting the IP register to the next instruction to execute, optimising the code one instruction at a time, dynamically.
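A minimal sketch of the idea in C, assuming x86-64 Linux (the byte sequence encodes "mov eax, 42" followed by "ret", and calling the buffer is what loads the Instruction Pointer with the snippet's address):

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    /* Machine Code for: mov eax, 42 ; ret */
    unsigned char snippet[] = {
        0xB8, 0x2A, 0x00, 0x00, 0x00,  /* mov eax, 42 */
        0xC3                           /* ret */
    };

    /* ask the operating system for a page we may write to and execute */
    void *mem = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return 1;

    memcpy(mem, snippet, sizeof snippet);

    /* calling the buffer loads the Instruction Pointer with its address */
    int (*fn)(void) = (int (*)(void))mem;
    printf("%d\n", fn());   /* prints 42 */

    munmap(mem, 4096);
    return 0;
}

Here the CPU processes the generated snippet directly; there is no dispatch loop left at all.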
Maybe this idea is not that new, and maybe such an Interpreter has already been made, but one thing is for sure: it'll be FAST.
"

Note: when I wrote this there were, as I recall, no JIT compilers. Therefore, you could say my paper describes what we now know as a JIT compiler, with the exception that my proposed interpreter dynamically optimises machine code, which can result in faster code than a static compiler produces.

Note that this paper was a proposal, not a well-thought-out blueprint. I now know that a simple jump to a block of machine code is more efficient than modifying the Instruction Pointer. I proposed the IP-modification technique because I didn't think the dynamic optimisation through hard enough.


Mike The Spike

