An interesting staff paper from MTS on fast interpreters
- Posted by John Cage <drcage2000 at YAHOO.COM> Jan 24, 2001
I sent the following paper to the language department of S.AI.L in 1998. It is about a fast interpreter for AI scripts. It might be of some use to Rob? Maybe Pete or David?? Anyway, it was translated from Dutch half a year ago, so don't mind the spelling errors and weird-ass commabominations (please don't kill me!).

"
The theory of velocity
-*-*-*-*-*-*-*-*-*-*-*-
By Mike Smith

There are many programming languages out there, ranging from assemblers to 32-bit multi-platform RAD interpreters. But what is the most important aspect of a program? Speed. It doesn't matter whether you are writing a database application or a 3D game engine; it always comes down to displaying information on the screen as quickly and effectively as possible.

Now what *is* the best programming language out there? The language that is effective, easy to use and extremely fast? Believe it or not, it exists, and its name is "EUPHORIA". It is an interpreted language available for 32-bit DOS and 32-bit Windows. Although it is interpreted, it is extremely fast! It runs circles around classic interpreters like QBASIC and JAVA, and it does more run-time checks than any of those languages.

This paper covers the theory of velocity in programming languages, and features a step-by-step guide to creating your very own interpreter!

Passage 1: Compiler or Interpreter?
-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
Interpreter. If you want to create a programming language that is available on a variety of platforms, is extremely fast and takes very little development time, you should create an interpreter rather than a compiler.

What? Isn't an interpreted program slower than a compiled program? Well, yes. To date, we still haven't seen an interpreter that beats compiled and optimized programs, at least not programs compiled with a recent commercial compiler. But that is just because most interpreter architects are not qualified enough to create a fast interpreter.
Rapid Deployment Software, the creators of the Euphoria language, have made a very fast interpreter that beats very slow interpreters, but even they, with all their expertise, have not come close to the speeds that interpreters can achieve.

The key to creating a fast interpreter lies in the way you execute the so-called "internal" code: the code that the interpreter generates from the source program it is interpreting. This internal code is usually referred to as "P-Code", or "Pre-Compiled" code.

Most interpreters prefer to store 1-byte "op-codes" in an array in memory, and then evaluate each byte code with the following algorithm:

    if this byte code equals "this", then do this
    else if this byte code equals "that", then do that
    ..

And so on. Now this is a very, very slow process and results in the slowest program execution known to man.

Other interpreters prefer to store function pointers in memory, and simply call those functions at run-time. Each pointer points to an internal function that the interpreter provides. This method can result in faster program execution, but it will still be very, very slow.

The best way to do it, and the way that can make an interpreter beat a compiler in speed by far, is called (by me) "Run-Time Compilation" or "CPU Feeding".

If you are experienced with ASM (assembly language), you might have heard of Machine Code. Now Machine Code is *not* just another name for ASM; it is an entirely different language. In ASM, you use verbs (mnemonics) to perform low-level operations on memory or "registers". Machine Code, on the other hand, is a collection of bytes, i.e. raw data, that the CPU can process immediately.

A "register" is a kind of variable. A computer system has a fixed number of registers, each with its own ID. You can use a register to store data temporarily. There is one "special" register called the IP register, or Instruction Pointer.
The Instruction Pointer holds the address (in memory) of the next machine instruction to be executed. A compiler turns your source into Machine Code once, and then stores it on disk as a .com or .exe file. The operating system reads the machine instructions in the .com or .exe file into memory, and points the CPU at the next Machine Code instruction to execute.

This is what we should do. We should write an interpreter that translates your source into snippets of Machine Code, stores them in memory and keeps setting the IP register to the next instruction to execute, optimising the code dynamically, one instruction at a time.

Maybe this idea is not that new, and maybe such an interpreter has already been made, but one thing is for sure: it'll be FAST.
"

Note: when I wrote this there were no JIT compilers, as far as I recall. You could therefore say my paper describes what we now know as a JIT compiler, except that my proposed interpreter also optimises the machine code dynamically. This can result in faster code than a static compiler produces.

Note that this paper was a proposal, not a well-thought-out blueprint. I now know that a simple jump to a block of machine code is more efficient than modifying the Instruction Pointer. I proposed the IP modification technique because I hadn't thought the dynamic optimisation through properly.

Mike The Spike
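For readers who want to see the two conventional dispatch schemes from the paper side by side (a chain of op-code comparisons, and stored function pointers that are simply called), here is a minimal sketch in Python. It is not the author's code. The op-code names (PUSH, ADD, MUL, HALT), the tiny stack machine and the function names are all made up for illustration, and Python function references stand in for the function pointers a C interpreter would store.

```python
# Illustrative sketch only: the op-code set and the stack machine below
# are assumptions for this example, not taken from the paper.
PUSH, ADD, MUL, HALT = range(4)


def run_compare(code):
    """Scheme 1: compare each op-code against the known values in turn."""
    stack, pc = [], 0
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH:
            stack.append(code[pc])  # the operand follows the op-code
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown op-code {op}")


# Scheme 2: the internal code stores references to handler functions
# (the Python stand-in for function pointers) instead of op-code numbers.
def do_push(vm):
    vm["stack"].append(vm["code"][vm["pc"]])  # the operand follows the handler
    vm["pc"] += 1


def do_add(vm):
    b, a = vm["stack"].pop(), vm["stack"].pop()
    vm["stack"].append(a + b)


def do_mul(vm):
    b, a = vm["stack"].pop(), vm["stack"].pop()
    vm["stack"].append(a * b)


def do_halt(vm):
    vm["running"] = False


def run_pointers(code):
    """Fetch the stored handler and call it; no comparisons are needed."""
    vm = {"code": code, "pc": 0, "stack": [], "running": True}
    while vm["running"]:
        handler = vm["code"][vm["pc"]]
        vm["pc"] += 1
        handler(vm)
    return vm["stack"].pop()


# The expression (2 + 3) * 4, encoded once for each scheme.
BYTECODE = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
POINTERS = [do_push, 2, do_push, 3, do_add, do_push, 4, do_mul, do_halt]

if __name__ == "__main__":
    print(run_compare(BYTECODE), run_pointers(POINTERS))
```

Both functions evaluate the same small program. The sketch only shows how the two schemes are structured. It says nothing about their relative speed, which depends on the host language and CPU, and it does not attempt the run-time generation of machine code that the paper proposes.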