1. Precompiled include files

As I'm working with the source, building and running tests, and waiting for the same files to be translated over and over again, a thought occurred to me:

Would it be a useful feature if Euphoria included precompiled .e files from include/std in object format? The files could either be built with the language itself or else they could be cached after other builds (including building the language itself). The standard library files don't change much at all either during Euphoria development or during user program development. It might save a bit of time.

Just a thought for a possible future feature.


2. Re: Precompiled include files

jaygade said...

As I'm working with the source, building and running tests, and waiting for the same files to be translated over and over again, a thought occurred to me:

Would it be a useful feature if Euphoria included precompiled .e files from include/std in object format? The files could either be built with the language itself or else they could be cached after other builds (including building the language itself). The standard library files don't change much at all either during Euphoria development or during user program development. It might save a bit of time.

Just a thought for a possible future feature.

Yes, I've thought about this some. See RoadmapBeyond:

wiki said...
  • Integration of compiled libraries with interpreted code
  • Compile a euphoria include as a dll.
    • Translator adds a function that returns data about the public / exported routines, which the interpreter adds to the symbol table as special euphoria routines
    • User can call compiled routines like normal euphoria routines, and the interpreter handles the call transparently
    • Possibly use ifdefs to determine whether regular interpreted or compiled library to be used (if available)
    • Add a lib/ dir in the euphoria directory structure as a place for the compiled libs to live, or possibly specify other locations through config file
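A minimal sketch of the "function that returns data about the public / exported routines" idea, in C since that is what the translator emits. Everything here is an assumption for illustration: the struct layout, the names eu_export_table and eu_find_export, and the table entries are hypothetical and are not part of any existing Euphoria API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record describing one public/export routine of a compiled include. */
struct eu_export {
    const char *name;        /* Euphoria-level routine name */
    int         nargs;       /* number of parameters (illustrative values below) */
    int         is_function; /* 1 = function, 0 = procedure */
};

/* What the translator might emit for a compiled std/convert.e (illustrative). */
static const struct eu_export convert_exports[] = {
    { "to_number", 2, 1 },
    { "to_string", 3, 1 },
};

/* Entry point the interpreter would call right after loading the library. */
const struct eu_export *eu_export_table(size_t *count)
{
    *count = sizeof convert_exports / sizeof convert_exports[0];
    return convert_exports;
}

/* Interpreter side: look a routine up by name; NULL means "not exported". */
const struct eu_export *eu_find_export(const char *name)
{
    size_t n, i;
    const struct eu_export *table = eu_export_table(&n);

    for (i = 0; i < n; i++) {
        if (strcmp(table[i].name, name) == 0)
            return &table[i];
    }
    return NULL;
}
```

The interpreter would walk this table once at include time and add each entry to its symbol table, so a later call to to_number() resolves like any other routine.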

Matt


3. Re: Precompiled include files

mattlewis said...
jaygade said...

As I'm working with the source, building and running tests, and waiting for the same files to be translated over and over again, a thought occurred to me:

Would it be a useful feature if Euphoria included precompiled .e files from include/std in object format? The files could either be built with the language itself or else they could be cached after other builds (including building the language itself). The standard library files don't change much at all either during Euphoria development or during user program development. It might save a bit of time.

Just a thought for a possible future feature.

Yes, I've thought about this some. See RoadmapBeyond:

wiki said...
  • Integration of compiled libraries with interpreted code
  • Compile a euphoria include as a dll.
    • Translator adds a function that returns data about the public / exported routines, which the interpreter adds to the symbol table as special euphoria routines
    • User can call compiled routines like normal euphoria routines, and the interpreter handles the call transparently
    • Possibly use ifdefs to determine whether regular interpreted or compiled library to be used (if available)
    • Add a lib/ dir in the euphoria directory structure as a place for the compiled libs to live, or possibly specify other locations through config file

Matt

. . . . because as a dll we can unload it and reload it during runtime as much as we like to change it . . . .

useless


4. Re: Precompiled include files

useless_ said...

. . . . because as a dll we can unload it and reload it during runtime as much as we like to change it . . . .

It depends on the implementation, but probably not. My idea makes it all work seamlessly from the user's perspective, which means that it's integrated with how the symbol table is built and how code is parsed.

Matt


5. Re: Precompiled include files

mattlewis said...

It depends on the implementation, but probably not. My idea makes it all work seamlessly from the user's perspective, which means that it's integrated with how the symbol table is built and how code is parsed.

Matt

Isn't that the same thing as Rob's original shrouded code?
Wasn't that originally rejected by the developers for some reason when it was requested?


6. Re: Precompiled include files

mattlewis said...

It depends on the implementation, but probably not. My idea makes it all work seamlessly from the user's perspective, which means that it's integrated with how the symbol table is built and how code is parsed.

Matt

BRyan said...

Isn't that the same thing as Rob's original shrouded code?

No. The shrouded and scrambled code was just a form of obfuscation. The interpreter (rather, the parser) would essentially deobfuscate it back into something like its original form before executing it. Rather than being more efficient like precompiled code, it instead added a step to the execution process.

BRyan said...

Wasn't that originally rejected by the developers for some reason when it was requested?

Shrouded code uses the same bytes that are used in other file encodings that go beyond ASCII (e.g. UTF-8). Getting the old shrouded code to work now that we're moving to support Unicode is not a trivial task.


7. Re: Precompiled include files

BRyan said...
mattlewis said...

It depends on the implementation, but probably not. My idea makes it all work seamlessly from the user's perspective, which means that it's integrated with how the symbol table is built and how code is parsed.

Isn't that the same thing as Rob's original shrouded code? Wasn't that originally rejected by the developers for some reason when it was requested?

No, it's actually using a compiled library as though it was included as a normal source file. I imagine that it would probably speed up parsing time for the application and would definitely speed up run time (which shrouded code would not do).

I don't believe that Rob ever open sourced his shrouding code. I think this was due to people possibly relying on released code staying encrypted. I'm not positive about this, though, and I remember him talking a bit about how stuff was encrypted at one point. Either way, the code was never released, and I don't think there was enough interest in reviving it.

Also, changes that started around 2.5 made shrouding more difficult from an implementation standpoint. I believe that was when the front end was rewritten in euphoria and no execution was done until everything was parsed.

Matt


8. Re: Precompiled include files

My main point was more for translating than for other uses. Files like std/convert.e could be compiled into convert.o and then linked to the rest of the .o files when doing a translate.

Currently, when doing "make" on the source, the include/std files get compiled, linked, and then discarded several times in the process.


9. Re: Precompiled include files

Trying to understand how things work now:

  1. I write a program, say test.ex
  2. eushroud test.ex produces test.il
  3. eub test.il executes the "shrouded" file

The advantage of an .il file is that no parsing of source-code is needed and unused code is excluded; the result is faster startup compared to using eui test.ex .

The "current shrouding" is not the same as "historical shrouding." For example, a code number was used to make extra lines workable in the "old free" Euphoria interpreter.

It looks like Python lets you "compile" .py scripts into .pyc (which may be analogous to Euphoria .il code). (Python does not provide a real compiler like euc.) Python .pyc files skip parsing but do not improve program speed.

How practical would it be for eui to read .il files (say from a library) with the idea that some parsing time is saved?


10. Re: Precompiled include files

_tom said...

How practical would it be for eui to read .il files (say from a library) with the idea that some parsing time is saved?

Not very. The .il files are dumps of the symbol tree (which includes all of the code). So all of the references in the .il file only work for that particular file. You can't combine them.

There has been talk in the past about coming up with a more flexible way that would be similar to compiling a file to IL, then linking various IL files together. I think that would take quite a bit of work, and I'm not sure how much of a benefit it would be.

Matt


11. Re: Precompiled include files

For me it wouldn't be .il files; it would be .dll files. The Euphoria functions would be compiled C code, and the DLLs would be loaded when a Euphoria include is included.

-- really loads compiled code from a DLL, say std/convert.dll; the .e file is not used if it is older than the DLL 
include std/convert.e 

As the other devs know, symbol names get mangled according to Symtab values each time they are parsed and they are not the same on subsequent translations. We would want a way that always produces the same name and a way to load the symbols from the DLL while parsing.
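One way to get a name that is always the same is to derive it only from the file path and the routine name instead of from a Symtab index. The sketch below is an assumption of mine, not how the translator works today: stable_mangle and the FNV-1a hash are illustrative choices, and a real scheme would also have to deal with hash collisions.

```c
#include <stdio.h>
#include <stddef.h>

/* 32-bit FNV-1a hash of a string (illustrative choice of hash). */
static unsigned long fnv1a(const char *s)
{
    unsigned long h = 2166136261UL;

    while (*s) {
        h ^= (unsigned char)*s++;
        h = (h * 16777619UL) & 0xFFFFFFFFUL; /* keep 32 bits even if long is wider */
    }
    return h;
}

/* Hypothetical mangler: builds a C identifier that depends only on the
 * include path and the routine name, never on parse order.
 * Returns the value of snprintf (length of the full name). */
int stable_mangle(const char *file, const char *routine,
                  char *out, size_t outsize)
{
    return snprintf(out, outsize, "_%08lx_%s", fnv1a(file), routine);
}
```

Because the result is a pure function of its two inputs, the translator and the interpreter could each compute the same symbol for std/convert.e's to_number and look it up in the DLL while parsing.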


12. Re: Precompiled include files

I'm not talking about .il files OR .dll (or .so or .dylib) files. I'm talking about translating and compiling std/*.e files to .c and .o files during a project build.

I'm mainly talking about building Euphoria, although the same could apply to other larger projects. The same std/*.e file gets translated into a .c file and then compiled into a .o file and then finally linked in to the project. This seems wasteful when it happens several times during a build, when these files change extremely rarely.


13. Re: Precompiled include files

jaygade said...

I'm not talking about .il files OR .dll (or .so or .dylib) files. I'm talking about translating and compiling std/*.e files to .c and .o files during a project build.

I'm mainly talking about building Euphoria, although the same could apply to other larger projects. The same std/*.e file gets translated into a .c file and then compiled into a .o file and then finally linked in to the project. This seems wasteful when it happens several times during a build, when these files change extremely rarely.

I ran seven tests looking for a bug in the interpreter, so I ended up with several build directories. I compared the C function name generated for the Euphoria routine screen_output, and in each build it had the same prefix, _56. Arguably this file didn't need to be recompiled each time unless the Euphoria code it came from had changed, since the code and even the function names stay the same across builds. However, it is not only the contents of the translation that affect the object code, but also the defines that are used and, of course, which translator you are using.

What about inlining? If routines are inlined from other .e files, then if any included file is changed you have to recompile the files that use it. So the library files are already interdependent in this way.


14. Re: Precompiled include files

I don't know anything about inlining.

It does seem to me, though, that the Translator can be at least as smart as Make when it comes to which files have been modified.

It may not be a simple change, but it should be a doable change.

Scenario:

  1. Make is building the interpreter.
  2. One of the interpreter files includes std/convert.e.
  3. The translator accepts a command flag that says "reuse compiled library files".
  4. But the library file hasn't been translated yet, or the .e file is newer than the .c or .o file.
  5. The translator translates and compiles the file, and saves it to a known location. It does this for all std/*.e files included in the project. (I don't know whether it's worthwhile to do this only for std files or for all .e files in a project.)
  6. The translator finishes its build of the interpreter and deletes all of its old files EXCEPT the compiled library files.
  7. Make is building the translator.
  8. One of the translator files includes std/convert.e
  9. The translator accepts a command flag that says "reuse compiled library files".
  10. The translator finds the precompiled .c or .o file, compares it to the .e file, and sees that the precompiled file is newer.
  11. The translator uses the precompiled file in its build of the translator, saving cycles and time.
  12. The translator finishes its build of the translator and deletes all of its old files EXCEPT the compiled library files.


And finally, the developer makes some changes to some project file or other and does another build, but he saves a lot of cycles and time because a third or so of the project is already built. If there's a conflict, there can be a 'make clean' target which removes the precompiled files from the build directory.
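The "is the .e file newer than the .o file" test in steps 4 and 10 could be sketched roughly like this. It is only the Make-style timestamp comparison; the function name needs_rebuild is hypothetical, and it assumes a POSIX-style stat() is available.

```c
#include <sys/stat.h>

/* Hypothetical helper for a "reuse compiled library files" flag.
 * Returns  1 if the cached object is missing or older than the source,
 *          0 if the cached object can be reused,
 *         -1 if the source file itself cannot be found. */
int needs_rebuild(const char *source, const char *object)
{
    struct stat src, obj;

    if (stat(source, &src) != 0)
        return -1;                 /* nothing to translate from */
    if (stat(object, &obj) != 0)
        return 1;                  /* no cached object yet */

    /* Same rule Make uses: rebuild when the source is newer. */
    return src.st_mtime > obj.st_mtime ? 1 : 0;
}
```

A call such as needs_rebuild("std/convert.e", "build/convert.o") would then decide whether to translate and compile the file again or to link the saved object (the paths are just examples).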

Again, this is just brainstorming. If this ONLY affects building of Euphoria and not other projects then it might not be worth doing.


15. Re: Precompiled include files

jaygade said...

It does seem to me, though, that the Translator can be at least as smart as Make when it comes to which files have been modified.

This was tried, and I think there are still some remnants in the translator source from this effort. If we were doing a simple translation to C, it would probably be fairly easy. However, the translator itself does a fair amount of optimization, which includes getting rid of unused code. This feeds into other optimizations. So we can't just look at when that file has changed. We also have to know if anything has changed about the way it's being used.

So, for instance, if we know that a certain function is only called with a particular value, we can replace the variable with the value constant (assuming it's an integer) and we reduce run time tests and entire code branches. The C compiler does its own optimizations, too.
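To make that concrete, here is a hand-written C illustration of the effect. It is not real translator output, and the routine names are made up. If every call in the program passes mode = 1, the general routine can collapse to the specialized one.

```c
/* General form: nothing is known about how the routine is called,
 * so every branch and every run time test has to stay. */
int scale_generic(int value, int mode)
{
    if (mode == 0)
        return value;
    else if (mode == 1)
        return value * 2;
    return value * 4;
}

/* Specialized form: valid only when whole-program analysis shows that
 * every call site passes mode == 1. The parameter, the tests and the
 * other two branches are gone. */
int scale_mode1(int value)
{
    return value * 2;
}
```

The specialized form is only correct for the program it was generated for. A cached object file holding scale_mode1 would be wrong for another program that calls the routine with mode 0, which is why the file's timestamp alone is not enough.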

This all means that we may speed up translation (done once) at the expense of slower run times and larger executables. Ultimately, we dropped the idea.

Matt


16. Re: Precompiled include files

That's interesting, and it doesn't seem to match my recent experiment. I created these two files and ran "euc -nobuild" on them to examine the .c files.

-- convert1.ex 
include std/console.e 
include std/convert.e 
 
object cmd 
 
cmd = command_line() 
 
if length(cmd) < 3 then 
	display("[] requires at least 1 argument", {cmd[2]}) 
	abort(1) 
end if 
 
display(to_string(cmd[3])) 

-- convert2.ex 
include std/console.e 
include std/convert.e 
 
object cmd 
 
cmd = command_line() 
 
if length(cmd) < 3 then 
	display("[] requires at least 1 argument", {cmd[2]}) 
	abort(1) 
end if 
 
display(to_number(cmd[3])) 

convert.c was identical in both cases, and seemed to contain all of the exported functions of convert.e, as were all of the .c files created from other std/*.e files. I don't see any stripping or optimization being performed on these files.

I may be misunderstanding, but doesn't the translator treat files on its command line differently from files "included" within another file? I'm only guessing that from experiments not from actually studying the code.

Admittedly I'm testing with 4.05 on Windows right now. Maybe my example is too small or it's different on 4.1.0 -- I'll have to test it later.

Edit: Nevermind. I just did another experiment and I see that it creates a much different file.


17. Re: Precompiled include files

Nevermind. I just did another experiment with very different results.


18. Re: Precompiled include files

jaygade said...

Admittedly I'm testing with 4.05 on Windows right now. Maybe my example is too small or it's different on 4.1.0 -- I'll have to test it later.

It looks like 4.1 is better at removing things. I think this is probably due to better routine_id analysis. When I tried your examples using a 4.1 translator, I checked each translation for whichever of to_string/to_number wasn't used, and the unused routine was absent, as you'd expect.

Matt


19. Re: Precompiled include files

Yeah, I just translated int.ex (with 4.05) and it created a totally different convert.c with different identifiers and everything.

The funny thing is that it was also a much smaller file -- 37K vs. 60K.


20. Re: Precompiled include files

jaygade said...

Yeah, I just translated int.ex (with 4.05) and it created a totally different convert.c with different identifiers and everything.

The funny thing is that it was also a much smaller file -- 37K vs. 60K.

Yes, the version of the translator and the defined words will affect the resulting C code. Even if you compare the convert.c in intobj and transobj using the same interpreter/translator, you will find different symbols inside the routines and different names for the routines themselves. They must be named this way to avoid conflicts among routines that have the same name.

These routines are not even necessarily functionally equivalent if I understand what Matt is saying.


21. Re: Precompiled include files

SDPringle said...

Yes, the version of the translator and the defined words will affect the resulting C code. Even if you compare the convert.c in intobj and transobj using the same interpreter/translator, you will find different symbols inside the routines and different names for the routines themselves. They must be named this way to avoid conflicts among routines that have the same name.

These routines are not even necessarily functionally equivalent if I understand what Matt is saying.

At the C level, this can be true. Consider the following simple example:

 
-- foo.ex 
export procedure foo( object o ) 
	if sequence( o ) then 
		? 1 
	elsif integer( o ) then 
		? 2 
	elsif atom( o ) then 
		? 3 
	end if 
end procedure 
 
foo( 1 ) 
foo( "bar" ) 
foo( 1.5 ) 

Try commenting out different calls of foo, and look at the translated result. The translator isn't smart enough to omit the if blocks, but you'll note that you get things like:

    /** foo.ex:5		elsif integer( o ) then*/ 
    _9 = 1; 
    if (_9 == 0) 
    { 
        _9 = NOVALUE; 
        goto L3; // [22] 33 
    } 
    else{ 
        _9 = NOVALUE; 
    } 
 

...and so the C compiler will be able to optimize this away.

Matt
