Re: ver 4.0 problem with SVN1222
- Posted by mattlewis (admin) Oct 29, 2008
I don't understand what you mean by using globals without including them.
I don't understand how a global can be used without including it.
Which missing include statements are you talking about: the std libraries or
my include statements?
The std library should be fine. I don't think that the 4.0 std library even uses any globals. Everything should be either public or export. Here is the issue [deep breath]:
In pre-4.0 Euphoria (maybe pre-3.1, but anyway...), if you tried to use a global without a namespace qualifier, and there was more than one global with that name, you would get an error. In 4.0, the parser tries to be smarter and to figure out which symbol you really meant to use. It does this by keeping track of which files included which other files.
One way that the earlier behavior could really cause problems was with 3rd party libraries. Imagine that two 3rd party libraries defined a global with the same name. It would be perfectly safe, in your application, to use either library. But if you use both, you might start getting errors in one of those libraries (assuming the author didn't use namespace qualifiers when referencing those globals that he and the other author both defined). Therefore, the only way to get the two libraries to work together would be to go in and edit their code. Note that this could happen, even if your application always used namespace qualifiers.
In 4.0, however, the parser will consider where the symbol is being used and which files were included by the file with the code that references the global. It will choose the global symbol that was included by the third party library over a global declared by code that the file "doesn't know about." Now, the two third party libraries Just Work when you put them together. They operate independently, and don't cause conflicts with each other.
There are two new terms in 4.0: direct include and indirect include. A direct include is when a file has an include statement for another file. So if the line "include bar.e" appears in foo.e, we would say that foo.e directly includes bar.e. Suppose now that bar.e directly includes baz.e. Then we could also say that foo.e indirectly includes baz.e. Also suppose that bat.e is not included by any of these files, directly or indirectly, and that baz.e and bat.e both define a global named X. If, somewhere in foo.e, the code refers to X, the parser will see that there is a connection between foo.e and baz.e, but not bat.e, and use baz.e:X in preference to bat.e:X. In these terms, baz.e:X is an included global for foo.e, while bat.e:X is an unincluded global. A common way for unincluded globals to show up in code is to have the main file include some library like win32lib, and to have other parts of the app included by the main file rely on the main file to include win32lib rather than including it themselves. Now you've got lots of unincluded global references going on.
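To make the foo.e / bar.e / baz.e / bat.e example concrete, here is a rough model of the idea in Python. It is only a sketch, not the actual parser code: the data structures and the names `direct_includes`, `global_definitions`, `reachable_includes` and `resolve_global` are made up for illustration, and treating several equally good candidates as an error is an assumption of the sketch.

```python
# Hypothetical model of the include graph from the example above.
# Maps each file to the files it directly includes.
direct_includes = {
    "foo.e": ["bar.e"],
    "bar.e": ["baz.e"],
    "baz.e": [],
    "bat.e": [],
}

# Files that define a global with a given name.
global_definitions = {"X": ["baz.e", "bat.e"]}


def reachable_includes(start):
    """Return every file that `start` includes, directly or indirectly."""
    seen = set()
    stack = list(direct_includes.get(start, []))
    while stack:
        name = stack.pop()
        if name not in seen:
            seen.add(name)
            stack.extend(direct_includes.get(name, []))
    return seen


def resolve_global(symbol, using_file):
    """Pick the defining file for `symbol` as seen from `using_file`.

    Included globals win over unincluded ones. More than one candidate at
    the same level is treated as ambiguous (an assumption of this sketch).
    """
    candidates = global_definitions.get(symbol, [])
    visible = reachable_includes(using_file) | {using_file}
    included = [f for f in candidates if f in visible]
    pool = included if included else candidates
    if len(pool) != 1:
        raise LookupError(f"cannot resolve {symbol!r} from {using_file}")
    return pool[0]


print(resolve_global("X", "foo.e"))  # baz.e
```

A file with no include path to either baz.e or bat.e would see two unincluded candidates for X, which this sketch reports as an error.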
Let's complicate things some more. :) We've removed the restriction on forward references. This means that two files can now safely include each other, and use each other's routines and constants and variables [of appropriate scope, of course]. One instance where this happens is between sort.e, sequence.e and text.e. There is code in sequence.e that uses sort(). The order in which files are included affects the order in which their code is actually parsed. This caused lots of headaches if you tried to have mutually dependent files in pre-4.0 Euphoria, and led to use of routine_ids and moving routines before or after include statements, etc. But now that's not necessary.
Now, let's suppose you're working on a complex app that actually happens to include the old sort.e, perhaps because you're porting it from 3.1 to 4.0 (there are other ways that a similar scenario could occur, but this one actually did). And suppose that the old sort.e was included before the new one was. It just so happens that the new sort.e has an extra (optional) parameter, meaning that the two are somewhat incompatible. So if sequence.e is parsed before the new sort.e, the call to sort() gets resolved to the old sort, which is an error, because it tries to pass two parameters to a function that only takes one.
But now we've encountered a situation in which code is being affected (incorrectly!) by code that it knows nothing about. So now, instead of resolving to the unincluded global immediately, the parser defers doing so until the end, to see if there were other, better choices. If you're using a lot of unincluded globals by design, there will be no better choices, but it will cause a lot of overhead as the parser continues to look for them. At the end of the parsing, it will clean them all up, and the program should run as before.
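The difference between binding right away and deferring can be shown with a toy model in Python. Again, this is a sketch under assumptions and not how the parser is really written: the class `DeferredResolver`, its methods, and the file labels `old_sort.e` and `new_sort.e` are all hypothetical stand-ins for the old and new sort.e.

```python
class DeferredResolver:
    """Toy resolver that postpones binding to unincluded globals."""

    def __init__(self):
        self.definitions = {}  # symbol name -> defining files, in parse order
        self.pending = []      # references waiting for a final decision

    def define(self, symbol, defining_file):
        self.definitions.setdefault(symbol, []).append(defining_file)

    def reference(self, symbol, using_file, included_files):
        # Do not bind yet: a better (included) definition may be parsed later.
        self.pending.append((symbol, using_file, set(included_files)))

    def finish(self):
        """Resolve everything once all files have been parsed."""
        resolved = {}
        for symbol, using_file, included in self.pending:
            candidates = self.definitions.get(symbol, [])
            preferred = [f for f in candidates if f in included]
            # Fall back to unincluded globals only if nothing better exists.
            pool = preferred or candidates
            if len(pool) != 1:
                raise LookupError(f"cannot resolve {symbol!r} in {using_file}")
            resolved[(using_file, symbol)] = pool[0]
        self.pending.clear()
        return resolved


resolver = DeferredResolver()
resolver.define("sort", "old_sort.e")                     # parsed first
resolver.reference("sort", "sequence.e", ["new_sort.e"])  # call seen here
resolver.define("sort", "new_sort.e")                     # parsed later
result = resolver.finish()
print(result[("sequence.e", "sort")])  # new_sort.e
```

Had the reference been bound at the moment it was seen, the only known candidate would have been the old sort. If no included definition ever turns up, the fallback picks the unincluded global at the end, which corresponds to the "program should run as before" case.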
I have a question: if I include a standard library in a file, and that file
includes another file which includes that same standard library, does that
same standard library get scanned for forward references twice?
Each file is only scanned once, just like it's always been. Each time the scanner finishes with a file, it attempts to resolve forward references. In theory, we could wait until the end of the program, but the problem is that local symbols are hidden, so we need to check before we can't see them. One probable future optimization is to make this a bit smarter, and only check those that might refer to local symbols in that file. But it's partly a case of trying to get things correct before optimizing.
Basically, the parser remembers all of the unresolved forward references and periodically tries to resolve them.
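Roughly, the "check at the end of each file" idea looks like the Python sketch below. It is deliberately simplified and is not the real implementation: the function `parse_program` and its tuple layout are invented for the example, and it ignores include relationships and the included/unincluded distinction entirely.

```python
def parse_program(files):
    """Resolve forward references file by file.

    `files` is a list of (name, local_names, global_names, references)
    tuples in parse order. Returns (resolved, still_unresolved).
    """
    global_symbols = {}  # symbol name -> defining file
    pending = []         # (symbol, using_file) pairs not yet resolved
    resolved = {}

    for name, local_names, global_names, references in files:
        pending.extend((sym, name) for sym in references)
        for sym in global_names:
            global_symbols[sym] = name

        # End of file: last chance to see this file's local symbols.
        still_pending = []
        for sym, using_file in pending:
            if using_file == name and sym in local_names:
                resolved[(using_file, sym)] = (name, "local")
            elif sym in global_symbols:
                resolved[(using_file, sym)] = (global_symbols[sym], "global")
            else:
                still_pending.append((sym, using_file))
        pending = still_pending

    return resolved, pending


files = [
    ("a.e", {"helper"}, set(), ["helper", "shared"]),
    ("b.e", set(), {"shared"}, []),
]
resolved, unresolved = parse_program(files)
print(resolved[("a.e", "helper")])  # ('a.e', 'local')
print(resolved[("a.e", "shared")])  # ('b.e', 'global')
```

The reference to `helper` has to be settled when a.e ends, because its local symbols are not visible afterwards, while `shared` stays pending until b.e has been scanned.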
Matt