OpenEuphoria: Forum: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

1. symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at ?mail.c?m> Oct 09, 2007
923 views

CChris wrote:
> 
> 
> Does the include tree also contain the direct ancestors? If not, there are
> some
> issues with this scheme, which is quite common:
> }}}
<eucode>
-- file1.e
include third.e
global integer n_also_in_third
n_also_in_third=n_init() -- ok, this g.s. shadows anything in third.e
include file2.e
--...

-- in file2.e
integer my_var
my_var=n_also_in_third 
-- could produce an error, while
-- file1.e::n_also_in_third is obviously being targeted
</eucode>
{{{

> As the name suggests, n_also_in_third is also a global symbol defined in
> third.e.
> I'd consider it a problem if this isn't properly handled.

No, it doesn't climb back up the tree.  I can see the logic of it, but I'm
not sure I agree with this.  Is it really that common?  I know that I've 
occasionally been guilty of this.  

From the perspective of file2.e, it's not obvious that the correct resolution
should be from file1.e.  I think it might be better to give an error in
this case.  The resolution would be to include the file explicitly.
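The behaviour described above can be sketched as a small model. This is Python rather than Euphoria so that it runs standalone, and it is only an illustration: the `includes` and `defines` dictionaries and the `resolve` function are hypothetical, not the interpreter's actual data structures. It also assumes that a definition nearer to the referencing file masks a deeper one, which is still an open question in this thread.

```python
def resolve(symbol, from_file, includes, defines):
    """Search from_file, then its includes level by level; never climb to parents."""
    level, seen = [from_file], set()
    while level:
        found = [f for f in level if symbol in defines.get(f, set())]
        if len(found) == 1:
            return found[0]  # assumed rule: the nearer definition masks deeper ones
        if len(found) > 1:
            raise LookupError(f"{symbol} is ambiguous: {found}")
        seen.update(level)
        next_level = []
        for f in level:
            for child in includes.get(f, []):
                if child not in seen and child not in next_level:
                    next_level.append(child)
        level = next_level
    raise LookupError(f"{symbol} not found in the include tree of {from_file}")


# CChris's layout: file1.e includes third.e and file2.e.
includes = {"file1.e": ["third.e", "file2.e"], "third.e": [], "file2.e": []}
defines = {"file1.e": {"n_also_in_third"}, "third.e": {"n_also_in_third"}}

print(resolve("n_also_in_third", "file1.e", includes, defines))
try:
    resolve("n_also_in_third", "file2.e", includes, defines)
except LookupError as err:
    print("error:", err)

# The suggested fix: file2.e includes its parent explicitly (a circular include).
includes["file2.e"] = ["file1.e"]
print(resolve("n_also_in_third", "file2.e", includes, defines))
```

In this model the reference from file2.e fails until file2.e names file1.e itself, after which the symbol in file1.e wins over the one in third.e because it is one level closer.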

Continuing my train of thought from the response to Pete, I guess a legitimate
question to be asked would be whether we should let symbols in directly
included files mask those from lower down in the tree?  Assuming, of course,
that no namespace identifier is used, and there is only one possible symbol.

I'm also starting to think that any symbol resolution from beyond the 
include tree should generate a warning.

Matt


2. symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Pete Lomax <petelomax at blue?ond?r.co.uk> Oct 09, 2007
851 views
Last edited Oct 10, 2007

Matt Lewis wrote:
> 
> Pete Lomax wrote:
> > 
> > Matt Lewis wrote:
> > > 
> > > Actually, namespaces will be inherited (and are inherited, for code at
> > > head of the svn repository) in the next release.  I typically run this
> > > code, which is probably why I haven't noticed the bug.
> > 
> > Cool. How did you solve that in the end? In Positive I devised a set of
> > priority
> > tables, which is quite simple really but harder to explain, eg:
> > 
> > -- fileno  desc       parents	 priorities
> > --   1	   inc7 	{}	{2,1,1,1,1,1}
> > --   2	     eric	{1}	{2,4,3,3,3,3}
> > --   3	       bob	{2}	{2,4,6,5}
> 
> <snip>
> 
> My technique is a bit different.  I don't worry about priorities.  Just the
> 'lineage' of each file.  Basically, I record who got included by whom.
> Then, when doing a symbol lookup, if there are multiple matching global
> symbols, I have to do some work.
Fair enough. Actually I just realised that since you could rebuild my table from
the parentage info, both approaches are equivalent, just doing the work needed in
different places.

> IMHO, solve 95% of the possible namespace conflicts with 3rd party libs.
Agreed.

> Now, when a namespace identifier is used:
> 
> If a symbol is not within the file (as specified by the namespace 
> identifier), *or its include tree* then it is ignored.
Are you really saying (see also final comment) that if I code

include win32lib.ew as win32lib
...
  l=win32lib:length(x)

ie, length() [or some global from somewhere else] has nothing to do with
win32lib, but you just ignore it and use the builtin? Seems misleading to me. I
would replace "it is ignored" with "then an error results" in your statement.

> I suppose that giving preference to a symbol in the namespaced file (as
> opposed to files included by that file) would also work to solve the 
> dilemma.
The point is that without such you cannot explicitly refer to the one in the
namespaced file, whereas you can for sub-includes by re-including them.

CChris wrote:
> -- in file2.e
> integer my_var
> my_var=n_also_in_third 
> -- could produce an error, while
> -- file1.e::n_also_in_third is obviously being targeted
As Matt said, unqualified it is not "obviously" the one in file1.e.
However, "include file1.e as file1" must allow file1:n_also_in_third to
explicitly target one of the clashing globals (Eu has worked exactly like that
since 2.3 and possibly before). As above it would not work properly without a
preference scheme, when you extend the namespace scope that is.

I agree a global from the parent should override one from a grandparent but
agree with Matt that it should not override ones in the siblings.
Not quite sure where Matt stands on the first part.

Matt Lewis wrote:
> Continuing my train of thought from the response to Pete, I guess a 
> legitimate question to be asked would be whether we should let symbols in 
> directly included files mask those from lower down in the tree?  Assuming,
> of course, that no namespace identifier is used, and there is only one
> possible symbol.
Are you saying that in eric you would check bob&diane (directly included by
eric) as one level then alice&chris (included indirectly via bob&diane
respectively) as the next, which seems messy? The parent masks grandparent seems
more useful. One of my guiding principles is: If it works standalone, it should
work the same when you include it in the middle of an application.

> I'm also starting to think that any symbol resolution from beyond the 
> include tree should generate a warning.
Is this the same worry I had with "win32lib:length" above?

Regards,
Pete


3. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at g?ai?.com> Oct 10, 2007
876 views

Pete Lomax wrote:
> 
> Fair enough. Actually I just realised that since you could rebuild my table
> from the parentage info, both approaches are equivalent, just doing the work
> needed in different places.

Yep.

> > IMHO, solve 95% of the possible namespace conflicts with 3rd party libs.
> Agreed.
> 
> > Now, when a namespace identifier is used:
> > 
> > If a symbol is not within the file (as specified by the namespace 
> > identifier), *or its include tree* then it is ignored.
> Are you really saying (see also final comment) that if I code
> }}}
<eucode>
> include win32lib.ew as win32lib
> ...
>   l=win32lib:length(x)
> </eucode>
{{{

> ie, length() [or some global from somewhere else] has nothing to do with
> win32lib,
> but you just ignore it and use the builtin? Seems misleading to me. I would
> replace "it is ignored" with "then an error results" in your statement.

Yes, I think that would be an error.  The global has to be somewhere in the
include tree of win32lib (so all of the memory stuff, for instance, could
be accessed through the win32lib namespace).
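A minimal sketch of the namespace-qualified rule, again in Python purely for illustration. The include graph and symbol table are made up (`app_helper` and `clash` are hypothetical symbols), and the preference for the namespaced file itself is the scheme being discussed here, not confirmed interpreter behaviour.

```python
# Hypothetical include graph: file -> files it includes directly.
INCLUDES = {
    "myapp.exw": ["win32lib.ew"],
    "win32lib.ew": ["machine.e"],
    "machine.e": [],
}
# Hypothetical table of globals: name -> files that define it.
GLOBALS = {
    "allocate": ["machine.e"],
    "closeWindow": ["win32lib.ew"],
    "app_helper": ["myapp.exw"],
    "clash": ["win32lib.ew", "machine.e"],
}


def include_tree(root):
    """Return root plus every file it includes, directly or indirectly."""
    seen, stack = set(), [root]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(INCLUDES.get(current, []))
    return seen


def resolve_qualified(namespace_file, symbol):
    """Resolve ns:symbol, where ns was bound by 'include namespace_file as ns'."""
    tree = include_tree(namespace_file)
    matches = [f for f in GLOBALS.get(symbol, []) if f in tree]
    if not matches:
        raise LookupError(f"{symbol} is not in the include tree of {namespace_file}")
    if namespace_file in matches:
        return namespace_file  # prefer the namespaced file over its sub-includes
    if len(matches) == 1:
        return matches[0]
    raise LookupError(f"{symbol} is ambiguous within {namespace_file}: {matches}")


print(resolve_qualified("win32lib.ew", "allocate"))
print(resolve_qualified("win32lib.ew", "clash"))
```

Here `win32lib:allocate` reaches machine.e through the include tree, while a symbol defined only in myapp.exw raises an error instead of being silently resolved elsewhere.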

> > I suppose that giving preference to a symbol in the namespaced file (as
> > opposed to files included by that file) would also work to solve the 
> > dilemma.
> The point is that without such you cannot explicitly refer to the one in the
> namespaced file, whereas you can for sub-includes by re-including them.

Yep, it's a case I hadn't thought of, but obviously needs to be addressed.

> CChris wrote:
> > -- in file2.e
> > integer my_var
> > my_var=n_also_in_third 
> > -- could produce an error, while
> > -- file1.e::n_also_in_third is obviously being targeted
> As Matt said, unqualified it is not "obviously" the one in file1.e.
> However, "include file1.e as file1" must allow file1:n_also_in_third to
> explicitly
> target one of the clashing globals (Eu has worked exactly like that since 2.3
> and possibly before). As above it would not work properly without a preference
> scheme, when you extend the namespace scope that is.
> 
> I agree a global from the parent should override one from a grandparent but
> agree with Matt that it should not override ones in the siblings.
> Not quite sure where Matt stands on the first part.

I think that 'the first part' is the global/parent/grandparent statement?
I'm not sure on that one.  Alternately, your statement seems to logically
follow the other facts, but there's something about it that isn't sitting
right.  I'll have to think about it some more, but I suspect that you're
correct.

> Matt Lewis wrote:
> > Continuing my train of thought from the response to Pete, I guess a 
> > legitimate question to be asked would be whether we should let symbols in 
> > directly included files mask those from lower down in the tree?  Assuming,
> > of course, that no namespace identifier is used, and there is only one 
> > possible symbol.
> Are you saying that in eric you would check bob&diane (directly included by
> eric) as one level then alice&chris (included indirectly via bob&diane
> respectively) as the next, which seems messy? The parent masks grandparent
> seems
> more useful. One of my guiding principles is: If it works standalone, it
> should
> work the same when you include it in the middle of an application.

I'm not sure I follow the difference here.  Maybe I'm thinking too much
in terms of the way RTLookup is coded (and the symbol table is put 
together).  I don't think there's any guarantee that you'll encounter
symbols from one file before another, so you have to check all of them
before deciding who masks whom (though I might be wrong about this).
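To illustrate why every candidate has to be collected before choosing, here is a small Python sketch. It assumes each matching global has already been tagged with a hypothetical include depth relative to the referencing file (eric in Pete's example); the `.e` file names and the `pick_global` function are invented for the illustration.

```python
from itertools import permutations


def pick_global(candidates):
    """Choose among (file, depth) pairs, given in whatever order the lookup found them."""
    if not candidates:
        raise LookupError("symbol not declared")
    nearest = min(depth for _, depth in candidates)
    winners = [name for name, depth in candidates if depth == nearest]
    if len(winners) > 1:
        raise LookupError(f"ambiguous between {sorted(winners)}")
    return winners[0]


# bob.e is included directly (depth 1); alice.e and chris.e are one level lower.
candidates = [("alice.e", 2), ("bob.e", 1), ("chris.e", 2)]

# The answer must not depend on the order in which the symbol table yields entries.
results = {pick_global(list(order)) for order in permutations(candidates)}
print(results)
```

Because the function only decides after seeing the whole list, any ordering of the hash chain gives the same answer, and two candidates at the same depth are reported as ambiguous rather than resolved by accident.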

I think some of my concern with the masking is that it's a new behavior 
for euphoria, and isn't always obvious.  I think that if a global from one
file masks a global from another file due to a parent/grandparent
relationship, there should be at minimum a warning generated.  I think
it would be easy to get confused as to which symbol is resolved, and
go on our merry way to destruction.

> > I'm also starting to think that any symbol resolution from beyond the 
> > include tree should generate a warning.
> Is this the same worry I had with "win32lib:length" above?

I don't think so.  Let's say you're developing myapp.exw, and it includes
win32lib.  It also includes myapp.ew, which does not include win32lib,
but calls win32lib routines.  As long as win32lib was included before
myapp.ew, then this will work, but since you're depending on symbols that
you've never explicitly included into myapp.ew, I  believe that a warning
should be generated.

Matt


4. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by CChris <christian.cuvier at agricultu?e.gouv.?r> Oct 10, 2007
863 views

Matt Lewis wrote:
> 
> CChris wrote:
> > 
> > 
> > Does the include tree also contain the direct ancestors? If not, there are
> > some
> > issues with this scheme, which is quite common:
> > }}}
<eucode>
> -- file1.e
> include third.e
> global integer n_also_in_third
> n_also_in_third=n_init() -- ok, this g.s. shadows anything in third.e
> include file2.e
> --...
>
> -- in file2.e
> integer my_var
> my_var=n_also_in_third 
> -- could produce an error, while
> -- file1.e::n_also_in_third is obviously being targeted
> </eucode>
{{{

> > As the name suggests, n_also_in_third is also a global symbol defined in
> > third.e.
> > I'd consider it a problem if this isn't properly handled.
> 
> No, it doesn't climb back up the tree.  I can see the logic of it, but I'm
> not sure I agree with this.  Is it really that common?  I know that I've 
> occasionally been guilty of this.  
> 

How do you have reliable two way communication between a server and a client
then? So it is common as soon as your lib/app is not organised in a pyramidal
way. By "reliable" I mean, not disrupted by a third party file included alongside
a large library.

>  From the perspective of file2.e, it's not obvious that the correct resolution
> should be from file1.e. 

Huh?
A parent/ancestor is a more likely target than a cousin, don't you think?

> I think it might be better to give an error in
> this case.  The resolution would be to include the file explicitly.

Which means file2.e should include file1.e, even though file1.e includes
file2.e? It makes sense, but will such a sea change - allowing circular includes
- be accepted? I would accept it.

> 
> Continuing my train of thought from the response to Pete, I guess a legitimate
> question to be asked would be whether we should let symbols in directly
> included files mask those from lower down in the tree?  Assuming, of course,
> that no namespace identifier is used, and there is only one possible symbol.
> 

Definitely yes, that's the only way to extend routines that are not builtins -
something which is hard to do currently, other than using clumsy call_func/proc()
calls.
One could always use namespacing to access the shadowed symbols.

> I'm also starting to think that any symbol resolution from beyond the 
> include tree should generate a warning.
> 

It doesn't hurt, agreed.

> Matt
CChris


5. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at ?m?il.com> Oct 10, 2007
882 views

CChris wrote:
> 
> Matt Lewis wrote:
> > 
> > No, it doesn't climb back up the tree.  I can see the logic of it, but I'm
> > not sure I agree with this.  Is it really that common?  I know that I've 
> > occasionally been guilty of this.  
> > 
> 
> How do you have reliable two way communication between a server and a client
> then? So it is common as soon as your lib/app is not organised in a pyramidal
> way. By "reliable" I mean, not disrupted by a third party file included
> alongside
> a large library.

You've lost me here.

> >  From the perspective of file2.e, it's not obvious that the correct
> >  resolution
> > should be from file1.e. 
> 
> Huh?
> A parent/ancestor is a more likely target than a cousin, don't you think?

I think you've missed my point.  I agree that the parent is more likely,
but it's still ambiguous.  file2.e has a dependency that it doesn't make
clear.  The symbol could be from anywhere.  Now, if it added an include
to its parent, then it would be much clearer.  A better solution might
be to move those symbols to a separate file that both file1.e and file2.e 
could include.  This would make things less ambiguous for both the 
parser and the coder.

> > I think it might be better to give an error in
> > this case.  The resolution would be to include the file explicitly.
> 
> Which means file2.e should include file1.e, even though file1.e includes
> file2.e?
> It makes sense, but will such a sea change - allowing circular includes - be
> accepted? I would.

They are currently accepted.  There is no change, except that we still
need a mechanism for allowing symbols in a direct include to mask those
of an indirect include.

Matt


6. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Pete Lomax <petelomax at bl?eyond?r.co.uk> Oct 10, 2007
860 views

Matt Lewis wrote:
> 
> Pete Lomax wrote:
> > I agree a global from the parent should override one from a grandparent but
> > agree with Matt that it should not override ones in the siblings.
> > Not quite sure where Matt stands on the first part.
> 
> I think that 'the first part' is the global/parent/grandparent statement?
Yes. An example:

main.e
  global Z
  include libA.e
    global Z
    include libAmisc.e
       if Z then

Once you take the view that libA works standalone and including it as part of a
larger app should not change that, then the unqualified reference to Z should
automatically resolve to the one in libA and ignore the one in main.e

IMO such a rule would help far more often than hinder, and it's not as if it is
going to break any existing code or anything.
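The parent-masks-grandparent rule can be sketched as a walk up a single chain of includers. This is a Python illustration only; `PARENT`, `DEFINES` and `resolve_upward` are hypothetical names, and the model assumes each file has exactly one includer.

```python
# Who included whom, for the example above (None marks the top-level file).
PARENT = {"main.e": None, "libA.e": "main.e", "libAmisc.e": "libA.e"}
# Globals declared by each file.
DEFINES = {"main.e": {"Z"}, "libA.e": {"Z"}, "libAmisc.e": set()}


def resolve_upward(symbol, from_file):
    """Return the nearest file, starting at from_file and climbing, that defines symbol."""
    current = from_file
    while current is not None:
        if symbol in DEFINES[current]:
            return current
        current = PARENT[current]
    return None


print(resolve_upward("Z", "libAmisc.e"))
print(resolve_upward("Z", "main.e"))
```

With this rule the unqualified Z in libAmisc.e reaches libA.e first, so it resolves the same way whether or not libA.e is embedded in a larger application.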

> I'm not sure on that one.  Alternately, your statement seems to logically
> follow the other facts, but there's something about it that isn't sitting
> right.  I'll have to think about it some more, but I suspect that you're
> correct.
Perhaps the missing part is that "include third.e as third" also allows
third:n_also_in_third to explicitly target the other one of the clashing globals.

> > Matt Lewis wrote:
> > > Continuing my train of thought from the response to Pete, I guess a 
> > > legitimate question to be asked would be whether we should let symbols in 
> > > directly included files mask those from lower down in the tree?  Assuming,
> > > of course, that no namespace identifier is used, and there is only one 
> > > possible symbol.
> > Are you saying that in eric you would check bob&diane (directly included by
> > eric) as one level then alice&chris (included indirectly via bob&diane
> > respectively) as the next, which seems messy? The parent masks grandparent
> > seems
> > more useful. One of my guiding principles is: If it works standalone, it
> > should
> > work the same when you include it in the middle of an application.
> 
> I'm not sure I follow the difference here.  Maybe I'm thinking too much
> in terms of the way RTLookup is coded (and the symbol table is put 
> together).  I don't think there's any guarantee that you'll encounter
> symbols from one file before another, so you have to check all of them
> before deciding who masks whom (though I might be wrong about this).

Given that some of the sub-includes may in fact be re-includes, then yes,
following a hash chain may chance upon relevant entries in a completely
higgledy-piggledy fashion. I concede that treating bob&diane as one level and
alice&chris as one lower need not be any messier, but I'm still not seeing why
you would actually want to do that. Got a practical example?

> I think some of my concern with the masking is that it's a new behavior 
> for euphoria, and isn't always obvious.  I think that if any globals from
> other files mask other globals from other files due to a parent/grandparent
> relationship, there should be at minimum a warning generated.  I think
> it would be easy to get confused as to which symbol is resolved, and
> going on our merry way to destruction.

The obvious thing is to develop a set of test files. You can find the ones I
used at http://www.palacebuilders.pwp.blueyonder.co.uk/t05.zip which you may or
may not have already seen and may or may not find useful. The .exw files are the
actual tests (14 of them), some of which are mind-numbingly trivial.

Probably also a good idea to develop a set of "fail" test files which should
generate the desired errors and warnings.

> > > I'm also starting to think that any symbol resolution from beyond the 
> > > include tree should generate a warning.
> > Is this the same worry I had with "win32lib:length" above?
> 
> I don't think so.  Let's say you're developing myapp.exw, and it includes
> win32lib.  It also includes myapp.ew, which does not include win32lib,
> but calls win32lib routines.  As long as win32lib was included before
> myapp.ew, then this will work, but since you're depending on symbols that
> you've never explicitly included into myapp.ew, I  believe that a warning
> should be generated.

I'm not convinced. If you can make myapp:closeWindow() work in this case without
introducing other problems then fair do's, but I still think it should just be an
error (which goes away if you put "include win32lib.ew" inside myapp.ew).

Regards,
Pete


7. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at g?ail.c?m> Oct 10, 2007
864 views

Pete Lomax wrote:
> 
> Matt Lewis wrote:
> > 
> Yes. An example:

main.e
   global Z
   include libA.e
     global Z
     include libAmisc.e
        if Z then

> Once you take the view that libA works standalone and including it as part of
> a larger app should not change that, then the unqualified reference to Z
> should
> automatically resolve to the one in libA and ignore the one in main.e
> 
> IMO such a rule would help far more often than hinder, and not as if it is
> going
> to break any existing code or anything.

I can see what you're saying, but there's still some ambiguity.  The minimum
that I think I'd go along with is to put in a warning.  Now, if libAmisc.e
put in an include statement for libA.e, then it would be explicit that it
wants to use the symbols from libA.e.  

What if main.e had also included libAmisc.e?

main.e
   global Z
   include libAmisc.e
     if Z then
   include libA.e
     global Z
     include libAmisc.e
        if Z then

Now how would the interpreter handle this?  It has to be an error (assuming
that libAmisc.e never included libA.e).  I think we have to assume that 
this situation will occur, because for main.e to see some of the symbols
of libAmisc.e, it might need to use a namespace identifier to avoid 
having libA.e symbols mask those from libAmisc.e.  So we've got another
case of a library breaking because of the way it was used (similar, but
slightly different from including two libraries with conflicting symbols).

Based on this line of thought, I believe that using a symbol not in one's
include tree should throw a warning, and that this situation (as it does
today) should be an error.
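The double-include case can be modelled by letting a file have more than one includer. The Python sketch below is an assumption-laden illustration: the `parents` maps and function names are hypothetical, and the "nearest level wins" rule is the proposal under discussion rather than existing behaviour.

```python
def nearest_definitions(symbol, from_file, parents, defines):
    """Climb the includers level by level; return every definer at the first level with a hit."""
    level, seen = [from_file], set()
    while level:
        found = {f for f in level if symbol in defines.get(f, set())}
        if found:
            return found
        seen.update(level)
        level = sorted({p for f in level for p in parents.get(f, []) if p not in seen})
    return set()


def resolve(symbol, from_file, parents, defines):
    """Return the single nearest definer, or raise if there is none or more than one."""
    found = nearest_definitions(symbol, from_file, parents, defines)
    if len(found) == 1:
        return next(iter(found))
    if not found:
        raise LookupError(f"{symbol} has not been declared")
    raise LookupError(f"{symbol} is ambiguous: {sorted(found)}")


defines = {"main.e": {"Z"}, "libA.e": {"Z"}}
# libAmisc.e included only by libA.e (Pete's example).
parents_once = {"main.e": [], "libA.e": ["main.e"], "libAmisc.e": ["libA.e"]}
# libAmisc.e included by both main.e and libA.e (the example above).
parents_twice = {"main.e": [], "libA.e": ["main.e"], "libAmisc.e": ["main.e", "libA.e"]}

print(resolve("Z", "libAmisc.e", parents_once, defines))
print(sorted(nearest_definitions("Z", "libAmisc.e", parents_twice, defines)))
```

With a single includer the lookup settles on libA.e, but once main.e also includes libAmisc.e both definitions sit at the same distance, so `resolve` has no principled choice and raises.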

> > I'm not sure on that one.  Alternately, your statement seems to logically
> > follow the other facts, but there's something about it that isn't sitting
> > right.  I'll have to think about it some more, but I suspect that you're
> > correct.
>
> Perhaps the missing part is that "include third.e as third" also allows
> third:n_also_in_third
> to explicitly target the other one of the clashing globals.

I think you're right.
 
> > > Matt Lewis wrote:
> > > > Continuing my train of thought from the response to Pete, I guess a 
> > > > legitimate question to be asked would be whether we should let symbols
> > > > in
> > > > directly included files mask those from lower down in the tree? 
> > > > Assuming,
> > > > of course, that no namespace identifier is used, and there is only one 
> > > > possible symbol.
> > > 
> > > Are you saying that in eric you would check bob&diane (directly included
> > > by
> > > eric) as one level then alice&chris (included indirectly via bob&diane
> > > respectively) as the next, which seems messy? The parent masks grandparent
> > > seems
> > > more useful. One of my guiding principles is: If it works standalone, it
> > > should
> > > work the same when you include it in the middle of an application.
> > 
> > I'm not sure I follow the difference here.  Maybe I'm thinking too much
> > in terms of the way RTLookup is coded (and the symbol table is put 
> > together).  I don't think there's any guarantee that you'll encounter
> > symbols from one file before another, so you have to check all of them
> > before deciding who masks whom (though I might be wrong about this).
> 
> Given that some of the sub-includes may in fact be re-includes, then yes,
> following
> a hash chain may chance upon relevant entries in a completely
> higgledy-piggledy
> fashion. I concede that treating bob&diane as one level and alice&chris
> as one lower need not be any messier, still not seeing why you would actually
> want to do that though. Got a practical example?

I wonder if we're talking past each other?  I'll adopt your example from
above:

app.ex
  include main.e
    global Z
    include libAmisc.e
      if Z then
    include libA.e
      global Z
      include libAmisc.e
         if Z then
  ? Z -- back in app.ex


I was talking about how to resolve Z.  I'd say that an unqualified Z in 
this instance should get you main:Z.  To get libA:Z, include libA.e with a
namespace identifier:

app.ex
  include main.e
    global Z
    include libAmisc.e
      if Z then
    include libA.e
      global Z
      include libAmisc.e
         if Z then
  include libA.e as A
  ? A:Z -- back in app.ex


> > I think some of my concern with the masking is that it's a new behavior 
> > for euphoria, and isn't always obvious.  I think that if any globals from
> > other files mask other globals from other files due to a parent/grandparent
> > relationship, there should be at minimum a warning generated.  I think
> > it would be easy to get confused as to which symbol is resolved, and
> > going on our merry way to destruction.
> 
> The obvious thing is to develop a set of test files. You can find the ones I
> used at http://www.palacebuilders.pwp.blueyonder.co.uk/t05.zip which you may
> or may not have already seen and may or may not find useful. The .exw files
> are the actual tests (14 of), some of which are mind numbingly trivial.
> 
> Probably also a good idea to develop a set of "fail" test files which should
> generate the desired errors and warnings.

I agree.  I checked out your test files.  Once I got rid of the 'positivisms'
in them, they all seemed to pass (on the version of euphoria built from
the svn head), except for incV.exw, and I'm not even going to try to 
figure out what you were doing there.  I got lots of illegal escape chars, 
and I suspect that it relies on the directory structure of your machine.

> > > > I'm also starting to think that any symbol resolution from beyond the 
> > > > include tree should generate a warning.
> > >
> > > Is this the same worry I had with "win32lib:length" above?
> > 
> > I don't think so.  Let's say you're developing myapp.exw, and it includes
> > win32lib.  It also includes myapp.ew, which does not include win32lib,
> > but calls win32lib routines.  As long as win32lib was included before
> > myapp.ew, then this will work, but since you're depending on symbols that
> > you've never explicitly included into myapp.ew, I  believe that a warning
> > should be generated.
> 
> I'm not convinced. If you can make myapp:closeWindow() work in this case
> without
> introducing other problems then fair do's, but I still think it should just
> be an error (which goes away if you put "include win32lib.ew" inside
> myapp.ew).

I'm not sure what myapp:closeWindow() refers to.  The behavior described
above will work with current euphoria:

myapp.exw
  include win32lib.ew
  include myapp.ew
    closeWindow(...)

But let's suppose that someone has made a custom structure library, and
it has a routine called allocate().

myapp.exw
  include win32lib.ew  -- somewhere, this includes machine.e 
  include struct.e
    global function allocate(...)
  include myapp.ew
     ptr = allocate(...)

This worked fine until we put struct.e in there.  If we were to warn (before
we add struct.e to the mix) about using allocate() in myapp.ew, when it
doesn't include (directly or indirectly) machine.e, then we can at least
sleep at night knowing that we flagged it, and that the developer should have
fixed it before it became a problem.

This isn't the best example, because all of the stuff happens in the
app dev's code, and it isn't best practice, but I don't think we should
necessarily assume (as developers of the language) that the code written
in the language will follow best practices.  I'm just saying that since we 
*can* detect something that can easily lead to problems, we should give a 
warning about it.
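A rough Python sketch of such a check, under stated assumptions: the `includes`, `globals_table` and `uses` tables are hypothetical stand-ins for what the parser would know, and `outside_tree_uses` is an invented name, not an existing interpreter routine.

```python
def include_tree(root, includes):
    """Return root plus every file it includes, directly or indirectly."""
    seen, stack = set(), [root]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(includes.get(current, []))
    return seen


def outside_tree_uses(file, uses, includes, globals_table):
    """List the globals that file uses but that no file in its include tree defines."""
    tree = include_tree(file, includes)
    return sorted(
        name
        for name in uses.get(file, [])
        if not any(owner in tree for owner in globals_table.get(name, []))
    )


includes = {
    "myapp.exw": ["win32lib.ew", "myapp.ew"],
    "win32lib.ew": ["machine.e"],
    "machine.e": [],
    "myapp.ew": [],
}
globals_table = {"allocate": ["machine.e"], "closeWindow": ["win32lib.ew"]}
uses = {"myapp.ew": ["allocate", "closeWindow"]}

# Each name reported here would trigger the proposed warning.
print(outside_tree_uses("myapp.ew", uses, includes, globals_table))
```

If myapp.ew gains its own `include win32lib.ew`, the list becomes empty, which is the state the warning is meant to push the developer towards before a file like struct.e introduces a second allocate().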

I bet that I'll find a lot of this in various libraries of mine.

Matt


8. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by CChris <christian.cuvier at agr?cul?ure.gouv.fr> Oct 10, 2007
861 views

Matt Lewis wrote:
> 
> CChris wrote:
> > 
> > Matt Lewis wrote:
> > > 
> > > No, it doesn't climb back up the tree.  I can see the logic of it, but I'm
> > > not sure I agree with this.  Is it really that common?  I know that I've 
> > > occasionally been guilty of this.  
> > > 
> > 
> > How do you have reliable two way communication between a server and a client
> > then? So it is common as soon as your lib/app is not organised in a
> > pyramidal
> > way. By "reliable" I mean, not disrupted by a third party file included
> > alongside
> > a large library.
> 
> You've lost me here.
> 

Why include a file in a program?
Reason #1: so that the including file can rely on routines, and less often
variables, from the included file;
Reason #2, often forgotten: so that the included file can benefit from services
(data or routines) from the including file. By carefully choosing the placement
of the include statement, the including file can control which global identifiers
the included file can access.

Now, if the including file also includes a third party file, and the latter has
a symbol that clashes with one in the including file, then the _included_ file
will have its access to the clashing symbol in the including file disrupted.

I hope my point about client/server is clearer now.

> > >  From the perspective of file2.e, it's not obvious that the correct
> > >  resolution
> > > should be from file1.e. 
> > 
> > Huh?
> > A parent/ancestor is a more likely target than a cousin, don't you think?
> 
> I think you've missed my point.  I agree that the parent is more likely,
> but it's still ambiguous.  file2.e has a dependency that it doesn't make
> clear.

Must the dependency necessarily appear in file2?

>  The symbol could be from anywhere.  Now, if it added an include
> to its parent, then it would be much clearer.  A better solution might
> be to move those symbols to a separate file that both file1.e and file2.e 
> could include.  This would make things less ambiguous for both the 
> parser and the coder.

This is ok for variables in the including file, but not for routines, as they
frequently access local symbols. If the routines that both file1 and file2 would
include are the only users of the locals they use in file1, then these locals can
be moved to the new file, but this is a rather rare, favourable circumstance.
Your scheme would increase the use of global symbols because of the unnatural,
inconvenient splitting of files.

> 
> > > I think it might be better to give an error in
> > > this case.  The resolution would be to include the file explicitly.
> > 
> > Which means file2.e should include file1.e, even though file1.e includes
> > file2.e?
> > It makes sense, but will such a sea change - allowing circular includes - be
> > accepted? I would.
> 
> They are currently accepted.  There is no change, except that we still
> need a mechanism for allowing symbols in a direct include to mask those
> of an indirect include.
> 
> Matt

Currently, they are accepted but ignored, because the target of the second
include statement is already known to the parser.  The change, however, is that
note would be taken of the inclusion attempt, so that the included file can see
symbols from the including file.  Care is also needed so that only the symbols
already known in the including file at the time it includes the included file
are visible from it, not all global symbols.  Or did I get anything wrong?
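The "only what is known at the time of the include" condition can be sketched by treating a file as an ordered list of statements. This is a Python illustration under assumptions: the statement tuples, the `defined_later` symbol and the function name are all hypothetical.

```python
# file1.e modelled as an ordered list of (kind, name) statements.
FILE1 = [
    ("include", "third.e"),
    ("global", "n_also_in_third"),
    ("include", "file2.e"),
    ("global", "defined_later"),  # hypothetical symbol declared after the include
]


def visible_from_parent(parent_statements, child):
    """Return the parent's globals declared before it includes child, in order."""
    visible = []
    for kind, name in parent_statements:
        if kind == "include" and name == child:
            return visible
        if kind == "global":
            visible.append(name)
    raise ValueError(f"{child} is never included")


print(visible_from_parent(FILE1, "file2.e"))
print(visible_from_parent(FILE1, "third.e"))
```

In this model file2.e can see n_also_in_third from its parent but not defined_later, and third.e sees nothing, which matches the idea that the placement of the include statement controls what the included file can access.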

CChris


9. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at gmai?.com> Oct 10, 2007
860 views

CChris wrote:
> 
> Matt Lewis wrote:
> > 
> > CChris wrote:
> > > 
> > > How do you have reliable two way communication between a server and a
> > > client
> > > then? So it is common as soon as your lib/app is not organised in a
> > > pyramidal
> > > way. By "reliable" I mean, not disrupted by a third party file included
> > > alongside
> > > a large library.
> > 
> > You've lost me here.
> > 
> 
> Why include a file in a program?
> Reason #1: so that the including file can rely on routines, and less often
> variables,
> from the included file;
> Reason #2, often forgotten: so that the included file can benefit from
> services
> (data or routines) from the including file. By carefully choosing the
> placement
> of the include statement, the including file can control which global
> identifiers
> the included file can access.
> 
> Now, if the including file also includes a third party file, and the latter
> has a symbol that clashes with one in the including file, then the _included_
> file will have its access to the clashing symbol in the including file
> disrupted.
> 
> I hope my point about client/server is clearer now.

Yes.  I would repeat what I said in the response to Pete.  The 'client' 
file should have an include statement to its parent to make clear that it
uses symbols in that file.

> > > >  From the perspective of file2.e, it's not obvious that the correct
> > > >  resolution
> > > > should be from file1.e. 
> > > 
> > > Huh?
> > > A parent/ancestor is a more likely target than a cousin, don't you think?
> > 
> > I think you've missed my point.  I agree that the parent is more likely,
> > but it's still ambiguous.  file2.e has a dependency that it doesn't make
> > clear.
> 
> Is the dependency to appear necessarily in file2?

Not sure exactly what you're asking here, but I'll clarify my statement.  By
dependency, I mean that file2 is using a symbol from file1.  But the 
problem is that file2 never includes file1, so it's implicit, and can
lead to ambiguity down the road, depending on how it's used.  I'd rather
throw a warning or an error than make assumptions that can lead to 
subtle bugs.

> >  The symbol could be from anywhere.  Now, if it added an include
> > to its parent, then it would be much clearer.  A better solution might
> > be to move those symbols to a separate file that both file1.e and file2.e 
> > could include.  This would make things less ambiguous for both the 
> > parser and the coder.
> 
> This is ok for variables in the including file, but not for routines, as they
> frequently access local symbols. If the routines that both file1 and file2
> would
> include are the only users of the locals they use in file1, then these locals
> can be moved to the new file, but this is a rather rare, favourable
> circumstance.
> Your scheme would increase the use of global symbols because of the unnatural,
> inconvenient splitting of files.

I agree, which is why I used the 'might' caveat.  TIMTOWTDI.

> > 
> > > > I think it might be better to give an error in
> > > > this case.  The resolution would be to include the file explicitly.
> > > 
> > > Which means file2.e should include file1.e, even though file1.e includes
> > > file2.e?
> > > It makes sense, but will such a sea change - allowing circular includes -
> > > be
> > > accepted? I would.
> > 
> > They are currently accepted.  There is no change, except that we still
> > need a mechanism for allowing symbols in a direct include to mask those
> > of an indirect include.
> > 
> > Matt
> 
> Currently, they are accepted but ignored, as the target of the second include
> statement is already known to the parser.  The change, however, is that note
> would
> be taken of the inclusion attempt so that the included file can see symbols
> from the including file. And care is needed so that only the symbols known in
> the including file at the time it includes the included file are visible from
> it, not all global symbols. Or did I get anything wrong?

They are ignored, only inasmuch as they don't generate new code.  In the
current code (note: not the released version, but the svn head) it will
have the effect of improving symbol resolution.  It shouldn't break any
existing code, but should assist in resolving namespace conflicts (though
with the deficiencies previously mentioned in this thread).

Matt

10. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Pete Lomax <petelomax at blu?yonder.co.uk> Oct 11, 2007
855 views

Matt Lewis wrote:
> 
> What if main.e had also included libAmisc.e?
> 
> }}}
<eucode>
>  main.e
>    global Z
>    include libAmisc.e
>      if Z then
>    include libA.e
>      global Z
>      include libAmisc.e
>         if Z then
> </eucode>
{{{

> Now how would the interpreter handle this?
Well, yes, that would break it for sure, ... but ... that code has been broken
since the year dot (without any warning) so I struggle to accept that example as
relevant.
>  It has to be an error
<snip>
> I bet that I'll find a lot of this in various libraries of mine.
I am quite worried that absolute torrents of the warning you propose will occur
in legacy code. On a lesser note, it seems to me that while you could give said
warning in the libA standalone case, it would be nigh on impossible for the code
snippet above to actually produce a compile error?

Your post gave me quite a pause for thought, I may come back to some of the
other points raised later.

Regards,
Pete

11. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at ?m?il.com> Oct 11, 2007
863 views

Pete Lomax wrote:
> 
> Matt Lewis wrote:
> > 
> > What if main.e had also included libAmisc.e?
> > 
> > }}}
<eucode>
> >  main.e
> >    global Z
> >    include libAmisc.e
> >      if Z then
> >    include libA.e
> >      global Z
> >      include libAmisc.e
> >         if Z then
> > </eucode>
{{{

> > Now how would the interpreter handle this?
> >
> Well, yes, that would break it for sure, ... but ... that code has been broken
> since the year dot (without any warning) so I struggle to accept that example
> as relevant.

Yes, the point is that we're trying to fix things so that we can reasonably
use third party code without having to modify any of it to make it play
nicely together.  This particular exercise is basically taking things to
their logical conclusions, and to try to get it right the first time,
rather than to have to revisit this later.

I extended CChris' question to cover a more worrisome example, basically
in an effort to explain why I disagreed with him.

> >  It has to be an error
> <snip>
> > I bet that I'll find a lot of this in various libraries of mine.
>
> I am quite worried that absolute torrents of the warning you propose will
> occur
> in legacy code. On a lesser note, it seems to me that while you could give
> said
> warning in the libA standalone case, it would be nigh on impossible for the
> code snippet above to actually produce a compile error?

Why are you worried?  Adding a warning doesn't create the problem, it just
alerts us to a problem that already exists, and gives us a chance to
fix it before it starts causing problems.  While there are some warnings 
that can be ignored (like short circuits--assuming you understand the 
implications), this is one that should really be corrected.

You're right about the compile error.  The problem here is that it
would use the *wrong* Z, so not a compile error, but a bug that could be
fixed based on the warning.

Here's how you might get a compile error:

--main.ex
global constant Z = "main"
include libA.e

--libA.e
global constant Z = "libA"
include libAmisc.e

--libAmisc.e
printf(1, "libAmisc.e: Z = '%s'\n", {Z})

Now, if you include libAmisc.e in main.ex before libA.e, you'll get 
the wrong Z.  Or, if libA.e is included before Z is declared in main.ex,
you'll get the correct Z, but only as long as there's not a Z declared
earlier (including, of course, in files like libB.e).

Now, as a thought experiment, replace Z with something like TRUE.

Matt

12. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at gmail?c?m> Oct 11, 2007
875 views

Pete Lomax wrote:
>
> I am quite worried that absolute torrents of the warning you propose will
> occur
> in legacy code. 

Heh.  I added the warning and tested against eu/ec/backend.ex.  It was
an absolute torrent of warnings.  And it requires some code reorganization.

Basically, the top-level .ex file includes files based on what it's doing.
And there are some functions and constants that are defined differently
based upon this.  But, of course, the users of these identifiers just
pick up whatever is there, since there's no such thing as a conditional
include.

So I created two additional files, common.e and mode.e.  I made common.e
because backend.ex doesn't use global.e.  And I moved several things to
global.e to allow consistent include statements (in order to get rid of 
the warnings).

mode.e is used to facilitate the modularity, so that instead of calling 
things like InitBackEnd() directly, code ends up calling the version in 
mode.e, and the files where the real InitBackEnd() is declared
use set_init_backend() to pass a routine_id.
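
As a minimal sketch of that indirection (not the committed code), the block
below shows three separate files in Euphoria 3.1 syntax.  Only InitBackEnd()
and set_init_backend() are names taken from the description above; the integer
parameter, the file names realbackend.e and main.ex, and the routine
real_InitBackEnd are hypothetical, and the actual mode.e in the branch may
well differ.

```euphoria
-- mode.e (sketch, not the file committed to the branch)
integer init_backend_rid
init_backend_rid = -1   -- nothing registered yet

-- called by whichever file holds the real implementation
global procedure set_init_backend(integer rid)
    init_backend_rid = rid
end procedure

-- callers include mode.e and call this; it forwards to the registered routine
global procedure InitBackEnd(integer x)
    call_proc(init_backend_rid, {x})
end procedure

-- realbackend.e (hypothetical file name)
include mode.e

procedure real_InitBackEnd(integer x)
    printf(1, "real back end initialised with %d\n", {x})
end procedure
set_init_backend(routine_id("real_InitBackEnd"))

-- main.ex (hypothetical top-level file)
include mode.e
include realbackend.e
InitBackEnd(0)
```

The point of the design is that every caller names mode.e explicitly, so the
call no longer depends on which implementation file happened to be included
first.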

I've committed the changes to a branch in the repository:
http://rapideuphoria.svn.sourceforge.net/viewvc/rapideuphoria/branches/mwl/source/

You might say, "Who cares, this isn't a library!"  And I'd say, "Not yet."
One long term goal of mine is to be able to embed the interpreter into
your applications (I've done this with ooeu), at which point it will be
important.  By making these changes, we can be assured that we won't get
any namespace resolution problems originating from within the interpreter
code, which, IMHO, is a duty of a library developer.

Matt

13. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by CChris <christian.cuvier at agricultur?.gouv.?r> Oct 11, 2007
862 views

Matt Lewis wrote:
> 
> Pete Lomax wrote:
> > 
> > Matt Lewis wrote:
> > > 
> > > What if main.e had also included libAmisc.e?
> > > 
> > > }}}
<eucode>
> > >  main.e
> > >    global Z
> > >    include libAmisc.e
> > >      if Z then
> > >    include libA.e
> > >      global Z
> > >      include libAmisc.e
> > >         if Z then
> > > </eucode>
{{{

> > > Now how would the interpreter handle this?
> > >
> > Well, yes, that would break it for sure, ... but ... that code has been
> > broken
> > since the year dot (without any warning) so I struggle to accept that
> > example
> > as relevant.
> 
> Yes, the point is that we're trying to fix things so that we can reasonably
> use third party code without having to modify any of it to make it play
> nicely together.  This particular exercise is basically taking things to
> their logical conclusions, and to try to get it right the first time,
> rather than to have to revisit this later.
> 
> I extended CChris' question to cover a more worrisome example, basically
> in an effort to explain why I disagreed with him.
> 
> > >  It has to be an error
> > <snip>
> > > I bet that I'll find a lot of this in various libraries of mine.
> >
> > I am quite worried that absolute torrents of the warning you propose will
> > occur
> > in legacy code. On a lesser note, it seems to me that while you could give
> > said
> > warning in the libA standalone case, it would be nigh on impossible for the
> > code snippet above to actually produce a compile error?
> 
> Why are you worried?  Adding a warning doesn't create the problem, it just
> alerts us to a problem that already exists, and gives us a chance to
> fix it before it starts causing problems.  While there are some warnings 
> that can be ignored (like short circuits--assuming you understand the 
> implications), this is one that should really be corrected.
> 
> You're right about the compile error.  The problem here is that it
> would use the *wrong* Z, so not a compile error, but a bug that could be
> fixed based on the warning.
> 
> Here's how you might get a compile error:
> }}}
<eucode>
> --main.ex
> global constant Z = "main"
> include libA.e
> 
> --libA.e
> global constant Z = "libA"
> include libAmisc.e
> 
> --libAmisc.e
> printf(1, "libAmisc.e: Z = '%s'\n", {Z})
> </eucode>
{{{

> Now, if you include libAmisc.e in main.ex before libA.e, you'll get 
> the wrong Z.  Or, if libA.e is included before Z is declared in main.ex,
> you'll get the correct Z, but only as long as there's not a Z declared
> earlier (including, of course, in files like libB.e).
> 
> Now, as a thought experiment, replace Z with something like TRUE.
> 
> Matt

I have thought a fair deal about this, and what I have been reading makes my
earlier point even clearer: tinkering with namespaces cannot solve the problem of
having several multifile libraries play together without modifying any of them.
You need a _different_ concept, which is one of a _set_ of files being
collectively grouped together. Namespaces apply to only one file, with or without
inheritance.

In order for applications to use any combinations of libraries and to be immune
to any change in their inner working, subdirectory layout and whatnot, you'll
have to make explicit the distinction between an interface symbol - which is
meant to be available everywhere, except if explicitly shadowed - and a multifile
symbol, which is meant only to be seen from a specific set of files. For
historical reasons, the two are currently defined as "global": this makes the
proper handling of current code more complex.

I still consider the kind of proposal I had put forward before as more relevant
than a simple change in namespace semantics, which does not solve the problems,
or replaces them with others, as the recent posts show.

See http://oedoc.free.fr/Packages.htm for a complete description. I have a
working copy of eu.ex using this scheme. One could remove the "with
previous_package" directive, at the expense of a little more file splitting, as
symbols of different packaging status would then have to go to different files.

I'll emphasize again that the dual export list mechanism described there is
meant as a _transitional_ feature, whose purpose is to allow immediate wrapping
of existing code so that it works without clashes now, without having to wait for
it to be rewritten. This is why the scheme is far less complicated than it might
seem to some.

CChris

14. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at ?mai?.com> Oct 11, 2007
873 views

CChris wrote:
> 
> I have thought a fair deal about this, and what I have been reading makes my
> earlier point even clearer: tinkering with namespaces cannot solve the problem
> of having several multifile libraries play together without modifying any of
> them. 
> You need a _different_ concept, which is one of a _set_ of files being
> collectively
> grouped together. Namespaces apply to only one file, with or without
> inheritance.

It's still not clear to me why we need a different concept, rather than a 
more comprehensive implementation.  Also, I think the concept of a namespace
being able to reach into the includes of a particular file makes a lot of
sense.  For instance, you can use a single namespace for any function that
is part of win32lib.  Then, the user doesn't have to be aware of the 
implementation details--just that he uses stuff from win32lib.

> In order for applications to use any combinations of libraries and to be
> immune
> to any change in their inner working, subdirectory layout and whatnot, you'll
> have to make explicit the distinction between an interface symbol - which is
> meant to be available everywhere, except if explicitly shadowed - and a
> multifile
> symbol, which is meant only to be seen from a specific set of files. For
> historical
> reasons, the two are currently defined as "global": this makes the proper
> handling
> of current code more complex.

Please provide some evidence (or at least an example) of why we need an
explicit distinction between the interface symbol and a multifile symbol.
The only advantage I see is encapsulation.  That's a good thing to have,
but it doesn't really solve any of the symbol resolution issues.  Sure,
you may have reduced the scope of the problem, since there are simply 
fewer symbols floating around and able to conflict, but it only takes
one to generate an error or a bug.

> I still consider the kind of proposal I had put forward before as more
> relevant
> than a simple change in namespace semantics, which is not solving the
> problems,
> or replaces them with others, as the recent posts show.

Well, the "new problems" are just an artifact of thinking through the 
consequences.  And I suspect that the implementation is much less complex
than what you've proposed.  Can you please present an example of how your
solution solves something that mine does not?

> See http://oedoc.free.fr/Packages.htm for
> a complete description. I have a working copy of eu.ex using this scheme. One
> could remove
> the "with previous_package" directive, at the expense of a little more file
> splitting,
> as symbols of different packaging status should have to go to different files
> then.
> 
> I'll emphasize again that the dual export list mechanism described there is
> meant as a _transitional_ feature, whose purpose is to allow immediate
> wrapping
> of existing code so that it works without clashes now, without having to wait
> for it being rewritten. This is why the scheme is far less complicated than
> it might seem to some.

It still seems overly complicated to me.  I doubt I'll be convinced until
you can show me some advantages through examples.  Or maybe we're really
talking about different issues.  In which case I still find your proposal
overcomplicated.

Matt

15. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Pete Lomax <petelomax at blueyonder.co?uk> Oct 12, 2007
881 views

Matt Lewis wrote:
> Heh.  I added the warning and tested against eu/ec/backend.ex.  It was
> an absolute torrent of warnings.

Are you giving field or file level warnings, eg
Warning: scanner.e uses global(s) from global.e but does not include it
  {symtab_index, boolean, gline_number, line_number, OpTrace, ...}
vs
scanner.e:27 Warning: symtab_index from global.e assumed
scanner.e:29 Warning: boolean from global.e assumed
scanner.e:61 Warning: gline_number from global.e assumed
scanner.e:62 Warning: line_number from global.e assumed

I woke up much less concerned with this, after all a torrent of warnings is a
small price to pay for safer code, though if it can be made a file-level trickle
then even better.

Pete

16. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Pete Lomax <petelomax at blueyonde?.?o.uk> Oct 12, 2007
896 views

Matt Lewis wrote:
> 
> I wonder if we're talking past each other?  I'll adopt your example from
> above:
> 
> }}}
<eucode>
> app.ex
>   include main.e main.e
>     global Z
>     include libAmisc.e
>       if Z then
>     include libA.e
>       global Z
>       include libAmisc.e
>          if Z then
>   ? Z -- back in app.ex
> </eucode>
{{{

> 
> I was talking about how to resolve Z.  I'd say that an unqualified Z in 
> this instance should get you main:Z.

No, I'm really not getting this. Blindly assuming it is always the first Z just
seems completely wrong to me. If I was struggling with some function in libA, and
tried setting or printing Z, I'd be miffed enough when I finally found out about
the Z in main.e. What if (an extra level of nesting and) I then did an "include
libA as A" to force the issue, and it crapped on some previously working use of
unqualified Z later on because of this rule? Just force the namespace qualifier.
It is not as if this case happens very often.

Regards,
Pete

> except for incV.exw
Oh, there was a fix in 3.0.0 for that, quite safe to ignore it now.

17. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at g?ail.co?> Oct 12, 2007
911 views

Pete Lomax wrote:
> 
> Matt Lewis wrote:
> > Heh.  I added the warning and tested against eu/ec/backend.ex.  It was
> > an absolute torrent of warnings.
> 
> Are you giving field or file level warnings, eg
> Warning: scanner.e uses global(s) from global.e but does not include it
>   {symtab_index, boolean, gline_number, line_number, OpTrace, ...}
> vs
> scanner.e:27 Warning: symtab_index from global.e assumed
> scanner.e:29 Warning: boolean from global.e assumed
> scanner.e:61 Warning: gline_number from global.e assumed
> scanner.e:62 Warning: line_number from global.e assumed

I was giving a per instance warning, though a bit less specific than your
example (mainly because it was easy, and the first thing I thought of).
Only giving one warning per symbol per file (i.e., no repeat warnings
for symtab_index as above) will complicate things, as it doesn't really
follow the warning infrastructure that's in place (at least, I don't think
it does).
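
For illustration only, here is one way per-instance warnings could be
collapsed into a single file-level message.  This is a standalone sketch in
Euphoria 3.1 syntax and says nothing about how the interpreter's warning
infrastructure actually works; every name in it (pairs, symbols,
note_implicit_use, show_warnings) is hypothetical.

```euphoria
-- Sketch only: group implicit-global warnings by (using file, defining file).
sequence pairs    -- {{user, definer}, ...}
sequence symbols  -- symbols[i] lists the names recorded for pairs[i]
pairs = {}
symbols = {}

-- record one use of a global that comes from a file not included by the user
procedure note_implicit_use(sequence user, sequence definer, sequence name)
    integer ix
    ix = find({user, definer}, pairs)
    if ix = 0 then
        pairs = append(pairs, {user, definer})
        symbols = append(symbols, {})
        ix = length(pairs)
    end if
    if find(name, symbols[ix]) = 0 then   -- skip repeats of the same symbol
        symbols[ix] = append(symbols[ix], name)
    end if
end procedure

-- emit one warning per pair of files, followed by the symbols involved
procedure show_warnings()
    for i = 1 to length(pairs) do
        printf(1, "Warning: %s uses global(s) from %s but does not include it\n",
               pairs[i])
        for j = 1 to length(symbols[i]) do
            printf(1, "  %s\n", {symbols[i][j]})
        end for
    end for
end procedure

note_implicit_use("scanner.e", "global.e", "symtab_index")
note_implicit_use("scanner.e", "global.e", "boolean")
note_implicit_use("scanner.e", "global.e", "symtab_index")  -- repeat, ignored
show_warnings()
```

The cost is that warnings have to be held until parsing is finished, which is
the kind of departure from per-instance reporting mentioned above.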

> I woke up much less concerned with this, after all a torrent of warnings is
> a small price to pay for safer code, though if it can be made a file-level
> trickle
> then even better.

Agreed.

Matt

18. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at g?ail?com> Oct 12, 2007
871 views

Pete Lomax wrote:
> 
> Matt Lewis wrote:
> > 
> > I wonder if we're talking past each other?  I'll adopt your example from
> > above:
> > 
> > }}}
<eucode>
> > app.ex
> >   include main.e main.e
> >     global Z
> >     include libAmisc.e
> >       if Z then
> >     include libA.e
> >       global Z
> >       include libAmisc.e
> >          if Z then
> >   ? Z -- back in app.ex
> > </eucode>
{{{

> > 
> > I was talking about how to resolve Z.  I'd say that an unqualified Z in 
> > this instance should get you main:Z.
> 
> No, I'm really not getting this. Blindly assuming it is always the first Z
> just
> seems completely wrong to me. If I was struggling with some function in libA,
> and tried setting or printing Z, I'd be miffed enough when I finally found out
> about the Z in main.e. What if (an extra level of nesting and) I then did an
> "include libA as A" to force the issue, and it crapped on some previously
> working
> use of unqualified Z later on because of this rule? Just force the namespace
> qualifier. It is not as if this case happens very often.

Well, in the example I gave, I think the reference to Z in libAmisc.e
can *only* refer to main.e.  Consider that at the time it is parsed 
(when main.e includes it) there is only one Z in existence.  The interpreter
cannot possibly know about libA:Z, and has only one reason to doubt that
it meant main:Z, which is that libAmisc.e does not include any file
that has a Z declared.

This is perfectly legal behavior, even if it is dangerous for exactly
the reasons above.  The proper thing for the author of libA to do would
have been to put an "include libA.e" at the top of libAmisc.e, since 
it depends on having that file around.  That way, if someone decides to
include libAmisc.e, it will still behave as it is supposed to.  If
libA.e is included first (or only) then the include statement in libAmisc.e
only serves to inform the interpreter about where it expects to get
symbols.
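
A sketch of that suggestion, reusing the libA.e and libAmisc.e files from the
earlier Z example; the only new line is the include at the top of libAmisc.e.
It assumes the behaviour of the svn head described in this thread.  In the
released version the second include is accepted but has no effect on
resolution.

```euphoria
-- libA.e
global constant Z = "libA"
include libAmisc.e

-- libAmisc.e
include libA.e  -- generates no new code when libA.e is already loaded;
                -- it only records that symbols such as Z are expected
                -- to come from libA.e
printf(1, "libAmisc.e: Z = '%s'\n", {Z})
```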

Before I get into the issue of masking (which I believe is subtly 
different from this example) I think it's important to reiterate the
purpose of this family of enhancements:

* Make it possible and easier to use multiple third party libraries.
* Avoid certain namespace conflicts where there is a reasonable or
  obvious way to resolve the conflict.
* Reduce or eliminate the need to have to edit third party libraries to
  resolve namespace conflicts.

Obviously, it will be possible for someone to write a library such that 
it may conflict with other code (possibly another poorly written library).
There's only so much that we can do inside the interpreter.

Now, on to the masking issue:

-- app.ex
include lib.e as lib
  global Z
  include sublib.e
    global Z
include sublib.e as sublib
? Z         -- #1 compile error
? lib:Z     -- #2 lib:Z
? sublib:Z  -- #3 sublib:Z

I'm not sure why someone would write code like this, but I'm sure that it's
either been done already or will be done.  Under 3.1, we could access
either one by using a namespace in app.ex.  I've proposed extending the
namespace for lib.e to include sublib.e.  I think this is useful, because
it avoids requiring library users to understand the structure of, say,
win32lib, and simply use one namespace for all of win32lib.

However, in this case, we have to decide what to do when we encounter
something like lib.e and sublib.e.  We can use a namespace identifier to
access sublib:Z, but if we try to use lib:Z, there are now two 
possibilities.  After reading your priorities, and (IIRC) one of CChris' 
arguments, I believe that it's reasonable to assume that using
lib:Z from app.ex means that, in fact, you want lib:Z and not sublib:Z.

But what if there is no namespace identifier?  In this case, it's probably
correct to throw a compile error.  

Back to using a namespace.  I'm also thinking that we should only allow
the "top-level" symbol to mask other symbols.  IOW, I wouldn't use a
full priority table like you did, but only divide the symbols into two
groups, the symbols in the explicitly namespaced file, and the symbols
in all of its includes:

--app.ex
include lib.e as lib
  include sub1.e
    global Z
    include sub2.e
      global Z
include sub1.e as sub1
include sub2.e as sub2
? Z      -- #1 compile error
? lib:Z  -- #2 compile error
? sub1:Z -- #3 sub1:Z
? sub2:Z -- #4 sub2:Z

Consider that if there were no mask, it would be impossible to reference
sub1:Z.

I believe that this behavior follows these rules:

  1: Unambiguity and Least Surprise
  2: Zero impact on legacy code
  3: Minimal performance impact

In fact, now that I think about it, the masking behavior described is 
required to achieve #2.

Matt

19. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Pete Lomax <petelomax at blueyonder?co.uk> Oct 12, 2007
895 views

Matt Lewis wrote:
> 
> Pete Lomax wrote:
> > 
> > Matt Lewis wrote:
> > > 
> > > I wonder if we're talking past each other?
It seems we are.
> > > 
> > > }}}
<eucode>
> > > app.ex
> > >   include main.e main.e
> > >     global Z
> > >     include libAmisc.e
> > >       if Z then
> > >     include libA.e
> > >       global Z
> > >       include libAmisc.e
> > >          if Z then
> > >   ? Z -- back in app.ex
> > > </eucode>
{{{

> > > 
> > > I was talking about how to resolve Z.  I'd say that an unqualified Z in 
> > > this instance should get you main:Z.
My reply assumed you meant the "? Z -- back in app.ex".

Pete

20. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Pete Lomax <petelomax at blueyonder?c?.uk> Oct 12, 2007
869 views

Matt Lewis wrote:

<snip> 
Apart from discussing different Z, I agree with most everything just said.

> Now, on to the masking issue:
> }}}
<eucode>
> -- app.ex
> include lib.e as lib
>   global Z
>   include sublib.e
>     global Z
> include sublib.e as sublib
> ? Z         -- #1 compile error
> ? lib:Z     -- #2 lib:Z
> ? sublib:Z  -- #3 sublib:Z
> </eucode>
{{{


> But what if there is no namespace identifier?  In this case, it's probably
> correct to throw a compile error.  
And that was what I was saying 

> Back to using a namespace.  I'm also thinking that we should only allow
> the "top-level" symbol to mask other symbols.  IOW, I wouldn't use a
> full priority table like you did, but only divide the symbols into two
> groups, the symbols in the explicitly namespaced file, and the symbols
> in all of its includes:

Agreed, that is actually exactly what I do:
> }}}
<eucode>
> --app.ex
> include lib.e as lib
>   include sub1.e
>     global Z
>     include sub2.e
>       global Z
> include sub1.e as sub1
> include sub2.e as sub2
> ? Z      -- #1 compile error
> ? lib:Z  -- #2 compile error
> ? sub1:Z -- #3 sub1:Z
> ? sub2:Z -- #4 sub2:Z
> </eucode>
{{{


My priority table for this would look like this:
   file    parents  priorities
  app.ex    {}       {2,1,1,1}
  lib.e     {1}      {2,4,3,3}
  sub1.e    {2}      {2,4,6,5}
  sub2.e    {3}      {2,4,6,8}

Of course one Z has file number 3 against it and the other has file no 4.
#1: in app.ex, no namespace, so using priorities[1]. 2 Z at pri 1 -> error
#2: namespace "lib" hence using priorities[2], 2 Z at pri 3 -> error
#3: namespace "sub1" hence using priorities[3], 6>5 so use the first one
#4: namespace "sub2" hence using priorities[4], 1 Z at pri >=7 so use it.

The bit you missed is that the {2,1,1,1} and {4,3,3} act like the "only two
groups" you propose, pretty much in all cases I think.
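
One simplified reading of how the table is consulted, as a sketch: pick the
row for the referencing file (or for the namespace used), look up the priority
of each file that defines the symbol, and accept the highest one only if it is
unique.  The function name resolve and the tie rule are assumptions made for
this illustration; the real implementation may apply further thresholds, such
as the "pri >=7" test in #4.

```euphoria
-- priorities[row][file]: rows and files are numbered
-- 1 = app.ex, 2 = lib.e, 3 = sub1.e, 4 = sub2.e (values from the table above)
constant priorities = {
    {2,1,1,1},
    {2,4,3,3},
    {2,4,6,5},
    {2,4,6,8}}

-- row:  number of the referencing file, or of the file named by the namespace
-- defs: numbers of the files that define the symbol
-- returns the chosen file number, or 0 when the reference is ambiguous
function resolve(integer row, sequence defs)
    integer best, winner, p
    best = 0
    winner = 0
    for i = 1 to length(defs) do
        p = priorities[row][defs[i]]
        if p > best then
            best = p
            winner = defs[i]
        elsif p = best then
            winner = 0   -- tie at the highest priority seen so far
        end if
    end for
    return winner
end function

? resolve(1, {3,4})  -- #1: unqualified Z in app.ex
? resolve(2, {3,4})  -- #2: lib:Z
? resolve(3, {3,4})  -- #3: sub1:Z
? resolve(4, {3,4})  -- #4: sub2:Z
```

With Z defined in files 3 and 4, this prints 0, 0, 3 and 4, where 0 stands for
the compile error in cases #1 and #2.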

> Consider that if there were no mask, it would be impossible to reference
> sub1:Z.
Yup.
> 
> I believe that this behavior follows these rules:
> 
>   1: Unambiguity and Least Surprise
>   2: Zero impact on legacy code
>   3: Minimal performance impact
> 
Seen that somewhere 

> In fact, now that I think about it, the masking behavior described is 
> required to achieve #2.
Yup, letting a namespace see more stuff means you need to give priority to the
stuff that someone carefully chose the right namespace for.

Regards,
Pete

21. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at gma?l?com> Oct 12, 2007
876 views

Pete Lomax wrote:
> 
> Matt Lewis wrote:
> > 
> > Pete Lomax wrote:
> > > 
> > > Matt Lewis wrote:
> > > > 
> > > > I wonder if we're talking past each other?
> It seems we are.
> > > > 
> > > > }}}
<eucode>
> > > > app.ex
> > > >   include main.e main.e
> > > >     global Z
> > > >     include libAmisc.e
> > > >       if Z then
> > > >     include libA.e
> > > >       global Z
> > > >       include libAmisc.e
> > > >          if Z then
> > > >   ? Z -- back in app.ex
> > > > </eucode>
{{{

> > > > 
> > > > I was talking about how to resolve Z.  I'd say that an unqualified Z in 
> > > > this instance should get you main:Z.
>
> My reply assumed you meant the "? Z -- back in app.ex".

Sorry, that was a really bad bit of communication on my part--I *think*
I was mixing stuff around.  The "? Z -- back in app.ex" should be an error.
The "if Z then" in libAmisc.e was what would resolve to main.e.

Matt

22. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at gmail.?o?> Oct 12, 2007
889 views

Pete Lomax wrote:
> 
> The obvious thing is to develop a set of test files. You can find the ones I
> used at http://www.palacebuilders.pwp.blueyonder.co.uk/t05.zip
> which you may or may not have already seen and may or may not find useful. The
> .exw files are
> the actual tests (14 of), some of which are mind numbingly trivial. 
> 
> Probably also a good idea to develop a set of "fail" test files which should
> generate the desired errors and warnings.

I don't have any "fail" tests, but I've developed some to demonstrate my
approach.  I've put them, and a brief description up on the sf.net 
wiki:

http://rapideuphoria.wiki.sourceforge.net/Namespace_Resolution

Again, in order for this to work, you'll need the mwl branch (actually,
the trunk will succeed for everything except generating the warnings).

Matt

23. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by CChris <christian.cuvier at agriculture?g?uv.fr> Oct 14, 2007
855 views

Matt Lewis wrote:
> 
> CChris wrote:
> > 

Sorry for the late reply, I have my hands full these days - moving within a few
weeks. All development/debugging activities are expected to be somewhat delayed
or disrupted during that period. Back to normal around end of November, hopefully
sooner.

> > I have thought a fair deal about this, and what I have been reading makes my
> > earlier point even clearer: tinkering with namespaces cannot solve the
> > problem
> > of having several multifile libraries play together without modifying any of
> > them. 
> > You need a _different_ concept, which is one of a _set_ of files being
> > collectively
> > grouped together. Namespaces apply to only one file, with or without
> > inheritance.
> 
> It's still not clear to me why we need a different concept, rather than a 
> more comprehensive implementation. 

Because sometimes you need to specify whether you need to access a particular
file or all the include subtree it is the root of. You'd expect applications to
use the second sort of access, but libraries may need the first kind, if only to
access shadowed symbols when extending them. Overloading namespacing is not
better than the current overloading of "global" - not worse admittedly.

> Also, I think the concept of a namespace
> being able to reach into the includes of a particular file makes a lot of
> sense.  For instance, you can use a single namespace for any function that
> is part of win32lib.  Then, the user doesn't have to be aware of the 
> implementation details--just that he uses stuff from win32lib.
>  

This is exactly what packages are for. As underlined above, using a namespace
for both file level and package level accesses isn't as safe as using two
different syntaxes.

> > In order for applications to use any combinations of libraries and to be
> > immune
> > to any change in their inner working, subdirectory layout and whatnot,
> > you'll
> > have to make explicit the distinction between an interface symbol - which is
> > meant to be available everywhere, except if explicitly shadowed - and a
> > multifile
> > symbol, which is meant only to be seen from a specific set of files. For
> > historical
> > reasons, the two are currently defined as "global": this makes the proper
> > handling
> > of current code more complex.
> 
> Please provide some evidence (or at least an example) of why we need an
> explicit distinction between the interface symbol and a multifile symbol.
> The only advantage I see is encapsulation.  That's a good thing to have,
> but it doesn't really solve any of the symbol resolution issues. 

I don't understand. If all libraries restrictively specify, for each non local
symbol they define, where it is to be seen from, then there is not a single
conflict left, except when two interface symbols collide. In this case (two
exported abs() functions), a simple addition or deletion from an export list at
the application level, or a simple scope restriction directive, will mask one of
the clashing symbols and solve the conflict.

> Sure,
> you may have reduced the scope of the problem, since there are simply 
> fewer symbols floating around and able to conflict, but it only takes
> one to generate an error or a bug.
> 

Other advantages - encapsulation is of course a powerful one already:
* When you read the code of a library, you'd know immediately whether a symbol
is for the lib's internal use or for the end user. Currently they both read
"global".
* Applications need not worry about undocumented symbols popping in to start a
clash when the version of a library they use changes. Multifile symbols can freely
appear and disappear, change names, change types - an application need not be
affected by this, and will not if the symbols are properly marked as scope
restricted.
* By externally changing the status (true global vs multifile) of a routine, you
can extend it any number of times. Currently, you can extend builtin routines
only once and library routines zero times. For variables, this is less useful -
but should there be global variables at all?

All the above, when using the proper constructs, can be done without changing a
single line of existing code; a wrapper include file can handle it all.

> > I still consider the kind of proposal I had put forward before as more
> > relevant
> > than a simple change in namespace semantics, which is not solving the
> > problems,
> > or replaces them with others, as the recent posts show.
> 
> Well, the "new problems" are just an artifact of thinking through the 
> consequences.  And I suspect that the implementation is much less complex
> than what you've proposed.  Can you please present an example of how your
> solution solves something that mine does not?

I'll have to check the documentation at your branch. I think the wiki should
present globally how each and every kind of symbol clash will be handled or not,
rather than only mentioning the differences.

> 
> > See http://oedoc.free.fr/Packages.htm for
> > a complete description. I have a working copy of eu.ex using this scheme.
> > One could remove
> > the "with previous_package" directive, at the expense of a little more file
> > splitting,
> > as symbols of different packaging status should have to go to different
> > files then.
> > 
> > I'll emphasize again that the dual export list mechanism described there is
> > meant as a _transitional_ feature, whose purpose is to allow immediate
> > wrapping
> > of existing code so that it works without clashes now, without having to
> > wait
> > for it being rewritten. This is why the scheme is far less complicated than
> > it might seem to some.
> 
> It still seems overly complicated to me.  I doubt I'll be convinced until
> you can show me some advantages through examples.  Or maybe we're really
> talking about different issues.  In which case I still find your proposal
> over complicated.
> 
> Matt

It looks like you are concerned with other types of clashes than I, but my
solution also handles "yours". Also, I may be more concerned about what happens
when a library changes: the issues are subtly different, since they concern code
you didn't write and ideally should not modify. I don't see that your solution
helps much there - again, I have to read a complete description.

CChris

24. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at g?a?l.com> Oct 14, 2007
875 views

CChris wrote:
> 
> It looks like you are concerned with other types of clashes than I, but my
> solution
> also handles "yours". Also, I may be more concerned about what happens when
> a library changes: the issues are subtly different, since they concern code
> you didn't write and ideally should not modify. I don't see that your solution
> helps much there - again, I have to read a complete description.

I agree that my solution doesn't really deal with the encapsulation issue.
That's not the point, and I don't believe that your solution really solves
the issues at hand.  The point is to fix the problem of using multiple
libraries and having them conflict with each other.  If two libraries 
export the same symbols, you still have the same problem.

What happens when the library maintainer makes some changes, and the 
packaging directives don't keep up?  

Also, your solution seems to me to be overly complex, and difficult to
maintain.  When we get to encapsulation, I think I'd prefer something
along the lines of an import vs include, where the globals "imported" 
don't go beyond the file that imported them.  This way, if you need to
keep stuff hidden, just import it.  If you want to expose things, either
put it in the main include, or use the standard include directive.

It's possible that I just don't understand your proposal sufficiently.
Please submit a modified set of files to demonstrate how you'd modify
v3.1 to handle these files.  The benefit of my proposal is that there is
absolutely no new syntax, and it does it in an intuitively straightforward
manner.

I'm not really interested (in this discussion) to hear about how your 
solution encapsulates symbols.  I'm interested in how we deal with
multiple files that expose duplicate symbols, whether the original 
author wants them to be exposed or just used internally.  Accept that
the symbols are exposed to the program.

It seems that your method puts a larger burden onto the programmer to
resolve conflicts.  My goal is to take what's already there (all 
those include statements) and to make full use of the information that
they impart.

Matt

25. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Pete Lomax <petelomax at blueyonder??o.uk> Oct 14, 2007
872 views
Last edited Oct 15, 2007

Matt Lewis wrote:
> 
> CChris wrote:
> > I don't see that your solution helps much there
>
> Also, your solution seems to me to be overly complex

While I find this topic fascinating, I think we can conclude that you two
disagree. It is time (imho) to focus on a (beta) release and move away from
theoretical bantering to swapping actual code samples that show the
advantages/shortfalls of the new features, before attempting to add any more,
that is. Email me if you want any help, proof reading, etc.

BTW, I've still had zero entries to my data hiding challenge, nor has anyone set
a better one: http://palacebuilders.pwp.blueyonder.co.uk/dhc.htm

Regards,
Pete
PS I'm still not keen on installing TortoiseSVN or Watcom on this box - I
already have enough things I waste time on when more important stuff waits..

26. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at g?ail.com> Oct 15, 2007
875 views

Pete Lomax wrote:
> 
> Matt Lewis wrote:
> > 
> > CChris wrote:
> > > I don't see that your solution helps much there
> >
> > Also, your solution seems to me to be overly complex
> 
> While I find this topic fascinating, I think we can conclude that you two
> disagree 
> It is time (imho) to focus on a (beta) release and move away from theoretical
> bantering
> to swapping actual code samples that show the advantages/shortfalls of the new
> features,
> before attempting to add any more, that is. Email me if you want any help,
> proof reading,
> etc.

In fact, my code is posted.  The trunk actually functions just as I've been
describing, except that it doesn't emit the warnings.  That's in my 
branch.  You don't actually need svn installed to get the code.  If you
go to the sf project page, you can "Browse SVN" and download the files
individually.  It's actually easier to just install svn, but you still 
have that option.

And I have test code posted on the sf project wiki.

> BTW, I've still had zero entries to my data hiding challenge, nor has 
> anyone set a better one:
> http://palacebuilders.pwp.blueyonder.co.uk/dhc.htm

I think I never really looked at it.  Looking at it now, I think I disagree
with some of the premises.  I briefly described to CChris what I thought
was a reasonably simple way to encapsulate.  The only change would be
to change from "include" to "import."  I'm aware that this keyword has
been discussed before, though I'm not certain if it had the same 
intention as I've described.

Importing would work just like including, except that files that included
you couldn't see the files that you imported:

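-- myapp.exw
include myapp.ew
  -- myapp.ew
  import win32lib.ew  -- proposed keyword: win32lib's globals stop at myapp.ew
main = create(...)  -- error! back in myapp.exw, create() is not visible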

I'm wondering about the justification for this requirement:
  "z2 is visible in f2 and f5 but nowhere else"

Other files include f2, but apparently only want to see certain symbols
from f2 (the file where z2 is declared).  While I think that your system
is interesting, I'm not convinced that it's solving a real problem.  If
a library writer wants to limit what his library exposes, then all he 
needs to do is to either declare those symbols as locals, or, if they must
be seen by other files in the library, then he just needs to import the
appropriate files within the library.  This way, the library user will
never know about all of those other symbols.

I guess we probably need to declare what problem we're solving.  My idea
is that we want to be able to use symbols across files, but to have 
a 'firewall' to limit their visibility to users of those files.  One 
reason that I like this scheme is that it requires very little be done
to get this effect, and it's done in a very straightforward way.

> Regards,
> Pete
> PS I'm still not keen on installing TortoiseSVN or Watcom on this box - I
> already
> have enough things I waste time on when more important stuff waits..

I would recommend skipping TortoiseSVN and just using the command line.

Matt

27. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Igor Kachan <kinz at p?ter?ink.ru> Oct 15, 2007
898 views

Pete Lomax wrote:

> BTW, I've still had zero entries to my data hiding challenge,
> nor has anyone set a better one: 
> http://palacebuilders.pwp.blueyonder.co.uk/dhc.htm

Hi Pete,

Me wrote:
--==========
Date: 2007 Aug 21 9:30
 From: Igor Kachan <kinz at ?eterlink.?u>
Subject: Re: Data Hiding challenge

Pete Lomax wrote:

> 
> A while ago, CChris made a suggestion, erm about hosting stuff on my site I
> think. Been busy. Anyway, here is a small starter:
> 
> http://palacebuilders.pwp.blueyonder.co.uk/dhc.htm
> 

Ok, thanks. Now I do see what you really want.

I think that the *private* and *local* symbols of Euphoria
are hidden well enough, so the problem only stands for
the *global* symbols.

There are two very powerful keywords in EU -
*with* and *without*.

I'd suggest the new parameters for these metacommands.

with namespace [file of the list or 1.0 or 2.3]

or

without symbol [list]  

So, the scheme is simple.

You can just list the globals you do not need,
to block them as hidden:

without symbol abs, void, PI, arcsin, VOID, ABS
include misc.e
include new_sdl.e

Or :

with namespace my_project.en
include win32lib.e
include wmotor.e
include wxeuphoria.e

and then make the file my_project.en

-- my_project.en
without symbol abs, void, PI, arcsin, VOID, ABS,
               misc2.e/abs, misc3.e/ABS,
               c:/euphoria/include/my_libs/my_misc5.e/VOID,
               ........
               ........
-- eof

.en stands for Euphoria Namespace
 
Or: 

with namespace 1.0 -- for the initial plain namespace

Or:

with namespace 2.3 -- for the current namespace


Just my $0.02 for now.

I'm sorry, my spare time is very limited now,
maybe in wintertime I'll have more possibilities
for implementation of this idea.

It seems to me, such a system can be very flexible
without any new keywords in Euphoria, and is clear,
for me at least.
--===========

I think the problem is that we can *include* some
global symbols using the 'include' key word,
but can not *exclude* them using also some special
key word, which makes the *excluded* global symbols,
listed somewhere, 'hidden', so to say.

So I suggested to use the known EU key words, namely
'with' and 'without' just to *exclude* the not needed
global symbols of some library.

That may be, for example:

include win32lib.ew without symbol [list]
-- or just:
include win32lib.ew without [list]

I do not like all these complicated rules with
trees, tables of priority etc etc, I just can
not remember who is who in those tables and trees,
I'm just old and lazy.

All *excluded* global symbols must be *listed*
in separate special list to see them in program
text without any difficulties.

So we can have three mechanisms for name spaces
in EU -- including, excluding and renaming (include ... as).

Again, just my $0.02, but I'd like
to see some comments this time  

Regards,
Igor Kachan
kinz at peterlink.ru

28. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by CChris <christian.cuvier at a??iculture.gouv.fr> Oct 15, 2007
922 views

Matt Lewis wrote:
> 
> CChris wrote:
> > 
> > It looks like you are concerned with other types of clashes than I, but my
> > solution
> > also handles "yours". Also, I may be more concerned about what happens when
> > a library changes: the issues are subtly different, since they concern code
> > you didn't write and ideally should not modify. I don't see that your
> > solution
> > helps much there - again, I have to read a complete description.
> 
> I agree that my solution doesn't really deal with the encapsulation issue.
> That's not the point, and I don't believe that your solution really solves
> the issues at hand.  The point is to fix the problem of using multiple
> libraries and having them conflict with each other.  If two libraries 
> export the same symbols, you still have the same problem.
> 

If your application includes two libs which export the same identifier, it has
to explicitly choose one of them, using the current namespace system.  At least,
internal (ie global, but not interface) symbols won't be seen, just interface
symbols.
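
As a rough sketch of that explicit choice, using only the existing
"include ... as" syntax: the block below shows three separate files, and the
names libA.e, libB.e and the two abs() routines are made up for illustration,
not taken from any real library.

```euphoria
-- libA.e (hypothetical library)
global function abs(atom x)
    if x < 0 then
        return -x
    end if
    return x
end function

-- libB.e (hypothetical library that exports the same name)
global function abs(sequence s)
    -- element-wise absolute value: multiply each element by +1 or -1
    return s * (2 * (s >= 0) - 1)
end function

-- app.ex
include libA.e as a
include libB.e as b

? a:abs(-3)         -- explicitly libA's version
? b:abs({-1, 2})    -- explicitly libB's version
-- an unqualified abs(...) here should be rejected as ambiguous
```

The qualifiers only help at the application level; code inside either library
that refers to abs() without a qualifier is not touched by this.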

> What happens when the library maintainer makes some changes, and the 
> packaging directives don't keep up?  
> 

The maintainer usually tests his lib, and specially the new functionalities, and
something won't work right, which he should fix before shipping.

If the directives are in some custom wrapper you wrote, then it is your call to
keep the wrapper up to date, since you wrote it. Or to ship an update for your
application, so that an user can use the new version of the lib.

If the packaging uses a negative list, ie a list of internal symbols, then
adding more exported symbols requires no update at all.

> Also, your solution seems to me to be overly complex, and difficult to
> maintain.  When we get to encapsulation, I think I'd prefer something
> along the lines of an import vs include, where the globals "imported" 
> don't go beyond the file that imported them.  This way, if you need to
> keep stuff hidden, just import it.  If you want to expose things, either
> put it in the main include, or use the standard include directive.
> 
> It's possible that I just don't understand your proposal sufficiently.
> Please submit a modified set of files to demonstrate how you'd modify
> v3.1 to handle these files.  The benefit of my proposal is that there is
> absolutely no new syntax, and it does it in an intuitively straightforward
> manner.
> 

Huh? It does work for some obvious cases, but explicitly including a parent file,
for example, is far from anything I'd call "intuitive or straightforward".

I'll have to post the files indeed. As I noted earlier, this may take some time
because of RL fiercely catching up these days. But this is the way to go. Also, I
didn't try adjusting the runtime lookup functions accordingly yet.

> I'm not really interested (in this discussion) to hear about how your 
> solution encapsulates symbols.  I'm interested in how we deal with
> multiple files that expose duplicate symbols, whether the original 
> author wants them to be exposed or just used internally.  Accept that
> the symbols are exposed to the program.

We obviously have different approaches.

> 
> It seems that your method puts a larger burden onto the programmer to
> resolve conflicts.  My goal is to take what's already there (all 
> those include statements) and to make full use of the information that
> they impart.
> 
> Matt

The only burden is to disambiguate between interface and non interface symbols.
It is not a new task for programmer, since the docs he wrote usually do this
already.

CChris

29. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by CChris <christian.cuvier at agricultu?e.gouv.?r> Oct 15, 2007
868 views

Pete Lomax wrote:
> 
> Matt Lewis wrote:
> > 
> > CChris wrote:
> > > I don't see that your solution helps much there
> >
> > Also, your solution seems to me to be overly complex
> 
> While I find this topic fascinating, I think we can conclude that you two
> disagree 
> It is time (imho) to focus on a (beta) release and move away from theoretical
> bantering
> to swapping actual code samples that show the advantages/shortfalls of the new
> features,
> before attempting to add any more, that is. Email me if you want any help,
> proof reading,
> etc.
> 
> BTW, I've still had zero entries to my data hiding challenge, nor has anyone
> set a better one: http://palacebuilders.pwp.blueyonder.co.uk/dhc.htm
> 

I admit that I had lost interest in the issue, as well as in developing anything
for Eu. I won't forget submitting, but with no predictable time frame, sorry.

CChris

> Regards,
> Pete
> PS I'm still not keen on installing TortoiseSVN or Watcom on this box - I
> already
> have enough things I waste time on when more important stuff waits..

30. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at gmai?.c?m> Oct 15, 2007
854 views

CChris wrote:
> 
> Matt Lewis wrote:
> > 
> > 
> > I agree that my solution doesn't really deal with the encapsulation issue.
> > That's not the point, and I don't believe that your solution really solves
> > the issues at hand.  The point is to fix the problem of using multiple
> > libraries and having them conflict with each other.  If two libraries 
> > export the same symbols, you still have the same problem.
> > 
> 
> If your application includes two libs which export the same identifier, it has
> to explicitly choose one of them, using the current namespace system.  At
> least,
> internal (ie global, but not interface) symbols won't be seen, just interface
> symbols.

Yes, but what about those two libraries?  One will have been included in
the main app after the other.  Did the author use namespaces for everything
in his library?  The first included library has polluted the namespace.
How does the second library manage to deconflict the symbols?

> > What happens when the library maintainer makes some changes, and the 
> > packaging directives don't keep up?  
> > 
> 
> The maintainer usually tests his lib, and specially the new functionalities,
> and something won't work right, which he should fix before shipping.
>
> If the directives are in some custom wrapper you wrote, then it is your call
> to keep the wrapper up to date, since you wrote it. Or to ship an update for
> your application, so that an user can use the new version of the lib.
> 
> If the packaging uses a negative list, ie a list of internal symbols, then
> adding
> more exported symbols requires no update at all.

In a perfect world, this would be done.  And I agree that this isn't, in
and of itself, a deal breaker.  I am saying that it's creating another
way to create bugs.  How would the author know that he'd missed something?
That his library causes problems with some other piece of code that hasn't
even been written yet?  

The end result is that the users of the library are going to have to do 
extra work (by effectively editing the code of the library) to use this 
library, which is exactly what I'm trying to avoid.

A negative list may reduce the chances of having to go in and edit, but
it does not solve the problem.  It does, however drive up the complexity 
of the code.

> 
> > Also, your solution seems to me to be overly complex, and difficult to
> > maintain.  When we get to encapsulation, I think I'd prefer something
> > along the lines of an import vs include, where the globals "imported" 
> > don't go beyond the file that imported them.  This way, if you need to
> > keep stuff hidden, just import it.  If you want to expose things, either
> > put it in the main include, or use the standard include directive.
> > 
> > It's possible that I just don't understand your proposal sufficiently.
> > Please submit a modified set of files to demonstrate how you'd modify
> > v3.1 to handle these files.  The benefit of my proposal is that there is
> > absolutely no new syntax, and it does it in an intuitively straightforward
> > manner.
> > 
> 
> Huh? It does work for some obvious cases, but explicitly including a parent
> file,
> for example, is far from anything I'd call "intuitive or straightforward".

This really baffles me.  You have a file which uses symbols in another file.
If that other file is not present, then it won't work.  You're saying that
it's unintuitive to include that file with the symbols you depend upon?

> I'll have to post the files indeed. As I noted earlier, this may take some
> time
> because of RL fiercely catching up these days. But this is the way to go.
> Also,
> I didn't try adjusting the runtime lookup functions accordingly yet.
> 
> > I'm not really interested (in this discussion) to hear about how your 
> > solution encapsulates symbols.  I'm interested in how we deal with
> > multiple files that expose duplicate symbols, whether the original 
> > author wants them to be exposed or just used internally.  Accept that
> > the symbols are exposed to the program.
> 
> We obviously have different approaches.

Well, if you're honestly trying to solve the namespace/third party conflict
problem then I'd agree with you, and I'd go farther to say that your
approach is deeply flawed.

> > It seems that your method puts a larger burden onto the programmer to
> > resolve conflicts.  My goal is to take what's already there (all 
> > those include statements) and to make full use of the information that
> > they impart.
> 
> The only burden is to disambiguate between interface and non interface
> symbols.
> It is not a new task for programmer, since the docs he wrote usually do this
> already.

You still haven't answered the main issue here.  How does this solve two
third party library conflicts?  I'll admit that I simply may not understand
your solution well enough, and maybe your examples will clear this up,
but you seem to be avoiding the question by saying, effectively, that
your solution reduces the number of global symbols, so probabilistically,
the chance of a conflict is reduced.

I agree with you on that point.  But your solution doesn't solve, AFAICT,
the conflicts themselves.  So I'll use your terminology:

How would your approach resolve conflicts between two interface symbols
in two third party libraries?  The conflict exists because
the libraries (not just user code) use those interface symbols. 
Therefore, simply having a program like:

-- myapp.ex
include libMatt.e
include libCChris.e

...causes an error somewhere inside of libCChris.e, because one of its
interface symbols conflicts with an interface symbol in libMatt.e.

What would we need to do to make sure that this situation doesn't occur?
And what tools does the library programmer have to help him?

Matt

31. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Pete Lomax <petelomax at blueyo?der.co.u?> Oct 15, 2007
902 views

Igor Kachan wrote:
> 
> Pete Lomax wrote:
>  
> > BTW, I've still had zero entries to my data hiding challenge,
> 
> I think that the *private* and *local* symbols of Euphoria
> are hidden well enough, so the problem only stands for
> the *global* symbols.
Correct.
> 
> without symbol [list]  
> 
> So, the scheme is simple.
> 
Perhaps too simple for me. The include tree looks like this:
f1
 |
 +-f2
 |  | 
 |  +-f3 (defines z3)
 |  | 
 |  +-f4
 |  |
 |  +-f5
 |     |
 |     +-f6 (defines z6)
 +-f7

How would you make z6 visible in f7 but not in f5, f2 or f1?
How would you make z3 visible in f4 and f7 but not in f5, f2, or f1?

> You can just say about not needed globals,
> or to block them as hidden:
> 
> without symbol abs, void, PI, arcsin, VOID, ABS
> include misc.e
> include new_sdl.e

I would not argue against a simple "ringfence" such as this, it has uses, but
does not solve the challenge posted. One point about the above that I would like
to make is that a "without symbol" directive should apply only to the next
include statement, not the next N include statements.

> I do not like all these complicated rules with
> trees, tables of priority etc etc, I just can
> not remember who is who in those tables and trees,

The priority tables are a compiler internal, and like the symbol hash table
which has been around for donkeys years, it should not be something you need to
think about when coding. It is just meant to be a way of implementing "programmer
intent" as best as possible. I agree that it is a bit of a challenge to document
this stuff without confusing everyone.

Regards,
Pete

32. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Bernie Ryan <xotron at bl?efrog.com> Oct 15, 2007
856 views

Pete Lomax wrote:
> Perhaps too simple for me. The include tree looks like this:
> f1
>  |
>  +-f2
>  |  | 
>  |  +-f3 (defines z3)
>  |  | 
>  |  +-f4
>  |  |
>  |  +-f5
>  |     |
>  |     +-f6 (defines z6)
>  +-f7

Pete:

I think a better way to handle the name-space problem is this:

Create a name-space at the function/procedure level.

This could be handled by using a routine_id technique.

The biggest problem with using files is the fact that everything
is included even if you are NOT using it.

There must be someone out there that can think of a way
to implement this.

Bernie

My files in archive:
WMOTOR, XMOTOR, W32ENGIN, MIXEDLIB, EU_ENGIN, WIN32ERU, WIN32API 

Can be downloaded here:
http://www.rapideuphoria.com/cgi-bin/asearch.exu?dos=on&win=on&lnx=on&gen=on&keywords=bernie+ryan

33. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Pete Lomax <petelomax at blue??nder.co.uk> Oct 15, 2007
844 views

Matt Lewis wrote:
> I think I never really looked at it.  Looking at it now, I think I disagree
> with some of the premises.  I briefly described to CChris what I thought
> was a reasonably simple way to encapsulate.  The only change would be
> to change from "include" to "import."

But what happens when a file contains both globals that you want the whole world
to see and globals needed for private communication?

Hacking the files into "include" and "import" pieces is probably not a goer -
wanna try that with win32lib?

> I'm wondering about the justification for this requirement:
>   "z2 is visible in f2 and f5 but nowhere else"

A simple example is a routine_id. f2 wants to (forward) call something in f5 so
the standard trick is to declare a global in f2, and set it in f5 after the
routine definition. Nothing else needs to know about that routine_id, but there
it is happily polluting the global namespace, forever waiting for an unsuspecting
victim.
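
For anyone who has not met the trick, it goes roughly like this. The block
shows two separate files; z2 is the name from the challenge, while f2_run and
f5_helper are made-up names for illustration only.

```euphoria
-- f2.e
global integer z2          -- holds a routine_id; global, so everyone sees it
z2 = -1

global procedure f2_run()
    -- forward call: the target is defined in f5.e, which is included below
    call_proc(z2, {})
end procedure

include f5.e

-- f5.e
procedure f5_helper()
    puts(1, "called from f2\n")
end procedure

z2 = routine_id("f5_helper")   -- set after the routine definition
```

Only f2 and f5 care about z2, yet any file included later can read or
overwrite it, which is the pollution being complained about.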

> Other files include f2, but apparently only want to see certain symbols
> from f2 (the file where z2 is declared).  While I think that your system
> is interesting, I'm not convinced that it's solving a real problem.

The trouble with real problems is that they have more than one solution.
I accept it is more of a reaction to the oft discussed "pollution" than any real
issue I am struggling with, and in fact I wrote the data hiding challenge to get
a better grip on the problems being moaned about. If you can think of a better
example...

It may help to reiterate the idea behind this:
Someone writes a lib of some sort. Perhaps this is designed in from day 1,
perhaps a user experiences performance problems, but someone writes f7 as an
optional bolt-on and litters the code with counts f2called, f3called, etc.
f7 processes all this data at the end, maybe checks that #opens = #closes,
bytes_allocated = bytes_freed, or whatever. Maybe you had a go at writing a
central collectStats() routine but it made the lib 50% slower whereas lots of
f2called+=1 etc made no noticeable difference. Anyway, the idea is to gather data
all over the shop, which is only used in one place.
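
A rough sketch of what I mean, with the counter and routine names invented for the example:

<eucode>
-- f2.e (part of the lib)
global integer f2called      -- global only so that f7.e can read it
f2called = 0

global procedure f2()
    f2called += 1            -- cheap inline count, no central routine call
    -- ... the real work goes here ...
end procedure

-- f7.e (optional bolt-on, included after the lib files)
global procedure show_stats()
    printf(1, "f2 called %d times\n", f2called)
end procedure
</eucode>

The counter is written in f2.e and read only in f7.e, but being global it is visible to everything included after f2.e.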

There are obviously other classes of problems to be thought about as well.

> I guess we probably need to declare what problem we're solving.  My idea
> is that we want to be able to use symbols across files, but to have 
> a 'firewall' to limit their visibility to users of those files.  One 
> reason that I like this scheme is that it requires very little be done
> to get this effect, and it's done in a very straightforward way.

I would not be against this, but once you set up a firewall for one purpose it
seems hard to add a different one for the "bolt on" case, should they overlap.

Regards,
Pete


34. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at gmai?.com> Oct 15, 2007
860 views

Pete Lomax wrote:
> 
> Matt Lewis wrote:
> > I think I never really looked at it.  Looking at it now, I think I disagree
> > with some of the premises.  I briefly described to CChris what I thought
> > was a reasonably simple way to encapsulate.  The only change would be
> > to change from "include" to "import."
> 
> But what happens when a file contains both globals that you want the whole
> world
> to see and globals needed for private communication?
> 
> Hacking the files into "include" and "import" pieces is probably not a goer
> - wanna try that with win32lib?

Short answer: yes.

I agree that there would be significant work for something like win32lib,
but any encapsulation scheme is likely to require this for something like
win32lib.  In fact, I suspect that this would be a good way to go,
because it would help to make the interface more independent of the 
implementation.

> > I'm wondering about the justification for this requirement:
> >   "z2 is visible in f2 and f5 but nowhere else"
> 
> A simple example is a routine_id. f2 wants to (forward) call something in f5
> so the standard trick is to declare a global in f2, and set it in f5 after the
> routine definition. Nothing else needs to know about that routine_id, but
> there
> it is happily polluting the global namespace, forever waiting for an
> unsuspecting
> victim.

I guess I'm not as worried about this as you are.  Another (similar) way
to do this is to have a "setter" routine in f2 called by f5 (or whoever
is using f2).  Again, this could also be solved by organizing the library
by interface vs implementation files.
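
As a sketch of the setter variant (set_f2_handler and the other names are hypothetical, and this assumes f5 includes f2 rather than the other way round):

<eucode>
-- f2.e
integer r_handler            -- local to f2.e, nobody else can touch it
r_handler = -1

global procedure set_f2_handler(integer id)
    r_handler = id
end procedure

global procedure f2_run(sequence msg)
    if r_handler != -1 then
        call_proc(r_handler, {msg})
    end if
end procedure

-- f5.e
include f2.e

procedure handler(sequence msg)
    puts(1, msg & "\n")
end procedure
set_f2_handler(routine_id("handler"))
</eucode>

The setter is still a global symbol, of course, but the variable itself can no longer be read or clobbered from outside f2.e.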

Could you expand on the 'unsuspecting victim' a bit more?  Are you concerned
about conflicts?  Bugs where someone wrongly uses the variable?  I think 
I'd like to point out that with the newer symbol resolution rules, it
should only be an issue for *users* of the code, not additional 3rd
party code that is put into the same app.

> > Other files include f2, but apparently only want to see certain symbols
> > from f2 (the file where z2 is declared).  While I think that your system
> > is interesting, I'm not convinced that it's solving a real problem.
> 
> The trouble with real problems is that they have more than one solution.
> I accept it is more of a reaction to the oft discussed "pollution" than any
> real issue I am struggling with, and in fact I wrote the data hiding challenge
> to get a better grip on the problems being moaned about. If you can think of
> a better example...

Fair enough.  I think the best example might be in actually doing it to
something like win32lib (though maybe a bit smaller, of course).

> It may help to reiterate the idea behind this:
> Someone writes a lib of some sort. Perhaps this is designed in from day 1,
> perhaps
> a user experiences performance problems, but someone writes f7 as an optional
> bolt-on and litters the code with counts f2called, f3called, etc.
> f7 processes all this data at the end, maybe checks that #opens = #closes,
> bytes_allocated
> = bytes_freed, or whatever. Maybe you had a go at writing a central
> collectStats()
> routine but it made the lib 50% slower whereas lots of f2called+=1 etc made
> no noticeable difference. Anyway, the idea is to gather data all over the
> shop,
> which is only used in one place.

My first thought about this is that it's all debug code that should 
probably be removed before release, so I'm not sure that it's a good
example. 

> There are obviously other classes of problems to be thought about as well.

Yes.  To expand on why I'm for adding some sort of data hiding to euphoria
(which probably isn't saying anything you haven't already thought or said):

If you expose something, then it's almost guaranteed that someone will
use it, whether or not they should.  This can cause problems if you
later decide to change something that someone else depended upon.
Take a look at Raymond Chen's blog for some great examples in the MSWindows
world.

That's not to say that someone won't go in and abuse your code and expose
things that you didn't want exposed, but that's a different issue 
altogether.  Allowing things to slip out makes it more painful to
change things later--you have to balance upsetting your users who took
advantage of exposed, but undocumented and unintended features of your 
code with the benefits of the change.

Win32lib, for example does all kinds of stuff under the covers that most
people have no clue about (and I've been away long enough from the code
that I probably wouldn't recognize it anymore, either).  But you can
still basically use the code like you did 5 or 6 years ago.  However,
we can all witness the problems that I've had every time it's changed
the way the memory/structure code works.  EuCOM depends on those things
(because it's really good code), and I don't really see a great solution
in there.  And those things were meant for external coders to use them,
but we can imagine a similar situation with code not meant for external
consumption.

> > I guess we probably need to declare what problem we're solving.  My idea
> > is that we want to be able to use symbols across files, but to have 
> > a 'firewall' to limit their visibility to users of those files.  One 
> > reason that I like this scheme is that it requires very little be done
> > to get this effect, and it's done in a very straightforward way.
> 
> I would not be against this, but once you set up a firewall for one purpose
> it seems hard to add a different one for the "bolt on" case, should they
> overlap.

I think it's actually pretty easy to punch holes in the firewall.  Just
add an include statement.  Of course, by doing so you should realize that 
you're no longer using the library the way it was originally designed,
so you need to be aware of the consequences (not all of which are 
necessarily bad).

Taking another look at your example, I'm concerned with the shared scope
identifiers.  What do we do when those start to conflict?

Other languages have more "natural" ways to deal with this problem.  In
C, you have header files, and in [most?] object oriented languages, you 
have different scopes for class members that help with encapsulation.

Maybe we need to add the concept of header files to euphoria?  A library's
header would define which files to include and which symbols are
exported, without actually causing any IL to be generated, so you don't 
have to write any "interface wrappers" when you lay out the code.

I doubt this is a good idea, but it should probably be considered.

And perhaps in addition to your 3 criteria:

  1: Unambiguity and Least Surprise
  2: Zero impact on legacy code
  3: Minimal performance impact

  4: KISS

It's that #4 that probably worries me most between your and CChris'
encapsulation proposals, though I find yours a bit simpler.

Matt


35. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by CChris <christian.cuvier at ag?icultu?e.gouv.fr> Oct 15, 2007
851 views

Matt Lewis wrote:
> 
> CChris wrote:
> > 
> > Matt Lewis wrote:
> > > 
> > > 
> > > I agree that my solution doesn't really deal with the encapsulation issue.
> > > That's not the point, and I don't believe that your solution really solves
> > > the issues at hand.  The point is to fix the problem of using multiple
> > > libraries and having them conflict with each other.  If two libraries 
> > > export the same symbols, you still have the same problem.
> > > 
> > 
> > If your application includes two libs which export the same identifier, it
> > has
> > to explicitly choose one of them, using the current namespace system.  At
> > least,
> > internal (ie global, but not interface) symbols won't be seen, just
> > interface
> > symbols.
> 
> Yes, but what about those two libraries.  One will have been included in
> the main app after the other.  Did the author use namespaces for everything
> in his library?  The first included library has polluted the namespace.
> How does the second library manage to deconflict the symbols?
> 

There are two ways around it:
1) As explained in my paper, write a wrapper include around say the first
library, and filter the clashing symbol out from there:

-- app.exw
include wrap_lib1.e -- managed inclusion of lib1.e
include lib2.e
-- more code here

-- wrap_lib1.e
-- define a package around the rogue lib
package lib1_filtered = <comma separated list of allowed symbols from lib1>
-- enforce it
with package lib1_filtered
-- now in protected area
include lib1.e
-- very complicated three lines


So, if lib1.e defines symbols that would clash either with identifiers in
app.exw or in lib2.e('s subtree), they will be screened off and neither lib2.e
nor app.exw will see them.

wrap_lib1.e is to be written by the application coder, since the list of
clashing symbols depends on the other third party libs and the app.exw code.

2/ If only a couple symbols cause trouble, which is expected to be the most
frequent case, it is even simpler to do this:

-- app.exw
include lib1.e
restrict clashing_symbol_1 from lib1.e to lib1.e
--...
restrict clashing_symbol_N from lib1.e to lib1.e
include lib2.e
-- more code here

with exactly the same effect. If the symbols come from lib1.e's subtree, then
solution 1/ is in order.

> > > What happens when the library maintainer makes some changes, and the 
> > > packaging directives don't keep up?  
> > > 
> > 
> > The maintainer usually tests his lib, and specially the new functionalities,
> > and something won't work right, which he should fix before shipping.
> >
> > If the directives are in some custom wrapper you wrote, then it is your call
> > to keep the wrapper up to date, since you wrote it. Or to ship an update for
> > your application, so that an user can use the new version of the lib.
> > 
> > If the packaging uses a negative list, ie a list of internal symbols, then
> > adding
> > more exported symbols requires no update at all.
> 
> In a perfect world, this would be done.  And I agree that this isn't, in
> and of itself, a deal breaker.  I am saying that it's creating another
> way to create bugs.  How would the author know that he'd missed something?
> That his library causes problems with some other piece of code that hasn't
> even been written yet?  

He wouldn't. By properly marking internal symbols as such, he'd eliminate their
clashing with anything, so only interface symbols may still cause problems. If
someone uses another library with same interface, then see above.

> 
> The end result is that the users of the library are going to have to do 
> extra work (by effectively editing the code of the library) to use this 
> library, which is exactly what I'm trying to avoid.
> 

Definitely not true.
As the examples above show, the system is designed to _completely eliminate_ any
third party code editing. It also enables one to deal with legacy code (ie
without internal or exported), using export lists in wrapper includes.

> A negative list may reduce the chances of having to go in and edit, but
> it does not solve the problem.  It does, however drive up the complexity 
> of the code.
> 

Not really. SymTab has two more fields, one of them being the initial package
and the other the current package. If a symbol is *exported*, initial package is
-1; if *internal*, it is current_package; if *global*, the relevant package list
is looked up so as to determine whether this is internal or exported. Finally,
keyfind() checks from the current package field if some global is visible.

For example, contrary to your implementation, no change of file structure in the
front end is required in order to include it as a library. Just a wrapper as a
stopgap, or marking all current global symbols as internal if they are not to be
seen outside.

> > 
> > > Also, your solution seems to me to be overly complex, and difficult to
> > > maintain.  When we get to encapsulation, I think I'd prefer something
> > > along the lines of an import vs include, where the globals "imported" 
> > > don't go beyond the file that imported them.  This way, if you need to
> > > keep stuff hidden, just import it.  If you want to expose things, either
> > > put it in the main include, or use the standard include directive.
> > > 
> > > It's possible that I just don't understand your proposal sufficiently.
> > > Please submit a modified set of files to demonstrate how you'd modify
> > > v3.1 to handle these files.  The benefit of my proposal is that there is
> > > absolutely no new syntax, and it does it in an intuitively straightforward
> > > manner.
> > > 
> > 
> > Huh? It does work for some obvious cases, but explicitly including a parent
> > file,
> > for example, is far from anything I'd call "intuitive or straightforward".
> 
> This really baffles me.  You have a file which uses symbols in another file.
> If that other file is not present, then it won't work.  You're saying that
> it's unintuitive to include that file with the symbols you depend upon?
>

Since the identifier defined in the parent is already in the known symbol space,
it is counter intuitive to add it again, yes. The only advantage I see in your
scheme is the dependency being made explicit. Otherwise, it's more typing for
less functionality, because then all symbols from parent will become visible from
the included file, won't they? If they don't, read "same functionality".
  
> > I'll have to post the files indeed. As I noted earlier, this may take some
> > time
> > because of RL fiercely catching up these days. But this is the way to go.
> > Also,
> > I didn't try adjusting the runtime lookup functions accordingly yet.
> > 
> > > I'm not really interested (in this discussion) to hear about how your 
> > > solution encapsulates symbols.  I'm interested in how we deal with
> > > multiple files that expose duplicate symbols, whether the original 
> > > author wants them to be exposed or just used internally.  Accept that
> > > the symbols are exposed to the program.
> > 
> > We obviously have different approaches.
> 
> Well, if you're honestly trying to solve the namespace/third party conflict
> problem then I'd agree with you, and I'd go farther to say that your
> approach is deeply flawed.
>  

I am trying to solve several related issues around name clashing and scope in
one sweep, rather than applying yet another Band-Aid. This is why my solution
brings encapsulation, overloading and name clash prevention together in one
consistent "package".

The changes of syntax are:
* 2 new scope modifiers, *internal* and *exported*, which are mutually exclusive
with *global* and make the intent of the symbol known from just reading the code;
* 2 new with/without directive parameters:
+ *with package <pkg>*/*without package* pushes a package layer onto the package
stack;
+ *with previous_package* pops this stack.
* 2 new top level directives:
+ *package <pkg> [[!]=<symbol list>]* defines which globals are
exported/internal
+ *restrict <sym> from <id_def> to <new_scope>* changes the scope of an
identifier.

As explained, people who write multifile libs or apps that use several
(multifile) libs will need or use some of the new syntax. Hardly anyone else is
impacted.

> > > It seems that your method puts a larger burden onto the programmer to
> > > resolve conflicts.  My goal is to take what's already there (all 
> > > those include statements) and to make full use of the information that
> > > they impart.
> > 
> > The only burden is to disambiguate between interface and non interface
> > symbols.
> > It is not a new task for programmer, since the docs he wrote usually do this
> > already.
> 
> You still haven't answered the main issue here.  How does this solve two
> third party library conflicts?  I'll admit that I simply may not understand
> your solution well enough, and maybe your examples will clear this up,
> but you seem to be avoiding the question by saying, effectively, that
> your solution reduces the number of global symbols, so probabilistically,
> the chance of a conflict is reduced.
> 

Of course it does. As the code above shows, it also addresses the clashes between
interface symbols.

> I agree with you on that point.  But your solution doesn't solve, AFAICT,
> the conflicts themselves.  So I'll use your terminology:
> 
> How would your approach resolve conflicts between two interface symbols
> conflicting in two third party libraries.  The conflict exists because
> the libraries (not just user code) use those interface symbols. 
> Therefore, simply having a program like:
> }}}
<eucode>
> -- myapp.ex
> include libMatt.e
> include libCChris.e
> </eucode>
{{{

> ...causes an error somewhere inside of libCChris.e, because one of its
> interface symbols conflicts with an interface symbol in libMatt.e.
> 
> What would we need to do to make sure that this situation doesn't occur.
> And what tools does the library programmer have to help him?
> 

If two interface symbols collide, I think the examples above show exactly what
to do from the _application writer's_ end.

If a non interface symbol clashes with anything, then:
1/ The application writer can screen it off using the same method;
2/ Later, the library writer may mark interface symbols as *internal*, or define
a package with either an export or exclude list, as s/he sees fit, and release a
better behaved library. The wrapper will still work, but won't be useful then.
This ensures immunity to change in library versions.

> Matt

Well, it looks like my paper is not understandable at all; however, I don't
know what to do about it, because I had tried to make it as precise and detailed
as I could figure out. Need some hints there.

CChris


36. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at gm??l.com> Oct 15, 2007
849 views

CChris wrote:
> 
> Matt Lewis wrote:
> > 
> > 
> > Yes, but what about those two libraries.  One will have been included in
> > the main app after the other.  Did the author use namespaces for everything
> > in his library?  The first included library has polluted the namespace.
> > How does the second library manage to deconflict the symbols?
> > 
> 
> There are two ways around it:
> 1) As explained in my paper, write a wrapper include around say the first
> library,
> and filter the clashing symbol out from there:

<snip>

> 2/ If only a couple symbols cause trouble, which is expected to be the most
> frequent case, it is even simpler to do this:

OK, so the answer is that the user has to write some sort of wrapper for
the libraries to make them work together.  For contrast, my approach
requires the user to do nothing to get the libraries to work together.
He only needs to use a namespace (as he would right now) if he wants
to use one of the conflicting symbols.

> > > If the packaging uses a negative list, ie a list of internal symbols, then
> > > adding
> > > more exported symbols requires no update at all.
> > 
> > In a perfect world, this would be done.  And I agree that this isn't, in
> > and of itself, a deal breaker.  I am saying that it's creating another
> > way to create bugs.  How would the author know that he'd missed something?
> > That his library causes problems with some other piece of code that hasn't
> > even been written yet?  
> 
> He wouldn't. By properly marking internal symbols as such, he'd eliminate
> their
> clashing with anything, so only interface symbols may still cause problems.
> If someone uses another library with same interface, then see above.

Yes, if he properly marked them.  My ulterior motive with this was to
draw attention to the new warnings I've proposed.  The warnings alert a
coder to the fact that the origin of some symbols might be ambiguous 
(possibly in the future, when third party code is in the mix).

> > 
> > The end result is that the users of the library are going to have to do 
> > extra work (by effectively editing the code of the library) to use this 
> > library, which is exactly what I'm trying to avoid.
> > 
> 
> Definitely not true.
> As the examples above show, the system is designed to _completely eliminate_
> any third party code editing. It also enables one to deal with legacy code (ie
> without internal or exported), using export lists in wrapper includes.

I was considering writing the additional wrappers to be equivalent to 
editing third party code.  I guess I just don't understand why anyone would
want to have to create these wrappers if there were no need.

> > A negative list may reduce the chances of having to go in and edit, but
> > it does not solve the problem.  It does, however drive up the complexity 
> > of the code.
> > 
> 
> Not really. SymTab has two more fields, one of them being the initial package
> and the other the current package. If a symbol is *exported*, initial package
> is -1; if *internal*, it is current_package; if *global*, the relevant package
> list is looked up so as to determine whether this is internal or exported.
> Finally,
> keyfind() checks from the current package field if some global is visible. 
> 
> For example, contrary to your implementation, no change of file structure in
> the front end is required in order to include it as a library. Just a wrapper
> as a stopgap, or marking all current global symbols as internal if they are
> not to be seen outside

I wasn't concerned as much with interpreter code.  As I see it, my 
proposal just does the obvious thing when resolving symbols, and warns when
there is no such obvious thing, or errors when there are conflicting 
obvious things.  There's not really any increased mental load for the
coder.  Having to manage import lists is just a way to make explicit
things that are actually pretty obvious, IMHO.

> > > Huh? It does work for some obvious cases, but explicitly including a parent
> > > file,
> > > for example, is far from anything I'd call "intuitive or straightforward".
> > 
> > This really baffles me.  You have a file which uses symbols in another file.
> > If that other file is not present, then it won't work.  You're saying that
> > it's unintuitive to include that file with the symbols you depend upon?
> >
> 
> Since the identifier defined in the parent is already in the known symbol
> space,
> it is counter intuitive to add it again, yes. The only advantage I see in your
> scheme is the dependency being made explicit. Otherwise, it's more typing for
> less functionality, because then all symbols from parent will become visible
> from the included file, won't they? If they don't, read "same functionality".

Here's why I don't think it's counter-intuitive.  When looking at the file,
how can you tell where the symbol is meant to come from?  The file *does*
have a dependency on its parent file.  No more symbols are made visible
than already exist for the child file.

I'd note that it's not a required thing.  The difference is that it prevents
an entire class of bugs from occurring.  If the include statement is really
that odious, you could put "without warning" in the file.  Unlike your
approach, if you follow this rule, and don't let the warnings slide, then
you're *guaranteed* to avoid namespace conflicts with other third party
code.
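
A minimal sketch of the rule, reusing the file1.e/file2.e naming from earlier in the thread (the variable names are only for illustration, and this assumes the interpreter keeps quietly ignoring an include of a file that has already been included, as it does today):

<eucode>
-- file1.e
global integer n
n = 1
include file2.e

-- file2.e
include file1.e      -- states the dependency; adds no new symbols
integer my_var
my_var = n           -- unambiguous: n is meant to come from file1.e
</eucode>

Without that include line, file2.e would still find n, but only from outside its own include tree, which is what the proposed warning is meant to flag.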

> > Well, if you're honestly trying to solve the namespace/third party conflict
> > problem then I'd agree with you, and I'd go farther to say that your
> > approach is deeply flawed.
> >  
> 
> I am trying to solve several related issues around name clashing and scope in
> one sweep, rather than applying yet another Band-Aid. This is why my
> solution
> brings encapsulation, overloading and name clash prevention together in one
> consistent
> "package".

My problem is that it doesn't really prevent name clashes.  True, it gives
the user tools with which to fight back, but why require extra work
where it's not needed?

I also disagree that my approach is a Band-Aid.  It is, rather, a fairly
comprehensive solution to getting code from multiple libraries to play
nicely.  No currently working code would break, and it would allow code
out there that currently conflicts to work immediately (some libraries 
might require some minor changes--i.e., adding a few include statements).

> The changes of syntax are:
> * 2 new scope modifiers, *internal* and *exported*, which are mutually
> exclusive
> with *global* and make the intent of the symbol known from just reading the
> code;
> * 2 new with/without directive parameters:
> + *with package <pkg>*/*without package* pushes a package layer onto the
> package
> stack;
> + *with previous_package* pops this stack.
> * 2 new top level directives:
> + *package <pkg> [[!]=<symbol list>]* defines which globals are
> exported/internal
> + *restrict <sym> from <id_def> to <new_scope>* changes the scope of an
> identifier.
> 
> As explained, people who write multifile libs or apps that use several
> (multifile)
> libs will need or use some of the new syntax. Hardly anyone else is impacted.

Well, as someone who both writes and uses multiple multifile libs and 
apps, I can tell you that I'm not real keen on using the techniques 
described to get them all to work together.

> > You still haven't answered the main issue here.  How does this solve two
> > third party library conflicts?  I'll admit that I simply may not understand
> > your solution well enough, and maybe your examples will clear this up,
> > but you seem to be avoiding the question by saying, effectively, that
> > your solution reduces the number of global symbols, so probabilistically,
> > the chance of a conflict is reduced.
> > 
> 
> Of course it does. As the code above shows, it also addresses the clashes
> between
> interface symbols.

I now understand you.  As mentioned above, it just gives a coder a 
framework to solve the problems.  I just prefer a simpler approach.

> Well, it looks like my paper is completely not understandable; however, I
> don't
> know what to do about it, because I had tried to make it as precise and
> detailed
> as I could figure out. Need some hints there.

I was thinking about the problem differently, as something that could be
95% solved by the interpreter, rather than 50%.  OK, numbers are made up,
but my real point is that under my system, you can write code that will
100% play nicely with other code.  If anything conflicts, the user just
has to use a namespace to differentiate between your library and someone
else's.
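
For illustration, assuming both libraries happen to export a routine called init() (a made-up name), the application would only need the existing namespace syntax:

<eucode>
-- myapp.ex
include libMatt.e as matt
include libCChris.e as cc

matt:init()   -- the init() defined in libMatt.e
cc:init()     -- the init() defined in libCChris.e
</eucode>

Symbols that don't conflict can still be used without any qualifier.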

There are no new keywords.  No new syntax.  And coders get an extra warning
that is extremely valuable in pointing out a potential problem.  Essentially,
as long as your library is error free, you can be confident that anyone
can simply include it into their code, and it won't cause any namespace
conflicts (of course, other, poorly written code might).

Matt


37. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Igor Kachan <kinz at pe??rlink.ru> Oct 16, 2007
846 views

Pete Lomax wrote:
> 
> Igor Kachan wrote:
> > 
> > Pete Lomax wrote:
> >  
> > > BTW, I've still had zero entries to my data hiding challenge,
> > 
> > I think that the *private* and *local* symbols of Euphoria
> > are hidden well enough, so the problem only stands for
> > the *global* symbols.
> Correct.
> > 
> > without symbol [list]  
> > 
> > So, the scheme is simple.
> > 
> Perhaps too simple for me. The include tree looks like this:
> f1
>  |
>  +-f2
>  |  | 
>  |  +-f3 (defines z3)
>  |  | 
>  |  +-f4
>  |  |
>  |  +-f5
>  |     |
>  |     +-f6 (defines z6)
>  +-f7
> 
> How would you make z6 visible in f7 but not in f5, f2 or f1?
> How would you make z3 visible in f4 and f7 but not in f5, f2, or f1?

I'd prefer to just *edit* these files, using Edita, and nothing more -
to rename some globals, copy/paste some files instead of including
them, etc., and to get easily readable *new* source files,
without any need to keep in mind some complicated rules of
"local but global" scope with "private but local" import.

> > You can just say about not needed globals,
> > or to block them as hidden:
> > 
> > without symbol abs, void, PI, arcsin, VOID, ABS
> > include misc.e
> > include new_sdl.e
> 
> I would not argue against a simple "ringfence" such as this, it has uses, but
> does not solve the challenge posted. One point about the above that I would
> like to make is that a "without symbol" directive should apply only to the
> next
> include statement, not the next N include statements.


Yes, your "one point" is a good point.

 
> > I do not like all these complicated rules with
> > trees, tables of priority etc etc, I just can
> > not remember who is who in those tables and trees,
> 
> The priority tables are a compiler internal, and like the symbol hash table
> which has been around for donkey's years, it should not be something you need
> to think about when coding. It is just meant to be a way of implementing
> "programmer
> intent" as best as possible.


Yes, I know.


> I agree that it is a bit of a challenge to document
> this stuff without confusing everyone.

Yes, that is the most important thing!

Regards,
Igor Kachan
kinz at peterlink.ru


38. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by CChris <christian.cuvier at agricultur?.go?v.fr> Oct 16, 2007
847 views

Matt Lewis wrote:
> 
> CChris wrote:
> > 
> > Matt Lewis wrote:
> > > 
> > > 
> > > Yes, but what about those two libraries.  One will have been included in
> > > the main app after the other.  Did the author use namespaces for
> > > everything
> > > in his library?  The first included library has polluted the namespace.
> > > How does the second library manage to deconflict the symbols?
> > > 
> > 
> > There are two ways around it:
> > 1) As explained in my paper, write a wrapper include around say the first
> > library,
> > and filter the clashing symbol out from there:
> 
> <snip>
> 
> > 2/ If only a couple symbols cause trouble, which is expected to be the most
> > frequent case, it is even simpler to do this:
> 
> OK, so the answer is that the user has to write some sort of wrapper for
> the libraries to make them work together.  For contrast, my approach
> requires the user to do nothing to get the libraries to work together.
> He only needs to use a namespace (as he would right now) if he wants
> to use one of the conflicting symbols.
>  

I do not see how, within your framework, an application is protected from a new
non-interface global identifier popping up in a newer version of a lib and clashing
with an existing symbol in the application. Could you explain how it does?

More generally, I'd like to see precise documentation of the new working of
namespaces you are suggesting. It would be much better than just exhibiting a
dozen contrived, simple test files, however relevant and useful they may be.

> > > > If the packaging uses a negative list, ie a list of internal symbols,
> > > > then adding
> > > > more exported symbols requires no update at all.
> > > 
> > > In a perfect world, this would be done.  And I agree that this isn't, in
> > > and of itself, a deal breaker.  I am saying that it's creating another
> > > way to create bugs.  How would the author know that he'd missed something?
> > > That his library causes problems with some other piece of code that hasn't
> > > even been written yet?  
> > 
> > He wouldn't. By properly marking internal symbols as such, he'd eliminate
> > their
> > clashing with anything, so only interface symbols may still cause problems.
> > If someone uses another library with same interface, then see above.
> 
> Yes, if he properly marked them.  My ulterior motive with this was to
> draw attention to the new warnings I've proposed.  The warnings alert a
> coder to the fact that the origin of some symbols might be ambiguous 
> (possibly in the future, when third party code is in the mix).
>

Most people run without warning anyway, if only to avoid the silly short-circuit
warnings, or "parameter not used" when a parameter is not needed in an actual routine
but the prototype cannot change. Warnings are thus practically useless, even though
they convey useful information.
 
> > > 
> > > The end result is that the users of the library are going to have to do 
> > > extra work (by effectively editing the code of the library) to use this 
> > > library, which is exactly what I'm trying to avoid.
> > > 
> > 
> > Definitely not true.
> > As the examples above show, the system is designed to _completely eliminate_
> > any third party code editing. It also enables one to deal with legacy code
> > (ie
> > without internal or exported), using export lists in wrapper includes.
> 
> I was considering writing the additional wrappers to be equivalent to 
> editing third party code.  I guess I just don't understand why anyone would
> want to have to create these wrappers if there were no need.
>  

As long as there is no clash, there is no need. 
Even though it would be nicer for the user to have a wrapper supplied by the
library maintainer, there is no obligation to do so. Either the application coder
or any fourth party can supply one.

> > > A negative list may reduce the chances of having to go in and edit, but
> > > it does not solve the problem.  It does, however drive up the complexity 
> > > of the code.
> > > 
> > 
> > Not really. SymTab has two more fields, one of them being the initial
> > package
> > and the other the current package. If a symbol is *exported*, initial
> > package
> > is -1; if *internal*, it is current_package; if *global*, the relevant
> > package
> > list is looked up so as to determine whether this is internal or exported.
> > Finally,
> > keyfind() checks from the current package field if some global is visible. 
> > 
> > For example, contrary to your implementation, no change of file structure in
> > the front end is required in order to include it as a library. Just a
> > wrapper
> > as a stopgap, or marking all current global symbols as internal if they are
> > not to be seen outside
> 
> I wasn't concerned as much with interpreter code.  As I see it, my 
> proposal just does the obvious thing when resolving symbols, and warns when
> there is no such obvious thing, or errors when there are conflicting 
> obvious things.  There's not really any increased mental load for the
> coder.  Having to manage import lists is just a way to make explicit
> things that are actually pretty obvious, IMHO.

Except when they are not. Besides, my solution also makes it possible to
extend/redefine routines, which is currently tricky and limited, and about which
I don't see how your solution helps.

> 
> > > > Huh? It does work for some obvious cases, but explicitly including a
> > > > parent file,
> > > > for example, is far from anything I'd call "intuitive or
> > > > straightforward".
> > > 
> > > This really baffles me.  You have a file which uses symbols in another
> > > file.
> > > If that other file is not present, then it won't work.  You're saying that
> > > it's unintuitive to include that file with the symbols you depend upon?
> > >
> > 
> > Since the identifier defined in the parent is already in the known symbol
> > space,
> > it is counter intuitive to add it again, yes. The only advantage I see in
> > your
> > scheme is the dependency being made explicit. Otherwise, it's more typing
> > for
> > less functionality, because then all symbols from parent will become visible
> > from the included file, won't they? If they don't, read "same
> > functionality".
> 
> Here's why I don't think it's counter-intuitive.  When looking at the file,
> how can you tell where the symbol is meant to come from?  The file *does*
> have a dependency on its parent file.  No more symbols are made visible
> than already exist for the child file.

Sometimes you just cannot, because the file is meant to be included by several
possible parent files. So at least you should provide a special syntax for
"include parent file". Or have a reserved namespace to designate symbols known to
come from some ancestor (in inclusion tree). Writing relations with parent in
concrete runs counter to any code reuse strategy.

> 
> I'd note that it's not a required thing.  The difference is that it prevents
> an entire class of bugs from occurring.  If the include statement is really
> that odious, you could put "without warning" in the file.  Unlike your
> approach, if you follow this rule, and don't let the warnings slide, then
> you're *guaranteed* to avoid namespace conflicts with other third party
> code.
>

I don't think there is any guarantee, because there are just so many ways
clashes can happen. Again, where are the complete docs? That's the only way one
can tell.

> > > Well, if you're honestly trying to solve the namespace/third party
> > > conflict
> > > problem then I'd agree with you, and I'd go farther to say that your
> > > approach is deeply flawed.
> > >  
> > 
> > I am trying to solve several related issues around name clashing and scope
> > in
> > one sweep, rather than applying yet another Band-Aid. This is why my
> > solution
> > brings encapsulation, overloading and name clash prevention in one
> > consistent
> > "package".
> 
> My problem is that it doesn't really prevent name clashes.  True, it gives
> the user tools with which to fight back, but why require extra work
> where it's not needed?

Whether name clashes occur depends on the specific combination of libraries and
application code. So, I don't believe a simple change of rules will prevent all
cases. And the user still cannot fight back without editing third party code.

> 
> I also disagree that my approach is a Band-Aid.  It is, rather, a fairly
> comprehensive solution to getting code from multiple libraries to play
> nicely.  No currently working code would break, but would allow code
> out there that currently conflicts to work immediately (some libraries 
> might require some minor changes--i.e., adding a few include statements).
> 

Adding the include statements may well have other side effects under your
scheme. So, assuming it would only take this amount of work (which I certainly
don't believe), a few library maintainers will need to run another cycle of tests
and release newer versions. Now you are a user and have been waiting for three
years for the next release, how about that? My solution allows for things to
start working (again) now, and the library updates, while welcome, will come when
they may.

And I still maintain that your solution, while it certainly improves on current
Eu, addresses only part of the related issues, and will make the remainder more
difficult to solve in the future. This is exactly what P. Robinson and I were
calling "Band-Aid".

> > The changes of syntax are:
> > * 2 new scope modifiers, *internal* and *exported*, which are mutually
> > exclusive
> > with *global* and make the intent of the symbol known from just reading the
> > code;
> > * 2 new with/without directive parameters:
> > + *with package <pkg>*/*without package* pushes a package layer onto the
> > package
> > stack;
> > + *with previous_package* pops this stack.
> > * 2 new top level directives:
> > + *package <pkg> [[!]=<symbol list>]* defines which globals are
> > exported/internal
> > + *restrict <sym> from <id_def> to <new_scope>* changes the scope of an
> > identifier.
> > 
> > As explained, people who write multifile libs or apps that use several
> > (multifile)
> > libs will need or use some of the new syntax. Hardly anyone else is
> > impacted.
> 
> Well, as someone who both writes and uses multiple multifile libs and 
> apps, I can tell you that I'm not real keen on using the techniques 
> described to get them all to work together.

I can't help there, I'm afraid.

> 
> > > You still haven't answered the main issue here.  How does this solve two
> > > third party library conflicts?  I'll admit that I simply may not
> > > understand
> > > your solution well enough, and maybe your examples will clear this up,
> > > but you seem to be avoiding the question by saying, effectively, that
> > > your solution reduces the number of global symbols, so probabilistically,
> > > the chance of a conflict is reduced.
> > > 
> > 
> > Of course it does. As the code above shows, it also addresses the clashes
> > between
> > interface symbols.
> 
> I now understand you.  As mentioned above, it just gives a coder a 
> framework to solve the problems.  I just prefer a simpler approach.
> 

It uses no new syntax, but is not simpler to handle and certainly leaves more
issues standing.

> > Well, it looks like my paper is completely not understandable; however, I
> > don't
> > know what to do about it, because I had tried to make it as precise and
> > detailed
> > as I could figure out. Need some hints there.
> 
> I was thinking about the problem differently, as something that could be
> 95% solved by the interpreter, rather than 50%.  OK, numbers are made up,
> but my real point is that under my system, you can write code that will
> 100% play nicely with other code.  If anything conflicts, the user just
> has to use a namespace to differentiate between your library and someone
> else's.

This isn't even different from how it works now. The more I read, the fewer
problems I see your scheme solving.

> 
> There are no new keywords.  No new syntax.  And coders get an extra warning
> that is extremely valuable in pointing out a potential problem.  Essentially,
> as long as your library is error free, you can be confident that anyone
> can simply include it into their code, and it won't cause any namespace
> conflicts (of course, other, poorly written code might).
> 
> Matt

These statements are not backed by any evidence, and I simply don't think they
are true. Actual, detailed documentation is required to assess the scheme; I
tried to come up with some for mine: let me know what is missing. Of course I
agree that poor code always causes trouble, but at least let us give users tools to
manage it, as poor code is sometimes the only code available.

CChris


39. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at gmai?.c?m> Oct 16, 2007
888 views

CChris wrote:
> 
> I do not see how, within your framework, an application is protected from a
> new non interface global identifier popping in a lib's newer version and
> clashing
> with an existing symbol in the application. Could you explain how it does?

OK, I think I didn't make clear exactly which clashes it will prevent.  It
will prevent any namespace conflicts from happening within the third party
libraries.  Code that uses the different libraries will still have to
disambiguate which symbols they wish to use.  This can be easily done
with namespaces.  The assumption is that the third party libraries do
not use symbols in places where they are not directly or indirectly included.

> More generally, I'd like to see a precise documentation of the new working of
> namespaces you are suggesting. It would be much better than just exhibiting
> a dozen of contrivedly simple test files, however relevant and useful they may
> be.

Here's a shot at it:

Terms:
include directly: When one file (f1) includes another file (f2), f1 is
    said to include f2 directly
include indirectly:  A file (fx) is said to be included indirectly into 
    another file (f1) when there is a series of direct includes linking
    f1 and fx (e.g., f1->f2->...->fx)
included symbol: A symbol is said to be included into a file if it is
    a global symbol that has been either directly or indirectly included
    by the file

When the parser encounters a symbol that could resolve to multiple global
symbols, the parser will attempt to use included symbols over non-included
symbols.  Since the author included a file that exported a symbol with
the name being used, it is reasonable to assume that the desired symbol
is the one that was included.

If a namespace identifier is used, symbols in the file with the declared
namespace are checked for first.  If none exist, then the parser will
attempt to use symbols included by the namespaced file.  If multiple
included symbols match, then a compile error occurs.  To use these symbols,
additional namespaces must be used for the respective files.

If a non-included symbol is used, a warning will be generated.
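
As a rough illustration of the rules above, here is a minimal Python sketch of the lookup for a symbol used without a namespace. It is only a model, not interpreter code. The names `reachable` and `resolve`, the dictionaries standing in for the include tree, and the choice to treat a file's own globals as "included" are all assumptions made for the example.

```python
from collections import deque


def reachable(includes, start):
    """Files that `start` includes directly or indirectly, plus `start` itself.

    Counting the file's own globals as "included" is an assumption of this sketch.
    """
    seen = {start}
    queue = deque(includes.get(start, []))
    while queue:
        f = queue.popleft()
        if f not in seen:
            seen.add(f)
            queue.extend(includes.get(f, []))
    return seen


def resolve(name, current_file, includes, globals_by_file):
    """Resolve an unqualified global `name` used in `current_file`.

    Returns (defining_file, warning_or_None); raises LookupError when the
    name is unknown or cannot be resolved unambiguously.
    """
    candidates = [f for f, names in globals_by_file.items() if name in names]
    if not candidates:
        raise LookupError(f"{name} is not defined")
    included = reachable(includes, current_file)
    preferred = [f for f in candidates if f in included]
    if len(preferred) == 1:
        return preferred[0], None            # included symbols win
    if len(preferred) > 1:
        raise LookupError(f"{name} is ambiguous: {sorted(preferred)}")
    if len(candidates) == 1:                 # only a non-included symbol exists
        return candidates[0], f"warning: {name} resolved to non-included {candidates[0]}"
    raise LookupError(f"{name} is ambiguous: {sorted(candidates)}")


# Hypothetical include tree: main.ex includes a.e and b.e; a.e includes a1.e.
includes = {"main.ex": ["a.e", "b.e"], "a.e": ["a1.e"], "b.e": []}
globals_by_file = {"a1.e": {"x"}, "b.e": {"x", "y"}}

print(resolve("x", "a.e", includes, globals_by_file))  # a1.e is included by a.e
print(resolve("y", "a.e", includes, globals_by_file))  # b.e is not, so a warning
```

In this model, using `x` from `main.ex` itself still fails, because both `a1.e` and `b.e` are included there; that is the case where a namespace would be needed.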

> Most people run without warning anyway, if only to avoid the silly short
> circuit
> stuff, or "parameter not used" when they are not required in an actual routine
> but the prototype cannot change. They are practically useless, even though
> they
> convey useful information.

While I tend to run without warning, I usually at least attempt to get rid
of warnings before release.  Most of my warnings are relatively harmless
(unused variables--bad for maintenance, but probably not going to cause
any errors).  These warnings really do signify the possibility of bad
things ahead.

> > I was considering writing the additional wrappers to be equivalent to 
> > editing third party code.  I guess I just don't understand why anyone would
> > want to have to create these wrappers if there were no need.
> >  
> 
> As long as there is no clash, there is no need. 
> Even though it would be nicer for the user to have a wrapper supplied by the
> library maintainer, there is no obligation to do so. Either the application
> coder or any fourth party can supply one.

Yes, I understand that, but I'm saying that there's a better way to solve
this; one that doesn't require the new file in the first place.

> > I wasn't concerned as much with interpreter code.  As I see it, my 
> > proposal just does the obvious thing when resolving symbols, and warns when
> > there is no such obvious thing, or errors when there are conflicting 
> > obvious things.  There's not really any increased mental load for the
> > coder.  Having to manage import lists is just a way to make explicit
> > things that are actually pretty obvious, IMHO.
> 
> Except when they are not. Besides, my solution also makes it possible to
> extend/redefine
> routines, which is currently tricky and limited, and about which I don't see
> how your solution helps.

You're right.  That's not part of what it does.  I'd like for you to please
expand upon "Except when they are not."

> > Here's why I don't think it's counter-intuitive.  When looking at the file,
> > how can you tell where the symbol is meant to come from?  The file *does*
> > have a dependency on its parent file.  No more symbols are made visible
> > than already exist for the child file.
> 
> Sometimes you just cannot, because the file is meant to be included by several
> possible parent files. So at least you should provide a special syntax for
> "include
> parent file". Or have a reserved namespace to designate symbols known to come
> from some ancestor (in inclusion tree). Writing relations with parent in
> concrete
> runs counter to any code reuse strategy.

I disagree.  In fact, I think it improves code reuse.  If you take a look at 
what I've done with the interpreter code, you'll see that these issues have
been cleared up.  There are a couple of additional files used to declare
data.

I'd argue that not defining your dependencies (i.e., what external symbols 
do you depend upon?) makes code reuse more difficult, since you end up
getting really tightly coupled modules that cannot be easily separated.

If you're really dependent on parent symbols, then you probably have a 
poor design (I've done this myself, and paid for it).  It's probably
better to remove the symbols to a third file, and have both parent and
child use that file.  Relationships are clearer, and you don't need to
do weird stuff like stick the include halfway down the parent file.

> > I'd note that it's not a required thing.  The difference is that it prevents
> > an entire class of bugs from occurring.  If the include statement is really
> > that odious, you could put "without warning" in the file.  Unlike your
> > approach, if you follow this rule, and don't let the warnings slide, then
> > you're *guaranteed* to avoid namespace conflicts with other third party
> > code.
> >
> 
> I don't think there is any guarantee, because there are just so many ways
> clashes
> can happen. Again, where are the complete docs? That's the only way one can
> tell.

Well, I think an audit/test of the code would be a better way to really tell,
but I think my docs do a decent job of describing the behavior.  I'd be
grateful if you could find holes in my approach.  A single counter-example
would suffice.

> > My problem is that it doesn't really prevent name clashes.  True, it gives
> > the user tools with which to fight back, but why require extra work
> > where it's not needed?
> 
> The advent of name clashes depends on a specific combination of libraries and
> application code. So, I don't believe a simple change of rules will prevent
> all cases. And user still cannot fight back without editing third party code.

Again, please provide an example where my methodology fails.  I don't 
see it, but I may be too close.

> > 
> > I also disagree that my approach is a Band-Aid.  It is, rather, a fairly
> > comprehensive solution to getting code from multiple libraries to play
> > nicely.  No currently working code would break, but would allow code
> > out there that currently conflicts to work immediately (some libraries 
> > might require some minor changes--i.e., adding a few include statements).
> > 
> 
> Adding the include statements may well have other side effects under your
> scheme.
> So, assuming it would only take this amount of work (which I certainly don't
> believe), a few library maintainers will need to run another cycle of tests
> and release newer versions. Now you are a user and have been waiting for three
> years for the next release, how about that? My solution allows for things to
> start working (again) now, and the library updates, while welcome, will come
> when they may.

It's true that some third party (multi-file) libraries may require some
rework.  I didn't find any warnings with win32lib, however, which has to
be the most used and most complicated of them all.  My solution will solve
most cases where combinations of libraries will fail.  The only time it
won't prevent this is when the libraries use non-included symbols, which
I believe is probably not very common.  In those cases, some action is
required.  However, the action is the insertion of a few include statements.

It does not require anything nearly as complex as your method.  As an
example, how would you solve this issue:

-- app.ex
include libMatt.e
include libCChris.e
...

-- libMatt.e
include libMatt1.e
? x

-- libMatt1.e
global constant x = "Matt1"

-- libCChris.e
include libCChris1.e
? x

-- libCChris1.e
global constant x = "CChris1"

This is a simple demonstration of exactly the sort of problem I'm solving
here.  Line 2 of libCChris.e will error, because it doesn't know whether
it should use libMatt1:x or libCChris1:x under current symbol resolution.
But I think that it's clear to anyone looking at the code that it's
really trying to use libCChris1:x.
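
The difference can be modelled with a small Python sketch of the five files above. It is a simplification under stated assumptions: `resolve_current` treats every global parsed so far as a candidate, `resolve_proposed` prefers symbols from directly or indirectly included files, and the warning for non-included symbols is left out. The function names and the `INCLUDES`/`GLOBALS` tables are hypothetical.

```python
# Include tree and global definitions taken from the example files.
INCLUDES = {
    "app.ex": ["libMatt.e", "libCChris.e"],
    "libMatt.e": ["libMatt1.e"],
    "libCChris.e": ["libCChris1.e"],
}
GLOBALS = {
    "libMatt1.e": {"x": "Matt1"},
    "libCChris1.e": {"x": "CChris1"},
}


def included_files(start):
    """Every file that `start` includes directly or indirectly."""
    seen, stack = set(), list(INCLUDES.get(start, []))
    while stack:
        f = stack.pop()
        if f not in seen:
            seen.add(f)
            stack.extend(INCLUDES.get(f, []))
    return seen


def resolve_current(name, parsed_so_far):
    """Simplified current rule: any global already parsed is a candidate."""
    hits = [f for f in parsed_so_far if name in GLOBALS.get(f, {})]
    if len(hits) != 1:
        raise LookupError(f"{name}: {len(hits)} candidates {hits}")
    return hits[0]


def resolve_proposed(name, using_file):
    """Proposed rule: included symbols take precedence over non-included ones."""
    hits = [f for f in GLOBALS if name in GLOBALS[f]]
    preferred = [f for f in hits if f in included_files(using_file)]
    pool = preferred or hits          # fall back to non-included symbols
    if len(pool) != 1:
        raise LookupError(f"{name}: {len(pool)} candidates {pool}")
    return pool[0]


# When line 2 of libCChris.e is reached, both libMatt1.e and libCChris1.e
# have already been parsed, so the current rule sees two candidates.
parsed = ["libMatt1.e", "libMatt.e", "libCChris1.e"]
try:
    resolve_current("x", parsed)
except LookupError as err:
    print("current rule fails:", err)

print("proposed rule picks:", resolve_proposed("x", "libCChris.e"))
```

Note that in this model a use of `x` in app.ex itself is still ambiguous under the proposed rule, since both libraries are included there; the application would pick one with a namespace.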

This is a huge problem that shouldn't exist.  It makes it difficult to
use multiple libraries together, because you end up having to edit the
code...every time it's updated...

> And I still uphold that your solution, while it certainly improves on current
> Eu, addresses only part of the related issues, and will make the remainder
> more
> difficult to solve in the future. This is exactly what P. Robinson and I were
> calling "Band-Aid".

I have never said that it attempted to solve multiple problems, but I do
believe that it's the best way to solve this particular problem.  Thinking
that something to hide information is going to solve this problem is
just wishful thinking.  I'd reply that your method for solving namespace
conflicts is a Band-Aid approach to that problem.  

Please show me your solution to the above example, and we'll go from there,
and we'll be able to compare.  Note that my methodology requires no code
changes.

> > I now understand you.  As mentioned above, it just gives a coder a 
> > framework to solve the problems.  I just prefer a simpler approach.
> > 
> 
> It uses no new syntax, but is not simpler to handle and certainly leaves more
> issues standing.

I don't understand why it's not simpler to handle.  Once you've submitted
a response to the above problem, maybe we'll have a better basis for
comparison.  Feel free to post other problems, so that we can compare the
solutions (Pete: you, too...).  Well, I'd argue that they leave completely
different issues standing, because they're attacking different problems.

> > I was thinking about the problem differently, as something that could be
> > 95% solved by the interpreter, rather than 50%.  OK, numbers are made up,
> > but my real point is that under my system, you can write code that will
> > 100% play nicely with other code.  If anything conflicts, the user just
> > has to use a namespace to differentiate between your library and someone
> > else's.
> 
> This isn't even different from how it works now. The more I read, the fewer
> problems
> I see your scheme solving.

Actually, there are some important (subtle, perhaps) differences.  I hope
that the above example helps to illustrate this.

> > 
> > There are no new keywords.  No new syntax.  And coders get an extra warning
> > that is extremely valuable in pointing out a potential problem. 
> > Essentially,
> > as long as your library is error free, you can be confident that anyone
> > can simply include it into their code, and it won't cause any namespace
> > conflicts (of course, other, poorly written code might).
> 
> These statements are not backed by any evidence, and I simply don't think they
> are true. Actual, detailed documentation is required to assess the scheme; I
> tried to come up with one for mine: let me know what is missing. Of course I
> agree with poor code always causing trouble, but at least let us give users tools
> to manage it, as the poor code sometimes is the only available one.

As I documented above, the change is simple, but powerful.  I stand behind
my assertion.  If anyone can post some (warning free) code as a counter
example, then I'll back down, of course.  There is a problem right now 
where you couldn't use libMatt and libCChris in the same application without
editing one or both of them.  It's simply impossible.  My change allows them
to live together without having to change them at all.

Matt


40. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Pete Lomax <petelomax at bluey?nder.?o.uk> Oct 16, 2007
842 views

Matt Lewis wrote:
> It's true that some third party (multi-file) libraries may require some
> rework.  I didn't find any warnings with win32lib, however, which has to
> be the most used and most complicated of them all.
It has not escaped me that far more effort has probably been spent on avoiding
these kinds of issues in win32lib, via conventional methods, than all other Eu
code in existence put together! With hindsight it was a very poor choice of
challenge! On a hunch I looked in Arwen and was surprised I could only find one
apparently unnecessary global, NUMCLASSES in classes.ew, and even that seems like
a typo as only classes.ew uses it.

Perhaps indeed this problem is being blown out of all proportion.

Or perhaps this is the very reason that win32lib is not broken up into more
manageable pieces, say printing, fonts, listviews, treeviews, mouse, properties,
datetime, scrollbars, bitmaps, etc.

Regards,
Pete


41. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by CChris <christian.cuvier at a?ricultu?e.gouv.fr> Oct 16, 2007
844 views

Pete Lomax wrote:
> 
> Matt Lewis wrote:
> > It's true that some third party (multi-file) libraries may require some
> > rework.  I didn't find any warnings with win32lib, however, which has to
> > be the most used and most complicated of them all.
> It has not escaped me that far more effort has probably been spent on avoiding
> these kinds of issues in win32lib, via conventional methods, than all other
> Eu code in existence put together! With hindsight it was a very poor choice
> of challenge! On a hunch I looked in Arwen and was surprised I could only find
> one apparently unnecessary global, NUMCLASSES in classes.ew, and even that
> seems
> like a typo as only classes.ew uses it.
> 
> Perhaps indeed this problem is being blown out of all proportion.
> 
> Or perhaps this is the very reason that win32lib is not broken up into more
> manageable pieces, say printing, fonts, listviews, treeviews, mouse,
> properties,
> datetime, scrollbars, bitmaps, etc.
> 
> Regards,
> Pete

Very precisely so.

See, in v0.70.x, the former w32support.e had grown to a jumble of 6000+ lines
covering many unrelated issues. This is why I suggested and implemented a breakup
of the file into w32memory.ew (the structure engine, quite extensively enhanced)
and w32utils.ew (everything else).
I had been informed of EuCom directly including w32support.e, which is why there
is a wrapper file of that name in the current distribution. As a result,

include w32support.e

 still has the expected effect, but

include w32support.e as mem

 no longer does.

Likewise, spinning off w32forms.ew, which appeared to be quite natural, wasn't
that easy. I managed to make public half of the symbols that went global in the
process, but not some others. This is why I'd insist on preventing "private
global", or internal, symbols from wreaking havoc on an application.

Under the current system, you have to lump everything in a single file or expose
unwanted globals. That's how we had a 33000+ line win32lib.ew.

Introducing "import", or "local include", or whatever, seems to me more useful
than the planned change in namespace semantics, though I'd keep the idea of
inherited namespaces - so that a namespace acts like a package identifier.

If we were redesigning Eu, I'd suggest making include block any global in
included files, "global include" being used to lift this restriction. But this
would break too much code I'm afraid, so let's go with "import", hoping most
library writers will soon update. Note that my solution spares the user having to
wait for an update, without touching any file he already has.

CChris


42. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by CChris <christian.cuvier at agric?lture.g?uv.fr> Oct 17, 2007
837 views

Matt Lewis wrote:
> 
> CChris wrote:
> > 
> > I do not see how, within your framework, an application is protected from a
> > new non interface global identifier popping in a lib's newer version and
> > clashing
> > with an existing symbol in the application. Could you explain how it does?
> 
> OK, I think I didn't make clear exactly which clashes it will prevent.  It
> will prevent any namespace conflicts from happening within the third party
> libraries.  Code that uses the different libraries will still have to
> disambiguate which symbols they wish to use.  This can be easily done
> with namespaces.  The assumption is that the third party libraries do
> not use symbols in places where they are not directly or indirectly included.
>  
> > More generally, I'd like to see a precise documentation of the new working
> > of
> > namespaces you are suggesting. It would be much better than just exhibiting
> > a dozen of contrivedly simple test files, however relevant and useful they
> > may
> > be.
> 
> Here's a shot at it:
> 
> Terms:
> include directly: When one file (f1) includes another file (f2), f1 is
>     said to include f2 directly
> include indirectly:  A file (fx) is said to be included indirectly into 
>     another file (f1) when there is a series of direct includes linking
>     f1 and fx (e.g., f1->f2->...->fx)
> included symbol: A symbol is said to be included into a file if it is
>     a global symbol that has been either directly or indirectly included
>     by the file
> 
> When the parser encounters a symbol that could resolve to multiple global
> symbols, the parser will attempt to use included symbols over non-included
> symbols.  Since the author included a file that exported a symbol with
> the name being used, it is reasonable to assume that the desired symbol
> is the one that was included.
> 
> If a namespace identifier is used, symbols in the file with the declared
> namespace are checked for first.  If none exist, then the parser will
> attempt to use symbols included by the namespaced file.  If multiple
> included symbols match, then a compile error occurs.  To use these symbols,
> additional namespaces must be used for the respective files.
> 
> If a non-included symbol is used, a warning will be generated.
> 

<snipped/>

Consider the following:

-- app.exw
include lib.e as lib
?lib:n

-- lib.e
include sublib.e as sublib

-- sublib.e
global integer n
include subsublib.e as subsublib

-- subsublib.e
global integer n

The sublib and subsublib namespaces are optional.

Now some questions:
1/ There are two different n defined below lib.e, and none in lib.e. Your docs
imply that lib:n should cause an error. I don't think so: the closest up should
mask the other symbols below it. If there are siblings, then yes, throw an error.

2/ If lib.e's newer version defines n, which it currently doesn't, then app.exw
has a risk of being affected in an unpredictable way, if it is important that
sublib:n be accessed, and lib:n was coded out of genericity. Or because an
earlier version of lib.e was defining the right n, and the inherited namespaces
made the transition to current version seamless.

CChris


43. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Matt Lewis <matthewwalkerlewis at g?ail?com> Oct 17, 2007
840 views

CChris wrote:
> 
> 
> Consider the following:

-- app.exw
include lib.e as lib
?lib:n

-- lib.e
include sublib.e as sublib

-- sublib.e
global integer n
include subsublib.e as subsublib

-- subsublib.e
global integer n

> The sublib and subsublib namespaces are optional.
> 
> Now some questions:
> 1/ There are two different n defined below lib.e, and none in lib.e.
> Your docs imply that lib:n should cause an error. I don't think so: 
> the closest up should mask the other symbols below it. If there are
> siblings, then yes, throw an error.

I think this is a reasonable conclusion, but I have some reservations.  What 
about the case where they aren't direct descendants?  We'd need to be able 
to calculate the "minimum include distance" in this case.  At that point,
if something unrelated gets rearranged in some file (includes change, etc)
you'll end up with a bug that's very difficult to prevent, and probably
hard to catch.  The alternative, where adding a conflicting symbol
throws an immediate error, tells you right away that you need to
specify which one you want.

For the above reasons, I think I'd stick to only allowing the top-level
symbol to mask included symbols.
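
The fragility of a "minimum include distance" rule can be sketched in Python as follows. This only models the idea and is not how the interpreter works. The file names `a.e`, `b.e`, `util.e`, `helper.e` and `misc.e` and both function names are invented for the example.

```python
from collections import deque


def include_distances(root, includes):
    """Breadth-first distance (number of include hops) from root to each file."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        f = queue.popleft()
        for child in includes.get(f, []):
            if child not in dist:
                dist[child] = dist[f] + 1
                queue.append(child)
    return dist


def resolve_by_distance(root, name, includes, globals_):
    """Pick the definition with the smallest include distance; a tie is an error."""
    dist = include_distances(root, includes)
    candidates = sorted(
        (d, f) for f, d in dist.items() if name in globals_.get(f, ())
    )
    if not candidates:
        raise LookupError(name)
    if len(candidates) > 1 and candidates[0][0] == candidates[1][0]:
        raise ValueError(f"ambiguous: {name}")
    return candidates[0][1]


# Hypothetical files a.e and b.e both define n; neither includes the other.
globals_ = {"a.e": {"n"}, "b.e": {"n"}}

before = {"lib.e": ["a.e", "util.e"], "util.e": ["b.e"]}
# An unrelated tidy-up: a.e is now pulled in through helper.e and misc.e.
after = {
    "lib.e": ["helper.e", "util.e"],
    "helper.e": ["misc.e"],
    "misc.e": ["a.e"],
    "util.e": ["b.e"],
}

print(resolve_by_distance("lib.e", "n", before, globals_))  # a.e
print(resolve_by_distance("lib.e", "n", after, globals_))   # b.e
```

Nothing in `a.e` or `b.e` changed between the two runs, yet the name silently resolves to a different file. A rule that raises an error whenever two included files define `n` would have flagged both layouts.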
 
> 2/ If lib.e's newer version defines n, which it currently doesn't, then
> app.exw has a risk of being affected in an unpredictable way, if it is
> important that sublib:n be accessed, and lib:n was coded out of genericity.
> Or because an earlier version of lib.e was defining the right n, and 
> the inherited namespaces made the transition to current version seamless.

Yes, this is a potential hazard.  This is probably a great reason to be
able to hide symbols (assuming the author didn't want sublib:n to be
used in the first place), though it could be that the library author
still wants both to be visible.  The only reasonable defense against this
is for the library author to make sure a change such as this is communicated
to the users.

I don't really see any alternative, other than not allowing namespaces to
inherit.  I suspect that this is a very rare situation, and I still
believe that the benefits outweigh the costs.
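
The hazard can be shown with a small Python model, assuming the rule that a symbol in the namespaced file masks anything it includes. The `resolve_qualified` function and the "v1"/"v2" tables are hypothetical. For simplicity only sublib.e defines n below lib.e here.

```python
def resolve_qualified(ns_file, name, includes, globals_):
    """The namespaced file's own symbol wins; otherwise a unique included one."""
    if name in globals_.get(ns_file, ()):
        return ns_file
    seen, stack, matches = set(), list(includes.get(ns_file, [])), []
    while stack:
        f = stack.pop()
        if f in seen:
            continue
        seen.add(f)
        if name in globals_.get(f, ()):
            matches.append(f)
        stack.extend(includes.get(f, []))
    if len(matches) != 1:
        raise LookupError(f"{name}: {len(matches)} candidates under {ns_file}")
    return matches[0]


includes = {"app.exw": ["lib.e"], "lib.e": ["sublib.e"]}

globals_v1 = {"sublib.e": {"n"}}                  # lib.e does not define n
globals_v2 = {"lib.e": {"n"}, "sublib.e": {"n"}}  # newer lib.e adds its own n

print(resolve_qualified("lib.e", "n", includes, globals_v1))  # sublib.e
print(resolve_qualified("lib.e", "n", includes, globals_v2))  # lib.e
```

app.exw is unchanged between the two versions, and no error or warning appears in this model; lib:n simply starts meaning something else.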

Matt


44. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by Pete Lomax <petelomax at blueyonde?.co?uk> Oct 18, 2007
844 views

CChris wrote:
> 
> Consider the following:
> }}}
<eucode>
> -- app.exw
> include lib.e as lib
> ?lib:n
> 
> -- lib.e
> include sublib.e as sublib
> 
> -- sublib.e
> global integer n
> include subsublib.e as subsublib
> 
> -- subsublib.e
> global integer n
> </eucode>
{{{

> The sublib and subsublib namespaces are optional.
> 
> Now some questions:
> 1/ There are two different n defined below lib.e, and none in lib.e. Your docs
> imply that lib:n should cause an error. I don't think so: the closest up
> should mask the other symbols below it.
I see no practical reason to do this. Matt arrived at the same conclusion as me
on this, quite independently, and only after that decision did I point out the
same effect in my priority tables.
http://www.openeuphoria.org/EUforum/m17071.html

> If there are siblings, then yes, throw an error.
> 
> 2/ If lib.e's newer version defines n, which it currently doesn't, then
> app.exw has a risk of being affected in an unpredictable way, if it is
> important that sublib:n be accessed, and lib:n was coded out of genericity.
In fact I consider this a necessary evil, hopefully sufficiently rare not to be
a significant problem (also covered in the above link). Making something similar
happen with subincludes basically doubles the probability, for no apparent gain.
> Or because an earlier
> version of lib.e was defining the right n, and the inherited namespaces made
> the transition to current version seamless.
> 
> CChris

Pete


45. Re: symbol resolution (was:EuCOM : Attn Matt : String Return Value)

Posted by CChris <christian.cuvier at a?riculture?gouv.fr> Oct 18, 2007
835 views
Last edited Oct 19, 2007

Matt Lewis wrote:
> 
> CChris wrote:
> > 
> > 
> > Consider the following:
> }}}
<eucode>
> -- app.exw
> include lib.e as lib
> ?lib:n
> 
> -- lib.e
> include sublib.e as sublib
> 
> -- sublib.e
> global integer n
> include subsublib.e as subsublib
> 
> -- subsublib.e
> global integer n
> </eucode>
{{{

> > The sublib and subsublib namespaces are optional.
> > 
> > Now some questions:
> > 1/ There are two different n defined below lib.e, and none in lib.e.
> > Your docs imply that lib:n should cause an error. I don't think so: 
> > the closest up should mask the other symbols below it. If there are
> > siblings, then yes, throw an error.
> 
> I think this is a reasonable conclusion, but I have some reservations.  What
> about the case where they aren't direct descendants?  We'd need to be able 
> to calculate the "minimum include distance" in this case.  At that point,
> if something unrelated gets rearranged in some file (includes change, etc)
> you'll end up with a bug that's very difficult to prevent, and probably
> hard to catch.  The alternative, where if some conflicting symbol were 
> added, would throw an immediate error, and you'd know that you needed to
> specify which you wanted.
> 
> For the above reasons, I think I'd stick to only allowing the top-level
> symbol to mask included symbols.
>  

I don't think an absolute distance should be used, exactly for these reasons.

However, if there are two or more symbols below the namespaced file, and there
is none in that file, and if one of these symbols is above the others, then the
situation you describe doesn't occur, because one of the symbols is at the root
of a subtree containing all others. In that case, the identifier should resolve
to this symbol. Conversely, if there is no such root, or if the root doesn't
define the desired identifier, the reference is ambiguous, and an error has to
be thrown. Rearranging has very little chance of causing a switch between these two
situations, while it can upset anything based on a mere distance calculation.
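
A rough Python sketch of this "root of the subtree" rule, as opposed to a distance calculation, might look like the following. It is a model only. The function names and the sibling files `a.e` and `b.e` are invented, and it assumes "above" means "reaches the other file through includes".

```python
def reachable(start, includes):
    """Files reachable from `start` through includes (start itself excluded)."""
    seen = set()
    stack = list(includes.get(start, []))
    while stack:
        f = stack.pop()
        if f not in seen:
            seen.add(f)
            stack.extend(includes.get(f, []))
    return seen


def resolve_by_root(ns_file, name, includes, globals_):
    """Resolve ns:name to the one defining file that sits above all the others."""
    if name in globals_.get(ns_file, ()):
        return ns_file
    defs = {f for f in reachable(ns_file, includes) if name in globals_.get(f, ())}
    if not defs:
        raise LookupError(name)
    # A root is a defining file from which every other defining file is reachable.
    roots = [f for f in defs if defs - {f} <= reachable(f, includes)]
    if len(roots) == 1:
        return roots[0]
    raise ValueError(f"ambiguous: {name} defined in {sorted(defs)}")


# The chain from the example: sublib.e sits above subsublib.e.
chain = {"lib.e": ["sublib.e"], "sublib.e": ["subsublib.e"]}
chain_globals = {"sublib.e": {"n"}, "subsublib.e": {"n"}}
print(resolve_by_root("lib.e", "n", chain, chain_globals))  # sublib.e

# Hypothetical siblings a.e and b.e: neither is above the other.
siblings = {"lib.e": ["a.e", "b.e"]}
sibling_globals = {"a.e": {"n"}, "b.e": {"n"}}
try:
    resolve_by_root("lib.e", "n", siblings, sibling_globals)
except ValueError as err:
    print("error:", err)
```

Adding extra hops between lib.e and sublib.e does not change the first result, because only the ancestor relation between the defining files matters, not how many includes away they are.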


> > 2/ If lib.e's newer version defines n, which it currently doesn't, then
> > app.exw has a risk of being affected in an unpredictable way, if it is
> > important that sublib:n be accessed, and lib:n was coded out of genericity.
> > Or because an earlier version of lib.e was defining the right n, and 
> > the inherited namespaces made the transition to current version seamless.
> 
> Yes, this is a potential hazard.  This is probably a great reason to be
> able to hide symbols (assuming the author didn't want sublib:n to be
> used in the first place), though it could be that the library author
> still wants both to be visible.  The only reasonable defense against this
> is for the library author to make sure a change such as this is communicated
> to the users.
> 
> I don't really see any alternative, other than not allowing namespaces to
> inherit.  I suspect that this is a very rare situation, and I still
> believe that the benefits outweigh the costs.
> 
> Matt

No problem. It's just that this has to be documented somewhere if it is to be
part of the language.

CChris
