Re: Pass by Reference
- Posted by DerekParnell (admin) Sep 12, 2009
Lack of references is the biggest issue with Euphoria right now
Can I just concentrate on Pass by Reference (PBR) for now, rather than references in general?
background
PBR means that when we make a function call, any argument that is passed by reference has a reference to the data given to the routine rather than the actual data itself. The routine can use the reference as if the data was really supplied to it.
The main reasons for doing this are ...
- speed
- allowing a routine to modify arguments whose scope is outside the routine
- defining routines that create data whose life continues after the routine has returned
In languages such as C/C++, passing an argument that is larger than what can be held in a CPU register involves copying the data to the call stack. And if that data is changed by the routine, it might have to copy the updated data back to the original location. To avoid this copying, programmers usually pass the address of the data (a reference) to the routine, thus giving the routine access to the data in its actual location in RAM. This is a lot faster than copying.
A routine that wants to modify any of its arguments, such that the change is still in effect after the routine returns, has to do one of two things.
- Accept a copy of the data, modify it and then return the modified data
- Accept a reference to the data's location and directly modify it in-place.
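To make the two options concrete, here is a minimal C sketch. The `Buffer` type and both routine names are made up for illustration; they are not from any library.

```c
#include <assert.h>
#include <stddef.h>

#define BUFFER_CAP 256

/* Hypothetical record, big enough that copying it is not free. */
typedef struct {
    int values[BUFFER_CAP];
    size_t count;
} Buffer;

/* Option 1: accept a copy, modify it, return the modified copy.
   The caller only sees the change if it assigns the result back. */
Buffer push_by_value(Buffer b, int v)
{
    assert(b.count < BUFFER_CAP);
    b.values[b.count++] = v;
    return b;
}

/* Option 2: accept the data's location and modify it in place. */
void push_by_reference(Buffer *b, int v)
{
    assert(b != NULL && b->count < BUFFER_CAP);
    b->values[b->count++] = v;
}
```

With the first form the caller writes `buf = push_by_value(buf, 7);` and the whole structure may be copied on the way in and on the way out. With the second it writes `push_by_reference(&buf, 7);` and only an address is passed.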
If a routine is creating new data and wants to return that to its caller, it also has two options...
- Create the data in the routine's stack space and return a copy of it. This slows things down for large data items.
- Create the data in the heap and return a reference to it. Much faster operation. Note: You shouldn't return a reference to the routine's stack space as this space disappears when the routine returns.
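The same two choices for returning newly created data, sketched in C. `Block` and the two `make_block_*` routines are hypothetical names used only for this example.

```c
#include <assert.h>
#include <stdlib.h>

#define BLOCK_CAP 1024

/* Hypothetical large data item. */
typedef struct {
    size_t count;
    double samples[BLOCK_CAP];
} Block;

/* Option 1: build the data in this routine's stack frame and
   return it by value, i.e. hand a copy back to the caller. */
Block make_block_by_value(size_t count, double fill)
{
    assert(count <= BLOCK_CAP);
    Block b = {0};
    b.count = count;
    for (size_t i = 0; i < count; i++) {
        b.samples[i] = fill;
    }
    return b;
}

/* Option 2: build the data on the heap and return a reference.
   Only a pointer travels back; the caller must free() it.
   Returns NULL if the allocation fails. */
Block *make_block_on_heap(size_t count, double fill)
{
    assert(count <= BLOCK_CAP);
    Block *b = calloc(1, sizeof *b);
    if (b == NULL) {
        return NULL;
    }
    b->count = count;
    for (size_t i = 0; i < count; i++) {
        b->samples[i] = fill;
    }
    return b;
}
```

The mistake warned about above would be a third routine that declares a local `Block b` and returns `&b`; that address stops being valid as soon as the routine returns.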
So in summary, references work much faster than copying larger data items and the only way a routine can update (anonymous) data is by getting a reference to it. By anonymous I mean data that is stored in a variable whose name is unknown to the routine.
issues
- Immutable Data
Can this be done completely during parse-time or must there be some action at run-time to ensure the integrity of immutable data? If done at run-time, we lose some, and possibly all, of the speed advantage.
- View-Only Data
How does a language prevent a routine from modifying such data if it is passed by reference?
- Transitive Immutability
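As one point of comparison only, and not a proposal for Euphoria, C handles view-only data at compile time with `const`, and its `const` is not transitive. The `IntList` type and routine names below are made up for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list type: a struct that points at its items. */
typedef struct {
    int *items;
    size_t count;
} IntList;

/* View-only access: the compiler rejects any assignment to the
   struct's own fields through `list`, with no run-time check. */
int sum_view_only(const IntList *list)
{
    assert(list != NULL);
    int total = 0;
    for (size_t i = 0; i < list->count; i++) {
        total += list->items[i];
    }
    /* list->count = 0;   would not compile: *list is const */
    return total;
}

/* const is not transitive in C: the `items` pointer itself is
   protected, but the ints it points at are still writable. */
void zero_first_through_view(const IntList *list)
{
    assert(list != NULL && list->count > 0);
    list->items[0] = 0;   /* compiles, and changes the caller's data */
}
```

So a purely parse-time scheme is possible, but in this form it only protects the top level of the data, which is why transitivity needs its own answer.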
currently
Actually, Euphoria already uses PBR automatically. However, it always ensures that any modifications a routine makes to an argument do not affect any other part of the application. It does this by making a copy of the argument just before changing it, and this copy is never implicitly returned by the routine.
This means that you can call a routine, passing a sequence of any length, and it always takes the same time regardless of how big the sequence is. This is because a reference to the sequence is really passed, and if the routine never changes the argument, no copy is done either. Also, if a routine creates a sequence and returns it, what is really returned is a reference to the new sequence so no further copying is done.
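The share-then-copy-before-change behaviour described above can be sketched in C with a reference count. This is a simplified illustration of the idea, not the interpreter's actual code; the `Seq` type and the `seq_*` names are assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted sequence of ints. */
typedef struct {
    int refs;     /* number of holders sharing this storage */
    size_t len;
    int *data;
} Seq;

/* Create a sequence from `len` ints at `src` (src must be valid
   when len > 0). Returns NULL if allocation fails. */
Seq *seq_new(const int *src, size_t len)
{
    Seq *s = malloc(sizeof *s);
    if (s == NULL) {
        return NULL;
    }
    s->data = malloc(len * sizeof *s->data);
    if (s->data == NULL && len > 0) {
        free(s);
        return NULL;
    }
    if (len > 0) {
        memcpy(s->data, src, len * sizeof *s->data);
    }
    s->refs = 1;
    s->len = len;
    return s;
}

/* Passing an argument: share the storage. Constant time,
   whatever the length of the sequence. */
Seq *seq_share(Seq *s)
{
    s->refs++;
    return s;
}

/* Drop one holder; free the storage when nobody holds it. */
void seq_release(Seq *s)
{
    if (--s->refs == 0) {
        free(s->data);
        free(s);
    }
}

/* Writing: copy first, but only if someone else shares the data.
   Returns the sequence the writer holds afterwards, or NULL if
   the copy could not be allocated (s is then left untouched). */
Seq *seq_set(Seq *s, size_t i, int v)
{
    assert(i < s->len);
    if (s->refs > 1) {
        Seq *copy = seq_new(s->data, s->len);
        if (copy == NULL) {
            return NULL;
        }
        s->refs--;        /* the writer lets go of the shared copy */
        s = copy;
    }
    s->data[i] = v;
    return s;
}
```

A routine that only reads its argument never triggers the copy in `seq_set`, so the call costs the same for a short sequence as for a very long one. The first write is where the copying cost appears.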
However, to enable explicit PBR in order for a routine to modify data that lives outside of the routine, we will have to add some complexity to both the internals and to the syntax, to ensure that Euphoria has enough knowledge about the coder's intentions.
I'm not saying this should be avoided, but just noting that there is no free lunch involved either.
One way to implement it might be to have all variables view-only by default, and to mark those that the coder wishes routines to be able to modify as mutable at declaration time. E.g.
mutable sequence FileName
sequence BaseFile
. . .
GetFile( FileName )            -- Fills in the value of FileName.
BaseFile = filebase(FileName)  -- BaseFile can only be changed by assignment
Anyhow, there is a lot more discussion yet required on this topic.