1. find options
- Posted by SDPringle Mar 15, 2009
- 996 views
It would be useful to be able to pass a compare function to find and match, for cases where the key does not reside at the top level of the structure.
Shawn
2. Re: find options
- Posted by jeremy (admin) Mar 15, 2009
- 987 views
Shawn, I like the idea. Maybe you can add it to the feature requests page on SF.net so we do not forget it. However, I will say that 4.0 enhancements that are not already on the board have to be shut down. We have one more feature to add: the automatic cleaning up of user-defined types.
Jeremy
3. Re: find options
- Posted by SDPringle Mar 15, 2009
- 944 views
What is meant by 'automatic cleaning up' of user defined types?
Shawn
4. Re: find options
- Posted by jimcbrown (admin) Mar 15, 2009
- 970 views
What is meant by 'automatic cleaning up' of user defined types?
Shawn
Basically, a destructor.
A way to ensure a function gets called, so that special types like maps, regexes, etc. get cleaned up automatically even if the user forgets to call regex:free(), map:delete(), and so on. (Otherwise this is a potential memory leak.)
This is being extended to UDTs in general.
5. Re: find options
- Posted by jeremy (admin) Mar 15, 2009
- 965 views
Let's say you have wrapped PCRE. When you call pcre:new("[A-Z]+"), memory is allocated and returned into Euphoria as an atom. Take this example:
include pcre.e as r

-- declared as a function because it returns the match result
function is_proper_name(sequence s)
    r:regex upperRx = r:new("[A-Z][a-z]+ [A-Z][a-z]+")
    return r:match(upperRx, s)
end function
So, memory was allocated for upperRx. I am unsure how much memory each regex structure takes in PCRE, but it could be a sizable amount. In Euphoria, we have garbage collection and do not have to free memory as in C; that's one reason we program in Euphoria and not C. However, in the above example, we have leaked memory. The contents of upperRx can never be accessed again, nor are they freed. The current way of dealing with the above code is this:
include pcre.e as r

function is_proper_name(sequence s)
    r:regex upperRx = r:new("[A-Z][a-z]+ [A-Z][a-z]+")
    integer result = r:match(upperRx, s)
    r:free(upperRx) -- manual cleanup before returning
    return result
end function
Now, the code is not so bad, but we have introduced memory management into Euphoria, which is not nice. We currently have user-defined types. The addition we are speaking of will include a type cleanup method. This is all in design right now, but take this for example:
public type regex(object r)
    -- validate r
end type

public type_cleanup regex(object r)
    if r <> 0 then
        pcre_free(r)
    end if
end type_cleanup
I have no idea about the above syntax; we have not really discussed it yet. But you get the idea. Now, go back up to the very first example. In this case, upperRx was declared with the regex type. When upperRx goes out of scope, Euphoria will now know to call the type_cleanup routine associated with the regex user-defined type.
This can be used in a few places in the standard library already, for instance regular expressions and maps. Both of those currently require you to call either free or delete, which is pretty anti-Euphoria.
Jeremy
6. Re: find options
- Posted by mattlewis (admin) Mar 15, 2009
- 965 views
Let's say you have wrapped PCRE. When you call pcre:new("[A-Z]+"), memory is allocated and returned into Euphoria as an atom. Take this example:
Here is, perhaps, a more common example:
type auto_file( atom af )
    return af >= 0
end type

procedure cleanup_autofile( auto_file af )
    close( af )
end procedure
cleanup_type( auto_file, routine_id("cleanup_autofile"))

function new_autofile( sequence fname, sequence mode )
    auto_file af = open( fname, mode )
    -- tell the interpreter to clean this up for us
    return bless_type( auto_file, af )
end function

procedure main()
    -- go through new_autofile() so the handle is registered for cleanup
    auto_file fn = new_autofile( "readme.txt", "r" )
    sequence line = gets(fn)
    puts( 1, line )
    -- fn gets de-referenced here and cleanup_autofile() is called.
end procedure
main()
Again, consider this to be pseudocode. The syntax for auto-cleanup of UDTs hasn't been decided yet. But the key is that once the auto_file variable goes completely out of scope, its cleanup routine will be called.
In this case, we don't have to remember to close the file handle, but this can work for other types, as well, including data that manually allocates memory.
Matt
7. Re: destructor
- Posted by SDPringle Mar 15, 2009
- 968 views
A destructor system shouldn't need a bless() routine unless you want the interpreter to remember what a value is. In a static system, we need a way to connect the destructor to the type. Add a builtin: procedure register_destructor( integer tid, integer rid )
You can use a routine id of a type. The type gets attached. The parser could put the destructor routines in where variables go out of scope.
So, what will it be? Virtual methods or static ones?
Shawn Pringle
8. Re: destructor
- Posted by mattlewis (admin) Mar 15, 2009
- 960 views
A destructor system shouldn't need a bless() routine unless you want the interpreter to remember what a value is. In a static system, we need a way to connect the destructor to the type. Add a builtin: procedure register_destructor( integer tid, integer rid )
You can use a routine id of a type. The type gets attached. The parser could put the destructor routines in where variables go out of scope.
It's more than out of scope. It's when the reference count drops to zero. This requires it to be figured out at run time, not compile time.
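To illustrate the point about reference counts versus scope, here is a minimal sketch in Python, used only because CPython is also reference counted. The `Handle` class and the `events` list are hypothetical names, and none of this is proposed Euphoria syntax.

```python
events = []

class Handle:
    """Stand-in for a value that has a cleanup routine attached."""
    def __init__(self, name):
        self.name = name

    def __del__(self):
        # CPython runs this when the last reference goes away.
        events.append("cleanup " + self.name)

def make_and_keep(store):
    h = Handle("regex")
    store.append(h)   # a second reference now lives outside this function
    # h goes out of scope on return, but the object is still referenced

store = []
make_and_keep(store)
events.append("after call")
store.clear()         # last reference dropped: cleanup runs here
events.append("after clear")
```

The cleanup is recorded only once `store` lets go of the value, well after the variable that created it has left scope, which is why a purely parse-time scheme would not be enough.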
So, what will it be? Virtual methods or static ones?
I don't know. I suspect that we may want both. It might be possible to automatically attach the destructor information any time we assign to a UDT with a destructor. But there might be cases where this isn't possible, and we would want to be able to attach the destructor manually.
Derek previously said that he had a proposal for this, but I haven't seen it yet, and I haven't thought through it enough to be able to say otherwise. I'm currently working on doing this for built-in data, specifically regexes, which are always created in the back end, so it's not an issue.
Matt
9. Re: destructor
- Posted by DerekParnell (admin) Mar 15, 2009
- 961 views
Derek previously said that he had a proposal for this, but I haven't seen it yet ...
Because getting v4.0 out has a higher priority right now. UDT enhancements will have to wait a bit longer.
10. Re: destructor
- Posted by jeremy (admin) Mar 15, 2009
- 937 views
I'm not nitpicking, but I think that because of map and its emulated PBR, this needs to be part of 4.0. It's not a matter of 4.0 or this; it's both. For internals like regex (the other part of 4.0 that required this type of cleanup), Matt's internal cleanup changes already solve this.
Now, once we get this, I think we should probably change how stack works and make it emulate PBR too. Once we get read PBR, we can remove these emulations from map and stack and remove the need for user cleanups as well on them.
Jeremy
11. Re: destructor
- Posted by mattlewis (admin) Mar 15, 2009
- 943 views
Derek previously said that he had a proposal for this, but I haven't seen it yet ...
Because getting v4.0 out has a higher priority right now. UDT enhancements will have to wait a bit longer.
Wait, I thought we were including this in 4.0, so we could garbage collect, e.g., maps. I just committed code to automatically GC regexes (not the currently built PCRE, though it should be easy enough to adapt what I've done to that, if we end up going that way).
Matt
12. Re: destructor
- Posted by jeremy (admin) Mar 15, 2009
- 944 views
Now, once we get this, I think we should probably change how stack works and make it emulate PBR too. Once we get read PBR, we can remove these emulations from map and stack and remove the need for user cleanups as well on them.
I meant to say "Once we get real PBR" not "read PBR"... I'll be glad when I get the updates done to EUforum so we can edit for errors like that.
Jeremy
13. Re: destructor
- Posted by DerekParnell (admin) Mar 15, 2009
- 923 views
Because getting v4.0 out has a higher priority right now. UDT enhancements will have to wait a bit longer.
Wait, I thought we were including this in 4.0, so we could garbage collect, e.g., maps ...
Oh?! I misunderstood this then.
I thought that GC for maps is not a show stopper, as it can be done right now with a bit of extra work on the coder's part. I did not think that this was a high priority, even though it is an important enhancement. Also, as it is not a v3.1 code breaker, I didn't think it was a must-do-now thing.
I'll send you a note on the dev-list with my thoughts on UDT ctor/dtor functionality.
14. Re: destructor
- Posted by jeremy (admin) Mar 15, 2009
- 936 views
Oh?! I misunderstood this then.
I thought that GC for maps is not a show stopper, as it can be done right now with a bit of extra work on the coder's part. I did not think that this was a high priority, even though it is an important enhancement. Also, as it is not a v3.1 code breaker, I didn't think it was a must-do-now thing.
I think it is a show stopper because going forward, once PBR is in 4.1, the usage of map (and stack) will change, thus we will have a code break in 4.1 because of the introduction of maps into 4.0.
Jeremy
15. Re: destructor
- Posted by DerekParnell (admin) Mar 15, 2009
- 981 views
I think it is a show stopper because going forward, once PBR is in 4.1, the usage of map (and stack) will change, thus we will have a code break in 4.1 because of the introduction of maps into 4.0.
I can't see how the usage will change? Can you give me an example of what you mean?
A typical usage of map would go something like ...
custrec = new() -- Create a new map
. . .
put(custrec, "Name", "Joe Blow")
put(custrec, "Address", "555 High Street")
put(custrec, "Phone", 555675632)
. . .
name = get(custrec, "Name")
. . .
delete(custrec)
And stacks are the same thing.
PBR will not force this usage to change, as far as I can see. The underlying implementation will change but the API shouldn't have to change. What am I overlooking? I know that the delete() call might be superfluous in most cases, but it's not wrong.
16. Re: destructor
- Posted by jeremy (admin) Mar 15, 2009
- 935 views
I think it is a show stopper because going forward, once PBR is in 4.1, the usage of map (and stack) will change, thus we will have a code break in 4.1 because of the introduction of maps into 4.0.
I can't see how the usage will change? Can you give me an example of what you mean?
A typical usage of map would go something like ...
custrec = new() -- Create a new map
. . .
put(custrec, "Name", "Joe Blow")
put(custrec, "Address", "555 High Street")
put(custrec, "Phone", 555675632)
. . .
name = get(custrec, "Name")
. . .
delete(custrec)
And stacks are the same thing.
PBR will not force this usage to change, as far as I can see. The underlying implementation will change but the API shouldn't have to change. What am I overlooking? I know that the delete() call might be superfluous in most cases, but it's not wrong.
I believe we will no longer have to call delete because map/stack will not have to allocate/manage its own item in a sequences table? Thus, custrec will be the actual map sequence, not just a pointer to the internal id of the map structure it's using?
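The handle-into-a-table arrangement described here can be sketched roughly as follows. Python is used purely for illustration; `_slots`, `new_map`, `put`, `get`, `delete_map`, and `live_maps` are hypothetical names and not the actual map API. The point is that the caller only holds an index, so forgetting the index does not release the storage.

```python
_slots = []          # library-owned table holding every live map

def new_map():
    _slots.append({})
    return len(_slots) - 1      # the "map" the caller sees is only an index

def put(handle, key, value):
    _slots[handle][key] = value

def get(handle, key):
    return _slots[handle][key]

def delete_map(handle):
    _slots[handle] = None       # release the storage; the index is retired

def live_maps():
    return sum(1 for s in _slots if s is not None)

def careless():
    m = new_map()
    put(m, "Name", "Joe Blow")
    # m is forgotten on return, but the table still holds the data

def careful():
    m = new_map()
    put(m, "Name", "Joe Blow")
    name = get(m, "Name")
    delete_map(m)
    return name

careless()
leaked = live_maps()            # the forgotten map is still counted
name = careful()
still_live = live_maps()        # only the leaked map remains
```

If the caller held the map value itself rather than an index, ordinary garbage collection would reclaim it and the explicit delete would become optional.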
Jeremy
17. Re: destructor
- Posted by DerekParnell (admin) Mar 15, 2009
- 950 views
I believe we will no longer have to call delete because map/stack will not have to allocate/manage its own item in a sequences table?
Exactly. We will no longer HAVE to call delete(), but doing so is not actually wrong or going to cause a problem.
We need an explicit delete mechanism because sometimes one needs to delete a map/whatever before it goes out of scope.
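One way an explicit delete and an automatic cleanup can coexist is to make the cleanup run at most once, whichever path triggers it. A minimal sketch in Python, assuming nothing about the eventual Euphoria design; the `Map` class and the `released` list are hypothetical names:

```python
import weakref

released = []

class Map:
    def __init__(self, name):
        self.data = {}
        # weakref.finalize runs the callback at most once: either when
        # called explicitly or when the object is garbage collected.
        self._finalizer = weakref.finalize(self, released.append, name)

    def delete(self):
        self._finalizer()       # explicit early cleanup; safe to repeat

early = Map("early")
early.delete()                  # cleaned up before going out of scope
early.delete()                  # second call is a no-op

late = Map("late")
del late                        # no explicit delete; cleanup runs on collection
```

Under this scheme a redundant delete() is harmless, and code that never calls it still gets cleaned up.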
All I'm saying is that PBR and UDT dtors are not essential to having v4.0 ship.
18. Re: destructor
- Posted by jeremy (admin) Mar 15, 2009
- 933 views
I know that the delete() call might be superfluous in most cases, but it's not wrong.
Guess I should have read better.
Although it will not force that change, why introduce memory management into Euphoria for just one version? I know you have suggested it is the same as open/close, but I disagree (I may be the only one, I don't know). The concept of opening and closing a file is everywhere and anyone understands it, even non-programmers, who open a file in Microsoft Word and often close that file when they are done.
Creating a new map (a structure in memory) and then having to close that memory structure is memory management, whatever you call it. I am not aware of any other language that requires you to close a hash table or a map (with the exception of those that require manual memory management, that is).
For those reasons, I think we should avoid ever introducing memory management requirements into the standard library. Matt has alleviated this problem for regular expressions, and I think we should go the extra step to alleviate it for map and stack as well. We could have just renamed re:free() to re:close(), but that doesn't make sense either, as other languages do not close a regular expression.
My 2 cents.
Jeremy
19. Re: destructor
- Posted by jeremy (admin) Mar 15, 2009
- 934 views
- Last edited Mar 16, 2009
We need an explicit delete mechanism because sometimes one needs to delete a map/whatever before it goes out of scope.
Hm, I'm not sure I've ever had to do that in any other language. I simply put the map in the scope where it's valid. I do know you may sometimes need to clear a map, but that's not the same as deleting its very existence. Further, one should not have to clear/delete a map when done using it in 4.0 and then no longer have to in 4.1. I think teaching users this just for 4.0 is not wise. Euphoria takes care of its own memory and does not require the user to do so. If we do not have UDTs in 4.0, then this is no longer the case. Users will then have to learn about memory management, and for just one version? And then, when 4.1 comes out, not have to do it? (OK, they could still do it if they wanted, but in 4.1 it would be the rare exception, not the norm like you are suggesting it has to be in 4.0.)
Personally, if it came down to UDTs not being in 4.0, I think it would be wiser not to ship map/stack and to introduce them in 4.1, when UDTs or PBR are in Euphoria.
All I'm saying is that PBR and UDT dtors are not essential to having v4.0 ship.
I'm not yet convinced, but I'm not the sole decision maker on these things. Maybe we need to discuss it on the dev list among the other developers and vote on this?
Jeremy