1. type checking in EUPHORIA

Occasionally, while looking at software written in EUPHORIA, I have seen a kind of programming error which a parser can catch. One example:

-- I actually saw this once 
atom p = allocate( 50, routine_id( "free" ) ) 

Now, a good programmer will type allocate into the JavaScript search engine, see that the second parameter should be a boolean value of true or false, and never make this error. The same argument says that good programmers never have dangling pointers or memory leaks. So, let's forget about that argument.

You will get a runtime exception if routine_id( "free" ) evaluates to something other than 0 or 1. This is good.

Now, what if we have another set of permissions like,

constant READ_ONLY = 0, READ_WRITE=1, READ_WRITE_EXECUTE = 3 
atom p = allocate(40, READ_WRITE) 

Declare the second parameter as a boolean type, and run-time type checking will still not catch this, because READ_WRITE is 1, which is a valid boolean value. In 4.0, there is nothing yet to catch this at parse time. In 4.1 the interpreter could issue a warning if you use enumerated types rather than constants. Should it do more than that?
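To make the gap concrete, here is a minimal sketch. It assumes a user-defined type named my_boolean (a hypothetical stand-in for whatever boolean type the library declares) and reuses the permission constants from above. A type can be called like a function, so the check can be shown without triggering a type-check failure.

```euphoria
-- Sketch only: my_boolean is a hypothetical stand-in for a library boolean type.
type my_boolean(object x)
    if integer(x) then
        return x = 0 or x = 1
    end if
    return 0
end type

constant READ_ONLY = 0, READ_WRITE = 1, READ_WRITE_EXECUTE = 3

-- Calling a type as a function returns 1 (accepted) or 0 (rejected).
printf(1, "READ_WRITE accepted as boolean: %d\n", my_boolean(READ_WRITE))                  -- prints 1
printf(1, "READ_WRITE_EXECUTE accepted as boolean: %d\n", my_boolean(READ_WRITE_EXECUTE))  -- prints 0
```

Only the value 3 is rejected; READ_ONLY and READ_WRITE pass the run-time check even though they are the wrong kind of constant.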

Consider this code:

enum type boolean F=0, T end type 
enum type permission READ_ONLY=0, READ_WRITE=1 end type 
-- function allocate(integer c, boolean atomatic = F) 
atom p = allocate(40, READ_WRITE) -- will issue a warning 
 
p = allocate(40, 0) -- won't issue a warning.  Should it? 
 
boolean garbage_collect = T 
 
p = allocate(40, garbage_collect) -- good -- no warning 
 
integer f = 1 
 
p = allocate(40, f) -- no warning.  should there be a warning? 
 
 
p = allocate(4, 0=1) -- Expressions should just be let through 
p = allocate(4, 1-1) -- even if they are bad. 

I am leaning towards the idea that when an argument is a single literal or variable rather than a more complex expression, we should only accept ones that belong to that enumerated type.

Shawn Pringle


2. Re: type checking in EUPHORIA

SDPringle said...

I am leaning towards the idea that when an argument is a single literal or variable rather than a more complex expression, we should only accept ones that belong to that enumerated type.

Shawn Pringle

Currently in the alternative_literals branch, if you use something that is not a value from an enumerated type, checking is skipped. Maybe that's not the best idea.

Shawn Pringle


3. Re: type checking in EUPHORIA

SDPringle said...
SDPringle said...

I am leaning towards the idea that when an argument is a single literal or variable rather than a more complex expression, we should only accept ones that belong to that enumerated type.

Shawn Pringle

Currently in the alternative_literals branch, if you use something that is not a value from an enumerated type, checking is skipped. Maybe that's not the best idea.

The major drawback that I can see here is that it would prevent you from directly passing a value returned from a function (even if the function returns a correct value), or from storing the enum in a sequence, pulling it out later, and passing it to a function.

Matt


4. Re: type checking in EUPHORIA

mattlewis said...

The major drawback that I can see here is that it would prevent you from directly passing a value returned from a function (even if the function returns a correct value), or from storing the enum in a sequence, pulling it out later, and passing it to a function.

Matt

If an argument to a function or a comparison consists of more than a single token, the checking is skipped. The single token must be a constant, a literal, or a variable. I did not make it clear that this is something I was not thinking of changing.

Suppose you have two enumerated types, 'boolean' and 'weekday', and the boolean type consists of the constants 'false' (=0) and 'true' (=1). Then assigning the value 'false' to a weekday using the token false will provoke a warning. However, if you assign the same value with a token that does not belong to any enumerated type, like '0', there is no warning.

weekday wd = Sunday -- okay! 
 
sequence s = {Monday, Friday} -- type literal information is lost inside this sequence. 
wd = s[1] -- No check... more than one token. 
s = { 0, 3 } 
wd = s[1] -- okay! No check... more than one token. 
wd = false -- warning 
wd = platform() -- no check... 
 
wd = 0 -- no warning.... should there be? 
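The single-token rule above could be sketched roughly as follows. This only illustrates the decision, not the code in the alternative_literals branch; the token kinds and the should_check function are hypothetical names, and an expression is modeled as a sequence of {kind, text} pairs.

```euphoria
-- Hypothetical token kinds; these names are not from the Euphoria front end.
enum TK_LITERAL, TK_VARIABLE, TK_CONSTANT, TK_OPERATOR

-- Returns 1 when the enum-membership check would run, 0 when it is skipped.
function should_check(sequence tokens)
    if length(tokens) != 1 then
        return 0  -- more than one token: checking is skipped
    end if
    -- A single token is checked only if it is a literal, variable or constant.
    return find(tokens[1][1], {TK_LITERAL, TK_VARIABLE, TK_CONSTANT}) != 0
end function

-- wd = false : a single constant token, so the check runs
? should_check({{TK_CONSTANT, "false"}})  -- 1

-- wd = s[1] : four tokens, so the check is skipped
sequence indexed = {{TK_VARIABLE, "s"}, {TK_OPERATOR, "["},
                    {TK_LITERAL, "1"}, {TK_OPERATOR, "]"}}
? should_check(indexed)  -- 0
```

Whether a checked literal such as 0 should then produce a warning is the open question; the sketch only decides when a check is attempted at all.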

Shawn Pringle


5. Re: type checking in EUPHORIA

SDPringle said...

Occasionally, while looking at software written in EUPHORIA I have seen a kind of programming error which a parser can catch.

I agree that the parser should be able to catch more datatype usage errors than it currently does. My take on this is that when the parser can know the datatype of both the source expression and the target of that expression, it should issue a warning if the datatypes are incompatible. This would be a parse-time warning and not a run-time warning. Conversely, if the parser does not absolutely know the datatypes at parse-time, then it should behave as it does now.

This extends beyond enumerated types to also include native types and user-defined types.

To enhance this behaviour, we may have to look at letting the parser know the expected datatype of a function's return value. Currently, from the point of view of the parser, a function always returns an object.

-- example  
function somefunc(integer a)(sequence) 
 ... 
  return some_seq 
end function 

6. Re: type checking in EUPHORIA

DerekParnell said...
SDPringle said...

Occasionally, while looking at software written in EUPHORIA I have seen a kind of programming error which a parser can catch.

I agree that the parser should be able to catch more datatype usage errors than it currently does. My take on this is that when the parser can know the datatype of both the source expression and the target of that expression, it should issue a warning if the datatypes are incompatible. This would be a parse-time warning and not a run-time warning. Conversely, if the parser does not absolutely know the datatypes at parse-time, then it should behave as it does now.

I would go further and throw an error. We do this already in some cases, mostly when we end up inlining something, I think.

DerekParnell said...

This extends beyond enumerated types to also include native types and user-defined types.

To enhance this behaviour, we may have to look at letting the parser know the expected datatype of a function's return value. Currently, from the point of view of the parser, a function always returns an object.

-- example  
function somefunc(integer a)(sequence) 
 ... 
  return some_seq 
end function 

Yes, this would be a nice enhancement. Maybe something like:

-- example  
function somefunc(integer a) as sequence 
 ... 
  return some_seq 
end function 

I did something like this in ooeu in order to improve polymorphism.

Matt


7. criticism of alternative literals

What has been bothering me about this enhancement is that it doesn't help with parameters that are built from bits combined with or_bits or or_all. It cannot. For example, in regex we cannot do anything better than the current option_spec even in this branch, because the option is either a sequence or a combination of flags via or_bits. Even if a function only accepts numbers, an enumerated type wouldn't work, because the enumerated type's values would only be single bits and we need to be able to pass arbitrary combinations of bits.
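A small sketch of the problem, assuming the enum type declaration from Euphoria 4.0; demo_bit and its members are hypothetical names, not the real regex flags:

```euphoria
-- Sketch only: demo_bit and BIT_A..BIT_C are hypothetical names.
enum type demo_bit BIT_A = 1, BIT_B = 2, BIT_C = 4 end type

integer combined = or_bits(BIT_A, BIT_B)  -- 3

-- A single flag is a member of the enumerated type...
if demo_bit(BIT_B) then
    puts(1, "BIT_B is a member of demo_bit\n")
end if

-- ...but a combination of two flags is not.
if not demo_bit(combined) then
    puts(1, "or_bits(BIT_A, BIT_B) = 3 is not a member of demo_bit\n")
end if
```

So a parameter declared as demo_bit would reject exactly the combined values that an options parameter needs to accept.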

So, I have been thinking that rather than only a type, it would be good if we could also specify that we only want literals from a given enumerated type.

Take std/regex.e. Change DEFAULT...STRING_OFFSETS from constants to an enumerated type called, say, regex_bit. Now, regex_bit cannot be used as a type itself, but allow the user to specify regex_bit* to mean any or_bits or & combination of regex_bit values. Then we have something much more powerful. The parser could enforce that the constants and variables belong to the regex_bit and/or regex_bit* types.

-- std/regex.e 
public function new(regex re, option_spec options=DEFAULT as regex_bit*) 
... 
end function 
 
include std/regex.e 
constant capitalized = regex:new("[A-Z][a-z]*", 15) -- error because it is a numeric literal 
constant capitalized = regex:new("[A-Z][a-z]*", routine_id( "free" ) )  -- error: not a regex_bit value 
constant capitalized = regex:new("[A-Z][a-z]*", or_bits(LINUX, regex:DEFAULT)) -- error: LINUX belongs to the wrong enumerated type 

Shawn Pringle
