1. length() of an atom
- Posted by DerekParnell (admin) Jul 26, 2010
- 1804 views
I'm proposing that getting the length() of an atom should return 1. Currently it crashes the program.
This would simplify some algorithms because it avoids having to test for atom/sequence before taking the length. In those cases, we just use 1 anyway for atoms.
My thinking is that if appending an atom to a sequence increases the sequence length by 1, then it must follow that the length of an atom is 1.
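The appending argument can be checked with a few lines. This is only a minimal sketch; the variable names `s` and `a` are made up for illustration and are not from any existing code.

```euphoria
-- Appending or concatenating one atom grows a sequence by exactly 1.
sequence s = {1, 2, 3}
atom a = 9999

? length(s)              -- 3
? length(append(s, a))   -- 4
? length(s & a)          -- 4
```

Both forms add exactly one element, which is the behaviour the proposal wants length() to mirror for a bare atom.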
Can anyone give me some convincing reasons to not make this change for Eu4?
2. Re: length() of an atom
- Posted by Salix Jul 26, 2010
- 1797 views
? equal(length(25),length({25}))
Strange. Is it true? I mean would you consider it equal?
But I could leave with returning 1, 0, -1 or anything... :->
Best regards,
Salix
3. Re: length() of an atom
- Posted by DerekParnell (admin) Jul 26, 2010
- 1777 views
? equal(length(25),length({25}))
Strange. Is it true? I mean would you consider it equal?
Yes. The 'it' being that they have the same length.
But I could leave with returning 1, 0, -1 or anything... :->
I have no idea what you mean by that statement.
4. Re: length() of an atom
- Posted by Salix Jul 26, 2010
- 1801 views
Yes. The 'it' being that they have the same length.
I understand that it is the goal of the proposal. It is just strange for me to have a length. Probably I am too much influenced by the current error message: "length of an atom is not defined". I get it regularly...
But I could leave with returning 1, 0, -1 or anything... :->
I have no idea what you mean by that statement.
(Egh... "live".) All I wanted to say is that I can accept your proposal. But I could also accept having 0 or -1 for the length(atom) call.
Rgds,
Salix
5. Re: length() of an atom
- Posted by _tom (admin) Jul 26, 2010
- 1785 views
This could "shake up" the language since it goes to a fundamental design from the origins of Euphoria. (Whether this is good or bad, is another question.)
Some initial ideas that came to mind. (But I need more coffee before I fully understand how changing length() will alter the language.)
a fundamental Euphoria building block
length( 23 ) ---> undefined
length( { 23 } ) ---> 1
fundamental to the definition of atom vs sequence
will patterns be always true
must follow "patterns"
length( { 5, 6 } ) ---> 2
length( { 5 } ) ---> 1
length( {} ) ---> 0
length( 5 ) ---> undefined
length( 5 ) ==> 1 does not fit any "pattern", and is therefore out of place
a strength of Euphoria is that these kinds of patterns are always followed
'true length' is used in many routines
many Euphoria operations depend on objects being "the same length"
simple meaning of $
seq( 1 .. $ ) vs. seq( 1 .. length(seq) )
the "$" has a simple meaning
- where can this be used?
is this of value in programming?
seq( 1 .. length(5) )
automagic dangers
automagic conversions are common in hard to use languages
- automagic not a Euphoria "feature"
- Euphoria simplicity is due in part because there are no coercions, automatic type changes, overloaded operators and routines (we like simple !)
- introducing automagic features would change the entire character of the language
"behaves as if length"
from the documentation: "atoms are of length 1 for this purpose"
- we do have the idea that atoms have a length dimension "of sorts"
the "behaves" may seem special to someone new to Euphoria, but the design patterns are consistent
{1, 2, 3 } & 9999 ---- > { 1, 2, 3, 9999 }
the 9999 "behaves" as if it "should" have a length of 1 but, that is a that appears in the future and does not exist in the present
question about 'internals'
- the description of length() in the documentation has a comment that the length is stored internally for fast access
- will an atom need to store data that it has a length of 1 ?
is the solution a new routine ?
- let's call it "size()"
length( 2 ) ---> undefined
size( 2 ) ---> 1
this way length() and size() are closely related, but sensitive to the difference between atom and sequence
even if not adopted as a routine, would the documentation be improved if "size" was used as a descriptor for some Euphoria operations instead of the way "length" is used now?
6. Re: length() of an atom
- Posted by mattlewis (admin) Jul 26, 2010
- 1755 views
Yes. The 'it' being that they have the same length.
I understand that it is the goal of the proposal. It is just strange for me to have a length. Probably I am too much influenced by the current error message: "length of an atom is not defined". I get it regularly...
This is a good point, and I had a similar reaction when I first considered this.
In a case where you really expected or needed a sequence, the bug in the code would be a little bit more difficult to find. It might be discovered shortly after, when something else expected a sequence. Or it could lead to silently corrupting data. Of course, all of this is theoretical, and doesn't mean the change is wrong. Just something to be considered.
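A hypothetical illustration of the "silent" case, assuming length(atom) returns 1 as proposed. The routine `total_chars` and its data are invented for this sketch.

```euphoria
-- Hypothetical routine: total number of characters in a list of names.
function total_chars(sequence names)
    integer n = 0
    for i = 1 to length(names) do
        -- If names[i] is an atom by mistake, the proposed length()
        -- counts it as 1 instead of stopping the program here.
        n += length(names[i])
    end for
    return n
end function

-- 'c' is an atom that was meant to be the sequence "c".
? total_chars({"ab", 'c', "def"})
```

With the proposed behaviour this would print 6 and carry on. With the old behaviour it would stop with "length of an atom is not defined" at the point of the mistake.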
Matt
7. Re: length() of an atom
- Posted by mattlewis (admin) Jul 26, 2010
- 1745 views
automagic dangers
automagic conversions are common in hard to use languages
- automagic not a Euphoria "feature"
- Euphoria simplicity is due in part because there are no coercions, automatic type changes, overloaded operators and routines (we like simple !)
- introducing automagic features would change the entire character of the language
This is not strictly true. The implementation of atoms has a bit of automagic in it. However, it's fairly tame compared to what some languages do.
While I haven't seen Derek's solution, I'm sure the answer is to replace the current call to error handling with a hardcoded return 1; or similar. That point in the documentation is to make clear that length() is always O(1), and doesn't do something like count the number of elements every time you call it.
is the solution a new routine ?
- let's call it "size()"
length( 2 ) ---> undefined
size( 2 ) ---> 1
this way length() and size() are closely related, but sensitive to the difference between atom and sequence
even if not adopted as a routine, would the documentation be improved if "size" was used as a descriptor for some Euphoria operations instead of the way "length" is used now?
This strikes me as an interesting alternative, since it leaves existing code unaffected, but allows the optimization that led to this proposal.
Matt
8. Re: length() of an atom
- Posted by petelomax Jul 26, 2010
- 1753 views
I was about to suggest lengtha().
Pete
9. Re: length() of an atom
- Posted by DerekParnell (admin) Jul 28, 2010
- 1671 views
This could "shake up" the language since it goes to a fundamental design from the origins of Euphoria. (Whether this is good or bad, is another question.)
Really? "shake up"? Isn't that a bit overdramatic? Its effect is microscopic. The only people affected by this idea are those who rely on crashing an application when getting the length of an atom - and that has got to be a tiny, tiny number of Euphoria developers.
And I'd argue if one is doing that then they really have a bad approach to the language and code design in general.
Some initial ideas that came to mind. (But I need more coffee before I fully understand how changing length() will alter the language.)
a fundamental Euphoria building block
length( 23 ) ---> undefined
length( { 23 } ) ---> 1
fundamental to the definition of atom vs sequence
How is it a fundamental difference between atoms and sequences when the language already deems that atoms have a length of one when it comes to some operations?
The "length()" function is only "entrenched" with its current semantics because that's what Robert defined for it. It is no more "entrenched" than other changes we have made from v3 to v4.
It's as useful for finding mistakes in your code as a divide by zero is. Both situations should be handled more gracefully than relying on these to crash the application.
Why is length( 5 ) - part of a pattern? What pattern?
upper( { 97, 97 } ) ---> { 65, 65 }
upper( { 97 } ) ---> { 65 }
upper( {} ) ---> {}
upper( 97 ) ---> 65
Hmmm... not much of pattern there.
Huh? "true length" as opposed to what? "false length"?
Sure, there are Euphoria operations that depend on sequences being the same length, but these operations are by their nature SEQUENCE operations and atoms have nothing to do with it.
eg. {1,2,3} + {4,5,6} --> {5,7,9}
Both sequences must be the same length, but what has that got to do with the length() function? Nothing at all.
Or are you suggesting that someone writes a routine that takes 'object' parameters, uses the length() function on those parameters, and relies on it crashing if a parameter is an atom? This is a bad design. If the parameters must be sequences, then either use a 'sequence' argument signature or test for sequence() at run time.
I don't get what you are saying. The '$' token only means 'length(X)' when it appears inside a subscript of the sequence X. Nothing about that changes if length(atom) is defined as 1.
The example ...
seq( 1 .. length(5) )
Why would someone do that? It would be the same as saying ...
seq( 1 .. 1 )
automagic dangers
automagic conversions are common in hard to use languages
- automagic not a Euphoria "feature"
- Euphoria simplicity is due in part because there are no coercions, automatic type changes, overloaded operators and routines (we like simple !)
- introducing automagic features would change the entire character of the language
Actually, Euphoria currently "automagically" converts from integer to atom and vice versa all the time without you having to do anything.
And how on Earth is defining length(atom) as 1 making things not simple? In fact, it actually simplifies some algorithms.
Euphoria already has overloaded operators. The '&' (concatenation) operator accepts different data types. The arithmetic operators accept different data types, etc ... The puts() function accepts different data types. I can go on....
We already have overloading in Euphoria. "automagic" already happens in Euphoria - so this change will not be introducing it.
"behaves as if length"
from the documentation: "atoms are of length 1 for this purpose"
- we do have the idea that atoms have a length dimension "of sorts"
the "behaves" may seem special to someone new to Euphoria, but the design patterns are consistent
{1, 2, 3 } & 9999 ---- > { 1, 2, 3, 9999 }
the 9999 "behaves" as if it "should" have a length of 1 but, that is a that appears in the future and does not exist in the present
So ...? That actually sounds like a very good reason to formalize the 'apparent' length of an atom.
Not at all. I implemented this idea in my local copy of Euphoria about a week ago and everything still works fine. Currently, Euphoria already tests the operand of length() to see if it's a sequence; if so, it fetches the internal length data, and if not, it crashes. My change just replaces the 'crash' with a 'return 1' instead. And with the translator, some 'length()' expressions can get translated as a literal 1 without any runtime overhead.
is the solution a new routine ?
- let's call it "size()"
length( 2 ) ---> undefined
size( 2 ) ---> 1
this way length() and size() are closely related, but sensitive to the difference between atom and sequence
even if not adopted as a routine, would the documentation be improved if "size" was used as a descriptor for some Euphoria operations instead of the way "length" is used now?
I also thought this but it seems contrived and adds another predefined (built-in) function when an existing function can do the job without breaking code or really causing anyone grief.
In summary, I'm not convinced that length(atom) = 1 is a bad idea yet.
10. Re: length() of an atom
- Posted by euphoric (admin) Jul 28, 2010
- 1708 views
What if you had length() return -1 if it was given an atom?
11. Re: length() of an atom
- Posted by jimcbrown (admin) Jul 28, 2010
- 1673 views
What if you had length() return -1 if it was given an atom?
I guess doing abs(length(x)) would be a compromise, but it'd be a pain to have to include std/math.e for what should be a builtin operation...
If you really needed old code to crash, you could always do this:
function length(object x)
    if not sequence(x) then
        crash("TODO: add eu-lint")
    end if
    return eu:length(x)
end function
12. Re: length() of an atom
- Posted by euphoric (admin) Jul 28, 2010
- 1658 views
What if you had length() return -1 if it was given an atom?
I guess doing abs(length(x)) would be a compromise, but it'd be a pain to have to include std/math.e for what should be a builtin operation...
It might be more painful to have to go through all your code and insert a
if sequence( x ) then
above every
if length( x ) then
or
y = length( x )
Measuring the length of an atom is like measuring the depth of a square. A square has no depth so you can't measure it, nor will any value be accurate. It is undefined.
If you really needed old code to crash, you could always do this:
function length(object x)
    if not sequence(x) then
        crash("TODO: add eu-lint")
    end if
    return eu:length(x)
end function
How convenient. So now I have to have my own function to do what Euphoria did for me before.
I want to know in what case(s) length(atom) returning 1 is a good thing.
Also, I have code that does this:
for t=1 to length( ftxt ) do
    if ftxt[t] = '#' then
        ...
ftxt should be a sequence, and if it's not, there's a bug. Now you're going to make me test ftxt first? I should probably do that anyway. heh. :)
length() returns a property (number of elements) of a variable. atoms don't have a length property, so it's inconsistent.
Like Derek said, length() is part of a group of operations which "are by their nature SEQUENCE operations and atoms have nothing to do with it."
Thank you, Derek. :D
It's like getting the "reverse" of an atom. Makes no sense! Yes, we nicely return the atom for some inexplicable reason instead of making the programmer provide a sequence. That's a quirk of the language I guess.
I don't mind quirky behaviors. But don't pass this off as consistent or logical. They're programmer's helpers, at best.
13. Re: length() of an atom
- Posted by jimcbrown (admin) Jul 28, 2010
- 1633 views
What if you had length() return -1 if it was given an atom?
I guess doing abs(length(x)) would be a compromise, but it'd be a pain to have to include std/math.e for what should be a builtin operation...
It might be more painful to have to go through all your code and insert a
if sequence( x ) then
above every
if length( x ) then
or
y = length( x )
But as this is already the case anyways, to avoid crashing, we lose nothing here.
If you really needed old code to crash, you could always do this:
function length(object x)
    if not sequence(x) then
        crash("TODO: add eu-lint")
    end if
    return eu:length(x)
end function
How convenient. So now I have to have my own function to do what Euphoria did for me before.
Why do you need the code to _CRASH_ ?
Also, I have code that does this:
for t=1 to length( ftxt ) do
    if ftxt[t] = '#' then
        ...
ftxt should be a sequence, and if it's not, there's a bug. Now you're going to make me test ftxt first? I should probably do that anyway. heh. :)
It will still _CRASH_ ...
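A minimal sketch of why, assuming length(atom) returns 1 as proposed. The value 35 and the "found" message are invented for illustration.

```euphoria
-- Bug: ftxt was meant to hold a sequence of characters.
object ftxt = 35

-- With length(atom) = 1 the loop body runs once, with t = 1 ...
for t = 1 to length(ftxt) do
    -- ... but subscripting an atom is still a run-time error,
    -- so the program stops here rather than at length().
    if ftxt[t] = '#' then
        puts(1, "found\n")
    end if
end for
```

The failure only moves from the length() call to the first subscript, one line later.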
length() returns a property (number of elements) of a variable. atoms don't have a length property, so it's inconsistent.
Like Derek said, length() is part of a group of operations which "are by their nature SEQUENCE operations and atoms have nothing to do with it."
Thank you, Derek. :D
It's like getting the "reverse" of an atom. Makes no sense! Yes, we nicely return the atom for some inexplicable reason instead of making the programmer provide a sequence. That's a quirk of the language I guess.
Measuring the length of an atom is like measuring the depth of a square. A square has no depth so you can't measure it, nor will any value be accurate. It is undefined.
I fully agree with you here. From a theoretical viewpoint, a length should only apply to sequences (and similar objects such as linked lists, arrays, etc).
But you put it best below...
I don't mind quirky behaviors. But don't pass this off as consistent or logical. They're programmer's helpers, at best.
Agreed, and I feel the use as a programmer's helper here significantly outweighs the desire to adhere to theoretical dogma.
I want to know in what case(s) length(atom) returning 1 is a good thing.
To borrow Jeremy's example, a function that takes a Euphoria object and serializes it into a memory block for use by a C function.
15. Re: length() of an atom
- Posted by ArthurCrump Jul 28, 2010
- 1637 views
If you want a function which returns the length of a sequence or 1 if the parameter is an atom it is possible to use:
result = length( {} & X ) -- or: result = length( X & {} )
Probably inefficient, but if an atom is concatenated to a sequence, it extends the sequence by 1 element, so an atom could be said to have a natural length of 1.
Also, if an empty sequence is concatenated to an object X, it leaves it unaltered if X is a sequence but would convert an atom into a sequence with one element.
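A short sketch of the concatenation trick described above; the variable `x` and the sample values are made up for illustration.

```euphoria
-- Concatenating with {} turns an atom into a one-element sequence
-- and leaves a sequence unchanged.
object x = 5

? length({} & x)               -- 1
? length({} & {5, 6})          -- 2
? equal({} & x, {5})           -- 1 (true)
? equal({} & {5, 6}, {5, 6})   -- 1 (true)
```

This works whether or not length() itself accepts atoms, at the cost of building a temporary sequence.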
Several of the standard routines which accept an atom or a sequence begin by converting the atom to a sequence with one element.
If length(Atom) did return an integer, the only existing programs which would be affected are those which fail.
By now you will have probably gathered that I am in favour of the proposal.
I suppose it would be possible to define a second parameter for the length function to specify the value to be returned for an atom, but how could the default be specified to cause a failure if it is used?
16. Re: length() of an atom
- Posted by DerekParnell (admin) Jul 28, 2010
- 1620 views
What if you had length() return -1 if it was given an atom?
Because that defeats the purpose of my proposal. I want to make coding simpler for people so returning 1 is much more user-friendly than -1.
Also, what is the unit-of-measure returned by length()? I'm suggesting it is 'elements'.
{x,y} has 2 elements involved
{x} has 1 element involved
{} has zero elements involved
x has 1 element involved
17. Re: length() of an atom
- Posted by euphoric (admin) Jul 28, 2010
- 1632 views
I want to make coding simpler for people so returning 1 is much more user-friendly than -1.
Also, what is the unit-of-measure returned by length()? I'm suggesting it is 'elements'.
{x,y} has 2 elements involved
{x} has 1 element involved
{} has zero elements involved
x has 1 element involved
Elements only exist in the context of a sequence. No sequence, no elements. That last 'x' is not an element.
Regardless, I don't think this change will hurt much, if at all. So have at it. :)
But I'm curious, how does returning 1 for length(atom) make coding simpler and more user-friendly? What case did you discover works better with length(atom) returning 1?
18. Re: length() of an atom
- Posted by DerekParnell (admin) Jul 28, 2010
- 1651 views
Elements only exist in the context of a sequence. No sequence, no elements. That last 'x' is not an element.
Does an egg exist outside of its carton?
Regardless, I don't think this change will hurt much, if at all. So have at it. :)
Awwww ... don't give in so soon
But I'm curious, how does returning 1 for length(atom) make coding simpler and more user-friendly? What case did you discover works better with length(atom) returning 1?
---
r = x[1 .. i] & q & x[i+1 .. $]
if atom(q) then
    n += 1
else
    n += length(q)
end if
---
but now I can code ...
---
r = x[1 .. i] & q & x[i+1 .. $]
n += length(q)
---
19. Re: length() of an atom
- Posted by mattlewis (admin) Jul 28, 2010
- 1660 views
but now I can code ...
---
r = x[1 .. i] & q & x[i+1 .. $]
n += length(q)
---
You should be using splice():
r = splice( x, q, i + 1 )
Especially in a loop, this makes a huge difference in performance (likewise for replace(), insert() and remove()).
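A small sketch comparing the two forms, assuming splice() accepts either an atom or a sequence as the inserted object. The sample values are invented, and the include is there only in case splice() is not a built-in in the version being used.

```euphoria
include std/sequence.e   -- assumed harmless if splice() is built in

sequence x = {10, 20, 30, 40}
integer i = 2
sequence r1, r2

-- Inserting a sequence after position i.
object q = {7, 8}
r1 = x[1 .. i] & q & x[i+1 .. $]
r2 = splice(x, q, i + 1)
? equal(r1, r2)   -- expected: 1

-- Inserting an atom after position i.
q = 7
r1 = x[1 .. i] & q & x[i+1 .. $]
r2 = splice(x, q, i + 1)
? equal(r1, r2)   -- expected: 1
```

Both forms should give the same result; the difference is in how much copying happens along the way.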
Matt
20. Re: length() of an atom
- Posted by Vinoba Sep 22, 2010
- 1578 views
Here is what Kenneth Iverson defined for scalar values and vectors. In APL, a scalar is somewhat similar to an atom and is treated as a one-element vector when an operation is performed that might make a vector. ⍝ is the comment character; I have added // after it to make non-APLers comfortable. APL has no precedence; it executes from right to left. ⍴ is a monadic function for shape and ← is used for assignment, the same as = in many languages.
var1 ← 5 ⍝ // Assign a single value 5 to a variable called var1
var1 ⍝ // show what is in var1
5
⎕ ← nshape ← ⍴ var1 ⍝ // shape of var1 assigned to nshape and show - result is nothing
⍴ nshape ⍝ // nshape is actually an empty vector with shape of 0
0
⍴ vshape ← var2 ← var1 , 12 18 43 ⍝ // , is a join function in APL
4
var2 ⍝ // var2 is a join of var1 and a three-element sequence. shape is 4
5 12 18 43
21. Re: length() of an atom
- Posted by petelomax Sep 23, 2010
- 1649 views
That really didn't make a whole lot of sense to me!
Anyway, as soon as this thread reappeared, I had two thoughts:
To keep things in order, length(atom) should return {}.
Well, it made me chuckle anyway, of course I'm joking. However, the other solution is to code (the equivalent of)
function length(object x, object default_result={})
    if sequence(x) then
        return eu:length(x)
    elsif atom(default_result) then
        return default_result
    else
        crash_and_burn()
    end if
end function
Then of course you can simply code length(5,1) to get your 1 rather than crash, and no legacy code whatsoever will be affected at all (and I hope not even performancewise), nor will it affect anyone that disagrees with changing length().
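A usage sketch of the same idea. To avoid redefining the built-in, it uses a hypothetical name `length_or`, and crash() from std/error.e stands in for crash_and_burn(); neither choice comes from the post above.

```euphoria
include std/error.e

-- Hypothetical stand-in for the two-argument length() described above.
function length_or(object x, object default_result = {})
    if sequence(x) then
        return length(x)
    elsif atom(default_result) then
        return default_result      -- caller supplied a fallback value
    else
        crash("length_or(): atom passed and no default given")
    end if
end function

? length_or("abc")    -- 3
? length_or(5, 1)     -- 1
-- length_or(5) with no default would stop the program via crash().
```

The {} default works as a "no fallback given" marker because a real fallback length is always an atom.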
Regards, Pete
22. Re: length() of an atom
- Posted by DerekParnell (admin) Sep 23, 2010
- 1638 views
That really didn't make a whole lot of sense to me!
Show me some existing application that depends on having length(atom) crash the application.
Give me some example code that demonstrates a performance hit with the new length() feature.
23. Re: length() of an atom
- Posted by jimcbrown (admin) Sep 23, 2010
- 1540 views
Show me some existing application that depends on having length(atom) crash the application.
I have quite a few of those ... but I can easily rewrite almost all of them to use crash() in std/error.e instead.
I have a really obscure application that actually relies on this behavior (just one), but it's really old and won't run under Euphoria 2.3 .. I'd hate to see a new feature held back because of just one single application.
24. Re: length() of an atom
- Posted by irv Sep 23, 2010
- 1515 views
Well, as a result of all this discussion, I went and wrote a 'program':
atom a = 7
? a
? length(a)
What does it print?
7
1
So, it looks like it's already been implemented. So where are the problem reports? BTW: Derek, how long has that been in there, anyway - since the day you made the first post? :)
25. Re: length() of an atom
- Posted by DerekParnell (admin) Sep 23, 2010
- 1458 views
So, it looks like it's already been implemented. So where are the problem reports? BTW: Derek, how long has that been in there, anyway - since the day you made the first post? :)
It has been in there since rev 3326, July 29th. About 8 weeks ago.
26. Re: length() of an atom
- Posted by petelomax Sep 24, 2010
- 1473 views
That really didn't make a whole lot of sense to me!
Show me some existing application that depends on having length(atom) crash the application.
I was talking about the Vinoba/APL post.
Of course no application depends on that, but some programmers do.
What is your beef with length(o) crashing and length(o,1) never crashing?
Give me some example code that demonstrates a performance hit with the new length() feature.
Good, I see you implemented it correctly.
Pete