1. length() of an atom

I'm proposing that getting the length() of an atom should return 1. Currently it crashes the program.

This would simplify some algorithms because it avoids having to test for atom/sequence before taking the length. In those cases, we just use 1 anyway for atoms.

My thinking is that if appending an atom to a sequence increases the sequence length by 1, then it must follow that the length of an atom is 1.
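To make the motivation concrete, here is a minimal sketch of the guard that callers write today. The helper name `element_count` is hypothetical and not part of Euphoria; under the proposal its body would reduce to a bare `length(x)` call.

```euphoria
-- Hypothetical helper (not a standard routine): the atom/sequence test
-- that callers currently write before taking a length.
function element_count(object x)
    if atom(x) then
        return 1          -- an atom counts as one element
    end if
    return length(x)      -- a sequence reports its element count
end function

? element_count(25)       -- prints 1
? element_count({25})     -- prints 1
? element_count({})       -- prints 0
```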

Can anyone give me some convincing reasons to not make this change for Eu4?


2. Re: length() of an atom

? equal(length(25),length({25})) 
 

Strange. Is it true? I mean would you consider it equal?
But I could leave with returning 1, 0, -1 or anything... :->

Best regards,
Salix


3. Re: length() of an atom

Salix said...
? equal(length(25),length({25})) 
 

Strange. Is it true? I mean would you consider it equal?

Yes. The 'it' being that they have the same length.

Salix said...

But I could leave with returning 1, 0, -1 or anything... :->

I have no idea what you mean by that statement.


4. Re: length() of an atom

DerekParnell said...

Yes. The 'it' being that they have the same length.

I understand that this is the goal of the proposal. It is just strange to me for an atom to have a length. I am probably too influenced by the current error message: "length of an atom is not defined". I get it regularly... ;)

DerekParnell said...
Salix said...

But I could leave with returning 1, 0, -1 or anything... :->

I have no idea what you mean by that statement.

(Egh... "live".) All I wanted to say is that I can accept your proposal. But I could also accept 0 or -1 as the result of a length(atom) call.

Rgds,

Salix


5. Re: length() of an atom

This could "shake up" the language since it goes to a fundamental design from the origins of Euphoria. (Whether this is good or bad, is another question.)

Some initial ideas that came to mind. (But I need more coffee before I fully understand how changing length() will alter the language.)

a fundamental Euphoria building block

 length( 23 )      ---> undefined 
 length( { 23 } ) ---> 1 

fundamental to the definition of atom vs sequence

idea of "length()" is clearly entrenched as the length of a sequence

useful as a way of finding a type mismatch in the logic of a program (even if it can be annoying?)


will patterns be always true


must follow "patterns"

length( { 5, 6 } ) ---> 2 
length( { 5 } )    ---> 1 
length( {} )       ---> 0 
 
length( 5 )        ----> undefined 

length( 5 ) ==> 1 does not fit any "pattern", and is therefore out of place

a strength of Euphoria is that these kinds of patterns are always followed

'true length' is used in many routines

many Euphoria operations depend on objects being "the same length"

simple meaning of $

seq( 1 .. $ )   vs.  seq( 1 .. length(seq) ) 

the "$" has a simple meaning

  • where can this be used?

is this of value in programming?

seq( 1 .. length(5) ) 


automagic dangers

automagic conversions are common in hard to use languages

  • automagic not a Euphoria "feature"
  • Euphoria simplicity is due in part to there being no coercions, automatic type changes, overloaded operators and routines (we like simple !)
  • introducing automagic features would change the entire character of the language


"behaves as if length"

from the documentation: "atoms are of length 1 for this purpose"

  • we do have idea that atoms have a length dimension "of sorts"

the "behaves" may seem special to someone new to Euphoria, but the design patterns are consistent

{1, 2, 3 } & 9999   ---- > { 1, 2, 3, 9999  } 

the 9999 "behaves" as if it "should" have a length of 1, but that is a length that appears in the future and does not exist in the present

question about 'internals'

  • the description of length() in the documentation has comment that the length is stored internally for fast access
  • will an atom need to store data that it has a length of 1 ?


is the solution a new routine ?


  • let's call it "size()"

length( 2 ) -- > undefined 
size( 2 ) ---> 1 

this way length() and size() are closely related, but sensitive to the difference between atom and sequence

even if not adopted as a routine, would the documentation be improved if "size" was used as a descriptor for some Euphoria operations instead of the way "length" is used now?


6. Re: length() of an atom

Salix said...
DerekParnell said...

Yes. The 'it' being that they have the same length.

I understand that this is the goal of the proposal. It is just strange to me for an atom to have a length. I am probably too influenced by the current error message: "length of an atom is not defined". I get it regularly... ;)

This is a good point, and I had a similar reaction when I first considered this.

In a case where you really expected or needed a sequence, the bug in the code would be a little bit more difficult to find. It might be discovered shortly after, when something else expected a sequence. Or it could lead to silently corrupting data. Of course, all of this is theoretical, and doesn't mean the change is wrong. Just something to be considered.

Matt


7. Re: length() of an atom

_tom said...

automagic dangers

automagic conversions are common in hard to use languages

  • automagic not a Euphoria "feature"
  • Euphoria simplicity is due in part to there being no coercions, automatic type changes, overloaded operators and routines (we like simple !)
  • introducing automagic features would change the entire character of the language

This is not strictly true. The implementation of atoms has a bit of automagic in it. However, it's fairly tame compared to what some languages do.

_tom said...

question about 'internals'

  • the description of length() in the documentation has comment that the length is stored internally for fast access
  • will an atom need to store data that it has a length of 1 ?

While I haven't seen Derek's solution, I'm sure the answer is to replace the current call to error handling with a hardcoded return 1; or similar. That point in the documentation is to make clear that length() is always O(1), and doesn't do something like count the number of elements every time you call it.

_tom said...

is the solution a new routine ?

  • let's call it "size()"

length( 2 ) -- > undefined 
size( 2 ) ---> 1 

this way length() and size() are closely related, but sensitive to the difference between atom and sequence

even if not adopted as a routine, would the documentation be improved if "size" was used as a descriptor for some Euphoria operations instead of the way "length" is used now?

This strikes me as an interesting alternative, since it leaves existing code unaffected, but allows the optimization that led to this proposal.

Matt


8. Re: length() of an atom

_tom said...

is the solution a new routine ?

I was about to suggest lengtha().

Pete


9. Re: length() of an atom

_tom said...

This could "shake up" the language since it goes to a fundamental design from the origins of Euphoria. (Whether this is good or bad, is another question.)

Really? "shake up"? Isn't that a bit overdramatic? Its effect is microscopic. The only people affected by this idea are those who rely on crashing an application when getting the length of an atom - and that has got to be a tiny, tiny number of Euphoria developers.

And I'd argue if one is doing that then they really have a bad approach to the language and code design in general.

_tom said...

Some initial ideas that came to mind. (But I need more coffee before I fully understand how changing length() will alter the language.)

a fundamental Euphoria building block

 length( 23 )      ---> undefined 
 length( { 23 } ) ---> 1 

fundamental to the definition of atom vs sequence

idea of "length()" is clearly entrenched as the length of a sequence

useful as a way of finding a type mismatch in the logic of a program (even if it can be annoying?)


How is it a fundamental difference between atoms and sequences when the language already deems that atoms have a length of one when it comes to some operations?

The "length()" function is only "entrenched" with its current semantics because that's what Robert defined for it. It is no more "entrenched" than other changes we have made from v3 to v4.

It's as useful for finding mistakes in your code as a divide by zero is. Both situations should be handled more gracefully than relying on these to crash the application.

_tom said...

will patterns be always true


must follow "patterns"

length( { 5, 6 } ) ---> 2 
length( { 5 } )    ---> 1 
length( {} )       ---> 0 
 
length( 5 )        ----> undefined 

length( 5 ) ==> 1 does not fit any "pattern", and is therefore out of place

a strength of Euphoria is that these kinds of patterns are always followed

Why is length( 5 ) part of a pattern? What pattern?

upper( { 97, 97 } ) ---> {65, 65} 
upper( { 97 } )     ---> {65} 
upper( {} )         ---> {} 
upper( 97 )         ---> 65 

Hmmm... not much of a pattern there.

_tom said...

'true length' is used in many routines

many Euphoria operations depend on objects being "the same length"

Huh? "true length" as opposed to what? "false length"?

Sure, there are Euphoria operations that depend on sequences being the same length, but these operations are by their nature SEQUENCE operations and atoms have nothing to do with it.

eg.    {1,2,3} + {4,5,6} --> {5,7,9} 

Both sequences must be the same length, but what has that got to do with the length() function? Nothing at all.

Or are you suggesting that someone might write a routine that takes 'object' parameters, uses the length() function on those parameters, and relies on it crashing if a parameter is an atom? That is bad design. If the parameters must be sequences, then either use a 'sequence' parameter type in the signature or test with sequence() at run time.
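The two options named above can be sketched as follows. Both routine names (`first_item`, `first_item_or`) are hypothetical and chosen only for illustration.

```euphoria
-- Option 1: declare the parameter as a sequence. Passing an atom then
-- fails the parameter type check at the point of the call.
function first_item(sequence s)
    return s[1]           -- assumes s is not empty
end function

-- Option 2: accept any object and test with sequence() at run time.
function first_item_or(object x, object fallback)
    if sequence(x) and length(x) > 0 then
        return x[1]
    end if
    return fallback       -- atom or empty sequence
end function

? first_item("abc")       -- prints 97, the character code of 'a'
? first_item_or(5, -1)    -- prints -1
```

Either way the intent is stated in the code itself instead of being left to a crash inside length().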

_tom said...

simple meaning of $

seq( 1 .. $ )   vs.  seq( 1 .. length(seq) ) 

the "$" has a simple meaning

  • where can this be used?

is this of value in programming?

seq( 1 .. length(5) ) 


I don't get what you are saying. The '$' token only means 'length(X)' when it appears inside a subscript of the sequence X. Nothing about that changes if length(atom) is defined as 1.

The example ...

seq( 1 .. length(5) ) 

Why would someone do that? It would be the same as saying ...

seq( 1 .. 1 ) 
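As a small illustration of the point about '$', the sketch below uses a sample sequence named `x` (an arbitrary name) to show that '$' inside a subscript is shorthand for the length of the sequence being subscripted, which is unaffected by what length() does for atoms.

```euphoria
sequence x = {10, 20, 30, 40}

? x[2 .. $]                    -- prints {20,30,40}
? x[2 .. length(x)]            -- same slice, written out in full
? equal(x[$], x[length(x)])    -- prints 1: both pick the last element
```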

_tom said...

automagic dangers

automagic conversions are common in hard to use languages

  • automagic not a Euphoria "feature"
  • Euphoria simplicity is due in part to there being no coercions, automatic type changes, overloaded operators and routines (we like simple !)
  • introducing automagic features would change the entire character of the language


Actually, Euphoria already "automagically" converts from integer to atom and vice versa all the time without you having to do anything.

And how on Earth is defining length(atom) as 1 making things not simple? In fact, it actually simplifies some algorithms.

Euphoria already has overloaded operators. The '&' (concatenation) operator accepts different data types. The arithmetic operators accept different data types, etc ... The puts() function accepts different data types. I can go on....

We already have overloading in Euphoria. "automagic" already happens in Euphoria - so this change will not be introducing it.
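A few one-liners illustrate the kind of overloading meant here: the same operator accepts atoms and sequences in any combination.

```euphoria
? {1, 2, 3} & 9999     -- sequence & atom     : prints {1,2,3,9999}
? 1 & 2                -- atom & atom         : prints {1,2}
? {1, 2} & {3}         -- sequence & sequence : prints {1,2,3}
? {1, 2, 3} + 10       -- sequence + atom     : prints {11,12,13}
```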

_tom said...

"behaves as if length"

from the documentation: "atoms are of length 1 for this purpose"

  • we do have idea that atoms have a length dimension "of sorts"

the "behaves" may seem special to someone new to Euphoria, but the design patterns are consistent

{1, 2, 3 } & 9999   ---- > { 1, 2, 3, 9999  } 

the 9999 "behaves" as if it "should" have a length of 1, but that is a length that appears in the future and does not exist in the present

So ...? That actually sounds like a very good reason to formalize the 'apparent' length of an atom.

_tom said...

question about 'internals'

  • the description of length() in the documentation has comment that the length is stored internally for fast access
  • will an atom need to store data that it has a length of 1 ?

Not at all. I implemented this idea in my local copy of Euphoria about a week ago and everything still works fine. Euphoria already tests the operand of length() to see if it's a sequence; if so, it fetches the internal length data, and if not, it crashes. My change just replaces the 'crash' with a 'return 1'. And with the translator, some 'length()' expressions can get translated as a literal 1 without any runtime overhead.

_tom said...

is the solution a new routine ?


  • let's call it "size()"

length( 2 ) -- > undefined 
size( 2 ) ---> 1 

this way length() and size() are closely related, but sensitive to the difference between atom and sequence

even if not adopted as a routine, would the documentation be improved if "size" was used as a descriptor for some Euphoria operations instead of the way "length" is used now?

I also thought this but it seems contrived and adds another predefined (built-in) function when an existing function can do the job without breaking code or really causing anyone grief.

In summary, I'm not convinced that length(atom) = 1 is a bad idea yet.


10. Re: length() of an atom

What if you had length() return -1 if it was given an atom?


11. Re: length() of an atom

euphoric said...

What if you had length() return -1 if it was given an atom?

I guess doing abs(length(x)) would be a compromise, but it'd be a pain to have to include std/math.e for what should be a builtin operation...

If you really needed old code to crash, you could always do this:

include std/error.e  -- for crash()

function length(object x) 
	if not sequence(x) then 
		crash("TODO: add eu-lint") 
	end if 
	return eu:length(x) 
end function 

12. Re: length() of an atom

jimcbrown said...
euphoric said...

What if you had length() return -1 if it was given an atom?

I guess doing abs(length(x)) would be a compromise, but it'd be a pain to have to include std/math.e for what should be a builtin operation...

It might be more painful to have to go through all your code and insert a

   if sequence( x ) then 

above every

   if length( x ) then 

or

   y = length( x ) 

Measuring the length of an atom is like measuring the depth of a square. A square has no depth so you can't measure it, nor will any value be accurate. It is undefined.

jimcbrown said...

If you really needed old code to crash, you could always do this:

function length(object x) 
	if not sequence(x) then 
		crash("TODO: add eu-lint") 
	end if 
	return eu:length(x) 
end function 

How convenient. So now I have to have my own function to do what Euphoria did for me before.

I want to know in what case(s) length(atom) returning 1 is a good thing.

Also, I have code that does this:

for t=1 to length( ftxt ) do 
   if ftxt[t] = '#' then 
   ... 

ftxt should be a sequence, and if it's not, there's a bug. Now you're going to make me test ftxt first? I should probably do that anyway. heh. :)

length() returns a property (number of elements) of a variable. atoms don't have a length property, so it's inconsistent.

Like Derek said, length() is part of a group of operations which "are by their nature SEQUENCE operations and atoms have nothing to do with it."

Thank you, Derek. :D

It's like getting the "reverse" of an atom. Makes no sense! Yes, we nicely return the atom for some inexplicable reason instead of making the programmer provide a sequence. That's a quirk of the language I guess.

I don't mind quirky behaviors. But don't pass this off as consistent or logical. They're programmer's helpers, at best.


13. Re: length() of an atom

euphoric said...
jimcbrown said...
euphoric said...

What if you had length() return -1 if it was given an atom?

I guess doing abs(length(x)) would be a compromise, but it'd be a pain to have to include std/math.e for what should be a builtin operation...

It might be more painful to have to go through all your code and insert a

   if sequence( x ) then 

above every

   if length( x ) then 

or

   y = length( x ) 

But as this is already the case anyways, to avoid crashing, we lose nothing here.

euphoric said...
jimcbrown said...

If you really needed old code to crash, you could always do this:

function length(object x) 
	if not sequence(x) then 
		crash("TODO: add eu-lint") 
	end if 
	return eu:length(x) 
end function 

How convenient. So now I have to have my own function to do what Euphoria did for me before.

Why do you need the code to _CRASH_ ?

euphoric said...

Also, I have code that does this:

for t=1 to length( ftxt ) do 
   if ftxt[t] = '#' then 
   ... 

ftxt should be a sequence, and if it's not, there's a bug. Now you're going to make me test ftxt first? I should probably do that anyway. heh. :)

It will still _CRASH_ ...
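The reason is that the loop body subscripts ftxt: if ftxt held an atom and length(ftxt) gave 1, the loop would run once and ftxt[1] would then fail, because an atom cannot be subscripted. A minimal sketch of making the expectation explicit instead, using a hypothetical routine name `count_hashes`:

```euphoria
-- Hypothetical routine: declaring the parameter as a sequence moves the
-- failure to the call site if an atom is ever passed in.
function count_hashes(sequence ftxt)
    integer n = 0
    for t = 1 to length(ftxt) do
        if ftxt[t] = '#' then
            n += 1
        end if
    end for
    return n
end function

? count_hashes("a#b#c")    -- prints 2
```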

euphoric said...

length() returns a property (number of elements) of a variable. atoms don't have a length property, so it's inconsistent.

Like Derek said, length() is part of a group of operations which "are by their nature SEQUENCE operations and atoms have nothing to do with it."

Thank you, Derek. :D

euphoric said...

It's like getting the "reverse" of an atom. Makes no sense! Yes, we nicely return the atom for some inexplicable reason instead of making the programmer provide a sequence. That's a quirk of the language I guess.

euphoric said...

Measuring the length of an atom is like measuring the depth of a square. A square has no depth so you can't measure it, nor will any value be accurate. It is undefined.

I fully agree with you here. From a theoretical viewpoint, a length should only apply to sequences (and similar objects such as linked lists, arrays, etc).

But you put it best below...

euphoric said...

I don't mind quirky behaviors. But don't pass this off as consistent or logical. They're programmer's helpers, at best.

Agreed, and I feel the use as a programmer's helper here significantly outweighs the desire to adhere to theoretical dogma.

euphoric said...

I want to know in what case(s) length(atom) returning 1 is a good thing.

To borrow Jeremy's example, a function that takes a Euphoria object and serializes it into a memory block for use by a C function.


14. Re: length() of an atom

Fine.


15. Re: length() of an atom

If you want a function which returns the length of a sequence or 1 if the parameter is an atom it is possible to use:

result = length( {} & X )   -- or: 
result = length( X & {} ) 
 

Probably inefficient, but if an atom is concatenated to a sequence, it extends the sequence by 1 element, so an atom could be said to have a natural length of 1.
Also, if an empty sequence is concatenated to an object X, it leaves it unaltered if X is a sequence but would convert an atom into a sequence with one element.
Several of the standard routines which accept an atom or a sequence begin by converting the atom to a sequence with one element.
If length(Atom) did return an integer, the only existing programs which would be affected are those which fail.
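The atom-to-sequence promotion described above can be sketched as a small helper. The name `as_sequence` is hypothetical; the standard routines mentioned do something equivalent internally rather than calling a function by this name.

```euphoria
-- Hypothetical helper: promote an atom to a one-element sequence and
-- leave a sequence unchanged, which is the same result as {} & x.
function as_sequence(object x)
    if atom(x) then
        return {x}
    end if
    return x
end function

? length(as_sequence(5))            -- prints 1
? length(as_sequence({4, 5, 6}))    -- prints 3
? equal(as_sequence(5), {} & 5)     -- prints 1
```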

By now you will have probably gathered that I am in favour of the proposal.

I suppose it would be possible to define a second parameter for the length function to specify the value to be returned for an atom, but how could the default be specified to cause a failure if it is used?


16. Re: length() of an atom

euphoric said...

What if you had length() return -1 if it was given an atom?

Because that defeats the purpose of my proposal. I want to make coding simpler for people so returning 1 is much more user-friendly than -1.

Also, what is the unit-of-measure returned by length()? I'm suggesting it is 'elements'.

{x,y} has 2 elements involved 
{x}   has 1 element involved 
{}    has zero elements involved 
x     has 1 element involved 


17. Re: length() of an atom

DerekParnell said...

I want to make coding simpler for people so returning 1 is much more user-friendly than -1.

Also, what is the unit-of-measure returned by length()? I'm suggesting it is 'elements'.

{x,y} has 2 elements involved 
{x}   has 1 element involved 
{}    has zero elements involved 
x     has 1 element involved 

Elements only exist in the context of a sequence. No sequence, no elements. That last 'x' is not an element.

Regardless, I don't think this change will hurt much, if at all. So have at it. :)

But I'm curious, how does returning 1 for length(atom) make coding simpler and more user-friendly? What case did you discover works better with length(atom) returning 1?


18. Re: length() of an atom

euphoric said...

Elements only exist in the context of a sequence. No sequence, no elements. That last 'x' is not an element.

Does an egg exist outside of its carton?

euphoric said...

Regardless, I don't think this change will hurt much, if at all. So have at it. :)

Awwww ... don't give in so soon :)

euphoric said...

But I'm curious, how does returning 1 for length(atom) make coding simpler and more user-friendly? What case did you discover works better with length(atom) returning 1?

   --- 
   r = x[1 .. i] & q & x[i+1 .. $] 
   if atom(q) then 
      n += 1 
   else 
      n += length(q) 
   end if 
   --- 

but now I can code ...

   --- 
   r = x[1 .. i] & q & x[i+1 .. $] 
   n += length(q) 
   --- 

19. Re: length() of an atom

DerekParnell said...

but now I can code ...

   --- 
   r = x[1 .. i] & q & x[i+1 .. $] 
   n += length(q) 
   --- 

You should be using splice():

  r = splice( x, q, i + 1 ) 

Especially in a loop, this makes a huge difference in performance (likewise for replace(), insert() and remove()).
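A short sketch comparing the two forms on sample data (the values of `x`, `q` and `i` are arbitrary). It checks only that the results match; it does not measure the performance difference.

```euphoria
sequence x = {1, 2, 5, 6}
object q = {3, 4}
integer i = 2

-- Slice-and-concatenate form versus the built-in splice().
sequence by_slices = x[1 .. i] & q & x[i+1 .. $]
sequence by_splice = splice(x, q, i + 1)

? by_splice                       -- prints {1,2,3,4,5,6}
? equal(by_slices, by_splice)     -- prints 1

-- An atom is spliced in as a single element.
? splice({10, 30, 40}, 20, 2)     -- prints {10,20,30,40}
```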

Matt


20. Re: length() of an atom

Here is what Kenneth Iverson defined for scalar values and vectors. In APL, a scalar is somewhat similar to an atom and is treated as a one-element vector when an operation is performed that might produce a vector. ⍝ is the comment character; I have added // after it to make non-APLers comfortable. APL has no operator precedence; it executes from right to left. ⍴ is a monadic function for shape, and ← is used for assignment, the same as = in many languages.

      var1 ← 5  ⍝ // Assign a single value 5 to a variable called var1 
      var1 ⍝ // show what is in var1  
5 
      ⎕ ←  nshape ← ⍴ var1 ⍝ //shape of var1 assigned to nshape and show - result is nothing 
      ⍴ nshape ⍝ // nshape is actually an empty vector with shape of 0 
0 
      ⍴ nshape ⍝ // nshape is actually an empty vector with shape of 0 
0 
      ⍴ vshape ← var2 ← var1 , 12 18 43 ⍝ // , is a join function in APL 
4 
      var2 ⍝ // var2 is a join of var1 and a three-element sequence. shape is 4 
5 12 18 43 


21. Re: length() of an atom

That really didn't make a whole lot of sense to me!

Anyway, as soon as this thread reappeared, I had two thoughts:

To keep things in order, length(atom) should return {}.

Well, it made me chuckle anyway, of course I'm joking. However, the other solution is to code (the equivalent of)

function length(object x, object default_result={}) 
  if sequence(x) then 
    return eu:length(x) 
  elsif atom(default_result) then 
    return default_result 
  else 
    crash_and_burn() 
  end if 
end function 

Then of course you can simply code length(5,1) to get your 1 rather than crash, and no legacy code whatsoever will be affected at all (and I hope not even performancewise), nor will it affect anyone that disagrees with changing length().
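A runnable variant of the idea, as a sketch only: it uses the hypothetical name `length_or` so that the built-in length() is left untouched, and crash() from std/error.e stands in for crash_and_burn().

```euphoria
include std/error.e    -- for crash()

-- Hypothetical routine: returns the length of a sequence, or the
-- caller's default for an atom. With no default given ({}), an atom
-- is still treated as an error.
function length_or(object x, object default_result = {})
    if sequence(x) then
        return length(x)
    elsif atom(default_result) then
        return default_result
    end if
    crash("length_or(): atom passed and no default given")
    return 0    -- not reached
end function

? length_or("abc")    -- prints 3
? length_or(5, 1)     -- prints 1
```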

Regards, Pete


22. Re: length() of an atom

petelomax said...

That really didn't make a whole lot of sense to me!

Show me some existing application that depends on having length(atom) crash the application.

Give me some example code that demonstrates a performance hit with the new length() feature.


23. Re: length() of an atom

DerekParnell said...

Show me some existing application that depends on having length(atom) crash the application.

I have quite a few of those ... but I can easily rewrite almost all of them to use crash() in std/error.e instead.

I have a really obscure application that actually relies on this behavior (just one), but it's really old and won't run under Euphoria 2.3 .. I'd hate to see a new feature held back because of just one single application.


24. Re: length() of an atom

Well, as a result of all this discussion, I went and wrote a 'program':

atom a = 7 
? a 
? length(a) 

What does it print?
7
1
So, it looks like it's already been implemented. So where are the problem reports? BTW: Derek, how long has that been in there, anyway - since the day you made the first post? :)


25. Re: length() of an atom

irv said...

So, it looks like it's already been implemented. So where are the problem reports? BTW: Derek, how long has that been in there, anyway - since the day you made the first post? :)

It has been in there since rev 3326, July 29th. About 8 weeks ago.


26. Re: length() of an atom

DerekParnell said...
petelomax said...

That really didn't make a whole lot of sense to me!

Show me some existing application that depends on having length(atom) crash the application.

I was talking about the Vinoba/APL post.
Of course no application depends on that, but some programmers do.
What is your beef with length(o) crashing and length(o,1) never crashing?

DerekParnell said...

Give me some example code that demonstrates a performance hit with the new length() feature.

Good, I see you implemented it correctly.

Pete
