1. Homogeneous sequence

Hi all,

I want to have a user-defined type, that checks whether a variable is a
homogeneous sequence. With this, I mean a sequence which contains only
top-level elements of the same "structure". I think I can't express it
better in English, please see the following code for details.
Did I overlook anything? Should an empty sequence be considered
homogeneous? Any comments are welcome. 
function object_structure (object x)
   if atom(x) then
      return 1
   end if

   for i = 1 to length(x) do
      x[i] = object_structure(x[i])
   end for
   return x
end function

global type homogeneous_sequence (object x)
   object struc

   if atom(x) then
      return 0
   end if

   if length(x) = 0 then
      return 1                  -- ?
   end if

   struc = object_structure(x[1])
   for i = 2 to length(x) do
      if not equal(struc, object_structure(x[i])) then
         return 0
      end if
   end for
   
   return 1
end type

-- Demo:
object a, b, c
a = {3,{3 ,5,{5,2},4}}
b = {4,{5 ,1,{9,2},4}}
c = {4,{{},1,{9,2},4}}

? object_structure(a)
? object_structure(b)
? object_structure(c)
? homogeneous_sequence({a,b})
? homogeneous_sequence({a,c})
? homogeneous_sequence({b,c})

Regards,
   Juergen

2. Re: Homogeneous sequence

Me wrote:

> Hi all,
> 
> I want to have a user-defined type, that checks whether a variable is a
> homogeneous sequence. With this, I mean a sequence which contains only
> top-level elements of the same "structure". I think I can't express it
> better in English, please see the following code for details.
> Did I overlook anything? Should an empty sequence be considered
> homogeneous? Any comments are welcome. 
> 
> function object_structure (object x)
>    if atom(x) then
>       return 1
>    end if
> 
>    for i = 1 to length(x) do
>       x[i] = object_structure(x[i])
>    end for
>    return x
> end function

I just realized, that the following code does the same:
function object_structure (object x)
   return x = x
end function


> global type homogeneous_sequence (object x)
>    object struc
> 
>    if atom(x) then
>       return 0
>    end if
> 
>    if length(x) = 0 then
>       return 1                  -- ?
>    end if
> 
>    struc = object_structure(x[1])
>    for i = 2 to length(x) do
>       if not equal(struc, object_structure(x[i])) then
>          return 0
>       end if
>    end for
>    
>    return 1
> end type
> 
> -- Demo:
> object a, b, c
> a = {3,{3 ,5,{5,2},4}}
> b = {4,{5 ,1,{9,2},4}}
> c = {4,{{},1,{9,2},4}}
> 
> ? object_structure(a)
> ? object_structure(b)
> ? object_structure(c)
> ? homogeneous_sequence({a,b})
> ? homogeneous_sequence({a,c})
> ? homogeneous_sequence({b,c})
> 
> Regards,
>    Juergen

3. Re: Homogeneous sequence

Juergen Luethje wrote:
> 
> Hi all,
> 
> I want to have a user-defined type, that checks whether a variable is a
> homogeneous sequence. With this, I mean a sequence which contains only
> top-level elements of the same "structure".
You realise this will fail if the table contains eg {name, salary} and the
lengths of the names are not all the same?

There is a potentially exponential performance hit type-checking every element
of a sequence like this. If length(table) is 1000 then a table[5]=x statement
will rigorously type-check the other 999 elements.
A possible solution is:
-- homogenous.e
sequence the_table -- local, so no-one else can play with it

global type tableitem(sequence x)
  if length(x)=2 and sequence(x[1]) and atom(x[2]) then
     return 1
  end if
  return 0
end type

global procedure replace(integer idx, tableitem x)
  the_table[idx]=x
end procedure

global function get_item(integer idx)
  return the_table[idx]
end function
-- plus a host of similar functions to append, remove, insert etc. :(

In other words the_table is not itself type-checked, but all the operations
allowed on it are. I accept that the above, while solving a particular
performance problem, is rather cumbersome and does not easily scale well to
handle multiple tables.
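
To make the per-assignment cost described above visible, here is a minimal sketch. The names `checks` and `counted_table` are hypothetical and not part of anyone's library; the counter is only there to show how often the interpreter calls the type, and it assumes the default `with type_check` setting.

```euphoria
-- Sketch only: `checks` and `counted_table` are made-up names.
integer checks
checks = 0

type counted_table(object x)
    -- count how often this type routine gets called
    checks += 1
    if not sequence(x) then
        return 0
    end if
    for i = 1 to length(x) do
        if not atom(x[i]) then
            return 0
        end if
    end for
    return 1
end type

counted_table table
table = repeat(0, 1000)   -- whole-variable assignment: the type is called
table[5] = 42             -- element assignment: the type may be called again on all of table
? checks
```

If the counter goes up after the element assignment, the loop inside the type has walked all 1000 elements just to validate one changed slot, which is the overhead the wrapper routines are meant to avoid.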

Regards,
Pete

4. Re: Homogeneous sequence

Pete Lomax wrote:

> Juergen Luethje wrote:
> 
> > Hi all,
> > 
> > I want to have a user-defined type, that checks whether a variable is a
> > homogeneous sequence. With this, I mean a sequence which contains only
> > top-level elements of the same "structure".
> You realise this will fail if the table contains eg {name, salary} and the
> lengths
> of the names are not all the same?
> 
> There is a potentially exponential performance hit type-checking every element
> of a sequence like this. If length(table) is 1000 then a table[5]=x statement
> will rigorously type-check the other 999 elements.
> A possible solution is:
> 
> -- homogenous.e
> sequence the_table -- local, so no-one else can play with it
> 
> global type tableitem(sequence x)
>   if length(x)=2 and sequence(x[1]) and atom(x[2]) then
>      return 1
>   end if
>   return 0
> end type
> 
> global procedure replace(integer idx, tableitem x)
>   the_table[idx]=x
> end procedure
> 
> global function get_item(integer idx)
>   return the_table[idx]
> end function
> -- plus a host of similar functions to append, remove, insert etc. :(
> 
> In other words the_table is not itself type-checked, but all the operations
> allowed on it are. I accept that the above, while solving a particular
> performance
> problem, is rather cumbersome and does not easily scale well to handle
> multiple
> tables.

Thanks for the reply, Pete. I don't understand how it answers my
question, though. My original post probably was not very clear.
Here is the reason why I'm interested in all this:

A global library function shall operate on all elements of a list
which is passed to it as parameter. I'm not talking of tables, but the
list could contain _any_ type of elements, as long as all elements in
the list have the same "structure".

I want the function to do Euphoria sequence operations with all elements
of the list like this:
function foo (homogeneous_sequence list)
   object z
   ...
   z = list[1]
   for i = 2 to length(list) do
      z += list[i]
   end for
   ...
end function

I want the elements of 'list' to be either all atoms or all sequences.
If all elements of 'list' are sequences, then their lengths must be the
same, otherwise Eu sequence ops won't work. If the sequences contain
sub-sequences, their length must also be the same, I think.

So I'm looking for a good way to implement the user-defined type
'homogeneous_sequence', that checks whether either all elements of
'list' are atoms, or all elements of 'list' are sequences with the
same lengths, and with the same order of atoms and sequences inside,
and all corresponding sub-sequences also must have the same lengths
and so on ...
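
As a small illustration of why the structures have to match, here is a sketch of the loop above applied to made-up data. The contents of `list` are an assumption chosen only so that every element has the same shape.

```euphoria
-- Sketch only: the data in `list` is invented for illustration.
sequence list
object z

list = {{1,{2,3}}, {10,{20,30}}, {100,{200,300}}}

z = list[1]
for i = 2 to length(list) do
    z += list[i]      -- element-wise add; needs matching lengths at every level
end for

? equal(z, {111,{222,333}})   -- prints 1
```

If one element had a different length at any nesting level, the `+=` would stop with a run-time error, which is what the type is supposed to catch beforehand.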

Regards,
   Juergen

5. Re: Homogeneous sequence

Juergen Luethje wrote:
> 
> Pete Lomax wrote:
> 
> > Juergen Luethje wrote:
> > 
> > > Hi all,
> > > 
> > > I want to have a user-defined type, that checks whether a variable is a
> > > homogeneous sequence. With this, I mean a sequence which contains only
> > > top-level elements of the same "structure".
> > You realise this will fail if the table contains eg {name, salary} and the
> > lengths
> > of the names are not all the same?
> > 
> > There is a potentially exponential performance hit type-checking every
> > element
> > of a sequence like this. If length(table) is 1000 then a table[5]=x
> > statement
> > will rigorously type-check the other 999 elements.
> > A possible solution is:
> > 
> > -- homogenous.e
> > sequence the_table -- local, so no-one else can play with it
> > 
> > global type tableitem(sequence x)
> >   if length(x)=2 and sequence(x[1]) and atom(x[2]) then
> >      return 1
> >   end if
> >   return 0
> > end type
> > 
> > global procedure replace(integer idx, tableitem x)
> >   the_table[idx]=x
> > end procedure
> > 
> > global function get_item(integer idx)
> >   return the_table[idx]
> > end function
> > -- plus a host of similar functions to append, remove, insert etc. :(
> > 
> > In other words the_table is not itself type-checked, but all the operations
> > allowed on it are. I accept that the above, while solving a particular
> > performance
> > problem, is rather cumbersome and does not easily scale well to handle
> > multiple
> > tables.
> 
> Thanks for the reply, Pete. I don't understand how it answers my
> question, though. My original post probably was not very clear.
> Here is the reason why I'm interested in all this:
> 
> A global library function shall operate on all elements of a list
> which is passed to it as parameter. I'm not talking of tables, but the
> list could contain _any_ type of elements, as long as all elements in
> the list have the same "structure".
> 
> I want the function to do Euphoria sequence operations with all elements
> of the list like this:
> 
> function foo (homogeneous_sequence list)
>    object z
>    ...
>    z = list[1]
>    for i = 2 to length(list) do
>       z += list[i]
>    end for
>    ...
> end function
> 
> I want the elements of 'list' to be either all atoms or all sequences.
> If all elements of 'list' are sequences, then their lengths must be the
> same, otherwise Eu sequence ops won't work. If the sequences contain
> sub-sequences, their length must also be the same, I think.
> 
> So I'm looking for a good way to implement the user-defined type
> 'homogeneous_sequence', that checks whether either all elements of
> 'list' are atoms, or all elements of 'list' are sequences with the
> same lengths, and with the same order of atoms and sequences inside,
> and all corresponding sub-sequences also must have the same lengths
> and so on ...
> 
> Regards,
>    Juergen

You are painfully aware that type checking has O(N) complexity, where N is not
the length, but the number of all atoms in all your subsequences. Regular type
checking is probably very inefficient here.
Where do you get the objects you are processing from? My first approach would be
to have such objects come from "trusted" sources, where the creation process,
being under your control, would guarantee the lengths being equal etc. faster
than checking on an object as if you didn't know anything about it.
If the objects come from data files whose creation you didn't control, then... you
have no choice indeed. Try to check each object once, so that you can consider
it as trusted later, avoiding any further check.
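
A minimal sketch of the "check once, then trust" idea, under some assumptions: `is_homogeneous` and `sum_elements` are hypothetical names, the check borrows the `x = x` trick posted earlier in this thread, and the sample data is made up.

```euphoria
-- Sketch only: is_homogeneous, sum_elements and `data` are hypothetical.
function is_homogeneous(object x)
    object struc, t
    if atom(x) then
        return 0
    end if
    if length(x) = 0 then
        return 1
    end if
    t = x[1]
    struc = (t = t)                 -- every atom in t becomes 1
    for i = 2 to length(x) do
        t = x[i]
        if not equal(struc, t = t) then
            return 0
        end if
    end for
    return 1
end function

function sum_elements(sequence list)
    -- plain `sequence` parameter: no user-defined type is re-run in here
    -- (assumes list is not empty)
    object z
    z = list[1]
    for i = 2 to length(list) do
        z += list[i]
    end for
    return z
end function

sequence data
data = {{1,2}, {3,4}, {5,6}}
if is_homogeneous(data) then                -- checked once, at the boundary
    ? equal(sum_elements(data), {9,12})     -- prints 1
end if
```

The trade-off is that nothing stops a caller from handing unchecked data to `sum_elements`, so this only helps where the data source really is under your control.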

CChris

6. Re: Homogeneous sequence

CChris wrote:

> Juergen Luethje wrote:
> 
> > Pete Lomax wrote:
> > 
> > > Juergen Luethje wrote:
> > > 
> > > > Hi all,
> > > > 
> > > > I want to have a user-defined type, that checks whether a variable is a
> > > > homogeneous sequence. With this, I mean a sequence which contains only
> > > > top-level elements of the same "structure".
> > > You realise this will fail if the table contains eg {name, salary} and the
> > > lengths
> > > of the names are not all the same?
> > > 
> > > There is a potentially exponential performance hit type-checking every
> > > element
> > > of a sequence like this. If length(table) is 1000 then a table[5]=x
> > > statement
> > > will rigorously type-check the other 999 elements.
> > > A possible solution is:
> > > 
> > > -- homogenous.e
> > > sequence the_table -- local, so no-one else can play with it
> > > 
> > > global type tableitem(sequence x)
> > >   if length(x)=2 and sequence(x[1]) and atom(x[2]) then
> > >      return 1
> > >   end if
> > >   return 0
> > > end type
> > > 
> > > global procedure replace(integer idx, tableitem x)
> > >   the_table[idx]=x
> > > end procedure
> > > 
> > > global function get_item(integer idx)
> > >   return the_table[idx]
> > > end function
> > > -- plus a host of similar functions to append, remove, insert etc. :(
> > > 
> > > In other words the_table is not itself type-checked, but all the
> > > operations
> > > allowed on it are. I accept that the above, while solving a particular
> > > performance
> > > problem, is rather cumbersome and does not easily scale well to handle
> > > multiple
> > > tables.
> > 
> > Thanks for the reply, Pete. I don't understand how it answers my
> > question, though. My original post probably was not very clear.
> > Here is the reason why I'm interested in all this:
> > 
> > A global library function shall operate on all elements of a list
> > which is passed to it as parameter. I'm not talking of tables, but the
> > list could contain _any_ type of elements, as long as all elements in
> > the list have the same "structure".
> > 
> > I want the function to do Euphoria sequence operations with all elements
> > of the list like this:
> > 
> > function foo (homogeneous_sequence list)
> >    object z
> >    ...
> >    z = list[1]
> >    for i = 2 to length(list) do
> >       z += list[i]
> >    end for
> >    ...
> > end function
> > 
> > I want the elements of 'list' to be either all atoms or all sequences.
> > If all elements of 'list' are sequences, then their lengths must be the
> > same, otherwise Eu sequence ops won't work. If the sequences contain
> > sub-sequences, their length must also be the same, I think.
> > 
> > So I'm looking for a good way to implement the user-defined type
> > 'homogeneous_sequence', that checks whether either all elements of
> > 'list' are atoms, or all elements of 'list' are sequences with the
> > same lengths, and with the same order of atoms and sequences inside,
> > and all corresponding sub-sequences also must have the same lengths
> > and so on ...
> > 
> > Regards,
> >    Juergen
> 
> You are painfully aware that type checking has O(N) complexity, where N is
> not the length, but the number of all atoms in all your subsequences. Regular
> type checking is probably very inefficient here.

O(N) complexity doesn't cause any pain for me.

> Where do you get the objects you are processing from? My first approach would
> be to have such objects come from "trusted" sources, where the creation
> process,
> being under your control, would guarantee the lengths being equal etc. faster
> than checking on an object as if you didn't know anything about it.
> If the objects come from data files whose creation you didn't control, then...
> you have no choice indeed. Try to check each object once, so that you can
> consider it as trusted later, avoiding any further check.

I want the user-defined type-checking for the parameter of a library
function, which is to be published. So it's not under my control in
which context people will use it, and from where they get their data.
That's actually the reason why I think it's a good idea to perform type-
checking on the parameter.
Also, we should be aware that in a library type-checking is only an
_offer_ to the user of the library. After debugging of the program,
type-checking should be turned off anyway.
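
For readers who have not used it, here is a small sketch of switching user-defined type checks off with `without type_check`. The type `small_int` is a hypothetical stand-in for the library's type; the manual leaves the interpreter some freedom about which checks it still performs, so treat the behaviour shown in the comments as the usual case rather than a guarantee.

```euphoria
-- Sketch only: small_int is a made-up type used for illustration.
without type_check

type small_int(object x)
    if integer(x) then
        return x >= 0 and x < 10
    end if
    return 0
end type

small_int n
n = 5
n = 99   -- with checking switched off, the interpreter need not call small_int here
? n
```

With the default `with type_check`, the second assignment would stop the program with a type-check failure instead.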

Regards,
   Juergen

7. Re: Homogeneous sequence

Juergen Luethje wrote:
> 
> If all elements of 'list' are sequences, then their lengths must be the
> same, otherwise Eu sequence ops won't work.

Ding! I see now.

My gut says there is a *faster* way, danged if I can articulate it proper
though. The core idea would be Tmatch(template,object) rather than build a
template for every object. But then again you said you care not about
performance.

To answer the questions actually posed:
An empty sequence should imo be considered homogenous.
So too should an atom, in this context, which you have excluded.

You may also (one day) be looking at sq_op(a,b) and likewise wanting to prove
that both a and b have matching structures, not particularly hard just something
to factor in at the get-go.

Regards,
Pete

8. Re: Homogeneous sequence

Pete Lomax wrote:

> Juergen Luethje wrote:
> 
> > If all elements of 'list' are sequences, then their lengths must be the
> > same, otherwise Eu sequence ops won't work.
> 
> Ding! I see now.
> 
> My gut says there is a *faster* way, danged if I can articulate it proper
> though.
> The core idea would be Tmatch(template,object) rather than build a template
> for every object.

I think I understand the reason why. When already the beginning of an
object does not match the beginning of the template, it's not necessary
any more to "go through" that whole object in order to create a template
of it.

> But then again you said you care not about performance.

Another misunderstanding. The suggestion "Do w/o type-checking, that will
save time." just is not an option for me, since I want to do
type-checking :-) (i.e. offer it to the user of the library).
But if there is a faster way to do it, then I'll be happy.
Now I've written the following code, and will do some more tests with it:
function equal_structures_0 (object template, object x)
   -- out: TRUE/FALSE
   return equal(template, x=x)
end function

function equal_structures_1 (object x1, object x2)
   -- Here it is not necessary and also doesn't seem to be advantageous
   -- to create a template firstly. We can as well use the original data
   -- itself.
   -- out: TRUE/FALSE

   if atom(x1) and atom(x2) then
      return 1
   end if

   if sequence(x1) and sequence(x2)
   and (length(x1) = length(x2)) then
      for i = 1 to length(x1) do
         if not equal_structures_1(x1[i], x2[i]) then
            return 0
         end if
      end for
      return 1
   end if

   return 0
end function

-- Demo
object a, b, c, template
a = {3,{3 ,5,{5,2},4}}
b = {4,{5 ,1,{9,2},4}}
c = {4,{{},1,{9,2},4}}

-- old way
template = (a=a)
? equal_structures_0(template, b)
? equal_structures_0(template, c)

-- new way
? equal_structures_1(a, b)
? equal_structures_1(a, c)


> To answer the questions actually posed:
> An empty sequence should imo be considered homogenous.
> So too should an atom, in this context, which you have excluded.
> 
> You may also (one day) be looking at sq_op(a,b) and likewise wanting to prove
> that both a and b have matching structures, not particularly hard just
> something
> to factor in at the get-go.

That's a good idea, I think.

Thanks,
   Juergen

9. Re: Homogeneous sequence

Hi all.
While reading this thread, I browsed my General Functions package and saw
an obvious mistake in the example for the Structure function, which does
a very similar thing to one of the functions discussed here.
For now, I don't plan to post a correction, because the mistake is so
obvious.
Regards.

10. Re: Homogeneous sequence

Ricardo Forno wrote:

> Hi all.
> While reading this thread, I browsed my General Functions package and saw
> an obvious mistake in the example for the Structure function, which does
> a very similar thing to one of the functions discussed here.
> For now, I don't plan to post a correction, because the mistake is so
> obvious.
> Regards.

Hello Ricardo,

this reminds me ...
When looking for the solution of a general problem, I actually should
firstly look into your great General Functions package.

Regards,
   Juergen

11. Re: Homogeneous sequence

Me wrote:

<snip>

> Now I've written the following code, and will do some more tests with it:
> 
> function equal_structures_0 (object template, object x)
>    -- out: TRUE/FALSE
>    return equal(template, x=x)
> end function
> 
> function equal_structures_1 (object x1, object x2)
>    -- Here it is not necessary and also doesn't seem to be advantageous
>    -- to create a template firstly. We can as well use the original data
>    -- itself.
>    -- out: TRUE/FALSE
> 
>    if atom(x1) and atom(x2) then
>       return 1
>    end if
> 
>    if sequence(x1) and sequence(x2)
>    and (length(x1) = length(x2)) then
>       for i = 1 to length(x1) do
>          if not equal_structures_1(x1[i], x2[i]) then
>             return 0
>          end if
>       end for
>       return 1
>    end if
> 
>    return 0
> end function
> 
> -- Demo
> object a, b, c, template
> a = {3,{3 ,5,{5,2},4}}
> b = {4,{5 ,1,{9,2},4}}
> c = {4,{{},1,{9,2},4}}
> 
> -- old way
> template = (a=a)
> ? equal_structures_0(template, b)
> ? equal_structures_0(template, c)
> 
> -- new way
> ? equal_structures_1(a, b)
> ? equal_structures_1(a, c)


<snip>

According to my tests, function 0 is clearly faster than function 1.
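
For anyone who wants to repeat the comparison, a timing harness could look roughly like the sketch below. This is not necessarily how the test above was done: the loop count `ROUNDS` and the variable names are arbitrary assumptions, and the two functions are repeated only to keep the snippet self-contained.

```euphoria
-- Sketch only: ROUNDS and the timing variables are arbitrary choices.
function equal_structures_0 (object template, object x)
   return equal(template, x=x)
end function

function equal_structures_1 (object x1, object x2)
   if atom(x1) and atom(x2) then
      return 1
   end if
   if sequence(x1) and sequence(x2)
   and (length(x1) = length(x2)) then
      for i = 1 to length(x1) do
         if not equal_structures_1(x1[i], x2[i]) then
            return 0
         end if
      end for
      return 1
   end if
   return 0
end function

constant ROUNDS = 100000
object a, b, template
integer ok
atom t0, t1

a = {3,{3 ,5,{5,2},4}}
b = {4,{5 ,1,{9,2},4}}
template = (a=a)

t0 = time()
for n = 1 to ROUNDS do
   ok = equal_structures_0(template, b)
end for
t0 = time() - t0

t1 = time()
for n = 1 to ROUNDS do
   ok = equal_structures_1(a, b)
end for
t1 = time() - t1

printf(1, "function 0: %.2f s, function 1: %.2f s\n", {t0, t1})
```

The result will depend on the shape of the data; with inputs that differ near the beginning, the early exit in function 1 might change the picture.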

Regards,
   Juergen

12. Re: Homogeneous sequence

Juergen Luethje wrote:
> 
> Ricardo Forno wrote:
> 
> > Hi all.
> > While reading this thread, I browsed my General Functions package and saw
> > an obvious mistake in the example for the Structure function, which does
> > a very similar thing to one of the functions discussed here.
> > For now, I don't plan to post a correction, because the mistake is so
> > obvious.
> > Regards.
> 
> Hello Ricardo,
> 
> this reminds me ...
> When looking for the solution of a general problem, I actually should
> firstly look into your great General Functions package.
> 
> Regards,
>    Juergen

Hi, Juergen

This works, I think, though I'd rather have an iterative solution.

type seq_of_atom_or_empty(sequence s)
    if equal(s, {}) then
        return 1
    end if
    for i = 1 to length(s) do
        if sequence(s[i]) then
            return 0
        end if
    end for
    return 1
end type

function object_structure_2(object x)
    if atom(x) then
        return 1
    elsif seq_of_atom_or_empty(x) then
        return {}
    end if

    for i = 1 to length(x) do
        x[i] = object_structure_2(x[i])
    end for
    return x
end function

global type homogeneous_sequence_2 (object x)
   object struc

   if atom(x) then
      return 0
   end if

   if length(x) = 0 then
      return 1                  -- ?
   end if

   struc = object_structure_2(x[1])
   for i = 2 to length(x) do
      if not equal(struc, object_structure_2(x[i])) then
         return 0
      end if
   end for
   
   return 1
end type


-- Demo:
object a, b, c
a = {3,{3 ,5,{5,2},4}}
b = {4,{5 ,1,{9,2},4}}
c = {4,{{},1,{9,2},4}}

-- ? object_structure(a)
-- ? object_structure(b)
-- ? object_structure(c)
-- ? homogeneous_sequence({a,b})
-- ? homogeneous_sequence({a,c})
-- ? homogeneous_sequence({b,c})
? object_structure_2(a)
? object_structure_2(b)
? object_structure_2(c)
? homogeneous_sequence_2({a,b})
? homogeneous_sequence_2({a,c})
? homogeneous_sequence_2({b,c})


Bob

13. Re: Homogeneous sequence

object a, b, c, a1, b1, c1
a = {3,{3 ,5,{5,2},4}}
b = {4,{5 ,1,{9,2},4}}
c = {4,{{},1,{9,2},4}}

a1 = a >= 0 or a < 0
b1 = b >= 0 or b < 0
c1 = c >= 0 or c < 0

?equal(a1,b1) -- 1
?equal(a1,c1) -- 0
?equal(b1,c1) -- 0

14. Re: Homogeneous sequence

c.k.lester wrote:
> 
> object a, b, c, a1, b1, c1
> a = {3,{3 ,5,{5,2},4}}
> b = {4,{5 ,1,{9,2},4}}
> c = {4,{{},1,{9,2},4}}
> 
> a1 = a >= 0 or a < 0
> b1 = b >= 0 or b < 0
> c1 = c >= 0 or c < 0

Oops. I guess better (faster) is

 a1 = (a=a)
 b1 = (b=b)
 c1 = (c=c)

> ?equal(a1,b1) -- 1
> ?equal(a1,c1) -- 0
> ?equal(b1,c1) -- 0

15. Re: Homogeneous sequence

Bob Elia (me) wrote:
> 
> Juergen Luethje wrote:
> > 
> > Ricardo Forno wrote:
> > 
> > > Hi all.
> > > While reading this thread, I browsed my General Functions package and saw
> > > an obvious mistake in the example for the Structure function, which does
> > > a very similar thing to one of the functions discussed here.
> > > For now, I don't plan to post a correction, because the mistake is so
> > > obvious.
> > > Regards.
> > 
> > Hello Ricardo,
> > 
> > this reminds me ...
> > When looking for the solution of a general problem, I actually should
> > firstly look into your great General Functions package.
> > 
> > Regards,
> >    Juergen
> 
> Hi, Juergen
> 
> This works, I think, though I'd rather have an iterative solution.
> 
> type seq_of_atom_or_empty(sequence s)
>     if equal(s, {}) then
>         return 1
>     end if
>     for i = 1 to length(s) do
>         if sequence(s[i]) then
>             return 0
>         end if
>     end for
>     return 1
> end type
> 
> function object_structure_2(object x)
>     if atom(x) then
>         return 1
>     elsif seq_of_atom_or_empty(x) then
>         return {}
>     end if
> 
>     for i = 1 to length(x) do
>         x[i] = object_structure_2(x[i])
>     end for
>     return x
> end function
> 
> global type homogeneous_sequence_2 (object x)
>    object struc
> 
>    if atom(x) then
>       return 0
>    end if
> 
>    if length(x) = 0 then
>       return 1                  -- ?
>    end if
> 
>    struc = object_structure_2(x[1])
>    for i = 2 to length(x) do
>       if not equal(struc, object_structure_2(x[i])) then
>          return 0
>       end if
>    end for
>    
>    return 1
> end type
> 
> 
> -- Demo:
> object a, b, c
> a = {3,{3 ,5,{5,2},4}}
> b = {4,{5 ,1,{9,2},4}}
> c = {4,{{},1,{9,2},4}}
> 
> -- ? object_structure(a)
> -- ? object_structure(b)
> -- ? object_structure(c)
> -- ? homogeneous_sequence({a,b})
> -- ? homogeneous_sequence({a,c})
> -- ? homogeneous_sequence({b,c})
> ? object_structure_2(a)
> ? object_structure_2(b)
> ? object_structure_2(c)
> ? homogeneous_sequence_2({a,b})
> ? homogeneous_sequence_2({a,c})
> ? homogeneous_sequence_2({b,c})
> 
> Bob

Nope, that doesn't work.  I was trying to account for strings of un-equal
length but {65,66,67} might not be one.  Oops.
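
The underlying problem can be shown in two lines: in Euphoria a string literal is just a sequence of character codes, so a structural check has no way to tell a name from a list of numbers. The values below are only an example.

```euphoria
? equal("ABC", {65,66,67})                           -- prints 1
? equal(("ABC" = "ABC"), ({65,66,67} = {65,66,67}))  -- prints 1
```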

In what form is the user providing this data? As Euphoria objects?
Or are you converting it?

Bob

16. Re: Homogeneous sequence

Bob Elia wrote:

<big snip>

> Nope, that doesn't work.  I was trying to account for strings of un-equal
> length but {65,66,67} might not be one.  Oops.
> 
> In what form is the user providing this data? As Euphoria objects?

As a sequence.

> Or are you converting it?

I only want to create a user-defined type that checks the data which are
passed as parameter to a library function. The definition of that type
will be in the same library as the function, and it should look somehow
like this:
global type homogeneous_sequence (object x)
   -- A sequence is considered homogeneous, when all its top-level
   -- elements have the same structure. We'll get the structure of an
   -- Euphoria object simply by assigning an arbitrary value (always the
   -- same, though!) to all of its atoms. Here the number 1 is used.   
   object struc, t

   if atom(x) then
      return 0
   end if

   if length(x) = 0 then
      return 1
   end if

   t = x[1]
   struc = (t=t)                     -- assign 1 to all atoms in t
   for i = 2 to length(x) do
      t = x[i]
      if not equal(struc, t=t) then
         return 0
      end if
   end for
   
   return 1
end type
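
The `t=t` step in the type above can be looked at in isolation. The sketch below reuses the demo values from earlier in the thread and spells out the expected "structure" by hand; it assumes ordinary numeric atoms.

```euphoria
object a, c
a = {3,{3 ,5,{5,2},4}}
c = {4,{{},1,{9,2},4}}

-- comparing an object with itself works element by element,
-- so every atom turns into 1 while the nesting is kept
? equal(a = a, {1,{1,1,{1,1},1}})    -- prints 1
? equal(c = c, {1,{{},1,{1,1},1}})   -- prints 1
? equal(a = a, c = c)                -- prints 0
```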

Sorry, I can't explain it better.
However, I'm happy with this type as it is now. Of course, if you or
someone else has a suggestion that makes it considerably faster ...

Regards,
   Juergen

17. Re: Homogeneous sequence

c.k.lester wrote:

> c.k.lester wrote:
> > 
> > object a, b, c, a1, b1, c1
> > a = {3,{3 ,5,{5,2},4}}
> > b = {4,{5 ,1,{9,2},4}}
> > c = {4,{{},1,{9,2},4}}
> > 
> > a1 = a >= 0 or a < 0
> > b1 = b >= 0 or b < 0
> > c1 = c >= 0 or c < 0
> 
> Oops. I guess better (faster) is
> 
>  a1 = (a=a)
>  b1 = (b=b)
>  c1 = (c=c)

Yes, that's faster. And that's what I have used in my posts in this
thread (except for the first one). :-)

> > ?equal(a1,b1) -- 1
> > ?equal(a1,c1) -- 0
> > ?equal(b1,c1) -- 0

Regards,
   Juergen

18. Re: Homogeneous sequence

CChris wrote:
> You are painfully aware that type checking has O(N) complexity, where N is
> not the length, but the number of all atoms in all your subsequences. Regular
> type checking is probably very inefficient here.

This reminds me of a proposed extension to the type system
that various people, including myself, have thought about before.
It would be intuitively clear, and somewhat useful, to allow
type declarations such as:
  sequence of integer x
  sequence of object x
  sequence of sequence x
  sequence of sequence of atom x
  sequence of my_user_defined_type x
  etc.


In many cases this would reduce the type-checking cost from
O(n) to O(1), since the type-check could be limited to a
single value, e.g.

  x[5] = expression

would only have to test x[5], not all of x, as you would need
today if you made your own user-defined type that loops over
all of x to enforce that it only contain elements of say
integer, or some other type.
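
For comparison, this is roughly what such a type has to look like in the language as it stands. It is only a sketch; `seq_of_integer` is a hypothetical name, not an existing or proposed built-in.

```euphoria
-- Sketch only: seq_of_integer is a made-up user-defined type.
type seq_of_integer(object x)
    if not sequence(x) then
        return 0
    end if
    for i = 1 to length(x) do      -- visits every element on each check
        if not integer(x[i]) then
            return 0
        end if
    end for
    return 1
end type

seq_of_integer x
x = {10, 20, 30, 40, 50}
x[5] = 99     -- only x[5] changed, yet the loop above may run over all of x
? seq_of_integer(x)   -- prints 1
```

With a built-in `sequence of integer`, the interpreter would know that only the assigned element needs testing.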

Stricter types would help to catch bugs earlier in the code, 
and document variables more precisely.

Type information, such as this, could also in many cases 
help the Translator to produce slightly faster C code.

I spent a lot of time thinking about this a few years ago.
I didn't proceed with it because:
  
  * I was worried that since the existing user-defined type
    system is not used all that much, perhaps this enhancement
    to the type system would not be used much either. People 
    might be content to just say:
         sequence x
    rather than the much wordier:
         sequence of integer x
    On the other hand, this enhancement might encourage much more
    use of user-defined types.

  * Newbies might be confused, and put off by all this, 
    thinking it was somehow necessary to provide full and proper
    types for everything.

  * It seemed at the time to be a fair bit of work, and extra 
    baggage added to the language, for just a small gain.

  * In many cases, the Translator deduces this information by
    examining all uses of a variable across the whole program.

But perhaps we should reconsider this idea.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com

19. Re: Homogeneous sequence

Robert Craig wrote:
> 
> CChris wrote:
> > You are painfully aware that type checking has O(N) complexity, where N is
> > not the length, but the number of all atoms in all your subsequences. Regular
> > type checking is probably very inefficient here.
> 
> This reminds me of a proposed extension to the type system
> that various people, including myself, have thought about before.
> It would be intuitively clear, and somewhat useful, to allow
> type declarations such as:
>   sequence of integer x
>   sequence of object x
>   sequence of sequence x
>   sequence of sequence of atom x
>   sequence of my_user_defined_type x
>   etc.
> 
> In many cases this would reduce the type-checking cost from
> O(n) to O(1), since the type-check could be limited to a
> single value, e.g.
> 
>   x[5] = expression
> 
> would only have to test x[5], not all of x, as you would need
> today if you made your own user-defined type that loops over
> all of x to enforce that it only contain elements of say
> integer, or some other type.
> 
> Stricter types would help to catch bugs earlier in the code, 
> and document variables more precisely.
> 
> Type information, such as this, could also in many cases 
> help the Translator to produce slightly faster C code.
> 
> I spent a lot of time thinking about this a few years ago.
> I didn't proceed with it because:
>   
>   * I was worried that since the existing user-defined type
>     system is not used all that much, perhaps this enhancement
>     to the type system would not be used much either. People 
>     might be content to just say:
>          sequence x
>     rather than the much wordier:
>          sequence of integer x
>     On the other hand, this enhancement might encourage much more
>     use of user-defined types.
> 
>   * Newbies might be confused, and put off by all this, 
>     thinking it was somehow necessary to provide full and proper
>     types for everything.
> 
>   * It seemed at the time to be a fair bit of work, and extra 
>     baggage added to the language, for just a small gain.
> 
>   * In many cases, the Translator deduces this information by
>     examining all uses of a variable across the whole program.
> 
> But perhaps we should reconsider this idea.
> 
> Regards,
>    Rob Craig
>    Rapid Deployment Software
>    http://www.RapidEuphoria.com

This would save typing:

  seq_of_int x
  seq_of_obj x
  seq_of_seq x
  seq_of_seq_of_atom x
  seq_of_udt x

Bernie

My files in archive:
WMOTOR, XMOTOR, W32ENGIN, MIXEDLIB, EU_ENGIN, WIN32ERU, WIN32API 

Can be downloaded here:
http://www.rapideuphoria.com/cgi-bin/asearch.exu?dos=on&win=on&lnx=on&gen=on&keywords=bernie+ryan


20. Re: Homogeneous sequence

Robert Craig wrote:
> This reminds me of a proposed extension to the type system
> that various people, including myself, have thought about before.
> It would be intuitively clear, and somewhat useful, to allow
> type declarations such as:
> 
>   sequence of integer x
>   sequence of object x
>   sequence of sequence x
>   sequence of sequence of atom x
>   sequence of my_user_defined_type x
>   etc.


> But perhaps we should reconsider this idea.

Yes, yes, yes! 

-- 
Derek Parnell
Melbourne, Australia
Skype name: derek.j.parnell


21. Re: Homogeneous sequence

Robert Craig wrote:

> CChris wrote:
> > You are painfully aware that type checking has O(N) complexity, where N is
> > not the length, but the number of all atoms in all your subsequences. Regular
> > type checking is probably very inefficient here.
> 
> This reminds me of a proposed extension to the type system
> that various people, including myself, have thought about before.
> It would be intuitively clear, and somewhat useful, to allow
> type declarations such as:
> 
>   sequence of integer x
>   sequence of object x
>   sequence of sequence x
>   sequence of sequence of atom x
>   sequence of my_user_defined_type x
>   etc.
> 
> In many cases this would reduce the type-checking cost from
> O(n) to O(1), since the type-check could be limited to a
> single value, e.g.
> 
>   x[5] = expression
> 
> would only have to test x[5], not all of x, as you would need
> today if you made your own user-defined type that loops over
> all of x to enforce that it only contain elements of say
> integer, or some other type.
> 
> Stricter types would help to catch bugs earlier in the code, 
> and document variables more precisely.

I really like this suggestion. It would be a "natural" extension
of Euphoria's current type system.

> Type information, such as this, could also in many cases 
> help the Translator to produce slightly faster C code.
> 
> I spent a lot of time thinking about this a few years ago.
> I didn't proceed with it because:
>   
>   * I was worried that since the existing user-defined type
>     system is not used all that much, perhaps this enhancement
>     to the type system would not be used much either. People 
>     might be content to just say:
>          sequence x
>     rather than the much wordier:
>          sequence of integer x
>     On the other hand, this enhancement might encourage much more
>     use of user-defined types.

This is hard to predict IMHO.

>   * Newbies might be confused, and put off by all this, 
>     thinking it was somehow necessary to provide full and proper
>     types for everything.

This would be my main concern. However, I think it can be avoided by
good documentation. The current Eu documentation IMHO is good or very
good. If explanations concerning the extended type system will be
written in the same clear and precise way, neither too short nor too
long, then people should easily understand it.

>   * It seemed at the time to be a fair bit of work, and extra 
>     baggage added to the language, for just a small gain.

I don't have enough knowledge about Eu's internals, so I can't judge
how "bad" that is.

>   * In many cases, the Translator deduces this information by
>     examining all uses of a variable across the whole program.

Cool!
Maybe it would be interesting to know, how many people translate their
programs, and how many people prefer to bind their programs.
For instance, IIRC reverse-engineering of bound programs is harder than
reverse-engineering of translated/compiled programs, right? If so, then
this might be a reason for people to prefer to bind their programs rather
than translate/compile them.

> But perhaps we should reconsider this idea.

My personal vote is "yes" -- being not the one who'll have to do
the work. :-)

Regards,
   Juergen


22. Re: Homogeneous sequence

Robert Craig wrote:
> 
> CChris wrote:
> > You are painfully aware that type checking has O(N) complexity, where N is
> > not the length, but the number of all atoms in all your subsequences. Regular
> > type checking is probably very inefficient here.
> 
> This reminds me of a proposed extension to the type system
> that various people, including myself, have thought about before.
> It would be intuitively clear, and somewhat useful, to allow
> type declarations such as:
> 
>   sequence of integer x
>   sequence of object x
>   sequence of sequence x
>   sequence of sequence of atom x
>   sequence of my_user_defined_type x
>   etc.
> 

Definitely a welcome addition, if only because of

> Stricter types would help to catch bugs earlier in the code, 
> and document variables more precisely.
> 

which I can verify to hold almost every day I work on a program.
<snip>

>   * Newbies might be confused, and put off by all this, 
>     thinking it was somehow necessary to provide full and proper
>     types for everything.

Are they put off, in C, by "int some_ints[]", which Eu translates as
    sequence of integer some_ints
?

Probably adding a shortcut notation like the one Bernie suggested would be a
good idea.

> 
>   * It seemed at the time to be a fair bit of work, and extra 
>     baggage added to the language, for just a small gain.
> 
>   * In many cases, the Translator deduces this information by
>     examining all uses of a variable across the whole program.
> 

One of my (revolutionary) thoughts is: now that the Translator is no longer a
special part of the Eu distribution, isn't it time to use its algorithm (one or
two passes only) in the interpreter, so as to generate some optimised IL? For
programs that run in a loop (i.e. most instructions are executed several times),
the overhead of type inference would be offset by sharper IL. Perhaps a new "with
type_detect" would be useful to turn this on or off, if the overhead may be
significant.

CChris
> But perhaps we should reconsider this idea.
> 
> Regards,
>    Rob Craig
>    Rapid Deployment Software
>    http://www.RapidEuphoria.com


23. Re: Homogeneous sequence

CChris wrote:
> >   sequence of integer x
> >   sequence of object x
> >   sequence of sequence x
> >   sequence of sequence of atom x
> >   sequence of my_user_defined_type x
> >   etc.



> Probably adding a shortcut notation like the one Bernie
> suggested would be a good idea.

I would strenuously fight against that suggestion. I didn't comment on Bernie's
original post because I thought that it was so obviously a mistaken idea that no
one would support it.

If one must have a shortened syntax (and I can see why some might like that),
then keep it in the 'Euphoria' style and avoid abbreviations etc...

As all of these variations are sequences of some sort, and the common
English term for a sequence is 'list', then maybe something along the lines of
...

  integer list x
  object list x
  sequence list x
  atom list list x
  my_user_defined_type list x

This form avoids the need for 'plural' English ("list of integers"), avoids the
preposition word "of", parses unambiguously, is extensible and you can even avoid
making "list" a reserved word by employing a context-sensitive grammar.

By the way, notice that "object list" is identical to today's "sequence" so some
purists might be tempted to remove 'sequence' as a keyword ... <just kidding>

-- 
Derek Parnell
Melbourne, Australia
Skype name: derek.j.parnell


24. Re: Homogeneous sequence

Derek Parnell wrote:
> 
> CChris wrote:
> > >   sequence of integer x
> > >   sequence of object x
> > >   sequence of sequence x
> > >   sequence of sequence of atom x
> > >   sequence of my_user_defined_type x
> > >   etc.
> 
> 
> > Probably adding a shortcut notation like the one Bernie
> > suggested would be a good idea.
> 
> I would strenuously fight against that suggestion. I didn't comment on
> Bernie's
> original post because I thought that it was so obviously a mistaken idea that
> no one would support it.
> 
> If one must have a shortened syntax (and I can see why some might like that),
> then keep it in the 'Euphoria' style and avoid abbreviations etc...
> 
> As all of these variations are sequences of some sort, and the common
> English
> term for a sequence is 'list', then maybe something along the lines of ...
> 
>   integer list x
>   object list x
>   sequence list x
>   atom list list x
>   my_user_defined_type list x
> 
> This form avoids the need for 'plural' English ("list of integers"), avoids
> the preposition word "of", parses unambiguously, is extensible and you can
> even
> avoid making "list" a reserved word by employing a context-sensitive grammar.
> 
> By the way, notice that "object list" is identical to today's "sequence" so
> some
> purists might be tempted to remove 'sequence' as a keyword ... <just kidding>
> 
> -- 
> Derek Parnell
> Melbourne, Australia
> Skype name: derek.j.parnell

This one suits me fine too.

CChris


25. Re: Homogeneous sequence

Derek Parnell wrote:

> CChris wrote:
> > >   sequence of integer x
> > >   sequence of object x
> > >   sequence of sequence x
> > >   sequence of sequence of atom x
> > >   sequence of my_user_defined_type x
> > >   etc.
> 
> 
> > Probably adding a shortcut notation like the one Bernie
> > suggested would be a good idea.
> 
> I would strenuously fight against that suggestion. I didn't comment on
> Bernie's
> original post because I thought that it was so obviously a mistaken idea that
> no one would support it.
> 
> If one must have a shortened syntax (and I can see why some might like that),
> then keep it in the 'Euphoria' style and avoid abbreviations etc...

I agree.

> As all of these variations are sequences of some sort, and the common
> English
> term for a sequence is 'list', then maybe something along the lines of ...
> 
>   integer list x
>   object list x
>   sequence list x
>   atom list list x
>   my_user_defined_type list x
> 
> This form avoids the need for 'plural' English ("list of integers"),

In his suggestion above, Rob did not use the plural form. He wrote
"sequence of integer x". Or don't you like that because it's not
correct English grammar?

> avoids
> the preposition word "of", parses unambiguously, is extensible and you can
> even
> avoid making "list" a reserved word by employing a context-sensitive grammar.
> 
> By the way, notice that "object list" is identical to today's "sequence" so
> some
> purists might be tempted to remove 'sequence' as a keyword ... <just kidding>

What you proposed can also be done with the word "sequence", I think.
So it wouldn't be necessary to introduce a new keyword:

   integer sequence x
   object sequence x
   sequence sequence x
   atom sequence sequence x
   my_user_defined_type sequence x

However, I think for me personally Rob's suggestion is more readable.
And I consider readability much more important than "fast writability".

Regards,
   Juergen


26. Re: Homogeneous sequence

>    integer sequence x
>    object sequence x
>    sequence sequence x
>    atom sequence sequence x
>    my_user_defined_type sequence x

-- Which one would be correct?

sequence sequence sx
sequence sequence integer sy

sx={"Tom","Bob","Mary","John"}
sy={"Tom","Bob","Mary","John"}


Rgds, 
Salix


27. Re: Homogeneous sequence

Salix wrote:

> >    integer sequence x
> >    object sequence x
> >    sequence sequence x
> >    atom sequence sequence x
> >    my_user_defined_type sequence x
> 
> -- Which one would be correct? 
> 
> sequence sequence sx
> sequence sequence integer sy
> 
> sx={"Tom","Bob","Mary","John"}
> sy={"Tom","Bob","Mary","John"}


Both would be correct, and so would be
   sequence sx, sy
      or
   object sx, sy
      or
   sequence sequence object sx, sy
      or
   sequence sequence atom sx, sy

Hmm ... maybe too confusing for beginners after all?

Regards,
   Juergen


28. Re: Homogeneous sequence

Juergen Luethje wrote:
> 
> Salix wrote:
> 
> > >    integer sequence x
> > >    object sequence x
> > >    sequence sequence x
> > >    atom sequence sequence x
> > >    my_user_defined_type sequence x
> > 
> > -- Which one would be correct? 
> > 
> > sequence sequence sx
> > sequence sequence integer sy
> > 
> > sx={"Tom","Bob","Mary","John"}
> > sy={"Tom","Bob","Mary","John"}
> 
> Both would be correct, and so would be
>    sequence sx, sy
>       or
>    object sx, sy
>       or
>    sequence sequence object sx, sy
>       or
>    sequence sequence atom sx, sy
> 
> Hmm ... maybe too confusing for beginners after all?
> 
> Regards,
>    Juergen

Yes. Very confusing IMHO.
I do not support this idea.

Regards,

Salix


29. Re: Homogeneous sequence

Robert Craig wrote:
> 
> This reminds me of a proposed extension to the type system
> that various people, including myself, have thought about before.
> It would be intuitively clear, and somewhat useful, to allow
> type declarations such as:
> 
>   sequence of integer x
>   sequence of object x
>   sequence of sequence x
>   sequence of sequence of atom x
>   sequence of my_user_defined_type x
>   etc.
> 
When I last thought about this, I came to the conclusion that the above was
wrong. Specifically,
    sequence of sequence of atom z

has two problems:
 1) minimal semantic meaning
 2) no way to perform the manual type check, unless you plan to allow:
    if sequence of sequence of atom(z) then

As far as my logic got, I figured the *ONLY* place an "of" should be valid
is in a user defined type definition:
    type set(sequence of atom s)
       return 1
    end type
    type set_table(sequence of set st)
       return 1
    end type

Limiting the number of times that "of" can be used to *ONE* should help solve
both the problems mentioned above, and allow sensibly typed work vars when
playing with individual elements of these things.

While I somehow doubt this will gain much in the way of popular support, I
remain convinced it is far superior while still achieving all the benefits you
outlined.

Also, "sequence of object" should imo be an outright compilation error.
That is what "sequence" already is and I see no reason to either perform a
squillion isObj() tests or permit the expression then optimise it away.

> I spent a lot of time thinking about this a few years ago.
Did you leave any design notes lying around?

>   * Newbies might be confused, and put off by all this, 
>     thinking it was somehow necessary to provide full and proper
>     types for everything.
Maybe not. It has been a while now, but on re-reading:
"To augment the predefined types, you can create user-defined types."
"For many programs, there is little advantage in defining new types"
"However, for larger programs, strict type definitions can aid debugging"

I probably mentally skipped that section on first reading anyway, but was glad
to know it was there.

It is usually a good idea to drum up the doc changes first of all, and I feel
sure you can find the words to reassure the newbie that they can keep it simple.

Regards,
Pete


30. Re: Homogeneous sequence

Derek Parnell wrote:
> As all of these variations are sequences of some sort, and the common
> English
> term for a sequence is 'list', then maybe something along the lines of ...
> 
>   integer list x
>   object list x
>   sequence list x
>   atom list list x
>   my_user_defined_type list x
> 

Derek:

Aren't we simply saying:

array of integer x
array of object x
array of sequence x
array of array of sequence x
array of my_user_defined_type x


Bernie

My files in archive:
WMOTOR, XMOTOR, W32ENGIN, MIXEDLIB, EU_ENGIN, WIN32ERU, WIN32API 

Can be downloaded here:
http://www.rapideuphoria.com/cgi-bin/asearch.exu?dos=on&win=on&lnx=on&gen=on&keywords=bernie+ryan


31. Re: Homogeneous sequence

Pete Lomax wrote:
> 
> Robert Craig wrote:
> > 
> > This reminds me of a proposed extension to the type system
> > that various people, including myself, have thought about before.
> > It would be intuitively clear, and somewhat useful, to allow
> > type declarations such as:
> > 
> >   sequence of integer x
> >   sequence of object x
> >   sequence of sequence x
> >   sequence of sequence of atom x
> >   sequence of my_user_defined_type x
> >   etc.
> > 
> When I last thought about this, I came to the conclusion that the above was
> wrong. Specifically,
>     sequence of sequence of atom z
> has two problems:
>  1) minimal semantic meaning

Your comment has minimal semantic meaning.

>  2) no way to perform the manual type check, unless you plan to allow:
>       if sequence of sequence of atom(z) then
> As far as my logic got, I figured the *ONLY* place an "of" should be valid
> is in a user defined type definition:
>     type set(sequence of atom s)
>        return 1
>     end type
>     type set_table(sequence of set st)
>        return 1
>     end type
> Limiting the number of times that "of" can be used to *ONE* should help solve
> both the problems mentioned above, and allow sensibly typed work vars when
> playing
> with individual elements of these things.
> 

I must be having a nightmare.
So you never had to deal with sequences of strings, which would translate to
    sequence of sequence of printable
?
This assumes printable is a subtype of atom, presumably checking for range 32..255
unless some UTF-16/32 encodings are to be accounted for.

As the very frequent case above shows, the number should be raised to two at the
very, very minimum. And giving any explicit limit is a kludge barely squaring
with the otherwise dynamic nature of structures in Eu - there is no limit to the
depth of a sequence. The docs may warn against too high a level for obvious
performance reasons, but that should be all.

Otherwise, I'd say that limiting the use of "of" to UDT declarations makes
sense.

> While I somehow doubt this will gain much in the way of popular support, I
> remain
> convinced it is far superior while still achieving all the benefits you
> outlined.
> 
> Also, "sequence of object" should imo be an outright compilation error.
> That is what "sequence" already is and I see no reason to either perform a
> squillion
> isObj() tests or permit the expression then optimise it away.

Optimize away the expression first, then perform the tests. Why raise an
error?

> 
> > I spent a lot of time thinking about this a few years ago.
> Did you leave any design notes lying around?
> 
> >   * Newbies might be confused, and put off by all this, 
> >     thinking it was somehow necessary to provide full and proper
> >     types for everything.
> Maybe not. It has been a while now, but on re-reading:
> "To augment the predefined types, you can create user-defined types."
> "For many programs, there is little advantage in defining new types"
> "However, for larger programs, strict type definitions can aid debugging"
> 
> I probably mentally skipped that section on first reading anyway, but was glad
> to know it was there.
> 
> It is usually a good idea to drum up the doc changes first of all, and I feel
> sure you can find the words to reassure the newbie that they can keep it
> simple.
> 
> Regards,
> Pete

Do we have any database of sorts of what newbies like, don't like or barely
grasp? I have read this sort of comment repeatedly, while failing to see any
objective evidence backing the opinions stated on behalf of the newbies by non
newbies. Which makes those opinions rather dubious.

CChris


32. Re: Homogeneous sequence

Derek Parnell wrote:
> CChris wrote:
> > Probably adding a shortcut notation like the one Bernie
> > suggested would be a good idea.
> 
> I would strenuously fight against that suggestion. I didn't comment on
> Bernie's
> original post because I thought that it was so obviously a mistaken idea that
> no one would support it.
> 
> If one must have a shortened syntax (and I can see why some might like that),
> then keep it in the 'Euphoria' style and avoid abbreviations etc...

I would not like Bernie's suggestion either.
There would be an infinite number of different 
short forms required.

If things were getting too wordy,
I was thinking people could define for themselves 
some simple short forms (for example):

type string(sequence of integer s)
    return TRUE
end type

string name, country, province

type matrix(sequence of sequence of atom m)
    return TRUE
end type

matrix chess_board


If you wanted tighter checking for strings you could write:
type char(integer x)
    return x >= 0 and x <= 255 
end type

type string(sequence of char s)
    return TRUE
end type

string name

...

name[6] = 2.5 


The above would only require that we check name[6] 
to see if 2.5 is a "char" or not (it's not).
This is only O(1), so it would not kill your performance
like O(n) would. 

Of course you could still say "without type_check"
once your program was debugged.
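
To make the O(1) point concrete in today's Euphoria, here is a minimal sketch that checks only the value being stored, instead of re-checking the whole sequence. The names is_char() and set_char() are made up for this illustration, and the explicit function call only stands in for what the proposed "sequence of char" declaration would do automatically.

```euphoria
-- Sketch only: is_char() and set_char() are hypothetical helpers,
-- not part of Euphoria or of any library.

function is_char(object v)
    -- nested ifs, so that the comparisons only run on an integer
    if integer(v) then
        if v >= 0 and v <= 255 then
            return 1
        end if
    end if
    return 0
end function

function set_char(sequence s, integer i, object v)
    -- only the incoming value is tested: O(1), whatever length(s) is
    if not is_char(v) then
        puts(1, "rejected: value is not a char\n")
        return s            -- s is returned unchanged
    end if
    s[i] = v
    return s
end function

sequence name
name = "Robert"
name = set_char(name, 6, 'T')   -- accepted
name = set_char(name, 6, 2.5)   -- rejected, name keeps its previous value
puts(1, name & "\n")
```

The variable is declared as a plain sequence on purpose: with a user-defined type on the variable, the interpreter would still run the full type routine after every assignment.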

Note: if someone wrote, as you can today,
type string(sequence of integer s)
    for i = 1 to length(s) do
        if s[i] < 0 or s[i] > 255 then
            return FALSE
        end if
    end for
    return TRUE
end type


the interpreter would have to run through the O(n) loop each time
as it does now, when type_check is in force. 
It would be too much of a stretch for it to optimize that form.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


33. Re: Homogeneous sequence

Robert Craig wrote:
> Note: if someone wrote, as you can today,
> 
> type string(sequence of integer s)
>     for i = 1 to length(s) do
>         if s[i] < 0 or s[i] > 255 then
>             return FALSE
>         end if
>     end for
>     return TRUE
> end type


Of course, you can't exactly write that today, :-)
but I think you get the idea. The for-loop requires execution, 
even if only one element is changed, but if
you break it down into a sequence of a user-defined type,
then only that user-defined type needs to be executed,
when an element of a sequence is modified.
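
For comparison, a version of that type that should be accepted by the current interpreter could look like the sketch below. The parameter is declared as object so that the type can also be called safely as a test function, and the integer test moves into the loop. This is only an illustration of the O(n) form, not a proposal.

```euphoria
-- Sketch of a whole-sequence string type in current Euphoria.
type string(object s)
    if not sequence(s) then
        return 0
    end if
    for i = 1 to length(s) do
        if not integer(s[i]) then
            return 0
        end if
        if s[i] < 0 or s[i] > 255 then
            return 0
        end if
    end for
    return 1
end type

string name
name = "Rob"          -- the whole sequence is checked
name[2] = 'a'         -- the whole sequence is checked again: O(n)

? string("Rob")       -- 1
? string({1, 2.5})    -- 0
? string(65)          -- 0
```

With "without type_check" in force, the checks on assignment are skipped, but an explicit call such as string(x) still runs the loop.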

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com


34. Re: Homogeneous sequence

CChris wrote:
> 
> >     type set(sequence of atom s)
> >        return 1
> >     end type
> >     type set_table(sequence of set st)
> >        return 1
> >     end type
> > Limiting the number of times that "of" can be used to *ONE*
> 
> I must be having a nightmare.
> So you never had to deal with sequences of strings
You are misreading me. I am saying "force the programmer to give each level a
plausible name". You can daisy-chain 37 udts together if you wish.

In the above, a set_table is a sequence, each element of which is a set, where a
set is a sequence, each element of which is an atom. For your example you could
code:
    type char(integer c)
       return c>=32 and c<=255
    end type
    type line(sequence of char l)
       return 1
    end type
    type file(sequence of line f)
       return 1
    end type


As Salix pointed out "sequence of sequence of integer" simply does not hold a
candle to the code readability of a "name_table" type definition.
Obviously I accept the above is a lot more typing .. and a lot nicer.

There are also some technical points I missed.
    set_table k
    k[5]=z

If z is already type set, then you would not need to call set() as a result of
this assignment. You can compare an explicit type definition instantly. While you
would still have to walk down the type-chain as each "[" is processed, it is
obviously simpler than judging that "7 sequence of plus integer plus 3
subscripts" matches a replacement element of "4 sequence of plus integer".

Because you are only changing k[5], then in the type set_table() definition, you
do not need to type check the parameter st, but can invoke the udt after the
leading TYPE_CHECK_SOF opcode (or do nothing if that would just be a return 1 as
shown).

In this way, a smart compiler and sensibly written code will often avoid
launching a daisy chain of types, and hopefully optimise away any of the "return
1" bodies without much trouble. If it can do that last trick, I see no reason why
it would be any slower than a hidden/inline implementation, and it will be easier
to write sensible compiler-friendly code.

> > Also, "sequence of object" should imo be an outright compilation error.
> > That is what "sequence" already is and I see no reason to either perform a
> > squillion
> > isObj() tests or permit the expression then optimise it away.
> 
> Optimize away the expression first, then perform the tests. Why raise an
> error?
Because it is redundant and potentially misleading. Why allow it?

Regards,
Pete


35. Re: Homogeneous sequence

Pete Lomax wrote:
> 
> CChris wrote:
> > 
> > >     type set(sequence of atom s)
> > >        return 1
> > >     end type
> > >     type set_table(sequence of set st)
> > >        return 1
> > >     end type
> > > Limiting the number of times that "of" can be used to *ONE*
> > 
> > I must be having a nightmare.
> > So you never had to deal with sequences of strings
> You are misreading me. I am saying "force the programmer to give each level
> a plausible name". You can daisy-chain 37 udts together if you wish.
> 
> In the above, a set_table is a sequence, each element of which is a set, where
> a set is a sequence, each element of which is an atom. For your example you
> could code:
>     type char(integer c)
>        return c>=32 and c<=255
>     end type
>     type line(sequence of char l)
>        return 1
>     end type
>     type file(sequence of line f)
>        return 1
>     end type
> 
> As Salix pointed out "sequence of sequence of integer" simply does not hold
> a candle to the code readability of a "name_table" type definition.
> Obviously I accept the above is a lot more typing .. and a lot nicer.
> 
> There are also some technical points I missed.
>     set_table k
>     k[5]=z
> If z is already type set, then you would not need to call set() as a result
> of this assignment. You can compare an explicit type definition instantly.
> While you would still have to walk down the type-chain as each "[" is
> processed, it is obviously simpler than judging that "7 sequence of plus
> integer plus 3 subscripts" matches a replacement element of "4 sequence of
> plus integer".
> 
> Because you are only changing k[5], then in the type set_table() definition,
> you do not need to type check the parameter st, but can invoke the udt after
> the leading TYPE_CHECK_SOF opcode (or do nothing if that would just be a
> return
> 1 as shown).
> 
> In this way, a smart compiler and sensibly written code will often avoid
> launching
> a daisy chain of types, and hopefully optimise away any of the "return 1"
> bodies
> without much trouble. If it can do that last trick, I see no reason why it
> would
> be any slower than a hidden/inline implementation, and it will be easier to
> write sensible compiler-friendly code.
> 

Isn't it easier for the compiler to see the whole of the daisy chain, and
optimise as you said, if the type is presented as a whole?
Otherwise, the compiler has to reassemble a few nested type declarations into
the whole thing. More work both for coder and compiler, with nil benefits and
perhaps a slight loss of performance.

> > > Also, "sequence of object" should imo be an outright compilation error.
> > > That is what "sequence" already is and I see no reason to either perform a
> > > squillion
> > > isObj() tests or permit the expression then optimise it away.
> > 
> > Optimize away the expression first, then perform the tests. Why raise an
> > error?
> Because it is redundant and potentially misleading. Why allow it?
> 

Because of the overhead of testing for it, because there is no example of such
limitations in Eu, because it doesn't hurt since the last "object" will be simply
ignored.
I remember your writing "exceptions simply beget exceptions". That would be the
short answer.

CChris

> Regards,
> Pete


36. Re: Homogeneous sequence

Robert Craig wrote:

<snip>

> If you wanted tighter checking for strings you could write:
> 
> type char(integer x)
>     return x >= 0 and x <= 255 
> end type
> 
> type string(sequence of char s)
>     return TRUE
> end type

</snip>

Currently, there is a problem with code like that. When we e.g. write
....
if char(foo) then
...

then the program will crash instead of returning FALSE, in case 'foo' is
not an integer. In order to avoid that, currently we have to define the
type somehow like this:
type char(object x)
   return integer(x) and x >= 0 and x <= 255 
end type

What would that problem mean regarding the proposed extension of the type
system? Or maybe this behaviour can be changed, when the type system is
revised? I'd appreciate that.
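
One caveat about the one-line workaround above, as far as I understand the reference manual: "and" is short-circuited only in if, elsif and while conditions. In a return expression all operands are evaluated, so when x is a sequence the comparisons produce a sequence and the type no longer returns a plain 0. A sketch that avoids this by nesting the tests:

```euphoria
-- Sketch: a char type that can be called with any object.
type char(object x)
    if integer(x) then
        -- x is an integer here, so both comparisons yield atoms
        if x >= 0 and x <= 255 then
            return 1
        end if
    end if
    return 0
end type

? char('A')      -- 1
? char(2.5)      -- 0
? char("abc")    -- 0
? char(300)      -- 0
```

For atoms the one-line form and this form should give the same answers; the difference only shows up when a sequence is passed in.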

And what do you think about Pete's proposal:
| As far as my logic got, I figured the *ONLY* place an "of" should be
| valid is in a user defined type definition:
|
|    type set(sequence of atom s)
|       return 1
|    end type
|    type set_table(sequence of set st)
|       return 1
|    end type

Curious,
   Juergen

37. Re: Homogeneous sequence

CChris wrote:
> 
> Isn't it easier for the compiler to see the whole of the daisy chain, and 
> optimise as you said, if the type is presented as a whole?
Maybe. One of my nagging questions is this:
type ssm(sequence of sequence of myudt q) return 1 end type
ssm k
object z
    k[5]=z

After the assignment, where exactly do you type-check k[5]? If the programmer
was forced to create a type sm and use that in the definition of type ssm, the
answer would be obvious. Without it, well you are going to need a new typecheck
opcode that contains a for loop anyway, but then that would have to become
nested/recursive. At the end of the day it may be six of one and half a dozen of
the other, but the lack of a named type to declare z with, and the corresponding
extra type checks which will therefore need to be issued, swings it for me.

> Otherwise, the compiler has to reassemble a few nested type declarations
More a question of walking down an existing chain, I think. This stuff already
works so I thought of piggybacking onto it:
type positive_int(integer pi)
    return pi>=0
end type
type minute(positive_int m)
    return m<=59
end type
type even_min(minute em)
    return and_bits(em,1)=0
end type
type odd_min(minute om)
    return and_bits(om,1)
end type
odd_min z


> Because of the overhead of testing for it, because there is no example of 
> such limitations in Eu
Ah, but is there any similar precedent of such redundancy in Eu?
Anyway, it is just my [unshakeable] opinion it should error out.

Regards,
Pete

38. Re: Homogeneous sequence

Bernie Ryan wrote:
> 
> Derek Parnell wrote:
> > As all of these variations are sequences of some sort, and the common
> > English
> > term for a sequence is 'list', then maybe something along the lines of ...
> > 
> >   integer list x
> >   object list x
> >   sequence list x
> >   atom list list x
> >   my_user_defined_type list x
> > 
> 
> Derek:
> 
> Aren't we simply saying:
> 
> array of integer x
> array of object x
> array of sequence x
> array of array of sequence x
> array of my_user_defined_type x

Yes. The operative word being "simply". 

Most English-speaking people, especially non-programmers, use the term "list"
rather than "array" or "sequence" when describing such a structure. I feel that
"list" is a simple, one-syllable, commonly used word. Also the form "<adjective>
<noun>" is simpler than the form "<noun> of <noun>", IMHO.

Try reading out source code over the telephone.

I agree that people who would write long definitions using this syntax should be
gently taken aside and educated. So maybe it's also time to borrow another idea
from the D programming language - alias.

  alias atom list text
  alias text list file
  file options

These lines above mean that "text" is an alias for "atom list", and "file" is an
alias for "text list", and we are declaring a symbol called "options" that is of
type "file", which is really "text list", which is really "atom list list". The
alias can be thought of as a light-weight UDT.
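
For comparison, a rough sketch of how the same layering can be approximated with ordinary user-defined types today. The "alias" keyword is only proposed syntax; the type bodies below are an illustration written for this note, reusing the names "text" and "file" from the example above, and they do a full walk of the data on every check:

```euphoria
-- "atom list": a sequence whose elements are all atoms
type text(object x)
    if not sequence(x) then
        return 0
    end if
    for i = 1 to length(x) do
        if not atom(x[i]) then
            return 0
        end if
    end for
    return 1
end type

-- "text list", i.e. "atom list list"
type file(object x)
    if not sequence(x) then
        return 0
    end if
    for i = 1 to length(x) do
        if not text(x[i]) then
            return 0
        end if
    end for
    return 1
end type

file options
options = {"-v", "--out=log.txt"}

? text("abc")           -- prints 1
? file(options)         -- prints 1
? file({"-v", 3})       -- prints 0, the second element is not a sequence
```

Unlike the proposed alias, each of these checks costs a loop over the whole value, which is the overhead the thread is trying to avoid.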
 

-- 
Derek Parnell
Melbourne, Australia
Skype name: derek.j.parnell

39. Re: Homogeneous sequence

Reading this thread with interest and yet with reservations also.

According to the proposal would I be able to do something like this:
sequence of integer x
sequence y
object z

-- some code assigning values to x, y, and z

x = y -- typecheck y to make sure it complies with the definition
x = z -- ditto


Or would it fail right away because the type definitions do not match? 

--
A complex system that works is invariably found to have evolved from a simple
system that works.
--John Gall's 15th law of Systemantics.

"Premature optimization is the root of all evil in programming."
--C.A.R. Hoare

j.

40. Re: Homogeneous sequence

Juergen Luethje wrote:
> the program will crash instead of returning FALSE, in case 'foo' is
> not an integer. In order to avoid that, currently we have to define the
> type somehow like this:
> 
> type char(object x)
>    return integer(x) and x >= 0 and x <= 255 
> end type
> 
> What would that problem mean regarding the proposed extension of the type
> system? Or maybe this behaviour can be changed, when the type system is
> revised? I'd appreciate that.
Good point. My proposal is useless without that change, I forgot that.

I suspect this is a relatively easy change to try: Instead of emitting a
TYPE_CHECK opcode at the top of a type definition, emit new opcodes which effect
a return 0:
Somewhere in parser.e you should find:
if SymTab[p][S_TOKEN] = TYPE and param_num != 1 then
	CompileErr("types must have exactly one parameter")
    end if

which seems the point to set a flag (change "and" to "then flag=1 if"), then
after
    -- code to perform type checks on all the parameters
    sym = SymTab[p][S_NEXT]
    for i = 1 to SymTab[p][S_NUM_ARGS] do
        TypeCheck(sym)
        sym = SymTab[sym][S_NEXT]
    end for

clear that flag. Within TypeCheck(), instead of
    emit_op(INTEGER_CHECK)
    emit_op(SEQUENCE_CHECK)
    emit_op(ATOM_CHECK)
    emit_op(TYPE_CHECK)

when this flag is set emit some new opcodes which instead of (in execute.e):
procedure opINTEGER_CHECK()
    a = Code[pc+1]
    if not integer(val[a]) then
        RTFatalType(pc+1)
    end if
    pc += 2
end procedure

do a hacked version of opRETURNP/F() (instead of RTFatal) to return 0.

At that point, it got a bit too messy for a quick hack to carry on with right
now, but if you get the gist give it a twirl. :-)

Obviously I am only suggesting an experiment on eu.ex/hll back-end, in pure Eu
code, also obviously you cannot test the flag in opINTEGER_CHECK(), but need the
new opcodes and new routines.

An interesting point would be whether you can find *ANY* working program that
this change breaks, obviously I somewhat doubt it.

> And what do you think about Pete's proposal:
And what do you think about Pete's proposal?

Regards,
Pete

41. Re: Homogeneous sequence

Derek Parnell wrote:
> 
> Try reading out source code over the telephone.
> 
<snip>
>
>   alias atom list text
>   alias text list file
>   file options
> 
LOL! That tickled me, thanks!

Obviously you're a fan of D unlike me and on closer reading it has merits but
that specific syntax is not an Eu fit, imo.

Still giggling,
Pete

42. Re: Homogeneous sequence

Jason Gade wrote:
> 
> Reading this thread with interest and yet with reservations also.
> 
> According to the proposal would I be able to do something like this:
> sequence of integer x
> sequence y
> object z
> 
> -- some code assigning values to x, y, and z
> 
> x = y -- typecheck y to make sure it complies with the definition
> x = z -- ditto
> 
> Or would it fail right away because the type definitions do not match? 

I am sure your fears are unfounded.

First, the typecheck both times is on x to ensure the new value fits.
  (error: type check failure, x is -1)
  Type checking is always done after the effect, for such messages.
  It is no different to y=z, failing (only) when z is not a sequence.

The objective is to perform less type checks, not more. y=x would perform no
typecheck whatsoever, because the compiler has enough type info to know it will
succeed, ditto z=y and z=x. More detailed type info === even less checks.

Actually this point is debatable: currently more detailed checks are so
expensive no-one does them; if we slash the cost then arguably more people will
use them, which is also an objective.

Type-wise, you should note that the compiler can seem thick as pigshit:
integer i
   i=3.5
   i="fred"

compiles/binds/translates without hitch. It only goes wrong at run-time; it is
only the backend that knows much about type errors. Obviously the front-end is
actually pretty smart about emitting the right code for the right types, just not
about predicting type errors. Reasonable, I guess.

HTH,
Pete

43. Re: Homogeneous sequence

> > And what do you think about Pete's proposal:
> And what do you think about Pete's proposal?

I'm strongly against introducing these new definitions in general. (And not only
Pete's suggestion.)

For me Euphoria's motto is "just say NO to complicated programming languages".
This change doesn't solve any of my coding problems but adds complexity to the
syntax.

I feel that this proposal is clearly against the "Clean and simple syntax", the
"Minimal and simple to use Data Types", and probably the "Maintainability" points
of the (non-official) mission statement (that I keep relevant). (See
http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=8&fromYear=B&toMonth=A&toYear=B&postedBy=ray+smith&keywords=Clean+and+simple+Syntax
)

I think it is also against Rob's indirectly stated wish that Euphoria "will gain
more users". (See
http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=8&fromYear=B&toMonth=A&toYear=B&postedBy=craig&keywords=birds
). Imagine a newbie asking how to define a simple "string" in Euphoria, that is a
widely known and commonly used variable type?

list integer gain_for_salix
-- or list atom gain_for_salix
-- or list object gain_for_salix
-- or sequence gain_for_salix
-- or object gain_for_salix
gain_for_salix="zero"


... a nightmare.

Regards,

Salix

44. Re: Homogeneous sequence

Pete Lomax wrote:
> 
> Derek Parnell wrote:
> > 
> > Try reading out source code over the telephone.
> > 
> <snip>
> >
> >   alias atom list text
> >   alias text list file
> >   file options
> > 
> LOL! That tickled me, thanks!
> 
> Obviously you're a fan of D unlike me and on closer reading it has merits but
> that specific syntax is not an Eu fit, imo.
> 
> Still giggling,
> Pete

Even more concise and visually intuitive:
{atom} s -- a sequence of atoms
{{integer}} -- a sequence of sequences of integers
{{}} -- a sequence of sequences

Since the interpreter expects type marks in very specific places, this construct
is not ambiguous at all.
And {atom}(s) could be parsed properly, including inside an if/while statement.

"object" in the middle would be optimised away easily, without any need to error
out.

How about that?

CChris

45. Re: Homogeneous sequence

Salix wrote:
> 
> > > And what do you think about Pete's proposal:
> > And what do you think about Pete's proposal?
> 
> I'm strongly against introducing these new definitions in general. (And not
> only Pete's suggestion.)
> 
> For me Euphoria's motto is "just say NO to complicated programming languages".
> This change doesn't solve any of my coding problems but adds complexity to
> the syntax. 
> 
> I feel that this proposal is clearly against the "Clean and simple syntax",
> the "Minimal and simple to use
> Data Types", and probably the "Maintainability" points of the (non-official)
> mission statement (that I keep
> relevant). (See
> http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=8&fromYear=B&toMonth=A&toYear=B&postedBy=ray+smith&keywords=Clean+and+simple+Syntax
> )
> 
> I think it is also against Rob's indirectly stated wish that Euphoria "will
> gain more users". (See
> http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=8&fromYear=B&toMonth=A&toYear=B&postedBy=craig&keywords=birds
> ). Imagine a newbie asking how to define a simple "string" in Euphoria, that
> is a widely known and commonly
> used variable type? 
> 
> list integer gain_for_salix
> -- or list atom gain_for_salix
> -- or list object gain_for_salix
> -- or sequence gain_for_salix
> -- or object gain_for_salix
> gain_for_salix="zero"
> 
> ... a nightmare.
> 
> Regards,
> 
> Salix

It will lose even more users if it doesn't gain some more power.
A string is a sequence of printable.
A printable is an integer, or a subrange of integers - it all depends on how
strict you need to be.
It is still ok to define a string as a sequence - that's what we all do - but no
one here is happy to have {{1,2}} pass the type check. Are you?

CChris

46. Re: Homogeneous sequence

Salix wrote:
 
> > > And what do you think about Pete's proposal:
> > And what do you think about Pete's proposal?
> 
> I'm strongly against introducing these new definitions in general. (And not
> only Pete's suggestion.)

Me too.

I don't think it is that important a feature; anyone can always
pack/unpack 8 bytes into an atom or 4 bytes into an integer,
so "sequence of integer" or "sequence of atom" does not make
much sense to me, nor does "sequence of sequence".
 
> For me Euphoria's motto is "just say NO to complicated programming languages".

I think this slogan has become somewhat obsolete since the
source code of Euphoria was opened. Anyway, C, one of the most
complicated languages, is involved in the development of the
new Open Euphoria. We just cannot say NO to C now, if
we want a new Euphoria.

I'd like to see a new slogan. Something like
"Euphoria! Just say YES to a simple programming language!"

> This change doesn't solve any of my coding problems but adds complexity to
> the syntax. 
> 
> I feel that this proposal is clearly against the "Clean and simple syntax",
> the "Minimal and simple to use
> Data Types", and probably the "Maintainability" points of the (non-official)
> mission statement (that I keep
> relevant). (See
> http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=8&fromYear=B&toMonth=A&toYear=B&postedBy=ray+smith&keywords=Clean+and+simple+Syntax
> )
> 
> I think it is also against Rob's indirectly stated wish that Euphoria "will
> gain more users". (See
> http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=8&fromYear=B&toMonth=A&toYear=B&postedBy=craig&keywords=birds
> ). Imagine a newbie asking how to define a simple "string" in Euphoria, that
> is a widely known and commonly
> used variable type? 
> 
> list integer gain_for_salix
> -- or list atom gain_for_salix
> -- or list object gain_for_salix
> -- or sequence gain_for_salix
> -- or object gain_for_salix
> gain_for_salix="zero"
> 
> ... a nightmare.

I agree with your opinion, Salix.

Regards,
Igor Kachan
kinz at peterlink.ru

47. Re: Homogeneous sequence

CChris wrote:
> Salix wrote: 
> It will lose even more users if it doesn't gain some more power.
> A string is a sequence of printable.
> A printable is an integer, or a subrange of integers - it all depends on how
> strict you need to be.
> It is still ok to define a string as a sequence - that's what we all do - but
> no one here is happy to have {{1,2}} pass the type check. Are you?
> 

Thanks for asking, CChris! I am. :-)

See, I don't think that Euphoria would gain a lot of power with this. 
I don't consider this issue critical from the leavers' point of view.

Talking about the "string" type, please make no mistake. I do not 
propose its implementation. If I need it I just define a new "type"
or "function" to help me control the sequences in my programme.

This also answers your question: yes, I am. That's what a sequence 
means for me. And whenever I want to check if {{1,2}} is a 
"sequence with a list of integers inside" then I do not check it 
with the "sequence()" function. It will obviously pass that type 
check. Rather, I use my own
"is_it_a_sequence_with_a_list_of_integers_inside()" function. :-)

I understand that adding a whole list of functions like that to 
the core language might mean a bit of speed. But regularly checking
various objects is not typical in my opinion. (Surely not in my case.) 
And please note that introducing the change you are discussing 
affects heavily the syntax and readability of Euphoria. 

Regards,

Salix

48. Re: Homogeneous sequence

Salix wrote:
> 
> > And what do you think about Pete's proposal?
> 
> I'm strongly against introducing these new definitions in general. (And not
> only Pete's suggestion.)
> 
That's OK, but let me try rephrasing the question one last time.
sequence change_forced_on_salix
change_forced_on_salix="zero"

Suppose there is a new string.e standard include (see Rob's post), then
include string.e
string question, clause, person
clause = "Would this help "
person = "Salix"
question = append(clause,person)

line 5:
type check failure, question is {W,o,u,l,d, ,t,h,i,s, ,h,e,l,p, ,"Salix"}

Immediately you know that & should have been used instead of append.
Caused a few headaches with path & filename in edita.edb, and not much fun to
clean up the mess such a bug makes, I can tell you.
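
A minimal sketch of the difference, runnable in current Euphoria. The hand-written "string" type below is only a stand-in for the proposed string.e (which is hypothetical here); it accepts a flat sequence of integers and nothing else:

```euphoria
-- Stand-in for the proposed string type: a flat sequence of integers.
type string(object s)
    if not sequence(s) then
        return 0
    end if
    for i = 1 to length(s) do
        if not integer(s[i]) then
            return 0
        end if
    end for
    return 1
end type

sequence clause, person
clause = "Would this help "
person = "Salix"

-- & concatenates the two strings element by element
? length(clause & person)           -- prints 21
? string(clause & person)           -- prints 1

-- append() adds person as ONE nested element at the end
? length(append(clause, person))    -- prints 17
? string(append(clause, person))    -- prints 0
```

Declaring "question" with such a type is what turns the silent nesting into an immediate type check failure at the offending assignment.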

Putting this aside for a moment, would you like to see a new builtin string type
in Eu?

Regards,
Pete

49. Re: Homogeneous sequence

Pete Lomax wrote:
> Putting this aside for a moment, would you like to see a new builtin string
> type in Eu?

Khm... I should admit that I skipped Rob's message on the 
"string(sequence of integer s)" solution. It makes most of 
my arguments on simple syntax of general coding not applicable. 

As you probably assumed I also have my debugging routines to avoid 
the famous "sequence found inside character string" error message 
and to keep on testing the data put into my sequences. But I can 
handle it with the currently received crash messages and later 
I delete the seq. check routines anyway. 

So I say rather "no". (But this time very quietly.) :-]

Regards,

Salix
(who will not drop Euphoria because of a homogeneous sequence)

50. Re: Homogeneous sequence

Salix wrote:
> 
> Pete Lomax wrote:
> > Putting this aside for a moment, would you like to see a new builtin string
> > type in Eu?
> 
> Khm... I should admit that I skipped Rob's message on the 
> "string(sequence of integer s)" solution. It makes most of 
> my arguments on simple syntax of general coding not applicable. 
> 
> As you probably assumed I also have my debugging routines to avoid 
> the famous "sequence found inside character string" error message 
> and to keep on testing the data put into my sequences. But I can 
> handle it with the currently received crash messages and later 
> I delete the seq. check routines anyway. 
> 
> So I say rather "no". (But this time very quietly.) :-]
> 
> Regards,
> 
> Salix
> (who will not drop Euphoria because of a homogeneous sequence)

lol
Me neither.
But homogeneous sequences may be one of the xx (two digit number) small reasons
that might decide me to do so some day because they will add up to too much.

CChris

51. Re: Homogeneous sequence

CChris wrote:
> 
> Pete Lomax wrote:
> > 
> > Derek Parnell wrote:
> > > 
> > > Try reading out source code over the telephone.
> > > 
> > <snip>
> > >
> > >   alias atom list text
> > >   alias text list file
> > >   file options
> > > 
> > LOL! That tickled me, thanks!
> > 
> > Obviously you're a fan of D unlike me and on closer reading it has merits
> > but
> > that specific syntax is not an Eu fit, imo.
> > 
> > Still giggling,
> > Pete
> 
> Even more concise and visually intuitive:
> {atom} s -- a sequence of atoms
> {{integer}} -- a sequence of sequences of integers
> {{}} -- a sequence of sequences
> 
> Since the interpreter expects type marks in very specific places, this
> construct
> is not ambiguous at all.
> And {atom}(s) could be parsed properly, including inside an if/while
> statement.
> 
> "object" in the middle would be optimised away easily, without any need to
> error
> out.
> 
> How about that?
> 
> CChris

"More visually intuitive"? That's a joke, right?

Anyway, blech! If I wanted to program in a language like that then I would. I'm
sure there are several of them out there.

I'm still with some of the naysayers on the whole idea -- I still don't see much
benefit to it. The only thing that I can think of is verifying whether a sequence
is single-level or multi-level with regards to passing strings.

--
A complex system that works is invariably found to have evolved from a simple
system that works.
--John Gall's 15th law of Systemantics.

"Premature optimization is the root of all evil in programming."
--C.A.R. Hoare

j.

52. Re: Homogeneous sequence

Pete Lomax wrote:
> 
> Jason Gade wrote:
> > 
> > Reading this thread with interest and yet with reservations also.
> > 
> > According to the proposal would I be able to do something like this:
> > sequence of integer x
> > sequence y
> > object z
> > 
> > -- some code assigning values to x, y, and z
> > 
> > x = y -- typecheck y to make sure it complies with the definition
> > x = z -- ditto
> > 
> > Or would it fail right away because the type definitions do not match? 
> 
> I am sure your fears are unfounded.
> 
> First, the typecheck both times is on x to ensure the new value fits.
>   (error: type check failure, x is -1)

So if y is a sequence of integers except for y[$] which is 1.1, at what point is
x checked for validity? Is it an O(N) operation or an O(1) operation?

Doesn't Euphoria usually just copy the pointer internally?

>   Type checking is always done after the effect, for such messages.
>   It is no different to y=z, failing (only) when z is not a sequence.
> 
> The objective is to perform less type checks, not more. y=x would perform no
> typecheck whatsoever, because the compiler has enough type info to know it
> will
> succeed, ditto z=y and z=x. More detailed type info === even less checks.

I agree with performing fewer typechecks. And with needing to perform fewer.

> 
> Actually this point is debatable: currently more detailed checks are so
> expensive
> no-one does them; if we slash the cost then arguably more people will use them,
> which is also an objective.
> 
> Type-wise, you should note that the compiler can seem thick as pigshit:
> integer i
>    i=3.5
>    i="fred"
> 
> compiles/binds/translates without hitch. It only goes wrong at run-time; it
> is only the backend that knows much about type errors. Obviously the front-end
> is actually pretty smart about emitting the right code for the right types,
> just not about predicting type errors. Reasonable, I guess.
> 
> HTH,
> Pete

IIRC, before Euphoria's front-end was re-written, Rob was pretty proud of the
speed of the interpreter's front end. Plus the fact that it would continue
converting code to IL and executing code at the same time.

Checking for certain obvious type errors in the front end makes a lot of sense, but
I've already seen some people complain about the slowness of the front end. (I'm
not one, BTW).

Hmm. This is a tangent, but it would be interesting for the front end to use the
tasking feature maybe to do some of this... Except the back end would have to be
aware of it also.

The Euphoria to C translator should definitely be doing those kinds of checks
though.

--
A complex system that works is invariably found to have evolved from a simple
system that works.
--John Gall's 15th law of Systemantics.

"Premature optimization is the root of all evil in programming."
--C.A.R. Hoare

j.

53. Re: Homogeneous sequence

Pete Lomax wrote:

> Juergen Luethje wrote:
> > the program will crash instead of returning FALSE, in case 'foo' is
> > not an integer. In order to avoid that, currently we have to define the
> > type somehow like this:
> > 
> > type char(object x)
> >    return integer(x) and x >= 0 and x <= 255 
> > end type
> > 
> > What would that problem mean regarding the proposed extension of the type
> > system? Or maybe this behaviour can be changed, when the type system is
> > revised? I'd appreciate that.
> Good point. My proposal is useless without that change, I forgot that.
> 
> I suspect this is a relatively easy change to try: Instead of emitting a
> TYPE_CHECK
> opcode at the top of a type definition, emit new opcodes which effect a return
> 0:
> Somewhere in parser.e you should find:
> 
>     if SymTab[p][S_TOKEN] = TYPE and param_num != 1 then
>         CompileErr("types must have exactly one parameter")
>     end if
> 
> which seems the point to set a flag (change "and" to "then flag=1 if"), then
> after
> 
>     -- code to perform type checks on all the parameters 
>     sym = SymTab[p][S_NEXT]
>     for i = 1 to SymTab[p][S_NUM_ARGS] do
>   TypeCheck(sym)
>   sym = SymTab[sym][S_NEXT]
>     end for
> 
> clear that flag. Within TypeCheck(), instead of
> 
>               emit_op(INTEGER_CHECK)
>               emit_op(SEQUENCE_CHECK)
>               emit_op(ATOM_CHECK)
>               emit_op(TYPE_CHECK)
> 
> when this flag is set emit some new opcodes which instead of (in execute.e):
> 
> procedure opINTEGER_CHECK()
>     a = Code[pc+1]
>     if not integer(val[a]) then
>         RTFatalType(pc+1)
>     end if
>     pc += 2
> end procedure
> 
> do a hacked version of opRETURNP/F() (instead of RTFatal) to return 0.
> 
> At that point, it got a bit too messy for a quick hack to carry on with right
> now, but if you get the gist
> give it a twirl. :-)
> 
> Obviously I am only suggesting an experiment on eu.ex/hll back-end, in pure
> Eu code, also obviously you cannot test the flag in opINTEGER_CHECK(), but
> need
> the new opcodes and new routines.

All this is a bit over my head.

> An interesting point would be whether you can find *ANY* working program that
> this change breaks, obviously I somewhat doubt it.
> 
> > And what do you think about Pete's proposal:
> And what do you think about Pete's proposal?

:-)
My feeling says that it's really a good idea in order to avoid the
clutter demonstrated by Salix. But I think I don't have sufficient
knowledge to make a reasonable judgement.

Regards,
   Juergen

54. Re: Homogeneous sequence

Salix wrote:

> > > And what do you think about Pete's proposal:
> > And what do you think about Pete's proposal?
> 
> I'm strongly against introducing these new definitions in general. (And not
> only Pete's suggestion.)
> 
> For me Euphoria's motto is "just say NO to complicated programming languages".
> This change doesn't solve any of my coding problems but adds complexity to
> the syntax.

Following Pete's suggestion, there will only be little additional
complexity in the syntax. Maybe it will not solve any of your coding
problems, but it will have some other advantages (see Rob's first post
in this thread for details).

> I feel that this proposal is clearly against the "Clean and simple syntax",
> the "Minimal and simple to use
> Data Types", and probably the "Maintainability" points of the (non-official)
> mission statement (that I keep
> relevant). (See
> http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=8&fromYear=B&toMonth=A&toYear=B&postedBy=ray+smith&keywords=Clean+and+simple+Syntax
> )

Firstly, this "mission statement" is not only non-official, but it's
very rather pretty much non-official. :-) It's exactly nothing but a
private post by a single Euphoria user.

Secondly it's an error to believe that things are simpler when we have
only very few datatypes. _There are_ different kinds of data in the
world, and the better an Euphoria program can reflect them, the cleaner
and safer it is.

> I think it is also against Rob's indirectly stated wish that Euphoria "will
> gain more users". (See
> http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=8&fromYear=B&toMonth=A&toYear=B&postedBy=craig&keywords=birds
> ). Imagine a newbie asking how to define a simple "string" in Euphoria, that
> is a widely known and commonly
> used variable type? 

type char (integer x)
   return x >= 32 and x <= 127       -- or something similar
end type
type string (sequence of char s)
   return TRUE
end type

Where is the problem?

> list integer gain_for_salix
> -- or list atom gain_for_salix
> -- or list object gain_for_salix
> -- or sequence gain_for_salix
> -- or object gain_for_salix
> gain_for_salix="zero"
> 
> 
> ... a nightmare.

That's why Pete made the proposal to allow something like that only in
the definition of types. So something like this will not be valid code.

Regards,
   Juergen

55. Re: Homogeneous sequence

Juergen Luethje wrote:
> 
> That's why Pete made the proposal to allow something like that only in
> the definition of types. So something like this will not be valid code.
> 
> Regards,
>    Juergen

Huh. Somehow I totally missed that.
http://www.openeuphoria.org/cgi-bin/esearch.exu?fromMonth=7&fromYear=C&toMonth=9&toYear=C&postedBy=Pete+Lomax&keywords=Homogeneous+sequence+*ONLY*

Thanks for pointing that out, Juergen. Pete -- that allays some of my concerns.

--
A complex system that works is invariably found to have evolved from a simple
system that works.
--John Gall's 15th law of Systemantics.

"Premature optimization is the root of all evil in programming."
--C.A.R. Hoare

j.

56. Re: Homogeneous sequence

Jason Gade wrote:
> 
> CChris wrote:
> > 
> > Pete Lomax wrote:
> > > 
> > > Derek Parnell wrote:
> > > > 
> > > > Try reading out source code over the telephone.
> > > > 
> > > <snip>
> > > >
> > > >   alias atom list text
> > > >   alias text list file
> > > >   file options
> > > > 
> > > LOL! That tickled me, thanks!
> > > 
> > > Obviously you're a fan of D unlike me and on closer reading it has merits
> > > but
> > > that specific syntax is not an Eu fit, imo.
> > > 
> > > Still giggling,
> > > Pete
> > 
> > Even more concise and visually intuitive:
> > {atom} s -- a sequence of atoms
> > {{integer}} -- a sequence of sequences of integers
> > {{}} -- a sequence of sequences
> > 
> > Since the interpreter expects type marks in very specific places, this
> > construct
> > is not ambiguous at all.
> > And {atom}(s) could be parsed properly, including inside an if/while
> > statement.
> > 
> > "object" in the middle would be optimised away easily, without any need to
> > error
> > out.
> > 
> > How about that?
> > 
> > CChris
> 
> "More visually intuitive"? That's a joke, right?
> 

It is NOT a joke. Euphoria's verbosity is already pretty high, and code rapidly
becomes hard to read with all that wordy stuff and long meaningful
identifiers. Only proper spacing makes it tolerable, but then you quickly bump
into line length issues, since the display is not extensible. The idea is to
fight visual clutter, the average level of which is dangerously high.

Since we use {...} to denote a sequence of what's between the braces, it is
pretty intuitive to put a type mark there instead of actual values to mean that
the sequence is made of elements of such a type.

But perhaps you don't like indicating sequence contents with braces?

> Anyway, blech! If I wanted to program in a language like that then I would.
> I'm sure there are several of them out there.
> 
> I'm still with some of the naysayers on the whole idea -- I still don't see
> much benefit to it. The only thing that I can think of is verifying whether
> a sequence is single-level or multi-level with regards to passing strings.
> 

This is a frequent case, but by no means the only one:
Is a string made of printable characters only?
Are all the characters in some Unicode range, else change font?
Are all records of the list the same length? (Juergen's initial concern)
Do all records satisfy some consistency check, like
s[i][DATE_START] < s[i][DATE_END] for all i?

And countless other cases, which could be coded much more neatly than they
are now. Of course, direct type checks are costly, but flagging homogeneous
sequences would speed up a lot of things by allowing incremental type checking,
as emphasized earlier in the thread.
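
For comparison, here is a minimal sketch of how two of the checks listed above can be written today with ordinary user-defined types. The names printable_string and same_length_records are made up for illustration, "printable" is assumed to mean ASCII 32 to 126, and an empty list is assumed to count as consistent.

```euphoria
-- Sketch only: hypothetical names, plain Euphoria types with explicit loops.

type printable_string(object s)
    if not sequence(s) then
        return 0
    end if
    for i = 1 to length(s) do
        -- each element must be an integer before it can be range-checked
        if not integer(s[i]) then
            return 0
        end if
        if s[i] < 32 or s[i] > 126 then   -- assumed printable ASCII range
            return 0
        end if
    end for
    return 1
end type

type same_length_records(object s)
    integer len
    if not sequence(s) then
        return 0
    end if
    if length(s) = 0 then
        return 1          -- assumption: an empty list is consistent
    end if
    if atom(s[1]) then
        return 0          -- a record must itself be a sequence
    end if
    len = length(s[1])
    for i = 2 to length(s) do
        if atom(s[i]) then
            return 0
        end if
        if length(s[i]) != len then
            return 0
        end if
    end for
    return 1
end type
```

Both routines walk the whole sequence every time they are called, which is the cost that the incremental checking discussed in this thread is meant to avoid.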

CChris
> --
> A complex system that works is invariably found to have evolved from a simple
> system that works.
> --John Gall's 15th law of Systemantics.
> 
> "Premature optimization is the root of all evil in programming."
> --C.A.R. Hoare
> 
> j.


57. Re: Homogeneous sequence

CChris wrote:
> It is NOT a joke. Euphoria's verbosity is already pretty high, and code
> rapidly becomes hard to read with all that wordy stuff and long, meaningful
> identifiers. Only proper spacing makes it tolerable, but then you quickly run
> into line length issues, since the display is not extensible. The idea is to
> fight visual clutter, the average level of which is dangerously high.

Obviously we disagree on that as well. I think that Euphoria's verbosity is
"just right". Neither too long nor too short. I find it to be very readable.

> 
> Since we use {...} to denote a sequence of what's between the braces, it is
> pretty intuitive to put a type mark there instead of actual values to mean
> that the sequence is made of elements of such a type.
> 
> But perhaps you don't like indicating sequence contents with braces?

Yeah, on the RHS. Not the LHS. I find your example above practically unreadable.

> 
> This is a frequent case, but by no means the only one:
> Is a string made of printable characters only?
> Are all the characters in some Unicode range, else change font?
> Are all records of the list the same length? (Juergen's initial concern)
> Do all records satisfy some consistency check, like
> s[i][DATE_START] < s[i][DATE_END] for all i?
> 
> And countless other cases, which could be coded much more neatly than they
> are now. Of course, direct type checks are costly, but flagging homogeneous
> sequences would speed up a lot of things by allowing incremental type
> checking, as emphasized earlier in the thread.
> 
> CChris

Then write a function or a type to do it! It's not that hard. I think you focus
too much on the performance issues. Are we all still running at 33MHz or what?

Anyway, as I said elsewhere in the thread, I liked Pete's idea of limiting it to
new type definitions. That may also increase the usage of a well-designed but
under-used language feature.

--
A complex system that works is invariably found to have evolved from a simple
system that works.
--John Gall's 15th law of Systemantics.

"Premature optimization is the root of all evil in programming."
--C.A.R. Hoare

j.


58. Re: Homogeneous sequence

Jason Gade wrote:
> 
> Pete Lomax wrote:
> > 
> > Jason Gade wrote:
> > > 
> > > <eucode>
> > > sequence of integer x
> > > sequence y
> > > 
> > > x = y
> > > </eucode>
> > > 
> > the typecheck is on x to ensure the new value fits.
> 
> So if y is a sequence of integers except for y[$] which is 1.1, at what point
> is x checked for validity? Is it an O(N) operation or an O(1) operation?
After assignment. It will be an O(N) operation; remember, this is a replacement
for a type with a for loop in it, currently the only way to perform this amount
of detailed type-checking.
> 
> Doesn't Euphoria usually just copy the pointer internally?
Yes, and it still will for the builtin types, or if the types of x and y are the
same. In addition, the typecheck after x[5]=i, which is an O(N) operation at
the moment (for type-checked variables), would reduce to O(1) if i is an atom,
or to no check at all if i is an integer.
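
To make the comparison concrete, here is a minimal sketch of the kind of "type with a for loop in it" that is meant. The name integer_list is made up, and the proposed "sequence of integer" syntax is deliberately not used, since it does not exist yet.

```euphoria
-- Sketch only: the loop-based type that "sequence of integer" would replace.
type integer_list(sequence s)
    for i = 1 to length(s) do
        if not integer(s[i]) then
            return 0
        end if
    end for
    return 1
end type

integer_list x
sequence y

y = {1, 2, 3}
x = y       -- the type routine runs after the assignment and scans all of x
x[2] = 7    -- with type_check on, the whole of x is scanned again
```

Under the proposal, only the new element would need checking in the last statement, instead of the full loop running again.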

No-one is going to be forced to use this if they do not want to, and the impact
on them should be so minuscule it cannot be measured.

> > just not about predicting type errors. Reasonable, I guess.
> Rob was pretty proud of the speed of the interpreter's front end.
Correct, that is why I accept this as reasonable.

Regards,
Pete


59. Re: Homogeneous sequence

Juergen Luethje wrote:
> Robert Craig wrote:
> <snip>
> 
> > If you wanted tighter checking for strings you could write:
> > 
> > type char(integer x)
> >     return x >= 0 and x <= 255 
> > end type
> > 
> > type string(sequence of char s)
> >     return TRUE
> > end type
> 
> </snip>
> 
> Currently, there is a problem with code like that. When we e.g. write
> <eucode>
> ....
> if char(foo) then
> ...
> </eucode>
> then the program will crash instead of returning FALSE, in case 'foo' is
> not an integer. In order to avoid that, currently we have to define the
> type somehow like this:
> <eucode>
> type char(object x)
>    return integer(x) and x >= 0 and x <= 255 
> end type
> </eucode>
> What would that problem mean regarding the proposed extension of the type
> system? Or maybe this behaviour can be changed, when the type system is
> revised? I'd appreciate that.

As you suggest, perhaps when an argument with the incorrect type
is passed to a type routine, the program should not
die on the spot, but rather the type routine should simply return 0,
indicating that the value does not belong to that type.

This will allow the type routine to be used as a function,
just to check yes or no, (1 or 0) whether a value belongs
to that type. If the type routine is being called automatically
by the interpreter at an assignment statement, or when arguments are
passed to a normal *function*, then you will still get a 
type_check failure message (assuming "with type_check" is on),
but the error traceback will start at the assignment statement,
or function declaration line, not at the declaration line of the 
type routine, as happens now. I think this would
be a somewhat better diagnostic message than what we get now. 

In Euphoria, a type routine is declared differently from a 
normal function, so I guess we could be excused in
having slightly different rules in one versus the other.
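
Here is a minimal sketch of the yes/no use described above, written for the current rules, where the parameter has to be declared as object. One assumption worth stating: as far as I know, "and" is short-circuited only inside if/elsif/while conditions, so the integer test sits in its own if statement rather than in the return expression. count_chars is a made-up helper, not an existing routine.

```euphoria
-- Sketch only: a char type that can be called on any object today.
type char(object x)
    if not integer(x) then
        return 0      -- fractional atoms and sequences are rejected here
    end if
    return x >= 0 and x <= 255
end type

-- Hypothetical helper: counts the top-level elements of s that are chars,
-- using the type routine purely as a yes/no test.
function count_chars(sequence s)
    integer n
    n = 0
    for i = 1 to length(s) do
        if char(s[i]) then
            n = n + 1
        end if
    end for
    return n
end function

? count_chars({65, "AB", 3.5, 300, 0})   -- prints 2
```

If type routines returned 0 for wrongly typed arguments, as suggested above, the parameter could be declared as integer and the explicit integer(x) test would no longer be needed.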

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com
