1. Small feature request for future EU versions

It would be nice to have the '==' (equal comparison) relational
operator implemented in future versions of EU. Many programming
languages have this operator for equal comparison, while EU
uses equal(). I doubt anyone will confuse '==' with the '='
assignment operator. And if this is a compatibility issue,
keep the equal() routine until people agree that it is no longer
necessary for backwards compatibility. What do you all think about
that?

2. Re: Small feature request for future EU versions

> posted by: Vincent <darkvincentdude at yahoo.com>
>
> It would be nice to have the '==' (equal comparison) relational
> operator implemented in future versions of EU. Many programming
> languages have this operator for equal comparison, while EU
> uses equal(). I doubt anyone will confuse '==' with the '='
> assignment operator. And if this is a compatibility issue,
> keep the equal() routine until people agree that it is no longer
> necessary for backwards compatibility. What do you all think about
> that?

Generally I think using `=' for assignment and `==' for comparison sucks! In 
mathematics `=' has been used for ages as a relation symbol. It's the 
assignment statement that needs (may need) special treatment. The natural 
symbol for assignment is a "left arrow", but unfortunately it's not 
available in ASCII and e.g. `x <- y' can easily be mixed up with `x < -y'.

-- August

3. Re: Small feature request for future EU versions

August wrote:

>> posted by: Vincent <darkvincentdude at yahoo.com>
>>
>> It would be nice to have the '==' (equal comparison) relational
>> operator implemented in future versions of EU. Many programming
>> languages have this operator for equal comparison, while EU
>> uses equal(). I doubt anyone will confuse '==' with the '='
>> assignment operator. And if this is a compatibility issue,
>> keep the equal() routine until people agree that it is no longer
>> necessary for backwards compatibility. What do you all think about
>> that?
>
> Generally I think using `=' for assignment and `==' for comparison sucks! In
> mathematics `=' has been used for ages as a relation symbol. It's the
> assignment statement that needs (may need) special treatment. The natural
> symbol for assignment is a "left arrow", but unfortunately it's not
> available in ASCII and e.g. `x <- y' can easily be mixed up with `x < -y'.

Assignment could be written as e.g.  x := y

Regards,
   Juergen

4. Re: Small feature request for future EU versions

August wrote:
> > posted by: Vincent <darkvincentdude at yahoo.com>
> > It would be nice to have the '==' (equal comparison) relational
> > operator implemented in future versions of EU.
> Generally I think using `=' for assignment and `==' for comparison sucks! In 

I think using

    if mySeq==yourSeq then...

is a step in the right direction from

    if equal(mySeq,yourSeq) then...

That's 22 characters down from 28 characters, a 21% reduction in typing!
Not too shabby. :)

-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/

5. Re: Small feature request for future EU versions

> Assignment could be written as e.g.  x := y

Yes, ever since Algol that has been the most common alternative to `='.

6. Re: Small feature request for future EU versions

Vincent wrote:
> 
> It would be nice to have the '==' (equal comparison) relational
> operator implemented in future versions of EU. Many programming
Unfortunately this, while it would certainly be helpful, will probably never
be added...

> languages have this operator for equal comparison, while EU
> uses equal(). I doubt anyone will confuse '==' with the '='
> assignment operator. And if this is a compatibility issue,
> keep the equal() routine until people agree that it is no longer
If this is ever added, equal() should remain. '==' would simply
be short for equal(). Less typing good!

> necessary for backwards compatibility. What do you all think about
> that?
>

7. Re: Small feature request for future EU versions

CoJaBo wrote:
> 
> Vincent wrote:
> > 
> > It would be nice to have the '==' (equal comparison) relational
> > operator implemented in future versions of EU. Many programming
> Unfortunately this, while it would certainly be helpful, will probably never
> be added...
> 
> > languages have this operator for equal comparison, while EU
> > uses equal(). I doubt anyone will confuse '==' with the '='
> > assignment operator. And if this is a compatibility issue,
> > keep the equal() routine until people agree that it is no longer
> If this is ever added, equal() should remain. '==' would simply
> be short for equal(). Less typing good!
> 
> > necessary for backwards compatibility. What do you all think about
> > that?
> >
> 

Yea, but Rob is introducing the $ feature which happens to be a short
method of typing length.

x[$] is the same as x[length(x)]


And I don't think it would be any hassle for Rob to just implement
the '==' operator as an alternative for equal() for 5 reasons:

#1 The '==' operator is used for boolean comparing of objects in many
   languages.

#2 Using equal() makes the code more unreadable.
   (not really but still)

#3 It takes less typing and is more readable.

#4 I don't think Robert Craig would have much difficulty, if any,
   implementing the '==' relational operator in the core language
   definition.

#5 I think it's cool :P

8. Re: Small feature request for future EU versions

Derek Parnell wrote:

> Chris Bensler wrote:
>
> [snip]
>
>>
>> I think I would prefer an operator like ==.
>
> And I would like '=' to only be an equality test, ':=' to only be an
> assignment action and 'element_eq()' to be the sequence operation.

Me too.

> If we have '==' we should logically also have '<<', '>>', '!==', '<<='
> and '>>='.

Oh no, please.

>> I think that is something that can be done in a preprocessor though.
>> A lot of the changes that people want can be done in a preprocessor, and
>> once 2.5 arrives, we will have the ability to edit the front end, I
>> believe, which is even better than a preprocessor.
>
> Yes, this is going to be a survival-of-the-fittest game, with lots
> of variants vying for attention.

Yep. :)

Regards,
   Juergen

9. Re: Small feature request for future EU versions

Juergen Luethje wrote:
> 
> Derek Parnell wrote:
> 
> Chris Bensler wrote:
> >> [snip]

> >> I think that is something that can be done in a preprocessor though.
> >> A lot of the changes that people want can be done in a preprocessor, and
> >> once 2.5 arrives, we will have the ability to edit the front end, I
> >> believe, which is even better than a preprocessor.
> >
> > Yes, this is going to be a survival-of-the-fittest game, with lots
> > of variants vying for attention.
> 
> Yep. :)
> 
> Regards,
>    Juergen
> 
> 

survival-of-the-fittest game! 
oh no. :(

please don't go off in different directions like pieces of a bomb.
communicate with each other, all those who want to work on the project.
get the same syntax down (names of functions, procedures and arguments).
then each can try to make the inside of those routines better.

or like an ide does, make a front end better using different methods (maybe even
a pre-processor) but stick to the same insides (syntax used for the routines).

just my thoughts.

rudy

10. Re: Small feature request for future EU versions

rudy toews wrote:

> Juergen Luethje wrote:
>>
>> Derek Parnell wrote:
>>
>> Chris Bensler wrote:
>>>> [snip]
>
>>>> I think that is something that can be done in a preprocessor though.
>>>> A lot of the changes that people want can be done in a preprocessor, and
>>>> once 2.5 arrives, we will have the ability to edit the front end, I
>>>> believe, which is even better than a preprocessor.
>>>
>>> Yes, this is going to be a survival-of-the-fittest game, with lots
>>> of variants vying for attention.
>>
>> Yep. :)
>>
>
> survival-of-the-fittest game!
> oh no. :(
>
> please don't go off in different directions like pieces of a bomb.
> communicate with each other, all those who want to work on the project.
> get the same syntax down (names of functions, procedures and arguments).

I agree that this is desirable. Experience from the past as well as this
little discussion show that different people often have different
preferences, though.

> then each can try to make the inside of those routines better.
>
> or like an ide does, make a front end better using different methods
> (maybe even a pre-processor) but stick to the same insides (syntax used
> for the routines).
>
> just my thoughts.

Regards,
   Juergen

11. Re: Small feature request for future EU versions

On Sun, 17 Oct 2004 00:24:23 -0700, rudy toews
<guest at RapidEuphoria.com> wrote:

>> Derek Parnell wrote:
>> > Yes, this is going to be a survival-of-the-fittest game, with lots
>> > of variants vying for attention.
>please don't go off in different directions like pieces of a bomb.
>communicate with each other
FWIW, new operators are not actually needed, it is quite easy to
greatly improve the language using the current ones.

Taking the if <expr> then example, the interpreter should realise that
<expr> must deliver a boolean result (or crash), in other words map

e1<e2   to  compare(e1,e2)=-1
e1<=e2  to  compare(e1,e2)!=1
e1=e2   to  equal(e1,e2)
e1!=e2  to  not equal(e1,e2)
e1>=e2  to  compare(e1,e2)!=-1
e1>e2   to  compare(e1,e2)=1

Of course, it does not need to do this if e1 and e2 are both atoms.
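
A rough sketch of the mapping above, modeled in Python so that it can be run; it is not how any Euphoria interpreter is actually implemented. Lists stand in for sequences and numbers for atoms. The names `BOOLEAN_CONTEXT` and `condition` are made up for illustration, and `compare`/`equal` here only imitate the documented behaviour of the Euphoria built-ins (atoms sort before sequences, sequences are compared element by element).

```python
def is_atom(x):
    return not isinstance(x, list)

def compare(a, b):
    """Imitation of Euphoria's compare(): returns -1, 0 or 1."""
    if is_atom(a) and is_atom(b):
        return (a > b) - (a < b)
    if is_atom(a):
        return -1              # an atom is less than any sequence
    if is_atom(b):
        return 1
    for x, y in zip(a, b):     # first differing element decides
        c = compare(x, y)
        if c != 0:
            return c
    return (len(a) > len(b)) - (len(a) < len(b))

def equal(a, b):
    return compare(a, b) == 0

# Hypothetical lowering table: what each relational operator would turn
# into when the surrounding statement needs a single true/false value.
BOOLEAN_CONTEXT = {
    "<":  lambda a, b: compare(a, b) == -1,
    "<=": lambda a, b: compare(a, b) != 1,
    "=":  lambda a, b: equal(a, b),
    "!=": lambda a, b: not equal(a, b),
    ">=": lambda a, b: compare(a, b) != -1,
    ">":  lambda a, b: compare(a, b) == 1,
}

def condition(op, a, b):
    """Evaluate 'a op b' the way an if-statement would under this scheme."""
    return BOOLEAN_CONTEXT[op](a, b)

name = list(b"pete")           # a "string" as a sequence of character codes
print(condition("=", name, list(b"pete")))    # True
print(condition("<", name, list(b"peter")))   # True
```

With both operands atoms every entry reduces to the ordinary numeric test, which is why the mapping is only needed when a sequence may be involved.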

Taking this a step further "if <e1> and <e2> then" must also deliver a
boolean result, which means the "need" for a boolean result must be
propagated into e1 and e2. The same is true for all the other
operators (not, unary minus, +, -, *, /, or, xor), and the same logic
applies to while, for, subscript, and slice expressions, but not to
assignments, constants, or parameters. If this all sounds horribly
complicated, don't worry, in practice it's not. I already have this
working in Posetf ;-)

Regards,
Pete

12. Re: Small feature request for future EU versions

Vincent wrote:
> 
> It would be nice to have the '==' (equal comparison) relational
> operator implemented in future versions of EU. Many programming
> languages have this operator for equal comparison, while EU
> uses equal(). I doubt anyone will confuse '==' with the '='
> assignment operator. And if this is a compatibility issue,
> keep the equal() routine until people agree that it is no longer
> necessary for backwards compatibility. What do you all think about
> that?

For that matter, what's wrong with just having '=' for both 
string and numeric comparisons? Several other languages do that 
without any problem.

Irv

13. Re: Small feature request for future EU versions

On Sun, 17 Oct 2004 04:46:14 -0700, irv mullins <guest at rapideuphoria.com>
wrote:
> string and numeric comparisons? Several other languages do that
> without any problem.
> 
> Irv

The problem is this:

constant
    atom1 = 5,
    atom2 = 10,
    seq1 = {1,5,2,4},
    seq2 = {1,2,5,4}

constant cond1 = ( atom1 = atom2 )
constant cond2 = ( seq1 = seq2 )

-- cond1 will be 0, as 5 does not equal 10.
-- cond2 will be {1,0,0,1}, because Euphoria compares each element.
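
For anyone who wants to play with the difference, here is a small model of the two behaviours in Python (lists standing in for sequences). It is only a sketch: `elementwise_eq` is a made-up name for what the '=' operator does, and `equal` merely imitates the Euphoria built-in.

```python
def elementwise_eq(a, b):
    """What '=' does: a 0/1 flag for atoms, a sequence of flags for sequences."""
    a_seq, b_seq = isinstance(a, list), isinstance(b, list)
    if not a_seq and not b_seq:
        return 1 if a == b else 0
    if a_seq and b_seq:
        if len(a) != len(b):
            # Euphoria stops with a run-time error in this case
            raise ValueError("sequence lengths are not the same")
        return [elementwise_eq(x, y) for x, y in zip(a, b)]
    if a_seq:
        return [elementwise_eq(x, b) for x in a]
    return [elementwise_eq(a, y) for y in b]

def equal(a, b):
    """What equal() does: one 0/1 answer for the whole object."""
    return 1 if a == b else 0

atom1, atom2 = 5, 10
seq1, seq2 = [1, 5, 2, 4], [1, 2, 5, 4]

print(elementwise_eq(atom1, atom2))  # 0
print(elementwise_eq(seq1, seq2))    # [1, 0, 0, 1]
print(equal(seq1, seq2))             # 0
```

Only the last form gives something an 'if' can use directly when sequences are involved.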

The problem is that the IF statement doesn't know what to do with a
sequence, and there's not an obvious solution.
As a quick hack, let's say that if the IF statement receives a
sequence, it treats it as true if every element is non-zero.

That would allow us to easily compare strings using the simple form:
if string1 = string2.
However, what if the sequence passed to the IF statement is empty?
What if the sequence contains multiple levels?
What if the sequence contains a mix of non-zero integers, and an empty sequence?

I think that these things aren't easily solved...
Maybe it should be extended partially.
1. Only atoms, and 1-dimensional sequences can be passed to the IF statement
2. If an atom, pass if non-zero.
3. If a sequence, pass if all elements are non-zero.
4. If an empty sequence is passed to the if statement, treat it as a
'zero', I suppose.
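
The four rules can be written down as a short Python model (lists standing in for sequences). This is only a sketch of the proposal, not of anything Euphoria does today, and `if_passes` is a made-up name.

```python
def if_passes(value):
    """Would a hypothetical 'if value then' take the branch under rules 1-4?"""
    if not isinstance(value, list):
        return value != 0                      # rule 2: atom, non-zero passes
    if any(isinstance(e, list) for e in value):
        # rule 1: only atoms and 1-dimensional sequences are accepted
        raise TypeError("nested sequence passed to if")
    if not value:
        return False                           # rule 4: empty sequence is 'zero'
    return all(e != 0 for e in value)          # rule 3: every element non-zero

print(if_passes([1, 0, 0, 1]))  # False: the strings differ somewhere
print(if_passes([1, 1, 1, 1]))  # True: every element matched
print(if_passes([]))            # False
```

Rule 4 is the debatable one: two empty strings compared with '=' give an empty sequence, so under this rule they would count as not equal.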

What do you think of this solution?
I really don't like the idea of adding extra relational operators.
:=, =>, ==, etc. are really annoying to remember. Trust me, I've
written too much VHDL code...

At least with the above suggestion, there's no broken compatibility, a
common issue (Why can't I just use '=' to compare these strings?) is
fixed, and it's a logical solution.
-- 
MrTrick

14. Re: Small feature request for future EU versions

On Mon, 18 Oct 2004 09:52:19 +1000, Patrick Barnes <mrtrick at gmail.com>
wrote:

>The problem is that the IF statement doesn't know what to do with a
>sequence, and there's not an obvious solution.
Actually there is: the interpreter can easily determine when it needs
a boolean result, eg in an if statement, so it simply maps relational
operators to equal()/compare().

>Why can't I just use '=' to compare these strings?
Trust me, this can be done for any reasonable expression, eg

	if name="pete" then

or

	if forename="pete" and surname="lomax" then

but not

	if {1,2,3}+{4,5,6} then

I'll give it a go if/when I get 2.5

Regards,
Pete

15. Re: Small feature request for future EU versions

Patrick Barnes wrote:
> 
> On Sun, 17 Oct 2004 04:46:14 -0700, irv mullins <guest at rapideuphoria.com>
> wrote:
> > string and numeric comparisons? Several other languages do that
> > without any problem.
> > 
> > Irv
> 
> The problem is this:
> 
> constant 
> atom1=5,
> atom2=10,
> seq1={1,5,2,4},
> seq2={1,2,5,4}
> 
> constant cond1 = ( atom1 = atom2 )
> constant cond2 = ( seq1 = seq2 )
> 
> --cond1 will be 0, as 5 does not equal 10.
> --cond2 will be {1,0,0,1}, because Euphoria compares each element.

By any normal meaning of the word 'equal' as used by scientists, mathematicians,
programmers, and the corner grocer, the answer can not possibly 
be {1,0,0,1}. It is either TRUE or FALSE. If the lengths of the two 
sequences are different, then they are not equal, and the result should 
be FALSE - not an error. 

If, for some strange reason, someone wanted an item-by-item comparison 
between two sequences, then a new and more meaningful name should be chosen
for a function which returns {1,0,0,1}. 

> The problem is that the IF statement doesn't know what to do with a
> sequence, and there's not an obvious solution.
> As a quick hack, let's say that if the IF statement receives a
> sequence, it treats it as true if every element is non-zero.
> 
> That would allow us to easily compare strings using the simple form:
> if string1 = string2.
> However, what if the sequence passed to the IF statement is empty?
> What if the sequence contains multiple levels?
> What if the sequence contains a mix of non-zero integers, and an empty
> sequence?

You're thinking within the box built by RDS. 
No matter whether the sequences are empty, or contain multiple levels, 
if the two are identical, then they are equal, otherwise they aren't.
That is the obvious definition of equal which anyone can understand. 
 
> I think that these things aren't easily solved...

The whole thing could have been avoided if Rob had used the = operator 
to return equality and another operator or function to return a comparison.

....

> At least with the above suggestion, there's no broken compatibility, a
> common issue (Why can't I just use '=' to compare these strings?) is
> fixed, and it's a logical solution.

I'll bet there aren't a dozen uses of = in the existing code base. I've 
used it exactly once, and I really don't mind changing that instance to 
some new function. 

Irv

16. Re: Small feature request for future EU versions

Ricardo Forno wrote:

<snip>

> Moreover, please remember (or consider) that one of the most common pitfalls
> in the C language is to use = instead of ==, and in Pascal and other ones,
> to use = instead of :=.

<snip>

That really surprises me. I was thinking that different symbols for
different operations (comparison vs. assignment) would lead to clearer
code and fewer pitfalls -- compared to Euphoria's '=', the meaning of
which depends on the context.

Regards,
   Juergen

17. Re: Small feature request for future EU versions

Juergen Luethje wrote:
> That really surprises me. I was thinking that different symbols for
> different operations (comparison vs. assignment) would lead to clearer
> code and fewer pitfalls -- compared to Euphoria's '=', the meaning of
> which depends on the context.

Consider the following C code:

    if (A = B) {
        printf("A is equal to B");
    }

Every C programmer will eventually have a 3 hour debugging
session (and probably on several different occasions), where he 
finally realizes that the above code is actually doing:

    A = B;
    if (A != 0) {
        printf("A is equal to B");
    }

Not only is the if-statement "wrong", but A is overwritten by B! 

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com

18. Re: Small feature request for future EU versions

Robert Craig wrote:

> Juergen Luethje wrote:
>> That really surprises me. I was thinking that different symbols for
>> different operations (comparison vs. assignment) would lead to clearer
>> code and fewer pitfalls -- compared to Euphoria's '=', the meaning of
>> which depends on the context.
>
> Consider the following C code:
>
>     if (A = B) {
>         printf("A is equal to B");
>     }
>
> Every C programmer will eventually have a 3 hour debugging
> session (and probably on several different occasions), where he
> finally realizes that the above code is actually doing:
>
>     A = B;
>     if (A != 0) {
>         printf("A is equal to B");
>     }
>
> Not only is the if-statement "wrong", but A is overwritten by B!

Urgs, that is ugly. I see now. Thanks for the explanation!

Regards,
   Juergen

19. Re: Small feature request for future EU versions

Derek Parnell wrote:

> If RDS were to change Euphoria to behave as most people expect it to,
> there would be some existing programs that would fail. But I suspect
> that there would be very, very few of those. The pain is worth the gain.

This is indeed a rehash of an old complaint. My recollection (although quite
possibly wrong) of Robert's response was:

1. His vision for Euphoria is a language where operators are simple and
consistent. Having '=' apply to each item in a sequence is simple and
consistent.

2. Part of Euphoria's value is backwards compatibility. Changing how '=' works 
would break code in the library - something he's typically unwilling to do 
unless there is a major gain for the language.

Given this, I'd be *very* surprised to find Robert's changed his position.

I suspect that one's personal expectation of the '=' operator depends on what 
language you come from. If you come from a language such as BASIC, you 
probably prefer a single true/false value from the comparison operator. 
People coming from C would consider Euphoria's behavior to be normal - this 
extends to other C/C++ derived languages, such as Java.

If you want my personal opinion, you can search the archives. ;-)

-- David Cuny

20. Re: Small feature request for future EU versions

----- Original Message ----- 
From: "Derek Parnell"
Sent: Sunday, October 17, 2004 9:36 PM
Subject: RE: Small feature request for future EU versions


> 

[snip]

> It seems that you would like equality (and relational comparisons)
> implemented as built-in functions (eg. equal(), compare() ) and
> sequence operations performed by operators ('=', '<', etc...)
> 
> Whereas I'd prefer the reverse situation. I'd like relational
> comparisons to use operators and sequence operations to use built-in
> functions.
> 
> I'd prefer that ...
> 
>   cond1 = (seq1 = seq2) 
> 
> to be interpreted as ...
> 
>   if the contents of seq1 and the contents of seq2 are identical then
>   assign 'true' to cond1 otherwise assign 'false' to cond1.
> 

I agree with you here.
if (identical) then
  TRUE
else
  FALSE
end if

> If I really wanted a sequence operation to be performed and its result
> to be assigned I'd rather write something like ...
> 
>   cond1 = seqop_eq(seq1, seq2)
> 
> -- 
> Derek Parnell
> Melbourne, Australia
>

21. Re: Small feature request for future EU versions

On Mon, 18 Oct 2004 21:18:27 -0400, Lucius Hilley
<l3euphoria at bellsouth.net> wrote:
> From: "Derek Parnell"
> Sent: Sunday, October 17, 2004 9:36 PM
> Subject: RE: Small feature request for future EU versions
> > It seems that you would like equality (and relational comparisons)
> > implemented as built-in functions (eg. equal(), compare() ) and
> > sequence operations performed by operators ('=', '<', etc...)
> >
> > Whereas I'd prefer the reverse situation. I'd like relational
> > comparisons to use operators and sequence operations to use built-in
> > functions.
> >
> > I'd prefer that ...
> >
> >   cond1 = (seq1 = seq2)
> >
> > to be interpreted as ...
> >
> >   if the contents of seq1 and the contents of seq2 are identical then
> >   assign 'true' to cond1 otherwise assign 'false' to cond1.
> >
> 
> I agree with you here.
> if (identical) then
>   TRUE
> else
>   FALSE
> end if

I agree, logically it should work like that.

I think that the behaviour of the 'if' construct should be changed, to
automatically use equal() to check if the phrase seq1 = seq2 is used
within an if statement.

But... ONLY the if statement.

-- 
MrTrick

22. Re: Small feature request for future EU versions

Hi Ricardo, you wrote:

> Hi, Juergen.
> Your thinking is logical, and I agree that using different operators for
> different purposes *should* be clearer.

This is the theory ...

> But, in my experience (and it seems
> some other people had the same kind of experience), == or := versus = leads
> to pitfalls.

... and that seems to be the practice. :)
I didn't know that because of lack of experience with C or Pascal.
However, the example that Rob posted in the meantime was impressive.

> I think this is due to the fact that all these operators contain the = sign.
> This problem rarely arose in APL, where the assignation symbol was a left
> arrow (such as <-, but a single character).
> Regards.

Interesting. Thank you!

Regards,
   Juergen

23. Re: Small feature request for future EU versions

Just in case this topic hasn't been discussed to death yet...

Sequence operators (= != >= <= > < + - * /) are confusing at first, I'll agree.

However, I do not support changing their current behaviour from
returning a sequence comprised of the individual elements, operated
on. Apart from the backwards compatibility...

I have written many pieces of code that take an 'object' argument,
act on it in some way, and return it... without ever testing whether
the object is a sequence or an atom!
This is a tremendous benefit for Euphoria.. Try to do the same thing in
C, and it's more than likely that you'll have to write an individual
function for every type of argument. And I don't mean just atomic or
array types... I mean char, short, int, uint, long, double, char[],
short[], int[], long[], double[], plus any other types that you are
interested in. And even then it can only handle 1-d arrays!

For example: (and I'm sure it's possible to do this more efficiently,
this is just off the top of my head)

function abs(object x)
     return x * (1 - 2*( x < 0 )  )
end function


Elegant, and simple. Works for ANY object X... 
I see this as a very important reason not to change the behaviour of
existing operators.
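
To see why the one-liner works for any object, here is the same expression traced in a small Python model, with lists standing in for sequences. It is a sketch only: `lift`, `mul`, `sub`, `less` and `eu_abs` are made-up names that imitate the way the Euphoria operators spread over sequence elements.

```python
def lift(fn):
    """Turn an atom-only function into one that works elementwise on nested lists."""
    def op(a, b):
        a_seq, b_seq = isinstance(a, list), isinstance(b, list)
        if a_seq and b_seq:
            if len(a) != len(b):
                raise ValueError("sequence lengths are not the same")
            return [op(x, y) for x, y in zip(a, b)]
        if a_seq:
            return [op(x, b) for x in a]
        if b_seq:
            return [op(a, y) for y in b]
        return fn(a, b)
    return op

mul = lift(lambda a, b: a * b)
sub = lift(lambda a, b: a - b)
less = lift(lambda a, b: 1 if a < b else 0)

def eu_abs(x):
    # Same shape as: x * (1 - 2*(x < 0))
    # (x < 0) is 1 where negative, so the factor is -1 there and +1 elsewhere.
    return mul(x, sub(1, mul(2, less(x, 0))))

print(eu_abs(-3))                 # 3
print(eu_abs([-1, 2, [-3, 0]]))   # [1, 2, [3, 0]]
```

The function never asks whether it was handed an atom or a sequence; the operators sort that out at every level of nesting.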


However, the problem still remains... I have object A here. and object
B there. I want to compare them, but inside of the language, not using
a function. That would suggest adding a new operator type, but what?

Given the many pitfalls of the '=' vs '==' in C, I would steer clear of
using '=='.

(Okay, here's the suggestion, pick it to pieces!)
How about something that is obviously intended for comparing sequences?
Applying braces around the operators that could be used for comparison
(not +, -, *, or /)

A {=} B 
A {!=} B 

replaces equal(A, B) and not equal(A, B)

A {>} B 
A {<} B
A {>=} B
A {<=} B

replaces compare(A,B) and its tests.... I cannot remember how compare
works, ever. Each time I want to use compare(), I need to look it up
to see the behaviour... These operators are logical, and more
importantly intuitive.

The benefits:

It is a long term solution that works in any situation.... no 'if'
specialisations are required. I can use these within an equation to
test equality, without causing problems with sequence operators.

Usage:

Although nominally for $sequence $test $sequence, it would use the
same mechanisms internally as compare() and equal().
That means it can be used to compare 
atom with sequence, 
sequence with atom, 
atom with atom, 
sequence with sequence. 
Obviously comparing an atom with a sequence will always return
false... That's ok, it can be used to compare sequence elements with
one another.

The only confusion I can even think of, is that someone will
consistently use {=} to compare two atoms, rather than =.... This is
not even a real abuse, and is in my mind unlikely.

I welcome comment on this little suggestion of mine.
-- 
MrTrick

24. Re: Small feature request for future EU versions

Patrick Barnes wrote:
> 
> Just in case this topic hasn't been discussed to death yet...

Impossible!  ;-)

> Sequence operators (= != >= <= > < + - * /) are confusing at first, I'll
> agree.
> 
> However, I do not support changing their current behaviour from
> returning a sequence comprised of the individual elements, operated
> on. Apart from the backwards compatibility...
>
> I have written many pieces of code that take an 'object' argument,
> act on it in some way, and return it... without ever testing whether
> the object is a sequence or an atom!
> This is a tremendous benefit for Euphoria.. Try to do the same thing in
> C, and it's more than likely that you'll have to write an individual
> function for every type of argument. And I don't mean just atomic or
> array types... I mean char, short, int, uint, long, double, char[],
> short[], int[], long[], double[], plus any other types that you are
> interested in. And even then it can only handle 1-d arrays!

I agree that this is one of Euphoria's overriding strengths. It costs a bit
of runtime performance (dynamic-typing versus static-typing) but in the
long run it is wonderful.

But why *must* this functionality be implemented using operators rather
than built-in functions? In essence, it is a syntax issue and not a
semantic issue.
 
> For example: (and I'm sure it's possible to do this more efficiently,
> this is just off the top of my head)
> 
> function abs(object x)
>      return x * (1 - 2*( x < 0 )  )
> end function


function abs(object x)
     return x * (1 - 2*( lessthan(x,0) )  )
end function


 
> Elegant, and simple. Works for ANY object X... 
> I see this as a very important reason not to change the behaviour of
> existing operators.

Have you an estimate for how frequently the operators have actually been
used in this manner? I think there are two instances in the RDS libraries,
the case conversion routines (BTW which only work on a limited set of
characters) and ... actually I can't find the other example just now.
  
My position is that the functionality should remain in Euphoria, but as it
is rarely used it should be implemented using (unambiguous) built-in
functions, and the much more common comparison functionality should be
implemented using operators.

In the end, both are translated to IL for execution, so it's just a matter
of which syntax to use to represent the functionality.


However, if you really insist on operators for the sequence operations, 
then as these are rarely used, a special syntax for those might be better.

function abs(object x)
     return x * (1 - 2*( x {<} 0 )  )
end function



-- 
Derek Parnell
Melbourne, Australia

25. Re: Small feature request for future EU versions

On Thu, 21 Oct 2004 20:27:48 -0700, Derek Parnell
<guest at rapideuphoria.com> wrote:
> But why *must* this functionality be implemented using operators rather
> than built-in functions? In essence, it is a syntax issue and not a
> semantic issue.

Well, using functions rather than operators it looks ungainly, and
requires more typing. In addition, they can be difficult to quickly
comprehend when you are reading through source:

if A {<=} B then
Vs:
if compare(A, B) <= 0 then

I don't even know if the two are functionally the same! I can't
remember whether compare is supposed to return a positive or a
negative value in which case!

> function abs(object x)
>      return x * (1 - 2*( lessthan(x,0) )  )
> end function


Yes, that could work... but there are more important considerations...
(see below)

> Have you an estimate for how frequently the operators have actually been
> used in this manner? I think there are two instances in the RDS libraries,
> the case conversion routines (BTW which only work on a limited set of
> characters) and ... actually I can't find the other example just now.

Don't forget that the RDS libraries are missing many basic functions
like the aforementioned abs... I'm reasonably sure this functionality
gets used more often in things like genfunc, etc...

My projects have used sequence operators whenever I see them as being
appropriate.

> My position is that the functionality should remain in Euphoria, but as it
> is rarely used it should be implemented using (unambiguous) built-in
> functions, and the much more common comparison functionality should be
> implemented using operators.

I think this would be a grave error. Initially, I thought "yeah,
that's a great idea, it makes more sense"... and it does. But... (see
below)

 
> However, if you really insist on operators for the sequence operations,
> then as these are rarely used, a special syntax for those might be better.
> 
> function abs(object x)
>      return x * (1 - 2*( x {<} 0 )  )
> end function


Actually, I initially thought that too. After all, it makes as much
sense as the solution I proposed in the earlier thread, if not
more....

PROBLEM! Backwards compatibility. Not that we haven't heard it shouted before...
It's worse than regular backwards compatibility problems though... 
Consider: Should it be changed, every program that uses sequence
operators will have a different behaviour.

eg: 
X = a[i] + (b[j][m] > c[p][2]) 

However, there will be no obvious error message. Rather than the line
above executing as expected, with a sequence of element-wise results
being added to a[i], a single atom (0 or 1) will be added instead. Who
knows what will happen?
Of course, nothing changes if b[j][m] and c[p][2] are atoms.
But are they? I have no idea, and neither will anyone searching
through source changing things over.
There's no way to detect whether this will affect code.
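
A small sketch of the behaviour in question under the current rules, with
made-up values: a relational operator applied to sequences works element
by element and yields a sequence of 0s and 1s, which then takes part in
further arithmetic. If the same operator were changed to return a single
truth value, the second line would quietly produce a different result
rather than an error.

<eucode>
sequence b, c
b = {1, 5, 3}
c = {2, 2, 2}
? b > c              -- prints {0,1,1}
? 10 + (b > c)       -- prints {10,11,11}
? compare(b, c) > 0  -- prints 0: taken as a whole, b sorts before c
</eucode>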


That's why I suggest the new {=} apply to sequence comparison. After
all, it *is* logical for given values of 'logical', and it won't break
any existing code.

-- 
MrTrick

26. Re: Small feature request for future EU versions

Derek Parnell wrote:

<snip>

> Have you an estimate for how frequently the operators have actually been
> used in this manner? I think there are two instances in the RDS libraries,
> the case conversion routines (BTW which only work on a limited set of
> characters)

That's why they are useless for text written in many languages other
than English. That is strange for a product that is intended for
international use, especially because it is very easy to write better case
conversion routines.
Case conversion routines written without using the operators in the
manner mentioned above are also faster. That's why RDS themselves don't
use their own library routines for case conversion, when speed is
important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
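
For readers who have not seen the technique, here is a sketch of case
conversion written with sequence operators, in the style under
discussion. The name seq_lower and the constant are written here for
illustration; the actual library source may differ. Only A..Z is
touched, which is exactly the limitation described above.

<eucode>
constant TO_LOWER = 'a' - 'A'

function seq_lower(object x)
    -- (x >= 'A' and x <= 'Z') is 1 for each upper-case letter, 0 otherwise,
    -- so the offset is added only where it is needed, at any nesting depth
    return x + (x >= 'A' and x <= 'Z') * TO_LOWER
end function

puts(1, seq_lower("Hello World") & '\n')   -- prints hello world
? seq_lower({"AB", {"Cd"}})                -- prints {{97,98},{{99,100}}}
</eucode>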

<snip>

Regards,
   Juergen

27. Re: Small feature request for future EU versions

Juergen Luethje wrote:
> 
> Derek Parnell wrote:
> 
> <snip>
> 
> > Have you an estimate for how frequently the operators have actually been
> > used in this manner? I think there are two instances in the RDS libraries,
> > the case conversion routines (BTW which only work on a limited set of
> > characters)
> 
> That's why they are useless for text written in many languages other
> than English. That is strange for a product that is intended for
> international use, especially because it is very easy to write better case
> conversion routines.
> Case conversion routines written without using the operators in the
> manner mentioned above are also faster. That's why RDS themselves don't
> use their own library routines for case conversion, when speed is
> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...

I didn't think that would be the case, but I just tested a simple lookup
table approach to case conversion and it runs in 75% of the time that
lower() uses.
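
A lookup table version could look roughly like the sketch below. This is
not the code that was timed; the names are invented here, and it assumes
a flat string of character codes in the range 0..255.

<eucode>
sequence lower_table
lower_table = repeat(0, 256)
for code = 0 to 255 do
    lower_table[code + 1] = code           -- by default a code maps to itself
end for
for ch = 'A' to 'Z' do
    lower_table[ch + 1] = ch + ('a' - 'A') -- upper-case letters map to lower case
end for

function table_lower(sequence s)
    -- s must be a flat sequence of integers in 0..255
    for i = 1 to length(s) do
        s[i] = lower_table[s[i] + 1]       -- +1 because sequences are 1-based
    end for
    return s
end function

puts(1, table_lower("Hello World") & '\n')   -- prints hello world
</eucode>

Supporting another alphabet then only means changing entries in the table.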

-- 
Derek Parnell
Melbourne, Australia

28. Re: Small feature request for future EU versions

Derek Parnell wrote:

> Juergen Luethje wrote:

<snip>

>> That's why they are useless for text written in many languages other
>> than English. That is strange for a product that is intended for
>> international use, especially because it is very easy to write better case
>> conversion routines.
>> Case conversion routines written without using the operators in the
>> manner mentioned above are also faster. That's why RDS themselves don't
>> use their own library routines for case conversion, when speed is
>> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
>
> I didn't think that would be the case, but I just tested a simple lookup
> table approach to case conversion and it runs in 75% of the time that
> lower() uses.

I personally use a modified version of RDS' fast_lower() (URL might wrap):
http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=9&fromYear=9&toMonth=9&toYear=9&postedBy=Juergen+Luethje&keywords=lower

Applied to the whole text in Euphoria/Doc/Library.doc, it runs in 50% of
the time that lower() uses. :-)

Regards,
   Juergen

29. Re: Small feature request for future EU versions

----- Original Message ----- 
From: "Juergen Luethje"
Sent: Friday, October 22, 2004 2:48 AM
Subject: Re: Small feature request for future EU versions


> 
> <snip>
>
> Case conversion routines written without using the operators in the
> manner mentioned above are also faster. That's why RDS themselves don't
> use their own library routines for case conversion, when speed is
> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
> 
> <snip>
> 
> Regards,
>   Juergen

Those routines will do case conversion at any depth.
Anything else would have to be either recursive or specially designed for
each task.

    unkmar

30. Re: Small feature request for future EU versions

Lucius Hilley wrote:

> ----- Original Message ----- 
> From: "Juergen Luethje"
>
>> <snip>
>>
>> Case conversion routines written without using the operators in the
>> manner mentioned above are also faster. That's why RDS themselves don't
>> use their own library routines for case conversion, when speed is
>> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
>>
>> <snip>
>
> Those routines will do case conversion at any depth.
> Anything else would have to be either recursive or specially designed for
> each task.

My modified version of RDS' fast_lower() function (used in 'guru.ex' and
'search.ex') also does case conversion at any depth. Yes, it is recursive.

As I wrote, my function (which can also handle special characters such
as the German umlauts) runs in 50% of the time that lower() uses, when
applied to the whole text of Euphoria/Doc/Library.doc.

Do you think my recursive function will be slower than the lower()
library function, when applied to deeply nested objects? I don't
know. I think the library function also will have to do some recursion
internally. And do we apply lower() and upper() more often to deeply
nested objects, or to plain text strings?
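
To make the comparison concrete, a recursive routine of the kind
described might be sketched as follows. This is not the modified
fast_lower() mentioned above (its source is not shown in this thread);
deep_lower is a hypothetical name and it only handles A..Z.

<eucode>
function deep_lower(object x)
    if atom(x) then
        if x >= 'A' and x <= 'Z' then
            return x + ('a' - 'A')
        end if
        return x                      -- anything else is left unchanged
    end if
    for i = 1 to length(x) do
        x[i] = deep_lower(x[i])       -- recurse into each element
    end for
    return x
end function

puts(1, deep_lower("Hello World") & '\n')   -- prints hello world
? deep_lower({"AB", {"Cd"}, 'E'})           -- prints {{97,98},{{99,100}},101}
</eucode>

For a plain text string the recursion is only one level deep, which is
the common case asked about above.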

Regards,
   Juergen

31. Re: Small feature request for future EU versions

On 22 Oct 2004, at 14:12, Juergen Luethje wrote:

> 
> 
> Derek Parnell wrote:
> 
> > Juergen Luethje wrote:
> 
> <snip>
> 
> >> That's why they are useless for text written in many languages other
> >> than English. That is strange for a product that is intended for
> >> international use, especially because it is very easy to write better case
> >> conversion routines. Case conversion routines written without using the
> >> operators in the manner mentioned above are also faster. That's why RDS
> >> themselves don't use their own library routines for case conversion, when
> >> speed is important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
> >
> > I didn't think that would be the case, but I just tested a simple lookup
> > table approach to case conversion and it runs in 75% of the time that
> > lower() uses.
> 
> I personally use a modified version of RDS' fast_lower() (URL might wrap):
>
> http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=9&fromYear=9&toMonth=9&toYear=9&postedBy=Juergen+Luethje&keywords=lower
> 
> Applied to the whole text in Euphoria/Doc/Library.doc, it runs in 50% of
> the time that lower() uses. :-)

Just out of curiosity, did you compare to/against Jiri's lib in
http://www.rapideuphoria.com/nlseu.zip ?
I did something similar in mirc ages ago, but naturally it ran
v e r y  s l o w l y there.

Kat

32. Re: Small feature request for future EU versions

Kat wrote:

> On 22 Oct 2004, at 14:12, Juergen Luethje wrote:
>
>> Derek Parnell wrote:
>>
>>> Juergen Luethje wrote:
>>
>> <snip>
>>
>>>> That's why they are useless for text written in many languages other
>>>> than English. That is strange for a product that is intended for
>>>> international use, especially because it is very easy to write better case
>>>> conversion routines. Case conversion routines written without using the
>>>> operators in the manner mentioned above are also faster. That's why RDS
>>>> themselves don't use their own library routines for case conversion, when
>>>> speed is important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
>>>
>>> I didn't think that would be the case, but I just tested a simple lookup
>>> table approach to case conversion and it runs in 75% of the time that
>>> lower() uses.
>>
>> I personally use a modified version of RDS' fast_lower() (URL might wrap):
>>
>> http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=9&fromYear=9&toMonth=9&toYear=9&postedBy=Juergen+Luethje&keywords=lower
>>
>> Applied to the whole text in Euphoria/Doc/Library.doc, it runs in 50% of
>> the time that lower() uses. :-)
>
> Just out of curiosity, did you compare to/against Jiri's lib in
> http://www.rapideuphoria.com/nlseu.zip ?
> I did something similar in mirc ages ago, but naturally it ran
> v e r y  s l o w l y there.

I hadn't compared it to that lib, because I wasn't aware of that library.
Now I downloaded 'nlseu.zip' and compared it:
Applied to the whole text of Euphoria/Doc/Library.doc, the nlsLower()
function in 'nlseu.zip' takes 310% of the time that the lower() function
in 'wildcard.e' uses. Furthermore, nlsLower() is only for Windows.

Regards,
   Juergen

33. Re: Small feature request for future EU versions

Hi, Juergen!

You wrote:

[snip]

> I hadn't compared it to that lib, because I wasn't aware of that library.
> Now I downloaded 'nlseu.zip' and compared it:
> Applied to the whole text of Euphoria/Doc/Library.doc, the nlsLower()
> function in 'nlseu.zip' takes 310% of the time that the lower() function
> in 'wildcard.e' uses. Furthermore, nlsLower() is only for Windows.

There  is  wildcarr.e  in my  ru_eu_9_.zip  package.
It has the additional functions with the bilingual (English/Russian) names.

English names of those functions are case_la() - for Latin alphabet,
and case_ru() - for Russian alphabet in 5 different encodings.

I did not test the speed of those functions, they just work for me and I do
not care.

Try please, any alphabet may be supported that Russian way, I think.

Regards,
Igor Kachan
kinz at peterlink.ru

34. Re: Small feature request for future EU versions

On Fri, 22 Oct 2004 21:43:00 +0200, Juergen Luethje <j.lue at gmx.de>
wrote:

>Now I downloaded 'nlseu.zip' and compared it:
>Applied to the whole text of Euphoria/Doc/Library.doc, the nlsLower()
>function in 'nlseu.zip' takes 310% of the time that the lower() function
>in 'wildcard.e' uses. 
Probably
>Furthermore, nlsLower() is only for Windows.
True[1]

It will convert to lower case not only the usual A-Z, not only the few
characters in #80..#FF, but also, potentially, if modified to use
CharLowerW instead of CharLowerA, unicode.

(Much) Slower, yes.
However, I wanted to point out that it is fundamentally better, at
least for some purposes.

Regards,
Pete
[1] no doubt there is a similar Linux system call.

35. Re: Small feature request for future EU versions

Hi, Juergen, again!

> 
> Hi, Juergen!
> 
> You wrote:
> 
> [snip]
> 
> > I hadn't compared it to that lib, because I wasn't aware of that library.
> > Now I downloaded 'nlseu.zip' and compared it:
> > Applied to the whole text of Euphoria/Doc/Library.doc, the nlsLower()
> > function in 'nlseu.zip' takes 310% of the time that the lower() function
> > in 'wildcard.e' uses. Furthermore, nlsLower() is only for Windows.
> 
> There  is  wildcarr.e  in my  ru_eu_9_.zip  package.
> It has the additional functions with the bilingual (English/Russian) names.
> 
> English names of those functions are case_la() - for Latin alphabet,
> and case_ru() - for Russian alphabet in 5 different encodings.
> 
> I did not test the speed of those functions, they just work for me and I do
> not care.
> 
> Try please, any alphabet may be supported that Russian way, I think.

Oops... Forgot to say.

If you want these Russian libraries to be compatible with
standard Euphoria, run the command:

ex_r.exe translat.ex

Then you'll have the complete set of these libs with
the .ez extension.

wildcarr.ez and the other such libs support the Eu2C translator and the binder.

To get a Russian program translated to Latin, use the Esc t command of
the red.ex editor. This way pure Russian red.ex was compiled with
pure English Open Watcom 1.1 and bound with pure
English CE Euphoria v.2.4.

This way *any pure Russian program* runs on any *custom* Euphoria that
supports standard Euphoria code.
Just move all the .ez files into a separate dir and rename them to .e to get
this effect.

All that is available right now, naturally, and free.

Good Luck again!

Regards,
Igor Kachan
kinz at peterlink.ru

36. Re: Small feature request for future EU versions

Igor Kachan wrote:

> Hi, Juergen, again!
>
>> Hi, Juergen!

<snip>

>> There  is  wildcarr.e  in my  ru_eu_9_.zip  package.
>> It has the additional functions with the bilingual (English/Russian) names.
>>
>> English names of those functions are case_la() - for Latin alphabet,
>> and case_ru() - for Russian alphabet in 5 different encodings.
>>
>> I did not test the speed of those functions, they just work for me and I do
>> not care.
>>
>> Try please, any alphabet may be supported that Russian way, I think.
>
> Oops... Forgot to say.
>
> If you want these Russian libraries to be compatible with
> the standard Euphoria,  run the command :
>
> ex_r.exe translat.ex
>
> Then you'll have the complete set of these libs with
> the .ez extension.
>
> The wildcarr.ez  and others such libs support translator Eu2C and binder.
>
> To get Russian program translated to Latin, use Esc t command of
> the red.ex  editor.  This way pure Russian red.ex  was compiled with
> pure English Open Watcom 1.1 and bound with pure
> English CE Euphoria v.2.4.
>
> This way *any pure Russian program* runs on any *custom* Euphoria, which
> supports the standard Euphoria code.
> Just move all .ez into separate dir and rename them as .e to get
> this effect.

My test program ...

include wildcarr.e
constant CP = {"dos","win","koi","iso","mac"}
sequence s
s = "heLLo"
for i = 1 to length(CP) do
   printf(1, "'%s'\n", {case_ru(0,s,CP[i])})
end for


... still just prints 5 times "heLLo" (German Windows 98).

<snip>

Regards,
   Juergen

37. Re: Small feature request for future EU versions

Hi,

Juergen Luethje wrote:

[snip]
>>> English names of those functions are case_la() - for Latin alphabet,
>>> and case_ru() - for Russian alphabet in 5 different encodings.
[snip]

> My test program ...
> 
<eucode>
> include wildcarr.e
> constant CP = {"dos","win","koi","iso","mac"}
> sequence s
> s = "heLLo"
> for i = 1 to length(CP) do
>    printf(1, "'%s'\n", {case_ru(0,s,CP[i])})
> end for
</eucode>

> 
> ... still just prints 5 times "heLLo" (German Windows 98).
> 
> <snip>

There is case_la() for Latin alphabet, not case_ru().  See please on top
once more.
Your  "heLLo" is in Latin alphabet.
I think, you can make case_gr() for German alphabet, someone - case_fr()
for France, and so on.  case_ru() is Russian Cyrillic.
There are no any  "dos","win","koi","iso","mac"  for pure Latin.
Pure Latin  is just ASCII,  A..Z,  a..z.
 Do you see now?

Regards,
Igor Kachan
kinz at peterlink.ru

38. Re: Small feature request for future EU versions

Igor Kachan wrote:

> Juergen Luethje wrote:
>
> [snip]
>>>> English names of those functions are case_la() - for Latin alphabet,
>>>> and case_ru() - for Russian alphabet in 5 different encodings.
> [snip]
>
>> My test program ...
>>
<eucode>
>> include wildcarr.e
>> constant CP = {"dos","win","koi","iso","mac"}
>> sequence s
>> s = "heLLo"
>> for i = 1 to length(CP) do
>>    printf(1, "'%s'\n", {case_ru(0,s,CP[i])})
>> end for
</eucode>

>>
>> ... still just prints 5 times "heLLo" (German Windows 98).
>>
>> <snip>
>
> There is case_la() for Latin alphabet, not case_ru().  See please on top
> once more.
> Your  "heLLo" is in Latin alphabet.

case_la() is just Euphoria's standard lower() and upper() combined in 1
function. I have that already, and as has been discussed here, this
doesn't handle special characters such as the German umlauts. So it is
useless for me.

> I think, you can make case_gr() for German alphabet,

As I wrote at the beginning of this thread, I already *had* made a
function that is able to handle German characters.
I thought you would provide me another function. Now I downloaded all
that Russian stuff just so that you tell me, I shall write my own
function???

> someone - case_fr()
> for France, and so on.  case_ru() is Russian Cyrillic.
> There are no any  "dos","win","koi","iso","mac"  for pure Latin.
> Pure Latin  is just ASCII,  A..Z,  a..z.
>  Do you see now?

Regards,
   Juergen

39. Re: Small feature request for future EU versions

Juergen Luethje wrote:

[snip]

> case_la() is just Euphoria's standard lower() and upper() combined in 1
> function. I have that already, and as has been discussed here, this
> doesn't handle special characters such as the German umlauts. So it is
> useless for me.

OK, but it is useful for me to handle *all* languages with pure
Latin alphabet. 
Maybe it is useful for someone else as an example of the "delete double"
method.
There are many different people with different programming skills here.
Some of them just do not know that lower() and upper() handle pure
Latin only.

This list is for the free technical support of any user of Euphoria
programming language, for PD Euphoria users and for CE Euphoria users,
as it is stated on the Euphoria home page.

Well, we have done our job for pure Latin - there are enough
available functions, I think.

Let us go to the original national alphabets.

> > I think, you can make case_gr() for German alphabet,
> 
> As I wrote at the beginning of this thread, I already *had* made a
> function that is able to handle German characters.

Very well, I do understand that.
Sorry, I do not know your function yet, but I am *sure* it handles German
characters correctly -- you are German yourself.
But if German language has different code pages for DOS (OEM) and for
Windows (ANSI), your function has to handle *both* code pages correctly.
I can not create this function for you - I just do not know German,
and the standard code pages are more or less international
 - for not single native language.

Say, 1251 Windows CP, supports all Cyrillic languages - Russian,
Ukrainian, Belorussian, Tatar and so on, but every one of them
has its own alphabet(!).

And functions must handle *every* native alphabet of a given code page.

And many people just do not know that they must handle their
alphabets for DOS and for Windows differently, not to mention
Linux and Mac.

> I thought you would provide me another function. Now I downloaded all
> that Russian stuff just so that you tell me, I shall write my own
> function???

Why you *shall*, if you have your function *already done* for one
or for all German code pages?

I just think that you can revise your function now with help from that
Russian stuff and correct your function for different German code pages
as it is already done for Russian.

RDS doesn't know German nor Russian well enough to provide the
case_ge() Euphoria function for you and the case_ru() function for me.

So, me wrote, remember please - try please, any alphabet may be
supported that Russian *way*, I think.

Me did not provide the universal function for any alphabet,
but a *way* to make the concrete function for concrete alphabet.

If there is the case_ru() function for all code pages in Russian,
why not to have case_ge() for German, case_fr() for French, case_gr()
for Greek and so on?
Without any confusing with usage of some universal function?

The universal function is possible, but it requires
dozens and dozens of the "if end if" statements and it requires
the knowledge of not only code pages, but plus national alphabets,
first of all to be created.

These functions are simple enough if you make them for your
native languages, but they are more or less buggy, if you use
some universal stuff.

For example, my case_ru() doesn't handle the Russian letters
E and e with two dots above for now, just because of some
issues with the reference stuff.
And I can not discuss these issues with most of, say, Germans.

Who cares? RDS? To do some custom job for their free PD production?

Good will and volunteers are needed here.

This is the only way to support the excellent PD thing.

Not Shareware, but Public Domain thing without expiry
time of evaluation or such. 

> > someone - case_fr()
> > for France, and so on.  case_ru() is Russian Cyrillic.
> > There are no any  "dos","win","koi","iso","mac"  for pure Latin.
> > Pure Latin  is just ASCII,  A..Z,  a..z.
> > Do you see now?

So, we have many alternatives right now - to use the existing system functions
of Windows and Linux, or to use native Euphoria functions -- your
one for German and my case_la() and case_ru() for Latin and for 6 different
Russians, including the very rare Latinic Russian specialized for Euphoria.

As far as I can suppose, RDS itself could potentially provide something native for
English, Japanese and Esperanto - but it seems to me that only the Latin
alphabet is needed for that task, and lower() and upper() already exist. Done! :-)

Once more - good will and volunteers are needed,
if you want the native Euphoria functions for this task.

The task is not very difficult if you do know your native alphabet,
be sure, dear End Users.

Good Luck!

Regards,
Igor Kachan
kinz at peterlink.ru

40. Re: Small feature request for future EU versions

Hi, Dear EU users!

Me wrote:

[snip]
> Me did not provide the universal function for any alphabet,
> but a *way* to make the concrete function for concrete alphabet.
[snip]
> The task is not very difficult if you do know your native alphabet,
> be sure, dear End Users.

Let us try to use that way to make the concrete (i.e. specific, only
this one, not any other) function for concrete well known alphabet
-- pure Latin -- alphabet_la, just for example.

sequence alphabet_la
alphabet_la="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

global function case_LA(integer c, object x)
-- case_LA -- to not confuse with the existing case_la() function
-- convert Latin atom or sequence to upper or to lower case
-- c != 0: convert to upper case; c = 0: convert to lower case
 integer n
 sequence A, a   -- a: letters to search for, A: their replacements
 if c then 
     A=alphabet_la[1 ..26]
     a=alphabet_la[27..52]
 else 
     a=alphabet_la[1 ..26]
     A=alphabet_la[27..52]
 end if
 if atom(x) then
     n = find(x, a)
     if n then
         return A[n]
     else
         return x   -- not a letter of this alphabet: leave it unchanged
     end if
 else
     for i=1 to length(x) do
         n=find(x[i],a)
         if n then
            x[i] = A[n]
         end if
     end for  
 end if      
return x
end function -- case_LA()

puts(1, alphabet_la & '\n')
puts(1, case_LA(1, alphabet_la) & '\n')
puts(1, case_LA(0, alphabet_la) & '\n')


Just replace the alphabet_la sequence with your native alphabet
and you will have concrete function for your native language
and for your current code page.
If you see your alphabet correctly in your editor, all is right.
Do not forget that the results may differ between DOS and Windows.

Say,
sequence alphabet_mj
alphabet_mj = "......place here The Great Mumbo Jumbo Alphabet....."

global function case_mj(integer c, object x)
And so on.

Do not forget:
     A=alphabet_mj[1 ..     not 26,  but needed number]
     a=alphabet_mj[not 27, but needed number .. not 52, but needed number]
     -- Just first and second half.
 
... is easy peasy, NO? -- by Pete Lomax

But I just now found one old bug in my case_ru() function.
Thanks for your questions!

Regards,
Igor Kachan
kinz at peterlink.ru

41. Re: Small feature request for future EU versions

Ricardo Forno wrote:

[snip]

> Hi, Igor.
> Your function is slow compared to other approaches.
> Since in Euphoria (and also in C and other languages) a character is just a
> number, may I suggest a change?
> I think it would be much faster having two 256-character sequences, one
> representing the "translation" of the other (characters in the same
> position), and using the input characters as indexes.
> This way, you will get not only a way to translate between upper and lower
> case, but also (changing the sequences) to translate ASCII to EBCDIC or
> vice-versa, or any translation character by character you want.
> Regards.

Hi, Ricardo!

Yes, my case_LA() function is twice as slow as the standard upper() and
lower().
I tested them both just now.
I like to speed things up, but I do not have this problem with conversion or
translation. I translate all bilingual EU libs into Latinic Russian in a
small fraction of a second and never wait for the results.
Esc t or Esc e in red.ex and that final beep sounds.

I like your suggestion, but I just do not see how to make
a stable template for *any* alphabet this way now.

But the case_LA() function is a template for any possible alphabet and any
code page right now, as far as I can see.

Some alphabets have no case at all, some alphabets have different
numbers of upper and lower letters.
For example, computer Russian has 3 extra letters in upper case, which
are absent in Russian canonical grammar. 

But speed is a good thing, yes, Ricardo.
Try, maybe you can make such a fast & simple template.
For now, I can not imagine another possible function.

Regards,
Igor Kachan
kinz at peterlink.ru

42. Re: Small feature request for future EU versions

What I think he means is this:

constant LA_up = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
constant LA_lo = "abcdefghijklmnopqrstuvwxyz"
constant LA_diff = LA_up - LA_lo   -- per-letter offset from lower to upper case

global function case_LA(integer c, object x)
    -- c != 0: convert to upper case; c = 0: convert to lower case
    integer n
    if atom(x) then
        if c then
            n = find( x, LA_lo )
            if n then
                x += LA_diff[n]
            end if
        else
            n = find( x, LA_up )
            if n then
                x -= LA_diff[n]
            end if
        end if
    else
        for i = 1 to length(x) do
            x[i] = case_LA(c, x[i])   -- recurse, so nested sequences work too
        end for
    end if

    return x
end function

> I like your suggestion, but I just do not see the solution how to make
> this way the stable template for *any* alphabet now.

Works for any alphabet and code page, and is faster, because it
doesn't have to keep slicing the alphabet sequences.

> Some alphabets have no case at all, some alphabets have different
> numbers of upper and lower letters.
> For example, computer Russian has 3 extra letters in upper case, which
> are absent in Russian canonical grammar.

The limitation of the above function is that LA_up and LA_lo must be
the same length... what do you mean by 3 extra letters? What if you
try to convert them to lower case? If it should just leave them as
upper case, that's fine - just leave them out of the function.

 
***The above function is completely untested. It should not be used in
nuclear reactors, medical life support systems, or anywhere where
failure may cause injury*** :-)
-- 
MrTrick

43. Re: Small feature request for future EU versions

Patrick Barnes wrote:
> 
> What I think he means is this:
> 
> constant  LA_up= "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
> constant LA_lo = "abcdefghijklmnopqrstuvwxyz"
> constant LA_diff = LA_up - LA_lo
> 
> global function case_LA(integer c, object x)
>     integer n
>     if atom(x) then
>         if c then
>              n = find( x, LA_lo )
>              if n then
>                   x += LA_diff[n]

 OR instead ...
                    x = LA_up[n]


[snip]

I have a generic case conversion that can work for most alphabets.
I'll submit it to the contributions page.


[snip]


> > Some alphabets have no case at all, some alphabets have different
> > numbers of upper and lower letters.
> > For example, computer Russian has 3 extra letters in upper case, which
> > are absent in Russian canonical grammar.

I believe German and some other languages have a situation where a
single lower-case letter gets changed to two characters when
converted to uppercase - the German s-sharp character 'ß' changes
to 'SS' when converted.
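
A one-to-two mapping like this rules out a pure character-for-character
table, because the result can be longer than the input. Below is a rough
sketch of one way to handle it, assuming a flat string in the
Windows-1252 / ISO 8859-1 code page (where 'ß' is #DF) and handling only
a..z otherwise. upper_de is a hypothetical name, not a routine from any
library mentioned here.

<eucode>
constant SHARP_S = #DF   -- 'ß' in Windows-1252 / ISO 8859-1 (assumed code page)

function upper_de(sequence s)
    sequence result
    integer ch
    result = ""
    for i = 1 to length(s) do
        ch = s[i]
        if ch = SHARP_S then
            result &= "SS"                  -- one character becomes two
        elsif ch >= 'a' and ch <= 'z' then
            result &= ch - ('a' - 'A')
        else
            result &= ch                    -- everything else is copied as is
        end if
    end for
    return result
end function

puts(1, upper_de("stra" & SHARP_S & "e") & '\n')   -- prints STRASSE
</eucode>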


-- 
Derek Parnell
Melbourne, Australia

44. Re: Small feature request for future EU versions

Patrick Barnes wrote:

[snip]

> > I like your suggestion, but I just do not see the solution how to make
> > this way the stable template for *any* alphabet now.
> 
> Works for any alphabet and code page, and is faster, because it
> doesn't have to keep slicing the alphabet sequences.

Ok, I see "...for any alphabet...".

But my_case_LA() makes just single slicing on the alphabet_la sequence, 
and yours_case_LA() has 3 supporting sequences.

Can you rename yours_case_LA() to case_La() or something similar? Just for
brevity, to avoid confusion?

> > Some alphabets have no case at all, some alphabets have different
> > numbers of upper and lower letters.
> > For example, computer Russian has 3 extra letters in upper case, which
> > are absent in Russian canonical grammar.
> 
> The limitation of the above function is that LA_up and LA_lo must be
> the same length... what do you mean by 3 extra letters? What if you
> try to convert them to lower case? If it should just leave them as
> upper case, that's fine - just leave them out of the function.

Ok, I see "The limitation..." .
What about "...for any alphabet..."?    :-)

Well, there are 3 lower-case letters in Russian which can not stand
in the first position of a word.
There are no such words in Russian at all.
So you can not find their upper-case forms in any normal Russian text.
But some more or less artificial computer texts, like "ruSSIan" or "heLLo",
can include those additional upper-case letters.
And I can just include those letters in the alphabet_ru sequence
or exclude them, and get the functions for the canonical and for the artificial
Russian languages.

Ok, Derek submitted his new library to Rob for these things.

I think there is enough such stuff in the RDS archives
now that we need not force Rob to learn our own crazy mumbos_jumbos.

Regards,
Igor Kachan
kinz at peterlink.ru

45. Re: Small feature request for future EU versions

Derek Parnell wrote:

[snip]
[snip]

> I have a generic case conversion that can work for most alphabets.
> I'll submit it to the contributions page.

[snip]
[snip]

I have downloaded your library, it works for me, thanks.
But it seems to me, your function can not make the *selective*
conversion of bilingual texts of the same code page,
but of different alphabets.

Say, I can run my stuff the following way:

text = case_la(Lo, text)        -- to get all pure Latin letters lower-case
text = case_ru(Lo, text, "win") -- to get all common Russian/Ukrainian letters lower-case
text = case_ua(Up, text, "win") -- to get all pure Ukrainian letters upper-case


Couldn't you?
Sorry, good appetite, yes.

Regards,
Igor Kachan
kinz at peterlink.ru

46. Re: Small feature request for future EU versions

Igor Kachan wrote:
> 
> Derek Parnell wrote:
> 
> [snip]
> [snip]
> 
> > I have a generic case conversion that can work for most alphabets.
> > I'll submit it to the contributions page.
> 
> [snip]
> [snip]
> 
> I have downloaded yours library, it works for me, thanks.

You are welcome.

> But it seems to me, your function can not make the *selective*
> conversion of bilingual texts of the same code page,
> but of different alphabets.

I'm sorry but I don't understand what you are saying. What does 
"*selective* conversion" mean? 

I think by "bilingual texts of the same code page, but of different
alphabets." you mean some text in which there is a mixture of characters
from different alphabets, but each character is still from the same code
page. 

I believe my functions can handle that. Something like 

  "Outside window = fenêtre extérieur: garçon = boy"

should come out from Obj_upper() as 

  "OUTSIDE WINDOW = FENÊTRE EXTÉRIEUR: GARÇON = BOY"

So long as each character has a unique code point in the code page, 
regardless of its language, my functions can help. 
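
To illustrate the point about unique code points, here is a minimal sketch of a single lookup table covering a whole single-byte code page. It is not the library's code: the names upper_tab and table_upper are made up for this example, and it assumes the text is a flat string of ISO 8859-1 (Latin-1) byte values.

```euphoria
-- Hypothetical sketch: one table for every code of a single-byte code page.
sequence upper_tab
upper_tab = repeat(0, 256)

-- identity mapping first: entry c+1 holds the result for character code c
for c = 0 to 255 do
    upper_tab[c+1] = c
end for
-- ASCII a..z
for c = 'a' to 'z' do
    upper_tab[c+1] = c - 32
end for
-- Latin-1 accented lower-case letters #E0..#FE, skipping the division sign #F7
for c = #E0 to #FE do
    if c != #F7 then
        upper_tab[c+1] = c - 32
    end if
end for

function table_upper(sequence s)
    -- s is assumed to be a flat string of byte values 0..255
    for i = 1 to length(s) do
        s[i] = upper_tab[s[i]+1]
    end for
    return s
end function

-- the c-cedilla is written as its Latin-1 code #E7 to avoid encoding trouble
puts(1, table_upper("gar" & #E7 & "on = boy") & '\n')
```

Because the table is indexed by code point only, letters of different languages on the same code page are all converted in one pass, which is exactly why such a table is not selective by itself.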

> Say, I can run my stuff the following way:
> 
> text = case_la(Lo, text)        -- to get all pure Latin letters lower-case
> text = case_ru(Lo, text, "win") -- to get all common Russian/Ukrainian letters lower-case
> text = case_ua(Up, text, "win") -- to get all pure Ukrainian letters upper-case

> 
> Couldn't you?

No, because I don't know those alphabets. However, you could, using my
functions. Use the SetCase procedure to define the mappings for those
alphabets. Something like ...

SetCase( "абвгд...эюя",
"АБВГД . . . ЭЮЯ", -1)

and then use the Windows Cyrillic code page to get the correct display 
glyphs.

-- 
Derek Parnell
Melbourne, Australia


47. Re: Small feature request for future EU versions

Derek Parnell wrote:

[snip]

> > I have downloaded yours library, it works for me, thanks.

> You are welcome.

Thanks.

> > But it seems to me, your function can not make the *selective*
> > conversion of bilingual texts of the same code page,
> > but of different alphabets.

> I'm sorry but I don't understand what you are saying. What does
> "*selective* conversion" mean?

> I think by "bilingual texts of the same code page, but of different
> alphabets." you mean some text in which there is a mixture of characters
> from different alphabets, but each character is still from the same code
> page.

Yes, you are right.

To be more specific and clear, let us take a concrete example:
how my case_ru() function works.

It takes 3 parameters, say:

text = case_ru(1, text, "win")

1 - upper, 0 - lower.
text - sequence with the text I want to process.

"win" - one of 5 options; the others are "dos", "iso", "koi", "mac".
It names the code page of the text sequence.
"win" stands for the Windows 1251 common Cyrillic code page.
The user has to know the text's code page.
It is not the current machine's code page, but that of the
particular text.

So, this function only processes the Russian alphabet on the
common Cyrillic code page and doesn't affect the specific
letters of, say, Ukrainian in bilingual Russian/Ukrainian texts.

The same goes for the "dos", "iso", "koi", "mac" Cyrillic code pages.

This way I can use my case_ru() function on any Euphoria platform
to process Russian texts from any other Euphoria platform.
It is really generic across Euphoria platforms: it can process
any given Russian text and doesn't depend on the current platform.

So, the case_ua() function, if someone wants to have it,
may use the full Ukrainian alphabet with all the common
Russian/Ukrainian letters, or just the few specific Ukrainian
letters, to process only those letters in bilingual
(Russian/Ukrainian) Cyrillic texts on any Euphoria platform.
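
A rough sketch of how such a selective, code-page-aware function might look. Only the three-parameter calling shape comes from the description above; the name case_ru_sketch and the whole body are assumptions for illustration, it covers just the "win" and "dos" code pages, and it assumes a flat string of byte values.

```euphoria
-- Sketch only: the name and body are assumptions, not the real case_ru().
-- up: 1 = upper, 0 = lower; cp: "win" (Windows-1251) or "dos" (CP866).
-- The letter Yo is left out to keep the sketch short.
function case_ru_sketch(integer up, sequence text, sequence cp)
    integer ch
    for i = 1 to length(text) do
        ch = text[i]
        if equal(cp, "win") then
            -- Windows-1251: upper-case #C0..#DF, lower-case #E0..#FF
            if up and ch >= #E0 and ch <= #FF then
                text[i] = ch - 32
            elsif not up and ch >= #C0 and ch <= #DF then
                text[i] = ch + 32
            end if
        elsif equal(cp, "dos") then
            -- CP866: upper-case #80..#9F, lower-case #A0..#AF and #E0..#EF
            if up then
                if ch >= #A0 and ch <= #AF then
                    text[i] = ch - 32
                elsif ch >= #E0 and ch <= #EF then
                    text[i] = ch - 80
                end if
            elsif ch >= #80 and ch <= #8F then
                text[i] = ch + 32
            elsif ch >= #90 and ch <= #9F then
                text[i] = ch + 80
            end if
        end if
    end for
    return text
end function

sequence sample
sample = {#EF, #F0, #E8, #E2, #E5, #F2, ' ', 'o', 'k'}  -- Windows-1251 bytes
sample = case_ru_sketch(1, sample, "win")
-- the six Cyrillic bytes are now #CF, #D0, #C8, #C2, #C5, #D2; " ok" is unchanged
```

Any byte outside the listed ranges, such as Latin letters or the extra letters of another Cyrillic alphabet, passes through untouched, which is the selective behaviour described above.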

> I believe my functions can handle that. Something like
>  "Outside window = fenêtre extérieur: garçon = boy"
> should come out from Obj_upper() as
>  "OUTSIDE WINDOW = FENÊTRE EXTÉRIEUR: GARÇON = BOY"
> So long as each character has a unique code point in the code page,
> regardless of its language, my functions can help.

Are you saying your function processes all alphabets of a
given code page at once? Yes, as far as I can see.

If so, it cannot process Win_Western texts selectively
by default and requires some *additional* work from the local
programmer.

The same as my case_mj() function, except that one is selective
by default, if you want to prepare and use it that way.

The only productive way to make these functions really
useful is to get the local programmers to write functions
for their native alphabets, code pages and languages, I think.

> > Say, I can run my stuff the following way:
> >
> > text = case_la(Lo, text)        -- to get all pure Latin letters lower-case
> > text = case_ru(Lo, text, "win") -- to get all common Russian/Ukrainian letters lower-case
> > text = case_ua(Up, text, "win") -- to get all pure Ukrainian letters upper-case

>
> Couldn't you?

> No, because I don't know those alphabets. However, you could using my
> functions. Use the SetCase procedure to define the mappings for those
> alphabets. Something like ...

> SetCase( "абвгд...эюя",
> "АБВГД . . . ЭЮЯ", -1)

> and then use the Windows Cyrillic code page to get the correct display
> glyphs.

OK, I do understand correctly, I think.
If a user of your function wants to process his text selectively
and not affect the letters of some second possible language of the given
code page, he/she has to make separate tables for the SetCase()
function and call it twice:
first for the first language, then, after the first pass, for the second one.
Right?

But what are the strange hex codes/unicodes in your example above?
It seems to me the SetCase() function doesn't handle these codes yet,
and they are some reserve for the future.

Regards,
Igor Kachan
kinz at peterlink.ru


48. Re: Small feature request for future EU versions

On Wed, 27 Oct 2004 00:56:24 -0300, Ricardo Forno <rforno at uyuyuy.com> wrote:
> Well... not exactly this.
> find() is a bit slow.
> What I suggest is having a 256-character sequence from which you take the
> character corresponding to the the one you want to translate, used as an
> index.
> For example, you know that 'A' is equal to 65 (its ASCII value). Assume then
> that sequence X contains an 'a' in position 66, a 'b' in position 67, and so
> on. So, to translate sequence Z from upper to lower case, you will code:
> 
> for i = 1 to length(Z) do
>     Z[i] = X[Z[i]+1]
> end for

Like this:
constant UPPER = 1
constant LOWER = 2
constant CODE_PAGE_SIZE = 255 --maximum value of a character.

--initial setup
sequence to_uppercase, to_lowercase
to_uppercase = repeat( 0, CODE_PAGE_SIZE )
to_lowercase = repeat( 0, CODE_PAGE_SIZE )
constant alphabet = { {'A', 'a'}, {'B', 'b'} } -- ...etc, for entire alphabet

--populate translation tables with data
for i = 1 to length(alphabet) do
    to_uppercase[ alphabet[i][LOWER] ] = alphabet[i][UPPER]
    to_lowercase[ alphabet[i][UPPER] ] = alphabet[i][LOWER]
end for

--the actual function
--(the parameter is called mode because case is a keyword in later Euphoria)
function change_case( integer mode, object z )
    integer c
    if sequence(z) then
        for i = 1 to length(z) do
            z[i] = change_case( mode, z[i] )
        end for
        return z   --the sequence is finished; c is never set on this path
--  elsif not z then   --don't do any transform on null chars
--      return z
    elsif mode = LOWER then
        c = to_lowercase[z]
    else --assume mode = UPPER
        c = to_uppercase[z]
    end if

    --a zero entry means "no mapping": leave the character alone
    if c then
        return c
    else
        return z
    end if
end function


And there you go. The only way to make it faster, I think, is to use a
non-recursive algorithm and maybe restrict it to 1d arrays only.

The method's only assumption is that no zero-value characters exist
(the null char?). To check for this, just uncomment the checking
code... might make it run a little slower.
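
For comparison, a minimal sketch of the non-recursive variant mentioned above, restricted to flat (1d) strings and with the zero check built in. The names lower_tab and flat_lower are made up for this example, and only A to Z is filled in.

```euphoria
-- Sketch only: flat strings, lower-casing, ASCII letters.
sequence lower_tab
lower_tab = repeat(0, 255)          -- 0 means "no mapping for this code"
for c = 'A' to 'Z' do
    lower_tab[c] = c + 32
end for

function flat_lower(sequence s)
    integer ch, m
    for i = 1 to length(s) do
        ch = s[i]                   -- fails the type check if s is nested
        if ch then                  -- code 0 cannot index the table, so skip it
            m = lower_tab[ch]
            if m then
                s[i] = m
            end if
        end if
    end for
    return s
end function

puts(1, flat_lower("Hello, World") & '\n')   -- prints: hello, world
```

The loop replaces one function call per character with a plain table lookup, at the cost of no longer accepting nested sequences.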

What do you think of this Igor? Handles any odd combination of
alphabets you want to throw at it, I think.

-- 
MrTrick


49. Re: Small feature request for future EU versions

Patrick Barnes wrote:

[snip]

> And there you go. The only way to make it faster, I think, is to use a
> non-recursive algorithm and maybe restrict it to 1d arrays only.
> 
> The method's only assumption is that no zero-value characters exist
> (the null char?). To check for this, just uncomment the checking
> code... might make it run a little slower.
> 
> What do you think of this Igor? Handles any odd combination of
> alphabets you want to throw at it, I think.
> 
> -- 
> MrTrick

Using all of your suggestions and questions, I have put together
the case_xx() function, which is twice as fast as the standard
case_la(), takes any alphabet and is selective by default.
It works with my 5 crazy Russians and pure Mumbo Jumbo as well.
Results  -  0.3 sec  -  3.5M file  -  1.8GHz box.

Please try:

sequence table
table = repeat(0,256)

global constant
    alphabet_LA = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz",
    alphabet_MJ = ".....place here The Great Mumbo Jumbo Alphabet...."

-- c: 1 = upper, 0 = lower; x: a character or a flat string;
-- alphabet: upper/lower pairs, e.g. "AaBb..."
global function case_xx(integer c, object x, sequence alphabet)
    integer A, a
    table *= 0 -- to clear table for next use with another alphabet
    for i = 1 to length(alphabet) - 1 by 2 do
        A = alphabet[i]
        a = alphabet[i+1]
        if c then
            table[A] = A
            table[a] = A
        else
            table[A] = a
            table[a] = a
        end if
    end for

    if atom(x) then
        if x then
            return table[x] -- 0 if x is not in the alphabet
        else
            return x
        end if
    else
        for i = 1 to length(x) do
            a = x[i]
            if a then -- to convert binary
                x[i] = table[a] -- characters outside the alphabet become 0
            end if
        end for
    end if
    return x
end function

puts(1, case_xx(1, alphabet_LA, alphabet_LA) & '\n')
puts(1, case_xx(0, alphabet_LA, alphabet_LA) & '\n')
puts(1, alphabet_LA & '\n')
puts(1, case_xx(0, 'A', alphabet_LA) & '\n')
----
puts(1, case_xx(1, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, case_xx(0, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, alphabet_MJ & '\n')
----
puts(1, case_xx(0, 'M', alphabet_MJ) & '\n')
puts(1, case_xx(0, 'u', alphabet_MJ) & '\n')
----
puts(1, case_xx(0, case_xx(1, alphabet_LA, alphabet_LA), alphabet_LA) & '\n')
----


Many Thanks To All !

Just add your native alphabet and use it (at your own risk ;)

Regards,
Igor Kachan
kinz at peterlink.ru


50. Re: Small feature request for future EU versions

Oops...

I wrote:

Patrick Barnes wrote:

[snip]

> And there you go. The only way to make it faster, I think, is to use a
> non-recursive algorithm and maybe restrict it to 1d arrays only.
> 
> The method's only assumption is that no zero-value characters exist
> (the null char?). To check for this, just uncomment the checking
> code... might make it run a little slower.
> 
> What do you think of this Igor? Handles any odd combination of
> alphabets you want to throw at it, I think.
> 
> -- 
> MrTrick

Using all of your suggestions and questions, I have put together
the case_xx() function, which is twice as fast as the standard
case_la(), takes any alphabet and is selective by default.
It works with my 5 crazy Russians and pure Mumbo Jumbo as well.
Results  -  0.3 sec  -  3.5M file  -  1.8GHz box.

Please try:

sequence table
table = repeat(0,256)

global constant
    alphabet_LA = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz",
    alphabet_MJ = ".....place here The Great Mumbo Jumbo Alphabet...."

-- c: 1 = upper, 0 = lower; x: a character or a flat string;
-- alphabet: upper/lower pairs, e.g. "AaBb..."
global function case_xx(integer c, object x, sequence alphabet)
    integer A, a
    table *= 0 -- to clear table for next use with another alphabet
    for i = 1 to length(alphabet) - 1 by 2 do
        A = alphabet[i]
        a = alphabet[i+1]
        if c then
            table[A] = A
            table[a] = A
        else
            table[A] = a
            table[a] = a
        end if
    end for

    if atom(x) then
        if x then
            return table[x] -- still 0 if x is not in the alphabet
        else
            return x
        end if
    else
        for i = 1 to length(x) do
            a = x[i]
            if a then -- to convert binary
                if table[a] then -- to not affect others
                    x[i] = table[a]
                end if --
            end if
        end for
    end if
    return x
end function

puts(1, case_xx(1, alphabet_LA, alphabet_LA) & '\n')
puts(1, case_xx(0, alphabet_LA, alphabet_LA) & '\n')
puts(1, alphabet_LA & '\n')
puts(1, case_xx(0, 'A', alphabet_LA) & '\n')
----
puts(1, case_xx(1, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, case_xx(0, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, alphabet_MJ & '\n')
----
puts(1, case_xx(0, 'M', alphabet_MJ) & '\n')
puts(1, case_xx(0, 'u', alphabet_MJ) & '\n')
----
puts(1, case_xx(0, case_xx(1, alphabet_LA, alphabet_LA), alphabet_LA) & '\n')
----
----


Many Thanks To All !

Just add your native alphabet and use it (at your own risk ;)

Oops... just one more bug fix, see above.


Regards,
Igor Kachan
kinz at peterlink.ru


51. Re: Small feature request for future EU versions

Derek Parnell and Juergen Luethje wrote:

[snip]

> > Have you an estimate for how frequently the operators have actually been
> > used in this manner? I think there are two instances in the RDS libraries,
> > the case conversion routines (BTW which only work on a limited set of
> > characters)
> 
> That's why they are useless for text written in many languages other
> than English. That is strange for a product that is intended for
> international use, especially because it is very easy to write better case
> conversion routines.
> Case conversion routines written without using the operators in the
> manner mentioned above are also faster. That's why RDS themselves don't
> use their own library routines for case conversion, when speed is
> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...

Hey, customers, please take the final function,
written as a "poem" in Euphoria.
Almost all bugs are fixed.
The last one is just a victim of art.
Let us wait for German, French, Greek, Turkish, Polish,
Tatar, Ukrainian, Mongol and other alphabets.
Mumbo Jumbo is ready.
I did not place the 5 Russian alphabets here - they act
like an old USSR torpedo on the RDS MessageBoard.

global constant
alphabet_LA = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz",
alphabet_MJ = ".....place here The Great Mumbo Jumbo Alphabet...."

global function case_xx(integer c, object x, sequence alphabet)
 integer A, a
 sequence table table = repeat(0,256)
 
     for i=1 to length(alphabet) - 1 by 2 do
			 A = alphabet[i]
			 a = alphabet[i+1]
  if c then table[A] = A
		table[a] = A else
		table[A] = a
		table[a] = a end if end for
		 if atom(x) then
			if x  then
			   x
			   =
		     table[x]
			if x  then
			   x
			   =
			   x end if end if
    else for i=1 to length(x) do
			   a
			   =
			   x[i]
			if a  then
		  if table[a] then
			   x[i]
			   =
		   table[a] end if end if end for end if  
		  return x
 end function

puts(1, case_xx(1, alphabet_LA, alphabet_LA) & '\n')
puts(1, case_xx(0, alphabet_LA, alphabet_LA) & '\n')
puts(1, alphabet_LA & '\n')
puts(1, case_xx(0, 'A', alphabet_LA)& '\n')
----
puts(1, case_xx(1, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, case_xx(0, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, alphabet_MJ & '\n')
----
puts(1, case_xx(0, 'M', alphabet_MJ) & '\n')
puts(1, case_xx(0, 'u', alphabet_MJ) & '\n')
----
puts(1, case_xx(0, case_xx(1, alphabet_LA, alphabet_LA), alphabet_LA) &
'\n')


I'm waiting for the bug reports.
Let us finish this overly long thread, OK?

Regards,
Igor Kachan
kinz at peterlink.ru

