1. Small feature request for future EU versions
It would be nice to have the '==' (equal comparison) relational
operator implemented in future versions of EU. Many programming
languages have this operator for equal comparison, while EU
uses equal(). I doubt anyone will confuse '==' with the '='
assignment operator. And if this is a compatibility issue,
keep the equal() routine until people agree that it is no longer
necessary for backwards compatibility. What do you all think about
that?
2. Re: Small feature request for future EU versions
- Posted by "August" <fusionfive at tele2.se> Oct 16, 2004
- Last edited Oct 17, 2004
> posted by: Vincent <darkvincentdude at yahoo.com>
>
> It would be nice to have the '==' (equal comparison) relational
> operator implemented in future versions of EU. Many programming
> languages have this operator for equal comparison, while EU
> uses equal(). I doubt anyone will confuse '==' with the '='
> assignment operator. And if this is a compatibility issue,
> keep the equal() routine until people agree that it is no longer
> necessary for backwards compatibility. What do you all think about
> that?
Generally I think using `=' for assignment and `==' for comparison sucks! In
mathematics `=' has been used for ages as a relation symbol. It's the
assignment statement that needs (may need) special treatment. The natural
symbol for assignment is a "left arrow", but unfortunately it's not
available in ASCII and e.g. `x <- y' can easily be mixed up with `x < -y'.
-- August
3. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 16, 2004
- Last edited Oct 17, 2004
August wrote:
>> posted by: Vincent <darkvincentdude at yahoo.com>
>>
>> It would be nice to have the '==' (equal comparison) relational
>> operator implemented in future versions of EU. Many programming
>> languages have this operator for equal comparison, while EU
>> uses equal(). I doubt anyone will confuse '==' with the '='
>> assignment operator. And if this is a compatibility issue,
>> keep the equal() routine until people agree that it is no longer
>> necessary for backwards compatibility. What do you all think about
>> that?
>
> Generally I think using `=' for assignment and `==' for comparison sucks! In
> mathematics `=' has been used for ages as a relation symbol. It's the
> assignment statement that needs (may need) special treatment. The natural
> symbol for assignment is a "left arrow", but unfortunately it's not
> available in ASCII and e.g. `x <- y' can easily be mixed up with `x < -y'.
Assignment could be written as e.g. x := y
Regards,
Juergen
4. Re: Small feature request for future EU versions
- Posted by cklester <cklester at yahoo.com> Oct 16, 2004
- Last edited Oct 17, 2004
August wrote:
> > posted by: Vincent <darkvincentdude at yahoo.com>
> > It would be nice to have the '==' (equal comparison) relational
> > operator implemented in future versions of EU.
> Generally I think using `=' for assignment and `==' for comparison sucks! In
I think using
if mySeq==yourSeq then...
is a step in the right direction from
if equal(mySeq,yourSeq) then...
That's 22 characters down from 28 characters, a 21% reduction in typing!
Not too shabby. :)
-=ck
"Programming in a state of EUPHORIA."
http://www.cklester.com/euphoria/
5. Re: Small feature request for future EU versions
- Posted by "August" <fusionfive at tele2.se> Oct 16, 2004
- Last edited Oct 17, 2004
> Assignment could be written as e.g. x := y
Yes, ever since Algol that has been the most common alternative to `='.
6. Re: Small feature request for future EU versions
Vincent wrote:
>
> It would be nice to have the '==' (equal comparison) relational
> operator implemented in future versions of EU. Many programming
Unfortunately this, while it would certainly be helpful, will probably never
be added...
> languages have this operator for equal comparison, while EU
> uses equal(). I doubt anyone will confuse '==' with the '='
> assignment operator. And if this is a compatibility issue,
> keep the equal() routine until people agree that it is no longer
If this is ever added, equal() should remain. '==' would simply
be short for equal(). Less typing good!
> necessary for backwards compatibility. What do you all think about
> that?
>
7. Re: Small feature request for future EU versions
CoJaBo wrote:
>
> Vincent wrote:
> >
> > It would be nice to have the '==' (equal comparison) relational
> > operator implemented in future versions of EU. Many programming
> Unfortunately this, while it would certainly be helpful, will probably never
> be added...
>
> > languages have this operator for equal comparison, while EU
> > uses equal(). I doubt anyone will confuse '==' with the '='
> > assignment operator. And if this is a compatibility issue,
> > keep the equal() routine until people agree that it is no longer
> If this is ever added, equal() should remain. '==' would simply
> be short for equal(). Less typing good!
>
> > necessary for backwards compatibility. What do you all think about
> > that?
> >
>
Yeah, but Rob is introducing the $ feature, which happens to be a
shorthand for length:
x[$] is the same as x[length(x)]
And I don't think it would be any hassle for Rob to just implement
the '==' operator as an alternative to equal(), for 5 reasons:
#1 The '==' operator is used for boolean comparison of objects in many
languages.
#2 Using equal() makes the code less readable.
(not really but still)
#3 It takes less typing and is more readable.
#4 I don't think Robert Craig would have much difficulty, if any,
implementing the '==' relational operator in the core language
definition.
#5 I think it's cool :P
8. Re: Small feature request for future EU versions
Derek Parnell wrote:
> Chris Bensler wrote:
>
> [snip]
>
>>
>> I think I would prefer an operator like ==.
>
> And I would like '=' to only be an equality test, ':=' to only be an
> assignment action and 'element_eq()' to be the sequence operation.
Me too.
> If we have '==' we should logically also have '<<', '>>', '!==', '<<='
> and '>>='.
Oh no, please.
>> I think that is something that can be done in a preprocessor though.
>> A lot of the changes that people want can be done in a preprocessor, and
>> once 2.5 arrives, we will have the ability to edit the front end, I
>> believe, which is even better than a preprocessor.
>
> Yes, this is going to be a survival-of-the-fittest game, with lots
> of variants vying for attention.
Yep.
Regards,
Juergen
9. Re: Small feature request for future EU versions
Juergen Luethje wrote:
>
> Derek Parnell wrote:
>
> Chris Bensler wrote:
> >> [snip]
> >> I think that is something that can be done in a preprocessor though.
> >> A lot of the changes that people want can be done in a preprocessor, and
> >> once 2.5 arrives, we will have the ability to edit the front end, I
> >> believe, which is even better than a preprocessor.
> >
> > Yes, this is going to be a survival-of-the-fittest game, with lots
> > of variants vying for attention.
>
> Yep.
>
> Regards,
> Juergen
>
>
survival-of-the-fittest game!
oh no.
please don't go off in different directions like pieces of a bomb.
communicate with each other , all those who want to work on the project.
get the same syntax down (names of functions,procedures and arguments).
then each can try to make the inside of those routines better.
or like an ide does, make a front end better using different methods (maybe even
a pre-processor ) but stick to the same insides(syntax used for the routines).
just my thoughts.
rudy
10. Re: Small feature request for future EU versions
rudy toews wrote:
> Juergen Luethje wrote:
>>
>> Derek Parnell wrote:
>>
>> Chris Bensler wrote:
>>>> [snip]
>
>>>> I think that is something that can be done in a preprocessor though.
>>>> A lot of the changes that people want can be done in a preprocessor, and
>>>> once 2.5 arrives, we will have the ability to edit the front end, I
>>>> believe, which is even better than a preprocessor.
>>>
>>> Yes, this is going to be a survival-of-the-fittest game, with lots
>>> of variants vying for attention.
>>
>> Yep.
>>
>
> survival-of-the-fittest game!
> oh no.
>
> please don't go off in different directions like pieces of a bomb.
> communicate with each other , all those who want to work on the project.
> get the same syntax down (names of functions,procedures and arguments).
I agree that this is desirable. Experience from the past, as well as this
little discussion, shows that different people often have different
preferences, though.
> then each can try to make the inside of those routines better.
>
> or like an ide does, make a front end better using different methods
> (maybe even a pre-processor ) but stick to the same insides(syntax used
> for the routines).
>
> just my thoughts.
Regards,
Juergen
11. Re: Small feature request for future EU versions
On Sun, 17 Oct 2004 00:24:23 -0700, rudy toews
<guest at RapidEuphoria.com> wrote:
>> Derek Parnell wrote:
>> > Yes, this is going to be a survival-of-the-fittest game, with lots
>> > of variants vying for attention.
>please don't go off in different directions like pieces of a bomb.
>communicate with each other
FWIW, new operators are not actually needed, it is quite easy to
greatly improve the language using the current ones.
Taking the if <expr> then example, the interpreter should realise that
<expr> must deliver a boolean result (or crash), in other words map
e1<e2 to compare(e1,e2)=-1
e1<=e2 to compare(e1,e2)!=1
e1=e2 to equal(e1,e2),
e1!=e2 to not equal(e1,e2),
e1>=e2 to compare(e1,e2)!=-1
e1>e2 to compare(e1,e2)=1
Of course, it does not need to do this if e1 and e2 are both atoms.
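To make the mapping concrete, here is a rough sketch of what the interpreter would effectively be doing, written out by hand with the built-ins that exist today. The variable names and the two strings are made up for illustration only.

```euphoria
-- Sketch only: spells out by hand the mapping listed above,
-- using the existing built-ins equal() and compare().
sequence e1, e2
e1 = "apple"
e2 = "banana"

-- e1 < e2   would be treated as   compare(e1, e2) = -1
if compare(e1, e2) = -1 then
    puts(1, "e1 sorts before e2\n")
end if

-- e1 <= e2  would be treated as   compare(e1, e2) != 1
if compare(e1, e2) != 1 then
    puts(1, "e1 sorts before, or is equal to, e2\n")
end if

-- e1 != e2  would be treated as   not equal(e1, e2)
if not equal(e1, e2) then
    puts(1, "e1 and e2 differ\n")
end if
```

Writing `if e1 < e2 then` directly with these two strings fails at run time in the current interpreter, which is the gap the mapping is meant to close.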
Taking this a step further "if <e1> and <e2> then" must also deliver a
boolean result, which means the "need" for a boolean result must be
propagated into e1 and e2. The same is true for all the other
operators (not, unary minus, +, -, *, /, or, xor), and the same logic
applies to while, for, subscript, and slice expressions, but not to
assignments, constants, or parameters. If this all sounds horribly
complicated, don't worry, in practice it's not. I already have this
working in Posetf.
Regards,
Pete
12. Re: Small feature request for future EU versions
Vincent wrote:
>
> It would be nice to have the '==' (equal comparison) relational
> operator implemented in future versions of EU. Many programming
> languages have this operator for equal comparison, while EU
> uses equal(). I doubt anyone will confuse '==' with the '='
> assignment operator. And if this is a compatibility issue,
> keep the equal() routine until people agree that it is no longer
> necessary for backwards compatibility. What do you all think about
> that?
For that matter, what's wrong with just having '=' for both
string and numeric comparisons? Several other languages do that
without any problem.
Irv
13. Re: Small feature request for future EU versions
- Posted by Patrick Barnes <mrtrick at gmail.com> Oct 17, 2004
- Last edited Oct 18, 2004
On Sun, 17 Oct 2004 04:46:14 -0700, irv mullins <guest at rapideuphoria.com>
wrote:
> string and numeric comparisons? Several other languages do that
> without any problem.
>
> Irv
The problem is this:
constant
atom1=5,
atom2=10,
seq1={1,5,2,4},
seq2={1,2,5,4}
constant cond1 = ( atom1 = atom2 )
constant cond2 = ( seq1 = seq2 )
--cond1 will be 0, as 5 does not equal 10.
--cond2 will be {1,0,0,1}, because Euphoria compares each element.
The problem is that the IF statement doesn't know what to do with a
sequence, and there's not an obvious solution.
As a quick hack, let's say that if the IF statement receives a
sequence, it treats it as true if every element is non-zero.
That would allow us to easily compare strings using the simple form:
if string1 = string2.
However, what if the sequence passed to the IF statement is empty?
What if the sequence contains multiple levels?
What if the sequence contains a mix of non-zero integers, and an empty sequence?
I think that these things aren't easily solved...
Maybe it should be extended partially.
1. Only atoms, and 1-dimensional sequences can be passed to the IF statement
2. If an atom, pass if non-zero.
3. If a sequence, pass if all elements are non-zero.
4. If an empty sequence is passed to the if statement, treat it as a
'zero', I suppose.
What do you think of this solution?
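The four rules above could be sketched as an ordinary function, just to show how they would behave. This is only an illustration: is_true() is a made-up name, not a built-in, and aborting on a nested sequence is an assumption about how rule 1 might be enforced.

```euphoria
-- Hypothetical helper, not a built-in: applies the four rules above
-- to decide whether a value counts as "true" for an if statement.
function is_true(object x)
    if atom(x) then
        return x != 0                -- rule 2: atoms pass if non-zero
    end if
    if length(x) = 0 then
        return 0                     -- rule 4: empty sequence counts as zero
    end if
    for i = 1 to length(x) do
        if sequence(x[i]) then
            -- rule 1: only flat sequences are allowed; aborting here is
            -- an assumption, a real interpreter would report an error
            puts(2, "is_true: nested sequence not allowed\n")
            abort(1)
        end if
        if x[i] = 0 then
            return 0                 -- rule 3: every element must be non-zero
        end if
    end for
    return 1
end function

if is_true("pete" = "pete") then     -- "pete" = "pete" is {1,1,1,1}
    puts(1, "strings match\n")
end if
if not is_true("pete" = "pita") then -- "pete" = "pita" is {1,0,1,0}
    puts(1, "strings differ\n")
end if
```

One thing the sketch does not solve: "pete" = "peter" still stops with a run-time error, because '=' requires both sequences to have the same length before the if statement ever sees a result.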
I really don't like the idea of adding extra relationship operators.
:=, =>, ==, etc. are really annoying to remember. Trust me, I've
written too much VHDL code...
At least with the above suggestion, there's no broken compatibility, a
common issue (Why can't I just use '=' to compare these strings?) is
fixed, and it's a logical solution.
--
MrTrick
14. Re: Small feature request for future EU versions
On Mon, 18 Oct 2004 09:52:19 +1000, Patrick Barnes <mrtrick at gmail.com>
wrote:
>The problem is that the IF statement doesn't know what to do with a
>sequence, and there's not an obvious solution.
Actually there is: the interpreter can easily determine when it needs
a boolean result, eg in an if statement, so it simply maps relational
operators to equal()/compare().
>Why can't I just use '=' to compare these strings?
Trust me, this can be done for any reasonable expression, eg
if name="pete" then
or
if forename="pete" and surname="lomax" then
but not
if {1,2,3}+{4,5,6} then
I'll give it a go if/when I get 2.5
Regards,
Pete
15. Re: Small feature request for future EU versions
Patrick Barnes wrote:
>
> On Sun, 17 Oct 2004 04:46:14 -0700, irv mullins <guest at rapideuphoria.com>
> wrote:
> > string and numeric comparisons? Several other languages do that
> > without any problem.
> >
> > Irv
>
> The problem is this:
>
> constant
> atom1=5,
> atom2=10,
> seq1={1,5,2,4},
> seq2={1,2,5,4}
>
> constant cond1 = ( atom1 = atom2 )
> constant cond2 = ( seq1 = seq2 )
>
> --cond1 will be 0, as 5 does not equal 10.
> --cond2 will be {1,0,0,1}, because Euphoria compares each element.
By any normal meaning of the word 'equal' as used by scientists, mathematicians,
programmers, and the corner grocer, the answer can not possibly
be {1,0,0,1}. It is either TRUE or FALSE. If the lengths of the two
sequences are different, then they are not equal, and the result should
be FALSE - not an error.
If, for some strange reason, someone wanted an item-by-item comparison
between two sequences, then a new and more meaningful name should be chosen
for a function which returns {1,0,0,1}.
> The problem is that the IF statement doesn't know what to do with a
> sequence, and there's not an obvious solution.
> As a quick hack, let's say that if the IF statement receives a
> sequence, it treats it as true if every element is non-zero.
>
> That would allow us to easily compare strings using the simple form:
> if string1 = string2.
> However, what if the sequence passed to the IF statement is empty?
> What if the sequence contains multiple levels?
> What if the sequence contains a mix of non-zero integers, and an empty
> sequence?
You're thinking within the box built by RDS.
No matter whether the sequences are empty, or contain multiple levels,
if the two are identical, then they are equal, otherwise they aren't.
That is the obvious definition of equal which anyone can understand.
> I think that these things aren't easily solved...
The whole thing could have been avoided if Rob had used the = operator
to return equality and another operator or function to return a comparison.
....
> At least with the above suggestion, there's no broken compatibility, a
> common issue (Why can't I just use '=' to compare these strings?) is
> fixed, and it's a logical solution.
I'll bet there aren't a dozen uses of = in the existing code base. I've
used it exactly once, and I really don't mind changing that instance to
some new function.
Irv
16. Re: Small feature request for future EU versions
Ricardo Forno wrote:
<snip>
> Moreover, please remember (or consider) that one of the most common pitfalls
> in the C language is to use = instead of ==, and in Pascal and other ones,
> to use = instead of :=.
<snip>
That really surprises me. I was thinking that different symbols for
different operations (comparison vs. assignment) would lead to clearer
code and fewer pitfalls -- compared to Euphoria's '=', the meaning of
which depends on the context.
Regards,
Juergen
17. Re: Small feature request for future EU versions
Juergen Luethje wrote:
> That really surprises me. I was thinking that different symbols for
> different operations (comparison vs. assignment) would lead to clearer
> code and fewer pitfalls -- compared to Euphoria's '=', the meaning of
> which depends on the context.
Consider the following C code:
if (A = B) {
    printf("A is equal to B");
}
Every C programmer will eventually have a 3 hour debugging
session (and probably on several different occasions), where he
finally realizes that the above code is actually doing:
A = B;
if (A != 0) {
    printf("A is equal to B");
}
Not only is the if-statement "wrong", but A is overwritten by B!
Regards,
Rob Craig
Rapid Deployment Software
http://www.RapidEuphoria.com
18. Re: Small feature request for future EU versions
Robert Craig wrote:
> Juergen Luethje wrote:
>> That really surprises me. I was thinking that different symbols for
>> different operations (comparison vs. assignment) would lead to clearer
>> code and fewer pitfalls -- compared to Euphoria's '=', the meaning of
>> which depends on the context.
>
> Consider the following C code:
>
> if (A = B) {
> printf("A is equal to B");
> }
>
> Every C programmer will eventually have a 3 hour debugging
> session (and probably on several different occasions), where he
> finally realizes that the above code is actually doing:
>
> A = B;
> if (A != 0) {
> printf("A is equal to B");
> }
>
> Not only is the if-statement "wrong", but A is overwritten by B!
Urgs, that is ugly. I see now. Thanks for the explanation!
Regards,
Juergen
19. Re: Small feature request for future EU versions
Derek Parnell wrote:
> If RDS were to change Euphoria to behave as most people expect it to,
> there would be some existing programs that would fail. But I suspect
> that there would be very, very few of those. The pain is worth the gain.
This is indeed a rehash of an old complaint. My recollection (although quite
possibly wrong) of Robert's response was:
1. His vision for Euphoria is a language where operators are simple and
consistent. Having '=' apply to each item in a sequence is simple and
consistent.
2. Part of Euphoria's value is backwards compatibility. Changing how '=' works
would break code in the library - something he's typically unwilling to do
unless there is a major gain for the language.
Given this, I'd be *very* surprised to find Robert's changed his position.
I suspect that one's personal expectation of the '=' operator depends on what
language you come from. If you come from a language such as BASIC, you
probably prefer a single true/false value from the comparison operator.
People coming from C would consider Euphoria's behavior to be normal - this
extends to other C/C++ derived languages, such as Java.
If you want my personal opinion, you can search the archives.
-- David Cuny
20. Re: Small feature request for future EU versions
----- Original Message -----
From: "Derek Parnell"
Sent: Sunday, October 17, 2004 9:36 PM
Subject: RE: Small feature request for future EU versions
>
[snip]
> It seems that you would like equality (and relationship comparisons)
> implemented as built-in functions (eg. equal(), compare() ) and
> sequence operations performed by operators ('=', '<', etc...)
>
> Whereas I'd prefer the reverse situation. I'd like relationship
> comparisons to use operators and sequence operations to use built-in
> functions.
>
> I'd prefer that ...
>
> cond1 = (seq1 = seq2)
>
> to be interpreted as ...
>
> if the contents of seq1 and the contents of seq2 are identical then
> assign 'true' to cond1 otherwise assign 'false' to cond1.
>
I agree with you here.
if (identical) then
TRUE
else
FALSE
end if
> If I really wanted a sequence operation to be performed as its result
> to be assigned I'd rather write something like ...
>
> cond1 = seqop_eq(seq1, seq2)
>
> --
> Derek Parnell
> Melbourne, Australia
>
21. Re: Small feature request for future EU versions
On Mon, 18 Oct 2004 21:18:27 -0400, Lucius Hilley
<l3euphoria at bellsouth.net> wrote:
> From: "Derek Parnell"
> Sent: Sunday, October 17, 2004 9:36 PM
> Subject: RE: Small feature request for future EU versions
> > It seems that you would like equality (and relationship comparisons)
> > implemented as built-in functions (eg. equal(), compare() ) and
> > sequence operations performed by operators ('=', '<', etc...)
> >
> > Whereas I'd prefer the reverse situation. I'd like relationship
> > comparisons to use operators and sequence operations to use built-in
> > functions.
> >
> > I'd prefer that ...
> >
> > cond1 = (seq1 = seq2)
> >
> > to be interpreted as ...
> >
> > if the contents of seq1 and the contents of seq2 are identical then
> > assign 'true' to cond1 otherwise assign 'false' to cond1.
> >
>
> I agree with you here.
> if (identical) then
> TRUE
> else
> FALSE
> end if
I agree, logically it should work like that.
I think that the behaviour of the 'if' construct should be changed, to
automatically use equal() to check if the phrase seq1 = seq2 is used
within an if statement.
But... ONLY the if statement.
--
MrTrick
22. Re: Small feature request for future EU versions
Hi Ricardo, you wrote:
> Hi, Juergen.
> Your thinking is logical, and I agree that using different operators for
> different purposes *should* be clearer.
This is the theory ...
> But, in my experience (and it seems
> some other people had the same kind of experience), == or := versus = leads
> to pitfalls.
... and that seems to be the practice.
I didn't know that because of lack of experience with C or Pascal.
However, the example that Rob posted in the meantime was impressive.
> I think this is due to the fact that all these operators contain the = sign.
> This problem rarely arose in APL, where the assignation symbol was a left
> arrow (such as <-, but a single character).
> Regards.
Interesting. Thank you!
Regards,
Juergen
23. Re: Small feature request for future EU versions
Just in case this topic hasn't been discussed to death yet...
Sequence operators (= != >= <= > < + - * /) are confusing at first, I'll agree.
However, I do not support changing their current behaviour from
returning a sequence comprised of the individual elements, operated
on. Apart from the backwards compatibility...
I have written many pieces of code that take an 'object' argument,
act on it in some way, and return it... without ever testing whether
the object is a sequence or an atom!
This is a tremendous benefit for Euphoria. Try to do the same thing in
C, and it's more than likely that you'll have to write an individual
function for every type of argument. And I don't mean just atomic or
array types... I mean char, short, int, uint, long, double, char[],
short[], int[], long[], double[], plus any other types that you are
interested in. And even then it can only handle 1-d arrays!
For example: (and I'm sure it's possible to do this more efficiently,
this is just off the top of my head)
function abs(object x)
return x * (1 - 2*( x < 0 ) )
end function
Elegant, and simple. Works for ANY object X...
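A quick way to see this for yourself is to feed the same function an atom, a flat sequence and a nested sequence. The test values below are arbitrary; the function is the one shown above, repeated so the snippet runs on its own.

```euphoria
-- abs() as posted above, repeated so this snippet is self-contained
function abs(object x)
    return x * (1 - 2*( x < 0 ) )
end function

? abs(-3)                  -- atom in, atom out: 3
? abs({-1, 2, -3})         -- flat sequence: {1,2,3}
? abs({-1, {2, -3}, 4})    -- nested sequence: {1,{2,3},4}
```

In the nested case, x < 0 gives {1,{0,1},0}, so the multiplier becomes {-1,{1,-1},1} and each element is flipped or kept in place without any explicit loop.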
I see this as a very important reason not to change the behaviour of
existing operators.
However, the problem still remains... I have object A here. and object
B there. I want to compare them, but inside of the language, not using
a function. That would suggest adding a new operator type, but what?
Given the many pitfalls of the '=' vs '==' in C, I would steer clear of
using '=='.
(Okay, here's the suggestion, pick it to pieces!)
How about something that is obviously intended for comparing sequences?
Applying braces around the operators that could be used for comparison
(not +, -, *, or /)
A {=} B
A {!=} B
replaces equal(A, B) and not equal(A, B)
A {>} B
A {<} B
A {>=} B
A {<=} B
replaces compare(A,B) and its tests.... I cannot remember how compare
works, ever. Each time I want to use compare(), I need to look it up
to see the behaviour... These operators are logical, and more
importantly intuitive.
The benefits:
It is a long term solution that works in any situation.... no 'if'
specialisations are required. I can use these within an equation to
test equality, without causing problems with sequence operators.
Usage:
Although nominally for $sequence $test $sequence, it would use the
same mechanisms internally as compare() and equal().
That means it can be used to compare
atom with sequence,
sequence with atom,
atom with atom,
sequence with sequence.
Obviously comparing an atom with a sequence will always return
false... That's ok, it can be used to compare sequence elements with
one another.
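For reference, here is a small sketch of what the built-ins return today in the atom/sequence cases, with the brace operator each call would presumably stand for noted in the comments. The {op} forms are only the proposal from this post and are not valid Euphoria; the values of A and B are arbitrary.

```euphoria
-- The {op} forms in the comments are the proposed syntax only;
-- the code itself uses the existing equal() and compare().
object A, B
A = 5
B = {5}

? equal(A, B)          -- A {=} B  : 0, an atom never equals a sequence
? compare(A, B)        -- -1: compare() ranks every atom below every sequence
? compare(A, B) < 0    -- A {<} B  : 1
? compare(A, B) >= 0   -- A {>=} B : 0

A = "abc"
B = "abd"
? compare(A, B) <= 0   -- A {<=} B : 1, since compare() gives -1 here
? compare(B, A) > 0    -- B {>} A  : 1
```

So if the ordering operators really did reuse compare(), an atom-versus-sequence test would be false for {=} but not for every operator, because compare() treats an atom as less than any sequence.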
The only confusion I can even think of, is that someone will
consistently use {=} to compare two atoms, rather than =.... This is
not even a real abuse, and is in my mind unlikely.
I welcome comment on this little suggestion of mine.
--
MrTrick
24. Re: Small feature request for future EU versions
Patrick Barnes wrote:
>
> Just in case this topic hasn't been discussed to death yet...
Impossible!
> Sequence operators (= != >= <= > < + - * /) are confusing at first, I'll
> agree.
>
> However, I do not support changing their current behaviour from
> returning a sequence comprised of the individual elements, operated
> on. Apart from the backwards compatibility...
>
> I have written many pieces of code that take an 'object' argument,
> act on it in some way, and return it... without ever testing whether
> the object is a sequence or an atom!
> This is a tremendous benefit for Euphoria. Try to do the same thing in
> C, and it's more than likely that you'll have to write an individual
> function for every type of argument. And I don't mean just atomic or
> array types... I mean char, short, int, uint, long, double, char[],
> short[], int[], long[], double[], plus any other types that you are
> interested in. And even then it can only handle 1-d arrays!
I agree that this is one of Euphoria's overriding strengths. It costs a bit
of runtime performance (dynamic-typing versus static-typing) but in the
long run it is wonderful.
But why *must* this functionality be implemented using operators rather
than built-in functions? In essence, it is a syntax issue and not a
semantic issue.
> For example: (and I'm sure it's possible to do this more efficiently,
> this is just off the top of my head)
>
> function abs(object x)
>     return x * (1 - 2*( x < 0 ) )
> end function
function abs(object x)
    return x * (1 - 2*( lessthan(x,0) ) )
end function
> Elegant, and simple. Works for ANY object X...
> I see this as a very important reason not to change the behaviour of
> existing operators.
Have you an estimate for how frequently the operators have actually been
used in this manner? I think there are two instances in the RDS libraries,
the case conversion routines (BTW which only work on a limited set of
characters) and ... actually I can't find the other example just now.
My position is that the functionality should remain in Euphoria, but as it
is rarely used it should be implemented using (unambiguous) built-in
functions, and the much more common comparison functionality should be
implemented using operators.
In the end, both are translated to IL for execution, so it's just a matter
of which syntax to use to represent the functionality.
However, if you really insist on operators for the sequence operations,
then as these are rarely used, a special syntax for those might be better.
function abs(object x)
return x * (1 - 2*( x {<} 0 ) )
end function
--
Derek Parnell
Melbourne, Australia
25. Re: Small feature request for future EU versions
On Thu, 21 Oct 2004 20:27:48 -0700, Derek Parnell
<guest at rapideuphoria.com> wrote:
> But why *must* this functionality be implemented using operators rather
> than built-in functions? In essence, it is a syntax issue and not a
> semantic issue.
Well, using functions rather than operators looks ungainly, and
requires more typing. In addition, function calls can be difficult to
comprehend quickly when you are reading through source:
if A {<=} B then
Vs:
if compare(A, B) <= 0 then
I don't even know if the two are functionally the same! I can't
remember whether compare is supposed to return a positive or a
negative value in which case!
> function abs(object x)
>     return x * (1 - 2*( lessthan(x,0) ) )
> end function
Yes, that could work... but there are more important considerations...
(see below)
> Have you an estimate for how frequently the operators have actually been
> used in this manner? I think there are two instances in the RDS libraries,
> the case conversion routines (BTW which only work on a limited set of
> characters) and ... actually I can't find the other example just now.
Don't forget that the RDS libraries are missing many basic functions
like the aforementioned abs... I'm reasonably sure this functionality
gets used more often in things like genfunc, etc...
My projects have used sequence operators whenever I see them as being
appropriate.
> My position is that the functionality should remain in Euphoria, but as it
> is rarely used it should be implemented using (unambiguous) built-in
> functions, and the much more common comparison functionality should be
> implemented using operators.
I think this would be a grave error. Initially, I thought "yeah,
that's a great idea, it makes more sense"... and it does. But... (see
below)
> However, if you really insist on operators for the sequence operations,
> then as these are rarely used, a special syntax for those might be better.
>
> function abs(object x)
>     return x * (1 - 2*( x {<} 0 ) )
> end function
Actually, I initially thought that too. After all, it makes as much
sense as the solution I proposed in the earlier thread, if not
more....
PROBLEM! Backwards compatibility. Not that we haven't heard it shouted before...
It's worse than regular backwards compatibility problems though...
Consider: Should it be changed, every program that uses sequence
operators will have a different behaviour.
eg:
X = a[i] + (b[j][m] > c[p][2])
However, there will be no obvious error message. Rather than the line
above executing as expected, with a[i] being added to by an array,
it'll be added to by an atom. Who knows what will happen?
Of course, nothing could happen... if b[j][m] and c[p][2] are atoms.
But are they? I have no idea, and neither will anyone searching
through source changing things over.
There's no way to detect whether this will affect code.
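Here is a small sketch of the silent change I mean. The values are arbitrary, and the third line only simulates the changed behaviour by calling compare() by hand, on the assumption that a redefined '>' would return a single truth value.

```euphoria
object r

-- Today: '>' works element by element, so a sequence is added
r = 10 + ({1, 5, 3} > {2, 2, 2})            -- {0,1,1} is added: {10,11,11}
? r

-- Today, with atoms: exactly the same source shape, but an atom result
r = 10 + (5 > 2)                            -- 11
? r

-- Simulated new behaviour: '>' collapses to one truth value
r = 10 + (compare({1, 5, 3}, {2, 2, 2}) = 1) -- compare() is -1, so 10 + 0
? r
```

The first and third lines stand for the same source line before and after the change. Neither would raise an error, yet one yields a sequence and the other an atom.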
That's why I suggest the new {=} apply to sequence comparison. After
all, it *is* logical for given values of 'logical', and it won't break
any existing code.
--
MrTrick
26. Re: Small feature request for future EU versions
Derek Parnell wrote:
<snip>
> Have you an estimate for how frequently the operators have actually been
> used in this manner? I think there are two instances in the RDS libraries,
> the case conversion routines (BTW which only work on a limited set of
> characters)
That's why they are useless for text written in many languages other
than English. That is strange for a product that is intended for
international use, especially because it is very easy to write better case
conversion routines.
Case conversion routines written without using the operators in the
manner mentioned above are also faster. That's why RDS themselves don't
use their own library routines for case conversion, when speed is
important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
<snip>
Regards,
Juergen
27. Re: Small feature request for future EU versions
Juergen Luethje wrote:
>
> Derek Parnell wrote:
>
> <snip>
>
> > Have you an estimate for how frequently the operators have actually been
> > used in this manner? I think there are two instances in the RDS libraries,
> > the case conversion routines (BTW which only work on a limited set of
> > characters)
>
> That's why they are useless for text written in many languages other
> than English. That is strange for a product that is intended for
> international use, especially because it is very easy to write better case
> conversion routines.
> Case conversion routines written without using the operators in the
> manner mentioned above are also faster. That's why RDS themselves don't
> use their own library routines for case conversion, when speed is
> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
I didn't think that would be the case, but I just tested a simple lookup
table approach to case conversion and it runs in 75% of the time that
lower() uses.
--
Derek Parnell
Melbourne, Australia
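The test code itself was not posted, so the following is only a guess at what a "simple lookup table approach" could look like. The names lower_table, init_lower_table and table_lower are hypothetical, and the sketch assumes a flat string of single-byte character codes (0..255). No timing claim is attached to it.

```euphoria
-- Hypothetical lookup-table lower(); assumes character codes 0..255.
sequence lower_table

procedure init_lower_table()
    -- Slot i holds the replacement for character code i-1
    -- (Euphoria sequences are indexed from 1).
    lower_table = repeat(0, 256)
    for i = 1 to 256 do
        lower_table[i] = i - 1
    end for
    for ch = 'A' to 'Z' do
        lower_table[ch + 1] = ch + ('a' - 'A')
    end for
end procedure

function table_lower(sequence s)
    object ch
    for i = 1 to length(s) do
        ch = s[i]
        -- Only plain byte codes are looked up; anything else is left alone.
        if integer(ch) then
            if ch >= 0 and ch <= 255 then
                s[i] = lower_table[ch + 1]
            end if
        end if
    end for
    return s
end function

init_lower_table()
puts(1, table_lower("Hello, World") & '\n')
```

Supporting further characters of a code page would then only mean filling more slots of the table, not changing the conversion loop.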
28. Re: Small feature request for future EU versions
Derek Parnell wrote:
> Juergen Luethje wrote:
<snip>
>> That's why they are useless for text written in many languages other
>> than English. That is strange for a product that is intended for
>> international use, especially because it is very easy to write better case
>> conversion routines.
>> Case conversion routines written without using the operators in the
>> manner mentioned above are also faster. That's why RDS themselves don't
>> use their own library routines for case conversion, when speed is
>> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
>
> I didn't think that would be the case, but I just tested a simple lookup
> table approach to case conversion and it runs in 75% of the time that
> lower() uses.
I personally use a modified version of RDS' fast_lower() (URL might wrap):
http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=9&fromYear=9&toMonth=9&toYear=9&postedBy=Juergen+Luethje&keywords=lower
Applied to the whole text in Euphoria/Doc/Library.doc, it runs in 50% of
the time that lower() uses.
Regards,
Juergen
29. Re: Small feature request for future EU versions
----- Original Message -----
From: "Juergen Luethje"
Sent: Friday, October 22, 2004 2:48 AM
Subject: Re: Small feature request for future EU versions
>
> <snip>
>
> Case conversion routines written without using the operators in the
> manner mentioned above are also faster. That's why RDS themselves don't
> use their own library routines for case conversion, when speed is
> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
>
> <snip>
>
> Regards,
> Juergen
Those routines will do case conversion at any depth.
Anything else would have to be either recursive or specially designed for
each task.
unkmar
30. Re: Small feature request for future EU versions
Lucius Hilley wrote:
> ----- Original Message -----
> From: "Juergen Luethje"
>
>> <snip>
>>
>> Case conversion routines written without using the operators in the
>> manner mentioned above are also faster. That's why RDS themselves don't
>> use their own library routines for case conversion, when speed is
>> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
>>
>> <snip>
>
> Those routines will do case conversion at any depth.
> Anything else would have to be either recursive or specially designed for
> each task.
My modified version of RDS' fast_lower() function (used in 'guru.ex' and
'search.ex') also does case conversion at any depth. Yes, it is recursive.
As I wrote, my function (which can also handle special characters such
as the German umlauts) runs in 50% of the time that lower() uses, when
applied to the whole text of Euphoria/Doc/Library.doc.
Do you think my recursive function will be slower than the lower()
library function, when applied to deeply nested objects? I don't
know. I think the library function also will have to do some recursion
internally. And do we apply lower() and upper() more often to deeply
nested objects, or to plain text strings?
Regards,
Juergen
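For illustration, a recursive conversion of the kind discussed above might look roughly like this. It is not Juergen's modified fast_lower(), which is not shown in the thread: deep_lower and the two constant tables are names invented here, and the umlaut codes assume ISO 8859-1 / Windows-1252.

```euphoria
-- Hypothetical recursive lower(); not the function from the thread.
-- Codes below assume ISO 8859-1 / Windows-1252.
constant UPPER_EXTRA = {#C4, #D6, #DC}   -- A-umlaut, O-umlaut, U-umlaut
constant LOWER_EXTRA = {#E4, #F6, #FC}   -- their lower-case forms

function deep_lower(object x)
    integer n
    if atom(x) then
        if x >= 'A' and x <= 'Z' then
            return x + ('a' - 'A')
        end if
        n = find(x, UPPER_EXTRA)
        if n then
            return LOWER_EXTRA[n]
        end if
        return x                        -- anything else stays unchanged
    end if
    for i = 1 to length(x) do
        x[i] = deep_lower(x[i])         -- recurse into nested sequences
    end for
    return x
end function

? deep_lower({"ABC", {"Xy", #C4}, 'A'})
```

The recursion depth follows the nesting depth of the object, so a plain text string costs one call per character plus one for the string itself.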
31. Re: Small feature request for future EU versions
On 22 Oct 2004, at 14:12, Juergen Luethje wrote:
>
>
> Derek Parnell wrote:
>
> > Juergen Luethje wrote:
>
> <snip>
>
> >> That's why they are useless for text written in many languages other
> >> than English. That is strange for a product that is intended for
> >> international use, especially because it is very easy to write better case
> >> conversion routines. Case conversion routines written without using the
> >> operators in the manner mentioned above are also faster. That's why RDS
> >> themselves don't use their own library routines for case conversion, when
> >> speed is important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
> >
> > I didn't think that would be the case, but I just tested a simple lookup
> > table approach to case conversion and it runs in 75% of the time that
> > lower() uses.
>
> I use personally a modified version of RDS' fast_lower() (URL might wrap):
>
> http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=9&fromYear=9&toMonth=9&toYear=9&postedBy=Juergen+Luethje&keywords=lower
>
> Applied to the whole text in Euphoria/Doc/Library.doc, it runs in 50% of
> the time that lower() uses.
Just out of curiosity, did you compare to/against Jiri's lib in
http://www.rapideuphoria.com/nlseu.zip
? I did something similar in mirc ages ago, but naturally it ran
v e r y   s l o w l y there.
Kat
32. Re: Small feature request for future EU versions
Kat wrote:
> On 22 Oct 2004, at 14:12, Juergen Luethje wrote:
>
>> Derek Parnell wrote:
>>
>>> Juergen Luethje wrote:
>>
>> <snip>
>>
>>>> That's why they are useless for text written in many languages other
>>>> than English. That is strange for a product that is intended for
>>>> international use, especially because it is very easy to write better case
>>>> conversion routines. Case conversion routines written without using the
>>>> operators in the manner mentioned above are also faster. That's why RDS
>>>> themselves don't use their own library routines for case conversion, when
>>>> speed is important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
>>>
>>> I didn't think that would be the case, but I just tested a simple lookup
>>> table approach to case conversion and it runs in 75% of the time that
>>> lower() uses.
>>
>> I use personally a modified version of RDS' fast_lower() (URL might wrap):
>>
>> http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=9&fromYear=9&toMonth=9&toYear=9&postedBy=Juergen+Luethje&keywords=lower
>>
>> Applied to the whole text in Euphoria/Doc/Library.doc, it runs in 50% of
>> the time that lower() uses.
>
> Just out of curiosity, did you compare to/against Jiri's lib in
> http://www.rapideuphoria.com/nlseu.zip
> ? I did something similar in mirc ages ago, but naturally it ran
> v e r y   s l o w l y there.
I hadn't compared it to that lib, because I wasn't aware of that library.
Now I downloaded 'nlseu.zip' and compared it:
Applied to the whole text of Euphoria/Doc/Library.doc, the nlsLower()
function in 'nlseu.zip' takes 310% of the time that the lower() function
in 'wildcard.e' uses. Furthermore, nlsLower() is only for Windows.
Regards,
Juergen
33. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru>
Oct 22, 2004
-
Last edited Oct 23, 2004
Hi, Juergen!
You wrote:
[snip]
> I hadn't compared it to that lib, because I wasn't aware of that library.
> Now I downloaded 'nlseu.zip' and compared it:
> Applied to the whole text of Euphoria/Doc/Library.doc, the nlsLower()
> function in 'nlseu.zip' takes 310% of the time that the lower() function
> in 'wildcard.e' uses. Furthermore, nlsLower() is only for Windows.
There is wildcarr.e in my ru_eu_9_.zip package.
It has the additional functions with the bilingual (English/Russian) names.
English names of those functions are case_la() - for Latin alphabet,
and case_ru() - for Russian alphabet in 5 different encodings.
I did not test the speed of those functions, they just work for me and I do
not care.
Try please, any alphabet may be supported that Russian way, I think.
Regards,
Igor Kachan
kinz at peterlink.ru
34. Re: Small feature request for future EU versions
On Fri, 22 Oct 2004 21:43:00 +0200, Juergen Luethje <j.lue at gmx.de>
wrote:
>Now I downloaded 'nlseu.zip' and compared it:
>Applied to the whole text of Euphoria/Doc/Library.doc, the nlsLower()
>function in 'nlseu.zip' takes 310% of the time that the lower() function
>in 'wildcard.e' uses.
Probably
>Furthermore, nlsLower() is only for Windows.
True[1]
It will convert to lower case not only the usual A-Z, not only the few
characters in #80..#FF, but also, potentially, if modified to use
CharLowerW instead of CharLowerA, unicode.
(Much) Slower, yes.
However, I wanted to point out that it is fundamentally better, at
least for some purposes.
Regards,
Pete
[1] no doubt there is a similar Linux system call.
35. Re: Small feature request for future EU versions
Hi, Juergen, again!
>
> Hi, Juergen!
>
> You wrote:
>
> [snip]
>
> > I hadn't compared it to that lib, because I wasn't aware of that library.
> > Now I downloaded 'nlseu.zip' and compared it:
> > Applied to the whole text of Euphoria/Doc/Library.doc, the nlsLower()
> > function in 'nlseu.zip' takes 310% of the time that the lower() function
> > in 'wildcard.e' uses. Furthermore, nlsLower() is only for Windows.
>
> There is wildcarr.e in my ru_eu_9_.zip package.
> It has the additional functions with the bilingual (English/Russian) names.
>
> English names of those functions are case_la() - for Latin alphabet,
> and case_ru() - for Russian alphabet in 5 different encodings.
>
> I did not test the speed of those functions, they just work for me and I do
> not care.
>
> Try please, any alphabet may be supported that Russian way, I think.
Oops... Forgot to say.
If you want these Russian libraries to be compatible with
the standard Euphoria, run the command :
ex_r.exe translat.ex
Then you'll have the complete set of these libs with
the .ez extension.
The wildcarr.ez and other such libs support the Eu2C translator and the binder.
To get a Russian program translated to Latin, use the Esc t command of
the red.ex editor. This way pure Russian red.ex was compiled with
pure English Open Watcom 1.1 and bound with pure
English CE Euphoria v.2.4.
This way *any pure Russian program* runs on any *custom* Euphoria, which
supports the standard Euphoria code.
Just move all .ez into separate dir and rename them as .e to get
this effect.
All that just now, naturally, and free.
Good Luck again!
Regards,
Igor Kachan
kinz at peterlink.ru
36. Re: Small feature request for future EU versions
Igor Kachan wrote:
> Hi, Juergen, again!
>
>> Hi, Juergen!
<snip>
>> There is wildcarr.e in my ru_eu_9_.zip package.
>> It has the additional functions with the bilingual (English/Russian) names.
>>
>> English names of those functions are case_la() - for Latin alphabet,
>> and case_ru() - for Russian alphabet in 5 different encodings.
>>
>> I did not test the speed of those functions, they just work for me and I do
>> not care.
>>
>> Try please, any alphabet may be supported that Russian way, I think.
>
> Oops... Forgot to say.
>
> If you want these Russian libraries to be compatible with
> the standard Euphoria, run the command :
>
> ex_r.exe translat.ex
>
> Then you'll have the complete set of these libs with
> the .ez extention.
>
> The wildcarr.ez and others such libs support translator Eu2C and binder.
>
> To get Russian program translated to Latin, use Esc t command of
> the red.ex editor. This way pure Russian red.ex was compiled with
> pure English Open Watcom 1.1 and binded with pure
> English CE Euphoria v.2.4.
>
> This way *any pure Russian program* runs on any *custom* Euphoria, which
> supports the standard Euphoria code.
> Just move all .ez into separate dir and rename them as .e to get
> this effect.
My test program ...
include wildcarr.e
constant CP = {"dos","win","koi","iso","mac"}
sequence s
s = "heLLo"
for i = 1 to length(CP) do
    printf(1, "'%s'\n", {case_ru(0,s,CP[i])})
end for
... still just prints 5 times "heLLo" (German Windows 98).
<snip>
Regards,
Juergen
37. Re: Small feature request for future EU versions
Hi,
Juergen Luethje wrote:
[snip]
>>> English names of those functions are case_la() - for Latin alphabet,
>>> and case_ru() - for Russian alphabet in 5 different encodings.
[snip]
> My test program ...
>
> include wildcarr.e
> constant CP = {"dos","win","koi","iso","mac"}
> sequence s
> s = "heLLo"
> for i = 1 to length(CP) do
>     printf(1, "'%s'\n", {case_ru(0,s,CP[i])})
> end for
>
> ... still just prints 5 times "heLLo" (German Windows 98).
>
> <snip>
There is case_la() for Latin alphabet, not case_ru(). See please on top
once more.
Your "heLLo" is in Latin alphabet.
I think, you can make case_gr() for German alphabet, someone - case_fr()
for France,
and so on. case_ru() is Russian Cyrillic.
There are no any "dos","win","koi","iso","mac" for pure Latin.
Pure Latin is just ASCII, A..Z, a..z.
Do you see now?
Regards,
Igor Kachan
kinz at peterlink.ru
38. Re: Small feature request for future EU versions
Igor Kachan wrote:
> Juergen Luethje wrote:
>
> [snip]
>>>> English names of those functions are case_la() - for Latin alphabet,
>>>> and case_ru() - for Russian alphabet in 5 different encodings.
> [snip]
>
>> My test program ...
>>
>> include wildcarr.e
>> constant CP = {"dos","win","koi","iso","mac"}
>> sequence s
>> s = "heLLo"
>> for i = 1 to length(CP) do
>>     printf(1, "'%s'\n", {case_ru(0,s,CP[i])})
>> end for
>>
>> ... still just prints 5 times "heLLo" (German Windows 98).
>>
>> <snip>
>
> There is case_la() for Latin alphabet, not case_ru(). See please on top
> once more.
> Your "heLLo" is in Latin alphabet.
case_la() is just Euphoria's standard lower() and upper() combined in 1
function. I have that already, and as has been discussed here, this
doesn't handle special characters such as the German umlauts. So it is
useless for me.
> I think, you can make case_gr() for German alphabet,
As I wrote at the beginning of this thread, I already *had* made a
function that is able to handle German characters.
I thought you would provide me another function. Now I downloaded all
that Russian stuff just so that you tell me, I shall write my own
function???
> someone - case_fr()
> for France, and so on. case_ru() is Russian Cyrillic.
> There are no any "dos","win","koi","iso","mac" for pure Latin.
> Pure Latin is just ASCII, A..Z, a..z.
> Do you see now?
Regards,
Juergen
39. Re: Small feature request for future EU versions
Juergen Luethje wrote:
[snip]
> case_la() is just Euphoria's standard lower() and upper() combined in 1
> function. I have that already, and as has been discussed here, this
> doesn't handle special characters such as the German umlauts. So it is
> useless for me.
OK, but it is useful for me to handle *all* languages with pure
Latin alphabet.
Maybe, it is useful for someone else as an example of "delete double"
method.
There are many different people with different programming skills here.
Some of them just do not know that lower() and upper() handle pure
Latin only.
This list is for the free technical support of any user of Euphoria
programming language, for PD Euphoria users and for CE Euphoria users,
as it is stated on the Euphoria home page.
Well, we have done our job for pure Latin - there are enough
available functions, I think.
Let us go to the original national alphabets.
> > I think, you can make case_gr() for German alphabet,
>
> As I wrote at the beginning of this thread, I already *had* made a
> function that is able to handle German characters.
Very well, I do understand that.
Sorry, I do not know your function yet, but I am *sure* it handles German
characters correctly -- you are German yourself.
But if German language has different code pages for DOS (OEM) and for
Windows (ANSI), your function has to handle *both* code pages correctly.
I can not create this function for you - I just do not know German,
and the standard code pages are more or less international
- for not single native language.
Say, 1251 Windows CP, supports all Cyrillic languages - Russian,
Ukrainian, Belorussian, Tatar and so on, but every one of them
has its own alphabet(!).
And functions must handle *every* native alphabet of a given code page.
And many people just do not know, that they must handle their
alphabets for DOS and for Windows differently, not saying
about Linux and Mac.
> I thought you would provide me another function. Now I downloaded all
> that Russian stuff just so that you tell me, I shall write my own
> function???
Why you *shall*, if you have your function *already done* for one
or for all German code pages?
I just think that you can revise your function now with help from that
Russian stuff and correct your function for different German code pages
as it is already done for Russian.
RDS doesn't know German nor Russian well enough to provide the
case_ge() Euphoria function for you and the case_ru() function for me.
So, me wrote, remember please - try please, any alphabet may be
supported that Russian *way*, I think.
Me did not provide the universal function for any alphabet,
but a *way* to make the concrete function for concrete alphabet.
If there is the case_ru() function for all code pages in Russian,
why not to have case_ge() for German, case_fr() for French, case_gr()
for Greek and so on?
Without any confusing with usage of some universal function?
The universal function is possible, but it requires
dozens and dozens of the "if end if" statements and it requires
the knowledge of not only code pages, but plus national alphabets,
first of all to be created.
These functions are simple enough if you make them for your
native languages, but they are more or less buggy, if you use
some universal stuff.
For example, my case_ru() doesn't handle Russian letters
E and e with two dots above for now, just because of some
issues with the reference stuff.
And I can not discuss these issues with most of, say, Germans.
Who cares? RDS? To do some custom job for their free PD production?
Good will and volunteers are needed here.
This is the only way to support the excellent PD thing.
Not Shareware, but Public Domain thing without expiry
time of evaluation or such.
> > someone - case_fr()
> > for France, and so on. case_ru() is Russian Cyrillic.
> > There are no any "dos","win","koi","iso","mac" for pure Latin.
> > Pure Latin is just ASCII, A..Z, a..z.
> > Do you see now?
So, we have many alternatives just now - to use existing system functions
of Windows and Linux, and to use the native Euphoria functions -- yours
one for German and my case_la(), case_ru() for Latin and for 6 different
Russians, including specialized for Euphoria very rare Latinic Russian.
As I can suppose, RDS itself potentially may provide something native for
English, Japanese and Esperanto - but it seems to me, there is Latin
alphabet only for that task, lower() and upper() already exist. Done!
Once more - good will and volunteers are needed,
if you want the native Euphoria functions for this task.
The task is not very difficult if you do know your native alphabet,
be sure, dear End Users.
Good Luck!
Regards,
Igor Kachan
kinz at peterlink.ru
40. Re: Small feature request for future EU versions
Hi, Dear EU users!
Me wrote:
[snip]
> Me did not provide the universal function for any alphabet,
> but a *way* to make the concrete function for concrete alphabet.
[snip]
> The task is not very difficult if you do know your native alphabet,
> be sure, dear End Users.
Let us try to use that way to make the concrete (i.e. specific, only
this one, not any other) function for concrete well known alphabet
-- pure Latin -- alphabet_la, just for example.
sequence alphabet_la
alphabet_la="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
global function case_LA(integer c, object x)
-- case_LA -- to not confuse with the existing case_la() function
-- convert Latin atom or sequence to upper or to lower case
    integer n
    sequence A, a
    if c then
        A=alphabet_la[1 ..26]
        a=alphabet_la[27..52]
    else
        a=alphabet_la[1 ..26]
        A=alphabet_la[27..52]
    end if
    n = 0
    if atom(x) then
        n=find(x,a)
        if n then
            return A[n]
        else
            return x -- not a letter of this alphabet: return it unchanged
        end if
    else
        for i=1 to length(x) do
            n=find(x[i],a)
            if n then
                x[i] = A[n]
            end if
        end for
    end if
    return x
end function -- case_LA()
puts(1, alphabet_la & '\n')
puts(1, case_LA(1, alphabet_la) & '\n')
puts(1, case_LA(0, alphabet_la) & '\n')
Just replace the alphabet_la sequence with your native alphabet
and you will have concrete function for your native language
and for your current code page.
If you see your alphabet correctly in your editor, all right.
Do not forget, on DOS and Windows may be different results.
Say,
sequence alphabet_mj
alphabet_mj = "......place here The Great Mumbo Jumbo Alphabet....."
global function case_mj(integer c, object x)
And so on.
Do not forget:
A=alphabet_mj[1 .. not 26, but needed number]
a=alphabet_mj[not 27, but needed number .. not 52, but needed number]
-- Just first and second half.
... is easy peasy, NO? -- by Pete Lomax
But I just now found one old bug in my case_ru() function.
Thanks for your questions!
Regards,
Igor Kachan
kinz at peterlink.ru
41. Re: Small feature request for future EU versions
Ricardo Forno wrote:
[snip]
> Hi, Igor.
> Your function is slow compared to other approaches.
> Since in Euphoria (and also in C and other languages) a character is just a
> number, may I suggest a change?
> I think it would be much faster having two 256-character sequences, one
> representing the "translation" of the other (characters in the same
> position), and using the input characters as indexes.
> This way, you will get not only a way to translate between upper and lower
> case, but also (changing the sequences) to translate ASCII to EBCDIC or
> vice-versa, or any translation character by character you want.
> Regards.
Hi, Ricardo!
Yes, my case_LA() function is twice as slow as the standard upper() and
lower().
I tested them both just now.
I like to speed things up, but I do not have this problem with conversion or
translation. I translate all bilingual EU libs into Latinic Russian in a
small fraction of a second and never wait for the results.
Esc t or Esc e in red.ex and that final beep sounds.
I like your suggestion, but I just do not see the solution how to make
this way the stable template for *any* alphabet now.
But the case_LA() function is a template for any possible alphabet and any
code page just now, as far as I can see.
Some alphabets have no case at all, some alphabets have different
numbers of upper and lower letters.
For example, computer Russian has 3 extra letters in upper case, which
are absent in Russian canonical grammar.
But speed is good thing, yes, Ricardo.
Try, maybe you can make such a fast & simple template.
For now, I can not imagine another possible function.
Regards,
Igor Kachan
kinz at peterlink.ru
42. Re: Small feature request for future EU versions
What I think he means is this:
constant LA_up = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
constant LA_lo = "abcdefghijklmnopqrstuvwxyz"
constant LA_diff = LA_up - LA_lo
global function case_LA(integer c, object x)
    integer n
    if atom(x) then
        if c then
            n = find( x, LA_lo )
            if n then
                x += LA_diff[n]
            end if
        else
            n = find( x, LA_up )
            if n then
                x -= LA_diff[n]
            end if
        end if
    else
        for i = 1 to length(x) do
            x[i] = case_LA(c, x[i])
        end for
    end if
    return x
end function
> I like your suggestion, but I just do not see the solution how to make
> this way the stable template for *any* alphabet now.
Works for any alphabet and code page, and is faster, because it
doesn't have to keep slicing the alphabet sequences.
> Some alphabets have no case at all, some alphabets have different
> numbers of upper and lower letters.
> For example, computer Russian has 3 extra letters in upper case, which
> are absent in Russian canonical grammar.
The limitation of the above function is that LA_up and LA_lo must be
the same length... what do you mean by 3 extra letters? What if you
try to convert them to lower case? If it should just leave them as
upper case, that's fine - just leave them out of the function.
***The above function is completely untested. It should not be used in
nuclear reactors, medical life support systems, or anywhere where
failure may cause injury***
--
MrTrick
43. Re: Small feature request for future EU versions
Patrick Barnes wrote:
>
> What I think he means is this:
>
> constant LA_up= "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
> constant LA_lo = "abcdefghijklmnopqrstuvwxyz"
> constant LA_diff = LA_up - LA_lo
>
> global function case_LA(integer c, object x)
> integer n
> if atom(x) then
> if c then
> n = find( x, LA_lo )
> if n then
> x += LA_diff[n]
OR instead ...
x = LA_up[n]
[snip]
I have a generic case conversion that can work for most alphabets.
I'll submit it to the contributions page.
[snip]
> > Some alphabets have no case at all, some alphabets have different
> > numbers of upper and lower letters.
> > For example, computer Russian has 3 extra letters in upper case, which
> > are absent in Russian canonical grammar.
I believe German and some other languages have a situation where a
single lower-case letter gets changed to two characters when
converted to uppercase - the German s-sharp character 'ß' changes
to 'SS' when converted.
--
Derek Parnell
Melbourne, Australia
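To make the one-to-two case concrete, here is a minimal sketch. It assumes a flat string of integer character codes and ISO 8859-1, where the s-sharp has code #DF; upper_expanding and SHARP_S are names made up for this example and come from no library mentioned in the thread.

```euphoria
-- Hypothetical upper() for a flat string in which one character may expand.
-- Assumes ISO 8859-1, where the German s-sharp is code #DF.
constant SHARP_S = #DF

function upper_expanding(sequence s)
    sequence result
    integer ch
    result = ""
    for i = 1 to length(s) do
        ch = s[i]
        if ch = SHARP_S then
            result &= "SS"                  -- one character becomes two
        elsif ch >= 'a' and ch <= 'z' then
            result &= ch - ('a' - 'A')
        else
            result &= ch
        end if
    end for
    return result
end function

puts(1, upper_expanding("stra" & SHARP_S & "e") & '\n')
```

Because the result can be longer than the input, this kind of mapping cannot be done by replacing elements in place, which is what the table- and find()-based functions in this thread do.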
44. Re: Small feature request for future EU versions
Patrick Barnes wrote:
[snip]
> > I like your suggestion, but I just do not see the solution how to make
> > this way the stable template for *any* alphabet now.
>
> Works for any alphabet and code page, and is faster, because it
> doesn't have to keep slicing the alphabet sequences.
Ok, I see "...for any alphabet...".
But my_case_LA() makes just single slicing on the alphabet_la sequence,
and yours_case_LA() has 3 supporting sequences.
Can you rename yours_case_LA() as case_La() or such? Just for short, to not
confuse?
> > Some alphabets have no case at all, some alphabets have different
> > numbers of upper and lower letters.
> > For example, computer Russian has 3 extra letters in upper case, which
> > are absent in Russian canonical grammar.
>
> The limitation of the above function is that LA_up and LA_lo must be
> the same length... what do you mean by 3 extra letters? What if you
> try to convert them to lower case? If it should just leave them as
> upper case, that's fine - just leave them out of the function.
Ok, I see "The limitation..." .
What about "...for any alphabet..."?
Well, there are 3 lower-case letters in Russian, which can not stand
on the first place in a word.
There are no such words in Russian at all.
So, you can not find them in any normal Russian text.
But some more or less artificial computer texts, like "ruSSIan" or "heLLo"
can include those additional upper-case letters.
And I can just include those letters into the alphabet_ru sequence
or exclude them and get the functions for canonical and for artificial
Russian languages.
Ok, Derek submitted his new library to Rob for these things.
I think, there is enough such a stuff in the RDS archives
now to not force Rob to learn our own crazy mumbos_jumbos.
Regards,
Igor Kachan
kinz at peterlink.ru
45. Re: Small feature request for future EU versions
Derek Parnell wrote:
[snip]
[snip]
> I have a generic case conversion that can work for most alphabets.
> I'll submit it to the contributions page.
[snip]
[snip]
I have downloaded your library, it works for me, thanks.
But it seems to me, your function can not make the *selective*
conversion of bilingual texts of the same code page,
but of different alphabets.
Say, I can run my stuff the following way:
text = case_la(Lo, text) -- to get all pure Latin letters lower-case
text = case_ru(Lo, text, "win") -- to get all common Russian/Ukrainian letters lower-case
text = case_ua(Up, text, "win") -- to get all pure Ukrainian letters upper-case
Couldn't you?
Sorry, good appetite, yes.
Regards,
Igor Kachan
kinz at peterlink.ru
46. Re: Small feature request for future EU versions
Igor Kachan wrote:
>
> Derek Parnell wrote:
>
> [snip]
> [snip]
>
> > I have a generic case conversion that can work for most alphabets.
> > I'll submit it to the contributions page.
>
> [snip]
> [snip]
>
> I have downloaded your library, it works for me, thanks.
You are welcome.
> But it seems to me, your function can not make the *selective*
> conversion of bilingual texts of the same code page,
> but of different alphabets.
I'm sorry but I don't understand what you are saying. What does
"*selective* conversion" mean?
I think by "bilingual texts of the same code page, but of different
alphabets." you mean some text in which there is a mixture of characters
from different alphabets, but each character is still from the same code
page.
I believe my functions can handle that. Something like
"Outside window = fenêtre extérieur: garçon = boy"
should come out from Obj_upper() as
"OUTSIDE WINDOW = FENÊTRE EXTÉRIEUR: GARÇON = BOY"
So long as each character has a unique code point in the code page,
regardless of its language, my functions can help.
> Say, I can run my stuff the following way:
>
> text = case_la(Lo, text) -- to get all pure Latin letters lower-case
> text = case_ru(Lo, text, "win") -- to get all common Russian/Ukrainian letters lower-case
> text = case_ua(Up, text, "win") -- to get all pure Ukrainian letters upper-case
>
> Couldn't you?
No, because I don't know those alphabets. However, you could using my
functions. Use the SetCase procedure to define the mappings for those
alphabets. Something like ...
SetCase( "абвгд...эюя",
"АБВГД . . . ЭЮЯ", -1)
and then use the Windows Cyrillic code page to get the correct display
glyphs.
--
Derek Parnell
Melbourne, Australia
47. Re: Small feature request for future EU versions
Derek Parnell wrote:
[snip]
> > I have downloaded your library, it works for me, thanks.
> You are welcome.
Thanks.
> > But it seems to me, your function can not make the *selective*
> > conversion of bilingual texts of the same code page,
> > but of different alphabets.
> I'm sorry but I don't understand what you are saying. What does
> "*selective* conversion" mean?
> I think by "bilingual texts of the same code page, but of different
> alphabets." you mean some text in which there is a mixture of characters
> from different alphabets, but each character is still from the same code
> page.
Yes, you are right.
To be more specific and clear, let us take the concrete example,
how my case_ru() function works.
It gets 3 parameters, say:
text = case_ru(1, text, "win")
1 - upper, 0 - lower.
text - sequence with the text I want to process.
"win" - one of 5 options - "dos", "iso", "koi", "mac".
It means the code page of the text sequence.
"win" stands for Windows 1251 common Cyrillic code page.
User has to know what is the text's code page.
It is not a current machine code page, but a concrete
text's code page.
So, this function only processes Russian alphabet on
common Cyrillic code page and doesn't affect the specific
letters, say, Ukrainian, in bilingual Russian/Ukrainian texts.
Same for "dos", "iso", "koi", "mac" Cyrillic code pages.
This way I can use my case_ru() function on any Euphoria platform
to process Russian texts of any other Euphoria platform.
It is really generic for Euphoria platforms, can process
any given Russian text and doesn't depend on current platform.
So, the case_ua() function, if someone wants to have it,
may use the full Ukrainian alphabet with all common
Russian/Ukrainian letters, or just a few specific Ukrainian
letters to process only these letters in a bilingual
(Russian/Ukrainian) Cyrillic texts on any Euphoria platform.
> I believe my functions can handle that. Something like
> "Outside window = fenêtre extérieur: garçon = boy"
> should come out from Obj_upper() as
> "OUTSIDE WINDOW = FENÊTRE EXTÉRIEUR: GARÇON = BOY"
> So long as each character has a unique code point in the code page,
> regardless of its language, my functions can help.
Are you saying your function processes all the alphabets of a
given code page at once? Yes, as far as I can see.
If so, it cannot process Win_Western texts selectively
by default and requires some *additional* work from the local
programmer.
The same goes for my case_mj() function, except that it is selective
by default, if you want to prepare and use it that way.
The only productive way to make these functions really
useful is to get the local programmers to write functions
for their native alphabets, code pages and languages, I think.
> > Say, I can run my stuff the following way:
> >
> > }}}
<eucode>
> > text = case_la(Lo, text) -- to get all pure Latin letters lower-case
> > text = case_ru(Lo, text, "win") -- to get all common Russian/Ukrainian letters lower-case
> > text = case_ua(Up, text, "win") -- to get all pure Ukrainian letters upper-case
> > </eucode>
{{{
> >
> > Couldn't you?
> No, because I don't know those alphabets. However, you could using my
> functions. Use the SetCase procedure to define the mappings for those
> alphabets. Something like ...
> SetCase( "абвгд...эюя",
> "АБВГД . . . ЭЮЯ", -1)
> and then use the Windows Cyrillic code page to get the correct display
> glyphs.
OK, I think I understand correctly.
If a user of your function wants to process his text selectively
and not affect the letters of some second possible language of the given
code page, he/she has to make separate tables for the SetCase()
function and call it twice:
first for the first language, then, after the first pass, for the second one.
Right?
But what are the strange hex codes/Unicode values in your example above?
It seems to me that the SetCase() function doesn't handle these codes yet,
and that they are something reserved for the future.
Regards,
Igor Kachan
kinz at peterlink.ru
48. Re: Small feature request for future EU versions
On Wed, 27 Oct 2004 00:56:24 -0300, Ricardo Forno <rforno at uyuyuy.com> wrote:
> Well... not exactly this.
> find() is a bit slow.
> What I suggest is having a 256-character sequence from which you take the
> character corresponding to the one you want to translate, used as an
> index.
> For example, you know that 'A' is equal to 65 (its ASCII value). Assume then
> that sequence X contains an 'a' in position 66, a 'b' in position 67, and so
> on. So, to translate sequence Z from upper to lower case, you will code:
>
> for i = 1 to length(Z) do
> Z[i] = X[Z[i]+1]
> end for
Like this:
constant UPPER = 1
constant LOWER = 2
constant CODE_PAGE_SIZE = 255 -- maximum value of a character
-- initial setup
sequence to_uppercase, to_lowercase
to_uppercase = repeat( 0, CODE_PAGE_SIZE )
to_lowercase = repeat( 0, CODE_PAGE_SIZE )
constant alphabet = { {'A', 'a'}, {'B', 'b'} } -- ...etc, for entire alphabet
-- populate translation tables with data
for i = 1 to length(alphabet) do
    to_uppercase[ alphabet[i][LOWER] ] = alphabet[i][UPPER]
    to_lowercase[ alphabet[i][UPPER] ] = alphabet[i][LOWER]
end for
-- the actual function
-- (note: "case" is a reserved word in Euphoria 4.0 and later; rename it there)
function change_case( integer case, object z )
    integer c
    if sequence(z) then
        for i = 1 to length(z) do
            z[i] = change_case( case, z[i] )
        end for
        return z -- the elements are already converted, nothing left to look up
--  elsif not z then -- don't do any transform on null chars
--      return z
    elsif case = LOWER then
        c = to_lowercase[z]
    else -- assume case = UPPER
        c = to_uppercase[z]
    end if
    if c then
        return c
    else
        return z
    end if
end function
And there you go. The only way to make it faster, I think, is to use a
non-recursive algorithm and maybe restrict it to 1d arrays only.
The method's only assumption is that no zero-value characters exist
(the null char?). To check for this, just uncomment the checking
code... might make it run a little slower.
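The non-recursive, 1-d-only variant mentioned above could look roughly like the sketch below. It is an illustration, not a tested replacement: all names (sk_upper, sk_lower, change_case_flat, SKETCH_UPPER, SKETCH_LOWER) are made up, only plain Latin letters are mapped, and the input is assumed to be a flat sequence of integer character codes from 0 to 255. It also fills the tables with the identity mapping first, so the "if c then" test on the looked-up value is not needed; that is a small change from the approach in the post.

```euphoria
-- Sketch only: flat strings of character codes 0..255, no nested sequences.
constant SKETCH_UPPER = 1, SKETCH_LOWER = 2
sequence sk_upper, sk_lower
sk_upper = repeat(0, 255)
sk_lower = repeat(0, 255)
-- identity by default, so unmapped characters pass through unchanged
for i = 1 to 255 do
    sk_upper[i] = i
    sk_lower[i] = i
end for
-- plain Latin only, for illustration; add other alphabets the same way
for i = 'A' to 'Z' do
    sk_lower[i] = i + 32
    sk_upper[i + 32] = i
end for

function change_case_flat(integer mode, sequence s)
    integer c
    for i = 1 to length(s) do
        c = s[i]
        if c then  -- index 0 does not exist, so leave null characters alone
            if mode = SKETCH_LOWER then
                s[i] = sk_lower[c]
            else
                s[i] = sk_upper[c]
            end if
        end if
    end for
    return s
end function

puts(1, change_case_flat(SKETCH_LOWER, "Hello, World!") & '\n') -- hello, world!
puts(1, change_case_flat(SKETCH_UPPER, "Hello, World!") & '\n') -- HELLO, WORLD!
```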
What do you think of this Igor? Handles any odd combination of
alphabets you want to throw at it, I think.
--
MrTrick
49. Re: Small feature request for future EU versions
Patrick Barnes wrote:
[snip]
> And there you go. The only way to make it faster, I think, is to use a
> non-recursive algorithm and maybe restrict it to 1d arrays only.
>
> The method's only assumption is that no zero-value characters exist
> (the null char?). To check for this, just uncomment the checking
> code... might make it run a little slower.
>
> What do you think of this Igor? Handles any odd combination of
> alphabets you want to throw at it, I think.
>
> --
> MrTrick
Using all of your suggestions and questions, I have put together
the case_xx() function, which is twice as fast as the standard
case_la(), takes any alphabet and is selective by default.
It works with my 5 crazy Russians and pure Mumbo Jumbo as well.
Results - 0.3 sec - 3.5M file - 1.8GHz box.
Please try it:
sequence table
table = repeat(0,256)
global constant
    alphabet_LA
        = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz",
    alphabet_MJ
        = ".....place here The Great Mumbo Jumbo Alphabet...."
global function case_xx(integer c, object x, sequence alphabet)
    integer A, a
    table *= 0 -- to clear table for next use with another alphabet
    for i=1 to length(alphabet) - 1 by 2 do
        A = alphabet[i]
        a = alphabet[i+1]
        if c then
            table[A] = A
            table[a] = A
        else
            table[A] = a
            table[a] = a
        end if
    end for
    if atom(x) then
        if x then
            return table[x]
        else
            return x
        end if
    else
        for i=1 to length(x) do
            a = x[i]
            if a then -- to convert binary
                x[i] = table[a]
            end if
        end for
    end if
    return x
end function
puts(1, case_xx(1, alphabet_LA, alphabet_LA) & '\n')
puts(1, case_xx(0, alphabet_LA, alphabet_LA) & '\n')
puts(1, alphabet_LA & '\n')
puts(1, case_xx(0, 'A', alphabet_LA)& '\n')
----
puts(1, case_xx(1, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, case_xx(0, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, alphabet_MJ & '\n')
----
puts(1, case_xx(0, 'M', alphabet_MJ) & '\n')
puts(1, case_xx(0, 'u', alphabet_MJ) & '\n')
----
puts(1, case_xx(0, case_xx(1, alphabet_LA, alphabet_LA), alphabet_LA) &
'\n')
----
Many Thanks To All !
Just add your native alphabet and use it (at your own risk).
Regards,
Igor Kachan
kinz at peterlink.ru
50. Re: Small feature request for future EU versions
OOppss...
I wrote:
Patrick Barnes wrote:
[snip]
> And there you go. The only way to make it faster, I think, is to use a
> non-recursive algorithm and maybe restrict it to 1d arrays only.
>
> The method's only assumption is that no zero-value characters exist
> (the null char?). To check for this, just uncomment the checking
> code... might make it run a little slower.
>
> What do you think of this Igor? Handles any odd combination of
> alphabets you want to throw at it, I think.
>
> --
> MrTrick
Using all of your suggestions and questions, I have put together
the case_xx() function, which is twice as fast as the standard
case_la(), takes any alphabet and is selective by default.
It works with my 5 crazy Russians and pure Mumbo Jumbo as well.
Results - 0.3 sec - 3.5M file - 1.8GHz box.
Please try it:
sequence table
table = repeat(0,256)
global constant
    alphabet_LA
        = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz",
    alphabet_MJ
        = ".....place here The Great Mumbo Jumbo Alphabet...."
global function case_xx(integer c, object x, sequence alphabet)
    integer A, a
    table *= 0 -- to clear table for next use with another alphabet
    for i=1 to length(alphabet) - 1 by 2 do
        A = alphabet[i]
        a = alphabet[i+1]
        if c then
            table[A] = A
            table[a] = A
        else
            table[A] = a
            table[a] = a
        end if
    end for
    if atom(x) then
        if x then
            return table[x]
        else
            return x
        end if
    else
        for i=1 to length(x) do
            a = x[i]
            if a then -- to convert binary
                if table[a] then -- to not affect others
                    x[i] = table[a]
                end if --
            end if
        end for
    end if
    return x
end function
puts(1, case_xx(1, alphabet_LA, alphabet_LA) & '\n')
puts(1, case_xx(0, alphabet_LA, alphabet_LA) & '\n')
puts(1, alphabet_LA & '\n')
puts(1, case_xx(0, 'A', alphabet_LA)& '\n')
----
puts(1, case_xx(1, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, case_xx(0, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, alphabet_MJ & '\n')
----
puts(1, case_xx(0, 'M', alphabet_MJ) & '\n')
puts(1, case_xx(0, 'u', alphabet_MJ) & '\n')
----
puts(1, case_xx(0, case_xx(1, alphabet_LA, alphabet_LA), alphabet_LA) &
'\n')
----
Many Thanks To All !
Just add your native alphabet and use it (at your own risk).
Oops ... just one more bug fix, see above.
Regards,
Igor Kachan
kinz at peterlink.ru
51. Re: Small feature request for future EU versions
Derek Parnell and Juergen Luethje wrote:
[snip]
> > Have you an estimate for how frequently the operators have actually been
> > used in this manner? I think there are two instances in the RDS libraries,
> > the case conversion routines (BTW which only work on a limited set of
> > characters)
>
> That's why they are useless for text written in many languages other
> than English. That is strange for a product that is intended for
> international use, especially because it is very easy to write better case
> conversion routines.
> Case conversion routines written without using the operators in the
> manner mentioned above are also faster. That's why RDS themselves don't
> use their own library routines for case conversion, when speed is
> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...
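For readers wondering what "the operators used in this manner" means: in Euphoria the relational operators applied to a whole sequence return a sequence of 1s and 0s, and that result can drive a case conversion in a single expression. A minimal sketch follows; the name lower_by_ops is made up here, and this only illustrates the technique, it is not presented as the RDS library source.

```euphoria
-- Illustration only: lower-case plain Latin capitals with sequence operators.
function lower_by_ops(object x)
    -- (x >= 'A' and x <= 'Z') is 1 for each capital and 0 for everything
    -- else, element by element, so only capitals get the offset added
    return x + (x >= 'A' and x <= 'Z') * ('a' - 'A')
end function

puts(1, lower_by_ops("Hello, World!") & '\n') -- hello, world!
```

This is also why such a routine only covers a limited set of characters: the range test knows nothing about letters outside 'A'..'Z'.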
Hey, customers, please take the final function,
written as a "poem" in Euphoria.
Almost all bugs are fixed.
The last one is just a victim of art.
Let us wait for the German, French, Greek, Turkish, Polish,
Tatar, Ukrainian, Mongol and other alphabets.
Mumbo Jumbo is ready.
I did not place the 5 Russian alphabets here - they act
like an old USSR torpedo on the RDS MessageBoard.
global constant
alphabet_LA = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz",
alphabet_MJ = ".....place here The Great Mumbo Jumbo Alphabet...."
global function case_xx(integer c, object x, sequence alphabet)
integer A, a
sequence table table = repeat(0,256)
for i=1 to length(alphabet) - 1 by 2 do
A = alphabet[i]
a = alphabet[i+1]
if c then table[A] = A
table[a] = A else
table[A] = a
table[a] = a end if end for
if atom(x) then
if x then
x
=
table[x]
if x then
x
=
x end if end if
else for i=1 to length(x) do
a
=
x[i]
if a then
if table[a] then
x[i]
=
table[a] end if end if end for end if
return x
end function
puts(1, case_xx(1, alphabet_LA, alphabet_LA) & '\n')
puts(1, case_xx(0, alphabet_LA, alphabet_LA) & '\n')
puts(1, alphabet_LA & '\n')
puts(1, case_xx(0, 'A', alphabet_LA)& '\n')
----
puts(1, case_xx(1, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, case_xx(0, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, alphabet_MJ & '\n')
----
puts(1, case_xx(0, 'M', alphabet_MJ) & '\n')
puts(1, case_xx(0, 'u', alphabet_MJ) & '\n')
----
puts(1, case_xx(0, case_xx(1, alphabet_LA, alphabet_LA), alphabet_LA) &
'\n')
I'm waiting for the bug reports.
Let us finish this overly long thread. OK?
Regards,
Igor Kachan
kinz at peterlink.ru