1. Small feature request for future EU versions
- Posted by Vincent <darkvincentdude at yahoo.com> Oct 16, 2004
- 720 views
It would be nice to have the '==' (equality comparison) relational operator implemented in future versions of EU. Many programming languages have this operator for equality comparison, while EU uses equal(). I doubt anyone will confuse '==' with the '=' assignment operator. And if this is a compatibility issue, keep the equal() routine until people agree that it is no longer necessary for backwards compatibility. What do you all think about that?
2. Re: Small feature request for future EU versions
- Posted by "August" <fusionfive at tele2.se> Oct 16, 2004
- 644 views
- Last edited Oct 17, 2004
> posted by: Vincent <darkvincentdude at yahoo.com> > > It would be nice to have the '==' (equal comparison) relational > operator implemented in future versions of EU. Many programming > languages have this operator for equal comparison, while EU > uses equal(). I doubt anyone will confuse '==' with the '=' > assignment operator. And if this is a compatibility issue, > keep the equal() routine until people agree that it is no longer > necessary for backwards compatibility. What do you all think about > that? Generally I think using `=' for assignment and `==' for comparison sucks! In mathematics `=' has been used for ages as a relation symbol. It's the assignment statement that needs (may need) special treatment. The natural symbol for assignment is a "left arrow", but unfortunately it's not available in ASCII, and e.g. `x <- y' can easily be mixed up with `x < -y'. -- August
3. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 16, 2004
- 659 views
- Last edited Oct 17, 2004
August wrote: >> posted by: Vincent <darkvincentdude at yahoo.com> >> >> It would be nice to have the '==' (equal comparison) relational >> operator implemented in future versions of EU. Many programming >> languages have this operator for equal comparision, while EU >> uses equal(). I doubt anyone will comfuse '==' with the '=' >> assignment operator. And if this is a compatibility issue, >> keep the equal() ruitine until people agree that it is no longer >> nessasary for backwards compatibility. What do you all think about >> that? > > Generally I think using `=' for assignment and `==' for comparison sucks! In > mathematics `=' has been used for ages as a relation symbol. It's the > assignment statement that needs (may need) special treatment. The natural > symbol for assignment is a "left arrow", but unfortunately it's not > available in ASCII and e.g. ` x <- y' can easilly be mixed up with `x < -y'. Assignment could be written as e.g. x := y Regards, Juergen
4. Re: Small feature request for future EU versions
- Posted by cklester <cklester at yahoo.com> Oct 16, 2004
- 646 views
- Last edited Oct 17, 2004
August wrote: > > posted by: Vincent <darkvincentdude at yahoo.com> > > It would be nice to have the '==' (equal comparison) relational > > operator implemented in future versions of EU. > Generally I think using `=' for assignment and `==' for comparison sucks! In I think using if mySeq==yourSeq then... is a step in the right direction from if equal(mySeq,yourSeq) then... That's 22 characters down from 28 characters, a 21% reduction in typing! Not too shabby. :) -=ck "Programming in a state of EUPHORIA." http://www.cklester.com/euphoria/
5. Re: Small feature request for future EU versions
- Posted by "August" <fusionfive at tele2.se> Oct 16, 2004
- 677 views
- Last edited Oct 17, 2004
> Assignment could be written as e.g. x := y Yes, ever since Algol that has been the most common alternative to `='.
6. Re: Small feature request for future EU versions
- Posted by CoJaBo <cojabo at suscom.net> Oct 17, 2004
- 654 views
Vincent wrote: > > It would be nice to have the '==' (equal comparison) relational > operator implemented in future versions of EU. Many programming Unfortunately this, while it would certainly be helpful, will probably never be added... > languages have this operator for equal comparison, while EU > uses equal(). I doubt anyone will confuse '==' with the '=' > assignment operator. And if this is a compatibility issue, > keep the equal() routine until people agree that it is no longer If this is ever added, equal() should remain. '==' would simply be short for equal(). Less typing good! > necessary for backwards compatibility. What do you all think about > that? >
7. Re: Small feature request for future EU versions
- Posted by Vincent <darkvincentdude at yahoo.com> Oct 17, 2004
- 688 views
CoJaBo wrote: > > Vincent wrote: > > > > It would be nice to have the '==' (equal comparison) relational > > operator implemented in future versions of EU. Many programming > Unfortunately this, while it would certainly be helpful, will probably never > be added... > > > languages have this operator for equal comparison, while EU > > uses equal(). I doubt anyone will confuse '==' with the '=' > > assignment operator. And if this is a compatibility issue, > > keep the equal() routine until people agree that it is no longer > If this is ever added, equal() should remain. '==' would simply > be short for equal(). Less typing good! > > > necessary for backwards compatibility. What do you all think about > > that? > > >
Yeah, but Rob is introducing the $ feature, which happens to be a short way of typing length: x[$] is the same as x[length(x)]. And I don't think it would be any hassle for Rob to just implement the '==' operator as an alternative to equal(), for 5 reasons:
#1 The '==' operator is used for boolean comparison of objects in many languages.
#2 Using equal() makes the code less readable. (not really but still)
#3 It takes less typing and is more readable.
#4 I don't think Robert Craig would have much difficulty, if any, implementing the '==' relational operator in the core language definition.
#5 I think it's cool :P
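For readers who have not seen the $ shorthand mentioned above, here is a minimal sketch. It assumes Euphoria 2.5 or later (the release in which the feature was announced); the variable name x is just an illustration.

```euphoria
-- Sketch only: inside a subscript, '$' stands for the length of the
-- sequence being subscripted (assumes Euphoria 2.5 or later).
sequence x
x = {10, 20, 30}

? x[length(x)]   -- last element, written the long way
? x[$]           -- the same element, using the '$' shorthand
? x[2..$]        -- '$' also works in slices: second element to the end
```

The comparison with '==' is that both are pure shorthands: nothing new can be expressed, the existing form just becomes shorter to type.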
8. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 17, 2004
- 654 views
Derek Parnell wrote: > Chris Bensler wrote: > > [snip] > >> >> I think I would prefer an operator like ==. > > And I would like '=' to only be an equality test, ':=' to only be an > assignment action and 'element_eq()' to be the sequence operation. Me too. > If we have '==' we should logically also have '<<', '>>', '!==', '<<=' > and '>>='. Oh no, please. >> I think that is something that can be done in a preprocessor though. >> Alot of the changes that people want can be done in a preprocessor, and >> once 2.5 arrives, we will have the ability to edit the front end, I >> beleive, which is even better than a preprocesor. > > Yes, this is going to be survival-of-the-fitest game, with lots > of variants vying for attention. Yep. Regards, Juergen
9. Re: Small feature request for future EU versions
- Posted by rudy toews <rltoews at ilos.net> Oct 17, 2004
- 649 views
Juergen Luethje wrote: > > Derek Parnell wrote: > > Chris Bensler wrote: > >> [snip] > >> I think that is something that can be done in a preprocessor though. > >> Alot of the changes that people want can be done in a preprocessor, and > >> once 2.5 arrives, we will have the ability to edit the front end, I > >> beleive, which is even better than a preprocesor. > > > > Yes, this is going to be survival-of-the-fitest game, with lots > > of variants vying for attention. > > Yep. > > Regards, > Juergen > > survival-of-the-fittest game! oh no. please don't go off in different directions like pieces of a bomb. communicate with each other , all those who want to work on the project. get the same syntax down (names of functions,procedures and arguments). then each can try to make the inside of those routines better. or like an ide does, make a front end better using different methods (maybe even a pre-processor ) but stick to the same insides(syntax used for the routines). just my thoughts. rudy
10. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 17, 2004
- 666 views
rudy toews wrote: > Juergen Luethje wrote: >> >> Derek Parnell wrote: >> >> Chris Bensler wrote: >>>> [snip] > >>>> I think that is something that can be done in a preprocessor though. >>>> Alot of the changes that people want can be done in a preprocessor, and >>>> once 2.5 arrives, we will have the ability to edit the front end, I >>>> beleive, which is even better than a preprocesor. >>> >>> Yes, this is going to be survival-of-the-fitest game, with lots >>> of variants vying for attention. >> >> Yep. >> > > survival-of-the-fittest game! > oh no. > > please don't go off in different directions like pieces of a bomb. > communicate with each other , all those who want to work on the project. > get the same syntax down (names of functions,procedures and arguments). I agree that this is desirable. Experience from the past as well as this little discussion shows, that different people often have different preferences, though. > then each can try to make the inside of those routines better. > > or like an ide does, make a front end better using different methods > (maybe even a pre-processor ) but stick to the same insides(syntax used > for the routines). > > just my thoughts. Regards, Juergen
11. Re: Small feature request for future EU versions
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Oct 17, 2004
- 662 views
On Sun, 17 Oct 2004 00:24:23 -0700, rudy toews <guest at RapidEuphoria.com> wrote: >> Derek Parnell wrote: >> > Yes, this is going to be survival-of-the-fitest game, with lots >> > of variants vying for attention. >please don't go off in different directions like pieces of a bomb. >communicate with each other
FWIW, new operators are not actually needed; it is quite easy to greatly improve the language using the current ones. Taking the "if <expr> then" example, the interpreter should realise that <expr> must deliver a boolean result (or crash), in other words map
e1<e2 to compare(e1,e2)=-1
e1<=e2 to compare(e1,e2)!=1
e1=e2 to equal(e1,e2)
e1!=e2 to not equal(e1,e2)
e1>=e2 to compare(e1,e2)!=-1
e1>e2 to compare(e1,e2)=1
Of course, it does not need to do this if e1 and e2 are both atoms. Taking this a step further, "if <e1> and <e2> then" must also deliver a boolean result, which means the "need" for a boolean result must be propagated into e1 and e2. The same is true for all the other operators (not, unary minus, +, -, *, /, or, xor), and the same logic applies to while, for, subscript, and slice expressions, but not to assignments, constants, or parameters. If this all sounds horribly complicated, don't worry, in practice it's not. I already have this working in Posetf.
Regards, Pete
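The mapping Pete describes can be written out today as ordinary functions, which may make it easier to see what the interpreter would be doing behind the scenes. This is only a sketch: the names bool_lt, bool_le and so on are made up for illustration and are not part of Euphoria or of Pete's implementation.

```euphoria
-- Hypothetical helpers: each one returns a single 0/1 result,
-- whatever mix of atoms and sequences it is given.
function bool_lt(object e1, object e2)   -- e1 <  e2
    return compare(e1, e2) = -1
end function

function bool_le(object e1, object e2)   -- e1 <= e2
    return compare(e1, e2) != 1
end function

function bool_eq(object e1, object e2)   -- e1 =  e2
    return equal(e1, e2)
end function

function bool_ne(object e1, object e2)   -- e1 != e2
    return not equal(e1, e2)
end function

function bool_ge(object e1, object e2)   -- e1 >= e2
    return compare(e1, e2) != -1
end function

function bool_gt(object e1, object e2)   -- e1 >  e2
    return compare(e1, e2) = 1
end function

-- Usage: a string comparison that is safe inside an if statement.
if bool_lt("apple", "banana") then
    puts(1, "apple sorts before banana\n")
end if
```

Because compare() and equal() always return an atom, every helper yields a value that an if statement accepts, which is the property the proposed mapping relies on.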
12. Re: Small feature request for future EU versions
- Posted by irv mullins <irvm at ellijay.com> Oct 17, 2004
- 670 views
Vincent wrote: > > It would be nice to have the '==' (equal comparison) relational > operator implemented in future versions of EU. Many programming > languages have this operator for equal comparision, while EU > uses equal(). I doubt anyone will comfuse '==' with the '=' > assignment operator. And if this is a compatibility issue, > keep the equal() ruitine until people agree that it is no longer > nessasary for backwards compatibility. What do you all think about > that? For that matter, what's wrong with just having '=' for both string and numeric comparisons? Several other languages do that without any problem. Irv
13. Re: Small feature request for future EU versions
- Posted by Patrick Barnes <mrtrick at gmail.com> Oct 17, 2004
- 665 views
- Last edited Oct 18, 2004
On Sun, 17 Oct 2004 04:46:14 -0700, irv mullins <guest at rapideuphoria.com> wrote: > string and numeric comparisons? Several other languages do that > without any problem. > > Irv
The problem is this:
constant
    atom1=5,
    atom2=10,
    seq1={1,5,2,4},
    seq2={1,2,5,4}
constant cond1 = ( atom1 = atom2 )
constant cond2 = ( seq1 = seq2 )
--cond1 will be 0, as 5 does not equal 10.
--cond2 will be {1,0,0,1}, because Euphoria compares each element.
The problem is that the IF statement doesn't know what to do with a sequence, and there's not an obvious solution. As a quick hack, let's say that if the IF statement receives a sequence, it treats it as true if every element is non-zero. That would allow us to easily compare strings using the simple form: if string1 = string2. However, what if the sequence passed to the IF statement is empty? What if the sequence contains multiple levels? What if the sequence contains a mix of non-zero integers, and an empty sequence? I think that these things aren't easily solved... Maybe it should be extended partially.
1. Only atoms and 1-dimensional sequences can be passed to the IF statement.
2. If an atom, pass if non-zero.
3. If a sequence, pass if all elements are non-zero.
4. If an empty sequence is passed to the IF statement, treat it as a 'zero', I suppose.
What do you think of this solution? I really don't like the idea of adding extra relational operators. :=, =>, ==, etc. are really annoying to remember. Trust me, I've written too much VHDL code... At least with the above suggestion, there's no broken compatibility, a common issue (Why can't I just use '=' to compare these strings?) is fixed, and it's a logical solution.
-- MrTrick
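The four rules above can be sketched as a plain function, to show what the IF statement would have to do internally. The name all_nonzero is hypothetical, and returning 0 for a nested sequence is an assumption made here; rule 1 only says such values are not allowed, without saying what should happen.

```euphoria
-- Hypothetical helper sketching the proposed IF rules.
function all_nonzero(object x)
    if atom(x) then
        return x != 0            -- rule 2: an atom passes if non-zero
    end if
    if length(x) = 0 then
        return 0                 -- rule 4: empty sequence counts as 'zero'
    end if
    for i = 1 to length(x) do
        if sequence(x[i]) then
            return 0             -- rule 1: nested sequence (assumed: fail)
        end if
        if x[i] = 0 then
            return 0             -- rule 3: every element must be non-zero
        end if
    end for
    return 1
end function

sequence seq1, seq2
seq1 = {1,5,2,4}
seq2 = {1,2,5,4}

? seq1 = seq2                    -- element-by-element result, a sequence
? all_nonzero(seq1 = seq2)       -- what the proposed IF would test
? equal(seq1, seq2)              -- what equal() reports today
```

One limitation of the rule, as far as I can tell: '=' on two sequences of different lengths is a run-time error in Euphoria, so "if string1 = string2" would still fail for strings of unequal length before the IF statement ever saw a value, whereas equal() simply returns 0.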
14. Re: Small feature request for future EU versions
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Oct 18, 2004
- 732 views
On Mon, 18 Oct 2004 09:52:19 +1000, Patrick Barnes <mrtrick at gmail.com> wrote: >The problem is that the IF statement doesn't know what to do with a >sequence, and there's not an obvious solution. Actually there is: the interpreter can easily determine when it needs a boolean result, eg in an if statement, so it simply maps relational operators to equal()/compare(). >Why can't I just use '=' to compare these strings? Trust me, this can be done for any reasonable expression, eg if name="pete" then or if forename="pete" and surname="lomax" then but not if {1,2,3}+{4,5,6} then I'll give it a go if/when I get 2.5 Regards, Pete
15. Re: Small feature request for future EU versions
- Posted by irv mullins <irvm at ellijay.com> Oct 18, 2004
- 670 views
Patrick Barnes wrote: > > On Sun, 17 Oct 2004 04:46:14 -0700, irv mullins <guest at rapideuphoria.com> > wrote: > > string and numeric comparisons? Several other languages do that > > without any problem. > > > > Irv > > The problem is this: > > constant > atom1=5, > atom2=10, > seq1={1,5,2,4}, > seq2={1,2,5,4} > > constant cond1 = ( atom1 = atom2 ) > constant cond2 = ( seq1 = seq2 ) > > --cond1 will be 0, as 5 does not equal 10. > --cond2 will be {1,0,0,1}, because Euphoria compares each element. By any normal meaning of the word 'equal' as used by scientists, mathematicians, programmers, and the corner grocer, the answer can not possibly be {1,0,0,1}. It is either TRUE or FALSE. If the lengths of the two sequences are different, then they are not equal, and the result should be FALSE - not an error. If, for some strange reason, someone wanted an item-by-item comparison between two sequences, then a new and more meaningful name should be chosen for a function which returns {1,0,0,1}. > The problem is that the IF statement doesn't know what to do with a > sequence, and there's not an obvious solution. > As a quick hack, lets say that if the IF statement receives a > sequence, it treats it as true if every element is non-zero. > > That would allow us to easily compare strings using the simple form: > if string1 = string2. > However, what if the sequence passed to the IF statement is empty? > What if the sequence contains multiple levels? > What if the sequence contains a mix of non-zero integers, and an empty > sequence? You're thinking within the box built by RDS. No matter whether the sequences are empty, or contain multiple levels, if the two are identical, then they are equal, otherwise they aren't. That is the obvious definition of equal which anyone can understand. > I think that these things aren't easily solved... The whole thing could have been avoided if Rob had used the = operator to return equality and another operator or function to return a comparison. 
.... > At least with the above suggestion, there's no broken compatibility, a > common issue (Why can't I just use '=' to compare these strings?) is > fixed, and it's a logical solution. I'll bet there aren't a dozen uses of = in the existing code base. I've used it exactly once, and I really don't mind changing that instance to some new function. Irv
16. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 18, 2004
- 700 views
Ricardo Forno wrote: <snip> > Moreover, please remember (or consider) that one of the most common pitfalls > in the C language is to use = instead of ==, and in Pascal and other ones, > to use = instead of :=. <snip> That really surprises me. I was thinking that different symbols for different operations (comparison vs. assignment) would lead to clearer code and fewer pitfalls -- compared to Euphoria's '=', the meaning of which depends on the context. Regards, Juergen
17. Re: Small feature request for future EU versions
- Posted by Robert Craig <rds at RapidEuphoria.com> Oct 18, 2004
- 684 views
Juergen Luethje wrote: > That really surprises me. I was thinking that different symbols for > different operations (comparison vs. assignment) would lead to clearer > code and less pitfalls -- compared to Euphoria's '=', the meaning of > which dependes on the context.
Consider the following C code:
if (A = B) {
    printf("A is equal to B");
}
Every C programmer will eventually have a 3 hour debugging session (and probably on several different occasions), where he finally realizes that the above code is actually doing:
A = B;
if (A != 0) {
    printf("A is equal to B");
}
Not only is the if-statement "wrong", but A is overwritten by B!
Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
18. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 18, 2004
- 671 views
Robert Craig wrote: > Juergen Luethje wrote: >> That really surprises me. I was thinking that different symbols for >> different operations (comparison vs. assignment) would lead to clearer >> code and less pitfalls -- compared to Euphoria's '=', the meaning of >> which dependes on the context. > > Consider the following C code: > > if (A = B) { > printf("A is equal to B"); > } > > Every C programmer will eventually have a 3 hour debugging > session (and probably on several different occasions), where he > finally realizes that the above code is actually doing: > > A = B; > if (A != 0) { > printf("A is equal to B"); > } > > Not only is the if-statement "wrong", but A is overwritten by B! Urgs, that is ugly. I see now. Thanks for the explanation! Regards, Juergen
19. Re: Small feature request for future EU versions
- Posted by David <dcuny at lanset.com> Oct 18, 2004
- 712 views
Derek Parnell wrote: > If RDS were to change Euphoria to behave as most people expect it to, > there would be some existing programs that would fail. But I suspect > that there would be very, very few of those. The pain is worth the gain. This is indeed a rehash of an old complaint. My recollection (although quite possibly wrong) of Robert's response was: 1. His vision for Euphoria is a language where operators are simple and consistent. Having '=' apply to each item in a sequence is simple and consistent. 2. Part of Euphoria's value is backwards compatibility. Changing how '=' works would break code in the library - something he's typically unwilling to do unless there is a major gain for the language. Given this, I'd be *very* surprised to find Robert's changed his position. I suspect that one's personal expectation of the '=' operator depends on what language you come from. If you come from a language such as BASIC, you probably prefer a single true/false value from the comparison operator. People coming from C would consider Euphoria's behavior to be normal - this extends to other C/C++ derived languages, such as Java. If you want my personal opinion, you can search the archives. -- David Cuny
20. Re: Small feature request for future EU versions
- Posted by "Unkmar" <L3Euphoria at bellsouth.net> Oct 19, 2004
- 670 views
----- Original Message ----- From: "Derek Parnell" Sent: Sunday, October 17, 2004 9:36 PM Subject: RE: Small feature request for future EU versions > [snip] > It seems that you would like equality (and relationship comparisions) > implemented as built-in functions (eg. equal(), compare() ) and > sequence operations performed by operators ('=', '<', etc...) > > Whereas I'd prefer the reverse situation. I'd like relationship > comparisions to use operators and sequence operations to use built-in > functions. > > I'd prefer that ... > > cond1 = (seq1 = seq2) > > to be interpreted as ... > > if the contents of seq1 and the contents of seq2 are identical then > assign 'true' to cond1 otherwise assign 'false' to cond1. > I agree with you here. if (identical) then TRUE else FALSE end if > If I really wanted a sequence operation to be performed as its result > to be assigned I'd rather write something like ... > > cond1 = seqop_eq(seq1, seq2) > > -- > Derek Parnell > Melbourne, Australia >
21. Re: Small feature request for future EU versions
- Posted by Patrick Barnes <mrtrick at gmail.com> Oct 19, 2004
- 670 views
On Mon, 18 Oct 2004 21:18:27 -0400, Lucius Hilley <l3euphoria at bellsouth.net> wrote: > From: "Derek Parnell" > Sent: Sunday, October 17, 2004 9:36 PM > Subject: RE: Small feature request for future EU versions > > It seems that you would like equality (and relationship comparisions) > > implemented as built-in functions (eg. equal(), compare() ) and > > sequence operations performed by operators ('=', '<', etc...) > > > > Whereas I'd prefer the reverse situation. I'd like relationship > > comparisions to use operators and sequence operations to use built-in > > functions. > > > > I'd prefer that ... > > > > cond1 = (seq1 = seq2) > > > > to be interpreted as ... > > > > if the contents of seq1 and the contents of seq2 are identical then > > assign 'true' to cond1 otherwise assign 'false' to cond1. > > > > I agree with you here. > if (identical) then > TRUE > else > FALSE > end if I agree, logically it should work like that. I think that the behaviour of the 'if' construct should be changed, to automatically use equal() to check if the phrase seq1 = seq2 is used within an if statement. But... ONLY the if statement. -- MrTrick
22. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 19, 2004
- 688 views
Hi Ricardo, you wrote: > Hi, Juergen. > Your thinking is logical, and I agree that using different operators for > different purposes *should* be clearer. This is the theory ... > But, in my experience (and it seems > some other people had the same kind of experience), == or := versus = leads > to pitfalls. ... and that seems to be the practice. I didn't know that because of lack of experience with C or Pascal. However, the example that Rob posted in the meantime was impressive. > I think this is due to the fact that all these operators contain the = sign. > This problem rarely arose in APL, where the assignation symbol was a left > arrow (such as <-, but a single character). > Regards. Interesting. Thank you! Regards, Juergen
23. Re: Small feature request for future EU versions
- Posted by Patrick Barnes <mrtrick at gmail.com> Oct 22, 2004
- 646 views
Just in case this topic hasn't been discussed to death yet... Sequence operators (= != >= <= > < + - * /) are confusing at first, I'll agree. However, I do not support changing their current behaviour from returning a sequence comprised of the individual elements, operated on. Apart from the backwards compatibility... I have written many pieces of code that take an 'object' argument, act on it in some way, and return it... without ever testing whether the object is a sequence or an atom! This is a tremendous benefit for Euphoria. Try to do the same thing in C, and it's more than likely that you'll have to write an individual function for every type of argument. And I don't mean just atomic or array types... I mean char, short, int, uint, long, double, char[], short[], int[], long[], double[], plus any other types that you are interested in. And even then it can only handle 1-d arrays! For example: (and I'm sure it's possible to do this more efficiently, this is just off the top of my head)
function abs(object x)
    return x * (1 - 2*( x < 0 ) )
end function
Elegant, and simple. Works for ANY object X... I see this as a very important reason not to change the behaviour of existing operators. However, the problem still remains... I have object A here, and object B there. I want to compare them, but inside of the language, not using a function. That would suggest adding a new operator type, but what? Given the many pitfalls of '=' vs '==' in C, I would steer clear of using '=='. (Okay, here's the suggestion, pick it to pieces!) How about something that is obviously intended for comparing sequences? Applying braces around the operators that could be used for comparison (not +, -, *, or /):
A {=} B and A {!=} B replace equal(A, B) and not equal(A, B).
A {>} B, A {<} B, A {>=} B and A {<=} B replace compare(A,B) and its tests.... I cannot remember how compare works, ever. Each time I want to use compare(), I need to look it up to see the behaviour... These operators are logical, and more importantly intuitive.
The benefits: It is a long term solution that works in any situation.... no 'if' specialisations are required. I can use these within an equation to test equality, without causing problems with sequence operators.
Usage: Although nominally for $sequence $test $sequence, it would use the same mechanisms internally as compare() and equal(). That means it can be used to compare atom with sequence, sequence with atom, atom with atom, sequence with sequence. Obviously comparing an atom with a sequence will always return false... That's ok, it can be used to compare sequence elements with one another. The only confusion I can even think of is that someone will consistently use {=} to compare two atoms, rather than =.... This is not even a real abuse, and is in my mind unlikely. I welcome comment on this little suggestion of mine.
-- MrTrick
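Since the behaviour of compare() comes up above as hard to remember, a short reminder may help. compare(a, b) returns -1 if a is less than b, 0 if they are equal and 1 if a is greater; atoms are treated as less than sequences. The sketch below only uses built-in routines; the pairing with the proposed {<=} operator in the last lines is my reading of the proposal, not something Euphoria supports.

```euphoria
-- compare(a, b): -1 if a < b, 0 if a = b, 1 if a > b
? compare(1, 2)            -- a is smaller
? compare("ABC", "ABC")    -- identical sequences
? compare("ABC", "ABCD")   -- a is a shorter prefix of b, so a is smaller
? compare(5, {5})          -- an atom is considered less than any sequence
? equal(5, {5})            -- and an atom never equals a sequence

-- Under the proposal, A {<=} B would presumably mean the same as:
object A, B
A = "apple"
B = "banana"
if compare(A, B) <= 0 then
    puts(1, "A is less than or equal to B\n")
end if
```

Testing "compare(A, B) <= 0" and "compare(A, B) != 1" give the same answer, since compare() only ever returns -1, 0 or 1.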
24. Re: Small feature request for future EU versions
- Posted by Derek Parnell <ddparnell at bigpond.com> Oct 22, 2004
- 666 views
Patrick Barnes wrote: > > Just in case this topic hasn't been discussed to death yet... Impossible! > Sequence operators (= != >= <= > < + - * /) are confusing at first, I'll > agree. > > However, I do not support changing their current behaviour from > returning a sequence comprised of the individual elements, operated > on. Apart from the backwards compatibility... > > I have written many pieces of code that take an 'object' argument, > act on it in some way, and return it... without ever testing whether > the object is a sequence or an atom! > This is a tremendous benefit for Euphoria. Try to do the same thing in > C, and it's more than likely that you'll have to write an individual > function for every type of argument. And I don't mean just atomic or > array types... I mean char, short, int, uint, long, double, char[], > short[], int[], long[], double[], plus any other types that you are > interested in. And even then it can only handle 1-d arrays! I agree that this is one of Euphoria's overriding strengths. It costs a bit of runtime performance (dynamic typing versus static typing) but in the long run it is wonderful. But why *must* this functionality be implemented using operators rather than built-in functions? In essence, it is a syntax issue and not a semantic issue. > For example: (and I'm sure it's possible to do this more efficiently, > this is just off the top of my head) > > function abs(object x) > return x * (1 - 2*( x < 0 ) ) > end function
function abs(object x)
    return x * (1 - 2*( lessthan(x,0) ) )
end function
> Elegant, and simple. Works for ANY object X... > I see this as a very important reason not to change the behaviour of > existing operators. Have you an estimate for how frequently the operators have actually been used in this manner? I think there are two instances in the RDS libraries, the case conversion routines (BTW which only work on a limited set of characters) and ... actually I can't find the other example just now. My position is that the functionality should remain in Euphoria, but as it is rarely used it should be implemented using (unambiguous) built-in functions, and the much more common comparison functionality should be implemented using operators. In the end, both are translated to IL for execution, so it's just a matter of which syntax to use to represent the functionality. However, if you really insist on operators for the sequence operations, then as these are rarely used, a special syntax for those might be better.
function abs(object x)
    return x * (1 - 2*( x {<} 0 ) )
end function
-- Derek Parnell Melbourne, Australia
25. Re: Small feature request for future EU versions
- Posted by Patrick Barnes <mrtrick at gmail.com> Oct 22, 2004
- 639 views
On Thu, 21 Oct 2004 20:27:48 -0700, Derek Parnell <guest at rapideuphoria.com> wrote: > But why *must* this functionality be implemented using operators rather > than built-in functions? In essence, it is a syntax issue and not a > semantic issue.
Well, using functions rather than operators looks ungainly, and requires more typing. In addition, they can be difficult to quickly comprehend when you are reading through source:
if A {<=} B then
Vs:
if compare(A, B) <= 0 then
I don't even know if the two are functionally the same! I can't remember whether compare is supposed to return a positive or a negative value in which case!
> function abs(object x) > return x * (1 - 2*( lessthan(x,0) ) ) > end function
Yes, that could work... but there are more important considerations... (see below)
> Have you an estimate for how frequently the operators have actually been > used in this manner? I think there are two instances in the RDS libraries, > the case conversion routines (BTW which only work on a limited set of > characters) and ... actually I can't find the other example just now.
Don't forget that the RDS libraries are missing many basic functions like the aforementioned abs... I'm reasonably sure this functionality gets used more often in things like genfunc, etc... My projects have used sequence operators whenever I see them as being appropriate.
> My position is that the functionality should remain in Euphoria, but as it > is rarely used it should be implemented using (unambiguous) built-in > functions, and the much more common comparison functionality should be > implemented using operators.
I think this would be a grave error. Initially, I thought "yeah, that's a great idea, it makes more sense"... and it does. But... (see below)
> However, if you really insist on operators for the sequence operations, > then as these are rarely used, a special syntax for those might be better. > > function abs(object x) > return x * (1 - 2*( x {<} 0 ) ) > end function
Actually, I initially thought that too. After all, it makes as much sense as the solution I proposed in the earlier thread, if not more.... PROBLEM! Backwards compatibility. Not that we haven't heard it shouted before... It's worse than regular backwards compatibility problems though... Consider: Should it be changed, every program that uses sequence operators will have a different behaviour. eg:
X = a[i] + (b[j][m] > c[p][2])
However, there will be no obvious error message. Rather than the line above executing as expected, with a[i] being added to by an array, it'll be added to by an atom. Who knows what will happen? Of course, nothing could happen... if b[j][m] and c[p][2] are atoms. But are they? I have no idea, and neither will anyone searching through source changing things over. There's no way to detect whether this will affect code. That's why I suggest the new {=} apply to sequence comparison. After all, it *is* logical for given values of 'logical', and it won't break any existing code.
-- MrTrick
26. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 22, 2004
- 658 views
Derek Parnell wrote: <snip> > Have you an estimate for how frequently the operators have actually been > used in this manner? I think there are two instances in the RDS libraries, > the case conversion routines (BTW which only work on a limited set of > characters) That's why they are useless for text written in many languages other than English. That is strange for a product that is intended for international use, especially because it is very easy to write better case conversion routines. Case conversion routines written without using the operators in the manner mentioned above are also faster. That's why RDS themselves don't use their own library routines for case conversion when speed is important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ... <snip> Regards, Juergen
27. Re: Small feature request for future EU versions
- Posted by Derek Parnell <ddparnell at bigpond.com> Oct 22, 2004
- 649 views
Juergen Luethje wrote: > > Derek Parnell wrote: > > <snip> > > > Have you an estimate for how frequently the operators have actually been > > used in this manner? I think there are two instances in the RDS libraries, > > the case conversion routines (BTW which only work on a limited set of > > characters) > > That's why they are useless for text written in many languages other > than Englisch. That is strange for a product, that is intended for > international use, especially because it is very easy to write better case > conversion routines. > Case conversion routines written without using the operators in the > manner mentioned above are also faster. That's why RDS themselves don't > use their own library routines for case conversion, when speed is > important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ... I didn't think that would be the case, but I just tested a simple lookup table approach to case conversion and it runs in 75% of the time that lower() uses. -- Derek Parnell Melbourne, Australia
28. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 22, 2004
- 674 views
Derek Parnell wrote: > Juergen Luethje wrote: <snip> >> That's why they are useless for text written in many languages other >> than Englisch. That is strange for a product, that is intended for >> international use, especially because it is very easy to write better case >> conversion routines. >> Case conversion routines written without using the operators in the >> manner mentioned above are also faster. That's why RDS themselves don't >> use their own library routines for case conversion, when speed is >> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ... > > I didn't think that would be the case, but I just tested a simple lookup > table approach to case conversion and it runs in 75% of the time that > lower() uses. I use personally a modified version of RDS' fast_lower() (URL might wrap): http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=9&fromYear=9&toMonth=9&toYear=9&postedBy=Juergen+Luethje&keywords=lower Applied to the whole text in Euphoria/Doc/Library.doc, it runs in 50% of the time that lower() uses. Regards, Juergen
29. Re: Small feature request for future EU versions
- Posted by "Unkmar" <L3Euphoria at bellsouth.net> Oct 22, 2004
- 646 views
----- Original Message ----- From: "Juergen Luethje" Sent: Friday, October 22, 2004 2:48 AM Subject: Re: Small feature request for future EU versions > > <snip> > > Case conversion routines written without using the operators in the > manner mentioned above are also faster. That's why RDS themselves don't > use their own library routines for case conversion, when speed is > important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ... > > <snip> > > Regards, > Juergen Those routines will do case conversion at any depth. Anything else would have to be either recursive or specially designed for each task. unkmar
30. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 22, 2004
- 683 views
Lucius Hilley wrote: > ----- Original Message ----- > From: "Juergen Luethje" > >> <snip> >> >> Case conversion routines written without using the operators in the >> manner mentioned above are also faster. That's why RDS themselves don't >> use their own library routines for case conversion, when speed is >> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ... >> >> <snip> > > Those routines will do case convertion at any depth. > Anything else would have to be either recursive or specially designed for > each task. My modified version of RDS' fast_lower() function (used in 'guru.ex' and 'search.ex') also does case conversion at any depth. Yes, it is recursive. As I wrote, my function (which can also handle special characters such as the German umlauts) runs in 50% of the time that lower() uses, when applied to the whole text of Euphoria/Doc/Library.doc. Do you think my recursive function will be slower than the lower() library function, when applied to deeply nested objects? I don't know. I think the library function also will have to do some recursion internally. And do we apply lower() and upper() more often to deeply nested objects, or to plain text strings? Regards, Juergen
31. Re: Small feature request for future EU versions
- Posted by "Kat" <gertie at visionsix.com> Oct 22, 2004
- 676 views
On 22 Oct 2004, at 14:12, Juergen Luethje wrote: > > > Derek Parnell wrote: > > > Juergen Luethje wrote: > > <snip> > > >> That's why they are useless for text written in many languages other > >> than Englisch. That is strange for a product, that is intended for > >> international use, especially because it is very easy to write better case > >> conversion routines. Case conversion routines written without using the > >> operators in the manner mentioned above are also faster. That's why RDS > >> themselves don't use their own library routines for case conversion, when > >> speed is important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ... > > > > I didn't think that would be the case, but I just tested a simple lookup > > table approach to case conversion and it runs in 75% of the time that > > lower() uses. > > I use personally a modified version of RDS' fast_lower() (URL might wrap): > > http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=9&fromYear=9&toMonth=9&t > oYear=9&postedBy=Juergen+Luethje&keywords=lower > > Applied to the whole text in Euphoria/Doc/Library.doc, it runs in 50% of > the time that lower() uses. Just out of curiosity, did you compare to/against Jiri's lib in http://www.rapideuphoria.com/nlseu.zip ? I did something similar in mirc ages ago, but naturally it ran v e r y s l o w l y there. Kat
32. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 22, 2004
- 666 views
Kat wrote: > On 22 Oct 2004, at 14:12, Juergen Luethje wrote: > >> Derek Parnell wrote: >> >>> Juergen Luethje wrote: >> >> <snip> >> >>>> That's why they are useless for text written in many languages other >>>> than Englisch. That is strange for a product, that is intended for >>>> international use, especially because it is very easy to write better case >>>> conversion routines. Case conversion routines written without using the >>>> operators in the manner mentioned above are also faster. That's why RDS >>>> themselves don't use their own library routines for case conversion, when >>>> speed is important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ... >>> >>> I didn't think that would be the case, but I just tested a simple lookup >>> table approach to case conversion and it runs in 75% of the time that >>> lower() uses. >> >> I use personally a modified version of RDS' fast_lower() (URL might wrap): >> >> http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=9&fromYear=9&toMonth=9&t >> oYear=9&postedBy=Juergen+Luethje&keywords=lower >> >> Applied to the whole text in Euphoria/Doc/Library.doc, it runs in 50% of >> the time that lower() uses. > > Just out of curiosity, did you compare to/against Jiri's lib in > http://www.rapideuphoria.com/nlseu.zip > ? I did something similar in mirc ages ago, but naturally it ran v e r y s l o > w l y > there. I hadn't compared it to that lib, because I wasn't aware of that library. Now I downloaded 'nlseu.zip' and compared it: Applied to the whole text of Euphoria/Doc/Library.doc, the nlsLower() function in 'nlseu.zip' takes 310% of the time that the lower() function in 'wildcard.e' uses. Furthermore, nlsLower() is only for Windows. Regards, Juergen
33. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru> Oct 22, 2004
- 641 views
- Last edited Oct 23, 2004
Hi, Juergen! You wrote: [snip] > I hadn't compared it to that lib, because I wasn't aware of that library. > Now I downloaded 'nlseu.zip' and compared it: > Applied to the whole text of Euphoria/Doc/Library.doc, the nlsLower() > function in 'nlseu.zip' takes 310% of the time that the lower() function > in 'wildcard.e' uses. Furthermore, nlsLower() is only for Windows. There is wildcarr.e in my ru_eu_9_.zip package. It has the additional functions with the bilingual (English/Russian) names. English names of those functions are case_la() - for Latin alphabet, and case_ru() - for Russian alphabet in 5 different encodings. I did not test the speed of those functions, they just work for me and I do not care. Try please, any alphabet may be supported that Russian way, I think. Regards, Igor Kachan kinz at peterlink.ru
34. Re: Small feature request for future EU versions
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Oct 23, 2004
- 651 views
On Fri, 22 Oct 2004 21:43:00 +0200, Juergen Luethje <j.lue at gmx.de> wrote: >Now I downloaded 'nlseu.zip' and compared it: >Applied to the whole text of Euphoria/Doc/Library.doc, the nlsLower() >function in 'nlseu.zip' takes 310% of the time that the lower() function >in 'wildcard.e' uses. Probably >Furthermore, nlsLower() is only for Windows. True[1] It will convert to lower case not only the usual A-Z, not only the few characters in #80..#FF, but also, potentially, if modified to use CharLowerW instead of CharLowerA, unicode. (Much) Slower, yes. However, I wanted to point out that it is fundamentally better, at least for some purposes. Regards, Pete [1] no doubt there is a similar Linux system call.
35. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru> Oct 23, 2004
- 658 views
Hi, Juergen, again! > > Hi, Juergen! > > You wrote: > > [snip] > > > I hadn't compared it to that lib, because I wasn't aware of that library. > > Now I downloaded 'nlseu.zip' and compared it: > > Applied to the whole text of Euphoria/Doc/Library.doc, the nlsLower() > > function in 'nlseu.zip' takes 310% of the time that the lower() function > > in 'wildcard.e' uses. Furthermore, nlsLower() is only for Windows. > > There is wildcarr.e in my ru_eu_9_.zip package. > It has the additional functions with the bilingual (English/Russian) names. > > English names of those functions are case_la() - for Latin alphabet, > and case_ru() - for Russian alphabet in 5 different encodings. > > I did not test the speed of those functions, they just work for me and I do > not care. > > Try please, any alphabet may be supported that Russian way, I think. Oops... Forgot to say. If you want these Russian libraries to be compatible with the standard Euphoria, run the command : ex_r.exe translat.ex Then you'll have the complete set of these libs with the .ez extention. The wildcarr.ez and others such libs support translator Eu2C and binder. To get Russian program translated to Latin, use Esc t command of the red.ex editor. This way pure Russian red.ex was compiled with pure English Open Watcom 1.1 and binded with pure English CE Euphoria v.2.4. This way *any pure Russian program* runs on any *custom* Euphoria, which supports the standard Euphoria code. Just move all .ez into separate dir and rename them as .e to get this effect. All that just now, naturally, and free. Good Luck again! Regards, Igor Kachan kinz at peterlink.ru
36. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 23, 2004
- 643 views
Igor Kachan wrote: > Hi, Juergen, again! > >> Hi, Juergen! <snip> >> There is wildcarr.e in my ru_eu_9_.zip package. >> It has the additional functions with the bilingual (English/Russian) names. >> >> English names of those functions are case_la() - for Latin alphabet, >> and case_ru() - for Russian alphabet in 5 different encodings. >> >> I did not test the speed of those functions, they just work for me and I do >> not care. >> >> Try please, any alphabet may be supported that Russian way, I think. > > Oops... Forgot to say. > > If you want these Russian libraries to be compatible with > the standard Euphoria, run the command : > > ex_r.exe translat.ex > > Then you'll have the complete set of these libs with > the .ez extention. > > The wildcarr.ez and others such libs support translator Eu2C and binder. > > To get Russian program translated to Latin, use Esc t command of > the red.ex editor. This way pure Russian red.ex was compiled with > pure English Open Watcom 1.1 and binded with pure > English CE Euphoria v.2.4. > > This way *any pure Russian program* runs on any *custom* Euphoria, which > supports the standard Euphoria code. > Just move all .ez into separate dir and rename them as .e to get > this effect. My test program ...
include wildcarr.e
constant CP = {"dos","win","koi","iso","mac"}
sequence s
s = "heLLo"
for i = 1 to length(CP) do
    printf(1, "'%s'\n", {case_ru(0,s,CP[i])})
end for
... still just prints 5 times "heLLo" (German Windows 98). <snip> Regards, Juergen
37. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru> Oct 23, 2004
- 642 views
Hi,

Juergen Luethje wrote:

[snip]
>>> English names of those functions are case_la() - for Latin alphabet,
>>> and case_ru() - for Russian alphabet in 5 different encodings.
[snip]
> My test program ...
>
> include wildcarr.e
> constant CP = {"dos","win","koi","iso","mac"}
> sequence s
> s = "heLLo"
> for i = 1 to length(CP) do
>     printf(1, "'%s'\n", {case_ru(0,s,CP[i])})
> end for
>
> ... still just prints 5 times "heLLo" (German Windows 98).
>
> <snip>

There is case_la() for Latin alphabet, not case_ru(). See please on top once more. Your "heLLo" is in Latin alphabet.

I think, you can make case_gr() for German alphabet, someone - case_fr() for France, and so on. case_ru() is Russian Cyrillic. There are no any "dos","win","koi","iso","mac" for pure Latin. Pure Latin is just ASCII, A..Z, a..z.

Do you see now?

Regards, Igor Kachan kinz at peterlink.ru
38. Re: Small feature request for future EU versions
- Posted by "Juergen Luethje" <j.lue at gmx.de> Oct 23, 2004
- 642 views
Igor Kachan wrote:
> Juergen Luethje wrote:
>
> [snip]
>>>> English names of those functions are case_la() - for Latin alphabet,
>>>> and case_ru() - for Russian alphabet in 5 different encodings.
> [snip]
>
>> My test program ...
>>
>> include wildcarr.e
>> constant CP = {"dos","win","koi","iso","mac"}
>> sequence s
>> s = "heLLo"
>> for i = 1 to length(CP) do
>>     printf(1, "'%s'\n", {case_ru(0,s,CP[i])})
>> end for
>>
>> ... still just prints 5 times "heLLo" (German Windows 98).
>>
>> <snip>
>
> There is case_la() for Latin alphabet, not case_ru(). See please on top
> once more.
> Your "heLLo" is in Latin alphabet.

case_la() is just Euphoria's standard lower() and upper() combined in 1 function. I have that already, and as has been discussed here, this doesn't handle special characters such as the German umlauts. So it is useless for me.

> I think, you can make case_gr() for German alphabet,

As I wrote at the beginning of this thread, I already *had* made a function that is able to handle German characters. I thought you would provide me another function. Now I downloaded all that Russian stuff just so that you tell me, I shall write my own function???

> someone - case_fr()
> for France, and so on. case_ru() is Russian Cyrillic.
> There are no any "dos","win","koi","iso","mac" for pure Latin.
> Pure Latin is just ASCII, A..Z, a..z.
> Do you see now?

Regards, Juergen
39. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru> Oct 24, 2004
- 660 views
Juergen Luethje wrote: [snip] > case_la() is just Euphoria's standard lower() and upper() combined in 1 > function. I have that already, and as has been discussed here, this > doesn't handle special characters such as the German umlauts. So it is > useless for me. OK, but it is useful for me to handle *all* languages with pure Latin alphabet. Maybe, it is useful for someone else as an example of "delete double" method. There are many different people with different programming skills here. Some of them just do not know that lower() and upper() handle pure Latin only. This list is for the free technical support of any user of Euphoria programming language, for PD Euphoria users and for CE Euphoria users, as it is stated on the Euphoria home page. Well, we have done our job for pure Latin - there are enough available functions, I think. Let us go to the original national alphabets. > > I think, you can make case_gr() for German alphabet, > > As I wrote at the beginning of this thread, I already *had* made a > function that is able to handle German characters. Very well, I do understand that. Sorry, I do not know your function yet, but I am *sure* it handles German characters correctly -- you are German yourself. But if German language has different code pages for DOS (OEM) and for Windows (ANSI), your function has to handle *both* code pages correctly. I can not create this function for you - I just do not know German, and the standard code pages are more or less international - for not single native language. Say, 1251 Windows CP, supports all Cyrillic languages - Russian, Ukrainian, Belorussian, Tatar and so on, but every one of them has its own alphabet(!). And functions must handle *every* native alphabet of given code page. And many people just do not know, that they must handle their alphabets for DOS and for Windows differently, not saying about Linux and Mac. > I thought you would provide me another function. 
Now I downloaded all > that Russian stuff just so that you tell me, I shall write my own > function??? Why you *shall*, if you have your function *already done* for one or for all German code pages? I just think that you can revise your function now with help from that Russian stuff and correct your function for different German code pages as it is already done for Russian. RDS doesn't know German nor Russian well enough to provide the case_ge() Euphoria function for you and the case_ru() function for me. So, me wrote, remember please - try please, any alphabet may be supported that Russian *way*, I think. Me did not provide the universal function for any alphabet, but a *way* to make the concrete function for concrete alphabet. If there is the case_ru() function for all code pages in Russian, why not to have case_ge() for German, case_fr() for French, case_gr() for Greek and so on? Without any confusing with usage of some universal function? The universal function is possible, but it requires dozens and dozens of the "if end if" statements and it requires the knowledge of not only code pages, but plus national alphabets, first of all to be created. These functions are simple enough if you make them for your native languages, but they are more or less buggy, if you use some universal stuff. For example, my case_ru() doesn't handle Russian letters E and e with two dots above for now, just becouse of some issues with the reference stuff. And I can not discuss these issues with most of, say, Germans. Who care? RDS? To do some custom job for their free PD production? Good will and volunteers are needed here. This is the only way to support the excellent PD thing. Not Shareware, but Public Domain thing without expiry time of evaluation or such. > > someone - case_fr() > > for France, and so on. case_ru() is Russian Cyrillic. > > There are no any "dos","win","koi","iso","mac" for pure Latin. > > Pure Latin is just ASCII, A..Z, a..z. > > Do you see now? 
So, we have many alternatives just now - to use existing system functions of Windows and Linux, and to use the native Euphoria functions -- yours one for German and my case_la(), case_ru() for Latin and for 6 different Russians, including specialized for Euphoria very rare Latinic Russian. As I can suppose, RDS itself potentially may provide something native for English, Japanese and Esperanto - but it seems to me, there is Latin alphabet only for that task, lower() and upper() already exist. Done! Once more - good will and volunteers are needed, if you want the native Euphoria functions for this task. The task is not very difficult if you do know your native alphabet, be sure, dear End Users. Good Luck! Regards, Igor Kachan kinz at peterlink.ru
40. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru> Oct 25, 2004
- 649 views
Hi, Dear EU users! Me wrote: [snip] > Me did not provide the universal function for any alphabet, > but a *way* to make the concrete function for concrete alphabet. [snip] > The task is not very difficult if you do know your native alphabet, > be sure, dear End Users. Let us try to use that way to make the concrete (i.e. specific, only this one, not any other) function for concrete well known alphabet -- pure Latin -- alphabet_la, just for example.
sequence alphabet_la
alphabet_la="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"

global function case_LA(integer c, object x)
-- case_LA -- to not confuse with the existing case_la() function
-- convert Latin atom or sequence to upper or to lower case
    integer n
    sequence A, a
    if c then
        A=alphabet_la[1 ..26]
        a=alphabet_la[27..52]
    else
        a=alphabet_la[1 ..26]
        A=alphabet_la[27..52]
    end if
    n = 0
    if atom(x) then
        n=find(x,a)
        if n then
            return A[n]
        else
            return x -- not in the alphabet: return it unchanged
        end if
    else
        for i=1 to length(x) do
            n=find(x[i],a)
            if n then
                x[i] = A[n]
            end if
        end for
    end if
    return x
end function -- case_LA()

puts(1, alphabet_la & '\n')
puts(1, case_LA(1, alphabet_la) & '\n')
puts(1, case_LA(0, alphabet_la) & '\n')
Just replace the alphabet_la sequence with your native alphabet and you will have a concrete function for your native language and for your current code page. If you see your alphabet correctly in your editor, all right. Do not forget, the results on DOS and Windows may be different.

Say,

sequence alphabet_mj
alphabet_mj = "......place here The Great Mumbo Jumbo Alphabet....."
global function case_mj(integer c, object x)

And so on. Do not forget:

A=alphabet_mj[1 .. not 26, but needed number]
a=alphabet_mj[not 27, but needed number .. not 52, but needed number]
-- Just first and second half.

... is easy peasy, NO? -- by Pete Lomax

But I just now found one old bug in my case_ru() function. Thanks for your questions!

Regards, Igor Kachan kinz at peterlink.ru
41. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru> Oct 26, 2004
- 665 views
Ricardo Forno wrote:

[snip]
> Hi, Igor.
> Your function is slow compared to other approaches.
> Since in Euphoria (and also in C and other languages) a character is just a
> number, may I suggest a change?
> I think it would be much faster having two 256-character sequences, one
> representing the "translation" of the other (characters in the same
> position), and using the input characters as indexes.
> This way, you will get not only a way to translate between upper and lower
> case, but also (changing the sequences) to translate ASCII to EBCDIC or
> vice-versa, or any translation character by character you want.
> Regards.

Hi, Ricardo!

Yes, my case_LA() function is twice as slow as the standard upper() and lower(). I tested them both just now.

I like to speed up things, but I do not have this problem with conversion or translation. I translate all bilingual EU libs into Latinic Russian in a small fraction of a second and never wait for the results. Esc t or Esc e in red.ex and that final beep sounds.

I like your suggestion, but I just do not see now how to make it, this way, a stable template for *any* alphabet. But the case_LA() function is a template for any possible alphabet and any code page just now, as far as I can see.

Some alphabets have no case at all, some alphabets have different numbers of upper and lower letters. For example, computer Russian has 3 extra letters in upper case, which are absent in Russian canonical grammar.

But speed is a good thing, yes, Ricardo. Try, maybe you can make such a fast & simple template. For now, I can not imagine another possible function.

Regards, Igor Kachan kinz at peterlink.ru
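Ricardo's suggestion quoted above (translation sequences indexed directly by character code) could look roughly like the following. This is only a sketch under stated assumptions: single-byte text with codes 0..255, Euphoria 2.4-style syntax, and hypothetical names build_table() and translate() that do not come from any existing library. Each table starts as the identity mapping, so letters that have no counterpart in the other case are simply left out of the alphabet strings and pass through unchanged.

```euphoria
-- Sketch only: build_table() and translate() are hypothetical names.
-- Assumes single-byte text, i.e. character codes 0..255.

function build_table(sequence src, sequence dst)
    -- src and dst must have the same length; src[i] maps to dst[i].
    sequence table
    table = repeat(0, 256)
    -- Start with the identity mapping: entry c+1 holds code c.
    for i = 1 to 256 do
        table[i] = i - 1
    end for
    -- Overwrite only the entries for characters that have a counterpart.
    for i = 1 to length(src) do
        table[src[i] + 1] = dst[i]
    end for
    return table
end function

function translate(sequence table, object x)
    -- Works on an atom or on a sequence nested to any depth.
    if atom(x) then
        if integer(x) and x >= 0 and x <= 255 then
            return table[x + 1]
        end if
        return x   -- leave anything outside 0..255 untouched
    end if
    for i = 1 to length(x) do
        x[i] = translate(table, x[i])
    end for
    return x
end function

constant LO = "abcdefghijklmnopqrstuvwxyz",
         UP = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
constant TO_UPPER = build_table(LO, UP),
         TO_LOWER = build_table(UP, LO)

puts(1, translate(TO_UPPER, "heLLo") & '\n')   -- HELLO
puts(1, translate(TO_LOWER, "heLLo") & '\n')   -- hello
```

Supporting another alphabet or code page under these assumptions would mean passing a different pair of equal-length strings to build_table(); the same routine would also cover character-by-character translations such as ASCII to EBCDIC.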
42. Re: Small feature request for future EU versions
- Posted by Patrick Barnes <mrtrick at gmail.com> Oct 27, 2004
- 654 views
What I think he means is this:

constant LA_up = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
constant LA_lo = "abcdefghijklmnopqrstuvwxyz"
constant LA_diff = LA_up - LA_lo

global function case_LA(integer c, object x)
    integer n
    if atom(x) then
        if c then
            n = find( x, LA_lo )
            if n then
                x += LA_diff[n]
            end if
        else
            n = find( x, LA_up )
            if n then
                x -= LA_diff[n]
            end if
        end if
    else
        for i = 1 to length(x) do
            x[i] = case_LA(c, x[i])
        end for
    end if
    return x
end function

> I like your suggestion, but I just do not see the solution how to make
> this way the stable templet for *any* alphabet now.

Works for any alphabet and code page, and is faster, because it doesn't have to keep slicing the alphabet sequences.

> Some alphabets have no case at all, some alphabets have different
> numbers of upper and lower letters.
> For example, computer Russian has 3 extra letters in upper case, which
> are absent in Russian canonical grammar.

The limitation of the above function is that LA_up and LA_lo must be the same length... what do you mean by 3 extra letters? What if you try to convert them to lower case? If it should just leave them as upper case, that's fine - just leave them out of the function.

***The above function is completely untested. It should not be used in nuclear reactors, medical life support systems, or anywhere where failure may cause injury***

-- MrTrick
43. Re: Small feature request for future EU versions
- Posted by Derek Parnell <ddparnell at bigpond.com> Oct 27, 2004
- 670 views
Patrick Barnes wrote: > > What I think he means is this: > > constant LA_up= "ABCDEFGHIJKLMNOPQRSTUVWXYZ" > constant LA_lo = "abcdefghijklmnopqrstuvwxyz" > constant LA_diff = LA_up - LA_lo > > global function case_LA(integer c, object x) > integer n > if atom(x) then > if c then > n = find( x, LA_lo ) > if n then > x += LA_diff[n] OR instead ... x = LA_up[n] [snip] I have a generic case conversion that can work for most alphabets. I'll submit it to the contributions page. [snip] > > Some alphabets have no case at all, some alphabets have different > > numbers of upper and lower letters. > > For example, computer Russian has 3 extra letters in upper case, which > > are absent in Russian canonical grammar. I believe German and some other language has a situation where a single lower-case letter gets changed to two characters when converted to uppercase - the German s-sharp character 'ß' changes to 'SS' when converted. -- Derek Parnell Melbourne, Australia
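The sharp-s case Derek mentions shows why a strict one-character-to-one-character table cannot cover every alphabet: the result can be longer than the input. A minimal sketch, assuming a flat (non-nested) string in Windows-1252 or ISO 8859-1, where the sharp s has code 223 (#DF); upper_de() is a hypothetical name, and the umlauts are deliberately left out to keep the example short.

```euphoria
-- Sketch only: upper_de() is a hypothetical name, not a library routine.
-- Assumes a flat string in Windows-1252 / ISO 8859-1.
constant SHARP_S = #DF   -- code of the sharp s in those code pages

function upper_de(sequence s)
    sequence result
    integer c
    result = ""
    for i = 1 to length(s) do
        c = s[i]
        if c = SHARP_S then
            result &= "SS"       -- one character becomes two
        elsif c >= 'a' and c <= 'z' then
            result &= c - 32     -- plain ASCII letters
        else
            result &= c          -- everything else is copied as-is
        end if
    end for
    return result
end function

puts(1, upper_de("stra" & SHARP_S & "e") & '\n')   -- STRASSE
```

Because the output is built by appending rather than by replacing elements in place, the function can return a sequence of a different length than it was given.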
44. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru> Oct 27, 2004
- 673 views
Patrick Barnes wrote: [snip] > > I like your suggestion, but I just do not see the solution how to make > > this way the stable templet for *any* alphabet now. > > Works for any alphabet and code page, and is faster, because it > doesn't have to keep slicing the alphabet sequences. Ok, I see "...for any alphabet...". But my_case_LA() makes just single slicing on the alphabet_la sequence, and yours_case_LA() has 3 supporting sequences. Can you rename yous_case_LA() as case_La() or such? Just for short, to not confuse? > > Some alphabets have no case at all, some alphabets have different > > numbers of upper and lower letters. > > For example, computer Russian has 3 extra letters in upper case, which > > are absent in Russian canonical grammar. > > The limitation of the above function is that LA_up and LA_lo must be > the same length... what do you mean by 3 extra letters? What if you > try to convert them to lower case? If it should just leave them as > upper case, that's fine - just leave them out of the function. Ok, I see "The limitation..." . What about "...for any alphabet..."? Well, there are 3 lower-case letters in Russian, which can not stand on the first place in a word. There are no such the words in Russian at all. So, you can not find them in any normal Russian text. But some more or less artificial computer texts, like "ruSSIan" or "heLLo" can include those additional upper-case letters. And I can just include those letters into the alphabet_ru sequence or exclude them and get the functions for canonical and for artificial Russian languages. Ok, Derek submitted his new library to Rob for these things. I think, there is enough such a stuff in the RDS archives now to not force Rob to learn our own crazy mumbos_jumbos. Regards, Igor Kachan kinz at peterlink.ru
45. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru> Oct 27, 2004
- 674 views
Derek Parnell wrote: [snip] [snip] > I have a generic case conversion that can work for most alphabets. > I'll submit it to the contributions page. [snip] [snip] I have downloaded yours library, it works for me, thanks. But it seems to me, your function can not make the *selective* conversion of bilingual texts of the same code page, but of different alphabets. Say, I can run my stuff the following way:
text = case_la(Lo, text)        -- to get all pure Latin letters lower-case
text = case_ru(Lo, text, "win") -- to get all common Russian/Ukrainian letters lower-case
text = case_ua(Up, text, "win") -- to get all pure Ukrainian letters upper-case
Couldn't you? Sorry, good appetite, yes. Regards, Igor Kachan kinz at peterlink.ru
46. Re: Small feature request for future EU versions
- Posted by Derek Parnell <ddparnell at bigpond.com> Oct 27, 2004
- 670 views
Igor Kachan wrote:
>
> Derek Parnell wrote:
>
> [snip]
>
> > I have a generic case conversion that can work for most alphabets.
> > I'll submit it to the contributions page.
>
> [snip]
>
> I have downloaded yours library, it works for me, thanks.

You are welcome.

> But it seems to me, your function can not make the *selective*
> conversion of bilingual texts of the same code page,
> but of different alphabets.

I'm sorry but I don't understand what you are saying. What does "*selective* conversion" mean?

I think by "bilingual texts of the same code page, but of different alphabets." you mean some text in which there is a mixture of characters from different alphabets, but each character is still from the same code page.

I believe my functions can handle that. Something like "Outside window = fenêtre extérieur: garçon = boy" should come out from Obj_upper() as "OUTSIDE WINDOW = FENÊTRE EXTÉRIEUR: GARÇON = BOY"

So long as each character has a unique code point in the code page, regardless of its language, my functions can help.

> Say, I can run my stuff the following way:
>
> text = case_la(Lo, text) -- to get all pure Latin letters lower-case
> text = case_ru(Lo, text, "win") -- to get all common Russian/Ukrainian letters lower-case
> text = case_ua(Up, text, "win") -- to get all pure Ukrainian letters upper-case
>
> Couldn't you?

No, because I don't know those alphabets. However, you could using my functions. Use the SetCase procedure to define the mappings for those alphabets. Something like ...

SetCase( "абвгд...эюя", "АБВГД . . . ЭЮЯ", -1)

and then use the Windows Cyrillic code page to get the correct display glyphs.

-- Derek Parnell Melbourne, Australia
47. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru> Oct 27, 2004
- 656 views
Derek Parnell wrote:
[snip]
> > I have downloaded yours library, it works for me, thanks.
> You are welcome.

Thanks.

> > But it seems to me, your function can not make the *selective*
> > conversion of bilingual texts of the same code page,
> > but of different alphabets.
> I'm sorry but I don't understand what you are saying. What does
> "*selective* conversion" mean?
> I think by "bilingual texts of the same code page, but of different
> alphabets." you mean some text in which there is a mixture of characters
> from different alphabets, but each character is still from the same code
> page.

Yes, you are right. To be more specific and clear, let us take a concrete example of how my case_ru() function works. It takes 3 parameters, say:

text = case_ru(1, text, "win")

1 - upper, 0 - lower.
text - the sequence with the text I want to process.
"win" - one of 5 options; the others are "dos", "iso", "koi", "mac". It names the code page of the text sequence. "win" stands for the Windows 1251 common Cyrillic code page.

The user has to know the text's code page. It is not the current machine code page, but the code page of the concrete text.

So this function only processes the Russian alphabet on the common Cyrillic code page and doesn't affect the specific letters of, say, Ukrainian in bilingual Russian/Ukrainian texts. The same holds for the "dos", "iso", "koi" and "mac" Cyrillic code pages.

This way I can use my case_ru() function on any Euphoria platform to process Russian texts from any other Euphoria platform. It is really generic across Euphoria platforms, can process any given Russian text and doesn't depend on the current platform.

So the case_ua() function, if someone wants to have it, may use the full Ukrainian alphabet with all the common Russian/Ukrainian letters, or just the few specific Ukrainian letters, to process only these letters in bilingual (Russian/Ukrainian) Cyrillic texts on any Euphoria platform.

> I believe my functions can handle that. Something like
> "Outside window = fenêtre extérieur: garçon = boy"
> should come out from Obj_upper() as
> "OUTSIDE WINDOW = FENÊTRE EXTÉRIEUR: GARÇON = BOY"
> So long as each character has a unique code point in the code page,
> regardless of its language, my functions can help.

Are you saying your function processes all alphabets of a given code page at once? Yes, as far as I can see. If so, it cannot process Win_Western texts selectively by default and requires some *additional* work from the local programmer. The same is true of my case_mj() function, except that it is selective by default, if you want to prepare and use it that way.

I think the only productive way to make these functions really useful is to get local programmers to write functions for their native alphabets, code pages and languages.

> > Say, I can run my stuff the following way:
> >
> > <eucode>
> > text = case_la(Lo, text) -- to get all pure Latin letters lower-case
> > text = case_ru(Lo, text, "win") -- to get all common Russian/Ukrainian letters lower-case
> > text = case_ua(Up, text, "win") -- to get all pure Ukrainian letters upper-case
> > </eucode>
> > Couldn't you?
> No, because I don't know those alphabets. However, you could using my
> functions. Use the SetCase procedure to define the mappings for those
> alphabets. Something like ...
> SetCase( "абвгд...эюя",
> "АБВГД . . . ЭЮЯ", -1)
> and then use the Windows Cyrillic code page to get the correct display
> glyphs.

OK, I think I understand correctly. If a user of your function wants to process his text selectively and not affect the letters of some second possible language of the given code page, he/she has to make different tables for the SetCase() function and call it twice: first for the first language, then, after the first pass, for the second one. Right?

But what are the strange hex codes/unicodes in your example above? It seems to me that the SetCase() function doesn't handle these codes yet, and that this is some reserve for the future.

Regards,
Igor Kachan
kinz at peterlink.ru
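The two-pass idea in the question above can be sketched with a small table-driven helper. This is only an illustration under stated assumptions: map_chars() and the sample strings are hypothetical names, not part of Derek's library or of case_ru(), and plain ASCII letters stand in for the two alphabets that share one code page.

```euphoria
-- Sketch only: map_chars() is a hypothetical helper, not a library routine.
-- Every character of 'text' that occurs in 'src' is replaced by the character
-- at the same position in 'dst'; all other characters are left untouched.
function map_chars(sequence text, sequence src, sequence dst)
    integer p
    for i = 1 to length(text) do
        p = find(text[i], src)
        if p then
            text[i] = dst[p]
        end if
    end for
    return text
end function

sequence text
text = "abc XYZ"
text = map_chars(text, "abc", "ABC")  -- first pass: first "alphabet" to upper case
text = map_chars(text, "XYZ", "xyz")  -- second pass: second "alphabet" to lower case
puts(1, text & '\n')                  -- prints: ABC xyz
```

Each pass only touches the letters listed in its own table, which is what makes the conversion selective; the space, and any letter belonging to the other table, passes through unchanged.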
48. Re: Small feature request for future EU versions
- Posted by Patrick Barnes <mrtrick at gmail.com> Oct 28, 2004
- 653 views
On Wed, 27 Oct 2004 00:56:24 -0300, Ricardo Forno <rforno at uyuyuy.com> wrote:
> Well... not exactly this.
> find() is a bit slow.
> What I suggest is having a 256-character sequence from which you take the
> character corresponding to the one you want to translate, used as an
> index.
> For example, you know that 'A' is equal to 65 (its ASCII value). Assume then
> that sequence X contains an 'a' in position 66, a 'b' in position 67, and so
> on. So, to translate sequence Z from upper to lower case, you will code:
>
> for i = 1 to length(Z) do
>     Z[i] = X[Z[i]+1]
> end for

Like this:
constant UPPER = 1
constant LOWER = 2
constant CODE_PAGE_SIZE = 255 -- maximum value of a character

-- initial setup
sequence to_uppercase, to_lowercase
to_uppercase = repeat( 0, CODE_PAGE_SIZE )
to_lowercase = repeat( 0, CODE_PAGE_SIZE )

constant alphabet = {
    {'A', 'a'},
    {'B', 'b'}
    -- ...etc, for entire alphabet
}

-- populate translation tables with data
for i = 1 to length(alphabet) do
    to_uppercase[ alphabet[i][LOWER] ] = alphabet[i][UPPER]
    to_lowercase[ alphabet[i][UPPER] ] = alphabet[i][LOWER]
end for

-- the actual function
function change_case( integer case, object z )
    integer c
    if sequence(z) then
        for i = 1 to length(z) do
            z[i] = change_case( case, z[i] )
        end for
        return z -- every element has already been converted
--  elsif not z then -- don't do any transform on null chars
--      return z
    elsif case = LOWER then
        c = to_lowercase[z]
    else -- assume case = UPPER
        c = to_uppercase[z]
    end if
    -- a zero table entry means "no mapping": keep the character as it is
    if c then
        return c
    else
        return z
    end if
end function
And there you go. The only way to make it faster, I think, is to use a non-recursive algorithm and maybe restrict it to 1d arrays only.

The method's only assumption is that no zero-value characters exist (the null char?). To check for this, just uncomment the checking code... might make it run a little slower.

What do you think of this, Igor? Handles any odd combination of alphabets you want to throw at it, I think.

--
MrTrick
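A flat, non-recursive variant of the idea might look like the sketch below. It assumes the input is a one-dimensional sequence of integer character codes; LATIN_PAIRS, lower_table and lower_flat are hypothetical names chosen for this illustration, and only lower-casing is shown.

```euphoria
-- Sketch only: a flat (1-D), non-recursive version of the lookup-table idea.
-- Upper-case letters sit at odd positions, their lower-case partners follow.
constant LATIN_PAIRS = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz"

sequence lower_table
lower_table = repeat(0, 255)  -- index = character code 1..255, 0 = no mapping

for i = 1 to length(LATIN_PAIRS) - 1 by 2 do
    lower_table[LATIN_PAIRS[i]] = LATIN_PAIRS[i+1]
end for

function lower_flat(sequence s)
    integer c
    for i = 1 to length(s) do
        c = s[i]
        -- the range test also skips zero-value characters
        if c >= 1 and c <= 255 then
            if lower_table[c] then
                s[i] = lower_table[c]
            end if
        end if
    end for
    return s
end function

puts(1, lower_flat("Hello, World!") & '\n')  -- prints: hello, world!
```

The table is filled once at start-up, so each call costs one index lookup per character and no function call per element.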
49. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru> Oct 28, 2004
- 668 views
Patrick Barnes wrote:
[snip]
> And there you go. The only way to make it faster, I think, is to use a
> non-recursive algorithm and maybe restrict it to 1d arrays only.
>
> The method's only assumption is that no zero-value characters exist
> (the null char?). To check for this, just uncomment the checking
> code... might make it run a little slower.
>
> What do you think of this Igor? Handles any odd combination of
> alphabets you want to throw at it, I think.
>
> --
> MrTrick

Using all of your suggestions and questions, I have put together the case_xx() function, which is twice as fast as the standard case_la(), takes any alphabet and is selective by default. It works with my 5 crazy Russians and with pure Mumbo Jumbo as well. Results: 0.3 sec for a 3.5M file on a 1.8GHz box. Please try it:
sequence table
table = repeat(0,256)

global constant
    alphabet_LA = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz",
    alphabet_MJ = ".....place here The Great Mumbo Jumbo Alphabet...."

global function case_xx(integer c, object x, sequence alphabet)
    integer A, a
    table *= 0 -- to clear table for next use with another alphabet
    for i=1 to length(alphabet) - 1 by 2 do
        A = alphabet[i]
        a = alphabet[i+1]
        if c then
            table[A] = A
            table[a] = A
        else
            table[A] = a
            table[a] = a
        end if
    end for
    if atom(x) then
        if x then
            return table[x]
        else
            return x
        end if
    else
        for i=1 to length(x) do
            a = x[i]
            if a then -- to convert binary
                x[i] = table[a]
            end if
        end for
    end if
    return x
end function

puts(1, case_xx(1, alphabet_LA, alphabet_LA) & '\n')
puts(1, case_xx(0, alphabet_LA, alphabet_LA) & '\n')
puts(1, alphabet_LA & '\n')
puts(1, case_xx(0, 'A', alphabet_LA)& '\n')
----
puts(1, case_xx(1, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, case_xx(0, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, alphabet_MJ & '\n')
----
puts(1, case_xx(0, 'M', alphabet_MJ) & '\n')
puts(1, case_xx(0, 'u', alphabet_MJ) & '\n')
----
puts(1, case_xx(0, case_xx(1, alphabet_LA, alphabet_LA), alphabet_LA) & '\n')
----
Many thanks to all! Just add your native alphabet and use it (at your own risk).

Regards,
Igor Kachan
kinz at peterlink.ru
50. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru> Oct 28, 2004
- 643 views
Oops... I wrote:

Patrick Barnes wrote:
[snip]
> And there you go. The only way to make it faster, I think, is to use a
> non-recursive algorithm and maybe restrict it to 1d arrays only.
>
> The method's only assumption is that no zero-value characters exist
> (the null char?). To check for this, just uncomment the checking
> code... might make it run a little slower.
>
> What do you think of this Igor? Handles any odd combination of
> alphabets you want to throw at it, I think.
>
> --
> MrTrick

Using all of your suggestions and questions, I have put together the case_xx() function, which is twice as fast as the standard case_la(), takes any alphabet and is selective by default. It works with my 5 crazy Russians and with pure Mumbo Jumbo as well. Results: 0.3 sec for a 3.5M file on a 1.8GHz box. Please try it:
sequence table
table = repeat(0,256)

global constant
    alphabet_LA = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz",
    alphabet_MJ = ".....place here The Great Mumbo Jumbo Alphabet...."

global function case_xx(integer c, object x, sequence alphabet)
    integer A, a
    table *= 0 -- to clear table for next use with another alphabet
    for i=1 to length(alphabet) - 1 by 2 do
        A = alphabet[i]
        a = alphabet[i+1]
        if c then
            table[A] = A
            table[a] = A
        else
            table[A] = a
            table[a] = a
        end if
    end for
    if atom(x) then
        if x then
            return table[x]
        else
            return x
        end if
    else
        for i=1 to length(x) do
            a = x[i]
            if a then -- to convert binary
                if table[a] then -- to not affect others
                    x[i] = table[a]
                end if --
            end if
        end for
    end if
    return x
end function

puts(1, case_xx(1, alphabet_LA, alphabet_LA) & '\n')
puts(1, case_xx(0, alphabet_LA, alphabet_LA) & '\n')
puts(1, alphabet_LA & '\n')
puts(1, case_xx(0, 'A', alphabet_LA)& '\n')
----
puts(1, case_xx(1, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, case_xx(0, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, alphabet_MJ & '\n')
----
puts(1, case_xx(0, 'M', alphabet_MJ) & '\n')
puts(1, case_xx(0, 'u', alphabet_MJ) & '\n')
----
puts(1, case_xx(0, case_xx(1, alphabet_LA, alphabet_LA), alphabet_LA) & '\n')
----
Many thanks to all! Just add your native alphabet and use it (at your own risk).

Oops... Just the next bug fix, see above.

Regards,
Igor Kachan
kinz at peterlink.ru
51. Re: Small feature request for future EU versions
- Posted by "Igor Kachan" <kinz at peterlink.ru> Oct 28, 2004
- 655 views
Derek Parnell and Juergen Luethje wrote:
[snip]
> > Have you an estimate for how frequently the operators have actually been
> > used in this manner? I think there are two instances in the RDS libraries,
> > the case conversion routines (BTW which only work on a limited set of
> > characters)
>
> That's why they are useless for text written in many languages other
> than English. That is strange for a product, that is intended for
> international use, especially because it is very easy to write better case
> conversion routines.
> Case conversion routines written without using the operators in the
> manner mentioned above are also faster. That's why RDS themselves don't
> use their own library routines for case conversion, when speed is
> important (Euphoria/bin/guru.ex, Euphoria/bin/search.ex) ...

Hey, customers, please take the final function, written as a "poem" in Euphoria. Almost all bugs are fixed. The last one is just a victim of art.

Let us wait for the German, French, Greek, Turkish, Polish, Tatar, Ukrainian, Mongol and other alphabets. Mumbo Jumbo is ready. I did not place the 5 Russian alphabets here - they act like an old USSR torpedo on the RDS MessageBoard.
global constant
    alphabet_LA = "AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz",
    alphabet_MJ = ".....place here The Great Mumbo Jumbo Alphabet...."

global function case_xx(integer c, object x, sequence alphabet)
    integer A, a
    sequence table
    table = repeat(0,256) -- 0 means "not in this alphabet"
    for i=1 to length(alphabet) - 1 by 2 do
        A = alphabet[i]
        a = alphabet[i+1]
        if c then
            table[A] = A
            table[a] = A
        else
            table[A] = a
            table[a] = a
        end if
    end for
    if atom(x) then
        if x then
            a = table[x]
            if a then -- leave characters outside the alphabet unchanged
                x = a
            end if
        end if
    else
        for i=1 to length(x) do
            a = x[i]
            if a then
                if table[a] then
                    x[i] = table[a]
                end if
            end if
        end for
    end if
    return x
end function

puts(1, case_xx(1, alphabet_LA, alphabet_LA) & '\n')
puts(1, case_xx(0, alphabet_LA, alphabet_LA) & '\n')
puts(1, alphabet_LA & '\n')
puts(1, case_xx(0, 'A', alphabet_LA)& '\n')
----
puts(1, case_xx(1, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, case_xx(0, alphabet_MJ, alphabet_MJ) & '\n')
puts(1, alphabet_MJ & '\n')
----
puts(1, case_xx(0, 'M', alphabet_MJ) & '\n')
puts(1, case_xx(0, 'u', alphabet_MJ) & '\n')
----
puts(1, case_xx(0, case_xx(1, alphabet_LA, alphabet_LA), alphabet_LA) & '\n')
I'm waiting for the bug reports. Let us finish this too long thread. Ok?

Regards,
Igor Kachan
kinz at peterlink.ru