1. Accented characters in identifiers
- Posted by CChris <christian.cuvier at agricultur?.?ouv.fr> May 27, 2008
- 799 views
Currently, if you use characters in the 128-255 range in identifiers, you will get incongruous error messages, like "Result of a function must be assigned", just because you used a ó.

This comes from the shrouding method Euphoria used long ago. Rob himself admitted that supporting it was becoming obsolete.

Implementation-wise, the move is simple: change the character class of all those chars from KEYWORD or BUILTIN to LETTER in the scanner. Nothing else is needed (a couple of if branches and constants will become dead code).

Since characters that display as a letter in one code page may display differently in another, I think including the whole 128..255 range as valid characters is better than restricting it. If a char is valid somewhere, it must be valid anywhere, even if it displays funny.

What do you think?

CChris
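The scanner change proposed in this post can be illustrated with a small table-driven sketch. It is written in Python purely for illustration and is not the Euphoria scanner: the class names, the table layout, and the helper names `build_char_class` and `is_identifier` are assumptions, and underscore handling is simplified.

```python
# Hypothetical character classes; the real scanner's names and values may differ.
LETTER, DIGIT, KEYWORD, OTHER = "LETTER", "DIGIT", "KEYWORD", "OTHER"

def build_char_class(high_bytes_are_letters):
    """Return a 256-entry table mapping each byte value to a character class."""
    table = [OTHER] * 256
    for c in range(ord("a"), ord("z") + 1):
        table[c] = LETTER
    for c in range(ord("A"), ord("Z") + 1):
        table[c] = LETTER
    for c in range(ord("0"), ord("9") + 1):
        table[c] = DIGIT
    table[ord("_")] = LETTER  # simplification: treat underscore as a letter
    # Old behaviour: bytes 128..255 are reserved for shrouded keyword codes.
    # Proposed behaviour: they are ordinary identifier letters.
    high_class = LETTER if high_bytes_are_letters else KEYWORD
    for c in range(128, 256):
        table[c] = high_class
    return table

def is_identifier(raw, table):
    """True if the bytes start with a LETTER and contain only LETTER/DIGIT bytes."""
    if not raw or table[raw[0]] != LETTER:
        return False
    return all(table[b] in (LETTER, DIGIT) for b in raw)

name = "señal".encode("latin-1")  # 'ñ' is the single byte 0xF1 in Latin-1
old_table = build_char_class(high_bytes_are_letters=False)
new_table = build_char_class(high_bytes_are_letters=True)
print(is_identifier(name, old_table))  # False
print(is_identifier(name, new_table))  # True
```

The only difference between the two tables is the class assigned to bytes 128..255, which matches the claim that nothing else in the scanner needs to change.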
2. Re: Accented characters in identifiers
- Posted by Jeremy Cowgar <jeremy at cowga?.c?m> May 27, 2008
- 755 views
CChris wrote:
>
> Currently, if you use characters in the 128-255 range in identifiers, you will
> get incongruous error messages, like "Result of a function must be assigned"
> because you used a ó.
>
> What do you think?
>

Being a Unicode (I should say any non-ASCII) dummy, how would this affect people reading and using code with such characters?

I do not want to discriminate at all, but will we create two divisions in the libraries and also in code contributions to the Euphoria core?

For instance, I know English and a tiny, tiny bit of Spanish. I'm sure others know multiple languages, but spoken languages were just something I've never been interested in. If a library comes out that does some very cool things, but even its function names use characters > 128 that I don't even know how to type on my keyboard, let alone what they mean, I cannot use it.

Now, this is *obviously* bigger than I. I can understand that if you are not a native English speaker you're probably clenching your fists and steam is rising from your head, but I am just trying to understand the impact. I mean no offense, nor do I mean to say that all programmers should speak English and forget their native tongue. I'm just trying to understand.

--
Jeremy Cowgar
http://jeremy.cowgar.com
3. Re: Accented characters in identifiers
- Posted by ken mortenson <kenneth_john at yaho?.co?> May 27, 2008
- 751 views
Jeremy Cowgar wrote:
> I do not want to discriminate at all...

"Look! They are one people and there is one language for them all... Why now, there is NOTHING that they may have in mind that will be unattainable for them"

You may recognize this quote from the story of Babel. The process of thinking IS discriminating (whatever your native language, call me a bigot.)

Unicode is useful for data, not for code. Otherwise, why not have the language's keywords come in multiple translations? It's because of the principle in that quote above. Do you want things to be attainable? Sometimes you have to limit yourself to achieve great results.

It would be perfectly logical to say, "Let's do all our code in Unicode, for one day we might want to bring out different language versions of our compiler." Except my keyboard expects me to write in English even if other language keyboards are available.

You can do a lot with 127 characters. It's not a valid assumption that you can do a lot more with 65,536 or so.

I shudder to think of the finely crafted Euphoria language trying to become all things to all people. It will never achieve greatness if that's the road taken. Keep in mind I say that while also on a quest for the 'one true language.'

Perhaps we should code in Hebrew?
4. Re: Accented characters in identifiers
- Posted by Salix <salix at fr?emai?.hu> May 27, 2008
- 731 views
Jeremy Cowgar wrote:
>
> CChris wrote:
> >
> > What do you think?
>
> Being a unicode (I should say any non-ascii) dummy, how would this affect
> people
> reading and using code with such characters?

I think it would be great! I would definitely use the special characters. Whenever I write an open library, I try anyway to choose function names that are obvious to the majority of programmers. (Read: English speakers.) But whenever I write code of my own (CGI, database, etc.) I prefer my own language/words/characters. Why not?

Regards,

Salix
(hu-en-de)
5. Re: Accented characters in identifiers
- Posted by Larry Miller <larrymiller at sas?tel.ne?> May 27, 2008
- 717 views
I would have to agree with Jeremy and Ken in their posts.

CChris had suggested that accented characters be permitted in identifiers. While this might be appealing to some, it may cause more trouble than it is worth.

The trouble with code pages is that they all differ in how they interpret characters 128-255. Most of the Latin-based code pages are close, but the same cannot be said of others, such as those for Cyrillic, Greek, etc. If a program used a character in this range, it may display differently on a system with a different code page. Depending on the font and code page used, such characters may not even be readable. I am sure the developers already know this, but others may not.

These problems could be minimized (but not eliminated) if Euphoria were to use some form of Unicode, such as UTF-8. But I don't think that the developers wish to travel that road.

I think it best that identifiers be restricted to characters common to all code pages - ASCII. This may annoy some whose native language is not English, but I think they will understand.

Larry Miller
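The code-page problem described in this post is easy to reproduce. The sketch below uses Python's standard codecs only as an illustration; the choice of the three code pages and the helper name `decode_in` are assumptions made for the example.

```python
def decode_in(raw, codecs_to_try):
    """Decode the same raw bytes under several code pages."""
    return {name: raw.decode(name) for name in codecs_to_try}

# Byte 0xF3 is "ó" in Windows-1252, but other code pages assign it differently.
raw = bytes([0xF3])
views = decode_in(raw, ("cp1252", "cp1251", "cp437"))
for codec, text in views.items():
    print(codec, text)

# Conversely, the same letter is stored as different bytes in different code pages.
print("ó".encode("cp1252"), "ó".encode("cp437"))
```

An identifier saved under one code page therefore keeps the same bytes everywhere, but those bytes may be shown as a Cyrillic letter or a mathematical symbol on another system.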
6. Re: Accented characters in identifiers
- Posted by jacques deschênes <desja at gl?bet?otter.net> May 27, 2008
- 710 views
- Last edited May 28, 2008
I agree with Salix on this. We could give programmers the freedom to use accented characters, but the standard libraries and other code distributed with Euphoria should be restricted to English.

I remember reading a contribution from a Euphoria user whose identifiers, although not accented, were in a language I don't understand. I gave up, because it is hard to read code where all identifiers are from an unknown language. But that is not a problem as long as such code is not part of the distribution. English is the de facto common language on this planet.

jacques

Salix wrote:
>
> Jeremy Cowgar wrote:
> >
> > CChris wrote:
> > >
> > > What do you think?
> >
> > Being a unicode (I should say any non-ascii) dummy, how would this affect
> > people
> > reading and using code with such characters?
>
> I think it would be great! I would definitely use the special characters.
> Whenever I write an open library I try anyway to choose function names
> that are obvious for the majority of the programmers. (See English speakers.)
> But whenever I write a code for my own (CGI, database, etc.) I prefer my
> own language/words/characters. Why not?
>
> Regards,
>
> Salix
> (hu-en-de)
7. Re: Accented characters in identifiers
- Posted by Shawn Pringle <shawn.pringle at gmail.??m> May 28, 2008
- 726 views
I see two sides to this debate.

One says we should use identifiers everyone can type, that is, a subset of ASCII. I don't understand how Ada programmers cope with their character set.

The other says: okay, let's use English identifiers in libraries and core keywords, but in programmers' own code let them use their native tongue.

I would like to add that routines that were intended to be internal to a program sometimes get put into a .e file. Sometimes these .e files get uploaded to the archive through the altruism of the programmer. This would become less likely if the programmer also had to translate the routine names.

I think if we decide to include accented characters we should use a 16-bit Unicode format. The interpreter could branch and do the I/O in a Unicode manner if it finds the byte-order mark at the beginning of the file. Alternatively, we could have a browser-like handling of character sets, where the file sets its encoding at the beginning in a comment.
#!/usr/bin/exu
-- encoding: utf-8
I am not trying to be ironic or sarcastic here but I am just brainstorming what could be implemented. I say so because sometimes, I sense that I come across as sarcastic when I do not mean to. Shawn Pringle B.Sc.
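The two detection ideas in this post (a byte-order mark, or an encoding comment near the top of the file) could be combined roughly as follows. This is a Python sketch for illustration only: the comment syntax is modelled on the example in the post, and the function name `detect_source_encoding`, the two-line search window, and the ASCII default are all assumptions rather than anything Euphoria implements.

```python
import codecs
import re

# Hypothetical marker syntax, modelled on the "-- encoding: utf-8" comment suggested above.
ENCODING_COMMENT = re.compile(rb"--\s*encoding:\s*([A-Za-z0-9_.\-]+)")

BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def detect_source_encoding(data, default="ascii"):
    """Guess the encoding of raw source bytes from a BOM or an encoding comment."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # Look only at the first two lines, so a shebang line may come first.
    head = b"\n".join(data.split(b"\n", 2)[:2])
    match = ENCODING_COMMENT.search(head)
    if match:
        return match.group(1).decode("ascii").lower()
    return default

source = b'#!/usr/bin/exu\n-- encoding: utf-8\nputs(1, "Hello")\n'
print(detect_source_encoding(source))
```

Checking the byte-order mark first means a file that carries one never needs the comment, while plain 8-bit files can still declare a code page explicitly.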
8. Re: Accented characters in identifiers
- Posted by Igor Kachan <kinz at pet?rlink.r?> May 29, 2008
- 733 views
CChris wrote:
>
> Currently, if you use characters in the 128-255 range in identifiers, you will
> get incongruous error messages, like "Result of a function must be assigned"
> because you used a ó.
>
> This comes from the shrouding method Euphoria had been using long ago. Rob
> himself admitted supporting was becoming obsolete.
>
> Implementationwise, the move is simple: change the character class of all
> those chars from KEYWORD or BUILTIN to LETTER in the scanner. Nothing else
> (a couple if branches and constants will become dead code).
>
> Since characters that display as a letter in some code page may display
> differently on another, I think including the whole 128..255 range as valid
> characters is better than restricting it. If a char is valid somewhere, it
> must be valid anywhere, even if it displays funny.
>
> What do you think?

Hi Chris,

It is a very good idea, I think, but its implementation is not so simple. There is a Bilingual Euphoria 2.5 in the Archive. It understands any characters in identifiers, has English and Russian keywords and English or Russian error messages, and can translate the program text from English to Russian and back, but there is still an unknown bug on the Linux platform (DOS32 and WIN32 are very stable, I work on them all the time).

http://www.rapideuphoria.com/ru_eu_11.zip

Sorry, I do not have any spare time to implement these features in 3.2 now - my vegetable garden takes all my summer time.

So please ask Rob for that code, just to see the various details of that interpreter, if you want. That was strongly licensed 2.5 stuff, it was our joint work with Rob, and he didn't want to open that code at that time.

Regards,
Igor Kachan
kinz at peterlink.ru
9. Re: Accented characters in identifiers
- Posted by CChris <christian.cuvier at agricult?re.go?v.fr> May 29, 2008
- 721 views
Igor Kachan wrote:
>
> CChris wrote:
> >
> > Currently, if you use characters in the 128-255 range in identifiers, you
> > will get incongruous error messages, like "Result of a function must be
> > assigned" because you used a ó.
> >
> > This comes from the shrouding method Euphoria had been using long ago. Rob
> > himself admitted supporting was becoming obsolete.
> >
> > Implementationwise, the move is simple: change the character class of all
> > those chars from KEYWORD or BUILTIN to LETTER in the scanner. Nothing else
> > (a couple if branches and constants will become dead code).
> >
> > Since characters that display as a letter in some code page may display
> > differently on another, I think including the whole 128..255 range as valid
> > characters is better than restricting it. If a char is valid somewhere, it
> > must be valid anywhere, even if it displays funny.
> >
> > What do you think?
>
> Hi Chris,
>
> It is very good idea, I think, but its implementation is not
> too simple. There is Bilingual Euphoria 2.5 in the Archive.
> It understands any characters in identifiers and
> English and Russian keywords and has the English or Russian
> error messages and can translate the program text from English
> to Russian and back, but there is still unknown bug on
> Linux platform (DOS32, WIN32 are very stable, I work on it all
> the time).
>
> http://www.rapideuphoria.com/ru_eu_11.zip
>
> Sorry, I do not have some spare time to implement these
> features in 3.2 now - my vegetable-garden takes all my
> summer time
>
> So ask please Rob for that code just to see various details
> of that interpreter, if you want. That was strongly licensed
> 2.5 stuff, that was our co-work with Rob and he didn't want
> to open that code that time.
>
> Regards,
> Igor Kachan
> kinz at peterlink.ru

You may be aware that Euphoria has been open source for one year now.
I'm not sure using any licensed material with restrictions worse than GPL would be possible or desirable. I may be wrong there though. Could you elaborate on how the bug shows up on Linux? Inasmuch as this doesn't infringe on any NDA of course. CChris
10. Re: Accented characters in identifiers
- Posted by Igor Kachan <kinz at p?terlink.?u> May 29, 2008
- 735 views
CChris wrote:
>
> Igor Kachan wrote:
> >
> > CChris wrote:
> > >
> > > [snip]
> > >
> > > What do you think?
> >
> > Hi Chris,
> >
> > It is very good idea, I think, but its implementation is not
> > too simple. There is Bilingual Euphoria 2.5 in the Archive.
> > It understands any characters in identifiers and
> > English and Russian keywords and has the English or Russian
> > error messages and can translate the program text from English
> > to Russian and back, but there is still unknown bug on
> > Linux platform (DOS32, WIN32 are very stable, I work on it all
> > the time).
> >
> > http://www.rapideuphoria.com/ru_eu_11.zip
> >
> > Sorry, I do not have some spare time to implement these
> > features in 3.2 now - my vegetable-garden takes all my
> > summer time
> >
> > So ask please Rob for that code just to see various details
> > of that interpreter, if you want. That was strongly licensed
> > 2.5 stuff, that was our co-work with Rob and he didn't want
> > to open that code that time.
> >
>
> You may be aware that Euphoria has been open source for one year now.

Yes, I do know that EU 3.0 is open source, but the 2.5 source code was a commercial product with strong license restrictions.

After 3.0, I asked Rob to open the bilingual EU 2.5 too - I did not have the spare time to develop the bilingual EU 3.0 by myself, so why not allow someone who wants to do this work to do it without reinventing all that stuff?

At that time Rob preferred to wait for me. But this waiting is getting too long.

> I'm not sure using any licensed material with restrictions worse
> than GPL would be possible or desirable. I may be wrong there though.

There are official developers of the open source EU now, so why not open that 2.5 bilingual interpreter just for them?

Rob?

> Could you elaborate on how the bug shows up on Linux?

Ok, I'll try to find that interpreter on my old backup HDD and run it to make the screenshots on Linux Mandrake 10.0.
> Inasmuch as this doesn't infringe on any NDA of course. What is NDA? Sorry, I do not know this abbreviation. Regards, Igor Kachan kinz at peterlink.ru
11. Re: Accented characters in identifiers
- Posted by CChris <christian.cuvier at agriculture.g?uv.?r> May 29, 2008
- 736 views
Igor Kachan wrote:
>
> CChris wrote:
> >
> > Igor Kachan wrote:
> > >
> > > CChris wrote:
> > > >
> > > > [snip]
> > > >
> > > > What do you think?
> > >
> > > Hi Chris,
> > >
> > > It is very good idea, I think, but its implementation is not
> > > too simple. There is Bilingual Euphoria 2.5 in the Archive.
> > > It understands any characters in identifiers and
> > > English and Russian keywords and has the English or Russian
> > > error messages and can translate the program text from English
> > > to Russian and back, but there is still unknown bug on
> > > Linux platform (DOS32, WIN32 are very stable, I work on it all
> > > the time).
> > >
> > > http://www.rapideuphoria.com/ru_eu_11.zip
> > >
> > > Sorry, I do not have some spare time to implement these
> > > features in 3.2 now - my vegetable-garden takes all my
> > > summer time
> > >
> > > So ask please Rob for that code just to see various details
> > > of that interpreter, if you want. That was strongly licensed
> > > 2.5 stuff, that was our co-work with Rob and he didn't want
> > > to open that code that time.
> > >
> >
> > You may be aware that Euphoria has been open source for one year now.
>
> Yes, I do know that EU 3.0 is open source, but the 2.5 source
> code was a commercial product with strong license restrictions.
>
> After 3.0, I asked Rob to open the bilingual EU 2.5 too - I did not
> have the spare time to develop the bilingual EU 3.0 by myself, so
> why not to allow this work to someone who wants to work without
> reinventing of all that stuff?
>
> That time Rob preferred to wait me. But this waiting gets too long.
>
> > I'm not sure using any licensed material with restrictions worse
> > than GPL would be possible or desirable. I may be wrong there though.
>
> There are the official developers of the Open source EU now,
> why not to open just for them just that 2.5 bilingual interpreter?
>
> Rob?
>
> > Could you elaborate on how the bug shows up on Linux?
>
> Ok, I'll try to find that interpreter on my old reserved
> HDD and run it to make the screen-shots
> on Linux Mandrake 10.0.
>
> > Inasmuch as this doesn't infringe on any NDA of course.
>
> What is NDA? Sorry, I do not know this abbreviation.
>
> Regards,
> Igor Kachan
> kinz at peterlink.ru

Sorry: Non-Disclosure Agreement.

CChris
12. Re: Accented characters in identifiers
- Posted by Igor Kachan <kinz at peterli??.ru> May 29, 2008
- 761 views
Igor Kachan wrote:
>
> CChris wrote:
> >
> > Igor Kachan wrote:
> > >
> > > CChris wrote:
> > > >
> > > > [snip]
> > > >
> > > > What do you think?
> >
> > [snip]
> >
> > Could you elaborate on how the bug shows up on Linux?
>
> Ok, I'll try to find that interpreter on my old reserved
> HDD and run it to make the screen-shots
> on Linux Mandrake 10.0.

The buggy bilingual interpreter for Linux, exu_r, and the ex.err files for two euphoria/demos/linux programs are in this package:

http://www.private.peterlink.ru/kinz/exu_r_25.zip

Please try it, if you want.

sanity.ex works ok with exu_r - 100% passed.

Regards,
Igor Kachan
kinz at peterlink.ru
13. Re: Accented characters in identifiers
- Posted by CChris <christian.cuvier at agr?culture.gouv.f?> May 30, 2008
- 743 views
Igor Kachan wrote:
>
> Igor Kachan wrote:
> >
> > CChris wrote:
> > >
> > > Igor Kachan wrote:
> > > >
> > > > CChris wrote:
> > > > >
> > > > > [snip]
> > > > >
> > > > > What do you think?
> > >
> > > [snip]
> > >
> > > Could you elaborate on how the bug shows up on Linux?
> >
> > Ok, I'll try to find that interpreter on my old reserved
> > HDD and run it to make the screen-shots
> > on Linux Mandrake 10.0.
>
> There are buggy bilingual interpreter for
> Linux exu_r and ex.err files for two euphoria/demos/linux
> programs in this package:
>
> http://www.private.peterlink.ru/kinz/exu_r_25.zip
>
> Try please, if you want.
>
> sanity.ex works ok with exu_r - 100% passed.
>
> Regards,
> Igor Kachan
> kinz at peterlink.ru

Got those files, which are hardly informative indeed.

I think any implementation of accented chars (allowing any UTF-8 char in identifiers is trivial; such chars just may cause display concerns when the code page is not the original one) would be done with the new tools in 4.0, and there will be many. Perhaps you, Rob and Jeremy might want to discuss this?

CChris
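The remark that allowing UTF-8 characters in identifiers is trivial rests on a property of UTF-8: every byte of a non-ASCII character has its high bit set. The Python sketch below illustrates this; it assumes a scanner that works byte by byte, and the helper name `high_bytes` and the sample identifiers are made up for the example.

```python
def high_bytes(text):
    """Return the UTF-8 bytes of `text` that fall in the 128..255 range."""
    return [b for b in text.encode("utf-8") if b >= 128]

for name in ("total", "señal", "имя"):
    encoded = name.encode("utf-8")
    print(name, len(name), len(encoded), [hex(b) for b in high_bytes(name)])
```

Under that assumption, a byte-oriented scanner that treats 128..255 as letters accepts UTF-8 identifiers without decoding them, although identifier length is then counted in bytes rather than characters.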
14. Accented characters in identifiers
- Posted by CChris <christian.cuvier at agriculture.gouv.fr> Apr 06, 2007
- 751 views
Currently, Eu interprets characters with the most significant bit set as opcodes. Only old shrouded files store Eu opcodes this way. Isn't it time to remove that restriction, so as to be able to use non english identifiers in programs? Other languages frequently use accented characters. Is anyone running these legacy shrouded files? CChris
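To see which existing source files would be touched by such a change, one could scan them for bytes with the most significant bit set. This is a Python sketch for illustration only; the function name `find_high_bit_bytes` and the sample source are hypothetical.

```python
def find_high_bit_bytes(source):
    """Return (line_number, column, byte_value) for every byte with bit 7 set."""
    hits = []
    for line_no, line in enumerate(source.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte & 0x80:
                hits.append((line_no, col, byte))
    return hits

# Sample source saved in Latin-1, where 'ñ' is the single byte 0xF1 (241).
sample = "integer total\ninteger año\n".encode("latin-1")
print(find_high_bit_bytes(sample))  # [(2, 10, 241)]
```

A file for which the result is empty is pure 7-bit ASCII and is read the same way before and after the change.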
15. Re: Accented characters in identifiers
- Posted by Robert Craig <rds at RapidEuphoria.com> Apr 06, 2007
- 740 views
CChris wrote: > Currently, Eu interprets characters with the most significant bit set as > opcodes. Only old shrouded files store Eu opcodes this way. > > Isn't it time to remove that restriction, so as to be able to use non > english identifiers in programs? Other languages frequently use accented > characters. Yes, I agree. I'll do that fairly soon, if nobody objects. Others, such as Igor Kachan, have also mentioned the lack of support for the higher ASCII codes for non-English languages. The 3.0 open source version of Euphoria does not have the ability to decrypt files that are both "shrouded" and "encrypted". "shrouding" used to mean just conversion of keywords and built-in names to single-byte codes and converting variable and routine names to short meaningless identifier names. With version 2.0 came the option to also "encrypt". In 2.5, a whole new binder/shrouder was developed where conversion to byte-codes no longer occurred, but a form of IL encryption continued, for bound executables only. > Is anyone running these legacy shrouded files? I believe there are very few files out there that are "shrouded", but not also "encrypted" or bound into an executable, so there is little point now in maintaining support for the single-byte codes (in scanner.e). Note that executable programs that were "bound" with the interpreter, pre-3.0, are not affected, since they contain the required interpreter version. The only breakage here would be very old, probably pre-2.0 (1997 or earlier) files, "shrouded", but not "encrypted". Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
16. Re: Accented characters in identifiers
- Posted by Robert Craig <rds at RapidEuphoria.com> Apr 06, 2007
- 751 views
Robert Craig wrote: > In 2.5, a whole new binder/shrouder was developed where > conversion to byte-codes no longer occurred, but a form of > IL encryption continued, for bound executables only. Actually, you could also make an encrypted separate .il file, to be run by the backend, but the point remains that if 3.0 can't "decrypt", there is little point in handling the special keyword/built-in byte codes in the 128-255 ASCII range. Regards, Rob Craig Rapid Deployment Software http://www.RapidEuphoria.com
17. Re: Accented characters in identifiers
- Posted by Juergen Luethje <j.lue at gmx.de> Apr 06, 2007
- 753 views
- Last edited Apr 07, 2007
Robert Craig wrote:
> CChris wrote:
>> Currently, Eu interprets characters with the most significant bit set as
>> opcodes. Only old shrouded files store Eu opcodes this way.
>>
>> Isn't it time to remove that restriction, so as to be able to use non
>> english identifiers in programs? Other languages frequently use accented
>> characters.
>
> Yes, I agree. I'll do that fairly soon, if nobody objects.
> Others, such as Igor Kachan, have also mentioned the lack of support
> for the higher ASCII codes for non-English languages.

<snip>

Sorry, I don't think that this is a good idea, because:

a) The usage of this feature will bring a considerable disadvantage.
   When someone creates identifiers that contain special characters of
   her/his language, it is likely that other people somewhere else in
   the world will have problems reading that code.
   You recently reminded us of a post from you on 12 Feb 2002:
   <http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=2&fromYear=7&toMonth=2&toYear=7&postedBy=rds&keywords=declaration+initialize>

   In this message it reads:
   | I like it better the way it is. You could argue that I don't have to
   | use variable inits if I don't want to. You could argue that I don't
   | have to use goto if I don't want to. A language does not exist just
   | to serve the isolated programmer. It exists to serve a community
   | of programmers. In situations where it really doesn't matter
   | how something is written, I think there are advantages to
   | reducing the number of choices.

   IMHO the same is true concerning special characters in identifiers,
   especially since many of them are not equal in different languages.
   The Euphoria community is small enough; Euphoria shouldn't encourage
   people to write code that can only be read by a fraction of this
   small community.

b) It is not necessary at all. We currently have a sufficient number of
   characters for creating identifiers. The German language also has
   some special characters, but I _never_ had the need to use one of
   them in an identifier.

Regards,
   Juergen
18. Re: Accented characters in identifiers
- Posted by Pete Lomax <petelomax at blueyonder.co.uk> Apr 07, 2007
- 711 views
Robert Craig wrote:
> CChris wrote:
> > Is anyone running these legacy shrouded files?

Seems unlikely. I was not even aware that pre-2.0 files could possibly be run by 3.0 anyway.

Don't suppose there is any chance of partially resurrecting that feature [ie unencrypted] in any form, is there, Rob? (OK, don't sweat it, I know the answer was NO when I asked before)

Juergen Luethje wrote:
> When someone creates identifiers that contain special characters of
> her/his language, it is likely that other people somewhere else in
> the world will have problems to read that code.

However, if they write code in 7-bit ASCII but comments in Japanese...

> The Euphoria community is small enough

I strongly disagree; we should rise to the (formidable) challenge of a wider and multi-lingual community. If I cannot read some future code written in, say, Urdu, that may well be annoying[1], but I cannot believe there is or ever will be any benefit to us deciding now that such code should never exist.

Regards,
Pete

[1] aku has previously submitted quality code which was both named and commented in a foreign language; this problem already exists. To his (or her?!) credit, an English wrapper was provided.
19. Re: Accented characters in identifiers
- Posted by Juergen Luethje <j.lue at gmx.de> Apr 07, 2007
- 719 views
Pete Lomax wrote:
> Juergen Luethje wrote:
>> When someone creates identifiers that contain special characters of
>> her/his language, it is likely that other people somewhere else in
>> the world will have problems to read that code.
> However if they write code in ascii-7 but comments in Japanese...

Then I can still read the code.

>> The Euphoria community is small enough
> I strongly disagree, we should rise to the (formidable) challenge of a wider
> and multi-lingual community. If I cannot read some future code written in,
> say, Urdu, that may well be annoying[1], but I cannot believe there is or
> ever will be any benefit to us deciding now that such should never exist.

The benefit is in considerably reducing the chance that someone will encounter such an annoying situation.

Regards,
   Juergen
20. Re: Accented characters in identifiers
- Posted by Igor Kachan <kinz at peterlink.ru> Apr 07, 2007
- 724 views
Robert Craig wrote:
>
> CChris wrote:
> > Currently, Eu interprets characters with the most significant bit set as
> > opcodes. Only old shrouded files store Eu opcodes this way.
> >
> > Isn't it time to remove that restriction, so as to be able to use non
> > english identifiers in programs? Other languages frequently use accented
> > characters.
>
> Yes, I agree. I'll do that fairly soon, if nobody objects.

If nobody objects, that is simply consensus, too rare a thing here.

> Others, such as Igor Kachan, have also mentioned the lack of support
> for the higher ASCII codes for non-English languages.

Yes, I like this feature very much. The bilingual EU 2.5 works OK for me, and people have thanked me for that package. That 2.5 can execute EU code with *any* non-English names for identifiers (128..255 codes), not only Russian and English.

Anyway, I plan to extend the 3.0.2(3..) source with this feature plus multilingual EU messages some time later on; it is just that my spare time is very limited for now.

The automatic code translation from any foreign language to standard 100% pure Euphoria and back is simple (except for comments) and works OK in 2.5 from English to Russian and from Russian to English.

[snip]

Regards,
Igor Kachan
kinz at peterlink.ru
21. Re: Accented characters in identifiers
- Posted by Robert Craig <rds at RapidEuphoria.com> Apr 08, 2007
- 737 views
Juergen Luethje wrote:
> Robert Craig wrote:
> > CChris wrote:
> >> Currently, Eu interprets characters with the most significant bit set as
> >> opcodes. Only old shrouded files store Eu opcodes this way.
> >>
> >> Isn't it time to remove that restriction, so as to be able to use non
> >> english identifiers in programs? Other languages frequently use accented
> >> characters.
> >
> > Yes, I agree. I'll do that fairly soon, if nobody objects.
> > Others, such as Igor Kachan, have also mentioned the lack of support
> > for the higher ASCII codes for non-English languages.
>
> <snip>
>
> Sorry, I don't think that this is a good idea, because:
>
> a) The usage of this feature will bring a considerable disadvantage.
>    When someone creates identifiers that contain special characters of
>    her/his language, it is likely that other people somewhere else in
>    the world will have problems to read that code.
>    You recently reminded us of a post from you on 12 Feb 2002:
>    <http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=2&fromYear=7&toMonth=2&toYear=7&postedBy=rds&keywords=declaration+initialize>
>
>    In this message it reads:
>    | I like it better the way it is. You could argue that I don't have to
>    | use variable inits if I don't want to. You could argue that I don't
>    | have to use goto if I don't want to. A language does not exist just
>    | to serve the isolated programmer. It exists to serve a community
>    | of programmers. In situations where it really doesn't matter
>    | how something is written, I think there are advantages to
>    | reducing the number of choices.
>
>    IMHO the same is true concerning special characters in identifiers,
>    especially since many of them are not equal in different languages.
>    The Euphoria community is small enough, Euphoria shouldn't encourage
>    people to write code that can only be read by a fraction of this
>    small community.
>
> b) It is not necessary at all. We currently have a sufficient number of
>    characters for creating identifiers. The German language also has
>    some special characters, but I _never_ had the need to use one of
>    them in an identifier.

OK, thanks for that insight. I guess I'll hold off, for at least several days, until we hear from some other non-English programmers.

It just seemed to me that if I had to do without some of the English alphabet in my identifiers, it would be annoying to me, so I figured it must be annoying to non-English programmers.

Also, if someone creates identifiers that are not English-related, I wouldn't understand them anyway, regardless of whether they contain accents or funny-looking characters.

I guess it could be a problem though if some characters resemble punctuation and other confusing shapes, like some of the English ASCII 128-255 characters do on my English region computer.

Regards,
   Rob Craig
   Rapid Deployment Software
   http://www.RapidEuphoria.com
22. Re: Accented characters in identifiers
- Posted by Igor Kachan <kinz at peterlink.ru> Apr 08, 2007
- 741 views
Robert Craig wrote:
> Juergen Luethje wrote:
>> Robert Craig wrote:
>>> CChris wrote:
>>>> Currently, Eu interprets characters with the most significant bit set as
>>>> opcodes. Only old shrouded files store Eu opcodes this way.
>>>>
>>>> Isn't it time to remove that restriction, so as to be able to use non
>>>> english identifiers in programs? Other languages frequently use accented
>>>> characters.
>>>
>>> Yes, I agree. I'll do that fairly soon, if nobody objects.
>>> Others, such as Igor Kachan, have also mentioned the lack of support
>>> for the higher ASCII codes for non-English languages.
>>
>> <snip>
>>
>> Sorry, I don't think that this is a good idea, because:
>>
>> a) The usage of this feature will bring a considerable disadvantage.
>>    When someone creates identifiers that contain special characters of
>>    her/his language, it is likely that other people somewhere else in
>>    the world will have problems to read that code.
>>    You recently reminded us of a post from you on 12 Feb 2002:
>>    <http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=2&fromYear=7&toMonth=2&toYear=7&postedBy=rds&keywords=declaration+initialize>
>>
>>    In this message it reads:
>>    | I like it better the way it is. You could argue that I don't have to
>>    | use variable inits if I don't want to. You could argue that I don't
>>    | have to use goto if I don't want to. A language does not exist just
>>    | to serve the isolated programmer. It exists to serve a community
>>    | of programmers. In situations where it really doesn't matter
>>    | how something is written, I think there are advantages to
>>    | reducing the number of choices.
>>
>>    IMHO the same is true concerning special characters in identifiers,
>>    especially since many of them are not equal in different languages.
>>    The Euphoria community is small enough, Euphoria shouldn't encourage
>>    people to write code that can only be read by a fraction of this
>>    small community.
>>
>> b) It is not necessary at all. We currently have a sufficient number of
>>    characters for creating identifiers. The German language also has
>>    some special characters, but I _never_ had the need to use one of
>>    them in an identifier.
>
> OK, thanks for that insight.
> I guess I'll hold off, for at least several days,
> until we hear from some other non-English programmers.

Many ex-USSR programmers are simply too shy about their not very good English to discuss these things here... So please let me take a second voice - I'm not too modest anyway, as you know.

And this second voice of mine says that the modern educational BlackBox system, well known in Russia (by Oberon microsystems, Inc., Switzerland), does support identifiers with characters above 127, with some technical exceptions.

> It just seemed to me that if I had to do without
> some of the English alphabet in my identifiers,
> it would be annoying to me, so I figured it must be
> annoying to non-English programmers.

... to learn English and to hunt for Latin letters on a Russian (totally different) keyboard just to begin programming with the simplest:
puts(1,"Hello World!")
I have to say that any professional programmer must learn and know English well enough anyway, but I do know some very talented people who are almost absolutely unable to learn a foreign language. And I have to say that merely switching between the keyboard layouts (Lat - Rus) is very annoying, for me too. Programming in *pure* Russian or in *pure* English is handy either way, as long as there is none of that perpetual switching.

> Also, if someone creates identifiers that are not
> English-related, I wouldn't understand them anyway,
> regardless of whether they contain accents or
> funny-looking characters.

Rob, I think you are now one of those very talented people who are almost absolutely unable to learn Russian, no?

> I guess it could be a problem though if some characters
> resemble punctuation and other confusing shapes,
> like some of the English ASCII 128-255 characters do on my
> English region computer.

If you have the proper code page set up on your machine, everything will be clear, without these confusing shapes. I must say that I understand all of Juergen's objections very well but, sorry, I cannot agree.

Regards,
Igor Kachan
kinz at peterlink.ru
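The code page point can be made concrete with a small sketch. It is written in Python rather than Euphoria, and the sample word and the cp866/cp437 pairing are assumptions chosen for illustration; nothing here comes from the posters' own machines. It shows that the same stored bytes read back correctly under the matching code page and as unrelated characters under another one.

```python
# Sketch only: the sample word and the cp866/cp437 pairing are assumptions.
word = "Привет"                  # Russian "hello", typed on a Russian machine
raw = word.encode("cp866")       # the bytes a Russian DOS editor would store

as_cp866 = raw.decode("cp866")   # viewed with the matching code page
as_cp437 = raw.decode("cp437")   # viewed on a US-English DOS code page

print([hex(b) for b in raw])     # every byte falls in the 128..255 range
print(as_cp866 == word)          # True: the right code page restores the text
print(as_cp437 == word)          # False: same bytes, different glyphs
```

The bytes never change; only the table used to draw them does, which is why both Rob's "confusing shapes" and Igor's "proper code page" describe the same file.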
23. Re: Accented characters in identifiers
- Posted by Juergen Luethje <j.lue at gmx.de> Apr 08, 2007
- 729 views
Robert Craig wrote:
> OK, thanks for that insight.
> I guess I'll hold off, for at least several days,
> until we hear from some other non-English programmers.
>
> It just seemed to me that if I had to do without
> some of the English alphabet in my identifiers,
> it would be annoying to me, so I figured it must be
> annoying to non-English programmers.

Well, I must admit that German with its 7 special characters (and I think e.g. French, Spanish or Swedish don't contain many more non-ASCII characters) is much closer to English than e.g. Russian or Japanese. So I especially understand Igor's intention here.

> Also, if someone
> creates identifiers that are not English-related,
> I wouldn't understand them anyway, regardless of
> whether they contain accents or funny-looking characters.

I agree. I wanted to say that allowing special characters in identifiers _encourages_ programmers to write code that is hard to read for a lot of other people. So I think it increases the chance that an Eu programmer will see identifiers that (s)he wouldn't understand.

> I guess it could be a problem though if some characters
> resemble punctuation and other confusing shapes,
> like some of the English ASCII 128-255 characters do on my
> English region computer.

I also think so. When you see non-English identifiers, e.g. 'Pferd' and 'Ente', even when you do not know their meaning (which is, by the way, 'horse' and 'duck'), you can probably recognize and distinguish them from each other easily throughout the code anyway. This might not be so easy with identifiers that consist of "very special" (from the reader's point of view) characters.

If I tried to read important code containing identifiers that are meaningless to me, and that I could hardly recognize and distinguish from each other, I think I would try to guess appropriate German or English names for them, and then "search and replace" these identifiers.

This leads to another point, which I had almost forgotten: special characters can confuse editors. In the past I have repeatedly found that editors treat some special characters as word delimiters.

I just tested the following with the current Metapad version 3.51: When I double-click anywhere on the expression 'FooBar', Metapad always selects the whole expression, i.e. the entire "word". This does _not_ happen with the expression 'FoüBar'. (I hope it will read here on the message board as expected -- I replaced the third character with the lowercase German u-umlaut.) Metapad handles this special German character as a word delimiter, so it "sees" the two words 'Fo' and 'Bar'!

When I "search and replace" identifiers in program source code, I use the option:
[v] whole words only

With an editor that behaves as described above, I think this can lead to unexpected and unwanted results.

Regards,
Juergen
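The editor behaviour described above can be modelled in a few lines. This is a sketch in Python, not Metapad's actual logic: it assumes an editor whose notion of a "word character" is limited to ASCII letters, digits and underscore, and the helper names `words_seen` and `replace_whole_word` are hypothetical.

```python
import re

# Assumption: the editor treats only ASCII letters, digits and "_" as
# word characters, so any other character acts as a word delimiter.
ASCII_WORD = re.compile(r"[A-Za-z0-9_]+")

def words_seen(text):
    """Return the 'words' such an editor would select in text."""
    return ASCII_WORD.findall(text)

def replace_whole_word(text, old, new):
    """'Whole words only' replace using the same ASCII-only boundaries."""
    pattern = r"(?<![A-Za-z0-9_])" + re.escape(old) + r"(?![A-Za-z0-9_])"
    return re.sub(pattern, new, text)

print(words_seen("FooBar"))   # ['FooBar']
print(words_seen("FoüBar"))   # ['Fo', 'Bar']

# Renaming the variable Bar also rewrites the tail of the identifier FoüBar.
damaged = replace_whole_word("FoüBar = Bar + 1", "Bar", "Baz")
print(damaged == "FoüBaz = Baz + 1")   # True
```

Under that assumption, a whole-word replace of 'Bar' silently changes a different identifier, which is the kind of unwanted result the post warns about.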
24. Re: Accented characters in identifiers
- Posted by jacques deschênes <desja at globetrotter.net> Apr 08, 2007
- 742 views
In the past I tried to read some code from Aku, but because the identifiers were in a language I don't understand, it was hard to follow and in the end I didn't pursue it. As a French-speaking programmer, I have always used English identifiers for code I distribute on the web, because I consider English a common language for programmers all around the world. But when I write code for myself I use French identifiers and comments, and I don't really miss accents in identifiers.

Regards,
Jacques Deschênes
25. Re: Accented characters in identifiers
- Posted by Igor Kachan <kinz at peterl?nk?ru> Jun 03, 2008
- 746 views
CChris wrote:
> Igor Kachan wrote:
> > There is a buggy bilingual interpreter for
> > Linux, exu_r, and ex.err files for two euphoria/demos/linux
> > programs in this package:
> >
> > http://www.private.peterlink.ru/kinz/exu_r_25.zip
> >
> > Please try it, if you want.
> >
> > sanity.ex works ok with exu_r - 100% passed.
>
> Got those files, which are hardly informative indeed.

Yes, it is a good puzzle. OK, you can see the changes to the 2.5 source code here (produced by the diff program):
http://www.private.peterlink.ru/kinz/changes.zip

> I think any implementation of accented chars (allowing any UTF-8 char in
> identifiers is trivial, they just may cause display concerns when the code
> page is not the original one) would be done with the new tools in 4.0, and
> there will be many.

The code above uses just the DOS code pages, not UTF-8. Watcom has a single code page for the DOS pixel modes, so you have to replace fonts in your localised interpreter.

> Perhaps you, Rob and Jeremy might want to discuss this?

I do not think Rob wants those changes in official EU just now, maybe in 8.0.

Regards,
Igor Kachan
kinz at peterlink.ru
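The difference between the two approaches matters less to a byte-oriented scanner than it might seem. The sketch below is in Python, not the Euphoria scanner source; the class names, the table layout and the simplified identifier rule are all assumptions made for illustration. It relies on one certain fact: every byte of a non-ASCII character, whether stored in a DOS code page or as a multi-byte UTF-8 sequence, lies in the 128..255 range.

```python
# Hypothetical character classes; the names are illustrative only.
LETTER, DIGIT, OTHER = "LETTER", "DIGIT", "OTHER"

def build_char_class(high_bytes_are_letters):
    """Return a 256-entry table mapping each byte value to a class."""
    table = [OTHER] * 256
    for b in range(ord("a"), ord("z") + 1):
        table[b] = LETTER
    for b in range(ord("A"), ord("Z") + 1):
        table[b] = LETTER
    for b in range(ord("0"), ord("9") + 1):
        table[b] = DIGIT
    table[ord("_")] = DIGIT   # simplification: allowed anywhere but first
    if high_bytes_are_letters:
        for b in range(128, 256):
            table[b] = LETTER
    return table

def is_identifier(raw, table):
    """Simplified rule: first byte is a LETTER, the rest LETTER or DIGIT."""
    if len(raw) == 0 or table[raw[0]] != LETTER:
        return False
    return all(table[b] in (LETTER, DIGIT) for b in raw)

old_table = build_char_class(False)   # bytes 128..255 rejected
new_table = build_char_class(True)    # bytes 128..255 classed as LETTER

name = "имя"                          # Russian "name", an assumed example
for encoding in ("cp866", "utf-8"):
    raw = name.encode(encoding)
    print(encoding, is_identifier(raw, old_table), is_identifier(raw, new_table))
```

With the whole upper range classed as LETTER, the same table accepts the identifier in either encoding; what remains is the display question, which depends on the code page or font rather than on the scanner.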