Re: Accented characters in identifiers
- Posted by Juergen Luethje <j.lue at gmx.de> Apr 08, 2007
Robert Craig wrote:
> Juergen Luethje wrote:
> > Robert Craig wrote:
> > > CChris wrote:
> > >> Currently, Eu interprets characters with the most significant bit set as
> > >> opcodes. Only old shrouded files store Eu opcodes this way.
> > >>
> > >> Isn't it time to remove that restriction, so as to be able to use non
> > >> english identifiers in programs? Other languages frequently use accented
> > >> characters.
> > >
> > > Yes, I agree. I'll do that fairly soon, if nobody objects.
> > > Others, such as Igor Kachan, have also mentioned the lack of support
> > > for the higher ASCII codes for non-English languages.
> >
> > <snip>
> >
> > Sorry, I don't think that this is a good idea, because:
> >
> > a) The usage of this feature will bring a considerable disadvantage.
> > When someone creates identifiers that contain special characters of
> > her/his language, it is likely that other people somewhere else in
> > the world will have problems to read that code.
> > You recently reminded us of a post from you on 12 Feb 2002:
> > <http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=2&fromYear=7&toMonth=2&toYear=7&postedBy=rds&keywords=declaration+initialize>
> >
> > In this message it reads:
> > | I like it better the way it is. You could argue that I don't have to
> > | use variable inits if I don't want to. You could argue that I don't
> > | have to use goto if I don't want to. A language does not exist just
> > | to serve the isolated programmer. It exists to serve a community
> > | of programmers. In situations where it really doesn't matter
> > | how something is written, I think there are advantages to
> > | reducing the number of choices.
> >
> > IMHO the same is true concerning special characters in identifiers,
> > especially since many of them are not equal in different languages.
> > The Euphoria community is small enough, Euphoria shouldn't encourage
> > people to write code that can only be read by a fraction of this
> > small community.
> >
> > b) It is not necessary at all. We currently have a sufficient number of
> > characters for creating identifiers. The German language also has
> > some special characters, but I _never_ had the need to use one of
> > them in an identifier.
>
> OK, thanks for that insight.
> I guess I'll hold off, for at least several days,
> until we hear from some other non-English programmers.
>
> It just seemed to me that if I had to do without
> some of the English alphabet in my identifiers,
> it would be annoying to me, so I figured it must be
> annoying to non-English programmers.

Well, I must admit that German with its 7 special characters (and I
think e.g. French, Spanish or Swedish don't contain many more non-ASCII
characters) is much closer to English than e.g. Russian or Japanese.
So I understand especially Igor's intention here.

> Also, if someone
> creates identifiers that are not English-related,
> I wouldn't understand them anyway, regardless of
> whether they contain accents or funny-looking characters.

I agree. I wanted to say that allowing special characters in identifiers
_encourages_ programmers to write code that is hard to read for a lot of
other people. So I think it increases the chance that an Eu programmer
will see identifiers that (s)he wouldn't understand.

> I guess it could be a problem though if some characters
> resemble punctuation and other confusing shapes,
> like some of the English ASCII 128-255 characters do on my
> English region computer.

I also think so. When you see non-English identifiers e.g. 'Pferd' and
'Ente', even when you do not know their meaning (which is btw. 'horse'
and 'duck') you probably can easily recognize and distinguish them from
each other in the whole code anyway.
This might not be so easy with identifiers that consist of "very special"
(from the point of view of the reader) characters. If I tried to read
important code that contained identifiers which are meaningless to me,
and which I could hardly recognize and distinguish from each other, then
I think I would try to guess appropriate German or English names for
them, and then "search and replace" these identifiers.

This leads to another point, which I had almost forgotten:
Special characters can confuse editors. In the past I have repeatedly
found that editors handle some special characters as word delimiters.
I just tested the following with the current Metapad version 3.51:

When I double-click anywhere on the expression 'FooBar', Metapad always
selects the whole expression, i.e. the entire "word".
This does _not_ happen with the expression 'FoüBar'. (I hope it will
read here on the message board as expected -- I replaced the third
character with the lowercase German u-Umlaut.) Metapad handles this
special German character as a word delimiter, so it "sees" the two
words 'Fo' and 'Bar'!

When I "search and replace" identifiers in program source code, I use
the option:
   [v] whole words only
With an editor that behaves as described above, I think this can lead to
unexpected and unwanted results.

Regards,
   Juergen
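
To illustrate the "whole words only" pitfall described above, here is a minimal sketch in Python (chosen only for illustration; it does not claim to show how Metapad works internally). The helper name `replace_whole_word` and the sample source line are hypothetical. The sketch assumes a tool whose word rules accept only ASCII letters, digits and underscore as word characters, and compares it with Unicode-aware word rules.

```python
import re


def replace_whole_word(text, old, new, ascii_only):
    """Replace `old` with `new` only where `old` stands as a whole word.

    ascii_only=True mimics (as an assumption) an editor that treats every
    non-ASCII character, such as 'ü', as a word delimiter.
    """
    flags = re.ASCII if ascii_only else 0
    pattern = r"\b" + re.escape(old) + r"\b"
    return re.sub(pattern, new, text, flags=flags)


# Hypothetical source line: one identifier with an umlaut, one plain 'Fo'.
source = "integer FoüBar, Fo"

# ASCII-only word rules: 'ü' ends the word, so the 'Fo' inside 'FoüBar'
# is wrongly treated as a whole word and gets replaced too.
ascii_result = replace_whole_word(source, "Fo", "Count", ascii_only=True)

# Unicode-aware word rules: 'ü' is a letter, so 'FoüBar' stays intact.
unicode_result = replace_whole_word(source, "Fo", "Count", ascii_only=False)

print(ascii_result)    # integer CountüBar, Count
print(unicode_result)  # integer FoüBar, Count
```

With the ASCII-only rules the identifier 'FoüBar' is silently damaged, which is the kind of unexpected and unwanted result a "whole words only" replace could produce in an editor that splits words at the umlaut.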