Re: Accented characters in identifiers
- Posted by jacques deschênes <desja at globetrotter.net> Apr 08, 2007
In the past I tried to read some code from Aku, but because the identifiers were in a language I don't understand, it was hard to follow and in the end I didn't pursue it. As a French-speaking programmer, I have always used English identifiers for code I distribute on the web, because I consider English the common language for programmers all around the world. When I write code for myself I use French identifiers and comments, but I don't really miss accents in identifiers.

regards,
Jacques Deschênes

Juergen Luethje wrote:
>
> Robert Craig wrote:
> > Juergen Luethje wrote:
> > > Robert Craig wrote:
> > > > CChris wrote:
> > > >> Currently, Eu interprets characters with the most significant bit set as
> > > >> opcodes. Only old shrouded files store Eu opcodes this way.
> > > >>
> > > >> Isn't it time to remove that restriction, so as to be able to use
> > > >> non-English identifiers in programs? Other languages frequently use
> > > >> accented characters.
> > > >
> > > > Yes, I agree. I'll do that fairly soon, if nobody objects.
> > > > Others, such as Igor Kachan, have also mentioned the lack of support
> > > > for the higher ASCII codes for non-English languages.
> > >
> > > <snip>
> > >
> > > Sorry, I don't think that this is a good idea, because:
> > >
> > > a) The usage of this feature will bring a considerable disadvantage.
> > > When someone creates identifiers that contain special characters of
> > > her/his language, it is likely that other people somewhere else in
> > > the world will have problems to read that code.
> > > You recently reminded us of a post from you on 12 Feb 2002:
> > > <http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=2&fromYear=7&toMonth=2&toYear=7&postedBy=rds&keywords=declaration+initialize>
> > >
> > > In this message it reads:
> > > | I like it better the way it is.
> > > | You could argue that I don't have to
> > > | use variable inits if I don't want to. You could argue that I don't
> > > | have to use goto if I don't want to. A language does not exist just
> > > | to serve the isolated programmer. It exists to serve a community
> > > | of programmers. In situations where it really doesn't matter
> > > | how something is written, I think there are advantages to
> > > | reducing the number of choices.
> > >
> > > IMHO the same is true concerning special characters in identifiers,
> > > especially since many of them are not equal in different languages.
> > > The Euphoria community is small enough, Euphoria shouldn't encourage
> > > people to write code that can only be read by a fraction of this
> > > small community.
> > >
> > > b) It is not necessary at all. We currently have a sufficient number of
> > > characters for creating identifiers. The German language also has
> > > some special characters, but I _never_ had the need to use one of
> > > them in an identifier.
> >
> > OK, thanks for that insight.
> > I guess I'll hold off, for at least several days,
> > until we hear from some other non-English programmers.
> >
> > It just seemed to me that if I had to do without
> > some of the English alphabet in my identifiers,
> > it would be annoying to me, so I figured it must be
> > annoying to non-English programmers.
>
> Well, I must admit that German with its 7 special characters (and I
> think e.g. French, Spanish or Swedish don't contain much more non-ASCII
> characters) is much closer to English than e.g. Russian or Japanese. So
> I understand especially Igor's intention here.
>
> > Also, if someone
> > creates identifiers that are not English-related,
> > I wouldn't understand them anyway, regardless of
> > whether they contain accents or funny-looking characters.
>
> I agree.
>
> I wanted to say that allowing special characters in identifiers
> _encourages_ programmers to write code that is hard to read for a lot of
> other people. So I think it increases the chance that an Eu programmer
> will see identifiers that (s)he wouldn't understand.
>
> > I guess it could be a problem though if some characters
> > resemble punctuation and other confusing shapes,
> > like some of the English ASCII 128-255 characters do on my
> > English region computer.
>
> I also think so. When you see non-English identifiers e.g. 'Pferd' and
> 'Ente', even when you do not know their meaning (which is btw. 'horse'
> and 'duck') you probably can easily recognize and distinguish them from
> each other in the whole code anyway. This might not be so easy with
> identifiers that consist of "very special" (from the point of view of
> the reader) characters.
>
> When I would try to read important code that contained identifiers which
> are meaningless to me, and which I could hardly recognize and distinguish
> from each other, then I think I would try to guess appropriate German or
> English names for them, and then "search and replace" these identifiers.
>
> This leads to another point, which I almost had forgotten:
> Special characters can confuse editors. In the past I repeatedly made the
> experience that editors handle some special characters as word delimiters.
>
> I just tested the following with the current Metapad version 3.51:
> When I double-click anywhere at the expression 'FooBar', Metapad always
> selects the whole expression, i.e. the entire "word". This does _not_
> happen with the expression 'FoüBar'. (I hope it will read here on the
> message board as expected -- I replaced the third character with the
> lowercase German u-Umlaut.) Metapad handles this special German character
> as a word delimiter, so it "sees" the two words 'Fo' and 'Bar'!
>
> When I "search and replace" identifiers in program source code, I use
> the option:
> [v] whole words only
>
> With an editor that behaves as described above, I think this can lead to
> unexpected and unwanted results.
>
> Regards,
> Juergen
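
The editor behaviour Juergen describes can be reproduced outside any editor. What follows is only a minimal Python sketch, not Metapad's actual logic: it assumes an editor whose word characters are limited to ASCII letters, digits and underscore, and compares that with a Unicode-aware definition. The names split_words and replace_whole_word and the sample line are made up for illustration.

```python
import re

# Assumption: a naive editor counts only these as word characters.
ASCII_WORD_CHARS = r"[A-Za-z0-9_]"
# In Python 3 str patterns, \w also matches letters such as 'ü'.
UNICODE_WORD_CHARS = r"\w"


def split_words(text, word_chars):
    """Return the runs of word characters the 'editor' would see."""
    return re.findall(word_chars + "+", text)


def replace_whole_word(text, old, new, word_chars):
    """Replace `old` only where it is not adjacent to a word character."""
    pattern = (
        "(?<!" + word_chars + ")" + re.escape(old) + "(?!" + word_chars + ")"
    )
    return re.sub(pattern, new, text)


line = "FoüBar = Bar + 1"

print(split_words("FoüBar", ASCII_WORD_CHARS))    # ['Fo', 'Bar']
print(split_words("FoüBar", UNICODE_WORD_CHARS))  # ['FoüBar']

# "Whole words only" replacement of Bar by Baz:
print(replace_whole_word(line, "Bar", "Baz", ASCII_WORD_CHARS))    # FoüBaz = Baz + 1
print(replace_whole_word(line, "Bar", "Baz", UNICODE_WORD_CHARS))  # FoüBar = Baz + 1
```

With the ASCII-only definition, a "whole words only" replacement of Bar also rewrites the tail of FoüBar, which is the kind of unwanted result mentioned above.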