Re: Accented characters in identifiers


Robert Craig wrote:

> Juergen Luethje wrote:
> > Robert Craig wrote:
> > > CChris wrote:
> > >> Currently, Eu interprets characters with the most significant bit set as
> > >> opcodes. Only old shrouded files store Eu opcodes this way.
> > >> 
> > >> Isn't it time to remove that restriction, so as to be able to use non 
> > >> english identifiers in programs? Other languages frequently use accented
> > >> characters.
> > > 
> > > Yes, I agree. I'll do that fairly soon, if nobody objects.
> > > Others, such as Igor Kachan, have also mentioned the lack of support
> > > for the higher ASCII codes for non-English languages.
> > 
> > <snip>
> > 
> > Sorry, I don't think that this is a good idea, because:
> > 
> > a) The usage of this feature will bring a considerable disadvantage.
> >    When someone creates identifiers that contain special characters of
> >    her/his language, it is likely that other people somewhere else in
> >    the world will have trouble reading that code.
> >    You recently reminded us of a post from you on 12 Feb 2002:
> >    <http://www.listfilter.com/cgi-bin/esearch.exu?fromMonth=2&fromYear=7&toMonth=2&toYear=7&postedBy=rds&keywords=declaration+initialize>
> > 
> >    In this message it reads:
> >    | I like it better the way it is. You could argue that I don't have to
> >    | use variable inits if I don't want to. You could argue that I don't
> >    | have to use goto if I don't want to. A language does not exist just
> >    | to serve the isolated programmer. It exists to serve a community
> >    | of programmers. In situations where it really doesn't matter 
> >    | how something is written, I think there are advantages to 
> >    | reducing the number of choices.
> > 
> >    IMHO the same is true concerning special characters in identifiers,
> >    especially since many of them are not equal in different languages.
> >    The Euphoria community is small enough, Euphoria shouldn't encourage
> >    people to write code that can only be read by a fraction of this
> >    small community.
> > 
> > b) It is not necessary at all. We currently have a sufficient number of
> >    characters for creating identifiers. The German language also has
> >    some special characters, but I _never_ had the need to use one of
> >    them in an identifier.
> 
> OK, thanks for that insight.
> I guess I'll hold off, for at least several days, 
> until we hear from some other non-English programmers.
> 
> It just seemed to me that if I had to do without
> some of the English alphabet in my identifiers, 
> it would be annoying to me, so I figured it must be 
> annoying to non-English programmers.

Well, I must admit that German with its 7 special characters (and I
think e.g. French, Spanish or Swedish don't have many more non-ASCII
characters) is much closer to English than e.g. Russian or Japanese. So
I especially understand Igor's intention here.

> Also, if someone
> creates identifiers that are not English-related,
> I wouldn't understand them anyway, regardless of
> whether they contain accents or funny-looking characters.

:-) I agree.

I wanted to say that allowing special characters in identifiers
_encourages_ programmers to write code that is hard to read for a lot of
other people. So I think it increases the chance that an Eu programmer
will see identifiers that (s)he wouldn't understand.

> I guess it could be a problem though if some characters
> resemble punctuation and other confusing shapes, 
> like some of the English ASCII 128-255 characters do on my
> English region computer.

I also think so. When you see non-English identifiers e.g. 'Pferd' and
'Ente', even when you do not know their meaning (which is btw. 'horse'
and 'duck') you probably can easily recognize and distinguish them from
each other in the whole code anyway. This might not be so easy with
identifiers that consist of "very special" (from the point of view of
the reader) characters.

If I had to read important code that contained identifiers which are
meaningless to me, and which I could hardly recognize and distinguish
from each other, then I think I would try to guess appropriate German or
English names for them, and then "search and replace" these identifiers.

This leads to another point, which I had almost forgotten:
Special characters can confuse editors. I have repeatedly found that
editors treat some special characters as word delimiters.

I just tested the following with the current Metapad version 3.51:
When I double-click anywhere on the expression 'FooBar', Metapad always
selects the whole expression, i.e. the entire "word". This does _not_
happen with the expression 'FoüBar'. (I hope it will read here on the
message board as expected -- I replaced the third character with the
lowercase German u-Umlaut.) Metapad handles this special German character
as a word delimiter, so it "sees" the two words 'Fo' and 'Bar'!
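
To make the effect concrete, here is a minimal sketch in Python rather than Euphoria. It assumes that the editor counts only ASCII letters, digits and the underscore as word characters; that is my guess at the rule, not something checked against Metapad itself. The helper `word_at` is hypothetical and only imitates what a double-click would select.

```python
import re

# Assumption: the editor's "word" characters are ASCII letters, digits and "_".
ASCII_WORD = re.compile(r"\w+", re.ASCII)
# For comparison: a Unicode-aware definition of a word character.
UNICODE_WORD = re.compile(r"\w+")

# "\u00fc" is the lowercase u-umlaut, written as an escape so that the
# encoding of this file does not matter.
TEXT = "Fo\u00fcBar"

def word_at(text, pos, pattern):
    """Return the word a double-click at index pos would select, or ""."""
    for match in pattern.finditer(text):
        if match.start() <= pos < match.end():
            return match.group()
    return ""

# Index 4 is the "a" in "Bar".
ascii_pick = word_at(TEXT, 4, ASCII_WORD)
unicode_pick = word_at(TEXT, 4, UNICODE_WORD)
print(ascii_pick, unicode_pick)
```

Under the ASCII-only rule the click selects just 'Bar', while the Unicode-aware rule selects the whole identifier.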

When I "search and replace" identifiers in program source code, I use
the option:
   [v] whole words only

With an editor that behaves as described above, I think this can lead to
unexpected and unwanted results.
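
A small sketch of what could go wrong, again in Python and again assuming an ASCII-only notion of a word (the function `replace_whole_word` and the identifiers 'Fo' and 'Count' are made up for illustration):

```python
import re

# Two different identifiers: "FoüBar" and "Fo".
# "\u00fc" is the lowercase u-umlaut.
source = "Fo\u00fcBar = 1\nFo = 2\n"

def replace_whole_word(text, old, new, ascii_only):
    """Replace old with new, matching 'whole words only'.

    With ascii_only=True, only ASCII letters, digits and "_" count as
    word characters (the assumed editor behaviour).
    """
    flags = re.ASCII if ascii_only else 0
    pattern = r"\b" + re.escape(old) + r"\b"
    return re.sub(pattern, new, text, flags=flags)

ascii_result = replace_whole_word(source, "Fo", "Count", ascii_only=True)
unicode_result = replace_whole_word(source, "Fo", "Count", ascii_only=False)
print(ascii_result)
print(unicode_result)
```

With the ASCII-only rule, renaming 'Fo' also rewrites the first half of 'FoüBar', because the umlaut is seen as the end of a word. With the Unicode-aware rule only the standalone 'Fo' is changed.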

Regards,
   Juergen
