Re: Unicode [was Re: [OT] it's vs its]
- Posted by CChris <christian.cuvier at agri??lture.gouv.fr> May 30, 2008
gshingles wrote:
>
> CChris wrote:
> >
> > Chiding someone whose English is not the native language for poor wording
> > would be ridiculous and inappropriate. Thank goodness, it hasn't happened.
>
> I wouldn't expect that to happen among rational thinking programming types
> (if there is such a thing). If something is unclear then I would expect we
> would just ask for clarification.
>
> > Hey, I never said I wasn't making mistakes myself!! Even without the typos.
> > I'm not a good typist, and have ranted against too verbose constructs quite
> > a few times.
>
> I've been a native speaker of English my whole life and still get it wrong
> often. That's just the nature of English, but it does contain enough redundant
> information that a meaning can usually be derived at least (my guess) 95% of
> the time. No matter what someone writes :) (Which is a danger too, when that
> meaning wasn't meant to be there!)
>
> To bring it back to Euphoria, yes even EUForum being in English is ... a fact
> (I was going to say 'unfortunate', 'a problem' etc but it is just a fact as
> far as I'm concerned). That goes for the interpreter only accepting English
> keywords, etc.
>
> For EUForum, I don't think there's much we can do about that, at least not
> until machine translation reaches acceptable quality (maybe another 5 years?).
>
> For the interpreter itself, the only solution I can think of is to replace
> keywords dynamically in an editor depending on which language is selected by
> the end user. And by replace, I mean visually replace, not actually replace
> them in the file. So the keywords in the source file are all English, but when
> opened with "this" editor 'while' would be seen as '[insert language
> equivalent here]' (if it's not in a string literal, etc).
>

I wasn't that ambitious. Anyone with a C compiler and clear instructions can build from source and edit keylist.e, replacing any names as desired.
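For comparison, the display-only replacement Gary describes could be sketched roughly as below. This is only an illustration in Python, not anything that exists in an editor today: the French labels in `DISPLAY_NAMES`, the function name `display_view` and the simplified token rules (double-quoted strings, single-quoted characters, `--` comments) are all assumptions made for the example.

```python
import re

# Hypothetical display mapping (English keyword -> French label).
# These labels are invented for illustration; they are not part of Euphoria.
DISPLAY_NAMES = {
    "while": "tantque",
    "do": "faire",
    "end": "fin",
    "if": "si",
    "then": "alors",
}

# One alternative per token kind. Strings, character literals and comments
# are listed first so that keywords inside them are matched as part of the
# literal or comment and therefore left untouched.
TOKEN = re.compile(
    r'(?P<string>"(?:\\.|[^"\\\n])*")'
    r"|(?P<char>'(?:\\.|[^'\\\n])')"
    r"|(?P<comment>--[^\n]*)"
    r"|(?P<word>[A-Za-z_][A-Za-z0-9_]*)"
)


def display_view(source, names=DISPLAY_NAMES):
    """Return the text an editor would show; the source string is not modified."""
    def swap(match):
        if match.lastgroup == "word":
            # Replace only whole words found in the mapping.
            return names.get(match.group(), match.group())
        return match.group()
    return TOKEN.sub(swap, source)
```

The file on disk keeps its English keywords; only the string returned by `display_view` changes, which is the "visually replace" behaviour described in the quote.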
With a small mod in the parser, the whole upper half of UTF-8 is available for identifiers. I was thinking only of having é, ç or ü in variable or routine names. These are e acute, c with cedilla and u with umlaut; they may display in a strange way on some code pages. And we have to accept © or ¿ (copyright sign and inverted question mark), because they may mean some reasonable letter on a different code page.

> That leaves the routine names of any public interface you make available for
> an include. A good example of a way around that (albeit for a relatively
> simple API) is Aku's INI File library. The main file is inifile.e which
> exports tulisIni(..) and ambilIni(..). If you want the English equivalent,
> you include inifile-en.e which wraps those two functions with English names.
>
> So maybe an editor (like Ken's magic IDE) is the way to go to offset some of
> these language issues. I realise this doesn't cover different character sets,
> or the need for Unicode for non-western or Cyrillic languages though; that's
> an issue for us to decide whether to incorporate native Unicode into Euphoria
> (which I am not against).
>

This is a different topic. I'm not opposed to it either, and do not think performance would be affected. But this requires more thought and knowledge.

CChris

> Gary
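As a rough illustration of the parser change mentioned above, here is a minimal sketch in Python rather than in the actual Euphoria sources. It assumes the scanner reads the source as 8-bit bytes, that "upper half" means byte values 0x80 to 0xFF, and that identifiers otherwise start with an ASCII letter followed by letters, digits or underscores. The function names are hypothetical.

```python
def is_ident_start(b: int) -> bool:
    """ASCII letter, or any byte in the upper half (0x80-0xFF)."""
    return (ord("a") <= b <= ord("z")) or (ord("A") <= b <= ord("Z")) or b >= 0x80


def is_ident_char(b: int) -> bool:
    """Anything allowed at the start, plus ASCII digits and underscore."""
    return is_ident_start(b) or (ord("0") <= b <= ord("9")) or b == ord("_")


def is_identifier(name: bytes) -> bool:
    """Check a candidate identifier given as raw bytes from the source file."""
    return (
        len(name) > 0
        and is_ident_start(name[0])
        and all(is_ident_char(b) for b in name[1:])
    )
```

Because the check works on raw bytes, it accepts `café` whether the file is saved in Latin-1 or UTF-8, and it equally accepts a name containing © or ¿, since the scanner cannot tell which code page the author was using.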