OpenEuphoria: Forum: Unicode [was Re: [OT] it's vs its]

1. Unicode [was Re: [OT] it's vs its]

Posted by gshingles <gshingles at g?ail.com> May 29, 2008
647 views
Last edited May 30, 2008

CChris wrote:
> 
>  Chiding someone whose native language is not English for poor wording
> would be ridiculous and inappropriate. Thank goodness, it hasn't happened.

I wouldn't expect that to happen among rational thinking programming types (if
there is such a thing). If something is unclear then I would expect we would
just ask for clarification.

> Hey, I never said I wasn't making mistakes myself!! Even without the typos.
> I'm not a good typist, and have ranted against too verbose constructs quite
> a few times.

I've been a native speaker of English my whole life and still get it wrong
often.  That's just the nature of English, but it does contain enough redundant
information that a meaning can usually be derived at least (my guess) 95% of the
time. No matter what someone writes :) (Which is a danger too, when that meaning
wasn't meant to be there!)

To bring it back to Euphoria, yes even EUForum being in English is ... a fact (I
was going to say 'unfortunate', 'a problem' etc but it is just a fact as far as
I'm concerned). That goes for the interpreter only accepting English keywords,
etc.

For EUForum, I don't think there's much we can do about that, at least not until
machine translation reaches acceptable quality (maybe another 5 years?).

For the interpreter itself, the only solution I can think of is to replace
keywords dynamically in an editor depending on which language is selected by the
end user.  And by replace, I mean visually replace, not actually replace them in
the file.  So the keywords in the source file are all English, but when opened
with "this" editor 'while' would be seen as '[insert language equivalent here]'
(if it's not in a string literal, etc).
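A rough sketch of that display-only substitution, in Euphoria. The keyword table (French labels) and the routine names are hypothetical, and the scanner is deliberately simple: it skips double-quoted strings but does not handle comments or character literals. The source text itself is never modified; only the returned display copy differs.

```euphoria
-- Hypothetical display mapping: English keyword -> label shown to the user.
constant KEYWORDS = {"while", "do", "end"}
constant DISPLAY  = {"tantque", "faire", "fin"}

-- True for characters that can appear inside an identifier or keyword.
function is_ident_char(integer c)
    return (c >= 'a' and c <= 'z') or (c >= 'A' and c <= 'Z')
        or (c >= '0' and c <= '9') or (c = '_')
end function

-- Return a copy of one source line with keywords swapped for display.
-- Text inside double-quoted string literals is copied unchanged.
function display_line(sequence line)
    sequence out, word
    integer i, k, in_string
    out = ""
    i = 1
    in_string = 0
    while i <= length(line) do
        if in_string then
            out &= line[i]
            if line[i] = '\\' and i < length(line) then
                -- copy the escaped character as-is
                i += 1
                out &= line[i]
            elsif line[i] = '\"' then
                in_string = 0
            end if
            i += 1
        elsif line[i] = '\"' then
            in_string = 1
            out &= line[i]
            i += 1
        elsif is_ident_char(line[i]) then
            -- collect a whole word, then look it up in the keyword table
            word = ""
            while i <= length(line) and is_ident_char(line[i]) do
                word &= line[i]
                i += 1
            end while
            k = find(word, KEYWORDS)
            if k then
                word = DISPLAY[k]
            end if
            out &= word
        else
            out &= line[i]
            i += 1
        end if
    end while
    return out
end function

puts(1, display_line("while x < 10 do puts(1, \"while\") end while") & "\n")
-- prints: tantque x < 10 faire puts(1, "while") fin tantque
```

A real editor would also need the reverse mapping when the user types, so that the file on disk always contains the English keywords.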

That leaves the routine names of any public interface you make available for an
include.  A good example of a way around that (albeit for a relatively simple
API) is Aku's INI File library.  The main file is inifile.e which exports
tulisIni(..) and ambilIni(..). If you want the English equivalent, you include
inifile-en.e which wraps those two functions with English names.
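The wrapper pattern might look something like the sketch below. The routines tulis() and ambil() are stand-ins invented for illustration, with made-up signatures; they are NOT the real inifile.e API. In practice the native-language routines would live in one include file and the English wrappers in a second file that includes the first; here both are shown in one file so the sketch is self-contained.

```euphoria
-- Stand-in "library" with Indonesian routine names (hypothetical;
-- not Aku's actual inifile.e).
sequence store_keys, store_vals
store_keys = {}
store_vals = {}

-- tulis = write: store a value under a key.
global procedure tulis(sequence kunci, object nilai)
    integer k
    k = find(kunci, store_keys)
    if k then
        store_vals[k] = nilai
    else
        store_keys = append(store_keys, kunci)
        store_vals = append(store_vals, nilai)
    end if
end procedure

-- ambil = fetch: return the stored value, or the fallback if missing.
global function ambil(sequence kunci, object bawaan)
    integer k
    k = find(kunci, store_keys)
    if k then
        return store_vals[k]
    end if
    return bawaan
end function

-- English wrappers: thin pass-throughs that add no behaviour of their own.
global procedure write_value(sequence key, object value)
    tulis(key, value)
end procedure

global function read_value(sequence key, object fallback)
    return ambil(key, fallback)
end function

write_value("colour", "blue")
puts(1, read_value("colour", "none") & "\n")
```

Because the wrappers only forward their arguments, the two sets of names can never drift apart in behaviour, which is what makes this workable for a small API.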

So maybe an editor (like Ken's magic IDE) is the way to go to offset some of
these language issues. I realise this doesn't cover different character sets,
or the need for Unicode for non-Western or Cyrillic languages, though. Whether
to incorporate native Unicode into Euphoria (which I am not against) is an
issue for us to decide.

Gary


2. Re: Unicode [was Re: [OT] it's vs its]

Posted by Jeremy Cowgar <jeremy at ?owgar?com> May 30, 2008
621 views

gshingles wrote:
>  
> That leaves the routine names of any public interface you make available for
> an include.  A good example of a way around that (albeit for a relatively
> simple API) is Aku's INI File library.  The main file is inifile.e which
> exports tulisIni(..) and ambilIni(..). If you want the English equivalent,
> you include inifile-en.e which wraps those two functions with English names.
> 

Euphoria 4.0 actually has language translation included in the standard library
now, as well as number/currency and date translation.

english.lng
-----------
hello Hello
world World

slang.lng
---------
hello What's up
world dudes

myprog.ex
---------
include locale.e as l

l:lang("english")
printf(1, "%s, %s!\n", {l:w("hello"), l:w("world")})
l:lang("slang")
printf(1, "%s, %s!\n", {l:w("hello"), l:w("world")})

Output:

  Hello, World!
  What's up, dudes!

Language files can contain multi-line messages, and messages can include format
codes for use with printf/sprintf.

--
Jeremy Cowgar
http://jeremy.cowgar.com


3. Re: Unicode [was Re: [OT] it's vs its]

Posted by CChris <christian.cuvier at agri??lture.gouv.fr> May 30, 2008
610 views

gshingles wrote:
> 
> CChris wrote:
> > 
> >  Chiding someone whose native language is not English for poor wording
> > would be ridiculous and inappropriate. Thank goodness, it hasn't happened.
> 
> I wouldn't expect that to happen among rational thinking programming types (if
> there is such a thing). If something is unclear then I would expect we would
> just ask for clarification.
> 
> > Hey, I never said I wasn't making mistakes myself!! Even without the typos.
> > I'm not a good typist, and have ranted against too verbose constructs quite
> > a few times.
> 
> I've been a native speaker of English my whole life and still get it wrong
> often.  That's just the nature of English, but it does contain enough
> redundant information that a meaning can usually be derived at least (my
> guess) 95% of the time. No matter what someone writes :) (Which is a danger
> too, when that meaning wasn't meant to be there!)
> 
> To bring it back to Euphoria, yes even EUForum being in English is ... a fact
> (I was going to say 'unfortunate', 'a problem' etc but it is just a fact as
> far as I'm concerned). That goes for the interpreter only accepting English
> keywords, etc.  
> 
> For EUForum, I don't think there's much we can do about that, at least not
> until machine translation reaches acceptable quality (maybe another 5 years?).
> 
> For the interpreter itself, the only solution I can think of is to replace
> keywords dynamically in an editor depending on which language is selected by
> the end user.  And by replace, I mean visually replace, not actually replace
> them in the file.  So the keywords in the source file are all English, but
> when opened with "this" editor 'while' would be seen as '[insert language
> equivalent here]' (if it's not in a string literal, etc).
> 

I wasn't that ambitious. Anyone with a C compiler and clear instructions can
build from source and edit keylist.e, replacing any names as desired. With a
small mod in the parser, the whole upper half of UTF-8 is available for
identifiers.

I was thinking only of having é, ç or ü in variable or routine names. These are
e acute, c with cedilla and u with umlaut; they may display in a strange way on
some codepages. And we have to accept © or ¿ (copyright and inverted question
mark), because they may mean some reasonable letter on a different code page.
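To illustrate the kind of parser tweak meant here, a sketch in Euphoria of an identifier test that accepts every byte from #80 upward. The routine names are hypothetical and this is not the actual scanner code. It also shows why a byte like #A9 (© in Latin-1) has to be accepted: in UTF-8, é is the byte pair #C3 #A9.

```euphoria
-- Hypothetical identifier-byte test; not the real Euphoria scanner.
function is_id_byte(integer b)
    return (b = '_')
        or (b >= 'a' and b <= 'z')
        or (b >= 'A' and b <= 'Z')
        or (b >= '0' and b <= '9')
        or (b >= #80)   -- upper half: UTF-8 lead and continuation bytes
end function

-- An identifier is non-empty, does not start with a digit,
-- and consists only of identifier bytes.
function is_identifier(sequence bytes)
    if length(bytes) = 0 then
        return 0
    end if
    if bytes[1] >= '0' and bytes[1] <= '9' then
        return 0
    end if
    for i = 1 to length(bytes) do
        if not is_id_byte(bytes[i]) then
            return 0
        end if
    end for
    return 1
end function

-- "été" as UTF-8 bytes: each é is #C3 followed by #A9.
constant ETE = {#C3, #A9, 't', #C3, #A9}

printf(1, "%d\n", is_identifier(ETE))       -- prints 1
printf(1, "%d\n", is_identifier("my-var"))  -- prints 0 ('-' is rejected)
```

Note that a byte-level test like this does not check that the bytes form valid UTF-8; it simply lets anything in the upper half through, which matches the "small mod" described above rather than full Unicode support.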

> That leaves the routine names of any public interface you make available for
> an include.  A good example of a way around that (albeit for a relatively
> simple API) is Aku's INI File library.  The main file is inifile.e which
> exports tulisIni(..) and ambilIni(..). If you want the English equivalent,
> you include inifile-en.e which wraps those two functions with English names.
> 
> So maybe an editor (like Ken's magic IDE) is the way to go to offset some of
> these language issues. I realise this doesn't cover different character sets,
> or the need for Unicode for non-Western or Cyrillic languages, though. Whether
> to incorporate native Unicode into Euphoria (which I am not against) is an
> issue for us to decide.
> 

This is a different topic. I'm not opposed to it either, and do not think
performance would be affected. But this requires more thought and knowledge.

CChris
> Gary
