1. Status of Edix

(Written in response to a private email)

Work has largely stalled.

I am now thinking of abandoning it and taking what I have learnt to Edita 2.0, sharing as much code and infrastructure with the original (Edita) as possible.

Making copies of those files, so there are two things to maintain, no longer sounds like my best idea, and I have run out of steam given that file printing/report painter(/publisher)/bookmarks/several case and clipboard functions/code folding/(un)comment/compare/directory compare/common code analysis/database viewer(/verify)/macro management/lots of options/some help features/project(/directory/recovery) tree handling/word wrap/message area, and probably quite a few other features are all outstanding. Obviously not all source files can be shared but one thing I will not miss is keeping Edita/syn/Euphoria.syn and Edita/syn/Phix.syn in step, and /help/Phix.txt in both Edita and Edix.

Right now I want to enhance the File/Tabs List (Ctrl T) option to make it the best way to switch files. The existing listview would probably still be available, but disappear when another tab/radio option is selected, to be replaced by a tree/filter/whatever. As you say, multiple open tabs lose their charm once you go above a dozen or so. Ideas welcome.

Personally I use Edix for all my documentation work, mainly because it is slightly better at opening files, but still rely on Edita for coding work, mainly because that (still) has far better error handling than Edix. The other big problem with Edix at the moment is that closing a file does not clean up the innards properly, and it crashes if you close a file and then try to re-open that same file. Another problem is that IUP will not let me trap alt-keys correctly, sounding the bell if there is no menu entry for that alt-key - I may have left some dummy-menu-entry-type-stuff in there that needs removing.

Of course there is also wee, but that has other problems (for me). I have demo\wee here which is a hacked copy of 0.38 that runs on phix. (I was under the impression that phix ships with that, but I just checked and obviously I never added it.) Firstly it would need a re-porting of the latest version, but then I would be in much the same boat as with Edix, namely porting 15-years-worth of Edita features into it, plus afaict it does not handle unicode.
Edit: actually, there is a demo\wee36 in bitbucket, not in the distro.

2. Re: Status of Edix

petelomax said...

(Written in response to a private email)

Work has largely stalled. ~ ~ Of course there is also wee, but that has other problems (for me). I have demo\wee here which is a hacked copy of 0.38 that runs on phix. (I was under the impression that phix ships with that, but I just checked and obviously I never added it.) Firstly it would need a re-porting of the latest version, but then I would be in much the same boat as with Edix, namely porting 15-years-worth of Edita features into it, plus afaict it does not handle unicode.
Edit: actually, there is a demo\wee36 in bitbucket, not in the distro.

Is Unicode the same as UTF-8? (My Gnome-Terminal encoding preference is set to UTF-8 Unicode)

From WEE-52 Release notes: Version 0.50 - Set codepage to UTF-8 (thanks Irv)

Regards, Ken

3. Re: Status of Edix

Senator said...
petelomax said...

(Written in response to a private email)

Work has largely stalled. ~ ~ Of course there is also wee, but that has other problems (for me). I have demo\wee here which is a hacked copy of 0.38 that runs on phix. (I was under the impression that phix ships with that, but I just checked and obviously I never added it.) Firstly it would need a re-porting of the latest version, but then I would be in much the same boat as with Edix, namely porting 15-years-worth of Edita features into it, plus afaict it does not handle unicode.
Edit: actually, there is a demo\wee36 in bitbucket, not in the distro.

Is Unicode the same as UTF-8? (My Gnome-Terminal encoding preference is set to UTF-8 Unicode)

From WEE-52 Release notes: Version 0.50 - Set codepage to UTF-8 (thanks Irv)

Regards, Ken

Unicode is a 16 bit character list to encompass all the languages of the world. It was not enough for some of the Asian countries’ language needs and it is now a 20 bit character. Older programmers could not cope with 16 bit characters as everybody was geared to 8 bit characters. Even incorporating ANSI (8 bit) instead of ASCII (7 bit) was a big challenge. So, led by the Linux developers and the needs of the Internet, they developed 8 bit UTF8 to allow for the 16 bit Unicode characters. The calisthenics involved are beyond my ability to explain, but they ended up converting all of the Unicode characters - each 2 byte character was converted to 2-5 byte characters. There are many free conversion routines available for Unicode to UTF8 and vice versa. But definitely UTF8 and Unicode are not the same.

4. Re: Status of Edix

Senator said...
petelomax said...

plus afaict it does not handle unicode.

Is Unicode the same as UTF-8?

Yep, or more accurately UTF-8 is 1 of the 5 possible storage formats for Unicode.

Bhupen1277 said...

Unicode is a 16 bit character list

Not quite, that is UTF-16(LE or BE), and that is 2 of the other possible storage formats for Unicode (the remaining ones being UTF-32BE/LE).
(Actually, GB18030 might also technically be Unicode, and maybe things like SCSU and BOCU1, but I digress.) You are quite right in that routines to convert UTF8/16/32 exist.
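To make the "five storage formats" concrete, here is a minimal sketch in Python (chosen only for illustration; it has nothing to do with Edita, Edix or wee). It encodes one character from the original 16-bit range and one from outside it in each of the five formats. The helper name `encodings_of` is made up for this example.

```python
# The five Unicode storage formats named above, as Python codec names.
FORMATS = ["utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"]

def encodings_of(text):
    """Return {format name: encoded bytes} for each of the five formats."""
    return {name: text.encode(name) for name in FORMATS}

if __name__ == "__main__":
    # U+20AC EURO SIGN fits in 16 bits; U+1F600 lies outside that range.
    for ch in ("\u20ac", "\U0001F600"):
        print("U+%04X" % ord(ch))
        for name, data in encodings_of(ch).items():
            hexed = " ".join("%02x" % b for b in data)
            print("  %-9s %d bytes: %s" % (name, len(data), hexed))
```

The same character yields a different byte sequence in each format, which is why "Unicode" on its own does not say how a file is stored.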

Senator said...

From WEE-52 Release notes: Version 0.50 - Set codepage to UTF-8 (thanks Irv)

Good to know. By "afaict" I intended to stress the "dozen-plus-versions-out-of-date" side.

5. Re: Status of Edix

I was not clear in my first reply.

Unicode is a character set and UTF8 is a method of representing the character set in 8 bit bytes. Hence the representation can be 1-4 bytes long, because currently there are 136,755 characters covering 139 modern and historic scripts.

The original Unicode character could be represented by precisely 2 bytes each. Therefore, writing a piece of software to reach the nth character in a sequence, was very simple, IF you represented them in 2 byte fields, which is what Microsoft does internally and in their text files. However, when UTF8 is used to represent these, the nth character position cannot be estimated or guesstimated; you have to crawl along to find the nth character. Therefore, UTF16 was invented, and it was good for the original extent (64K characters) but not enough for the extended characters.
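A rough sketch of what "crawling along" means for UTF8, written in Python purely for illustration. It assumes the input is valid UTF8 and relies on the fact that continuation bytes always have the bit pattern 10xxxxxx. The function name `utf8_offset_of_nth` is hypothetical.

```python
def utf8_offset_of_nth(data, n):
    """Return the byte offset at which the n-th (0-based) character starts.

    Assumes data is valid UTF-8. Every byte that is not a continuation
    byte (0b10xxxxxx) marks the start of a new character.
    """
    seen = -1
    for offset, byte in enumerate(data):
        if (byte & 0xC0) != 0x80:   # not a continuation byte
            seen += 1
            if seen == n:
                return offset
    raise IndexError("fewer than %d characters" % (n + 1))

if __name__ == "__main__":
    # 'a' (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes), U+1F600 (4 bytes), 'z'
    data = "a\u00e9\u20ac\U0001F600z".encode("utf-8")
    print([utf8_offset_of_nth(data, i) for i in range(5)])   # [0, 1, 3, 6, 10]
```

The offsets cannot be computed from the index alone; each lookup has to scan the bytes before it.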

UTF32 is the latest, and it is a 32 bit representation of each individual character and therefore, you can arrive at the nth character in a sequence by calculating the exact position in number of bytes. In a search algorithm this fast access gives speeds that easily overcome the bulkiness of 32 bits per character.
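The "calculate the exact position" idea for UTF32 can be sketched as follows, again in Python and only as an illustration. It assumes little-endian UTF32 data without a byte order mark; the function name `utf32le_nth` is made up for the example.

```python
def utf32le_nth(data, n):
    """Return the n-th (0-based) code point of UTF-32LE data (no BOM)."""
    start = 4 * n                       # every character is exactly 4 bytes
    if n < 0 or start + 4 > len(data):
        raise IndexError("character index out of range")
    return int.from_bytes(data[start:start + 4], "little")

if __name__ == "__main__":
    data = "a\u00e9\u20ac\U0001F600z".encode("utf-32-le")
    print(len(data))                    # 20: five characters, four bytes each
    print(hex(utf32le_nth(data, 3)))    # 0x1f600
```

No scanning is needed: the byte offset is simply four times the index.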

Euphoria has a 4 byte integer as standard. Therefore, let us say you create a character set at the private area Hex E000-F8FF. It is easy to create a Euphoria type, which allows integers in that range, and call it Euphoria type “MyPrivateChar”. That is exactly what I am doing in my work with characters discussed in another thread here. In fact, I have created separate types:

“MyPrivateChar” – E000 – F8FF

“MyPrivateHindi” – E000 – E1FF

“MyPrivateGuj” – E200 – E3FF

“MyPrivateBengali” – E400 – E5FF

“MyPrivateKannada” – E600 – E7FF

And so on for all the Indic languages.

Actually, currently, three private use areas are defined: one in the Basic Multilingual Plane (U+E000–U+F8FF), and one each in, and nearly covering, planes 15 and 16 (U+F0000–U+FFFFD, U+100000–U+10FFFD). All these areas are accessible with the use of 32 bit integer values of Euphoria and hence my use of wxEuphoria for my work.
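The range checks described above might look something like this. It is a sketch in Python rather than Euphoria, so it shows the idea of the range tests and not the actual Euphoria type definitions. The ranges are the ones listed in this post; the names `PRIVATE_USE_AREAS`, `MY_PRIVATE_RANGES`, `is_private_use` and `private_script` are invented for the example.

```python
# The three private use areas listed above.
PRIVATE_USE_AREAS = [
    (0xE000, 0xF8FF),       # Basic Multilingual Plane
    (0xF0000, 0xFFFFD),     # plane 15
    (0x100000, 0x10FFFD),   # plane 16
]

# Per-script sub-ranges from the post (illustrative table, not real code from it).
MY_PRIVATE_RANGES = {
    "MyPrivateHindi":   (0xE000, 0xE1FF),
    "MyPrivateGuj":     (0xE200, 0xE3FF),
    "MyPrivateBengali": (0xE400, 0xE5FF),
    "MyPrivateKannada": (0xE600, 0xE7FF),
}

def is_private_use(cp):
    """True if code point cp lies in any of the three private use areas."""
    return any(lo <= cp <= hi for lo, hi in PRIVATE_USE_AREAS)

def private_script(cp):
    """Return the name of the sub-range containing cp, or None."""
    for name, (lo, hi) in MY_PRIVATE_RANGES.items():
        if lo <= cp <= hi:
            return name
    return None

if __name__ == "__main__":
    print(is_private_use(0xE250), private_script(0xE250))   # True MyPrivateGuj
    print(is_private_use(0x0041), private_script(0x0041))   # False None
```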

6. Re: Status of Edix

Bhupen1277 said...

The original Unicode character could be represented by precisely 2 bytes each. Therefore, writing a piece of software to reach the nth character in a sequence, was very simple, IF you represented them in 2 byte fields

when UTF8 is used to represent these, the nth character position cannot be estimated or guesstimated; you have to crawl along to find the nth character.

Yep, but...

Bhupen1277 said...

Therefore, UTF16 was invented, and it was good for the original extent (64K characters) but not enough for the extended characters.

You are thinking of UCS-2. UTF-16 is an extension of UCS-2, which can represent the full new Unicode character set, including the extended characters. However, it does this by using four bytes for the extended characters. (The original set is still represented by two bytes.) So UTF-16 suffers from the same problem as UTF-8 in terms of getting the nth character (although fewer characters are affected).
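A small sketch of that point, in Python for illustration only. It assumes well-formed little-endian UTF-16 without a byte order mark, and uses the fact that a 16-bit unit in the range D800-DBFF is the first half of a four-byte (surrogate pair) character. The function name `utf16le_offset_of_nth` is hypothetical.

```python
def utf16le_offset_of_nth(data, n):
    """Return the byte offset where the n-th (0-based) character starts.

    Assumes well-formed UTF-16LE. A unit in D800-DBFF is a high surrogate,
    meaning the character occupies two 16-bit units (four bytes).
    """
    offset = 0
    for _ in range(n):
        if offset + 2 > len(data):
            raise IndexError("character index out of range")
        unit = int.from_bytes(data[offset:offset + 2], "little")
        offset += 4 if 0xD800 <= unit <= 0xDBFF else 2
    if offset >= len(data):
        raise IndexError("character index out of range")
    return offset

if __name__ == "__main__":
    # 'a' and 'z' take two bytes each; U+1F600 takes four.
    data = "a\U0001F600z".encode("utf-16-le")
    print(len(data))                                          # 8
    print([utf16le_offset_of_nth(data, i) for i in range(3)]) # [0, 2, 6]
```

As with UTF-8, the offset of the nth character depends on what precedes it, so a plain `2 * n` calculation is only safe when no extended characters are present.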
