1. Unicode in Source code
- Posted by SocIoDim May 13, 2013
- 1157 views
Problem: Euphoria only accepts ASCII source files. UTF-16 and UCS-2 files are rejected outright because of the embedded null (0) bytes, which the scanner treats as illegal characters. UTF-8 is widely used because it maintains compatibility with ASCII, but unfortunately bytes 128-255 are reserved for the (strange) shrouding that was used during Euphoria's commercial days.
Solution: Why not write source code in RTF? It is 7-bit ASCII. In theory, RTF allows 8-bit data after the "\bin" control word, but that feature is rarely used, especially in source code.
RTF is a very simple format. You just need to remove all the tags to get pure 7-bit ASCII. Formally it is proprietary, but Microsoft has published the specification and imposes no restrictions on its use. Besides, the format becomes apparent after examining any RTF file. Last but not least, my preferred text editor, FocusWriter, saves files as RTF by default.
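To make "remove all tags" concrete, here is a rough sketch in Python of what such a pre-processing step could look like. It is only an illustration, not part of Euphoria or of any existing tool: the function name rtf_to_text and the list of skipped destinations are my own assumptions, it treats \'hh escapes as Latin-1 code points (the real code page depends on the file), and it does not handle \u Unicode escapes or \bin data.

```python
import re

# Groups whose contents are metadata rather than document text.
# This list is an assumption chosen for the sketch, not a complete one.
SKIP_DESTINATIONS = {"fonttbl", "colortbl", "stylesheet", "info", "pict"}

TOKEN = re.compile(
    r"\\([a-zA-Z]+)(-?\d+)? ?"   # control word, optional number, optional delimiter space
    r"|\\'([0-9a-fA-F]{2})"      # hex escape such as \'e9
    r"|\\([^a-zA-Z])"            # control symbol such as \{ or \*
    r"|([{}])"                   # group open / close
    r"|([^\\{}]+)"               # plain text run
)


def rtf_to_text(rtf: str) -> str:
    """Strip RTF markup and return the plain text (simplified sketch)."""
    out = []
    stack = []        # saved 'skipping' state of each enclosing group
    skipping = False  # True while inside a metadata group
    for m in TOKEN.finditer(rtf):
        word, _param, hexcode, symbol, brace, text = m.groups()
        if brace == "{":
            stack.append(skipping)
        elif brace == "}":
            skipping = stack.pop() if stack else False
        elif skipping:
            continue
        elif word:
            if word in SKIP_DESTINATIONS:
                skipping = True
            elif word in ("par", "line"):
                out.append("\n")
            elif word == "tab":
                out.append("\t")
            # every other control word is formatting and is dropped
        elif hexcode:
            # Assumption: interpret the byte as a Latin-1 code point.
            out.append(chr(int(hexcode, 16)))
        elif symbol:
            if symbol == "*":
                skipping = True      # {\*\destination ...} is optional metadata
            elif symbol in "\\{}":
                out.append(symbol)   # escaped literal character
        elif text:
            # Raw line breaks in an RTF file are not part of the text.
            out.append(text.replace("\r", "").replace("\n", ""))
    return "".join(out)


sample = r'{\rtf1\ansi{\fonttbl{\f0 Courier New;}}\f0\fs20 puts(1, "Hello")\par ? \{1, 2\}\par}'
print(rtf_to_text(sample))
```

Note that braces in the source code itself are stored escaped (\{ and \}) in the RTF file, so a stripper has to restore them rather than treat them as group delimiters.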
2. Re: Unicode in Source code
- Posted by jimcbrown (admin) May 13, 2013
- 1146 views
Problem: Euphoria only accepts ASCII source files. UTF-16 and UCS-2 files are rejected outright because of the embedded null (0) bytes, which the scanner treats as illegal characters. UTF-8 is widely used because it maintains compatibility with ASCII, but unfortunately bytes 128-255 are reserved for the (strange) shrouding that was used during Euphoria's commercial days.
Actually, this is no longer entirely true. Current versions of Euphoria no longer reserve bytes 128-255, and UTF-8 encoded files are supported as source code.
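If you want to check in advance which of the cases discussed in this thread a given file falls into, a small helper along these lines can report it. This is only a sketch in Python: the function name classify_source_bytes and the returned keys are made up for illustration, and it does not reproduce what the Euphoria scanner actually does.

```python
def classify_source_bytes(data: bytes) -> dict:
    """Report the byte-level properties discussed in this thread."""
    # Null bytes are what make UTF-16 / UCS-2 files unacceptable to the scanner.
    has_nul = b"\x00" in data
    # Bytes 128-255 were once reserved; UTF-8 text outside ASCII uses them.
    has_high_bytes = any(b >= 0x80 for b in data)
    # Check whether the whole buffer decodes cleanly as UTF-8.
    try:
        data.decode("utf-8")
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {
        "has_nul": has_nul,
        "has_high_bytes": has_high_bytes,
        "valid_utf8": valid_utf8,
    }


print(classify_source_bytes('puts(1, "é")'.encode("utf-8")))
print(classify_source_bytes('puts(1, "é")'.encode("utf-16-le")))
```

A file that has no null bytes and is valid UTF-8 is the case that, as described above, current versions accept.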