1. walk_dir and Chinese filenames
- Posted by annoyed Sep 30, 2009
- 1363 views
- Last edited Oct 01, 2009
It has taken me so-o-o-o long to get on this forum. I can't change my password because I've got a blank security question for which a blank answer is apparently insufficient. Just as well I've got a few email addresses.
ANYWAY
What I really wrote for was to ask: what's being done about dir()'s inability to properly read Chinese filenames? Where I work (as IT Dept. Manager) I get a mix of English and Chinese filenames, and it's simply not good enough when dir() stores question marks for Chinese characters. For example, there's a file called "组合 1.pdf" (Portfolio 1.pdf). It gets stored as "?? 1.pdf".
Now it may have to do with the fact that dir() uses machine_func() rather than the Win32 API call. Maybe I should write my own version of std/filesys.e, replacing all the machine_func()s with API calls. Or maybe keeping the old stuff but putting in ifdefs to catch a windows compile.
Sorry, this sounds like a rant.
Kind regards, Bruce
2. Re: walk_dir and Chinese filenames
- Posted by jimcbrown (admin) Sep 30, 2009
- 1379 views
- Last edited Oct 01, 2009
What I really wrote for was to ask: what's being done about dir()'s inability to properly read Chinese filenames? Where I work (as IT Dept. Manager) I get a mix of English and Chinese filenames, and it's simply not good enough when dir() stores question marks for Chinese characters. For example, there's a file called "组合 1.pdf" (Portfolio 1.pdf). It gets stored as "?? 1.pdf".
This doesn't help you at all, but I have no problems using UTF-8 encoded filenames that consist only of chinese characters. Of course, I'm using Linux.
The machine_func() implementation of M_DIR uses Watcom's readdir() call to get the list of directory entries. I haven't looked at the Watcom docs on this but my guess is that readdir() ends up calling the ANSI version of the W32API, which is why the hanzi ends up converted into question marks.
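As a rough illustration of that guess (written in Python rather than Euphoria, and assuming a Western single-byte ANSI code page such as cp1252, which nobody in the thread has confirmed), a lossy wide-to-ANSI conversion reproduces the reported symptom. The helper name to_ansi_lossy is made up for this sketch:

```python
def to_ansi_lossy(name, codepage="cp1252"):
    """Round-trip a filename through a single-byte code page.

    Characters the code page cannot represent are replaced with '?',
    which mimics what an ANSI directory listing would hand back.
    The cp1252 default is an assumption for illustration only.
    """
    return name.encode(codepage, errors="replace").decode(codepage)


original = "组合 1.pdf"
mangled = to_ansi_lossy(original)
print(mangled)  # the two hanzi come back as question marks
```

Once the name has been through this conversion, the original characters cannot be recovered, which is why the fix has to happen at the API call rather than afterwards.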
Now it may have to do with the fact that dir() uses machine_func() rather than the Win32 API call. Maybe I should write my own version of std/filesys.e, replacing all the machine_func()s with API calls. Or maybe keeping the old stuff but putting in ifdefs to catch a windows compile.
Probably this is the easiest way to go. As long as you are using the Unicode functions and are careful to convert the Unicode strings into sequences and back correctly, you shouldn't have any problems. (Note that puts() and printf() don't support UTF-16, so you'll need to wrap more W32API functions if you want to actually display the file names on the console or write them to a file.)
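To make the "convert the Unicode strings into sequences and back" step concrete, here is a minimal sketch in Python (not Euphoria). It assumes the wide API hands back little-endian UTF-16, and it treats each 16-bit code unit as one integer, roughly the way a Euphoria sequence would hold them. The function names are hypothetical:

```python
import struct


def utf16_units(name):
    """Return the UTF-16 code units of a string as a list of integers."""
    data = name.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def from_utf16_units(units):
    """Rebuild the string from a list of UTF-16 code units."""
    data = struct.pack("<%dH" % len(units), *units)
    return data.decode("utf-16-le")


units = utf16_units("组合 1.pdf")
restored = from_utf16_units(units)
print(len(units), restored == "组合 1.pdf")
```

Characters outside the Basic Multilingual Plane occupy two code units (a surrogate pair), so the length of the list is not always the number of characters.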
3. Re: walk_dir and Chinese filenames
- Posted by DerekParnell (admin) Sep 30, 2009
- 1319 views
- Last edited Oct 01, 2009
what's being done about dir()'s inability to properly read Chinese filenames?
Euphoria does not yet support Unicode. We have started the process, but because it is going to take quite some effort, it will be done in stages. We did not want the lack of total Unicode support to delay the next version of Euphoria.
If you care to supply some code that gets dir() working, we can incorporate it into a future release; otherwise, submit a ticket for the enhancement.
4. Re: walk_dir and Chinese filenames
- Posted by useless Sep 30, 2009
- 1328 views
- Last edited Oct 01, 2009
If you care to supply some code that gets dir() working, we can incorporate it into a future release; otherwise, submit a ticket for the enhancement.
Do like i have a really bad habit of doing: write the code, make it work, test it in a working application for a month, post it, and get ridiculed till you take it down. Wait a year or so, and it will be mainstream by someone else.
Or, do like i am learning to do: write it, test it for years, never release it, and find all your apps need that code because it never makes it into Eu, and you have namespace issues. Like parses() (no, i didn't typo parse()) in strtok v3.
useless
5. Re: walk_dir and Chinese filenames
- Posted by DerekParnell (admin) Oct 01, 2009
- 1298 views
... write the code, make it work, test it in a working application for a month, post it, and ... take it down. Wait a year or so, and it will be mainstream by someone else.
... you have namespace issues ...
If anyone, even Kat, submits code for review and possible inclusion, it will be done when possible. However, it is reviewed, and if you don't like anyone questioning your code, then don't bother to submit it. That does not mean that any given review comment is correct, but one must be prepared to defend one's code or to acknowledge its improvement. Worldwide evidence keeps showing that peer review is the fastest and most efficient way to remove bugs and improve code.
Every single person that has contributed code to Euphoria has had some of it modified by others to improve it. In fact, I would recommend allowing everyone who wants to, to examine one's code before it gets released into Euphoria. The full source code for pre-release Euphoria has been available for scrutiny for some time now and I encourage you all to review it and post improvement suggestions.
As for namespaces, no one has the right to claim ownership for any specific symbol name. If a name you want to use clashes with someone else's code, then either change your symbol's name or use the namespace qualifier where possible.
6. Re: walk_dir and Chinese filenames
- Posted by useless Oct 01, 2009
- 1273 views
... write the code, make it work, test it in a working application for a month, post it, and ... take it down. Wait a year or so, and it will be mainstream by someone else.
... you have namespace issues ...
If anyone, even Kat, submits code for review and possible inclusion, it will be done when possible. However, it is reviewed, and if you don't like anyone questioning your code, then don't bother to submit it. That does not mean that any given review comment is correct, but one must be prepared to defend one's code or to acknowledge its improvement. Worldwide evidence keeps showing that peer review is the fastest and most efficient way to remove bugs and improve code.
As far as i remember, no one suggested any changes at all to the task msg handler in this forum, other than a couple of typos due to wee-hour copy/paste. Ry and i went over its general protocol repeatedly on irc. But i was pressured enough here to take it down, even tho there is no other msg handler for the tasks.
useless
7. Re: walk_dir and Chinese filenames
- Posted by DerekParnell (admin) Oct 01, 2009
- 1295 views
As far as i remember, no one suggested any changes at all to the task msg handler in this forum, other than a couple of typos due to wee-hour copy/paste. Ry and i went over its general protocol repeatedly on irc. But i was pressured enough here to take it down, even tho there is no other msg handler for the tasks.
This post gives me the impression that you wrote some code and presented it to the forum, which did not result in any significant changes to it. Then later, that code was seriously examined by a number of people on IRC to the point where you got it to be very good code. But in spite of that, one or more people in this forum somehow pressured you to withdraw the code from inclusion in Euphoria.
Have I got the story straight?
8. Re: walk_dir and Chinese filenames
- Posted by useless Oct 01, 2009
- 1313 views
As far as i remember, no one suggested any changes at all to the task msg handler in this forum, other than a couple of typos due to wee-hour copy/paste. Ry and i went over its general protocol repeatedly on irc. But i was pressured enough here to take it down, even tho there is no other msg handler for the tasks.
This post gives me the impression that you wrote some code and presented it to the forum, which did not result in any significant changes to it. Then later, that code was seriously examined by a number of people on IRC to the point where you got it to be very good code. But in spite of that, one or more people in this forum somehow pressured you to withdraw the code from inclusion in Euphoria.
Have I got the story straight?
Yes.
useless
9. Re: walk_dir and Chinese filenames
- Posted by mattlewis (admin) Oct 01, 2009
- 1299 views
What I really wrote for was to ask: what's being done about dir()'s inability to properly read Chinese filenames? Where I work (as IT Dept. Manager) I get a mix of English and Chinese filenames, and it's simply not good enough when dir() stores question marks for Chinese characters. For example, there's a file called "组合 1.pdf" (Portfolio 1.pdf). It gets stored as "?? 1.pdf".
Now it may have to do with the fact that dir() uses machine_func() rather than the Win32 API call. Maybe I should write my own version of std/filesys.e, replacing all the machine_func()s with API calls. Or maybe keeping the old stuff but putting in ifdefs to catch a windows compile.
wxEuphoria uses Unicode. It wraps the wxDir object. It should also have no problem displaying the characters (assuming you use a font that supports those code points).
Matt
10. Re: walk_dir and Chinese filenames
- Posted by ghaberek (admin) Oct 01, 2009
- 1276 views
Been there, done that...
It's been nearly four years since I've touched that code, but anyone who wants to have a go at it, there's your basis for improving dir() in the Interpreter for Windows, as I suggested so very long ago.
-Greg
11. Re: walk_dir and Chinese filenames
- Posted by mattlewis (admin) Oct 01, 2009
- 1277 views
Been there, done that...
It's been nearly four years since I've touched that code, but anyone who wants to have a go at it, there's your basis for improving dir() in the Interpreter for Windows, as I suggested so very long ago.
A Unicode upgrade is planned for a future release of Euphoria (probably 4.1). But the key will be to make sure that it's comprehensive, so that different parts of the language and library don't provide different ways of doing things. It's a fairly big undertaking, and there is likely to be a lot of discussion when the time comes.
Matt
12. Re: walk_dir and Chinese filenames
- Posted by ghaberek (admin) Oct 01, 2009
- 1237 views
A Unicode upgrade is planned for a future release of Euphoria (probably 4.1). But the key will be to make sure that it's comprehensive, so that different parts of the language and library don't provide different ways of doing things. It's a fairly big undertaking, and there is likely to be a lot of discussion when the time comes.
Sounds good. I'll hopefully participate in that discussion. IMHO, Euphoria should have been using Unicode since day one, but I guess the two kinda grew up side-by-side.
-Greg
13. Re: walk_dir and Chinese filenames
- Posted by jeremy (admin) Oct 01, 2009
- 1292 views
As far as i remember, no one suggested any changes at all to the task msg handler in this forum, other than a couple of typos due to wee-hour copy/paste. Ry and i went over its general protocol repeatedly on irc. But i was pressured enough here to take it down, even tho there is no other msg handler for the tasks.
What does this have to do with walk_dir and Chinese filenames? PLEASE, when starting a new thread, do just that: start a new thread! It's so easy to do with this forum. If it spawns from a comment in this thread, then just click the fork link and enter a new subject. If it's a totally new post (or a never-ending rant), you can click New Topic. Can I make this any easier? Hijacking a forum thread is a terrible thing to do, and it happens all too many times.
Jeremy
14. Re: walk_dir and Chinese filenames
- Posted by jeremy (admin) Oct 01, 2009
- 1233 views
It has taken me so-o-o-o long to get on this forum. I can't change my password because I've got a blank security question for which a blank answer is apparently insufficient. Just as well I've got a few email addresses.
Bruce, I'm sorry for the signup problems. Due to your signup, we have fixed a few bugs in the system. The software that runs the forum, news and ticket systems here is pretty new and still going through some debugging. I'll look into the security question problem you've reported.
Jeremy
15. Re: walk_dir and Chinese filenames
- Posted by useless Oct 01, 2009
- 1230 views
As far as i remember, no one suggested any changes at all to the task msg handler in this forum, other than a couple of typos due to wee-hour copy/paste. Ry and i went over its general protocol repeatedly on irc. But i was pressured enough here to take it down, even tho there is no other msg handler for the tasks.
What does this have to do with walk_dir and Chinese filenames? PLEASE, when starting a new thread, do just that: start a new thread! It's so easy to do with this forum. If it spawns from a comment in this thread, then just click the fork link and enter a new subject. If it's a totally new post (or a never-ending rant), you can click New Topic. Can I make this any easier? Hijacking a forum thread is a terrible thing to do, and it happens all too many times.
Jeremy
Sorry, but back in post #3 of this thread, Derek suggested we could effect changes in Euphoria:
If you care to supply some code that gets dir() working, we can incorporate it into a future release; otherwise, submit a ticket for the enhancement.
Which, as a blanket statement, is not true in my experience.
useless
16. Re: walk_dir and Chinese filenames
- Posted by jeremy (admin) Oct 01, 2009
- 1277 views
Sorry, but back in post #3 of this thread, Derek suggested we could effect changes in Euphoria:
If you care to supply some code that gets dir() working, we can incorporate it into a future release; otherwise, submit a ticket for the enhancement.
Which, as a blanket statement, is not true in my experience.
useless
Yes, and your follow-up has nothing to do with walk_dir. Derek's did. You should have forked the message with a subject that says "Creating a ticket does nothing", to which someone would probably point to the hundreds of closed, fixed, applied tickets.
Your reply has nothing to do with walk_dir and is just the same old rant that drives everyone nuts because we've heard the exact same thing forever. It seems that you try to sneak it in any possible way you can conceive.
Jeremy
Forked into: submitting code