1. walk_dir and Chinese filenames

It has taken me so-o-o-o long to get on this forum. I can't change my password because I've got a blank security question for which a blank answer is apparently insufficient. Just as well I've got a few email addresses.

ANYWAY

What I really wrote for was to ask: what's being done about dir()'s inability to properly read Chinese filenames? Where I work (as IT Dept. Manager) I get a mix of English and Chinese filenames, and it's simply not good enough when dir() stores question marks for Chinese characters. For example, there's a file called "组合 1.pdf" (Portfolio 1.pdf). It gets stored as "?? 1.pdf".

Now it may have to do with the fact that dir() uses machine_func() rather than the Win32 API call. Maybe I should write my own version of std/filesys.e, replacing all the machine_func()s with API calls. Or maybe keeping the old stuff but putting in ifdefs to catch a Windows compile.

Sorry, this sounds like a rant.

Kind regards, Bruce

2. Re: walk_dir and Chinese filenames

annoyed said...

What I really wrote for was to ask: what's being done about dir()'s inability to properly read Chinese filenames? Where I work (as IT Dept. Manager) I get a mix of English and Chinese filenames, and it's simply not good enough when dir() stores question marks for Chinese characters. For example, there's a file called "组合 1.pdf" (Portfolio 1.pdf). It gets stored as "?? 1.pdf".

This doesn't help you at all, but I have no problems using UTF-8 encoded filenames that consist only of Chinese characters. Of course, I'm using Linux.

The machine_func() implementation of M_DIR uses Watcom's readdir() call to get the list of directory entries. I haven't looked at the Watcom docs on this but my guess is that readdir() ends up calling the ANSI version of the W32API, which is why the hanzi ends up converted into question marks.
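The substitution guessed at here can be reproduced outside Euphoria. The following is a minimal Python sketch, not the Watcom or Euphoria code path itself. It assumes the ANSI code page is Windows-1252 (the code page on Bruce's machine isn't stated) and that characters the code page lacks are replaced with "?", as a lossy wide-to-ANSI conversion typically does.

```python
# Illustration only: mimics a lossy wide-to-ANSI conversion.
# Assumption: the ANSI code page is Windows-1252, which contains no hanzi.
name = "组合 1.pdf"

# errors="replace" substitutes "?" for every character the code page lacks.
ansi_bytes = name.encode("cp1252", errors="replace")
ansi_name = ansi_bytes.decode("cp1252")

print(ansi_name)  # ?? 1.pdf
```

Once the name has been narrowed this way, the original characters cannot be recovered, so the fix has to happen before the conversion, by calling the wide-character API.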

annoyed said...

Now it may have to do with the fact that dir() uses machine_func() rather than the Win32 API call. Maybe I should write my own version of std/filesys.e, replacing all the machine_func()s with API calls. Or maybe keeping the old stuff but putting in ifdefs to catch a Windows compile.

Probably this is the easiest way to go. As long as you are using the Unicode functions and are careful to convert the Unicode strings into sequences and back correctly, you shouldn't have any problems. (Note that puts() and printf() don't support UTF-16, so you'll need to wrap more W32API functions if you want to actually display the file names on the console or write them to a file.)
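As a rough illustration of the "convert the Unicode strings into sequences and back" step, here is a Python sketch rather than Euphoria code. The helper names `utf16_to_codepoints` and `codepoints_to_utf16` are hypothetical; they stand in for the peek/poke glue a Euphoria wrapper would need around the wide-character API. The sketch assumes NUL-terminated UTF-16LE buffers, which is what the Unicode Win32 functions use.

```python
# Hypothetical helpers (names are not from any Euphoria library).
# Assumption: buffers are little-endian UTF-16 with a two-byte NUL terminator.

def utf16_to_codepoints(raw: bytes) -> list:
    """Decode a NUL-terminated UTF-16LE buffer into a list of code points."""
    end = len(raw)
    # Scan 16-bit units for the terminator; a lone zero byte is not enough.
    for i in range(0, len(raw) - 1, 2):
        if raw[i] == 0 and raw[i + 1] == 0:
            end = i
            break
    # Decoding joins surrogate pairs, so each list item is one full code point.
    return [ord(ch) for ch in raw[:end].decode("utf-16-le")]


def codepoints_to_utf16(points: list) -> bytes:
    """Encode a list of code points as a NUL-terminated UTF-16LE buffer."""
    text = "".join(chr(p) for p in points)
    return text.encode("utf-16-le") + b"\x00\x00"


buffer = "组合 1.pdf".encode("utf-16-le") + b"\x00\x00"
points = utf16_to_codepoints(buffer)
round_trip = codepoints_to_utf16(points)
```

The detail that is easy to get wrong is that the terminator is a whole 16-bit zero, not a single zero byte, and that characters outside the Basic Multilingual Plane occupy two 16-bit units. Working in code points, as above, keeps a sequence element equal to one character.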

3. Re: walk_dir and Chinese filenames

Bruce said...

what's being done about dir()'s inability to properly read Chinese filenames?

Euphoria does not yet support Unicode. We have started the process, but because it is going to take quite some effort, it will be done in stages. We did not want the lack of total Unicode support to delay the next version of Euphoria.

If you care to supply some code that gets dir() working, we can incorporate it into a future release; otherwise, submit a ticket for the enhancement.

4. Re: walk_dir and Chinese filenames

DerekParnell said...

If you care to supply some code that gets dir() working, we can incorporate it into a future release; otherwise, submit a ticket for the enhancement.


Do like i have a really bad habit of doing: write the code, make it work, test it in a working application for a month, post it, and get ridiculed till you take it down. Wait a year or so, and it will be mainstream by someone else.

Or, do like i am learning to do: write it, test it for years, never release it, and find all your apps need that code because it never makes it into Eu, and you have namespace issues. Like parses() (no, i didn't typo parse()) in strtok v3.

useless

5. Re: walk_dir and Chinese filenames

Kat said...

... write the code, make it work, test it in a working application for a month, post it, and ... take it down. Wait a year or so, and it will be mainstream by someone else.

... you have namespace issues ...

If anyone, even Kat, submits code for review and possible inclusion, it will be done when possible. However, it is reviewed, and if you don't like anyone questioning your code, then don't bother to submit it. That does not mean that any given review comment is correct, but you must be prepared to defend your code or to acknowledge its improvement. Worldwide evidence keeps showing that peer-reviewed code is the fastest and most efficient way to remove bugs and improve code.

Every single person that has contributed code to Euphoria has had some of it modified by others to improve it. In fact, I would recommend allowing everyone who wants to, to examine one's code before it gets released into Euphoria. The full source code for pre-release Euphoria has been available for scrutiny for some time now and I encourage you all to review it and post improvement suggestions.

As for namespaces, no one has the right to claim ownership for any specific symbol name. If a name you want to use clashes with someone else's code, then either change your symbol's name or use the namespace qualifier where possible.

6. Re: walk_dir and Chinese filenames

DerekParnell said...
Kat said...

... write the code, make it work, test it in a working application for a month, post it, and ... take it down. Wait a year or so, and it will be mainstream by someone else.

... you have namespace issues ...

If anyone, even Kat, submits code for review and possible inclusion, it will be done when possible. However, it is reviewed, and if you don't like anyone questioning your code, then don't bother to submit it. That does not mean that any given review comment is correct, but you must be prepared to defend your code or to acknowledge its improvement. Worldwide evidence keeps showing that peer-reviewed code is the fastest and most efficient way to remove bugs and improve code.

As far as i remember, no one suggested any changes at all to the task msg handler in this forum, other than a couple typos due to wee-hour copy/paste. Ry and i went over its general protocol repeatedly on irc. But i was pressured enough here to take it down, even tho there is no other msg handler for the tasks.

useless

7. Re: walk_dir and Chinese filenames

Kat said...

As far as i remember, no one suggested any changes at all to the task msg handler in this forum, other than a couple typos due to wee-hour copy/paste. Ry and i went over its general protocol repeatedly on irc. But i was pressured enough here to take it down, even tho there is no other msg handler for the tasks.

This post gives me the impression that you wrote some code, and presented it to the forum, which did not result in any significant changes to it. Then later, that code was seriously examined by a number of people on IRC to the point where you got it to be very good code. But in spite of that, one or more people in this forum somehow pressured you to withdraw the code from inclusion to Euphoria.

Have I got the story straight?

8. Re: walk_dir and Chinese filenames

DerekParnell said...
Kat said...

As far as i remember, no one suggested any changes at all to the task msg handler in this forum, other than a couple typos due to wee-hour copy/paste. Ry and i went over its general protocol repeatedly on irc. But i was pressured enough here to take it down, even tho there is no other msg handler for the tasks.

This post gives me the impression that you wrote some code, and presented it to the forum, which did not result in any significant changes to it. Then later, that code was seriously examined by a number of people on IRC to the point where you got it to be very good code. But in spite of that, one or more people in this forum somehow pressured you to withdraw the code from inclusion to Euphoria.

Have I got the story straight?


Yes.

useless

9. Re: walk_dir and Chinese filenames

annoyed said...

What I really wrote for was to ask: what's being done about dir()'s inability to properly read Chinese filenames? Where I work (as IT Dept. Manager) I get a mix of English and Chinese filenames, and it's simply not good enough when dir() stores question marks for Chinese characters. For example, there's a file called "组合 1.pdf" (Portfolio 1.pdf). It gets stored as "?? 1.pdf".

Now it may have to do with the fact that dir() uses machine_func() rather than the Win32 API call. Maybe I should write my own version of std/filesys.e, replacing all the machine_func()s with API calls. Or maybe keeping the old stuff but putting in ifdefs to catch a Windows compile.

wxEuphoria uses Unicode. It wraps the wxDir object. It should also have no problem displaying the characters (assuming you use a font that supports those code points).

Matt

10. Re: walk_dir and Chinese filenames

Been there, done that...

win_dir()

A Windows-specific directory routine. Returns the same information as Euphoria's native dir(). Uses optional Unicode* strings, so the maximum path name length may be 32,767 characters instead of just 255, and extended characters are fully supported, unlike dir(). *Note: Unicode is only supported in Windows NT/2000/XP. See docs for more info. Dec 17: Bug fixes thanks to Al Getz. Updated demo program.


It's been nearly four years since I've touched that code, but anyone who wants to have a go at it, there's your basis for improving dir() in the Interpreter for Windows, as I suggested so very long ago.

-Greg

11. Re: walk_dir and Chinese filenames

ghaberek said...

Been there, done that...

win_dir()

It's been nearly four years since I've touched that code, but anyone who wants to have a go at it, there's your basis for improving dir() in the Interpreter for Windows, as I suggested so very long ago.

A Unicode upgrade is planned for a future release of Euphoria (probably 4.1). But the key will be to make sure that it's comprehensive, so that different parts of the language and library don't provide different ways of doing things. It's a fairly big undertaking, and there is likely to be a lot of discussion when the time comes.

Matt

12. Re: walk_dir and Chinese filenames

mattlewis said...

A Unicode upgrade is planned for a future release of Euphoria (probably 4.1). But the key will be to make sure that it's comprehensive, so that different parts of the language and library don't provide different ways of doing things. It's a fairly big undertaking, and there is likely to be a lot of discussion when the time comes.

Sounds good. I'll hopefully participate in that discussion. IMHO, Euphoria should have been using Unicode since day one, but I guess the two kinda grew up side-by-side.

-Greg

13. Re: walk_dir and Chinese filenames

useless said...

As far as i remember, no one suggested any changes at all to the task msg handler in this forum, other than a couple typos due to wee-hour copy/paste. Ry and i went over its general protocol repeatedly on irc. But i was pressured enough here to take it down, even tho there is no other msg handler for the tasks.

What does this have to do with walk_dir and Chinese filenames? PLEASE, when starting a new thread, do just that: start a new thread! It's so easy to do with this forum. If it spawns from a comment in this thread, then just click the fork link and enter a new subject. If it's a totally new post (or a never-ending rant), you can click New Topic. Can I make this any easier? Hijacking a forum thread is a terrible thing to do and it happens all too many times.

Jeremy

14. Re: walk_dir and Chinese filenames

annoyed said...

It has taken me so-o-o-o long to get on this forum. I can't change my password because I've got a blank security question for which a blank answer is apparently insufficient. Just as well I've got a few email addresses.

Bruce, I'm sorry for the signup problems. Thanks to your signup, we have fixed a few bugs in the system. The software that runs the forum, news and ticket system here is pretty new and still going through some debugging. I'll look into the security question problem you've reported.

Jeremy

15. Re: walk_dir and Chinese filenames

jeremy said...
useless said...

As far as i remember, no one suggested any changes at all to the task msg handler in this forum, other than a couple typos due to wee-hour copy/paste. Ry and i went over its general protocol repeatedly on irc. But i was pressured enough here to take it down, even tho there is no other msg handler for the tasks.

What does this have to do with walk_dir and Chinese filenames? PLEASE, when starting a new thread, do just that: start a new thread! It's so easy to do with this forum. If it spawns from a comment in this thread, then just click the fork link and enter a new subject. If it's a totally new post (or a never-ending rant), you can click New Topic. Can I make this any easier? Hijacking a forum thread is a terrible thing to do and it happens all too many times.

Jeremy


Sorry, but back in post #3 of this thread, Derek suggested we could effect changes in Euphoria:

Derek said...

If you care to supply some code that gets dir() working, we can incorporate it into a future release; otherwise, submit a ticket for the enhancement.

Which, as a blanket statement, is not true in my experience.

useless

16. Re: walk_dir and Chinese filenames

useless said...

Sorry, but back in post #3 of this thread, Derek suggested we could effect changes in Euphoria:

Derek said...

If you care to supply some code that gets dir() working, we can incorporate it into a future release; otherwise, submit a ticket for the enhancement.

Which, as a blanket statement, is not true in my experience.

useless

Yes, and your follow-up has nothing to do with walk_dir. Derek's did. You should have forked the message with a subject that says "Creating a ticket does nothing", to which someone would probably point to the hundreds of closed, fixed, applied tickets.

Your reply has nothing to do with walk_dir and is just the same old rant that drives everyone nuts because we've heard the exact same thing forever. It seems that you try to sneak it in any possible way you can conceive.

Jeremy


Forked into: submitting code
