1. Bug in dir() leads to Unicode and win_dir()

While working on my Backup Utility for work, I found a bug in
Euphoria's dir() routine. It does not support extended ASCII
characters and just returns a question mark '?' in place of unknown
characters.

I have therefore written a function called win_dir() which uses
Unicode strings to fully support all available characters, as well as
path names of up to 32,767 characters instead of just 255. It only works
on Windows, obviously, since it calls FindFirstFileW, FindNextFileW,
and FindClose in kernel32.dll.

Seeing as how I forced myself to learn how Unicode strings work, I
also wrote a Unicode Conversion Library that can convert plain ASCII
strings (or sequences) to Unicode values, as well as two very useful
routines, allocate_unicode() and peek_unicode().
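
For anyone who has not dealt with wide strings before, here is a minimal sketch of the kind of conversion involved. It is not taken from unicode.zip: the names to_utf16le() and from_utf16le() are made up for illustration, and it assumes every character code fits in 16 bits (no surrogate pairs).

```euphoria
-- Hypothetical helpers for illustration only, NOT the code in unicode.zip.
-- Assumes each element of 'text' is a character code in the range 0..#FFFF.

-- Convert a sequence of character codes to UTF-16LE bytes, followed by
-- a two-byte zero terminator, i.e. the layout the W functions expect.
function to_utf16le(sequence text)
    sequence bytes
    bytes = {}
    for i = 1 to length(text) do
        -- low byte first, then high byte (little-endian)
        bytes = bytes & and_bits(text[i], #FF) & floor(text[i] / 256)
    end for
    return bytes & {0, 0}
end function

-- Reverse direction: read byte pairs until the zero terminator.
function from_utf16le(sequence bytes)
    sequence text
    integer code
    text = {}
    for i = 1 to length(bytes) - 1 by 2 do
        code = bytes[i] + 256 * bytes[i + 1]
        if code = 0 then
            exit
        end if
        text = text & code
    end for
    return text
end function
```

The byte sequence returned by to_utf16le() could be written with poke() into memory obtained from allocate(), which is presumably roughly what a routine like allocate_unicode() has to do.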

Both libraries, windir.zip and unicode.zip, have just been uploaded to
the Archive. I hope others find these as useful as I do, especially
Derek, who IMHO should implement Unicode in Win32Lib so that it will
be more compatible with the Windows environment.

~Greg


2. Re: Bug in dir() leads to Unicode and win_dir()

Greg Haberek wrote:

[snip]
>I hope others find these as useful as I do, especially
> Derek, who IMHO should implement Unicode in Win32Lib so that it will
> be more compatible with the Windows environment.

No argument there ;-)

-- 
Derek Parnell
Melbourne, Australia
irc://irc.sorcery.net:9000/euphoria


3. Re: Bug in dir() leads to Unicode and win_dir()

Greg Haberek wrote:
> 
> While working on my Backup Utility for work, I found a bug in
> Euphoria's dir() routine. It does not support extended ASCII
> characters and just returns a question mark '?' in place of unknown
> characters.

Hi. While this is great news and probably explains a crash I am having at
work, I have encountered a problem with this, i.e.:

junk = win_dir(folname)

the trace shows after this line:
folname={70F,58:,92\,100d,97a,116t,97a,92\,109m,117u,115s,105i,99c}
junk={{{109m,117u,115s,105i,99c},{100d},0,105i,7,7,0,8,23}}


Which is not right, since there is no 'music' subfolder of "F:\data\music".

I'm running WinXP SP2 and the F: Drive is NTFS.

Also, while I'm here, when this didn't work I was attempting to add this 
to the include file:

global function walk_win_dir(sequence path_name, integer your_function,
                               integer scan_subdirs)
    object nothing

    my_dir = routine_id("win_dir") -- use win_dir() please
                                   -- my_dir is global atom in file.e
    nothing = walk_dir(path_name, your_function, scan_subdirs)
    my_dir = -2                    -- constant DEFAULT=-2 in file.e
    return nothing
end function


I don't know if this can be incorporated?

Gary


4. Re: Bug in dir() leads to Unicode and win_dir()

Greg Haberek wrote:
 
 > While working on my Backup Utility for work, I found a bug in
 > Euphoria's dir() routine. It does not support extended ASCII
 > characters and just returns a question mark '?' in place of
 > unknown characters.

What is this bug?

Euphoria's dir() works properly for me.
It returns the pure Russian characters from
the extended ASCII table, cp866.

Another thing: directory and file names are in OEM
encoding, not in ANSI, so I have to recode them from
OEM (cp866) into ANSI (cp1251) if I want to see them,
say, in Notepad under the native Windows encoding.

But anyway, it is just a Windows feature, not a dir() bug,
I think. Please correct me if I'm wrong.

Regards,
Igor Kachan
kinz at peterlink.ru


5. Re: Bug in dir() leads to Unicode and win_dir()

> But anyway it is just Windows feature, not dir() bug,
> I think. Correct me please, if I'm wrong.

All of Windows 2000/XP uses the Unicode (W) versions of functions. The
ANSI (A) versions are just there for backward compatibility. When
converting from Unicode to ANSI, any unknown characters are replaced by
a '?' as the default character. I just learned this recently. So in
reality, this isn't even a bug, it's a Microsoft "feature", but either
way, it has caused me some issues. I tried creating file names with
characters from 129-255 and all I got back for some of them was a '?'
character.

Thus the need for win_dir() with Unicode support. Besides, who
doesn't want 32,767 character file names? :)

~Greg


6. Re: Bug in dir() leads to Unicode and win_dir()

Hi Greg

My bad. It seems win_dir("F:\\DATA\\MUSIC") literally means "give me the
directory entry MUSIC in F:\DATA".

I guess the confusion on my part is that the standard Euphoria dir() was
giving me F:\DATA\MUSIC\* without me adding a trailing slash to the pathname.

This must be the operating system's default behaviour as dir() just calls
a machine function, like typing in DOS, "dir F:\data\music" will show the
contents of the music folder, not

07/06/2005  12:08 PM    <DIR>          music
               0 File(s)              0 bytes

Now when I do win_dir(pathname & "\\") it works, however replacing dir()
with win_dir() in walk_dir() doesn't recurse properly.

Do you think it's worth trying to find the cause of this and fix it in
win_dir() or to implement a new walk_win_dir() ?

IMHO, the standard Euphoria (/DOS) behaviour is nice, because if you want
the folder attributes for the folder you are looking at you need only 
examine the '.' entry?

Good job on this BTW :)


Gary

ags wrote:
> 
> Greg Haberek wrote:
> > 
> > While working on my Backup Utility for work, I found a bug in
> > Euphoria's dir() routine. It does not support extended ASCII
> > characters and just returns a question mark '?' in place of unknown
> > characters.
> 
> Hi. While this is great news and probably explains a crash I am having at
> work, I have encountered a problem with this, i.e.:
> 
> junk = win_dir(folname)
> 
> the trace shows after this line:
> folname={70F,58:,92\,100d,97a,116t,97a,92\,109m,117u,115s,105i,99c}
> junk={{{109m,117u,115s,105i,99c},{100d},0,105i,7,7,0,8,23}}
> 
> Which is not right, since there is no 'music' subfolder of "F:\data\music".
> 
> I'm running WinXP SP2 and the F: Drive is NTFS.
> 
> Also, while I'm here, when this didn't work I was attempting to add this 
> to the include file:
> 
> global function walk_win_dir(sequence path_name, integer your_function,
>                                integer scan_subdirs)
>     object nothing
> 
>     my_dir = routine_id("win_dir") -- use win_dir() please
>                                    -- my_dir is global atom in file.e
>     nothing = walk_dir(path_name, your_function, scan_subdirs)
>     my_dir = -2                    -- constant DEFAULT=-2 in file.e
>     return nothing
> end function
> 
> I don't know if this can be incorporated?
> 
> Gary
>


7. Re: Bug in dir() leads to Unicode and win_dir()

> junk = win_dir(folname)
>
> the trace shows after this line:
> folname={70F,58:,92\,100d,97a,116t,97a,92\,109m,117u,115s,105i,99c}
> junk={{{109m,117u,115s,105i,99c},{100d},0,105i,7,7,0,8,23}}
>
> Which is not right, since there is no 'music' subfolder of "F:\data\music".
>
> I'm running WinXP SP2 and the F: Drive is NTFS.

I forgot to mention this. I'll add it to the include file, here's why:

Let's say "F:\data\music" is a folder. You want all the files inside
that folder. You must specify win_dir( "F:\data\music\" ) with the
trailing backslash. If you don't, win_dir() will return the information
*for that folder*.

This information may be found here:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/findfirstfile.asp

Specifically:
<quote>
To examine a directory that is not a root directory, use the path to
that directory, without a trailing backslash. For example, an argument
of "C:\windows" returns information about the directory "C:\windows",
not about a directory or file in "C:\windows". An attempt to open a
search with a trailing backslash always fails.
</quote>

What Microsoft is saying is this: to obtain information about a
specific directory, specify "F:\data\music"; to get information about
the files in a directory, specify "F:\data\music\*"; and
"F:\data\music\" by itself is invalid.

Right now, win_dir() checks for a trailing backslash and, if there is
one, adds a * to get all files. If you don't specify a trailing
backslash, then I have no way of knowing whether you meant a specific
file or a folder.
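
The rule just described can be sketched roughly as follows. This is only an illustration, not the actual win_dir() source; search_pattern() is a hypothetical name.

```euphoria
-- Hypothetical helper for illustration only, NOT the win_dir() source.
-- Builds the search string that would be handed to FindFirstFileW.
function search_pattern(sequence path)
    if length(path) > 0 and path[length(path)] = '\\' then
        -- trailing backslash: caller wants the folder's contents
        return path & '*'
    end if
    -- no trailing backslash: look up that single entry as given
    return path
end function
```

With this rule, "F:\data\music\" becomes "F:\data\music\*" and lists the folder, while "F:\data\music" is passed through unchanged and returns the single entry for the folder itself.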


> Also, while I'm here, when this didn't work I was attempting to add this
> to the include file:
>
> I don't know if this can be incorporated?

I just do this, it is much easier:

my_dir = routine_id("win_dir")
junk = walk_dir( path, my_entry_routine, 1 )



8. Re: Bug in dir() leads to Unicode and win_dir()

> Do you think it's worth trying to find the cause of this and fix it in
> win_dir() or to implement a new walk_win_dir() ?

I think I may need to tweak walk_win_dir(). Also I'm trying to prepend
win_ to Euphoria routines, so it will be called win_walk_dir()
instead. I think in order to better support the dir()/win_dir()
compatibility, I'm going to have it check to see if the path is a
directory, and if so, look *in* that directory for files. This will
allow me to make win_walk_dir() like this:

global function win_walk_dir( sequence path, integer rtn_id, integer sub_dirs )
  integer old_rtn, retv

  old_rtn = my_dir
  my_dir = routine_id("win_dir")

  retv = walk_dir( path, rtn_id, sub_dirs )

  my_dir = old_rtn
  return retv
end function


> Good job on this BTW :)

Thanks!! I'm glad someone is already using it. I'm also perfecting the
Unicode support, so check for updates every few days. I think I may
have a way around the A/W versions of dll routines. If so, Derek may
be able to implement my code in the next Win32Lib so it will fully
support Unicode.

~Greg


9. Re: Bug in dir() leads to Unicode and win_dir()

Greg Haberek wrote:
> 
> > Do you think it's worth trying to find the cause of this and fix it in
> > win_dir() or to implement a new walk_win_dir() ?
> 
> I think I may need to tweak walk_win_dir(). Also I'm trying to prepend
> win_ to Euphoria routines, so it will be called win_walk_dir()
> instead. I think in order to better support the dir()/win_dir()
> compatibility, I'm going to have it check to see if the path is a
> directory, and if so, look *in* that directory for files. This will
> allow me to make win_walk_dir() like this:

That's cool, I just didn't want to change win_dir() itself, and in fact
had trouble doing that, hence the hacked walk_win_dir().

No problems about the name though :)

Gary

