1. Lfn.e

Juergen,

	I started writing an app to test Lfn.e and came upon this:

Several files I was trying to process had been saved as .txt from Internet 
Explorer and contained the '@' sign in their names.  That is what is shown 
in a folder by Windows.
I verified with a disk editor that the '@' sign was #AE on disk.  However,
DOS returns that character as an underscore ( #5F ).
lfn_dir("discov*.txt") returned a list which contained:

"Discover_ Platinum Account Statement 09-26-02.txt"

Note the '_' instead of '@'.  lfn_open() failed to open this file.

	This is very likely a code page problem but I don't have time to research 
it now. Perhaps Igor could help?  I will mail you the test report.
			Thanks,

			Bob

P.S.   command.com's "dir" showed it like above; 4DOS, which I use every 
day, would only show the alias (short form).


2. Re: Lfn.e

Bob wrote:

> Juergen,
>
> 	I started writing an app to test Lfn.e and came upon this:

Many thanks.

> Several files I was trying to process had been saved as .txt from Internet
> Explorer and contained the '@' sign in their names.  That is what is shown
> in a folder by Windows.
> I verified with a disk editor that the '@' sign was #AE on disk.

That is strange. As Ricardo already wrote, normally '@' is #40.

> However, DOS returns that character as an underscore ( #5F ).
>    lfn_dir("discov*.txt") returned a list which contained:
>
> "Discover_ Platinum Account Statement 09-26-02.txt"
>
> Note the '_' instead of '@'.  lfn_open() failed to open this file.

To test this on my system, I created a file by hand, using Windows
Explorer (Win 98), with the name:  "äÄöÖüÜß at 123.txt"

I don't know how the characters in this filename look on your system.
The first 7 chars are special German characters with the codes
#84, #8E, #94, #99, #81, #9A, #E1 on codepage 850.

On a Win 98 DOS box, lfn_dir("*.*") showed the name correctly, and
lfn_open() opened the file. On plain MS-DOS 7.10, the filename consisted
of the first 6 characters (all uppercase) & '~1', as usual.
lfn_dir("*.*") again showed the name correctly, and lfn_open() opened
the file.

> 	This is very likely a code page problem but I don't have time to research
> it now. Perhaps Igor could help?

Hmm ... I always thought that the standard ASCII characters (#00-#7F) are
the same on all codepages. If this is right, then there shouldn't be a
codepage problem with '@'. What codepage are you using? (Type 'chcp' at
the DOS prompt.)
My LFN functions all use the DOS interrupt #21. Is this interrupt
handled by MS-DOS or by 4DOS on your system?
Did the program in Ricardo's text utilities solve the problem for you?
I also downloaded it, but I couldn't see which DOS codepages it supports.
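
The assumption that #00-#7F is shared across codepages, and the codes listed for the German characters, can be checked with a short sketch. It uses Python's built-in codecs rather than anything in Lfn.e, and the helper names are made up for illustration. Taking cp1252 as the Windows "ANSI" page is also an assumption.

```python
# Sketch only: compare how several code pages treat the same byte values.
# cp437/cp850 are DOS (OEM) pages; cp1252 is assumed to be the Windows ANSI page.
CODEPAGES = ("cp437", "cp850", "cp1252")

def ascii_range_is_shared(codepages=CODEPAGES):
    """Return True if bytes #00-#7F decode to the same text in every page."""
    low = bytes(range(0x80))
    return len({low.decode(cp) for cp in codepages}) == 1

def codes_in(text, codepage):
    """Return the byte values of `text` in `codepage`, written like #84."""
    return ["#%02X" % b for b in text.encode(codepage)]

if __name__ == "__main__":
    print(ascii_range_is_shared())
    print(codes_in("@", "cp437"), codes_in("@", "cp850"))
    print(codes_in("\u00e4\u00c4\u00f6\u00d6\u00fc\u00dc\u00df", "cp850"))
```

If the shared-range check holds, a plain '@' cannot be the character that is being mangled, whatever codepage is active.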

I searched Ralf Brown's DOS interrupt list, but didn't find anything
like a 'translate_ANSI_to_current_OEM_codepage()' function. But if there
is a need for it, I can provide tables for translation between ANSI on the
one hand and DOS codepages 437, 850, 852, 863, 865, and 866 on the
other hand ... next year.[1]
I really hope that such a translation normally isn't required for my
LFN functions. At least for MS-DOS 7.10 and codepage 850, it doesn't
seem to be necessary.
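
A translation table of the kind mentioned above could be generated rather than typed in by hand. The following is only a sketch: it is not part of Lfn.e, the function names are hypothetical, and it assumes that "ANSI" means Windows code page 1252. Bytes with no equivalent in the target codepage are mapped to an underscore, which is a choice made here for illustration.

```python
# Sketch only: derive an ANSI -> OEM byte table from Python's built-in codecs.
def build_ansi_to_oem_table(oem, ansi="cp1252"):
    """Map each ANSI byte #80-#FF to its OEM byte, or None if unconvertible."""
    table = {}
    for value in range(0x80, 0x100):
        try:
            char = bytes([value]).decode(ansi)
            table[value] = char.encode(oem)[0]
        except UnicodeError:
            # Undefined in the ANSI page, or no such character in the OEM page.
            table[value] = None
    return table

def ansi_to_oem(name, table, fallback=ord("_")):
    """Translate a byte string; bytes below #80 pass through unchanged."""
    out = bytearray()
    for b in name:
        if b < 0x80:
            out.append(b)
        else:
            out.append(fallback if table[b] is None else table[b])
    return bytes(out)

if __name__ == "__main__":
    for oem in ("cp437", "cp850", "cp852", "cp863", "cp865", "cp866"):
        table = build_ansi_to_oem_table(oem)
        missing = sum(1 for v in table.values() if v is None)
        print(oem, "unconvertible ANSI bytes:", missing)
```

The count printed per codepage gives a rough idea of how many characters cannot survive the trip from ANSI to each DOS codepage.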

Sorry that I can't help you more at the moment.

> I will mail you the test report.
> 			Thanks,
>
> 			Bob

I received it, thank you again.

> P.S.   command.com's "dir" showed it like above; 4DOS, which I use every
> day, would only show the alias (short form).

Regards,
   Juergen

---------------------
[1] BTW, to everyone:
    Merry Christmas and a Happy New Year.
    Frohe Weihnachten und ein gutes Neues Jahr.
    Prettige Kerstdagen en een Gelukkig Nieuwjaar.
    Joyeux Noël et Bonne Année.
    Feliz Navidad y Próspero Año Nuevo.
    Buon Natale e felice Anno Nuovo.


3. Re: Lfn.e

Juergen and Ricardo,

         I must apologize for my bad eyesight and 14" monitor.  I believe
it actually was the 'registered trademark' symbol ( R with a circle around
it ) in my filename.   At least, I can't find the @ symbol.

         "«" I just typed Alt-174 (#AE) before this sentence between quotes
in Eudora.  It looks the same in DOS on my machine. So, that's not part of
the problem, I think.

         Anyway, I'm still getting the failure.  Ralf Brown's list also
shows this:

  INT 21 - Windows95 - LONG FILENAME - FIND FIRST MATCHING FILE
          AX = 714Eh
          CL = allowable-attributes mask (see #01397 at AX=4301h)
                (bits 0 and 5 ignored)
          CH = required-attributes mask (see #01397)
          SI = date/time format (see #01755)
          DS:DX -> ASCIZ filespec (both "*" and "*.*" match any filename)
          ES:DI -> FindData record (see #01756)
  Return: CF clear if successful
              AX = filefind handle (needed to continue search)
              CX = Unicode conversion flags (see #01757)
          CF set on error
              AX = error code
                  7100h if function not supported
<snip>
  Bitfields for Windows95 Unicode conversion flags:
  Bit(s)  Description     (Table 01757)
   0      the returned full filename contains underscores for un-convertable
            Unicode characters
   1      the returned short filename contains underscores for un-convertable
            Unicode characters
  SeeAlso: #01756

INT 21:AX = 714Eh  reg_list[REG_CX] bit 0 *IS* returning 1 on the bad
filenames, and, indeed, lfn_dir() is showing them with underscores (#5F)
instead of #AE. I even tried replacing the #5F with #AE before calling
lfn_open() but that didn't work either.
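
The two flag bits from Table 01757 can be decoded as in this minimal sketch. The constant and function names are made up for illustration. The value passed in stands for whatever the caller reads from CX after INT 21h, AX=714Eh.

```python
# Sketch only: interpret the "Unicode conversion flags" returned in CX.
LONG_NAME_HAS_UNDERSCORES = 0x0001   # bit 0: long (full) filename was lossy
SHORT_NAME_HAS_UNDERSCORES = 0x0002  # bit 1: short (8.3) filename was lossy

def describe_conversion_flags(cx):
    """Return which of the two returned names contain substituted underscores."""
    return {
        "long_name_lossy": bool(cx & LONG_NAME_HAS_UNDERSCORES),
        "short_name_lossy": bool(cx & SHORT_NAME_HAS_UNDERSCORES),
    }

if __name__ == "__main__":
    for cx in (0, 1, 2, 3):
        print(cx, describe_conversion_flags(cx))
```

A caller that sees the long-name bit set knows the returned name is not the real one, which would explain why opening the file by that name fails.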

         My "Regional Settings" in Win98 are set to "English (United States)"

         and, in DOS,    c:\windows\personal>chcp
                         Active code page: 437

         I don't know enough about Unicode to say why the offending
character is considered "un-convertible" by INT 21h, AX=714Eh. I don't know
how to determine whether my system is broken or just set to some odd but
legal configuration.  From what I've read so far, Windows uses the ANSI code
page and DOS uses OEM. There are some "extended" characters which don't map
properly.

         I have been able to create a file on the command line, using
Alt-keypad, with your codes in the name. They work fine with my test
program using lfn.e and they probably display the way you see them.  I see
the lower and upper case a, o, u (alles mit umlaut) followed by the letter
whose name I can't remember. äÄöÖüÜß  I even used #AE in a filename and
that worked but in the long form it's changed to #AB on disk.  I will try
to take screen shots and show what I mean.
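
The change from #AE to #AB on disk fits how long filenames are stored: VFAT keeps them as 16-bit Unicode, so a character typed in DOS is converted from the OEM codepage first. The sketch below illustrates that round trip with Python's codecs, assuming codepage 437 as reported above. The function names are hypothetical, and the underscore fallback only mimics the behaviour described in the interrupt list.

```python
# Sketch only: OEM byte <-> long-filename (UTF-16LE) round trip.
def oem_to_lfn_bytes(oem_name, oem="cp437"):
    """Convert an OEM-encoded name to the UTF-16LE bytes of a long name."""
    return oem_name.decode(oem).encode("utf-16-le")

def lfn_bytes_to_oem(raw, oem="cp437", fallback=b"_"):
    """Convert long-name bytes back to OEM; unconvertible chars become '_'."""
    out = bytearray()
    for char in raw.decode("utf-16-le"):
        try:
            out += char.encode(oem)
        except UnicodeEncodeError:
            out += fallback
    return bytes(out)

if __name__ == "__main__":
    # Alt-174 typed in DOS: OEM #AE is the left chevron, Unicode U+00AB.
    print(oem_to_lfn_bytes(b"\xae").hex())
    # U+00AE (registered sign) stored on disk, read back through cp437 and cp850.
    print(lfn_bytes_to_oem(b"\xae\x00"), lfn_bytes_to_oem(b"\xae\x00", oem="cp850"))
```

Under these assumptions, a registered sign stored as #AE has no counterpart in codepage 437, while codepage 850 does contain one. That would match the underscores seen here, although the thread itself does not confirm it.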

         It may be that there's some Windows setting that's causing this
behavior.  I will share what I find with you if I think I understand it.

                 Thanks for your effort,

                         Bob


4. Re: Lfn.e

Robert Elia wrote:

> Juergen and Ricardo,
>
>          I must apologize for my bad eyesight and 14" monitor.  I believe
> it actually was the  'registered trademark' symbol ( R with a circle around
> it ) in my filename.   At least, I can't find  the @ symbol.

Well, this puzzle seems to be solved. :-)

>          "«" I just typed Alt-174 (#AE) before this sentence between quotes
> in Eudora.  It looks the same in DOS on my machine. So, that's not part of
> the problem, I think.


Here are links to unicode character code tables as PDF-files:

   http://www.unicode.org/charts/PDF/U0000.pdf
   http://www.unicode.org/charts/PDF/U0080.pdf

so it's easier for us to see which characters the other one means.

>          Anyway, I'm still getting the failure.  Ralf Brown's list also
> shows this:
>
>   INT 21 - Windows95 - LONG FILENAME - FIND FIRST MATCHING FILE
>           AX = 714Eh
>           CL = allowable-attributes mask (see #01397 at AX=4301h)
>                 (bits 0 and 5 ignored)
>           CH = required-attributes mask (see #01397)
>           SI = date/time format (see #01755)
>           DS:DX -> ASCIZ filespec (both "*" and "*.*" match any filename)
>           ES:DI -> FindData record (see #01756)
>   Return: CF clear if successful
>               AX = filefind handle (needed to continue search)
>               CX = Unicode conversion flags (see #01757)
>           CF set on error
>               AX = error code
>                   7100h if function not supported
> <snip>
>   Bitfields for Windows95 Unicode conversion flags:
>   Bit(s)  Description     (Table 01757)
>    0      the returned full filename contains underscores for un-convertable
>             Unicode characters
>    1      the returned short filename contains underscores for un-convertable
>             Unicode characters
>   SeeAlso: #01756
>
> INT 21:AX = 714Eh  reg_list[REG_CX] bit 0 *IS* returning 1 on the bad
> filenames. and, indeed, lfn_dir() is showing them with underscores (#5F)
> instead of #AE. I even tried replacing the #5F with #AE before calling
> lfn_open() but that didn't work either.

Strange. Even if I create a whole filename consisting of #AE characters
(Alt-174 in the Windows Explorer), lfn_dir() and lfn_open() work fine.

BUT: By typing Alt-174, I do *not* get the (R) sign! I get chevrons
(#AB in http://www.unicode.org/charts/PDF/U0080.pdf). That means that
my Windows Explorer doesn't use Unicode, right? Interestingly, up to
this point, OEM codepages and DOS do not seem to be involved at all.
But maybe I'm completely wrong ...

      Iiiiii-goooor ?!!  :-))

>          My "Regional Settings" in Win98 are set to "English (United States)"
>
>          and, in Dos,    c:\windows\personal>chcp
>                          Active code page: 437
>
>          I don't know enough about Unicode to say why the offending
> character is considered "un-convertible" by INT 21h, AX=714Eh. I don't know how to
> determine whether my system is broken or just set to some odd but legal
> configuration.  From what I've read so far, windows uses the ANSI code page
> and DOS uses OEM. There are some "extended" characters which don't map
> properly.

When I got my first PC, I read that only standard characters should be
used for building filenames. Maybe this is also true in modern times?

>          I have been able to create a file on the command line using
> Alt-keypad a file with your codes in the name. They work fine with my test
> program using lfn.e and they probably display the way you see them.  I see
> the lower and upper case a, o, u (alles mit umlaut)

Yup. :-)

> followed by the letter whose name I can't remember.

The German name for this sign, which looks like the Greek Beta, is "sz"
(pronounced like "Ess-Tssett" :-) ).
(#DF in http://www.unicode.org/charts/PDF/U0080.pdf).
Well, you see exactly what I typed.

> äÄöÖüÜß  I even used #AE in a filename and
> that worked but in the long form it's changed to #AB on disk.

That's the same thing that happened to me (see above). It's very strange
to me, too. However, #AB works fine for me.  :-)
I *always* hated this character set stuff ...

> I will try to take screen shots and show what I mean.
>
>          It may be that there's some windows setting that's causing this
> behavior.  I will share what I find with you if I think I understand it.
>
>                  Thanks for your effort,
>
>                          Bob

I want to say the same to you.

Regards,
   Juergen

-- 
 /"\  ASCII ribbon    | while not asleep do
 \ /  campain against |    sheep += 1
  X   HTML e-mail and | end while
 / \  news            |


5. Re: Lfn.e

Euler German wrote:

> On 23 Dec 2002, at 21:38, rforno at tutopia.com wrote:
>
>>
>> Juergen:
>> As I said before, Alt-174 (that is, #AE, not #AB; was that a typo, or
>> unicode has this different?) produces a left-chevron («) under Outlook
>> Express and DOS. But #AE shows as R within a circle under several
>> Windows editors. I tried EDXOR, Note Pad, Word Pad, Win Edit, Crimson
>> Editor, and the editor that comes with Borland C++ 5.0, all with the
>> same result. As I'm using code page 850, the same as you, you should be
>> getting the same results. If not, I can't explain why.
>
> Ricardo:
>
> Alt-0174 (note the zero) is ® (Registered) while Alt-174 (no zero) is «
> (left-chevron) in ANSI char-map (Windows).

Ahhh ... Thanks for this explanation.
In Unicode, 174 (#AE) is (R) <http://www.unicode.org/charts/PDF/U0080.pdf>.
So can it generally be said that with a leading zero, on Windows we get a
character from the Unicode table, and without it from the ANSI table?
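
The keypad behaviour can be modelled as picking one of two codepages for the same number. This sketch assumes the usual explanation: without a leading zero the number is read in the DOS (OEM) codepage, with a leading zero in the Windows ANSI codepage. It uses cp437 and cp1252 as examples, and the function name is made up.

```python
# Sketch only: which character an Alt-keypad number selects.
def keypad_char(code, leading_zero, oem="cp437", ansi="cp1252"):
    """Return the character for Alt-<code> (OEM) or Alt-0<code> (ANSI)."""
    codepage = ansi if leading_zero else oem
    return bytes([code]).decode(codepage)

def ansi_upper_range_matches_unicode(ansi="cp1252"):
    """Check whether ANSI bytes #A0-#FF equal the Unicode code points."""
    return all(ord(bytes([b]).decode(ansi)) == b for b in range(0xA0, 0x100))

if __name__ == "__main__":
    print(keypad_char(174, leading_zero=False))  # Alt-174
    print(keypad_char(174, leading_zero=True))   # Alt-0174
    print(ansi_upper_range_matches_unicode())
```

For #A0-#FF the ANSI page used here and Unicode assign the same numbers, which is why the Unicode chart appears to describe the leading-zero codes.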

> While in DOS, you're using
> OEM char-map and these codes will produce ® (Registered) and ©
> (Copyright) respectively.
> You have to be careful as some editors do not handle both tables. I use
> EditPad that works quite well. :)

The file(name)s can also be created with the Windows explorer. At least
on Win 98, it seems to handle both tables, too.


So when I create a file with the Windows explorer with a name consisting
of 9 (R)'s [Alt-0174], followed by ".txt", this file is shown with ex.exe
by
   '    dir("*.*")'   [short form]
   'lfn_dir("*.*")'   [long form, i.e. complete name]

But with ex.exe,
   '    dir(174 & "*.*")' and
   'lfn_dir(174 & "*.*")'
_both_ return -1 (so lfn_dir() behaves consistently with dir()).

With exw.exe, 'dir(174 & "*.*")' returns the complete filename! :-)
       ^
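
One possible reading of this difference, not verified against the internals of ex.exe or exw.exe, is that the byte 174 in the pattern is interpreted in the OEM codepage under DOS and in the ANSI codepage under Windows. The sketch below only models that assumption with Python's fnmatch. The function name is hypothetical, and cp850 and cp1252 are assumed to be the two codepages involved.

```python
# Sketch only: the same pattern byte means different characters per codepage.
from fnmatch import fnmatchcase

def pattern_matches(pattern_bytes, filename, codepage):
    """Decode the pattern in `codepage` and match it against a Unicode name."""
    return fnmatchcase(filename, pattern_bytes.decode(codepage))

if __name__ == "__main__":
    name = "\u00ae" * 9 + ".txt"        # nine registered signs, as created above
    pattern = bytes([174]) + b"*.*"     # 174 & "*.*"
    print(pattern_matches(pattern, name, "cp1252"))  # ANSI reading of byte 174
    print(pattern_matches(pattern, name, "cp850"))   # OEM reading of byte 174
```

Under the ANSI reading the pattern starts with the registered sign and matches; under the OEM reading it starts with a left chevron and does not.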


BTW:
I just sent the new version 0.71 of Lfn.zip, which includes some bug
fixes, to RDS. (The bug fixes have nothing to do with this codepage
stuff.) The lib will also be available for download on my website for a
week or so:
   http://luethje.de.vu/temp/lfn.zip

In order to make the library as reliable as possible, it should be
tested on many different versions of DOS. This is especially important
because -- if the code is well-tested -- it will likely be built into
future versions of Euphoria for DOS. The testing will take just a few
minutes (see LfnRead.txt), and I'd be happy if many people participated.

Again, I wish everybody a Merry Christmas and a Happy New Year!

Best regards,
   Juergen
(last post before X-mas)

-- 
 /"\  ASCII ribbon    | while not asleep do
 \ /  campain against |    sheep += 1
  X   HTML e-mail and | end while
 / \  news            |


6. Re: Lfn.e

On 24 Dec 2002, at 8:46, Juergen Luethje wrote:

> 
> Ahhh ... Thanks for this explanation.
> In Unicode, 174 (#AE) is (R)
> <http://www.unicode.org/charts/PDF/U0080.pdf>. So can generally be said,
> that with a leading zero, on Windows we get a character from the Unicode
> table, and without it from the ANSI table?
> 
Well, I guess so. :)
It's well known that Windows handles Unicode poorly.
Unfortunately, I'm no expert on this issue (nor are many others). BTW,
leading zeros are supposed to be used for Windows ANSI. DOS uses
extended ASCII (the OEM code page).

Warmest regards,

-- Euler German



-- Season's Greetings to All!! -- Have a Great New Year!!
