1. Lfn.e
- Posted by bobelia200 at NETZERO.NET Dec 21, 2002
- 439 views
Juergen,

I started writing an app to test Lfn.e and came upon this:

Several files I was trying to process had been saved as .txt from Internet Explorer and contained the '@' sign in their names. That is what is shown in a folder by Windows. I verified with a disk editor that the '@' sign was #AE on disk. However, DOS returns that character as an underscore (#5F).

lfn_dir("discov*.txt") returned a list which contained:

"Discover_ Platinum Account Statement 09-26-02.txt"

Note the '_' instead of '@'. lfn_open() failed to open this file.

This is very likely a code page problem but I don't have time to research it now. Perhaps Igor could help?

I will mail you the test report.

Thanks,

Bob

P.S. command.com's "dir" showed it like above; 4DOS, which I use every day, would only show the alias (short form).
2. Re: Lfn.e
- Posted by Juergen Luethje <eu.lue at gmx.de> Dec 22, 2002
- 445 views
Bob wrote:

> Juergen,
>
> I started writing an app to test Lfn.e and came upon this:

Many thanks.

> Several files I was trying to process had been saved as .txt from Internet
> Explorer and contained the '@' sign in their names. That is what is shown
> in a folder by Windows.
> I verified with a disk editor that the '@' sign was #AE on disk.

That is strange. As Ricardo already wrote, normally '@' is #40.

> However, DOS returns that character as an underscore ( #5F ).
> lfn_dir("discov*.txt") returned a list which contained:
>
> "Discover_ Platinum Account Statement 09-26-02.txt"
>
> Note the '_' instead of '@'. lfn_open() failed to open this file.

To test this on my system, I created a file by hand, using Windows Explorer (Win 98), with the name:

"äÄöÖüÜß at 123.txt"

I don't know how the characters in this filename look on your system. The first 7 chars are special German characters with the codes #84, #8E, #94, #99, #81, #9A, #E1 on codepage 850.

On a Win 98 DOS box, lfn_dir("*.*") showed the name correctly, and lfn_open() opened the file.

On plain MS-DOS 7.10, the short filename consisted of the first 6 characters (all uppercase) & '~1', as usual. lfn_dir("*.*") again showed the name correctly, and lfn_open() opened the file.

> This is very likely a code page problem but I don't have time to research
> it now. Perhaps Igor could help?

Hmm.. I always thought that the standard ASCII characters (#00-#7F) are the same on all codepages. If this is right, then there shouldn't be a codepage problem with '@'.

What codepage are you using? (type 'chcp' at the DOS prompt)

My LFN functions all use the DOS interrupt #21. Is this interrupt on your system handled by MS-DOS, or by 4DOS?

Did the program in Ricardo's text utilities solve the problem for you? I also downloaded it, but I couldn't see which DOS codepage it supports.

I searched Ralf Brown's DOS interrupt list, but didn't find anything like a 'translate_ANSI_to_current_OEM_codepage()' function. But if there is need for it, I can provide tables for translation between ANSI on the one hand, and DOS codepages 437, 850, 852, 863, 865, and 866 on the other hand ... next year.[1]

I really hope that such a translation normally isn't required for my LFN functions. At least for MS-DOS 7.10 and codepage 850, it doesn't seem to be necessary.

Sorry that I can't help you more at the moment.

> I will mail you the test report.
> Thanks,
>
> Bob

I received it, thank you again.

> P.S. command.com's "dir" showed it like above; 4DOS, which I use every
> day, would only show the alias (short form).

Regards,
Juergen

---------------------
[1] BTW, to everyone:
Merry Christmas and a Happy New Year.
Frohe Weihnachten und ein gutes Neues Jahr.
Prettige Kerstdagen en een Gelukkig Nieuwjaar.
Joyeux Noël et Bonne Année.
Feliz Navidad y Próspero Año Nuevo.
Buon Natale e felice Anno Nuovo.
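The kind of ANSI-to-OEM translation table Juergen mentions can be sketched in Python, whose standard codecs already know these code pages. This is only an illustration, not part of Lfn.e: the helper name `build_table`, the choice of cp1252 as "the" ANSI code page, and the '_' fallback are all assumptions made for the example.

```python
# Sketch: derive a 256-byte translation table from one code page to another.
# Assumption: Windows "ANSI" means cp1252 (Western European); the thread
# does not say which ANSI code page is actually in use.

def build_table(src, dst, fallback=ord("_")):
    """Hypothetical helper: map every byte of code page `src` to the byte
    for the same character in code page `dst`, or to `fallback` if the
    character does not exist in `dst` (or the byte is undefined in `src`)."""
    table = bytearray(256)
    for code in range(256):
        try:
            char = bytes([code]).decode(src)
            table[code] = char.encode(dst)[0]
        except (UnicodeDecodeError, UnicodeEncodeError):
            table[code] = fallback
    return bytes(table)


ANSI_TO_850 = build_table("cp1252", "cp850")

# The seven German characters from the test filename, as ANSI bytes.
ansi_name = "äÄöÖüÜß".encode("cp1252")
oem_name = ansi_name.translate(ANSI_TO_850)

# Prints the codepage 850 codes listed in the post: #84 #8E #94 #99 #81 #9A #E1
print(" ".join(f"#{b:02X}" for b in oem_name))
```

The ASCII range (#00-#7F) maps to itself in such a table, which matches the expectation that '@' (#40) should not be affected by the code page.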
3. Re: Lfn.e
- Posted by Robert Elia <bobelia200 at netzero.net> Dec 23, 2002
- 438 views
Juergen and Ricardo,

I must apologize for my bad eyesight and 14" monitor. I believe it actually was the 'registered trademark' symbol ( R with a circle around it ) in my filename. At least, I can't find the @ symbol.

"«" I just typed Alt-174 (#AE) before this sentence, between quotes, in Eudora. It looks the same in DOS on my machine. So, that's not part of the problem, I think.

Anyway, I'm still getting the failure. Ralf Brown's list also shows this:

    INT 21 - Windows95 - LONG FILENAME - FIND FIRST MATCHING FILE
            AX = 714Eh
            CL = allowable-attributes mask (see #01397 at AX=4301h)
                  (bits 0 and 5 ignored)
            CH = required-attributes mask (see #01397)
            SI = date/time format (see #01755)
            DS:DX -> ASCIZ filespec (both "*" and "*.*" match any filename)
            ES:DI -> FindData record (see #01756)
    Return: CF clear if successful
                AX = filefind handle (needed to continue search)
                CX = Unicode conversion flags (see #01757)
            CF set on error
                AX = error code
                    7100h if function not supported
    <snip>
    Bitfields for Windows95 Unicode conversion flags:
    Bit(s)  Description     (Table 01757)
     0      the returned full filename contains underscores for un-convertable
              Unicode characters
     1      the returned short filename contains underscores for un-convertable
              Unicode characters
    SeeAlso: #01756

With INT 21, AX = 714Eh, reg_list[REG_CX] bit 0 *IS* returning 1 on the bad filenames, and, indeed, lfn_dir() is showing them with underscores (#5F) instead of #AE. I even tried replacing the #5F with #AE before calling lfn_open() but that didn't work either.

My "Regional Settings" in Win98 are set to "English (United States)"

and, in DOS,

    c:\windows\personal>chcp
    Active code page: 437

I don't know enough about Unicode to say why the offending character is considered "un-convertible" by INT 21/AX=714Eh. I don't know how to determine whether my system is broken or just set to some odd but legal configuration. From what I've read so far, Windows uses the ANSI code page and DOS uses OEM. There are some "extended" characters which don't map properly.

Using Alt-keypad on the command line, I have been able to create a file with your codes in the name. They work fine with my test program using lfn.e and they probably display the way you see them. I see the lower and upper case a, o, u (alles mit umlaut) followed by the letter whose name I can't remember.

äÄöÖüÜß

I even used #AE in a filename and that worked, but in the long form it's changed to #AB on disk. I will try to take screen shots and show what I mean.

It may be that there's some Windows setting that's causing this behavior. I will share what I find with you if I think I understand it.

Thanks for your effort,

Bob
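The underscore substitution and the bit 0 flag described in the interrupt list excerpt can be modelled in a few lines of Python. This is a hedged model of the documented behaviour, not the actual Windows or DOS implementation: the function `to_oem` and the constant names are hypothetical, and Python's cp437 and cp850 codecs stand in for the real OEM conversion.

```python
# Flag bits as quoted from the interrupt list (Table 01757).
LONG_NAME_LOSSY = 0x01   # bit 0: long filename contains '_' substitutions
SHORT_NAME_LOSSY = 0x02  # bit 1: short filename contains '_' substitutions


def to_oem(name, codepage):
    """Hypothetical model: convert a Unicode name to OEM bytes, writing '_'
    for every character the code page cannot represent and setting bit 0."""
    out = bytearray()
    flags = 0
    for ch in name:
        try:
            out += ch.encode(codepage)
        except UnicodeEncodeError:
            out += b"_"
            flags |= LONG_NAME_LOSSY
    return bytes(out), flags


# U+00AE is the registered trademark sign.
name = "Discover\u00ae Platinum Account Statement 09-26-02.txt"
for cp in ("cp437", "cp850"):
    oem, flags = to_oem(name, cp)
    print(cp, oem, "lossy" if flags & LONG_NAME_LOSSY else "exact")
```

In this model the name converts exactly under codepage 850, which contains the registered sign, but comes back with an underscore and bit 0 set under codepage 437, which does not contain it. That matches the symptom reported above; whether it is the actual cause on this system is not established in the thread.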
4. Re: Lfn.e
- Posted by Juergen Luethje <eu.lue at gmx.de> Dec 23, 2002
- 442 views
Robert Elia wrote:

> Juergen and Ricardo,
>
> I must apologize for my bad eyesight and 14" monitor. I believe
> it actually was the 'registered trademark' symbol ( R with a circle around
> it ) in my filename. At least, I can't find the @ symbol.

Well, this puzzle seems to be solved.

> "«" I just typed Alt-174 (#AE) before this sentence, between quotes,
> in Eudora. It looks the same in DOS on my machine. So, that's not part of
> the problem, I think.

Here are links to Unicode character code tables as PDF files:
http://www.unicode.org/charts/PDF/U0000.pdf
http://www.unicode.org/charts/PDF/U0080.pdf
so it's easier for each of us to see which characters the other one means.

> Anyway, I'm still getting the failure. Ralf Brown's list also
> shows this:
>
> INT 21 - Windows95 - LONG FILENAME - FIND FIRST MATCHING FILE
>         AX = 714Eh
>         CL = allowable-attributes mask (see #01397 at AX=4301h)
>               (bits 0 and 5 ignored)
>         CH = required-attributes mask (see #01397)
>         SI = date/time format (see #01755)
>         DS:DX -> ASCIZ filespec (both "*" and "*.*" match any filename)
>         ES:DI -> FindData record (see #01756)
> Return: CF clear if successful
>             AX = filefind handle (needed to continue search)
>             CX = Unicode conversion flags (see #01757)
>         CF set on error
>             AX = error code
>                 7100h if function not supported
> <snip>
> Bitfields for Windows95 Unicode conversion flags:
> Bit(s)  Description     (Table 01757)
>  0      the returned full filename contains underscores for un-convertable
>           Unicode characters
>  1      the returned short filename contains underscores for un-convertable
>           Unicode characters
> SeeAlso: #01756
>
> With INT 21, AX = 714Eh, reg_list[REG_CX] bit 0 *IS* returning 1 on the bad
> filenames, and, indeed, lfn_dir() is showing them with underscores (#5F)
> instead of #AE. I even tried replacing the #5F with #AE before calling
> lfn_open() but that didn't work either.

Strange. Even if I create a whole filename consisting of #AE characters (Alt-174 in the Windows Explorer), lfn_dir() and lfn_open() work fine.

BUT: By typing Alt-174, I do *not* get the (R) sign! I get chevrons (#AB in http://www.unicode.org/charts/PDF/U0080.pdf). That means that my Windows Explorer doesn't use Unicode, right?

Interestingly, up to this point, OEM codepages and DOS do not seem to be involved at all. But maybe I'm completely wrong ... Iiiiii-goooor ?!!

> My "Regional Settings" in Win98 are set to "English (United States)"
>
> and, in DOS, c:\windows\personal>chcp
> Active code page: 437
>
> I don't know enough about Unicode to say why the offending
> character is considered "un-convertible" by INT 21/AX=714Eh. I don't know how to
> determine whether my system is broken or just set to some odd but legal
> configuration. From what I've read so far, Windows uses the ANSI code page
> and DOS uses OEM. There are some "extended" characters which don't map
> properly.

When I got my first PC, I read that only standard characters should be used for building filenames. Maybe this is also true in modern times?

> Using Alt-keypad on the command line, I have been able to create a file
> with your codes in the name. They work fine with my test
> program using lfn.e and they probably display the way you see them. I see
> the lower and upper case a, o, u (alles mit umlaut)

Yup.

> followed by the letter whose name I can't remember.

The German name for this sign, which looks like the Greek Beta, is "sz" (pronounced like "Ess-Tssett"). It is #DF in http://www.unicode.org/charts/PDF/U0080.pdf. Well, you see exactly what I typed.

> äÄöÖüÜß
>
> I even used #AE in a filename and
> that worked, but in the long form it's changed to #AB on disk.

That's the same as what happened to me (see above). It's very strange to me, too. However, #AB works fine for me.

I *always* hated this character set stuff ...

> I will try to take screen shots and show what I mean.
>
> It may be that there's some Windows setting that's causing this
> behavior. I will share what I find with you if I think I understand it.
>
> Thanks for your effort,
>
> Bob

I want to say the same to you.

Regards,
Juergen

--
 /"\  ASCII ribbon      | while not asleep do
 \ /  campaign against  |    sheep += 1
  X   HTML e-mail and   | end while
 / \  news              |
5. Re: Lfn.e
- Posted by Juergen Luethje <eu.lue at gmx.de> Dec 24, 2002
- 429 views
Euler German wrote:

> On 23 Dec 2002, at 21:38, rforno at tutopia.com wrote:
>
>> Juergen:
>> As I said before, Alt-174 (that is, #AE, not #AB; was that a typo, or
>> does Unicode have this different?) produces a left-chevron («) under Outlook
>> Express and DOS. But #AE shows as R within a circle under several
>> Windows editors. I tried EDXOR, Note Pad, Word Pad, Win Edit, Crimson
>> Editor, and the editor that comes with Borland C++ 5.0, all with the
>> same result. As I'm using code page 850, the same as you, you should be
>> getting the same results. If not, I can't explain why.
>
> Ricardo:
>
> Alt-0174 (note the zero) is ® (Registered) while Alt-174 (no zero) is «
> (left-chevron) in ANSI char-map (Windows).

Ahhh ... Thanks for this explanation.
In Unicode, 174 (#AE) is (R) <http://www.unicode.org/charts/PDF/U0080.pdf>.
So can it generally be said that with a leading zero, on Windows we get a character from the Unicode table, and without it from the ANSI table?

> While in DOS, you're using
> OEM char-map and these codes will produce ® (Registered) and ©
> (Copyright) respectively.
> You have to be careful as some editors do not handle both tables. I use
> EditPad that works quite well. :)

The file(name)s can also be created with the Windows Explorer. At least on Win 98, it seems to handle both tables, too.

So when I create a file with the Windows Explorer with a name consisting of 9 (R)'s [Alt-0174], followed by ".txt", this file is shown with ex.exe by

    dir("*.*")      [short form]
    lfn_dir("*.*")  [long form, i.e. complete name]

But with ex.exe, 'dir(174 & "*.*")' and 'lfn_dir(174 & "*.*")' _both_ return -1 (so lfn_dir() behaves consistently with dir()).

With exw.exe, 'dir(174 & "*.*")' returns the complete filename!

BTW: I just sent the new version 0.71 of Lfn.zip, which includes some bug fixes, to RDS. (The bug fixes have nothing to do with this codepage stuff.) The lib will also be available for download on my website for a week or so:
http://luethje.de.vu/temp/lfn.zip

In order to make the library as reliable as possible, it should be tested on many different versions of DOS. This is especially important because -- if the code is well-tested -- it will likely be built into future versions of Euphoria for DOS. The testing will take just a few minutes (see LfnRead.txt), and I'd be happy if many people participated.

Again, I wish everybody a Merry Christmas and a Happy New Year!

Best regards,
Juergen (last post before X-mas)

--
 /"\  ASCII ribbon      | while not asleep do
 \ /  campaign against  |    sheep += 1
  X   HTML e-mail and   | end while
 / \  news              |
6. Re: Lfn.e
- Posted by Euler German <efgerman at myrealbox.com> Dec 24, 2002
- 432 views
On 24 Dec 2002, at 8:46, Juergen Luethje wrote:

> Ahhh ... Thanks for this explanation.
> In Unicode, 174 (#AE) is (R)
> <http://www.unicode.org/charts/PDF/U0080.pdf>. So can it generally be said
> that with a leading zero, on Windows we get a character from the Unicode
> table, and without it from the ANSI table?

Well, I guess so. :) It's well known that Windows handles Unicode poorly. Unfortunately, I'm no expert on this issue (nor are many others). BTW, leading zeros are supposed to be used for Windows ANSI. DOS uses plain ASCII (also called OEM).

Warmest regards,

-- Euler German
-- Season's Greetings to All!!
-- Have a Great New Year!!
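The difference between Alt-174 and Alt-0174 discussed in this thread comes down to which table the number 174 (#AE) is looked up in. A small Python sketch makes this concrete. It assumes that the OEM code page is 437 (850 gives the same result for this byte) and that the Windows ANSI code page is cp1252; neither the script nor those choices come from the original posts.

```python
code = 174  # #AE, the number typed on the keypad in both cases

# Alt-174 (no leading zero): the number is looked up in the OEM code page.
oem_char = bytes([code]).decode("cp437")

# Alt-0174 (leading zero): the number is looked up in the ANSI code page.
ansi_char = bytes([code]).decode("cp1252")

print(f"Alt-174  -> {oem_char}  U+{ord(oem_char):04X}")
print(f"Alt-0174 -> {ansi_char}  U+{ord(ansi_char):04X}")

# The left chevron typed as OEM #AE is stored as #AB once it is expressed
# in the ANSI table (and as U+00AB in Unicode).
print(f"chevron as ANSI byte: #{oem_char.encode('cp1252')[0]:02X}")
```

Under these assumptions Alt-174 yields the left chevron (U+00AB) and Alt-0174 yields the registered sign (U+00AE). It also gives one plausible explanation for the earlier observation that a name typed with #AE shows up as #AB in its long form on disk.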