1. getc() ?? bug
- Posted by Bernie <xotron at PCOM.NET> Nov 19, 2000
- 425 views
- Last edited Nov 20, 2000
How can getc() return #FFFFFFFF from the middle of a file when it is only reading byte-sized values?

If I ignore this false EOF and look at the data returned after that point, it is OK.

I then compared a dump of the same file (made with another program) against a dump produced with getc(). The byte for which Euphoria returns #FFFFFFFF is #1A (26 decimal). As a keyboard scancode it represents [ (left bracket), and as a printable character it represents a right arrow.

The byte sequence is: #1E #0E ( #1A -- bad byte ) #32 #78

I think the bug has something to do with getc() confusing what type of device it is reading from.

I am using Euphoria DOS ver 2.2.

Bernie
2. Re: getc() ?? bug
- Posted by Bernie <xotron at PCOM.NET> Nov 19, 2000
- 409 views
- Last edited Nov 20, 2000
I ran the same program on Red Hat Linux 6.2 and it works OK, so it has to be a bug in the DOS Euphoria interpreter.
3. Re: getc() ?? bug
- Posted by Derek Parnell <derekp at solace.com.au> Nov 20, 2000
- 412 views
One of the cute things about old operating systems like DOS is that they use embedded control characters that mean things in text files. ASCII-26 is the control character that means End-Of-File.

Open the file as a binary file instead of a text file and the problem should go away.

-----
cheers,
Derek Parnell
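To make the difference between the two modes concrete, here is a minimal Python sketch that models a getc()-style read loop over the five bytes quoted in the original post. It is only a model: it does not call Euphoria or DOS, and the names `getc_model` and `read_until_eof` are made up for illustration.

```python
CTRL_Z = 0x1A  # ASCII 26, treated as end-of-file in DOS text mode
EOF = -1       # value a getc()-style call reports at end of file

def getc_model(byte, text_mode):
    """Model one getc() call: in text mode Ctrl-Z is reported as EOF."""
    if text_mode and byte == CTRL_Z:
        return EOF
    return byte

def read_until_eof(data, text_mode):
    """Collect values the way a typical 'loop until -1' reader would."""
    seen = []
    for byte in data:
        c = getc_model(byte, text_mode)
        if c == EOF:
            break  # the loop gives up at the first EOF it is told about
        seen.append(c)
    return seen

# Byte sequence from the original post: #1E #0E #1A #32 #78
sample = bytes([0x1E, 0x0E, 0x1A, 0x32, 0x78])

text_view = read_until_eof(sample, text_mode=True)
binary_view = read_until_eof(sample, text_mode=False)
print([hex(b) for b in text_view])
print([hex(b) for b in binary_view])
```

In this model the text-mode loop stops after two bytes, while the binary-mode loop sees all five, including #1A as ordinary data.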
4. Re: getc() ?? bug
- Posted by Bernie <xotron at PCOM.NET> Nov 19, 2000
- 411 views
- Last edited Nov 20, 2000
On Mon, 20 Nov 2000 12:06:20 +1100, Derek Parnell <derekp at SOLACE.COM.AU> wrote:

>One of the cute things about old operating systems like DOS is that they use
>embedded control characters that mean things in text files. The ASCII-26 is
>the control character that means End-Of-File.
>
>Open the file as a binary file instead of a text file and the problem should
>go away.

Derek:

That's the problem, but it seems that if a user wanted to read a text file that contained a right arrow character (a printable character), the interpreter should not return an EOF. I say that is a bug.

Bernie
5. Re: getc() ?? bug
- Posted by Derek Parnell <derekp at solace.com.au> Nov 20, 2000
- 417 views
> Derek:
>
> Thats the problem but it seems that if a user wanted to read
> a text file that contained a right arrow character ( a printable
> character ) the interpeter should not return a EOF. I say that
> is a bug.

So blame the DOS designers. The interpreter is not wrong in this case. How can it know that for this specific text file ASCII-26 is not to be interpreted as EOF, but for others it is? And just because DOS has a glyph assigned to ASCII-26 does not mean that it is a "printable" character according to ASCII or ANSI.

Anyhow, the Euphoria behaviour is already documented. Here is an extract from the open() documentation.

----------------------------------
open()

Syntax: fn = open(st1, st2)

Description: Open a file or device, to get the file number. -1 is returned if the open fails. st1 is the path name of the file or device. st2 is the mode in which the file is to be opened. Possible modes are:

"r" - open text file for reading
"rb" - open binary file for reading
"w" - create text file for writing
"wb" - create binary file for writing
"u" - open text file for update (reading and writing)
"ub" - open binary file for update
"a" - open text file for appending
"ab" - open binary file for appending

Files opened for read or update must already exist. Files opened for write or append will be created if necessary. A file opened for write will be set to 0 bytes. Output to a file opened for append will start at the end of file.

Output to text files will have carriage-return characters automatically added before linefeed characters. On input, these carriage-return characters are removed. A control-Z character (ASCII 26) will signal an immediate end of file.

I/O to binary files is not modified in any way. Any byte values from 0 to 255 can be read or written.
----------------------------------

If one needs to treat the bytes in a file in ways that are different from the "text file" interpretation, one must open it as binary and do one's own parsing. The built-in DOS commands work on the same principle.

-----
cheers,
Derek Parnell
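The text-mode translations in the documentation extract above can be sketched in a few lines of Python. This is a model of the documented rules only, not Euphoria's actual implementation; `write_text` and `read_text` are hypothetical helper names, and treating "carriage returns removed" as "CR LF becomes LF" is an assumption.

```python
CTRL_Z = b"\x1a"  # control-Z, ASCII 26

def write_text(chars):
    """Model text-mode output: add a carriage return before each linefeed."""
    return chars.replace(b"\n", b"\r\n")

def read_text(raw):
    """Model text-mode input: stop at the first control-Z, then turn
    each CR LF pair back into a single LF."""
    end = raw.find(CTRL_Z)
    if end != -1:
        raw = raw[:end]  # everything from control-Z onward is not delivered
    return raw.replace(b"\r\n", b"\n")

on_disk = write_text(b"one\ntwo\n")
round_trip = read_text(on_disk)
cut_short = read_text(b"one\r\n" + CTRL_Z + b"two\r\n")
print(on_disk, round_trip, cut_short)
```

Note that the bytes on disk are longer than what the program wrote or reads back (10 bytes versus 8 here), and that the third call loses everything after the control-Z. A binary-mode read would skip both translations and return the raw bytes unchanged.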
6. Re: getc() ?? bug
- Posted by Bernie <xotron at PCOM.NET> Nov 20, 2000
- 403 views
On Mon, 20 Nov 2000 14:00:05 +1100, Derek Parnell <derekp at SOLACE.COM.AU> wrote:

>So blame the DOS designers. The interpreter is not wrong in this case. How
>can it know that for this specific text file, the ASCII-26 is not to be
>interpreted as EOF, but on others it is? And just because DOS has a glyph
>assigned to ASCII-26, does not mean that it is a "printable" character according
>to ASCII or ANSI.
>
>Anyhow, the Euphoria behaviour is already documented. Here is an extract
>from the open() documentation.

I still disagree.

I should be able to open ANY disk file in ANY mode and read it, and it should NEVER return -1 or EOF UNTIL it reaches the very end of the file, and at no other time, NO MATTER what text it contains. Look at the DOS interrupts: the AX register indicates when it reaches an EOF.

The context of the file should have nothing to do with it.

The reason that Euphoria is wrong is that it is trying to use the same function to get input from the keyboard.

Bernie
7. Re: getc() ?? bug
- Posted by Derek Parnell <dparnell at BIGPOND.NET.AU> Nov 21, 2000
- 415 views
----- Original Message -----
From: "Bernie" <xotron at PCOM.NET>
To: <EUPHORIA at LISTSERV.MUOHIO.EDU>
Sent: Tuesday, November 21, 2000 3:57 AM
Subject: Re: getc() ?? bug

> I still disagree.
>
> I should be able to open ANY disk file in ANY mode and read it and it
> should NEVER return -1 or EOF UNTIL it reaches the very end of the
> file and at no other time NO MATTER what text it contains look at
> the DOS interrupts the AX register indicates when it reachs an EOF.

I agree with you, but the DOS designers copied this protocol from CP/M, and that is the documented way that DOS is supposed to operate. It is stupid, yes. But Euphoria is only implementing the DOS standard.

> The context of the file should have nothing to do with it.

True, I also believe that the context *should* have nothing to do with it, but unfortunately it *does*.

> The reason that euphoria is wrong is because it is trying to use the
> same function to get input from the keyboard.

Sorry, but I don't get the connection. If you type in Ctrl-Z (ASCII-26) then getc(0) will return -1. This is totally consistent, because stdin is a DOS "text" device. The way to get raw data from the keyboard is open("CON", "rb"), but even this is not a perfect solution, because a single Ctrl-Z on a line is interpreted as EOF.

In summary, I believe that Euphoria is correct because it is implementing the DOS standard. I think it is the DOS standard that has got it wrong. Unix has no concept of binary/text files - they're all the same.

------
Derek Parnell
Melbourne, Australia
(Vote [1] The Cheshire Cat for Internet Mascot)
8. Re: getc() ?? bug
- Posted by Kat <gertie at PELL.NET> Nov 20, 2000
- 421 views
On 21 Nov 2000, at 4:50, Derek Parnell wrote:

<snip>

> > The reason that euphoria is wrong is because it is trying to use the same
> > function to get input from the keyboard.
>
> Sorry, but I don't get the connection? If you type in Ctrl-Z (ASCII-26) then
> getc(0) will return -1. This is totally consistent, because stdin is a DOS
> "text" device. To get the raw data from the keyboard is to open("CON", "rb")
> but even this is not perfect solution, because a single Ctrl-Z on a line is
> interpreted as EOF.
>
> In summary I believe that Euphoria is correct because it is implementing the
> DOS standard. I think it is the DOS standard that has got it wrong. Unix has
> no concept of binary/text files - they're all the same.

OK, my *one* cent worth.....

I had this same problem in Pascal years ago. I solved it with Pascal's blockread() and wrote my own text parser. Then I found some bug I have since forgotten in the keyboard read routine, and began bypassing DOS for that too. Then I *needed* some binary file routines for text files, and darn it, had to go to assembly routines for that too, to move the DOS file pointer around. Or was it the other way around? Well, that's why this is only one cent, Jiri.

Anyhow, use DOS int21, function 14 and 15 to solve most problems. Are these covered in dos32lib, David?

Kat
9. Re: getc() ?? bug
- Posted by "Cuny, David at DSS" <David.Cuny at DSS.CA.GOV> Nov 20, 2000
- 438 views
Kat wrote:

> Anyhow, use dos int21, function 14 and 15
> to solve most problems. Are these covered
> in dos32lib, David?

Urm, no... Dos32Lib is an old port of Win32Lib to DOS. It only deals with the GUI elements - windows, pushbuttons, etc.

-- David Cuny
10. Re: getc() ?? bug
- Posted by Derek Parnell <dparnell at BIGPOND.NET.AU> Nov 21, 2000
- 487 views
Good point, Kat.

Text files are ones that are composed of zero or more lines. A line (in DOS) is all bytes up to and including an ASCII-13 ASCII-10 combination (CRLF). In text file mode, Euphoria returns a single ASCII-10 at end of line. So if you try to use seek() to position yourself within the file, you must not simply use the length of the line returned, because the ASCII-13 has been stripped off.

But back to the original problem. If this file is truly a text file (i.e. one that is composed of lines of text) and it has an embedded Ctrl-Z, then open it as a binary file and look for CRLF to mark the end of lines.

As a general rule of thumb, I'd open all DOS files as binary unless I know for sure that the file is *strictly* a text, line-based file.

------
Derek Parnell
Melbourne, Australia
(Vote [1] The Cheshire Cat for Internet Mascot)
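The "open as binary and look for CRLF" advice can be sketched as follows. This is a Python illustration under stated assumptions, not Euphoria code: `split_crlf_lines` is a hypothetical helper, the sample bytes are invented, and the data is assumed to have been read in binary mode (for example with Python's `open(path, "rb")`).

```python
def split_crlf_lines(data):
    """Split raw bytes into (offset, line) pairs, using CRLF as the
    line terminator.

    offset is the position of the line's first byte in the file, so it is
    safe to hand to a seek-style call. Ctrl-Z (0x1A) is ordinary data here.
    """
    lines = []
    start = 0
    while start < len(data):
        end = data.find(b"\r\n", start)
        if end == -1:
            lines.append((start, data[start:]))  # last line has no CRLF
            break
        lines.append((start, data[start:end]))
        start = end + 2  # step over both bytes of the CRLF pair
    return lines

# Invented sample: a Ctrl-Z sits in the middle of the first line.
raw = b"ab\x1acd\r\nef\r\ngh"
parsed = split_crlf_lines(raw)
for offset, line in parsed:
    print(offset, line)
```

The second line starts at offset 7, which is the first line's 5 bytes plus the 2-byte CRLF. Counting the line as it would be returned in text mode (with a single ASCII-10) would give 6, which is the off-by-one-per-line drift warned about above.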