1. getc() ?? bug

How can getc() return #FFFFFFFF from the middle of a file when
 it is only reading single bytes?

 If I ignore this false EOF and look at the data returned after
 this point, it is OK.

 I then compared a dump (made with another program) of the same file
 with a dump from the getc() function. The byte for which Euphoria
 returns #FFFFFFFF is #1A (26 decimal). As a keyboard scan code it
 represents [ (left bracket), and as a printable character it displays
 as a right arrow. The byte sequence is:
       #1E #0E ( #1A -- bad byte ) #32 #78

 I think the bug has something to do with getc() confusing what type
 of device it is reading from.

 I am using Euphoria for DOS, version 2.2.

 Bernie


2. Re: getc() ?? bug

I ran the same program on Red Hat Linux 6.2 and it works OK,
so it has to be a bug in the DOS Euphoria interpreter.


3. Re: getc() ?? bug

One of the cute things about old operating systems like DOS is that they use
embedded control characters that mean things in text files. The ASCII-26 is
the control character that means End-Of-File.

Open the file as a binary file instead of a text file and the problem should
go away.

-----
cheers,
Derek Parnell


4. Re: getc() ?? bug

On Mon, 20 Nov 2000 12:06:20 +1100, Derek Parnell <derekp at SOLACE.COM.AU>
wrote:

>One of the cute things about old operating systems like DOS is that they use
>embedded control characters that mean things in text files. The ASCII-26 is
>the control character that means End-Of-File.
>
>Open the file as a binary file instead of a text file and the problem should
>go away.
>
  Derek:

   That's the problem, but it seems that if a user wanted to read
   a text file that contained a right arrow character (a printable
   character), the interpreter should not return an EOF. I say that
   is a bug.

  Bernie


5. Re: getc() ?? bug

>  Derek:
>
>   That's the problem, but it seems that if a user wanted to read
>   a text file that contained a right arrow character (a printable
>   character), the interpreter should not return an EOF. I say that
>   is a bug.
>


So blame the DOS designers. The interpreter is not wrong in this case. How
can it know that for this specific text file the ASCII-26 is not to be
interpreted as EOF, but for others it is? And just because DOS has a glyph
assigned to ASCII-26 does not mean that it is a "printable" character
according to ASCII or ANSI.

Anyhow, the Euphoria behaviour is already documented. Here is an extract
from the open() documentation.
----------------------------------
open()
Syntax: fn = open(st1, st2)
Description: Open a file or device, to get the file number. -1 is returned
if the open fails. st1 is the path name of the file or device. st2 is the
mode in which the file is to be opened. Possible modes are:
"r" - open text file for reading
"rb" - open binary file for reading
"w" - create text file for writing
"wb" - create binary file for writing
"u" - open text file for update (reading and writing)
"ub" - open binary file for update
"a" - open text file for appending
"ab" - open binary file for appending


Files opened for read or update must already exist. Files opened for write
or append will be created if necessary. A file opened for write will be set
to 0 bytes. Output to a file opened for append will start at the end of
file.

Output to text files will have carriage-return characters automatically
added before linefeed characters. On input, these carriage-return characters
are removed. A control-Z character (ASCII 26) will signal an immediate end
of file.

I/O to binary files is not modified in any way. Any byte values from 0 to
255 can be read or written.

----------------------------------

If one needs to treat the bytes in a file in ways that are different from
the "text file" interpretation, one must open it as binary and do one's own
parsing. The built-in DOS commands work on the same principle.
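
To make the two text-mode rules from the extract concrete, here is a small model of them in Python (not Euphoria, and not the interpreter's actual code). The name text_mode_view is made up for illustration; it only assumes what the documentation says: a carriage return before a linefeed is dropped, and the first Ctrl-Z ends the file. The sample bytes are the ones from Bernie's original post.

```python
CTRL_Z = 0x1A  # ASCII 26, the DOS/CP/M text-mode end-of-file marker


def text_mode_view(raw: bytes) -> bytes:
    """Return the bytes a DOS text-mode reader would deliver for raw.

    Hypothetical helper: it models only the two rules quoted from the
    open() documentation (CR before LF removed, Ctrl-Z ends the file).
    """
    out = bytearray()
    i = 0
    while i < len(raw):
        b = raw[i]
        if b == CTRL_Z:
            break  # text mode reports EOF here, even in mid-file
        if b == 0x0D and i + 1 < len(raw) and raw[i + 1] == 0x0A:
            i += 1  # skip the CR; the LF is appended on the next pass
            continue
        out.append(b)
        i += 1
    return bytes(out)


# The byte sequence from the original post.
sample = bytes([0x1E, 0x0E, 0x1A, 0x32, 0x78])
print("text mode:  ", text_mode_view(sample).hex(" "))  # stops before 1a
print("binary mode:", sample.hex(" "))                  # all five bytes
```

In binary mode none of this filtering happens, which is why the bytes after #1A turn out to be fine once the file is opened with "rb".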

-----
cheers,
Derek Parnell


6. Re: getc() ?? bug

On Mon, 20 Nov 2000 14:00:05 +1100, Derek Parnell <derekp at SOLACE.COM.AU>
wrote:
>So blame the DOS designers. The interpreter is not wrong in this case. How
>can it know that for this specific text file, the ASCII-26 is not to be
>interpreted as EOF, but on others it is? And just because DOS has a glyph
>assigned to ASCII-26, does not mean that it is a "printable" character
>according to ASCII or ANSI.
>
>Anyhow, the Euphoria behaviour is already documented. Here is an extract
>from the open() documentation.


   I still disagree.

   I should be able to open ANY disk file in ANY mode and read it, and it
   should NEVER return -1 or EOF UNTIL it reaches the very end of the
   file, and at no other time, NO MATTER what text it contains. Look at
   the DOS interrupts: the AX register indicates when it reaches EOF.
   The context of the file should have nothing to do with it. The
   reason that Euphoria is wrong is that it is trying to use the
   same function to get input from the keyboard.

   Bernie


7. Re: getc() ?? bug

----- Original Message -----
From: "Bernie" <xotron at PCOM.NET>
To: <EUPHORIA at LISTSERV.MUOHIO.EDU>
Sent: Tuesday, November 21, 2000 3:57 AM
Subject: Re: getc() ?? bug


> On Mon, 20 Nov 2000 14:00:05 +1100, Derek Parnell <derekp at SOLACE.COM.AU>
> wrote:
> >So blame the DOS designers. The interpreter is not wrong in this case. How
> >can it know that for this specific text file, the ASCII-26 is not to be
> >interpreted as EOF, but on others it is? And just because DOS has a glyph
> >assigned to ASCII-26, does not mean that it is a "printable" character
> >according to ASCII or ANSI.
> >
> >Anyhow, the Euphoria behaviour is already documented. Here is an extract
> >from the open() documentation.
>
>
>    I still disagree.
>
>    I should be able to open ANY disk file in ANY mode and read it, and it
>    should NEVER return -1 or EOF UNTIL it reaches the very end of the
>    file, and at no other time, NO MATTER what text it contains. Look at
>    the DOS interrupts: the AX register indicates when it reaches EOF.

I agree with you, but the DOS designers have copied this protocol from CP/M
and that is the documented way that DOS is supposed to operate. It is
stupid, yes. But Euphoria is only implementing the DOS standard.

>    The context of the file should have nothing to do with it.
True, I also believe that the context *should* have nothing to do with it,
but unfortunately it *does*.

>    The reason that euphoria is wrong is because it is trying to use the
>    same function to get input from the keyboard.

Sorry, but I don't get the connection. If you type in Ctrl-Z (ASCII-26) then
getc(0) will return -1. This is totally consistent, because stdin is a DOS
"text" device. The way to get raw data from the keyboard is open("CON", "rb"),
but even this is not a perfect solution, because a single Ctrl-Z on a line is
still interpreted as EOF.

In summary I believe that Euphoria is correct because it is implementing the
DOS standard. I think it is the DOS standard that has got it wrong. Unix has
 no concept of binary/text files - they're all the same.

------
Derek Parnell
Melbourne, Australia
(Vote [1] The Cheshire Cat for Internet Mascot)


8. Re: getc() ?? bug

On 21 Nov 2000, at 4:50, Derek Parnell wrote:

> ----- Original Message -----
> From: "Bernie" <xotron at PCOM.NET>
> To: <EUPHORIA at LISTSERV.MUOHIO.EDU>
> Sent: Tuesday, November 21, 2000 3:57 AM
> Subject: Re: getc() ?? bug

<snip>

> >    The reason that euphoria is wrong is because it is trying to use the same
> >    function to get input from the keyboard.
>
> Sorry, but I don't get the connection. If you type in Ctrl-Z (ASCII-26) then
> getc(0) will return -1. This is totally consistent, because stdin is a DOS
> "text" device. The way to get raw data from the keyboard is open("CON", "rb"),
> but even this is not a perfect solution, because a single Ctrl-Z on a line is
> still interpreted as EOF.
>
> In summary I believe that Euphoria is correct because it is implementing the
> DOS standard. I think it is the DOS standard that has got it wrong. Unix has
>  no concept of binary/text files - they're all the same.

Ok, my *one* cent worth.....
I had this same problem in Pascal years ago. I solved it with Pascal's
blockread() and wrote my own text parser. Then I found some bug I have
since forgotten in the keyboard read routine, and began bypassing DOS for that
too. Then I *needed* some binary file routines for text files, and darn it, had
to go to assembly routines for that too, to move the DOS file pointer around. Or
was it the other way around? Well, that's why this is only one cent, Jiri.

Anyhow, use dos int21, function 14 and 15 to solve most problems. Are
these covered in dos32lib, David?

Kat


9. Re: getc() ?? bug

Kat wrote:

> Anyhow, use dos int21, function 14 and 15
> to solve most problems. Are these covered
> in dos32lib, David?

Urm, no... Dos32Lib is an old port of Win32Lib to DOS. It only deals with
the GUI elements - windows, pushbuttons, etc.

-- David Cuny


10. Re: getc() ?? bug

Good point, Kat. Text files are ones that are composed of zero or more
lines. A line (in DOS) is all bytes up to and including an ASCII-13 ASCII-10
combination (CRLF). In text file mode, Euphoria returns a single ASCII-10 at
the end of each line. So if you use seek() to position yourself within the
file, you cannot simply add up the lengths of the lines returned, because
each ASCII-13 has been stripped off.
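
The offset arithmetic can be sketched like this, in Python rather than Euphoria. The name line_start_offsets is made up for illustration. It assumes every line on disk ends in CRLF, that each line passed in ends in the single linefeed that text mode returns, and that characters are one byte each, so every line occupies one more byte in the file than its returned length.

```python
def line_start_offsets(lines):
    """Byte offset in the file at which each text-mode line starts.

    Assumes each entry of lines ends with a single linefeed (as returned
    in text mode) while the file itself stores CR LF, one extra byte.
    """
    offsets = []
    pos = 0
    for line in lines:
        offsets.append(pos)
        pos += len(line) + 1  # +1 for the CR that text mode stripped
    return offsets


lines = ["first\n", "second\n", "third\n"]
print(line_start_offsets(lines))  # -> [0, 7, 15]
```

Summing the returned lengths alone would give 0, 6, 13, which drifts by one more byte with every line.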

But back to the original problem. If this file is truly a text file (i.e.
one that is composed of lines of text) and it has embedded Ctrl-Z bytes, then
open it as a binary file and look for CRLF to mark the ends of lines.
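
As a rough sketch of that approach, here it is in Python rather than Euphoria (the idea carries over: open with "rb" and split the bytes yourself). The function name read_lines_binary and the demo file contents are made up for illustration, and the sketch assumes the file is small enough to read in one go.

```python
import os
import tempfile


def read_lines_binary(path):
    """Read the whole file in binary mode and split it on CRLF.

    Byte 0x1A (Ctrl-Z) is kept as ordinary data, because binary mode
    does not interpret any byte values.
    """
    with open(path, "rb") as f:
        data = f.read()
    lines = data.split(b"\r\n")
    if lines and lines[-1] == b"":
        lines.pop()  # a trailing CRLF leaves one empty piece at the end
    return lines


# Demo file: two CRLF-terminated lines, the first with an embedded Ctrl-Z.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as f:
    f.write(b"ab\x1acd\r\nsecond\r\n")

demo_lines = read_lines_binary(demo_path)
os.remove(demo_path)
print(demo_lines)  # the Ctrl-Z byte survives inside the first line
```

Opened in text mode on DOS, the same file would appear to end after "ab".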

As a general rule of thumb, I'd open all DOS files as binary unless I know
for sure that the file is *strictly* a line-based text file.

------
Derek Parnell
Melbourne, Australia
(Vote [1] The Cheshire Cat for Internet Mascot)

