1. Problems with GET.E
- Posted by Roderick Jackson <rjackson at CSIWEB.COM> Nov 14, 1999
- 612 views
Is it me, or is the get() function in GET.E faulty? I seem to recall discussion on the list regarding it a while back, but I can't recall...

Basically, all I do is print() to a file, close it, reopen it, then get() from the file. Correct me if I'm wrong, but aren't several get() calls in a row supposed to return all of the objects that were printed to the file? (And I'm using the latest version of GET.E released with Eu 2.1)

If it's not, then might I humbly (but fervently) suggest that it be made so? From what I can see, one way to do it would be to have print() put a space after every object it outputs.

If it IS supposed to work that way, then, um, someone will have to tell me and I'll just post my code to the list for examination...

Rod Jackson
2. Re: Problems with GET.E
- Posted by Robert Craig <rds at ATTCANADA.NET> Nov 15, 1999
- 572 views
Roderick Jackson writes:
> Correct me if I'm wrong, but aren't several get() calls
> in a row supposed to return all of the objects that were
> printed to the file? ...

get() requires that there be at least one character of whitespace (blank, tab or new-line) separating the top-level objects in the file.

You can easily observe that this will likely be necessary between two atoms, e.g.

    111 222

but it shouldn't be necessary between two sequences, or an atom and a sequence, e.g.:

    145.999{1,2,3}
or
    {1,2,3}{4,5,6}

because the braces should be enough. So why does get() require whitespace in *all* cases?

It was much easier to implement that way (it's a long story), and I didn't think anyone would really mind.

> From what I can see, one way to do it
> would be to have print() put a space after every object
> it outputs.

What about the users of print() who don't want the extra space? Is it that hard to add:

    puts(fn, ' ')

to your code?

Regards,
  Rob Craig
  Rapid Deployment Software
  http://www.RapidEuphoria.com
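The workaround described above can be sketched as follows, assuming Euphoria 2.1 and the get.e that ships with it. Each print() is followed by a puts() of one whitespace character so that consecutive get() calls can find the boundary between top-level objects. The file name objects.dat is a made-up example, not something from the thread.

```euphoria
include get.e

constant FILENAME = "objects.dat"  -- hypothetical file name
integer fn
sequence r1, r2

-- write two top-level objects, each followed by a separator
fn = open(FILENAME, "w")
if fn = -1 then
    puts(2, "cannot open " & FILENAME & " for writing\n")
    abort(1)
end if
print(fn, 145.999)
puts(fn, ' ')            -- whitespace that get() needs after each object
print(fn, {1, 2, 3})
puts(fn, ' ')
close(fn)

-- read them back with two consecutive get() calls
fn = open(FILENAME, "r")
if fn = -1 then
    puts(2, "cannot open " & FILENAME & " for reading\n")
    abort(1)
end if
r1 = get(fn)             -- each result is {error_status, value}
r2 = get(fn)
close(fn)

if r1[1] = GET_SUCCESS and r2[1] = GET_SUCCESS then
    print(1, r1[2])
    puts(1, '\n')
    print(1, r2[2])
    puts(1, '\n')
else
    puts(1, "get() failed\n")
end if
```

Removing the two puts(fn, ' ') calls is a quick way to see the behaviour the original post ran into.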
3. Re: Problems with GET.E
- Posted by "Boehme, Gabriel" <gboehme at POSTOFFICE.MUSICLAND.COM> Nov 15, 1999
- 568 views
Robert Craig wrote (in response to Rod Jackson):

>>Correct me if I'm wrong, but aren't several get() calls
>>in a row supposed to return all of the objects that were
>>printed to the file? ...
>
>get() requires that there be at least one character of
>whitespace (blank, tab or new-line) separating the
>top-level objects in the file.
>
>[...]
>
>>From what I can see, one way to do it
>>would be to have print() put a space after every object
>>it outputs.
>
>What about the users of print() who don't want the
>extra space? Is it that hard to add:
>    puts(fn, ' ')
>to your code?

I don't think it's so much a case of how easy or hard it is to add "puts(fn, ' ')" to a given section of code. For me, at least, it's a case of paired functions having a not-quite symmetrical relationship. You can call get() as many times in a row as you like, but for print() you have to throw in a puts() for a space. The space between top-level objects may be necessary, and certainly makes sense when you think about it, but it *is* somewhat inconsistent with Euphoria's otherwise sensible and easy-to-understand language design, IMO.

This is not limited to the get()/print() pair of functions, either -- Jason Gade recently pointed out (11/12/1999; Subject: Euphoria features) that position()/get_position() have a similarly non-symmetrical relationship. Another (possibly imperfect) example which springs to mind would be the video_config()/text_rows()/graphics_mode() family of routines.

In any event, for the immediate future, this is probably the best solution for the problem in question:

global procedure put(integer fn, object data)
    print(fn, data)
    puts(fn, ' ') -- or '\n', either one will work
end procedure

This way, we have the symmetrically-paired routines get() and put(). This enables those who wish to perform multiple get() calls against a file to have a self-contained procedure to write their data out to that file, without forcing the users of print() to use up an extra byte.

Hep yadda,
Gabriel Boehme

----------
There are few things as convincing as death to remind us of the quality with which we live our lives.
Robert Fripp
----------
4. Re: Problems with GET.E
- Posted by Roderick Jackson <rjackson at CSIWEB.COM> Nov 15, 1999
- 567 views
Robert Craig wrote:

>> Correct me if I'm wrong, but aren't several get() calls
>> in a row supposed to return all of the objects that were
>> printed to the file? ...
>
>get() requires that there be at least one character of
>whitespace (blank, tab or new-line) separating the
>top-level objects in the file.
>
>You can easily observe that this will likely be
>necessary between two atoms,
>e.g.
>    111 222
>
>but it shouldn't be necessary between
>two sequences, or an atom and a sequence e.g.:
>
>    145.999{1,2,3}
>or
>    {1,2,3}{4,5,6}
>
>because the braces should be enough.
>So why does get() require whitespace in *all* cases?
>
>It was much easier to implement that way (it's a long story),
>and I didn't think anyone would really mind.
>
>> From what I can see, one way to do it
>> would be to have print() put a space after every object
>> it outputs.
>
>What about the users of print() who don't want the
>extra space? Is it that hard to add:
>    puts(fn, ' ')
>to your code?

No... if you really *meant* it the way it is, I'm not asking for a change; I can see some difficulty with affixing a space or newline to the end of everything print()ed. I just thought that your original intent was to have people be able to use:

    print (fn, obj1)
    print (fn, obj2)
    ...
    ext_obj1 = get (fn)
    ext_obj2 = get (fn)

as naturally as they would puts() and getc(). As things currently stand, it means I may have to leave out parallel functions for print() and get() from my next project -- you'll see why soon. But, if that's the way they were meant to operate anyway, ah well...

... meanwhile, Gabriel Boehme wrote:

>In any event, for the immediate future, this is probably the best solution
>for the problem in question:
>
>global procedure put(integer fn, object data)
>    print(fn, data)
>    puts(fn, ' ') -- or '\n', either one will work
>end procedure

I'll be certain to do something like this whenever I start working with large datafiles. For now though, the situation isn't causing any real problems -- I just thought I'd clear it up before it did.

(Side note: I know I'm going to be called on this, but having get() and print() operate the way they currently do DOES seem... ill-paired. There's really no need to change get() -- when writing two integers to a file, you're going to HAVE to put something between them anyway. So maybe we could try to find out how many folks use print() in a manner that would cause problems if print() were changed to attach a newline? I suspect it's not that many, if any at all...)

Rod Jackson
5. Re: Problems with GET.E
- Posted by Everett Williams <rett at GVTC.COM> Nov 15, 1999
- 561 views
- Last edited Nov 16, 1999
Roderick Jackson wrote:

>(Side note: I know I'm going to be called on this, but
>having get() and print() operate the way they currently do
>DOES seem... ill-paired. There's really no need to change
>get()--when writing two integers to a file, you're going
>to HAVE to put something between them anyway. So maybe we
>could try to find out how many folks use print() in a
>manner that would cause problems if print() were changed
>to attach a newline? I suspect it's not that many,
>if any at all...)
>
>Rod Jackson

I like symmetry, too, but I'd like it without the newline attached to every print(). The ability to write more than once to a line has so many uses in report writing that it would be hard to enumerate them. Especially with the lack of string facilities in this language, it can simplify code quite a bit. A carriage return without a line feed would be less problematic, but still inconvenient.

Everett L.(Rett) Williams
rett at gvtc.com
6. Re: Problems with GET.E
- Posted by Roderick Jackson <rjackson at CSIWEB.COM> Nov 15, 1999
- 571 views
- Last edited Nov 16, 1999
Everett Williams wrote:

>Roderick Jackson wrote:
>
>>(Side note: I know I'm going to be called on this, but
>>having get() and print() operate the way they currently do
>>DOES seem... ill-paired. There's really no need to change
>>get()--when writing two integers to a file, you're going
>>to HAVE to put something between them anyway. So maybe we
>>could try to find out how many folks use print() in a
>>manner that would cause problems if print() were changed
>>to attach a newline? I suspect it's not that many,
>>if any at all...)
>>
>>Rod Jackson
>
>I like symmetry, too, but I'd like it without the newline attached to
>every print(). The ability to write more than once to a line has
>so many uses in report writing that it would be hard to
>enumerate them.

I agree, but how many people need to use print() to keep stuff on the same line? print(), unlike printf() and puts(), is used for generic objects, not strings or formatted output:

    print (fn, "ABCD")  --> "{65,66,67,68}" is the literal
                        --> string stored in your file,
                        --> not 4 characters.

If someone uses it extensively in their datafiles, I can see them having a problem with the extra bytes eating up memory; but I'd really like to know if anyone is doing this (or is using the routine in a way that the format would be messed up by the extra byte.) I suspect, because of its incompatibility with get(), that practically no one ever uses print().

Of course, I could be wrong... I'm just curious.

Rod Jackson
7. Re: Problems with GET.E
- Posted by "Lucius L. Hilley III" <lhilley at CDC.NET> Nov 15, 1999
- 576 views
- Last edited Nov 16, 1999
I guess the easy fix is to override the built-in print command. I.E.:

procedure old_print(integer fn, object x)
    print(fn, x)  -- still refers to the built-in print() at this point
end procedure

without warning
procedure print(integer fn, object x)
    old_print(fn, x)  -- go through old_print(); calling print() here would recurse
    puts(fn, 10)  -- you can replace 10 with '\n' or {10} or "\n"
end procedure
with warning

        Lucius L. Hilley III
        lhilley at cdc.net
+----------+--------------+--------------+
| Hollow   | ICQ: 9638898 | AIM: LLHIII  |
|  Horse   +--------------+--------------+
| Software | http://www.cdc.net/~lhilley |
+----------+-----------------------------+
8. Re: Problems with GET.E
- Posted by Jiri Babor <J.Babor at GNS.CRI.NZ> Nov 17, 1999
- 588 views
Gals & Guys,

Since I felt at least partially responsible for the get.e fiasco, I thought I should try to fix it. Attached is my first go at it, a little kludgy, but it seems to work. I have done only a couple of quick tests, so be careful. I altered only the routines for get() and value() functions, I left the rest of the clutter untouched.

Test results, comments & other brutalities will be very much appreciated. If it rains tonight (no tennis!), I'll try to improve it. Then you will find a copy on my Euphoria page.

jiri

-- snip --------------------------------------------------------------

-- file    : get.e
-- author  : jiri babor
-- email   : jbabor at paradise.net.nz
-- project : get.e replacement
-- tool    : euphoria 2.1
-- date    : 99-11-17

------------------------------------
-- Input and Conversion Routines: --
-- get()                          --
-- value()                        --
-- wait_key()                     --
------------------------------------

-- error status values returned from get() and value():
global constant GET_SUCCESS = 0,
                GET_EOF = -1,
                GET_FAIL = 1

constant M_WAIT_KEY = 26

constant TRUE = 1, FALSE=0,
         DIGITS = "0123456789",
         HEX_DIGITS = DIGITS & "ABCDEF",
         START_NUMERIC = DIGITS & "-+.#"

type natural(integer x)
    return x >= 0
end type

type char(integer x)
    return x >= -1 and x <= 255
end type

object input_string     -- string to be read from
integer error_flag      -- error flag
natural input_file      -- file to be read from
natural string_next
char ch                 -- the current character

global function wait_key()
-- Get the next key pressed by the user.
-- Wait until a key is pressed.
    return machine_func(M_WAIT_KEY, 0)
end function

procedure get_ch()
-- set ch to the next character in the input stream (either string or file)
    if sequence(input_string) then
        if string_next <= length(input_string) then
            ch = input_string[string_next]
            string_next += 1
        else
            ch = GET_EOF
        end if
    else
        ch = getc(input_file)
    end if
end procedure

procedure skip_blanks()
-- skip white space
-- ch is "live" at entry and exit
    while find(ch, " \t\n") do
        get_ch()
    end while
end procedure

constant ESCAPE_CHARS = "nt'\"\\r",
         ESCAPED_CHARS = "\n\t'\"\\\r"

function escape_char(char c)
-- return escape character
    natural i
    i = find(c, ESCAPE_CHARS)
    if i = 0 then
        return GET_FAIL
    else
        return ESCAPED_CHARS[i]
    end if
end function

function get_qchar()
-- get a single-quoted character
-- ch is "live" at exit
    char c
    get_ch()
    c = ch
    if ch = '\\' then
        get_ch()
        c = escape_char(ch)
        if c = GET_FAIL then
            error_flag=GET_FAIL
            return 0
        end if
    elsif ch = '\'' then
        error_flag=GET_FAIL
        return 0
    end if
    get_ch()
    if ch != '\'' then
        error_flag=GET_FAIL
        return 0
    else
        get_ch()
        return c
    end if
end function -- get_qchar

function get_string()
-- get a double-quoted character string
-- ch is "live" at exit
    sequence text
    text = ""
    while TRUE do
        get_ch()
        if ch = GET_EOF or ch = '\n' then
            error_flag=GET_FAIL
            return 0
        elsif ch = '"' then
            get_ch()
            return text
        elsif ch = '\\' then
            get_ch()
            ch = escape_char(ch)
            if ch = GET_FAIL then
                error_flag=GET_FAIL
                return 0
            end if
        end if
        text &= ch
    end while
end function

type plus_or_minus(integer x)
    return x = -1 or x = +1
end type

function get_number()
-- read a number
-- ch is "live" at entry and exit
    plus_or_minus sign, e_sign
    natural ndigits
    integer hex_digit
    atom mantissa, dec, e_mag

    sign = +1
    mantissa = 0
    ndigits = 0

    -- process sign
    if ch = '-' then
        sign = -1
        get_ch()
    elsif ch = '+' then
        get_ch()
    end if

    -- get mantissa
    if ch = '#' then
        -- process hex integer and return
        get_ch()
        while TRUE do
            hex_digit = find(ch, HEX_DIGITS)-1
            if hex_digit >= 0 then
                ndigits += 1
                mantissa = mantissa * 16 + hex_digit
                get_ch()
            else
                if ndigits > 0 then
                    return sign * mantissa
                else
                    error_flag=GET_FAIL
                    return 0
                end if
            end if
        end while
    end if

    -- decimal integer or floating point
    while ch >= '0' and ch <= '9' do
        ndigits += 1
        mantissa = mantissa * 10 + (ch - '0')
        get_ch()
    end while

    if ch = '.' then
        -- get fraction
        get_ch()
        dec = 10
        while ch >= '0' and ch <= '9' do
            ndigits += 1
            mantissa += (ch - '0') / dec
            dec *= 10
            get_ch()
        end while
    end if

    if ndigits = 0 then
        error_flag=GET_FAIL
        return 0
    end if

    mantissa = sign * mantissa

    if ch = 'e' or ch = 'E' then
        -- get exponent sign
        e_sign = +1
        e_mag = 0
        get_ch()
        if ch = '-' then
            e_sign = -1
            get_ch()
        elsif ch = '+' then
            get_ch()
        end if
        -- get exponent magnitude
        if ch >= '0' and ch <= '9' then
            e_mag = ch - '0'
            get_ch()
            while ch >= '0' and ch <= '9' do
                e_mag = e_mag * 10 + ch - '0'
                get_ch()
            end while
        else
            -- no exponent
            error_flag=GET_FAIL
            return 0
        end if
        e_mag *= e_sign
        if e_mag > 308 then
            -- rare case: avoid power() overflow
            mantissa *= power(10, 308)
            if e_mag > 1000 then
                e_mag = 1000
            end if
            for i = 1 to e_mag - 308 do
                mantissa *= 10
            end for
        else
            mantissa *= power(10, e_mag)
        end if
    end if
    return mantissa
end function

function get_sequence()
    sequence s
    integer comma, first -- flags

    get_ch()
    skip_blanks()
    s={}
    comma=FALSE
    first=TRUE
    while ch!='}' do
        if comma or first then
            if find(ch, START_NUMERIC) then
                s=append(s, get_number())
            elsif ch = '{' then
                s=append(s, get_sequence())
                get_ch()
                skip_blanks()
            elsif ch = '\"' then
                s=append(s, get_string())
            elsif ch = '\'' then
                s=append(s, get_qchar())
            elsif ch = -1 then
                error_flag=GET_EOF
                return 0
            else
                error_flag=GET_FAIL
                return 0
            end if
            comma=FALSE
            first=FALSE
        elsif ch=',' then
            comma=TRUE
            get_ch()
            skip_blanks()
        else
            error_flag=GET_FAIL
            return 0
        end if
    end while
    if comma then
        error_flag=GET_FAIL
        return 0
    end if
    return s
end function -- get_sequence

function Get()
-- read a Euphoria data object as a string of characters
-- set error_flag and return value
    skip_blanks()
    if find(ch, START_NUMERIC) then
        return get_number()
    elsif ch = '{' then
        return get_sequence()
    elsif ch = '\"' then
        return get_string()
    elsif ch = '\'' then
        return get_qchar()
    elsif ch = -1 then
        error_flag=GET_EOF
        return 0
    else
        error_flag=GET_FAIL
        return 0
    end if
end function -- Get()

global function get(integer file)
-- Read the string representation of a Euphoria object
-- from a file. Convert to the value of the object.
-- Return {error_status, value}.
    input_file = file
    input_string = 0
    error_flag=GET_SUCCESS
    get_ch()
    return {error_flag, Get()}
end function

global function value(sequence string)
-- Read the representation of a Euphoria object
-- from a sequence of characters. Convert to the value of the object.
-- Return {error_status, value}.
    input_string = string
    string_next = 1
    error_flag=GET_SUCCESS
    get_ch()
    return {error_flag, Get()}
end function

global function prompt_number(sequence prompt, sequence range)
-- Prompt the user to enter a number.
-- A range of allowed values may be specified.
    object answer

    while 1 do
        puts(1, prompt)
        answer = gets(0) -- make sure whole line is read
        puts(1, '\n')
        answer = value(answer)
        if answer[1] != GET_SUCCESS or sequence(answer[2]) then
            puts(1, "A number is expected - try again\n")
        else
            if length(range) = 2 then
                if range[1] <= answer[2] and answer[2] <= range[2] then
                    return answer[2]
                else
                    printf(1, "A number from %g to %g is expected here - try again\n", range)
                end if
            else
                return answer[2]
            end if
        end if
    end while
end function

global function prompt_string(sequence prompt)
-- Prompt the user to enter a string
    object answer

    puts(1, prompt)
    answer = gets(0)
    puts(1, '\n')
    if sequence(answer) and length(answer) > 0 then
        return answer[1..length(answer)-1] -- trim the \n
    else
        return ""
    end if
end function

global function get_bytes(integer fn, integer n)
-- Return a sequence of n bytes (maximum) from an open file.
-- If n > 0 and fewer than n bytes are returned,
-- you've reached the end of file.
    sequence s
    integer c

    if n = 0 then
        return {}
    end if
    c = getc(fn)
    if c = -1 then
        return {}
    end if
    s = repeat(c, n)
    for i = 2 to n do
        s[i] = getc(fn)
    end for
    while s[n] = -1 do
        n -= 1
    end while
    return s[1..n]
end function
9. Re: Problems with GET.E
- Posted by Robert Craig <rds at ATTCANADA.NET> Nov 17, 1999
- 574 views
Thanks Jiri for attempting to improve get(). Your earlier efforts made the current released get() much faster.

I haven't looked at your code yet, because I suspect that you haven't taken into account the thing that really made me require whitespace between top-level objects. It's not just a problem of correctly parsing an input stream from a file. That can be done, and maybe you've achieved it. I think it was even done correctly in a much earlier release of Euphoria.

Consider the problem of parsing:

    123{99, 88, 77}

It will be necessary to read '1', '2' and '3' from the input file, and then read '{' to know that you've reached the end of the 123 atom. get() must then return the atom's value. The next call to get() must somehow see '{' again. That can be achieved by retaining an "ungotten" character or in some other way that requires get.e to retain some information between calls in some variable. This can work when you make consecutive calls to get() on the same input file. But what are you going to do if the user makes a call to get() *on a different file*, or if he performs a seek() to move to a different part of the same file? Your ungotten or saved character is now wrong.

You could try to fix this by instead seeking back one position at the end of get(), so the next get() will see the same character again. I don't want to get into this because I don't want to assume that seek() will work on all types of files/devices on all platforms. I could also attempt to implement an "unget" facility for all Euphoria file I/O, but as I said, I don't consider the whitespace separation requirement to be so terrible that it requires any major effort to improve it. In any case you will always need whitespace between top-level atoms.

Regards,
  Rob Craig
  Rapid Deployment Software
  http://www.RapidEuphoria.com
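The "ungotten character" idea in the post above can be sketched roughly as follows. This is an illustration only, not code from any released get.e; the names next_ch(), unget_ch(), pending_fn and pending_ch are hypothetical.

```euphoria
-- one-slot pushback buffer (hypothetical helper, not part of get.e)
integer pending_fn, pending_ch
pending_fn = -1          -- -1 means nothing is currently pushed back
pending_ch = -1

function next_ch(integer fn)
-- return the pushed-back character if it belongs to fn,
-- otherwise read a fresh character from the file
    integer c
    if pending_fn = fn then
        c = pending_ch
        pending_fn = -1
        return c
    end if
    return getc(fn)
end function

procedure unget_ch(integer fn, integer c)
-- remember one character so that the next next_ch(fn) sees it again
    pending_fn = fn
    pending_ch = c
end procedure
```

The saved character is only valid while the caller keeps reading the same file sequentially. A seek() on that file leaves a stale character in the slot, and an unget_ch() for a second file overwrites the one saved for the first, which is the objection raised in the post.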
10. Re: Problems with GET.E
- Posted by Jiri Babor <J.Babor at GNS.CRI.NZ> Nov 18, 1999
- 561 views
Rob,

Somehow, wrongly, I assumed print() inserts a space after each top level *number* (betraying my forth past, I suppose). This is not, strictly speaking, necessary with the other elements, because the *closing* curly bracket, as well as the quote and double quote characters themselves can be treated as separators.

You also wrote:

>You could try to fix this by instead seeking back one position at the
>end of get(), so the next get() will see the same character again. I
>don't want to get into this because I don't want to assume that seek()
>will work on all types of files/devices on all platforms. I could also
>attempt to implement an "unget" facility for all Euphoria file I/O,
>but as I said, I don't consider the whitespace separation requirement
>to be so terrible that it requires any major effort to improve it. In
>any case you will always need whitespace between top-level atoms.

The real remedy should be quite clear by now. Change the print command to output the extra space after each number, or better still, after each top level object. It will improve readability, so close to so many hearts in this forum, and it will not break any existing code or a database. And we will be able to use the 'print/get' pair as suggested in your documentation *without* any additional hassles. Truly formatted output is adequately covered by different routines anyway.

jiri
11. Re: Problems with GET.E
- Posted by Robert Craig <rds at ATTCANADA.NET> Nov 17, 1999
- 572 views
- Last edited Nov 18, 1999
Jiri Babor writes:
> The real remedy should be quite clear by now. Change
> the print command to output the extra space after each number,
> or better still, after each top level object.

Roderick Jackson already suggested that, and I am still opposed to that idea. Adding your own extra space with puts() is easy. Removing an unwanted extra space could be difficult. Also, some existing code would probably break.

Regards,
  Rob Craig
  Rapid Deployment Software
  http://www.RapidEuphoria.com
12. Re: Problems with GET.E
- Posted by Jiri Babor <J.Babor at GNS.CRI.NZ> Nov 18, 1999
- 566 views
Rob Craig wrote:

>Roderick Jackson already suggested that, and I am still
>opposed to that idea. Adding your own extra space with puts()
>is easy. Removing an unwanted extra space could
>be difficult. Also, some existing code would probably break.

No matter how hard I try I cannot imagine why on earth I would want to remove the extra space. Without it the data is unusable anyway.

jiri
13. Re: Problems with GET.E
- Posted by Robert Craig <rds at ATTCANADA.NET> Nov 18, 1999
- 576 views
Jiri Babor writes:
> No matter how hard I try I cannot imagine why on earth
> I would want to remove the extra space. Without it the
> data is unusable anyway.

I was mainly thinking of other situations where print() might be used. print() can be used quite independently of get(), just as get() can be (and usually is) used independently of print(). e.g.

    puts(1, "there are ")
    print(1, x)
    puts(1, " days until the year 2000\n")

A knowledgeable user could also do this with printf(), since in this case x is an atom, not a sequence.

I don't think it's a big deal if print() and get() do not match perfectly. Anyway, there are cases where using print() to create the input for get() will cause you problems, such as when you have floating-point data, and you need to preserve more than the 10 significant digits that print() gives you. In those cases you will need to use printf() (or for perfect results you should forget about print/get and use atom_to_float64()).

Regards,
  Rob Craig
  Rapid Deployment Software
  http://www.RapidEuphoria.com
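A minimal sketch of the atom_to_float64() route mentioned above, assuming Euphoria 2.1 with the standard machine.e and get.e. The atom is stored as its 8-byte IEEE representation in a binary file and converted back with float64_to_atom(), so no decimal digits are involved. The file name floats.bin is a made-up example.

```euphoria
include machine.e   -- atom_to_float64(), float64_to_atom()
include get.e       -- get_bytes()

constant FILENAME = "floats.bin"  -- hypothetical file name
atom x, y
integer fn
sequence bytes

x = 1/3

-- write the 8 raw bytes of the IEEE 64-bit representation
fn = open(FILENAME, "wb")
if fn = -1 then
    puts(2, "cannot open " & FILENAME & " for writing\n")
    abort(1)
end if
puts(fn, atom_to_float64(x))
close(fn)

-- read the 8 bytes back and rebuild the atom
fn = open(FILENAME, "rb")
if fn = -1 then
    puts(2, "cannot open " & FILENAME & " for reading\n")
    abort(1)
end if
bytes = get_bytes(fn, 8)
close(fn)

if length(bytes) = 8 then
    y = float64_to_atom(bytes)
    if x = y then
        puts(1, "value survived the round trip unchanged\n")
    else
        puts(1, "value changed\n")
    end if
else
    puts(1, "short read\n")
end if
```

The trade-off is that the file is no longer human-readable and cannot be parsed with get(), so this fits data files rather than text output.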
14. Re: Problems with GET.E
- Posted by jiri babor <jbabor at PARADISE.NET.NZ> Nov 18, 1999
- 594 views
- Last edited Nov 19, 1999
Thanks, Rob, for your frankness, rarely used currency these days. It's a pity I have not convinced you. print() is a comparatively specialized routine and all your arguments for status quo are pretty feeble. But I don't despair, I am sure you *will* change your mind. jiri
15. Re: Problems with GET.E
- Posted by Roderick Jackson <rjackson at CSIWEB.COM> Nov 18, 1999
- 630 views
Jiri Babor wrote:

>Rob Craig wrote:
>
>>Roderick Jackson already suggested that, and I am still
>>opposed to that idea. Adding your own extra space with puts()
>>is easy. Removing an unwanted extra space could
>>be difficult. Also, some existing code would probably break.
>
>No matter how hard I try I cannot imagine why on earth I would
>want to remove the extra space. Without it the data is unusable
>anyway.

Rob,

Thanks for making a statement on this one way or the other. I'm not trying to "twist your arm" or anything to change the language. But, I think you might want to spend at least a little bit of time reconsidering: I doubt there is more than a minute fraction of code out there that uses print() in such a way that a new format would break it. (I'm not counting the instances where it's used in between puts(), since that's generally an unformatted usage anyway.) If I may suggest, a sampling of user contributions would likely support this.

True, print() can be used separately from get(). But I would think that the instances where get() is used without print() are smaller in number than the instances where code would be broken by a new format. get() practically demands a parallel routine that writes data in a form get() can use. It's almost certain that anyone using get() would NOT want the space removed.

If you still have concerns about changing print(), perhaps a solution previously mentioned, a new routine (e.g., put()) could solve most ills from this. It wouldn't have to be a built-in; in fact, it would make sense to locate it in the GET.E file, along with get(). It could consist of nothing more than a call to print() and a call to puts() for the space. I only suggest making it a provided routine because (1) it then becomes a standard, and (2) it avoids the 'remove' problem of having several identical local routines (since practically anyone using get() will use it.)

Rod Jackson