1. Problems with GET.E

Is it me, or is the get() function in GET.E faulty? I seem
to recall discussion on the list regarding it a while back,
but I can't recall...

Basically, all I do is print() to a file, close it, reopen
it, then get() from the file. Correct me if I'm wrong, but
aren't several get() calls in a row supposed to return all
of the objects that were printed to the file? (And I'm
using the latest version of GET.E released with Eu 2.1)

If it's not, then might I humbly (but fervently) suggest
that it be made so? From what I can see, one way to do it
would be to have print() put a space after every object
it outputs.

If it IS supposed to work that way, then, um, someone will
have to tell me and I'll just post my code to the list for
examination...


Rod Jackson


2. Re: Problems with GET.E

Roderick Jackson writes:
> Correct me if I'm wrong, but aren't several get() calls
> in a row supposed to return all of the objects that were
> printed to the file? ...

get() requires that there be at least one character of
whitespace (blank, tab or new-line) separating the
top-level objects in the file.

You can easily observe that this will likely be
necessary between two atoms,
e.g.
        111 222

but it shouldn't be necessary between
two sequences, or an atom and a sequence e.g.:

        145.999{1,2,3}
or
        {1,2,3}{4,5,6}

because the braces should be enough.
So why does get() require whitespace in *all* cases?

It was much easier to implement that way (it's a long story),
and I didn't think anyone would really mind.

> From what I can see, one way to do it
> would be to have print() put a space after every object
> it outputs.

What about the users of print() who don't want the
extra space? Is it that hard to add:
      puts(fn, ' ')
to your code?
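
For reference, a minimal sketch of the round trip with the separator added by hand. The file name test.dat is a hypothetical scratch file, and the sketch assumes it can be created in the current directory and that the standard get.e is on the include path:

```euphoria
include get.e

integer fn
sequence r1, r2

-- write two top-level objects, each followed by one space
fn = open("test.dat", "w")
print(fn, {1,2,3})
puts(fn, ' ')
print(fn, 145.999)
puts(fn, ' ')
close(fn)

-- read them back; each get() returns {error_status, value}
fn = open("test.dat", "r")
r1 = get(fn)
r2 = get(fn)
close(fn)

if r1[1] = GET_SUCCESS and r2[1] = GET_SUCCESS then
    print(1, r1[2])
    puts(1, '\n')
    print(1, r2[2])
    puts(1, '\n')
end if
```

Always check the first element of the result before using the second; the value is only meaningful when the status is GET_SUCCESS.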

Regards,
     Rob Craig
     Rapid Deployment Software
     http://www.RapidEuphoria.com


3. Re: Problems with GET.E

Robert Craig wrote (in response to Rod Jackson):

>>Correct me if I'm wrong, but aren't several get() calls
>>in a row supposed to return all of the objects that were
>>printed to the file? ...
>
>get() requires that there be at least one character of
>whitespace (blank, tab or new-line) separating the
>top-level objects in the file.
>
>[...]
>
>>From what I can see, one way to do it
>>would be to have print() put a space after every object
>>it outputs.
>
>What about the users of print() who don't want the
>extra space? Is it that hard to add:
>      puts(fn, ' ')
>to your code?

I don't think it's so much a case of how easy or hard it is to add "puts(fn,
' ')" to a given section of code. For me, at least, it's a case of paired
functions having a not-quite symmetrical relationship. You can call get() as
many times in a row as you like, but for print() you have to throw in a
puts() for a space. The space between top-level objects may be necessary,
and certainly makes sense when you think about it, but it *is* somewhat
inconsistent with Euphoria's otherwise sensible and easy-to-understand
language design, IMO.

This is not limited to the get()/print() pair of functions, either -- Jason
Gade recently pointed out (11/12/1999; Subject: Euphoria features) that
position()/get_position() have a similarly non-symmetrical relationship.
Another (possibly imperfect) example which springs to mind would be the
video_config()/text_rows()/graphics_mode() family of routines.

In any event, for the immediate future, this is probably the best solution
for the problem in question:

global procedure put(integer fn, object data)
   print(fn, data)
   puts(fn, ' ')   -- or '\n', either one will work
end procedure

This way, we have the symmetrically-paired routines get() and put(). This
enables those who wish to perform multiple get() calls against a file to
have a self-contained procedure to write their data out to that file,
without forcing the users of print() to use up an extra byte.


Hep yadda,
Gabriel Boehme

----------
There are few things as convincing as death to remind us of the quality with
which we live our lives.

Robert Fripp
----------


4. Re: Problems with GET.E

Robert Craig wrote:
>> Correct me if I'm wrong, but aren't several get() calls
>> in a row supposed to return all of the objects that were
>> printed to the file? ...
>
>get() requires that there be at least one character of
>whitespace (blank, tab or new-line) separating the
>top-level objects in the file.
>
>You can easily observe that this will likely be
>necessary between two atoms,
>e.g.
>        111 222
>
>but it shouldn't be necessary between
>two sequences, or an atom and a sequence e.g.:
>
>        145.999{1,2,3}
>or
>        {1,2,3}{4,5,6}
>
>because the braces should be enough.
>So why does get() require whitespace in *all* cases?
>
>It was much easier to implement that way (it's a long story),
>and I didn't think anyone would really mind.
>
>> From what I can see, one way to do it
>> would be to have print() put a space after every object
>> it outputs.
>
>What about the users of print() who don't want the
>extra space? Is it that hard to add:
>      puts(fn, ' ')
>to your code?

No... if you really *meant* it the way it is, I'm not
asking for a change; I can see some difficulty with
affixing a space or newline to the end of everything
print()ed. I just thought that your original intent
was to have people be able to use:

   print (fn, obj1)
   print (fn, obj2)
   ...
   ext_obj1 = get (fn)
   ext_obj2 = get (fn)

as naturally as they would puts() and getc(). As things
currently stand, it means I may have to leave out parallel
functions for print() and get() from my next project--
you'll see why soon. But, if that's the way they were
meant to operate anyway, ah well...


... meanwhile, Gabriel Boehme wrote:
>In any event, for the immediate future, this is probably the best solution
>for the problem in question:
>
>global procedure put(integer fn, object data)
>   print(fn, data)
>   puts(fn, ' ')   -- or '\n', either one will work
>end procedure

I'll be certain to do something like this whenever I start
working with large datafiles. For now though, the situation
isn't causing any real problems--I just thought I'd clear it
up before it did.

(Side note: I know I'm going to be called on this, but
having get() and print() operate the way they currently do
DOES seem... ill-paired. There's really no need to change
get()--when writing two integers to a file, you're going
to HAVE to put something between them anyway. So maybe we
could try to find out how many folks use print() in a
manner that would cause problems if print() were changed
to attach a newline? I suspect it's not that many,
if any at all...)


Rod Jackson


5. Re: Problems with GET.E

Roderick Jackson  wrote:

>(Side note: I know I'm going to be called on this, but
>having get() and print() operate the way they currently do
>DOES seem... ill-paired. There's really no need to change
>get()--when writing two integers to a file, you're going
>to HAVE to put something between them anyway. So maybe we
>could try to find out how many folks use print() in a
>manner that would cause problems if print() were changed
>to attach a newline? I suspect it's not that many,
>if any at all...)
>
>Rod Jackson

I like symmetry, too, but I'd like it without the newline attached to
every print(). The ability to write more than once to a line has
so many uses in report writing that it would be hard to
enumerate them. Especially with the lack of string facilities in
this language, it can simplify code quite a bit. A carriage return
without a line feed would be less problematic, but still
inconvenient.

Everett L.(Rett) Williams
rett at gvtc.com


6. Re: Problems with GET.E

Everett Williams wrote:
>Roderick Jackson  wrote:
>
>>(Side note: I know I'm going to be called on this, but
>>having get() and print() operate the way they currently do
>>DOES seem... ill-paired. There's really no need to change
>>get()--when writing two integers to a file, you're going
>>to HAVE to put something between them anyway. So maybe we
>>could try to find out how many folks use print() in a
>>manner that would cause problems if print() were changed
>>to attach a newline? I suspect it's not that many,
>>if any at all...)
>>
>>Rod Jackson
>
>I like symmetry, too, but I'd like it without the newline attached to
>every print(). The ability to write more than once to a line has
>so many uses in report writing that it would be hard to
>enumerate them.

I agree, but how many people need to use print() to
keep stuff on the same line? print(), unlike printf()
and puts(), is used for generic objects, not strings
or formatted output:

   print (fn, "ABCD") --> "{65,66,67,68}" is the literal
                      --> string stored in your file,
                      --> not 4 characters.

If someone uses it extensively in their datafiles, I can
see them having a problem with the extra bytes eating up
memory; but I'd really like to know if anyone is doing
this (or is using the routine in a way that the format
would be messed up by the extra byte.) I suspect, because
of its incompatibility with get(), that practically no
one ever uses print().

Of course, I could be wrong... I'm just curious.


Rod Jackson


7. Re: Problems with GET.E

I guess the easy fix is to override the built-in print command.

I.E.:

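procedure old_print(integer fn, object x)
  print(fn, x)   -- still refers to the built-in print() here
end procedure

without warning
procedure print(integer fn, object x)
  old_print(fn, x)   -- go through the wrapper; calling print() here would recurse
  puts(fn, 10)
  -- you can replace 10 with '\n' or {10} or "\n"
end procedure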
with warning

        Lucius L. Hilley III
        lhilley at cdc.net
+----------+--------------+--------------+
| Hollow   | ICQ: 9638898 | AIM: LLHIII  |
|  Horse   +--------------+--------------+
| Software | http://www.cdc.net/~lhilley |
+----------+-----------------------------+


8. Re: Problems with GET.E

Gals & Guys,

Since I felt at least partially responsible for the get.e fiasco, I
thought I should try to fix it. Attached is my first go at it, a
little kludgy, but it seems to work. I have done only a couple of
quick tests, so be careful. I altered only the routines for get() and
value() functions, I left the rest of the clutter untouched.

Test results, comments & other brutalities will be very much
appreciated.

If it rains tonight (no tennis!), I'll try to improve it. Then you will
find a copy on my Euphoria page.

jiri


-- snip --------------------------------------------------------------

--  file    : get.e
--  author  : jiri babor
--  email   : jbabor at paradise.net.nz
--  project : get.e replacement
--  tool    : euphoria 2.1
--  date    : 99-11-17

------------------------------------
-- Input and Conversion Routines: --
-- get()                          --
-- value()                        --
-- wait_key()                     --
------------------------------------

-- error status values returned from get() and value():
global constant
    GET_SUCCESS = 0,
    GET_EOF = -1,
    GET_FAIL = 1

constant M_WAIT_KEY = 26

constant
    TRUE = 1,
    FALSE = 0,
    DIGITS = "0123456789",
    HEX_DIGITS = DIGITS & "ABCDEF",
    START_NUMERIC = DIGITS & "-+.#"


type natural(integer x)
    return x >= 0
end type

type char(integer x)
    return x >= -1 and x <= 255
end type

object input_string -- string to be read from
integer error_flag  -- error flag
natural input_file  -- file to be read from
natural string_next
char ch             -- the current character


global function wait_key()
    -- Get the next key pressed by the user.
    -- Wait until a key is pressed.

    return machine_func(M_WAIT_KEY, 0)
end function

procedure get_ch()
    -- set ch to the next character in the input stream (either string or file)

    if sequence(input_string) then
        if string_next <= length(input_string) then
            ch = input_string[string_next]
            string_next += 1
        else
            ch = GET_EOF
        end if
    else
        ch = getc(input_file)
    end if
end procedure

procedure skip_blanks()
    -- skip white space
    -- ch is "live" at entry and exit

    while find(ch, " \t\n") do
        get_ch()
    end while
end procedure

constant
    ESCAPE_CHARS = "nt'\"\\r",
    ESCAPED_CHARS = "\n\t'\"\\\r"

function escape_char(char c)
    -- return escape character

    natural i

    i = find(c, ESCAPE_CHARS)
    if i = 0 then
         return GET_FAIL
    else
         return ESCAPED_CHARS[i]
    end if
end function

function get_qchar()
    -- get a single-quoted character
    -- ch is "live" at exit

    char c

    get_ch()
    c = ch
    if ch = '\\' then
         get_ch()
         c = escape_char(ch)
         if c = GET_FAIL then
            error_flag=GET_FAIL
              return 0
         end if
    elsif ch = '\'' then
        error_flag=GET_FAIL
        return 0
    end if
    get_ch()
    if ch != '\'' then
        error_flag=GET_FAIL
        return 0
    else
         get_ch()
         return c
    end if
end function -- get_qchar

function get_string()
    -- get a double-quoted character string
    -- ch is "live" at exit

    sequence text

    text = ""
    while TRUE do
        get_ch()
        if ch = GET_EOF or ch = '\n' then
            error_flag = GET_FAIL
            return 0
        elsif ch = '"' then
            get_ch()
            return text
        elsif ch = '\\' then
            get_ch()
            ch = escape_char(ch)
            if ch = GET_FAIL then
                error_flag = GET_FAIL
                return 0
            end if
        end if
        text &= ch
    end while
end function

type plus_or_minus(integer x)
    return x = -1 or x = +1
end type

function get_number()
    -- read a number
    -- ch is "live" at entry and exit

    plus_or_minus sign, e_sign
    natural ndigits
    integer hex_digit
    atom mantissa, dec, e_mag

    sign = +1
    mantissa = 0
    ndigits = 0

    -- process sign
    if ch = '-' then
        sign = -1
        get_ch()
    elsif ch = '+' then
        get_ch()
    end if

    -- get mantissa
    if ch = '#' then         -- process hex integer and return
     get_ch()
     while TRUE do
         hex_digit = find(ch, HEX_DIGITS)-1
         if hex_digit >= 0 then
              ndigits += 1
              mantissa = mantissa * 16 + hex_digit
              get_ch()
         else
               if ndigits > 0 then
                   return sign * mantissa
               else
                    error_flag=GET_FAIL
                    return 0
               end if
         end if
     end while
    end if

    -- decimal integer or floating point
    while ch >= '0' and ch <= '9' do
     ndigits += 1
     mantissa = mantissa * 10 + (ch - '0')
     get_ch()
    end while

    if ch = '.' then         -- get fraction
     get_ch()
     dec = 10
     while ch >= '0' and ch <= '9' do
         ndigits += 1
         mantissa += (ch - '0') / dec
         dec *= 10
         get_ch()
     end while
    end if

    if ndigits = 0 then
        error_flag=GET_FAIL
        return 0
    end if

    mantissa = sign * mantissa

    if ch = 'e' or ch = 'E' then         -- get exponent sign
     e_sign = +1
     e_mag = 0
     get_ch()
     if ch = '-' then
         e_sign = -1
         get_ch()
     elsif ch = '+' then
         get_ch()
     end if
     -- get exponent magnitude
     if ch >= '0' and ch <= '9' then
         e_mag = ch - '0'
         get_ch()
         while ch >= '0' and ch <= '9' do
          e_mag = e_mag * 10 + ch - '0'
          get_ch()
         end while
     else                    -- no exponent
            error_flag=GET_FAIL
            return 0
     end if
     e_mag *= e_sign
     if e_mag > 308 then
         -- rare case: avoid power() overflow
         mantissa *= power(10, 308)
         if e_mag > 1000 then
          e_mag = 1000
         end if
         for i = 1 to e_mag - 308 do
          mantissa *= 10
         end for
     else
         mantissa *= power(10, e_mag)
     end if
    end if

    return mantissa
end function

function get_sequence()
    sequence s
    integer comma, first    -- flags

    get_ch()
    skip_blanks()
    s={}
    comma=FALSE
    first=TRUE
    while ch!='}' do
        if comma or first then
            if find(ch, START_NUMERIC) then
              s=append(s, get_number())
            elsif ch = '{' then
                s=append(s, get_sequence())
                get_ch()
                skip_blanks()
            elsif ch = '\"' then
              s=append(s, get_string())
            elsif ch = '\'' then
              s=append(s, get_qchar())
            elsif ch = -1 then
                error_flag=GET_EOF
                return 0
            else
                error_flag=GET_FAIL
                return 0
            end if
            comma=FALSE
            first=FALSE
        elsif ch=',' then
            comma=TRUE
            get_ch()
            skip_blanks()
        else
            error_flag=GET_FAIL
            return 0
        end if
    end while
    if comma then
        error_flag=GET_FAIL
        return 0
    end if
    return s
end function -- get_sequence

function Get()
    -- read a Euphoria data object as a string of characters
    -- set error_flag and return value

    skip_blanks()

    if find(ch, START_NUMERIC) then
         return get_number()
    elsif ch = '{' then
        return get_sequence()
    elsif ch = '\"' then
         return get_string()
    elsif ch = '\'' then
         return get_qchar()
    elsif ch = -1 then
        error_flag=GET_EOF
        return 0
    else
        error_flag=GET_FAIL
        return 0
    end if
end function -- Get()

global function get(integer file)
    -- Read the string representation of a Euphoria object
    -- from a file. Convert to the value of the object.
    -- Return {error_status, value}.

    input_file = file
    input_string = 0
    error_flag=GET_SUCCESS
    get_ch()
    return {error_flag, Get()}
end function

global function value(sequence string)
    -- Read the representation of a Euphoria object
    -- from a sequence of characters. Convert to the value of the object.
    -- Return {error_status, value}.

    input_string = string
    string_next = 1
    error_flag=GET_SUCCESS
    get_ch()
    return {error_flag, Get()}
end function

global function prompt_number(sequence prompt, sequence range)
-- Prompt the user to enter a number.
-- A range of allowed values may be specified.
    object answer

    while 1 do
      puts(1, prompt)
      answer = gets(0) -- make sure whole line is read
      puts(1, '\n')

      answer = value(answer)
      if answer[1] != GET_SUCCESS or sequence(answer[2]) then
           puts(1, "A number is expected - try again\n")
      else
          if length(range) = 2 then
            if range[1] <= answer[2] and answer[2] <= range[2] then
                return answer[2]
            else
                printf(1,
                "A number from %g to %g is expected here - try again\n",
                 range)
            end if
           else
            return answer[2]
           end if
      end if
    end while
end function

global function prompt_string(sequence prompt)
    -- Prompt the user to enter a string

    object answer

    puts(1, prompt)
    answer = gets(0)
    puts(1, '\n')
    if sequence(answer) and length(answer) > 0 then
         return answer[1..length(answer)-1] -- trim the \n
    else
         return ""
    end if
end function

global function get_bytes(integer fn, integer n)
    -- Return a sequence of n bytes (maximum) from an open file.
    -- If n > 0 and fewer than n bytes are returned,
    -- you've reached the end of file.

    sequence s
    integer c

    if n = 0 then
        return {}
    end if
    c = getc(fn)
    if c = -1 then
        return {}
    end if
    s = repeat(c, n)
    for i = 2 to n do
        s[i] = getc(fn)
    end for
    while s[n] = -1 do
        n -= 1
    end while
    return s[1..n]
end function


9. Re: Problems with GET.E

Thanks Jiri for attempting to improve get().
Your earlier efforts made the current
released get() much faster.

I haven't looked at your code yet, because
I suspect that you haven't taken into account
the thing that really made me require whitespace
between top-level objects. It's not just a problem
of correctly parsing an input stream from a file.
That can be done, and maybe you've achieved it.
I think it was even done correctly in a much earlier release of
Euphoria.

Consider the problem of parsing:

                123{99, 88, 77}

It will be necessary to read '1', '2' and '3' from the input file,
and then read '{' to know that you've reached the end of the
123 atom. get() must then return the atom's value.
The next call to get() must somehow see '{' again. That can
be achieved by retaining an "ungotten" character or in
some other way that requires get.e to retain some information
between calls in some variable. This can work when
you make consecutive calls to get() on the same input file.
But what are you going to do if the user makes a call to
get() *on a different file*, or if he performs a seek() to move
to a different part of the same file? Your ungotten or
saved character is now wrong.

You could try to fix this by instead seeking back one position
at the end of get(), so the next get() will see the same
character again. I don't want to get into this because
I don't want to assume that seek() will work on all types
of files/devices on all platforms. I could also attempt to
implement an "unget" facility for all Euphoria file I/O,
but as I said, I don't consider the whitespace
separation requirement to be so terrible that it requires
any major effort to improve it. In any case you will
always need whitespace between top-level atoms.
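
The "ungotten character" idea above can be sketched roughly as follows. This is a hypothetical pushback layer, not part of get.e; the names next_ch(), unget_ch(), ungot_ch and ungot_fn are invented for illustration:

```euphoria
-- hypothetical one-character pushback layer; not part of get.e
integer ungot_ch, ungot_fn
ungot_fn = -1       -- -1 means no character is currently saved
ungot_ch = -1

function next_ch(integer fn)
    -- return the saved character if it belongs to this file,
    -- otherwise read a fresh one from the file
    integer c
    if ungot_fn = fn then
        c = ungot_ch
        ungot_fn = -1
        return c
    end if
    return getc(fn)
end function

procedure unget_ch(integer fn, integer c)
    -- remember c so that the next next_ch(fn) sees it again
    ungot_fn = fn
    ungot_ch = c
end procedure
```

Recording the file number keeps a saved character from being handed to a different file, but with a single slot a second unget_ch() on another file silently discards the first saved character. Nothing in the sketch can notice a seek() on the same file, so the saved character can still be stale in exactly the way described above.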

Regards,
     Rob Craig
     Rapid Deployment Software
     http://www.RapidEuphoria.com


10. Re: Problems with GET.E

Rob,

Somehow, wrongly, I assumed print() inserts a space after each top
level *number* (betraying my forth past, I suppose). This is not,
strictly speaking, necessary with the other elements, because the
*closing* curly bracket, as well as the quote and double quote
characters themselves can be treated as separators.

You also wrote:

>You could try to fix this by instead seeking back one position at the
>end of get(), so the next get() will see the same character again. I
>don't want to get into this because I don't want to assume that seek()
>will work on all types of files/devices on all platforms. I could also
>attempt to implement an "unget" facility for all Euphoria file I/O,
>but as I said, I don't consider the whitespace separation requirement
>to be so terrible that it requires any major effort to improve it. In
>any case you will always need whitespace between top-level atoms.

The real remedy should be quite clear by now. Change the print command
to output the extra space after each number, or better still, after
each top level object. It will improve readability, so close to so
many hearts in this forum, and it will not break any existing code or
a database. And we will be able to use the 'print/get' pair as
suggested in your documentation *without* any additional hassles.
Truly formatted output is adequately covered by different routines
anyway.

jiri


11. Re: Problems with GET.E

Jiri Babor writes:
> The real remedy should be quite clear by now. Change
> the print command to output the extra space after each number,
> or better still, after each top level object.

Roderick Jackson already suggested that, and I am still
opposed to that idea. Adding your own extra space with puts()
is easy. Removing an unwanted extra space could
be difficult. Also, some existing code would probably break.

Regards,
     Rob Craig
     Rapid Deployment Software
     http://www.RapidEuphoria.com


12. Re: Problems with GET.E

Rob Craig wrote:

>Roderick Jackson already suggested that, and I am still
>opposed to that idea. Adding your own extra space with puts()
>is easy. Removing an unwanted extra space could
>be difficult. Also, some existing code would probably break.

No matter how hard I try I cannot imagine why on earth I would
want to remove the extra space. Without it the data is unusable
anyway.

jiri


13. Re: Problems with GET.E

Jiri Babor writes:
> No matter how hard I try I cannot imagine why on earth
> I would want to remove the extra space. Without it the
> data is unusable anyway.

I was mainly thinking of other situations where print() might be
used. print() can be used quite independently of get(), just
as get() can be (and usually is) used independently of print().

e.g.
        puts(1, "there are ")
        print(1, x)
        puts(1, " days until the year 2000\n")

A knowledgeable user could also do this with printf(),
since in this case x is an atom, not a sequence.

I don't think it's a big deal if print() and get() do not
match perfectly. Anyway, there are cases where
using print() to create the input for get() will cause you
problems, such as when you have floating-point data,
and you need to preserve more than the 10 significant digits
that print() gives you. In those cases you will need to use
printf() (or for perfect results you should forget about
print/get and use atom_to_float64()).
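
A minimal sketch of the atom_to_float64() route, assuming a hypothetical scratch file float.dat and the machine.e and get.e files shipped with Euphoria 2.1:

```euphoria
include machine.e   -- atom_to_float64(), float64_to_atom()
include get.e       -- get_bytes()

integer fn
atom x, y

x = 1/3             -- needs more significant digits than print() keeps

-- write the 8 raw IEEE bytes; "wb" avoids newline translation
fn = open("float.dat", "wb")
puts(fn, atom_to_float64(x))
close(fn)

-- read the 8 bytes back and rebuild the atom
fn = open("float.dat", "rb")
y = float64_to_atom(get_bytes(fn, 8))
close(fn)

if x = y then
    puts(1, "round trip is exact\n")
end if
```

The file is binary, so it is not human-readable and cannot be mixed with get(); the trade is readability for an exact value.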

Regards,
     Rob Craig
     Rapid Deployment Software
     http://www.RapidEuphoria.com


14. Re: Problems with GET.E

Thanks, Rob, for your frankness, rarely used currency these days. It's
a pity I have not convinced you. print() is a comparatively
specialized routine and all your arguments for status quo are pretty
feeble. But I don't despair, I am sure you *will* change your mind. ;)

jiri


15. Re: Problems with GET.E

Jiri Babor wrote:
>Rob Craig wrote:
>
>>Roderick Jackson already suggested that, and I am still
>>opposed to that idea. Adding your own extra space with puts()
>>is easy. Removing an unwanted extra space could
>>be difficult. Also, some existing code would probably break.
>
>No matter how hard I try I cannot imagine why on earth I would
>want to remove the extra space. Without it the data is unusable
>anyway.

Rob,

Thanks for making a statement on this one way or the
other. I'm not trying to "twist your arm" or anything
to change the language. But, I think you might want
to spend at least a little bit of time reconsidering:

I doubt there is more than a minute fraction of code
out there that uses print() in such a way that a new
format would break it. (I'm not counting the
instances where it's used in between puts(), since
that's generally an unformatted usage anyway.) If I
may suggest, a sampling of user contributions would
likely support this.

True, print() can be used separately from get(). But I
would think that the instances where get() is used
without print() are smaller in number than the
instances where code would be broken by a new format.
get() practically demands a parallel routine that
writes data in a form get() can use. It's almost certain
that anyone using get() would NOT want the space removed.

If you still have concerns about changing print(),
perhaps the solution mentioned previously, a new routine
(e.g., put()), could solve most of these ills. It wouldn't
have to be a built-in; in fact, it would make sense to
locate it in the GET.E file, along with get(). It could
consist of nothing more than a call to print() and a call
to puts() for the space. I only suggest making it a
provided routine because (1) it then becomes a standard,
and (2) it avoids the 'remove' problem of having several
identical local routines (since practically anyone using
get() will use it.)


Rod Jackson

