load_map in map.e - unstructured programming example.
- Posted by gimlet May 08, 2014
- 1512 views
Unfortunately this is a long post because I feel it is necessary to provide background as the code is quite obscure.
I have been reluctant to post because I could be seen as 'picking-on' Euphoria and posting the negative. However nothing is improved unless it is questioned.
So to begin
If you read the notes you may understand what the procedure is intended to do. But a precis:
Blank lines and comments are ignored. A key-value line is <key> = <value> where <key> and <value> are Eu objects. A line may be laid out over several lines if each line except the last ends with a continuation character ( `,${` ). Lines are evaluated applying get:value() to the key and value and placed into a map.
load_map is a potentially valuable procedure. It is a pity it is all but unusable.
load_map and save_map have been mentioned several times as having problems but the code has not been corrected (a major task?) or removed.
-- -- Example 1: -- <eucode--** -- Loads a map from a file -- -- Parameters: -- # ##file_name_p## : The file to load from. This file may have been created -- by the [[:save_map]] function. This can either be a -- name of a file or an already opened file handle. -- -- Returns: -- Either a **map**, with all the entries found in ##file_name_p##, or **-1** -- if the file failed to open, or **-2** if the file is incorrectly formatted. -- -- Comments: -- If ##file_name_p## is an already opened file handle, this routine will write -- to that file and not close it. Otherwise, the named file will be created and -- closed by this routine. -- -- The input file can be either one created by the [[:save_map]] function or -- a manually created/edited text file. See [[:save_map]] for details about -- the required layout of the text file. And relevant details from save_map --** -- Saves a map to a file. -- -- Parameters: -- # ##m## : a map. -- # ##file_name_p## : Either a sequence, the name of the file to save to, -- or an open file handle as returned by [[:open]](). -- # ##type## : an integer. SM_TEXT for a human-readable format (default), -- SM_RAW for a smaller and faster format, but not human-readable. -- -- Returns: -- An **integer**, the number of keys saved to the file, or -1 if the -- save failed. -- -- Comments: -- If ##file_name_p## is an already opened file handle, this routine will write -- to that file and not close it. Otherwise, the named file will be created and -- closed by this routine. -- -- The SM_TEXT type saves the map keys and values in a text format which can -- be read and edited by standard text editor. Each entry in the map is saved as -- a KEY/VALUE pair in the form \\ -- {{{ -- key = value -- }}} -- Note that if the 'key' value is a normal string value, it can be enclosed in -- double quotes. If it is not thus quoted, the first character of the key -- determines its Euphoria value type. A dash or digit implies an atom, an left-brace -- implies a sequence, an alphabetic character implies a text string that extends to the -- next equal '=' symbol, and anything else is ignored. -- -- Note that if a line contains a double-dash, then all text from the double-dash -- to the end of the line will be ignored. This is so you can optionally add -- comments to the saved map. Also, any blank lines are ignored too. -- -- All text after the '=' symbol is assumed to be the map item's value data. -- -- Because some map data can be rather long, it is possible to split the text into -- multiple lines, which will be considered by [[:load_map]] as a single //logical// -- line. If an line ends with a comma (,) or a dollar sign ($), then the next actual -- line is appended to the end of it. After all these physical lines have been -- joined into one logical line, all combinations of `",$"` and `,$` are removed. -- -- For example: -- {{{ -- one = {"first", -- "second", -- "third", -- $ -- } -- second = "A long text ",$ -- "line that has been",$ -- " split into three lines" -- third = {"first", -- "second", -- "third"} -- }}} -- is equivalent to -- {{{ -- one = {"first","second","third"} -- second = "A long text line that has been split into three lines" -- third = {"first","second","third"} -- }}} -- -- -- Example 1: -- <eucode> -- include std/error.e -- -- map AppOptions -- if save_map(AppOptions, "c:\myapp\options.txt") = -1 -- crash("Failed to save application options") -- end if -- -- if save_map(AppOptions, "c:\myapp\options.dat", SM_RAW) = -1 -- crash("Failed to save application options") -- end if -- </eucode> -- -- See Also: -- [[:load_map]] --
lines 1579 to 1636
while sequence(logical_line) with entry do delim_pos = find('=', logical_line) if delim_pos > 0 then data_key = text:trim(logical_line[1..delim_pos-1]) if length(data_key) > 0 then = data_key = search:match_replace("\\-", data_key, "-") if not t_alpha(data_key[1]) then conv_res = stdget:value(data_key,,stdget:GET_LONG_ANSWER) if conv_res[1] = stdget:GET_SUCCESS then if conv_res[3] = length(data_key) then data_key = conv_res[2] end if end if end if data_value = text:trim(logical_line[delim_pos+1..$]) data_value = search:match_replace("\\-", data_value, "-") conv_res = stdget:value(data_value,,stdget:GET_LONG_ANSWER) if conv_res[1] = stdget:GET_SUCCESS then if conv_res[3] = length(data_value) then data_value = conv_res[2] end if end if put(new_map, data_key, data_value) end if end if entry logical_line = -1 while sequence(line_in) with entry do if atom(logical_line) then logical_line = "" end if has_comment = search:rmatch("--", line_in) if has_comment != 0 then line_in = text:trim(line_in[1..has_comment-1]) else line_in = text:trim(line_in) end if logical_line &= line_in if length(line_in) then if not find(line_in[$], line_conts) then logical_line = search:match_replace(`",$"`, logical_line, "") logical_line = search:match_replace(`,$`, logical_line, "") exit end if end if entry line_in = gets(file_handle) end while end while
Part 1: the structure
start with the basic structure: using shorter names
object ll, line while sequence(ll) with entry do ... entry ll = -1 -- not needed while sequence(line) with entry do if not sequence(ll) then ll = "" end if -- consequence of ll = -1 ... if length(line) then -- ignoring blank lines?? if line[$] not in conts then -- clean ll exit -- exit condition should be end if -- length(ll) and not ll[$] in conts end if entry line = gets(fh) end while end while -- but this is unsatisfactory because if ll is not "" -- and line = -1 then the loop exits with an -- incomplete ll is fed into the main loop.
What is, should be, happening in the inner loop - the inner loop is combining getting ll with completing ll. Hence the mess.
ll line action correct action ----------------------------------------------------- "" -1 exit inner exit outer -- unnecessary cont -1 exit inner error -- ll is incomplete "" "" loop loop inner cont "" loop error -- comments should be separate "" line assign exit inner assign exit inner cont line append exit inner append exit inner "" cont assign assign cont cont append append Note: it is much simpler to get ll before considering continuations
-- rewrite the inner loop while TRUE label "MAIN" do object ll, loop do ll = getline(fh) -- ** we are not defining getline here (see Part 2) if not sequence(ll) then exit "MAIN" end if until length(ll) != 0 end loop while find(ll[$], line_conts) do object ln = getline(fh) if not sequence(ln) then goto "ERROR" end if if not length(ln) then goto "ERROR" end if end while -- clean ll -- process ll end while
I have not included functions for getline() and clean() as they are not particularly relevant.
Part 2 errors
1 examples which ignore quotes and other structures
has_comment = search:rmatch("--", ln) -- note search from the right if has_comment != 0 then ln = text:trim(ln[1..has_comment-1]) else ln = text:trim(ln) end if ll = search:match_replace(`",$"`, ll, "") -- assumes these sequences ll = search:match_replace(`,$`, ll, "") -- will not contain spaces delim_pos = find('=', ll)
2 ignoring the possible error
Where is else clause?
This silent passing over of errors is totally anti-user
if delim_pos > 0 then ... end if if length(data_key) > 0 then ... end if if conv_res[1] = stdget:GET_SUCCESS then ... end if
If one record is misread then
3 assuming data entry is correct
Assuming the line ends when there are no more continuations. Or that the line doesn't end until there are no more continuations.
are unjustified assumptions.
Lines contain sequences among other things and sequences can nest and one can have a sequence as a key or a value. One can easily miss a comma at line-end or before a comment.
Safe is to assume that blank lines and comments come between data blocks and report errors.
Why does he ignore the possibility of special characters `=,${` appearing in quotes? Why allow unquoted text?
Because this way there are no errors?
4 the true test of fidelity
If the text read in produces the same number of key-value pairs as are defined and when read out the resultant text defines the same key-value pairs as in the original then you have a successful read (of one text).
5 Conclusion
It seems to me the writer of this procedure was more taken with the cleverness of nested while with entry loop than with any integrity the program might have.