Re: load_map in map.e - unstructured programming example.

new topic     » goto parent     » topic index » view thread      » older message » newer message
gimlet said...

Unfortunately this is a long post because I feel it is necessary to provide background as the code is quite obscure.

Understandable.

gimlet said...

However nothing is improved unless it is questioned.

Agreed.

gimlet said...

I have been reluctant to post because I could be seen as 'picking-on' Euphoria and posting the negative.

You say that, but you also say this.

gimlet said...

5 Conclusion

It seems to me the writer of this procedure was more taken with the cleverness of nested while with entry loop than with any integrity the program might have.

The latter is more 'picking-on' than any direct criticism of code could be.

gimlet said...

A key-value line is <key> = <value> where <key> and <value> are Eu objects.

Lines are evaluated applying get:value() to the key and value and placed into a map.

They can also be unquoted text, which are turned into strings in Eu.

gimlet said...

load_map is a potentially valuable procedure. It is a pity it is all but unusable.

It's designed to work with the output of save_map() and only that. Are you aware of any cases where save_map() will create a file that load_map() can't load? If not, then you are just plain wrong here.

Of course, perhaps there are enhancements to load_map() to make it more usuable when dealing with hand-cranked map files. I won't disagree there, I'll only point out that this wasn't the original purpose of load_map() but a concept that was added later.

gimlet said...

load_map and save_map have been mentioned several times as having problems but the code has not been corrected (a major task?)

References to that would help. I admit that I generally don't follow stuff about maps in general, hence my lack of familiarity with these in particular.

gimlet said...

or removed.

They work, so why would they be removed? This just seems silly.

gimlet said...

start with the basic structure: using shorter names

I disagree in general - we want the names to be as readable as possible. ll is really bad in that regard, though changing line_in to line is probably ok.

gimlet said...
   ll = -1                                   -- not needed 
   while sequence(line) with entry do 
     if not sequence(ll) then ll = "" end if -- consequence of ll = -1 

Conceded. We can probably do this instead:

   object ll = "", ... 
   ... 
   while sequence(line) with entry do 

though perhaps with at least some of the original variable names, instead of shorthand ones.

gimlet said...
     if length(line) then                    -- ignoring blank lines?? 

Yep. That's in spec. As you said,

gimlet said...

Blank lines and comments are ignored.

Perhaps the documentation could be clearer about that.

gimlet said...
        if line[$] not in conts then 
          -- clean ll 
          exit                               -- exit condition should be 
        end if                               -- length(ll) and not ll[$] in conts 

I disagree. If that were the case, wouldn't the function exit as soon as it read the first line in, regardless of line continuation? It'd break the line continuation functionality.

gimlet said...
                  -- but this is unsatisfactory because if ll is not ""  
                            -- and line = -1 then the loop exits with an  
                            -- incomplete ll is fed into the main loop. 
 

I think we can live with that. We've hit the end of the file in that case, so it is guaranteed that there's no more line input to add to the logical line. We don't have the ability to make the logical line any more complete in that case.

gimlet said...

What is, should be, happening in the inner loop - the inner loop is combining getting ll with completing ll. Hence the mess.

Note: it is much simpler to get ll before considering continuations

How do you get the logical line without completing it??? The logical line is the line that has all line continuations combined (so the multiple physical lines have already been merged into a single logical line).

gimlet said...

 ll    line    action             correct action 
 ----------------------------------------------------- 
 ""      -1    exit inner         exit outer       -- unnecessary  

You're the one saying that the correct action is unnecessary here... I agree with that.

gimlet said...

 cont    -1    exit inner         error            -- ll is incomplete  

Like I said, I don't believe that this means the logical line is incomplete. There's no more physical lines left to complete it.

gimlet said...

 cont    ""    loop               error            -- comments should be separate 

This case is either a blank line, or a comment. We just strip off comments (as we should be doing), so I'm not convinced that this is an error.

Also, for a logical line, we might have comments dispersed throughout the many physical lines. We don't want to drop the rest of the logical line the moment we see the first comment.

gimlet said...
 -- rewrite the inner loop 
 
  while TRUE label "MAIN" do 
  
   object ll,   
    
   loop do 
     ll = getline(fh)    -- ** we are not defining getline here (see Part 2) 
     if not sequence(ll) then exit "MAIN" end if 
   until length(ll) != 0 
   end loop 
    
   while find(ll[$], line_conts) do  
     object ln = getline(fh)         
     if not sequence(ln) then goto "ERROR" end if 
     if not length(ln) then goto "ERROR" end if 
   end while 
    
  -- clean ll 
  -- process ll 
  
 end while 
  

I have not included functions for getline() and clean() as they are not particularly relevant.

getline() must be handling the comments and line continuation. I think that's relevant. The entire algorithm as a whole can't be judged until it is completely revealed.

gimlet said...
 has_comment = search:rmatch("--", ln)     -- note search from the right 
 if has_comment != 0 then  
   ln = text:trim(ln[1..has_comment-1])  
 else  
   ln = text:trim(ln)  
 end if  
 

Conceeded. We should be looking for the first --, not the last. A comment can have multiple of -- in it.

[quote gimlet]

 ll = search:match_replace(`",$"`, ll, "") -- assumes these sequences  
 ll = search:match_replace(`,$`, ll, "")   -- will not contain spaces 
   
 delim_pos = find('=', ll)  

Why does it matter if they have spaces or not? We just want to strip out the line continuation characters from the logical line. Then the delim_pos is looking for the equal sign - not a space.

Maybe we have a problem here of not signalling an error if a line continuation character is in the middle or beginning of a line, instead of at the end.

2 ignoring the possible error

gimlet said...

Where is else clause?

This silent passing over of errors is totally anti-user

 if delim_pos > 0 then  
   ... 
 end if 
 
 if length(data_key) > 0 then 
   ... 
 end if  
 

Conceded. Of course, I'd prefer a load_map() that had the option of going on and processing as many lines as it could, but some sort of error mechanism to warn the user about any invalid lines found makes sense too.

gimlet said...
 if conv_res[1] = stdget:GET_SUCCESS then 
   ... 
 end if 
 

I think this last one is not an error - it's just that for strings, the quotes are optional.

gimlet said...

If one record is misread then

The code skips it and goes to the next line. Basically, it conceptually filters out bad records. I can live with that.

gimlet said...

Assuming the line ends when there are no more continuations. Or that the line doesn't end until there are no more continuations.

are unjustified assumptions.

I disagree. By definition, a logical line is a physical line unless that physical line has a line continuation - in which case it is the combination of all physical lines until the line continuation stops. You say so yourself, in fact:

gimlet said...

A line may be laid out over several lines if each line except the last ends with a continuation character ( `,${` ).

Now, should having a line continuation at the end of a file be an error? I think not, for the same reason we decided to allow a list of constants to end with a comma - it's much easier to add a line to the bottom of the list in the future, lowers teh chance of svn/hg conflicts, etc.

gimlet said...

Lines contain sequences among other things and sequences can nest and one can have a sequence as a key or a value.

But these should all be on one logical line.

gimlet said...

One can easily miss a comma at line-end or before a comment.

Conceded. value()/get() reports this error to us, but we just ignore the entire record in that case.

gimlet said...

Why does he ignore the possibility of special characters `=,${` appearing in quotes?

Conceded. That seems like a bug.

gimlet said...

Why allow unquoted text?

Because this way there are no errors?

Also because it's easier to turn a keyvalue text file into a map this way?

gimlet said...

4 the true test of fidelity

If the text read in produces the same number of key-value pairs as are defined and when read out the resultant text defines the same key-value pairs as in the original then you have a successful read (of one text).

I agree, but I guess the issue is what should load_map() do on error? Should it abort the program, return failure and no valid data, or keep going and get as much valid data as possible? Of course it should report errors (which it sadly does not), but I think it should also try to handle erors gracefully.

new topic     » goto parent     » topic index » view thread      » older message » newer message

Search



Quick Links

User menu

Not signed in.

Misc Menu