1. Writing a language in Euphoria

I'm attempting to write a new language in Euphoria. I don't really know what I'm doing, and I've never tried this before, but I do have an AST at this point. Feel free to follow along if you'd like.

BzScript on GitHub

BzScript on GitHub (Ported to Phix and moved)

Ronald Weidner


2. Re: Writing a language in Euphoria

Building an AST from this:

{#n=((((#x+1)*2)-((3/(#y+4))+5)));}

was just a bit humbling.
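
For anyone following along, here is a minimal sketch of how a line like that can be turned into a tree. It is written in Python rather than Euphoria, it is not how BzScript does it, and the names `tokenize`, `parse_assignment`, `parse_expr`, `parse_term` and `parse_atom` are made up for illustration. It handles only `#name` variables, integers, `+ - * /`, parentheses and a single `=`, and it leaves out the surrounding `{ ... ;}`.

```python
import re

# Hypothetical token shapes: #name variables, integers, and single-char symbols.
TOKEN_RE = re.compile(r"\s*(#\w+|\d+|[-+*/()=])")

def tokenize(text):
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

def parse_atom(tokens, pos):
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        node, pos = parse_expr(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("missing ')'")
        return node, pos + 1
    if tok.isdigit():
        return int(tok), pos + 1
    if tok.startswith("#"):
        return tok, pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")

def parse_term(tokens, pos):
    # '*' and '/' bind tighter than '+' and '-'.
    left, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        right, pos = parse_atom(tokens, pos + 1)
        left = (op, left, right)
    return left, pos

def parse_expr(tokens, pos):
    left, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        left = (op, left, right)
    return left, pos

def parse_assignment(tokens):
    # Assumed shape: #name = expression
    if len(tokens) < 3 or not tokens[0].startswith("#") or tokens[1] != "=":
        raise SyntaxError("expected '#name = expression'")
    expr, pos = parse_expr(tokens, 2)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return ("=", tokens[0], expr)

SOURCE = "#n=((((#x+1)*2)-((3/(#y+4))+5)))"
tree = parse_assignment(tokenize(SOURCE))
print(tree)
```

Each operator becomes a node with a left and a right child, so the redundant parentheses disappear and only the nesting survives in the tree.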


3. Re: Writing a language in Euphoria

Do you have any plans for a debugger? I'm currently thinking about writing a "fake" debugger,
as in a complete mock-up to show what the world's best ever debugger might feel like to use,
without a care in the world at this stage whether any of it is practical. For instance, stepping
backwards through the code (aka time travel or omniscient debugging).

Some early thoughts on the latter are: don't go down the rr multi-gigabyte core dumps route,
instead just keep a simple short-ish history of what has actually been displayed and don't
let anything be modifiable - but also allow a "restart" button which quits the debugger and
gets back to where you are, this time without having run anything further on and allowing
what is displayed to be modified as usual. (Very much like a movie director: what you say
"now" has effect, but as soon as you rewind what you've filmed, it doesn't matter how loud
you bark at it, nothing changes. Plus you'd have the whole "take 2/from" thing.)
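
A rough sketch of that "short history, read-only when rewound, restart to make it live again" idea, in Python purely for illustration. The class `DisplayHistory` and its methods are hypothetical, and `restart` here only throws away the later frames; in a real debugger it would stand for quitting and re-running the program up to the chosen point.

```python
from collections import deque

class DisplayHistory:
    """Keeps only the last few displayed frames; nothing like a full core dump."""

    def __init__(self, max_frames=50):
        self._frames = deque(maxlen=max_frames)  # oldest frames fall off the front
        self._cursor = -1                        # index of the frame being shown

    @property
    def rewound(self):
        # True when we are looking at something older than the newest frame.
        return 0 <= self._cursor < len(self._frames) - 1

    def can_modify(self):
        # Rewound footage is read-only; only the live frame may be edited.
        return not self.rewound

    def record(self, line, variables):
        if self.rewound:
            raise RuntimeError("rewound: restart before running any further")
        self._frames.append((line, dict(variables)))
        self._cursor = len(self._frames) - 1

    def current(self):
        line, variables = self._frames[self._cursor]
        return line, dict(variables)

    def step_back(self):
        if self._cursor > 0:
            self._cursor -= 1
        return self.current()

    def restart(self):
        # Stand-in for "quit and re-run to here": drop everything after the cursor.
        while len(self._frames) > self._cursor + 1:
            self._frames.pop()
        return self.current()

history = DisplayHistory(max_frames=3)
for n in range(1, 5):
    history.record(line=n, variables={"x": n})
print(history.step_back())   # the frame shown just before the latest one
```

The fixed `maxlen` is what keeps the history cheap: it also means you can never step back past the oldest frame still held.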

It would never be possible to step back from the first point the debugger was "woken up",
though you might be able to restart with an earlier wake-up point. The latter is a line plus
a set of variables that have [relop] specific values.
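
One way to picture a wake-up point as "a line plus a set of variables that have [relop] specific values", again as a Python sketch with invented names (`WakeUpPoint`, `RELOPS`, `matches`), not anything from Phix or BzScript:

```python
import operator
from dataclasses import dataclass, field

# The relational operators a condition may use.
RELOPS = {
    "==": operator.eq, "!=": operator.ne,
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
}

@dataclass
class WakeUpPoint:
    line: int
    # Each condition is a (variable_name, relop, value) triple.
    conditions: list = field(default_factory=list)

    def matches(self, line, variables):
        if line != self.line:
            return False
        # Every condition must hold; a variable that does not exist never matches.
        return all(
            name in variables and RELOPS[op](variables[name], value)
            for name, op, value in self.conditions
        )

wake = WakeUpPoint(line=42, conditions=[("i", ">=", 10), ("name", "==", "me")])
print(wake.matches(42, {"i": 12, "name": "me"}))
print(wake.matches(42, {"i": 3, "name": "me"}))
```

Restarting "with an earlier wake-up point" would then just mean swapping in a different line or looser conditions before re-running.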

An ex.err file could contain an additional section to fire up the debugger at the point of
failure, and perhaps have a short record of the last few actions by the user so no more
"can't remember" or "no I didn't press that key" when they really did (which has sometimes
included me point blank lying to my own face).

I should admit I've thought very little about dealing with non-determinism, and I have no
plans for rolling back disk updates other than "restore these files on restart". The ex.err
side might, however, need to fake mouse/keyboard input, probably limited to say 100 such events.

I rather suspect that finding better ways to show/visualise the data you really need is more
a matter of encouraging additions to the program source code, in a more dynamic/restart-able way.


4. Re: Writing a language in Euphoria

A decent debugger and editor are top-level goals for this project. But, like I said, I've not done this before, so I'm sort of winging it. This enum is how I visualize the main data container, called TAstToken (aka indexes into a sequence). I'm hoping it's enough info to build a proper debugger and to reference the original code.

By building an AST, I hope I can walk the tree one node or leaf at a time and show the results of each command in a textbox or simulated console. Stepping should be a matter of moving to the next node or next leaf in the tree. I have 2 hopes at this point: 1 - I hope my plan works, and 2 - I hope that I came close to answering your question. :)

public enum
    __TYPE__,                   -- must be first value in enum
    _kind,                      -- a constant that marks the token as resolvable,
                                -- literal, or action. This turned out to be less
                                -- valuable than I hoped. Targeted for deprecation.
    _name,                      -- the name of the data or symbol: @x, [,
                                -- or __BZ__STRING__ (a literal string), for example.
    _line_num,                  -- line number in the original source code
    _col_num,                   -- column number in the original source code
    _value,                     -- for variables, this is the value. For non-variables
                                -- I may or may not use the field for
                                -- metadata about the token.
    _factory_request_str,       -- the interpreter dispatches actions to something I call
                                -- Ants. This is data about which Ant to dispatch.
    _child_stream_location,     -- metadata about where the children are located if the tokens
                                -- were arranged left to right in a row. For example, consider
                                -- 5 + 6: + will have a _child_stream_location of
                                -- "left_and_right" because both 5 and 6 are needed to give
                                -- + meaning.
    _child_count,               -- every token can have children. This is how many it has.
    _ast_tokens,                -- the token's children
    __MYSIZE__                  -- must be last value in enum
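
To make the shape of that container concrete, here is a small Python mirror of it. It is only an illustration: `AstToken` and `dump` are invented names, `_kind` is left out since it is slated for deprecation, and giving `*` a `child_stream_location` of "left_and_right" is an assumption by analogy with the `+` example in the comments. The sample data is the `#x*5` expression from the second line of the example program.

```python
from dataclasses import dataclass, field

@dataclass
class AstToken:
    name: str
    line_num: int = 0
    col_num: int = 0
    value: str = ""
    factory_request_str: str = ""
    child_stream_location: str = ""
    children: list = field(default_factory=list)

    @property
    def child_count(self):
        # Derived from the children instead of being stored separately.
        return len(self.children)

def dump(token, depth=0):
    """Return one text line per token, indented three spaces per tree level."""
    lines = [
        f"{'   ' * depth}name: {token.name} line: {token.line_num} "
        f"col: {token.col_num} value: {token.value} "
        f"factory_request_str: {token.factory_request_str} "
        f"child_count: {token.child_count}"
    ]
    for child in token.children:
        lines.extend(dump(child, depth + 1))
    return lines

# '#x*5': the operator sits between its two children in the token stream.
times = AstToken(
    "*", 2, 8,
    factory_request_str="multiply",
    child_stream_location="left_and_right",  # assumed, by analogy with '+'
    children=[
        AstToken("#x", 2, 6, factory_request_str="var_number"),
        AstToken("__BZ__NUMBER__", 2, 9, value="5", factory_request_str="literal_num"),
    ],
)
print("\n".join(dump(times)))
```

Because every token carries its own line and column, a debugger walking this structure can always point back at the original source text.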

Here is what the tree looks like (as flat as I can make it; most of the data is represented).

Input:

fun do_stuff(#x, $r, @t via TAstToken(@_arg[3])){ 
#x = #x*5; 
print($r); 
printf(`%s`, @t[#_factory_request_str]); 
} 
let @token; 
do_stuff(6, `me`, @token); 

AST Output:

 name: __ast_token_root__ line: 0 col: 0 value:  factory_request_str:  child_count: 4 
    name: fun line: 1 col: 1 value: __KEYWORD__ factory_request_str: fun child_count: 1 
       name: do_stuff line: 1 col: 5 value: __FUN_DEF__ factory_request_str: fun_def child_count: 1 
          name: ( line: 1 col: 13 value: __FUN_DEF_GROUP__ factory_request_str: node_open child_count: 3 
             name: #x line: 1 col: 14 value:  factory_request_str: var_number child_count: 0 
             name: $r line: 1 col: 18 value:  factory_request_str: var_string child_count: 0 
             name: via line: 1 col: 25 value: __KEYWORD__ factory_request_str: via child_count: 2 
                name: @t line: 1 col: 22 value:  factory_request_str: var_sequence child_count: 0 
                name: TAstToken line: 1 col: 29 value: __FUN_CALL__ factory_request_str: fun_call child_count: 1 
                   name: ( line: 1 col: 38 value: __FUN_CALL_GROUP__ factory_request_str: node_open child_count: 1 
                      name: @_arg line: 1 col: 39 value:  factory_request_str: var_sequence child_count: 1 
                         name: [ line: 1 col: 44 value:  factory_request_str: node_open child_count: 1 
                            name: __BZ__NUMBER__ line: 1 col: 45 value: 3 factory_request_str: literal_num child_count: 0 
    name: { line: 1 col: 49 value:  factory_request_str: block_open child_count: 3 
       name: = line: 2 col: 4 value:  factory_request_str: assignment child_count: 2 
          name: #x line: 2 col: 1 value:  factory_request_str: var_number child_count: 0 
          name: * line: 2 col: 8 value:  factory_request_str: multiply child_count: 2 
             name: #x line: 2 col: 6 value:  factory_request_str: var_number child_count: 0 
             name: __BZ__NUMBER__ line: 2 col: 9 value: 5 factory_request_str: literal_num child_count: 0 
       name: print line: 3 col: 1 value: __FUN_CALL__ factory_request_str: fun_call child_count: 1 
          name: ( line: 3 col: 6 value: __FUN_CALL_GROUP__ factory_request_str: node_open child_count: 1 
             name: $r line: 3 col: 7 value:  factory_request_str: var_string child_count: 0 
       name: printf line: 4 col: 1 value: __FUN_CALL__ factory_request_str: fun_call child_count: 1 
          name: ( line: 4 col: 7 value: __FUN_CALL_GROUP__ factory_request_str: node_open child_count: 2 
             name: __BZ__STRING__ line: 4 col: 9 value: %s factory_request_str: literal_str child_count: 0 
             name: @t line: 4 col: 14 value:  factory_request_str: var_sequence child_count: 1 
                name: [ line: 4 col: 16 value:  factory_request_str: node_open child_count: 1 
                   name: #_factory_request_str line: 4 col: 17 value:  factory_request_str: var_number child_count: 0 
    name: let line: 6 col: 1 value: __KEYWORD__ factory_request_str: let child_count: 1 
       name: @token line: 6 col: 5 value:  factory_request_str: var_sequence child_count: 0 
    name: do_stuff line: 7 col: 1 value: __FUN_CALL__ factory_request_str: fun_call child_count: 1 
       name: ( line: 7 col: 9 value: __FUN_CALL_GROUP__ factory_request_str: node_open child_count: 3 
          name: __BZ__NUMBER__ line: 7 col: 10 value: 6 factory_request_str: literal_num child_count: 0 
          name: __BZ__STRING__ line: 7 col: 14 value: me factory_request_str: literal_str child_count: 0 
          name: @token line: 7 col: 19 value:  factory_request_str: var_sequence child_count: 0 
Program finished Successfully.  (Or at least it didn't crash.  :) ) 
 

5. Re: Writing a language in Euphoria

As for using a debugger, I've thought a little about that too.

  • right arrow - move to the next command in the tree.
  • down arrow (step into) - move to the next node. This assumes the interpreter keeps a stack of blocks for jumping. If so, move to the node on the top of the stack. Said a different way: stop when new stack len == current stack len + 1.
  • up arrow (step out) - move to the node one level deeper in the stack. Said a different way: stop when new stack len == current stack len - 1.
  • left arrow - I agree with you, we should be able to go backwards. As long as we don't mind that data may already be mutated, allow this to happen. Developer beware.
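
The stack-length rules in the list above can be tried out on paper with something like the following Python sketch. It is a toy under stated assumptions: the tree is plain `(name, children)` tuples, the "stack len" is taken to be the node's depth in a pre-order walk, and `trace` and `Stepper` are invented names. It follows the rules literally, so stepping into a leaf will run on until some later node happens to be one level deeper.

```python
def trace(node, depth=0):
    """Pre-order walk yielding (stack_len, name) in visiting order."""
    name, children = node
    yield depth, name
    for child in children:
        yield from trace(child, depth + 1)

class Stepper:
    def __init__(self, root):
        self.events = list(trace(root))
        self.pos = 0

    @property
    def name(self):
        return self.events[self.pos][1]

    @property
    def stack_len(self):
        return self.events[self.pos][0]

    def _run_until(self, stop):
        start = self.stack_len
        for i in range(self.pos + 1, len(self.events)):
            if stop(self.events[i][0], start):
                self.pos = i
                return True
        return False  # nothing matched: stay where we are

    def step_over(self):   # right arrow: next command, skipping anything deeper
        return self._run_until(lambda new, cur: new <= cur)

    def step_into(self):   # down arrow: new stack len == current stack len + 1
        return self._run_until(lambda new, cur: new == cur + 1)

    def step_out(self):    # up arrow: new stack len == current stack len - 1
        return self._run_until(lambda new, cur: new == cur - 1)

    def step_back(self):   # left arrow: position only, mutated data is not undone
        if self.pos == 0:
            return False
        self.pos -= 1
        return True

# The body of do_stuff from the example program, trimmed down.
block = ("{", [
    ("=", [("#x", []), ("*", [("#x", []), ("5", [])])]),
    ("print", [("(", [("$r", [])])]),
    ("printf", [("(", [])]),
])

s = Stepper(block)
s.step_into()
print(s.name)   # first command inside the block
s.step_over()
print(s.name)   # next command at the same level
```

Note that `step_back` only moves the cursor, which matches the "developer beware" caveat: any values changed by the commands already run stay changed.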

Not sure if any of this works yet, but those are my initial thoughts.
