1. Rosetta code: help needed

I've created some (fun!) tasks over at Rosettacode:

I've put up a Euphoria solution for the lexical analyzer task (feel free to improve it!).

It would be great if folks here could write Euphoria (or other) solutions for the other four tasks.

Additionally, if any of the specifications are ambiguous and/or not detailed enough, let me know and I'll improve them.

Thanks!

2. Re: Rosetta code: help needed

Is somebody planning to write his own compiler?

-Greg

3. Re: Rosetta code: help needed

ghaberek said...

Is somebody planning to write his own compiler?

-Greg

Already have. The RosettaCode tasks are necessarily abbreviated versions of that effort. I thought it would be interesting to see the approach in other languages. To get the ball rolling, I posted C and Python solutions for all the tasks. And Euphoria is an ideal language for doing data transformations, which is basically what a compiler boils down to (transform the source into tokens, transform the tokens into a parse tree, transform the parse tree into assembly code).

4. Re: Rosetta code: help needed

ed_davis said...

Already have. The RosettaCode tasks are necessarily abbreviated versions of that effort. I thought it would be interesting to see the approach in other languages. To get the ball rolling, I posted C and Python solutions for all the tasks. And Euphoria is an ideal language for doing data transformations, which is basically what a compiler boils down to (transform the source into tokens, transform the tokens into a parse tree, transform the parse tree into assembly code).

As the late Calvin Candie once said, you had my curiosity, but now you have my attention. Care to elaborate on the purpose of this compiler of yours?

-Greg

5. Re: Rosetta code: help needed

I found a bug: change the middle "/* character literal */" to "/* character literal **/" and it will treat "/* character literal **/ '\\'\n /* character literal */" as all one comment.

6. Re: Rosetta code: help needed

ghaberek said...
ed_davis said...

Already have. The RosettaCode tasks are necessarily abbreviated versions of that effort. I thought it would be interesting to see the approach in other languages. To get the ball rolling, I posted C and Python solutions for all the tasks. And Euphoria is an ideal language for doing data transformations, which is basically what a compiler boils down to (transform the source into tokens, transform the tokens into a parse tree, transform the parse tree into assembly code).

As the late Calvin Candie once said, you had my curiosity, but now you have my attention. Care to elaborate on the purpose of this compiler of yours?

-Greg

It was nothing special. I wrote a subset Pascal compiler that could self-compile. After that, I rewrote the Pascal compiler in C and kept changing the language until I had a new language (sort of a cross between Pascal, Basic and C) that could compile itself. Again, it was nothing special: very sparse and primitive in terms of data types. And they all compiled to byte code (as opposed to assembly code for a real machine). Compiling to a stack machine is pretty easy compared to compiling to a real machine with registers.
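
To illustrate why a stack machine is an easy target, here is a minimal Python sketch. The tuple-based tree and the instruction names (push, add, sub, mul) are assumptions made up for this example; they are not the formats used by the RosettaCode tasks. Code generation is just a post-order walk of the tree, and no register allocation is needed because operands always live on the stack.

```python
# Hypothetical AST: ("num", value) or (op, left, right) with op in add/sub/mul.
def gen(node, code):
    """Post-order walk: emit code for both operands, then the operator."""
    if node[0] == "num":
        code.append(("push", node[1]))
    else:
        op, left, right = node
        gen(left, code)
        gen(right, code)
        code.append((op,))
    return code


def run(code):
    """Tiny stack-machine interpreter for the instructions emitted by gen()."""
    ops = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mul": lambda a, b: a * b,
    }
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            right = stack.pop()   # operands come off in reverse order
            left = stack.pop()
            stack.append(ops[instr[0]](left, right))
    return stack.pop()


# 2 + 3 * 4
tree = ("add", ("num", 2), ("mul", ("num", 3), ("num", 4)))
program = gen(tree, [])
print(program)
print(run(program))  # 14
```

A register machine would need the generator to decide where each intermediate value lives; here the stack makes that decision implicitly.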

7. Re: Rosetta code: help needed

petelomax said...

I found a bug: change the middle "/* character literal */" to "/* character literal **/" and it will treat "/* character literal **/ '\\'\n /* character literal */" as all one comment.

Where? Which task, and which solution?

Thanks for taking a look!

8. Re: Rosetta code: help needed

I also think \n handling in strings is suspect and could do with clarifying/extra tests.
C# has code in the lexer, Python and zkl have a few .replace("\\n", "\n") in the vm|interpreter.
Other solutions seem to be wholly unaware of their existence.

9. Re: Rosetta code: help needed

ed_davis said...

Where? Which task, and which solution?

The lexical analyser, test case 3, Euphoria, C, and Python.

The problem is with

if next_ch() = '*' and next_ch() = '/' then 

I solved it (for Euphoria/Phix) with

    -- comment found 
    the_ch = next_ch() 
    while true do 
--DEV I suspect this will break on "**/": [YEP, IT DOES] 
--      if next_ch() = '*' and next_ch() = '/' then 
        if the_ch = '*' then 
            if next_ch() = '/' then 
                the_ch = next_ch() 
                return get_tok() 
            end if 
        elsif the_ch = EOF then 
            error("%d %d EOF in comment", {tok_line, tok_col}) 
        else 
            the_ch = next_ch() 
        end if 
    end while 
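
For readers following along in Python, the same issue can be reproduced with the sketch below. It is not the RosettaCode Python solution; the function names and the index-based interface are assumptions for illustration. The flawed version consumes a second character whenever it sees a '*', so in "**/" the second '*' is swallowed and the following '/' is never paired with a '*'. The corrected version re-examines the character after a '*' instead of discarding it.

```python
def skip_comment_buggy(src, pos):
    """pos is the index just past the opening '/*'.
    Mirrors: if next_ch() = '*' and next_ch() = '/' then ..."""
    i = pos
    while i < len(src):
        first = src[i]
        i += 1
        if first == "*" and i < len(src):
            second = src[i]
            i += 1              # consumed even when it is not '/'
            if second == "/":
                return i
    raise ValueError("EOF in comment")


def skip_comment_fixed(src, pos):
    """Same interface, but a character following '*' is looked at again
    on the next iteration, so '**/' closes the comment."""
    i = pos
    while i < len(src):
        if src[i] == "*":
            i += 1
            if i < len(src) and src[i] == "/":
                return i + 1
            # not '/': leave i here so this character is tested again
        else:
            i += 1
    raise ValueError("EOF in comment")


src = "/* a **/ x /* b */ y"
print(repr(src[skip_comment_fixed(src, 2):]))   # ' x /* b */ y'
print(repr(src[skip_comment_buggy(src, 2):]))   # ' y'
```

With the flawed version, everything up to the end of the second comment is treated as one comment, which matches the behaviour reported for test case 3.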

10. Re: Rosetta code: help needed

petelomax said...

I also think \n handling in strings is suspect and could do with clarifying/extra tests.
C# has code in the lexer, Python and zkl have a few .replace("\\n", "\n") in the vm|interpreter.
Other solutions seem to be wholly unaware of their existence.

I can't find \n problems with the C, Python, FreeBasic or Euphoria versions. That doesn't mean there aren't problems, though, just none in what I tested.

I added a test case:

/*** test printing, embedded \n and comments with lots of '*' ***/ 
print(42); 
print("\nHello World\nGood Bye\nok"); 

Which when run through the above lexers, produces:

    2      1   Keyword_print 
    2      6   LeftParen 
    2      7   Integer             42 
    2      9   RightParen 
    2     10   Semicolon 
    3      1   Keyword_print 
    3      6   LeftParen 
    3      7   String          "\nHello World\nGood Bye\nok" 
    3     36   RightParen 
    3     37   Semicolon 
    4      1   End_of_input 

And if I complete the chain, using the C or Python versions, via:

lex ..\print.t | parse | gen | vm 

42 
Hello World 
Good Bye 
ok 

Which is what I expected.

Thanks for taking a look at the tasks and solutions!

11. Re: Rosetta code: help needed

petelomax said...
ed_davis said...

Where? Which task, and which solution?

The lexical analyser, test case 3, Euphoria, C, and Python.

The problem is with

if next_ch() = '*' and next_ch() = '/' then 

Yes, definitely a problem! Thanks so much for finding this. Very embarrassing. Oh well. Programming can be very humbling.

I have attempted to fix the C, Python, FreeBasic and Euphoria versions, and have updated them at RosettaCode. At least it works with the new test case I added.

Thanks again for taking a look, and thanks for the feedback!

12. Re: Rosetta code: help needed

ed_davis said...
petelomax said...

I also think \n handling in strings is suspect and could do with clarifying/extra tests.
C# has code in the lexer, Python and zkl have a few .replace("\\n", "\n") in the vm|interpreter.
Other solutions seem to be wholly unaware of their existence.

... Which is what I expected.

I was fully expecting Python to appear to work, because of the .replace().
I completely missed the similar code in vm/C (translate) and AST/C (fetch_string_offset).

Let me rephrase:
'\n' is lex'd to 10
Should "\n" be lex'd to {'\\','n'} (as it is now) or {10}?
To me it does not feel right to perform such basic string substitutions at run-time, when they could/should be done at compile-time.
Granted, the "proper" lex will create string constants that need slightly more effort to dump (and slightly less to execute).
[edit: I just checked and the C# Token.ToString does indeed have a Value.Replace("\n", "\\n") which other solutions do not.]
[edit2: Having said all that, and slept on it, I can now more clearly see that either way is perfectly valid - it might help to say that.]

I will also say that supporting '\\' but prohibiting "\\" is plain wrong; it should at least be optional, otherwise I cannot factor out common code!
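
As a rough Python sketch of the "translate at lex time" option, the helper below converts escapes in a single left-to-right pass, and a matching helper re-escapes the text when tokens are dumped. The names ESCAPES, unescape and escape are hypothetical, and accepting a doubled backslash inside strings is an assumption that goes beyond what the task currently specifies. The same table could serve both character and string literals, which is the kind of common code mentioned above.

```python
# Hypothetical escape table shared by character and string literals.
ESCAPES = {"n": "\n", "\\": "\\"}


def unescape(raw):
    """Translate backslash escapes in the text found between the quotes."""
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\":
            if i + 1 >= len(raw) or raw[i + 1] not in ESCAPES:
                raise ValueError("unknown escape sequence")
            out.append(ESCAPES[raw[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def escape(text):
    """Inverse of unescape, for printing translated strings in token dumps."""
    return text.replace("\\", "\\\\").replace("\n", "\\n")


raw = "\\nHello World\\nok"          # as typed in the source: \nHello World\nok
value = unescape(raw)
print(value)
print(escape(value) == raw)          # True
```

One reason for the single pass rather than a chained replace: on the three characters backslash, backslash, n, a plain raw.replace("\\n", "\n") would turn the last two characters into a newline, whereas the pass above yields a backslash followed by n.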

Regards, Pete

13. Re: Rosetta code: help needed

I have just posted Phix entries for all five tasks
