1. Rosetta code: help needed
- Posted by ed_davis Oct 26, 2016
- 1827 views
I've created some (fun!) tasks over at Rosettacode:
- http://rosettacode.org/wiki/Compiler/lexical_analyzer
- http://rosettacode.org/wiki/Compiler/syntax_analyzer
- http://rosettacode.org/wiki/Compiler/code_generator
- http://rosettacode.org/wiki/Compiler/virtual_machine_interpreter
- http://rosettacode.org/wiki/Compiler/AST_interpreter
I've put up a Euphoria solution for the lexical analyzer task (feel free to improve it!).
It would be great if folks here could write Euphoria (or other) solutions for the other four tasks.
Additionally, if any of the specifications are ambiguous and/or not detailed enough, let me know and I'll improve them.
Thanks!
2. Re: Rosetta code: help needed
- Posted by ghaberek (admin) Oct 26, 2016
- 1788 views
Is somebody planning to write his own compiler?
-Greg
3. Re: Rosetta code: help needed
- Posted by ed_davis Oct 26, 2016
- 1780 views
> Is somebody planning to write his own compiler?
> -Greg
Already have. The RosettaCode tasks are necessarily abbreviated versions of that effort. I thought it would be interesting to see the approach in other languages. To get the ball rolling, I posted C and Python solutions for all the tasks. And Euphoria is an ideal language for doing data transformations, which is basically what a compiler boils down to (transform the source into tokens, transform the tokens into a parse tree, transform the parse tree into assembly code).
4. Re: Rosetta code: help needed
- Posted by ghaberek (admin) Oct 26, 2016
- 1817 views
> Already have. The RosettaCode tasks are necessarily abbreviated versions of that effort. I thought it would be interesting to see the approach in other languages. To get the ball rolling, I posted C and Python solutions for all the tasks. And Euphoria is an ideal language for doing data transformations, which is basically what a compiler boils down to (transform the source into tokens, transform the tokens into a parse tree, transform the parse tree into assembly code).
As the late Calvin Candie once said, you had my curiosity, but now you have my attention. Care to elaborate on the purpose of this compiler of yours?
-Greg
5. Re: Rosetta code: help needed
- Posted by petelomax Oct 28, 2016
- 1727 views
I found a bug: change the middle "/* character literal */" to "/* character literal **/" and it will treat "/* character literal **/ '\\'\n /* character literal */" as all one comment.
6. Re: Rosetta code: help needed
- Posted by ed_davis Oct 28, 2016
- 1754 views
>> Already have. The RosettaCode tasks are necessarily abbreviated versions of that effort. I thought it would be interesting to see the approach in other languages. To get the ball rolling, I posted C and Python solutions for all the tasks. And Euphoria is an ideal language for doing data transformations, which is basically what a compiler boils down to (transform the source into tokens, transform the tokens into a parse tree, transform the parse tree into assembly code).
> As the late Calvin Candie once said, you had my curiosity, but now you have my attention. Care to elaborate on the purpose of this compiler of yours?
> -Greg
It was nothing special. I wrote a subset Pascal compiler that could self-compile. After that, I rewrote the Pascal compiler in C and kept changing the language until I had a new language (sort of a cross between Pascal, Basic and C) that could compile itself. Again, it was nothing special: very sparse and primitive data-type-wise. And they all compiled to byte code (as opposed to assembly code for a real machine). Compiling to a stack machine is pretty easy compared to compiling to a real machine with registers.
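To illustrate why a stack machine is an easy target, here is a minimal Python sketch. It is not taken from the compilers described above: the tuple-based tree, the instruction names (`push`, `add`, `sub`, `mul`) and the helpers `gen` and `run` are all hypothetical. Code generation is just a post-order walk (operands first, then the operator), so no register allocation is needed.

```python
# Hypothetical sketch: compile an expression tree to stack-machine code.
# A tree node is either an int or a tuple (op, left, right).

def gen(node, code):
    """Append stack code for node to code, using a post-order walk."""
    if isinstance(node, int):
        code.append(("push", node))
    else:
        op, left, right = node
        gen(left, code)        # operands first ...
        gen(right, code)
        code.append((op,))     # ... then the operator
    return code

def run(code):
    """Tiny interpreter for the instructions emitted by gen."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
            continue
        b = stack.pop()
        a = stack.pop()
        if instr[0] == "add":
            stack.append(a + b)
        elif instr[0] == "sub":
            stack.append(a - b)
        elif instr[0] == "mul":
            stack.append(a * b)
        else:
            raise ValueError("unknown instruction: %s" % instr[0])
    return stack.pop()

# (2 + 3) * 4
tree = ("mul", ("add", 2, 3), 4)
program = gen(tree, [])
result = run(program)
```

Targeting a register machine would additionally require deciding where each intermediate value lives, which this walk never has to think about.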
7. Re: Rosetta code: help needed
- Posted by ed_davis Oct 28, 2016
- 1692 views
> I found a bug: change the middle "/* character literal */" to "/* character literal **/" and it will treat "/* character literal **/ '\\'\n /* character literal */" as all one comment.
Where? Which task, and which solution?
Thanks for taking a look!
8. Re: Rosetta code: help needed
- Posted by petelomax Oct 28, 2016
- 1688 views
I also think \n handling in strings is suspect and could do with clarifying/extra tests.
C# has code in the lexer, Python and zkl have a few .replace("\\n", "\n") in the vm|interpreter.
Other solutions seem to be wholly unaware of their existence.
9. Re: Rosetta code: help needed
- Posted by petelomax Oct 28, 2016
- 1694 views
> Where? Which task, and which solution?
The lexical analyser, test case 3, Euphoria, C, and Python.
The problem is with
if next_ch() = '*' and next_ch() = '/' then
I solved it (for Euphoria/Phix) with
    -- comment found
    the_ch = next_ch()
    while true do
    --DEV I suspect this will break on "**/": [YEP, IT DOES]
    --  if next_ch() = '*' and next_ch() = '/' then
        if the_ch = '*' then
            if next_ch() = '/' then
                the_ch = next_ch()
                return get_tok()
            end if
        elsif the_ch = EOF then
            error("%d %d EOF in comment", {tok_line, tok_col})
        else
            the_ch = next_ch()
        end if
    end while
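For readers following along in another language, the same bug and one possible fix can be sketched in Python. This is an illustration only, not the code posted at RosettaCode: the function names are hypothetical, and the sketch works on string indices instead of a `next_ch()` reader. Each function takes the index just past the opening "/*" and returns the index just past the closing "*/".

```python
# Hypothetical sketch of the "**/" bug and one way to avoid it.

def skip_comment_buggy(src, i):
    # Mirrors: if next_ch() = '*' and next_ch() = '/' then
    # Both tests consume a character, so in "**/" the second '*' is
    # eaten by the '/' test and the closing "*/" is never recognised.
    while i < len(src):
        ch = src[i]
        i += 1
        if ch == '*' and i < len(src):
            ch = src[i]
            i += 1
            if ch == '/':
                return i
    raise SyntaxError("EOF in comment")

def skip_comment_fixed(src, i):
    while i < len(src):
        if src[i] == '*':
            i += 1
            if i < len(src) and src[i] == '/':
                return i + 1
            # Not a '/': loop again and re-examine src[i] without
            # consuming it, so a run of '*' is handled correctly.
        else:
            i += 1
    raise SyntaxError("EOF in comment")

src = "/* a **/ 'x' /* b */ z"
after_fixed = src[skip_comment_fixed(src, 2):]
after_buggy = src[skip_comment_buggy(src, 2):]
```

With this input, the fixed version stops right after "**/", while the buggy version runs on to the end of the second comment and swallows the character literal in between.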
10. Re: Rosetta code: help needed
- Posted by ed_davis Oct 28, 2016
- 1672 views
> I also think \n handling in strings is suspect and could do with clarifying/extra tests.
> C# has code in the lexer, Python and zkl have a few .replace("\\n", "\n") in the vm|interpreter.
> Other solutions seem to be wholly unaware of their existence.
I can't find \n problems with the C, Python, FreeBasic or Euphoria versions. That doesn't mean there aren't problems, though, just none in what I tested.
I added a test case:
    /*** test printing, embedded \n and comments with lots of '*' ***/
    print(42);
    print("\nHello World\nGood Bye\nok");
Which when run through the above lexers, produces:
    2      1 Keyword_print
    2      6 LeftParen
    2      7 Integer          42
    2      9 RightParen
    2     10 Semicolon
    3      1 Keyword_print
    3      6 LeftParen
    3      7 String           "\nHello World\nGood Bye\nok"
    3     36 RightParen
    3     37 Semicolon
    4      1 End_of_input
And if I complete the chain, using the C and Python versions, via:
    lex ..\print.t | parse | gen | vm
I get:
    42
    Hello World
    Good Bye
    ok
Which is what I expected.
Thanks for taking a look at the tasks and solutions!
11. Re: Rosetta code: help needed
- Posted by ed_davis Oct 28, 2016
- 1669 views
>> Where? Which task, and which solution?
> The lexical analyser, test case 3, Euphoria, C, and Python.
> The problem is with
> if next_ch() = '*' and next_ch() = '/' then
Yes, definitely a problem! Thanks so much for finding this. Very embarrassing. Oh well. Programming can be very humbling.
I have attempted to fix the C, Python, FreeBasic and Euphoria versions, and have updated them at RosettaCode. At least it works with the new test case I added.
Thanks again for taking a look, and thanks for the feedback!
12. Re: Rosetta code: help needed
- Posted by petelomax Oct 28, 2016
- 1653 views
- Last edited Oct 29, 2016
>> I also think \n handling in strings is suspect and could do with clarifying/extra tests.
>> C# has code in the lexer, Python and zkl have a few .replace("\\n", "\n") in the vm|interpreter.
>> Other solutions seem to be wholly unaware of their existence.
> ... Which is what I expected.
I was fully expecting Python to appear to work, because of the .replace().
I completely missed the similar code in vm/C (translate) and AST/C (fetch_string_offset).
Let me rephrase:
'\n' is lex'd to 10
Should "\n" be lex'd to {'\\','n'} (as it is now) or {10}?
To me it does not feel right to perform such basic string substitutions at run-time, when they could/should be done at compile-time.
Granted, the "proper" lex will create string constants that need slightly more effort to dump (and slightly less to execute).
[edit: I just checked and the C# Token.ToString does indeed have a Value.Replace("\n", "\\n") which other solutions do not.]
[edit2: Having said all that, and slept on it, I can now more clearly see that either way is perfectly valid - it might help to say that.]
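The two options can be compared side by side in a short Python sketch. Everything here is hypothetical (the helper names `unescape` and `escape` are not from any of the posted solutions), and decoding `\\` inside strings is an assumption beyond what the task currently allows. Option A keeps the raw characters in the token and decodes when the program runs; option B decodes once in the lexer and re-escapes only when dumping tokens.

```python
# Hypothetical sketch of the two places "\n" can be decoded.

def unescape(raw):
    """Decode the escape sequences backslash-n and backslash-backslash."""
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw):
            nxt = raw[i + 1]
            if nxt == "n":
                out.append("\n")
            elif nxt == "\\":
                out.append("\\")
            else:
                raise ValueError("unknown escape sequence \\%s" % nxt)
            i += 2
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)

def escape(text):
    """Inverse of unescape, for showing a decoded string in a token dump."""
    return text.replace("\\", "\\\\").replace("\n", "\\n")

# The characters as typed between the quotes in the test program.
source_text = r"\nHello World\nGood Bye\nok"

# Option A: the lexer keeps the raw characters; the VM decodes at run time.
token_a = source_text
printed_a = unescape(token_a)

# Option B: the lexer decodes once; the token dump re-escapes for display.
token_b = unescape(source_text)
dumped_b = escape(token_b)
printed_b = token_b
```

Both routes print the same text and can produce the same token listing; they differ only in which stage does the work.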
I will also say that supporting '\\' but prohibiting "\\" is plain wrong; it should at least be optional, otherwise I cannot factor out common code!
Regards, Pete
13. Re: Rosetta code: help needed
- Posted by petelomax Nov 15, 2016
- 1585 views
I have just posted Phix entries for all five tasks.