Re: Contest #2... Example Programs
- Posted by PeteE Dec 14, 2010
- 2522 views
I do the label fixup very differently. each time my assembler hit a forward reference to a label it reserve a fixed number of code instructions. Derek and Matt assembler use addition of power of 2, mine use power of tens. I reserve a register to hold number 10 and multiply the targer register with that number. I grow the address by a serie of "addi Rt n" and "mult Rt Rm". It doesn't need a second pass to fix the labels. When a label is met in the code the assembler immediatly lookup the pending fix and resolve them. There is 2 limitations to my system: to space allocated is always the same so it is oversized for small address. Secondly It limit the program address space to 0 - 9999. I could increase it but it would means more space lost.
Very interesting, Jacques. I took another different approach to resolving labels in my (unreleased) cpu assembler. Instead of powers of 2 or 10, I wrote a general purpose function to generate instructions to load an arbitrary integer into a register using add and mul operations. To generate a jump offset, it loads the address of the label into the register using that function. For example, to load 999 into a register, the instructions emitted are:
289 r8=9 484 r8*=4 (36) 381 r8+=1 (37) 483 r8*=3 (111) 483 r8*=3 (333) 483 r8*=3 (999)
There's an obvious optimization missed there: it should use a *=9 instead of two *=3. There's also a problem when there's a backward reference and forward reference that overlap: the size of the label loads may shift the label address up and down, and the assembler doesn't terminate until the label addresses stop fluctuating. One solution is to manually put nops into the code in random places to try to get it to converge, but requires luck. I ended up making the assembler insert nops for me to keep a label address fixed when it takes too many tries to get them to settle.
It was rainy here too in Oregon this past weekend. I initially wasn't going to write my own assembler for a cpu that won't be used after a week, but it turns out I couldn't resist. I'll post the code later tonight.
Edit: and here it is. I also solved that number optimization bug.
-- User: PeteE -- Contest 2 virtual machine assembler -- Usage: vmasm.ex prog.src > prog.cpu include std/io.e include std/text.e include std/cmdline.e include std/convert.e include std/map.e as map include std/regex.e as re re:regex set_val = re:new(`^r([0-9])=(-?[0-9]+)`), set_reg = re:new(`^r([0-9])=r([0-9])`), add = re:new(`^r([0-9])\+=([0-9])`), add_reg = re:new(`^r([0-9])\+=r([0-9])`), mul = re:new(`^r([0-9])\*=([0-9])`), mul_reg = re:new(`^r([0-9])\*=r([0-9])`), load = re:new(`^r([0-9])=\[r([0-9])\]`), store = re:new(`^\[r([0-9])\]=r([0-9])`), set_lbl = re:new(`^r([0-9])=@([A-Za-z0-9_]+)`), if_reg = re:new(`^if r([0-9]) ([A-Za-z0-9_]+)`), a_label = re:new(`^([A-Za-z0-9_]+):`), halt = re:new(`^halt`), nop = re:new(`^nop`) map:map nums = map:new() -- memoized num() -- return a sequence of instructions which load the value n into a register r function num(integer r, integer n, integer limit) sequence result, tmp if n < 0 then result = {sprintf("2%s0 r%s=0",{r,r}), sprintf("8%s%s r%s=[r%s] (-1)",{r,r,r,r})} if n != -1 then result &= num(r, -n, 0) result[3][1] = '4' -- convert first set op to a mul result[3] = insert(result[3], '*', 7) end if return result end if result = map:get(nums, n, {}) if length(result) then -- we've seen n before for i = 1 to length(result) do result[i][2] = r -- rewrite the dest register result[i][6] = r end for return result elsif n >= 0 and n <= 9 then result = {sprintf("2%s%d r%s=%d",{r,n,r,n})} elsif limit = 0 or limit > 2 then if limit != 0 then limit -= 1 end if for i = 2 to 9 do -- factors if i*floor(n/i) = n then tmp = num(r, floor(n/i), limit) if length(tmp) then tmp &= {sprintf("4%s%d r%s*=%d (%d)", {r,i,r,i,n})} if limit = 0 or length(tmp) < limit then result = tmp limit = length(tmp) end if end if end if end for for i = 1 to 9 do -- sums tmp = num(r, n-i, limit) if length(tmp) then tmp &= {sprintf("3%s%d r%s+=%d (%d)", {r,i,r,i,n})} if limit = 0 or length(tmp) < limit then result = tmp limit = length(tmp) end if end if end for end if map:put(nums, n, result) -- memoize this result return result end function function assemble(sequence prog) sequence s object ref integer n = 0, done = 0, tries = 0 map:map labels = map:new() for i = 1 to length(prog) do prog[i] = trim(prog[i]) end for -- first pass: assemble instructions and remove blank lines n = 1 while n <= length(prog) do s = prog[n] if length(s) = 0 then prog = remove(prog, n) continue end if ref = re:find(if_reg, s) if sequence(ref) then s = {"r9=@" & s[ref[3][1]..ref[3][2]], "09" & s[ref[2][1]] & " " & s} prog = replace(prog, s, n) n += 2 continue end if ref = re:find(set_val, s) if sequence(ref) then s = num(s[ref[2][1]], to_integer(s[ref[3][1]..ref[3][2]]), 0) s[1] &= prog[n][ref[1][2]+1..$] prog = replace(prog, s, n) n += length(s) continue end if s = re:find_replace(add, s, `3\1\2 r\1+=\2`) s = re:find_replace(mul, s, `4\1\2 r\1*=\2`) s = re:find_replace(set_reg, s, `5\1\2 r\1=r\2`) s = re:find_replace(add_reg, s, `6\1\2 r\1+=r\2`) s = re:find_replace(mul_reg, s, `7\1\2 r\1*=r\2`) s = re:find_replace(load, s, `8\1\2 r\1=[r\2]`) s = re:find_replace(store, s, `9\2\1 [r\1]=r\2`) s = re:find_replace(halt, s, `100 halt`) s = re:find_replace(nop, s, `300 nop`) prog[n] = s n += 1 end while -- middle pass: presize label loads and check each label address while not done do done = 1 tries += 1 n = 0 for i = 1 to length(prog) do ref = re:find(a_label, prog[i]) if sequence(ref) then s = prog[i][ref[2][1]..ref[2][2]] if map:get(labels, s, -1) != n then -- recalculate label if tries > 20 and map:get(labels, s) = n + 1 then n = map:get(labels, s) + 1 -- give up and pad later with nops elsif tries > 30 and map:get(labels, s) > n then n = map:get(labels, s) + 1 -- give up and pad later with nops else done = 0 --if length(num('0', n, 0)) > length(num('0', n+1, 0)) and tries < 10 then -- n += 1 --end if map:put(labels, s, n) end if --printf(2, "%s is %d\n", {s, n}) end if continue end if ref = re:find(set_lbl, prog[i]) if sequence(ref) then s = prog[i][ref[3][1]..ref[3][2]] n += length(num('0', map:get(labels, s), 0)) continue end if n += 1 end for end while --printf(2, "tries=%d\n", tries) -- final pass: assemble label loads and remove labels n = 1 while n <= length(prog) do ref = re:find(set_lbl, prog[n]) if sequence(ref) then s = prog[n][ref[3][1]..ref[3][2]] if not map:has(labels, s) then printf(2, "undefined label: %s\n", {s}) abort(1) end if s = num(prog[n][ref[2][1]], map:get(labels, s), 0) s[1] &= prog[n][ref[1][2]+1..$] prog = replace(prog, s, n) n += length(s) continue end if ref = re:find(a_label, prog[n]) if sequence(ref) then s = prog[n][ref[2][1]..ref[2][2]] --printf(2,"label %s is at %d n is %d\n", {s, map:get(labels, s), n}) s = repeat("300 nop", map:get(labels, s) - n + 1) -- nop padding prog = replace(prog, s, n) n += length(s) continue end if n += 1 end while return prog end function constant cmd_line = command_line() write_lines(1, assemble(read_lines(cmd_line[3])))