Dynamic recompilation using registers

I don't know if anyone knows about this since it's a pretty niche subject, but I'm writing a dynamic recompiler for Microsoft's CIL. The target language is x86 machine code*. It's not too difficult to produce, just tedious, but what I can't figure out is how to pick registers for the instructions which use them. For example, let's say I'm generating an ADD instruction where the operands are a memory location and a register. How do I decide which register to use, and how do I keep track of which ones are in use? What do I do if they're all in use?

I really can't avoid using registers because some instructions only accept registers as their source parameter, so I'd have to generate a MOV to use the register as a temporary, which might slow my output code down too much for the dynamic recompiler to be worth writing.

* when I get the x86 target working, I'm writing a code generator because F*CK writing code for every single instruction for multiple architectures by hand.
Is there no reserved register for assembler temporaries?
Are you asking about register allocation? There are algorithms to determine which values need to be kept in registers.
https://en.wikipedia.org/wiki/Register_allocation

Having to "spill" and "fill" values between registers and memory is going to be inevitable (as is the slowdown that comes with it), since you have only a limited number of registers and a potentially unlimited number of values to distribute among them.
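To make the spill/fill idea concrete, here is a minimal sketch in Python (not the recompiler's own language). Everything in it is an assumption made for illustration: the `RegisterPool` class, the least-recently-used eviction policy, and the `[ebp-N]` slot layout are hypothetical, and a real allocator would pick spill victims far more carefully.

```python
from collections import OrderedDict

class RegisterPool:
    """Hypothetical register pool that spills the least recently used value."""

    def __init__(self, registers):
        self.free = list(registers)   # registers holding no live value
        self.in_use = OrderedDict()   # value name -> register, oldest first
        self.spilled = {}             # value name -> stack slot offset
        self.next_slot = 0            # bytes of spill space handed out so far
        self.code = []                # emitted pseudo-assembly (spills and fills only)

    def acquire(self, name):
        """Return a register for `name`, spilling or filling as needed."""
        if name in self.in_use:
            self.in_use.move_to_end(name)  # mark as most recently used
            return self.in_use[name]
        if not self.free:
            # No register left: evict the oldest value to a fresh stack slot.
            victim, reg = self.in_use.popitem(last=False)
            self.next_slot += 4
            self.spilled[victim] = self.next_slot
            self.code.append(f"mov [ebp-{self.next_slot}], {reg}")  # spill
            self.free.append(reg)
        reg = self.free.pop(0)
        if name in self.spilled:
            # The value was evicted earlier: reload it from its slot.
            slot = self.spilled.pop(name)
            self.code.append(f"mov {reg}, [ebp-{slot}]")            # fill
        self.in_use[name] = reg
        return reg

    def release(self, name):
        """Mark the register holding `name` as free again."""
        self.free.append(self.in_use.pop(name))


pool = RegisterPool(["eax", "ebx"])
for value in ["a", "b", "c", "a"]:    # three values competing for two registers
    pool.acquire(value)
print("\n".join(pool.code))
```

With only two registers, acquiring "c" forces "a" out to memory, and asking for "a" again forces "b" out and reloads "a". For a brand-new value the pool only hands out a register; loading the value into it is left to the caller.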
@LB
Nope, 32-bit x86 has only eight general-purpose registers, and ESP and EBP are normally tied up as the stack and frame pointers, which leaves EAX, EBX, ECX, EDX, ESI and EDI. None of them is reserved for temporaries, and at any time any of them could have been clobbered by another instruction. I could use the stack, but that slows down my code because I'll have to use a bunch of MOVs and PUSHes for what should be a single instruction. Temporaries, i.e. local variables, go on the stack. If you look at the assembly code gcc produces you'll see something like sub esp, 4, which reserves 4 bytes on the stack for a local variable (the stack grows downward).

CIL is a stack-based language, so where x86 does:
mov eax, 0x0f
mov ebx, 0x01
add eax, ebx

CIL does something like:
push 0x0f
push 0x01
add

which I have to somehow translate into the above. The problem is that I can only process one opcode at a time; it would be too complicated for my program to figure out why the input program is pushing values on the stack. I could just use the stack myself, but it would be much faster to use the registers. The problem arises when I have two input operations that use the same data. Because I'm only looking at one opcode at a time, I can't handle this situation:
push 0x0f
push 0x01
add
push 0x10
add

In CIL, this pushes 0x0f and 0x01 and adds them, taking them off the stack and putting the result (0x10) on. It then pushes 0x10 and adds again, taking it and the previous result off the stack and putting the new result (0x20) back on. A naive dynamic recompiler would generate this:
mov eax, 0x0f
mov eax, 0x01
add eax, ebx
mov eax, 0x10
add eax, ebx

the result of which would be unknown and totally useless. When the code generator turns push 0x01 into mov eax, 0x01, it doesn't know that eax was just loaded with the value 0x0f, or that the next instruction wants to add the previous EAX and the new one together. It's pretty easy to mark registers as "in use", which appears to solve the problem, but there's no obvious way of marking a register as no longer in use -- the function that generated the opcode that's using it doesn't know when it'll be finished -- and since x86 only has a handful of registers for general use, I'd run out in less than a millisecond.
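One possible way around the "when is a register free again" problem is to mirror the CIL evaluation stack at compile time: a push claims a register, and an opcode that pops a value hands that value's register back. The sketch below is in Python rather than the recompiler's own language, and it rests on assumptions: the `translate` function and the tuple encoding of opcodes are hypothetical, it handles only push and add, it covers only straight-line code, and it gives up rather than spilling when registers run out.

```python
def translate(ops, registers=("eax", "ebx", "ecx", "edx")):
    """Translate a tiny CIL-like op list to x86 text, one opcode at a time."""
    free = list(registers)  # registers holding no live value
    vstack = []             # compile-time mirror of the CIL evaluation stack
    code = []
    for op, *args in ops:
        if op == "push":
            if not free:
                # A real allocator would spill a register to memory here.
                raise RuntimeError("out of registers")
            reg = free.pop(0)
            code.append(f"mov {reg}, {args[0]:#04x}")
            vstack.append(reg)
        elif op == "add":
            rhs = vstack.pop()   # top of the evaluation stack
            lhs = vstack.pop()   # value beneath it
            code.append(f"add {lhs}, {rhs}")
            free.insert(0, rhs)  # rhs was popped, so its register is free again
            vstack.append(lhs)   # the sum now lives in lhs
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return code


program = [("push", 0x0f), ("push", 0x01), ("add",), ("push", 0x10), ("add",)]
print("\n".join(translate(program)))
```

The generator still looks at one opcode at a time. The only state carried between opcodes is the list of free registers and the virtual stack, and the "no longer in use" signal comes from the pop itself.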

It wouldn't be a problem if I just used the stack, but that would be a lot slower.
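For comparison, a stack-only translation of the same five opcodes might look like the sketch below. It is in Python, the `translate_stack_only` name and opcode encoding are hypothetical, and the choice of EAX and EBX as scratch registers is just an assumption. Each add goes through memory four times instead of working on registers directly.

```python
def translate_stack_only(ops):
    """Translate a tiny CIL-like op list using the hardware stack for every value."""
    code = []
    for op, *args in ops:
        if op == "push":
            code.append(f"push {args[0]:#04x}")
        elif op == "add":
            # Pop both operands, add them, push the result back.
            code += ["pop ebx", "pop eax", "add eax, ebx", "push eax"]
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return code


program = [("push", 0x0f), ("push", 0x01), ("add",), ("push", 0x10), ("add",)]
print(len(translate_stack_only(program)), "instructions")
```

That is 11 instructions, against 5 in the hand-written register version of the same sequence. How much slower it runs in practice depends on the CPU, so the count is only a rough indicator.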

[edit]
@Zhuge
Thanks! I should've figured there would be existing algorithms for this. Hopefully that solves my problem.