I also mentioned that the offset field in the ``branch'' instructions is smaller than the target field in the ``jump'' instruction. This means that a conditional branch whose target is out of range may need to branch to a nearby jump instruction instead. Another way that a compiler could handle this is to conditionally branch to code that loads the target address into a register and then uses the jr (jump-register) instruction.
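The range check behind that decision can be sketched in C. This is only an illustration, assuming the standard MIPS32 encodings (a signed 16-bit word offset relative to the instruction after the branch, and a 26-bit word target for the jump that must lie in the same 256 MB region); the function names are hypothetical and addresses are assumed to be word aligned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset, counted in instructions, that a branch located at address pc
   would have to encode to reach target. It is measured from the
   instruction following the branch (pc + 4). */
int64_t branch_word_offset(uint32_t pc, uint32_t target)
{
    return ((int64_t)target - ((int64_t)pc + 4)) / 4;
}

/* A branch reaches the target only if the offset fits in a signed
   16-bit field. */
bool branch_reaches(uint32_t pc, uint32_t target)
{
    int64_t off = branch_word_offset(pc, target);
    return off >= -32768 && off <= 32767;
}

/* A jump supplies the low 28 bits of the address; the top 4 bits come
   from pc + 4, so the target must be in the same 256 MB region. */
bool jump_reaches(uint32_t pc, uint32_t target)
{
    return ((pc + 4) & 0xF0000000u) == (target & 0xF0000000u);
}
```

When branch_reaches is false but jump_reaches is true, an assembler or compiler can fall back on the branch-to-a-jump arrangement; when neither holds, the jr approach remains.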
The main part of the lecture had to do with what compilers can and cannot do for you. The first portion covered the assignment of variables to memory. In C, the extern and static storage class specifiers place variables in the data segment (initialized or uninitialized: the data segment is subdivided this way to keep the executable small), and the register specifier tells the compiler to try to keep the corresponding variable in a register, which is much faster to access than memory.
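A small sketch of these storage classes in C follows. All of the names are made up for illustration, and whether a register hint is honored is entirely up to the compiler.

```c
/* File-scope variables live in the data segment. */
int initialized_global = 42;  /* initialized data: value stored in the executable */
int uninitialized_global;     /* uninitialized data: only its size is recorded,
                                 and it starts out as zero */

/* A static local also lives in the data segment, not in the stack
   frame, so its value survives from one call to the next. */
int next_ticket(void)
{
    static int count;
    return ++count;
}

/* register asks the compiler to keep i and total out of memory. */
int sum_to(int n)
{
    register int i;
    register int total = 0;

    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```

Calling next_ticket() three times yields 1, 2 and 3, which would not happen if count were an ordinary stack variable.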
Next, I talked briefly about compiler optimizations. I gave the following example C code:
    func()
    {
        char *cp;
        int x;

        x = *cp++;
        ...
    }

A dumb compiler might translate this code into this assembly code:
    lw  $t1, -8($fp)    # load cp from stack frame
    lb  $t2, 0($t1)     # dereference cp (it points to a char, so load a byte)
    add $t1, $t1, 1     # post-increment
    sw  $t1, -8($fp)    # save cp back into stack frame
    sw  $t2, -12($fp)   # save x into stack frame

whereas register declarations (or a reasonable optimizing compiler) would keep things in registers as much as possible (see the assignment 1 and 2 code).
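As a sketch of what such register declarations might look like around the same x = *cp++ idiom, here is a hypothetical helper (sum_bytes is not from the lecture or the assignments) that walks a NUL-terminated string. The declarations are only hints, which a compiler is free to ignore.

```c
/* Add up the bytes of a NUL-terminated string. The pointer, the
   current byte and the running total are all declared register, so a
   compiler that honors the hints never touches the stack frame inside
   the loop. */
int sum_bytes(const char *s)
{
    register const char *cp = s;
    register int x;
    register int total = 0;

    /* x = *cp++ loads the byte cp points to, then advances cp. */
    while ((x = *cp++) != 0)
        total += x;
    return total;
}
```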
There are some optimizations that a programmer can do but the compiler can't. The example that I gave has to do with aliasing:
    int x = 3;

    int func(int *ip)
    {
        int y;

        y = x;
        *ip = 1;
        y += x;
        *ip = *ip + y;
        y += x;
        return y;
    }

The compiler cannot know at compile time whether ip might point to the integer x. The programmer, on the other hand, might know that this will never happen. The compiler must reload x from memory after each modification to the memory location pointed to by ip:
    ...
    lw   $t1,x          # t1 is y
    li   $t2,1
    sw   $t2,0($s0)     # assume s0 has ip in it
    lw   $t3,x          # <--- reload x into t3, a temporary
    add  $t1,$t1,$t3
    add  $t2,$t2,$t1    # *ip in t2 is still good, so use it
    sw   $t2,0($s0)
    lw   $t3,x          # <--- reload x into t3, a temporary
    add  $t1,$t1,$t3
    move $v0,$t1        # place t1 into return value register v0
    ... stuff to clean up stack frame
    jr   $ra

instead of
    ...
    lw   $t1,x          # t1 is y
    move $t3,$t1        # save a copy in temporary t3
    li   $t2,1
    sw   $t2,0($s0)     # assume s0 has ip in it
    add  $t1,$t1,$t3
    add  $t2,$t2,$t1    # *ip in t2 is still good, so use it
    sw   $t2,0($s0)
    add  $t1,$t1,$t3
    move $v0,$t1        # place t1 into return value register v0
    ... stuff to clean up stack frame
    jr   $ra

which is a little shorter and faster, since memory references are expensive (though in this case, with the cache, the impact is not as significant).
This is an optimization that a C or C++ compiler cannot do on its own, since compilers must preserve the correctness of the translation for the worst-case scenario: ip may point to x in one call of func, but point to a completely different place in another call. You, as the programmer, might know that ip never points to x, and can safely cache the value of x in a register and never have to reload it again. Unfortunately, C before the C99 standard gives you no way to communicate this information to the compiler (C99 added the restrict pointer qualifier for this kind of promise). Without such a mechanism, if you need the little bit of extra speed that this optimization could represent, you will have to write the assembly code yourself instead of letting a compiler do it for you.
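The effect of aliasing can also be checked directly in C. The sketch below puts the lecture's func next to a hypothetical hand-cached variant, func_cached, which mirrors the shorter assembly sequence by keeping x and *ip in locals. The variant and its local names are illustrative assumptions, not code from the lecture.

```c
int x = 3;

/* The function from the lecture: x is read again after every store
   through ip, so it stays correct even when ip points to x. */
int func(int *ip)
{
    int y;

    y = x;
    *ip = 1;
    y += x;
    *ip = *ip + y;
    y += x;
    return y;
}

/* Hand-cached variant: valid only if the caller guarantees that ip
   never points to x. */
int func_cached(int *ip)
{
    int cached_x = x;   /* plays the role of $t3 */
    int star_ip = 1;    /* plays the role of $t2 */
    int y = cached_x;

    *ip = star_ip;
    y += cached_x;      /* no reload of x */
    star_ip = star_ip + y;
    *ip = star_ip;
    y += cached_x;      /* no reload of x */
    return y;
}
```

With x starting at 3 and ip pointing at some other integer, both versions return 9 and leave 7 in *ip. When called with &x, func leaves 5 in x while func_cached leaves 7 (both still happen to return 9 for this starting value), which is exactly the difference a compiler is not allowed to introduce on its own.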
bsy@cse.ucsd.edu, last updated