CSE 30 -- Lecture 7 -- Oct 21


During this lecture, I went over assignment 2 sample solutions:
  1. simple test
  2. table lookup
For assignment 3, I will be allowing paper handins.

I also mentioned that the offset field encoded in a ``branch'' instruction is smaller than the target field of a ``jump'' instruction (16 bits versus 26 bits). This means that when the target of a conditional branch is out of range, the branch may instead have to go to a nearby jump instruction that reaches the real target. Another way that a compiler could handle this is to conditionally branch to code that loads the target address into a register and then uses the jr (jump-register) instruction.
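As a rough illustration of the range check involved, the C sketch below assumes the MIPS encoding, in which a conditional branch holds a signed 16-bit word offset measured from the instruction that follows the branch. The function name branch_in_range and the addresses used to exercise it are made up for this example; they are not part of the assignments.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch only: returns true if a conditional branch located at branch_pc
 * can reach target directly. Assumes a signed 16-bit offset counted in
 * words (4 bytes) relative to branch_pc + 4.
 */
bool branch_in_range(uint32_t branch_pc, uint32_t target)
{
    /* Use 64-bit arithmetic so the subtraction cannot wrap around. */
    int64_t delta = (int64_t)target - ((int64_t)branch_pc + 4);

    /* Instructions are word aligned, so the distance must be a multiple of 4. */
    if (delta % 4 != 0)
        return false;

    int64_t words = delta / 4;
    return words >= -32768 && words <= 32767;
}
```

When a check like this fails, the compiler or assembler has to fall back on one of the two schemes described above: branch to a jump, or branch to code that uses jr.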

The main part of the lecture had to do with what compilers can and cannot do for you. The first portion dealt with the assignment of variables to memory. In C, the extern and static storage class specifiers place variables in the data segment, which is subdivided into initialized and uninitialized parts; this subdivision keeps the executable small, since uninitialized data need not be stored in the file. The register specifier tells the compiler to try to keep the corresponding variable in a register, which is much faster to access than memory.
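A small C sketch of the data-segment cases follows. All of the names (initialized_total, uninitialized_total, next_ticket) are invented for illustration, and the comments describe what a typical compiler and linker do rather than anything the C language itself guarantees about file layout.

```c
/* File-scope variables have static storage duration: they live in the
 * data segment for the whole run of the program. */
int initialized_total = 10;  /* initialized data: the value 10 is typically stored in the executable */
int uninitialized_total;     /* uninitialized data: starts out as 0, typically takes no space in the file */

/*
 * A static local variable is also kept in the data segment rather than
 * in the stack frame, so it keeps its value from one call to the next.
 */
int next_ticket(void)
{
    static int ticket = 0;

    ticket = ticket + 1;
    return ticket;
}
```

Calling next_ticket three times in a row returns 1, 2, and 3, which would not work if ticket were an ordinary local variable re-created in each stack frame.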

Next, I talked briefly about compiler optimizations. I gave the following example C code:

func()
{
	char	*cp;
	int	x;

	x = *cp++;
	...
}
A dumb compiler might translate this code into this assembly code:
	lw $t1, -8($fp)	# load cp from stack frame
	lb $t2, 0($t1)	# dereference cp (cp is a char *, so load one byte)
	addiu $t1,$t1,1	# post-increment
	sw $t1,-8($fp)	# save cp back into stack frame
	sw $t2,-12($fp)	# save x into stack frame
whereas register declarations (or a reasonable, optimizing compiler) would keep things in registers as much as possible (see the assignment 1 and 2 code).
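To show what such register declarations look like in C source, here is a minimal sketch built around the same x = *cp++ statement. The function sum_bytes is a hypothetical helper, not code from the assignments, and register is only a request: the compiler is free to ignore it.

```c
/* Hypothetical helper: adds up the character codes of a NUL-terminated string. */
int sum_bytes(const char *s)
{
    register const char *cp = s;  /* request: keep the pointer in a register */
    register int x;               /* request: keep the current character in a register */
    int sum = 0;

    /* Same load and post-increment as in func above, repeated until the NUL. */
    while ((x = *cp++) != 0)
        sum += x;
    return sum;
}
```

If the request is honored, the loop body needs only the one load through cp; cp and x never travel to or from the stack frame.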

There are some optimizations that a programmer can do but the compiler can't. The example that I gave has to do with aliasing:

int	x = 3;

int	func(int	*ip)
{
	int	y;

	y = x;
	*ip = 1;
	y += x;
	*ip = *ip + y;
	y += x;
	return y;
}
The compiler cannot know at compile time whether ip might point to the integer x. The programmer, on the other hand, might know that this will never happen. The compiler must therefore reload x from memory after each store to the memory location pointed to by ip:
	...
	lw	$t1,x		# t1 is y
	li	$t2,1
	sw	$t2,0($s0)	# assume s0 has ip in it
	lw	$t3,x		# <--- reload x into t3, a temporary
	add	$t1,$t1,$t3
	add	$t2,$t2,$t1	# *ip in t2 is still good, so use it
	sw	$t2,0($s0)
	lw	$t3,x		# <--- reload x into t3, a temporary
	add	$t1,$t1,$t3
	move	$v0,$t1		# place t1 into return value register v0
	... stuff to clean up stack frame
	jr	$ra
instead of
	...
	lw	$t1,x		# t1 is y
	move	$t3,$t1		# save a copy in temporary t3
	li	$t2,1
	sw	$t2,0($s0)	# assume s0 has ip in it
	add	$t1,$t1,$t3
	add	$t2,$t2,$t1	# *ip in t2 is still good, so use it
	sw	$t2,0($s0)
	add	$t1,$t1,$t3
	move	$v0,$t1		# place t1 into return value register v0
	... stuff to clean up stack frame
	jr	$ra
which is one instruction shorter and avoids two loads. That makes it a little faster, since memory references are expensive (though in this case, with the cache, the impact is not as significant).
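The difference between the two listings can be seen in C as well. In the sketch below, func is the function from the example above, and func_cached is a hypothetical variant written for this illustration that reads x once and reuses the copy, mirroring the second listing.

```c
int x = 3;

/* The function from the lecture: x is re-read every time it is used. */
int func(int *ip)
{
    int y;

    y = x;
    *ip = 1;
    y += x;
    *ip = *ip + y;
    y += x;
    return y;
}

/*
 * Hypothetical variant: reads x once and reuses the copy, as the second
 * listing does. This is only equivalent to func if ip never points to x.
 */
int func_cached(int *ip)
{
    int cached_x = x;
    int y;

    y = cached_x;
    *ip = 1;
    y += cached_x;
    *ip = *ip + y;
    y += cached_x;
    return y;
}
```

When ip points to some other integer, the two functions behave identically. When ip points to x (starting from x = 3), func leaves 5 in x while func_cached leaves 7. Both happen to return 9 in this particular example, so the discrepancy shows up only in the value stored through ip.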

This is an optimization that no C or C++ compiler can do, since a compiler must preserve the correctness of the translation in the worst-case scenario: ip may point to x in one call of func, but point to a completely different place in another call. You, as the programmer, might know that ip never points to x, so that the value of x can safely be cached in a register and never reloaded. Unfortunately, there is no way for you to communicate this information to the compiler. If you need the little bit of extra speed that this optimization could represent, you will have to write the assembly code yourself instead of letting a compiler do it for you.



bsy@cse.ucsd.edu, last updated Mon Oct 21 20:59:11 PDT 1996.
