CSE 127: Extra Credit

Due Sunday March 17, 2002, 2359 PST

Show that the following code correct and that it is not vulnerable to differential timing/power analysis attacks. Assume that a mod b is a number from 0 to b-1 inclusive, and that mod_inv(a,n) returns b = a-1, the multiplicative of a modulo n, i.e., (a*b) mod n = 1.

 * computes xe mod n,
 * but vulnerable to differential timing/power analysis
static Int modexp_intern(Int x, Int e, Int n) {
	Int y = 1, z = x;
	while (e <> 0) {
		if (e odd) {
			y = y * z mod n;
		z = z2 mod n;
		e = e div 2;
	return y;
 * should also compute xe mod n, but does it?
 * should not be vulnerable to DTA/DPA, but is it?
Int modexp(Int x, Int e, Int n) {
	Int r = random(1,n); // truly random number from 1 to n-1
	Int rpowe = modexp_intern(r,e,n);
	Int masked_x = x * r mod n;
	Int rpowe_inv = mod_inv(rpowe,n); // multiplicative inverse using EGCD

	return modexp_intern(masked_x,e,n) * rpowe_inv mod n;
The attacker will try to mount a differential timing or power analysis attack. This means that the attacker can control the value of x supplied to modexp, know the value of n, and wish to learn the value of the secret input exponent e -- and the attacker can measure the total run time of each call (from when x is supplied to when the computed value is returned) or the total power consumption for the call. (Assume that there is enough on-chip capacitance to mask the power consumed for the individual modular multiply operations.)

Observe that you need to do more than figure out whether one or two "expensive operations" -- the modular multiplies -- will be done per iteration of the loop. Since the exponent is a fixed (but secret) value, the loop can actually be unrolled and e's bit value constant propagated to get a straight-line program. The number of modular multiplies that occur is fixed and does not depend on the input value x at all, even in the original vulnerable version of modexp (our modexp_intern).

For those of you who may be having some trouble understanding the number theory, here are some simple examples.

Let's say n = 15, so the prime factorization is n = 3 * 5. The multiplicative inverse of 2 is 8, since 2 * 8 = 16 = 1 mod 15. The multiplicatve inverse of 7 is 13, since 7 * 13 = 91 = 6*15+1 = 1 mod 15. This is what mod_inv is assumed to compute. Note that some numbers do not have multiplicative inverses. In particular, a multiple of one of the prime factors will not: there is no number to which 6 can be multiplied that will give you 1 mod 15. For very large n, the chances of chosing such a number randomly is negligible. (Otherwise you can factor n this way: EGCD is cheap to compute, and EGCD(6,15)=3, giving the factor.)

Some of you believe you can get a series of time-domain power waveforms and average them to figure out the bits of e. Simple power analysis might be done this way, though it's not necessarily easy either -- since the power ``spikes'' will probably be rather regularly spaced together, esp if the number of intervening (low power) instruction are approximately the same. You can't tell where one iteration of the loop body begins and ends. Sometimes the amount of extra power that the modular multiplier functional unit consume is small and difficult to measure directly.

Differential Power Analysis (DPA) is done differently, essentially the same way that DTA is done. Often the power signal is so full of noise -- or there is enough of on-card capacitance -- that all you can measure is the total power consumed over a long(ish) period of time -- e.g., the duration of the entire modexp function.

One way to get a better understanding of what's going on is to imagine the following: If you unroll the loop and constant folded in the bits of e, the timing or power signature will be approximately the same. The power (timing) requirements for doing the squarings of z can be estimated with a good power (timing) model, and eliminated from any direct measurements of total power (time) required to compute any input x.

The small signal that you're trying to measure is which z value is getting multiplied into y -- you can basically know the total bit weight of e (this is the zeroth order measurement), but you don't know when these will occur in the loop. However, you can track the running values of z -- and y too, but only up to what you know so far of the (low order) bits of e. At the first unknown value of y, you can estimate the power (time) required to do the y <- y*z mod n operation, and this value of power usage (time duration) will depend on the actual values of y and z, with more power (time) needed when the result is large and require expensive calcuations in the mod n reduction step. If the bit of e that control that assignment is 1, then the power (timing) statistics that you gather should correlate with the estimated power (time) usage for this operation -- this is a weak signal, since the higher order bits of e (still unknown) will essentially be "noise" with respect to this measurement, and a largish number of measurements will be needed to detect it. (Some statistics is needed to figure out how many measurements is necessary, based on the power (timing) model.) If, on the other hand, the controlling bit of e was actually a 0, then there will be no such correlation.

Repeat until all the bits of e are extracted.

[ search CSE | CSE | bsy's home page | links | webster | MRQE | google | yahoo | citeseer | certserver ]
picture of bsy

bsy+cse127w02@cs.ucsd.edu, last updated Mon Mar 25 15:22:11 PST 2002. Copyright 2002 Bennet Yee.
email bsy.

Don't make me hand over my privacy keys!