The topics covered in this lecture are building a specification for code, proving code correct, and assignment 2.

Specification for `void sort(int *arr, int nelt)`:

**Input constraints:**

- `arr` is an array of integers
- `nelt` is the length of the array `arr`

**Output constraints** (we denote by `arr` the original array contents and by `arr'` what the array contains after the code runs):
- For all i, 0 <= i < nelt - 1: arr'[i] <= arr'[i+1]

That is not enough: the specification must also state the relationship between `arr` and `arr'`. This type of function is a permutation, so the output constraints on `arr'` become:

- For all i, 0 <= i < nelt - 1: arr'[i] <= arr'[i+1]
- There exists phi: phi is a permutation on {0, 1, 2, ..., nelt - 1} and for all i, 0 <= i < nelt: `arr'[phi(i)] = arr[i]`
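The two output constraints can also be checked mechanically. The following is a minimal sketch, not part of the lecture's code: the names `is_sorted` and `is_permutation_of` are hypothetical, and the permutation check looks for a suitable phi by greedy matching in quadratic time, which is an assumption chosen for simplicity rather than speed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* First output constraint: after[i] <= after[i+1] for 0 <= i < nelt - 1. */
bool is_sorted(const int *after, int nelt) {
    for (int i = 0; i + 1 < nelt; i++)
        if (after[i] > after[i + 1])
            return false;
    return true;
}

/* Second output constraint: some permutation phi satisfies
 * after[phi(i)] == before[i] for all i. We build phi greedily by pairing
 * each before[i] with an unused slot of after that holds the same value.
 * Equal values are interchangeable, so the greedy choice loses nothing. */
bool is_permutation_of(const int *before, const int *after, int nelt) {
    if (nelt <= 0)
        return true;
    bool *used = calloc((size_t)nelt, sizeof *used);
    assert(used != NULL);            /* sketch: treat allocation failure as fatal */
    bool ok = true;
    for (int i = 0; i < nelt && ok; i++) {
        ok = false;
        for (int j = 0; j < nelt; j++) {
            if (!used[j] && after[j] == before[i]) {
                used[j] = true;      /* this choice plays the role of phi(i) = j */
                ok = true;
                break;
            }
        }
    }
    free(used);
    return ok;
}
```

A sort satisfies the specification on a given input when both functions return true for the saved original array and the array after the call.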

A permutation rearranges `nelt` objects, labeled 0, 1, 2, ..., `nelt-1`. We define our domain and range set as S = {0, 1, 2, ..., nelt-1}, and require that phi is a bijection on S. A fact that we will use is that the composition of two permutations is itself a permutation.
Then we need to prove the correctness of the sort routine by starting with the identity permutation phi_{0} and showing that at each step s the function maintains a permutation phi_{s} relating the current state of the array to its initial state.
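To make the bookkeeping concrete, here is a small sketch of how phi_{s} could be tracked across swaps. The helper names `is_bijection`, `make_swap`, and `compose` are hypothetical and permutations are stored as plain `int` arrays; none of this appears in the lecture's code.

```c
#include <assert.h>
#include <stdbool.h>

/* True iff phi[0..n-1] is a bijection on S = {0, 1, ..., n-1}:
 * every value in S must appear exactly once. */
bool is_bijection(const int *phi, int n) {
    for (int v = 0; v < n; v++) {
        int hits = 0;
        for (int i = 0; i < n; i++)
            if (phi[i] == v)
                hits++;
        if (hits != 1)
            return false;
    }
    return true;
}

/* Fills sigma with the permutation that exchanges positions a and b
 * and leaves every other position fixed (the effect of one swap). */
void make_swap(int *sigma, int n, int a, int b) {
    assert(0 <= a && a < n && 0 <= b && b < n);
    for (int i = 0; i < n; i++)
        sigma[i] = i;
    sigma[a] = b;
    sigma[b] = a;
}

/* out = sigma o phi, that is, out[i] = sigma[phi[i]].
 * If the current array holds the original element i in slot phi[i], and a
 * step moves slot j to slot sigma[j], then out describes the new array. */
void compose(const int *sigma, const int *phi, int *out, int n) {
    for (int i = 0; i < n; i++)
        out[i] = sigma[phi[i]];
}
```

Starting from the identity and composing one swap at a time always yields a bijection, which is the fact about composition used in the proof.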

Now here is the proposed code that will implement this specification:

     1. void sort(int *arr, int nelt) {
     2.     int mid, smallix, bigix, t;
     3.     if (nelt <= 1) return;            /* base case: already sorted */
     4.
     5.     mid = arr[0];                     /* pivot value */
     6.     smallix = 0;
     7.     bigix = nelt;
     8.
     9.     while (smallix < bigix) {
    10.         while (smallix < nelt && arr[smallix] <= mid)
    11.             ++smallix;
    12.         while (bigix > 0 && arr[bigix-1] > mid)
    13.             bigix--;
    14.         if (!(smallix==nelt || 0 == bigix || smallix == bigix)) {
    15.             t = arr[smallix];
    16.             arr[smallix] = arr[bigix - 1];
    17.             arr[bigix - 1] = t;
    18.         }
    19.     }
    20.     t = arr[0]; arr[0] = arr[smallix - 1]; arr[smallix - 1] = t;
    21.     sort(arr, smallix - 1);
    22.     sort(arr + smallix, nelt - smallix);
    23. }

There are two key strategies to proving code correctness:

- Loop invariants -- an expression that is true upon entry to the loop and remains true after every execution of the body
- Induction -- used to prove recursive algorithms correct and to prove that loop invariants hold
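As a warm-up for both strategies, consider a much simpler problem than sorting: finding the largest element of an array. This is only an illustrative sketch; `max_of` and `max_rec` are hypothetical names and are not part of the sort routine.

```c
#include <assert.h>

/* Iterative version, argued correct with a loop invariant.
 * Requires nelt >= 1. */
int max_of(const int *arr, int nelt) {
    assert(nelt >= 1);
    int best = arr[0];
    int i = 1;
    /* Invariant: 1 <= i <= nelt, and best >= arr[j] for all 0 <= j < i.
     * It is true on entry because only arr[0] has been examined. */
    while (i < nelt) {
        if (arr[i] > best)
            best = arr[i];
        i++;
        /* The invariant holds again here, now for the larger i. */
    }
    /* On exit i == nelt, so the invariant covers the whole array. */
    return best;
}

/* Recursive version, argued correct by induction on nelt.
 * Base case: nelt == 1. Inductive step: assume the call on the first
 * nelt - 1 elements is correct, then compare with the last element. */
int max_rec(const int *arr, int nelt) {
    assert(nelt >= 1);
    if (nelt == 1)
        return arr[0];
    int rest = max_rec(arr, nelt - 1);
    return arr[nelt - 1] > rest ? arr[nelt - 1] : rest;
}
```

The same two ingredients appear in the sort proof: an invariant for the partitioning loop, and induction on `nelt` for the recursive calls.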

Notice that the main while loop divides the array into two parts, based on a pivot point (which is based on the value of the variable `mid`). Now we begin with our inductive proof:

**Proof by induction**

- Base case: arrays of length 0 or 1 are sorted by the algorithm. We know that, by definition, arrays of length 0 or 1 are already sorted. So line 3 handles that case.
- Assume that the algorithm works for all `nelt < k`.
- Prove that the algorithm works for `nelt = k`.

We begin the proof of the third step by noticing a loop invariant that can help us: smallix <= bigix holds throughout the main loop. Why? It holds before the loop (smallix = 0 and bigix = nelt), and the two small while loops preserve it: smallix is only incremented when arr[smallix] <= mid, and bigix is only decremented when arr[bigix-1] > mid. So smallix stops at the first element greater than mid, bigix stops just above the last element not greater than mid, and the two indices can meet but never cross.

There are three regions of the array: [0,smallix), [smallix,bigix), and [bigix,nelt). The middle region [smallix,bigix) gets smaller due to the inner while loops, until it eventually reaches size 0 when smallix = bigix.

For the two outer regions, we know the following loop invariants are true:

- for all i, 0 <= i < smallix: arr[i] <= mid
- for all i, bigix <= i < nelt: mid < arr[i]
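One way to gain confidence in these invariants before finishing the proof is to assert them while the loop runs. The sketch below copies the partitioning loop (lines 5 to 19 of the listing, with the index spelled `smallix` throughout) into a hypothetical function `partition_checked`; the checker `partition_invariants_hold` is likewise an assumed helper, not part of the original code.

```c
#include <assert.h>
#include <stdbool.h>

/* Checks the region invariants for a given state of the loop:
 *   0 <= smallix <= bigix <= nelt,
 *   arr[i] <= mid for 0 <= i < smallix,
 *   mid < arr[i]  for bigix <= i < nelt. */
bool partition_invariants_hold(const int *arr, int nelt,
                               int smallix, int bigix, int mid) {
    if (!(0 <= smallix && smallix <= bigix && bigix <= nelt))
        return false;
    for (int i = 0; i < smallix; i++)
        if (!(arr[i] <= mid))
            return false;
    for (int i = bigix; i < nelt; i++)
        if (!(mid < arr[i]))
            return false;
    return true;
}

/* The partitioning loop with the invariants asserted at the top of each
 * iteration and on exit. Requires nelt >= 2. Returns the final smallix. */
int partition_checked(int *arr, int nelt) {
    int mid = arr[0], smallix = 0, bigix = nelt, t;
    while (smallix < bigix) {
        assert(partition_invariants_hold(arr, nelt, smallix, bigix, mid));
        while (smallix < nelt && arr[smallix] <= mid)
            ++smallix;
        while (bigix > 0 && arr[bigix - 1] > mid)
            bigix--;
        if (!(smallix == nelt || 0 == bigix || smallix == bigix)) {
            t = arr[smallix];
            arr[smallix] = arr[bigix - 1];
            arr[bigix - 1] = t;
        }
    }
    assert(smallix == bigix);   /* the middle region is now empty */
    assert(partition_invariants_hold(arr, nelt, smallix, bigix, mid));
    return smallix;
}
```

Passing assertions on test inputs do not replace the proof, but a failing assertion pinpoints the iteration at which an invariant is broken.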

We will continue this proof next time in class.

You can download this annotated code and play with it yourself:

Your task is to fix his code. In order to keep your job -- your pointy-haired boss will undoubtedly read your changes -- you must keep the random pivot selection idea. Use what you know about testing and proofs of correctness to make this code work again.

You should hand in: the fixed (and properly annotated) code, the testing scaffolding that you used when debugging your implementation, the test inputs (especially those that showed bugs in the pointy-haired boss's implementation or your own initial fixes) that will serve as test cases for regression testing subsequent versions, and a README.txt file containing a description of what you did, how you decided to do what you did, and why you believed it to be the correct fix(es). You should also include in your writeup a discussion of whether the worst-case performance of the original code might actually be a problem in practice.

Clarification: by test scaffolding I mean what code you write to test your fixes to the sort function. I expect you to have some sort of testing driver which allows you to at least semi-automate the process of feeding in the test cases from a test suite, and such a driver program will be part of the testing tools for the project, to be handed over to sustaining engineering along with the regression testing test suite. You may wish to generate your test cases via a program, or just have data files -- your design decision should be part of the writeup.
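As a rough illustration of what such a driver might look like, here is one possible shape, offered only as a sketch. Every name in it (`sort_fn`, `reference_sort`, `run_case`, `run_random_suite`, `broken_sort`) is hypothetical, and using the C library's `qsort` as a trusted oracle is one design choice among several; your own design may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

typedef void (*sort_fn)(int *arr, int nelt);

/* Comparator for qsort that avoids overflow from subtraction. */
int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Oracle: the C library's qsort, assumed correct. */
void reference_sort(int *arr, int nelt) {
    if (nelt > 0)
        qsort(arr, (size_t)nelt, sizeof *arr, cmp_int);
}

/* A deliberately wrong candidate, useful for testing the driver itself. */
void broken_sort(int *arr, int nelt) {
    (void)arr;
    (void)nelt;   /* leaves the array untouched */
}

/* Runs one test case. Returns true iff the candidate's output equals the
 * oracle's, which implies it is sorted and a permutation of the input. */
bool run_case(sort_fn candidate, const int *input, int nelt) {
    size_t bytes = (size_t)nelt * sizeof *input;
    int *got = malloc(bytes ? bytes : 1);
    int *want = malloc(bytes ? bytes : 1);
    assert(got != NULL && want != NULL);
    memcpy(got, input, bytes);
    memcpy(want, input, bytes);
    candidate(got, nelt);
    reference_sort(want, nelt);
    bool ok = memcmp(got, want, bytes) == 0;
    free(got);
    free(want);
    return ok;
}

/* Feeds ncases pseudo-random arrays of length 0..maxlen to the candidate.
 * The seed is fixed by the caller so that failures can be reproduced.
 * Returns the number of failing cases. */
int run_random_suite(sort_fn candidate, unsigned seed, int ncases, int maxlen) {
    if (maxlen < 0)
        maxlen = 0;
    int *buf = malloc((size_t)(maxlen + 1) * sizeof *buf);
    assert(buf != NULL);
    int failures = 0;
    srand(seed);
    for (int c = 0; c < ncases; c++) {
        int nelt = rand() % (maxlen + 1);
        for (int i = 0; i < nelt; i++)
            buf[i] = rand() % 10;   /* small value range forces duplicates */
        if (!run_case(candidate, buf, nelt))
            failures++;
    }
    free(buf);
    return failures;
}
```

Inputs that expose a bug can then be saved as fixed arrays and replayed through `run_case` as regression tests, alongside the randomly generated ones.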

This assignment is due at 2359 on February 15, 2002.


bsy+cse127w02@cs.ucsd.edu, last updated
