CSE 127: Lecture 16

The topics covered in this lecture are bound and free variables, and designing components to fail safely, with defense in depth.

Bound and free variables

When doing proofs of code correctness, you have to use the terminology correctly in order to communicate clearly. To this end, we need to understand the difference between bound and free variables in mathematics. We say that a variable is bound if it is a "formal parameter" of a function, or if it is introduced by a quantifier such as FOR ALL. A variable that is not bound is free. A bound variable is bound to the innermost definition of that variable. Consider the statement:

i = 5 AND FOR ALL i: 0 <= i < 10 --> P(i)
^                 ^       ^            ^
A                 B       C            D

Here A, B, C, and D label the occurrences of the variable i. The statement has two parts: everything before the AND, and everything after it. After the AND, the occurrences at C and D are bound by the quantifier at B, not by anything at A. The occurrence at A is free: no quantifier in the statement binds it, so its value must come from the surrounding context. This is analogous to variable scoping in a code fragment like this:

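if (i == 5) {                     /* outer i: like occurrence A */
    int i;                        /* new i: like the binding at B */
    for (i = 0; i < 10; i++) {    /* inner i: like C */
        P(i);                     /* inner i: like D */
    }
}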

Note that in this code fragment, the variable i declared inside the if block is different from the variable i in the if test: the inner declaration hides (shadows) the outer one until the end of the block.
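
To see the scoping rule in action, the fragment can be wrapped in a complete function. This is only a sketch: the function name outer_after_block and the body given to P (which just prints its argument) are made up for illustration and are not part of the lecture.

```c
#include <stdio.h>

/* Hypothetical stand-in for P: it only prints its argument. */
static void P(int i)
{
    printf("P(%d)\n", i);
}

/* Runs the fragment and returns the value of the OUTER i afterwards. */
int outer_after_block(int i)
{
    if (i == 5) {                     /* outer i (the parameter) */
        int i;                        /* inner i: shadows the outer i */
        for (i = 0; i < 10; i++) {
            P(i);                     /* refers to the inner i */
        }
    }                                 /* inner i goes out of scope here */
    return i;                         /* outer i again, never modified */
}
```

Calling outer_after_block(5) runs the loop ten times and still returns 5, because the loop only ever changes the inner i. Renaming the inner i to j throughout the block would not change what the function does, just as consistently renaming a bound variable does not change the meaning of a formula.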

Failing safely

Now that we have investigated proofs of code correctness, we turn to more realistic situations in which we know that our system may contain bugs or unproven code. In such cases we want to minimize the damage that the failure of a software module can cause. We will call this topic "failing safely", and we will also talk about "defense in depth". Both are about designing components so that if (when) they do fail, bad things are less likely to happen. "Defense in depth" refers to having multiple layers of protection, so that there is a buffer zone between the user-accessible layer and the protected systems, and the failure of one layer does not by itself expose them.

Example: Input validation

Suppose you are designing a web-based storage system for users. This system should let many (authorized) users store and retrieve their own files through a web interface. Your job is to design the CGI program that handles user input and accesses the file system to store and retrieve files. The interface for the system could be as simple as this:

Figure 1: sample user interface for web-based file storage

You want to store all the files for each user in their own directory (perhaps with subdirectories) on the server running the web program. Each user's directory will be named after their username. Imagine the user "bsy" is an authorized user of the system. Then his files would be stored under the directory "/var/foo/bsy/". If bsy wants to retrieve the file "/var/foo/bsy/text/README", he should supply the relative path name "/text/README", which the retrieval program appends to "/var/foo/bsy/" to find the intended file.

However, there is a problem with allowing him to supply any path name, because of the ".." path component, which means "go up one directory". Suppose the user supplied the path "/../ghamerly/secretBankAccountInfo.txt". If this is not checked, bsy gets access to potentially secret information belonging to ghamerly. How can we handle this? There are several ways:

1. We could try to determine where the relative path name will lead us, and not allow any access that goes above "/var/foo/bsy/". This is a bit complicated and has ambiguities (e.g. what does "/var/foo/bsy/../../../../etc/passwd" refer to?).
2. We could restrict the input for the relative path name so that it disallows all periods (and perhaps other characters). This is simpler but more restrictive for the users (e.g. they would not be able to name a file "readme.txt" if periods are disallowed).

Suppose we choose to implement the first policy of allowing any file name, but tracking the accesses and disallowing access outside of a user's own area. One way we could do this is to work out ourselves where the relative path name leads. This is error prone and ambiguous, as noted above, but it is possible. Another option is "chroot", a privileged operation that sets the root of the filesystem for a process. For example, if the program we wrote was run as "chroot /var/foo/bsy/ program", then the program would see /var/foo/bsy/ as the root of the filesystem (i.e. /), and the operating system would not allow access anywhere outside of that area. However, chroot has other problems we will not go into here.
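
As a rough illustration of working out the path ourselves, the following sketch walks the components of the relative path name and tracks how deep we are below the user's directory. The function name stays_inside is hypothetical, and the check is purely lexical: it assumes there are no symbolic links under the user's directory, which a real system could not take for granted.

```c
#include <string.h>

/*
 * Returns 1 if the relative path `rel` never climbs above the directory
 * it will be appended to, and 0 otherwise.
 * Assumption: purely lexical check, symbolic links are NOT considered.
 */
int stays_inside(const char *rel)
{
    int depth = 0;                  /* levels below the user's directory */
    const char *p = rel;

    while (*p != '\0') {
        size_t len = strcspn(p, "/");     /* length of next component */

        if (len == 2 && p[0] == '.' && p[1] == '.') {
            if (--depth < 0)
                return 0;           /* climbed above the root: reject */
        } else if (len > 0 && !(len == 1 && p[0] == '.')) {
            depth++;                /* ordinary name: one level down */
        }
        /* empty components ("//") and "." leave the depth unchanged */

        p += len;
        if (*p == '/')
            p++;                    /* skip the separator */
    }
    return 1;
}
```

With this sketch, "/text/README" is accepted and "/../ghamerly/secretBankAccountInfo.txt" is rejected. The function rejects a path as soon as any prefix of it leaves the user's directory, even if later components would lead back in; that is stricter than necessary, but it errs on the safe side.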

Suppose we choose to implement the second policy of disallowing certain characters in the relative path name. How should we do it? Should we scan the input path name for characters that aren't allowed and remove (or reject) any we see, or should we scan for characters that are allowed, and remove (or reject) anything we don't recognize? Which is a better policy?

Suppose that we are scanning for all characters that are not allowed in the path name. What if we make a mistake and omit some illegal characters that we should be checking for? Then we may have a security breach. However, what if we are scanning for all characters that are allowed in the input and we make an omission? Then we may have a usability problem, as users won't be able to use certain characters that they should be able to use, but we are also less likely to have a security breach. In other words, the second approach fails safely.
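
A minimal sketch of the "scan for allowed characters" approach might look like the following. The function name path_chars_allowed and the particular set of allowed characters (ASCII letters, digits, underscore, hyphen, and slash) are assumptions made for the example; the lecture does not prescribe a set.

```c
#include <string.h>

/*
 * Returns 1 if every character of `path` is in the allowed set,
 * and 0 otherwise. Anything we did not think of is rejected.
 * Assumed allowed set: A-Z, a-z, 0-9, '_', '-', '/'.
 */
int path_chars_allowed(const char *path)
{
    static const char extra[] = "_-/";    /* allowed punctuation */
    const char *p;

    if (*path == '\0')
        return 0;                         /* reject the empty path */

    for (p = path; *p != '\0'; p++) {
        char c = *p;

        if ((c >= 'a' && c <= 'z') ||
            (c >= 'A' && c <= 'Z') ||
            (c >= '0' && c <= '9'))
            continue;                     /* letter or digit: fine */
        if (strchr(extra, c) != NULL)
            continue;                     /* allowed punctuation */
        return 0;                         /* unknown character: reject */
    }
    return 1;
}
```

Because periods are not in the allowed set, "/../ghamerly/secretBankAccountInfo.txt" is rejected, and so is the harmless "readme.txt". If a character is forgotten in the allowed set, the cost is an annoyed user rather than a security breach.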

Defense in depth

An example of defense in depth is using firewalls on the internet to protect private networks, in conjunction with other network security techniques such as intrusion detection systems. A firewall is a computer that sits on the border between a private network and another network (e.g. the internet), so that all traffic between the private network and the other network must pass through it. The firewall protects the private network from the rest of the world by letting through only packets addressed to certain IP addresses and ports. A layered system may have several layers of firewalled networks to protect the most sensitive systems at the core of the private network.
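
The "allow only certain addresses and ports" rule can be sketched as a table lookup that denies by default. Everything concrete here is invented for the example: the struct and function names, the two addresses, and the ports. Real firewalls match on many more fields than this.

```c
#include <stddef.h>
#include <stdint.h>

/* One entry of the allow table: a destination address and port. */
struct allow_rule {
    uint32_t dst_ip;      /* IPv4 address as a 32-bit number */
    uint16_t dst_port;
};

/* Hypothetical rules for an imaginary private network. */
static const struct allow_rule rules[] = {
    { 0xC0A8000AU, 80 },  /* 192.168.0.10, port 80 */
    { 0xC0A8000BU, 25 },  /* 192.168.0.11, port 25 */
};

/* Returns 1 if a packet to (dst_ip, dst_port) may pass, 0 otherwise. */
int packet_allowed(uint32_t dst_ip, uint16_t dst_port)
{
    size_t i;

    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        if (rules[i].dst_ip == dst_ip && rules[i].dst_port == dst_port)
            return 1;     /* explicitly allowed */
    }
    return 0;             /* no rule matched: deny by default */
}
```

The design choice is the same as in the input validation example: the table lists what is allowed, so a forgotten rule blocks legitimate traffic instead of letting unwanted traffic in.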

A Virtual Private Network (VPN) is an increasingly common system that allows a host that is not in a private network to behave as if it were. Imagine that a company employee with computer A is on a business trip. Computer A is connected to the global internet (through an ISP), and the user wants to connect to the internal private network P. The company that owns network P can set up a VPN gateway (call it G) that A can connect to. G then pretends to be host A for the internal network, and A communicates with G over the global internet using encryption. Often, VPN software will prevent host A from connecting to any site other than G (or from receiving connections other than through G) while using the VPN. This helps to prevent host A from becoming a stepping stone in an attack.

bsy+cse127w02@cs.ucsd.edu, last updated Mon Mar 25 15:22:09 PST 2002. Copyright 2002 Bennet Yee.