Correct shell programming, part 1
Writing shell scripts without security holes is said to be difficult, if not impossible. That opinion is clearly wrong. This article shows how to write shell scripts that are so robust and secure that I would even use them as CGIs on a publicly accessible web server, even if the source code were publicly available too.

Why is variable quoting necessary?

Traditionally, in the first years of Unix, filenames were composed of letters and dots. When the Microsoft operating systems gained market share, filenames with embedded spaces became popular. Almost every Unix operating system allows filenames to be composed of arbitrary characters, except the slash, which separates the components of a pathname, and the '\0' character, which marks the end of the filename.

Another point is that the shell applies the expansion rules more often than most programmers would expect. Consider this example code:

var="foo bar baz * `date`"
for i in $var; do
    echo $i
done

What happens here? In the first line the value which is assigned to var is written in double quotes. Not a bad start. But the backticks are evaluated even in double quotes, so the output of the date(1) command is inserted in place. The * is copied as-is.

In the second line $var is unquoted. That means another application of the shell expansion rules: the value is split into words, and this time the * is expanded to the names of all files in the current directory whose names do not start with a dot.

In the third line the shell expansion rules are applied a third time, now to the variable i. Now suppose that you have a file called *.txt in your current directory. Because $i is unquoted, this filename is again subject to word splitting and filename expansion, so the loop echoes not only *.txt but every file that matches the shell pattern *.txt.
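The effect of the third expansion can be checked with a small sketch. It assumes that mktemp -d is available (it is not part of POSIX), and the scratch directory and the names a.txt and b.txt are made up for illustration. The script counts how many words the shell produces from the same value with and without quotes.

```shell
#!/bin/sh
# Sketch: compare unquoted and quoted expansion of a value that looks
# like a shell pattern. Runs in a scratch directory (assumes mktemp -d).
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
touch a.txt b.txt '*.txt'    # hypothetical files; one is literally named *.txt

i='*.txt'

# Unquoted: the value is split into words and matched against the directory.
set -- $i
unquoted_count=$#

# Quoted: the value is passed on unchanged as a single word.
set -- "$i"
quoted_count=$#
quoted_value=$1

echo "unquoted: $unquoted_count words"
echo "quoted:   $quoted_count word ($quoted_value)"

# Clean up the scratch directory.
cd / && rm -rf "$tmpdir"
```

In this setup the unquoted form yields three words, because the pattern matches a.txt, b.txt and the file *.txt itself, while the quoted form yields exactly one.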

How are variables quoted correctly?

There are many shell expansion rules, and numerous contexts in which some of these rules are applied and others are not. The most common ones are:

Variable assignment
In statements like foo=$bar, no quoting is strictly necessary. But it doesn't hurt to quote here, too, because most other contexts require quoting. And for more complex values, like foo="$dirname/$filename", the quotes just make clear that you are operating on strings. That's what strings look like in other languages, too.
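A minimal sketch of this rule, with a made-up value chosen for illustration: in a simple assignment the shell performs neither word splitting nor filename expansion on the right-hand side, so the unquoted and the quoted form store the same string.

```shell
#!/bin/sh
# Sketch: a simple assignment copies the value unchanged, quoted or not.
bar='two  spaces and a *'    # hypothetical value with blanks and a pattern

foo=$bar         # unquoted assignment
quoted="$bar"    # quoted assignment

if [ "$foo" = "$quoted" ]; then
    echo "both assignments hold the same string: $foo"
fi
```

Note that this applies to the assignment itself. As soon as $foo is used unquoted in another context, such as a command argument, the expansion rules apply again.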
for loops

When you have a list of values that you want to iterate over, you could naively use the code above, but that would lead to unexpected results. The following code is a little better, but still not perfect.

touch "foo" "bar" "*"
var=`ls *`
for i in $var; do
    echo "$i"
done

One problem has been eliminated. The quotes around the $i prevent the shell from applying word splitting and filename expansion to its value, so the echo command prints exactly one item per line. But the $var in the for statement is still expanded. It does not help to quote the $var in the for loop, as it would come out as a single string containing all three names instead of three separate words.
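Why quoting $var in the for statement does not help can be seen in a short sketch. The value of var is written out literally here as a stand-in for the command output, so no files are touched.

```shell
#!/bin/sh
# Sketch: a quoted list variable in a for loop is a single item.
var='foo bar *'    # stand-in for the list of names

count=0
for i in "$var"; do
    count=$((count + 1))
    last=$i
done

echo "iterations: $count"
echo "item: $last"
```

The loop body runs once, with the whole string as its only item.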

The easiest work-around for the filename expansion is to switch it off by enabling the -f flag in the shell using set -f. This flag is specified by POSIX and should therefore be quite portable, though I cannot guarantee it for every shell.
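The following sketch shows the effect of set -f under a few assumptions: mktemp -d is available, the list is written out literally instead of being taken from a command, and the scratch directory exists only for the demonstration. It counts the words produced from $var with filename expansion enabled and disabled.

```shell
#!/bin/sh
# Sketch: word count of an unquoted list with and without set -f.
# Runs in a scratch directory (assumes mktemp -d).
tmpdir=$(mktemp -d) || exit 1
cd "$tmpdir" || exit 1
touch foo bar '*'

var='foo bar *'    # hypothetical list of names, written out literally

# Default: the * is expanded to every file in the directory.
set -- $var
with_glob=$#

# With -f the shell still splits into words but leaves the * alone.
set -f
set -- $var
without_glob=$#
set +f             # restore the default behaviour

echo "filename expansion on:  $with_glob words"
echo "filename expansion off: $without_glob words"

# Clean up the scratch directory.
cd / && rm -rf "$tmpdir"
```

With expansion enabled the list grows to five words, since the * matches all three files. With set -f it stays at three. Word splitting on whitespace still takes place under set -f, so this only addresses the filename expansion.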