Permutations and Combinations

Permutations and combinations are used to determine the number of ways in which it's possible to select or arrange a group of objects. For example, we could take a string of letters, say a, b, and c, and ask "how many ways can these letters be arranged?" We could simply do this by writing out the possible ways, like so: abc, acb, bac, bca, cab, cba. In each case, the letters are arranged in a different order, and the total number of different possible ways is 6.

We could make this slightly more complex, and only choose to arrange a smaller number of objects from some larger set. For example, we can ask "how many different ways can we arrange two letters out of a set of five?" Let's say that the five letters are a, b, c, d and e. We are trying to make as many different 2-letter "words" out of these 5 letters. Again, we could just write them out and see what we get: ab, ac, ad, ae, ba, bc, bd, be, ca, cb, cd, ce, da, db, dc, de, ea, eb, ec, ed. So we get 20 possible ways to do this. This is called a permutation.

There is another terms that is also used when talking about ways of selecting objects from a set - combinations. The difference between the two is that in permutations, the order of the objects is important. In the example above, ab and ba are considered two different ways of selecting the letters a and b. However, in combinations, the order doesn't matter, so ab and ba count as just one. Obviously, the number of combinations possible when you take 2 objects out of 5 is smaller than the number of permutations possible, because in combinations, we are ignoring the order in which the objects appear, and combining cases where the order is different into a single case.

Which one of these methods we use depends on the application. In some cases, the order is important, for example, when considering how letters make words. Obviously, "rat" and "tar" and "art" are all different words, although they use the same 3 letters. This is a case where the order of the objects matter, so we would use permutations in such cases. On the other hand, consider the problem where you are asked to pull 2 socks out of a laundry bag, without looking. What are the odds of getting a matched pair? Let's say the bag contains socks of 4 different colors, and we are interested in getting two socks of the same color. In this case, we don't care about the order in which two blue socks were pulled from the bag, so long as both are blue. There is no difference between the sock that goes on the left foot and the right foot, so the order is not important. This would be a case for using combinations.

The number of permutations and combinations increases very quickly as the number of objects is increased. With only a dozen objects, the number of possible ways of arranging them can get so large that it's no longer possible to just list them all. We need formulas to calculate the number of possible arrangements.

The formulas used are:

1. Permutations of n objects, taken r at a time. Remember, permutation means that the order is important. Repetition is not allowed, meaning you can't use the same object twice or more often.

Permutations = n! / (n-r)!

2. Permutations of n objects, taken r at a time. Repetition is allowed.

Permutation = nr

3. Combinations of n objects, taken r at a time. Remember, combination means that order doesn't matter. Repetition is not allowed.

Combination = n! / [ r! (n-r)! ]

4. Combination of n objects, taken r at a time. Repetition is allowed.

Combination = (n + r - 1)! / [ r! (n - 1)! ]

Example

Let's consider the example of password security. We use passwords to protect files, accounts on web sites, etc. Suppose that the password is random (meaning, it doesn't spell out a word or phrase that could be subject to a dictionary attack). In this case, how long would it take to discover the password by simply guessing (this is called a brute force attack, where no other information is available to the attacker, so he is forced to simply try out all possible arrangements of the letters to guess the password)?

First off, we know that in passwords, the order of letters is important, that is, "mypassword" isn't the same as "passwordmy". So this makes it a problem of permutations rather than combinations. Second, we know that we can use the same letter twice or more, for example, "moon" is a perfectly valid password, although it contains the letter "o" twice.

So the applicable formula here is the one shown in (2) above. Now let's figure out what "n" and "r" will be. The first, "n", is the set from which we draw the letters. This set could be all the letters in the English language (26 letters). In this case, "n" would be 26. While 26 may seem like a lot compared to our examples earlier, it's really a very small number.

Consider a password that is 6 letters long, made out of 26 letters of the alphabet. Repetition is allowed. How many possible passwords are there?

Number of possible passwords = nr = 266 = 308,915,776

So the total number of possible passwords is about 309 million. This seems like a huge number, but we need to remember the speed of modern computers. Obviously, the person trying to break the password isn't sitting there, typing in guesses by hand. He will be using a password cracking program, which can generate random passwords and test them at a tremendous rate. So 309 million isn't really such a large number. A good program could easily crack such a password in half an hour, using nothing more than a home computer. Also remember, the hacker doesn't have to go through the whole set of possible passwords to find out your password. What are the odds that out of 309 million possible passwords which he was trying out at random, yours would be the very last that he tried? The odds are not good. On average, you only need to search half the solution space to discover the password. That is, on average, a search of only 155 million passwords would likely turn up yours. This is not very good security.

What happens if we increase the password size to 8 letters?

Number of possible passwords = nr = 268 = 208,827,064,576

Now we have almost 209 billion possible passwords. This is a much harder problem to crack. There are 676 as many possible permutations with 8 letters as there are with 6. So if we assume half an hour to crack a 6 letter password, an 8 letter password would take 338 hours, or over 14 days to crack. This means that someone would have to be serious about cracking your password to do a brute force search and discover it.

This is why most people recommend that you use at least an 8-letter password, if not more. You have to remember, a single computer might take 14 days, but these days computers are cheap, and there is little to stop anyone from building a password cracking farm with 14 machines, or even a couple hundred machines. And computers get faster every couple of years, so there is not much future proofing in this.

What else can we do to increase security? One thing that works really well, but doesn't really depend upon us, is whether we can use more than 26 letters. This depends upon the people who built the password system. Does the system distinguish between uppercase and lowercase letters? If it does, you've immediately doubled the set from which you draw letters, from 26 to 52. In such a system, "abnqrhse" is not the same as "abnqRhse". Capitalizing one letter made it a completely different password, and now the hacker has to search through a much larger solution space. How much larger? For the same 8-letter password:

Number of possible passwords = nr = 528 = 53,459,728,531,456

So now simply by allowing for uppercase letters, the 8-letter password could be any of over 53 trillion possibilities. This is much better than the previous result. Fortunately, most password systems these days do in fact distinguish between uppercase and lowercase letters. However, don't take this for granted. Make sure that the description of the password system says "your password is case sensitive". Otherwise, you never know. The system could simply be taking all letters, whether you type them in lowercase or uppercase, and be storing them as uppercase.

We can use the same principle to further increase the number of permutations by including numbers, and even symbols such as &, $, @, or punctuation marks, such as commas and stops. Of course, the password system has to allow for these. Most password systems these days in fact do allow for them.

How large is the search space if we include such numbers and symbols. There are a few different ways to go about figuring that. You could look up the ASCII symbol set and see what special characters are available, and count them. However, for practical purposes, the number of symbols is limited to what a person can directly type by pressing a key on their keyboard. If I count all the keys available on my keyboard that result in a letter, number or character on the screen when pressed, it comes to 47 (US English Keyboard). Each of these keys actually codes for two possible characters, depending upon whether or not I pressed the Shift key first. So that's a total of 47 times 2, or 94 possible letters and characters that are available to me with a couple of keypresses.

Using a set of 94 characters, an 8-letter password could have:

Number of possible passwords = nr = 948 = 6,095,689,385,410,816

or about 6 quadrillion possiblities. This is now a really large number, probably beyond the reach of ordinary, casual hackers. That's not to say it's uncrackable. Large industry or governments may well have the means to crack it, but you are probably safe from garden variety crooks. Of course, if you want even more security, the easy solution is to make your password longer. Going to 10 or 12 characters makes it much harder. At about 15 letters, you have probably future proofed it beyond reach of governments. At this point, it's much cheaper and easier to get your password some other way (such as by installing hidden cameras in your home, keyloggers on your computer, or simply torturing you until you reveal it) rather than try to brute force crack it.

For much more about these calculations and password security, read this.