Password Security

Since a lot of us are involved in online transactions these days, I started thinking about the issue of password security. What makes a password secure? How long should the password be, what kind of characters should it use?

The idea behind password security is that if your password can't be guessed, then it must be brute-force cracked. By "guessed" I mean anything that requires fewer steps than a brute force attack. For example, if your password is your first or last name, your spouse's name, your dog's name - then it is guessable. If it's your date of birth, your social security number, your child's birthday, then it is guessable. Further, if it's any word that can be found in a dictionary, then it's guessable.

The difference between guessable and not-guessable is this. Let's say a password consists of 8 characters. If there is no way to guess what those 8 characters might be, and in what order they appear, then the problem of cracking it becomes a problem of brute-forcing it. Which means looking at all possible permutations of legal characters that can be arranged in a string 8 characters long.

Suppose that legal characters (meaning, characters allowed by the passwording system) include the lowercase alphabet. The lowercase alphabet consists of 26 letters. The number of possible ways to arrange 8 letters out of a set of 26 letters (allowing for repetition of letters) is passwords_1.gif, where r is the range of legal characters (in this case 26, the lowercase alphabet), and n is the number of characters in the password (in this case 8).

In[1]:=

passwords_2.gif

Out[1]=

passwords_3.gif

So there are almost 209 billion ways in which you can make up a unique string of 8 characters, using only the lowercase alphabet.

This set of all possible ways in which to make unique 8 character strings out of the 26 lowercase alphabetic characters is called the keyspace. So you can say that for a password that is 8 characters long, and that is made only out of lowercase alphabet characters, the size of the keyspace is about 209 billion.

What this means is that if the password cracker has no way to guess your password, he must try 209 billion combinations in order to guarantee that he will crack your password. This is a brute force search.

In reality, on average, the password cracker won't need to search the entire keyspace. After all, what are the odds that he has to attempt every single last combination before he hits the right one? He could luck out and hit the right combination sooner, purely by chance. Taking that into account, we can say that the keyspace that needs to be searched is, on average, half the entire keyspace.

So if the keyspace in this case was 209 billion, a password cracker would, on average, need to search through half of that (104.5 billion) to discover your password.

Before we move on to calculations about password security, let's reiterate a few points:

1. If your password is a dictionary word (or words), or if it is some guessable number connected to you (such as birthdays, social security numbers, drivers license number, etc.) then you have no security and all bets are off. None of what follows below applies to you, you have already given your password away.

2. A good passwording system will have safeguards on password-attempts-per-second. All brute force attacks require using a computer that searches the keyspace by presenting possible passwords at a very high rate. For example, if you want to brute force a keyspace of 100 billion passwords, and you want to do it quickly, you must check passwords at a rate of millions per second. A good passwording system will reject such high rates, because obviously, no human is capable of typing millions of passwords per second. If the system is presented with passwords at the rate of millions per second, it's clear that it must be a computer and not a human which is doing the presenting, and a computer usually means someone is trying to crack passwords. A good passwording system will reject such attempts and simply lock the account down after the first few attempts.

However, the passwording system is usually not under user control. It's on your bank's web site, or somewhere on the internet. You have no way to know how good it is, or whether it will reject obvious cracking attempts. Further, many passwords are not on remote computers, but might be on your own home computer. For example, your Windows password. Or any encryption password you use on your system. In this case, it's always possible for someone to steal your entire computer (happens a lot with notebooks), or to install a virus or trojan on your computer. In this case, since the attacker has access to the actual programs that do password checking, he has the opportunity to tamper with them.

So there are many pitfalls between having a good password and having a good passwording system. The first is under your control, the second less so. Even with the best passwords, weakness in the passwording systems can break your security. Keep that in mind.

What's a secure password?

So with that out of the way, let's look at what makes a password secure. You have two variables to work with: password length and password characters.

Password characters means the set of legal characters which can be used to type out a password. For example:

lowercase alphabet - 26 characters
lowercase and uppercase alphabet - 52 characters
lowercase and uppercase alphabet and numerals (alphanumeric) - 62 characters
alphanumeric + all type-able symbols on the 101 keyboard - 94 characters
all unicode characters - over 107,000

For passwords, these sets are limited by issues of practicality. For example, if typing a unicode character means pressing the Alt/Command key and then typing out a 3 letter code on your numeric keyboard, for each character, almost no one will bother to do it. It's too much trouble. For practical purposes, we can assume that the maximum number of possible characters is whatever you can directly type on the usual US-stype 101 keyboard. That is 94 characters, including lowercase and uppercase alphabetical characters, numbers, and symbols.

It's also unlikely that the passwording system will allow the full unicode range of characters (though Windows is a notable exception, it does allow the full range). So for these calculations, we'll assume that the range of characters in the password is limited to whatever you can directly type on your keyboard, with or without the Shift key. That's 94 characters.

So the problem boils down to how to use those 94 possible characters to build a secure password.

How big should the keyspace be?

The short answer is, big enough to make brute force search impractical. This is a moving target, because computers keep getting better every year, and so the speed of password cracking hardware/programs keeps improving.

Secondly, you have to consider - who are you protecting your password from? From your room mate or spouse? From the average script kiddie on the internet? From professional crooks who steal passwords for a living? From a dedicated hacker? From a government? Which government?

The answer determines your security needs. Even trivial passwords can defeat the average person. A script kiddie might have password cracking tools, and the resources of his own computer, so he'll likely break any password based on dictionary words, or small passwords in general. A professional crook will be even better, but not a whole lot better. Professional crooks don't care too much about you - if your password is too hard, they'll just move on to the next guy whose password is easier.

A dedicated hacker who happens to hate you will try much harder and have better tools. A government will have many, many more resources. In fact, a government probably doesn't even have to crack your password, they can bug your home, steal your computer, do a whole range of other things that are easier than cracking passwords. These things, of course, are a whole different matter, so let's return to the issue of how secure must a password be in order to make brute force search of the keyspace impractical?

Here's an example of what people have access to today: Elcomsoft released a system based on CUDA (GPU based computing) that can check a billion passwords per second. This means that our 8 character password with its 104 billion (average) keyspace, can be cracked in a little over a minute. By the time you read this, the number may well have jumped to 2 billion passwords a second, or perhaps 10 billion. A few years from now, it will definitely be an order of magnitude higher. And that's just a system that anyone could buy, with not much cash.

If you're thinking that a rival corporation might be the one cracking your password, or a government, you can easily factor in resources a thousand or even a million times higher. Don't think of a billion passwords a second - think of a trillion, or more. That would be closer to reality, though of course, it's hard to say exactly how close.

With these numbers in mind, let's see how increasing password length deals with it.

In[2]:=

passwords_4.gif

Out[2]=

passwords_5.gif

The graph above shows how keyspace (y-axis) increases with length of the password. The y-axis is logarithmic, and the password is based on the 26 lowercase letters of the alphabet. As we can see, the keyspace increases dramatically with password length, and at 10 characters, has already exceeded the trillion mark (passwords_6.gif). So there is a tremendous increase in keyspace simply by increasing the password length.

Now let's examine the effect of password complexity. By "complexity", I mean the range of characters used to represent the password. So by my definition, a password based on the lowercase alphabet (26 letters) is less complex than one based on both lowercase and uppercase (52 characters), which is in turn less complex than one based on the full range of type-able characters (94 characters).

In[3]:=

passwords_7.gif

Out[3]=

passwords_8.gif

The above plot shows the keyspace of an 8-character password, as its complexity increases from 26 (lowercase alphabet only) to 94 (all type-able characters on a 101 keyboard). Again, the y-axis is logarithmic, and represents keyspace.

As we can see here, complexity doesn't seem to increase keyspace quite as fast as password length. The graph starts to flatten out. Also note the difference in the x-axis. In the previous graph, we only had to increase keylength from 1 to 16 to see such fantastically large keyspace numbers. Here in this second graph, we're increasing complexity from 26 to 94, which is a much larger increase, but we're still not seeing the very large keyspaces.

Let's take this farther. What if we had some really dedicated person who actually used the range of unicode characters to type a password?

In[4]:=

passwords_9.gif

Out[4]=

passwords_10.gif

Now we're seeing some big numbers for keyspace, but to see these big numbers, we have to increase complexity to 100,000+. That's a tremendous increase, and while it does increase keyspace by a fair amount, you could increase keyspace by similar amounts with only a small increase in password length.

In[5]:=

passwords_11.gif

Out[5]=

passwords_12.gif

As we can see in the above graph, if we stick to our low complexity model (just the 26 lowercase letters), we can see the same large keyspaces just by increasing password length to 30.

So increasing password length from say 8 to 30, has the same effect on keyspace as does increasing complexity from 26 to 100,000. From the point of view of mathematics alone, it's obvious that keyspace increases with password length much faster than it does with complexity.

Getting back to the question of how big the keyspace should be:

- Current cheap hardware - cracks a billion passwords per second
- Current expensive hardware (owned by a large corporation) - cracks a trillion passwords per second
- Current government level hardware - cracks a quadrillion passwords per second? Who knows.

Let's say that if a password takes 100 years to crack, it's impractical to crack it, because you'll be dead by then and won't care if it's cracked at that point anyway.

In[6]:=

passwords_13.gif

Out[6]=

passwords_14.gif

We're taking a unit, passwords_15.gif, which we calculate as the size of a keyspace checked per year by a large government with pretty much unlimited resources. We calculated that number by simply multiplying our best guess of their highest possible keyspace checking speed (one quadrillion combinations per second), times the number of seconds in a year.

In[7]:=

passwords_16.gif

Out[7]=

passwords_17.gif

Multiply that by 100, and you get the keyspace they could cover in 100 years, which was our predefined limit of practicality. We've then divided that by 2, because on average, you only need to search half the keyspace to discover the password. This gives us "statictarget", which is the keyspace you need to be secure for 100 years against brute force attacks.

However, consider that password-cracking speed will increase with technology. Moore's Law says that transistor counts double every couple years. Let's assume that this means computation speed also doubles every two years. Then, let's be even more aggressive and say it doubles every year, not every 2 years (because we want some margin of safety, we don't want to live on the edge).

So next year, they'll be cracking passwords twice as fast, and the year after that, 4x as fast, and so on.

In[8]:=

passwords_18.gif

Out[8]=

passwords_19.gif

What we did above is a simple calculation to see the effect of this increasing technology. We took the fastest possible password cracking system today, and operated it at this speed for 100 years. But next year, technology improves, so we added another system, twice as fast, and operated it for 100 - 1 = 99 years, working on the same problem. The year after that, we added another system, 4 times as fast as the original, and operated it for 100 - 2 = 98 years. And so on, doubling speed every year, for the remainder of 100 years.

After 100 years, these systems will have brute force searched a keyspace of 2.02 * passwords_21.gif So if you want to be secure against technology of this sort for a period of 100 years, you need a password with a keyspace that size, or larger. To be more accurate, we should divide this keyspace by 2 as well, since you need to search only half the keyspace on average to brute force the password:

In[9]:=

passwords_22.gif

Out[9]=

passwords_23.gif

So the keyspace you want for your password is about passwords_24.gif, in order to be safe against cracking attempts for the next 100 years, based on our current guesses about technology.

Let's see what we need to do to attain this.

In[10]:=

passwords_25.gif

Out[10]=

passwords_26.gif

We see that if we use only the base character set of 26 lowercase letters, we need a password length of about 40 to exceed the requirements.

In[11]:=

passwords_27.gif

Out[11]//ScientificForm=

passwords_28.gif

At a password length of 40, we have a keyspace of 3.9 * passwords_29.gif, which exceeds the capability of any known cracking system for the next 100 years, even given that technology increases by a factor of 2 every year.

A 40-character password is pretty long and difficult to remember. Let's see how much we can decrease password length, by increasing complexity of the password. Again, the target is to reach that passwords_30.gif keyspace.

In[12]:=

passwords_31.gif

Out[12]=

passwords_32.gif

Adding uppercase to our password (increasing complexity from 26 to 52) allows us to use shorter passwords to get the same keyspace. The critical point appears to be at a password length of 33.

In[13]:=

passwords_33.gif

Out[13]//ScientificForm=

passwords_34.gif

This exceeds the required keyspace, so a 33 character password built from lowercase plus uppercase letters is sufficient. By using uppercase in addition to lowercase, we have decreased the required password length from 40 to 33, which is a significant savings.

Now let's see what happens if we include numerals in the mix, expanding complexity from 52 to 62.

In[14]:=

passwords_35.gif

Out[14]=

passwords_36.gif

This only allows us to decrease the password length by one, from 33 down to 32:

In[15]:=

passwords_37.gif

Out[15]//ScientificForm=

passwords_38.gif

If you go down any further, say decrease length one more to 31, you get:

In[16]:=

passwords_39.gif

Out[16]//ScientificForm=

passwords_40.gif

This number is smaller than required, so we have to stick to 32. This is what I was talking about earlier, when I said that the effect on keyspace of increasing the password length is much greater than simply the effect of increasing complexity. We have increased complexity by 10 in this example, adding the numerals 0, 1, 2, ..... 9 to our lowercase and uppercase alphabet, but were only able to decrease password size by 1 as a result.

Let's move on and add all type-able characters on the 101 keyboard, which include (in addition to alphanumerics) the characters ,./;'[]\-=`~!@#$%^&*()_+{}|:"<>?  which increases complexity to 94. Let's see what happens then.

In[17]:=

passwords_41.gif

Out[17]=

passwords_42.gif

By adding the set of type-able characters, we have further decreased the length of the password by 3. Now a password of length 29 is required for the defined level of security.

In[18]:=

passwords_43.gif

Out[18]//ScientificForm=

passwords_44.gif

We can't decrease it any more. See what happens if we go down to a password length of 28:

In[19]:=

passwords_45.gif

Out[19]//ScientificForm=

passwords_46.gif

This number is lower than our required keyspace of passwords_47.gif, so we can see that with a range of 94 characters, the smallest password size needed for the required level of security is 29.

We went from a password length of 40 characters for lowercase letters only, to 29 characters, using lowercase plus uppercase plus numerals plus type-able characters. I think a savings of 11 characters is significant, and probably makes it worth it, to use the whole 94 character set.

Practical Limits

This has been mostly theory, from a math perspective. In reality, several factors may affect the results:

How many characters does the password system allow? If it doesn't allow 40, you're out of luck. In that case, I would increase complexity by adding uppercase/numbers/symbols, providing the password system allows those. Be careful that some password systems are poorly designed, and while they allow you to type passwords of any length, internally they truncate passwords at a certain length and ignore the rest of the characters.

Password security is only one layer, there is the matter of the security of the underlying encryption algorithm. If the system uses a bad encryption algorithm, or a bad implementation of a good encryption algorithm, you have a weakness that the best passwords can't overcome. There are plenty of encryption algorithms out there which have received good reviews - AES, Blowfish, Serpent, Twofish, etc. For best security, you need a good implementation of a good algorithm, using a large key size. Preferably one that has been published and examined by experts for several years, with no weaknesses uncovered.

Human errors may simplify password cracking. For example, someone may increase complexity by adding uppercase letters to their password. However, if you only capitalize the first letter of your password, you don't really have the advantage of adding the full complexity of the 52 character set. To use the full complexity, you need to capitalize random letters in your password. Similarly, someone may add complexity by including symbols from their keyboard. But if out of laziness they limit themselves to only those symbols you can type without pressing the Shift key, you've not really increased complexity as much as the math would indicate. Another human error which is too trivial to relate, but which I'll mention anyway, is to use a dictionary word or words as part of your password. If you do that, you may as well paste your password on Facebook. You have no security. Same with using any information that is connectable to you, such as a social security number, birth date, etc.

Finally, we can only talk about technology which exists. Quantum computing is in the works, and there have been successful demonstrations of quantum computers using a small number of qubits. If technology progresses in that direction, and computers with hundreds of qubits are invented, all bets are off.

What to do?

So here's some general advice on picking passwords.

First, never pick a word that could be found in a dictionary. Never pick your name, your child's name, any name. Any noun. Any number that's not random, meaning, any number that actually represents something, such as your ID, your address, your date of birth, whatever.

Second, examine the passwording system. Many passwording systems are poorly designed and don't give you sufficient information. But that doesn't excuse not reading what information is there. Does it give you a password length limit (for example, does it say something like "valid passwords are 8 - 20 characters in length"). Pick a password at the upper extent of that limit, within the bounds described above. Does it allow symbols? It may say it does. If it doesn't say, use them anyway. Worst that can happen is the system will say "sorry, that's not a valid password" and you may have to retype without symbols. Adding complexity is specially important when the password length is limited to some small number by the system.

Third, if you're on the internet, pay attention to encryption. If you're typing in passwords, you need to be on an encrypted connection to the web site. This means an SSL (secure socket layer) connection, which is indicated by the fact that the URL of the page starts with https:// rather than http://. If the page isn't encrypted, you already know the security at the site is lax. Don't entrust any important personal information to such sites.

Fourth, try to pick a password that is as random as possible. Don't be lazy with your hands. Don't pick keys that are all in one little corner of the keyboard. Don't be afraid of using the Shift key to use additional characters in the upper row.

Finally, exercise some discretion based on purpose. Obviously, a password for some throwaway site on the net which you may never access again doesn't need to be as secure as the password for your bank, or credit card. Spend as much effort as the task deserves and requires.

Practical Hints

In the end, the whole discussion of complexity and password size and all the calculations deriving from it depend on one thing - that your password is random. Creating a random password is not as easy as it would appear - we all have biases and habits, which tend to introduce order and lessen the randomness. Here are some hints on how to go about it.

I've found that the only way to remember complex passwords is through mnemonics. One way is by using first letters from a memorable phrase, for example:

      but I would not feel so all alone
    everybody must get stoned


These lines (from a Dylan song) can be converted to a password by picking the first letter of each word: biwnfsaaemgs. This is fairly random, and therefore good. You can make it even more random by:

- not picking a phrase from a popular song, just make up your own phrase
- intentionally mis-spell something. Misheard lyrics are a good example
- using a foreign language, if you know one

If you pick a phrase from a foreign language, but write the words in English using the Roman alphabet, you've improved your password. I would highly recommend this for anyone who knows a foreign language.

The next step is to increase complexity, by adding capitalization, numbers, symbols. Of course, each time you add complexity, your password becomes harder to remember, though it also becomes stronger. To ease the process of remembering, you can do stuff like:

- use numbers for alphabetic letters, using a simple substitution cipher
- use punctuation where it makes sense to you

For example, consider the password we derived from the Dylan lyrics above: biwnfsaaemgs. I could make a simple rule that says replace vowels with numbers (A = 1, B = 2, C = 3, D = 4 ...... Z = 26). The password has 3 vowels - i, a and e. If I replaced those with numbers, I would have b9wnfs115mgs. All I have done here is to replace the i with a 9, the two a's with 1's and the e with a 5, according to the substitution rules described above. This increases complexity and randomness.

Similarly, I could make a rule to always capitalize the 3rd and the 7th letters, unless either of those is a number. So password would now be b9Wnfs115mgs (note that the 7th character is a number, so it didn't get capitalized).

Such rules can help you remember a password, or help you figure it out if you have forgotten. However, even with such rules and mnemonics, there is a limit. You can't be expected to remember 20 different passwords for 20 different bank/creditcard/internet sites, even with mnemonics. But DON'T make the mistake of avoiding the problem by choosing the same password for all 20 sites. The reason is, any given site may have poor security. Some hacker may break into it, obtain your password, then publish it on the web. Now your password has been compromised on 20 sites instead of just 1.

Instead, use password files. One way to do this is as follows.

1. Pick 1 password, just 1, that is long, complex, secure, but easy for you to remember, with mnemonics. This is your master password.

2. Create a plain text file. Every time you need a new password for something (new bank, new credit card, whatever), make up a random password for it on the spot, by randomly pressing lowercase, uppercase, numerals, symbols. Try to make it at least 15 characters long. 20 is better, if the web site allows it. Don't try to make one that's easy to remember, you won't need to remember it. Just copy it down in the text file, one line for each password:

   ABC Bank    login: mynamehere    password: shfn7hGs%ksjsKsnpd    website: www.abcbank.com
   
  You can keep adding lines for each site at which you have a password.
  
3. Encrypt this text file using your master password. Store it in an encrypted form only, NEVER decrypted. You decrypt only when you need to look up your password for a given site.

There are a few things to consider. First, what kind of computer are you using? If it's a desktop that stays at home, chances are it's pretty secure. If it's a notebook that you carry around, you have to plan for it being lost or stolen at any time. Which means, extra care on not storing any decrypted information, making sure you have backups of the encrypted file on an online storage service or on your home computer, etc.

Second, how do you encrypt the text file? There are many password protection programs out there which do exactly this. They use a master password to store an encrypted file, and that encrypted file contains all your other passwords. They are very handy, with helpful features such as copy and paste directly into your browser, etc. Some of them are good too, in that they don't make temporary or backup files, show the decrypted information only on screen (never making a physical copy of it on disk), wiping any traces of their operation.

However, I don't trust any of them. The problem is that by definition, your security is only as good as the weakest link. Why should I pick some program to manage my security, when I have no idea who wrote that program, how honest he was, how competent he was, whether he deliberately built in back doors, whether he made mistakes, etc.

Instead, since I have to rely on an encryption program anyway, I pick a program that has had massive public attention. A program for which the full code is available for anyone to look at. A program popular enough to ensure that thousands of people HAVE been looking at its code. A program old enough that they've been looking at its code for years. And STILL haven't found any flaws. In this sense, security through obscurity is a sham. The only security comes from building/using a program that is good enough that you can afford to have millions of people looking for holes in it, and yet unable to find holes. But even that is more an ideal than reality. The reality is having people looking for holes, finding them, and the programmer fixing those holes on a regular basis.

By these criteria, I would pick a few programs in particular - the original PGP (which is now commercial software), and its free version GPG. And more recently, TrueCrypt. These are all programs with fully published sources, that have been under scrutiny for a long time, and still survived that scrutiny. If you really care about security, get one of them, spend some time learning it. Then use it.

Spikey Created with Wolfram Mathematica 7.0