We are working with a security policy that treats two passwords of equivalent strength:

  • 8 character password with two character sets represented (pick two of upper/lower/number/symbol)
  • 6 character password with three character sets represented (pick three of upper/lower/number/symbol)

The question arises, how equivalent (or not) are they? Well, it’s time to do some math.

Total Possible Passwords

One way to measure password strength is in the total number of passwords that one might be able to generate that meet that criteria. More would be better. There are 26 uppercase, 26 lowercase, 10 digit, and 33 ASCII-printable symbols available on the average keyboard (totaling 95 options). If we simply asked how many possible 6 character passwords are there, you can multiply 95 for each character that your password is long (i.e. 95 to the 6th power):

Total possible passwords from 95 printable character choices
six characters? 95 x 95 x 95 x 95 x 95 x 95 = 735 billion +
eight characters? 95 x 95 x 95 x 95 x 95 x 95 x 95 x 95 = 6.6 quadrillion +

You can see that these aren’t equal – the longer password is clearly stronger. However, there is no accounting for the different character sets that are required – maybe that will make a difference? Read on for more of the math behind passwords.

The funny thing is that since our policy requires that you have *at least* one character from each of certain character sets, it actually forces the total number of passwords to be lower than you expect. Our 6 character password needs to have at least one character from each of 3 character sets. Let’s assume we leave digits out and stick only to upper, lower, and symbols because they provide more complexity. Again remember our 6 character password must include at least one of 3 character sets, and our 8 character password only must include two.

Total possible passwords where  characters must be from certain sets (largest)
six characters? 26 x 26 x 33 x 95 x 95 x 95 = 19 billion +

eight characters? 26 x 33 x 95 x 95 x 95 x 95 x 95 x 95 = 630 trillion +

This is entirely dependent upon the character sets we choose, though. Let’s go for the minimums this time, using only letters and digits.

Total possible passwords where  characters must be from certain sets (smallest)
six characters? 26 x 26 x 10 x 95 x 95 x 95 = 5.7 billion +

eight characters? 26 x 10 x 95 x 95 x 95 x 95 x 95 x 95 = 191 trillion +

And now to really look at the minimums, let’s assume that each password ONLY contains the character sets that it is required to have. Again, it will depend on the sets we choose – below I’ve listed the smallest and largest possible sets for each.

Total possible passwords where certain characters are ONLY from certain sets
six characters – upper,lower,number 26 x 26 x 10 x 36 x 36 x 36 = 315 million +

six characters – upper,lower,symbol 26 x 26 x 33 x 85 x 85 x 85 = 13 billion +
eight characters – upper,number 26 x 10 x 36 x 36 x 36 x 36 x 36 x 36 = 565 billion +
eight characters – upper,symbol 26 x 33 x 85 x 85 x 85 x 85 x 85 x 85 = 323 trillion +

So the interesting thing is that by forcing passwords to be complex by requiring certain character sets to appear, you actually reduce the number of total possible combinations a single password must have. I’m honestly not sure what impact this might have on the effectiveness of password-guessing or password-hacking tools – but the math indicates that there are fewer options available.

Edit 5pm. Did not consider the fact that order matters (mostly) in your password. At least in the cases where you need to choose characters from certain sets. We need to consider how many different ways can the characters from the chosen sets can be added to the password. This will also help close the gap for the 6 character 3 character set password, but by a pretty insignificant amount. 

Think of the simple case of your password needing to have an ‘A’, a ‘B’, and a ‘C’ in it. There isn’t just one choice, there are 6. In order to include this permutation, we need to multiply our final result by (number of characters in the password factorial / (number of characters in the password – number of required sets) factorial). So if the number of characters in our password is N, and the number of required sets is S, we need to multiply our final answer by (N!/(N-S)!). In the case above we’d get (3!/(3-3)!) = 6 (since 0! = 1). So our new results have increased the total complexity significantly. Now I’m going to stick with just the best and worst possible cases:

Total possible passwords including permutations
6 chars 3 sets – worst: 26 x 26 x 10 x 36 x 36 x 36 x (6!/(6-3)!) = 37 billion +

6 chars 3 sets – best: 26 x 26 x 33 x 95 x 95 x 95 x (6!/(6-3)!)= 2.2 trillion +
8 chars 2 sets – worst: 26 x 10 x 36 x 36 x 36 x 36 x 36 x 36 x (8!/(8-2)!) = 31 trillion +
8 chars 2 sets – best: 26 x 33 x 85 x 85 x 85 x 85 x 85 x 85 x (8!/(8-2)!) = 18 quadrillion +

Password Entropy

Another common method of determining the complexity of a password (from the Wikipedia article on password strength) is to calculate the number of bits of information entropy each password generates. Per the Wikipedia article, “The strength of a random password as measured by the information entropy is just the base-2 logarithm or log2 of the number of possible passwords, assuming each symbol in the password is produced independently.” Entropy is a way of measuring the difficulty in brute-force attacks – so a password with 35 bits of entropy would require 235 attempts in order to successfully attempt all the possible passwords. So let’s take the worst-case numbers above and calculate their entropy:

Information Entropy of password
six characters, minimum w/3 character sets
– log2 ( 315 million + ) = 28.23 bits (4.7 bits/character)
– log2 ( 37 billion + ) = 35.14 bits (5.8 bits/character)
six characters, maximum w/3 character sets
– log2 ( 19 billion + ) = 34.15 bits (5.7 bits/character)
– log2 ( 2.2 trillion + ) = 41.06 bits (6.8 bits/character)
eight characters, minimum w/2 character sets
– log2 ( 565 billion + ) = 39.04 bits (4.9 bits/character)
– log2 ( 31 trillion + ) = 44.85 bits (5.6 bits/character)
eight characters, maximum w/2 character sets
– log2 ( 630 trillion + ) = 49.16 bits (6.1 bits/character)
– log2 ( 18 quadrillion + ) = 54.01 bits (6.6 bits/character)

Again, looking at the math, it doesn’t seem like this can be right. The password which requires characters from more character sets has a lower entropy than the password which has a less restrictive requirement. It is right, however. Actually, it’s now been made correct because of the permutation math. The 3 factors of complexity added by requiring an extra character set to be present has actually resulted in a very slight increase in the entropy per character – 0.2 more per character. The point remains the same though: By restricting certain characters to a character set, you reduce the entropy provided by that password. And again, this is only “assuming each symbol in the password is produced independently”.

NIST Guidance on Password Entropy

It turns out, people don’t produce passwords, or characters within passwords, independently. We’re kind of notoriously bad at it. This is how magicians get people to pick a number, or why whenever they release the most commonly cracked passwords, the word password always tops on the list. NIST produced a special publication 800-63 which provides guidance on the entropy provided by user-supplied passwords. You can read the guidance in Appendix A-2 of the document. However, let’s compare the entropy we calculated above with the guidance provided by NIST:

NIST Guidance on Information Entropy of user-selected password
six characters w/3 character sets = 14 bits (20 bits if symbols are used as a character set)

eight characters, w/2 character sets = 18 bits (24 bits if symbols and letters are used as the character sets)

Well those are ridiculous! You are suggesting that you can crack my 6 character password in 16,384 attempts if I don’t use symbols? Well that’s not realistic either. It turns out that some people way smarter than me agree: the NIST guidance is broken. The truth lies somewhere in between the entropy numbers I provided, and the NIST guidance which tries to reduce entropy based on the imperfections of humans.

Final Words

In conclusion, both of these are bad password policies. As XKCD pointed out in 2011complex passwords are harder to remember than long passwords, and they provide less entropy. Your best password policy these days is to encourage the use of long passwords – at least 12 characters long.

 

Thanks to Walt Turnes for pointing out errors in this. And shout-out to the Web 2.0 scientific calculator which is pretty useful.

 

11 thoughts on “The math behind passwords

  1. Walt Turnes says:

    Headline in 2016:
    “TheQuckBrownFoxJumpedOverTheLazyDog judged as the most insecure password in the wild”

  2. Walt Turnes says:

    This post and subsequent edits are a case study in why you should trust but verify policy math, and also why amateurs shouldn’t try to implement cryptography.

  3. am says:

    “Total possible passwords where characters must be from certain sets (smallest)
    six characters?
    26 x 26 x 10 x 95 x 95 x 95 = 5.7 billion +”

    Why not:
    10 x 10 x 10 x 10 x 26 x 26? (numbers, upper and lower)

  4. pmhesse says:

    @am: The “Total possible passwords where certain characters are ONLY from certain sets” chart is more like what you’re talking about. The rule is only “at least 1 number, upper, and lower” so you can’t restrict the character space for the other 3 characters. In that example I assumed the other 3 characters were from two of the three smaller sets. I suppose it would actually be more accurate say those could only be restricted to 62 characters (upper, lower, and number).

  5. am says:

    i think there is also another problem:
    you choose a particular example with A,B,C. In this particular case the number of permutations is equal to the number of arrangements.

    What you actually need is permutations not arrangements. an you have used bellow the formula of arrangements for multiplication (N!/(N-S)!). Instead you should have just used the formula for permutations which means you should have multiplied with N!

Comments are closed.