How Do Websites Like Facebook Store Your Password? An Easy Explanation


After building a few websites, I learned a very interesting fact that I had always been clueless about.

Password-protected websites do not need to store your actual password, and Facebook, for one, did not. For the nontechnical part of the audience, let me try to explain how this actually works.

A well-designed password system uses something called a "hash function." (Hash as in hash browns, not hashish.)

Think of a hash function as a machine. It has a hopper on the top and you throw in the password, and a moment later, something completely incomprehensible called a "hash" is spit out.

For example, if you feed into the machine "p4ssw0rd", it might spit out "2a9d119d". 

Now what on earth is the point of a machine that takes a perfectly good password and spits out a bunch of garbled nonsense?

The trick is that, given the same password, it will always spit out the same garbled nonsense. If you feed in "p4ssw0rd" it always spits out "2a9d119d". If you feed in "qwerty" it always spits out "d8578edf".
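If you want to see this for yourself, here is a minimal sketch in Python. It uses SHA-256 from the standard library purely as an example of a hash function; the short hashes like "2a9d119d" in this post are made up for readability, and real hashes are much longer. The name toy_hash is my own.

```python
import hashlib

def toy_hash(password: str) -> str:
    """Return the SHA-256 hash of a password as hexadecimal text."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

first = toy_hash("p4ssw0rd")
second = toy_hash("p4ssw0rd")

print(first)            # a long string of garbled nonsense
print(first == second)  # True: same input, same output, every time
```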

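And the hashing machine gets more amazing. When you give it different inputs, you are almost guaranteed to get different outputs. (The mathematicians among you may notice that this cannot be strictly guaranteed: a real hash function produces output of a fixed length, so some different inputs must share a hash. In practice the output is long, usually longer than any password you would type, and a good hash function makes such collisions practically impossible to find.)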
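Here is a small sketch of that behavior, again assuming SHA-256 as the example hash function (the helper name digest is mine). Changing a single character gives a completely different hash, and the output has the same length no matter how long the input is.

```python
import hashlib

def digest(text: str) -> str:
    """Hash any text with SHA-256 and return it as hexadecimal."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

a = digest("p4ssw0rd")
b = digest("p4ssw0re")   # only the last character differs
c = digest("x" * 1000)   # a much longer input

print(a != b)                  # True: different inputs, different outputs
print(len(a), len(b), len(c))  # 64 64 64: the length never changes
```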

Now when somebody sets up a password, you don't just store that password on disk. You first feed it through the machine, and you store the hash. For instance, if somebody sets up "p4ssw0rd" as their password, you would store "2a9d119d".

Now when that person wants to log in, they type in "p4ssw0rd", and we need to see if that is the correct password. So we feed it through the same machine, and because the machine always spits out the same output if you give it the same input, it's going to again give us "2a9d119d". That matches the hash stored on disk, so the system lets the person in.
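The sign-up and log-in steps can be sketched like this. Everything here is an assumption for illustration: the dictionary stands in for a real database, the function names are mine, and plain SHA-256 mirrors the simplified picture in this post rather than what a production site should use.

```python
import hashlib
import hmac

# Hypothetical stand-in for the user table: username -> stored hash.
stored_hashes = {}

def hash_password(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def register(username: str, password: str) -> None:
    # Only the hash is stored; the password itself is thrown away.
    stored_hashes[username] = hash_password(password)

def login(username: str, password: str) -> bool:
    stored = stored_hashes.get(username)
    if stored is None:
        return False
    # Hash the attempt and compare it with what is on disk.
    return hmac.compare_digest(stored, hash_password(password))

register("alice", "p4ssw0rd")
print(login("alice", "p4ssw0rd"))  # True
print(login("alice", "qwerty"))    # False
```

Notice that nothing in stored_hashes contains the text "p4ssw0rd".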

But suppose that you have forgotten the password and you call a system administrator and ask them to look up the password in the system. You're out of luck! The only thing stored in the system is "2a9d119d", which by itself is completely useless to you.

What you would need is an unhashing machine that takes in "2a9d119d" and gives you back "p4ssw0rd". But one of the amazing features of a good hashing machine is that it is extremely hard to build such an unhashing machine. Basically, the only way to do it is to simply feed all possible passwords into the hashing machine until "2a9d119d" comes out. But there are a lot of possible passwords, and that would take a long time.
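To make the "feed in every possible password" idea concrete, here is a sketch of that guessing loop. The function names and the tiny guess list are my own, and SHA-256 is again just the example hash. With only four guesses it finishes instantly; the last line shows how many guesses there are for 8 characters drawn from lowercase letters and digits.

```python
import hashlib

def hash_password(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def crack(target_hash, candidates):
    """Hash each candidate until one matches the target hash."""
    for candidate in candidates:
        if hash_password(candidate) == target_hash:
            return candidate
    return None

leaked_hash = hash_password("p4ssw0rd")  # pretend this leaked from a database
guesses = ["123456", "qwerty", "letmein", "p4ssw0rd"]

print(crack(leaked_hash, guesses))  # p4ssw0rd
print(36 ** 8)                      # 2821109907456 possible 8-character passwords
```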

Now where do we get this amazing hashing machine, that always produces the same output for the same input, virtually always produces different outputs for different inputs, and for which it is practically impossible to build an unhashing machine?

This is where a branch of math called cryptography comes in. Cryptography is a highly specialized branch of math that deals with creating exotic contraptions like hash functions and ciphers. Every once in a while, the cryptographers discover that for some hash function it actually isn't that hard to build the unhashing machine after all, and everybody has to scurry around and switch to using a different hash function. But by and large, it all works, even though the majority of mathematicians and computer scientists who don't specialize in cryptography don't really know how it works.

We have come to the end of my explanation of how Facebook, as well as any other decently implemented computer system, can know whether you typed the correct password on any given occasion without actually storing your password. Reality is a bit more complicated. (It always is...) There is something called password salting, for instance, that makes it a bit harder to crack passwords even if you use a common dictionary word for your password (which you should nevertheless never do). You can go read about it on Wikipedia, or in a good computer security book.
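For the curious, here is a rough sketch of what salting looks like. The function names are mine, and sticking the salt in front of the password before a single SHA-256 is a simplification; real systems generally use a purpose-built password hashing function. The point is only that two people with the same password end up with different stored hashes.

```python
import hashlib
import hmac
import os

def hash_with_salt(password: str, salt: bytes) -> str:
    # Mix the salt into the input before hashing.
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()

def new_record(password: str):
    """Create a fresh random salt and return (salt, hash) to store."""
    salt = os.urandom(16)
    return salt, hash_with_salt(password, salt)

def check(password: str, salt: bytes, stored_hash: str) -> bool:
    return hmac.compare_digest(stored_hash, hash_with_salt(password, salt))

alice_salt, alice_hash = new_record("p4ssw0rd")
bob_salt, bob_hash = new_record("p4ssw0rd")

print(alice_hash != bob_hash)                       # True: same password, different hashes
print(check("p4ssw0rd", alice_salt, alice_hash))    # True
```

The salt is stored next to the hash; it does not need to be secret.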

Resources to help you: 

Wikipedia has an extensive, if somewhat rambling, overview of password security,
an excellent list of common cryptographic hash algorithms and their known vulnerabilities,
and articles about techniques such as key stretching (where you apply a hash many times in series so that every guess costs an attacker more work) and salt (which prevents the use of precomputed tables of hashes of common passwords by "hashing in" some additional information known as the "salt").
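As a taste of key stretching, Python's standard library includes PBKDF2, which runs a hash many times over and takes a salt as well. This is only a sketch: the function name stretch and the iteration count are my assumptions, and you should check current guidance before choosing a real value.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative only; not a recommendation

def stretch(password: str, salt: bytes) -> bytes:
    """Derive a stored value by running salted, repeated hashing (PBKDF2)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

salt = os.urandom(16)
stored = stretch("p4ssw0rd", salt)

print(hmac.compare_digest(stored, stretch("p4ssw0rd", salt)))  # True
print(hmac.compare_digest(stored, stretch("qwerty", salt)))    # False
```

Each login costs the site one slow computation, but an attacker has to pay that cost for every single guess.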

Note that all good computer security is based on multiple lines of defense; in the case of passwords, you typically want to try very hard to prevent even the hashed passwords from escaping in the first place, and rate-limit login attempts.
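Rate limiting can be sketched in a few lines too. This is a toy in-memory version under my own assumptions (five attempts per minute per username, hypothetical names throughout); a real site would keep this state somewhere shared and would likely limit by IP address as well.

```python
import time
from collections import defaultdict, deque

MAX_ATTEMPTS = 5        # assumed limit per window
WINDOW_SECONDS = 60.0   # assumed window length

# username -> times of recent login attempts
_attempts = defaultdict(deque)

def allow_attempt(username, now=None):
    """Return True if this username may try to log in right now."""
    if now is None:
        now = time.monotonic()
    recent = _attempts[username]
    # Forget attempts that have aged out of the window.
    while recent and now - recent[0] >= WINDOW_SECONDS:
        recent.popleft()
    if len(recent) >= MAX_ATTEMPTS:
        return False
    recent.append(now)
    return True

# Six attempts one second apart: the sixth is refused.
print([allow_attempt("alice", now=float(i)) for i in range(6)])
```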

Thanks to Jaap Weel for making this clearer.
