I posted this tweet recently and got many replies. Since I was answering the exact same few questions dozens of times, I’d like to summarize everything here. I will also add some new information that I gathered that will clear many things up.
Yahoo: “Your password is too similar to the one you’ve used previously.” You wouldn’t know unless you store it in an insecure way, again.
— Anna Filina (@afilina) October 11, 2016
First, the context was not a password change, but a password reset. In a reset, you never type your old password, so Yahoo! did not have access to its plaintext version.
What do we know?
According to a friend who used to work at Yahoo!, they indeed salt and hash their passwords, which is considered secure at the time of writing this post. This clears up any speculation about how passwords are stored.
How do they check for password similarity? They don’t. It turns out that the message is just poorly worded, or perhaps it’s worded that way on purpose to give the illusion of a feature that they don’t have. How do we know? Changing one character when resetting a password does not trigger the error message. I tried changing the case of a letter, or changing a digit at the end: no error. Only an exact match triggers it. An exact-match check needs no plaintext: the new password can be hashed with the same salt as the old password and the two hashes compared directly. Although some fancy theories were proposed in the replies, none of them appear to be implemented.
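To make the exact-match check concrete, here is a minimal sketch. It is not Yahoo!'s code: the algorithm (PBKDF2-HMAC-SHA256), the iteration count, and the function names are all assumptions chosen for illustration. It only shows why an identical password is detected while a one-character variation is not.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor, for illustration only


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash. PBKDF2 is a stand-in for whatever is really used."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def is_same_password(candidate: str, salt: bytes, stored_hash: bytes) -> bool:
    """Hash the candidate with the old salt and compare the two hashes."""
    return hmac.compare_digest(hash_password(candidate, salt), stored_hash)


# Only the salt and the hash are stored, never the plaintext.
salt = os.urandom(16)
stored_hash = hash_password("Summer2016", salt)

print(is_same_password("Summer2016", salt, stored_hash))  # exact match
print(is_same_password("summer2016", salt, stored_hash))  # case changed
print(is_same_password("Summer2017", salt, stored_hash))  # last digit changed
```

Because a good hash function scatters similar inputs to unrelated outputs, the stored hash says nothing about how close the two variations are to the original, which is consistent with only an exact match producing the error.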
Do they only check the last password? As far as I know, they store all previous passwords, presumably using the same secure method as the current password. I changed my password 5 times and then tried the original password again: I got the error message. I’d rather they only stored the last password and erased it after the change. A history of passwords does not look good in a data breach: if leaked hashes were harmless, there would have been no point in making me change my password after the recent breach.
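A password history like the one I observed could work roughly as follows. This is a sketch under assumptions: the storage layout (a list of salt and hash pairs), the PBKDF2 parameters, and the names `remember` and `was_used_before` are hypothetical, not anything Yahoo! has confirmed.

```python
import hashlib
import hmac
import os
from typing import List, Tuple

ITERATIONS = 100_000  # assumed work factor, for illustration only

# Each entry is (salt, hash). No plaintext is kept.
History = List[Tuple[bytes, bytes]]


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted hash. PBKDF2 is a stand-in algorithm."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def remember(password: str, history: History) -> None:
    """Store a new password under its own random salt."""
    salt = os.urandom(16)
    history.append((salt, hash_password(password, salt)))


def was_used_before(candidate: str, history: History) -> bool:
    """Re-hash the candidate with every stored salt and look for an exact match."""
    return any(
        hmac.compare_digest(hash_password(candidate, salt), stored)
        for salt, stored in history
    )


history: History = []
for old in ["first-pw", "second-pw", "third-pw", "fourth-pw", "fifth-pw", "sixth-pw"]:
    remember(old, history)

print(was_used_before("first-pw", history))  # reused after 5 changes
print(was_used_before("First-pw", history))  # near miss, not an exact match
```

The trade-off is the one described above: every entry kept in the history is one more hash exposed if the database leaks.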
Based on the information gathered, your Yahoo! passwords seem safe enough. When databases are leaked, companies ask you to change passwords just in case, but in practice it would be very hard to crack those passwords.
In any case, I strongly recommend that you don’t use passwords that you can remember. Instead, use strong generated passwords from a tool like 1Password, which also lets you manage and auto-complete them. It’s the one I’ve been using for years.
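If you are curious what "generated" means in practice, here is a minimal sketch using Python's standard `secrets` module. It is not how 1Password works internally; the alphabet, the default length, and the function name `generate_password` are my own assumptions.

```python
import secrets
import string

# Assumed character set: letters, digits and ASCII punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 24) -> str:
    """Pick each character independently from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_password())
```

A password manager does the same kind of thing for you, and also remembers the result, which is the part that makes random passwords usable day to day.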
I’m glad that this tweet sparked such interesting conversations and got people to think about password security.
Edit 2016-10-13: I’ll blog separately about general password security and how to compute similarity, for those who are interested in learning more about the topic.