Love and Passwords
Patchen Barss | February 12, 2015
UOIT researcher Christopher Collins has been part of a team studying more than 32 million passwords that were leaked from a hacked, now-defunct website called “RockYou.” This work was led by graduate student Rafael Veras and was a collaboration with Julie Thorpe, from UOIT’s Faculty of Business and Information Technology. Even though passwords are among our best-kept secrets, it turns out millions of people put a little love in their computer security. You can explore some examples of real passwords in an interactive visualization, but here are the highlights of what he found:
Research Matters: How much love did you find in people’s passwords?
Christopher Collins: In our study of 32 million passwords, we found that love was the most common verb. Love is about 23 times more common than hate. In fact, we compared the occurrence rates of words in passwords against their occurrence in “normal English” and found that “love” occurs much more often in passwords than in normal English.
RM: How do people build love into their passwords?
CC: Love was seen in many patterns beyond the straightforward use of the word. For example, “<3” is the second most common sequence of the form <special character + number> (“#1” is the most common). “<3” is shorthand for a heart (turn your head). We also see a high occurrence of “4u”, as in “ilive4ubaby”, compared to other two-character sequences. We found thousands of passwords containing “lover”, “loverboy”, “lovergirl”, even “latinlover.” We found that people love dogs slightly more than cats. We found interesting patterns in the objects of love. For example, for the pattern “ilove<name>”, the name is four times more likely to be a man’s name than a woman’s name. In order of probability, people expressed love for “boys”, sex (in various word forms), dad, dogs, cats, music, mum/mom, love, horses, cities and countries, rock, girls, alcohol (in various word forms), chocolate, and angels.
However, as objects of “hate”, male names are only twice as likely as female names. Overall, the pattern “ihate<name>” is about 75 times less common than “ilove<name>”, revealing that love wins over hate – at least in our passwords. The object of “hate” is much more likely to be what our system calls a “knowledge domain”, like “ihatemath”! The pattern “Iloveyou” is 10 times more likely than “Iloveme” – so much for rumours of our self-centred society! We also see examples of unrequited love in patterns like “iloved<name>” and “imiss<name>”.
RM: What else?
CC: We see these “romantic” patterns popping up in the adjectives used in passwords as well. Here are the top 10 adjectives: 1) sexy, 2) hot, 3) pink, 4) blue, 5) red, 6) big, 7) cute, 8) sweet, 9) cool, 10) green. We see pet names like “baby” occurring often, as in “babygirl”, “babygurl”, “Babyboy”, “babycakes”, “sexybaby”. We don’t know for sure, but I think these are not talking about infants.
RM: Predictability would seem to be a boon to hackers. Is there a security risk in choosing a love-based password?
CC: We are continuing to work to quantify the security risks raised by our work. This is the primary focus of our analysis; however, the human-interest part of this data is exciting too. I would advise people against using “ilove<name>” as a password, as the pattern is just too common. Approaches to guessing passwords (called “offline guessing attacks”) make millions of guesses and would quickly try these patterns. However, breaking up the pattern with other words, numbers, and other characters can help a lot. In our current research, we are working on systems that suggest modifications to passwords to make them more “semantically secure”.
RM: Does your research teach you anything about people’s inner thoughts?
CC: As computer scientists, we can’t really say much about people. We didn’t know anything about the people – not even their usernames.
So, while we see very evocative passwords, such as “lovehurts”, we can’t ask anyone about them. Passwords are a very personal thing – we don’t expect anyone to know them, and we select them knowing they will be repeated many times throughout our days. So, when someone chooses to name their partner in a password, such as “ilovedan”, I expect that person is making a conscious effort to reaffirm that love as a sort of mantra through the day. Or maybe they have a crush and are trying to manifest the unrequited love. We can’t say from the data!
We were also surprised at the amount of affirmation in passwords. While we did find hateful passwords, expressing hate against particular groups, religions, or people, the sentiments overall seem quite positive. I think that passwords are a chance to have a personal secret, to be free with what we write, and perhaps to play around a bit with language. When we choose intimate passwords, I think that expresses a desire and maybe reminds us of the positive things in life throughout our day. But I’m a computer scientist; this is just conjecture!