Using neural networks for password cracking

In this blog post I'm going to write about how neural networks can be used for password cracking.

TL;DR

It's possible to train a neural network on a dataset of cracked passwords and then use it to predict uncracked passwords.

Idea

At some point during the last year I found a fascinating blog post about recurrent neural networks on Hacker News and Karpathy's char-rnn on GitHub. I had played around with neural networks before, so I started to think about ways to (ab)use them in a security context. People are already trying to use neural networks as intrusion detection systems or in similar defensive security strategies.
But what about the attacker's perspective? Can neural networks somehow be used to facilitate attacks or offensive behaviour?

After finishing Karpathy's article, and especially the section about creating new baby names, I wondered whether such a network could learn patterns from passwords and generate similar ones.

Imagine that you have a large dataset of hashed passwords from a specific website [insert recent hack here]. You run your standard hash cracking tool and wordlist against it and manage to crack a portion of the passwords. Depending on the website's niche, there may be a pattern in the passwords that you don't know about. So let's see if an RNN can figure it out and generate new valid passwords for previously uncracked hashes.

Setup

Disclaimer: I'm not a machine learning expert, so I may have done something wrong. I also didn't train the models to convergence, so this is more of a proof of concept.

I didn't want to waste my time and computation power on hashing and cracking passwords, so I skipped this step and downloaded the rockyou.txt wordlist. It is around 134MB in size and contains around 14M passwords. The notes state "Best list available; huge, stolen unencrypted", but there was no particular reason for choosing this one. I always assumed that wordlists would be free of duplicates, but this assumption didn't hold for this one, so I removed the duplicates first:

$> sort rockyou.txt | uniq -u > rockyou.unique.txt
$> du -sb rockyou.* 
139921497	rockyou.txt
139842217	rockyou.unique.txt
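
(Strictly speaking, uniq -u prints only the lines that occur exactly once, so every copy of a duplicated password is dropped rather than one copy being kept. Judging by the small size difference above, this only affects a tiny fraction of the list; if you'd rather keep one copy of each password, something like this should work instead:)

$> sort -u rockyou.txt > rockyou.unique.txt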

It's unlikely that cracked passwords come out sorted, so I shuffled the unique wordlist:

$> sort -R rockyou.unique.txt > rockyou.random.txt

After that I split it into two parts:

  • The training set with the first 250k passwords:
$> head -n 250000 rockyou.random.txt > train.txt
  • The rest is the testing set (see the command sketch after this list)
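
One way to write the remaining lines into test.txt (a sketch, assuming the same 250k split point; tail -n +K prints everything from line K onwards):

$> tail -n +250001 rockyou.random.txt > test.txt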

I verified that there are no overlapping entries between the training and testing datasets:

$> sort train.txt test.txt | uniq -d | wc -l 
0

Training neural networks on a CPU isn't recommended anymore, because GPUs are much faster when it comes to parallel number crunching. Unfortunately I don't have a CUDA cluster at home, so I rented an AWS g2.2xlarge GPU spot instance for a few hours. Spot instances are really, really cheap and perfect for such experiments. Luckily someone created a community AMI (ami-c79b7eac) with all tools and CUDA drivers installed and configured. You'll just need to run the following commands to set up char-rnn:

$> cd /tmp/
$> sudo luarocks install nngraph 
$> sudo luarocks install optim
$> git clone https://github.com/karpathy/char-rnn
$> cd char-rnn

The last step is to upload the train.txt file to /tmp/char-rnn/data/pws/input.txt.
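
A simple way to get the file onto the instance is scp (a sketch: the key file, user name and instance address are placeholders for your own setup, and you may need to create the data/pws/ directory first):

$> scp -i mykey.pem train.txt ubuntu@<instance-ip>:/tmp/char-rnn/data/pws/input.txt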

I trained three different recurrent neural networks on the dataset for around 10 epochs:

  • $> th train.lua -rnn_size 128 -num_layers 3 -eval_val_every 500 -seq_length 50 -data_dir data/pws/ -savefile pws
  • $> th train.lua -rnn_size 512 -num_layers 3 -eval_val_every 500 -seq_length 100 -data_dir data/pws/ -savefile pws2
  • $> th train.lua -rnn_size 64 -num_layers 3 -eval_val_every 500 -seq_length 20 -data_dir data/pws/ -savefile pws3

Once per epoch, I let the network generate 25kB of output:

$> th sample.lua -length 25000 cv/lm_pws_epochXY > output/pws-epochXY
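
If you want to sample from every saved checkpoint in one go, a small loop like this should do it (a sketch; it assumes the checkpoints follow char-rnn's lm_pws_epoch*.t7 naming in cv/):

$> for cp in cv/lm_pws_epoch*.t7; do th sample.lua -length 25000 "$cp" > output/$(basename "$cp" .t7); done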

That comes out to roughly 2500 lines (i.e. passwords) per sample.

$> wc -l *.txt
     2681 pws-0.54.txt
     2534 pws1-1.62.txt
     2599 pws1-2.16.txt
     2571 pws1-3.24-12345.txt
     2571 pws1-3.24-pass.txt
     2570 pws1-3.24.txt
     2498 pws1-4.86.txt
     2583 pws1-5.40.txt
     2564 pws1-6.48.txt
     2566 pws1-7.02.txt
     2638 pws1-8.10.txt
     2523 pws1-9.18.txt
     2416 pws1-10.26-t0.3.txt
     2540 pws1-10.26-t0.5.txt
     2628 pws1-10.26-t0.8.txt
     2567 pws1-10.26.txt
     2446 pws2-1.08.txt
     2681 pws2-2.16.txt
     2634 pws3-0.65.txt
     2585 pws3-6.91.txt
     2609 pws3-9.72.txt
     2552 pws3-12.09.txt
     2574 pws3-16.19.txt
     2611 pws3-18.13.txt

The files have the following format: pwsX-EPOCH[-tT].txt where

  • X is the network id
  • EPOCH is the epoch of the savefile that was used
  • T (optional) is the temperature used for sampling

Results

Again, our goal was to see whether the networks can predict passwords from the testing dataset (test.txt) by learning from the passwords in the training set (train.txt). Now that we've gathered some (minimal amount of) data, let's evaluate it.

Network #1

  • Neurons per layer: 128
  • Layers: 3
  • Sequence length: 50

This network generated 31k unique passwords of which only 92 are in the training set:

$> cat pws-* | sort | uniq -u > output-1.txt
$> wc -l output-1.txt 
31441 output-1.txt
$> sort output-1.txt train.txt | uniq -d | wc -l 
92
$> sort output-1.txt test.txt | uniq -d | wc -l 
3180

YAY, the network generated 3180 passwords which can also be found in the testing set. Let's have a look at some of them:

$> sort output-1.txt test.txt | uniq -d | sort -R | head -n 25 
freddy12
ava2005
joseph2006
sexylexy
marcolo
jackson7
jackson1
ariela11
andrew12345
mercedes23
brittshit
19081984
jklover
boodoo
eliana19
725972
daliani
pacy99
121093a
celin1
Hottie.
andrew1010
tatty9
karina79
amanda213
080223

Network #2

  • Neurons per layer: 512
  • Layers: 3
  • Sequence length: 100

I didn't train this one for very long, because it was relatively slow per batch. It generated only 5k passwords, of which 3 are in the training and 178 in the testing dataset.

$> wc -l output-2.txt 
5046 output-2.txt
$> sort output-2.txt train.txt | uniq -d | wc -l 
3
$> sort output-2.txt test.txt | uniq -d | wc -l 
178
$> sort output-2.txt test.txt | uniq -d | sort -R | head -n 25 
zhan23
hooda1
dobes
081087
69777
jestin
492007
25372
griss34
14711
172085
ka2006
ellva
10113
franz27
nissa09
33
012187
a
1113sd
ilove2
ford19
40309
kina13
kiku13

Network #3

  • Neurons per layer: 64
  • Layers: 3
  • Sequence length: 20

The smallest network managed to generate 15k passwords, with only 15 of them appearing in the training and 726 in the testing set.

$> wc -l output-3.txt 
14976 output-3.txt
$> sort output-3.txt train.txt | uniq -d | wc -l 
15
$> sort output-3.txt test.txt | uniq -d |wc -l 
726
$> sort output-3.txt test.txt | uniq -d | sort -R | head -n 25 
dannica
hissss
lobe21
carlo14
ho
killer1
candice12
laiska
kaley2
pusher1
778658
7344501
leanng
bron55
GOLLITA
jaimi
490767
552655
71
12
cocker07
660244
nanina
hirann
love19

All networks together

Let's combine the output of all three networks. We get more than 50k unique passwords, of which 3.8k are in the testing set and 105 in the training set.
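
The combined file can be built the same way as the per-network files, for example (a sketch, assuming all sample files sit in the current directory):

$> cat pws* | sort | uniq -u > output-all.txt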

$> wc -l output-all.txt 
50691 output-all.txt
$> sort output-all.txt train.txt | uniq -d | wc -l 
105
$> sort output-all.txt test.txt | uniq -d | wc -l 
3881
$> sort output-all.txt test.txt | uniq -d | sort -R | head -n 25 
newmon
chesar
pawita
opalla
buster6
reneeia
singers123
patito123
6312355
biggie1
297572
jairo01
bday1992
coolboy1
meeka1
blackie
laura25
neil10
mone10
12345sexy
dog143
193208
mash28
alexa87
793806

I'm sure that one could predict more passwords by training the models longer, using other parameters, or using the -primetext parameter during sampling.
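
For example, priming the sampler with a common prefix would look like this (a sketch; "love" is just an arbitrary seed string here):

$> th sample.lua -length 25000 -primetext "love" cv/lm_pws_epochXY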

Data

If this caught your interest and you want to continue working on this and/or resume the recurrent neural networks from a savefile, I've uploaded the contents of the following directories:

  • cv/
  • data/
  • output/

to Github: https://github.com/gehaxelt/RNN-Passwords.

Conclusion

As shown above, it is possible to let a computer crunch some numbers and predict unknown passwords. But based on the amount of data used for the training/testing datasets and the small percentage of predicted passwords, I don't think that this approach has any practical relevance.

-=-