This page has been robot translated, sorry for typos if any. Original content here.

All methods of hacking MD5

On this topic:


Все методы взлома MD5

It's no secret that cryptography has become a part of our life. Internet services, social networks, mobile devices - they all store in their databases user passwords, encrypted using various algorithms.

The most popular such algorithm today, of course, is MD5.

About ways of its breaking and there will be a speech.

A little about cryptography

Modern cryptography includes three directions: encryption with a private key, public key encryption and hashing. Today we will talk about what is hashing and what it's eaten with.

In general, hashing is understood as the conversion of input data of arbitrary length into an output bit string of fixed length. Most often, hash functions are used in the process of user authentication (the database usually stores a hash of the password instead of the password itself) and for the calculation of checksums of files, data packets, and so on.

One of the most well-known and widely used hashing algorithms is MD5.

Start

The MD5 algorithm is a 128-bit hash algorithm. This means that it calculates a 128-bit hash for an arbitrary set of data arriving at its input. This algorithm was developed by Professor Ronald Rivest of the Massachusetts Institute of Technology in 1991 to replace the less reliable predecessor - MD4. The algorithm was first published in April 1992 in RFC 1321. After that, MD5 was used to solve a variety of tasks, from hashing passwords to CMS to creating digital signatures and SSL certificates.

The fact that the MD5 algorithm can be cracked was first mentioned in 1993. Researchers Bert den Bauer and Anton Bossilaris showed that pseudocollisions are possible in the algorithm. Three years later, in 1996, Hans Dobbertin published an article in which he proved the existence of collisions and described the theoretical possibility of hacking MD5. It was not yet hacking, but the world began talking about the need to switch to more reliable hashing algorithms, for example SHA1 (at the time of this writing, it has already been proven that there are collisions in this algorithm, so I recommend using SHA2) or RIPEMD-160.

First attacks

The direct breaking of MD5 began on March 1, 2004. CertainKey Cryptosystems has launched the MD5CRK project, a distributed collision search system. The goal of the project was to search for two messages with identical hash codes. The project ended on August 24, 2004, when four independent researchers - Wang Xiaoyun, Feng Denguo, Lai Xuejia and Yu Hongbo - discovered the vulnerability of the algorithm, which allows finding collisions by analytical method for more or less acceptable time. With this method, it is possible to identify collisions on an IBM p690 cluster in just an hour (it's a pity that I do not have such a house).

Все методы взлома MD5
Example of collision MD5 hashes

On March 1, 2005, the first use of this vulnerability in practice was demonstrated. The research team submitted two X.509 certificates with different sets of keys, but with identical checksums. In the same year, Vlastimil Klima published an algorithm that makes it possible to detect collisions on a regular laptop in a few hours. In 2006 he went on. On March 18, 2006, the researcher published an algorithm that found a collision in one minute! This method was called "tunneling". In 2008, an article on the method for generating X.509 forged certificates was presented at the Chaos Communication Congress. In fact, this was the first case of real use of collisions in the MD5 algorithm.

Все методы взлома MD5
MD5 Brute with mask

Much work has also been done to accelerate the cracking of hashes. In 2007, Kevin Breeze introduced a program that uses Sony PlayStation3 for hacking MD5. He managed to achieve very good results: 1.4 billion MD5 hashes were generated in just one second! Two years later, in 2009, BlackHat USA published an article about using the GPU to search for collisions, which made it possible to increase its speed several times, especially if it was executed with the help of several video cards simultaneously.

The ATI Radeon HD 4850 X2 graphics card allows you to generate up to 2.2 billion hashes per second! The use of the MD5 algorithm in EDS is unacceptable due to the lack of stability of this algorithm to the search for collisions.

This is the end?

In 2011, the IETF agreed to amend RFC 1321 (MD5) and RFC 2104 (HMAC-MD5). So the document RFC 6151 appeared. He recognizes the MD5 encryption algorithm as unsafe and recommends not using it. In my opinion, this document officially put an end to MD5.

However, despite the fact that the MD5 algorithm was officially recognized as unsafe, there are thousands if not tens and hundreds of thousands of applications that use it for storing passwords, in digital signatures and for calculating checksums of files.

By the way, on October 31, 2008 NIST announced a competition among cryptographers. The goal of the contest is to develop a hash algorithm to replace the obsolete SHA1 and SHA2. At the moment, the finalists are already defined - this is BLAKE, Gostl, JH, Keccak and Skein.

Ighashgpu: hacking using GPU

But enough theory. Let's get down to business and talk directly about hacking our favorite algorithm. Suppose that we had a hash of some kind of password: d8578edf8458ce06fbc5bb76a58c5ca4.

To crack this hash, I suggest using the program Ighashgpu, which can be downloaded from www.golubev.com or found on our disk. The utility is distributed completely free of charge and works quietly under Windows. To speed up the process of hacking, Ighashgpu uses a GPU, so you need at least one nVidia or ATI video card with CUDA / ATI Stream support.

Modern graphics processors are built on a slightly different architecture than conventional CPUs, so they are much more efficient in processing graphics information. Although GPUs are designed to handle 3D graphics, in the last few years there has been a trend towards their use for conventional computations. Start working with the program is not easy, but very simple: unzip the archive to any place on the disk and start burglary using the command line Windows:

ighashgpu.exe -t:md5 \
-h:d8578edf8458ce06fbc5bb76a58c5ca4 -max:7

We use the above method to crack one specific hash generated by the MD5 algorithm. The maximum length of a possible password is seven characters. After some time, the password will be found (qwerty). Now let's try to crack another hash, but with slightly different conditions. Let our hash has the form d11fd4559815b2c3de1b685bb78a6283, but includes letters, numbers, underscore and has the suffix "_admin". In this case, we can use the password search by mask to simplify the program task:

ighashgpu.exe -h:d11fd4559815b2c3de1b685bb78a6283 -t:md5
-u:[abcdefghijklmnopqrstuwvxyz1234567890_] -m:??????_admin

Here the '-u' option allows you to specify the character set used for brute force, and the '-m' parameter specifies the password mask. In our case, the mask consists of six arbitrary characters, followed by a combination of "_admin". The choice of password will not be difficult either.

Collisions - a collision in cryptography is called two different input data blocks, which for the same hash function give the same hash. Each output function produces a sequence of bits of a certain length that does not depend on the size of the original data. It follows that there are collisions for any hashing algorithm. However, the likelihood that you will be able to find a collision in a "good" algorithm, practically tends to zero. Unfortunately, or fortunately, hashing algorithms can contain errors, just like any program. Many hash functions have either been broken, or will soon be. In this case, "breaking" means finding a collision in a time that is much less than the declared infinity.

Ighashgpu: Lists

Now let's try to crack several passwords at once. Suppose that we have a database of password hashes in our hands. It is known that each password ends with the characters c00l:

f0b46ac8494b7761adb7203aa7776c2a
f2da202a5a215b66995de1f9327dbaa6
c7f7a34bbe8f385faa89a04a9d94dacf
cb1cb9a40708a151e6c92702342f0ac5
00a931d3facaad384169ebc31d38775c
4966d8547cce099ae6f666f09f68458e

Save the hashes in the encrypted.dat file and run Ighashgpu as follows:

ighashgpu.exe -t:md5 -u:[abcdefghijklmnopqrstuwvxyz1234567890_]
-m:??????c00l encrypted.dat

After the program is finished, the ighashgpu_results.txt file appears in the Ighashgpu folder with the passwords cracked:

f0b46ac8494b7761adb7203aa7776c2a:1rootxc00l
f2da202a5a215b66995de1f9327dbaa6:pwd12xc00l
c7f7a34bbe8f385faa89a04a9d94dacf:pwd34yc00l

cb1cb9a40708a151e6c92702342f0ac5:pwd56yc00l
4966d8547cce099ae6f666f09f68458e:pwd98zc00l
00a931d3facaad384169ebc31d38775c:pwd78zc00l
Все методы взлома MD5
Hacked hashes from the encrypted.dat file

Ighashgpu: salt

Finally, let's make a cracking of the "salted" hash. Suppose that a hash is generated using the following algorithm:

var plain = password + "s41t";
var hash = md5(plain);

As a result, we received the following hash: 42151cf2ff27c5181bb36a8bcfafea7b. Ighashgpu allows you to specify "salt" in the "-asalt" parameter:

ighashgpu.exe -h:42151cf2ff27c5181bb36a8bcfafea7b \
-t:md5 -u:[abcdefghijklmnopqrstuwvxyz1234567890_] \
-asalt:s41t

And we again got the required password easily and quickly.

Entertaining Mathematics - For the 8-character password, composed of the first 126 ASCII characters, 63 527 879 748 485 376 possible combinations are available. For 254 characters, the number of possible combinations increases to 17 324 859 956 700 833 536, which is 2.7 billion times more than people on our planet. If you create a text file containing all these passwords, it will take millions of terabytes. Of course, in the modern world this is possible, but the cost of storing such a file will be simply transcendental.

Burglary of MD5 in turbo mode

Hacking hashes through a full search even on the best hardware takes quite a lot of time, especially if the password is more than eight characters long. The easiest way to increase the speed of password matching is to create a database of all hashes for a particular set of characters. In the 80s of the last century, hackers believed that when they have more powerful hardware, 640 KB of memory and a 10 MB hard drive, then such a base will become a reality and the selection of any password will turn into a minute affair. However, iron developed, and the dream remained a dream. The situation changed only in August 2003, after Philip Oeslin, Ph.D. in computer networking from the Swiss Institute of Technology in Lausanne, published his paper on the problem of choosing the optimal time-to-time ratio. It described the method of hacking hash functions with the help of "rainbow" tables.

The essence of the new method is as follows. First, you must select an arbitrary password that is then hashed and subjected to a reduction function that converts the hash to any possible password (for example, it can be the first 64 bits of the original hash). Next, a chain of possible passwords is constructed, from which the first and last elements are selected. They are written to the table. To restore the password, we apply the reduction function to the source hash and look for the possible password in the table. If there is no such password in the table, we will hash it and calculate the next possible password. The operation is repeated until a password is found in the rainbow table. This password represents the end of one of the strings. To find the original password, you need to run the entire conversation again. Such an operation does not take much time, depending on the chain construction algorithm, it is usually a few seconds or minutes. "Rainbow" tables allow you to significantly reduce the amount of memory used compared to conventional search. The only drawback of the described method is that it takes quite a long time to build tables.

Now let's move from words to deeds and try to crack a couple of password hashes using this method.

Rainbow tables - "Rainbow" tables - this is a special type of dictionary, which contains a chain of passwords and allows you to choose a password in a few seconds or minutes with a probability of 85-99%.

"Iridescent" hacking

First you need to decide on the program. Personally, I like the RainbowCrack, which is free and works on both Windows and Linux. It supports four hashing algorithms: LN / NTLM, MD5 and SHA1. The program does not require installation, just unpack it somewhere on the disk. After unpacking, you need to find the "rainbow" tables for the MD5 algorithm. Here everything is not so simple: you can either download them for free, or buy them, or generate them yourself. One of the largest archives of free tables is available on the Free Rainbow Tables project website.

By the way, you can also help the project if you download the client from the site and join a distributed international network that generates "rainbow" tables. At the time of this writing, 3 TB tables for the MD5, SHA1, LM and NTLM algorithms were already available on this site. If you do not have the opportunity to merge this amount of information, then on the same site you can order discs with "rainbow" tables. At the moment, there are three packages: LN / NTLM, MD5 and SHA1 - $ 200 each. We will generate the tables ourselves. To do this, you need to use the program rtgen, which is part of RainbowCrack. It takes the following input parameters:

  • hash_algorithm - hashing algorithm (LM, NTLM, MD5 or SHA1);
  • charset - one of the character sets contained in the charset.txt file;
  • plaintextlenmin and plaintextlenmax - the minimum and maximum length of the password;
  • tableindex, chainlen, chainnum, and partindex are the "magic numbers" described in the article by Philip Oeschlin

Let's consider the last parameters in more detail:

  1. table_index - the index of the "rainbow" table, which can be used when splitting the table into several files. I used 0, since my table consisted of only one file.
  2. chain_len - the number of unique passwords in the chain.
  3. chain_num - the number of chains in the table.
  4. part_index is the parameter that determines the beginning of the chain. The creators of the program are asked to use only this number as this parameter (I used 0). Now run the generation of the rainbow table for MD5:
rtgen.exe md5 loweralpha-numeric 1 7 0 2000 97505489 0

In this case, we create a table of passwords consisting of digits and capital letters of the Latin alphabet and having a length of one to seven characters. On my Eee PC with the Intel Atom N450 processor, this process took almost two days :) . As a result, I got the file md5loweralpha-numeric # 1-702000? 975054890.rt in the size of 1.5 GB.

Next, the resulting table must be sorted to optimize the search for the desired chain. To do this, run rtsort.exe:

rtsort.exe md5_loweralpha-numeric#1-7_0_2000x97505489_0.rt

We are waiting for a couple of minutes and the table is ready! Now you can break the password itself. First, we'll try to find the password for one hash: d8578edf8458ce06fbc5bb76a58c5ca4. Run rcrack_gui.exe and select Add Hash ... in the File menu. In the appeared window, enter a hash and click OK. Now select the file with the rainbow table. To do this, use the Search Rainbow Tables ... item in the Rainbow Table menu. In the window that opens, to select the file, we look for the file with the table, I have it md5_loweralpha-numeric # 1-7_0_2000x97505489_0.rt, then click Open. A few seconds later the password is in our hands! A similar operation can be performed on the hash list from the file.

Все методы взлома MD5
I generate a rainbow table

"Rainbow" tables vs. CPU vs. GPU

I think you noticed how fast Ighashgpu is able to crack MD5 hashes by brute force, and the fact that RainbowCrack does it even faster with a good rainbow table. I decided to compare the speed of these programs. For the purity of the experiment, I used the MDCrack program, which brutes the password on the CPU (and is one of the best among programs of this type). Here's what happened for the GPU (nVidia GeForce GT 220M), CPU (Intel Atom N450, two cores) and "rainbow" tables:

Длина пароля | GPU | CPU | Таблицы
4 символа | 00:00:01 | 00:00:01 | 00:00:16
5 символов | 00:00:02 | 00:00:09 | 00:00:16
6 символов | 00:00:16 | 00:05:21 | 00:00:10
7 символов | 00:07:11 | 09:27:52 | 00:00:04

As you can see, the speed of the search using the CPU is much less than using a GPU or "rainbow" tables. Moreover, most specialized programs allow you to create a cluster of video cards, so that the speed of password enumeration increases many times. I think you noticed that the speed of matching the 4- and 5-character passwords is lower than the speed of selecting a password of six or seven characters. This is due to the fact that the password search starts only after the table is loaded into memory. It turns out that out of sixteen seconds on average thirteen is spent on downloading and three - on hacking a hash.

Все методы взлома MD5
Iridescent table from the inside

bit.ly/vEhdir - adding a new hashing algorithm to RainbowCrack using the API.

bit.ly/vTSB9K - description of the format of the rainbow table.

Instead of concluding

In the end, I would like to talk a little about protecting your passwords. First, do not use vulnerable hashing algorithms, such as MD5 or SHA1. At this point, it's worth thinking about using one of SHA2 or SHA3 cryptographic hash functions (as soon as the corresponding standard is published). Secondly, do not use hashing functions directly. Always try to use "salt" and combine various algorithms. And thirdly, choose complex arbitrary passwords with a length of at least eight characters. Of course, this will not protect you from hacking 100%, but at least complicate the life of intruders.