
Deobfuscating PHP and JavaScript scripts

One of the most reliable ways to protect scripts from being studied and modified is obfuscation.

Obfuscation (from Latin obfuscare, "to shade, to darken") is the deliberate tangling of program code: the source text is transformed into a form that preserves the program's functionality but makes it harder to analyze, to understand its algorithms, and to modify.

Unlike encryption and packing, which are uniquely reversible, obfuscation is irreversible: the script cannot be restored to its original form. As a rule, PHP and JavaScript obfuscators work in two ways: line breaks and insignificant whitespace are removed from the source text, and the names of classes, variables and functions are replaced with meaningless strings of characters. The two techniques can be combined or applied independently.
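To illustrate the two techniques, here is a small hand-made example. Both the readable function and its obfuscated twin are hypothetical and were written for this illustration, not produced by any particular obfuscator. The second function is the first one with whitespace stripped and every name replaced by a meaningless one; its behaviour is unchanged.

```javascript
// Readable original (hypothetical example).
function calculateTotal(prices, taxRate) {
  let subtotal = 0;
  for (const price of prices) {
    subtotal += price;
  }
  return subtotal * (1 + taxRate);
}

// The same function after both techniques were applied by hand:
// whitespace removed, all names replaced with meaningless ones.
function IIlIlI(lIIlII,IlIIlI){let IIIlIl=0;for(const lIlIII of lIIlII){IIIlIl+=lIlIII;}return IIIlIl*(1+IlIIlI);}

// Both versions compute the same result for the same input.
console.log(calculateTotal([10, 20], 0.5) === IIlIlI([10, 20], 0.5));
```

The original names (calculateTotal, subtotal and so on) are simply gone from the second version, which is why the transformation cannot be undone exactly.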

As I have already said, deobfuscation cannot bring the code back to its original form. It is quite possible, however, to clean it up to the point where you can easily understand the algorithm and make the necessary changes. For example, after deobfuscation you can break the protection of some PHP scripts, remove a JavaScript script's binding to a domain, cut out forced advertising, and so on.

The first stage of deobfuscation is formatting the script text: inserting line breaks and indenting the code in an easy-to-read "ladder". I use two tools for this. For PHP code, WaterProof Software has developed a small (under 100 kilobytes) free formatter, phpCodeBeautifier. Downloading it from the official site requires free registration, so for convenience here is a direct download link. It is a console program; the command-line parameters are described in the instructions inside the archive. For those who prefer windows there is a GUI version. It is older, but you can simply drop the console executable from the latest version into it.


For formatting JavaScript and HTML code there is a wonderful online service, Beautify Javascript. Paste the script text into the form, press the "Beautify" button, and get a neatly formatted script. For convenience, I have slightly modified this page and compiled it into a standalone exe file. Maybe someday I will find the time and energy to rewrite the script as a full-fledged application.

When you format large scripts, the browser may report that the script has stopped responding and offer to stop it. Do not stop it; the processing simply needs more time.
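To make the formatting stage more concrete, here is a minimal sketch of what such a formatter does. It is not the code of phpCodeBeautifier or of Beautify Javascript; the function name naiveBeautify and its parameters are made up for this illustration. It assumes C-like syntax (braces and semicolons) and input without comments or regular-expression literals, and it only protects quoted strings. A real beautifier handles far more cases.

```javascript
// Naive "ladder" formatter: breaks lines after ";", "{" and "}" and indents
// by brace depth. Assumption: no comments or regex literals in the input.
function naiveBeautify(source, indentUnit = "    ") {
  let out = "";
  let depth = 0;        // current brace nesting level
  let parenDepth = 0;   // ";" inside for(...) must not break the line
  let quote = null;     // active string delimiter, or null outside strings

  // Start a new line at the current indentation level.
  const newline = () => {
    out = out.replace(/[ \t]+$/, "") + "\n" + indentUnit.repeat(depth);
  };

  for (let i = 0; i < source.length; i++) {
    const ch = source[i];

    if (quote) {                          // inside a string: copy verbatim
      out += ch;
      if (ch === "\\" && i + 1 < source.length) {
        out += source[++i];               // keep the escaped character too
      } else if (ch === quote) {
        quote = null;
      }
    } else if (ch === '"' || ch === "'" || ch === "`") {
      quote = ch;
      out += ch;
    } else if (ch === "(") {
      parenDepth++;
      out += ch;
    } else if (ch === ")") {
      parenDepth = Math.max(0, parenDepth - 1);
      out += ch;
    } else if (ch === "{") {
      out += ch;
      depth++;
      newline();
    } else if (ch === "}") {
      depth = Math.max(0, depth - 1);
      // Put the closing brace on its own line, one level shallower.
      out = out.replace(/\s+$/, "") + "\n" + indentUnit.repeat(depth) + "}";
      newline();
    } else if (ch === ";" && parenDepth === 0) {
      out = out.replace(/\s+$/, "") + ";";
      newline();
    } else if (/\s/.test(ch)) {
      // Collapse whitespace runs; drop whitespace at the start of a line.
      if (out !== "" && !/\s$/.test(out)) out += " ";
    } else {
      out += ch;
    }
  }
  return out.trim();
}

console.log(naiveBeautify('function f(a){if(a){return "x;{";}return 1;}'));
```

Note that the braces and semicolon inside the string "x;{" are left alone, which is the minimum needed to avoid corrupting the script while formatting it.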


If the obfuscator did not replace the variable names, deobfuscation can be considered complete once the code is formatted. In any case, the script is already much more readable and understandable.

If the names of variables and functions have been mangled, move on to the second stage of deobfuscation. Here, unfortunately, there are no ready-made tools, or at least I have not come across any. If someone wants to write such a tool free of charge, I can provide a detailed specification. For now we have to limit ourselves to theory.

Strange as it may sound, deobfuscation uses the same principle as obfuscation: the names of all variables are extracted from all the scripts and replaced with others. The only difference is that we extract the mangled names and replace them with more convenient ones. For example, $kOObgZ4tf2LEaSmFfc555 (Obfusc) or $IIIIIIIIIIIl (PHP LockIt!) is replaced with $var_3. For a single script you can do this in an ordinary text editor with a global search and replace; for several scripts you first have to extract all variable names from all of the scripts, and only then perform the global replacement. Do not forget about built-in variables such as the superglobal arrays $_GET and $_POST in PHP, as well as reserved words in JavaScript: they must not be renamed. For a cleaner result, do the replacement after formatting the code.
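The renaming procedure described above can be sketched in a few lines of JavaScript that operate on PHP source text. This is only an illustration under stated assumptions, not a finished tool: the names RESERVED, PHP_VAR, buildRenameMap and renameVariables are invented here, the list of protected names is a partial one, and the regular expression treats every "$name" as a variable, including ones that appear inside single-quoted strings. Function and class names are not handled at all.

```javascript
// Names that must never be renamed (assumed, partial list:
// PHP superglobals plus $this).
const RESERVED = new Set([
  "$_GET", "$_POST", "$_COOKIE", "$_SERVER", "$_SESSION",
  "$_FILES", "$_REQUEST", "$_ENV", "$GLOBALS", "$this"
]);

// A "$" followed by an identifier. Simplification: no attempt is made
// to tell real variables from "$name" text inside string literals.
const PHP_VAR = /\$[A-Za-z_\u0080-\uffff][A-Za-z0-9_\u0080-\uffff]*/g;

// Pass 1: collect every variable name from ALL scripts, so that the same
// mangled name gets the same replacement everywhere.
function buildRenameMap(scripts) {
  const map = new Map();
  for (const text of scripts) {
    for (const name of text.match(PHP_VAR) || []) {
      if (!RESERVED.has(name) && !map.has(name)) {
        map.set(name, "$var_" + (map.size + 1));
      }
    }
  }
  return map;
}

// Pass 2: replace all names in a single sweep per script. Matching whole
// identifiers avoids the classic search-and-replace trap where one name
// is a prefix of another.
function renameVariables(scripts) {
  const map = buildRenameMap(scripts);
  return scripts.map((text) =>
    text.replace(PHP_VAR, (name) => map.get(name) || name)
  );
}

const renamed = renameVariables([
  '<?php $IIIIIIIIIIIl = $_GET["id"]; echo $IIIIIIIIIIIl;',
  "<?php $kOObgZ4tf2LEaSmFfc555 = 1; $IIIIIIIIIIIl = 2;"
]);
console.log(renamed.join("\n"));
```

Building the map over all scripts before replacing anything is the point of the two passes: a variable shared between files keeps one consistent new name.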

This article describes only the general principles of deobfuscation; each case requires some thought and an individual approach. Usually, though, you do not need to deobfuscate a script completely in order to crack an algorithm or analyze an individual function.