What is reverse engineering in cyber security

It’s no secret that many of us, as children, took a toy apart to understand how it works inside. Some have carried this habit through their lives and applied that curiosity to their profession. In the same way, programmers take apart the mechanism of a program in order to fix bugs or improve it.

What is reverse-engineering in IT?

Reverse engineering (also written reverse-engineering) is the process of analyzing an application to determine its functional characteristics, its internal architecture and how it actually operates: its modules, functions and algorithms. Reverse engineering is used in IT for different purposes:

  • improving the functionality of an application when the company that developed it has ceased to exist or cannot be contacted;
  • analyzing viruses, worms and Trojan horses to isolate their signatures and create means of protection (anti-virus software);
  • decoding file formats to improve compatibility (for example, so that applications available on Linux, such as OpenOffice or GIMP, can open the file formats of popular Windows applications that have no Linux version);
  • training, and much more.

However, reverse engineering is often misused, because after studying the architecture of an application or recovering its source code, you can modify it and use it for your own ends. Here are some examples:

Using a trial version indefinitely. Say we have a product that we can use for free for a month. When we run the app, it compares the installation date with the current date. If this check is removed or replaced with a function that always returns the desired result, the application will remain in trial mode forever.
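To see why such a check is fragile, here is a minimal sketch in Python. The names `trial_is_active`, `run_app` and `TRIAL_DAYS` are invented for this illustration and do not come from any real product; the point is only that the whole protection reduces to a single boolean result.

```python
from datetime import date, timedelta

TRIAL_DAYS = 30  # assumed trial length, matching the "one month" in the text


def trial_is_active(install_date: date, today: date) -> bool:
    """Return True while fewer than TRIAL_DAYS have passed since installation."""
    return (today - install_date) < timedelta(days=TRIAL_DAYS)


def run_app(install_date: date, today: date, check=trial_is_active) -> str:
    """Hypothetical entry point: the app runs only if the check passes."""
    return "running" if check(install_date, today) else "trial expired"


def always_valid(install_date: date, today: date) -> bool:
    """Stand-in for a patched check that ignores both dates."""
    return True


install = date(2024, 1, 1)
print(run_app(install, date(2024, 1, 15)))                     # inside the trial period
print(run_app(install, date(2024, 3, 1)))                      # after the trial period
print(run_app(install, date(2024, 3, 1), check=always_valid))  # check replaced by a stub
```

In a real application the replacement would be made in the compiled code rather than through a function argument, but the effect is the same: once the branch that depends on the check is found, the date no longer matters.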

Information or code theft. An attacker can target not the application itself, but a module or part of it. This tactic is relevant for competing software companies.

Bypassing technical means of copyright protection. An intruder aims to bypass copy protection for audio and video files, computer games, or e-books for subsequent free distribution.

Attackers can target both desktop and mobile applications. In the context of reverse-engineering, it does not matter whether the application is written to run on a smartphone or a PC, because the attack methods depend to a large extent on the programming language and the security mechanisms implemented. After all, a mobile app is an archive that consists of configuration files, libraries and compiled code files. Therefore, in general terms, the approaches to attacking mobile and desktop applications are the same.

Recovering source code is essentially compilation in reverse (decompilation), so the process depends on the programming language and platform. For example, applications developed for the .NET framework are first compiled into the Common Intermediate Language (CIL), which the Common Language Runtime (CLR) converts to machine code at runtime. Java and Python applications work similarly: high-level code is first compiled into low-level intermediate bytecode, which a virtual machine then executes, either by interpreting it or by converting it to machine code with a just-in-time compiler.

This arrangement provides cross-platform compatibility and also allows different parts of the application to be written in different languages within the same framework. However, in terms of reverse-engineering, the intermediate language (both CIL and bytecode) preserves information about classes, structures, interfaces and so on, which makes it possible to restore the original architecture. There are ready-made utilities for this: .NET Reflector, MSIL Disassembler, ILSpy and dotPeek for .NET applications; javap, JAD and DJ for recovering Java from bytecode; and pyREtic, pycdc and uncompyle2 for Python applications.
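As a small illustration of how much information bytecode keeps, the following sketch uses Python's standard `dis` module and the attributes of a compiled code object. The function `check_license` and the string it contains are made up for the example.

```python
import dis


def check_license(key: str) -> bool:
    """Hypothetical function used only for illustration."""
    expected_prefix = "ABC"
    return key.startswith(expected_prefix)


code = check_license.__code__

# Names and constants survive compilation and can be read back directly.
print(code.co_name)      # the function name
print(code.co_varnames)  # argument and local variable names
print(code.co_names)     # attribute and global names the code refers to
print(code.co_consts)    # constants, including string literals

# Human-readable listing of the bytecode instructions.
dis.dis(check_license)
```

The exact instruction listing differs between Python versions, but the function name, variable names and string literals are present in all of them, which is what decompilers rely on.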

An attacker who is sufficiently familiar with CIL or bytecode will sooner or later be able to modify it, recompile it, and make the application work for their own purposes.

Reverse-engineering applications written in natively compiled languages (such as C, C++, or Objective-C) is a more difficult task. Such applications are compiled directly into executable machine code, which normally keeps no information about the structure of the original program: class names, function names, variable names and so on. An additional obstacle is that the low-level representation contains no high-level control structures (if, for, etc.), only comparisons and jumps, so reconstructing them requires building a control flow graph of the program. This takes significant time. But even this cannot guarantee the safety of the application’s source code. For someone with deep knowledge of assembly language and solid programming skills, recovering the source code (or code with identical functionality) is only a matter of time.
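The control flow reconstruction mentioned above can be sketched on a toy example. The instruction format below (an opcode plus an optional jump target index) is an assumption made for this sketch and does not correspond to any real instruction set; real disassemblers apply the same idea to actual machine code.

```python
from typing import Dict, List, Optional, Tuple

# Toy instruction: (opcode, index of the jump target or None).
Instr = Tuple[str, Optional[int]]

# Flat listing roughly equivalent to: if x == 0: a() else: b(); then c()
PROGRAM: List[Instr] = [
    ("cmp", None),     # 0: compare x with 0
    ("jne", 4),        # 1: conditional jump to the else branch
    ("call_a", None),  # 2: then branch
    ("jmp", 5),        # 3: skip over the else branch
    ("call_b", None),  # 4: else branch
    ("call_c", None),  # 5: code after the if/else
    ("ret", None),     # 6
]

CONDITIONAL = {"jne"}
UNCONDITIONAL = {"jmp"}


def find_leaders(program: List[Instr]) -> List[int]:
    """Leaders start basic blocks: index 0, jump targets, and instructions after jumps."""
    leaders = {0}
    for i, (op, target) in enumerate(program):
        if op in CONDITIONAL or op in UNCONDITIONAL:
            leaders.add(target)
            if i + 1 < len(program):
                leaders.add(i + 1)
    return sorted(leaders)


def build_cfg(program: List[Instr]) -> Dict[int, List[int]]:
    """Map each block's first index to the first indices of its successor blocks."""
    leaders = find_leaders(program)
    cfg: Dict[int, List[int]] = {}
    for n, start in enumerate(leaders):
        end = leaders[n + 1] - 1 if n + 1 < len(leaders) else len(program) - 1
        op, target = program[end]
        successors: List[int] = []
        if op in CONDITIONAL:
            successors.append(target)           # jump taken
            if end + 1 < len(program):
                successors.append(end + 1)      # fall-through
        elif op in UNCONDITIONAL:
            successors.append(target)
        elif op != "ret" and end + 1 < len(program):
            successors.append(end + 1)          # plain fall-through
        cfg[start] = successors
    return cfg


print(build_cfg(PROGRAM))
```

The block starting at index 0 has two successors that both lead to the block at index 5. This diamond shape is what an analyst, or a decompiler, recognizes as an if/else statement.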

So how do you secure your application? Or at least make it more difficult for an attacker?

Here are some popular ways:

  • Code obfuscation is the process of transforming code into a form that is difficult to analyze while preserving its functionality. Obfuscation makes reverse-engineering much more difficult: even if an attacker gets hold of the source code, it is extremely difficult to determine what the code is doing. One of the most effective types of obfuscation is mutation, in which the application constantly changes its own code at runtime. However, there are problems here as well. Obfuscated code is unreadable not only for the attacker but also for the developer. Adding extra code branches can also reduce performance and even introduce bugs. But perhaps the biggest disadvantage is that obfuscation does not guarantee security once an attacker has the code, however hard it is to read: the target is usually a specific piece of code, so there is no need to untangle the entire application to remove, say, copy protection or license verification.
  • Integrity checks confirm that the code has not been changed. To do this, checksums of different sections of the application code are calculated, and if they do not match the expected values, the application stops working. But here again there are difficulties. An intruder who gets access to the application’s code can remove the integrity check or replace it with a function that always returns the correct result.
  • Encryption of the program code ensures that only licensed customers can use the application: without an encryption key the program becomes unusable, or only part of its functionality works. However, even here nothing can guarantee the security of the code, because the key generation mechanism can be exposed.
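The integrity check from the list above can be sketched as follows. This is a simplified illustration using SHA-256 from Python's standard `hashlib` module; `PROTECTED_SECTION`, `integrity_ok` and `start_app` are hypothetical names, and a real check would hash the bytes of the shipped binary or module rather than a string constant.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex digest used as the checksum of a code section."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for a section of application code (assumption for this sketch).
PROTECTED_SECTION = b"def check_license(key): return key == 'secret'"

# Reference value, computed at build time and embedded in the application.
EXPECTED_DIGEST = sha256_of(PROTECTED_SECTION)


def integrity_ok(section: bytes, expected: str = EXPECTED_DIGEST) -> bool:
    """Return True if the section still matches the embedded checksum."""
    return sha256_of(section) == expected


def start_app(section: bytes, check=integrity_ok) -> str:
    """Hypothetical startup routine that refuses to run modified code."""
    return "started" if check(section) else "tampering detected, exiting"


def patched_check(section: bytes) -> bool:
    """Stand-in for a check that an attacker replaced with a stub."""
    return True


tampered = PROTECTED_SECTION.replace(b"key == 'secret'", b"True")

print(start_app(PROTECTED_SECTION))              # unmodified code passes
print(start_app(tampered))                       # modification is detected
print(start_app(tampered, check=patched_check))  # stubbed check lets it through
```

The last call shows the weakness described above: the checksum itself is sound, but the check is ordinary code and can be changed like any other part of the application.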

There are other methods of protection such as watermarks, placing critical code sections into separate modules, protected execution environments, etc., but none of them can provide complete security. The approach to protecting an application must be unique to each individual case.

Code obfuscation, for instance, is not only a security feature; in some cases it can also improve performance. Writing the code on a single line or replacing variable names with shorter, non-obvious ones reduces the size of the build and can make the application load faster. However, other types of obfuscation, such as adding code branches or aliasing, can reduce performance.
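Name shortening can be sketched with Python's standard `ast` module (`ast.unparse` needs Python 3.9 or later). The `RenameLocals` class and the sample function are written for this illustration only. The transformer renames every name it encounters, so it is safe only for code like the sample, which uses no globals or builtins; real obfuscators and minifiers are far more careful.

```python
import ast

SOURCE = '''
def total_price(unit_price, quantity, tax_rate):
    subtotal = unit_price * quantity
    return subtotal + subtotal * tax_rate
'''


class RenameLocals(ast.NodeTransformer):
    """Replace argument and variable names with short, meaningless ones."""

    def __init__(self) -> None:
        self.mapping = {}

    def _short(self, name: str) -> str:
        # Assign v0, v1, ... in order of first appearance.
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = self._short(node.arg)
        return node

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = self._short(node.id)
        return node


tree = RenameLocals().visit(ast.parse(SOURCE))
obfuscated = ast.unparse(tree)
print(obfuscated)

# Both versions behave identically when called with positional arguments.
original_ns, obfuscated_ns = {}, {}
exec(SOURCE, original_ns)
exec(obfuscated, obfuscated_ns)
same_result = (
    original_ns["total_price"](10, 3, 0.2)
    == obfuscated_ns["total_price"](10, 3, 0.2)
)
print(same_result, len(SOURCE.strip()), len(obfuscated))
```

The renamed version is shorter and computes the same result, but the meaning carried by names such as `subtotal` and `tax_rate` is gone. The sketch demonstrates only the size reduction; any effect on speed depends on the platform and is not measured here.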

Therefore, when choosing methods of code protection, you should first be guided by the threat model, namely: what in the application needs protecting, and in what ways an intruder might try to get at it. If the threat is modification of the code, focus on integrity checking; if the threat is someone studying part of the application, consider obfuscation or encryption. Although there is no guaranteed solution, the protection methods above let you make an attacker’s task as hard as possible.
