Android decompilation process and tools

To start with, the picture below shows the process from apk to Java. To decompile an apk, we can use tools such as apktool, dex2jar, enjarify, jd-core (jd-gui), cfr and procyon. An apk can be treated as a zip file, and the DEX file inside it holds the bytecode executed by Dalvik or ART on Android. Before we can recover Java source, the DEX must be converted to an equivalent JAR, which packages Java class files. Finally, one of the Java decompilers (cfr, jd-core or procyon) turns the JAR into Java files that are very close to the original project. The source cannot be restored completely, though, because information such as comments is discarded at compile time, and this is the hard nut to crack when developing a practical decompiler.
apk2java

The picture below shows the difference between a jar file and an apk file. A class file and a dex file play essentially the same role: both are binary bytecode files. Dex files are produced from class files during the Android build, and Dalvik and ART run dex the way the JVM runs class files. We can disassemble a class file with javap from the JDK, and a dex file with apktool.

jar-apk-compare
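Both formats are plain byte streams, so they can be told apart by their first bytes: a Java class file starts with the magic number 0xCAFEBABE, and a dex file starts with the ASCII bytes "dex" and a newline, followed by a version string. A minimal sketch (the function name bytecode_kind is hypothetical):

```python
CLASS_MAGIC = bytes.fromhex("cafebabe")  # first 4 bytes of every Java class file
DEX_MAGIC = b"dex\n"  # first 4 bytes of a dex file; a version such as "035\0" follows


def bytecode_kind(data):
    """Classify a byte string as 'class', 'dex' or 'unknown' by its magic bytes."""
    if data.startswith(CLASS_MAGIC):
        return "class"
    if data.startswith(DEX_MAGIC):
        return "dex"
    return "unknown"


if __name__ == "__main__":
    print(bytecode_kind(b"\xca\xfe\xba\xbe\x00\x00\x00\x34"))
    print(bytecode_kind(b"dex\n035\x00"))
```

To check a real file, read its first few bytes with open(path, "rb").read(8) and pass them in.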

First, get resources and smali code from the apk

apktool

https://ibotpeaches.github.io/Apktool/

A tool for reverse engineering 3rd party, closed, binary Android apps. It can decode resources to nearly original form and rebuild them after making some modifications. It also makes working with an app easier because of the project-like file structure and automation of some repetitive tasks like building apk, etc.

java -jar apktool.jar d yourapp.apk
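After decoding, apktool leaves a directory (named yourapp for the command above) that contains AndroidManifest.xml, res/ and one smali directory per dex file (smali, smali_classes2, ...). As a quick sanity check, the smali files in each of them can be counted. This is just a sketch, and count_smali_files is a name invented here:

```python
from pathlib import Path


def count_smali_files(decoded_dir):
    """Map each smali* directory in an apktool output folder to its .smali file count."""
    counts = {}
    for child in sorted(Path(decoded_dir).iterdir()):
        if child.is_dir() and child.name.startswith("smali"):
            # rglob walks the package sub-directories recursively
            counts[child.name] = sum(1 for _ in child.rglob("*.smali"))
    return counts


if __name__ == "__main__":
    # Assumes the apktool command above was run in the current directory.
    if Path("yourapp").is_dir():
        print(count_smali_files("yourapp"))
```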

Second, get a jar file from the apk/dex

2.1 dex2jar

https://github.com/pxb1988/dex2jar

dex2jar is a set of tools for working with Android .dex and Java .class files.

./d2j-dex2jar.sh <your-classes.dex> -o <out-jar-file>
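The command above expects a dex file, which first has to come out of the apk. One possible way to do that, sketched with the standard library only (extract_dex_files is a hypothetical helper, not part of dex2jar):

```python
import zipfile
from pathlib import Path


def extract_dex_files(apk_path, out_dir):
    """Copy every top-level classes*.dex entry of an apk into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(apk_path) as zf:
        for name in zf.namelist():
            # Only take dex files at the archive root, e.g. classes.dex, classes2.dex
            if "/" not in name and name.startswith("classes") and name.endswith(".dex"):
                target = out / name
                target.write_bytes(zf.read(name))
                extracted.append(target)
    return sorted(extracted)


if __name__ == "__main__":
    # Assumes yourapp.apk is in the current directory.
    if Path("yourapp.apk").is_file():
        for dex in extract_dex_files("yourapp.apk", "dex-out"):
            print(dex)
```

Each printed path can then be passed to d2j-dex2jar.sh as the input dex file.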

2.2 enjarify

https://github.com/google/enjarify

Introduction

Enjarify is a tool for translating Dalvik bytecode to equivalent Java bytecode. This allows Java analysis tools to analyze Android applications.

Why not dex2jar?

Dex2jar is an older tool that also tries to translate Dalvik to Java bytecode. It works reasonably well most of the time, but a lot of obscure features or edge cases will cause it to fail or even silently produce incorrect results. By contrast, Enjarify is designed to work in as many cases as possible, even for code where Dex2jar would fail. Among other things, Enjarify correctly handles unicode class names, constants used as multiple types, implicit casts, exception handlers jumping into normal control flow, classes that reference too many constants, very long methods, exception handlers after a catchall handler, and static initial values of the wrong type.

Use Python 3 to run it

Enjarify is a pure python 3 application, so you can just git clone and run it. To run it directly, assuming you are in the top directory of the repository, you can just do

python3 -O -m enjarify.main yourapp.apk

For normal use, you'll probably want to use the wrapper scripts and set it up on your path.

Linux

For convenience, a wrapper shell script is provided, enjarify.sh. This will try to use Pypy if available, since it is faster than CPython. If you want to be able to call Enjarify from anywhere, you can create a symlink from somewhere on your PATH, such as ~/bin. To do this, assuming you are inside the top level of the repository,

ln -s "$PWD/enjarify.sh" ~/bin/enjarify

Windows

A wrapper batch script, enjarify.bat, is provided. To be able to call it from anywhere, just add the root directory of the repository to your PATH. The batch script will always invoke python3 as interpreter. If you want to use pypy, just edit the script.

Usage

Assuming you set up the script on your path correctly, you can call it from anywhere by just typing enjarify, e.g.

enjarify yourapp.apk

The most basic form of usage is to just specify an apk file or dex file as input. If you specify a multidex apk, Enjarify will automatically translate all of the dex files and output the results in a single combined jar. If you specify a dex file, only that dex file will be translated. E.g. assuming you manually extracted the dex files you could do

enjarify classes2.dex

The default output file is [inputname]-enjarify.jar in the current directory. To specify the filename for the output explicitly, pass the -o or --output option.

enjarify yourapp.apk -o yourapp.jar

By default, Enjarify will refuse to overwrite the output file if it already exists. To overwrite the output, pass the -f or --force option.
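When enjarify is called from a script, the options described above can be assembled programmatically. The sketch below only builds the argument list and does not run anything. Both function names are made up for this post, and default_output_name assumes that [inputname] means the input's base name without its extension.

```python
import os


def default_output_name(input_path):
    """Guess enjarify's default output name ([inputname]-enjarify.jar).

    Assumption: [inputname] is the base name of the input without extension.
    """
    stem = os.path.splitext(os.path.basename(input_path))[0]
    return stem + "-enjarify.jar"


def enjarify_command(input_path, output=None, force=False):
    """Build an enjarify command line using the -o and -f options."""
    cmd = ["enjarify", input_path]
    if output is not None:
        cmd += ["-o", output]  # explicit output file name
    if force:
        cmd.append("-f")  # overwrite the output file if it already exists
    return cmd


if __name__ == "__main__":
    print(default_output_name("yourapp.apk"))
    print(enjarify_command("yourapp.apk", output="yourapp.jar", force=True))
```

The returned list can be handed to subprocess.run once the enjarify wrapper is on your PATH.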

Third, get Java code from the jar file

3.1 jd-core

https://github.com/nviennot/jd-core-java

JD-Core is the decompiler library behind jd-gui, a fact that is often overlooked. It is open source and does the decompiling job quickly.

JD-Core-java is a thin wrapper for the Java Decompiler.

This is a hack around the IntelliJ IDE plugin. It fakes the interfaces of the IDE, and provides access to JD-Core.

java -jar jd-core.jar <compiled.jar> <out-dir>

3.2 cfr

http://www.benf.org/other/cfr/

CFR will decompile modern Java features, up to and including much of Java 9, 10 and beyond, but is written entirely in Java 6, so will work anywhere! It'll even make a decent go of turning class files from other JVM languages back into Java.

java -jar cfr.jar <compiled.jar> --outputdir <dir>

3.3 procyon

https://bitbucket.org/mstrobel/procyon/wiki/Java%20Decompiler

Procyon is a suite of Java metaprogramming tools focused on code generation and analysis. It includes the following libraries:

  1. Core Framework
  2. Reflection Framework
  3. Expressions Framework
  4. Compiler Toolset (Experimental)
  5. Java Decompiler

procyon-decompiler is a standalone front-end for the Java decompiler included in procyon-compilertools.

Procyon's author explains the motivation as follows:

As a developer who splits his time between the .NET and Java platforms, I have been surprised and dismayed by the lackluster selection of decompilers in the Java ecosystem. Jad (no longer maintained, closed source) and JD-GUI (GPL3) are pretty decent choices, but the former does not support Java 5+ language features, and the latter tends to barf on code emitted by my LINQ/DLR tree compiler.

To address the situation, I recently started developing a decompiler myself, inspired by (and borrowed heavily from) ILSpy and Mono.Cecil.

java -jar procyon.jar <compiled.jar> -o <dir>
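Since no decompiler restores everything, it can be worth running all three on the same jar and comparing the results. The sketch below simply collects the three commands shown in this section, one output directory per tool. It assumes jd-core.jar, cfr.jar and procyon.jar sit in the current directory under exactly these names, and decompiler_commands is a hypothetical helper.

```python
import shlex


def decompiler_commands(jar, out_root):
    """Return one command per decompiler, each writing to its own sub-directory."""
    return {
        "jd-core": ["java", "-jar", "jd-core.jar", jar, f"{out_root}/jd-core"],
        "cfr": ["java", "-jar", "cfr.jar", jar, "--outputdir", f"{out_root}/cfr"],
        "procyon": ["java", "-jar", "procyon.jar", jar, "-o", f"{out_root}/procyon"],
    }


if __name__ == "__main__":
    # Print the commands instead of running them; use subprocess.run(cmd) to execute.
    for tool, cmd in decompiler_commands("yourapp.jar", "out").items():
        print(f"{tool}: {shlex.join(cmd)}")
```

Keeping the outputs in separate directories makes it easy to diff them afterwards.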

4 Other

4.1 Jeb (Android-java decompiler)

https://www.pnfsoftware.com/

Reverse Engineering for Professionals.
Decompile and debug binary code. Break down and analyze document files.
Dalvik, MIPS, ARM, Intel, WebAssembly & Ethereum Decompilers.

jeb-gui

You may find that some libraries are missing when you start JEB. You can install them with sudo apt-get install.

$ sudo apt-get install libcanberra-gtk-module

4.2 IDA_Pro (C/.so DisAssembler)

https://www.hex-rays.com/

IDA is the Interactive DisAssembler: the world's smartest and most feature-full disassembler, which many software security specialists are familiar with.

Written entirely in C++, IDA runs on the three major operating systems: Microsoft Windows, Mac OS X, and Linux.

IDA is also the solid foundation on which our second product, the Hex-Rays decompiler, is built.

The unique Hex-Rays decompiler delivers on the promise of high level representation of binary executables. It can handle real world code. It is real.

ida-gui

You may run into trouble with missing libraries when you start IDA.

$ sudo apt-get install multiarch-support

# Install the 32-bit (i386) packages that provide the following libraries:
#   libgthread-2.0.so.0
#   libfreetype.so.6
#   libSM.so.6
#   libXrender.so.1
#   libfontconfig.so.1
#   libXext.so.6

$ sudo apt-get install libc6:i386 libncurses5:i386 libstdc++6:i386 libglib2.0-0:i386 libfreetype6:i386 libsm6:i386 libxrender1:i386 libfontconfig1:i386 libxext6:i386