Introduction To Smali

Smali
is a human-readable assembly language for the Dalvik bytecode that Android apps are compiled to. It is a low-level language: Smali code is assembled into the bytecode executed by the Dalvik Virtual Machine (and its successor, the Android Runtime), which is the runtime environment for Android apps.

Let’s do it

First, create a project, then create a class with any name you like (select Java as the language).
Person.class
MainActivity.class
Then, decompile this APK to smali with Jadx-GUI and apktool. The results look like this:
Jadx-GUI output
APKTool output.
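As you can see, the two outputs differ. Jadx-GUI emits a .registers directive (.registers 5 here), while APKTool emits a .locals directive (.locals 5 here). The difference exists because the two tools use different disassemblers with different conventions: .registers counts every register in the method's frame, whereas .locals counts only the registers that do not hold parameters.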

What are registers and locals?

.registers
declares the total number of registers in the method's frame, including the registers that hold its parameters. Registers are temporary storage that the Dalvik Virtual Machine uses to hold data and intermediate results during method execution. The number a method needs depends on its parameters and on the operations it performs.
.locals
declares only the number of local (non-parameter) registers, named v0, v1, and so on. The parameter registers are added on top of this count automatically, so the value depends on the variables and temporary values used inside the method body, not on the arguments passed to the method.

Can we use .registers instead of .locals?

Both directives size the same register frame in Smali, so a method uses one or the other. One can replace the other, but the numbers are not the same, because they count different things.
The .registers directive at the beginning of the method gives the total number of registers that the Dalvik Virtual Machine reserves for the method, including the registers that receive the parameters (and this, for instance methods). Registers are a limited resource that holds data and intermediate results during method execution.
The .locals directive at the beginning of the method gives only the number of non-parameter registers. These hold the local variables and temporary values that need to persist across multiple instructions within the method.
Therefore, converting between the two means adding or subtracting the parameter registers: .registers = .locals + parameter registers, where long and double parameters take two registers each. For example, an instance method with two int parameters and .locals 1 is equivalent to .registers 4. When editing Smali by hand, .locals is usually the safer choice, because adding a scratch register does not require recounting the parameters.
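As a rough illustration of how the two directives relate, here is a minimal Python sketch. The function names are hypothetical and not part of any tool; the only rules assumed are that instance methods receive this in one extra register and that long (J) and double (D) parameters occupy two registers each.

```python
# Descriptors of the two primitive types that occupy a register pair.
WIDE_TYPES = {"J", "D"}  # long and double


def parameter_registers(param_types, is_static):
    """Count the registers that hold the parameters (plus `this`)."""
    count = 0 if is_static else 1  # instance methods receive `this` in p0
    for descriptor in param_types:
        count += 2 if descriptor in WIDE_TYPES else 1
    return count


def registers_from_locals(locals_count, param_types, is_static=False):
    """Convert a .locals value into the equivalent .registers value."""
    return locals_count + parameter_registers(param_types, is_static)


def locals_from_registers(registers_count, param_types, is_static=False):
    """Convert a .registers value into the equivalent .locals value."""
    return registers_count - parameter_registers(param_types, is_static)


# Instance method taking two ints, declared with .locals 1
print(registers_from_locals(1, ["I", "I"]))             # 4
# Static method taking one double, declared with .registers 6
print(locals_from_registers(6, ["D"], is_static=True))  # 4
```

Passing the parameter types as a plain list keeps the sketch short; parsing them out of a full method signature is a separate step.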

Java2Smali

There is another option too: a plugin for Android Studio named Java2Smali. We can install it like this:
Now let’s compile to smali:

Java VS Smali

Can you spot where the number 26 is in the smali?
As you found out, numbers in disassembled Smali are written in hexadecimal format.

Can you change the age?

Task: Change the age to 39 and recompile the APK.
How do you convert a number to hex and vice versa? In Python:
hex(26) # The result is '0x1a'
int("0x1a", 16) # The result is 26
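The two conversions above can be wrapped in small helpers. This is only a sketch: the function names are made up for illustration, and it covers plain int literals only (no L suffix for long values).

```python
def to_smali_hex(value):
    """Format an integer the way it appears in disassembled Smali."""
    # Negative numbers are written with a leading minus sign, e.g. -0x1
    return f"-0x{-value:x}" if value < 0 else f"0x{value:x}"


def from_smali_hex(literal):
    """Parse a Smali integer literal such as 0x1a or -0x1 back to int."""
    return int(literal, 16)


print(to_smali_hex(26))        # 0x1a
print(from_smali_hex("0x1a"))  # 26
print(to_smali_hex(-1))        # -0x1
```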

How do registers work?

First, consider this code:
public int sum(int firstNumber, int secondNumber) {
    int result = 0;
    result = firstNumber + secondNumber;
    return result;
}
As you know, this is a high-level language. The CPU cannot understand these statements, so the compiler converts the code into instructions that the CPU can execute.
The CPU does not know about variable names or function parameters. It works with registers and memory. So what are registers?
Think of registers as small storage spaces in the CPU where the processor can store data that it needs to work on. It's like a desk with several drawers, where you can store papers and files that you need to access frequently while you're working.
Registers are used by the CPU to perform arithmetic and logical operations, to hold data that is being transferred between different parts of the CPU, and to manage the flow of instructions and data in and out of the CPU. They play a critical role in the performance of the CPU because they can be accessed much faster than the computer's main memory.

Register names in ARM architecture

The ARM architecture has several types of registers with different purposes and names. Here are some of the most common registers used in ARM processors:
General-purpose registers (R0-R15): These registers can be used for a variety of purposes, such as storing data, addresses, or operands for arithmetic operations. R0-R7 are also known as "low registers", while R8-R15 are "high registers". In 32-bit ARM, R13, R14, and R15 have the special roles described below (SP, LR, and PC).
Program Counter (PC): This register holds the memory address of the current instruction being executed.
Link Register (LR): This register stores the return address when a function call is made.
Stack Pointer (SP): This register points to the current top of the stack, which is used for storing temporary data and return addresses when executing functions.
Status Register (CPSR): This register contains various flags that indicate the current status of the processor, such as the mode of operation, whether an arithmetic operation resulted in an overflow or carry, and whether interrupts are enabled or disabled.
Saved Program Status Register (SPSR): This register is used to hold the CPSR value when an interrupt occurs, so that the processor can return to the original state after handling the interrupt.
Note that the specific names and number of registers can vary depending on the specific ARM architecture and implementation.

What is an interrupt?

In computing, an interrupt is a signal to the processor from an external device, such as a keyboard, mouse, or disk drive, requesting the processor's attention. When an interrupt occurs, the processor suspends its current task, saves its current state, and starts executing a special routine called an interrupt handler or interrupt service routine (ISR) to handle the interrupt request.
Interrupts are used to handle events that require immediate attention, such as input/output (I/O) operations, hardware errors, or timer events. By using interrupts, the processor can efficiently handle multiple tasks and events simultaneously, without wasting time continuously checking for them.
Interrupts can be either hardware interrupts or software interrupts. Hardware interrupts are generated by external devices, while software interrupts are generated by software instructions, such as system calls or exceptions.

Look at the code

public int sum(int firstNumber, int secondNumber) {
    int result = 0;
    result = firstNumber + secondNumber;
    return result;
}
Now you can guess how the compiler handles these variables. For example, in the 32-bit ARM calling convention, registers r0 to r3 carry the first function arguments, so firstNumber and secondNumber arrive in registers, while a local variable such as result can be kept in a register or on the program stack. You may ask: what if we have more than 4 function parameters?
It’s a very good question. The first 4 parameters are stored in registers r0 to r3. The others are stored on the stack.
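The rule above can be modelled in a few lines of Python. This is a deliberately simplified sketch that assumes every argument is a single word-sized integer; it ignores 64-bit and floating-point arguments, which the real calling convention treats differently. The function name is hypothetical.

```python
ARG_REGISTERS = ["r0", "r1", "r2", "r3"]


def place_arguments(args):
    """Split word-sized arguments between r0-r3 and the stack."""
    in_registers = dict(zip(ARG_REGISTERS, args))  # first four arguments
    on_stack = list(args[len(ARG_REGISTERS):])      # everything after them
    return in_registers, on_stack


registers, stack = place_arguments([10, 20, 30, 40, 50, 60])
print(registers)  # {'r0': 10, 'r1': 20, 'r2': 30, 'r3': 40}
print(stack)      # [50, 60]
```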

What is the stack?

It is a region of memory that is used for temporary storage of data and return addresses during program execution.

But how does Smali work?

Smali uses registers in a similar way to a CPU, but they are virtual registers with their own names, and the virtual machine decides how they map onto memory and real CPU registers. As you know, Smali code is executed by the Dalvik Virtual Machine.
Smali divides registers into two types:
Local: These registers are initialized and used in the body of the method. They are named v0 up to v(n).
Parameters: These registers hold the method parameters. They are named p0 up to p(n). In instance (non-static) methods the declared parameters start from p1, because p0 always points to the this object. Static methods have no this, so their first parameter is p0.
Some tools, like Jadx, use a single register type (v) for all registers. This is why I prefer APKTool: it makes Smali code easier to read and understand.
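When a tool shows only v registers, the p names can still be recovered, because the parameter registers are the last registers of the method's frame: p0 is the first register after the locals. A minimal sketch of that mapping, with a hypothetical helper name:

```python
def p_to_v(p_name, locals_count):
    """Translate a parameter register name (p0, p1, ...) to its v name."""
    index = int(p_name[1:])            # "p2" -> 2
    return f"v{locals_count + index}"  # parameters follow the locals


# An instance method declared with .locals 1 and two int parameters:
for name in ["p0", "p1", "p2"]:
    print(name, "->", p_to_v(name, locals_count=1))
# p0 -> v1
# p1 -> v2
# p2 -> v3
```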

What is the this keyword?

In object-oriented programming languages such as Java, C++, and C#, the "this" keyword refers to the current instance of an object (Python uses the conventional parameter name self for the same purpose). It is a reference to the object itself, and it is used to access the object's fields and methods.
Now let’s dig into the code. Compile the IntroductionToRegisters project in debug mode and then decompile it. Then check the differences.

Type Descriptor Semantics

| Syntax | Meaning |
| --- | --- |
| V | void |
| Z | boolean |
| B | byte |
| S | short |
| C | char |
| F | float |
| I | int |
| J | long |
| D | double |
| [descriptor (Example: [B → byte[]) | array |
| Lfully/qualified/Name; | class name |
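To make the descriptor table concrete, the following Python sketch translates a single type descriptor back into Java-style syntax. The function name is hypothetical, and the lookup table simply mirrors the descriptors listed above.

```python
PRIMITIVES = {
    "V": "void", "Z": "boolean", "B": "byte", "S": "short", "C": "char",
    "F": "float", "I": "int", "J": "long", "D": "double",
}


def descriptor_to_java(descriptor):
    """Translate one type descriptor into Java-style type syntax."""
    # Each leading "[" adds one array dimension.
    dimensions = len(descriptor) - len(descriptor.lstrip("["))
    base = descriptor[dimensions:]
    if base.startswith("L") and base.endswith(";"):
        name = base[1:-1].replace("/", ".")  # Ljava/lang/String; -> java.lang.String
    else:
        name = PRIMITIVES[base]              # raises KeyError on unknown input
    return name + "[]" * dimensions


print(descriptor_to_java("I"))                     # int
print(descriptor_to_java("[B"))                    # byte[]
print(descriptor_to_java("[[Ljava/lang/String;"))  # java.lang.String[][]
```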

Practical Example

.method public sum2(II)I
    .locals 1
    .param p1, "firstNumber"    # I
    .param p2, "secondNumber"    # I

    .line 25
    add-int v0, p1, p2

    .line 26
    .local v0, "result":I
    return v0
.end method
As you can see, the parameter list contains II. This means the method takes 2 int parameters. After the closing parenthesis there is another I, which indicates that the method returns an int value.
This code defines a method named "sum2" that takes two integer parameters, "firstNumber" and "secondNumber", and returns an integer. Here's a breakdown of what each line does:
.method public sum2(II)I
: This line declares the method signature. It specifies that the method is public, takes two integer parameters (
II
), and returns an integer (
I
).
.locals 1
: This line indicates that the method needs one local (non-parameter) register.
.param p1, "firstNumber" # I
and
.param p2, "secondNumber" # I
: These lines carry debug information that names the two integer parameters "firstNumber" and "secondNumber" respectively.
.line 25
: This line specifies the line number in the source code where the following instruction is located.
add-int v0, p1, p2
: This line adds the values of the two parameters (
p1
and
p2
) and stores the result in the local variable
v0
.
.line 26
: This line specifies the line number in the source code where the following instruction is located.
.local v0, "result":I
: This line is debug information stating that, from this point on, register v0 holds the source-level local variable "result" of type integer (I). It does not assign or copy any value.
return v0
: This line returns the value of the local variable
v0
as the result of the method.
.end method
: This line marks the end of the method definition.
So, in summary, the
sum2
method takes two integer parameters, adds them together, and returns the result as an integer.
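The (II)I part of the signature discussed above can also be split mechanically. The sketch below uses a regular expression to separate parameter descriptors from the return type; the function and pattern names are hypothetical, and the second input is just an invented signature to show class and array parameters.

```python
import re

# One type descriptor: optional array prefixes, then a primitive or class.
TYPE_PATTERN = re.compile(r"\[*(?:[VZBSCFIJD]|L[^;]+;)")


def split_method_descriptor(descriptor):
    """Split e.g. "(II)I" into (["I", "I"], "I")."""
    params, return_type = descriptor[1:].split(")")
    return TYPE_PATTERN.findall(params), return_type


print(split_method_descriptor("(II)I"))
# (['I', 'I'], 'I')
print(split_method_descriptor("(Ljava/lang/String;[BJ)V"))
# (['Ljava/lang/String;', '[B', 'J'], 'V')
```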

Now It’s your turn

.method public static function(D)D
    .registers 6
    .param p0, "r"    # D

    .prologue
    .line 45
    const-wide v2, 0x400921fb54442d18L    # Math.PI

    mul-double/2addr v2, p0

    mul-double v0, v2, p0

    .line 46
    .local v0, "area":D
    return-wide v0
.end method
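The 64-bit literal in the listing above is the raw IEEE 754 bit pattern of a double, which is why it looks nothing like 3.14. The Python sketch below decodes and encodes such literals; the helper names are hypothetical, and it assumes a non-negative hex literal with the trailing L suffix.

```python
import math
import struct


def smali_double(literal):
    """Decode a const-wide literal such as 0x400921fb54442d18L to a float."""
    bits = int(literal.rstrip("L"), 16)  # drop the long suffix
    return struct.unpack(">d", bits.to_bytes(8, "big"))[0]


def to_smali_double(value):
    """Encode a float as the 64-bit hex literal used by const-wide."""
    return "0x" + struct.pack(">d", value).hex() + "L"


print(smali_double("0x400921fb54442d18L") == math.pi)  # True
print(to_smali_double(1.0))                            # 0x3ff0000000000000L
```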
Solution
.class public Llab/seczone64/introductiontoregisters/HackerClass;
.super Ljava/lang/Object;
.source "HackerClass.java"

# static fields
.field public static canWriteCPP:Z = false
.field public static final profession:Ljava/lang/String; = "Android Hacking"

# instance fields
.field private age:I
.field private nickName:Ljava/lang/String;
.field private realName:Ljava/lang/String;

# direct methods
.method static constructor <clinit>()V
    .locals 1

    .line 11
    const/4 v0, 0x1
    sput-boolean v0, Llab/seczone64/introductiontoregisters/HackerClass;->canWriteCPP:Z
    return-void
.end method

.method public constructor <init>()V
    .locals 1

    .line 3
    invoke-direct {p0}, Ljava/lang/Object;-><init>()V

    .line 5
    const/4 v0, 0x0
    iput-object v0, p0, Llab/seczone64/introductiontoregisters/HackerClass;->nickName:Ljava/lang/String;
    return-void
.end method
Here's a description of each line of code in the
HackerClass
class:
.class public Llab/seczone64/introductiontoregisters/HackerClass;
This line defines the class
HackerClass
in the package
lab.seczone64.introductiontoregisters
.
.super Ljava/lang/Object;
This line specifies that
HackerClass
extends the
java.lang.Object
class.
.source "HackerClass.java"
This line specifies the source file that this class was generated from.
.field public static canWriteCPP:Z = false
This line defines a public static
boolean
variable named
canWriteCPP
and initializes it to
false
.
.field public static final profession:Ljava/lang/String; = "Android Hacking"
This line defines a public static final string variable named
profession
and initializes it to the value
"Android Hacking"
. The
final
keyword means that this variable cannot be reassigned after it is initialized.
.field private age:I
This line defines a private integer instance variable named
age
.
.field private nickName:Ljava/lang/String;
This line defines a private string instance variable named
nickName
.
.field private realName:Ljava/lang/String;
This line defines a private string instance variable named
realName
.
.method static constructor <clinit>()V
This line of code defines a static initializer method for the
HackerClass
class. The static initializer is a special method that the runtime (the Dalvik Virtual Machine or ART on Android) calls once, when the class is first initialized. Its purpose is to initialize static variables and perform other one-time setup before the class is used.
The static initializer method is identified by the name
<clinit>
and has a return type of
void
(
V
). The
static
keyword indicates that the method is a class method rather than an instance method.
In this specific example, the static initializer method sets the value of the
canWriteCPP
static variable to
true
. The method achieves this by first loading the value
1
into register v0 (Dalvik is register-based and has no operand stack; this is done with the
const/4 v0, 0x1
instruction) and then storing that value in the
canWriteCPP
static variable using the
sput-boolean
instruction.
.locals 1
This line specifies that the method will use one local (non-parameter) register.
const/4 v0, 0x1
This line sets the value of local variable
v0
to
1
.
sput-boolean v0, Llab/seczone64/introductiontoregisters/HackerClass;->canWriteCPP:Z
This line sets the value of the
canWriteCPP
static variable to the value of local variable
v0
. The
sput-boolean
instruction stores a boolean value in a static field.
return-void .end method
This line ends the static initializer method.
.method public constructor <init>()V
This line defines the public constructor method for the class.
.locals 1
This line specifies that the method will use one local (non-parameter) register.
invoke-direct {p0}, Ljava/lang/Object;-><init>()V
This line calls the constructor of the
java.lang.Object
class (the superclass of
HackerClass
) to initialize the object.
const/4 v0, 0x0
This line sets the value of local variable
v0
to
0
, which represents null when the register is used as an object reference.
iput-object v0, p0, Llab/seczone64/introductiontoregisters/HackerClass;->nickName:Ljava/lang/String;
This line sets the value of the
nickName
instance variable to the value of local variable
v0
. The
iput-object
instruction stores an object reference in an instance field.
return-void .end method
This line ends the constructor method.

Inner classes in Smali

In Smali, an inner class is stored in its own .smali file, located in the same directory as the file of the outer class. The name of the inner class file is formed by concatenating the name of the outer class with the name of the inner class, separated by a $ symbol.
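The naming rule can be expressed as a short Python sketch. The function name is hypothetical, and the returned path is relative to whatever directory the disassembler writes its smali output to.

```python
def smali_path(package, outer, *inner):
    """Build the relative .smali path for a (possibly nested) class."""
    class_name = "$".join((outer, *inner))  # OuterClass$InnerClass
    return package.replace(".", "/") + "/" + class_name + ".smali"


print(smali_path("lab.seczone64.nestedclassesinsmali", "OuterClass"))
# lab/seczone64/nestedclassesinsmali/OuterClass.smali
print(smali_path("lab.seczone64.nestedclassesinsmali", "OuterClass", "InnerClass"))
# lab/seczone64/nestedclassesinsmali/OuterClass$InnerClass.smali
```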
Consider this Java code:
package lab.seczone64.nestedclassesinsmali;

public class OuterClass {
    public static class InnerClass {
        public static int innerClassVariable = 1;
    }

    public int accessInnerClass() {
        return InnerClass.innerClassVariable;
    }
}
The result in
Smali
is two files in the same directory with these names:
lab/seczone64/nestedclassesinsmali/OuterClass.smali
.class public Llab/seczone64/nestedclassesinsmali/OuterClass;
.super Ljava/lang/Object;
.source "OuterClass.java"

# annotations
.annotation system Ldalvik/annotation/MemberClasses;
    value = {
        Llab/seczone64/nestedclassesinsmali/OuterClass$InnerClass;
    }
.end annotation

# direct methods
.method public constructor <init>()V
    .locals 0

    .line 3
    invoke-direct {p0}, Ljava/lang/Object;-><init>()V
    return-void
.end method

# virtual methods
.method public accessInnerClass()I
    .locals 1

    .line 10
    sget v0, Llab/seczone64/nestedclassesinsmali/OuterClass$InnerClass;->innerClassVariable:I
    return v0
.end method
lab/seczone64/nestedclassesinsmali/OuterClass$InnerClass.smali
.class public Llab/seczone64/nestedclassesinsmali/OuterClass$InnerClass;
.super Ljava/lang/Object;
.source "OuterClass.java"

# annotations
.annotation system Ldalvik/annotation/EnclosingClass;
    value = Llab/seczone64/nestedclassesinsmali/OuterClass;
.end annotation

.annotation system Ldalvik/annotation/InnerClass;
    accessFlags = 0x9
    name = "InnerClass"
.end annotation

# static fields
.field public static innerClassVariable:I

# direct methods
.method static constructor <clinit>()V
    .locals 1

    .line 6
    const/4 v0, 0x1
    sput v0, Llab/seczone64/nestedclassesinsmali/OuterClass$InnerClass;->innerClassVariable:I
    return-void
.end method

.method public constructor <init>()V
    .locals 0

    .line 5
    invoke-direct {p0}, Ljava/lang/Object;-><init>()V
    return-void
.end method
In the context of
Smali
, annotations are additional metadata that can be attached to classes, methods, fields, and parameters in a Dalvik bytecode file. Annotations are used to provide additional information to the Android runtime or to other tools that process the bytecode.
The
.annotation
directive in Smali is used to define an annotation. It is followed by a visibility (build, runtime, or system) and the type of the annotation, then by a list of name = value pairs that set the annotation's elements, and it is closed by .end annotation.
Here's an example of a
.annotation
directive in Smali:
.annotation system Ldalvik/annotation/EnclosingClass;
    value = Lcom/example/OuterClass;
.end annotation
This example defines an annotation of type
dalvik.annotation.EnclosingClass
and sets its
value
field to
com.example.OuterClass
.
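Annotations in Smali serve various purposes. The system annotations shown above (MemberClasses, EnclosingClass, and InnerClass) record how nested classes relate to each other, and annotations that the developer wrote in the Java source and that are retained in the bytecode appear in the same form.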
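The accessFlags = 0x9 value in the InnerClass annotation is a bit mask of modifiers. The Python sketch below decodes it using a subset of the standard access flag bits; the function name is hypothetical and the table is intentionally incomplete.

```python
ACCESS_FLAGS = {
    0x1: "public", 0x2: "private", 0x4: "protected", 0x8: "static",
    0x10: "final", 0x200: "interface", 0x400: "abstract",
}


def decode_access_flags(value):
    """List the modifier names whose bits are set in accessFlags."""
    return [name for bit, name in ACCESS_FLAGS.items() if value & bit]


print(decode_access_flags(0x9))  # ['public', 'static']
```

This matches the Java source, where InnerClass is declared public static.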

This keyword

In Smali, the p0 register of an instance method holds the reference to the current object (i.e., the object on which the method is being called).
The p registers (parameter registers) are simply names for the last registers of the method's frame, so they can be used as operands exactly like the v registers. To read or write a field of the current object, pass p0 as the object operand of an iget or iput instruction. A method that returns the current object itself can use return-object p0 directly.
The move-result-object instruction is not needed for this: it only copies the object returned by the preceding invoke instruction into a register. Here's an example that reads a field through this:
package lab.seczone64.thisobject;

public class AndroidHacker {
    private String name;
    private int skill;

    public int getSkill() {
        return this.skill;
    }
}
The
Smali
is:
.class public Llab/seczone64/thisobject/AndroidHacker;
.super Ljava/lang/Object;
.source "AndroidHacker.java"

# instance fields
.field private name:Ljava/lang/String;
.field private skill:I

# direct methods
.method public constructor <init>()V
    .locals 0

    .line 3
    invoke-direct {p0}, Ljava/lang/Object;-><init>()V
    return-void
.end method

# virtual methods
.method public getSkill()I
    .locals 1

    .line 8
    iget v0, p0, Llab/seczone64/thisobject/AndroidHacker;->skill:I
    return v0
.end method