Arithmetic Obfuscation
When to use it?
Most of the time, arithmetic operations (+, -, ^, &, |, ...) are compiled into assembly instructions that closely mirror the original operation¹.
For instance, consider the compilation of this function in which the xor
operation is considered sensitive:
void encode(uint8_t* data, size_t len) {
  for (size_t i = 0; i < len; ++i) {
    data[i] = data[i] ^ 0x23;
  }
}
The following sequence of AArch64 instructions is generated:
mov w10, #35
...
ldrb w11, [x9]
eor w11, w11, w10
strb w11, [x9], #1
From a reverse engineering perspective, the sensitive xor operation, compiled into the EOR instruction, can be straightforwardly identified.
Thus, if you reckon that the arithmetic operations performed by your function may be sensitive, you may want to consider enabling this obfuscation pass.
As a result, when this obfuscation pass is enabled, the previous operation is transformed into a more complex expression:
// Transformation of data[i] = data[i] ^ 0x23;
uint32_t C = data[i];
int32_t X = -(C & 0x23);
data[i] = ((C + (0xffffffdc | ~C) + 0x24) ^ X) +
          (((C + (0xffffffdc | ~C) + 0x24) & X) << 1);
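The rewritten expression can be sanity-checked by brute force. The sketch below is a minimal Python model (not part of O-MVLL), assuming 32-bit unsigned wrap-around semantics and the bitwise complement ~C in the constant term:

```python
MASK = 0xFFFFFFFF  # emulate 32-bit unsigned wrap-around

def mba_xor_0x23(c: int) -> int:
    # X = -(C & 0x23)
    x = -(c & 0x23) & MASK
    # T = C + (0xffffffdc | ~C) + 0x24   (note: 0xffffffdc is ~0x23)
    t = (c + (0xFFFFFFDC | (~c & MASK)) + 0x24) & MASK
    # (T ^ X) + ((T & X) << 1) is the carry-expanded form of T + X
    return ((t ^ x) + ((t & x) << 1)) & MASK

# The obfuscated expression behaves exactly like the plain xor:
assert all(mba_xor_0x23(c) == c ^ 0x23 for c in range(256))
```

The last line relies on the identity a + b = (a ^ b) + 2*(a & b), which is what hides the final addition behind xor/and/shift operations.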
How to use it?
In the O-MVLL configuration file, you must define the following method in the Python class implementing omvll.ObfuscationConfig:
def obfuscate_arithmetic(self, mod: omvll.Module,
                         fun: omvll.Function) -> omvll.ArithmeticOpt:
    if fun.name == "encode":
        # Explicitly define the number of rounds
        # applied on the arithmetic expressions
        return omvll.ArithmeticOpt(rounds=8)
    if mod.name.endswith("encrypt.cpp"):
        # Obfuscate with a number of iterations defined by O-MVLL
        return True
    return False
The rounds parameter defines how many times the MBA rewriting loop is applied to the arithmetic operation:

Operation: X ^ Y
Round #1:  (X | Y) - (X & Y)
Round #2:  (((Y + X) - (Y & X)) ^ ((Y | X) - (Y + X))) + 2*(((Y + X) - (Y & X)) & ((Y | X) - (Y + X)))
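Each round feeds the previous output back into the rewriter, so the expression grows while its value is preserved. A minimal Python sketch (8-bit operands, wrap-around emulated by masking; the function names are illustrative) checks both expansions above:

```python
MASK = 0xFF  # check over 8-bit operands; the identities are width-independent

def xor_round1(x: int, y: int) -> int:
    # Round #1: (X | Y) - (X & Y)
    return ((x | y) - (x & y)) & MASK

def xor_round2(x: int, y: int) -> int:
    # Round #2 re-expands the two sub-terms of round #1:
    a = ((y + x) - (y & x)) & MASK   # rewritten form of (X | Y)
    b = ((y | x) - (y + x)) & MASK   # rewritten form of -(X & Y)
    return ((a ^ b) + 2 * (a & b)) & MASK

pairs = [(x, y) for x in range(256) for y in range(256)]
assert all(xor_round1(x, y) == x ^ y for x, y in pairs)
assert all(xor_round2(x, y) == x ^ y for x, y in pairs)
```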
It is highly recommended to run this pass with at least two rounds; see the next section for the details.
Implementation
The transformation of the arithmetic operations works by iterating over all the instructions of a basic block and selecting the arithmetic operations (+, -, ^, &, |) that can be transformed into a Mixed Boolean-Arithmetic expression:
for (Instruction& I : BasicBlock) {
  // dyn_cast returns nullptr if I is not a BinaryOperator
  if (auto* BinOp = dyn_cast<BinaryOperator>(&I)) {
    if (isSupported(*BinOp)) {
      // Process ...
    }
  }
}
In the first stage, the processing consists of extracting the arithmetic operation (e.g. add) and wrapping it in a dedicated function.
For instance, let’s consider this addition:

int c = a + b;

After creating its wrapper, it becomes:

int __omvll_add(int x, int y) {
  return x + y;
}

int c = __omvll_add(a, b);
“Why do such a transformation instead of directly replacing the addition with its MBA equivalent?”

First, the compiler itself is able to simplify MBAs, as discussed in the next section: Limitations & Attacks.
One solution to prevent these optimizations is to create a function flagged with the OptimizeNone attribute, which is roughly equivalent to:
__attribute__((optnone))
int __omvll_add(int x, int y) {
  return x + y;
}
The second advantage of this call-based transformation lies in the implementation of the iteration process. If we transformed the arithmetic operations in place in the original basic block, each iteration would have to re-process all the other instructions of the basic block.
With a wrapper, the iterations only need to be applied to the isolated function, not to all the instructions of the original basic block:
__attribute__((optnone))
int __omvll_add(int x, int y) {
  int E = x + y;
  for (size_t i = 0; i < NB_ITERATIONS; ++i) {
    E = MBA_add_xor_sub_or_and(E);
  }
  return E;
}
Last but not least, we have to make sure that the wrapper function is inlined at the end of the processing; otherwise, the call itself would be a weakness.
Wrapper->addFnAttr(Attribute::AlwaysInline);
This LLVM inlining attribute is roughly equivalent to:

__attribute__((optnone))
__attribute__((always_inline))
int __omvll_add(int x, int y) {
  int E = x + y;
  for (size_t i = 0; i < NB_ITERATIONS; ++i) {
    E = MBA_add_xor_sub_or_and(E);
  }
  return E;
}
Regarding the transformations themselves, they rely on the built-in LLVM pattern matcher:
Instruction& I = ...;
Value *X, *Y;

// Match X & Y
if (!match(&I, m_And(m_Value(X), m_Value(Y)))) {
  return nullptr;
}
Upon a match, the instruction associated with the operation is replaced with a Mixed Boolean-Arithmetic (MBA) expression:
// (X + Y) - (X | Y)
return BinaryOperator::CreateSub(
    Builder.CreateAdd(X, Y),
    Builder.CreateOr(X, Y)
);
The overall structure of the transformer is highly inspired by the InstructionCombining pass, and the MBAs used in this pass are taken from sspam, developed by Ninon Eyrolles:

| Operation | MBA Transformation        |
|-----------|---------------------------|
| X ^ Y     | (X \| Y) - (X & Y)        |
| X + Y     | (X & Y) + (X \| Y)        |
| X - Y     | (X ^ -Y) + 2*(X & -Y)     |
| X & Y     | (X + Y) - (X \| Y)        |
| X \| Y    | X + Y + 1 + (~X \| ~Y)    |
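All five rewrites are standard two's-complement identities, so they can be checked exhaustively at a small bit width. A minimal Python sketch (8-bit operands; the dictionary structure is illustrative, not O-MVLL code):

```python
MASK = 0xFF  # verify over 8-bit values; the identities hold for any fixed width

# (original operation, MBA replacement) pairs from the table above
mbas = {
    "xor": (lambda x, y: x ^ y, lambda x, y: (x | y) - (x & y)),
    "add": (lambda x, y: x + y, lambda x, y: (x & y) + (x | y)),
    "sub": (lambda x, y: x - y, lambda x, y: (x ^ -y) + 2 * (x & -y)),
    "and": (lambda x, y: x & y, lambda x, y: (x + y) - (x | y)),
    "or":  (lambda x, y: x | y, lambda x, y: x + y + 1 + (~x | ~y)),
}

for name, (op, mba) in mbas.items():
    assert all((op(x, y) & MASK) == (mba(x, y) & MASK)
               for x in range(256) for y in range(256)), name
```

Python integers behave as infinite-width two's complement under bitwise operators, so masking both sides is enough to emulate fixed-width wrap-around.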
Limitations & Attacks
First and foremost, LLVM is able to simplify some MBAs, as we can see in llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp:
...
// (A ^ B) ^ (A | C) --> (~A & C) ^ B -- There are 4 commuted variants.
if (match(&I, m_c_Xor(m_OneUse(m_Xor(m_Value(A), m_Value(B))),
                      m_OneUse(m_c_Or(m_Deferred(A), m_Value(C))))))
  return BinaryOperator::CreateXor(
      Builder.CreateAnd(Builder.CreateNot(A), C), B);

// (A ^ B) ^ (B | C) --> (~B & C) ^ A -- There are 4 commuted variants.
if (match(&I, m_c_Xor(m_OneUse(m_Xor(m_Value(A), m_Value(B))),
                      m_OneUse(m_c_Or(m_Deferred(B), m_Value(C))))))
  return BinaryOperator::CreateXor(
      Builder.CreateAnd(Builder.CreateNot(B), C), A);

// (A & B) ^ (A ^ B) -> (A | B)
if (match(Op0, m_And(m_Value(A), m_Value(B))) &&
    match(Op1, m_c_Xor(m_Specific(A), m_Specific(B))))
  return BinaryOperator::CreateOr(A, B);

// (A ^ B) ^ (A & B) -> (A | B)
if (match(Op0, m_Xor(m_Value(A), m_Value(B))) &&
    match(Op1, m_c_And(m_Specific(A), m_Specific(B))))
  return BinaryOperator::CreateOr(A, B);
...
Hence, we must be careful not to trigger these simplifications (cf. Implementation).
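For example, the last two patterns above fold expressions like (A & B) ^ (A ^ B) straight back to A | B, which is exactly the kind of shape a careless single-round MBA can produce. A minimal Python check of the underlying identities (not tied to the LLVM APIs):

```python
# (A & B) ^ (A ^ B)  -->  A | B   (checked exhaustively over 8-bit values)
assert all(((a & b) ^ (a ^ b)) == (a | b)
           for a in range(256) for b in range(256))

# (A ^ B) ^ (A | C)  -->  (~A & C) ^ B   (checked over 4-bit values)
M = 0xF
assert all(((a ^ b) ^ (a | c)) == (((~a & M) & c) ^ b)
           for a in range(16) for b in range(16) for c in range(16))
```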
Second, the MBAs currently used in this pass are:
- fixed and statically encoded;
- simplified away by the LLVM optimization pipeline if the pass is misconfigured;
- simplifiable with existing public tools once extracted.
Regarding the third point, msynth or Triton could be used to check whether the expressions can be simplified.
Nevertheless, automatically identifying and extracting MBAs from a compiled binary is not trivial, so we can consider that this pass still imposes a non-negligible overhead on reverse engineers.
For these reasons, and in its current form, this pass should be considered a cosmetic protection rather than an advanced one.
References
Publications
- Efficient Deobfuscation of Linear Mixed Boolean-Arithmetic Expressions, by Benjamin Reichenwallner and Peter Meerwald-Stadler
- Improving MBA Deobfuscation using Equality Saturation, by Matteo Favaro and Tim Blazytko
- QSynth - A Program Synthesis based Approach for Binary Code Deobfuscation, by Robin David, Luigi Coniglio, and Mariano Ceccato

Tools
- msynth, by Tim Blazytko
- qsynthesis, by Robin David

Talks
- Advanced Code Obfuscation With MBA Expressions, by Arnau Gàmez Montolio

Attacks
- Oracle Synthesis Meets Equality Saturation: PoC associated with the blog post 'Improving MBA Deobfuscation using Equality Saturation'.
¹ The compiler might also optimize the operation into instructions that are equivalent but less meaningful from a reverse engineering perspective. For instance, x % 8 is usually transformed into x & 7. Hacker’s Delight is a good reference for this kind of transformation. ↩︎