Strings Encryption
The strings used within Java or Kotlin classes are a good indicator for reverse engineers. This protection statically encodes strings so that the clear strings are only present at runtime, when the class's methods need them.
How to use it?
This protection can be activated by using the -obfuscate-strings option in the dProtect configuration file:
-obfuscate-strings ...
-obfuscate-strings accepts two kinds of arguments:
# 1. List of strings
-obfuscate-strings "hello*", "world"
# 2. Class specifications
-obfuscate-strings class dprotect.tests.string.TestObfuscationSyntax {
private static java.lang.String API_KEY;
public static java.lang.String sayHello();
}
1. Class specifications
The regular usage of this option is very close to the -keep option [1]: we define the classes, methods, and fields whose strings we want to obfuscate.
To better understand the impact of this option, let’s consider the following code:
package re.obfuscator.dprotect;
public class MySensitiveClass {
private static String API_KEY = "XEYnuNOGoEQtj7cFOPmXBMvQTE8FyAWC";
boolean isAuth;
public String getApiKey() {
return String.format("TOKEN: %s", API_KEY);
}
public String toString() {
return String.format("MySensitiveClass{isAuth: %b | Token: %s}",
isAuth, API_KEY);
}
}
First, if we want to protect the API key associated with the API_KEY attribute, we can use this definition:
# Class specifications
-obfuscate-strings class re.obfuscator.dprotect.MySensitiveClass {
private static java.lang.String API_KEY;
}
With this configuration, only the string assigned to API_KEY is encoded.
If we also want to protect the string(s) in the getApiKey() method, we must add this definition:
# Class specifications
-obfuscate-strings class re.obfuscator.dprotect.MySensitiveClass {
private static java.lang.String API_KEY;
public java.lang.String getApiKey();
}
With this new definition, the format string "TOKEN: %s" used in getApiKey() is encoded as well.
Finally, we could protect all the strings of the class by using the wildcard option:
# Class specifications
-obfuscate-strings class re.obfuscator.dprotect.MySensitiveClass {
*;
}
With this configuration, every string of the class is encoded, including the format string used in toString().
The class specification syntax is documented in the official ProGuard documentation: ProGuard manual.
Now, let’s see how we can use a list of strings for the -obfuscate-strings option.
2. List of strings
In addition to a class specifier, we can feed -obfuscate-strings with a comma-separated list of strings:
-obfuscate-strings "hello*", "world"
With this option, all the strings that match one of the elements specified in the list will be protected.
Class Specifications Required
A list of strings does not protect anything on its own. Instead, the input strings are sourced from the classes selected with an -obfuscate-strings class specifier:
Pitfall
# DOES NOT PROTECT ANY STRING
-obfuscate-strings "check*", "world"
Protected
-obfuscate-strings "check*", "world"
-obfuscate-strings class dprotect.**
# Protects the strings "check", "check password", "world", ...
# present in the package 'dprotect'
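The interaction between the two kinds of arguments can be modeled with a short sketch. It is only an illustration: the class StringSelectionSketch, the helpers globToRegex and isProtected, and the reduction of a class specification such as dprotect.** to a plain package prefix are assumptions made for this example. It does not reflect dProtect's actual matcher.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical model of the selection rule; not dProtect's implementation.
class StringSelectionSketch {

    // Turn a simple wildcard pattern ('*' matches any run of characters)
    // into a regular expression; every other character is matched literally.
    static Pattern globToRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            regex.append(c == '*' ? ".*" : Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // A literal is protected only if its class is selected by a class
    // specification (modeled here as a package prefix) AND the literal
    // matches one of the listed patterns.
    static boolean isProtected(String className, String literal,
                               List<String> scopedPackages,
                               List<String> patterns) {
        boolean classSelected = scopedPackages.stream()
            .anyMatch(pkg -> className.startsWith(pkg + "."));
        if (!classSelected) {
            return false;
        }
        return patterns.stream()
            .anyMatch(p -> globToRegex(p).matcher(literal).matches());
    }

    public static void main(String[] args) {
        List<String> patterns = List.of("check*", "world");

        // Pitfall: a list of strings without any class specification.
        System.out.println(isProtected(
            "dprotect.tests.Login", "check password", List.of(), patterns));

        // Protected: the same list combined with 'class dprotect.**'.
        System.out.println(isProtected(
            "dprotect.tests.Login", "check password", List.of("dprotect"), patterns));
    }
}
```

In this model, the first call yields false and the second yields true, which mirrors the Pitfall and Protected configurations.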
When to use it?
This pass should be enabled for all sensitive classes. We also recommend protecting all the strings of the class as any clear string – even though it might not seem sensitive at first sight – could provide information to reverse engineers.
Implementation
The logic of the pass is located in the package dprotect.obfuscation.strings.
First, the CodeObfuscator filters the classes that have been flagged as string-obfuscated:
programClassPool.accept(
    new AllClassVisitor(
        new ClassVisitor() {
            public void visitAnyClass(Clazz clazz) {
                if (ApplyStringObfuscation(clazz)) {
                    // 1. Flag strings field
                    markStringsField();
                    // 2. Encode strings
                    runObfuscator();
                }
            }
        }));
The initial step, markStringsField(), marks the strings that are assigned to the class fields the user selected for protection:
# Class specifications
-obfuscate-strings class re.obfuscator.dprotect.MySensitiveClass {
private static java.lang.String API_KEY;
public java.lang.String getApiKey();
}
To identify the strings that are paired with a class attribute, we basically try to fingerprint this sequence of instructions:
Code:
1: ldc #9 // java.lang.String <to protect>
...
4: putfield #14 // Field API_KEY:Ljava/lang/String;
This identification is performed by implementing ProGuard’s InstructionVisitor and ConstantVisitor, which are used for backtracking the strings involved in the putfield/putstatic instructions:
// Pseudo-code for the logic of markStringsField()
public void visitConstantInstruction(...) {
    if (opcode == Instruction.OP_LDC) {
        // Keep a reference to the string currently visited
        this.stringConstant = ...;
    }
    else if (opcode == Instruction.OP_PUTFIELD ||
             opcode == Instruction.OP_PUTSTATIC) {
        // The last string seen is stored into a field:
        // mark it if the user selected that field
        if (IsMarked(field) && this.stringConstant != null) {
            mark(this.stringConstant);
        }
    }
}
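The backtracking idea can be tried outside ProGuard on a toy instruction list. The sketch below is a simplified model under stated assumptions: the Insn type, its string opcodes, and the markStrings helper are hypothetical stand-ins for ProGuard's instruction objects and visitors, and only the "last string seen before a field store" heuristic from the pseudo-code is reproduced.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model of the backtracking; not the real dProtect pass.
class MarkStringsFieldSketch {

    // Hypothetical instruction: an opcode name plus one operand
    // (the string literal for ldc, the field name for putfield/putstatic).
    static final class Insn {
        final String opcode;
        final String operand;

        Insn(String opcode, String operand) {
            this.opcode = opcode;
            this.operand = operand;
        }
    }

    // Returns the string literals stored into one of the marked fields.
    static Set<String> markStrings(List<Insn> code, Set<String> markedFields) {
        Set<String> marked = new HashSet<>();
        String lastString = null; // most recent string constant visited

        for (Insn insn : code) {
            switch (insn.opcode) {
                case "ldc":
                    lastString = insn.operand;
                    break;
                case "putfield":
                case "putstatic":
                    if (lastString != null && markedFields.contains(insn.operand)) {
                        marked.add(lastString);
                    }
                    break;
                default:
                    break; // other instructions are ignored by this model
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        List<Insn> code = Arrays.asList(
            new Insn("ldc", "XEYnuNOGoEQtj7cFOPmXBMvQTE8FyAWC"),
            new Insn("putstatic", "API_KEY"),
            new Insn("ldc", "not selected"),
            new Insn("putstatic", "BANNER"));

        Set<String> selectedFields = new HashSet<>(Arrays.asList("API_KEY"));
        System.out.println(markStrings(code, selectedFields));
    }
}
```

Only the literal stored into API_KEY ends up in the returned set, because BANNER was not selected by the user.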
Once the strings associated with fields are marked, we can process the whole class for the obfuscation:
// 1. Flag strings field
markStringsField();
// 2. Obfuscate strings
runObfuscator();
The overall logic behind runObfuscator is to:
- Inject a decoding routine in the classes for which strings must be protected.
- Replace all the strings with their encoded representation.
- Add a call to the injected decoding routine for the encoded strings.
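The effect of these three steps can be pictured with a hand-written equivalent of a protected method. This is a sketch under loud assumptions: the XOR-plus-Base64 scheme, the KEY value, and the names EncodedStringsSketch, encode, and decode are invented for illustration. The routines actually shipped in StringEncoding are not described here and may be entirely different.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hand-written illustration of "encoded constant + injected decoder".
class EncodedStringsSketch {

    // Hypothetical key used by this toy scheme only.
    private static final byte KEY = 0x5A;

    // Build-time side: what the obfuscator would compute and store
    // in place of the clear string.
    static String encode(String clear) {
        byte[] bytes = clear.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] ^= KEY;
        }
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Runtime side: plays the role of the decoding routine injected
    // into the protected class (step 1).
    static String decode(String encoded) {
        byte[] bytes = Base64.getDecoder().decode(encoded);
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] ^= KEY;
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Step 2: the former literals are replaced by their encoded form.
    // (Computed here for brevity; the obfuscator would emit constants.)
    static final String ENCODED_FORMAT = encode("TOKEN: %s");
    static final String ENCODED_KEY = encode("XEYnuNOGoEQtj7cFOPmXBMvQTE8FyAWC");

    // Step 3: every use of an encoded string goes through decode().
    static String getApiKey() {
        return String.format(decode(ENCODED_FORMAT), decode(ENCODED_KEY));
    }

    public static void main(String[] args) {
        System.out.println(ENCODED_FORMAT); // encoded form of the format string
        System.out.println(getApiKey());    // clear text exists only at runtime
    }
}
```

The clear strings only reappear inside getApiKey() at runtime, which is the property the pass aims for.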
For the first step, the idea is very similar to the O-MVLL String Encoding pass: the class dprotect.runtime.strings.StringEncoding implements a set of encoding/decoding routines that are used for the injection.
The idea of this injection is that, on the one hand, ProGuard has all the functionalities to add, create, and modify a class’s methods. Therefore, given a compiled .class file, we can copy the bytecode of a specific method into the class that will host the decoding routine.
On the other hand, the Java bytecode associated with the injected routine can also be executed by the pass itself to get the encoded string.
The injection of the decoding routine is performed by the following (pseudo) code:
// Class in which we want to inject the decoding routine
ProgramClass target = ...
ClassBuilder builder = new ClassBuilder(target);
// Create an (empty) method in the targeted class
ProgramMethod decodingRoutine = builder.addAndReturnMethod(
    AccessConstants.STATIC,
    /* Name      */ "myDecodingRoutine",
    /* Prototype */ "(Ljava/lang/String;)Ljava/lang/String;");
// Lift the bytecode into target.myDecodingRoutine
// from StringEncoding.myDecodingRoutine
MethodCopier.copy(target, decodingRoutine,
                  StringEncoding.class, "myDecodingRoutine");
MethodCopier
MethodCopier is not present in the original version of ProGuardCORE and has been added for the purpose of this pass.
Once the decoding routine is injected into the targeted class, we can address the next steps, which consist of replacing the original strings with their encoded representation.
For that purpose, we can combine the following ProGuard visitors (pseudo-code):
// AttributeVisitor.
@Override
public void visitCodeAttribute(Clazz clazz, Method method, ...) {
    // Prepare the "editors" and trigger the instructions visitor
    constantPoolEditor = new ConstantPoolEditor((ProgramClass)clazz);
    codeAttributeEditor.reset(codeAttribute.u4codeLength);
    // Trigger the InstructionVisitor that is implemented by the same class
    codeAttribute.instructionsAccept(clazz, method, this);
}
// InstructionVisitor.
@Override
public void visitConstantInstruction(Clazz clazz,
                                     Method method,
                                     CodeAttribute codeAttribute,
                                     int offset,
                                     ConstantInstruction instruction) {
    // Filter on the LDC/LDC_W opcodes which load strings
    if (instruction.opcode == Instruction.OP_LDC ||
        instruction.opcode == Instruction.OP_LDC_W) {
        // Find the decoding routine which has been injected in step 1
        Method decodingRoutine = clazz.findMethod(...);
        // Create a static-call instruction for the decoding routine
        Instruction call = new ConstantInstruction(Instruction.OP_INVOKESTATIC,
                                                   decodingRoutine);
        // Replace the string with its encoded version
        String encoded = encode(originalString);
        instruction.constantIndex = constantPoolEditor.addStringConstant(encoded);
        codeAttributeEditor.replaceInstruction(offset, instruction);
        // Add the static call to the decoding routine
        codeAttributeEditor.insertAfterInstruction(offset, call);
    }
}
In the previous snippet, String encoded = encode(originalString) actually uses Java reflection to call the encoding routine implemented in StringEncoding (whilst the decoding routine has been injected with MethodCopier in the class).
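As an illustration of the reflective call, here is a minimal sketch. ToyStringEncoding, its reversing encode/decode pair, and the callRoutine helper are hypothetical placeholders: the real StringEncoding class and the names of its routines are not shown in this page, so only the general java.lang.reflect mechanism is demonstrated.

```java
import java.lang.reflect.Method;

// Sketch of invoking an encoding routine by name through reflection.
class ReflectiveEncodeSketch {

    // Stand-in for the class that holds the encoding/decoding pairs.
    static class ToyStringEncoding {
        static String encode(String clear) {
            return new StringBuilder(clear).reverse().toString();
        }

        static String decode(String encoded) {
            return new StringBuilder(encoded).reverse().toString();
        }
    }

    // Look up a static String -> String routine by name and invoke it.
    static String callRoutine(Class<?> holder, String name, String input) {
        try {
            Method routine = holder.getDeclaredMethod(name, String.class);
            routine.setAccessible(true);
            // 'null' receiver because the routine is static
            return (String) routine.invoke(null, input);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot call routine " + name, e);
        }
    }

    public static void main(String[] args) {
        // The pass side: compute the encoded constant at build time.
        String encoded = callRoutine(ToyStringEncoding.class, "encode", "world");
        System.out.println(encoded);

        // The matching decoder is what would be copied into the target class.
        System.out.println(callRoutine(ToyStringEncoding.class, "decode", encoded));
    }
}
```

Selecting the routine by name keeps the encoder used by the pass and the decoder copied into the class paired, since both live in the same holder class.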
The full implementation is a bit more complex but the previous description provides a good overview of the process.
Limitations
Regarding the limitations, this pass might introduce a certain overhead on the size of the final application since a new method is added to every class in which strings must be protected. Nevertheless, this overhead is limited by the fact that the decoding routines are usually small and self-contained.
The decoding routine could also be hooked by an attacker to access the clear string at runtime. Nevertheless, this would require setting up hooks for every protected class, as the decoding routines are local and different for each class.
References
Attacks
Reversing dProtect - Strings Obfuscation
This blog post explains how JEB Decompiler can recover strings protected with dProtect.
[1] -obfuscate-strings relies on the same parser as the -keep option.