Opaque Fields Access

As mentioned in the talk Automation Techniques in C++ Reverse Engineering by R. Rolles, reverse engineers spend a non-negligible amount of time identifying structures and their attributes.

And I completely agree!

To better understand how structures are involved in reverse engineering, let’s consider the following code which involves a JNI function:

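#include <jni.h>
#include <string>

class SecretString {
  public:
  SecretString(const char* input) : value_(input) {}
  bool check() {
    // Compare the stored value with the expected secret.
    checked_ = (value_ == "OMVLL");
    return checked_;
  }
  private:
  bool checked_ = false;
  std::string value_;
};

bool check_jni_password(JNIEnv* env, jstring passwd) {
  const char* pass = env->GetStringUTFChars(passwd, nullptr);
  SecretString secret(pass);  // copies the characters into a std::string
  env->ReleaseStringUTFChars(passwd, pass);  // release the JNI buffer
  return secret.check();
}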

When this code is compiled, env->GetStringUTFChars() is called through:

  1. An access to the GetStringUTFChars pointer in the JNIEnv structure.
  2. A call on the dereferenced pointer.

In assembly it looks like this:

ldr    x8, [x8, #1352] ; 1352 is the offset of GetStringUTFChars
blr    x8              ; in the JNIEnv structure
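The offset in this listing can be cross-checked with a little arithmetic. The sketch below assumes that GetStringUTFChars sits at index 169 of the JNI interface function table (the index listed in the JNI specification) and that pointers are 8 bytes wide on AArch64. The helper name function_table_offset is made up for this illustration.

```python
# Assumptions: 8-byte pointers (AArch64) and index 169 for GetStringUTFChars,
# as listed in the JNI specification's interface function table.
POINTER_SIZE = 8
GET_STRING_UTF_CHARS_INDEX = 169


def function_table_offset(index: int, pointer_size: int = POINTER_SIZE) -> int:
    """Byte offset of the index-th function pointer in a table of pointers."""
    return index * pointer_size


offset = function_table_offset(GET_STRING_UTF_CHARS_INDEX)
print(offset)  # prints 1352
```

Under these assumptions the result matches the immediate used by the ldr instruction, which is why a decompiler that knows the type can map the offset back to a field name.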

When decompiling the check_jni_password function, we can indeed observe this offset, and most disassemblers can resolve the structure’s attribute once the user has identified and provided its type:

Output of the decompilation when the user has reversed the types

Similarly, once we have identified and reversed the layout of the SecretString* this pointer, the SecretString::check function is a bit more meaningful:

On the other hand, when this pass is applied to the JNIEnv and SecretString structures, the decompilation output is confusing even if we manually define the types of the registers associated with JNIEnv and SecretString.

The following figures show the differences in Binary Ninja; the output of IDA is very similar:

check_jni_password() before and after the obfuscation
Section of SecretString::check() before and after the obfuscation

When to use it?

You should trigger this pass on structures that are meant to hold sensitive information. It might also be worth enabling this pass on the JNIEnv structure for JNI functions involved in sensitive computations.

How to use it?

You can trigger this pass by defining the method obfuscate_struct_access in the configuration class file:

def obfuscate_struct_access(self, _: omvll.Module, __: omvll.Function,
                            struct: omvll.Struct):
    if struct.name.endswith("JNINativeInterface"):
        return True
    if struct.name == "class.SecretString":
        return True
    return False
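The name-matching logic of this callback can be tried outside of the compiler. The following is only a sketch: FakeStruct is a hypothetical stand-in that mimics the name attribute used above, it is not part of the omvll module, and the "struct." and "class." prefixes are assumed to follow the way Clang names LLVM structure types.

```python
from dataclasses import dataclass


@dataclass
class FakeStruct:
    """Hypothetical stand-in exposing only the `name` attribute."""
    name: str


def should_obfuscate(struct) -> bool:
    # Same decision logic as the callback, without the omvll types.
    if struct.name.endswith("JNINativeInterface"):
        return True
    if struct.name == "class.SecretString":
        return True
    return False


for name in ("struct.JNINativeInterface", "class.SecretString", "class.Other"):
    print(name, should_obfuscate(FakeStruct(name)))
```

Using endswith for the JNI table and an exact match for the class keeps the selection narrow, so unrelated structures are left untouched.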

In the current version, O-MVLL expects a boolean value, but future versions should also be able to accept an option on the access type (read or write). For instance:

if struct.name == "class.SecretString":
    return omvll.StructAccessOpt(read=True, write=False)

Implementation

The pass first identifies the llvm::LoadInst and llvm::StoreInst instructions.

It then inspects the operands of these instructions to check whether they access the content of a structure or an element of a global variable. If so, it resolves the name of the structure or of the global variable and calls the user-defined callback to determine whether the access should be obfuscated.
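The selection logic described above can be modelled with a toy example. This is not O-MVLL's code and it does not use LLVM: ToyInst and select_accesses are hypothetical names, and the instruction is reduced to an opcode plus the name of the structure reached through its pointer operand (None when the operand is not a structure access).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ToyInst:
    """Hypothetical, simplified instruction; not an LLVM class."""
    opcode: str                        # e.g. "load", "store", "add", "call"
    struct_name: Optional[str] = None  # structure reached by the pointer operand
    offset: int = 0                    # byte offset of the accessed field


def select_accesses(insts: List[ToyInst],
                    callback: Callable[[str], bool]) -> List[ToyInst]:
    """Return the loads/stores on structures that the callback accepts."""
    selected = []
    for inst in insts:
        if inst.opcode not in ("load", "store"):
            continue  # first stage: keep only loads and stores
        if inst.struct_name is None:
            continue  # the operand does not access a structure
        if callback(inst.struct_name):
            selected.append(inst)  # the user asked to obfuscate this access
    return selected


program = [
    ToyInst("load", "struct.JNINativeInterface", 1352),
    ToyInst("add"),
    ToyInst("load"),
    ToyInst("store", "class.SecretString", 0),
    ToyInst("load", "class.Other", 8),
]
picked = select_accesses(program, lambda name: name != "class.Other")
print([(i.opcode, i.struct_name, i.offset) for i in picked])
```

Only the first and fourth instructions are selected here: the others are either not memory accesses, not structure accesses, or rejected by the callback.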

Upon positive feedback from the user’s callback, O-MVLL transforms the access from this:

ldr x0, [x1, #offset];

Into that:

$var := #offset + 0;
ldr x0, [x1, $var];

Without any additional layer of protection, $var := #offset + 0; can be folded by the compiler, which would result in the original instruction. To prevent this simplification, the #offset + 0 instruction is annotated [1] so that Opaque Constants and Arithmetic Obfuscation are automatically applied to it:

IRBuilder<NoFolder> IRB(&Load);

Value* opaqueOffset =
  IRB.CreateAdd(ConstantInt::get(IRB.getInt32Ty(), 0),
                ConstantInt::get(IRB.getInt32Ty(), ComputedOffset));

if (auto* OpAdd = dyn_cast<Instruction>(opaqueOffset)) {
  addMetadata(*OpAdd, {MetaObf(OPAQUE_CST), MetaObf(OPAQUE_OP, 2llu)});
}
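To give an intuition of why the annotated #offset + 0 no longer folds back to a plain constant, here is a minimal sketch of the two ideas on 32-bit values. The identities used below are generic textbook ones picked for illustration; they are assumptions and not necessarily the expressions that O-MVLL's Opaque Constants and Arithmetic Obfuscation passes emit. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit wrap-around arithmetic


def opaque_zero(x: int, y: int) -> int:
    """Always 0, because x | y == (x ^ y) + (x & y) for any integers."""
    return ((x | y) - (x ^ y) - (x & y)) & MASK32


def mba_add(a: int, b: int) -> int:
    """Addition rewritten as a mixed boolean-arithmetic expression."""
    return ((a ^ b) + 2 * (a & b)) & MASK32


def obfuscated_offset(offset: int, x: int, y: int) -> int:
    """Counterpart of `$var := #offset + 0` with both layers applied."""
    return mba_add(opaque_zero(x, y), offset)


# x and y stand for values that are only known at run time.
print(obfuscated_offset(1352, 0xDEADBEEF, 0x1234))  # prints 1352
```

The result is the same offset for any x and y, but a decompiler has to prove the identities before it can recover the constant and match it against a structure layout.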

Limitations

This pass does not resist the Dynamic Structure Reconstruction technique presented by R. Rolles in the talk mentioned in the introduction.

Nevertheless, that technique requires an AArch64 DBI (dynamic binary instrumentation) framework, which does not exist yet [2].

References


  1. See the section Annotations for the details.

  2. I personally worked on this support in Quarkslab’s QBDI but since I left the company this support is owned by Quarkslab. It might be published by Quarkslab though.