HTB - Pedometer (Mobile Reverse Engineering)

🧠 Analysis Summary

The application implements a custom bytecode interpreter. This interpreter loads and executes critical application logic from an obfuscated asset file. The challenge involves reverse engineering this custom bytecode and its interpreter to understand the application’s true functionality.

The decompiled code shown in this writeup has been modified to improve readability. The original code may use different variable names or structures.

🔍 Vulnerability/Key Concepts

Custom Bytecode Interpreter Obfuscation

The Pedometer application uses a custom bytecode interpreter to execute its core logic. Analysis of the MainActivity and u1.c (C0976c) classes reveals that the application reads an unknown bytecode format from an asset file named a. This design acts as a form of obfuscation, making static analysis challenging as the actual execution flow is defined by this custom bytecode, rather than directly visible Java or Kotlin code. The u1.c class is responsible for reading the bytecode from the asset file, managing a stack, and providing a method (m2264a()) to pop integer values.

// From u1.c (C0976c) constructor
public C0976c(MainActivity mainActivity) {
    AbstractC1073e.m2493y(mainActivity, "main");
    this.mainAct = mainActivity;
    InputStream open = mainActivity.getAssets().open("a"); // Reading bytecode from asset 'a'
    AbstractC1073e.m2491x(open, "main.assets.open(\"a\")");
    this.f4060b = open; // This is the bytecodeInput stream
    this.cStack = new Stack(); // This is the vm.stack
}

// From u1.c (C0976c) method to pop values
public final int m2264a() { 
    Stack stack = this.cStack;
    Integer num = (Integer) stack.peek();
    stack.pop();
    AbstractC1073e.m2491x(num, "value");
    return num.intValue();
}
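To make the role of this class concrete, here is a minimal Python sketch of the same idea: an object that holds the bytecode stream and an integer stack, with a pop helper that mirrors m2264a(). The names VmState and pop_int are hypothetical and chosen for illustration; they do not appear in the app.

```python
import io
from dataclasses import dataclass, field


@dataclass
class VmState:
    """Hypothetical stand-in for C0976c: a bytecode stream plus an integer stack."""
    stream: io.BytesIO
    stack: list = field(default_factory=list)

    def pop_int(self) -> int:
        # Mirrors m2264a(): peek at the top value, remove it, return it as an int.
        value = self.stack[-1]
        self.stack.pop()
        return int(value)


# Example bytes only; the real stream comes from the asset file 'a'.
vm = VmState(io.BytesIO(b"\x01\x2a"))
vm.stack.extend([7, 9])
print(vm.pop_int())  # 9
print(vm.stack)      # [7]
```

The peek followed by pop in the Java code is equivalent to a single pop; the sketch keeps both steps only to stay close to the decompiled method.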

Dynamic Bytecode Execution via SensorEventListener

The MainActivity sets up sensor event handling. It calls the m997n() method, which gets a SensorManager and registers a listener named C0974a. This listener contains the main logic. Its onSensorChanged method works like a virtual machine.

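Each time the sensor reports a change, the method reads one byte from the asset file a, XORs it with the current key (which the bytecode itself can change), and looks the result up in the EnumC0975b enum to obtain the instruction. The interpreter supports stack operations (push, pop), arithmetic and comparisons, an XOR instruction that updates the key, conditional and unconditional jumps (IF, JMP), and device-state checks such as charging, airplane mode, and internet connectivity.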

// Deobfuscated SensorEventListener (p070u1.a.onSensorChanged) logic
@Override
public void onSensorChanged(SensorEvent event) {
    long now = Calendar.getInstance().getTimeInMillis();
    if (now - lastEventTime <= 300) return; // Rate limit sensor events

    lastEventTime = now;
    // ... (step counting logic omitted for brevity) ...

    if (vm != null) { // vm refers to the C0976c instance, holding bytecode input stream and stack
        InputStream in = vm.bytecodeInput; // Input stream from 'a' file
        if (in.available() == 0) return;

        int encryptedOpcode = in.read(); // Read the next encrypted byte
        int opcode = encryptedOpcode ^ vm.key; // Decrypt opcode using current XOR key

        EnumC0975b instruction = EnumC0975b.fromOpcode(opcode); // Map to a known instruction
        Stack<Integer> stack = vm.stack;
        MainActivity act = vm.mainActivity;

        switch (instruction) { // Execute instruction based on the decoded opcode
            case STOP: // opcode 0
                in.skip(in.available()); // Skip remaining bytes, effectively stopping execution
                break;
            case PUSH: // opcode 1
                int val = in.read() ^ vm.key; // Read next byte, decrypt it, and push to stack
                stack.push(val);
                break;
            case POP: // opcode 2
                stack.pop();
                break;
            case ADD: // ordinal 3 (opcode value 16)
                stack.push(stack.pop() + stack.pop());
                break;
            case XOR: // ordinal 12 (opcode value 49)
                vm.key = stack.pop() ^ stack.pop(); // Pop two, XOR them, and update the global XOR key (vm.key)
                break;
            case IF: // ordinal 13, opcode value 64 (conditional jump)
                if (stack.pop() == 1) { // If condition (top of stack) is true
                    int offset = vm.nextByte(); // Read next byte (which is the jump target/offset)
                    vm.key = offset; // Update XOR key with the offset
                } else {
                    vm.nextByte(); // If condition is false, consume the offset byte but don't jump
                }
                break;
            case JMP: // ordinal 14, opcode value 65 (unconditional jump)
                int offset = vm.nextByte(); // Read next byte (which is the jump target/offset)
                vm.key = offset; // Update XOR key with the offset
                break;
            case CHRG: // ordinal 15 (opcode value 240)
                boolean isCharging = act.getBatteryStatus();
                stack.push(isCharging ? 1 : 0);
                break;
            case FLAG: // ordinal 20 (opcode value 255)
                char[] flagChars = new char[22]; // Allocate for 22 characters
                for (int i = 0; i < 22; i++) {
                    flagChars[i] = (char) vm.nextByte(); // Read 22 bytes, decrypt, and convert to char
                }
                String flag = new String(flagChars); // Create the flag string
                TextView flagView = act.findViewById(R.id.flagText);
                flagView.setText(flag); // Display the flag
                break;
            // ... (other cases for SUB, MUL, DIV, MOD, EQ, LT, GT, NOT, AIRPLN, INTRNT, ENC, DEC) ...
        }
    } else {
        throw new IllegalStateException("stepReader not initialized");
    }
}
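The fetch-and-decode step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the opcode values are a subset of those in the Frida dump shown later in this writeup, and the function name decode and the sample key 0x5A are hypothetical.

```python
# Subset of the opcode table; values come from the Frida dump of EnumC0975b.
OPCODES = {
    0x00: "STOP", 0x01: "PUSH", 0x02: "POP", 0x10: "ADD",
    0x31: "XOR", 0x40: "IF", 0x41: "JMP", 0xF0: "CHRG", 0xFF: "FLAG",
}


def decode(raw_byte: int, key: int) -> str:
    """XOR one encrypted byte with the current key and look up the instruction."""
    opcode = (raw_byte ^ key) & 0xFF
    return OPCODES.get(opcode, f"UNKNOWN(0x{opcode:02x})")


print(decode(0x01, 0x00))  # PUSH
print(decode(0x5B, 0x5A))  # PUSH (0x5B ^ 0x5A == 0x01)
print(decode(0x5B, 0x00))  # UNKNOWN(0x5b)
```

The last two calls show why the key matters: the same raw byte decodes to a valid instruction under one key and to garbage under another, so a simulator has to track every key update to stay in sync with the app.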

⚙️ Exploitation/Methodology

Solving this challenge takes a step-by-step reverse engineering process: first extract the obfuscated bytecode, then simulate the custom virtual machine’s execution.

Step 1: assets/a File Extraction

First, we need to extract the a file from the Android application. This file contains the custom bytecode that the interpreter executes. We can use tools like apktool or jadx to decompile the APK and access its assets.

# Option 1: decode the APK with apktool
apktool d pedometer.apk -o pedometer_decompiled
# The 'a' file is in the decoded assets directory
ls pedometer_decompiled/assets
# Option 2: decompile with jadx instead
jadx -d pedometer_jadx pedometer.apk
# jadx places non-code files under resources/, so 'a' is in resources/assets
ls pedometer_jadx/resources/assets

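# Default hexdump output prints little-endian 16-bit words, so each pair of bytes appears swapped
hexdump pedometer_decompiled/assets/a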
0000000 0101 0001 0101 0101 0101 0101 0101 0101
0000010 20f0 0040 20f1 0040 20f2 0040 20f0 0040
0000020 2a01 2bf3 2b60 1b1d 567c 227c 754c 7538
0000030 4508 3131 4131 7101 7189 41fa 0972 5672
0000040 5e42 5e96 6ede 4d49 2849 6479 64dd 54a9
0000050 7575 2a75 5e45 5eca 6eb9 c872 8372 4a42
0000060 4ad5 7aa7 1173 2573 3543 3591 05fc 0d6c
0000070 526c 5e5c 5e61 6e39 7259 0959 7a69 7a53
0000080 4a11 5943 0d43 5573 55bd 65f5 ffbc 0000
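Because hexdump's default format groups bytes into 16-bit words and prints them in host byte order, the dump above shows each pair of bytes swapped relative to the file. The sketch below converts such a dump back to file order; it assumes a little-endian host and the default hexdump format, and the function name hexdump_words_to_bytes is hypothetical.

```python
def hexdump_words_to_bytes(dump: str) -> bytes:
    """Convert default `hexdump` output (little-endian 16-bit words) to raw bytes."""
    out = bytearray()
    for line in dump.strip().splitlines():
        fields = line.split()
        # fields[0] is the offset column; the rest are 16-bit words.
        for word in fields[1:]:
            out += int(word, 16).to_bytes(2, "little")
    return bytes(out)


# First two lines of the dump above.
sample = (
    "0000000 0101 0001 0101 0101 0101 0101 0101 0101\n"
    "0000010 20f0 0040 20f1 0040 20f2 0040 20f0 0040"
)
data = hexdump_words_to_bytes(sample)
print(data[:4].hex())     # 01010100
print(data[16:20].hex())  # f0204000
```

If the file has an odd length, hexdump pads the final word with a zero byte, so the converted output may carry one extra trailing 0x00. Reading the file directly, or using hexdump -C or xxd, avoids the swap altogether.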

Step 2: Bytecode Interpreter Simulation

We can write a solver that emulates the custom bytecode interpreter used by the Android app; an LLM can help with drafting it. The solver follows the logic found in the original u1.c class and the onSensorChanged method. It lets us run the bytecode outside the app and extract the hidden string when the FLAG instruction is reached.

But first, we need the opcode value of each constant in the EnumC0975b enum. Since the code looks up the opcode through the enum's .values() method, we can use Frida to dump the enum constants and their corresponding opcode values.

Java.perform(function() {
    let EnumC0975b = Java.use("u1.b");
    let enumValues = EnumC0975b.values();
    console.log("EnumC0975b values", enumValues);

    for (let i = 0; i < enumValues.length; i++) {
        const e = enumValues[i];
        console.log(
            `Ordinal: ${e.ordinal()}, Name: ${e.name()}, f4058a: ${e.a.value}`
        );
    }


});
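The script prints one line per enum constant in the form "Ordinal: 3, Name: ADD, f4058a: 16". As a small convenience, the lines can be parsed into an opcode table instead of being copied by hand. The sketch below assumes that exact line format; the names DUMP, LINE_RE, and parse_enum_dump are hypothetical, and only three of the dumped lines are included.

```python
import re

# Three lines from the Frida output, used here as sample input.
DUMP = """\
Ordinal: 0, Name: STOP, f4058a: 0
Ordinal: 3, Name: ADD, f4058a: 16
Ordinal: 20, Name: FLAG, f4058a: 255
"""

LINE_RE = re.compile(r"Ordinal: (\d+), Name: (\w+), f4058a: (\d+)")


def parse_enum_dump(text: str) -> dict:
    """Map each opcode value (the f4058a field) to its instruction name."""
    table = {}
    for _ordinal, name, value in LINE_RE.findall(text):
        table[int(value)] = name
    return table


opcodes = parse_enum_dump(DUMP)
print(opcodes)  # {0: 'STOP', 16: 'ADD', 255: 'FLAG'}
```

Keying the table by opcode value rather than by ordinal matters here, because the two differ from ADD onward (ordinal 3, value 16).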

After we get the opcode values, we can implement the bytecode solver in Python that simulates the stack-based execution.

#!/usr/bin/env python3

def solve_bytecode(hex_string):
    # Convert hex to bytes
    bytecode = bytes.fromhex(hex_string.replace(' ', '').replace('\n', ''))
    
    # Initialize state
    stack = []
    position = 0
    xor_key = 0
    
    print(f"Processing {len(bytecode)} bytes...")
    
    while position < len(bytecode):
        # Read and decode current byte
        raw_byte = bytecode[position]
        opcode = raw_byte ^ xor_key
        position += 1
        
        # Execute instruction
        '''
        From the frida dump, we have the following opcodes:
        EnumC0975b values STOP,PUSH,POP,ADD,SUB,MUL,DIV,MOD,EQ,LT,GT,NOT,XOR,IF,JMP,CHRG,AIRPLN,INTRNT,ENC,DEC,FLAG
        Ordinal: 0, Name: STOP, f4058a: 0
        Ordinal: 1, Name: PUSH, f4058a: 1
        Ordinal: 2, Name: POP, f4058a: 2
        Ordinal: 3, Name: ADD, f4058a: 16
        Ordinal: 4, Name: SUB, f4058a: 17
        Ordinal: 5, Name: MUL, f4058a: 18
        Ordinal: 6, Name: DIV, f4058a: 19
        Ordinal: 7, Name: MOD, f4058a: 20
        Ordinal: 8, Name: EQ, f4058a: 32
        Ordinal: 9, Name: LT, f4058a: 33
        Ordinal: 10, Name: GT, f4058a: 34
        Ordinal: 11, Name: NOT, f4058a: 48
        Ordinal: 12, Name: XOR, f4058a: 49
        Ordinal: 13, Name: IF, f4058a: 64
        Ordinal: 14, Name: JMP, f4058a: 65
        Ordinal: 15, Name: CHRG, f4058a: 240
        Ordinal: 16, Name: AIRPLN, f4058a: 241
        Ordinal: 17, Name: INTRNT, f4058a: 242
        Ordinal: 18, Name: ENC, f4058a: 243
        Ordinal: 19, Name: DEC, f4058a: 244
        Ordinal: 20, Name: FLAG, f4058a: 255
        '''
        if opcode == 0:  # STOP
            continue
            
        elif opcode == 1:  # PUSH
            if position >= len(bytecode):
                break
            value = bytecode[position]
            stack.append(value)
            position += 1
            
        elif opcode == 2:  # POP
            if stack:
                stack.pop()
                
        elif opcode == 16:  # ADD
            if len(stack) >= 2:
                b, a = stack.pop(), stack.pop()
                stack.append(a + b)
                
        elif opcode == 17:  # SUB
            if len(stack) >= 2:
                b, a = stack.pop(), stack.pop()
                stack.append(a - b)
                
        elif opcode == 18:  # MUL
            if len(stack) >= 2:
                b, a = stack.pop(), stack.pop()
                stack.append(a * b)
                
        elif opcode == 19:  # DIV
            if len(stack) >= 2:
                b, a = stack.pop(), stack.pop()
                stack.append(a // b if b != 0 else 0)
                
        elif opcode == 20:  # MOD
            if len(stack) >= 2:
                b, a = stack.pop(), stack.pop()
                stack.append(a % b if b != 0 else 0)
                
        elif opcode == 32:  # EQ
            if len(stack) >= 2:
                b, a = stack.pop(), stack.pop()
                stack.append(1 if a == b else 0)
                
        elif opcode == 33:  # LT
            if len(stack) >= 2:
                b, a = stack.pop(), stack.pop()
                stack.append(1 if a < b else 0)
                
        elif opcode == 34:  # GT
            if len(stack) >= 2:
                b, a = stack.pop(), stack.pop()
                stack.append(1 if a > b else 0)
                
        elif opcode == 48:  # NOT
            if stack:
                a = stack.pop()
                stack.append(1 if a == 0 else 0)
                
        elif opcode == 49:  # XOR
            if len(stack) >= 2:
                b, a = stack.pop(), stack.pop()
                result = a ^ b
                stack.append(result)
                xor_key = result & 0xFF  # Update XOR key
                
        elif opcode == 64:  # IF (conditional skip)
            if stack:
                condition = stack.pop()
                if condition == 0 and position < len(bytecode):
                    position += 1  # Skip next byte
                    
        elif opcode == 65:  # JMP (unconditional skip)
            if position < len(bytecode):
                position += 1  # Skip next byte
                
        elif opcode == 240:  # CHRG (charging status)
            stack.append(1)  # Assume charging
            
        elif opcode == 241:  # AIRPLN (airplane mode)
            stack.append(0)  # Assume airplane mode off
            
        elif opcode == 242:  # INTRNT (internet)
            stack.append(1)  # Assume connected
            
        elif opcode == 243 or opcode == 244:  # ENC/DEC
            if stack:
                xor_key = stack.pop() & 0xFF
                
        elif opcode == 255:  # FLAG (extract string)
            if len(stack) >= 21:
                # Get 21 characters from stack
                chars = []
                for _ in range(21):
                    chars.append(chr(stack.pop() & 0xFF))
                
                # Reverse since stack is LIFO
                result = ''.join(reversed(chars))
                print(f"Found target string: '{result}'")
                return result
            else:
                print(f"FLAG: Not enough stack elements ({len(stack)}/21)")
                break

    return stack

if __name__ == "__main__":
    # Bytes of assets/a in file order (named hex_dump to avoid shadowing the built-in hex())
    hex_dump = """
    0101 0100 0101 0101 0101 0101 0101 0101
    f020 4000 f120 4000 f220 4000 f020 4000
    012a f32b 602b 1d1b 7c56 7c22 4c75 3875
    0845 3131 3141 0171 8971 fa41 7209 7256
    425e 965e de6e 494d 4928 7964 dd64 a954
    7575 752a 455e ca5e b96e 72c8 7283 424a
    d54a a77a 7311 7325 4335 9135 fc05 6c0d
    6c52 5c5e 615e 396e 5972 5909 697a 537a
    114a 4359 430d 7355 bd55 f565 bcff 00
    """

    result = solve_bytecode(hex_dump)
    print(f"\nResult: {result}")

And we got the flag 😁

✅ Conclusion

This challenge demonstrates how Android apps can embed custom VMs and obfuscation techniques to hinder analysis. By dissecting the interpreter logic and simulating the bytecode execution, we successfully reverse engineered the hidden logic and extracted the flag.