Table of Contents
- 1. Project Overview
- 2. Research Question
- 3. Background
- 4. Tool Architecture
- 5. Planned Workflow
- 6. Static Analysis Plan
- 7. Protocol Analysis Plan
- 8. Dynamic Analysis Plan
- 9. Ethical Boundaries
- 10. Expected Challenges
- 11. Lab Setup
- 12. Initial Hypothesis
1. Project Overview {#overview}
My final project is RE-Protocol-Agent, an AI-assisted reverse engineering and protocol analysis tool for Android mobile applications. The research target for this project is the Mercedes-Benz / Mercedes me mobile application, specifically the parts related to connected vehicle behavior.
The broader domain is mobile application reverse engineering and protocol analysis. Connected vehicle apps are a good target for this kind of study because they combine user accounts, authentication, mobile app logic, backend APIs, vehicle identity, VIN handling, region-specific behavior, and vehicle-linking workflows. These systems are interesting from a reverse engineering perspective because the mobile app often gives clues about how the larger system is structured.
The goal of this project is not to bypass the app, attack a backend, brute force credentials, or exploit any service. The goal is to build a local academic analysis framework that can inspect user-provided APK artifacts, process local traffic captures, correlate evidence, and produce a clean technical report.
This project was also motivated by my own experience with regional VIN restrictions in the app. I tried to connect my personal vehicle using the German version of the app, while the vehicle was linked to the Japanese market. When using the Japanese version, the connection process required support from an authorized dealer in Japan because of country-specific laws and policies. This made me interested in understanding how the system works technically, using only local APK artifacts, traffic captures, and evidence collected in a controlled academic environment.

2. Research Question {#research-question}
The main research question is:
Can an AI-assisted local agent automate the reverse engineering workflow for a connected vehicle Android app and produce a grounded report about static mobile behavior and protocol-level observations?
More specifically, the areas of investigation are:
VIN and vehicle terms → where they appear in the app
Authentication flows → strings, endpoints, OAuth patterns
Garage / ownership logic → registration and linking behavior
Region and market logic → locale, country, feature flags
Backend endpoint candidates → recoverable API paths and hosts
Evidence quality → confirmed vs inferred vs unknown
A successful result means the tool can take an APK and optional traffic captures, run the analysis pipeline, and generate a Markdown report with clear evidence. A successful result does not require finding a vulnerability; it requires producing a reliable and honest analysis.
3. Background {#background}
Android applications are distributed as APK files or app bundles. An APK can contain compiled DEX bytecode, resources, native libraries, metadata, and an AndroidManifest.xml file. The manifest is especially important because it defines the package name, permissions, activities, services, receivers, providers, exported components, deep links, and other app-level configuration.
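To illustrate the kind of information the manifest provides, the following is a minimal sketch of a manifest summary, assuming the manifest has already been decoded to plain-text XML (the copy inside an APK is binary-encoded; apktool can decode it). The function name `summarize_manifest` and the output keys are hypothetical, and the sketch only detects components that explicitly declare `android:exported="true"`.

```python
import xml.etree.ElementTree as ET

# Namespace used by android:-prefixed attributes in AndroidManifest.xml.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
COMPONENT_TAGS = ("activity", "activity-alias", "service", "receiver", "provider")


def attr(element, name):
    """Read an android:-namespaced attribute, or None if it is absent."""
    return element.get(f"{{{ANDROID_NS}}}{name}")


def summarize_manifest(manifest_xml: str) -> dict:
    """Summarize a decoded AndroidManifest.xml string (hypothetical helper)."""
    root = ET.fromstring(manifest_xml)
    permissions = [attr(p, "name") for p in root.iter("uses-permission")]
    exported, schemes = [], set()
    for tag in COMPONENT_TAGS:
        for component in root.iter(tag):
            # Only explicit exported="true" is recorded in this sketch.
            if attr(component, "exported") == "true":
                exported.append({"type": tag, "name": attr(component, "name")})
            # <data android:scheme="..."> inside intent filters hints at deep links.
            for data in component.iter("data"):
                scheme = attr(data, "scheme")
                if scheme:
                    schemes.add(scheme)
    return {
        "package": root.get("package"),
        "permissions": permissions,
        "exported_components": exported,
        "uri_schemes": sorted(schemes),
    }
```

A result of this shape could be written to manifest_summary.json. Whether the real tool uses these exact keys is an open design choice.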
Key Android reversing concepts for this project:
APK structure → DEX, resources, native libs, manifest
Package identity → package name, signing, version
Android permissions → declared capabilities
Activities and components → launchable entry points
Exported components → accessible from outside the app
Intent filters → deep links, URI schemes
String extraction → visible constants in code and resources
Endpoint extraction → hardcoded URLs, domains, API paths
Static vs dynamic → file analysis vs running the app
The protocol analysis side focuses on passive traffic artifacts: HAR files, PCAP files, or mitmproxy JSON exports. These can reveal hosts, paths, methods, status codes, DNS queries, TLS SNI values, IP addresses, ports, and request/response structure. However, the tool does not decrypt TLS, bypass certificate pinning, or interfere with authentication.
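As a rough sketch of the passive approach, the snippet below reads a HAR export and keeps only request metadata. It assumes the standard HAR layout (`log.entries[].request` and `log.entries[].response`); the function name `summarize_har` and the row keys are hypothetical. Query strings are dropped on purpose so that identifiers such as a VIN do not end up in the summary.

```python
import json
from urllib.parse import urlsplit


def summarize_har(har_text: str) -> list[dict]:
    """Extract method, host, path, and status from each HAR entry."""
    har = json.loads(har_text)
    rows = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        response = entry.get("response", {})
        url = urlsplit(request.get("url", ""))
        rows.append({
            "method": request.get("method"),
            "host": url.hostname,
            "path": url.path,  # query string intentionally omitted
            "status": response.get("status"),
        })
    return rows
```

The parser never sends a request. It only reads a file that the analyst already captured.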
4. Tool Architecture {#architecture}
RE-Protocol-Agent is designed as a modular pipeline. A central Controller coordinates each module and writes outputs into a case folder. The AI layer runs after the deterministic analysis and does not invent findings; it only summarizes locally generated artifacts.
Controller
│
├── Intake Module → metadata.json
├── Decompiler Module → extracted source files
├── Manifest Analyzer → manifest_summary.json
├── Code Search → static_findings.json
├── String Extractor → strings_interesting.json
├── Endpoint Extractor → endpoints.json
├── Protocol Parsers → protocol_summary.json / pcap_summary.json
├── Correlator → correlated_findings.json
├── AI Reasoner → ai_summary.md
└── Reporter → final_report.md
The key design principle is evidence first, AI second. The deterministic modules generate structured artifacts. The AI reasoner reads those artifacts and produces a summary. If evidence is missing, it reports that it is missing.
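The "evidence first, AI second" principle can be sketched as a loader that gathers the deterministic artifacts before any summary is requested and records which ones are absent. This is an illustration under assumptions: `collect_evidence` and `missing_notes` are hypothetical names, and the artifact list simply mirrors the JSON outputs named in the architecture diagram.

```python
import json
from pathlib import Path

# JSON artifacts produced by the deterministic modules (names from the diagram above).
EXPECTED_ARTIFACTS = [
    "metadata.json",
    "manifest_summary.json",
    "static_findings.json",
    "strings_interesting.json",
    "endpoints.json",
    "protocol_summary.json",
    "pcap_summary.json",
    "correlated_findings.json",
]


def collect_evidence(case_dir: Path) -> dict:
    """Load every artifact that exists and list the ones that do not."""
    evidence, missing = {}, []
    for name in EXPECTED_ARTIFACTS:
        path = case_dir / name
        if path.is_file():
            evidence[name] = json.loads(path.read_text(encoding="utf-8"))
        else:
            missing.append(name)
    return {"evidence": evidence, "missing": missing}


def missing_notes(missing: list[str]) -> list[str]:
    """Turn missing artifacts into explicit report lines instead of silence."""
    return [f"No evidence available: {name} was not generated." for name in missing]
```

Only the loaded artifacts and the explicit notes about missing ones would be passed to the AI reasoner, which keeps the summary tied to what the pipeline actually produced.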
5. Planned Workflow {#workflow}
The workflow begins with APK intake and ends with a report:
APK Artifact
→ Hash and Metadata
→ Manifest Analysis
→ Decompilation / Existing Extracted Folder
→ Code and Resource Search
→ String Extraction
→ Endpoint Extraction
→ Optional Protocol Capture Parsing
→ Correlation
→ AI Summary
→ Markdown Report
The tool creates a structured case directory so the analysis can be reproduced:
outputs/<case_name>/
├── metadata.json
├── manifest_summary.json
├── static_findings.json
├── strings_interesting.txt
├── strings_interesting.json
├── endpoints.json
├── protocol_summary.json
├── pcap_summary.json
├── correlated_findings.json
├── ai_summary.md
├── dynamic_summary.md
└── final_report.md
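The first step of the workflow, hashing the APK and creating the case directory, might look like the sketch below. The helper names (`sha256_of`, `create_case`) and the metadata keys are assumptions for illustration; only the `outputs/<case_name>/metadata.json` location comes from the layout above.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large APKs are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def create_case(apk_path: Path, case_name: str, outputs_root: Path = Path("outputs")) -> Path:
    """Create outputs/<case_name>/ and write metadata.json for the APK."""
    case_dir = outputs_root / case_name
    case_dir.mkdir(parents=True, exist_ok=True)
    metadata = {
        "file_name": apk_path.name,          # hypothetical key names
        "size_bytes": apk_path.stat().st_size,
        "sha256": sha256_of(apk_path),
    }
    (case_dir / "metadata.json").write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return case_dir
```

Recording the hash ties every later finding to one exact artifact, which is what makes a case reproducible.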

6. Static Analysis Plan {#static-plan}
The static analysis stage searches the APK and decompiled files for evidence related to the research target. The keyword categories are designed around connected vehicle behavior:
Category            Keywords
──────────────────────────────────────────────────────────────────────────────────────
Vehicle identity    VIN, vehicle, garage, registration, pairing, activation, ownership
Authentication      OAuth, token, login, auth, bearer, credential, session
Region logic        region, country, market, locale, timezone, territory
Backend / API       endpoint, API, backend, host, telematics, feature flag
Decompiled App Files
├── Keyword Search
│   ├── Vehicle and VIN findings
│   ├── Authentication findings
│   └── Region and market findings
├── String Extraction → interesting strings
└── Endpoint Extraction → endpoint candidates
        ↓
Static Findings Report
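The keyword search stage can be sketched as follows. The keyword lists are taken from the category table above; the function names, the finding keys, and the file suffix list are assumptions. Matching here is a naive case-insensitive substring search, so false positives (for example "vin" inside an unrelated word) are expected and would need filtering in a real implementation.

```python
import re
from pathlib import Path

KEYWORDS = {
    "vehicle_identity": ["VIN", "vehicle", "garage", "registration",
                         "pairing", "activation", "ownership"],
    "authentication": ["OAuth", "token", "login", "auth", "bearer",
                       "credential", "session"],
    "region_logic": ["region", "country", "market", "locale",
                     "timezone", "territory"],
    "backend_api": ["endpoint", "API", "backend", "host",
                    "telematics", "feature flag"],
}


def build_patterns(keywords: dict) -> dict:
    """Compile one case-insensitive alternation pattern per category."""
    return {
        category: re.compile("|".join(re.escape(word) for word in words), re.IGNORECASE)
        for category, words in keywords.items()
    }


def search_lines(lines, source: str, patterns: dict) -> list[dict]:
    """Record the first keyword hit per category on each line."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for category, pattern in patterns.items():
            match = pattern.search(line)
            if match:
                findings.append({
                    "source": source,
                    "line": number,
                    "category": category,
                    "keyword": match.group(0).lower(),
                })
    return findings


def search_tree(root: Path, patterns: dict, suffixes=(".java", ".kt", ".xml", ".smali")) -> list[dict]:
    """Walk a decompiled folder and search every file with a known suffix."""
    findings = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            findings.extend(search_lines(text.splitlines(), str(path), patterns))
    return findings
```

Keeping the source path and line number with each hit lets the report cite exactly where a finding came from.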
The tool redacts obvious sensitive values such as bearer tokens, API keys, JWT-looking values, cookies, and long random secrets. The goal is analysis and reporting, not credential collection.
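A minimal redaction pass could look like the sketch below. The regular expressions are illustrative assumptions, not the tool's actual rules: they cover bearer tokens, JWT-shaped strings, cookie headers, and unbroken runs of 40 or more token-like characters, and they will miss secrets in other shapes.

```python
import re

# Rules are applied in order; each match is replaced by a labeled placeholder.
REDACTION_RULES = [
    ("bearer_token", re.compile(r"\bBearer\s+[A-Za-z0-9._~+/=-]+", re.IGNORECASE)),
    ("jwt", re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]*")),
    ("cookie", re.compile(r"\b(?:Set-)?Cookie:[^\r\n]+", re.IGNORECASE)),
    ("long_secret", re.compile(r"[A-Za-z0-9_\-]{40,}")),
]


def redact(text: str) -> str:
    """Replace sensitive-looking values with [REDACTED:<label>] placeholders."""
    for label, pattern in REDACTION_RULES:
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text
```

Running the redaction before anything is written to disk, rather than at report time, would keep raw secrets out of the case folder entirely.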
7. Protocol Analysis Plan {#protocol-plan}
The protocol side is intentionally passive. The tool parses files the analyst provides or files generated during an authorized local capture. It does not attack services or generate unauthorized requests.
Local Traffic Artifact
│
├── HAR Parser → HTTP methods, hosts, paths, status codes
├── PCAP Parser → DNS, TLS SNI, IPs, ports
└── mitmproxy JSON → request and response metadata
↓
Protocol Summary
↓
Static + Protocol Correlation
For PCAP parsing, the tool uses tshark if installed. It extracts visible network metadata but does not decrypt traffic.
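A sketch of the tshark step is shown below, assuming tshark's field output mode (`-T fields` with `-e <field>`), which prints tab-separated columns by default. The selected fields are standard Wireshark display fields; the function names and the summary layout are hypothetical. If tshark is not installed, the sketch returns `None` instead of failing.

```python
import shutil
import subprocess

# Metadata visible without decryption: DNS names, TLS SNI, destination IP and port.
FIELDS = [
    "dns.qry.name",
    "tls.handshake.extensions_server_name",
    "ip.dst",
    "tcp.dstport",
]


def build_tshark_command(pcap_path: str) -> list[str]:
    """Build a read-only tshark command that prints the selected fields."""
    command = ["tshark", "-r", pcap_path, "-T", "fields"]
    for field in FIELDS:
        command += ["-e", field]
    return command


def parse_tshark_output(output: str) -> dict:
    """Collect the unique values seen in each tab-separated column."""
    summary = {field: set() for field in FIELDS}
    for line in output.splitlines():
        for field, value in zip(FIELDS, line.split("\t")):
            # A column can hold several comma-separated values for one packet.
            for item in value.split(","):
                if item:
                    summary[field].add(item)
    return {field: sorted(values) for field, values in summary.items()}


def summarize_pcap(pcap_path: str):
    """Run tshark if it is available; return None when it is not installed."""
    if shutil.which("tshark") is None:
        return None
    result = subprocess.run(build_tshark_command(pcap_path),
                            capture_output=True, text=True, check=True)
    return parse_tshark_output(result.stdout)
```

Separating command construction from output parsing means the parser can be tested with canned text on machines where tshark is absent.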
8. Dynamic Analysis Plan {#dynamic-plan}
The dynamic workflow makes protocol analysis more realistic. The agent handles the environment: starting the emulator, launching the app, starting the capture, and collecting logcat output. The user performs sensitive in-app actions manually.
Sequence:
User → Start guided capture
CLI → Begin dynamic workflow
Agent → Start or connect to Android emulator
Agent → Install or launch target app
Agent → Start local PCAP and logcat capture
User → Manual login in app
User → Manual VIN or vehicle-flow check
Capture → Save runtime artifacts
Agent → Parse, correlate, and summarize
Reporter → Updated final_report.md

The dynamic stage is one of the hardest parts of the project because Android apps often depend on emulator compatibility, Google Play Services, package identity, app bundles, region settings, and device environment details.
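The guided capture sequence above can be expressed as a plan in which agent steps carry a command and user steps carry none, so login and VIN entry are never automated. This is a sketch under assumptions: the `Step` type and `plan_guided_capture` are hypothetical, the package and AVD names in the usage are placeholders, and using the emulator's `-tcpdump` option is only one possible way to produce the local PCAP. The sketch builds the plan and does not execute anything.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    actor: str                      # "agent" or "user"
    description: str
    command: tuple[str, ...] = ()   # empty for manual user steps


def plan_guided_capture(avd_name: str, package: str, pcap_path: str, apk_path: str = "") -> list[Step]:
    """Return the ordered steps of a guided capture without running them."""
    steps = [
        Step("agent", "Start emulator with packet capture",
             ("emulator", "-avd", avd_name, "-tcpdump", pcap_path)),
        Step("agent", "Wait for the emulator to come online", ("adb", "wait-for-device")),
    ]
    if apk_path:
        steps.append(Step("agent", "Install target app", ("adb", "install", "-r", apk_path)))
    steps += [
        Step("agent", "Clear old logcat buffer", ("adb", "logcat", "-c")),
        Step("agent", "Launch target app",
             ("adb", "shell", "monkey", "-p", package,
              "-c", "android.intent.category.LAUNCHER", "1")),
        Step("user", "Manual login in app"),
        Step("user", "Manual VIN or vehicle-flow check"),
        Step("agent", "Dump logcat for the session", ("adb", "logcat", "-d")),
    ]
    return steps
```

For example, `plan_guided_capture("test_avd", "com.example.app", "capture.pcap")` yields seven steps, two of which belong to the user and have no command attached.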
9. Ethical Boundaries {#ethics}
The project has strict boundaries. It is for academic, legal, local analysis only.
Allowed                           Not Allowed
──────────────────────────────    ─────────────────────────────
Analyze user-provided APKs        Credential attacks
Parse local HAR/PCAP/mitmproxy    Brute forcing
Collect local emulator logs       Authentication bypass
Extract strings and endpoints     Certificate pinning bypass
Produce grounded reports          App patching or repacking
                                  Production fuzzing
                                  Unauthorized API interaction
                                  Automated VIN submission
                                  Automated login
10. Expected Challenges {#challenges}
Challenge                                    Why it matters
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
Emulator launch reliability                  App may need Play Services or a specific device profile
APK vs app bundle support                    Split APKs require a different install approach
Separating confirmed vs inferred evidence    Evidence quality directly affects report credibility
Encrypted traffic                            TLS encrypts payloads, so passive PCAP parsing sees only metadata
Accidental sensitive value exposure          Redaction must work correctly
Grounded AI summaries                        AI must not speculate beyond available evidence
The biggest expected difficulty is dynamic protocol analysis. Static analysis can produce useful evidence, but protocol reversing requires the app to run correctly and generate authorized traffic that can be captured.
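One of the challenges listed above, separating confirmed from inferred evidence, can be made concrete with a simple grading rule. The rule below is an assumption for illustration, since the exact criteria are not yet defined: a topic is "confirmed" when both static and protocol evidence exist, "inferred" when only one source supports it, and "unknown" when neither does. The function names are hypothetical.

```python
def grade_evidence(static_hits: int, protocol_hits: int) -> str:
    """Grade one topic from the number of static and protocol findings."""
    if static_hits > 0 and protocol_hits > 0:
        return "confirmed"
    if static_hits > 0 or protocol_hits > 0:
        return "inferred"
    return "unknown"


def grade_topics(topics, static_counts: dict, protocol_counts: dict) -> dict:
    """Grade every topic; topics without any findings are reported as unknown."""
    return {
        topic: grade_evidence(static_counts.get(topic, 0), protocol_counts.get(topic, 0))
        for topic in topics
    }
```

Because every topic receives a label, the report states explicitly which questions remain open instead of leaving gaps unmentioned.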
11. Lab Setup {#lab-setup}
Component           Tool / Version
──────────────────────────────────────────
Language            Python 3.11
RE Agent            RE-Protocol-Agent CLI
Decompiler          jadx
Packaging tool      apktool
APK inspection      aapt
Android tooling     adb
Android emulator    Android Emulator (AVD)
Traffic capture     tshark / Wireshark
Case storage        Local output folders
AI key              Environment variable
The first version prioritizes a reliable CLI and Markdown report generation over a graphical interface. The goal is to make the tool usable by an average technical user while keeping the analysis reproducible.
12. Initial Hypothesis {#hypothesis}
My hypothesis is that static analysis will reveal references to vehicles, VINs, authentication, regions, and endpoints in the Mercedes me app, because connected vehicle apps typically expose these concepts at the code and resource level.
I also expect that dynamic protocol analysis will require more environment work because the app needs to run correctly in an emulator before traffic can be captured and correlated.
Part I therefore frames the project as both a reverse engineering study and a tool-building project. The main objective is to create a safe agentic workflow that can collect evidence, organize it, and explain it clearly.