Decompose
Overview
Decompose takes a git repository or local folder and runs an eight-phase AI analysis, producing a complete language-agnostic specification of the codebase plus a searchable composition inventory.
The output is a set of Markdown files saved to Output/Decomposition/<project>/. Nothing leaves your machine beyond the API calls to your chosen AI backend.
Running a Decomposition
Web app
- Navigate to the Decompose screen (home page).
- Enter a git URL (e.g. https://github.com/org/repo) or an absolute local path.
- The project name is auto-detected from the URL or folder name — edit it if needed.
- Choose an AI backend and click Run.
The Job Detail screen opens immediately and shows:
- A horizontal phase stepper (phases 0–7) updating live as each phase completes
- A streaming log panel with real-time AI output
- Per-phase output previews, expandable to full text
- A Re-run from phase N button on each row
CLI
```shell
# Decompose a remote repository (uses default backend from otx settings)
otx decompose https://github.com/org/repo

# Decompose a local folder with an explicit project name
otx decompose ./my-project --project my-project

# Run only phases 0–2
otx decompose https://github.com/org/repo --end-phase 2

# Resume from phase 3 after a failure
otx decompose https://github.com/org/repo --project my-project --start-phase 3
```
See CLI: decompose for the full option reference.
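For scripting, the documented flags compose straightforwardly. A minimal sketch of assembling an invocation from Python — it uses only the flags shown above (`--project`, `--start-phase`, `--end-phase`); nothing else about the CLI is assumed:

```python
# Build an `otx decompose` command line from the documented flags.
# A sketch for scripting; pass the result to subprocess.run(cmd) in practice.

def build_decompose_cmd(source, project=None, start_phase=None, end_phase=None):
    cmd = ["otx", "decompose", source]
    if project is not None:
        cmd += ["--project", project]
    if start_phase is not None:
        cmd += ["--start-phase", str(start_phase)]  # resume after a failure
    if end_phase is not None:
        cmd += ["--end-phase", str(end_phase)]      # stop early, e.g. phases 0-2
    return cmd

# Resume my-project from phase 3, matching the example above:
cmd = build_decompose_cmd("https://github.com/org/repo",
                          project="my-project", start_phase=3)
```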
The Eight Phases
Phase 0 — Index & Architecture Overview
Output: 00-index.md | Model weight: regular |
Produces a navigational index and a concise architectural summary: what the project is, what problem it solves, how the major subsystems fit together at runtime, and 4–8 key design decisions a re-implementer must understand.
Phase 1 — Structural Survey
Output: 01-structural-survey.md | Model weight: regular |
Exhaustive directory tree (at least two levels deep), primary entry points, all configuration files with every key and default value, external dependencies, and the build/test toolchain.
Phase 2 — Initialization & Runtime Flow
Output: 02-initialization-flow.md | Model weight: regular | Uses: phase 1 |
Every step of the startup sequence in order — which file runs, what state it reads, what state it produces. All hook and callback registration points. The shutdown sequence. Background and async tasks. Includes a Mermaid flow diagram where it aids clarity.
Phase 3 — Component Specifications
Output: 03-01-*.md … 03-NN-*.md | Model weight: thick |
An expansion phase — runs in two steps:
- Discovery: The AI identifies logically distinct subsystems and groups them into 3–8 thematic clusters.
- Spec: For each cluster, a full specification covering purpose, public interface (every function with inputs, outputs, preconditions, postconditions, side effects, and error conditions), internal state, algorithms, integration points, configuration, extension interfaces, edge cases, and composition extracts.
The number of output files varies by project.
Phase 4 — Data Formats & Protocols
Output: 04-data-formats.md | Model weight: thick |
Every file format the project reads or writes (with full grammar/schema and examples), every inter-process protocol (with sequence diagrams), and every public API surface not covered in Phase 3.
Phase 5 — Re-implementation Checklist
Output: 05-reimplementation-checklist.md | Model weight: regular | Uses: phases 0–4 |
An actionable, ordered checklist:
- Dependency inventory (stdlib equivalents in Python, Go, Rust, TypeScript, Java)
- Implementation order (leaves first, with spec cross-references)
- Acceptance criteria (3–5 testable behavioral assertions per component)
- Compatibility traps
- What to skip (historical accidents, deprecated code)
- Security checklist — every component and data flow analysed for injection, traversal, privilege escalation, unsafe defaults, and more. Includes a Malicious Intent Indicators section.
Phase 6 — Composition Inventory
Output: 06-composition-inventory.md → parsed into SQLite | Model weight: regular | Uses: phase 3 |
Synthesises all composition extracts from Phase 3 into a single flat document organised into seven categories. Every item includes:
- Full description and pseudocode
- EARS behavioral requirements
- Runnable test cases (with anti-cheat probes and mutation-detection cases)
- Boundary and adversarial cases
- Invariant assertions
- Security score (1–10) and safe re-implementation pattern
After Phase 6 completes, the inventory is automatically parsed and imported into the SQLite database.
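Once imported, the inventory is an ordinary SQLite database and can be queried directly. A sketch — the table and column names below (`items`, `name`, `category`, `security_score`) are hypothetical placeholders, since the schema is not documented here; inspect the real schema with `sqlite3 <db> ".schema"` before relying on them:

```python
import sqlite3

def low_scoring_items(db_path, threshold=5):
    """Return inventory items below a security-score threshold.

    Assumes a hypothetical `items` table with `name`, `category`, and
    `security_score` columns — verify against the actual schema first.
    """
    with sqlite3.connect(db_path) as conn:
        return conn.execute(
            "SELECT name, category, security_score FROM items "
            "WHERE security_score < ? ORDER BY security_score",
            (threshold,),
        ).fetchall()
```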
Phase 7 — Ethos & Style Fingerprint
Output: 07-ethos.md | Model weight: regular | Uses: phases 1, 2, 3 |
A standalone style guide derived entirely from reading the real codebase. Every claim cites a concrete example from the source. Covers eleven dimensions:
- Naming conventions (per identifier category, with examples)
- Error handling philosophy
- Abstraction discipline
- Logging and observability
- Concurrency and async patterns
- Configuration and magic values
- Dependency and coupling style
- Comment and documentation style
- Code organisation preferences
- Test philosophy
- Overall character (qualitative synthesis)
Where a practice diverges from widely accepted conventions, Phase 7 appends a Divergence note explaining the conventional approach and whether the deviation appears deliberate — so a Compose or Transmute run can make an informed choice about whether to replicate or correct it.
Domain Hints
If you have domain knowledge the AI would not discover from reading the code alone, pass it via --hints (CLI) or the Hints field in the web app. Hints are injected verbatim into every phase prompt.
```shell
otx decompose ./trading-engine \
  --hints "Uses event sourcing with CQRS. The Saga pattern coordinates multi-step trades. \
Idempotency keys prevent double-execution on retries."
```
Keep hints factual and concise. They are most useful for:
- Explaining domain terminology the AI might misread as generic code
- Surfacing known quirks or non-obvious constraints
- Directing attention toward specific analysis goals
Resuming a Failed Run
Phase state is saved to Output/Decomposition/<project>/job.json after every phase completes. If a job fails mid-run:
Web app: go to Job History, open the failed job, and click Re-run from phase N on the failed phase row.
CLI:
```shell
otx decompose https://github.com/org/repo --project my-project --start-phase 3
```
Outputs from completed phases are reloaded from disk automatically — you never lose work from a phase that already finished.
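The job.json schema itself is not documented here, but inspecting it to pick a `--start-phase` value is a one-liner. A sketch, assuming a hypothetical `completed_phases` list field — adjust the key names to whatever the real file contains:

```python
import json

def next_phase(job_json_path):
    """Return the phase number to pass to --start-phase.

    Assumes a hypothetical 'completed_phases' list in job.json
    (e.g. [0, 1, 2]); verify the key against a real job.json.
    """
    with open(job_json_path) as f:
        state = json.load(f)
    done = state.get("completed_phases", [])
    return max(done) + 1 if done else 0
```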
Output Files
| File | Contents |
|---|---|
| 00-index.md | Architecture overview, key design decisions |
| 01-structural-survey.md | Directory tree, entry points, config, dependencies |
| 02-initialization-flow.md | Startup sequence, hooks, shutdown, async tasks |
| 03-NN-<name>.md | Component specification (one per cluster) |
| 04-data-formats.md | File formats, protocols, public API surfaces |
| 05-reimplementation-checklist.md | Ordered checklist with acceptance criteria and security analysis |
| 06-composition-inventory.md | Flat inventory catalogue (source for DB import) |
| inventory.json | Per-project inventory export |
| 07-ethos.md | Style fingerprint and coding standards |
| job.json | Job state for resume-on-failure |
All files are relative to Output/Decomposition/<project>/ in your working directory.
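Because the Phase 3 files follow the documented `03-NN-<name>.md` pattern, the per-cluster specs can be enumerated with a simple glob. A sketch:

```python
from pathlib import Path

def component_specs(project_dir):
    """List Phase 3 component-spec files in a decomposition output folder.

    Matches the documented naming pattern 03-NN-<name>.md.
    """
    return sorted(p.name for p in Path(project_dir).glob("03-*.md"))
```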