Decompose


Overview

Decompose takes a git repository or local folder and runs an eight-phase AI analysis, producing a complete language-agnostic specification of the codebase plus a searchable composition inventory.

The output is a set of Markdown files saved to Output/Decomposition/<project>/. Nothing leaves your machine beyond the API calls to your chosen AI backend.


Running a Decomposition

Web app

  1. Navigate to the Decompose screen (home page).
  2. Enter a git URL (e.g. https://github.com/org/repo) or an absolute local path.
  3. The project name is auto-detected from the URL or folder name — edit it if needed.
  4. Choose an AI backend and click Run.

The Job Detail screen opens immediately and shows:

  • A horizontal phase stepper (phases 0–7) updating live as each phase completes
  • A streaming log panel with real-time AI output
  • Per-phase output previews, expandable to full text
  • A Re-run from phase N button on each row

CLI

# Decompose a remote repository (uses default backend from otx settings)
otx decompose https://github.com/org/repo

# Decompose a local folder with an explicit project name
otx decompose ./my-project --project my-project

# Run only phases 0–2
otx decompose https://github.com/org/repo --end-phase 2

# Resume from phase 3 after a failure
otx decompose https://github.com/org/repo --project my-project --start-phase 3

See CLI: decompose for the full option reference.
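The --start-phase and --end-phase flags bound which of the eight phases (0–7) execute. A minimal sketch of that range logic, for illustration only (the real otx implementation is not shown in this document):

```python
def select_phases(start: int = 0, end: int = 7) -> list[int]:
    """Return the phase numbers to run, mirroring --start-phase/--end-phase.

    Illustrative sketch: phase numbers 0-7 come from the docs; the
    function name and validation are hypothetical.
    """
    if not 0 <= start <= end <= 7:
        raise ValueError(f"invalid phase range {start}-{end}")
    return list(range(start, end + 1))

print(select_phases())         # full run: [0, 1, 2, 3, 4, 5, 6, 7]
print(select_phases(end=2))    # --end-phase 2: [0, 1, 2]
print(select_phases(start=3))  # --start-phase 3: [3, 4, 5, 6, 7]
```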


The Eight Phases

Phase 0 — Index & Architecture Overview

Output: 00-index.md
Model weight: regular

Produces a navigational index and a concise architectural summary: what the project is, what problem it solves, how the major subsystems fit together at runtime, and 4–8 key design decisions a re-implementer must understand.

Phase 1 — Structural Survey

Output: 01-structural-survey.md
Model weight: regular

Exhaustive directory tree (at least two levels deep), primary entry points, all configuration files with every key and default value, external dependencies, and the build/test toolchain.

Phase 2 — Initialization & Runtime Flow

Output: 02-initialization-flow.md
Model weight: regular
Uses: phase 1

Every step of the startup sequence in order — which file runs, what state it reads, what state it produces. All hook and callback registration points. The shutdown sequence. Background and async tasks. Includes a Mermaid flow diagram where it aids clarity.

Phase 3 — Component Specifications

Output: 03-01-*.md through 03-N-*.md
Model weight: thick

An expansion phase — runs in two steps:

  1. Discovery: The AI identifies logically distinct subsystems and groups them into 3–8 thematic clusters.
  2. Spec: For each cluster, a full specification covering purpose, public interface (every function with inputs, outputs, preconditions, postconditions, side effects, and error conditions), internal state, algorithms, integration points, configuration, extension interfaces, edge cases, and composition extracts.

The number of output files varies by project.
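Because the discovery step names the clusters, the file names are derived rather than fixed. A sketch of how cluster names might map to the 03-NN-&lt;name&gt;.md pattern (the slug rules here are assumptions, not the documented otx behavior):

```python
import re

def cluster_filenames(clusters: list[str]) -> list[str]:
    """Map discovered cluster names to Phase 3 output files (03-NN-<name>.md).

    Hypothetical slugging: lowercase, non-alphanumeric runs become hyphens.
    """
    names = []
    for i, cluster in enumerate(clusters, start=1):
        slug = re.sub(r"[^a-z0-9]+", "-", cluster.lower()).strip("-")
        names.append(f"03-{i:02d}-{slug}.md")
    return names

print(cluster_filenames(["Core Engine", "Storage & Persistence", "HTTP API"]))
# ['03-01-core-engine.md', '03-02-storage-persistence.md', '03-03-http-api.md']
```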

Phase 4 — Data Formats & Protocols

Output: 04-data-formats.md
Model weight: thick

Every file format the project reads or writes (with full grammar/schema and examples), every inter-process protocol (with sequence diagrams), and every public API surface not covered in Phase 3.

Phase 5 — Re-implementation Checklist

Output: 05-reimplementation-checklist.md
Model weight: regular
Uses: phases 0–4

An actionable, ordered checklist:

  • Dependency inventory (stdlib equivalents in Python, Go, Rust, TypeScript, Java)
  • Implementation order (leaves first, with spec cross-references)
  • Acceptance criteria (3–5 testable behavioral assertions per component)
  • Compatibility traps
  • What to skip (historical accidents, deprecated code)
  • Security checklist — every component and data flow analysed for injection, traversal, privilege escalation, unsafe defaults, and more. Includes a Malicious Intent Indicators section.

Phase 6 — Composition Inventory

Output: 06-composition-inventory.md → parsed into SQLite
Model weight: regular
Uses: phase 3

Synthesises all composition extracts from Phase 3 into a single flat document organised into seven categories. Every item includes:

  • Full description and pseudocode
  • EARS behavioral requirements
  • Runnable test cases (with anti-cheat probes and mutation-detection cases)
  • Boundary and adversarial cases
  • Invariant assertions
  • Security score (1–10) and safe re-implementation pattern

After Phase 6 completes, the inventory is automatically parsed and imported into the SQLite database.
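A database-backed inventory makes the catalogue queryable, e.g. by category or security score. The schema below is purely illustrative (the real otx table layout is not documented here):

```python
import sqlite3

# Hypothetical schema for the imported composition inventory.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE inventory (
        project        TEXT NOT NULL,
        category       TEXT NOT NULL,
        name           TEXT NOT NULL,
        security_score INTEGER CHECK (security_score BETWEEN 1 AND 10)
    )
""")
items = [
    ("my-project", "algorithms", "token-bucket rate limiter", 7),
    ("my-project", "data-structures", "LRU cache", 9),
]
conn.executemany("INSERT INTO inventory VALUES (?, ?, ?, ?)", items)

# Example query: high-scoring items for one project.
rows = conn.execute(
    "SELECT name FROM inventory WHERE project = ? AND security_score >= 8",
    ("my-project",),
).fetchall()
print(rows)  # [('LRU cache',)]
```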

Phase 7 — Ethos & Style Fingerprint

Output: 07-ethos.md
Model weight: regular
Uses: phases 1, 2, 3

A standalone style guide derived entirely from reading the real codebase. Every claim cites a concrete example from the source. Covers eleven dimensions:

  1. Naming conventions (per identifier category, with examples)
  2. Error handling philosophy
  3. Abstraction discipline
  4. Logging and observability
  5. Concurrency and async patterns
  6. Configuration and magic values
  7. Dependency and coupling style
  8. Comment and documentation style
  9. Code organisation preferences
  10. Test philosophy
  11. Overall character (qualitative synthesis)

Where a practice diverges from widely accepted conventions, Phase 7 appends a Divergence note explaining the conventional approach and whether the deviation appears deliberate — so a Compose or Transmute run can make an informed choice about whether to replicate or correct it.


Domain Hints

If you have domain knowledge the AI would not discover from reading the code alone, pass it via --hints (CLI) or the Hints field in the web app. Hints are injected verbatim into every phase prompt.

otx decompose ./trading-engine \
  --hints "Uses event sourcing with CQRS. The Saga pattern coordinates multi-step trades. \
           Idempotency keys prevent double-execution on retries."

Keep hints factual and concise. They are most useful for:

  • Explaining domain terminology the AI might misread as generic code
  • Surfacing known quirks or non-obvious constraints
  • Directing attention toward specific analysis goals

Resuming a Failed Run

Phase state is saved to Output/Decomposition/<project>/job.json after every phase completes. If a job fails mid-run:

Web app: go to Job History, open the failed job, and click Re-run from phase N on the failed phase row.

CLI:

otx decompose https://github.com/org/repo --project my-project --start-phase 3

Outputs from completed phases are reloaded from disk automatically — you never lose work from a phase that already finished.
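The resume mechanism can be pictured as reading the saved job state and picking the first phase not yet completed. The field names below are illustrative, not the actual job.json schema:

```python
import json
import os
import tempfile

def next_phase(job_path: str) -> int:
    """Return the phase to resume from, given a saved job state.

    Assumes a hypothetical "completed_phases" field; the real job.json
    schema is not documented here.
    """
    with open(job_path) as f:
        job = json.load(f)
    completed = job.get("completed_phases", [])
    return max(completed) + 1 if completed else 0

# Simulate a run that finished phases 0-2 before failing.
state = {"project": "my-project", "completed_phases": [0, 1, 2]}
path = os.path.join(tempfile.mkdtemp(), "job.json")
with open(path, "w") as f:
    json.dump(state, f)

print(next_phase(path))  # 3 — matches --start-phase 3 in the example above
```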


Output Files

File                              Contents
00-index.md                       Architecture overview, key design decisions
01-structural-survey.md           Directory tree, entry points, config, dependencies
02-initialization-flow.md         Startup sequence, hooks, shutdown, async tasks
03-NN-<name>.md                   Component specification (one per cluster)
04-data-formats.md                File formats, protocols, public API surfaces
05-reimplementation-checklist.md  Ordered checklist with acceptance criteria and security analysis
06-composition-inventory.md       Flat inventory catalogue (source for DB import)
07-ethos.md                       Style fingerprint and coding standards
inventory.json                    Per-project inventory export
job.json                          Job state for resume-on-failure

All files are written under Output/Decomposition/<project>/ in your working directory.


OpenTransmute — MIT Licence
