Training Sources
ConCRG learns your application through four complementary sources. They run in parallel during training, and each contributes a different kind of knowledge.
Overview
| Source | Input | What It Learns | LLM Used? |
|---|---|---|---|
| Probe | Live running app | Pages, navigation, UI elements, workflows | Yes — screen analysis |
| Code | TypeScript source | Routes, data types, permissions, API contracts | Hybrid — AST + LLM |
| Docs | Documentation URL | Concepts, feature descriptions, workflows | Yes — per-page extraction |
| Chat | Conversation with PM/developer | Domain rules, business logic, edge cases | Yes — conversational |
You can run all four together or pick a subset. Each source produces RDF triples that are merged into the shared knowledge store.
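As a rough illustration of the merge step, the sketch below collapses duplicate triples coming from different sources and records which sources asserted each fact. The `Triple` shape, the `mergeTriples` helper, and the example identifiers are assumptions made for this illustration, not ConCRG's actual schema or API.

```typescript
// Hypothetical triple shape; ConCRG's real schema may differ.
type SourceName = "probe" | "code" | "docs" | "chat";

interface Triple {
  subject: string;
  predicate: string;
  object: string;
  source: SourceName;
}

interface MergedTriple {
  subject: string;
  predicate: string;
  object: string;
  sources: SourceName[]; // every source that asserted this fact
}

// Merge batches of triples, collapsing duplicates by (subject, predicate, object).
function mergeTriples(...batches: Triple[][]): MergedTriple[] {
  const merged = new Map<string, MergedTriple>();
  for (const batch of batches) {
    for (const t of batch) {
      const key = JSON.stringify([t.subject, t.predicate, t.object]);
      const existing = merged.get(key);
      if (existing) {
        if (existing.sources.indexOf(t.source) === -1) existing.sources.push(t.source);
      } else {
        merged.set(key, {
          subject: t.subject,
          predicate: t.predicate,
          object: t.object,
          sources: [t.source],
        });
      }
    }
  }
  return Array.from(merged.values());
}

// Illustrative data only.
const fromProbe: Triple[] = [
  { subject: "page:/trades", predicate: "hasElement", object: "button:Settle", source: "probe" },
];
const fromCode: Triple[] = [
  { subject: "page:/trades", predicate: "hasElement", object: "button:Settle", source: "code" },
  { subject: "route:/trades", predicate: "requiresRole", object: "trader", source: "code" },
];

const store = mergeTriples(fromProbe, fromCode);
console.log(store.length); // 2
```

Keeping the list of contributing sources on each merged triple is one possible design choice; it makes it easy to see which facts were confirmed by more than one source.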
Probe
The Probe is ConCRG's autonomous DOM explorer. It runs inside the host app's browser context — not in a separate Playwright session — so it sees the actual rendered UI with real authentication and data.
How it works:
- Starts at the app's root route
- Serializes the DOM tree (labels, roles, nav links, interactive elements)
- Captures a screenshot
- Sends the DOM snapshot + screenshot to Claude for analysis
- Extracts page understanding, elements, workflows, and triples
- Follows navigation links and repeats for each discovered route
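The loop above can be sketched as a breadth-first walk over routes. This is a minimal sketch under assumptions: `analyzePage` is a hypothetical stand-in for the serialize, screenshot, and LLM-analysis steps, and the in-memory `fakeApp` map replaces a live DOM.

```typescript
// Hypothetical result of analyzing one page (DOM snapshot + screenshot + LLM).
interface PageAnalysis {
  route: string;
  links: string[]; // navigation links discovered on the page
}

// Stand-in signature for the real analysis step.
type AnalyzePage = (route: string) => Promise<PageAnalysis>;

async function crawl(root: string, analyzePage: AnalyzePage, maxPages = 50): Promise<string[]> {
  const visited = new Set<string>();
  const queue: string[] = [root];
  const order: string[] = [];

  while (queue.length > 0 && order.length < maxPages) {
    const route = queue.shift() as string;
    if (visited.has(route)) continue;
    visited.add(route);

    const page = await analyzePage(route);
    order.push(page.route);

    // Enqueue newly discovered routes; already-visited ones are skipped.
    for (const link of page.links) {
      if (!visited.has(link)) queue.push(link);
    }
  }
  return order;
}

// Illustrative app structure: route -> links found on that page.
const fakeApp: Record<string, string[]> = {
  "/": ["/trades", "/settings"],
  "/trades": ["/trades/42", "/"],
  "/settings": [],
  "/trades/42": [],
};

const fakeAnalyze: AnalyzePage = async (route) => ({ route, links: fakeApp[route] ?? [] });

const crawlResult = crawl("/", fakeAnalyze);
crawlResult.then((order) => console.log(order));
```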
Crawler modes:
- legacy: Breadth-first queue walker
- stategraph: Models the app as a state machine; discovers transitions, modal states, and conditional UI
Configuration:
const config = {
  crawlerMode: 'stategraph',
  crawler: {
    maxNodes: 50,
    maxActionsPerNode: 10,
    maxInstancesPerRoutePattern: 3,
  },
};
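The exact semantics of these limits are not spelled out here. As one plausible reading (an assumption, not confirmed behavior), `maxInstancesPerRoutePattern` could cap how many concrete URLs that share a parameterized pattern, such as `/trades/:id`, are explored. The helpers `toRoutePattern` and `createPatternLimiter` below are hypothetical names used only to illustrate that idea.

```typescript
// Assumed normalization: purely numeric path segments become ":id".
function toRoutePattern(route: string): string {
  return route
    .split("/")
    .map((segment) => (/^\d+$/.test(segment) ? ":id" : segment))
    .join("/");
}

// Returns a predicate that admits at most `limit` routes per pattern.
function createPatternLimiter(limit: number): (route: string) => boolean {
  const seen = new Map<string, number>();
  return (route) => {
    const pattern = toRoutePattern(route);
    const count = seen.get(pattern) ?? 0;
    if (count >= limit) return false;
    seen.set(pattern, count + 1);
    return true;
  };
}

const shouldVisit = createPatternLimiter(3);
const candidates = ["/trades/1", "/trades/2", "/trades/3", "/trades/4", "/settings"];
const admitted = candidates.filter((route) => shouldVisit(route));
console.log(admitted); // [ '/trades/1', '/trades/2', '/trades/3', '/settings' ]
```

A cap like this would keep a crawler from spending its whole node budget on thousands of near-identical detail pages.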
Code
The Code source analyzes your TypeScript source files to extract structural knowledge that's hard to observe from the DOM alone — route definitions, interface shapes, permission guards, and API contracts.
What it extracts:
- Route declarations and their corresponding components
- TypeScript interfaces (data models)
- JSX structure and prop relationships
- API endpoint handlers and their contracts
- Permission/role guards on routes and components
Two modes:
- Deterministic: TypeScript AST parsing (fast, precise)
- LLM-enhanced: Claude interprets complex handlers and business logic (useAgents: true)
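To give a feel for what deterministic AST extraction can look like, the sketch below uses the public TypeScript compiler API (the `typescript` npm package, which must be installed) to list interfaces and their fields. It is an illustration only: `extractInterfaces` and `ExtractedInterface` are hypothetical names, and this is not ConCRG's actual extractor.

```typescript
import * as ts from "typescript";

interface ExtractedInterface {
  name: string;
  fields: { name: string; type: string }[];
}

// Parse one source file and collect every interface declaration in it.
function extractInterfaces(fileName: string, sourceText: string): ExtractedInterface[] {
  const sourceFile = ts.createSourceFile(fileName, sourceText, ts.ScriptTarget.Latest, true);
  const found: ExtractedInterface[] = [];

  const visit = (node: ts.Node): void => {
    if (ts.isInterfaceDeclaration(node)) {
      const fields: { name: string; type: string }[] = [];
      for (const member of node.members) {
        // Only plain properties are recorded; methods and index signatures are skipped.
        if (ts.isPropertySignature(member)) {
          fields.push({
            name: member.name.getText(sourceFile),
            type: member.type ? member.type.getText(sourceFile) : "any",
          });
        }
      }
      found.push({ name: node.name.text, fields });
    }
    ts.forEachChild(node, visit);
  };

  visit(sourceFile);
  return found;
}

// Illustrative input only.
const sample = `
export interface Trade {
  id: string;
  kind: "bond" | "equity";
  notional: number;
}
`;

const interfaces = extractInterfaces("trade.ts", sample);
console.log(interfaces[0].name, interfaces[0].fields.map((f) => f.name));
```

Because this works on syntax alone, it needs no type checker or running app, which is consistent with the "fast, precise" description of the deterministic mode.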
Setup:
const config = {
  sourceCodePath: '/absolute/path/to/your/src',
  useAgentsV2: true, // recommended for complex codebases
};
Alternatively, upload a zip file via the training panel for remote analysis.
Docs
The Docs source crawls your existing documentation website and extracts product knowledge from it. It helps bridge the gap between what your app can do (Probe + Code) and the business-level concepts behind it (Docs).
How it works:
- Starts at the docsUrl you configure
- Discovers all linked pages within the same domain
- Extracts article content using @mozilla/readability
- Converts HTML to Markdown
- Sends each page to Claude for knowledge extraction (batched)
- Streams progress in real time as pages are processed
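The "same domain" discovery step can be sketched with the standard `URL` class. This sketch assumes that "same domain" means an exact hostname match and that fragment-only differences point to the same page; `sameDomainLinks` is a hypothetical helper, and the real crawler's rules may differ.

```typescript
// Resolve raw hrefs against the page URL and keep only same-host HTTP(S) pages.
function sameDomainLinks(pageUrl: string, hrefs: string[]): string[] {
  const base = new URL(pageUrl);
  const result = new Set<string>();
  for (const href of hrefs) {
    let resolved: URL;
    try {
      resolved = new URL(href, base);
    } catch {
      continue; // skip malformed hrefs
    }
    if (resolved.protocol !== "http:" && resolved.protocol !== "https:") continue;
    if (resolved.hostname !== base.hostname) continue;
    resolved.hash = ""; // "#section" anchors point at the same page
    result.add(resolved.toString());
  }
  return Array.from(result);
}

const links = sameDomainLinks("https://docs.yourproduct.com/guide/", [
  "intro",                                    // relative link
  "/api/reference#auth",                      // root-relative link with a fragment
  "https://docs.yourproduct.com/guide/intro", // duplicate of the first entry
  "https://github.com/yourproduct",           // different host, dropped
  "mailto:help@yourproduct.com",              // not a web page, dropped
]);
console.log(links);
```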
Setup:
const config = {
  docsUrl: 'https://docs.yourproduct.com',
};
Streaming response: The docs crawl streams NDJSON progress updates so you can see pages being processed in real time.
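NDJSON is one JSON object per line, so a client can parse updates as chunks arrive, even when a chunk ends mid-line. The sketch below shows one way to do that. The `DocsProgress` fields are assumptions for illustration, since the actual event shape is not documented here, and `createNdjsonParser` is a hypothetical helper.

```typescript
// Hypothetical progress record; the real fields may differ.
interface DocsProgress {
  url: string;
  status: string;
}

// Incremental NDJSON parser: feed it text chunks, get back the complete records.
function createNdjsonParser<T>(): (chunk: string) => T[] {
  let buffer = "";
  return (chunk) => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // the last piece may be an incomplete line
    return lines
      .map((line) => line.trim())
      .filter((line) => line.length > 0)
      .map((line) => JSON.parse(line) as T);
  };
}

const parseChunk = createNdjsonParser<DocsProgress>();

// The second record is split across two chunks, as can happen on a real stream.
const firstBatch = parseChunk('{"url":"/intro","status":"done"}\n{"url":"/api",');
const secondBatch = parseChunk('"status":"processing"}\n');

console.log(firstBatch.length, secondBatch[0].url); // 1 /api
```

Buffering the trailing partial line is the key detail: chunk boundaries in a streamed response do not line up with record boundaries.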
Chat
The Chat source lets you teach ConCRG directly through conversation. This is ideal for domain knowledge that isn't visible in the UI or documented anywhere — business rules, edge cases, terminology, "why does this work this way?" context.
Example conversation:
You: In our system, a "settlement" only applies to bond trades,
not equity trades. The UI looks the same but the validation
rules are completely different.
ConCRG: Got it. I'll record that settlement workflows for bonds
have different validation from equity. Can you tell me
what the key differences are?
Each exchange is converted into knowledge triples and added to the store.
Combining Sources
The four sources are complementary, not redundant:
- Probe discovers what exists and how to navigate it
- Code reveals the underlying data model and permissions
- Docs provides the business-level concepts and terminology
- Chat adds tacit knowledge that no automated source can capture
The more sources you use, the richer and more accurate the knowledge graph becomes.