VeloQ
Pure CLI in / JSON contract out. No GUI required.
An agent-friendly profile-query CLI for Nsight Systems timelines and Nsight Compute kernel reports plus PyTorch/Kineto Chrome traces — one binary, one versioned JSON envelope, one shot per call. Built so a coding agent (or a shell script) can reason about GPU profiles without opening a GUI.
curl -fsSL https://raw.githubusercontent.com/lucifer1004/veloq/main/scripts/install.sh | bash
why VeloQ
Built for the way agents read profiles
VeloQ reads exported profiler evidence; it does not replace
nsys or ncu. It gives programs a stable,
compact contract for asking profile questions.
// stable contract
Versioned JSON envelope
Every success is a v1 envelope; every list response
is canonical data.rows[] with a stable per-row
key. Parse .data or
.error, never stderr.
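A minimal Python sketch of the "parse .data or .error" rule, assuming the envelope is a JSON object that carries either a `data` member with a `rows` list or an `error` member. The sample payloads, the `message` field inside `error`, and the helper names are made up for illustration; the real schema is whatever VeloQ documents.

```python
import json


class EnvelopeError(Exception):
    """Raised when the envelope carries .error instead of .data."""


def rows_from_envelope(stdout_text):
    """Return data.rows from one envelope, or raise on .error.

    Assumes the shape described above: success has .data.rows[],
    failure has .error. Anything else is treated as malformed.
    """
    envelope = json.loads(stdout_text)
    if envelope.get("error") is not None:
        raise EnvelopeError(json.dumps(envelope["error"]))
    if "data" not in envelope:
        raise ValueError("envelope has neither .data nor .error")
    return envelope["data"].get("rows", [])


# Hypothetical payloads for illustration only.
ok = '{"data": {"rows": [{"name": "elementwise_kernel", "total_ns": 506100000}]}}'
bad = '{"error": {"message": "report not found"}}'

print(rows_from_envelope(ok)[0]["name"])
try:
    rows_from_envelope(bad)
except EnvelopeError as exc:
    print("failed:", exc)
```

Branching on the envelope rather than on stderr text means the caller never has to pattern-match free-form messages.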
// token economy
Shaped rows, not dumps
Headline columns plus truncation signals (count vs
total_matched) so an agent spends tokens on
answers, not on scrolling raw text.
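One way an agent might act on the truncation signal, sketched in Python. It assumes `count` and `total_matched` sit directly under `.data`; that placement, the sample payload, and the helper name are assumptions, not something the text above confirms.

```python
def truncation_note(data):
    """Summarize whether a list response was truncated.

    Assumption: `count` (rows returned) and `total_matched` (rows that
    matched before any limit) sit directly under .data.
    """
    count = data["count"]
    total = data["total_matched"]
    if count < total:
        return f"showing {count} of {total} rows; narrow the filter or raise the limit"
    return f"all {total} matching rows shown"


# Hypothetical .data payload for illustration only.
sample = {"count": 3, "total_matched": 41, "rows": []}
print(truncation_note(sample))
```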
// scriptable
One shot, pipe to jq
Stateless calls, CSV/table projections where useful, and stable keys that make two captures diffable across a run.
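A sketch of what a stable per-row key buys when comparing two captures, in Python. The field name `key`, the sample rows, and the helper name are hypothetical; `total_ns` is borrowed from the query example on this page. Rows that appear in only one capture are treated as zero on the other side.

```python
def diff_by_key(before_rows, after_rows, key="key", metric="total_ns"):
    """Return {row key: after - before} for one numeric column.

    Assumption: each row carries a stable identifier under `key`
    (hypothetical field name) that matches across captures.
    """
    before = {row[key]: row[metric] for row in before_rows}
    after = {row[key]: row[metric] for row in after_rows}
    deltas = {}
    for k in sorted(before.keys() | after.keys()):
        # A missing row counts as zero time in that capture.
        deltas[k] = after.get(k, 0) - before.get(k, 0)
    return deltas


# Hypothetical rows from two captures, for illustration only.
before_rows = [
    {"key": "sgemm", "total_ns": 800},
    {"key": "allreduce", "total_ns": 300},
]
after_rows = [
    {"key": "sgemm", "total_ns": 650},
    {"key": "elementwise", "total_ns": 120},
]

for name, delta in diff_by_key(before_rows, after_rows).items():
    print(f"{name}: {delta:+d} ns")
```

Joining on a key rather than on row position keeps the comparison meaningful when the sort order or row count changes between captures.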
// extensible
Three sources, one shape
Nsight Systems, Nsight Compute, and PyTorch/Kineto behind a
pluggable ProfileSource trait. PyTorch covers
Perfetto-style Chrome traces.
report-ready figures
Static timeline SVGs, generated from the CLI
veloq viz timeline writes a bounded NSys timeline
artifact while stdout stays a JSON envelope. Agents can cite the
row metadata and embed the SVG directly in a report.
[Figure: viz timeline artifact with top-kernel highlights, track roles, CUDA API annotations, and idle gaps.]
query → jq → answer
A query is one line
Ask a question, get rows, reshape with jq. No session state or scraping.
$ ask
# top 3 kernels by total GPU time
veloq stats trace.nsys-rep --type kernel \
  --sort total:desc --limit 3 \
  | jq '.data.rows[] | {name, ms: (.total_ns/1e6)}'
→ get
{ "name": "ampere_sgemm_128x64_nn", "ms": 812.4 }
{ "name": "elementwise_kernel", "ms": 506.1 }
{ "name": "ncclDevKernel_AllReduce", "ms": 333.9 }
how it compares
Compared with common alternatives
Agents often fall back to the Nsight GUI, raw tool text in context, or hand-rolled SQLite + jq. VeloQ keeps the same work scriptable and contract-shaped.
| | Nsight GUI | Raw nsys/ncu text | SQLite + jq | VeloQ |
|---|---|---|---|---|
| Scriptable / one-shot | ✗ | ~ ad hoc | ✓ | ✓ |
| Token-efficient for an agent | n/a | ✗ broad dumps | ~ | ✓ shaped rows |
| Stable typed contract | ✗ | ✗ free text | ✗ schema you own | ✓ versioned envelope |
| Cross-capture diffable | ✗ | ✗ | ~ | ✓ stable per-row key |
| Zero setup per query | ✓ | ✓ | ✗ | ✓ |
When not to use VeloQ: use the Nsight GUI for interactive timeline exploration or one-off visual inspection. Use VeloQ for programmatic, repeatable, agent- or script-driven querying.
the surface
One tool, one envelope, many verbs
NSys verbs are hoisted to the top level (also under
veloq nsys ...); NCU and PyTorch/Kineto verbs live
under veloq ncu ... and
veloq pytorch ....
// Nsight Systems
// Nsight Compute
// PyTorch/Kineto
// meta
install
One script, three skills
Installs the veloq binary plus three Agent Skills
(nsys-profile-analysis,
ncu-profile-analysis, and
pytorch-profile-analysis) under
~/.agents/skills/. Linux x86_64/aarch64 and macOS
x86_64/arm64.
# quick install (binary + skills)
curl -fsSL https://raw.githubusercontent.com/lucifer1004/veloq/main/scripts/install.sh | bash

# keep it current (binary AND bundled Agent Skills)
veloq self-update

# build from source
cargo build --release -p veloq
# → target/release/veloq
Re-running the installer or veloq self-update refreshes
bundled Agent Skills and removes stale skill files.
To install the Codex plugin from a VeloQ checkout, run
codex plugin marketplace add . (the trailing dot is the checkout directory),
then codex plugin add veloq@veloq.
The plugin installs the Agent Skills only; the skills still require the VeloQ CLI
for evidence extraction.
Claude Code plugin metadata is also provided. Run
/plugin marketplace add https://github.com/lucifer1004/veloq.git,
then /plugin install veloq@veloq.
Codex plugin metadata lives under .codex-plugin/;
Claude plugin metadata remains under
.claude-plugin/. Full docs in the
README.