Codebase tour

This page answers "I want to add or change X; which files do I touch?". It is a directory map, not a behavior reference. For what each piece does, see the Simulator and Profiler sections.

The five domains

```
LLMServingSim/
├── serving/     # Simulator: Python, the core loop
├── profiler/    # Profiler: Python, vLLM-based latency capture
├── bench/       # Bench: real vLLM run + sim validation
├── workloads/   # Workloads: JSONL traces + generators
├── configs/     # Configs: JSON for cluster / model / PIM
├── scripts/     # Env scripts: Docker launchers + builders
└── astra-sim/   # Backend: C++ analytical network simulator
```

Each domain has a clear boundary. A typical PR touches one or two of these, not all of them. If you find yourself editing four domains for a single change, stop and reconsider the scope.

Simulator (serving/)

Where most contributor work happens.

```
serving/
├── __main__.py              # CLI + main loop
└── core/
    ├── scheduler.py         # vLLM-style continuous batching
    ├── trace_generator.py   # Profile lookup -> text trace
    ├── memory_model.py      # KV / weight / CXL byte accounting
    ├── graph_generator.py   # Text trace -> Chakra protobuf
    ├── controller.py        # ASTRA-Sim subprocess IPC
    ├── router.py            # Request routing across instances
    ├── gate_function.py     # MoE expert routing
    ├── config_builder.py    # Cluster config -> ASTRA-Sim inputs
    ├── power_model.py       # Power / energy estimation
    ├── pim_model.py         # PIM device model
    ├── request.py           # Request / Batch dataclasses
    ├── radix_tree.py        # Prefix cache (RadixCache, from SGLang)
    ├── logger.py            # Rich-based logging + stdio capture
    └── utils.py             # Model config loading, formatters
```

Where to touch by intent:

| Intent | Edit |
| --- | --- |
| Change scheduling policy | `scheduler.py` |
| Change how latency is looked up | `trace_generator.py` (`_lookup_*`) |
| Change byte accounting (KV, weights, prefix cache) | `memory_model.py` |
| Change inter-instance routing | `router.py` |
| Add a new CLI flag | `__main__.py` (argparse), then thread through |
| Change MoE expert distribution | `gate_function.py` |
| Change ASTRA-Sim input generation | `config_builder.py` |
| Add a new power component | `power_model.py` |
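
The "add a flag, then thread through" pattern from the table is worth spelling out. The sketch below is hypothetical: `--max-batch-tokens` and `run()` are invented names for illustration, not flags or functions that exist in `serving/__main__.py`.

```python
import argparse

# Hypothetical sketch of the pattern; the real flags in serving/__main__.py differ.
def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="serving")
    # Step 1: declare the flag at the CLI boundary...
    parser.add_argument(
        "--max-batch-tokens", type=int, default=8192,
        help="token budget per scheduled batch (illustrative name)",
    )
    return parser

def run(argv: list[str]) -> int:
    args = build_parser().parse_args(argv)
    # Step 2: ...then thread the value through explicitly to the component
    # that consumes it (e.g. pass it into the scheduler's constructor),
    # rather than reading global state deep inside core/.
    return args.max_batch_tokens

# run(["--max-batch-tokens", "4096"]) -> 4096
```

Threading the value through constructors keeps `core/` modules testable in isolation, since nothing in them reaches back into argparse.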

Profiler (profiler/)

```
profiler/
├── __main__.py                # CLI dispatch (profile / slice)
├── core/                      # internals (runner, engine, categories, fit_alpha)
├── models/<model_type>.yaml   # Architecture catalogs (one per HF model_type)
├── perf/<hw>/<model>/...      # Output bundles (CSV per category)
└── profile.sh                 # Editable user template
```

Where to touch by intent:

| Intent | Edit |
| --- | --- |
| Add a new hardware target | Run the profiler with `HARDWARE=` set; output lands in `perf/<hw>/`. See Profiler / Adding hardware |
| Add a new model architecture | Drop a YAML in `profiler/models/<model_type>.yaml`. See Profiler / Adding model architecture |
| Change the skew alpha fit | `core/fit_alpha.py` |
| Change what categories get profiled | `core/categories.py` + `core/runner.py` |
| Change output CSV columns | `core/writer.py` (and `_load_perf_db()` in `serving/core/trace_generator.py` to consume them) |
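
The coupling in the last row (writer produces columns, `_load_perf_db()` consumes them) follows a common pattern: index the profiled rows by a lookup key at load time. This is a generic sketch, not the actual implementation; the column names `category`, `batch`, and `latency_us` are invented for illustration.

```python
import csv
import io

# Illustrative only: these column names are not the profiler's real schema.
def load_perf_db(csv_text: str) -> dict:
    """Index profiled latencies by (category, batch size) for O(1) lookup."""
    db = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        db[(row["category"], int(row["batch"]))] = float(row["latency_us"])
    return db

sample = "category,batch,latency_us\nattn,8,123.4\nmlp,8,98.7\n"
db = load_perf_db(sample)
# db[("attn", 8)] -> 123.4
```

The practical implication: if you add a column on the writer side, the loader keeps working (extra columns are ignored by the key/value extraction), but renaming or removing a column breaks the lookup, so change both sides in one PR.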

Bench (bench/)

```
bench/
├── __main__.py           # CLI (run / validate)
├── core/                 # AsyncLLM driver, recorder, validator
├── examples/<model>/     # Committed end-to-end runs
└── results/<run_id>/     # Output for ad-hoc runs
```

You'll touch this only if you change the validation methodology itself (how vLLM is driven, what metrics are compared, what plots are emitted). For day-to-day "did my change regress?" use, see Validating your changes.

Configs (configs/)

```
configs/
├── cluster/<name>.json       # Cluster topology (the main thing)
├── model/<org>/<name>.json   # Model architecture (subset of HF config.json)
└── pim/<name>.ini            # PIM device specs (DRAMSim3 format)
```

Cluster configs are the most edited files outside serving/. Adding a new scenario almost always means dropping a new configs/cluster/<scenario>.json without touching simulator code at all. The field-by-field schema lives in Reference / Cluster config.
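
Since new scenarios are usually small variations on an existing config, one workable approach is to derive them programmatically. This is a sketch only: `npu_count` is a placeholder key, not the real cluster schema (see Reference / Cluster config for the actual fields).

```python
import json
from pathlib import Path

# Sketch only: "npu_count" is a placeholder key, not the real schema.
def derive_scenario(base: Path, name: str, **overrides) -> Path:
    """Copy an existing cluster config, override a few fields, save as a new scenario."""
    cfg = json.loads(base.read_text())
    cfg.update(overrides)
    out = base.with_name(f"{name}.json")
    out.write_text(json.dumps(cfg, indent=2))
    return out

# e.g. derive_scenario(Path("configs/cluster/<baseline>.json"), "<scenario>", npu_count=16)
```

Deriving from a known-good config avoids typos in fields you did not mean to change, and keeps scenario diffs reviewable.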

Workloads (workloads/)

```
workloads/
├── *.jsonl        # Datasets (one request or session per line)
├── generators/    # ShareGPT / SWE-bench JSONL builders
└── README.md      # JSONL format reference
```

Adding a new workload generator is a contained change: a new module under generators/, runnable as python -m workloads.generators.<your_module>. See Workloads / ShareGPT generators for the existing pattern.
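
A generator module of the shape described above might look like the sketch below. The field names (`prompt_len`, `output_len`) are illustrative, not the documented JSONL schema; check workloads/README.md for the real format before writing one.

```python
import json
import random
import sys

# Hypothetical sketch: field names are illustrative, not the real schema.
def generate(n: int, seed: int = 0):
    """Yield one synthetic request dict per JSONL line, deterministically."""
    rng = random.Random(seed)
    for i in range(n):
        yield {
            "id": i,
            "prompt_len": rng.randint(16, 1024),
            "output_len": rng.randint(1, 256),
        }

def main(n: int = 100) -> None:
    # One JSON object per line: the one-request-per-line JSONL convention.
    for req in generate(n):
        sys.stdout.write(json.dumps(req) + "\n")
```

Seeding the RNG makes the workload reproducible, which matters when comparing simulator runs across branches.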

ASTRA-Sim (astra-sim/)

C++ network simulator, lives as a submodule. Don't edit unless the change targets simulator integration. Most simulator-side changes never touch this.

The handful of files you might edit:

| File | Why |
| --- | --- |
| `astra-sim/extern/graph_frontend/chakra/src/converter/llm_converter.py` | New trace `comm_type` syntax, new memory location enum |
| `astra-sim/astra-sim/system/Workload.cc` | Custom collective issuance, `involved_dim` handling |
| `astra-sim/astra-sim/system/AstraMemoryAPI.hh` | New memory tier enum (paired with `llm_converter.py`) |
| `astra-sim/inputs/...` | Don't edit. Generated by `config_builder.py` on every run |

If you do edit ASTRA-Sim, rerun ./scripts/compile.sh before testing.

Scripts (scripts/)

```
scripts/
├── docker-sim.sh      # Sim container launcher
├── docker-vllm.sh     # vLLM container launcher (profiler / bench)
├── install-vllm.sh    # Bare-metal vLLM install (uv venv)
└── compile.sh         # ASTRA-Sim + Chakra build
```

You'll touch these rarely. If you add a new entry point, prefer python -m <module> (handled inside the existing containers) over adding more shell scripts.

Tests and fixtures

There is no unit-test suite. Validation is done by:

  1. Running a quick smoke test with python -m serving … and inspecting the output CSV.
  2. Running python -m bench validate against a known-good vLLM replay (see Validating your changes).

When adding a feature that has clean inputs and outputs (a new _lookup_* function, a new memory accounting helper), feel free to add a script under scripts/ or a notebook checked into your branch. The project has not adopted a formal test framework yet; that itself is an open contribution opportunity.
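
In the absence of a test framework, even a tiny assertion script catches regressions early. This is a hypothetical example of what such a script might check; the column name `latency_ms` is a placeholder, so adapt it to the CSV your simulator run actually emits.

```python
import csv
import io

# Hypothetical smoke check: "latency_ms" is a placeholder column name.
def smoke_check(csv_text: str) -> bool:
    """Fail fast on obviously broken output: no rows, or negative latency."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return False
    return all(float(r["latency_ms"]) >= 0 for r in rows)

# smoke_check("req_id,latency_ms\n0,12.5\n1,9.8\n") -> True
```

Checks like this are cheap to run after every change and make a natural seed for a future pytest suite.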

Where docs live

| Audience | Location |
| --- | --- |
| User-facing docs (this site) | `docs/` |
| Per-module developer notes | `<module>/README.md` (each top-level Python module has one) |
| Top-level project README | `README.md` |
| Project context for AI agents | `CLAUDE.md` (mirrors `AGENTS.md`) |

When you change behavior, update the relevant page under docs/. When you add a feature, also update the module's own README.md if it covers something the website doesn't.

What's next