Onboarding
This page walks you through the dev environment from a fresh clone
to a working simulator run. The goal: by the end you should be able
to edit a Python file in serving/, rerun a simulation, and see
your change reflected in the output CSV.
If you only plan to read code (not run it), skip to Codebase tour instead.
Prerequisites
- Linux (Ubuntu 22.04+ tested). macOS works for editing but not for running the profiler / bench (those need an NVIDIA GPU).
- Docker (for the simplest path) or the bare-metal vLLM installer if you can't use Docker.
- ~5 GB free disk for the simulator container, ~10 GB additional if you'll also profile or bench.
- A GitHub account (for the eventual PR).
You do not need a GPU just to run the simulator. The bundled RTXPRO6000 / H100 profiles let you simulate without hardware.
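If you want to confirm the disk requirement before pulling images, a pre-flight check along these lines may help. This is only a sketch: the helper names are hypothetical, the sizes are the approximate figures from the list above, and it measures whatever filesystem holds the path you pass in.

```python
import shutil

GB = 10**9  # decimal gigabytes; close enough for a rough pre-flight check

# Approximate figures taken from the prerequisites list.
SIM_GB = 5             # simulator container
PROFILE_EXTRA_GB = 10  # additional space for profiling / bench


def required_bytes(profiling: bool = False) -> int:
    """Return the approximate free space needed, in bytes."""
    total_gb = SIM_GB + (PROFILE_EXTRA_GB if profiling else 0)
    return total_gb * GB


def has_enough_disk(free_bytes: int, profiling: bool = False) -> bool:
    """True if free_bytes covers the approximate requirement."""
    return free_bytes >= required_bytes(profiling)


def check_path(path: str = ".", profiling: bool = False) -> bool:
    """Check free space on the filesystem that holds `path`."""
    return has_enough_disk(shutil.disk_usage(path).free, profiling)
```

Docker usually keeps images under its own data directory (by default /var/lib/docker on Linux) rather than in the repo, so pass that path to check_path if it sits on a different filesystem.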
1. Clone with submodules
ASTRA-Sim lives as a git submodule. Always clone with
--recurse-submodules:
git clone --recurse-submodules https://github.com/casys-kaist/LLMServingSim.git
cd LLMServingSim
If you already cloned without submodules:
git submodule update --init --recursive
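To verify that the submodules were checked out, you can inspect the output of git submodule status, which prefixes each uninitialized submodule with a "-". The sketch below parses that output; the function names are hypothetical and it assumes submodule paths contain no spaces.

```python
import subprocess


def uninitialized_submodules(status_output: str) -> list[str]:
    """Return paths that `git submodule status` marks with a leading '-'."""
    missing = []
    for line in status_output.splitlines():
        if line.startswith("-"):
            # Line format: "-<sha1> <path>"; the path is the second field.
            fields = line[1:].split()
            if len(fields) >= 2:
                missing.append(fields[1])
    return missing


def check_repo(repo_dir: str = ".") -> list[str]:
    """Run git in repo_dir and report uninitialized submodules."""
    result = subprocess.run(
        ["git", "submodule", "status", "--recursive"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return uninitialized_submodules(result.stdout)
```

Calling check_repo() from the repo root should return an empty list once the submodules are in place; any path it returns still needs git submodule update --init --recursive.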
2. Pick your container
Two containers, one per role:
| Container | Image | When you need it |
|---|---|---|
| scripts/docker-sim.sh | astrasim/tutorial-micro2024 + Python deps | Running the simulator. Always. |
| scripts/docker-vllm.sh | vllm/vllm-openai:v0.19.0 | Profiling new hardware, running the bench, generating workloads from ShareGPT. Only if you touch those. |
For most contributor work (scheduler, memory model, trace generator, configs), the sim container is all you need:
./scripts/docker-sim.sh
This drops you into a shell at /app/LLMServingSim with all Python
deps installed. The repo root is bind-mounted, so edits on your host
are immediately visible inside.
3. Build ASTRA-Sim and Chakra
Inside the sim container, on first run:
./scripts/compile.sh
This compiles ASTRA-Sim's analytical backend (used by the simulator)
and installs the Chakra trace converter. Takes a few minutes the
first time, ~30 seconds on incremental rebuilds. Rerun whenever you
touch astra-sim/ C++ sources.
If the compile fails with a missing dependency, the most common
cause is the submodule not being checked out. Rerun
git submodule update --init --recursive from the host and try
again.
4. Smoke run
The fastest "is everything working?" check is the bundled single-instance trace:
python -m serving \
--cluster-config configs/cluster/single_node_single_instance.json \
--dataset workloads/example_trace.jsonl \
--output outputs/onboarding_smoke.csv \
--num-reqs 10
What you should see:
- A few seconds of throughput log lines (step=N batch=K prompt_t=… decode_t=…).
- A summary line at the end with totals (Finished N requests).
- outputs/onboarding_smoke.csv containing one row per request.
If you got that, the simulator is working. If you got an error, see Troubleshooting.
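The "one row per request" check can be automated. The following is a minimal sketch, assuming the output CSV starts with a single header row (the actual column layout is not described here, so the code never looks at column names). Both function names are hypothetical.

```python
import csv


def count_data_rows(csv_path: str) -> int:
    """Count rows after the header (assumes the first row is a header)."""
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the assumed header row
        return sum(1 for row in reader if row)  # ignore blank lines


def smoke_ok(csv_path: str, expected_requests: int) -> bool:
    """True if the CSV holds exactly one data row per request."""
    return count_data_rows(csv_path) == expected_requests
```

For the smoke command above, smoke_ok("outputs/onboarding_smoke.csv", 10) should be true if the header assumption holds, since the run was limited with --num-reqs 10.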
5. Make a real change
Time to actually edit something. A safe first edit: bump the default log interval so you can see throughput updates more often.
Open serving/__main__.py and find the --log-interval arg
(it defaults to 1.0). Change the default to 0.5, save, and rerun
the smoke command from step 4. You should see roughly twice as many
throughput log lines.
Revert the change (git checkout serving/__main__.py) when you're
done playing.
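For orientation, here is roughly what such an argument definition could look like. This is a hypothetical stand-in, not the code in serving/__main__.py: it assumes the CLI is built with argparse and that the interval is measured in seconds, neither of which is confirmed here.

```python
import argparse


def build_parser(default_interval: float = 1.0) -> argparse.ArgumentParser:
    """Hypothetical stand-in for the parser in serving/__main__.py."""
    parser = argparse.ArgumentParser(prog="serving")
    parser.add_argument(
        "--log-interval",
        type=float,
        default=default_interval,  # the value you edit in this step
        help="time between throughput log lines (assumed to be seconds)",
    )
    return parser


def expected_log_lines(duration_s: float, interval_s: float) -> int:
    """Rough count of log lines emitted over duration_s."""
    return int(duration_s // interval_s)
```

Under these assumptions, halving the interval doubles the line count for the same run length, which is the effect the edit is meant to make visible. You can also pass --log-interval 0.5 on the command line to try a value without touching the default.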
6. Read the next pages
You're now set up. Before opening a PR, please skim:
- Codebase tour: where each kind of change lives.
- Coding conventions: the small set of rules that keep the codebase readable.
- Validating your changes: how to know your change didn't break anything (we don't have a unit-test suite, so this matters).
- PR workflow: branch, commit message style, PR template.
Common setup gotchas
- --recurse-submodules forgotten → ASTRA-Sim is missing, compile.sh fails immediately. Rerun git submodule update --init --recursive.
- Wrong container for the task → profiler / bench scripts will complain about missing CUDA or vLLM. Switch to scripts/docker-vllm.sh.
- Edits not visible inside the container → check that your edit landed under the cloned repo dir (the container mounts the repo root, not your full home directory).
- Python version mismatch → both containers ship the right Python; don't try to install your own. If you must run bare-metal, scripts/install-vllm.sh handles the vLLM side.
- docker-sim.sh says container exists → either reattach (docker exec -it servingsim_docker bash) or remove (docker rm -f servingsim_docker) before rerunning.
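The "container exists" case can be detected up front by listing container names with docker ps -a. The sketch below is one possible approach, with hypothetical helper names; it compares whole names, so a container with a longer, similar name is not counted as a match.

```python
import subprocess

CONTAINER = "servingsim_docker"  # container name used in the commands above


def container_listed(names_output: str, name: str = CONTAINER) -> bool:
    """True if `name` appears as an exact line in the name listing."""
    return name in (line.strip() for line in names_output.splitlines())


def list_container_names() -> str:
    """Return names of all containers (running or stopped), one per line."""
    result = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def suggest(names_output: str) -> str:
    """Suggest the next command based on whether the container exists."""
    if container_listed(names_output):
        # docker exec only works while the container is running.
        return f"docker exec -it {CONTAINER} bash"
    return "./scripts/docker-sim.sh"
```

Calling suggest(list_container_names()) prints nothing by itself; it returns the command to try next. If the container exists but is stopped, docker exec will fail, and removing it with docker rm -f before rerunning the script is the simpler route.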
Where to ask for help
- GitHub Discussions: casys-kaist/LLMServingSim/discussions. First stop for "how do I…" questions.
- GitHub Issues: file under casys-kaist/LLMServingSim/issues with [contributor] in the title for setup blockers.
- Email the main contributors: jhcho@casys.kaist.ac.kr and hmchoi@casys.kaist.ac.kr (CC both whenever possible).