Simulator setup
The simulator runs in a Docker container based on
astrasim/tutorial-micro2024.
The container ships with the C++ ASTRA-Sim backend pre-built and
provides a Python environment for the simulator.
This is the install path everyone needs. If you also want to profile new hardware or run end-to-end vLLM validation, follow vLLM setup afterwards.
1. Clone the repository
The repo includes ASTRA-Sim and Chakra as git submodules, so you must
clone with --recurse-submodules:
git clone --recurse-submodules https://github.com/casys-kaist/LLMServingSim.git
cd LLMServingSim
If you already cloned without --recurse-submodules, fix it with:
git submodule update --init --recursive
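To confirm the submodules were actually checked out, you can inspect `git submodule status --recursive`, which prefixes uninitialized submodules with `-`. The sketch below wraps that check in a small filter. The function name `uninitialized_submodules` is hypothetical and not part of the repo, and it assumes submodule paths contain no spaces.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LLMServingSim): list submodules that are
# still uninitialized. `git submodule status` marks such entries with a
# leading "-" in front of the commit hash.
uninitialized_submodules() {
    # Reads `git submodule status` output on stdin and prints the path
    # (second field) of every line whose first character is "-".
    awk 'substr($0, 1, 1) == "-" { print $2 }'
}

# Usage, from the repo root:
#   git submodule status --recursive | uninitialized_submodules
# Empty output means every submodule has been initialized.
```

If the helper prints any paths, rerun the `git submodule update` command above.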
2. Launch the simulator container
./scripts/docker-sim.sh
This:
- Mounts the repo root into the container at /app/LLMServingSim
- Installs the Python deps the simulator needs (pyyaml, transformers, pandas, xgboost, matplotlib, …)
- Drops you into a bash shell at /app/LLMServingSim
The container is named servingsim_docker. To re-attach later (e.g.,
after a reboot):
docker start -ai servingsim_docker
To remove and start fresh:
docker rm -f servingsim_docker
./scripts/docker-sim.sh
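If you are not sure whether the container already exists, the choice between re-attaching and launching fresh can be sketched as below. This is a minimal illustration, not a script shipped with the repo: the function name `next_command` is hypothetical, and it only prints the command to run rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LLMServingSim): suggest how to enter the
# simulator container. Reads container names, one per line, on stdin.
next_command() {
    local name="${1:-servingsim_docker}"
    # -x matches the whole line, -F treats the name as a literal string,
    # so "servingsim_docker_old" does not count as a match.
    if grep -qxF -- "$name"; then
        echo "docker start -ai $name"
    else
        echo "./scripts/docker-sim.sh"
    fi
}

# Usage, on the host:
#   docker ps -a --format '{{.Names}}' | next_command
```

`docker ps -a` lists stopped containers as well as running ones, which is why the sketch uses it rather than plain `docker ps`.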
3. Build ASTRA-Sim and install Chakra
Inside the simulator container, compile the analytical backend and install Chakra:
./scripts/compile.sh
What this does:
- pip install Chakra (the C++ → protobuf converter ASTRA-Sim consumes) from astra-sim/extern/graph_frontend/chakra.
- Compile the analytical backend of ASTRA-Sim (astra-sim/build/astra_analytical/build.sh).
The build takes 2–5 minutes on a typical machine. When it finishes you should see:
Compilation finished successfully.
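If you run the build unattended, one way to check the result is to save the output and search it for the success message quoted above. The sketch below assumes the message appears verbatim in the build output. The function name `build_succeeded` and the log file name `compile.log` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LLMServingSim): return success only if a
# saved build log contains the success message from the docs.
build_succeeded() {
    # -F matches the message literally, -q suppresses output.
    grep -qF 'Compilation finished successfully.' "$1"
}

# Usage, inside the simulator container (compile.log is an arbitrary name):
#   ./scripts/compile.sh 2>&1 | tee compile.log
#   if build_succeeded compile.log; then echo "build OK"; else echo "build failed, inspect compile.log"; fi
```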
:::tip ns3 backend
The compile.sh script also has a commented-out block that builds
the ns3 backend (packet-level network simulation). Most users
don't need this. Uncomment it only if you plan to run with --network-backend ns3.
:::
4. Verify the install
Run the bundled smoke test from inside the simulator container:
python -m serving \
--cluster-config 'configs/cluster/single_node_single_instance.json' \
--dtype float16 --block-size 16 \
--dataset 'workloads/example_trace.jsonl' \
--output 'outputs/example_single_run.csv' \
--log-interval 1.0
You should see throughput logs scrolling once per second, and a
final per-request CSV at outputs/example_single_run.csv. If you
see FileNotFoundError: profile_*.csv or a similar error, check
Troubleshooting → Missing profile data.
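To check that the smoke test actually produced the CSV, a quick sanity check along these lines may help. It is a sketch only: the function name `csv_row_count` is hypothetical, and it makes no assumption about the file's columns or whether it has a header row.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of LLMServingSim): print the number of
# non-empty lines in a CSV, failing if the file is missing or empty.
csv_row_count() {
    # -s is true only when the file exists and has a size greater than zero.
    [ -s "$1" ] || { echo "missing or empty: $1" >&2; return 1; }
    # Count lines that contain at least one character.
    grep -c . "$1"
}

# Usage, inside the simulator container after the smoke test:
#   csv_row_count outputs/example_single_run.csv
```

A non-zero exit status means the run did not write any output, which usually points back to an error earlier in the log.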
You're done
The simulator is installed. Continue with one of:
- Quickstart: walk through the example run, understand the flags, and read the output.
- vLLM setup: install the vLLM environment for profiling new hardware or running the benchmark suite. (Optional.)