# Cluster config schema

Formal field-by-field schema for the JSON file passed via `--cluster-config`. For a guided walkthrough with examples, see Examples → Cluster config explained. This page is the lookup reference: every field, every type, every default.
## File location

Configs live at `configs/cluster/<name>.json`. The simulator reads the file once at startup, and `serving/core/config_builder.py` generates the derived ASTRA-Sim input files (`network.yml`, `system.json`, `memory_expansion.json`).
## Top-level

```json
{
  "num_nodes": 1,
  "link_bw": 16,
  "link_latency": 20000,
  "nodes": [...],
  "cxl_mem": {...}
}
```
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `num_nodes` | int | ✓ | — | Number of physical nodes in the cluster |
| `link_bw` | float | ✓ | — | Inter-node link bandwidth in GB/s |
| `link_latency` | float | ✓ | — | Inter-node link latency in ns |
| `nodes` | array | ✓ | — | Per-node configs; length must equal `num_nodes` |
| `cxl_mem` | object | optional | absent | CXL memory expansion (see below) |
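
Assembled from the field examples used throughout this page, a minimal single-node, single-instance config looks like this (values are illustrative, not recommendations):

```json
{
  "num_nodes": 1,
  "link_bw": 16,
  "link_latency": 20000,
  "nodes": [
    {
      "num_instances": 1,
      "cpu_mem": {"mem_size": 512, "mem_bw": 256, "mem_latency": 0},
      "instances": [
        {
          "model_name": "Qwen/Qwen3-32B",
          "hardware": "RTXPRO6000",
          "npu_mem": {"mem_size": 96, "mem_bw": 1597, "mem_latency": 0},
          "tp_size": 2,
          "pp_size": 1,
          "pd_type": null
        }
      ]
    }
  ]
}
```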
## `cxl_mem` (top-level, optional)

```json
"cxl_mem": {
  "mem_size": 1024,
  "mem_bw": 60,
  "mem_latency": 250,
  "num_devices": 4
}
```
| Field | Type | Required | Description |
|---|---|---|---|
| `mem_size` | float | ✓ | Capacity per device in GB |
| `mem_bw` | float | ✓ | Bandwidth per device in GB/s |
| `mem_latency` | float | ✓ | Access latency in ns |
| `num_devices` | int | ✓ | Number of CXL devices (`cxl:0` through `cxl:N-1`) |
When present, instances can reference `cxl:N` in their `placement` field.
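
For example, with `"num_devices": 4` the valid IDs are `cxl:0` through `cxl:3`; a placement fragment pinning the embedding weights to the last device might look like this (the full rule schema is under `placement` below):

```json
"layers": {
  "embedding": {"weights": "cxl:3", "kv_loc": "npu", "kv_evict_loc": "cpu"}
}
```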
## Per-node (`nodes[i]`)

```json
{
  "num_instances": 2,
  "cpu_mem": {
    "mem_size": 512,
    "mem_bw": 256,
    "mem_latency": 0,
    "pim_config": "DDR4_8GB_3200_pim"
  },
  "instances": [...],
  "power": {...}
}
```
| Field | Type | Required | Description |
|---|---|---|---|
| `num_instances` | int | ✓ | Number of serving instances on this node |
| `cpu_mem` | object | ✓ | Host CPU memory config (see below) |
| `instances` | array | ✓ | Per-instance configs; length must equal `num_instances` |
| `power` | object | optional | Power model config (see below) |
### `cpu_mem`

| Field | Type | Required | Description |
|---|---|---|---|
| `mem_size` | float | ✓ | Host CPU memory capacity in GB |
| `mem_bw` | float | ✓ | CPU memory bandwidth in GB/s |
| `mem_latency` | float | ✓ | CPU memory latency in ns |
| `pim_config` | string | optional | Name of a PIM device config in `configs/pim/`. See PIM config |
### `power` (optional)

Enables the power model on this node. See Examples → Power modeling for the full schema. Top-level structure:

```json
"power": {
  "base_node_power": 60,
  "npu": {"<hardware>": {...}},
  "cpu": {...},
  "dram": {...},
  "link": {...},
  "nic": {...},
  "storage": {...}
}
```
| Sub-field | Required | Description |
|---|---|---|
| `base_node_power` | ✓ | Always-on host platform power in W |
| `npu.<hardware>.idle_power` | ✓ | NPU idle wattage |
| `npu.<hardware>.standby_power` | ✓ | NPU post-compute standby wattage |
| `npu.<hardware>.active_power` | ✓ | NPU active compute wattage |
| `npu.<hardware>.standby_duration` | ✓ | Time to stay in standby after compute, in ns |
| `cpu.idle_power`, `cpu.active_power`, `cpu.util` | ✓ | CPU baseline + utilization fraction |
| `dram.dimm_size`, `dram.idle_power`, `dram.energy_per_bit` | ✓ | DIMM size, idle power, per-bit energy |
| `link.num_links`, `link.idle_power`, `link.energy_per_bit` | ✓ | Network link power |
| `nic.num_nics`, `nic.idle_power` | ✓ | NIC count and baseline |
| `storage.num_devices`, `storage.idle_power` | ✓ | Storage devices |
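
As a sketch, a fully populated `power` block could look like the following. Every number here is an illustrative placeholder, not a measured value; see Examples → Power modeling for realistic figures and units.

```json
"power": {
  "base_node_power": 60,
  "npu": {
    "RTXPRO6000": {
      "idle_power": 30,
      "standby_power": 60,
      "active_power": 300,
      "standby_duration": 1000000
    }
  },
  "cpu": {"idle_power": 50, "active_power": 150, "util": 0.3},
  "dram": {"dimm_size": 64, "idle_power": 3, "energy_per_bit": 1.2e-11},
  "link": {"num_links": 4, "idle_power": 2, "energy_per_bit": 5e-12},
  "nic": {"num_nics": 2, "idle_power": 10},
  "storage": {"num_devices": 2, "idle_power": 5}
}
```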
## Per-instance (`instances[i]`)

```json
{
  "model_name": "Qwen/Qwen3-32B",
  "hardware": "RTXPRO6000",
  "npu_mem": {"mem_size": 96, "mem_bw": 1597, "mem_latency": 0},
  "num_npus": 2,
  "tp_size": 2,
  "pp_size": 1,
  "ep_size": 1,
  "dp_group": null,
  "pd_type": null,
  "placement": {...}
}
```
### Required fields

| Field | Type | Description |
|---|---|---|
| `model_name` | string | HF id. Must match a config at `configs/model/<model_name>.json` (see Model config) |
| `hardware` | string | Hardware label. Must match `profiler/perf/<hardware>/` |
| `npu_mem.mem_size` | float | Per-GPU NPU memory in GB |
| `npu_mem.mem_bw` | float | Per-GPU NPU memory bandwidth in GB/s |
| `npu_mem.mem_latency` | float | Per-GPU NPU memory latency in ns |
| `pd_type` | string \| null | `"prefill"`, `"decode"`, or `null` (combined) |
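
For a disaggregated deployment, pair a prefill instance with a decode instance on the same node. A sketch, with elided fields marked `...`:

```json
"instances": [
  {"model_name": "Qwen/Qwen3-32B", "hardware": "RTXPRO6000",
   "npu_mem": {"mem_size": 96, "mem_bw": 1597, "mem_latency": 0},
   "tp_size": 2, "pd_type": "prefill"},
  {"model_name": "Qwen/Qwen3-32B", "hardware": "RTXPRO6000",
   "npu_mem": {"mem_size": 96, "mem_bw": 1597, "mem_latency": 0},
   "tp_size": 2, "pd_type": "decode"}
]
```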
### Parallelism (at least one of `num_npus` / `tp_size`)

| Field | Type | Default | Description |
|---|---|---|---|
| `num_npus` | int | inferred from `tp_size * pp_size` | Total GPUs for this instance |
| `tp_size` | int | inferred from `num_npus // pp_size` | Tensor-parallel degree |
| `pp_size` | int | 1 | Pipeline-parallel degree |
| `ep_size` | int | `tp_size` (MoE) / 1 (dense) | Expert-parallel degree |
| `dp_group` | string \| null | `null` | Group ID. Instances with the same string share experts via cross-instance ALLTOALL |
Constraints:

- `num_npus == tp_size * pp_size` (always)
- Without `dp_group`: `ep_size <= tp_size`
- For MoE: `ep_size` must divide `num_local_experts`
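
For example, two MoE instances placed in the same group must agree on `ep_size` and `tp_size`. A sketch, where `"moe-dp-0"` is an arbitrary illustrative group ID and elided fields are marked `...`:

```json
"instances": [
  {"model_name": "...", "hardware": "...", "npu_mem": {...},
   "tp_size": 4, "ep_size": 4, "dp_group": "moe-dp-0"},
  {"model_name": "...", "hardware": "...", "npu_mem": {...},
   "tp_size": 4, "ep_size": 4, "dp_group": "moe-dp-0"}
]
```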
### `placement` (optional)

Per-layer / per-block weight and KV-cache placement rules. See Examples → CXL extended memory for a worked example.

```json
"placement": {
  "default": {"weights": "npu", "kv_loc": "npu", "kv_evict_loc": "cpu"},
  "blocks": [
    {"blocks": "0-3", "weights": "cxl:0", "kv_loc": "npu", "kv_evict_loc": "cpu"}
  ],
  "layers": {
    "embedding": {"weights": "cxl:1", "kv_loc": "npu", "kv_evict_loc": "cpu"}
  }
}
```
| Sub-field | Type | Required | Description |
|---|---|---|---|
| `default` | object | ✓ | Catch-all rule for layers / blocks not matched by `blocks` or `layers` |
| `blocks` | array | optional | Per-decoder-block-range overrides |
| `layers` | object | optional | Per-named-layer overrides |
Each rule object has three string fields:

| Field | Allowed values | Description |
|---|---|---|
| `weights` | `npu` / `cpu` / `cxl:<id>` | Where this layer's weights live |
| `kv_loc` | `npu` / `cpu` / `cxl:<id>` | Where active KV blocks live (attention layers only) |
| `kv_evict_loc` | `npu` / `cpu` / `cxl:<id>` | Where evicted KV blocks spill |
`blocks` strings are dash-and-comma-separated ranges: `"0-3"`, `"4-7"`, `"8,9,10"`, `"11-23"`. Layer-name keys must match canonical layer names from the architecture YAML.
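
For instance, a hypothetical 24-block model could tier its decoder blocks across NPU, host, and CXL memory like this (a sketch assuming a top-level `cxl_mem` with at least two devices):

```json
"blocks": [
  {"blocks": "0-3", "weights": "npu", "kv_loc": "npu", "kv_evict_loc": "cpu"},
  {"blocks": "4-7", "weights": "cpu", "kv_loc": "npu", "kv_evict_loc": "cpu"},
  {"blocks": "8,9,10", "weights": "cxl:0", "kv_loc": "npu", "kv_evict_loc": "cxl:0"},
  {"blocks": "11-23", "weights": "cxl:1", "kv_loc": "npu", "kv_evict_loc": "cxl:1"}
]
```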
## Validation rules

- `num_nodes == len(nodes)` and per-node `num_instances == len(instances)`.
- Per-instance `weight_per_gpu * num_npus <= npu_mem.mem_size * num_npus` (otherwise startup OOM).
- Hardware folder must exist at `profiler/perf/<hardware>/<model_name>/<variant>/tp<tp_size>/`.
- `dp_group` must be a valid string or `null`.
- All instances within the same `dp_group` must share the same `ep_size` and `tp_size`.
## What's next

- Model config: schema for the file `model_name` resolves to.
- PIM config: schema for the file `cpu_mem.pim_config` resolves to.