Coding conventions

A short checklist. Skim before opening a PR. None of these are arbitrary; each has bitten the project at least once.

Python style

4-space indentation, snake_case for functions/variables, PascalCase for classes. Match surrounding code in the file you're editing.
No enforced formatter. Don't run black / ruff format on a whole file unless you're rewriting it. Style noise hides real diffs.
Imports: keep minimal and consistent. serving/ modules use relative imports (from .scheduler import …).
English only in code, comments, log messages, and docstrings. Korean / other-language identifiers and comments will be flagged in review.
Docstrings: optional. If you write one, make it a single line that explains why the function exists, not what it does. The signature already says what.
No top-level prints. Use serving/core/logger.py (already imported as logger in most files):
```
logger.info(...)
logger.warning(...)
logger.success(...)   # Rich-styled green check
```

CLI flag conventions

CLI flags use hyphens: --cluster-config, --max-num-seqs, --enable-prefix-caching.
Internal Python uses underscores: cluster_config, max_num_seqs, enable_prefix_caching.

Boolean flags use BooleanOptionalAction so both --enable-X and --no-enable-X work:

parser.add_argument('--enable-prefix-caching',
                    action=argparse.BooleanOptionalAction,
                    default=True)

Match vLLM naming where applicable (--max-num-batched-tokens, --block-size, --kv-cache-dtype). Users coming from vLLM should not have to relearn.

File and config naming

JSON config filenames: descriptive snake_case (single_node_pim_instance.json, not singleNodePimInstance.json).
One config = one scenario. Don't reuse the same cluster JSON across unrelated examples; copy it.
Don't commit machine-specific paths. All paths in code and configs must be relative to the repo root.

Things to never do

These each correspond to a real incident or strong project preference:

Don't add getattr(request, 'attr', default) fallbacks for Request attributes. Initialize all attributes in Request.__init__ and access directly. Fallbacks hide initialization bugs.

Don't assume hidden_size == num_heads * head_dim. Some models (Qwen3) violate this. Always:

head_dim = config.get('head_dim', n_embd // n_head)
q_dim   = n_head * head_dim         # NOT n_embd
kv_dim  = kv_head * head_dim        # NOT n_embd // group

Don't invent layer names. Every name the simulator emits must also appear in the architecture YAML's catalog. Canonical set: qkv_proj, o_proj, gate_up_proj, act_fn, down_proj, rotary_emb, qk_norm, attention, layernorm, final_layernorm, embedding, lm_head, sampler, moe.
Don't edit astra-sim/ unless the change targets simulator integration (Chakra converter, Workload.cc, input configs). Most contributions never touch this directory.
Don't manually edit astra-sim/inputs/*.json. Those files are regenerated by config_builder.py on every run; your edits will be silently overwritten.
Don't commit large generated files. Trace files, outputs/*.csv from your local runs, .et protobufs, and profiler bundle CSVs that exceed the gitignore patterns should stay local. The gitignore is set up; just don't git add -A.
Don't use --no-verify to bypass pre-commit hooks. If a hook fails, fix the underlying issue.
Don't add error handling for cases that can't happen. Trust internal invariants; only validate at the boundaries (CLI args, JSON config load, dataset parsing). Defensive programming inside scheduler.py makes the file unreadable.
Don't add features beyond the task at hand. A bug fix doesn't need surrounding cleanup. Three similar lines is better than a premature abstraction.
Don't add comments explaining what the code does. The identifier names already do that. Comments are reserved for why something non-obvious is the way it is (a hidden invariant, a bug workaround, a citation to a paper).

Layer-name and unit reminders

These two trip up new contributors most often:

Profiler CSVs store microseconds (time_us column). The simulator multiplies by 1000 and rounds to nanoseconds at load time. Don't divide twice.
Communication sizes for ASTRA-Sim are total (not per-NPU) bytes. ASTRA-Sim divides by ring size internally. If you pass per-NPU sizes, every collective will be N times too small.

Trace-format invariants

If you touch trace_generator.py or graph_generator.py:

The first layer's input_loc and the last layer's output_loc must be REMOTE:{node_id}. The Chakra converter emits a MEM_LOAD from the first and a MEM_STORE from the last; if either is LOCAL without local memory configured, ASTRA-Sim crashes.
The sampler's output_loc is what feeds the MEM_STORE. Don't put it on lm_head.

Commit and PR style

The short version (full process is on PR workflow):

Commit messages: short imperative one-liner.
- Good: Fix incorrect evict_size accumulation, Add Qwen3 model support.
- Bad: fixes, update scheduler.py, WIP.
One logical change per commit. Don't bundle a refactor with a feature.
PR description includes the validation command you ran, so the reviewer can rerun it.

What's next

Validating your changes: how to prove the change actually works.
PR workflow: branch model, attribution, review expectations.

Python style​

CLI flag conventions​

File and config naming​

Things to never do​

Layer-name and unit reminders​

Trace-format invariants​

Commit and PR style​

What's next​