Skip to main content

Coding conventions

A short checklist. Skim before opening a PR. None of these are arbitrary; each has bitten the project at least once.

Python style

  • 4-space indentation, snake_case for functions/variables, PascalCase for classes. Match surrounding code in the file you're editing.
  • No enforced formatter. Don't run black / ruff format on a whole file unless you're rewriting it. Style noise hides real diffs.
  • Imports: keep minimal and consistent. serving/ modules use relative imports (from .scheduler import …).
  • English only in code, comments, log messages, and docstrings. Korean / other-language identifiers and comments will be flagged in review.
  • Docstrings: optional. If you write one, make it a single line that explains why the function exists, not what it does. The signature already says what.
  • No top-level prints. Use serving/core/logger.py (already imported as logger in most files):
    logger.info(...)
    logger.warning(...)
    logger.success(...) # Rich-styled green check

CLI flag conventions

  • CLI flags use hyphens: --cluster-config, --max-num-seqs, --enable-prefix-caching.
  • Internal Python uses underscores: cluster_config, max_num_seqs, enable_prefix_caching.
  • Boolean flags use BooleanOptionalAction so both --enable-X and --no-enable-X work:
    parser.add_argument('--enable-prefix-caching',
    action=argparse.BooleanOptionalAction,
    default=True)
  • Match vLLM naming where applicable (--max-num-batched-tokens, --block-size, --kv-cache-dtype). Users coming from vLLM should not have to relearn.

File and config naming

  • JSON config filenames: descriptive snake_case (single_node_pim_instance.json, not singleNodePimInstance.json).
  • One config = one scenario. Don't reuse the same cluster JSON across unrelated examples; copy it.
  • Don't commit machine-specific paths. All paths in code and configs must be relative to the repo root.

Things to never do

These each correspond to a real incident or strong project preference:

  1. Don't add getattr(request, 'attr', default) fallbacks for Request attributes. Initialize all attributes in Request.__init__ and access directly. Fallbacks hide initialization bugs.

  2. Don't assume hidden_size == num_heads * head_dim. Some models (Qwen3) violate this. Always:

    head_dim = config.get('head_dim', n_embd // n_head)
    q_dim = n_head * head_dim # NOT n_embd
    kv_dim = kv_head * head_dim # NOT n_embd // group
  3. Don't invent layer names. Every name the simulator emits must also appear in the architecture YAML's catalog. Canonical set: qkv_proj, o_proj, gate_up_proj, act_fn, down_proj, rotary_emb, qk_norm, attention, layernorm, final_layernorm, embedding, lm_head, sampler, moe.

  4. Don't edit astra-sim/ unless the change targets simulator integration (Chakra converter, Workload.cc, input configs). Most contributions never touch this directory.

  5. Don't manually edit astra-sim/inputs/*.json. Those files are regenerated by config_builder.py on every run; your edits will be silently overwritten.

  6. Don't commit large generated files. Trace files, outputs/*.csv from your local runs, .et protobufs, and profiler bundle CSVs that exceed the gitignore patterns should stay local. The gitignore is set up; just don't git add -A.

  7. Don't use --no-verify to bypass pre-commit hooks. If a hook fails, fix the underlying issue.

  8. Don't add error handling for cases that can't happen. Trust internal invariants; only validate at the boundaries (CLI args, JSON config load, dataset parsing). Defensive programming inside scheduler.py makes the file unreadable.

  9. Don't add features beyond the task at hand. A bug fix doesn't need surrounding cleanup. Three similar lines is better than a premature abstraction.

  10. Don't add comments explaining what the code does. The identifier names already do that. Comments are reserved for why something non-obvious is the way it is (a hidden invariant, a bug workaround, a citation to a paper).

Layer-name and unit reminders

These two trip up new contributors most often:

  • Profiler CSVs store microseconds (time_us column). The simulator multiplies by 1000 and rounds to nanoseconds at load time. Don't divide twice.
  • Communication sizes for ASTRA-Sim are total (not per-NPU) bytes. ASTRA-Sim divides by ring size internally. If you pass per-NPU sizes, every collective will be N times too small.

Trace-format invariants

If you touch trace_generator.py or graph_generator.py:

  • The first layer's input_loc and the last layer's output_loc must be REMOTE:{node_id}. The Chakra converter emits a MEM_LOAD from the first and a MEM_STORE from the last; if either is LOCAL without local memory configured, ASTRA-Sim crashes.
  • The sampler's output_loc is what feeds the MEM_STORE. Don't put it on lm_head.

Commit and PR style

The short version (full process is on PR workflow):

  • Commit messages: short imperative one-liner.
    • Good: Fix incorrect evict_size accumulation, Add Qwen3 model support.
    • Bad: fixes, update scheduler.py, WIP.
  • One logical change per commit. Don't bundle a refactor with a feature.
  • PR description includes the validation command you ran, so the reviewer can rerun it.

What's next