Test Systems

Reviewing AI suggestions

When the dev panel finds a missing or invalid case-meta field, you can ask GridArena to draft a fix. The suggestion flow is read-only by default — nothing is written to the merged CASE_META until you explicitly accept it — and every accept or revert is recorded in an append-only audit trail (see the validation feedback loop for the end-to-end picture).

1. Open the suggestion drawer

Click Suggest fix on any issue row. The right-hand drawer calls suggestCaseMetaFix, which sends the case key, field, and current value to the configured LLM and returns a structured proposal: the new value, the model that produced it, and a short rationale. A spinner appears while the request is in flight; failures show an inline toast and leave the panel unchanged.
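As a rough illustration of the "structured proposal" described above, the sketch below validates an untrusted response before the drawer renders it. The interface and the parseProposal helper are hypothetical; only the three pieces of information (new value, model, rationale) come from the text, and the actual return type of suggestCaseMetaFix may differ.

```typescript
// Hypothetical shape: field names are assumptions, not GridArena's actual types.
interface CaseMetaFixProposal {
  newValue: unknown; // proposed replacement value
  model: string;     // identifier of the model that produced it
  rationale: string; // short explanation shown in the drawer
}

// Narrow an untrusted response to a proposal, or return null so the caller
// can show the inline toast and leave the panel unchanged.
function parseProposal(raw: unknown): CaseMetaFixProposal | null {
  if (typeof raw !== "object" || raw === null) return null;
  const r = raw as Record<string, unknown>;
  if (!("newValue" in r)) return null;
  if (typeof r.model !== "string" || typeof r.rationale !== "string") return null;
  return { newValue: r.newValue, model: r.model, rationale: r.rationale };
}

const valid = parseProposal({ newValue: 42, model: "example-model", rationale: "Seed must be an integer." });
const invalid = parseProposal({ model: "example-model" }); // null: newValue and rationale are missing
```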

2. Inspect the diff

The drawer renders the previous value next to the proposed value so you can eyeball the change before committing. Use this to catch hallucinated source_url values, wrong last_reviewed dates, or rationales that don't match the field type. If the suggestion is wrong, just close the drawer — nothing is persisted.

3. Accept or revert

Accept writes a row to case_meta_overrides and immediately re-merges the override into CASE_META so the validator re-runs. Accepted overrides appear in the Overrides table directly under the dev panel. Each row has a Revert action that deletes the override and restores the original value — the UI updates optimistically and rolls back if the server rejects the request.
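One way to picture the re-merge step is as a pure function that layers override rows on top of the built-in metadata. This is a minimal sketch under assumed types (CaseMeta, OverrideRow, and mergeOverrides are hypothetical names); the real CASE_META and case_meta_overrides schemas may look different.

```typescript
// Hypothetical types: caseKey -> field -> value.
type CaseMeta = Record<string, Record<string, unknown>>;

interface OverrideRow {
  caseKey: string;
  field: string;
  value: unknown;
}

// Return a new merged view with each override applied on top of the base.
// The base object is never mutated, so a revert is just a re-merge without the row.
function mergeOverrides(base: CaseMeta, overrides: OverrideRow[]): CaseMeta {
  const merged: CaseMeta = {};
  for (const [caseKey, fields] of Object.entries(base)) {
    merged[caseKey] = { ...fields };
  }
  for (const o of overrides) {
    merged[o.caseKey] = { ...(merged[o.caseKey] ?? {}), [o.field]: o.value };
  }
  return merged;
}

const baseMeta: CaseMeta = { case14: { dataset_version: "gridarena-case14@1.0.0" } };
const accepted: OverrideRow[] = [{ caseKey: "case14", field: "random_seed", value: 42 }];

const afterAccept = mergeOverrides(baseMeta, accepted); // random_seed is now 42
const afterRevert = mergeOverrides(baseMeta, []);       // override gone, original restored
```

Keeping the merge free of side effects also makes the optimistic UI update easy to roll back: the previous merged view can simply be restored if the server rejects the request.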

4. Bulk actions

For sweeps across many cases, tick the row checkboxes in the Overrides table and use Bulk revert to remove a batch in a single database round-trip. The audit trail records one entry per affected override so the history stays granular even when the action was bulk.

5. Audit trail

Every accept and revert is appended to case_meta_override_audit with the actor, timestamp, action, previous value, new value, and, when the action originated from a suggestion, the AI model and its rationale. The Audit history panel below the Overrides table renders the log chronologically so you can answer "who changed this field, when, and why" long after the fact. Audit rows are scoped to the user through row-level security (RLS) and cannot be edited or deleted.
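The sketch below shows how a bulk revert could still yield one audit entry per override. The AuditEntry and OverrideSnapshot shapes and the auditEntriesForBulkRevert helper are assumptions for illustration; in particular, caseKey and field are added so each entry can be traced, and the real column names may differ.

```typescript
// Hypothetical row shape based on the fields listed above.
interface AuditEntry {
  actor: string;
  timestamp: string; // ISO 8601
  action: "accept" | "revert";
  caseKey: string;
  field: string;
  previousValue: unknown;
  newValue: unknown;
  aiModel?: string;     // only set when the action came from a suggestion
  aiRationale?: string; // only set when the action came from a suggestion
}

interface OverrideSnapshot {
  caseKey: string;
  field: string;
  value: unknown;    // the override currently in effect
  original: unknown; // the value restored by a revert
}

// Build one audit entry per reverted override so the history stays granular.
function auditEntriesForBulkRevert(actor: string, overrides: OverrideSnapshot[], now: Date): AuditEntry[] {
  const timestamp = now.toISOString();
  return overrides.map((o) => ({
    actor,
    timestamp,
    action: "revert" as const,
    caseKey: o.caseKey,
    field: o.field,
    previousValue: o.value,
    newValue: o.original,
  }));
}

const entries = auditEntriesForBulkRevert(
  "alice",
  [
    { caseKey: "case14", field: "random_seed", value: 42, original: undefined },
    { caseKey: "case5", field: "prompt_version", value: "v2", original: undefined },
  ],
  new Date("2026-04-28T00:00:00Z"),
);
```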

Exporting invalid items

The dev panel above lists every case-meta validation issue detected at build time. Two scope-aware dropdowns at the top of the panel — Export CSV and Export JSON — let you download the currently filtered list of invalid items for offline review or for attaching to a bug report.

Filter and sort awareness

Exports follow the panel's active filter and the on-screen sort order exactly. A display_order column is included in every row so you can trace any line in the export back to its position in the panel, even after re-sorting or re-filtering.

Choosing a scope

Each dropdown offers three scopes. Counts shown next to each option are live, and an option is disabled when its bucket is empty:

  • All — every issue in the current filter.
  • Errors only — rows where a required field is missing.
  • Warnings only — rows where a field is present but its format is invalid.

The severity column reflects this distinction: error for missing fields, warning for invalid formats.
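The three scopes and their live counts can be sketched as a simple filter over the issue list. The Scope identifiers and function names below are hypothetical; only the error/warning distinction comes from the text above.

```typescript
type Severity = "error" | "warning";
type Scope = "all" | "errors" | "warnings"; // hypothetical scope identifiers

interface Issue {
  case_key: string;
  severity: Severity;
  field: string;
  message: string;
}

function filterByScope(issues: Issue[], scope: Scope): Issue[] {
  if (scope === "errors") return issues.filter((i) => i.severity === "error");
  if (scope === "warnings") return issues.filter((i) => i.severity === "warning");
  return issues;
}

// Live count per scope; a dropdown option would be disabled when its count is 0.
function scopeCounts(issues: Issue[]): Record<Scope, number> {
  return {
    all: issues.length,
    errors: filterByScope(issues, "errors").length,
    warnings: filterByScope(issues, "warnings").length,
  };
}

const sampleIssues: Issue[] = [
  { case_key: "case14", severity: "error", field: "prompt_version", message: "" },
  { case_key: "case14", severity: "warning", field: "random_seed", message: "random_seed must be an integer" },
];
const counts = scopeCounts(sampleIssues); // { all: 2, errors: 1, warnings: 1 }
```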

Preview before download

Choosing a scope opens a preview modal instead of downloading immediately:

  • CSV preview — a table of up to the first 50 rows showing case_key, severity, field, and message, plus the total row count and the active filter.
  • JSON preview — the full payload rendered in a code block so you can inspect the exact structure before saving.

Both modals expose Download and Cancel buttons, and Esc dismisses the preview without downloading.

CSV column schema

Column         Type               Description
display_order  integer            Position in the on-screen list (preserves the current sort).
case_key       string             Identifier of the case the issue belongs to.
severity       error | warning    error = missing field, warning = invalid format.
problem_type   missing | invalid  Same distinction in machine-friendly form.
field          string             Name of the offending meta field.
message        string             Human-readable validation message (empty for missing).

JSON shape

The JSON export mirrors the CSV: an array of objects with the same six keys per row, in the same order as the on-screen list.

Worked examples

CSV:

display_order,case_key,severity,problem_type,field,message
1,case14,error,missing,prompt_version,
2,case14,warning,invalid,random_seed,"random_seed must be an integer"

JSON:

[
  { "display_order": 1, "case_key": "case14", "severity": "error",
    "problem_type": "missing", "field": "prompt_version", "message": "" },
  { "display_order": 2, "case_key": "case14", "severity": "warning",
    "problem_type": "invalid", "field": "random_seed",
    "message": "random_seed must be an integer" }
]
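A minimal serializer that reproduces the CSV example above might look like the following. It is a sketch, not the exporter itself: toCsv and quote are hypothetical names, and the choice to quote every non-empty message (and to leave the other columns unquoted) is inferred from the worked example rather than documented.

```typescript
interface ExportRow {
  display_order: number;
  case_key: string;
  severity: "error" | "warning";
  problem_type: "missing" | "invalid";
  field: string;
  message: string;
}

const COLUMNS = ["display_order", "case_key", "severity", "problem_type", "field", "message"] as const;

// Wrap a cell in double quotes and escape embedded quotes by doubling them.
function quote(text: string): string {
  return `"${text.replace(/"/g, '""')}"`;
}

// Assumption: case_key and field are plain identifiers that never need quoting;
// message is quoted whenever it is non-empty, as in the example above.
function toCsv(rows: ExportRow[]): string {
  const lines = [COLUMNS.join(",")];
  for (const r of rows) {
    const message = r.message === "" ? "" : quote(r.message);
    lines.push([r.display_order, r.case_key, r.severity, r.problem_type, r.field, message].join(","));
  }
  return lines.join("\n");
}

const csv = toCsv([
  { display_order: 1, case_key: "case14", severity: "error", problem_type: "missing", field: "prompt_version", message: "" },
  { display_order: 2, case_key: "case14", severity: "warning", problem_type: "invalid", field: "random_seed", message: "random_seed must be an integer" },
]);
```

The JSON export needs no custom code under the same assumptions: JSON.stringify(rows, null, 2) on the same array yields the six keys per row in declaration order.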

GridArena ships three built-in IEEE-style transmission benchmarks: case5, case14, and case30. They are simplified, deterministic versions of well-known reference networks, embedded directly in the codebase (src/server/simulation/cases.ts) so that the in-Worker DC power flow and the optional PyPSA microservice produce reproducible results without external downloads.

The tables below are generated at build time from the actual case definitions, so what you see here is exactly what the solver runs.

Engine assumptions

Both solvers use DC power flow:

  • Lossless network (line resistance ignored).
  • Flat 1.0 pu voltage magnitudes, so the list of voltage-magnitude violations is always empty.
  • Small-angle approximation (sin θ ≈ θ).
  • Generator dispatch is greedy merit-order by marginal cost to cover total load before each PF run.

These assumptions make results meaningful for line loading, thermal violations, and redispatch, but not for reactive power, voltage collapse, or losses.
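Two of the assumptions above can be made concrete with a short sketch: greedy merit-order dispatch, and the DC line-flow relation that follows from flat voltages and the small-angle approximation. The Gen shape, the marginalCost values, and both function names are invented for illustration; the code in src/server/simulation/dc-powerflow.ts may be organized quite differently.

```typescript
// Hypothetical generator shape; marginalCost is an invented field for illustration.
interface Gen {
  pMinMw: number;
  pMaxMw: number;
  marginalCost: number;
}

// Greedy merit order: start every unit at its minimum, then load the cheapest
// units first until total load is covered. Output follows the input order.
function meritOrderDispatch(gens: Gen[], totalLoadMw: number): number[] {
  const dispatch = gens.map((g) => g.pMinMw);
  let remaining = totalLoadMw - dispatch.reduce((sum, p) => sum + p, 0);
  const cheapestFirst = gens
    .map((g, i) => ({ i, cost: g.marginalCost, headroom: g.pMaxMw - g.pMinMw }))
    .sort((a, b) => a.cost - b.cost);
  for (const unit of cheapestFirst) {
    if (remaining <= 0) break;
    const add = Math.min(unit.headroom, remaining);
    dispatch[unit.i] += add;
    remaining -= add;
  }
  if (remaining > 1e-9) throw new Error("total load exceeds generation capacity");
  return dispatch;
}

// DC line flow in MW: with flat voltages and sin θ ≈ θ, the flow depends only
// on the angle difference (radians) and the line reactance (per unit).
function dcLineFlowMw(thetaFrom: number, thetaTo: number, xPu: number, baseMva: number): number {
  return ((thetaFrom - thetaTo) / xPu) * baseMva;
}

const dispatch = meritOrderDispatch(
  [
    { pMinMw: 0, pMaxMw: 100, marginalCost: 20 },
    { pMinMw: 0, pMaxMw: 50, marginalCost: 10 },
    { pMinMw: 0, pMaxMw: 80, marginalCost: 30 },
  ],
  120,
); // [70, 50, 0]: the cheapest unit is filled first, the next covers the rest

const flowMw = dcLineFlowMw(0.02, 0.0, 0.05, 100); // about 40 MW
```

Comparing the absolute flow against a branch's Rating (MW) is then enough to flag a thermal violation, which is why these assumptions still support line-loading studies.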

case5 — 5-bus system

A small 5-bus system commonly used for teaching LMP and congestion. Roughly based on the PJM 5-bus educational example. Useful as a sanity check: the topology is small enough to reason about by hand.

Dataset version: gridarena-case5@1.0.0
Last reviewed: 2026-04-28
Prompt version: not set
Random seed: not set
Standardized:
  • Topology (5 buses, 6 branches) matches the canonical PJM 5-bus.
  • Bus types (slack/PV/PQ) follow the original classification.
  • Line reactances (x_pu) preserved from the reference dataset.
Simplified:
  • Resistances and shunts dropped (DC power flow only).
  • Generator cost curves replaced by a flat merit-order ranking.
  • Voltage limits not enforced (flat 1.0 pu assumption).
Buses: 5 (slack 1 · PV 2 · PQ 2)
Branches: 6
Generators: 3
Base MVA: 100
Total load: 1000 MW
Initial dispatch: 1000 MW
Total gen capacity: 1170 MW
Reserve margin: 17%
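The reserve margin figures on this page are consistent with spare capacity expressed as a percentage of total load and rounded to the nearest integer. The helper below is a sketch of that arithmetic; the function name and the rounding rule are assumptions inferred from the three case summaries, not taken from the build script.

```typescript
// Reserve margin as a percentage of total load, rounded to the nearest integer.
function reserveMarginPct(totalCapacityMw: number, totalLoadMw: number): number {
  return Math.round(((totalCapacityMw - totalLoadMw) / totalLoadMw) * 100);
}

const case5Margin = reserveMarginPct(1170, 1000); // 17
const case14Margin = reserveMarginPct(772, 259);  // 198
const case30Margin = reserveMarginPct(465, 283.4); // 64
```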

Topology

[Figure: case5 topology diagram, buses 0 to 4. Legend: Slack, PV, PQ, has generator; marker size = load (MW).]

Buses

#  Type   Pd (MW)  Vm (pu)
0  slack  0        1
1  pv     300      1
2  pq     300      1
3  pq     400      1
4  pv     0        1

Branches

#  From  To  x (pu)   Rating (MW)
0  0     1   0.02810  400
1  0     3   0.03040  400
2  1     2   0.00640  400
3  2     3   0.01080  240
4  3     4   0.02970  240
5  1     4   0.02970  240

Generators

#  Bus  P (MW)  P_min (MW)  P_max (MW)
0  0    200     0           400
1  1    200     0           170
2  4    600     0           600

case14 — IEEE 14-bus

Derived from the classic IEEE 14-bus test case, which represents a portion of the American Electric Power (AEP) system in the US Midwest as of February 1962. The most widely cited small-scale benchmark in power-flow literature.

Dataset version: gridarena-case14@1.0.0
Last reviewed: 2026-04-28
Prompt version: not set
Random seed: not set
Standardized:
  • 14 buses, 20 branches, 5 generators per the IEEE reference.
  • Bus loads (Pd) match published values.
  • Per-line thermal ratings preserved (50–200 MW).
Simplified:
  • Transformer tap ratios collapsed into plain reactances.
  • Reactive load (Qd) and bus shunts ignored under DC-PF.
  • Generator Q-limits and voltage setpoints omitted.
Buses: 14 (slack 1 · PV 4 · PQ 9)
Branches: 20
Generators: 5
Base MVA: 100
Total load: 259.0 MW
Initial dispatch: 272 MW
Total gen capacity: 772 MW
Reserve margin: 198%

Topology

[Figure: case14 topology diagram, buses 0 to 13. Legend: Slack, PV, PQ, has generator; marker size = load (MW).]

Buses

#   Type   Pd (MW)  Vm (pu)
0   slack  0        1
1   pv     21.70    1
2   pv     94.20    1
3   pq     47.80    1
4   pq     7.60     1
5   pv     11.20    1
6   pq     0        1
7   pv     0        1
8   pq     29.50    1
9   pq     9        1
10  pq     3.50     1
11  pq     6.10     1
12  pq     13.50    1
13  pq     14.90    1

Branches

#   From  To  x (pu)   Rating (MW)
0   0     1   0.05917  200
1   0     4   0.22304  200
2   1     2   0.19797  100
3   1     3   0.17632  100
4   1     4   0.17388  100
5   2     3   0.17103  100
6   3     4   0.04211  100
7   3     6   0.20912  100
8   3     8   0.55618  100
9   4     5   0.25202  100
10  5     10  0.19890  50
11  5     11  0.25581  50
12  5     12  0.13027  50
13  6     7   0.17615  100
14  6     8   0.11001  100
15  8     9   0.08450  50
16  8     13  0.27038  50
17  9     10  0.19207  50
18  11    12  0.19988  50
19  12    13  0.34802  50

Generators

#  Bus  P (MW)  P_min (MW)  P_max (MW)
0  0    232     0           332
1  1    40      0           140
2  2    0       0           100
3  5    0       0           100
4  7    0       0           100

case30 — IEEE 30-bus

Derived from the IEEE 30-bus test case, also based on the AEP system (December 1961). Standard benchmark for contingency / N-1 analysis with enough topology to exercise meaningful re-routing under line outages.

Dataset version: gridarena-case30@1.0.0
Last reviewed: 2026-04-28
Prompt version: not set
Random seed: not set
Standardized:
  • 30 buses, 41 branches, 6 generators per the IEEE reference.
  • Bus loads (Pd) match published values.
  • Branch reactances preserved from the standard dataset.
Simplified:
  • Uniform 130 MW thermal rating applied to every branch.
  • Transformers, shunts, and reactive elements dropped.
  • Generator cost curves replaced by greedy merit order.
Buses: 30 (slack 1 · PV 5 · PQ 24)
Branches: 41
Generators: 6
Base MVA: 100
Total load: 283.4 MW
Initial dispatch: 289.4 MW
Total gen capacity: 465 MW
Reserve margin: 64%

Topology

[Figure: case30 topology diagram, buses 0 to 29. Legend: Slack, PV, PQ, has generator; marker size = load (MW).]

Buses

#   Type   Pd (MW)  Vm (pu)
0   slack  21.70    1
1   pv     2.40     1
2   pq     7.60     1
3   pq     0        1
4   pq     94.20    1
5   pq     0        1
6   pq     22.80    1
7   pq     30       1
8   pq     0        1
9   pq     5.80     1
10  pq     0        1
11  pq     11.20    1
12  pv     0        1
13  pq     6.20     1
14  pq     8.20     1
15  pq     3.50     1
16  pq     9        1
17  pq     3.20     1
18  pq     9.50     1
19  pq     2.20     1
20  pq     17.50    1
21  pv     0        1
22  pv     3.20     1
23  pq     8.70     1
24  pq     0        1
25  pq     3.50     1
26  pv     0        1
27  pq     0        1
28  pq     2.40     1
29  pq     10.60    1

Branches

#   From  To  x (pu)   Rating (MW)
0   0     1   0.05750  130
1   0     2   0.16520  130
2   1     3   0.17370  130
3   2     3   0.03790  130
4   1     4   0.19830  130
5   1     5   0.17630  130
6   3     5   0.04140  130
7   4     6   0.11600  130
8   5     6   0.08200  130
9   5     7   0.04200  130
10  5     8   0.20800  130
11  5     9   0.55600  130
12  8     10  0.20800  130
13  8     9   0.11000  130
14  3     11  0.25600  130
15  11    12  0.14000  130
16  11    13  0.25590  130
17  11    14  0.13040  130
18  11    15  0.19870  130
19  13    14  0.19970  130
20  15    16  0.19230  130
21  14    17  0.21850  130
22  17    18  0.12920  130
23  18    19  0.06800  130
24  9     19  0.20900  130
25  9     16  0.08450  130
26  9     20  0.07490  130
27  9     21  0.14990  130
28  20    21  0.02360  130
29  14    22  0.20200  130
30  21    23  0.17900  130
31  22    23  0.27000  130
32  23    24  0.32920  130
33  24    25  0.38000  130
34  24    26  0.20870  130
35  27    26  0.03960  130
36  26    28  0.41530  130
37  26    29  0.60270  130
38  28    29  0.45330  130
39  7     27  0.20000  130
40  5     27  0.05990  130

Generators

#  Bus  P (MW)  P_min (MW)  P_max (MW)
0  0    138.6   0           200
1  1    57.6    0           80
2  12   0       0           50
3  21   24.6    0           50
4  22   21.6    0           30
5  26   47      0           55

Supported actions

The solvers accept these structured actions on any case:

  • scale_all_loads — multiply every load by value.
  • set_generator_p_mw — set generator at target_index to value MW.
  • line_outage — remove branch at target_index.
  • shed_load — shed value MW total, scaled across loads.

Supported perturbations

For sensitivity / robustness analysis (batch perturbation jobs):

  • load_scale / load_increase — scale every load by parameter_value.
  • line_rating_decrease — scale every line s_nom by parameter_value.
  • generator_outage — remove generator at index parameter_value.
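A batch perturbation job can be thought of as applying one of these transforms per parameter value and re-running the solver. The sketch below covers only the transform step; PerturbableCase, applyPerturbation, and the array-based representation are hypothetical, and only the perturbation names and the role of parameter_value come from the list above.

```typescript
// Hypothetical case shape for illustration.
interface PerturbableCase {
  loadsMw: number[];
  lineRatingsMw: number[];
  generatorPMaxMw: number[];
}

type PerturbationType = "load_scale" | "load_increase" | "line_rating_decrease" | "generator_outage";

// Return a perturbed copy of the case; the input is left untouched.
function applyPerturbation(c: PerturbableCase, type: PerturbationType, parameterValue: number): PerturbableCase {
  switch (type) {
    case "load_scale":
    case "load_increase":
      return { ...c, loadsMw: c.loadsMw.map((p) => p * parameterValue) };
    case "line_rating_decrease":
      return { ...c, lineRatingsMw: c.lineRatingsMw.map((r) => r * parameterValue) };
    case "generator_outage":
      // parameterValue is read as a generator index here.
      return { ...c, generatorPMaxMw: c.generatorPMaxMw.filter((_, i) => i !== parameterValue) };
  }
}

// Ratings, loads, and P_max values taken from the case5 tables on this page.
const case5Data: PerturbableCase = {
  loadsMw: [0, 300, 300, 400, 0],
  lineRatingsMw: [400, 400, 400, 240, 240, 240],
  generatorPMaxMw: [400, 170, 600],
};

const derated = applyPerturbation(case5Data, "line_rating_decrease", 0.8); // ratings scaled to 80%
const withoutGen1 = applyPerturbation(case5Data, "generator_outage", 1);   // P_max list becomes [400, 600]
```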

References

  • MATPOWER — canonical .m case archive (case5.m, case14.m, case30.m).
  • PyPSA documentation — engine used by the optional simulation microservice.
  • Illinois Center for a Smarter Electric Grid — one-line diagrams and history of the IEEE 14 / 30 systems.
  • In-repo: src/server/simulation/cases.ts, src/server/simulation/dc-powerflow.ts, simulation-service/README.md.