Test Systems

Reviewing AI suggestions

When the dev panel finds a missing or invalid case-meta field, you can ask GridArena to draft a fix. The suggestion flow is read-only by default — nothing is written to the merged CASE_META until you explicitly accept it — and every accept or revert is recorded in an append-only audit trail (see the validation feedback loop for the end-to-end picture).

1. Open the suggestion drawer

Click Suggest fix on any issue row. The right-hand drawer calls suggestCaseMetaFix, which sends the case key, field, and current value to the configured LLM and returns a structured proposal: the new value, the model that produced it, and a short rationale. A spinner appears while the request is in flight; failures show an inline toast and leave the panel unchanged.
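As a rough illustration of what a "structured proposal" might look like on the client, the sketch below validates an untrusted response before the drawer renders it. The interface, its field names, and the parseSuggestion helper are assumptions made for this example; they are not taken from the GridArena source.

```typescript
// Hypothetical shape of a suggestion; all field names are assumptions.
interface CaseMetaSuggestion {
  caseKey: string;
  field: string;
  previousValue: string | null;
  proposedValue: string;
  model: string;
  rationale: string;
}

// Narrow an untrusted response to CaseMetaSuggestion. Returns null on any
// malformed input so the caller can show a toast and leave the panel unchanged.
function parseSuggestion(raw: unknown): CaseMetaSuggestion | null {
  if (typeof raw !== "object" || raw === null) return null;
  const r = raw as Record<string, unknown>;

  // Read a required, non-empty string property.
  const text = (key: string): string | null => {
    const v = r[key];
    return typeof v === "string" && v.length > 0 ? v : null;
  };

  const caseKey = text("caseKey");
  const field = text("field");
  const proposedValue = text("proposedValue");
  const model = text("model");
  const rationale = text("rationale");
  if (
    caseKey === null || field === null || proposedValue === null ||
    model === null || rationale === null
  ) {
    return null;
  }

  // A missing field has no previous value, so null is allowed here.
  const prev = r["previousValue"];
  const previousValue = typeof prev === "string" ? prev : null;
  return { caseKey, field, previousValue, proposedValue, model, rationale };
}
```

Rejecting incomplete proposals up front keeps the diff view from ever rendering a suggestion without a model name or rationale.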

2. Inspect the diff

The drawer renders the previous value next to the proposed value so you can eyeball the change before committing. Use this to catch hallucinated source_url values, wrong last_reviewed dates, or rationales that don't match the field type. If the suggestion is wrong, just close the drawer — nothing is persisted.

3. Accept or revert

Accept writes a row to case_meta_overrides and immediately re-merges the override into CASE_META so the validator re-runs. Accepted overrides appear in the Overrides table directly under the dev panel. Each row has a Revert action that deletes the override and restores the original value — the UI updates optimistically and rolls back if the server rejects the request.
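The accept and revert behavior can be pictured as a pure merge over the base metadata. The sketch below is a minimal illustration under assumed types (CaseMeta, Override, and mergeOverrides are hypothetical names; the real case_meta_overrides schema may differ). Because the base map is never mutated, dropping an override and re-merging restores the original value.

```typescript
// Hypothetical types; the real CASE_META and override schemas may differ.
type CaseMeta = Record<string, Record<string, string | undefined>>;

interface Override {
  caseKey: string;
  field: string;
  value: string;
}

// Build a fresh merged map: copy every base entry, then layer overrides on top.
function mergeOverrides(base: CaseMeta, overrides: Override[]): CaseMeta {
  const merged: CaseMeta = {};
  for (const caseKey of Object.keys(base)) {
    merged[caseKey] = { ...base[caseKey] };
  }
  for (const o of overrides) {
    const existing = merged[o.caseKey] ?? {};
    merged[o.caseKey] = { ...existing, [o.field]: o.value };
  }
  return merged;
}
```

Reverting is then just a matter of calling mergeOverrides again with the reverted row removed from the list.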

4. Bulk actions

For sweeps across many cases, tick the row checkboxes in the Overrides table and use Bulk revert to remove a batch in a single database round-trip. The audit trail records one entry per affected override so the history stays granular even when the action was bulk.
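One way to keep the history granular during a bulk revert is to derive one audit entry per selected row before issuing the single delete. The following sketch assumes hypothetical OverrideRow and AuditEntry shapes; only the listed audit fields (actor, timestamp, action, previous value, new value) come from this page.

```typescript
// Hypothetical row shapes; column names are assumptions for illustration.
interface OverrideRow {
  id: string;
  caseKey: string;
  field: string;
  originalValue: string | null;
  overrideValue: string;
}

interface AuditEntry {
  overrideId: string;
  actor: string;
  timestamp: string;
  action: "accept" | "revert";
  previousValue: string | null;
  newValue: string | null;
}

// Produce one "revert" entry per override, all sharing the same timestamp
// so the batch can still be recognized as a single user action.
function buildBulkRevertAudit(rows: OverrideRow[], actor: string, now: Date): AuditEntry[] {
  const timestamp = now.toISOString();
  return rows.map((row) => ({
    overrideId: row.id,
    actor,
    timestamp,
    action: "revert" as const,
    previousValue: row.overrideValue,
    newValue: row.originalValue,
  }));
}
```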

5. Audit trail

Every accept and revert is appended to case_meta_override_audit with the actor, timestamp, action, previous value, and new value, plus the AI model and rationale when the action originated from a suggestion. The Audit history panel below the Overrides table renders the chronological log so you can answer "who changed this field, when, and why" long after the fact. Audit rows are user-scoped via RLS and cannot be edited or deleted.

Exporting invalid items

The dev panel above lists every case-meta validation issue detected at build time. Two scope-aware dropdowns at the top of the panel — Export CSV and Export JSON — let you download the currently filtered list of invalid items for offline review or for attaching to a bug report.

Filter and sort awareness

Exports follow the panel's active filter and the on-screen sort order exactly. A display_order column is included in every row so you can trace any line in the export back to its position in the panel, even after re-sorting or re-filtering.
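A minimal sketch of how display_order could be assigned: apply the active filter, apply the on-screen comparator, then number the rows from 1. The Issue type and the buildExportRows helper are assumptions for illustration, not the panel's actual code.

```typescript
// Hypothetical issue shape, using the column names from the export schema.
interface Issue {
  case_key: string;
  severity: "error" | "warning";
  field: string;
  message: string;
}

interface OrderedIssue extends Issue {
  display_order: number;
}

// Filter first, then sort, then number: display_order always reflects the
// position the user currently sees, starting at 1.
function buildExportRows(
  issues: Issue[],
  filter: (issue: Issue) => boolean,
  compare: (a: Issue, b: Issue) => number,
): OrderedIssue[] {
  return issues
    .filter(filter) // filter returns a new array, so the sort below does not touch the input
    .sort(compare)
    .map((issue, index) => ({ ...issue, display_order: index + 1 }));
}
```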

Choosing a scope

Each dropdown offers three scopes. Counts shown next to each option are live, and an option is disabled when its bucket is empty:

All — every issue in the current filter.
Errors only — rows where a required field is missing.
Warnings only — rows where a field is present but its format is invalid.

The severity column reflects this distinction: error for missing fields, warning for invalid formats.
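The scope counts and disabled states follow directly from the missing/invalid distinction. The sketch below shows one way to derive them; the type and function names are hypothetical.

```typescript
type ProblemType = "missing" | "invalid";
type Severity = "error" | "warning";
type Scope = "all" | "errors" | "warnings";

interface ScopeOption {
  scope: Scope;
  count: number;
  disabled: boolean;
}

// Missing fields are errors; present-but-malformed fields are warnings.
function severityFor(problem: ProblemType): Severity {
  return problem === "missing" ? "error" : "warning";
}

// Count each bucket over the currently filtered issues and disable empty ones.
function scopeOptions(problems: ProblemType[]): ScopeOption[] {
  const errors = problems.filter((p) => severityFor(p) === "error").length;
  const warnings = problems.length - errors;
  const counts: Array<[Scope, number]> = [
    ["all", problems.length],
    ["errors", errors],
    ["warnings", warnings],
  ];
  return counts.map(([scope, count]) => ({ scope, count, disabled: count === 0 }));
}
```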

Preview before download

Choosing a scope opens a preview modal instead of downloading immediately:

CSV preview — a table of up to the first 50 rows showing case_key, severity, field, and message, plus the total row count and the active filter.
JSON preview — the full payload rendered in a code block so you can inspect the exact structure before saving.

Both modals expose Download and Cancel buttons, and Esc dismisses the preview without downloading.

CSV column schema

Column	Type	Description
`display_order`	integer	Position in the on-screen list (preserves the current sort).
`case_key`	string	Identifier of the case the issue belongs to.
`severity`	`error` | `warning`	`error` = missing field, `warning` = invalid format.
`problem_type`	`missing` | `invalid`	Same distinction in machine-friendly form.
`field`	string	Name of the offending meta field.
`message`	string	Human-readable validation message (empty for `missing`).

JSON shape

The JSON export mirrors the CSV: an array of objects with the same six keys per row, in the same order as the on-screen list.

Worked examples

CSV:

display_order,case_key,severity,problem_type,field,message
1,case14,error,missing,prompt_version,
2,case14,warning,invalid,random_seed,"random_seed must be an integer"

JSON:

[
  { "display_order": 1, "case_key": "case14", "severity": "error",
    "problem_type": "missing", "field": "prompt_version", "message": "" },
  { "display_order": 2, "case_key": "case14", "severity": "warning",
    "problem_type": "invalid", "field": "random_seed",
    "message": "random_seed must be an integer" }
]
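For readers who want to reproduce the CSV above from the JSON form, here is a minimal serializer sketch. It assumes that only the message column carries free text and that non-empty messages are always quoted (as in the example), with embedded double quotes doubled in the usual CSV way; toCsv and quoteText are hypothetical helpers, not the exporter's actual code.

```typescript
interface ExportRow {
  display_order: number;
  case_key: string;
  severity: "error" | "warning";
  problem_type: "missing" | "invalid";
  field: string;
  message: string;
}

const COLUMNS = ["display_order", "case_key", "severity", "problem_type", "field", "message"] as const;

// Quote a free-text cell: wrap it in double quotes and double any embedded
// quote. Empty cells stay empty, matching the "missing" rows in the example.
function quoteText(value: string): string {
  return value === "" ? "" : `"${value.replace(/"/g, '""')}"`;
}

// Assumption: case_key and field are identifier-like and never need quoting.
function toCsv(rows: ExportRow[]): string {
  const lines: string[] = [COLUMNS.join(",")];
  for (const r of rows) {
    lines.push(
      [String(r.display_order), r.case_key, r.severity, r.problem_type, r.field, quoteText(r.message)].join(","),
    );
  }
  return lines.join("\n");
}
```

Feeding the two JSON objects from the worked example through toCsv yields the three CSV lines shown above.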

GridArena ships three built-in IEEE-style transmission benchmarks — case5, case14, and case30. They are simplified, deterministic versions of well-known reference networks, embedded directly in the codebase (src/server/simulation/cases.ts) so the in-Worker DC power flow and the optional PyPSA microservice produce reproducible results without external downloads.

The tables below are generated at build time from the actual case definitions, so what you see here is exactly what the solver runs.

Engine assumptions

Both solvers use DC power flow:

Lossless network (line resistance ignored).
Flat 1.0 pu voltage magnitudes, so the list of voltage-magnitude violations is always empty.
Small-angle approximation (sin θ ≈ θ).
Before each power-flow run, generators are dispatched greedily in merit order (lowest marginal cost first) until total load is covered.

These assumptions make results meaningful for line loading, thermal violations, and redispatch, but not for reactive power, voltage collapse, or losses.
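The greedy merit-order step can be sketched as follows. This is an illustration of the general technique, not the code in dc-powerflow.ts; the Gen shape and the marginalCost values are assumptions, since the case tables on this page list only P and P_max.

```typescript
// Hypothetical generator shape; marginalCost is not shown in the case tables.
interface Gen {
  pMaxMw: number;
  marginalCost: number;
}

// Fill generators cheapest-first until the load is covered.
// Returns the dispatch in the same order as the input array.
function meritOrderDispatch(gens: Gen[], totalLoadMw: number): number[] {
  const order = gens
    .map((_, i) => i)
    .sort((a, b) => gens[a].marginalCost - gens[b].marginalCost);
  const dispatch = gens.map(() => 0);
  let remaining = totalLoadMw;
  for (const i of order) {
    if (remaining <= 0) break;
    const p = Math.min(gens[i].pMaxMw, remaining);
    dispatch[i] = p;
    remaining -= p;
  }
  if (remaining > 1e-9) {
    throw new Error("Load exceeds total generation capacity");
  }
  return dispatch;
}
```

With this scheme the most expensive units stay at zero whenever cheaper capacity is sufficient, which is why several generators in the tables further down this page show P = 0.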

case5 — 5-bus system

A small 5-bus system commonly used for teaching locational marginal pricing (LMP) and congestion, loosely based on the PJM 5-bus educational example. It is useful as a sanity check because the topology is small enough to reason about by hand.

Dataset version

gridarena-case5@1.0.0

Source

PJM 5-bus educational example (Li & Bo, 2010)

Last reviewed

2026-04-28

Prompt version

— not set

Random seed

— not set

Standardized

Topology (5 buses, 6 branches) matches the canonical PJM 5-bus.
Bus types (slack/PV/PQ) follow the original classification.
Line reactances (x_pu) preserved from the reference dataset.

Simplified

Resistances and shunts dropped — DC power flow only.
Generator cost curves replaced by a flat merit-order ranking.
Voltage limits not enforced (flat 1.0 pu assumption).

Buses

5 (slack 1 · PV 2 · PQ 2)

Branches

6

Generators

3

Base MVA

100

Total load

1000 MW

Initial dispatch

1000 MW

Total gen capacity

1170 MW

Reserve margin

17%
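The reserve margin figures on this page are consistent with (total gen capacity − total load) / total load, rounded to a whole percent. The helper below is a small sketch of that arithmetic; whether the build script rounds in exactly this way is an assumption.

```typescript
// Reserve margin as a whole-number percentage of total load.
function reserveMarginPct(capacityMw: number, loadMw: number): number {
  return Math.round(((capacityMw - loadMw) / loadMw) * 100);
}

// case5: (1170 - 1000) / 1000
const case5Margin = reserveMarginPct(1170, 1000); // 17
```

The same formula reproduces the values listed for the other two cases: 772 MW against 259.0 MW gives 198, and 465 MW against 283.4 MW gives 64.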

Topology

[Topology diagram. Legend: slack, PV, PQ, has generator; node size = load (MW).]

case5

Buses

#	Type	Pd (MW)	Vm (pu)
0	slack	0	1
1	pv	300	1
2	pq	300	1
3	pq	400	1
4	pv	0	1

Branches

#	From	To	x (pu)	Rating (MW)
0	0	1	0.02810	400
1	0	3	0.03040	400
2	1	2	0.00640	400
3	2	3	0.01080	240
4	3	4	0.02970	240
5	1	4	0.02970	240

Generators

#	Bus	P (MW)	P_max (MW)
0	0	200	400
1	1	200	170
2	4	600	600
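To make the DC power flow assumptions concrete, the sketch below solves case5 using only the numbers from the tables above. It is a textbook formulation (build the susceptance matrix, fix the slack angle at 0, solve for the remaining angles, derive branch flows) and not necessarily how dc-powerflow.ts is written; the helper names are hypothetical.

```typescript
// Data copied from the case5 tables above (0-based indices, 100 MVA base).
const BASE_MVA = 100;
const loadMw = [0, 300, 300, 400, 0];
const genMw = [200, 200, 0, 0, 600]; // generator P, placed at its bus
const branches = [
  { from: 0, to: 1, x: 0.0281 },
  { from: 0, to: 3, x: 0.0304 },
  { from: 1, to: 2, x: 0.0064 },
  { from: 2, to: 3, x: 0.0108 },
  { from: 3, to: 4, x: 0.0297 },
  { from: 1, to: 4, x: 0.0297 },
];

// Solve A*x = b by Gaussian elimination with partial pivoting.
function solveLinear(A: number[][], b: number[]): number[] {
  const n = b.length;
  const M = A.map((row, i) => [...row, b[i]]); // augmented matrix
  for (let col = 0; col < n; col++) {
    let pivot = col;
    for (let r = col + 1; r < n; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[pivot][col])) pivot = r;
    }
    [M[col], M[pivot]] = [M[pivot], M[col]];
    for (let r = col + 1; r < n; r++) {
      const f = M[r][col] / M[col][col];
      for (let c = col; c <= n; c++) M[r][c] -= f * M[col][c];
    }
  }
  const x = new Array<number>(n).fill(0);
  for (let i = n - 1; i >= 0; i--) {
    let s = M[i][n];
    for (let c = i + 1; c < n; c++) s -= M[i][c] * x[c];
    x[i] = s / M[i][i];
  }
  return x;
}

// DC power flow: B * theta = P (per unit) with theta[slack] = 0.
// Returns the flow on each branch in MW, positive in the from -> to direction.
function dcPowerFlow(slack: number): number[] {
  const n = loadMw.length;
  const B = loadMw.map(() => new Array<number>(n).fill(0));
  for (const br of branches) {
    const b = 1 / br.x; // lossless line: susceptance is 1/x
    B[br.from][br.from] += b;
    B[br.to][br.to] += b;
    B[br.from][br.to] -= b;
    B[br.to][br.from] -= b;
  }
  const keep = loadMw.map((_, i) => i).filter((i) => i !== slack);
  const Bred = keep.map((i) => keep.map((j) => B[i][j]));
  const Pred = keep.map((i) => (genMw[i] - loadMw[i]) / BASE_MVA);
  const thetaRed = solveLinear(Bred, Pred);
  const theta = new Array<number>(n).fill(0);
  keep.forEach((bus, k) => {
    theta[bus] = thetaRed[k];
  });
  return branches.map((br) => ((theta[br.from] - theta[br.to]) / br.x) * BASE_MVA);
}

const flowsMw = dcPowerFlow(0);
```

Comparing the absolute value of each entry in flowsMw with the Rating column of the Branches table gives the line loading that the thermal-violation check is based on.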

case14 — IEEE 14-bus

Derived from the classic IEEE 14-bus test case, which represents a portion of the American Electric Power (AEP) system in the US Midwest as of February 1962. It is one of the most widely cited small-scale benchmarks in the power-flow literature.

Dataset version

gridarena-case14@1.0.0

Source

IEEE 14-bus (AEP, Feb 1962) via MATPOWER case14

Last reviewed

2026-04-28

Prompt version

— not set

Random seed

— not set

Standardized

14 buses, 20 branches, 5 generators per the IEEE reference.
Bus loads (Pd) match published values.
Per-line thermal ratings preserved (50–200 MW).

Simplified

Transformer tap ratios collapsed into plain reactances.
Reactive load (Qd) and bus shunts ignored under DC-PF.
Generator Q-limits and voltage setpoints omitted.

Buses

14 (slack 1 · PV 4 · PQ 9)

Branches

20

Generators

5

Base MVA

100

Total load

259.0 MW

Initial dispatch

272 MW

Total gen capacity

772 MW

Reserve margin

198%

Topology

[Topology diagram. Legend: slack, PV, PQ, has generator; node size = load (MW).]

case14

Buses

#	Type	Pd (MW)	Vm (pu)
0	slack	0	1
1	pv	21.70	1
2	pv	94.20	1
3	pq	47.80	1
4	pq	7.60	1
5	pv	11.20	1
6	pq	0	1
7	pv	0	1
8	pq	29.50	1
9	pq	9	1
10	pq	3.50	1
11	pq	6.10	1
12	pq	13.50	1
13	pq	14.90	1

Branches

#	From	To	x (pu)	Rating (MW)
0	0	1	0.05917	200
1	0	4	0.22304	200
2	1	2	0.19797	100
3	1	3	0.17632	100
4	1	4	0.17388	100
5	2	3	0.17103	100
6	3	4	0.04211	100
7	3	6	0.20912	100
8	3	8	0.55618	100
9	4	5	0.25202	100
10	5	10	0.19890	50
11	5	11	0.25581	50
12	5	12	0.13027	50
13	6	7	0.17615	100
14	6	8	0.11001	100
15	8	9	0.08450	50
16	8	13	0.27038	50
17	9	10	0.19207	50
18	11	12	0.19988	50
19	12	13	0.34802	50

Generators

#	Bus	P (MW)	P_max (MW)
0	0	232	332
1	1	40	140
2	2	0	100
3	5	0	100
4	7	0	100

case30 — IEEE 30-bus

Derived from the IEEE 30-bus test case, also based on the AEP system (December 1961). It is a standard benchmark for contingency (N-1) analysis, with enough topology to exercise meaningful re-routing under line outages.

Dataset version

gridarena-case30@1.0.0

Source

IEEE 30-bus (AEP, Dec 1961) via MATPOWER case30

Last reviewed

2026-04-28

Prompt version

— not set

Random seed

— not set

Standardized

30 buses, 41 branches, 6 generators per the IEEE reference.
Bus loads (Pd) match published values.
Branch reactances preserved from the standard dataset.

Simplified

Uniform 130 MW thermal rating applied to every branch.
Transformers, shunts, and reactive elements dropped.
Generator cost curves replaced by greedy merit order.

Buses

30 (slack 1 · PV 5 · PQ 24)

Branches

41

Generators

6

Base MVA

100

Total load

283.4 MW

Initial dispatch

289.4 MW

Total gen capacity

465 MW

Reserve margin

64%

Topology

[Topology diagram. Legend: slack, PV, PQ, has generator; node size = load (MW).]

case30

Buses

#	Type	Pd (MW)	Vm (pu)
0	slack	21.70	1
1	pv	2.40	1
2	pq	7.60	1
3	pq	0	1
4	pq	94.20	1
5	pq	0	1
6	pq	22.80	1
7	pq	30	1
8	pq	0	1
9	pq	5.80	1
10	pq	0	1
11	pq	11.20	1
12	pv	0	1
13	pq	6.20	1
14	pq	8.20	1
15	pq	3.50	1
16	pq	9	1
17	pq	3.20	1
18	pq	9.50	1
19	pq	2.20	1
20	pq	17.50	1
21	pv	0	1
22	pv	3.20	1
23	pq	8.70	1
24	pq	0	1
25	pq	3.50	1
26	pv	0	1
27	pq	0	1
28	pq	2.40	1
29	pq	10.60	1

Branches

#	From	To	x (pu)	Rating (MW)
0	0	1	0.05750	130
1	0	2	0.16520	130
2	1	3	0.17370	130
3	2	3	0.03790	130
4	1	4	0.19830	130
5	1	5	0.17630	130
6	3	5	0.04140	130
7	4	6	0.11600	130
8	5	6	0.08200	130
9	5	7	0.04200	130
10	5	8	0.20800	130
11	5	9	0.55600	130
12	8	10	0.20800	130
13	8	9	0.11000	130
14	3	11	0.25600	130
15	11	12	0.14000	130
16	11	13	0.25590	130
17	11	14	0.13040	130
18	11	15	0.19870	130
19	13	14	0.19970	130
20	15	16	0.19230	130
21	14	17	0.21850	130
22	17	18	0.12920	130
23	18	19	0.06800	130
24	9	19	0.20900	130
25	9	16	0.08450	130
26	9	20	0.07490	130
27	9	21	0.14990	130
28	20	21	0.02360	130
29	14	22	0.20200	130
30	21	23	0.17900	130
31	22	23	0.27000	130
32	23	24	0.32920	130
33	24	25	0.38000	130
34	24	26	0.20870	130
35	27	26	0.03960	130
36	26	28	0.41530	130
37	26	29	0.60270	130
38	28	29	0.45330	130
39	7	27	0.20000	130
40	5	27	0.05990	130

Generators

#	Bus	P (MW)	P_max (MW)
0	0	138.6	200
1	1	57.6	80
2	12	0	50
3	21	24.6	50
4	22	21.6	30
5	26	47	55

Supported actions

The solvers accept these structured actions on any case:

scale_all_loads — multiply every load by value.
set_generator_p_mw — set generator at target_index to value MW.
line_outage — remove branch at target_index.
shed_load — shed value MW total, scaled across loads.
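The four actions can be modeled as a discriminated union applied to a copy of the case. The sketch below is illustrative only: the GridCase shape is hypothetical, and the proportional split used for shed_load is an assumption about what "scaled across loads" means.

```typescript
// Hypothetical in-memory case shape; the real one in cases.ts may differ.
interface GridCase {
  loadsMw: number[];
  genPMw: number[];
  branchInService: boolean[];
}

type Action =
  | { type: "scale_all_loads"; value: number }
  | { type: "set_generator_p_mw"; target_index: number; value: number }
  | { type: "line_outage"; target_index: number }
  | { type: "shed_load"; value: number };

// Apply one action to a copy of the case; the input is left untouched.
function applyAction(c: GridCase, a: Action): GridCase {
  const next: GridCase = {
    loadsMw: c.loadsMw.slice(),
    genPMw: c.genPMw.slice(),
    branchInService: c.branchInService.slice(),
  };
  switch (a.type) {
    case "scale_all_loads": {
      const factor = a.value;
      next.loadsMw = next.loadsMw.map((p) => p * factor);
      break;
    }
    case "set_generator_p_mw":
      next.genPMw[a.target_index] = a.value;
      break;
    case "line_outage":
      next.branchInService[a.target_index] = false;
      break;
    case "shed_load": {
      // Assumption: each load is reduced in proportion to its share of the total.
      const total = next.loadsMw.reduce((sum, p) => sum + p, 0);
      const keep = total > 0 ? Math.max(0, 1 - a.value / total) : 1;
      next.loadsMw = next.loadsMw.map((p) => p * keep);
      break;
    }
  }
  return next;
}
```

For example, applyAction(case5, { type: "line_outage", target_index: 2 }) would mark the third branch as out of service before the next power-flow run.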

Supported perturbations

For sensitivity / robustness analysis (batch perturbation jobs):

load_scale / load_increase — scale every load by parameter_value.
line_rating_decrease — scale every line s_nom by parameter_value.
generator_outage — remove generator at index parameter_value.
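A batch perturbation job can be thought of as mapping one perturbation type over a list of parameter values. The sketch below assumes a hypothetical CaseState shape and treats load_scale and load_increase identically, as described above; it is not the job runner's actual code.

```typescript
// Hypothetical case shape for perturbation runs.
interface CaseState {
  loadsMw: number[];
  lineRatingsMw: number[];
  generatorInService: boolean[];
}

type Perturbation = "load_scale" | "load_increase" | "line_rating_decrease" | "generator_outage";

// Return a perturbed copy of the base case; the base itself is not modified.
function perturb(base: CaseState, kind: Perturbation, parameterValue: number): CaseState {
  const next: CaseState = {
    loadsMw: base.loadsMw.slice(),
    lineRatingsMw: base.lineRatingsMw.slice(),
    generatorInService: base.generatorInService.slice(),
  };
  if (kind === "load_scale" || kind === "load_increase") {
    next.loadsMw = next.loadsMw.map((p) => p * parameterValue);
  } else if (kind === "line_rating_decrease") {
    next.lineRatingsMw = next.lineRatingsMw.map((r) => r * parameterValue);
  } else {
    // generator_outage: parameterValue is the generator index.
    next.generatorInService[parameterValue] = false;
  }
  return next;
}

// Example sweep over case5 loads at 90%, 100%, and 110%.
const case5Base: CaseState = {
  loadsMw: [0, 300, 300, 400, 0],
  lineRatingsMw: [400, 400, 400, 240, 240, 240],
  generatorInService: [true, true, true],
};
const loadSweep = [0.9, 1.0, 1.1].map((v) => perturb(case5Base, "load_scale", v));
```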

References

MATPOWER — canonical .m case archive (case5.m, case14.m, case30.m).
PyPSA documentation — engine used by the optional simulation microservice.
Illinois Center for a Smarter Electric Grid — one-line diagrams and history of the IEEE 14 / 30 systems.
In-repo: src/server/simulation/cases.ts, src/server/simulation/dc-powerflow.ts, simulation-service/README.md.