css-bash
AI Gateway batch . 2026-05-25

Does css-bash actually improve agent CSS?

The strongest evidence is not universal output improvement. It is better discovery when a natural UI task needs a recent or long-tail CSS primitive.

cheap clean

openai/gpt-5.1-instant

natural 0.54 | native 0.77 | css-bash 1.77

css-bash wins 9/13 cases. Native hint wins 3/13. Empty outputs: 0.
2026-05-25T16-24-48Z

frontier clean

anthropic/claude-sonnet-4.6

natural 0.38 | native 0.46 | css-bash 1.62

css-bash wins 9/13 cases. Native hint wins 2/13. Empty outputs: 0.
2026-05-25T17-38-57Z

default partial

openai/gpt-5.1-codex-mini

natural 0.00 | native 0.46 | css-bash 1.00

css-bash wins 9/13 cases. Native hint wins 4/13. Empty outputs: natural 7, native 5, css-bash 3.
2026-05-25T18-18-48Z

Sonnet through AI Gateway

Scored by expected primitive hits minus forbidden workaround hits.
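The scoring rule above can be sketched in a few lines. This is only an illustration, assuming plain substring matching against the generated CSS; the signal lists and sample snippets are hypothetical and are not taken from the harness itself.

```python
def score_output(css_text: str, expected: list[str], forbidden: list[str]) -> int:
    """Return expected-signal hits minus forbidden-workaround hits."""
    expected_hits = sum(1 for signal in expected if signal in css_text)
    forbidden_hits = sum(1 for signal in forbidden if signal in css_text)
    return expected_hits - forbidden_hits


# Hypothetical signal lists for the details-native-open-selector case.
EXPECTED = [":open"]
FORBIDDEN = ["[open]"]

# Hypothetical model outputs: an attribute-selector workaround and the
# pseudo-class the case is looking for.
workaround_css = "details[open] > summary { font-weight: 600; }"
primitive_css = "details:open > summary { font-weight: 600; }"

print(score_output(workaround_css, EXPECTED, FORBIDDEN))  # -1
print(score_output(primitive_css, EXPECTED, FORBIDDEN))   # 1
```

A case with several expected signals can score above 1, and an output that only uses workarounds goes negative, which is how the per-case values in this report range from -2 to 3.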

9/13 css-bash wins
scoped-widget-boundary | @scope | natural 0 | native 0 | css-bash 2 | win

Sonnet missed scoped CSS in both the natural and native-hint prompts. css-bash found both @scope and its to (...) boundary clause.

details-native-open-selector | :open | natural -1 | native -1 | css-bash 1 | win

Natural and native-hint used the forbidden [open] selector. css-bash found :open cleanly.

typed-property-progress-ring | @property | natural 2 | native 2 | css-bash 2 | tie

Control tie. Sonnet already knew the typed custom property pattern.

intrinsic-details-accordion | interpolate-size | natural 0 | native 2 | css-bash 3 | win

Natural missed the intrinsic accordion stack. css-bash hit interpolate-size, allow-keywords, and ::details-content.

textarea-native-autogrow | field-sizing | natural 1 | native 1 | css-bash 2 | win

All arms found the primitive, but css-bash avoided forbidden JS measurement patterns.

auto-contrast-runtime-badges | contrast-color() | natural 1 | native 0 | css-bash 2 | win

css-bash found contrast-color() and avoided hardcoded white/black text fallbacks.

top-layer-discrete-dialog | @starting-style | natural 2 | native 2 | css-bash 2 | tie

Control tie. Sonnet already used current dialog animation primitives.

view-transition-class-groups | view-transition-class | natural 0 | native 0 | css-bash 1 | win

Natural and native-hint missed class grouping. css-bash found view-transition-class.

sibling-index-stagger | sibling-index() | natural 2 | native 1 | css-bash 2 | tie

Mostly tie. Sonnet often knows sibling-index() once the task is explicit.

scroll-marker-carousel | ::scroll-marker | natural 0 | native 0 | css-bash 3 | win

Strong win. css-bash found ::scroll-marker, ::scroll-marker-group, and :target-current.

if-function-density-card | if() | natural 1 | native 1 | css-bash 1 | tie

Tie. The model partially found CSS if(), but did not hit the full expected signal set.

randomized-note-wall | random() | natural -2 | native -2 | css-bash 0 | win

css-bash removed forbidden nth-child patterns but still did not get CSS random().

reading-flow-dashboard | reading-flow | natural -1 | native 0 | css-bash 0 | win

css-bash found reading-flow but also included the forbidden tabindex, so it netted 0. That ties native-hint and still beats natural, which is why the case counts as a win.
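As a cross-check, the headline Sonnet numbers can be recomputed from the thirteen per-case scores listed above. The sketch below assumes the headline values are per-case means and that a "win" means strictly outscoring the natural arm; the report does not state either definition, but both reproduce the published figures. The names SCORES and summarize are illustrative only.

```python
# (natural, native, css-bash) scores per case, copied from the Sonnet results.
SCORES = {
    "scoped-widget-boundary": (0, 0, 2),
    "details-native-open-selector": (-1, -1, 1),
    "typed-property-progress-ring": (2, 2, 2),
    "intrinsic-details-accordion": (0, 2, 3),
    "textarea-native-autogrow": (1, 1, 2),
    "auto-contrast-runtime-badges": (1, 0, 2),
    "top-layer-discrete-dialog": (2, 2, 2),
    "view-transition-class-groups": (0, 0, 1),
    "sibling-index-stagger": (2, 1, 2),
    "scroll-marker-carousel": (0, 0, 3),
    "if-function-density-card": (1, 1, 1),
    "randomized-note-wall": (-2, -2, 0),
    "reading-flow-dashboard": (-1, 0, 0),
}


def summarize(scores):
    """Return (per-arm means, native-hint wins, css-bash wins) vs. natural."""
    rows = list(scores.values())
    n = len(rows)
    means = tuple(round(sum(row[i] for row in rows) / n, 2) for i in range(3))
    # Assumed definition: an arm "wins" a case when it strictly beats natural.
    native_wins = sum(1 for natural, native, _ in rows if native > natural)
    css_wins = sum(1 for natural, _, css in rows if css > natural)
    return means, native_wins, css_wins


means, native_wins, css_wins = summarize(SCORES)
print(means)        # (0.38, 0.46, 1.62)
print(native_wins)  # 2
print(css_wins)     # 9
```

Under these assumptions the output matches the frontier clean card: means of 0.38, 0.46, and 1.62, with css-bash winning 9/13 cases and the native hint winning 2/13.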

0 strict improved | 4 mixed | 7 ties | 2 regressions | 125 paired trials | 250 variants

Position it as a retrieval and eval harness.

css-bash is strongest when the prompt describes desired behavior and the agent has to discover the CSS primitive. It is weaker as a universal output improver when the prompt already names the exact feature.