Bullshitbox

Turn messy human intent into deterministic actions.

Bullshitbox records raw interaction signals, proposes repeatable actions, replays them in a deterministic testbed, and validates outcomes with assertions. When an outcome is ambiguous, it escalates to a human.

Static site. No tracking. No cookies. No external assets.

The validation-first automation loop

Scroll to step through the system. Each stage has its own view and produces its own evidence. The goal is not more automation. The goal is reproducible correctness.

Stage 1

Record

Capture raw events: mouse, keyboard, focus changes, timing. Nothing is interpreted yet. Keeping the stream uninterpreted means every later abstraction can be checked against what actually happened.

  • Raw event stream only
  • Timestamped + ordered
  • No “helpful” assumptions
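
A minimal sketch of what a raw recorder could look like, assuming events arrive with caller-supplied millisecond timestamps. The names RawEvent and Recorder, and the event kinds used here, are hypothetical and only illustrate the "ordered, timestamped, uninterpreted" idea.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class RawEvent:
    seq: int    # position in the stream, assigned by the recorder
    t_ms: int   # timestamp in milliseconds, supplied by the caller
    kind: str   # hypothetical kinds: "focus", "mouse_down", "key_down", ...
    data: Dict[str, Any] = field(default_factory=dict)


class Recorder:
    """Appends events in order and stores them exactly as received."""

    def __init__(self) -> None:
        self._events: List[RawEvent] = []

    def record(self, t_ms: int, kind: str, **data: Any) -> RawEvent:
        # Reject out-of-order timestamps instead of silently reordering.
        if self._events and t_ms < self._events[-1].t_ms:
            raise ValueError("timestamps must not go backwards")
        event = RawEvent(seq=len(self._events), t_ms=t_ms, kind=kind, data=dict(data))
        self._events.append(event)
        return event

    def stream(self) -> List[RawEvent]:
        # Return a copy so callers cannot rewrite the recorded history.
        return list(self._events)


recorder = Recorder()
recorder.record(0, "focus", window="editor")
recorder.record(12, "mouse_down", x=40, y=88, button="left")
recorder.record(95, "mouse_up", x=40, y=88, button="left")
recorder.record(140, "key_down", key="a", modifiers=[])
events = recorder.stream()
```

Note that the recorder never merges the mouse_down and mouse_up into a "click". That interpretation is left to the next stage.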

Stage 2

Propose

Group signals into candidate actions with context cues and confidence. Multiple hypotheses are allowed. Hand-wavy certainty is not.

  • Action candidates + rationale
  • Context signals (targets, modifiers, focus)
  • Confidence + evidence attached
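
One way the proposal step might be sketched, assuming a simplified event tuple of (t_ms, kind, x, y). The Candidate type, the 500 ms long-press threshold, and the confidence numbers are all illustrative assumptions, not values from the system itself. The point is that an ambiguous press yields two hypotheses, each carrying its evidence and rationale.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Simplified raw event for this sketch: (t_ms, kind, x, y)
Event = Tuple[int, str, int, int]


@dataclass(frozen=True)
class Candidate:
    action: str
    confidence: float          # illustrative number, not a calibrated probability
    evidence: Tuple[int, ...]  # indices into the raw event stream
    rationale: str


def propose(events: List[Event], long_press_ms: int = 500) -> List[Candidate]:
    candidates: List[Candidate] = []
    for i in range(len(events) - 1):
        (t0, k0, x0, y0), (t1, k1, x1, y1) = events[i], events[i + 1]
        if k0 != "mouse_down" or k1 != "mouse_up":
            continue
        held = t1 - t0
        pair = (i, i + 1)
        if (x0, y0) != (x1, y1):
            why = f"pointer moved from {(x0, y0)} to {(x1, y1)}"
            candidates.append(Candidate("drag", 0.9, pair, why))
        elif held < long_press_ms:
            candidates.append(Candidate("click", 0.9, pair, f"held {held} ms"))
        else:
            # Ambiguous: keep both hypotheses rather than guessing.
            why = f"held {held} ms, at or above {long_press_ms} ms"
            candidates.append(Candidate("long_press", 0.6, pair, why))
            candidates.append(Candidate("click", 0.4, pair, why))
    return candidates


raw = [
    (0, "mouse_down", 40, 88),
    (80, "mouse_up", 40, 88),
    (1000, "mouse_down", 10, 10),
    (1700, "mouse_up", 10, 10),
]
proposals = propose(raw)
```

Here the first press produces a single click candidate, while the second (held 700 ms) produces both a long_press and a click candidate pointing at the same two raw events.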

Stage 3

Replay

Execute proposed actions inside a deterministic testbed. Same inputs, same layout, same state expectations. Reproducibility is the feature.

  • Fixed-position GUI reference environment
  • Known window geometry + UI layout
  • State snapshot after replay
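
A toy sketch of deterministic replay, assuming a fixed layout with two hypothetical buttons ("increment" and "reset") and a testbed whose only state is a counter. A real testbed would drive an actual GUI; this only shows the contract that the same actions against the same layout yield the same snapshot.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int

    def contains(self, px: int, py: int) -> bool:
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h


# Fixed-position layout: element geometry is known up front and never changes.
LAYOUT: Dict[str, Rect] = {
    "increment": Rect(10, 10, 80, 30),
    "reset": Rect(10, 50, 80, 30),
}

Action = Tuple[str, int, int]  # ("click", x, y)


def replay(actions: List[Action]) -> Dict[str, int]:
    """Run actions from a fresh initial state and return a state snapshot."""
    state = {"counter": 0, "missed_clicks": 0}
    for kind, x, y in actions:
        if kind != "click":
            raise ValueError(f"unsupported action: {kind}")
        target = next((n for n, r in LAYOUT.items() if r.contains(x, y)), None)
        if target == "increment":
            state["counter"] += 1
        elif target == "reset":
            state["counter"] = 0
        else:
            # A click that hits nothing is recorded, not ignored.
            state["missed_clicks"] += 1
    return dict(state)


actions = [("click", 20, 20), ("click", 20, 20), ("click", 200, 200)]
snapshot = replay(actions)
```

Because replay always starts from the same initial state and reads no clock, randomness, or external input, calling it twice with the same actions returns equal snapshots.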

Stage 4

Validate

Assert outcomes and invariants. An action that can’t be validated doesn’t get promoted. When ambiguity remains, the system asks a human.

  • PASS / FAIL / NEED_HUMAN
  • Assertions + invariants
  • Drift detection over time
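
The three-way verdict could be sketched as follows, assuming each check inspects a state snapshot and returns True, False, or None when it cannot decide. The check functions and snapshot keys are hypothetical. Drift detection is not shown.

```python
from enum import Enum
from typing import Callable, Dict, List, Optional


class Verdict(Enum):
    PASS = "PASS"
    FAIL = "FAIL"
    NEED_HUMAN = "NEED_HUMAN"


Snapshot = Dict[str, object]
# A check returns True (holds), False (violated), or None (cannot decide).
Check = Callable[[Snapshot], Optional[bool]]


def validate(snapshot: Snapshot, checks: List[Check]) -> Verdict:
    results = [check(snapshot) for check in checks]
    if any(r is False for r in results):
        return Verdict.FAIL          # a definite violation always wins
    if any(r is None for r in results):
        return Verdict.NEED_HUMAN    # nothing failed, but something is undecided
    return Verdict.PASS


def counter_is_two(s: Snapshot) -> Optional[bool]:
    return s.get("counter") == 2


def no_missed_clicks(s: Snapshot) -> Optional[bool]:
    if "missed_clicks" not in s:
        return None  # the evidence is missing, so do not guess
    return s["missed_clicks"] == 0


checks = [counter_is_two, no_missed_clicks]
clean = validate({"counter": 2, "missed_clicks": 0}, checks)
unclear = validate({"counter": 2}, checks)
broken = validate({"counter": 3}, checks)
```

In this sketch a missing field leads to NEED_HUMAN rather than a silent PASS, which is one way to honor the rule that unvalidated behavior is never promoted.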

Stage 5

Integrate

Only validated actions become stable automation. The system remains inspectable: what ran, why it ran, and what it produced.

  • Promote only after validation
  • Traceable execution log
  • Determinism as default
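
A small sketch of promotion gating, assuming verdicts are passed around as the strings "PASS", "FAIL", and "NEED_HUMAN". The Registry class and the workflow names are hypothetical. Every decision, including a rejection, leaves a line in the log.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Registry:
    stable: Dict[str, List[str]] = field(default_factory=dict)
    log: List[str] = field(default_factory=list)

    def promote(self, name: str, steps: List[str], verdict: str) -> bool:
        """Store a workflow as stable automation only if it validated."""
        if verdict != "PASS":
            self.log.append(f"rejected {name}: verdict={verdict}")
            return False
        self.stable[name] = list(steps)
        self.log.append(f"promoted {name}: {len(steps)} step(s)")
        return True


registry = Registry()
registry.promote("export_report", ["open menu", "click export"], "PASS")
registry.promote("rename_batch", ["select all", "press F2"], "NEED_HUMAN")
```

After these two calls only export_report is in registry.stable, and registry.log records both the promotion and the rejection.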

Use cases

Repeatable lab UI workflows
Admin / operations routines
Cross-app copy / transform / paste
Regression checks for GUI changes

If a workflow can’t be replayed deterministically, it’s not ready for automation.