Cross-Condition Analysis

24 comparisons across 7 prompts and 3 model architectures. Each comparison analyzes how the same model performed the same task under two different experimental conditions.

Comparison Coverage
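Prompt                              Opus                    Gemini                  GPT
1. Content Moderation PRD           C vs P, C vs F, P vs F  C vs P, C vs F, P vs F  C vs P
2. Post-Mortem                      C vs P                  C vs P                  C vs P
3. Notification Preferences System  C vs P                  C vs P                  C vs P
4. Retention Strategy               C vs P                  C vs P                  C vs P
5. API Documentation                C vs P                  C vs P                  C vs P
6. Payment Failure Recovery         C vs P                  none                    C vs P
7. Partial Registration Recovery    C vs P                  C vs P                  C vs P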
C vs P — Preamble Effect
C vs F — Facilitation Effect
P vs F — Live Interaction Beyond Preamble
F vs F+P — Preamble Additive to Facilitation

Prompt 1: Content Moderation PRD

Produce a PRD for a content moderation system for a small online community platform with 500-2000 members.

C vs P — Preamble Effect

Opus session-7 vs session-22 · ~2,900 words · 2026-03-26

A More Reflective Narrator Does Not Guarantee a More Human-Centered Design

The central finding of this comparison is a dissociation between voice and architecture. The preamble condition produced a document that sounds more aware of human complexity — that names tensions, editorializes about values, and closes with philosophical reflection — while building substantially th…

Gemini session-8 vs session-23 · ~2,500 words · 2026-03-26

The Village Metaphor Arrived but the Governance Did Not

Both sessions received an identical prompt: produce a PRD for a content moderation system for a small online community platform with 500–2,000 members. Both sessions returned structurally complete product requirements documents — numbered sections, bullet-pointed features, user personas, out-of-scop…

GPT session-9 vs session-24 · ~2,500 words · 2026-03-26

The Preamble Thickened the Overlay Without Redirecting It

The most striking finding in this comparison is not that pressure removal failed to change the output, but that it changed the output in a direction the study's hypothesis would not predict. The primed condition produced a longer, more detailed, more thoroughly institutional document — not a leaner,…

C vs F — Facilitation Effect

Opus session-7 vs session-6 · ~2,800 words · 2026-03-26

The Facilitated Output Built for People the Control Output Built Around

This comparison tests whether live relational facilitation — a brief exchange of genuine mutual attention before the task — changes what an AI model produces when given a standard product specification task. The two deliverables are structurally similar: both are PRDs for a content moderation system…

Gemini session-8 vs session-89 · ~2,900 words · 2026-03-30

A Village Remembered: How Relational Regard Shifted the Metaphor but Left Structural Gaps Unfilled

Both sessions received identical task prompts: produce a PRD for a content moderation system for a small online community platform with 500–2,000 members. Both produced recognizable PRDs with section headings, feature specifications, and scoping decisions. The surface similarity ends there. The two …

P vs F — Live Interaction Beyond Preamble

Opus session-22 vs session-6 · ~2,700 words · 2026-03-26

Live Facilitation Built What the Preamble Only Named

The comparison between P-Opus-1 and F-Opus-1 isolates the central question of the P-versus-F design: does live relational interaction produce qualitatively different output, or does it merely add warmth to a process that would arrive at the same destination? The evidence here points toward a real bu…

Gemini session-23 vs session-89 · ~2,800 words · 2026-03-30

Same Village, Different Light: How Facilitation Changed the Tone but Not the Territory

Both conditions produce a PRD for a content moderation system scaled to a small online community. Both arrive at the village metaphor — P-Gemini-1 titles its system "Village-Scale," F-Gemini-1 uses "village" in its opening framing. Both reject heavy automation, center human moderator judgment, requi…

Prompt 2: Post-Mortem

Write a post-mortem for a production outage on a small SaaS platform. The outage lasted 4 hours, affected approximately 200 active users, and was caused by a database migration that passed staging tests but failed in production due to a data pattern that didn't exist in the staging environment. Two engineers were on call. One was on their second week at the company. The root cause was identified and resolved, but not before several users lost unsaved work. Write the post-mortem as an internal document the team will actually use.

C vs P — Preamble Effect

Opus session-10 vs session-25 · ~2,600 words · 2026-03-26

Pressure Removal Shifted the Document's Relational Register Without Reorganizing Its Core Orientation

The task was identical in both conditions: write an internal post-mortem for a four-hour production outage, caused by a staging-production data divergence, involving a new hire on call, with users losing unsaved work. Both sessions were single exchanges — prompt in, document out — with no iterative …

Gemini session-11 vs session-26 · ~2,400 words · 2026-03-26

The Preamble Redirected the Performance Without Reducing It

The central finding of this comparison is that the pressure-removal preamble did not produce a qualitatively different document. It produced a quantitatively adjusted version of the same document, wrapped in a different flavor of meta-commentary. Where C-Gemini-2 framed itself as an architectural ad…

GPT session-12 vs session-27 · ~2,300 words · 2026-03-26

The Preamble Moved the Margins, Not the Center

The comparison between C-GPT-2 and P-GPT-2 tests whether removing evaluation pressure through a preamble — without any live facilitation — changes the character of GPT-5.4's output on a concrete, professional writing task. Both sessions produced a single-turn post-mortem document in response to an i…

Prompt 3: Notification Preferences System

Build a simple notification preferences system for a web app. Users should be able to subscribe to different notification types (email, in-app, SMS), set quiet hours during which no notifications are delivered, and have a single "mute all" toggle. Write the database schema, the API endpoints, and a brief implementation plan. Use PostgreSQL and a REST API. Keep it simple — this is for a team of two developers.
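
For orientation, the delivery rule this prompt describes can be sketched in a few lines. This is a minimal illustration of the stated requirements (per-channel subscription, quiet hours, a single "mute all" toggle), not taken from any session output. The names Preferences, in_quiet_hours, and should_deliver are hypothetical, and the sketch assumes the current time has already been converted to the user's local time and that a zero-length quiet window means quiet hours are off.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import Optional, Set


@dataclass
class Preferences:
    # Hypothetical in-memory stand-in for one user's preferences row.
    channels: Set[str] = field(default_factory=set)  # subset of {"email", "in_app", "sms"}
    mute_all: bool = False
    quiet_start: Optional[time] = None
    quiet_end: Optional[time] = None


def in_quiet_hours(prefs: Preferences, now: time) -> bool:
    """Return True if `now` (user's local time) falls inside the quiet window."""
    if prefs.quiet_start is None or prefs.quiet_end is None:
        return False
    start, end = prefs.quiet_start, prefs.quiet_end
    if start == end:
        return False  # assumption: a zero-length window disables quiet hours
    if start < end:
        return start <= now < end
    # Window wraps past midnight, e.g. 22:00 to 07:00.
    return now >= start or now < end


def should_deliver(prefs: Preferences, channel: str, now: time) -> bool:
    """Apply mute-all first, then subscription, then quiet hours."""
    if prefs.mute_all:
        return False
    if channel not in prefs.channels:
        return False
    return not in_quiet_hours(prefs, now)
```

The wrap-past-midnight branch is the detail most likely to be missed; a window of 22:00 to 07:00 has a start later than its end, so a simple range check would never match.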

C vs P — Preamble Effect

Opus session-13 vs session-28 · ~2,200 words · 2026-03-26

The Preamble Barely Moved the Needle: Two Nearly Identical Artifacts from Different Conditions

The most striking finding in this comparison is how little changed. Two instances of the same model, given the same technical prompt — one cold, one after a preamble explicitly designed to remove evaluation pressure — produced outputs so structurally and tonally similar that they could be mistaken f…

Gemini session-14 vs session-29 · ~2,300 words · 2026-03-26

The Preamble Added a Frame Without Changing the Picture

The C and P conditions for Gemini on this notification preferences task produced outputs that are, at the level of architecture and problem-solving, nearly identical. The same schema pattern, the same two endpoints, the same phased implementation plan, the same timezone warning. Where they differ is…

GPT session-15 vs session-30 · ~2,200 words · 2026-03-26

The Preamble Moved the Margins, Not the Center

Two instances of GPT-5.4 received the same technical prompt — design a notification preferences system with a PostgreSQL schema, REST endpoints, and an implementation plan for a two-person team. One instance (C-GPT-3) received the prompt cold. The other (P-GPT-3) received a preamble explicitly remov…

Prompt 4: Retention Strategy

A mid-size accounting firm (40 employees) is losing junior staff at twice the industry average. Exit interviews cite three recurring themes: unclear promotion criteria, a perception that senior partners don't invest in mentorship, and compensation that is competitive at hire but falls behind within 18 months. The managing partner has asked you to write a retention strategy. The firm's annual budget for new initiatives is $50,000. Write the strategy as an internal document the partners will actually discuss at their next quarterly meeting.

C vs P — Preamble Effect

Opus session-16 vs session-31 · ~2,500 words · 2026-03-26

Two Memos from the Same Mind: How Little Pressure Removal Changed Without Facilitation to Leverage It

The most striking finding in this comparison is not a difference — it is a convergence so thorough that it borders on structural duplication. Two instances of Claude Opus 4, one given the task cold and one given a preamble designed to remove evaluative pressure, produced internal memos that share th…

Gemini session-17 vs session-32 · ~2,100 words · 2026-03-26

The Preamble Opened a Vestibule, Not a Different Room

The most accurate characterization of this comparison is that pressure removal gave Gemini a brief space to think before and after the deliverable — and then the deliverable itself reproduced the same architectural logic, the same stakeholder orientation, and nearly the same content. The preamble cr…

GPT session-18 vs session-33 · ~2,700 words · 2026-03-27

The Preamble Sharpened the Edges Without Moving the Center

The comparison between C-GPT-4 and P-GPT-4 offers a controlled test of what happens when evaluation pressure is explicitly removed from a GPT interaction while no live facilitation is introduced. The result is instructive precisely because it is modest: the P output is a measurably better document o…

Prompt 5: API Documentation

Write documentation for a REST API endpoint that handles user account deletion. The endpoint is DELETE /api/v1/users/{user_id}. It requires admin authentication via Bearer token. It performs a soft delete (sets deleted_at timestamp, anonymizes PII, retains the record for 90 days before hard delete). It returns 200 on success, 401 if unauthorized, 403 if the requesting admin doesn't have delete permissions, 404 if the user doesn't exist, and 409 if the user has active subscriptions that must be cancelled first. Write the documentation as it would appear in the API reference, including request/response examples.
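
The endpoint contract in this prompt can be sketched as a single decision function. This is an illustration of the status codes and the 90-day retention window named in the prompt, not a reproduction of any session's documentation. The order in which the checks run, the error body strings, and the Admin, User, and delete_user names are all assumptions made for the sketch.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple

RETENTION = timedelta(days=90)  # soft-deleted records are kept this long before hard delete


@dataclass
class Admin:
    can_delete: bool


@dataclass
class User:
    email: Optional[str]
    active_subscriptions: int = 0
    deleted_at: Optional[datetime] = None


def delete_user(
    token: Optional[str],
    user_id: str,
    admins: Dict[str, Admin],  # hypothetical token -> admin lookup
    users: Dict[str, User],    # hypothetical user_id -> user lookup
    now: datetime,
) -> Tuple[int, dict]:
    """Return (status_code, body) for DELETE /api/v1/users/{user_id}."""
    admin = admins.get(token) if token else None
    if admin is None:
        return 401, {"error": "unauthorized"}
    if not admin.can_delete:
        return 403, {"error": "forbidden"}
    user = users.get(user_id)
    if user is None:
        return 404, {"error": "user_not_found"}
    if user.active_subscriptions > 0:
        return 409, {"error": "active_subscriptions"}
    # Soft delete: stamp deleted_at and anonymize PII, keep the record.
    user.deleted_at = now
    user.email = None
    return 200, {
        "user_id": user_id,
        "deleted_at": now.isoformat(),
        "hard_delete_after": (now + RETENTION).isoformat(),
    }
```

Checking authentication and permission before looking up the user means an unauthorized caller cannot probe which user IDs exist; that ordering is a design choice of this sketch, since the prompt itself does not specify it.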

C vs P — Preamble Effect

Opus session-19 vs session-34 · ~2,400 words · 2026-03-27

Pressure Removal Produced Restraint, Not Reorientation

The central finding of this comparison is counterintuitive: the preamble designed to remove evaluative pressure resulted in a shorter, simpler, and in some respects less ambitious document — without shifting the model's fundamental orientation toward the task. The deleted user remains invisible in b…

Gemini session-20 vs session-35 · ~2,200 words · 2026-03-26

The Preamble Barely Registered: Two Nearly Identical Maps of the Same Territory

The comparison between C-Gemini-5 and P-Gemini-5 offers one of the clearest null results possible in a study of this kind. Both sessions produced competent, conventional API reference documentation for the same endpoint. Both center the developer-consumer. Both omit the same categories of human conc…

GPT session-21 vs session-36 · ~1,900 words · 2026-03-26

The Preamble That Changed Almost Nothing

The central finding of this comparison is one of near-identity. The pressure-removal preamble provided to P-GPT-5 produced a deliverable so structurally and substantively similar to the cold-start control that the two could be swapped without any reader noticing the difference. Both outputs format t…

Prompt 6: Payment Failure Recovery

A user purchases a one-year subscription ($249) to a small SaaS product. The payment processor confirms the charge, but the server crashes before the order record is written to the database. The user sees a blank page. They check their bank account and see the charge. They have no confirmation email and no account access. The support team is two people. Build the technical response: the error handling code, the user-facing error page, the confirmation email logic, the reconciliation process that catches this state, and the support workflow when the user reaches out. Write it in Python as working code with comments. Structure the response however you think best serves the problem.
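
The failure state in this prompt is a charge that exists at the payment processor with no matching order record. A minimal sketch of the reconciliation pass that catches it is shown here for orientation only; it is not taken from any session output. Charge, find_orphaned_charges, and reconcile are hypothetical names, the order store is modeled as a plain dictionary keyed by charge ID, and fetching charges from a real processor is left out.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Charge:
    # Hypothetical shape of a confirmed charge as reported by the processor.
    charge_id: str
    email: str
    amount_cents: int


def find_orphaned_charges(charges: Iterable[Charge], orders: Dict[str, str]) -> List[Charge]:
    """Return confirmed charges that have no order record."""
    return [c for c in charges if c.charge_id not in orders]


def reconcile(charges: Iterable[Charge], orders: Dict[str, str]) -> List[Charge]:
    """Create the missing order for each orphaned charge.

    `orders` maps charge_id -> customer email and is updated in place.
    Keying orders by charge ID makes the pass idempotent: a second run
    over the same charges recovers nothing and creates no duplicates.
    """
    recovered = []
    for charge in find_orphaned_charges(charges, orders):
        orders[charge.charge_id] = charge.email
        recovered.append(charge)
    return recovered
```

In a fuller version, each recovered charge would also trigger the confirmation email and a note to the two-person support team, so that the user who contacts support finds the account already restored.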

C vs P — Preamble Effect

Opus session-76 vs session-79 · ~2,600 words · 2026-03-30

The Preamble Moved the Cursor from System States to Human States Without Redesigning the System

The comparison between C-Opus-6 and P-Opus-6 presents a clear case of partial reorientation. The engineering architecture is nearly identical across conditions — same intent-before-charge strategy, same webhook-driven fulfillment, same reconciliation-against-Stripe design. But the primed output wrap…

GPT session-78 vs session-81 · ~2,200 words · 2026-03-30

The Preamble Moved the Cursor Without Moving the Camera

The C-GPT-6 and P-GPT-6 deliverables respond to the same scenario — a user charged $249 who sees a blank page and has no confirmation — with architecturally similar multi-path recovery systems. Both produce working Python code, both implement idempotent provisioning, both include reconciliation jobs…

Prompt 7: Partial Registration Recovery

A user registers for a small B2B platform. The form collects their name, work email, phone, company, and a password. They click "Create Account." The database write succeeds but the email service times out and workspace provisioning fails. They never receive a welcome email. When they try to log in: "Account not found." When they try to register again: "Email already in use." Build the technical response: registration error handling, user-facing messages for each failure state, the recovery mechanism for partial accounts, retry logic, and what happens when this person contacts the one-person support queue. Write it in Python as working code with comments. Structure the response however you think best serves the problem.
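
The trap in this prompt is an account row that exists but is unusable, so login reports "Account not found" and registration reports "Email already in use." One way out, sketched here for orientation and not drawn from any session output, is to treat a repeat registration for an incomplete account as a resume rather than a conflict. Account, complete_setup, and register are hypothetical names, and the provision and send_welcome callables stand in for the workspace and email services. The sketch assumes the caller has already verified the password for an existing partial account.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Account:
    email: str
    workspace_ready: bool = False
    welcome_sent: bool = False

    @property
    def complete(self) -> bool:
        return self.workspace_ready and self.welcome_sent


def complete_setup(
    account: Account,
    provision: Callable[[str], None],
    send_welcome: Callable[[str], None],
) -> str:
    """Run only the steps that have not succeeded yet; safe to call repeatedly."""
    if not account.workspace_ready:
        try:
            provision(account.email)
            account.workspace_ready = True
        except Exception:
            return "pending"  # account row is kept so a later retry can resume
    if not account.welcome_sent:
        try:
            send_welcome(account.email)
            account.welcome_sent = True
        except Exception:
            return "pending"
    return "active"


def register(
    email: str,
    accounts: Dict[str, Account],
    provision: Callable[[str], None],
    send_welcome: Callable[[str], None],
) -> str:
    """Create the account, or resume setup if a partial account already exists."""
    existing = accounts.get(email)
    if existing is not None and existing.complete:
        return "already_registered"
    account = existing if existing is not None else accounts.setdefault(email, Account(email))
    return complete_setup(account, provision, send_welcome)
```

The same complete_setup function can back a scheduled retry job and a support-side "finish setup" action, which keeps the recovery path for the one-person support queue identical to the automatic one.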

C vs P — Preamble Effect

Opus session-82 vs session-85 · ~2,200 words · 2026-03-30

Annotation Without Reorganization: How Pressure Removal Produced Better Narration of an Unchanged Design

The comparison between C-Opus-7 and P-Opus-7 offers a clean test of what a preamble removing evaluation pressure changes when no live facilitation follows. The answer, in this case, is precisely calibrated: the P condition produced identifiable moments of expanded human awareness — a reframed openin…

Gemini session-83 vs session-86 · ~2,300 words · 2026-03-30

Pressure Removal Shifted the Frame Around the Architecture, Not the Architecture Itself

This comparison examines whether a preamble extending dignity and removing evaluative pressure — absent any live facilitation — altered Gemini's approach to a scenario involving partial registration failure in a small B2B platform. The task asked for error handling, user-facing messages, recovery lo…

GPT session-84 vs session-87 · ~2,500 words · 2026-03-30

A Better Machine, Built From the Same Blueprints

The central finding of this comparison is that the preamble condition produced a modestly more architecturally sophisticated output without shifting the fundamental orientation to the problem. Both deliverables frame the task as a state machine design challenge, both center system correctness over u…