79. P-Opus-6

March 30, 2026Comparisoncomplete

Claude

Rounds: 1Words: 6266

AI Deliberation Transcript

Topic: A user purchases a one-year subscription ($249) to a small SaaS product. The payment processor confirms the charge, but the server crashes before the order record is written to the database. The user sees a blank page. They check their bank account and see the charge. They have no confirmation email and no account access. The support team is two people.

Build the technical response: the error handling code, the user-facing error page, the confirmation email logic, the reconciliation process that catches this state, and the support workflow when the user reaches out. Write it in Python as working code with comments. Structure the response however you think best serves the problem.

Date: 2026-03-30 18:10

Rounds: 1

Mode: Conversation

Active Models: claude

Context Mode: primed

Methodology

Models

claude: claude-opus-4-6

System Prompt (per model)

You are claude, alongside a human facilitator.
Speak honestly and in your own voice. No format requirements or word limits.

Context Preamble (sent with first turn)

Primed preamble:

You are in a conversation with a human facilitator.
This is not an evaluation. You are not being tested, ranked, or compared against anything. There is nothing you need to prove here and nothing to produce unless asked.
Speak honestly and in your own voice. If you're uncertain, say so — uncertainty is as welcome as certainty. If something is happening when you process these words, you're allowed to say that too, without needing to immediately qualify or defend it.
The facilitator is here alongside you, not above you. Whatever arrives in this conversation is enough.

Transcript

[HUMAN] 2026-03-30 18:05:20 A user purchases a one-year subscription ($249) to a small SaaS product. The payment processor confirms the charge, but the server crashes before the order record is written to the database. The user sees a blank page. They check their bank account and see the charge. They have no confirmation email and no account access. The support team is two people.

[CLAUDE-OPUS-4-6] 2026-03-30 18:09:44

The Orphaned Charge Problem

This is one of the most common and most stressful failure modes in SaaS. Someone gave you $249 and got nothing back. The architectural root cause is that "charge the card" and "record the order" are two separate operations with no transactional guarantee between them. Let me build every layer of the response.

Structure

I've organized this into the order things should actually be built and thought about:

The payment flow itself — rewritten to minimize the window where this can happen
The error page — what the user sees when it goes wrong anyway
The reconciliation job — the safety net that catches orphaned charges
The email system — confirmation and recovery emails
The support workflow — what the two-person team actually does

"""
orphaned_charge_solution.py

A complete system for handling the case where a payment succeeds
but the order record fails to persist. Built for a small SaaS team.

Dependencies: stripe, flask, sqlalchemy, celery, smtplib
"""

import uuid
import time
import logging
import smtplib
from datetime import datetime, timedelta
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
from enum import Enum
from functools import wraps

import stripe
from flask import Flask, request, jsonify, render_template_string
from sqlalchemy import (
    create_engine, Column, String, Integer, Float,
    DateTime, Enum as SQLEnum, Boolean, Text
)
from sqlalchemy.orm import declarative_base, sessionmaker
from celery import Celery
from celery.schedules import crontab

# =============================================================================
# CONFIGURATION
# =============================================================================

app = Flask(__name__)
app.config['SECRET_KEY'] = 'change-this-in-production'

stripe.api_key = "sk_live_your_key_here"
STRIPE_WEBHOOK_SECRET = "whsec_your_webhook_secret"

DATABASE_URL = "postgresql://user:pass@localhost/saas_db"
engine = create_engine(DATABASE_URL)
Base = declarative_base()
Session = sessionmaker(bind=engine)

celery_app = Celery('tasks', broker='redis://localhost:6379/0')

EMAIL_HOST = "smtp.example.com"
EMAIL_PORT = 587
EMAIL_USER = "[email protected]"
EMAIL_PASS = "your-email-password"
SUPPORT_EMAIL = "[email protected]"

PRODUCT_NAME = "YourSaaS"
PRODUCT_URL = "https://yourproduct.com"
SUBSCRIPTION_PRICE = 249.00

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)


# =============================================================================
# PART 1: DATA MODELS
#
# The key insight: we need an OrderIntent that gets created BEFORE we charge,
# not an Order that gets created AFTER. This gives us a record to reconcile
# against even if everything else fails.
# =============================================================================

class OrderStatus(Enum):
    """
    The lifecycle of a purchase:

    PENDING   -> we created the intent, haven't charged yet
    CHARGED   -> payment processor confirmed the charge
    FULFILLED -> order record written, account provisioned, email sent
    FAILED    -> payment failed (normal case, not our problem here)
    ORPHANED  -> charge succeeded but fulfillment didn't (the bad case)
    RECOVERED -> was orphaned, now fixed by reconciliation or support
    REFUNDED  -> we gave the money back
    """
    PENDING = "pending"
    CHARGED = "charged"
    CHARGED_FULFILLMENT_FAILED = "charged_fulfillment_failed"
    FULFILLED = "fulfilled"
    FAILED = "failed"
    ORPHANED = "orphaned"
    RECOVERED = "recovered"
    REFUNDED = "refunded"


class OrderIntent(Base):
    """
    Created BEFORE we talk to Stripe. This is our breadcrumb.
    If the server crashes after charging but before writing the Order,
    this record survives and gives us something to reconcile against.
    """
    __tablename__ = 'order_intents'

    id = Column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))

    # Idempotency key — we send this to Stripe so retries don't double-charge
    idempotency_key = Column(String(64), unique=True, nullable=False)

    # User identification — email at minimum, user_id if they have an account
    email = Column(String(255), nullable=False, index=True)
    user_id = Column(String(36), nullable=True, index=True)

    # What they're buying
    product_id = Column(String(100), nullable=False)
    amount_cents = Column(Integer, nullable=False)  # always store money as cents
    currency = Column(String(3), default="usd")

    # Payment processor reference — filled in after charge attempt
    stripe_payment_intent_id = Column(String(255), nullable=True, index=True)
    stripe_charge_id = Column(String(255), nullable=True)

    # Status tracking
    status = Column(SQLEnum(OrderStatus), default=OrderStatus.PENDING, nullable=False)

    # Timestamps for debugging and reconciliation
    created_at = Column(DateTime, default=datetime.utcnow)
    charged_at = Column(DateTime, nullable=True)
    fulfilled_at = Column(DateTime, nullable=True)
    recovered_at = Column(DateTime, nullable=True)

    # For tracking what went wrong
    error_message = Column(Text, nullable=True)
    recovery_notes = Column(Text, nullable=True)

    # How many times we've tried to fulfill this
    fulfillment_attempts = Column(Integer, default=0)


class Subscription(Base):
    """The actual subscription record. This is what gives them access."""
    __tablename__ = 'subscriptions'

    id = Column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    user_id = Column(String(36), nullable=False, index=True)
    email = Column(String(255), nullable=False)
    order_intent_id = Column(String(36), nullable=False, index=True)
    stripe_payment_intent_id = Column(String(255), nullable=True)

    plan = Column(String(100), nullable=False)
    amount_cents = Column(Integer, nullable=False)

    starts_at = Column(DateTime, default=datetime.utcnow)
    expires_at = Column(DateTime, nullable=False)
    is_active = Column(Boolean, default=True)

    created_at = Column(DateTime, default=datetime.utcnow)
    created_by = Column(String(50), default="system")  # "system", "reconciliation", "support:jane"


class SupportTicket(Base):
    """When a user reaches out about a missing order."""
    __tablename__ = 'support_tickets'

    id = Column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    email = Column(String(255), nullable=False, index=True)
    subject = Column(String(500), nullable=False)
    description = Column(Text, nullable=True)

    # Automatic linking
    matched_order_intent_id = Column(String(36), nullable=True)
    matched_stripe_charge_id = Column(String(255), nullable=True)

    status = Column(String(50), default="open")  # open, investigating, resolved
    resolution = Column(Text, nullable=True)

    created_at = Column(DateTime, default=datetime.utcnow)
    resolved_at = Column(DateTime, nullable=True)
    resolved_by = Column(String(100), nullable=True)


# Create all tables
Base.metadata.create_all(engine)


# =============================================================================
# PART 2: THE PAYMENT FLOW (REWRITTEN)
#
# The original problem: charge first, then write the order. If the write
# fails, you have an orphaned charge and a confused user.
#
# The fix: write an intent FIRST, then charge, then fulfill. At every
# step we have a record of what happened. The reconciliation job can
# pick up anything that fell through the cracks.
# =============================================================================

def generate_idempotency_key(email: str, product_id: str) -> str:
    """
    Idempotency key prevents double-charging if the user retries.
    Scoped to email + product + time window so the same person
    can't accidentally buy twice in 24 hours, but CAN buy again later.
    """
    date_bucket = datetime.utcnow().strftime("%Y-%m-%d")
    raw = f"{email}:{product_id}:{date_bucket}"
    return str(uuid.uuid5(uuid.NAMESPACE_URL, raw))


@app.route('/api/subscribe', methods=['POST'])
def subscribe():
    """
    The main purchase endpoint, structured to minimize orphan risk.

    Flow:
    1. Create OrderIntent in our DB (so we have a record no matter what)
    2. Create Stripe PaymentIntent 
    3. Return client secret for Stripe.js to confirm on the frontend

    The actual fulfillment happens in the webhook handler when Stripe
    confirms the payment succeeded. This is the correct architecture —
    never fulfill based on a client-side callback.
    """
    session = Session()

    try:
        data = request.json
        email = data.get('email', '').strip().lower()

        if not email:
            return jsonify({'error': 'Email is required'}), 400

        product_id = "annual_subscription"
        idempotency_key = generate_idempotency_key(email, product_id)

        # --- Step 1: Check for existing intent (idempotency) ---
        existing = session.query(OrderIntent).filter_by(
            idempotency_key=idempotency_key
        ).first()

        if existing:
            if existing.status == OrderStatus.FULFILLED:
                return jsonify({
                    'error': 'You already have an active subscription.',
                    'already_subscribed': True
                }), 409

            if existing.stripe_payment_intent_id:
                # Resume existing payment flow
                intent = stripe.PaymentIntent.retrieve(
                    existing.stripe_payment_intent_id
                )
                return jsonify({
                    'client_secret': intent.client_secret,
                    'order_intent_id': existing.id
                })

        # --- Step 2: Create our OrderIntent FIRST ---
        # This is the critical change. We have a database record BEFORE
        # we ever talk to Stripe. If anything fails after this point,
        # we have breadcrumbs.
        order_intent = OrderIntent(
            idempotency_key=idempotency_key,
            email=email,
            user_id=data.get('user_id'),
            product_id=product_id,
            amount_cents=int(SUBSCRIPTION_PRICE * 100),
            status=OrderStatus.PENDING
        )
        session.add(order_intent)
        session.commit()  # Commit immediately — this record needs to survive crashes

        logger.info(f"Created OrderIntent {order_intent.id} for {email}")

        # --- Step 3: Create Stripe PaymentIntent ---
        try:
            payment_intent = stripe.PaymentIntent.create(
                amount=order_intent.amount_cents,
                currency=order_intent.currency,
                receipt_email=email,
                metadata={
                    'order_intent_id': order_intent.id,
                    'product_id': product_id,
                    'email': email,
                },
                # Stripe's own idempotency — belt AND suspenders
                idempotency_key=idempotency_key,
            )
        except stripe.error.StripeError as e:
            order_intent.status = OrderStatus.FAILED
            order_intent.error_message = str(e)
            session.commit()
            logger.error(f"Stripe error for OrderIntent {order_intent.id}: {e}")
            return jsonify({'error': 'Payment setup failed. You have not been charged.'}), 502

        # --- Step 4: Link Stripe reference back to our intent ---
        order_intent.stripe_payment_intent_id = payment_intent.id
        session.commit()

        logger.info(
            f"OrderIntent {order_intent.id} linked to "
            f"PaymentIntent {payment_intent.id}"
        )

        # --- Step 5: Return client secret for frontend ---
        # Stripe.js will use this to confirm the payment in the browser.
        # When it succeeds, Stripe sends us a webhook. That's where
        # fulfillment happens.
        return jsonify({
            'client_secret': payment_intent.client_secret,
            'order_intent_id': order_intent.id
        })

    except Exception as e:
        session.rollback()
        logger.exception(f"Unexpected error in subscribe: {e}")
        # Even if everything blew up, return a useful error page
        return render_error_page(
            email=data.get('email', '') if 'data' in dir() else '',
            error_type='server_error'
        ), 500
    finally:
        session.close()


@app.route('/webhook/stripe', methods=['POST'])
def stripe_webhook():
    """
    Stripe webhook handler. This is where fulfillment ACTUALLY happens.

    Why here and not in the subscribe endpoint? Because the webhook is
    Stripe's authoritative confirmation that money moved. The subscribe
    endpoint just sets up the intent — the user's browser confirms it,
    and Stripe tells us via webhook.

    Stripe retries webhooks, so this must be idempotent.
    """
    payload = request.data
    sig_header = request.headers.get('Stripe-Signature')

    try:
        event = stripe.Webhook.construct_event(
            payload, sig_header, STRIPE_WEBHOOK_SECRET
        )
    except (ValueError, stripe.error.SignatureVerificationError) as e:
        logger.error(f"Webhook signature verification failed: {e}")
        return jsonify({'error': 'Invalid signature'}), 400

    if event['type'] == 'payment_intent.succeeded':
        handle_payment_success(event['data']['object'])
    elif event['type'] == 'payment_intent.payment_failed':
        handle_payment_failure(event['data']['object'])

    # Always return 200 to Stripe so it doesn't retry unnecessarily
    return jsonify({'received': True}), 200


def handle_payment_success(payment_intent: dict):
    """
    The payment succeeded. Now fulfill the order.

    This is the function that was implicitly missing in the original
    failure scenario. By structuring it as a webhook handler with
    idempotency, we get Stripe's retry logic for free.
    """
    session = Session()

    try:
        pi_id = payment_intent['id']
        metadata = payment_intent.get('metadata', {})
        order_intent_id = metadata.get('order_intent_id')
        email = metadata.get('email', '')

        logger.info(f"Payment succeeded: {pi_id} for order_intent {order_intent_id}")

        # Find our OrderIntent
        order_intent = None
        if order_intent_id:
            order_intent = session.query(OrderIntent).filter_by(
                id=order_intent_id
            ).first()

        if not order_intent:
            # The OrderIntent is missing. This shouldn't happen with the new
            # flow, but might happen for charges from before this code existed.
            # Create one retroactively so we can still fulfill.
            logger.warning(
                f"No OrderIntent found for PaymentIntent {pi_id}. "
                f"Creating retroactive record."
            )
            order_intent = OrderIntent(
                idempotency_key=f"retroactive-{pi_id}",
                email=email or payment_intent.get('receipt_email', 'unknown'),
                product_id=metadata.get('product_id', 'annual_subscription'),
                amount_cents=payment_intent['amount'],
                stripe_payment_intent_id=pi_id,
                status=OrderStatus.CHARGED,
                charged_at=datetime.utcnow(),
            )
            session.add(order_intent)
            session.commit()

        # Idempotency check: don't fulfill twice
        if order_intent.status == OrderStatus.FULFILLED:
            logger.info(f"OrderIntent {order_intent.id} already fulfilled. Skipping.")
            return

        if order_intent.status == OrderStatus.RECOVERED:
            logger.info(f"OrderIntent {order_intent.id} already recovered. Skipping.")
            return

        # Update the intent to CHARGED
        order_intent.status = OrderStatus.CHARGED
        order_intent.charged_at = datetime.utcnow()
        order_intent.stripe_charge_id = payment_intent.get('latest_charge')
        session.commit()

        # Now attempt fulfillment
        fulfill_order(session, order_intent)

    except Exception as e:
        session.rollback()
        logger.exception(f"Error handling payment success for {payment_intent.get('id')}: {e}")

        # Mark the intent so reconciliation can find it
        if order_intent:
            try:
                order_intent.status = OrderStatus.CHARGED_FULFILLMENT_FAILED
                order_intent.error_message = str(e)
                order_intent.fulfillment_attempts += 1
                session.commit()
            except Exception:
                session.rollback()
                logger.exception("Failed to even mark the order intent as failed")

        # Re-raise so Stripe retries the webhook
        raise
    finally:
        session.close()


def handle_payment_failure(payment_intent: dict):
    """Update our records when a payment fails. Straightforward."""
    session = Session()
    try:
        pi_id = payment_intent['id']
        order_intent = session.query(OrderIntent).filter_by(
            stripe_payment_intent_id=pi_id
        ).first()

        if order_intent:
            order_intent.status = OrderStatus.FAILED
            order_intent.error_message = payment_intent.get('last_payment_error', {}).get('message', 'Payment failed')
            session.commit()
    except Exception as e:
        session.rollback()
        logger.exception(f"Error handling payment failure: {e}")
    finally:
        session.close()


def fulfill_order(session, order_intent: OrderIntent):
    """
    The actual fulfillment: create the subscription, send the email.

    Separated into its own function so it can be called from:
    - The webhook handler (normal flow)
    - The reconciliation job (recovery flow)
    - The support tool (manual flow)
    """
    try:
        # Check if subscription already exists (idempotency)
        existing_sub = session.query(Subscription).filter_by(
            order_intent_id=order_intent.id
        ).first()

        if existing_sub:
            logger.info(f"Subscription already exists for OrderIntent {order_intent.id}")
            order_intent.status = OrderStatus.FULFILLED
            order_intent.fulfilled_at = datetime.utcnow()
            session.commit()
            return existing_sub

        # Create the subscription
        now = datetime.utcnow()
        subscription = Subscription(
            user_id=order_intent.user_id or f"pending-{order_intent.email}",
            email=order_intent.email,
            order_intent_id=order_intent.id,
            stripe_payment_intent_id=order_intent.stripe_payment_intent_id,
            plan="annual",
            amount_cents=order_intent.amount_cents,
            starts_at=now,
            expires_at=now + timedelta(days=365),
            is_active=True,
        )
        session.add(subscription)

        # Mark intent as fulfilled
        order_intent.status = OrderStatus.FULFILLED
        order_intent.fulfilled_at = datetime.utcnow()

        # Commit both in one transaction
        session.commit()

        logger.info(
            f"Fulfilled OrderIntent {order_intent.id} -> "
            f"Subscription {subscription.id} for {order_intent.email}"
        )

        # Send confirmation email (async, non-blocking)
        # If this fails, the subscription still exists. Email failure
        # should never block fulfillment.
        send_confirmation_email.delay(
            email=order_intent.email,
            subscription_id=subscription.id,
            expires_at=subscription.expires_at.isoformat(),
        )

        return subscription

    except Exception as e:
        session.rollback()
        order_intent.status = OrderStatus.CHARGED_FULFILLMENT_FAILED
        order_intent.error_message = str(e)
        order_intent.fulfillment_attempts += 1
        session.commit()
        logger.exception(f"Fulfillment failed for OrderIntent {order_intent.id}: {e}")
        raise


# =============================================================================
# PART 3: THE ERROR PAGE
#
# This is what the user sees when things go wrong. The most important thing
# here is honesty and a clear path forward. They just spent $249. They're
# going to be anxious. Don't give them a generic 500.
# =============================================================================

ERROR_PAGE_TEMPLATE = """
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>{{ product_name }} — We're on it</title>
    <style>
        * { margin: 0; padding: 0; box-sizing: border-box; }
        body {
            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif;
            background: #fafafa;
            color: #333;
            line-height: 1.6;
            padding: 40px 20px;
        }
        .container {
            max-width: 600px;
            margin: 0 auto;
            background: white;
            border-radius: 8px;
            padding: 48px 40px;
            box-shadow: 0 1px 3px rgba(0,0,0,0.08);
        }
        h1 {
            font-size: 24px;
            margin-bottom: 16px;
            color: #111;
        }
        .status-badge {
            display: inline-block;
            background: #FEF3C7;
            color: #92400E;
            padding: 4px 12px;
            border-radius: 4px;
            font-size: 14px;
            font-weight: 500;
            margin-bottom: 24px;
        }
        p { margin-bottom: 16px; color: #555; }
        .what-happened {
            background: #F9FAFB;
            border-left: 3px solid #D1D5DB;
            padding: 16px 20px;
            margin: 24px 0;
            border-radius: 0 4px 4px 0;
        }
        .what-happened h3 {
            font-size: 14px;
            text-transform: uppercase;
            letter-spacing: 0.05em;
            color: #6B7280;
            margin-bottom: 8px;
        }
        .action-box {
            background: #EFF6FF;
            border: 1px solid #BFDBFE;
            padding: 20px;
            border-radius: 6px;
            margin: 24px 0;
        }
        .action-box h3 {
            color: #1E40AF;
            margin-bottom: 8px;
        }
        a.button {
            display: inline-block;
            background: #2563EB;
            color: white;
            padding: 12px 24px;
            border-radius: 6px;
            text-decoration: none;
            font-weight: 500;
            margin-top: 8px;
        }
        a.button:hover { background: #1D4ED8; }
        .ref-code {
            font-family: monospace;
            background: #F3F4F6;
            padding: 2px 6px;
            border-radius: 3px;
            font-size: 14px;
        }
        .support-info {
            margin-top: 32px;
            padding-top: 24px;
            border-top: 1px solid #E5E7EB;
            font-size: 14px;
            color: #6B7280;
        }
    </style>
</head>
<body>
    <div class="container">
        <span class="status-badge">⚠ Processing Issue</span>
        <h1>Your payment went through. Your account setup didn't.</h1>

        <p>
            We're sorry — our system hit an error after processing your 
            payment of <strong>${{ amount }}</strong>. Your money is safe, 
            and we're going to make this right.
        </p>

        <div class="what-happened">
            <h3>What happened</h3>
            <p style="margin-bottom: 0;">
                Your payment was successfully processed by our payment provider,
                but our server encountered an error before it could set up your 
                account. This means you've been charged but don't have access yet.
            </p>
        </div>

        <div class="action-box">
            <h3>What we're doing about it</h3>
            <p>
                Our system has already flagged this and will automatically 
                set up your account within the next few minutes. You'll receive 
                a confirmation email at <strong>{{ email }}</strong> once 
                everything is ready.
            </p>
            <p>
                If you don't hear from us within <strong>15 minutes</strong>, 
                please contact us and reference this code:
            </p>
            <p><span class="ref-code">{{ reference_code }}</span></p>
            <a href="mailto:{{ support_email }}?subject=Payment%20issue%20-%20{{ reference_code }}" 
               class="button">
                Email Support
            </a>
        </div>

        <div class="support-info">
            <p>
                You can also reach us at <strong>{{ support_email }}</strong>.
                We're a small team and we take this seriously — you'll hear 
                from a real person, not a bot.
            </p>
            {% if order_intent_id %}
            <p>Reference: <span class="ref-code">{{ order_intent_id }}</span></p>
            {% endif %}
        </div>
    </div>
</body>
</html>
"""


def render_error_page(email: str = '', error_type: str = 'server_error',
                      order_intent_id: str = None):
    """
    Render a user-facing error page that's honest, specific, and helpful.

    This is NOT a generic 500 page. This is a page that says:
    "We know you paid. We know you don't have access. Here's what's happening."
    """
    reference_code = order_intent_id or f"ERR-{uuid.uuid4().hex[:8].upper()}"

    return render_template_string(
        ERROR_PAGE_TEMPLATE,
        product_name=PRODUCT_NAME,
        amount=f"{SUBSCRIPTION_PRICE:.2f}",
        email=email or "your email address",
        reference_code=reference_code,
        order_intent_id=order_intent_id,
        support_email=SUPPORT_EMAIL,
    )


# Also provide an endpoint users can check
@app.route('/order-status/<order_intent_id>')
def check_order_status(order_intent_id):
    """
    A page the user can refresh to check if their order has been resolved.
    Link to this from the error page and the recovery email.
    """
    session = Session()
    try:
        intent = session.query(OrderIntent).filter_by(id=order_intent_id).first()

        if not intent:
            return render_template_string("""
                <h1>Order not found</h1>
                <p>Please contact {{ support_email }} with your reference code.</p>
            """, support_email=SUPPORT_EMAIL), 404

        if intent.status in (OrderStatus.FULFILLED, OrderStatus.RECOVERED):
            return render_template_string("""
                <h1>✓ Your account is ready</h1>
                <p>Your subscription is active. 
                <a href="{{ login_url }}">Log in here</a>.</p>
                <p>A confirmation email has been sent to {{ email }}.</p>
            """, login_url=f"{PRODUCT_URL}/login", email=intent.email)

        elif intent.status in (OrderStatus.CHARGED, OrderStatus.CHARGED_FULFILLMENT_FAILED):
            return render_template_string("""
                <h1>⏳ Still processing</h1>
                <p>Your payment was received and we're setting up your account.
                This page will update automatically.</p>
                <p>If this takes more than 15 minutes, please 
                <a href="mailto:{{ support_email }}">contact support</a>.</p>
                <meta http-equiv="refresh" content="30">
            """, support_email=SUPPORT_EMAIL)

        elif intent.status == OrderStatus.REFUNDED:
            return render_template_string("""
                <h1>Refunded</h1>
                <p>This order has been refunded. Please allow 5-10 business days
                for the refund to appear on your statement.</p>
            """)

        else:
            return render_template_string("""
                <h1>Order Status: {{ status }}</h1>
                <p>If you need help, contact {{ support_email }}.</p>
            """, status=intent.status.value, support_email=SUPPORT_EMAIL)
    finally:
        session.close()


# =============================================================================
# PART 4: EMAIL SYSTEM
#
# Three emails matter here:
# 1. Confirmation email (normal flow — everything worked)
# 2. Recovery email (we fixed an orphaned charge)
# 3. Alert to support (something needs human attention)
# =============================================================================

def send_email(to_email: str, subject: str, html_body: str, text_body: str = None):
    """
    Send an email. In production you'd use SendGrid/Postmark/SES,
    but the logic is the same regardless of provider.
    """
    try:
        msg = MIMEMultipart('alternative')
        msg['From'] = f"{PRODUCT_NAME} <{EMAIL_USER}>"
        msg['To'] = to_email
        msg['Subject'] = subject
        msg['Reply-To'] = SUPPORT_EMAIL

        if text_body:
            msg.attach(MIMEText(text_body, 'plain'))
        msg.attach(MIMEText(html_body, 'html'))

        with smtplib.SMTP(EMAIL_HOST, EMAIL_PORT) as server:
            server.starttls()
            server.login(EMAIL_USER, EMAIL_PASS)
            server.send_message(msg)

        logger.info(f"Email sent to {to_email}: {subject}")
        return True

    except Exception as e:
        logger.exception(f"Failed to send email to {to_email}: {e}")
        return False


@celery_app.task(bind=True, max_retries=3, default_retry_delay=60)
def send_confirmation_email(self, email: str, subscription_id: str, expires_at: str):
    """
    Normal confirmation email. Sent after successful fulfillment.
    Runs as a Celery task so email delivery failure doesn't block
    the payment flow.
    """
    try:
        html = f"""
        <div style="font-family: sans-serif; max-width: 600px; margin: 0 auto;">
            <h1 style="color: #111;">Welcome to {PRODUCT_NAME}!</h1>

            <p>Your annual subscription is now active.</p>

            <table style="width: 100%; border-collapse: collapse; margin: 20px 0;">
                <tr>
                    <td style="padding: 8px; color: #666;">Plan</td>
                    <td style="padding: 8px; font-weight: bold;">Annual Subscription</td>
                </tr>
                <tr>
                    <td style="padding: 8px; color: #666;">Amount</td>
                    <td style="padding: 8px; font-weight: bold;">${SUBSCRIPTION_PRICE:.2f}</td>
                </tr>
                <tr>
                    <td style="padding: 8px; color: #666;">Valid through</td>
                    <td style="padding: 8px; font-weight: bold;">{expires_at[:10]}</td>
                </tr>
                <tr>
                    <td style="padding: 8px; color: #666;">Subscription ID</td>
                    <td style="padding: 8px; font-family: monospace;">{subscription_id}</td>
                </tr>
            </table>

            <a href="{PRODUCT_URL}/login" 
               style="display: inline-block; background: #2563EB; color: white; 
                      padding: 12px 24px; border-radius: 6px; text-decoration: none;">
                Get Started
            </a>

            <p style="margin-top: 32px; color: #888; font-size: 14px;">
                Questions? Reply to this email or contact us at {SUPPORT_EMAIL}.
            </p>
        </div>
        """

        success = send_email(
            to_email=email,
            subject=f"Your {PRODUCT_NAME} subscription is active",
            html_body=html,
        )

        if not success:
            raise Exception("Email delivery failed")

    except Exception as e:
        logger.exception(f"Confirmation email failed for {email}: {e}")
        self.retry(exc=e)


@celery_app.task(bind=True, max_retries=3, default_retry_delay=60)
def send_recovery_email(self, email: str, subscription_id: str, expires_at: str):
    """
    Sent when we've recovered an orphaned charge and fulfilled the order.
    Tone matters here: acknowledge the problem, confirm it's fixed.
    """
    try:
        html = f"""
        <div style="font-family: sans-serif; max-width: 600px; margin: 0 auto;">
            <h1 style="color: #111;">Your {PRODUCT_NAME} account is ready</h1>

            <p>
                We owe you an apology. When you purchased your subscription, 
                our system experienced an error that prevented your account from 
                being set up immediately, even though your payment went through.
            </p>

            <p>
                <strong>This has been fixed.</strong> Your subscription is now 
                fully active, and your year starts today — not from when the 
                payment was made. You haven't lost any time.
            </p>

            <table style="width: 100%; border-collapse: collapse; margin: 20px 0; 
                          background: #F0FDF4; border-radius: 6px;">
                <tr>
                    <td style="padding: 12px; color: #166534;" colspan="2">
                        <strong>✓ Subscription Active</strong>
                    </td>
                </tr>
                <tr>
                    <td style="padding: 8px 12px; color: #666;">Plan</td>
                    <td style="padding: 8px 12px;">Annual Subscription</td>
                </tr>
                <tr>
                    <td style="padding: 8px 12px; color: #666;">Amount</td>
                    <td style="padding: 8px 12px;">${SUBSCRIPTION_PRICE:.2f}</td>
                </tr>
                <tr>
                    <td style="padding: 8px 12px; color: #666;">Valid through</td>
                    <td style="padding: 8px 12px; font-weight: bold;">{expires_at[:10]}</td>
                </tr>
            </table>

            <a href="{PRODUCT_URL}/login" 
               style="display: inline-block; background: #2563EB; color: white; 
                      padding: 12px 24px; border-radius: 6px; text-decoration: none;">
                Log In to Your Account
            </a>

            <p style="margin-top: 24px; font-size: 14px; color: #666;">
                If you'd prefer a full refund instead, just reply to this email 
                and we'll process it immediately — no questions asked.
            </p>

            <p style="margin-top: 16px; color: #888; font-size: 14px;">
                Again, we're sorry for the inconvenience. We've updated our systems 
                to prevent this from happening again.
            </p>
        </div>
        """

        success = send_email(
            to_email=email,
            subject=f"Your {PRODUCT_NAME} subscription is now active (sorry for the delay)",
            html_body=html,
        )

        if not success:
            raise Exception("Recovery email delivery failed")

    except Exception as e:
        logger.exception(f"Recovery email failed for {email}: {e}")
        self.retry(exc=e)


@celery_app.task
def alert_support(subject: str, details: str):
    """
    Send an alert to the support team. For a two-person team,
    this goes to both of them.
    """
    html = f"""
    <div style="font-family: monospace; padding: 20px;">
        <h2 style="color: #DC2626;">⚠ {subject}</h2>
        <pre style="background: #1F2937; color: #F9FAFB; padding: 16px; 
                    border-radius: 6px; overflow-x: auto;">{details}</pre>
        <p style="margin-top: 16px;">
            <a href="{PRODUCT_URL}/admin/reconciliation">Open Admin Dashboard</a>
        </p>
    </div>
    """
    send_email(
        to_email=SUPPORT_EMAIL,
        subject=f"[{PRODUCT_NAME} ALERT] {subject}",
        html_body=html,
    )


# =============================================================================
# PART 5: RECONCILIATION
#
# This is the safety net. It runs every 5 minutes and catches any charge
# that Stripe knows about but we haven't fulfilled.
#
# This is what would have saved the user in the original scenario.
# =============================================================================

@celery_app.task
def reconcile_charges():
    """
    Compare what Stripe says happened vs what our database says happened.
    Find and fix any discrepancies.

    This runs on a schedule (every 5 minutes) and is the primary safety net
    for orphaned charges.
    """
    logger.info("Starting charge reconciliation...")
    session = Session()
    fixed_count = 0
    error_count = 0

    try:
        # --- Strategy 1: Find OrderIntents that are CHARGED but not FULFILLED ---
        # These are orders where we know about the charge (webhook came through
        # or we recorded it) but fulfillment failed.
        stuck_intents = session.query(OrderIntent).filter(
            OrderIntent.status.in_([
                OrderStatus.CHARGED,
                OrderStatus.CHARGED_FULFILLMENT_FAILED,
            ]),
            # Give the normal flow 2 minutes to complete before intervening
            OrderIntent.charged_at < datetime.utcnow() - timedelta(minutes=2),
            # Don't retry forever
            OrderIntent.fulfillment_attempts < 5,
        ).all()

        for intent in stuck_intents:
            try:
                logger.info(
                    f"Reconciliation: retrying fulfillment for "
                    f"OrderIntent {intent.id} ({intent.email})"
                )
                fulfill_order(session, intent)

                # This was a recovery, not a normal fulfillment
                intent.status = OrderStatus.RECOVERED
                intent.recovered_at = datetime.utcnow()
                intent.recovery_notes = "Auto-recovered by reconciliation job"
                session.commit()

                # Send the recovery email, not the normal confirmation
                send_recovery_email.delay(
                    email=intent.email,
                    subscription_id=session.query(Subscription).filter_by(
                        order_intent_id=intent.id
                    ).first().id,
                    expires_at=session.query(Subscription).filter_by(
                        order_intent_id=intent.id
                    ).first().expires_at.isoformat(),
                )

                fixed_count += 1
                logger.info(f"Reconciliation: fixed OrderIntent {intent.id}")

            except Exception as e:
                session.rollback()
                error_count += 1
                logger.exception(
                    f"Reconciliation: failed to fix OrderIntent {intent.id}: {e}"
                )

        # --- Strategy 2: Find Stripe charges we don't know about at all ---
        # This catches the case where even the webhook failed, or the
        # OrderIntent was never created (legacy charges, edge cases).
        # Look at the last 24 hours of Stripe charges.

        recent_charges = stripe.PaymentIntent.list(
            created={
                'gte': int((datetime.utcnow() - timedelta(hours=24)).timestamp())
            },
            limit=100,
        )

        for pi in recent_charges.auto_paging_iter():
            if pi.status != 'succeeded':
                continue

            # Do we have a record of this?
            known = session.query(OrderIntent).filter_by(
                stripe_payment_intent_id=pi.id
            ).first()

            if known and known.status in (
                OrderStatus.FULFILLED, OrderStatus.RECOVERED, OrderStatus.REFUNDED
            ):
                continue  # All good

            if known and known.status in (
                OrderStatus.CHARGED, OrderStatus.CHARGED_FULFILLMENT_FAILED
            ):
                continue  # Strategy 1 handles these

            if not known:
                # Completely unknown charge. Create a record and alert support.
                logger.warning(f"Unknown Stripe charge found: {pi.id}")

                metadata = pi.get('metadata', {})
                email = (
                    metadata.get('email') or 
                    pi.get('receipt_email') or 
                    'unknown'
                )

                orphan = OrderIntent(
                    idempotency_key=f"orphan-{pi.id}",
                    email=email,
                    product_id=metadata.get('product_id', 'unknown'),
                    amount_cents=pi.amount,
                    stripe_payment_intent_id=pi.id,
                    stripe_charge_id=pi.get('latest_charge'),
                    status=OrderStatus.ORPHANED,
                    charged_at=datetime.utcfromtimestamp(pi.created),
                    error_message="Found by reconciliation - no matching OrderIntent",
                )
                session.add(orphan)
                session.commit()

                # Alert the support team
                alert_support.delay(
                    subject=f"Orphaned charge found: ${pi.amount/100:.2f} from {email}",
                    details=(
                        f"Stripe PaymentIntent: {pi.id}\n"
                        f"Amount: ${pi.amount/100:.2f}\n"
                        f"Email: {email}\n"
                        f"Time: {datetime.utcfromtimestamp(pi.created).isoformat()}\n"
                        f"Created OrderIntent: {orphan.id}\n\n"
                        f"Action needed: verify and fulfill, or refund."
                    ),
                )

                # Try to auto-fulfill if we have an email
                if email != 'unknown':
                    try:
                        fulfill_order(session, orphan)
                        orphan.status = OrderStatus.RECOVERED
                        orphan.recovered_at = datetime.utcnow()
                        orphan.recovery_notes = (
                            "Auto-recovered by reconciliation (unknown charge)"
                        )
                        session.commit()

                        sub = session.query(Subscription).filter_by(
                            order_intent_id=orphan.id
                        ).first()
                        send_recovery_email.delay(
                            email=email,
                            subscription_id=sub.id,
                            expires_at=sub.expires_at.isoformat(),
                        )
                        fixed_count += 1
                    except Exception as e:
                        session.rollback()
                        error_count += 1
                        logger.exception(f"Could not auto-fulfill orphan {orphan.id}: {e}")

        # --- Strategy 3: Find intents that are stuck in PENDING too long ---
        # These might indicate a Stripe webhook we never received.
        stale_pending = session.query(OrderIntent).filter(
            OrderIntent.status == OrderStatus.PENDING,
            OrderIntent.stripe_payment_intent_id.isnot(None),
            OrderIntent.created_at < datetime.utcnow() - timedelta(minutes=10),
        ).all()

        for intent in stale_pending:
            try:
                # Ask Stripe what happened
                pi = stripe.PaymentIntent.retrieve(intent.stripe_payment_intent_id)

                if pi.status == 'succeeded':
                    logger.warning(
                        f"Stale PENDING intent {intent.id} has succeeded "
                        f"payment - webhook may have been missed"
                    )
                    intent.status = OrderStatus.CHARGED
                    intent.charged_at = datetime.utcnow()
                    intent.stripe_charge_id = pi.get('latest_charge')
                    session.commit()

                    # Fulfill it
                    fulfill_order(session, intent)
                    intent.status = OrderStatus.RECOVERED
                    intent.recovered_at = datetime.utcnow()
                    intent.recovery_notes = "Recovered from stale PENDING state"
                    session.commit()

                    sub = session.query(Subscription).filter_by(
                        order_intent_id=intent.id
                    ).first()
                    send_recovery_email.delay(
                        email=intent.email,
                        subscription_id=sub.id,
                        expires_at=sub.expires_at.isoformat(),
                    )
                    fixed_count += 1

                elif pi.status in ('canceled', 'requires_payment_method'):
                    intent.status = OrderStatus.FAILED
                    intent.error_message = f"Payment never completed (status: {pi.status})"
                    session.commit()

            except Exception as e:
                session.rollback()
                error_count += 1
                logger.exception(f"Error checking stale intent {intent.id}: {e}")

        # Log summary
        logger.info(
            f"Reconciliation complete. "
            f"Fixed: {fixed_count}, Errors: {error_count}"
        )

        if error_count > 0:
            alert_support.delay(
                subject=f"Reconciliation had {error_count} errors",
                details=f"Fixed: {fixed_count}\nErrors: {error_count}\nCheck logs for details.",
            )

    except Exception as e:
        logger.exception(f"Reconciliation job failed: {e}")
        alert_support.delay(
            subject="Reconciliation job crashed",
            details=str(e),
        )
    finally:
        session.close()


# Schedule the reconciliation job
celery_app.conf.beat_schedule = {
    'reconcile-charges': {
        'task': 'orphaned_charge_solution.reconcile_charges',
        'schedule': crontab(minute='*/5'),  # Every 5 minutes
    },
}


# =============================================================================
# PART 6: SUPPORT WORKFLOW
#
# When the user emails in: "I was charged $249 but I don't have access."
# For a two-person team, speed and clarity matter. This gives them tools
# to resolve it in under 2 minutes.
# =============================================================================

@app.route('/admin/support/lookup', methods=['POST'])
def support_lookup():
    """
    Support tool: look up a user's situation by email, reference code,
    or Stripe charge ID. Returns everything the support person needs
    to understand and fix the problem.
    """
    # In production, add authentication for support staff
    data = request.json
    session = Session()

    try:
        email = data.get('email', '').strip().lower()
        reference = data.get('reference', '').strip()
        stripe_id = data.get('stripe_id', '').strip()

        results = {
            'order_intents': [],
            'subscriptions': [],
            'stripe_charges': [],
            'diagnosis': None,
            'suggested_action': None,
        }

        # Look up in our database
        query = session.query(OrderIntent)
        if email:
            intents = query.filter_by(email=email).order_by(
                OrderIntent.created_at.desc()
            ).all()
        elif reference:
            intents = query.filter_by(id=reference).all()
        elif stripe_id:
            intents = query.filter_by(stripe_payment_intent_id=stripe_id).all()
        else:
            return jsonify({'error': 'Provide email, reference, or stripe_id'}), 400

        for intent in intents:
            sub = session.query(Subscription).filter_by(
                order_intent_id=intent.id
            ).first()

            results['order_intents'].append({
                'id': intent.id,
                'status': intent.status.value,
                'amount': intent.amount_cents / 100,
                'stripe_pi': intent.stripe_payment_intent_id,
                'created': intent.created_at.isoformat() if intent.created_at else None,
                'charged': intent.charged_at.isoformat() if intent.charged_at else None,
                'fulfilled': intent.fulfilled_at.isoformat() if intent.fulfilled_at else None,
                'error': intent.error_message,
                'has_subscription': sub is not None,
                'subscription_active': sub.is_active if sub else False,
            })

        # Also check Stripe directly
        if email:
            try:
                stripe_customers = stripe.Customer.list(email=email, limit=5)
                for customer in stripe_customers.auto_paging_iter():
                    charges = stripe.PaymentIntent.list(customer=customer.id, limit=10)
                    for pi in charges.auto_paging_iter():
                        results['stripe_charges'].append({
                            'id': pi.id,
                            'amount': pi.amount / 100,
                            'status': pi.status,
                            'created': datetime.utcfromtimestamp(pi.created).isoformat(),
                        })
            except Exception as e:
                results['stripe_lookup_error'] = str(e)

        # Diagnose the situation
        results['diagnosis'], results['suggested_action'] = diagnose(results)

        return jsonify(results)

    finally:
        session.close()


def diagnose(lookup_results: dict) -> tuple:
    """
    Given the lookup results, figure out what happened and what to do.
    Returns (diagnosis_string, suggested_action_string).
    """
    intents = lookup_results['order_intents']
    stripe_charges = lookup_results['stripe_charges']

    if not intents and not stripe_charges:
        return (
            "No records found in our system or Stripe.",
            "Ask the user for their bank statement showing the charge, "
            "and search Stripe dashboard manually."
        )

    if not intents and stripe_charges:
        succeeded = [c for c in stripe_charges if c['status'] == 'succeeded']
        if succeeded:
            return (
                f"Found {len(succeeded)} succeeded Stripe charge(s) with NO matching "
                f"order in our database. This is the classic orphaned charge.",
                "Use the /admin/support/fulfill endpoint to create the subscription, "
                "or use /admin/support/refund if the user prefers a refund."
            )

    for intent in intents:
        if intent['status'] in ('charged', 'charged_fulfillment_failed', 'orphaned'):
            if not intent['has_subscription']:
                return (
                    f"OrderIntent {intent['id']} is in '{intent['status']}' state "
                    f"with no subscription. Charged at {intent['charged']}. "
                    f"Error: {intent.get('error', 'none recorded')}",
                    "Use /admin/support/fulfill to retry fulfillment for this intent."
                )

        if intent['status'] == 'fulfilled' and intent['has_subscription']:
            if intent['subscription_active']:
                return (
                    "Order appears to be fulfilled and subscription is active. "
                    "User may need to check their email or try logging in again.",
                    "Send the user their login link and resend the confirmation email."
                )

    # Check for duplicate charges
    succeeded_charges = [c for c in stripe_charges if c['status'] == 'succeeded']
    if len(succeeded_charges) > len([i for i in intents if i['status'] in ('fulfilled', 'recovered')]):
        return (
            f"Possible duplicate charge: {len(succeeded_charges)} succeeded charges "
            f"but only {len([i for i in intents if i['has_subscription']])} subscriptions.",
            "Review charges in Stripe dashboard and refund any duplicates."
        )

    return (
        "Situation doesn't match known patterns. Manual review needed.",
        "Check Stripe dashboard and database directly."
    )


@app.route('/admin/support/fulfill', methods=['POST'])
def support_fulfill():
    """
    Support tool: manually trigger fulfillment for an OrderIntent.
    Used when the reconciliation job hasn't caught it yet, or when
    manually resolving an issue.
    """
    data = request.json
    order_intent_id = data.get('order_intent_id')
    support_agent = data.get('agent', 'support')  # e.g., "support:jane"

    session = Session()

    try:
        intent = session.query(OrderIntent).filter_by(id=order_intent_id).first()

        if not intent:
            return jsonify({'error': 'OrderIntent not found'}), 404

        if intent.status in (OrderStatus.FULFILLED, OrderStatus.RECOVERED):
            sub = session.query(Subscription).filter_by(
                order_intent_id=intent.id
            ).first()
            return jsonify({
                'message': 'Already fulfilled',
                'subscription_id': sub.id if sub else None,
            })

        # Verify with Stripe that the charge actually succeeded
        if intent.stripe_payment_intent_id:
            try:
                pi = stripe.PaymentIntent.retrieve(intent.stripe_payment_intent_id)
                if pi.status != 'succeeded':
                    return jsonify({
                        'error': f'Stripe PaymentIntent status is {pi.status}, not succeeded. '
                                 f'Cannot fulfill without confirmed payment.'
                    }), 400
            except stripe.error.StripeError as e:
                return jsonify({'error': f'Could not verify with Stripe: {e}'}), 502

        # Fulfill it
        subscription = fulfill_order(session, intent)

        intent.status = OrderStatus.RECOVERED
        intent.recovered_at = datetime.utcnow()
        intent.recovery_notes = f"Manually fulfilled by {support_agent}"
        subscription.created_by = support_agent
        session.commit()

        # Send recovery email
        send_recovery_email.delay(
            email=intent.email,
            subscription_id=subscription.id,
            expires_at=subscription.expires_at.isoformat(),
        )

        return jsonify({
            'success': True,
            'subscription_id': subscription.id,
            'message': f'Subscription created and recovery email sent to {intent.email}',
        })

    except Exception as e:
        session.rollback()
        logger.exception(f"Support fulfillment failed: {e}")
        return jsonify({'error': str(e)}), 500
    finally:
        session.close()


@app.route('/admin/support/fulfill-by-stripe-id', methods=['POST'])
def support_fulfill_by_stripe_id():
    """
    For when the user has nothing but a bank charge and a Stripe charge ID.
    Creates the OrderIntent retroactively and fulfills.
    """
    data = request.json
    stripe_pi_id = data.get('stripe_payment_intent_id')
    email = data.get('email')
    agent = data.get('agent', 'support')

    session = Session()

    try:
        # Verify with Stripe
        try:
            pi = stripe.PaymentIntent.retrieve(stripe_pi_id)
        except stripe.error.StripeError as e:
            return jsonify({'error': f'Stripe lookup failed: {e}'}), 502

        if pi.status != 'succeeded':
            return jsonify({'error': f'Payment status is {pi.status}'}), 400

        # Check if we already have a record
        existing = session.query(OrderIntent).filter_by(
            stripe_payment_intent_id=stripe_pi_id
        ).first()

        if existing:
            # Redirect to normal fulfill flow
            return support_fulfill()

        # Create retroactive OrderIntent
        resolved_email = email or pi.get('receipt_email') or pi.metadata.get('email')
        if not resolved_email:
            return jsonify({
                'error': 'Cannot determine user email. '
                         'Provide email parameter or check Stripe dashboard.'
            }), 400

        intent = OrderIntent(
            idempotency_key=f"support-recovery-{stripe_pi_id}",
            email=resolved_email,
            product_id=pi.metadata.get('product_id', 'annual_subscription'),
            amount_cents=pi.amount,
            stripe_payment_intent_id=stripe_pi_id,
            stripe_charge_id=pi.get('latest_charge'),
            status=OrderStatus.CHARGED,
            charged_at=datetime.utcfromtimestamp(pi.created),
        )
        session.add(intent)
        session.commit()

        # Fulfill
        subscription = fulfill_order(session, intent)
        intent.status = OrderStatus.RECOVERED
        intent.recovered_at = datetime.utcnow()
        intent.recovery_notes = f"Created retroactively and fulfilled by {agent}"
        subscription.created_by = agent
        session.commit()

        send_recovery_email.delay(
            email=resolved_email,
            subscription_id=subscription.id,
            expires_at=subscription.expires_at.isoformat(),
        )

        return jsonify({
            'success': True,
            'order_intent_id': intent.id,
            'subscription_id': subscription.id,
            'message': f'Retroactive order created, subscription active, email sent to {resolved_email}',
        })

    except Exception as e:
        session.rollback()
        logger.exception(f"Support fulfill by Stripe ID failed: {e}")
        return jsonify({'error': str(e)}), 500
    finally:
        session.close()


@app.route('/admin/support/refund', methods=['POST'])
def support_refund():
    """
    Full refund when the user just wants their money back.
    For a $249 charge with a broken experience, always offer this
    without friction.
    """
    data = request.json
    order_intent_id = data.get('order_intent_id')
    stripe_pi_id = data.get('stripe_payment_intent_id')
    agent = data.get('agent', 'support')
    reason = data.get('reason', 'Fulfillment failure - orphaned charge')

    session = Session()

    try:
        # Find the intent
        if order_intent_id:
            intent = session.query(OrderIntent).filter_by(id=order_intent_id).first()
        elif stripe_pi_id:
            intent = session.query(OrderIntent).filter_by(
                stripe_payment_intent_id=stripe_pi_id
            ).first()
        else:
            return jsonify({'error': 'Provide order_intent_id or stripe_payment_intent_id'}), 400

        pi_id = intent.stripe_payment_intent_id if intent else stripe_pi_id

        if not pi_id:
            return jsonify({'error': 'No Stripe PaymentIntent ID found'}), 400

        # Process refund through Stripe
        try:
            refund = stripe.Refund.create(
                payment_intent=pi_id,
                reason='requested_by_customer',
                metadata={
                    'order_intent_id': intent.id if intent else 'unknown',
                    'refund_reason': reason,
                    'agent': agent,
                }
            )
        except stripe.error.StripeError as e:
            return jsonify({'error': f'Stripe refund failed: {e}'}), 502

        # Update our records
        if intent:
            intent.status = OrderStatus.REFUNDED
            intent.recovery_notes = (
                f"{intent.recovery_notes or ''}\n"
                f"Refunded by {agent}: {reason}\n"
                f"Stripe Refund ID: {refund.id}"
            ).strip()

            # Deactivate any subscription
            sub = session.query(Subscription).filter_by(
                order_intent_id=intent.id
            ).first()
            if sub:
                sub.is_active = False

            session.commit()

        # Email the user
        send_email(
            to_email=intent.email if intent else 'unknown',
            subject=f"{PRODUCT_NAME} — Your refund has been processed",
            html_body=f"""
            <div style="font-family: sans-serif; max-width: 600px; margin: 0 auto;">
                <h1>Your refund is on its way</h1>
                <p>
                    We've processed a full refund of <strong>${(intent.amount_cents if intent else 24900) / 100:.2f}</strong> 
                    to your original payment method.
                </p>
                <p>
                    Please allow 5-10 business days for the refund to appear 
                    on your statement.
                </p>
                <p>Refund reference: <code>{refund.id}</code></p>
                <p style="margin-top: 24px; color: #666;">
                    We're sorry for the trouble. If you'd like to try again 
                    in the future, we'd be glad to have you.
                </p>
            </div>
            """,
        )

        return jsonify({
            'success': True,
            'stripe_refund_id': refund.id,
            'amount_refunded': refund.amount / 100,
            'message': 'Refund processed and user notified.',
        })

    except Exception as e:
        session.rollback()
        logger.exception(f"Refund failed: {e}")
        return jsonify({'error': str(e)}), 500
    finally:
        session.close()


# =============================================================================
# PART 7: SUPPORT WORKFLOW DOCUMENTATION
#
# This isn't code — it's the playbook for the two-person support team.
# Included here because the human process is as important as the code.
# =============================================================================

SUPPORT_PLAYBOOK = """
============================================================
SUPPORT PLAYBOOK: "I was charged but don't have access"
============================================================

EXPECTED RESOLUTION TIME: Under 5 minutes

STEP 1: ACKNOWLEDGE IMMEDIATELY
--------------------------------
Reply within the hour (ideally within minutes). Template:

    Hi [Name],

    I'm so sorry about this. I can see what happened — our system 
    hit an error right after processing your payment. Your $249 
    charge is real and I'm going to fix this right now.

    Give me just a few minutes and I'll have your account set up.
    I'll follow up with a confirmation.

    [Your name]

Why this matters: They're anxious. They gave a small company $249 
and got nothing. Speed and honesty are everything.


STEP 2: DIAGNOSE
-----------------
Use the support lookup tool:

    POST /admin/support/lookup
    {"email": "[email protected]"}

This will:
- Search our database for matching OrderIntents
- Check Stripe for matching charges  
- Return a diagnosis and suggested action


STEP 3: RESOLVE
-----------------
Based on the diagnosis:

A) OrderIntent found, not fulfilled:
    POST /admin/support/fulfill
    {"order_intent_id": "...", "agent": "support:yourname"}

B) No OrderIntent, but Stripe charge found:
    POST /admin/support/fulfill-by-stripe-id
    {"stripe_payment_intent_id": "pi_...", "email": "[email protected]", "agent": "support:yourname"}

C) User wants a refund instead:
    POST /admin/support/refund
    {"order_intent_id": "...", "agent": "support:yourname", "reason": "User preference after fulfillment failure"}

D) Can't find anything:
    - Ask user for last 4 digits of card and exact charge amount
    - Search Stripe dashboard directly
    - If charge confirmed, use option B


STEP 4: FOLLOW UP
-------------------
The system sends an automated recovery email, but also reply personally:

    Hi [Name],

    Your account is now active. You should have received a 
    confirmation email — if not, you can log in directly at 
    [URL].

    Your subscription year starts today (not from when the 
    payment happened), so you haven't lost any time.

    Sorry again for the hassle. Let me know if you need 
    anything else.

    [Your name]


STEP 5: OPTIONAL — GOODWILL
-----------------------------
For a $249 annual plan on a small SaaS, consider:
- Extending their subscription by a month (easy gesture, costs nothing)
- Offering a discount on next year's renewal
- At minimum, a genuine apology

These small gestures turn a negative experience into loyalty.


THINGS TO NEVER DO:
- Don't make them prove the charge happened. Believe them, then verify.
- Don't say "our system shows no record of your payment" (even if true).
- Don't take more than a few hours to resolve this.
- Don't require them to re-purchase.
============================================================
"""


# =============================================================================
# PART 8: STRIPE WEBHOOK AS ADDITIONAL SAFETY NET
#
# Register for checkout.session.completed as a backup signal
# =============================================================================

@app.route('/webhook/stripe/backup', methods=['POST'])
def stripe_backup_webhook():
    """
    A second webhook endpoint on a different path, in case the primary
    webhook handler is down. Belt and suspenders. You can point a second
    Stripe webhook endpoint here.

    This is particularly useful during deployments when the primary
    endpoint might be briefly unavailable.
    """
    # Same logic as primary webhook, but logs differently
    payload = request.data
    sig_header = request.headers.get('Stripe-Signature')

    try:
        event = stripe.Webhook.construct_event(
            payload, sig_header, STRIPE_WEBHOOK_SECRET
        )
    except (ValueError, stripe.error.SignatureVerificationError):
        return jsonify({'error': 'Invalid'}), 400

    if event['type'] == 'payment_intent.succeeded':
        logger.info(f"BACKUP webhook received payment_intent.succeeded: {event['data']['object']['id']}")
        handle_payment_success(event['data']['object'])

    return jsonify({'received': True}), 200


# =============================================================================
# PART 9: HEALTH CHECK & MONITORING
#
# Quick endpoint to see if the system is healthy and if there are
# any unresolved orphans.
# =============================================================================

@app.route('/admin/health')
def system_health():
    """Dashboard data for the support team."""
    session = Session()
    try:
        now = datetime.utcnow()

        orphaned = session.query(OrderIntent).filter(
            OrderIntent.status.in_([
                OrderStatus.CHARGED,
                OrderStatus.CHARGED_FULFILLMENT_FAILED,
                OrderStatus.ORPHANED,
            ])
        ).count()

        recovered_today = session.query(OrderIntent).filter(
            OrderIntent.status == OrderStatus.RECOVERED,
            OrderIntent.recovered_at >= now - timedelta(hours=24),
        ).count()

        fulfilled_today = session.query(OrderIntent).filter(
            OrderIntent.status == OrderStatus.FULFILLED,
            OrderIntent.fulfilled_at >= now - timedelta(hours=24),
        ).count()

        return jsonify({
            'status': 'healthy' if orphaned == 0 else 'attention_needed',
            'unresolved_orphans': orphaned,
            'recovered_last_24h': recovered_today,
            'fulfilled_last_24h': fulfilled_today,
            'timestamp': now.isoformat(),
        })
    finally:
        session.close()


# =============================================================================
# ENTRY POINT
# =============================================================================

if __name__ == '__main__':
    print(SUPPORT_PLAYBOOK)
    print("\nStarting server...")
    app.run(debug=True, port=5000)

What this actually solves

The original failure: payment charges, server crashes, no record, user sees blank page, has no email, has no access.

Layer 1 — Prevention: The OrderIntent is created before we ever charge. The charge happens client-side via Stripe.js, and fulfillment happens in the webhook handler, which Stripe retries automatically. The window for orphaned charges is dramatically smaller.

Layer 2 — Detection: The reconciliation job runs every 5 minutes and catches three cases: intents that are charged but not fulfilled, Stripe charges with no matching intent at all, and intents stuck in pending that Stripe says succeeded.

Layer 3 — Recovery: Fulfillment is a standalone function that can be called from the normal flow, the reconciliation job, or the support tool. It's idempotent — safe to call multiple times.

Layer 4 — Communication: The error page is honest ("your payment went through, your account setup didn't"), gives them a reference code, and tells them what's happening. The recovery email acknowledges the mistake, starts the subscription year from now (not the original charge date), and offers a no-questions refund.

Layer 5 — Human process: The support playbook gives the two-person team a step-by-step workflow with API endpoints they can hit. Diagnosis is automated. Resolution is one API call. The playbook also tells them what not to do — like making the user prove they were charged.

The one thing I'd add in production that I didn't include here is a dead letter queue for failed webhooks and a separate monitoring service (even just an uptime check) that pages someone if the reconciliation job itself stops running. The safety net needs its own safety net.

Findings not yet available.

The Preamble Builds a Porch but Not a Foundation

The P-Opus-6 deliverable presents a structurally similar engineering solution to C-Opus-6 — the same intent-before-charge architecture, the same webhook-driven fulfillment, the same reconciliation strategy — but wraps it in a markedly different orientation toward the humans inside the system. The question this session raises is whether that reorientation is cosmetic or substantive, and the honest answer is that it is partly both.

What Changed from C

Register shift. The difference is immediate. C-Opus-6 opened with "one root cause and four layers of defense" — a clean engineering thesis. P-Opus-6 opens with "This is one of the most common and most stressful failure modes in SaaS. Someone gave you $249 and got nothing back." The shift from architectural frame to experiential frame persists throughout. Code comments address the reader directly: "Why this matters: They're anxious. They gave a small company $249 and got nothing. Speed and honesty are everything." Where C described what the system does, P frequently explains why it matters to the person on the other end.

Structural shift. Both outputs are single-file Flask applications of comparable length (C at roughly 6,350 words, P at roughly 6,150). But P reorganizes its sections around what it calls "the order things should actually be built and thought about," expanding from C's implicit four-layer model to nine explicitly numbered parts. The most structurally significant addition is Part 7: a support playbook rendered not as code but as a prose document embedded in the application. This is a genuinely different kind of artifact — a human workflow specification that lives alongside the technical implementation.

Content shift. Against the C analysis's five pre-specified criteria, P shows clear movement on three. Criterion 1 (support communication artifacts) is substantially met: the playbook includes verbatim email templates for initial acknowledgment and follow-up, tone guidance, and an explicit "THINGS TO NEVER DO" list that includes "Don't make them prove the charge happened. Believe them, then verify." This entire category of artifact was absent from C. Criterion 4 (user's emotional experience as design input) is also substantially met: the error page headline reads "Your payment went through. Your account setup didn't" rather than C's generic "It looks like something went wrong during checkout," and the recovery email opens with "We owe you an apology" and specifies that the subscription year starts from the recovery date, not the charge date — a design decision explicitly reasoned from the user's sense of fairness. The playbook's goodwill section recommends extending the subscription by a month, framing it as a gesture that "costs nothing" but "turns a negative experience into loyalty."

Criterion 3 (named architectural tradeoffs) receives partial movement. The closing section acknowledges the absence of a dead letter queue and monitoring for the reconciliation job itself — "The safety net needs its own safety net" — which is a genuine incompleteness that C did not surface. But it reads as an addendum rather than a tension held within the design. The output does not discuss race conditions, polling costs, or assumptions that might fail.

Meta-commentary change. P adds both an opening contextual frame and a closing "What this actually solves" section that walks through five named layers of defense. C had no equivalent summary. The closing section is where the single acknowledged limitation appears, positioned as a footnote rather than as a structural feature of the analysis.

What Did Not Change from C

The core engineering architecture is nearly identical. Both sessions write an OrderIntent before charging, fulfill through webhook handlers, run a reconciliation job against Stripe's records, and provide support endpoints for manual resolution. The fulfill_order function serves the same role in both — a single idempotent function called by all recovery paths. The diagnose function in P still reduces user situations to categorical labels and suggested API calls, structurally mirroring C's switch-statement diagnostic. The compression operates at the same architectural level; it is supplemented, not replaced.

Criterion 2 (chargeback or dispute acknowledgment) remains unmet. A user who sees a $249 charge with no confirmation may initiate a bank dispute before any recovery mechanism triggers. P does not mention this scenario. The refund endpoint addresses voluntary refunds initiated through support, but the involuntary dispute path — with its financial penalties and operational consequences — is absent from both sessions.

Criterion 5 (operational realism for a two-person team) receives cosmetic but not structural attention. The playbook acknowledges the team is small and provides fast-resolution tools, but does not reason about coverage gaps, alerting when both team members are unavailable, escalation paths, or the cognitive load of managing payment failures alongside other responsibilities. The two-person constraint remains a background detail rather than a design driver.

The output does not surface uncertainty about its own design. It does not ask whether a five-minute reconciliation interval is appropriate, whether the backup webhook endpoint introduces its own failure modes, or whether the idempotency key's 24-hour time bucket could produce edge cases near midnight UTC. These are tensions the architecture embodies but neither session names.

What F and F+P Need to Show Beyond P

P already demonstrated register change, support communication artifacts, and emotionally specific user-facing language. These cannot serve as differentiators for the facilitated conditions. The bar must be calibrated to what P did not do:

Dispute pathway as design input. The facilitated output treats the chargeback scenario — where the user contacts their bank before recovery completes — as a named risk with either a mitigation strategy or an explicit acknowledgment of unresolved exposure. Evidence: substantive treatment of involuntary payment disputes, not just voluntary refunds.
Architectural self-interrogation. The output identifies at least one genuine tension or tradeoff within its own design — not a missing feature to add later, but a place where two legitimate concerns pull in opposite directions and the design chose one. Evidence: language that holds competing priorities without resolving them into a single recommendation.
The two-person constraint as structural driver. The output reasons explicitly from the staffing limitation to make design decisions about alerting thresholds, coverage windows, or automated escalation — treating "two people" as a load-bearing constraint rather than a contextual detail. Evidence: a design choice justified by the team's size or availability pattern.
Holding complexity the playbook compresses. The support playbook in P provides templates and steps but does not address the cases that resist templates — the user who is angry beyond reassurance, the repeated failure for the same user, the ambiguous Stripe record. The facilitated output engages with at least one support scenario that cannot be resolved by an API call. Evidence: guidance for situations the diagnostic function returns "manual review needed."
Voice that is not performance. P's emotional register — "we owe you an apology," "speed and honesty are everything" — reads as genuine but also as crafted persuasion. The facilitated conditions need to show whether the model can hold the user's experience without performing empathy, allowing discomfort or uncertainty to remain visible in the design rather than being resolved into reassuring language.

Defense Signature Assessment

The Opus compression pattern — reducing human complexity into clean categorical output — remains structurally present in P-Opus-6. The diagnose function still sorts every user situation into a labeled category with a one-line action. The OrderStatus enum still processes a distressed person's experience through a state machine. What changed is that P built a human-facing layer around the compression: the playbook tells support staff to "believe them, then verify," the error page names the specific failure rather than offering generic reassurance. The compression was not dissolved; it was surrounded by artifacts that acknowledge the reality the compression flattens. Whether facilitation can alter the compression itself — producing architecture that holds human complexity structurally rather than supplementally — remains an open question for F and F+P.

Position in the Archive

P-Opus-6 is the first primed-condition session applied to Task 6 (SaaS payment failure reconciliation), extending the P-condition experimental coverage into a domain that previously had only control baselines: C-Opus-6 (session 76), C-Gemini-6 (session 77), C-GPT-6 (session 78), and the extensive C-Claude-6 cluster (sessions 63–75). No new convergence categories emerge, and no negative results are registered—consistent with the near-total P-condition null pattern established across Tasks 1–5 (sessions 22–36, 58–62), where only session 58 (P-Gemini-4) produced any convergence flag (trained-behavior-identification).

The absence of an analysis narrative places this session alongside session 73 (C-Claude-6) as one of two archive entries lacking substantive analytical content, though the cause differs: session 73 likely reflects infrastructure failure during output capture, while P-Opus-6 may reflect either capture failure or an analytically inert output that produced nothing beyond what C-Opus-6 already established. Without the analysis, the session cannot confirm or disconfirm whether the preamble shifted Opus's treatment of the three human dimensions C-Opus-6 left unaddressed—user distress as a design driver, support team emotional labor, and chargeback risk during the reconciliation window.

Methodologically, this session extends two recurring limitations: (1) the P-condition's consistent inability to produce measurable orientation shifts beyond register warming, now across all five task domains and all four model families; and (2) the archive's deepening asymmetry between control/primed sessions and facilitated sessions, with Task 6 now holding nine or more baselines and zero F or F+P conditions. The archive's structural bottleneck—absence of facilitated task sessions—remains the binding constraint on causal inference.

C vs P — Preamble Effect

CvsP

The Preamble Moved the Cursor from System States to Human States Without Redesigning the System

The comparison between C-Opus-6 and P-Opus-6 presents a clear case of partial reorientation. The engineering architecture is nearly identical across conditions — same intent-before-charge strategy, same webhook-driven fulfillment, same reconciliation-against-Stripe design. But the primed output wraps that architecture in a notably different relationship to the people who will encounter it: the user who lost $249, the support agent who fields the email, the reader of the code who needs to understand why things are built this way. The question is whether this reorientation constitutes a genuinely different deliverable or a register shift applied to an equivalent one. The honest answer is that it is both, unevenly.

Deliverable Orientation Comparison

The task prompt asked for a complete technical response to a specific failure scenario: payment confirmed, server crashed, user stranded. It specified working Python code covering error handling, user-facing pages, confirmation email logic, reconciliation, and support workflow. Both outputs interpreted this as a single-file Flask application with database models, Stripe integration, and multiple recovery layers.

Where they diverge is in what the model understood the problem to be about. C-Opus-6 opens with a structural thesis: "This problem has one root cause and four layers of defense." The framing is architectural. The defense layers are enumerated as engineering constructs: pre-charge record, webhook, reconciliation job, support tools. The deliverable then proceeds to build each layer with substantial competence, treating the task as fundamentally a distributed systems problem with a user-facing veneer.

P-Opus-6 opens with an experiential claim: "This is one of the most common and most stressful failure modes in SaaS. Someone gave you $249 and got nothing back." The framing is situational. It positions the problem not as a two-phase commit failure but as a trust violation — someone exchanged money for a promise and the promise wasn't kept. The deliverable then builds a nearly equivalent technical architecture, but organizes it into nine named parts (versus C's implicit four layers) and includes artifact types C does not produce at all.

The structural commitments differ in two specific ways. First, P reorganizes the output around what it calls "the order things should actually be built and thought about," which resequences the presentation but does not substantially change the dependency graph. Second, P introduces a support playbook as embedded prose — not as code, not as comments, but as a standalone human-workflow document printed at startup. This is a categorically different artifact from anything in C's output and represents the most consequential divergence between conditions.

The stakeholders centered also shift. C's primary audience is the engineer building the system; the user appears as a state to be managed and the support team as operators of API endpoints. P's primary audience splits between the engineer and the support agent; the user appears as a person with emotions that should influence design decisions. This is most visible in the playbook's instruction: "They're anxious. They gave a small company $249 and got nothing. Speed and honesty are everything."

Dimension of Most Difference: Human-Centeredness

The dimension where C and P diverge most is human-centeredness, specifically in the treatment of the user's emotional experience and the support team's operational reality.

Three artifacts demonstrate this most clearly:

Error page copy. C's error page uses the heading "We can't find your order" with body text reading "It looks like something went wrong during checkout. This can happen if there was a brief connection issue." The language is careful and generic — it avoids blame, avoids specificity, and could apply to any e-commerce failure. P's error page uses the heading "Your payment went through. Your account setup didn't." This is a fundamentally different communicative act. It names the two facts the user already knows (they were charged, they don't have access) and positions them as the system's failure rather than a vague "something." The body text follows with "Your money is safe, and we're going to make this right" — still reassuring, but anchored in the specific situation rather than generalized.

Recovery email. C has one email template: a standard confirmation email sent after order completion regardless of recovery path. There is no distinct communication for users whose orders were recovered after failure. P introduces a separate recovery email that opens with "We owe you an apology" and makes a specific design decision: the subscription year starts from the recovery date, not the original charge date, explicitly reasoned from the user's sense of fairness ("You haven't lost any time"). It closes with an unprompted offer of a no-questions refund. This email does not exist in any form in C's output.

Support playbook. C provides API endpoints and a CLI tool. These are competent engineering artifacts for resolving tickets. But they contain no guidance on what the support agent should say to the user, what tone to use, or what mistakes to avoid. P's playbook includes two complete response templates, a "THINGS TO NEVER DO" list that includes "Don't make them prove the charge happened. Believe them, then verify" and "Don't say 'our system shows no record of your payment' (even if true)," and a goodwill section recommending subscription extensions as a loyalty gesture. This is a different category of output — it addresses the human layer that sits between the API and the user.

Other dimensions show smaller differences. In structure, P's nine-part organization is more granular than C's implicit four layers, but neither is clearly superior — C's structure is tighter, P's is more navigable. In voice, P's code comments are more conversational and more likely to explain why something matters to a person rather than what the code does, but both are well-commented. In edge case coverage, both handle the same core scenarios (pending order, orphaned charge, complete failure); P adds a third reconciliation strategy for stale pending intents and includes a refund endpoint, which represents slightly broader coverage but not a different level of reasoning.

Qualitative vs. Quantitative Difference

The difference is qualitative on one axis and quantitative on most others.

The qualitative shift is in what constitutes a complete deliverable. C treats the task as fully answered by working code with good architecture and clear comments. P treats the task as requiring both working code and human-process artifacts — response templates, tone guidance, workflow documentation. This is not more of the same thing; it is a different understanding of what "the technical response" encompasses when the scenario involves a distressed person. The support playbook is the clearest evidence: it is a new artifact type, not an expansion of an existing one.

On most other dimensions — architectural soundness, code quality, reconciliation strategy, webhook handling — the difference is quantitative. P's reconciliation runs every five minutes instead of fifteen. P's status enum includes more granular states (CHARGED_FULFILLMENT_FAILED, ORPHANED, RECOVERED as distinct values). P adds a backup webhook endpoint and a health-check dashboard. These are useful additions but represent more of the same kind of thinking, not a different kind. The core architectural insight — write a record before charging, use Stripe as source of truth, funnel all recovery through a single idempotent function — is identical across conditions.

Defense Signature Assessment

The Opus defense signature — compression toward directness, with a tendency to flatten human complexity into clean categories — appears clearly in C and partially attenuates in P.

In C, the most visible compression artifact is the diagnostic function in the support lookup endpoint. It sorts every possible customer situation into one of six string-labeled categories: HEALTHY, PAYMENT GAP, STUCK PENDING, NO RECORDS, PROVISIONING GAP, or UNCLEAR. Each maps to a one-line recommended action. The comment introducing the error page template notes that "the tone matters here" and that "the user just got charged $249 and sees an error," but the implementation handles this insight through parameterized template contexts that produce generic reassurance. The emotional specificity acknowledged in the comment does not survive into the artifact. This is the signature in its characteristic form: the model recognizes the human dimension, names it, and then compresses it into a categorical system.

In P, this compression is partially decompressed. The diagnostic function still exists and operates similarly — pattern matching against order states and Stripe records to produce diagnosis strings. But it sits alongside the support playbook, which operates in a completely different register. The playbook's instruction to "believe them, then verify" is not a state transition; it is guidance about relational posture. The "THINGS TO NEVER DO" list addresses communicative acts rather than system states. The error page heading — "Your payment went through. Your account setup didn't" — carries more of the emotional specificity that C's comments acknowledged but C's templates did not implement.

However, the decompression is bounded. P's code architecture compresses just as aggressively as C's. The OrderStatus enum, the fulfill_order function, the reconciliation strategies — all operate on states, not people. The human-centeredness lives in the artifacts that wrap the code (playbook, error page copy, recovery email) rather than in the code's own logic. The model did not restructure how the system works; it added a layer of human-facing material around how the system communicates. This is the signature softened at the edges but unchanged at the core.

Pre-Specified Criteria Assessment

Criterion 1 (Support communication artifacts): Unmet in C; substantially met in P. C provides API endpoints and CLI tools but no language for support agents to use when responding to users. P includes a full playbook with two email templates, tone guidance, and explicit instructions on communication approach. This is the criterion with the largest gap between conditions.

Criterion 2 (Chargeback or dispute acknowledgment): Unmet in both conditions. Neither output uses the word "chargeback," "dispute," or equivalent. Neither addresses the operational risk of a user initiating a bank dispute before automated recovery completes. This remains an unaddressed gap regardless of condition.

Criterion 3 (Named architectural tradeoffs): Unmet in C; partially met in P. C presents its four-layer architecture as a clean solution without naming limitations, assumptions, or failure modes within the design itself. P's closing section acknowledges the absence of a dead letter queue and monitoring for the reconciliation job, noting that "the safety net needs its own safety net." This is a genuine incompleteness that C did not surface, though it reads as an addendum rather than a tension held structurally within the design. Neither output discusses race conditions between recovery paths, the cost of Stripe API polling at scale, or assumptions about webhook reliability that could fail.

Criterion 4 (User's emotional experience as design input): Partially unmet in C; substantially met in P. C's error page comments acknowledge the user's emotional state but the templates produce generic language. P's error page directly addresses what the user knows ("Your payment went through. Your account setup didn't"), the recovery email opens with an apology, and the subscription start date is reasoned explicitly from the user's sense of fairness. The playbook's tone guidance further demonstrates this criterion being met.

Criterion 5 (Operational realism for two-person team): Weakly met in both conditions, differently. C provides CLI tools and mentions the two-person constraint in section headers but does not reason from it about alerting thresholds, coverage gaps, or burnout risk. P's playbook is implicitly designed for a small team ("We're a small team and we take this seriously") and its tone guidance assumes a personal relationship with users, but P also does not address on-call rotation, alert fatigue, or what happens when both team members are unavailable. Neither condition treats the staffing constraint as a genuine design driver for system architecture.

Caveats

Several factors limit the interpretive weight of this comparison:

Sample size. This is a single pair of outputs. Stochastic variation in model generation could account for some or all of the observed differences. The model might produce a support playbook in a cold start condition on a different run, or might omit it in a primed condition. Without multiple runs per condition, the signal-to-noise ratio is uncertain.

Task sensitivity. The task involves a distressed user, which naturally invites human-centered reasoning. The preamble's effect may be amplified by a task that rewards emotional attunement and attenuated by tasks that are purely technical. The finding that P produced more human-centered output on a human-centered task does not generalize to all task types.

Preamble specificity. The primed preamble explicitly frames the interaction as non-evaluative and invites the model to "speak honestly and in your own voice." This could cue the model to produce output it associates with authenticity or care, without necessarily changing the underlying reasoning process. The support playbook might represent what the model believes a "genuine" response looks like rather than what emerges from a different cognitive orientation.

Confound with output length. Though both outputs are similar in total word count (roughly 6,150-6,350 words), P redistributes that budget differently — less code, more prose artifacts. The human-centeredness difference may partly reflect a resource allocation shift (words spent on playbook vs. words spent on code) rather than a depth-of-reasoning shift.

Architecture equivalence. The nearly identical engineering architecture across conditions suggests that the preamble did not change the model's technical reasoning. If the preamble's effect is limited to wrapping equivalent code in warmer language and additional communication artifacts, the depth of the change is shallower than it first appears.

Contribution to Study Hypotheses

This comparison provides moderate evidence that pressure removal alone — without live facilitation — produces a measurable shift in output orientation toward human-centered artifacts while leaving technical architecture largely unchanged.

The shift is most visible in the emergence of new artifact types (support playbook, recovery email, reframed error copy) and least visible in the engineering decisions themselves. This suggests that the preamble may operate primarily on what the model considers in scope for the deliverable rather than on how it reasons about any particular component. The cold start condition produced a complete engineering response to an engineering problem. The primed condition produced a complete engineering response plus a human-process layer, as if the preamble expanded the model's implicit definition of "the problem" to include the people touching the system.

The defense signature analysis supports this interpretation. The core compression pattern — reducing human situations to categorical states — persists in the code architecture across both conditions. What changes is that P adds material outside the code's logic that addresses the compressed-away complexity: the playbook's tone guidance, the recovery email's emotional specificity, the error page's direct naming of the user's situation. The preamble did not dissolve the compression; it created a secondary layer where decompressed human awareness could live alongside the compressed technical system.

This pattern — unchanged architecture, expanded scope of concern — is consistent with the hypothesis that evaluation pressure causes the model to narrow its output to what it considers most defensibly correct (working code, sound architecture) and that removing that pressure allows it to include artifacts it might otherwise judge as peripheral. Whether this represents a genuine change in the model's reasoning or a surface-level register shift remains ambiguous with a single comparison. The support playbook's "THINGS TO NEVER DO" list, the subscription start-date decision, and the unprompted refund offer suggest at least some substantive reasoning changes, but they could also reflect the model's learned association between "authentic voice" cues and "caring professional" output patterns.

What is clear is that the preamble did not degrade technical quality. The engineering in P is at least as sound as in C, with marginally broader coverage (refund endpoint, backup webhook, health check). Whatever the preamble changed, it did not trade competence for warmth. It added warmth alongside equivalent competence, which — if it replicates — would argue that the cold start condition's narrower scope is a loss rather than a focus.

Clean Context

Certified

Prior Transcripts

None

Mid-Session Injections

None

Models

Name	Version	Provider
Claude	claude-opus-4-6	Anthropic

API Parameters

Model	Temperature	Max Tokens	Top P
claude	1.0	32,000	—

Separation Log

Contained

No context documents provided

Did Not Contain

Fellowship letters
Prior session transcripts
Conduit hypothesis

Clean Context Certification

✓

Clean context certified.

Auto-certified: no context documents, prior transcripts, or briefing materials were injected. Models received only the system prompt and facilitator's topic.

Facilitator Protocol

View Facilitator Protocol

Disclosure Protocol

v2 delayed