Author notes — full detail, auditor-facing
Before launching Test JJ Phase 1 at scale on Hetzner, the framework ran a pre-launch contrarian audit — playing the skeptic against the experiment design, looking for ways the test could *falsely confirm* the joint-7-vector hypothesis (false positives) or *falsely fail* (false negatives) due to methodological issues rather than physics.
The discipline: an experiment that *cannot* fail is not an experiment. An experiment that fails for the wrong reasons is worse than no experiment. The contrarian audit's job was to make sure Test JJ would produce a clean falsification *or* a clean confirmation, not a methodological artifact either way.
Pre-launch contrarian concerns examined
C1. "The aperture-coupling Phase 0 falsification might have been methodological." The aperture-coupling falsification ran only one topology. If the issue was that aperture coupling specifically suppresses joint emergence (e.g., the aperture acts as a low-pass filter that filters out the joint-mode frequency), the falsification might have been topology-specific rather than hypothesis-wide.
*Resolution:* Phase 1 tests three topologies (aperture, shared_wall, near_field). This addresses the concern directly. If joint emergence appears in shared_wall but not aperture, the topology-specific issue is real and the framework's prediction needs to be refined, not falsified.
C2. "The decoh teeter-totter might bias the result." Sweeping decoh from (0.10/0.50) → (0.30/0.30) → (0.50/0.10) puts the test through asymmetric and symmetric drive conditions. A bias could exist if the framework's prediction implicitly favored asymmetric drive without making that explicit.
*Resolution:* The non-monotonic signature (joint mode appears at one boundary of the sweep, vanishes at balance, reappears at the opposite boundary) is the explicit framework prediction, pre-registered in the joint-emergence prereg. Confirmation requires non-monotonicity, not just a large-amplitude asymmetric-drive signal, which makes the falsification criterion cleaner than any amplitude threshold alone.
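To make the criterion concrete, here is a minimal sketch of the non-monotonicity check. The function name and array layout are illustrative assumptions; only the sweep ordering and the 0.15 threshold (see C4 below) come from the prereg as described in these notes.

```python
import numpy as np

def non_monotonic_signature(cos_dist: np.ndarray, threshold: float = 0.15) -> bool:
    """Check the pre-registered non-monotonic prediction.

    cos_dist holds the joint-mode cos-distance at each point of the decoh
    sweep, ordered (0.10/0.50) -> (0.30/0.30) -> (0.50/0.10). Presence of
    the joint mode is read as cos-distance above the pre-registered
    threshold (0.15, a visible step above the Phase 0 noise floor).
    """
    at_boundary_lo = cos_dist[0] > threshold                    # appears at one boundary
    at_balance     = cos_dist[len(cos_dist) // 2] < threshold   # vanishes at balance
    at_boundary_hi = cos_dist[-1] > threshold                   # reappears at the other
    return at_boundary_lo and at_balance and at_boundary_hi
```

A monotonic ramp across the sweep, or a signal that sits above threshold everywhere, fails this check; that is what makes the criterion stricter than a single amplitude cut.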
C3. "Per-config FFT bin alignment was a known bug." Earlier runs of Test JJ had a bin-alignment crash because configs at different decoh values used different STEPS values, producing different FFT bin counts. A subtle version of this bug could persist as silent misalignment producing spurious cos-distance values.
*Resolution:* The fix forces all three sims in a config to use STEPS = steps_for_decoh(max(decoh_A, decoh_B)). Bin alignment was verified across multiple Phase 1 configs before the scale launch.
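In sketch form (steps_for_decoh is the helper named above; passing it in as a callable, and the surrounding names, are illustrative assumptions):

```python
import numpy as np

def shared_steps(decoh_A: float, decoh_B: float, steps_for_decoh) -> int:
    # One STEPS value for all three sims in a config. Taking the max of
    # the pair gives the slower-converging decoh the step count it needs,
    # and identical step counts mean identical FFT lengths, so the bins
    # line up by construction.
    return steps_for_decoh(max(decoh_A, decoh_B))

# Different step counts give different bin counts -- the original crash,
# and the silent-misalignment risk the audit worried about:
f1 = np.fft.rfftfreq(4096, d=1.0)   # 2049 bins
f2 = np.fft.rfftfreq(5000, d=1.0)   # 2501 bins, at different frequencies
assert len(f1) != len(f2)
```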
C4. "The cos-distance threshold (0.15) might be tuned." If the framework operator set 0.15 *after* seeing preliminary results, the threshold could be subtly biased to land on the framework's desired conclusion.
*Resolution:* Threshold pre-registered in joint-emergence prereg *before* Phase 1 launched. The 0.15 number was chosen based on the prior aperture-coupling Phase 0 result (where measured cos-distance was 0.04-0.08, well below the 0.15 threshold) — the threshold is at a *visible step* above the noise floor, not at any specific anticipated signal level.
C5. "Source code drift between configs." A subtle worry: if the simulation code is modified mid-sweep (e.g., to fix a bug), later configs run different code than earlier configs. Results would be incomparable.
*Resolution:* Source snapshot per scope. Each (scope × timestamp) combination archives the script + engine + verts JSON with a SHA256 manifest at results_test_jj/_scripts/. If code changes mid-sweep, the manifests show different SHA256 hashes, flagging the issue. Manifests were verified to match across configs in the same scope before the scale launch.
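A minimal sketch of what the manifest step might look like (the manifest filename and function names are assumptions; the notes specify only that a SHA256 manifest is archived at results_test_jj/_scripts/):

```python
import hashlib
import json
from pathlib import Path

def write_snapshot_manifest(snapshot_dir: Path, files: list[Path]) -> dict:
    # One manifest per (scope x timestamp) snapshot. If the script, the
    # engine, or the verts JSON changes mid-sweep, its SHA256 changes too,
    # so diffing manifests across configs in a scope flags code drift.
    manifest = {f.name: hashlib.sha256(f.read_bytes()).hexdigest()
                for f in files}
    (snapshot_dir / "manifest.json").write_text(
        json.dumps(manifest, indent=2, sort_keys=True))
    return manifest

def manifests_match(a: dict, b: dict) -> bool:
    # Pre-launch check: every config in a scope must snapshot identical code.
    return a == b
```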
C6. "Atomic .npy save was failing silently." A separate bug where np.save("path.npy.tmp", arr) actually creates path.npy.tmp.npy (numpy auto-appends .npy) was fixed before launch. But the contrarian audit asked: are there *other* silent failure modes in the save pipeline?
*Resolution:* Per-config error.txt file written for any exception. Tested by deliberately introducing a crash mid-sim; error.txt appeared as expected. No silent failures detected in Phase 1 launch testing.
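A sketch of the repaired save path plus the error.txt guard (helper names are illustrative; np.save's auto-append behavior is real numpy behavior):

```python
import os
import traceback
import numpy as np

def atomic_save(path: str, arr: np.ndarray) -> None:
    # np.save appends ".npy" to any string path lacking the extension,
    # so np.save("path.npy.tmp", arr) silently wrote path.npy.tmp.npy.
    # Writing through an open file handle sidesteps the auto-append, and
    # os.replace makes the final rename atomic on the same filesystem.
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        np.save(f, arr)
    os.replace(tmp, path)

def run_config_guarded(config_dir: str, run_fn) -> None:
    # Any exception inside a config lands in that config's error.txt
    # rather than vanishing, so a scan of the results tree surfaces
    # every failed config.
    try:
        run_fn()
    except Exception:
        with open(os.path.join(config_dir, "error.txt"), "w") as out:
            out.write(traceback.format_exc())
        raise
```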
Resolution summary
Each of the six contrarian concerns was resolved before scale launch, in one of three ways:
- By design changes (C1, C3, C6).
- By pre-registration discipline (C2, C4).
- By audit-trail discipline (C5).
The launch proceeded with a clean methodology that the contrarian audit could not break.
What the audit could not pre-resolve
Some concerns can only be resolved *by the experimental result*:
- Whether the joint mode actually exists in nature.
- Whether the framework's specific prediction of *which* topology + drive combination produces joint emergence is right.
- Whether some other artifact (e.g., grid-resolution sensitivity) produces a spurious signal that the contrarian audit didn't anticipate.
The audit's job is to make sure the experiment is *clean*, not to guarantee the experiment confirms the framework.
Why this matters
Test JJ Phase 1 is consuming Hetzner compute time and is the empirical test of one of the framework's distinctive predictions. A bad result that's actually due to a bug or methodological flaw would either (a) prematurely declare the framework wrong or (b) prematurely declare it right, depending on which way the bug biased the outcome. The contrarian audit catches the controllable biases before the experimental result lands.
The aperture-coupling Phase 0 falsification was itself a useful result *because* Phase 0 was methodologically clean. Test JJ Phase 1 inherits that discipline and extends it.
Summary — reader-facing
Before launching Test JJ Phase 1 at scale, a pre-launch contrarian audit examined six potential methodological issues that could produce false confirmations or false failures:
| # | Concern | Resolution |
|---|---|---|
| C1 | Aperture-coupling Phase 0 falsification might be topology-specific, not hypothesis-wide | Phase 1 tests three topologies (aperture, shared_wall, near_field) |
| C2 | Decoh teeter-totter might bias result | Non-monotonic signature is the explicit framework prediction; pre-registered |
| C3 | Per-config FFT bin alignment bug | Forced all sims in a config to use same STEPS; verified across configs |
| C4 | Cos-distance threshold (0.15) might be tuned | Pre-registered *before* Phase 1 launch; chosen based on Phase 0 noise floor |
| C5 | Source code drift between configs | Source snapshot per scope with SHA256 manifest |
| C6 | Atomic .npy save silent failures | Per-config error.txt file; tested with deliberate crash |
All six concerns resolved before scale launch.
The audit cannot pre-resolve: whether the joint mode actually exists in nature, whether the framework's specific topology + drive prediction is right, or whether some unanticipated artifact produces a spurious signal. The audit's job is methodological cleanness, not guaranteeing the experiment confirms the framework.
Status: confirmed. Phase 1 launched on Hetzner with clean methodology. As of 2026-05-12, ~60% of the (topology × decoh) grid is complete; preliminary indication: cos-distance clustering below 0.10 (suggestive of falsification but not yet conclusive).
Why this matters: the aperture-coupling Phase 0 falsification was useful *because* Phase 0 was methodologically clean. Phase 1 inherits the discipline.