NHI Anomalous

What the Evidence Actually Says: We Scored Every Case on the Same Four Axes

Most UFO writing argues case by case, which makes every story sound equally strong. We did the opposite: scored every case file on the same four-part rubric and lined the numbers up. The pattern that falls out is not the one either side wants.

The trouble with reading UFO cases one at a time is that they all start to sound convincing. A good writer can make Roswell and a backyard light in the sky feel like the same weight of evidence, because narrative flattens everything to the same pitch. The only way out of that is to stop arguing and start scoring — the same questions, the same scale, every case, out loud.

So we did. Every case file with an incident record gets a Signal Strength score: four axes — witnesses, instrumentation, official record, debunk-resistance — each rated none / weak / moderate / strong, summed to a number out of twelve. Twenty-four cases are scored so far. Here is what the column of numbers shows once you stop reading them as stories.
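
The arithmetic behind a Signal Strength score can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual implementation: the 0-to-3 point values are inferred from four axes summing to twelve, and the names `RATING_POINTS`, `AXES`, and `signal_strength`, along with the example case, are hypothetical.

```python
# Assumed point values: four axes at 0-3 each give the stated maximum of 12.
RATING_POINTS = {"none": 0, "weak": 1, "moderate": 2, "strong": 3}
AXES = ("witnesses", "instrumentation", "official_record", "debunk_resistance")


def signal_strength(ratings):
    """Sum the four axis ratings into a score out of 12."""
    missing = [axis for axis in AXES if axis not in ratings]
    if missing:
        raise ValueError(f"missing axis ratings: {missing}")
    return sum(RATING_POINTS[ratings[axis]] for axis in AXES)


# Hypothetical case, not a rating taken from any real case file.
example = {
    "witnesses": "strong",
    "instrumentation": "none",
    "official_record": "weak",
    "debunk_resistance": "moderate",
}
print(signal_strength(example))  # 3 + 0 + 1 + 2 = 6
```

Under these assumed values, a case with strong witnesses and nothing else tops out at 3 of 12, which is why testimony alone cannot carry a case to the top of the table.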

Prefer the leaderboard? Every case below is ranked and re-sortable on the interactive Ranked Cases page — sort by any single axis to see exactly which kind of evidence each case is leaning on.

The shape of the data

The scores do not cluster at the top. They spread, and they lean low.

If you came here believing UFO evidence is overwhelmingly strong, the ledger should bother you: even among our selected cases, two-thirds land in the Contested or Thin tiers. If you came here believing it is all nonsense, the top of the ledger should bother you just as much, because the strongest cases do not get strong by being the most famous or the most lurid. They get strong for a specific, boring reason.

The instrumentation gap is the whole story

Look at what separates the top of the table from the bottom, and it is almost never the witnesses. Nearly every case has strong witnesses: frightened, credible, consistent people are the one thing the phenomenon never lacks. The axes that actually move the score are instrumentation and official record.

Rendlesham scores because of a signed deputy base commander’s memo and radar returns. Trindade scores because of a sequence of photographs taken in front of a ship’s company. Aguadilla scores because of a federal thermal-camera video with a chain of custody. The Phoenix Lights scores because thousands of people and the state’s own governor are on the record. Every case at the top has a piece of evidence that exists outside human memory.

Now look at the bottom. Allagash, the Hills, Strieber — these are abduction cases, and they are thin not because the witnesses are weak but because the evidence never leaves the witness. Hypnotic regression, a remembered table, a scar with no provenance. When the only instrument is the human nervous system, the score caps out fast no matter how sincere the testimony. That is not a judgment about whether those people experienced something. It is a measurement of how much of it we can check.

The cases that climb without strong instrumentation climb on the official axis instead: Tehran on a favorable DIA report, Shag Harbour on a multi-agency government response, the Belgian Wave on an air force that published its own radar. Notably, the Belgian photo was later admitted to be a hoax and its radar returns were reattributed to atmospheric artifacts, and the case still scores Credible, purely on witness volume and official candor. That is the rubric working as intended: it tells you the case is strong on people and paperwork, and explicitly not on physical evidence.
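
Reading a score as a profile rather than a single total can be sketched as follows. This is an illustration under stated assumptions: the 0-to-3 point values are inferred from the twelve-point maximum, the "moderate or better" cutoff for an axis that carries a case is our own choice for the example, and `axis_profile` and the ratings shown are hypothetical rather than taken from any real case file.

```python
# Assumed point values: four axes at 0-3 each give the stated maximum of 12.
RATING_POINTS = {"none": 0, "weak": 1, "moderate": 2, "strong": 3}


def axis_profile(ratings):
    """Return the total plus the axes that carry the score and those that do not."""
    # Assumed cutoff: an axis "carries" the case at moderate or better.
    carrying = [a for a, r in ratings.items() if RATING_POINTS[r] >= 2]
    lacking = [a for a, r in ratings.items() if RATING_POINTS[r] <= 1]
    total = sum(RATING_POINTS[r] for r in ratings.values())
    return total, carrying, lacking


# Hypothetical ratings for illustration; not the actual scores of any case.
people_and_paperwork = {
    "witnesses": "strong",
    "instrumentation": "weak",
    "official_record": "strong",
    "debunk_resistance": "weak",
}
total, carrying, lacking = axis_profile(people_and_paperwork)
print(total)     # 3 + 1 + 3 + 1 = 8
print(carrying)  # ['witnesses', 'official_record']
print(lacking)   # ['instrumentation', 'debunk_resistance']
```

Two cases can share the same total and still rest on entirely different kinds of evidence, which is the distinction the per-axis view is meant to keep visible.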

The dividing line in the data is not believer versus skeptic. It is whether the evidence survives outside the witness’s skull.

What the observables say, separately

We also tag cases with the Five Observables — the AATIP-era flight characteristics (anti-gravity lift, instantaneous acceleration, hypersonic speed, low observability, trans-medium travel). Across the scored set the distribution is lopsided:

  • Low observability: 5 cases, the most common
  • Trans-medium travel: 3
  • Anti-gravity lift and instantaneous acceleration: 2 each
  • Hypersonic speed: 1

That ordering is quietly revealing. “Low observability” (it was there, then it wasn’t) is the easiest thing to report and the hardest to falsify, so it shows up most. The genuinely physics-breaking claims, trans-medium travel and instantaneous acceleration, the ones that would actually require new science, are rarer, and they concentrate in exactly the better-instrumented cases (Aguadilla, the Catalina USOs). The wild claims are not spread evenly across the dross. They track the cases that have a machine watching.

The geography, briefly

Plotted on the case map, the scored cases do not fall where “aliens would land.” They fall where instruments and officials already were — military ranges, coastlines patrolled by federal aircraft, nuclear infrastructure, photographed naval expeditions. That is almost certainly an observation bias rather than a clue about intent: you score well where someone was already recording. But it is worth saying plainly, because the same bias quietly inflates how alien the pattern looks when you only read the strong cases.

Why bother scoring at all

Because the alternative is the thing this whole site exists to refuse: deciding what is true by how good the story is. Scoring does not tell you Rendlesham was a craft. It tells you Rendlesham is the case you cannot dismiss without explaining away a signed memo and a radar tape — and that Communion, whatever Whitley Strieber lived through, is a case you can set down with nothing more than “memory is not evidence.”

That distinction costs something to maintain. It means conceding that most of the canon is thin, including cases people love. But it is the only version of this subject that a skeptic and a believer can actually argue about, because they are finally arguing about the same numbers. The day the phenomenon produces a case that scores a clean 12, everyone will know — not because the story got better, but because the evidence finally left the room the witness was standing in.
