Methodology

This document is the editorial standard for CholesterolResearch. It defines how evidence is selected, classified, weighted, and presented. It exists so that every judgment on the site is auditable: a reader can see not just what we concluded, but the rule we applied and the source we relied on.

The single biggest risk to a project like this is editorial drift disguised as data - small, individually defensible choices accumulating into a site that looks scientific but quietly favours a preferred conclusion. The rules below are the defence against that. They apply equally to mainstream and skeptic positions.

Not medical advice. This site maps research to inform a decision made with a clinician. It does not diagnose, treat, or recommend treatment.

1. Principles

Traceability over assertion. Every classification links to its evidence, and every editorial judgment records its reasoning and a supporting quote.
Show the disagreement. Where serious people disagree, both sides are argued in good faith, with their evidence.
Evidence hierarchy is explicit. A meta-analysis of RCTs and a case report are not weighted equally, and the interface makes the difference visible.
Conflicts of interest are surfaced evenly - pharma funding and book/supplement/clinic income alike.
Separate the questions. “Is LDL causal?”, “Is ApoB a better marker?”, “Do statins reduce events for this group?”, and “Should this person take one?” are different questions and are never conflated.

2. The data model

The spine of the site is the Claim: a single, precisely scoped question. A study does not carry one global “pro/anti” stance, because a single paper can support one claim, weaken another, and be silent on a third.

Instead:

A Study is a source record with a GRADE-lite evidence tier.
An EvidenceAssessment is the edge between one study and one claim: it records the direction, the endpoint, the weight, the reasoning, and the quote.
A Claim aggregates its assessments and carries both steelmans and a bottom line.
A Person is mapped with sourced stance scores and claim-level positions.
A Guideline keeps its recommendation separate from the evidence it rests on and from how skeptics respond.

See DATA_DICTIONARY.md for every field and allowed value.
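As a rough illustration of the shape described above, the records could be sketched as Python dataclasses. This is a sketch under assumptions, not the site's schema: only defaultEvidenceTier, assessmentTier, and endpointType are names taken from this document; every other field name, id, and value spelling is hypothetical, and DATA_DICTIONARY.md remains the authority.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Study:
    id: str
    defaultEvidenceTier: str  # "high", "moderate", "low" or "very-low"


@dataclass
class EvidenceAssessment:
    """The edge between one study and one claim."""
    studyId: str
    claimId: str
    direction: str       # e.g. "supports" or "challenges"
    endpointType: str    # value spellings below are hypothetical
    assessmentTier: str
    reasoning: str
    quote: str


@dataclass
class Claim:
    id: str
    question: str
    assessments: List[EvidenceAssessment] = field(default_factory=list)


# One study bearing on two claims in two directions: no global stance needed.
trial = Study(id="example-trial", defaultEvidenceTier="high")
claim_a = Claim(id="claim-a", question="Placeholder question A")
claim_b = Claim(id="claim-b", question="Placeholder question B")

claim_a.assessments.append(EvidenceAssessment(
    trial.id, claim_a.id, "supports", "clinical-events", "high",
    "placeholder reasoning", "placeholder quote"))
claim_b.assessments.append(EvidenceAssessment(
    trial.id, claim_b.id, "challenges", "surrogate-lipids", "moderate",
    "placeholder reasoning", "placeholder quote"))

directions = {a.claimId: a.direction
              for claim in (claim_a, claim_b)
              for a in claim.assessments}
print(directions)
```

Putting the direction on the edge rather than on the study is what lets the same paper appear under several claims without a single "pro/anti" label.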

3. Evidence tiering (GRADE-lite)

Each study is assigned an evidence tier - high, moderate, low, or very-low - for the question it speaks to. Design is the starting point, not the answer. A small, industry-funded, short-duration RCT can be lower tier than a large, clean cohort.

Start from design, then adjust using these factors (recorded in tierRationale):

Factor	Downgrades when…	Upgrades when…
Risk of bias	unblinded, high dropout, selective reporting, weak adjustment	rigorous, pre-registered
Precision	wide confidence intervals, few events, small sample	many events, tight intervals
Directness	surrogate endpoint, wrong population, indirect comparison	measures the outcome and the population directly
Consistency	conflicts with comparable studies	replicated across settings
Effect size	-	very large effect, clear dose-response gradient
Publication bias	suspected selective publication / unregistered, missing data	comprehensive, registered, negative results published
Funding / COI	conflicted funding or sponsor with a stake in the result	independent funding does NOT upgrade; adversarial collaboration may raise confidence

Note on funding: a conflict of interest is a reason to downgrade; independent funding is the baseline expectation and does not by itself upgrade evidence. (An adversarial collaboration - opponents designing a study together - can genuinely raise confidence.)

Rough starting points (before adjustment):

High: meta-analysis of RCTs; large, well-conducted RCTs; consistent Mendelian randomization for causal claims.
Moderate: single RCTs with limitations; strong prospective cohorts.
Low: retrospective/observational, case-control, cross-sectional.
Very low: case series/reports, mechanistic/animal, narrative reviews, uncontrolled n-of-few.

A study carries a source-level defaultEvidenceTier. The authoritative weight for a specific claim is the per-claim EvidenceAssessment.assessmentTier, which may downgrade the study’s tier when the study is indirect for that claim’s population (recorded in tierOverrideReason). Example: a rigorous primary-prevention statin RCT is strong evidence for a general statin claim but weaker, indirect evidence for a claim about the LMHR (lean mass hyper-responder) phenotype. Tier-weighted visualizations always use assessmentTier, never the source-level tier.
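The override rule can be sketched as a small check. The function name is hypothetical, and the "moderate" tier used for the LMHR case is only an illustrative value; this document says "weaker" without fixing a tier.

```python
TIERS = ("very-low", "low", "moderate", "high")


def tier_for_claim(default_tier, assessment_tier, override_reason=""):
    """Return the tier a visualization should use for one study-claim edge.

    The per-claim assessmentTier always wins. A tier that differs from the
    study's defaultEvidenceTier must explain itself in tierOverrideReason.
    """
    if default_tier not in TIERS or assessment_tier not in TIERS:
        raise ValueError("unknown evidence tier")
    if assessment_tier != default_tier and not override_reason.strip():
        raise ValueError("tier differs from the study default "
                         "without a tierOverrideReason")
    return assessment_tier


# Same RCT, two claims: direct for one, indirect for the other.
general = tier_for_claim("high", "high")
lmhr = tier_for_claim(
    "high", "moderate",
    "Trial population is indirect for the LMHR phenotype")
print(general, lmhr)
```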

4. Direction (stance) taxonomy

How a study bears on a claim, recorded on each assessment:

strongly-supports, supports, mixed, neutral, challenges, strongly-challenges

mixed is for a study that genuinely cuts both ways on the same claim (e.g. favourable on one endpoint, unfavourable on another). When a study speaks to different claims in different directions, that is modelled as separate assessments, not as mixed.

5. Endpoints are never collapsed

Different outcomes carry very different weight, and conflating them is a primary way evidence gets distorted. Every assessment names an endpointType, and they are kept distinct:

all-cause mortality - the hardest, least-gameable outcome.
cardiovascular mortality / clinical events (MI, stroke, MACE).
plaque imaging (CAC, CCTA, non-calcified plaque volume) - informative but a surrogate for events.
surrogate lipids (LDL-C, ApoB change) - a surrogate, not an outcome.
other biomarkers (hs-CRP, insulin).
adverse events - benefits and harms are tracked separately.

A drug that lowers LDL (surrogate) has not thereby been shown to extend life (mortality). The site shows which endpoint each claim of benefit actually rests on. Conflicting endpoints within one study (e.g. KETO-CTA’s low plaque correlation alongside a non-calcified-plaque concern) are entered as multiple assessments, never averaged into one verdict.
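One way to keep endpoints apart when summarizing a claim is to group assessments by endpointType and report each group on its own. The sketch below assumes dictionary records and hypothetical study ids and endpointType spellings.

```python
from collections import defaultdict


def directions_by_endpoint(assessments):
    """Group directions per endpointType; nothing is averaged across groups."""
    grouped = defaultdict(list)
    for a in assessments:
        grouped[a["endpointType"]].append(a["direction"])
    return dict(grouped)


assessments = [
    {"studyId": "study-x", "endpointType": "surrogate-lipids",
     "direction": "supports"},
    {"studyId": "study-x", "endpointType": "all-cause-mortality",
     "direction": "neutral"},
    {"studyId": "study-y", "endpointType": "surrogate-lipids",
     "direction": "strongly-supports"},
]

summary = directions_by_endpoint(assessments)
for endpoint, directions in summary.items():
    print(endpoint, directions)
```

Here study-x contributes two separate assessments, so its favourable surrogate result cannot mask its neutral mortality result.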

6. Treatment effects: relative vs absolute

Treatment benefit must never be shown as relative risk reduction alone. Where a study reports a benefit or harm, we record, where available:

relative risk reduction (RRR) and absolute risk reduction (ARR);
number needed to treat (NNT) and number needed to harm (NNH);
the baseline risk of the population (a 30% RRR means very different things at 1% vs 20% baseline risk);
the prevention context (primary vs secondary).
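The baseline-risk point can be made concrete with a little arithmetic. The sketch below assumes the relative risk reduction applies uniformly across baseline risks, which is a simplification; the function name is hypothetical. It uses ARR = baseline risk x RRR and NNT = 1 / ARR, rounded up to a whole person.

```python
import math


def absolute_effect(baseline_risk, rrr):
    """Convert a relative risk reduction into (ARR, NNT) for one baseline risk."""
    if not 0 < baseline_risk <= 1 or not 0 < rrr <= 1:
        raise ValueError("baseline_risk and rrr must be proportions in (0, 1]")
    arr = baseline_risk * rrr  # absolute risk reduction, as a proportion
    # round() guards against float noise before rounding up to a whole person
    nnt = math.ceil(round(1 / arr, 9))
    return arr, nnt


# The same 30% RRR at two very different baseline risks.
low_arr, low_nnt = absolute_effect(0.01, 0.30)
high_arr, high_nnt = absolute_effect(0.20, 0.30)
print(f"1% baseline:  ARR {low_arr:.1%}, NNT {low_nnt}")
print(f"20% baseline: ARR {high_arr:.1%}, NNT {high_nnt}")
```

At 1% baseline risk the absolute reduction is 0.3 percentage points (NNT 334); at 20% it is 6 percentage points (NNT 17). The relative figure is identical in both cases, which is why it is never shown alone.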

7. Claims: steelmans and the bottom line

Every claim must:

Carry a mainstream steelman and a skeptic steelman - each the strongest honest case for that side, not a strawman. The schema enforces a minimum length; the review enforces good faith.
State an agreementLevel (broad-consensus, leaning, contested, deeply-disputed) that is honest about how settled the question is.
Carry a clearly labelled bottom line with a confidence level. The bottom line is editorial judgment, not fact, and it is governed by one rule:

The bottom line must survive its own skeptic steelman. If you cannot write a fair steelman of the opposing view that your bottom line still answers, the bottom line is wrong or overstated.

State what would change this conclusion - the specific evidence that would move it. A claim whose author cannot say what would change their mind is a belief, not an assessment.
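The mechanical part of these requirements could be checked along the following lines. This is a sketch, not the project's validator: apart from agreementLevel, the field names are hypothetical, and the minimum steelman length is a placeholder because this document does not state the real threshold. Good faith cannot be checked by code and stays with the reviewer.

```python
AGREEMENT_LEVELS = {"broad-consensus", "leaning", "contested", "deeply-disputed"}
MIN_STEELMAN_CHARS = 200  # placeholder; the real minimum is not stated here


def claim_problems(claim):
    """Return a list of structural problems; an empty list means none found."""
    problems = []
    for side in ("mainstreamSteelman", "skepticSteelman"):
        if len((claim.get(side) or "").strip()) < MIN_STEELMAN_CHARS:
            problems.append(f"{side} is missing or too short")
    if claim.get("agreementLevel") not in AGREEMENT_LEVELS:
        problems.append("agreementLevel is not an allowed value")
    bottom_line = claim.get("bottomLine") or {}
    if not bottom_line.get("text") or not bottom_line.get("confidence"):
        problems.append("bottomLine needs both text and a confidence level")
    if not (claim.get("whatWouldChange") or "").strip():
        problems.append("whatWouldChange is missing")
    return problems


complete = {
    "mainstreamSteelman": "m" * MIN_STEELMAN_CHARS,
    "skepticSteelman": "s" * MIN_STEELMAN_CHARS,
    "agreementLevel": "contested",
    "bottomLine": {"text": "placeholder", "confidence": "moderate"},
    "whatWouldChange": "placeholder description of decisive evidence",
}
print(claim_problems(complete))
print(claim_problems({"agreementLevel": "settled"}))
```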

8. Person stance scores

People are placed on two axes, LDL (from benign to causal) and statins (from anti to pro), each scored from -1 to +1. These scores power an at-a-glance overview, but they are deliberately lossy summaries. Therefore:

Every score carries a rationale and at least one source (no unsourced scores).
The nuance lives in positionSummary, keyArguments, and sourced claimPositions, which the overview always links to.
The interface labels the plot as a lossy summary and never presents a dot as the totality of a person’s view.
criticisms (fair critiques) and conflictsOfInterest are recorded for everyone, mainstream and skeptic alike.
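The "no unsourced scores" rule lends itself to a simple check. The sketch below assumes a score is stored as a dictionary with value, rationale, and sources keys; those key names are hypothetical.

```python
def score_problems(score):
    """Check one stance score: in range, explained, and sourced."""
    problems = []
    value = score.get("value")
    if not isinstance(value, (int, float)) or not -1 <= value <= 1:
        problems.append("value must be a number between -1 and +1")
    if not (score.get("rationale") or "").strip():
        problems.append("rationale is missing")
    if not score.get("sources"):
        problems.append("at least one source is required")
    return problems


sourced = {"value": 0.6, "rationale": "placeholder rationale",
           "sources": ["placeholder-source-id"]}
unsourced = {"value": 1.4, "rationale": "", "sources": []}
print(score_problems(sourced))
print(score_problems(unsourced))
```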

9. Conflicts of interest

COI is surfaced for every person, study, and guideline, and applied evenly:

For mainstream figures/guidelines: pharmaceutical funding, advisory roles, institutional incentives.
For skeptics: book sales, supplement lines, paid clinics, subscription media, brand identity tied to a position.

Where none is known, that is stated explicitly (“None known”) rather than left blank, so silence is never mistaken for absence.

10. Citation verification

No citation enters the corpus unless it has been verified against a primary source (DOI, PubMed/PMC, the journal page, or the official guideline document). Each source records how and when it was verified (citation.verification). Memory of a study - including an AI’s or an author’s - is not a source. Quotes are recorded with a locator so a reader can find them in the original.

Quotes are content-verbatim with typographic normalization: the words, numbers, and order are exactly as published, but non-ASCII typography is normalized for portability - en-dashes and minus signs to a hyphen, the approximately sign to “~”, “I-squared” written as “I2”, “plus or minus” as “+/-”, and “x10” exponents written inline. Where two non-adjacent fragments of one sentence are quoted together, an ellipsis (“…”) marks the omission and a note records it. No wording is paraphrased inside quotation marks.
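The character-level part of this normalization can be sketched as a lookup table. The code points below are assumptions about which characters the rule refers to (for example U+2248 for the approximately sign), the sample string is made up for illustration, and the "x10" exponent rule is deliberately left out because its exact output form is not specified here.

```python
# Typographic replacements only; words, numbers, and order are untouched.
NORMALIZATION = {
    "I\u00b2": "I2",   # I-squared (I followed by superscript two)
    "\u2013": "-",     # en-dash
    "\u2212": "-",     # minus sign
    "\u2248": "~",     # approximately sign (assumed code point)
    "\u00b1": "+/-",   # plus-or-minus sign
}


def normalize_quote(text):
    """Apply the typographic replacements without changing any wording."""
    for original, replacement in NORMALIZATION.items():
        text = text.replace(original, replacement)
    return text


# Made-up sample string, not a quotation from any study.
sample = "I\u00b2 \u2248 40%, 95% CI 0.70\u20130.87, change \u22120.5 \u00b1 0.1"
print(normalize_quote(sample))
```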

If a primary source is paywalled and a detail (for example a full funding statement) cannot be verified, that detail is marked pending verification rather than asserted - the same standard applies to inconvenient and convenient facts alike.

11. Uncertainty language

We use consistent wording to avoid overclaiming:

“causes / reduces” only for high-tier, direct, consistent evidence.
“is associated with / predicts” for observational/correlational findings (association is not causation).
“may / suggests / is consistent with” for low-tier, mechanistic, or single-study findings.
“contested / disputed” where serious, well-evidenced disagreement exists.
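As an illustration only, the first three wording rules could be expressed as a lookup. The precedence is an assumption: this sketch checks for observational designs first, so they never receive causal language, and it falls back to the most cautious wording for any case the rules above do not cover, such as a moderate-tier trial. Whether a question counts as "contested" is an editorial call and is not modelled.

```python
def wording_for(tier, direct, consistent, observational):
    """Pick the permitted verb family for a finding (hypothetical helper)."""
    if observational:
        return "is associated with / predicts"
    if tier == "high" and direct and consistent:
        return "causes / reduces"
    return "may / suggests / is consistent with"


print(wording_for("high", direct=True, consistent=True, observational=False))
print(wording_for("moderate", direct=True, consistent=True, observational=True))
print(wording_for("low", direct=False, consistent=False, observational=False))
```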

12. Versioning and review

Editorial judgments change as evidence accrues. Every record carries:

reviewStatus (unreviewed, in-review, reviewed, needs-revision);
reviewedBy and lastReviewedAt;
classificationVersion, bumped when a judgment materially changes.

Git history is the full audit log. The content validator (scripts/validate-content.mjs) blocks records that violate the traceability rules, and the coverage of the corpus is tracked in COVERAGE_MATRIX.md.
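A version bump on material change might look like the sketch below. Which fields count as "material" is an editorial decision; the MATERIAL_FIELDS tuple and the function name are assumptions for illustration, not the project's rule.

```python
# Assumed set of fields whose change counts as a material judgment change.
MATERIAL_FIELDS = ("direction", "assessmentTier", "endpointType")


def apply_revision(record, changes):
    """Return an updated copy, bumping classificationVersion if needed."""
    updated = {**record, **changes}
    material = any(record.get(k) != updated.get(k) for k in MATERIAL_FIELDS)
    if material:
        updated["classificationVersion"] = record.get("classificationVersion", 1) + 1
    return updated


record = {"direction": "supports", "assessmentTier": "high",
          "endpointType": "clinical-events", "reasoning": "placeholder",
          "classificationVersion": 1}

reworded = apply_revision(record, {"reasoning": "clearer placeholder"})
reclassified = apply_revision(record, {"direction": "mixed"})
print(reworded["classificationVersion"], reclassified["classificationVersion"])
```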

13. How to audit any conclusion

For any bottom line on the site you can:

open the claim and read both steelmans and “what would change this”;
see every assessment for that claim, each with its direction, endpoint, tier, reasoning, and a quote with a locator;
open the underlying study, see its GRADE-lite tier rationale, and follow the verified citation to the primary source.

If any step in that chain is missing, it is a bug - please report it.
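The audit chain above can be walked mechanically. The sketch below is not scripts/validate-content.mjs; it assumes dictionary records, and apart from tierRationale, assessmentTier, endpointType, and citation.verification the key names are hypothetical.

```python
ASSESSMENT_FIELDS = ("direction", "endpointType", "assessmentTier",
                     "reasoning", "quote", "locator")


def audit_gaps(claim, studies):
    """List every missing link between a claim and its primary sources."""
    gaps = []
    for key in ("mainstreamSteelman", "skepticSteelman", "whatWouldChange"):
        if not claim.get(key):
            gaps.append(f"claim: missing {key}")
    for a in claim.get("assessments", []):
        sid = a.get("studyId", "?")
        gaps += [f"{sid}: missing {k}" for k in ASSESSMENT_FIELDS if not a.get(k)]
        study = studies.get(sid)
        if study is None:
            gaps.append(f"{sid}: study record not found")
            continue
        if not study.get("tierRationale"):
            gaps.append(f"{sid}: missing tierRationale")
        if not (study.get("citation") or {}).get("verification"):
            gaps.append(f"{sid}: citation not verified")
    return gaps


studies = {"study-x": {"tierRationale": "placeholder",
                       "citation": {"verification": "placeholder"}}}
claim = {
    "mainstreamSteelman": "placeholder", "skepticSteelman": "placeholder",
    "whatWouldChange": "placeholder",
    "assessments": [{"studyId": "study-x", "direction": "supports",
                     "endpointType": "clinical-events", "assessmentTier": "high",
                     "reasoning": "placeholder", "quote": "placeholder",
                     "locator": "placeholder"}],
}
print(audit_gaps(claim, studies))
```

An empty list means every step in the chain is present; it says nothing about whether the judgments themselves are right.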

14. Limitations

Classifications are expert-informed editorial judgments, not a peer-reviewed meta-analysis. They can be wrong; the versioning and correction process exists for exactly that reason.
Scope is bounded (see SEARCH_PROTOCOL.md); the corpus is systematic within that scope, not exhaustive of all literature.
The two-axis person plot is a simplification, by design and with disclaimers.