If you sequence the tumor of a cancer patient, you might find 10,000 somatic variants. Which one is driving the cancer? If you sequence a child with a rare developmental disorder, you might find 50 novel variants not seen in the parents. Which one is the culprit?
Have you run into a confusing p-value in your genomic data recently? Let me know in the comments.
By applying linear models across the entire genome, we can now tell a 20-year-old: "Based on your 1.2 million variants, your statistical risk for heart disease is in the top 10% of the population." You cannot Google your way through genomic variation. The human genome is too noisy, too large, and too complex for intuition.
Biostatistics gives us the polygenic risk score (PRS):

\[ \mathrm{PRS} = \sum_{i} \left( \mathrm{EffectSize}_i \times \mathrm{NumberOfRiskAlleles}_i \right) \]
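To make the sum concrete, here is a minimal Python sketch of the score computed for one person. The variant IDs, effect sizes, and allele counts are invented for illustration, and the function name `polygenic_risk_score` is hypothetical. Treating a variant with no genotype call as zero risk alleles is a simplifying assumption, not a recommendation.

```python
# Hypothetical per-variant effect sizes (for example, log odds ratios from a GWAS).
effect_sizes = {"rs0001": 0.12, "rs0002": -0.05, "rs0003": 0.30}

# Hypothetical genotype: number of risk alleles (0, 1, or 2) carried at each variant.
risk_allele_counts = {"rs0001": 2, "rs0002": 1, "rs0003": 0}


def polygenic_risk_score(effect_sizes, risk_allele_counts):
    """Sum effect size times risk-allele count over every scored variant.

    Assumption: a variant missing from the genotype counts as 0 risk alleles.
    """
    return sum(
        beta * risk_allele_counts.get(variant, 0)
        for variant, beta in effect_sizes.items()
    )


score = polygenic_risk_score(effect_sizes, risk_allele_counts)
print(f"PRS = {score:.2f}")  # 0.12*2 + (-0.05)*1 + 0.30*0 -> PRS = 0.19
```

A raw score like this only means something relative to other people: a "top 10%" statement comes from comparing one person's score against the distribution of scores in a reference population.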