#statgen2024 — Public Fediverse posts
Live and recent posts from across the Fediverse tagged #statgen2024, aggregated by home.social.
-
STATGEN 2024 talk
A Kernel-Based Neural Network for High-dimensional Risk Prediction on Massive Genetic Data
Qing Lu
Neural Network
Nonlinear
Non-additive
Kernel-Based Neural Network (KNN)
Kernel matrices constructed from the genetic variables.
Related preprint:
An Association Test Based on Kernel-Based Neural Networks for Complex Genetic Association Analysis
https://arxiv.org/abs/2312.06669
1/
-
STATGEN 2024 talk
Improved methods for empirical Bayes multivariate multiple testing and effect size estimation
Yunqi Yang
Empirical Bayes multivariate normal means (EBMNM) model [Urbut et al., 2019]
Allow for heterogeneous sharing of eQTLs in multiple tissues (e.g., some are shared across all tissues, some are shared only within brain tissues, etc.)
Truncated Eigenvalue Decomposition
udr: Ultimate Deconvolution in R
https://stephenslab.github.io/udr/
-
STATGEN 2024 talk
MultiSTAAR: A statistical framework for powerful multi-trait rare variant analysis in large-scale whole-genome sequencing studies
Xihao Li
Functionally-informed Multi-Trait MultiSTAAR approach.
MultiSTAAR-O: Omnibus test
1. Burden
2. SKAT
3. ACAT-V
Li X et al. A statistical framework for powerful multi-trait rare variant analysis in large-scale whole-genome sequencing studies. bioRxiv doi: 10.1101/2023.10.30.564764.
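One way an omnibus test can merge the three component tests is a Cauchy combination of their p-values, as in ACAT. The notes above do not say how MultiSTAAR-O combines them, so treat this as a rough sketch under that assumption; the function name and the p-values are hypothetical.

```python
import math

def cauchy_combine(p_values, weights=None):
    """Combine p-values with a Cauchy combination (ACAT-style) statistic."""
    if weights is None:
        # Assumption: equal weights that sum to 1.
        weights = [1.0 / len(p_values)] * len(p_values)
    # Under the null, tan((0.5 - p) * pi) is a standard Cauchy variate.
    statistic = sum(
        w * math.tan((0.5 - p) * math.pi) for w, p in zip(weights, p_values)
    )
    # Map the weighted sum back to a p-value through the Cauchy upper tail.
    return 0.5 - math.atan(statistic) / math.pi

# Hypothetical p-values for one region from the three component tests.
component_p = {"burden": 0.04, "skat": 0.002, "acat_v": 0.3}
omnibus_p = cauchy_combine(list(component_p.values()))
```

The combined value is driven mostly by the smallest component p-value, which is why this kind of combination suits an omnibus test over methods with different strengths.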
-
STATGEN 2024 talk
Adventures in Human Genetics: Purpose, Serendipity, Innovation
Gonçalo Abecasis
"It is important to think carefully about what is the right question, and what are the right statistics. But there is a lot of opportunity in thinking about what is the best design to answer the question."
Goal
Understand disease
Treat
Predict disease
Prevent
Can learn from natural experiments in millions of people.
1/
-
STATGEN 2024 talk
Working towards Inclusivity in Genetic Studies: Estimating accurate population structure with Small Reference Sample Sizes
Souha Tifour
Arriaga-MacKenzie et al. Summix: A method for detecting and adjusting for population structure in genetic summary data. Am J Hum Genet. 2021 Jul 1;108(7):1270-1282. doi: 10.1016/j.ajhg.2021.05.016.
Summix relies on reference populations, but what if the ref pop is small?
1/
-
STATGEN 2024 talk
Genotype prediction of 336,463 samples from public expression data
Afrooz Razi
recount3: uniformly processed RNA-seq
https://rna.recount.bio/
We developed a statistical model to predict genotypes from the Recount3 data
It has high prediction accuracy.
1/
-
STATGEN 2024 talk
BRCAPRO+BCRAT: extending a Mendelian breast cancer risk prediction model to include non-genetic risk factors
Zoe Guan
BRCAPRO: Mendelian model, genes
BCRAT: 1st family hx, hormonal risk factors, hx of benign disease
Combine these complementary models.
https://www.mdpi.com/2072-6694/15/4/1090
#STATGEN2024 #Genetics #BreastCancer #RiskPrediction #StatisticalGenetics
-
STATGEN 2024 talk
Polygenic risk score analysis for multiethnic populations
Chris Amos
Polygenic Risk Scores (PRS)
* Inform re biological processes
* Identify some at higher risk
* Might motivate behavioral change
PRS could inform when to start screening.
"measles plot instead of a manhattan plot" - has excessive false positives all over the genome.
A lung cancer risk SNP is also related to response to smoking cessation.
1/
-
STATGEN 2024 talk
Bayesian Meta-Analysis of Penetrance for Cancer Risk with Adjustment for Ascertainment Bias
Swati Biswas
Need accurate estimates of age-specific penetrance for cancer risk variants.
https://arxiv.org/abs/2304.01912
Heterogeneous studies w/ different measures of risk
Marabelli et al. Penetrance of ATM Gene Mutations in Breast Cancer: A Meta-Analysis of Different Measures of Risk. Genet Epidemiol. 2016 doi: 10.1002/gepi.21971
1/
-
STATGEN 2024 talk
Improving Genetic Risk Prediction with Genetic Architecture and Functional Annotations
Wei Jiang
Genome-wide Empirical Bayes to use both genetic architecture and functional annotations in a computationally efficient way.
* Summary-statistics-based
* No parameter tuning needed
* Has improved prediction accuracy over existing methods.
-
STATGEN 2024 talk
Novel Methods for Estimating Risk Parameters Associated with Polygenic Scores Using Case-Parent Trio Designs
Ziqiao Wang
Estimates of SNP effect sizes can be biased due to
* Population stratification
* Assortative mating
Prior method
PRS TDT (pTDT) (Weiner et al., Nat Genet 2017)
Goal
To develop a joint model that is flexible and robust
Assume family PGS ~ multivariate normal distribution w/ family-specific mean & var
1/
-
STATGEN 2024 talk
Linking variants to gene networks with multivariate association approaches
Xuanyao Liu
Detecting trans-eQTLs is challenging
- Small trans- effects
- Multiple-testing correction
- Overwhelmed by false positives
Trans-PCO method
PCO = PC-based omnibus test
https://github.com/liliw-w/Trans
1/
-
STATGEN 2024 talk
Localizing Rare-Variant Association Regions via Multiple Testing Embedded in an Aggregation Tree
Jichun Xie
Which variants
* Gene region
* Sliding window (fixed size)
* Varying window
DYNamic Aggregation TEsting (DYNATE) algorithm
"DYNATE dynamically and hierarchically aggregates smaller genomic regions into larger ones"
https://cran.r-project.org/package=DYNATE
1/
-
STATGEN 2024 talk
Quantile regression GWAS with related samples
Fan Wang
Quantile regression tests whether a genetic variant associates with various quantiles of a trait.
Quantile Rank Score test
- Distribution-free
No transformation needed.
- Very fast
Estimate the null model only once.
- R package: QRank https://cran.r-project.org/package=QRank
1/
#STATGEN2024 #StatisticalGenetics #Genetics #QuantileRegression
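For readers new to quantile regression, here is a small sketch of the check ("pinball") loss that defines a quantile. It is not the Quantile Rank Score test or the QRank package; the function names and the toy trait values are hypothetical and only illustrate why no distributional assumption or transformation is needed.

```python
def pinball_loss(residual, tau):
    """Check loss: weight tau on positive residuals, (1 - tau) on negative ones."""
    return residual * (tau - (1 if residual < 0 else 0))

def best_constant(values, tau):
    """Return the observed value that minimizes total pinball loss at level tau."""
    return min(
        values,
        key=lambda c: sum(pinball_loss(v - c, tau) for v in values),
    )

# Toy trait values with one extreme observation (hypothetical data).
trait = [1, 2, 3, 4, 100]
median_fit = best_constant(trait, 0.5)   # minimizer at tau = 0.5
upper_fit = best_constant(trait, 0.9)    # minimizer at tau = 0.9
```

Quantile regression replaces the constant with a linear function of genotype and covariates, so a variant can show an effect at an upper quantile while having little effect at the median.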
-
STATGEN 2024 talk
Distinct explanations underlie gene-environment interactions in the UK Biobank.
Arun Durvasula
Genetic effects across the genome may exhibit context dependence
- European vs. East Asian genetic correlation is less than 1 across a wide range of traits.
- Hinting at polygenic GxE
GxE can arise through different scenarios
- Imperfect genetic correlation
- Varying genetic variance
- Proportional amplification
1/
-
STATGEN 2024 talk
The influence of antipsychotic exposure on genetic susceptibility to obesity
Anne Justice
Many factors contribute to obesity risk, including medications.
Obesity Related to Antipsychotic Liability & Exposure (ORAcLE) Genetics Consortium
https://sites.wustl.edu/oracle/
Examine polygenic risk scores (PRS) for antipsychotic-induced weight gain in Geisinger MyCode, which began in 2007. 184,293 with genotype & whole exome data.
1/
-
STATGEN 2024 talk
Detecting latent systemic structure in deep phenotyping and genotyping data
Audrey Hendricks
Expecting systemic structure S to be the same/similar across all the traits.
Trait_i = X_i + E_i + (O_i + S)
How to infer S?
Multitrait finite mixture of regressions (MFMR) by Dahl et al (2019)
1/
-
STATGEN 2024 talk
Statistical Methods for Single-Cell RNA-Seq Analysis and Spatial Transcriptomics
Rafael Irizarry
tSNE and UMAP plots:
"They really aren't informative, but they are really pretty."
Negative control scRNAseq data set: the percent of zeros is very high, and contributes strongly to the first PC. tSNE plot 'discovers' new cells.
Transformed to log2(1 + CPM): looks zero-inflated.
Raw counts: Poisson
1/
#Genetics #STATGEN2024 #StatisticalGenetics #RNAseq #Transcriptomics
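One reading of the log2(1 + CPM) vs. raw counts note, sketched under assumptions: a plain Poisson model already predicts many zeros at low expression, and the log transform opens a wide gap between a count of 0 and a count of 1. The function names, the mean of 0.1 reads per cell, and the library size of 2,000 are hypothetical.

```python
import math

def poisson_zero_fraction(mean_count):
    """Expected fraction of zeros under a Poisson model with the given mean."""
    return math.exp(-mean_count)

def log2_cpm_plus_one(count, library_size):
    """Transform a raw count to log2(1 + counts per million)."""
    cpm = count / library_size * 1_000_000
    return math.log2(1 + cpm)

# Hypothetical low-expression gene: mean of 0.1 reads per cell.
expected_zeros = poisson_zero_fraction(0.1)

# Hypothetical cell with 2,000 total reads: compare counts of 0 and 1.
transformed_zero = log2_cpm_plus_one(0, 2000)
transformed_one = log2_cpm_plus_one(1, 2000)
```

With these numbers most cells have a zero count without any extra zero-generating process, yet on the transformed scale the zeros sit far from every nonzero value, which can look like zero inflation.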
-
STATGEN 2024 talk
Identifying GxE through Mendelian Randomization
Xiaofeng Zhu
Statistical power is low and detecting GxE is a challenge.
See:
Aschard H. A perspective on interaction effects in genetic association studies. Genet Epidemiol. 2016 Dec;40(8):678-688. doi: 10.1002/gepi.21989. Epub 2016 Jul 7. PMID: 27390122; PMCID: PMC5132101.
1/
#Genetics #STATGEN2024 #StatisticalGenetics #MendelianRandomization
-
STATGEN 2024 talk
An efficient method for network Mendelian randomization allows network structure discovery and effect estimation.
Jean Morrison
Existing methods
* GenomicSEM
* Network deconvolution
* Graph-cML
* bimmer
Our method
* Network empirical shrinkage Mendelian randomization (NESMR)
- likelihood-based
Assumptions
1. Causal effects between traits are linear with no interactions
1/
#Genetics #MendelianRandomization #STATGEN2024 #StatisticalGenetics
-
STATGEN 2024 talk
Synthetic Variables for Genetic Analysis with Censored Outcomes
Jin Zhou
ACCORD trial in T2D: more tightly controlling mean glycemic levels led to an increase in deaths due to CVD.
But variation in glucose levels is a risk factor.
What factors influence the level of glucose variability?
Developed method to do a GWAS of trait variability within a longitudinal context.
Developed a fast robust estimating equations method
1/
-
STATGEN 2024 talk
A New Test for Trait Mean and Variance Detects Unreported Loci for Blood Pressure Variation
Todd L. Edwards
If there is a GxE, then variance of Y differs by genotype.
If we don't know E, we can model both the mean & variance as a function of the main predictor.
Known for a long time - e.g. Waddington (1942) Canalization of development and the inheritance of acquired characters.
1/
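A small simulation can illustrate why an unobserved E shows up as a variance difference by genotype. This is a sketch under assumed values, not the test from the talk: the model Y = beta_g*G + beta_gxe*G*E + noise, the effect sizes, and the function name are all hypothetical.

```python
import random
import statistics

def simulate_trait_variance(n_per_genotype=20000, beta_g=0.3, beta_gxe=0.5, seed=1):
    """Simulate Y = beta_g*G + beta_gxe*G*E + noise and return Var(Y) per genotype."""
    rng = random.Random(seed)
    variance_by_genotype = {}
    for genotype in (0, 1, 2):
        trait = []
        for _ in range(n_per_genotype):
            exposure = rng.gauss(0, 1)  # E, unobserved by the analyst
            noise = rng.gauss(0, 1)
            trait.append(beta_g * genotype + beta_gxe * genotype * exposure + noise)
        variance_by_genotype[genotype] = statistics.variance(trait)
    return variance_by_genotype

variance_by_genotype = simulate_trait_variance()
```

Under this model Var(Y | G = g) = (beta_gxe * g)^2 + 1, so the variance rises with the allele count even though E never enters the analysis.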
-
STATGEN2024 talk
Optimizing Polygenic Risk Scores for Diverse Populations
Nilanjan Chatterjee
Breast Cancer example:
Progress in developing PRS - OR per SD increasing from 1.49 (77 SNPs) to 1.64 (313 SNPs) to 1.71 (3,820 SNPs).
313 SNP PRS out-performed a 6-million SNP PRS. "Let's not get jazzed by the number of SNPs in the PRS".
Wanted to develop a method for better PRSs in diverse pops, so collaborated with 23andMe.
1/
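To put the OR-per-SD figures on a more familiar scale, one common approximation converts them to an AUC. The sketch assumes the PRS is normally distributed with equal variance in cases and controls and that the log OR per SD approximates the case-control mean difference in SD units; neither assumption comes from the talk, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def auc_from_or_per_sd(odds_ratio_per_sd):
    """Approximate AUC as Phi(ln(OR per SD) / sqrt(2)) under a binormal model."""
    mean_shift = math.log(odds_ratio_per_sd)  # case-control difference in SD units
    return NormalDist().cdf(mean_shift / math.sqrt(2))

# OR per SD values quoted in the post, keyed by number of SNPs in the PRS.
reported_or = {77: 1.49, 313: 1.64, 3820: 1.71}
implied_auc = {n_snps: auc_from_or_per_sd(o) for n_snps, o in reported_or.items()}
```

On this approximation the step from 313 to 3,820 SNPs moves the implied AUC much less than the step from 77 to 313, consistent with the point that SNP count alone is not the goal.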
-
STATGEN 2024 talk
Cell type specific functional characterization of Alzheimer's disease in microglia
Yun Li
Yang, X., Wen, J., Yang, H. et al. Functional characterization of Alzheimer’s disease genetic variants in microglia. Nat Genet 55, 1735–1744 (2023). https://doi.org/10.1038/s41588-023-01506-8
AD SNP h^2 most highly enriched in microglia regulatory regions.
iPSC differentiation of microglia
Identified candidate cis-regulatory elements (cCREs) near 37 AD loci
1/
-
STATGEN 2024
Interpreting structure in sequence count data with differential expression analysis allowing for grades of membership (GoM)
Peter Carbonetto
Allow a cell to be a partial member of > 1 group.
GoM is closely related to Non-negative Matrix Factorization (NMF).
Comparing k groups - a more stringent measure - drives more to zero than DESeq2 and has more power.
Groups = 'topics'. Can be cell-types or groups of cell-types, or more general
-
STATGEN 2024
Pleiotropy-robust methods for high-dimensional multivariable Mendelian randomization (HDMR)
Nathan LaPierre presenting, co-authors: Matthew Stephens, Xin He
In HDMR, we have many genetically correlated exposures, which may be explained by unobserved shared factors. These can be inferred by factor analysis.
Flexible, modular framework: Factor-Augmented MR
1. Factor Analysis
2. Regression/Variable Selection
#Genetics #StatisticalGenetics #MendelianRandomization #STATGEN2024
-
The keynote speaker opening the "STATGEN 2024: Conference on Statistics in Genomics and Genetics" is Kathryn Roeder, talking about "Testing of differential genomic outcomes in the presence of unmeasured confounding and missing data".
When testing gene expression across the genome, the majority of genes will follow the null. This enables QC checks, as the majority will not follow the null if we haven't adjusted adequately for unmeasured covariates.