| Title: | Debiased Inverse-Variance Weighted Estimator for Univariable and Multivariable Mendelian Randomization |
|---|---|
| Description: | Perform causal effect estimation for summary-data Mendelian randomization using IVW, dIVW and MV-SRIVW estimators. |
| Authors: | Ting Ye [aut, cre], Yinxiang Wu [aut] (ORCID: <https://orcid.org/0000-0001-7806-6999>) |
| Maintainer: | Ting Ye <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-05-19 09:04:05 UTC |
| Source: | https://github.com/remlapmot/mr.divw |
It contains independent datasets from three genome-wide association studies (GWASs):
Exposure dataset: A GWAS for BMI in round 2 of the UK Biobank (sample size: 336,107), http://www.nealelab.is/uk-biobank.
Outcome dataset: A GWAS for CAD from the CARDIoGRAMplusC4D consortium (sample size: ~185,000), with genotype imputation using the 1000 Genomes Project (PubMed 26343387).
Selection dataset: A GWAS for BMI in the Japanese population (sample size: 173,430) (PubMed 28892062).
data(bmi.cad)
A data.frame with 1119 rows and 42 variables.
https://github.com/qingyuanzhao/mr.raps
Summary Statistics from Simulated Individual-level data
data_gen_individual(case = c("case4", "case5", "case6", "case7"), true_var = FALSE)
| Argument | Description |
|---|---|
| case | Simulation scenario used in Section 5.3 of Ye et al. (2020). |
| true_var | Whether se.exposure, se.outcome, and se.selection equal the true standard deviations (TRUE) or the estimated standard errors (FALSE). Default is FALSE. |
A data frame
Ting Ye, Jun Shao, Hyunseung Kang (2020). Debiased Inverse-Variance Weighted Estimator in Two-Sample Summary-Data Mendelian Randomization. https://arxiv.org/abs/1911.09802
Summary Statistics Simulated from the BMI-CAD Dataset
data_gen_summary(case = c("case1", "case2", "case3", "case3_pleiotropy"))
| Argument | Description |
|---|---|
| case | Simulation scenario used in Section 5.1 of Ye et al. (2020). |
A data frame
Ting Ye, Jun Shao, Hyunseung Kang (2020). Debiased Inverse-Variance Weighted Estimator in Two-Sample Summary-Data Mendelian Randomization. https://arxiv.org/abs/1911.09802
It contains SNP-exposure and SNP-outcome association summary statistics from the following genome-wide association studies (GWASs):
Exposure dataset: A GWAS for traditional lipids http://csg.sph.umich.edu/willer/public/lipids2013/ (PubMed 24097068); A GWAS for subfractions http://www.computationalmedicine.fi/data#NMR_GWAS (PubMed 27005778).
Outcome dataset: A GWAS for CAD from the CARDIoGRAMplusC4D consortium http://www.cardiogramplusc4d.org/data-downloads/ (PubMed 28714975).
Selection dataset: A GWAS for traditional lipids https://www.ebi.ac.uk/gwas/studies/GCST007141 (PubMed 29507422); A GWAS for subfractions http://csg.sph.umich.edu/boehnke/public/metsim-2017-lipoproteins/ (PubMed 29084231).
data(hdl_subfractions)
A list with 3 elements: the first is a data.frame with 24 columns (see below for column descriptions), the second is an empty data.frame, and the third is an estimated correlation matrix.
SNP rsid
effect allele
other allele
Estimated associations of each SNP with, respectively, the traditional lipids HDL, LDL, and TG and the HDL subfractions S-HDL-P, S-HDL-TG, M-HDL-P, M-HDL-C, L-HDL-P, and L-HDL-C.
Standard error estimates for gamma_exp1,...,gamma_exp9
Estimated SNP-outcome association
Standard error estimate for gamma_out1
p-values for SNP-exposure associations in the selection dataset
https://github.com/tye27/mr.divw
Main function for dIVW
mr.divw(beta.exposure, beta.outcome, se.exposure, se.outcome, alpha = 0.05, pval.selection = NULL, lambda = 0, over.dispersion = FALSE, diagnostics = FALSE, overlap = FALSE, gen_cor = 0)
| Argument | Description |
|---|---|
| beta.exposure | A vector of SNP effects on the exposure variable, usually obtained from a GWAS |
| beta.outcome | A vector of SNP effects on the outcome variable, usually obtained from a GWAS |
| se.exposure | A vector of standard errors of beta.exposure |
| se.outcome | A vector of standard errors of beta.outcome |
| alpha | The confidence interval has level 1 - alpha |
| pval.selection | A vector of p-values calculated from the selection dataset that is used for IV selection. Not required when lambda = 0 |
| lambda | The specified z-score threshold. Default is 0 (no thresholding) |
| over.dispersion | Should the model consider balanced horizontal pleiotropy? Default is FALSE |
| diagnostics | Should the function return the Q-Q plot for assumption diagnostics? Default is FALSE |
| overlap | Should the model consider overlapping exposure and outcome datasets? Default is FALSE |
| gen_cor | If overlap = TRUE, provide an estimate of the correlation between the effect of the genetic variants on the exposure and the outcome. Default value is 0, meaning that the exposure and outcome datasets are non-overlapping. |
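
The interplay of pval.selection and lambda can be illustrated in base R. This is a minimal sketch, not the package's internal code: it assumes that the two-sided selection p-values are mapped to absolute z-scores with qnorm and that a SNP is kept when its absolute z-score exceeds lambda. The p-values are made up for illustration.

```r
# Hypothetical two-sided p-values from a selection GWAS (illustrative only)
pval.selection <- c(0.5, 0.04, 1e-4, 5e-8, 0.9)

# Assumed mapping: two-sided p-value -> absolute z-score
z.selection <- qnorm(pval.selection / 2, lower.tail = FALSE)

# Keep the SNPs whose absolute z-score exceeds the threshold lambda
lambda <- 2
keep <- z.selection > lambda

# With lambda = 0 every SNP with a p-value below 1 passes (no thresholding)
keep.all <- z.selection > 0

data.frame(pval.selection, z.selection, keep, keep.all)
```

Under these assumptions, a larger lambda trades the number of retained IVs against their average strength, which is the trade-off that mr.eo is described as tuning.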
A list
Estimated causal effect
Standard error of beta.hat
A measure that needs to be large for reliable asymptotic approximation based on the dIVW estimator. It is recommended to be greater than 20
Overdispersion parameter if over.dispersion=TRUE
Number of IVs used in the dIVW estimator
IVs that are used in the dIVW estimator
Ting Ye, Jun Shao, Hyunseung Kang (2020). Debiased Inverse-Variance Weighted Estimator in Two-Sample Summary-Data Mendelian Randomization. https://arxiv.org/abs/1911.09802
data(bmi.cad)
with(bmi.cad, mr.divw(beta.exposure, beta.outcome, se.exposure, se.outcome, diagnostics = TRUE))
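
For intuition, the point estimates behind IVW and dIVW can be written in a few lines of base R. This is a sketch based on the formulas in Ye et al. (2020), not the package's implementation. The simulated data, the true effect of 0.4, and all variable names are assumptions made for illustration. The last quantity is a reliability measure of the form suggested in the paper and is only assumed to correspond to the function's output when lambda = 0.

```r
set.seed(1)
p <- 500                                  # number of SNPs (assumed)
beta0 <- 0.4                              # true causal effect (assumed)
gamma <- rnorm(p, sd = 0.02)              # true SNP-exposure effects
se.exposure <- rep(0.01, p)
se.outcome <- rep(0.01, p)
beta.exposure <- rnorm(p, mean = gamma, sd = se.exposure)
beta.outcome <- rnorm(p, mean = beta0 * gamma, sd = se.outcome)

w <- 1 / se.outcome^2
numerator <- sum(w * beta.exposure * beta.outcome)

# IVW: the denominator uses the squared SNP-exposure estimates as they are
ivw <- numerator / sum(w * beta.exposure^2)

# dIVW: the denominator subtracts the measurement-error variance se.exposure^2
divw <- numerator / sum(w * (beta.exposure^2 - se.exposure^2))

# Estimated average IV strength and the resulting reliability measure
kappa.hat <- mean(beta.exposure^2 / se.exposure^2) - 1
condition <- kappa.hat * sqrt(p)

c(ivw = ivw, divw = divw, condition = condition)
```

Whenever the dIVW denominator is positive it is smaller than the IVW denominator, so the dIVW estimate is larger in magnitude than the IVW estimate. This is how the correction offsets the attenuation caused by noisy SNP-exposure estimates.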
MR-EO Algorithm to Adaptively Find the Optimal Z-score Threshold
mr.eo(lambda.start, beta.exposure, beta.outcome, se.exposure, se.outcome, pval.selection, over.dispersion = FALSE, max_opt_iter = 5)
| Argument | Description |
|---|---|
| lambda.start | Initial value for lambda (the z-score threshold) |
| beta.exposure | A vector of SNP effects on the exposure variable, usually obtained from a GWAS |
| beta.outcome | A vector of SNP effects on the outcome variable, usually obtained from a GWAS |
| se.exposure | A vector of standard errors of beta.exposure |
| se.outcome | A vector of standard errors of beta.outcome |
| pval.selection | A vector of p-values calculated from the selection dataset that is used for IV selection |
| over.dispersion | Should the model consider balanced horizontal pleiotropy? Default is FALSE |
| max_opt_iter | Maximum number of iterations. Default is 5 |
mr.eo is an adaptive algorithm that finds the optimal z-score threshold, that is, the threshold that leads to the dIVW estimator with the smallest variance.
A list
Optimal z-score threshold
Number of iterations to find lambda.opt
Ting Ye, Jun Shao, Hyunseung Kang (2020). Debiased Inverse-Variance Weighted Estimator in Two-Sample Summary-Data Mendelian Randomization. https://arxiv.org/abs/1911.09802
df <- data_gen_summary("case1")
lambda.opt <- with(df, mr.eo(0, beta.exposure, beta.outcome, se.exposure, se.outcome, pval.selection)$lambda.opt)
with(df, mr.divw(beta.exposure, beta.outcome, se.exposure, se.outcome, pval.selection = pval.selection, lambda = lambda.opt))
data(bmi.cad)
lambda.opt <- with(bmi.cad, mr.eo(0, beta.exposure, beta.outcome, se.exposure, se.outcome, pval.selection)$lambda.opt)
with(bmi.cad, mr.divw(beta.exposure, beta.outcome, se.exposure, se.outcome, pval.selection = pval.selection, lambda = lambda.opt))
Perform inverse-variance weighted (IVW) estimator for two-sample summary-data multivariable Mendelian randomization
mvmr.ivw(beta.exposure, se.exposure, beta.outcome, se.outcome, gen_cor = NULL)
| Argument | Description |
|---|---|
| beta.exposure | A data.frame or matrix. Each row contains the estimated marginal effects of a SNP on the K exposures, usually obtained from a GWAS |
| se.exposure | A data.frame or matrix of estimated standard errors of beta.exposure |
| beta.outcome | A vector of estimated marginal effects of the SNPs on the outcome, usually obtained from a GWAS |
| se.outcome | A vector of estimated standard errors of beta.outcome |
| gen_cor | A K-by-K matrix for the estimated shared correlation matrix between the effects of the genetic variants on each exposure, where K is the number of exposures. The correlations can be estimated, assumed to be zero, or fixed at zero by using non-overlapping samples for each exposure GWAS. Default input is NULL, meaning that an identity matrix is used as the correlation matrix. |
A list with elements
| Element | Description |
|---|---|
| beta.hat | Estimated direct effects of each exposure on the outcome |
| beta.se | Estimated standard errors of beta.hat |
| iv_strength_parameter | The minimum eigenvalue of the sample IV strength matrix, which quantifies the IV strength in the sample |
data("hdl_subfractions")
# Estimate the effect of S-HDL-P on CAD risk, adjusting for HDL, LDL, and TG
# Columns: SNP effects on HDL, LDL, TG, and S-HDL-P respectively
beta.exposure <- hdl_subfractions$data[, c("gamma_exp1", "gamma_exp2", "gamma_exp3", "gamma_exp4")]
se.exposure <- hdl_subfractions$data[, c("se_exp1", "se_exp2", "se_exp3", "se_exp4")]
beta.outcome <- hdl_subfractions$data$gamma_out1
se.outcome <- hdl_subfractions$data$se_out1
P <- hdl_subfractions$cor.mat[1:4, 1:4]
mvmr.ivw(beta.exposure = beta.exposure, se.exposure = se.exposure, beta.outcome = beta.outcome, se.outcome = se.outcome, gen_cor = P)
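
The standard multivariable IVW point estimate is a weighted least-squares fit, without intercept, of the SNP-outcome estimates on the SNP-exposure estimates with weights 1 / se.outcome^2. The base-R sketch below illustrates this on simulated data. It is not the package's code, it covers the point estimate only, and it ignores se.exposure and gen_cor. The dimensions, the true effects, and the variable names are assumptions made for illustration.

```r
set.seed(2)
p <- 300                                      # number of SNPs (assumed)
K <- 3                                        # number of exposures (assumed)
beta0 <- c(0.5, -0.2, 0.1)                    # true direct effects (assumed)
G <- matrix(rnorm(p * K, sd = 0.05), p, K)    # SNP-exposure estimates, one row per SNP
se.outcome <- runif(p, 0.005, 0.02)
beta.outcome <- as.vector(G %*% beta0) + rnorm(p, sd = se.outcome)

# Closed form: (G' W G)^{-1} G' W beta.outcome with W = diag(1 / se.outcome^2)
w <- 1 / se.outcome^2
beta.hat <- solve(crossprod(G, w * G), crossprod(G, w * beta.outcome))

# The same estimate from a weighted regression through the origin
fit <- lm(beta.outcome ~ 0 + G, weights = w)

cbind(closed.form = as.vector(beta.hat), lm = unname(coef(fit)))
```

Both columns agree up to numerical error, which shows that the closed form and the weighted regression are the same estimator.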
Perform spectral regularized inverse-variance weighted (SRIVW) estimator for summary-data multivariable Mendelian randomization
mvmr.srivw(beta.exposure, se.exposure, beta.outcome, se.outcome, phi_cand = 0, over.dispersion = TRUE, overlap = FALSE, gen_cor = NULL)
| Argument | Description |
|---|---|
| beta.exposure | A data.frame or matrix. Each row contains the estimated marginal effects of a SNP on the K exposures, usually obtained from a GWAS |
| se.exposure | A data.frame or matrix of estimated standard errors of beta.exposure |
| beta.outcome | A vector of estimated marginal effects of the SNPs on the outcome, usually obtained from a GWAS |
| se.outcome | A vector of estimated standard errors of beta.outcome |
| phi_cand | A vector of tuning parameters for the SRIVW estimator. Default is 0. To use the recommended set of tuning parameters, set phi_cand = NULL. |
| over.dispersion | Should the model consider balanced horizontal pleiotropy? Default is TRUE. |
| overlap | Should the model consider overlapping exposure and outcome datasets? Default is FALSE. |
| gen_cor | If overlap = FALSE, provide a K-by-K matrix for the estimated shared correlation matrix between the effects of the genetic variants on each exposure, where K is the number of exposures. If overlap = TRUE, provide a (K+1)-by-(K+1) matrix for the estimated shared correlation matrix between the effects of the genetic variants on each exposure and the outcome, where the last index position corresponds to the outcome. The correlations can be estimated, assumed to be zero, or fixed at zero. Default input is NULL, meaning that an identity matrix is used as the correlation matrix. |
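
The two shapes of gen_cor can be sketched with a toy correlation matrix. The 10-by-10 matrix below is hypothetical and only mimics a layout with nine exposures followed by the outcome. Its values are not real estimates, and the object names are chosen for illustration.

```r
# Toy correlation matrix: 9 exposures followed by the outcome (hypothetical values)
n.traits <- 10
cor.mat <- 0.3^abs(outer(seq_len(n.traits), seq_len(n.traits), "-"))

exposures <- 1:4      # indices of the K = 4 exposures to analyse
outcome <- 10         # index of the outcome

# overlap = FALSE: K-by-K matrix for the exposures only
P.no.overlap <- cor.mat[exposures, exposures]

# overlap = TRUE: (K+1)-by-(K+1) matrix with the outcome in the last position
idx <- c(exposures, outcome)
P.overlap <- cor.mat[idx, idx]

dim(P.no.overlap)
dim(P.overlap)
```

Putting the outcome index last in idx is what places the outcome in the final row and column, as the argument description requires.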
A list with elements
| Element | Description |
|---|---|
| beta.hat | Estimated direct effects of each exposure on the outcome |
| beta.se | Estimated standard errors of beta.hat |
| iv_strength_parameter | The minimum eigenvalue of the sample IV strength matrix, which quantifies the IV strength in the sample |
| phi_selected | The selected tuning parameter for the SRIVW estimator |
| tau.square | Overdispersion parameter if over.dispersion = TRUE |
data("hdl_subfractions")
# Estimate the effect of S-HDL-P on CAD risk, adjusting for HDL, LDL, and TG
# Columns: SNP effects on HDL, LDL, TG, and S-HDL-P respectively
beta.exposure <- hdl_subfractions$data[, c("gamma_exp1", "gamma_exp2", "gamma_exp3", "gamma_exp4")]
se.exposure <- hdl_subfractions$data[, c("se_exp1", "se_exp2", "se_exp3", "se_exp4")]
beta.outcome <- hdl_subfractions$data$gamma_out1
se.outcome <- hdl_subfractions$data$se_out1
# last index must correspond to the outcome
P <- hdl_subfractions$cor.mat[c(1:4, 10), c(1:4, 10)]
mvmr.srivw(beta.exposure = beta.exposure, se.exposure = se.exposure, beta.outcome = beta.outcome, se.outcome = se.outcome, gen_cor = P, phi_cand = NULL, over.dispersion = FALSE, overlap = TRUE)
Published Table 4
table_publish()
A table
table_publish()