Title: | The MRAPSS package implements the MR-APSS approach to test for the causal effect of an exposure on an outcome disease. |
---|---|
Description: | The MRAPSS package implements the MR-APSS approach to test for the causal effect of an exposure on an outcome disease. MR-APSS is a unified approach to Mendelian Randomization accounting for polygenicity, pleiotropy and sample structure using genome-wide summary statistics. Specifically, MR-APSS uses a background-foreground model to characterize the estimated SNP-exposure and SNP-outcome effects, where the background model accounts for confounding from genetic correlation and sample structure and the foreground model captures the valid signal for causal inference. |
Authors: | Xianghong HU [aut, cre] |
Maintainer: | Xianghong HU <[email protected]> |
License: | What license it uses |
Version: | 0.0.0.9000 |
Built: | 2025-01-09 08:30:01 UTC |
Source: | https://github.com/YangLabHKUST/MR-APSS |
Perform LD clumping to prune SNPs in LD within a window, keeping the most significant ones.
clump( dat, IV.Threshold = 5e-05, SNP_col = "SNP", pval_col = "pval.exp", clump_kb = 1000, clump_r2 = 0.001, clump_p = 0.999, pop = "EUR", bfile = NULL, plink_bin = NULL )
dat |
a data frame with columns containing SNP identifiers and p-values |
SNP_col |
column with SNP rsid. The default is '"SNP"' |
pval_col |
column with p-values. The default is '"pval.exp"' |
clump_kb |
clumping window in kb. Default is 1000. |
clump_r2 |
clumping r2 threshold. Default is 0.001. |
clump_p |
clumping significance level for index variants. Default = 0.999 |
bfile |
bfile to use as LD reference panel. If provided, local PLINK is used. Default = NULL. |
plink_bin |
path to local plink binary. Default = NULL. |
data frame of clumped SNPs
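The clumping rule described above can be illustrated without PLINK. The following is a minimal base R sketch, assuming a precomputed matrix of pairwise r2 values and base-pair positions for a handful of toy SNPs. The helper `greedy_clump` and the inputs `pos` and `r2_mat` are hypothetical names introduced for illustration; this is not how `clump()` itself is implemented.

```r
# Greedy LD clumping on toy data (hypothetical helper, not part of MRAPSS).
greedy_clump <- function(snp, pval, pos, r2_mat,
                         clump_kb = 1000, clump_r2 = 0.001, clump_p = 0.999) {
  ord <- order(pval)                       # visit the most significant SNPs first
  removed <- rep(FALSE, length(snp))
  kept <- character(0)
  for (i in ord) {
    if (removed[i] || pval[i] > clump_p) next
    kept <- c(kept, snp[i])                # SNP i becomes an index variant
    in_window <- abs(pos - pos[i]) <= clump_kb * 1000
    in_ld <- r2_mat[i, ] > clump_r2
    removed[in_window & in_ld] <- TRUE     # drop SNPs clumped with the index SNP
  }
  kept
}

# Toy inputs: all values below are made up for the illustration.
snp  <- c("rs1", "rs2", "rs3", "rs4")
pval <- c(1e-8, 1e-6, 1e-7, 1e-5)
pos  <- c(100000, 150000, 5000000, 5050000)
r2_mat <- diag(4)
r2_mat[1, 2] <- r2_mat[2, 1] <- 0.8      # rs1 and rs2 are in strong LD
r2_mat[3, 4] <- r2_mat[4, 3] <- 0.0005   # rs3 and rs4 are below the r2 threshold
dimnames(r2_mat) <- list(snp, snp)

kept <- greedy_clump(snp, pval, pos, r2_mat)
print(kept)
```

In this toy input rs2 is dropped because it lies within the window of the more significant rs1 and exceeds the r2 threshold, while rs4 survives because its r2 with rs3 is below the threshold.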
Harmonises the two datasets and estimates background parameters by LD score regression.
est_paras( dat1, dat2, trait1.name = "exposure", trait2.name = "outcome", LDSC = T, h2.fix.intercept = F, ldscore.dir = NULL )
dat1 |
formatted summary statistics for trait 1. |
dat2 |
formatted summary statistics for trait 2. |
trait1.name |
specify the name of trait 1, default 'exposure'. |
trait2.name |
specify the name of trait 2, default 'outcome'. |
LDSC |
whether to run LD score regression, default 'TRUE'. If 'FALSE', the function only harmonises the data and does not return parameter estimates. |
h2.fix.intercept |
whether to fix the LD score regression intercept to 1, default 'FALSE'. |
ldscore.dir |
specify the path to the LD score files. |
List with the following elements:
Harmonised data set
the estimated C matrix capturing the effects of sample structure
the estimated variance-covariance matrix for polygenic effects
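As a rough illustration of the idea behind LD score regression (not the estimator used by `est_paras()`), the sketch below simulates chi-square statistics whose expectation is linear in the LD score, following the standard univariate relation E[chi2] = intercept + N * h2 * L2 / M, and recovers the intercept and heritability with an unweighted `lm()` fit. The second fit fixes the intercept at 1, which is the situation the `h2.fix.intercept` argument refers to. All variable names and simulated values are assumptions made for this example.

```r
set.seed(1)
M  <- 200000    # number of SNPs (toy value)
N  <- 100000    # GWAS sample size (toy value)
h2 <- 0.3       # heritability used to simulate the data
a  <- 1.05      # intercept used to simulate the data

l2 <- rgamma(M, shape = 2, scale = 10)      # toy LD scores
expected_chi2 <- a + N * h2 * l2 / M        # mean of chi2 under the LDSC relation
chi2 <- expected_chi2 * rchisq(M, df = 1)   # toy statistics with that mean

# Free intercept: simple unweighted regression of chi2 on the LD score.
fit_free <- lm(chi2 ~ l2)
intercept_hat <- unname(coef(fit_free)[1])
h2_hat <- unname(coef(fit_free)[2]) * M / N

# Intercept fixed at 1: regress (chi2 - 1) on the LD score without an intercept.
fit_fixed <- lm(I(chi2 - 1) ~ 0 + l2)
h2_fixed <- unname(coef(fit_fixed)[1]) * M / N

print(c(intercept = intercept_hat, h2 = h2_hat, h2_fixed = h2_fixed))
```

Real LD score regression uses regression weights and block jackknife standard errors, which this sketch leaves out for brevity.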
Reads in GWAS summary data and infers Z-scores from p-values and signed statistics. This function is adapted from the format_data() function in MRCIEU/TwoSampleMR.
format_data( dat, snps.merge = w_hm3.snplist, snps.remove = MHC.SNPs, snp_col = "SNP", b_col = "b", or_col = "or", se_col = "se", freq_col = "freq", A1_col = "A1", A2_col = "A2", p_col = "p", ncase_col = "ncase", ncontrol_col = "ncontrol", n_col = "n", n = NULL, z_col = "z", info_col = "INFO", log_pval = FALSE, chi2_max = NULL, min_freq = 0.05 )
dat |
Data frame. Must have a header with at least SNP, A1, A2, a signed statistic, p-value and sample size. |
snps.merge |
Data frame with SNPs to extract. Must have headers SNP, A1 and A2. For example, the HapMap3 SNP list. |
snps.remove |
A set of SNPs to be removed. For example, the SNPs in the MHC region. |
snp_col |
Name of column with SNP rs IDs. The default is '"SNP"'. |
b_col |
Name of column with effect sizes. The default is '"b"'. |
se_col |
Name of column with standard errors. The default is '"se"'. |
freq_col |
Name of column with effect allele frequency. The default is '"freq"'. |
A1_col |
Name of column with effect allele. Must contain only the characters "A", "C", "T" or "G". The default is '"A1"'. |
A2_col |
Name of column with non-effect allele. Must contain only the characters "A", "C", "T" or "G". The default is '"A2"'. |
p_col |
Name of column with p-value. The default is '"p"'. |
ncase_col |
Name of column with number of cases. The default is '"ncase"'. |
ncontrol_col |
Name of column with number of controls. The default is '"ncontrol"'. |
n_col |
Name of column with sample size. The default is '"n"'. |
n |
Sample size |
z_col |
Name of column with Z-score. The default is '"z"'. |
info_col |
Name of column with imputation INFO score. The default is '"INFO"'. |
log_pval |
Whether the p-value column is given as -log10(p). The default is 'FALSE'. |
chi2_max |
SNPs with chi^2 statistics larger than chi2_max will be removed. The default is 'NULL'. |
min_freq |
SNPs with allele frequency less than min_freq will be removed. The default is '0.05'. |
or_col |
Name of column with odds ratio. The default is '"or"'. |
n_qc |
Whether to remove SNPs according to the sample size of SNPs. The default is |
data frame with headers: SNP: rsid; A1: effect allele; A2: non-effect allele; Z: Z score; N: sample size; chi2: chi-square statistic; P: p-value.
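The step of inferring Z-scores from p-values and signed statistics can be sketched in base R as follows. This is a minimal illustration assuming two-sided p-values; the helper name `infer_z` and the toy values are hypothetical, and the sketch does not reproduce the filtering that `format_data()` performs.

```r
# Recover Z-scores from two-sided p-values and the sign of a signed statistic.
# For odds ratios, log(or) can serve as the signed statistic.
infer_z <- function(p, signed_stat) {
  sign(signed_stat) * qnorm(p / 2, lower.tail = FALSE)
}

b <- c(0.12, -0.05, 0.30)     # toy effect size estimates
p <- c(0.01, 0.50, 1e-10)     # toy two-sided p-values

z <- infer_z(p, b)
chi2 <- z^2                   # chi-square statistic with 1 degree of freedom
print(data.frame(b = b, p = p, Z = z, chi2 = chi2))
```

Using `lower.tail = FALSE` rather than `1 - p / 2` keeps the quantile accurate for very small p-values.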
MR-APSS: a unified approach to Mendelian Randomization accounting for pleiotropy and sample structure using genome-wide summary statistics. MR-APSS uses a variational EM algorithm for parameter estimation and a likelihood ratio test for inference.
MRAPSS( MRdat = NULL, exposure = "exposure", outcome = "outcome", pi0 = NULL, sigma.sq = NULL, tau.sq = NULL, C = matrix(c(1, 0, 0, 1), 2, 2), Omega = matrix(0, 2, 2), Cor.SelectionBias = T, tol = 1e-08, ELBO = F )
MRdat |
data frame containing at least the following variables: b.exp, b.out, se.exp, se.out, L2 and Threshold. L2: LD score. Threshold: modified IV selection threshold for correction of selection bias. |
exposure |
exposure name |
outcome |
outcome name |
pi0 |
initial value for pi0. The default 'NULL' uses the default initialization procedure. |
sigma.sq |
initial value for sigma.sq. The default 'NULL' uses the default initialization procedure. |
tau.sq |
initial value for tau.sq. The default 'NULL' uses the default initialization procedure. |
C |
the estimated C matrix capturing the effects of sample structure. default 'diag(2)'. |
Omega |
the estimated variance-covariance matrix of polygenic effects. default 'matrix(0,2,2)'. |
Cor.SelectionBias |
Whether to use the selection Threshold to correct for selection bias. If FALSE, the model does not correct for selection bias. |
tol |
tolerance, default '1e-08' |
ELBO |
Whether to check the evidence lower bound (ELBO). If 'FALSE', the likelihood is checked instead. Default is 'FALSE'. |
a list with the following elements:
Input data frame
exposure of interest
outcome of interest
causal effect estimate
standard error
p-value
variance of foreground exposure effect
variance of foreground outcome effect
The probability that a SNP carries foreground signal after selection
Posterior estimates of latent variables
"MR-APSS"
library(MRAPSS)
exposure = "BMI"
outcome = "T2D"
Threshold = 5e-05 # IV selection Threshold
data(C)
data(Omega)
data(MRdat)
MRres = MRAPSS(MRdat, exposure = "BMI", outcome = "T2D", C = C, Omega = Omega, Cor.SelectionBias = T)
MRplot(MRres, exposure = "BMI", outcome = "T2D")
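The likelihood ratio test used for inference can be illustrated on a much simpler model. In the sketch below a toy one-parameter Gaussian likelihood stands in for the MR-APSS likelihood, which the package fits by variational EM. The simulated data, the function `loglik` and all variable names are assumptions made for this example and are not part of the package.

```r
set.seed(2)
n_iv      <- 200
b_exp     <- rnorm(n_iv, sd = 0.05)          # toy SNP-exposure effects
se_out    <- rep(0.01, n_iv)                 # toy standard errors
beta_true <- 0.4                             # causal effect used for simulation
b_out     <- beta_true * b_exp + rnorm(n_iv, sd = se_out)

# Log-likelihood of the toy model b_out ~ N(beta * b_exp, se_out^2).
loglik <- function(beta) {
  sum(dnorm(b_out, mean = beta * b_exp, sd = se_out, log = TRUE))
}

# Maximise over beta, then compare with the null model beta = 0.
fit      <- optimize(loglik, interval = c(-5, 5), maximum = TRUE)
beta_hat <- fit$maximum
lrt_stat <- 2 * (fit$objective - loglik(0))
p_value  <- pchisq(lrt_stat, df = 1, lower.tail = FALSE)

print(c(beta_hat = beta_hat, lrt_stat = lrt_stat, p_value = p_value))
```

Under the null hypothesis of no causal effect the statistic is compared with a chi-square distribution with one degree of freedom, because the null model fixes a single parameter.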
Visualize the MRAPSS results
MRplot(MRres, exposure = "trait 1", outcome = "trait 2")
MRres |
MRAPSS fit results |
exposure |
exposure name |
outcome |
outcome name |
Plot of SNP-exposure effect and SNP-outcome effect with the causal effect and 95% confidence interval.
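The kind of figure described above can be approximated with base graphics. This is a hedged sketch on simulated data, assuming a normal-approximation 95% confidence interval computed as estimate plus or minus 1.96 standard errors. The values of `beta_hat` and `beta_se` are hypothetical and the code does not reproduce the appearance of `MRplot()`.

```r
beta_hat <- 0.35   # hypothetical causal effect estimate
beta_se  <- 0.05   # hypothetical standard error
ci <- beta_hat + c(-1, 1) * qnorm(0.975) * beta_se   # 95% confidence interval

set.seed(3)
b_exp <- rnorm(50, mean = 0.03, sd = 0.01)           # toy SNP-exposure effects
b_out <- beta_hat * b_exp + rnorm(50, sd = 0.005)    # toy SNP-outcome effects

pdf(NULL)  # null device so the sketch runs non-interactively; remove to see the plot
plot(b_exp, b_out,
     xlab = "SNP-exposure effect", ylab = "SNP-outcome effect")
abline(a = 0, b = beta_hat, lwd = 2)   # line with slope equal to the causal effect
abline(a = 0, b = ci[1], lty = 2)      # lower 95% bound
abline(a = 0, b = ci[2], lty = 2)      # upper 95% bound
invisible(dev.off())

print(ci)
```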