Title: | R Package for MR Framework GRAPPLE |
---|---|
Description: | Fitting and diagnosing two-sample summary data Mendelian randomization with heterogeneous instruments. |
Authors: | Jingshu Wang [aut, cre], Qingyuan Zhao [aut] |
Maintainer: | Jingshu Wang <[email protected]> |
License: | GPL (>=2) |
Version: | 0.2.2 |
Built: | 2024-12-22 04:18:53 UTC |
Source: | https://github.com/jingshuw/GRAPPLE |
Compute the conditional Q-statistic for assessing instrument strength
computeQ(dat.list, p.thres = NULL)
dat.list |
Object returned from |
p.thres |
The p-value threshold for SNP selection. The SNPs whose |
The conditional Q-statistics, degrees of freedom, and the corresponding p-values
Eleanor Sanderson, George Davey Smith, Frank Windmeijer, Jack Bowden, An examination of multivariable Mendelian randomization in the single-sample and two-sample summary data settings, International Journal of Epidemiology, Volume 48, Issue 3, June 2019, Pages 713–727, https://doi.org/10.1093/ije/dyy262.
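As a rough illustration of how a Q-statistic and its degrees of freedom translate into a p-value, the sketch below assumes a chi-squared reference distribution, which is the usual choice for Q-type statistics. The numbers are made up, and `q_stat`, `df` and `p_value` are hypothetical names rather than the actual output fields of `computeQ`.

```r
# Hypothetical values for illustration; not produced by computeQ().
q_stat <- 35.2   # an illustrative Q-statistic
df <- 20         # illustrative degrees of freedom

# Upper-tail chi-squared probability (assumed reference distribution):
# a small value means the statistic is larger than expected under it.
p_value <- pchisq(q_stat, df = df, lower.tail = FALSE)
print(p_value)
```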
This function can be run only for k = 1, that is, when there is only one risk factor
findModes( data, p.thres = NULL, marker.data = NULL, marker.p.thres = NULL, mode.lmts = c(-5, 5), cor.mat = NULL, loss.function = c("tukey", "huber", "l2"), k.findmodes = switch(loss.function[1], l2 = NA, huber = 1.345, tukey = 3), include.thres = 1, exclude.thres = 2, map.marker = T, ldThres = 0.9, npoints = 10000 )
data |
A data frame containing the information of the selected genetic instruments.
One can simply take the |
p.thres |
The p-value threshold for SNP selection. The SNPs whose |
marker.data |
A data frame containing the information of candidate marker genes.
Default is NULL, which sets |
marker.p.thres |
P-value threshold for marker SNP selection. See |
cor.mat |
Either NULL or a |
loss.function |
Loss function used, one of "tukey", "huber" or "l2". Default is "tukey", which is robust to outlier SNPs with large pleiotropic effects |
k.findmodes |
Tuning parameter of the loss function. It is NA for loss "l2"; the default is 1.345 for loss "huber" and 3 for loss "tukey". |
include.thres |
Upper threshold on the absolute value of the standardized test statistic of one SNP on one mode for the SNP to be included as a marker for that mode. Default is 1, as in the usage above. |
exclude.thres |
Absolute value lower threshold of the standardized test statistics of one SNP on other modes for the SNP to be included as a marker for that mode, default is |
map.marker |
Whether to map each marker to the nearest gene or not. Default is TRUE if multiple markers are found. It is always FALSE if there is just one mode. |
ldThres |
the parameter passed to the |
npoints |
Number of equally spaced points chosen for grid search of modes within the range |
mode.lmts |
The range of |
A list containing the following elements:
fun |
The profile likelihood function with argument |
modes |
The position of modes. Only include modes where marker genes can be detected |
p |
The profile likelihood plot with gene markers when there are multiple modes. In that case, the range of the x-axis depends on the distance between the maximum mode and the minimum mode. |
markers |
A data frame of marker information |
raw.modes |
All modes of the profile likelihood function within the range of |
supp_gwas |
More information about the markers. |
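The grid search described for `npoints` and `mode.lmts` can be pictured with the minimal sketch below. It assumes that modes are local maxima of a function evaluated on equally spaced points; `toy_lik` is a made-up two-mode function standing in for the real profile likelihood, and the neighbour comparison is an illustrative rule, not necessarily how `findModes` locates modes internally.

```r
# Toy stand-in for a profile likelihood with two modes (an assumption for
# illustration; the real function is built from the data).
toy_lik <- function(beta) {
  dnorm(beta, mean = -2, sd = 0.5) + dnorm(beta, mean = 1.5, sd = 0.5)
}

mode_lmts <- c(-5, 5)   # search range, matching the mode.lmts default
npoints <- 10000        # number of equally spaced grid points

grid <- seq(mode_lmts[1], mode_lmts[2], length.out = npoints)
vals <- toy_lik(grid)

# An interior grid point is a local maximum if it beats both neighbours.
inner <- 2:(npoints - 1)
is_peak <- vals[inner] > vals[inner - 1] & vals[inner] > vals[inner + 1]
raw_modes <- grid[inner][is_peak]
print(raw_modes)
```

A finer grid (larger `npoints`) locates each mode more precisely at the cost of more function evaluations.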
This function takes GWAS summary statistics data files as inputs, performs genetic instrument selection, and returns matrices that are ready to use for GRAPPLE
getInput( sel.files, exp.files, out.files, plink_refdat, max.p.thres = 0.01, cal.cor = T, p.thres.cor = 0.5, get.marker.candidates = T, marker.p.thres = 1e-05, marker.p.source = "exposure", clump_r2 = 0.001, clump_r2_formarkers = 0.05, plink_exe = NULL )
sel.files |
A vector of the GWAS summary statistics file names for the risk factors SNP selection. Each GWAS file is a ".csv" or ".txt" file containing a data frame that at least has a column "SNP" for the SNP ids and "pval" for the p-values. The length of |
exp.files |
A vector of length |
out.files |
The GWAS summary statistics file name for the disease data, can be a vector of length |
plink_refdat |
The reference genotype files (.bed, .bim, .fam) for clumping using PLINK (loaded with --bfile). |
max.p.thres |
The upper threshold of the selection p-values for a SNP to be selected before clumping. A SNP qualifies if at least one of its p-values across the risk factors is below the threshold. Default is |
cal.cor |
Whether calculate the |
p.thres.cor |
The lower threshold of the p-values for a SNP to be used in calculating the correlation matrix. Only SNPs whose p-values are above the threshold for all risk factors are selected. Default is |
get.marker.candidates |
Whether to get SNPs that are used for mode marker selection. Only applies to cases where the number of risk factors |
marker.p.thres |
P-value threshold of p-values in the exposure files for mode markers. Default is |
marker.p.source |
Source of the p-values for mode markers, a string that is either "exposure" or "selection". Default is "exposure", which yields more markers. |
clump_r2 |
The clumping r2 threshold in PLINK for genetic instrument selection. Default is set to 0.001 for selection of independent SNPs. |
clump_r2_formarkers |
The clumping r2 threshold in PLINK. Default is set to 0.05 for selection of candidates for the marker SNPs. |
plink_exe |
The name of the PLINK executable. Default is NULL, which uses "plink". Linux users may need a different name, such as "./plink", depending on where PLINK is installed. |
A list of selected summary statistics, which include
data |
A data frame of size |
marker.data |
A data frame for marker candidate SNPs, which has the same columns as |
cor.mat |
The estimated |
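The two p-value rules described for `max.p.thres` and `p.thres.cor` can be sketched in base R as follows. The data frame and its column names `pval1` and `pval2` are hypothetical, the value 0.5 is the `p.thres.cor` default from the usage line, and the sketch leaves out clumping and file reading entirely; it is not the package's implementation.

```r
# Made-up selection p-values for four SNPs and two risk factors.
pvals <- data.frame(
  SNP = c("rs1", "rs2", "rs3", "rs4"),
  pval1 = c(0.001, 0.6, 0.2, 0.9),
  pval2 = c(0.3, 0.7, 0.005, 0.4),
  stringsAsFactors = FALSE
)
p_mat <- as.matrix(pvals[, c("pval1", "pval2")])
min_p <- apply(p_mat, 1, min)

# Instrument candidates before clumping: at least one p-value below the
# upper threshold (0.01 here).
max_p_thres <- 0.01
candidates <- pvals$SNP[min_p < max_p_thres]

# SNPs for the correlation matrix: all p-values above the lower
# threshold, i.e. the smallest p-value exceeds it.
p_thres_cor <- 0.5
null_snps <- pvals$SNP[min_p > p_thres_cor]

print(candidates)
print(null_snps)
```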
The main function of GRAPPLE to estimate causal effects of risk factors beta
under a random effect model of the pleiotropic effects.
grappleRobustEst( data, p.thres = NULL, cor.mat = NULL, tau2 = NULL, loss.function = c("tukey", "huber", "l2"), k = switch(loss.function[1], l2 = NA, huber = 1.345, tukey = 4.685), niter = 20, tol = .Machine$double.eps^0.5, opt.method = "L-BFGS-B", diagnosis = T, plot.it = T )
data |
A data frame containing the information of the selected genetic instruments.
One can simply take the |
p.thres |
The p-value threshold for SNP selection. The SNPs whose |
cor.mat |
Either NULL or a |
tau2 |
The dispersion parameter. The default value is NULL, which is to be estimated by the function |
loss.function |
Loss function used, one of "tukey", "huber" or "l2". Default is "tukey", which is robust to outlier SNPs with large pleiotropic effects |
k |
Tuning parameter of the loss function. It is NA for loss "l2"; the default is 1.345 for loss "huber" and 4.685 for loss "tukey". |
niter |
Number of maximum iterations allowed for optimization. Default is 20 |
tol |
Tolerance for convergence. Default is the square root of the machine epsilon (.Machine$double.eps), which depends on the machine R is running on |
opt.method |
the optimization used, which is one of choices the R function |
diagnosis |
Whether to run the diagnosis analysis based on the residuals or not. Default is TRUE, as in the usage above |
plot.it |
Whether to show the QQ-plot or not if diagnosis is performed. Default is TRUE. |
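The defaults for `k` and `tol` in the usage line are ordinary R expressions, so they can be checked directly in base R. The short sketch below only evaluates those expressions; the names `k_default` and `tol_default` are introduced here for illustration.

```r
# Same vector of choices as in the usage line.
loss.function <- c("tukey", "huber", "l2")

# Only the first element is used, so an untouched argument resolves
# to "tukey" and therefore to k = 4.685.
k_default <- switch(loss.function[1], l2 = NA, huber = 1.345, tukey = 4.685)
print(k_default)

# Default tolerance: square root of the machine epsilon.
tol_default <- .Machine$double.eps^0.5
print(tol_default)
```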
A list with elements
beta.hat |
Point estimates of |
tau2.hat |
Point estimates of the pleiotropic effect variance |
beta.variance |
Estimated covariance matrix of |
tau2.se |
Estimated standard deviation of |
beta.p.value |
A vector of p-values where the kth element is the p-value for whether |
std.resid |
Returned if |
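To show how a point estimate and its covariance matrix relate to p-values, the sketch below assumes a two-sided normal (Wald) test for each coefficient. That test is an assumption made for illustration, the numbers are made up, and the variable names only loosely mirror the returned elements.

```r
# Hypothetical estimates for two risk factors (made-up numbers).
beta_hat <- c(0.40, -0.05)
beta_variance <- matrix(c(0.010, 0.001,
                          0.001, 0.020), nrow = 2, byrow = TRUE)

# Standard errors are the square roots of the diagonal entries.
se <- sqrt(diag(beta_variance))
z <- beta_hat / se

# Assumed two-sided normal p-value for each coefficient.
p_values <- 2 * pnorm(-abs(z))
print(p_values)
```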
QQ-plot diagnosis and outlier detection from standardized residuals
qqDiagnosis( std.residuals, outlier.quantile = 0.1/length(std.residuals), plot.it = T )
std.residuals |
A vector of standardized residuals, can be the output element |
outlier.quantile |
The quantile threshold for outliers. |
plot.it |
Whether to show the QQ-plot or not. Default is TRUE. |
A list with two elements
p |
The QQ plot |
outliers |
The data frame for the detected outliers |
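One plausible reading of the outlier threshold is sketched below in base R: residuals are flagged when they fall beyond the two-sided standard normal cutoff implied by `outlier.quantile`. This rule is an assumption for illustration and may differ from what `qqDiagnosis` does internally; the residuals are made up.

```r
# Made-up standardized residuals; not output from grappleRobustEst().
std_resid <- c(-0.5, 0.3, 1.2, -1.1, 4.5, 0.1, -0.2, 0.8, -3.9, 0.0)

# Default threshold from the usage line: 0.1 / number of residuals.
outlier_quantile <- 0.1 / length(std_resid)

# Assumed rule: flag residuals beyond the two-sided normal cutoff.
cutoff <- qnorm(1 - outlier_quantile / 2)
outlier_idx <- which(abs(std_resid) > cutoff)
print(outlier_idx)

# Theoretical versus sample quantiles, computed without drawing a plot.
qq <- qqnorm(std_resid, plot.it = FALSE)
```

Because the default threshold shrinks with the number of residuals, the cutoff grows as more SNPs are included.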
Tukey's biweight loss function and its derivatives
rho.tukey(r, k = 4.685, deriv = 0)
r |
Function value |
k |
Tuning parameter, default value is 4.685 |
deriv |
Which function to calculate: 0 for Tukey's loss function itself, 1 for its first derivative, and 2 for its second derivative |
The value of the corresponding function at r
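For reference, the textbook form of Tukey's biweight loss and its first two derivatives can be written in a few lines of base R. The function `tukey_sketch` below is a hypothetical stand-in that follows the common scaling rho(r) = k^2/6 * (1 - (1 - (r/k)^2)^3) for |r| <= k; the scaling used by `rho.tukey` itself is not stated here and may differ.

```r
# Textbook Tukey biweight (assumed scaling; not the package's code).
tukey_sketch <- function(r, k = 4.685, deriv = 0) {
  u <- (r / k)^2
  inside <- abs(r) <= k
  switch(as.character(deriv),
    # Loss: bounded, constant at k^2 / 6 outside [-k, k].
    "0" = ifelse(inside, k^2 / 6 * (1 - (1 - u)^3), k^2 / 6),
    # First derivative: redescends to zero outside [-k, k].
    "1" = ifelse(inside, r * (1 - u)^2, 0),
    # Second derivative.
    "2" = ifelse(inside, (1 - u) * (1 - 5 * u), 0),
    stop("deriv must be 0, 1 or 2")
  )
}

print(tukey_sketch(c(0, 1, 10)))
print(tukey_sketch(1, deriv = 1))
```

The zero first derivative beyond `k` is what makes the loss insensitive to very large residuals.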
Calculate the robustified profile likelihood function
robustLossFixtau( data = NULL, b_exp = NULL, b_out = NULL, se_exp = NULL, se_out = NULL, cor.mat = NULL, loss.function = c("tukey", "huber", "l2"), k.findmodes = switch(loss.function[1], l2 = NA, huber = 1.345, tukey = 3) )
data |
A data frame containing the information of the selected genetic instruments.
One can simply take the |
cor.mat |
Either NULL or a |
loss.function |
Loss function used, one of "tukey", "huber" or "l2". Default is "tukey", which is robust to outlier SNPs with large pleiotropic effects |
k.findmodes |
Tuning parameter of the loss function. It is NA for loss "l2"; the default is 1.345 for loss "huber" and 3 for loss "tukey". |
The robustified profile likelihood function
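A minimal single-risk-factor sketch of such a function is given below. It assumes that each SNP contributes a standardized residual (b_out - beta * b_exp) / sqrt(se_out^2 + beta^2 * se_exp^2), passed through a loss function, with no extra pleiotropy variance and no correlation between the estimates. The names `profile_sketch` and `huber` and the data are hypothetical, and the exact form used by `robustLossFixtau` may differ.

```r
# Made-up summary statistics for four SNPs; b_out is exactly
# 0.5 * b_exp, so the slope that fits perfectly is 0.5.
b_exp <- c(0.1, 0.2, 0.3, 0.4)
b_out <- 0.5 * b_exp
se_exp <- rep(0.01, 4)
se_out <- rep(0.01, 4)

# Assumed form: negative summed loss of the standardized residuals.
# The default rho is the squared-error ("l2") loss.
profile_sketch <- function(beta, rho = function(x) x^2 / 2) {
  resid <- (b_out - beta * b_exp) / sqrt(se_out^2 + beta^2 * se_exp^2)
  -sum(rho(resid))
}

# Huber loss with the tuning constant 1.345 listed above.
huber <- function(x, k = 1.345) {
  ifelse(abs(x) <= k, x^2 / 2, k * (abs(x) - k / 2))
}

print(profile_sketch(0.5))             # perfect fit, zero loss
print(profile_sketch(0))               # l2 loss away from the fit
print(profile_sketch(0, rho = huber))  # Huber penalizes large residuals less
```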