| Title: | Implements a Ready-for-Use Mendelian Randomisation Pipeline |
|---|---|
| Description: | This package implements a pipeline that allows simple and largely "hands-free" Mendelian randomisation analyses to be run. Data may come from the OpenGWAS DB or from local .vcf files. Analyses include MR, colocalisation and standard MR sensitivity analyses. Please see the documentation for more details. |
| Authors: | Jamie Robinson |
| Maintainer: | Jamie Robinson <[email protected]> |
| License: | What license is it under? |
| Version: | 0.1.0 |
| Built: | 2026-06-10 11:57:58 UTC |
| Source: | https://github.com/jwr-git/mrpipeline |
Calculates proportion of variance explained. From S1 Text
.calc_pve(b, maf, se, n)
| Argument | Description |
|---|---|
| b | Vector or number, beta |
| maf | Vector or number, minor allele frequency |
| se | Vector or number, standard error of beta |
| n | Vector or number, sample size |
Vector or number, proportion of variance explained
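As a rough illustration, the proportion of variance explained can be computed from summary statistics as in the sketch below. It assumes one widely used summary-statistic formula; the source only cites "S1 Text", so the exact expression inside .calc_pve() may differ. The name pve_sketch is hypothetical and not part of the package.

```r
# Hypothetical helper, not part of mrpipeline.
# Assumed formula:
#   PVE = 2*b^2*maf*(1-maf) / (2*b^2*maf*(1-maf) + 2*se^2*n*maf*(1-maf))
pve_sketch <- function(b, maf, se, n) {
  explained <- 2 * b^2 * maf * (1 - maf)
  residual  <- 2 * se^2 * n * maf * (1 - maf)
  explained / (explained + residual)
}

# Works element-wise on vectors as well as on single numbers
pve_sketch(b = c(0.10, 0.05), maf = c(0.30, 0.10), se = c(0.01, 0.01), n = 10000)
```

Because every operation is vectorised, the same call handles a single SNP or a whole instrument.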
Helper function to extract colocalisation regions for when one dataset comes from a local file and another from OpenGWAS.
.cdat_from_mixed(f1, f2, chrpos, verbose = TRUE)
| Argument | Description |
|---|---|
| f1 | File path or OpenGWAS ID for trait 1 |
| f2 | File path or OpenGWAS ID for trait 2 |
| chrpos | Character of the format chr:pos1-pos2 |
| verbose | Display verbose information (Optional, boolean) |
list of coloc-ready data
[gwasglue::gwasvcf_to_coloc()], [gwasglue::ieugwasr_to_coloc()]
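Several functions take a region as a character string of the format chr:pos1-pos2. The sketch below shows one way such a string could be validated and split in base R. The helper parse_chrpos is hypothetical and is not how the package necessarily handles the string internally.

```r
# Hypothetical helper: split a "chr:pos1-pos2" string into its parts
parse_chrpos <- function(chrpos) {
  pattern <- "^([^:]+):([0-9]+)-([0-9]+)$"
  if (!grepl(pattern, chrpos)) {
    stop("Expected the format chr:pos1-pos2, got: ", chrpos)
  }
  # regexec() returns the full match followed by the three capture groups
  parts <- regmatches(chrpos, regexec(pattern, chrpos))[[1]]
  list(chr = parts[2], start = as.numeric(parts[3]), end = as.numeric(parts[4]))
}

region <- parse_chrpos("1:1000000-2000000")
region$end - region$start  # width of the region in base pairs
```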
Splits vector into chunks
.chunk(x, n)
| Argument | Description |
|---|---|
| x | Vector |
| n | Number of chunks to create |
list of chunks
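For illustration, splitting a vector into roughly equal chunks can be done in base R as follows. This is a sketch of the general technique, not the package's own implementation, and chunk_sketch is a hypothetical name.

```r
# Hypothetical helper: split x into (at most) n contiguous chunks
chunk_sketch <- function(x, n) {
  if (length(x) == 0) return(list())
  # Assign each element a chunk index from 1 to n, in order
  groups <- ceiling(seq_along(x) * n / length(x))
  unname(split(x, groups))
}

chunks <- chunk_sketch(1:10, 3)
lengths(chunks)  # sizes of the individual chunks
```

If n exceeds the length of x, fewer than n chunks are returned, each holding a single element.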
Sub-function for the colocalisation analyses
.coloc_sub(dat1, dat2, min_snps = 100, p1 = 1e-04, p2 = 1e-04, p12 = 1e-05, susie = FALSE, bfile = NULL, plink = NULL, verbose = TRUE)
| Argument | Description |
|---|---|
| dat1 | SNPs, etc. from first dataset |
| dat2 | SNPs, etc. from second dataset |
| min_snps | Minimum number of SNPs required for the analysis to continue (Optional) |
| p1 | p1 for coloc (Optional) |
| p2 | p2 for coloc (Optional) |
| p12 | p12 for coloc (Optional) |
| susie | Run SuSiE? (Optional, boolean) |
| bfile | Path to Plink bed/bim/fam files (Optional; required for SuSiE) |
| plink | Path to Plink binary (Optional; required for SuSiE) |
| verbose | Display verbose information (Optional, boolean) |
Results data.frame
Sub-function to run SuSiE and coloc
.coloc_susie_sub(d1, d2, bfile = NULL, plink = NULL, verbose = TRUE, ...)
| Argument | Description |
|---|---|
| d1 | Dataset 1 |
| d2 | Dataset 2 |
| bfile | Path to Plink bed/bim/fam files (Optional; required for SuSiE) |
| plink | Path to Plink binary (Optional; required for SuSiE) |
| verbose | Display verbose information (Optional, boolean) |
| ... | Other arguments passed to coloc.susie and coloc.bf |
Results data.frame
Looks up ENSGs using the Drug Gene Interaction Database (DGIdb) API.
.dgidb_linkage(ensgs)
| Argument | Description |
|---|---|
| ensgs | Vector of ENSG IDs |
data.frame of results
Attempts to convert ENSG IDs to gene names (hgnc_symbol). This uses biomaRt's service and thus requires the optional biomaRt package to be installed.
.ensg_to_name(dat, ensg_col = "trait", new_col = "hgnc_symbol", build = "grch37")
| Argument | Description |
|---|---|
| dat | Data.frame of data |
| ensg_col | Column name containing ENSG IDs (Optional) |
| new_col | Column to append to 'dat' with converted names (Optional) |
| build | Genomic build (Optional) |
Data.frame with appended column for names
Prepare gwasvcf files for coloc. This method will extract SNPs from one file using one chrompos and then look up those SNPs in the other file – this is to ensure coloc can be conducted upon two datasets of different genomic builds without the need of liftover.
.gwasvcf_to_coloc_rsid(vcf1, vcf2, chrompos, type1 = NULL, type2 = NULL, build1 = "GRCh37", build2 = "GRCh37", verbose = TRUE)
| Argument | Description |
|---|---|
| vcf1 | VCF object or path to VCF file |
| vcf2 | VCF object or path to VCF file |
| chrompos | Character of the format chr:pos1-pos2 |
list of coloc-ready data, or NA if failed
Write files for PWCoCo where data are read from two VCF objects or files.
.gwasvcf_to_pwcoco(vcf1, vcf2, chrompos, type1 = NULL, type2 = NULL, outfile)
| Argument | Description |
|---|---|
| vcf1 | VCF object or path to VCF file |
| vcf2 | VCF object or path to VCF file |
| chrompos | Character of the format chr:pos1-pos2 |
| type1 | How to treat vcf1 for coloc, either "quant" or "cc" (Optional) |
| type2 | How to treat vcf2 for coloc, either "quant" or "cc" (Optional) |
| outfile | Path to output files, without file ending |
0 if success, 1 if there was a problem
Write files for PWCoCo where data are read from the OpenGWAS DB.
.ieugwasr_to_pwcoco(id1, id2, chrompos, type1 = NULL, type2 = NULL, outfile)
| Argument | Description |
|---|---|
| id1 | ID for trait 1 |
| id2 | ID for trait 2 |
| chrompos | Character of the format chr:pos1-pos2 |
| type1 | How to treat trait 1 for coloc, either "quant" or "cc" (Optional) |
| type2 | How to treat trait 2 for coloc, either "quant" or "cc" (Optional) |
| outfile | Path to output files, without file ending |
0 if success, 1 if there was a problem
Calculates the inverse variance weighted delta method from the MendelianRandomization package
.ivw_delta(dat)
| Argument | Description |
|---|---|
| dat | Harmonised data.frame |
Results data.frame
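To show the general idea behind an inverse variance weighted (IVW) estimate, the sketch below implements the simple first-order, fixed-effect version in base R. It is deliberately not the delta-weighted variant from the MendelianRandomization package that .ivw_delta() refers to, and ivw_sketch with its argument names is hypothetical.

```r
# Hypothetical helper: first-order, fixed-effect IVW estimate.
# NOT the delta-method weighting used by MendelianRandomization.
ivw_sketch <- function(beta_exp, beta_out, se_out) {
  ratio <- beta_out / beta_exp        # per-SNP Wald ratios
  w     <- beta_exp^2 / se_out^2      # first-order inverse-variance weights
  b     <- sum(w * ratio) / sum(w)
  se    <- sqrt(1 / sum(w))
  data.frame(b = b, se = se, pval = 2 * pnorm(-abs(b / se)))
}

# Made-up summary statistics for two SNPs
res <- ivw_sketch(beta_exp = c(0.1, 0.2), beta_out = c(0.05, 0.10), se_out = c(0.01, 0.01))
res
```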
Helper function for message printing.
.print_msg(msg, verbose)
| Argument | Description |
|---|---|
| msg | Message |
| verbose | Display message or suppress |
Sub-function to run PWCoCo
.pwcoco_sub(bfile, chrpos, pwcoco, maf = 0.01, p1 = 1e-04, p2 = 1e-04, p12 = 1e-05, workdir = tempdir(), verbose = TRUE)
| Argument | Description |
|---|---|
| bfile | Path to Plink bed/bim/fam files |
| chrpos | Character of the format chr:pos1-pos2 |
| pwcoco | Path to PWCoCo executable |
| maf | MAF cut-off (Optional) |
| p1 | p1 for coloc (Optional) |
| p2 | p2 for coloc (Optional) |
| p12 | p12 for coloc (Optional) |
| workdir | Path to save temporary files (Optional) |
| verbose | Display verbose information (Optional, boolean) |
Results data.frame
Helper function that is called from read_exposure and read_outcome. Extracts exposure and outcome data according to arguments. Should not be called directly.
.read_dataset(ids, rsids = NULL, pval = 5e-08, proxies = TRUE, proxy_rsq = 0.8, proxy_kb = 5000, proxy_nsnp = 5000, plink = NULL, bfile = NULL, clump_r2 = 0.01, clump_kb = 10000, pop = "EUR", type = "exposure", cores = 1, cores_proxy = 1, verbose = TRUE)
| Argument | Description |
|---|---|
| ids | List of OpenGWAS IDs or file paths (to gwasvcf files) |
| rsids | List of SNP rsIDs to extract |
| pval | Threshold to extract SNPs (Optional) |
| proxies | Whether to search for proxies (Optional, boolean) |
| proxy_rsq | R2 threshold to use when searching for proxies (Optional) |
| proxy_kb | kb threshold to use when searching for proxies (Optional) |
| proxy_nsnp | Number of SNPs when searching for proxies (Optional) |
| plink | Path to Plink binary (Optional) |
| bfile | Path to Plink .bed/.bim/.fam files (Optional) |
| clump_r2 | r2 threshold for clumping SNPs (Optional) |
| clump_kb | Distance outside of which SNPs are considered in linkage equilibrium (Optional) |
| pop | Population (Optional, used only for clumping on OpenGWAS) |
| type | Type of data (Optional, "exposure" or "outcome") |
| cores | Number of cores for multi-threaded tasks (Optional). NB: Unavailable on Windows machines |
| cores_proxy | Number of cores for multi-threaded proxy searching (Optional). NB: Unavailable on Windows machines. NB: Should not be more than the 'cores' argument! |
| verbose | Display verbose information (Optional, boolean) |
Data.frame of datasets
[read_exposure()], [read_outcome()]
Calculates the second-term Taylor approximation for the standard error of the Wald ratio method. From supplementary
.wr_taylor_approx(dat)
| Argument | Description |
|---|---|
| dat | Harmonised data.frame |
Results data.frame
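As an illustration of the technique, the sketch below computes a Wald ratio and a standard error that keeps the second term of the delta-method (Taylor) expansion. This is one common form of that approximation; the exact expression used by .wr_taylor_approx() is not given in the source, and the function and argument names here are hypothetical.

```r
# Hypothetical helper: Wald ratio with a second-term delta-method SE.
# Assumed form:
#   se^2 = se_out^2 / beta_exp^2 + beta_out^2 * se_exp^2 / beta_exp^4
wald_ratio_sketch <- function(beta_exp, se_exp, beta_out, se_out) {
  b  <- beta_out / beta_exp
  se <- sqrt(se_out^2 / beta_exp^2 + beta_out^2 * se_exp^2 / beta_exp^4)
  data.frame(b = b, se = se, pval = 2 * pnorm(-abs(b / se)))
}

# Made-up summary statistics for a single SNP
wr <- wald_ratio_sketch(beta_exp = 0.5, se_exp = 0.05, beta_out = 0.1, se_out = 0.02)
wr
```

Dropping the second term under the square root gives the simpler first-order SE, se_out / |beta_exp|.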
Annotates the data using given IDs
annotate_data(dat, id1, id2)
| Argument | Description |
|---|---|
| dat | Data.frame of data from vcf files or OpenGWAS DB |
| id1 | DatasetsID class of exposure IDs |
| id2 | DatasetsID class of outcome IDs |
Data.frame of annotated dat
Attempts to annotate disease names with EFO IDs using the 'epigraphdb' package. Note that the matching is fuzzy and some disease names will have multiple associated EFOs which may differ in definition slightly.
annotate_efo(dat, column = "outcome")
| Argument | Description |
|---|---|
| dat | Data.frame of data |
| column | Column name containing disease names |
Data.frame with appended column for EFO IDs
Attempts to convert ENSG IDs to gene names (hgnc_symbol). This uses biomaRt's service and thus requires the optional biomaRt package to be installed.
annotate_ensg(dat, column = "exposure", gene_name_col = "hgnc_symbol", build = "grch37")
| Argument | Description |
|---|---|
| dat | Data.frame of data |
| column | Column name containing ENSG IDs (Optional) |
| gene_name_col | Column to append to 'dat' with converted names (Optional) |
| build | Genomic build (Optional) |
Data.frame with appended column for names
Calculates the proportion of variance explained and the F-statistic.
If the data lack key information (i.e. allele frequencies or sample size)
or consist of only one SNP, then the approximate F-statistic will be used
instead.
calc_f_stat(dat, f_cutoff = 10, force_approx = FALSE, verbose = TRUE)
| Argument | Description |
|---|---|
| dat | Data.frame from do_mr() |
| f_cutoff | F-statistic cutoff (Optional) |
| force_approx | Force to use the approximate F-statistic instead (Optional, boolean) |
| verbose | Display verbose information (Optional, boolean) |
Modified 'dat' data.frame (if f_cutoff > 0 supplied)
[do_mr()]
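For orientation, the sketch below shows two standard ways of computing an instrument F-statistic: one from the proportion of variance explained, and an approximate one that needs only beta and its standard error. Both formulas are assumptions based on common practice, since the source does not spell out the expressions used by calc_f_stat(). The function names are hypothetical.

```r
# Hypothetical helpers, not part of mrpipeline.
# F-statistic from proportion of variance explained (pve),
# sample size n and number of SNPs k:
f_stat_sketch <- function(pve, n, k = 1) {
  pve * (n - 1 - k) / ((1 - pve) * k)
}

# Approximate F-statistic when only beta and its standard error are known:
f_approx_sketch <- function(b, se) {
  (b / se)^2
}

f_full   <- f_stat_sketch(pve = 0.01, n = 10000)
f_approx <- f_approx_sketch(b = 0.1, se = 0.01)
c(full = f_full, approx = f_approx) > 10  # compare against a cutoff of 10
```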
Check if SNPs are good for use in analyses and mark them as such.
check_snps(dat, analyses = c("mr", "coloc"), drop = T)
| Argument | Description |
|---|---|
| dat | A data.frame of formatted data (exposure or outcome) |
| analyses | Which analyses should be checked? |
| drop | Whether to drop SNPs if they failed the check |
List of analyses and what data are checked for:
"MR": beta, SE, P value
"coloc": chromosome, position, P value
Data.frame
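The kind of check described above can be sketched in base R as follows: a SNP passes if all fields required by an analysis are present and non-missing. The helper check_snps_sketch, the column names and the data are all hypothetical and only illustrate the idea.

```r
# Hypothetical helper: flag (and optionally drop) rows with missing fields
check_snps_sketch <- function(dat, required, drop = TRUE) {
  missing_cols <- setdiff(required, names(dat))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  dat$ok <- stats::complete.cases(dat[, required, drop = FALSE])
  if (drop) dat <- dat[dat$ok, , drop = FALSE]
  dat
}

# Made-up data: rs2 lacks a beta and rs3 lacks a P value
snps <- data.frame(
  SNP  = c("rs1", "rs2", "rs3"),
  beta = c(0.1, NA, 0.3),
  se   = c(0.01, 0.02, 0.03),
  pval = c(1e-8, 1e-9, NA)
)
checked <- check_snps_sketch(snps, required = c("beta", "se", "pval"))
checked
```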
Attempts to annotate SNPs as cis or trans depending on their location relative to the gene coding region. This is achieved using the 'biomaRt' R package.
cis_trans(dat, cis_region = 5e+05, chr_col = "chr.exposure", pos_col = "position.exposure", snp_col = NULL, values_col = "exposure", filter = "ensembl_gene_id", missing = "include", build = "grch37")
| Argument | Description |
|---|---|
| dat | Data.frame of data |
| cis_region | Cis region definition (Optional, in kb) |
| chr_col | Column name for chromosome (Optional) |
| pos_col | Column name for position (Optional) |
| snp_col | Column name for SNP rsIDs (Optional) |
| values_col | Column name for gene names or ENSG IDs (Optional). NB: Choice must match the 'filter' value |
| filter | How to search for genes in biomaRt, e.g. "ensembl_gene_id" (Optional) |
| missing | "include" or "drop" SNPs which could not be matched (Optional) |
| build | Genomic build (Optional) |
Data.frame with appended column for cis/trans status
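The cis/trans decision itself reduces to a window test once gene coordinates are known. The sketch below shows that test in base R with made-up coordinates; it does not query biomaRt, the helper name is hypothetical, and it assumes the window is given in base pairs.

```r
# Hypothetical helper: a SNP is "cis" if it lies on the gene's chromosome
# and within cis_region (assumed here to be in base pairs) of the gene.
cis_trans_sketch <- function(snp_chr, snp_pos, gene_chr, gene_start, gene_end,
                             cis_region = 5e5) {
  same_chr  <- as.character(snp_chr) == as.character(gene_chr)
  in_window <- snp_pos >= gene_start - cis_region &
               snp_pos <= gene_end + cis_region
  ifelse(same_chr & in_window, "cis", "trans")
}

# Made-up gene on chromosome 1 spanning 1,000,000 to 1,100,000
cis_trans_sketch(
  snp_chr = c("1", "1", "2"),
  snp_pos = c(1200000, 5000000, 1050000),
  gene_chr = "1", gene_start = 1000000, gene_end = 1100000
)
```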
Combine MR and coloc results into one data.frame
combine_results(mr_res, coloc_res, mr_res.by = c("id.exposure", "id.outcome"), coloc_res.by = c("file.exposure", "file.outcome"))
| Argument | Description |
|---|---|
| mr_res | A data.frame of MR results from 'do_mr()' |
| coloc_res | A data.frame of coloc results from 'do_coloc()' |
| mr_res.by | MR columns to use for merging |
| coloc_res.by | Coloc columns to use for merging |
Data.frame of merged results
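The role of the two 'by' arguments can be illustrated with base R's merge(), which joins two data.frames on differently named key columns. The data below are made up, and whether combine_results() uses merge() internally is an assumption.

```r
# Made-up MR and coloc results keyed by differently named columns
mr_res <- data.frame(
  id.exposure = c("exp1", "exp2"),
  id.outcome  = c("out1", "out1"),
  b           = c(0.2, -0.1)
)
coloc_res <- data.frame(
  file.exposure = c("exp1", "exp2"),
  file.outcome  = c("out1", "out1"),
  PP.H4.abf     = c(0.91, 0.12)
)

# by.x / by.y play the role of mr_res.by / coloc_res.by
combined <- merge(
  mr_res, coloc_res,
  by.x = c("id.exposure", "id.outcome"),
  by.y = c("file.exposure", "file.outcome"),
  all.x = TRUE  # keep MR rows even when no coloc result exists
)
combined
```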
Function to convert a data.frame to gwasvcf format.
dat_to_gwasvcf(dat, out, chr_col, pos_col, nea_col, ea_col, snp_col = NULL, eaf_col = NULL, beta_col = NULL, se_col = NULL, pval_col = NULL, n = NULL, n_case = NULL, name = NULL, bcf_tools = NULL, verbose = TRUE)
| Argument | Description |
|---|---|
| dat | Data.frame |
| out | Path to save output |
| chr_col | Column name for chromosome |
| pos_col | Column name for position |
| nea_col | Column name for non-effect allele |
| ea_col | Column name for effect allele |
| snp_col | Column name for SNP (Optional) |
| eaf_col | Column name for effect allele frequency (Optional) |
| beta_col | Column name for beta (Optional) |
| se_col | Column name for standard error (Optional) |
| pval_col | Column name for P value (Optional). NB: P values will be saved as 10^-P |
| n | Sample size (Optional), can be int or column name |
| n_case | Number of cases (Optional), can be int or column name |
| name | Trait name (Optional), can be string or column name |
| bcf_tools | Path to bcf_tools (Optional) |
| verbose | Display verbose information (Optional, boolean) |
gwasvcf object
[file_to_gwasvcf()] for converting files.
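The note on pval_col concerns how P values are stored. The sketch below assumes the usual GWAS-VCF convention, in which the file holds -log10(P) and the P value is recovered as 10 to the power of minus that number; whether this is exactly what the note means is an assumption.

```r
# Made-up P values, including a very small one
p <- c(0.05, 5e-8, 1e-300)

# Stored representation: -log10(P)
lp <- -log10(p)

# Recovering the P value from the stored representation
p_back <- 10^(-lp)

all.equal(p, p_back)
```

Storing the log-scale value keeps very small P values representable in a text format without loss of precision.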
Runs colocalisation using any of the following methods:
Coloc.abf, see coloc::coloc.abf()
Coloc.susie, see coloc::coloc.susie()
PWCoCo, see PWCoCo
NB: PWCoCo is not available on Windows.
do_coloc(dat, cdat = NA, method = "coloc.abf", coloc_window = 5e+05, plot_region = F, bfile = NULL, plink = NULL, pwcoco = NULL, workdir = tempdir(), cores = 1, verbose = TRUE)
| Argument | Description |
|---|---|
| dat | A data.frame of harmonised data |
| cdat | A named list of regional data but can be omitted (Optional) |
| method | Which method of colocalisation to use: coloc.abf, coloc.susie, pwcoco (Optional) |
| coloc_window | Size (+/-) of region to extract for colocalisation analyses (Optional) |
| plot_region | Whether to plot the regions or not |
| bfile | Path to Plink bed/bim/fam files (Optional) |
| plink | Path to Plink binary (Optional) |
| pwcoco | If PWCoCo is the selected coloc method, path to PWCoCo binary (Optional) |
| workdir | Path to save temporary files (Optional) |
| cores | Number of cores for multi-threaded tasks (Optional). NB: Unavailable on Windows machines |
| verbose | Display verbose information (Optional, boolean) |
A data.frame of colocalisation results
[coloc::coloc.abf()], [coloc::coloc.susie()]
Runs Mendelian randomisation and related analyses:
Wald ratio, see .wr_taylor_approx()
Inverse variance weighted, see .ivw_delta()
Steiger filtering, see TwoSampleMR::directionality_test()
do_mr(dat, f_cutoff = 10, all_wr = TRUE, verbose = TRUE)
| Argument | Description |
|---|---|
| dat | A data.frame of harmonised data |
| f_cutoff | Define an F-statistic cutoff (Optional) |
| all_wr | Should the Wald ratio be calculated for all SNPs, even if IVW can be used? (Optional) |
| verbose | Display verbose information (Optional, boolean) |
A data.frame of MR results
[.wr_taylor_approx()], [.ivw_delta()], [TwoSampleMR::directionality_test()]
Uses the Drug Gene Interaction Database (DGIdb) API to search for drug target-related evidence, including the Druggable Genome, Clinically Actionable and Drug Resistant ontologies.
drug_target_evidence(dat, ensg_col = "exposure")
| Argument | Description |
|---|---|
| dat | A data.frame or named list |
| ensg_col | Column, or name, to be accessed in 'dat' |
The lookup MUST be ENSG IDs.
data.frame of results
Extract SNPs based on region for colocalisation analyses. Can be used before calling the 'do_coloc' function or will be called as part of that function automatically.
extract_matched_regions(dat, window = 5e+05, cores = 1, verbose = TRUE)
| Argument | Description |
|---|---|
| dat | A data.frame of harmonised data |
| window | Window around SNPs to extract (Optional) |
| cores | Number of cores for multi-threaded tasks (Optional). NB: Unavailable on Windows machines |
| verbose | Display verbose information (Optional, boolean) |
Named list of matched regional data
[mrpipeline::do_coloc()]
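Extracting a region around a SNP amounts to turning a chromosome, a position and a window into a chr:pos1-pos2 string. The sketch below shows one way to build that string; the helper name is hypothetical, and clamping the start at 0 is an assumption rather than documented behaviour.

```r
# Hypothetical helper: build a "chr:pos1-pos2" region of +/- window
# around each SNP position
region_string <- function(chr, pos, window = 5e5) {
  start <- pmax(pos - window, 0)  # assumption: never go below position 0
  end   <- pos + window
  # %.0f avoids scientific notation such as 5e+05 in the output
  sprintf("%s:%.0f-%.0f", chr, start, end)
}

region_string(chr = c("1", "X"), pos = c(1000000, 1000))
```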
Function to convert a file (or files) to gwasvcf format.
file_to_gwasvcf(file, chr_col, pos_col, nea_col, ea_col, snp_col = NULL, eaf_col = NULL, beta_col = NULL, se_col = NULL, pval_col = NULL, n = NULL, n_case = NULL, name = NULL, header = TRUE, sep = "\t", cores = 1, bcf_tools = NULL, verbose = TRUE)
| Argument | Description |
|---|---|
| file | Path to file |
| chr_col | Column name for chromosome |
| pos_col | Column name for position |
| nea_col | Column name for non-effect allele |
| ea_col | Column name for effect allele |
| snp_col | Column name for SNP (Optional) |
| eaf_col | Column name for effect allele frequency (Optional) |
| beta_col | Column name for beta (Optional) |
| se_col | Column name for standard error (Optional) |
| pval_col | Column name for P value (Optional). NB: P values will be saved as 10^-P |
| n | Sample size (Optional), can be int or column name |
| n_case | Number of cases (Optional), can be int or column name |
| name | Trait name (Optional), can be string or column name |
| header | Whether file has a header or not (Optional, boolean) |
| sep | File separator (Optional) |
| cores | Number of cores for multi-threaded tasks (Optional). NB: Unavailable on Windows machines |
| bcf_tools | Path to bcf_tools (Optional) |
| verbose | Display verbose information (Optional, boolean) |
gwasvcf object(s)
[dat_to_gwasvcf()] for converting data.frames
Creates a forest plot of MR estimates from do_mr().
Will plot both the Wald ratios for all SNPs which form the instrument and
inverse variance weighted method. However, if you wish for only the "discovery"
results to be plotted (i.e. WR for single-SNP instruments and only IVW for
multi-SNP instruments), then setting 'plot_all_res' to FALSE will achieve this.
If the plot is too crowded, subsetting the results before passing them to this plotter will help.
forest_plot(res, snp_col = "snp", beta_col = "b", se_col = "se", pval_col = NULL, or_col = "or", or_lci_col = "or_lci95", or_uci_col = "or_uci95", method_col = "method", exposure_col = "exposure", outcome_col = "outcome", plot_all_res = TRUE)
| Argument | Description |
|---|---|
| res | A data.frame of MR results |
| snp_col | Column name for SNPs (Optional) |
| beta_col | Column name for beta (Optional) |
| se_col | Column name for standard error (Optional) |
| pval_col | Column name for P value (Optional, unused for now) |
| or_col | Column name for odds ratio (Optional) |
| or_lci_col | Column name for lower CI of OR (Optional) |
| or_uci_col | Column name for upper CI of OR (Optional) |
| method_col | Column name which contains the MR method (Optional) |
| exposure_col | Column name for exposure names (Optional) |
| outcome_col | Column name for outcome names (Optional) |
| plot_all_res | For multi-SNP instruments, also plot the Wald ratios for all SNPs (Optional, boolean) |
Plot
[do_mr()]
Get column names from agnostic but formatted data.frame
get_col_name(df, data)
| Argument | Description |
|---|---|
| df | Data.frame of formatted data (exposure or outcome) |
| data | Name of column to find |
Name of column formatted for given data.frame
Wrapper function for harmonise_data function in the 'TwoSampleMR' package.
harmonise(exposure, outcome, action = 1, cores = 1, verbose = TRUE)
| Argument | Description |
|---|---|
| exposure | Data.frame of exposure dataset(s) |
| outcome | Data.frame of outcome dataset(s) |
| action | How to harmonise alleles; see harmonise_data. |
| cores | Number of cores for multi-threaded tasks (Optional). NB: Unavailable on Windows machines |
| verbose | Display verbose information (Optional, boolean) |
Harmonised data.frame
[TwoSampleMR::harmonise_data()]
Check if SNP frequencies are ambiguous
is_ambiguous(freq, tol = 0.08)
| Argument | Description |
|---|---|
| freq | Frequency |
| tol | Tolerance around 0.5 (Optional) |
True/False if ambiguous
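A frequency is treated as ambiguous when it lies close to 0.5, where the effect allele cannot be inferred from frequency alone. The sketch below is one plausible reading of that check; whether the package uses a strict or non-strict comparison is not stated, and the helper name is hypothetical.

```r
# Hypothetical helper: TRUE when freq lies within tol of 0.5
is_ambiguous_sketch <- function(freq, tol = 0.08) {
  abs(freq - 0.5) < tol
}

is_ambiguous_sketch(c(0.50, 0.10, 0.57))
```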
Check if SNP is palindromic
is_palindromic(A1, A2)
| Argument | Description |
|---|---|
| A1 | Allele 1 |
| A2 | Allele 2 |
True/False if palindromic
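A SNP is palindromic when its two alleles are complementary bases (A/T or C/G), so it reads the same on both strands. A minimal base R sketch of that test follows; the helper name is hypothetical.

```r
# Hypothetical helper: TRUE for A/T, T/A, C/G and G/C allele pairs
is_palindromic_sketch <- function(A1, A2) {
  pair <- paste0(toupper(A1), toupper(A2))
  pair %in% c("AT", "TA", "CG", "GC")
}

is_palindromic_sketch(A1 = c("A", "a", "C", "G"), A2 = c("T", "G", "G", "T"))
```

Palindromic SNPs with a frequency near 0.5 are the ones that usually cannot be aligned safely during harmonisation.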
Entry point for the pipeline.
mr_pipeline(ids1, ids2, out_name = "", config_file = "", ...)
| Argument | Description |
|---|---|
| ids1 | IDs or filenames for summary statistics |
| ids2 | IDs or filenames for summary statistics |
| out_name | Name of the analysis given to the markdown report, defaults to time and date |
| config_file | Path to config.yml file. Defaults to file that comes with the package. Please see that file for more details. |
| ... | Other arguments for plotting, NOT YET IMPLEMENTED |
list of results for debugging
In this function, an exposure-outcome pair is harmonised, analyses are run on those data and the results are saved to a file. The analyses run can be MR or colocalisation, as desired.
pairwise_analysis(exposure, outcome, res_path, ..., do_coloc = FALSE, cores = 1, verbose = TRUE)
| Argument | Description |
|---|---|
| exposure | Data.frame of exposure dataset(s) |
| outcome | Data.frame of outcome dataset(s) |
| res_path | Path to save result files |
| ... | Other arguments for the following functions: harmonise(), do_mr(), do_coloc() |
| do_coloc | True/False run colocalisation analyses |
| cores | Number of cores for multi-threaded tasks (Optional). NB: Unavailable on Windows machines |
| verbose | Display verbose information (Optional, boolean) |
Reads a dataset (or datasets) as exposure. Accepts either OpenGWAS IDs or file paths; files must be in gwasvcf format. This function can clump data locally if supplied with the 'plink' and 'bfile' arguments. If these are not supplied, clumping will take place on the OpenGWAS servers, and only for OpenGWAS IDs. If you would like to clump local files, please provide paths to Plink and bfiles.
read_exposure(ids, pval = 5e-08, plink = NULL, bfile = NULL, clump_r2 = 0.01, clump_kb = 10000, pop = "EUR", cores = 1, verbose = TRUE)
| Argument | Description |
|---|---|
| ids | List of OpenGWAS IDs or file paths (to gwasvcf files) |
| pval | Threshold to extract SNPs (Optional) |
| plink | Path to Plink binary (Optional) |
| bfile | Path to Plink .bed/.bim/.fam files (Optional) |
| clump_r2 | r2 threshold for clumping SNPs (Optional) |
| clump_kb | Distance outside of which SNPs are considered in linkage equilibrium (Optional) |
| pop | Population (Optional, used only for clumping on OpenGWAS) |
| cores | Number of cores for multi-threaded tasks (Optional). NB: Unavailable on Windows machines |
| verbose | Display verbose information (Optional, boolean) |
Data.frame of exposure datasets
Reads a dataset (or datasets) as outcome. Accepts either OpenGWAS IDs or file paths; files must be in gwasvcf format. This function can search for proxy SNPs locally if supplied with the 'plink' and 'bfile' arguments. If these are not supplied, proxy searching will take place on the OpenGWAS servers, and only for OpenGWAS IDs. If you would like to search for proxies locally, please provide paths to Plink and bfiles.
read_outcome(ids, rsids, proxies = TRUE, proxy_rsq = 0.8, proxy_kb = 5000, proxy_nsnp = 5000, plink = NULL, bfile = NULL, cores = 1, cores_proxy = 1, verbose = TRUE)
| Argument | Description |
|---|---|
| ids | List of OpenGWAS IDs or file paths (to gwasvcf files) |
| rsids | List of SNP rsIDs to extract |
| proxies | Whether to search for proxies (Optional, boolean) |
| proxy_rsq | R2 threshold to use when searching for proxies (Optional) |
| proxy_kb | kb threshold to use when searching for proxies (Optional) |
| proxy_nsnp | Number of SNPs when searching for proxies (Optional) |
| plink | Path to Plink binary (Optional) |
| bfile | Path to Plink .bed/.bim/.fam files (Optional) |
| cores | Number of cores for multi-threaded tasks (Optional). NB: Unavailable on Windows machines |
| cores_proxy | Number of cores for multi-threaded proxy searching (Optional). NB: Unavailable on Windows machines. NB: Should not be more than the 'cores' argument! |
| verbose | Display verbose information (Optional, boolean) |
Data.frame of outcome datasets
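Putting the reading, harmonising and MR steps together, a manual run might look like the sketch below. It uses only the function signatures documented in this reference, but it is untested against the package: it assumes the functions are exported, that the exposure data.frame stores rsIDs in a column named "SNP", and it needs network access to OpenGWAS. The wrapper run_mr_sketch is hypothetical, and the IDs in the closing comment are only examples.

```r
# Sketch only: requires the mrpipeline package and access to OpenGWAS.
run_mr_sketch <- function(exposure_id, outcome_id) {
  if (!requireNamespace("mrpipeline", quietly = TRUE)) {
    stop("The mrpipeline package is required for this sketch")
  }
  # Extract and clump genome-wide significant SNPs for the exposure
  exposure <- mrpipeline::read_exposure(exposure_id, pval = 5e-08)
  # Assumption: rsIDs are stored in a column named "SNP"
  outcome <- mrpipeline::read_outcome(outcome_id, rsids = exposure$SNP)
  # Align alleles between the two datasets, then run MR
  dat <- mrpipeline::harmonise(exposure, outcome)
  mrpipeline::do_mr(dat, f_cutoff = 10)
}

# Example call (not run here): run_mr_sketch("ieu-a-2", "ieu-a-7")
```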
Plots a regional plot of the area being tested for colocalisation
regional_plot(dat, exposure, outcome, bfile = NULL, plink = NULL, verbose = TRUE)
| Argument | Description |
|---|---|
| dat | A data.frame of harmonised data |
| exposure | Character, name of exposure |
| outcome | Character, name of outcome |
| bfile | Path to Plink bed/bim/fam files |
| plink | Path to Plink binary |
| verbose | Print messages or not |
Validates parameters in config file
validate_config(conf)
| Argument | Description |
|---|---|
| conf | config::config file of parameters |
Validated config class
Creates a volcano plot of Wald ratios from [do_mr()]. This function takes all of the Wald ratios in the given data.frame and plots them. If the plot is too crowded, subsetting the results before passing them to this plotter will help. If 'plotly' is installed, the plot will be returned as an interactive plot.
volcano_plot(res, label = "outcome", snp_col = "snp", beta_col = "b", pval_col = "pval", or_col = "or", or_lci_col = "or_lci95", or_uci_col = "or_uci95", method_col = "method", force_static = FALSE)
| Argument | Description |
|---|---|
| res | A data.frame of MR results |
| label | Column whose values will be used to group results by (Optional) |
| snp_col | Column name for SNPs (Optional) |
| beta_col | Column name for beta (Optional) |
| pval_col | Column name for P value (Optional) |
| or_col | Column name for odds ratio (Optional) |
| or_lci_col | Column name for lower CI of OR (Optional) |
| or_uci_col | Column name for upper CI of OR (Optional) |
| method_col | Column name which contains the MR method (Optional) |
| force_static | True for forcing the plot to be returned as a static plot (Optional) |
Plot
If 'plotly' is installed, the plot will be returned as an interactive plot.
z_comparison_plot(dat1, dat2, z_col = "z", p_col = "pvalues", verbose = TRUE)
| Argument | Description |
|---|---|
| dat1 | A list of data |
| dat2 | A list of data |
| z_col | Column name for Z scores (Optional) |
| verbose | Display verbose information (Optional, boolean) |
| force_static | True for forcing the plot to be returned as a static plot (Optional) |
Plot
Creates a Z-score plot for SNPs, where Z = beta / SE.
This should follow a parabolic shape and so can be used to find
SNPs which may not follow this shape.
If 'plotly' is installed, the plot will be returned as an interactive plot.
z_plot(dat, snp_col = "SNP", beta_col = "beta.exposure", se_col = "se.exposure", pval_col = "pval.exposure", force_static = FALSE)
| Argument | Description |
|---|---|
| dat | A data.frame of data |
| snp_col | Column of SNP names (Optional) |
| beta_col | Column of MR beta estimates (Optional) |
| se_col | Column of standard errors for the beta estimates (Optional) |
| pval_col | Column of P values (Optional) |
| force_static | True for forcing the plot to be returned as a static plot (Optional) |
Plot
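The expected shape of such a plot can be seen by computing Z-scores and the P values they imply. The sketch below assumes Z = beta / SE and a two-sided normal test, which are standard definitions but are not spelled out in the source; the data are made up and z_sketch is a hypothetical name.

```r
# Hypothetical helper: Z-score from an effect estimate and its standard error
z_sketch <- function(beta, se) {
  beta / se
}

# Made-up effect estimates sharing the same standard error
z <- z_sketch(beta = c(0.10, -0.05, 0.02), se = c(0.02, 0.02, 0.02))
p <- 2 * pnorm(-abs(z))  # two-sided P value implied by each Z-score

# -log10(P) grows roughly quadratically in Z, hence the parabola-like shape
data.frame(z = z, neg_log10_p = -log10(p))
```

A SNP whose reported P value sits far from the value implied by its beta and SE would fall off this curve and may deserve a closer look.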