Package 'gpmapr' reference manual

Title:	Query the OpenGWAS genotype-phenotype map
Description:	This package is a simple wrapper around the OpenGWAS genotype-phenotype map API.
Authors:	Gibran Hemani [aut, cre] (ORCID: <https://orcid.org/0000-0003-0920-1055>)
Maintainer:	Gibran Hemani <[email protected]>
License:	MIT + file LICENSE
Version:	0.0.1.0
Built:	2026-06-22 19:44:53 UTC
Source:	https://github.com/MRCIEU/gpmapr

All genes

Description

Get all genes from the API

Usage

all_genes()

Value

A dataframe containing all genes with the following columns:

id: the id of the gene
gene: the name of the gene
description: the description of the gene
gene_biotype: the gene biotype
chr: the chromosome of the gene
start: the start position of the gene
stop: the end position of the gene
strand: the strand of the gene
source: the source of the gene
distinct_trait_categories: the number of trait categories that the gene is associated with via coloc groups
distinct_protein_coding_genes: the number of distinct protein-coding genes that the gene is associated with via coloc groups
num_study_extractions: the number of study extractions for this gene
num_coloc_groups: the number of coloc groups for this gene
num_coloc_studies: the number of studies that have coloc results for this gene
num_rare_groups: the number of rare groups for this gene
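
As a minimal sketch of working with this dataframe, the example below builds a small mock table with a subset of the documented columns and keeps protein-coding genes that appear in at least one coloc group. The values are invented for illustration, and the biotype label "protein_coding" is an assumption about how the API encodes biotypes. With the package installed and network access, genes <- all_genes() would take the place of the mock table.

```r
# Mock stand-in for the result of all_genes(); values are illustrative only.
genes <- data.frame(
  id = c(1, 2, 3),
  gene = c("GENE_A", "GENE_B", "GENE_C"),
  gene_biotype = c("protein_coding", "lncRNA", "protein_coding"),
  chr = c(1, 1, 2),
  num_coloc_groups = c(4, 2, 0),
  distinct_trait_categories = c(3, 1, 0),
  stringsAsFactors = FALSE
)

# Keep protein-coding genes with at least one coloc group.
keep <- genes$gene_biotype == "protein_coding" & genes$num_coloc_groups > 0
selected <- genes[keep, ]

# Order by the number of distinct trait categories, largest first.
selected <- selected[order(-selected$distinct_trait_categories), ]
print(selected$gene)
```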

All traits

Description

Get all traits from the API

Usage

all_traits()

Value

A dataframe containing all traits with the following columns:

id: the id of the trait
data_type: the data type of the trait
trait: the internal string id of the trait
trait_name: the name of the trait
trait_category: the trait category of the trait
variant_type: the type of variant
sample_size: the sample size of the trait
category: the category of the trait (continuous, categorical)
ancestry: the ancestry of the trait
heritability: the LDSC heritability score of the trait
heritability_se: the standard error of the LDSC heritability score of the trait
num_study_extractions: the number of study extractions for this trait
num_coloc_groups: the number of coloc groups for this trait
num_coloc_studies: the number of studies that have coloc results for this trait
num_rare_results: the number of rare results for this trait
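
The heritability and heritability_se columns can be combined into a rough z-score to screen traits. The sketch below does this on a mock table; the values and the two cut-offs (a sample size of at least 100,000 and a z-score above 4) are arbitrary choices for illustration, not recommendations from the package. In real use, traits <- all_traits() would replace the mock table.

```r
# Mock stand-in for the result of all_traits(); values are illustrative only.
traits <- data.frame(
  id = c(10, 11, 12),
  trait_name = c("Trait A", "Trait B", "Trait C"),
  sample_size = c(250000, 8000, 120000),
  heritability = c(0.20, 0.05, 0.01),
  heritability_se = c(0.02, 0.04, 0.02),
  stringsAsFactors = FALSE
)

# Heritability estimate divided by its standard error.
traits$h2_z <- traits$heritability / traits$heritability_se

# Keep large studies whose heritability estimate is well separated from zero.
well_powered <- traits[traits$sample_size >= 1e5 & traits$h2_z > 4, ]
print(well_powered$trait_name)
```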

Get Associations by SNP ID and Study ID

Description

Get associations from the API by SNP id and study id

Usage

associations(variant_ids, study_ids)

Arguments

variant_ids

A vector of numeric values specifying the SNP IDs

study_ids

A vector of numeric values specifying the Study IDs

Value

A dataframe containing the associations

associations_dataframe

The associations dataframe contains information about which studies have association results. It has the following columns:

variant_id: the id of the SNP associated with this association
study_id: the id of the study associated with this association
beta: the beta value of the association
se: the standard error of the association
p: the p-value of the association
eaf: the effect allele frequency of the association
imputed: whether the association is imputed
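
The beta and se columns are enough to recompute a test statistic. The sketch below derives a z-score and a two-sided p-value from a mock associations table, assuming the reported p-values are two-sided Wald tests (the manual does not say how p is computed). The values are invented, and the 5e-8 cut-off is the default p_value_threshold of upload_gwas, reused here as a conventional genome-wide threshold.

```r
# Mock stand-in for the result of associations(); values are illustrative only.
assoc <- data.frame(
  variant_id = c(101, 102),
  study_id = c(5, 5),
  beta = c(0.12, -0.02),
  se = c(0.02, 0.02)
)

# Wald z-score and the matching two-sided normal p-value.
assoc$z <- assoc$beta / assoc$se
assoc$p_from_z <- 2 * pnorm(-abs(assoc$z))

# Flag associations that pass a genome-wide significance threshold.
assoc$genome_wide <- assoc$p_from_z < 5e-8
print(assoc[, c("variant_id", "z", "genome_wide")])
```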

Gene

Description

A collection of studies that are associated with a particular gene.

Usage

gene(
  gene_id,
  include_associations = FALSE,
  include_coloc_pairs = FALSE,
  include_trans = TRUE,
  h4_threshold = 0.8
)

Arguments

gene_id

A numeric value specifying the gene id

include_associations

A logical value specifying whether to include associations (BETA, SE, P), defaults to FALSE

include_coloc_pairs

A logical value specifying whether to include coloc pairs, defaults to FALSE

include_trans

A logical value specifying whether to include trans genetic effects, defaults to TRUE

h4_threshold

A numeric value specifying the h4 threshold for coloc pairs, defaults to 0.8

Details

The dataframes returned by this function are described in the sections that follow Value.

Value

A list which contains the following elements:

gene: A list containing metadata about the gene, including region, and neighboring genes.
coloc_groups: a dataframe containing information about which studies have coloc results for this gene. See below for details.
study_extractions: a list of dataframes containing the study extractions for this gene. See below for details.
rare_results: (optional) a list of dataframes containing the rare results for this gene
coloc_pairs: (optional) a dataframe containing all pairwise coloc results for this gene.
variants: a dataframe containing the variants for each associated coloc group or rare group.

coloc_groups_dataframe

The coloc_groups dataframe contains information about which studies have coloc results. It has the following columns:

coloc_group_id: the unique id for this group of colocalised results
study_id: the id of the study
study_extraction_id: the id of the study extraction
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait
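
Because coloc_groups holds one row per study within a group, a per-group summary is often the first step. The sketch below counts distinct traits and takes the smallest min_p for each coloc_group_id, using a mock table with a subset of the documented columns; the values are invented for illustration.

```r
# Mock stand-in for the coloc_groups dataframe; values are illustrative only.
coloc_groups <- data.frame(
  coloc_group_id = c(1, 1, 1, 2, 2),
  trait_name = c("Trait A", "Trait B", "Trait A", "Trait C", "Trait D"),
  min_p = c(1e-12, 3e-9, 2e-10, 5e-20, 1e-8),
  stringsAsFactors = FALSE
)

# Number of distinct traits in each coloc group.
n_traits <- tapply(coloc_groups$trait_name, coloc_groups$coloc_group_id,
                   function(x) length(unique(x)))

# Smallest p-value seen in each coloc group.
best_p <- tapply(coloc_groups$min_p, coloc_groups$coloc_group_id, min)

summary_table <- data.frame(
  coloc_group_id = as.numeric(names(n_traits)),
  n_traits = as.vector(n_traits),
  best_p = as.vector(best_p)
)
print(summary_table)
```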

study_extractions_dataframe

The study_extractions dataframe contains information about each study extraction. It has the following columns:

id: the unique id for this study extraction
study_id: the id of the study associated with this study extraction
variant_id: the id of the SNP
snp: the SNP name
ld_block_id: the id of the LD block
unique_study_id: the unique id for this study
study: the study name
file: the file name
svg_file: the SVG file name
file_with_lbfs: the file name with lbfs
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait

rare_results_dataframe

The rare_results dataframe contains information about which studies have rare results. It has the following columns:

rare_result_group_id: the unique id for this rare result group
study_id: the id of the study associated with this rare result
study_extraction_id: the id of the study extraction associated with this rare result
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait
ld_block: the LD block of the SNP

coloc_pairs_dataframe

The coloc_pairs dataframe contains information about which studies have coloc pairs. It has the following columns:

study_extraction_a_id: the id of the first study extraction in this coloc pair
study_extraction_b_id: the id of the second study extraction in this coloc pair
ld_block_id: the id of the LD block
h3: the h3 value for this coloc pair
h4: the h4 value for this coloc pair
spurious: whether this coloc pair is spurious
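
The sketch below filters a mock coloc_pairs table locally. It assumes that spurious is a logical column and that the h4 threshold is inclusive, and the values are invented for illustration. In real use the h4_threshold argument presumably filters on the API side, so a local filter like this is mainly useful for applying a stricter cut-off afterwards.

```r
# Mock stand-in for the coloc_pairs dataframe; values are illustrative only.
coloc_pairs <- data.frame(
  study_extraction_a_id = c(1, 1, 2),
  study_extraction_b_id = c(2, 3, 3),
  ld_block_id = c(7, 7, 7),
  h3 = c(0.05, 0.60, 0.10),
  h4 = c(0.93, 0.35, 0.85),
  spurious = c(FALSE, FALSE, TRUE)
)

# Keep pairs with strong evidence for a shared signal that are not spurious.
h4_threshold <- 0.8
strong <- coloc_pairs[coloc_pairs$h4 >= h4_threshold & !coloc_pairs$spurious, ]
print(strong)
```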

variants_dataframe

The variants dataframe contains variant information that is pulled from the Variant Effect Predictor (VEP) database. It has the following columns, alongside many more columns from VEP:

id: the id of the SNP
gene_id: the id of the gene as predicted by VEP
gene: the gene name as predicted by VEP

Genes

Description

Get specific genes from the API. The API returns collapsed/combined data for all requested genes.

Usage

genes(
  gene_ids,
  include_associations = FALSE,
  include_coloc_pairs = FALSE,
  include_trans = TRUE,
  h4_threshold = 0.8
)

Arguments

gene_ids

A vector of gene ids (1 or more)

include_associations

A logical value specifying whether to include associations (BETA, SE, P), defaults to FALSE

include_coloc_pairs

A logical value specifying whether to include coloc pairs, defaults to FALSE

include_trans

A logical value specifying whether to include trans genetic effects, defaults to TRUE

h4_threshold

A numeric value specifying the h4 threshold for coloc pairs, defaults to 0.8

Details

The dataframes returned by this function are described in the sections that follow Value.

Value

A list which contains the following elements:

genes: gene metadata for the requested genes
coloc_groups: a dataframe containing information about which studies have coloc results for all genes
study_extractions: a dataframe containing the study extractions for all genes
rare_results: a dataframe containing the rare results for all genes

coloc_groups_dataframe

The coloc_groups dataframe contains information about which studies have coloc results. It has the following columns:

coloc_group_id: the unique id for this group of colocalised results
study_id: the id of the study
study_extraction_id: the id of the study extraction
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait

study_extractions_dataframe

The study_extractions dataframe contains information about each study extraction. It has the following columns:

id: the unique id for this study extraction
study_id: the id of the study associated with this study extraction
variant_id: the id of the SNP
snp: the SNP name
ld_block_id: the id of the LD block
unique_study_id: the unique id for this study
study: the study name
file: the file name
svg_file: the SVG file name
file_with_lbfs: the file name with lbfs
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait

rare_results_dataframe

The rare_results dataframe contains information about which studies have rare results. It has the following columns:

rare_result_group_id: the unique id for this rare result group
study_id: the id of the study associated with this rare result
study_extraction_id: the id of the study extraction associated with this rare result
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait
ld_block: the LD block of the SNP

coloc_pairs_dataframe

The coloc_pairs dataframe contains information about which studies have coloc pairs. It has the following columns:

study_extraction_a_id: the id of the first study extraction in this coloc pair
study_extraction_b_id: the id of the second study extraction in this coloc pair
ld_block_id: the id of the LD block
h3: the h3 value for this coloc pair
h4: the h4 value for this coloc pair
spurious: whether this coloc pair is spurious

Get All Gene Pleiotropies

Description

Get all gene pleiotropies from the API

Usage

get_all_gene_pleiotropies()

Value

A list containing the gene pleiotropy

gene_id: the id of the gene
gene: the name of the gene
distinct_trait_categories: the number of trait categories that the gene is associated with via coloc groups
distinct_protein_coding_genes: the number of distinct protein-coding genes that the gene is associated with via coloc groups

Get All SNP Pleiotropies

Description

Get all SNP pleiotropies from the API

Usage

get_all_variant_pleiotropies()

Value

A list containing the SNP pleiotropies

variant_id: the id of the SNP
display_snp: the name of the SNP
distinct_trait_categories: the number of trait categories that the SNP is associated with via coloc groups
distinct_protein_coding_genes: the number of distinct protein-coding genes that the SNP is associated with via coloc groups

Get a GWAS from the API

Description

Get a GWAS from the API

Usage

get_gwas(gwas_id, include_associations = FALSE, include_summary_stats = FALSE)

Arguments

gwas_id

The ID of the GWAS

include_associations

Whether to include associations

include_summary_stats

Whether to include summary statistics

Value

A list containing the GWAS information

LD Matrix

Description

Get LD matrix from the API by Variant ID

Usage

ld_matrix(variant_ids = c())

Arguments

variant_ids

A character vector specifying the variant IDs. Variant IDs can be SNP IDs or variant IDs.

Value

A list containing the LD matrix

ld_dataframe

The ld dataframe contains information about the LD matrix. It has the following columns:

lead_variant_id: the id of the lead SNP
proxy_variant_id: the id of the proxy SNP
ld_block_id: the id of the LD block
r: the LD correlation (r) between the lead and proxy SNPs
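
The columns above describe LD in long format, one row per pair of variants. As a sketch, the code below reshapes such a table into a square, symmetric matrix. It assumes each pair appears once and that a variant's correlation with itself is 1; neither is stated in this manual, and the values are invented for illustration.

```r
# Mock long-format LD table; values are illustrative only.
ld <- data.frame(
  lead_variant_id = c(1, 1, 2),
  proxy_variant_id = c(2, 3, 3),
  r = c(0.9, 0.2, -0.4)
)

# All variants that appear on either side of a pair.
ids <- sort(unique(c(ld$lead_variant_id, ld$proxy_variant_id)))

# Start from an identity matrix (assumed self-correlation of 1).
m <- diag(1, length(ids))
dimnames(m) <- list(as.character(ids), as.character(ids))

# Fill both triangles so the matrix is symmetric.
i <- match(ld$lead_variant_id, ids)
j <- match(ld$proxy_variant_id, ids)
m[cbind(i, j)] <- ld$r
m[cbind(j, i)] <- ld$r
print(m)
```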

LD Proxies

Description

Get LD proxies from the API by Variant ID

Usage

ld_proxies(variant_ids = c())

Arguments

variant_ids

A character vector specifying the variant IDs. Variant IDs can be SNP IDs or variant IDs.

Value

A list containing the LD proxies

ld_dataframe

The ld dataframe contains information about the LD proxies. It has the following columns:

lead_variant_id: the id of the lead SNP
proxy_variant_id: the id of the proxy SNP
ld_block_id: the id of the LD block
r: the LD correlation (r) between the lead and proxy SNPs
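
Proxy selection is usually expressed in terms of r squared rather than r. The sketch below squares the r column of a mock proxies table and keeps proxies at or above 0.8, the default rsquared_threshold used by search_gpmap. The values are invented for illustration. A strongly negative r still qualifies once squared.

```r
# Mock stand-in for the ld dataframe; values are illustrative only.
ld <- data.frame(
  lead_variant_id = c(1, 1, 1),
  proxy_variant_id = c(2, 3, 4),
  ld_block_id = c(9, 9, 9),
  r = c(0.95, -0.92, 0.50)
)

# Convert the signed correlation to r squared.
ld$rsq <- ld$r^2

# Keep proxies in strong LD with the lead variant.
proxies <- ld[ld$rsq >= 0.8, ]
print(proxies$proxy_variant_id)
```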

Region

Description

A collection of studies that are associated with a particular region.

Usage

region(
  region_id,
  include_associations = FALSE,
  include_coloc_pairs = FALSE,
  h4_threshold = 0.8
)

Arguments

region_id

A numeric value specifying the region id

include_associations

A logical value specifying whether to include associations (BETA, SE, P), defaults to FALSE

include_coloc_pairs

A logical value specifying whether to include coloc pairs, defaults to FALSE

h4_threshold

A numeric value specifying the h4 threshold for coloc pairs, defaults to 0.8

Details

The dataframes returned by this function are described in the sections that follow Value.

Value

A list which contains the following elements:

gene: A list containing metadata about the gene, including region, and neighboring genes.
coloc_groups: a dataframe containing information about which studies have coloc results for this region. See below for details.
study_extractions: a list of dataframes containing the study extractions for this region. See below for details.
rare_results: (optional) a list of dataframes containing the rare results for this region
coloc_pairs: (optional) a dataframe containing all pairwise coloc results for this region.
variants: a dataframe containing the variants for each associated coloc group or rare group.

coloc_groups_dataframe

The coloc_groups dataframe contains information about which studies have coloc results. It has the following columns:

coloc_group_id: the unique id for this group of colocalised results
study_id: the id of the study
study_extraction_id: the id of the study extraction
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait

genes_in_region_dataframe

The genes_in_region dataframe contains information about which genes are in a region. It has the following columns:

id: the id of the gene
ensembl_id: the ensembl id of the gene
gene: the name of the gene
description: the description of the gene
gene_biotype: the gene biotype
chr: the chromosome of the gene
start: the start position of the gene
stop: the stop position of the gene
strand: the strand of the gene
source: the source of the gene
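
The start and stop columns make it straightforward to relate a variant position to the genes in a region. The sketch below finds the genes that overlap a base pair position and the gene whose midpoint is closest to it. It uses a mock table, assumes the coordinates are inclusive and on the same build as the position, and all values are invented for illustration.

```r
# Mock stand-in for the genes_in_region dataframe; values are illustrative only.
genes_in_region <- data.frame(
  id = c(1, 2, 3),
  gene = c("GENE_A", "GENE_B", "GENE_C"),
  chr = c(1, 1, 1),
  start = c(1000, 5000, 9000),
  stop = c(3000, 7500, 9500),
  stringsAsFactors = FALSE
)

# Hypothetical variant position on the same chromosome.
bp <- 6000

# Genes whose span contains the position.
overlaps <- genes_in_region$start <= bp & genes_in_region$stop >= bp
overlapping_genes <- genes_in_region$gene[overlaps]

# Gene whose midpoint is closest to the position.
centre <- (genes_in_region$start + genes_in_region$stop) / 2
nearest <- genes_in_region$gene[which.min(abs(centre - bp))]
print(list(overlapping = overlapping_genes, nearest = nearest))
```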

study_extractions_dataframe

The study_extractions dataframe contains information about each study extraction. It has the following columns:

id: the unique id for this study extraction
study_id: the id of the study associated with this study extraction
variant_id: the id of the SNP
snp: the SNP name
ld_block_id: the id of the LD block
unique_study_id: the unique id for this study
study: the study name
file: the file name
svg_file: the SVG file name
file_with_lbfs: the file name with lbfs
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait

rare_results_dataframe

The rare_results dataframe contains information about which studies have rare results. It has the following columns:

rare_result_group_id: the unique id for this rare result group
study_id: the id of the study associated with this rare result
study_extraction_id: the id of the study extraction associated with this rare result
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait
ld_block: the LD block of the SNP

coloc_pairs_dataframe

The coloc_pairs dataframe contains information about which studies have coloc pairs. It has the following columns:

study_extraction_a_id: the id of the first study extraction in this coloc pair
study_extraction_b_id: the id of the second study extraction in this coloc pair
ld_block_id: the id of the LD block
h3: the h3 value for this coloc pair
h4: the h4 value for this coloc pair
spurious: whether this coloc pair is spurious

variants_dataframe

The variants dataframe contains variant information that is pulled from the Variant Effect Predictor (VEP) database. It has the following columns, alongside many more columns from VEP:

id: the id of the SNP
gene_id: the id of the gene as predicted by VEP
gene: the gene name as predicted by VEP

Search the Genotype-Phenotype Map

Description

Search the GP Map for Traits, Genes or Variants

Usage

search_gpmap(search_text, rsquared_threshold = 0.8)

Arguments

search_text

A character string specifying the search text

rsquared_threshold

A numeric value specifying the rsquared threshold for proxy variants, defaults to 0.8

Details

After calling search_gpmap, you can retrieve the data for a result by making the call described in the call column of the search results.

Value

A dataframe containing the search results with the following columns:

type: the type of the search result: "original_variant", "proxy_variant", "trait", "gene"
name: the name of the search result
type_id: the type_id of the search result. This is the internal id in which the data can be accessed.
call: the call to get the search result: "variant(type_id)", "trait(type_id)", "gene(type_id)"
info: a string containing information about the search result, which may include:
- Extractions: the number of extractions
- Colocalisation Groups: the number of colocalisation groups
- Colocalisation Studies: the number of colocalisation studies
- Rare Results: the number of rare results
- Rsquared: the rsquared of the proxy variant compared to the original variant
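
The call column names the function to use for each result. As a sketch of how a script might follow up on search results, the code below reads the function name from call and passes type_id to a matching handler. The handlers are stubs standing in for gene(), trait() and variant() so that the example runs without network access. The mock results, the name "rs0000" and the use of character type_id values are assumptions for illustration.

```r
# Mock stand-in for the result of search_gpmap(); values are illustrative only.
results <- data.frame(
  type = c("gene", "trait", "proxy_variant"),
  name = c("GENE_A", "Trait A", "rs0000"),
  type_id = c("12", "34", "56"),
  call = c("gene(type_id)", "trait(type_id)", "variant(type_id)"),
  stringsAsFactors = FALSE
)

# Stub handlers; in real use these would be gene, trait and variant.
handlers <- list(
  gene = function(id) paste("gene", id),
  trait = function(id) paste("trait", id),
  variant = function(id) paste("variant", id)
)

# Read the function name from the call string and invoke its handler.
follow_up <- function(row) {
  fn_name <- sub("\\(.*$", "", row$call)
  handlers[[fn_name]](row$type_id)
}

out <- vapply(seq_len(nrow(results)),
              function(i) follow_up(results[i, ]), character(1))
print(out)
```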

Trait

Description

A collection of studies that are associated with a particular phenotype. A trait will include a common study and occasionally a rare study. When trait_id is a GUID (from a GWAS upload), the upload result is fetched instead.

Usage

trait(
  trait_id,
  include_associations = FALSE,
  include_full_associations = FALSE,
  include_coloc_pairs = FALSE,
  h4_threshold = 0.8
)

Arguments

trait_id

A numeric value or GUID (from GWAS upload) specifying the trait id

include_associations

A logical value specifying whether to include associations (BETA, SE, P), defaults to FALSE

include_full_associations

A logical value specifying whether to include full trait associations from /associations-full, defaults to FALSE

include_coloc_pairs

A logical value specifying whether to include coloc pairs, defaults to FALSE

h4_threshold

A numeric value specifying the h4 threshold for coloc pairs, defaults to 0.8
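
Since trait_id accepts either a numeric id or a GUID, a calling script may want to tell the two apart before deciding what kind of result to expect. The helper below is hypothetical and not part of the package. It assumes GUIDs use the common 8-4-4-4-12 hexadecimal layout, which this manual does not confirm, and the sample GUID is made up.

```r
# Hypothetical helper: TRUE for strings in the 8-4-4-4-12 hexadecimal layout.
is_guid <- function(x) {
  pattern <- paste0(
    "^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-",
    "[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
  )
  grepl(pattern, as.character(x))
}

# A numeric-looking trait id and a made-up upload GUID.
ids <- c("42", "123e4567-e89b-12d3-a456-426614174000")
guid_flags <- is_guid(ids)
print(guid_flags)
```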

Details

The dataframes returned by this function are described in the sections that follow Value.

Value

A list which contains the following elements:

trait: A list containing metadata about the trait, including common and rare studies associated with the trait
coloc_groups: a dataframe containing information about which studies have coloc results for this trait. See below for details.
study_extractions: a list of dataframes containing the study extractions for this trait. See below for details.
rare_results: (optional) a list of dataframes containing the rare results for this trait
coloc_pairs: (optional) a dataframe containing all pairwise coloc results for this trait.
full_associations: (optional) a dataframe of full trait associations from /associations-full.

coloc_groups_dataframe

The coloc_groups dataframe contains information about which studies have coloc results. It has the following columns:

coloc_group_id: the unique id for this group of colocalised results
study_id: the id of the study
study_extraction_id: the id of the study extraction
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait

study_extractions_dataframe

The study_extractions dataframe contains information about each study extraction. It has the following columns:

id: the unique id for this study extraction
study_id: the id of the study associated with this study extraction
variant_id: the id of the SNP
snp: the SNP name
ld_block_id: the id of the LD block
unique_study_id: the unique id for this study
study: the study name
file: the file name
svg_file: the SVG file name
file_with_lbfs: the file name with lbfs
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait

rare_results_dataframe

The rare_results dataframe contains information about which studies have rare results. It has the following columns:

rare_result_group_id: the unique id for this rare result group
study_id: the id of the study associated with this rare result
study_extraction_id: the id of the study extraction associated with this rare result
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait
ld_block: the LD block of the SNP

coloc_pairs_dataframe

The coloc_pairs dataframe contains information about which studies have coloc pairs. It has the following columns:

study_extraction_a_id: the id of the first study extraction in this coloc pair
study_extraction_b_id: the id of the second study extraction in this coloc pair
ld_block_id: the id of the LD block
h3: the h3 value for this coloc pair
h4: the h4 value for this coloc pair
spurious: whether this coloc pair is spurious

variants_dataframe

The variants dataframe contains variant information that is pulled from the Variant Effect Predictor (VEP) database. It has the following columns, alongside many more columns from VEP:

id: the id of the SNP
gene_id: the id of the gene as predicted by VEP
gene: the gene name as predicted by VEP

Traits

Description

Get specific traits from the API. The API returns collapsed/combined data for all requested traits. When a trait ID is a GUID (from GWAS upload), fetches the upload result instead.

Usage

traits(
  trait_ids,
  include_associations = FALSE,
  include_coloc_pairs = FALSE,
  h4_threshold = 0.8
)

Arguments

trait_ids

A vector of trait ids (numeric) or GUIDs (from GWAS upload)

include_associations

A logical value specifying whether to include associations (BETA, SE, P), defaults to FALSE

include_coloc_pairs

A logical value specifying whether to include coloc pairs, defaults to FALSE. Coloc pairs are fetched from a separate endpoint per trait.

h4_threshold

A numeric value specifying the h4 threshold for coloc pairs, defaults to 0.8

Details

The dataframes returned by this function are described in the sections that follow Value.

Value

A list which contains the following elements:

traits: trait metadata for the requested traits
coloc_groups: a dataframe containing information about which studies have coloc results for all traits. See below for details.
study_extractions: a dataframe containing the study extractions for all traits. See below for details.
rare_results: a dataframe containing the rare results for all traits
coloc_pairs: (optional) a dataframe containing all pairwise coloc results for all traits.

coloc_groups_dataframe

The coloc_groups dataframe contains information about which studies have coloc results. It has the following columns:

coloc_group_id: the unique id for this group of colocalised results
study_id: the id of the study
study_extraction_id: the id of the study extraction
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait

study_extractions_dataframe

The study_extractions dataframe contains information about each study extraction. It has the following columns:

id: the unique id for this study extraction
study_id: the id of the study associated with this study extraction
variant_id: the id of the SNP
snp: the SNP name
ld_block_id: the id of the LD block
unique_study_id: the unique id for this study
study: the study name
file: the file name
svg_file: the SVG file name
file_with_lbfs: the file name with lbfs
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait

rare_results_dataframe

The rare_results dataframe contains information about which studies have rare results. It has the following columns:

rare_result_group_id: the unique id for this rare result group
study_id: the id of the study associated with this rare result
study_extraction_id: the id of the study extraction associated with this rare result
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait
ld_block: the LD block of the SNP

coloc_pairs_dataframe

The coloc_pairs dataframe contains information about which studies have coloc pairs. It has the following columns:

study_extraction_a_id: the id of the first study extraction in this coloc pair
study_extraction_b_id: the id of the second study extraction in this coloc pair
ld_block_id: the id of the LD block
h3: the h3 value for this coloc pair
h4: the h4 value for this coloc pair
spurious: whether this coloc pair is spurious

Upload a GWAS to the API

Description

Upload a GWAS to the API

Usage

upload_gwas(
  file,
  name,
  p_value_threshold = 5e-08,
  column_names = list(),
  email = NA,
  category = "continuous",
  is_published = FALSE,
  doi = NA,
  should_be_added = FALSE,
  ancestry = "EUR",
  sample_size = NA,
  reference_build = "GRCh38",
  compare_with_upload_guids = NA
)

Arguments

file

The path to the GWAS file. The maximum file size is 1 GB.

name

The name of the GWAS

p_value_threshold

The p-value threshold for the GWAS

column_names

A list of column names in the format of: list(CHR = "chr", BP = "pos"...)

CHR: chromosome
BP: base pair position
P: p-value
EA: effect allele (allele 1)
OA: other allele (allele 2)
EAF: allele frequency

And either BETA and SE, or OR, LB, and UB:

BETA: beta
SE: standard error
OR: odds ratio
LB: lower bound of the confidence interval
UB: upper bound of the confidence interval

email

The email of the user

category

The category of the GWAS. Only "continuous" and "categorical" are accepted.

is_published

Whether the GWAS is published

doi

The DOI of the GWAS

should_be_added

Whether the GWAS should be added to the API

ancestry

The ancestry of the GWAS. Currently only "EUR" is accepted.

sample_size

The sample size of the GWAS

reference_build

The reference build of the GWAS. Only "GRCh37" and "GRCh38" are accepted.

compare_with_upload_guids

A vector of GUIDs of uploads to compare with

Value

A list containing the GWAS information
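
The column_names mapping is easy to get wrong, so it can help to check it locally before uploading. The following is a minimal sketch: check_column_names is a hypothetical helper (not part of gpmapr), it assumes that all of CHR, BP, P, EA, OA and EAF must be mapped, and the file column names on the right-hand side are invented for illustration.

```r
# Hypothetical helper (not part of gpmapr): checks a column_names mapping
# against the keys listed in the Arguments section.
check_column_names <- function(column_names) {
  # Assumption: all six of these keys are required
  required <- c("CHR", "BP", "P", "EA", "OA", "EAF")
  missing_keys <- setdiff(required, names(column_names))
  has_beta <- all(c("BETA", "SE") %in% names(column_names))
  has_or <- all(c("OR", "LB", "UB") %in% names(column_names))
  if (length(missing_keys) > 0) {
    stop("Missing column_names keys: ", paste(missing_keys, collapse = ", "))
  }
  if (!has_beta && !has_or) {
    stop("Provide either BETA and SE, or OR, LB, and UB")
  }
  invisible(TRUE)
}

# The values (file column names) are made up for illustration
cols <- list(
  CHR = "chr", BP = "pos", P = "pval",
  EA = "a1", OA = "a2", EAF = "freq",
  BETA = "b", SE = "se"
)
ok <- check_column_names(cols)

# The upload itself needs network access and a real file, so it is commented out.
# res <- gpmapr::upload_gwas(
#   file = "my_gwas.tsv.gz",  # hypothetical path
#   name = "My GWAS",
#   column_names = cols,
#   sample_size = 10000,      # hypothetical value
#   reference_build = "GRCh38"
# )
```

Running the check first means an incomplete mapping fails locally rather than after a file of up to 1 GB has been sent.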

Variant

Description

A collection of studies that are associated with a particular variant.

Usage

variant(
  variant_id,
  include_coloc_pairs = FALSE,
  h4_threshold = 0.8,
  include_summary_stats = FALSE
)

Arguments

variant_id

A character string specifying the SNP ID

include_coloc_pairs

A logical value specifying whether to include coloc pairs

h4_threshold

A numeric value specifying the cutoff for included coloc pairs, defaults to 0.8. Only used if include_coloc_pairs is TRUE.

include_summary_stats

A logical value specifying whether to include summary stats

Details

The dataframes returned by this function are described in the sections that follow Value.

Value

A list which contains the following elements:

variant: named list containing the variant information
coloc_groups: a dataframe containing information about which studies have coloc results for this variant
rare_results: a list of dataframes containing the rare variants
study_extractions: a list of dataframes containing the study extractions
summary_stats (optional): a list of dataframes containing the summary stats for each study, where the name of each element is the study_id. Column names are uppercase (e.g. SNP, BP, BETA, SE, LBF_1).
coloc_pairs (optional): a dataframe containing the coloc pairs for this variant, where study_extraction_a_id and study_extraction_b_id are the study extraction ids of the two studies in each pair. Only pairs that meet h4_threshold (default 0.8) are included.

coloc_groups_dataframe

The coloc_groups dataframe contains information about which studies have coloc results. It has the following columns:

coloc_group_id: the unique id for this group of colocalised results
study_id: the id of the study
study_extraction_id: the id of the study extraction
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait
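
As an illustration of how the coloc_groups columns can be used, the sketch below counts the distinct traits in each coloc group. It runs on a mock dataframe holding a subset of the documented columns; all values are invented and no API call is made.

```r
# Mock data with a subset of the documented coloc_groups columns (invented values)
coloc_groups <- data.frame(
  coloc_group_id = c(1, 1, 1, 2, 2),
  study_extraction_id = c(11, 12, 13, 21, 22),
  trait_name = c("Trait A", "Trait A", "Trait B", "Trait C", "Trait D"),
  stringsAsFactors = FALSE
)

# Number of distinct traits that colocalise in each group
traits_per_group <- tapply(
  coloc_groups$trait_name,
  coloc_groups$coloc_group_id,
  function(x) length(unique(x))
)
```

With a real result, the same call could be applied to the coloc_groups element of the list returned by variant().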

rare_results_dataframe

The rare_results dataframe contains information about rare variant results and the studies they come from. It has the following columns:

rare_result_group_id: the unique id for this rare result group
study_id: the id of the study associated with this rare result
study_extraction_id: the id of the study extraction associated with this rare result
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait
ld_block: the LD block of the SNP

study_extractions_dataframe

The study_extractions dataframe contains information about the study extractions and the studies they belong to. It has the following columns:

id: the unique id for this study extraction
study_id: the id of the study associated with this study extraction
variant_id: the id of the SNP
snp: the SNP name
ld_block_id: the id of the LD block
unique_study_id: the unique id for this study
study: the study name
file: the file name
svg_file: the SVG file name
file_with_lbfs: the file name with lbfs
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait
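
A common follow-up is to keep only the most significant extraction per study. The sketch below does this with min_p on a mock dataframe that has a subset of the documented columns; the values are invented.

```r
# Mock data with a subset of the documented study_extractions columns (invented values)
study_extractions <- data.frame(
  id = 1:4,
  study_id = c(100, 100, 200, 200),
  min_p = c(1e-9, 3e-12, 2e-8, 5e-10)
)

# Sort by p-value, then keep the first (smallest min_p) row for each study
ordered <- study_extractions[order(study_extractions$min_p), ]
top_per_study <- ordered[!duplicated(ordered$study_id), ]
```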

summary_statistics_dataframe

The summary_statistics dataframe contains the per-variant summary statistics for a study. Column names returned by the API are typically uppercase (SNP, CHR, BP, EA, OA, EAF, Z, BETA, SE, P, LBF_1, etc.), but may be upper or lower case depending on the source. It has the following columns:

SNP / variant_id: the id of the SNP
CHR / chr: the chromosome of the SNP
BP / bp: the base pair position of the SNP
EA / ea: the effect allele
OA / oa: the other allele
EAF / eaf: the estimated allele frequency
Z / z: the z-score
BETA / beta: the beta value
SE / se: the standard error
P / p: the p-value
imputed: whether the summary statistics are imputed
LBF_* / lbf_*: the fine-mapped log Bayes factors for each credible set. Credible sets are numbered from 1 to 10. If fine-mapping failed or returned only one credible set, the LBF_1 column is converted directly from the z-score.
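
Because column names may arrive in either case, downstream code may want to normalise them first. The following is a minimal sketch on mock data: standardise_names is a hypothetical helper (not part of gpmapr), and the z-score is recomputed as BETA / SE, the usual Wald statistic, which is an assumption about how Z relates to the other columns.

```r
# Mock summary statistics with lower-case names (invented values)
ss <- data.frame(
  variant_id = c("v1", "v2"),
  chr = c(1, 1),
  bp = c(1000, 2000),
  beta = c(0.10, -0.30),
  se = c(0.05, 0.10),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map names to the upper-case convention used by the API
standardise_names <- function(df) {
  names(df) <- toupper(names(df))
  names(df)[names(df) == "VARIANT_ID"] <- "SNP"
  df
}

ss <- standardise_names(ss)

# Assumption: Z is the Wald statistic BETA / SE
ss$Z <- ss$BETA / ss$SE
```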

coloc_pairs_dataframe

The coloc_pairs dataframe contains information about which studies have coloc pairs. It has the following columns:

study_extraction_a_id: the id of the first study extraction in this coloc pair
study_extraction_b_id: the id of the second study extraction in this coloc pair
ld_block_id: the id of the LD block
h3: the h3 value for this coloc pair
h4: the h4 value for this coloc pair
spurious: whether this coloc pair is spurious
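
The h4 and spurious columns can be combined to apply a stricter filter than the h4_threshold used in the request. The sketch below works on a mock dataframe with the documented columns; the values are invented, and it assumes spurious is returned as a logical column.

```r
# Mock coloc_pairs data with the documented columns (invented values)
coloc_pairs <- data.frame(
  study_extraction_a_id = c(1, 1, 2, 3),
  study_extraction_b_id = c(2, 3, 3, 4),
  ld_block_id = c(10, 10, 10, 11),
  h3 = c(0.05, 0.60, 0.10, 0.01),
  h4 = c(0.93, 0.30, 0.85, 0.99),
  spurious = c(FALSE, FALSE, TRUE, FALSE)  # assumed to be logical
)

# Keep pairs with strong evidence of colocalisation that are not flagged as spurious
strong_pairs <- coloc_pairs[coloc_pairs$h4 >= 0.9 & !coloc_pairs$spurious, ]
```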

Variants

Description

Get specific variants from the API. The API accepts variant identifiers (variant_ids, rsids, or strings), distinguishes between the identifier types automatically, and returns collapsed/combined data. A maximum of 10 variants can be requested when expand = TRUE.

Usage

variants(
  variants,
  expand = FALSE,
  include_associations = FALSE,
  include_coloc_pairs = FALSE,
  h4_threshold = 0.8
)

Arguments

variants

A vector of variant identifiers (variant_ids, rsids, or strings)

expand

Logical. FALSE (default) returns minimal data. TRUE returns the full VariantResponse (maximum of 10 variants).

include_associations

Logical. Whether to include associations (BETA, SE, P). Only used when expand = TRUE.

include_coloc_pairs

Logical. Whether to include coloc pairs. Only used when expand = TRUE.

h4_threshold

Numeric. H4 threshold for coloc pairs, defaults to 0.8

Details

The dataframes returned by this function are described in the sections that follow Value.

Value

A list which contains the following elements:

variants: a dataframe containing the variant information for all requested variants
coloc_groups: (if expanded) a dataframe containing the coloc groups for all variants
study_extractions: (if expanded) a dataframe containing the study extractions for all variants
rare_results: (if expanded) a dataframe containing the rare results for all variants
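
Since expand = TRUE accepts at most 10 variants, a longer list of identifiers has to be split into batches first. The following is a minimal sketch of that batching; the identifiers are invented, and the call to variants() is left commented out because it needs network access.

```r
# Hypothetical identifiers, for illustration only
ids <- paste0("rs", 1:23)

# Split into consecutive batches of at most 10 identifiers
batches <- split(ids, ceiling(seq_along(ids) / 10))

# Each batch could then be requested separately, for example:
# results <- lapply(batches, function(b) gpmapr::variants(b, expand = TRUE))
```

Each element of results would then be one list of the form described above, which the caller would need to combine.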

coloc_groups_dataframe

The coloc_groups dataframe contains information about which studies have coloc results. It has the following columns:

coloc_group_id: the unique id for this group of colocalised results
study_id: the id of the study
study_extraction_id: the id of the study extraction
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait

study_extractions_dataframe

The study_extractions dataframe contains information about the study extractions and the studies they belong to. It has the following columns:

id: the unique id for this study extraction
study_id: the id of the study associated with this study extraction
variant_id: the id of the SNP
snp: the SNP name
ld_block_id: the id of the LD block
unique_study_id: the unique id for this study
study: the study name
file: the file name
svg_file: the SVG file name
file_with_lbfs: the file name with lbfs
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
cis_trans: the cis/trans status of the SNP
ld_block: the LD block of the SNP
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait

rare_results_dataframe

The rare_results dataframe contains information about rare variant results and the studies they come from. It has the following columns:

rare_result_group_id: the unique id for this rare result group
study_id: the id of the study associated with this rare result
study_extraction_id: the id of the study extraction associated with this rare result
variant_id: the id of the SNP
ld_block_id: the id of the LD block
chr: the chromosome of the SNP
bp: the base pair position of the SNP
min_p: the minimum p-value related to the study_extraction_id
display_snp: the display SNP name
gene: the gene associated with the SNP
gene_id: the id of the gene
trait_id: the id of the trait
trait_name: the name of the trait
trait_category: the category of the trait
data_type: the data type of the trait
tissue: the tissue of the trait
ld_block: the LD block of the SNP

summary_statistics_dataframe

SNP / variant_id: the id of the SNP
CHR / chr: the chromosome of the SNP
BP / bp: the base pair position of the SNP
EA / ea: the effect allele
OA / oa: the other allele
EAF / eaf: the estimated allele frequency
Z / z: the z-score
BETA / beta: the beta value
SE / se: the standard error
P / p: the p-value
imputed: whether the summary statistics are imputed
LBF_* / lbf_*: the fine-mapped log Bayes factors for each credible set. Credible sets are numbered from 1 to 10. If fine-mapping failed or returned only one credible set, the LBF_1 column is converted directly from the z-score.

coloc_pairs_dataframe

The coloc_pairs dataframe contains information about which studies have coloc pairs. It has the following columns:

study_extraction_a_id: the id of the first study extraction in this coloc pair
study_extraction_b_id: the id of the second study extraction in this coloc pair
ld_block_id: the id of the LD block
h3: the h3 value for this coloc pair
h4: the h4 value for this coloc pair
spurious: whether this coloc pair is spurious

Package 'gpmapr'

Help Index

All genes

Description

Usage

Value

All traits

Description

Usage

Value

Get Associations by SNP ID and Study ID

Description

Usage

Arguments

Value

associations_dataframe

Gene

Description

Usage

Arguments

Details

Value

coloc_groups_dataframe

study_extractions_dataframe

rare_results_dataframe

coloc_pairs_dataframe

variants_dataframe

Genes

Description

Usage

Arguments

Details

Value

coloc_groups_dataframe

study_extractions_dataframe

rare_results_dataframe

coloc_pairs_dataframe

Get All Gene Pleiotropies

Description

Usage

Value

Get All SNP Pleiotropies

Description

Usage

Value

Get a GWAS from the API

Description

Usage

Arguments

Value

Get API Health

Description

Usage

Value

LD Matrix

Description

Usage

Arguments

Value

ld_dataframe

LD Proxies

Description

Usage

Arguments

Value

ld_dataframe

Region

Description

Usage

Arguments

Details

Value

coloc_groups_dataframe

genes_in_region_dataframe

study_extractions_dataframe

rare_results_dataframe

coloc_pairs_dataframe

variants_dataframe

Search the Genotype-Phenotype Map

Description