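Conditional analysis of VCF files can be performed using GCTA’s COJO routine. The procedure implemented here first fine-maps each detected region with susieR, then runs a conditional analysis on the fine-mapped variants in each region.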
Ultimately, a list of results will be returned where every fine-mapped variant has a regional set of summary data that is conditionally independent of all neighbouring fine-mapped variants.
Setup:
vcffile <- "ieu-a-300.vcf.gz"
ldref <- "/Users/gh13047/repo/mr-base-api/app/ld_files/EUR"
gwasvcf::set_bcftools()
Perform susieR pipeline:
out <- susieR_pipeline(
  vcffile=vcffile,
  bfile=ldref,
  plink_bin=genetics.binaRies::get_plink_binary(),
  pop="EUR",
  threads=1,
  L=10,
  estimate_residual_variance=TRUE,
  estimate_prior_variance=TRUE,
  check_R=FALSE,
  z_ld_weight=1/500
)
Each detected region now has a fine-mapped object stored against it. You can inspect them, for example, like this:
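One way to look at what is stored for a region is to index into the result and print a one-level summary. The sketch below uses a small mock object, `out_mock`, that only imitates the `out$res[[i]]$susieR` nesting used elsewhere in this document. Its field values are invented for illustration, and the real objects are likely to contain more elements.

```r
# Mock stand-in for the pipeline result. Only the res -> susieR nesting is
# taken from this document; the contents are invented for illustration.
out_mock <- list(res = list(
  list(susieR = list(pip = c(rs1 = 0.9, rs2 = 0.1), fmset = "rs1")),
  list(susieR = list(pip = c(rs3 = 0.3, rs4 = 0.8), fmset = "rs4"))
))

# Number of detected regions
n_regions <- length(out_mock$res)

# Names of the elements stored in the first region's fine-mapping object
first_fields <- names(out_mock$res[[1]]$susieR)

# Compact one-level overview of the first region's object
str(out_mock$res[[1]]$susieR, max.level = 1)
```

With real output, the same calls would be applied to `out` in place of `out_mock`.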
For each region we can extract the variants with the highest posterior inclusion probability per credible set, e.g.:
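As a minimal sketch of that extraction, the code below picks the highest-PIP variant in each credible set. It assumes the fitted object exposes a named `pip` vector and a `sets$cs` list of index vectors, as susieR fits generally do. The `fit` object here is a hand-made mock with invented values, not output from the pipeline.

```r
# Mock stand-in for one region's susieR fit (assumed layout, invented values)
fit <- list(
  pip  = c(rs1 = 0.02, rs2 = 0.91, rs3 = 0.40, rs4 = 0.75, rs5 = 0.10),
  sets = list(cs = list(L1 = c(1L, 2L), L2 = c(3L, 4L, 5L)))
)

# For each credible set, keep the name of the variant with the largest PIP
top_per_cs <- vapply(
  fit$sets$cs,
  function(idx) names(fit$pip)[idx][which.max(fit$pip[idx])],
  character(1)
)

print(top_per_cs)
```

In this mock, credible set `L1` yields `rs2` and `L2` yields `rs4`. Applied to real output, `fit` would be replaced by a region's object such as `out$res[[1]]$susieR`.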
Now we can perform conditional analysis at each region using knowledge of the fine-mapped variants. The cojo_cond function takes the fine-mapped variants (passed as snplist) and performs the conditional analysis within each region.
The result is a list of regions, with a set of conditional summary stats for every fine-mapped variant in that region.
out2 <- cojo_cond(
  vcffile=vcffile,
  bfile=ldref,
  pop="EUR",
  snplist=unlist(sapply(out$res, function(x) x$susieR$fmset))
)
TODO
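The `snplist` argument in the call above flattens the per-region `fmset` entries into a single vector. The sketch below shows that flattening step on a mock object with invented variant names. It uses `lapply` rather than `sapply` so the intermediate result is always a list, and `unique` guards against a variant appearing in more than one region. Both choices are illustrative assumptions, not requirements of `cojo_cond`.

```r
# Mock stand-in for out$res: each region carries its fine-mapped set in
# susieR$fmset. Variant names are invented for illustration.
out_mock <- list(res = list(
  list(susieR = list(fmset = c("rs1", "rs7"))),
  list(susieR = list(fmset = "rs4")),
  list(susieR = list(fmset = character(0)))  # region with no fine-mapped variant
))

# Collect every region's fmset and flatten into one character vector
snplist <- unlist(
  lapply(out_mock$res, function(x) x$susieR$fmset),
  use.names = FALSE
)

# Drop duplicates in case the same variant is reported in two regions
snplist <- unique(snplist)

print(snplist)
```

Regions with an empty `fmset` contribute nothing to the flattened vector, so they need no special handling here.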