Title: | Structural equation modeling based on GWAS summary statistics |
---|---|
Description: | Later |
Authors: | Andrew Grotzinger, Matthijs van der Zee, Mijke Rhemtulla, Hill Ip, Michel Nivard, Elliot Tucker-Drob |
Maintainer: | Andrew Grotzinger <[email protected]> |
License: | GPL-3.0 |
Version: | 0.0.5 |
Built: | 2024-12-19 16:24:53 UTC |
Source: | https://github.com/GenomicSEM/GenomicSEM |
Function to expand the S and V matrices to include SNP effects for multivariate GWAS in GenomicSEM
addSNPs(covstruc, SNPs,SNPSE=FALSE,parallel=TRUE,cores=NULL,GC="standard", ...)
addSNPs(covstruc, SNPs,SNPSE=FALSE,parallel=TRUE,cores=NULL,GC="standard", ...)
covstruc |
Output from Genomic SEM 'ldsc' function |
SNPs |
Summary statistics file created using the 'sumstats' function |
SNPSE |
Whether the user wants to provide a different standard error (SE) of the SNP variance than the package default. The default is to use .0005 to reflect the fact that the SNP SE is assumed to be population fixed. |
parallel |
addSNPs automatically uses mclapply to create the S and V matrices in parallel. Sometimes running in parallel can cause memory issues within the computing cores. If this is the case, the parallel argument can be set to FALSE, and addSNPs will create the S and V matrices serially. |
cores |
addSNPs automatically uses mclapply to create the S and V matrices in parallel. If the user does not provide an argument to the cores option, then addSNPs will automatically use one less than the total number of cores available. |
GC |
Level of Genomic Control (GC) you want the function to use. The default is 'standard' which adjusts the univariate GWAS standard errors by multiplying them by the square root of the univariate LDSC intercept. Additional options include 'conserv' which corrects standard errors using the univariate LDSC intercept, and 'none' which does not correct the standard errors. |
The function expands the S and V matrices to include SNP effects. As many S and V matrices will be created as there are rows in the summary statistics file (i.e., one S and V matrix per SNP). The function returns a list with 3 named entries:
V_Full |
variance covariance matrix of the parameter estimates in S that includes an individual SNP effect |
S_Full |
genetic covaraiance matrix including individual SNP effect |
RS |
A list containing relevant genetic information (e.g., rsID, basepair, A1/A2) to be appended to the output from other functions (e.g., userGWAS) |
Function to run a common factor model based on output from multivariable LDSC
commonfactor(covstruc,estimation="DWLS", ...)
commonfactor(covstruc,estimation="DWLS", ...)
covstruc |
Output from the multivariable LDSC function of Genomic SEM |
estimation |
Options are either Diagonally Weighted Least Squares ("DWLS"; the default) or Maximum Likelihood ("ML") |
The function estimates a common factor model, along with model fit indices, using output from GenomicSEM LDSC. The function returns a list with 2 named entries
modelfit |
The model fit results (e.g., model chi-square, AIC, CFI) from the specified model. |
results |
Parameter estimates and sandwich corrected standard errors from the specified model. |
Function to obtain SNP effects on common factor along with index of SNP heterogeneity
commonfactorGWAS(covstruc=NULL,SNPs=NULL,estimation="DWLS",cores=NULL,toler=FALSE,SNPSE=FALSE,parallel=TRUE,GC="standard",MPI=FALSE,TWAS=FALSE,smooth_check=FALSE, ...)
commonfactorGWAS(covstruc=NULL,SNPs=NULL,estimation="DWLS",cores=NULL,toler=FALSE,SNPSE=FALSE,parallel=TRUE,GC="standard",MPI=FALSE,TWAS=FALSE,smooth_check=FALSE, ...)
covstruc |
Output from Genomic SEM 'ldsc' function |
SNPs |
Summary statistics file created using the 'sumstats' function |
estimation |
The estimation method to be used when running the factor model. The options are Diagonally Weighted Least Squares ("DWLS", this is the default) or Maximum Likelihood ("ML") |
cores |
The number of cores to use on your computer for parallel processing. If no number is provided, the default is to use one less core then is available on your computer |
toler |
The tolerance to use for matrix inversion. |
SNPSE |
Whether the user wants to provide a different standard error (SE) of the SNP variance than the package default. The default is to use .0005 to reflect the fact that the SNP SE is assumed to be population fixed. |
parallel |
Whether the function should run using parallel or serial processing. Default = TRUE |
GC |
Level of Genomic Control (GC) you want the function to use. The default is 'standard' which adjusts the univariate GWAS standard errors by multiplying them by the square root of the univariate LDSC intercept. Additional options include 'conserv' which corrects standard errors using the univariate LDSC intercept, and 'none' which does not correct the standard errors. |
MPI |
Whether the function should use multi-node processing (i.e., MPI). Please note that this should only be used on a computing cluster on which the R package Rmpi is already installed. |
TWAS |
Whether the function is being used to estimate a multivariate TWAS using read_fusion output for the SNPs argument. |
smooth_check |
Whether the function should save the consequent largest Zstatistic difference between the pre and post-smooth matrices. |
The function outputs a series of SNP effects with their SEs and estimate of QSNP (the heterogeneity index). The output is a single object.
Function to take output from multivariable S-LDSC and estimate enrichment of model parameter for user specified model
enrich(s_covstruc, model = "",params,fix= "regressions",std.lv=FALSE,rm_flank=TRUE,tau=FALSE,base=TRUE,toler=NULL,fixparam=NULL, ...)
enrich(s_covstruc, model = "",params,fix= "regressions",std.lv=FALSE,rm_flank=TRUE,tau=FALSE,base=TRUE,toler=NULL,fixparam=NULL, ...)
s_covstruc |
Output from the multivariable S-LDSC function of Genomic SEM (s_ldsc) |
model |
Model to be specified using lavaan notation |
params |
Parameters of interest to be examined for enrichment (e.g., factor variances). |
fix |
What components of the model should be fixed for follow-up enrichment models. Default = "regressions", which will fix all regression parameters. |
std.lv |
Optional argument to denote whether all latent variables are standardized using unit variance identification (default = FALSE) |
rm_flank |
Optional argument to denote whether flanking window annotations should automatically be removed from output (default = TRUE) |
tau |
Optional argument to denote whether the user wants to use the tau genetic covariance matrices, as opposed to the default zero-order matrices, for estimation of enrichment (default = FALSE) |
base |
Optional argument to denote whether the user wants to include the full model output from the genome-wide (i.e., baseline) matrix (default = TRUE) |
toler |
Optional argument to manually set tolerance for matrix inverison. |
fixparam |
Optional argument to manually fix paramters when estimating the model within annotations |
Function to take output from multivariable S-LDSC and estimate enrichment of model parameter for user specified model
Function to run HDL (https://github.com/zhenin/HDL) to compute the genetic covariance between a series of traits based on genome wide summary statistics obtained from GWAS. The results generate by this function are necessary and sufficient to facilitate the fit of structural equation models (or other multivariate models) to the genetic covariance matrix. HDL is more powerfull than LDSC but if the LD structure in the reference file mismatches the GWAS LD structure, LDSC seems to perfrom better, expescially for estiamtes of heritability. For medium samples (N > 50.000) with moderate SNP-h2 (snp h2 > 0.07) where the LD structure isnt similar we would recomend ldsc, especially for GWAS. If you have small GWAS ( N < 25.000) the extra power HDL provides is worth the downward bias in snp h2 estimates relative to ldsc.
hdl(traits,sample.prev,population.prev,LDpath,Nref,trait.names=NULL,method, ...)
hdl(traits,sample.prev,population.prev,LDpath,Nref,trait.names=NULL,method, ...)
traits |
A vector of strings which point to munged files for trait you want to include in a Genomic SEM model. the HDL function works with standard munged files |
sample.prev |
A vector of sample prevalences for dichotomous traits and |
population.prev |
A vector of population prevalences for dichotomous traits and |
LDpath |
String which contains the path to the folder in which the LD matrices used in the analysis are located. Expects LD matirices formated as required by the original HDL software. |
Nref |
Sample size of the reference file, default is 335265 |
trait.names |
A character vector specifying how the traits should be named in the genetic covariance (S) matrix. These variable names can subsequently be used in later steps for model specification. If no value is provided, the function will automatically name the variables using the generic from of V1-VX. |
method |
sting, either "piecewise" which estimates the heritability or genetic covariance locally in chunks across the genome and then sums these estimates, or "jackknife" which uses a genoem wide estiamte and uses a jackknife estimator for the variance of the parameter. defaults to "piecewise" the original HDL implementation is equal to "jackknife" |
The function returns a list with 3 named entries
S |
estimated genetic covariance matrix |
V |
variance covariance matrix of the parameter estimates in S |
I |
matrix containing the "cross trait intercepts", or the error covariance between traits induced by overlap, in terms of subjects, between the GWASes on which the analyses are based |
Ning, Z., Pawitan, Y. & Shen, X. High-definition likelihood inference of genetic correlations across human complex traits. Nat Genet (2020).
Function to run LD score regression (https://github.com/bulik/ldsc) to compute the genetic covariance between a series of traits based on genome wide summary statistics obtained from GWAS. The results generate by this function are necessary and sufficient to facilitate the fit of structural equation models (or other multivariate models) to the genetic covariance matrix.
ldsc(traits,sample.prev,population.prev,ld,wld,trait.names=NULL, sep_weights = FALSE,chr=22,n.blocks=200,ldsc.log=NULL,stand=FALSE,select=FALSE, ...)
ldsc(traits,sample.prev,population.prev,ld,wld,trait.names=NULL, sep_weights = FALSE,chr=22,n.blocks=200,ldsc.log=NULL,stand=FALSE,select=FALSE, ...)
traits |
A vector of strings which point to LDSC munged files for trait you want to include in a Genomic SEM model. |
sample.prev |
A vector of sample prevalences for dichotomous traits and |
population.prev |
A vector of population prevalences for dichotomous traits and |
ld |
String which contains the path to the folder in which the LD scores used in the analysis are located. Expects LD scores formated as required by the original LD score regression software. |
wld |
String which contains the path to the folder in which the LD score weights used in the analysis are located. Expects LD scores formated as required by the original LD score regression software. |
trait.names |
A character vector specifying how the traits should be named in the genetic covariance (S) matrix. These variable names can subsequently be used in later steps for model specification. If no value is provided, the function will automatically name the variables using the generic from of V1-VX. |
sep_weights |
Logical which indicates wheter the weights are different form the LD scores used for the regression, defaults to FALSE. |
chr |
number of chromosomes over which the LDSC weights are split, defalts to 22 (Human) but can be switched for other species |
n.blocks |
Number of blocks to use for the jacknive procedure which is used to estiamte entries in V, higher values will be optimal if you have a large number of variables and also slower, defaults to 200 |
ldsc.log |
How you want to name your .log file for ldsc. The default is NULL as the package will automatically name the log based in the file names unless a log file name is provided to this arugment. |
stand |
Whether you want the package to also output a genetic correlation and sampling correlation matrix. Default is FALSE. |
select |
Whether you want the package to estimate LDSC using only even or odd chromosomes by setting select to "ODD" and "EVEN" respectively. It can also be set to a set of numbers, such as c(1,3,10), to run ldsc on a specific chromosome or chromosomes. Default is FALSE, in which case LDSC is estimated using all chromosomes. |
chisq.max |
Maximum value of the squared Z statistics for a SNP that is considered in the LD-score regression. Default behaviour is to set to the maximum of 80 and N*0.001 |
The function returns a list with 5 named entries
S |
estimated genetic covariance matrix |
V |
variance covariance matrix of the parameter estimates in S |
I |
matrix containing the "cross trait intercepts", or the error covariance between traits induced by overlap, in terms of subjects, between the GWASes on which the analyses are based |
N |
a vector contsaining the sample size (for the genetic variances) and the geometric mean of sample sizes (i.e. sqrt(N1,N2)) between two samples for the covariances |
m |
number of SNPs used to compute the LD scores with. |
Given a set of parameter values from Genomic SEM models, calculate the extent to which groups differ on these parameters.
localSRMD(unconstrained, constrained, lhsvar, rhsvar, ...)
localSRMD(unconstrained, constrained, lhsvar, rhsvar, ...)
unconstrained |
A vector of parameter values from an unconstrained structural equation model, where focal parameters are estimated freely in each group. |
constrained |
A vector of parameter values from a constrained structural equation model, where focal parameters are constrained to be equal across groups. |
lhsvar |
A list containing the variances, in each group, of the variables in the usermodel() results column lhs |
rhsvar |
A list containing the variances, in each group, of the variables in the usermodel() results column rhs |
The function returns the average standardized extent to which estimates from a constrained set of structural equation model parameters differ from those obtained when the same set of parameters are freely estimated.
Function to expand the S and V matrices to include multiple SNP effects in a single matrix, along with LD information across these SNPs.
multiSNP (covstruc, SNPs, LD, SNPSE = FALSE, SNPlist = NA, ...)
multiSNP (covstruc, SNPs, LD, SNPSE = FALSE, SNPlist = NA, ...)
covstruc |
Output from Genomic SEM multivariable LDSC |
SNPs |
Summary statistics file created using the sumstats function |
LD |
Matrix of LD information across the SNPs. If only independent SNPs are being provided a matrix of 0s can be entered. Note that the function requires that A1 and A2 be included in the LD matrix column names (e.g., rs12345_A_T) |
SNPSE |
User provided SE of the SNP variance for entry in the V matrix. If no number is provided the package defaults to using .0005 to reflect a practically fixed population value taken from a reference panel |
SNPlist |
List of rsIDs if the user wishes to subset out a set of SNPs from a full set of summary statistics |
The function expands the S and V matrices to include multiple SNP effects. These matrices include LD information across the SNPs.
Function to process GWAS summary statistis files and prepair them for LD score regression
munge(files,hm3,trait.names=NULL,N,info.filter = .9,maf.filter=0.01, column.names=list(),parallel=FALSE,cores=NULL,overwrite=TRUE ...)
munge(files,hm3,trait.names=NULL,N,info.filter = .9,maf.filter=0.01, column.names=list(),parallel=FALSE,cores=NULL,overwrite=TRUE ...)
files |
A vector of file names, files must be located in the working directory, or a path must be provided. |
hm3 |
A file of SNPs with A1, A2 and rsID used to allign alleles across traits. We suggest using an (UNZIPPED) file of HAPMAP3 SNPs with some basic cleaning applied (e.g., MHC region removed) that is supplied and created by the original LD score regression developers and available here: https://data.broadinstitute.org/alkesgroup/LDSCORE/w_hm3.snplist.bz2: |
trait.names |
A vector of trait names which will be used as names for the munged files |
N |
A vector of sample size |
info.filter |
Numeric value which is used as a lower bound for inputation quality (INFO) |
maf.filter |
Numeric value used as a lower bound for minor allel frequency |
column.names |
Optional list detailing which columns represent, SNP, MAF, etc. e.g. list(SNP=my_snp_column) |
parallel |
Indicates whether munge should process the summary statistics files in parallel or serial fashion. Default is TRUE, indicating that it will run in parallel. |
cores |
Indicates how many cores to use when running in parallel. The default is NULL, in which case munge will use 1 less than the total number of cores available in the local environment. |
overwrite |
Indicates whether existing .sumstats.gz files should be overwritten |
The function writes files of the ".sumstats" format, which can be used to estimate SNP heritability and genetic covariance using the ldsc() function. The function will also output a .log file that should be examined to ensure that column names are being interpret correctly.
Function to format TWAS summary statistics for T-SEM
read_fusion(files,trait.names=NULL,binary=NULL,N=NULL,perm=FALSE, ...)
read_fusion(files,trait.names=NULL,binary=NULL,N=NULL,perm=FALSE, ...)
files |
List of FUSION files. |
trait.names |
What names to use when naming the beta and SE columns for each trait |
binary |
Vector specifying whether each trait is binary (TRUE) or continuous (FALSE) |
N |
The sample size to use when backing out the betas and SEs from FUSION Z-statistics. This should reflect the sum of effective sample size for binary traits |
perm |
Whether you want to use the permutation test p-values from FUSION. Default = FALSE |
The function combines and formats FUSION TWAS summary statistics for subsequent T-SEM analyses. The function returns a single data.frame.
Function to run Stratified LD score regression.
s_ldsc(traits,sample.prev=NULL,population.prev=NULL,ld,wld,frq,trait.names=NULL,n.blocks=200,ldsc.log=NULL,exclude_cont=TRUE, ...)
s_ldsc(traits,sample.prev=NULL,population.prev=NULL,ld,wld,frq,trait.names=NULL,n.blocks=200,ldsc.log=NULL,exclude_cont=TRUE, ...)
traits |
A vector of file names which point to LDSC munged files for trait you want to include. |
sample.prev |
A vector of sample prevalences for dichotomous traits and |
population.prev |
A vector of population prevalences for dichotomous traits and |
ld |
A folder (or folders) of partitioned LD scores used as the independent variable in S-LDSC. |
wld |
A folder of non-partitioned LD scores used as regression weights. |
frq |
A folder of allele frequency files. |
trait.names |
A character vector specifying how the traits should be named. These variable names can subsequently be used in later steps for model specification. |
n.blocks |
Number of blocks to use for the jacknive procedure which is used to estiamte entries in V, higher values will be optimal if you have a large number of variables and also slower, defaults to 200. |
ldsc.log |
What to name the .log file if you want to overrride default to name file based on file names used as input. |
exclude_cont |
Whether to exclude continuous annotations from S-LDSC estimation. |
The function returns a list with 9 named entries
S |
The zero-order genetic covariance matrices for each annotation. |
V |
The zero-order sampling covariance matrices for each annotation. |
S_Tau |
The tau matrices for each annotation. |
V_Tau |
The tau sampling covariance matrices for each annotation. |
I |
matrix containing the "cross trait intercepts", or the error covariance between traits induced by overlap, in terms of subjects, between the GWASes on which the analyses are based |
N |
a vector contsaining the sample size (for the genetic variances) and the geometric mean of sample sizes (i.e. sqrt(N1,N2)) between two samples for the covariances |
m |
number of SNPs used to compute the LD scores with. |
Prop |
The proportional size of each annotation relative to the annotation containing all SNPs. |
Select |
A data.frame that codes flanking window and continuous annotations as 2 and all other annotations as 1. This is used by the 'enrich' function to exclude the flanking window and continuous annotations from enrichment estimates. |
Function to process GWAS summary statistics files and prepare them for a GWAS in genomicSEM
sumstats(files,ref,trait.names=NULL,se.logit,OLS=NULL,linprob=NULL,N=NULL,betas=NULL,info.filter = .6,maf.filter=0.01, keep.indel=FALSE,parallel=FALSE,cores=NULL,ambig=FALSE,direct.filter=FALSE, ...)
sumstats(files,ref,trait.names=NULL,se.logit,OLS=NULL,linprob=NULL,N=NULL,betas=NULL,info.filter = .6,maf.filter=0.01, keep.indel=FALSE,parallel=FALSE,cores=NULL,ambig=FALSE,direct.filter=FALSE, ...)
files |
a vector of file names, files must be located in the working directory, or a path must be provided. |
ref |
A reference file of SNPs to keep in your GWAS, one based on 1000 genomes phase 3 is provided. |
trait.names |
a vector of trait names which will be used as names for the munged files |
se.logit |
a logical vector indicating whether the standard errors in each set of summary statistics is on the logit scale |
OLS |
a logical vector indicating whether the GWAS was for a continuous trait and used OLS (or a LMM) |
linprob |
a logical vector indicating whether the GWAS is a binary outcome with only Z-statistics or was analyzed using a linear probability model i.e. a dichotomous trait using OLS (or a LMM) |
N |
A vector of total sample sizes for continuous traits and the sum of effective sample sizes for binary traits |
betas |
A vector of column names of betas for continuous traits that are known to have been standardized prior to running the GWAS |
N |
A vector of sample size |
info.filter |
Numeric value which is used as a lower bound for imputation quality (INFO) |
maf.filter |
Numeric value used as a lower bound for minor allele frequency |
keep.indel |
Indicates whether insertion-deletion mutations (indels) should be included in your summary statistics. The default is FALSE. |
parallel |
Indicates whether sumstats should process the summary statistics files in parallel or serial fashion. Default is TRUE, indicating that it will run in parallel. |
cores |
Indicates how many cores to use when running in parallel. The default is NULL, in which case sumstats will use 1 less than the total number of cores available in the local environment. |
ambig |
Indicates whether strand ambiguous SNPs should be removed from output. |
direct.filter |
Indicates whether SNPs that have missing information for more than half of contributing cohorts, as indicated by missing information in the direction column, should be removed. |
The function ensures the SNPs in each file are aligned to the same reference allele, it attempts to filter strand issues, it retains SNPs present in the reference file. The function can deal with GWAS of continous traits, dichotomous traits using logistic regression and even dichotomous traits using (misspecified) OLS regression or a mixed model. The function returns .log files that should be inspected to ensure that all column names were appropriately interpreted.
Function to obtain model estimates for a user-specified model across SNPs.
userGWAS(covstruc=NULL,SNPs=NULL,estimation="DWLS",model="",printwarn=TRUE,sub=FALSE,cores=NULL,toler=FALSE,SNPSE=FALSE, parallel=TRUE,GC="standard",MPI=FALSE,smooth_check=FALSE,TWAS=FALSE,std.lv=FALSE,fix_measurement=TRUE, ...)
userGWAS(covstruc=NULL,SNPs=NULL,estimation="DWLS",model="",printwarn=TRUE,sub=FALSE,cores=NULL,toler=FALSE,SNPSE=FALSE, parallel=TRUE,GC="standard",MPI=FALSE,smooth_check=FALSE,TWAS=FALSE,std.lv=FALSE,fix_measurement=TRUE, ...)
covstruc |
Output from Genomic SEM 'ldsc' function |
SNPs |
Summary statistics file created using the 'sumstats' function |
estimation |
The estimation method to be used when running the factor model. The options are Diagonally Weighted Least Squares ("DWLS", this is the default) or Maximum Likelihood ("ML") |
model |
The user-specified model to use in model estimation using lavaan syntax. The SNP is referred to as 'SNP' in the model. |
printwarn |
Whether you want warnings and errors printed for each run. This can take up significant space across all SNPs, but the default is set to TRUE as these warnings may not be safe to ignore. |
sub |
Whether you want to only output a piece of the model results (e.g., F1 ~ SNP). The argument takes a vector, as multiple pieces of the model result can be output. |
cores |
Indicates how many cores to use when running in parallel. The default is NULL, in which case sumstats will use 1 less than the total number of cores available in the local environment. |
toler |
The tolerance to use for matrix inversion. |
SNPSE |
Whether the user wants to provide a different standard error (SE) of the SNP variance than the package default. The default is to use .0005 to reflect the fact that the SNP SE is assumed to be population fixed. |
parallel |
Whether the function should run using parallel or serial processing. Default = TRUE |
GC |
Level of Genomic Control (GC) you want the function to use. The default is 'standard' which adjusts the univariate GWAS standard errors by multiplying them by the square root of the univariate LDSC intercept. Additional options include 'conserv' which corrects standard errors using the univariate LDSC intercept, and 'none' which does not correct the standard errors. |
MPI |
Whether the function should use multi-node processing (i.e., MPI). Please note that this should only be used on a computing cluster on which the R package Rmpi is already installed. |
smooth_check |
Whether the function should save the consequent largest Z-statistic difference between the pre and post-smooth matrices. |
TWAS |
Whether the function is being used to estimate a multivariate TWAS using read_fusion output for the SNPs argument. |
std.lv |
Optional argument to denote whether all latent variables are standardized using unit variance identification (default = FALSE) |
fix_measurement |
Optional argument to denote whether the measurement model should be fixed across all SNPs (default = TRUE) |
The function outputs results from the multivariate GWAS. If the sub argument is used, it will output as many list objects as there are sub objects requested. If the sub argument is FALSE (as is the package default), the function will ouput as many list objects as there are SNPs.
Function to run a user specified model based on output from multivariable LDSC
usermodel(covstruc,estimation="DWLS", model = "", CFIcalc=TRUE,std.lv=FALSE,imp_cov=FALSE,fix_resid=TRUE,toler=FALSE, ...)
usermodel(covstruc,estimation="DWLS", model = "", CFIcalc=TRUE,std.lv=FALSE,imp_cov=FALSE,fix_resid=TRUE,toler=FALSE, ...)
covstruc |
Output from the multivariable LDSC function of Genomic SEM |
estimation |
Options are either Diagonally Weighted Least Squares ("DWLS"; the default) or Maximum Likelihood ("ML") |
model |
Model to be specified using lavaan notation |
CFIcalc |
Optional argument to denote whether CFI is being requested (default = TRUE). In some cases the estimation of the independent (i.e., Null) model for calculation of CFI can be time consuming. If the funciton seems to be stuck on this step, we would suggest re-running with this option set to FALSE |
std.lv |
Optional argument to denote whether all latent variables are standardized using unit variance identification (default = FALSE) |
imp_cov |
Optional argument to denote whether the user wants the model implied and residual covariance matrix included in the usermodel output (default = FALSE) |
fix_resid |
Optional argument to denote whether the user wants Genomic SEM to try troubleshooting a model that does not converge by fixing residual variances to be above 0 (default = TRUE) |
toler |
Optional argument to set lower tolerance for matrix inversion used to produce sadnwich corrected standard errors. (default = FALSE) |
The function estimates a user-specified model, along with model fit indices, using output from GenomicSEM LDSC.
Function to automate writing model syntax based on EFA loadings. This is most likely to be useful when examining larger numbers of traits (e.g., > 10).
write.model(Loadings,S_LD,cutoff,fix_resid=TRUE,bifactor=FALSE,mustload=FALSE,common=FALSE, ...)
write.model(Loadings,S_LD,cutoff,fix_resid=TRUE,bifactor=FALSE,mustload=FALSE,common=FALSE, ...)
Loadings |
The matrix of EFA loadings. Note that the number of columns in this matrix determines how many factors are specifeid in the model. |
S_LD |
The LDSC genetic covariance matrix |
cutoff |
The EFA standardized loadings cutoff to determine which traits should load on a factor |
fix_resid |
Whether to apply constraint on all variables to keep residual variances above .001. Default is TRUE. |
bifactor |
Whether to specify a bifactor model in which a general factor predicts all included traits and the remaining factors are specifided to be orthogonal of one another. |
mustload |
Whether all variables should load on at least one factor, even if they dont meet the threshold specified using the cutoff argument. |
common |
Whether to specify a common factor model. |
The function outputs model syntax that can be used to run the model using the usermodel function in Genomic SEM.