qq.chisq {snpMatrix}        R Documentation

Quantile-quantile plot for chi-squared tests


Description

This function plots ranked observed chi-squared test statistics against the corresponding expected order statistics. It also estimates an inflation (or deflation) factor, lambda, by the ratio of the trimmed means of observed and expected values. This is useful for inspecting the results of whole-genome association studies for overdispersion due to population substructure and other sources of bias or confounding.


Usage

qq.chisq(x, df=1, x.max, main="QQ plot",
    sub=paste("Expected distribution: chi-squared (",df," df)", sep=""), 
    xlab="Expected", ylab="Observed",
    conc=c(0.025, 0.975), overdisp=FALSE, trim=0.5,  
    slope.one=FALSE, slope.lambda=FALSE, 
    thin=c(0.25,50), oor.pch=24, col.shade="gray", ...)


Arguments

x             A vector of observed chi-squared test values
df            The degrees of freedom for the tests
x.max         If present, truncate the observed value (Y) axis here
main          The main heading
sub           The subheading
xlab          x-axis label (default "Expected")
ylab          y-axis label (default "Observed")
conc          Lower and upper probability bounds for the concentration band for the plot. Set this to NA to suppress the band
overdisp      If TRUE, an overdispersion factor, lambda, will be estimated and used in calculating the concentration band
trim          Quantile point for trimmed mean calculations for estimation of lambda. Default is to trim at the median
slope.one     Is a line of slope one to be superimposed?
slope.lambda  Is a line of slope lambda to be superimposed?
thin          A pair of numbers indicating how points will be thinned before plotting (see Details). If NA, no thinning will be carried out
oor.pch       Observed values greater than x.max are plotted at x.max. This argument sets the plotting symbol to be used for out-of-range observations
col.shade     The colour with which the concentration band will be filled
...           Further graphical parameter settings to be passed to points()


Details

To reduce plotting time and the size of plot files, the smallest observed and expected points are thinned so that only a reduced number of (approximately equally spaced) points are plotted. The precise behaviour is controlled by the parameter thin, whose value should be a pair of numbers. The first number must lie between 0 and 1 and sets the proportion of the X axis over which thinning is to be applied. The second number should be an integer and sets the maximum number of points to be plotted in this section.
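The thinning rule can be illustrated with a minimal base R sketch. This is not the package's own code: the helper name thin_points is hypothetical, and the sketch assumes that the X axis runs from 0 to the largest expected value and that the expected values are already sorted in increasing order.

```r
# Hypothetical helper, not part of snpMatrix: thin the lower part of a QQ plot.
# Assumes `expected` is sorted ascending and matches `observed` in length.
thin_points <- function(expected, observed, thin = c(0.25, 50)) {
  all_points <- data.frame(expected = expected, observed = observed)
  if (anyNA(thin)) return(all_points)              # NA means no thinning
  cutoff <- thin[1] * max(expected)                # assumed X axis from 0 to max
  low <- which(expected <= cutoff)                 # section to be thinned
  if (length(low) <= thin[2]) return(all_points)   # nothing to gain
  # Keep the point at or just below each of thin[2] equally spaced X values
  grid <- seq(expected[1], cutoff, length.out = thin[2])
  keep_low <- unique(findInterval(grid, expected))
  keep <- c(keep_low, which(expected > cutoff))    # upper section kept in full
  all_points[keep, ]
}

n <- 10000
expected <- qchisq((1:n) / (n + 1), df = 1)
thinned <- thin_points(expected, expected)
```

With the default thin=c(0.25, 50), at most 50 points remain in the lowest quarter of the X axis, while every point in the upper tail, where departures from the null are usually of most interest, is retained.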

The "concentration band" for the plot is shown in grey. This region is defined by upper and lower probability bounds for each order statistic. The default is to use the 2.5% and 97.5% bounds. Note that this is not a simultaneous confidence region: the bounds are pointwise, so the probability that the plot strays outside the band at some point is greater than the 5% suggested by the nominal 95% coverage.
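One way to obtain pointwise bounds of this kind is sketched below. It relies on the standard result that the k-th smallest of n independent uniform variates follows a Beta(k, n-k+1) distribution. The helper name conc_band and the plotting positions k/(n+1) are assumptions made for illustration, and the package's own calculation may differ in detail.

```r
# Hypothetical helper, not part of snpMatrix: pointwise probability bounds
# for the k-th smallest of n independent chi-squared statistics.
conc_band <- function(n, df = 1, conc = c(0.025, 0.975)) {
  k <- seq_len(n)
  data.frame(
    expected = qchisq(k / (n + 1), df = df),                # assumed plotting position
    lower    = qchisq(qbeta(conc[1], k, n - k + 1), df = df),
    upper    = qchisq(qbeta(conc[2], k, n - k + 1), df = df)
  )
}

band <- conc_band(1000)
```

Each row gives the interval for a single order statistic only, which is why the band as a whole does not have simultaneous 95% coverage.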

When required, the dispersion factor, lambda, is estimated by the ratio of the observed trimmed mean to its expected value under the chi-squared assumption.
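A minimal sketch of a trimmed-mean estimate of this kind follows. The helper name estimate_lambda is hypothetical, and the sketch assumes that trimming keeps the smallest trim*N statistics and that expected values are taken at plotting positions k/(N+1); neither detail is stated in this page.

```r
# Hypothetical helper, not part of snpMatrix: ratio of trimmed means.
estimate_lambda <- function(x, df = 1, trim = 0.5) {
  obs <- sort(x, na.last = NA)                     # drop NAs, sort ascending
  n <- length(obs)
  expd <- qchisq((1:n) / (n + 1), df = df)         # assumed plotting positions
  n_keep <- floor(trim * n)                        # keep the lower part only
  if (n_keep < 1) stop("too few values to estimate lambda")
  mean(obs[1:n_keep]) / mean(expd[1:n_keep])
}

# Statistics inflated by exactly 20% relative to their expected values
x_inflated <- 1.2 * qchisq((1:1000) / 1001, df = 1)
lambda_hat <- estimate_lambda(x_inflated)
```

Restricting the means to the smaller statistics keeps the estimate from being driven by a few large values that may reflect true associations.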


Value

The function returns the number of tests, the number of values omitted from the plot (greater than x.max), and the estimated dispersion factor, lambda.


Note

All tests must have the same number of degrees of freedom. If this is not the case, I suggest transforming to p-values and then plotting -2log(p) as chi-squared on 2 df.
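The suggested transformation can be sketched as follows. The statistics and degrees of freedom are made-up illustrative values. The transformation works because -2log(p) follows a chi-squared distribution on 2 df when p is uniformly distributed.

```r
# Illustrative (made-up) test statistics with differing degrees of freedom
stat <- c(0.5, 3.2, 7.1, 1.4, 10.3)
df   <- c(1, 2, 1, 3, 2)

# Convert each statistic to an upper-tail p-value on its own df
p <- pchisq(stat, df = df, lower.tail = FALSE)

# Put all tests on a common chi-squared scale with 2 df
x2 <- -2 * log(p)

# The transformed values could then be plotted with, for example:
# qq.chisq(x2, df = 2)
```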


Author(s)

David Clayton <david.clayton@cimr.cam.ac.uk>


References

Devlin, B. and Roeder, K. (1999) Genomic control for association studies. Biometrics, 55:997-1004.

See Also

single.snp.tests, snp.lhs.tests, snp.rhs.tests


Examples

## See the example for the single.snp.tests() function
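For a quick look without genotype data, a call on simulated statistics might look like the sketch below. The simulated data and the 10% inflation are assumptions chosen for illustration, and the call is only attempted when the snpMatrix package is installed.

```r
## Simulated 1 df statistics inflated by roughly 10% (illustrative only)
set.seed(1)
x <- 1.1 * rchisq(5000, df = 1)

## Plot only if snpMatrix is available; arguments follow the Usage section
if (requireNamespace("snpMatrix", quietly = TRUE)) {
  res <- snpMatrix::qq.chisq(x, df = 1, overdisp = TRUE, slope.lambda = TRUE)
  print(res)
}
```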

[Package snpMatrix version 1.6.1 Index]