\name{hwe}
\alias{hwe}
\alias{print.hwe}
\alias{hwe.stats}
\alias{hwex.stats}
\alias{hwe.sim}
\alias{hwex.sim}
\alias{hwe.exact.bi}
\title{
  Hardy-Weinberg Equilibrium
}
\description{
Test whether genotype frequencies fit Hardy-Weinberg Equilibrium proportions.
Different statistical tests are provided, as well as an option to evaluate
statistical significance by simulations.
}
\usage{
hwe(object, n.sim=0, sex=NULL, seed.val=NULL)
}
\arguments{
\item{object}{
An object of class "locus". Computations for an  X-linked locus
(object with x.linked=TRUE) require that a sex vector be provided 
(see below).
}
\item{sex}{
A character vector of male and female codes, one for each person (the length
of sex must be the same as the number of pairs of alleles in the locus argument
"object" given above).  No sex argument is required for loci that have
x.linked=FALSE.  If x.linked=TRUE, an appropriate sex vector must be given or 
an error will result.
}
\item{n.sim }{
An integer indicating the number of simulations to run.  If n.sim <= 0,
no simulations will be run.
}
\item{seed.val }{
An integer or a saved copy of .Random.seed.  This allows simulations to be
reproduced by using the same initial seed.  If no value is given, the 
current copy of .Random.seed will be used. 
}
}
\value{
A list with the following components:

gof.stat:         Chi-square goodness of fit statistic.

gof.df:           Degrees of freedom for chi-square goodness of fit statistic
                  (see DETAILS).

gof.pval:         Asymptotic p-value of the chi-square goodness of fit
                  statistic.
                          
rare.stat:        Chi-square statistic for rare-homozygotes. 

rare.pval:        Asymptotic p-value of the chi-square statistic for 
                  rare-homozygotes. 

like.exact:       P-value for the exact test of HWE.  Computed only if the locus
                  has two distinct allele codes.  If there are more than two
                  allele codes, pval.exact is not computed, and is assigned a
                  value of NULL.  When there are more than two alleles,
                  pval.exact can be approximated by simulations (see
                  like.sim.pval below).

gof.sim.pval:     P-value of the simulated chi-square goodness of fit statistic.
                  This is computed only when there are more than two distinct 
                  alleles and n.sim > 0.

like.sim.pval:    P-value of the simulated conditional likelihood test, which 
                  conditions on the observed allele frequencies (simulation
                  approximation of the exact test).  This is computed only when
                  there are more than two distinct alleles and n.sim > 0.

rare.sim.pval:    P-value of the simulated chi-square statistic for rare 
                  homozygotes.  This is computed only when there are more than 
                  two distinct alleles and n.sim > 0.

n.sim:            The number of simulations that were requested.

x.linked:         A logical value indicating whether the locus object is X-linked.
}
\section{Side Effects}{
If seed.val is assigned a non-NULL argument, the value of .Random.seed in the 
user's .Data directory will be updated with this value.  The components of the
locus object are not modified.
}
\details{
The chi-square goodness of fit statistic compares the observed genotype counts 
with their expected values, where the expected values are computed assuming HWE
is true.  For autosomes (x.linked=F), the expected number of i/i homozygotes is 
N*p[i]^2, and the expected number of i/j heterozygotes is N*2*p[i]*p[j], where 
N is the number of subjects and p[i] is the frequency of the ith allele.  For 
X-linkage (x.linked=T), the expected values are computed separately for males 
and females, but using the pooled allele frequencies, and these expected values 
are compared with the observed sex-specific genotype counts by the chi-square 
goodness of fit statistic.  For large samples, this statistic has a chi-square 
distribution.  When x.linked=F, df = number of distinct heterozygotes (for K 
alleles, df = K(K-1)/2). When x.linked=T, df = K(K+1)/2 - 1, because we 
condition on the total number of females and males.  For small samples the 
p-values may not be accurate, so simulations should be performed. 
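
As an illustration, the autosomal goodness of fit computation can be sketched
in base R without a locus object.  This is not the code used by hwe; the
variable names are invented for the illustration, and the data are the a1 and
a2 vectors from the EXAMPLES section.  Under these assumptions the sketch gives
a statistic of about 7.611 on 6 degrees of freedom, in agreement with the
printed autosomal output.

```r
# Sketch only: base-R recomputation of the autosomal goodness of fit test.
# All object names below are illustrative and are not part of hwe().
a1 <- c(2, 1, 2, 1, 1, 3, 1, 3, 4, 2)
a2 <- c(4, 4, 4, 3, 3, 3, 1, 3, 4, 2)

alleles <- sort(unique(c(a1, a2)))
K <- length(alleles)   # number of distinct alleles
N <- length(a1)        # number of subjects

# Allele frequencies p[i], pooled over both allele vectors
p <- as.vector(table(factor(c(a1, a2), levels = alleles))) / (2 * N)

# Observed counts of unordered genotypes (upper triangle of a K x K table)
observed <- table(factor(pmin(a1, a2), levels = alleles),
                  factor(pmax(a1, a2), levels = alleles))

# Expected counts: N*p[i]^2 on the diagonal, N*2*p[i]*p[j] above it
expected <- 2 * N * outer(p, p)
diag(expected) <- N * p^2

keep     <- upper.tri(expected, diag = TRUE)
gof.stat <- sum((observed[keep] - expected[keep])^2 / expected[keep])
gof.df   <- K * (K - 1) / 2
gof.pval <- pchisq(gof.stat, df = gof.df, lower.tail = FALSE)
```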

When a locus is X-linked and there are no males, the test for HWE is performed 
among the females.  But in this case, the tests are equivalent to treating the 
locus as if it were not X-linked.  Therefore, the statistical tests for an 
autosomal locus are applied in this situation, with the degrees of freedom, df, 
as explained above.

The exact test is based on the probability (likelihood) of the genotypes under
HWE, conditional on the observed allele frequencies (see Weir, page 99).  If
there are more than two distinct alleles, pval.exact is not computed, but can 
be approximated by simulations (see Weir, page 109).  When x.linked=T, the
exact test is computed only on the subset of females.
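
For two alleles A and B, the conditional probability of the genotype counts
given the allele counts can be sketched as follows.  The function name
hwe.exact.sketch is hypothetical, and the p-value convention used (summing the
probabilities of all heterozygote counts that are no more probable than the
observed one) is a common choice that is assumed here; it is not necessarily
the rule applied by hwe.

```r
# Sketch only: exact HWE test for a two-allele autosomal locus.
# hwe.exact.sketch is a hypothetical name, not a function of this library.
hwe.exact.sketch <- function(n.AA, n.AB, n.BB) {
  N   <- n.AA + n.AB + n.BB
  n.A <- 2 * n.AA + n.AB          # count of allele A
  n.B <- 2 * n.BB + n.AB          # count of allele B

  # Heterozygote counts compatible with the allele counts
  # (they must have the same parity as n.A)
  het <- seq(n.A - 2 * floor(n.A / 2), min(n.A, n.B), by = 2)
  aa  <- (n.A - het) / 2
  bb  <- (n.B - het) / 2

  # log of N! 2^het n.A! n.B! / ((2N)! aa! het! bb!)
  log.prob <- lfactorial(N) - lfactorial(aa) - lfactorial(het) -
    lfactorial(bb) + het * log(2) +
    lfactorial(n.A) + lfactorial(n.B) - lfactorial(2 * N)
  prob  <- exp(log.prob)
  p.obs <- prob[het == n.AB]

  # Assumed convention: sum all outcomes no more probable than the observed
  list(het = het, prob = prob,
       pval = sum(prob[prob <= p.obs * (1 + 1e-8)]))
}

# Two subjects, one A/A and one B/B: the only alternative is two A/B
# heterozygotes, which has probability 2/3, so pval is 1/3.
res <- hwe.exact.sketch(n.AA = 1, n.AB = 0, n.BB = 1)
```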

The goodness of fit statistic has weak power when there are many different 
alleles.  So, another chi-square statistic is computed that is sensitive to an
excess of rare homozygotes.  This statistic scores heterozygotes as -1, and 
homozygous i/i as (1-p[i])/p[i] (see Yasuda, 1968), and is useful for evaluating
the quality of a genetic marker (see Gomes, 1999).  When x.linked=T, this
test is computed only on the subset of females.
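
One way to turn these scores into a chi-square statistic with one degree of
freedom is to square their sum and divide by N*(K-1), because with the p[i]
treated as known each score has mean 0 and variance K-1 under HWE.  This
scaling is an assumption of the sketch below, not a statement about the code
inside hwe, although it gives about 3.115 for the autosomal data in the
EXAMPLES section.  All object names are illustrative.

```r
# Sketch only: rare-homozygote score statistic, assuming variance N*(K-1).
a1 <- c(2, 1, 2, 1, 1, 3, 1, 3, 4, 2)
a2 <- c(4, 4, 4, 3, 3, 3, 1, 3, 4, 2)

alleles <- sort(unique(c(a1, a2)))
K <- length(alleles)
N <- length(a1)
p <- as.vector(table(factor(c(a1, a2), levels = alleles))) / (2 * N)

# Score each person: (1-p[i])/p[i] for an i/i homozygote, -1 for a heterozygote
p.first <- p[match(a1, alleles)]
score   <- ifelse(a1 == a2, (1 - p.first) / p.first, -1)

# Assumed scaling: the sum of N scores has variance N*(K-1) under HWE
rare.stat <- sum(score)^2 / (N * (K - 1))
rare.pval <- pchisq(rare.stat, df = 1, lower.tail = FALSE)
```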

If there are more than two distinct alleles, simulations can be performed to
compute accurate p-values for small samples.  The simulations provide p-values
for the goodness of fit test, the rare homozygous test, and the conditional 
likelihood "exact" test.

If one or both alleles of a person are missing, the person is eliminated from
all computations. 

Simulations under HWE are performed by randomly reordering all alleles, and 
sequentially selecting pairs of alleles from the reordered set to create new 
genotypes (see Weir, page 109).  When x.linked=T, sequential pairs of random
alleles are chosen for females, and one random allele is chosen for each male.
Random uniform variables are generated by the Wichmann and Hill random number
generator (see REFERENCES below).
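
The autosomal permutation scheme can be sketched in base R as follows.  The
helper gof.chisq and the other names are hypothetical.  The sketch uses the
default random number generator of the R session and not the Wichmann and Hill
generator, and it assumes that the simulated p-value is the proportion of
simulated statistics at least as large as the observed one, so its results are
not expected to match those of hwe exactly.

```r
# Sketch only: simulated p-value for the autosomal goodness of fit statistic.
# gof.chisq is a hypothetical helper, not a function of this library.
gof.chisq <- function(a1, a2) {
  alleles <- sort(unique(c(a1, a2)))
  N <- length(a1)
  p <- as.vector(table(factor(c(a1, a2), levels = alleles))) / (2 * N)
  observed <- table(factor(pmin(a1, a2), levels = alleles),
                    factor(pmax(a1, a2), levels = alleles))
  expected <- 2 * N * outer(p, p)
  diag(expected) <- N * p^2
  keep <- upper.tri(expected, diag = TRUE)
  sum((observed[keep] - expected[keep])^2 / expected[keep])
}

a1 <- c(2, 1, 2, 1, 1, 3, 1, 3, 4, 2)
a2 <- c(4, 4, 4, 3, 3, 3, 1, 3, 4, 2)
N  <- length(a1)
n.sim <- 1000
obs.stat <- gof.chisq(a1, a2)

set.seed(10)
pool <- c(a1, a2)                  # all 2N alleles
sim.stats <- replicate(n.sim, {
  shuffled <- sample(pool)         # random reordering of the alleles
  # sequential pairs (1,2), (3,4), ... form the new genotypes
  gof.chisq(shuffled[seq(1, 2 * N, by = 2)],
            shuffled[seq(2, 2 * N, by = 2)])
})

# Assumed definition of the simulated p-value
gof.sim.pval <- mean(sim.stats >= obs.stat)
```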
 
}
\section{References}{
Gomes I, Collins A, Lonjou C, Thomas N, Wilkinson J, Watson M, Morton N (1999)
Hardy-Weinberg quality control. Ann Hum Genet 63:535-538.

The Wichmann & Hill Random Number Generator, Algorithm AS 183; Applied 
Statistics, volume 32, number 2; The Royal Statistical Society, 1982.

Weir, Bruce S., Genetic Data Analysis II:  Methods for Discrete Population 
Genetic Data, 2nd ed.; Sinauer Associates, Incorporated Publishers, 
October, 1997.

Yasuda N (1968) Estimation of the inbreeding coefficient from phenotype 
frequencies by a method of maximum likelihood scoring. Biometrics 24:915-935.
}
\seealso{
\code{\link{locus}}
}
\examples{
# Assume that we have four allele codes and they are denoted by
# the first four positive integers 1, 2, 3, and 4.  Let a1 and 
# a2 be two allele vectors of length 10 with a corresponding sex
# vector given by female:

a1     <- c( 2,   1,   2,   1,   1,   3,   1,   3,   4,   2 )
a2     <- c( 4,   4,   4,   3,   3,   3,   1,   3,   4,   2 )
female <- c("F", "F", "F", "F", "F", "M", "M", "M", "M", "M")
set.seed(10)
locx.obj <- locus(a1,a2,sex=female,female.code="F",male.code="M",x.linked=TRUE)
hwex.obj <- hwe(locx.obj,sex=female,n.sim=1000)

# hwex.obj
#                 stat   df      pval 
#      gof      10.031    9     0.348
#     rare       1.667    1     0.197
#  gof.sim           -    -     0.408
# rare.sim           -    -     0.364
# like.sim           -    -     0.279
#
# The number of simulations requested, n.sim, was 1000
# x.linked=TRUE


# Use the same a1 and a2 as above, but assume that x.linked=F:

a1     <- c( 2,   1,   2,   1,   1,   3,   1,   3,   4,   2 )
a2     <- c( 4,   4,   4,   3,   3,   3,   1,   3,   4,   2 )
set.seed(10)
loc.obj <- locus(a1,a2,x.linked=FALSE)
hwe.obj <- hwe(loc.obj,n.sim=1000)

# hwe.obj
#                 stat   df      pval 
#      gof       7.611    6     0.268
#     rare       3.115    1     0.078
#  gof.sim           -    -     0.261
# rare.sim           -    -     0.114
# like.sim           -    -     0.085

# The number of simulations requested, n.sim, was 1000
# x.linked=FALSE
}

