In this paper, we show how to construct the genealogy of a sample of genes for a large class of models with selection and mutation. Each gene corresponds to a single locus at which there is no recombination. The genealogy of the sample is embedded in a graph which we call the ancestral selection graph. This graph contains all the information about the ancestry; it is the analogue of Kingman's coalescent process which arises in the case with no selection. The ancestral selection graph can be easily simulated and we outline an algorithm for simulating samples. The main goal is to analyze the ancestral selection graph and to compare it to Kingman's coalescent process. In the case of no mutation, we find that the distribution of the time to the most recent common ancestor does not depend on the selection coefficient, and hence is the same as in the neutral case. When the mutation rate is positive, we give a procedure for computing the probability that two individuals in the sample are identical by descent and the Laplace transform of the time to the most recent common ancestor of a sample of two individuals; we evaluate the first two terms of their respective power series in terms of the selection coefficient. The probability of identity by descent depends on both the selection coefficient and the mutation rate and is different from the analogous expression in the neutral case. The Laplace transform does not have a linear correction term in the selection coefficient. We also provide a recursion formula that can be used to approximate the probability of a given sample by simulating backwards along the sample paths of the ancestral selection graph, a technique developed by Griffiths and Tavare (1994).