Sunday, July 16, 2006

On the Genetic Efficiency of ex situ germplasm conservation

K K Vinod

When we plan an expedition to collect germplasm from a region, the following objectives need to be set.

1. Acquire the maximum genetic diversity of targeted taxa within the region, within the constraints of limited available resources.

2. Acquire germplasm with the maximum novelty value with respect to a collection already held ex situ; i.e. the greatest number and diversity of genes and genotypes that have not previously been collected.

3. Combat genetic erosion.

4. Acquire genes or genotypes most likely to benefit a particular breeding or research objective.

5. Acquire germplasm for analysis of agro-ecogeographic patterns of the distribution of biodiversity.

In the case of the first objective, the limiting resources vary from expedition to expedition. They may be time (time to travel to a site, time to collect overall site data, time to collect each seed or plant at a site, speed of returning live plants to base); space available in the collecting vehicle; or labour and facilities to process samples at base. Resource limitations influence the optimal collecting strategy in a way that depends on population structure.

Objective 2 is approached in the same way as objective 1, except that additional information is needed on the diversity and origins of the pre-existing collection.

Where the primary objective is to combat genetic erosion (objective 3 above), sampling strategy can and should still be designed to satisfy objective 1. However, painstaking planning to maximize the diversity collected may be counterproductive where the rate of erosion is so high that diversity is lost whilst planning is in progress. In these circumstances, speed of undertaking a collecting expedition is of overriding importance.

Objective 4, a breeder-driven collection to support a particular breeding objective, requires a totally different sampling strategy, to locate particular genes rather than maximize diversity of genes. Nevertheless, knowledge of evolutionary patterns can aid identification of sites most likely to contain the desired genes or genotypes.

Objective 5, collection for agro-ecogeographic analysis, requires yet another strategy, namely an appropriately randomized sampling procedure. This fact is not sufficiently recognized, as many published analyses are based on collections made for conservation or breeding purposes. Yet any sampling strategy that aims to maximize diversity or to target specific genes can generate incorrect and misleading estimates of components of variance. For example, suppose two collections are undertaken in two different regions, both with a sampling strategy to maximize the diversity sampled within each region. A comparison of the two collections will then incorrectly suggest that there is less difference between them and more variation within each region than is really the case.
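The distortion of variance components can be illustrated with a small numerical sketch. Everything in it is an assumption made for illustration: the two regions are deterministic stand-ins built from normal quantiles, "collecting for maximum diversity" is mimicked by keeping only the most extreme trait values, and the function names are hypothetical.

```python
from statistics import NormalDist, mean, pvariance

def region(mu, sigma=1.0, n=200):
    """Deterministic stand-in for a region: n evenly spaced quantiles of a normal trait."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]

def diversity_targeted(values, k=10):
    """Mimic collecting for maximum diversity: keep the k/2 lowest and k/2 highest values."""
    ordered = sorted(values)
    half = k // 2
    return ordered[:half] + ordered[-half:]

def between_share(groups):
    """Share of total variance lying between groups (equal-sized groups assumed)."""
    between = pvariance([mean(g) for g in groups])
    within = mean(pvariance(g) for g in groups)
    return between / (between + within)

region_a, region_b = region(10.0), region(12.0)
true_share = between_share([region_a, region_b])
targeted_share = between_share(
    [diversity_targeted(region_a), diversity_targeted(region_b)]
)
print(f"whole regions: {true_share:.2f}, diversity-targeted samples: {targeted_share:.2f}")
```

Under these assumptions the diversity-targeted samples attribute a smaller share of the variance to the difference between regions than the regions themselves show, which is the direction of bias described above.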

A specific example of an erroneous conclusion may be the widely accepted latitudinal cline in genetic diversity of Trifolium repens across Europe, according to which southern populations are believed to be highly diverse and northern ones uniform. This is likely to be spurious, and at least partly a consequence of collections being specifically targeted at well-managed pastures in the north but at highly diverse habitats in the south. A northern collection targeted at diverse habitats contained as much diversity as southern populations (Hamilton, 1980).

Appropriate randomization for agro-ecogeographic analyses need not mean full randomization. Collecting for maximum diversity can be compatible and even beneficial to agro-ecogeographic analysis. Seeking to collect maximum genetic diversity by targeting maximum environmental diversity improves the sensitivity of analysis of the relationship between genetic diversity and environmental diversity. This of course requires collection of all relevant environmental data so that they can be incorporated as independent variables in statistical analysis.

Collections targeting spatial scale of biodiversity and overall distribution of sites

The general scaling properties of biodiversity have two immediate implications. First, it is important to cover as large an area as possible. Second, adjacent sites should not be further apart than the genetic patch size, since increasing the geographical distance beyond this does not increase the expected genetic distance between two populations. For this purpose, genetic patch size should be measured using neutral genes, to provide a general baseline sampling strategy that is not influenced by any particular pattern of environmental diversity.

At the lower end of the scale, the genetic neighbourhood area defines the minimum possible scale for taking distinct population samples, at least of the seed population. At smaller scales mating is random and Hardy-Weinberg equilibrium is expected, with no possibility of division into genetically distinct subpopulations. This applies only to seeds: the population of adult plants may show genetic subdivision at smaller scales if the environment is heterogeneous at smaller scales, imposing smaller scale heterogeneity of pressures within the genetic neighbourhood. Thus there may be merit in finer-scale sampling of adult populations than seed populations.
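As a reminder of what Hardy-Weinberg equilibrium implies within a single neighbourhood, the following sketch computes the expected genotype frequencies at one locus with two alleles under random mating. The function name and the example allele frequency are illustrative assumptions, not values from any collection.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies (AA, Aa, aa) given frequency p of allele A."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    q = 1.0 - p
    return p * p, 2.0 * p * q, q * q

# Hypothetical allele frequency of 0.7 for allele A.
freq_AA, freq_Aa, freq_aa = hardy_weinberg(0.7)
print(round(freq_AA, 2), round(freq_Aa, 2), round(freq_aa, 2))
```

Seed samples taken at two points inside the same neighbourhood are expected to share these frequencies, so treating them as separate population samples adds no between-population information.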

However, the genetic neighbourhood area of most wild plant species is remarkably small, far smaller than the unit regarded as one population by the collector. For the insect-pollinated self-incompatible perennial Trifolium repens the reproductive genetic neighbourhood area is 2 m²; for the wind-pollinated self-incompatible perennial Lolium perenne it is 8.4 m². In practice, therefore, each sample of a wild population in an ex situ collection almost invariably comprises genotypes from what were originally numerous distinct genetic populations.

The genetic neighbourhood area of crop plants is closely related to the type of farming. In primitive farming communities it is generally much smaller than for modern agriculture. Farmers in such communities usually maintain and select their own seed, with limited 'dispersal' (by seed exchange) between isolated communities or even between farmers within communities. It is essential, when collecting, to determine what are the local customs in relation to seed selection and exchange, especially (i) whether a formal centralized system exists for exchanging seed, or whether exchange is informal and centralized through the market, or informal and localized to individual farmer-farmer interactions; and (ii) how much farmers rely on their own farm-saved seed, and if so whether they consciously make their own selections. Only with such local knowledge can the collector judge the probable scale of distribution of diversity.

Collections targeting by habitat and adaptation to environment

Targeting the maximum diversity of habitats for collection will maximize the diversity of genes contributing to adaptation to the selection pressures imposed in the environments sampled. It will also maximize diversity of genes closely linked to the adaptive genes and of pleiotropic characters. It will have no effect on the diversity of genes that are neutral for the particular environmental diversity sampled - this includes not only genes that appear neutral with respect to all known selection pressures, but also genes that are non-neutral for different types of environmental diversity.

Effective environmental targeting in this manner depends on the collector having good knowledge of environmental diversity in the region, and of the distribution of the target taxa in relation to environmental diversity. Much of the planning phase of a collection should be devoted to identifying contrasting environments, using as many sources of information as possible, preferably in map form: not only conventional geographical maps, but also maps of surface geology, soil, temperature, rainfall, vegetation and land use. Much additional information is not available in map form, and may not be readily available prior to the expedition, being in the knowledge domain of local extension scientists and farmers. Relevant local knowledge covers not only natural variation between fields but also diversity in farmer-selection pressures resulting from variation in crop usage and variation in preferred crop characteristics.

Collections targeting centres of diversity

It is now widely accepted that evolution does not progress at a uniform rate, but involves periods of relative stability interspersed with periods of rapid change. Exactly how and how much the rate of evolution changes is still the subject of debate. Nevertheless, for most species and genera it is possible to identify centres of diversity, associated with a phase of rapid diversification at some stage in their evolutionary history.

Centres of diversity are most strongly developed for crop species, leading to the famous pioneering work of Vavilov (1951). These centres are associated with early agricultural developments. They are attributed to disruptive selection caused by the simultaneous action of natural selection for fitness and artificial selection for agronomic value, combined with diverse artificial selections applied by different farmers in different environments, and with introgression between conspecific crop and wild relative.

By definition, a collecting expedition will obtain the greatest diversity if it is located within the centre of diversity of the target taxa. The content of ex situ collections should therefore contain a bias in favour of populations from the centre of diversity.

Sampling targeting environmental heterogeneity: stratified sampling

The environment is a multidimensional entity. Genetic adaptation to environment is correspondingly multidimensional. Different environmental variables show different patterns of variation in space and time. Therefore genetic variation for adaptation to different environmental variables also shows different patterns. For example, Hamilton (1980) collected Trifolium repens from an area of high diversity of soils and grassland management but uniform climate. Relative to the global diversity present in the entire gene pool of T. repens, genetic diversity between populations was high for vegetative and morphological characteristics important for adaptation to soils and management but low for time of flowering. More generally, it may be, for example, that populations from adjacent fields differ mainly in genes affecting response to management; ones from nearby fields differ mainly in genes for response to aspect; ones further apart differ mainly in genes for response to soils; ones from different altitudes differ mainly for response to temperature; ones from different villages for local human preferences; ones from different latitudes for response to day length; and so on.

Given this situation, a stratified sampling strategy will not just maximize the genetic diversity collected; it will maximize the diversity of different types of genetic diversity collected. A particular advantage of stratified sampling is that it does not depend on prior knowledge of the different scales of heterogeneity of different environmental attributes. Although such knowledge helps, the fact that different environmental variables show different scales of heterogeneity is itself sufficient to make a stratified sampling procedure more efficient in obtaining qualitatively different types of genetic diversity.
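One possible reading of stratification, sketched here under stated assumptions, is to group candidate sites by their recorded environmental classes and to visit every class once before revisiting any. The site records, attribute names and the function `stratified_sites` are hypothetical, and a real expedition would stratify on whatever variables and scales are relevant to the target taxa.

```python
from collections import defaultdict
from itertools import zip_longest

def stratified_sites(sites, keys, budget):
    """Pick up to `budget` sites, covering every stratum once before revisiting any."""
    strata = defaultdict(list)
    for site in sites:
        strata[tuple(site[k] for k in keys)].append(site["name"])
    # Round-robin across strata: first site of each stratum, then second, and so on.
    rounds = zip_longest(*(strata[s] for s in sorted(strata)))
    picked = [name for rnd in rounds for name in rnd if name is not None]
    return picked[:budget]

# Hypothetical candidate sites with two recorded environmental attributes.
sites = [
    {"name": "A", "soil": "clay", "altitude": "low"},
    {"name": "B", "soil": "clay", "altitude": "low"},
    {"name": "C", "soil": "clay", "altitude": "high"},
    {"name": "D", "soil": "sand", "altitude": "low"},
    {"name": "E", "soil": "sand", "altitude": "low"},
    {"name": "F", "soil": "sand", "altitude": "high"},
]
chosen = stratified_sites(sites, keys=("soil", "altitude"), budget=4)
print(chosen)
```

With a budget of four sites, each of the four soil-by-altitude classes is represented once, whereas an unstratified choice of the first four sites in the list would miss one class entirely.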

For some purposes, the stratification of sampling procedure should be extended to sampling individuals within sites, at least for natural populations. Certainly this will maximize within-accession diversity sampled from such populations. For example: sampling several individuals from a single genetic population will sample diversity in genes that are truly polymorphic at the genetic population level; sampling from different quadrats within a field will acquire diversity in genes responsible for micro scale adaptation to patchiness of the vegetation, soil characteristics, and microflora, microtopography, etc.; and samples from the boundaries of the field are more likely to contain immigrant genes from nearby, differently adapted populations.

Stratification of sampling procedure within a population is rarely appropriate for crop populations and market populations. Even for natural populations it may not always be appropriate. In particular, by maximizing within-population variance, a stratified sampling procedure will invalidate agro-ecogeographical comparison of different populations. If this is important, a random sampling procedure is more appropriate, unless the individual plants are maintained separately.

For species where it is difficult, or even impossible as routine practice, to distinguish plants from each other - as with most perennial herbaceous species communities - it may be impossible to take a truly random sample. In these species, only inflorescences, or leaves, or some other part of the plant, can be sampled at random. This inevitably introduces a size bias into the sample in favour of those plants with the most inflorescences, leaves, etc.

Sampling targeting breeding system and adjustment of sampling procedures

The breeding system has a major influence on the distribution of genetic diversity. A number of mechanisms operate to fix particular variants in lines: inbreeding fixes them through homozygosity; apomixis fixes variants even in heterozygotes; some complex chromosome linkages, like those in Oenothera, operate to minimize recombination. All these cases reduce within-population variance, so that a correspondingly increased proportion of the total gene pool is represented by variation between populations. In contrast, outbreeders show higher within-population variance. Sampling procedures must be adjusted correspondingly, to take relatively few individuals from many populations of inbreeding, apomictic and similar species, and many individuals from each of fewer populations of outbreeding species (Marshall and Brown, 1975).
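The logic of this adjustment can be seen in a deliberately extreme toy model, which is an illustrative assumption and not the procedure of Marshall and Brown (1975). Each inbreeding population is taken to be fixed for one randomly chosen allele out of K equally common alleles, each outbreeding population is taken to hold all K alleles at equal frequency, and each sampled plant is counted as one allele for simplicity.

```python
def expected_alleles(k, units):
    """Expected distinct alleles from `units` independent draws among k equally common alleles."""
    return k * (1.0 - (1.0 - 1.0 / k) ** units)

def captured(k, populations, plants_each, inbreeder):
    """Toy model: inbreeding populations are each fixed for one random allele;
    outbreeding populations each hold all k alleles at equal frequency."""
    if inbreeder:
        # Extra plants from a fixed population add nothing new.
        return expected_alleles(k, populations)
    # Every sampled plant is an independent draw (one allele per plant, for simplicity).
    return expected_alleles(k, populations * plants_each)

K = 10  # hypothetical number of alleles at one locus
in_many = captured(K, populations=30, plants_each=1, inbreeder=True)
in_few = captured(K, populations=3, plants_each=10, inbreeder=True)
out_many = captured(K, populations=30, plants_each=1, inbreeder=False)
out_few = captured(K, populations=3, plants_each=10, inbreeder=False)
print(round(in_many, 2), round(in_few, 2), round(out_many, 2), round(out_few, 2))
```

For the same total of 30 plants, the toy inbreeder yields far more alleles when the plants are spread over many populations, while the toy outbreeder yields the same number either way, so the travel cost of visiting extra populations buys nothing.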

Vegetative propagation (by stolons, bulbs, rhizomes, etc.) is functionally equivalent to apomixis in that it can generate numerous genetically identical plants. However, vegetative propagation is often associated with outbreeding, generating a complex two-level population structure. There is high genetic variance among the individuals originating by sexual reproduction through different zygotes, and zero genetic variance (ignoring somatic mutations) among the vegetative progeny derived solely by mitotic division from a single zygote.

In many such vegetatively reproduced species it is impossible to know at a glance whether two plants are derived from the same or from different zygotes. In these species the two-level population structure can present very considerable problems for collection for efficient conservation. The commonest approach is to ensure a large enough distance between sampled individuals so as to be reasonably confident that they are genetically distinct. However, a single clone of even small herbaceous species can cover hundreds or thousands of square metres. The distance between adjacent samples therefore has to be undesirably large, in that it eliminates sampling the genetic diversity that is expressed at smaller scales. For many species there is no satisfactory resolution to this problem, as the only resolution may be intensive sampling followed by genetic fingerprinting to determine the genotypic composition of the population sample, which of course is unjustifiably labour intensive.

Populations of these species often show a highly skewed distribution of physical size of genotypes, with a few large genotypes and many small ones (Hamilton et al., 1996). When population samples are based on a random selection of inflorescences or leaves, the sample will be strongly biased in favour of the few large genotypes.
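The strength of this size bias can be gauged with a short calculation. The sketch assumes that each inflorescence is drawn independently and with replacement, so that a genotype is picked with probability proportional to its size; the patch composition (2 large clones, 18 small ones) and the function name are hypothetical.

```python
def expected_distinct(weights, draws):
    """Expected number of distinct genotypes in `draws` independent draws,
    each genotype being drawn with probability proportional to its weight."""
    total = sum(weights)
    return sum(1.0 - (1.0 - w / total) ** draws for w in weights)

# Hypothetical patch: sizes are inflorescence counts per genotype.
sizes = [200, 200] + [5] * 18
by_inflorescence = expected_distinct(sizes, draws=20)
by_plant = expected_distinct([1] * len(sizes), draws=20)
share_large = sum(sizes[:2]) / sum(sizes)
print(f"{share_large:.0%} of inflorescences belong to the two large clones")
print(f"expected distinct genotypes in 20 draws: "
      f"{by_inflorescence:.1f} by inflorescence, {by_plant:.1f} by plant")
```

Under these assumptions a sample of 20 inflorescences is expected to contain markedly fewer distinct genotypes than a sample of 20 draws in which every genotype is equally likely, because most draws land on the two large clones.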

Collections targeting temporal scale of biodiversity

Little attention has been paid to the temporal scale of biodiversity for conservation purposes. Whilst the importance of cyclic fluctuations, chaotic changes and continuous directional shifts is well acknowledged and documented, it has rarely if ever been considered justifiable for conservation purposes to return to the same sites for repeat collections. The only common reason for returning to a site or region is to test specific hypotheses, for example to test the extent of genetic erosion.


Hamilton, N.R.S. and Chorlton, K.H.C. (1995) Collecting vegetative material of forage grasses and legumes. In: Guarino, L., Ramanatha Rao, V. and Reid, R. (eds) Collecting Plant Genetic Diversity: Technical Guidelines. CAB International, Wallingford, UK, pp. 467-484.

Hamilton, N.R.S., Jones, D., Cresswell, A. and Fothergill, M. (1996) Genetic diversity and sustainability of clover-based pastures: 2. Size hierarchies and sampling bias. In: Younie, D. (ed.) Legumes in Sustainable Farming Systems. Symposium of the British Grassland Society, Aberdeen. pp. 179-180.

Marshall, D.R. and Brown, A.H.D. (1975) Optimum sampling strategies in genetic conservation. In: Frankel, O.H. and Hawkes, J.G. (eds) Crop Genetic Resources for Today and Tomorrow. Cambridge University Press, Cambridge, pp. 53-80.

Vavilov, N.I. (1951) The origin, variation, immunity and breeding of cultivated plants (translated by K.S. Chester). Chronica Botanica 13, 1-366.