This review summarizes information on the estimation of the inbreeding coefficient by conventional and molecular methods, with the aim of improving the accuracy with which genetic diversity is estimated in animal populations. Pedigrees are the best way to evaluate relationships among individuals, but they are often unavailable, particularly for wild populations. In the absence of detailed pedigree records, attempts have been made to estimate individuals' levels of inbreeding using molecular markers, generally through heterozygosity measures based on microsatellite markers. When both genealogical and molecular information is available, the two can be combined to calculate the coancestry conditional on markers, which may give a better estimate of the genetic structure of a population.
As the livestock industry strives to supply food to the global market, many owners use breeding strategies designed to increase genetic gain in their herds. As a result of these strategies, inbreeding levels have risen in many populations in recent decades. Inbreeding is defined as the mating of individuals whose relatedness is greater than the average degree of relationship in the population (Lush, 1945); it changes the genotypic frequencies of a population without modifying the gene frequencies. Inbreeding has a deleterious effect on additive genetic variance as well as on phenotypic values. The accumulation of deleterious recessive alleles caused by inbreeding also affects fitness traits (Falconer and Mackay, 1996). The inbreeding coefficient is defined as the probability that the two alleles at a locus are identical by descent. Identity by descent results from the transmission of an allele from one common ancestor to multiple offspring (MacKinnon, 2003). If the two alleles at a locus are identical by descent, the genotype is said to be autozygous; otherwise the genotype is allozygous. Allozygous genotypes may be homozygous (identical by state) or heterozygous, but in the absence of recent mutation an autozygous genotype is always homozygous (Hartl and Clark, 1997).
Importance of Inbreeding
Mating of related individuals decreases the viability and fertility of offspring below the population mean, a phenomenon known as inbreeding depression. Inbreeding leads to reduced genetic diversity and a rapid loss of heterozygosity in small, closed, selected populations, which may compromise the production and/or fitness of inbred animals (Selvaggi et al., 2010). The main theories proposed to explain inbreeding depression are the overdominance and partial recessive hypotheses (Charlesworth and Charlesworth, 1987). Under the overdominance hypothesis, inbreeding depression is attributable to the higher fitness of heterozygotes, whereas the partial recessive hypothesis proposes that the negative fitness consequences are due to the fixation of recessive or partially recessive deleterious alleles (Frankham et al., 2002). Deleterious recessive alleles are considered the major cause of inbreeding depression (Charlesworth and Charlesworth, 1999).
Consequences of Inbreeding
Inbreeding has the following major consequences:
Conventional Methods for Estimation of Inbreeding Coefficient
The various methods of estimating the inbreeding coefficient are based upon:
Path Analysis Technique
In path analysis, a path diagram is made from the pedigree. The arrows, or paths, flow from the oldest to the youngest generation and show how genes were transmitted from generation to generation. Paths may be straight lines or may cross, depending on the complexity of the family tree and the matings that occurred.
$$F_X = \sum \left[ (0.5)^{N} (1 + F_A) \right]$$
$F_X$ – the inbreeding of an individual;
$\sum$ – the symbol for “sum of” or “add”;
$N$ – the number of individuals in a path that is determined by tracing a path from one parent back to the common ancestor and forward from the common ancestor to the other parent; if more than one common ancestor exists, the term $(0.5)^{N}$ is repeated for each common ancestor; if more than one path exists between the individual and a common ancestor, the term $(0.5)^{N}$ is repeated for each unique path;
$F_A$ – the inbreeding of the common ancestor.
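The path formula above can be illustrated with a short worked example. The sketch below is only an illustration: the function name `path_inbreeding` and the two textbook matings (half sibs and full sibs, with non-inbred common ancestors) are assumptions chosen for clarity and are not taken from the studies reviewed here.

```python
def path_inbreeding(paths):
    """Apply the path formula: F_X = sum((0.5)**N * (1 + F_A)).

    `paths` is a list of (N, F_A) tuples, one per unique path, where N is
    the number of individuals in the path (both parents and the common
    ancestor included) and F_A is the inbreeding of the common ancestor.
    """
    return sum(0.5 ** n * (1.0 + f_a) for n, f_a in paths)


# Half-sib mating: one common ancestor (the shared sire).
# Path: parent 1 <- sire -> parent 2, so N = 3.
f_half_sib = path_inbreeding([(3, 0.0)])

# Full-sib mating: two common ancestors (sire and dam), each with N = 3.
f_full_sib = path_inbreeding([(3, 0.0), (3, 0.0)])

print(f_half_sib, f_full_sib)  # 0.125 0.25
```

Each tuple corresponds to one term of the sum, so a pedigree with several common ancestors or several routes to the same ancestor simply contributes more tuples.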
The coancestry method is preferred over path analysis for calculating the coefficient of inbreeding in complicated pedigrees, and it is useful under systems of very close inbreeding, where identifying each inheritance pathway becomes very laborious. The coefficient of inbreeding of an individual (F) is equal to the coancestry (f) between its two parents.
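The relation between an individual's inbreeding and the coancestry of its parents can be sketched recursively. The example below is a minimal illustration and not a production routine: the pedigree, the names `PEDIGREE`, `coancestry` and `inbreeding`, and the assumption that founders are unrelated and non-inbred are all hypothetical choices made for this example.

```python
from functools import lru_cache

# Hypothetical pedigree: individual -> (sire, dam); None marks an unknown
# parent. Entries are listed oldest first, so parents precede offspring.
PEDIGREE = {
    "A": (None, None),
    "B": (None, None),
    "C": ("A", "B"),
    "D": ("A", "B"),
    "X": ("C", "D"),  # offspring of a full-sib mating
}
ORDER = {name: i for i, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def coancestry(i, j):
    """Coancestry f between individuals i and j (0 if either is unknown)."""
    if i is None or j is None:
        return 0.0
    if i == j:
        sire, dam = PEDIGREE[i]
        return 0.5 * (1.0 + coancestry(sire, dam))
    # Always expand the younger individual through its parents.
    if ORDER[i] > ORDER[j]:
        i, j = j, i
    sire, dam = PEDIGREE[j]
    return 0.5 * (coancestry(i, sire) + coancestry(i, dam))


def inbreeding(ind):
    """F of an individual equals the coancestry between its two parents."""
    sire, dam = PEDIGREE[ind]
    return coancestry(sire, dam)


print(inbreeding("X"))  # 0.25
```

No pathway has to be enumerated by hand here, which is the practical reason this approach scales better than path analysis on complicated pedigrees.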
In covariance analysis, a covariance table is made from the pedigree. Every individual in the pedigree is listed at the top of a column and to the left of a row. The parents of each individual are listed at the far left of the table; an unknown parent is indicated by a dash in the parents' column. Each cell of the table contains either the covariance between two individuals (off-diagonal cells) or the covariance of an individual with itself (diagonal cells). The diagonal values can be used to determine individual inbreeding: inbreeding is obtained by subtracting 1.0 from an individual's covariance with itself.
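The covariance table can be filled row by row once the pedigree is sorted from oldest to youngest. The sketch below assumes the usual tabular rules (an off-diagonal cell is the average of the covariances with the two parents, and a diagonal cell is 1 plus half the covariance between the parents); the function name `relationship_table` and the five-animal pedigree are hypothetical.

```python
def relationship_table(pedigree):
    """Build the covariance table for a pedigree.

    `pedigree` is a list of (individual, sire, dam) tuples ordered so that
    parents appear before their offspring; None marks an unknown parent.
    """
    ids = [ind for ind, _, _ in pedigree]
    idx = {ind: k for k, ind in enumerate(ids)}
    n = len(ids)
    a = [[0.0] * n for _ in range(n)]
    for ind, sire, dam in pedigree:
        i = idx[ind]
        s = idx.get(sire)  # None when the parent is unknown
        d = idx.get(dam)
        # Off-diagonal cells: average of the covariances with both parents.
        for j in range(i):
            a_js = a[j][s] if s is not None else 0.0
            a_jd = a[j][d] if d is not None else 0.0
            a[i][j] = a[j][i] = 0.5 * (a_js + a_jd)
        # Diagonal cell: 1 plus half the covariance between the parents.
        a_sd = a[s][d] if s is not None and d is not None else 0.0
        a[i][i] = 1.0 + 0.5 * a_sd
    return ids, a


pedigree = [
    ("A", None, None),
    ("B", None, None),
    ("C", "A", "B"),
    ("D", "A", "B"),
    ("X", "C", "D"),  # offspring of a full-sib mating
]
ids, table = relationship_table(pedigree)
# Inbreeding is the diagonal value minus 1.0.
inbreeding = {ind: table[k][k] - 1.0 for k, ind in enumerate(ids)}
print(inbreeding["X"])  # 0.25
```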
Estimation of Inbreeding Coefficient by Microsatellite Markers
Microsatellites have been used worldwide as reliable molecular markers to study the genetic relationships of different populations and to measure inbreeding indirectly. These markers are co-dominant, highly polymorphic, highly abundant, heritable, locus specific and easily analyzed, which makes them suitable for studies of the phylogenetic constitution of populations (Wang et al., 2007). Using microsatellite markers, the inbreeding coefficient can be measured as the deviation of the observed heterozygosity of an individual from the heterozygosity expected under random mating (Lukas and Donald, 2002), as follows:
$$F = 1 - \frac{H_o}{H_e}$$
$F$ – coefficient of inbreeding.
$H_o$ – observed frequency of heterozygous individuals.
$H_e$ – expected frequency of heterozygous individuals in the population.
Hence, microsatellite markers indirectly measure the inbreeding coefficient based on marker heterozygosity at different loci.
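The heterozygosity-based estimate can be sketched in a few lines. The example below is an illustration under stated assumptions: the genotypes and allele labels are invented, expected heterozygosity is taken as one minus the sum of squared allele frequencies with no small-sample correction, and the function names are hypothetical.

```python
from collections import Counter


def locus_heterozygosity(genotypes):
    """Return (Ho, He) for one locus.

    `genotypes` is a list of (allele1, allele2) tuples, one per individual.
    """
    n = len(genotypes)
    ho = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    # Expected heterozygosity under random mating: 1 - sum(p_i ** 2).
    he = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return ho, he


def inbreeding_from_markers(loci):
    """F = 1 - Ho/He, with Ho and He averaged over loci."""
    pairs = [locus_heterozygosity(g) for g in loci]
    mean_ho = sum(ho for ho, _ in pairs) / len(pairs)
    mean_he = sum(he for _, he in pairs) / len(pairs)
    return 1.0 - mean_ho / mean_he


# Hypothetical microsatellite locus with alleles of 120 and 124 bp:
# 4 homozygotes of each type and 2 heterozygotes.
locus = [(120, 120)] * 4 + [(120, 124)] * 2 + [(124, 124)] * 4
f = inbreeding_from_markers([locus])
print(round(f, 3))  # 0.6
```

Here both allele frequencies are 0.5, so the expected heterozygosity is 0.5 while only 20% of individuals are heterozygous, giving a strong heterozygote deficit.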
Procedure for Estimation of Inbreeding Coefficient by Microsatellite Method
Use of Microsatellite Markers in Estimation of Inbreeding Coefficient
The average inbreeding coefficient of a population, calculated from pedigree information, is frequently used to describe the genetic variability of populations. Developments in molecular genetics have now made it possible to calculate inbreeding coefficients from genetic marker information. Hosain et al. (2010) evaluated Caspian horses for genetic diversity and assessed them for recent population bottlenecks. If both genealogical and molecular information is available, the two can be combined to calculate the coancestry, which might give a better estimate of the genetic structure of the Caspian horse population.
Despite being the best way to evaluate relationships among individuals, pedigrees are usually unavailable for wild populations (Frankham et al., 2002; Haig & Ballou, 2002; Earnhardt et al., 2004; Pemberton, 2004; Ralls & Ballou, 2004; Grueber & Jamieson, 2008). Molecular estimates of inbreeding and relationship can be used as an alternative to measure homozygosity by descent (autozygosity) (Frankham et al., 2002). The expected decrease in the heterozygosity of neutral markers has led to studies of the effects of inbreeding through heterozygosity–fitness correlations (HFCs): correlations between heterozygosity of molecular markers (multilocus heterozygosity, MLH) and fitness components (David, 1998; Hansson & Westerberg, 2002; Pemberton, 2004; Kempenaers, 2007). Grueber et al. (2008) studied the application of HFCs to the estimation of inbreeding depression in threatened species. They noted that extremely bottlenecked populations show reduced genetic diversity and heterozygosity in comparison with non-bottlenecked populations (Nei et al., 1975; Frankham et al., 2002; Spielman et al., 2004). This means that loci exhibiting variability in an outbred population are often monomorphic in a bottlenecked one (Taylor, 2006). As a result, molecular analysis of genetically depauperate species is time-consuming and requires screening large numbers of loci or several types of molecular markers, such as microsatellites, minisatellites or SNPs (Miller et al., 2003; Taylor et al., 2007; Grueber et al., 2008).
Kardos et al. (2015) used computer simulations to test whether the realized proportion of the genome that is identical by descent (IBDG) is predicted better by the pedigree inbreeding coefficient (FP) or by genomic (marker-based) measures of inbreeding. Genomic estimators of IBDG included the increase in individual homozygosity relative to mean Hardy-Weinberg expected homozygosity (FH), and two measures (FROH and FE) that use mapped genetic markers to estimate IBDG. They demonstrated that IBDG can be more precisely estimated with large numbers of genetic markers than with pedigrees.
Brown et al. (2014) assessed inbreeding in a small population of Chios sheep undergoing intense selection on the prion protein (PrP) gene, 10 years after the start of a scrapie resistance selection programme. The mean individual inbreeding coefficient estimated from the pedigree was 4.5 per cent, five generations after the implementation of selection for the PrP gene. The inbreeding coefficient estimated from genetic markers was 4.37 per cent, implying that such a marker panel could be a useful and cost-effective tool for estimating inbreeding in unrecorded populations. The results from the pedigree and genetic marker analyses showed a high degree of agreement, as follows.
Table 1: Comparison of results from pedigree and genetic marker analysis showing a high degree of agreement
|Rate of inbreeding (ΔF)||0.0065||0.0048|
|Effective population size (Ne)||76||104|
|Average relatedness (AR)||0.58|
|Molecular coancestry (FM)||0.349|
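The two effective population sizes in Table 1 are numerically consistent with the standard relation Ne = 1/(2ΔF). The short check below assumes that this relation underlies the tabulated values, which the text above does not state explicitly; the function name is hypothetical.

```python
def effective_population_size(delta_f):
    """Ne from the rate of inbreeding per generation: Ne = 1 / (2 * dF)."""
    return 1.0 / (2.0 * delta_f)


# Rates of inbreeding reported in Table 1.
for delta_f in (0.0065, 0.0048):
    print(delta_f, round(effective_population_size(delta_f), 1))
# 0.0065 76.9
# 0.0048 104.2
```

Truncating these values gives 76 and 104, matching the table, so a modest difference in ΔF between the two analyses translates into a noticeably different Ne.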
In wild animal populations, the amount of inbreeding differs between species and between populations within species. Observed inbreeding is usually assumed to be caused by limited outbreeding opportunities due to demographic factors such as small population size or population sub-structuring. Langen et al. (2011) investigated inbreeding in a natural population of the West African cichlid fish Pelvicachromis taeniatus, which showed clear kin mating preferences in standardized laboratory experiments but no inbreeding depression. The microsatellite analysis revealed that the natural population has, in comparison with two reference populations, a reduced allelic diversity (A = 3) resulting in a low heterozygosity (Ho = 0.167), pointing to a highly inbred population. Furthermore, they found a significant heterozygote deficit not only at the population level (Fis = 0.116) but also at the subpopulation level (Fis = 0.081), suggesting that inbreeding is not only a by-product of population sub-structuring but possibly a consequence of behavioral kin preferences.
Leroy et al. (2009) studied the genetic diversity of 61 dog breeds raised in France. Genealogical analyses were performed on the pedigree file of the French kennel club. A total of 1514 dogs were genotyped using 21 microsatellite markers. At the breed level, few correlations were found between genealogical and molecular parameters. Kinship coefficients and individual similarity estimators were, however, significantly correlated, with the best mean correlation being found for the Lynch & Ritland estimator (r = 0.43). According to both approaches, it was concluded that special efforts should be made to maintain diversity for three breeds, namely the Berger des Pyrénées, Braque Saint-Germain and Bull Terrier.
Two alternative approaches can be used to measure how inbred an individual is: pedigree records to estimate inbreeding coefficients, or molecular markers to measure multilocus heterozygosity. However, the relationship between inbreeding coefficient and heterozygosity has only rarely been investigated. Slate et al. (2004) investigated this relationship using microsatellite genotypes at 138 loci spanning all 26 autosomes of the sheep genome. Multilocus heterozygosity was only weakly correlated with inbreeding coefficient, and heterozygosity was not positively correlated between markers more often than expected by chance. Inbreeding coefficient, but not multilocus heterozygosity, detected evidence of inbreeding depression for morphological traits.
Roswitha et al. (2003) carried out a simulation study involving ten sires and 50 dams. They mated animals over a period of 20 discrete generations. The population size was kept constant. Different situations with regard to the level of polymorphism, initial allele frequencies and mating scheme (random mating, avoidance of full sib mating, avoidance of full sib and half sib mating) were considered. Pedigree inbreeding coefficients of the last generation were calculated using the full pedigree or 10, 5 and 2 generations of the pedigree. Marker inbreeding coefficients based on different sets of microsatellite loci were also investigated. Under random mating, pedigree inbreeding coefficients were found to be more closely related to true autozygosity (i.e. the actual proportion of loci with alleles identical by descent) than marker inbreeding coefficients. If mating was not random, the demands on the quality and quantity of pedigree records increased. They reported that greater attention must be paid to the correct parentage of the animals.
Sahoo et al. (2016) revealed the presence of high genetic diversity within native Indian pig breeds by using microsatellite markers. They also found a very low probability of genetic identity between two individuals belonging to different populations, so that each population can be considered a separate genetic entity. The study verified that, using a panel of microsatellite markers, different breeds or populations of native Indian pigs can be suitably investigated for relationships and genetic diversity.
Estimation of Inbreeding Coefficient using SNP Panel
Traditional breeding programs assume an average pairwise kinship between sibs. The pedigree-based relationship matrix used for genetic evaluations disregards variation due to Mendelian sampling. Inbreeding and kinship coefficients are therefore either over- or underestimated, which reduces the accuracy of genetic evaluations and genetic progress. Pairwise kinship and individual inbreeding can be estimated more accurately from single nucleotide polymorphisms (SNPs). Genome-wide SNP data provide a powerful tool to estimate pairwise relatedness among individuals and individual inbreeding coefficients. For complex pedigrees or the detection of pedigree errors, the high-throughput genotyping performed in GWAS offers new opportunities, using as many as millions of SNPs to assess the degree of relationship between a pair of individuals. Lopes et al. (2013) reported that reduced sets of SNPs could generate more accurate kinship coefficients between sibs than the pedigree-based method. Variation in the genomic kinship of father-offspring pairs was recommended as a parameter for determining the accuracy of the method, rather than correlation with pedigree-based estimates. Inbreeding and kinship coefficients can be estimated with high accuracy using ≥2,000 unlinked SNPs.
Li et al. (2011) reported that genome-wide SNP data provide a powerful tool to estimate pairwise relatedness among individuals and individual inbreeding coefficients. They compared methods for estimating the two parameters in a Finnsheep population based on genome-wide SNPs and genealogies, separately. The study included ninety-nine Finnsheep in Finland that differed in coat colour (white, black, brown, grey, and black/white spotted) and came from a large pedigree comprising 319 119 animals. All the individuals were genotyped with the Illumina Ovine SNP50K BeadChip by the International Sheep Genomics Consortium. They identified three genetic subpopulations that corresponded approximately with the coat colours (grey, white, and black and brown) of the sheep and detected a significant subdivision among the colour types (FST = 5.4%, P<0.05). They applied robust algorithms for the genomic estimation of individual inbreeding (FSNP) and pairwise relatedness (ΦSNP) as implemented in the programs KING and PLINK, respectively. Estimates of the two parameters from pedigrees (FPED and ΦPED) were computed using the RelaX2 program. Values of the two parameters estimated from genomic and genealogical data were mostly consistent, in particular for the highly inbred animals (e.g. inbreeding coefficient F>0.0625) and pairs of closely related animals (e.g. full or half sibs). Nevertheless, they also detected differences in the two parameters between the approaches, particularly with respect to the grey Finnsheep. This could be due to the smaller sample size and relative incompleteness of the pedigree for these animals. They concluded that genome-wide genomic data will provide useful information on a per-sample or pairwise-sample basis in cases of complex genealogies or in the absence of genealogical data.
Fig.1: Inbreeding coefficient based on the genomic data (FSNP) plotted against Inbreeding coefficient based on the pedigree data (FPED).
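A common way to obtain an individual inbreeding coefficient from SNP genotypes is a method-of-moments estimate based on excess homozygosity, which is the quantity PLINK reports with its `--het` option. The sketch below is a simplified illustration of that idea and not a re-implementation of PLINK: the function name, the 0/1/2 dosage coding and the example allele frequencies are assumptions, and no small-sample correction is applied.

```python
def snp_inbreeding(genotypes, allele_freqs):
    """Method-of-moments F from excess homozygosity.

    `genotypes` holds allele-dosage codes for one individual (0, 1 or 2
    copies of the reference allele; None for a missing call).
    `allele_freqs` holds the reference-allele frequency of each SNP.
    Assumes at least some called SNPs are polymorphic.
    """
    observed_hom = 0
    expected_hom = 0.0
    n_called = 0
    for g, p in zip(genotypes, allele_freqs):
        if g is None:
            continue  # skip missing genotypes
        n_called += 1
        if g != 1:
            observed_hom += 1
        # Expected homozygosity under Hardy-Weinberg: 1 - 2pq.
        expected_hom += 1.0 - 2.0 * p * (1.0 - p)
    return (observed_hom - expected_hom) / (n_called - expected_hom)


freqs = [0.5, 0.5, 0.5, 0.5]                 # hypothetical allele frequencies
print(snp_inbreeding([0, 2, 2, 0], freqs))   # 1.0  (fully homozygous)
print(snp_inbreeding([0, 1, 2, 1], freqs))   # 0.0  (as expected under HWE)
```

Negative values are possible when an individual is more heterozygous than expected, which is one reason such estimates are usually interpreted relative to the reference population used for the allele frequencies.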
Gazal et al. (2014) performed simulations with known genealogies using different SNP panels with different levels of linkage disequilibrium (LD) to compare several estimators of inbreeding coefficient, including single-point estimates, methods based on the length of runs of homozygosity (ROHs) and different methods that use hidden Markov models (HMMs). Single point methods were found to have higher standard deviations than other methods. ROHs gave the best estimates provided the correct length threshold is known. HMMs on sparse data gave equivalent or better results than HMMs modeling LD. Provided LD is correctly accounted for, the inbreeding estimates were very similar using the different SNP panels.
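The dependence of ROH-based estimates on the length threshold can be seen in a deliberately simplified sketch. The code below only illustrates the idea (the summed length of long homozygous runs divided by the genome length covered): the function name, the marker positions and the genotypes are hypothetical, and real ROH callers additionally handle missing data, genotyping errors and marker density.

```python
def f_roh(positions, genotypes, genome_length, min_length):
    """Fraction of the genome covered by runs of homozygosity.

    `positions`: sorted base-pair positions of SNPs on one chromosome.
    `genotypes`: matching 0/1/2 dosage codes (1 = heterozygous).
    `genome_length`: length in bp of the region covered by the markers.
    `min_length`: minimum run length in bp for a run to count as an ROH.
    """
    total = 0
    run_start = None
    prev = None
    for pos, g in zip(positions, genotypes):
        if g != 1:
            if run_start is None:
                run_start = pos  # a new homozygous run begins
            prev = pos
        else:
            # A heterozygous call ends the current run.
            if run_start is not None and prev - run_start >= min_length:
                total += prev - run_start
            run_start = None
    # Close a run that extends to the last marker.
    if run_start is not None and prev - run_start >= min_length:
        total += prev - run_start
    return total / genome_length


positions = [i * 1_000_000 for i in range(10)]  # one SNP per Mb
genotypes = [0, 2, 2, 0, 1, 2, 1, 0, 0, 2]
f = f_roh(positions, genotypes, genome_length=9_000_000, min_length=2_500_000)
print(round(f, 3))  # 0.333
```

Lowering `min_length` to 1 Mb in this toy example also counts the 2 Mb run at the end and raises the estimate to 5/9, which shows why the choice of threshold matters.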
Wang (2016) showed by simulation that genomic markers can yield much better estimates of inbreeding and relatedness than pedigrees when they are numerous (about 10000 SNPs) under realistic situations (e.g. genome and population sizes). The simulations also confirmed that inbreeding estimated from many SNPs can be much more powerful than inbreeding estimated from pedigrees for detecting inbreeding depression in viability. However, Wang (2016) argued that pedigrees cannot be replaced completely by genomic SNPs, because pedigrees allow the calculation of more complicated IBD coefficients (involving more than 2 individuals, more than one locus, and more than 2 genes at a locus), for which SNPs may have reduced capacity or limited power, and because pedigrees have social and other significance for remote relationships that have little genetic significance and cannot be inferred reliably from markers. Makanjuola et al. (2018) reported that inbreeding coefficients estimated from genomic information are increasingly being used in place of traditional pedigree measures, because they account for realized inbreeding rather than its expectation. They suggested that the majority of recent inbreeding can be captured using medium-density panels, whereas older inbreeding and population history are better inferred from whole genome sequence data.
Druet et al. (2017) proposed a model that estimates inbreeding relative to multiple age-based classes. Each inbreeding distribution is associated with a different time in the past, with recent inbreeding generating longer homozygous stretches than more ancient inbreeding. The model is a mixture of exponential distributions implemented in a hidden Markov model framework that uses marker allele frequencies, genetic distances, genotyping error rates and the sequences of observed genotypes. Simulation studies showed that the inbreeding coefficients and the age of inbreeding are correctly estimated. The mean absolute errors of the estimators are low, with efficiency depending on the available information. When several inbreeding classes are simulated, the model captures them if their ages are sufficiently different. Genotyping errors or low-fold sequencing data are easily accommodated in the hidden Markov model framework.
Abri et al. (2017) compared pedigree and genomic measures of relatedness and inbreeding in a herd of 36 pedigreed Egyptian Arabian horses genotyped using the Equine SNP70 platform (Geneseek, Inc.). They also estimated the minimum number of markers sufficient for genomic inbreeding calculations. Pedigree inbreeding values were moderately correlated with genomic inbreeding values (r = 0.406), whereas genomic and pedigree relationships were highly correlated (r = 0.77). They concluded that genomic estimates of inbreeding and relationships are superior to their pedigree counterparts and can therefore be used in the conservation of valuable lines of livestock and in breeds at risk of losing genomic diversity. They also proposed that a minimum of 2000 markers in linkage equilibrium be used for inbreeding estimation.
Zhang et al. (2015) used pedigree and genomic data at densities ranging from 50k to full sequence variants to compare how different methods performed in estimating inbreeding levels in three cattle breeds. They reported that F (PED) suffered from limited pedigree depth and that marker density affects the estimation of runs of homozygosity (ROH). Detecting ROH from 50k chip data gave estimates similar to ROH from sequence data. In the absence of full sequence data, ROH based on 50k data can be used to assess homozygosity levels in individuals. However, genotypes denser than 50k are required to accurately detect short ROH that are most likely identical by descent (IBD).
Advantages and Disadvantages of Microsatellite Markers as a Measure of Inbreeding Coefficient
Both the pedigree and the molecular methods of estimating the inbreeding coefficient have their own advantages and limitations. One weakness of the pedigree method is that the inbreeding coefficient depends heavily on the quality of the pedigree information. When pedigree inbreeding coefficients are computed in the sense of Malécot or Wright, it is necessary to define the base population to which the present inbreeding is referred. Under practical circumstances the real base population is never known. Gaps occur very often in pedigree records, and false parentage occurs sometimes. These pedigree weaknesses influence the measures of autozygosity. When pedigrees are reduced in length, true autozygosity is severely underestimated. With incomplete pedigrees the autozygosity of single animals is underestimated, and with short pedigrees the autozygosity of all animals within a generation is underestimated to the same extent. In the case of false parentage, two types of error may occur when pedigree inbreeding coefficients are used: underestimation and overestimation of the true autozygosity of individuals. Studying inbreeding with molecular methods rather than pedigrees has several benefits, such as the need for less monitoring and the increasing application and accuracy of noninvasive sampling methods (Broquet et al., 2007). Hence, microsatellite markers provide an alternative approach to estimating the inbreeding coefficient when good-quality pedigree information is not available, particularly with regard to false parentage.
Limitations of Microsatellite Markers
Microsatellite markers measure multilocus heterozygosity, from which the inbreeding coefficient is estimated indirectly on the basis of differences in heterozygosity. Under random mating with fewer than 100 marker loci, the correlation between autozygosity and marker inbreeding coefficients has been found to be lower than that with pedigree inbreeding coefficients (except for pedigrees with 20% false paternity). In addition, the allele frequencies in a defined base population (needed to estimate expected heterozygosity) are assumed to be known, which is quite unrealistic. Mutations and genotyping errors would make the results for marker-based inbreeding coefficients even worse. Therefore, marker inbreeding coefficients do not appear to be a favourable method for identifying autozygous animals when reliable pedigree information is available.