Bioinformatics tools for analysis of livestock genetic diversity
Nina Moravčíková¹, Anna Trakovická¹, Ondrej Kadlečík¹, Radovan Kasarda¹
Since the beginning of the genomic era, whole-genome sequencing of livestock species has advanced considerably, alongside the development of next-generation sequencing (NGS) technologies and high-throughput genotyping platforms. For the analysis of livestock genetic diversity, the large amount of data produced by single nucleotide polymorphism (SNP) genotyping platforms makes it possible to apply a range of highly sophisticated analyses.
The aim of this study is to present some of the methodological approaches and bioinformatics tools that have been applied to determine the conservation status of Slovak Pinzgau cattle, one of the most endangered livestock breeds in Slovakia. The genotyping data were obtained from a total of 152 animals using the Illumina BovineSNP50v2 BeadChip. Quality control of the data was performed in PLINK 1.9 to remove gonosomal loci, loci with a call rate <90%, and loci with a minor allele frequency <1%. Subsequent analyses used various software environments, including R (packages adegenet, StAMPP, rehh, pcadapt, etc.), SNeP, and GEMMA. In addition to traditional genetic diversity parameters such as allele and genotype frequencies or heterozygosity, the SNP data allowed a more realistic estimation of genomic inbreeding, effective population size, and population substructure. The level of genomic inbreeding was derived from the proportion of the genome covered by runs of homozygosity (ROH) segments of a specific length, and the effective population size was estimated from linkage disequilibrium between adjacent syntenic loci. Moreover, the high-throughput data enabled us to identify genomic regions affected by positive selection, which can be unique to any autochthonous population, and to define genomic regions associated with traits of interest through genome-wide association analysis (GWAS). The statistics applied to identify SNPs under selection pressure focused mainly on extreme values of the FST index, the integrated haplotype score (iHS), and the proportion of autozygosity islands in the genome.
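The two estimators described above, ROH-based inbreeding and LD-based effective population size, can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names, the example ROH lengths, and the autosomal genome length of 2.5 Gb are hypothetical, and the Ne formula is the basic approximation of Sved (1971), E[r²] = 1 / (1 + 4·Ne·c), without the sample-size or mutation corrections that dedicated tools such as SNeP may apply.

```python
def f_roh(roh_lengths_bp, autosome_length_bp, min_length_bp):
    """Share of the autosomal genome covered by ROH of at least min_length_bp.

    roh_lengths_bp: lengths (bp) of the ROH segments detected in one animal.
    autosome_length_bp: autosomal genome length covered by the SNP panel.
    """
    covered = sum(length for length in roh_lengths_bp if length >= min_length_bp)
    return covered / autosome_length_bp


def ne_from_ld(mean_r2, distance_morgans):
    """Effective population size from mean r^2 at a given linkage distance.

    Rearranges Sved's approximation E[r^2] = 1 / (1 + 4 * Ne * c)
    into Ne = (1 / r^2 - 1) / (4 * c).
    """
    return (1.0 / mean_r2 - 1.0) / (4.0 * distance_morgans)


def generations_ago(distance_morgans):
    """Approximate number of generations in the past reflected by LD
    between loci separated by distance c (in Morgans): t = 1 / (2c)."""
    return 1.0 / (2.0 * distance_morgans)


# Hypothetical animal with three ROH segments of 5, 20 and 2 Mb,
# and an assumed autosomal genome length of 2.5 Gb.
roh = [5e6, 20e6, 2e6]
genome = 2.5e9
f_roh_4mb = f_roh(roh, genome, 4e6)    # segments > 4 Mb
f_roh_16mb = f_roh(roh, genome, 16e6)  # segments > 16 Mb

# Hypothetical mean r^2 of 0.2 between loci 0.01 Morgan (about 1 cM) apart.
ne = ne_from_ld(0.2, 0.01)
t = generations_ago(0.01)
```

Averaging `f_roh` over all genotyped animals gives a population-level value for each length class, and evaluating `ne_from_ld` over bins of increasing marker distance yields an Ne trajectory from distant to recent generations.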
The level of genomic inbreeding in the population was higher than that reported in published genealogical studies of Slovak Pinzgau cattle. The results showed that ROH segments >4 Mb covered on average 1.91% of the genome and ROH segments >16 Mb covered 0.43%, which indicates recent inbreeding in the population. As expected, given the limited number of Pinzgau bulls used in breeding practice, the observed value of recent Ne lay at the threshold defining the endangerment status of a population (Ne = 80). Subsequent analysis of selection signatures based on the iHS score and Wright's FST revealed several genomic regions that mainly reflect the dual-purpose character of the analysed breed. The detected signals of selection were located within or very close to genomic regions containing genes involved in muscle formation (CAPN2, CAPN3), body growth (GHR), and immune response (IL2, IL21).
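The link between ROH length and the age of inbreeding can be made concrete with a commonly used approximation: an autozygous segment inherited from a common ancestor g generations back has an expected length of about 1/(2g) Morgans. The sketch below inverts this relationship. The function name is hypothetical, and the uniform recombination rate of 1 cM per Mb is a simplifying assumption rather than a value taken from this study.

```python
def roh_age_generations(length_mb, cm_per_mb=1.0):
    """Approximate number of generations to the common ancestor for an ROH
    of the given length, using expected length = 100 / (2 * g) cM.

    cm_per_mb is an assumed genome-wide average recombination rate.
    """
    length_cm = length_mb * cm_per_mb
    return 100.0 / (2.0 * length_cm)


# Length thresholds used for the ROH classes reported above.
age_4mb = roh_age_generations(4)    # 12.5 generations
age_16mb = roh_age_generations(16)  # 3.125 generations
```

Under these assumptions, segments longer than 16 Mb trace back to ancestors roughly three generations ago, which is why their share of the genome is read as a signal of recent inbreeding.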
All methods applied in this study are universal, broad-spectrum approaches that can be applied to any other population. Together they provide a guide for the effective monitoring of diversity, with the goal of maintaining the genetic potential of animal genetic resources for future generations.