Estimation of the total genome length of chickpea via k-mer analysis. The frequency distribution of 17-mers in the raw genomic reads displays two major peaks (A and B). Peak A resembles a Gaussian distribution and represents k-mers of ~0-10X coverage, which arise by chance from sequencing errors. Peak B, corresponding to k-mers of ~20-50X coverage, represents the majority of the genome and resembles a Poisson distribution, with minor deviations due to sequencing errors, heterozygosity and repetitive DNA. The total genome size of chickpea was estimated as the product of 17 bp and the k-mer frequency (y-axis value) at the coverage (x-axis value) of Peak B (i.e. 17 × Peak B frequency).
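
The peak-finding step described in this caption can be illustrated with a minimal Python sketch. It assumes the k-mer histogram is available as a mapping from coverage to frequency and is smooth enough that the trough between Peak A and Peak B is the first local minimum after Peak A. The function names, the toy histogram and its values are hypothetical and are not data from the study. The final product follows the caption's wording literally; the caption does not state the units or scaling of the frequency axis, so the number is only illustrative.

```python
from typing import Dict, Tuple

K = 17  # k-mer length stated in the caption


def find_peak_b(hist: Dict[int, int]) -> Tuple[int, int]:
    """Return (coverage, frequency) of the main peak to the right of the
    error peak. Assumes a smooth histogram (hypothetical helper)."""
    depths = sorted(hist)
    counts = [hist[d] for d in depths]
    i = 0
    # Climb to the top of Peak A (the low-coverage error peak).
    while i + 1 < len(counts) and counts[i + 1] >= counts[i]:
        i += 1
    # Descend from Peak A to the trough separating it from Peak B.
    while i + 1 < len(counts) and counts[i + 1] < counts[i]:
        i += 1
    # Peak B is taken as the highest bin at or beyond the trough.
    j = max(range(i, len(counts)), key=counts.__getitem__)
    return depths[j], counts[j]


def caption_product(peak_frequency: int, k: int = K) -> int:
    """Product described in the caption: k (in bp) times Peak B frequency."""
    return k * peak_frequency


# Toy histogram (made-up values): coverage -> number of k-mers at that coverage.
toy_hist = {1: 900, 2: 400, 3: 150, 4: 60, 5: 30, 10: 50,
            20: 200, 30: 500, 35: 620, 40: 480, 50: 150, 60: 40}

coverage, frequency = find_peak_b(toy_hist)
print(coverage, frequency, caption_product(frequency))
```

On real data the histogram is usually noisy, so a smoothing step before the trough search would likely be needed; that is omitted here to keep the sketch short.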