Status of SSR, cSSR, iSSR and VNTR motifs in Leptosphaeria maculans based on high throughput sequencing data

Document Type: Original Article

Authors

Department of Plant Protection, Faculty of Agriculture, University of Zanjan, Zanjan, Iran

Abstract

Leptosphaeria maculans is a fungus of the phylum Ascomycota and a causal agent of blackleg disease of canola (Brassica napus L.). Owing to its high diversity and worldwide distribution, L. maculans has been widely studied as a model phytopathogenic fungus. Simple sequence repeats (SSRs) are robust molecular markers widely used in population diversity research. This study used whole-genome sequencing data from four Iranian L. maculans isolates (Pk4, Ar3, Ar5, and Alam10) and compared them with the JN3 reference genome to identify and compare different types of repeats, including perfect (SSR), compound (cSSR), and imperfect (iSSR) simple sequence repeats and variable number tandem repeat (VNTR) motifs. The total length of SSRs averaged 155.692 kb per genome, accounting for 0.36% of the total genome. An average of 7138 SSR motifs, with a frequency of one SSR per 169.5 bp, were identified in the assembled genomic sequences; on average, these comprised 33.86% tri-, 25.69% di-, 14.48% mono-, 10.87% tetra-, 8.52% hexa-, and 6.58% penta-nucleotide repeats. Of the SSRs identified in the Pk4 isolate, 459 motifs were located in CDS regions. Approximately 13% of the identified SSRs were associated with cSSRs. The average cSSR loci density for the four isolates was 487.32 bp/Mb, and C, AG, and AC were the most frequent SSR motifs. The cSSRs of the assessed isolates ranged from 24 to 295 bp in length. The longest cSSR common to the four isolates was the motif (GA)26-(CAGAGA)15, with a length of 142 bp. The tri-nucleotide (AAT) was the most common iSSR motif type, followed by di-, tetra-, mono-, hexa-, and penta-nucleotides. About 30% of iSSRs contained the AAT, AT, and AAG motifs. Among motifs 7 to 30 nucleotides long, those of 7, 8, 9, and 10 nucleotides occurred most frequently. In addition, 11 motifs longer than 100 nucleotides were found in the studied isolates and the reference genome. These results can be used to characterize L. maculans isolates from diverse hosts and geographic locations and are transferable to other isolates of L. maculans.
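
As a rough illustration of the repeat notation used above, the following Python sketch computes the length of a compound motif written as (UNIT)n-(UNIT)n and scans a sequence for perfect tandem repeats of a single unit size. The function names, the example sequence, and the minimum-repeat threshold are hypothetical choices made for illustration; the abstract does not state which tool or thresholds the authors used, so this should not be read as their method.

```python
import re

def compound_length(notation: str) -> int:
    """Length in bp of a compound motif such as '(GA)26-(CAGAGA)15'.

    Each (UNIT)n block contributes len(UNIT) * n bases.
    """
    blocks = re.findall(r"\(([ACGT]+)\)(\d+)", notation)
    return sum(len(unit) * int(count) for unit, count in blocks)

def find_perfect_ssrs(seq: str, unit_len: int, min_repeats: int):
    """Return (start, unit, repeats) for perfect tandem repeats.

    Assumption: a repeat counts as an SSR if the unit occurs at least
    `min_repeats` times in a row. Units that are themselves runs of a
    shorter unit (e.g. 'AA') are not filtered out in this sketch.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // unit_len)
        for m in pattern.finditer(seq)
    ]

# The compound motif reported in the abstract: 26*2 + 15*6 = 142 bp.
print(compound_length("(GA)26-(CAGAGA)15"))

# A made-up toy sequence containing five GA repeats starting at index 2.
print(find_perfect_ssrs("TTGAGAGAGAGACC", unit_len=2, min_repeats=4))
```

The first call reproduces the 142 bp length reported for (GA)26-(CAGAGA)15, since 26 × 2 + 15 × 6 = 142.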

Keywords