Research Article
Corresponding author: Na Song (songna624@163.com). Academic editor: Jolanta Kiełpińska
© 2022 Bingjie Chen, Yu Pan, Jian Zheng, Chenyu Song, Na Song.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Chen B, Pan Y, Zheng J, Song C, Song N (2022) Development and characterization of microsatellite markers for Chaeturichthys stigmatias (Actinopterygii: Gobiiformes: Gobiidae) based on restriction site-associated DNA sequencing (RAD-seq). Acta Ichthyologica et Piscatoria 52(3): 229-237. https://doi.org/10.3897/aiep.52.84144
Chaeturichthys stigmatias Richardson, 1844, a member of the family Gobiidae, is an offshore warm-temperate fish and a dominant component of estuarine ecosystems. In this study, restriction site-associated DNA sequencing (RAD-seq) was used to characterize candidate microsatellite markers for C. stigmatias, and 30 polymorphic loci were developed. A total of 5631 microsatellites with primer fragments were detected, among which trinucleotide repeats (57.56% of the total simple sequence repeats) were the most abundant, followed by di- (23.30%), tetra- (11.79%), penta- (4.14%), and hexanucleotide (3.21%) repeats. The number of alleles per locus ranged from 6 to 14, with a mean of 10.4. The mean observed and expected heterozygosities were 0.349 and 0.870, respectively. The lowest polymorphic information content (PIC) among the loci was 0.749, indicating that all loci were highly polymorphic (PIC > 0.50). This is the first report of microsatellite development and characterization for this species.
Chaeturichthys stigmatias, DNA sequences, microsatellite, polymorphic sites, primer detection, RAD-seq
The branded goby, Chaeturichthys stigmatias Richardson, 1844 (also known as 矛尾鰕虎魚, finespot goby, or Cá Bống râu mắt nhỏ), is a warm-temperate nearshore benthic fish which is widely distributed in the coastal areas of China, Korea, and Japan (
Exploring genetic diversity and population genetic structure may lead to a better understanding of the ecological importance of Gobiidae (see
Restriction site-associated DNA sequencing (RAD-seq), which is based on second-generation sequencing technology, is a powerful tool for characterizing microsatellite and single nucleotide polymorphism (SNP) markers (
A specimen of Chaeturichthys stigmatias was collected from the coast of Qingdao, China, in November 2018 and sent for high-throughput sequencing. Dozens of C. stigmatias were collected from Zhoushan (August 2019), Qingdao (December 2019), Yantai (December 2019), and Weihai (October 2020), and 24 of them were used for polymorphism detection and genetic diversity analysis in this study. The samples were quickly dissected, and part of the muscle tissue on the caudal peduncle was collected, preserved in 95% alcohol in an ice box, and then stored at –80°C until DNA extraction. Genomic DNA was extracted using the traditional phenol–chloroform method. The total DNA was treated with RNase to obtain high-purity DNA without RNA contamination for the detection of SSR primer polymorphism. The extracted DNA was measured using a Nanodrop 2000 (Thermo Scientific, USA) and a Qubit 2.0 (Invitrogen, USA) bioanalyzer system.
After DNA quality inspection, library construction and sequencing were conducted. The steps of RAD library construction (
Sequence repeats were detected with “SSR search”, a Perl program written by Novogene (Beijing). The software is divided into three modules. The first module detects all simple repeats in the DNA sequence, and the second module filters the results of the first module to remove simple repeats that are too close together. The detection criteria were as follows: the length of the SSR repeat unit ranged from 2 to 6 bp; the minimum length of the SSR sequence was 12 bp; the length of the upstream and downstream sequences of the SSR was 100 bp; and the minimum distance between two SSR sequences was 12 bp. The third module uses Primer3 (primer design software that runs under Linux or UNIX systems) to design primers (
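The first two modules described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Novogene Perl program: the function names `find_ssrs` and `filter_close` are hypothetical, only perfect repeats are detected, the 12 bp minimum is read as the total length of the repeat tract, and any SSR closer than 12 bp to a neighbour is dropped. Primer design with Primer3 (the third module) is not covered.

```python
import re

# Criteria quoted in the text; all names below are hypothetical.
MIN_UNIT, MAX_UNIT = 2, 6   # repeat unit length in bp
MIN_SSR_LEN = 12            # minimum total SSR length in bp (assumed reading)
MIN_GAP = 12                # minimum distance between two SSRs in bp

def find_ssrs(seq):
    """Return sorted (start, end, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit in range(MIN_UNIT, MAX_UNIT + 1):
        min_copies = -(-MIN_SSR_LEN // unit)  # ceiling division
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. TGTG, AAA).
            if any(motif == motif[:k] * (unit // k)
                   for k in range(1, unit) if unit % k == 0):
                continue
            copies = (m.end() - m.start()) // unit
            hits.append((m.start(), m.end(), motif, copies))
    return sorted(hits)

def filter_close(hits, min_gap=MIN_GAP):
    """Drop every SSR that lies closer than min_gap bp to a neighbouring SSR."""
    keep = []
    for i, (start, end, motif, copies) in enumerate(hits):
        near_prev = i > 0 and start - hits[i - 1][1] < min_gap
        near_next = i + 1 < len(hits) and hits[i + 1][0] - end < min_gap
        if not (near_prev or near_next):
            keep.append((start, end, motif, copies))
    return keep

# Toy sequences (invented for illustration, not from the RAD-seq data).
far_apart = "ACGTTGCAAC" + "TG" * 7 + "CCATGGATCCAAGTCTAGCA" + "GAT" * 4 + "CCTA"
too_close = "ACGTTGCAAC" + "TG" * 7 + "CCA" + "GAT" * 4 + "CCTA"
print(filter_close(find_ssrs(far_apart)))
print(filter_close(find_ssrs(too_close)))
```

In the first toy sequence the two repeats are 20 bp apart and both survive the filter; in the second they are 3 bp apart and both are removed. Whether the original software removes both members of a close pair or only one is not stated in the text, so that choice is an assumption of this sketch.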
Microsatellites were verified through PCR and electrophoresis. Each 25 μL PCR amplification system contained the following reagents: 17.25 μL ultrapure water, 2.5 μL 10 × PCR buffer, 2 μL dNTPs, 1 μL each primer (5 μmol · L–1), 0.25 μL Taq polymerase, and 1 μL template DNA. The PCR reactions ran for 5 min at 94°C, followed by 38 cycles of 45 s at 94°C, 45 s at the annealing temperature (Table
Characteristics of microsatellite loci in Chaeturichthys stigmatias from China.
Locus | Forward primer (5′→3′) | Reverse primer (5′→3′) | Repeat motif | Ta [°C] | Expected product length [bp] | Na | Ho | He | PIC |
---|---|---|---|---|---|---|---|---|---|
MW29 | ACTAATTAGCATTCAGCACCAGC | GTCATGCACAGTGACACCATAAT | (TG)15 | 58.3 | 135 | 7 | 0.000 | 0.823 | 0.778 |
MW31 | TGATCGACAATGGAAATGTAATG | TATTTCTATAGCCACAGCTGCCT | (TG)7 | 56.4 | 145 | 13 | 0.292 | 0.895 | 0.866 |
MW32 | TAAAGTGCCGTAACAAGTTGGAT | CGTCATGATTTCAGGAAGTAACA | (TA)10 | 55 | 144 | 9 | 0.042 | 0.870 | 0.836 |
MW34 | AAGTGTCTATTCTGAGCGCACTT | TTGCAGTGATGAATCAAACATTC | (GAT)8 | 58.3 | 153 | 9 | 0.292 | 0.878 | 0.844 |
MW40 | TCTGCATCTTCTGAACTTCACCT | CTCTGAAACACACGTCACACCT | (GC)7 | 56.4 | 156 | 11 | 0.333 | 0.894 | 0.862 |
MW54 | ATAGAAGGGACTTCAGTTGGACC | CCATTTAAACTCTGTCAGACCCA | (AT)7 | 56.4 | 138 | 6 | 0.167 | 0.796 | 0.749 |
MW56 | TGTATTCTCGCTTACTGCAGCTC | TCATTTCTCAGCATTGACTCTCAT | (ATA)7 | 56.4 | 132 | 14 | 0.375 | 0.883 | 0.854 |
MW66 | AGAGTGAAAGAACGCACTGACC | GACCTTAGTGAGAGTGTGCGTGT | (CA)9 | 58.3 | 140 | 8 | 0.458 | 0.855 | 0.819 |
MW72 | TGCAAACACTGCTTGTTGTAGTT | TGAGCTGATTGTGTTAGTTTGTCA | (TA)7 | 56.4 | 150 | 11 | 0.458 | 0.903 | 0.873 |
MW77 | CTGCTGCTGTTGTTACTCAGATG | TATCAAGGGCTCACTAAAGGACA | (GAG)7 | 58.3 | 137 | 6 | 0.250 | 0.821 | 0.774 |
MW79 | GAAGAGGGAAGAGAGAACCAAAG | TTCTTGTCCCAAATTCACTTCTC | (GA)9 | 56.4 | 160 | 10 | 0.417 | 0.834 | 0.793 |
MW80 | TTAGACAGGACAGCGTTAGCATT | CACAGCAAAGGCTCTGAATACTT | (GA)7 | 56.4 | 147 | 13 | 0.417 | 0.867 | 0.834 |
MW83 | GAGACACTGTCAGAGCAGATCCT | TAATCAACAGCATGAAGAGCAGA | (GCT)7 | 56.4 | 148 | 10 | 0.500 | 0.840 | 0.801 |
MW86 | AAATCCTTCTGCAATTGACTCTG | GAGAGGGAGGAAGAGATAATGGA | (CT)8 | 55 | 139 | 7 | 0.167 | 0.816 | 0.769 |
MW87 | ACTGCTGCTAGATTTACTGGTGC | TATCCTTCATCCTCCTCTTCACA | (TAC)8 | 60.2 | 157 | 11 | 0.417 | 0.876 | 0.842 |
MW88 | TTGAGTATATTTCAGCCCGTCTC | GCCGTTTGCTCATAACATAAACT | (AT)8 | 56.4 | 133 | 13 | 0.250 | 0.912 | 0.884 |
MW92 | TTTGAAAAGGTGCAGGAGATG | TGAACTCCACTGCTCTGTGTAAA | (CT)15 | 53.1 | 136 | 12 | 0.708 | 0.908 | 0.879 |
MW97 | CACAGCAAACAAAGAAACAACAC | TATTACGGAAAGGGTAGGACCAT | (TAC)7 | 58.3 | 138 | 12 | 0.417 | 0.857 | 0.826 |
MW100 | TCCCACCACAGAAGTTAAACAGT | GCATGTTCCTTACAAAGGTTCAC | (TAT)7 | 55 | 148 | 8 | 0.042 | 0.840 | 0.800 |
MW103 | CTTTCTTACTTTCCCGCTCTCTT | CATGGAAATGGATAGAAATGGAA | (CT)7 | 55 | 133 | 11 | 0.375 | 0.889 | 0.858 |
MW104 | AGGCAAGAAATATCACAGGGACT | TCGTGACTCATGGAAATACCAAT | (AT)7 | 55 | 147 | 10 | 0.333 | 0.860 | 0.824 |
MW111 | CAGGCCTGTTAGCTTAGCTGTAG | CACTGGCACACACAACCTAAATA | (AT)12 | 58.3 | 139 | 12 | 0.542 | 0.893 | 0.862 |
MW113 | GTATTTATCCGAGCACGCACTAC | TAAACGCACGAACAGTATCGTAA | (TG)12 | 55 | 156 | 12 | 0.833 | 0.898 | 0.868 |
MW115 | TTATTTGCCAGTATTGACCCAGT | CCAAGCCTCTAAGAGTGTCTGAA | (CA)11 | 49.6 | 150 | 9 | 0.000 | 0.876 | 0.841 |
MW117 | TGACGTGTGTAACATTCGTGAGT | GAGGGAATGATGTCTGTGATTTC | (ACA)8 | 58.3 | 151 | 13 | 0.417 | 0.931 | 0.905 |
MW118 | TTATTGGCCCTCAGTGTGTTATT | CCTCGAGGAAATATCAGAGTATCG | (TAA)10 | 55 | 157 | 10 | 0.333 | 0.874 | 0.840 |
MW119 | AAATGACGAGACAATTACAACTGAT | TTCCTTTGTGTATTATGGAAGTTCA | (TA)15 | 58.3 | 139 | 11 | 0.833 | 0.886 | 0.854 |
MW120 | TTTCAGATACACCTCATTGGACC | GAAACAACAGCAGTTGCACAAT | (AAT)7 | 60.2 | 140 | 13 | 0.500 | 0.897 | 0.867 |
MW121 | TCTGTTTGATGCAGTGACAGAGT | CCTCCAGAGAAGGACTCATCAT | (TGC)7 | 58.3 | 132 | 12 | 0.167 | 0.903 | 0.873 |
MW123 | TCCATCCTAAACTGAACCAAATG | TGAAATGTAGTCAATCTTTGCCA | (TTA)7 | 58.3 | 154 | 9 | 0.125 | 0.835 | 0.795 |
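The per-locus statistics in the table above (Na, Ho, He, and PIC) can be illustrated with a short sketch. The source does not state which formulas were applied, so the following is an assumption: He uses Nei's unbiased estimator, and PIC uses the commonly used definition 1 − Σp_i² − Σ_{i<j} 2p_i²p_j². The function name `locus_stats` and the toy genotypes are hypothetical.

```python
from collections import Counter

def locus_stats(genotypes):
    """Compute (Na, Ho, He, PIC) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    """
    n = len(genotypes)
    counts = Counter(allele for g in genotypes for allele in g)
    freqs = [c / (2 * n) for c in counts.values()]
    na = len(freqs)                                   # number of alleles
    ho = sum(a != b for a, b in genotypes) / n        # observed heterozygosity
    sum_sq = sum(p * p for p in freqs)
    he = (2 * n / (2 * n - 1)) * (1 - sum_sq)         # unbiased estimator (assumed)
    pic = 1 - sum_sq - sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(na) for j in range(i + 1, na)
    )
    return na, ho, he, pic

# Toy genotypes scored as fragment lengths in bp (invented, not study data).
toy = [(150, 150), (150, 152), (152, 154), (154, 154)]
na, ho, he, pic = locus_stats(toy)
print(na, round(ho, 3), round(he, 3), round(pic, 3))
```

With 24 individuals per locus, as in this study, the small-sample correction factor 2n/(2n − 1) is 48/47, so the choice between the biased and unbiased He estimator changes the value by about 2%.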
After statistical analysis, the results were input into Genepop 4.0 (
We have read the policies relating to animal experiments and confirm that this study complied with them. All procedures performed in this study were approved by the Institutional Animal Care and Use Committee of the Ocean University of China.
A total of 4.682 Gb of high-quality data was obtained, with Q20 and Q30 values of 97.31% and 92.52%, respectively. The RAD-Tag capture rate was 98.03%, and the GC content was 39.43%. Genomic GC content has a significant effect on the randomness of second-generation genome sequencing: GC content that is too high (>65%) or too low (<25%) leads to sequencing bias and seriously affects the results of genomic analysis. The GC content of Chaeturichthys stigmatias was within the normal range and the sequencing quality was acceptable, indicating that library sequencing was successful (
The sequences were clustered and assembled. The total contig length was 113 171 723 bp, and the total number of contigs was 337 800. The mean contig length of the assembled sequences was 335 bp, and the N50 length was 393 bp. The GC content of the assembly was 39.04%, which was consistent with the GC content of the clean sequencing data, indicating that the assembly result was reliable (
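For readers less familiar with the assembly metrics above, the sketch below shows how mean contig length and N50 are computed. The totals are taken from the text; the individual contig lengths are not available here, so `toy_contigs` is an invented list and the function name `n50` is hypothetical.

```python
def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")

# Mean contig length from the totals reported in the text.
total_bases = 113_171_723
contig_count = 337_800
mean_length = total_bases / contig_count
print(round(mean_length))

# N50 on an invented set of contig lengths (not the real assembly).
toy_contigs = [500, 400, 393, 300, 200, 100]
print(n50(toy_contigs))
```

The reported mean of 335 bp follows directly from the two totals. An N50 above the mean, as reported here (393 bp versus 335 bp), is expected because N50 weights contigs by the bases they contain.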
Based on the RAD-seq data, a total of 5829 microsatellites were identified, of which 5631 loci contained primer fragments (Table
Simple sequence repeat (SSR) distribution statistics for Chaeturichthys stigmatias from China.
Nucleotide repeat type | SSR number | Percentage [%] |
---|---|---|
Di- | 1312 | 23.30 |
Tri- | 3241 | 57.56 |
Tetra- | 664 | 11.79 |
Penta- | 233 | 4.14 |
Hexa- | 181 | 3.21 |
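The percentages in the table above can be reproduced from the SSR counts. The short check below uses only the numbers in the table; the variable names are hypothetical.

```python
# SSR counts per repeat type, copied from the table above.
ssr_counts = {"Di-": 1312, "Tri-": 3241, "Tetra-": 664, "Penta-": 233, "Hexa-": 181}

total = sum(ssr_counts.values())
percentages = {k: round(100 * v / total, 2) for k, v in ssr_counts.items()}

print(total)
for repeat_type, pct in percentages.items():
    print(f"{repeat_type:7s}{pct:6.2f}")
```

The counts sum to 5631, the number of microsatellite loci with primer fragments, so the percentages are expressed relative to that subset rather than to all 5829 identified microsatellites.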
Previous studies have shown that the dominant repeat unit differs among species. In some fish species dinucleotide repeats dominate, for example in Megalobrama amblycephala Yih, 1955 and Larimichthys crocea (Richardson, 1846) (see
The distribution and frequency of microsatellite motifs are presented in Fig.
The distribution and frequency of microsatellite motifs of Chaeturichthys stigmatias from China. (A) Frequency of different dinucleotide microsatellite motifs; (B) Frequency of different trinucleotide microsatellite motifs; (C) Frequency of different tetranucleotide microsatellite motifs; (D) Frequency of different pentanucleotide microsatellite motifs; (E) Frequency of different hexanucleotide microsatellite motifs.
In terms of repeat number, only four distinct repeat-number classes were detected for pentanucleotide and hexanucleotide motifs, and 4-fold repeats predominated in both. Seven classes were identified for tetranucleotide motifs, again with 4-fold repeats predominating. At least 10 classes were detected for both dinucleotide and trinucleotide motifs, with 5-fold and 6-fold repeats being the main components, respectively (Figs
In this study, the repeat numbers of dinucleotide, trinucleotide, tetranucleotide, pentanucleotide, and hexanucleotide microsatellites were mainly 4–10, 4–7, 4–5, 4, and 4, respectively (Fig.
A gradient PCR experiment was performed on the 148 synthesized primer pairs to screen the optimal annealing temperature of each pair. A total of 97 primer pairs amplified successfully. After the PCR products were separated by polyacrylamide gel electrophoresis, 30 polymorphic primer pairs were identified. A total of 312 alleles were detected in 24 individuals at the 30 polymorphic loci; the number of alleles per locus ranged from 6 to 14, with a mean of 10.4. The mean expected heterozygosity was 0.870, the mean observed heterozygosity was 0.349, and the mean polymorphic information content was 0.836 (Table
All 30 polymorphic primer pairs obtained in this study target dinucleotide or trinucleotide repeats; none target tetranucleotide, pentanucleotide, or hexanucleotide repeats.
A higher proportion of heterozygotes reflects the stability of the genetic structure of a population. We found that the observed heterozygosity (Ho) at all 30 polymorphic loci was lower than the expected heterozygosity (He), indicating a relative heterozygote deficit. Loss of heterozygosity is generally attributed to geographical isolation, decreased gene exchange between populations, and increased inbreeding (
At the same time, Hardy–Weinberg equilibrium analysis showed that all 30 microsatellite loci deviated significantly from equilibrium, which is a common phenomenon in fish populations, such as Siniperca scherzeri Steindachner, 1892 and Lutjanus peru (Nichols et Murphy, 1922) (see
This study, based on high-throughput sequencing, is the first analysis of the microsatellite characteristics of Chaeturichthys stigmatias. In summary, a total of 4.682 Gb of high-quality sequence data was obtained and 5631 SSRs with primer fragments were identified based on RAD-seq, indicating the high efficiency of this technology for primer development. The 30 polymorphic primer pairs obtained in this study will provide an effective basis for future comparative analyses of the genetic structure and genetic characteristics of C. stigmatias, and a useful reference for developing microsatellite primers using high-throughput sequencing technology.
We thank Dr Zonghang Zhang for the English editing. This work was supported by the National Key R&D Program of China (Grant number 2018YFD0900905).
Appendix 1
Data type: excel file
Explanation note: Detailed data for types of trinucleotide, tetranucleotide, pentanucleotide, and hexanucleotide loci.