• Microsatellite discovery in an insular amphibian (Grandisonia alternans) with comments on cross-species utility and the accuracy of locus identification from unassembled Illumina data

      Adamson, Eleanor A. S.; Saha, Anwesha; Maddock, Simon T.; Nussbaum, Ronald A.; Gower, David J.; Streicher, Jeffrey W. (Springer, 2016-07-29)
      The Seychelles archipelago is unique among isolated oceanic islands because it features an endemic radiation of caecilian amphibians (Gymnophiona). In order to develop population genetics resources for this system, we identified microsatellite loci using unassembled Illumina MiSeq data generated from a genomic library of Grandisonia alternans, a species that occurs on multiple islands in the archipelago. Applying a recently described method (PALFINDER) we identified 8001 microsatellite loci that were potentially informative for population genetics analyses. Of these markers, we screened 60 loci using five individuals, directly sequenced several amplicons to confirm their identity, and then used eight loci to score allele sizes in 64 G. alternans individuals originating from five islands. A number of these individuals were sampled using non-lethal methods, demonstrating the efficacy of non-destructive molecular sampling in amphibian research. Although two loci satisfied our criteria as diploid, neutrally evolving loci with the statistical power to detect population structure, our success in identifying reliable loci was very low. Additionally, we discovered some issues with primer redundancy and differences between Illumina and Sanger sequences that suggest some Illumina-inferred loci are invalid. We investigated cross-species utility for eight loci and found most could be successfully amplified, sequenced and aligned across other species and genera of caecilians from the Seychelles. Thus, our study in part supported the validity of using PALFINDER with unassembled reads for microsatellite discovery within and across species, but importantly identified major limitations to applying this approach to small datasets (ca. 1 million reads) and loci with small tandem repeat sizes.