This post was contributed by guest bloggers Aline and Benjamin Glick from SnapGene.

While there were software tools available to biomedical researchers manipulating DNA sequences on a daily basis, many found these tools inadequate for planning, visualizing, and documenting their procedures. Preventable errors in the design of cloning strategies set experiments back days or even weeks. Primer design was done painstakingly by hand. Records of plasmid construction were often incomplete or nonexistent. In the 21st century, many molecular biologists didn’t know the complete sequences or properties of the DNA molecules they were using.

SnapGene was created to alleviate these problems through good software design. But what makes software good? Fortunately, that question has been thoroughly answered by experts in human-computer interaction (HCI), and we have adhered rigorously to HCI principles. For every task, we envision what the user wants to do and make the path to accomplishing their goals as intuitive and painless as possible. Instead of crowding the interface with every possible option, we place the most important controls front and center, and make specialized controls available when needed. SnapGene has been engineered to be easy and enjoyable to use. Development of software with these qualities is an ongoing process that involves iterative refinements in response to customer feedback.

An example of this approach is SnapGene’s algorithm for detecting common features. This algorithm enables one of SnapGene’s most popular aspects: its ability to annotate a raw plasmid sequence and display frequently used genes and control elements. Development of this tool required creating a database of common features, and devising rules for identifying a feature even when the match is imperfect. The source of common features was our collection of popular plasmid sequences. These plasmids contain features such as antibiotic resistance markers and replication origins, but there is extensive heterogeneity in the feature sequences due to genetic drift and the use of genes from different microbial strains. It proved to be impractical to catalog every variant of a feature. Instead, we identified common variants, and then crafted a detection algorithm that tolerates occasional mismatches or indels. Empirical tests indicated that a reasonable rule is to require at least 96% sequence identity when detecting a reference feature. For a coding sequence feature that may be used to make fusion genes, detection needs to occur even if one or two codons are missing at the beginning or end of the feature. With the increasing popularity of gene synthesis, many researchers now use codon-optimized versions of common coding sequence features, so our detection system was enhanced to allow searches for a perfect protein sequence match even when the DNA sequence has changed.

Identifying plasmid control elements

Coding sequence features are relatively straightforward to define, but for control elements such as promoters and transcription terminators, the boundaries are less obvious. We found that the annotated boundaries for control elements in commercial plasmids were inconsistent and sometimes clearly wrong. In an effort to be rigorous, we dug into the original literature, some of it decades old, to provide reliable feature annotations.

Even with these extensive efforts, such an algorithm has limitations. For instance, it could miss a common feature if the sequence differences exceed the threshold. This issue is addressed by adding more variants to the database. Another limitation is that by tolerating mismatches, our algorithm could annotate a feature inaccurately. The best example is fluorescent proteins, which often come in closely related versions that have different properties. To prevent misidentification, we augment the database with new fluorescent protein variants.

What’s next for common features?

We plan to continue updating the feature database. Powering plasmid maps with SnapGene Server is useful because it helps us find which common features need to be added to the database, and we will work to fill any common annotation gaps in the plasmids available through the Addgene website. In addition, two initiatives are in the pipeline.
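To make the detection rules described in this post more concrete, here is a minimal Python sketch. It is not SnapGene's implementation, which is not public in this post. It assumes one simple reading of the "at least 96% identity" rule (mismatches and indels counted as edits against the best-matching stretch of the plasmid) and one simple way of finding a perfect protein match in a recoded gene (translating the three forward reading frames). All names (`min_edits`, `detect_feature`, `protein_match`, `TOY_FEATURE`) and the toy sequences are hypothetical and exist only for illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

MIN_IDENTITY_PERCENT = 96  # threshold quoted in the post

# Made-up 50-base "reference feature", used only for this example.
TOY_FEATURE = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGT"


def min_edits(feature: str, sequence: str) -> int:
    """Fewest mismatches/indels needed to align `feature` to any substring
    of `sequence` (a semi-global edit distance)."""
    feature, sequence = feature.upper(), sequence.upper()
    prev = [0] * (len(sequence) + 1)  # the match may start anywhere for free
    for i, f in enumerate(feature, 1):
        cur = [i] + [0] * len(sequence)
        for j, s in enumerate(sequence, 1):
            cur[j] = min(prev[j - 1] + (f != s),  # match or mismatch
                         prev[j] + 1,             # base missing from sequence
                         cur[j - 1] + 1)          # extra base in sequence
        prev = cur
    return min(prev)  # the match may end anywhere for free


def detect_feature(feature: str, sequence: str) -> bool:
    """True if the best alignment reaches the identity threshold.
    Identity is taken here as (length - edits) / length, an assumption."""
    n = len(feature)
    return (n - min_edits(feature, sequence)) * 100 >= MIN_IDENTITY_PERCENT * n


def translate(dna: str) -> str:
    """Translate complete codons from position 0; unknown codons become X."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def protein_match(protein: str, sequence: str) -> bool:
    """True if `protein` occurs exactly in any forward reading frame.
    The reverse strand is ignored to keep the sketch short."""
    return any(protein in translate(sequence[frame:]) for frame in range(3))


if __name__ == "__main__":
    plasmid = "GGGG" + TOY_FEATURE + "CCCC"
    print(detect_feature(TOY_FEATURE, plasmid))

    # Same protein, different codons (a hand-made "codon-optimized" version).
    recoded = "ATGGCCTCCAAGGGCGAGGAGCTGTTTACCGGCGTGGTGCCCATCCTG"
    print(protein_match(translate(TOY_FEATURE), "GG" + recoded + "TAA"))
```

In this sketch a feature missing a codon or two at either end is simply charged a few edits, so a long feature can still clear the threshold. A recoded gene, on the other hand, can differ at many DNA positions while encoding the identical protein, which is why the protein-level check is kept separate from the DNA-level one.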