Quality check and GBOL Gold Standard
Now check step by step if the assemblies match the “GBOL Gold Standard”:
Correct sequence length
The target length is 658 bp (with only a few exceptions in groups such as molluscs, pseudoscorpions and some Hymenoptera). If necessary, trim the assemblies (cut off the primer sequences) as we practiced in the workshop:
Find and cut the LCO primer at the 5′ end of the reverse strand by selecting the annotation (pink bar) and moving it, if it is not already at the right position.
Find and cut the HCO primer at the 5′ end of the forward strand by selecting and moving the annotation. In this example, the program had already found the primer and marked it (green).
If there is no annotation yet, select the bases, click Add Annotation and choose the type trimmed.
Less than 1% ambiguities/disagreements
If a sequence has more than ~6 ambiguities/disagreements (1% of 658 bp), go through every case carefully and check whether it can be edited with a clear conscience. The program sometimes makes dubious base calls, and correcting them can save the barcode. To step through the cases, choose disagreements or ambiguities in the drop-down field below Highlighting on the right and click one of the arrows; the disagreements/ambiguities are then highlighted one by one.
This example shows such a case: the upper trace has a double peak with a C (blue) and an even stronger T (green), but for whatever reason the program called a C. If you change the C to a T, both strands match and the sequence no longer has a disagreement at this position.
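The 1% rule can also be checked outside Geneious on an exported consensus sequence. The following is a minimal sketch, not part of the GBOL workflow itself: the function names are hypothetical, and it only counts ambiguity codes in a single sequence (it cannot see disagreements between the two reads, which only exist in the assembly).

```python
# Plain bases; every other IUPAC code (N, R, Y, ...) counts as an ambiguity.
UNAMBIGUOUS = set("ACGT")


def count_ambiguities(seq: str) -> int:
    """Count bases that are not a plain A, C, G or T (gap characters are ignored)."""
    return sum(1 for base in seq.upper() if base not in UNAMBIGUOUS and base != "-")


def passes_ambiguity_check(seq: str, max_fraction: float = 0.01) -> bool:
    """Return True if fewer than max_fraction of the bases are ambiguous."""
    length = sum(1 for base in seq if base != "-")
    if length == 0:
        return False
    return count_ambiguities(seq) / length < max_fraction


# Hypothetical 658 bp sequences: 6 ambiguities pass, 7 do not.
ok_barcode = "N" * 6 + "A" * 652
bad_barcode = "N" * 7 + "A" * 651
print(passes_ambiguity_check(ok_barcode), passes_ambiguity_check(bad_barcode))
```

For a 658 bp barcode the threshold falls between 6 and 7 ambiguous bases, which matches the "~6" rule of thumb above.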
No stop codons
The number of stop codons must be 0. A 658 bp sequence that contains stop codons can usually be discarded. If the sequence is too long or too short, wait until you have checked it in an alignment (see "No gaps" below).
If Min # Stop Codons or any other column is not displayed, you can add it via the Manage Columns symbol on the right.
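As an illustration of what a stop-codon count involves, here is a small sketch that counts stop codons in each forward reading frame and reports the minimum. It assumes the invertebrate mitochondrial genetic code (NCBI translation table 5), in which only TAA and TAG are stop codons; use the code that fits your taxon. The function names are hypothetical, and this is not necessarily how Geneious computes its Min # Stop Codons column.

```python
# Assumption: invertebrate mitochondrial code (NCBI translation table 5).
# In this code TGA encodes Trp and AGA/AGG encode Ser, so only TAA/TAG are stops.
STOP_CODONS = {"TAA", "TAG"}


def count_stop_codons(seq: str, frame: int = 0) -> int:
    """Count stop codons in one forward reading frame (frame = 0, 1 or 2)."""
    seq = seq.upper().replace("-", "")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return sum(1 for codon in codons if codon in STOP_CODONS)


def min_stop_codons(seq: str) -> int:
    """Smallest stop-codon count over the three forward reading frames."""
    return min(count_stop_codons(seq, frame) for frame in range(3))


# Hypothetical fragment: frame 0 reads ATG TAA GGA and contains one stop codon.
fragment = "ATGTAAGGA"
print(count_stop_codons(fragment, 0), min_stop_codons(fragment))
```

A clean barcode should give 0 in its true reading frame; a single inserted or missing base shifts the frame downstream and typically produces several stop codons.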
No gaps
Once you have checked all assemblies for primers, disagreements/ambiguities and stop codons, select them all, right-click and choose Multiple Align.
Choose these options in the pop-up window:
Check your alignment for stop codons and gaps. For this it is necessary to choose the right Genetic Code and reading frame.
Check for stop codons as you did for the disagreements/ambiguities (via the Highlighting function).
In this example you can see that one sequence is one base too long (659 bp) and therefore has several stop codons:
Aligning the assemblies reveals gaps or redundant bases and their positions. In a case like this, go back into the assembly and delete the redundant base (the stop codons will then usually disappear).
The next example shows the opposite case: a sequence is one base too short (657 bp) and therefore produces a gap within the alignment.
In this case go back into that sequence, look at position 637 and check in the chromatogram whether it is obvious which base is missing there. Even if you cannot see the corresponding peak in the chromatogram, you can insert an N at the position indicated by the comparison with the other sequences.
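The two cases described here (a missing base and a redundant base) can be located programmatically in an exported alignment. The sketch below is only an illustration under simple assumptions: the alignment is given as a dictionary of equally long strings with `-` as the gap character, and all names are hypothetical.

```python
def _check_same_length(aligned: dict[str, str]) -> int:
    """Return the common alignment length, or raise if the rows differ."""
    lengths = {len(seq) for seq in aligned.values()}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return lengths.pop()


def gap_columns(aligned: dict[str, str]) -> dict[str, list[int]]:
    """Map each sequence name to the 1-based columns where it has a gap."""
    _check_same_length(aligned)
    return {
        name: [i + 1 for i, base in enumerate(seq) if base == "-"]
        for name, seq in aligned.items()
    }


def extra_base_columns(aligned: dict[str, str]) -> dict[int, str]:
    """Map 1-based columns to the single sequence that has a base there
    while every other sequence has a gap (a likely redundant base)."""
    n_cols = _check_same_length(aligned)
    result = {}
    for col in range(n_cols):
        with_base = [name for name, seq in aligned.items() if seq[col] != "-"]
        # Needs at least three sequences to tell "extra base" from "missing base".
        if len(with_base) == 1 and len(aligned) > 2:
            result[col + 1] = with_base[0]
    return result


# Hypothetical alignment: sample_B carries one base more than the others.
alignment = {"sample_A": "ACG-TAC", "sample_B": "ACGTTAC", "sample_C": "ACG-TAC"}
print(gap_columns(alignment))
print(extra_base_columns(alignment))
```

The reported columns are alignment positions, so they only equal positions in the individual sequence if no gaps occur before them.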
Finally, make a last quick check for gaps; your alignment should then ideally look like this:
In some exceptional cases, sequences legitimately deviate from the standard length, usually by 3, 6 or 9 bases too many or too few (whole codons, so the reading frame stays intact). This is the case, for example, with pseudoscorpions and molluscs.
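A quick arithmetic check can separate these exceptional lengths from the one-base errors described earlier. This is a sketch with hypothetical names; it assumes that a deviation by a multiple of three is only a hint that the length may be genuine, so such sequences should still be checked by eye.

```python
EXPECTED_LENGTH = 658  # standard barcode length in bp


def length_deviation(length: int) -> int:
    """Signed difference between the observed and the expected length."""
    return length - EXPECTED_LENGTH


def is_whole_codon_deviation(length: int) -> bool:
    """True if the length differs from 658 bp by a multiple of three bases."""
    return length_deviation(length) % 3 == 0


for length in (655, 657, 658, 659, 661):
    print(length, length_deviation(length), is_whole_codon_deviation(length))
```

Lengths such as 655 or 661 bp keep the reading frame, whereas 657 or 659 bp shift it and point to a missing or redundant base.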
Discard a sequence if any of the following applies: it is shorter than 500 bp; it has stop codons; it produces gaps in an alignment that you cannot fill; it has more than 1% disagreements/ambiguities; or its Geneious bin (based on the binning profile which we installed → Manage Columns: Bin) is still low or medium after creating the consensus sequence (next step).
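The discard rules can be summarised as a single decision function. The sketch below is an assumption-laden illustration, not an official GBOL tool: the `BarcodeStats` record, its field names and the bin labels "Low", "Medium" and "High" are hypothetical stand-ins for the values you read off the Geneious columns.

```python
from dataclasses import dataclass


@dataclass
class BarcodeStats:
    """Hypothetical per-sequence summary, filled in from the Geneious columns."""
    length: int          # sequence length in bp
    stop_codons: int     # stop codons in the correct reading frame
    unfilled_gaps: int   # alignment gaps that could not be filled
    ambiguities: int     # remaining disagreements/ambiguities
    bin: str             # assumed labels: "Low", "Medium" or "High"


def discard_reasons(stats: BarcodeStats) -> list[str]:
    """Return every rule the sequence violates; an empty list means keep it."""
    reasons = []
    if stats.length < 500:
        reasons.append("shorter than 500 bp")
    if stats.stop_codons > 0:
        reasons.append("contains stop codons")
    if stats.unfilled_gaps > 0:
        reasons.append("produces gaps that cannot be filled")
    if stats.length > 0 and stats.ambiguities / stats.length > 0.01:
        reasons.append("more than 1% disagreements/ambiguities")
    if stats.bin.lower() in {"low", "medium"}:
        reasons.append("bin is still low or medium")
    return reasons


good = BarcodeStats(length=658, stop_codons=0, unfilled_gaps=0, ambiguities=3, bin="High")
bad = BarcodeStats(length=480, stop_codons=1, unfilled_gaps=0, ambiguities=10, bin="Medium")
print(discard_reasons(good))
print(discard_reasons(bad))
```

Returning the list of violated rules rather than a bare yes/no makes it easier to note why a barcode was discarded.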











