05:06:2024 20:55 sequence 1.2 Evan Christensen definition term replaced by amino acid modification Alliance of Genome Resources Alliance of Genome Resources Gene Biotype Slim biosapiens database of genomic structural variation RNA modification SO feature annotation variant annotation term amino acid 1 letter code amino acid 3 letter code biosapiens protein feature ontology dbsnp variant terms DBVAR ensembl variant terms subset_property synonym_type_property consider has_alternative_id has_broad_synonym database_cross_reference has_exact_synonym has_narrow_synonym has_obo_format_version has_obo_namespace has_related_synonym has_scope has_synonym_type in_subset A geometric operator, specified in Egenhofer 1989. Two features meet if they share a junction on the sequence. X adjacent_to Y iff X and Y share a boundary but do not overlap. sequence adjacent_to adjacent_to A geometric operator, specified in Egenhofer 1989. Two features meet if they share a junction on the sequence. X adjacent_to Y iff X and Y share a boundary but do not overlap. PMID:20226267 SO:ke sequence associated_with This relationship is vague and up for discussion. associated_with B is complete_evidence_for_feature A if the extent (5' and 3' boundaries) and internal boundaries of B fully support the extent and internal boundaries of A. sequence complete_evidence_for_feature If A is a feature with multiple regions such as a multi exon transcript, the supporting EST evidence is complete if each of the regions is supported by an equivalent region in B. Also there must be no extra regions in B that are not represented in A. This relationship was requested by jeltje on the SO term tracker. The thread for the discussion can be accessed via tracker ID:1917222. complete_evidence_for_feature B is complete_evidence_for_feature A if the extent (5' and 3' boundaries) and internal boundaries of B fully support the extent and internal boundaries of A.
SO:ke X connects_on Y, Z, R iff whenever Z is on a R, X is adjacent to a Y and adjacent to a Z. kareneilbeck 2010-10-14T01:38:51Z sequence connects_on Example: A splice_junction connects_on exon, exon, mature_transcript. connects_on X connects_on Y, Z, R iff whenever Z is on a R, X is adjacent to a Y and adjacent to a Z. PMID:20226267 X contained_by Y iff X starts after start of Y and X ends before end of Y. kareneilbeck 2010-10-14T01:26:16Z sequence contained_by The inverse is contains. Example: intein contained_by immature_peptide_region. contained_by X contained_by Y iff X starts after start of Y and X ends before end of Y. PMID:20226267 The inverse of contained_by. kareneilbeck 2010-10-14T01:32:15Z sequence contains Example: pre_miRNA contains miRNA_loop. contains The inverse of contained_by. PMID:20226267 sequence derives_from derives_from X is disconnected_from Y iff it is not the case that X overlaps Y. kareneilbeck 2010-10-14T01:42:10Z sequence disconnected_from disconnected_from X is disconnected_from Y iff it is not the case that X overlaps Y. PMID:20226267 kareneilbeck 2009-08-19T02:19:45Z sequence edited_from edited_from kareneilbeck 2009-08-19T02:19:11Z sequence edited_to edited_to B is evidence_for_feature A, if an instance of B supports the existence of A. sequence evidence_for_feature This relationship was requested by nlw on the SO term tracker. The thread for the discussion can be accessed via tracker ID:1917222. evidence_for_feature B is evidence_for_feature A, if an instance of B supports the existence of A. SO:ke X is exemplar of Y if X is the best evidence for Y. sequence exemplar_of Tracker id: 2594157. exemplar_of X is exemplar of Y if X is the best evidence for Y. SO:ke X is finished_by Y if Y is part of X, and X and Y share a 3' boundary. kareneilbeck 2010-10-14T01:45:45Z sequence finished_by Example: CDS finished_by stop_codon. finished_by X is finished_by Y if Y is part of X, and X and Y share a 3' boundary.
PMID:20226267 X finishes Y if X is part_of Y and X and Y share a 3' or C terminal boundary. kareneilbeck 2010-10-14T02:17:53Z sequence finishes Example: stop_codon finishes CDS. finishes X finishes Y if X is part_of Y and X and Y share a 3' or C terminal boundary. PMID:20226267 X gained Y if X is a variant_of X' and Y part of X but not X'. kareneilbeck 2011-06-28T12:51:10Z sequence gained A relation with which to annotate the changes in a variant sequence with respect to a reference. For example a variant transcript may gain a stop codon not present in the reference sequence. gained X gained Y if X is a variant_of X' and Y part of X but not X'. SO:ke sequence genome_of genome_of kareneilbeck 2009-08-19T02:27:04Z sequence guided_by guided_by kareneilbeck 2009-08-19T02:27:24Z sequence guides guides X has_integral_part Y if and only if: X has_part Y and Y part_of X. kareneilbeck 2009-08-19T12:01:46Z sequence has_integral_part Example: mRNA has_integral_part CDS. has_integral_part X has_integral_part Y if and only if: X has_part Y and Y part_of X. http://precedings.nature.com/documents/3495/version/1 sequence has_origin has_origin Inverse of part_of. sequence has_part Example: operon has_part gene. has_part Inverse of part_of. http://precedings.nature.com/documents/3495/version/1 sequence has_quality The relationship between a feature and an attribute. has_quality sequence homologous_to homologous_to X integral_part_of Y if and only if: X part_of Y and Y has_part X. kareneilbeck 2009-08-19T12:03:28Z sequence integral_part_of Example: exon integral_part_of transcript. integral_part_of X integral_part_of Y if and only if: X part_of Y and Y has_part X. 
http://precedings.nature.com/documents/3495/version/1 R is_consecutive_sequence_of R iff every instance of R is equivalent to a collection of instances of U:u1, u2, un, such that no pair of ux uy is overlapping and for all ux, it is adjacent to ux-1 and ux+1, with the exception of the initial and terminal u1,and un (which may be identical). kareneilbeck 2010-10-14T02:19:48Z sequence is_consecutive_sequence_of Example: region is consecutive_sequence of base. is_consecutive_sequence_of R is_consecutive_sequence_of R iff every instance of R is equivalent to a collection of instances of U:u1, u2, un, such that no pair of ux uy is overlapping and for all ux, it is adjacent to ux-1 and ux+1, with the exception of the initial and terminal u1,and un (which may be identical). PMID:20226267 X lost Y if X is a variant_of X' and Y part of X' but not X. kareneilbeck 2011-06-28T12:53:16Z sequence lost A relation with which to annotate the changes in a variant sequence with respect to a reference. For example a variant transcript may have lost a stop codon present in the reference sequence. lost X lost Y if X is a variant_of X' and Y part of X' but not X. SO:ke A maximally_overlaps X iff all parts of A (including A itself) overlap both A and Y. kareneilbeck 2010-10-14T01:34:48Z sequence maximally_overlaps Example: non_coding_region_of_exon maximally_overlaps the intersections of exon and UTR. maximally_overlaps A maximally_overlaps X iff all parts of A (including A itself) overlap both A and Y. PMID:20226267 sequence member_of A subtype of part_of. Inverse is collection_of. Winston, M, Chaffin, R, Herrmann: A taxonomy of part-whole relations. Cognitive Science 1987, 11:417-444. member_of A relationship between a pseudogenic feature and its functional ancestor. sequence non_functional_homolog_of non_functional_homolog_of A relationship between a pseudogenic feature and its functional ancestor. 
SO:ke sequence orthologous_to orthologous_to X overlaps Y iff there exists some Z such that Z contained_by X and Z contained_by Y. kareneilbeck 2010-10-14T01:33:15Z sequence overlaps Example: coding_exon overlaps CDS. overlaps X overlaps Y iff there exists some Z such that Z contained_by X and Z contained_by Y. PMID:20226267 sequence paralogous_to paralogous_to X part_of Y if X is a subregion of Y. sequence part_of Example: amino_acid part_of polypeptide. part_of X part_of Y if X is a subregion of Y. http://precedings.nature.com/documents/3495/version/1 B is partial_evidence_for_feature A if the extent of B supports part of but not all of A. sequence partial_evidence_for_feature partial_evidence_for_feature B is partial_evidence_for_feature A if the extent of B supports part of but not all of A. SO:ke sequence position_of position_of Inverse of processed_into. kareneilbeck 2009-08-19T12:14:00Z sequence processed_from Example: miRNA processed_from miRNA_primary_transcript. processed_from Inverse of processed_into. http://precedings.nature.com/documents/3495/version/1 X is processed_into Y if a region X is modified to create Y. kareneilbeck 2009-08-19T12:15:02Z sequence processed_into Example: miRNA_primary_transcript processed_into miRNA. processed_into X is processed_into Y if a region X is modified to create Y. http://precedings.nature.com/documents/3495/version/1 kareneilbeck 2009-08-19T02:21:03Z sequence recombined_from recombined_from kareneilbeck 2009-08-19T02:20:07Z sequence recombined_to recombined_to sequence sequence_of sequence_of sequence similar_to similar_to X is started_by Y if Y is part_of X and X and Y share a 5' boundary. kareneilbeck 2010-10-14T01:43:55Z sequence started_by Example: CDS started_by start_codon. started_by X is started_by Y if Y is part_of X and X and Y share a 5' boundary. PMID:20226267 X starts Y if X is part of Y, and X and Y share a 5' or N-terminal boundary.
kareneilbeck 2010-10-14T01:47:53Z sequence starts Example: start_codon starts CDS. starts X starts Y if X is part of Y, and X and Y share a 5' or N-terminal boundary. PMID:20226267 kareneilbeck 2009-08-19T02:22:14Z sequence trans_spliced_from trans_spliced_from kareneilbeck 2009-08-19T02:22:00Z sequence trans_spliced_to trans_spliced_to X is transcribed_from Y if X is synthesized from template Y. kareneilbeck 2009-08-19T12:05:39Z sequence transcribed_from Example: primary_transcript transcribed_from gene. transcribed_from X is transcribed_from Y if X is synthesized from template Y. http://precedings.nature.com/documents/3495/version/1 Inverse of transcribed_from. kareneilbeck 2009-08-19T12:08:24Z sequence transcribed_to Example: gene transcribed_to primary_transcript. transcribed_to Inverse of transcribed_from. http://precedings.nature.com/documents/3495/version/1 Inverse of translation_of. kareneilbeck 2009-08-19T12:11:53Z sequence translates_to Example: codon translates_to amino_acid. translates_to Inverse of translation_of. http://precedings.nature.com/documents/3495/version/1 X is translation_of Y if Y is translated by ribosome to create X. kareneilbeck 2009-08-19T12:09:59Z sequence translation_of Example: polypeptide translation_of CDS. translation_of X is translation_of Y if Y is translated by ribosome to create X. http://precedings.nature.com/documents/3495/version/1 A' is a variant (mutation) of A = definition every instance of A' is either an immediate mutation of some instance of A, or there is a chain of immediate mutation processes linking A' to some instance of A. sequence variant_of Added to SO during the immunology workshop, June 2007. This relationship was approved by Barry Smith. variant_of A' is a variant (mutation) of A = definition every instance of A' is either an immediate mutation of some instance of A, or there is a chain of immediate mutation processes linking A' to some instance of A.
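The positional relations defined above (contained_by, overlaps, disconnected_from, adjacent_to, starts, finishes) can be read as predicates over feature coordinates. The following is a minimal Python sketch of that reading, not part of the ontology. It assumes forward-strand features on half-open intervals [start, end), so the 5' boundary is `start` and the 3' boundary is `end`. The `Feature` class and the function bodies are hypothetical illustrations, and part_of is approximated by "lies within the extent of".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical located feature on a half-open interval [start, end)."""
    name: str
    start: int
    end: int


def contained_by(x: Feature, y: Feature) -> bool:
    # X starts after the start of Y and X ends before the end of Y (strict).
    return y.start < x.start and x.end < y.end


def overlaps(x: Feature, y: Feature) -> bool:
    # Simplification: X and Y share at least one position.
    return x.start < y.end and y.start < x.end


def disconnected_from(x: Feature, y: Feature) -> bool:
    # It is not the case that X overlaps Y.
    return not overlaps(x, y)


def adjacent_to(x: Feature, y: Feature) -> bool:
    # X and Y share a boundary but do not overlap.
    return x.end == y.start or y.end == x.start


def starts(x: Feature, y: Feature) -> bool:
    # X lies within Y (assumed reading of part_of) and shares its 5' boundary.
    return x.start == y.start and x.end <= y.end


def finishes(x: Feature, y: Feature) -> bool:
    # X lies within Y (assumed reading of part_of) and shares its 3' boundary.
    return x.end == y.end and x.start >= y.start


# Illustrative coordinates only; they do not come from the ontology.
cds = Feature("CDS", 100, 400)
start_codon = Feature("start_codon", 100, 103)
stop_codon = Feature("stop_codon", 397, 400)
exon = Feature("exon", 50, 200)
intron = Feature("intron", 200, 300)
```

Under these assumptions `starts(start_codon, cds)` and `finishes(stop_codon, cds)` hold, matching the examples "start_codon starts CDS" and "stop_codon finishes CDS", while `contained_by(start_codon, cds)` is false because the strict wording excludes a shared boundary.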
SO:immuno_workshop true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true true sequence SO:0000000 Sequence_Ontology true A 5' UTR variant within an upstream open reading frame. evan 2024-04-10T17:49:03Z sequence SO:00000000002382 Added 10 Apr 2024 at the request of Sarah Hunt (EBI). See GitHub Issue #647. 5_prime_UTR_uORF_variant A 5' UTR variant within an upstream open reading frame. PMID:32461616 PMID:32926138 A sequence_feature with an extent greater than zero. A nucleotide region is composed of bases and a polypeptide region is composed of amino acids. sequence sequence SO:0000001 region A sequence_feature with an extent greater than zero. A nucleotide region is composed of bases and a polypeptide region is composed of amino acids. SO:ke A 5' UTR variant where a stop codon in an upstream open reading frame is introduced, moved or lost. evan 2024-04-10T17:56:17Z sequence SO:00000010002382 Added 10 Apr 2024 at the request of Sarah Hunt (EBI). See GitHub Issue #622. 5_prime_UTR_uORF_stop_codon_variant A 5' UTR variant where a stop codon in an upstream open reading frame is introduced, moved or lost. PMID:32461616 PMID:32926138 A folded sequence. INSDC_feature:misc_structure sequence secondary structure sequence SO:0000002 sequence_secondary_structure A folded sequence. SO:ke A 5' UTR variant which disrupts the translation of an upstream open reading frame because the number of nucleotides inserted or deleted is not a multiple of three. 
evan 2024-04-10T17:58:40Z uFrameshift (UTRannotator) sequence SO:00000020002382 Added 10 Apr 2024 at the request of Sarah Hunt (EBI). See GitHub Issue #621. 5_prime_UTR_uORF_frameshift_variant A 5' UTR variant which disrupts the translation of an upstream open reading frame because the number of nucleotides inserted or deleted is not a multiple of three. PMID:32461616 PMID:32926138 G-quartets are unusual nucleic acid structures consisting of a planar arrangement where each guanine is hydrogen bonded by hoogsteen pairing to another guanine in the quartet. http://en.wikipedia.org/wiki/G-quadruplex G quartet G tetrad G-quadruplex G-quartet G-tetrad G_quadruplex guanine tetrad sequence SO:0000003 G_quartet G-quartets are unusual nucleic acid structures consisting of a planar arrangement where each guanine is hydrogen bonded by hoogsteen pairing to another guanine in the quartet. http://www.ncbi.nlm.nih.gov/pubmed/7919797?dopt=Abstract http://en.wikipedia.org/wiki/G-quadruplex wiki A 5' UTR variant where a premature stop codon is gained in an upstream open reading frame. evan 2024-04-10T18:01:42Z uSTOP_gained sequence SO:00000030002382 Added 10 Apr 2024 at the request of Sarah Hunt (EBI). See GitHub Issue #624. 5_prime_UTR_uORF_stop_codon_gain_variant A 5' UTR variant where a premature stop codon is gained in an upstream open reading frame. PMID:32461616 PMID:32926138 uSTOP_gained UTRannotator A coding exon that is not the most 3-prime or the most 5-prime in a given transcript. interior coding exon sequence SO:0000004 interior_coding_exon A 5' UTR variant where the stop codon of an upstream open reading frame is lost. evan 2024-04-10T18:05:50Z uSTOP_lost sequence SO:00000040002382 Added 10 Apr 2024 at the request of Sarah Hunt (EBI). See GitHub Issue #623. 5_prime_UTR_uORF_stop_codon_loss_variant A 5' UTR variant where the stop codon of an upstream open reading frame is lost. 
PMID:32461616 PMID:32926138 uSTOP_lost UTRannotator The many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA. INSDC_feature:repeat_region http://en.wikipedia.org/wiki/Satellite_DNA INSDC_qualifier:satellite satellite DNA sequence SO:0000005 satellite_DNA The many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Satellite_DNA wiki A region amplified by a PCR reaction. http://en.wikipedia.org/wiki/RAPD PCR product sequence amplicon SO:0000006 This term is mapped to MGED. This term is now located in OBI, with the following ID OBI_0000406. PCR_product A region amplified by a PCR reaction. SO:ke http://en.wikipedia.org/wiki/RAPD wiki One of a pair of sequencing reads in which the two members of the pair are related by originating at either end of a clone insert. mate pair read-pair sequence SO:0000007 read_pair One of a pair of sequencing reads in which the two members of the pair are related by originating at either end of a clone insert. SO:ls sequence SO:0000008 gene_sensu_your_favorite_organism true sequence SO:0000009 gene_class true A gene which, when transcribed, can be translated into a protein. protein-coding sequence SO:0000010 protein_coding A gene which can be transcribed, but will not be translated into a protein. non protein-coding sequence SO:0000011 non_protein_coding The primary transcript of any one of several small cytoplasmic RNA molecules present in the cytoplasm and sometimes nucleus of a Eukaryote. 
scRNA primary transcript scRNA transcript small cytoplasmic RNA transcript sequence small cytoplasmic RNA small_cytoplasmic_RNA SO:0000012 scRNA_primary_transcript The primary transcript of any one of several small cytoplasmic RNA molecules present in the cytoplasm and sometimes nucleus of a Eukaryote. http://www.ebi.ac.uk/embl/WebFeat/align/scRNA_s.html A small non coding RNA sequence, present in the cytoplasm. INSDC_feature:ncRNA INSDC_qualifier:scRNA small cytoplasmic RNA sequence SO:0000013 scRNA A small non coding RNA sequence, present in the cytoplasm. SO:ke A sequence element characteristic of some RNA polymerase II promoters required for the correct positioning of the polymerase for the start of transcription. Overlaps the TSS. The mammalian consensus sequence is YYAN(T|A)YY; the Drosophila consensus sequence is TCA(G|T)t(T|C). In each the A is at position +1 with respect to the TSS. Functionally similar to the TATA box element. INR motif initiator initiator motif sequence DMp2 SO:0000014 Binds TAF1, TAF2. INR_motif A sequence element characteristic of some RNA polymerase II promoters required for the correct positioning of the polymerase for the start of transcription. Overlaps the TSS. The mammalian consensus sequence is YYAN(T|A)YY; the Drosophila consensus sequence is TCA(G|T)t(T|C). In each the A is at position +1 with respect to the TSS. Functionally similar to the TATA box element. PMID:12651739 PMID:16858867 A sequence element characteristic of some RNA polymerase II promoters; Positioned from +28 to +32 with respect to the TSS (+1). Experimental results suggest that the DPE acts in conjunction with the INR_motif to provide a binding site for TFIID in the absence of a TATA box to mediate transcription of TATA-less promoters. Consensus sequence (A|G)G(A|T)(C|T)(G|A|C). DPE motif downstream core promoter element CRWMGCGWKCGCTTS sequence SO:0000015 Binds TAF6, TAF9. 
DPE_motif A sequence element characteristic of some RNA polymerase II promoters; Positioned from +28 to +32 with respect to the TSS (+1). Experimental results suggest that the DPE acts in conjunction with the INR_motif to provide a binding site for TFIID in the absence of a TATA box to mediate transcription of TATA-less promoters. Consensus sequence (A|G)G(A|T)(C|T)(G|A|C). PMID:12515390 PMID:12537576 PMID:12651739 PMID:16858867 A sequence element characteristic of some RNA polymerase II promoters, located immediately upstream of some TATA box elements at -37 to -32 with respect to the TSS (+1). Consensus sequence is (G|C)(G|C)(G|A)CGCC. Binds TFIIB. B-recognition element BRE motif BREu motif transcription factor B-recognition element sequence BREu TFIIB recognition element SO:0000016 Binds TFIIB. BREu_motif A sequence element characteristic of some RNA polymerase II promoters, located immediately upstream of some TATA box elements at -37 to -32 with respect to the TSS (+1). Consensus sequence is (G|C)(G|C)(G|A)CGCC. Binds TFIIB. PMID:12651739 PMID:16858867 A sequence element characteristic of the promoters of snRNA genes transcribed by RNA polymerase II or by RNA polymerase III. Located between -45 and -60 relative to the TSS. The human PSE_motif consensus sequence is TCACCNTNA(C|G)TNAAAAG(T|G). The basal transcription factor, snRNA-activating protein complex (SNAPc), binds the PSE_motif and is required for the transcription of both RNA polymerase II and III transcribed small-nuclear RNA genes. PSE motif proximal sequence element sequence SO:0000017 PSE_motif A sequence element characteristic of the promoters of snRNA genes transcribed by RNA polymerase II or by RNA polymerase III. Located between -45 and -60 relative to the TSS. The human PSE_motif consensus sequence is TCACCNTNA(C|G)TNAAAAG(T|G). 
The basal transcription factor, snRNA-activating protein complex (SNAPc), binds the PSE_motif and is required for the transcription of both RNA polymerase II and III transcribed small-nuclear RNA genes. PMID:11390411 PMID:12621023 PMID:12651739 PMID:23166507 PMID:8339931 A group of loci that can be grouped in a linear order representing the different degrees of linkage among the genes concerned. http://en.wikipedia.org/wiki/Linkage_group linkage group sequence SO:0000018 linkage_group A group of loci that can be grouped in a linear order representing the different degrees of linkage among the genes concerned. ISBN:038752046 http://en.wikipedia.org/wiki/Linkage_group wiki true A region of double stranded RNA where the bases do not conform to WC base pairing. The loop is closed on both sides by canonical base pairing. If the interruption to base pairing occurs on one strand only, it is known as a bulge. RNA internal loop sequence SO:0000020 RNA_internal_loop A region of double stranded RNA where the bases do not conform to WC base pairing. The loop is closed on both sides by canonical base pairing. If the interruption to base pairing occurs on one strand only, it is known as a bulge. SO:ke An internal RNA loop where one of the strands includes more bases than the corresponding region on the other strand. asymmetric RNA internal loop sequence SO:0000021 asymmetric_RNA_internal_loop An internal RNA loop where one of the strands includes more bases than the corresponding region on the other strand. SO:ke A region forming a motif, composed of adenines, where the minor groove edges are inserted into the minor groove of another helix. A minor RNA motif sequence SO:0000022 A_minor_RNA_motif A region forming a motif, composed of adenines, where the minor groove edges are inserted into the minor groove of another helix. SO:ke The kink turn (K-turn) is an RNA structural motif that creates a sharp (~120 degree) bend between two continuous helices. 
http://en.wikipedia.org/wiki/K-turn K turn RNA motif K-turn kink turn kink-turn motif sequence SO:0000023 K_turn_RNA_motif The kink turn (K-turn) is an RNA structural motif that creates a sharp (~120 degree) bend between two continuous helices. SO:ke http://en.wikipedia.org/wiki/K-turn wiki A loop in ribosomal RNA containing the sites of attack for ricin and sarcin. sarcin like RNA motif sarcin/ricin RNA domain sarcin/ricin domain sarcin/ricin loop sequence SO:0000024 sarcin_like_RNA_motif A loop in ribosomal RNA containing the sites of attack for ricin and sarcin. http://www.ncbi.nlm.nih.gov/pubmed/7897662 An internal RNA loop where the extent of the loop on both strands is the same size. A-minor RNA motif sequence SO:0000025 symmetric_RNA_internal_loop An internal RNA loop where the extent of the loop on both strands is the same size. SO:ke RNA junction loop sequence SO:0000026 RNA_junction_loop RNA hook turn hook-turn motif sequence hook turn SO:0000027 RNA_hook_turn Two bases paired opposite each other by hydrogen bonds creating a secondary structure. http://en.wikipedia.org/wiki/Base_pair base pair sequence SO:0000028 base_pair http://en.wikipedia.org/wiki/Base_pair wiki The canonical base pair, where two bases interact via WC edges, with glycosidic bonds oriented cis relative to the axis of orientation. WC base pair Watson Crick base pair Watson-Crick pair canonical base pair sequence Watson-Crick base pair SO:0000029 WC_base_pair The canonical base pair, where two bases interact via WC edges, with glycosidic bonds oriented cis relative to the axis of orientation. PMID:12177293 A type of non-canonical base-pairing. sugar edge base pair sequence SO:0000030 sugar_edge_base_pair A type of non-canonical base-pairing. PMID:12177293 DNA or RNA molecules that have been selected from random pools based on their ability to bind other molecules.
http://en.wikipedia.org/wiki/Aptamer sequence SO:0000031 aptamer DNA or RNA molecules that have been selected from random pools based on their ability to bind other molecules. http://aptamer.icmb.utexas.edu http://en.wikipedia.org/wiki/Aptamer wiki DNA molecules that have been selected from random pools based on their ability to bind other molecules. DNA aptamer sequence SO:0000032 DNA_aptamer DNA molecules that have been selected from random pools based on their ability to bind other molecules. http://aptamer.icmb.utexas.edu RNA molecules that have been selected from random pools based on their ability to bind other molecules. RNA aptamer sequence SO:0000033 RNA_aptamer RNA molecules that have been selected from random pools based on their ability to bind other molecules. http://aptamer.icmb.utexas.edu Morpholino oligos are synthesized from four different Morpholino subunits, each of which contains one of the four genetic bases (A, C, G, T) linked to a 6-membered morpholine ring. Eighteen to 25 subunits of these four subunit types are joined in a specific order by non-ionic phosphorodiamidate intersubunit linkages to give a Morpholino. morphant morpholino morpholino oligo sequence SO:0000034 morpholino_oligo Morpholino oligos are synthesized from four different Morpholino subunits, each of which contains one of the four genetic bases (A, C, G, T) linked to a 6-membered morpholine ring. Eighteen to 25 subunits of these four subunit types are joined in a specific order by non-ionic phosphorodiamidate intersubunit linkages to give a Morpholino. http://www.gene-tools.com/ A riboswitch is a part of an mRNA that can act as a direct sensor of small molecules to control their own expression. A riboswitch is a cis element in the 5' end of an mRNA, that acts as a direct sensor of metabolites.
INSDC_feature:regulatory http://en.wikipedia.org/wiki/Riboswitch INSDC_qualifier:riboswitch riboswitch RNA sequence SO:0000035 riboswitch A riboswitch is a part of an mRNA that can act as a direct sensor of small molecules to control their own expression. A riboswitch is a cis element in the 5' end of an mRNA, that acts as a direct sensor of metabolites. PMID:2820954 http://en.wikipedia.org/wiki/Riboswitch wiki A DNA region that is required for the binding of chromatin to the nuclear matrix. INSDC_feature:regulatory http://en.wikipedia.org/wiki/Matrix_attachment_site INSDC_qualifier:matrix_attachment_region MAR S/MAR SMAR matrix association region matrix attachment region matrix attachment site nuclear matrix association region nuclear matrix attachment site scaffold attachment site scaffold matrix attachment region sequence S/MAR element SO:0000036 matrix_attachment_site A DNA region that is required for the binding of chromatin to the nuclear matrix. SO:ma http://en.wikipedia.org/wiki/Matrix_attachment_site wiki A DNA region that includes DNAse hypersensitive sites located near a gene that confers the high-level, position-independent, and copy number-dependent expression to that gene. INSDC_feature:regulatory http://en.wikipedia.org/wiki/Locus_control_region INSDC_qualifier:locus_control_region LCR locus control region sequence locus control element SO:0000037 Definition updated Nov 10 2020, Colin Logie from GREEKC helped us realize that LCRs can also be located 3' to a gene. locus_control_region A DNA region that includes DNAse hypersensitive sites located near a gene that confers the high-level, position-independent, and copy number-dependent expression to that gene. SO:ma http://en.wikipedia.org/wiki/Locus_control_region wiki A collection of match parts. sequence SO:0000038 match_set true A collection of match parts. SO:ke A part of a match, for example an hsp from blast is a match_part. 
match part sequence SO:0000039 match_part A part of a match, for example an hsp from blast is a match_part. SO:ke A clone of a DNA region of a genome. genomic clone sequence SO:0000040 genomic_clone A clone of a DNA region of a genome. SO:ma An operation that can be applied to a sequence, that results in a change. sequence operation sequence SO:0000041 sequence_operation true An operation that can be applied to a sequence, that results in a change. SO:ke An attribute of a pseudogene (SO:0000336). pseudogene attribute sequence SO:0000042 pseudogene_attribute true An attribute of a pseudogene (SO:0000336). SO:ma A pseudogene created via retrotransposition of the mRNA of a functional protein-coding parent gene followed by accumulation of deleterious mutations lacking introns and promoters, often including a polyA tail. INSDC_feature:gene INSDC_qualifier:processed processed pseudogene retropseudogene sequence R psi G pseudogene by reverse transcription SO:0000043 Please note the synonym R psi M uses the spelled out form of the greek letter. processed_pseudogene A pseudogene created via retrotransposition of the mRNA of a functional protein-coding parent gene followed by accumulation of deleterious mutations lacking introns and promoters, often including a polyA tail. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A pseudogene caused by unequal crossing over at recombination. pseudogene by unequal crossing over sequence SO:0000044 pseudogene_by_unequal_crossing_over A pseudogene caused by unequal crossing over at recombination. SO:ke To remove a subsection of sequence. sequence SO:0000045 delete true To remove a subsection of sequence. SO:ke To insert a subsection of sequence. sequence SO:0000046 insert true To insert a subsection of sequence. SO:ke To invert a subsection of sequence. sequence SO:0000047 invert true To invert a subsection of sequence. SO:ke To substitute a subsection of sequence for another.
sequence SO:0000048 substitute true To substitute a subsection of sequence for another. SO:ke To translocate a subsection of sequence. sequence SO:0000049 translocate true To translocate a subsection of sequence. SO:ke A part of a gene, that has no other route in the ontology back to region. This concept is necessary for logical inference as these parts must have the properties of region. It also allows us to associate all the parts of genes with a gene. sequence SO:0000050 gene_part true A part of a gene, that has no other route in the ontology back to region. This concept is necessary for logical inference as these parts must have the properties of region. It also allows us to associate all the parts of genes with a gene. SO:ke A DNA sequence used experimentally to detect the presence or absence of a complementary nucleic acid. http://en.wikipedia.org/wiki/Hybridization_probe sequence SO:0000051 probe A DNA sequence used experimentally to detect the presence or absence of a complementary nucleic acid. SO:ma http://en.wikipedia.org/wiki/Hybridization_probe wiki sequence assortment-derived_deficiency SO:0000052 assortment_derived_deficiency true A sequence_variant_effect which changes the regulatory region of a gene. SO:0001556 sequence variant affecting regulatory region sequence mutation affecting regulatory region SO:0000053 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_regulatory_region true A sequence_variant_effect which changes the regulatory region of a gene. SO:ke A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number. http://en.wikipedia.org/wiki/Aneuploid sequence SO:0000054 aneuploid A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number. 
SO:ke http://en.wikipedia.org/wiki/Aneuploid wiki A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number as extra chromosomes are present. http://en.wikipedia.org/wiki/Hyperploid sequence SO:0000055 hyperploid A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number as extra chromosomes are present. SO:ke http://en.wikipedia.org/wiki/Hyperploid wiki A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number as some chromosomes are missing. http://en.wikipedia.org/wiki/Hypoploid sequence SO:0000056 hypoploid A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number as some chromosomes are missing. SO:ke http://en.wikipedia.org/wiki/Hypoploid wiki A regulatory element of an operon to which activators or repressors bind thereby effecting translation of genes in that operon. http://en.wikipedia.org/wiki/Operator_(biology)#Operator operator segment sequence SO:0000057 Moved to transcriptional_cis_regulatory_region (SO:0001055) from gene_group_regulatory_region (SO:0000752) on 11 Feb 2021 when SO:0000752 was merged into SO:0001055. See GitHub Issue #529. operator A regulatory element of an operon to which activators or repressors bind thereby effecting translation of genes in that operon. SO:ma http://en.wikipedia.org/wiki/Operator_(biology)#Operator wiki sequence assortment-derived_aneuploid SO:0000058 assortment_derived_aneuploid true A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a nuclease. nuclease binding site sequence SO:0000059 nuclease_binding_site A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a nuclease. SO:cb One arm of a compound chromosome.
compound chromosome arm sequence SO:0000060 FLAG - this term should probably be a part of rather than an is_a. compound_chromosome_arm A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a restriction enzyme. restriction endonuclease binding site restriction enzyme binding site sequence SO:0000061 A region of a molecule that binds to a restriction enzyme. restriction_enzyme_binding_site A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a restriction enzyme. SO:cb An intrachromosomal transposition whereby one of the four broken ends loses a segment before re-joining. deficient intrachromosomal transposition sequence SO:0000062 deficient_intrachromosomal_transposition An intrachromosomal transposition whereby one of the four broken ends loses a segment before re-joining. FB:reference_manual An interchromosomal transposition whereby one of the four broken ends loses a segment before re-joining. deficient interchromosomal transposition sequence SO:0000063 deficient_interchromosomal_transposition An interchromosomal transposition whereby one of the four broken ends loses a segment before re-joining. SO:ke sequence SO:0000064 This class of attributes was added by MA to allow the broad description of genes based on qualities of the transcript(s). A product of SO meeting 2004. gene_by_transcript_attribute true A chromosome structure variation whereby an arm exists as an individual chromosome element. free chromosome arm sequence SO:0000065 free_chromosome_arm A chromosome structure variation whereby an arm exists as an individual chromosome element.
SO:ke sequence SO:0000066 gene_by_polyadenylation_attribute true gene to gene feature sequence SO:0000067 gene_to_gene_feature An attribute describing a gene that has a sequence that overlaps the sequence of another gene. sequence SO:0000068 overlapping An attribute describing a gene that has a sequence that overlaps the sequence of another gene. SO:ke An attribute to describe a gene when it is located within the intron of another gene. inside intron sequence SO:0000069 inside_intron An attribute to describe a gene when it is located within the intron of another gene. SO:ke An attribute to describe a gene when it is located within the intron of another gene and on the opposite strand. inside intron antiparallel sequence SO:0000070 inside_intron_antiparallel An attribute to describe a gene when it is located within the intron of another gene and on the opposite strand. SO:ke An attribute to describe a gene when it is located within the intron of another gene and on the same strand. inside intron parallel sequence SO:0000071 inside_intron_parallel An attribute to describe a gene when it is located within the intron of another gene and on the same strand. SO:ke sequence SO:0000072 end_overlapping_gene true An attribute to describe a gene when the five prime region overlaps with another gene's 3' region. five prime-three prime overlap sequence SO:0000073 five_prime_three_prime_overlap An attribute to describe a gene when the five prime region overlaps with another gene's 3' region. SO:ke An attribute to describe a gene when the five prime region overlaps with another gene's five prime region. five prime-five prime overlap sequence SO:0000074 five_prime_five_prime_overlap An attribute to describe a gene when the five prime region overlaps with another gene's five prime region. SO:ke An attribute to describe a gene when the 3' region overlaps with another gene's 3' region. 
three prime-three prime overlap sequence SO:0000075 three_prime_three_prime_overlap An attribute to describe a gene when the 3' region overlaps with another gene's 3' region. SO:ke An attribute to describe a gene when the 3' region overlaps with another gene's 5' region. 5' 3' overlap three prime five prime overlap sequence SO:0000076 three_prime_five_prime_overlap An attribute to describe a gene when the 3' region overlaps with another gene's 5' region. SO:ke A region sequence that is complementary to a sequence of messenger RNA. http://en.wikipedia.org/wiki/Antisense sequence SO:0000077 antisense A region sequence that is complementary to a sequence of messenger RNA. SO:ke http://en.wikipedia.org/wiki/Antisense wiki A transcript that is polycistronic. polycistronic transcript sequence SO:0000078 polycistronic_transcript A transcript that is polycistronic. SO:xp A transcript that is dicistronic. dicistronic transcript sequence SO:0000079 dicistronic_transcript A transcript that is dicistronic. SO:ke A gene that is a member of an operon, which is a set of genes transcribed together as a unit. operon member sequence SO:0000080 operon_member gene array member sequence SO:0000081 gene_array_member sequence SO:0000082 processed_transcript_attribute true DNA belonging to the macronuclei of ciliates. macronuclear sequence sequence SO:0000083 macronuclear_sequence DNA belonging to the micronuclei of a cell. micronuclear sequence sequence SO:0000084 micronuclear_sequence sequence SO:0000085 gene_by_genome_location true sequence SO:0000086 gene_by_organelle_of_genome true A gene from nuclear sequence. http://en.wikipedia.org/wiki/Nuclear_gene nuclear gene sequence SO:0000087 nuclear_gene A gene from nuclear sequence. SO:xp http://en.wikipedia.org/wiki/Nuclear_gene wiki A gene located in mitochondrial sequence. http://en.wikipedia.org/wiki/Mitochondrial_gene mitochondrial gene mt gene sequence SO:0000088 mt_gene A gene located in mitochondrial sequence. 
SO:xp http://en.wikipedia.org/wiki/Mitochondrial_gene wiki A gene located in kinetoplast sequence. kinetoplast gene sequence SO:0000089 kinetoplast_gene A gene located in kinetoplast sequence. SO:xp A gene from plastid sequence. plastid gene sequence SO:0000090 plastid_gene A gene from plastid sequence. SO:xp A gene from apicoplast sequence. apicoplast gene sequence SO:0000091 apicoplast_gene A gene from apicoplast sequence. SO:xp A gene from chloroplast sequence. chloroplast gene ct gene sequence SO:0000092 ct_gene A gene from chloroplast sequence. SO:xp A gene from chromoplast_sequence. chromoplast gene sequence SO:0000093 chromoplast_gene A gene from chromoplast_sequence. SO:xp A gene from cyanelle sequence. cyanelle gene sequence SO:0000094 cyanelle_gene A gene from cyanelle sequence. SO:xp A plastid gene from leucoplast sequence. leucoplast gene sequence SO:0000095 leucoplast_gene A plastid gene from leucoplast sequence. SO:xp A gene from proplastid sequence. proplastid gene sequence SO:0000096 proplastid_gene A gene from proplastid sequence. SO:ke A gene from nucleomorph sequence. nucleomorph gene sequence SO:0000097 nucleomorph_gene A gene from nucleomorph sequence. SO:xp A gene from plasmid sequence. plasmid gene sequence SO:0000098 plasmid_gene A gene from plasmid sequence. SO:xp A gene from proviral sequence. proviral gene sequence SO:0000099 proviral_gene A gene from proviral sequence. SO:xp A proviral gene with origin endogenous retrovirus. endogenous retroviral gene sequence SO:0000100 endogenous_retroviral_gene A proviral gene with origin endogenous retrovirus. SO:xp A transposon or insertion sequence. An element that can insert in a variety of DNA sequences. http://en.wikipedia.org/wiki/Transposable_element transposable element transposon sequence SO:0000101 transposable_element A transposon or insertion sequence. An element that can insert in a variety of DNA sequences. 
http://www.sci.sdsu.edu/~smaloy/Glossary/T.html http://en.wikipedia.org/wiki/Transposable_element wiki A match to an EST or cDNA sequence. expressed sequence match sequence SO:0000102 expressed_sequence_match A match to an EST or cDNA sequence. SO:ke The end of the clone insert. clone insert end sequence SO:0000103 clone_insert_end The end of the clone insert. SO:ke A sequence of amino acids linked by peptide bonds which may lack appreciable tertiary structure and may not be liable to irreversible denaturation. SO:0000358 http://en.wikipedia.org/wiki/Polypeptide protein sequence SO:0000104 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. The term 'protein' was merged with 'polypeptide'. Although 'protein' was a sequence_attribute and therefore meant to describe the quality rather than an actual feature, it was being used erroneously. It is replaced by 'peptidyl' as the polymer attribute. polypeptide A sequence of amino acids linked by peptide bonds which may lack appreciable tertiary structure and may not be liable to irreversible denaturation. SO:ma http://en.wikipedia.org/wiki/Polypeptide wiki A region of the chromosome between the centromere and the telomere. Human chromosomes have two arms, the p arm (short) and the q arm (long) which are separated from each other by the centromere. chromosome arm sequence SO:0000105 chromosome_arm A region of the chromosome between the centromere and the telomere. Human chromosomes have two arms, the p arm (short) and the q arm (long) which are separated from each other by the centromere. http://www.medterms.com/script/main/art.asp?articlekey=5152 sequence SO:0000106 non_capped_primary_transcript true A single stranded oligo used for polymerase chain reaction. sequencing primer sequence SO:0000107 sequencing_primer An mRNA with a frameshift. frameshifted mRNA mRNA with frameshift sequence SO:0000108 mRNA_with_frameshift An mRNA with a frameshift. 
SO:xp A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration. sequence mutation SO:0000109 sequence_variant_obs true A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration. SO:ke Any extent of continuous biological sequence. INSDC_feature:misc_feature INSDC_note:other INSDC_note:sequence_feature located_sequence_feature sequence feature sequence located sequence feature SO:0000110 sequence_feature Any extent of continuous biological sequence. LAMHDI:mb SO:ke A gene encoded within a transposable element. For example gag, int, env and pol are the transposable element genes of the TY element in yeast. transposable element gene sequence SO:0000111 transposable_element_gene A gene encoded within a transposable element. For example gag, int, env and pol are the transposable element genes of the TY element in yeast. SO:ke An oligo to which new deoxyribonucleotides can be added by DNA polymerase. http://en.wikipedia.org/wiki/Primer_(molecular_biology) DNA primer primer oligonucleotide primer polynucleotide primer sequence sequence SO:0000112 primer An oligo to which new deoxyribonucleotides can be added by DNA polymerase. SO:ke http://en.wikipedia.org/wiki/Primer_(molecular_biology) wiki A viral sequence which has integrated into a host genome. proviral region sequence proviral sequence SO:0000113 proviral_region A viral sequence which has integrated into a host genome. SO:ke A methylated deoxy-cytosine. methylated C methylated cytosine methylated cytosine base methylated cytosine residue methylated_C sequence SO:0000114 methylated_cytosine A methylated deoxy-cytosine. SO:ke sequence SO:0000115 transcript_feature true An attribute describing a sequence that is modified by editing. sequence SO:0000116 edited An attribute describing a sequence that is modified by editing. 
SO:ke sequence SO:0000117 transcript_with_readthrough_stop_codon true A transcript with a translational frameshift. transcript with translational frameshift sequence SO:0000118 transcript_with_translational_frameshift A transcript with a translational frameshift. SO:xp An attribute to describe a sequence that is regulated. sequence SO:0000119 regulated An attribute to describe a sequence that is regulated. SO:ke A primary transcript that, at least in part, encodes one or more proteins. protein coding primary transcript sequence pre mRNA SO:0000120 May contain introns. protein_coding_primary_transcript A primary transcript that, at least in part, encodes one or more proteins. SO:ke A single stranded oligo used for polymerase chain reaction. DNA forward primer forward DNA primer forward primer forward primer oligo forward primer oligonucleotide forward primer polynucleotide forward primer sequence sequence SO:0000121 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. forward_primer A single stranded oligo used for polymerase chain reaction. http://mged.sourceforge.net/ontologies/MGEDontology.php A folded RNA sequence. RNA sequence secondary structure sequence SO:0000122 RNA_sequence_secondary_structure A folded RNA sequence. SO:ke An attribute describing a gene that is regulated at transcription. transcriptionally regulated sequence SO:0000123 By:<protein_id>. transcriptionally_regulated An attribute describing a gene that is regulated at transcription. SO:ma Expressed in relatively constant amounts without regard to cellular environmental conditions such as the concentration of a particular substrate. transcriptionally constitutive sequence SO:0000124 transcriptionally_constitutive Expressed in relatively constant amounts without regard to cellular environmental conditions such as the concentration of a particular substrate. SO:ke An inducer molecule is required for transcription to occur. 
transcriptionally induced sequence SO:0000125 transcriptionally_induced An inducer molecule is required for transcription to occur. SO:ke A repressor molecule is required for transcription to stop. transcriptionally repressed sequence SO:0000126 transcriptionally_repressed A repressor molecule is required for transcription to stop. SO:ke A gene that is silenced. silenced gene sequence SO:0000127 silenced_gene A gene that is silenced. SO:xp A gene that is silenced by DNA modification. gene silenced by DNA modification sequence SO:0000128 gene_silenced_by_DNA_modification A gene that is silenced by DNA modification. SO:xp A gene that is silenced by DNA methylation. gene silenced by DNA methylation methylation-silenced gene sequence SO:0000129 gene_silenced_by_DNA_methylation A gene that is silenced by DNA methylation. SO:xp An attribute describing a gene that is regulated after it has been translated. post translationally regulated post-translationally regulated sequence SO:0000130 post_translationally_regulated An attribute describing a gene that is regulated after it has been translated. SO:ke An attribute describing a gene that is regulated as it is translated. translationally regulated sequence SO:0000131 translationally_regulated An attribute describing a gene that is regulated as it is translated. SO:ke A single stranded oligo used for polymerase chain reaction. DNA reverse primer reverse DNA primer reverse primer reverse primer oligo reverse primer oligonucleotide reverse primer sequence sequence SO:0000132 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. reverse_primer A single stranded oligo used for polymerase chain reaction. http://mged.sourceforge.net/ontologies/MGEDontology.php This attribute describes a gene where heritable changes other than those in the DNA sequence occur. 
These changes include: modification to the DNA (such as DNA methylation, the covalent modification of cytosine), and post-translational modification of histones. epigenetically modified sequence SO:0000133 epigenetically_modified This attribute describes a gene where heritable changes other than those in the DNA sequence occur. These changes include: modification to the DNA (such as DNA methylation, the covalent modification of cytosine), and post-translational modification of histones. SO:ke Imprinted genes are epigenetically modified genes that are expressed monoallelically according to their parent of origin. imprinted http://en.wikipedia.org/wiki/Genomic_imprinting genomically imprinted sequence SO:0000134 genomically_imprinted Imprinted genes are epigenetically modified genes that are expressed monoallelically according to their parent of origin. SO:ke http://en.wikipedia.org/wiki/Genomic_imprinting wiki The maternal copy of the gene is modified, rendering it transcriptionally silent. maternally imprinted sequence SO:0000135 maternally_imprinted The maternal copy of the gene is modified, rendering it transcriptionally silent. SO:ke The paternal copy of the gene is modified, rendering it transcriptionally silent. paternally imprinted sequence SO:0000136 paternally_imprinted The paternal copy of the gene is modified, rendering it transcriptionally silent. SO:ke Allelic exclusion is a process occurring in diploid organisms, where a gene is inactivated and not expressed in that cell. allelically excluded sequence SO:0000137 Examples are X-inactivation and immunoglobulin formation. allelically_excluded Allelic exclusion is a process occurring in diploid organisms, where a gene is inactivated and not expressed in that cell. SO:ke An epigenetically modified gene, rearranged at the DNA level. gene rearranged at DNA level sequence SO:0000138 gene_rearranged_at_DNA_level An epigenetically modified gene, rearranged at the DNA level.
SO:xp Region in mRNA where ribosome assembles. INSDC_feature:regulatory INSDC_qualifier:ribosome_binding_site ribosome entry site sequence SO:0000139 ribosome_entry_site Region in mRNA where ribosome assembles. SO:ke A sequence segment located within the five prime end of an mRNA that causes premature termination of translation. INSDC_feature:regulatory http://en.wikipedia.org/wiki/Attenuator INSDC_qualifier:attenuator attenuator sequence sequence SO:0000140 attenuator A sequence segment located within the five prime end of an mRNA that causes premature termination of translation. SO:as http://en.wikipedia.org/wiki/Attenuator wiki The sequence of DNA located either at the end of the transcript that causes RNA polymerase to terminate transcription. INSDC_feature:regulatory http://en.wikipedia.org/wiki/Terminator_(genetics) INSDC_qualifier:terminator terminator sequence sequence SO:0000141 Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. terminator The sequence of DNA located either at the end of the transcript that causes RNA polymerase to terminate transcription. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Terminator_(genetics) wiki A folded DNA sequence. DNA sequence secondary structure sequence SO:0000142 DNA_sequence_secondary_structure A folded DNA sequence. SO:ke A region of known length which may be used to manufacture a longer region. assembly component sequence SO:0000143 assembly_component A region of known length which may be used to manufacture a longer region. SO:ke sequence SO:0000144 primary_transcript_attribute true A codon that has been redefined at translation. 
The redefinition may be as a result of translational bypass, translational frameshifting or stop codon readthrough. recoded codon sequence SO:0000145 recoded_codon A codon that has been redefined at translation. The redefinition may be as a result of translational bypass, translational frameshifting or stop codon readthrough. SO:xp An attribute describing when a sequence, usually an mRNA is capped by the addition of a modified guanine nucleotide at the 5' end. sequence SO:0000146 capped An attribute describing when a sequence, usually an mRNA is capped by the addition of a modified guanine nucleotide at the 5' end. SO:ke A region of the transcript sequence within a gene which is not removed from the primary RNA transcript by RNA splicing. http://en.wikipedia.org/wiki/Exon INSDC_feature:exon sequence SO:0000147 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. exon A region of the transcript sequence within a gene which is not removed from the primary RNA transcript by RNA splicing. SO:ke http://en.wikipedia.org/wiki/Exon wiki One or more contigs that have been ordered and oriented using end-read information. Contains gaps that are filled with N's. sequence scaffold SO:0000148 supercontig One or more contigs that have been ordered and oriented using end-read information. Contains gaps that are filled with N's. SO:ls A contiguous sequence derived from sequence assembly. Has no gaps, but may contain N's from unavailable bases. http://en.wikipedia.org/wiki/Contig sequence SO:0000149 contig A contiguous sequence derived from sequence assembly. Has no gaps, but may contain N's from unavailable bases. SO:ls http://en.wikipedia.org/wiki/Contig wiki A sequence obtained from a single sequencing experiment. Typically a read is produced when a base calling program interprets information from a chromatogram trace file produced from a sequencing machine. sequence SO:0000150 read A sequence obtained from a single sequencing experiment. 
Typically a read is produced when a base calling program interprets information from a chromatogram trace file produced from a sequencing machine. SO:rd A piece of DNA that has been inserted in a vector so that it can be propagated in a host bacterium or some other organism. http://en.wikipedia.org/wiki/Clone_(genetics) sequence SO:0000151 clone A piece of DNA that has been inserted in a vector so that it can be propagated in a host bacterium or some other organism. SO:ke http://en.wikipedia.org/wiki/Clone_(genetics) wiki Yeast Artificial Chromosome, a vector constructed from the telomeric, centromeric, and replication origin sequences needed for replication in yeast cells. yeast artificial chromosome sequence SO:0000152 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. YAC Yeast Artificial Chromosome, a vector constructed from the telomeric, centromeric, and replication origin sequences needed for replication in yeast cells. SO:ma Bacterial Artificial Chromosome, a cloning vector that can be propagated as mini-chromosomes in a bacterial host. bacterial artificial chromosome sequence SO:0000153 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. BAC Bacterial Artificial Chromosome, a cloning vector that can be propagated as mini-chromosomes in a bacterial host. SO:ma P1-derived artificial chromosomes are DNA constructs that are derived from the DNA of P1 bacteriophage. They can carry large amounts (about 100-300 kilobases) of other sequences for a variety of bioengineering purposes. It is one type of vector used to clone DNA fragments (100- to 300-kb insert size; average, 150 kb) in Escherichia coli cells. http://en.wikipedia.org/wiki/P1-derived_artificial_chromosome P1 P1 artificial chromosome sequence SO:0000154 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. Drosophila melanogaster PACs carry an average insert size of 80 kb.
The library represents a 6-fold coverage of the genome. PAC P1-derived artificial chromosomes are DNA constructs that are derived from the DNA of P1 bacteriophage. They can carry large amounts (about 100-300 kilobases) of other sequences for a variety of bioengineering purposes. It is one type of vector used to clone DNA fragments (100- to 300-kb insert size; average, 150 kb) in Escherichia coli cells. http://en.wikipedia.org/wiki/P1-derived_artificial_chromosome http://en.wikipedia.org/wiki/P1-derived_artificial_chromosome wiki A self replicating, using the host's cellular machinery, often circular nucleic acid molecule that is distinct from a chromosome in the organism. plasmid sequence sequence SO:0000155 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. plasmid A self replicating, using the host's cellular machinery, often circular nucleic acid molecule that is distinct from a chromosome in the organism. SO:ma A cloning vector that is a hybrid of lambda phages and a plasmid that can be propagated as a plasmid or packaged as a phage, since they retain the lambda cos sites. http://en.wikipedia.org/wiki/Cosmid cosmid vector sequence SO:0000156 Paper: Evans GA et al. High efficiency vectors for cosmid microcloning and genomic analysis. Gene 1989; 79(1):9-20. This term is mapped to MGED. Do not obsolete without consulting MGED ontology. cosmid A cloning vector that is a hybrid of lambda phages and a plasmid that can be propagated as a plasmid or packaged as a phage, since they retain the lambda cos sites. SO:ma http://en.wikipedia.org/wiki/Cosmid wiki A plasmid which carries within its sequence a bacteriophage replication origin. When the host bacterium is infected with "helper" phage, a phagemid is replicated along with the phage DNA and packaged into phage capsids. http://en.wikipedia.org/wiki/Phagemid sequence phagemid vector SO:0000157 phagemid A plasmid which carries within its sequence a bacteriophage replication origin.
When the host bacterium is infected with "helper" phage, a phagemid is replicated along with the phage DNA and packaged into phage capsids. SO:ma http://en.wikipedia.org/wiki/Phagemid wiki A cloning vector that utilizes the E. coli F factor. http://en.wikipedia.org/wiki/Fosmid sequence fosmid vector SO:0000158 Birren BW et al. A human chromosome 22 fosmid resource: mapping and analysis of 96 clones. Genomics 1996. fosmid A cloning vector that utilizes the E. coli F factor. SO:ma http://en.wikipedia.org/wiki/Fosmid wiki The point at which one or more contiguous nucleotides were excised. SO:1000033 http://en.wikipedia.org/wiki/Nucleotide_deletion loinc:LA6692-3 deleted_sequence nucleotide deletion nucleotide_deletion sequence SO:0000159 deletion The point at which one or more contiguous nucleotides were excised. SO:ke http://en.wikipedia.org/wiki/Nucleotide_deletion wiki loinc:LA6692-3 Deletion A linear clone derived from lambda bacteriophage. The genes involved in the lysogenic pathway are removed from the viral DNA. Up to 25 kb of foreign DNA can then be inserted into the lambda genome. sequence SO:0000160 lambda_clone true A linear clone derived from lambda bacteriophage. The genes involved in the lysogenic pathway are removed from the viral DNA. Up to 25 kb of foreign DNA can then be inserted into the lambda genome. ISBN:0-1767-2380-8 A modified base in which adenine has been methylated. methylated A methylated adenine methylated adenine base methylated adenine residue methylated_A sequence SO:0000161 methylated_adenine A modified base in which adenine has been methylated. SO:ke Consensus region of primary transcript bordering junction of splicing. A region that overlaps exactly 2 bases and is adjacent_to splice_junction. http://en.wikipedia.org/wiki/Splice_site splice site sequence SO:0000162 With spliceosomal introns, the splice sites bind the spliceosomal machinery.
splice_site Consensus region of primary transcript bordering junction of splicing. A region that overlaps exactly 2 bases and is adjacent_to splice_junction. SO:cjm SO:ke http://en.wikipedia.org/wiki/Splice_site wiki Intronic 2 bp region bordering the exon, at the 5' edge of the intron. A splice_site that is downstream_adjacent_to exon and starts intron. 5' splice site donor splice site five prime splice site splice donor site sequence donor SO:0000163 five_prime_cis_splice_site Intronic 2 bp region bordering the exon, at the 5' edge of the intron. A splice_site that is downstream_adjacent_to exon and starts intron. SO:cjm SO:ke http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html Intronic 2 bp region bordering the exon, at the 3' edge of the intron. A splice_site that is upstream_adjacent_to exon and finishes intron. acceptor splice site splice acceptor site three prime splice site sequence 3' splice site acceptor SO:0000164 three_prime_cis_splice_site Intronic 2 bp region bordering the exon, at the 3' edge of the intron. A splice_site that is upstream_adjacent_to exon and finishes intron. SO:cjm SO:ke http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter. INSDC_feature:regulatory http://en.wikipedia.org/wiki/Enhancer_(genetics) INSDC_qualifier:enhancer sequence SO:0000165 An enhancer may participate in an enhanceosome GO:0034206. A protein-DNA complex formed by the association of a distinct set of general and specific transcription factors with a region of enhancer DNA. The cooperative assembly of an enhanceosome confers specificity of transcriptional regulation. This comment is a place holder should we start to make cross products with GO.
enhancer A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Enhancer_(genetics) wiki An enhancer bound by a factor. enhancer bound by factor sequence SO:0000166 enhancer_bound_by_factor An enhancer bound by a factor. SO:xp A regulatory_region composed of the TSS(s) and binding sites for TF_complexes of the core transcription machinery. A region (DNA) to which RNA polymerase binds, to begin transcription. INSDC_feature:regulatory http://en.wikipedia.org/wiki/Promoter INSDC_qualifier:promoter promoter sequence sequence SO:0000167 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. The region on a DNA molecule involved in RNA polymerase binding to initiate transcription. Moved from is_a: SO:0001055 transcriptional_cis_regulatory_region as per request from GREEKC initiative in August 2020. Merged with RNA_polymerase_promoter (SO:0001203) Aug 2020. Moved up one level from is_a CRM (SO:0000727) to is_a transcriptional_cis_regulatory_region (SO:0001055) as part of the GREEKC work January 2021. Pascale Gaudet from Gene Ontology pointed out that CRM can be located upstream of the promoter and therefore cannot include the promoter. promoter A regulatory_region composed of the TSS(s) and binding sites for TF_complexes of the core transcription machinery. A region (DNA) to which RNA polymerase binds, to begin transcription. SO:regcreative http://en.wikipedia.org/wiki/Promoter wiki A specific nucleotide sequence of DNA at or near which a particular restriction enzyme cuts the DNA. sequence SO:0000168 restriction_enzyme_cut_site true A specific nucleotide sequence of DNA at or near which a particular restriction enzyme cuts the DNA. SO:ma A DNA sequence in eukaryotic DNA to which RNA polymerase I binds, to begin transcription. 
RNA polymerase A promoter RNApol I promoter pol I promoter polymerase I promoter sequence SO:0000169 parent term RNA_polymerase_promoter SO:0001203 was obsoleted in Aug 2020, so term has been moved to eukaryotic_promoter SO:0002221. RNApol_I_promoter A DNA sequence in eukaryotic DNA to which RNA polymerase I binds, to begin transcription. SO:ke A DNA sequence in eukaryotic DNA to which RNA polymerase II binds, to begin transcription. RNA polymerase B promoter RNApol II promoter polymerase II promoter sequence pol II promoter SO:0000170 parent term RNA_polymerase_promoter SO:0001203 was obsoleted in Aug 2020, so term has been moved to eukaryotic_promoter SO:0002221. RNApol_II_promoter A DNA sequence in eukaryotic DNA to which RNA polymerase II binds, to begin transcription. SO:ke A DNA sequence in eukaryotic DNA to which RNA polymerase III binds, to begin transcription. RNA polymerase C promoter RNApol III promoter pol III promoter polymerase III promoter sequence SO:0000171 parent term RNA_polymerase_promoter SO:0001203 was obsoleted in Aug 2020, so term has been moved to eukaryotic_promoter SO:0002221. RNApol_III_promoter A DNA sequence in eukaryotic DNA to which RNA polymerase III binds, to begin transcription. SO:ke Part of a conserved sequence located about 75-bp upstream of the start point of eukaryotic transcription units which may be involved in RNA polymerase binding; consensus=GG(C|T)CAATCT. INSDC_feature:regulatory http://en.wikipedia.org/wiki/CAAT_box CAAT box CAAT signal CAAT-box INSDC_qualifier:CAAT_signal sequence SO:0000172 CAAT_signal Part of a conserved sequence located about 75-bp upstream of the start point of eukaryotic transcription units which may be involved in RNA polymerase binding; consensus=GG(C|T)CAATCT. 
http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/CAAT_box wiki A conserved GC-rich region located upstream of the start point of eukaryotic transcription units which may occur in multiple copies or in either orientation; consensus=GGGCGG. INSDC_feature:regulatory GC rich promoter region GC-rich region INSDC_qualifier:GC_rich_promoter_region sequence SO:0000173 GC_rich_promoter_region A conserved GC-rich region located upstream of the start point of eukaryotic transcription units which may occur in multiple copies or in either orientation; consensus=GGGCGG. http://www.insdc.org/files/feature_table.html A conserved AT-rich septamer found about 25-bp before the start point of many eukaryotic RNA polymerase II transcript units; may be involved in positioning the enzyme for correct initiation; consensus=TATA(A|T)A(A|T). INSDC_feature:regulatory http://en.wikipedia.org/wiki/TATA_box Goldstein-Hogness box INSDC_qualifier:TATA_box TATA box sequence SO:0000174 Binds TBP. TATA_box A conserved AT-rich septamer found about 25-bp before the start point of many eukaryotic RNA polymerase II transcript units; may be involved in positioning the enzyme for correct initiation; consensus=TATA(A|T)A(A|T). PMID:16858867 http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/TATA_box wiki A conserved region about 10-bp upstream of the start point of bacterial transcription units which may be involved in binding RNA polymerase; consensus=TAtAaT. This region is associated with sigma factor 70. INSDC_feature:regulatory http://en.wikipedia.org/wiki/Pribnow_box -10 signal INSDC_qualifier:minus_10_signal Pribnow Schaller box Pribnow box Pribnow-Schaller box minus 10 signal sequence SO:0000175 Changed from is_a SO:0000713 DNA_motif to is_a SO:0002312 core_prokaryotic_promoter_element in response to GREEKC Initiative Dave Sant Aug 2020. 
Changed from is_a SO:0002312 core_prokaryotic_promoter_element back to is_a SO:0000713 DNA_motif to be consistent with minus_12_signal and minus_24_signal on 12 July 2021. minus_10_signal A conserved region about 10-bp upstream of the start point of bacterial transcription units which may be involved in binding RNA polymerase; consensus=TAtAaT. This region is associated with sigma factor 70. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Pribnow_box wiki A conserved hexamer about 35-bp upstream of the start point of bacterial transcription units; consensus=TTGACa or TGTTGACA. This region is associated with sigma factor 70. INSDC_feature:regulatory -35 signal INSDC_qualifier:minus_35_signal minus 35 signal sequence SO:0000176 Changed from is_a SO:0000713 DNA_motif to is_a SO:0002312 core_prokaryotic_promoter_element in response to GREEKC Initiative Dave Sant Aug 2020. Changed from is_a SO:0002312 core_prokaryotic_promoter_element back to is_a SO:0000713 DNA_motif to be consistent with minus_12_signal and minus_24_signal on 12 July 2021. minus_35_signal A conserved hexamer about 35-bp upstream of the start point of bacterial transcription units; consensus=TTGACa or TGTTGACA. This region is associated with sigma factor 70. http://www.insdc.org/files/feature_table.html A nucleotide match against a sequence from another organism. cross genome match sequence SO:0000177 cross_genome_match A nucleotide match against a sequence from another organism. SO:ma The DNA region of a group of adjacent genes whose transcription is coordinated on one or several mutually overlapping transcription units transcribed in the same direction and sharing at least one gene. http://en.wikipedia.org/wiki/Operon INSDC_feature:operon sequence SO:0000178 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. Definition updated per Mejia-Almonte et al., Redefining fundamental concepts of transcription initiation in prokaryotes, Aug 5 2020.
operon The DNA region of a group of adjacent genes whose transcription is coordinated on one or several mutually overlapping transcription units transcribed in the same direction and sharing at least one gene. SO:ma http://en.wikipedia.org/wiki/Operon wiki The start of the clone insert. clone insert start sequence SO:0000179 clone_insert_start The start of the clone insert. SO:ke A transposable element that is incorporated into a chromosome by a mechanism that requires reverse transcriptase. http://en.wikipedia.org/wiki/Retrotransposon class I transposon retrotransposon element sequence class I SO:0000180 retrotransposon A transposable element that is incorporated into a chromosome by a mechanism that requires reverse transcriptase. http://www.dddmag.com/Glossary.aspx#r http://en.wikipedia.org/wiki/Retrotransposon wiki A match against a translated sequence. translated nucleotide match sequence SO:0000181 translated_nucleotide_match A match against a translated sequence. SO:ke A transposon where the mechanism of transposition is via a DNA intermediate. DNA transposon class II transposon sequence class II SO:0000182 DNA_transposon A transposon where the mechanism of transposition is via a DNA intermediate. SO:ke A region of the gene which is not transcribed. non transcribed region non-transcribed sequence nontranscribed region nontranscribed sequence sequence SO:0000183 non_transcribed_region A region of the gene which is not transcribed. SO:ke A major type of spliceosomal intron spliced by the U2 spliceosome, that includes U1, U2, U4/U6 and U5 snRNAs. U2 intron sequence SO:0000184 May have either GT-AG or AT-AG 5' and 3' boundaries. U2_intron A major type of spliceosomal intron spliced by the U2 spliceosome, that includes U1, U2, U4/U6 and U5 snRNAs. PMID:9428511 A transcript that in its initial state requires modification to be functional. 
http://en.wikipedia.org/wiki/Primary_transcript INSDC_feature:precursor_RNA INSDC_feature:prim_transcript precursor RNA primary transcript sequence SO:0000185 primary_transcript A transcript that in its initial state requires modification to be functional. SO:ma http://en.wikipedia.org/wiki/Primary_transcript wiki A retrotransposon flanked by long terminal repeat sequences. LTR retrotransposon long terminal repeat retrotransposon sequence SO:0000186 LTR_retrotransposon A retrotransposon flanked by long terminal repeat sequences. SO:ke A group of characterized repeat sequences. sequence SO:0000187 repeat_family true A group of characterized repeat sequences. SO:ke A region of a primary transcript that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it. http://en.wikipedia.org/wiki/Intron INSDC_feature:intron sequence SO:0000188 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. intron A region of a primary transcript that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Intron wiki A retrotransposon without long terminal repeat sequences. non LTR retrotransposon sequence SO:0000189 non_LTR_retrotransposon A retrotransposon without long terminal repeat sequences. SO:ke An intron that is the most 5-prime in a given transcript. 5' intron 5' intron sequence five prime intron sequence SO:0000190 five_prime_intron An intron that is not the most 3-prime or the most 5-prime in a given transcript. interior intron sequence SO:0000191 interior_intron An intron that is the most 3-prime in a given transcript. 
3' intron three prime intron sequence 3' intron sequence SO:0000192 three_prime_intron A DNA fragment used as a reagent to detect the polymorphic genomic loci by hybridizing against the genomic DNA digested with a given restriction enzyme. http://en.wikipedia.org/wiki/Restriction_fragment_length_polymorphism RFLP RFLP fragment restriction fragment length polymorphism sequence SO:0000193 RFLP_fragment A DNA fragment used as a reagent to detect the polymorphic genomic loci by hybridizing against the genomic DNA digested with a given restriction enzyme. GOC:pj http://en.wikipedia.org/wiki/Restriction_fragment_length_polymorphism wiki A dispersed repeat family with many copies, each from 1 to 6 kb long. New elements are generated by retroposition of a transcribed copy. Typically the LINE contains 2 ORFs, one of which is reverse transcriptase, and 3' and 5' direct repeats. LINE LINE element Long interspersed element Long interspersed nuclear element sequence SO:0000194 LINE_element A dispersed repeat family with many copies, each from 1 to 6 kb long. New elements are generated by retroposition of a transcribed copy. Typically the LINE contains 2 ORFs, one of which is reverse transcriptase, and 3' and 5' direct repeats. http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html An exon whereby at least one base is part of a codon (here, 'codon' is inclusive of the stop_codon). coding exon sequence SO:0000195 coding_exon An exon whereby at least one base is part of a codon (here, 'codon' is inclusive of the stop_codon). SO:ke The sequence of the five_prime_coding_exon that codes for protein. five prime exon coding region sequence SO:0000196 five_prime_coding_exon_coding_region The sequence of the five_prime_coding_exon that codes for protein. SO:cjm The sequence of the three_prime_coding_exon that codes for protein. three prime exon coding region sequence SO:0000197 three_prime_coding_exon_coding_region The sequence of the three_prime_coding_exon that codes for protein.
SO:cjm An exon that does not contain any codons. noncoding exon sequence SO:0000198 noncoding_exon An exon that does not contain any codons. SO:ke A region of nucleotide sequence that has translocated to a new position. The observed adjacency of two previously separated regions. translocated sequence sequence transchr SO:0000199 translocation A region of nucleotide sequence that has translocated to a new position. The observed adjacency of two previously separated regions. NCBI:th SO:ke transchr http://www.ncbi.nlm.nih.gov/dbvar/ The 5' most coding exon. 5' coding exon five prime coding exon sequence SO:0000200 five_prime_coding_exon The 5' most coding exon. SO:ke An exon that is bounded by 5' and 3' splice sites. interior exon sequence SO:0000201 interior_exon An exon that is bounded by 5' and 3' splice sites. PMID:10373547 The coding exon that is most 3-prime on a given transcript. three prime coding exon sequence 3' coding exon SO:0000202 three_prime_coding_exon The coding exon that is most 3-prime on a given transcript. SO:ma Messenger RNA sequences that are untranslated and lie five prime or three prime to sequences which are translated. untranslated region sequence SO:0000203 UTR Messenger RNA sequences that are untranslated and lie five prime or three prime to sequences which are translated. SO:ke A region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein. http://en.wikipedia.org/wiki/5'_UTR 5' UTR INSDC_feature:5'UTR five prime UTR five_prime_untranslated_region sequence SO:0000204 five_prime_UTR A region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/5'_UTR wiki A region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein. 
http://en.wikipedia.org/wiki/Three_prime_untranslated_region INSDC_feature:3'UTR three prime UTR three prime untranslated region sequence SO:0000205 three_prime_UTR A region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Three_prime_untranslated_region wiki A repetitive element, a few hundred base pairs long, that is dispersed throughout the genome. A common human SINE is the Alu element. http://en.wikipedia.org/wiki/Short_interspersed_nuclear_element SINE element Short interspersed element Short interspersed nuclear element sequence SO:0000206 SINE_element A repetitive element, a few hundred base pairs long, that is dispersed throughout the genome. A common human SINE is the Alu element. SO:ke http://en.wikipedia.org/wiki/Short_interspersed_nuclear_element wiki SSLP are a kind of sequence alteration where the number of repeated sequences in intergenic regions may differ. http://en.wikipedia.org/wiki/Simple_sequence_length_polymorphism simple sequence length variation sequence SSLP simple sequence length polymorphism SO:0000207 simple_sequence_length_variation SSLP are a kind of sequence alteration where the number of repeated sequences in intergenic regions may differ. SO:ke http://en.wikipedia.org/wiki/Simple_sequence_length_polymorphism WIKI A DNA transposable element defined as having termini with perfect, or nearly perfect short inverted repeats, generally 10 - 40 nucleotides long. TIR element terminal inverted repeat element sequence SO:0000208 terminal_inverted_repeat_element A DNA transposable element defined as having termini with perfect, or nearly perfect short inverted repeats, generally 10 - 40 nucleotides long. http://www.genetics.org/cgi/reprint/156/4/1983.pdf A primary transcript encoding a ribosomal RNA. 
rRNA primary transcript ribosomal RNA primary transcript sequence SO:0000209 rRNA_primary_transcript A primary transcript encoding a ribosomal RNA. SO:ke A primary transcript encoding a transfer RNA (SO:0000253). tRNA primary transcript sequence SO:0000210 tRNA_primary_transcript A primary transcript encoding a transfer RNA (SO:0000253). SO:ke A primary transcript encoding alanyl tRNA. alanine tRNA primary transcript sequence SO:0000211 alanine_tRNA_primary_transcript A primary transcript encoding alanyl tRNA. SO:ke A primary transcript encoding arginyl tRNA (SO:0000255). arginine tRNA primary transcript sequence SO:0000212 arginine_tRNA_primary_transcript A primary transcript encoding arginyl tRNA (SO:0000255). SO:ke A primary transcript encoding asparaginyl tRNA (SO:0000256). asparagine tRNA primary transcript sequence SO:0000213 asparagine_tRNA_primary_transcript A primary transcript encoding asparaginyl tRNA (SO:0000256). SO:ke A primary transcript encoding aspartyl tRNA (SO:0000257). aspartic acid tRNA primary transcript sequence SO:0000214 aspartic_acid_tRNA_primary_transcript A primary transcript encoding aspartyl tRNA (SO:0000257). SO:ke A primary transcript encoding cysteinyl tRNA (SO:0000258). cysteine tRNA primary transcript sequence SO:0000215 cysteine_tRNA_primary_transcript A primary transcript encoding cysteinyl tRNA (SO:0000258). SO:ke A primary transcript encoding glutamyl tRNA (SO:0000260). glutamic acid tRNA primary transcript sequence SO:0000216 glutamic_acid_tRNA_primary_transcript A primary transcript encoding glutamyl tRNA (SO:0000260). SO:ke A primary transcript encoding glutaminyl tRNA (SO:0000259). glutamine tRNA primary transcript sequence SO:0000217 glutamine_tRNA_primary_transcript A primary transcript encoding glutaminyl tRNA (SO:0000259). SO:ke A primary transcript encoding glycyl tRNA (SO:0000261).
glycine tRNA primary transcript sequence SO:0000218 glycine_tRNA_primary_transcript A primary transcript encoding glycyl tRNA (SO:0000261). SO:ke A primary transcript encoding histidyl tRNA (SO:0000262). histidine tRNA primary transcript sequence SO:0000219 histidine_tRNA_primary_transcript A primary transcript encoding histidyl tRNA (SO:0000262). SO:ke A primary transcript encoding isoleucyl tRNA (SO:0000263). isoleucine tRNA primary transcript sequence SO:0000220 isoleucine_tRNA_primary_transcript A primary transcript encoding isoleucyl tRNA (SO:0000263). SO:ke A primary transcript encoding leucyl tRNA (SO:0000264). leucine tRNA primary transcript sequence SO:0000221 leucine_tRNA_primary_transcript A primary transcript encoding leucyl tRNA (SO:0000264). SO:ke A primary transcript encoding lysyl tRNA (SO:0000265). lysine tRNA primary transcript sequence SO:0000222 lysine_tRNA_primary_transcript A primary transcript encoding lysyl tRNA (SO:0000265). SO:ke A primary transcript encoding methionyl tRNA (SO:0000266). methionine tRNA primary transcript sequence SO:0000223 methionine_tRNA_primary_transcript A primary transcript encoding methionyl tRNA (SO:0000266). SO:ke A primary transcript encoding phenylalanyl tRNA (SO:0000267). phenylalanine tRNA primary transcript sequence SO:0000224 phenylalanine_tRNA_primary_transcript A primary transcript encoding phenylalanyl tRNA (SO:0000267). SO:ke A primary transcript encoding prolyl tRNA (SO:0000268). proline tRNA primary transcript sequence SO:0000225 proline_tRNA_primary_transcript A primary transcript encoding prolyl tRNA (SO:0000268). SO:ke A primary transcript encoding seryl tRNA (SO:0000269). serine tRNA primary transcript sequence SO:0000226 serine_tRNA_primary_transcript A primary transcript encoding seryl tRNA (SO:0000269). SO:ke A primary transcript encoding threonyl tRNA (SO:0000270).
threonine tRNA primary transcript sequence SO:0000227 threonine_tRNA_primary_transcript A primary transcript encoding threonyl tRNA (SO:0000270). SO:ke A primary transcript encoding tryptophanyl tRNA (SO:0000271). tryptophan tRNA primary transcript sequence SO:0000228 tryptophan_tRNA_primary_transcript A primary transcript encoding tryptophanyl tRNA (SO:0000271). SO:ke A primary transcript encoding tyrosyl tRNA (SO:0000272). tyrosine tRNA primary transcript sequence SO:0000229 tyrosine_tRNA_primary_transcript A primary transcript encoding tyrosyl tRNA (SO:0000272). SO:ke A primary transcript encoding valyl tRNA (SO:0000273). valine tRNA primary transcript sequence SO:0000230 valine_tRNA_primary_transcript A primary transcript encoding valyl tRNA (SO:0000273). SO:ke A primary transcript encoding a small nuclear RNA (SO:0000274). snRNA primary transcript sequence SO:0000231 snRNA_primary_transcript A primary transcript encoding a small nuclear RNA (SO:0000274). SO:ke A primary transcript encoding one or more small nucleolar RNAs (SO:0000275). snoRNA primary transcript sequence SO:0000232 This definition was broadened 26 Jan 2021 to reflect that a single transcript can encode one or more snoRNAs. Brought to our attention by FlyBase. GitHub Issue #520 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/520). snoRNA_primary_transcript A primary transcript encoding one or more small nucleolar RNAs (SO:0000275). SO:ke A transcript which has undergone the necessary modifications, if any, for its function. In eukaryotes this includes, for example, processing of introns, cleavage, base modification, and modifications to the 5' and/or the 3' ends, other than addition of bases. In bacteria functional mRNAs are usually not modified. http://en.wikipedia.org/wiki/Mature_transcript mature transcript sequence SO:0000233 A processed transcript cannot contain introns. mature_transcript A transcript which has undergone the necessary modifications, if any, for its function.
In eukaryotes this includes, for example, processing of introns, cleavage, base modification, and modifications to the 5' and/or the 3' ends, other than addition of bases. In bacteria functional mRNAs are usually not modified. SO:ke http://en.wikipedia.org/wiki/Mature_transcript wiki Messenger RNA is the intermediate molecule between DNA and protein. It includes UTR and coding sequences. It does not contain introns. http://en.wikipedia.org/wiki/MRNA http://www.gencodegenes.org/gencode_biotypes.html INSDC_feature:mRNA messenger RNA protein_coding_transcript sequence SO:0000234 An mRNA does not contain introns as it is a processed_transcript. The equivalent kind of primary_transcript is protein_coding_primary_transcript (SO:0000120) which may contain introns. This term is mapped to MGED. Do not obsolete without consulting MGED ontology. mRNA Messenger RNA is the intermediate molecule between DNA and protein. It includes UTR and coding sequences. It does not contain introns. SO:ma http://en.wikipedia.org/wiki/MRNA wiki http://www.gencodegenes.org/gencode_biotypes.html GENCODE A DNA site where a transcription factor binds. TF binding site transcription factor binding site sequence SO:0000235 Definition updated along with definitions in Mejia-Almonte et al. PMID:32665585. Added relationship part_of SO:0000727 CRM in place of previous CRM relationship has_part TF_binding_site August 2020 in response to requests from GREEKC initiative. Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. TF_binding_site A DNA site where a transcription factor binds.
SO:ke The in-frame interval between the stop codons of a reading frame which when read as sequential triplets, has the potential of encoding a sequential string of amino acids. TER(NNN)nTER. open reading frame sequence SO:0000236 The definition was modified by Rama. ORF is defined by the sequence, whereas the CDS is defined according to whether a polypeptide is made. This term is mapped to MGED. Do not obsolete without consulting MGED ontology. ORF The in-frame interval between the stop codons of a reading frame which when read as sequential triplets, has the potential of encoding a sequential string of amino acids. TER(NNN)nTER. SGD:rb SO:ma An attribute describing a transcript. transcript attribute sequence SO:0000237 transcript_attribute A transposable element with extensive secondary structure, characterized by large modular imperfect long inverted repeats. foldback element sequence LVR element long inverted repeat element SO:0000238 foldback_element A transposable element with extensive secondary structure, characterized by large modular imperfect long inverted repeats. http://www.genetics.org/cgi/reprint/156/4/1983.pdf The sequences extending on either side of a specific region. flanking region sequence SO:0000239 flanking_region The sequences extending on either side of a specific region. SO:ke A deviation in chromosome structure or number. chromosome variation sequence SO:0000240 chromosome_variation A UTR bordered by the terminal and initial codons of two CDSs in a polycistronic transcript. Every UTR is either 5', 3' or internal. internal UTR sequence SO:0000241 internal_UTR A UTR bordered by the terminal and initial codons of two CDSs in a polycistronic transcript. Every UTR is either 5', 3' or internal. SO:cjm The untranslated sequence separating the 'cistrons' of multicistronic mRNA. 
untranslated region polycistronic mRNA sequence SO:0000242 untranslated_region_polycistronic_mRNA The untranslated sequence separating the 'cistrons' of multicistronic mRNA. SO:ke Sequence element that recruits a ribosomal subunit to internal mRNA for translation initiation. http://en.wikipedia.org/wiki/Internal_ribosome_entry_site IRES internal ribosomal entry sequence internal ribosomal entry site internal ribosome entry site sequence internal ribosome entry sequence SO:0000243 internal_ribosome_entry_site Sequence element that recruits a ribosomal subunit to internal mRNA for translation initiation. SO:ke http://en.wikipedia.org/wiki/Internal_ribosome_entry_site wiki sequence 4-cutter_restriction_site four-cutter_restriction_site SO:0000244 four_cutter_restriction_site true sequence SO:0000245 mRNA_by_polyadenylation_status true An attribute describing the addition of a poly A tail to the 3' end of an mRNA molecule. sequence SO:0000246 polyadenylated An attribute describing the addition of a poly A tail to the 3' end of an mRNA molecule. SO:ke sequence SO:0000247 mRNA_not_polyadenylated true A kind of sequence alteration where the number of copies of a region present varies across a population. sequence length alteration sequence SO:0000248 sequence_length_alteration A kind of sequence alteration where the number of copies of a region present varies across a population. SO:ke sequence 6-cutter_restriction_site six-cutter_restriction_site SO:0000249 six_cutter_restriction_site true A post_transcriptionally modified base. modified RNA base feature sequence SO:0000250 modified_RNA_base_feature A post_transcriptionally modified base. SO:ke sequence 8-cutter_restriction_site eight-cutter_restriction_site SO:0000251 eight_cutter_restriction_site true rRNA is an RNA component of a ribosome that can provide both structural scaffolding and catalytic activity.
INSDC_qualifier:unknown http://en.wikipedia.org/wiki/RRNA INSDC_feature:rRNA ribosomal RNA ribosomal ribonucleic acid sequence SO:0000252 Definition updated 10 June 2021 as part of restructuring rRNA terms and reforming definitions to have similar structures. Request from EBI. See GitHub Issue #493 rRNA rRNA is an RNA component of a ribosome that can provide both structural scaffolding and catalytic activity. ISBN:0198506732 http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/RRNA wiki Transfer RNA (tRNA) molecules are approximately 80 nucleotides in length. Their secondary structure includes four short double-helical elements and three loops (D, anti-codon, and T loops). Further hydrogen bonds mediate the characteristic L-shaped molecular structure. Transfer RNAs have two regions of fundamental functional importance: the anti-codon, which is responsible for specific mRNA codon recognition, and the 3' end, to which the tRNA's corresponding amino acid is attached (by aminoacyl-tRNA synthetases). Transfer RNAs cope with the degeneracy of the genetic code in two manners: having more than one tRNA (with a specific anti-codon) for a particular amino acid; and 'wobble' base-pairing, i.e. permitting non-standard base-pairing at the 3rd anti-codon position. INSDC_qualifier:unknown http://en.wikipedia.org/wiki/TRNA INSDC_feature:tRNA sequence transfer RNA transfer ribonucleic acid SO:0000253 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. tRNA Transfer RNA (tRNA) molecules are approximately 80 nucleotides in length. Their secondary structure includes four short double-helical elements and three loops (D, anti-codon, and T loops). Further hydrogen bonds mediate the characteristic L-shaped molecular structure. 
Transfer RNAs have two regions of fundamental functional importance: the anti-codon, which is responsible for specific mRNA codon recognition, and the 3' end, to which the tRNA's corresponding amino acid is attached (by aminoacyl-tRNA synthetases). Transfer RNAs cope with the degeneracy of the genetic code in two manners: having more than one tRNA (with a specific anti-codon) for a particular amino acid; and 'wobble' base-pairing, i.e. permitting non-standard base-pairing at the 3rd anti-codon position. ISBN:0198506732 http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00005 http://en.wikipedia.org/wiki/TRNA wiki A tRNA sequence that has an alanine anticodon, and a 3' alanine binding region. alanyl tRNA alanyl-transfer RNA alanyl-transfer ribonucleic acid sequence SO:0000254 alanyl_tRNA A tRNA sequence that has an alanine anticodon, and a 3' alanine binding region. SO:ke A primary transcript encoding a small ribosomal subunit RNA. rRNA small subunit primary transcript sequence SO:0000255 rRNA_small_subunit_primary_transcript A primary transcript encoding a small ribosomal subunit RNA. SO:ke A tRNA sequence that has an asparagine anticodon, and a 3' asparagine binding region. asparaginyl tRNA asparaginyl-transfer RNA asparaginyl-transfer ribonucleic acid sequence SO:0000256 asparaginyl_tRNA A tRNA sequence that has an asparagine anticodon, and a 3' asparagine binding region. SO:ke A tRNA sequence that has an aspartic acid anticodon, and a 3' aspartic acid binding region. aspartyl tRNA aspartyl-transfer RNA aspartyl-transfer ribonucleic acid sequence SO:0000257 aspartyl_tRNA A tRNA sequence that has an aspartic acid anticodon, and a 3' aspartic acid binding region. SO:ke A tRNA sequence that has a cysteine anticodon, and a 3' cysteine binding region. cysteinyl tRNA cysteinyl-transfer RNA cysteinyl-transfer ribonucleic acid sequence SO:0000258 cysteinyl_tRNA A tRNA sequence that has a cysteine anticodon, and a 3' cysteine binding region. 
SO:ke A tRNA sequence that has a glutamine anticodon, and a 3' glutamine binding region. glutaminyl tRNA glutaminyl-transfer RNA glutaminyl-transfer ribonucleic acid sequence SO:0000259 glutaminyl_tRNA A tRNA sequence that has a glutamine anticodon, and a 3' glutamine binding region. SO:ke A tRNA sequence that has a glutamic acid anticodon, and a 3' glutamic acid binding region. glutamyl tRNA glutamyl-transfer ribonucleic acid sequence glutamyl-transfer RNA SO:0000260 glutamyl_tRNA A tRNA sequence that has a glutamic acid anticodon, and a 3' glutamic acid binding region. SO:ke A tRNA sequence that has a glycine anticodon, and a 3' glycine binding region. glycyl tRNA sequence glycyl-transfer RNA glycyl-transfer ribonucleic acid SO:0000261 glycyl_tRNA A tRNA sequence that has a glycine anticodon, and a 3' glycine binding region. SO:ke A tRNA sequence that has a histidine anticodon, and a 3' histidine binding region. histidyl tRNA histidyl-transfer RNA histidyl-transfer ribonucleic acid sequence SO:0000262 histidyl_tRNA A tRNA sequence that has a histidine anticodon, and a 3' histidine binding region. SO:ke A tRNA sequence that has an isoleucine anticodon, and a 3' isoleucine binding region. isoleucyl tRNA isoleucyl-transfer RNA isoleucyl-transfer ribonucleic acid sequence SO:0000263 isoleucyl_tRNA A tRNA sequence that has an isoleucine anticodon, and a 3' isoleucine binding region. SO:ke A tRNA sequence that has a leucine anticodon, and a 3' leucine binding region. leucyl tRNA leucyl-transfer RNA leucyl-transfer ribonucleic acid sequence SO:0000264 leucyl_tRNA A tRNA sequence that has a leucine anticodon, and a 3' leucine binding region. SO:ke A tRNA sequence that has a lysine anticodon, and a 3' lysine binding region. lysyl tRNA lysyl-transfer RNA lysyl-transfer ribonucleic acid sequence SO:0000265 lysyl_tRNA A tRNA sequence that has a lysine anticodon, and a 3' lysine binding region. 
SO:ke A tRNA sequence that has a methionine anticodon, and a 3' methionine binding region. methionyl tRNA methionyl-transfer RNA methionyl-transfer ribonucleic acid sequence SO:0000266 methionyl_tRNA A tRNA sequence that has a methionine anticodon, and a 3' methionine binding region. SO:ke A tRNA sequence that has a phenylalanine anticodon, and a 3' phenylalanine binding region. phenylalanyl tRNA phenylalanyl-transfer RNA phenylalanyl-transfer ribonucleic acid sequence SO:0000267 phenylalanyl_tRNA A tRNA sequence that has a phenylalanine anticodon, and a 3' phenylalanine binding region. SO:ke A tRNA sequence that has a proline anticodon, and a 3' proline binding region. prolyl tRNA prolyl-transfer RNA prolyl-transfer ribonucleic acid sequence SO:0000268 prolyl_tRNA A tRNA sequence that has a proline anticodon, and a 3' proline binding region. SO:ke A tRNA sequence that has a serine anticodon, and a 3' serine binding region. seryl tRNA seryl-transfer RNA sequence seryl-transfer ribonucleic acid SO:0000269 seryl_tRNA A tRNA sequence that has a serine anticodon, and a 3' serine binding region. SO:ke A tRNA sequence that has a threonine anticodon, and a 3' threonine binding region. threonyl tRNA threonyl-transfer ribonucleic acid sequence threonyl-transfer RNA SO:0000270 threonyl_tRNA A tRNA sequence that has a threonine anticodon, and a 3' threonine binding region. SO:ke A tRNA sequence that has a tryptophan anticodon, and a 3' tryptophan binding region. tryptophanyl tRNA tryptophanyl-transfer RNA tryptophanyl-transfer ribonucleic acid sequence SO:0000271 tryptophanyl_tRNA A tRNA sequence that has a tryptophan anticodon, and a 3' tryptophan binding region. SO:ke A tRNA sequence that has a tyrosine anticodon, and a 3' tyrosine binding region. tyrosyl tRNA tyrosyl-transfer ribonucleic acid sequence tyrosyl-transfer RNA SO:0000272 tyrosyl_tRNA A tRNA sequence that has a tyrosine anticodon, and a 3' tyrosine binding region. 
SO:ke A tRNA sequence that has a valine anticodon, and a 3' valine binding region. valyl tRNA valyl-transfer ribonucleic acid sequence valyl-transfer RNA SO:0000273 valyl_tRNA A tRNA sequence that has a valine anticodon, and a 3' valine binding region. SO:ke A small nuclear RNA molecule involved in pre-mRNA splicing and processing. INSDC_feature:ncRNA http://en.wikipedia.org/wiki/SnRNA INSDC_qualifier:snRNA small nuclear RNA sequence SO:0000274 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. snRNA A small nuclear RNA molecule involved in pre-mRNA splicing and processing. PMID:11733745 WB:ems http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/SnRNA wiki Small nucleolar RNAs (snoRNAs) are short non-coding RNAs enriched in the nucleolus as components of small nucleolar ribonucleoproteins. They guide ribose methylation and pseudouridylation of rRNAs and snRNAs, and a subgroup regulate excision of rRNAs from rRNA precursor transcripts. snoRNAs may also guide rRNA acetylation and tRNA methylation, and regulate mRNA abundance and alternative splicing. INSDC_feature:ncRNA INSDC_qualifier:snoRNA small nucleolar RNA sequence SO:0000275 Updated the definition of snoRNA (SO:0000275) from "A snoRNA (small nucleolar RNA) is any one of a class of small RNAs that are associated with the eukaryotic nucleus as components of small nucleolar ribonucleoproteins. They participate in the processing or modifications of many RNAs, mostly ribosomal RNAs (rRNAs) though snoRNAs are also known to target other classes of RNA, including spliceosomal RNAs, tRNAs, and mRNAs via a stretch of sequence that is complementary to a sequence in the targeted RNA." to "Small nucleolar RNAs (snoRNAs) are short non-coding RNAs enriched in the nucleolus as components of small nucleolar ribonucleoproteins. They guide ribose methylation and pseudouridylation of rRNAs and snRNAs, and a subgroup regulate excision of rRNAs from rRNA precursor transcripts. 
snoRNAs may also guide rRNA acetylation and tRNA methylation, and regulate mRNA abundance and alternative splicing." to acknowledge that some snoRNAs functionally localize to other compartments (cytoplasm or even secreted). See GitHub Issue #578. snoRNA Small nucleolar RNAs (snoRNAs) are short non-coding RNAs enriched in the nucleolus as components of small nucleolar ribonucleoproteins. They guide ribose methylation and pseudouridylation of rRNAs and snRNAs, and a subgroup regulate excision of rRNAs from rRNA precursor transcripts. snoRNAs may also guide rRNA acetylation and tRNA methylation, and regulate mRNA abundance and alternative splicing. GOC:kgc PMID:31828325 Small, ~22-nt, RNA molecule that is the endogenous transcript of a miRNA gene (or the product of other non coding RNA genes). Micro RNAs are produced from precursor molecules (SO:0001244) that can form local hairpin structures, which ordinarily are processed (usually via the Dicer pathway) such that a single miRNA molecule accumulates from one arm of a hairpin precursor molecule. Micro RNAs may trigger the cleavage of their target molecules or act as translational repressors. SO:0000649 INSDC_feature:ncRNA http://en.wikipedia.org/wiki/MiRNA http://en.wikipedia.org/wiki/StRNA INSDC_qualifier:miRNA micro RNA microRNA small temporal RNA stRNA sequence SO:0000276 miRNA Small, ~22-nt, RNA molecule that is the endogenous transcript of a miRNA gene (or the product of other non coding RNA genes). Micro RNAs are produced from precursor molecules (SO:0001244) that can form local hairpin structures, which ordinarily are processed (usually via the Dicer pathway) such that a single miRNA molecule accumulates from one arm of a hairpin precursor molecule. Micro RNAs may trigger the cleavage of their target molecules or act as translational repressors. 
PMID:11081512 PMID:12592000 http://en.wikipedia.org/wiki/MiRNA wiki http://en.wikipedia.org/wiki/StRNA wiki An attribute describing a sequence that is bound by another molecule. bound by factor sequence SO:0000277 Formerly called transcript_by_bound_factor. bound_by_factor An attribute describing a sequence that is bound by another molecule. SO:ke A transcript that is bound by a nucleic acid. transcript bound by nucleic acid sequence SO:0000278 Formerly called transcript_by_bound_nucleic_acid. transcript_bound_by_nucleic_acid A transcript that is bound by a nucleic acid. SO:xp A transcript that is bound by a protein. transcript bound by protein sequence SO:0000279 Formerly called transcript_by_bound_protein. transcript_bound_by_protein A transcript that is bound by a protein. SO:xp A gene that is engineered. engineered gene sequence SO:0000280 engineered_gene A gene that is engineered. SO:xp A gene that is engineered and foreign. engineered foreign gene sequence SO:0000281 engineered_foreign_gene A gene that is engineered and foreign. SO:xp An mRNA with a minus 1 frameshift. mRNA with minus 1 frameshift sequence SO:0000282 mRNA_with_minus_1_frameshift An mRNA with a minus 1 frameshift. SO:xp A transposable_element that is engineered and foreign. engineered foreign transposable element gene sequence SO:0000283 engineered_foreign_transposable_element_gene A transposable_element that is engineered and foreign. SO:xp The recognition site is bipartite and interrupted. sequence SO:0000284 type_I_enzyme_restriction_site true The recognition site is bipartite and interrupted. http://www.promega.com A gene that is foreign. foreign gene sequence SO:0000285 foreign_gene A gene that is foreign. SO:xp A sequence directly repeated at both ends of a defined sequence, of the sort typically found in retroviruses. 
INSDC_feature:repeat_region http://en.wikipedia.org/wiki/Long_terminal_repeat INSDC_qualifier:long_terminal_repeat LTR long terminal repeat sequence direct terminal repeat SO:0000286 long_terminal_repeat A sequence directly repeated at both ends of a defined sequence, of the sort typically found in retroviruses. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Long_terminal_repeat wiki A gene that is a fusion. http://en.wikipedia.org/wiki/Fusion_gene fusion gene sequence SO:0000287 fusion_gene A gene that is a fusion. SO:xp http://en.wikipedia.org/wiki/Fusion_gene wiki A fusion gene that is engineered. engineered fusion gene sequence SO:0000288 engineered_fusion_gene A fusion gene that is engineered. SO:xp A repeat_region containing repeat_units of 2 to 10 bp repeated in tandem. INSDC_feature:repeat_region http://en.wikipedia.org/wiki/Microsatellite INSDC_qualifier:microsatellite STR microsatellite locus microsatellite marker short tandem repeat sequence SO:0000289 microsatellite A repeat_region containing repeat_units of 2 to 10 bp repeated in tandem. NCBI:th http://www.informatics.jax.org/silver/glossary.shtml http://en.wikipedia.org/wiki/Microsatellite wiki STR http://www.ncbi.nlm.nih.gov/books/NBK21126/def-item/A9651/ A region of a repeating dinucleotide sequence (two bases). dinucleotide repeat microsatellite dinucleotide repeat microsatellite feature dinucleotide repeat microsatellite locus dinucleotide repeat microsatellite marker sequence SO:0000290 dinucleotide_repeat_microsatellite_feature A region of a repeating trinucleotide sequence (three bases). trinucleotide repeat microsatellite trinucleotide repeat microsatellite feature trinucleotide repeat microsatellite locus sequence trinucleotide repeat microsatellite marker SO:0000291 trinucleotide_repeat_microsatellite_feature sequence SO:0000292 repetitive_element true A repetitive element that is engineered and foreign.
engineered foreign repetitive element sequence SO:0000293 engineered_foreign_repetitive_element A repetitive element that is engineered and foreign. SO:xp The sequence is complementarily repeated on the opposite strand. It is a palindrome, and it may, or may not be hyphenated. Examples: GCTGATCAGC, or GCTGA-----TCAGC. INSDC_feature:repeat_region http://en.wikipedia.org/wiki/Inverted_repeat INSDC_qualifier:inverted inverted repeat inverted repeat sequence sequence SO:0000294 inverted_repeat The sequence is complementarily repeated on the opposite strand. It is a palindrome, and it may, or may not be hyphenated. Examples: GCTGATCAGC, or GCTGA-----TCAGC. SO:ke http://en.wikipedia.org/wiki/Inverted_repeat wiki A type of spliceosomal intron spliced by the U12 spliceosome, that includes U11, U12, U4atac/U6atac and U5 snRNAs. U12 intron U12-dependent intron sequence SO:0000295 May have either GT-AC or AT-AC 5' and 3' boundaries. U12_intron A type of spliceosomal intron spliced by the U12 spliceosome, that includes U11, U12, U4atac/U6atac and U5 snRNAs. PMID:9428511 A region of nucleic acid from which replication initiates; includes sequences that are recognized by replication proteins, the site from which the first separation of complementary strands occurs, and specific replication start sites. http://en.wikipedia.org/wiki/Origin_of_replication INSDC_feature:rep_origin ori origin of replication sequence SO:0000296 origin_of_replication A region of nucleic acid from which replication initiates; includes sequences that are recognized by replication proteins, the site from which the first separation of complementary strands occurs, and specific replication start sites. 
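The inverted_repeat definition above (a sequence complementarily repeated on the opposite strand, optionally hyphenated) can be checked with a short Python sketch. The helper names are hypothetical, and treating "-" as the spacer character simply follows the two examples given in the definition.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def is_inverted_repeat(seq, gap_char="-"):
    """True if seq reads the same on the opposite strand.

    A run of gap_char in the middle is treated as a spacer between
    the two arms, as in the hyphenated example GCTGA-----TCAGC.
    """
    arms = [part for part in seq.upper().split(gap_char) if part]
    if len(arms) == 1:
        # Unhyphenated: the whole sequence equals its reverse complement.
        return arms[0] == reverse_complement(arms[0])
    if len(arms) == 2:
        # Hyphenated: the left arm equals the reverse complement of the right arm.
        return arms[0] == reverse_complement(arms[1])
    return False
```

Both examples from the definition satisfy the check, while a direct repeat such as GCTGAGCTGA does not.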
NCBI:cf http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Origin_of_replication wiki Displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region; also used to describe the displacement of a region of one strand of duplex DNA by a single stranded invader in the reaction catalyzed by RecA protein. http://en.wikipedia.org/wiki/D_loop D-loop INSDC_feature:D-loop sequence displacement loop SO:0000297 Moved from is_a: SO:0000296 origin_of_replication to is_a: SO:0001411 biological_region after Terrence Murphy (INSDC) pointed out that the D loop can also refer to a loop in DNA repair, which is not an origin of replication. See GitHub Issue #417 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/417) D_loop Displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region; also used to describe the displacement of a region of one strand of duplex DNA by a single stranded invader in the reaction catalyzed by RecA protein. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/D_loop wiki A feature where there has been exchange of genetic material in the event of mitosis or meiosis. INSDC_feature:misc_recomb INSDC_qualifier:other recombination feature sequence SO:0000298 recombination_feature A location where recombination occurs during mitosis or meiosis. specific recombination site sequence SO:0000299 specific_recombination_site A location where a gene is rearranged due to recombination during mitosis or meiosis. recombination feature of rearranged gene sequence SO:0000300 recombination_feature_of_rearranged_gene A feature where recombination has occurred for the purpose of generating a diversity in the immune system.
vertebrate immune system gene recombination feature sequence SO:0000301 vertebrate_immune_system_gene_recombination_feature Recombination signal including J-heptamer, J-spacer and J-nonamer in 5' of J-region of a J-gene or J-sequence. J gene recombination feature J-RS sequence SO:0000302 J_gene_recombination_feature Recombination signal including J-heptamer, J-spacer and J-nonamer in 5' of J-region of a J-gene or J-sequence. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Part of the primary transcript that is clipped off during processing. sequence SO:0000303 clip Part of the primary transcript that is clipped off during processing. SO:ke The recognition site is either palindromic, partially palindromic or an interrupted palindrome. Cleavage occurs within the recognition site. sequence SO:0000304 type_II_enzyme_restriction_site true The recognition site is either palindromic, partially palindromic or an interrupted palindrome. Cleavage occurs within the recognition site. http://www.promega.com A modified nucleotide, i.e. a nucleotide other than A, T, C, G. INSDC_feature:modified_base modified base site sequence SO:0000305 Modified base:<modified_base>. modified_DNA_base A modified nucleotide, i.e. a nucleotide other than A, T, C, G. http://www.insdc.org/files/feature_table.html A nucleotide modified by methylation. methylated base feature sequence SO:0000306 methylated_DNA_base_feature A nucleotide modified by methylation. SO:ke Regions of a few hundred to a few thousand bases in vertebrate genomes that are relatively GC and CpG rich; they are typically unmethylated and often found near the 5' ends of genes. http://en.wikipedia.org/wiki/CpG_island CG island CpG island sequence SO:0000307 CpG_island Regions of a few hundred to a few thousand bases in vertebrate genomes that are relatively GC and CpG rich; they are typically unmethylated and often found near the 5' ends of genes.
SO:rd http://en.wikipedia.org/wiki/CpG_island wiki sequence SO:0000308 sequence_feature_locating_method true sequence SO:0000309 computed_feature true sequence SO:0000310 predicted_ab_initio_computation true . sequence SO:0000311 similar to:<sequence_id> computed_feature_by_similarity true . SO:ma Attribute to describe a feature that has been experimentally verified. experimentally determined sequence SO:0000312 experimentally_determined Attribute to describe a feature that has been experimentally verified. SO:ke A double-helical region of nucleic acid formed by base-pairing between adjacent (inverted) complementary sequences. SO:0000019 http://en.wikipedia.org/wiki/Stem_loop INSDC_feature:stem_loop RNA_hairpin_loop stem loop stem-loop sequence SO:0000313 stem_loop A double-helical region of nucleic acid formed by base-pairing between adjacent (inverted) complementary sequences. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Stem_loop wiki A repeat where the same sequence is repeated in the same direction. Example: GCTGA-followed by-GCTGA. INSDC_feature:repeat_region http://en.wikipedia.org/wiki/Direct_repeat INSDC_qualifier:direct direct repeat sequence SO:0000314 direct_repeat A repeat where the same sequence is repeated in the same direction. Example: GCTGA-followed by-GCTGA. SO:ke http://en.wikipedia.org/wiki/Direct_repeat wiki The first base where RNA polymerase begins to synthesize the RNA transcript. INSDC_feature:misc_feature INSDC_note:transcription_start_site transcription start site transcription_start_site sequence SO:0000315 Added relationship is_a SO:0002309 core_promoter_element with the creation of core_promoter_element as part of GREEKC initiative August 2020 - Dave Sant. TSS The first base where RNA polymerase begins to synthesize the RNA transcript. SO:ke A contiguous sequence which begins with, and includes, a start codon and ends with, and includes, a stop codon. 
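The CDS definition above (a contiguous sequence that begins with, and includes, a start codon and ends with, and includes, a stop codon) can be sketched as a simple Python check. The function name `looks_like_cds` is hypothetical. The codon sets assume the standard genetic code (ATG start; TAA, TAG, TGA stops), and the multiple-of-three length requirement is a further assumption that would reject the frameshifted mRNAs described elsewhere in this ontology.

```python
# Assumption: standard genetic code; other translation tables differ.
START_CODONS = frozenset({"ATG"})
STOP_CODONS = frozenset({"TAA", "TAG", "TGA"})

def looks_like_cds(seq, start_codons=START_CODONS, stop_codons=STOP_CODONS):
    """True if seq starts with a start codon and ends with a stop codon.

    RNA input is accepted by mapping U to T. Internal stop codons are
    not checked, because the definition above does not mention them.
    """
    s = seq.upper().replace("U", "T")
    # Need at least a start and a stop codon, in whole codons.
    if len(s) < 6 or len(s) % 3 != 0:
        return False
    return s[:3] in start_codons and s[-3:] in stop_codons
```

Passing the codon sets as parameters keeps the check usable with alternative start codons or non-standard genetic codes.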
INSDC_feature:CDS coding sequence coding_sequence sequence SO:0000316 CDS A contiguous sequence which begins with, and includes, a start codon and ends with, and includes, a stop codon. SO:ma Complementary DNA; A piece of DNA copied from an mRNA and spliced into a vector for propagation in a suitable host. cDNA clone sequence SO:0000317 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. cDNA_clone Complementary DNA; A piece of DNA copied from an mRNA and spliced into a vector for propagation in a suitable host. http://seqcore.brcf.med.umich.edu/doc/educ/dnapr/mbglossary/mbgloss.html First codon to be translated by a ribosome. http://en.wikipedia.org/wiki/Start_codon initiation codon start codon sequence SO:0000318 start_codon First codon to be translated by a ribosome. SO:ke http://en.wikipedia.org/wiki/Start_codon wiki In mRNA, a set of three nucleotides that indicates the end of information for protein synthesis. http://en.wikipedia.org/wiki/Stop_codon stop codon sequence SO:0000319 stop_codon In mRNA, a set of three nucleotides that indicates the end of information for protein synthesis. SO:ke http://en.wikipedia.org/wiki/Stop_codon wiki Sequences within the intron that modulate splice site selection for some introns. intronic splice enhancer sequence SO:0000320 intronic_splice_enhancer Sequences within the intron that modulate splice site selection for some introns. SO:ke An mRNA with a plus 1 frameshift. mRNA with plus 1 frameshift sequence SO:0000321 mRNA_with_plus_1_frameshift An mRNA with a plus 1 frameshift. SO:ke A region of nucleotide sequence targeted by a nuclease enzyme that is found cleaved more than would be expected by chance. nuclease hypersensitive site sequence SO:0000322 Relationship to accessible_DNA_region added 11 Feb 2021. GREEKC pointed out that this is an assay based term, but we need a biological term for the accessible DNA. See GitHub Issue #531. 
nuclease_hypersensitive_site The first base to be translated into protein. coding start translation initiation site sequence translation start SO:0000323 coding_start The first base to be translated into protein. SO:ke A nucleotide sequence that may be used to identify a larger sequence. sequence SO:0000324 tag A nucleotide sequence that may be used to identify a larger sequence. SO:ke A primary transcript encoding a large ribosomal subunit RNA. 35S rRNA primary transcript rRNA large subunit primary transcript sequence SO:0000325 rRNA_large_subunit_primary_transcript A primary transcript encoding a large ribosomal subunit RNA. SO:ke A short diagnostic sequence tag, serial analysis of gene expression (SAGE), that allows the quantitative and simultaneous analysis of a large number of transcripts. SAGE tag sequence SO:0000326 SAGE_tag A short diagnostic sequence tag, serial analysis of gene expression (SAGE), that allows the quantitative and simultaneous analysis of a large number of transcripts. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=7570003&dopt=Abstract The last base to be translated into protein. It does not include the stop codon. coding end translation termination site translation_end sequence SO:0000327 coding_end The last base to be translated into protein. It does not include the stop codon. SO:ke A DNA sequence used experimentally to detect the presence or absence of a complementary nucleic acid. microarray oligo microarray oligonucleotide sequence SO:0000328 microarray_oligo An mRNA with a plus 2 frameshift. mRNA with plus 2 frameshift sequence SO:0000329 mRNA_with_plus_2_frameshift An mRNA with a plus 2 frameshift. SO:xp Region of sequence similarity by descent from a common ancestor. INSDC_feature:misc_feature http://en.wikipedia.org/wiki/Conserved_region INSDC_note:conserved_region conserved region sequence SO:0000330 conserved_region Region of sequence similarity by descent from a common ancestor. 
SO:ke http://en.wikipedia.org/wiki/Conserved_region wiki Short (typically a few hundred base pairs) DNA sequence that has a single occurrence in a genome and whose location and base sequence are known. INSDC_feature:STS sequence tag site sequence SO:0000331 STS Short (typically a few hundred base pairs) DNA sequence that has a single occurrence in a genome and whose location and base sequence are known. http://www.biospace.com Coding region of sequence similarity by descent from a common ancestor. coding conserved region sequence SO:0000332 coding_conserved_region Coding region of sequence similarity by descent from a common ancestor. SO:ke The boundary between two exons in a processed transcript. exon junction sequence SO:0000333 exon_junction The boundary between two exons in a processed transcript. SO:ke Non-coding region of sequence similarity by descent from a common ancestor. conserved non-coding element conserved non-coding sequence nc conserved region noncoding conserved region sequence SO:0000334 nc_conserved_region Non-coding region of sequence similarity by descent from a common ancestor. SO:ke A mRNA with a minus 2 frameshift. mRNA with minus 2 frameshift sequence SO:0000335 mRNA_with_minus_2_frameshift A mRNA with a minus 2 frameshift. SO:ke A sequence that closely resembles a known functional gene, at another locus within a genome, that is non-functional as a consequence of (usually several) mutations that prevent either its transcription or translation (or both). In general, pseudogenes result from either reverse transcription of a transcript of their "normal" paralog (SO:0000043) (in which case the pseudogene typically lacks introns and includes a poly(A) tail) or from recombination (SO:0000044) (in which case the pseudogene is typically a tandem duplication of its "normal" paralog). 
INSDC_feature:gene http://en.wikipedia.org/wiki/Pseudogene INSDC_qualifier:pseudo INSDC_qualifier:unknown sequence SO:0000336 pseudogene A sequence that closely resembles a known functional gene, at another locus within a genome, that is non-functional as a consequence of (usually several) mutations that prevent either its transcription or translation (or both). In general, pseudogenes result from either reverse transcription of a transcript of their "normal" paralog (SO:0000043) (in which case the pseudogene typically lacks introns and includes a poly(A) tail) or from recombination (SO:0000044) (in which case the pseudogene is typically a tandem duplication of its "normal" paralog). http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html http://en.wikipedia.org/wiki/Pseudogene wiki A double stranded RNA duplex, at least 20bp long, used experimentally to inhibit gene function by RNA interference. RNAi reagent sequence SO:0000337 RNAi_reagent A double stranded RNA duplex, at least 20bp long, used experimentally to inhibit gene function by RNA interference. SO:rd A highly repetitive and short (100-500 base pair) transposable element with terminal inverted repeats (TIR) and target site duplication (TSD). MITEs do not encode proteins. miniature inverted repeat transposable element sequence SO:0000338 MITE A highly repetitive and short (100-500 base pair) transposable element with terminal inverted repeats (TIR) and target site duplication (TSD). MITEs do not encode proteins. http://www.pnas.org/cgi/content/full/97/18/10083 A region in a genome which promotes recombination. http://en.wikipedia.org/wiki/Recombination_hotspot recombination hotspot sequence SO:0000339 recombination_hotspot A region in a genome which promotes recombination. SO:rd http://en.wikipedia.org/wiki/Recombination_hotspot wiki Structural unit composed of a nucleic acid molecule which controls its own replication through the interaction of specific proteins at one or more origins of replication. 
http://en.wikipedia.org/wiki/Chromosome sequence SO:0000340 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. chromosome Structural unit composed of a nucleic acid molecule which controls its own replication through the interaction of specific proteins at one or more origins of replication. SO:ma http://en.wikipedia.org/wiki/Chromosome wiki A cytologically distinguishable feature of a chromosome, often made visible by staining, and usually alternating light and dark. http://en.wikipedia.org/wiki/Cytological_band chromosome band cytoband cytological band sequence SO:0000341 chromosome_band A cytologically distinguishable feature of a chromosome, often made visible by staining, and usually alternating light and dark. SO:ma http://en.wikipedia.org/wiki/Cytological_band wiki A region specifically recognised by a recombinase where recombination can occur during mitosis or meiosis. site specific recombination target region sequence SO:0000342 site_specific_recombination_target_region A region of sequence, aligned to another sequence with some statistical significance, using an algorithm such as BLAST or SIM4. sequence SO:0000343 match A region of sequence, aligned to another sequence with some statistical significance, using an algorithm such as BLAST or SIM4. SO:ke Region of a transcript that regulates splicing. splice enhancer sequence SO:0000344 splice_enhancer Region of a transcript that regulates splicing. SO:ke A tag produced from a single sequencing read from a cDNA clone or PCR product; typically a few hundred base pairs long. expressed sequence tag sequence SO:0000345 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. EST A tag produced from a single sequencing read from a cDNA clone or PCR product; typically a few hundred base pairs long. SO:ke Cre-Recombination target sequence. loxP site sequence Cre-recombination target region SO:0000346 loxP_site A match against a nucleotide sequence. 
nucleotide match sequence SO:0000347 nucleotide_match A match against a nucleotide sequence. SO:ke An attribute describing a sequence consisting of nucleobases bound to repeating units. The forms found in nature are deoxyribonucleic acid (DNA), where the repeating units are 2-deoxy-D-ribose rings connected to a phosphate backbone, and ribonucleic acid (RNA), where the repeating units are D-ribose rings connected to a phosphate backbone. http://en.wikipedia.org/wiki/Nucleic_acid nucleic acid sequence SO:0000348 nucleic_acid An attribute describing a sequence consisting of nucleobases bound to repeating units. The forms found in nature are deoxyribonucleic acid (DNA), where the repeating units are 2-deoxy-D-ribose rings connected to a phosphate backbone, and ribonucleic acid (RNA), where the repeating units are D-ribose rings connected to a phosphate backbone. CHEBI:33696 RSC:cb http://en.wikipedia.org/wiki/Nucleic_acid wiki A match against a protein sequence. protein match sequence SO:0000349 protein_match A match against a protein sequence. SO:ke An inversion site found on the Saccharomyces cerevisiae 2 micron plasmid. FLP recombination target region FRT site sequence SO:0000350 FRT_site An inversion site found on the Saccharomyces cerevisiae 2 micron plasmid. SO:ma An attribute to describe a sequence of nucleotides, nucleotide analogs, or amino acids that has been designed by an experimenter and which may, or may not, correspond with any natural sequence. synthetic sequence sequence SO:0000351 synthetic_sequence An attribute to describe a sequence of nucleotides, nucleotide analogs, or amino acids that has been designed by an experimenter and which may, or may not, correspond with any natural sequence. SO:ma An attribute describing a sequence consisting of nucleobases bound to a repeating unit made of a 2-deoxy-D-ribose ring connected to a phosphate backbone.
sequence SO:0000352 DNA An attribute describing a sequence consisting of nucleobases bound to a repeating unit made of a 2-deoxy-D-ribose ring connected to a phosphate backbone. RSC:cb A sequence of nucleotides that has been algorithmically derived from an alignment of two or more different sequences. http://en.wikipedia.org/wiki/Sequence_assembly sequence assembly sequence SO:0000353 sequence_assembly A sequence of nucleotides that has been algorithmically derived from an alignment of two or more different sequences. SO:ma http://en.wikipedia.org/wiki/Sequence_assembly wiki A region of intronic nucleotide sequence targeted by a nuclease enzyme. group 1 intron homing endonuclease target region sequence SO:0000354 group_1_intron_homing_endonuclease_target_region A region of intronic nucleotide sequence targeted by a nuclease enzyme. SO:ke A region of the genome which is co-inherited as the result of the lack of historic recombination within it. haplotype block sequence SO:0000355 haplotype_block A region of the genome which is co-inherited as the result of the lack of historic recombination within it. SO:ma An attribute describing a sequence consisting of nucleobases bound to a repeating unit made of a D-ribose ring connected to a phosphate backbone. sequence SO:0000356 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. RNA An attribute describing a sequence consisting of nucleobases bound to a repeating unit made of a D-ribose ring connected to a phosphate backbone. RSC:cb An attribute describing a region that is bounded either side by a particular kind of region. sequence SO:0000357 flanked An attribute describing a region that is bounded either side by a particular kind of region. SO:ke true An attribute describing sequence that is flanked by Lox-P sites. http://en.wikipedia.org/wiki/Floxed sequence SO:0000359 floxed An attribute describing sequence that is flanked by Lox-P sites. 
SO:ke http://en.wikipedia.org/wiki/Floxed wiki A set of (usually) three nucleotide bases in a DNA or RNA sequence, which together code for a unique amino acid or the termination of translation and are contained within the CDS. http://en.wikipedia.org/wiki/Codon sequence SO:0000360 codon A set of (usually) three nucleotide bases in a DNA or RNA sequence, which together code for a unique amino acid or the termination of translation and are contained within the CDS. SO:ke http://en.wikipedia.org/wiki/Codon wiki An attribute to describe sequence that is flanked by the FLP recombinase recognition site, FRT. FRT flanked sequence SO:0000361 FRT_flanked An attribute to describe sequence that is flanked by the FLP recombinase recognition site, FRT. SO:ke A cDNA clone constructed from more than one mRNA. Usually an experimental artifact. invalidated by chimeric cDNA sequence SO:0000362 invalidated_by_chimeric_cDNA A cDNA clone constructed from more than one mRNA. Usually an experimental artifact. SO:ma A transgene that is floxed. floxed gene sequence SO:0000363 floxed_gene A transgene that is floxed. SO:xp The region of sequence surrounding a transposable element. transposable element flanking region sequence SO:0000364 transposable_element_flanking_region The region of sequence surrounding a transposable element. SO:ke A region encoding an integrase which acts at a site adjacent to it (attI_site) to insert DNA which must include but is not limited to an attC_site. http://en.wikipedia.org/wiki/Integron sequence SO:0000365 integron A region encoding an integrase which acts at a site adjacent to it (attI_site) to insert DNA which must include but is not limited to an attC_site. SO:as http://en.wikipedia.org/wiki/Integron wiki The junction where an insertion occurred. insertion site sequence SO:0000366 insertion_site The junction where an insertion occurred. 
SO:ke A region within an integron, adjacent to an integrase, at which site specific recombination involving an attC_site takes place. attI site sequence SO:0000367 attI_site A region within an integron, adjacent to an integrase, at which site specific recombination involving an attC_site takes place. SO:as The junction in a genome where a transposable_element has inserted. transposable element insertion site sequence SO:0000368 transposable_element_insertion_site The junction in a genome where a transposable_element has inserted. SO:ke sequence SO:0000369 integrase_coding_region true A non-coding RNA less than 200 nucleotides long, usually with a specific secondary structure, that acts to regulate gene expression. These include short ncRNAs such as piRNA, miRNA and siRNAs (among others). small regulatory ncRNA sequence SO:0000370 small_regulatory_ncRNA A non-coding RNA less than 200 nucleotides long, usually with a specific secondary structure, that acts to regulate gene expression. These include short ncRNAs such as piRNA, miRNA and siRNAs (among others). PMID:28541282 PomBase:al SO:ma A transposon that encodes function required for conjugation. conjugative transposon sequence SO:0000371 conjugative_transposon A transposon that encodes function required for conjugation. http://www.sci.sdsu.edu/~smaloy/Glossary/C.html An RNA sequence that has catalytic activity with or without an associated ribonucleoprotein. enzymatic RNA sequence SO:0000372 This was moved to be a child of transcript (SO:0000673) because some enzymatic RNA regions are part of primary transcripts and some are part of processed transcripts. Moved under ncRNA on 18 Nov 2021. See GitHub Issue #533. enzymatic_RNA An RNA sequence that has catalytic activity with or without an associated ribonucleoprotein. RSC:cb A recombinationally rearranged gene by inversion. recombinationally inverted gene sequence SO:0000373 recombinationally_inverted_gene A recombinationally rearranged gene by inversion. 
SO:xp An RNA with catalytic activity. INSDC_feature:ncRNA http://en.wikipedia.org/wiki/Ribozyme INSDC_qualifier:ribozyme sequence SO:0000374 ribozyme An RNA with catalytic activity. SO:ma http://en.wikipedia.org/wiki/Ribozyme wiki Cytosolic 5.8S rRNA is an RNA component of the large subunit of cytosolic ribosomes in eukaryotes. http://en.wikipedia.org/wiki/5.8S_ribosomal_RNA cytosolic 5.8S LSU rRNA cytosolic 5.8S rRNA cytosolic 5.8S ribosomal RNA cytosolic rRNA 5 8S sequence SO:0000375 Dave Sant removed '5_8S rRNA is also found in archaea.' from definition due to lack of references mentioning this on 1 Feb 2021. See GitHub Issue #505. Renamed from rRNA_5_8S to cytosolic_5_8S_rRNA on 10 June 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493. cytosolic_5_8S_rRNA Cytosolic 5.8S rRNA is an RNA component of the large subunit of cytosolic ribosomes in eukaryotes. https://rfam.xfam.org/family/RF00002 http://en.wikipedia.org/wiki/5.8S_ribosomal_RNA wiki A small (184-nt in E. coli) RNA that forms a hairpin type structure. 6S RNA associates with RNA polymerase in a highly specific manner. 6S RNA represses expression from a sigma70-dependent promoter during stationary phase. http://en.wikipedia.org/wiki/6S_RNA 6S RNA RNA 6S sequence SO:0000376 RNA_6S A small (184-nt in E. coli) RNA that forms a hairpin type structure. 6S RNA associates with RNA polymerase in a highly specific manner. 6S RNA represses expression from a sigma70-dependent promoter during stationary phase. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00013 http://en.wikipedia.org/wiki/6S_RNA wiki An enterobacterial RNA that binds the CsrA protein. The CsrB RNAs contain a conserved motif CAGGXXG that is found in up to 18 copies and has been suggested to bind CsrA. 
The Csr regulatory system has a strong negative regulatory effect on glycogen biosynthesis, glyconeogenesis and glycogen catabolism and a positive regulatory effect on glycolysis. In other bacteria such as Erwinia carotovora the RsmA protein has been shown to regulate the production of virulence determinants, such as extracellular enzymes. RsmA binds to RsmB regulatory RNA which is also a member of this family. CsrB RsmB RNA CsrB-RsmB RNA sequence SO:0000377 CsrB_RsmB_RNA An enterobacterial RNA that binds the CsrA protein. The CsrB RNAs contain a conserved motif CAGGXXG that is found in up to 18 copies and has been suggested to bind CsrA. The Csr regulatory system has a strong negative regulatory effect on glycogen biosynthesis, glyconeogenesis and glycogen catabolism and a positive regulatory effect on glycolysis. In other bacteria such as Erwinia carotovora the RsmA protein has been shown to regulate the production of virulence determinants, such as extracellular enzymes. RsmA binds to RsmB regulatory RNA which is also a member of this family. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00018 DsrA RNA regulates both transcription, by overcoming transcriptional silencing by the nucleoid-associated H-NS protein, and translation, by promoting efficient translation of the stress sigma factor, RpoS. These two activities of DsrA can be separated by mutation: the first of three stem-loops of the 85 nucleotide RNA is necessary for RpoS translation but not for anti-H-NS action, while the second stem-loop is essential for antisilencing and less critical for RpoS translation. The third stem-loop, which behaves as a transcription terminator, can be substituted by the trp transcription terminator without loss of either DsrA function. The sequence of the first stem-loop of DsrA is complementary with the upstream leader portion of RpoS messenger RNA, suggesting that pairing of DsrA with the RpoS message might be important for translational regulation. 
http://en.wikipedia.org/wiki/DsrA_RNA DsrA RNA sequence SO:0000378 DsrA_RNA DsrA RNA regulates both transcription, by overcoming transcriptional silencing by the nucleoid-associated H-NS protein, and translation, by promoting efficient translation of the stress sigma factor, RpoS. These two activities of DsrA can be separated by mutation: the first of three stem-loops of the 85 nucleotide RNA is necessary for RpoS translation but not for anti-H-NS action, while the second stem-loop is essential for antisilencing and less critical for RpoS translation. The third stem-loop, which behaves as a transcription terminator, can be substituted by the trp transcription terminator without loss of either DsrA function. The sequence of the first stem-loop of DsrA is complementary with the upstream leader portion of RpoS messenger RNA, suggesting that pairing of DsrA with the RpoS message might be important for translational regulation. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00014 http://en.wikipedia.org/wiki/DsrA_RNA wiki A small untranslated RNA involved in expression of the dipeptide and oligopeptide transport systems in Escherichia coli. http://en.wikipedia.org/wiki/GcvB_RNA GcvB RNA sequence SO:0000379 GcvB_RNA A small untranslated RNA involved in expression of the dipeptide and oligopeptide transport systems in Escherichia coli. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00022 http://en.wikipedia.org/wiki/GcvB_RNA wiki A small catalytic RNA motif that catalyzes self-cleavage reaction. Its name comes from its secondary structure which resembles a carpenter's hammer. The hammerhead ribozyme is involved in the replication of some viroid and some satellite RNAs. INSDC_feature:ncRNA http://en.wikipedia.org/wiki/Hammerhead_ribozyme INSDC_qualifier:hammerhead_ribozyme hammerhead ribozyme sequence SO:0000380 hammerhead_ribozyme A small catalytic RNA motif that catalyzes self-cleavage reaction. 
Its name comes from its secondary structure which resembles a carpenter's hammer. The hammerhead ribozyme is involved in the replication of some viroid and some satellite RNAs. PMID:2436805 http://en.wikipedia.org/wiki/Hammerhead_ribozyme wiki A group II intron that recognizes IBS1/EBS1 and IBS2/EBS2 for the 5-prime exon and gamma/gamma-prime for the 3-prime exon. group IIA intron sequence SO:0000381 group_IIA_intron A group II intron that recognizes IBS1/EBS1 and IBS2/EBS2 for the 5-prime exon and gamma/gamma-prime for the 3-prime exon. PMID:20463000 A group II intron that recognizes IBS1/EBS1 and IBS2/EBS2 for the 5-prime exon and IBS3/EBS3 for the 3-prime exon. group IIB intron sequence SO:0000382 group_IIB_intron A group II intron that recognizes IBS1/EBS1 and IBS2/EBS2 for the 5-prime exon and IBS3/EBS3 for the 3-prime exon. PMID:20463000 A non-translated 93 nt antisense RNA that binds its target ompF mRNA and regulates ompF expression by inhibiting translation and inducing degradation of the message. http://en.wikipedia.org/wiki/MicF_RNA MicF RNA sequence SO:0000383 MicF_RNA A non-translated 93 nt antisense RNA that binds its target ompF mRNA and regulates ompF expression by inhibiting translation and inducing degradation of the message. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00033 http://en.wikipedia.org/wiki/MicF_RNA wiki A small untranslated RNA which is induced in response to oxidative stress in Escherichia coli. Acts as a global regulator to activate or repress the expression of as many as 40 genes, including the fhlA-encoded transcriptional activator and the rpoS-encoded sigma(s) subunit of RNA polymerase. OxyS is bound by the Hfq protein, that increases the OxyS RNA interaction with its target messages. http://en.wikipedia.org/wiki/OxyS_RNA OxyS RNA sequence SO:0000384 OxyS_RNA A small untranslated RNA which is induced in response to oxidative stress in Escherichia coli. 
Acts as a global regulator to activate or repress the expression of as many as 40 genes, including the fhlA-encoded transcriptional activator and the rpoS-encoded sigma(s) subunit of RNA polymerase. OxyS is bound by the Hfq protein, that increases the OxyS RNA interaction with its target messages. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00035 http://en.wikipedia.org/wiki/OxyS_RNA wiki The RNA molecule essential for the catalytic activity of RNase MRP, an enzymatically active ribonucleoprotein with two distinct roles in eukaryotes. In mitochondria it plays a direct role in the initiation of mitochondrial DNA replication. In the nucleus it is involved in precursor rRNA processing, where it cleaves the internal transcribed spacer 1 between 18S and 5.8S rRNAs. INSDC_feature:ncRNA INSDC_qualifier:RNase_MRP_RNA RNase MRP RNA sequence SO:0000385 Moved under enzymatic_RNA on 18 Nov 2021. See GitHub Issue #533. RNase_MRP_RNA The RNA molecule essential for the catalytic activity of RNase MRP, an enzymatically active ribonucleoprotein with two distinct roles in eukaryotes. In mitochondria it plays a direct role in the initiation of mitochondrial DNA replication. In the nucleus it is involved in precursor rRNA processing, where it cleaves the internal transcribed spacer 1 between 18S and 5.8S rRNAs. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00030 The RNA component of Ribonuclease P (RNase P), a ubiquitous endoribonuclease, found in archaea, bacteria and eukarya as well as chloroplasts and mitochondria. Its best characterized activity is the generation of mature 5 prime ends of tRNAs by cleaving the 5 prime leader elements of precursor-tRNAs. Cellular RNase Ps are ribonucleoproteins. RNA from bacterial RNase Ps retains its catalytic activity in the absence of the protein subunit, i.e. it is a ribozyme. Isolated eukaryotic and archaeal RNase P RNA has not been shown to retain its catalytic function, but is still essential for the catalytic activity of the holoenzyme. 
Although the archaeal and eukaryotic holoenzymes have a much greater protein content than the bacterial ones, the RNA cores from all the three lineages are homologous. Helices corresponding to P1, P2, P3, P4, and P10/11 are common to all cellular RNase P RNAs. Yet, there is considerable sequence variation, particularly among the eukaryotic RNAs. INSDC_feature:ncRNA INSDC_qualifier:RNase_P_RNA RNase P RNA sequence SO:0000386 Moved under enzymatic_RNA on 18 Nov 2021. See GitHub Issue #533. RNase_P_RNA The RNA component of Ribonuclease P (RNase P), a ubiquitous endoribonuclease, found in archaea, bacteria and eukarya as well as chloroplasts and mitochondria. Its best characterized activity is the generation of mature 5 prime ends of tRNAs by cleaving the 5 prime leader elements of precursor-tRNAs. Cellular RNase Ps are ribonucleoproteins. RNA from bacterial RNase Ps retains its catalytic activity in the absence of the protein subunit, i.e. it is a ribozyme. Isolated eukaryotic and archaeal RNase P RNA has not been shown to retain its catalytic function, but is still essential for the catalytic activity of the holoenzyme. Although the archaeal and eukaryotic holoenzymes have a much greater protein content than the bacterial ones, the RNA cores from all the three lineages are homologous. Helices corresponding to P1, P2, P3, P4, and P10/11 are common to all cellular RNase P RNAs. Yet, there is considerable sequence variation, particularly among the eukaryotic RNAs. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00010 Translational regulation of the stationary phase sigma factor RpoS is mediated by the formation of a double-stranded RNA stem-loop structure in the upstream region of the rpoS messenger RNA, occluding the translation initiation site. Clones carrying rprA (RpoS regulator RNA) increased the translation of RpoS. The rprA gene encodes a 106 nucleotide regulatory RNA. As with DsrA Rfam:RF00014, RprA is predicted to form three stem-loops. 
Thus, at least two small RNAs, DsrA and RprA, participate in the positive regulation of RpoS translation. Unlike DsrA, RprA does not have an extensive region of complementarity to the RpoS leader, leaving its mechanism of action unclear. RprA is non-essential. http://en.wikipedia.org/wiki/RprA_RNA RprA RNA sequence SO:0000387 RprA_RNA Translational regulation of the stationary phase sigma factor RpoS is mediated by the formation of a double-stranded RNA stem-loop structure in the upstream region of the rpoS messenger RNA, occluding the translation initiation site. Clones carrying rprA (RpoS regulator RNA) increased the translation of RpoS. The rprA gene encodes a 106 nucleotide regulatory RNA. As with DsrA Rfam:RF00014, RprA is predicted to form three stem-loops. Thus, at least two small RNAs, DsrA and RprA, participate in the positive regulation of RpoS translation. Unlike DsrA, RprA does not have an extensive region of complementarity to the RpoS leader, leaving its mechanism of action unclear. RprA is non-essential. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00034 http://en.wikipedia.org/wiki/RprA_RNA wiki The Rev response element (RRE) is encoded within the HIV-env gene. Rev is an essential regulatory protein of HIV that binds an internal loop of the RRE, encouraging further Rev-RRE binding. This RNP complex is critical for mRNA export and hence for expression of the HIV structural proteins. RRE RNA sequence SO:0000388 RRE_RNA The Rev response element (RRE) is encoded within the HIV-env gene. Rev is an essential regulatory protein of HIV that binds an internal loop of the RRE, encouraging further Rev-RRE binding. This RNP complex is critical for mRNA export and hence for expression of the HIV structural proteins. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00036 A 109-nucleotide RNA of E. coli that seems to have a regulatory role on the galactose operon. Changes in Spot 42 levels are implicated in affecting DNA polymerase I levels. 
http://en.wikipedia.org/wiki/Spot_42_RNA spot-42 RNA sequence SO:0000389 spot_42_RNA A 109-nucleotide RNA of E. coli that seems to have a regulatory role on the galactose operon. Changes in Spot 42 levels are implicated in affecting DNA polymerase I levels. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00021 http://en.wikipedia.org/wiki/Spot_42_RNA wiki The RNA component of telomerase, a reverse transcriptase that synthesizes telomeric DNA. INSDC_feature:ncRNA http://en.wikipedia.org/wiki/Telomerase_RNA INSDC_qualifier:telomerase_RNA telomerase RNA sequence SO:0000390 telomerase_RNA The RNA component of telomerase, a reverse transcriptase that synthesizes telomeric DNA. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00025 http://en.wikipedia.org/wiki/Telomerase_RNA wiki U1 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Its 5' end forms complementary base pairs with the 5' splice junction, thus defining the 5' donor site of an intron. There are significant differences in sequence and secondary structure between metazoan and yeast U1 snRNAs, the latter being much longer (568 nucleotides as compared to 164 nucleotides in human). Nevertheless, secondary structure predictions suggest that all U1 snRNAs share a 'common core' consisting of helices I, II, the proximal region of III, and IV. http://en.wikipedia.org/wiki/U1_snRNA U1 small nuclear RNA U1 snRNA small nuclear RNA U1 snRNA U1 sequence SO:0000391 U1_snRNA U1 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Its 5' end forms complementary base pairs with the 5' splice junction, thus defining the 5' donor site of an intron. There are significant differences in sequence and secondary structure between metazoan and yeast U1 snRNAs, the latter being much longer (568 nucleotides as compared to 164 nucleotides in human). 
Nevertheless, secondary structure predictions suggest that all U1 snRNAs share a 'common core' consisting of helices I, II, the proximal region of III, and IV. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00003 http://en.wikipedia.org/wiki/U1_snRNA wiki U1 small nuclear RNA RSC:cb small nuclear RNA U1 RSC:cb snRNA U1 RSC:cb U2 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Complementary binding between U2 snRNA (in an area lying towards the 5' end but 3' to hairpin I) and the branchpoint sequence (BPS) of the intron results in the bulging out of an unpaired adenine, on the BPS, which initiates a nucleophilic attack at the intronic 5' splice site, thus starting the first of two transesterification reactions that mediate splicing. http://en.wikipedia.org/wiki/U2_snRNA U2 small nuclear RNA U2 snRNA small nuclear RNA U2 snRNA U2 sequence SO:0000392 U2_snRNA U2 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Complementary binding between U2 snRNA (in an area lying towards the 5' end but 3' to hairpin I) and the branchpoint sequence (BPS) of the intron results in the bulging out of an unpaired adenine, on the BPS, which initiates a nucleophilic attack at the intronic 5' splice site, thus starting the first of two transesterification reactions that mediate splicing. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00004 http://en.wikipedia.org/wiki/U2_snRNA wiki U2 small nuclear RNA RSC:CB small nuclear RNA U2 RSC:CB snRNA U2 RSC:CB U4 small nuclear RNA (U4 snRNA) is a component of the major U2-dependent spliceosome. It forms a duplex with U6, and with each splicing round, it is displaced from U6 (and the spliceosome) in an ATP-dependent manner, allowing U6 to refold and create the active site for splicing catalysis. A recycling process involving protein Prp24 re-anneals U4 and U6. 
http://en.wikipedia.org/wiki/U4_snRNA U4 small nuclear RNA U4 snRNA small nuclear RNA U4 snRNA U4 sequence SO:0000393 U4_snRNA U4 small nuclear RNA (U4 snRNA) is a component of the major U2-dependent spliceosome. It forms a duplex with U6, and with each splicing round, it is displaced from U6 (and the spliceosome) in an ATP-dependent manner, allowing U6 to refold and create the active site for splicing catalysis. A recycling process involving protein Prp24 re-anneals U4 and U6. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00015 http://en.wikipedia.org/wiki/U4_snRNA wiki U4 small nuclear RNA RSC:cb small nuclear RNA U4 RSC:cb snRNA U4 RSC:cb An snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U6atac_snRNA (SO:0000397). U4atac small nuclear RNA U4atac snRNA small nuclear RNA U4atac snRNA U4atac sequence SO:0000394 U4atac_snRNA An snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U6atac_snRNA (SO:0000397). PMID:12409455 U4atac small nuclear RNA RSC:cb small nuclear RNA U4atac RSC:cb snRNA U4atac RSC:cb U5 RNA is a component of both types of known spliceosome. The precise function of this molecule is unknown, though it is known that the 5' loop is required for splice site selection and p220 binding, and that both the 3' stem-loop and the Sm site are important for Sm protein binding and cap methylation. http://en.wikipedia.org/wiki/U5_snRNA U5 small nuclear RNA U5 snRNA small nuclear RNA U5 snRNA U5 sequence SO:0000395 U5_snRNA U5 RNA is a component of both types of known spliceosome. The precise function of this molecule is unknown, though it is known that the 5' loop is required for splice site selection and p220 binding, and that both the 3' stem-loop and the Sm site are important for Sm protein binding and cap methylation. 
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00020 http://en.wikipedia.org/wiki/U5_snRNA wiki U5 small nuclear RNA RSC:cb small nuclear RNA U5 RSC:cb snRNA U5 RSC:cb U6 snRNA is a component of the spliceosome which is involved in splicing pre-mRNA. The putative secondary structure consensus base pairing is confined to a short 5' stem loop, but U6 snRNA is thought to form extensive base-pair interactions with U4 snRNA. http://en.wikipedia.org/wiki/U6_snRNA U6 small nuclear RNA U6 snRNA small nuclear RNA U6 snRNA U6 sequence SO:0000396 U6_snRNA U6 snRNA is a component of the spliceosome which is involved in splicing pre-mRNA. The putative secondary structure consensus base pairing is confined to a short 5' stem loop, but U6 snRNA is thought to form extensive base-pair interactions with U4 snRNA. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00015 http://en.wikipedia.org/wiki/U6_snRNA wiki U6 small nuclear RNA RSC:cb small nuclear RNA U6 RSC:cb snRNA U6 RSC:cb U6atac_snRNA is an snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U4atac_snRNA (SO:0000394). U6atac small nuclear RNA U6atac snRNA snRNA U6atac sequence SO:0000397 U6atac_snRNA U6atac_snRNA is an snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U4atac_snRNA (SO:0000394). http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=retrieve&db=pubmed&list_uids=12409455&dopt=Abstract U6atac small nuclear RNA RSC:cb U6atac snRNA RSC:cb snRNA U6atac RSC:cb U11 snRNA plays a role in splicing of the minor U12-dependent class of eukaryotic nuclear introns, similar to U1 snRNA in the major class spliceosome it base pairs to the conserved 5' splice site sequence. 
http://en.wikipedia.org/wiki/U11_snRNA U11 small nuclear RNA U11 snRNA small nuclear RNA U11 snRNA U11 sequence SO:0000398 U11_snRNA U11 snRNA plays a role in splicing of the minor U12-dependent class of eukaryotic nuclear introns, similar to U1 snRNA in the major class spliceosome it base pairs to the conserved 5' splice site sequence. PMID:9622129 http://en.wikipedia.org/wiki/U11_snRNA wiki U11 small nuclear RNA RSC:cb small nuclear RNA U11 RSC:cb snRNA U11 RSC:cb The U12 small nuclear RNA (snRNA), together with U4atac/U6atac, U5, and U11 snRNAs and associated proteins, forms a spliceosome that cleaves a divergent class of low-abundance pre-mRNA introns. http://en.wikipedia.org/wiki/U12_snRNA U12 small nuclear RNA U12 snRNA small nuclear RNA U12 snRNA U12 sequence SO:0000399 U12_snRNA The U12 small nuclear RNA (snRNA), together with U4atac/U6atac, U5, and U11 snRNAs and associated proteins, forms a spliceosome that cleaves a divergent class of low-abundance pre-mRNA introns. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00007 http://en.wikipedia.org/wiki/U12_snRNA wiki U12 small nuclear RNA RSC:cb small nuclear RNA U12 RSC:cb snRNA U12 RSC:cb An attribute describes a quality of sequence. sequence attribute sequence SO:0000400 sequence_attribute An attribute describes a quality of sequence. SO:ke An attribute describing a gene. gene attribute sequence SO:0000401 gene_attribute sequence SO:0000402 enhancer_attribute true U14 small nucleolar RNA (U14 snoRNA) is required for early cleavages of eukaryotic precursor rRNAs. In yeasts, this molecule possesses a stem-loop region (known as the Y-domain) which is essential for function. A similar structure, but with a different consensus sequence, is found in plants, but is absent in vertebrates. 
SO:0005839 U14 small nucleolar RNA U14 snoRNA small nucleolar RNA U14 snoRNA U14 sequence SO:0000403 An evolutionarily conserved eukaryotic low molecular weight RNA capable of intermolecular hybridization with both homologous and heterologous 18S rRNA. U14_snoRNA U14 small nucleolar RNA (U14 snoRNA) is required for early cleavages of eukaryotic precursor rRNAs. In yeasts, this molecule possesses a stem-loop region (known as the Y-domain) which is essential for function. A similar structure, but with a different consensus sequence, is found in plants, but is absent in vertebrates. PMID:2551119 http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00016 A family of RNAs found as part of the enigmatic vault ribonucleoprotein complex. The complex consists of a major vault protein (MVP), two minor vault proteins (VPARP and TEP1), and several small untranslated RNA molecules. It has been suggested that the vault complex is involved in drug resistance. INSDC_feature:ncRNA http://en.wikipedia.org/wiki/Vault_RNA INSDC_qualifier:vault_RNA vault RNA sequence SO:0000404 vault_RNA A family of RNAs found as part of the enigmatic vault ribonucleoprotein complex. The complex consists of a major vault protein (MVP), two minor vault proteins (VPARP and TEP1), and several small untranslated RNA molecules. It has been suggested that the vault complex is involved in drug resistance. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00006 http://en.wikipedia.org/wiki/Vault_RNA wiki Y RNAs are components of the Ro ribonucleoprotein particle (Ro RNP), in association with Ro60 and La proteins. The Y RNAs and Ro60 and La proteins are well conserved, but the function of the Ro RNP is not known. In humans the RNA component can be one of four small RNAs: hY1, hY3, hY4 and hY5. These small RNAs are predicted to fold into a conserved secondary structure containing three stem structures. The largest of the four, hY1, contains an additional hairpin. 
INSDC_feature:ncRNA http://en.wikipedia.org/wiki/Y_RNA INSDC_qualifier:Y_RNA Y RNA sequence SO:0000405 Y_RNA Y RNAs are components of the Ro ribonucleoprotein particle (Ro RNP), in association with Ro60 and La proteins. The Y RNAs and Ro60 and La proteins are well conserved, but the function of the Ro RNP is not known. In humans the RNA component can be one of four small RNAs: hY1, hY3, hY4 and hY5. These small RNAs are predicted to fold into a conserved secondary structure containing three stem structures. The largest of the four, hY1, contains an additional hairpin. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00019 http://en.wikipedia.org/wiki/Y_RNA wiki An intron within an intron. Twintrons are group II or III introns, into which another group II or III intron has been transposed. http://en.wikipedia.org/wiki/Twintron sequence SO:0000406 twintron An intron within an intron. Twintrons are group II or III introns, into which another group II or III intron has been transposed. PMID:1899376 PMID:7823908 http://en.wikipedia.org/wiki/Twintron wiki Cytosolic 18S rRNA is an RNA component of the small subunit of cytosolic ribosomes in eukaryotes. http://en.wikipedia.org/wiki/18S_ribosomal_RNA cytosolic 18S rRNA cytosolic 18S ribosomal RNA cytosolic rRNA 18S sequence SO:0000407 Renamed to cytosolic_18S_rRNA from rRNA_18S on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493. cytosolic_18S_rRNA Cytosolic 18S rRNA is an RNA component of the small subunit of cytosolic ribosomes in eukaryotes. SO:ke http://en.wikipedia.org/wiki/18S_ribosomal_RNA wiki The interbase position where something (eg an aberration) occurred. sequence SO:0000408 site true The interbase position where something (eg an aberration) occurred. SO:ke A biological_region of sequence that, in the molecule, interacts selectively and non-covalently with other molecules. 
A region on the surface of a molecule that may interact with another molecule. When applied to polypeptides: Amino acids involved in binding or interactions. It can also apply to an amino acid bond which is represented by the positions of the two flanking amino acids. BS:00033 http://en.wikipedia.org/wiki/Binding_site INSDC_feature:misc_binding binding site binding_or_interaction_site sequence site SO:0000409 See GO:0005488 : binding. binding_site A biological_region of sequence that, in the molecule, interacts selectively and non-covalently with other molecules. A region on the surface of a molecule that may interact with another molecule. When applied to polypeptides: Amino acids involved in binding or interactions. It can also apply to an amino acid bond which is represented by the positions of the two flanking amino acids. EBIBS:GAR SO:ke http://en.wikipedia.org/wiki/Binding_site wiki A binding site that, in the molecule, interacts selectively and non-covalently with polypeptide molecules. INSDC_feature:protein_bind protein binding site sequence SO:0000410 See GO:0042277 : peptide binding. protein_binding_site A binding site that, in the molecule, interacts selectively and non-covalently with polypeptide molecules. SO:ke A region that rescues. rescue fragment rescue region sequence rescue segment SO:0000411 rescue_region A region that rescues. SO:xp A region of polynucleotide sequence produced by digestion with a restriction endonuclease. http://en.wikipedia.org/wiki/Restriction_fragment restriction fragment sequence SO:0000412 restriction_fragment A region of polynucleotide sequence produced by digestion with a restriction endonuclease. SO:ke http://en.wikipedia.org/wiki/Restriction_fragment wiki A region where the sequence differs from that of a specified sequence. INSDC_feature:misc_difference sequence difference sequence SO:0000413 sequence_difference A region where the sequence differs from that of a specified sequence. 
SO:ke An attribute to describe a feature that is invalidated due to genomic contamination. invalidated by genomic contamination sequence SO:0000414 invalidated_by_genomic_contamination An attribute to describe a feature that is invalidated due to genomic contamination. SO:ke An attribute to describe a feature that is invalidated due to polyA priming. invalidated by genomic polyA primed cDNA sequence SO:0000415 invalidated_by_genomic_polyA_primed_cDNA An attribute to describe a feature that is invalidated due to polyA priming. SO:ke An attribute to describe a feature that is invalidated due to partial processing. invalidated by partial processing sequence SO:0000416 invalidated_by_partial_processing An attribute to describe a feature that is invalidated due to partial processing. SO:ke A structurally or functionally defined protein region. In proteins with multiple domains, the combination of the domains determines the function of the protein. A region which has been shown to recur throughout evolution. BS:00012 BS:00134 SO:0001069 domain structural domain polypeptide domain polypeptide_structural_domain sequence SO:0000417 Range. Old definition from before biosapiens: A region of a single polypeptide chain that folds into an independent unit and exhibits biological activity. A polypeptide chain may have multiple domains. polypeptide_domain A structurally or functionally defined protein region. In proteins with multiple domains, the combination of the domains determines the function of the protein. A region which has been shown to recur throughout evolution. EBIBS:GAR domain uniprot:feature_type structural domain polypeptide_structural_domain The signal_peptide is a short region of the peptide located at the N-terminus that directs the protein to be secreted or part of membrane components. 
BS:00159 http://en.wikipedia.org/wiki/Signal_peptide INSDC_feature:sig_peptide signal peptide signal peptide coding sequence sequence signal SO:0000418 Old def before biosapiens: The sequence for an N-terminal domain of a secreted protein; this domain is involved in attaching nascent polypeptide to the membrane leader sequence. signal_peptide The signal_peptide is a short region of the peptide located at the N-terminus that directs the protein to be secreted or part of membrane components. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Signal_peptide wiki signal uniprot:feature_type The polypeptide sequence that remains when the cleaved peptide regions have been cleaved from the immature peptide. BS:00149 INSDC_feature:mat_peptide mature protein region sequence chain mature peptide SO:0000419 This term, mature peptide, was merged with the biosapiens term mature protein region, whose name was taken as the new name. Old def: The coding sequence for the mature or final peptide or protein product following post-translational modification. mature_protein_region The polypeptide sequence that remains when the cleaved peptide regions have been cleaved from the immature peptide. EBIBS:GAR SO:cb http://www.insdc.org/files/feature_table.html chain uniprot:feature_type An inverted repeat (SO:0000294) occurring at the 5-prime terminus of a DNA transposon. 5' TIR five prime terminal inverted repeat sequence SO:0000420 five_prime_terminal_inverted_repeat An inverted repeat (SO:0000294) occurring at the 3-prime terminus of a DNA transposon. 3' TIR three prime terminal inverted repeat sequence SO:0000421 three_prime_terminal_inverted_repeat The U5 segment of the long terminal repeats. U5 LTR region U5 long terminal repeat region sequence SO:0000422 U5_LTR_region The R segment of the long terminal repeats. R LTR region R long terminal repeat region sequence SO:0000423 R_LTR_region The U3 segment of the long terminal repeats.
U3 LTR region U3 long terminal repeat region sequence SO:0000424 U3_LTR_region The long terminal repeat found at the five-prime end of the sequence to be inserted into the host genome. 5' LTR 5' long terminal repeat five prime LTR sequence SO:0000425 five_prime_LTR The long terminal repeat found at the three-prime end of the sequence to be inserted into the host genome. 3' LTR 3' long terminal repeat three prime LTR sequence SO:0000426 three_prime_LTR The R segment of the five-prime long terminal repeat. R 5' long term repeat region R five prime LTR region sequence SO:0000427 R_five_prime_LTR_region The U5 segment of the five-prime long terminal repeat. U5 5' long terminal repeat region U5 five prime LTR region sequence SO:0000428 U5_five_prime_LTR_region The U3 segment of the five-prime long terminal repeat. U3 5' long term repeat region U3 five prime LTR region sequence SO:0000429 U3_five_prime_LTR_region The R segment of the three-prime long terminal repeat. R 3' long terminal repeat region R three prime LTR region sequence SO:0000430 R_three_prime_LTR_region The U3 segment of the three-prime long terminal repeat. U3 3' long terminal repeat region U3 three prime LTR region sequence SO:0000431 U3_three_prime_LTR_region The U5 segment of the three-prime long terminal repeat. U5 3' long terminal repeat region U5 three prime LTR region sequence SO:0000432 U5_three_prime_LTR_region A polymeric tract, such as poly(dA), within a non_LTR_retrotransposon. INSDC_feature:repeat_region INSDC_qualifier:non_ltr_retrotransposon_polymeric_tract non LTR retrotransposon polymeric tract sequence SO:0000433 non_LTR_retrotransposon_polymeric_tract A polymeric tract, such as poly(dA), within a non_LTR_retrotransposon. SO:ke A sequence of the target DNA that is duplicated when a transposable element or phage inserts; usually found at each end of the insertion.
target site duplication sequence SO:0000434 target_site_duplication A sequence of the target DNA that is duplicated when a transposable element or phage inserts; usually found at each end of the insertion. http://www.koko.gov.my/CocoaBioTech/Glossaryt.html A polypurine tract within an LTR_retrotransposon. RR tract sequence LTR retrotransposon poly purine tract SO:0000435 RR_tract A polypurine tract within an LTR_retrotransposon. SO:ke A sequence that can autonomously replicate, as a plasmid, when transformed into a bacterial host. autonomously replicating sequence sequence SO:0000436 ARS A sequence that can autonomously replicate, as a plasmid, when transformed into a bacterial host. SO:ma sequence SO:0000437 assortment_derived_duplication true sequence SO:0000438 gene_not_polyadenylated true A ring chromosome is a chromosome whose arms have fused together to form a ring in an inverted fashion, often with the loss of the ends of the chromosome. inverted ring chromosome sequence SO:0000439 inverted_ring_chromosome A replicon that has been modified to act as a vector for foreign sequence. http://en.wikipedia.org/wiki/Vector_(molecular_biology) vector vector replicon sequence SO:0000440 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. vector_replicon A replicon that has been modified to act as a vector for foreign sequence. SO:ma http://en.wikipedia.org/wiki/Vector_(molecular_biology) wiki A single stranded oligonucleotide. single strand oligo single strand oligonucleotide single stranded oligonucleotide ss oligo ss oligonucleotide sequence SO:0000441 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. ss_oligo A single stranded oligonucleotide. SO:ke A double stranded oligonucleotide. double stranded oligonucleotide ds oligo ds-oligonucleotide sequence SO:0000442 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. ds_oligo A double stranded oligonucleotide.
SO:ke An attribute to describe the kind of biological sequence. polymer attribute sequence SO:0000443 polymer_attribute An attribute to describe the kind of biological sequence. SO:ke Non-coding exon in the 3' UTR. three prime noncoding exon sequence SO:0000444 three_prime_noncoding_exon Non-coding exon in the 3' UTR. SO:ke Non-coding exon in the 5' UTR. 5' nc exon 5' non coding exon five prime noncoding exon sequence SO:0000445 five_prime_noncoding_exon Non-coding exon in the 5' UTR. SO:ke Intron located in the untranslated region. UTR intron sequence SO:0000446 UTR_intron Intron located in the untranslated region. SO:ke An intron located in the 5' UTR. five prime UTR intron sequence SO:0000447 five_prime_UTR_intron An intron located in the 5' UTR. SO:ke An intron located in the 3' UTR. three prime UTR intron sequence SO:0000448 three_prime_UTR_intron An intron located in the 3' UTR. SO:ke A sequence of nucleotides or amino acids which, by design, has a "random" order of components, given a predetermined input frequency of these components. random sequence sequence SO:0000449 random_sequence A sequence of nucleotides or amino acids which, by design, has a "random" order of components, given a predetermined input frequency of these components. SO:ma A light region between two darkly staining bands in a polytene chromosome. sequence chromosome interband SO:0000450 interband A light region between two darkly staining bands in a polytene chromosome. SO:ma A gene that encodes a polyadenylated mRNA. gene with polyadenylated mRNA sequence SO:0000451 gene_with_polyadenylated_mRNA A gene that encodes a polyadenylated mRNA. SO:xp sequence SO:0000452 transgene_attribute true A chromosome structure variant whereby a region of a chromosome has been transferred to another position. 
Among interchromosomal rearrangements, the term transposition is reserved for that class in which the telomeres of the chromosomes involved are coupled (that is to say, form the two ends of a single DNA molecule) as in wild-type. chromosomal transposition transposition sequence SO:0000453 chromosomal_transposition A chromosome structure variant whereby a region of a chromosome has been transferred to another position. Among interchromosomal rearrangements, the term transposition is reserved for that class in which the telomeres of the chromosomes involved are coupled (that is to say, form the two ends of a single DNA molecule) as in wild-type. FB:reference_manual SO:ke A 17-28-nt, small interfering RNA derived from transcripts of repetitive elements. INSDC_feature:ncRNA INSDC_qualifier:rasiRNA repeat associated small interfering RNA sequence SO:0000454 Changed parent term from ncRNA (SO:0000655) to piRNA (SO:0001035). See GitHub Issue #573. rasiRNA A 17-28-nt, small interfering RNA derived from transcripts of repetitive elements. PMID:18032451 http://www.developmentalcell.com/content/article/abstract?uid=PIIS1534580703002284 A gene that encodes an mRNA with a frameshift. gene with mRNA with frameshift sequence SO:0000455 gene_with_mRNA_with_frameshift A gene that encodes an mRNA with a frameshift. SO:xp A gene that is recombinationally rearranged. recombinationally rearranged gene sequence SO:0000456 recombinationally_rearranged_gene A gene that is recombinationally rearranged. SO:ke A chromosome duplication involving an insertion from another chromosome. interchromosomal duplication sequence SO:0000457 interchromosomal_duplication A chromosome duplication involving an insertion from another chromosome. SO:ke Germline genomic DNA including D-region with 5' UTR and 3' UTR, also designated as D-segment. D gene D-GENE INSDC_feature:D_segment sequence SO:0000458 D_gene_segment Germline genomic DNA including D-region with 5' UTR and 3' UTR, also designated as D-segment. 
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A gene with a transcript that is trans-spliced. gene with trans spliced transcript sequence SO:0000459 gene_with_trans_spliced_transcript A gene with a transcript that is trans-spliced. SO:xp Germline genomic DNA with the sequence for a V, D, C, or J portion of an immunoglobulin/T-cell receptor. vertebrate immunoglobulin T cell receptor segment vertebrate_immunoglobulin/T-cell receptor gene sequence SO:0000460 I am using the term segment instead of gene here to avoid confusion with the region 'gene'. vertebrate_immunoglobulin_T_cell_receptor_segment A chromosomal deletion whereby a chromosome is generated by recombination between two inversions; it has a deficiency at each end of the inversion. inversion derived bipartite deficiency sequence SO:0000461 inversion_derived_bipartite_deficiency A chromosomal deletion whereby a chromosome is generated by recombination between two inversions; it has a deficiency at each end of the inversion. FB:km A non-functional descendant of a functional entity. pseudogenic region sequence SO:0000462 pseudogenic_region A non-functional descendant of a functional entity. SO:cjm A gene that encodes more than one transcript. encodes alternately spliced transcripts sequence SO:0000463 encodes_alternately_spliced_transcripts A gene that encodes more than one transcript. SO:ke A non-functional descendant of an exon. decayed exon sequence SO:0000464 Does not have to be part of a pseudogene. decayed_exon A non-functional descendant of an exon. SO:ke A chromosome deletion whereby a chromosome is generated by recombination between two inversions; there is a deficiency at one end of the inversion and a duplication at the other end of the inversion.
inversion derived deficiency plus duplication sequence SO:0000465 inversion_derived_deficiency_plus_duplication A chromosome deletion whereby a chromosome is generated by recombination between two inversions; there is a deficiency at one end of the inversion and a duplication at the other end of the inversion. FB:km Germline genomic DNA including L-part1, V-intron and V-exon, with the 5' UTR and 3' UTR. INSDC_feature:V_segment V gene V gene segment V-GENE variable_gene sequence SO:0000466 V_gene_segment Germline genomic DNA including L-part1, V-intron and V-exon, with the 5' UTR and 3' UTR. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# An attribute describing a gene sequence where the resulting protein is regulated by the stability of the resulting protein. post translationally regulated by protein stability post-translationally regulated by protein stability sequence SO:0000467 post_translationally_regulated_by_protein_stability An attribute describing a gene sequence where the resulting protein is regulated by the stability of the resulting protein. SO:ke One of the pieces of sequence that make up a golden path. golden path fragment sequence SO:0000468 golden_path_fragment One of the pieces of sequence that make up a golden path. SO:rd An attribute describing a gene sequence where the resulting protein is modified to regulate it. post translationally regulated by protein modification post-translationally regulated by protein modification sequence SO:0000469 post_translationally_regulated_by_protein_modification An attribute describing a gene sequence where the resulting protein is modified to regulate it. SO:ke Germline genomic DNA of an immunoglobulin/T-cell receptor gene including J-region with 5' UTR (SO:0000204) and 3' UTR (SO:0000205), also designated as J-segment. 
INSDC_feature:J_segment J gene J-GENE sequence SO:0000470 J_gene_segment Germline genomic DNA of an immunoglobulin/T-cell receptor gene including J-region with 5' UTR (SO:0000204) and 3' UTR (SO:0000205), also designated as J-segment. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# The gene product is involved in its own transcriptional regulation. sequence SO:0000471 autoregulated The gene product is involved in its own transcriptional regulation. SO:ke A set of regions which overlap with minimal polymorphism to form a linear sequence. tiling path sequence SO:0000472 tiling_path A set of regions which overlap with minimal polymorphism to form a linear sequence. SO:cjm The gene product is involved in its own transcriptional regulation where it decreases transcription. negatively autoregulated sequence SO:0000473 negatively_autoregulated The gene product is involved in its own transcriptional regulation where it decreases transcription. SO:ke A piece of sequence that makes up a tiling_path (SO:0000472). tiling path fragment sequence SO:0000474 tiling_path_fragment A piece of sequence that makes up a tiling_path (SO:0000472). SO:ke The gene product is involved in its own transcriptional regulation, where it increases transcription. positively autoregulated sequence SO:0000475 positively_autoregulated The gene product is involved in its own transcriptional regulation, where it increases transcription. SO:ke A DNA sequencer read which is part of a contig. contig read sequence SO:0000476 contig_read A DNA sequencer read which is part of a contig. SO:ke A gene that is polycistronic. sequence SO:0000477 polycistronic_gene true A gene that is polycistronic. SO:ke Genomic DNA of immunoglobulin/T-cell receptor gene including C-region (and introns if present) with 5' UTR (SO:0000204) and 3' UTR (SO:0000205). 
C gene C_GENE INSDC_feature:C_region constant gene sequence SO:0000478 C_gene_segment Genomic DNA of immunoglobulin/T-cell receptor gene including C-region (and introns if present) with 5' UTR (SO:0000204) and 3' UTR (SO:0000205). http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A transcript that is trans-spliced. INSDC_feature:tRNA INSDC_qualifier:trans_splicing trans spliced transcript trans-spliced transcript sequence SO:0000479 trans_spliced_transcript A transcript that is trans-spliced. SO:xp A clone which is part of a tiling path. A tiling path is a set of sequencing substrates, typically clones, which have been selected in order to efficiently cover a region of the genome in preparation for sequencing and assembly. tiling path clone sequence SO:0000480 tiling_path_clone A clone which is part of a tiling path. A tiling path is a set of sequencing substrates, typically clones, which have been selected in order to efficiently cover a region of the genome in preparation for sequencing and assembly. SO:ke An inverted repeat (SO:0000294) occurring at the termini of a DNA transposon. TIR terminal inverted repeat sequence SO:0000481 terminal_inverted_repeat An inverted repeat (SO:0000294) occurring at the termini of a DNA transposon. SO:ke Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration. vertebrate immunoglobulin T cell receptor gene cluster vertebrate_immunoglobulin/T-cell receptor gene cluster sequence SO:0000482 vertebrate_immunoglobulin_T_cell_receptor_gene_cluster A primary transcript that is never translated into a protein. nc primary transcript noncoding primary transcript sequence SO:0000483 nc_primary_transcript A primary transcript that is never translated into a protein. SO:ke The sequence of the 3' exon that is not coding. three prime coding exon noncoding region three_prime_exon_noncoding_region sequence SO:0000484 three_prime_coding_exon_noncoding_region The sequence of the 3' exon that is not coding. 
SO:ke Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one DJ-gene, and one J-gene. (DJ)-J-CLUSTER DJ J cluster sequence SO:0000485 DJ_J_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one DJ-gene, and one J-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# The sequence of the 5' exon preceding the start codon. five prime coding exon noncoding region five_prime_exon_noncoding_region sequence SO:0000486 five_prime_coding_exon_noncoding_region The sequence of the 5' exon preceding the start codon. SO:ke Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene, one J-gene and one C-gene. (VDJ)-J-C-CLUSTER VDJ J C cluster sequence SO:0000487 VDJ_J_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene, one J-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene and one J-gene. (VDJ)-J-CLUSTER VDJ J cluster sequence SO:0000488 VDJ_J_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene and one J-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene and one C-gene. VJ C cluster sequence (VJ)-C-CLUSTER SO:0000489 VJ_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene, one J-gene and one C-gene. 
(VJ)-J-C-CLUSTER VJ J C cluster sequence SO:0000490 VJ_J_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene, one J-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene and one J-gene. (VJ)-J-CLUSTER VJ J cluster sequence SO:0000491 VJ_J_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene and one J-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Recombination signal including D-heptamer, D-spacer and D-nonamer in 5' of D-region of a D-gene or D-sequence. D gene recombination feature sequence SO:0000492 D_gene_recombination_feature 7 nucleotide recombination site like CACAGTG, part of a 3' D-recombination signal sequence of an immunoglobulin/T-cell receptor gene. 3'D-HEPTAMER three prime D heptamer sequence SO:0000493 three_prime_D_heptamer 7 nucleotide recombination site like CACAGTG, part of a 3' D-recombination signal sequence of an immunoglobulin/T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A 9 nucleotide recombination site (e.g. ACAAAAACC), part of a 3' D-recombination signal sequence of an immunoglobulin/T-cell receptor gene. 3'D-NOMAMER three prime D nonamer sequence SO:0000494 three_prime_D_nonamer A 9 nucleotide recombination site (e.g. ACAAAAACC), part of a 3' D-recombination signal sequence of an immunoglobulin/T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A 12 or 23 nucleotide spacer between the 3'D-HEPTAMER and 3'D-NONAMER of a 3'D-RS. 3'D-SPACER three prime D spacer sequence SO:0000495 three_prime_D_spacer A 12 or 23 nucleotide spacer between the 3'D-HEPTAMER and 3'D-NONAMER of a 3'D-RS. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# 7 nucleotide recombination site (e.g. 
CACTGTG), part of a 5' D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene. 5'D-HEPTAMER five prime D heptamer sequence SO:0000496 five_prime_D_heptamer 7 nucleotide recombination site (e.g. CACTGTG), part of a 5' D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# 9 nucleotide recombination site (e.g. GGTTTTTGT), part of a five_prime_D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene. 5'D-NONAMER five prime D nonamer sequence SO:0000497 five_prime_D_nonamer 9 nucleotide recombination site (e.g. GGTTTTTGT), part of a five_prime_D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# 12 or 23 nucleotide spacer between the 5' D-heptamer (SO:0000496) and 5' D-nonamer (SO:0000497) of a 5' D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene. 5'-SPACER five prime D spacer five prime D-spacer sequence SO:0000498 five_prime_D_spacer 12 or 23 nucleotide spacer between the 5' D-heptamer (SO:0000496) and 5' D-nonamer (SO:0000497) of a 5' D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A continuous piece of sequence similar to the 'virtual contig' concept of the Ensembl database. virtual sequence sequence SO:0000499 virtual_sequence A continuous piece of sequence similar to the 'virtual contig' concept of the Ensembl database. SO:ke A type of non-canonical base-pairing. This is less energetically favourable than watson crick base pairing. Hoogsteen GC base pairs only have two hydrogen bonds. http://en.wikipedia.org/wiki/Hoogsteen_base_pair Hoogsteen base pair sequence SO:0000500 Hoogsteen_base_pair A type of non-canonical base-pairing. This is less energetically favourable than watson crick base pairing. 
Hoogsteen GC base pairs only have two hydrogen bonds. PMID:12177293 http://en.wikipedia.org/wiki/Hoogsteen_base_pair wiki A type of non-canonical base-pairing. reverse Hoogsteen base pair sequence SO:0000501 reverse_Hoogsteen_base_pair A type of non-canonical base-pairing. SO:ke A region of sequence that is transcribed. This region may cover the transcript of a gene, it may encompass the sequence covered by all of the transcripts of an alternately spliced gene, or it may cover the region transcribed by a polycistronic transcript. A gene may have 1 or more transcribed regions and a transcribed_region may belong to one or more genes. sequence SO:0000502 This concept came about as a direct result of the SO meeting August 2004. The exact nature of the relationship between transcribed_region and gene is still up for discussion. We are going with 'associated_with' for the time being. transcribed_region true A region of sequence that is transcribed. This region may cover the transcript of a gene, it may encompass the sequence covered by all of the transcripts of an alternately spliced gene, or it may cover the region transcribed by a polycistronic transcript. A gene may have 1 or more transcribed regions and a transcribed_region may belong to one or more genes. SO:ke sequence SO:0000503 alternately_spliced_gene_encodeing_one_transcript true Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene, one DJ-gene and one C-gene. D DJ C cluster D-(DJ)-C-CLUSTER sequence SO:0000504 D_DJ_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene, one DJ-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene and one DJ-gene.
D DJ cluster D-(DJ)-CLUSTER sequence SO:0000505 D_DJ_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene and one DJ-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene, one DJ-gene, one J-gene and one C-gene. D DJ J C cluster D-(DJ)-J-C-CLUSTER sequence SO:0000506 D_DJ_J_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene, one DJ-gene, one J-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A non functional descendant of an exon, part of a pseudogene. pseudogenic exon sequence SO:0000507 This is the analog of the exon of a functional gene. The term was requested by Rama - SGD to allow the annotation of the parts of a pseudogene. Non-functional is defined as either its transcription or translation (or both) are prevented due to one or more mutations. pseudogenic_exon A non functional descendant of an exon, part of a pseudogene. SO:ke Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene, one DJ-gene, and one J-gene. D DJ J cluster D-(DJ)-J-CLUSTER sequence SO:0000508 D_DJ_J_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one D-gene, one DJ-gene, and one J-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one D-gene, one J-gene and one C-gene. D J C cluster D-J-C-CLUSTER sequence SO:0000509 D_J_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one D-gene, one J-gene and one C-gene. 
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in partially rearranged genomic DNA including L-part1, V-intron and V-D-exon, with the 5' UTR (SO:0000204) and 3' UTR (SO:0000205). VD gene V_D_GENE sequence SO:0000510 VD_gene_segment Genomic DNA of immunoglobulin/T-cell receptor gene in partially rearranged genomic DNA including L-part1, V-intron and V-D-exon, with the 5' UTR (SO:0000204) and 3' UTR (SO:0000205). http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one J-gene and one C-gene. J C cluster J-C-CLUSTER sequence SO:0000511 J_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one J-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A chromosomal deletion whereby a chromosome is generated by recombination between two inversions; it has a deficiency at one end and is presumed to have a deficiency or duplication at the other end of the inversion. inversion derived deficiency plus aneuploid sequence SO:0000512 inversion_derived_deficiency_plus_aneuploid A chromosomal deletion whereby a chromosome is generated by recombination between two inversions; it has a deficiency at one end and is presumed to have a deficiency or duplication at the other end of the inversion. FB:km Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including more than one J-gene. J cluster J-CLUSTER sequence SO:0000513 J_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including more than one J-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# 9 nucleotide recombination site (e.g. GGTTTTTGT), part of a J-gene recombination feature of an immunoglobulin/T-cell receptor gene. J nonamer J-NONAMER sequence SO:0000514 J_nonamer 9 nucleotide recombination site (e.g.
GGTTTTTGT), part of a J-gene recombination feature of an immunoglobulin/T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# 7 nucleotide recombination site (e.g. CACAGTG), part of a J-gene recombination feature of an immunoglobulin/T-cell receptor gene. J heptamer J-HEPTAMER sequence SO:0000515 J_heptamer 7 nucleotide recombination site (e.g. CACAGTG), part of a J-gene recombination feature of an immunoglobulin/T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A non functional descendant of a transcript, part of a pseudogene. INSDC_feature:misc_RNA INSDC_qualifier:pseudo pseudogenic transcript sequence SO:0000516 This is the analog of the transcript of a functional gene. The term was requested by Rama - SGD to allow the annotation of the parts of a pseudogene. Non-functional is defined as either its transcription or translation (or both) are prevented due to one or more mutations. pseudogenic_transcript A non functional descendant of a transcript, part of a pseudogene. SO:ke 12 or 23 nucleotide spacer between the J-nonamer and the J-heptamer of a J-gene recombination feature of an immunoglobulin/T-cell receptor gene. J spacer J-SPACER sequence SO:0000517 J_spacer 12 or 23 nucleotide spacer between the J-nonamer and the J-heptamer of a J-gene recombination feature of an immunoglobulin/T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene and one DJ-gene. V DJ cluster V-(DJ)-CLUSTER sequence SO:0000518 V_DJ_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene and one DJ-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one DJ-gene and one J-gene. 
V DJ J cluster sequence V-(DJ)-J-CLUSTER SO:0000519 V_DJ_J_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one DJ-gene and one J-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VDJ-gene and one C-gene. V VDJ C cluster V-(VDJ)-C-CLUSTER sequence SO:0000520 V_VDJ_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VDJ-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene and one VDJ-gene. V VDJ cluster V-(VDJ)-CLUSTER sequence SO:0000521 V_VDJ_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene and one VDJ-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VDJ-gene and one J-gene. V VDJ J cluster sequence V-(VDJ)-J-CLUSTER SO:0000522 V_VDJ_J_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VDJ-gene and one J-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VJ-gene and one C-gene. V VJ C cluster V-(VJ)-C-CLUSTER sequence SO:0000523 V_VJ_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VJ-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene and one VJ-gene. 
V VJ cluster V-(VJ)-CLUSTER sequence SO:0000524 V_VJ_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene and one VJ-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VJ-gene and one J-gene. V VJ J cluster V-(VJ)-J-CLUSTER sequence SO:0000525 V_VJ_J_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VJ-gene and one J-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including more than one V-gene. V cluster V-CLUSTER sequence SO:0000526 V_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including more than one V-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene and one C-gene. V D DJ C cluster V-D-(DJ)-C-CLUSTER sequence SO:0000527 V_D_DJ_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene. V D DJ cluster V-D-(DJ)-CLUSTER sequence SO:0000528 V_D_DJ_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene, one J-gene and one C-gene. 
V D DJ J C cluster V-D-(DJ)-J-C-CLUSTER sequence SO:0000529 V_D_DJ_J_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene, one J-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene and one J-gene. V D DJ J cluster V-D-(DJ)-J-CLUSTER sequence SO:0000530 V_D_DJ_J_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one D-gene, one DJ-gene and one J-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene, one D-gene and one J-gene and one C-gene. V D J C cluster V-D-J-C-CLUSTER sequence SO:0000531 V_D_J_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene, one D-gene and one J-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene, one D-gene and one J-gene. V D J cluster V-D-J-CLUSTER sequence SO:0000532 V_D_J_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene, one D-gene and one J-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# 7 nucleotide recombination site (e.g. CACAGTG), part of V-gene recombination feature of an immunoglobulin/T-cell receptor gene. V heptamer V-HEPTAMER sequence SO:0000533 V_heptamer 7 nucleotide recombination site (e.g. CACAGTG), part of V-gene recombination feature of an immunoglobulin/T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene and one J-gene. 
V J cluster V-J-CLUSTER sequence SO:0000534 V_J_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene and one J-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene, one J-gene and one C-gene. V J C cluster V-J-C-CLUSTER sequence SO:0000535 V_J_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one V-gene, one J-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# 9 nucleotide recombination site (e.g. ACAAAAACC), part of V-gene recombination feature of an immunoglobulin/T-cell receptor gene. V nonamer V-NONAMER sequence SO:0000536 V_nonamer 9 nucleotide recombination site (e.g. ACAAAAACC), part of V-gene recombination feature of an immunoglobulin/T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# 12 or 23 nucleotide spacer between the V-heptamer and the V-nonamer of a V-gene recombination feature of an immunoglobulin/T-cell receptor gene. V spacer V-SPACER sequence SO:0000537 V_spacer 12 or 23 nucleotide spacer between the V-heptamer and the V-nonamer of a V-gene recombination feature of an immunoglobulin/T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Recombination signal including V-heptamer, V-spacer and V-nonamer in 3' of V-region of a V-gene or V-sequence of an immunoglobulin/T-cell receptor gene. V gene recombination feature V-RS sequence SO:0000538 V_gene_recombination_feature Recombination signal including V-heptamer, V-spacer and V-nonamer in 3' of V-region of a V-gene or V-sequence of an immunoglobulin/T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one DJ-gene and one C-gene. 
(DJ)-C-CLUSTER DJ C cluster sequence SO:0000539 DJ_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one DJ-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA in rearranged configuration including at least one D-J-GENE, one J-GENE and one C-GENE. (DJ)-J-C-CLUSTER DJ J C cluster sequence SO:0000540 DJ_J_C_cluster Genomic DNA in rearranged configuration including at least one D-J-GENE, one J-GENE and one C-GENE. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene and one C-gene. (VDJ)-C-CLUSTER VDJ C cluster sequence SO:0000541 VDJ_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one DJ-gene and one C-gene. V DJ C cluster V-(DJ)-C-CLUSTER sequence SO:0000542 V_DJ_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one DJ-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# sequence SO:0000543 alternately_spliced_gene_encoding_greater_than_one_transcript true A rolling circle transposon. Autonomous helitrons encode a 5'-to-3' DNA helicase and nuclease/ligase similar to those encoded by known rolling-circle replicons. http://en.wikipedia.org/wiki/Helitron sequence ISCR SO:0000544 helitron A rolling circle transposon. Autonomous helitrons encode a 5'-to-3' DNA helicase and nuclease/ligase similar to those encoded by known rolling-circle replicons. 
http://www.pnas.org/cgi/content/full/100/11/6569 http://en.wikipedia.org/wiki/Helitron wiki The pseudoknots involved in recoding are unique in that, as they play their role as a structure, they are immediately unfolded and their now linear sequence serves as a template for decoding. recoding pseudoknot sequence SO:0000545 recoding_pseudoknot The pseudoknots involved in recoding are unique in that, as they play their role as a structure, they are immediately unfolded and their now linear sequence serves as a template for decoding. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=33937 An oligonucleotide sequence that was designed by an experimenter that may or may not correspond with any natural sequence. designed sequence sequence SO:0000546 designed_sequence A chromosome generated by recombination between two inversions; there is a duplication at each end of the inversion. inversion derived bipartite duplication sequence SO:0000547 inversion_derived_bipartite_duplication A chromosome generated by recombination between two inversions; there is a duplication at each end of the inversion. FB:km A gene that encodes a transcript that is edited. gene with edited transcript sequence SO:0000548 gene_with_edited_transcript A gene that encodes a transcript that is edited. SO:xp A chromosome generated by recombination between two inversions; has a duplication at one end and presumed to have a deficiency or duplication at the other end of the inversion. inversion derived duplication plus aneuploid sequence SO:0000549 inversion_derived_duplication_plus_aneuploid A chromosome generated by recombination between two inversions; has a duplication at one end and presumed to have a deficiency or duplication at the other end of the inversion. FB:km A chromosome structural variation whereby either a chromosome exists in addition to the normal chromosome complement or is lacking. aneuploid chromosome sequence SO:0000550 Examples are Nullo-4, Haplo-4 and triplo-4 in Drosophila. 
aneuploid_chromosome A chromosome structural variation whereby either a chromosome exists in addition to the normal chromosome complement or is lacking. SO:ke The recognition sequence necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA. INSDC_feature:regulatory INSDC_qualifier:polyA_signal_sequence poly(A) signal polyA signal sequence polyadenylation termination signal sequence SO:0000551 Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. polyA_signal_sequence The recognition sequence necessary for endonuclease cleavage of an RNA transcript that is followed by polyadenylation; consensus=AATAAA. http://www.insdc.org/files/feature_table.html A region in the 5' UTR that pairs with the 16S rRNA during formation of the preinitiation complex. http://en.wikipedia.org/wiki/Shine-Dalgarno_sequence Shine Dalgarno sequence Shine-Dalgarno sequence five prime ribosome binding site sequence RBS SO:0000552 Not found in Eukaryotic sequence. Shine_Dalgarno_sequence A region in the 5' UTR that pairs with the 16S rRNA during formation of the preinitiation complex. SO:jh http://en.wikipedia.org/wiki/Shine-Dalgarno_sequence wiki The site on an RNA transcript to which adenine residues will be added by post-transcriptional polyadenylation. The boundary between the UTR and the polyA sequence. SO:0001430 INSDC_feature:polyA_site polyA cleavage site polyA junction polyA site polyA_junction sequence polyadenylation site SO:0000553 polyA_site The site on an RNA transcript to which adenine residues will be added by post-transcriptional polyadenylation. The boundary between the UTR and the polyA sequence.
http://www.insdc.org/files/feature_table.html sequence SO:0000554 assortment_derived_deficiency_plus_duplication true 5' most region of a precursor transcript that is clipped off during processing. five prime clip sequence 5' clip SO:0000555 five_prime_clip 5' most region of a precursor transcript that is clipped off during processing. http://www.insdc.org/files/feature_table.html Recombination signal of an immunoglobulin/T-cell receptor gene, including the 5' D-nonamer (SO:0000497), 5' D-spacer (SO:0000498), and 5' D-heptamer (SO:0000396) in 5' of the D-region of a D-gene, or in 5' of the D-region of DJ-gene. 5'RS five prime D recombination signal sequence five prime D-recombination signal sequence sequence SO:0000556 five_prime_D_recombination_signal_sequence Recombination signal of an immunoglobulin/T-cell receptor gene, including the 5' D-nonamer (SO:0000497), 5' D-spacer (SO:0000498), and 5' D-heptamer (SO:0000396) in 5' of the D-region of a D-gene, or in 5' of the D-region of DJ-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# 3'-most region of a precursor transcript that is clipped off during processing. 3'-clip three prime clip sequence SO:0000557 three_prime_clip 3'-most region of a precursor transcript that is clipped off during processing. http://www.insdc.org/files/feature_table.html Genomic DNA of immunoglobulin/T-cell receptor gene including more than one C-gene. C cluster C-CLUSTER sequence SO:0000558 C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene including more than one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including more than one D-gene. D cluster D-CLUSTER sequence SO:0000559 D_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including more than one D-gene. 
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one D-gene and one J-gene. D J cluster D-J-CLUSTER sequence SO:0000560 D_J_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration including at least one D-gene and one J-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Seven nucleotide recombination site (e.g. CACAGTG), part of V-gene, D-gene or J-gene recombination feature of an immunoglobulin or T-cell receptor gene. heptamer of recombination feature of vertebrate immune system gene sequence HEPTAMER SO:0000561 heptamer_of_recombination_feature_of_vertebrate_immune_system_gene Seven nucleotide recombination site (e.g. CACAGTG), part of V-gene, D-gene or J-gene recombination feature of an immunoglobulin or T-cell receptor gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Nine nucleotide recombination site, part of V-gene, D-gene or J-gene recombination feature of an immunoglobulin or T-cell receptor gene. nonamer of recombination feature of vertebrate immune system gene sequence SO:0000562 nonamer_of_recombination_feature_of_vertebrate_immune_system_gene A 12 or 23 nucleotide spacer between two regions of an immunoglobulin/T-cell receptor gene that may be rearranged by recombinase. vertebrate immune system gene recombination spacer sequence SO:0000563 vertebrate_immune_system_gene_recombination_spacer Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one DJ-gene, one J-gene and one C-gene. V DJ J C cluster V-(DJ)-J-C-CLUSTER sequence SO:0000564 V_DJ_J_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one DJ-gene, one J-gene and one C-gene. 
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VDJ-gene, one J-gene and one C-gene. V VDJ J C cluster V-(VDJ)-J-C-CLUSTER sequence SO:0000565 V_VDJ_J_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VDJ-gene, one J-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VJ-gene, one J-gene and one C-gene. V VJ J C cluster V-(VJ)-J-C-CLUSTER sequence SO:0000566 V_VJ_J_C_cluster Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one V-gene, one VJ-gene, one J-gene and one C-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A chromosome may be generated by recombination between two inversions; presumed to have a deficiency or duplication at each end of the inversion. inversion derived aneuploid chromosome sequence SO:0000567 inversion_derived_aneuploid_chromosome A chromosome may be generated by recombination between two inversions; presumed to have a deficiency or duplication at each end of the inversion. FB:km A promoter that can allow for transcription in both directions. bidirectional promoter sequence SO:0000568 Definition updated in Aug 2020 by Dave Sant. bidirectional_promoter A promoter that can allow for transcription in both directions. PMID:21601935 SO:ke An attribute of a feature that occurred as the product of a reverse transcriptase mediated event. SO:0100042 http://en.wikipedia.org/wiki/Retrotransposed sequence SO:0000569 GO:0003964 RNA-directed DNA polymerase activity. retrotransposed An attribute of a feature that occurred as the product of a reverse transcriptase mediated event. 
SO:ke http://en.wikipedia.org/wiki/Retrotransposed wiki Recombination signal of an immunoglobulin/T-cell receptor gene, including the 3' D-heptamer (SO:0000493), 3' D-spacer, and 3' D-nonamer (SO:0000494) in 3' of the D-region of a D-gene. 3'D-RS three prime D recombination signal sequence three_prime_D-recombination_signal_sequence sequence SO:0000570 three_prime_D_recombination_signal_sequence Recombination signal of an immunoglobulin/T-cell receptor gene, including the 3' D-heptamer (SO:0000493), 3' D-spacer, and 3' D-nonamer (SO:0000494) in 3' of the D-region of a D-gene. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A region that can be transcribed into a microRNA (miRNA). miRNA encoding sequence SO:0000571 miRNA_encoding Genomic DNA of immunoglobulin/T-cell receptor gene in partially rearranged genomic DNA including D-J-region with 5' UTR and 3' UTR, also designated as D-J-segment. D-J-GENE DJ gene sequence SO:0000572 DJ_gene_segment Genomic DNA of immunoglobulin/T-cell receptor gene in partially rearranged genomic DNA including D-J-region with 5' UTR and 3' UTR, also designated as D-J-segment. http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A region that can be transcribed into a ribosomal RNA (rRNA). rRNA encoding sequence SO:0000573 rRNA_encoding Rearranged genomic DNA of immunoglobulin/T-cell receptor gene including L-part1, V-intron and V-D-J-exon, with the 5'UTR (SO:0000204) and 3'UTR (SO:0000205). V-D-J-GENE VDJ gene sequence SO:0000574 VDJ_gene_segment Rearranged genomic DNA of immunoglobulin/T-cell receptor gene including L-part1, V-intron and V-D-J-exon, with the 5'UTR (SO:0000204) and 3'UTR (SO:0000205). http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A region that can be transcribed into a small cytoplasmic RNA (scRNA). scRNA encoding sequence SO:0000575 scRNA_encoding Rearranged genomic DNA of immunoglobulin/T-cell receptor gene including L-part1, V-intron and V-J-exon, with the 5'UTR (SO:0000204) and 3'UTR (SO:0000205). 
V-J-GENE VJ gene sequence SO:0000576 VJ_gene_segment Rearranged genomic DNA of immunoglobulin/T-cell receptor gene including L-part1, V-intron and V-J-exon, with the 5'UTR (SO:0000204) and 3'UTR (SO:0000205). http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7# A region of chromosome where the spindle fibers attach during mitosis and meiosis. http://en.wikipedia.org/wiki/Centromere INSDC_feature:centromere sequence SO:0000577 centromere A region of chromosome where the spindle fibers attach during mitosis and meiosis. SO:ke http://en.wikipedia.org/wiki/Centromere wiki A region that can be transcribed into a small nucleolar RNA (snoRNA). snoRNA encoding sequence SO:0000578 snoRNA_encoding A locatable feature on a transcript that is edited. edited transcript feature sequence SO:0000579 edited_transcript_feature A locatable feature on a transcript that is edited. SO:ma A primary transcript encoding a methylation guide small nucleolar RNA. methylation guide snoRNA primary transcript sequence SO:0000580 methylation_guide_snoRNA_primary_transcript A primary transcript encoding a methylation guide small nucleolar RNA. SO:ke A structure consisting of a 7-methylguanosine in 5'-5' triphosphate linkage with the first nucleotide of an mRNA. It is added post-transcriptionally, and is not encoded in the DNA. http://en.wikipedia.org/wiki/5%27_cap sequence SO:0000581 cap A structure consisting of a 7-methylguanosine in 5'-5' triphosphate linkage with the first nucleotide of an mRNA. It is added post-transcriptionally, and is not encoded in the DNA. http://seqcore.brcf.med.umich.edu/doc/educ/dnapr/mbglossary/mbgloss.html http://en.wikipedia.org/wiki/5%27_cap wiki A primary transcript encoding an rRNA cleavage snoRNA. rRNA cleavage snoRNA primary transcript sequence SO:0000582 rRNA_cleavage_snoRNA_primary_transcript A primary transcript encoding an rRNA cleavage snoRNA. SO:ke The region of a transcript that will be edited. 
pre edited region pre-edited region sequence SO:0000583 pre_edited_region The region of a transcript that will be edited. http://dna.kdna.ucla.edu/rna/index.aspx A tmRNA liberates an mRNA from a stalled ribosome. To accomplish this, part of the tmRNA is used as a reading frame that ends in a translation stop signal. The broken mRNA is replaced in the ribosome by the tmRNA and translation of the tmRNA leads to addition of a proteolysis tag to the incomplete protein enabling recognition by a protease. Recently a number of permuted tmRNA genes have been found encoded in two parts. TmRNAs have been identified in eubacteria and some chloroplasts but are absent from archaeal and eukaryotic nuclear genomes. http://en.wikipedia.org/wiki/TmRNA INSDC_feature:tmRNA sequence 10Sa RNA ssrA SO:0000584 tmRNA A tmRNA liberates an mRNA from a stalled ribosome. To accomplish this, part of the tmRNA is used as a reading frame that ends in a translation stop signal. The broken mRNA is replaced in the ribosome by the tmRNA and translation of the tmRNA leads to addition of a proteolysis tag to the incomplete protein enabling recognition by a protease. Recently a number of permuted tmRNA genes have been found encoded in two parts. TmRNAs have been identified in eubacteria and some chloroplasts but are absent from archaeal and eukaryotic nuclear genomes. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00023 http://en.wikipedia.org/wiki/TmRNA wiki snoRNA that is associated with guiding methylation of nucleotides. It contains two short conserved sequence motifs: C (RUGAUGA) near the 5-prime end and D (CUGA) near the 3-prime end. C/D box snoRNA encoding sequence SO:0000585 C_D_box_snoRNA_encoding A primary transcript encoding a tmRNA (SO:0000584). tmRNA primary transcript sequence 10Sa RNA primary transcript ssrA RNA primary transcript SO:0000586 tmRNA_primary_transcript A primary transcript encoding a tmRNA (SO:0000584). SO:ke Group I catalytic introns are large self-splicing ribozymes.
They catalyze their own excision from mRNA, tRNA and rRNA precursors in a wide range of organisms. The core secondary structure consists of 9 paired regions (P1-P9). These fold to essentially two domains, the P4-P6 domain (formed from the stacking of P5, P4, P6 and P6a helices) and the P3-P9 domain (formed from the P8, P3, P7 and P9 helices). Group I catalytic introns often have long ORFs inserted in loop regions. http://en.wikipedia.org/wiki/Group_I_intron group I intron sequence SO:0000587 GO:0000372. group_I_intron Group I catalytic introns are large self-splicing ribozymes. They catalyze their own excision from mRNA, tRNA and rRNA precursors in a wide range of organisms. The core secondary structure consists of 9 paired regions (P1-P9). These fold to essentially two domains, the P4-P6 domain (formed from the stacking of P5, P4, P6 and P6a helices) and the P3-P9 domain (formed from the P8, P3, P7 and P9 helices). Group I catalytic introns often have long ORFs inserted in loop regions. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00028 http://en.wikipedia.org/wiki/Group_I_intron wiki A self spliced intron. INSDC_feature:ncRNA INSDC_qualifier:autocatalytically_spliced_intron autocatalytically spliced intron sequence SO:0000588 autocatalytically_spliced_intron A self spliced intron. SO:ke A primary transcript encoding a signal recognition particle RNA. SRP RNA primary transcript sequence SO:0000589 SRP_RNA_primary_transcript A primary transcript encoding a signal recognition particle RNA. SO:ke The signal recognition particle (SRP) is a universally conserved ribonucleoprotein. It is involved in the co-translational targeting of proteins to membranes. The eukaryotic SRP consists of a 300-nucleotide 7S RNA and six proteins: SRPs 72, 68, 54, 19, 14, and 9. Archaeal SRP consists of a 7S RNA and homologues of the eukaryotic SRP19 and SRP54 proteins. In most eubacteria, the SRP consists of a 4.5S RNA and the Ffh protein (a homologue of the eukaryotic SRP54 protein). 
Eukaryotic and archaeal 7S RNAs have very similar secondary structures, with eight helical elements. These fold into the Alu and S domains, separated by a long linker region. Eubacterial SRP is generally a simpler structure, with the M domain of Ffh bound to a region of the 4.5S RNA that corresponds to helix 8 of the eukaryotic and archaeal SRP S domain. Some Gram-positive bacteria (e.g. Bacillus subtilis), however, have a larger SRP RNA that also has an Alu domain. The Alu domain is thought to mediate the peptide chain elongation retardation function of the SRP. The universally conserved helix which interacts with the SRP54/Ffh M domain mediates signal sequence recognition. In eukaryotes and archaea, the SRP19-helix 6 complex is thought to be involved in SRP assembly and stabilizes helix 8 for SRP54 binding. INSDC_feature:ncRNA INSDC_qualifier:SRP_RNA SRP RNA sequence 7S RNA signal recognition particle RNA SO:0000590 SRP_RNA The signal recognition particle (SRP) is a universally conserved ribonucleoprotein. It is involved in the co-translational targeting of proteins to membranes. The eukaryotic SRP consists of a 300-nucleotide 7S RNA and six proteins: SRPs 72, 68, 54, 19, 14, and 9. Archaeal SRP consists of a 7S RNA and homologues of the eukaryotic SRP19 and SRP54 proteins. In most eubacteria, the SRP consists of a 4.5S RNA and the Ffh protein (a homologue of the eukaryotic SRP54 protein). Eukaryotic and archaeal 7S RNAs have very similar secondary structures, with eight helical elements. These fold into the Alu and S domains, separated by a long linker region. Eubacterial SRP is generally a simpler structure, with the M domain of Ffh bound to a region of the 4.5S RNA that corresponds to helix 8 of the eukaryotic and archaeal SRP S domain. Some Gram-positive bacteria (e.g. Bacillus subtilis), however, have a larger SRP RNA that also has an Alu domain. The Alu domain is thought to mediate the peptide chain elongation retardation function of the SRP. 
The universally conserved helix which interacts with the SRP54/Ffh M domain mediates signal sequence recognition. In eukaryotes and archaea, the SRP19-helix 6 complex is thought to be involved in SRP assembly and stabilizes helix 8 for SRP54 binding. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00017 A tertiary structure in RNA where nucleotides in a loop form base pairs with a region of RNA downstream of the loop. http://en.wikipedia.org/wiki/Pseudoknot sequence SO:0000591 pseudoknot A tertiary structure in RNA where nucleotides in a loop form base pairs with a region of RNA downstream of the loop. RSC:cb http://en.wikipedia.org/wiki/Pseudoknot wiki A pseudoknot which contains two stems and at least two loops. H pseudoknot H-pseudoknot H-type pseudoknot classical pseudoknot hairpin-type pseudoknot sequence SO:0000592 H_pseudoknot A pseudoknot which contains two stems and at least two loops. http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10334330&dopt=Abstract Most box C/D snoRNAs also contain long (>10 nt) sequences complementary to rRNA. Boxes C and D, as well as boxes C' and D', are usually located in close proximity, and form a structure known as the box C/D motif. This motif is important for snoRNA stability, processing, nucleolar targeting and function. A small number of box C/D snoRNAs are involved in rRNA processing; most, however, are known or predicted to serve as guide RNAs in ribose methylation of rRNA. Targeting involves direct base pairing of the snoRNA at the rRNA site to be modified and selection of a rRNA nucleotide a fixed distance from box D or D'. C D box snoRNA C/D box snoRNA SNORD box C/D snoRNA sequence SO:0000593 Added 'SNORD' as a synonym of C_D_box_snoRNA (SO:0000593) and 'SNORA' as a synonym of H_ACA_box_snoRNA (SO:0000594). See GitHub Issue #577. C_D_box_snoRNA Most box C/D snoRNAs also contain long (>10 nt) sequences complementary to rRNA. 
Boxes C and D, as well as boxes C' and D', are usually located in close proximity, and form a structure known as the box C/D motif. This motif is important for snoRNA stability, processing, nucleolar targeting and function. A small number of box C/D snoRNAs are involved in rRNA processing; most, however, are known or predicted to serve as guide RNAs in ribose methylation of rRNA. Targeting involves direct base pairing of the snoRNA at the rRNA site to be modified and selection of a rRNA nucleotide a fixed distance from box D or D'. http://www.bio.umass.edu/biochem/rna-sequence/Yeast_snoRNA_Database/snoRNA_DataBase.html SNORD PMID:31828325 Members of the box H/ACA family contain an ACA triplet, exactly 3 nt upstream from the 3' end and an H-box in a hinge region that links two structurally similar functional domains of the molecule. Both boxes are important for snoRNA biosynthesis and function. A few box H/ACA snoRNAs are involved in rRNA processing; most others are known or predicted to participate in selection of uridine nucleosides in rRNA to be converted to pseudouridines. Site selection is mediated by direct base pairing of the snoRNA with rRNA through one or both targeting domains. H ACA box snoRNA H/ACA box snoRNA SNORA box H/ACA snoRNA sequence SO:0000594 Added 'SNORD' as a synonym of C_D_box_snoRNA (SO:0000593) and 'SNORA' as a synonym of H_ACA_box_snoRNA (SO:0000594). See GitHub Issue #577. H_ACA_box_snoRNA Members of the box H/ACA family contain an ACA triplet, exactly 3 nt upstream from the 3' end and an H-box in a hinge region that links two structurally similar functional domains of the molecule. Both boxes are important for snoRNA biosynthesis and function. A few box H/ACA snoRNAs are involved in rRNA processing; most others are known or predicted to participate in selection of uridine nucleosides in rRNA to be converted to pseudouridines. Site selection is mediated by direct base pairing of the snoRNA with rRNA through one or both targeting domains. 
http://www.bio.umass.edu/biochem/rna-sequence/Yeast_snoRNA_Database/snoRNA_DataBase.html SNORA PMID:31828325 A primary transcript encoding a small nucleolar RNA of the box C/D family. C/D box snoRNA primary transcript sequence SO:0000595 C_D_box_snoRNA_primary_transcript A primary transcript encoding a small nucleolar RNA of the box C/D family. SO:ke A primary transcript encoding a small nucleolar RNA of the box H/ACA family. H ACA box snoRNA primary transcript sequence SO:0000596 H_ACA_box_snoRNA_primary_transcript A primary transcript encoding a small nucleolar RNA of the box H/ACA family. SO:ke The insertion and deletion of uridine (U) residues, usually within coding regions of mRNA transcripts of cryptogenes in the mitochondrial genome of kinetoplastid protozoa. sequence SO:0000597 transcript_edited_by_U_insertion/deletion true The insertion and deletion of uridine (U) residues, usually within coding regions of mRNA transcripts of cryptogenes in the mitochondrial genome of kinetoplastid protozoa. http://www.rna.ucla.edu/index.html sequence transcript_edited_by_C-insertion_and_dinucleotide_insertion SO:0000598 edited_by_C_insertion_and_dinucleotide_insertion true sequence SO:0000599 edited_by_C_to_U_substitution true sequence SO:0000600 edited_by_A_to_I_substitution true sequence SO:0000601 edited_by_G_addition true A short 3'-uridylated RNA that can form a duplex (except for its post-transcriptionally added oligo_U tail (SO:0000609)) with a stretch of mature edited mRNA. INSDC_feature:ncRNA http://en.wikipedia.org/wiki/Guide_RNA INSDC_qualifier:guide_RNA gRNA guide RNA sequence SO:0000602 guide_RNA A short 3'-uridylated RNA that can form a duplex (except for its post-transcriptionally added oligo_U tail (SO:0000609)) with a stretch of mature edited mRNA. http://www.rna.ucla.edu/index.html http://en.wikipedia.org/wiki/Guide_RNA wiki Group II introns are found in rRNA, tRNA and mRNA of organelles in fungi, plants and protists, and also in mRNA in bacteria. 
They are large self-splicing ribozymes and have 6 structural domains (usually designated dI to dVI). A subset of group II introns also encode essential splicing proteins in intronic ORFs. The length of these introns can therefore be up to 3kb. Splicing occurs in almost identical fashion to nuclear pre-mRNA splicing with two transesterification steps. The 2' hydroxyl of a bulged adenosine in domain VI attacks the 5' splice site, followed by nucleophilic attack on the 3' splice site by the 3' OH of the upstream exon. Protein machinery is required for splicing in vivo, and long range intron to intron and intron-exon interactions are important for splice site positioning. Group II introns are further sub-classified into groups IIA and IIB which differ in splice site consensus, distance of bulged A from 3' splice site, some tertiary interactions, and intronic ORF phylogeny. http://en.wikipedia.org/wiki/Group_II_intron group II intron sequence SO:0000603 GO:0000373. group_II_intron Group II introns are found in rRNA, tRNA and mRNA of organelles in fungi, plants and protists, and also in mRNA in bacteria. They are large self-splicing ribozymes and have 6 structural domains (usually designated dI to dVI). A subset of group II introns also encode essential splicing proteins in intronic ORFs. The length of these introns can therefore be up to 3kb. Splicing occurs in almost identical fashion to nuclear pre-mRNA splicing with two transesterification steps. The 2' hydroxyl of a bulged adenosine in domain VI attacks the 5' splice site, followed by nucleophilic attack on the 3' splice site by the 3' OH of the upstream exon. Protein machinery is required for splicing in vivo, and long range intron to intron and intron-exon interactions are important for splice site positioning. Group II introns are further sub-classified into groups IIA and IIB which differ in splice site consensus, distance of bulged A from 3' splice site, some tertiary interactions, and intronic ORF phylogeny. 
http://www.sanger.ac.uk/Software/Rfam/browse/index.shtml http://en.wikipedia.org/wiki/Group_II_intron wiki Edited mRNA sequence mediated by a single guide RNA (SO:0000602). editing block sequence SO:0000604 editing_block Edited mRNA sequence mediated by a single guide RNA (SO:0000602). http://dna.kdna.ucla.edu/rna/index.aspx A region containing or overlapping no genes that is bounded on either side by a gene, or bounded by a gene and the end of the chromosome. http://en.wikipedia.org/wiki/Intergenic_region intergenic region sequence SO:0000605 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. intergenic_region A region containing or overlapping no genes that is bounded on either side by a gene, or bounded by a gene and the end of the chromosome. SO:cjm http://en.wikipedia.org/wiki/Intergenic_region wiki Edited mRNA sequence mediated by two or more overlapping guide RNAs (SO:0000602). editing domain sequence SO:0000606 editing_domain Edited mRNA sequence mediated by two or more overlapping guide RNAs (SO:0000602). http://dna.kdna.ucla.edu/rna/index.aspx The region of an edited transcript that will not be edited. unedited region sequence SO:0000607 unedited_region The region of an edited transcript that will not be edited. http://dna.kdna.ucla.edu/rna/index.aspx snoRNA that is associated with guiding pseudouridylation. It contains two short conserved sequence motifs: H box (ANANNA) and ACA (ACA). H ACA box snoRNA encoding sequence SO:0000608 H_ACA_box_snoRNA_encoding The string of non-encoded U's at the 3' end of a guide RNA (SO:0000602). oligo U tail sequence SO:0000609 oligo_U_tail The string of non-encoded U's at the 3' end of a guide RNA (SO:0000602). http://www.rna.ucla.edu/ Sequence of about 100 nucleotides of A added to the 3' end of most eukaryotic mRNAs. polyA sequence sequence SO:0000610 polyA_sequence Sequence of about 100 nucleotides of A added to the 3' end of most eukaryotic mRNAs.
SO:ke A pyrimidine rich sequence near the 3' end of an intron to which the 5' end becomes covalently bound during nuclear splicing. The resulting structure resembles a lariat. branch point branch site branch_point sequence SO:0000611 branch_site A pyrimidine rich sequence near the 3' end of an intron to which the 5' end becomes covalently bound during nuclear splicing. The resulting structure resembles a lariat. SO:ke The polypyrimidine tract is one of the cis-acting sequence elements directing intron removal in pre-mRNA splicing. http://en.wikipedia.org/wiki/Polypyrimidine_tract polypyrimidine tract sequence SO:0000612 polypyrimidine_tract The polypyrimidine tract is one of the cis-acting sequence elements directing intron removal in pre-mRNA splicing. http://nar.oupjournals.org/cgi/content/full/25/4/888 http://en.wikipedia.org/wiki/Polypyrimidine_tract wiki A DNA sequence to which bacterial RNA polymerase binds, to begin transcription. bacterial RNApol promoter sequence SO:0000613 former parent RNA_polymerase_promoter SO:0001203 was merged with promoter SO:0000167 in Aug 2020 as part of GREEKC. bacterial_RNApol_promoter A DNA sequence to which bacterial RNA polymerase binds, to begin transcription. SO:ke A terminator signal for bacterial transcription. bacterial terminator sequence SO:0000614 Moved to transcriptional_cis_regulatory_region (SO:0001055) from gene_group_regulatory_region (SO:0000752) on 11 Feb 2021 when SO:0000752 was merged into SO:0001055. See GitHub Issue #529. bacterial_terminator A terminator signal for bacterial transcription. SO:ke A terminator signal for RNA polymerase III transcription. terminator of type 2 RNApol III promoter sequence SO:0000615 terminator_of_type_2_RNApol_III_promoter A terminator signal for RNA polymerase III transcription. SO:ke The base where transcription ends. transcription end site sequence SO:0000616 transcription_end_site The base where transcription ends. SO:ke This type of promoter recruits RNA pol III.
This promoter is intragenic and includes an A box, an intermediate element, and a C box. This is well conserved in the 5S rRNA promoters across species. RNApol III promoter type 1 sequence SO:0000617 RNApol_III_promoter_type_1 This type of promoter recruits RNA pol III. This promoter is intragenic and includes an A box, an intermediate element, and a C box. This is well conserved in the 5S rRNA promoters across species. PMID:12381659 This type of promoter recruits RNA pol III to transcribe genes mainly for tRNA. This promoter is intragenic and includes an A box and a B box. RNApol III promoter type 2 sequence tRNA promoter SO:0000618 RNApol_III_promoter_type_2 This type of promoter recruits RNA pol III to transcribe genes mainly for tRNA. This promoter is intragenic and includes an A box and a B box. PMID:12381659 A variably distant linear promoter region recognized by TFIIIC, with consensus sequence TGGCnnAGTGG. http://en.wikipedia.org/wiki/A-box A-box sequence SO:0000619 Binds TFIIIC. A_box A variably distant linear promoter region recognized by TFIIIC, with consensus sequence TGGCnnAGTGG. SO:ke http://en.wikipedia.org/wiki/A-box wiki A variably distant linear promoter region recognized by TFIIIC, with consensus sequence AGGTTCCAnnCC. B-box sequence SO:0000620 Binds TFIIIC. B_box A variably distant linear promoter region recognized by TFIIIC, with consensus sequence AGGTTCCAnnCC. SO:ke This type of promoter recruits RNA pol III to transcribe predominantly noncoding RNAs. This promoter contains a proximal sequence element (PSE) and a TATA box upstream of the gene that it regulates. Transcription can also be activated by a distal sequence element (DSE), which is located further upstream. RNApol III promoter type 3 sequence SO:0000621 RNApol_III_promoter_type_3 This type of promoter recruits RNA pol III to transcribe predominantly noncoding RNAs. This promoter contains a proximal sequence element (PSE) and a TATA box upstream of the gene that it regulates.
Transcription can also be activated by a distal sequence element (DSE), which is located further upstream. PMID:12381659 An RNA polymerase III type 1 promoter with consensus sequence CAnnCCn. C-box sequence SO:0000622 C_box An RNA polymerase III type 1 promoter with consensus sequence CAnnCCn. SO:ke A region that can be transcribed into a small nuclear RNA (snRNA). snRNA encoding sequence SO:0000623 snRNA_encoding A specific structure at the end of a linear chromosome, required for the integrity and maintenance of the end. http://en.wikipedia.org/wiki/Telomere INSDC_feature:telomere telomeric DNA telomeric sequence sequence SO:0000624 telomere A specific structure at the end of a linear chromosome, required for the integrity and maintenance of the end. SO:ma http://en.wikipedia.org/wiki/Telomere wiki A regulatory region which upon binding of transcription factors, suppresses the transcription of the gene or genes they control. INSDC_feature:regulatory http://en.wikipedia.org/wiki/Silencer_(DNA) INSDC_qualifier:silencer sequence SO:0000625 silencer A regulatory region which upon binding of transcription factors, suppresses the transcription of the gene or genes they control. SO:ke http://en.wikipedia.org/wiki/Silencer_(DNA) wiki Regions of the chromosome that are important for regulating binding of chromosomes to the nuclear matrix. chromosomal regulatory element sequence SO:0000626 chromosomal_regulatory_element A regulatory region that 1) when located between a CRM and a gene's promoter prevents the CRM from modulating that gene's expression and 2) acts as a chromatin boundary element or barrier that can block the encroachment of condensed chromatin from an adjacent region. INSDC_feature:regulatory http://en.wikipedia.org/wiki/Insulator_(genetics) INSDC_qualifier:insulator insulator element sequence SO:0000627 moved from is_a: SO:0001055 transcriptional_cis_regulatory_region as per request from GREEKC initiative in August 2020.
insulator A regulatory region that 1) when located between a CRM and a gene's promoter prevents the CRM from modulating that gene's expression and 2) acts as a chromatin boundary element or barrier that can block the encroachment of condensed chromatin from an adjacent region. NCBI:cf PMID:12154228 SO:regcreative http://en.wikipedia.org/wiki/Insulator_(genetics) wiki Regions of the chromosome that are important for structural elements. chromosomal structural element sequence SO:0000628 chromosomal_structural_element An open reading frame found within the 5' UTR that can be translated and stall the translation of the downstream open reading frame. five prime open reading frame sequence SO:0000629 five_prime_open_reading_frame An open reading frame found within the 5' UTR that can be translated and stall the translation of the downstream open reading frame. PMID:12890013 A start codon upstream of the ORF. upstream AUG codon sequence SO:0000630 upstream_AUG_codon A start codon upstream of the ORF. SO:ke A primary transcript encoding for more than one gene product. polycistronic primary transcript sequence SO:0000631 polycistronic_primary_transcript A primary transcript encoding for more than one gene product. SO:ke A primary transcript encoding for one gene product. monocistronic primary transcript sequence SO:0000632 monocistronic_primary_transcript A primary transcript encoding for one gene product. SO:ke An mRNA with either a single protein product, or for which the regions encoding all its protein products overlap. http://en.wikipedia.org/wiki/Monocistronic_mRNA monocistronic mRNA monocistronic processed transcript sequence SO:0000633 monocistronic_mRNA An mRNA with either a single protein product, or for which the regions encoding all its protein products overlap. SO:rd http://en.wikipedia.org/wiki/Monocistronic_mRNA wiki An mRNA that encodes multiple proteins from at least two non-overlapping regions.
http://en.wikipedia.org/wiki/Polycistronic_mRNA polycistronic mRNA sequence polycistronic processed transcript SO:0000634 polycistronic_mRNA An mRNA that encodes multiple proteins from at least two non-overlapping regions. SO:rd http://en.wikipedia.org/wiki/Polycistronic_mRNA wiki A primary transcript that donates the spliced leader to other mRNA. mini exon donor RNA mini-exon donor RNA sequence SO:0000635 mini_exon_donor_RNA A primary transcript that donates the spliced leader to other mRNA. SO:ke Small nuclear RNAs that are incorporated into the pre-mRNAs to replace the 5' end in some eukaryotes. spliced leader RNA sequence mini-exon SO:0000636 spliced_leader_RNA Small nuclear RNAs that are incorporated into the pre-mRNAs to replace the 5' end in some eukaryotes. PMID:24130571 A plasmid that is engineered. engineered plasmid sequence engineered plasmid gene SO:0000637 engineered_plasmid A plasmid that is engineered. SO:xp Part of an rRNA transcription unit that is transcribed but discarded during maturation, not giving rise to any part of rRNA. transcribed spacer region sequence SO:0000638 transcribed_spacer_region Part of an rRNA transcription unit that is transcribed but discarded during maturation, not giving rise to any part of rRNA. http://oregonstate.edu/instruction/bb492/general/glossary.html Non-coding regions of DNA sequence that separate genes coding for the 28S, 5.8S, and 18S ribosomal RNAs. internal transcribed spacer region sequence SO:0000639 internal_transcribed_spacer_region Non-coding regions of DNA sequence that separate genes coding for the 28S, 5.8S, and 18S ribosomal RNAs. SO:ke Non-coding regions of DNA that precede the sequence that codes for the ribosomal RNA. external transcribed spacer region sequence SO:0000640 external_transcribed_spacer_region Non-coding regions of DNA that precede the sequence that codes for the ribosomal RNA. SO:ke A region of a repeating tetranucleotide sequence (four bases).
tetranucleotide repeat microsatellite feature sequence SO:0000641 tetranucleotide_repeat_microsatellite_feature A region that can be transcribed into a signal recognition particle RNA (SRP RNA). SRP RNA encoding sequence SO:0000642 SRP_RNA_encoding A repeat region containing tandemly repeated sequences having a unit length of 10 to 40 bp. INSDC_feature:repeat_region http://en.wikipedia.org/wiki/Minisatellite INSDC_qualifier:minisatellite VNTR sequence SO:0000643 minisatellite A repeat region containing tandemly repeated sequences having a unit length of 10 to 40 bp. http://www.informatics.jax.org/silver/glossary.shtml http://en.wikipedia.org/wiki/Minisatellite wiki VNTR http://www.ncbi.nlm.nih.gov/books/NBK21126/def-item/A9655/ Antisense RNA is RNA that is transcribed from the coding, rather than the template, strand of DNA. It is therefore complementary to mRNA. INSDC_feature:ncRNA http://en.wikipedia.org/wiki/Antisense_RNA INSDC_qualifier:antisense_RNA antisense RNA sequence SO:0000644 antisense_RNA Antisense RNA is RNA that is transcribed from the coding, rather than the template, strand of DNA. It is therefore complementary to mRNA. SO:ke http://en.wikipedia.org/wiki/Antisense_RNA wiki The reverse complement of the primary transcript. antisense primary transcript sequence SO:0000645 antisense_primary_transcript The reverse complement of the primary transcript. SO:ke A small RNA molecule that is the product of a longer exogenous or endogenous dsRNA, which is either a bimolecular duplex or very long hairpin, processed (via the Dicer pathway) such that numerous siRNAs accumulate from both strands of the dsRNA. siRNAs trigger the cleavage of their target molecules. 
INSDC_feature:ncRNA http://en.wikipedia.org/wiki/SiRNA INSDC_qualifier:siRNA small interfering RNA sequence SO:0000646 siRNA A small RNA molecule that is the product of a longer exogenous or endogenous dsRNA, which is either a bimolecular duplex or very long hairpin, processed (via the Dicer pathway) such that numerous siRNAs accumulate from both strands of the dsRNA. siRNAs trigger the cleavage of their target molecules. PMID:12592000 http://en.wikipedia.org/wiki/SiRNA wiki A primary transcript encoding a micro RNA. SO:0000648 miRNA primary transcript micro RNA primary transcript small temporal RNA primary transcript stRNA primary transcript stRNA_primary_transcript sequence SO:0000647 miRNA_primary_transcript A primary transcript encoding a micro RNA. SO:ke true true Cytosolic SSU rRNA is an RNA component of the small subunit of cytosolic ribosomes. cytosolic SSU rRNA cytosolic SSU ribosomal RNA cytosolic small subunit rRNA sequence SO:0000650 Renamed to cytosolic_SSU_rRNA from small_subunit_rRNA on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493. cytosolic_SSU_rRNA Cytosolic SSU rRNA is an RNA component of the small subunit of cytosolic ribosomes. SO:ke Cytosolic LSU rRNA is an RNA component of the large subunit of cytosolic ribosomes. cytosolic LSU RNA cytosolic LSU rRNA cytosolic large subunit rRNA sequence SO:0000651 Renamed to cytosolic_LSU_rRNA from large_subunit_rRNA on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493. cytosolic_LSU_rRNA Cytosolic LSU rRNA is an RNA component of the large subunit of cytosolic ribosomes. SO:ke Cytosolic 5S rRNA is an RNA component of the large subunit of cytosolic ribosomes in both prokaryotes and eukaryotes. 
http://en.wikipedia.org/wiki/5S_ribosomal_RNA cytosolic 5S LSU rRNA cytosolic 5S rRNA cytosolic 5S ribosomal RNA cytosolic rRNA 5S sequence SO:0000652 Renamed from rRNA_5S to cytosolic_5S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493. cytosolic_5S_rRNA Cytosolic 5S rRNA is an RNA component of the large subunit of cytosolic ribosomes in both prokaryotes and eukaryotes. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00001 http://en.wikipedia.org/wiki/5S_ribosomal_RNA wiki Cytosolic 28S rRNA is an RNA component of the large subunit of cytosolic ribosomes in metazoan eukaryotes. http://en.wikipedia.org/wiki/28S_ribosomal_RNA cytosolic 28S LSU rRNA cytosolic 28S rRNA cytosolic 28S ribosomal RNA cytosolic rRNA 28S sequence SO:0000653 Renamed from rRNA_28S to cytosolic_28S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493. cytosolic_28S_rRNA Cytosolic 28S rRNA is an RNA component of the large subunit of cytosolic ribosomes in metazoan eukaryotes. SO:ke http://en.wikipedia.org/wiki/28S_ribosomal_RNA wiki A mitochondrial gene located in a maxicircle. maxi-circle gene maxicircle gene sequence SO:0000654 maxicircle_gene A mitochondrial gene located in a maxicircle. SO:xp An RNA transcript that does not encode a protein; rather, the RNA molecule is the gene product. INSDC_qualifier:other http://en.wikipedia.org/wiki/NcRNA http://www.gencodegenes.org/gencode_biotypes.html known_ncrna noncoding RNA sequence SO:0000655 A ncRNA is a processed_transcript, so it may not contain parts such as transcribed_spacer_regions that are removed in the act of processing. For the corresponding primary_transcripts, please see term SO:0000483 nc_primary_transcript.
ncRNA An RNA transcript that does not encode a protein; rather, the RNA molecule is the gene product. SO:ke http://en.wikipedia.org/wiki/NcRNA wiki http://www.gencodegenes.org/gencode_biotypes.html GENCODE A region that can be transcribed into a small temporal RNA (stRNA). Found in roundworm development. stRNA encoding sequence SO:0000656 stRNA_encoding A region of sequence containing one or more repeat units. INSDC_feature:repeat_region INSDC_qualifier:other repeat region sequence SO:0000657 repeat_region A region of sequence containing one or more repeat units. SO:ke A repeat that is located at dispersed sites in the genome. INSDC_feature:repeat_region http://en.wikipedia.org/wiki/Interspersed_repeat INSDC_qualifier:dispersed dispersed repeat interspersed repeat sequence SO:0000658 dispersed_repeat A repeat that is located at dispersed sites in the genome. SO:ke http://en.wikipedia.org/wiki/Interspersed_repeat wiki A region that can be transcribed into a transfer-messenger RNA (tmRNA). tmRNA encoding sequence SO:0000659 tmRNA_encoding sequence SO:0000660 DNA_invertase_target_sequence true sequence SO:0000661 intron_attribute true An intron which is spliced by the spliceosome. spliceosomal intron sequence SO:0000662 GO:0000398. spliceosomal_intron An intron which is spliced by the spliceosome. SO:ke A region that can be transcribed into a transfer RNA (tRNA). tRNA encoding sequence SO:0000663 tRNA_encoding A region of a chromosome that has been introduced by backcrossing with a separate species. introgressed chromosome region sequence SO:0000664 introgressed_chromosome_region A region of a chromosome that has been introduced by backcrossing with a separate species. PMID:11454782 A transcript that is monocistronic. monocistronic transcript sequence SO:0000665 monocistronic_transcript A transcript that is monocistronic.
SO:xp An intron (mitochondrial, chloroplast, nuclear or prokaryotic) that encodes a double strand sequence specific endonuclease allowing for mobility. mobile intron sequence SO:0000666 mobile_intron An intron (mitochondrial, chloroplast, nuclear or prokaryotic) that encodes a double strand sequence specific endonuclease allowing for mobility. SO:ke The sequence of one or more nucleotides added between two adjacent nucleotides in the sequence. SO:1000034 loinc:LA6687-3 insertion nucleotide insertion nucleotide_insertion sequence SO:0000667 insertion The sequence of one or more nucleotides added between two adjacent nucleotides in the sequence. SO:ke loinc:LA6687-3 Insertion insertion http://www.ncbi.nlm.nih.gov/dbvar/ A match against an EST sequence. EST match sequence SO:0000668 EST_match A match against an EST sequence. SO:ke A feature where a segment of DNA has been rearranged from what it was in the parent cell. sequence rearrangement feature sequence SO:0000669 sequence_rearrangement_feature A sequence within the micronuclear DNA of ciliates at which chromosome breakage and telomere addition occurs during nuclear differentiation. chromosome breakage sequence sequence SO:0000670 chromosome_breakage_sequence A sequence within the micronuclear DNA of ciliates at which chromosome breakage and telomere addition occurs during nuclear differentiation. SO:ma A sequence eliminated from the genome of ciliates during nuclear differentiation. internal eliminated sequence sequence SO:0000671 internal_eliminated_sequence A sequence eliminated from the genome of ciliates during nuclear differentiation. SO:ma A sequence that is conserved, although rearranged relative to the micronucleus, in the macronucleus of a ciliate genome. macronucleus destined segment sequence SO:0000672 macronucleus_destined_segment A sequence that is conserved, although rearranged relative to the micronucleus, in the macronucleus of a ciliate genome. 
SO:ma An RNA synthesized on a DNA or RNA template by an RNA polymerase. INSDC_feature:misc_RNA http://en.wikipedia.org/wiki/RNA sequence SO:0000673 Added relationship overlaps SO:0002300 unit_of_gene_expression with Mejia-Almonte et al. PMID:32665585 Aug 5, 2020. transcript An RNA synthesized on a DNA or RNA template by an RNA polymerase. SO:ma http://en.wikipedia.org/wiki/RNA wiki A splice site where the donor and acceptor sites differ from the canonical form. SO:0000678 SO:0000679 non canonical splice site non-canonical splice site sequence SO:0000674 non_canonical_splice_site true A splice site where the donor and acceptor sites differ from the canonical form. SO:ke The major class of splice site with dinucleotides GT and AG for donor and acceptor sites, respectively. SO:0000676 SO:0000677 canonical splice site sequence SO:0000675 canonical_splice_site true The major class of splice site with dinucleotides GT and AG for donor and acceptor sites, respectively. SO:ke The canonical 3' splice site has the sequence "AG". canonical 3' splice site canonical three prime splice site sequence SO:0000676 canonical_three_prime_splice_site The canonical 3' splice site has the sequence "AG". SO:ke The canonical 5' splice site has the sequence "GT". canonical 5' splice site canonical five prime splice site sequence SO:0000677 canonical_five_prime_splice_site The canonical 5' splice site has the sequence "GT". SO:ke A 3' splice site that does not have the sequence "AG". non canonical three prime splice site non-canonical three prime splice site sequence non canonical 3' splice site SO:0000678 non_canonical_three_prime_splice_site A 3' splice site that does not have the sequence "AG". SO:ke A 5' splice site which does not have the sequence "GT". non canonical 5' splice site non canonical five prime splice site non-canonical five prime splice site sequence SO:0000679 non_canonical_five_prime_splice_site A 5' splice site which does not have the sequence "GT".
SO:ke A start codon that is not the usual AUG sequence. non ATG start codon non canonical start codon non-canonical start codon sequence SO:0000680 non_canonical_start_codon A start codon that is not the usual AUG sequence. SO:ke A transcript that has been processed "incorrectly", for example by the failure of splicing of one or more exons. aberrant processed transcript sequence SO:0000681 aberrant_processed_transcript A transcript that has been processed "incorrectly", for example by the failure of splicing of one or more exons. SO:ke sequence SO:0000682 splicing_feature true Exonic splicing enhancers (ESEs) facilitate exon definition by assisting in the recruitment of splicing factors to the adjacent intron. exonic splice enhancer sequence SO:0000683 exonic_splice_enhancer Exonic splicing enhancers (ESEs) facilitate exon definition by assisting in the recruitment of splicing factors to the adjacent intron. http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12403462&dopt=Abstract A region of nucleotide sequence targeted by a nuclease enzyme. nuclease sensitive site sequence SO:0000684 nuclease_sensitive_site A region of nucleotide sequence targeted by a nuclease enzyme. SO:ma DNA region representing open chromatin structure that is hypersensitive to digestion by DNase I. INSDC_feature:regulatory DHS DNaseI hypersensitive site INSDC_qualifier:DNase_I_hypersensitive_site sequence SO:0000685 DNaseI_hypersensitive_site A chromosomal translocation whereby the chromosomes carrying non-homologous centromeres may be recovered independently. These chromosomes are described as translocation elements. This occurs for some translocations, particularly but not exclusively, reciprocal translocations. translocation element sequence SO:0000686 translocation_element A chromosomal translocation whereby the chromosomes carrying non-homologous centromeres may be recovered independently. These chromosomes are described as translocation elements. 
This occurs for some translocations, particularly but not exclusively, reciprocal translocations. SO:ma The space between two bases in a sequence which marks the position where a deletion has occurred. deletion junction sequence SO:0000687 deletion_junction The space between two bases in a sequence which marks the position where a deletion has occurred. SO:ke A set of subregions selected from sequence contigs which when concatenated form a nonredundant linear sequence. golden path sequence SO:0000688 golden_path A set of subregions selected from sequence contigs which when concatenated form a nonredundant linear sequence. SO:ls A match against cDNA sequence. cDNA match sequence SO:0000689 cDNA_match A match against cDNA sequence. SO:ke A gene that encodes a polycistronic transcript. gene with polycistronic transcript sequence SO:0000690 gene_with_polycistronic_transcript A gene that encodes a polycistronic transcript. SO:xp The initiator methionine that has been cleaved from a mature polypeptide sequence. BS:00067 cleaved initiator methionine sequence init_met initiator methionine SO:0000691 cleaved_initiator_methionine The initiator methionine that has been cleaved from a mature polypeptide sequence. EBIBS:GAR init_met uniprot:feature_type A gene that encodes a dicistronic transcript. gene with dicistronic transcript sequence SO:0000692 gene_with_dicistronic_transcript A gene that encodes a dicistronic transcript. SO:xp A gene that encodes an mRNA that is recoded. gene with recoded mRNA sequence SO:0000693 gene_with_recoded_mRNA A gene that encodes an mRNA that is recoded. SO:xp SNPs are single base pair positions in genomic DNA at which different sequence alternatives exist in normal individuals in some population(s), wherein the least frequent variant has an abundance of 1% or greater. 
single nucleotide polymorphism sequence SO:0000694 SNP SNPs are single base pair positions in genomic DNA at which different sequence alternatives exist in normal individuals in some population(s), wherein the least frequent variant has an abundance of 1% or greater. SO:cb A sequence used in an experiment. sequence SO:0000695 Requested by Lynn Crosby, Jan 2006. reagent A sequence used in an experiment. SO:ke A short oligonucleotide sequence, of length on the order of tens of bases; either single or double stranded. http://en.wikipedia.org/wiki/Oligonucleotide oligonucleotide sequence SO:0000696 oligo A short oligonucleotide sequence, of length on the order of tens of bases; either single or double stranded. SO:ma http://en.wikipedia.org/wiki/Oligonucleotide wiki A gene that encodes a transcript with stop codon readthrough. gene with stop codon read through sequence SO:0000697 gene_with_stop_codon_read_through A gene that encodes a transcript with stop codon readthrough. SO:xp A gene encoding an mRNA that has the stop codon redefined as pyrrolysine. gene with stop codon redefined as pyrrolysine sequence SO:0000698 gene_with_stop_codon_redefined_as_pyrrolysine A gene encoding an mRNA that has the stop codon redefined as pyrrolysine. SO:xp A sequence_feature with an extent of zero. boundary breakpoint sequence SO:0000699 A junction is a boundary between regions. A boundary has an extent of zero. junction A sequence_feature with an extent of zero. SO:ke A comment about the sequence. sequence SO:0000700 remark A comment about the sequence. SO:ke A region of sequence where the validity of the base calling is questionable. possible base call error sequence SO:0000701 possible_base_call_error A region of sequence where the validity of the base calling is questionable. SO:ke A region of sequence where there may have been an error in the assembly.
possible assembly error sequence SO:0000702 possible_assembly_error A region of sequence where there may have been an error in the assembly. SO:ke A region of sequence implicated in an experimental result. experimental result region sequence SO:0000703 experimental_result_region A region of sequence implicated in an experimental result. SO:ke A region (or regions) that includes all of the sequence elements necessary to encode a functional transcript. A gene may include regulatory regions, transcribed regions and/or other functional sequence regions. http://en.wikipedia.org/wiki/Gene INSDC_feature:gene sequence SO:0000704 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. A gene may be considered as a unit of inheritance. gene A region (or regions) that includes all of the sequence elements necessary to encode a functional transcript. A gene may include regulatory regions, transcribed regions and/or other functional sequence regions. SO:immuno_workshop http://en.wikipedia.org/wiki/Gene wiki Two or more adjacent copies of a region (of length greater than 1). INSDC_feature:repeat_region http://en.wikipedia.org/wiki/Tandem_repeat http://www.sci.sdsu.edu/~smaloy/Glossary/T.html INSDC_qualifier:tandem tandem repeat sequence SO:0000705 tandem_repeat Two or more adjacent copies of a region (of length greater than 1). SO:ke http://en.wikipedia.org/wiki/Tandem_repeat wiki The 3' splice site of the acceptor primary transcript. trans splice acceptor site sequence 3' trans splice site SO:0000706 This region contains a polypyrimidine tract and AG dinucleotide in some organisms and is UUUCAG in C. elegans. trans_splice_acceptor_site The 3' splice site of the acceptor primary transcript. SO:ke The 5' splice site region of the donor RNA. trans splice donor site trans-splice donor site sequence 5 prime trans splice site SO:0000707 SL RNA contains a donor site. trans_splice_donor_site The 5' splice site region of the donor RNA.
SO:ke A trans_splicing_acceptor_site which appends the 22nt SL1 RNA leader sequence to the 5' end of most mRNAs. SL1 acceptor site sequence SO:0000708 SL1_acceptor_site A trans_splicing_acceptor_site which appends the 22nt SL1 RNA leader sequence to the 5' end of most mRNAs. SO:nlw A trans_splicing_acceptor_site which appends the 22nt SL2 RNA leader sequence to the 5' end of mRNAs. SL2 acceptor sites occur in genes in internal segments of polycistronic transcripts. SL2 acceptor site sequence SO:0000709 SL2_acceptor_site A trans_splicing_acceptor_site which appends the 22nt SL2 RNA leader sequence to the 5' end of mRNAs. SL2 acceptor sites occur in genes in internal segments of polycistronic transcripts. SO:nlw A gene encoding an mRNA that has the stop codon redefined as selenocysteine. gene with stop codon redefined as selenocysteine sequence SO:0000710 gene_with_stop_codon_redefined_as_selenocysteine A gene encoding an mRNA that has the stop codon redefined as selenocysteine. SO:xp A gene with mRNA recoded by translational bypass. gene with mRNA recoded by translational bypass sequence SO:0000711 gene_with_mRNA_recoded_by_translational_bypass A gene with mRNA recoded by translational bypass. SO:xp A gene encoding a transcript that has a translational frameshift. gene with transcript with translational frameshift sequence SO:0000712 gene_with_transcript_with_translational_frameshift A gene encoding a transcript that has a translational frameshift. SO:xp A motif that is active in the DNA form of the sequence. http://en.wikipedia.org/wiki/DNA_motif DNA motif sequence SO:0000713 DNA_motif A motif that is active in the DNA form of the sequence. SO:ke http://en.wikipedia.org/wiki/DNA_motif wiki A region of nucleotide sequence corresponding to a known motif. INSDC_feature:misc_feature INSDC_note:nucleotide_motif nucleotide motif sequence SO:0000714 nucleotide_motif A region of nucleotide sequence corresponding to a known motif. 
SO:ke A motif that is active in RNA sequence. RNA motif sequence SO:0000715 RNA_motif A motif that is active in RNA sequence. SO:ke An mRNA that has the quality dicistronic. dicistronic mRNA sequence dicistronic processed transcript SO:0000716 dicistronic_mRNA An mRNA that has the quality dicistronic. SO:ke A nucleic acid sequence that when read as sequential triplets, has the potential of encoding a sequential string of amino acids. It need not contain the start or stop codon. http://en.wikipedia.org/wiki/Reading_frame reading frame sequence SO:0000717 This term was added after a request by SGD. August 2004. Modified after SO meeting in Cambridge to not include start or stop. reading_frame A nucleic acid sequence that when read as sequential triplets, has the potential of encoding a sequential string of amino acids. It need not contain the start or stop codon. SGD:rb http://en.wikipedia.org/wiki/Reading_frame wiki A reading_frame that is interrupted by one or more stop codons; usually identified through inter-genomic sequence comparisons. blocked reading frame sequence SO:0000718 Term requested by Rama from SGD. blocked_reading_frame A reading_frame that is interrupted by one or more stop codons; usually identified through inter-genomic sequence comparisons. SGD:rb An ordered and oriented set of scaffolds based on somewhat weaker sets of inferential evidence such as one set of mate pair reads together with supporting evidence from ESTs or location of markers from SNP or microsatellite maps, or cytogenetic localization of contained markers. pseudochromosome sequence superscaffold SO:0000719 ultracontig An ordered and oriented set of scaffolds based on somewhat weaker sets of inferential evidence such as one set of mate pair reads together with supporting evidence from ESTs or location of markers from SNP or microsatellite maps, or cytogenetic localization of contained markers. FB:WG A transposable element that is foreign. 
foreign transposable element sequence SO:0000720 requested by Michael on 19 Nov 2004. foreign_transposable_element A transposable element that is foreign. SO:ke A gene that encodes a dicistronic primary transcript. gene with dicistronic primary transcript sequence SO:0000721 Requested by Michael, 19 nov 2004. gene_with_dicistronic_primary_transcript A gene that encodes a dicistronic primary transcript. SO:xp A gene that encodes a polycistronic mRNA. gene with dicistronic mRNA gene with dicistronic processed transcript sequence SO:0000722 Requested by MA nov 19 2004. gene_with_dicistronic_mRNA A gene that encodes a polycistronic mRNA. SO:xp Genomic sequence removed from the genome, as a normal event, by a process of recombination. INSDC_feature:iDNA intervening DNA sequence SO:0000723 iDNA Genomic sequence removed from the genome, as a normal event, by a process of recombination. SO:ma A region of a DNA molecule where transfer is initiated during the process of conjugation or mobilization. http://en.wikipedia.org/wiki/Origin_of_transfer INSDC_feature:oriT origin of transfer sequence SO:0000724 oriT A region of a DNA molecule where transfer is initiated during the process of conjugation or mobilization. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Origin_of_transfer wiki The transit_peptide is a short region at the N-terminus of the peptide that directs the protein to an organelle (chloroplast, mitochondrion, microbody or cyanelle). BS:00055 INSDC_feature:transit_peptide transit peptide sequence signal transit SO:0000725 Added to bring SO inline with the EMBL, DDBJ, GenBank feature table. Old definition before biosapiens: The coding sequence for an N-terminal domain of a nuclear-encoded organellar protein. This domain is involved in post translational import of the protein into the organelle. 
transit_peptide The transit_peptide is a short region at the N-terminus of the peptide that directs the protein to an organelle (chloroplast, mitochondrion, microbody or cyanelle). http://www.insdc.org/files/feature_table.html transit uniprot:feature_type The simplest repeated component of a repeat region. A single repeat. http://www.insdc.org/files/feature_table.html repeat unit sequence SO:0000726 Added to comply with the feature table. A single repeat. repeat_unit The simplest repeated component of a repeat region. A single repeat. SO:ke A regulatory region where transcription factor binding sites are clustered to regulate various aspects of transcription activities. (CRMs can be located a few kb to hundreds of kb upstream of the core promoter, in the coding sequence, within introns, or in the untranslated regions (UTR) sequences, and even on a different chromosome). A single gene can be regulated by multiple CRMs to give precise control of its spatial and temporal expression. CRMs function as nodes in large, intertwined regulatory network. CRM DNA accessibility is subject to regulation by dbTFs and transcription co-TFs. CRM TF module cis regulatory module transcription factor module sequence SO:0000727 Requested by Stephen Grossmann Dec 2004. Changed relationship from has_part SO:0000235 TF_binding site to TF_binding_site is part_of SO:0000727 CRM in response to requests from GREEKC initiative in Aug 2020. Removed 3' from definition because 5' UTRs are included as well, notified by Colin Logie of GREEKC. Nov 9 2020. DS Updated name from 'CRM' to 'cis_regulatory_module' on 08 Feb 2021. See GitHub Issue #526. DS Added final sentence to definition as part of GREEKC Feb 16, 2021. See GitHub Issue #534. cis_regulatory_module A regulatory region where transcription factor binding sites are clustered to regulate various aspects of transcription activities. 
(CRMs can be located a few kb to hundreds of kb upstream of the core promoter, in the coding sequence, within introns, or in the untranslated regions (UTR) sequences, and even on a different chromosome). A single gene can be regulated by multiple CRMs to give precise control of its spatial and temporal expression. CRMs function as nodes in large, intertwined regulatory network. CRM DNA accessibility is subject to regulation by dbTFs and transcription co-TFs. PMID:19660565 SO:SG A region of a peptide that is able to excise itself and rejoin the remaining portions with a peptide bond. http://en.wikipedia.org/wiki/Intein sequence protein intron SO:0000728 Intein-mediated protein splicing occurs after mRNA has been translated into a protein. intein A region of a peptide that is able to excise itself and rejoin the remaining portions with a peptide bond. SO:ke http://en.wikipedia.org/wiki/Intein wiki An attribute of protein-coding genes where the initial protein product contains an intein. intein containing sequence SO:0000729 intein_containing An attribute of protein-coding genes where the initial protein product contains an intein. SO:ke A gap in the sequence of known length. The unknown bases are filled in with N's. INSDC_feature:gap INSDC_feature:assembly_gap sequence SO:0000730 gap A gap in the sequence of known length. The unknown bases are filled in with N's. SO:ke An attribute to describe a feature that is incomplete. fragment sequence SO:0000731 Term added because of request by MO people. fragmentary An attribute to describe a feature that is incomplete. SO:ke An attribute describing an unverified region. http://en.wikipedia.org/wiki/Predicted sequence SO:0000732 predicted An attribute describing an unverified region. SO:ke http://en.wikipedia.org/wiki/Predicted wiki An attribute describing a located_sequence_feature. feature attribute sequence SO:0000733 feature_attribute An attribute describing a located_sequence_feature. 
SO:ke An exemplar is a representative cDNA sequence for each gene. The exemplar approach is a method that usually involves some initial clustering into gene groups and the subsequent selection of a representative from each gene group. exemplar mRNA sequence SO:0000734 Added for the MO people. exemplar_mRNA An exemplar is a representative cDNA sequence for each gene. The exemplar approach is a method that usually involves some initial clustering into gene groups and the subsequent selection of a representative from each gene group. http://mged.sourceforge.net/ontologies/MGEDontology.php The location of a sequence. sequence location sequence SO:0000735 sequence_location A sequence of DNA that originates from an organelle. organelle sequence sequence SO:0000736 organelle_sequence DNA belonging to the genome of a mitochondrion. mitochondrial sequence sequence SO:0000737 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. mitochondrial_sequence DNA belonging to the nuclear genome of a cell. nuclear sequence sequence SO:0000738 Moved from is_a SO:0000736 (organelle_sequence) when brought to our attention by GitHub issue #489. nuclear_sequence DNA belonging to the genome of a plastid such as a chloroplast. The nucleomorph is the nucleus of the plastid. nucleomorphic sequence sequence SO:0000739 nucleomorphic_sequence DNA belonging to the genome of a plastid such as a chloroplast. plastid sequence sequence SO:0000740 plastid_sequence A kinetoplast is an interlocked network of thousands of minicircles and tens of maxicircles, located near the base of the flagellum of some protozoan species. SO:0000826 http://en.wikipedia.org/wiki/Kinetoplast kinetoplast_chromosome sequence SO:0000741 kinetoplast A kinetoplast is an interlocked network of thousands of minicircles and tens of maxicircles, located near the base of the flagellum of some protozoan species.
PMID:8395055 http://en.wikipedia.org/wiki/Kinetoplast wiki A maxicircle is a replicon, part of a kinetoplast, that contains open reading frames and replicates via a rolling circle method. SO:0000827 maxicircle_chromosome sequence SO:0000742 maxicircle A maxicircle is a replicon, part of a kinetoplast, that contains open reading frames and replicates via a rolling circle method. PMID:8395055 DNA belonging to the genome of an apicoplast, a non-photosynthetic plastid. apicoplast sequence sequence SO:0000743 apicoplast_sequence DNA belonging to the genome of a chromoplast, a colored plastid for synthesis and storage of pigments. chromoplast sequence sequence SO:0000744 chromoplast_sequence DNA belonging to the genome of a chloroplast, a green plastid for photosynthesis. chloroplast sequence sequence SO:0000745 chloroplast_sequence DNA belonging to the genome of a cyanelle, a photosynthetic plastid found in algae. cyanelle sequence sequence SO:0000746 cyanelle_sequence DNA belonging to the genome of a leucoplast, a colorless plastid generally containing starch or oil. leucoplast sequence sequence SO:0000747 leucoplast_sequence DNA belonging to the genome of a proplastid such as an immature chloroplast. proplastid sequence sequence SO:0000748 proplastid_sequence The location of DNA that has come from a plasmid sequence. plasmid location sequence SO:0000749 plasmid_location An origin_of_replication that is used for the amplification of a chromosomal nucleic acid sequence. amplification origin sequence SO:0000750 amplification_origin An origin_of_replication that is used for the amplification of a chromosomal nucleic acid sequence. SO:ma The location of DNA that has come from a viral origin. proviral location sequence SO:0000751 proviral_location A region that is involved in the regulation of transcription of a group of regulated genes. 
SO:0001055 gene group regulatory region sequence SO:0000752 Merged into transcriptional_cis_regulatory_region (SO:0001055) on 11 Feb 2021 as part of GREEKC reducing redundancy as we prepare to submit several terms to Ensembl. See GitHub Issue #529. gene_group_regulatory_region true The region of sequence that has been inserted and is being propagated by the clone. clone insert sequence SO:0000753 clone_insert The region of sequence that has been inserted and is being propagated by the clone. SO:ke The lambda bacteriophage is the vector for the linear lambda clone. The genes involved in the lysogenic pathway are removed from the viral DNA. Up to 25 kb of foreign DNA can then be inserted into the lambda genome. lambda vector sequence SO:0000754 lambda_vector The lambda bacteriophage is the vector for the linear lambda clone. The genes involved in the lysogenic pathway are removed from the viral DNA. Up to 25 kb of foreign DNA can then be inserted into the lambda genome. ISBN:0-1767-2380-8 A plasmid that has been generated to act as a vector for foreign sequence. http://en.wikipedia.org/wiki/Plasmid_vector#Vectors plasmid vector sequence SO:0000755 plasmid_vector http://en.wikipedia.org/wiki/Plasmid_vector#Vectors wiki DNA synthesized by reverse transcriptase using RNA as a template. http://en.wikipedia.org/wiki/CDNA complementary DNA sequence SO:0000756 cDNA DNA synthesized by reverse transcriptase using RNA as a template. SO:ma http://en.wikipedia.org/wiki/CDNA wiki DNA synthesized from RNA by reverse transcriptase, single stranded. single strand cDNA single stranded cDNA sequence single-strand cDNA SO:0000757 single_stranded_cDNA DNA synthesized from RNA by reverse transcriptase that has been copied by PCR to make it double stranded.
double stranded cDNA sequence double strand cDNA double-strand cDNA SO:0000758 double_stranded_cDNA sequence SO:0000759 plasmid_clone true sequence SO:0000760 YAC_clone true sequence SO:0000761 phagemid_clone true sequence P1_clone SO:0000762 PAC_clone true sequence SO:0000763 fosmid_clone true sequence SO:0000764 BAC_clone true sequence SO:0000765 cosmid_clone true A tRNA sequence that has a pyrrolysine anticodon, and a 3' pyrrolysine binding region. pyrrolysyl tRNA pyrrolysyl-transfer RNA pyrrolysyl-transfer ribonucleic acid sequence SO:0000766 pyrrolysyl_tRNA A tRNA sequence that has a pyrrolysine anticodon, and a 3' pyrrolysine binding region. SO:ke sequence SO:0000767 clone_insert_start true A plasmid that may integrate with a chromosome. sequence SO:0000768 episome A plasmid that may integrate with a chromosome. SO:ma The region of a two-piece tmRNA that bears the reading frame encoding the proteolysis tag. The tmRNA gene undergoes circular permutation in some groups of bacteria. Processing of the transcripts from such a gene leaves the mature tmRNA in two pieces, base-paired together. tmRNA coding piece sequence SO:0000769 Added in response to comment from Kelly Williams from Indiana. Nov 2005. tmRNA_coding_piece The region of a two-piece tmRNA that bears the reading frame encoding the proteolysis tag. The tmRNA gene undergoes circular permutation in some groups of bacteria. Processing of the transcripts from such a gene leaves the mature tmRNA in two pieces, base-paired together. Indiana:kw doi:10.1093/nar/gkh795 issn:1362-4962 The acceptor region of a two-piece tmRNA that when mature is charged at its 3' end with alanine. The tmRNA gene undergoes circular permutation in some groups of bacteria; processing of the transcripts from such a gene leaves the mature tmRNA in two pieces, base-paired together. tmRNA acceptor piece sequence SO:0000770 Added in response to Kelly Williams from Indiana. Date: Nov 2005. 
tmRNA_acceptor_piece The acceptor region of a two-piece tmRNA that when mature is charged at its 3' end with alanine. The tmRNA gene undergoes circular permutation in some groups of bacteria; processing of the transcripts from such a gene leaves the mature tmRNA in two pieces, base-paired together. Indiana:kw doi:10.1093/nar/gkh795 A quantitative trait locus (QTL) is a polymorphic locus which contains alleles that differentially affect the expression of a continuously distributed phenotypic trait. Usually it is a marker described by statistical association to quantitative variation in the particular phenotypic trait that is thought to be controlled by the cumulative action of alleles at multiple loci. quantitative trait locus sequence SO:0000771 Added in response to request by Simon Twigger November 14th 2005. QTL A quantitative trait locus (QTL) is a polymorphic locus which contains alleles that differentially affect the expression of a continuously distributed phenotypic trait. Usually it is a marker described by statistical association to quantitative variation in the particular phenotypic trait that is thought to be controlled by the cumulative action of alleles at multiple loci. http://rgd.mcw.edu/tu/qtls/ A genomic island is an integrated mobile genetic element, characterized by size (over 10 Kb). It has features that suggest a foreign origin. These can include nucleotide distribution (oligonucleotide signature, CG content etc.) that differs from the bulk of the chromosome and/or genes suggesting DNA mobility. http://en.wikipedia.org/wiki/Genomic_island genomic island sequence SO:0000772 Genomic islands are transmissible elements characterized by large size (>10kb). genomic_island A genomic island is an integrated mobile genetic element, characterized by size (over 10 Kb). It has features that suggest a foreign origin. These can include nucleotide distribution (oligonucleotide signature, CG content etc.)
that differs from the bulk of the chromosome and/or genes suggesting DNA mobility. Phigo:at SO:ke http://en.wikipedia.org/wiki/Genomic_island wiki Mobile genetic elements that contribute to rapid changes in virulence potential. They are present on the genomes of pathogenic strains but absent from the genomes of non pathogenic members of the same or related species. pathogenic island sequence SO:0000773 Nature Reviews Microbiology 2, 414-424 (2004); doi:10.1038/nrmicro884 GENOMIC ISLANDS IN PATHOGENIC AND ENVIRONMENTAL MICROORGANISMS Ulrich Dobrindt, Bianca Hochhut, Ute Hentschel & Jorg Hacker. pathogenic_island Mobile genetic elements that contribute to rapid changes in virulence potential. They are present on the genomes of pathogenic strains but absent from the genomes of non pathogenic members of the same or related species. SO:ke A transmissible element containing genes involved in metabolism, analogous to the pathogenicity islands of gram negative bacteria. metabolic island sequence SO:0000774 Genes for phenolic compound degradation in Pseudomonas putida are found on metabolic islands. metabolic_island A transmissible element containing genes involved in metabolism, analogous to the pathogenicity islands of gram negative bacteria. SO:ke An adaptive island is a genomic island that provides an adaptive advantage to the host. adaptive island sequence SO:0000775 The iron-uptake ability of many pathogens is conveyed by adaptive islands. Nature Reviews Microbiology 2, 414-424 (2004); doi:10.1038/nrmicro884 GENOMIC ISLANDS IN PATHOGENIC AND ENVIRONMENTAL MICROORGANISMS Ulrich Dobrindt, Bianca Hochhut, Ute Hentschel & Jorg Hacker. adaptive_island An adaptive island is a genomic island that provides an adaptive advantage to the host. SO:ke A transmissible element containing genes involved in symbiosis, analogous to the pathogenicity islands of gram negative bacteria.
symbiosis island sequence SO:0000776 Nitrogen fixation in Rhizobiaceae species is encoded by symbiosis islands. Evolution of rhizobia by acquisition of a 500-kb symbiosis island that integrates into a phe-tRNA gene. John T. Sullivan and Clive W. Ronson PNAS 1998 Apr 28 95 (9) 5145-5149. symbiosis_island A transmissible element containing genes involved in symbiosis, analogous to the pathogenicity islands of gram negative bacteria. SO:ke A non functional descendant of an rRNA. INSDC_feature:rRNA INSDC_qualifier:pseudo pseudogenic rRNA sequence SO:0000777 Added Jan 2006 to allow the annotation of the pseudogenic rRNA by flybase. Non-functional is defined as its transcription is prevented due to one or more mutations. pseudogenic_rRNA A non functional descendant of an rRNA. SO:ke A non functional descendant of a tRNA. INSDC_feature:tRNA INSDC_qualifier:pseudo pseudogenic tRNA sequence SO:0000778 Added Jan 2006 to allow the annotation of the pseudogenic tRNA by flybase. Non-functional is defined as its transcription is prevented due to one or more mutations. pseudogenic_tRNA A non functional descendant of a tRNA. SO:ke An episome that is engineered. engineered episome sequence SO:0000779 Requested by Lynn Crosby Jan 2006. engineered_episome An episome that is engineered. SO:xp sequence SO:0000780 Added by KE Jan 2006 to capture the kinds of attributes of TEs transposable_element_attribute true Attribute describing sequence that has been integrated with foreign sequence. sequence SO:0000781 transgenic Attribute describing sequence that has been integrated with foreign sequence. SO:ke An attribute describing a feature that occurs in nature. sequence SO:0000782 natural An attribute describing a feature that occurs in nature. SO:ke An attribute to describe a region that was modified in vitro. sequence SO:0000783 engineered An attribute to describe a region that was modified in vitro. SO:ke An attribute to describe a region from another species.
sequence SO:0000784 foreign An attribute to describe a region from another species. SO:ke The region of sequence that has been inserted and is being propagated by the clone. cloned region cloned segment sequence SO:0000785 Added in response to Lynn Crosby. A clone insert may be composed of many cloned regions. cloned_region reagent attribute sequence SO:0000786 Added jan 2006 by KE. reagent_attribute true sequence SO:0000787 clone_attribute true sequence SO:0000788 cloned true An attribute to describe a feature that has been proven. sequence SO:0000789 validated An attribute to describe a feature that has been proven. SO:ke An attribute describing a feature that is invalidated. sequence SO:0000790 invalidated An attribute describing a feature that is invalidated. SO:ke sequence SO:0000791 cloned_genomic true sequence SO:0000792 cloned_cDNA true sequence SO:0000793 engineered_DNA true A rescue region that is engineered. engineered rescue fragment engineered rescue region engineered rescue segment sequence SO:0000794 engineered_rescue_region A rescue region that is engineered. SO:xp A mini_gene that rescues. rescue mini gene rescue mini-gene sequence SO:0000795 rescue_mini_gene A mini_gene that rescues. SO:xp TE that has been modified in vitro, including insertion of DNA derived from a source other than the originating TE. transgenic transposable element sequence SO:0000796 Modified as requested by Lynn - FB. May 2007. transgenic_transposable_element TE that has been modified in vitro, including insertion of DNA derived from a source other than the originating TE. FB:mc TE that exists (or existed) in nature. natural transposable element sequence SO:0000797 natural_transposable_element TE that exists (or existed) in nature. FB:mc TE that has been modified by manipulations in vitro. engineered transposable element sequence SO:0000798 engineered_transposable_element TE that has been modified by manipulations in vitro. 
FB:mc A transposable_element that is engineered and foreign. engineered foreign transposable element sequence SO:0000799 engineered_foreign_transposable_element A transposable_element that is engineered and foreign. FB:mc A multi-chromosome duplication aberration generated by reassortment of other aberration components. assortment derived duplication sequence SO:0000800 assortment_derived_duplication A multi-chromosome duplication aberration generated by reassortment of other aberration components. FB:gm A multi-chromosome aberration generated by reassortment of other aberration components; presumed to have a deficiency and a duplication. assortment derived deficiency plus duplication sequence SO:0000801 assortment_derived_deficiency_plus_duplication A multi-chromosome aberration generated by reassortment of other aberration components; presumed to have a deficiency and a duplication. FB:gm A multi-chromosome deficiency aberration generated by reassortment of other aberration components. assortment-derived deficiency sequence SO:0000802 assortment_derived_deficiency A multi-chromosome deficiency aberration generated by reassortment of other aberration components. FB:gm A multi-chromosome aberration generated by reassortment of other aberration components; presumed to have a deficiency or a duplication. assortment derived aneuploid sequence SO:0000803 assortment_derived_aneuploid A multi-chromosome aberration generated by reassortment of other aberration components; presumed to have a deficiency or a duplication. FB:gm A region that is engineered. construct engineered region engineered sequence sequence SO:0000804 engineered_region A region that is engineered. SO:xp A region that is engineered and foreign. engineered foreign region sequence SO:0000805 engineered_foreign_region A region that is engineered and foreign. SO:xp When two regions of DNA are joined together that are not normally together. sequence SO:0000806 fusion A tag that is engineered. 
engineered tag sequence SO:0000807 engineered_tag A tag that is engineered. SO:xp A cDNA clone that has been validated. validated cDNA clone sequence SO:0000808 validated_cDNA_clone A cDNA clone that has been validated. SO:xp A cDNA clone that is invalid. invalidated cDNA clone sequence SO:0000809 invalidated_cDNA_clone A cDNA clone that is invalid. SO:xp A cDNA clone invalidated because it is chimeric. chimeric cDNA clone sequence SO:0000810 chimeric_cDNA_clone A cDNA clone invalidated because it is chimeric. SO:xp A cDNA clone invalidated by genomic contamination. genomically contaminated cDNA clone sequence SO:0000811 genomically_contaminated_cDNA_clone A cDNA clone invalidated by genomic contamination. SO:xp A cDNA clone invalidated by polyA priming. polyA primed cDNA clone sequence SO:0000812 polyA_primed_cDNA_clone A cDNA clone invalidated by polyA priming. SO:xp A cDNA clone invalidated by partial processing. partially processed cDNA clone sequence SO:0000813 partially_processed_cDNA_clone A cDNA clone invalidated by partial processing. SO:xp An attribute describing a region's ability, when introduced to a mutant organism, to re-establish (rescue) a phenotype. sequence SO:0000814 rescue An attribute describing a region's ability, when introduced to a mutant organism, to re-establish (rescue) a phenotype. SO:ke By definition, minigenes are short open-reading frames (ORF), usually encoding approximately 9 to 20 amino acids, which are expressed in vivo (as distinct from being synthesized as peptide or protein ex vivo and subsequently injected). The in vivo synthesis confers a distinct advantage: the expressed sequences can enter both antigen presentation pathways, MHC I (inducing CD8+ T-cells, which are usually cytotoxic T-lymphocytes (CTL)) and MHC II (inducing CD4+ T-cells, usually 'T-helpers' (Th)); and can encounter B-cells, inducing antibody responses.
Three main vector approaches have been used to deliver minigenes: viral vectors, bacterial vectors and plasmid DNA. mini gene sequence SO:0000815 mini_gene By definition, minigenes are short open-reading frames (ORF), usually encoding approximately 9 to 20 amino acids, which are expressed in vivo (as distinct from being synthesized as peptide or protein ex vivo and subsequently injected). The in vivo synthesis confers a distinct advantage: the expressed sequences can enter both antigen presentation pathways, MHC I (inducing CD8+ T-cells, which are usually cytotoxic T-lymphocytes (CTL)) and MHC II (inducing CD4+ T-cells, usually 'T-helpers' (Th)); and can encounter B-cells, inducing antibody responses. Three main vector approaches have been used to deliver minigenes: viral vectors, bacterial vectors and plasmid DNA. PMID:15992143 A gene that rescues. rescue gene sequence SO:0000816 rescue_gene A gene that rescues. SO:xp An attribute describing sequence with the genotype found in nature and/or standard laboratory stock. http://en.wikipedia.org/wiki/Wild_type loinc:LA9658-1 wild type sequence SO:0000817 wild_type An attribute describing sequence with the genotype found in nature and/or standard laboratory stock. SO:ke http://en.wikipedia.org/wiki/Wild_type wiki loinc:LA9658-1 wild type A gene that rescues. wild type rescue gene sequence SO:0000818 wild_type_rescue_gene A gene that rescues. SO:xp A chromosome originating in a mitochondrion. mitochondrial chromosome sequence SO:0000819 mitochondrial_chromosome A chromosome originating in a mitochondrion. SO:xp A chromosome originating in a chloroplast. chloroplast chromosome sequence SO:0000820 chloroplast_chromosome A chromosome originating in a chloroplast. SO:xp A chromosome originating in a chromoplast. chromoplast chromosome sequence SO:0000821 chromoplast_chromosome A chromosome originating in a chromoplast. SO:xp A chromosome originating in a cyanelle.
cyanelle chromosome sequence SO:0000822 cyanelle_chromosome A chromosome originating in a cyanelle. SO:xp A chromosome with origin in a leucoplast. leucoplast chromosome sequence SO:0000823 leucoplast_chromosome A chromosome with origin in a leucoplast. SO:xp A chromosome originating in a macronucleus. macronuclear chromosome sequence SO:0000824 macronuclear_chromosome A chromosome originating in a macronucleus. SO:xp A chromosome originating in a micronucleus. micronuclear chromosome sequence SO:0000825 micronuclear_chromosome A chromosome originating in a micronucleus. SO:xp true true A chromosome originating in a nucleus. nuclear chromosome sequence SO:0000828 nuclear_chromosome A chromosome originating in a nucleus. SO:xp A chromosome originating in a nucleomorph. nucleomorphic chromosome sequence SO:0000829 nucleomorphic_chromosome A chromosome originating in a nucleomorph. SO:xp A region of a chromosome. chromosomal region chromosomal_region chromosome part sequence SO:0000830 This is a manufactured term that serves to allow the parts of a chromosome to have an is_a path to the root. chromosome_part A region of a chromosome. SO:ke A region of a gene. gene member region sequence SO:0000831 A manufactured term used to allow the parts of a gene to have an is_a path to the root. gene_member_region A region of a gene. SO:ke A region of sequence which is part of a promoter. sequence SO:0000832 This is a manufactured term to allow the parts of promoter to have an is_a path back to the root. promoter_region true A region of sequence which is part of a promoter. SO:ke A region of a transcript. transcript region sequence SO:0000833 This term was added to provide a grouping term for the region parts of transcript, thus giving them an is_a path back to the root. transcript_region A region of a transcript. SO:ke A region of a mature transcript.
mature transcript region sequence SO:0000834 A manufactured term to collect together the parts of a mature transcript and give them an is_a path to the root. mature_transcript_region A region of a mature transcript. SO:ke A part of a primary transcript. primary transcript region sequence SO:0000835 This term was added to provide a grouping term for the region parts of primary_transcript, thus giving them an is_a path back to the root. primary_transcript_region A part of a primary transcript. SO:ke A region of an mRNA. mRNA region sequence SO:0000836 This term was added to provide a grouping term for the region parts of mRNA, thus giving them an is_a path back to the root. mRNA_region A region of an mRNA. SO:cb A region of UTR. UTR region sequence SO:0000837 A region of UTR. This term is a grouping term to allow the parts of UTR to have an is_a path to the root. UTR_region A region of UTR. SO:ke A region of an rRNA primary transcript. rRNA primary transcript region sequence SO:0000838 To allow transcribed_spacer_region to have a path to the root. rRNA_primary_transcript_region A region of an rRNA primary transcript. SO:ke Biological sequence region that can be assigned to a specific subsequence of a polypeptide. BS:00124 BS:00331 region site sequence positional positional polypeptide feature region or site annotation SO:0000839 Added to allow the polypeptide regions to have is_a paths back to the root. polypeptide_region Biological sequence region that can be assigned to a specific subsequence of a polypeptide. SO:GAR SO:ke region uniprot:feature_type site uniprot:feature_type A region of a repeated sequence. repeat component sequence SO:0000840 A manufactured term to group the parts of repeats and give them an is_a path back to the root. repeat_component A region of a repeated sequence. SO:ke A region within an intron. spliceosomal intron region sequence SO:0000841 A term added to allow the parts of introns to have is_a paths to the root. 
spliceosomal_intron_region A region within an intron. SO:ke A region of a gene that has a specific function. gene component region sequence SO:0000842 gene_component_region A region which is part of a bacterial RNA polymerase promoter. sequence SO:0000843 This is a manufactured term to allow the parts of bacterial_RNApol_promoter to have an is_a path back to the root. bacterial_RNApol_promoter_region true A region which is part of a bacterial RNA polymerase promoter. SO:ke A region of sequence which is a promoter for RNA polymerase II. sequence SO:0000844 This is a manufactured term to allow the parts of RNApol_II_promoter to have an is_a path back to the root. RNApol_II_promoter_region true A region of sequence which is a promoter for RNA polymerase II. SO:ke A region of sequence which is a promoter for RNA polymerase III type 1. sequence SO:0000845 This is a manufactured term to allow the parts of RNApol_III_promoter_type_1 to have an is_a path back to the root. RNApol_III_promoter_type_1_region true A region of sequence which is a promoter for RNA polymerase III type 1. SO:ke A region of sequence which is a promoter for RNA polymerase III type 2. sequence SO:0000846 This is a manufactured term to allow the parts of RNApol_III_promoter_type_2 to have an is_a path back to the root. RNApol_III_promoter_type_2_region true A region of sequence which is a promoter for RNA polymerase III type 2. SO:ke A region of a tmRNA. tmRNA region sequence SO:0000847 This term was added to provide a grouping term for the region parts of tmRNA, thus giving them an is_a path back to the root. tmRNA_region A region of a tmRNA. SO:cb The long terminal repeat found at the ends of the sequence to be inserted into the host genome. LTR component long term repeat component sequence SO:0000848 LTR_component A component of the three-prime long terminal repeat. 
3' long terminal repeat component three prime LTR component sequence SO:0000849 three_prime_LTR_component A component of the three-prime long terminal repeat. PMID:8649407 A component of the five-prime long terminal repeat. 5' long term repeat component five prime LTR component sequence SO:0000850 five_prime_LTR_component A component of the five-prime long terminal repeat. PMID:8649407 A region of a CDS. CDS region sequence SO:0000851 CDS_region A region of a CDS. SO:cb A region of an exon. exon region sequence SO:0000852 exon_region A region of an exon. RSC:cb A region that is homologous to another region. http://en.wikipedia.org/wiki/Homology_(biology) homolog homologous region homologue sequence SO:0000853 homologous_region A region that is homologous to another region. SO:ke http://en.wikipedia.org/wiki/Homology_(biology) wiki A homologous_region that is paralogous to another region. http://en.wikipedia.org/wiki/Paralog#Paralogy paralog paralogous region paralogue sequence SO:0000854 A term to be used in conjunction with the paralogous_to relationship. paralogous_region A homologous_region that is paralogous to another region. SO:ke http://en.wikipedia.org/wiki/Paralog#Paralogy wiki A homologous_region that is orthologous to another region. http://en.wikipedia.org/wiki/Ortholog#Orthology ortholog orthologous region orthologue sequence SO:0000855 This term should be used in conjunction with the similarity relationships defined in SO. orthologous_region A homologous_region that is orthologous to another region. SO:ke http://en.wikipedia.org/wiki/Ortholog#Orthology wiki A region that is similar or identical across more than one species. sequence SO:0000856 conserved Similarity due to common ancestry. sequence SO:0000857 homologous Similarity due to common ancestry. SO:ke An attribute describing a kind of homology where divergence occurred after a speciation event. 
sequence SO:0000858 orthologous An attribute describing a kind of homology where divergence occurred after a speciation event. SO:ke An attribute describing a kind of homology where divergence occurred after a duplication event. sequence SO:0000859 paralogous An attribute describing a kind of homology where divergence occurred after a duplication event. SO:ke Attribute describing sequence regions occurring in same order on chromosome of different species. http://en.wikipedia.org/wiki/Syntenic sequence SO:0000860 syntenic Attribute describing sequence regions occurring in same order on chromosome of different species. SO:ke http://en.wikipedia.org/wiki/Syntenic wiki A primary transcript that is capped. capped primary transcript sequence SO:0000861 capped_primary_transcript A primary transcript that is capped. SO:xp An mRNA that is capped. capped mRNA sequence SO:0000862 capped_mRNA An mRNA that is capped. SO:xp An attribute describing an mRNA feature. mRNA attribute sequence SO:0000863 mRNA_attribute An attribute describing an mRNA feature. SO:ke An attribute describing a sequence that is representative of a class of similar sequences. sequence SO:0000864 exemplar An attribute describing a sequence that is representative of a class of similar sequences. SO:ke An attribute describing a sequence that contains a mutation involving the deletion or insertion of one or more bases, where this number is not divisible by 3. http://en.wikipedia.org/wiki/Frameshift sequence SO:0000865 frameshift An attribute describing a sequence that contains a mutation involving the deletion or insertion of one or more bases, where this number is not divisible by 3. SO:ke http://en.wikipedia.org/wiki/Frameshift wiki A frameshift caused by deleting one base. minus 1 frameshift sequence SO:0000866 minus_1_frameshift A frameshift caused by deleting one base. SO:ke A frameshift caused by deleting two bases. 
minus 2 frameshift sequence SO:0000867 minus_2_frameshift A frameshift caused by deleting two bases. SO:ke A frameshift caused by inserting one base. plus 1 frameshift sequence SO:0000868 plus_1_frameshift A frameshift caused by inserting one base. SO:ke A frameshift caused by inserting two bases. plus 2 framshift sequence SO:0000869 plus_2_framshift A frameshift caused by inserting two bases. SO:ke An attribute describing transcript sequence that is created by splicing exons from different genes. trans-spliced sequence SO:0000870 trans_spliced An attribute describing transcript sequence that is created by splicing exons from different genes. SO:ke An mRNA that is polyadenylated. polyadenylated mRNA sequence SO:0000871 polyadenylated_mRNA An mRNA that is polyadenylated. SO:xp An mRNA that is trans-spliced. trans-spliced mRNA sequence SO:0000872 trans_spliced_mRNA An mRNA that is trans-spliced. SO:xp A transcript that is edited. edited transcript sequence SO:0000873 edited_transcript A transcript that is edited. SO:ke A transcript that has been edited by A to I substitution. edited transcript by A to I substitution sequence SO:0000874 edited_transcript_by_A_to_I_substitution A transcript that has been edited by A to I substitution. SO:ke An attribute describing a sequence that is bound by a protein. bound by protein sequence SO:0000875 bound_by_protein An attribute describing a sequence that is bound by a protein. SO:ke An attribute describing a sequence that is bound by a nucleic acid. bound by nucleic acid sequence SO:0000876 bound_by_nucleic_acid An attribute describing a sequence that is bound by a nucleic acid. SO:ke An attribute describing a situation where a gene may encode for more than 1 transcript. alternatively spliced sequence SO:0000877 alternatively_spliced An attribute describing a situation where a gene may encode for more than 1 transcript. SO:ke An attribute describing a sequence that contains the code for one gene product. 
sequence SO:0000878 monocistronic An attribute describing a sequence that contains the code for one gene product. SO:ke An attribute describing a sequence that contains the code for two gene products. sequence SO:0000879 dicistronic An attribute describing a sequence that contains the code for two gene products. SO:ke An attribute describing a sequence that contains the code for more than one gene product. sequence SO:0000880 polycistronic An attribute describing a sequence that contains the code for more than one gene product. SO:ke An attribute describing an mRNA sequence that has been reprogrammed at translation, causing localized alterations. sequence SO:0000881 recoded An attribute describing an mRNA sequence that has been reprogrammed at translation, causing localized alterations. SO:ke An attribute describing the alteration of codon meaning. codon redefined sequence SO:0000882 codon_redefined An attribute describing the alteration of codon meaning. SO:ke A stop codon redefined to be a new amino acid. stop codon read through sequence stop codon readthrough SO:0000883 stop_codon_read_through A stop codon redefined to be a new amino acid. SO:ke A stop codon redefined to be the new amino acid, pyrrolysine. stop codon redefined as pyrrolysine sequence SO:0000884 stop_codon_redefined_as_pyrrolysine A stop codon redefined to be the new amino acid, pyrrolysine. SO:ke A stop codon redefined to be the new amino acid, selenocysteine. stop codon redefined as selenocysteine sequence SO:0000885 stop_codon_redefined_as_selenocysteine A stop codon redefined to be the new amino acid, selenocysteine. SO:ke Recoded mRNA where a block of nucleotides is not translated. recoded by translational bypass sequence SO:0000886 recoded_by_translational_bypass Recoded mRNA where a block of nucleotides is not translated. SO:ke Recoding by frameshifting a particular site. 
translationally frameshifted sequence SO:0000887 translationally_frameshifted Recoding by frameshifting a particular site. SO:ke A gene that is maternally_imprinted. maternally imprinted gene sequence SO:0000888 maternally_imprinted_gene A gene that is maternally_imprinted. SO:xp A gene that is paternally imprinted. paternally imprinted gene sequence SO:0000889 paternally_imprinted_gene A gene that is paternally imprinted. SO:xp A gene that is post translationally regulated. post translationally regulated gene sequence SO:0000890 post_translationally_regulated_gene A gene that is post translationally regulated. SO:xp A gene that is negatively autoregulated. negatively autoregulated gene sequence SO:0000891 negatively_autoregulated_gene A gene that is negatively autoregulated. SO:xp A gene that is positively autoregulated. positively autoregulated gene sequence SO:0000892 positively_autoregulated_gene A gene that is positively autoregulated. SO:xp An attribute describing an epigenetic process where a gene is inactivated at transcriptional or translational level. http://en.wikipedia.org/wiki/Silenced sequence SO:0000893 silenced An attribute describing an epigenetic process where a gene is inactivated at transcriptional or translational level. SO:ke http://en.wikipedia.org/wiki/Silenced wiki An attribute describing an epigenetic process where a gene is inactivated by DNA modifications, resulting in repression of transcription. silenced by DNA modification sequence SO:0000894 silenced_by_DNA_modification An attribute describing an epigenetic process where a gene is inactivated by DNA modifications, resulting in repression of transcription. SO:ke An attribute describing an epigenetic process where a gene is inactivated by DNA methylation, resulting in repression of transcription. 
silenced by DNA methylation sequence SO:0000895 silenced_by_DNA_methylation An attribute describing an epigenetic process where a gene is inactivated by DNA methylation, resulting in repression of transcription. SO:ke A gene that is translationally regulated. translationally regulated gene sequence SO:0000896 translationally_regulated_gene A gene that is translationally regulated. SO:xp A gene that is allelically_excluded. allelically excluded gene sequence SO:0000897 allelically_excluded_gene A gene that is allelically_excluded. SO:xp A gene that is epigenetically modified. epigenetically modified gene sequence SO:0000898 epigenetically_modified_gene A gene that is epigenetically modified. SO:ke An attribute describing a nuclear pseudogene of a mitochondrial gene. nuclear mitochondrial sequence SO:0000899 nuclear_mitochondrial true An attribute describing a nuclear pseudogene of a mitochondrial gene. SO:ke An attribute describing a pseudogene whereby an mRNA was retrotransposed. The mRNA sequence is transcribed back into the genome, lacking introns and promoters, but often including a polyA tail. sequence SO:0000900 processed true An attribute describing a pseudogene whereby an mRNA was retrotransposed. The mRNA sequence is transcribed back into the genome, lacking introns and promoters, but often including a polyA tail. SO:ke An attribute describing a pseudogene that was created by tandem duplication and unequal crossing over during recombination. unequally crossed over sequence SO:0000901 unequally_crossed_over true An attribute describing a pseudogene that was created by tandem duplication and unequal crossing over during recombination. SO:ke A transgene is a gene that has been transferred naturally or by any of a number of genetic engineering techniques from one organism to another. 
http://en.wikipedia.org/wiki/Transgene sequence SO:0000902 transgene A transgene is a gene that has been transferred naturally or by any of a number of genetic engineering techniques from one organism to another. SO:xp http://en.wikipedia.org/wiki/Transgene wiki Endogenous DNA sequences that are likely to have arisen from retroviruses. endogenous retroviral sequence sequence SO:0000903 endogenous_retroviral_sequence An attribute to describe the sequence of a feature, where the DNA is rearranged. rearranged at DNA level sequence SO:0000904 rearranged_at_DNA_level An attribute to describe the sequence of a feature, where the DNA is rearranged. SO:ke An attribute describing the status of a feature, based on the available evidence. sequence SO:0000905 This term is the hypernym of attributes and should not be annotated to. status An attribute describing the status of a feature, based on the available evidence. SO:ke Attribute to describe a feature that is independently known - not predicted. independently known sequence SO:0000906 independently_known Attribute to describe a feature that is independently known - not predicted. SO:ke An attribute to describe a feature that has been predicted using sequence similarity techniques. supported by sequence similarity sequence SO:0000907 supported_by_sequence_similarity An attribute to describe a feature that has been predicted using sequence similarity techniques. SO:ke An attribute to describe a feature that has been predicted using sequence similarity of a known domain. supported by domain match sequence SO:0000908 supported_by_domain_match An attribute to describe a feature that has been predicted using sequence similarity of a known domain. SO:ke An attribute to describe a feature that has been predicted using sequence similarity to EST or cDNA data. supported by EST or cDNA sequence SO:0000909 supported_by_EST_or_cDNA An attribute to describe a feature that has been predicted using sequence similarity to EST or cDNA data. 
SO:ke A gene whose predicted amino acid sequence is unsupported by any experimental evidence or by any match with any other known sequence. sequence SO:0000910 orphan An attribute describing a feature that is predicted by a computer program that did not rely on sequence similarity. predicted by ab initio computation sequence SO:0000911 predicted_by_ab_initio_computation An attribute describing a feature that is predicted by a computer program that did not rely on sequence similarity. SO:ke A motif of three consecutive residues and one H-bond in which: residue(i) is Aspartate or Asparagine (Asx), the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2). BS:00203 asx turn sequence SO:0000912 asx_turn A motif of three consecutive residues and one H-bond in which: residue(i) is Aspartate or Asparagine (Asx), the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2). http://www.ebi.ac.uk/msd-srv/msdmotif/ A clone insert made from cDNA. cloned cDNA insert sequence SO:0000913 cloned_cDNA_insert A clone insert made from cDNA. SO:xp A clone insert made from genomic DNA. cloned genomic insert sequence SO:0000914 cloned_genomic_insert A clone insert made from genomic DNA. SO:xp A clone insert that is engineered. engineered insert sequence SO:0000915 engineered_insert A clone insert that is engineered. SO:xp edit operation sequence SO:0000916 edit_operation true An edit to insert a U. insert U sequence SO:0000917 The insertion and deletion of uridine (U) residues, usually within coding regions of mRNA transcripts of cryptogenes in the mitochondrial genome of kinetoplastid protozoa. insert_U true An edit to insert a U. SO:ke An edit to delete a uridine. delete U sequence SO:0000918 The insertion and deletion of uridine (U) residues, usually within coding regions of mRNA transcripts of cryptogenes in the mitochondrial genome of kinetoplastid protozoa. delete_U true An edit to delete a uridine. 
SO:ke An edit to substitute an I for an A. substitute A to I sequence SO:0000919 substitute_A_to_I true An edit to substitute an I for an A. SO:ke An edit to insert a cytidine. insert C sequence SO:0000920 insert_C true An edit to insert a cytidine. SO:ke An edit to insert a dinucleotide. insert dinucleotide sequence SO:0000921 insert_dinucleotide true An edit to insert a dinucleotide. SO:ke An edit to substitute an U for a C. substitute C to U sequence SO:0000922 substitute_C_to_U true An edit to substitute an U for a C. SO:ke An edit to insert a G. insert G sequence SO:0000923 insert_G true An edit to insert a G. SO:ke An edit to insert a GC dinucleotide. insert GC sequence SO:0000924 The type of RNA editing found in the mitochondria of Myxomycota, characterized by the insertion of mono- and dinucleotides in RNAs relative to their mtDNA template and in addition, C to U base conversion. The most common mononucleotide insertion is cytidine, although a number of uridine mononucleotides are inserted at specific sites. Adenine and guanine have not been observed in mononucleotide insertions. Five different dinucleotide insertions have been observed, GC, GU, CU, AU and AA. Both mono- and dinucleotide insertions create open reading frames in mRNA and contribute to highly conserved structural features of rRNAs and tRNAs. insert_GC true An edit to insert a GC dinucleotide. SO:ke An edit to insert a GU dinucleotide. insert GU sequence SO:0000925 The type of RNA editing found in the mitochondria of Myxomycota, characterized by the insertion of mono- and dinucleotides in RNAs relative to their mtDNA template and in addition, C to U base conversion. The most common mononucleotide insertion is cytidine, although a number of uridine mononucleotides are inserted at specific sites. Adenine and guanine have not been observed in mononucleotide insertions. Five different dinucleotide insertions have been observed, GC, GU, CU, AU and AA. 
Both mono- and dinucleotide insertions create open reading frames in mRNA and contribute to highly conserved structural features of rRNAs and tRNAs. insert_GU true An edit to insert a GU dinucleotide. SO:ke An edit to insert a CU dinucleotide. insert CU sequence SO:0000926 The type of RNA editing found in the mitochondria of Myxomycota, characterized by the insertion of mono- and dinucleotides in RNAs relative to their mtDNA template and in addition, C to U base conversion. The most common mononucleotide insertion is cytidine, although a number of uridine mononucleotides are inserted at specific sites. Adenine and guanine have not been observed in mononucleotide insertions. Five different dinucleotide insertions have been observed, GC, GU, CU, AU and AA. Both mono- and dinucleotide insertions create open reading frames in mRNA and contribute to highly conserved structural features of rRNAs and tRNAs. insert_CU true An edit to insert a CU dinucleotide. SO:ke An edit to insert a AU dinucleotide. insert AU sequence SO:0000927 The type of RNA editing found in the mitochondria of Myxomycota, characterized by the insertion of mono- and dinucleotides in RNAs relative to their mtDNA template and in addition, C to U base conversion. The most common mononucleotide insertion is cytidine, although a number of uridine mononucleotides are inserted at specific sites. Adenine and guanine have not been observed in mononucleotide insertions. Five different dinucleotide insertions have been observed, GC, GU, CU, AU and AA. Both mono- and dinucleotide insertions create open reading frames in mRNA and contribute to highly conserved structural features of rRNAs and tRNAs. insert_AU true An edit to insert a AU dinucleotide. SO:ke An edit to insert a AA dinucleotide. 
insert AA sequence SO:0000928 The type of RNA editing found in the mitochondria of Myxomycota, characterized by the insertion of mono- and dinucleotides in RNAs relative to their mtDNA template and in addition, C to U base conversion. The most common mononucleotide insertion is cytidine, although a number of uridine mononucleotides are inserted at specific sites. Adenine and guanine have not been observed in mononucleotide insertions. Five different dinucleotide insertions have been observed, GC, GU, CU, AU and AA. Both mono- and dinucleotide insertions create open reading frames in mRNA and contribute to highly conserved structural features of rRNAs and tRNAs. insert_AA true An edit to insert a AA dinucleotide. SO:ke An mRNA that is edited. edited mRNA sequence SO:0000929 edited_mRNA An mRNA that is edited. SO:xp A region of guide RNA. guide RNA region sequence SO:0000930 guide_RNA_region A region of guide RNA. SO:ma A region of a guide_RNA that base-pairs to a target mRNA. anchor region sequence SO:0000931 anchor_region A region of a guide_RNA that base-pairs to a target mRNA. SO:jk A primary transcript that, at least in part, encodes one or more proteins that has not been edited. pre-edited mRNA sequence SO:0000932 pre_edited_mRNA An attribute to describe a feature between stages of processing. sequence SO:0000933 intermediate An attribute to describe a feature between stages of processing. SO:ke A miRNA target site is a binding site where the molecule is a micro RNA. miRNA target site sequence SO:0000934 miRNA_target_site A miRNA target site is a binding site where the molecule is a micro RNA. FB:cds A CDS that is edited. edited CDS sequence SO:0000935 edited_CDS A CDS that is edited. SO:xp Genomic DNA of immunoglobulin/T-cell receptor gene in partially rearranged genomic DNA. 
vertebrate immunoglobulin T cell receptor rearranged segment sequence SO:0000936 vertebrate_immunoglobulin_T_cell_receptor_rearranged_segment sequence SO:0000937 vertebrate_immune_system_feature true Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration. vertebrate immunoglobulin T cell receptor rearranged gene cluster sequence SO:0000938 vertebrate_immunoglobulin_T_cell_receptor_rearranged_gene_cluster Feature used for the recombination of genomic material for the purpose of generating diversity of the immune system. vertebrate immune system gene recombination signal feature sequence SO:0000939 vertebrate_immune_system_gene_recombination_signal_feature A gene that is recombinationally rearranged. recombinationally rearranged sequence SO:0000940 recombinationally_rearranged A recombinationally rearranged gene of the vertebrate immune system. recombinationally rearranged vertebrate immune system gene sequence SO:0000941 recombinationally_rearranged_vertebrate_immune_system_gene A recombinationally rearranged gene of the vertebrate immune system. SO:xp An integration/excision site of a phage chromosome at which a recombinase acts to insert the phage DNA at a cognate integration/excision site on a bacterial chromosome. attP site sequence SO:0000942 attP_site An integration/excision site of a phage chromosome at which a recombinase acts to insert the phage DNA at a cognate integration/excision site on a bacterial chromosome. SO:as An integration/excision site of a bacterial chromosome at which a recombinase acts to insert foreign DNA containing a cognate integration/excision site. attB site sequence SO:0000943 attB_site An integration/excision site of a bacterial chromosome at which a recombinase acts to insert foreign DNA containing a cognate integration/excision site. SO:as A region that results from recombination between attP_site and attB_site, composed of the 5' portion of attB_site and the 3' portion of attP_site. 
sequence attBP' attL site SO:0000944 attL_site A region that results from recombination between attP_site and attB_site, composed of the 5' portion of attB_site and the 3' portion of attP_site. SO:as A region that results from recombination between attP_site and attB_site, composed of the 5' portion of attP_site and the 3' portion of attB_site. attR site sequence attPB' SO:0000945 attR_site A region that results from recombination between attP_site and attB_site, composed of the 5' portion of attP_site and the 3' portion of attB_site. SO:as A region specifically recognised by a recombinase, which inserts or removes another region marked by a distinct cognate integration/excision site. integration excision site sequence attachment site SO:0000946 integration_excision_site A region specifically recognised by a recombinase, which inserts or removes another region marked by a distinct cognate integration/excision site. SO:as A region specifically recognized by a recombinase, which separates a physically contiguous circle of DNA into two physically separate circles. res site resolution site sequence SO:0000947 resolution_site A region specifically recognized by a recombinase, which separates a physically contiguous circle of DNA into two physically separate circles. SO:as A region specifically recognised by a recombinase, which inverts the region flanked by a pair of sites. inversion site sequence SO:0000948 A target region for site-specific inversion of a DNA region and which carries binding sites for a site-specific recombinase and accessory proteins as well as the site for specific cleavage by the recombinase. inversion_site A region specifically recognised by a recombinase, which inverts the region flanked by a pair of sites. SO:ma A site at which replicated bacterial circular chromosomes are decatenated by site specific resolvase. 
dif site sequence SO:0000949 dif_site A site at which replicated bacterial circular chromosomes are decatenated by site specific resolvase. SO:as An attC site is a sequence required for the integration of a DNA of an integron. attC site sequence SO:0000950 attC_site An attC site is a sequence required for the integration of a DNA of an integron. SO:as A signal for RNA polymerase to terminate transcription. eukaryotic terminator sequence SO:0000951 eukaryotic_terminator An origin of vegetative replication in plasmids and phages. origin of vegetative replication sequence SO:0000952 oriV An origin of vegetative replication in plasmids and phages. SO:as An origin of bacterial chromosome replication. origin of bacterial chromosome replication sequence SO:0000953 oriC An origin of bacterial chromosome replication. SO:as Structural unit composed of a self-replicating, DNA molecule. DNA chromosome sequence SO:0000954 DNA_chromosome Structural unit composed of a self-replicating, DNA molecule. SO:ma Structural unit composed of a self-replicating, double-stranded DNA molecule. double stranded DNA chromosome sequence SO:0000955 double_stranded_DNA_chromosome Structural unit composed of a self-replicating, double-stranded DNA molecule. SO:ma Structural unit composed of a self-replicating, single-stranded DNA molecule. single stranded DNA chromosome sequence SO:0000956 single_stranded_DNA_chromosome Structural unit composed of a self-replicating, single-stranded DNA molecule. SO:ma Structural unit composed of a self-replicating, double-stranded, linear DNA molecule. linear double stranded DNA chromosome sequence SO:0000957 linear_double_stranded_DNA_chromosome Structural unit composed of a self-replicating, double-stranded, linear DNA molecule. SO:ma Structural unit composed of a self-replicating, double-stranded, circular DNA molecule. 
circular double stranded DNA chromosome sequence SO:0000958 circular_double_stranded_DNA_chromosome Structural unit composed of a self-replicating, double-stranded, circular DNA molecule. SO:ma Structural unit composed of a self-replicating, single-stranded, linear DNA molecule. linear single stranded DNA chromosome sequence SO:0000959 linear_single_stranded_DNA_chromosome Structural unit composed of a self-replicating, single-stranded, linear DNA molecule. SO:ma Structural unit composed of a self-replicating, single-stranded, circular DNA molecule. circular single stranded DNA chromosome sequence SO:0000960 circular_single_stranded_DNA_chromosome Structural unit composed of a self-replicating, single-stranded, circular DNA molecule. SO:ma Structural unit composed of a self-replicating, RNA molecule. RNA chromosome sequence SO:0000961 RNA_chromosome Structural unit composed of a self-replicating, RNA molecule. SO:ma Structural unit composed of a self-replicating, single-stranded RNA molecule. single stranded RNA chromosome sequence SO:0000962 single_stranded_RNA_chromosome Structural unit composed of a self-replicating, single-stranded RNA molecule. SO:ma Structural unit composed of a self-replicating, single-stranded, linear RNA molecule. linear single stranded RNA chromosome sequence SO:0000963 linear_single_stranded_RNA_chromosome Structural unit composed of a self-replicating, single-stranded, linear RNA molecule. SO:ma Structural unit composed of a self-replicating, double-stranded, linear RNA molecule. linear double stranded RNA chromosome sequence SO:0000964 linear_double_stranded_RNA_chromosome Structural unit composed of a self-replicating, double-stranded, linear RNA molecule. SO:ma Structural unit composed of a self-replicating, double-stranded RNA molecule. double stranded RNA chromosome sequence SO:0000965 double_stranded_RNA_chromosome Structural unit composed of a self-replicating, double-stranded RNA molecule. 
SO:ma Structural unit composed of a self-replicating, single-stranded, circular RNA molecule. circular single stranded RNA chromosome sequence SO:0000966 circular_single_stranded_RNA_chromosome Structural unit composed of a self-replicating, single-stranded, circular RNA molecule. SO:ma Structural unit composed of a self-replicating, double-stranded, circular RNA molecule. circular double stranded RNA chromosome sequence SO:0000967 circular_double_stranded_RNA_chromosome Structural unit composed of a self-replicating, double-stranded, circular RNA molecule. SO:ma sequence replication mode sequence SO:0000968 This has been obsoleted as it represents a process. replaced_by: GO:0034961. sequence_replication_mode true http://en.wikipedia.org/wiki/Rolling_circle rolling circle sequence SO:0000969 This has been obsoleted as it represents a process. replaced_by: GO:0070581. rolling_circle true http://en.wikipedia.org/wiki/Rolling_circle wiki theta replication sequence SO:0000970 This has been obsoleted as it represents a process. replaced_by: GO:0070582. theta_replication true DNA replication mode sequence SO:0000971 This has been obsoleted as it represents a process. replaced_by: GO:0006260. DNA_replication_mode true RNA replication mode sequence SO:0000972 This has been obsoleted as it represents a process. replaced_by: GO:0034961. RNA_replication_mode true A terminal_inverted_repeat_element that is bacterial and only encodes the functions required for its transposition between these inverted repeats. http://en.wikipedia.org/wiki/Insertion_sequence insertion sequence sequence IS SO:0000973 insertion_sequence A terminal_inverted_repeat_element that is bacterial and only encodes the functions required for its transposition between these inverted repeats. SO:as http://en.wikipedia.org/wiki/Insertion_sequence wiki true A gene found within a minicircle.
minicircle gene sequence SO:0000975 minicircle_gene A feature_attribute describing a feature that is not manifest under normal conditions. sequence SO:0000976 cryptic A feature_attribute describing a feature that is not manifest under normal conditions. SO:ke anchor binding site sequence SO:0000977 Part of an edited transcript only. anchor_binding_site A region of a guide_RNA that specifies the insertions and deletions of bases in the editing of a target mRNA. information region template region sequence SO:0000978 template_region A region of a guide_RNA that specifies the insertions and deletions of bases in the editing of a target mRNA. SO:jk A non-protein_coding gene that encodes a guide_RNA. gRNA encoding sequence SO:0000979 gRNA_encoding A non-protein_coding gene that encodes a guide_RNA. SO:ma A minicircle is a replicon, part of a kinetoplast, that encodes for guide RNAs. SO:0000974 http://en.wikipedia.org/wiki/Minicircle minicircle_chromosome sequence SO:0000980 minicircle A minicircle is a replicon, part of a kinetoplast, that encodes for guide RNAs. PMID:8395055 http://en.wikipedia.org/wiki/Minicircle wiki A transcription terminator that is dependent upon Rho. rho dependent bacterial terminator sequence SO:0000981 rho_dependent_bacterial_terminator A transcription terminator that is not dependent upon Rho. Rather, the mRNA contains a sequence that allows it to base-pair with itself and make a stem-loop structure. rho independent bacterial terminator sequence SO:0000982 rho_independent_bacterial_terminator The attribute of how many strands are present in a nucleotide polymer. strand attribute sequence SO:0000983 Attributes added to describe the different kinds of replicon. SO workshop, September 2006. strand_attribute When a nucleotide polymer has only one strand. sequence SO:0000984 Attributes added to describe the different kinds of replicon. SO workshop, September 2006. 
single When a nucleotide polymer has two strands that are reverse-complement to one another and pair together. sequence SO:0000985 Attributes added to describe the different kinds of replicon. SO workshop, September 2006. double The attribute of whether a nucleotide polymer is linear or circular. topology attribute sequence SO:0000986 Attributes added to describe the different kinds of replicon. SO workshop, September 2006. topology_attribute A quality of a nucleotide polymer that has a 3'-terminal residue and a 5'-terminal residue. sequence two-ended SO:0000987 Attributes added to describe the different kinds of replicon. SO workshop, September 2006. linear A quality of a nucleotide polymer that has a 3'-terminal residue and a 5'-terminal residue. SO:cb A quality of a nucleotide polymer that has no terminal nucleotide residues. sequence zero-ended SO:0000988 Attributes added to describe the different kinds of replicon. SO workshop, September 2006. circular A quality of a nucleotide polymer that has no terminal nucleotide residues. SO:cb Small non-coding RNA (59-60 nt long) containing 5' and 3' ends that are predicted to come together to form a stem structure. Identified in the social amoeba Dictyostelium discoideum and localized in the cytoplasm. class II RNA sequence SO:0000989 class_II_RNA Small non-coding RNA (59-60 nt long) containing 5' and 3' ends that are predicted to come together to form a stem structure. Identified in the social amoeba Dictyostelium discoideum and localized in the cytoplasm. PMID:15333696 Small non-coding RNA (55-65 nt long) containing highly conserved 5' and 3' ends (16 and 8 nt, respectively) that are predicted to come together to form a stem structure. Identified in the social amoeba Dictyostelium discoideum and localized in the cytoplasm. class I RNA sequence SO:0000990 Requested by Karen Pilcher - Dictybase. song-Term Tracker-1574577. 
class_I_RNA Small non-coding RNA (55-65 nt long) containing highly conserved 5' and 3' ends (16 and 8 nt, respectively) that are predicted to come together to form a stem structure. Identified in the social amoeba Dictyostelium discoideum and localized in the cytoplasm. PMID:15333696 DNA located in the genome and able to be transmitted to the offspring. gDNA genomic DNA sequence SO:0000991 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. genomic_DNA DNA located in the genome and able to be transmitted to the offspring. BCS:etrwz A region of DNA that has been inserted into the bacterial genome using a bacterial artificial chromosome. BAC cloned genomic insert sequence SO:0000992 Requested by Andy Schroder - Flybase Harvard, Nov 2006. BAC_cloned_genomic_insert A sequence produced from an alignment algorithm that uses multiple sequences as input. sequence SO:0000993 Term added Dec 06 to comply with mapping to MGED terms. It should be used to generate consensus regions. The specific cross product terms they require are consensus_region and consensus_mRNA. consensus A region that has a known consensus sequence. consensus region sequence SO:0000994 Do not obsolete without considering MGED mapping. consensus_region An mRNA sequence produced from an alignment algorithm that uses multiple sequences as input. consensus mRNA sequence SO:0000995 Do not obsolete without considering MGED mapping. consensus_mRNA A region of the genome that has been predicted to be a gene but has not been confirmed by laboratory experiments. predicted gene sequence SO:0000996 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. predicted_gene A portion of a gene that is not the complete gene. gene fragment sequence SO:0000997 This term is mapped to MGED. Do not obsolete without consulting MGED ontology. gene_fragment A recursive splice site is a splice site which subdivides a large intron.
Recursive splicing is a mechanism that splices large introns by subdividing the intron at non-exonic elements and alternate exons. recursive splice site sequence SO:0000998 recursive_splice_site A recursive splice site is a splice site which subdivides a large intron. Recursive splicing is a mechanism that splices large introns by subdividing the intron at non-exonic elements and alternate exons. http://www.genetics.org/cgi/content/full/170/2/661 A region of sequence from the end of a BAC clone that may provide a highly specific marker. BAC end BAC end sequence BES sequence SO:0000999 Requested by Keith Boroevich December, 2006. BAC_end A region of sequence from the end of a BAC clone that may provide a highly specific marker. SO:ke Cytosolic 16S rRNA is an RNA component of the small subunit of cytosolic ribosomes in prokaryotes. http://en.wikipedia.org/wiki/16S_ribosomal_RNA cytosolic 16S SSU RNA cytosolic 16S ribosomal RNA cytosolic rRNA 16S sequence cytosolic 16S rRNA SO:0001000 Renamed to cytosolic_16S_rRNA from rRNA_16S on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493. cytosolic_16S_rRNA Cytosolic 16S rRNA is an RNA component of the small subunit of cytosolic ribosomes in prokaryotes. SO:ke http://en.wikipedia.org/wiki/16S_ribosomal_RNA wiki Cytosolic 23S rRNA is an RNA component of the large subunit of cytosolic ribosomes in prokaryotes. cytosolic 23S LSU rRNA cytosolic 23S rRNA cytosolic rRNA 23S sequence cytosolic 23S ribosomal RNA SO:0001001 Renamed from rRNA_23S to cytosolic_23S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493. cytosolic_23S_rRNA Cytosolic 23S rRNA is an RNA component of the large subunit of cytosolic ribosomes in prokaryotes.
SO:ke Cytosolic 25S rRNA is an RNA component of the large subunit of cytosolic ribosomes in most eukaryotes. cytosolic 25S LSU rRNA cytosolic 25S rRNA cytosolic 25S ribosomal RNA cytosolic rRNA 25S sequence SO:0001002 Renamed from rRNA_25S to cytosolic_25S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493. cytosolic_25S_rRNA Cytosolic 25S rRNA is an RNA component of the large subunit of cytosolic ribosomes in most eukaryotes. PMID:15493135 PMID:2100998 RSC:cb A recombination product between the 2 LTR of the same element. solo LTR sequence SO:0001003 Requested by Hadi Quesneville January 2007. solo_LTR A recombination product between the 2 LTR of the same element. SO:ke When a sequence does not contain an equal distribution of all four possible nucleotide bases or does not contain all nucleotide bases. low complexity sequence SO:0001004 low_complexity A region where the DNA does not contain an equal distribution of all four possible nucleotides or does not contain all four nucleotides. low complexity region sequence SO:0001005 low_complexity_region A phage genome after it has established in the host genome in a latent/immune state either as a plasmid or as an integrated "island". http://en.wikipedia.org/wiki/Prophage sequence SO:0001006 prophage A phage genome after it has established in the host genome in a latent/immune state either as a plasmid or as an integrated "island". GOC:jl http://en.wikipedia.org/wiki/Prophage wiki A remnant of an integrated prophage in the host genome or an "island" in the host genome that includes phage-like genes. http://ecoliwiki.net/colipedia/index.php/Category:Cryptic_Prophage.w cryptic prophage sequence SO:0001007 This is not cryptic in the same sense as a cryptic gene or cryptic splice site.
cryptic_prophage A remnant of an integrated prophage in the host genome or an "island" in the host genome that includes phage-like genes. GOC:jl A base-paired stem with loop of 4 non-hydrogen bonded nucleotides. http://en.wikipedia.org/wiki/Tetraloop sequence SO:0001008 tetraloop A base-paired stem with loop of 4 non-hydrogen bonded nucleotides. SO:ke http://en.wikipedia.org/wiki/Tetraloop wiki A double-stranded DNA used to control macromolecular structure and function. DNA constraint DNA constraint sequence sequence SO:0001009 DNA_constraint_sequence A double-stranded DNA used to control macromolecular structure and function. http:/www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=search&db=pubmed&term=SILVERMAN+SK[au]&dispmax=50 A cytosine rich domain whereby strands associate both inter- and intramolecularly at moderately acidic pH. i motif short intercalated motif sequence SO:0001010 i_motif A cytosine rich domain whereby strands associate both inter- and intramolecularly at moderately acidic pH. PMID:9753739 Peptide nucleic acid is a chemical not known to occur naturally but is artificially synthesized and used in some biological research and medical treatments. The PNA backbone is composed of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. The purine and pyrimidine bases are linked to the backbone by methylene carbonyl bonds. http://en.wikipedia.org/wiki/Peptide_nucleic_acid PNA oligo peptide nucleic acid sequence SO:0001011 PNA_oligo Peptide nucleic acid is a chemical not known to occur naturally but is artificially synthesized and used in some biological research and medical treatments. The PNA backbone is composed of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. The purine and pyrimidine bases are linked to the backbone by methylene carbonyl bonds. SO:ke http://en.wikipedia.org/wiki/Peptide_nucleic_acid wiki A DNA sequence with catalytic activity.
DNA enzyme catalytic DNA sequence deoxyribozyme SO:0001012 Added by request from Colin Batchelor. DNAzyme A DNA sequence with catalytic activity. SO:cb A multiple nucleotide polymorphism with alleles of common length > 1, for example AAA/TTT. sequence multiple nucleotide polymorphism SO:0001013 MNP A multiple nucleotide polymorphism with alleles of common length > 1, for example AAA/TTT. http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs2067431 An intronic region that has an attribute. intron domain sequence SO:0001014 Requested by Colin Batchelor, Feb 2007. intron_domain A type of non-canonical base pairing, most commonly between G and U, which is important for the secondary structure of RNAs. It has similar thermodynamic stability to the Watson-Crick pairing. Wobble base pairs only have two hydrogen bonds. Other wobble base pair possibilities are I-A, I-U and I-C. http://en.wikipedia.org/wiki/Wobble_base_pair wobble base pair wobble pair sequence SO:0001015 wobble_base_pair A type of non-canonical base pairing, most commonly between G and U, which is important for the secondary structure of RNAs. It has similar thermodynamic stability to the Watson-Crick pairing. Wobble base pairs only have two hydrogen bonds. Other wobble base pair possibilities are I-A, I-U and I-C. PMID:11256617 http://en.wikipedia.org/wiki/Wobble_base_pair wiki A purine-rich sequence in the group I introns which determines the locations of the splice sites in group I intron splicing and has catalytic activity. IGS internal guide sequence sequence SO:0001016 internal_guide_sequence A purine-rich sequence in the group I introns which determines the locations of the splice sites in group I intron splicing and has catalytic activity. SO:cb A sequence variant that does not affect protein function. Silent mutations may occur in genic (CDS, UTR, intron etc) and intergenic regions. Silent mutations may have effects on processes such as splicing and regulation.
http://en.wikipedia.org/wiki/Silent_mutation loinc:LA6700-4 silent mutation sequence SO:0001017 Added in March 2007 after meeting with PharmGKB. Although this term is in common usage, it is better to annotate with the most specific term possible, such as synonymous codon, intron variant etc. silent_mutation A sequence variant that does not affect protein function. Silent mutations may occur in genic (CDS, UTR, intron etc) and intergenic regions. Silent mutations may have effects on processes such as splicing and regulation. SO:ke http://en.wikipedia.org/wiki/Silent_mutation wiki loinc:LA6700-4 Silent A binding site that, in the molecule, interacts selectively and non-covalently with antibodies, B cells or T cells. http://en.wikipedia.org/wiki/Epitope sequence SO:0001018 Requested by Trish Whetzel. epitope A binding site that, in the molecule, interacts selectively and non-covalently with antibodies, B cells or T cells. SO:cb http://en.wikipedia.org/wiki/Epitope http://en.wikipedia.org/wiki/Epitope wiki A variation that increases or decreases the copy number of a given region. http://en.wikipedia.org/wiki/Copy_number_variation CNP CNV copy number polymorphism copy number variation sequence SO:0001019 copy_number_variation A variation that increases or decreases the copy number of a given region. SO:ke http://en.wikipedia.org/wiki/Copy_number_variation wiki SO:0001563 mutation affecting copy number sequence variant affecting copy number sequence SO:0001020 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_copy_number true A chromosomal region that may sustain a double-strand break, resulting in a recombination event. SO:0001242 INSDC_feature:misc_recomb INSDC_qualifier:chromosome_breakpoint aberration breakpoint aberration_junction chromosome breakpoint sequence SO:0001021 chromosome_breakpoint The point within a chromosome where an inversion begins or ends.
inversion breakpoint sequence SO:0001022 inversion_breakpoint The point within a chromosome where an inversion begins or ends. SO:cb An allele is one of a set of coexisting sequence variants of a gene. http://en.wikipedia.org/wiki/Allele allelomorph sequence SO:0001023 allele An allele is one of a set of coexisting sequence variants of a gene. SO:immuno_workshop http://en.wikipedia.org/wiki/Allele wiki A haplotype is one of a set of coexisting sequence variants of a haplotype block. http://en.wikipedia.org/wiki/Haplotype sequence SO:0001024 haplotype A haplotype is one of a set of coexisting sequence variants of a haplotype block. SO:immuno_workshop http://en.wikipedia.org/wiki/Haplotype wiki A sequence variant that is segregating in one or more natural populations of a species. polymorphic sequence variant sequence SO:0001025 polymorphic_sequence_variant A sequence variant that is segregating in one or more natural populations of a species. SO:immuno_workshop A genome is the sum of genetic material within a cell or virion. http://en.wikipedia.org/wiki/Genome sequence SO:0001026 genome A genome is the sum of genetic material within a cell or virion. SO:immuno_workshop http://en.wikipedia.org/wiki/Genome wiki A genotype is a variant genome, complete or incomplete. http://en.wikipedia.org/wiki/Genotype sequence SO:0001027 genotype A genotype is a variant genome, complete or incomplete. SO:immuno_workshop http://en.wikipedia.org/wiki/Genotype wiki A diplotype is a pair of haplotypes from a given individual. It is a genotype where the phase is known. sequence SO:0001028 diplotype A diplotype is a pair of haplotypes from a given individual. It is a genotype where the phase is known. SO:immuno_workshop The attribute of whether the sequence is the same direction as a feature (forward) or the opposite direction as a feature (reverse). 
direction attribute sequence SO:0001029 direction_attribute Forward is an attribute of the feature, where the feature is in the 5' to 3' direction. sequence SO:0001030 forward Forward is an attribute of the feature, where the feature is in the 5' to 3' direction. SO:ke Reverse is an attribute of the feature, where the feature is in the 3' to 5' direction. Again could be applied to primer. sequence SO:0001031 reverse Reverse is an attribute of the feature, where the feature is in the 3' to 5' direction. Again could be applied to primer. SO:ke DNA belonging to the genome of a mitochondrion. http://en.wikipedia.org/wiki/Mitochondrial_DNA mitochondrial DNA mtDNA sequence SO:0001032 This term is used by MO. mitochondrial_DNA http://en.wikipedia.org/wiki/Mitochondrial_DNA wiki DNA belonging to the genome of a chloroplast, a photosynthetic plastid. chloroplast DNA sequence SO:0001033 This term is used by MO. chloroplast_DNA A de-branched intron which mimics the structure of pre-miRNA and enters the miRNA processing pathway without Drosha mediated cleavage. sequence SO:0001034 Ruby et al. Nature 448:83 describe a new class of miRNAs that are derived from de-branched introns. miRtron A de-branched intron which mimics the structure of pre-miRNA and enters the miRNA processing pathway without Drosha mediated cleavage. PMID:17589500 SO:ma A small non coding RNA, part of a silencing system that prevents the spreading of selfish genetic elements. INSDC_feature:ncRNA http://en.wikipedia.org/wiki/PiRNA INSDC_qualifier:piRNA piwi-associated RNA sequence SO:0001035 piRNA A small non coding RNA, part of a silencing system that prevents the spreading of selfish genetic elements. SO:ke http://en.wikipedia.org/wiki/PiRNA wiki A tRNA sequence that has an arginine anticodon, and a 3' arginine binding region. arginyl tRNA sequence SO:0001036 arginyl_tRNA A tRNA sequence that has an arginine anticodon, and a 3' arginine binding region.
SO:ke A nucleotide region with either intra-genome or intracellular mobility, of varying length, which often carries the information necessary for transfer and recombination with the host genome. http://en.wikipedia.org/wiki/Mobile_genetic_element INSDC_feature:mobile_element MGE mobile genetic element sequence SO:0001037 mobile_genetic_element A nucleotide region with either intra-genome or intracellular mobility, of varying length, which often carries the information necessary for transfer and recombination with the host genome. PMID:14681355 http://en.wikipedia.org/wiki/Mobile_genetic_element wiki An MGE that is not integrated into the host chromosome. extrachromosomal mobile genetic element sequence SO:0001038 extrachromosomal_mobile_genetic_element An MGE that is not integrated into the host chromosome. SO:ke An MGE that is integrated into the host chromosome. integrated mobile genetic element sequence SO:0001039 integrated_mobile_genetic_element An MGE that is integrated into the host chromosome. SO:ke A plasmid sequence that is integrated within the host chromosome. integrated plasmid sequence SO:0001040 integrated_plasmid A plasmid sequence that is integrated within the host chromosome. SO:ke The region of nucleotide sequence of a virus, a submicroscopic particle that replicates by infecting a host cell. viral sequence virus sequence sequence SO:0001041 The definitions of the children of this term were revised December 2007 after discussion on song-devel. The resulting definitions are slightly unwieldy but hopefully more logically correct. viral_sequence The region of nucleotide sequence of a virus, a submicroscopic particle that replicates by infecting a host cell. SO:ke The nucleotide sequence of a virus that infects bacteria. http://en.wikipedia.org/wiki/Bacteriophage bacteriophage phage phage sequence sequence SO:0001042 phage_sequence The nucleotide sequence of a virus that infects bacteria.
SO:ke http://en.wikipedia.org/wiki/Bacteriophage wiki An attachment site located on a conjugative transposon and used for site-specific integration of a conjugative transposon. attCtn site sequence SO:0001043 attCtn_site An attachment site located on a conjugative transposon and used for site-specific integration of a conjugative transposon. Phigo:at A nuclear pseudogene of either coding or non-coding mitochondria derived sequence. http://en.wikipedia.org/wiki/Numt NUMT nuclear mitochondrial pseudogene nuclear mt pseudogene sequence SO:0001044 Definition change requested by Val, 3172757. nuclear_mt_pseudogene A nuclear pseudogene of either coding or non-coding mitochondria derived sequence. SO:xp http://en.wikipedia.org/wiki/Numt wikipedia An MGE region consisting of two fused plasmids resulting from a replicative transposition event. cointegrated plasmid cointegrated replicon sequence SO:0001045 cointegrated_plasmid An MGE region consisting of two fused plasmids resulting from a replicative transposition event. phigo:at Component of the inversion site located at the left of a region susceptible to site-specific inversion. IRLinv site sequence SO:0001046 IRLinv_site Component of the inversion site located at the left of a region susceptible to site-specific inversion. Phigo:at Component of the inversion site located at the right of a region susceptible to site-specific inversion. IRRinv site sequence SO:0001047 IRRinv_site Component of the inversion site located at the right of a region susceptible to site-specific inversion. Phigo:at A region located within an inversion site. inversion site part sequence SO:0001048 A term created to allow the parts of an inversion site have an is_a path back to the root. inversion_site_part A region located within an inversion site. SO:ke An island that contains genes for integration/excision and the gene and site for the initiation of intercellular transfer by conjugation.
It can be complemented for transfer by a conjugative transposon. defective conjugative transposon sequence SO:0001049 defective_conjugative_transposon An island that contains genes for integration/excision and the gene and site for the initiation of intercellular transfer by conjugation. It can be complemented for transfer by a conjugative transposon. Phigo:ariane A portion of a repeat, interrupted by the insertion of another element. repeat fragment sequence SO:0001050 Requested by Chris Smith, and others at Flybase to help annotate nested repeats. repeat_fragment A portion of a repeat, interrupted by the insertion of another element. SO:ke sequence SO:0001051 nested_region true sequence SO:0001052 nested_repeat true sequence SO:0001053 nested_transposon true A portion of a transposon, interrupted by the insertion of another element. transposon fragment sequence SO:0001054 transposon_fragment A portion of a transposon, interrupted by the insertion of another element. SO:ke A regulatory_region that modulates the transcription of a gene or genes. INSDC_feature:regulatory INSDC_qualifier:transcriptional_cis_regulatory_region transcription-control region transcriptional cis regulatory region sequence SO:0001055 Previous parent term transcription_regulatory_region (SO:0001067) has been merged with this term on 11 Feb 2021 as part of the GREEKC consortium. See GitHub Issue #527. transcriptional_cis_regulatory_region A regulatory_region that modulates the transcription of a gene or genes. PMID:9679020 SO:regcreative A regulatory_region that modulates splicing. splicing regulatory region sequence SO:0001056 Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. 
splicing_regulatory_region A regulatory_region that modulates splicing. SO:ke sequence SO:0001057 enhanceosome true A transcriptional_cis_regulatory_region that restricts the activity of a CRM to a single promoter and which functions only when both itself and an insulator are located between the CRM and the promoter. promoter targeting sequence sequence SO:0001058 Obsoleted Jan 21, 2021 by Dave Sant. GREEKC consortium individuals pointed out that this did not fit with the other child terms of transcriptional_cis_regulatory_region (SO:0001055), which are currently promoter, CRM and promoter flanking region. No comments about when this term was created exist, no references are listed. GREEKC members assume that this was previously under enhanceosome (SO:0001057), which was probably created along with this term but has since been obsoleted. This term can be resurrected as non-obsolete if we can find a reference publication and/or change the name to a term that is commonly used in the field. promoter_targeting_sequence true A transcriptional_cis_regulatory_region that restricts the activity of a CRM to a single promoter and which functions only when both itself and an insulator are located between the CRM and the promoter. SO:regcreative A sequence_alteration is a sequence_feature whose extent is the deviation from another sequence. SO:1000004 SO:1000007 INSDC_feature:misc_feature INSDC_feature:variation INSDC_note:sequence_alteration sequence alteration partially characterised change in DNA sequence partially_characterised_change_in_DNA_sequence uncharacterised_change_in_nucleotide_sequence sequence sequence variation SO:0001059 Merged with partially characterized change in nucleotide sequence. sequence_alteration A sequence_alteration is a sequence_feature whose extent is the deviation from another sequence. SO:ke A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration.
Jannovar:sequence_variant VAAST:sequence_variant sequence variant sequence ANNOVAR:unknown SO:0001060 sequence_variant A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration. SO:ke Jannovar:sequence_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VAAST:sequence_variant ANNOVAR:unknown http://www.openbioinformatics.org/annovar/annovar_download.html The propeptide_cleavage_site is the arginine/lysine boundary on a propeptide where cleavage occurs. BS:00063 propeptide cleavage site sequence SO:0001061 Discrete. propeptide_cleavage_site The propeptide_cleavage_site is the arginine/lysine boundary on a propeptide where cleavage occurs. EBIBS:GAR Part of a peptide chain which is cleaved off during the formation of the mature protein. BS:00077 http://en.wikipedia.org/wiki/Propeptide INSDC_feature:propeptide sequence propep SO:0001062 Range. propeptide Part of a peptide chain which is cleaved off during the formation of the mature protein. EBIBS:GAR http://en.wikipedia.org/wiki/Propeptide wiki propep uniprot:feature_type An immature_peptide_region is the extent of the peptide after it has been translated and before any processing occurs. BS:00129 immature peptide region sequence SO:0001063 Range. immature_peptide_region An immature_peptide_region is the extent of the peptide after it has been translated and before any processing occurs. EBIBS:GAR Active peptides are proteins which are biologically active, released from a precursor molecule. BS:00076 peptide http://en.wikipedia.org/wiki/Peptide active peptide sequence SO:0001064 Hormones, neuropeptides, antimicrobial peptides, are active peptides. They are typically short (<40 amino acids) in length. active_peptide Active peptides are proteins which are biologically active, released from a precursor molecule. 
EBIBS:GAR UniProt:curation_manual peptide uniprot:feature_type http://en.wikipedia.org/wiki/Peptide wiki true Polypeptide region that is rich in a particular amino acid or homopolymeric and greater than three residues in length. BS:00068 compositionally_biased_region sequence compbias compositional bias compositionally biased compositionally biased region of peptide SO:0001066 Range. compositionally_biased_region_of_peptide Polypeptide region that is rich in a particular amino acid or homopolymeric and greater than three residues in length. EBIBS:GAR UniProt:curation_manual compbias uniprot:feature_type A sequence motif is a short (up to 20 amino acids) region of biological interest. Such motifs, although they are too short to constitute functional domains, share sequence similarities and are conserved in different proteins. They display a common function (protein-binding, subcellular location etc.). BS:00032 motif polypeptide motif sequence SO:0001067 Range. polypeptide_motif A sequence motif is a short (up to 20 amino acids) region of biological interest. Such motifs, although they are too short to constitute functional domains, share sequence similarities and are conserved in different proteins. They display a common function (protein-binding, subcellular location etc.). EBIBS:GAR UniProt:curation_manual motif uniprot:feature_type A polypeptide_repeat is a single copy of an internal sequence repetition. BS:00070 polypeptide repeat sequence repeat SO:0001068 Range. polypeptide_repeat A polypeptide_repeat is a single copy of an internal sequence repetition. EBIBS:GAR repeat uniprot:feature_type true Region of polypeptide with a given structural property. BS:00337 polypeptide structural region sequence structural_region SO:0001070 Range. polypeptide_structural_region Region of polypeptide with a given structural property. EBIBS:GAR SO:cb Arrangement of the polypeptide with respect to the lipid bilayer. BS:00128 membrane structure sequence SO:0001071 Range. 
membrane_structure Arrangement of the polypeptide with respect to the lipid bilayer. EBIBS:GAR Polypeptide region that is localized outside of a lipid bilayer. BS:00154 extramembrane polypeptide region sequence extramembrane extramembrane_region topo_dom SO:0001072 Range. extramembrane_polypeptide_region Polypeptide region that is localized outside of a lipid bilayer. EBIBS:GAR SO:cb extramembrane extramembrane_region topo_dom uniprot:feature_type Polypeptide region that is localized inside the cytoplasm. BS:00145 cytoplasm_location cytoplasmic polypeptide region sequence inside SO:0001073 cytoplasmic_polypeptide_region Polypeptide region that is localized inside the cytoplasm. EBIBS:GAR SO:cb cytoplasm_location inside Polypeptide region that is localized outside of a lipid bilayer and outside of the cytoplasm. BS:00144 non cytoplasmic polypeptide region non_cytoplasm_location sequence outside SO:0001074 This could be inside an organelle within the cell. non_cytoplasmic_polypeptide_region Polypeptide region that is localized outside of a lipid bilayer and outside of the cytoplasm. EBIBS:GAR SO:cb non_cytoplasm_location outside Polypeptide region present in the lipid bilayer. BS:00156 intramembrane polypeptide region sequence intramembrane SO:0001075 intramembrane_polypeptide_region Polypeptide region present in the lipid bilayer. EBIBS:GAR intramembrane Polypeptide region localized within the lipid bilayer where both ends traverse the same membrane. BS:00155 membrane peptide loop sequence membrane_loop SO:0001076 membrane_peptide_loop Polypeptide region localized within the lipid bilayer where both ends traverse the same membrane. EBIBS:GAR SO:cb membrane_loop Polypeptide region traversing the lipid bilayer. BS:00158 transmembrane polypeptide region sequence transmem transmembrane SO:0001077 transmembrane_polypeptide_region Polypeptide region traversing the lipid bilayer. 
EBIBS:GAR UniProt:curator_manual transmem uniprot:feature_type transmembrane A region of peptide with secondary structure has hydrogen bonding along the peptide chain that causes a defined conformation of the chain. BS:00003 http://en.wikipedia.org/wiki/Secondary_structure polypeptide secondary structure sequence 2nary structure secondary structure secondary structure region secondary_structure SO:0001078 Biosapien term was secondary_structure. polypeptide_secondary_structure A region of peptide with secondary structure has hydrogen bonding along the peptide chain that causes a defined conformation of the chain. EBIBS:GAR http://en.wikipedia.org/wiki/Secondary_structure wiki 2nary structure secondary structure secondary structure region secondary_structure Motif is a three-dimensional structural element within the chain, which appears also in a variety of other molecules. Unlike a domain, a motif does not need to form a stable globular unit. BS:0000338 http://en.wikipedia.org/wiki/Structural_motif sequence polypeptide structural motif structural_motif SO:0001079 polypeptide_structural_motif Motif is a three-dimensional structural element within the chain, which appears also in a variety of other molecules. Unlike a domain, a motif does not need to form a stable globular unit. EBIBS:GAR http://en.wikipedia.org/wiki/Structural_motif wiki structural_motif A coiled coil is a structural motif in proteins, in which alpha-helices are coiled together like the strands of a rope. BS:00041 http://en.wikipedia.org/wiki/Coiled_coil coiled coil sequence coiled SO:0001080 Range. coiled_coil A coiled coil is a structural motif in proteins, in which alpha-helices are coiled together like the strands of a rope. EBIBS:GAR UniProt:curation_manual http://en.wikipedia.org/wiki/Coiled_coil wiki coiled uniprot:feature_type A motif comprising two helices separated by a turn. 
BS:00147 helix turn helix helix-turn-helix sequence HTH SO:0001081 helix_turn_helix A motif comprising two helices separated by a turn. EBIBS:GAR HTH Incompatibility in the sequence due to some experimental problem. BS:00125 sequencing_information sequence SO:0001082 Range. polypeptide_sequencing_information Incompatibility in the sequence due to some experimental problem. EBIBS:GAR Indicates that two consecutive residues in a fragment sequence are not consecutive in the full-length protein and that there are a number of unsequenced residues between them. BS:00182 non consecutive non_cons sequence SO:0001083 non_adjacent_residues Indicates that two consecutive residues in a fragment sequence are not consecutive in the full-length protein and that there are a number of unsequenced residues between them. EBIBS:GAR UniProt:curation_manual non_cons uniprot:feature_type The residue at an extremity of the sequence is not the terminal residue. BS:00072 non terminal non_ter sequence SO:0001084 Discrete. non_terminal_residue The residue at an extremity of the sequence is not the terminal residue. EBIBS:GAR UniProt:curation_manual non_ter uniprot:feature_type Different sources report differing sequences. BS:00069 conflict sequence SO:0001085 Discrete. sequence_conflict Different sources report differing sequences. EBIBS:GAR UniProt:curation_manual conflict uniprot:feature_type Describes the positions in a sequence where the authors are unsure about the sequence assignment. BS:00181 INSDC_feature:unsure unsure sequence SO:0001086 sequence_uncertainty Describes the positions in a sequence where the authors are unsure about the sequence assignment. EBIBS:GAR UniProt:curation_manual unsure uniprot:feature_type Posttranslationally formed amino acid bonds. BS:00178 cross link sequence crosslink SO:0001087 cross_link true Posttranslationally formed amino acid bonds. 
EBIBS:GAR UniProt:curation_manual The covalent bond between sulfur atoms that binds two peptide chains or different parts of one peptide chain and is a structural determinant in many protein molecules. BS:00028 disulphide sequence disulfid disulfide disulfide bond disulphide bond SO:0001088 2 discrete & joined. disulfide_bond true The covalent bond between sulfur atoms that binds two peptide chains or different parts of one peptide chain and is a structural determinant in many protein molecules. EBIBS:GAR UniProt:curation_manual A region where a transformation occurs in a protein after it has been synthesized. This may regulate, stabilize, crosslink or introduce new chemical functionalities in the protein. BS:00052 http://en.wikipedia.org/wiki/Post_translational_modification mod_res modified residue post_translational_modification sequence SO:0001089 Discrete. post_translationally_modified_region A region where a transformation occurs in a protein after it has been synthesized. This may regulate, stabilize, crosslink or introduce new chemical functionalities in the protein. EBIBS:GAR UniProt:curation_manual http://en.wikipedia.org/wiki/Post_translational_modification wiki mod_res uniprot:feature_type Binding involving a covalent bond. BS:00246 covalent binding site sequence SO:0001090 covalent_binding_site true Binding involving a covalent bond. EBIBS:GAR Binding site for any chemical group (co-enzyme, prosthetic group, etc.). BS:00029 non covalent binding site sequence binding binding site SO:0001091 Discrete. non_covalent_binding_site true Binding site for any chemical group (co-enzyme, prosthetic group, etc.). EBIBS:GAR binding uniprot:curation A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with metal ions. BS:00027 sequence metal_binding SO:0001092 Residue is part of a binding site for a metal ion.
polypeptide_metal_contact A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with metal ions. EBIBS:GAR SO:cb UniProt:curation_manual A binding site that, in the protein molecule, interacts selectively and non-covalently with polypeptide residues. BS:00131 http://en.wikipedia.org/wiki/Protein_protein_interaction protein protein contact protein protein contact site sequence protein_protein_interaction SO:0001093 protein_protein_contact A binding site that, in the protein molecule, interacts selectively and non-covalently with polypeptide residues. EBIBS:GAR UniProt:Curation_manual http://en.wikipedia.org/wiki/Protein_protein_interaction wiki A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with calcium ions. BS:00186 Ca_contact_site ca_bind polypeptide calcium ion contact site sequence ca bind SO:0001094 Residue involved in contact with calcium. polypeptide_calcium_ion_contact_site A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with calcium ions. EBIBS:GAR ca_bind uniprot:feature_type A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with cobalt ions. BS:00136 Co_contact_site polypeptide cobalt ion contact site sequence SO:0001095 polypeptide_cobalt_ion_contact_site A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with cobalt ions. EBIBS:GAR SO:cb A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with copper ions. BS:00146 Cu_contact_site polypeptide copper ion contact site sequence SO:0001096 polypeptide_copper_ion_contact_site A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with copper ions. EBIBS:GAR SO:cb A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with iron ions. 
BS:00137 Fe_contact_site polypeptide iron ion contact site sequence SO:0001097 polypeptide_iron_ion_contact_site A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with iron ions. EBIBS:GAR SO:cb A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with magnesium ions. BS:00187 Mg_contact_site polypeptide magnesium ion contact site sequence SO:0001098 polypeptide_magnesium_ion_contact_site A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with magnesium ions. EBIBS:GAR SO:cb A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with manganese ions. BS:00140 Mn_contact_site polypeptide manganese ion contact site sequence SO:0001099 polypeptide_manganese_ion_contact_site A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with manganese ions. EBIBS:GAR SO:cb A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with molybdenum ions. BS:00141 Mo_contact_site polypeptide molybdenum ion contact site sequence SO:0001100 polypeptide_molybdenum_ion_contact_site A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with molybdenum ions. EBIBS:GAR SO:cb A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with nickel ions. BS:00142 Ni_contact_site polypeptide nickel ion contact site sequence SO:0001101 polypeptide_nickel_ion_contact_site A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with nickel ions. EBIBS:GAR A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with tungsten ions. 
BS:00143 W_contact_site polypeptide tungsten ion contact site sequence SO:0001102 polypeptide_tungsten_ion_contact_site A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with tungsten ions. EBIBS:GAR SO:cb A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with zinc ions. BS:00185 Zn_contact_site polypeptide zinc ion contact site sequence SO:0001103 polypeptide_zinc_ion_contact_site A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with zinc ions. EBIBS:GAR SO:cb Amino acid involved in the activity of an enzyme. BS:00026 active site residue catalytic residue sequence act_site SO:0001104 Discrete. catalytic_residue Amino acid involved in the activity of an enzyme. EBIBS:GAR UniProt:curation_manual act_site uniprot:feature_type Residues which interact with a ligand. BS:00157 polypeptide ligand contact sequence protein-ligand interaction SO:0001105 polypeptide_ligand_contact Residues which interact with a ligand. EBIBS:GAR A motif of five consecutive residues and two H-bonds in which: Residue(i) is Aspartate or Asparagine (Asx), side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2) or (i+3), main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+3) or (i+4). BS:00202 asx motif sequence SO:0001106 asx_motif A motif of five consecutive residues and two H-bonds in which: Residue(i) is Aspartate or Asparagine (Asx), side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2) or (i+3), main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+3) or (i+4). EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of three residues within a beta-sheet in which the main chains of two consecutive residues are H-bonded to that of the third, and in which the dihedral angles are as follows: Residue(i): -140 degrees < phi(l) < -20 degrees, -90 degrees < psi(l) < 40 degrees.
Residue (i+1): -180 degrees < phi < -25 degrees or +120 degrees < phi < +180 degrees, +40 degrees < psi < +180 degrees or -180 degrees < psi < -120 degrees. BS:00208 http://en.wikipedia.org/wiki/Beta_bulge beta bulge sequence SO:0001107 beta_bulge A motif of three residues within a beta-sheet in which the main chains of two consecutive residues are H-bonded to that of the third, and in which the dihedral angles are as follows: Residue(i): -140 degrees < phi(l) < -20 degrees, -90 degrees < psi(l) < 40 degrees. Residue (i+1): -180 degrees < phi < -25 degrees or +120 degrees < phi < +180 degrees, +40 degrees < psi < +180 degrees or -180 degrees < psi < -120 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ http://en.wikipedia.org/wiki/Beta_bulge wiki A motif of three residues within a beta-sheet consisting of two H-bonds. Beta bulge loops often occur at the loop ends of beta-hairpins. BS:00209 beta bulge loop sequence SO:0001108 beta_bulge_loop A motif of three residues within a beta-sheet consisting of two H-bonds. Beta bulge loops often occur at the loop ends of beta-hairpins. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of three residues within a beta-sheet consisting of two H-bonds in which: the main-chain NH of residue(i) is H-bonded to the main-chain CO of residue(i+4), the main-chain CO of residue i is H-bonded to the main-chain NH of residue(i+3), these loops have an RL nest at residues i+2 and i+3. BS:00210 beta bulge loop five sequence SO:0001109 beta_bulge_loop_five A motif of three residues within a beta-sheet consisting of two H-bonds in which: the main-chain NH of residue(i) is H-bonded to the main-chain CO of residue(i+4), the main-chain CO of residue i is H-bonded to the main-chain NH of residue(i+3), these loops have an RL nest at residues i+2 and i+3.
EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of three residues within a beta-sheet consisting of two H-bonds in which: the main-chain NH of residue(i) is H-bonded to the main-chain CO of residue(i+5), the main-chain CO of residue i is H-bonded to the main-chain NH of residue(i+4), these loops have an RL nest at residues i+3 and i+4. BS:00211 beta bulge loop six sequence SO:0001110 beta_bulge_loop_six A motif of three residues within a beta-sheet consisting of two H-bonds in which: the main-chain NH of residue(i) is H-bonded to the main-chain CO of residue(i+5), the main-chain CO of residue i is H-bonded to the main-chain NH of residue(i+4), these loops have an RL nest at residues i+3 and i+4. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A beta strand describes a single length of polypeptide chain that forms part of a beta sheet. A single continuous stretch of amino acids adopting an extended conformation of hydrogen bonds between the N-H and the C=O of another part of the peptide. This forms a secondary protein structure in which two or more extended polypeptide regions are hydrogen-bonded to one another in a planar array. BS:00042 http://en.wikipedia.org/wiki/Beta_sheet sequence strand SO:0001111 Range. beta_strand A beta strand describes a single length of polypeptide chain that forms part of a beta sheet. A single continuous stretch of amino acids adopting an extended conformation of hydrogen bonds between the N-H and the C=O of another part of the peptide. This forms a secondary protein structure in which two or more extended polypeptide regions are hydrogen-bonded to one another in a planar array. EBIBS:GAR UniProt:curation_manual http://en.wikipedia.org/wiki/Beta_sheet wiki strand uniprot:feature_type A peptide region which is hydrogen bonded to another region of peptide running in the opposite direction (one running N-terminal to C-terminal and one running C-terminal to N-terminal).
Hydrogen bonding occurs between every other C=O from one strand to every other N-H on the adjacent strand. In this case, if two atoms C-alpha (i) and C-alpha (j) are adjacent in two hydrogen-bonded beta strands, then they form two mutual backbone hydrogen bonds to each other's flanking peptide groups; this is known as a close pair of hydrogen bonds. The peptide backbone dihedral angles (phi, psi) are about (-140 degrees, 135 degrees) in antiparallel sheets. BS:0000341 antiparallel beta strand sequence SO:0001112 Range. antiparallel_beta_strand A peptide region which is hydrogen bonded to another region of peptide running in the opposite direction (one running N-terminal to C-terminal and one running C-terminal to N-terminal). Hydrogen bonding occurs between every other C=O from one strand to every other N-H on the adjacent strand. In this case, if two atoms C-alpha (i) and C-alpha (j) are adjacent in two hydrogen-bonded beta strands, then they form two mutual backbone hydrogen bonds to each other's flanking peptide groups; this is known as a close pair of hydrogen bonds. The peptide backbone dihedral angles (phi, psi) are about (-140 degrees, 135 degrees) in antiparallel sheets. EBIBS:GAR UniProt:curation_manual A peptide region which is hydrogen bonded to another region of peptide running in the same direction (both running N-terminal to C-terminal). This orientation is slightly less stable because it introduces nonplanarity in the inter-strand hydrogen bonding pattern. Hydrogen bonding occurs between every other C=O from one strand to every other N-H on the adjacent strand. In this case, if two atoms C-alpha (i) and C-alpha (j) are adjacent in two hydrogen-bonded beta strands, then they do not hydrogen bond to each other; rather, one residue forms hydrogen bonds to the residues that flank the other (but not vice versa). For example, residue i may form hydrogen bonds to residues j - 1 and j + 1; this is known as a wide pair of hydrogen bonds.
By contrast, residue j may hydrogen-bond to different residues altogether, or to none at all. The dihedral angles (phi, psi) are about (-120 degrees, 115 degrees) in parallel sheets. BS:00151 parallel beta strand sequence SO:0001113 Range. parallel_beta_strand A peptide region which is hydrogen bonded to another region of peptide running in the same direction (both running N-terminal to C-terminal). This orientation is slightly less stable because it introduces nonplanarity in the inter-strand hydrogen bonding pattern. Hydrogen bonding occurs between every other C=O from one strand to every other N-H on the adjacent strand. In this case, if two atoms C-alpha (i) and C-alpha (j) are adjacent in two hydrogen-bonded beta strands, then they do not hydrogen bond to each other; rather, one residue forms hydrogen bonds to the residues that flank the other (but not vice versa). For example, residue i may form hydrogen bonds to residues j - 1 and j + 1; this is known as a wide pair of hydrogen bonds. By contrast, residue j may hydrogen-bond to different residues altogether, or to none at all. The dihedral angles (phi, psi) are about (-120 degrees, 115 degrees) in parallel sheets. EBIBS:GAR UniProt:curation_manual A helix is a secondary_structure conformation where the peptide backbone forms a coil. BS:00152 sequence helix SO:0001114 Range. peptide_helix A helix is a secondary_structure conformation where the peptide backbone forms a coil. EBIBS:GAR helix A left handed helix is a region of peptide where the coiled conformation turns in an anticlockwise, left handed screw. BS:00222 left handed helix sequence helix-l SO:0001115 left_handed_peptide_helix A left handed helix is a region of peptide where the coiled conformation turns in an anticlockwise, left handed screw. EBIBS:GAR A right handed helix is a region of peptide where the coiled conformation turns in a clockwise, right handed screw.
BS:0000339 right handed helix sequence helix SO:0001116 right_handed_peptide_helix A right handed helix is a region of peptide where the coiled conformation turns in a clockwise, right handed screw. EBIBS:GAR helix The helix has 3.6 residues per turn which corresponds to a translation of 1.5 angstroms (= 0.15 nm) along the helical axis. Every backbone N-H group donates a hydrogen bond to the backbone C=O group of the amino acid four residues earlier. BS:00040 http://en.wikipedia.org/wiki/Alpha_helix sequence a-helix helix SO:0001117 Range. alpha_helix The helix has 3.6 residues per turn which corresponds to a translation of 1.5 angstroms (= 0.15 nm) along the helical axis. Every backbone N-H group donates a hydrogen bond to the backbone C=O group of the amino acid four residues earlier. EBIBS:GAR http://en.wikipedia.org/wiki/Alpha_helix wiki a-helix helix uniprot:feature_type The pi helix has 4.1 residues per turn and a translation of 1.15 angstroms (= 0.115 nm) along the helical axis. The N-H group of an amino acid forms a hydrogen bond with the C=O group of the amino acid five residues earlier. BS:00153 http://en.wikipedia.org/wiki/Pi_helix pi helix sequence SO:0001118 Range. pi_helix The pi helix has 4.1 residues per turn and a translation of 1.15 angstroms (= 0.115 nm) along the helical axis. The N-H group of an amino acid forms a hydrogen bond with the C=O group of the amino acid five residues earlier. EBIBS:GAR http://en.wikipedia.org/wiki/Pi_helix wiki The 3-10 helix has 3 residues per turn with a translation of 2.0 angstroms (= 0.2 nm) along the helical axis. The N-H group of an amino acid forms a hydrogen bond with the C=O group of the amino acid three residues earlier. BS:0000340 http://en.wikipedia.org/wiki/310_helix 3(10) helix 3-10 helix 310 helix three ten helix sequence SO:0001119 Range. three_ten_helix The 3-10 helix has 3 residues per turn with a translation of 2.0 angstroms (= 0.2 nm) along the helical axis.
The N-H group of an amino acid forms a hydrogen bond with the C=O group of the amino acid three residues earlier. EBIBS:GAR http://en.wikipedia.org/wiki/310_helix wiki A motif of two consecutive residues with dihedral angles. Nest should not have Proline as any residue. Nests frequently occur as parts of other motifs such as Schellman loops. BS:00223 nest_motif sequence nest polypeptide nest motif SO:0001120 polypeptide_nest_motif A motif of two consecutive residues with dihedral angles. Nest should not have Proline as any residue. Nests frequently occur as parts of other motifs such as Schellman loops. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ nest A motif of two consecutive residues with dihedral angles: Residue(i): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees. Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. BS:00224 nest_left_right nest_lr polypeptide nest left right motif sequence SO:0001121 polypeptide_nest_left_right_motif A motif of two consecutive residues with dihedral angles: Residue(i): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees. Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of two consecutive residues with dihedral angles: Residue(i): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. Residue(i+1): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees. BS:00225 nest_right_left nest_rl polypeptide nest right left motif sequence SO:0001122 polypeptide_nest_right_left_motif A motif of two consecutive residues with dihedral angles: Residue(i): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. Residue(i+1): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of six or seven consecutive residues that contains two H-bonds. 
BS:00226 schellmann loop sequence paperclip paperclip loop SO:0001123 schellmann_loop A motif of six or seven consecutive residues that contains two H-bonds. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ paperclip Wild type: A motif of seven consecutive residues that contains two H-bonds in which: the main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+6), the main-chain CO of residue(i+1) is H-bonded to the main-chain NH of residue(i+5). BS:00228 schellmann loop seven seven-residue schellmann loop sequence SO:0001124 schellmann_loop_seven Wild type: A motif of seven consecutive residues that contains two H-bonds in which: the main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+6), the main-chain CO of residue(i+1) is H-bonded to the main-chain NH of residue(i+5). EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ Common Type: A motif of six consecutive residues that contains two H-bonds in which: the main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+5) the main-chain CO of residue(i+1) is H-bonded to the main-chain NH of residue(i+4). BS:00227 schellmann loop six six-residue schellmann loop sequence SO:0001125 schellmann_loop_six Common Type: A motif of six consecutive residues that contains two H-bonds in which: the main-chain CO of residue(i) is H-bonded to the main-chain NH of residue(i+5) the main-chain CO of residue(i+1) is H-bonded to the main-chain NH of residue(i+4). EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of five consecutive residues and two hydrogen bonds in which: residue(i) is Serine (S) or Threonine (T), the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2) or (i+3) , the main-chain CO group of residue(i) is H-bonded to the main-chain NH of residue(i+3) or (i+4). 
BS:00229 serine/threonine motif st motif st_motif sequence SO:0001126 serine_threonine_motif A motif of five consecutive residues and two hydrogen bonds in which: residue(i) is Serine (S) or Threonine (T), the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2) or (i+3) , the main-chain CO group of residue(i) is H-bonded to the main-chain NH of residue(i+3) or (i+4). EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of four or five consecutive residues and one H-bond in which: residue(i) is Serine (S) or Threonine (T), the side-chain OH of residue(i) is H-bonded to the main-chain CO of residue(i3) or (i4), Phi angles of residues(i1), (i2) and (i3) are negative. BS:00230 serine threonine staple motif st_staple sequence SO:0001127 serine_threonine_staple_motif A motif of four or five consecutive residues and one H-bond in which: residue(i) is Serine (S) or Threonine (T), the side-chain OH of residue(i) is H-bonded to the main-chain CO of residue(i3) or (i4), Phi angles of residues(i1), (i2) and (i3) are negative. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A reversal in the direction of the backbone of a protein that is stabilized by hydrogen bond between backbone NH and CO groups, involving no more than 4 amino acid residues. BS:00148 sequence turn SO:0001128 Range. polypeptide_turn_motif A reversal in the direction of the backbone of a protein that is stabilized by hydrogen bond between backbone NH and CO groups, involving no more than 4 amino acid residues. EBIBS:GAR uniprot:feature_type turn Left handed type I (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, -90 degrees < psi +120 degrees < +40 degrees. Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. 
BS:00206 asx turn left handed type one sequence asx_turn_il SO:0001129 asx_turn_left_handed_type_one Left handed type I (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, -90 degrees < psi +120 degrees < +40 degrees. Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ Left handed type II (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, +80 degrees < psi +120 degrees < +180 degrees. Residue(i+1): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees. BS:00204 asx turn left handed type two asx_turn_iil sequence SO:0001130 asx_turn_left_handed_type_two Left handed type II (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, +80 degrees < psi +120 degrees < +180 degrees. Residue(i+1): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ Right handed type II (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, +80 degrees < psi +120 degrees < +180 degrees. Residue(i+1): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees. BS:00205 asx turn right handed type two asx_turn_iir sequence SO:0001131 asx_turn_right_handed_type_two Right handed type II (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, +80 degrees < psi +120 degrees < +180 degrees. Residue(i+1): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ Right handed type I (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, -90 degrees < psi +120 degrees < +40 degrees. Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. 
BS:00207 asx turn type right handed type one asx_turn_ir sequence SO:0001132 asx_turn_right_handed_type_one Right handed type I (dihedral angles):- Residue(i): -140 degrees < chi (1) -120 degrees < -20 degrees, -90 degrees < psi +120 degrees < +40 degrees. Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles of the second and third residues, which are the basis for sub-categorization. BS:00212 beta turn sequence SO:0001133 beta_turn A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles of the second and third residues, which are the basis for sub-categorization. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ Left handed type I:A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles:- Residue(i+1): -140 degrees > phi > -20 degrees, -90 degrees > psi > +40 degrees. Residue(i+2): -140 degrees > phi > -20 degrees, -90 degrees > psi > +40 degrees. BS:00215 beta turn left handed type one beta_turn_il type I' beta turn type I' turn sequence SO:0001134 beta_turn_left_handed_type_one Left handed type I:A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles:- Residue(i+1): -140 degrees > phi > -20 degrees, -90 degrees > psi > +40 degrees. 
Residue(i+2): -140 degrees > phi > -20 degrees, -90 degrees > psi > +40 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ Left handed type II: A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles: Residue(i+1): -140 degrees > phi > -20 degrees, +80 degrees > psi > +180 degrees. Residue(i+2): +20 degrees > phi > +140 degrees, -40 degrees > psi > +90 degrees. BS:00213 beta turn left handed type two beta_turn_iil type II' beta turn type II' turn sequence SO:0001135 beta_turn_left_handed_type_two Left handed type II: A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles: Residue(i+1): -140 degrees > phi > -20 degrees, +80 degrees > psi > +180 degrees. Residue(i+2): +20 degrees > phi > +140 degrees, -40 degrees > psi > +90 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ Right handed type I:A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles: Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. Residue(i+2): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. BS:00216 beta turn right handed type one beta_turn_ir type I beta turn type I turn sequence SO:0001136 beta_turn_right_handed_type_one Right handed type I:A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles: Residue(i+1): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. 
Residue(i+2): -140 degrees < phi < -20 degrees, -90 degrees < psi < +40 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ Right handed type II:A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles: Residue(i+1): -140 degrees < phi < -20 degrees, +80 degrees < psi < +180 degrees. Residue(i+2): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees. BS:00214 beta turn right handed type two beta_turn_iir type II beta turn type II turn sequence SO:0001137 beta_turn_right_handed_type_two Right handed type II:A motif of four consecutive residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth. It is characterized by the dihedral angles: Residue(i+1): -140 degrees < phi < -20 degrees, +80 degrees < psi < +180 degrees. Residue(i+2): +20 degrees < phi < +140 degrees, -40 degrees < psi < +90 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ Gamma turns, defined for 3 residues i,( i+1),( i+2) if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees. BS:00219 gamma turn sequence SO:0001138 gamma_turn Gamma turns, defined for 3 residues i,( i+1),( i+2) if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ Gamma turns, defined for 3 residues i, i+1, i+2 if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees: phi(i+1)=75.0 - psi(i+1)=-64.0. 
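The phi/psi windows given above for the right handed type I and type II beta turns can be checked mechanically. The following is a minimal sketch, not part of the ontology: the function names and the dictionary layout are hypothetical, and only the two right handed types are covered because their ranges are stated unambiguously.

```python
# Phi/psi windows in degrees, copied from the right handed type I and
# type II beta turn definitions. Each entry holds the windows for
# residue i+1 and residue i+2 as ((phi_low, phi_high), (psi_low, psi_high)).
BETA_TURN_WINDOWS = {
    "beta_turn_right_handed_type_one": (
        ((-140, -20), (-90, 40)),   # residue i+1
        ((-140, -20), (-90, 40)),   # residue i+2
    ),
    "beta_turn_right_handed_type_two": (
        ((-140, -20), (80, 180)),   # residue i+1
        ((20, 140), (-40, 90)),     # residue i+2
    ),
}


def in_open_range(value, bounds):
    """True if low < value < high (strict, as written in the definitions)."""
    low, high = bounds
    return low < value < high


def classify_beta_turn(phi1, psi1, phi2, psi2):
    """Return the names of all turn types whose windows contain the angles.

    phi1/psi1 belong to residue i+1, phi2/psi2 to residue i+2.
    """
    matches = []
    for name, ((phi1_r, psi1_r), (phi2_r, psi2_r)) in BETA_TURN_WINDOWS.items():
        if (in_open_range(phi1, phi1_r) and in_open_range(psi1, psi1_r)
                and in_open_range(phi2, phi2_r) and in_open_range(psi2, psi2_r)):
            matches.append(name)
    return matches
```

The function returns a list because the definitions do not say the windows are mutually exclusive. An empty list means the angles fit neither right handed type.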
BS:00220 classic gamma turn gamma turn classic sequence SO:0001139 gamma_turn_classic Gamma turns, defined for 3 residues i, i+1, i+2 if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees: phi(i+1)=75.0 - psi(i+1)=-64.0. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ Gamma turns, defined for 3 residues i, i+1, i+2 if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees: phi(i+1)=-79.0 - psi(i+1)=69.0. BS:00221 gamma turn inverse sequence SO:0001140 gamma_turn_inverse Gamma turns, defined for 3 residues i, i+1, i+2 if a hydrogen bond exists between residues i and i+2 and the phi and psi angles of residue i+1 fall within 40 degrees: phi(i+1)=-79.0 - psi(i+1)=69.0. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of three consecutive residues and one H-bond in which: residue(i) is Serine (S) or Threonine (T), the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2). BS:00231 serine/threonine turn st_turn sequence SO:0001141 serine_threonine_turn A motif of three consecutive residues and one H-bond in which: residue(i) is Serine (S) or Threonine (T), the side-chain O of residue(i) is H-bonded to the main-chain NH of residue(i+2). EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ The peptide twists in an anticlockwise, left handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, -90 degrees psi +120 degrees < +40 degrees, residue(i+1): -140 degrees < phi < -20 degrees, -90 < psi < +40 degrees. BS:00234 st turn left handed type one st_turn_il sequence SO:0001142 st_turn_left_handed_type_one The peptide twists in an anticlockwise, left handed manner. 
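The gamma turn definitions above give reference angles for residue i+1 (classic: phi 75.0, psi -64.0; inverse: phi -79.0, psi 69.0) and a tolerance of 40 degrees. The following is a minimal sketch of that test. It assumes "within 40 degrees" means each angle lies within 40 degrees of its reference value, and it takes the presence of the i to i+2 hydrogen bond as an input flag. All names are hypothetical.

```python
# Reference (phi, psi) of residue i+1 in degrees, from the definitions.
GAMMA_TURN_CENTRES = {
    "gamma_turn_classic": (75.0, -64.0),
    "gamma_turn_inverse": (-79.0, 69.0),
}
TOLERANCE = 40.0  # degrees; interpreting "fall within 40 degrees" is an assumption


def angular_difference(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def classify_gamma_turn(phi, psi, has_hbond=True):
    """Return the matching gamma turn name, or None if nothing matches.

    has_hbond states whether a hydrogen bond exists between residues
    i and i+2, which the definitions require.
    """
    if not has_hbond:
        return None
    for name, (phi0, psi0) in GAMMA_TURN_CENTRES.items():
        if (angular_difference(phi, phi0) <= TOLERANCE
                and angular_difference(psi, psi0) <= TOLERANCE):
            return name
    return None
```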
The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, -90 degrees psi +120 degrees < +40 degrees, residue(i+1): -140 degrees < phi < -20 degrees, -90 < psi < +40 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ The peptide twists in an anticlockwise, left handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, +80 degrees psi +120 degrees < +180 degrees, residue(i+1): +20 degrees < phi < +140 degrees, -40 < psi < +90 degrees. BS:00232 st turn left handed type two st_turn_iil sequence SO:0001143 st_turn_left_handed_type_two The peptide twists in an anticlockwise, left handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, +80 degrees psi +120 degrees < +180 degrees, residue(i+1): +20 degrees < phi < +140 degrees, -40 < psi < +90 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ The peptide twists in a clockwise, right handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, -90 degrees psi +120 degrees < +40 degrees, residue(i+1): -140 degrees < phi < -20 degrees, -90 < psi < +40 degrees. BS:00235 st turn right handed type one st_turn_ir sequence SO:0001144 st_turn_right_handed_type_one The peptide twists in a clockwise, right handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, -90 degrees psi +120 degrees < +40 degrees, residue(i+1): -140 degrees < phi < -20 degrees, -90 < psi < +40 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ The peptide twists in a clockwise, right handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, +80 degrees psi +120 degrees < +180 degrees, residue(i+1): +20 degrees < phi < +140 degrees, -40 < psi < +90 degrees.
BS:00233 st turn right handed type two st_turn_iir sequence SO:0001145 st_turn_right_handed_type_two The peptide twists in a clockwise, right handed manner. The dihedral angles for this turn are: Residue(i): -140 degrees < chi(1) -120 degrees < -20 degrees, +80 degrees psi +120 degrees < +180 degrees, residue(i+1): +20 degrees < phi < +140 degrees, -40 < psi < +90 degrees. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A site of sequence variation (alteration). Alternative sequence due to naturally occurring events such as polymorphisms and alternative splicing or experimental methods such as site directed mutagenesis. BS:00336 sequence_variations sequence SO:0001146 For example, was a substitution natural or mutated as part of an experiment? This term is added to merge the biosapiens term sequence_variations. polypeptide_variation_site A site of sequence variation (alteration). Alternative sequence due to naturally occurring events such as polymorphisms and alternative splicing or experimental methods such as site directed mutagenesis. EBIBS:GAR SO:ke Describes the natural sequence variants due to polymorphisms, disease-associated mutations, RNA editing and variations between strains, isolates or cultivars. BS:00071 natural_variant sequence variation variant sequence SO:0001147 Discrete. natural_variant_site Describes the natural sequence variants due to polymorphisms, disease-associated mutations, RNA editing and variations between strains, isolates or cultivars. EBIBS:GAR UniProt:curation_manual variant uniprot:feature_type Site which has been experimentally altered. BS:00036 mutagen mutagenesis mutated_site sequence SO:0001148 Discrete. mutated_variant_site Site which has been experimentally altered. EBIBS:GAR UniProt:curation_manual mutagen uniprot:feature_type Description of sequence variants produced by alternative splicing, alternative promoter usage, alternative initiation and ribosomal frameshifting.
BS:00073 SO:0001065 alternative_sequence var_seq isoform sequence variation varsplic sequence SO:0001149 Discrete. alternate_sequence_site Description of sequence variants produced by alternative splicing, alternative promoter usage, alternative initiation and ribosomal frameshifting. EBIBS:GAR UniProt:curation_manual var_seq uniprot:feature_type A motif of four consecutive peptide residues of type VIa or type VIb and where the i+2 residue is cis-proline. beta turn type six cis-proline loop type VI beta turn type VI turn sequence SO:0001150 beta_turn_type_six A motif of four consecutive peptide residues of type VIa or type VIb and where the i+2 residue is cis-proline. SO:cb A motif of four consecutive peptide residues, of which the i+2 residue is proline, and that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth and is characterized by the dihedral angles: Residue(i+1): phi ~ -60 degrees, psi ~ 120 degrees. Residue(i+2): phi ~ -90 degrees, psi ~ 0 degrees. beta turn type six a type VIa beta turn type VIa turn sequence SO:0001151 beta_turn_type_six_a A motif of four consecutive peptide residues, of which the i+2 residue is proline, and that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth and is characterized by the dihedral angles: Residue(i+1): phi ~ -60 degrees, psi ~ 120 degrees. Residue(i+2): phi ~ -90 degrees, psi ~ 0 degrees. PMID:2371257 SO:cb A type VIa beta turn with the following phi and psi angles on amino acid residues 2 and 3: phi-2 = -60 degrees, psi-2 = 120 degrees, phi-3 = -90 degrees, psi-3 = 0 degrees. beta turn type six a one type VIa1 beta turn type VIa1 turn sequence SO:0001152 beta_turn_type_six_a_one A type VIa beta turn with the following phi and psi angles on amino acid residues 2 and 3: phi-2 = -60 degrees, psi-2 = 120 degrees, phi-3 = -90 degrees, psi-3 = 0 degrees.
PMID:27428516 A type VIa beta turn with the following phi and psi angles on amino acid residues 2 and 3: phi-2 = -120 degrees, psi-2 = 120 degrees, phi-3 = -60 degrees, psi-3 = 0 degrees. beta turn type six a two type VIa2 beta turn type VIa2 turn sequence SO:0001153 beta_turn_type_six_a_two A type VIa beta turn with the following phi and psi angles on amino acid residues 2 and 3: phi-2 = -120 degrees, psi-2 = 120 degrees, phi-3 = -60 degrees, psi-3 = 0 degrees. PMID:27428516 A motif of four consecutive peptide residues, of which the i+2 residue is proline, and that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth and is characterized by the dihedral angles: Residue(i+1): phi ~ -120 degrees, psi ~ 120 degrees. Residue(i+2): phi ~ -60 degrees, psi ~ 0 degrees. beta turn type six b type VIb beta turn type VIb turn sequence SO:0001154 beta_turn_type_six_b A motif of four consecutive peptide residues, of which the i+2 residue is proline, and that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth and is characterized by the dihedral angles: Residue(i+1): phi ~ -120 degrees, psi ~ 120 degrees. Residue(i+2): phi ~ -60 degrees, psi ~ 0 degrees. PMID:2371257 SO:cb A motif of four consecutive peptide residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth and is characterized by the dihedral angles: Residue(i+1): phi ~ -60 degrees, psi ~ -30 degrees. Residue(i+2): phi ~ -120 degrees, psi ~ 120 degrees.
beta turn type eight type VIII beta turn type VIII turn sequence SO:0001155 beta_turn_type_eight A motif of four consecutive peptide residues that may contain one H-bond, which, if present, is between the main-chain CO of the first residue and the main-chain NH of the fourth and is characterized by the dihedral angles: Residue(i+1): phi ~ -60 degrees, psi ~ -30 degrees. Residue(i+2): phi ~ -120 degrees, psi ~ 120 degrees. PMID:2371257 SO:cb A sequence element characteristic of some RNA polymerase II promoters, usually located between -10 and -60 relative to the TSS. Consensus sequence is WATCGATW. DRE motif NDM4 WATCGATW_motif sequence SO:0001156 This consensus sequence was identified computationally using the MEME algorithm within core promoter sequences from -60 to +40, with an E value of 1.7e-183. Tends to co-occur with Motif 7. Tends to not occur with DPE motif (SO:0000015) or motif 10. DRE_motif A sequence element characteristic of some RNA polymerase II promoters, usually located between -10 and -60 relative to the TSS. Consensus sequence is WATCGATW. PMID:12537576 A sequence element characteristic of some RNA polymerase II promoters, located immediately upstream of some TATA box elements with respect to the TSS (+1). Consensus sequence is YGGTCACACTR. Marked spatial preference within core promoter; tend to occur near the TSS, although not as tightly as INR (SO:0000014). DMv4 DMv4 motif directional motif v4 motif 1 element promoter motif 1 YGGTCACATR sequence SO:0001157 DMv4_motif A sequence element characteristic of some RNA polymerase II promoters, located immediately upstream of some TATA box elements with respect to the TSS (+1). Consensus sequence is YGGTCACACTR. Marked spatial preference within core promoter; tend to occur near the TSS, although not as tightly as INR (SO:0000014). PMID:16827941:12537576 A sequence element characteristic of some RNA polymerase II promoters, usually located between -60 and +1 relative to the TSS. 
Consensus sequence is AWCAGCTGWT. Tends to co-occur with DMv2 (SO:0001161). Tends to not occur with DPE motif (SO:0000015). E box motif generic E box motif AWCAGCTGWT sequence NDM5 SO:0001158 E_box_motif A sequence element characteristic of some RNA polymerase II promoters, usually located between -60 and +1 relative to the TSS. Consensus sequence is AWCAGCTGWT. Tends to co-occur with DMv2 (SO:0001161). Tends to not occur with DPE motif (SO:0000015). PMID:12537576:16827941 A sequence element characteristic of some RNA polymerase II promoters, usually located between -50 and -10 relative to the TSS. Consensus sequence is KTYRGTATWTTT. Tends to co-occur with DMv4 (SO:0001157). Tends to not occur with DPE motif (SO:0000015) or MTE (SO:0001162). DMv5 DMv5 motif directional motif v5 KTYRGTATWTTT sequence promoter motif 6 SO:0001159 DMv5_motif A sequence element characteristic of some RNA polymerase II promoters, usually located between -50 and -10 relative to the TSS. Consensus sequence is KTYRGTATWTTT. Tends to co-occur with DMv4 (SO:0001157). Tends to not occur with DPE motif (SO:0000015) or MTE (SO:0001162). PMID:12537576:16827941 A sequence element characteristic of some RNA polymerase II promoters, usually located between -30 and +15 relative to the TSS. Consensus sequence is KNNCAKCNCTRNY. Tends to co-occur with DMv2 (SO:0001161). Tends to not occur with DPE motif (SO:0000015) or MTE (SO:0001162). DMv3 DMv3 motif directional motif v3 promoter motif 7 KNNCAKCNCTRNY sequence SO:0001160 DMv3_motif A sequence element characteristic of some RNA polymerase II promoters, usually located between -30 and +15 relative to the TSS. Consensus sequence is KNNCAKCNCTRNY. Tends to co-occur with DMv2 (SO:0001161). Tends to not occur with DPE motif (SO:0000015) or MTE (SO:0001162). PMID:12537576:16827941 A sequence element characteristic of some RNA polymerase II promoters, usually located between -60 and -45 relative to the TSS. Consensus sequence is MKSYGGCARCGSYSS.
Tends to co-occur with DMv3 (SO:0001160). Tends to not occur with DPE motif (SO:0000015) or MTE (SO:0001162). DMv2 DMv2 motif directional motif v2 promoter motif 8 MKSYGGCARCGSYSS sequence SO:0001161 DMv2_motif A sequence element characteristic of some RNA polymerase II promoters, usually located between -60 and -45 relative to the TSS. Consensus sequence is MKSYGGCARCGSYSS. Tends to co-occur with DMv3 (SO:0001160). Tends to not occur with DPE motif (SO:0000015) or MTE (SO:0001162). PMID:12537576:16827941 A sequence element characteristic of some RNA polymerase II promoters, usually located between +20 and +30 relative to the TSS. Consensus sequence is CSARCSSAACGS. Tends to co-occur with INR motif (SO:0000014). Tends to not occur with DPE motif (SO:0000015) or DMv5 (SO:0001159). motif ten element motif_ten_element CSARCSSAACGS sequence SO:0001162 MTE A sequence element characteristic of some RNA polymerase II promoters, usually located between +20 and +30 relative to the TSS. Consensus sequence is CSARCSSAACGS. Tends to co-occur with INR motif (SO:0000014). Tends to not occur with DPE motif (SO:0000015) or DMv5 (SO:0001159). PMID:12537576:15231738 PMID:16858867 A promoter motif with consensus sequence TCATTCG. DMp3 INR1 motif directional motif p3 directional promoter motif 3 sequence SO:0001163 INR1_motif A promoter motif with consensus sequence TCATTCG. PMID:16827941 A promoter motif with consensus sequence CGGACGT. DMp5 DPE1 motif directional motif 5 sequence directional promoter motif 5 SO:0001164 DPE1_motif A promoter motif with consensus sequence CGGACGT. PMID:16827941 A promoter motif with consensus sequence CARCCCT. DMv1 motif sequence DMv1 directional promoter motif v1 SO:0001165 DMv1_motif A promoter motif with consensus sequence CARCCCT. PMID:16827941 A non directional promoter motif with consensus sequence GAGAGCG. GAGA GAGA motif NDM1 sequence SO:0001166 GAGA_motif A non directional promoter motif with consensus sequence GAGAGCG. 
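The consensus sequences in these promoter motif definitions (for example WATCGATW, CSARCSSAACGS, CARCCCT) use IUPAC nucleotide ambiguity codes. The following is a minimal sketch of how such a consensus could be expanded into a regular expression and searched for in a DNA string. The function names are hypothetical, only the given strand is scanned, and exact consensus matching is an assumption; the cited papers may have used position weight matrices instead.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Expand an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus.upper())


def find_motif(consensus, sequence):
    """Return 0-based start positions of every (possibly overlapping) match."""
    pattern = consensus_to_regex(consensus)
    # A lookahead is used so that overlapping occurrences are all reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence.upper())]
```

For instance, `find_motif("WATCGATW", "GGAATCGATTCC")` reports one match starting at position 2. Positions relative to the TSS, as used in the definitions, would need a separate offset that this sketch does not model.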
PMID:16827941 A non directional promoter motif with consensus CGMYGYCR. NDM2 NDM2 motif non directional promoter motif 2 sequence SO:0001167 NDM2_motif A non directional promoter motif with consensus CGMYGYCR. PMID:16827941 A non directional promoter motif with consensus sequence GAAAGCT. NDM3 NDM3 motif non directional motif 3 sequence SO:0001168 NDM3_motif A non directional promoter motif with consensus sequence GAAAGCT. PMID:16827941 A ds_RNA_viral_sequence is a viral_sequence that is the sequence of a virus that exists as double stranded RNA. double stranded RNA virus sequence ds RNA viral sequence sequence SO:0001169 ds_RNA_viral_sequence A ds_RNA_viral_sequence is a viral_sequence that is the sequence of a virus that exists as double stranded RNA. SO:ke A kind of DNA transposon that populates the genomes of protists, fungi, and animals, characterized by a unique set of proteins necessary for their transposition, including a protein-primed DNA polymerase B, retroviral integrase, cysteine protease, and ATPase. Polintons are characterized by 6-bp target site duplications, terminal-inverted repeats that are several hundred nucleotides long, and 5'-AG and TC-3' termini. Polintons exist as autonomous and nonautonomous elements. sequence maverick element SO:0001170 polinton A kind of DNA transposon that populates the genomes of protists, fungi, and animals, characterized by a unique set of proteins necessary for their transposition, including a protein-primed DNA polymerase B, retroviral integrase, cysteine protease, and ATPase. Polintons are characterized by 6-bp target site duplications, terminal-inverted repeats that are several hundred nucleotides long, and 5'-AG and TC-3' termini. Polintons exist as autonomous and nonautonomous elements. PMID:16537396 A component of the large ribosomal subunit in mitochondrial rRNA. 
SO:0002345 21S LSU rRNA 21S rRNA 21S ribosomal RNA rRNA 21S sequence SO:0001171 This term has been merged into mt_LSU_rRNA (SO:0002345) as part of reorganization of rRNA child terms 10 June 2021. Requested by EBI. See GitHub Issue #493. rRNA_21S true A component of the large ribosomal subunit in mitochondrial rRNA. RSC:cb A region of a tRNA. tRNA region sequence SO:0001172 tRNA_region A region of a tRNA. RSC:cb A sequence of seven nucleotide bases in tRNA which contains the anticodon. It has the sequence 5'-pyrimidine-purine-anticodon-modified purine-any base-3'. anti-codon loop anticodon loop sequence SO:0001173 anticodon_loop A sequence of seven nucleotide bases in tRNA which contains the anticodon. It has the sequence 5'-pyrimidine-purine-anticodon-modified purine-any base-3'. ISBN:0716719207 A sequence of three nucleotide bases in tRNA which recognizes a codon in mRNA. http://en.wikipedia.org/wiki/Anticodon anti-codon sequence SO:0001174 anticodon A sequence of three nucleotide bases in tRNA which recognizes a codon in mRNA. RSC:cb http://en.wikipedia.org/wiki/Anticodon wiki Base sequence at the 3' end of a tRNA. The 3'-hydroxyl group on the terminal adenosine is the attachment point for the amino acid. CCA sequence CCA tail sequence SO:0001175 CCA_tail Base sequence at the 3' end of a tRNA. The 3'-hydroxyl group on the terminal adenosine is the attachment point for the amino acid. ISBN:0716719207 Non-base-paired sequence of nucleotide bases in tRNA. It contains several dihydrouracil residues. DHU loop sequence D loop SO:0001176 DHU_loop Non-base-paired sequence of nucleotide bases in tRNA. It contains several dihydrouracil residues. ISBN:071671920 Non-base-paired sequence of three nucleotide bases in tRNA. It has sequence T-Psi-C. T loop TpsiC loop sequence SO:0001177 T_loop Non-base-paired sequence of three nucleotide bases in tRNA. It has sequence T-Psi-C. ISBN:0716719207 A primary transcript encoding pyrrolysyl tRNA (SO:0000766).
pyrrolysine tRNA primary transcript sequence SO:0001178 pyrrolysine_tRNA_primary_transcript A primary transcript encoding pyrrolysyl tRNA (SO:0000766). RSC:cb U3 snoRNA is a member of the box C/D class of small nucleolar RNAs. The U3 snoRNA secondary structure is characterised by a small 5' domain (with boxes A and A'), and a larger 3' domain (with boxes B, C, C', and D), the two domains being linked by a single-stranded hinge. Boxes B and C form the B/C motif, which appears to be exclusive to U3 snoRNAs, and boxes C' and D form the C'/D motif. The latter is functionally similar to the C/D motifs found in other snoRNAs. The 5' domain and the hinge region act as a pre-rRNA-binding domain. The 3' domain has conserved protein-binding sites. Both the box B/C and box C'/D motifs are sufficient for nuclear retention of U3 snoRNA. The box C'/D motif is also necessary for nucleolar localization, stability and hypermethylation of U3 snoRNA. Both box B/C and C'/D motifs are involved in specific protein interactions and are necessary for the rRNA processing functions of U3 snoRNA. http://en.wikipedia.org/wiki/Small_nucleolar_RNA_U3 U3 small nucleolar RNA U3 snoRNA small nucleolar RNA U3 snoRNA U3 sequence SO:0001179 The definition is most of the old definition for snoRNA (SO:0000275). U3_snoRNA U3 snoRNA is a member of the box C/D class of small nucleolar RNAs. The U3 snoRNA secondary structure is characterised by a small 5' domain (with boxes A and A'), and a larger 3' domain (with boxes B, C, C', and D), the two domains being linked by a single-stranded hinge. Boxes B and C form the B/C motif, which appears to be exclusive to U3 snoRNAs, and boxes C' and D form the C'/D motif. The latter is functionally similar to the C/D motifs found in other snoRNAs. The 5' domain and the hinge region act as a pre-rRNA-binding domain. The 3' domain has conserved protein-binding sites. Both the box B/C and box C'/D motifs are sufficient for nuclear retention of U3 snoRNA. 
The box C'/D motif is also necessary for nucleolar localization, stability and hypermethylation of U3 snoRNA. Both box B/C and C'/D motifs are involved in specific protein interactions and are necessary for the rRNA processing functions of U3 snoRNA. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00012 http://en.wikipedia.org/wiki/Small_nucleolar_RNA_U3 wiki A cis-acting element found in the 3' UTR of some mRNA which is rich in AUUUA pentamers. Messenger RNAs bearing multiple AU-rich elements are often unstable. http://en.wikipedia.org/wiki/AU-rich_element AU rich element AU-rich element sequence ARE SO:0001180 AU_rich_element A cis-acting element found in the 3' UTR of some mRNA which is rich in AUUUA pentamers. Messenger RNAs bearing multiple AU-rich elements are often unstable. PMID:7892223 http://en.wikipedia.org/wiki/AU-rich_element wiki A cis-acting element found in the 3' UTR of some mRNA which is bound by the Drosophila Bruno protein and its homologs. Bruno response element sequence BRE SO:0001181 Not to be confused with BRE_motif (SO:0000016), which binds transcription factor II B. Bruno_response_element A cis-acting element found in the 3' UTR of some mRNA which is bound by the Drosophila Bruno protein and its homologs. PMID:10893231 A regulatory sequence found in the 5' and 3' UTRs of many mRNAs which encode iron-binding proteins. It has a hairpin structure and is recognized by trans-acting proteins known as iron-regulatory proteins. http://en.wikipedia.org/wiki/Iron_responsive_element IRE iron responsive element sequence SO:0001182 iron_responsive_element A regulatory sequence found in the 5' and 3' UTRs of many mRNAs which encode iron-binding proteins. It has a hairpin structure and is recognized by trans-acting proteins known as iron-regulatory proteins. PMID:3198610 PMID:8710843 http://en.wikipedia.org/wiki/Iron_responsive_element wiki An attribute describing a sequence composed of nucleobases bound to a morpholino backbone. 
A morpholino backbone consists of morpholine (CHEBI:34856) rings connected by phosphorodiamidate linkages. http://en.wikipedia.org/wiki/Morpholino morpholino backbone sequence SO:0001183 Do not use this for feature annotation. Use morpholino_oligo (SO:0000034) instead. morpholino_backbone An attribute describing a sequence composed of nucleobases bound to a morpholino backbone. A morpholino backbone consists of morpholine (CHEBI:34856) rings connected by phosphorodiamidate linkages. RSC:cb http://en.wikipedia.org/wiki/Morpholino wiki An attribute describing a sequence composed of peptide nucleic acid (CHEBI:48021), a chemical consisting of nucleobases bound to a backbone composed of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. The purine and pyrimidine bases are linked to the backbone by methylene carbonyl bonds. sequence peptide nucleic acid SO:0001184 Do not use this term for feature annotation. Use PNA_oligo (SO:0001011) instead. PNA An attribute describing a sequence composed of peptide nucleic acid (CHEBI:48021), a chemical consisting of nucleobases bound to a backbone composed of repeating N-(2-aminoethyl)-glycine units linked by peptide bonds. The purine and pyrimidine bases are linked to the backbone by methylene carbonyl bonds. RSC:cb An attribute describing the sequence of a transcript that has catalytic activity with or without an associated ribonucleoprotein. sequence SO:0001185 Do not use this for feature annotation. Use enzymatic_RNA (SO:0000372) instead. enzymatic An attribute describing the sequence of a transcript that has catalytic activity with or without an associated ribonucleoprotein. RSC:cb An attribute describing the sequence of a transcript that has catalytic activity even without an associated ribonucleoprotein. sequence SO:0001186 Do not use this for feature annotation. Use ribozyme (SO:0000374) instead. 
ribozymic An attribute describing the sequence of a transcript that has catalytic activity even without an associated ribonucleoprotein. RSC:cb A snoRNA that specifies the site of pseudouridylation in an RNA molecule by base pairing with a short sequence around the target residue. pseudouridylation guide snoRNA sequence SO:0001187 Has RNA pseudouridylation guide activity (GO:0030558). pseudouridylation_guide_snoRNA A snoRNA that specifies the site of pseudouridylation in an RNA molecule by base pairing with a short sequence around the target residue. GOC:mah PMID:12457565 An attribute describing a sequence consisting of nucleobases attached to a repeating unit made of 'locked' deoxyribose rings connected to a phosphate backbone. The deoxyribose unit's conformation is 'locked' by a 2'-C,4'-C-oxymethylene link. sequence SO:0001188 Do not use this term for feature annotation. Use LNA_oligo (SO:0001189) instead. LNA An attribute describing a sequence consisting of nucleobases attached to a repeating unit made of 'locked' deoxyribose rings connected to a phosphate backbone. The deoxyribose unit's conformation is 'locked' by a 2'-C,4'-C-oxymethylene link. CHEBI:48010 An oligo composed of LNA residues. http://en.wikipedia.org/wiki/Locked_nucleic_acid LNA oligo locked nucleic acid sequence SO:0001189 LNA_oligo An oligo composed of LNA residues. RSC:cb http://en.wikipedia.org/wiki/Locked_nucleic_acid wiki An attribute describing a sequence consisting of nucleobases attached to a repeating unit made of threose rings connected to a phosphate backbone. sequence SO:0001190 Do not use this term for feature annotation. Use TNA_oligo (SO:0001191) instead. TNA An attribute describing a sequence consisting of nucleobases attached to a repeating unit made of threose rings connected to a phosphate backbone. CHEBI:48019 An oligo composed of TNA residues. 
http://en.wikipedia.org/wiki/Threose_nucleic_acid TNA oligo threose nucleic acid sequence SO:0001191 TNA_oligo An oligo composed of TNA residues. RSC:cb http://en.wikipedia.org/wiki/Threose_nucleic_acid wiki An attribute describing a sequence consisting of nucleobases attached to a repeating unit made of an acyclic three-carbon propylene glycol connected to a phosphate backbone. It has two enantiomeric forms, (R)-GNA and (S)-GNA. sequence SO:0001192 Do not use this term for feature annotation. Use GNA_oligo (SO:0001193) instead. GNA An attribute describing a sequence consisting of nucleobases attached to a repeating unit made of an acyclic three-carbon propylene glycol connected to a phosphate backbone. It has two enantiomeric forms, (R)-GNA and (S)-GNA. CHEBI:48015 An oligo composed of GNA residues. http://en.wikipedia.org/wiki/Glycerol_nucleic_acid GNA oligo glycerol nucleic acid glycol nucleic acid sequence SO:0001193 GNA_oligo An oligo composed of GNA residues. RSC:cb http://en.wikipedia.org/wiki/Glycerol_nucleic_acid wiki An attribute describing a GNA sequence in the (R)-GNA enantiomer. R GNA sequence SO:0001194 Do not use this term for feature annotation. Use R_GNA_oligo (SO:0001195) instead. R_GNA An attribute describing a GNA sequence in the (R)-GNA enantiomer. CHEBI:48016 An oligo composed of (R)-GNA residues. (R)-glycerol nucleic acid (R)-glycol nucleic acid R GNA oligo sequence SO:0001195 R_GNA_oligo An oligo composed of (R)-GNA residues. RSC:cb An attribute describing a GNA sequence in the (S)-GNA enantiomer. S GNA sequence SO:0001196 Do not use this term for feature annotation. Use S_GNA_oligo (SO:0001197) instead. S_GNA An attribute describing a GNA sequence in the (S)-GNA enantiomer. CHEBI:48017 An oligo composed of (S)-GNA residues. (S)-glycerol nucleic acid (S)-glycol nucleic acid S GNA oligo sequence SO:0001197 S_GNA_oligo An oligo composed of (S)-GNA residues.
RSC:cb A ds_DNA_viral_sequence is a viral_sequence that is the sequence of a virus that exists as double stranded DNA. double stranded DNA virus ds DNA viral sequence sequence SO:0001198 ds_DNA_viral_sequence A ds_DNA_viral_sequence is a viral_sequence that is the sequence of a virus that exists as double stranded DNA. SO:ke A ss_RNA_viral_sequence is a viral_sequence that is the sequence of a virus that exists as single stranded RNA. single strand RNA virus ss RNA viral sequence sequence SO:0001199 ss_RNA_viral_sequence A ss_RNA_viral_sequence is a viral_sequence that is the sequence of a virus that exists as single stranded RNA. SO:ke A negative_sense_RNA_viral_sequence is a ss_RNA_viral_sequence that is the sequence of a single stranded RNA virus that is complementary to mRNA and must be converted to positive sense RNA by RNA polymerase before translation. negative sense ssRNA viral sequence sequence negative sense single stranded RNA virus SO:0001200 negative_sense_ssRNA_viral_sequence A negative_sense_RNA_viral_sequence is a ss_RNA_viral_sequence that is the sequence of a single stranded RNA virus that is complementary to mRNA and must be converted to positive sense RNA by RNA polymerase before translation. SO:ke A positive_sense_RNA_viral_sequence is a ss_RNA_viral_sequence that is the sequence of a single stranded RNA virus that can be immediately translated by the host. positive sense ssRNA viral sequence sequence positive sense single stranded RNA virus SO:0001201 positive_sense_ssRNA_viral_sequence A positive_sense_RNA_viral_sequence is a ss_RNA_viral_sequence that is the sequence of a single stranded RNA virus that can be immediately translated by the host. SO:ke An ambisense_RNA_virus is a ss_RNA_viral_sequence that is the sequence of a single stranded RNA virus with both messenger and anti messenger polarity.
ambisense single stranded RNA virus ambisense ssRNA viral sequence sequence SO:0001202 ambisense_ssRNA_viral_sequence An ambisense_RNA_virus is a ss_RNA_viral_sequence that is the sequence of a single stranded RNA virus with both messenger and anti messenger polarity. SO:ke A region (DNA) to which RNA polymerase binds, to begin transcription. SO:0000167 RNA polymerase promoter sequence SO:0001203 Term merged with promoter SO:0000167 in August 2020 as part of GREEKC initiative. See GitHub Issue 492 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/492) RNA_polymerase_promoter true A region (DNA) to which RNA polymerase binds, to begin transcription. xenbase:jb A region (DNA) to which Bacteriophage RNA polymerase binds, to begin transcription. Phage RNA Polymerase Promoter sequence SO:0001204 former parent RNA_polymerase_promoter SO:0001203 was merged with promoter SO:0000167 in Aug 2020 as part of GREEKC. Phage_RNA_Polymerase_Promoter A region (DNA) to which Bacteriophage RNA polymerase binds, to begin transcription. xenbase:jb A region (DNA) to which the SP6 RNA polymerase binds, to begin transcription. SP6 RNA Polymerase Promoter sequence SO:0001205 SP6_RNA_Polymerase_Promoter A region (DNA) to which the SP6 RNA polymerase binds, to begin transcription. xenbase:jb A DNA sequence to which the T3 RNA polymerase binds, to begin transcription. T3 RNA Polymerase Promoter sequence SO:0001206 T3_RNA_Polymerase_Promoter A DNA sequence to which the T3 RNA polymerase binds, to begin transcription. xenbase:jb A region (DNA) to which the T7 RNA polymerase binds, to begin transcription. T7 RNA Polymerase Promoter sequence SO:0001207 T7_RNA_Polymerase_Promoter A region (DNA) to which the T7 RNA polymerase binds, to begin transcription. xenbase:jb An EST read from the 5' end of a transcript that usually codes for a protein. These regions tend to be conserved across species and do not change much within a gene family.
5' EST five prime EST sequence SO:0001208 five_prime_EST An EST read from the 5' end of a transcript that usually codes for a protein. These regions tend to be conserved across species and do not change much within a gene family. http://www.ncbi.nlm.nih.gov/About/primer/est.html An EST read from the 3' end of a transcript. They are more likely to fall within non-coding, or untranslated regions (UTRs). 3' EST three prime EST sequence SO:0001209 three_prime_EST An EST read from the 3' end of a transcript. They are more likely to fall within non-coding, or untranslated regions (UTRs). http://www.ncbi.nlm.nih.gov/About/primer/est.html The region of mRNA (not divisible by 3 bases) that is skipped or added during the process of translational frameshifting (GO:0006452), causing the reading frame to be different. http://en.wikipedia.org/wiki/Translational_frameshift INSDC_qualifier:ribosomal_slippage ribosomal frameshift ribosomal slippage translational frameshift sequence SO:0001210 Added synonym 'ribosomal_slippage' on Feb 1, 2021, a term in INSDC and GenBank. See GitHub Issue #522. translational_frameshift The region of mRNA (not divisible by 3 bases) that is skipped or added during the process of translational frameshifting (GO:0006452), causing the reading frame to be different. SO:ke http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Translational_frameshift wiki The region of mRNA 1 base long that is skipped during the process of translational frameshifting (GO:0006452), causing the reading frame to be different. plus 1 ribosomal frameshift plus 1 translational frameshift sequence SO:0001211 plus_1_translational_frameshift The region of mRNA 1 base long that is skipped during the process of translational frameshifting (GO:0006452), causing the reading frame to be different. SO:ke The region of mRNA 2 bases long that is skipped during the process of translational frameshifting (GO:0006452), causing the reading frame to be different.
plus 2 ribosomal frameshift plus 2 translational frameshift sequence SO:0001212 plus_2_translational_frameshift The region of mRNA 2 bases long that is skipped during the process of translational frameshifting (GO:0006452), causing the reading frame to be different. SO:ke Group III introns are introns found in the mRNA of the plastids of euglenoid protists. They are spliced by a two step transesterification with bulged adenosine as initiating nucleophile. http://en.wikipedia.org/wiki/Group_III_intron group III intron sequence SO:0001213 GO:0000374. group_III_intron Group III introns are introns found in the mRNA of the plastids of euglenoid protists. They are spliced by a two step transesterification with bulged adenosine as initiating nucleophile. PMID:11377794 http://en.wikipedia.org/wiki/Group_III_intron wiki The maximal intersection of exon and UTR. noncoding region of exon sequence SO:0001214 An exon either containing but not starting with a start codon or containing but not ending with a stop codon will be partially coding and partially non coding. noncoding_region_of_exon The maximal intersection of exon and UTR. SO:ke The region of an exon that encodes for protein sequence. coding region of exon sequence SO:0001215 An exon containing either a start or stop codon will be partially coding and partially non coding. coding_region_of_exon The region of an exon that encodes for protein sequence. SO:ke An intron that is spliced via endonucleolytic cleavage and ligation rather than transesterification. endonuclease spliced intron sequence SO:0001216 endonuclease_spliced_intron An intron that is spliced via endonucleolytic cleavage and ligation rather than transesterification. SO:ke A gene that codes for an RNA that can be translated into a protein. protein coding gene sequence SO:0001217 protein_coding_gene An insertion that derives from another organism, via the use of recombinant DNA technology.
transgenic insertion sequence SO:0001218 transgenic_insertion An insertion that derives from another organism, via the use of recombinant DNA technology. SO:bm A gene that has been produced as the product of a reverse transcriptase mediated event. sequence SO:0001219 retrogene An attribute describing an epigenetic process where a gene is inactivated by RNA interference. silenced by RNA interference sequence SO:0001220 RNA interference is GO:0016246. silenced_by_RNA_interference An attribute describing an epigenetic process where a gene is inactivated by RNA interference. RSC:cb An attribute describing an epigenetic process where a gene is inactivated by histone modification. silenced by histone modification sequence SO:0001221 Histone modification is GO:0016570. silenced_by_histone_modification An attribute describing an epigenetic process where a gene is inactivated by histone modification. RSC:cb An attribute describing an epigenetic process where a gene is inactivated by histone methylation. silenced by histone methylation sequence SO:0001222 Histone methylation is GO:0016571. silenced_by_histone_methylation An attribute describing an epigenetic process where a gene is inactivated by histone methylation. RSC:cb An attribute describing an epigenetic process where a gene is inactivated by histone deacetylation. silenced by histone deacetylation sequence SO:0001223 Histone deacetylation is GO:0016573. silenced_by_histone_deacetylation An attribute describing an epigenetic process where a gene is inactivated by histone deacetylation. RSC:cb A gene that is silenced by RNA interference. RNA interference silenced gene RNAi silenced gene gene silenced by RNA interference sequence SO:0001224 gene_silenced_by_RNA_interference A gene that is silenced by RNA interference. SO:xp A gene that is silenced by histone modification. gene silenced by histone modification sequence SO:0001225 gene_silenced_by_histone_modification A gene that is silenced by histone modification. 
SO:xp A gene that is silenced by histone methylation. gene silenced by histone methylation sequence SO:0001226 gene_silenced_by_histone_methylation A gene that is silenced by histone methylation. SO:xp A gene that is silenced by histone deacetylation. gene silenced by histone deacetylation sequence SO:0001227 gene_silenced_by_histone_deacetylation A gene that is silenced by histone deacetylation. SO:xp A modified RNA base in which the 5,6-dihydrouracil is bound to the ribose ring. RNAMOD:051 http://en.wikipedia.org/wiki/Dihydrouridine D sequence SO:0001228 dihydrouridine A modified RNA base in which the 5,6-dihydrouracil is bound to the ribose ring. RSC:cb http://en.wikipedia.org/wiki/Dihydrouridine wiki D A modified RNA base in which the 5- position of the uracil is bound to the ribose ring instead of the 4- position. RNAMOD:050 http://en.wikipedia.org/wiki/Pseudouridine Y sequence SO:0001229 The free molecule is CHEBI:17802. pseudouridine A modified RNA base in which the 5- position of the uracil is bound to the ribose ring instead of the 4- position. RSC:cb http://en.wikipedia.org/wiki/Pseudouridine wiki Y A modified RNA base in which hypoxanthine is bound to the ribose ring. http://en.wikipedia.org/wiki/Inosine sequence I RNAMOD:017 SO:0001230 The free molecule is CHEBI:17596. inosine A modified RNA base in which hypoxanthine is bound to the ribose ring. RSC:cb http://library.med.utah.edu/RNAmods/ http://en.wikipedia.org/wiki/Inosine wiki A modified RNA base in which guanine is methylated at the 7- position. 7-methylguanine seven methylguanine sequence SO:0001231 The free molecule is CHEBI:2274. seven_methylguanine A modified RNA base in which guanine is methylated at the 7- position. RSC:cb A modified RNA base in which thymine is bound to the ribose ring. sequence SO:0001232 The free molecule is CHEBI:30832. ribothymidine A modified RNA base in which thymine is bound to the ribose ring. 
RSC:cb A modified RNA base in which methylhypoxanthine is bound to the ribose ring. sequence SO:0001233 methylinosine A modified RNA base in which methylhypoxanthine is bound to the ribose ring. RSC:cb An attribute describing a feature that has either intra-genome or intracellular mobility. http://en.wikipedia.org/wiki/Mobile sequence SO:0001234 mobile An attribute describing a feature that has either intra-genome or intracellular mobility. RSC:cb http://en.wikipedia.org/wiki/Mobile wiki A region containing at least one unique origin of replication and a unique termination site. http://en.wikipedia.org/wiki/Replicon_(genetics) sequence SO:0001235 replicon A region containing at least one unique origin of replication and a unique termination site. ISBN:0716719207 http://en.wikipedia.org/wiki/Replicon_(genetics) wiki A base is a sequence feature that corresponds to a single unit of a nucleotide polymer. http://en.wikipedia.org/wiki/Nucleobase sequence SO:0001236 base A base is a sequence feature that corresponds to a single unit of a nucleotide polymer. SO:ke http://en.wikipedia.org/wiki/Nucleobase wiki A sequence feature that corresponds to a single amino acid residue in a polypeptide. http://en.wikipedia.org/wiki/Amino_acid amino acid sequence SO:0001237 Probably in the future this will cross reference to Chebi. amino_acid A sequence feature that corresponds to a single amino acid residue in a polypeptide. RSC:cb http://en.wikipedia.org/wiki/Amino_acid wiki The transcription start site that is most frequently used for transcription of a gene. major TSS major transcription start site sequence SO:0001238 major_TSS A transcription start site that is not the most frequently used for transcription of a gene. minor TSS sequence SO:0001239 minor_TSS The region of a gene from the 5' most TSS to the 3' TSS. SO:0000167 TSS region sequence SO:0001240 Merged into promoter (SO:0000167) on 11 Feb 2021 by Dave Sant. GREEKC had asked us to merge these terms to reduce redundancy.
See GitHub Issue #528 TSS_region true The region of a gene from the 5' most TSS to the 3' TSS. BBOP:nw A gene that has multiple possible transcription start sites. encodes alternate transcription start sites sequence SO:0001241 encodes_alternate_transcription_start_sites true A part of an miRNA primary_transcript. miRNA primary transcript region sequence SO:0001243 miRNA_primary_transcript_region A part of an miRNA primary_transcript. SO:ke The 60-70 nucleotide region remaining after Drosha processing of the primary transcript, that folds back upon itself to form a hairpin structure. pre-miRNA sequence SO:0001244 pre_miRNA The 60-70 nucleotide region remaining after Drosha processing of the primary transcript, that folds back upon itself to form a hairpin structure. SO:ke The stem of the hairpin loop formed by folding of the pre-miRNA. miRNA stem sequence SO:0001245 miRNA_stem The stem of the hairpin loop formed by folding of the pre-miRNA. SO:ke The loop of the hairpin loop formed by folding of the pre-miRNA. miRNA loop sequence SO:0001246 miRNA_loop The loop of the hairpin loop formed by folding of the pre-miRNA. SO:ke An oligo composed of synthetic nucleotides. synthetic oligo sequence SO:0001247 synthetic_oligo An oligo composed of synthetic nucleotides. SO:ke A region of the genome of known length that is composed by ordering and aligning two or more different regions. http://en.wikipedia.org/wiki/Genome_assembly#Genome_assembly sequence SO:0001248 assembly A region of the genome of known length that is composed by ordering and aligning two or more different regions. SO:ke http://en.wikipedia.org/wiki/Genome_assembly#Genome_assembly wiki A fragment assembly is a genome assembly that orders overlapping fragments of the genome based on landmark sequences. The base pair distance between the landmarks is known allowing additivity of lengths.
fragment assembly physical map sequence SO:0001249 fragment_assembly A fragment assembly is a genome assembly that orders overlapping fragments of the genome based on landmark sequences. The base pair distance between the landmarks is known allowing additivity of lengths. SO:ke A fingerprint_map is a physical map composed of restriction fragments. BACmap FPC FPCmap fingerprint map restriction map sequence SO:0001250 fingerprint_map A fingerprint_map is a physical map composed of restriction fragments. SO:ke An STS map is a physical map organized by the unique STS landmarks. STS map sequence SO:0001251 STS_map An STS map is a physical map organized by the unique STS landmarks. SO:ke A radiation hybrid map is a physical map. RH map radiation hybrid map sequence SO:0001252 RH_map A radiation hybrid map is a physical map. SO:ke A DNA fragment generated by sonication. Sonication is a technique used to shear DNA into smaller fragments. sonicate fragment sequence SO:0001253 sonicate_fragment A DNA fragment generated by sonication. Sonication is a technique used to shear DNA into smaller fragments. SO:ke A kind of chromosome variation where the chromosome complement is an exact multiple of the haploid number and is greater than the diploid number. http://en.wikipedia.org/wiki/Polyploid sequence SO:0001254 polyploid A kind of chromosome variation where the chromosome complement is an exact multiple of the haploid number and is greater than the diploid number. SO:ke http://en.wikipedia.org/wiki/Polyploid wiki A polyploid where the multiple chromosome set was derived from the same organism. http://en.wikipedia.org/wiki/Autopolyploid sequence SO:0001255 autopolyploid A polyploid where the multiple chromosome set was derived from the same organism. SO:ke http://en.wikipedia.org/wiki/Autopolyploid wiki A polyploid where the multiple chromosome set was derived from a different organism.
http://en.wikipedia.org/wiki/Allopolyploid sequence SO:0001256 allopolyploid A polyploid where the multiple chromosome set was derived from a different organism. SO:ke http://en.wikipedia.org/wiki/Allopolyploid wiki The binding site (recognition site) of a homing endonuclease. The binding site is typically large. homing endonuclease binding site sequence SO:0001257 homing_endonuclease_binding_site The binding site (recognition site) of a homing endonuclease. The binding site is typically large. SO:ke A sequence element characteristic of some RNA polymerase II promoters with sequence ATTGCAT that binds Pou-domain transcription factors. octamer motif sequence SO:0001258 Nature. 1986 Oct 16-22;323(6089):640-3. octamer_motif A sequence element characteristic of some RNA polymerase II promoters with sequence ATTGCAT that binds Pou-domain transcription factors. GOC:dh PMID:3095662 A chromosome originating in an apicoplast. apicoplast chromosome sequence SO:0001259 apicoplast_chromosome A chromosome originating in an apicoplast. SO:xp A collection of discontinuous sequences. sequence collection sequence SO:0001260 sequence_collection A collection of discontinuous sequences. SO:ke A continuous region of sequence composed of the overlapping of multiple sequence_features, which ultimately provides evidence for another sequence_feature. overlapping feature set sequence SO:0001261 This feature was requested by Nicole, tracker id 1911479. It is required to gather evidence together for annotation. An example would be overlapping ESTs that support an mRNA. overlapping_feature_set A continuous region of sequence composed of the overlapping of multiple sequence_features, which ultimately provides evidence for another sequence_feature. SO:ke A continuous experimental result region extending the length of multiple overlapping ESTs. overlapping EST set sequence SO:0001262 overlapping_EST_set A continuous experimental result region extending the length of multiple overlapping ESTs.
SO:ke A gene that encodes a non-coding RNA. ncRNA gene non-coding RNA gene sequence SO:0001263 ncRNA_gene A noncoding RNA that guides the insertion or deletion of uridine residues in mitochondrial mRNAs. This may also refer to synthetic RNAs used to guide DNA editing using the CRISPR/Cas9 system. gRNA gene sequence SO:0001264 gRNA_gene A small noncoding RNA of approximately 22 nucleotides in length which may be involved in regulation of gene expression. SO:0001270 miRNA gene stRNA gene stRNA_gene sequence SO:0001265 Moved from ncRNA_gene to sncRNA_gene 27 April 2021 to be more consistent with the organization of the ncRNA branch of SO. Requested by FlyBase, moved by Dave Sant. See GitHub Issue #514. miRNA_gene A gene encoding a small noncoding RNA that is generally found only in the cytoplasm. scRNA gene sequence SO:0001266 scRNA_gene A gene encoding a small noncoding RNA that participates in the processing or chemical modifications of many RNAs, including ribosomal RNAs and spliceosomal RNAs. snoRNA gene sequence SO:0001267 Moved from ncRNA_gene to sncRNA_gene 27 April 2021 to be more consistent with the organization of the ncRNA branch of SO. Requested by FlyBase, moved by Dave Sant. See GitHub Issue #514. Added additional children of snoRNA on 18 Nov 2021 at the request of Steven Marygold. See GitHub Issue #519. snoRNA_gene A gene that encodes a small nuclear RNA. small nuclear RNA gene snRNA gene sequence SO:0001268 Moved from ncRNA_gene to sncRNA_gene 27 April 2021 to be more consistent with the organization of the ncRNA branch of SO. Requested by FlyBase, moved by Dave Sant. See GitHub Issue #514. snRNA_gene A gene that encodes a small nuclear RNA. http://en.wikipedia.org/wiki/Small_nuclear_RNA A gene that encodes a signal recognition particle (SRP) RNA. SRP RNA gene sequence SO:0001269 SRP_RNA_gene true A bacterial RNA with both tRNA and mRNA like properties.
tmRNA gene sequence SO:0001271 Moved from ncRNA_gene to sncRNA_gene 27 April 2021 to be more consistent with the organization of the ncRNA branch of SO. Requested by FlyBase, moved by Dave Sant. See GitHub Issue #514. tmRNA_gene A noncoding RNA that binds to a specific amino acid to allow that amino acid to be used by the ribosome during translation of RNA. tRNA gene sequence SO:0001272 Moved from ncRNA_gene to sncRNA_gene 27 April 2021 to be more consistent with the organization of the ncRNA branch of SO. Requested by FlyBase, moved by Dave Sant. See GitHub Issue #514. tRNA_gene A modified adenine is an adenine base feature that has been altered. modified adenosine sequence SO:0001273 modified_adenosine A modified adenine is an adenine base feature that has been altered. SO:ke A modified inosine is an inosine base feature that has been altered. modified inosine sequence SO:0001274 modified_inosine A modified inosine is an inosine base feature that has been altered. SO:ke A modified cytidine is a cytidine base feature which has been altered. modified cytidine sequence SO:0001275 modified_cytidine A modified cytidine is a cytidine base feature which has been altered. SO:ke A guanosine base that has been modified. modified guanosine sequence SO:0001276 modified_guanosine A uridine base that has been modified. modified uridine sequence SO:0001277 modified_uridine 1-methylinosine is a modified inosine. RNAMOD:018 1-methylinosine m1I one methylinosine sequence SO:0001278 one_methylinosine 1-methylinosine is a modified inosine. http://library.med.utah.edu/RNAmods/ m1I 1,2'-O-dimethylinosine is a modified inosine. RNAMOD:019 1,2'-O-dimethylinosine m'Im one two prime O dimethylinosine sequence SO:0001279 one_two_prime_O_dimethylinosine 1,2'-O-dimethylinosine is a modified inosine. http://library.med.utah.edu/RNAmods/ m'Im 2'-O-methylinosine is a modified inosine. 
RNAMOD:081 2'-O-methylinosine Im two prime O methylinosine sequence SO:0001280 two_prime_O_methylinosine 2'-O-methylinosine is a modified inosine. http://library.med.utah.edu/RNAmods/ Im 3-methylcytidine is a modified cytidine. RNAMOD:020 3-methylcytidine m3C three methylcytidine sequence SO:0001281 three_methylcytidine 3-methylcytidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ m3C 5-methylcytidine is a modified cytidine. RNAMOD:021 5-methylcytidine five methylcytidine m5C sequence SO:0001282 five_methylcytidine 5-methylcytidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ m5C 2'-O-methylcytidine is a modified cytidine. RNAMOD:022 2'-O-methylcytidine Cm two prime O methylcytidine sequence SO:0001283 two_prime_O_methylcytidine 2'-O-methylcytidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ Cm 2-thiocytidine is a modified cytidine. RNAMOD:023 2-thiocytidine s2C two thiocytidine sequence SO:0001284 two_thiocytidine 2-thiocytidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ s2C N4-acetylcytidine is a modified cytidine. RNAMOD:024 N4 acetylcytidine N4-acetylcytidine ac4C sequence SO:0001285 N4_acetylcytidine N4-acetylcytidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ ac4C 5-formylcytidine is a modified cytidine. RNAMOD:025 5-formylcytidine f5C five formylcytidine sequence SO:0001286 five_formylcytidine 5-formylcytidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ f5C 5,2'-O-dimethylcytidine is a modified cytidine. RNAMOD:026 5,2'-O-dimethylcytidine five two prime O dimethylcytidine m5Cm sequence SO:0001287 five_two_prime_O_dimethylcytidine 5,2'-O-dimethylcytidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ m5Cm N4-acetyl-2'-O-methylcytidine is a modified cytidine. RNAMOD:027 N4 acetyl 2 prime O methylcytidine N4-acetyl-2'-O-methylcytidine ac4Cm sequence SO:0001288 N4_acetyl_2_prime_O_methylcytidine N4-acetyl-2'-O-methylcytidine is a modified cytidine. 
http://library.med.utah.edu/RNAmods/ ac4Cm Lysidine is a modified cytidine. RNAMOD:028 http://en.wikipedia.org/wiki/Lysidine k2C sequence SO:0001289 lysidine Lysidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ http://en.wikipedia.org/wiki/Lysidine wiki k2C N4-methylcytidine is a modified cytidine. RNAMOD:082 N4 methylcytidine N4-methylcytidine m4C sequence SO:0001290 N4_methylcytidine N4-methylcytidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ m4C N4,2'-O-dimethylcytidine is a modified cytidine. RNAMOD:083 N4 2 prime O dimethylcytidine N4,2'-O-dimethylcytidine m4Cm sequence SO:0001291 N4_2_prime_O_dimethylcytidine N4,2'-O-dimethylcytidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ m4Cm 5-hydroxymethylcytidine is a modified cytidine. RNAMOD:084 5-hydroxymethylcytidine five hydroxymethylcytidine hm5C sequence SO:0001292 five_hydroxymethylcytidine 5-hydroxymethylcytidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ hm5C 5-formyl-2'-O-methylcytidine is a modified cytidine. RNAMOD:095 5-formyl-2'-O-methylcytidine f5Cm five formyl two prime O methylcytidine sequence SO:0001293 five_formyl_two_prime_O_methylcytidine 5-formyl-2'-O-methylcytidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ f5Cm N4_N4_2_prime_O_trimethylcytidine is a modified cytidine. RNAMOD:107 N4,N4,2'-O-trimethylcytidine m42Cm sequence SO:0001294 N4_N4_2_prime_O_trimethylcytidine N4_N4_2_prime_O_trimethylcytidine is a modified cytidine. http://library.med.utah.edu/RNAmods/ m42Cm 1_methyladenosine is a modified adenosine. RNAMOD:001 1-methyladenosine m1A one methyladenosine sequence SO:0001295 one_methyladenosine 1_methyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ m1A 2_methyladenosine is a modified adenosine. RNAMOD:002 2-methyladenosine m2A two methyladenosine sequence SO:0001296 two_methyladenosine 2_methyladenosine is a modified adenosine. 
http://library.med.utah.edu/RNAmods/ m2A N6_methyladenosine is a modified adenosine. RNAMOD:003 N6 methyladenosine N6-methyladenosine m6A sequence SO:0001297 N6_methyladenosine N6_methyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ m6A 2prime_O_methyladenosine is a modified adenosine. RNAMOD:004 2'-O-methyladenosine Am two prime O methyladenosine sequence SO:0001298 two_prime_O_methyladenosine 2prime_O_methyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ Am 2_methylthio_N6_methyladenosine is a modified adenosine. RNAMOD:005 2-methylthio-N6-methyladenosine ms2m6A two methylthio N6 methyladenosine sequence SO:0001299 two_methylthio_N6_methyladenosine 2_methylthio_N6_methyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ ms2m6A N6_isopentenyladenosine is a modified adenosine. RNAMOD:006 N6 isopentenyladenosine N6-isopentenyladenosine i6A sequence SO:0001300 N6_isopentenyladenosine N6_isopentenyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ i6A 2_methylthio_N6_isopentenyladenosine is a modified adenosine. RNAMOD:007 2-methylthio-N6-isopentenyladenosine ms2i6A two methylthio N6 isopentenyladenosine sequence SO:0001301 two_methylthio_N6_isopentenyladenosine 2_methylthio_N6_isopentenyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ ms2i6A N6_cis_hydroxyisopentenyl_adenosine is a modified adenosine. RNAMOD:008 N6 cis hydroxyisopentenyl adenosine N6-(cis-hydroxyisopentenyl)adenosine io6A sequence SO:0001302 N6_cis_hydroxyisopentenyl_adenosine N6_cis_hydroxyisopentenyl_adenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ io6A 2_methylthio_N6_cis_hydroxyisopentenyl_adenosine is a modified adenosine. 
RNAMOD:009 2-methylthio-N6-(cis-hydroxyisopentenyl) adenosine ms2io6A two methylthio N6 cis hydroxyisopentenyl adenosine sequence SO:0001303 two_methylthio_N6_cis_hydroxyisopentenyl_adenosine 2_methylthio_N6_cis_hydroxyisopentenyl_adenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ ms2io6A N6_glycinylcarbamoyladenosine is a modified adenosine. RNAMOD:010 N6 glycinylcarbamoyladenosine N6-glycinylcarbamoyladenosine g6A sequence SO:0001304 N6_glycinylcarbamoyladenosine N6_glycinylcarbamoyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ g6A N6_threonylcarbamoyladenosine is a modified adenosine. RNAMOD:011 N6 threonylcarbamoyladenosine N6-threonylcarbamoyladenosine t6A sequence SO:0001305 N6_threonylcarbamoyladenosine N6_threonylcarbamoyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ t6A 2_methylthio_N6_threonyl_carbamoyladenosine is a modified adenosine. RNAMOD:012 2-methylthio-N6-threonyl carbamoyladenosine ms2t6A two methylthio N6 threonyl carbamoyladenosine sequence SO:0001306 two_methylthio_N6_threonyl_carbamoyladenosine 2_methylthio_N6_threonyl_carbamoyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ ms2t6A N6_methyl_N6_threonylcarbamoyladenosine is a modified adenosine. RNAMOD:013 N6 methyl N6 threonylcarbamoyladenosine N6-methyl-N6-threonylcarbamoyladenosine m6t6A sequence SO:0001307 N6_methyl_N6_threonylcarbamoyladenosine N6_methyl_N6_threonylcarbamoyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ m6t6A N6_hydroxynorvalylcarbamoyladenosine is a modified adenosine. RNAMOD:014 N6 hydroxynorvalylcarbamoyladenosine N6-hydroxynorvalylcarbamoyladenosine hn6A sequence SO:0001308 N6_hydroxynorvalylcarbamoyladenosine N6_hydroxynorvalylcarbamoyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ hn6A 2_methylthio_N6_hydroxynorvalyl_carbamoyladenosine is a modified adenosine. 
RNAMOD:015 2-methylthio-N6-hydroxynorvalyl carbamoyladenosine ms2hn6A two methylthio N6 hydroxynorvalyl carbamoyladenosine sequence SO:0001309 two_methylthio_N6_hydroxynorvalyl_carbamoyladenosine 2_methylthio_N6_hydroxynorvalyl_carbamoyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ ms2hn6A 2prime_O_ribosyladenosine_phosphate is a modified adenosine. RNAMOD:016 2'-O-ribosyladenosine (phosphate) Ar(p) two prime O ribosyladenosine phosphate sequence SO:0001310 two_prime_O_ribosyladenosine_phosphate 2prime_O_ribosyladenosine_phosphate is a modified adenosine. http://library.med.utah.edu/RNAmods/ Ar(p) N6_N6_dimethyladenosine is a modified adenosine. RNAMOD:080 N6,N6-dimethyladenosine m62A sequence SO:0001311 N6_N6_dimethyladenosine N6_N6_dimethyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ m62A N6_2prime_O_dimethyladenosine is a modified adenosine. RNAMOD:088 N6 2 prime O dimethyladenosine N6,2'-O-dimethyladenosine m6Am sequence SO:0001312 N6_2_prime_O_dimethyladenosine N6_2prime_O_dimethyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ m6Am N6_N6_2prime_O_trimethyladenosine is a modified adenosine. RNAMOD:089 N6,N6,2'-O-trimethyladenosine m62Am sequence SO:0001313 N6_N6_2_prime_O_trimethyladenosine N6_N6_2prime_O_trimethyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ m62Am 1,2'-O-dimethyladenosine is a modified adenosine. RNAMOD:097 1,2'-O-dimethyladenosine m1Am one two prime O dimethyladenosine sequence SO:0001314 one_two_prime_O_dimethyladenosine 1,2'-O-dimethyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ m1Am N6_acetyladenosine is a modified adenosine. RNAMOD:102 N6 acetyladenosine N6-acetyladenosine ac6A sequence SO:0001315 N6_acetyladenosine N6_acetyladenosine is a modified adenosine. http://library.med.utah.edu/RNAmods/ ac6A 7-deazaguanosine is a modified guanosine. 
seven deazaguanosine sequence 7-deazaguanosine SO:0001316 seven_deazaguanosine 7-deazaguanosine is a modified guanosine. http://library.med.utah.edu/RNAmods/ Queuosine is a modified 7-deazaguanosine. RNAMOD:043 http://en.wikipedia.org/wiki/Queuosine Q sequence SO:0001317 queuosine Queuosine is a modified 7-deazaguanosine. http://library.med.utah.edu/RNAmods/ http://en.wikipedia.org/wiki/Queuosine wiki Q Epoxyqueuosine is a modified 7-deazaguanosine. RNAMOD:044 eQ sequence SO:0001318 epoxyqueuosine Epoxyqueuosine is a modified 7-deazaguanosine. http://library.med.utah.edu/RNAmods/ eQ Galactosyl_queuosine is a modified 7-deazaguanosine. RNAMOD:045 galQ galactosyl queuosine galactosyl-queuosine sequence SO:0001319 galactosyl_queuosine Galactosyl_queuosine is a modified 7-deazaguanosine. http://library.med.utah.edu/RNAmods/ galQ Mannosyl_queuosine is a modified 7-deazaguanosine. RNAMOD:046 manQ mannosyl queuosine mannosyl-queuosine sequence SO:0001320 mannosyl_queuosine Mannosyl_queuosine is a modified 7-deazaguanosine. http://library.med.utah.edu/RNAmods/ manQ 7_cyano_7_deazaguanosine is a modified 7-deazaguanosine. RNAMOD:047 7-cyano-7-deazaguanosine preQ0 seven cyano seven deazaguanosine sequence SO:0001321 seven_cyano_seven_deazaguanosine 7_cyano_7_deazaguanosine is a modified 7-deazaguanosine. http://library.med.utah.edu/RNAmods/ preQ0 7_aminomethyl_7_deazaguanosine is a modified 7-deazaguanosine. RNAMOD:048 7-aminomethyl-7-deazaguanosine preQ1 seven aminomethyl seven deazaguanosine sequence SO:0001322 seven_aminomethyl_seven_deazaguanosine 7_aminomethyl_7_deazaguanosine is a modified 7-deazaguanosine. http://library.med.utah.edu/RNAmods/ preQ1 Archaeosine is a modified 7-deazaguanosine. RNAMOD:049 G+ sequence SO:0001323 archaeosine Archaeosine is a modified 7-deazaguanosine. http://library.med.utah.edu/RNAmods/ G+ 1_methylguanosine is a modified guanosine base feature.
RNAMOD:029 1-methylguanosine m1G one methylguanosine sequence SO:0001324 one_methylguanosine 1_methylguanosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ m1G N2_methylguanosine is a modified guanosine base feature. RNAMOD:030 N2 methylguanosine N2-methylguanosine m2G sequence SO:0001325 N2_methylguanosine N2_methylguanosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ m2G 7_methylguanosine is a modified guanosine base feature. RNAMOD:031 7-methylguanosine m7G seven methylguanosine sequence SO:0001326 seven_methylguanosine 7_methylguanosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ m7G 2prime_O_methylguanosine is a modified guanosine base feature. RNAMOD:032 2'-O-methylguanosine Gm two prime O methylguanosine sequence SO:0001327 two_prime_O_methylguanosine 2prime_O_methylguanosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ Gm N2_N2_dimethylguanosine is a modified guanosine base feature. RNAMOD:033 N2,N2-dimethylguanosine m22G sequence SO:0001328 N2_N2_dimethylguanosine N2_N2_dimethylguanosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ m22G N2_2prime_O_dimethylguanosine is a modified guanosine base feature. RNAMOD:034 N2 2 prime O dimethylguanosine N2,2'-O-dimethylguanosine m2Gm sequence SO:0001329 N2_2_prime_O_dimethylguanosine N2_2prime_O_dimethylguanosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ m2Gm N2_N2_2prime_O_trimethylguanosine is a modified guanosine base feature. RNAMOD:035 N2,N2,2'-O-trimethylguanosine m22Gmv sequence SO:0001330 N2_N2_2_prime_O_trimethylguanosine N2_N2_2prime_O_trimethylguanosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ m22Gmv 2prime_O_ribosylguanosine_phosphate is a modified guanosine base feature. 
RNAMOD:036 2'-O-ribosylguanosine (phosphate) Gr(p) two prime O ribosylguanosine phosphate sequence SO:0001331 two_prime_O_ribosylguanosine_phosphate 2prime_O_ribosylguanosine_phosphate is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ Gr(p) Wybutosine is a modified guanosine base feature. RNAMOD:037 yW sequence SO:0001332 wybutosine Wybutosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ yW Peroxywybutosine is a modified guanosine base feature. RNAMOD:038 o2yW sequence SO:0001333 peroxywybutosine Peroxywybutosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ o2yW Hydroxywybutosine is a modified guanosine base feature. RNAMOD:039 OHyW sequence SO:0001334 hydroxywybutosine Hydroxywybutosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ OHyW Undermodified_hydroxywybutosine is a modified guanosine base feature. RNAMOD:040 OHyW* undermodified hydroxywybutosine sequence SO:0001335 undermodified_hydroxywybutosine Undermodified_hydroxywybutosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ OHyW* Wyosine is a modified guanosine base feature. RNAMOD:041 IMG sequence SO:0001336 wyosine Wyosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ IMG Methylwyosine is a modified guanosine base feature. RNAMOD:042 mimG sequence SO:0001337 methylwyosine Methylwyosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ mimG N2_7_dimethylguanosine is a modified guanosine base feature. RNAMOD:090 N2 7 dimethylguanosine N2,7-dimethylguanosine m2,7G sequence SO:0001338 N2_7_dimethylguanosine N2_7_dimethylguanosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ m2,7G N2_N2_7_trimethylguanosine is a modified guanosine base feature. 
RNAMOD:091 N2,N2,7-trimethylguanosine m2,2,7G sequence SO:0001339 N2_N2_7_trimethylguanosine N2_N2_7_trimethylguanosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ m2,2,7G 1_2prime_O_dimethylguanosine is a modified guanosine base feature. RNAMOD:096 1,2'-O-dimethylguanosine m1Gm one two prime O dimethylguanosine sequence SO:0001340 one_two_prime_O_dimethylguanosine 1_2prime_O_dimethylguanosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ m1Gm 4_demethylwyosine is a modified guanosine base feature. RNAMOD:100 4-demethylwyosine four demethylwyosine imG-14 sequence SO:0001341 four_demethylwyosine 4_demethylwyosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ imG-14 Isowyosine is a modified guanosine base feature. RNAMOD:101 imG2 sequence SO:0001342 isowyosine Isowyosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ imG2 N2_7_2prirme_O_trimethylguanosine is a modified guanosine base feature. RNAMOD:106 N2 7 2prirme O trimethylguanosine N2,7,2'-O-trimethylguanosine m2,7Gm sequence SO:0001343 N2_7_2prirme_O_trimethylguanosine N2_7_2prirme_O_trimethylguanosine is a modified guanosine base feature. http://library.med.utah.edu/RNAmods/ m2,7Gm 5_methyluridine is a modified uridine base feature. RNAMOD:052 http://en.wikipedia.org/wiki/5-methyluridine 5-methyluridine five methyluridine m5U sequence SO:0001344 five_methyluridine 5_methyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ http://en.wikipedia.org/wiki/5-methyluridine wiki m5U 2prime_O_methyluridine is a modified uridine base feature. RNAMOD:053 2'-O-methyluridine Um two prime O methyluridine sequence SO:0001345 two_prime_O_methyluridine 2prime_O_methyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ Um 5_2_prime_O_dimethyluridine is a modified uridine base feature. 
RNAMOD:054 5,2'-O-dimethyluridine five two prime O dimethyluridine m5Um sequence SO:0001346 five_two_prime_O_dimethyluridine 5_2_prime_O_dimethyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ m5Um 1_methylpseudouridine is a modified uridine base feature. RNAMOD:055 1-methylpseudouridine m1Y one methylpseudouridine sequence SO:0001347 one_methylpseudouridine 1_methylpseudouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ m1Y 2prime_O_methylpseudouridine is a modified uridine base feature. RNAMOD:056 2'-O-methylpseudouridine Ym two prime O methylpseudouridine sequence SO:0001348 two_prime_O_methylpseudouridine 2prime_O_methylpseudouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ Ym 2_thiouridine is a modified uridine base feature. RNAMOD:057 2-thiouridine s2U two thiouridine sequence SO:0001349 two_thiouridine 2_thiouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ s2U 4_thiouridine is a modified uridine base feature. RNAMOD:058 4-thiouridine four thiouridine s4U sequence SO:0001350 four_thiouridine 4_thiouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ s4U 5_methyl_2_thiouridine is a modified uridine base feature. RNAMOD:059 5-methyl-2-thiouridine five methyl 2 thiouridine m5s2U sequence SO:0001351 five_methyl_2_thiouridine 5_methyl_2_thiouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ m5s2U 2_thio_2prime_O_methyluridine is a modified uridine base feature. RNAMOD:060 2-thio-2'-O-methyluridine s2Um two thio two prime O methyluridine sequence SO:0001352 two_thio_two_prime_O_methyluridine 2_thio_2prime_O_methyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ s2Um 3_3_amino_3_carboxypropyl_uridine is a modified uridine base feature. 
RNAMOD:061 3-(3-amino-3-carboxypropyl)uridine acp3U sequence SO:0001353 three_three_amino_three_carboxypropyl_uridine 3_3_amino_3_carboxypropyl_uridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ acp3U 5_hydroxyuridine is a modified uridine base feature. RNAMOD:060 5-hydroxyuridine five hydroxyuridine ho5U sequence SO:0001354 five_hydroxyuridine 5_hydroxyuridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ ho5U 5_methoxyuridine is a modified uridine base feature. RNAMOD:063 5-methoxyuridine five methoxyuridine mo5U sequence SO:0001355 five_methoxyuridine 5_methoxyuridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ mo5U Uridine_5_oxyacetic_acid is a modified uridine base feature. RNAMOD:064 cmo5U uridine 5-oxyacetic acid uridine five oxyacetic acid sequence SO:0001356 uridine_five_oxyacetic_acid Uridine_5_oxyacetic_acid is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ cmo5U Uridine_5_oxyacetic_acid_methyl_ester is a modified uridine base feature. RNAMOD:065 mcmo5U uridine 5-oxyacetic acid methyl ester uridine five oxyacetic acid methyl ester sequence SO:0001357 uridine_five_oxyacetic_acid_methyl_ester Uridine_5_oxyacetic_acid_methyl_ester is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ mcmo5U 5_carboxyhydroxymethyl_uridine is a modified uridine base feature. RNAMOD:066 5-(carboxyhydroxymethyl)uridine chm5U five carboxyhydroxymethyl uridine sequence SO:0001358 five_carboxyhydroxymethyl_uridine 5_carboxyhydroxymethyl_uridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ chm5U 5_carboxyhydroxymethyl_uridine_methyl_ester is a modified uridine base feature. RNAMOD:067 5-(carboxyhydroxymethyl)uridine methyl ester five carboxyhydroxymethyl uridine methyl ester mchm5U sequence SO:0001359 five_carboxyhydroxymethyl_uridine_methyl_ester 5_carboxyhydroxymethyl_uridine_methyl_ester is a modified uridine base feature. 
http://library.med.utah.edu/RNAmods/ mchm5U Five_methoxycarbonylmethyluridine is a modified uridine base feature. RNAMOD:068 5-methoxycarbonylmethyluridine five methoxycarbonylmethyluridine mcm5U sequence SO:0001360 five_methoxycarbonylmethyluridine Five_methoxycarbonylmethyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ mcm5U Five_methoxycarbonylmethyl_2_prime_O_methyluridine is a modified uridine base feature. RNAMOD:069 5-methoxycarbonylmethyl-2'-O-methyluridine five methoxycarbonylmethyl two prime O methyluridine mcm5Um sequence SO:0001361 five_methoxycarbonylmethyl_two_prime_O_methyluridine Five_methoxycarbonylmethyl_2_prime_O_methyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ mcm5Um 5_methoxycarbonylmethyl_2_thiouridine is a modified uridine base feature. RNAMOD:070 5-methoxycarbonylmethyl-2-thiouridine five methoxycarbonylmethyl two thiouridine mcm5s2U sequence SO:0001362 five_methoxycarbonylmethyl_two_thiouridine 5_methoxycarbonylmethyl_2_thiouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ mcm5s2U 5_aminomethyl_2_thiouridine is a modified uridine base feature. RNAMOD:071 5-aminomethyl-2-thiouridine five aminomethyl two thiouridine nm5s2U sequence SO:0001363 five_aminomethyl_two_thiouridine 5_aminomethyl_2_thiouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ nm5s2U 5_methylaminomethyluridine is a modified uridine base feature. RNAMOD:072 5-methylaminomethyluridine five methylaminomethyluridine mnm5U sequence SO:0001364 five_methylaminomethyluridine 5_methylaminomethyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ mnm5U 5_methylaminomethyl_2_thiouridine is a modified uridine base feature. 
RNAMOD:073 5-methylaminomethyl-2-thiouridine five methylaminomethyl two thiouridine mnm5s2U sequence SO:0001365 five_methylaminomethyl_two_thiouridine 5_methylaminomethyl_2_thiouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ mnm5s2U 5_methylaminomethyl_2_selenouridine is a modified uridine base feature. RNAMOD:074 5-methylaminomethyl-2-selenouridine five methylaminomethyl two selenouridine mnm5se2U sequence SO:0001366 five_methylaminomethyl_two_selenouridine 5_methylaminomethyl_2_selenouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ mnm5se2U 5_carbamoylmethyluridine is a modified uridine base feature. RNAMOD:075 5-carbamoylmethyluridine five carbamoylmethyluridine ncm5U sequence SO:0001367 five_carbamoylmethyluridine 5_carbamoylmethyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ ncm5U 5_carbamoylmethyl_2_prime_O_methyluridine is a modified uridine base feature. RNAMOD:076 5-carbamoylmethyl-2'-O-methyluridine five carbamoylmethyl two prime O methyluridine ncm5Um sequence SO:0001368 five_carbamoylmethyl_two_prime_O_methyluridine 5_carbamoylmethyl_2_prime_O_methyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ ncm5Um 5_carboxymethylaminomethyluridine is a modified uridine base feature. RNAMOD:077 5-carboxymethylaminomethyluridine cmnm5U five carboxymethylaminomethyluridine sequence SO:0001369 five_carboxymethylaminomethyluridine 5_carboxymethylaminomethyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ cmnm5U 5_carboxymethylaminomethyl_2_prime_O_methyluridine is a modified uridine base feature. RNAMOD:078 5-carboxymethylaminomethyl- 2'-O-methyluridine cmnm5Um five carboxymethylaminomethyl two prime O methyluridine sequence SO:0001370 five_carboxymethylaminomethyl_two_prime_O_methyluridine 5_carboxymethylaminomethyl_2_prime_O_methyluridine is a modified uridine base feature. 
http://library.med.utah.edu/RNAmods/ cmnm5Um 5_carboxymethylaminomethyl_2_thiouridine is a modified uridine base feature. RNAMOD:079 5-carboxymethylaminomethyl-2-thiouridine cmnm5s2U five carboxymethylaminomethyl two thiouridine sequence SO:0001371 five_carboxymethylaminomethyl_two_thiouridine 5_carboxymethylaminomethyl_2_thiouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ cmnm5s2U 3_methyluridine is a modified uridine base feature. RNAMOD:085 3-methyluridine m3U three methyluridine sequence SO:0001372 three_methyluridine 3_methyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ m3U 1_methyl_3_3_amino_3_carboxypropyl_pseudouridine is a modified uridine base feature. RNAMOD:086 1-methyl-3-(3-amino-3-carboxypropyl) pseudouridine m1acp3Y sequence SO:0001373 one_methyl_three_three_amino_three_carboxypropyl_pseudouridine 1_methyl_3_3_amino_3_carboxypropyl_pseudouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ m1acp3Y 5_carboxymethyluridine is a modified uridine base feature. RNAMOD:087 5-carboxymethyluridine cm5U five carboxymethyluridine sequence SO:0001374 five_carboxymethyluridine 5_carboxymethyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ cm5U 3_2prime_O_dimethyluridine is a modified uridine base feature. RNAMOD:092 3,2'-O-dimethyluridine m3Um three two prime O dimethyluridine sequence SO:0001375 three_two_prime_O_dimethyluridine 3_2prime_O_dimethyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ m3Um 5_methyldihydrouridine is a modified uridine base feature. RNAMOD:093 5-methyldihydrouridine five methyldihydrouridine m5D sequence SO:0001376 five_methyldihydrouridine 5_methyldihydrouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ m5D 3_methylpseudouridine is a modified uridine base feature. 
RNAMOD:094 3-methylpseudouridine m3Y three methylpseudouridine sequence SO:0001377 three_methylpseudouridine 3_methylpseudouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ m3Y 5_taurinomethyluridine is a modified uridine base feature. RNAMOD:098 5-taurinomethyluridine five taurinomethyluridine tm5U sequence SO:0001378 five_taurinomethyluridine 5_taurinomethyluridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ tm5U 5_taurinomethyl_2_thiouridine is a modified uridine base feature. RNAMOD:099 5-taurinomethyl-2-thiouridine five taurinomethyl two thiouridine tm5s2U sequence SO:0001379 five_taurinomethyl_two_thiouridine 5_taurinomethyl_2_thiouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ tm5s2U 5_isopentenylaminomethyl_uridine is a modified uridine base feature. RNAMOD:103 5-(isopentenylaminomethyl)uridine five isopentenylaminomethyl uridine inm5U sequence SO:0001380 five_isopentenylaminomethyl_uridine 5_isopentenylaminomethyl_uridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ inm5U 5_isopentenylaminomethyl_2_thiouridine is a modified uridine base feature. RNAMOD:104 5-(isopentenylaminomethyl)- 2-thiouridine five isopentenylaminomethyl two thiouridine inm5s2U sequence SO:0001381 five_isopentenylaminomethyl_two_thiouridine 5_isopentenylaminomethyl_2_thiouridine is a modified uridine base feature. http://library.med.utah.edu/RNAmods/ inm5s2U 5_isopentenylaminomethyl_2prime_O_methyluridine is a modified uridine base feature. RNAMOD:105 5-(isopentenylaminomethyl)- 2'-O-methyluridine five isopentenylaminomethyl two prime O methyluridine inm5Um sequence SO:0001382 five_isopentenylaminomethyl_two_prime_O_methyluridine 5_isopentenylaminomethyl_2prime_O_methyluridine is a modified uridine base feature.
http://library.med.utah.edu/RNAmods/ inm5Um A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a histone. histone binding site sequence SO:0001383 histone_binding_site A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a histone. SO:ke A portion of a CDS that is not the complete CDS. CDS fragment incomplete CDS sequence SO:0001384 CDS_fragment A post translationally modified amino acid feature. modified amino acid feature sequence SO:0001385 modified_amino_acid_feature A post translationally modified amino acid feature. SO:ke A post translationally modified glycine amino acid feature. MOD:00908 ModGly modified glycine sequence SO:0001386 modified_glycine A post translationally modified glycine amino acid feature. SO:ke ModGly A post translationally modified alanine amino acid feature. MOD:00901 ModAla modified L alanine modified L-alanine sequence SO:0001387 modified_L_alanine A post translationally modified alanine amino acid feature. SO:ke ModAla A post translationally modified asparagine amino acid feature. MOD:00903 ModAsn modified L asparagine modified L-asparagine sequence SO:0001388 modified_L_asparagine A post translationally modified asparagine amino acid feature. SO:ke ModAsn A post translationally modified aspartic acid amino acid feature. MOD:00904 ModAsp modified L aspartic acid modified L-aspartic acid sequence SO:0001389 modified_L_aspartic_acid A post translationally modified aspartic acid amino acid feature. SO:ke ModAsp A post translationally modified cysteine amino acid feature. MOD:00905 ModCys modified L cysteine modified L-cysteine sequence SO:0001390 modified_L_cysteine A post translationally modified cysteine amino acid feature. SO:ke ModCys A post translationally modified glutamic acid. 
MOD:00906 ModGlu modified L glutamic acid modified L-glutamic acid sequence SO:0001391 modified_L_glutamic_acid ModGlu A post translationally modified threonine amino acid feature. MOD:00917 ModThr modified L threonine modified L-threonine sequence SO:0001392 modified_L_threonine A post translationally modified threonine amino acid feature. SO:ke ModThr A post translationally modified tryptophan amino acid feature. MOD:00918 ModTrp modified L tryptophan modified L-tryptophan sequence SO:0001393 modified_L_tryptophan A post translationally modified tryptophan amino acid feature. SO:ke ModTrp A post translationally modified glutamine amino acid feature. MOD:00907 ModGln modified L glutamine modified L-glutamine sequence SO:0001394 modified_L_glutamine A post translationally modified glutamine amino acid feature. SO:ke A post translationally modified methionine amino acid feature. MOD:00913 ModMet modified L methionine modified L-methionine sequence SO:0001395 modified_L_methionine A post translationally modified methionine amino acid feature. SO:ke ModMet A post translationally modified isoleucine amino acid feature. MOD:00910 ModIle modified L isoleucine modified L-isoleucine sequence SO:0001396 modified_L_isoleucine A post translationally modified isoleucine amino acid feature. SO:ke ModIle A post translationally modified phenylalanine amino acid feature. MOD:00914 ModPhe modified L phenylalanine modified L-phenylalanine sequence SO:0001397 modified_L_phenylalanine A post translationally modified phenylalanine amino acid feature. SO:ke ModPhe A post translationally modified histidine amino acid feature. MOD:00909 ModHis modified L histidine modified L-histidine sequence SO:0001398 modified_L_histidine A post translationally modified histidine amino acid feature. SO:ke A post translationally modified serine amino acid feature. 
MOD:00916 MosSer modified L serine modified L-serine sequence SO:0001399 modified_L_serine A post translationally modified serine amino acid feature. SO:ke MOD:00916 http://www.psidev.info/index.php?q=node/104 MosSer A post translationally modified lysine amino acid feature. MOD:00912 ModLys modified L lysine modified L-lysine sequence SO:0001400 modified_L_lysine A post translationally modified lysine amino acid feature. SO:ke ModLys A post translationally modified leucine amino acid feature. MOD:00911 ModLeu modified L leucine modified L-leucine sequence SO:0001401 modified_L_leucine A post translationally modified leucine amino acid feature. SO:ke ModLeu A post translationally modified selenocysteine amino acid feature. MOD:01158 modified L selenocysteine modified L-selenocysteine sequence SO:0001402 modified_L_selenocysteine A post translationally modified selenocysteine amino acid feature. SO:ke A post translationally modified valine amino acid feature. MOD:00920 ModVal modified L valine modified L-valine sequence SO:0001403 modified_L_valine A post translationally modified valine amino acid feature. SO:ke ModVal A post translationally modified proline amino acid feature. MOD:00915 ModPro modified L proline modified L-proline sequence SO:0001404 modified_L_proline A post translationally modified proline amino acid feature. SO:ke ModPro A post translationally modified tyrosine amino acid feature. MOD:00919 ModTry modified L tyrosine modified L-tyrosine sequence SO:0001405 modified_L_tyrosine A post translationally modified tyrosine amino acid feature. SO:ke ModTry A post translationally modified arginine amino acid feature. MOD:00902 ModArg modified L arginine modified L-arginine sequence SO:0001406 modified_L_arginine A post translationally modified arginine amino acid feature. SO:ke ModArg An attribute describing the nature of a proteinaceous polymer, where by the amino acid units are joined by peptide bonds. 
sequence SO:0001407 peptidyl An attribute describing the nature of a proteinaceous polymer, where by the amino acid units are joined by peptide bonds. SO:ke The C-terminal residues of a polypeptide which are exchanged for a GPI-anchor. cleaved for gpi anchor region sequence SO:0001408 cleaved_for_gpi_anchor_region The C-terminal residues of a polypeptide which are exchanged for a GPI-anchor. EBI:rh A region which is intended for use in an experiment. biomaterial region sequence SO:0001409 biomaterial_region A region which is intended for use in an experiment. SO:cb A region which is the result of some arbitrary experimental procedure. The procedure may be carried out with biological material or inside a computer. experimental output artefact experimental_output_artefact sequence analysis feature SO:0001410 experimental_feature A region which is the result of some arbitrary experimental procedure. The procedure may be carried out with biological material or inside a computer. SO:cb A region defined by its disposition to be involved in a biological process. INSDC_misc_feature INSDC_note:biological_region biological region sequence SO:0001411 biological_region A region defined by its disposition to be involved in a biological process. SO:cb A DNA region within which self-interaction occurs more often than expected by chance because of DNA-looping. topologically defined region sequence SO:0001412 topologically_defined_region A DNA region within which self-interaction occurs more often than expected by chance because of DNA-looping. PMID:32782014 SO:cb The point within a chromosome where a translocation begins or ends. translocation breakpoint sequence SO:0001413 translocation_breakpoint The point within a chromosome where a translocation begins or ends. SO:cb The point within a chromosome where an insertion begins or ends. insertion breakpoint sequence SO:0001414 insertion_breakpoint The point within a chromosome where an insertion begins or ends.
SO:cb The point within a chromosome where a deletion begins or ends. deletion breakpoint sequence SO:0001415 deletion_breakpoint The point within a chromosome where a deletion begins or ends. SO:cb A flanking region located five prime of a specific region. five prime flanking region sequence 5' flanking region SO:0001416 five_prime_flanking_region A flanking region located five prime of a specific region. SO:chado A flanking region located three prime of a specific region. three prime flanking region sequence 3' flanking region SO:0001417 three_prime_flanking_region A flanking region located three prime of a specific region. SO:chado An experimental region, defined by a tiling array experiment to be transcribed at some level. transcribed fragment sequence transfrag SO:0001418 Term requested by the MODencode group. transcribed_fragment An experimental region, defined by a tiling array experiment to be transcribed at some level. SO:ke Intronic 2 bp region bordering exon. A splice_site that adjacent_to exon and overlaps intron. cis splice site sequence SO:0001419 cis_splice_site Intronic 2 bp region bordering exon. A splice_site that adjacent_to exon and overlaps intron. SO:cjm SO:ke Primary transcript region bordering trans-splice junction. trans splice site sequence SO:0001420 trans_splice_site Primary transcript region bordering trans-splice junction. SO:ke The boundary between an intron and an exon. splice boundary splice junction sequence SO:0001421 splice_junction The boundary between an intron and an exon. SO:ke A region of a polypeptide, involved in the transition from one conformational state to another. polypeptide conformational switch sequence SO:0001422 MM Young, K Kirshenbaum, KA Dill & S Highsmith. Predicting conformational switches in proteins. Protein Science, 1999, 8, 1752-64. K. Kirshenbaum, M.M. Young and S. Highsmith. Predicting Allosteric Switches in Myosins. Protein Science 8(9):1806-1815. 1999. 
conformational_switch A region of a polypeptide, involved in the transition from one conformational state to another. SO:ke A read produced by the dye terminator method of sequencing. sequence dye terminator read SO:0001423 dye_terminator_read A read produced by the dye terminator method of sequencing. SO:ke A read produced by pyrosequencing technology. sequence pyorsequenced read SO:0001424 An example is a read produced by Roche 454 technology. pyrosequenced_read A read produced by pyrosequencing technology. SO:ke A read produced by ligation based sequencing technologies. sequence ligation based read SO:0001425 An example of this kind of read is one produced by ABI SOLiD. ligation_based_read A read produced by ligation based sequencing technologies. SO:ke A read produced by the polymerase based sequence by synthesis method. sequence polymerase synthesis read SO:0001426 An example is a read produced by Illumina technology. polymerase_synthesis_read A read produced by the polymerase based sequence by synthesis method. SO:ke A structural region in an RNA molecule which promotes ribosomal frameshifting of cis coding sequence. cis regulatory frameshift element sequence SO:0001427 Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. cis_regulatory_frameshift_element A structural region in an RNA molecule which promotes ribosomal frameshifting of cis coding sequence. RFAM:jd A sequence assembly derived from expressed sequences. expressed sequence assembly sequence SO:0001428 From tracker [ 2372385 ] expressed_sequence_assembly. expressed_sequence_assembly A sequence assembly derived from expressed sequences. 
SO:ke A binding site that, in the molecule, interacts selectively and non-covalently with DNA. DNA binding site sequence SO:0001429 DNA_binding_site A binding site that, in the molecule, interacts selectively and non-covalently with DNA. SO:ke true A gene that is not transcribed under normal conditions and is not critical to normal cellular functioning. cryptic gene sequence SO:0001431 cryptic_gene A gene that is not transcribed under normal conditions and is not critical to normal cellular functioning. SO:ke SO:0001545 sequence variant affecting polyadenylation sequence mutation affecting polyadenylation SO:0001432 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_polyadenylation true A three prime RACE (Rapid Amplification of cDNA Ends) clone is a cDNA clone copied from the 3' end of an mRNA (using a poly-dT primer to capture the polyA tail and a gene-specific or randomly primed 5' primer), and spliced into a vector for propagation in a suitable host. sequence 3' RACE clone SO:0001433 three_prime_RACE_clone A three prime RACE (Rapid Amplification of cDNA Ends) clone is a cDNA clone copied from the 3' end of an mRNA (using a poly-dT primer to capture the polyA tail and a gene-specific or randomly primed 5' primer), and spliced into a vector for propagation in a suitable host. modENCODE:nlw A cassette pseudogene is a kind of gene in an inactive form which may recombine at a telomeric locus to form a functional copy. cassette pseudogene sequence cassette type psedogene SO:0001434 Requested by the Trypanosome community. cassette_pseudogene A cassette pseudogene is a kind of gene in an inactive form which may recombine at a telomeric locus to form a functional copy. SO:ke A non-polar, hydrophobic amino acid encoded by the codons GCN (GCT, GCC, GCA and GCG). A Ala sequence SO:0001435 A place holder for a cross product with chebi.
alanine A Ala A non-polar, hydrophobic amino acid encoded by the codons GTN (GTT, GTC, GTA and GTG). V Val sequence SO:0001436 A place holder for a cross product with chebi. valine V Val A non-polar, hydrophobic amino acid encoded by the codons CTN (CTT, CTC, CTA and CTG), TTA and TTG. L Leu sequence SO:0001437 A place holder for a cross product with chebi. leucine L Leu A non-polar, hydrophobic amino acid encoded by the codons ATH (ATT, ATC and ATA). I Ile sequence SO:0001438 A place holder for a cross product with chebi. isoleucine I Ile A non-polar, hydrophobic amino acid encoded by the codons CCN (CCT, CCC, CCA and CCG). P Pro sequence SO:0001439 A place holder for a cross product with chebi. proline P Pro A non-polar, hydrophobic amino acid encoded by the codon TGG. Trp W sequence SO:0001440 A place holder for a cross product with chebi. tryptophan Trp W A non-polar, hydrophobic amino acid encoded by the codons TTT and TTC. F Phe sequence SO:0001441 A place holder for a cross product with chebi. phenylalanine F Phe A non-polar, hydrophobic amino acid encoded by the codon ATG. M Met sequence SO:0001442 A place holder for a cross product with chebi. methionine M Met A non-polar, hydrophilic amino acid encoded by the codons GGN (GGT, GGC, GGA and GGG). G Gly sequence SO:0001443 A place holder for a cross product with chebi. glycine G Gly A polar, hydrophilic amino acid encoded by the codons TCN (TCT, TCC, TCA, TCG), AGT and AGC. S Ser sequence SO:0001444 A place holder for a cross product with chebi. serine S Ser A polar, hydrophilic amino acid encoded by the codons ACN (ACT, ACC, ACA and ACG). T Thr sequence SO:0001445 A place holder for a cross product with chebi. threonine T Thr A polar, hydrophilic amino acid encoded by the codons TAT and TAC. Tyr Y sequence SO:0001446 A place holder for a cross product with chebi. tyrosine Tyr Y A polar amino acid encoded by the codons TGT and TGC. 
C Cys sequence SO:0001447 A place holder for a cross product with chebi. cysteine C Cys A polar, hydrophilic amino acid encoded by the codons CAA and CAG. Gln Q sequence SO:0001448 A place holder for a cross product with chebi. glutamine Gln Q A polar, hydrophilic amino acid encoded by the codons AAT and AAC. Asn N sequence SO:0001449 A place holder for a cross product with chebi. asparagine Asn N A positively charged, hydrophilic amino acid encoded by the codons AAA and AAG. K Lys sequence SO:0001450 A place holder for a cross product with chebi. lysine K Lys A positively charged, hydrophilic amino acid encoded by the codons CGN (CGT, CGC, CGA and CGG), AGA and AGG. Arg R sequence SO:0001451 A place holder for a cross product with chebi. arginine Arg R A positively charged, hydrophilic amino acid encoded by the codons CAT and CAC. H His sequence SO:0001452 A place holder for a cross product with chebi. histidine H His A negatively charged, hydrophilic amino acid encoded by the codons GAT and GAC. Asp D aspartic acid sequence SO:0001453 A place holder for a cross product with chebi. aspartic_acid Asp D A negatively charged, hydrophilic amino acid encoded by the codons GAA and GAG. E Glu glutamic acid sequence SO:0001454 A place holder for a cross product with chebi. glutamic_acid E Glu A relatively rare amino acid encoded by the codon UGA in some contexts, whereas UGA is a termination codon in other contexts. Sec U sequence SO:0001455 A place holder for a cross product with chebi. selenocysteine A relatively rare amino acid encoded by the codon UGA in some contexts, whereas UGA is a termination codon in other contexts. PMID:23275319 Sec U A relatively rare amino acid encoded by the codon UAG in some contexts, whereas UAG is a termination codon in other contexts. O Pyl sequence SO:0001456 A place holder for a cross product with chebi. 
pyrrolysine A relatively rare amino acid encoded by the codon UAG in some contexts, whereas UAG is a termination codon in other contexts. PMID:15788401 O Pyl A region defined by a set of transcribed sequences from the same gene or expressed pseudogene. transcribed cluster sequence unigene cluster SO:0001457 This term was requested by Jeff Bowes, using the tracker, ID = 2594157. transcribed_cluster A region defined by a set of transcribed sequences from the same gene or expressed pseudogene. SO:ke A kind of transcribed_cluster defined by a set of transcribed sequences from a unique gene. sequence unigene cluster SO:0001458 This term was requested by Jeff Bowes, using the tracker, ID = 2594157. unigene_cluster A kind of transcribed_cluster defined by a set of transcribed sequences from a unique gene. SO:ke Clustered Palindromic Repeats interspersed with bacteriophage derived spacer sequences. http://en.wikipedia.org/wiki/CRISPR CRISPR element Clustered_Regularly_Interspaced_Short_Palindromic_Repeat sequence SO:0001459 CRISPR Clustered Palindromic Repeats interspersed with bacteriophage derived spacer sequences. RFAM:jd A binding site that, in an insulator region of a nucleotide molecule, interacts selectively and non-covalently with polypeptide residues. sequence insulator binding site SO:0001460 See tracker ID 2060908. insulator_binding_site A binding site that, in an insulator region of a nucleotide molecule, interacts selectively and non-covalently with polypeptide residues. SO:ke A binding site that, in the enhancer region of a nucleotide molecule, interacts selectively and non-covalently with polypeptide residues. sequence enhancer binding site SO:0001461 enhancer_binding_site A binding site that, in the enhancer region of a nucleotide molecule, interacts selectively and non-covalently with polypeptide residues. SO:ke A collection of contigs. contig collection sequence SO:0001462 See tracker ID: 2138359. contig_collection A collection of contigs. 
SO:ke Long, intervening non-coding RNA. A transcript that does not overlap within the start or end genomic coordinates of a coding gene or pseudogene on either strand. large intervening non-coding RNA long intergenic non-coding RNA long intervening non-coding RNA sequence SO:0001463 lincRNA Long, intervening non-coding RNA. A transcript that does not overlap within the start or end genomic coordinates of a coding gene or pseudogene on either strand. PMID:19182780 PMID:23463798 SO:ke http://www.gencodegenes.org/gencode_biotypes.html An EST spanning part or all of the untranslated regions of a protein-coding transcript. UTR sequence tag sequence SO:0001464 UST An EST spanning part or all of the untranslated regions of a protein-coding transcript. SO:nlw A UST located in the 3'UTR of a protein-coding transcript. sequence 3' UST SO:0001465 three_prime_UST A UST located in the 3'UTR of a protein-coding transcript. SO:nlw An UST located in the 5'UTR of a protein-coding transcript. sequence 5' UST SO:0001466 five_prime_UST An UST located in the 5'UTR of a protein-coding transcript. SO:nlw A tag produced from a single sequencing read from a RACE product; typically a few hundred base pairs long. RACE sequence tag sequence SO:0001467 RST A tag produced from a single sequencing read from a RACE product; typically a few hundred base pairs long. SO:nlw A tag produced from a single sequencing read from a 3'-RACE product; typically a few hundred base pairs long. 3' RST sequence SO:0001468 three_prime_RST A tag produced from a single sequencing read from a 3'-RACE product; typically a few hundred base pairs long. SO:nlw A tag produced from a single sequencing read from a 5'-RACE product; typically a few hundred base pairs long. sequence 5' RST SO:0001469 five_prime_RST A tag produced from a single sequencing read from a 5'-RACE product; typically a few hundred base pairs long. SO:nlw A match against an UST sequence. 
UST match sequence SO:0001470 UST_match A match against an UST sequence. SO:nlw A match against an RST sequence. RST match sequence SO:0001471 RST_match A match against an RST sequence. SO:nlw A nucleotide match to a primer sequence. primer match sequence SO:0001472 primer_match A nucleotide match to a primer sequence. SO:nlw A region of the pri miRNA that base pairs with the guide to form the hairpin. kareneilbeck 2009-05-27T03:35:43Z miRNA antiguide miRNA passenger strand miRNA star sequence SO:0001473 miRNA_antiguide A region of the pri miRNA that base pairs with the guide to form the hairpin. SO:ke The boundary between the spliced leader and the first exon of the mRNA. kareneilbeck 2009-07-13T04:50:49Z trans-splice junction sequence SO:0001474 trans_splice_junction The boundary between the spliced leader and the first exon of the mRNA. SO:ke A region of a primary transcript, that is removed via trans splicing. kareneilbeck 2009-07-14T11:36:08Z sequence SO:0001475 outron A region of a primary transcript, that is removed via trans splicing. PMID:16401417 SO:ke A plasmid that occurs naturally. kareneilbeck 2009-09-01T03:43:06Z natural plasmid sequence SO:0001476 natural_plasmid A plasmid that occurs naturally. SO:xp A gene trap construct is a type of engineered plasmid which is designed to integrate into a genome and produce a fusion transcript between exons of the gene into which it inserts and a reporter element in the construct. Gene traps contain a splice acceptor, do not contain promoter elements for the reporter, and are mutagenic. Gene traps may be bicistronic with the second cassette containing a promoter driving a selectable marker. kareneilbeck 2009-09-01T03:49:09Z gene trap construct sequence SO:0001477 gene_trap_construct A gene trap construct is a type of engineered plasmid which is designed to integrate into a genome and produce a fusion transcript between exons of the gene into which it inserts and a reporter element in the construct. 
Gene traps contain a splice acceptor, do not contain promoter elements for the reporter, and are mutagenic. Gene traps may be bicistronic with the second cassette containing a promoter driving a selectable marker. ZFIN:dh A promoter trap construct is a type of engineered plasmid which is designed to integrate into a genome and express a reporter when inserted in close proximity to a promoter element. Promoter traps typically do not contain promoter elements and are mutagenic. kareneilbeck 2009-09-01T03:52:01Z promoter trap construct sequence SO:0001478 promoter_trap_construct A promoter trap construct is a type of engineered plasmid which is designed to integrate into a genome and express a reporter when inserted in close proximity to a promoter element. Promoter traps typically do not contain promoter elements and are mutagenic. ZFIN:dh An enhancer trap construct is a type of engineered plasmid which is designed to integrate into a genome and express a reporter when the expression from a basic minimal promoter is enhanced by genomic enhancer elements. Enhancer traps contain promoter elements and are not usually mutagenic. kareneilbeck 2009-09-01T03:53:26Z enhancer trap construct sequence SO:0001479 enhancer_trap_construct An enhancer trap construct is a type of engineered plasmid which is designed to integrate into a genome and express a reporter when the expression from a basic minimal promoter is enhanced by genomic enhancer elements. Enhancer traps contain promoter elements and are not usually mutagenic. ZFIN:dh A region of sequence from the end of a PAC clone that may provide a highly specific marker. kareneilbeck 2009-09-09T05:18:12Z PAC end sequence SO:0001480 PAC_end A region of sequence from the end of a PAC clone that may provide a highly specific marker. ZFIN:mh RAPD is a 'PCR product' where a sequence variant is identified through the use of PCR with random primers. 
kareneilbeck 2009-09-09T05:26:10Z Random Amplification Polymorphic DNA sequence SO:0001481 RAPD RAPD is a 'PCR product' where a sequence variant is identified through the use of PCR with random primers. ZFIN:mh An enhancer that drives the pattern of transcription and binds to the same TF as the primary enhancer, but is located in the intron of or on the far side of a neighboring gene. kareneilbeck 2009-09-09T05:29:29Z shadow enhancer sequence SO:0001482 shadow_enhancer An enhancer that drives the pattern of transcription and binds to the same TF as the primary enhancer, but is located in the intron of or on the far side of a neighboring gene. PMID:22083793 SNVs are single nucleotide positions in genomic DNA at which different sequence alternatives exist. kareneilbeck 2009-10-08T11:37:49Z single nucleotide variant sequence SO:0001483 SNV SNVs are single nucleotide positions in genomic DNA at which different sequence alternatives exist. SO:bm An X element combinatorial repeat is a repeat region located between the X element and the telomere or adjacent Y' element. kareneilbeck 2009-11-10T11:03:37Z INSDC_feature:repeat_region INSDC_qualifier:x_element_combinatorial_repeat X element combinatorial repeat sequence SO:0001484 X element combinatorial repeats contain Tbf1p binding sites, and possible functions include a role in telomerase-independent telomere maintenance via recombination or as a barrier against transcriptional silencing. These are usually present as a combination of one or more of several types of smaller elements (designated A, B, C, or D). This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880747. X_element_combinatorial_repeat An X element combinatorial repeat is a repeat region located between the X element and the telomere or adjacent Y' element. 
http://www.yeastgenome.org/help/glossary.html A Y' element is a repeat region (SO:0000657) located adjacent to telomeric repeats or X element combinatorial repeats, either as a single copy or tandem repeat of two to four copies. kareneilbeck 2009-11-10T12:08:57Z INSDC_feature:repeat_region INSDC_qualifier:Y_prime_element Y prime element Y' element sequence SO:0001485 This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880747. Y_prime_element A Y' element is a repeat region (SO:0000657) located adjacent to telomeric repeats or X element combinatorial repeats, either as a single copy or tandem repeat of two to four copies. http://www.yeastgenome.org/help/glossary.html The status of a whole genome sequence, where the data is minimally filtered or un-filtered, from any number of sequencing platforms, and is assembled into contigs. Genome sequence of this quality may harbour regions of poor quality and can be relatively incomplete. kareneilbeck 2009-10-23T12:48:32Z standard draft sequence SO:0001486 standard_draft The status of a whole genome sequence, where the data is minimally filtered or un-filtered, from any number of sequencing platforms, and is assembled into contigs. Genome sequence of this quality may harbour regions of poor quality and can be relatively incomplete. DOI:10.1126 The status of a whole genome sequence, where overall coverage represents at least 90 percent of the genome. kareneilbeck 2009-10-23T12:52:36Z high quality draft sequence SO:0001487 high_quality_draft The status of a whole genome sequence, where overall coverage represents at least 90 percent of the genome. DOI:10.1126 The status of a whole genome sequence, where additional work has been performed, using either manual or automated methods, such as gap resolution. 
kareneilbeck 2009-10-23T12:54:35Z improved high quality draft sequence SO:0001488 improved_high_quality_draft The status of a whole genome sequence, where additional work has been performed, using either manual or automated methods, such as gap resolution. DOI:10.1126 The status of a whole genome sequence, where annotation, and verification of coding regions has occurred. kareneilbeck 2009-10-23T12:57:10Z annotation directed improvement sequence SO:0001489 annotation_directed_improved_draft The status of a whole genome sequence, where annotation, and verification of coding regions has occurred. DOI:10.1126 The status of a whole genome sequence, where the assembly is high quality, closure approaches have been successful for most gaps, misassemblies and low quality regions. kareneilbeck 2009-10-23T01:01:07Z non contiguous finished sequence SO:0001490 noncontiguous_finished The status of a whole genome sequence, where the assembly is high quality, closure approaches have been successful for most gaps, misassemblies and low quality regions. DOI:10.1126 The status of a whole genome sequence, with less than 1 error per 100,000 base pairs. kareneilbeck 2009-10-23T01:04:43Z finished finished genome sequence SO:0001491 finished_genome The status of a whole genome sequence, with less than 1 error per 100,000 base pairs. DOI:10.1126 A regulatory region that is part of an intron. kareneilbeck 2009-11-08T02:48:02Z intronic regulatory region sequence SO:0001492 Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. intronic_regulatory_region A regulatory region that is part of an intron. 
SO:ke A centromere DNA Element I (CDEI) is a conserved region, part of the centromere, consisting of a consensus region composed of 8-11bp which enables binding by the centromere binding factor 1 (Cbf1p). kareneilbeck 2009-11-09T05:47:23Z CDEI Centromere DNA Element I sequence SO:0001493 This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880699. centromere_DNA_Element_I A centromere DNA Element I (CDEI) is a conserved region, part of the centromere, consisting of a consensus region composed of 8-11bp which enables binding by the centromere binding factor 1 (Cbf1p). PMID:11222754 A centromere DNA Element II (CDEII) is a conserved region, part of the centromere, consisting of a consensus region that is AT-rich and ~ 75-100 bp in length. kareneilbeck 2009-11-09T05:51:26Z CDEII centromere DNA Element II sequence SO:0001494 This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880699. centromere_DNA_Element_II A centromere DNA Element II (CDEII) is a conserved region, part of the centromere, consisting of a consensus region that is AT-rich and ~ 75-100 bp in length. PMID:11222754 A centromere DNA Element III (CDEIII) is a conserved region, part of the centromere, consisting of a 25-bp consensus region which enables binding by the centromere DNA binding factor 3 (CBF3) complex. kareneilbeck 2009-11-09T05:54:47Z CDEIII centromere DNA Element III sequence SO:0001495 This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880699. centromere_DNA_Element_III A centromere DNA Element III (CDEIII) is a conserved region, part of the centromere, consisting of a 25-bp consensus region which enables binding by the centromere DNA binding factor 3 (CBF3) complex. PMID:11222754 The telomeric repeat is a repeat region, part of the chromosome, which in yeast, is a G-rich terminal sequence of the form (TG(1-3))n or more precisely ((TG)(1-6)TG(2-3))n. 
kareneilbeck 2009-11-09T06:00:42Z INSDC_feature:repeat_region INSDC_qualifier:telomeric_repeat telomeric repeat sequence SO:0001496 The repeats are maintained by telomerase and there is generally 300 (+/-) 75 bp of TG(1-3) at a given end. Telomeric repeats function in completing chromosome replication and protecting the ends from degradation and end-to-end fusions. This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880739. telomeric_repeat The telomeric repeat is a repeat region, part of the chromosome, which in yeast, is a G-rich terminal sequence of the form (TG(1-3))n or more precisely ((TG)(1-6)TG(2-3))n. PMID:8720065 The X element is a conserved region, of the telomere, of ~475 bp that contains an ARS sequence and in most cases an Abf1p binding site. kareneilbeck 2009-11-10T10:56:54Z X element core sequence sequence X element SO:0001497 Possible functions include roles in chromosomal segregation, maintenance of chromosome stability, recombinational sequestering, or as a barrier to transcriptional silencing. This term was requested 2009-10-16 by Michel Dumontier, tracker id 2880747. From Janos Demeter: The only region shared by all chromosome ends, the X element core sequence is a small conserved element (~475 bp) that contains an ARS sequence and in most cases an Abf1p binding site. Between these is a GC-rich region nearly identical to the meiosis-specific regulatory sequence URS1. X_element The X element is a conserved region, of the telomere, of ~475 bp that contains an ARS sequence and in most cases an Abf1p binding site. PMID:7785338 PMID:8005434 http://www.yeastgenome.org/help/glossary.html#xelemcoresequence A region of sequence from the end of a YAC clone that may provide a highly specific marker. kareneilbeck 2009-11-19T11:07:18Z YAC end sequence SO:0001498 YAC_end A region of sequence from the end of a YAC clone that may provide a highly specific marker. SO:ke The status of whole genome sequence. 
kareneilbeck 2009-10-23T12:47:47Z whole genome sequence status sequence SO:0001499 These terms and their children were added to SO in response to a tracker request by Patrick Chain. The paper "Genome Project Standards in a New Era of Sequencing", Science, October 9th 2009, addresses these terms. whole_genome_sequence_status The status of whole genome sequence. DOI:10.1126 A biological_region characterized as a single heritable trait in a phenotype screen. The heritable phenotype may be mapped to a chromosome but generally has not been characterized to a specific gene locus. kareneilbeck 2009-12-07T01:50:55Z heritable phenotypic marker phenotypic marker sequence SO:0001500 heritable_phenotypic_marker A biological_region characterized as a single heritable trait in a phenotype screen. The heritable phenotype may be mapped to a chromosome but generally has not been characterized to a specific gene locus. JAX:hdene A collection of peptide sequences. kareneilbeck 2009-12-11T10:58:58Z peptide collection peptide set sequence SO:0001501 Term requested via tracker ID: 2910829. peptide_collection A collection of peptide sequences. BBOP:nlw An experimental feature with high sequence identity to another sequence. kareneilbeck 2009-12-11T11:06:05Z high identity region sequence SO:0001502 Requested by tracker ID: 2902685. high_identity_region An experimental feature with high sequence identity to another sequence. SO:ke A transcript for which no open reading frame has been identified and for which no other function has been determined. kareneilbeck 2009-12-21T05:37:14Z processed transcript sequence SO:0001503 Ensembl and Vega also use this term name. Requested by Howard Deen of MGI. processed_transcript A transcript for which no open reading frame has been identified and for which no other function has been determined. MGI:hdeen A chromosome variation derived from an event during meiosis. 
kareneilbeck 2010-03-02T05:03:18Z sequence assortment derived variation SO:0001504 assortment_derived_variation A chromosome variation derived from an event during meiosis. SO:ke A collection of sequences (often chromosomes) taken as the standard for a given organism and genome assembly. kareneilbeck 2010-03-03T02:10:03Z sequence reference genome SO:0001505 reference_genome A collection of sequences (often chromosomes) taken as the standard for a given organism and genome assembly. SO:ke A collection of sequences (often chromosomes) of an individual. kareneilbeck 2010-03-03T02:11:25Z sequence variant genome SO:0001506 variant_genome A collection of sequences (often chromosomes) of an individual. SO:ke A collection of one or more sequences of an individual. kareneilbeck 2010-03-03T02:13:28Z sequence variant collection SO:0001507 variant_collection A collection of one or more sequences of an individual. SO:ke An attribute of alteration of one or more chromosomes. kareneilbeck 2010-03-04T02:53:23Z alteration attribute sequence SO:0001508 alteration_attribute An attribute of a change in the structure or number of chromosomes. kareneilbeck 2010-03-04T02:54:30Z chromosomal variation attribute sequence SO:0001509 chromosomal_variation_attribute A change in chromosomes that occurs between two sections of the same chromosome or between homologous chromosomes. kareneilbeck 2010-03-04T02:55:25Z sequence SO:0001510 intrachromosomal A change in chromosomes that occurs between two separate chromosomes. kareneilbeck 2010-03-04T02:55:43Z sequence SO:0001511 interchromosomal A quality of a chromosomal insertion. kareneilbeck 2010-03-04T02:55:56Z insertion attribute sequence SO:0001512 insertion_attribute A quality of a chromosomal insertion. SO:ke An insertion of extension of a tandem repeat. kareneilbeck 2010-03-04T02:56:37Z sequence SO:0001513 tandem A quality of an insertion where the insert is not in a cytologically inverted orientation. 
kareneilbeck 2010-03-04T02:56:49Z sequence SO:0001514 direct A quality of an insertion where the insert is not in a cytologically inverted orientation. SO:ke A quality of an insertion where the insert is in a cytologically inverted orientation. kareneilbeck 2010-03-04T02:57:40Z sequence SO:0001515 inverted A quality of an insertion where the insert is in a cytologically inverted orientation. SO:ke The quality of a duplication where the new region exists independently of the original. kareneilbeck 2010-03-04T02:57:51Z sequence SO:0001516 free The quality of a duplication where the new region exists independently of the original. SO:ke When a region of a chromosome is changed to the reverse order without duplication or deletion. kareneilbeck 2010-03-04T02:58:10Z inversion attribute sequence SO:0001517 inversion_attribute An inversion event that includes the centromere. kareneilbeck 2010-03-04T02:58:24Z sequence SO:0001518 pericentric An inversion event that does not include the centromere. kareneilbeck 2010-03-04T02:58:35Z sequence SO:0001519 paracentric An attribute of a translocation, which is then a region of nucleotide sequence that has translocated to a new position. The observed adjacency of two previously separated regions. kareneilbeck 2010-03-04T02:58:47Z translocation attribute sequence SO:0001520 translocaton_attribute When translocation occurs between nonhomologous chromosomes and involves an equal exchange of genetic material. kareneilbeck 2010-03-04T02:59:34Z sequence SO:0001521 reciprocal When a translocation is simply moving genetic material from one chromosome to another. kareneilbeck 2010-03-04T02:59:51Z sequence SO:0001522 insertional An attribute of a duplication, which is an insertion which derives from, or is identical in sequence to, nucleotides present at a known location in the genome. kareneilbeck 2010-03-05T01:56:33Z sequence duplication attribute SO:0001523 duplication_attribute When a genome contains an abnormal number of chromosomes. 
kareneilbeck 2010-03-05T02:21:00Z sequence chromosomally aberrant genome SO:0001524 chromosomally_aberrant_genome A region of sequence where the final nucleotide assignment differs from the original assembly due to an improvement that replaces a mistake. kareneilbeck 2010-03-09T02:16:31Z sequence assembly error correction SO:0001525 assembly_error_correction A region of sequence where the final nucleotide assignment differs from the original assembly due to an improvement that replaces a mistake. SO:ke A region of sequence where the final nucleotide assignment is different from that given by the base caller due to an improvement that replaces a mistake. kareneilbeck 2010-03-09T02:18:07Z sequence base call error correction SO:0001526 base_call_error_correction A region of sequence where the final nucleotide assignment is different from that given by the base caller due to an improvement that replaces a mistake. SO:ke A region of peptide sequence used to target the polypeptide molecule to a specific organelle. kareneilbeck 2010-03-11T02:15:05Z peptide localization signal sequence localization signal SO:0001527 peptide_localization_signal A region of peptide sequence used to target the polypeptide molecule to a specific organelle. SO:ke A polypeptide region that targets a polypeptide to the nucleus. kareneilbeck 2010-03-11T02:16:38Z http://en.wikipedia.org/wiki/Nuclear_localization_signal NLS sequence SO:0001528 nuclear_localization_signal A polypeptide region that targets a polypeptide to the nucleus. SO:ke http://en.wikipedia.org/wiki/Nuclear_localization_signal wikipedia A polypeptide region that targets a polypeptide to the endosome. kareneilbeck 2010-03-11T02:20:58Z endosomal localization signal sequence SO:0001529 endosomal_localization_signal A polypeptide region that targets a polypeptide to the endosome. SO:ke A polypeptide region that targets a polypeptide to the lysosome. 
kareneilbeck 2010-03-11T02:24:10Z lysosomal localization signal sequence SO:0001530 lysosomal_localization_signal A polypeptide region that targets a polypeptide to the lysosome. SO:ke A polypeptide region that targets a polypeptide to the cytoplasm. kareneilbeck 2010-03-11T02:25:25Z http://en.wikipedia.org/wiki/Nuclear_export_signal NES nuclear export signal sequence SO:0001531 nuclear_export_signal A polypeptide region that targets a polypeptide to the cytoplasm. SO:ke A region recognized by a recombinase. kareneilbeck 2010-03-11T03:16:47Z http://en.wikipedia.org/wiki/Recombination_Signal_Sequences sequence recombination signal sequence SO:0001532 recombination_signal_sequence A region recognized by a recombinase. SO:ke http://en.wikipedia.org/wiki/Recombination_Signal_Sequences wikipedia A splice site that is in part of the transcript not normally spliced. They occur via mutation or transcriptional error. kareneilbeck 2010-03-11T03:25:06Z cryptic splice site sequence cryptic splice signal SO:0001533 cryptic_splice_site A splice site that is in part of the transcript not normally spliced. They occur via mutation or transcriptional error. SO:ke A polypeptide region that targets a polypeptide to the nuclear rim. kareneilbeck 2010-03-11T03:31:30Z PMID:16027110 sequence nuclear rim localization signal SO:0001534 nuclear_rim_localization_signal A polypeptide region that targets a polypeptide to the nuclear rim. SO:ke A P-element is a DNA transposon responsible for hybrid dysgenesis. P elements in this terminal inverted repeat (TIR) transposon superfamily have 31 bp perfect TIR and upon insertion duplicate an 8 bp sequence. It contains transposase that may lack the DDE domain. 
kareneilbeck 2010-03-12T03:40:33Z DTP transposon P TIR transposon P element P transposable element P-element sequence SO:0001535 Moved from under DNA_transposon (SO:0000182) by Dave Sant as per request from GitHub issue #488 on June 25, 2020 P_TIR_transposon A P-element is a DNA transposon responsible for hybrid dysgenesis. P elements in this terminal inverted repeat (TIR) transposon superfamily have 31 bp perfect TIR and upon insertion duplicate an 8 bp sequence. It contains transposase that may lack the DDE domain. PMID:6309410 SO:ke A variant whereby the effect is evaluated with respect to a reference. kareneilbeck 2010-03-22T11:30:25Z functional effect variant functional variant sequence SO:0001536 Updated after request from Lea Starita, lea.starita@gmail.com from the NCBI. functional_effect_variant A variant whereby the effect is evaluated with respect to a reference. SO:ke A sequence variant that changes one or more structural features. kareneilbeck 2010-03-22T11:31:01Z http://vat.gersteinlab.org/formats.php Jannovar:structural_variant VAT:svOverlap sequence structural variant SO:0001537 structural_variant A sequence variant that changes one or more structural features. SO:ke http://vat.gersteinlab.org/formats.php VAT Jannovar:structural_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VAT:svOverlap A sequence variant which alters the functioning of a transcript with respect to a reference sequence. kareneilbeck 2010-03-22T11:32:58Z transcript function variant sequence SO:0001538 transcript_function_variant A sequence variant which alters the functioning of a transcript with respect to a reference sequence. SO:ke A sequence variant that affects the functioning of a translational product with respect to a reference sequence. 
kareneilbeck 2010-03-22T11:46:15Z translational product variant sequence SO:0001539 translational_product_function_variant A sequence variant that affects the functioning of a translational product with respect to a reference sequence. SO:ke A sequence variant which alters the level of a transcript. kareneilbeck 2010-03-22T11:47:07Z level of transcript variant sequence SO:0001540 level_of_transcript_variant A sequence variant which alters the level of a transcript. SO:ke A sequence variant that decreases the level of mature, spliced and processed RNA with respect to a reference sequence. kareneilbeck 2010-03-22T11:47:47Z decreased transcript level sequence SO:0001541 decreased_transcript_level_variant A sequence variant that decreases the level of mature, spliced and processed RNA with respect to a reference sequence. SO:ke A sequence variant that increases the level of mature, spliced and processed RNA with respect to a reference sequence. kareneilbeck 2010-03-22T11:48:17Z increased transcript level variant sequence SO:0001542 increased_transcript_level_variant A sequence variant that increases the level of mature, spliced and processed RNA with respect to a reference sequence. SO:ke A sequence variant that affects the post transcriptional processing of a transcript with respect to a reference sequence. kareneilbeck 2010-03-22T11:48:48Z transcript processing variant sequence SO:0001543 transcript_processing_variant A sequence variant that affects the post transcriptional processing of a transcript with respect to a reference sequence. SO:ke A transcript processing variant whereby the process of editing is disrupted with respect to the reference. kareneilbeck 2010-03-22T11:49:25Z editing variant sequence SO:0001544 editing_variant A transcript processing variant whereby the process of editing is disrupted with respect to the reference. SO:ke A sequence variant that changes polyadenylation with respect to a reference sequence. 
kareneilbeck 2010-03-22T11:49:40Z polyadenylation variant sequence SO:0001545 polyadenylation_variant A sequence variant that changes polyadenylation with respect to a reference sequence. SO:ke A variant that changes the stability of a transcript with respect to a reference sequence. kareneilbeck 2010-03-22T11:50:01Z transcript stability variant sequence SO:0001546 transcript_stability_variant A variant that changes the stability of a transcript with respect to a reference sequence. SO:ke A sequence variant that decreases transcript stability with respect to a reference sequence. kareneilbeck 2010-03-22T11:50:23Z decrease transcript stability variant sequence SO:0001547 decreased_transcript_stability_variant A sequence variant that decreases transcript stability with respect to a reference sequence. SO:ke A sequence variant that increases transcript stability with respect to a reference sequence. kareneilbeck 2010-03-22T11:50:39Z increased transcript stability variant sequence SO:0001548 increased_transcript_stability_variant A sequence variant that increases transcript stability with respect to a reference sequence. SO:ke A variant that alters the transcription of a transcript with respect to a reference sequence. kareneilbeck 2010-03-22T11:51:26Z transcription variant sequence SO:0001549 transcription_variant A variant that alters the transcription of a transcript with respect to a reference sequence. SO:ke A sequence variant that changes the rate of transcription with respect to a reference sequence. kareneilbeck 2010-03-22T11:51:50Z rate of transcription variant sequence SO:0001550 rate_of_transcription_variant A sequence variant that changes the rate of transcription with respect to a reference sequence. SO:ke A sequence variant that increases the rate of transcription with respect to a reference sequence.
kareneilbeck 2010-03-22T11:52:17Z increased transcription rate variant sequence SO:0001551 increased_transcription_rate_variant A sequence variant that increases the rate of transcription with respect to a reference sequence. SO:ke A sequence variant that decreases the rate of transcription with respect to a reference sequence. kareneilbeck 2010-03-22T11:52:43Z decreased transcription rate variant sequence SO:0001552 decreased_transcription_rate_variant A sequence variant that decreases the rate of transcription with respect to a reference sequence. SO:ke A functional variant that changes the translational product level with respect to a reference sequence. kareneilbeck 2010-03-22T11:53:32Z translational product level variant sequence SO:0001553 translational_product_level_variant A functional variant that changes the translational product level with respect to a reference sequence. SO:ke A sequence variant which changes polypeptide functioning with respect to a reference sequence. kareneilbeck 2010-03-22T11:53:54Z polypeptide function variant sequence SO:0001554 polypeptide_function_variant A sequence variant which changes polypeptide functioning with respect to a reference sequence. SO:ke A sequence variant which decreases the translational product level with respect to a reference sequence. kareneilbeck 2010-03-22T11:54:25Z decrease translational product level sequence SO:0001555 decreased_translational_product_level A sequence variant which decreases the translational product level with respect to a reference sequence. SO:ke A sequence variant which increases the translational product level with respect to a reference sequence. kareneilbeck 2010-03-22T11:55:25Z increase translational product level sequence SO:0001556 increased_translational_product_level A sequence variant which increases the translational product level with respect to a reference sequence. SO:ke A sequence variant which causes gain of polypeptide function with respect to a reference sequence. 
kareneilbeck 2010-03-22T11:56:12Z polypeptide gain of function variant sequence SO:0001557 polypeptide_gain_of_function_variant A sequence variant which causes gain of polypeptide function with respect to a reference sequence. SO:ke A sequence variant which changes the localization of a polypeptide with respect to a reference sequence. kareneilbeck 2010-03-22T11:56:37Z polypeptide localization variant sequence SO:0001558 polypeptide_localization_variant A sequence variant which changes the localization of a polypeptide with respect to a reference sequence. SO:ke A sequence variant that causes the loss of a polypeptide function with respect to a reference sequence. kareneilbeck 2010-03-22T11:56:58Z polypeptide loss of function variant sequence SO:0001559 polypeptide_loss_of_function_variant A sequence variant that causes the loss of a polypeptide function with respect to a reference sequence. SO:ke A sequence variant that causes the inactivation of a ligand binding site with respect to a reference sequence. kareneilbeck 2010-03-22T11:58:00Z inactive ligand binding site sequence SO:0001560 inactive_ligand_binding_site A sequence variant that causes the inactivation of a ligand binding site with respect to a reference sequence. SO:ke A sequence variant that causes some but not all loss of polypeptide function with respect to a reference sequence. kareneilbeck 2010-03-22T11:58:32Z polypeptide partial loss of function sequence SO:0001561 polypeptide_partial_loss_of_function A sequence variant that causes some but not all loss of polypeptide function with respect to a reference sequence. SO:ke A sequence variant that causes a change in post translational processing of the peptide with respect to a reference sequence. 
kareneilbeck 2010-03-22T11:59:06Z polypeptide post translational processing variant sequence SO:0001562 polypeptide_post_translational_processing_variant A sequence variant that causes a change in post translational processing of the peptide with respect to a reference sequence. SO:ke A sequence variant where copies of a feature (CNV) are either increased or decreased. kareneilbeck 2010-03-22T02:27:33Z copy number change sequence SO:0001563 copy_number_change A sequence variant where copies of a feature (CNV) are either increased or decreased. SO:ke A sequence variant where the structure of the gene is changed. kareneilbeck 2010-03-22T02:28:01Z http://snpeff.sourceforge.net/SnpEff_manual.html Jannovar:gene_variant VAAST:gene_variant gene structure variant snpEff:GENE sequence SO:0001564 gene_variant A sequence variant where the structure of the gene is changed. SO:ke Jannovar:gene_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VAAST:gene_variant snpEff:GENE A sequence variant whereby two genes have become joined. kareneilbeck 2010-03-22T02:28:28Z gene fusion sequence SO:0001565 gene_fusion A sequence variant whereby two genes have become joined. SO:ke A sequence variant located within a regulatory region. kareneilbeck 2010-03-22T02:28:48Z http://snpeff.sourceforge.net/SnpEff_manual.html http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:regulatory_region_variant VEP:regulatory_region_variant regulatory region variant regulatory_region_ snpEff:REGULATION sequence SO:0001566 EBI term: Regulatory region variations - In regulatory region annotated by Ensembl. regulatory_region_variant A sequence variant located within a regulatory region.
SO:ke Jannovar:regulatory_region_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VEP:regulatory_region_variant regulatory_region_ http://ensembl.org/info/docs/variation/index.html snpEff:REGULATION A sequence variant where at least one base in the terminator codon is changed, but the terminator remains. kareneilbeck 2010-04-19T05:02:30Z http://snpeff.sourceforge.net/SnpEff_manual.html http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:stop_retained_variant VAAST:stop_retained VAAST:stop_retained_variant VEP:stop_retained_variant snpEff:NON_SYNONYMOUS_STOP snpEff:SYNONYMOUS_STOP stop retained variant sequence SO:0001567 stop_retained_variant A sequence variant where at least one base in the terminator codon is changed, but the terminator remains. SO:ke Jannovar:stop_retained_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VAAST:stop_retained VAAST:stop_retained_variant VEP:stop_retained_variant snpEff:NON_SYNONYMOUS_STOP snpEff:SYNONYMOUS_STOP A sequence variant that changes the process of splicing. kareneilbeck 2010-03-22T02:29:22Z Jannovar:splicing_variant splicing variant sequence SO:0001568 splicing_variant A sequence variant that changes the process of splicing. SO:ke Jannovar:splicing_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html A sequence variant causing a new (functional) splice site. kareneilbeck 2010-03-22T02:29:41Z cryptic splice site activation sequence SO:0001569 cryptic_splice_site_variant A sequence variant causing a new (functional) splice site. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html A sequence variant whereby a new splice site is created due to the activation of a new acceptor. kareneilbeck 2010-03-22T02:30:11Z cryptic splice acceptor sequence SO:0001570 cryptic_splice_acceptor A sequence variant whereby a new splice site is created due to the activation of a new acceptor. 
SO:ke A sequence variant whereby a new splice site is created due to the activation of a new donor. kareneilbeck 2010-03-22T02:30:35Z cryptic splice donor sequence SO:0001571 cryptic_splice_donor A sequence variant whereby a new splice site is created due to the activation of a new donor. SO:ke A sequence variant whereby an exon is lost from the transcript. kareneilbeck 2010-03-22T02:31:09Z http://snpeff.sourceforge.net/SnpEff_manual.html Jannovar:exon_loss_variant exon loss snpEff:EXON_DELETED sequence SO:0001572 exon_loss_variant A sequence variant whereby an exon is lost from the transcript. SO:ke Jannovar:exon_loss_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html snpEff:EXON_DELETED A sequence variant whereby an intron is gained by the processed transcript; usually a result of an alteration of the donor or acceptor. kareneilbeck 2010-03-22T02:31:25Z intron gain intron gain variant sequence SO:0001573 intron_gain_variant A sequence variant whereby an intron is gained by the processed transcript; usually a result of an alteration of the donor or acceptor. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html A splice variant that changes the 2 base region at the 3' end of an intron. kareneilbeck 2010-03-22T02:31:52Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp http://snpeff.sourceforge.net/SnpEff_manual.html http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:splice_acceptor_variant Seattleseq:splice-acceptor VAAST:splice_acceptor_variant VEP:splice_acceptor_variant snpEff:SPLICE_SITE_ACCEPTOR splice acceptor variant sequence SO:0001574 splice_acceptor_variant A splice variant that changes the 2 base region at the 3' end of an intron. 
SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq Jannovar:splice_acceptor_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:splice-acceptor VAAST:splice_acceptor_variant VEP:splice_acceptor_variant snpEff:SPLICE_SITE_ACCEPTOR A splice variant that changes the 2 base pair region at the 5' end of an intron. kareneilbeck 2010-03-22T02:32:10Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp http://snpeff.sourceforge.net/SnpEff_manual.html http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:splice_donor_variant Seattleseq:splice-donor VAAST:splice_donor_variant VEP:splice_donor_variant snpEff:SPLICE_SITE_DONOR splice donor variant sequence SO:0001575 splice_donor_variant A splice variant that changes the 2 base pair region at the 5' end of an intron. SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq Jannovar:splice_donor_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:splice-donor VAAST:splice_donor_variant VEP:splice_donor_variant snpEff:SPLICE_SITE_DONOR A sequence variant that changes the structure of the transcript. kareneilbeck 2010-03-22T02:32:41Z http://snpeff.sourceforge.net/SnpEff_manual.html Jannovar:transcript_variant VAAST:transcript_variant snpEff:TRANSCRIPT transcript variant sequence SO:0001576 transcript_variant A sequence variant that changes the structure of the transcript. SO:ke Jannovar:transcript_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VAAST:transcript_variant snpEff:TRANSCRIPT A transcript variant with a complex INDEL- Insertion or deletion that spans an exon/intron border or a coding sequence/UTR border. 
kareneilbeck 2010-03-22T02:33:03Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp complex change in transcript complex transcript variant complex_indel sequence Seattleseq:codingComplex Seattleseq:codingComplex-near-splice SO:0001577 EBI term: Complex InDel - Insertion or deletion that spans an exon/intron border or a coding sequence/UTR border. complex_transcript_variant A transcript variant with a complex INDEL- Insertion or deletion that spans an exon/intron border or a coding sequence/UTR border. http://ensembl.org/info/docs/variation/index.html http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq complex_indel http://ensembl.org/info/docs/variation/index.html Seattleseq:codingComplex Seattleseq:codingComplex-near-splice A sequence variant where at least one base of the terminator codon (stop) is changed, resulting in an elongated transcript. kareneilbeck 2010-03-23T03:46:42Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp http://snpeff.sourceforge.net/SnpEff_manual.html http://vat.gersteinlab.org/formats.php http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences ANNOVAR:stoploss Jannovar:stop_lost Seattleseq:stop-lost VAAST:stop_lost VAT:removedStop VEP:stop_lost snpEff:STOP_LOST stop codon lost stop lost sequence Seattleseq:stop-lost-near-splice SO:0001578 EBI term: Stop lost - In coding sequence, resulting in the loss of a stop codon. stop_lost A sequence variant where at least one base of the terminator codon (stop) is changed, resulting in an elongated transcript. 
SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq http://vat.gersteinlab.org/formats.php VAT ANNOVAR:stoploss http://www.openbioinformatics.org/annovar/annovar_download.html Jannovar:stop_lost http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:stop-lost VAAST:stop_lost VAT:removedStop VEP:stop_lost snpEff:STOP_LOST stop lost http://ensembl.org/info/docs/variation/index.html Seattleseq:stop-lost-near-splice transcript sequence variant sequence SO:0001579 transcript_sequence_variant true A sequence variant that changes the coding sequence. kareneilbeck 2010-03-22T02:34:36Z SO:0001581 http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp http://snpeff.sourceforge.net/SnpEff_manual.html http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:coding_sequence_variant Seattleseq:coding VAAST:coding_sequence_variant VEP:coding_sequence_variant coding sequence variant coding variant codon variant codon_variant snpEff:CDS sequence snpEff:CODON_CHANGE SO:0001580 coding_sequence_variant A sequence variant that changes the coding sequence. SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq Jannovar:coding_sequence_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:coding VAAST:coding_sequence_variant VEP:coding_sequence_variant snpEff:CDS snpEff:CODON_CHANGE true A codon variant that changes at least one base of the first codon of a transcript. kareneilbeck 2010-03-22T02:35:18Z http://snpeff.sourceforge.net/SnpEff_manual.html http://vat.gersteinlab.org/formats.php loinc:LA6695-6 Jannovar:initiator_codon_variant VAT:startOverlap initiatior codon variant initiator codon change sequence snpEff:NON_SYNONYMOUS_START SO:0001582 This is being used to annotate changes to the first codon of a transcript, when the first annotated codon is not to methionine. 
A variant is predicted to change the first amino acid of a translation irrespective of the fact that the underlying codon is an AUG. As such for transcripts with an incomplete CDS (sequence does not start with an AUG), it is still called. initiator_codon_variant A codon variant that changes at least one base of the first codon of a transcript. SO:ke http://vat.gersteinlab.org/formats.php VAT loinc:LA6695-6 Initiating Methionine Jannovar:initiator_codon_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VAT:startOverlap snpEff:NON_SYNONYMOUS_START A sequence variant, that changes one or more bases, resulting in a different amino acid sequence but where the length is preserved. kareneilbeck 2010-03-22T02:35:49Z SO:0001584 SO:0001783 http://en.wikipedia.org/wiki/Missense_mutation http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp http://snpeff.sourceforge.net/SnpEff_manual.html http://vat.gersteinlab.org/formats.php http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences loinc:LA6698-0 Jannovar:missense_variant Seattleseq:missense VAAST:missense_variant VAT:nonsynonymous VEP:missense_variant missense missense codon snpEff:NON_SYNONYMOUS_CODING sequence ANNOVAR:nonsynonymous SNV Seattleseq:missense-near-splice VAAST:non_synonymous_codon SO:0001583 EBI term: Non-synonymous SNPs. SNPs that are located in the coding sequence and result in an amino acid change in the encoded peptide sequence. A change that causes a non_synonymous_codon can be more than 3 bases - for example 4 base substitution. missense_variant A sequence variant, that changes one or more bases, resulting in a different amino acid sequence but where the length is preserved. 
EBI:fc EBI:gr SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq http://vat.gersteinlab.org/formats.php VAT loinc:LA6698-0 Missense Jannovar:missense_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:missense VAAST:missense_variant VAT:nonsynonymous VEP:missense_variant missense ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd snpEff:NON_SYNONYMOUS_CODING ANNOVAR:nonsynonymous SNV http://www.openbioinformatics.org/annovar/annovar_download.html Seattleseq:missense-near-splice VAAST:non_synonymous_codon true A sequence variant whereby at least one base of a codon is changed resulting in a codon that encodes for a different but similar amino acid. These variants may or may not be deleterious. kareneilbeck 2010-03-22T02:36:40Z conservative missense codon conservative missense variant sequence neutral missense codon quiet missense codon SO:0001585 conservative_missense_variant A sequence variant whereby at least one base of a codon is changed resulting in a codon that encodes for a different but similar amino acid. These variants may or may not be deleterious. SO:ke A sequence variant whereby at least one base of a codon is changed resulting in a codon that encodes for an amino acid with different biochemical properties. kareneilbeck 2010-03-22T02:37:16Z non conservative missense codon non conservative missense variant sequence SO:0001586 non_conservative_missense_variant A sequence variant whereby at least one base of a codon is changed resulting in a codon that encodes for an amino acid with different biochemical properties. SO:ke A sequence variant whereby at least one base of a codon is changed, resulting in a premature stop codon, leading to a shortened polypeptide. 
kareneilbeck 2010-03-22T02:37:52Z http://ensembl.org/info/genome/variation/prediction/predicted_data.html http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp http://snpeff.sourceforge.net/SnpEff_manual.html http://vat.gersteinlab.org/formats.php loinc:LA6699-8 ANNOVAR:stopgain Jannovar:stop_gained Seattleseq:stop-gained VAAST:stop_gained VAT:prematureStop VEP:stop_gained nonsense nonsense codon snpEff:STOP_GAINED stop gained sequence Seattleseq:stop-gained-near-splice stop codon gained SO:0001587 EBI term: Stop gained - In coding sequence, resulting in the gain of a stop codon (i.e. leading to a shortened peptide sequence). stop_gained A sequence variant whereby at least one base of a codon is changed, resulting in a premature stop codon, leading to a shortened polypeptide. SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq http://vat.gersteinlab.org/formats.php VAT loinc:LA6699-8 Nonsense ANNOVAR:stopgain http://www.openbioinformatics.org/annovar/annovar_download.html Jannovar:stop_gained http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:stop-gained VAAST:stop_gained VAT:prematureStop VEP:stop_gained nonsense ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd snpEff:STOP_GAINED stop gained http://ensembl.org/info/docs/variation/index.html Seattleseq:stop-gained-near-splice true A sequence variant which causes a disruption of the translational reading frame, because the number of nucleotides inserted or deleted is not a multiple of three. 
kareneilbeck 2010-03-22T02:40:19Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp http://snpeff.sourceforge.net/SnpEff_manual.html http://vat.gersteinlab.org/formats.php http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences loinc:LA6694-9 Jannovar:frameshift_variant Seattleseq:frameshift VAAST:frameshift_variant VEP:frameshift_variant frameshift variant frameshift_ frameshift_coding snpEff:FRAME_SHIFT VAT:deletionFS VAT:insertionFS sequence ANNOVAR:frameshift block substitution ANNOVAR:frameshift substitution Seattleseq:frameshift-near-splice SO:0001589 EBI term:Frameshift variations - In coding sequence, resulting in a frameshift. frameshift_variant A sequence variant which causes a disruption of the translational reading frame, because the number of nucleotides inserted or deleted is not a multiple of three. SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq http://vat.gersteinlab.org/formats.php VAT loinc:LA6694-9 Frameshift Jannovar:frameshift_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:frameshift VAAST:frameshift_variant VEP:frameshift_variant frameshift_ ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd frameshift_coding http://ensembl.org/info/docs/variation/index.html snpEff:FRAME_SHIFT VAT:deletionFS VAT:insertionFS ANNOVAR:frameshift block substitution http://www.openbioinformatics.org/annovar/annovar_download.html Seattleseq:frameshift-near-splice A sequence variant whereby at least one of the bases in the terminator codon is changed. kareneilbeck 2010-03-22T02:40:37Z SO:0001625 http://vat.gersteinlab.org/formats.php loinc:LA6700-2 VAT:endOverlap terminal codon variant terminal_codon_variant terminator codon variant sequence SO:0001590 The terminal codon may be the terminator, or in an incomplete transcript the last available codon. 
terminator_codon_variant A sequence variant whereby at least one of the bases in the terminator codon is changed. SO:ke http://vat.gersteinlab.org/formats.php VAT loinc:LA6700-2 Stop Codon Mutation VAT:endOverlap A sequence variant that reverts the sequence of a previous frameshift mutation back to the initial frame. kareneilbeck 2010-03-22T02:41:09Z frame restoring variant sequence SO:0001591 frame_restoring_variant A sequence variant that reverts the sequence of a previous frameshift mutation back to the initial frame. SO:ke A sequence variant which causes a disruption of the translational reading frame, by shifting one base ahead. kareneilbeck 2010-03-22T02:41:30Z -1 frameshift variant minus 1 frameshift variant sequence SO:0001592 minus_1_frameshift_variant A sequence variant which causes a disruption of the translational reading frame, by shifting one base ahead. http://arjournals.annualreviews.org/doi/pdf/10.1146/annurev.ge.08.120174.001535 A sequence variant which causes a disruption of the translational reading frame, by shifting two bases forward. kareneilbeck 2010-03-22T02:41:52Z -2 frameshift variant minus 2 frameshift variant sequence SO:0001593 minus_2_frameshift_variant A sequence variant which causes a disruption of the translational reading frame, by shifting one base backward. kareneilbeck 2010-03-22T02:42:06Z +1 frameshift variant plus 1 frameshift variant sequence SO:0001594 plus_1_frameshift_variant A sequence variant which causes a disruption of the translational reading frame, by shifting one base backward. http://arjournals.annualreviews.org/doi/pdf/10.1146/annurev.ge.08.120174.001535 A sequence variant which causes a disruption of the translational reading frame, by shifting two bases backward. kareneilbeck 2010-03-22T02:42:23Z +2 frameshift variant plus 2 frameshift variant sequence SO:0001595 plus_2_frameshift_variant A sequence variant within a transcript that changes the secondary structure of the RNA product. 
kareneilbeck 2010-03-22T02:43:18Z transcript secondary structure variant sequence SO:0001596 transcript_secondary_structure_variant A sequence variant within a transcript that changes the secondary structure of the RNA product. SO:ke A secondary structure variant that compensates for the change made by a previous variant. kareneilbeck 2010-03-22T02:43:54Z compensatory transcript secondary structure variant sequence SO:0001597 compensatory_transcript_secondary_structure_variant A secondary structure variant that compensates for the change made by a previous variant. SO:ke A sequence variant within the transcript that changes the structure of the translational product. kareneilbeck 2010-03-22T02:44:17Z translational product structure variant sequence SO:0001598 translational_product_structure_variant A sequence variant within the transcript that changes the structure of the translational product. SO:ke A sequence variant that changes the resulting polypeptide structure. kareneilbeck 2010-03-22T02:44:46Z 3D polypeptide structure variant sequence SO:0001599 3D_polypeptide_structure_variant A sequence variant that changes the resulting polypeptide structure. SO:ke A sequence variant that changes the resulting polypeptide structure. kareneilbeck 2010-03-22T02:45:13Z complex 3D structural variant sequence SO:0001600 complex_3D_structural_variant A sequence variant that changes the resulting polypeptide structure. SO:ke A sequence variant in the CDS region that causes a conformational change in the resulting polypeptide sequence. kareneilbeck 2010-03-22T02:45:48Z conformational change variant sequence SO:0001601 conformational_change_variant A sequence variant in the CDS region that causes a conformational change in the resulting polypeptide sequence. SO:ke A variant that changes the translational product with respect to the reference.
kareneilbeck 2010-03-22T02:46:54Z complex change of translational product variant sequence SO:0001602 complex_change_of_translational_product_variant A sequence variant within the CDS that causes a change in the resulting polypeptide sequence. kareneilbeck 2010-03-22T02:47:13Z polypeptide sequence variant sequence SO:0001603 polypeptide_sequence_variant A sequence variant within the CDS that causes a change in the resulting polypeptide sequence. SO:ke A sequence variant within a CDS resulting in the loss of an amino acid from the resulting polypeptide. kareneilbeck 2010-03-22T02:47:36Z amino acid deletion sequence SO:0001604 amino_acid_deletion A sequence variant within a CDS resulting in the loss of an amino acid from the resulting polypeptide. SO:ke A sequence variant within a CDS resulting in the gain of an amino acid to the resulting polypeptide. kareneilbeck 2010-03-22T02:47:56Z amino acid insertion sequence SO:0001605 amino_acid_insertion A sequence variant within a CDS resulting in the gain of an amino acid to the resulting polypeptide. SO:ke A sequence variant of a codon resulting in the substitution of one amino acid for another in the resulting polypeptide. kareneilbeck 2010-03-22T02:48:17Z VAAST:amino_acid_substitution amino acid substitution sequence SO:0001606 amino_acid_substitution A sequence variant of a codon resulting in the substitution of one amino acid for another in the resulting polypeptide. SO:ke VAAST:amino_acid_substitution A sequence variant of a codon causing the substitution of a similar amino acid for another in the resulting polypeptide. kareneilbeck 2010-03-22T02:48:57Z conservative amino acid substitution sequence SO:0001607 conservative_amino_acid_substitution A sequence variant of a codon causing the substitution of a similar amino acid for another in the resulting polypeptide. SO:ke A sequence variant of a codon causing the substitution of a non conservative amino acid for another in the resulting polypeptide.
kareneilbeck 2010-03-22T02:49:23Z non conservative amino acid substitution sequence SO:0001608 non_conservative_amino_acid_substitution A sequence variant of a codon causing the substitution of a non conservative amino acid for another in the resulting polypeptide. SO:ke An elongation of a polypeptide sequence deriving from a sequence variant extending the CDS. kareneilbeck 2010-03-22T02:49:52Z elongated polypeptide sequence SO:0001609 elongated_polypeptide An elongation of a polypeptide sequence deriving from a sequence variant extending the CDS. SO:ke An elongation of a polypeptide sequence at the C terminus deriving from a sequence variant extending the CDS. kareneilbeck 2010-03-22T02:50:20Z elongated polypeptide C terminal sequence SO:0001610 elongated_polypeptide_C_terminal An elongation of a polypeptide sequence at the C terminus deriving from a sequence variant extending the CDS. SO:ke An elongation of a polypeptide sequence at the N terminus deriving from a sequence variant extending the CDS. kareneilbeck 2010-03-22T02:50:31Z elongated polypeptide N terminal sequence SO:0001611 elongated_polypeptide_N_terminal An elongation of a polypeptide sequence at the N terminus deriving from a sequence variant extending the CDS. SO:ke A sequence variant within the CDS that causes in frame elongation of the resulting polypeptide sequence at the C terminus. kareneilbeck 2010-03-22T02:51:05Z elongated in frame polypeptide C terminal sequence SO:0001612 elongated_in_frame_polypeptide_C_terminal A sequence variant within the CDS that causes in frame elongation of the resulting polypeptide sequence at the C terminus. SO:ke A sequence variant within the CDS that causes out of frame elongation of the resulting polypeptide sequence at the C terminus.
kareneilbeck 2010-03-22T02:51:20Z elongated polypeptide out of frame C terminal sequence SO:0001613 elongated_out_of_frame_polypeptide_C_terminal A sequence variant within the CDS that causes out of frame elongation of the resulting polypeptide sequence at the C terminus. SO:ke A sequence variant within the CDS that causes in frame elongation of the resulting polypeptide sequence at the N terminus. kareneilbeck 2010-03-22T02:51:49Z elongated in frame polypeptide N terminal sequence SO:0001614 elongated_in_frame_polypeptide_N_terminal_elongation A sequence variant within the CDS that causes in frame elongation of the resulting polypeptide sequence at the N terminus. SO:ke A sequence variant within the CDS that causes out of frame elongation of the resulting polypeptide sequence at the N terminus. kareneilbeck 2010-03-22T02:52:05Z elongated out of frame N terminal sequence SO:0001615 elongated_out_of_frame_polypeptide_N_terminal A sequence variant within the CDS that causes out of frame elongation of the resulting polypeptide sequence at the N terminus. SO:ke A sequence variant that causes a fusion of two polypeptide sequences. kareneilbeck 2010-03-22T02:52:43Z polypeptide fusion sequence SO:0001616 polypeptide_fusion A sequence variant that causes a fusion of two polypeptide sequences. SO:ke A sequence variant of the CDS that causes a truncation of the resulting polypeptide. kareneilbeck 2010-03-22T02:53:07Z polypeptide truncation sequence SO:0001617 polypeptide_truncation A sequence variant of the CDS that causes a truncation of the resulting polypeptide. SO:ke A sequence variant that causes the inactivation of a catalytic site with respect to a reference sequence. kareneilbeck 2010-03-22T03:06:14Z inactive catalytic site sequence SO:0001618 inactive_catalytic_site A sequence variant that causes the inactivation of a catalytic site with respect to a reference sequence. SO:ke A transcript variant of a non coding RNA gene. 
kareneilbeck 2010-03-23T11:16:23Z http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:non_coding_transcript_variant VEP:non_coding_transcript_variant nc transcript variant non coding transcript variant within_non_coding_gene ANNOVAR:ncRNA sequence SO:0001619 Within non-coding gene - Located within a gene that does not code for a protein. non_coding_transcript_variant A transcript variant of a non coding RNA gene. SO:ke Jannovar:non_coding_transcript_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VEP:non_coding_transcript_variant within_non_coding_gene http://ensembl.org/info/docs/variation/index.html ANNOVAR:ncRNA http://annovar.openbioinformatics.org/en/latest/user-guide/gene/ A transcript variant located within the sequence of the mature miRNA. kareneilbeck 2010-03-23T11:16:58Z http://snpeff.sourceforge.net/SnpEff_manual.html VEP:mature_miRNA_variant mature miRNA variant snpEff:MICRO_RNA within_mature_miRNA sequence SO:0001620 EBI term: Within mature miRNA - Located within a microRNA. mature_miRNA_variant A transcript variant located within the sequence of the mature miRNA. SO:ke VEP:mature_miRNA_variant snpEff:MICRO_RNA within_mature_miRNA http://ensembl.org/info/docs/variation/index.html A variant in a transcript that is the target of nonsense-mediated mRNA decay. kareneilbeck 2010-03-23T11:20:40Z http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences NMD transcript variant NMD_transcript Nonsense mediated decay transcript variant VEP:NMD_transcript_variant sequence SO:0001621 NMD_transcript_variant A variant in a transcript that is the target of nonsense-mediated mRNA decay. SO:ke NMD_transcript http://ensembl.org/info/docs/variation/index.html VEP:NMD_transcript_variant A transcript variant that is located within the UTR. kareneilbeck 2010-03-23T11:22:58Z UTR variant UTR_ sequence SO:0001622 UTR_variant A transcript variant that is located within the UTR. 
SO:ke UTR_ http://ensembl.org/info/docs/variation/index.html A UTR variant of the 5' UTR. kareneilbeck 2010-03-23T11:23:29Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp http://snpeff.sourceforge.net/SnpEff_manual.html http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences 5'UTR variant 5PRIME_UTR Jannovar:5_prime_utr_variant Seattleseq:5-prime-UTR VAAST:5_prime_UTR_variant VAAST:five_prime_UTR_variant VEP:5_prime_UTR_variant five prime UTR variant snpEff:UTR_5_PRIME untranslated-5 sequence ANNOVAR:UTR5 SO:0001623 EBI term: 5prime UTR variations - In 5prime UTR (untranslated region). 5_prime_UTR_variant A UTR variant of the 5' UTR. SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq 5PRIME_UTR http://ensembl.org/info/docs/variation/index.html Jannovar:5_prime_utr_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:5-prime-UTR VAAST:5_prime_UTR_variant VAAST:five_prime_UTR_variant VEP:5_prime_UTR_variant snpEff:UTR_5_PRIME untranslated-5 ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd ANNOVAR:UTR5 http://www.openbioinformatics.org/annovar/annovar_download.html A UTR variant of the 3' UTR. kareneilbeck 2010-03-23T11:23:54Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp http://snpeff.sourceforge.net/SnpEff_manual.html http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences 3'UTR variant 3PRIME_UTR Jannovar:3_prime_utr_variant Seattleseq:3-prime-UTR VAAST:3_prime_UTR_variant VAAST:three_prime_UTR_variant VEP:3_prime_UTR_variant snpEff:UTR_3_PRIME three prime UTR variant untranslated-3 sequence ANNOVAR:UTR3 SO:0001624 EBI term 3prime UTR variations - In 3prime UTR. 3_prime_UTR_variant A UTR variant of the 3' UTR. 
SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq 3PRIME_UTR http://ensembl.org/info/docs/variation/index.html Jannovar:3_prime_utr_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:3-prime-UTR VAAST:3_prime_UTR_variant VAAST:three_prime_UTR_variant VEP:3_prime_UTR_variant snpEff:UTR_3_PRIME untranslated-3 ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd ANNOVAR:UTR3 http://www.openbioinformatics.org/annovar/annovar_download.html true A sequence variant where at least one base of the final codon of an incompletely annotated transcript is changed. kareneilbeck 2010-03-23T03:51:15Z http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences VEP:incomplete_terminal_codon_variant incomplete terminal codon variant partial_codon sequence SO:0001626 EBI term: Partial codon - Located within the final, incomplete codon of a transcript with a shortened coding sequence where the end is unknown. incomplete_terminal_codon_variant A sequence variant where at least one base of the final codon of an incompletely annotated transcript is changed. SO:ke VEP:incomplete_terminal_codon_variant partial_codon http://ensembl.org/info/docs/variation/index.html A transcript variant occurring within an intron. kareneilbeck 2010-03-23T03:52:38Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp http://snpeff.sourceforge.net/SnpEff_manual.html http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:intron_variant Seattleseq:intron VAAST:intron_variant VEP:intron_variant intron variant intron_ intronic snpEff:INTRON sequence ANNOVAR:intronic Seattleseq:intron-near-splice SO:0001627 EBI term: Intronic variations - In intron. intron_variant A transcript variant occurring within an intron. 
SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq Jannovar:intron_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:intron VAAST:intron_variant VEP:intron_variant intron_ ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd intronic http://ensembl.org/info/docs/variation/index.html snpEff:INTRON ANNOVAR:intronic http://www.openbioinformatics.org/annovar/annovar_download.html Seattleseq:intron-near-splice A sequence variant located in the intergenic region, between genes. kareneilbeck 2010-03-23T05:07:37Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp http://snpeff.sourceforge.net/SnpEff_manual.html http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:intergenic_variant Seattleseq:intergenic VEP:intergenic_variant intergenic intergenic variant snpEff:INTERGENIC sequence ANNOVAR:intergenic SO:0001628 EBI term Intergenic variations - More than 5 kb either upstream or downstream of a transcript. intergenic_variant A sequence variant located in the intergenic region, between genes. SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq Jannovar:intergenic_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:intergenic VEP:intergenic_variant intergenic http://ensembl.org/info/docs/variation/index.html snpEff:INTERGENIC ANNOVAR:intergenic http://www.openbioinformatics.org/annovar/annovar_download.html A sequence variant that changes the first two or last two bases of an intron, or the 5th base from the start of the intron in the orientation of the transcript. kareneilbeck 2010-03-24T09:42:00Z http://vat.gersteinlab.org/formats.php VAT:spliceOverlap essential_splice_site splice site variant sequence SO:0001629 EBI term - essential splice site - In the first 2 or the last 2 base pairs of an intron. The 5th base is on the donor (5') side of the intron. 
Updated to b in line with Cancer Genome Project at the Sanger. splice_site_variant A sequence variant that changes the first two or last two bases of an intron, or the 5th base from the start of the intron in the orientation of the transcript. http://ensembl.org/info/docs/variation/index.html http://vat.gersteinlab.org/formats.php VAT VAT:spliceOverlap essential_splice_site http://ensembl.org/info/docs/variation/index.html A sequence variant in which a change has occurred within the region of the splice site, either within 1-3 bases of the exon or 3-8 bases of the intron. kareneilbeck 2010-03-24T09:46:02Z http://snpeff.sourceforge.net/SnpEff_manual.html http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:splice_region_variant VAAST:splice_region_variant VEP:splice_region_variant snpEff:SPLICE_SITE_REGION splice region variant sequence ANNOVAR:splicing snpEff:SPLICE_SITE_BRANCH snpEff:SPLICE_SITE_BRANCH_U12 SO:0001630 EBI term: splice site - 1-3 bps into an exon or 3-8 bps into an intron. splice_region_variant A sequence variant in which a change has occurred within the region of the splice site, either within 1-3 bases of the exon or 3-8 bases of the intron. http://ensembl.org/info/docs/variation/index.html Jannovar:splice_region_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VAAST:splice_region_variant VEP:splice_region_variant snpEff:SPLICE_SITE_REGION splice region variant http://ensembl.org/info/docs/variation/index.html ANNOVAR:splicing http://www.openbioinformatics.org/annovar/annovar_download.html snpEff:SPLICE_SITE_BRANCH snpEff:SPLICE_SITE_BRANCH_U12 A sequence variant located 5' of a gene. 
kareneilbeck 2010-03-24T09:49:13Z http://snpeff.sourceforge.net/SnpEff_manual.html http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:upstream_gene_variant VEP:upstream_gene_variant snpEff:UPSTREAM upstream gene variant sequence ANNOVAR:upstream SO:0001631 Different groups annotate up and downstream to different lengths. The subtypes are specific and are backed up with cross references. upstream_gene_variant A sequence variant located 5' of a gene. SO:ke Jannovar:upstream_gene_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VEP:upstream_gene_variant snpEff:UPSTREAM ANNOVAR:upstream http://www.openbioinformatics.org/annovar/annovar_download.html A sequence variant located 3' of a gene. kareneilbeck 2010-03-24T09:49:38Z http://snpeff.sourceforge.net/SnpEff_manual.html http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:downstream_gene_variant VEP:downstream_gene_variant downstream gene variant snpEff:DOWNSTREAM sequence ANNOVAR:downstream SO:0001632 Different groups annotate up and downstream to different lengths. The subtypes are specific and are backed up with cross references. downstream_gene_variant A sequence variant located 3' of a gene. SO:ke Jannovar:downstream_gene_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VEP:downstream_gene_variant snpEff:DOWNSTREAM ANNOVAR:downstream http://www.openbioinformatics.org/annovar/annovar_download.html A sequence variant located within 5 KB of the end of a gene. kareneilbeck 2010-03-24T09:50:16Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp 5KB downstream variant Seattleseq:downstream-gene downstream sequence within 5KB downstream SO:0001633 EBI term Downstream variations - Within 5 kb downstream of the 3prime end of a transcript. 5KB_downstream_variant A sequence variant located within 5 KB of the end of a gene. 
SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq Seattleseq:downstream-gene downstream http://ensembl.org/info/docs/variation/index.html A sequence variant located within a half KB of the end of a gene. kareneilbeck 2010-03-24T09:50:42Z 500B downstream variant near-gene-3 sequence SO:0001634 500B_downstream_variant A sequence variant located within a half KB of the end of a gene. SO:ke near-gene-3 ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd A sequence variant located within 5KB 5' of a gene. kareneilbeck 2010-03-24T09:51:06Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp 5kb upstream variant Seattleseq:upstream-gene upstream sequence SO:0001635 EBI term Upstream variations - Within 5 kb upstream of the 5prime end of a transcript. 5KB_upstream_variant A sequence variant located within 5KB 5' of a gene. SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq Seattleseq:upstream-gene upstream http://ensembl.org/info/docs/variation/index.html A sequence variant located within 2KB 5' of a gene. kareneilbeck 2010-03-24T09:51:22Z 2KB upstream variant near-gene-5 sequence SO:0001636 2KB_upstream_variant A sequence variant located within 2KB 5' of a gene. SO:ke near-gene-5 ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd A gene that encodes for ribosomal RNA. kareneilbeck 2010-04-21T10:10:32Z rDNA rRNA gene sequence SO:0001637 Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). rRNA_gene A gene that encodes for ribosomal RNA. SO:ke A gene that encodes for a piwi associated RNA. kareneilbeck 2010-04-21T10:11:36Z piRNA gene sequence SO:0001638 Moved from ncRNA_gene to sncRNA_gene 27 April 2021 to be more consistent with the organization of the ncRNA branch of SO. Requested by FlyBase, moved by Dave Sant. See GitHub Issue #514. 
piRNA_gene A gene that encodes for a piwi associated RNA. SO:ke A gene that encodes an RNase P RNA. kareneilbeck 2010-04-21T10:13:23Z RNase P RNA gene sequence SO:0001639 Moved under enzymatic_RNA_gene on 18 Nov 2021. See GitHub Issue #533. RNase_P_RNA_gene A gene that encodes an RNase P RNA. SO:ke A gene that encodes an RNase_MRP_RNA. kareneilbeck 2010-04-21T10:13:58Z sequence RNase MRP RNA gene SO:0001640 Moved under enzymatic_RNA_gene on 18 Nov 2021. See GitHub Issue #533. RNase_MRP_RNA_gene A gene that encodes an RNase_MRP_RNA. SO:ke A gene that encodes a long, intervening non-coding RNA. kareneilbeck 2010-04-21T10:14:24Z lincRNA gene sequence SO:0001641 lincRNA_gene A gene that encodes a long, intervening non-coding RNA. PMID:23463798 SO:ke http://www.gencodegenes.org/gencode_biotypes.html A mathematically defined repeat (MDR) is an experimental feature that is determined by querying overlapping oligomers of length k against a database of shotgun sequence data and identifying regions in the query sequence that exceed a statistically determined threshold of repetitiveness. kareneilbeck 2010-05-03T11:50:14Z mathematically defined repeat sequence SO:0001642 Mathematically defined repeat regions are determined without regard to the biological origin of the repetitive region. The repeat units of a MDR are the overlapping oligomers of size k that were used for the query. Tools that can annotate mathematically defined repeats include Tallymer (Kurtz et al 2008, BMC Genomics: 517) and RePS (Wang et al, Genome Res 12(5): 824-831.). mathematically_defined_repeat A mathematically defined repeat (MDR) is an experimental feature that is determined by querying overlapping oligomers of length k against a database of shotgun sequence data and identifying regions in the query sequence that exceed a statistically determined threshold of repetitiveness. SO:jestill A telomerase RNA gene is a non coding RNA gene the RNA product of which is a component of telomerase. 
kareneilbeck 2010-05-18T05:26:38Z http:http://en.wikipedia.org/wiki/Telomerase_RNA_component TERC Telomerase RNA component telomerase RNA gene sequence SO:0001643 telomerase_RNA_gene A telomerase RNA gene is a non coding RNA gene the RNA product of which is a component of telomerase. SO:ke http:http://en.wikipedia.org/wiki/Telomerase_RNA_component wikipedia An engineered vector that is able to take part in homologous recombination in a host with the intent of introducing site specific genomic modifications. kareneilbeck 2010-05-28T02:05:25Z sequence targeting vector SO:0001644 targeting_vector An engineered vector that is able to take part in homologous recombination in a host with the intent of introducing site specific genomic modifications. MGD:tm PMID:10354467 A measurable sequence feature that varies within a population. kareneilbeck 2010-05-28T02:33:07Z sequence genetic marker SO:0001645 genetic_marker A measurable sequence feature that varies within a population. SO:db A genetic marker, discovered using Diversity Arrays Technology (DArT) technology. kareneilbeck 2010-05-28T02:34:43Z DArT marker sequence SO:0001646 DArT_marker A genetic marker, discovered using Diversity Arrays Technology (DArT) technology. SO:ke A kind of ribosome entry site, specific to Eukaryotic organisms that overlaps part of both 5' UTR and CDS sequence. kareneilbeck 2010-06-07T03:12:20Z http://en.wikipedia.org/wiki/Kozak_consensus_sequence kozak consensus kozak consensus sequence kozak sequence sequence SO:0001647 kozak_sequence A kind of ribosome entry site, specific to Eukaryotic organisms that overlaps part of both 5' UTR and CDS sequence. SO:ke http://en.wikipedia.org/wiki/Kozak_consensus_sequence wikipedia A transposon that is disrupted by the insertion of another element. kareneilbeck 2010-06-23T03:22:57Z nested transposon sequence SO:0001648 nested_transposon A transposon that is disrupted by the insertion of another element. 
SO:ke A repeat that is disrupted by the insertion of another element. kareneilbeck 2010-06-23T03:24:55Z INSDC_feature:repeat_region INSDC_qualifier:nested nested repeat sequence SO:0001649 nested_repeat A repeat that is disrupted by the insertion of another element. SO:ke A sequence variant which does not cause a disruption of the translational reading frame. kareneilbeck 2010-07-19T01:24:44Z VAAST:inframe_variant cds-indel inframe variant sequence ANNOVAR:nonframeshift block substitution ANNOVAR:nonframeshift substitution SO:0001650 inframe_variant A sequence variant which does not cause a disruption of the translational reading frame. SO:ke VAAST:inframe_variant cds-indel ANNOVAR:nonframeshift block substitution http://www.openbioinformatics.org/annovar/annovar_download.html ANNOVAR:nonframeshift substitution true true A transcription factor binding site of variable direct repeats of the sequence PuGGTCA spaced by five nucleotides (DR5) found in the promoters of retinoic acid-responsive genes, to which retinoic acid receptors bind. kareneilbeck 2010-08-03T10:46:12Z RARE retinoic acid responsive element sequence SO:0001653 retinoic_acid_responsive_element A transcription factor binding site of variable direct repeats of the sequence PuGGTCA spaced by five nucleotides (DR5) found in the promoters of retinoic acid-responsive genes, to which retinoic acid receptors bind. PMID:11327309 PMID:19917671 A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues. kareneilbeck 2010-08-03T12:26:05Z sequence nucleotide to protein binding site SO:0001654 nucleotide_to_protein_binding_site A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues. SO:ke A binding site that, in the molecule, interacts selectively and non-covalently with nucleotide residues. 
kareneilbeck 2010-08-03T12:30:04Z np_bind nucleotide binding site sequence SO:0001655 See GO:0000166 : nucleotide binding. nucleotide_binding_site A binding site that, in the molecule, interacts selectively and non-covalently with nucleotide residues. SO:cb np_bind uniprot:feature A binding site that, in the molecule, interacts selectively and non-covalently with metal ions. kareneilbeck 2010-08-03T12:31:42Z sequence metal binding site SO:0001656 See GO:0046872 : metal ion binding. metal_binding_site A binding site that, in the molecule, interacts selectively and non-covalently with metal ions. SO:cb A binding site that, in the molecule, interacts selectively and non-covalently with a small molecule such as a drug, or hormone. kareneilbeck 2010-08-03T12:32:58Z ligand binding site sequence SO:0001657 ligand_binding_site A binding site that, in the molecule, interacts selectively and non-covalently with a small molecule such as a drug, or hormone. SO:ke An NTR is a nested repeat of two distinct tandem motifs interspersed with each other. kareneilbeck 2010-08-26T09:36:16Z NTR nested tandem repeat sequence SO:0001658 Tracker ID: 3052459. nested_tandem_repeat An NTR is a nested repeat of two distinct tandem motifs interspersed with each other. SO:AF An element that can exist within the promoter region of a gene. kareneilbeck 2010-10-01T11:48:32Z promoter element sequence SO:0001659 Moved from is_a: SO:0001055 transcriptional_cis_regulatory_region as per request from GREEKC initiative in August 2020. promoter_element An element that only exists within the promoter region of a eukaryotic gene. kareneilbeck 2010-10-01T11:49:03Z core eukaryotic promoter element sequence general transcription factor binding site SO:0001660 core_eukaryotic_promoter_element An element that only exists within the promoter region of a eukaryotic gene. GREEKC:cl A TATA box core promoter of a gene transcribed by RNA polymerase II. 
kareneilbeck 2010-10-01T02:42:12Z RNA polymerase II TATA box sequence SO:0001661 RNA_polymerase_II_TATA_box A TATA box core promoter of a gene transcribed by RNA polymerase II. PMID:16858867 A TATA box core promoter of a gene transcribed by RNA polymerase III. kareneilbeck 2010-10-01T02:43:16Z RNA polymerase III TATA box sequence SO:0001662 RNA_polymerase_III_TATA_box A TATA box core promoter of a gene transcribed by RNA polymerase III. SO:ke A core RNA polymerase II promoter element with consensus (G/A)T(T/G/A)(T/A)(G/T)(T/G)(T/G). kareneilbeck 2010-10-01T02:49:55Z BREd sequence BREd motif SO:0001663 BREd_motif A core RNA polymerase II promoter element with consensus (G/A)T(T/G/A)(T/A)(G/T)(T/G)(T/G). PMID:16858867 A discontinuous core element of RNA polymerase II transcribed genes, situated downstream of the TSS. It is composed of three sub elements: SI, SII and SIII. kareneilbeck 2010-10-01T02:56:41Z sequence downstream core element SO:0001664 DCE A discontinuous core element of RNA polymerase II transcribed genes, situated downstream of the TSS. It is composed of three sub elements: SI, SII and SIII. PMID:16858867 A sub element of the DCE core promoter element, with consensus sequence CTTC. kareneilbeck 2010-10-01T03:00:10Z sequence DCE SI SO:0001665 DCE_SI A sub element of the DCE core promoter element, with consensus sequence CTTC. PMID:16858867 SO:ke A sub element of the DCE core promoter element with consensus sequence CTGT. kareneilbeck 2010-10-01T03:00:30Z DCE SII sequence SO:0001666 DCE_SII A sub element of the DCE core promoter element with consensus sequence CTGT. PMID:16858867 SO:ke A sub element of the DCE core promoter element with consensus sequence AGC. kareneilbeck 2010-10-01T03:00:44Z DCE SIII sequence SO:0001667 DCE_SIII A sub element of the DCE core promoter element with consensus sequence AGC. 
PMID:16858867 SO:ke DNA segment that ranges from about -250 to -40 relative to +1 of RNA transcription start site, where sequence specific DNA-binding transcription factors bind, such as Sp1, CTF (CCAAT-binding transcription factor), and CBF (CCAAT-box binding factor). kareneilbeck 2010-10-01T03:10:23Z sequence proximal promoter element specific transcription factor binding site SO:0001668 proximal_promoter_element DNA segment that ranges from about -250 to -40 relative to +1 of RNA transcription start site, where sequence specific DNA-binding transcription factors bind, such as Sp1, CTF (CCAAT-binding transcription factor), and CBF (CCAAT-box binding factor). PMID:12515390 PMID:9679020 SO:ml The minimal portion of the promoter required to properly initiate transcription in RNA polymerase II transcribed genes. kareneilbeck 2010-10-01T03:13:41Z RNApol II core promoter sequence SO:0001669 RNApol_II_core_promoter The minimal portion of the promoter required to properly initiate transcription in RNA polymerase II transcribed genes. PMID:16858867 A regulatory promoter element that is distal from the TSS. kareneilbeck 2010-10-01T03:21:08Z sequence distal promoter element SO:0001670 distal_promoter_element A DNA sequence to which bacterial RNA polymerase sigma 70 binds, to begin transcription. kareneilbeck 2010-10-06T01:41:34Z bacterial RNA polymerase promoter sigma 70 sequence SO:0001671 bacterial_RNApol_promoter_sigma_70_element A DNA sequence to which bacterial RNA polymerase sigma 54 binds, to begin transcription. kareneilbeck 2010-10-06T01:42:37Z bacterial RNA polymerase promoter sigma54 sequence <new synonym> SO:0001672 bacterial_RNApol_promoter_sigma54_element A conserved region about 12-bp upstream of the start point of bacterial transcription units, involved with sigma factor 54. 
kareneilbeck 2010-10-06T01:44:57Z minus 12 signal sequence SO:0001673 minus_12_signal A conserved region about 12-bp upstream of the start point of bacterial transcription units, involved with sigma factor 54. PMID:18331472 A conserved region about 24-bp upstream of the start point of bacterial transcription units, involved with sigma factor 54. kareneilbeck 2010-10-06T01:45:24Z sequence minus 24 signal SO:0001674 minus_24_signal A conserved region about 24-bp upstream of the start point of bacterial transcription units, involved with sigma factor 54. PMID:18331472 An A box within an RNA polymerase III type 1 promoter. kareneilbeck 2010-10-06T05:43:43Z sequence A box type 1 SO:0001675 The A box can be found in the promoters of type 1 and type 2 (pol III) so sub-typing here allows the part of relationship of the subtypes to remain true. A_box_type_1 An A box within an RNA polymerase III type 1 promoter. SO:ke An A box within an RNA polymerase III type 2 promoter. kareneilbeck 2010-10-06T05:44:18Z sequence A box type 2 SO:0001676 The A box can be found in the promoters of type 1 and type 2 (pol III) so sub-typing here allows the part of relationship of the subtypes to remain true. A_box_type_2 An A box within an RNA polymerase III type 2 promoter. SO:ke A core promoter region of RNA polymerase III type 1 promoters. kareneilbeck 2010-10-06T05:52:03Z IE sequence intermediate element SO:0001677 intermediate_element A core promoter region of RNA polymerase III type 1 promoters. PMID:12381659 A promoter element that is not part of the core promoter, but provides the promoter with a specific regulatory region. kareneilbeck 2010-10-07T04:39:48Z sequence regulatory promoter element SO:0001678 regulatory_promoter_element A promoter element that is not part of the core promoter, but provides the promoter with a specific regulatory region. PMID:12381659 A regulatory region that is involved in the control of the process of transcription. 
kareneilbeck 2010-10-12T03:49:35Z transcription regulatory region sequence SO:0001679 Obsoleted by David Sant on 11 Feb 2021 when it was merged with transcriptional_cis_regulatory_region (SO:0001055) to reduce redundancy and be consistent with Gene Ontology. See GitHub Issue #527. transcription_regulatory_region true A regulatory region that is involved in the control of the process of transcription. SO:ke A regulatory region that is involved in the control of the process of translation. kareneilbeck 2010-10-12T03:52:45Z translation regulatory region sequence SO:0001680 translation_regulatory_region A regulatory region that is involved in the control of the process of translation. SO:ke A regulatory region that is involved in the control of the process of recombination. kareneilbeck 2010-10-12T03:53:35Z recombination regulatory region sequence SO:0001681 recombination_regulatory_region A regulatory region that is involved in the control of the process of recombination. SO:ke A regulatory region that is involved in the control of the process of nucleotide replication. kareneilbeck 2010-10-12T03:54:09Z INSDC_feature:regulatory INSDC_qualifier:replication_regulatory_region replication regulatory region sequence SO:0001682 replication_regulatory_region A regulatory region that is involved in the control of the process of nucleotide replication. SO:ke A sequence motif is a nucleotide or amino-acid sequence pattern that may have biological significance. kareneilbeck 2010-10-14T04:13:22Z http://en.wikipedia.org/wiki/Sequence_motif sequence sequence motif SO:0001683 sequence_motif A sequence motif is a nucleotide or amino-acid sequence pattern that may have biological significance. http://en.wikipedia.org/wiki/Sequence_motif http://en.wikipedia.org/wiki/Sequence_motif wikipedia An attribute of an experimentally derived feature. 
kareneilbeck 2010-10-28T02:22:23Z sequence experimental feature attribute SO:0001684 experimental_feature_attribute An attribute of an experimentally derived feature. SO:ke The score of an experimentally derived feature such as a p-value. kareneilbeck 2010-10-28T02:23:16Z sequence SO:0001685 score The score of an experimentally derived feature such as a p-value. SO:ke An experimental feature attribute that defines the quality of the feature in a quantitative way, such as a phred quality score. kareneilbeck 2010-10-28T02:24:11Z sequence quality value SO:0001686 quality_value An experimental feature attribute that defines the quality of the feature in a quantitative way, such as a phred quality score. SO:ke The nucleotide region (usually a palindrome) that is recognized by a restriction enzyme. This may or may not be equal to the restriction enzyme binding site. kareneilbeck 2010-10-29T12:29:57Z restriction endonuclease recognition site restriction enzyme recognition site sequence SO:0001687 restriction_enzyme_recognition_site The nucleotide region (usually a palindrome) that is recognized by a restriction enzyme. This may or may not be equal to the restriction enzyme binding site. SO:ke The boundary at which a restriction enzyme breaks the nucleotide sequence. kareneilbeck 2010-10-29T12:35:02Z restriction enzyme cleavage junction sequence SO:0001688 restriction_enzyme_cleavage_junction The boundary at which a restriction enzyme breaks the nucleotide sequence. SO:ke The restriction enzyme cleavage junction on the 5' strand of the nucleotide sequence. kareneilbeck 2010-10-29T12:36:24Z 5' restriction enzyme junction sequence SO:0001689 five_prime_restriction_enzyme_junction The restriction enzyme cleavage junction on the 5' strand of the nucleotide sequence. SO:ke The restriction enzyme cleavage junction on the 3' strand of the nucleotide sequence. 
kareneilbeck 2010-10-29T12:37:52Z 3' restriction enzyme junction sequence SO:0001690 three_prime_restriction_enzyme_junction A restriction enzyme recognition site that, when cleaved, results in no overhangs. kareneilbeck 2010-10-29T12:39:53Z blunt end restriction enzyme cleavage site sequence SO:0001691 blunt_end_restriction_enzyme_cleavage_site A restriction enzyme recognition site that, when cleaved, results in no overhangs. SBOL:jgquinn SO:ke A site where restriction enzymes can cleave that will produce an overhang or 'sticky end'. kareneilbeck 2010-10-29T12:40:50Z sequence sticky end restriction enzyme cleavage site SO:0001692 sticky_end_restriction_enzyme_cleavage_site A restriction enzyme cleavage site where both strands are cut at the same position. kareneilbeck 2010-10-29T12:43:14Z sequence blunt end restriction enzyme cleavage site SO:0001693 blunt_end_restriction_enzyme_cleavage_junction A restriction enzyme cleavage site where both strands are cut at the same position. SO:ke A restriction enzyme cleavage site whereby only one strand is cut. kareneilbeck 2010-10-29T12:44:48Z sequence single strand restriction enzyme cleavage site SO:0001694 single_strand_restriction_enzyme_cleavage_site A restriction enzyme cleavage site whereby only one strand is cut. SO:ke A terminal region of DNA sequence where the end of the region is not blunt ended. kareneilbeck 2010-10-29T12:48:35Z single strand overhang sequence sticky end SO:0001695 restriction_enzyme_single_strand_overhang A terminal region of DNA sequence where the end of the region is not blunt ended. SO:ke A region that has been implicated in binding although the exact coordinates of binding may be unknown. kareneilbeck 2010-11-02T11:39:59Z sequence experimentally defined binding region SO:0001696 experimentally_defined_binding_region A region that has been implicated in binding although the exact coordinates of binding may be unknown. 
SO:ke A region of sequence identified by ChIP seq technology to contain a protein binding site. kareneilbeck 2010-11-02T11:43:07Z sequence ChIP seq region SO:0001697 ChIP_seq_region A region of sequence identified by ChIP seq technology to contain a protein binding site. SO:ke A primer containing an SNV at the 3' end for accurate genotyping. kareneilbeck 2010-11-11T03:25:21Z ASPE primer allele specific primer extension primer sequence SO:0001698 ASPE_primer A primer containing an SNV at the 3' end for accurate genotyping. http://www.ncbi.nlm.nih.gov/pubmed/11252801 A primer with one or more mismatches to the DNA template corresponding to a position within a restriction enzyme recognition site. kareneilbeck 2010-11-11T03:27:09Z dCAPS primer derived cleaved amplified polymorphic primer sequence SO:0001699 dCAPS_primer A primer with one or more mismatches to the DNA template corresponding to a position within a restriction enzyme recognition site. http://www.ncbi.nlm.nih.gov/pubmed/9628033 Histone modification is a post translationally modified region whereby residues of the histone protein are modified by methylation, acetylation, phosphorylation, ubiquitination, sumoylation, citrullination, or ADP-ribosylation. kareneilbeck 2010-03-31T10:22:08Z histone modification sequence histone modification site SO:0001700 histone_modification Histone modification is a post translationally modified region whereby residues of the histone protein are modified by methylation, acetylation, phosphorylation, ubiquitination, sumoylation, citrullination, or ADP-ribosylation. http://en.wikipedia.org/wiki/Histone A histone modification site where the modification is the methylation of the residue. kareneilbeck 2010-03-31T10:23:02Z histone methylation histone methylation site sequence SO:0001701 histone_methylation_site A histone modification site where the modification is the methylation of the residue. SO:ke A histone modification where the modification is the acylation of the residue. 
kareneilbeck 2010-03-31T10:23:27Z histone acetylation histone acetylation site sequence SO:0001702 histone_acetylation_site A histone modification where the modification is the acylation of the residue. SO:ke A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is acetylated. kareneilbeck 2010-03-31T10:25:05Z H3K9 acetylation site H3K9ac sequence SO:0001703 H3K9_acetylation_site A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is acetylated. http://en.wikipedia.org/wiki/Histone A kind of histone modification site, whereby the 14th residue (a lysine), from the start of the H3 histone protein is acetylated. kareneilbeck 2010-03-31T10:25:53Z H3K14 acetylation site H3K14ac sequence SO:0001704 H3K14_acetylation_site A kind of histone modification site, whereby the 14th residue (a lysine), from the start of the H3 histone protein is acetylated. http://en.wikipedia.org/wiki/Histone A kind of histone modification, whereby the 4th residue (a lysine), from the start of the H3 protein is mono-methylated. kareneilbeck 2010-03-31T10:28:14Z H3K4 mono-methylation site sequence H3K4me1 SO:0001705 H3K4_monomethylation_site A kind of histone modification, whereby the 4th residue (a lysine), from the start of the H3 protein is mono-methylated. http://en.wikipedia.org/wiki/Histone A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H3 protein is tri-methylated. kareneilbeck 2010-03-31T10:29:12Z H3K4 tri-methylation sequence H3K4me3 SO:0001706 H3K4_trimethylation A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H3 protein is tri-methylated. http://en.wikipedia.org/wiki/Histone A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is tri-methylated. 
kareneilbeck 2010-03-31T10:30:34Z H3K9 tri-methylation site sequence H3K9Me3 SO:0001707 H3K9_trimethylation_site A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is tri-methylated. http://en.wikipedia.org/wiki/Histone A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is mono-methylated. kareneilbeck 2010-03-31T10:31:54Z H3K27 mono-methylation site sequence H3K27Me1 SO:0001708 H3K27_monomethylation_site A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is mono-methylated. http://en.wikipedia.org/wiki/Histone A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is tri-methylated. kareneilbeck 2010-03-31T10:32:41Z H3K27 tri-methylation site sequence H3K27Me3 SO:0001709 H3K27_trimethylation_site A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is tri-methylated. http://en.wikipedia.org/wiki/Histone A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is mono-methylated. kareneilbeck 2010-03-31T10:33:42Z H3K79 mono-methylation site sequence H3K79me1 SO:0001710 H3K79_monomethylation_site A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is mono-methylated. http://en.wikipedia.org/wiki/Histone A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is di-methylated. kareneilbeck 2010-03-31T10:34:39Z H3K79 di-methylation site sequence H3K79Me2 SO:0001711 H3K79_dimethylation_site A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is di-methylated. 
http://en.wikipedia.org/wiki/Histone A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is tri-methylated. kareneilbeck 2010-03-31T10:35:30Z H3K79 tri-methylation site sequence H3K79Me3 SO:0001712 H3K79_trimethylation_site A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is tri-methylated. http://en.wikipedia.org/wiki/Histone A kind of histone modification site, whereby the 20th residue (a lysine), from the start of the H4 histone protein is mono-methylated. kareneilbeck 2010-03-31T10:36:43Z H4K20 mono-methylation site sequence H4K20Me1 SO:0001713 H4K20_monomethylation_site A kind of histone modification site, whereby the 20th residue (a lysine), from the start of the H4 histone protein is mono-methylated. http://en.wikipedia.org/wiki/Histone A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H2B protein is methylated. kareneilbeck 2010-03-31T10:38:12Z H2BK5 mono-methylation site sequence SO:0001714 H2BK5_monomethylation_site A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H2B protein is methylated. http://en.wikipedia.org/wiki/Histone An ISRE is a transcriptional cis regulatory region, containing the consensus region: YAGTTTC(A/T)YTTTYCC, responsible for increased transcription via interferon binding. kareneilbeck 2010-04-05T11:15:08Z interferon stimulated response element sequence SO:0001715 Term requested via tracker (2981725) by Alan Ruttenberg, April 2010. It has been described as both an enhancer and a promoter, so the parent is the more general term. Moved from is_a SO:0001055 transcriptional_cis_regulatory_region to SO:0000235 TF_binding_site after Colin Logie pointed out that this is a consensus sequence where transcription factors bind, GREEKC Jan 21, 2021. 
ISRE An ISRE is a transcriptional cis regulatory region, containing the consensus region: YAGTTTC(A/T)YTTTYCC, responsible for increased transcription via interferon binding. http://genesdev.cshlp.org/content/2/4/383.abstrac A histone modification site where ubiquitin may be added. kareneilbeck 2010-04-13T10:12:18Z sequence histone ubiquitination site SO:0001716 histone_ubiqitination_site A histone modification site where ubiquitin may be added. SO:ke A histone modification site on H2B where ubiquitin may be added. kareneilbeck 2010-04-13T10:13:28Z sequence H2BUbiq SO:0001717 H2B_ubiquitination_site A histone modification site on H2B where ubiquitin may be added. SO:ke A kind of histone modification site, whereby the 18th residue (a lysine), from the start of the H3 histone protein is acetylated. kareneilbeck 2010-04-13T10:39:35Z H3K18 acetylation site H3K18ac sequence SO:0001718 H3K18_acetylation_site A kind of histone modification site, whereby the 18th residue (a lysine), from the start of the H3 histone protein is acetylated. SO:ke A kind of histone modification, whereby the 23rd residue (a lysine), from the start of the H3 histone protein is acetylated. kareneilbeck 2010-04-13T10:42:45Z H3K23 acetylation site H3K23ac sequence SO:0001719 H3K23_acetylation_site A kind of histone modification, whereby the 23rd residue (a lysine), from the start of the H3 histone protein is acetylated. SO:ke A biological DNA region implicated in epigenomic changes caused by mechanisms other than changes in the underlying DNA sequence. This includes, nucleosomal histone post-translational modifications, nucleosome depletion to render DNA accessible and post-replicational base modifications such as cytosine modification. kareneilbeck 2010-03-27T12:02:29Z sequence epigenetically modified region SO:0001720 Moved from is_a biological_region (SO:0001411) to is_a regulatory_region (SO:0005836) on 11 Feb 2021. GREEKC members pointed out that this would be a more appropriate location. 
See GitHub Issue #530. 11 Feb 2021 updated definition along with addition of epigenomically_modified_region (SO:0002332). Epigenetically modified region is now not inherited while epigenomically modified region is not annotated as inherited. See GitHub Issue #532 and issue #534. epigenetically_modified_region A biological DNA region implicated in epigenomic changes caused by mechanisms other than changes in the underlying DNA sequence. This includes, nucleosomal histone post-translational modifications, nucleosome depletion to render DNA accessible and post-replicational base modifications such as cytosine modification. SO:ke http://en.wikipedia.org/wiki/Epigenetics A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is acylated. kareneilbeck 2010-04-13T10:44:09Z H3K27 acylation site sequence H3K27Ac SO:0001721 H3K27_acylation_site true A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is acylated. SO:ke A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is mono-methylated. kareneilbeck 2010-04-13T10:46:32Z H3K36 mono-methylation site sequence H3K36Me1 SO:0001722 H3K36_monomethylation_site A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is mono-methylated. SO:ke A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is dimethylated. kareneilbeck 2010-04-13T10:59:35Z H3K36 di-methylation site sequence H3K36Me2 SO:0001723 H3K36_dimethylation_site A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is dimethylated. SO:ke A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is tri-methylated. 
kareneilbeck 2010-04-13T11:01:58Z H3K36 tri-methylation site sequence H3K36Me3 SO:0001724 H3K36_trimethylation_site A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is tri-methylated. SO:ke A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H3 histone protein is di-methylated. kareneilbeck 2010-04-13T11:03:15Z H3K4 di-methylation site sequence H3K4Me2 SO:0001725 H3K4_dimethylation_site A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H3 histone protein is di-methylated. SO:ke A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is di-methylated. kareneilbeck 2010-04-13T01:45:41Z H3K27 di-methylation site sequence H3K27Me2 SO:0001726 H3K27_dimethylation_site A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is di-methylated. SO:ke A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is mono-methylated. kareneilbeck 2010-04-13T11:06:17Z H3K9 mono-methylation site sequence H3K9Me1 SO:0001727 H3K9_monomethylation_site A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is mono-methylated. SO:ke A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein may be dimethylated. kareneilbeck 2010-04-13T11:08:19Z H3K9 di-methylation site sequence H3K9Me2 SO:0001728 H3K9_dimethylation_site A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein may be dimethylated. SO:ke A kind of histone modification site, whereby the 16th residue (a lysine), from the start of the H4 histone protein is acetylated. 
kareneilbeck 2010-04-13T11:09:41Z H4K16 acetylation site H4K16ac sequence SO:0001729 H4K16_acetylation_site A kind of histone modification site, whereby the 16th residue (a lysine), from the start of the H4 histone protein is acetylated. SO:ke A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H4 histone protein is acetylated. kareneilbeck 2010-04-13T11:13:00Z H4K5 acetylation site H4K5ac sequence SO:0001730 H4K5_acetylation_site A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H4 histone protein is acetylated. SO:ke A kind of histone modification site, whereby the 8th residue (a lysine), from the start of the H4 histone protein is acetylated. kareneilbeck 2010-04-13T11:14:24Z H4K8 acetylation site H4K8ac sequence SO:0001731 H4K8_acetylation_site A kind of histone modification site, whereby the 8th residue (a lysine), from the start of the H4 histone protein is acetylated. SO:KE A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is methylated. kareneilbeck 2010-04-13T11:26:22Z H3K27 methylation site sequence SO:0001732 H3K27_methylation_site A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is methylated. SO:ke A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is methylated. kareneilbeck 2010-04-13T11:27:28Z H3K36 methylation site sequence SO:0001733 H3K36_methylation_site A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is methylated. SO:ke A kind of histone modification, whereby the 4th residue (a lysine), from the start of the H3 protein is methylated. 
kareneilbeck 2010-04-13T11:28:14Z H3K4 methylation site sequence SO:0001734 H3K4_methylation_site A kind of histone modification, whereby the 4th residue (a lysine), from the start of the H3 protein is methylated. SO:ke A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is methylated. kareneilbeck 2010-04-13T11:29:16Z H3K79 methylation site sequence SO:0001735 H3K79_methylation_site A kind of histone modification site, whereby the 79th residue (a lysine), from the start of the H3 histone protein is methylated. SO:ke A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is methylated. kareneilbeck 2010-04-13T11:31:37Z H3K9 methylation site sequence SO:0001736 H3K9_methylation_site A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H3 histone protein is methylated. SO:ke A histone modification, whereby the histone protein is acylated at multiple sites in a region. kareneilbeck 2010-04-13T01:58:21Z sequence histone acylation region SO:0001737 histone_acylation_region A histone modification, whereby the histone protein is acylated at multiple sites in a region. SO:ke A region of the H4 histone whereby multiple lysines are acylated. kareneilbeck 2010-04-13T02:00:06Z H4K acylation region sequence H4KAc SO:0001738 H4K_acylation_region A region of the H4 histone whereby multiple lysines are acylated. SO:ke A gene with a start codon other than AUG. kareneilbeck 2011-01-10T01:30:31Z gene with non canonical start codon sequence SO:0001739 Requested by flybase, Dec 2010. gene_with_non_canonical_start_codon A gene with a start codon other than AUG. SO:xp A gene with a translational start codon of CUG. kareneilbeck 2011-01-10T01:32:35Z gene with start codon CUG sequence SO:0001740 Requested by flybase, Dec 2010. gene_with_start_codon_CUG A gene with a translational start codon of CUG. 
SO:mc A gene segment which when incorporated by somatic recombination in the final gene transcript results in a nonfunctional product. batchelorc 2011-02-15T05:07:52Z pseudogenic gene segment sequence SO:0001741 pseudogenic_gene_segment A gene segment which when incorporated by somatic recombination in the final gene transcript results in a nonfunctional product. SO:hd A sequence alteration whereby the copy number of a given region is greater than the reference sequence. kareneilbeck 2011-02-28T01:54:09Z copy number gain sequence gain SO:0001742 copy_number_gain A sequence alteration whereby the copy number of a given region is greater than the reference sequence. SO:ke gain http://www.ncbi.nlm.nih.gov/dbvar/ A sequence alteration whereby the copy number of a given region is less than the reference sequence. kareneilbeck 2011-02-28T01:55:02Z copy number loss sequence loss SO:0001743 copy_number_loss A sequence alteration whereby the copy number of a given region is less than the reference sequence. SO:ke loss http://www.ncbi.nlm.nih.gov/dbvar/ Uniparental disomy is a sequence_alteration where a diploid individual receives two copies for all or part of a chromosome from one parent and no copies of the same chromosome or region from the other parent. kareneilbeck 2011-02-28T02:01:05Z http://en.wikipedia.org/wiki/Uniparental_disomy UPD uniparental disomy sequence SO:0001744 UPD Uniparental disomy is a sequence_alteration where a diploid individual receives two copies for all or part of a chromosome from one parent and no copies of the same chromosome or region from the other parent. SO:BM http://en.wikipedia.org/wiki/Uniparental_disomy wikipedia UPD http://www.ncbi.nlm.nih.gov/dbvar/ Uniparental disomy is a sequence_alteration where a diploid individual receives two copies for all or part of a chromosome from the mother and no copies of the same chromosome or region from the father. 
kareneilbeck 2011-02-28T02:03:01Z maternal uniparental disomy sequence SO:0001745 maternal_uniparental_disomy Uniparental disomy is a sequence_alteration where a diploid individual receives two copies for all or part of a chromosome from the mother and no copies of the same chromosome or region from the father. SO:bm Uniparental disomy is a sequence_alteration where a diploid individual receives two copies for all or part of a chromosome from the father and no copies of the same chromosome or region from the mother. kareneilbeck 2011-02-28T02:03:30Z paternal uniparental disomy sequence SO:0001746 paternal_uniparental_disomy Uniparental disomy is a sequence_alteration where a diploid individual receives two copies for all or part of a chromosome from the father and no copies of the same chromosome or region from the mother. SO:bm A DNA sequence that in the normal state of the chromosome corresponds to an unfolded, un-complexed stretch of double-stranded DNA. kareneilbeck 2011-02-28T02:21:52Z open chromatin region sequence SO:0001747 Requested by John Calley 3125900. open_chromatin_region A DNA sequence that in the normal state of the chromosome corresponds to an unfolded, un-complexed stretch of double-stranded DNA. SO:cb A SL2_acceptor_site which appends the SL3 RNA leader sequence to the 5' end of an mRNA. SL3 acceptor sites occur in genes in internal segments of polycistronic transcripts. kareneilbeck 2011-02-28T02:58:40Z SL3 acceptor site sequence SO:0001748 SL3_acceptor_site A SL2_acceptor_site which appends the SL3 RNA leader sequence to the 5' end of an mRNA. SL3 acceptor sites occur in genes in internal segments of polycistronic transcripts. SO:nlw A SL2_acceptor_site which appends the SL4 RNA leader sequence to the 5' end of an mRNA. SL4 acceptor sites occur in genes in internal segments of polycistronic transcripts. 
kareneilbeck 2011-02-28T03:08:47Z SL4 acceptor site sequence SO:0001749 SL4_acceptor_site A SL2_acceptor_site which appends the SL4 RNA leader sequence to the 5' end of an mRNA. SL4 acceptor sites occur in genes in internal segments of polycistronic transcripts. SO:nlw A SL2_acceptor_site which appends the SL5 RNA leader sequence to the 5' end of an mRNA. SL5 acceptor sites occur in genes in internal segments of polycistronic transcripts. kareneilbeck 2011-02-28T03:09:36Z SL5 acceptor site sequence SO:0001750 SL5_acceptor_site A SL2_acceptor_site which appends the SL5 RNA leader sequence to the 5' end of an mRNA. SL5 acceptor sites occur in genes in internal segments of polycistronic transcripts. SO:nlw A SL2_acceptor_site which appends the SL6 RNA leader sequence to the 5' end of an mRNA. SL6 acceptor sites occur in genes in internal segments of polycistronic transcripts. kareneilbeck 2011-02-28T03:10:14Z SL6 acceptor site sequence SO:0001751 SL6_acceptor_site A SL2_acceptor_site which appends the SL6 RNA leader sequence to the 5' end of an mRNA. SL6 acceptor sites occur in genes in internal segments of polycistronic transcripts. SO:nlw A SL2_acceptor_site which appends the SL7 RNA leader sequence to the 5' end of an mRNA. SL7 acceptor sites occur in genes in internal segments of polycistronic transcripts. kareneilbeck 2011-02-28T03:13:20Z SL7 acceptor site sequence SO:0001752 SL7_acceptor_site A SL2_acceptor_site which appends the SL7 RNA leader sequence to the 5' end of an mRNA. SL7 acceptor sites occur in genes in internal segments of polycistronic transcripts. SO:nlw A SL2_acceptor_site which appends the SL8 RNA leader sequence to the 5' end of an mRNA. SL8 acceptor sites occur in genes in internal segments of polycistronic transcripts. kareneilbeck 2011-02-28T03:15:26Z SL8 acceptor site sequence SO:0001753 SL8_acceptor_site A SL2_acceptor_site which appends the SL8 RNA leader sequence to the 5' end of an mRNA. 
SL8 acceptor sites occur in genes in internal segments of polycistronic transcripts. SO:nlw A SL2_acceptor_site which appends the SL9 RNA leader sequence to the 5' end of an mRNA. SL9 acceptor sites occur in genes in internal segments of polycistronic transcripts. kareneilbeck 2011-02-28T03:15:57Z SL9 acceptor site sequence SO:0001754 SL9_acceptor_site A SL2_acceptor_site which appends the SL9 RNA leader sequence to the 5' end of an mRNA. SL9 acceptor sites occur in genes in internal segments of polycistronic transcripts. SO:nlw A SL2_acceptor_site which appends the SL10 RNA leader sequence to the 5' end of an mRNA. SL10 acceptor sites occur in genes in internal segments of polycistronic transcripts. kareneilbeck 2011-02-28T03:16:31Z SL10 acceptor site sequence SO:0001755 SL10_acceptor_site A SL2_acceptor_site which appends the SL10 RNA leader sequence to the 5' end of an mRNA. SL10 acceptor sites occur in genes in internal segments of polycistronic transcripts. SO:nlw A SL2_acceptor_site which appends the SL11 RNA leader sequence to the 5' end of an mRNA. SL11 acceptor sites occur in genes in internal segments of polycistronic transcripts. kareneilbeck 2011-02-28T03:16:54Z SL11 acceptor site sequence SO:0001756 SL11_acceptor_site A SL2_acceptor_site which appends the SL11 RNA leader sequence to the 5' end of an mRNA. SL11 acceptor sites occur in genes in internal segments of polycistronic transcripts. SO:nlw A SL2_acceptor_site which appends the SL12 RNA leader sequence to the 5' end of an mRNA. SL12 acceptor sites occur in genes in internal segments of polycistronic transcripts. kareneilbeck 2011-02-28T03:17:23Z SL12 acceptor site sequence SO:0001757 SL12_acceptor_site A SL2_acceptor_site which appends the SL12 RNA leader sequence to the 5' end of an mRNA. SL12 acceptor sites occur in genes in internal segments of polycistronic transcripts. SO:nlw A pseudogene that arose via gene duplication. 
Generally duplicated pseudogenes have the same structure as the original gene, including intron-exon structure and some regulatory sequence. kareneilbeck 2011-03-09T09:58:04Z sequence duplicated pseudogene SO:0001758 duplicated_pseudogene A pseudogene that arose via gene duplication. Generally duplicated pseudogenes have the same structure as the original gene, including intron-exon structure and some regulatory sequence. http://en.wikipedia.org/wiki/Pseudogene A pseudogene, deactivated from original state by mutation, fixed in a population, where the ortholog in a reference species such as mouse remains functional. kareneilbeck 2011-03-09T10:04:04Z INSDC_feature:gene INSDC_qualifier:unitary sequence disabled gene unitary pseudogene SO:0001759 This is different from a non processed pseudogene because the gene was not duplicated. An example is the L-gulono-lactone oxidase pseudogene in primates. unitary_pseudogene A pseudogene, deactivated from original state by mutation, fixed in a population, where the ortholog in a reference species such as mouse remains functional. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html SO:ke http://en.wikipedia.org/wiki/Pseudogene A pseudogene that arose from a means other than retrotransposition. A pseudogene created via genomic duplication of a functional protein-coding parent gene followed by accumulation of deleterious mutations. kareneilbeck 2011-03-09T10:54:47Z INSDC_feature:gene INSDC_qualifier:unprocessed unprocessed pseudogene unprocessed_pseudogene sequence non processed pseudogene SO:0001760 non_processed_pseudogene A pseudogene that arose from a means other than retrotransposition. A pseudogene created via genomic duplication of a functional protein-coding parent gene followed by accumulation of deleterious mutations. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html SO:ke A dependent entity that inheres in a bearer, a sequence variant. 
kareneilbeck 2011-03-15T03:40:35Z variant quality sequence SO:0001761 variant_quality A dependent entity that inheres in a bearer, a sequence variant. PMID:17597783 SO:ke A quality inhering in a variant by virtue of its origin. kareneilbeck 2011-03-15T03:42:13Z variant origin sequence SO:0001762 variant_origin A quality inhering in a variant by virtue of its origin. PMID:17597783 SO:ke A physical quality which inheres to the variant by virtue of the number of instances of the variant within a population. kareneilbeck 2011-03-15T03:44:39Z variant frequency sequence SO:0001763 variant_frequency A physical quality which inheres to the variant by virtue of the number of instances of the variant within a population. PMID:17597783 SO:ke A physical quality which inheres to the variant by virtue of the number of instances of the variant within a population. kareneilbeck 2011-03-15T03:47:20Z unique variant sequence SO:0001764 unique_variant A physical quality which inheres to the variant by virtue of the number of instances of the variant within a population. SO:ke When a variant from the genomic sequence is rarely found in the general population. The threshold for 'rare' varies between studies. kareneilbeck 2011-03-15T03:48:29Z rare variant sequence SO:0001765 rare_variant A variant that affects one of several possible alleles at that location, such as the major histocompatibility complex (MHC) genes. kareneilbeck 2011-03-15T03:48:51Z polymorphic variant sequence SO:0001766 polymorphic_variant When a variant from the genomic sequence is commonly found in the general population. kareneilbeck 2011-03-15T03:50:36Z common variant sequence SO:0001767 common_variant When a variant has become fixed in the population so that it is now the only variant. kareneilbeck 2011-03-15T03:50:53Z fixed variant sequence SO:0001768 fixed_variant A quality inhering in a variant by virtue of its phenotype. 
kareneilbeck 2011-03-15T03:53:15Z variant phenotype sequence SO:0001769 variant_phenotype A quality inhering in a variant by virtue of its phenotype. PMID:17597783 SO:ke A variant that does not affect the function of the gene or cause disease. kareneilbeck 2011-03-15T03:55:40Z benign variant sequence SO:0001770 benign_variant A variant that has been found to be associated with disease. kareneilbeck 2011-03-15T04:05:16Z disease associated variant sequence SO:0001771 disease_associated_variant A variant that has been found to cause disease. kareneilbeck 2011-03-15T04:05:46Z disease causing variant sequence SO:0001772 disease_causing_variant A sequence variant where the mutated gene product does not allow for one or more basic functions necessary for survival. kareneilbeck 2011-03-15T04:06:22Z lethal variant sequence SO:0001773 lethal_variant A variant within a gene that contributes to a quantitative trait such as height or weight. kareneilbeck 2011-03-15T04:28:13Z quantitative variant sequence SO:0001774 quantitative_variant A variant in the genetic material inherited from the mother. kareneilbeck 2011-03-15T04:30:23Z maternal variant sequence SO:0001775 maternal_variant A variant in the genetic material inherited from the father. kareneilbeck 2011-03-15T04:30:47Z paternal variant sequence SO:0001776 paternal_variant A variant that has arisen after splitting of the embryo, resulting in the variant being found in only some of the tissues or cells of the body. kareneilbeck 2011-03-15T04:31:12Z somatic variant sequence SO:0001777 somatic_variant A variant present in the embryo that is carried by every cell in the body. kareneilbeck 2011-03-15T04:31:46Z germline variant sequence SO:0001778 germline_variant A variant that is found only in individuals that belong to the same pedigree. kareneilbeck 2011-03-15T04:32:18Z pedigree specific variant sequence SO:0001779 pedigree_specific_variant A variant found within only specific populations. 
kareneilbeck 2011-03-15T04:33:05Z population specific variant sequence SO:0001780 population_specific_variant A variant arising in the offspring that is not found in either of the parents. kareneilbeck 2011-03-15T04:33:34Z de novo variant sequence SO:0001781 de_novo_variant A sequence variant located within a transcription factor binding site. kareneilbeck 2011-03-17T10:59:20Z http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:tf_binding_site_variant TF binding site variant VEP:TF_binding_site_variant sequence SO:0001782 TF_binding_site_variant A sequence variant located within a transcription factor binding site. EBI:fc Jannovar:tf_binding_site_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VEP:TF_binding_site_variant true A structural sequence alteration or rearrangement encompassing one or more genome fragments, with 4 or more breakpoints. kareneilbeck 2011-03-23T03:21:19Z SO:1000146 complex chromosomal mutation complex_chromosomal_mutation sequence complex SO:0001784 complex_structural_alteration A structural sequence alteration or rearrangement encompassing one or more genome fragments, with 4 or more breakpoints. FB:reference_manual NCBI:th SO:ke complex http://www.ncbi.nlm.nih.gov/dbvar/ An alteration of the genome that leads to a change in the structure of one or more chromosomes. kareneilbeck 2011-03-25T02:27:41Z structural alteration sequence SO:0001785 structural_alteration A functional variant whereby the sequence alteration causes a loss of function of one allele of a gene. kareneilbeck 2011-03-25T02:32:58Z LOH loss of heterozygosity sequence SO:0001786 loss_of_heterozygosity A functional variant whereby the sequence alteration causes a loss of function of one allele of a gene. SO:ke A sequence variant that causes a change at the 5th base pair after the start of the intron in the orientation of the transcript. 
kareneilbeck 2011-04-05T04:16:28Z splice donor 5th base variant sequence SO:0001787 splice_donor_5th_base_variant A sequence variant that causes a change at the 5th base pair after the start of the intron in the orientation of the transcript. EBI:gr A U-box is a conserved T-rich region upstream of a retroviral polypurine tract that is involved in PPT primer creation during reverse transcription. kareneilbeck 2011-04-08T10:39:14Z U-box sequence SO:0001788 U_box A U-box is a conserved T-rich region upstream of a retroviral polypurine tract that is involved in PPT primer creation during reverse transcription. PMID:10556309 PMID:11577982 PMID:9649446 A specialized region in the genomes of some yeast and fungi, the genes of which regulate mating type. kareneilbeck 2011-04-08T11:14:07Z http://en.wikipedia.org/wiki/Mating-type_region mating type region sequence SO:0001789 mating_type_region A specialized region in the genomes of some yeast and fungi, the genes of which regulate mating type. SO:ke An assembly region that has been sequenced from both ends resulting in a read_pair (mate_pair). kareneilbeck 2011-04-14T01:48:20Z paired end fragment sequence SO:0001790 paired_end_fragment An assembly region that has been sequenced from both ends resulting in a read_pair (mate_pair). SO:ke A sequence variant that changes exon sequence. kareneilbeck 2011-05-06T01:51:17Z http://snpeff.sourceforge.net/SnpEff_manual.html ANNOVAR:exonic Jannovar:exon_variant VAAST:exon_variant exon variant snpEff:EXON sequence SO:0001791 exon_variant A sequence variant that changes exon sequence. SO:ke ANNOVAR:exonic http://www.openbioinformatics.org/annovar/annovar_download.html Jannovar:exon_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VAAST:exon_variant snpEff:EXON A sequence variant that changes non-coding exon sequence in a non-coding transcript. 
kareneilbeck 2011-05-06T01:51:59Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq:non-coding-exon VEP:non_coding_transcript_exon_variant non coding transcript exon variant non_coding_transcript_exon_variant snpEff:non_coding_exon_variant ANNOVAR:ncRNA_exonic Jannovar:non_coding_transcript_exon_variant sequence Seattleseq:non-coding-exon-near-splice SO:0001792 non_coding_transcript_exon_variant A sequence variant that changes non-coding exon sequence in a non-coding transcript. EBI:fc SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq Seattleseq:non-coding-exon VEP:non_coding_transcript_exon_variant non_coding_transcript_exon_variant snpEff:non_coding_exon_variant ANNOVAR:ncRNA_exonic Jannovar:non_coding_transcript_exon_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:non-coding-exon-near-splice A read from an end of the clone sequence. kareneilbeck 2011-05-13T11:32:27Z clone end sequence SO:0001793 clone_end A read from an end of the clone sequence. SO:ke A point centromere is a relatively small centromere (about 125 bp DNA) in discrete sequence, found in some yeast including S. cerevisiae. kareneilbeck 2011-05-31T12:42:35Z point centromere sequence SO:0001794 point_centromere A point centromere is a relatively small centromere (about 125 bp DNA) in discrete sequence, found in some yeast including S. cerevisiae. PMID:7502067 SO:vw A regional centromere is a large modular centromere found in fission yeast and higher eukaryotes. It consists of a central core region flanked by inverted inner and outer repeat regions. kareneilbeck 2011-05-31T12:43:07Z regional centromere sequence SO:0001795 regional_centromere A regional centromere is a large modular centromere found in fission yeast and higher eukaryotes. It consists of a central core region flanked by inverted inner and outer repeat regions. 
PMID:7502067 SO:vw A conserved region within the central region of a modular centromere, where the kinetochore is formed. kareneilbeck 2011-05-31T12:56:30Z regional centromere central core sequence SO:0001796 regional_centromere_central_core A conserved region within the central region of a modular centromere, where the kinetochore is formed. SO:vw A repeat region found within the modular centromere. kareneilbeck 2011-05-31T12:59:27Z INSDC_feature:repeat_region INSDC_qualifier:centromeric_repeat centromeric repeat sequence SO:0001797 centromeric_repeat A repeat region found within the modular centromere. SO:ke The inner inverted repeat region of a modular centromere and part of the central core surrounding a non-conserved central region. This region is adjacent to the central core, on each chromosome arm. kareneilbeck 2011-05-31T01:01:08Z lmr repeat lmr1L lmr1R regional centromere inner repeat region sequence SO:0001798 regional_centromere_inner_repeat_region The inner inverted repeat region of a modular centromere and part of the central core surrounding a non-conserved central region. This region is adjacent to the central core, on each chromosome arm. SO:vw The heterochromatic outer repeat region of a modular centromere. These repeats exist in tandem arrays on both chromosome arms. kareneilbeck 2011-05-31T01:03:23Z regional centromere outer repeat region sequence SO:0001799 regional_centromere_outer_repeat_region The heterochromatic outer repeat region of a modular centromere. These repeats exist in tandem arrays on both chromosome arms. SO:vw The sequence of a 21 nucleotide double stranded, polyadenylated non coding RNA, transcribed from the TAS gene. kareneilbeck 2011-05-31T03:24:06Z sequence trans acting small interfering RNA SO:0001800 tasiRNA The sequence of a 21 nucleotide double stranded, polyadenylated non coding RNA, transcribed from the TAS gene. PMID:16145017 A primary transcript encoding a tasiRNA. 
kareneilbeck 2011-05-31T03:27:35Z tasiRNA primary transcript sequence SO:0001801 tasiRNA_primary_transcript A primary transcript encoding a tasiRNA. PMID:16145017 A transcript processing variant whereby polyadenylation of the encoded transcript is increased with respect to the reference. kareneilbeck 2011-06-01T10:53:12Z increased polyadenylation variant sequence SO:0001802 Term requested by M. Dumontier, June 1 2011. increased_polyadenylation_variant A transcript processing variant whereby polyadenylation of the encoded transcript is increased with respect to the reference. SO:ke A transcript processing variant whereby polyadenylation of the encoded transcript is decreased with respect to the reference. kareneilbeck 2011-06-01T10:53:40Z decreased polyadenylation variant sequence SO:0001803 Term requested by M. Dumontier, June 1 2011. decreased_polyadenylation_variant A transcript processing variant whereby polyadenylation of the encoded transcript is decreased with respect to the reference. SO:ke A conserved polypeptide motif that mediates protein-protein interaction and defines adaptor proteins for DDB1/cullin 4 ubiquitin ligases. kareneilbeck 2011-06-17T12:10:44Z DDB box DDB-box sequence SO:0001804 Note: PMID:18794354 describes the DDB box, and has lots of alignments, but doesn't actually come out with a consensus sequence. DDB_box A conserved polypeptide motif that mediates protein-protein interaction and defines adaptor proteins for DDB1/cullin 4 ubiquitin ligases. PMID:18794354 PMID:19818632 A conserved polypeptide motif that can be recognized by both Fizzy/Cdc20- and FZR/Cdh1-activated anaphase-promoting complex/cyclosome (APC/C) and targets a protein for ubiquitination and subsequent degradation by the APC/C. The consensus sequence is RXXLXXXXN. 
kareneilbeck 2011-06-17T12:16:02Z D-box destruction box sequence SO:0001805 destruction_box A conserved polypeptide motif that can be recognized by both Fizzy/Cdc20- and FZR/Cdh1-activated anaphase-promoting complex/cyclosome (APC/C) and targets a protein for ubiquitination and subsequent degradation by the APC/C. The consensus sequence is RXXLXXXXN. PMID:12208841 PMID:1842691 A C-terminal tetrapeptide motif that mediates retention of a protein in (or retrieval to) the endoplasmic reticulum. In mammals the sequence is KDEL, and in fungi HDEL or DDEL. kareneilbeck 2011-06-17T12:19:49Z ER retention signal endoplasmic reticulum retention signal sequence SO:0001806 ER_retention_signal A C-terminal tetrapeptide motif that mediates retention of a protein in (or retrieval to) the endoplasmic reticulum. In mammals the sequence is KDEL, and in fungi HDEL or DDEL. PMID:2077689 doi:10.1093/jxb/50.331.157 A conserved polypeptide motif that can be recognized by FZR/Cdh1-activated anaphase-promoting complex/cyclosome (APC/C) and targets a protein for ubiquitination and subsequent degradation by the APC/C. The consensus sequence is KENXXXN. kareneilbeck 2011-06-17T12:24:14Z KEN box sequence SO:0001807 KEN_box A conserved polypeptide motif that can be recognized by FZR/Cdh1-activated anaphase-promoting complex/cyclosome (APC/C) and targets a protein for ubiquitination and subsequent degradation by the APC/C. The consensus sequence is KENXXXN. PMID:10733526 PMID:1220884 PMID:18426916 A polypeptide region that targets a polypeptide to the mitochondrion. kareneilbeck 2011-06-17T12:26:35Z MTS mitochondrial signal sequence mitochondrial targeting signal sequence SO:0001808 mitochondrial_targeting_signal A polypeptide region that targets a polypeptide to the mitochondrion. PomBase:mah A signal sequence that is not cleaved from the polypeptide. Anchors a Type II membrane protein to the membrane. 
kareneilbeck 2011-06-17T12:28:53Z signal anchor uncleaved signal peptide sequence SO:0001809 signal_anchor A signal sequence that is not cleaved from the polypeptide. Anchors a Type II membrane protein to the membrane. http://www.cbs.dtu.dk/services/SignalP/background/biobackground.php A polypeptide region that mediates binding to PCNA. The consensus sequence is QXX(hh)XX(aa), where (h) denotes residues with moderately hydrophobic side chains and (a) denotes residues with highly hydrophobic aromatic side chains. kareneilbeck 2011-06-17T12:33:25Z PIP box sequence SO:0001810 PIP_box A polypeptide region that mediates binding to PCNA. The consensus sequence is QXX(hh)XX(aa), where (h) denotes residues with moderately hydrophobic side chains and (a) denotes residues with highly hydrophobic aromatic side chains. PMID:9631646 A post-translationally modified region in which residues of the protein are modified by phosphorylation. kareneilbeck 2011-06-17T12:36:20Z phosphorylation site sequence SO:0001811 phosphorylation_site A post-translationally modified region in which residues of the protein are modified by phosphorylation. PomBase:mah A region that traverses the lipid bilayer and adopts a helical secondary structure. kareneilbeck 2011-06-17T12:39:46Z transmembrane helix sequence SO:0001812 transmembrane_helix A region that traverses the lipid bilayer and adopts a helical secondary structure. PomBase:mah A polypeptide region that targets a polypeptide to the vacuole. kareneilbeck 2011-06-17T12:42:48Z vacuolar sorting signal sequence SO:0001813 vacuolar_sorting_signal A polypeptide region that targets a polypeptide to the vacuole. PomBase:mah An attribute of a coding genomic variant. kareneilbeck 2011-06-24T03:32:25Z coding variant quality sequence SO:0001814 coding_variant_quality A variant that does not lead to any change in the amino acid sequence. 
kareneilbeck 2011-06-24T03:33:16Z sequence SO:0001815 synonymous A variant that leads to the change of an amino acid within the protein. kareneilbeck 2011-06-24T03:33:36Z sequence non synonymous SO:0001816 non_synonymous An attribute describing a sequence that contains a mutation involving the deletion or insertion of one or more bases, where this number is divisible by 3. kareneilbeck 2011-06-24T03:34:03Z sequence SO:0001817 inframe An attribute describing a sequence that contains a mutation involving the deletion or insertion of one or more bases, where this number is divisible by 3. SO:ke A sequence_variant which is predicted to change the protein encoded in the coding sequence. kareneilbeck 2011-06-24T03:38:02Z http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences VEP:protein_altering_variant protein altering variant sequence SO:0001818 protein_altering_variant A sequence_variant which is predicted to change the protein encoded in the coding sequence. EBI:gr VEP:protein_altering_variant A sequence variant where there is no resulting change to the encoded amino acid. kareneilbeck 2011-06-24T03:38:30Z SO:0001588 http://en.wikipedia.org/wiki/Silent_mutation http://en.wikipedia.org/wiki/Synonymous_mutation http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp http://snpeff.sourceforge.net/SnpEff_manual.html http://vat.gersteinlab.org/formats.php http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:synonymous_variant Seattleseq:synonymous VAAST:synonymous_codon VAAST:synonymous_variant VAT:synonymous VEP:synonymous_variant coding-synon snpEff:SYNONYMOUS_CODING synonymous codon synonymous_coding synonymous_codon sequence ANNOVAR:synonymous SNV Seattleseq:synonymous-near-splice silent mutation silent substitution silent_mutation SO:0001819 EBI term: Synonymous SNPs - In coding sequence, not resulting in an amino acid change (i.e. silent mutation). 
This term is sometimes used synonymously with the more general term 'silent mutation', although a silent mutation may occur in non coding sequence. The best practice is to annotate to the most specific term. synonymous_variant A sequence variant where there is no resulting change to the encoded amino acid. SO:ke http://en.wikipedia.org/wiki/Silent_mutation wiki http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq http://vat.gersteinlab.org/formats.php VAT Jannovar:synonymous_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html Seattleseq:synonymous VAAST:synonymous_codon VAAST:synonymous_variant VAT:synonymous VEP:synonymous_variant coding-synon ftp://ftp.ncbi.nih.gov/snp/specs/docsum_3.1.xsd snpEff:SYNONYMOUS_CODING ANNOVAR:synonymous SNV http://www.openbioinformatics.org/annovar/annovar_download.html Seattleseq:synonymous-near-splice A coding sequence variant where the change does not alter the frame of the transcript. kareneilbeck 2011-06-27T11:25:33Z inframe change in CDS length inframe indel sequence SO:0001820 inframe_indel A coding sequence variant where the change does not alter the frame of the transcript. SO:ke An inframe non synonymous variant that inserts bases into the coding sequence. kareneilbeck 2011-06-27T11:26:22Z SO:0001651 http://snpeff.sourceforge.net/SnpEff_manual.html http://vat.gersteinlab.org/formats.php http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences ANNOVAR:nonframeshift insertion Jannovar:inframe_insertion VAT:insertionNFS VEP:inframe_insertion inframe increase in CDS length inframe insertion inframe_codon_gain snpEFF:CODON_INSERTION sequence inframe codon gain SO:0001821 inframe_insertion An inframe non synonymous variant that inserts bases into the coding sequence. 
EBI:gr http://vat.gersteinlab.org/formats.php VAT ANNOVAR:nonframeshift insertion http://www.openbioinformatics.org/annovar/annovar_download.html Jannovar:inframe_insertion http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VAT:insertionNFS VEP:inframe_insertion snpEFF:CODON_INSERTION An inframe non synonymous variant that deletes bases from the coding sequence. kareneilbeck 2011-06-27T11:27:10Z SO:0001652 http://snpeff.sourceforge.net/SnpEff_manual.html http://vat.gersteinlab.org/formats.php http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences ANNOVAR:nonframeshift deletion Jannovar:inframe_deletion VAT:deletionNFS VEP:inframe_deletion inframe decrease in CDS length inframe_codon_loss sequence inframe codon loss inframe deletion snpEff:CODON_DELETION SO:0001822 inframe_deletion An inframe non synonymous variant that deletes bases from the coding sequence. EBI:gr http://vat.gersteinlab.org/formats.php VAT ANNOVAR:nonframeshift deletion http://www.openbioinformatics.org/annovar/annovar_download.html Jannovar:inframe_deletion http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VAT:deletionNFS VEP:inframe_deletion snpEff:CODON_DELETION An inframe increase in cds length that inserts one or more codons into the coding sequence between existing codons. kareneilbeck 2011-06-27T11:28:02Z conservative increase in CDS length conservative inframe insertion sequence SO:0001823 conservative_inframe_insertion An inframe increase in cds length that inserts one or more codons into the coding sequence between existing codons. EBI:gr An inframe increase in cds length that inserts one or more codons into the coding sequence within an existing codon. 
kareneilbeck 2011-06-27T11:28:37Z http://snpeff.sourceforge.net/SnpEff_manual.html Jannovar:disruptive_inframe_insertion disruptive increase in CDS length disruptive inframe insertion snpEff:CODON_CHANGE_PLUS_CODON_INSERTION sequence SO:0001824 disruptive_inframe_insertion An inframe increase in cds length that inserts one or more codons into the coding sequence within an existing codon. EBI:gr Jannovar:disruptive_inframe_insertion http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html snpEff:CODON_CHANGE_PLUS_CODON_INSERTION An inframe decrease in cds length that deletes one or more entire codons from the coding sequence but does not change any remaining codons. kareneilbeck 2011-06-27T11:30:43Z conservative inframe deletion sequence conservative decrease in CDS length SO:0001825 conservative_inframe_deletion An inframe decrease in cds length that deletes one or more entire codons from the coding sequence but does not change any remaining codons. EBI:gr An inframe decrease in cds length that deletes bases from the coding sequence starting within an existing codon. kareneilbeck 2011-06-27T11:31:31Z http://snpeff.sourceforge.net/SnpEff_manual.html Jannovar:disruptive_inframe_deletion disruptive decrease in CDS length disruptive inframe deletion snpEff:CODON_CHANGE_PLUS_CODON_DELETION sequence SO:0001826 disruptive_inframe_deletion An inframe decrease in cds length that deletes bases from the coding sequence starting within an existing codon. EBI:gr Jannovar:disruptive_inframe_deletion http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html snpEff:CODON_CHANGE_PLUS_CODON_DELETION A sequencer read of an mRNA substrate. kareneilbeck 2011-06-28T04:04:32Z mRNA read sequence SO:0001827 Requested by Bayer Cropscience June, 2011. mRNA_read A sequencer read of an mRNA substrate. SO:ke A sequencer read of a genomic DNA substrate. 
kareneilbeck 2011-06-28T04:06:10Z gDNA read gDNA_read genomic DNA read sequence SO:0001828 genomic_DNA_read A sequencer read of a genomic DNA substrate. SO:ke A contig composed of mRNA_reads. kareneilbeck 2011-06-28T04:07:09Z sequence mRNA contig SO:0001829 Requested by Bayer Cropscience June, 2011. mRNA_contig A contig composed of mRNA_reads. SO:ke A PCR product obtained by applying the AFLP technique, based on a restriction enzyme digestion of genomic DNA and an amplification of the resulting fragments. kareneilbeck 2011-07-14T12:12:35Z http://en.wikipedia.org/wiki/Amplified_fragment_length_polymorphism AFLP AFLP fragment AFLP-PCR amplified fragment length polymorphism amplified fragment length polymorphism PCR sequence SO:0001830 Requested by Bayer Cropscience June, 2011. AFLP_fragment A PCR product obtained by applying the AFLP technique, based on a restriction enzyme digestion of genomic DNA and an amplification of the resulting fragments. GMOD:ea http://en.wikipedia.org/wiki/Amplified_fragment_length_polymorphism wiki A match to a protein HMM such as pfam. kareneilbeck 2011-08-11T03:20:27Z protein hmm match sequence SO:0001831 protein_hmm_match A match to a protein HMM such as pfam. SO:ke A region of immunoglobulin sequence, either constant or variable. kareneilbeck 2011-09-01T03:27:20Z immunoglobulin region sequence SO:0001832 immunoglobulin_region A region of immunoglobulin sequence, either constant or variable. SO:ke The variable region of an immunoglobulin polypeptide sequence. kareneilbeck 2011-09-01T03:28:40Z INSDC_feature:V_region V region sequence SO:0001833 V_region The variable region of an immunoglobulin polypeptide sequence. SO:ke The constant region of an immunoglobulin polypeptide sequence. kareneilbeck 2011-09-01T03:29:41Z C region sequence SO:0001834 C_region The constant region of an immunoglobulin polypeptide sequence. SO:ke Extra nucleotides inserted between rearranged immunoglobulin segments. 
kareneilbeck 2011-09-01T03:50:16Z INSDC_feature:N_region N-region sequence SO:0001835 N_region Extra nucleotides inserted between rearranged immunoglobulin segments. SO:ke The switch region of immunoglobulin heavy chains; it is involved in the rearrangement of heavy chain DNA leading to the expression of different immunoglobulin classes from the same B-cell. kareneilbeck 2011-09-01T03:52:05Z INSDC_feature:S_region S region sequence SO:0001836 S_region The switch region of immunoglobulin heavy chains; it is involved in the rearrangement of heavy chain DNA leading to the expression of different immunoglobulin classes from the same B-cell. SO:ke A kind of insertion where the inserted sequence is a mobile element. kareneilbeck 2011-10-04T12:36:52Z mobile element insertion sequence SO:0001837 Requested by the EBI. mobile_element_insertion A kind of insertion where the inserted sequence is a mobile element. EBI:dvga An insertion the sequence of which cannot be mapped to the reference genome. kareneilbeck 2011-10-04T01:14:50Z novel sequence insertion sequence SO:0001838 Requested by the NCBI. novel_sequence_insertion An insertion the sequence of which cannot be mapped to the reference genome. NCBI:th A promoter element with consensus sequence GTGRGAA, bound by CSL (CBF1/RBP-JK/Suppressor of Hairless/LAG-1) transcription factors. kareneilbeck 2011-10-07T03:37:43Z CSL response element sequence SO:0001839 CSL_response_element A promoter element with consensus sequence GTGRGAA, bound by CSL (CBF1/RBP-JK/Suppressor of Hairless/LAG-1) transcription factors. PMID:19101542 A GATA transcription factor element containing the consensus sequence WGATAR (in which W indicates A/T and R indicates A/G). 
kareneilbeck 2011-10-07T03:42:05Z GATA box sequence GATA element SO:0001840 Changed to is_a SO:0001055 transcriptional_cis_regulatory_region from core_eukaryotic_promoter_element SO:0001660 after Ruth Lovering from GREEKC initiative pointed out that GATA boxes are frequently in enhancer regions, Dave Sant Aug 2020. Moved from is_a SO:0001055 transcriptional_cis_regulatory_region to SO:0000235 TF_binding_site after Colin Logie pointed out that this is a consensus sequence where transcription factors bind, GREEKC Jan 21, 2021. GATA_box A GATA transcription factor element containing the consensus sequence WGATAR (in which W indicates A/T and R indicates A/G). PMID:8321208 A pseudogene in the reference genome, though known to be intact in the genomes of other individuals of the same species. The annotation process has confirmed that the pseudogenisation event is not a genomic sequencing error. kareneilbeck 2011-10-07T03:46:57Z polymorphic psuedogene sequence SO:0001841 This term is used by Ensembl and Vega. Pseudogene owing to a SNP/DIP but in other individuals/haplotypes/strains the gene is translated. polymorphic_pseudogene A pseudogene in the reference genome, though known to be intact in the genomes of other individuals of the same species. The annotation process has confirmed that the pseudogenisation event is not a genomic sequencing error. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html JAX:hd A promoter element with consensus sequence TGACTCA, bound by AP-1 and related transcription factors. kareneilbeck 2011-10-07T03:54:52Z AP-1 binding site sequence SO:0001842 AP_1_binding_site A promoter element with consensus sequence TGACTCA, bound by AP-1 and related transcription factors. PMID:1899230 PMID:3034432 PMID:3125983 MERGED DEFINITION: TARGET DEFINITION: A promoter element with consensus sequence TGACGTCA; bound by the ATF/CREB family of transcription factors. 
-------------------- SOURCE DEFINITION: A promoter element that contains a core sequence TGACGT, bound by a protein complex that regulates transcription of genes encoding PKA pathway components. kareneilbeck 2011-10-07T03:58:48Z SO:0001900 ATF/CRE site Atf1/Pcr1 recognition motif M26 binding site M26_binding_site cyclic AMP response element m26 site sequence SO:0001843 New synonym Atf1/Pcr1 recognition motif added in response to Antonia Lock GitHub Issue Request #437, PMID:15716492 CRE MERGED DEFINITION: TARGET DEFINITION: A promoter element with consensus sequence TGACGTCA; bound by the ATF/CREB family of transcription factors. -------------------- SOURCE DEFINITION: A promoter element that contains a core sequence TGACGT, bound by a protein complex that regulates transcription of genes encoding PKA pathway components. PMID:11483355 PMID:11483993 PMID:15448137 ATF/CRE site PMID:11483993 A promoter element bound by copper ion-sensing transcription factors such as S. cerevisiae Mac1p or S. pombe Cuf1; the consensus sequence is HTHNNGCTGD (more specifically TTTGCKCR in budding yeast). kareneilbeck 2011-10-07T04:02:51Z copper-response element sequence SO:0001844 CuRE A promoter element bound by copper ion-sensing transcription factors such as S. cerevisiae Mac1p or S. pombe Cuf1; the consensus sequence is HTHNNGCTGD (more specifically TTTGCKCR in budding yeast). PMID:10593913 PMID:9188496 PMID:9211922 A promoter element with consensus sequence CGWGGWNGMM, bound by transcription factors related to RecA and found in promoters of genes expressed following several types of DNA damage or inhibition of DNA synthesis. kareneilbeck 2011-10-07T04:17:25Z DNA damage response element sequence SO:0001845 DRE A promoter element with consensus sequence CGWGGWNGMM, bound by transcription factors related to RecA and found in promoters of genes expressed following several types of DNA damage or inhibition of DNA synthesis. 
PMID:11073995 PMID:8668127 A promoter element that has consensus sequence GTAAACAAACAAAM and contains a heptameric core GTAAACA, bound by transcription factors with a forkhead DNA-binding domain. kareneilbeck 2011-10-07T04:20:01Z sequence FLEX element SO:0001846 FLEX_element A promoter element that has consensus sequence GTAAACAAACAAAM and contains a heptameric core GTAAACA, bound by transcription factors with a forkhead DNA-binding domain. PMID:10747048 PMID:14871934 A promoter element with consensus sequence TTTRTTTACA, bound by transcription factors with a forkhead DNA-binding domain. kareneilbeck 2011-10-07T04:22:06Z forkhead motif sequence SO:0001847 forkhead_motif A promoter element with consensus sequence TTTRTTTACA, bound by transcription factors with a forkhead DNA-binding domain. PMID:15195092 A core promoter element that has the consensus sequence CAGTCACA (or its inverted form TGTGACTG), and plays the role of a TATA box in promoters that do not contain a canonical TATA sequence. kareneilbeck 2011-10-07T04:24:14Z homoID homol D box sequence SO:0001848 homol_D_box A core promoter element that has the consensus sequence CAGTCACA (or its inverted form TGTGACTG), and plays the role of a TATA box in promoters that do not contain a canonical TATA sequence. PMID:21673110 PMID:7501449 PMID:8458332 A core promoter element that has the consensus sequence ACCCTACCCT (or its inverted form AGGGTAGGGT), and is found near the homol D box in some promoters that use a homol D box instead of a canonical TATA sequence. kareneilbeck 2011-10-07T04:26:09Z homol E box sequence SO:0001849 homol_E_box A core promoter element that has the consensus sequence ACCCTACCCT (or its inverted form AGGGTAGGGT), and is found near the homol D box in some promoters that use a homol D box instead of a canonical TATA sequence. PMID:7501449 A promoter element that consists of at least three copies of the pentanucleotide NGAAN, bound by the heat shock transcription factor HSF. 
kareneilbeck 2011-10-07T04:29:10Z heat shock element sequence SO:0001850 HSE A promoter element that consists of at least three copies of the pentanucleotide NGAAN, bound by the heat shock transcription factor HSF. PMID:17347150 PMID:8689565 A GATA promoter element with consensus sequence WGATAA, found in promoters of genes repressed in the presence of iron. kareneilbeck 2011-10-07T04:32:42Z IDP (GATA) iron repressed GATA element sequence SO:0001851 The synonym IDP (GATA) is found in an annotation but un-traced as far as literature goes. iron_repressed_GATA_element A GATA promoter element with consensus sequence WGATAA, found in promoters of genes repressed in the presence of iron. PMID:11956219 PMID:17211681 A promoter element with consensus sequence ACAAT, found in promoters of mating type M-specific genes in fission yeast and bound by the transcription factor Mat1-Mc. kareneilbeck 2011-10-07T04:39:43Z mating type M-box sequence SO:0001852 Note that this should not be confused with the M-box that has consensus sequence CATGTG and is bound by bHLH transcription factors such as MITF. mating_type_M_box A promoter element with consensus sequence ACAAT, found in promoters of mating type M-specific genes in fission yeast and bound by the transcription factor Mat1-Mc. PMID:9233811 A non-palindromic sequence found in the promoters of genes whose expression is regulated in response to androgen. kareneilbeck 2011-10-10T04:52:44Z ARE androgen response element sequence SO:0001853 androgen_response_element A non-palindromic sequence found in the promoters of genes whose expression is regulated in response to androgen. PMID:21796522 A smFISH is a probe that binds RNA in a single molecule in situ hybridization experiment. kareneilbeck 2011-10-10T05:00:30Z single molecule fish probe sequence smFISH probe SO:0001854 smFISH_probe A smFISH is a probe that binds RNA in a single molecule in situ hybridization experiment. 
PMID:18806792 A promoter element with consensus sequence ACGCGT, bound by the transcription factor complex MBF (MCB-binding factor) and found in promoters of genes expressed during the G1/S transition of the cell cycle. kareneilbeck 2011-10-10T05:09:45Z MluI cell cycle box sequence SO:0001855 MCB A promoter element with consensus sequence ACGCGT, bound by the transcription factor complex MBF (MCB-binding factor) and found in promoters of genes expressed during the G1/S transition of the cell cycle. PMID:16285853 A promoter element with consensus sequence CCAAT, bound by a protein complex that represses transcription in response to low iron levels. kareneilbeck 2011-10-10T05:13:54Z CCAAT motif sequence SO:0001856 CCAAT_motif A promoter element with consensus sequence CCAAT, bound by a protein complex that represses transcription in response to low iron levels. PMID:16963626 A promoter element with consensus sequence CCAGCC, bound by the fungal transcription factor Ace2. kareneilbeck 2011-10-10T05:19:10Z Ace2 upstream activating sequence sequence SO:0001857 Ace2_UAS A promoter element with consensus sequence CCAGCC, bound by the fungal transcription factor Ace2. PMID:16678171 A promoter element with consensus sequence TTCTTTGTTY, bound by an HMG-box transcription factor such as S. pombe Ste11, and found in promoters of genes up-regulated early in meiosis. kareneilbeck 2011-10-10T05:22:13Z TR box sequence SO:0001858 TR_box A promoter element with consensus sequence TTCTTTGTTY, bound by an HMG-box transcription factor such as S. pombe Ste11, and found in promoters of genes up-regulated early in meiosis. PMID:1657709 A promoter element with consensus sequence CCCCTC, bound by the PKA-responsive zinc finger transcription factor Rst2. 
kareneilbeck 2011-10-14T10:25:02Z stress-starvation response element of Schizosaccharomyces pombe sequence STREP motif SO:0001859 STREP_motif A promoter element with consensus sequence CCCCTC, bound by the PKA-responsive zinc finger transcription factor Rst2. PMID:11739717 A DNA motif that contains a core consensus sequence AGGTAAGGGTAATGCAC, is found in the intergenic regions of rDNA repeats, and is bound by an RNA polymerase I transcription termination factor (e.g. S. pombe Reb1). The S. pombe telomeric repeat consensus is TTAC(0-1)A(0-1)G(1-8). kareneilbeck 2011-10-19T11:23:09Z rDIS sequence SO:0001860 Page 208 of ISBN:978-0199638901 rDNA_intergenic_spacer_element A DNA motif that contains a core consensus sequence AGGTAAGGGTAATGCAC, is found in the intergenic regions of rDNA repeats, and is bound by an RNA polymerase I transcription termination factor (e.g. S. pombe Reb1). The S. pombe telomeric repeat consensus is TTAC(0-1)A(0-1)G(1-8). ISBN:978-0199638901 PMID:9016645 A 10-bp promoter element bound by sterol regulatory element binding proteins (SREBPs), found in promoters of genes involved in sterol metabolism. Many variants of the sequence ATCACCCCAC function as SREs. kareneilbeck 2011-10-19T03:02:05Z SRE sequence SO:0001861 sterol_regulatory_element A 10-bp promoter element bound by sterol regulatory element binding proteins (SREBPs), found in promoters of genes involved in sterol metabolism. Many variants of the sequence ATCACCCCAC function as SREs. GO:mah PMID:11111080 PMID:16537923 SRE GO:mah A dinucleotide repeat region composed of GT repeating elements. kareneilbeck 2011-10-19T03:54:37Z d(GT)n sequence SO:0001862 paper:PMID:16043634. GT_dinucleotide_repeat A dinucleotide repeat region composed of GT repeating elements. SO:ke A trinucleotide repeat region composed of GTT repeating elements. kareneilbeck 2011-10-19T03:56:54Z d(GTT) sequence SO:0001863 GTT_trinucleotide_repeat A trinucleotide repeat region composed of GTT repeating elements. 
SO:ke A DNA motif to which the S. pombe Sap1 protein binds. The consensus sequence is 5'-TARGCAGNTNYAACGMG-3'; it is found at the mating type locus, where it is important for mating type switching, and at replication fork barriers in rDNA repeats. kareneilbeck 2011-10-19T04:24:16Z Sap1 recognitions site sequence SO:0001864 Sap1_recognition_motif A DNA motif to which the S. pombe Sap1 protein binds. The consensus sequence is 5'-TARGCAGNTNYAACGMG-3'; it is found at the mating type locus, where it is important for mating type switching, and at replication fork barriers in rDNA repeats. PMID:16166653 PMID:7651412 An RNA polymerase II promoter element found in the promoters of genes regulated by calcineurin. The consensus sequence is GNGGCKCA. kareneilbeck 2011-10-20T10:12:19Z CDRE motif calcineurin-dependent response element sequence SO:0001865 CDRE_motif An RNA polymerase II promoter element found in the promoters of genes regulated by calcineurin. The consensus sequence is GNGGCKCA. PMID:16928959 calcineurin-dependent response element PMID:16928959 A contig of BAC reads. kareneilbeck 2012-01-17T02:45:04Z BAC read contig sequence SO:0001866 Requested by Bayer Cropscience December, 2011. BAC_read_contig A contig of BAC reads. GMOD:ea A gene suspected of being involved in the expression of a trait. kareneilbeck 2012-01-17T02:53:03Z candidate gene target gene sequence SO:0001867 Requested by Bayer Cropscience December, 2011. candidate_gene A gene suspected of being involved in the expression of a trait. GMOD:ea A candidate gene whose association with a trait is based on the gene's location on a chromosome. kareneilbeck 2012-01-17T02:54:42Z positional candidate gene sequence positional target gene SO:0001868 Requested by Bayer Cropscience December, 2011. positional_candidate_gene A candidate gene whose association with a trait is based on the gene's location on a chromosome. 
GMOD:ea A candidate gene whose function has something in common biologically with the trait under investigation. kareneilbeck 2012-01-17T02:57:30Z functional candidate gene functional target gene sequence SO:0001869 Requested by Bayer Cropscience December, 2011. functional_candidate_gene A candidate gene whose function has something in common biologically with the trait under investigation. GMOD:ea A short ncRNA that is transcribed from an enhancer. May have a regulatory function. kareneilbeck 2012-01-17T03:09:35Z eRNA sequence SO:0001870 enhancerRNA A short ncRNA that is transcribed from an enhancer. May have a regulatory function. SO:cjm doi:10.1038/465173a A promoter element with consensus sequence GNAACR, bound by the transcription factor complex PBF (PCB-binding factor) and found in promoters of genes expressed during the M/G1 transition of the cell cycle. kareneilbeck 2012-01-17T03:14:02Z sequence SO:0001871 PCB A promoter element with consensus sequence GNAACR, bound by the transcription factor complex PBF (PCB-binding factor) and found in promoters of genes expressed during the M/G1 transition of the cell cycle. GO:mah PMID:12411492 A region of a chromosome, where the chromosome has undergone a large structural rearrangement that altered the genome organization. There is no longer synteny to the reference genome. kareneilbeck 2012-02-03T04:38:35Z rearrangement region sequence SO:0001872 NCBI definition: An orphan rearrangement between chromosomal location observed in isolation. rearrangement_region A region of a chromosome, where the chromosome has undergone a large structural rearrangement that altered the genome organization. There is no longer synteny to the reference genome. NCBI:th PMID:18564416 A rearrangement breakpoint between two different chromosomes. kareneilbeck 2012-02-03T04:43:45Z interchromosomal breakpoint sequence SO:0001873 interchromosomal_breakpoint A rearrangement breakpoint between two different chromosomes. 
NCBI:th A rearrangement breakpoint within the same chromosome. kareneilbeck 2012-02-03T04:44:53Z intrachromosomal breakpoint sequence SO:0001874 intrachromosomal_breakpoint A rearrangement breakpoint within the same chromosome. NCBI:th A supercontig that has not been assigned to any ultracontig during a genome assembly project. kareneilbeck 2012-02-14T05:02:20Z unassigned supercontig sequence unassigned scaffold SO:0001875 Requested by Bayer Cropscience January, 2012. unassigned_supercontig A supercontig that has not been assigned to any ultracontig during a genome assembly project. GMOD:ea A partial DNA sequence assembly of a chromosome or full genome, which contains gaps that are filled with N's. kareneilbeck 2012-02-14T05:05:32Z pseudomolecule partial genomic sequence assembly sequence assembly with N-gaps sequence SO:0001876 Requested by Bayer Cropscience January, 2012. partial_genomic_sequence_assembly A partial DNA sequence assembly of a chromosome or full genome, which contains gaps that are filled with N's. GMOD:ea A non-coding RNA generally longer than 200 nucleotides that cannot be classified as any other ncRNA subtype. Similar to mRNAs, lncRNAs are mainly transcribed by RNA polymerase II, are often capped by 7-methyl guanosine at their 5' ends, polyadenylated at their 3' ends and may be spliced. kareneilbeck 2012-02-14T05:18:01Z INSDC_feature:ncRNA http://www.gencodegenes.org/gencode_biotypes.html INSDC_qualifier:lncRNA lncRNA_transcript long non-coding RNA sequence SO:0001877 Updated the definition of lncRNA (SO:0001877) from "A non-coding RNA over 200nucleotides in length." to "A non-coding RNA generally longer than 200 nucleotides that cannot be classified as any other ncRNA subtype. Similar to mRNAs, lncRNAs are mainly transcribed by RNA polymerase II, are often capped by 7-methyl guanosine at their 5' ends, polyadenylated at their 3' ends and may be spliced." 
See GitHub Issue #575 lncRNA A non-coding RNA generally longer than 200 nucleotides that cannot be classified as any other ncRNA subtype. Similar to mRNAs, lncRNAs are mainly transcribed by RNA polymerase II, are often capped by 7-methyl guanosine at their 5' ends, polyadenylated at their 3' ends and may be spliced. HGNC:mw PMID:33353982 http://www.gencodegenes.org/gencode_biotypes.html GENCODE A sequence variant that falls entirely or partially within a genomic feature. kareneilbeck 2012-04-03T11:27:27Z feature alteration sequence SO:0001878 Created in conjunction with the EBI. feature_variant A sequence variant that falls entirely or partially within a genomic feature. EBI:fc SO:ke A sequence variant, caused by an alteration of the genomic sequence, where the deletion is greater than the extent of the underlying genomic features. kareneilbeck 2012-04-03T11:36:48Z feature ablation sequence SO:0001879 Created in conjunction with the EBI. feature_ablation A sequence variant, caused by an alteration of the genomic sequence, where the deletion is greater than the extent of the underlying genomic features. SO:ke A sequence variant, caused by an alteration of the genomic sequence, where the structural change, an amplification of sequence, is greater than the extent of the underlying genomic features. kareneilbeck 2012-04-03T11:37:48Z feature amplification sequence SO:0001880 Created in conjunction with the EBI. feature_amplification A sequence variant, caused by an alteration of the genomic sequence, where the structural change, an amplification of sequence, is greater than the extent of the underlying genomic features. SO:ke A sequence variant, caused by an alteration of the genomic sequence, where the structural change, a translocation, is greater than the extent of the underlying genomic features. kareneilbeck 2012-04-03T11:38:52Z feature translocation sequence SO:0001881 Created in conjunction with the EBI. 
feature_translocation A sequence variant, caused by an alteration of the genomic sequence, where the structural change, a translocation, is greater than the extent of the underlying genomic features. SO:ke A sequence variant, caused by an alteration of the genomic sequence, where a deletion fuses genomic features. kareneilbeck 2012-04-03T11:39:20Z feature fusion sequence SO:0001882 Created in conjunction with the EBI. feature_fusion A sequence variant, caused by an alteration of the genomic sequence, where a deletion fuses genomic features. SO:ke A feature translocation where the region contains a transcript. kareneilbeck 2012-04-03T12:29:52Z transcript translocation sequence SO:0001883 Created in conjunction with the EBI. transcript_translocation A feature translocation where the region contains a transcript. SO:ke A feature translocation where the region contains a regulatory region. kareneilbeck 2012-04-03T12:31:04Z regulatory region translocation sequence SO:0001884 Created in conjunction with the EBI. regulatory_region_translocation A feature translocation where the region contains a regulatory region. SO:ke A feature translocation where the region contains a transcription factor binding site. kareneilbeck 2012-04-03T12:31:15Z TFBS binding site translocation transcription factor binding site translocation sequence SO:0001885 Created in conjunction with the EBI. TFBS_translocation A feature translocation where the region contains a transcription factor binding site. SO:ke A feature fusion where the deletion brings together transcript regions. kareneilbeck 2012-04-03T12:34:56Z transcript fusion sequence SO:0001886 Created in conjunction with the EBI. transcript_fusion A feature fusion where the deletion brings together transcript regions. SO:ke A feature fusion where the deletion brings together regulatory regions. kareneilbeck 2012-04-03T12:35:58Z regulatory region fusion sequence SO:0001887 Created in conjunction with the EBI. 
regulatory_region_fusion A feature fusion where the deletion brings together regulatory regions. SO:ke A fusion where the deletion brings together transcription factor binding sites. kareneilbeck 2012-04-03T12:36:42Z TFBS fusion transcription factor binding site fusion sequence SO:0001888 Created in conjunction with the EBI. TFBS_fusion A fusion where the deletion brings together transcription factor binding sites. SO:ke A feature amplification of a region containing a transcript. kareneilbeck 2012-04-03T12:39:23Z http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences VEP:transcript_amplification transcript amplification sequence SO:0001889 Created in conjunction with the EBI. transcript_amplification A feature amplification of a region containing a transcript. SO:ke VEP:transcript_amplification A feature fusion where the deletion brings together a regulatory region and a transcript region. kareneilbeck 2012-04-03T12:40:17Z transcript regulatory region fusion sequence SO:0001890 Created in conjunction with the EBI. transcript_regulatory_region_fusion A feature fusion where the deletion brings together a regulatory region and a transcript region. SO:ke A feature amplification of a region containing a regulatory region. kareneilbeck 2012-04-03T12:41:28Z http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences VEP:regulatory_region_amplification regulatory region amplification sequence SO:0001891 Created in conjunction with the EBI. regulatory_region_amplification A feature amplification of a region containing a regulatory region. SO:ke VEP:regulatory_region_amplification A feature amplification of a region containing a transcription factor binding site. kareneilbeck 2012-04-03T12:42:48Z http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences TFBS amplification VEP:TFBS_amplification transcription factor binding site amplification sequence SO:0001892 Created in conjunction with the EBI. 
TFBS_amplification A feature amplification of a region containing a transcription factor binding site. SO:ke VEP:TFBS_amplification A feature ablation whereby the deleted region includes a transcript feature. kareneilbeck 2012-04-03T12:44:19Z http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:transcript_ablation VEP:transcript_ablation transcript ablation sequence SO:0001893 Created in conjunction with the EBI. transcript_ablation A feature ablation whereby the deleted region includes a transcript feature. SO:ke Jannovar:transcript_ablation http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VEP:transcript_ablation A feature ablation whereby the deleted region includes a regulatory region. kareneilbeck 2012-04-03T12:45:13Z http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences VEP:regulatory_region_ablation regulatory region ablation sequence SO:0001894 Created in conjunction with the EBI. regulatory_region_ablation A feature ablation whereby the deleted region includes a regulatory region. SO:ke VEP:regulatory_region_ablation A feature ablation whereby the deleted region includes a transcription factor binding site. kareneilbeck 2012-04-03T12:45:56Z http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences TFBS ablation VEP:TFBS_ablation transcription factor binding site ablation sequence SO:0001895 Created in conjunction with the EBI. TFBS_ablation A feature ablation whereby the deleted region includes a transcription factor binding site. SO:ke VEP:TFBS_ablation A CDS that is part of a transposable element. kareneilbeck 2012-04-05T01:57:04Z transposable element CDS sequence SO:0001896 transposable_element_CDS A CDS that is part of a transposable element. SO:ke A pseudogene contained within a transposable element. 
kareneilbeck 2012-04-05T04:09:45Z transposable element pseudogene sequence SO:0001897 transposable_element_pseudogene A pseudogene contained within a transposable element. SO:ke A repeat region which is part of the regional centromere outer repeat region. kareneilbeck 2012-04-06T11:48:48Z dg repeat sequence SO:0001898 For the S. pombe project - requested by Val Wood. dg_repeat A repeat region which is part of the regional centromere outer repeat region. PMID:16407326 SO:vw A repeat region which is part of the regional centromere outer repeat region. kareneilbeck 2012-04-06T11:50:07Z dh repeat sequence SO:0001899 For the S. pombe project - requested by Val Wood. dh_repeat A repeat region which is part of the regional centromere outer repeat region. PMID:16407326 SO:vw true A conserved 17-bp sequence (5'-ATCA(C/A)AACCCTAACCCT-3') commonly present upstream of the start site of histone transcription units functioning as a transcription factor binding site. kareneilbeck 2012-04-06T12:05:24Z AACCCT box sequence SO:0001901 AACCCT_box A conserved 17-bp sequence (5'-ATCA(C/A)AACCCTAACCCT-3') commonly present upstream of the start site of histone transcription units functioning as a transcription factor binding site. PMID:17452352 PMID:4092687 A region surrounding a cis_splice site, either within 1-3 bases of the exon or 3-8 bases of the intron. kareneilbeck 2012-04-06T12:23:32Z sequence splice region SO:0001902 splice_region A region surrounding a cis_splice site, either within 1-3 bases of the exon or 3-8 bases of the intron. SO:bm true Non-coding RNA transcribed from the opposite DNA strand compared with other transcripts and overlap in part with sense RNA. kareneilbeck 2012-04-06T04:36:44Z natural antisense transcript sequence antisense lncRNA SO:0001904 Relationship is_a SO:0000644 antisense_RNA added 23 April 2021. 
See GitHub Issue #443 antisense_lncRNA Non-coding RNA transcribed from the opposite DNA strand compared with other transcripts and overlap in part with sense RNA. PMID:19638999 A transcript that is transcribed from the outer repeat region of a regional centromere. kareneilbeck 2012-04-11T04:54:22Z centromere outer repeat transcript regional centromere outer repeat region transcript regional_centromere_outer_repeat_region_transcript sequence SO:0001905 regional_centromere_outer_repeat_transcript A transcript that is transcribed from the outer repeat region of a regional centromere. PomBase:mah A sequence variant that causes the reduction of a genomic feature, with regard to the reference sequence. kareneilbeck 2012-04-12T05:05:28Z http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:feature_truncation VEP:feature_truncation feature truncation sequence SO:0001906 feature_truncation A sequence variant that causes the reduction of a genomic feature, with regard to the reference sequence. SO:ke Jannovar:feature_truncation http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VEP:feature_truncation A sequence variant that causes the extension of a genomic feature, with regard to the reference sequence. kareneilbeck 2012-04-12T05:05:56Z http:www.ensembl.org/info/genome/variation/predicted_data.html#consequences VEP:feature_elongation feature elongation sequence SO:0001907 feature_elongation A sequence variant that causes the extension of a genomic feature, with regard to the reference sequence. SO:ke VEP:feature_elongation A sequence variant that causes the extension of a genomic feature from within the feature rather than from the terminus of the feature, with regard to the reference sequence. 
kareneilbeck 2012-04-12T05:06:20Z Jannovar:internal_feature_elongation internal feature elongation sequence SO:0001908 internal_feature_elongation A sequence variant that causes the extension of a genomic feature from within the feature rather than from the terminus of the feature, with regard to the reference sequence. SO:ke Jannovar:internal_feature_elongation http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html A frameshift variant that causes the translational reading frame to be extended relative to the reference feature. kareneilbeck 2012-04-12T05:10:05Z ANNOVAR:frameshift insertion Jannovar:frameshift_elongation frameshift elongation sequence SO:0001909 frameshift_elongation A frameshift variant that causes the translational reading frame to be extended relative to the reference feature. SO:ke ANNOVAR:frameshift insertion http://www.openbioinformatics.org/annovar/annovar_download.html Jannovar:frameshift_elongation http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html A frameshift variant that causes the translational reading frame to be shortened relative to the reference feature. kareneilbeck 2012-04-12T05:10:45Z ANNOVAR:frameshift deletion Jannovar:frameshift_truncation frameshift truncation sequence SO:0001910 frameshift_truncation A frameshift variant that causes the translational reading frame to be shortened relative to the reference feature. SO:ke ANNOVAR:frameshift deletion http://www.openbioinformatics.org/annovar/annovar_download.html Jannovar:frameshift_truncation http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html A sequence variant where copies of a feature are increased relative to the reference. kareneilbeck 2012-04-13T11:26:32Z copy number increase sequence SO:0001911 copy_number_increase A sequence variant where copies of a feature are increased relative to the reference. 
SO:ke A sequence variant where copies of a feature are decreased relative to the reference. kareneilbeck 2012-04-13T11:27:52Z copy number decrease sequence SO:0001912 copy_number_decrease A sequence variant where copies of a feature are decreased relative to the reference. SO:ke A bacterial promoter with sigma ecf factor binding dependency. This is a type of bacterial promoter that requires a sigma ECF factor to bind to identified -10 and -35 sequence regions in order to mediate binding of the RNA polymerase to the promoter region as part of transcription initiation. kareneilbeck 2012-06-11T02:41:33Z bacterial RNApol promoter sigma ecf sequence SO:0001913 Requested by Kevin Clancy - invitrogen -May 2012. bacterial_RNApol_promoter_sigma_ecf_element A bacterial promoter with sigma ecf factor binding dependency. This is a type of bacterial promoter that requires a sigma ECF factor to bind to identified -10 and -35 sequence regions in order to mediate binding of the RNA polymerase to the promoter region as part of transcription initiation. Invitrogen:kc A DNA motif that is found in eukaryotic rDNA repeats, and is a site of replication fork pausing. kareneilbeck 2012-06-11T02:55:02Z DNA spacer replication fork barrier RFB RTS1 barrier RTS1 element rDNA replication fork barrier sequence SO:0001914 Requested by Midori - June 2012. rDNA_replication_fork_barrier A DNA motif that is found in eukaryotic rDNA repeats, and is a site of replication fork pausing. PMID:14645529 A region defined by a cluster of experimentally determined transcription starting sites. kareneilbeck 2012-10-17T12:09:50Z TSC TSS cluster transcriptional initiation cluster transcriptional start site cluster sequence SO:0001915 transcription_start_cluster A region defined by a cluster of experimentally determined transcription starting sites. 
PMID:19624849 PMID:21372179 SO:andrewgibson A CAGE tag is a sequence tag that corresponds to 5' ends of mRNA at cap sites, produced by cap analysis gene expression and used to identify transcriptional start sites. kareneilbeck 2012-10-17T12:36:58Z CAGE tag sequence SO:0001916 CAGE_tag A CAGE tag is a sequence tag that corresponds to 5' ends of mRNA at cap sites, produced by cap analysis gene expression and used to identify transcriptional start sites. SO:andrewgibson A kind of transcription_initiation_cluster defined by the clustering of CAGE tags on a sequence region. kareneilbeck 2012-10-17T12:42:03Z INSDC_feature:misc_feature CAGE cluster CAGE peak CAGE_peak INSDC_note:CAGE_cluster sequence SO:0001917 CAGE_cluster A kind of transcription_initiation_cluster defined by the clustering of CAGE tags on a sequence region. PMID:16645617 SO:andrewgibson A cytosine methylated at the 5 carbon. kareneilbeck 2012-10-17T12:46:10Z http://www.insdc.org/files/feature_table.html#7.4.2 http:http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf 5 methylcytosine 5-mC m-5C m5c sequence SO:0001918 5_methylcytosine A cytosine methylated at the 5 carbon. SO:rtapella A cytosine methylated at the 4 nitrogen. kareneilbeck 2012-10-17T12:50:40Z http://www.insdc.org/files/feature_table.html#7.4.2 http:http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf 4-mC 4-methylcytosine N4 methylcytosine N4-methylcytosine N4_methylcytosine m-4C m4c sequence SO:0001919 4_methylcytosine A cytosine methylated at the 4 nitrogen. SO:rtapella An adenine methylated at the 6 nitrogen. kareneilbeck 2012-10-17T12:54:23Z http:http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf 6-mA 6-methyladenine 6mA N6-methyladenine m-6A m6a sequence SO:0001920 N6_methyladenine An adenine methylated at the 6 nitrogen. SO:rtapella A contig of mitochondria derived sequences. 
kareneilbeck 2012-10-31T12:34:38Z mitochondrial contig sequence SO:0001921 Requested by Bayer Cropscience, October, 2012. mitochondrial_contig A contig of mitochondria derived sequences. GMOD:ea A scaffold composed of mitochondrial contigs. kareneilbeck 2012-10-31T12:42:45Z mitochondrial scaffold mitochondrial supercontig mitochondrial_scaffold sequence SO:0001922 mitochondrial_supercontig A scaffold composed of mitochondrial contigs. GMOD:ea A non-coding RNA transcript, derived from the transcription of the telomere. These transcripts contain G rich telomeric RNA repeats and RNA tracts corresponding to adjacent subtelomeric sequences. They are 100-9000 bases long. kareneilbeck 2012-10-31T01:06:40Z sequence telomeric repeat containing RNA SO:0001923 Telomeric transcription has been documented in mammals, birds, fish, plants and yeast. Requested by Antonia Lock, October 2012. TERRA A non-coding RNA transcript, derived from the transcription of the telomere. These transcripts contain G rich telomeric RNA repeats and RNA tracts corresponding to adjacent subtelomeric sequences. They are 100-9000 bases long. PMID:22139915 A non coding RNA transcript, complementary to subtelomeric tract of TERRA transcript but devoid of the repeats. kareneilbeck 2012-10-31T01:11:49Z sequence SO:0001924 Telomeric transcription has been documented in mammals, birds, fish, plants and yeast. Requested by Antonia Lock, October 2012. ARRET A non coding RNA transcript, complementary to subtelomeric tract of TERRA transcript but devoid of the repeats. PMID:2139915 A non-coding RNA transcript, derived from the transcription of the telomere. These transcripts consist of C rich repeats. kareneilbeck 2012-10-31T01:24:37Z sequence SO:0001925 Telomeric transcription has been documented in mammals, birds, fish, plants and yeast. Requested by Antonia Lock, October 2012. ARIA A non-coding RNA transcript, derived from the transcription of the telomere. These transcripts consist of C rich repeats. 
PMID:22139915 A non-coding RNA transcript, derived from the transcription of the telomere. These transcripts are antisense of ARRET transcripts. kareneilbeck 2012-10-31T01:40:22Z anti-ARRET sequence SO:0001926 Telomeric transcription has been documented in mammals, birds, fish, plants and yeast. Requested by Antonia Lock, October 2012. anti_ARRET A non-coding RNA transcript, derived from the transcription of the telomere. These transcripts are antisense of ARRET transcripts. PMID:22139915 A non-coding transcript derived from the transcript of the telomere. kareneilbeck 2012-10-31T01:42:15Z telomeric transcript sequence SO:0001927 telomeric_transcript A non-coding transcript derived from the transcript of the telomere. PMID:22139915 A duplication of the distal region of a chromosome. kareneilbeck 2012-10-31T01:56:44Z distal duplication sequence SO:0001928 This term is used by Complete Genomics in the structural variant analysis files. distal_duplication A duplication of the distal region of a chromosome. SO:bm A sequencer read of a mitochondrial DNA sample. kareneilbeck 2012-11-14T04:39:56Z mitochondrial DNA read sequence SO:0001929 Requested by Bayer Cropscience, October, 2012. mitochondrial_DNA_read A sequencer read of a mitochondrial DNA sample. GMOD:ea A sequencer read of a chloroplast DNA sample. kareneilbeck 2012-11-14T04:43:45Z chloroplast DNA read sequence SO:0001930 Requested by Bayer Cropscience, October, 2012. chloroplast_DNA_read A sequencer read of a chloroplast DNA sample. GMOD:ea Genomic DNA sequence produced from some base calling or alignment algorithm which uses aligned or assembled multiple gDNA sequences as input. kareneilbeck 2012-11-28T12:53:14Z consensus gDNA consensus genomic DNA sequence SO:0001931 Requested by Bayer Cropscience November, 2012. consensus_gDNA Genomic DNA sequence produced from some base calling or alignment algorithm which uses aligned or assembled multiple gDNA sequences as input. 
GMOD:ea A terminal region of DNA sequence where the end of the region is not blunt ended and the exposed single strand terminates at the 5' end. kareneilbeck 2013-03-06T09:50:44Z restriction enzyme five prime single strand overhang sequence SO:0001932 restriction_enzyme_five_prime_single_strand_overhang A terminal region of DNA sequence where the end of the region is not blunt ended and the exposed single strand terminates at the 5' end. SO:ke A terminal region of DNA sequence where the end of the region is not blunt ended and the exposed single strand terminates at the 3' end. kareneilbeck 2013-03-06T09:52:14Z restriction enzyme three prime single strand overhang sequence SO:0001933 restriction_enzyme_three_prime_single_strand_overhang A terminal region of DNA sequence where the end of the region is not blunt ended and the exposed single strand terminates at the 3' end. SO:ke A repeat_region containing repeat_units of 1 bp that is repeated multiple times in tandem. kareneilbeck 2013-03-06T09:59:15Z monomeric repeat sequence SO:0001934 monomeric_repeat A repeat_region containing repeat_units of 1 bp that is repeated multiple times in tandem. SO:ke A kind of histone modification site, whereby the 20th residue (a lysine), from the start of the H3 protein is tri-methylated. kareneilbeck 2013-03-06T10:13:48Z H3K20 trimethylation site sequence SO:0001935 H3K20_trimethylation_site A kind of histone modification site, whereby the 20th residue (a lysine), from the start of the H3 protein is tri-methylated. EBI:nj A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is acetylated. kareneilbeck 2013-03-06T10:16:55Z H3K36 acetylation site H3K36ac sequence SO:0001936 H3K36_acetylation_site A kind of histone modification site, whereby the 36th residue (a lysine), from the start of the H3 histone protein is acetylated. 
EBI:nj A kind of histone modification site, whereby the 12th residue (a lysine), from the start of the H2B protein is acetylated. kareneilbeck 2013-03-06T10:19:13Z H2BK12 acetylation site H2BK12ac sequence SO:0001937 H2BK12_acetylation_site A kind of histone modification site, whereby the 12th residue (a lysine), from the start of the H2B protein is acetylated. EBI:nj A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H2A histone protein is acetylated. kareneilbeck 2013-03-06T10:20:57Z H2AK5 acetylation site H2AK5ac sequence SO:0001938 H2AK5_acetylation_site A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H2A histone protein is acetylated. EBI:nj A kind of histone modification site, whereby the 12th residue (a lysine), from the start of the H4 histone protein is acetylated. kareneilbeck 2013-03-06T10:26:15Z H4K12 acetylation site H4K12ac sequence SO:0001939 H4K12_acetylation_site A kind of histone modification site, whereby the 12th residue (a lysine), from the start of the H4 histone protein is acetylated. EBI:nj A kind of histone modification site, whereby the 120th residue (a lysine), from the start of the H2B histone protein is acetylated. kareneilbeck 2013-03-06T10:28:38Z H2BK120 acetylation site H2BK120ac sequence SO:0001940 H2BK120_acetylation_site A kind of histone modification site, whereby the 120th residue (a lysine), from the start of the H2B histone protein is acetylated. EBI:nj http://dx.doi.org/10.4161/epi.6.5.15623 A kind of histone modification site, whereby the 91st residue (a lysine), from the start of the H4 histone protein is acetylated. kareneilbeck 2013-03-06T10:41:04Z H4K91 acetylation site H4K91ac sequence SO:0001941 H4K91_acetylation_site A kind of histone modification site, whereby the 91st residue (a lysine), from the start of the H4 histone protein is acetylated. 
EBI:nj A kind of histone modification site, whereby the 20th residue (a lysine), from the start of the H2B histone protein is acetylated. kareneilbeck 2013-03-06T10:44:31Z H2BK20 acetylation site H2BK20ac sequence SO:0001942 H2BK20_acetylation_site A kind of histone modification site, whereby the 20th residue (a lysine), from the start of the H2B histone protein is acetylated. EBI:nj A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H3 histone protein is acetylated. kareneilbeck 2013-03-06T10:46:32Z H3K4 acetylation site H3K4ac sequence SO:0001943 H3K4_acetylation_site A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H3 histone protein is acetylated. EBI:nj A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H2A histone protein is acetylated. kareneilbeck 2013-03-06T10:48:11Z H2AK9 acetylation site H2AK9ac sequence SO:0001944 H2AK9_acetylation_site A kind of histone modification site, whereby the 9th residue (a lysine), from the start of the H2A histone protein is acetylated. EBI:nj A kind of histone modification site, whereby the 56th residue (a lysine), from the start of the H3 histone protein is acetylated. kareneilbeck 2013-03-06T10:51:14Z H3K56 acetylation site H3K56ac sequence SO:0001945 H3K56_acetylation_site A kind of histone modification site, whereby the 56th residue (a lysine), from the start of the H3 histone protein is acetylated. EBI:nj A kind of histone modification site, whereby the 15th residue (a lysine), from the start of the H2B histone protein is acetylated. kareneilbeck 2013-03-06T10:53:23Z H2BK15 acetylation site H2BK15ac sequence SO:0001946 H2BK15_acetylation_site A kind of histone modification site, whereby the 15th residue (a lysine), from the start of the H2B histone protein is acetylated. 
EBI:nj A kind of histone modification site, whereby the 2nd residue (an arginine), from the start of the H3 protein is mono-methylated. kareneilbeck 2013-03-06T10:57:13Z H3R2me1 H3R2 monomethylation site sequence SO:0001947 H3R2_monomethylation_site A kind of histone modification site, whereby the 2nd residue (an arginine), from the start of the H3 protein is mono-methylated. EBI:nj A kind of histone modification site, whereby the 2nd residue (an arginine), from the start of the H3 protein is di-methylated. kareneilbeck 2013-03-06T10:59:17Z H3R2 dimethylation site H3R2me2 sequence SO:0001948 H3R2_dimethylation_site A kind of histone modification site, whereby the 2nd residue (an arginine), from the start of the H3 protein is di-methylated. EBI:nj A kind of histone modification site, whereby the 3rd residue (an arginine), from the start of the H4 protein is di-methylated. kareneilbeck 2013-03-06T11:01:27Z H4R3 dimethylation site H4R3me2 sequence SO:0001949 H4R3_dimethylation_site A kind of histone modification site, whereby the 3rd residue (an arginine), from the start of the H4 protein is di-methylated. EBI:nj A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H4 protein is tri-methylated. kareneilbeck 2013-03-06T11:03:29Z H4K4me3 H4K4 trimethylation site sequence SO:0001950 H4K4_trimethylation_site A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H4 protein is tri-methylated. EBI:nj A kind of histone modification site, whereby the 23rd residue (a lysine), from the start of the H3 protein is di-methylated. kareneilbeck 2013-03-06T11:05:33Z H3K23 dimethylation site H3K23me2 sequence SO:0001951 H3K23_dimethylation_site A kind of histone modification site, whereby the 23rd residue (a lysine), from the start of the H3 protein is di-methylated. EBI:nj A region immediately adjacent to a promoter which may or may not contain transcription factor binding sites. 
kareneilbeck 2013-03-06T11:36:25Z promoter flanking region sequence SO:0001952 promoter_flanking_region A region immediately adjacent to a promoter which may or may not contain transcription factor binding sites. EBI:nj A region of DNA sequence formed from the ligation of two sticky ends where the palindrome is broken and no longer comprises the recognition site and thus cannot be re-cut by the restriction enzymes used to create the sticky ends. kareneilbeck 2013-03-06T03:18:11Z sequence SO:0001953 restriction_enzyme_assembly_scar A region of DNA sequence formed from the ligation of two sticky ends where the palindrome is broken and no longer comprises the recognition site and thus cannot be re-cut by the restriction enzymes used to create the sticky ends. SO:ke A region related to restriction enzyme function. kareneilbeck 2013-03-06T03:23:34Z sequence restriction enzyme region SO:0001954 Not a great term for annotation, but used to classify the various regions related to restriction enzymes. restriction_enzyme_region A region related to restriction enzyme function. SO:ke A polypeptide region that provides structure in a protein that affects the stability of the protein. kareneilbeck 2013-03-06T03:32:47Z sequence protein stability element SO:0001955 protein_stability_element A polypeptide region that provides structure in a protein that affects the stability of the protein. SO:ke A polypeptide_region that codes for a protease cleavage site. kareneilbeck 2013-03-06T03:36:28Z protease site sequence SO:0001956 protease_site A polypeptide_region that codes for a protease cleavage site. SO:ke RNA secondary structure that affects the stability of an RNA molecule. kareneilbeck 2013-03-06T03:38:35Z sequence rna stability element SO:0001957 RNA_stability_element true RNA secondary structure that affects the stability of an RNA molecule. SO:ke A kind of intron whereby the excision is driven by lariat formation. 
kareneilbeck 2013-03-07T10:58:40Z lariat intron sequence SO:0001958 Requested by PomBase 3604508. lariat_intron A kind of intron whereby the excision is driven by lariat formation. SO:ke A cis-regulatory element, conserved sequence YYC+1TTTYY, and spans -2 to +6 relative to +1 TSS. It is present in most ribosomal protein genes in Drosophila and mammals but not in the yeast Saccharomyces cerevisiae. Resembles the initiator (TCAKTY in Drosophila) but functionally distinct from initiator. kareneilbeck 2013-05-17T04:38:48Z TCT element polypyrimidine initiator sequence SO:0001959 TCT_motif A cis-regulatory element, conserved sequence YYC+1TTTYY, and spans -2 to +6 relative to +1 TSS. It is present in most ribosomal protein genes in Drosophila and mammals but not in the yeast Saccharomyces cerevisiae. Resembles the initiator (TCAKTY in Drosophila) but functionally distinct from initiator. PMID:20801935 SO:myl A modified DNA cytosine base feature, modified by a hydroxymethyl group at the 5 carbon. kareneilbeck 2013-05-17T05:05:31Z http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf 5-hmC 5-hydroxymethylcytosine sequence SO:0001960 5_hydroxymethylcytosine A modified DNA cytosine base feature, modified by a hydroxymethyl group at the 5 carbon. SO:ke A modified DNA cytosine base feature, modified by a formyl group at the 5 carbon. kareneilbeck 2013-05-17T05:06:13Z http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf 5-fC 5-formylcytosine sequence SO:0001961 5_formylcytosine A modified DNA cytosine base feature, modified by a formyl group at the 5 carbon. SO:ke A modified adenine DNA base feature. kareneilbeck 2013-05-20T01:22:30Z sequence SO:0001962 modified_adenine A modified adenine DNA base feature. SO:ke A modified cytosine DNA base feature. kareneilbeck 2013-05-20T01:23:47Z sequence SO:0001963 modified_cytosine A modified cytosine DNA base feature. 
SO:ke A modified guanine DNA base feature. kareneilbeck 2013-05-20T01:25:31Z sequence SO:0001964 modified_guanine A modified guanine DNA base feature. SO:ke A modified DNA guanine base, at the 8 carbon, often the product of DNA damage. kareneilbeck 2013-05-20T01:27:51Z http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf 8-oxoG 8-oxoguanine sequence SO:0001965 8_oxoguanine A modified DNA guanine base, at the 8 carbon, often the product of DNA damage. SO:ke A modified DNA cytosine base feature, modified by a carboxy group at the 5 carbon. kareneilbeck 2013-05-20T01:30:01Z http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf 5-caC 5-carboxycytosine sequence SO:0001966 5_carboxylcytosine A modified DNA cytosine base feature, modified by a carboxy group at the 5 carbon. SO:ke A modified DNA adenine base, at the 8 carbon, often the product of DNA damage. kareneilbeck 2013-05-20T01:31:05Z http://www.pacificbiosciences.com/pdf/WP_Detecting_DNA_Base_Modifications_Using_SMRT_Sequencing.pdf 8-oxoA 8-oxoadenine sequence SO:0001967 8_oxoadenine A modified DNA adenine base, at the 8 carbon, often the product of DNA damage. SO:ke A transcript variant of a protein coding gene. kareneilbeck 2013-05-22T04:34:49Z Jannovar:coding_transcript_variant coding transcript variant sequence SO:0001968 coding_transcript_variant A transcript variant of a protein coding gene. SO:ke Jannovar:coding_transcript_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html A transcript variant occurring within an intron of a coding transcript. kareneilbeck 2013-05-23T10:54:17Z Jannovar:coding_transcript_intron_variant coding sequence intron variant sequence SO:0001969 coding_transcript_intron_variant A transcript variant occurring within an intron of a coding transcript. 
SO:ke Jannovar:coding_transcript_intron_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html A transcript variant occurring within an intron of a non coding transcript. kareneilbeck 2013-05-23T10:55:03Z non coding transcript intron variant ANNOVAR:ncRNA_intronic Jannovar:non_coding_transcript_intron_variant sequence SO:0001970 non_coding_transcript_intron_variant A transcript variant occurring within an intron of a non coding transcript. SO:ke ANNOVAR:ncRNA_intronic Jannovar:non_coding_transcript_intron_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html A binding site to which a polypeptide will bind with a zinc finger motif, which is characterized by requiring one or more Zinc 2+ ions for stabilized folding. kareneilbeck 2013-07-29T04:41:53Z zinc finger binding site zinc_fing sequence SO:0001971 zinc_finger_binding_site zinc_fing A histone 4 modification where the modification is the acetylation of the residue. kareneilbeck 2013-07-30T10:43:04Z H4ac histone 4 acetylation site sequence SO:0001972 histone_4_acetylation_site A histone 4 modification where the modification is the acetylation of the residue. EBI:nj ISBN:0815341059 SO:ke A histone 3 modification where the modification is the acetylation of the residue. kareneilbeck 2013-07-30T10:46:42Z H3ac histone 3 acetylation site sequence SO:0001973 histone_3_acetylation_site A histone 3 modification where the modification is the acetylation of the residue. EBI:nj ISBN:0815341059 SO:ke A transcription factor binding site with consensus sequence CCGCGNGGNGGCAG, bound by CCCTC-binding factor. kareneilbeck 2013-07-30T10:59:11Z CCCTF binding site CTCF binding site sequence SO:0001974 CTCF_binding_site A transcription factor binding site with consensus sequence CCGCGNGGNGGCAG, bound by CCCTC-binding factor. EBI:nj A restriction enzyme recognition site that, when cleaved, results in 5 prime overhangs. 
kareneilbeck 2013-07-30T11:32:16Z five prime sticky end restriction enzyme cleavage site sequence SO:0001975 Requested by Jackie Quinn. The sticky restriction sites are different from junctions because they include the sequence that is cut, inclusive of the five prime junction and the three prime junction. five_prime_sticky_end_restriction_enzyme_cleavage_site A restriction enzyme recognition site that, when cleaved, results in 5 prime overhangs. SO:ke A restriction enzyme recognition site that, when cleaved, results in 3 prime overhangs. kareneilbeck 2013-07-30T11:37:19Z three prime sticky end restriction enzyme cleavage site sequence SO:0001976 Requested by Jackie Quinn. The sticky restriction sites are different from junctions because they include the sequence that is cut, inclusive of the five prime junction and the three prime junction. three_prime_sticky_end_restriction_enzyme_cleavage_site A restriction enzyme recognition site that, when cleaved, results in 3 prime overhangs. SO:ke A region of a transcript encoding the cleavage site for a ribonuclease enzyme. kareneilbeck 2013-07-30T11:41:06Z ribonuclease site sequence SO:0001977 ribonuclease_site A region of a transcript encoding the cleavage site for a ribonuclease enzyme. SO:ke A region of sequence where developer information is encoded. kareneilbeck 2013-07-30T11:49:22Z DNA signature sequence SO:0001978 Requested by Jackie Quinn for use in synthetic biology. signature A region of sequence where developer information is encoded. SO:ke A motif that affects the stability of RNA. kareneilbeck 2013-07-30T03:33:53Z RNA stability element sequence SO:0001979 RNA_stability_element A motif that affects the stability of RNA. PMID:22495308 SO:ke A regulatory promoter element identified in mutation experiments, with consensus sequence: CACGTG. Present in promoters, intergenic regions, coding regions, and introns. 
They are involved in gene expression responses to light and interact with G-box binding factor and I-box binding factor 1a. kareneilbeck 2013-07-30T04:00:50Z G-box GBF binding sequence sequence SO:0001980 A plant specific region. G_box A regulatory promoter element identified in mutation experiments, with consensus sequence: CACGTG. Present in promoters, intergenic regions, coding regions, and introns. They are involved in gene expression responses to light and interact with G-box binding factor and I-box binding factor 1a. PMID:19249238 PMID:8571452 SO:ml An orientation dependent regulatory promoter element, with consensus sequence of TTGCACAN4TTGCACA, found in plants. kareneilbeck 2013-07-30T04:12:19Z L-box L-box promoter element sequence SO:0001981 L_box An orientation dependent regulatory promoter element, with consensus sequence of TTGCACAN4TTGCACA, found in plants. PMID:17381552 PMID:2902624 SO:ml A plant regulatory promoter motif, composed of a highly conserved hexamer GATAAG (I-box core). kareneilbeck 2013-07-30T04:17:55Z I-box promoter motif sequence SO:0001982 I-box A plant regulatory promoter motif, composed of a highly conserved hexamer GATAAG (I-box core). PMID:2347304 PMID:2902624 SO:ml A 5' UTR variant where a premature start codon is introduced, moved or lost. kareneilbeck 2013-07-30T04:36:25Z 5' UTR premature start codon variant sequence SO:0001983 Requested by Andy Menzies at the Sanger. This isn't necessarily a protein coding change. A premature start codon can affect the production of a mature protein product by providing a competing translation start point. Some genes balance their expression this way, e.g. THPO requires the presence of a premature start to limit expression; its loss leads to Familial thrombocythemia. 5_prime_UTR_premature_start_codon_variant A 5' UTR variant where a premature start codon is introduced, moved or lost. SANGER:am A gene cassette array that corresponds to a silenced version of a mating type region. 
kareneilbeck 2013-07-31T02:40:38Z sequence silent mating-type cassette SO:0001984 silent_mating_type_cassette_array A gene cassette array that corresponds to a silenced version of a mating type region. PomBase:mah Any of the DNA segments produced by discontinuous synthesis of the lagging strand during DNA replication. kareneilbeck 2013-07-31T02:57:55Z Okazaki fragment sequence SO:0001985 Requested by Midori Harris, 2013. Okazaki_fragment Any of the DNA segments produced by discontinuous synthesis of the lagging strand during DNA replication. ISBN:0805350152 A feature variant, where the alteration occurs upstream of the transcript TSS. kareneilbeck 2013-07-31T03:46:14Z upstream transcript variant sequence SO:0001986 Requested by Graham Ritchie, EBI/Sanger. upstream_transcript_variant A feature variant, where the alteration occurs upstream of the transcript TSS. EBI:gr A feature variant, where the alteration occurs downstream of the transcript termination site. kareneilbeck 2013-07-31T03:47:51Z downstream transcript variant sequence SO:0001987 Requested by Graham Ritchie, EBI/Sanger. downstream_transcript_variant A 5' UTR variant where a premature start codon is gained. kareneilbeck 2013-07-31T03:53:06Z http://snpeff.sourceforge.net/SnpEff_manual.html 5 prime UTR premature start codon gain variant Jannovar:5_prime_UTR_premature_start_codon_gain_variant snpEff:START_GAINED sequence SO:0001988 5_prime_UTR_premature_start_codon_gain_variant A 5' UTR variant where a premature start codon is gained. Sanger:am Jannovar:5_prime_UTR_premature_start_codon_gain_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html snpEff:START_GAINED A 5' UTR variant where a premature start codon is lost. kareneilbeck 2013-07-31T03:56:48Z sequence SO:0001989 5_prime_UTR_premature_start_codon_loss_variant A 5' UTR variant where a premature start codon is lost. SANGER:am A 5' UTR variant where a premature start codon is moved. 
kareneilbeck 2013-07-31T03:57:47Z sequence SO:0001990 five_prime_UTR_premature_start_codon_location_variant A 5' UTR variant where a premature start codon is moved. SANGER:am A consensus AFLP fragment is an AFLP sequence produced from any alignment algorithm which uses assembled multiple AFLP sequences as input. kareneilbeck 2013-09-24T10:43:41Z consensus AFLP fragment consensus amplified fragment length polymorphism fragment sequence SO:0001991 Requested by Bayer Cropscience September, 2013. consensus_AFLP_fragment A consensus AFLP fragment is an AFLP sequence produced from any alignment algorithm which uses assembled multiple AFLP sequences as input. GMOD:ea A non-synonymous variant is an inframe, protein altering variant, resulting in a codon change. kareneilbeck 2013-10-16T11:47:51Z non_synonymous_coding nonsynonymous variant sequence SO:0001992 nonsynonymous_variant A non-synonymous variant is an inframe, protein altering variant, resulting in a codon change. SO:ke non_synonymous_coding http://ensembl.org/info/docs/variation/index.html Intronic positions associated with cis-splicing. Contains the first and second positions immediately before the exon and the first, second and fifth positions immediately after. kareneilbeck 2014-01-04T06:20:00Z extended cis splice site sequence SO:0001993 Added by Andy Menzies (Sanger). extended_cis_splice_site Intronic positions associated with cis-splicing. Contains the first and second positions immediately before the exon and the first, second and fifth positions immediately after. SANGER:am Fifth intronic position after the intron exon boundary, close to the 5' edge of the intron. kareneilbeck 2014-01-04T06:26:02Z intron base 5 sequence SO:0001994 intron_base_5 Fifth intronic position after the intron exon boundary, close to the 5' edge of the intron. SANGER:am A sequence variant occurring in the intron, within 10 bases of exon. 
kareneilbeck 2014-01-04T06:37:27Z extended intronic splice region variant sequence SO:0001995 Added by Andy Menzies (Sanger). extended_intronic_splice_region_variant A sequence variant occurring in the intron, within 10 bases of exon. sanger:am Region of intronic sequence within 10 bases of an exon. kareneilbeck 2014-01-04T06:41:23Z extended intronic splice region sequence SO:0001996 extended_intronic_splice_region Region of intronic sequence within 10 bases of an exon. SANGER:am A heterochromatic region of the chromosome, adjacent to the telomere (on the centromeric side) that contains repetitive DNA and sometimes genes and it is transcribed. kareneilbeck 2014-01-05T07:02:01Z sequence SO:0001997 subtelomere A heterochromatic region of the chromosome, adjacent to the telomere (on the centromeric side) that contains repetitive DNA and sometimes genes and it is transcribed. POMBE:al A small RNA oligo, typically about 20 bases, that guides the cas nuclease to a target DNA sequence in the CRISPR/cas mutagenesis method. kareneilbeck 2014-01-05T07:25:08Z small guide RNA sequence gRNA guide RNA SO:0001998 sgRNA A small RNA oligo, typically about 20 bases, that guides the cas nuclease to a target DNA sequence in the CRISPR/cas mutagenesis method. PMID:23934893 DNA motif that is a component of a mating type region. kareneilbeck 2014-01-05T07:30:17Z mating type region motif sequence SO:0001999 mating_type_region_motif DNA motif that is a component of a mating type region. SO:ke true A segment of non-homology between a and alpha mating alleles, found at all three mating loci (HML, MAT, and HMR), has two forms (Ya and Yalpha). kareneilbeck 2014-01-05T07:33:30Z Y-region sequence SO:0002001 Requested by Janos Demeter, SGD. Y_region A segment of non-homology between a and alpha mating alleles, found at all three mating loci (HML, MAT, and HMR), has two forms (Ya and Yalpha). 
SGD:jd A mating type region motif, one of two segments of homology found at all three mating loci (HML, MAT, and HMR). kareneilbeck 2014-01-05T07:34:59Z Z1-region sequence SO:0002002 Requested by Janos Demeter, SGD. Z1_region A mating type region motif, one of two segments of homology found at all three mating loci (HML, MAT, and HMR). SGD:jd A mating type region motif, the rightmost segment of homology in the HML and MAT mating loci (not present in HMR). kareneilbeck 2014-01-05T07:36:45Z Z2-segment sequence SO:0002003 Requested by Janos Demeter, SGD. Z2_region A mating type region motif, the rightmost segment of homology in the HML and MAT mating loci (not present in HMR). SGD:jd The ACS is an 11-bp sequence of the form 5'-WTTTAYRTTTW-3' which is at the core of every yeast ARS, and is necessary but not sufficient for recognition and binding by the origin recognition complex (ORC). Functional ARSs require an ACS, as well as other cis elements in the 5' (C domain) and 3' (B domain) flanking sequences of the ACS. kareneilbeck 2014-01-05T07:47:48Z ACS ARS consensus sequence sequence SO:0002004 ARS_consensus_sequence The ACS is an 11-bp sequence of the form 5'-WTTTAYRTTTW-3' which is at the core of every yeast ARS, and is necessary but not sufficient for recognition and binding by the origin recognition complex (ORC). Functional ARSs require an ACS, as well as other cis elements in the 5' (C domain) and 3' (B domain) flanking sequences of the ACS. SGD:jd The determinant of selective removal (DSR) motif consists of repeats of U(U/C)AAAC. The motif targets meiotic transcripts for removal during mitosis via the exosome. kareneilbeck 2014-01-05T07:51:27Z DSR motif sequence SO:0002005 Requested by Antonia Lock (Pombe). DSR_motif The determinant of selective removal (DSR) motif consists of repeats of U(U/C)AAAC. The motif targets meiotic transcripts for removal during mitosis via the exosome. 
PMID:22645662 A promoter element that has the consensus sequence GNMGATC, and is found in promoters of genes repressed in the presence of zinc. kareneilbeck 2014-01-05T09:23:27Z zinc repressed element sequence SO:0002006 This element is bound by Loz1 in S. pombe. The paper does not name the element. This term was requested by Midori Harris, for Pombe. zinc_repressed_element A promoter element that has the consensus sequence GNMGATC, and is found in promoters of genes repressed in the presence of zinc. PMID:24003116 POMBE:mh An MNV is a multiple nucleotide variant (substitution) in which the inserted sequence is the same length as the replaced sequence. kareneilbeck 2014-01-13T03:48:40Z multiple nucleotide substitution multiple nucleotide variant sequence SO:0002007 MNV An MNV is a multiple nucleotide variant (substitution) in which the inserted sequence is the same length as the replaced sequence. NCBI:th A sequence variant whereby at least one base of a codon encoding a rare amino acid is changed, resulting in a different encoded amino acid. kareneilbeck 2014-03-24T02:24:01Z http://snpeff.sourceforge.net/SnpEff_manual.html Jannovar:rare_amino_acid_variant rare amino acid variant snpEff:RARE_AMINO_ACID sequence SO:0002008 Request from Uma Devi Paila, UVA. Variants in the sites of rare amino acids e.g. Selenocysteine. These are important impact terms since a loss of such rare amino acids may lead to a loss of function. rare_amino_acid_variant A sequence variant whereby at least one base of a codon encoding a rare amino acid is changed, resulting in a different encoded amino acid. SO:ke Jannovar:rare_amino_acid_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html snpEff:RARE_AMINO_ACID A sequence variant whereby at least one base of a codon encoding selenocysteine is changed, resulting in a different encoded amino acid. kareneilbeck 2014-03-24T02:29:44Z selenocysteine loss sequence SO:0002009 Request from Uma Devi Paila, UVA. 
Variants in the sites of rare amino acids e.g. Selenocysteine. These are important impact terms since a loss of such rare amino acids may lead to a loss of function. selenocysteine_loss A sequence variant whereby at least one base of a codon encoding selenocysteine is changed, resulting in a different encoded amino acid. SO:ke A sequence variant whereby at least one base of a codon encoding pyrrolysine is changed, resulting in a different encoded amino acid. kareneilbeck 2014-03-24T02:30:16Z pyrrolysine loss sequence SO:0002010 Request from Uma Devi Paila, UVA. Variants in the sites of rare amino acids e.g. Selenocysteine. These are important impact terms since a loss of such rare amino acids may lead to a loss of function. pyrrolysine_loss A sequence variant whereby at least one base of a codon encoding pyrrolysine is changed, resulting in a different encoded amino acid. SO:ke A variant that occurs within a gene but falls outside of all transcript features. This occurs when alternate transcripts of a gene do not share overlapping sequence. kareneilbeck 2014-03-24T02:33:13Z http://snpeff.sourceforge.net/SnpEff_manual.html Jannovar:intragenic_variant intragenic variant snpEff:INTRAGENIC sequence SO:0002011 Requested by Pablo Cingolani, for use in SnpEff. intragenic_variant A variant that occurs within a gene but falls outside of all transcript features. This occurs when alternate transcripts of a gene do not share overlapping sequence. SO:ke Jannovar:intragenic_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html snpEff:INTRAGENIC A codon variant that changes at least one base of the canonical start codon. kareneilbeck 2014-03-24T02:41:28Z http://snpeff.sourceforge.net/SnpEff_manual.html http://www.ensembl.org/info/genome/variation/predicted_data.html#consequences Jannovar:start_lost VEP:start_lost snpEff:START_LOST sequence SO:0002012 Request from Uma Devi Paila, UVA. This term should not be applied to incomplete transcripts. 
start_lost A codon variant that changes at least one base of the canonical start codon. SO:ke Jannovar:start_lost http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html VEP:start_lost snpEff:START_LOST A sequence variant that causes the reduction of the 5' UTR with regard to the reference sequence. kareneilbeck 2014-03-25T10:46:42Z http://snpeff.sourceforge.net/SnpEff_manual.html 5 prime UTR truncation Jannovar:5_prime_utr_truncation snpEff:UTR_5_DELETED sequence SO:0002013 5_prime_UTR_truncation A sequence variant that causes the reduction of the 5' UTR with regard to the reference sequence. SO:ke Jannovar:5_prime_utr_truncation http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html snpEff:UTR_5_DELETED A sequence variant that causes the extension of 5' UTR, with regard to the reference sequence. kareneilbeck 2014-03-25T10:48:26Z 5 prime UTR elongation sequence SO:0002014 5_prime_UTR_elongation A sequence variant that causes the extension of 5' UTR, with regard to the reference sequence. SO:ke A sequence variant that causes the reduction of the 3' UTR with regard to the reference sequence. kareneilbeck 2014-03-25T10:54:50Z http://snpeff.sourceforge.net/SnpEff_manual.html 3 prime UTR truncation Jannovar:3_prime_utr_truncation snpEff:UTR_3_DELETED sequence SO:0002015 3_prime_UTR_truncation A sequence variant that causes the reduction of the 3' UTR with regard to the reference sequence. SO:ke Jannovar:3_prime_utr_truncation http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html snpEff:UTR_3_DELETED A sequence variant that causes the extension of 3' UTR, with regard to the reference sequence. kareneilbeck 2014-03-25T10:55:33Z 3 prime UTR elongation sequence SO:0002016 3_prime_UTR_elongation A sequence variant that causes the extension of 3' UTR, with regard to the reference sequence. SO:ke A sequence variant located in a conserved intergenic region, between genes. 
kareneilbeck 2014-03-25T02:54:39Z http://snpeff.sourceforge.net/SnpEff_manual.html Jannovar:conserved_intergenic_variant conserved intergenic variant snpEff:INTERGENIC_CONSERVED sequence SO:0002017 Requested by Uma Paila (UVA) for snpEff. conserved_intergenic_variant A sequence variant located in a conserved intergenic region, between genes. SO:ke Jannovar:conserved_intergenic_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html snpEff:INTERGENIC_CONSERVED A transcript variant occurring within a conserved region of an intron. kareneilbeck 2014-03-25T02:58:41Z http://snpeff.sourceforge.net/SnpEff_manual.html Jannovar:conserved_intron_variant conserved intron variant snpEff:INTRON_CONSERVED sequence SO:0002018 Requested by Uma Paila (UVA) for snpEff. conserved_intron_variant A transcript variant occurring within a conserved region of an intron. SO:ke Jannovar:conserved_intron_variant http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html snpEff:INTRON_CONSERVED A sequence variant where at least one base in the start codon is changed, but the start remains. kareneilbeck 2014-03-28T10:08:41Z snpEff:SYNONYMOUS_START sequence SO:0002019 Requested by Uma Paila as this term is annotated by snpEff. This would be used for non_AUG start codon annotation. start_retained_variant A sequence variant where at least one base in the start codon is changed, but the start remains. SO:ke snpEff:SYNONYMOUS_START Boundary elements are DNA motifs that prevent heterochromatin from spreading into neighboring euchromatic regions. kareneilbeck 2014-05-30T14:45:37Z boundary element sequence insulator SO:0002020 Requested by Antonia Lock. Insulator is included as a related synonym since this is used to refer to insulator in the literature (NCBI:cf). boundary_element Boundary elements are DNA motifs that prevent heterochromatin from spreading into neighboring euchromatic regions. 
PMID:24013502 A DNA motif that is found in eukaryotic rDNA repeats, and is a site of replication fork pausing. kareneilbeck 2014-05-30T14:57:26Z mating type region replication fork barrier sequence SO:0002021 Requested by Midori Harris. mating_type_region_replication_fork_barrier A DNA motif that is found in eukaryotic rDNA repeats, and is a site of replication fork pausing. PMID:17614787 A small RNA molecule, 22-23 nt in size, that is the product of a longer RNA. The production of priRNAs is independent of dicer and involves binding of RNA by argonaute and trimming by triman. In fission yeast, priRNAs trigger the establishment of heterochromatin. PriRNAs are primarily generated from centromeric transcripts (dg and dh repeats), but may also be produced from degradation products of primary transcripts. kareneilbeck 2014-05-30T15:01:24Z primal small RNA sequence SO:0002022 priRNA A small RNA molecule, 22-23 nt in size, that is the product of a longer RNA. The production of priRNAs is independent of dicer and involves binding of RNA by argonaute and trimming by triman. In fission yeast, priRNAs trigger the establishment of heterochromatin. PriRNAs are primarily generated from centromeric transcripts (dg and dh repeats), but may also be produced from degradation products of primary transcripts. PMID:20178743 PMID:24095277 PomBase:al A nucleic tag which is used in a ligation step of library preparation process to allow pooling of samples while maintaining ability to identify individual source material and creation of a multiplexed library. kareneilbeck 2014-05-30T15:13:16Z multiplexing sequence identifier sequence SO:0002023 multiplexing_sequence_identifier A nucleic tag which is used in a ligation step of library preparation process to allow pooling of samples while maintaining ability to identify individual source material and creation of a multiplexed library. 
OBO:prs PMID:22574170 The leftmost segment of homology in the HML and MAT mating loci, but not present in HMR. kareneilbeck 2014-07-11T13:20:08Z SO:0002000 W-region sequence SO:0002024 Requested by Janos Demeter, SGD. W_region The leftmost segment of homology in the HML and MAT mating loci, but not present in HMR. SGD:jd A genome region where chromosome pairing occurs preferentially during homologous chromosome pairing during early meiotic prophase of Meiosis I. kareneilbeck 2014-07-14T11:40:34Z cis-acting homologous chromosome pairing region sequence SO:0002025 Comment: An example of this is the Sme2 locus in fission yeast S. pombe, where it is coincident with a ribonuclear complex termed the "Mei2 dot". This term was requested by Val Wood, PomBase. cis_acting_homologous_chromosome_pairing_region A genome region where chromosome pairing occurs preferentially during homologous chromosome pairing during early meiotic prophase of Meiosis I. PMID:22582262 PMID:23117617 PMID:24173580 PomBase:vw The nucleotide sequence which encodes the intein portion of the precursor gene. kareneilbeck 2014-07-14T11:53:21Z sequence SO:0002026 Requested by Janos Demeter 2014. intein_encoding_region The nucleotide sequence which encodes the intein portion of the precursor gene. PMID:8165123 A short open reading frame that is found in the 5' untranslated region of an mRNA and plays a role in translational regulation. kareneilbeck 2014-07-14T11:59:23Z PMID:26684391 regulatory uORF upstream ORF sequence SO:0002027 uORF A short open reading frame that is found in the 5' untranslated region of an mRNA and plays a role in translational regulation. PMID:12890013 PMID:16153175 POMBASE:mah An open reading frame that encodes a peptide of less than 100 amino acids.
kareneilbeck 2014-07-14T12:02:33Z smORF small ORF sequence SO:0002028 sORF An open reading frame that encodes a peptide of less than 100 amino acids. PMID:23970561 PMID:24705786 POMBASE:mah A translated ORF encoded entirely within the antisense strand of a known protein coding gene. kareneilbeck 2014-07-14T12:04:32Z translated nested antisense gene sequence SO:0002029 tnaORF A translated ORF encoded entirely within the antisense strand of a known protein coding gene. POMBASE:vw One of two segments of homology found at all three mating loci (HML, MAT and HMR). kareneilbeck 2014-07-14T18:43:21Z x-region sequence SO:0002030 X_region One of two segments of homology found at all three mating loci (HML, MAT and HMR). SGD:jd A short hairpin RNA (shRNA) is an RNA transcript that makes a tight hairpin turn that can be used to silence target gene expression via RNA interference. kareneilbeck 2014-10-23T09:16:29Z http://en.wikipedia.org/wiki/Small_hairpin_RNA short hairpin RNA small hairpin RNA sequence SO:0002031 shRNA A short hairpin RNA (shRNA) is an RNA transcript that makes a tight hairpin turn that can be used to silence target gene expression via RNA interference. PMID:6699500 SO:ke http://en.wikipedia.org/wiki/Small_hairpin_RNA wikipedia A non-coding transcript encoded by sequences adjacent to the ends of the 5' and 3' miR-encoding sequences that abut the loop in precursor miRNA. kareneilbeck 2015-01-09T13:57:43Z microRNA-offset RNA sequence SO:0002032 MoRs are generated from miR hairpins that are longer and can produce two functional miR per strand. They are called moRs because they are not located next to the loop and thus their biogenesis process is a little different, but functionally, they are supposed to act like miRs. It is the same for loRs that are the loop fragments, they are generated differently than miRs or moRs but if loaded into the risc they are supposed to act the same way miRs do. Requested by Thomas Desvignes, Jan 2015.
moR A non-coding transcript encoded by sequences adjacent to the ends of the 5' and 3' miR-encoding sequences that abut the loop in precursor miRNA. SO:ke A short, non coding transcript of loop-derived sequences encoded in precursor miRNA. kareneilbeck 2015-01-09T14:02:02Z loop-origin miRs sequence SO:0002033 MoRs are generated from miR hairpins that are longer and can produce two functional miR per strand. They are called moRs because they are not located next to the loop and thus their biogenesis process is a little different, but functionally, they are supposed to act like miRs. It is the same for loRs that are the loop fragments, they are generated differently than miRs or moRs but if loaded into the risc they are supposed to act the same way miRs do. Requested by Thomas Desvignes, Jan 2015. loR A short, non coding transcript of loop-derived sequences encoded in precursor miRNA. SO:ke A snoRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA. kareneilbeck 2015-01-09T15:02:13Z miR encoding snoRNA primary transcript sequence SO:0002034 miR_encoding_snoRNA_primary_transcript A snoRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA. SO:ke A primary transcript encoding a lncRNA. kareneilbeck 2015-01-09T15:23:03Z lncRNA primary transcript sequence SO:0002035 lncRNA_primary_transcript A primary transcript encoding a lncRNA. SO:ke A lncRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA. kareneilbeck 2015-01-09T15:23:48Z miR encoding lncRNA primary transcript sequence SO:0002036 miR_encoding_lncRNA_primary_transcript A lncRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA. SO:ke A tRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA. 
kareneilbeck 2015-01-09T15:28:23Z miR encoding tRNA primary transcript sequence SO:0002037 miR_encoding_tRNA_primary_transcript A tRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA. SO:ke A primary transcript encoding an shRNA. kareneilbeck 2015-01-09T15:30:43Z shRNA primary transcript sequence SO:0002038 shRNA_primary_transcript A primary transcript encoding an shRNA. SO:ke A shRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA. kareneilbeck 2015-01-09T15:32:00Z miR encoding shRNA primary transcript sequence SO:0002039 miR_encoding_shRNA_primary_transcript A shRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA. SO:ke A primary transcript encoding a vaultRNA. kareneilbeck 2015-01-09T15:33:33Z vaultRNA primary transcript sequence SO:0002040 vaultRNA_primary_transcript A primary transcript encoding a vaultRNA. SO:ke A vaultRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA. kareneilbeck 2015-01-09T15:34:32Z miR encoding vaultRNA primary transcript sequence SO:0002041 miR_encoding_vaultRNA_primary_transcript A vaultRNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA. SO:ke A primary transcript encoding a Y-RNA. kareneilbeck 2015-01-09T15:36:51Z Y-RNA primary transcript sequence SO:0002042 Y_RNA_primary_transcript A primary transcript encoding a Y-RNA. SO:ke A Y-RNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA. kareneilbeck 2015-01-09T15:37:46Z miR encoding Y-RNA primary transcript sequence SO:0002043 miR_encoding_Y_RNA_primary_transcript A Y-RNA primary transcript that also encodes pre-miR sequence that is processed to form functionally active miRNA. 
SO:ke A TCS element is a (yeast) transcription factor binding site, bound by the TEA DNA binding domain (DBD) of transcription factors. The consensus site is CATTCC or CATTCT. kareneilbeck 2015-02-09T15:02:53Z TCS element TEA Consensus Sequence sequence SO:0002044 Requested by Rama - SGD. TCS_element A TCS element is a (yeast) transcription factor binding site, bound by the TEA DNA binding domain (DBD) of transcription factors. The consensus site is CATTCC or CATTCT. PMID:1489142 PMID:20118212 SO:ke A PRE is a (yeast) TFBS with consensus site [TGAAAC(A/G)]. kareneilbeck 2015-02-09T15:05:43Z PRE pheromone response element sequence SO:0002045 Requested by Rama, SGD. pheromone_response_element A PRE is a (yeast) TFBS with consensus site [TGAAAC(A/G)]. PMID:1489142 SO:ke A FRE is an enhancer element necessary and sufficient to confer filamentation associated expression in S. cerevisiae. kareneilbeck 2015-02-09T15:09:47Z filamentation and invasion response element sequence SO:0002046 Requested by Rama, SGD. FRE A FRE is an enhancer element necessary and sufficient to confer filamentation associated expression in S. cerevisiae. PMID:1489142 SO:ke Transcription pause sites are regions of a gene where RNA polymerase may pause during transcription. The functional role of pausing may be to facilitate factor recruitment, RNA folding, and synchronization with translation. Consensus transcription pause sites have been observed in E. coli. kareneilbeck 2015-02-09T15:32:52Z transcription pause site sequence SO:0002047 Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. transcription_pause_site Transcription pause sites are regions of a gene where RNA polymerase may pause during transcription.
The functional role of pausing may be to facilitate factor recruitment, RNA folding, and synchronization with translation. Consensus transcription pause sites have been observed in E. coli. PMID:24789973 SO:ke A reading frame that could encode a full-length protein but which contains obvious mid-sequence disablements (frameshifts or premature stop codons). kareneilbeck 2015-02-09T16:15:46Z dORF disabled ORF sequence disabled_reading frame SO:0002048 disabled_reading_frame A reading frame that could encode a full-length protein but which contains obvious mid-sequence disablements (frameshifts or premature stop codons). SGD:se A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is acetylated. kareneilbeck 2015-05-14T10:17:11Z H3K27 acetylation site H3K27ac sequence SO:0002049 Requested by: Sagar Jain, Richard Scheuermann. H3K27_acetylation_site A kind of histone modification site, whereby the 27th residue (a lysine), from the start of the H3 histone protein is acetylated. SO:rs A promoter that allows for continual transcription of a gene. kareneilbeck 2015-05-14T10:39:09Z constitutive promoter sequence SO:0002050 constitutive_promoter A promoter that allows for continual transcription of a gene. SO:ke A promoter whereby activity is induced by the presence or absence of biotic or abiotic factors. kareneilbeck 2015-05-14T10:39:56Z inducible promoter sequence SO:0002051 inducible_promoter A promoter whereby activity is induced by the presence or absence of biotic or abiotic factors. SO:ke A variant where the mutated gene product adversely affects the other (wild type) gene product. kareneilbeck 2015-05-14T11:16:28Z dominant negative dominant negative variant sequence SO:0002052 Requested by Deanna Church. dominant_negative_variant A variant where the mutated gene product adversely affects the other (wild type) gene product.
SO:ke A sequence variant whereby new or enhanced function is conferred on the gene product. kareneilbeck 2015-05-14T11:20:47Z gain of function variant sequence SO:0002053 gain_of_function_variant A sequence variant whereby new or enhanced function is conferred on the gene product. SO:ke A sequence variant whereby the gene product has diminished or abolished function. kareneilbeck 2015-05-14T11:21:29Z loss of function variant sequence SO:0002054 loss_of_function_variant A sequence variant whereby the gene product has diminished or abolished function. SO:ke A variant whereby the gene product is not functional or the gene product is not produced. kareneilbeck 2015-05-14T11:21:57Z null mutation sequence SO:0002055 null_mutation A variant whereby the gene product is not functional or the gene product is not produced. SO:ke An intronic splicing regulatory element that functions to recruit trans acting splicing factors that suppress the transcription of the gene or genes they control. kareneilbeck 2015-05-14T12:24:10Z ISS intronic splicing silencer sequence SO:0002056 Requested by Javier Diez Perez. intronic_splicing_silencer An intronic splicing regulatory element that functions to recruit trans acting splicing factors that suppress the transcription of the gene or genes they control. PMID:23241926 SO:ke kareneilbeck 2015-05-14T12:28:31Z ISE sequence SO:0002057 intronic_splicing_enhancer true An exonic splicing regulatory element that functions to recruit trans acting splicing factors that suppress the transcription of the gene or genes they control. kareneilbeck 2015-05-14T12:42:12Z ESS exonic splicing silencer sequence SO:0002058 Requested by Javier Diez Perez. exonic_splicing_silencer An exonic splicing regulatory element that functions to recruit trans acting splicing factors that suppress the transcription of the gene or genes they control. PMID:23241926 SO:ke A regulatory_region that promotes or induces the process of recombination.
kareneilbeck 2015-05-14T13:08:58Z recombination enhancer sequence SO:0002059 recombination_enhancer A regulatory_region that promotes or induces the process of recombination. PMID:8861911 SGD:se A translocation where the regions involved are from different chromosomes. kareneilbeck 2015-06-18T11:10:30Z sequence SO:0002060 interchromosomal_translocation A translocation where the regions involved are from different chromosomes. NCBI:th A translocation where the regions involved are from the same chromosome. kareneilbeck 2015-06-18T11:10:51Z sequence SO:0002061 intrachromosomal_translocation A translocation where the regions involved are from the same chromosome. NCBI:th A contiguous cluster of translocations, usually the result of a single catastrophic event such as chromothripsis or chromoanasynthesis. kareneilbeck 2015-06-18T11:24:55Z complex chromosomal rearrangement sequence SO:0002062 complex_chromosomal_rearrangement A contiguous cluster of translocations, usually the result of a single catastrophic event such as chromothripsis or chromoanasynthesis. NCBI:th An insertion of sequence from the Alu family of mobile elements. kareneilbeck 2015-06-18T11:30:36Z Alu insertion sequence SO:0002063 Alu_insertion An insertion of sequence from the Alu family of mobile elements. NCBI:th An insertion from the Line1 family of mobile elements. kareneilbeck 2015-06-18T11:34:44Z sequence line1 insertion SO:0002064 LINE1_insertion An insertion from the Line1 family of mobile elements. NCBI:th An insertion of sequence from the SVA family of mobile elements. kareneilbeck 2015-06-18T11:36:12Z sequence SO:0002065 SVA_insertion An insertion of sequence from the SVA family of mobile elements. NCBI:th A deletion of a mobile element when comparing a reference sequence (has mobile element) to an individual sequence (does not have mobile element).
kareneilbeck 2015-09-04T13:40:43Z mobile element deletion sequence SO:0002066 mobile_element_deletion A deletion of a mobile element when comparing a reference sequence (has mobile element) to an individual sequence (does not have mobile element). NCBI:th A deletion of the HERV mobile element with respect to a reference. kareneilbeck 2015-09-04T13:42:52Z HERV deletion sequence SO:0002067 HERV_deletion A deletion of the HERV mobile element with respect to a reference. NCBI:th A deletion of an SVA mobile element. kareneilbeck 2015-09-04T13:45:22Z SVA deletion sequence SO:0002068 SVA_deletion A deletion of an SVA mobile element. NCBI:th A deletion of a LINE1 mobile element with respect to a reference. kareneilbeck 2015-09-04T13:46:26Z sequence LINE1 deletion SO:0002069 LINE1_deletion A deletion of a LINE1 mobile element with respect to a reference. NCBI:th A deletion of an Alu mobile element with respect to a reference. kareneilbeck 2015-09-04T13:47:16Z sequence SO:0002070 Alu_deletion A deletion of an Alu mobile element with respect to a reference. NCBI:th A CDS that is supported by proteomics data. kareneilbeck 2015-10-12T13:25:02Z sequence SO:0002071 CDS_supported_by_peptide_spectrum_match A CDS that is supported by proteomics data. SO:ke A position or feature where two sequences have been compared. kareneilbeck 2015-11-23T14:14:32Z INSDC_feature:misc_feature INSDC_note:sequence_comparison sequence comparison sequence SO:0002072 sequence_comparison A position or feature within a sequence that is identical to the comparable position or feature of a specified reference sequence. kareneilbeck 2015-11-23T14:15:08Z no sequence alteration sequence SO:0002073 This term is requested by the ClinVar data model group for use in the allele registry and such. A sequence at a defined location that is defined to match the reference assembly.
no_sequence_alteration A position or feature within a sequence that is identical to the comparable position or feature of a specified reference sequence. SO:ke A variant that falls in an intergenic region that is 1 kb or less between 2 genes. kareneilbeck 2015-11-23T14:24:16Z ANNOVAR:upstream;downstream sequence SO:0002074 This term is added to map to the Annovar annotation 'upstream,downstream' . intergenic_1kb_variant A variant that falls in an intergenic region that is 1 kb or less between 2 genes. SO:ke ANNOVAR:upstream;downstream A sequence variant that intersects an incompletely annotated transcript. kareneilbeck 2015-11-23T14:43:51Z http://annovar.openbioinformatics.org/en/latest/user-guide/gene/ incomplete transcript variant sequence SO:0002075 This term is to map to the ANNOVAR term 'ncRNA' http://annovar.openbioinformatics.org/en/latest/user-guide/gene/ . The description in the documentation (11/23/15) 'variant overlaps a transcript without coding annotation in the gene definition'. and this is further clarified in the document: ncRNA above refers to RNA without coding annotation. It does not mean that this is a RNA that will never be translated; it merely means that the user-selected gene annotation system was not able to give a coding sequence annotation. It could still code protein products and may have such annotations in future versions of gene annotation or in another gene annotation system. For example, BC039000 is regarded as ncRNA by ANNOVAR when using UCSC Known Gene annotation, but it is regarded as a protein-coding gene by ANNOVAR when using ENSEMBL annotation. It is further clarified in the comments section as: ncRNA does NOT mean conventional non-coding RNA. It means a RNA without complete coding sequence, and it can be a coding RNA that is annotated incorrectly by RefSeq or other gene definition systems. incomplete_transcript_variant A sequence variant that intersects an incompletely annotated transcript. 
SO:ke A sequence variant that intersects the 3' UTR of an incompletely annotated transcript. kareneilbeck 2015-11-23T14:45:52Z http://annovar.openbioinformatics.org/en/latest/user-guide/gene/ ANNOVAR:ncRNA_UTR3 sequence incomplete transcript 3UTR variant SO:0002076 incomplete_transcript_3UTR_variant A sequence variant that intersects the 3' UTR of an incompletely annotated transcript. SO:ke ANNOVAR:ncRNA_UTR3 http://annovar.openbioinformatics.org/en/latest/user-guide/gene/ A sequence variant that intersects the 5' UTR of an incompletely annotated transcript. kareneilbeck 2015-11-24T12:39:17Z http://annovar.openbioinformatics.org/en/latest/user-guide/gene/ ANNOVAR:ncRNA_UTR5 incomplete transcript 5UTR variant sequence SO:0002077 incomplete_transcript_5UTR_variant A sequence variant that intersects the 5' UTR of an incompletely annotated transcript. SO:ke ANNOVAR:ncRNA_UTR5 http://annovar.openbioinformatics.org/en/latest/user-guide/gene/ A sequence variant that intersects the intron of an incompletely annotated transcript. kareneilbeck 2015-11-24T12:51:45Z incomplete transcript intronic variant sequence SO:0002078 incomplete_transcript_intronic_variant A sequence variant that intersects the intron of an incompletely annotated transcript. SO:ke A sequence variant that intersects the splice region of an incompletely annotated transcript. kareneilbeck 2015-11-24T12:52:06Z incomplete transcript splice region variant sequence SO:0002079 incomplete_transcript_splice_region_variant A sequence variant that intersects the splice region of an incompletely annotated transcript. SO:ke A sequence variant that intersects the exon of an incompletely annotated transcript. kareneilbeck 2015-11-24T12:52:10Z incomplete transcript exonic variant sequence SO:0002080 incomplete_transcript_exonic_variant A sequence variant that intersects the exon of an incompletely annotated transcript. SO:ke A sequence variant that intersects the coding regions of an incompletely annotated transcript. 
kareneilbeck 2015-11-24T15:32:27Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq:coding-notMod3 Seattleseq:coding-unknown sequence SO:0002081 incomplete_transcript_CDS A sequence variant that intersects the coding regions of an incompletely annotated transcript. SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq Seattleseq:coding-notMod3 Seattleseq:coding-unknown A sequence variant that intersects the coding sequence near a splice region of an incompletely annotated transcript. kareneilbeck 2015-11-24T15:51:06Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq:coding-notMod3-near-splice Seattleseq:coding-unknown-near-splice incomplete transcript coding splice variant sequence SO:0002082 incomplete_transcript_coding_splice_variant A sequence variant that intersects the coding sequence near a splice region of an incompletely annotated transcript. SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq Seattleseq:coding-notMod3-near-splice Seattleseq:coding-unknown-near-splice A sequence variant located within 2KB 3' of a gene. kareneilbeck 2015-11-24T15:55:49Z http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq:near-gene-3 sequence SO:0002083 2KB_downstream_variant A sequence variant located within 2KB 3' of a gene. SO:ke http://snp.gs.washington.edu/SeattleSeqAnnotation137/HelpHowToUse.jsp Seattleseq Seattleseq:near-gene-3 A sequence variant in which a change has occurred within the exonic region of the splice site, 1-2 bases from boundary. kareneilbeck 2015-12-01T14:38:47Z ANNOVAR:exonic;splicing exonic splice region variant sequence Seattleseq:coding-near-splice SO:0002084 exonic_splice_region_variant A sequence variant in which a change has occurred within the exonic region of the splice site, 1-2 bases from boundary. 
SO:ke ANNOVAR:exonic;splicing Seattleseq:coding-near-splice A sequence variant whereby two genes, on the same strand have become joined. kareneilbeck 2016-02-23T12:16:48Z unidirectional gene fusion sequence SO:0002085 Requested by SNPEFF team. Feb 2016. unidirectional_gene_fusion A sequence variant whereby two genes, on the same strand have become joined. SO:ke A sequence variant whereby two genes, on alternate strands have become joined. kareneilbeck 2016-02-23T12:17:18Z bidirectional gene fusion sequence SO:0002086 Requested by SNPEFF team. Feb 2016. bidirectional_gene_fusion A sequence variant whereby two genes, on alternate strands have become joined. SO:ke A non functional descendant of the coding portion of a coding transcript, part of a pseudogene. kareneilbeck 2016-02-29T12:58:52Z INSDC_feature:CDS INSDC_qualifier:pseudo pseudogenic CDS sequence SO:0002087 pseudogenic_CDS A non functional descendant of the coding portion of a coding transcript, part of a pseudogene. SO:ke A transcript variant occurring within the splice region (1-3 bases of the exon or 3-8 bases of the intron) of a non coding transcript. kareneilbeck 2016-03-07T09:40:46Z ANNOVAR:ncRNA_splicing sequence SO:0002088 non_coding_transcript_splice_region_variant A transcript variant occurring within the splice region (1-3 bases of the exon or 3-8 bases of the intron) of a non coding transcript. SO:ke A UTR variant of exonic sequence of the 3' UTR. kareneilbeck 2016-03-07T10:37:04Z 3 prime UTR exon variant sequence SO:0002089 Requested by visze github tracker ID 346. 3_prime_UTR_exon_variant A UTR variant of exonic sequence of the 3' UTR. SO:ke A UTR variant of intronic sequence of the 3' UTR. kareneilbeck 2016-03-07T10:37:41Z 3 prime UTR intron variant sequence SO:0002090 Requested by visze github tracker ID 346. 3_prime_UTR_intron_variant A UTR variant of intronic sequence of the 3' UTR. SO:ke A UTR variant of intronic sequence of the 5' UTR. 
kareneilbeck 2016-03-07T10:38:04Z 5 prime UTR intron variant sequence SO:0002091 Requested by visze github tracker ID 346. 5_prime_UTR_intron_variant A UTR variant of intronic sequence of the 5' UTR. SO:ke A UTR variant of exonic sequence of the 5' UTR. kareneilbeck 2016-03-07T10:38:26Z 5 prime UTR exon variant sequence SO:0002092 Requested by visze github tracker ID 346. 5_prime_UTR_exon_variant A UTR variant of exonic sequence of the 5' UTR. SO:ke A variant that impacts the internal interactions of the resulting polypeptide structure. kareneilbeck 2016-03-07T11:43:55Z structural interaction variant sequence SO:0002093 Requested by Pablo Cingolani. The way I calculate this is simply by looking at the PDB entry of one protein and then marking those AA that are within 3 Angstrom of each other (and far away in the AA sequence, e.g. over 20 AA distance). The assumption is that, since they are very close in distance, they must be "interacting" and thus important for protein structure. structural_interaction_variant A variant that impacts the internal interactions of the resulting polypeptide structure. SO:ke A genomic region at a non-allelic position where exchange of genetic material happens as a result of homologous recombination. nicole 2016-05-17T13:34:12Z INSDC_feature:misc_recomb INSDC_qualifier:non_allelic_homologous INSDC_qualifier:non_allelic_homologous_recombination NAHRR non allelic homologous recombination region sequence SO:0002094 non_allelic_homologous_recombination_region A ncRNA, specific to the Cajal body, that has been demonstrated to function as a guide RNA in the site-specific synthesis of 2'-O-ribose-methylated nucleotides and pseudouridines in the RNA polymerase II-transcribed U1, U2, U4 and U5 spliceosomal small nuclear RNAs (snRNAs). 
nicole 2016-05-19T13:42:45Z http://www.ncbi.nlm.nih.gov/pmc/articles/PMC126017/ small Cajal body specific RNA small Cajal body-specific RNA sequence SO:0002095 Moved from is_a ncRNA (SO:0000655) to is_a snoRNA (SO:0000275) as per request from FlyBase by Dave Sant 24 April 2021. See GitHub Issue #509. scaRNA A ncRNA, specific to the Cajal body, that has been demonstrated to function as a guide RNA in the site-specific synthesis of 2'-O-ribose-methylated nucleotides and pseudouridines in the RNA polymerase II-transcribed U1, U2, U4 and U5 spliceosomal small nuclear RNAs (snRNAs). PMC:126017 PMID:27775477 PMID:28869095 SO:nrs A variation that expands or contracts a tandem repeat with regard to a reference. kareneilbeck 2016-07-14T16:04:40Z short tandem repeat variation str variation sequence SO:0002096 short_tandem_repeat_variation A variation that expands or contracts a tandem repeat with regard to a reference. SO:ke A pseudogene derived from a vertebrate immune system gene. kareneilbeck 2016-07-15T16:00:22Z vertebrate immune system pseudogene sequence SO:0002097 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. vertebrate_immune_system_pseudogene A pseudogene derived from a vertebrate immune system gene. SO:ke A pseudogene derived from an immunoglobulin gene. kareneilbeck 2016-07-15T16:01:47Z immunoglobulin pseudogene sequence SO:0002098 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. immunoglobulin_pseudogene A pseudogene derived from an immunoglobulin gene. SO:ke A pseudogene derived from a T-cell receptor gene. kareneilbeck 2016-07-15T16:02:18Z sequence SO:0002099 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. T_cell_receptor_pseudogene A pseudogene derived from a T-cell receptor gene. 
SO:ke A pseudogenic constant region of an immunoglobulin gene which closely resembles a known functional Immunoglobulin constant gene but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon. kareneilbeck 2016-07-15T16:05:08Z IG C pseudogene sequence SO:0002100 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. IG_C_pseudogene A pseudogenic constant region of an immunoglobulin gene which closely resembles a known functional Immunoglobulin constant gene but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A pseudogenic joining region which closely resembles a known functional immunoglobulin joining gene but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon that rearranges at the DNA level and codes the joining region of the variable domain of an immunoglobulin chain. kareneilbeck 2016-07-15T16:05:34Z IG J pseudogene IG joining pseudogene IG_joining_pseudogene Immunoglobulin Joining Pseudogene Immunoglobulin_Joining_Pseudogene sequence SO:0002101 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. IG_J_pseudogene A pseudogenic joining region which closely resembles a known functional immunoglobulin joining gene but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon that rearranges at the DNA level and codes the joining region of the variable domain of an immunoglobulin chain.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A pseudogenic variable region which closely resembles a known functional immunoglobulin variable gene but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon that rearranges at the DNA level and codes the variable region of an immunoglobulin chain. kareneilbeck 2016-07-15T16:05:56Z IG V pseudogene IG variable pseudogene IG_variable_pseudogene Immunoglobulin variable pseudogene Immunoglobulin_variable_pseudogene sequence SO:0002102 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. IG_V_pseudogene A pseudogenic variable region which closely resembles a known functional immunoglobulin variable gene but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon that rearranges at the DNA level and codes the variable region of an immunoglobulin chain. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A pseudogenic variable region which closely resembles a known functional T receptor variable gene but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon that rearranges at the DNA level and codes the variable region of an immunoglobulin chain. kareneilbeck 2016-07-15T16:06:29Z T cell receptor V pseudogene T cell receptor Variable pseudogene TR V pseudogene T_cell_receptor_V_pseudogene T_cell_receptor_Variable_pseudogene sequence SO:0002103 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. 
TR_V_pseudogene A pseudogenic variable region which closely resembles a known functional T receptor variable gene but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon that rearranges at the DNA level and codes the variable region of an immunoglobulin chain. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A pseudogenic joining region which closely resembles a known functional T receptor (TR) joining gene but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon that rearranges at the DNA level and codes the joining region of the variable domain of an immunoglobulin chain. kareneilbeck 2016-07-15T16:06:51Z T cell receptor J pseudogene T cell receptor Joining pseudogene TR J pseudogene T_cell_receptor_J_pseudogene T_cell_receptor_Joining_pseudogene sequence SO:0002104 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. TR_J_pseudogene A pseudogenic joining region which closely resembles a known functional T receptor (TR) joining gene but in which the coding region has stop codons, frameshift mutations or a mutation that affects the initiation codon that rearranges at the DNA level and codes the joining region of the variable domain of an immunoglobulin chain. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A processed pseudogene where there is evidence (mass spec data) suggesting that it is also translated. kareneilbeck 2016-07-18T12:31:53Z translated processed pseudogene sequence SO:0002105 Term added as part of collaboration with Gencode, adding biotypes used in annotation. 
translated_processed_pseudogene A processed pseudogene where there is evidence (mass spec data) suggesting that it is also translated. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A non-processed pseudogene where there is evidence (mass spec data) suggesting that it is also translated. kareneilbeck 2016-07-18T12:34:42Z translated unprocessed pseudogene translated_nonprocessed_pseudogene sequence SO:0002106 Term added as part of collaboration with Gencode, adding biotypes used in annotation. translated_unprocessed_pseudogene A non-processed pseudogene where there is evidence (mass spec data) suggesting that it is also translated. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html An unprocessed pseudogene supported by locus-specific evidence of transcription. kareneilbeck 2016-07-18T12:41:53Z transcribed unprocessed pseudogene transcribed_non_processed_pseudogene sequence SO:0002107 Term added as part of collaboration with Gencode, adding biotypes used in annotation. transcribed_unprocessed_pseudogene An unprocessed pseudogene supported by locus-specific evidence of transcription. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A species-specific unprocessed pseudogene without a parent gene, as it has an active orthologue in another species. kareneilbeck 2016-07-18T12:44:26Z transcribed unitary pseudogene sequence SO:0002108 Term added as part of collaboration with Gencode, adding biotypes used in annotation. transcribed_unitary_pseudogene A species-specific unprocessed pseudogene without a parent gene, as it has an active orthologue in another species. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A processed_pseudogene overlapped by locus-specific evidence of transcription. kareneilbeck 2016-07-18T12:45:48Z transcribed processed pseudogene sequence SO:0002109 Term added as part of collaboration with Gencode, adding biotypes used in annotation. 
transcribed_processed_pseudogene A processed_pseudogene overlapped by locus-specific evidence of transcription. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A polymorphic pseudogene in the reference genome, containing a retained intron, known to be intact in the genomes of other individuals of the same species. The annotation process has confirmed that the pseudogenisation event is not a genomic sequencing error. kareneilbeck 2016-07-18T12:47:33Z polymorphic pseudogene with retained intron sequence SO:0002110 Term added as part of collaboration with Gencode, adding biotypes used in annotation. polymorphic_pseudogene_with_retained_intron A polymorphic pseudogene in the reference genome, containing a retained intron, known to be intact in the genomes of other individuals of the same species. The annotation process has confirmed that the pseudogenisation event is not a genomic sequencing error. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A processed_transcript supported by EST and/or mRNA evidence that aligns unambiguously to a pseudogene locus (i.e. alignment to the pseudogene locus clearly better than alignment to parent locus). kareneilbeck 2016-07-18T14:07:00Z pseudogene processed transcript sequence SO:0002111 Term added as part of collaboration with Gencode, adding biotypes used in annotation. pseudogene_processed_transcript A processed_transcript supported by EST and/or mRNA evidence that aligns unambiguously to a pseudogene locus (i.e. alignment to the pseudogene locus clearly better than alignment to parent locus). GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A protein coding transcript containing a retained intron. kareneilbeck 2016-07-18T14:09:49Z sequence mRNA with retained intron SO:0002112 Term added as part of collaboration with Gencode, adding biotypes used in annotation. coding_transcript_with_retained_intron A protein coding transcript containing a retained intron. 
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A lncRNA transcript containing a retained intron. kareneilbeck 2016-07-18T14:13:07Z lncRNA with retained intron lncRNA_retained_intron sequence SO:0002113 Term added as part of collaboration with Gencode, adding biotypes used in annotation. lncRNA_with_retained_intron A lncRNA transcript containing a retained intron. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A protein coding transcript that contains a CDS but has one or more splice junctions >50bp downstream of stop codon, making it susceptible to nonsense mediated decay. kareneilbeck 2016-07-18T14:16:13Z http://www.gencodegenes.org/gencode_biotypes.html NMD transcript nonsense mediated decay transcript protein_coding_NMD sequence SO:0002114 Term added as part of collaboration with Gencode, adding biotypes used in annotation. NMD_transcript A protein coding transcript that contains a CDS but has one or more splice junctions >50bp downstream of stop codon, making it susceptible to nonsense mediated decay. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html http://www.gencodegenes.org/gencode_biotypes.html GENCODE A transcript supported by EST and/or mRNA evidence that aligns unambiguously to the pseudogene locus; has retained intronic sequence compared to a reference transcript sequence. kareneilbeck 2016-07-18T14:19:04Z pseudogene retained intron sequence SO:0002115 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. pseudogenic_transcript_with_retained_intron A transcript supported by EST and/or mRNA evidence that aligns unambiguously to the pseudogene locus; has retained intronic sequence compared to a reference transcript sequence. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A processed transcript that does not contain a CDS that fulfills annotation criteria and is not necessarily functionally non-coding. 
kareneilbeck 2016-07-18T14:23:59Z polymorphic pseudogene processed transcript sequence SO:0002116 Term added as part of collaboration with Gencode, adding biotypes used in annotation. polymorphic_pseudogene_processed_transcript A processed transcript that does not contain a CDS that fulfills annotation criteria and is not necessarily functionally non-coding. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html kareneilbeck 2016-07-18T14:27:21Z sequence SO:0002117 <new term> true A polymorphic pseudogene transcript that contains a CDS but has one or more splice junctions >50bp downstream of stop codon. Premature stop codon is not introduced, directly or indirectly, as a result of the variation i.e. must be present in both protein_coding and pseudogenic alleles. kareneilbeck 2016-07-18T14:28:02Z NMD polymorphic pseudogene transcript nonsense_mediated_decay_polymorphic_pseudogene sequence SO:0002118 Term added as part of collaboration with Gencode, adding biotypes used in annotation. NMD_polymorphic_pseudogene_transcript A polymorphic pseudogene transcript that contains a CDS but has one or more splice junctions >50bp downstream of stop codon. Premature stop codon is not introduced, directly or indirectly, as a result of the variation i.e. must be present in both protein_coding and pseudogenic alleles. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A physical quality which inheres to the allele by virtue of the number of instances of the allele within a population. This is the relative frequency of the allele at a given locus in a population. kareneilbeck 2016-07-21T11:58:55Z wikipedia:Allele_frequency sequence SO:0002119 Requested by HL7 clinical genomics group. allelic_frequency A physical quality which inheres to the allele by virtue of the number of instances of the allele within a population. This is the relative frequency of the allele at a given locus in a population. 
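The allelic_frequency definition above (relative frequency of an allele at a given locus in a population) can be illustrated with a minimal Python sketch. The function name `allele_frequencies` and the genotype-tuple input format are assumptions made for this illustration; they are not part of the ontology.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return the relative frequency of each allele at a single locus.

    `genotypes` is an iterable of tuples, one tuple of alleles per
    individual, e.g. ("A", "G") for a diploid heterozygote.
    (Hypothetical input format chosen for this sketch.)
    """
    # Count every allele copy observed across all individuals.
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles observed")
    # Relative frequency = copies of the allele / all allele copies.
    return {allele: n / total for allele, n in counts.items()}


# Four diploid individuals: 5 copies of A and 3 copies of G.
population = [("A", "A"), ("A", "G"), ("A", "A"), ("G", "G")]
freqs = allele_frequencies(population)
```

Counting allele copies rather than individuals is what makes the result a frequency "at a given locus in a population" regardless of ploidy.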
SO:ke Transcript where ditag (digital gene expression profiling) and/or published experimental data strongly supports the existence of short non-coding transcripts transcribed from the 3'UTR. nicole 2016-08-23T15:48:21Z 3'_overlapping_ncrna 3prime_overlapping_ncRNA three prime overlapping noncoding rna sequence SO:0002120 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. three_prime_overlapping_ncrna Transcript where ditag (digital gene expression profiling) and/or published experimental data strongly supports the existence of short non-coding transcripts transcribed from the 3'UTR. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html The configuration of the IG and TR variable (V), diversity (D) and joining (J) germline genes before DNA rearrangements (with or without constant (C) genes in undefined configuration) (germline, non-rearranged regions of the IG DNA loci). nicole 2016-08-23T15:54:51Z immune_gene sequence SO:0002121 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. vertebrate_immune_system_gene The configuration of the IG and TR variable (V), diversity (D) and joining (J) germline genes before DNA rearrangements (with or without constant (C) genes in undefined configuration) (germline, non-rearranged regions of the IG DNA loci). GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A germline immunoglobulin gene. nicole 2016-08-23T15:56:09Z All_IG_genes IG_genes sequence SO:0002122 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. immunoglobulin_gene A germline immunoglobulin gene. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A constant (C) gene, a gene that codes the constant region of an immunoglobulin chain. 
nicole 2016-08-23T15:57:29Z IGC_gene Immunoglobulin_Constant_germline_Gene immunoglobulin_C_gene sequence SO:0002123 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. IG_C_gene A constant (C) gene, a gene that codes the constant region of an immunoglobulin chain. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A gene that rearranges at the DNA level and codes the diversity region of the variable domain of an immunoglobulin (IG) gene. nicole 2016-08-23T15:59:10Z IGD_gene Immunoglobulin_Diversity_ gene immunoglobulin_D_gene sequence SO:0002124 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. IG_D_gene A gene that rearranges at the DNA level and codes the diversity region of the variable domain of an immunoglobulin (IG) gene. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A joining gene that rearranges at the DNA level and codes the joining region of the variable domain of an immunoglobulin chain. nicole 2016-08-23T16:00:36Z IG_joining_gene Immunoglobulin_Joining_Gene immunoglobulin_J_gene sequence SO:0002125 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. IG_J_gene A joining gene that rearranges at the DNA level and codes the joining region of the variable domain of an immunoglobulin chain. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A variable gene that rearranges at the DNA level and codes the variable region of the variable domain of an Immunoglobulin chain. 
nicole 2016-08-23T16:02:09Z IGV_gene IG_variable_gene Immunoglobulin_variable_gene sequence SO:0002126 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. IG_V_gene A variable gene that rearranges at the DNA level and codes the variable region of the variable domain of an Immunoglobulin chain. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A gene that encodes a long non-coding RNA. nicole 2016-08-23T16:03:33Z lnc RNA gene lnc_RNA_gene long_non_coding_RNA_gene sequence SO:0002127 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. lncRNA_gene A gene that encodes a long non-coding RNA. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html Mitochondrial rRNA is an RNA component of the small or large subunits of mitochondrial ribosomes. nicole 2016-08-23T16:08:59Z Mt rRNA Mt_rRNA mitochondrial rRNA mitochondrial_rRNA sequence SO:0002128 Updated definition to be consistent with format of other rRNA definitions on 10 June 2021. Requested by EBI. See GitHub Issue #493. mt_rRNA Mitochondrial rRNA is an RNA component of the small or large subunits of mitochondrial ribosomes. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html Mitochondrial transfer RNA. nicole 2016-08-23T16:10:17Z Mt_tRNA mitochondrial_tRNA sequence SO:0002129 mt_tRNA Mitochondrial transfer RNA. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A transcript that contains a CDS but has no stop codon before the polyA site is reached. nicole 2016-08-23T16:11:34Z non_stop_decay_transcript sequence SO:0002130 NSD_transcript A transcript that contains a CDS but has no stop codon before the polyA site is reached. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A long non-coding transcript found within an intron of a coding or non-coding gene, with no overlap of exonic sequence. 
nicole 2016-08-23T16:15:02Z SO:0001903 sense intronic lncRNA sense_intronic sense_intronic_lncRNA sense_intronic_non-coding_RNA sequence SO:0002131 Updating the names of sense_intronic_ncRNA (SO:0002131), sense_overlap_ncRNA (SO:0002132), sense_overlap_ncRNA_gene (SO:0002183), and sense_intronic_ncRNA_gene (SO:0002184) to _lncRNA. See GitHub Issue #579. sense_intronic_lncRNA A long non-coding transcript found within an intron of a coding or non-coding gene, with no overlap of exonic sequence. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A long non-coding transcript that contains a protein coding gene within its intronic sequence on the same strand, with no overlap of exonic sequence. nicole 2016-08-23T16:16:13Z sense overlap lncRNA sense_overlap_lncRNA sense_overlapping sequence SO:0002132 Updating the names of sense_intronic_ncRNA (SO:0002131), sense_overlap_ncRNA (SO:0002132), sense_overlap_ncRNA_gene (SO:0002183), and sense_intronic_ncRNA_gene (SO:0002184) to _lncRNA. See GitHub Issue #579. sense_overlap_lncRNA A long non-coding transcript that contains a protein coding gene within its intronic sequence on the same strand, with no overlap of exonic sequence. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html A T-cell receptor germline gene. nicole 2016-08-23T16:17:12Z TR_gene sequence SO:0002133 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. T_cell_receptor_gene A constant (C) gene, a gene that codes the constant region of a T-cell receptor chain. nicole 2016-08-23T16:19:20Z T_cell_receptor_C_gene sequence SO:0002134 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. TR_C_Gene A constant (C) gene, a gene that codes the constant region of a T-cell receptor chain. 
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A gene that rearranges at the DNA level and codes the diversity region of the variable domain of a T-cell receptor gene. nicole 2016-08-23T16:20:06Z T_cell_receptor_D_gene sequence SO:0002135 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. TR_D_Gene A gene that rearranges at the DNA level and codes the diversity region of the variable domain of a T-cell receptor gene. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A joining gene that rearranges at the DNA level and codes the joining region of the variable domain of a T-cell receptor chain. nicole 2016-08-23T16:20:36Z T_cell_receptor_J_gene sequence SO:0002136 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. TR_J_Gene A joining gene that rearranges at the DNA level and codes the joining region of the variable domain of a T-cell receptor chain. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A variable gene that rearranges at the DNA level and codes the variable region of the variable domain of a T-cell receptor chain. nicole 2016-08-23T16:21:04Z T_cell_receptor_V_gene sequence SO:0002137 These terms have been requested by Adam Frankish to support Gencode and Vega biotypes. The terms are defined according to IGMT. TR_V_Gene A variable gene that rearranges at the DNA level and codes the variable region of the variable domain of a T-cell receptor chain. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html IGMT:http://www.imgt.org/IMGTScientificChart/SequenceDescription/Keywords.php A transcript feature that has been predicted but is not yet validated. 
nicole 2016-08-23T16:27:38Z predicted transcript sequence SO:0002138 predicted_transcript A transcript feature that has been predicted but is not yet validated. SO:ke This is used for non-spliced EST clusters that have polyA features. This category has been specifically created for the ENCODE project to highlight regions that could indicate the presence of protein coding genes that require experimental validation, either by 5' RACE or RT-PCR to extend the transcripts, or by confirming expression of the putatively-encoded peptide with specific antibodies. nicole 2016-08-23T16:28:07Z TEC to_be_experimentally_confirmed_transcript sequence SO:0002139 unconfirmed_transcript This is used for non-spliced EST clusters that have polyA features. This category has been specifically created for the ENCODE project to highlight regions that could indicate the presence of protein coding genes that require experimental validation, either by 5' RACE or RT-PCR to extend the transcripts, or by confirming expression of the putatively-encoded peptide with specific antibodies. GENCODE:http://www.gencodegenes.org/gencode_biotypes.html An origin of replication that initiates early in S phase. nicole 2016-09-15T15:53:36Z early origin early origin of replication early replication origin sequence SO:0002140 early_origin_of_replication An origin of replication that initiates early in S phase. PMID:23348837 PMID:9115207 An origin of replication that initiates late in S phase. nicole 2016-09-15T15:56:07Z late origin late origin of replication late replication origin sequence SO:0002141 late_origin_of_replication An origin of replication that initiates late in S phase. PMID:23348837 PMID:9115207 A histone 2A modification where the modification is the acetylation of the residue. nicole 2016-10-25T12:03:46Z H2Aac histone 2A acetylation site sequence SO:0002142 histone_2A_acetylation_site A histone 2A modification where the modification is the acetylation of the residue. 
ISBN:0815341059 A histone 2B modification where the modification is the acetylation of the residue. nicole 2016-10-25T12:04:04Z H2Bac histone 2B acetylation site sequence SO:0002143 histone_2B_acetylation_site A histone 2B modification where the modification is the acetylation of the residue. ISBN:0815341059 A histone 2AZ modification where the modification is the acetylation of the residue. nicole 2016-10-25T14:11:49Z H2A.Zac H2AZac histone 2AZ acetylation site sequence SO:0002144 histone_2AZ_acetylation_site A histone 2AZ modification where the modification is the acetylation of the residue. PMID:19385636 PMID:24316985 PMID:27087541 A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H2AZ histone protein is acetylated. nicole 2016-10-25T14:19:43Z H2A.ZK4ac H2AZK4 acetylation site H2AZK4ac sequence SO:0002145 H2AZK4_acetylation_site A kind of histone modification site, whereby the 4th residue (a lysine), from the start of the H2AZ histone protein is acetylated. PMID:19385636 PMID:24316985 PMID:27087541 A kind of histone modification site, whereby the 7th residue (a lysine), from the start of the H2AZ histone protein is acetylated. nicole 2016-10-25T14:23:11Z H2A.ZK7ac H2AZK7 acetylation site H2AZK7ac sequence SO:0002146 H2AZK7_acetylation_site A kind of histone modification site, whereby the 7th residue (a lysine), from the start of the H2AZ histone protein is acetylated. PMID:19385636 PMID:24316985 PMID:27087541 A kind of histone modification site, whereby the 11th residue (a lysine), from the start of the H2AZ histone protein is acetylated. nicole 2016-10-25T14:23:31Z H2A.ZK11ac H2AZK11 acetylation site H2AZK11ac sequence SO:0002147 H2AZK11_acetylation_site A kind of histone modification site, whereby the 11th residue (a lysine), from the start of the H2AZ histone protein is acetylated. 
PMID:19385636 PMID:24316985 PMID:27087541 A kind of histone modification site, whereby the 13th residue (a lysine), from the start of the H2AZ histone protein is acetylated. nicole 2016-10-25T14:23:50Z H2A.ZK13ac H2AZK13 acetylation site H2AZK13ac sequence SO:0002148 H2AZK13_acetylation_site A kind of histone modification site, whereby the 13th residue (a lysine), from the start of the H2AZ histone protein is acetylated. PMID:19385636 PMID:24316985 PMID:27087541 A kind of histone modification site, whereby the 15th residue (a lysine), from the start of the H2AZ histone protein is acetylated. nicole 2016-10-25T14:24:08Z H2A.ZK15ac H2AZK15 acetylation site H2AZK15ac sequence SO:0002149 H2AZK15_acetylation_site A kind of histone modification site, whereby the 15th residue (a lysine), from the start of the H2AZ histone protein is acetylated. PMID:19385636 PMID:24316985 PMID:27087541 A uORF beginning with the canonical start codon AUG. nicole 2016-10-26T09:37:11Z AUG initiated uORF sequence SO:0002150 AUG_initiated_uORF A uORF beginning with the canonical start codon AUG. PMID:26684391 PMID:27313038 A uORF beginning with a codon other than AUG. nicole 2016-10-26T09:37:45Z non AUG initiated uORF sequence SO:0002151 non_AUG_initiated_uORF A uORF beginning with a codon other than AUG. PMID:26684391 PMID:27313038 A variant that falls downstream of a transcript, but within the genic region of the gene due to alternately transcribed isoforms. nicole 2016-10-28T10:20:55Z genic 3 prime transcript variant genic 3' transcript variant genic downstream transcript variant sequence SO:0002152 genic_downstream_transcript_variant A variant that falls downstream of a transcript, but within the genic region of the gene due to alternately transcribed isoforms. NCBI:dm SO:ke A variant that falls upstream of a transcript, but within the genic region of the gene due to alternately transcribed isoforms. 
nicole 2016-10-28T10:23:17Z genic 5 prime transcript variant genic 5' transcript variant genic upstream transcript variant sequence SO:0002153 genic_upstream_transcript_variant A variant that falls upstream of a transcript, but within the genic region of the gene due to alternately transcribed isoforms. NCBI:dm SO:ke A genomic region where there is an exchange of genetic material with another genomic region, occurring in somatic cells. nicole 2016-10-28T10:33:54Z INSDC_feature:misc_recomb INSDC_qualifier:mitotic INSDC_qualifier:mitotic_recombination mitotic recombination region sequence SO:0002154 mitotic_recombination_region A genomic region where there is an exchange of genetic material with another genomic region, occurring in somatic cells. NCBI:cf SO:ke A genomic region in which there is an exchange of genetic material as a result of the repair of meiosis-specific double strand breaks that occur during meiotic prophase. nicole 2016-10-28T10:34:55Z INSDC_feature:misc_recomb INSDC_qualifier:meiotic INSDC_qualifier:meiotic_recombination meiotic recombination region sequence SO:0002155 meiotic_recombination_region A genomic region in which there is an exchange of genetic material as a result of the repair of meiosis-specific double strand breaks that occur during meiotic prophase. NCBI:cf SO:ke A promoter element bound by the MADS family of transcription factors with consensus 5'-(C/T)TA(T/A)4TA(G/A)-3'. nicole 2016-10-28T10:42:06Z CArG box sequence SO:0002156 Requested by Antonia Lock CArG_box A promoter element bound by the MADS family of transcription factors with consensus 5'-(C/T)TA(T/A)4TA(G/A)-3'. PMID:1748287 PMID:7623803 A gene cassette array containing H+ mating type specific information. nicole 2016-11-17T10:59:00Z sequence SO:0002157 Mat2P A gene cassette array containing H+ mating type specific information. PMID:18354497 A gene cassette array containing H- mating type specific information. 
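The CArG_box consensus quoted above, 5'-(C/T)TA(T/A)4TA(G/A)-3', can be read as a regular expression. The following is a minimal Python sketch under stated assumptions: the name `find_carg_boxes` is hypothetical, "(T/A)4" is interpreted as exactly four A-or-T positions, and only the given strand is scanned (no reverse complement).

```python
import re

# Consensus 5'-(C/T)TA(T/A)4TA(G/A)-3' as a regex. The lookahead lets
# overlapping candidate sites be reported as well.
CARG_BOX = re.compile(r"(?=([CT]TA[AT]{4}TA[AG]))")


def find_carg_boxes(sequence):
    """Return (start, matched_site) pairs for CArG-box-like sites.

    Positions are 0-based offsets into `sequence`; matching is
    case-insensitive because the input is upper-cased first.
    """
    return [(m.start(), m.group(1)) for m in CARG_BOX.finditer(sequence.upper())]


hits = find_carg_boxes("ggCTAAATTTAGcc")
```

A consensus match like this only flags candidate sites; whether a site is actually bound by a MADS family factor is an experimental question that the pattern cannot answer.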
nicole 2016-11-17T11:02:27Z sequence SO:0002158 Mat3M A gene cassette array containing H- mating type specific information. PMID:18354497 A conserved Cdc48/p97 interaction motif with strict consensus sequence F[PI]GKG[TK][RK]LG[GT] and relaxed consensus sequence FXGKGX[RK]LG. nicole 2016-12-15T09:48:38Z SHP box sequence SO:0002159 SHP_box A conserved Cdc48/p97 interaction motif with strict consensus sequence F[PI]GKG[TK][RK]LG[GT] and relaxed consensus sequence FXGKGX[RK]LG. PMID:17083136 PMID:27655872 A sequence variant that changes the length of one or more sequence features. nicole 2017-04-26T12:31:12Z sequence length variant sequence SO:0002160 sequence_length_variant A sequence variant where the copies of a short tandem repeat (STR) feature are either contracted or expanded. nicole 2017-04-26T12:50:55Z short tandem repeat change str change sequence SO:0002161 short_tandem_repeat_change A short tandem repeat variant containing more repeat units than the reference sequence. nicole 2017-04-26T12:51:26Z short tandem repeat expansion str expansion sequence SO:0002162 short_tandem_repeat_expansion A short tandem repeat variant containing fewer repeat units than the reference sequence. nicole 2017-04-26T12:52:33Z short tandem repeat contraction str contraction sequence SO:0002163 short_tandem_repeat_contraction A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H2B histone protein is acetylated. nicole 2017-05-17T15:22:58Z H2BK5 acetylation site H2BK5ac sequence SO:0002164 H2BK5_acetylation_site A kind of histone modification site, whereby the 5th residue (a lysine), from the start of the H2B histone protein is acetylated. PMID:18552846 http://www.actrec.gov.in/histome/ptm_sp.php?ptm_sp=H2BK5ac A short tandem repeat expansion with an increase in a sequence of three nucleotide units repeated in tandem compared to a reference sequence. 
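The short tandem repeat terms above distinguish an expansion (more repeat units than the reference), a contraction (fewer repeat units) and, for three-nucleotide units, a trinucleotide repeat expansion. A minimal Python sketch of that decision follows. The function names and the simplification that both alleles begin at the first repeat unit are assumptions made for illustration, not part of the ontology.

```python
def count_repeat_units(sequence, unit):
    """Count consecutive copies of `unit` at the start of `sequence`."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    copies = 0
    # str.startswith accepts a start offset, so no slicing is needed.
    while sequence.startswith(unit, copies * len(unit)):
        copies += 1
    return copies


def classify_str_change(reference, alternate, unit):
    """Return the most specific matching term name, or None if unchanged."""
    ref_copies = count_repeat_units(reference, unit)
    alt_copies = count_repeat_units(alternate, unit)
    if alt_copies > ref_copies:
        # Three-nucleotide units get the more specific child term.
        if len(unit) == 3:
            return "trinucleotide_repeat_expansion"
        return "short_tandem_repeat_expansion"
    if alt_copies < ref_copies:
        return "short_tandem_repeat_contraction"
    return None


label = classify_str_change("CAG" * 5, "CAG" * 8, "CAG")
```

Returning the most specific term mirrors how the ontology nests trinucleotide_repeat_expansion under short_tandem_repeat_expansion; a caller that only needs the parent category can map the child back to it.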
nicole 2017-06-02T10:43:42Z trinucleotide repeat expansion sequence SO:0002165 trinucleotide_repeat_expansion A ref_miRNA (RefSeq-miRNA) sequence is assigned at the creation of a new mature miRNA entry in a database. The ref_miRNA sequence designation remains unchanged even if a different isomiR is later shown to be expressed at a higher level. A ref_miRNA can be produced by one or multiple pre-miRNA. nicole 2017-06-22T11:05:49Z RefSeq miRNA RefSeq-miRNA ref miRNA sequence SO:0002166 ref_miRNA A ref_miRNA (RefSeq-miRNA) sequence is assigned at the creation of a new mature miRNA entry in a database. The ref_miRNA sequence designation remains unchanged even if a different isomiR is later shown to be expressed at a higher level. A ref_miRNA can be produced by one or multiple pre-miRNA. PMID:26453491 IsomiRs are all the bona fide variants of a mature product. IsomiRs should be connected to the ref_miRNA it is most likely to be the variant of. Some isomiRs can be variations of one or multiple ref_miRNA. nicole 2017-06-22T11:09:42Z sequence SO:0002167 isomiR IsomiRs are all the bona fide variants of a mature product. IsomiRs should be connected to the ref_miRNA it is most likely to be the variant of. Some isomiRs can be variations of one or multiple ref_miRNA. PMID:26453491 An RNA_thermometer is a cis element in the 5' end of an mRNA that can change its secondary structure in response to temperature and coordinate temperature-dependent gene expression. nicole 2017-07-17T10:07:45Z https://en.wikipedia.org/wiki/RNA_thermometer RNA thermometer RNA thermoregulator RNAT thermoregulator sequence SO:0002168 RNA_thermometer An RNA_thermometer is a cis element in the 5' end of an mRNA that can change its secondary structure in response to temperature and coordinate temperature-dependent gene expression. 
PMID:22421878 https://en.wikipedia.org/wiki/RNA_thermometer wiki A sequence variant that falls in the polypyrimidine tract at 3' end of intron between 17 and 3 bases from the end (acceptor -3 to acceptor -17). nicole 2017-07-31T13:40:13Z splice polypyrimidine tract variant sequence SO:0002169 splice_polypyrimidine_tract_variant A sequence variant that falls in the region between the 3rd and 6th base after splice junction (5' end of intron). nicole 2017-07-31T13:48:32Z splice donor region variant sequence SO:0002170 splice_donor_region_variant A telomeric D-loop is a three-stranded DNA displacement loop that forms at the site where the telomeric 3' single-stranded DNA overhang (formed of the repeat sequence TTAGGG in mammals) is tucked back inside the double-stranded component of telomeric DNA molecule, thus forming a t-loop or telomeric-loop and protecting the chromosome terminus. nicole 2017-08-01T13:12:11Z telomeric D loop sequence SO:0002171 This definition is from GO:0061820 telomeric D-loop disassembly. telomeric_D_loop A telomeric D-loop is a three-stranded DNA displacement loop that forms at the site where the telomeric 3' single-stranded DNA overhang (formed of the repeat sequence TTAGGG in mammals) is tucked back inside the double-stranded component of telomeric DNA molecule, thus forming a t-loop or telomeric-loop and protecting the chromosome terminus. PMID:10338204 PMID:15071557 PMID:24012755 A sequence_alteration where the source of the alteration is due to an artifact in the base-calling or assembly process. nicole 2017-08-18T13:43:26Z sequence alteration artifact sequence SO:0002172 sequence_alteration_artifact An indel that is the result of base-calling or assembly error. nicole 2017-08-18T15:16:20Z indel artifact sequence SO:0002173 indel_artifact A deletion that is the result of base-calling or assembly error. 
nicole 2017-08-18T15:17:11Z deletion artifact sequence SO:0002174 deletion_artifact An insertion that is the result of base-calling or assembly error. nicole 2017-08-18T15:17:42Z insertion artifact sequence SO:0002175 insertion_artifact A substitution that is the result of base-calling or assembly error. nicole 2017-08-18T15:18:12Z substitution artifact sequence SO:0002176 substitution_artifact A duplication that is the result of base-calling or assembly error. nicole 2017-08-18T15:26:00Z duplication artifact sequence SO:0002177 duplication_artifact An SNV that is the result of base-calling or assembly error. nicole 2017-08-18T15:26:49Z SNV artifact sequence SO:0002178 SNV_artifact An MNV that is the result of base-calling or assembly error. nicole 2017-08-18T15:27:21Z MNV artifact sequence SO:0002179 MNV_artifact A gene that encodes an enzymatic RNA. nicole 2017-09-27T10:30:27Z enzymatic RNA gene sequence SO:0002180 enzymatic_RNA_gene A gene that encodes a ribozyme. nicole 2017-09-27T10:31:09Z ribozyme gene sequence SO:0002181 ribozyme_gene A gene that encodes an antisense long, non-coding RNA. nicole 2017-09-27T10:44:00Z antisense lncRNA gene sequence SO:0002182 antisense_lncRNA_gene A gene that encodes a sense overlap long non-coding RNA. nicole 2017-09-27T10:48:05Z sense overlap lncRNA gene sense overlap ncRNA gene sense_overlap_lncRNA_gene sequence SO:0002183 Updating the names of sense_intronic_ncRNA (SO:0002131), sense_overlap_ncRNA (SO:0002132), sense_overlap_ncRNA_gene (SO:0002183), and sense_intronic_ncRNA_gene (SO:0002184) to _lncRNA. See GitHub Issue #579. sense_overlap_lncRNA_gene A gene that encodes a sense intronic long non-coding RNA. nicole 2017-09-27T11:03:50Z sense intronic lncRNA gene sense intronic ncRNA gene sense_intronic_lncRNA_gene sequence SO:0002184 Updating the names of sense_intronic_ncRNA (SO:0002131), sense_overlap_ncRNA (SO:0002132), sense_overlap_ncRNA_gene (SO:0002183), and sense_intronic_ncRNA_gene (SO:0002184) to _lncRNA. 
See GitHub Issue #579. sense_intronic_lncRNA_gene A non-coding locus that originates from within the promoter region of a protein-coding gene, with transcription proceeding in the opposite direction on the other strand. nicole 2017-10-03T11:43:48Z bidirectional promoter lncRNA bidirectional promoter lncRNA gene bidirectional_promoter_lncRNA_gene sequence SO:0002185 This is a gencode term. See GitHub Issue #408. Synonyms "bidirectional promoter lncRNA gene" and "bidirectional_promoter_lncRNA_gene" added 23 April 2021 by David Sant. See GitHub Issue #506. bidirectional_promoter_lncRNA_gene A non-coding locus that originates from within the promoter region of a protein-coding gene, with transcription proceeding in the opposite direction on the other strand. https://www.gencodegenes.org/pages/biotypes.html A region of genomic sequence known to undergo mutational events with greater frequency than expected by chance. nicole 2017-11-07T12:27:51Z mutational hotspot sequence SO:0002186 mutational_hotspot An insertion of sequence from the HERV family of mobile elements with respect to a reference. nicole 2017-11-20T11:52:51Z HERV insertion sequence SO:0002187 HERV_insertion An insertion of sequence from the HERV family of mobile elements with respect to a reference. NCBI:th A gene_member_region that encodes sequence that directly contributes to the molecular function of its gene or gene product. nicole 2017-12-15T11:08:43Z functional gene region sequence SO:0002188 A functional_gene_region is a sequence feature that resides within a gene. But it is typically the corresponding region of translated/transcribed sequence in a gene product, that performs the molecular function qualifying it as a functional_gene_region. 
Here, a functional_gene_region must contribute directly to the molecular function of the gene product - regions that code for purely structural elements in a gene product that connect such directly functional elements together are not considered functional_gene_regions. Examples of regions considered 'functional' include those encoding enzymatic activity, binding activity, regions required for localization or membrane association, channel-forming regions, and signal peptides or other elements critical for processing of a gene product. In addition, regions that function at the genomic/DNA level are also included - e.g. regions of sequence known to be critical for binding transcription or splicing factors. functional_gene_region A gene_member_region that encodes sequence that directly contributes to the molecular function of its gene or gene product. Clingen:mb A (unitary) pseudogene that is stable in the population but importantly it has a functional alternative allele also in the population. i.e., one strain may have the gene, another strain may have the pseudogene. MHC haplotypes have allelic pseudogenes. nicole 2018-01-03T15:47:32Z INSDC_feature:gene INSDC_qualifier:allelic allelic pseudogene sequence SO:0002189 allelic_pseudogene A transcriptional cis regulatory region that when located between an enhancer and a gene's promoter prevents the enhancer from modulating the expression of the gene. Sometimes referred to as an insulator but may not include the barrier function of an insulator. nicole 2018-01-04T17:28:52Z INSDC_feature:regulatory INSDC_qualifier:enhancer_blocking_element enhancer blocking element sequence insulator SO:0002190 Insulator is included as a related synonym since this is used to refer to insulator in the literature (NCBI:cf). enhancer_blocking_element A transcriptional cis regulatory region that when located between an enhancer and a gene's promoter prevents the enhancer from modulating the expression of the gene. 
Sometimes referred to as an insulator but may not include the barrier function of an insulator. NCBI:cf A regulatory region that controls epigenetic imprinting and affects the expression of target genes in an allele- or parent-of-origin-specific manner. Associated regulatory elements may include differentially methylated regions and non-coding RNAs. nicole 2018-01-04T17:35:34Z INSDC_feature:regulatory INSDC_qualifier:imprinting_control_region imprinting control region sequence SO:0002191 Moved from is_a regulatory_region (SO:0005836) to is_a epigenetically_modified_region (SO:0001720) on 11 Feb 2021. GREEKC members pointed out that this would be a more appropriate location. See GitHub Issue #530. imprinting_control_region A repeat lying outside the sequence for which it has functional significance (eg. transposon insertion target sites). nicole 2018-01-05T16:27:21Z INSDC_feature:repeat_region INSDC_qualifier:flanking flanking repeat sequence SO:0002192 flanking_repeat The pseudogene has arisen by reverse transcription of a mRNA into cDNA, followed by reintegration into the genome. Therefore, it has lost any intron/exon structure, and it might have a pseudo-polyA-tail. nicole 2018-01-08T11:43:58Z INSDC_feature:rRNA INSDC_qualifier:processed processed pseudogenic rRNA sequence SO:0002193 processed_pseudogenic_rRNA The pseudogene has arisen from a copy of the parent gene by duplication followed by accumulation of random mutation. The changes, compared to their functional homolog, include insertions, deletions, premature stop codons, frameshifts and a higher proportion of non-synonymous versus synonymous substitutions. nicole 2018-01-08T11:49:41Z INSDC_feature:rRNA INSDC_qualifier:unprocessed unprocessed pseudogenic rRNA sequence SO:0002194 unprocessed_pseudogenic_rRNA The pseudogene has no parent. It is the original gene, which is functional in some species but disrupted in some way (indels, mutation, recombination) in another species or strain. 
nicole 2018-01-08T11:51:59Z INSDC_feature:rRNA INSDC_qualifier:unitary unitary pseudogenic rRNA sequence SO:0002195 unitary_pseudogenic_rRNA A (unitary) pseudogene that is stable in the population but importantly it has a functional alternative allele also in the population. i.e., one strain may have the gene, another strain may have the pseudogene. MHC haplotypes have allelic pseudogenes. nicole 2018-01-08T11:53:13Z INSDC_feature:rRNA INSDC_qualifier:allelic allelic pseudogenic rRNA sequence SO:0002196 allelic_pseudogenic_rRNA The pseudogene has arisen by reverse transcription of a mRNA into cDNA, followed by reintegration into the genome. Therefore, it has lost any intron/exon structure, and it might have a pseudo-polyA-tail. nicole 2018-01-08T12:10:10Z INSDC_feature:tRNA INSDC_qualifier:processed processed pseudogenic tRNA sequence SO:0002197 processed_pseudogenic_tRNA The pseudogene has arisen from a copy of the parent gene by duplication followed by accumulation of random mutation. The changes, compared to their functional homolog, include insertions, deletions, premature stop codons, frameshifts and a higher proportion of non-synonymous versus synonymous substitutions. nicole 2018-01-08T12:14:34Z INSDC_feature:tRNA INSDC_qualifier:unprocessed unprocessed pseudogenic tRNA sequence SO:0002198 unprocessed_pseudogenic_tRNA The pseudogene has no parent. It is the original gene, which is functional in some species but disrupted in some way (indels, mutation, recombination) in another species or strain. nicole 2018-01-08T12:16:59Z INSDC_feature:tRNA INSDC_qualifier:unitary unitary pseudogenic tRNA sequence SO:0002199 unitary_pseudogenic_tRNA A (unitary) pseudogene that is stable in the population but importantly it has a functional alternative allele also in the population. i.e., one strain may have the gene, another strain may have the pseudogene. MHC haplotypes have allelic pseudogenes. 
nicole 2018-01-08T12:18:38Z INSDC_feature:tRNA INSDC_qualifier:allelic allelic pseudogenic tRNA sequence SO:0002200 allelic_pseudogenic_tRNA A repeat at the ends of and within the sequence for which it has functional significance other than long terminal repeat. nicole 2018-01-08T13:00:59Z INSDC_feature:repeat_region INSDC_qualifier:terminal terminal repeat sequence SO:0002201 terminal_repeat A repeat region that is prone to expansions and/or contractions. nicole 2018-01-09T11:19:55Z INSDC_feature:misc_feature INSDC_note:repeat_instability_region repeat instability region sequence SO:0002202 repeat_instability_region A nucleotide site from which replication initiates. nicole 2018-01-09T11:23:35Z INSDC_feature:misc_feature INSDC_note:replication_start_site replication start site sequence SO:0002203 replication_start_site A nucleotide site from which replication initiates. NCBI:cf A point in nucleic acid where a cleavage event occurs. nicole 2018-01-09T11:30:34Z INSDC_feature:misc_feature INSDC_note:nucleotide_cleavage_site nucleotide cleavage site sequence SO:0002204 nucleotide_cleavage_site A regulatory element that acts in response to a stimulus, usually via transcription factor binding. nicole 2018-01-10T16:33:25Z INSDC_feature:regulatory INSDC_qualifier:response_element response element sequence SO:0002205 Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. response_element Identifies the biological source of the specified span of the sequence nicole 2018-01-26T09:50:58Z INSDC_feature:source sequence source sequence SO:0002206 Terms such as genomic_DNA or mRNA can be used to describe a sequence source. 
sequence_source Identifies the biological source of the specified span of the sequence NCBI:tm A hexameric RNA motif consisting of nucleotides UNAAAC (where N can be any nucleotide) that targets the RNA for degradation. nicole 2018-02-06T12:23:24Z UNAAAC motif sequence SO:0002207 UNAAAC_motif A hexameric RNA motif consisting of nucleotides UNAAAC (where N can be any nucleotide) that targets the RNA for degradation. PMID:22645662 PMID:28765164 PomBase:al An RNA that is transcribed from a long terminal repeat. nicole 2018-02-07T11:51:45Z LTR transcript long terminal repeat transcript sequence SO:0002208 long_terminal_repeat_transcript An RNA that is transcribed from a long terminal repeat. PMID:24256266 PomBase:mh A contig composed of genomic DNA derived sequences. nicole 2018-03-21T12:25:14Z gDNA contig gDNA_contig genomic DNA contig sequence SO:0002209 Requested by Bayer Crop Science, March 2018 genomic_DNA_contig A contig composed of genomic DNA derived sequences. BCS:etrwz A variation qualifying the presence of a sequence in a genome which is entirely missing in another genome. nicole 2018-03-21T12:59:14Z PAV presence absence variation presence-absence variation presence-absence_variation presence/absence variation presence/absence_variation sequence SO:0002210 Requested by Bayer Crop Science, March 2018 presence_absence_variation A variation qualifying the presence of a sequence in a genome which is entirely missing in another genome. BCS:bbean PMID:19956538 PMID:25881062 A self replicating circular nucleic acid molecule that is distinct from a chromosome in the organism. nicole 2018-04-18T11:13:38Z circular plasmid sequence SO:0002211 circular_plasmid A self replicating circular nucleic acid molecule that is distinct from a chromosome in the organism. PMID:21719542 SBOL:jb A self replicating linear nucleic acid molecule that is distinct from a chromosome in the organism. They are capped by terminal proteins covalently bound to the 5' ends of the DNA. 
nicole 2018-04-18T11:14:04Z linear plasmid sequence SO:0002212 linear_plasmid A self replicating linear nucleic acid molecule that is distinct from a chromosome in the organism. They are capped by terminal proteins covalently bound to the 5' ends of the DNA. PMID:21719542 SBOL:jb Termination signal preferentially observed downstream of polyadenylation signal nicole 2018-05-18T17:10:14Z (A(U)GUA) motif Nrd1 binding motif Nrd1-dependent terminator UCUUG motif UGUAA/G motif polyA site associated transcription termination signal polyA site downstream element transcription termination signal sequence SO:0002213 transcription_termination_signal Termination signal preferentially observed downstream of polyadenylation signal PMID:28367989 A sequence variant whereby at least one base of a codon is changed, resulting in a stop codon inserted next to an existing stop codon. This leads to a polypeptide of the same length. nicole 2018-06-13T09:53:31Z redundant inserted stop gained sequence SO:0002214 redundant_inserted_stop_gained A DNA motif to which the S. pombe Zas1 protein binds. The consensus sequence is 5'-(Y)CCCCAY-3'. nicole 2018-06-20T10:05:17Z Zas1 recognition motif sequence SO:0002215 Zas1_recognition_motif A DNA motif to which the S. pombe Zas1 protein binds. The consensus sequence is 5'-(Y)CCCCAY-3'. PMID:29735745 PomBase:vw A promoter element with consensus sequence [5'-TCG(G/C)(A/T)xxTTxAA], bound by the transcription factor Pho7. nicole 2018-09-12T12:26:50Z Pho7 binding site sequence SO:0002216 Pho7_binding_site A promoter element with consensus sequence [5'-TCG(G/C)(A/T)xxTTxAA], bound by the transcription factor Pho7. PMID:28811350 A sequence alteration which includes an insertion or a deletion. This describes a sequence length change when the direction of the change is unspecified or when such changes are pooled into one category. 
nicoleruiz 2019-02-24T18:26:05Z insertion or deletion unspecified indel sequence SO:0002217 This term is used when there is a change that is either an insertion or a deletion but it is unknown which event occurred. unspecified_indel A sequence alteration which includes an insertion or a deletion. This describes a sequence length change when the direction of the change is unspecified or when such changes are pooled into one category. ZFIN:st A sequence variant in which the function of a gene product is altered with respect to a reference. david 2019-03-01T10:21:26Z function modified variant sequence function_modified_variant functionally abnormal SO:0002218 Added after request from Lea Starita, lea.starita@gmail.com from the NCBI Feb 2019. functionally_abnormal A sequence variant in which the function of a gene product is retained with respect to a reference. david 2019-03-01T10:28:12Z function retained variant sequence function_retained_variant functionally normal SO:0002219 Added after request from Lea Starita, lea.starita@gmail.com from the NCBI Feb 2019. functionally_normal A sequence variant in which the function of a gene product is unknown with respect to a reference. david 2019-03-01T10:29:01Z function uncertain variant function_uncertain_variant sequence SO:0002220 Added after request from Lea Starita, lea.starita@gmail.com from the NCBI Feb 2019. function_uncertain_variant A regulatory_region including the Transcription Start Site (TSS) of a gene and serving as a platform for Pre-Initiation Complex (PIC) assembly, enabling transcription of a gene under certain conditions. david 2019-07-31T14:01:20Z Eukaryotic promoter sequence SO:0002221 eukaryotic_promoter A regulatory_region essential for the specific initiation of transcription at a defined location in a DNA molecule, although this location might not be one single base. It is recognized by a specific RNA polymerase(RNAP)-holoenzyme, and this recognition is not necessarily autonomous. 
david 2019-07-31T14:02:26Z Prokaryotic promoter sequence SO:0002222 prokaryotic_promoter A regulatory_region essential for the specific initiation of transcription at a defined location in a DNA molecule, although this location might not be one single base. It is recognized by a specific RNA polymerase(RNAP)-holoenzyme, and this recognition is not necessarily autonomous. PMID:32665585 Sequences that decrease interactions between biological regions, such as between a promoter, its 5' context and/or the translational unit(s) it regulates. Spacers can affect regulation of translation, transcription, and other biological processes. david 2019-09-06T19:05:52Z doi:10.1101/584664 sequence Inert DNA Spacer SO:0002223 Updated by Evan Christensen on May 27, 2021 per github request https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/494 inert_DNA_spacer Sequences that decrease interactions between biological regions, such as between a promoter, its 5' context and/or the translational unit(s) it regulates. Spacers can affect regulation of translation, transcription, and other biological processes. PMID:20843779 PMID:24933158 PMID:27034378 PMID:28422998 doi:10.1101/584664 https://www.biorxiv.org/content/10.1101/584664v1 A region that codes for a 2A self-cleaving polypeptide region, which is a region that can result in a break in the peptide sequence at its terminal G-P junction. david 2019-10-21T10:41:49Z sequence 2A polypeptide region 2A self-cleaving polypeptide region SO:0002224 Added by Dave Sant on October 21, 2019 per github request https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/475 2A_self_cleaving_peptide_region A region that codes for a 2A self-cleaving polypeptide region, which is a region that can result in a break in the peptide sequence at its terminal G-P junction. PMID:22301656 PMID:28526819 A conserved sequence (5'-CGNMGATCNTY-3') transcription repressor binding site required for gene repression in the presence of high zinc. 
david 2019-10-30T11:19:52Z sequence LRE LRE element Loz1 response element SO:0002225 Added on October 30, 2019 at the request of Val Wood, GitHub Issue #476 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/476) LOZ1_response_element A group II intron that recognizes IBS1/EBS1 for the 5-prime exon and IBS3/EBS3 for the 3-prime exon and may also recognize a stem-loop in the RNA. david 2020-03-27T08:56:34Z group IIB intron sequence SO:0002226 group_IIC_intron A group II intron that recognizes IBS1/EBS1 for the 5-prime exon and IBS3/EBS3 for the 3-prime exon and may also recognize a stem-loop in the RNA. PMID:20463000 A sequence variant extending the CDS, that causes elongation of the resulting polypeptide sequence. david 2020-03-27T17:56:30Z CDS Extension elongated CDS elongated_CDS sequence SO:0002227 Added as per request by Edward Wallace GitHub issue #480 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/480) CDS_extension A sequence variant extending the CDS, that causes elongation of the resulting polypeptide sequence. PMID:14732127 PMID:15864293 PMID:27984720 PMID:31216041 PMID:32020195 A sequence variant extending the CDS at the 5' end, that causes elongation of the resulting polypeptide sequence at the N terminus. david 2020-03-27T17:57:30Z CDS Extension 5 prime CDS Extension five prime elongated CDS five prime elongated_CDS_five_prime sequence SO:0002228 Added as per request by Edward Wallace GitHub issue #480 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/480) CDS_five_prime_extension A sequence variant extending the CDS at the 5' end, that causes elongation of the resulting polypeptide sequence at the N terminus. PMID:14732127 PMID:15864293 PMID:27984720 PMID:31216041 PMID:32020195 A sequence variant extending the CDS at the 3' end, that causes elongation of the resulting polypeptide sequence at the C terminus. 
david 2020-03-27T17:58:30Z CDS Extension 3 prime CDS Extension three prime elongated CDS three prime elongated_CDS_three_prime sequence SO:0002229 Added as per request by Edward Wallace GitHub issue #480 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/480) CDS_three_prime_extension A sequence variant extending the CDS at the 3' end, that causes elongation of the resulting polypeptide sequence at the C terminus. PMID:14732127 PMID:15864293 PMID:27984720 PMID:31216041 PMID:32020195 A C-terminus protein motif (CAAX) serving as a post-translational prenylation site modified by the attachment of either a farnesyl or a geranyl-geranyl group to a cysteine residue. Farnesyltransferase recognizes CaaX boxes where X = M, S, Q, A, or C, whereas Geranylgeranyltransferase I recognizes CaaX boxes with X = L or E. david 2020-03-27T18:04:30Z CAAX box sequence SO:0002230 Added as per request by Val Wood GitHub issue #479 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/479) CAAX_box An RNA that catalyzes its own cleavage. david 2020-03-30T16:02:30Z self cleaving ribozyme sequence SO:0002231 Added as per request by John T. Sexton GitHub issue #470 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/470) self_cleaving_ribozyme A genetic feature that encodes a trait used for artificial selection of a subpopulation. david 2020-04-01T10:04:30Z selectable marker selectable_marker selection marker sequence SO:0002232 Added as per request by Bryan Bartley GitHub issue #468 and #402 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/468) selection_marker A chromosomal locus where complementary lncRNA and associated proteins accumulate at the corresponding lncRNA gene loci to tether homologous chromosome during chromosome pairing at meiosis I. 
david 2020-04-14T10:09:30Z homologous chromosome recognition and pairing locus sequence SO:0002233 Added as per request by Val Wood GitHub issue #483 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/483) homologous_chromosome_recognition_and_pairing_locus A chromosomal locus where complementary lncRNA and associated proteins accumulate at the corresponding lncRNA gene loci to tether homologous chromosome during chromosome pairing at meiosis I. PMID:22582262 PMID:31811152 A cis-acting element involved in RNA stability found in the 3' UTR of some RNA (consensus UGUAAAUA). david 2020-04-14T10:40:30Z PRE binding RNA pumilio response element sequence SO:0002234 Added as per request by Val Wood GitHub issue #455 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/455) pumilio_response_element A cis-acting element involved in RNA stability found in the 3' UTR of some RNA (consensus UGUAAAUA). PMID:30601114 A polypeptide region that mediates binding to SUMO. The motif contains a hydrophobic core sequence consisting of three or four Ile, Leu, or Val residues plus one acidic or polar residue at position 2 or 3. david 2020-04-22T12:40:30Z SBM SIM SUMO binding motif SUMO interaction motif sequence SO:0002235 Added as per request GitHub issue #434 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/434) SUMO_interaction_motif A polypeptide region that mediates binding to SUMO. The motif contains a hydrophobic core sequence consisting of three or four Ile, Leu, or Val residues plus one acidic or polar residue at position 2 or 3. PMID:15388847,PMID:16524884 A gene which codes for 18S_rRNA, which functions as the small subunit of the ribosome in eukaryotes. david 2020-05-07T16:12:30Z 18S rRNA gene 18S_rRNA_gene sequence SO:0002236 Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). 
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). cytosolic_rRNA_18S_gene A gene which codes for 16S_rRNA, which functions as the small subunit of the ribosome in prokaryotes. david 2020-05-07T16:12:30Z 16S rRNA gene 16S_rRNA_gene sequence SO:0002237 Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). cytosolic_rRNA_16S_gene A gene which codes for 5S_rRNA, which is a portion of the large subunit of the ribosome in both eukaryotes and prokaryotes. david 2020-05-07T16:12:30Z 5S rRNA gene 5S_rRNA_gene sequence SO:0002238 Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). cytosolic_rRNA_5S_gene A gene which codes for 28S_rRNA, which functions as a component of the large subunit of the ribosome in eukaryotes. david 2020-05-07T16:12:30Z 28S rRNA gene 28S_rRNA_gene sequence SO:0002239 Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). cytosolic_rRNA_28S_gene A gene which codes for 5_8S_rRNA (5.8S rRNA), which functions as a component of the large subunit of the ribosome in eukaryotes. 
david 2020-05-07T16:12:30Z 5.8S rRNA gene 5_8S rRNA gene 5_8S_rRNA_gene sequence SO:0002240 Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). cytosolic_rRNA_5_8S_gene A gene which codes for 21S_rRNA, which functions as a component of the large subunit of the ribosome in mitochondria. SO:0002364 david 2020-05-07T16:12:30Z 21S rRNA gene 21S_rRNA_gene sequence SO:0002241 Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472) Removed relationship derives_from SO:0001171 on 10 June 2021 when SO:0001171 rRNA_21S was obsoleted into SO:0002345 mt_LSU_rRNA. See GitHub Issue #493. OBSOLETED on 12 September 2022, merged into SO:0002364 mt_LSU_rRNA_gene see GitHub Issue #513. rRNA_21S_gene true A gene which codes for 25S_rRNA, which functions as a component of the large subunit of the ribosome in some eukaryotes. david 2020-05-07T16:12:30Z 25S rRNA gene 25S_rRNA_gene sequence SO:0002242 Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). cytosolic_rRNA_25S_gene A gene which codes for 23S_rRNA, which functions as a component of the large subunit of the ribosome in prokaryotes. david 2020-05-07T16:12:30Z 23S rRNA gene 23S_rRNA_gene sequence SO:0002243 Added as per request by Antonia Lock GitHub issue #472 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/472). 
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). cytosolic_rRNA_23S_gene A transcript which is partially duplicated due to duplication of DNA, leading to a new transcript that is only partial and likely nonfunctional. david 2020-05-13T09:07:30Z partially duplicated transcript sequence SO:0002244 Added as per request from the Illumina group partially_duplicated_transcript A partially_duplicated_transcript where the 5' end of the transcript is duplicated. david 2020-05-13T09:07:30Z 5' duplicated transcript five prime duplicated transcript five prime partially duplicated transcript sequence SO:0002245 Added as per request from the Illumina group five_prime_duplicated_transcript A partially_duplicated_transcript where the 3' end of the transcript is duplicated. david 2020-05-13T09:07:30Z 3' duplicated transcript three prime duplicated transcript three prime partially duplicated transcript sequence SO:0002246 Added as per request from the Illumina group three_prime_duplicated_transcript A non-coding RNA less than 200 nucleotides in length. david 2020-05-13T11:07:30Z Small noncoding RNA sequence SO:0002247 Added as per request from GitHub Issue #485 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/485) sncRNA A non-coding RNA less than 200 nucleotides in length. PMID:30069443 A region of DNA that is predicted to be translated and transcribed into a protein by a protein detection algorithm that does not get transcribed in nature. david 2020-05-13T11:40:30Z spurious protein sequence SO:0002248 Added as per request from GitHub Issue #478 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/478) spurious_protein A region of DNA that is predicted to be translated and transcribed into a protein by a protein detection algorithm that does not get transcribed in nature. 
PMID:21771858 A CDS region corresponding to a mature protein region of a polypeptide. david 2020-05-13T13:40:30Z INSDC_feature:mat_peptide mature protein region of CDS sequence SO:0002249 Added as per request from GitHub Issue #484 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/484) mature_protein_region_of_CDS A CDS region corresponding to a propeptide of a polypeptide. david 2020-05-13T13:40:30Z INSDC_feature:propeptide propeptide region of CDS sequence SO:0002250 Added as per request from GitHub Issue #484 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/484) propeptide_region_of_CDS A CDS region corresponding to a signal peptide of a polypeptide. david 2020-05-13T13:40:30Z INSDC_feature:sig_peptide Signal peptide region of CDS sequence SO:0002251 Added as per request from GitHub Issue #484 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/484) signal_peptide_region_of_CDS CDS region corresponding to a transit peptide region of a polypeptide. david 2020-05-13T13:40:30Z INSDC_feature:transit_peptide transit peptide region of CDS sequence SO:0002252 Added as per request from GitHub Issue #484 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/484) transit_peptide_region_of_CDS A portion of a stem loop secondary structure in RNA. david 2020-05-13T11:40:30Z stem loop region sequence SO:0002253 Added as per request from GitHub Issue #451 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/451) stem_loop_region The loop portion of a stem loop, which is not folded back upon itself. david 2020-05-13T11:40:30Z loop portion of stem loop sequence SO:0002254 Added as per request from GitHub Issue #451 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/451) loop The portion of a stem loop where the RNA is folded back upon itself. 
david 2020-05-13T11:40:30Z stem portion of stem loop sequence SO:0002255 Added as per request from GitHub Issue #451 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/451) stem A region of a stem in a stem loop structure where the sequences are non-complementary. david 2020-05-13T11:40:30Z non-complimentary stem noncomplimentary stem sequence SO:0002256 Added as per request from GitHub Issue #451 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/451) non_complimentary_stem Cytologically observable heterochromatic regions of chromosomes away from centromeres that contain predominantly large tandem repeats and retrotransposons. david 2020-05-27T10:45:30Z Heterochromatin Knob sequence SO:0002257 Added as per request from GitHub Issue #487 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/487) knob Cytologically observable heterochromatic regions of chromosomes away from centromeres that contain predominantly large tandem repeats and retrotransposons. PMID:6439888 A binding motif with the consensus sequence TTAGGG to which Teb1 binds. david 2020-05-27T11:03:30Z teb1 recognition motif sequence SO:0002258 Requested by Antonia Locke, (Pombe) as per GitHub Issue Request #439 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/439) teb1_recognition_motif A binding motif with the consensus sequence TTAGGG to which Teb1 binds. PMID:23314747 PMID:27901072 A region defined by a cluster of experimentally determined polyadenylation sites, typically less than 25 bp in length and associated with a single polyadenylation signal. david 2020-05-27T14:17:30Z polyA cluster polyA site cluster polyA_cluster sequence SO:0002259 Added as per GitHub Issue Request #450 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/450) polyA_site_cluster A region defined by a cluster of experimentally determined polyadenylation sites, typically less than 25 bp in length and associated with a single polyadenylation signal.
PMID:17202160 PMID:24072873 PMID:25906188 Large Retrotransposon Derivative elements are long-terminal repeats that contain reverse transcriptase priming sites and are conserved in sequence but contain no open reading frames encoding typical retrotransposon proteins. The LARDs identified in barley and other Triticeae have LTRs ~5.5 kb and an internal domain of ~3.5 kb. LARDs lack coding domains and thus do not encode proteins. david 2020-05-27T15:47:30Z Large Retrotransposon Derivative large_retrotransposon_derivative sequence SO:0002260 Added as per GitHub Issue Request #429 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/429) LARD Large Retrotransposon Derivative elements are long-terminal repeats that contain reverse transcriptase priming sites and are conserved in sequence but contain no open reading frames encoding typical retrotransposon proteins. The LARDs identified in barley and other Triticeae have LTRs ~5.5 kb and an internal domain of ~3.5 kb. LARDs lack coding domains and thus do not encode proteins. PMID:15082561 TRIM elements have terminal direct repeat sequences of 100-250 bp in length that flank an internal domain of 100–300 bp. TRIMs lack coding domains and thus do not encode proteins. david 2020-05-27T15:47:30Z terminal-repeat retrotransposons in miniature terminal-repeat_retrotransposons_in_miniature sequence SO:0002261 Added as per GitHub Issue Request #429 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/429) TRIM TRIM elements have terminal direct repeat sequences of 100-250 bp in length that flank an internal domain of 100–300 bp. TRIMs lack coding domains and thus do not encode proteins. PMID:11717436 An absolute reference to the strand. When a chromosome has p and q arms, the Watson strand is the strand whose 5'-end is on the short arm of the chromosome.
Of note, the term 'plus strand' is typically based on a reference sequence where it's preferred for the plus strand to be the Watson strand, but might not be and 'plus strand' is therefore not an exact synonym. david 2020-05-28T10:33:30Z Plus strand Forward strand Top strand Watson strand sequence SO:0002262 Added as per GitHub Issue Request #419 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/419) Watson_strand An absolute reference to the strand. When a chromosome has p and q arms, the Watson strand is the strand whose 5'-end is on the short arm of the chromosome. Of note, the term 'plus strand' is typically based on a reference sequence where it's preferred for the plus strand to be the Watson strand, but might not be and 'plus strand' is therefore not an exact synonym. PMID:21303550 An absolute reference to the strand. When a chromosome has p and q arms, the Crick strand is the strand whose 5'-end is on the long arm of the chromosome. Of note, the term 'minus strand' is typically based on a reference sequence where it's preferred for the minus strand to be the Crick strand, but might not be and 'minus strand' is therefore not an exact synonym. david 2020-05-28T10:33:30Z Minus strand Bottom strand Crick strand Reverse strand sequence SO:0002263 Added as per GitHub Issue Request #419 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/419) Crick_strand An absolute reference to the strand. When a chromosome has p and q arms, the Crick strand is the strand whose 5'-end is on the long arm of the chromosome. Of note, the term 'minus strand' is typically based on a reference sequence where it's preferred for the minus strand to be the Crick strand, but might not be and 'minus strand' is therefore not an exact synonym. PMID:21303550 LTR retrotransposons in the Copia superfamily contain elements coding for specific proteins in this order: GAG, AP, INT, RT, RH. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. 
INT is a DDE integrase. RT is a reverse transcriptase. RH is RNAse H. david 2020-06-25T14:00:30Z Copia LTR retrotransposon RLC retrotransposon Ty1 retrotransposon sequence SO:0002264 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Copia_LTR_retrotransposon LTR retrotransposons in the Copia superfamily contain elements coding for specific proteins in this order: GAG, AP, INT, RT, RH. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNAse H. PMID:17984973 LTR retrotransposons in the Gypsy superfamily contain elements coding for specific proteins in this order: GAG, AP, RT, RH, INT. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNAse H. david 2020-06-25T14:00:30Z Gypsy LTR retrotransposon RLG retrotransposon Ty3 retrotransposon sequence SO:0002265 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Gypsy_LTR_retrotransposon LTR retrotransposons in the Gypsy superfamily contain elements coding for specific proteins in this order: GAG, AP, RT, RH, INT. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNAse H. PMID:17984973 LTR retrotransposons in the Bel-Pao superfamily are similar to LTRs in the Gypsy and Retrovirus superfamilies. Mainly described in metazoan genomes, this superfamily contains elements coding for specific proteins in this order: GAG, AP, RT, RH and INT. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNAse H.
david 2020-06-25T14:00:30Z Bel Pao LTR retrotransposon Bel-Pao LTR retrotransposon RLB retrotransposon sequence SO:0002266 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Bel_Pao_LTR_retrotransposon LTR retrotransposons in the Bel-Pao superfamily are similar to LTRs in the Gypsy and Retrovirus superfamilies. Mainly described in metazoan genomes, this superfamily contains elements coding for specific proteins in this order: GAG, AP, RT, RH and INT. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNAse H. PMID:17984973 LTR retrotransposons in the retrovirus superfamily are similar to LTR retrotransposons in the Gypsy and Bel-Pao superfamilies. Mainly described in vertebrate animals, this superfamily contains elements coding for specific proteins in this order: GAG, AP, RT, RH, INT, and ENV. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNAse H. ENV is envelope protein. david 2020-06-25T14:00:30Z RLR retrotransposon Retrovirus LTR retrotransposon sequence SO:0002267 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Retrovirus_LTR_retrotransposon LTR retrotransposons in the retrovirus superfamily are similar to LTR retrotransposons in the Gypsy and Bel-Pao superfamilies. Mainly described in vertebrate animals, this superfamily contains elements coding for specific proteins in this order: GAG, AP, RT, RH, INT, and ENV. GAG is a structural protein for virus-like particles. AP is aspartic proteinase. INT is a DDE integrase. RT is a reverse transcriptase. RH is RNAse H. ENV is envelope protein. PMID:17984973 Endogenous retrovirus (ERV) retrotransposons are abundant in the genomes of jawed vertebrates.
Human ERVs (HERVs) are classified based on their homologies to animal retroviruses. Class I families are similar in sequence to mammalian Gammaretroviruses (type C) and Epsilonretroviruses (Type E). Class II families show homology to mammalian Betaretroviruses (Type B) and Deltaretroviruses (Type D). F-Class III families are similar to foamy viruses. david 2020-06-25T14:00:30Z Endogenous Retrovirus LTR retrotransposon HERV RLE retrotransposon sequence SO:0002268 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Endogenous_Retrovirus_LTR_retrotransposon Endogenous retrovirus (ERV) retrotransposons are abundant in the genomes of jawed vertebrates. Human ERVs (HERVs) are classified based on their homologies to animal retroviruses. Class I families are similar in sequence to mammalian Gammaretroviruses (type C) and Epsilonretroviruses (Type E). Class II families show homology to mammalian Betaretroviruses (Type B) and Deltaretroviruses (Type D). F-Class III families are similar to foamy viruses. PMID:17984973 R2 retrotransposons are LINE elements (SO:0000194) that insert site-specifically into the host organism's 28S ribosomal RNA (rRNA) genes. david 2020-06-25T14:00:30Z R2 LINE retrotransposon R2 retrotransposon RIR retrotransposon sequence SO:0002269 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) R2_LINE_retrotransposon R2 retrotransposons are LINE elements (SO:0000194) that insert site-specifically into the host organism's 28S ribosomal RNA (rRNA) genes. PMID:21734471 RTE retrotransposons are LINE elements (SO:0000194) that contain a domain with homology to the apurinic-apyrimidic (AP) endonucleases in addition to the previously identified reverse transcriptase domain. 
david 2020-06-25T14:00:30Z RIT retrotransposon RTE LINE retrotransposon RTE retrotransposon sequence SO:0002270 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) RTE_LINE_retrotransposon RTE retrotransposons are LINE elements (SO:0000194) that contain a domain with homology to the apurinic-apyrimidic (AP) endonucleases in addition to the previously identified reverse transcriptase domain. PMID:9729877 Jockey retrotransposons are LINE elements (SO:0000194) found only in arthropods. The full-length element is ~ 5 kb and contains two open reading frames (SO:0000236), ORF1 (568 aa) and ORF2 (916 aa), the second of which encodes an apurinic endonuclease (APE) and a reverse transcriptase (RT). david 2020-06-25T14:00:30Z Jockey LINE retrotransposon LINE Jockey element RIJ retrotransposon sequence SO:0002271 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Jockey_LINE_retrotransposon Jockey retrotransposons are LINE elements (SO:0000194) found only in arthropods. The full-length element is ~ 5 kb and contains two open reading frames (SO:0000236), ORF1 (568 aa) and ORF2 (916 aa), the second of which encodes an apurinic endonuclease (APE) and a reverse transcriptase (RT). PMID:31709017 Long interspersed element-1 (LINE-1) elements are found in the human genome, which contains ORF1 (open reading frame1, including CC, coiled coil; RRM, RNA recognition motif; CTD, carboxyl-terminal domain) and ORF2 (including EN, endonuclease; RT, reverse transcriptase; C, cysteine-rich domain). The L1-encoded proteins (ORF1p and ORF2p) can mobilize nonautonomous retrotransposons, other noncoding RNAs, and messenger RNAs. 
david 2020-06-25T14:00:30Z L1 LINE retrotransposon L1 element LINE 1 element LINE-1 element sequence SO:0002272 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) L1_LINE_retrotransposon Long interspersed element-1 (LINE-1) elements are found in the human genome, which contains ORF1 (open reading frame1, including CC, coiled coil; RRM, RNA recognition motif; CTD, carboxyl-terminal domain) and ORF2 (including EN, endonuclease; RT, reverse transcriptase; C, cysteine-rich domain). The L1-encoded proteins (ORF1p and ORF2p) can mobilize nonautonomous retrotransposons, other noncoding RNAs, and messenger RNAs. PMID:31709017 Elements of the LINE I superfamily are similar to the Jockey and L1 superfamilies. They contain two ORFs, the second of which includes an apurinic endonuclease (APE) and a reverse transcriptase (RT). The I superfamily encodes an RH (RNase H) domain downstream of the RT domain. david 2020-06-25T14:00:30Z https://www.sciencedirect.com/topics/biochemistry-genetics-and-molecular-biology/long-interspersed-nuclear-element I LINE retrotransposon LINE I element RII retrotransposon sequence SO:0002273 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) I_LINE_retrotransposon Short interspersed elements that originated from tRNAs. david 2020-06-25T14:00:30Z RST retrotransposon tRNA SINE element tRNA SINE retrotransposon sequence SO:0002274 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) tRNA_SINE_retrotransposon Short interspersed elements that originated from tRNAs. PMID:21673742 Short interspersed elements that originated from 7SL RNAs.
david 2020-06-25T14:00:30Z 7SL SINE element 7SL SINE retrotransposon RSL retrotransposon sequence SO:0002275 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) 7SL_SINE_retrotransposon Short interspersed elements that originated from 7SL RNAs. PMID:21673742 Short interspersed elements that originated from 5S rRNAs. david 2020-06-25T14:00:30Z 5S SINE element 5S SINE retrotransposon RSS retrotransposon sequence SO:0002276 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) 5S_SINE_retrotransposon Short interspersed elements that originated from 5S rRNAs. PMID:21673742 Crypton is a superfamily of DNA transposons that use tyrosine recombinase (YR) to cut and rejoin the recombining DNA molecules. david 2020-06-25T14:00:30Z Crypton YR transposon Crypton transposon DYC transposon sequence SO:0002277 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Crypton_YR_transposon Crypton is a superfamily of DNA transposons that use tyrosine recombinase (YR) to cut and rejoin the recombining DNA molecules. PMID:22011512 Elements of the Tc1-Mariner terminal inverted repeat transposon superfamily (also called mariner transposons) are named after the Transposon of C. elegans number 1 transposase. Their activity creates a 2-bp (TA) target-site duplication (TSD). Stowaway is the non-autonomous element in this superfamily usually shorter than 600 bp. david 2020-06-25T14:00:30Z DTT transposon Mariner Stowaway Tc1 Mariner TIR transposon Tc1 transposon TcMar-Stowaway transposon sequence SO:0002278 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Tc1_Mariner_TIR_transposon Elements of the Tc1-Mariner terminal inverted repeat transposon superfamily (also called mariner transposons) are named after the Transposon of C. elegans number 1 transposase.
Their activity creates a 2-bp (TA) target-site duplication (TSD). Stowaway is the non-autonomous element in this superfamily usually shorter than 600 bp. PMID:17984973 PMID:8556864 The hAT terminal inverted repeat transposon superfamily elements were first found in maize (the Ac/Ds elements). Members of the hAT superfamily have TSDs of 8 bp, relatively short TIRs of 5–27 bp and overall lengths of less than 4 kb. david 2020-06-25T14:00:30Z Ac transposon Ac/Ds transposon DTA transposon Ds transposon hAT TIR transposon hAT transposon hAT-Ac transposon sequence SO:0002279 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) hAT_TIR_transposon The hAT terminal inverted repeat transposon superfamily elements were first found in maize (the Ac/Ds elements). Members of the hAT superfamily have TSDs of 8 bp, relatively short TIRs of 5–27 bp and overall lengths of less than 4 kb. PMID:11454746 Members of the Mutator family of terminal inverted repeat (TIR) transposon are usually long but are also highly divergent, either sharing only terminal G…C nucleotides, or with the G…C nucleotides absent. The length of the TSD (7-11 bp, usually 9 bp) remains probably the most useful criterion for identification. david 2020-06-25T14:00:30Z DTM transposon MLE transposon MULE Mu transposon MuDR Mutator TIR transposon Mutator transposon sequence SO:0002280 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Mutator_TIR_transposon Members of the Mutator family of terminal inverted repeat (TIR) transposon are usually long but are also highly divergent, either sharing only terminal G…C nucleotides, or with the G…C nucleotides absent. The length of the TSD (7-11 bp, usually 9 bp) remains probably the most useful criterion for identification. PMID:17984973 Terminal inverted repeat transposon superfamily Merlin elements create 8-9 bp target-site duplications (TSD). 
david 2020-06-25T14:00:30Z DTE transposon Merlin TIR transposon Merlin transposon sequence SO:0002281 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Merlin_TIR_transposon Terminal inverted repeat transposon superfamily Merlin elements create 8-9 bp target-site duplications (TSD). PMID:17984973 Terminal inverted repeat (TIR) transposons of the superfamily Transib contain the DDE motif, which is related to the RAG1 protein involved in V(D)J recombination. david 2020-06-25T14:00:30Z DTR transposon Transib TIR transposon transib transposon sequence SO:0002282 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Transib_TIR_transposon Terminal inverted repeat (TIR) transposons of the superfamily Transib contain the DDE motif, which is related to the RAG1 protein involved in V(D)J recombination. PMID:17984973 Primarily found in animals, the terminal inverted repeat (TIR) transposon superfamily piggyBac elements favour insertion adjacent to TTAA. david 2020-06-25T14:00:30Z DTB transposon PiggyBac transposable element piggyBac TIR transposon sequence SO:0002283 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) piggyBac_TIR_transposon Primarily found in animals, the terminal inverted repeat (TIR) transposon superfamily piggyBac elements favour insertion adjacent to TTAA. PMID:17984973 Terminal inverted repeat transposons in the PIF/Harbinger/tourist superfamily create 3-bp target site duplications that are mainly 'TAA' or 'TTA'. The autonomous PIF-Harbinger elements are relatively small in size, usually a few kb in length. Non-autonomous elements in this superfamily usually shorter than 600 bp are referred to as Tourist elements. The terminal sequences for PIF/Harbinger/Tourist elements are 'GGG/CCC…GGC/GCC' or 'GA/GGCA…TGCC/TC'.
david 2020-06-25T14:00:30Z DTH transposon Harbinger transposon PIF Harbinger TIR transposon PIF transposon Tourist transposon element sequence SO:0002284 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) PIF_Harbinger_TIR_transposon Terminal inverted repeat transposons in the PIF/Harbinger/tourist superfamily create 3-bp target site duplications that are mainly 'TAA' or 'TTA'. The autonomous PIF-Harbinger elements are relatively small in size, usually a few kb in length. Non-autonomous elements in this superfamily usually shorter than 600 bp are referred to as Tourist elements. The terminal sequences for PIF/Harbinger/Tourist elements are 'GGG/CCC…GGC/GCC' or 'GA/GGCA…TGCC/TC'. PMID:26709091 Terminal inverted repeat transposons of the CACTA family generate a 3-bp target site duplication (TSD) upon insertion. CACTA elements do not have a significant preference for genic region insertions. This terminal inverted repeat (TIR) transposon superfamily is named CACTA because their terminal sequences are 'CACTA/G…C/TAGTG'. david 2020-06-25T14:00:30Z CACTA TIR transposon CACTA transposon element CACTC transposon CMC-EnSpm transposon DTC transposon En transposon En-Spm transposon EnSpm transposon Spm transposon dSpm transposon sequence SO:0002285 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) CACTA_TIR_transposon Terminal inverted repeat transposons of the CACTA family generate a 3-bp target site duplication (TSD) upon insertion. CACTA elements do not have a significant preference for genic region insertions. This terminal inverted repeat (TIR) transposon superfamily is named CACTA because their terminal sequences are 'CACTA/G…C/TAGTG'. PMID:26709091 Tyrosine recombinase (YR) retrotransposons are a subclass of non-LTR retrotransposons. These YR-encoding elements consist of central gag, pol and tyrosine recombinase (YR) open reading frames (ORFs) flanked by terminal repeats.
The pol ORF includes a reverse transcriptase (RT), an RNase H (RH) and, in the case of DIRS, a domain similar to bacterial and phage DNA N-6-adenine-methyltransferase (MT). Compared to the retroviral pol (LTR retrotransposons, non-LTR retrotransposons and Penelope elements), both aspartic protease and DDE integrase are absent from YR retrotransposons. YR retrotransposons have inverted terminal repeats (ITRs). david 2020-06-25T14:00:30Z YR retrotransposon tyrosine kinase retrotransposon sequence SO:0002286 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) YR_retrotransposon Tyrosine recombinase (YR) retrotransposons are a subclass of non-LTR retrotransposons. These YR-encoding elements consist of central gag, pol and tyrosine recombinase (YR) open reading frames (ORFs) flanked by terminal repeats. The pol ORF includes a reverse transcriptase (RT), an RNase H (RH) and, in the case of DIRS, a domain similar to bacterial and phage DNA N-6-adenine-methyltransferase (MT). Compared to the retroviral pol (LTR retrotransposons, non-LTR retrotransposons and Penelope elements), both aspartic protease and DDE integrase are absent from YR retrotransposons. YR retrotransposons have inverted terminal repeats (ITRs). PMID:24086727 Dictyostelium intermediate repeat sequence (DIRS) retrotransposons are members of the YR_retrotransposon (SO:0002286) superfamily with the following protein domains: RT, RH, YR, and MT. RT is a reverse transcriptase. RH is RNAse H. YR is tyrosine recombinase. MT is DNA N-6-adenine-methyltransferase.
david 2020-06-25T14:00:30Z DIRS YR retrotransposon DIRS retrotransposon RYD retrotransposon sequence SO:0002287 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) DIRS_YR_retrotransposon Dictyostelium intermediate repeat sequence (DIRS) retrotransposons are members of the YR_retrotransposon (SO:0002286) superfamily with the following protein domains: RT, RH, YR, and MT. RT is a reverse transcriptase. RH is RNAse H. YR is tyrosine recombinase. MT is DNA N-6-adenine-methyltransferase. PMID:24086727 Ngaro retrotransposons are members of the YR_retrotransposon (SO:0002286) superfamily with the following protein domains: RT, RH, YR. RT is a reverse transcriptase. RH is RNAse H. YR is Tyrosine recombinase. Inverted terminal repeats (ITRs) in Ngaro are arranged in A-pol-B-A-B order where A and B represent ITRs. david 2020-06-25T14:00:30Z Ngaro YR retrotransposon Ngaro retrotransposon RYN retrotransposon sequence SO:0002288 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Ngaro_YR_retrotransposon Ngaro retrotransposons are members of the YR_retrotransposon (SO:0002286) superfamily with the following protein domains: RT, RH, YR. RT is a reverse transcriptase. RH is RNAse H. YR is Tyrosine recombinase. Inverted terminal repeats (ITRs) in Ngaro are arranged in A-pol-B-A-B order where A and B represent ITRs. PMID:24086727 VIPER retrotransposons are members of the YR_retrotransposon (SO:0002286) superfamily with protein domains: RT, RH, YR. RT is a reverse transcriptase. RH is RNAse H. YR is Tyrosine recombinase. Inverted terminal repeats (ITRs) in VIPER are arranged in A-pol-B-A-B order where A and B represent ITRs. VIPER is only found in kinetoplastida genomes.
david 2020-06-25T14:00:30Z RYV retrotransposon Viper YR retrotransposon Viper retrotransposon sequence SO:0002289 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Viper_YR_retrotransposon VIPER retrotransposons are members of the YR_retrotransposon (SO:0002286) superfamily with protein domains: RT, RH, YR. RT is a reverse transcriptase. RH is RNAse H. YR is Tyrosine recombinase. Inverted terminal repeats (ITRs) in VIPER are arranged in A-pol-B-A-B order where A and B represent ITRs. VIPER is only found in kinetoplastida genomes. PMID:16297462 Penelope is a subclass of non_LTR_retrotransposons (SO:0000189). Penelope retrotransposons contain the structural features TR, RT, EN, TR, where the terminal repeats (TR) can be in tandem or inverse orientation in different Penelope copies. RT is reverse transcriptase. EN is endonuclease. david 2020-06-25T14:00:30Z Penelope retrotransposon RPP retrotransposon sequence SO:0002290 Added as per GitHub Issue Request #488 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/488) Penelope_retrotransposon Penelope is a subclass of non_LTR_retrotransposons (SO:0000189). Penelope retrotransposons contain the structural features TR, RT, EN, TR, where the terminal repeats (TR) can be in tandem or inverse orientation in different Penelope copies. RT is reverse transcriptase. EN is endonuclease. PMID:23914310 A non-coding RNA that is generated by backsplicing of exons or introns, resulting in a covalently closed loop without a 5’ cap or 3’ polyA tail.
david 2020-07-01T11:49:30Z circRNA circular ncRNA noncoding circRNA sequence SO:0002291 Added as per GitHub Issue Request #490 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/490) and GitHub Issue Request #391 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/391) circular_ncRNA A non-coding RNA that is generated by backsplicing of exons or introns, resulting in a covalently closed loop without a 5’ cap or 3’ polyA tail. PMID:29086764 PMID:29182528 PMID:29230098 PMID:29576969 PMID:29626935 An mRNA that is generated by backsplicing of exons or introns, resulting in a covalently closed loop without a 5’ cap or 3’ polyA tail. david 2020-07-01T11:49:30Z circular mRNA coding circRNA sequence SO:0002292 Added as per GitHub Issue Request #490 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/490) and GitHub Issue Request #391 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/391) circular_mRNA An mRNA that is generated by backsplicing of exons or introns, resulting in a covalently closed loop without a 5’ cap or 3’ polyA tail. PMID:29086764 PMID:29182528 PMID:29576969 The non-coding region of the mitochondrial genome that controls RNA and DNA synthesis. david 2020-07-01T16:40:30Z https://en.wikipedia.org/wiki/MtDNA_control_region Mitochondrial A+T region Mitochondrial DNA control region Mitochondrial NCR Mitochondrial noncoding region MtDNA control region MtDNA_control_region sequence SO:0002293 Added as per GitHub Issue Request #417 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/417) from INSDC mitochondrial_control_region The non-coding region of the mitochondrial genome that controls RNA and DNA synthesis. PMID: 19407924 PMID:10968878 https://en.wikipedia.org/wiki/MtDNA_control_region wiki Mitochondrial displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region. 
david 2020-07-08T11:49:30Z http://en.wikipedia.org/wiki/D_loop Mitochondrial D loop Mitochondrial displacement loop sequence SO:0002294 Added as per request by Terence Murphy (INSDC) for GitHub Issue #417 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/417) mitochondrial_D_loop Mitochondrial displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region. http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/D_loop wiki A TF_binding_site that is involved in regulation of expression. david 2020-08-05T11:49:30Z TFRS transcription factor regulatory site sequence SO:0002295 Added as per Mejia-Almonte et al. PMID:32665585 transcription_factor_regulatory_site A TF_binding_site that is involved in regulation of expression. Bacterial_regulation_working_group:CMA PMID:32665585 The possibly discontinuous stretch of DNA that is the combination of one or several TFRSs whose bound TFs work jointly in the regulation of a promoter. david 2020-08-05T11:49:30Z TFRS module TFRS phrase transcription factor regulatory site module transcription factor regulatory site phrase sequence SO:0002296 Added as per Mejia-Almonte et al. PMID:32665585. Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. TFRS_module The possibly discontinuous stretch of DNA that is the combination of one or several TFRSs whose bound TFs work jointly in the regulation of a promoter. Bacterial_regulation_working_group:CMA PMID:32665585 The possibly discontinuous stretch of DNA that encompasses all the TFRSs that regulate a promoter.
david 2020-08-05T11:49:30Z TFRS collection transcription factor regulatory site collection sequence SO:0002297 Added as per Mejia-Almonte et al. PMID:32665585. Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527. TFRS_collection The possible discontinuous stretch of DNA that encompasses all the TFRSs that regulate a promoter. Bacterial_regulation_working_group:CMA PMID:32665585 An operon whose transcription is coordinated on a single transcription unit. david 2020-08-05T11:49:30Z simple operon sequence SO:0002298 Added as per Mejia-Almonte et al. PMID:32665585 simple_operon An operon whose transcription is coordinated on a single transcription unit. Bacterial_regulation_working_group:CMA PMID:32665585 An operon whose transcription is coordinated on several mutually overlapping transcription units transcribed in the same direction and sharing at least one gene. david 2020-08-05T11:49:30Z complex operon sequence SO:0002299 Added as per Mejia-Almonte et al. PMID:32665585 complex_operon An operon whose transcription is coordinated on several mutually overlapping transcription units transcribed in the same direction and sharing at least one gene. Bacterial_regulation_working_group:CMA PMID:32665585 Transcription units or transcribed coding sequences. david 2020-08-05T11:49:30Z unit of gene expression sequence SO:0002300 Added as per Mejia-Almonte et al. PMID:32665585 unit_of_gene_expression Transcription units or transcribed coding sequences. Bacterial_regulation_working_group:CMA PMID:32665585 DNA regions delimited by different nonspurious TSS-TTS pairs.
david 2020-08-05T11:49:30Z transcription unit sequence SO:0002301 Added as per Mejia-Almonte et al. PMID:32665585 transcription_unit DNA regions delimited by different nonspurious TSS-TTS pairs. Bacterial_regulation_working_group:CMA PMID:32665585 A regulon defined by considering one regulatory gene product. david 2020-08-05T11:49:30Z simple regulon sequence SO:0002302 Added as per Mejia-Almonte et al. PMID:32665585 simple_regulon A regulon defined by considering one regulatory gene product. Bacterial_regulation_working_group:CMA PMID:32665585 A regulon defined by considering the units of expression regulated by a specified set of regulatory gene products. david 2020-08-05T11:49:30Z complex regulon sequence SO:0002303 Added as per Mejia-Almonte et al. PMID:32665585 complex_regulon A regulon defined by considering the units of expression regulated by a specified set of regulatory gene products. Bacterial_regulation_working_group:CMA PMID:32665585 An instance of a self-interacting DNA region flanked by left and right TAD boundaries. david 2020-08-12T14:01:30Z TAD topologically associated domain sequence SO:0002304 Added by Dave to be consistent with other ontologies updated with GREEKC initiative. topologically_associated_domain An instance of a self-interacting DNA region flanked by left and right TAD boundaries. GREEKC:cl PMID:32782014 A DNA region enriched in DNA loop anchors and across which DNA loops occur less often than expected by chance. david 2020-08-12T14:01:30Z TAD boundary TAD_boundary topologically associated domain boundary sequence SO:0002305 Added by Dave to be consistent with other ontologies updated with GREEKC initiative. topologically_associated_domain_boundary A DNA region enriched in DNA loop anchors and across which DNA loops occur less often than expected by chance. GREEKC:cl PMID:32782014 A region of a chromosome where regulatory events occur, including epigenetic modifications.
These epigenetic modifications can include nucleosome modifications and post-replicational DNA modifications. david 2020-08-12T14:01:30Z chromatin regulatory region sequence SO:0002306 Added by Dave to be consistent with other ontologies updated with GREEKC initiative. chromatin_regulatory_region A region of a chromosome where regulatory events occur, including epigenetic modifications. These epigenetic modifications can include nucleosome modifications and post-replicational DNA modifications. GREEKC:cl PMID:32782014 A region of DNA between two loop anchor positions that are held in close physical proximity. david 2020-08-12T14:01:30Z DNA loop sequence SO:0002307 Added by Dave to be consistent with other ontologies updated with GREEKC initiative. DS updated definition Feb 16, 2021. See GitHub Issue #534. DNA_loop A region of DNA between two loop anchor positions that are held in close physical proximity. GREEKC:cl PMID:32782014 The ends of a DNA loop where the two strands of DNA are held in close physical proximity. During interphase the anchors of DNA loops are convergently oriented CTCF binding sites. david 2020-08-12T14:01:30Z DNA loop anchor sequence SO:0002308 Added by Dave to be consistent with other ontologies updated with GREEKC initiative. DS updated definition Feb 16, 2021. See GitHub Issue #534. DNA_loop_anchor The ends of a DNA loop where the two strands of DNA are held in close physical proximity. During interphase the anchors of DNA loops are convergently oriented CTCF binding sites. GREEKC:cl PMID:32782014 An element that always exists within the promoter region of a gene. When multiple transcripts exist for a gene, the separate transcripts may have separate core_promoter_elements. david 2020-08-12T14:01:30Z core promoter element sequence SO:0002309 Added by Dave to be consistent with other ontologies updated with GREEKC initiative. core_promoter_element An element that always exists within the promoter region of a gene.
When multiple transcripts exist for a gene, the separate transcripts may have separate core_promoter_elements. GREEKC:rl The promoter of a cryptic gene. david 2020-08-12T14:01:30Z cryptic promoter sequence SO:0002310 Added by Dave to be consistent with other ontologies updated with GREEKC initiative. cryptic_promoter The promoter of a cryptic gene. GREEKC:cl A regulatory_region including the Transcription Start Site (TSS) of a gene found in genes of viruses. david 2020-08-12T14:01:30Z viral promoter sequence SO:0002311 viral_promoter A regulatory_region including the Transcription Start Site (TSS) of a gene found in genes of viruses. GREEKC:cl An element that always exists within the promoter region of a prokaryotic gene. david 2020-08-12T14:01:30Z core prokaryotic promoter element sequence general transcription factor binding site SO:0002312 core_prokaryotic_promoter_element An element that always exists within the promoter region of a prokaryotic gene. GREEKC:rl An element that always exists within the promoter region of a viral gene. david 2020-08-12T14:01:30Z core viral promoter element sequence general transcription factor binding site SO:0002313 core_viral_promoter_element An element that always exists within the promoter region of a viral gene. GREEKC:rl A sequence variant that alters the level or amount of gene product produced. This high level term can be applied where the direction of level change (increased vs decreased gene product level) is unknown or not confirmed. david 2020-12-18T22:35:30Z altered gene product level altered transcription level altered_transcription_level sequence SO:0002314 Added as per request from Ang Roberts as part of GenCC November 2020. See Issue Request #501 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/501). Updated definition 17 Feb 2023 along with updates from GenCC. See Issue Request #612. altered_gene_product_level A sequence variant that alters the level or amount of gene product produced. 
This high level term can be applied where the direction of level change (increased vs decreased gene product level) is unknown or not confirmed. GenCC:AR A variant that increases the level or amount of gene product produced. david 2020-12-18T22:35:30Z increased gene product level increased transcription level increased_transcription_level sequence SO:0002315 Added as per request from Ang Roberts as part of GenCC November 2020. See Issue Request #501 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/501). Updated definition 17 Feb 2023 along with updates from GenCC. See Issue Request #612. increased_gene_product_level A variant that increases the level or amount of gene product produced. GenCC:AR A sequence variant that decreases the level or amount of gene product produced. david 2020-12-18T22:35:30Z decreased gene product level decreased transcription level decreased_transcription_level reduced gene product level reduced transcription level reduced_gene_product_level reduced_transcription_level sequence SO:0002316 Added as per request from Ang Roberts as part of GenCC November 2020. See Issue Request #501 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/501). Updated definition 17 Feb 2023 along with updates from GenCC. See Issue Request #612. decreased_gene_product_level A sequence variant that decreases the level or amount of gene product produced. GenCC:AR A sequence variant that results in no gene product. david 2020-12-18T22:35:30Z absent gene product sequence SO:0002317 Added as per request from Ang Roberts as part of GenCC November 2020. See Issue Request #501 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/501). Updated definition 17 Feb 2023 along with updates from GenCC. See Issue Request #612. absent_gene_product A sequence variant that results in no gene product. GenCC:AR A sequence variant that alters the sequence of a gene product. 
david 2020-12-18T22:35:30Z altered gene product structure sequence SO:0002318 Added as per request from Ang Roberts as part of GenCC November 2020. See Issue Request #501 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/501) altered_gene_product_sequence A sequence variant that alters the sequence of a gene product. GenCC:AR A sequence variant that leads to a change in the location of a termination codon in a transcript that leads to nonsense-mediated decay (NMD). The change in location of a termination codon can be caused by several different types of sequence variants, including stop_gained (SO:0001587), frameshift_variant (SO:0001589), splice_donor_variant (SO:0001575), and splice_acceptor_variant (SO:0001574) types of variants. david 2020-12-30T17:12:30Z NMD triggering variant nonsense-mediated decay triggering variant sequence SO:0002319 Added as per request from Ang Roberts as part of GenCC November 2020. NMD_triggering_variant A sequence variant that leads to a change in the location of a termination codon in a transcript that leads to nonsense-mediated decay (NMD). The change in location of a termination codon can be caused by several different types of sequence variants, including stop_gained (SO:0001587), frameshift_variant (SO:0001589), splice_donor_variant (SO:0001575), and splice_acceptor_variant (SO:0001574) types of variants. GenCC:AR A sequence variant that leads to a change in the location of a termination codon in a transcript but allows the transcript to escape nonsense-mediated decay (NMD). The change in location of a termination codon can be caused by several different types of sequence variants, including stop_gained (SO:0001587), frameshift_variant (SO:0001589), splice_donor_variant (SO:0001575), and splice_acceptor_variant (SO:0001574) types of variants. 
david 2020-12-30T17:12:30Z NMD escaping variant nonsense-mediated decay escaping variant sequence SO:0002320 Added as per request from Ang Roberts as part of GenCC November 2020. NMD_escaping_variant A sequence variant that leads to a change in the location of a termination codon in a transcript but allows the transcript to escape nonsense-mediated decay (NMD). The change in location of a termination codon can be caused by several different types of sequence variants, including stop_gained (SO:0001587), frameshift_variant (SO:0001589), splice_donor_variant (SO:0001575), and splice_acceptor_variant (SO:0001574) types of variants. GenCC:AR A stop_gained (SO:0001587) variant that is degraded by nonsense-mediated decay (NMD). david 2020-12-30T17:12:30Z stop gained variant-nonsense-mediated decay triggering stop gained-NMD triggering sequence SO:0002321 Added as per request from Ang Roberts as part of GenCC November 2020. stop_gained_NMD_triggering A stop_gained (SO:0001587) variant that is degraded by nonsense-mediated decay (NMD). GenCC:AR A stop_gained (SO:0001587) variant that allows the transcript to escape nonsense-mediated decay (NMD). david 2020-12-30T17:12:30Z stop gained variant-nonsense-mediated decay escaping stop gained-NMD escaping sequence SO:0002322 Added as per request from Ang Roberts as part of GenCC November 2020. stop_gained_NMD_escaping A stop_gained (SO:0001587) variant that allows the transcript to escape nonsense-mediated decay (NMD). GenCC:AR A frameshift_variant (SO:0001589) that is degraded by nonsense-mediated decay (NMD). david 2020-12-30T17:12:30Z frameshift variant-NMD triggering frameshift variant-nonsense-mediated decay triggering sequence SO:0002323 Added as per request from Ang Roberts as part of GenCC November 2020. frameshift_variant_NMD_triggering A frameshift_variant (SO:0001589) that is degraded by nonsense-mediated decay (NMD). 
GenCC:AR A frameshift_variant (SO:0001589) that allows the transcript to escape nonsense-mediated decay (NMD). david 2020-12-30T17:12:30Z frameshift variant-NMD escaping frameshift variant-nonsense-mediated decay escaping sequence SO:0002324 Added as per request from Ang Roberts as part of GenCC November 2020. frameshift_variant_NMD_escaping A frameshift_variant (SO:0001589) that allows the transcript to escape nonsense-mediated decay (NMD). GenCC:AR A splice_donor_variant (SO:0001575) that is degraded by nonsense-mediated decay (NMD). david 2020-12-30T17:12:30Z splice donor variant-NMD triggering splice donor variant-nonsense-mediated decay triggering sequence SO:0002325 Added as per request from Ang Roberts as part of GenCC November 2020. splice_donor_variant_NMD_triggering A splice_donor_variant (SO:0001575) that is degraded by nonsense-mediated decay (NMD). GenCC:AR A splice_donor_variant (SO:0001575) that allows the transcript to escape nonsense-mediated decay (NMD). david 2020-12-30T17:12:30Z splice donor variant-NMD escaping splice donor variant-nonsense-mediated decay escaping sequence SO:0002326 Added as per request from Ang Roberts as part of GenCC November 2020. splice_donor_variant_NMD_escaping A splice_donor_variant (SO:0001575) that allows the transcript to escape nonsense-mediated decay (NMD). GenCC:AR A splice_acceptor_variant (SO:0001574) that is degraded by nonsense-mediated decay (NMD). david 2020-12-30T17:12:30Z splice acceptor variant-NMD triggering splice acceptor variant-nonsense-mediated decay triggering sequence SO:0002327 Added as per request from Ang Roberts as part of GenCC November 2020. splice_acceptor_variant_NMD_triggering A splice_acceptor_variant (SO:0001574) that is degraded by nonsense-mediated decay (NMD). GenCC:AR A splice_acceptor_variant (SO:0001574) that allows the transcript to escape nonsense-mediated decay (NMD). 
david 2020-12-30T17:12:30Z splice acceptor variant-NMD escaping splice acceptor variant-nonsense-mediated decay escaping sequence SO:0002328 Added as per request from Ang Roberts as part of GenCC November 2020. splice_acceptor_variant_NMD_escaping A splice_acceptor_variant (SO:0001574) that allows the transcript to escape nonsense-mediated decay (NMD). GenCC:AR The region of mRNA 1 base long that is included as part of two separate codons during the process of translational frameshifting (GO:0006452), causing the reading frame to be different. david 2021-02-03T20:33:30Z minus 1 ribosomal frameshift minus 1 ribosomal slippage minus 1 translational frameshift sequence SO:0002329 Added along with the update to the definition of translational_frameshift SO:0001210 Feb 2021, brought to our attention by Terence Murphy of INSDC. See GitHub Issue #522. minus_1_translational_frameshift The region of mRNA 1 base long that is included as part of two separate codons during the process of translational frameshifting (GO:0006452), causing the reading frame to be different. SO:ds The region of mRNA 2 bases long that is included as part of two separate codons during the process of translational frameshifting (GO:0006452), causing the reading frame to be different. david 2021-02-03T20:33:30Z minus 2 ribosomal frameshift minus 2 ribosomal slippage minus 2 translational frameshift sequence SO:0002330 Added along with the update to the definition of translational_frameshift SO:0001210 Feb 2021, brought to our attention by Terence Murphy of INSDC. See GitHub Issue #522. minus_2_translational_frameshift The region of mRNA 2 bases long that is included as part of two separate codons during the process of translational frameshifting (GO:0006452), causing the reading frame to be different. SO:ds A region of DNA that is depleted of nucleosomes and accessible to DNA-binding proteins including transcription factors and nucleases.
david 2021-02-11T16:41:30Z accessible DNA region sequence SO:0002331 Added as part of GREEKC terms. See GitHub Issues #531 & #534. accessible_DNA_region A region of DNA that is depleted of nucleosomes and accessible to DNA-binding proteins including transcription factors and nucleases. PMID:25903461 SO:ds A biological region implicated in inherited changes caused by mechanisms other than changes in the underlying DNA sequence. david 2021-02-11T21:16:30Z https://epi.grants.cancer.gov/epigen/#:~:text=mail.nih.gov-,Overview,a%20cell%20or%20entire%20organism. epigenomically modified region sequence SO:0002332 Added as part of GREEKC terms to differentiate between inherited and not inherited epigenetic changes. See GitHub Issue #532. epigenomically_modified_region A biological region implicated in inherited changes caused by mechanisms other than changes in the underlying DNA sequence. SO:ds http://en.wikipedia.org/wiki/Epigenetics A stop codon with the DNA sequence TAG. david 2021-04-21T21:16:30Z Amber stop codon sequence SO:0002333 Added as per GitHub request #537. amber_stop_codon A stop codon with the DNA sequence TAG. https://en.wikipedia.org/wiki/Stop_codon A stop codon with the DNA sequence TAA. david 2021-04-21T21:16:30Z Ochre stop codon sequence SO:0002334 Added as per GitHub request #537. ochre_stop_codon A stop codon with the DNA sequence TAA. https://en.wikipedia.org/wiki/Stop_codon A stop codon with the DNA sequence TGA. david 2021-04-21T21:16:30Z Opal stop codon sequence SO:0002335 Added as per GitHub request #537. opal_stop_codon A stop codon with the DNA sequence TGA. https://en.wikipedia.org/wiki/Stop_codon A gene that encodes for 2S ribosomal RNA, which functions as a component of the large subunit of the ribosome in Drosophila and at least some other Diptera. david 2021-04-23T22:59:30Z 2S rRNA gene rRNA 2S gene sequence SO:0002336 Added as a request from FlyBase. See GitHub Issue #507. 
Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). cytosolic_rRNA_2S_gene A gene that encodes for 2S ribosomal RNA, which functions as a component of the large subunit of the ribosome in Drosophila and at least some other Diptera. PMID: 118436 PMID: 29474379 PMID: 3136294 PMID:10788608 PMID:407103 PMID:4847940 PMID:768488 Cytosolic 2S rRNA is a 30 nucleotide RNA component of the large subunit of cytosolic ribosomes in Drosophila and at least some other Diptera. It is homologous to the 3' part of other 5.8S rRNA molecules. The 3' end of the 5.8S molecule is able to base-pair with the 5' end of the 2S rRNA to generate a helical region equivalent in position to the 'GC-rich hairpin' found in all previously sequenced 5.8S molecules. david 2021-04-23T22:59:30Z cytosolic 2S rRNA cytosolic rRNA 2S sequence SO:0002337 Added as a request from FlyBase. See GitHub Issue #507. Renamed from rRNA_2S to cytosolic_2S_rRNA on 27 May 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493. cytosolic_2S_rRNA Cytosolic 2S rRNA is a 30 nucleotide RNA component of the large subunit of cytosolic ribosomes in Drosophila and at least some other Diptera. It is homologous to the 3' part of other 5.8S rRNA molecules. The 3' end of the 5.8S molecule is able to base-pair with the 5' end of the 2S rRNA to generate a helical region equivalent in position to the 'GC-rich hairpin' found in all previously sequenced 5.8S molecules. PMID: 118436 PMID: 29474379 PMID: 3136294 PMID:10788608 PMID:407103 PMID:4847940 PMID:768488 A 57 to 71 nucleotide RNA that is a component of the U7 small nuclear ribonucleoprotein complex (U7 snRNP). The U7 snRNP is required for histone pre-mRNA processing.
david 2021-04-24T16:59:30Z U7 small nuclear RNA U7 snRNA small nuclear RNA U7 snRNA U7 sequence SO:0002338 Added as a request from FlyBase. See GitHub Issue #508 U7_snRNA A 57 to 71 nucleotide RNA that is a component of the U7 small nuclear ribonucleoprotein complex (U7 snRNP). The U7 snRNP is required for histone pre-mRNA processing. PMID:15526162 A gene that encodes for a scaRNA (small Cajal body-specific RNA). david 2021-04-24T16:59:30Z Small Cajal body-specific RNA gene scaRNA gene sequence SO:0002339 Added as a request from FlyBase. See GitHub Issue #510 scaRNA_gene A gene that encodes for a scaRNA (small Cajal body-specific RNA). PMID:27775477 PMID:28869095 An abundant small nuclear RNA that, together with associated cellular proteins, regulates the activity of the positive transcription elongation factor b (P-TEFb). It is often described in literature as similar to a snRNA, except of longer length. david 2021-04-27T14:50:30Z 7SK RNA RNA 7SK sequence SO:0002340 Added as a request from FlyBase. See GitHub Issue #512 RNA_7SK An abundant small nuclear RNA that, together with associated cellular proteins, regulates the activity of the positive transcription elongation factor b (P-TEFb). It is often described in literature as similar to a snRNA, except of longer length. PMID:19246988 PMID:21853533 PMID:27369380 A gene encoding a 7SK RNA (SO:0002340). david 2021-04-27T14:50:30Z 7SK RNA gene RNA 7SK gene sequence SO:0002341 Added as a request from FlyBase. See GitHub Issue #512 RNA_7SK_gene A gene encoding a 7SK RNA (SO:0002340). PMID:19246988 PMID:21853533 PMID:27369380 A ncRNA_gene that encodes an ncRNA less than 200 nucleotides in length. david 2021-04-27T14:50:30Z small non-coding RNA gene sncRNA gene sequence SO:0002342 Added as a request from FlyBase to make the ncRNA_gene branch in SO mirror the ncRNA branch. See GitHub Issue #514 sncRNA_gene A ncRNA_gene that encodes an ncRNA less than 200 nucleotides in length. 
PMID:28449079 PMID:30069443 PMID:30937442 Cytosolic rRNA is an RNA component of the small or large subunits of cytosolic ribosomes. david 2021-06-10T16:45:30Z cytosolic rRNA cytosolic ribosomal RNA sequence SO:0002343 Added as a request from EBI. See GitHub Issue #493 cytosolic_rRNA Cytosolic rRNA is an RNA component of the small or large subunits of cytosolic ribosomes. PMID:3044395 Mitochondrial SSU rRNA is an RNA component of the small subunit of mitochondrial ribosomes. david 2021-06-10T16:45:30Z MT SSU rRNA mitochondrial SSU rRNA mitochondrial small subunit rRNA sequence SO:0002344 Added as a request from EMBL. See GitHub Issue #493 mt_SSU_rRNA Mitochondrial SSU rRNA is an RNA component of the small subunit of mitochondrial ribosomes. PMID: 24572720 PMID:3044395 Mitochondrial LSU rRNA is an RNA component of the large subunit of mitochondrial ribosomes. david 2021-06-10T16:45:30Z MT LSU rRNA mitochondrial LSU rRNA mitochondrial large subunit rRNA sequence SO:0002345 Added as a request from EMBL. See GitHub Issue #493 mt_LSU_rRNA Mitochondrial LSU rRNA is an RNA component of the large subunit of mitochondrial ribosomes. PMID: 24572720 PMID:3044395 Plastid rRNA is an RNA component of the small or large subunits of plastid (such as chloroplast) ribosomes. david 2021-06-10T16:45:30Z plastid rRNA sequence SO:0002346 Added as a request from EMBL. See GitHub Issue #493 plastid_rRNA Plastid rRNA is an RNA component of the small or large subunits of plastid (such as chloroplast) ribosomes. PMID: 24572720 PMID:3044395 Plastid SSU rRNA is an RNA component of the small subunit of plastid (such as chloroplast) ribosomes. david 2021-06-10T16:45:30Z plastid SSU rRNA plastid small subunit rRNA sequence SO:0002347 Added as a request from EMBL. See GitHub Issue #493 plastid_SSU_rRNA Plastid SSU rRNA is an RNA component of the small subunit of plastid (such as chloroplast) ribosomes. 
PMID: 24572720 PMID:3044395 Plastid LSU rRNA is an RNA component of the large subunit of plastid (such as chloroplast) ribosomes. david 2021-06-10T16:45:30Z plastid LSU rRNA plastid large subunit rRNA sequence SO:0002348 Added as a request from EMBL. See GitHub Issue #493 plastid_LSU_rRNA Plastid LSU rRNA is an RNA component of the large subunit of plastid (such as chloroplast) ribosomes. PMID: 24572720 PMID:3044395 A heritable locus on a chromosome that is prone to DNA breakage. evan 2021-09-30T19:29:24Z sequence SO:0002349 See GitHub Issue #301. fragile_site A fragile site considered part of the normal chromosomal structure. evan 2021-09-30T19:33:59Z sequence SO:0002350 See GitHub Issue #301. common_fragile_site A fragile site considered part of the normal chromosomal structure. PMID: 16236432 PMID: 17608616 A fragile site found in the chromosomes of less than five percent of the human population. evan 2021-09-30T19:34:13Z sequence SO:0002351 See GitHub Issue #301. rare_fragile_site A fragile site found in the chromosomes of less than five percent of the human population. PMID:16236432 PMID:17608616 A non-coding RNA typically derived from intronic sequence of the sense strand of a cognate host gene, that is not rapidly degraded. It may contain exonic sequences, 5′ caps, and/or polyA tails. evan 2021-09-30T21:07:18Z Stable intronic sequence RNA stable_intronic_sequence_RNA sequence SO:0002352 See GitHub Issue #515. sisRNA A non-coding RNA typically derived from intronic sequence of the sense strand of a cognate host gene, that is not rapidly degraded. It may contain exonic sequences, 5′ caps, and/or polyA tails. PMID:27147469 PMID:29397203 PMID:30391089 A gene encoding a stem-bulge RNA. evan 2021-09-30T21:25:37Z Stem-bulge RNA gene stem_bulge_RNA_gene sequence SO:0002353 See GitHub Issue #516. sbRNA_gene A gene encoding a stem-bulge RNA. 
PMID:25908866 PMID:30666901 A small non-coding stem-loop RNA present in nematodes and insects, functionally and structurally related to vertebrate Y RNA. evan 2021-09-30T21:29:19Z Stem-bulge RNA stem_bulge_RNA sequence SO:0002354 See GitHub Issue #516. sbRNA A small non-coding stem-loop RNA present in nematodes and insects, functionally and structurally related to vertebrate Y RNA. PMID:25908866 PMID:30666901 A gene encoding a hpRNA. evan 2021-10-07T17:09:18Z Hairpin RNA gene sequence SO:0002355 See GitHub Issue #518. hpRNA_gene A gene encoding a hpRNA. PMID:18463630 PMID:18719707 PMID:25544562 An RNA comprising an extended inverted repeat, the stem of which is typically much longer than that of miRNA precursors and can be up to 400 base pairs in length. hpRNAs are processed by Dicer-2 to generate endogenous short interfering RNAs (siRNAs). evan 2021-10-07T17:35:56Z Hairpin RNA sequence SO:0002356 See GitHub Issue #518. hpRNA An RNA comprising an extended inverted repeat, the stem of which is typically much longer than that of miRNA precursors and can be up to 400 base pairs in length. hpRNAs are processed by Dicer-2 to generate endogenous short interfering RNAs (siRNAs). PMID:18463630 PMID:18719707 PMID:25544562 A physically clustered group of two or more genes in a particular genome that together encode a biosynthetic pathway for the production of a specialized metabolite (including its chemical variants). evan 2021-10-07T18:20:34Z Metabolic gene cluster sequence SO:0002357 See GitHub Issue #558. biosynthetic_gene_cluster A physically clustered group of two or more genes in a particular genome that together encode a biosynthetic pathway for the production of a specialized metabolite (including its chemical variants). PMID:26284661 A gene that encodes a vault RNA. evan 2021-11-11T23:25:13Z sequence SO:0002358 As of 11 November 2021 the HGNC lists 4 genes as RNA, vault. These are HGNC IDs: 12654, 12655, 12656, 37054. vault_RNA_gene A gene that encodes a vault RNA.
PMID:19298825 PMID:19491402 PMID:22058117 PMID:22926522 PMID:30773316 PMID:9535882 A gene that encodes a Y RNA. evan 2021-11-11T23:52:21Z sequence SO:0002359 There are four genes from HGNC that are annotated this way. HGNC IDs: 10242, 10243, 10244, and 10248. Y_RNA_gene A gene that encodes a Y RNA. PMID:1698620 PMID:6187471 PMID:6816230 PMID:7520568 PMID:7539809 PMID:8836182 A gene that codes for cytosolic rRNA. evan 2021-11-19T04:30:40Z sequence SO:0002360 Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). cytosolic_rRNA_gene A gene that codes for cytosolic LSU rRNA. evan 2021-11-19T04:32:20Z cytosolic large subunit rRNA gene sequence SO:0002361 Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). cytosolic_LSU_rRNA_gene A gene that codes for cytosolic SSU rRNA. evan 2021-11-19T04:37:02Z cytosolic small subunit rRNA gene sequence SO:0002362 Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). cytosolic_SSU_rRNA_gene A gene that codes for mitochondrial rRNA. evan 2021-11-19T04:55:58Z mitochondrial rRNA gene sequence SO:0002363 Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). mt_rRNA_gene A gene that codes for mitochondrial LSU rRNA. evan 2021-11-19T04:57:49Z mitochondrial large subunit rRNA gene rRNA 21S gene rRNA_21S_gene sequence SO:0002364 Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513).
Obsoleted term rRNA_21S_gene (SO:0002241) merged into this term on 12 Sept 2022, see GitHub Issue #513. mt_LSU_rRNA_gene A gene that codes for mitochondrial SSU rRNA. evan 2021-11-19T04:58:07Z mitochondrial small subunit rRNA gene sequence SO:0002365 Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). mt_SSU_rRNA_gene A gene that codes for plastid rRNA. evan 2021-11-19T05:00:08Z sequence SO:0002366 Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). plastid_rRNA_gene A gene that codes for plastid LSU rRNA. evan 2021-11-19T05:00:49Z plastid large subunit rRNA gene sequence SO:0002367 Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). plastid_LSU_rRNA_gene A gene that codes for plastid SSU rRNA. evan 2021-11-19T05:01:03Z plastid small subunit rRNA gene sequence SO:0002368 Adjusted hierarchy and names of rRNA_gene terms at the request of Steven Marygold GitHub Issue #513 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/513). plastid_SSU_rRNA_gene A scaRNA possessing a box C/D sequence motif, guiding the methylation of snRNAs. evan 2021-11-19T05:34:44Z C/D scaRNA sequence SO:0002369 Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). C_D_box_scaRNA A scaRNA possessing a box C/D sequence motif, guiding the methylation of snRNAs. PMID:17099227 PMID:24659245 A scaRNA possessing a box H/ACA sequence motif, guiding the pseudouridylation of snRNAs. evan 2021-11-19T05:35:04Z H/ACA scaRNA sequence SO:0002370 Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519).
H_ACA_box_scaRNA A scaRNA possessing a box H/ACA sequence motif, guiding the pseudouridylation of snRNAs. PMID:17099227 PMID:24659245 A scaRNA possessing both box C/D and box H/ACA sequence motifs, guiding both the methylation and pseudouridylation of snRNAs. evan 2021-11-19T05:35:23Z C/D-H/ACA scaRNA sequence SO:0002371 Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). C-D_H_ACA_box_scaRNA A scaRNA possessing both box C/D and box H/ACA sequence motifs, guiding both the methylation and pseudouridylation of snRNAs. PMID:17099227 PMID:24659245 A gene that codes for scaRNA possessing a box C/D sequence motif, guiding the methylation of snRNAs. evan 2021-11-19T05:40:46Z C/D scaRNA gene sequence SO:0002372 Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). C_D_box_scaRNA_gene A gene that codes for scaRNA possessing a box C/D sequence motif, guiding the methylation of snRNAs. PMID:17099227 PMID:24659245 A gene that codes for scaRNA possessing a box H/ACA sequence motif, guiding the pseudouridylation of snRNAs. evan 2021-11-19T05:40:58Z H/ACA scaRNA gene sequence SO:0002373 Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). H_ACA_box_scaRNA_gene A gene that codes for scaRNA possessing a box H/ACA sequence motif, guiding the pseudouridylation of snRNAs. PMID:17099227 PMID:24659245 A gene that codes for scaRNA possessing both box C/D and box H/ACA sequence motifs, guiding both the methylation and pseudouridylation of snRNAs. evan 2021-11-19T05:41:06Z C/D-H/ACA scaRNA gene sequence SO:0002374 Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). 
C-D_H_ACA_box_scaRNA_gene A gene that codes for scaRNA possessing both box C/D and box H/ACA sequence motifs, guiding both the methylation and pseudouridylation of snRNAs. PMID:17099227 PMID:24659245 A gene that codes a C_D_box_snoRNA. Most box C/D snoRNAs also contain long (>10 nt) sequences complementary to rRNA. Boxes C and D, as well as boxes C' and D', are usually located in close proximity, and form a structure known as the box C/D motif. This motif is important for snoRNA stability, processing, nucleolar targeting and function. A small number of box C/D snoRNAs are involved in rRNA processing; most, however, are known or predicted to serve as guide RNAs in ribose methylation of rRNA. Targeting involves direct base pairing of the snoRNA at the rRNA site to be modified and selection of a rRNA nucleotide a fixed distance from box D or D'. evan 2021-11-19T05:49:55Z box C/D snoRNA gene, C D box snoRNA gene, C/D box snoRNA gene sequence SO:0002375 Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). Added citations. See GitHub Issue #565. C_D_box_snoRNA_gene A gene that codes a C_D_box_snoRNA. Most box C/D snoRNAs also contain long (>10 nt) sequences complementary to rRNA. Boxes C and D, as well as boxes C' and D', are usually located in close proximity, and form a structure known as the box C/D motif. This motif is important for snoRNA stability, processing, nucleolar targeting and function. A small number of box C/D snoRNAs are involved in rRNA processing; most, however, are known or predicted to serve as guide RNAs in ribose methylation of rRNA. Targeting involves direct base pairing of the snoRNA at the rRNA site to be modified and selection of a rRNA nucleotide a fixed distance from box D or D'. PMID:12457565 PMID:22065625 A gene that codes for H_ACA_box_snoRNA. 
Members of the box H/ACA family contain an ACA triplet, exactly 3 nt upstream from the 3' end and an H-box in a hinge region that links two structurally similar functional domains of the molecule. Both boxes are important for snoRNA biosynthesis and function. A few box H/ACA snoRNAs are involved in rRNA processing; most others are known or predicted to participate in selection of uridine nucleosides in rRNA to be converted to pseudouridines. Site selection is mediated by direct base pairing of the snoRNA with rRNA through one or both targeting domains. evan 2021-11-19T05:50:14Z box H/ACA snoRNA gene, H ACA box snoRNA gene, H/ACA box snoRNA gene sequence SO:0002376 Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). Added citations. See GitHub Issue #565. H_ACA_box_snoRNA_gene A gene that codes for H_ACA_box_snoRNA. Members of the box H/ACA family contain an ACA triplet, exactly 3 nt upstream from the 3' end and an H-box in a hinge region that links two structurally similar functional domains of the molecule. Both boxes are important for snoRNA biosynthesis and function. A few box H/ACA snoRNAs are involved in rRNA processing; most others are known or predicted to participate in selection of uridine nucleosides in rRNA to be converted to pseudouridines. Site selection is mediated by direct base pairing of the snoRNA with rRNA through one or both targeting domains. PMID:12457565 PMID:22065625 A gene that codes for U14_snoRNA. U14 small nucleolar RNA (U14 snoRNA) is required for early cleavages of eukaryotic precursor rRNAs. In yeasts, this molecule possesses a stem-loop region (known as the Y-domain) which is essential for function. A similar structure, but with a different consensus sequence, is found in plants, but is absent in vertebrates.
evan 2021-11-19T05:50:43Z small nucleolar RNA U14 gene, snoRNA U14 gene, U14 small nucleolar RNA gene, U14 snoRNA gene sequence SO:0002377 Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). U14_snoRNA_gene A gene that codes for U3_snoRNA. U3 snoRNA is a member of the box C/D class of small nucleolar RNAs. The U3 snoRNA secondary structure is characterised by a small 5' domain (with boxes A and A'), and a larger 3' domain (with boxes B, C, C', and D), the two domains being linked by a single-stranded hinge. Boxes B and C form the B/C motif, which appears to be exclusive to U3 snoRNAs, and boxes C' and D form the C'/D motif. The latter is functionally similar to the C/D motifs found in other snoRNAs. The 5' domain and the hinge region act as a pre-rRNA-binding domain. The 3' domain has conserved protein-binding sites. Both the box B/C and box C'/D motifs are sufficient for nuclear retention of U3 snoRNA. The box C'/D motif is also necessary for nucleolar localization, stability and hypermethylation of U3 snoRNA. Both box B/C and C'/D motifs are involved in specific protein interactions and are necessary for the rRNA processing functions of U3 snoRNA. evan 2021-11-19T05:50:57Z small nucleolar RNA U3 gene, snoRNA U3 gene, U3 small nucleolar RNA gene, U3 snoRNA gene sequence SO:0002378 Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). U3_snoRNA_gene A gene that codes for methylation_guide_snoRNA. A snoRNA that specifies the site of 2'-O-ribose methylation in an RNA molecule by base pairing with a short sequence around the target residue. evan 2021-11-19T05:51:12Z methylation guide snoRNA gene sequence SO:0002379 Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). 
methylation_guide_snoRNA_gene A gene that codes for methylation_guide_snoRNA. A snoRNA that specifies the site of 2'-O-ribose methylation in an RNA molecule by base pairing with a short sequence around the target residue. PMID:12457565 A gene that codes for pseudouridylation_guide_snoRNA. A snoRNA that specifies the site of pseudouridylation in an RNA molecule by base pairing with a short sequence around the target residue. evan 2021-11-19T05:51:40Z pseudouridylation guide snoRNA gene sequence SO:0002380 Added at the request of Steven Marygold. See GitHub Issue #519 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/519). pseudouridylation_guide_snoRNA_gene A gene that codes for pseudouridylation_guide_snoRNA. A snoRNA that specifies the site of pseudouridylation in an RNA molecule by base pairing with a short sequence around the target residue. PMID:12457565 A long non-coding RNA which is produced using the promoter of a protein-coding gene but with transcription occurring in the opposite direction. evan 2022-10-28T19:05:35Z bidirectional lncRNA bidirectional promoter lncRNA bidirectional promoter long non-coding RNA bidirectional_lncRNA bidirectional_promoter_long_non-coding_RNA sequence SO:0002381 Created new term "bidirectional_promoter_lncRNA" (SO:0002381). See GitHub Issue #579. bidirectional_promoter_lncRNA A long non-coding RNA which is produced using the promoter of a protein-coding gene but with transcription occurring in the opposite direction. PMID:30175284 PMID:34956340 PMID:26578749 A conserved cis-acting element that confers extreme-distance regulatory activity to an enhancer. 2024-06-06T16:53:56Z evan REX sequence SO:0002382 Added at the request of Chris Mungall (Berkeley Lab) on 6 June 2024. See GitHub Issue #649. range_extender_element A conserved cis-acting element that confers extreme-distance regulatory activity to an enhancer. 
https://doi.org/10.1101/2024.05.26.595809 On its own, a range extender element is not a classical enhancer, but its addition can extend the range of action of a heterologous short-range enhancer by more than 10-fold compared to its native range. In extreme cases, extended ranges may span approximately 840 kilobases of genomic space. The evolutionarily conserved homeodomain motif [C/T]AATA, required for long-range enhancer activity, is present within the range extender element. Range extender elements are distinct from and do not share sequence similarity with other cis-regulatory elements like tethering and remote control elements in Drosophila; or CTCF sites, CpG islands, and enhancer booster elements in mammals. Rather than conferring robustness of remote enhancer activity, a range extender element is both required and sufficient for long-range enhancer-promoter activation. A variant that has been found to be pathogenic in the context of a neoplastic disease. 2024-06-06T20:09:59Z evan sequence SO:0002383 See GitHub Issue #643. oncogenic_variant A variant that has been found to be pathogenic in the context of a neoplastic disease. PMID:35101336 A region of sequence that is involved in the control of a biological process. INSDC_feature:regulatory http://en.wikipedia.org/wiki/Regulatory_region INSDC_qualifier:other regulatory region sequence SO:0005836 regulatory_region A region of sequence that is involved in the control of a biological process. SO:ke http://en.wikipedia.org/wiki/Regulatory_region wiki The primary transcript of an evolutionarily conserved eukaryotic low molecular weight RNA capable of intermolecular hybridization with both homologous and heterologous 18S rRNA. 4.5S snRNA primary transcript U14 snoRNA primary transcript sequence SO:0005837 U14_snoRNA_primary_transcript The primary transcript of an evolutionarily conserved eukaryotic low molecular weight RNA capable of intermolecular hybridization with both homologous and heterologous 18S rRNA. 
PMID:2251119 true A snoRNA that specifies the site of 2'-O-ribose methylation in an RNA molecule by base pairing with a short sequence around the target residue. methylation guide snoRNA sequence SO:0005841 Has RNA 2'-O-ribose methylation guide activity (GO:0030561). methylation_guide_snoRNA A snoRNA that specifies the site of 2'-O-ribose methylation in an RNA molecule by base pairing with a short sequence around the target residue. GOC:mah PMID:12457565 An ncRNA that is part of a ribonucleoprotein that cleaves the primary pre-rRNA transcript in the process of producing mature rRNA molecules. rRNA cleavage RNA sequence SO:0005843 rRNA_cleavage_RNA An ncRNA that is part of a ribonucleoprotein that cleaves the primary pre-rRNA transcript in the process of producing mature rRNA molecules. GOC:kgc An exon that is the only exon in a gene. exon of single exon gene singleton exon sequence single_exon SO:0005845 exon_of_single_exon_gene An exon that is the only exon in a gene. RSC:cb A gene that is a member of a gene cassette, which is a mobile genetic element. cassette array member sequence SO:0005847 cassette_array_member A gene that is a member of a gene cassette, which is a mobile genetic element. gene cassette member sequence SO:0005848 gene_cassette_member A gene that is a member of a group of genes that are either regulated or transcribed together within a larger group of genes that are regulated or transcribed together. gene subarray member sequence SO:0005849 gene_subarray_member Non-covalent primer binding site for initiation of replication, transcription, or reverse transcription. http://en.wikipedia.org/wiki/Primer_binding_site INSDC_feature:primer_bind primer binding site sequence SO:0005850 primer_binding_site Non-covalent primer binding site for initiation of replication, transcription, or reverse transcription. 
http://www.insdc.org/files/feature_table.html http://en.wikipedia.org/wiki/Primer_binding_site wiki An array includes two or more genes, or two or more gene subarrays, contiguously arranged where the individual genes, or subarrays, are either identical in sequence, or essentially so. gene array sequence SO:0005851 This would include, for example, a cluster of genes each encoding the major ribosomal RNAs and a cluster of histone gene subarrays. gene_array An array includes two or more genes, or two or more gene subarrays, contiguously arranged where the individual genes, or subarrays, are either identical in sequence, or essentially so. SO:ma A subarray is, by definition, a member of a gene array (SO:0005851); the members of a subarray may differ substantially in sequence, but are closely related in function. gene subarray sequence SO:0005852 This would include, for example, a cluster of genes encoding different histones. gene_subarray A subarray is, by definition, a member of a gene array (SO:0005851); the members of a subarray may differ substantially in sequence, but are closely related in function. SO:ma A gene that can be substituted for a related gene at a different site in the genome. http://en.wikipedia.org/wiki/Gene_cassette gene cassette sequence SO:0005853 This would include, for example, the mating type gene cassettes of S. cerevisiae. Gene cassettes usually exist as linear sequences as part of a larger DNA molecule, such as a chromosome or plasmid. gene_cassette A gene that can be substituted for a related gene at a different site in the genome. SGD:se http://en.wikipedia.org/wiki/Gene_cassette wiki An array of non-functional genes whose members, when captured by recombination, form functional genes. gene cassette array sequence SO:0005854 This would include, for example, the arrays of non-functional VSG genes of Trypanosomes. gene_cassette_array An array of non-functional genes whose members, when captured by recombination, form functional genes.
SO:ma A collection of related genes. gene group sequence SO:0005855 gene_group A collection of related genes. SO:ma A primary transcript encoding seryl tRNA (SO:000269). selenocysteine tRNA primary transcript sequence SO:0005856 selenocysteine_tRNA_primary_transcript A primary transcript encoding seryl tRNA (SO:000269). SO:ke A tRNA sequence that has a selenocysteine anticodon, and a 3' selenocysteine binding region. selenocysteinyl tRNA selenocysteinyl-transfer RNA selenocysteinyl-transfer ribonucleic acid sequence SO:0005857 selenocysteinyl_tRNA A tRNA sequence that has a selenocysteine anticodon, and a 3' selenocysteine binding region. SO:ke A region in which two or more pairs of homologous markers occur on the same chromosome in two or more species. syntenic region sequence SO:0005858 syntenic_region A region in which two or more pairs of homologous markers occur on the same chromosome in two or more species. http://www.informatics.jax.org/silverbook/glossary.shtml A region of a peptide that is involved in a biochemical function. biochemical motif biochemical region of peptide sequence biochemical_region SO:0100001 Range. biochemical_region_of_peptide A region of a peptide that is involved in a biochemical function. EBIBS:GAR A region that is involved in contact with another molecule. sequence molecular contact region SO:0100002 Range. molecular_contact_region A region that is involved in contact with another molecule. EBIBS:GAR A region of polypeptide chain with high conformational flexibility. intrinsically unstructured polypeptide region sequence disordered region SO:0100003 intrinsically_unstructured_polypeptide_region A region of polypeptide chain with high conformational flexibility. EBIBS:GAR disordered region A motif of 3 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -75 bounds -100 to -50, res i+1: psi 140 bounds 110 to 170.
An extra restriction of the length of the O to O distance would be useful, that it be less than 5 Angstrom. More precisely these two oxygens are the main chain carbonyl oxygen atoms of residues i-1 and i+1. catmat-3l sequence SO:0100004 catmat_left_handed_three A motif of 3 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -75 bounds -100 to -50, res i+1: psi 140 bounds 110 to 170. An extra restriction of the length of the O to O distance would be useful, that it be less than 5 Angstrom. More precisely these two oxygens are the main chain carbonyl oxygen atoms of residues i-1 and i+1. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of 4 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i psi -10 bounds -50 to 30, res i+1: phi -90 bounds -120 to -60, res i+1: psi -10 bounds -50 to 30, res i+2: phi -75 bounds -100 to -50, res i+2: psi 140 bounds 110 to 170. The extra restriction of the length of the O to O distance is similar, that it be less than 5 Angstrom. In this case these two Oxygen atoms are the main chain carbonyl oxygen atoms of residues i-1 and i+2. catmat-4l sequence SO:0100005 catmat_left_handed_four A motif of 4 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i psi -10 bounds -50 to 30, res i+1: phi -90 bounds -120 to -60, res i+1: psi -10 bounds -50 to 30, res i+2: phi -75 bounds -100 to -50, res i+2: psi 140 bounds 110 to 170. The extra restriction of the length of the O to O distance is similar, that it be less than 5 Angstrom. In this case these two Oxygen atoms are the main chain carbonyl oxygen atoms of residues i-1 and i+2. 
EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of 3 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -75 bounds -100 to -50, res i+1: psi 140 bounds 110 to 170. An extra restriction of the length of the O to O distance would be useful, that it be less than 5 Angstrom. More precisely these two oxygens are the main chain carbonyl oxygen atoms of residues i-1 and i+1. catmat-3r sequence SO:0100006 catmat_right_handed_three A motif of 3 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -75 bounds -100 to -50, res i+1: psi 140 bounds 110 to 170. An extra restriction of the length of the O to O distance would be useful, that it be less than 5 Angstrom. More precisely these two oxygens are the main chain carbonyl oxygen atoms of residues i-1 and i+1. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of 4 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -90 bounds -120 to -60, res i+1: psi -10 bounds -50 to 30, res i+2: phi -75 bounds -100 to -50, res i+2: psi 140 bounds 110 to 170. The extra restriction of the length of the O to O distance is similar, that it be less than 5 Angstrom. In this case these two Oxygen atoms are the main chain carbonyl oxygen atoms of residues i-1 and i+2. catmat-4r sequence SO:0100007 catmat_right_handed_four A motif of 4 consecutive residues with dihedral angles as follows: res i: phi -90 bounds -120 to -60, res i: psi -10 bounds -50 to 30, res i+1: phi -90 bounds -120 to -60, res i+1: psi -10 bounds -50 to 30, res i+2: phi -75 bounds -100 to -50, res i+2: psi 140 bounds 110 to 170. The extra restriction of the length of the O to O distance is similar, that it be less than 5 Angstrom. 
In this case these two Oxygen atoms are the main chain carbonyl oxygen atoms of residues i-1 and i+2. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A motif of five consecutive residues and two H-bonds in which: H-bond between CO of residue(i) and NH of residue(i+4), H-bond between CO of residue(i) and NH of residue(i+3), Phi angles of residues(i+1), (i+2) and (i+3) are negative. alpha beta motif sequence SO:0100008 alpha_beta_motif A motif of five consecutive residues and two H-bonds in which: H-bond between CO of residue(i) and NH of residue(i+4), H-bond between CO of residue(i) and NH of residue(i+3), Phi angles of residues(i+1), (i+2) and (i+3) are negative. EBIBS:GAR http://www.ebi.ac.uk/msd-srv/msdmotif/ A peptide that acts as a signal for both membrane translocation and lipid attachment in prokaryotes. lipoprotein signal peptide prokaryotic membrane lipoprotein lipid attachment site sequence SO:0100009 lipoprotein_signal_peptide A peptide that acts as a signal for both membrane translocation and lipid attachment in prokaryotes. EBIBS:GAR An experimental region where an analysis has been run and not produced any annotation. no output sequence SO:0100010 no_output An experimental region where an analysis has been run and not produced any annotation. EBIBS:GAR no output The cleaved_peptide_region is the region of a peptide sequence that is cleaved during maturation. cleaved peptide region sequence SO:0100011 Range. cleaved_peptide_region The cleaved_peptide_region is the region of a peptide sequence that is cleaved during maturation. EBIBS:GAR Irregular, unstructured regions of a protein's backbone, as distinct from the regular region (namely alpha helix and beta strand - characterised by specific patterns of main-chain hydrogen bonds).
peptide coil sequence coil random coil SO:0100012 peptide_coil Irregular, unstructured regions of a protein's backbone, as distinct from the regular region (namely alpha helix and beta strand - characterised by specific patterns of main-chain hydrogen bonds). EBIBS:GAR coil random coil Hydrophobic regions are regions with a low affinity for water. hydrophobic_region sequence hydropathic hydrophobic region of peptide hydrophobicity SO:0100013 Range. hydrophobic_region_of_peptide Hydrophobic regions are regions with a low affinity for water. EBIBS:GAR The amino-terminal positively-charged region of a signal peptide (approx 1-5 aa). sequence N-region SO:0100014 n_terminal_region The amino-terminal positively-charged region of a signal peptide (approx 1-5 aa). EBIBS:GAR The more polar, carboxy-terminal region of the signal peptide (approx 3-7 aa). sequence C-region SO:0100015 c_terminal_region The more polar, carboxy-terminal region of the signal peptide (approx 3-7 aa). EBIBS:GAR The central, hydrophobic region of the signal peptide (approx 7-15 aa). central hydrophobic region of signal peptide sequence H-region central_hydrophobic_region SO:0100016 central_hydrophobic_region_of_signal_peptide The central, hydrophobic region of the signal peptide (approx 7-15 aa). EBIBS:GAR A conserved motif is a short (up to 20 amino acids) region of biological interest that is conserved in different proteins. They may or may not have functional or structural significance within the proteins in which they are found. sequence motif SO:0100017 polypeptide_conserved_motif A conserved motif is a short (up to 20 amino acids) region of biological interest that is conserved in different proteins. They may or may not have functional or structural significance within the proteins in which they are found. 
EBIBS:GAR A polypeptide binding motif is a short (up to 20 amino acids) polypeptide region of biological interest that contains one or more amino acids experimentally shown to bind to a ligand. polypeptide binding motif sequence binding SO:0100018 polypeptide_binding_motif A polypeptide binding motif is a short (up to 20 amino acids) polypeptide region of biological interest that contains one or more amino acids experimentally shown to bind to a ligand. EBIBS:GAR binding uniprot:feature_type A polypeptide catalytic motif is a short (up to 20 amino acids) polypeptide region that contains one or more active site residues. polypeptide catalytic motif sequence catalytic_motif SO:0100019 polypeptide_catalytic_motif A polypeptide catalytic motif is a short (up to 20 amino acids) polypeptide region that contains one or more active site residues. EBIBS:GAR A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with DNA. DNA_bind polypeptide DNA contact sequence SO:0100020 polypeptide_DNA_contact A binding site that, in the polypeptide molecule, interacts selectively and non-covalently with DNA. EBIBS:GAR SO:ke DNA_bind uniprot:feature A subsection of sequence with biological interest that is conserved in different proteins. They may or may not have functional or structural significance within the proteins in which they are found. polypeptide conserved region sequence SO:0100021 polypeptide_conserved_region A subsection of sequence with biological interest that is conserved in different proteins. They may or may not have functional or structural significance within the proteins in which they are found. EBIBS:GAR true A sequence alteration where the length of the change in the variant is the same as that of the reference. loinc:LA6690-7 sequence SO:1000002 substitution A sequence alteration where the length of the change in the variant is the same as that of the reference. 
SO:ke loinc:LA6690-7 Substitution true When no simple or well defined DNA mutation event describes the observed DNA change, the keyword "complex" should be used. Usually there are multiple equally plausible explanations for the change. complex substitution sequence SO:1000005 complex_substitution When no simple or well defined DNA mutation event describes the observed DNA change, the keyword "complex" should be used. Usually there are multiple equally plausible explanations for the change. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html true A single nucleotide change which has occurred at the same position of a corresponding nucleotide in a reference sequence. http://en.wikipedia.org/wiki/Point_mutation point mutation sequence SO:1000008 point_mutation A single nucleotide change which has occurred at the same position of a corresponding nucleotide in a reference sequence. SO:immuno_workshop http://en.wikipedia.org/wiki/Point_mutation wiki Change of a pyrimidine nucleotide, C or T, into another pyrimidine nucleotide, or change of a purine nucleotide, A or G, into another purine nucleotide. sequence SO:1000009 transition Change of a pyrimidine nucleotide, C or T, into another pyrimidine nucleotide, or change of a purine nucleotide, A or G, into another purine nucleotide. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html A substitution of a pyrimidine, C or T, for another pyrimidine. pyrimidine transition sequence SO:1000010 pyrimidine_transition A substitution of a pyrimidine, C or T, for another pyrimidine. SO:ke A transition of a cytidine to a thymine. C to T transition sequence SO:1000011 C_to_T_transition A transition of a cytidine to a thymine. SO:ke The transition of cytidine to thymine occurring at a pCpG site as a consequence of the spontaneous deamination of 5'-methylcytidine.
C to T transition at pCpG site sequence SO:1000012 C_to_T_transition_at_pCpG_site The transition of cytidine to thymine occurring at a pCpG site as a consequence of the spontaneous deamination of 5'-methylcytidine. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html A transition of a thymine to a cytidine. T to C transition sequence SO:1000013 T_to_C_transition A substitution of a purine, A or G, for another purine. purine transition sequence SO:1000014 purine_transition A substitution of a purine, A or G, for another purine. SO:ke A transition of an adenine to a guanine. A to G transition sequence SO:1000015 A_to_G_transition A transition of an adenine to a guanine. SO:ke A transition of a guanine to an adenine. G to A transition sequence SO:1000016 G_to_A_transition A transition of a guanine to an adenine. SO:ke Change of a pyrimidine nucleotide, C or T, into a purine nucleotide, A or G, or vice versa. http://en.wikipedia.org/wiki/Transversion sequence SO:1000017 transversion Change of a pyrimidine nucleotide, C or T, into a purine nucleotide, A or G, or vice versa. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html http://en.wikipedia.org/wiki/Transversion wiki Change of a pyrimidine nucleotide, C or T, into a purine nucleotide, A or G. pyrimidine to purine transversion sequence SO:1000018 pyrimidine_to_purine_transversion Change of a pyrimidine nucleotide, C or T, into a purine nucleotide, A or G. SO:ke A transversion from cytidine to adenine. C to A transversion sequence SO:1000019 C_to_A_transversion A transversion from cytidine to adenine. SO:ke A transversion of a cytidine to a guanine. C to G transversion sequence SO:1000020 C_to_G_transversion A transversion from T to A. T to A transversion sequence SO:1000021 T_to_A_transversion A transversion from T to A. SO:ke A transversion from T to G. T to G transversion sequence SO:1000022 T_to_G_transversion A transversion from T to G. 
SO:ke Change of a purine nucleotide, A or G, into a pyrimidine nucleotide, C or T. purine to pyrimidine transversion sequence SO:1000023 purine_to_pyrimidine_transversion Change of a purine nucleotide, A or G, into a pyrimidine nucleotide, C or T. SO:ke A transversion from adenine to cytidine. A to C transversion sequence SO:1000024 A_to_C_transversion A transversion from adenine to cytidine. SO:ke A transversion from adenine to thymine. A to T transversion sequence SO:1000025 A_to_T_transversion A transversion from adenine to thymine. SO:ke A transversion from guanine to cytidine. G to C transversion sequence SO:1000026 G_to_C_transversion A transversion from guanine to cytidine. SO:ke A transversion from guanine to thymine. G to T transversion sequence SO:1000027 G_to_T_transversion A transversion from guanine to thymine. SO:ke A chromosomal structure variation within a single chromosome. intrachromosomal mutation sequence SO:1000028 intrachromosomal_mutation A chromosomal structure variation within a single chromosome. SO:ke An incomplete chromosome. http://en.wikipedia.org/wiki/Chromosomal_deletion chromosomal deletion deficiency sequence (Drosophila)Df (bacteria)Δ (fungi)D SO:1000029 chromosomal_deletion An incomplete chromosome. SO:ke http://en.wikipedia.org/wiki/Chromosomal_deletion wiki An interchromosomal mutation where a region of the chromosome is inverted with respect to wild type. http://en.wikipedia.org/wiki/Chromosomal_inversion chromosomal inversion sequence (Drosophila)In (bacteria)IN (fungi)In SO:1000030 chromosomal_inversion An interchromosomal mutation where a region of the chromosome is inverted with respect to wild type. SO:ke http://en.wikipedia.org/wiki/Chromosomal_inversion wiki A chromosomal structure variation whereby more than one chromosome is involved. interchromosomal mutation sequence SO:1000031 interchromosomal_mutation A chromosomal structure variation whereby more than one chromosome is involved.
SO:ke A sequence alteration which included an insertion and a deletion, affecting 2 or more bases. http://en.wikipedia.org/wiki/Indel loinc:LA9659-9 deletion-insertion indel sequence SO:1000032 Indels can have a different number of bases than the corresponding reference sequence. The term name was changed from indel to delins on 2/24/2019 to align with the HGVS nomenclature term for a deletion-insertion. Indel was causing confusion in the annotation community (github issue 445). The HGVS nomenclature definition of deletion-insertion (delins) is a sequence change where, compared to a reference sequence, one or more nucleotides are replaced by one or more other nucleotides and which is not a substitution, inversion or conversion. delins A sequence alteration which included an insertion and a deletion, affecting 2 or more bases. http://varnomen.hgvs.org/recommendations/DNA/variant/delins/ http://en.wikipedia.org/wiki/Indel wiki loinc:LA9659-9 Insertion and Deletion true true An insertion which derives from, or is identical in sequence to, nucleotides present at a known location in the genome. loinc:LA6686-5 nucleotide duplication sequence nucleotide_duplication SO:1000035 duplication An insertion which derives from, or is identical in sequence to, nucleotides present at a known location in the genome. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html NCBI:th loinc:LA6686-5 Duplication A continuous nucleotide sequence is inverted in the same position. loinc:LA6689-9 inversion sequence SO:1000036 inversion A continuous nucleotide sequence is inverted in the same position. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html loinc:LA6689-9 Inversion inversion http://www.ncbi.nlm.nih.gov/dbvar/ An extra chromosome. http://en.wikipedia.org/wiki/Chromosomal_duplication chromosomal duplication sequence (Drosophila)Dp (fungi)Dp SO:1000037 chromosomal_duplication An extra chromosome. 
SO:ke http://en.wikipedia.org/wiki/Chromosomal_duplication wiki A duplication that occurred within a chromosome. intrachromosomal duplication sequence SO:1000038 intrachromosomal_duplication A duplication that occurred within a chromosome. SO:ke A tandem duplication where the individual regions are in the same orientation. direct tandem duplication sequence SO:1000039 direct_tandem_duplication A tandem duplication where the individual regions are in the same orientation. SO:ke A tandem duplication where the individual regions are not in the same orientation. inverted tandem duplication sequence mirror duplication SO:1000040 inverted_tandem_duplication A tandem duplication where the individual regions are not in the same orientation. SO:ke A chromosome structure variation whereby a transposition occurred within a chromosome. intrachromosomal transposition sequence (Drosophila)Tp SO:1000041 intrachromosomal_transposition A chromosome structure variation whereby a transposition occurred within a chromosome. SO:ke A chromosome structure variant where a monocentric element is caused by the fusion of two chromosome arms. compound chromosome sequence SO:1000042 compound_chromosome A chromosome structure variant where a monocentric element is caused by the fusion of two chromosome arms. SO:ke A non reciprocal translocation whereby the participating chromosomes break at their centromeres and the long arms fuse to form a single chromosome with a single centromere. http://en.wikipedia.org/wiki/Robertsonian_fusion Robertsonian fusion centric-fusion translocations whole-arm translocations sequence SO:1000043 Robertsonian_fusion A non reciprocal translocation whereby the participating chromosomes break at their centromeres and the long arms fuse to form a single chromosome with a single centromere. http://en.wikipedia.org/wiki/Robertsonian_translocation http://en.wikipedia.org/wiki/Robertsonian_fusion wiki A chromosomal mutation. 
Rearrangements that alter the pairing of telomeres are classified as translocations. http://en.wikipedia.org/wiki/Chromosomal_translocation chromosomal translocation sequence (Drosophila)T (fungi)T SO:1000044 chromosomal_translocation A chromosomal mutation. Rearrangements that alter the pairing of telomeres are classified as translocations. FB:reference_manual http://en.wikipedia.org/wiki/Chromosomal_translocation wiki A ring chromosome is a chromosome whose arms have fused together to form a ring, often with the loss of the ends of the chromosome. http://en.wikipedia.org/wiki/Ring_chromosome ring chromosome sequence (Drosophila)R (fungi)C SO:1000045 ring_chromosome A ring chromosome is a chromosome whose arms have fused together to form a ring, often with the loss of the ends of the chromosome. http://en.wikipedia.org/wiki/Ring_chromosome http://en.wikipedia.org/wiki/Ring_chromosome wiki A chromosomal inversion that includes the centromere. pericentric inversion sequence SO:1000046 pericentric_inversion A chromosomal inversion that includes the centromere. FB:reference_manual A chromosomal inversion that does not include the centromere. paracentric inversion sequence SO:1000047 paracentric_inversion A chromosomal inversion that does not include the centromere. FB:reference_manual A chromosomal translocation with two breaks; two chromosome segments have simply been exchanged. reciprocal chromosomal translocation sequence SO:1000048 reciprocal_chromosomal_translocation A chromosomal translocation with two breaks; two chromosome segments have simply been exchanged. FB:reference_manual Any change in mature, spliced and processed, RNA that results from a change in the corresponding DNA sequence. 
SO:0001576 SO:1000177 SO:1000179 mutation affecting transcript sequence variant causing partially characterised change in transcript sequence variant causing uncharacterised change in transcript sequence variation affecting transcript sequence_variant_causing_partially_characterised_change_in_transcript sequence_variant_causing_uncharacterised_change_in_transcript sequence mutation causing partially characterised change in transcript mutation causing uncharacterised change in transcript SO:1000049 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variation_affecting_transcript true Any change in mature, spliced and processed, RNA that results from a change in the corresponding DNA sequence. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html No effect on the state of the RNA. sequence variant causing no change in transcript sequence mutation causing no change in transcript SO:1000050 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. Also, as there is no change, it is not a good ontological term. sequence_variant_causing_no_change_in_transcript true No effect on the state of the RNA. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html Any of the amino acid coding triplets of a gene are affected by the DNA mutation. SO:0001580 mutation affecting coding sequence sequence sequence variation affecting coding sequence SO:1000054 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variation_affecting_coding_sequence true Any of the amino acid coding triplets of a gene are affected by the DNA mutation. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The DNA mutation changes, usually destroys, the first coding triplet of a gene. Usually prevents translation although another initiator codon may be used. 
SO:0001582 sequence variant causing initiator codon change in transcript sequence mutation causing initiator codon change in transcript SO:1000055 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_initiator_codon_change_in_transcript true The DNA mutation changes, usually destroys, the first coding triplet of a gene. Usually prevents translation although another initiator codon may be used. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The DNA mutation affects the amino acid coding sequence of a gene; this region includes both the initiator and terminator codons. SO:0001606 sequence variant causing amino acid coding codon change in transcript sequence mutation causing amino acid coding codon change in transcript SO:1000056 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_amino_acid_coding_codon_change_in_transcript true The DNA mutation affects the amino acid coding sequence of a gene; this region includes both the initiator and terminator codons. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The changed codon has the same translation product as the original codon. SO:0001819 sequence variant causing synonymous codon change in transcript sequence mutation causing synonymous codon change in transcript SO:1000057 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_synonymous_codon_change_in_transcript true The changed codon has the same translation product as the original codon. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html A DNA point mutation that causes a substitution of an amino acid by another. 
SO:0001583 non-synonymous codon change in transcript sequence variant causing non synonymous codon change in transcript sequence mutation causing non synonymous codon change in transcript SO:1000058 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_non_synonymous_codon_change_in_transcript true A DNA point mutation that causes a substitution of an amino acid by another. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The nucleotide change in the codon leads to a new codon coding for a new amino acid. SO:0001583 sequence variant causing missense codon change in transcript sequence mutation causing missense codon change in transcript SO:1000059 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_missense_codon_change_in_transcript true The nucleotide change in the codon leads to a new codon coding for a new amino acid. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The amino acid change following from the codon change does not change the gross properties (size, charge, hydrophobicity) of the amino acid at that position. SO:0001585 sequence variant causing conservative missense codon change in transcript sequence mutation causing conservative missense codon change in transcript SO:1000060 The exact rules need to be stated, a common set of rules can be derived from e.g. BLOSUM62 amino acid distance matrix. sequence_variant_causing_conservative_missense_codon_change_in_transcript true The amino acid change following from the codon change does not change the gross properties (size, charge, hydrophobicity) of the amino acid at that position. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The amino acid change following from the codon change changes the gross properties (size, charge, hydrophobicity) of the amino acid in that position. 
SO:0001586 sequence variant causing nonconservative missense codon change in transcript sequence mutation causing nonconservative missense codon change in transcript SO:1000061 The exact rules need to be stated, a common set of rules can be derived from e.g. BLOSUM62 amino acid distance matrix. sequence_variant_causing_nonconservative_missense_codon_change_in_transcript true The amino acid change following from the codon change changes the gross properties (size, charge, hydrophobicity) of the amino acid in that position. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The nucleotide change in the codon triplet creates a terminator codon. SO:0001587 sequence variant causing nonsense codon change in transcript sequence mutation causing nonsense codon change in transcript SO:1000062 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_nonsense_codon_change_in_transcript true The nucleotide change in the codon triplet creates a terminator codon. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The nucleotide change in the codon triplet changes the stop codon, causing an elongated transcript sequence. SO:0001590 sequence variant causing terminator codon change in transcript sequence mutation causing terminator codon change in transcript SO:1000063 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_terminator_codon_change_in_transcript true The nucleotide change in the codon triplet changes the stop codon, causing an elongated transcript sequence. SO:ke An umbrella term for terms describing an effect of a sequence variation on the frame of translation. mutation affecting reading frame sequence sequence variation affecting reading frame SO:1000064 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. 
sequence_variation_affecting_reading_frame true An umbrella term for terms describing an effect of a sequence variation on the frame of translation. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html A mutation causing a disruption of the translational reading frame, because the number of nucleotides inserted or deleted is not a multiple of three. http://en.wikipedia.org/wiki/Frameshift_mutation frameshift mutation sequence frameshift sequence variation out of frame mutation SO:1000065 frameshift_sequence_variation true A mutation causing a disruption of the translational reading frame, because the number of nucleotides inserted or deleted is not a multiple of three. SO:ke http://en.wikipedia.org/wiki/Frameshift_mutation wiki A mutation causing a disruption of the translational reading frame, due to the insertion of a nucleotide. SO:0001594 plus 1 frameshift mutation sequence variant causing plus 1 frameshift mutation sequence SO:1000066 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_plus_1_frameshift_mutation true A mutation causing a disruption of the translational reading frame, due to the insertion of a nucleotide. SO:ke A mutation causing a disruption of the translational reading frame, due to the deletion of a nucleotide. SO:0001592 minus 1 frameshift mutation sequence variant causing minus 1 frameshift sequence SO:1000067 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_minus_1_frameshift true A mutation causing a disruption of the translational reading frame, due to the deletion of a nucleotide. SO:ke A mutation causing a disruption of the translational reading frame, due to the insertion of two nucleotides. SO:0001595 plus 2 frameshift mutation sequence variant causing plus 2 frameshift sequence SO:1000068 OBSOLETE: This term was deleted as it conflated more than one term. 
The alteration is separate from the effect. sequence_variant_causing_plus_2_frameshift true A mutation causing a disruption of the translational reading frame, due to the insertion of two nucleotides. SO:ke A mutation causing a disruption of the translational reading frame, due to the deletion of two nucleotides. SO:0001593 minus 2 frameshift mutation sequence variant causing minus 2 frameshift sequence SO:1000069 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_minus_2_frameshift true A mutation causing a disruption of the translational reading frame, due to the deletion of two nucleotides. SO:ke Sequence variant affects the way in which the primary transcriptional product is processed to form the mature transcript. SO:0001543 sequence variant affecting transcript processing sequence mutation affecting transcript processing SO:1000070 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_transcript_processing true Sequence variant affects the way in which the primary transcriptional product is processed to form the mature transcript. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html A sequence_variant_effect where the way in which the primary transcriptional product is processed to form the mature transcript, specifically by the removal (splicing) of intron sequences is changed. SO:0001568 sequence variant affecting splicing sequence mutation affecting splicing SO:1000071 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_splicing true A sequence_variant_effect where the way in which the primary transcriptional product is processed to form the mature transcript, specifically by the removal (splicing) of intron sequences is changed. 
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html A sequence_variant_effect that changes the splice donor sequence. SO:0001575 splice donor mutation sequence mutation affecting splice donor sequence variant affecting splice donor SO:1000072 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_splice_donor true A sequence_variant_effect that changes the splice donor sequence. SO:ke A sequence_variant_effect that changes the splice acceptor sequence. SO:0001574 splice acceptor mutation sequence mutation affecting splicing sequence variant affecting splice acceptor SO:1000073 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_splice_acceptor true A sequence_variant_effect that changes the splice acceptor sequence. SO:ke A sequence variant causing a new (functional) splice site. SO:0001569 cryptic splice activator sequence variant sequence variant causing cryptic splice activator sequence mutation causing cryptic splice activator SO:1000074 A cryptic splice site is only used when the natural splice site has been disrupted by a sequence alteration. sequence_variant_causing_cryptic_splice_activation true A sequence variant causing a new (functional) splice site. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html Sequence variant affects the editing of the transcript. SO:0001544 sequence variant affecting editing sequence mutation affecting editing SO:1000075 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_editing true Sequence variant affects the editing of the transcript. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html Mutation affects the process of transcription, its initiation, progression or termination. 
SO:0001549 sequence variant affecting transcription sequence mutation affecting transcription SO:1000076 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_transcription true Mutation affects the process of transcription, its initiation, progression or termination. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html A sequence variation that decreases the rate at which transcription of the sequence occurs. sequence variation decreasing rate of transcription sequence mutation decreasing rate of transcription SO:1000078 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_decreasing_rate_of_transcription true A sequence variation that decreases the rate at which transcription of the sequence occurs. SO:ke mutation affecting transcript sequence sequence variation affecting transcript sequence sequence SO:1000079 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variation_affecting_transcript_sequence true sequence variation increasing rate of transcription sequence mutation increasing rate of transcription SO:1000080 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_increasing_rate_of_transcription true A mutation that alters the rate at which transcription of the sequence occurs. SO:0001550 sequence variant affecting rate of transcription sequence mutation affecting rate of transcription SO:1000081 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_rate_of_transcription true A mutation that alters the rate at which transcription of the sequence occurs. SO:ke Sequence variant affects the stability of the transcript. 
SO:0001546 sequence variant affecting transcript stability sequence mutation affecting transcript stability SO:1000082 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence variant_affecting_transcript_stability true Sequence variant affects the stability of the transcript. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html Sequence variant increases the stability (half-life) of the transcript. sequence variant increasing transcript stability sequence mutation increasing transcript stability SO:1000083 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_increasing_transcript_stability true Sequence variant increases the stability (half-life) of the transcript. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html Sequence variant decreases the stability (half-life) of the transcript. sequence variant decreasing transcript stability sequence mutation decreasing transcript stability SO:1000084 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_decreasing_transcript_stability true Sequence variant decreases the stability (half-life) of the transcript. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html A sequence variation that causes a change in the level of mature, spliced and processed RNA, resulting from a change in the corresponding DNA sequence. SO:0001540 sequence variation affecting level of transcript sequence mutation affecting level of transcript SO:1000085 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variation_affecting_level_of_transcript true A sequence variation that causes a change in the level of mature, spliced and processed RNA, resulting from a change in the corresponding DNA sequence. 
SO:ke A sequence variation that causes a decrease in the level of mature, spliced and processed RNA, resulting from a change in the corresponding DNA sequence. mutation decreasing level of transcript sequence sequence variation decreasing level of transcript SO:1000086 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variation_decreasing_level_of_transcript true A sequence variation that causes a decrease in the level of mature, spliced and processed RNA, resulting from a change in the corresponding DNA sequence. SO:ke A sequence_variation that causes an increase in the level of mature, spliced and processed RNA, resulting from a change in the corresponding DNA sequence. mutation increasing level of transcript sequence variation increasing level of transcript sequence SO:1000087 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variation_increasing_level_of_transcript true A sequence_variation that causes an increase in the level of mature, spliced and processed RNA, resulting from a change in the corresponding DNA sequence. SO:ke A sequence variant causing a change in primary translation product of a transcript. SO:0001553 SO:1000090 SO:1000091 sequence variant affecting translational product sequence variant causing partially characterised change of translational product sequence variant causing uncharacterised change of translational product sequence_variant_causing_partially_characterised_change_of_translational_product sequence_variant_causing_uncharacterised_change_of_translational_product sequence mutation affecting translational product mutation causing partially characterised change of translational product mutation causing uncharacterised change of translational product SO:1000088 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. 
sequence_variant_affecting_translational_product true A sequence variant causing a change in primary translation product of a transcript. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The sequence variant at RNA level does not lead to any change in polypeptide. sequence variant causing no change of translational product sequence mutation causing no change of translational product SO:1000089 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. Also, as there is no change, this is not a good ontological term. sequence_variant_causing_no_change_of_translational_product true The sequence variant at RNA level does not lead to any change in polypeptide. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html true true Any sequence variant effect that is known at nucleotide level but cannot be explained by using other key terms. SO:0001539 sequence variant causing complex change of translational product sequence mutation causing complex change of translational product SO:1000092 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_complex_change_of_translational_product true Any sequence variant effect that is known at nucleotide level but cannot be explained by using other key terms. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The replacement of a single amino acid by another. SO:0001606 sequence variant causing amino acid substitution sequence mutation causing amino acid substitution SO:1000093 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_amino_acid_substitution true The replacement of a single amino acid by another. 
EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html SO:0001607 sequence variant causing conservative amino acid substitution sequence mutation causing conservative amino acid substitution SO:1000094 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_conservative_amino_acid_substitution true SO:0001607 sequence variant causing nonconservative amino acid substitution sequence mutation causing nonconservative amino acid substitution SO:1000095 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_nonconservative_amino_acid_substitution true The insertion of one or more amino acids into the polypeptide, without affecting the surrounding sequence. SO:0001605 sequence variant causing amino acid insertion sequence mutation causing amino acid insertion SO:1000096 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_amino_acid_insertion true The insertion of one or more amino acids into the polypeptide, without affecting the surrounding sequence. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The deletion of one or more amino acids from the polypeptide, without affecting the surrounding sequence. SO:0001825 sequence variant causing amino acid deletion sequence mutation causing amino acid deletion SO:1000097 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_amino_acid_deletion true The deletion of one or more amino acids from the polypeptide, without affecting the surrounding sequence. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The translational product is truncated at its C-terminus, usually a result of a nonsense codon change in transcript (SO:1000062). 
SO:0001587 sequence variant causing polypeptide truncation sequence mutation causing polypeptide truncation SO:1000098 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_polypeptide_truncation true The translational product is truncated at its C-terminus, usually a result of a nonsense codon change in transcript (SO:1000062). EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html The extension of the translational product at either (or both) the N-terminus and/or the C-terminus. SO:0001609 sequence variant causing polypeptide elongation sequence mutation causing polypeptide elongation SO:1000099 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_polypeptide_elongation true The extension of the translational product at either (or both) the N-terminus and/or the C-terminus. EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html . SO:0001611 mutation causing polypeptide N terminal elongation polypeptide N-terminal elongation sequence SO:1000100 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. mutation_causing_polypeptide_N_terminal_elongation true . EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html . SO:0001610 mutation causing polypeptide C terminal elongation polypeptide C-terminal elongation sequence SO:1000101 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. mutation_causing_polypeptide_C_terminal_elongation true . EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html SO:0001553 sequence variant affecting level of translational product sequence mutation affecting level of translational product SO:1000102 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. 
sequence_variant_affecting_level_of_translational_product true SO:0001555 sequence variant decreasing level of translation product sequence mutation decreasing level of translation product SO:1000103 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_decreasing_level_of_translation_product true sequence variant increasing level of translation product sequence mutation increasing level of translation product SO:1000104 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_increasing_level_of_translation_product true SO:0001603 sequence variant affecting polypeptide amino acid sequence sequence mutation affecting polypeptide amino acid sequence SO:1000105 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_polypeptide_amino_acid_sequence true SO:0001614 inframe polypeptide N-terminal elongation mutation causing inframe polypeptide N terminal elongation sequence SO:1000106 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. mutation_causing_inframe_polypeptide_N_terminal_elongation true SO:0001615 mutation causing out of frame polypeptide N terminal elongation out of frame polypeptide N-terminal elongation sequence SO:1000107 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. 
mutation_causing_out_of_frame_polypeptide_N_terminal_elongation true SO:0001612 inframe_polypeptide C-terminal elongation mutaton causing inframe polypeptide C terminal elongation sequence SO:1000108 mutaton_causing_inframe_polypeptide_C_terminal_elongation true SO:0001613 mutation causing out of frame polypeptide C terminal elongation out of frame polypeptide C-terminal elongation sequence SO:1000109 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. mutation_causing_out_of_frame_polypeptide_C_terminal_elongation true A mutation that reverts the sequence of a previous frameshift mutation back to the initial frame. frame restoring mutation frame restoring sequence variant sequence SO:1000110 frame_restoring_sequence_variant true A mutation that reverts the sequence of a previous frameshift mutation back to the initial frame. SO:ke A mutation that changes the amino acid sequence of the peptide in such a way that it changes the 3D structure of the molecule. SO:0001599 SO:1000113 SO:1000114 sequence variant affecting 3D structure of polypeptide sequence variant affecting 3D-structure of polypeptide sequence variant causing partially characterised 3D structural change sequence variant causing uncharacterised 3D structural change sequence_variant_causing_partially_characterised_3D_structural_change sequence_variant_causing_uncharacterised_3D_structural_change sequence mutation affecting 3D structure of polypeptide mutation causing partially characterised 3D structural change mutation causing uncharacterised 3D structural change SO:1000111 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_3D_structure_of_polypeptide true A mutation that changes the amino acid sequence of the peptide in such a way that it changes the 3D structure of the molecule. 
SO:ke sequence variant causing no 3D structural change sequence mutation causing no 3D structural change SO:1000112 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. Also as there is no effect, it is not a good term. sequence_variant_causing_no_3D_structural_change true true true SO:0001600 sequence variant causing complex 3D structural change sequence mutation causing complex 3D structural change SO:1000115 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_complex_3D_structural_change true SO:0001601 sequence variant causing conformational change sequence mutation causing conformational change SO:1000116 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_conformational_change true SO:0001554 sequence variant affecting polypeptide function sequence mutation affecting polypeptide function SO:1000117 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_polypeptide_function true SO:0001559 sequence variant causing loss of function of polypeptide sequence loss of function of polypeptide mutation causing loss of function of polypeptide SO:1000118 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_loss_of_function_of_polypeptide true SO:0001560 sequence variant causing inactive ligand binding site sequence mutation causing inactive ligand binding site SO:1000119 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. 
sequence_variant_causing_inactive_ligand_binding_site true SO:0001618 sequence variant causing inactive catalytic site sequence mutation causing inactive catalytic site SO:1000120 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_inactive_catalytic_site true SO:0001558 sequence variant causing polypeptide localization change sequence mutation causing polypeptide localization change SO:1000121 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_polypeptide_localization_change true SO:0001562 polypeptide post-translational processing affected sequence variant causing polypeptide post translational processing change sequence mutation causing polypeptide post translational processing change SO:1000122 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_polypeptide_post_translational_processing_change true sequence polypeptide_post-translational_processing_affected SO:1000123 polypeptide_post_translational_processing_affected true SO:0001561 partial loss of function of polypeptide sequence variant causing partial loss of function of polypeptide sequence mutation causing partial loss of function of polypeptide SO:1000124 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_partial_loss_of_function_of_polypeptide true SO:0001557 gain of function of polypeptide sequence variant causing gain of function of polypeptide sequence mutation causing gain of function of polypeptide SO:1000125 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. 
sequence_variant_causing_gain_of_function_of_polypeptide true A sequence variant that affects the secondary structure (folding) of the RNA transcript molecule. SO:0001596 sequence variant affecting transcript secondary structure sequence mutation affecting transcript secondary structure SO:1000126 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_transcript_secondary_structure true A sequence variant that affects the secondary structure (folding) of the RNA transcript molecule. SO:ke SO:0001597 sequence variant causing compensatory transcript secondary structure mutation sequence mutation causing compensatory transcript secondary structure mutation SO:1000127 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_compensatory_transcript_secondary_structure_mutation true The effect of a change in nucleotide sequence. sequence sequence variant effect SO:1000132 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. Updated after discussion with Peter Taschner - Feb 09. sequence_variant_effect true The effect of a change in nucleotide sequence. SO:ke SO:0001616 sequence variant causing polypeptide fusion sequence mutation causing polypeptide fusion SO:1000134 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_polypeptide_fusion true An autosynaptic chromosome is the aneuploid product of recombination between a pericentric inversion and a cytologically wild-type chromosome. autosynaptic chromosome sequence (Drosophila)A SO:1000136 autosynaptic_chromosome An autosynaptic chromosome is the aneuploid product of recombination between a pericentric inversion and a cytologically wild-type chromosome. 
PMID:6804304 A compound chromosome whereby two copies of the same chromosomal arm are attached to a common centromere. The chromosome is diploid for the arm involved. homo compound chromosome homo-compound chromosome sequence SO:1000138 homo_compound_chromosome A compound chromosome whereby two copies of the same chromosomal arm are attached to a common centromere. The chromosome is diploid for the arm involved. SO:ke A compound chromosome whereby two arms from different chromosomes are connected through the centromere of one of them. hetero compound chromosome hetero-compound chromosome sequence SO:1000140 hetero_compound_chromosome A compound chromosome whereby two arms from different chromosomes are connected through the centromere of one of them. FB:reference_manual SO:ke A chromosome that occurred by the division of a larger chromosome. chromosome fission sequence SO:1000141 chromosome_fission A chromosome that occurred by the division of a larger chromosome. SO:ke An autosynaptic chromosome carrying the two right (D = dextro) telomeres. dextrosynaptic chromosome sequence SO:1000142 Corrected spelling from dexstrosynaptic_chromosome to dextrosynaptic_chromosome on April 14, 2020 in response to GitHub request #447 dextrosynaptic_chromosome An autosynaptic chromosome carrying the two right (D = dextro) telomeres. FB:manual LS is an autosynaptic chromosome carrying the two left (L = levo) telomeres. laevosynaptic chromosome sequence SO:1000143 laevosynaptic_chromosome LS is an autosynaptic chromosome carrying the two left (L = levo) telomeres. FB:manual A chromosome structure variation whereby the duplicated sequences are carried as a free centric element. free duplication sequence SO:1000144 free_duplication A chromosome structure variation whereby the duplicated sequences are carried as a free centric element. FB:reference_manual A ring chromosome which is a copy of another chromosome. 
free ring duplication sequence (Drosophila)R SO:1000145 free_ring_duplication A ring chromosome which is a copy of another chromosome. SO:ke true A chromosomal deletion whereby a translocation occurs in which one of the four broken ends loses a segment before re-joining. deficient translocation sequence (Drosophila)Df (Drosophila)DfT SO:1000147 deficient_translocation A chromosomal deletion whereby a translocation occurs in which one of the four broken ends loses a segment before re-joining. FB:reference_manual A chromosomal translocation whereby the first two breaks are in the same chromosome, and the region between them is rejoined in inverted order to the other side of the first break, such that both sides of break one are present on the same chromosome. The remaining free ends are joined as a translocation with those resulting from the third break. inversion cum translocation sequence (Drosophila)InT (Drosophila)T SO:1000148 inversion_cum_translocation A chromosomal translocation whereby the first two breaks are in the same chromosome, and the region between them is rejoined in inverted order to the other side of the first break, such that both sides of break one are present on the same chromosome. The remaining free ends are joined as a translocation with those resulting from the third break. FB:reference_manual An interchromosomal mutation whereby the (large) region between the first two breaks listed is lost, and the two flanking segments (one of them centric) are joined as a translocation to the free ends resulting from the third break. bipartite duplication sequence (Drosophila)bDp SO:1000149 bipartite_duplication An interchromosomal mutation whereby the (large) region between the first two breaks listed is lost, and the two flanking segments (one of them centric) are joined as a translocation to the free ends resulting from the third break. FB:reference_manual A chromosomal translocation whereby three breaks occurred in three different chromosomes. 
The centric segment resulting from the first break listed is joined to the acentric segment resulting from the second, rather than the third. cyclic translocation sequence SO:1000150 cyclic_translocation A chromosomal translocation whereby three breaks occurred in three different chromosomes. The centric segment resulting from the first break listed is joined to the acentric segment resulting from the second, rather than the third. FB:reference_manual A chromosomal inversion caused by three breaks in the same chromosome; both central segments are inverted in place (i.e., they are not transposed). bipartite inversion sequence (Drosophila)bIn SO:1000151 bipartite_inversion A chromosomal inversion caused by three breaks in the same chromosome; both central segments are inverted in place (i.e., they are not transposed). FB:reference_manual An insertional duplication where a copy of the segment between the first two breaks listed is inserted at the third break; the insertion is in cytologically the same orientation as its flanking segments. uninverted insertional duplication sequence (Drosophila)eDp SO:1000152 uninverted_insertional_duplication An insertional duplication where a copy of the segment between the first two breaks listed is inserted at the third break; the insertion is in cytologically the same orientation as its flanking segments. FB:reference_manual An insertional duplication where a copy of the segment between the first two breaks listed is inserted at the third break; the insertion is in cytologically inverted orientation with respect to its flanking segments. inverted insertional duplication sequence (Drosophila)iDp SO:1000153 inverted_insertional_duplication An insertional duplication where a copy of the segment between the first two breaks listed is inserted at the third break; the insertion is in cytologically inverted orientation with respect to its flanking segments. 
FB:reference_manual A chromosome duplication involving the insertion of a duplicated region (as opposed to a free duplication). insertional duplication sequence (Drosophila)Dpp SO:1000154 insertional_duplication A chromosome duplication involving the insertion of a duplicated region (as opposed to a free duplication). SO:ke A chromosome structure variation whereby a transposition occurred between chromosomes. interchromosomal transposition sequence (Drosophila)Tp SO:1000155 interchromosomal_transposition A chromosome structure variation whereby a transposition occurred between chromosomes. SO:ke An interchromosomal transposition whereby a copy of the segment between the first two breaks listed is inserted at the third break; the insertion is in cytologically inverted orientation with respect to its flanking segment. inverted interchromosomal transposition sequence (Drosophila)iTp SO:1000156 inverted_interchromosomal_transposition An interchromosomal transposition whereby a copy of the segment between the first two breaks listed is inserted at the third break; the insertion is in cytologically inverted orientation with respect to its flanking segment. FB:reference_manual An interchromosomal transposition where the segment between the first two breaks listed is removed and inserted at the third break; the insertion is in cytologically the same orientation as its flanking segments. uninverted interchromosomal transposition sequence (Drosophila)eTp SO:1000157 uninverted_interchromosomal_transposition An interchromosomal transposition where the segment between the first two breaks listed is removed and inserted at the third break; the insertion is in cytologically the same orientation as its flanking segments. FB:reference_manual An intrachromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the insertion is in cytologically inverted orientation with respect to its flanking segments. 
inverted intrachromosomal transposition sequence (Drosophila)iTp SO:1000158 inverted_intrachromosomal_transposition An intrachromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the insertion is in cytologically inverted orientation with respect to its flanking segments. FB:reference_manual An intrachromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the insertion is in cytologically the same orientation as its flanking segments. uninverted intrachromosomal transposition sequence (Drosophila)eTp SO:1000159 uninverted_intrachromosomal_transposition An intrachromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the insertion is in cytologically the same orientation as its flanking segments. FB:reference_manual An insertional duplication where a copy of the segment between the first two breaks listed is inserted at the third break; the orientation of the insertion with respect to its flanking segments is not recorded. unoriented insertional duplication sequence (Drosophila)uDp SO:1000160 Flag - unknown in the definition. unoriented_insertional_duplication An insertional duplication where a copy of the segment between the first two breaks listed is inserted at the third break; the orientation of the insertion with respect to its flanking segments is not recorded. FB:reference_manual An interchromosomal transposition whereby a copy of the segment between the first two breaks listed is inserted at the third break; the orientation of the insertion with respect to its flanking segments is not recorded. unorientated interchromosomal transposition sequence (Drosophila)uTp SO:1000161 FLAG - term describes an unknown. 
unoriented_interchromosomal_transposition An interchromosomal transposition whereby a copy of the segment between the first two breaks listed is inserted at the third break; the orientation of the insertion with respect to its flanking segments is not recorded. FB:reference_manual An intrachromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the orientation of the insertion with respect to its flanking segments is not recorded. unorientated intrachromosomal transposition sequence (Drosophila)uTp SO:1000162 FLAG - definition describes an unknown. unoriented_intrachromosomal_transposition An intrachromosomal transposition whereby the segment between the first two breaks listed is removed and inserted at the third break; the orientation of the insertion with respect to its flanking segments is not recorded. FB:reference_manual A chromosome structure variant that has not been characterized. uncharacterized chromosomal mutation sequence SO:1000170 uncharacterized_chromosomal_mutation A chromosomal deletion whereby three breaks occur in the same chromosome; one central region is lost, and the other is inverted. deficient inversion sequence (Drosophila)Df (Drosophila)DfIn SO:1000171 deficient_inversion A chromosomal deletion whereby three breaks occur in the same chromosome; one central region is lost, and the other is inverted. FB:reference_manual SO:ke A duplication consisting of 2 identical adjacent regions. tandem duplication sequence erverted SO:1000173 tandem_duplication A duplication consisting of 2 identical adjacent regions. SO:ke erverted http://www.ncbi.nlm.nih.gov/dbvar/ A chromosome structure variant that has not been characterized fully. partially characterized chromosomal mutation sequence SO:1000175 partially_characterized_chromosomal_mutation true true A sequence_variant_effect that changes the gene structure. 
SO:0001564 sequence variant affecting gene structure sequence mutation affecting gene structure SO:1000180 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_affecting_gene_structure true A sequence_variant_effect that changes the gene structure. SO:ke A sequence_variant_effect that changes the gene structure by causing a fusion to another gene. SO:0001565 sequence variant causing gene fusion sequence mutation causing gene fusion SO:1000181 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_gene_fusion true A sequence_variant_effect that changes the gene structure by causing a fusion to another gene. SO:ke A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number. Jannovar:chromosome_number_variation chromosome number variation sequence SO:1000182 chromosome_number_variation A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number. SO:ke Jannovar:chromosome_number_variation http://doc-openbio.readthedocs.org/projects/jannovar/en/master/var_effects.html An alteration of the genome that leads to a change in the structure or number of one or more chromosomes. http://snpeff.sourceforge.net/SnpEff_manual.html chromosome structure variation snpEff:CHROMOSOME_LARGE_DELETION sequence SO:1000183 chromosome_structure_variation snpEff:CHROMOSOME_LARGE_DELETION A sequence variant that affects splicing and causes an exon loss. sequence variant causes exon loss sequence mutation causes exon loss SO:1000184 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causes_exon_loss true A sequence variant that affects splicing and causes an exon loss. 
SO:ke A sequence variant effect, causing an intron to be gained by the processed transcript; usually a result of a donor acceptor mutation (SO:1000072). sequence variant causes intron gain sequence mutation causes intron gain SO:1000185 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causes_intron_gain true A sequence variant effect, causing an intron to be gained by the processed transcript; usually a result of a donor acceptor mutation (SO:1000072). EBI:www.ebi.ac.uk/mutations/recommendations/mutevent.html SO:0001571 sequence variant causing cryptic splice donor activation sequence SO:1000186 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_cryptic_splice_donor_activation true SO:0001570 sequence variant causing cryptic splice acceptor activation sequence SO:1001186 OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect. sequence_variant_causing_cryptic_splice_acceptor_activation true A transcript that is alternatively spliced. alternatively spliced transcript sequence SO:1001187 alternatively_spliced_transcript A transcript that is alternatively spliced. SO:xp A gene that is alternatively spliced, but encodes only one polypeptide. encodes 1 polypeptide sequence SO:1001188 encodes_1_polypeptide A gene that is alternatively spliced, but encodes only one polypeptide. SO:ke A gene that is alternatively spliced, and encodes more than one polypeptide. encodes greater than 1 polypeptide sequence SO:1001189 encodes_greater_than_1_polypeptide A gene that is alternatively spliced, and encodes more than one polypeptide. SO:ke A gene that is alternatively spliced, and encodes more than one polypeptide, that have overlapping peptide sequences, but use different stop codons. 
encodes different polypeptides different stop sequence SO:1001190 encodes_different_polypeptides_different_stop A gene that is alternatively spliced, and encodes more than one polypeptide, that have overlapping peptide sequences, but use different stop codons. SO:ke A gene that is alternatively spliced, and encodes more than one polypeptide, that have overlapping peptide sequences, but use different start codons. encodes overlapping peptides different start sequence SO:1001191 encodes_overlapping_peptides_different_start A gene that is alternatively spliced, and encodes more than one polypeptide, that have overlapping peptide sequences, but use different start codons. SO:ke A gene that is alternatively spliced, and encodes more than one polypeptide, that do not have overlapping peptide sequences. encodes disjoint polypeptides sequence SO:1001192 encodes_disjoint_polypeptides A gene that is alternatively spliced, and encodes more than one polypeptide, that do not have overlapping peptide sequences. SO:ke A gene that is alternatively spliced, and encodes more than one polypeptide, that have overlapping peptide sequences, but use different start and stop codons. encodes overlapping polypeptides different start and stop sequence SO:1001193 encodes_overlapping_polypeptides_different_start_and_stop A gene that is alternatively spliced, and encodes more than one polypeptide, that have overlapping peptide sequences, but use different start and stop codons. SO:ke sequence SO:1001194 alternatively_spliced_gene_encoding_greater_than_1_polypeptide_coding_regions_overlapping true A gene that is alternatively spliced, and encodes more than one polypeptide, that have overlapping peptide sequences. encodes overlapping peptides sequence SO:1001195 encodes_overlapping_peptides A gene that is alternatively spliced, and encodes more than one polypeptide, that have overlapping peptide sequences. SO:ke A maxicircle gene so extensively edited that it cannot be matched to its edited mRNA sequence. 
sequence SO:1001196 cryptogene A maxicircle gene so extensively edited that it cannot be matched to its edited mRNA sequence. SO:ma A primary transcript that has the quality dicistronic. dicistronic primary transcript sequence SO:1001197 dicistronic_primary_transcript A primary transcript that has the quality dicistronic. SO:xp A gene that is a member of a group of genes that are either regulated or transcribed together. member of regulon sequence SO:1001217 member_of_regulon sequence alternatively_spliced_transcript_encoding_greater_than_1_polypeptide_different_start_codon_different_stop_codon_coding_regions_non-overlapping SO:1001244 alternatively_spliced_transcript_encoding_greater_than_1_polypeptide_different_start_codon_different_stop_codon_coding_regions_non_overlapping true A CDS with the evidence status of being independently known. CDS independently known sequence SO:1001246 CDS_independently_known A CDS with the evidence status of being independently known. SO:xp A CDS whose predicted amino acid sequence is unsupported by any experimental evidence or by any match with any other known sequence. orphan CDS sequence SO:1001247 orphan_CDS A CDS whose predicted amino acid sequence is unsupported by any experimental evidence or by any match with any other known sequence. SO:ma A CDS that is supported by domain similarity. CDS supported by domain match data sequence SO:1001249 CDS_supported_by_domain_match_data A CDS that is supported by domain similarity. SO:xp A CDS that is supported by sequence similarity data. CDS supported by sequence similarity data sequence SO:1001251 CDS_supported_by_sequence_similarity_data A CDS that is supported by sequence similarity data. SO:xp A CDS that is predicted. CDS predicted sequence SO:1001254 CDS_predicted A CDS that is predicted. SO:ke sequence SO:1001255 status_of_coding_sequence true A CDS that is supported by similarity to EST or cDNA data. 
CDS supported by EST or cDNA data sequence SO:1001259 CDS_supported_by_EST_or_cDNA_data A CDS that is supported by similarity to EST or cDNA data. SO:xp A Shine-Dalgarno sequence that stimulates recoding through interactions with the anti-Shine-Dalgarno in the RNA of small ribosomal subunits of translating ribosomes. The signal is only operative in Bacteria. internal Shine Dalgarno sequence internal Shine-Dalgarno sequence sequence SO:1001260 internal_Shine_Dalgarno_sequence A Shine-Dalgarno sequence that stimulates recoding through interactions with the anti-Shine-Dalgarno in the RNA of small ribosomal subunits of translating ribosomes. The signal is only operative in Bacteria. PMID:12519954 SO:ke The sequence of a mature mRNA transcript, modified before translation or during translation, usually by special cis-acting signals. recoded mRNA sequence SO:1001261 recoded_mRNA The sequence of a mature mRNA transcript, modified before translation or during translation, usually by special cis-acting signals. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8811194&dopt=Abstract An attribute describing a translational frameshift of -1. minus 1 translationally frameshifted sequence SO:1001262 minus_1_translationally_frameshifted An attribute describing a translational frameshift of -1. SO:ke An attribute describing a translational frameshift of +1. plus 1 translationally frameshifted sequence SO:1001263 plus_1_translationally_frameshifted An attribute describing a translational frameshift of +1. SO:ke A recoded_mRNA where translation was suspended at a particular codon and resumed at a particular non-overlapping downstream codon. mRNA recoded by translational bypass sequence SO:1001264 mRNA_recoded_by_translational_bypass A recoded_mRNA where translation was suspended at a particular codon and resumed at a particular non-overlapping downstream codon. 
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8811194&dopt=Abstract A recoded_mRNA that was modified by an alteration of codon meaning. mRNA recoded by codon redefinition sequence SO:1001265 mRNA_recoded_by_codon_redefinition A recoded_mRNA that was modified by an alteration of codon meaning. SO:ma sequence SO:1001266 stop_codon_redefinition_as_selenocysteine true sequence SO:1001267 stop_codon_readthrough true A site in an mRNA sequence that stimulates the recoding of a region in the same mRNA. INSDC_feature:regulatory INSDC_qualifier:recoding_stimulatory_region recoding stimulatory region recoding stimulatory signal sequence SO:1001268 recoding_stimulatory_region A site in an mRNA sequence that stimulates the recoding of a region in the same mRNA. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12519954&dopt=Abstract A non-canonical start codon with 4 base pairs. 4bp start codon four bp start codon sequence SO:1001269 four_bp_start_codon A non-canonical start codon with 4 base pairs. SO:ke sequence SO:1001270 stop_codon_redefinition_as_pyrrolysine true An intron characteristic of Archaeal tRNA and rRNA genes, where intron transcript generates a bulge-helix-bulge motif that is recognised by a splicing endoribonuclease. archaeal intron sequence SO:1001271 Intron characteristic of tRNA genes; splices by an endonuclease-ligase mediated mechanism. archaeal_intron An intron characteristic of Archaeal tRNA and rRNA genes, where intron transcript generates a bulge-helix-bulge motif that is recognised by a splicing endoribonuclease. PMID:9301331 SO:ma An intron found in tRNA that is spliced via endonucleolytic cleavage and ligation rather than transesterification. pre-tRNA intron tRNA intron sequence SO:1001272 Could be a cross product with Gene ontology, GO:0006388. tRNA_intron An intron found in tRNA that is spliced via endonucleolytic cleavage and ligation rather than transesterification. 
SO:ke A non-canonical start codon of sequence CTG. CTG start codon sequence SO:1001273 CTG_start_codon A non-canonical start codon of sequence CTG. SO:ke The incorporation of selenocysteine into a protein sequence is directed by an in-frame UGA codon (usually a stop codon) within the coding region of the mRNA. Selenoprotein mRNAs contain a conserved secondary structure in the 3' UTR that is required for the distinction of UGA stop from UGA selenocysteine. The selenocysteine insertion sequence (SECIS) is around 60 nt in length and adopts a hairpin structure which is sufficiently well-defined and conserved to act as a computational screen for selenoprotein genes. http://en.wikipedia.org/wiki/SECIS_element SECIS element sequence SO:1001274 SECIS_element The incorporation of selenocysteine into a protein sequence is directed by an in-frame UGA codon (usually a stop codon) within the coding region of the mRNA. Selenoprotein mRNAs contain a conserved secondary structure in the 3' UTR that is required for the distinction of UGA stop from UGA selenocysteine. The selenocysteine insertion sequence (SECIS) is around 60 nt in length and adopts a hairpin structure which is sufficiently well-defined and conserved to act as a computational screen for selenoprotein genes. http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00031 http://en.wikipedia.org/wiki/SECIS_element wiki Sequence coding for a short, single-stranded, DNA sequence via a retrotransposed RNA intermediate; characteristic of some microbial genomes. sequence SO:1001275 retron Sequence coding for a short, single-stranded, DNA sequence via a retrotransposed RNA intermediate; characteristic of some microbial genomes. SO:ma The recoding stimulatory signal located downstream of the recoding site. three prime recoding site sequence SO:1001277 three_prime_recoding_site The recoding stimulatory signal located downstream of the recoding site. 
SO:ke A recoding stimulatory region, the stem-loop secondary structural element is downstream of the redefined region. three prime stem loop structure sequence SO:1001279 three_prime_stem_loop_structure A recoding stimulatory region, the stem-loop secondary structural element is downstream of the redefined region. PMID:12519954 SO:ke The recoding stimulatory signal located upstream of the recoding site. five prime recoding site sequence SO:1001280 five_prime_recoding_site The recoding stimulatory signal located upstream of the recoding site. SO:ke Four base pair sequence immediately downstream of the redefined region. The redefined region is a frameshift site. The quadruplet is 2 overlapping codons. flanking three prime quadruplet recoding signal sequence SO:1001281 flanking_three_prime_quadruplet_recoding_signal Four base pair sequence immediately downstream of the redefined region. The redefined region is a frameshift site. The quadruplet is 2 overlapping codons. PMID:12519954 SO:ke A stop codon signal for a UAG stop codon redefinition. UAG stop codon signal sequence SO:1001282 UAG_stop_codon_signal A stop codon signal for a UAG stop codon redefinition. SO:ke A stop codon signal for a UAA stop codon redefinition. UAA stop codon signal sequence SO:1001283 UAA_stop_codon_signal A stop codon signal for a UAA stop codon redefinition. SO:ke A set of units of gene expression directly regulated by a common set of one or more common regulatory gene products. http://en.wikipedia.org/wiki/Regulon sequence SO:1001284 Definition updated with Mejia-Almonte et al. (PMID:32665585) on Aug 5, 2020. Added relationship has_part SO:0002300. regulon A set of units of gene expression directly regulated by a common set of one or more common regulatory gene products. ISBN:0198506732 PMID:32665585 http://en.wikipedia.org/wiki/Regulon wiki A stop codon signal for a UGA stop codon redefinition. 
UGA stop codon signal sequence SO:1001285 UGA_stop_codon_signal A stop codon signal for a UGA stop codon redefinition. SO:ke A recoding stimulatory signal, downstream sequence important for recoding that contains repetitive elements. three prime repeat recoding signal sequence SO:1001286 three_prime_repeat_recoding_signal A recoding stimulatory signal, downstream sequence important for recoding that contains repetitive elements. PMID:12519954 SO:ke A recoding signal that is found many hundreds of nucleotides 3' of a redefined stop codon. distant three prime recoding signal sequence SO:1001287 distant_three_prime_recoding_signal A recoding signal that is found many hundreds of nucleotides 3' of a redefined stop codon. http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=8709208&dopt=Abstract A recoding stimulatory signal that is a stop codon and has effect on efficiency of recoding. stop codon signal sequence SO:1001288 This term does not include the stop codons that are redefined. An example stimulatory signal would be a stop codon that partially overlaps a frameshift site. stop_codon_signal A recoding stimulatory signal that is a stop codon and has effect on efficiency of recoding. PMID:12519954 SO:ke The sequence referred to by an entry in a databank such as GenBank or SwissProt. databank entry sequence accession SO:2000061 databank_entry The sequence referred to by an entry in a databank such as GenBank or SwissProt. SO:ke A gene component region which acts as a recombinational unit of a gene whose functional form is generated through somatic recombination. gene segment sequence SO:3000000 Requested by tracker 2021594, July 2008, by Alex. gene_segment A gene component region which acts as a recombinational unit of a gene whose functional form is generated through somatic recombination. GOC:add