OBO-Edit 2.3.1
22:11:2021 06:31
sequence
1.2
David Sant
definition
term replaced by
amino acid modification
Alliance of Genome Resources
Alliance of Genome Resources Gene Biotype Slim
biosapiens
database of genomic structural variation
RNA modification
SO feature annotation
variant annotation term
amino acid 1 letter code
amino acid 3 letter code
biosapiens protein feature ontology
dbsnp variant terms
DBVAR
ensembl variant terms
subset_property
synonym_type_property
consider
has_alternative_id
has_broad_synonym
database_cross_reference
has_exact_synonym
has_narrow_synonym
has_obo_format_version
has_obo_namespace
has_related_synonym
has_scope
has_synonym_type
in_subset
A geometric operator, specified in Egenhofer 1989. Two features meet if they share a junction on the sequence. X adjacent_to Y iff X and Y share a boundary but do not overlap.
sequence
adjacent_to
adjacent_to
A geometric operator, specified in Egenhofer 1989. Two features meet if they share a junction on the sequence. X adjacent_to Y iff X and Y share a boundary but do not overlap.
PMID:20226267
SO:ke
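As a rough illustration of adjacent_to, the sketch below models each feature as a half-open (start, end) pair of integer coordinates on one sequence. The coordinate convention and the function name are assumptions made for this example; SO itself does not prescribe them.

```python
def adjacent_to(x, y):
    """Return True if features x and y share a boundary without overlapping.

    Each feature is an assumed (start, end) pair in half-open coordinates,
    so a feature covers positions start .. end - 1.
    """
    x_start, x_end = x
    y_start, y_end = y
    # A shared junction means one feature ends exactly where the other begins.
    return x_end == y_start or y_end == x_start


exon = (100, 200)      # hypothetical coordinates
intron = (200, 350)
print(adjacent_to(exon, intron))      # the two features meet at position 200
print(adjacent_to(exon, (150, 250)))  # these features overlap, so they do not meet
```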
sequence
associated_with
This relationship is vague and up for discussion.
associated_with
B is complete_evidence_for_feature A if the extent (5' and 3' boundaries) and internal boundaries of B fully support the extent and internal boundaries of A.
sequence
complete_evidence_for_feature
If A is a feature with multiple regions such as a multi exon transcript, the supporting EST evidence is complete if each of the regions is supported by an equivalent region in B. Also there must be no extra regions in B that are not represented in A. This relationship was requested by jeltje on the SO term tracker. The discussion thread can be accessed via tracker ID:1917222.
complete_evidence_for_feature
B is complete_evidence_for_feature A if the extent (5' and 3' boundaries) and internal boundaries of B fully support the extent and internal boundaries of A.
SO:ke
X connects_on Y, Z, R iff whenever Z is on a R, X is adjacent to a Y and adjacent to a Z.
kareneilbeck
2010-10-14T01:38:51Z
sequence
connects_on
Example: A splice_junction connects_on exon, exon, mature_transcript.
connects_on
X connects_on Y, Z, R iff whenever Z is on a R, X is adjacent to a Y and adjacent to a Z.
PMID:20226267
X contained_by Y iff X starts after start of Y and X ends before end of Y.
kareneilbeck
2010-10-14T01:26:16Z
sequence
contained_by
The inverse is contains. Example: intein contained_by immature_peptide_region.
contained_by
X contained_by Y iff X starts after start of Y and X ends before end of Y.
PMID:20226267
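The contained_by definition can be read as a pair of strict inequalities on coordinates. The following is a minimal sketch under that literal reading, with features as assumed (start, end) pairs; whether shared boundaries should also count is not settled by the text, and the function names are chosen only for illustration.

```python
def contained_by(x, y):
    """X contained_by Y: X starts after the start of Y and ends before the end of Y."""
    return y[0] < x[0] and x[1] < y[1]


def contains(x, y):
    """The inverse relation: X contains Y exactly when Y contained_by X."""
    return contained_by(y, x)


immature_peptide_region = (0, 300)   # hypothetical coordinates
intein = (50, 120)
print(contained_by(intein, immature_peptide_region))
print(contains(immature_peptide_region, intein))
```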
The inverse of contained_by.
kareneilbeck
2010-10-14T01:32:15Z
sequence
contains
Example: pre_miRNA contains miRNA_loop.
contains
The inverse of contained_by.
PMID:20226267
sequence
derives_from
derives_from
X is disconnected_from Y iff it is not the case that X overlaps Y.
kareneilbeck
2010-10-14T01:42:10Z
sequence
disconnected_from
disconnected_from
X is disconnected_from Y iff it is not the case that X overlaps Y.
PMID:20226267
kareneilbeck
2009-08-19T02:19:45Z
sequence
edited_from
edited_from
kareneilbeck
2009-08-19T02:19:11Z
sequence
edited_to
edited_to
B is evidence_for_feature A, if an instance of B supports the existence of A.
sequence
evidence_for_feature
This relationship was requested by nlw on the SO term tracker. The discussion thread can be accessed via tracker ID:1917222.
evidence_for_feature
B is evidence_for_feature A, if an instance of B supports the existence of A.
SO:ke
X is exemplar_of Y if X is the best evidence for Y.
sequence
exemplar_of
Tracker id: 2594157.
exemplar_of
X is exemplar_of Y if X is the best evidence for Y.
SO:ke
X is finished_by Y if Y is part_of X, and X and Y share a 3' boundary.
kareneilbeck
2010-10-14T01:45:45Z
sequence
finished_by
Example: CDS finished_by stop_codon.
finished_by
X is finished_by Y if Y is part_of X, and X and Y share a 3' boundary.
PMID:20226267
X finishes Y if X is part_of Y and X and Y share a 3' or C terminal boundary.
kareneilbeck
2010-10-14T02:17:53Z
sequence
finishes
Example: stop_codon finishes CDS.
finishes
X finishes Y if X is part_of Y and X and Y share a 3' or C terminal boundary.
PMID:20226267
X gained Y if X is a variant_of X' and Y part of X but not X'.
kareneilbeck
2011-06-28T12:51:10Z
sequence
gained
A relation with which to annotate the changes in a variant sequence with respect to a reference.
For example a variant transcript may gain a stop codon not present in the reference sequence.
gained
X gained Y if X is a variant_of X' and Y part of X but not X'.
SO:ke
sequence
genome_of
genome_of
kareneilbeck
2009-08-19T02:27:04Z
sequence
guided_by
guided_by
kareneilbeck
2009-08-19T02:27:24Z
sequence
guides
guides
X has_integral_part Y if and only if: X has_part Y and Y part_of X.
kareneilbeck
2009-08-19T12:01:46Z
sequence
has_integral_part
Example: mRNA has_integral_part CDS.
has_integral_part
X has_integral_part Y if and only if: X has_part Y and Y part_of X.
http://precedings.nature.com/documents/3495/version/1
sequence
has_origin
has_origin
Inverse of part_of.
sequence
has_part
Example: operon has_part gene.
has_part
Inverse of part_of.
http://precedings.nature.com/documents/3495/version/1
sequence
has_quality
The relationship between a feature and an attribute.
has_quality
sequence
homologous_to
homologous_to
X integral_part_of Y if and only if: X part_of Y and Y has_part X.
kareneilbeck
2009-08-19T12:03:28Z
sequence
integral_part_of
Example: exon integral_part_of transcript.
integral_part_of
X integral_part_of Y if and only if: X part_of Y and Y has_part X.
http://precedings.nature.com/documents/3495/version/1
R is_consecutive_sequence_of U iff every instance of R is equivalent to a collection of instances of U: u1, u2, ..., un, such that no pair ux, uy is overlapping and every ux is adjacent to ux-1 and ux+1, with the exception of the initial and terminal u1 and un (which may be identical).
kareneilbeck
2010-10-14T02:19:48Z
sequence
is_consecutive_sequence_of
Example: region is_consecutive_sequence_of base.
is_consecutive_sequence_of
R is_consecutive_sequence_of U iff every instance of R is equivalent to a collection of instances of U: u1, u2, ..., un, such that no pair ux, uy is overlapping and every ux is adjacent to ux-1 and ux+1, with the exception of the initial and terminal u1 and un (which may be identical).
PMID:20226267
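One way to picture is_consecutive_sequence_of is as a tiling check: the units must cover the region end to end, with no gaps and no overlaps. The sketch below assumes half-open (start, end) coordinates and interprets "equivalent to a collection" as exact coverage of the region; both are assumptions for illustration only.

```python
def is_consecutive_sequence(region, units):
    """Check that `units` tile `region` end to end with no gaps or overlaps."""
    if not units:
        return False
    ordered = sorted(units)
    # The first unit must begin the region and the last unit must end it.
    if ordered[0][0] != region[0] or ordered[-1][1] != region[1]:
        return False
    # Each unit must end exactly where its successor begins.
    return all(a[1] == b[0] for a, b in zip(ordered, ordered[1:]))


region = (0, 3)                       # hypothetical three-base region
bases = [(0, 1), (1, 2), (2, 3)]
print(is_consecutive_sequence(region, bases))
```

A single unit that spans the whole region also passes, which matches the remark that the initial and terminal units may be identical.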
X lost Y if X is a variant_of X' and Y part of X' but not X.
kareneilbeck
2011-06-28T12:53:16Z
sequence
lost
A relation with which to annotate the changes in a variant sequence with respect to a reference.
For example a variant transcript may have lost a stop codon present in the reference sequence.
lost
X lost Y if X is a variant_of X' and Y part of X' but not X.
SO:ke
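The gained and lost relations amount to set differences between the parts of a variant and the parts of its reference. A minimal sketch, assuming that the parts of each sequence are available as a set of labels (the labels and function names here are hypothetical):

```python
def gained(variant_parts, reference_parts):
    """Parts present in the variant X but absent from the reference X'."""
    return set(variant_parts) - set(reference_parts)


def lost(variant_parts, reference_parts):
    """Parts present in the reference X' but absent from the variant X."""
    return set(reference_parts) - set(variant_parts)


reference_transcript = {"five_prime_UTR", "CDS", "stop_codon"}
variant_transcript = {"five_prime_UTR", "CDS"}
print(lost(variant_transcript, reference_transcript))
print(gained(variant_transcript, reference_transcript))
```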
A maximally_overlaps X and Y iff all parts of A (including A itself) overlap both X and Y.
kareneilbeck
2010-10-14T01:34:48Z
sequence
maximally_overlaps
Example: non_coding_region_of_exon maximally_overlaps the intersections of exon and UTR.
maximally_overlaps
A maximally_overlaps X and Y iff all parts of A (including A itself) overlap both X and Y.
PMID:20226267
sequence
member_of
A subtype of part_of. Inverse is collection_of. Winston, M, Chaffin, R, Herrmann: A taxonomy of part-whole relations. Cognitive Science 1987, 11:417-444.
member_of
A relationship between a pseudogenic feature and its functional ancestor.
sequence
non_functional_homolog_of
non_functional_homolog_of
A relationship between a pseudogenic feature and its functional ancestor.
SO:ke
sequence
orthologous_to
orthologous_to
X overlaps Y iff there exists some Z such that Z contained_by X and Z contained_by Y.
kareneilbeck
2010-10-14T01:33:15Z
sequence
overlaps
Example: coding_exon overlaps CDS.
overlaps
X overlaps Y iff there exists some Z such that Z contained_by X and Z contained_by Y.
PMID:20226267
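In coordinate terms, a shared subregion Z exists when the two features have a non-empty intersection. The sketch below is a simplification: it treats Z as any non-empty stretch of shared positions, uses assumed half-open (start, end) pairs, and derives disconnected_from as the negation of overlaps, as its definition states.

```python
def overlaps(x, y):
    """X overlaps Y: some stretch of positions lies in both features."""
    return max(x[0], y[0]) < min(x[1], y[1])


def disconnected_from(x, y):
    """X is disconnected_from Y iff it is not the case that X overlaps Y."""
    return not overlaps(x, y)


coding_exon = (100, 250)   # hypothetical coordinates
cds = (180, 900)
print(overlaps(coding_exon, cds))
print(disconnected_from(coding_exon, (250, 400)))
```

Under this reading, two features that merely meet at a boundary are disconnected, which is consistent with adjacent_to requiring that they do not overlap.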
sequence
paralogous_to
paralogous_to
X part_of Y if X is a subregion of Y.
sequence
part_of
Example: amino_acid part_of polypeptide.
part_of
X part_of Y if X is a subregion of Y.
http://precedings.nature.com/documents/3495/version/1
B is partial_evidence_for_feature A if the extent of B supports part of but not all of A.
sequence
partial_evidence_for_feature
partial_evidence_for_feature
B is partial_evidence_for_feature A if the extent of B supports part of but not all of A.
SO:ke
sequence
position_of
position_of
Inverse of processed_into.
kareneilbeck
2009-08-19T12:14:00Z
sequence
processed_from
Example: miRNA processed_from miRNA_primary_transcript.
processed_from
Inverse of processed_into.
http://precedings.nature.com/documents/3495/version/1
X is processed_into Y if a region X is modified to create Y.
kareneilbeck
2009-08-19T12:15:02Z
sequence
processed_into
Example: miRNA_primary_transcript processed_into miRNA.
processed_into
X is processed_into Y if a region X is modified to create Y.
http://precedings.nature.com/documents/3495/version/1
kareneilbeck
2009-08-19T02:21:03Z
sequence
recombined_from
recombined_from
kareneilbeck
2009-08-19T02:20:07Z
sequence
recombined_to
recombined_to
sequence
sequence_of
sequence_of
sequence
similar_to
similar_to
X is started_by Y if Y is part_of X and X and Y share a 5' boundary.
kareneilbeck
2010-10-14T01:43:55Z
sequence
started_by
Example: CDS started_by start_codon.
started_by
X is started_by Y if Y is part_of X and X and Y share a 5' boundary.
PMID:20226267
X starts Y if X is part_of Y, and X and Y share a 5' or N-terminal boundary.
kareneilbeck
2010-10-14T01:47:53Z
sequence
starts
Example: start_codon starts CDS.
starts
X starts Y if X is part_of Y, and X and Y share a 5' or N-terminal boundary.
PMID:20226267
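The starts and finishes relations combine part_of with a shared boundary. The sketch below assumes half-open (start, end) coordinates plus a strand flag, so that the 5' boundary is the start on the forward strand and the end on the reverse strand. The strand handling and the function names are assumptions for illustration and are not stated in the definitions.

```python
def part_of(x, y):
    """X part_of Y: X is a subregion of Y (boundaries may coincide)."""
    return y[0] <= x[0] and x[1] <= y[1]


def starts(x, y, strand="+"):
    """X starts Y: X is part_of Y and both share the 5' boundary."""
    five_prime = 0 if strand == "+" else 1
    return part_of(x, y) and x[five_prime] == y[five_prime]


def finishes(x, y, strand="+"):
    """X finishes Y: X is part_of Y and both share the 3' boundary."""
    three_prime = 1 if strand == "+" else 0
    return part_of(x, y) and x[three_prime] == y[three_prime]


cds = (100, 400)            # hypothetical forward-strand coordinates
start_codon = (100, 103)
stop_codon = (397, 400)
print(starts(start_codon, cds))
print(finishes(stop_codon, cds))
```

The inverse relations started_by and finished_by follow by swapping the two arguments.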
kareneilbeck
2009-08-19T02:22:14Z
sequence
trans_spliced_from
trans_spliced_from
kareneilbeck
2009-08-19T02:22:00Z
sequence
trans_spliced_to
trans_spliced_to
X is transcribed_from Y if X is synthesized from template Y.
kareneilbeck
2009-08-19T12:05:39Z
sequence
transcribed_from
Example: primary_transcript transcribed_from gene.
transcribed_from
X is transcribed_from Y if X is synthesized from template Y.
http://precedings.nature.com/documents/3495/version/1
Inverse of transcribed_from.
kareneilbeck
2009-08-19T12:08:24Z
sequence
transcribed_to
Example: gene transcribed_to primary_transcript.
transcribed_to
Inverse of transcribed_from.
http://precedings.nature.com/documents/3495/version/1
Inverse of translation_of.
kareneilbeck
2009-08-19T12:11:53Z
sequence
translates_to
Example: codon translates_to amino_acid.
translates_to
Inverse of translation_of.
http://precedings.nature.com/documents/3495/version/1
X is translation_of Y if Y is translated by a ribosome to create X.
kareneilbeck
2009-08-19T12:09:59Z
sequence
translation_of
Example: Polypeptide translation_of CDS.
translation_of
X is translation_of Y if Y is translated by a ribosome to create X.
http://precedings.nature.com/documents/3495/version/1
A' is a variant (mutation) of A, defined as: every instance of A' is either an immediate mutation of some instance of A, or there is a chain of immediate mutation processes linking A' to some instance of A.
sequence
variant_of
Added to SO during the immunology workshop, June 2007. This relationship was approved by Barry Smith.
variant_of
A' is a variant (mutation) of A, defined as: every instance of A' is either an immediate mutation of some instance of A, or there is a chain of immediate mutation processes linking A' to some instance of A.
SO:immuno_workshop
sequence
SO:0000000
Sequence_Ontology
true
A sequence_feature with an extent greater than zero. A nucleotide region is composed of bases and a polypeptide region is composed of amino acids.
sequence
sequence
SO:0000001
region
A sequence_feature with an extent greater than zero. A nucleotide region is composed of bases and a polypeptide region is composed of amino acids.
SO:ke
A folded sequence.
INSDC_feature:misc_structure
sequence secondary structure
sequence
SO:0000002
sequence_secondary_structure
A folded sequence.
SO:ke
G-quartets are unusual nucleic acid structures consisting of a planar arrangement where each guanine is hydrogen bonded by Hoogsteen pairing to another guanine in the quartet.
http://en.wikipedia.org/wiki/G-quadruplex
G quartet
G tetrad
G-quadruplex
G-quartet
G-tetrad
G_quadruplex
guanine tetrad
sequence
SO:0000003
G_quartet
G-quartets are unusual nucleic acid structures consisting of a planar arrangement where each guanine is hydrogen bonded by Hoogsteen pairing to another guanine in the quartet.
http://www.ncbi.nlm.nih.gov/pubmed/7919797?dopt=Abstract
http://en.wikipedia.org/wiki/G-quadruplex
wiki
A coding exon that is not the most 3-prime or the most 5-prime in a given transcript.
interior coding exon
sequence
SO:0000004
interior_coding_exon
The many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA.
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Satellite_DNA
INSDC_qualifier:satellite
satellite DNA
sequence
SO:0000005
satellite_DNA
The many tandem repeats (identical or related) of a short basic repeating unit; many have a base composition or other property different from the genome average that allows them to be separated from the bulk (main band) genomic DNA.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Satellite_DNA
wiki
A region amplified by a PCR reaction.
http://en.wikipedia.org/wiki/RAPD
PCR product
sequence
amplicon
SO:0000006
This term is mapped to MGED. This term is now located in OBI, with the following ID OBI_0000406.
PCR_product
A region amplified by a PCR reaction.
SO:ke
http://en.wikipedia.org/wiki/RAPD
wiki
One of a pair of sequencing reads in which the two members of the pair are related by originating at either end of a clone insert.
mate pair
read-pair
sequence
SO:0000007
read_pair
One of a pair of sequencing reads in which the two members of the pair are related by originating at either end of a clone insert.
SO:ls
sequence
SO:0000008
gene_sensu_your_favorite_organism
true
sequence
SO:0000009
gene_class
true
A gene which, when transcribed, can be translated into a protein.
protein-coding
sequence
SO:0000010
protein_coding
A gene which can be transcribed, but will not be translated into a protein.
non protein-coding
sequence
SO:0000011
non_protein_coding
The primary transcript of any one of several small cytoplasmic RNA molecules present in the cytoplasm and sometimes nucleus of a Eukaryote.
scRNA primary transcript
scRNA transcript
small cytoplasmic RNA transcript
sequence
small cytoplasmic RNA
small_cytoplasmic_RNA
SO:0000012
scRNA_primary_transcript
The primary transcript of any one of several small cytoplasmic RNA molecules present in the cytoplasm and sometimes nucleus of a Eukaryote.
http://www.ebi.ac.uk/embl/WebFeat/align/scRNA_s.html
A small non coding RNA sequence, present in the cytoplasm.
INSDC_feature:ncRNA
INSDC_qualifier:scRNA
small cytoplasmic RNA
sequence
SO:0000013
scRNA
A small non coding RNA sequence, present in the cytoplasm.
SO:ke
A sequence element characteristic of some RNA polymerase II promoters required for the correct positioning of the polymerase for the start of transcription. Overlaps the TSS. The mammalian consensus sequence is YYAN(T|A)YY; the Drosophila consensus sequence is TCA(G|T)t(T|C). In each the A is at position +1 with respect to the TSS. Functionally similar to the TATA box element.
INR motif
initiator
initiator motif
sequence
DMp2
SO:0000014
Binds TAF1, TAF2.
INR_motif
A sequence element characteristic of some RNA polymerase II promoters required for the correct positioning of the polymerase for the start of transcription. Overlaps the TSS. The mammalian consensus sequence is YYAN(T|A)YY; the Drosophila consensus sequence is TCA(G|T)t(T|C). In each the A is at position +1 with respect to the TSS. Functionally similar to the TATA box element.
PMID:12651739
PMID:16858867
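Consensus sequences such as YYAN(T|A)YY use IUPAC nucleotide codes (Y for C or T, N for any base). As a hedged sketch, they can be translated into regular expressions for scanning a sequence. The helper name and the test sequence below are hypothetical, and matching a consensus does not by itself establish that a site is functional.

```python
import re

# Standard IUPAC nucleotide codes, expanded to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Translate a consensus such as YYAN(T|A)YY into a compiled regex.

    Parentheses and '|' are passed through unchanged, since they already
    mean grouping and alternation in regular-expression syntax.
    """
    pattern = "".join(IUPAC.get(ch, ch) for ch in consensus.upper())
    return re.compile(pattern)


inr_mammalian = consensus_to_regex("YYAN(T|A)YY")
print(inr_mammalian.pattern)
print(bool(inr_mammalian.search("GGCCATTCCGG")))  # made-up test sequence
```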
A sequence element characteristic of some RNA polymerase II promoters; Positioned from +28 to +32 with respect to the TSS (+1). Experimental results suggest that the DPE acts in conjunction with the INR_motif to provide a binding site for TFIID in the absence of a TATA box to mediate transcription of TATA-less promoters. Consensus sequence (A|G)G(A|T)(C|T)(G|A|C).
DPE motif
downstream core promoter element
CRWMGCGWKCGCTTS
sequence
SO:0000015
Binds TAF6, TAF9.
DPE_motif
A sequence element characteristic of some RNA polymerase II promoters; Positioned from +28 to +32 with respect to the TSS (+1). Experimental results suggest that the DPE acts in conjunction with the INR_motif to provide a binding site for TFIID in the absence of a TATA box to mediate transcription of TATA-less promoters. Consensus sequence (A|G)G(A|T)(C|T)(G|A|C).
PMID:12515390
PMID:12537576
PMID:12651739
PMID:16858867
A sequence element characteristic of some RNA polymerase II promoters, located immediately upstream of some TATA box elements at -37 to -32 with respect to the TSS (+1). Consensus sequence is (G|C)(G|C)(G|A)CGCC. Binds TFIIB.
B-recognition element
BRE motif
BREu motif
transcription factor B-recognition element
sequence
BREu
TFIIB recognition element
SO:0000016
Binds TFIIB.
BREu_motif
A sequence element characteristic of some RNA polymerase II promoters, located immediately upstream of some TATA box elements at -37 to -32 with respect to the TSS (+1). Consensus sequence is (G|C)(G|C)(G|A)CGCC. Binds TFIIB.
PMID:12651739
PMID:16858867
A sequence element characteristic of the promoters of snRNA genes transcribed by RNA polymerase II or by RNA polymerase III. Located between -45 and -60 relative to the TSS. The human PSE_motif consensus sequence is TCACCNTNA(C|G)TNAAAAG(T|G). The basal transcription factor, snRNA-activating protein complex (SNAPc), binds the PSE_motif and is required for the transcription of both RNA polymerase II and III transcribed small-nuclear RNA genes.
PSE motif
proximal sequence element
sequence
SO:0000017
PSE_motif
A sequence element characteristic of the promoters of snRNA genes transcribed by RNA polymerase II or by RNA polymerase III. Located between -45 and -60 relative to the TSS. The human PSE_motif consensus sequence is TCACCNTNA(C|G)TNAAAAG(T|G). The basal transcription factor, snRNA-activating protein complex (SNAPc), binds the PSE_motif and is required for the transcription of both RNA polymerase II and III transcribed small-nuclear RNA genes.
PMID:11390411
PMID:12621023
PMID:12651739
PMID:23166507
PMID:8339931
A group of loci that can be grouped in a linear order representing the different degrees of linkage among the genes concerned.
http://en.wikipedia.org/wiki/Linkage_group
linkage group
sequence
SO:0000018
linkage_group
A group of loci that can be grouped in a linear order representing the different degrees of linkage among the genes concerned.
ISBN:038752046
http://en.wikipedia.org/wiki/Linkage_group
wiki
true
A region of double stranded RNA where the bases do not conform to WC base pairing. The loop is closed on both sides by canonical base pairing. If the interruption to base pairing occurs on one strand only, it is known as a bulge.
RNA internal loop
sequence
SO:0000020
RNA_internal_loop
A region of double stranded RNA where the bases do not conform to WC base pairing. The loop is closed on both sides by canonical base pairing. If the interruption to base pairing occurs on one strand only, it is known as a bulge.
SO:ke
An internal RNA loop where one of the strands includes more bases than the corresponding region on the other strand.
asymmetric RNA internal loop
sequence
SO:0000021
asymmetric_RNA_internal_loop
An internal RNA loop where one of the strands includes more bases than the corresponding region on the other strand.
SO:ke
A region forming a motif, composed of adenines, where the minor groove edges are inserted into the minor groove of another helix.
A minor RNA motif
sequence
SO:0000022
A_minor_RNA_motif
A region forming a motif, composed of adenines, where the minor groove edges are inserted into the minor groove of another helix.
SO:ke
The kink turn (K-turn) is an RNA structural motif that creates a sharp (~120 degree) bend between two continuous helices.
http://en.wikipedia.org/wiki/K-turn
K turn RNA motif
K-turn
kink turn
kink-turn motif
sequence
SO:0000023
K_turn_RNA_motif
The kink turn (K-turn) is an RNA structural motif that creates a sharp (~120 degree) bend between two continuous helices.
SO:ke
http://en.wikipedia.org/wiki/K-turn
wiki
A loop in ribosomal RNA containing the sites of attack for ricin and sarcin.
sarcin like RNA motif
sarcin/ricin RNA domain
sarcin/ricin domain
sarcin/ricin loop
sequence
SO:0000024
sarcin_like_RNA_motif
A loop in ribosomal RNA containing the sites of attack for ricin and sarcin.
http://www.ncbi.nlm.nih.gov/pubmed/7897662
An internal RNA loop where the extent of the loop on both strands is the same size.
A-minor RNA motif
sequence
SO:0000025
symmetric_RNA_internal_loop
An internal RNA loop where the extent of the loop on both strands is the same size.
SO:ke
RNA junction loop
sequence
SO:0000026
RNA_junction_loop
RNA hook turn
hook-turn motif
sequence
hook turn
SO:0000027
RNA_hook_turn
Two bases paired opposite each other by hydrogen bonds creating a secondary structure.
http://en.wikipedia.org/wiki/Base_pair
base pair
sequence
SO:0000028
base_pair
http://en.wikipedia.org/wiki/Base_pair
wiki
The canonical base pair, where two bases interact via WC edges, with glycosidic bonds oriented cis relative to the axis of orientation.
WC base pair
Watson Crick base pair
Watson-Crick pair
canonical base pair
sequence
Watson-Crick base pair
SO:0000029
WC_base_pair
The canonical base pair, where two bases interact via WC edges, with glycosidic bonds oriented cis relative to the axis of orientation.
PMID:12177293
A type of non-canonical base-pairing.
sugar edge base pair
sequence
SO:0000030
sugar_edge_base_pair
A type of non-canonical base-pairing.
PMID:12177293
DNA or RNA molecules that have been selected from random pools based on their ability to bind other molecules.
http://en.wikipedia.org/wiki/Aptamer
sequence
SO:0000031
aptamer
DNA or RNA molecules that have been selected from random pools based on their ability to bind other molecules.
http://aptamer.icmb.utexas.edu
http://en.wikipedia.org/wiki/Aptamer
wiki
DNA molecules that have been selected from random pools based on their ability to bind other molecules.
DNA aptamer
sequence
SO:0000032
DNA_aptamer
DNA molecules that have been selected from random pools based on their ability to bind other molecules.
http://aptamer.icmb.utexas.edu
RNA molecules that have been selected from random pools based on their ability to bind other molecules.
RNA aptamer
sequence
SO:0000033
RNA_aptamer
RNA molecules that have been selected from random pools based on their ability to bind other molecules.
http://aptamer.icmb.utexas.edu
Morpholino oligos are synthesized from four different Morpholino subunits, each of which contains one of the four genetic bases (A, C, G, T) linked to a 6-membered morpholine ring. Eighteen to 25 subunits of these four subunit types are joined in a specific order by non-ionic phosphorodiamidate intersubunit linkages to give a Morpholino.
morphant
morpholino
morpholino oligo
sequence
SO:0000034
morpholino_oligo
Morpholino oligos are synthesized from four different Morpholino subunits, each of which contains one of the four genetic bases (A, C, G, T) linked to a 6-membered morpholine ring. Eighteen to 25 subunits of these four subunit types are joined in a specific order by non-ionic phosphorodiamidate intersubunit linkages to give a Morpholino.
http://www.gene-tools.com/
A riboswitch is a part of an mRNA that can act as a direct sensor of small molecules to control its own expression. A riboswitch is a cis element in the 5' end of an mRNA that acts as a direct sensor of metabolites.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Riboswitch
INSDC_qualifier:riboswitch
riboswitch RNA
sequence
SO:0000035
riboswitch
A riboswitch is a part of an mRNA that can act as a direct sensor of small molecules to control its own expression. A riboswitch is a cis element in the 5' end of an mRNA that acts as a direct sensor of metabolites.
PMID:2820954
http://en.wikipedia.org/wiki/Riboswitch
wiki
A DNA region that is required for the binding of chromatin to the nuclear matrix.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Matrix_attachment_site
INSDC_qualifier:matrix_attachment_region
MAR
S/MAR
SMAR
matrix association region
matrix attachment region
matrix attachment site
nuclear matrix association region
nuclear matrix attachment site
scaffold attachment site
scaffold matrix attachment region
sequence
S/MAR element
SO:0000036
matrix_attachment_site
A DNA region that is required for the binding of chromatin to the nuclear matrix.
SO:ma
http://en.wikipedia.org/wiki/Matrix_attachment_site
wiki
A DNA region that includes DNAse hypersensitive sites located near a gene that confers the high-level, position-independent, and copy number-dependent expression to that gene.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Locus_control_region
INSDC_qualifier:locus_control_region
LCR
locus control region
sequence
locus control element
SO:0000037
Definition updated Nov 10 2020, Colin Logie from GREEKC helped us realize that LCRs can also be located 3' to a gene.
locus_control_region
A DNA region that includes DNAse hypersensitive sites located near a gene that confers the high-level, position-independent, and copy number-dependent expression to that gene.
SO:ma
http://en.wikipedia.org/wiki/Locus_control_region
wiki
A collection of match parts.
sequence
SO:0000038
match_set
true
A collection of match parts.
SO:ke
A part of a match, for example an hsp from blast is a match_part.
match part
sequence
SO:0000039
match_part
A part of a match, for example an hsp from blast is a match_part.
SO:ke
A clone of a DNA region of a genome.
genomic clone
sequence
SO:0000040
genomic_clone
A clone of a DNA region of a genome.
SO:ma
An operation that can be applied to a sequence, that results in a change.
sequence operation
sequence
SO:0000041
sequence_operation
true
An operation that can be applied to a sequence, that results in a change.
SO:ke
An attribute of a pseudogene (SO:0000336).
pseudogene attribute
sequence
SO:0000042
pseudogene_attribute
true
An attribute of a pseudogene (SO:0000336).
SO:ma
A pseudogene created via retrotransposition of the mRNA of a functional protein-coding parent gene, followed by accumulation of deleterious mutations. It lacks introns and promoters and often includes a polyA tail.
INSDC_feature:gene
INSDC_qualifier:processed
processed pseudogene
retropseudogene
sequence
R psi G
pseudogene by reverse transcription
SO:0000043
Please note that the synonym R psi G uses the spelled-out form of the Greek letter.
processed_pseudogene
A pseudogene created via retrotransposition of the mRNA of a functional protein-coding parent gene, followed by accumulation of deleterious mutations. It lacks introns and promoters and often includes a polyA tail.
GENCODE:http://www.gencodegenes.org/gencode_biotypes.html
A pseudogene caused by unequal crossing over at recombination.
pseudogene by unequal crossing over
sequence
SO:0000044
pseudogene_by_unequal_crossing_over
A pseudogene caused by unequal crossing over at recombination.
SO:ke
To remove a subsection of sequence.
sequence
SO:0000045
delete
true
To remove a subsection of sequence.
SO:ke
To insert a subsection of sequence.
sequence
SO:0000046
insert
true
To insert a subsection of sequence.
SO:ke
To invert a subsection of sequence.
sequence
SO:0000047
invert
true
To invert a subsection of sequence.
SO:ke
To substitute a subsection of sequence for another.
sequence
SO:0000048
substitute
true
To substitute a subsection of sequence for another.
SO:ke
To translocate a subsection of sequence.
sequence
SO:0000049
translocate
true
To translocate a subsection of sequence.
SO:ke
A part of a gene, that has no other route in the ontology back to region. This concept is necessary for logical inference as these parts must have the properties of region. It also allows us to associate all the parts of genes with a gene.
sequence
SO:0000050
gene_part
true
A part of a gene, that has no other route in the ontology back to region. This concept is necessary for logical inference as these parts must have the properties of region. It also allows us to associate all the parts of genes with a gene.
SO:ke
A DNA sequence used experimentally to detect the presence or absence of a complementary nucleic acid.
http://en.wikipedia.org/wiki/Hybridization_probe
sequence
SO:0000051
probe
A DNA sequence used experimentally to detect the presence or absence of a complementary nucleic acid.
SO:ma
http://en.wikipedia.org/wiki/Hybridization_probe
wiki
sequence
assortment-derived_deficiency
SO:0000052
assortment_derived_deficiency
true
A sequence_variant_effect which changes the regulatory region of a gene.
SO:0001556
sequence variant affecting regulatory region
sequence
mutation affecting regulatory region
SO:0000053
OBSOLETE: This term was deleted as it conflated more than one term. The alteration is separate from the effect.
sequence_variant_affecting_regulatory_region
true
A sequence_variant_effect which changes the regulatory region of a gene.
SO:ke
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number.
http://en.wikipedia.org/wiki/Aneuploid
sequence
SO:0000054
aneuploid
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number.
SO:ke
http://en.wikipedia.org/wiki/Aneuploid
wiki
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number as extra chromosomes are present.
http://en.wikipedia.org/wiki/Hyperploid
sequence
SO:0000055
hyperploid
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number as extra chromosomes are present.
SO:ke
http://en.wikipedia.org/wiki/Hyperploid
wiki
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number as some chromosomes are missing.
http://en.wikipedia.org/wiki/Hypoploid
sequence
SO:0000056
hypoploid
A kind of chromosome variation where the chromosome complement is not an exact multiple of the haploid number as some chromosomes are missing.
SO:ke
http://en.wikipedia.org/wiki/Hypoploid
wiki
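The aneuploid, hyperploid and hypoploid definitions can be expressed as simple arithmetic on chromosome counts. The sketch below assumes a diploid reference (two sets) for deciding whether chromosomes are extra or missing; that reference, the "euploid" label for exact multiples, and the function name are assumptions that do not appear in the definitions.

```python
def classify_ploidy(chromosome_count, haploid_number, expected_sets=2):
    """Classify a chromosome complement relative to the haploid number.

    `expected_sets` (assumed to be 2, i.e. diploid) is the reference used to
    decide whether chromosomes are extra or missing.
    """
    if chromosome_count % haploid_number == 0:
        return "euploid"  # an exact multiple of the haploid number
    expected = haploid_number * expected_sets
    # Not an exact multiple, so the complement is aneuploid in one direction.
    return "hyperploid" if chromosome_count > expected else "hypoploid"


print(classify_ploidy(47, 23))  # one chromosome more than the diploid count
print(classify_ploidy(45, 23))  # one chromosome fewer than the diploid count
```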
A regulatory element of an operon to which activators or repressors bind, thereby affecting transcription of genes in that operon.
http://en.wikipedia.org/wiki/Operator_(biology)#Operator
operator segment
sequence
SO:0000057
Moved to transcriptional_cis_regulatory_region (SO:0001055) from gene_group_regulatory_region (SO:0000752) on 11 Feb 2021 when SO:0000752 was merged into SO:0001055. See GitHub Issue #529.
operator
A regulatory element of an operon to which activators or repressors bind, thereby affecting transcription of genes in that operon.
SO:ma
http://en.wikipedia.org/wiki/Operator_(biology)#Operator
wiki
sequence
assortment-derived_aneuploid
SO:0000058
assortment_derived_aneuploid
true
A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a nuclease.
nuclease binding site
sequence
SO:0000059
nuclease_binding_site
A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a nuclease.
SO:cb
One arm of a compound chromosome.
compound chromosome arm
sequence
SO:0000060
FLAG - this term should probably be a part_of rather than an is_a.
compound_chromosome_arm
A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a restriction enzyme.
restriction endonuclease binding site
restriction enzyme binding site
sequence
SO:0000061
A region of a molecule that binds to a restriction enzyme.
restriction_enzyme_binding_site
A binding site that, in the nucleotide molecule, interacts selectively and non-covalently with polypeptide residues of a restriction enzyme.
SO:cb
An intrachromosomal transposition in which one of the four broken ends loses a segment before re-joining.
deficient intrachromosomal transposition
sequence
SO:0000062
deficient_intrachromosomal_transposition
An intrachromosomal transposition in which one of the four broken ends loses a segment before re-joining.
FB:reference_manual
An interchromosomal transposition in which one of the four broken ends loses a segment before re-joining.
deficient interchromosomal transposition
sequence
SO:0000063
deficient_interchromosomal_transposition
An interchromosomal transposition in which one of the four broken ends loses a segment before re-joining.
SO:ke
sequence
SO:0000064
This class of attributes was added by MA to allow the broad description of genes based on qualities of the transcript(s). A product of SO meeting 2004.
gene_by_transcript_attribute
true
A chromosome structure variation whereby an arm exists as an individual chromosome element.
free chromosome arm
sequence
SO:0000065
free_chromosome_arm
A chromosome structure variation whereby an arm exists as an individual chromosome element.
SO:ke
sequence
SO:0000066
gene_by_polyadenylation_attribute
true
gene to gene feature
sequence
SO:0000067
gene_to_gene_feature
An attribute describing a gene that has a sequence that overlaps the sequence of another gene.
sequence
SO:0000068
overlapping
An attribute describing a gene that has a sequence that overlaps the sequence of another gene.
SO:ke
An attribute to describe a gene when it is located within the intron of another gene.
inside intron
sequence
SO:0000069
inside_intron
An attribute to describe a gene when it is located within the intron of another gene.
SO:ke
An attribute to describe a gene when it is located within the intron of another gene and on the opposite strand.
inside intron antiparallel
sequence
SO:0000070
inside_intron_antiparallel
An attribute to describe a gene when it is located within the intron of another gene and on the opposite strand.
SO:ke
An attribute to describe a gene when it is located within the intron of another gene and on the same strand.
inside intron parallel
sequence
SO:0000071
inside_intron_parallel
An attribute to describe a gene when it is located within the intron of another gene and on the same strand.
SO:ke
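The inside_intron, inside_intron_antiparallel and inside_intron_parallel attributes above can be distinguished from coordinates and strand alone. The following is a minimal sketch under stated assumptions: the function `classify_inside_intron` is hypothetical, all coordinates are inclusive positions on the same reference sequence, and strands are given as "+" or "-".

```python
from typing import Optional


def classify_inside_intron(
    gene_start: int,
    gene_end: int,
    gene_strand: str,
    intron_start: int,
    intron_end: int,
    host_strand: str,
) -> Optional[str]:
    """Return the attribute for a gene relative to an intron of another gene.

    Hypothetical helper: returns None when the gene is not fully contained
    in the intron, otherwise the parallel or antiparallel attribute name.
    """
    contained = intron_start <= gene_start and gene_end <= intron_end
    if not contained:
        return None
    if gene_strand == host_strand:
        return "inside_intron_parallel"  # same strand as the host gene
    return "inside_intron_antiparallel"  # opposite strand to the host gene
```

Both results are also cases of the more general inside_intron attribute, so a caller that does not track strand can treat any non-None result as inside_intron.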
sequence
SO:0000072
end_overlapping_gene
true
An attribute to describe a gene when the five prime region overlaps with another gene's 3' region.
five prime-three prime overlap
sequence
SO:0000073
five_prime_three_prime_overlap
An attribute to describe a gene when the five prime region overlaps with another gene's 3' region.
SO:ke
An attribute to describe a gene when the five prime region overlaps with another gene's five prime region.
five prime-five prime overlap
sequence
SO:0000074
five_prime_five_prime_overlap
An attribute to describe a gene when the five prime region overlaps with another gene's five prime region.
SO:ke
An attribute to describe a gene when the 3' region overlaps with another gene's 3' region.
three prime-three prime overlap
sequence
SO:0000075
three_prime_three_prime_overlap
An attribute to describe a gene when the 3' region overlaps with another gene's 3' region.
SO:ke
An attribute to describe a gene when the 3' region overlaps with another gene's 5' region.
5' 3' overlap
three prime five prime overlap
sequence
SO:0000076
three_prime_five_prime_overlap
An attribute to describe a gene when the 3' region overlaps with another gene's 5' region.
SO:ke
A region of sequence that is complementary to a sequence of messenger RNA.
http://en.wikipedia.org/wiki/Antisense
sequence
SO:0000077
antisense
A region of sequence that is complementary to a sequence of messenger RNA.
SO:ke
http://en.wikipedia.org/wiki/Antisense
wiki
A transcript that is polycistronic.
polycistronic transcript
sequence
SO:0000078
polycistronic_transcript
A transcript that is polycistronic.
SO:xp
A transcript that is dicistronic.
dicistronic transcript
sequence
SO:0000079
dicistronic_transcript
A transcript that is dicistronic.
SO:ke
A gene that is a member of an operon, which is a set of genes transcribed together as a unit.
operon member
sequence
SO:0000080
operon_member
gene array member
sequence
SO:0000081
gene_array_member
sequence
SO:0000082
processed_transcript_attribute
true
DNA belonging to the macronuclei of ciliates.
macronuclear sequence
sequence
SO:0000083
macronuclear_sequence
DNA belonging to the micronuclei of a cell.
micronuclear sequence
sequence
SO:0000084
micronuclear_sequence
sequence
SO:0000085
gene_by_genome_location
true
sequence
SO:0000086
gene_by_organelle_of_genome
true
A gene from nuclear sequence.
http://en.wikipedia.org/wiki/Nuclear_gene
nuclear gene
sequence
SO:0000087
nuclear_gene
A gene from nuclear sequence.
SO:xp
http://en.wikipedia.org/wiki/Nuclear_gene
wiki
A gene located in mitochondrial sequence.
http://en.wikipedia.org/wiki/Mitochondrial_gene
mitochondrial gene
mt gene
sequence
SO:0000088
mt_gene
A gene located in mitochondrial sequence.
SO:xp
http://en.wikipedia.org/wiki/Mitochondrial_gene
wiki
A gene located in kinetoplast sequence.
kinetoplast gene
sequence
SO:0000089
kinetoplast_gene
A gene located in kinetoplast sequence.
SO:xp
A gene from plastid sequence.
plastid gene
sequence
SO:0000090
plastid_gene
A gene from plastid sequence.
SO:xp
A gene from apicoplast sequence.
apicoplast gene
sequence
SO:0000091
apicoplast_gene
A gene from apicoplast sequence.
SO:xp
A gene from chloroplast sequence.
chloroplast gene
ct gene
sequence
SO:0000092
ct_gene
A gene from chloroplast sequence.
SO:xp
A gene from chromoplast sequence.
chromoplast gene
sequence
SO:0000093
chromoplast_gene
A gene from chromoplast sequence.
SO:xp
A gene from cyanelle sequence.
cyanelle gene
sequence
SO:0000094
cyanelle_gene
A gene from cyanelle sequence.
SO:xp
A plastid gene from leucoplast sequence.
leucoplast gene
sequence
SO:0000095
leucoplast_gene
A plastid gene from leucoplast sequence.
SO:xp
A gene from proplastid sequence.
proplastid gene
sequence
SO:0000096
proplastid_gene
A gene from proplastid sequence.
SO:ke
A gene from nucleomorph sequence.
nucleomorph gene
sequence
SO:0000097
nucleomorph_gene
A gene from nucleomorph sequence.
SO:xp
A gene from plasmid sequence.
plasmid gene
sequence
SO:0000098
plasmid_gene
A gene from plasmid sequence.
SO:xp
A gene from proviral sequence.
proviral gene
sequence
SO:0000099
proviral_gene
A gene from proviral sequence.
SO:xp
A proviral gene with origin endogenous retrovirus.
endogenous retroviral gene
sequence
SO:0000100
endogenous_retroviral_gene
A proviral gene with origin endogenous retrovirus.
SO:xp
A transposon or insertion sequence. An element that can insert in a variety of DNA sequences.
http://en.wikipedia.org/wiki/Transposable_element
transposable element
transposon
sequence
SO:0000101
transposable_element
A transposon or insertion sequence. An element that can insert in a variety of DNA sequences.
http://www.sci.sdsu.edu/~smaloy/Glossary/T.html
http://en.wikipedia.org/wiki/Transposable_element
wiki
A match to an EST or cDNA sequence.
expressed sequence match
sequence
SO:0000102
expressed_sequence_match
A match to an EST or cDNA sequence.
SO:ke
The end of the clone insert.
clone insert end
sequence
SO:0000103
clone_insert_end
The end of the clone insert.
SO:ke
A sequence of amino acids linked by peptide bonds which may lack appreciable tertiary structure and may not be liable to irreversible denaturation.
SO:0000358
http://en.wikipedia.org/wiki/Polypeptide
protein
sequence
SO:0000104
This term is mapped to MGED. Do not obsolete without consulting MGED ontology. The term 'protein' was merged with 'polypeptide'. Although 'protein' was a sequence_attribute and therefore meant to describe the quality rather than an actual feature, it was being used erroneously. It is replaced by 'peptidyl' as the polymer attribute.
polypeptide
A sequence of amino acids linked by peptide bonds which may lack appreciable tertiary structure and may not be liable to irreversible denaturation.
SO:ma
http://en.wikipedia.org/wiki/Polypeptide
wiki
A region of the chromosome between the centromere and the telomere. Human chromosomes have two arms, the p arm (short) and the q arm (long) which are separated from each other by the centromere.
chromosome arm
sequence
SO:0000105
chromosome_arm
A region of the chromosome between the centromere and the telomere. Human chromosomes have two arms, the p arm (short) and the q arm (long) which are separated from each other by the centromere.
http://www.medterms.com/script/main/art.asp?articlekey=5152
sequence
SO:0000106
non_capped_primary_transcript
true
A single stranded oligo used for polymerase chain reaction.
sequencing primer
sequence
SO:0000107
sequencing_primer
An mRNA with a frameshift.
frameshifted mRNA
mRNA with frameshift
sequence
SO:0000108
mRNA_with_frameshift
An mRNA with a frameshift.
SO:xp
A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration.
sequence
mutation
SO:0000109
sequence_variant_obs
true
A sequence_variant is a non exact copy of a sequence_feature or genome exhibiting one or more sequence_alteration.
SO:ke
Any extent of continuous biological sequence.
INSDC_feature:misc_feature
INSDC_note:other
INSDC_note:sequence_feature
located_sequence_feature
sequence feature
sequence
located sequence feature
SO:0000110
sequence_feature
Any extent of continuous biological sequence.
LAMHDI:mb
SO:ke
A gene encoded within a transposable element. For example gag, int, env and pol are the transposable element genes of the TY element in yeast.
transposable element gene
sequence
SO:0000111
transposable_element_gene
A gene encoded within a transposable element. For example gag, int, env and pol are the transposable element genes of the TY element in yeast.
SO:ke
An oligo to which new deoxyribonucleotides can be added by DNA polymerase.
http://en.wikipedia.org/wiki/Primer_(molecular_biology)
DNA primer
primer oligonucleotide
primer polynucleotide
primer sequence
sequence
SO:0000112
primer
An oligo to which new deoxyribonucleotides can be added by DNA polymerase.
SO:ke
http://en.wikipedia.org/wiki/Primer_(molecular_biology)
wiki
A viral sequence which has integrated into a host genome.
proviral region
sequence
proviral sequence
SO:0000113
proviral_region
A viral sequence which has integrated into a host genome.
SO:ke
A methylated deoxy-cytosine.
methylated C
methylated cytosine
methylated cytosine base
methylated cytosine residue
methylated_C
sequence
SO:0000114
methylated_cytosine
A methylated deoxy-cytosine.
SO:ke
sequence
SO:0000115
transcript_feature
true
An attribute describing a sequence that is modified by editing.
sequence
SO:0000116
edited
An attribute describing a sequence that is modified by editing.
SO:ke
sequence
SO:0000117
transcript_with_readthrough_stop_codon
true
A transcript with a translational frameshift.
transcript with translational frameshift
sequence
SO:0000118
transcript_with_translational_frameshift
A transcript with a translational frameshift.
SO:xp
An attribute to describe a sequence that is regulated.
sequence
SO:0000119
regulated
An attribute to describe a sequence that is regulated.
SO:ke
A primary transcript that, at least in part, encodes one or more proteins.
protein coding primary transcript
sequence
pre mRNA
SO:0000120
May contain introns.
protein_coding_primary_transcript
A primary transcript that, at least in part, encodes one or more proteins.
SO:ke
A single stranded oligo used for polymerase chain reaction.
DNA forward primer
forward DNA primer
forward primer
forward primer oligo
forward primer oligonucleotide
forward primer polynucleotide
forward primer sequence
sequence
SO:0000121
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
forward_primer
A single stranded oligo used for polymerase chain reaction.
http://mged.sourceforge.net/ontologies/MGEDontology.php
A folded RNA sequence.
RNA sequence secondary structure
sequence
SO:0000122
RNA_sequence_secondary_structure
A folded RNA sequence.
SO:ke
An attribute describing a gene that is regulated at transcription.
transcriptionally regulated
sequence
SO:0000123
By:<protein_id>.
transcriptionally_regulated
An attribute describing a gene that is regulated at transcription.
SO:ma
Expressed in relatively constant amounts without regard to cellular environmental conditions such as the concentration of a particular substrate.
transcriptionally constitutive
sequence
SO:0000124
transcriptionally_constitutive
Expressed in relatively constant amounts without regard to cellular environmental conditions such as the concentration of a particular substrate.
SO:ke
An inducer molecule is required for transcription to occur.
transcriptionally induced
sequence
SO:0000125
transcriptionally_induced
An inducer molecule is required for transcription to occur.
SO:ke
A repressor molecule is required for transcription to stop.
transcriptionally repressed
sequence
SO:0000126
transcriptionally_repressed
A repressor molecule is required for transcription to stop.
SO:ke
A gene that is silenced.
silenced gene
sequence
SO:0000127
silenced_gene
A gene that is silenced.
SO:xp
A gene that is silenced by DNA modification.
gene silenced by DNA modification
sequence
SO:0000128
gene_silenced_by_DNA_modification
A gene that is silenced by DNA modification.
SO:xp
A gene that is silenced by DNA methylation.
gene silenced by DNA methylation
methylation-silenced gene
sequence
SO:0000129
gene_silenced_by_DNA_methylation
A gene that is silenced by DNA methylation.
SO:xp
An attribute describing a gene that is regulated after it has been translated.
post translationally regulated
post-translationally regulated
sequence
SO:0000130
post_translationally_regulated
An attribute describing a gene that is regulated after it has been translated.
SO:ke
An attribute describing a gene that is regulated as it is translated.
translationally regulated
sequence
SO:0000131
translationally_regulated
An attribute describing a gene that is regulated as it is translated.
SO:ke
A single stranded oligo used for polymerase chain reaction.
DNA reverse primer
reverse DNA primer
reverse primer
reverse primer oligo
reverse primer oligonucleotide
reverse primer sequence
sequence
SO:0000132
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
reverse_primer
A single stranded oligo used for polymerase chain reaction.
http://mged.sourceforge.net/ontologies/MGEDontology.php
This attribute describes a gene where heritable changes other than those in the DNA sequence occur. These changes include: modification to the DNA (such as DNA methylation, the covalent modification of cytosine), and post-translational modification of histones.
epigenetically modified
sequence
SO:0000133
epigenetically_modified
This attribute describes a gene where heritable changes other than those in the DNA sequence occur. These changes include: modification to the DNA (such as DNA methylation, the covalent modification of cytosine), and post-translational modification of histones.
SO:ke
Imprinted genes are epigenetically modified genes that are expressed monoallelically according to their parent of origin.
imprinted
http://en.wikipedia.org/wiki/Genomic_imprinting
genomically imprinted
sequence
SO:0000134
genomically_imprinted
Imprinted genes are epigenetically modified genes that are expressed monoallelically according to their parent of origin.
SO:ke
http://en.wikipedia.org/wiki/Genomic_imprinting
wiki
The maternal copy of the gene is modified, rendering it transcriptionally silent.
maternally imprinted
sequence
SO:0000135
maternally_imprinted
The maternal copy of the gene is modified, rendering it transcriptionally silent.
SO:ke
The paternal copy of the gene is modified, rendering it transcriptionally silent.
paternally imprinted
sequence
SO:0000136
paternally_imprinted
The paternal copy of the gene is modified, rendering it transcriptionally silent.
SO:ke
Allelic exclusion is a process occurring in diploid organisms, where a gene is inactivated and not expressed in that cell.
allelically excluded
sequence
SO:0000137
Examples are x-inactivation and immunoglobulin formation.
allelically_excluded
Allelic exclusion is a process occurring in diploid organisms, where a gene is inactivated and not expressed in that cell.
SO:ke
An epigenetically modified gene, rearranged at the DNA level.
gene rearranged at DNA level
sequence
SO:0000138
gene_rearranged_at_DNA_level
An epigenetically modified gene, rearranged at the DNA level.
SO:xp
Region in mRNA where ribosome assembles.
INSDC_feature:regulatory
INSDC_qualifier:ribosome_binding_site
ribosome entry site
sequence
SO:0000139
ribosome_entry_site
Region in mRNA where ribosome assembles.
SO:ke
A sequence segment located within the five prime end of an mRNA that causes premature termination of translation.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Attenuator
INSDC_qualifier:attenuator
attenuator sequence
sequence
SO:0000140
attenuator
A sequence segment located within the five prime end of an mRNA that causes premature termination of translation.
SO:as
http://en.wikipedia.org/wiki/Attenuator
wiki
The sequence of DNA located at the end of the transcript that causes RNA polymerase to terminate transcription.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Terminator_(genetics)
INSDC_qualifier:terminator
terminator sequence
sequence
SO:0000141
Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527.
terminator
The sequence of DNA located at the end of the transcript that causes RNA polymerase to terminate transcription.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Terminator_(genetics)
wiki
A folded DNA sequence.
DNA sequence secondary structure
sequence
SO:0000142
DNA_sequence_secondary_structure
A folded DNA sequence.
SO:ke
A region of known length which may be used to manufacture a longer region.
assembly component
sequence
SO:0000143
assembly_component
A region of known length which may be used to manufacture a longer region.
SO:ke
sequence
SO:0000144
primary_transcript_attribute
true
A codon that has been redefined at translation. The redefinition may be as a result of translational bypass, translational frameshifting or stop codon readthrough.
recoded codon
sequence
SO:0000145
recoded_codon
A codon that has been redefined at translation. The redefinition may be as a result of translational bypass, translational frameshifting or stop codon readthrough.
SO:xp
An attribute describing when a sequence, usually an mRNA, is capped by the addition of a modified guanine nucleotide at the 5' end.
sequence
SO:0000146
capped
An attribute describing when a sequence, usually an mRNA, is capped by the addition of a modified guanine nucleotide at the 5' end.
SO:ke
A region of the transcript sequence within a gene which is not removed from the primary RNA transcript by RNA splicing.
http://en.wikipedia.org/wiki/Exon
INSDC_feature:exon
sequence
SO:0000147
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
exon
A region of the transcript sequence within a gene which is not removed from the primary RNA transcript by RNA splicing.
SO:ke
http://en.wikipedia.org/wiki/Exon
wiki
One or more contigs that have been ordered and oriented using end-read information. Contains gaps that are filled with N's.
sequence
scaffold
SO:0000148
supercontig
One or more contigs that have been ordered and oriented using end-read information. Contains gaps that are filled with N's.
SO:ls
A contiguous sequence derived from sequence assembly. Has no gaps, but may contain N's from unavailable bases.
http://en.wikipedia.org/wiki/Contig
sequence
SO:0000149
contig
A contiguous sequence derived from sequence assembly. Has no gaps, but may contain N's from unavailable bases.
SO:ls
http://en.wikipedia.org/wiki/Contig
wiki
A sequence obtained from a single sequencing experiment. Typically a read is produced when a base calling program interprets information from a chromatogram trace file produced from a sequencing machine.
sequence
SO:0000150
read
A sequence obtained from a single sequencing experiment. Typically a read is produced when a base calling program interprets information from a chromatogram trace file produced from a sequencing machine.
SO:rd
A piece of DNA that has been inserted in a vector so that it can be propagated in a host bacterium or some other organism.
http://en.wikipedia.org/wiki/Clone_(genetics)
sequence
SO:0000151
clone
A piece of DNA that has been inserted in a vector so that it can be propagated in a host bacterium or some other organism.
SO:ke
http://en.wikipedia.org/wiki/Clone_(genetics)
wiki
Yeast Artificial Chromosome, a vector constructed from the telomeric, centromeric, and replication origin sequences needed for replication in yeast cells.
yeast artificial chromosome
sequence
SO:0000152
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
YAC
Yeast Artificial Chromosome, a vector constructed from the telomeric, centromeric, and replication origin sequences needed for replication in yeast cells.
SO:ma
Bacterial Artificial Chromosome, a cloning vector that can be propagated as mini-chromosomes in a bacterial host.
bacterial artificial chromosome
sequence
SO:0000153
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
BAC
Bacterial Artificial Chromosome, a cloning vector that can be propagated as mini-chromosomes in a bacterial host.
SO:ma
P1-derived artificial chromosomes are DNA constructs that are derived from the DNA of P1 bacteriophage. They can carry large amounts (about 100-300 kilobases) of other sequences for a variety of bioengineering purposes. It is one type of vector used to clone DNA fragments (100- to 300-kb insert size; average, 150 kb) in Escherichia coli cells.
http://en.wikipedia.org/wiki/P1-derived_artificial_chromosome
P1
P1 artificial chromosome
sequence
SO:0000154
This term is mapped to MGED. Do not obsolete without consulting MGED ontology. Drosophila melanogaster PACs carry an average insert size of 80 kb. The library represents a 6-fold coverage of the genome.
PAC
P1-derived artificial chromosomes are DNA constructs that are derived from the DNA of P1 bacteriophage. They can carry large amounts (about 100-300 kilobases) of other sequences for a variety of bioengineering purposes. It is one type of vector used to clone DNA fragments (100- to 300-kb insert size; average, 150 kb) in Escherichia coli cells.
http://en.wikipedia.org/wiki/P1-derived_artificial_chromosome
http://en.wikipedia.org/wiki/P1-derived_artificial_chromosome
wiki
A self-replicating, often circular nucleic acid molecule that uses the host's cellular machinery to replicate and is distinct from a chromosome in the organism.
plasmid sequence
sequence
SO:0000155
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
plasmid
A self-replicating, often circular nucleic acid molecule that uses the host's cellular machinery to replicate and is distinct from a chromosome in the organism.
SO:ma
A cloning vector that is a hybrid of lambda phages and a plasmid that can be propagated as a plasmid or packaged as a phage, since they retain the lambda cos sites.
http://en.wikipedia.org/wiki/Cosmid
cosmid vector
sequence
SO:0000156
Paper: Evans GA et al. High efficiency vectors for cosmid microcloning and genomic analysis. Gene 1989; 79(1):9-20. This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
cosmid
A cloning vector that is a hybrid of lambda phages and a plasmid that can be propagated as a plasmid or packaged as a phage, since they retain the lambda cos sites.
SO:ma
http://en.wikipedia.org/wiki/Cosmid
wiki
A plasmid which carries within its sequence a bacteriophage replication origin. When the host bacterium is infected with "helper" phage, a phagemid is replicated along with the phage DNA and packaged into phage capsids.
http://en.wikipedia.org/wiki/Phagemid
sequence
phagemid vector
SO:0000157
phagemid
A plasmid which carries within its sequence a bacteriophage replication origin. When the host bacterium is infected with "helper" phage, a phagemid is replicated along with the phage DNA and packaged into phage capsids.
SO:ma
http://en.wikipedia.org/wiki/Phagemid
wiki
A cloning vector that utilizes the E. coli F factor.
http://en.wikipedia.org/wiki/Fosmid
sequence
fosmid vector
SO:0000158
Birren BW et al. A human chromosome 22 fosmid resource: mapping and analysis of 96 clones. Genomics 1996.
fosmid
A cloning vector that utilizes the E. coli F factor.
SO:ma
http://en.wikipedia.org/wiki/Fosmid
wiki
The point at which one or more contiguous nucleotides were excised.
SO:1000033
http://en.wikipedia.org/wiki/Nucleotide_deletion
loinc:LA6692-3
deleted_sequence
nucleotide deletion
nucleotide_deletion
sequence
SO:0000159
deletion
The point at which one or more contiguous nucleotides were excised.
SO:ke
http://en.wikipedia.org/wiki/Nucleotide_deletion
wiki
loinc:LA6692-3
Deletion
A linear clone derived from lambda bacteriophage. The genes involved in the lysogenic pathway are removed from the viral DNA. Up to 25 kb of foreign DNA can then be inserted into the lambda genome.
sequence
SO:0000160
lambda_clone
true
A linear clone derived from lambda bacteriophage. The genes involved in the lysogenic pathway are removed from the viral DNA. Up to 25 kb of foreign DNA can then be inserted into the lambda genome.
ISBN:0-1767-2380-8
A modified base in which adenine has been methylated.
methylated A
methylated adenine
methylated adenine base
methylated adenine residue
methylated_A
sequence
SO:0000161
methylated_adenine
A modified base in which adenine has been methylated.
SO:ke
Consensus region of primary transcript bordering junction of splicing. A region that overlaps exactly 2 bases and is adjacent_to splice_junction.
http://en.wikipedia.org/wiki/Splice_site
splice site
sequence
SO:0000162
With spliceosomal introns, the splice sites bind the spliceosomal machinery.
splice_site
Consensus region of primary transcript bordering junction of splicing. A region that overlaps exactly 2 bases and is adjacent_to splice_junction.
SO:cjm
SO:ke
http://en.wikipedia.org/wiki/Splice_site
wiki
Intronic 2 bp region bordering the exon, at the 5' edge of the intron. A splice_site that is downstream_adjacent_to exon and starts intron.
5' splice site
donor splice site
five prime splice site
splice donor site
sequence
donor
SO:0000163
five_prime_cis_splice_site
Intronic 2 bp region bordering the exon, at the 5' edge of the intron. A splice_site that is downstream_adjacent_to exon and starts intron.
SO:cjm
SO:ke
http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html
Intronic 2 bp region bordering the exon, at the 3' edge of the intron. A splice_site that is upstream_adjacent_to exon and finishes intron.
acceptor splice site
splice acceptor site
three prime splice site
sequence
3' splice site
acceptor
SO:0000164
three_prime_cis_splice_site
Intronic 2 bp region bordering the exon, at the 3' edge of the intron. A splice_site that is upstream_adjacent_to exon and finishes intron.
SO:cjm
SO:ke
http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html
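The two cis splice site definitions above place a 2 bp intronic region at each edge of the intron. The following is a minimal sketch of how those regions could be derived from intron coordinates. The function `cis_splice_sites` is hypothetical, and it assumes 0-based half-open coordinates given on the forward strand of the reference.

```python
from typing import Tuple

Interval = Tuple[int, int]


def cis_splice_sites(
    intron_start: int, intron_end: int, strand: str = "+"
) -> Tuple[Interval, Interval]:
    """Return (five_prime_cis_splice_site, three_prime_cis_splice_site).

    Hypothetical helper. Coordinates are assumed to be 0-based, half-open,
    on the forward strand. Each site is the 2 bp intronic region at one
    edge of the intron.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if intron_end - intron_start < 4:
        raise ValueError("intron too short to hold two non-overlapping 2 bp sites")
    left = (intron_start, intron_start + 2)
    right = (intron_end - 2, intron_end)
    # On the reverse strand the 5' edge of the intron is the rightmost one.
    return (left, right) if strand == "+" else (right, left)
```

For a forward-strand sequence such as `"AAG" + "GTAAGTTTTCAG" + "GG"`, with the intron at `[3, 15)`, the returned intervals cover the `GT` at the start of the intron and the `AG` at its end.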
A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Enhancer_(genetics)
INSDC_qualifier:enhancer
sequence
SO:0000165
An enhancer may participate in an enhanceosome GO:0034206. A protein-DNA complex formed by the association of a distinct set of general and specific transcription factors with a region of enhancer DNA. The cooperative assembly of an enhanceosome confers specificity of transcriptional regulation. This comment is a place holder should we start to make cross products with GO.
enhancer
A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Enhancer_(genetics)
wiki
An enhancer bound by a factor.
enhancer bound by factor
sequence
SO:0000166
enhancer_bound_by_factor
An enhancer bound by a factor.
SO:xp
A regulatory_region composed of the TSS(s) and binding sites for TF_complexes of the core transcription machinery. A region (DNA) to which RNA polymerase binds, to begin transcription.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Promoter
INSDC_qualifier:promoter
promoter sequence
sequence
SO:0000167
This term is mapped to MGED. Do not obsolete without consulting MGED ontology. The region on a DNA molecule involved in RNA polymerase binding to initiate transcription. Moved from is_a: SO:0001055 transcriptional_cis_regulatory_region as per request from GREEKC initiative in August 2020. Merged with RNA_polymerase_promoter (SO:0001203) Aug 2020. Moved up one level from is_a CRM (SO:0000727) to is_a transcriptional_cis_regulatory_region (SO:0001055) as part of the GREEKC work January 2021. Pascale Gaudet from Gene Ontology pointed out that CRM can be located upstream of the promoter and therefore cannot include the promoter.
promoter
A regulatory_region composed of the TSS(s) and binding sites for TF_complexes of the core transcription machinery. A region (DNA) to which RNA polymerase binds, to begin transcription.
SO:regcreative
http://en.wikipedia.org/wiki/Promoter
wiki
A specific nucleotide sequence of DNA at or near which a particular restriction enzyme cuts the DNA.
sequence
SO:0000168
restriction_enzyme_cut_site
true
A specific nucleotide sequence of DNA at or near which a particular restriction enzyme cuts the DNA.
SO:ma
A DNA sequence in eukaryotic DNA to which RNA polymerase I binds, to begin transcription.
RNA polymerase A promoter
RNApol I promoter
pol I promoter
polymerase I promoter
sequence
SO:0000169
parent term RNA_polymerase_promoter SO:0001203 was obsoleted in Aug 2020, so term has been moved to eukaryotic_promoter SO:0002221.
RNApol_I_promoter
A DNA sequence in eukaryotic DNA to which RNA polymerase I binds, to begin transcription.
SO:ke
A DNA sequence in eukaryotic DNA to which RNA polymerase II binds, to begin transcription.
RNA polymerase B promoter
RNApol II promoter
polymerase II promoter
sequence
pol II promoter
SO:0000170
parent term RNA_polymerase_promoter SO:0001203 was obsoleted in Aug 2020, so term has been moved to eukaryotic_promoter SO:0002221.
RNApol_II_promoter
A DNA sequence in eukaryotic DNA to which RNA polymerase II binds, to begin transcription.
SO:ke
A DNA sequence in eukaryotic DNA to which RNA polymerase III binds, to begin transcription.
RNA polymerase C promoter
RNApol III promoter
pol III promoter
polymerase III promoter
sequence
SO:0000171
parent term RNA_polymerase_promoter SO:0001203 was obsoleted in Aug 2020, so term has been moved to eukaryotic_promoter SO:0002221.
RNApol_III_promoter
A DNA sequence in eukaryotic DNA to which RNA polymerase III binds, to begin transcription.
SO:ke
Part of a conserved sequence located about 75-bp upstream of the start point of eukaryotic transcription units which may be involved in RNA polymerase binding; consensus=GG(C|T)CAATCT.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/CAAT_box
CAAT box
CAAT signal
CAAT-box
INSDC_qualifier:CAAT_signal
sequence
SO:0000172
CAAT_signal
Part of a conserved sequence located about 75-bp upstream of the start point of eukaryotic transcription units which may be involved in RNA polymerase binding; consensus=GG(C|T)CAATCT.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/CAAT_box
wiki
A conserved GC-rich region located upstream of the start point of eukaryotic transcription units which may occur in multiple copies or in either orientation; consensus=GGGCGG.
INSDC_feature:regulatory
GC rich promoter region
GC-rich region
INSDC_qualifier:GC_rich_promoter_region
sequence
SO:0000173
GC_rich_promoter_region
A conserved GC-rich region located upstream of the start point of eukaryotic transcription units which may occur in multiple copies or in either orientation; consensus=GGGCGG.
http://www.insdc.org/files/feature_table.html
A conserved AT-rich septamer found about 25-bp before the start point of many eukaryotic RNA polymerase II transcript units; may be involved in positioning the enzyme for correct initiation; consensus=TATA(A|T)A(A|T).
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/TATA_box
Goldstein-Hogness box
INSDC_qualifier:TATA_box
TATA box
sequence
SO:0000174
Binds TBP.
TATA_box
A conserved AT-rich septamer found about 25-bp before the start point of many eukaryotic RNA polymerase II transcript units; may be involved in positioning the enzyme for correct initiation; consensus=TATA(A|T)A(A|T).
PMID:16858867
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/TATA_box
wiki
A conserved region about 10-bp upstream of the start point of bacterial transcription units which may be involved in binding RNA polymerase; consensus=TAtAaT. This region is associated with sigma factor 70.
INSDC_feature:regulatory
http://en.wikipedia.org/wiki/Pribnow_box
-10 signal
INSDC_qualifier:minus_10_signal
Pribnow Schaller box
Pribnow box
Pribnow-Schaller box
minus 10 signal
sequence
SO:0000175
Changed from is_a SO:0000713 DNA_motif to is_a SO:0002312 core_prokaryotic_promoter_element in response to GREEKC Initiative Dave Sant Aug 2020. Changed from is_a SO:0002312 core_prokaryotic_promoter_element back to is_a SO:0000713 DNA_motif to be consistent with minus_12_signal and minus_24_signal on 12 July 2021.
minus_10_signal
A conserved region about 10-bp upstream of the start point of bacterial transcription units which may be involved in binding RNA polymerase; consensus=TAtAaT. This region is associated with sigma factor 70.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Pribnow_box
wiki
A conserved hexamer about 35-bp upstream of the start point of bacterial transcription units; consensus=TTGACa or TGTTGACA. This region is associated with sigma factor 70.
INSDC_feature:regulatory
-35 signal
INSDC_qualifier:minus_35_signal
minus 35 signal
sequence
SO:0000176
Changed from is_a SO:0000713 DNA_motif to is_a SO:0002312 core_prokaryotic_promoter_element in response to GREEKC Initiative Dave Sant Aug 2020. Changed from is_a SO:0002312 core_prokaryotic_promoter_element back to is_a SO:0000713 DNA_motif to be consistent with minus_12_signal and minus_24_signal on 12 July 2021.
minus_35_signal
A conserved hexamer about 35-bp upstream of the start point of bacterial transcription units; consensus=TTGACa or TGTTGACA. This region is associated with sigma factor 70.
http://www.insdc.org/files/feature_table.html
A nucleotide match against a sequence from another organism.
cross genome match
sequence
SO:0000177
cross_genome_match
A nucleotide match against a sequence from another organism.
SO:ma
The DNA region of a group of adjacent genes whose transcription is coordinated on one or several mutually overlapping transcription units transcribed in the same direction and sharing at least one gene.
http://en.wikipedia.org/wiki/Operon
INSDC_feature:operon
sequence
SO:0000178
This term is mapped to MGED. Do not obsolete without consulting MGED ontology. Definition updated per Mejia-Almonte et al., "Redefining fundamental concepts of transcription initiation in prokaryotes", Aug 5 2020.
operon
The DNA region of a group of adjacent genes whose transcription is coordinated on one or several mutually overlapping transcription units transcribed in the same direction and sharing at least one gene.
SO:ma
http://en.wikipedia.org/wiki/Operon
wiki
The start of the clone insert.
clone insert start
sequence
SO:0000179
clone_insert_start
The start of the clone insert.
SO:ke
A transposable element that is incorporated into a chromosome by a mechanism that requires reverse transcriptase.
http://en.wikipedia.org/wiki/Retrotransposon
class I transposon
retrotransposon element
sequence
class I
SO:0000180
retrotransposon
A transposable element that is incorporated into a chromosome by a mechanism that requires reverse transcriptase.
http://www.dddmag.com/Glossary.aspx#r
http://en.wikipedia.org/wiki/Retrotransposon
wiki
A match against a translated sequence.
translated nucleotide match
sequence
SO:0000181
translated_nucleotide_match
A match against a translated sequence.
SO:ke
A transposon where the mechanism of transposition is via a DNA intermediate.
DNA transposon
class II transposon
sequence
class II
SO:0000182
DNA_transposon
A transposon where the mechanism of transposition is via a DNA intermediate.
SO:ke
A region of the gene which is not transcribed.
non transcribed region
non-transcribed sequence
nontranscribed region
nontranscribed sequence
sequence
SO:0000183
non_transcribed_region
A region of the gene which is not transcribed.
SO:ke
A major type of spliceosomal intron spliced by the U2 spliceosome, that includes U1, U2, U4/U6 and U5 snRNAs.
U2 intron
sequence
SO:0000184
May have either GT-AG or AT-AG 5' and 3' boundaries.
U2_intron
A major type of spliceosomal intron spliced by the U2 spliceosome, that includes U1, U2, U4/U6 and U5 snRNAs.
PMID:9428511
A transcript that in its initial state requires modification to be functional.
http://en.wikipedia.org/wiki/Primary_transcript
INSDC_feature:precursor_RNA
INSDC_feature:prim_transcript
precursor RNA
primary transcript
sequence
SO:0000185
primary_transcript
A transcript that in its initial state requires modification to be functional.
SO:ma
http://en.wikipedia.org/wiki/Primary_transcript
wiki
A retrotransposon flanked by long terminal repeat sequences.
LTR retrotransposon
long terminal repeat retrotransposon
sequence
SO:0000186
LTR_retrotransposon
A retrotransposon flanked by long terminal repeat sequences.
SO:ke
A group of characterized repeat sequences.
sequence
SO:0000187
repeat_family
true
A group of characterized repeat sequences.
SO:ke
A region of a primary transcript that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it.
http://en.wikipedia.org/wiki/Intron
INSDC_feature:intron
sequence
SO:0000188
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
intron
A region of a primary transcript that is transcribed, but removed from within the transcript by splicing together the sequences (exons) on either side of it.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Intron
wiki
A retrotransposon without long terminal repeat sequences.
non LTR retrotransposon
sequence
SO:0000189
non_LTR_retrotransposon
A retrotransposon without long terminal repeat sequences.
SO:ke
An intron that is the most 5-prime in a given transcript.
5' intron
5' intron sequence
five prime intron
sequence
SO:0000190
five_prime_intron
An intron that is not the most 3-prime or the most 5-prime in a given transcript.
interior intron
sequence
SO:0000191
interior_intron
An intron that is the most 3-prime in a given transcript.
3' intron
three prime intron
sequence
3' intron sequence
SO:0000192
three_prime_intron
A DNA fragment used as a reagent to detect the polymorphic genomic loci by hybridizing against the genomic DNA digested with a given restriction enzyme.
http://en.wikipedia.org/wiki/Restriction_fragment_length_polymorphism
RFLP
RFLP fragment
restriction fragment length polymorphism
sequence
SO:0000193
RFLP_fragment
A DNA fragment used as a reagent to detect the polymorphic genomic loci by hybridizing against the genomic DNA digested with a given restriction enzyme.
GOC:pj
http://en.wikipedia.org/wiki/Restriction_fragment_length_polymorphism
wiki
A dispersed repeat family with many copies, each from 1 to 6 kb long. New elements are generated by retroposition of a transcribed copy. Typically the LINE contains two ORFs, one of which encodes reverse transcriptase, and 3' and 5' direct repeats.
LINE
LINE element
Long interspersed element
Long interspersed nuclear element
sequence
SO:0000194
LINE_element
A dispersed repeat family with many copies, each from 1 to 6 kb long. New elements are generated by retroposition of a transcribed copy. Typically the LINE contains two ORFs, one of which encodes reverse transcriptase, and 3' and 5' direct repeats.
http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html
An exon whereby at least one base is part of a codon (here, 'codon' is inclusive of the stop_codon).
coding exon
sequence
SO:0000195
coding_exon
An exon whereby at least one base is part of a codon (here, 'codon' is inclusive of the stop_codon).
SO:ke
The sequence of the five_prime_coding_exon that codes for protein.
five prime exon coding region
sequence
SO:0000196
five_prime_coding_exon_coding_region
The sequence of the five_prime_coding_exon that codes for protein.
SO:cjm
The sequence of the three_prime_coding_exon that codes for protein.
three prime exon coding region
sequence
SO:0000197
three_prime_coding_exon_coding_region
The sequence of the three_prime_coding_exon that codes for protein.
SO:cjm
An exon that does not contain any codons.
noncoding exon
sequence
SO:0000198
noncoding_exon
An exon that does not contain any codons.
SO:ke
A region of nucleotide sequence that has translocated to a new position. The observed adjacency of two previously separated regions.
translocated sequence
sequence
transchr
SO:0000199
translocation
A region of nucleotide sequence that has translocated to a new position. The observed adjacency of two previously separated regions.
NCBI:th
SO:ke
transchr
http://www.ncbi.nlm.nih.gov/dbvar/
The 5' most coding exon.
5' coding exon
five prime coding exon
sequence
SO:0000200
five_prime_coding_exon
The 5' most coding exon.
SO:ke
An exon that is bounded by 5' and 3' splice sites.
interior exon
sequence
SO:0000201
interior_exon
An exon that is bounded by 5' and 3' splice sites.
PMID:10373547
The coding exon that is most 3-prime on a given transcript.
three prime coding exon
sequence
3' coding exon
SO:0000202
three_prime_coding_exon
The coding exon that is most 3-prime on a given transcript.
SO:ma
Messenger RNA sequences that are untranslated and lie five prime or three prime to sequences which are translated.
untranslated region
sequence
SO:0000203
UTR
Messenger RNA sequences that are untranslated and lie five prime or three prime to sequences which are translated.
SO:ke
A region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein.
http://en.wikipedia.org/wiki/5'_UTR
5' UTR
INSDC_feature:5'UTR
five prime UTR
five_prime_untranslated_region
sequence
SO:0000204
five_prime_UTR
A region at the 5' end of a mature transcript (preceding the initiation codon) that is not translated into a protein.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/5'_UTR
wiki
A region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein.
http://en.wikipedia.org/wiki/Three_prime_untranslated_region
INSDC_feature:3'UTR
three prime UTR
three prime untranslated region
sequence
SO:0000205
three_prime_UTR
A region at the 3' end of a mature transcript (following the stop codon) that is not translated into a protein.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Three_prime_untranslated_region
wiki
A repetitive element, a few hundred base pairs long, that is dispersed throughout the genome. A common human SINE is the Alu element.
http://en.wikipedia.org/wiki/Short_interspersed_nuclear_element
SINE element
Short interspersed element
Short interspersed nuclear element
sequence
SO:0000206
SINE_element
A repetitive element, a few hundred base pairs long, that is dispersed throughout the genome. A common human SINE is the Alu element.
SO:ke
http://en.wikipedia.org/wiki/Short_interspersed_nuclear_element
wiki
SSLPs are a kind of sequence alteration where the number of repeated sequences in intergenic regions may differ.
http://en.wikipedia.org/wiki/Simple_sequence_length_polymorphism
simple sequence length variation
sequence
SSLP
simple sequence length polymorphism
SO:0000207
simple_sequence_length_variation
SSLPs are a kind of sequence alteration where the number of repeated sequences in intergenic regions may differ.
SO:ke
http://en.wikipedia.org/wiki/Simple_sequence_length_polymorphism
wiki
A DNA transposable element defined as having termini with perfect, or nearly perfect short inverted repeats, generally 10 - 40 nucleotides long.
TIR element
terminal inverted repeat element
sequence
SO:0000208
terminal_inverted_repeat_element
A DNA transposable element defined as having termini with perfect, or nearly perfect short inverted repeats, generally 10 - 40 nucleotides long.
http://www.genetics.org/cgi/reprint/156/4/1983.pdf
A primary transcript encoding a ribosomal RNA.
rRNA primary transcript
ribosomal RNA primary transcript
sequence
SO:0000209
rRNA_primary_transcript
A primary transcript encoding a ribosomal RNA.
SO:ke
A primary transcript encoding a transfer RNA (SO:0000253).
tRNA primary transcript
sequence
SO:0000210
tRNA_primary_transcript
A primary transcript encoding a transfer RNA (SO:0000253).
SO:ke
A primary transcript encoding alanyl tRNA.
alanine tRNA primary transcript
sequence
SO:0000211
alanine_tRNA_primary_transcript
A primary transcript encoding alanyl tRNA.
SO:ke
A primary transcript encoding arginyl tRNA (SO:0001036).
arginine tRNA primary transcript
sequence
SO:0000212
arginine_tRNA_primary_transcript
A primary transcript encoding arginyl tRNA (SO:0001036).
SO:ke
A primary transcript encoding asparaginyl tRNA (SO:0000256).
asparagine tRNA primary transcript
sequence
SO:0000213
asparagine_tRNA_primary_transcript
A primary transcript encoding asparaginyl tRNA (SO:0000256).
SO:ke
A primary transcript encoding aspartyl tRNA (SO:0000257).
aspartic acid tRNA primary transcript
sequence
SO:0000214
aspartic_acid_tRNA_primary_transcript
A primary transcript encoding aspartyl tRNA (SO:0000257).
SO:ke
A primary transcript encoding cysteinyl tRNA (SO:0000258).
cysteine tRNA primary transcript
sequence
SO:0000215
cysteine_tRNA_primary_transcript
A primary transcript encoding cysteinyl tRNA (SO:0000258).
SO:ke
A primary transcript encoding glutamyl tRNA (SO:0000260).
glutamic acid tRNA primary transcript
sequence
SO:0000216
glutamic_acid_tRNA_primary_transcript
A primary transcript encoding glutamyl tRNA (SO:0000260).
SO:ke
A primary transcript encoding glutaminyl tRNA (SO:0000259).
glutamine tRNA primary transcript
sequence
SO:0000217
glutamine_tRNA_primary_transcript
A primary transcript encoding glutaminyl tRNA (SO:0000259).
SO:ke
A primary transcript encoding glycyl tRNA (SO:0000261).
glycine tRNA primary transcript
sequence
SO:0000218
glycine_tRNA_primary_transcript
A primary transcript encoding glycyl tRNA (SO:0000261).
SO:ke
A primary transcript encoding histidyl tRNA (SO:0000262).
histidine tRNA primary transcript
sequence
SO:0000219
histidine_tRNA_primary_transcript
A primary transcript encoding histidyl tRNA (SO:0000262).
SO:ke
A primary transcript encoding isoleucyl tRNA (SO:0000263).
isoleucine tRNA primary transcript
sequence
SO:0000220
isoleucine_tRNA_primary_transcript
A primary transcript encoding isoleucyl tRNA (SO:0000263).
SO:ke
A primary transcript encoding leucyl tRNA (SO:0000264).
leucine tRNA primary transcript
sequence
SO:0000221
leucine_tRNA_primary_transcript
A primary transcript encoding leucyl tRNA (SO:0000264).
SO:ke
A primary transcript encoding lysyl tRNA (SO:0000265).
lysine tRNA primary transcript
sequence
SO:0000222
lysine_tRNA_primary_transcript
A primary transcript encoding lysyl tRNA (SO:0000265).
SO:ke
A primary transcript encoding methionyl tRNA (SO:0000266).
methionine tRNA primary transcript
sequence
SO:0000223
methionine_tRNA_primary_transcript
A primary transcript encoding methionyl tRNA (SO:0000266).
SO:ke
A primary transcript encoding phenylalanyl tRNA (SO:0000267).
phenylalanine tRNA primary transcript
sequence
SO:0000224
phenylalanine_tRNA_primary_transcript
A primary transcript encoding phenylalanyl tRNA (SO:0000267).
SO:ke
A primary transcript encoding prolyl tRNA (SO:0000268).
proline tRNA primary transcript
sequence
SO:0000225
proline_tRNA_primary_transcript
A primary transcript encoding prolyl tRNA (SO:0000268).
SO:ke
A primary transcript encoding seryl tRNA (SO:0000269).
serine tRNA primary transcript
sequence
SO:0000226
serine_tRNA_primary_transcript
A primary transcript encoding seryl tRNA (SO:0000269).
SO:ke
A primary transcript encoding threonyl tRNA (SO:0000270).
threonine tRNA primary transcript
sequence
SO:0000227
threonine_tRNA_primary_transcript
A primary transcript encoding threonyl tRNA (SO:0000270).
SO:ke
A primary transcript encoding tryptophanyl tRNA (SO:0000271).
tryptophan tRNA primary transcript
sequence
SO:0000228
tryptophan_tRNA_primary_transcript
A primary transcript encoding tryptophanyl tRNA (SO:0000271).
SO:ke
A primary transcript encoding tyrosyl tRNA (SO:0000272).
tyrosine tRNA primary transcript
sequence
SO:0000229
tyrosine_tRNA_primary_transcript
A primary transcript encoding tyrosyl tRNA (SO:0000272).
SO:ke
A primary transcript encoding valyl tRNA (SO:0000273).
valine tRNA primary transcript
sequence
SO:0000230
valine_tRNA_primary_transcript
A primary transcript encoding valyl tRNA (SO:0000273).
SO:ke
A primary transcript encoding a small nuclear RNA (SO:0000274).
snRNA primary transcript
sequence
SO:0000231
snRNA_primary_transcript
A primary transcript encoding a small nuclear RNA (SO:0000274).
SO:ke
A primary transcript encoding one or more small nucleolar RNAs (SO:0000275).
snoRNA primary transcript
sequence
SO:0000232
This definition was broadened 26 Jan 2021 to reflect that a single transcript can encode one or more snoRNAs. Brought to our attention by FlyBase. GitHub Issue #520 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/520).
snoRNA_primary_transcript
A primary transcript encoding one or more small nucleolar RNAs (SO:0000275).
SO:ke
A transcript which has undergone the necessary modifications, if any, for its function. In eukaryotes this includes, for example, processing of introns, cleavage, base modification, and modifications to the 5' and/or the 3' ends, other than addition of bases. In bacteria functional mRNAs are usually not modified.
http://en.wikipedia.org/wiki/Mature_transcript
mature transcript
sequence
SO:0000233
A processed transcript cannot contain introns.
mature_transcript
A transcript which has undergone the necessary modifications, if any, for its function. In eukaryotes this includes, for example, processing of introns, cleavage, base modification, and modifications to the 5' and/or the 3' ends, other than addition of bases. In bacteria functional mRNAs are usually not modified.
SO:ke
http://en.wikipedia.org/wiki/Mature_transcript
wiki
Messenger RNA is the intermediate molecule between DNA and protein. It includes UTR and coding sequences. It does not contain introns.
http://en.wikipedia.org/wiki/MRNA
http://www.gencodegenes.org/gencode_biotypes.html
INSDC_feature:mRNA
messenger RNA
protein_coding_transcript
sequence
SO:0000234
An mRNA does not contain introns as it is a processed_transcript. The equivalent kind of primary_transcript is protein_coding_primary_transcript (SO:0000120) which may contain introns. This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
mRNA
Messenger RNA is the intermediate molecule between DNA and protein. It includes UTR and coding sequences. It does not contain introns.
SO:ma
http://en.wikipedia.org/wiki/MRNA
wiki
http://www.gencodegenes.org/gencode_biotypes.html
GENCODE
A DNA site where a transcription factor binds.
TF binding site
transcription factor binding site
sequence
SO:0000235
Definition updated along with definitions in Mejia-Almonte et al. PMID:32665585. Added relationship part_of SO:0000727 CRM in place of previous CRM relationship has_part TF_binding_site August 2020 in response to requests from GREEKC initiative. Moved from transcription_regulatory_region (SO:0001679) to transcriptional_cis_regulatory_region (SO:0001055) by Dave Sant on Feb 11, 2021 when transcription_regulatory_region was merged into transcriptional_cis_regulatory_region to be consistent with GO and reduce redundancy as part of the GREEKC consortium. See GitHub Issue #527.
TF_binding_site
A DNA site where a transcription factor binds.
SO:ke
The in-frame interval between the stop codons of a reading frame which, when read as sequential triplets, has the potential of encoding a sequential string of amino acids. TER(NNN)nTER.
open reading frame
sequence
SO:0000236
The definition was modified by Rama. ORF is defined by the sequence, whereas the CDS is defined according to whether a polypeptide is made. This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
ORF
The in-frame interval between the stop codons of a reading frame which, when read as sequential triplets, has the potential of encoding a sequential string of amino acids. TER(NNN)nTER.
SGD:rb
SO:ma
An attribute describing a transcript.
transcript attribute
sequence
SO:0000237
transcript_attribute
A transposable element with extensive secondary structure, characterized by large modular imperfect long inverted repeats.
foldback element
sequence
LVR element
long inverted repeat element
SO:0000238
foldback_element
A transposable element with extensive secondary structure, characterized by large modular imperfect long inverted repeats.
http://www.genetics.org/cgi/reprint/156/4/1983.pdf
The sequences extending on either side of a specific region.
flanking region
sequence
SO:0000239
flanking_region
The sequences extending on either side of a specific region.
SO:ke
A deviation in chromosome structure or number.
chromosome variation
sequence
SO:0000240
chromosome_variation
A UTR bordered by the terminal and initial codons of two CDSs in a polycistronic transcript. Every UTR is either 5', 3' or internal.
internal UTR
sequence
SO:0000241
internal_UTR
A UTR bordered by the terminal and initial codons of two CDSs in a polycistronic transcript. Every UTR is either 5', 3' or internal.
SO:cjm
The untranslated sequence separating the 'cistrons' of multicistronic mRNA.
untranslated region polycistronic mRNA
sequence
SO:0000242
untranslated_region_polycistronic_mRNA
The untranslated sequence separating the 'cistrons' of multicistronic mRNA.
SO:ke
Sequence element that recruits a ribosomal subunit to internal mRNA for translation initiation.
http://en.wikipedia.org/wiki/Internal_ribosome_entry_site
IRES
internal ribosomal entry sequence
internal ribosomal entry site
internal ribosome entry site
sequence
internal ribosome entry sequence
SO:0000243
internal_ribosome_entry_site
Sequence element that recruits a ribosomal subunit to internal mRNA for translation initiation.
SO:ke
http://en.wikipedia.org/wiki/Internal_ribosome_entry_site
wiki
sequence
4-cutter_restriction_site
four-cutter_restriction_site
SO:0000244
four_cutter_restriction_site
true
sequence
SO:0000245
mRNA_by_polyadenylation_status
true
An attribute describing the addition of a poly A tail to the 3' end of an mRNA molecule.
sequence
SO:0000246
polyadenylated
An attribute describing the addition of a poly A tail to the 3' end of an mRNA molecule.
SO:ke
sequence
SO:0000247
mRNA_not_polyadenylated
true
A kind of sequence alteration where the number of copies of a region present varies across a population.
sequence length alteration
sequence
SO:0000248
sequence_length_alteration
A kind of sequence alteration where the number of copies of a region present varies across a population.
SO:ke
sequence
6-cutter_restriction_site
six-cutter_restriction_site
SO:0000249
six_cutter_restriction_site
true
A post-transcriptionally modified base.
modified RNA base feature
sequence
SO:0000250
modified_RNA_base_feature
A post-transcriptionally modified base.
SO:ke
sequence
8-cutter_restriction_site
eight-cutter_restriction_site
SO:0000251
eight_cutter_restriction_site
true
rRNA is an RNA component of a ribosome that can provide both structural scaffolding and catalytic activity.
INSDC_qualifier:unknown
http://en.wikipedia.org/wiki/RRNA
INSDC_feature:rRNA
ribosomal RNA
ribosomal ribonucleic acid
sequence
SO:0000252
Definition updated 10 June 2021 as part of restructuring rRNA terms and reforming definitions to have similar structures. Request from EBI. See GitHub Issue #493
rRNA
rRNA is an RNA component of a ribosome that can provide both structural scaffolding and catalytic activity.
ISBN:0198506732
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/RRNA
wiki
Transfer RNA (tRNA) molecules are approximately 80 nucleotides in length. Their secondary structure includes four short double-helical elements and three loops (D, anti-codon, and T loops). Further hydrogen bonds mediate the characteristic L-shaped molecular structure. Transfer RNAs have two regions of fundamental functional importance: the anti-codon, which is responsible for specific mRNA codon recognition, and the 3' end, to which the tRNA's corresponding amino acid is attached (by aminoacyl-tRNA synthetases). Transfer RNAs cope with the degeneracy of the genetic code in two manners: having more than one tRNA (with a specific anti-codon) for a particular amino acid; and 'wobble' base-pairing, i.e. permitting non-standard base-pairing at the 3rd anti-codon position.
INSDC_qualifier:unknown
http://en.wikipedia.org/wiki/TRNA
INSDC_feature:tRNA
sequence
transfer RNA
transfer ribonucleic acid
SO:0000253
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
tRNA
Transfer RNA (tRNA) molecules are approximately 80 nucleotides in length. Their secondary structure includes four short double-helical elements and three loops (D, anti-codon, and T loops). Further hydrogen bonds mediate the characteristic L-shaped molecular structure. Transfer RNAs have two regions of fundamental functional importance: the anti-codon, which is responsible for specific mRNA codon recognition, and the 3' end, to which the tRNA's corresponding amino acid is attached (by aminoacyl-tRNA synthetases). Transfer RNAs cope with the degeneracy of the genetic code in two manners: having more than one tRNA (with a specific anti-codon) for a particular amino acid; and 'wobble' base-pairing, i.e. permitting non-standard base-pairing at the 3rd anti-codon position.
ISBN:0198506732
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00005
http://en.wikipedia.org/wiki/TRNA
wiki
A tRNA sequence that has an alanine anticodon, and a 3' alanine binding region.
alanyl tRNA
alanyl-transfer RNA
alanyl-transfer ribonucleic acid
sequence
SO:0000254
alanyl_tRNA
A tRNA sequence that has an alanine anticodon, and a 3' alanine binding region.
SO:ke
A primary transcript encoding a small ribosomal subunit RNA.
rRNA small subunit primary transcript
sequence
SO:0000255
rRNA_small_subunit_primary_transcript
A primary transcript encoding a small ribosomal subunit RNA.
SO:ke
A tRNA sequence that has an asparagine anticodon, and a 3' asparagine binding region.
asparaginyl tRNA
asparaginyl-transfer RNA
asparaginyl-transfer ribonucleic acid
sequence
SO:0000256
asparaginyl_tRNA
A tRNA sequence that has an asparagine anticodon, and a 3' asparagine binding region.
SO:ke
A tRNA sequence that has an aspartic acid anticodon, and a 3' aspartic acid binding region.
aspartyl tRNA
aspartyl-transfer RNA
aspartyl-transfer ribonucleic acid
sequence
SO:0000257
aspartyl_tRNA
A tRNA sequence that has an aspartic acid anticodon, and a 3' aspartic acid binding region.
SO:ke
A tRNA sequence that has a cysteine anticodon, and a 3' cysteine binding region.
cysteinyl tRNA
cysteinyl-transfer RNA
cysteinyl-transfer ribonucleic acid
sequence
SO:0000258
cysteinyl_tRNA
A tRNA sequence that has a cysteine anticodon, and a 3' cysteine binding region.
SO:ke
A tRNA sequence that has a glutamine anticodon, and a 3' glutamine binding region.
glutaminyl tRNA
glutaminyl-transfer RNA
glutaminyl-transfer ribonucleic acid
sequence
SO:0000259
glutaminyl_tRNA
A tRNA sequence that has a glutamine anticodon, and a 3' glutamine binding region.
SO:ke
A tRNA sequence that has a glutamic acid anticodon, and a 3' glutamic acid binding region.
glutamyl tRNA
glutamyl-transfer ribonucleic acid
sequence
glutamyl-transfer RNA
SO:0000260
glutamyl_tRNA
A tRNA sequence that has a glutamic acid anticodon, and a 3' glutamic acid binding region.
SO:ke
A tRNA sequence that has a glycine anticodon, and a 3' glycine binding region.
glycyl tRNA
sequence
glycyl-transfer RNA
glycyl-transfer ribonucleic acid
SO:0000261
glycyl_tRNA
A tRNA sequence that has a glycine anticodon, and a 3' glycine binding region.
SO:ke
A tRNA sequence that has a histidine anticodon, and a 3' histidine binding region.
histidyl tRNA
histidyl-transfer RNA
histidyl-transfer ribonucleic acid
sequence
SO:0000262
histidyl_tRNA
A tRNA sequence that has a histidine anticodon, and a 3' histidine binding region.
SO:ke
A tRNA sequence that has an isoleucine anticodon, and a 3' isoleucine binding region.
isoleucyl tRNA
isoleucyl-transfer RNA
isoleucyl-transfer ribonucleic acid
sequence
SO:0000263
isoleucyl_tRNA
A tRNA sequence that has an isoleucine anticodon, and a 3' isoleucine binding region.
SO:ke
A tRNA sequence that has a leucine anticodon, and a 3' leucine binding region.
leucyl tRNA
leucyl-transfer RNA
leucyl-transfer ribonucleic acid
sequence
SO:0000264
leucyl_tRNA
A tRNA sequence that has a leucine anticodon, and a 3' leucine binding region.
SO:ke
A tRNA sequence that has a lysine anticodon, and a 3' lysine binding region.
lysyl tRNA
lysyl-transfer RNA
lysyl-transfer ribonucleic acid
sequence
SO:0000265
lysyl_tRNA
A tRNA sequence that has a lysine anticodon, and a 3' lysine binding region.
SO:ke
A tRNA sequence that has a methionine anticodon, and a 3' methionine binding region.
methionyl tRNA
methionyl-transfer RNA
methionyl-transfer ribonucleic acid
sequence
SO:0000266
methionyl_tRNA
A tRNA sequence that has a methionine anticodon, and a 3' methionine binding region.
SO:ke
A tRNA sequence that has a phenylalanine anticodon, and a 3' phenylalanine binding region.
phenylalanyl tRNA
phenylalanyl-transfer RNA
phenylalanyl-transfer ribonucleic acid
sequence
SO:0000267
phenylalanyl_tRNA
A tRNA sequence that has a phenylalanine anticodon, and a 3' phenylalanine binding region.
SO:ke
A tRNA sequence that has a proline anticodon, and a 3' proline binding region.
prolyl tRNA
prolyl-transfer RNA
prolyl-transfer ribonucleic acid
sequence
SO:0000268
prolyl_tRNA
A tRNA sequence that has a proline anticodon, and a 3' proline binding region.
SO:ke
A tRNA sequence that has a serine anticodon, and a 3' serine binding region.
seryl tRNA
seryl-transfer RNA
sequence
seryl-transfer ribonucleic acid
SO:0000269
seryl_tRNA
A tRNA sequence that has a serine anticodon, and a 3' serine binding region.
SO:ke
A tRNA sequence that has a threonine anticodon, and a 3' threonine binding region.
threonyl tRNA
threonyl-transfer ribonucleic acid
sequence
threonyl-transfer RNA
SO:0000270
threonyl_tRNA
A tRNA sequence that has a threonine anticodon, and a 3' threonine binding region.
SO:ke
A tRNA sequence that has a tryptophan anticodon, and a 3' tryptophan binding region.
tryptophanyl tRNA
tryptophanyl-transfer RNA
tryptophanyl-transfer ribonucleic acid
sequence
SO:0000271
tryptophanyl_tRNA
A tRNA sequence that has a tryptophan anticodon, and a 3' tryptophan binding region.
SO:ke
A tRNA sequence that has a tyrosine anticodon, and a 3' tyrosine binding region.
tyrosyl tRNA
tyrosyl-transfer ribonucleic acid
sequence
tyrosyl-transfer RNA
SO:0000272
tyrosyl_tRNA
A tRNA sequence that has a tyrosine anticodon, and a 3' tyrosine binding region.
SO:ke
A tRNA sequence that has a valine anticodon, and a 3' valine binding region.
valyl tRNA
valyl-transfer ribonucleic acid
sequence
valyl-transfer RNA
SO:0000273
valyl_tRNA
A tRNA sequence that has a valine anticodon, and a 3' valine binding region.
SO:ke
A small nuclear RNA molecule involved in pre-mRNA splicing and processing.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/SnRNA
INSDC_qualifier:snRNA
small nuclear RNA
sequence
SO:0000274
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
snRNA
A small nuclear RNA molecule involved in pre-mRNA splicing and processing.
PMID:11733745
WB:ems
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/SnRNA
wiki
Small nucleolar RNAs (snoRNAs) are short non-coding RNAs enriched in the nucleolus as components of small nucleolar ribonucleoproteins. They guide ribose methylation and pseudouridylation of rRNAs and snRNAs, and a subgroup regulate excision of rRNAs from rRNA precursor transcripts. snoRNAs may also guide rRNA acetylation and tRNA methylation, and regulate mRNA abundance and alternative splicing.
INSDC_feature:ncRNA
INSDC_qualifier:snoRNA
small nucleolar RNA
sequence
SO:0000275
Updated the definition of snoRNA (SO:0000275) from "A snoRNA (small nucleolar RNA) is any one of a class of small RNAs that are associated with the eukaryotic nucleus as components of small nucleolar ribonucleoproteins. They participate in the processing or modifications of many RNAs, mostly ribosomal RNAs (rRNAs) though snoRNAs are also known to target other classes of RNA, including spliceosomal RNAs, tRNAs, and mRNAs via a stretch of sequence that is complementary to a sequence in the targeted RNA." to "Small nucleolar RNAs (snoRNAs) are short non-coding RNAs enriched in the nucleolus as components of small nucleolar ribonucleoproteins. They guide ribose methylation and pseudouridylation of rRNAs and snRNAs, and a subgroup regulate excision of rRNAs from rRNA precursor transcripts. snoRNAs may also guide rRNA acetylation and tRNA methylation, and regulate mRNA abundance and alternative splicing." to acknowledge that some snoRNAs functionally localize to other compartments (cytoplasm or even secreted). See GitHub Issue #578.
snoRNA
Small nucleolar RNAs (snoRNAs) are short non-coding RNAs enriched in the nucleolus as components of small nucleolar ribonucleoproteins. They guide ribose methylation and pseudouridylation of rRNAs and snRNAs, and a subgroup regulate excision of rRNAs from rRNA precursor transcripts. snoRNAs may also guide rRNA acetylation and tRNA methylation, and regulate mRNA abundance and alternative splicing.
GOC:kgc
PMID:31828325
Small, ~22-nt, RNA molecule that is the endogenous transcript of a miRNA gene (or the product of other non coding RNA genes). Micro RNAs are produced from precursor molecules (SO:0001244) that can form local hairpin structures, which ordinarily are processed (usually via the Dicer pathway) such that a single miRNA molecule accumulates from one arm of a hairpin precursor molecule. Micro RNAs may trigger the cleavage of their target molecules or act as translational repressors.
SO:0000649
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/MiRNA
http://en.wikipedia.org/wiki/StRNA
INSDC_qualifier:miRNA
micro RNA
microRNA
small temporal RNA
stRNA
sequence
SO:0000276
miRNA
Small, ~22-nt, RNA molecule that is the endogenous transcript of a miRNA gene (or the product of other non coding RNA genes). Micro RNAs are produced from precursor molecules (SO:0001244) that can form local hairpin structures, which ordinarily are processed (usually via the Dicer pathway) such that a single miRNA molecule accumulates from one arm of a hairpin precursor molecule. Micro RNAs may trigger the cleavage of their target molecules or act as translational repressors.
PMID:11081512
PMID:12592000
http://en.wikipedia.org/wiki/MiRNA
wiki
http://en.wikipedia.org/wiki/StRNA
wiki
An attribute describing a sequence that is bound by another molecule.
bound by factor
sequence
SO:0000277
Formerly called transcript_by_bound_factor.
bound_by_factor
An attribute describing a sequence that is bound by another molecule.
SO:ke
A transcript that is bound by a nucleic acid.
transcript bound by nucleic acid
sequence
SO:0000278
Formerly called transcript_by_bound_nucleic_acid.
transcript_bound_by_nucleic_acid
A transcript that is bound by a nucleic acid.
SO:xp
A transcript that is bound by a protein.
transcript bound by protein
sequence
SO:0000279
Formerly called transcript_by_bound_protein.
transcript_bound_by_protein
A transcript that is bound by a protein.
SO:xp
A gene that is engineered.
engineered gene
sequence
SO:0000280
engineered_gene
A gene that is engineered.
SO:xp
A gene that is engineered and foreign.
engineered foreign gene
sequence
SO:0000281
engineered_foreign_gene
A gene that is engineered and foreign.
SO:xp
An mRNA with a minus 1 frameshift.
mRNA with minus 1 frameshift
sequence
SO:0000282
mRNA_with_minus_1_frameshift
An mRNA with a minus 1 frameshift.
SO:xp
A transposable_element that is engineered and foreign.
engineered foreign transposable element gene
sequence
SO:0000283
engineered_foreign_transposable_element_gene
A transposable_element that is engineered and foreign.
SO:xp
The recognition site is bipartite and interrupted.
sequence
SO:0000284
type_I_enzyme_restriction_site
true
The recognition site is bipartite and interrupted.
http://www.promega.com
A gene that is foreign.
foreign gene
sequence
SO:0000285
foreign_gene
A gene that is foreign.
SO:xp
A sequence directly repeated at both ends of a defined sequence, of the sort typically found in retroviruses.
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Long_terminal_repeat
INSDC_qualifier:long_terminal_repeat
LTR
long terminal repeat
sequence
direct terminal repeat
SO:0000286
long_terminal_repeat
A sequence directly repeated at both ends of a defined sequence, of the sort typically found in retroviruses.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Long_terminal_repeat
wiki
A gene that is a fusion.
http://en.wikipedia.org/wiki/Fusion_gene
fusion gene
sequence
SO:0000287
fusion_gene
A gene that is a fusion.
SO:xp
http://en.wikipedia.org/wiki/Fusion_gene
wiki
A fusion gene that is engineered.
engineered fusion gene
sequence
SO:0000288
engineered_fusion_gene
A fusion gene that is engineered.
SO:xp
A repeat_region containing repeat_units of 2 to 10 bp repeated in tandem.
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Microsatellite
INSDC_qualifier:microsatellite
STR
microsatellite locus
microsatellite marker
short tandem repeat
sequence
SO:0000289
microsatellite
A repeat_region containing repeat_units of 2 to 10 bp repeated in tandem.
NCBI:th
http://www.informatics.jax.org/silver/glossary.shtml
http://en.wikipedia.org/wiki/Microsatellite
wiki
STR
http://www.ncbi.nlm.nih.gov/books/NBK21126/def-item/A9651/
A region of a repeating dinucleotide sequence (two bases).
dinucleotide repeat microsatellite
dinucleotide repeat microsatellite feature
dinucleotide repeat microsatellite locus
dinucleotide repeat microsatellite marker
sequence
SO:0000290
dinucleotide_repeat_microsatellite_feature
A region of a repeating trinucleotide sequence (three bases).
trinucleotide repeat microsatellite
trinucleotide repeat microsatellite feature
trinucleotide repeat microsatellite locus
sequence
trinucleotide repeat microsatellite marker
SO:0000291
trinucleotide_repeat_microsatellite_feature
sequence
SO:0000292
repetitive_element
true
A repetitive element that is engineered and foreign.
engineered foreign repetitive element
sequence
SO:0000293
engineered_foreign_repetitive_element
A repetitive element that is engineered and foreign.
SO:xp
The sequence is complementarily repeated on the opposite strand. It is a palindrome, and it may, or may not be hyphenated. Examples: GCTGATCAGC, or GCTGA-----TCAGC.
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Inverted_repeat
INSDC_qualifier:inverted
inverted repeat
inverted repeat sequence
sequence
SO:0000294
inverted_repeat
The sequence is complementarily repeated on the opposite strand. It is a palindrome, and it may, or may not be hyphenated. Examples: GCTGATCAGC, or GCTGA-----TCAGC.
SO:ke
http://en.wikipedia.org/wiki/Inverted_repeat
wiki
A type of spliceosomal intron spliced by the U12 spliceosome, that includes U11, U12, U4atac/U6atac and U5 snRNAs.
U12 intron
U12-dependent intron
sequence
SO:0000295
May have either GT-AC or AT-AC 5' and 3' boundaries.
U12_intron
A type of spliceosomal intron spliced by the U12 spliceosome, that includes U11, U12, U4atac/U6atac and U5 snRNAs.
PMID:9428511
A region of nucleic acid from which replication initiates; includes sequences that are recognized by replication proteins, the site from which the first separation of complementary strands occurs, and specific replication start sites.
http://en.wikipedia.org/wiki/Origin_of_replication
INSDC_feature:rep_origin
ori
origin of replication
sequence
SO:0000296
origin_of_replication
A region of nucleic acid from which replication initiates; includes sequences that are recognized by replication proteins, the site from which the first separation of complementary strands occurs, and specific replication start sites.
NCBI:cf
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Origin_of_replication
wiki
Displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region; also used to describe the displacement of a region of one strand of duplex DNA by a single stranded invader in the reaction catalyzed by RecA protein.
http://en.wikipedia.org/wiki/D_loop
D-loop
INSDC_feature:D-loop
sequence
displacement loop
SO:0000297
Moved from is_a: SO:0000296 origin_of_replication to is_a: SO:0001411 biological_region after Terrence Murphy (INSDC) pointed out that the D loop can also refer to a loop in DNA repair, which is not an origin of replication. See GitHub Issue #417 (https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/417)
D_loop
Displacement loop; a region within mitochondrial DNA in which a short stretch of RNA is paired with one strand of DNA, displacing the original partner DNA strand in this region; also used to describe the displacement of a region of one strand of duplex DNA by a single stranded invader in the reaction catalyzed by RecA protein.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/D_loop
wiki
A feature where there has been exchange of genetic material in the event of mitosis or meiosis.
INSDC_feature:misc_recomb
INSDC_qualifier:other
recombination feature
sequence
SO:0000298
recombination_feature
A location where recombination occurs during mitosis or meiosis.
specific recombination site
sequence
SO:0000299
specific_recombination_site
A location where a gene is rearranged due to recombination during mitosis or meiosis.
recombination feature of rearranged gene
sequence
SO:0000300
recombination_feature_of_rearranged_gene
A feature where recombination has occurred for the purpose of generating a diversity in the immune system.
vertebrate immune system gene recombination feature
sequence
SO:0000301
vertebrate_immune_system_gene_recombination_feature
Recombination signal including J-heptamer, J-spacer and J-nonamer in 5' of J-region of a J-gene or J-sequence.
J gene recombination feature
J-RS
sequence
SO:0000302
J_gene_recombination_feature
Recombination signal including J-heptamer, J-spacer and J-nonamer in 5' of J-region of a J-gene or J-sequence.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Part of the primary transcript that is clipped off during processing.
sequence
SO:0000303
clip
Part of the primary transcript that is clipped off during processing.
SO:ke
The recognition site is either palindromic, partially palindromic or an interrupted palindrome. Cleavage occurs within the recognition site.
sequence
SO:0000304
type_II_enzyme_restriction_site
true
The recognition site is either palindromic, partially palindromic or an interrupted palindrome. Cleavage occurs within the recognition site.
http://www.promega.com
A modified nucleotide, i.e. a nucleotide other than A, T, C, G.
INSDC_feature:modified_base
modified base site
sequence
SO:0000305
Modified base:<modified_base>.
modified_DNA_base
A modified nucleotide, i.e. a nucleotide other than A, T, C, G.
http://www.insdc.org/files/feature_table.html
A nucleotide modified by methylation.
methylated base feature
sequence
SO:0000306
methylated_DNA_base_feature
A nucleotide modified by methylation.
SO:ke
Regions of a few hundred to a few thousand bases in vertebrate genomes that are relatively GC and CpG rich; they are typically unmethylated and often found near the 5' ends of genes.
http://en.wikipedia.org/wiki/CpG_island
CG island
CpG island
sequence
SO:0000307
CpG_island
Regions of a few hundred to a few thousand bases in vertebrate genomes that are relatively GC and CpG rich; they are typically unmethylated and often found near the 5' ends of genes.
SO:rd
http://en.wikipedia.org/wiki/CpG_island
wiki
sequence
SO:0000308
sequence_feature_locating_method
true
sequence
SO:0000309
computed_feature
true
sequence
SO:0000310
predicted_ab_initio_computation
true
.
sequence
SO:0000311
similar to:<sequence_id>
computed_feature_by_similarity
true
.
SO:ma
Attribute to describe a feature that has been experimentally verified.
experimentally determined
sequence
SO:0000312
experimentally_determined
Attribute to describe a feature that has been experimentally verified.
SO:ke
A double-helical region of nucleic acid formed by base-pairing between adjacent (inverted) complementary sequences.
SO:0000019
http://en.wikipedia.org/wiki/Stem_loop
INSDC_feature:stem_loop
RNA_hairpin_loop
stem loop
stem-loop
sequence
SO:0000313
stem_loop
A double-helical region of nucleic acid formed by base-pairing between adjacent (inverted) complementary sequences.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Stem_loop
wiki
A repeat where the same sequence is repeated in the same direction. Example: GCTGA-followed by-GCTGA.
INSDC_feature:repeat_region
http://en.wikipedia.org/wiki/Direct_repeat
INSDC_qualifier:direct
direct repeat
sequence
SO:0000314
direct_repeat
A repeat where the same sequence is repeated in the same direction. Example: GCTGA-followed by-GCTGA.
SO:ke
http://en.wikipedia.org/wiki/Direct_repeat
wiki
The first base where RNA polymerase begins to synthesize the RNA transcript.
INSDC_feature:misc_feature
INSDC_note:transcription_start_site
transcription start site
transcription_start_site
sequence
SO:0000315
Added relationship is_a SO:0002309 core_promoter_element with the creation of core_promoter_element as part of GREEKC initiative August 2020 - Dave Sant.
TSS
The first base where RNA polymerase begins to synthesize the RNA transcript.
SO:ke
A contiguous sequence which begins with, and includes, a start codon and ends with, and includes, a stop codon.
INSDC_feature:CDS
coding sequence
coding_sequence
sequence
SO:0000316
CDS
A contiguous sequence which begins with, and includes, a start codon and ends with, and includes, a stop codon.
SO:ma
Complementary DNA; A piece of DNA copied from an mRNA and spliced into a vector for propagation in a suitable host.
cDNA clone
sequence
SO:0000317
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
cDNA_clone
Complementary DNA; A piece of DNA copied from an mRNA and spliced into a vector for propagation in a suitable host.
http://seqcore.brcf.med.umich.edu/doc/educ/dnapr/mbglossary/mbgloss.html
First codon to be translated by a ribosome.
http://en.wikipedia.org/wiki/Start_codon
initiation codon
start codon
sequence
SO:0000318
start_codon
First codon to be translated by a ribosome.
SO:ke
http://en.wikipedia.org/wiki/Start_codon
wiki
In mRNA, a set of three nucleotides that indicates the end of information for protein synthesis.
http://en.wikipedia.org/wiki/Stop_codon
stop codon
sequence
SO:0000319
stop_codon
In mRNA, a set of three nucleotides that indicates the end of information for protein synthesis.
SO:ke
http://en.wikipedia.org/wiki/Stop_codon
wiki
Sequences within the intron that modulate splice site selection for some introns.
intronic splice enhancer
sequence
SO:0000320
intronic_splice_enhancer
Sequences within the intron that modulate splice site selection for some introns.
SO:ke
An mRNA with a plus 1 frameshift.
mRNA with plus 1 frameshift
sequence
SO:0000321
mRNA_with_plus_1_frameshift
An mRNA with a plus 1 frameshift.
SO:ke
A region of nucleotide sequence targeted by a nuclease enzyme that is found cleaved more than would be expected by chance.
nuclease hypersensitive site
sequence
SO:0000322
Relationship to accessible_DNA_region added 11 Feb 2021. GREEKC pointed out that this is an assay based term, but we need a biological term for the accessible DNA. See GitHub Issue #531.
nuclease_hypersensitive_site
The first base to be translated into protein.
coding start
translation initiation site
sequence
translation start
SO:0000323
coding_start
The first base to be translated into protein.
SO:ke
A nucleotide sequence that may be used to identify a larger sequence.
sequence
SO:0000324
tag
A nucleotide sequence that may be used to identify a larger sequence.
SO:ke
A primary transcript encoding a large ribosomal subunit RNA.
35S rRNA primary transcript
rRNA large subunit primary transcript
sequence
SO:0000325
rRNA_large_subunit_primary_transcript
A primary transcript encoding a large ribosomal subunit RNA.
SO:ke
A short diagnostic sequence tag, serial analysis of gene expression (SAGE), that allows the quantitative and simultaneous analysis of a large number of transcripts.
SAGE tag
sequence
SO:0000326
SAGE_tag
A short diagnostic sequence tag, serial analysis of gene expression (SAGE), that allows the quantitative and simultaneous analysis of a large number of transcripts.
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=7570003&dopt=Abstract
The last base to be translated into protein. It does not include the stop codon.
coding end
translation termination site
translation_end
sequence
SO:0000327
coding_end
The last base to be translated into protein. It does not include the stop codon.
SO:ke
A DNA sequence used experimentally to detect the presence or absence of a complementary nucleic acid.
microarray oligo
microarray oligonucleotide
sequence
SO:0000328
microarray_oligo
An mRNA with a plus 2 frameshift.
mRNA with plus 2 frameshift
sequence
SO:0000329
mRNA_with_plus_2_frameshift
An mRNA with a plus 2 frameshift.
SO:xp
Region of sequence similarity by descent from a common ancestor.
INSDC_feature:misc_feature
http://en.wikipedia.org/wiki/Conserved_region
INSDC_note:conserved_region
conserved region
sequence
SO:0000330
conserved_region
Region of sequence similarity by descent from a common ancestor.
SO:ke
http://en.wikipedia.org/wiki/Conserved_region
wiki
Short (typically a few hundred base pairs) DNA sequence that has a single occurrence in a genome and whose location and base sequence are known.
INSDC_feature:STS
sequence tag site
sequence
SO:0000331
STS
Short (typically a few hundred base pairs) DNA sequence that has a single occurrence in a genome and whose location and base sequence are known.
http://www.biospace.com
Coding region of sequence similarity by descent from a common ancestor.
coding conserved region
sequence
SO:0000332
coding_conserved_region
Coding region of sequence similarity by descent from a common ancestor.
SO:ke
The boundary between two exons in a processed transcript.
exon junction
sequence
SO:0000333
exon_junction
The boundary between two exons in a processed transcript.
SO:ke
Non-coding region of sequence similarity by descent from a common ancestor.
conserved non-coding element
conserved non-coding sequence
nc conserved region
noncoding conserved region
sequence
SO:0000334
nc_conserved_region
Non-coding region of sequence similarity by descent from a common ancestor.
SO:ke
An mRNA with a minus 2 frameshift.
mRNA with minus 2 frameshift
sequence
SO:0000335
mRNA_with_minus_2_frameshift
An mRNA with a minus 2 frameshift.
SO:ke
A sequence that closely resembles a known functional gene, at another locus within a genome, that is non-functional as a consequence of (usually several) mutations that prevent either its transcription or translation (or both). In general, pseudogenes result from either reverse transcription of a transcript of their "normal" paralog (SO:0000043) (in which case the pseudogene typically lacks introns and includes a poly(A) tail) or from recombination (SO:0000044) (in which case the pseudogene is typically a tandem duplication of its "normal" paralog).
INSDC_feature:gene
http://en.wikipedia.org/wiki/Pseudogene
INSDC_qualifier:pseudo
INSDC_qualifier:unknown
sequence
SO:0000336
pseudogene
A sequence that closely resembles a known functional gene, at another locus within a genome, that is non-functional as a consequence of (usually several) mutations that prevent either its transcription or translation (or both). In general, pseudogenes result from either reverse transcription of a transcript of their "normal" paralog (SO:0000043) (in which case the pseudogene typically lacks introns and includes a poly(A) tail) or from recombination (SO:0000044) (in which case the pseudogene is typically a tandem duplication of its "normal" paralog).
http://www.ucl.ac.uk/~ucbhjow/b241/glossary.html
http://en.wikipedia.org/wiki/Pseudogene
wiki
A double stranded RNA duplex, at least 20bp long, used experimentally to inhibit gene function by RNA interference.
RNAi reagent
sequence
SO:0000337
RNAi_reagent
A double stranded RNA duplex, at least 20bp long, used experimentally to inhibit gene function by RNA interference.
SO:rd
A highly repetitive and short (100-500 base pair) transposable element with terminal inverted repeats (TIR) and target site duplication (TSD). MITEs do not encode proteins.
miniature inverted repeat transposable element
sequence
SO:0000338
MITE
A highly repetitive and short (100-500 base pair) transposable element with terminal inverted repeats (TIR) and target site duplication (TSD). MITEs do not encode proteins.
http://www.pnas.org/cgi/content/full/97/18/10083
A region in a genome which promotes recombination.
http://en.wikipedia.org/wiki/Recombination_hotspot
recombination hotspot
sequence
SO:0000339
recombination_hotspot
A region in a genome which promotes recombination.
SO:rd
http://en.wikipedia.org/wiki/Recombination_hotspot
wiki
Structural unit composed of a nucleic acid molecule which controls its own replication through the interaction of specific proteins at one or more origins of replication.
http://en.wikipedia.org/wiki/Chromosome
sequence
SO:0000340
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
chromosome
Structural unit composed of a nucleic acid molecule which controls its own replication through the interaction of specific proteins at one or more origins of replication.
SO:ma
http://en.wikipedia.org/wiki/Chromosome
wiki
A cytologically distinguishable feature of a chromosome, often made visible by staining, and usually alternating light and dark.
http://en.wikipedia.org/wiki/Cytological_band
chromosome band
cytoband
cytological band
sequence
SO:0000341
chromosome_band
A cytologically distinguishable feature of a chromosome, often made visible by staining, and usually alternating light and dark.
SO:ma
http://en.wikipedia.org/wiki/Cytological_band
wiki
A region specifically recognised by a recombinase where recombination can occur during mitosis or meiosis.
site specific recombination target region
sequence
SO:0000342
site_specific_recombination_target_region
A region of sequence, aligned to another sequence with some statistical significance, using an algorithm such as BLAST or SIM4.
sequence
SO:0000343
match
A region of sequence, aligned to another sequence with some statistical significance, using an algorithm such as BLAST or SIM4.
SO:ke
Region of a transcript that regulates splicing.
splice enhancer
sequence
SO:0000344
splice_enhancer
Region of a transcript that regulates splicing.
SO:ke
A tag produced from a single sequencing read from a cDNA clone or PCR product; typically a few hundred base pairs long.
expressed sequence tag
sequence
SO:0000345
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
EST
A tag produced from a single sequencing read from a cDNA clone or PCR product; typically a few hundred base pairs long.
SO:ke
Cre-Recombination target sequence.
loxP site
sequence
Cre-recombination target region
SO:0000346
loxP_site
A match against a nucleotide sequence.
nucleotide match
sequence
SO:0000347
nucleotide_match
A match against a nucleotide sequence.
SO:ke
An attribute describing a sequence consisting of nucleobases bound to repeating units. The forms found in nature are deoxyribonucleic acid (DNA), where the repeating units are 2-deoxy-D-ribose rings connected to a phosphate backbone, and ribonucleic acid (RNA), where the repeating units are D-ribose rings connected to a phosphate backbone.
http://en.wikipedia.org/wiki/Nucleic_acid
nucleic acid
sequence
SO:0000348
nucleic_acid
An attribute describing a sequence consisting of nucleobases bound to repeating units. The forms found in nature are deoxyribonucleic acid (DNA), where the repeating units are 2-deoxy-D-ribose rings connected to a phosphate backbone, and ribonucleic acid (RNA), where the repeating units are D-ribose rings connected to a phosphate backbone.
CHEBI:33696
RSC:cb
http://en.wikipedia.org/wiki/Nucleic_acid
wiki
A match against a protein sequence.
protein match
sequence
SO:0000349
protein_match
A match against a protein sequence.
SO:ke
An inversion site found on the Saccharomyces cerevisiae 2 micron plasmid.
FLP recombination target region
FRT site
sequence
SO:0000350
FRT_site
An inversion site found on the Saccharomyces cerevisiae 2 micron plasmid.
SO:ma
An attribute to describe a sequence of nucleotides, nucleotide analogs, or amino acids that has been designed by an experimenter and which may, or may not, correspond with any natural sequence.
synthetic sequence
sequence
SO:0000351
synthetic_sequence
An attribute to describe a sequence of nucleotides, nucleotide analogs, or amino acids that has been designed by an experimenter and which may, or may not, correspond with any natural sequence.
SO:ma
An attribute describing a sequence consisting of nucleobases bound to a repeating unit made of a 2-deoxy-D-ribose ring connected to a phosphate backbone.
sequence
SO:0000352
DNA
An attribute describing a sequence consisting of nucleobases bound to a repeating unit made of a 2-deoxy-D-ribose ring connected to a phosphate backbone.
RSC:cb
A sequence of nucleotides that has been algorithmically derived from an alignment of two or more different sequences.
http://en.wikipedia.org/wiki/Sequence_assembly
sequence assembly
sequence
SO:0000353
sequence_assembly
A sequence of nucleotides that has been algorithmically derived from an alignment of two or more different sequences.
SO:ma
http://en.wikipedia.org/wiki/Sequence_assembly
wiki
A region of intronic nucleotide sequence targeted by a nuclease enzyme.
group 1 intron homing endonuclease target region
sequence
SO:0000354
group_1_intron_homing_endonuclease_target_region
A region of intronic nucleotide sequence targeted by a nuclease enzyme.
SO:ke
A region of the genome which is co-inherited as the result of the lack of historic recombination within it.
haplotype block
sequence
SO:0000355
haplotype_block
A region of the genome which is co-inherited as the result of the lack of historic recombination within it.
SO:ma
An attribute describing a sequence consisting of nucleobases bound to a repeating unit made of a D-ribose ring connected to a phosphate backbone.
sequence
SO:0000356
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
RNA
An attribute describing a sequence consisting of nucleobases bound to a repeating unit made of a D-ribose ring connected to a phosphate backbone.
RSC:cb
An attribute describing a region that is bounded on either side by a particular kind of region.
sequence
SO:0000357
flanked
An attribute describing a region that is bounded on either side by a particular kind of region.
SO:ke
true
An attribute describing sequence that is flanked by Lox-P sites.
http://en.wikipedia.org/wiki/Floxed
sequence
SO:0000359
floxed
An attribute describing sequence that is flanked by Lox-P sites.
SO:ke
http://en.wikipedia.org/wiki/Floxed
wiki
A set of (usually) three nucleotide bases in a DNA or RNA sequence, which together code for a unique amino acid or the termination of translation and are contained within the CDS.
http://en.wikipedia.org/wiki/Codon
sequence
SO:0000360
codon
A set of (usually) three nucleotide bases in a DNA or RNA sequence, which together code for a unique amino acid or the termination of translation and are contained within the CDS.
SO:ke
http://en.wikipedia.org/wiki/Codon
wiki
An attribute to describe sequence that is flanked by the FLP recombinase recognition site, FRT.
FRT flanked
sequence
SO:0000361
FRT_flanked
An attribute to describe sequence that is flanked by the FLP recombinase recognition site, FRT.
SO:ke
A cDNA clone constructed from more than one mRNA. Usually an experimental artifact.
invalidated by chimeric cDNA
sequence
SO:0000362
invalidated_by_chimeric_cDNA
A cDNA clone constructed from more than one mRNA. Usually an experimental artifact.
SO:ma
A transgene that is floxed.
floxed gene
sequence
SO:0000363
floxed_gene
A transgene that is floxed.
SO:xp
The region of sequence surrounding a transposable element.
transposable element flanking region
sequence
SO:0000364
transposable_element_flanking_region
The region of sequence surrounding a transposable element.
SO:ke
A region encoding an integrase which acts at a site adjacent to it (attI_site) to insert DNA which must include but is not limited to an attC_site.
http://en.wikipedia.org/wiki/Integron
sequence
SO:0000365
integron
A region encoding an integrase which acts at a site adjacent to it (attI_site) to insert DNA which must include but is not limited to an attC_site.
SO:as
http://en.wikipedia.org/wiki/Integron
wiki
The junction where an insertion occurred.
insertion site
sequence
SO:0000366
insertion_site
The junction where an insertion occurred.
SO:ke
A region within an integron, adjacent to an integrase, at which site specific recombination involving an attC_site takes place.
attI site
sequence
SO:0000367
attI_site
A region within an integron, adjacent to an integrase, at which site specific recombination involving an attC_site takes place.
SO:as
The junction in a genome where a transposable_element has inserted.
transposable element insertion site
sequence
SO:0000368
transposable_element_insertion_site
The junction in a genome where a transposable_element has inserted.
SO:ke
sequence
SO:0000369
integrase_coding_region
true
A non-coding RNA less than 200 nucleotides long, usually with a specific secondary structure, that acts to regulate gene expression. These include short ncRNAs such as piRNA, miRNA and siRNAs (among others).
small regulatory ncRNA
sequence
SO:0000370
small_regulatory_ncRNA
A non-coding RNA less than 200 nucleotides long, usually with a specific secondary structure, that acts to regulate gene expression. These include short ncRNAs such as piRNA, miRNA and siRNAs (among others).
PMID:28541282
PomBase:al
SO:ma
A transposon that encodes functions required for conjugation.
conjugative transposon
sequence
SO:0000371
conjugative_transposon
A transposon that encodes functions required for conjugation.
http://www.sci.sdsu.edu/~smaloy/Glossary/C.html
An RNA sequence that has catalytic activity with or without an associated ribonucleoprotein.
enzymatic RNA
sequence
SO:0000372
This was moved to be a child of transcript (SO:0000673) because some enzymatic RNA regions are part of primary transcripts and some are part of processed transcripts. Moved under ncRNA on 18 Nov 2021. See GitHub Issue #533.
enzymatic_RNA
An RNA sequence that has catalytic activity with or without an associated ribonucleoprotein.
RSC:cb
A gene that has been recombinationally rearranged by inversion.
recombinationally inverted gene
sequence
SO:0000373
recombinationally_inverted_gene
A gene that has been recombinationally rearranged by inversion.
SO:xp
An RNA with catalytic activity.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/Ribozyme
INSDC_qualifier:ribozyme
sequence
SO:0000374
ribozyme
An RNA with catalytic activity.
SO:ma
http://en.wikipedia.org/wiki/Ribozyme
wiki
Cytosolic 5.8S rRNA is an RNA component of the large subunit of cytosolic ribosomes in eukaryotes.
http://en.wikipedia.org/wiki/5.8S_ribosomal_RNA
cytosolic 5.8S LSU rRNA
cytosolic 5.8S rRNA
cytosolic 5.8S ribosomal RNA
cytosolic rRNA 5 8S
sequence
SO:0000375
Dave Sant removed '5_8S rRNA is also found in archaea.' from definition due to lack of references mentioning this on 1 Feb 2021. See GitHub Issue #505. Renamed from rRNA_5_8S to cytosolic_5_8S_rRNA on 10 June 2021 with the restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Requested by EBI. See GitHub Issue #493.
cytosolic_5_8S_rRNA
Cytosolic 5.8S rRNA is an RNA component of the large subunit of cytosolic ribosomes in eukaryotes.
https://rfam.xfam.org/family/RF00002
http://en.wikipedia.org/wiki/5.8S_ribosomal_RNA
wiki
A small (184-nt in E. coli) RNA that forms a hairpin type structure. 6S RNA associates with RNA polymerase in a highly specific manner. 6S RNA represses expression from a sigma70-dependent promoter during stationary phase.
http://en.wikipedia.org/wiki/6S_RNA
6S RNA
RNA 6S
sequence
SO:0000376
RNA_6S
A small (184-nt in E. coli) RNA that forms a hairpin type structure. 6S RNA associates with RNA polymerase in a highly specific manner. 6S RNA represses expression from a sigma70-dependent promoter during stationary phase.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00013
http://en.wikipedia.org/wiki/6S_RNA
wiki
An enterobacterial RNA that binds the CsrA protein. The CsrB RNAs contain a conserved motif CAGGXXG that is found in up to 18 copies and has been suggested to bind CsrA. The Csr regulatory system has a strong negative regulatory effect on glycogen biosynthesis, gluconeogenesis and glycogen catabolism and a positive regulatory effect on glycolysis. In other bacteria such as Erwinia carotovora the RsmA protein has been shown to regulate the production of virulence determinants, such as extracellular enzymes. RsmA binds to RsmB regulatory RNA which is also a member of this family.
CsrB RsmB RNA
CsrB-RsmB RNA
sequence
SO:0000377
CsrB_RsmB_RNA
An enterobacterial RNA that binds the CsrA protein. The CsrB RNAs contain a conserved motif CAGGXXG that is found in up to 18 copies and has been suggested to bind CsrA. The Csr regulatory system has a strong negative regulatory effect on glycogen biosynthesis, gluconeogenesis and glycogen catabolism and a positive regulatory effect on glycolysis. In other bacteria such as Erwinia carotovora the RsmA protein has been shown to regulate the production of virulence determinants, such as extracellular enzymes. RsmA binds to RsmB regulatory RNA which is also a member of this family.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00018
DsrA RNA regulates both transcription, by overcoming transcriptional silencing by the nucleoid-associated H-NS protein, and translation, by promoting efficient translation of the stress sigma factor, RpoS. These two activities of DsrA can be separated by mutation: the first of three stem-loops of the 85 nucleotide RNA is necessary for RpoS translation but not for anti-H-NS action, while the second stem-loop is essential for antisilencing and less critical for RpoS translation. The third stem-loop, which behaves as a transcription terminator, can be substituted by the trp transcription terminator without loss of either DsrA function. The sequence of the first stem-loop of DsrA is complementary with the upstream leader portion of RpoS messenger RNA, suggesting that pairing of DsrA with the RpoS message might be important for translational regulation.
http://en.wikipedia.org/wiki/DsrA_RNA
DsrA RNA
sequence
SO:0000378
DsrA_RNA
DsrA RNA regulates both transcription, by overcoming transcriptional silencing by the nucleoid-associated H-NS protein, and translation, by promoting efficient translation of the stress sigma factor, RpoS. These two activities of DsrA can be separated by mutation: the first of three stem-loops of the 85 nucleotide RNA is necessary for RpoS translation but not for anti-H-NS action, while the second stem-loop is essential for antisilencing and less critical for RpoS translation. The third stem-loop, which behaves as a transcription terminator, can be substituted by the trp transcription terminator without loss of either DsrA function. The sequence of the first stem-loop of DsrA is complementary with the upstream leader portion of RpoS messenger RNA, suggesting that pairing of DsrA with the RpoS message might be important for translational regulation.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00014
http://en.wikipedia.org/wiki/DsrA_RNA
wiki
A small untranslated RNA involved in expression of the dipeptide and oligopeptide transport systems in Escherichia coli.
http://en.wikipedia.org/wiki/GcvB_RNA
GcvB RNA
sequence
SO:0000379
GcvB_RNA
A small untranslated RNA involved in expression of the dipeptide and oligopeptide transport systems in Escherichia coli.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00022
http://en.wikipedia.org/wiki/GcvB_RNA
wiki
A small catalytic RNA motif that catalyzes a self-cleavage reaction. Its name comes from its secondary structure which resembles a carpenter's hammer. The hammerhead ribozyme is involved in the replication of some viroids and some satellite RNAs.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/Hammerhead_ribozyme
INSDC_qualifier:hammerhead_ribozyme
hammerhead ribozyme
sequence
SO:0000380
hammerhead_ribozyme
A small catalytic RNA motif that catalyzes a self-cleavage reaction. Its name comes from its secondary structure which resembles a carpenter's hammer. The hammerhead ribozyme is involved in the replication of some viroids and some satellite RNAs.
PMID:2436805
http://en.wikipedia.org/wiki/Hammerhead_ribozyme
wiki
A group II intron that recognizes IBS1/EBS1 and IBS2/EBS2 for the 5-prime exon and gamma/gamma-prime for the 3-prime exon.
group IIA intron
sequence
SO:0000381
group_IIA_intron
A group II intron that recognizes IBS1/EBS1 and IBS2/EBS2 for the 5-prime exon and gamma/gamma-prime for the 3-prime exon.
PMID:20463000
A group II intron that recognizes IBS1/EBS1 and IBS2/EBS2 for the 5-prime exon and IBS3/EBS3 for the 3-prime exon.
group IIB intron
sequence
SO:0000382
group_IIB_intron
A group II intron that recognizes IBS1/EBS1 and IBS2/EBS2 for the 5-prime exon and IBS3/EBS3 for the 3-prime exon.
PMID:20463000
A non-translated 93 nt antisense RNA that binds its target ompF mRNA and regulates ompF expression by inhibiting translation and inducing degradation of the message.
http://en.wikipedia.org/wiki/MicF_RNA
MicF RNA
sequence
SO:0000383
MicF_RNA
A non-translated 93 nt antisense RNA that binds its target ompF mRNA and regulates ompF expression by inhibiting translation and inducing degradation of the message.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00033
http://en.wikipedia.org/wiki/MicF_RNA
wiki
A small untranslated RNA which is induced in response to oxidative stress in Escherichia coli. Acts as a global regulator to activate or repress the expression of as many as 40 genes, including the fhlA-encoded transcriptional activator and the rpoS-encoded sigma(s) subunit of RNA polymerase. OxyS is bound by the Hfq protein, which increases the OxyS RNA interaction with its target messages.
http://en.wikipedia.org/wiki/OxyS_RNA
OxyS RNA
sequence
SO:0000384
OxyS_RNA
A small untranslated RNA which is induced in response to oxidative stress in Escherichia coli. Acts as a global regulator to activate or repress the expression of as many as 40 genes, including the fhlA-encoded transcriptional activator and the rpoS-encoded sigma(s) subunit of RNA polymerase. OxyS is bound by the Hfq protein, which increases the OxyS RNA interaction with its target messages.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00035
http://en.wikipedia.org/wiki/OxyS_RNA
wiki
The RNA molecule essential for the catalytic activity of RNase MRP, an enzymatically active ribonucleoprotein with two distinct roles in eukaryotes. In mitochondria it plays a direct role in the initiation of mitochondrial DNA replication. In the nucleus it is involved in precursor rRNA processing, where it cleaves the internal transcribed spacer 1 between 18S and 5.8S rRNAs.
INSDC_feature:ncRNA
INSDC_qualifier:RNase_MRP_RNA
RNase MRP RNA
sequence
SO:0000385
Moved under enzymatic_RNA on 18 Nov 2021. See GitHub Issue #533.
RNase_MRP_RNA
The RNA molecule essential for the catalytic activity of RNase MRP, an enzymatically active ribonucleoprotein with two distinct roles in eukaryotes. In mitochondria it plays a direct role in the initiation of mitochondrial DNA replication. In the nucleus it is involved in precursor rRNA processing, where it cleaves the internal transcribed spacer 1 between 18S and 5.8S rRNAs.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00030
The RNA component of Ribonuclease P (RNase P), a ubiquitous endoribonuclease, found in archaea, bacteria and eukarya as well as chloroplasts and mitochondria. Its best characterized activity is the generation of mature 5 prime ends of tRNAs by cleaving the 5 prime leader elements of precursor-tRNAs. Cellular RNase Ps are ribonucleoproteins. RNA from bacterial RNase Ps retains its catalytic activity in the absence of the protein subunit, i.e. it is a ribozyme. Isolated eukaryotic and archaeal RNase P RNA has not been shown to retain its catalytic function, but is still essential for the catalytic activity of the holoenzyme. Although the archaeal and eukaryotic holoenzymes have a much greater protein content than the bacterial ones, the RNA cores from all the three lineages are homologous. Helices corresponding to P1, P2, P3, P4, and P10/11 are common to all cellular RNase P RNAs. Yet, there is considerable sequence variation, particularly among the eukaryotic RNAs.
INSDC_feature:ncRNA
INSDC_qualifier:RNase_P_RNA
RNase P RNA
sequence
SO:0000386
Moved under enzymatic_RNA on 18 Nov 2021. See GitHub Issue #533.
RNase_P_RNA
The RNA component of Ribonuclease P (RNase P), a ubiquitous endoribonuclease, found in archaea, bacteria and eukarya as well as chloroplasts and mitochondria. Its best characterized activity is the generation of mature 5 prime ends of tRNAs by cleaving the 5 prime leader elements of precursor-tRNAs. Cellular RNase Ps are ribonucleoproteins. RNA from bacterial RNase Ps retains its catalytic activity in the absence of the protein subunit, i.e. it is a ribozyme. Isolated eukaryotic and archaeal RNase P RNA has not been shown to retain its catalytic function, but is still essential for the catalytic activity of the holoenzyme. Although the archaeal and eukaryotic holoenzymes have a much greater protein content than the bacterial ones, the RNA cores from all the three lineages are homologous. Helices corresponding to P1, P2, P3, P4, and P10/11 are common to all cellular RNase P RNAs. Yet, there is considerable sequence variation, particularly among the eukaryotic RNAs.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00010
Translational regulation of the stationary phase sigma factor RpoS is mediated by the formation of a double-stranded RNA stem-loop structure in the upstream region of the rpoS messenger RNA, occluding the translation initiation site. Clones carrying rprA (RpoS regulator RNA) increased the translation of RpoS. The rprA gene encodes a 106 nucleotide regulatory RNA. As with DsrA Rfam:RF00014, RprA is predicted to form three stem-loops. Thus, at least two small RNAs, DsrA and RprA, participate in the positive regulation of RpoS translation. Unlike DsrA, RprA does not have an extensive region of complementarity to the RpoS leader, leaving its mechanism of action unclear. RprA is non-essential.
http://en.wikipedia.org/wiki/RprA_RNA
RprA RNA
sequence
SO:0000387
RprA_RNA
Translational regulation of the stationary phase sigma factor RpoS is mediated by the formation of a double-stranded RNA stem-loop structure in the upstream region of the rpoS messenger RNA, occluding the translation initiation site. Clones carrying rprA (RpoS regulator RNA) increased the translation of RpoS. The rprA gene encodes a 106 nucleotide regulatory RNA. As with DsrA Rfam:RF00014, RprA is predicted to form three stem-loops. Thus, at least two small RNAs, DsrA and RprA, participate in the positive regulation of RpoS translation. Unlike DsrA, RprA does not have an extensive region of complementarity to the RpoS leader, leaving its mechanism of action unclear. RprA is non-essential.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00034
http://en.wikipedia.org/wiki/RprA_RNA
wiki
The Rev response element (RRE) is encoded within the HIV-env gene. Rev is an essential regulatory protein of HIV that binds an internal loop of the RRE, encouraging further Rev-RRE binding. This RNP complex is critical for mRNA export and hence for expression of the HIV structural proteins.
RRE RNA
sequence
SO:0000388
RRE_RNA
The Rev response element (RRE) is encoded within the HIV-env gene. Rev is an essential regulatory protein of HIV that binds an internal loop of the RRE, encouraging further Rev-RRE binding. This RNP complex is critical for mRNA export and hence for expression of the HIV structural proteins.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00036
A 109-nucleotide RNA of E. coli that seems to have a regulatory role on the galactose operon. Changes in Spot 42 levels are implicated in affecting DNA polymerase I levels.
http://en.wikipedia.org/wiki/Spot_42_RNA
spot-42 RNA
sequence
SO:0000389
spot_42_RNA
A 109-nucleotide RNA of E. coli that seems to have a regulatory role on the galactose operon. Changes in Spot 42 levels are implicated in affecting DNA polymerase I levels.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00021
http://en.wikipedia.org/wiki/Spot_42_RNA
wiki
The RNA component of telomerase, a reverse transcriptase that synthesizes telomeric DNA.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/Telomerase_RNA
INSDC_qualifier:telomerase_RNA
telomerase RNA
sequence
SO:0000390
telomerase_RNA
The RNA component of telomerase, a reverse transcriptase that synthesizes telomeric DNA.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00025
http://en.wikipedia.org/wiki/Telomerase_RNA
wiki
U1 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Its 5' end forms complementary base pairs with the 5' splice junction, thus defining the 5' donor site of an intron. There are significant differences in sequence and secondary structure between metazoan and yeast U1 snRNAs, the latter being much longer (568 nucleotides as compared to 164 nucleotides in human). Nevertheless, secondary structure predictions suggest that all U1 snRNAs share a 'common core' consisting of helices I, II, the proximal region of III, and IV.
http://en.wikipedia.org/wiki/U1_snRNA
U1 small nuclear RNA
U1 snRNA
small nuclear RNA U1
snRNA U1
sequence
SO:0000391
U1_snRNA
U1 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Its 5' end forms complementary base pairs with the 5' splice junction, thus defining the 5' donor site of an intron. There are significant differences in sequence and secondary structure between metazoan and yeast U1 snRNAs, the latter being much longer (568 nucleotides as compared to 164 nucleotides in human). Nevertheless, secondary structure predictions suggest that all U1 snRNAs share a 'common core' consisting of helices I, II, the proximal region of III, and IV.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00003
http://en.wikipedia.org/wiki/U1_snRNA
wiki
U1 small nuclear RNA
RSC:cb
small nuclear RNA U1
RSC:cb
snRNA U1
RSC:cb
U2 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Complementary binding between U2 snRNA (in an area lying towards the 5' end but 3' to hairpin I) and the branchpoint sequence (BPS) of the intron results in the bulging out of an unpaired adenine, on the BPS, which initiates a nucleophilic attack at the intronic 5' splice site, thus starting the first of two transesterification reactions that mediate splicing.
http://en.wikipedia.org/wiki/U2_snRNA
U2 small nuclear RNA
U2 snRNA
small nuclear RNA U2
snRNA U2
sequence
SO:0000392
U2_snRNA
U2 is a small nuclear RNA (snRNA) component of the spliceosome (involved in pre-mRNA splicing). Complementary binding between U2 snRNA (in an area lying towards the 5' end but 3' to hairpin I) and the branchpoint sequence (BPS) of the intron results in the bulging out of an unpaired adenine, on the BPS, which initiates a nucleophilic attack at the intronic 5' splice site, thus starting the first of two transesterification reactions that mediate splicing.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00004
http://en.wikipedia.org/wiki/U2_snRNA
wiki
U2 small nuclear RNA
RSC:CB
small nuclear RNA U2
RSC:CB
snRNA U2
RSC:CB
U4 small nuclear RNA (U4 snRNA) is a component of the major U2-dependent spliceosome. It forms a duplex with U6, and with each splicing round, it is displaced from U6 (and the spliceosome) in an ATP-dependent manner, allowing U6 to refold and create the active site for splicing catalysis. A recycling process involving protein Prp24 re-anneals U4 and U6.
http://en.wikipedia.org/wiki/U4_snRNA
U4 small nuclear RNA
U4 snRNA
small nuclear RNA U4
snRNA U4
sequence
SO:0000393
U4_snRNA
U4 small nuclear RNA (U4 snRNA) is a component of the major U2-dependent spliceosome. It forms a duplex with U6, and with each splicing round, it is displaced from U6 (and the spliceosome) in an ATP-dependent manner, allowing U6 to refold and create the active site for splicing catalysis. A recycling process involving protein Prp24 re-anneals U4 and U6.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00015
http://en.wikipedia.org/wiki/U4_snRNA
wiki
U4 small nuclear RNA
RSC:cb
small nuclear RNA U4
RSC:cb
snRNA U4
RSC:cb
An snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U6atac_snRNA (SO:0000397).
U4atac small nuclear RNA
U4atac snRNA
small nuclear RNA U4atac
snRNA U4atac
sequence
SO:0000394
U4atac_snRNA
An snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U6atac_snRNA (SO:0000397).
PMID:12409455
U4atac small nuclear RNA
RSC:cb
small nuclear RNA U4atac
RSC:cb
snRNA U4atac
RSC:cb
U5 RNA is a component of both types of known spliceosome. The precise function of this molecule is unknown, though it is known that the 5' loop is required for splice site selection and p220 binding, and that both the 3' stem-loop and the Sm site are important for Sm protein binding and cap methylation.
http://en.wikipedia.org/wiki/U5_snRNA
U5 small nuclear RNA
U5 snRNA
small nuclear RNA U5
snRNA U5
sequence
SO:0000395
U5_snRNA
U5 RNA is a component of both types of known spliceosome. The precise function of this molecule is unknown, though it is known that the 5' loop is required for splice site selection and p220 binding, and that both the 3' stem-loop and the Sm site are important for Sm protein binding and cap methylation.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00020
http://en.wikipedia.org/wiki/U5_snRNA
wiki
U5 small nuclear RNA
RSC:cb
small nuclear RNA U5
RSC:cb
snRNA U5
RSC:cb
U6 snRNA is a component of the spliceosome which is involved in splicing pre-mRNA. The putative secondary structure consensus base pairing is confined to a short 5' stem loop, but U6 snRNA is thought to form extensive base-pair interactions with U4 snRNA.
http://en.wikipedia.org/wiki/U6_snRNA
U6 small nuclear RNA
U6 snRNA
small nuclear RNA U6
snRNA U6
sequence
SO:0000396
U6_snRNA
U6 snRNA is a component of the spliceosome which is involved in splicing pre-mRNA. The putative secondary structure consensus base pairing is confined to a short 5' stem loop, but U6 snRNA is thought to form extensive base-pair interactions with U4 snRNA.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00015
http://en.wikipedia.org/wiki/U6_snRNA
wiki
U6 small nuclear RNA
RSC:cb
small nuclear RNA U6
RSC:cb
snRNA U6
RSC:cb
U6atac_snRNA is an snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U4atac_snRNA (SO:0000394).
U6atac small nuclear RNA
U6atac snRNA
snRNA U6atac
sequence
SO:0000397
U6atac_snRNA
U6atac_snRNA is an snRNA required for the splicing of the minor U12-dependent class of eukaryotic nuclear introns. It forms a base paired complex with U4atac_snRNA (SO:0000394).
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=retrieve&db=pubmed&list_uids=12409455&dopt=Abstract
U6atac small nuclear RNA
RSC:cb
U6atac snRNA
RSC:cb
snRNA U6atac
RSC:cb
U11 snRNA plays a role in splicing of the minor U12-dependent class of eukaryotic nuclear introns. Similar to U1 snRNA in the major class spliceosome, it base pairs to the conserved 5' splice site sequence.
http://en.wikipedia.org/wiki/U11_snRNA
U11 small nuclear RNA
U11 snRNA
small nuclear RNA U11
snRNA U11
sequence
SO:0000398
U11_snRNA
U11 snRNA plays a role in splicing of the minor U12-dependent class of eukaryotic nuclear introns. Similar to U1 snRNA in the major class spliceosome, it base pairs to the conserved 5' splice site sequence.
PMID:9622129
http://en.wikipedia.org/wiki/U11_snRNA
wiki
U11 small nuclear RNA
RSC:cb
small nuclear RNA U11
RSC:cb
snRNA U11
RSC:cb
The U12 small nuclear RNA (snRNA), together with U4atac/U6atac, U5, and U11 snRNAs and associated proteins, forms a spliceosome that cleaves a divergent class of low-abundance pre-mRNA introns.
http://en.wikipedia.org/wiki/U12_snRNA
U12 small nuclear RNA
U12 snRNA
small nuclear RNA U12
snRNA U12
sequence
SO:0000399
U12_snRNA
The U12 small nuclear RNA (snRNA), together with U4atac/U6atac, U5, and U11 snRNAs and associated proteins, forms a spliceosome that cleaves a divergent class of low-abundance pre-mRNA introns.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00007
http://en.wikipedia.org/wiki/U12_snRNA
wiki
U12 small nuclear RNA
RSC:cb
small nuclear RNA U12
RSC:cb
snRNA U12
RSC:cb
An attribute describing a quality of sequence.
sequence attribute
sequence
SO:0000400
sequence_attribute
An attribute describing a quality of sequence.
SO:ke
An attribute describing a gene.
gene attribute
sequence
SO:0000401
gene_attribute
sequence
SO:0000402
enhancer_attribute
true
U14 small nucleolar RNA (U14 snoRNA) is required for early cleavages of eukaryotic precursor rRNAs. In yeasts, this molecule possesses a stem-loop region (known as the Y-domain) which is essential for function. A similar structure, but with a different consensus sequence, is found in plants, but is absent in vertebrates.
SO:0005839
U14 small nucleolar RNA
U14 snoRNA
small nucleolar RNA U14
snoRNA U14
sequence
SO:0000403
An evolutionarily conserved eukaryotic low molecular weight RNA capable of intermolecular hybridization with both homologous and heterologous 18S rRNA.
U14_snoRNA
U14 small nucleolar RNA (U14 snoRNA) is required for early cleavages of eukaryotic precursor rRNAs. In yeasts, this molecule possesses a stem-loop region (known as the Y-domain) which is essential for function. A similar structure, but with a different consensus sequence, is found in plants, but is absent in vertebrates.
PMID:2551119
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00016
A family of RNAs found as part of the enigmatic vault ribonucleoprotein complex. The complex consists of a major vault protein (MVP), two minor vault proteins (VPARP and TEP1), and several small untranslated RNA molecules. It has been suggested that the vault complex is involved in drug resistance.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/Vault_RNA
INSDC_qualifier:vault_RNA
vault RNA
sequence
SO:0000404
vault_RNA
A family of RNAs found as part of the enigmatic vault ribonucleoprotein complex. The complex consists of a major vault protein (MVP), two minor vault proteins (VPARP and TEP1), and several small untranslated RNA molecules. It has been suggested that the vault complex is involved in drug resistance.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00006
http://en.wikipedia.org/wiki/Vault_RNA
wiki
Y RNAs are components of the Ro ribonucleoprotein particle (Ro RNP), in association with Ro60 and La proteins. The Y RNAs and Ro60 and La proteins are well conserved, but the function of the Ro RNP is not known. In humans the RNA component can be one of four small RNAs: hY1, hY3, hY4 and hY5. These small RNAs are predicted to fold into a conserved secondary structure containing three stem structures. The largest of the four, hY1, contains an additional hairpin.
INSDC_feature:ncRNA
http://en.wikipedia.org/wiki/Y_RNA
INSDC_qualifier:Y_RNA
Y RNA
sequence
SO:0000405
Y_RNA
Y RNAs are components of the Ro ribonucleoprotein particle (Ro RNP), in association with Ro60 and La proteins. The Y RNAs and Ro60 and La proteins are well conserved, but the function of the Ro RNP is not known. In humans the RNA component can be one of four small RNAs: hY1, hY3, hY4 and hY5. These small RNAs are predicted to fold into a conserved secondary structure containing three stem structures. The largest of the four, hY1, contains an additional hairpin.
http://www.sanger.ac.uk/cgi-bin/Rfam/getacc?RF00019
http://en.wikipedia.org/wiki/Y_RNA
wiki
An intron within an intron. Twintrons are group II or III introns, into which another group II or III intron has been transposed.
http://en.wikipedia.org/wiki/Twintron
sequence
SO:0000406
twintron
An intron within an intron. Twintrons are group II or III introns, into which another group II or III intron has been transposed.
PMID:1899376
PMID:7823908
http://en.wikipedia.org/wiki/Twintron
wiki
Cytosolic 18S rRNA is an RNA component of the small subunit of cytosolic ribosomes in eukaryotes.
http://en.wikipedia.org/wiki/18S_ribosomal_RNA
cytosolic 18S rRNA
cytosolic 18S ribosomal RNA
cytosolic rRNA 18S
sequence
SO:0000407
Renamed to cytosolic_18S_rRNA from rRNA_18S on 10 June 2021 as per restructuring of rRNA child terms. Updated definition to be consistent with format of other rRNA definitions. Request from EBI. See GitHub Issue #493.
cytosolic_18S_rRNA
Cytosolic 18S rRNA is an RNA component of the small subunit of cytosolic ribosomes in eukaryotes.
SO:ke
http://en.wikipedia.org/wiki/18S_ribosomal_RNA
wiki
The interbase position where something (e.g. an aberration) occurred.
sequence
SO:0000408
site
true
The interbase position where something (e.g. an aberration) occurred.
SO:ke
A biological_region of sequence that, in the molecule, interacts selectively and non-covalently with other molecules. A region on the surface of a molecule that may interact with another molecule. When applied to polypeptides: Amino acids involved in binding or interactions. It can also apply to an amino acid bond which is represented by the positions of the two flanking amino acids.
BS:00033
http://en.wikipedia.org/wiki/Binding_site
INSDC_feature:misc_binding
binding site
binding_or_interaction_site
sequence
site
SO:0000409
See GO:0005488 : binding.
binding_site
A biological_region of sequence that, in the molecule, interacts selectively and non-covalently with other molecules. A region on the surface of a molecule that may interact with another molecule. When applied to polypeptides: Amino acids involved in binding or interactions. It can also apply to an amino acid bond which is represented by the positions of the two flanking amino acids.
EBIBS:GAR
SO:ke
http://en.wikipedia.org/wiki/Binding_site
wiki
A binding site that, in the molecule, interacts selectively and non-covalently with polypeptide molecules.
INSDC_feature:protein_bind
protein binding site
sequence
SO:0000410
See GO:0042277 : peptide binding.
protein_binding_site
A binding site that, in the molecule, interacts selectively and non-covalently with polypeptide molecules.
SO:ke
A region that rescues.
rescue fragment
rescue region
sequence
rescue segment
SO:0000411
rescue_region
A region that rescues.
SO:xp
A region of polynucleotide sequence produced by digestion with a restriction endonuclease.
http://en.wikipedia.org/wiki/Restriction_fragment
restriction fragment
sequence
SO:0000412
restriction_fragment
A region of polynucleotide sequence produced by digestion with a restriction endonuclease.
SO:ke
http://en.wikipedia.org/wiki/Restriction_fragment
wiki
A region where the sequence differs from that of a specified sequence.
INSDC_feature:misc_difference
sequence difference
sequence
SO:0000413
sequence_difference
A region where the sequence differs from that of a specified sequence.
SO:ke
An attribute to describe a feature that is invalidated due to genomic contamination.
invalidated by genomic contamination
sequence
SO:0000414
invalidated_by_genomic_contamination
An attribute to describe a feature that is invalidated due to genomic contamination.
SO:ke
An attribute to describe a feature that is invalidated due to polyA priming.
invalidated by genomic polyA primed cDNA
sequence
SO:0000415
invalidated_by_genomic_polyA_primed_cDNA
An attribute to describe a feature that is invalidated due to polyA priming.
SO:ke
An attribute to describe a feature that is invalidated due to partial processing.
invalidated by partial processing
sequence
SO:0000416
invalidated_by_partial_processing
An attribute to describe a feature that is invalidated due to partial processing.
SO:ke
A structurally or functionally defined protein region. In proteins with multiple domains, the combination of the domains determines the function of the protein. A region which has been shown to recur throughout evolution.
BS:00012
BS:00134
SO:0001069
domain
structural domain
polypeptide domain
polypeptide_structural_domain
sequence
SO:0000417
Range. Old definition from before biosapiens: A region of a single polypeptide chain that folds into an independent unit and exhibits biological activity. A polypeptide chain may have multiple domains.
polypeptide_domain
A structurally or functionally defined protein region. In proteins with multiple domains, the combination of the domains determines the function of the protein. A region which has been shown to recur throughout evolution.
EBIBS:GAR
domain
uniprot:feature_type
structural domain
polypeptide_structural_domain
The signal_peptide is a short region of the peptide located at the N-terminus that directs the protein to be secreted or to become part of membrane components.
BS:00159
http://en.wikipedia.org/wiki/Signal_peptide
INSDC_feature:sig_peptide
signal peptide
signal peptide coding sequence
sequence
signal
SO:0000418
Old def before biosapiens: The sequence for an N-terminal domain of a secreted protein; this domain is involved in attaching nascent polypeptide to the membrane leader sequence.
signal_peptide
The signal_peptide is a short region of the peptide located at the N-terminus that directs the protein to be secreted or to become part of membrane components.
http://www.insdc.org/files/feature_table.html
http://en.wikipedia.org/wiki/Signal_peptide
wiki
signal
uniprot:feature_type
The polypeptide sequence that remains when the cleaved peptide regions have been cleaved from the immature peptide.
BS:00149
INSDC_feature:mat_peptide
mature protein region
sequence
chain
mature peptide
SO:0000419
This term, mature peptide, was merged with the biosapiens term mature protein region, which was taken as the new name. Old def: The coding sequence for the mature or final peptide or protein product following post-translational modification.
mature_protein_region
The polypeptide sequence that remains when the cleaved peptide regions have been cleaved from the immature peptide.
EBIBS:GAR
SO:cb
http://www.insdc.org/files/feature_table.html
chain
uniprot:feature_type
An inverted repeat (SO:0000294) occurring at the 5-prime termini of a DNA transposon.
5' TIR
five prime terminal inverted repeat
sequence
SO:0000420
five_prime_terminal_inverted_repeat
An inverted repeat (SO:0000294) occurring at the 3-prime termini of a DNA transposon.
3' TIR
three prime terminal inverted repeat
sequence
SO:0000421
three_prime_terminal_inverted_repeat
The U5 segment of the long terminal repeats.
U5 LTR region
U5 long terminal repeat region
sequence
SO:0000422
U5_LTR_region
The R segment of the long terminal repeats.
R LTR region
R long terminal repeat region
sequence
SO:0000423
R_LTR_region
The U3 segment of the long terminal repeats.
U3 LTR region
U3 long terminal repeat region
sequence
SO:0000424
U3_LTR_region
The long terminal repeat found at the five-prime end of the sequence to be inserted into the host genome.
5' LTR
5' long terminal repeat
five prime LTR
sequence
SO:0000425
five_prime_LTR
The long terminal repeat found at the three-prime end of the sequence to be inserted into the host genome.
3' LTR
3' long terminal repeat
three prime LTR
sequence
SO:0000426
three_prime_LTR
The R segment of the five-prime long terminal repeat.
R 5' long terminal repeat region
R five prime LTR region
sequence
SO:0000427
R_five_prime_LTR_region
The U5 segment of the five-prime long terminal repeat.
U5 5' long terminal repeat region
U5 five prime LTR region
sequence
SO:0000428
U5_five_prime_LTR_region
The U3 segment of the five-prime long terminal repeat.
U3 5' long terminal repeat region
U3 five prime LTR region
sequence
SO:0000429
U3_five_prime_LTR_region
The R segment of the three-prime long terminal repeat.
R 3' long terminal repeat region
R three prime LTR region
sequence
SO:0000430
R_three_prime_LTR_region
The U3 segment of the three-prime long terminal repeat.
U3 3' long terminal repeat region
U3 three prime LTR region
sequence
SO:0000431
U3_three_prime_LTR_region
The U5 segment of the three-prime long terminal repeat.
U5 3' long terminal repeat region
U5 three prime LTR region
sequence
SO:0000432
U5_three_prime_LTR_region
A polymeric tract, such as poly(dA), within a non_LTR_retrotransposon.
INSDC_feature:repeat_region
INSDC_qualifier:non_ltr_retrotransposon_polymeric_tract
non LTR retrotransposon polymeric tract
sequence
SO:0000433
non_LTR_retrotransposon_polymeric_tract
A polymeric tract, such as poly(dA), within a non_LTR_retrotransposon.
SO:ke
A sequence of the target DNA that is duplicated when a transposable element or phage inserts; usually found at each end of the insertion.
target site duplication
sequence
SO:0000434
target_site_duplication
A sequence of the target DNA that is duplicated when a transposable element or phage inserts; usually found at each end of the insertion.
http://www.koko.gov.my/CocoaBioTech/Glossaryt.html
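The target_site_duplication definition above can be illustrated with a minimal sketch, assuming the insertion boundaries are already known. The function name, the half-open coordinate convention, and the `max_len` cutoff are hypothetical choices for illustration and are not part of the ontology.

```python
def target_site_duplication(seq: str, insert_start: int, insert_end: int,
                            max_len: int = 20) -> str:
    """Return the longest sequence (up to max_len bases) that directly precedes
    the insertion and is repeated directly after it; "" if there is none.

    The insertion occupies seq[insert_start:insert_end] (end exclusive).
    """
    seq = seq.upper()
    # The duplication cannot be longer than either flank.
    longest = min(max_len, insert_start, len(seq) - insert_end)
    for n in range(longest, 0, -1):
        left = seq[insert_start - n:insert_start]
        right = seq[insert_end:insert_end + n]
        if left == right:
            return left
    return ""


# Hypothetical example: a 12-base insertion flanked by the duplicated site ATGCA.
host = "GG" + "ATGCA" + "TTTTCCCCTTTT" + "ATGCA" + "CC"
print(target_site_duplication(host, 7, 19))
```

Only exact matches are reported here; real duplications may carry mismatches, which this sketch does not handle.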
A polypurine tract within an LTR_retrotransposon.
RR tract
sequence
LTR retrotransposon poly purine tract
SO:0000435
RR_tract
A polypurine tract within an LTR_retrotransposon.
SO:ke
A sequence that can autonomously replicate, as a plasmid, when transformed into a bacterial host.
autonomously replicating sequence
sequence
SO:0000436
ARS
A sequence that can autonomously replicate, as a plasmid, when transformed into a bacterial host.
SO:ma
sequence
SO:0000437
assortment_derived_duplication
true
sequence
SO:0000438
gene_not_polyadenylated
true
A ring chromosome is a chromosome whose arms have fused together to form a ring in an inverted fashion, often with the loss of the ends of the chromosome.
inverted ring chromosome
sequence
SO:0000439
inverted_ring_chromosome
A replicon that has been modified to act as a vector for foreign sequence.
http://en.wikipedia.org/wiki/Vector_(molecular_biology)
vector
vector replicon
sequence
SO:0000440
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
vector_replicon
A replicon that has been modified to act as a vector for foreign sequence.
SO:ma
http://en.wikipedia.org/wiki/Vector_(molecular_biology)
wiki
A single stranded oligonucleotide.
single strand oligo
single strand oligonucleotide
single stranded oligonucleotide
ss oligo
ss oligonucleotide
sequence
SO:0000441
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
ss_oligo
A single stranded oligonucleotide.
SO:ke
A double stranded oligonucleotide.
double stranded oligonucleotide
ds oligo
ds-oligonucleotide
sequence
SO:0000442
This term is mapped to MGED. Do not obsolete without consulting MGED ontology.
ds_oligo
A double stranded oligonucleotide.
SO:ke
An attribute to describe the kind of biological sequence.
polymer attribute
sequence
SO:0000443
polymer_attribute
An attribute to describe the kind of biological sequence.
SO:ke
Non-coding exon in the 3' UTR.
three prime noncoding exon
sequence
SO:0000444
three_prime_noncoding_exon
Non-coding exon in the 3' UTR.
SO:ke
Non-coding exon in the 5' UTR.
5' nc exon
5' non coding exon
five prime noncoding exon
sequence
SO:0000445
five_prime_noncoding_exon
Non-coding exon in the 5' UTR.
SO:ke
Intron located in the untranslated region.
UTR intron
sequence
SO:0000446
UTR_intron
Intron located in the untranslated region.
SO:ke
An intron located in the 5' UTR.
five prime UTR intron
sequence
SO:0000447
five_prime_UTR_intron
An intron located in the 5' UTR.
SO:ke
An intron located in the 3' UTR.
three prime UTR intron
sequence
SO:0000448
three_prime_UTR_intron
An intron located in the 3' UTR.
SO:ke
A sequence of nucleotides or amino acids which, by design, has a "random" order of components, given a predetermined input frequency of these components.
random sequence
sequence
SO:0000449
random_sequence
A sequence of nucleotides or amino acids which, by design, has a "random" order of components, given a predetermined input frequency of these components.
SO:ma
A light region between two darkly staining bands in a polytene chromosome.
sequence
chromosome interband
SO:0000450
interband
A light region between two darkly staining bands in a polytene chromosome.
SO:ma
A gene that encodes a polyadenylated mRNA.
gene with polyadenylated mRNA
sequence
SO:0000451
gene_with_polyadenylated_mRNA
A gene that encodes a polyadenylated mRNA.
SO:xp
sequence
SO:0000452
transgene_attribute
true
A chromosome structure variant whereby a region of a chromosome has been transferred to another position. Among interchromosomal rearrangements, the term transposition is reserved for that class in which the telomeres of the chromosomes involved are coupled (that is to say, form the two ends of a single DNA molecule) as in wild-type.
chromosomal transposition
transposition
sequence
SO:0000453
chromosomal_transposition
A chromosome structure variant whereby a region of a chromosome has been transferred to another position. Among interchromosomal rearrangements, the term transposition is reserved for that class in which the telomeres of the chromosomes involved are coupled (that is to say, form the two ends of a single DNA molecule) as in wild-type.
FB:reference_manual
SO:ke
A 17-28-nt, small interfering RNA derived from transcripts of repetitive elements.
INSDC_feature:ncRNA
INSDC_qualifier:rasiRNA
repeat associated small interfering RNA
sequence
SO:0000454
Changed parent term from ncRNA (SO:0000655) to piRNA (SO:0001035). See GitHub Issue #573.
rasiRNA
A 17-28-nt, small interfering RNA derived from transcripts of repetitive elements.
PMID:18032451
http://www.developmentalcell.com/content/article/abstract?uid=PIIS1534580703002284
A gene that encodes an mRNA with a frameshift.
gene with mRNA with frameshift
sequence
SO:0000455
gene_with_mRNA_with_frameshift
A gene that encodes an mRNA with a frameshift.
SO:xp
A gene that is recombinationally rearranged.
recombinationally rearranged gene
sequence
SO:0000456
recombinationally_rearranged_gene
A gene that is recombinationally rearranged.
SO:ke
A chromosome duplication involving an insertion from another chromosome.
interchromosomal duplication
sequence
SO:0000457
interchromosomal_duplication
A chromosome duplication involving an insertion from another chromosome.
SO:ke
Germline genomic DNA including D-region with 5' UTR and 3' UTR, also designated as D-segment.
D gene
D-GENE
INSDC_feature:D_segment
sequence
SO:0000458
D_gene_segment
Germline genomic DNA including D-region with 5' UTR and 3' UTR, also designated as D-segment.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A gene with a transcript that is trans-spliced.
gene with trans spliced transcript
sequence
SO:0000459
gene_with_trans_spliced_transcript
A gene with a transcript that is trans-spliced.
SO:xp
Germline genomic DNA with the sequence for a V, D, C, or J portion of an immunoglobulin/T-cell receptor.
vertebrate immunoglobulin T cell receptor segment
vertebrate_immunoglobulin/T-cell receptor gene
sequence
SO:0000460
I am using the term segment instead of gene here to avoid confusion with the region 'gene'.
vertebrate_immunoglobulin_T_cell_receptor_segment
A chromosomal deletion whereby a chromosome is generated by recombination between two inversions; there is a deficiency at each end of the inversion.
inversion derived bipartite deficiency
sequence
SO:0000461
inversion_derived_bipartite_deficiency
A chromosomal deletion whereby a chromosome is generated by recombination between two inversions; there is a deficiency at each end of the inversion.
FB:km
A non-functional descendant of a functional entity.
pseudogenic region
sequence
SO:0000462
pseudogenic_region
A non-functional descendant of a functional entity.
SO:cjm
A gene that encodes more than one transcript.
encodes alternately spliced transcripts
sequence
SO:0000463
encodes_alternately_spliced_transcripts
A gene that encodes more than one transcript.
SO:ke
A non-functional descendant of an exon.
decayed exon
sequence
SO:0000464
Does not have to be part of a pseudogene.
decayed_exon
A non-functional descendant of an exon.
SO:ke
A chromosome deletion whereby a chromosome is generated by recombination between two inversions; there is a deficiency at one end of the inversion and a duplication at the other end of the inversion.
inversion derived deficiency plus duplication
sequence
SO:0000465
inversion_derived_deficiency_plus_duplication
A chromosome deletion whereby a chromosome is generated by recombination between two inversions; there is a deficiency at one end of the inversion and a duplication at the other end of the inversion.
FB:km
Germline genomic DNA including L-part1, V-intron and V-exon, with the 5' UTR and 3' UTR.
INSDC_feature:V_segment
V gene
V gene segment
V-GENE
variable_gene
sequence
SO:0000466
V_gene_segment
Germline genomic DNA including L-part1, V-intron and V-exon, with the 5' UTR and 3' UTR.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
An attribute describing a gene sequence where the resulting protein is regulated by the stability of the resulting protein.
post translationally regulated by protein stability
post-translationally regulated by protein stability
sequence
SO:0000467
post_translationally_regulated_by_protein_stability
An attribute describing a gene sequence where the resulting protein is regulated by the stability of the resulting protein.
SO:ke
One of the pieces of sequence that make up a golden path.
golden path fragment
sequence
SO:0000468
golden_path_fragment
One of the pieces of sequence that make up a golden path.
SO:rd
An attribute describing a gene sequence where the resulting protein is modified to regulate it.
post translationally regulated by protein modification
post-translationally regulated by protein modification
sequence
SO:0000469
post_translationally_regulated_by_protein_modification
An attribute describing a gene sequence where the resulting protein is modified to regulate it.
SO:ke
Germline genomic DNA of an immunoglobulin/T-cell receptor gene including J-region with 5' UTR (SO:0000204) and 3' UTR (SO:0000205), also designated as J-segment.
INSDC_feature:J_segment
J gene
J-GENE
sequence
SO:0000470
J_gene_segment
Germline genomic DNA of an immunoglobulin/T-cell receptor gene including J-region with 5' UTR (SO:0000204) and 3' UTR (SO:0000205), also designated as J-segment.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
The gene product is involved in its own transcriptional regulation.
sequence
SO:0000471
autoregulated
The gene product is involved in its own transcriptional regulation.
SO:ke
A set of regions which overlap with minimal polymorphism to form a linear sequence.
tiling path
sequence
SO:0000472
tiling_path
A set of regions which overlap with minimal polymorphism to form a linear sequence.
SO:cjm
The gene product is involved in its own transcriptional regulation where it decreases transcription.
negatively autoregulated
sequence
SO:0000473
negatively_autoregulated
The gene product is involved in its own transcriptional regulation where it decreases transcription.
SO:ke
A piece of sequence that makes up a tiling_path (SO:0000472).
tiling path fragment
sequence
SO:0000474
tiling_path_fragment
A piece of sequence that makes up a tiling_path (SO:0000472).
SO:ke
The gene product is involved in its own transcriptional regulation, where it increases transcription.
positively autoregulated
sequence
SO:0000475
positively_autoregulated
The gene product is involved in its own transcriptional regulation, where it increases transcription.
SO:ke
A DNA sequencer read which is part of a contig.
contig read
sequence
SO:0000476
contig_read
A DNA sequencer read which is part of a contig.
SO:ke
A gene that is polycistronic.
sequence
SO:0000477
polycistronic_gene
true
A gene that is polycistronic.
SO:ke
Genomic DNA of immunoglobulin/T-cell receptor gene including C-region (and introns if present) with 5' UTR (SO:0000204) and 3' UTR (SO:0000205).
C gene
C_GENE
INSDC_feature:C_region
constant gene
sequence
SO:0000478
C_gene_segment
Genomic DNA of immunoglobulin/T-cell receptor gene including C-region (and introns if present) with 5' UTR (SO:0000204) and 3' UTR (SO:0000205).
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A transcript that is trans-spliced.
INSDC_feature:tRNA
INSDC_qualifier:trans_splicing
trans spliced transcript
trans-spliced transcript
sequence
SO:0000479
trans_spliced_transcript
A transcript that is trans-spliced.
SO:xp
A clone which is part of a tiling path. A tiling path is a set of sequencing substrates, typically clones, which have been selected in order to efficiently cover a region of the genome in preparation for sequencing and assembly.
tiling path clone
sequence
SO:0000480
tiling_path_clone
A clone which is part of a tiling path. A tiling path is a set of sequencing substrates, typically clones, which have been selected in order to efficiently cover a region of the genome in preparation for sequencing and assembly.
SO:ke
An inverted repeat (SO:0000294) occurring at the termini of a DNA transposon.
TIR
terminal inverted repeat
sequence
SO:0000481
terminal_inverted_repeat
An inverted repeat (SO:0000294) occurring at the termini of a DNA transposon.
SO:ke
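As a minimal sketch of what the terminal_inverted_repeat definition implies, the check below tests whether the two termini of an element are reverse complements of each other. The function names and the fixed repeat length are assumptions made for illustration; exact matching is used, although real terminal inverted repeats are often imperfect.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def has_terminal_inverted_repeat(element: str, length: int) -> bool:
    """True if the first `length` bases equal the reverse complement
    of the last `length` bases of the element."""
    if length <= 0 or 2 * length > len(element):
        return False
    five_prime = element[:length].upper()
    three_prime = element[-length:]
    return five_prime == reverse_complement(three_prime)


# Hypothetical element: an 8-base repeat at each terminus around a 6-base core.
element = "CAGGGTTT" + "ACGTAC" + "AAACCCTG"
print(has_terminal_inverted_repeat(element, 8))
```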
Genomic DNA of immunoglobulin/T-cell receptor gene in germline configuration.
vertebrate immunoglobulin T cell receptor gene cluster
vertebrate_immunoglobulin/T-cell receptor gene cluster
sequence
SO:0000482
vertebrate_immunoglobulin_T_cell_receptor_gene_cluster
A primary transcript that is never translated into a protein.
nc primary transcript
noncoding primary transcript
sequence
SO:0000483
nc_primary_transcript
A primary transcript that is never translated into a protein.
SO:ke
The sequence of the 3' exon that is not coding.
three prime coding exon noncoding region
three_prime_exon_noncoding_region
sequence
SO:0000484
three_prime_coding_exon_noncoding_region
The sequence of the 3' exon that is not coding.
SO:ke
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one DJ-gene and one J-gene.
(DJ)-J-CLUSTER
DJ J cluster
sequence
SO:0000485
DJ_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one DJ-gene and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
The sequence of the 5' exon preceding the start codon.
five prime coding exon noncoding region
five_prime_exon_noncoding_region
sequence
SO:0000486
five_prime_coding_exon_noncoding_region
The sequence of the 5' exon preceding the start codon.
SO:ke
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene, one J-gene and one C-gene.
(VDJ)-J-C-CLUSTER
VDJ J C cluster
sequence
SO:0000487
VDJ_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene and one J-gene.
(VDJ)-J-CLUSTER
VDJ J cluster
sequence
SO:0000488
VDJ_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VDJ-gene and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene and one C-gene.
VJ C cluster
sequence
(VJ)-C-CLUSTER
SO:0000489
VJ_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene, one J-gene and one C-gene.
(VJ)-J-C-CLUSTER
VJ J C cluster
sequence
SO:0000490
VJ_J_C_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene, one J-gene and one C-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene and one J-gene.
(VJ)-J-CLUSTER
VJ J cluster
sequence
SO:0000491
VJ_J_cluster
Genomic DNA of immunoglobulin/T-cell receptor gene in rearranged configuration including at least one VJ-gene and one J-gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
Recombination signal including D-heptamer, D-spacer and D-nonamer in 5' of D-region of a D-gene or D-sequence.
D gene recombination feature
sequence
SO:0000492
D_gene_recombination_feature
7 nucleotide recombination site like CACAGTG, part of a 3' D-recombination signal sequence of an immunoglobulin/T-cell receptor gene.
3'D-HEPTAMER
three prime D heptamer
sequence
SO:0000493
three_prime_D_heptamer
7 nucleotide recombination site like CACAGTG, part of a 3' D-recombination signal sequence of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A 9 nucleotide recombination site (e.g. ACAAAAACC), part of a 3' D-recombination signal sequence of an immunoglobulin/T-cell receptor gene.
3'D-NONAMER
three prime D nonamer
sequence
SO:0000494
three_prime_D_nonamer
A 9 nucleotide recombination site (e.g. ACAAAAACC), part of a 3' D-recombination signal sequence of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
A 12 or 23 nucleotide spacer between the 3'D-HEPTAMER and 3'D-NONAMER of a 3'D-RS.
3'D-SPACER
three prime D spacer
sequence
SO:0000495
three_prime_D_spacer
A 12 or 23 nucleotide spacer between the 3'D-HEPTAMER and 3'D-NONAMER of a 3'D-RS.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
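The three preceding terms describe a heptamer, a 12 or 23 nucleotide spacer, and a nonamer. A minimal sketch of how such a layout could be located in a sequence follows, assuming exact matches to the example motifs quoted in the definitions (CACAGTG and ACAAAAACC). The function name and return format are hypothetical, and real recombination signals vary from these example motifs.

```python
HEPTAMER = "CACAGTG"    # example motif from the three_prime_D_heptamer definition
NONAMER = "ACAAAAACC"   # example motif from the three_prime_D_nonamer definition


def find_three_prime_d_rs(seq: str, heptamer: str = HEPTAMER,
                          nonamer: str = NONAMER,
                          spacer_lengths: tuple = (12, 23)) -> list:
    """Return (heptamer_start, spacer_length) for every exact
    heptamer-spacer-nonamer arrangement found in seq."""
    seq = seq.upper()
    hits = []
    start = seq.find(heptamer)
    while start != -1:
        for n in spacer_lengths:
            nonamer_start = start + len(heptamer) + n
            if seq[nonamer_start:nonamer_start + len(nonamer)] == nonamer:
                hits.append((start, n))
        # Continue searching one base further to allow overlapping heptamers.
        start = seq.find(heptamer, start + 1)
    return hits


# Hypothetical sequence with a 12-base spacer between the two motifs.
example = "TT" + HEPTAMER + "G" * 12 + NONAMER + "TT"
print(find_three_prime_d_rs(example))
```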
7 nucleotide recombination site (e.g. CACTGTG), part of a 5' D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene.
5'D-HEPTAMER
five prime D heptamer
sequence
SO:0000496
five_prime_D_heptamer
7 nucleotide recombination site (e.g. CACTGTG), part of a 5' D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene.
http://www.imgt.org/cgi-bin/IMGTlect.jv?query=7#
9 nucleotide recombination site (e.g. GGTTTTTGT), part of a five_prime_D-recombination signal sequence (SO:0000556) of an immunoglobulin/T-cell receptor gene.
5'D-NONAMER
five prime D nonamer
sequ