id: https://w3id.org/gocam name: gocam description: |- Gene Ontology Causal Activity Model (GO-CAM) Schema. This schema provides a way of representing causal pathway [Models](Model.md). A model consists of a set of [Activity](Activity.md) objects, where each activity object represents the function of either an [individual gene product](EnabledByGeneProductAssociation), a [protein complex of gene products](EnabledByGeneProductAssociation), or a set of possible gene products. Each [Models](Model.md) has associated metadata slots. Some slots such as [id](id.md), [title](title.md), and [status](status.md) are *required*. imports: - linkml:types prefixes: pav: http://purl.org/pav/ dce: http://purl.org/dc/elements/1.1/ dct: http://purl.org/dc/terms/ rdfs: http://www.w3.org/2000/01/rdf-schema# lego: http://geneontology.org/lego/ linkml: https://w3id.org/linkml/ biolink: https://w3id.org/biolink/vocab/ gocam: https://w3id.org/gocam/ OBAN: http://purl.org/oban/ goshapes: http://purl.obolibrary.org/obo/go/shapes/ RO: http://purl.obolibrary.org/obo/RO_ NCBITaxon: http://purl.obolibrary.org/obo/NCBITaxon_ BFO: http://purl.obolibrary.org/obo/BFO_ GO: http://purl.obolibrary.org/obo/GO_ ECO: http://purl.obolibrary.org/obo/ECO_ gomodel: http://model.geneontology.org/ oio: http://www.geneontology.org/formats/oboInOwl# orcid: https://orcid.org/ UniProtKB: http://purl.uniprot.org/uniprot/ PMID: http://identifiers.org/pubmed/ GOREF: http://purl.obolibrary.org/obo/go/references/ DOI: http://doi.org/ CHEBI: http://purl.obolibrary.org/obo/CHEBI_ CL: http://purl.obolibrary.org/obo/CL_ PO: http://purl.obolibrary.org/obo/PO_ FAO: http://purl.obolibrary.org/obo/FAO_ DDANAT: http://purl.obolibrary.org/obo/DDANAT_ UBERON: http://purl.obolibrary.org/obo/UBERON_ dcterms: http://purl.org/dc/terms/ RHEA: http://rdf.rhea-db.org/ default_prefix: gocam default_range: string slots: term: description: A generic slot for ontology terms range: TermObject slot_uri: gocam:term provenances: description: A generic slot for provenance information range: ProvenanceInfo slot_uri: gocam:provenances multivalued: true part_of: description: A generic slot for part-of relationships slot_uri: gocam:part_of see_also: - https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7012280/ - https://docs.google.com/presentation/d/1ja0Vkw0AoENJ58emM77dGnqPtY1nfIJMeyVnObBxIxI/edit#slide=id.p8 classes: Model: description: A model of a biological program consisting of a set of causally connected activities. rules: - title: Production rules must have at least one activity preconditions: slot_conditions: status: equals_string: production postconditions: slot_conditions: activities: required: true attributes: id: description: The identifier of the model. identifier: true range: uriorcurie id_prefixes: - gomodel examples: - value: gomodel:63f809ec00000701 description: A model representing tRNA repair and recycling title: description: The human-readable descriptive title of the model required: true slot_uri: dct:title taxon: description: The primary taxon that the model is about range: TaxonTermObject additional_taxa: description: Additional taxa that the model is about range: TaxonTermObject multivalued: true status: description: The status of the model in terms of its progression along the developmental lifecycle aliases: - model state range: ModelStateEnum slot_uri: pav:status date_modified: description: The date that the model was last modified slot_uri: dct:date comments: description: Curator-provided comments about the model slot_uri: rdfs:comment multivalued: true activities: description: All of the activities that are part of the model comments: - this slot is conditionally required. It is optional for models in development state (because a curator may need to instantiate a Model before populating it with activities), but is required for production models. See the associated rule. range: Activity inlined_as_list: true multivalued: true molecules: description: All of the molecule nodes that are part of the model. range: MoleculeNode inlined_as_list: true multivalued: true objects: description: All of the objects that are part of the model. This includes terms as well as publications and database objects like gene. This is not strictly part of the data managed by the model, it is for convenience, and should be refreshed from outside. range: Object inlined_as_list: true multivalued: true provenances: description: Model-level provenance information range: ProvenanceInfo inlined_as_list: true multivalued: true query_index: description: An optional object that contains the results of indexing a model with various summary statistics and retrieval indices. comments: - This is typically not populated in the primary transactional store (OLTP processing), because the values will be redundant with the primary edited components of the model. It is intended to be populated in batch *after* editing, and then used for generating reports, or for indexing in web applications. range: QueryIndex Activity: description: An individual activity in a causal model, representing the individual molecular activity of a single gene product or complex in the context of a particular model aliases: - annoton attributes: id: description: Identifier of the activity unit. identifier: true range: uriorcurie comments: - Typically does not need to be exposed to end-user, this exists to allow activity flows id_prefixes: - gomodel examples: - value: gomodel:63f809ec00000701/63f809ec00000706 description: A activity of the model representing alpha-aminoacyl-tRNA binding enabled by NEMF enabled_by: description: The gene product or complex that carries out the activity range: EnabledByAssociation inlined: true recommended: true molecular_function: description: The molecular function that is carried out by the gene product or complex range: MolecularFunctionAssociation inlined: true todos: - currently BP, CC etc are at the level of the activity, not the MolecularFunctionAssociation recommended: true occurs_in: description: The cellular location in which the activity occurs range: CellularAnatomicalEntityAssociation inlined: true recommended: true part_of: description: The larger biological process in which the activity is a part range: BiologicalProcessAssociation inlined: true recommended: true happens_during: description: The phase during which the activity takes place range: PhaseAssociation inlined: true molecular_associations: description: Associations between the activity and molecules that are relevant to it. This includes molecules that are products/substrates or regulators of the activity. range: MoleculeAssociation inlined_as_list: true multivalued: true causal_associations: description: The causal associations that flow out of this activity comments: - All activities in a model must be connected to at least one other activity. If a an activity has no outgoing activities (i.e the value of this slot is empty) then it is a terminal activity in the model. If an activity has no incoming activities, it is an initial activity. range: CausalAssociation inlined_as_list: true multivalued: true provenances: description: Provenance information for the activity range: ProvenanceInfo inlined_as_list: true multivalued: true MoleculeNode: description: An individual molecule or chemical entity in a causal model, representing a specific occurrence of a molecule that participates in activities as an input or output and may be localized to a particular cellular location. attributes: id: description: Identifier of the molecule node. identifier: true range: uriorcurie id_prefixes: - gomodel examples: - value: gomodel:63f809ec00000701/63f809ec00000733 description: A molecule node of the model representing tRNA precursor (CHEBI:10668) that is an input to tRNA-specific ribonuclease activity (GO:0004549) term: description: The molecule term associated with the molecule node range: MoleculeTermObject required: true located_in: description: The cellular location in which the molecule is located range: CellularAnatomicalEntityAssociation inlined: true EvidenceItem: description: An individual piece of evidence that is associated with an assertion in a model attributes: term: description: The ECO term representing the type of evidence range: EvidenceTermObject id_prefixes: - ECO bindings: - binds_value_of: id range: EvidenceCodeEnum obligation_level: REQUIRED examples: - value: ECO:0000314 description: direct assay evidence used in manual assertion (IDA) reference: description: The publication of reference that describes the evidence range: PublicationObject examples: - value: PMID:32075755 with_objects: description: Supporting database entities or terms aliases: - with - with/from range: Object multivalued: true provenances: description: Provenance about the assertion, e.g. who made it range: ProvenanceInfo inlined_as_list: true multivalued: true Association: description: An abstract grouping for different kinds of evidence-associated provenance abstract: true attributes: type: description: The type of association. comments: - when instantiating Association objects in Python and other languages, it isn't necessary to populate this, it is auto-populated from the object class. designates_type: true evidence: description: The set of evidence items that support the association. range: EvidenceItem inlined: true multivalued: true provenances: description: The set of provenance objects that provide metadata about who made the association. range: ProvenanceInfo inlined_as_list: true multivalued: true EnabledByAssociation: description: >- An association between an activity and the gene product or complex or set of potential gene products that carry out that activity. comments: - Note that this is an abstract class, and should ot be instantiated directly, instead instantiate a subclass depending on what kind of entity enables the association is_a: Association abstract: true attributes: term: description: The gene product or complex that carries out the activity range: InformationBiomacromoleculeTermObject EnabledByGeneProductAssociation: description: An association between an activity and an individual gene product is_a: EnabledByAssociation attributes: part_of: description: Associations between the gene product and protein complexes it is a part of. range: PartOfProteinComplexAssociation multivalued: true slot_usage: term: description: A "term" that is an entity database object representing an individual gene product. comments: - In the context of the GO workflow, the allowed values for this field come from the GPI file from an authoritative source. For example, the authoritative source for human is the EBI GOA group, and the GPI for this group consists of UniProtKB IDs (for proteins) and RNA Central IDs (for RNA gene products) - A gene identifier may be provided as a value here (if the authoritative GPI allows it). Note that the *interpretation* of the gene ID in the context of a GO-CAM model is the (spliceform and proteoform agnostic) *product* of that gene. range: GeneProductTermObject examples: - value: UniProtKB:Q96Q11 description: The protein product of the Homo sapiens TRNT1 gene - value: RNAcentral:URS00026A1FBE_9606 description: An RNA product of this RNA central gene EnabledByProteinComplexAssociation: description: An association between an activity and a protein complex, where the complex carries out the activity. This should only be used when the activity cannot be attributed to an individual member of the complex, but instead the function is an emergent property of the complex. comments: - Protein complexes can be specified either by *pre-composition* or *post-composition*. For pre-composition, a species-specific named protein complex (such as an entry in ComplexPortal) can be specified, in which case the value of `members` is *implicit*. For post-composition, the placeholder term "GO:0032991" can be used, in which case `members` must be *explicitly* specified. An intermediate case is when a named class in GO that is a subclass of "GO:0032991" is used. In this case, `members` should still be specified, as this may only be partially specified by the GO class. rules: - title: members must be specified when the generic GO complex is specified preconditions: slot_conditions: term: equals_string: GO:0032991 postconditions: slot_conditions: members: required: true is_a: EnabledByAssociation attributes: has_part: description: Associations between the complex and gene products that are part of it. range: ProteinComplexHasPartAssociation multivalued: true slot_usage: term: range: ProteinComplexTermObject examples: - value: GO:0032991 description: The generic GO entry for a protein complex. If this is the value of `term`, then members *must* be specified. - value: ComplexPortal:CPX-969 description: The human Caspase-2 complex PartOfProteinComplexAssociation: description: An association between a gene product and a protein complex that it is a part of. is_a: TermAssociation attributes: has_part: description: Associations between the complex and gene products that are part of it. range: ProteinComplexHasPartAssociation multivalued: true slot_usage: term: description: The protein complex that the gene product is a part of. range: ProteinComplexTermObject ProteinComplexHasPartAssociation: description: An association between a protein complex and one of its parts. is_a: TermAssociation attributes: part_of: description: Associations between the gene product and protein complexes it is a part of. range: PartOfProteinComplexAssociation multivalued: true slot_usage: term: description: The gene product that is part of the protein complex. range: GeneProductTermObject CausalAssociation: description: A causal association between two activities is_a: Association attributes: predicate: description: The RO relation that represents the type of relationship range: PredicateTermObject downstream_activity: description: The activity unit that is downstream of this one aliases: - object range: Activity TermAssociation: description: An association between an activity and a term, potentially with extensions. This is an abstract class for grouping purposes, it should not be directly instantiated, instead a subclass should be instantiated. is_a: Association abstract: true attributes: term: description: The ontology term that describes the nature of the association range: TermObject MolecularFunctionAssociation: description: An association between an activity and a molecular function term todos: - account for non-MF activity types in Reactome (MolecularEvent) is_a: TermAssociation slot_usage: term: range: MolecularFunctionTermObject BiologicalProcessAssociation: description: An association between an activity and a biological process term is_a: TermAssociation attributes: happens_during: description: Optional extension describing the phase during which the BP takes place range: PhaseAssociation part_of: description: Optional extension allowing hierarchical nesting of BPs range: BiologicalProcessAssociation inlined: true slot_usage: term: range: BiologicalProcessTermObject PhaseAssociation: description: An association between an activity or biological process and a phase term is_a: TermAssociation slot_usage: term: description: The phase ontology term associated with the biological process (not the relation type) range: PhaseTermObject CellularAnatomicalEntityAssociation: description: An association between an activity and a cellular anatomical entity term is_a: TermAssociation attributes: part_of: description: Optional extension allowing hierarchical nesting of CCs range: CellTypeAssociation inlined: true slot_usage: term: range: CellularAnatomicalEntityTermObject bindings: - binds_value_of: id range: CellularAnatomicalEntityEnum obligation_level: REQUIRED CellTypeAssociation: description: An association between an activity and a cell type term is_a: TermAssociation attributes: part_of: range: GrossAnatomyAssociation slot_usage: term: range: CellTypeTermObject GrossAnatomyAssociation: description: An association between an activity and a gross anatomical structure term is_a: TermAssociation attributes: part_of: range: GrossAnatomyAssociation slot_usage: term: range: GrossAnatomicalStructureTermObject MoleculeAssociation: description: An association between an activity and a molecule node is_a: Association attributes: predicate: description: The RO relation that represents the type of relationship range: PredicateTermObject required: true molecule: description: The molecule node that is associated with this activity range: MoleculeNode Object: description: An abstract class for all identified objects in a model attributes: id: identifier: true range: uriorcurie label: slot_uri: rdfs:label type: designates_type: true range: uriorcurie obsolete: range: boolean TermObject: description: An abstract class for all ontology term objects is_a: Object PublicationObject: description: An object that represents a publication or other kind of reference is_a: Object attributes: abstract_text: full_text: id_prefixes: - PMID - GOREF - DOI EvidenceTermObject: description: A term object that represents an evidence term from ECO. Only ECO terms that map up to a GO GAF evidence code should be used. is_a: TermObject id_prefixes: - ECO MolecularFunctionTermObject: description: A term object that represents a molecular function term from GO is_a: TermObject id_prefixes: - GO BiologicalProcessTermObject: description: A term object that represents a biological process term from GO is_a: TermObject id_prefixes: - GO CellularAnatomicalEntityTermObject: description: A term object that represents a cellular anatomical entity term from GO is_a: TermObject id_prefixes: - GO MoleculeTermObject: description: A term object that represents a molecule term from CHEBI or UniProtKB is_a: TermObject id_prefixes: - CHEBI - UniProtKB CellTypeTermObject: description: A term object that represents a cell type term from CL is_a: TermObject id_prefixes: - CL - PO - FAO - DDANAT GrossAnatomicalStructureTermObject: description: A term object that represents a gross anatomical structure term from UBERON is_a: TermObject id_prefixes: - UBERON - PO - FAO - DDANAT PhaseTermObject: description: A term object that represents a phase term from GO or UBERON is_a: TermObject id_prefixes: - GO - UBERON - PO InformationBiomacromoleculeTermObject: description: An abstract class for all information biomacromolecule term objects abstract: true is_a: TermObject GeneProductTermObject: description: A term object that represents a gene product term from GO or UniProtKB is_a: InformationBiomacromoleculeTermObject ProteinComplexTermObject: description: A term object that represents a protein complex term from GO is_a: InformationBiomacromoleculeTermObject TaxonTermObject: description: A term object that represents a taxon term from NCBITaxon is_a: TermObject id_prefixes: - NCBITaxon PredicateTermObject: description: A term object that represents an OWL relation from the OBO Relations Ontology is_a: TermObject id_prefixes: - RO ProvenanceInfo: description: Provenance information for an object attributes: contributor: multivalued: true slot_uri: dct:contributor created: slot_uri: dct:created date: slot_uri: dct:date todos: - consider modeling as date rather than string provided_by: multivalued: true slot_uri: pav:providedBy QueryIndex: description: An index that is optionally placed on a model in order to support common query or index operations. Note that this index is not typically populated in the working transactional store for a model, it is derived via computation from core primary model information. attributes: taxon_label: description: The label of the primary taxon for the model range: string number_of_activities: description: The number of activities in a model. comments: - this includes all activities, even those without an enabler. # equals_expression: "len(../activities)" range: integer number_of_enabled_by_terms: description: The number of molecular entities or sets of entities in a model. # equals_expression: "len(../activities/enabled_by)" range: integer number_of_causal_associations: description: Total number of causal association edges connecting activities in a model. This includes both explicit causal associations (e.g. "directly positively regulates") between activities and implicit causal associations between activities with shared input/output chemical entities. range: integer length_of_longest_causal_association_path: description: The maximum number of hops along activities along the direction of causal flow in a model. range: integer number_of_strongly_connected_components: description: The number of distinct components that consist of activities that are connected (directly or indirectly) via causal connections. Most models will consist of a single SCC. Some models may consist of two or more "islands" where there is no connection from one island to another. range: integer flattened_references: description: All publication objects from the model across different levels combined in one place range: PublicationObject multivalued: true inlined_as_list: true flattened_provided_by: description: All provided_by values from the model across different levels combined in one place range: Object multivalued: true inlined_as_list: true flattened_contributors: description: All contributor values from the model across different levels combined in one place range: string multivalued: true inlined_as_list: true flattened_evidence_terms: description: All evidence term objects from the model across different levels combined in one place range: EvidenceTermObject multivalued: true inlined_as_list: true model_activity_molecular_function_terms: description: All MF terms for all activities range: TermObject multivalued: true inlined_as_list: true model_activity_molecular_function_closure: description: The reflexive transitive closure of `model_activity_molecular_function_terms`, over the is_a relationship range: TermObject multivalued: true inlined_as_list: true annotations: closure_computed_over: "[rdfs:subClassOf]" model_activity_molecular_function_rollup: description: The rollup of `model_activity_molecular_function_closure` to a GO subset or slim. comments: - added for completion but may not be useful in practice range: TermObject multivalued: true inlined_as_list: true model_activity_occurs_in_terms: description: All direct cellular component localization terms for all activities range: TermObject multivalued: true inlined_as_list: true model_activity_occurs_in_closure: description: The reflexive transitive closure of `model_activity_occurs_in_terms`, over the is_a and part_of relationship type range: TermObject multivalued: true inlined_as_list: true annotations: closure_computed_over: "[rdfs:subClassOf, BFO:0000050]" model_activity_occurs_in_rollup: description: The rollup of `model_activity_occurs_in_closure` to a GO subset or slim. range: TermObject multivalued: true inlined_as_list: true model_activity_enabled_by_terms: description: All direct enabler terms for all activities range: TermObject multivalued: true inlined_as_list: true model_activity_enabled_by_closure: description: The reflexive transitive closure of `model_activity_enabled_by_terms`, over the is_a and part_of relationship types range: TermObject multivalued: true inlined_as_list: true annotations: closure_computed_over: "[rdfs:subClassOf, BFO:0000050]" model_activity_enabled_by_rollup: description: The rollup of `model_activity_enabled_by_closure` to a GO subset or slim. comments: - added for completion but may not be useful in practice range: TermObject multivalued: true inlined_as_list: true model_activity_enabled_by_genes: description: All direct enabler genes and genes that are part of enabler complexes for all activities range: TermObject multivalued: true inlined_as_list: true model_activity_part_of_terms: description: All direct biological process terms for all activities range: TermObject multivalued: true inlined_as_list: true model_activity_part_of_closure: description: The reflexive transitive closure of `model_activity_part_of_terms`, over the is_a and part_of relationship type range: TermObject multivalued: true inlined_as_list: true annotations: closure_computed_over: "[rdfs:subClassOf, BFO:0000050]" model_activity_part_of_rollup: description: The rollup of `model_activity_part_of_closure` to a GO subset or slim. range: TermObject multivalued: true inlined_as_list: true model_activity_has_input_terms: description: All direct input terms for all activities range: TermObject multivalued: true inlined_as_list: true model_activity_has_input_closure: description: The reflexive transitive closure of `model_activity_has_input_terms`, over the is_a relationship type range: TermObject multivalued: true inlined_as_list: true annotations: closure_computed_over: "[rdfs:subClassOf]" model_activity_has_input_rollup: description: The rollup of `model_activity_has_input_closure` to a GO subset or slim. range: TermObject multivalued: true inlined_as_list: true model_chemical_terms: description: All molecule terms in the model that are chemical entities (e.g. substrates or products) and not information biomacromolecules (e.g. gene products or complexes). range: TermObject multivalued: true inlined_as_list: true model_taxon: description: The primary taxon term for the model, over the NCBITaxon:subClassOf relationship type. This is used to determine the primary taxon that the model is relevant to. range: TermObject multivalued: true inlined_as_list: true model_taxon_closure: description: The reflexive transitive closure of the taxon term for the model, over the NCBITaxon:subClassOf relationship type. This is used to determine the set of taxa that are relevant to the model. range: TermObject multivalued: true inlined_as_list: true annotations: closure_computed_over: "[rdfs:subClassOf]" model_taxon_rollup: description: The rollup of the taxon closure to a NCBITaxon subset or slim. range: TermObject multivalued: true inlined_as_list: true annoton_terms: range: TermObject multivalued: true inlined_as_list: true start_activities: description: The set of activities that are the starting points of the model, i.e. those that have no incoming causal associations. range: Activity multivalued: true inlined: false end_activities: description: The set of activities that are the end points of the model, i.e. those that have no outgoing causal associations. range: Activity multivalued: true inlined: false intermediate_activities: description: The set of activities that are neither start nor end activities, i.e. those that have both incoming and outgoing causal associations. range: Activity multivalued: true inlined: false singleton_activities: description: The set of activities that have no causal associations, i.e. those that are not connected to any other activity in the model. range: Activity multivalued: true inlined: false number_of_start_activities: description: The number of start activities in a model range: integer number_of_end_activities: description: The number of end activities in a model range: integer number_of_intermediate_activities: description: The number of intermediate activities in a model range: integer number_of_singleton_activities: description: The number of singleton activities in a model range: integer enums: ModelStateEnum: description: A term describing where the model is in the development life cycle. permissible_values: development: description: Used when the curator is still working on the model. Edits are still being made, and the information in the model is not yet guaranteed to be accurate or complete. The model should not be displayed in end-user facing websites, unless it is made clear that the model is a work in progress. aliases: - work in progress production: description: Used when the curator has declared the model is ready for public consumption. Edits might still be performed on the model in future, but the information in the model is believed to be both accurate and reasonably complete. The model may be displayed in public websites. delete: description: When the curator has marked for future deletion. review: description: The model has been marked for curator review. internal_test: description: The model is not intended for use public use; it is likely to be used for internal testing. template: description: The model is a template that is not intended to represent a specific biological scenario, but rather to be used as a template for building other models. closed: description: TBD InformationBiomacromoleculeCategory: description: A term describing the type of the enabler of an activity. permissible_values: GeneOrReferenceProtein: meaning: biolink.GeneOrGeneProduct ProteinIsoform: MacromolecularComplex: Unknown: CausalPredicateEnum: description: A term describing the causal relationship between two activities. All terms are drawn from the "causally upstream or within" (RO:0002418) branch of the Relation Ontology (RO). permissible_values: causally upstream of, positive effect: meaning: RO:0002304 causally upstream of, negative effect: meaning: RO:0002305 causally upstream of: meaning: RO:0002411 immediately causally upstream of: meaning: RO:0002412 causally upstream of or within: meaning: RO:0002418 causally upstream of or within, negative effect: meaning: RO:0004046 causally upstream of or within, positive effect: meaning: RO:0004047 regulates: meaning: RO:0002211 negatively regulates: meaning: RO:0002212 positively regulates: meaning: RO:0002213 provides input for: meaning: RO:0002413 removes input for: meaning: RO:0012010 EvidenceCodeEnum: description: A term from the subset of ECO that maps up to a GAF evidence code reachable_from: source_nodes: - ECO:0000000 is_direct: false relationship_types: - rdfs:subClassOf CellularAnatomicalEntityEnum: description: A term from the subset of the cellular anatomical entity branch of GO CC reachable_from: source_nodes: - GO:0110165 # cellular anatomical structure is_direct: false relationship_types: - rdfs:subClassOf # This enum is not currently used, but it causes issues for `gen-doc` in linkml 1.10.0. # See: https://github.com/linkml/linkml/issues/3242 # PhaseEnum: # description: A term from either the phase branch of GO or the phase branch of an anatomy ontology # include: # - reachable_from: # source_nodes: # - GO:0044848 # biological phase # is_direct: false # relationship_types: # - rdfs:subClassOf # - reachable_from: # source_nodes: # - UBERON:0000105 # life cycle stage # is_direct: false # relationship_types: # - rdfs:subClassOf