http://stato-ontology.org/
Alejandra Gonzalez-Beltran (http://orcid.org/0000-0003-3499-8262)
STATO: the statistical methods ontology
Camille Maumet (http://orcid.org/0000-0002-6290-553X)
STATO is the statistical methods ontology. It contains concepts and properties related to statistical methods, probability distributions and other concepts related to statistical analysis, including relationships to study designs and plots.
stat-ontology@googlegroups.com
This Ontology is distributed under a Creative Commons Attribution License
RC1.4
http://creativecommons.org/licenses/by/3.0/
Philippe Rocca-Serra (http://orcid.org/0000-0001-9853-5668)
Thomas Nichols (http://orcid.org/0000-0002-4516-5103)
Chris Mungall (http://orcid.org/0000-0002-6601-2165)
Orlaith Burke
Statistical Method, Design of Experiment, Plots, Statistical Model
Nolan Nichols (http://orcid.org/0000-0003-1099-3328)
Hanna Cwiek (https://orcid.org/0000-0001-9113-567X)
https://github.com/ISA-tools/stato/issues
Relates an entity in the ontology to the name of the variable that is used to represent it in the code that generates the BFO OWL file from the lispy specification.
Really of interest to developers only
BFO OWL specification label
Relates an entity in the ontology to the term that is used to represent it in the CLIF specification of BFO2
Person:Alan Ruttenberg
Really of interest to developers only
BFO CLIF specification label
editor preferred label
editor preferred label
editor preferred term
editor preferred term
editor preferred term~editor preferred label
The concise, meaningful, and human-friendly name for a class or property preferred by the ontology developers. (US-English)
PERSON:Daniel Schober
GROUP:OBI:<http://purl.obolibrary.org/obo/obi>
editor preferred label
editor preferred label
editor preferred term
editor preferred term
editor preferred term~editor preferred label
example
A phrase describing how a term should be used and/or a citation to a work which uses it. May also include other kinds of examples that facilitate immediate understanding, such as widely known prototypes or instances of a class, or cases where a relation is said to hold.
PERSON:Daniel Schober
GROUP:OBI:<http://purl.obolibrary.org/obo/obi>
example of usage
has curation status
PERSON:Alan Ruttenberg
PERSON:Bill Bug
PERSON:Melanie Courtot
OBI_0000281
has curation status
definition
definition
definition
textual definition
textual definition
The official OBI definition, explaining the meaning of a class or property. Shall be Aristotelian, formalized and normalized. Can be augmented with colloquial definitions.
The official definition, explaining the meaning of a class or property. Shall be Aristotelian, formalized and normalized. Can be augmented with colloquial definitions.
2012-04-05:
Barry Smith
The official OBI definition, explaining the meaning of a class or property: 'Shall be Aristotelian, formalized and normalized. Can be augmented with colloquial definitions' is terrible.
Can you fix to something like:
A statement of necessary and sufficient conditions explaining the meaning of an expression referring to a class or property.
Alan Ruttenberg
Your proposed definition is a reasonable candidate, except that it is very common that necessary and sufficient conditions are not given. Mostly they are necessary, occasionally they are necessary and sufficient or just sufficient. Often they use terms that are not themselves defined and so they effectively can't be evaluated by those criteria.
On the specifics of the proposed definition:
We don't have definitions of 'meaning' or 'expression' or 'property'. For 'reference' in the intended sense I think we use the term 'denotation'. For 'expression', I think you mean symbol, or identifier. For 'meaning' it differs for class and property. For class we want documentation that lets the intended reader determine whether an entity is an instance of the class, or not. For property we want documentation that lets the intended reader determine, given a pair of potential relata, whether the assertion that the relation holds is true. The 'intended reader' part suggests that we also specify who, we expect, would be able to understand the definition, and also generalizes over human and computer reader to include textual and logical definition.
Personally, I am more comfortable weakening definition to documentation, with instructions as to what is desirable.
We also have the outstanding issue of how to aim different definitions to different audiences. A clinical audience reading chebi wants a different sort of definition documentation/definition from a chemistry trained audience, and similarly there is a need for a definition that is adequate for an ontologist to work with.
PERSON:Daniel Schober
GROUP:OBI:<http://purl.obolibrary.org/obo/obi>
definition
definition
definition
textual definition
textual definition
editor note
An administrative note intended for its editor. It may not be included in the publication version of the ontology, so it should contain nothing necessary for end users to understand the ontology.
PERSON:Daniel Schober
GROUP:OBI:<http://purl.obfoundry.org/obo/obi>
editor note
term editor
Name of editor entering the term in the file. The term editor is a point of contact for information regarding the term. The term editor may be, but is not always, the author of the definition, which may have been worked upon by several people
20110707, MC: label update to term editor and definition modified accordingly. See https://github.com/information-artifact-ontology/IAO/issues/115.
PERSON:Daniel Schober
GROUP:OBI:<http://purl.obolibrary.org/obo/obi>
term editor
alternative term
An alternative name for a class or property which means the same thing as the preferred name (semantically equivalent)
PERSON:Daniel Schober
GROUP:OBI:<http://purl.obolibrary.org/obo/obi>
alternative term
definition source
Formal citation, e.g. an identifier in an external database, or free text, used to indicate or attribute the source(s) for the definition. EXAMPLE: Author Name, URI, MeSH Term C04, PUBMED ID, Wiki uri on 31.01.2007
PERSON:Daniel Schober
Discussion on obo-discuss mailing-list, see http://bit.ly/hgm99w
GROUP:OBI:<http://purl.obolibrary.org/obo/obi>
definition source
curator note
An administrative note of use for a curator but of no use for a user
PERSON:Alan Ruttenberg
curator note
term tracker item
the URI for an OBI Terms ticket at sourceforge, such as https://sourceforge.net/p/obi/obi-terms/772/
An IRI or similar locator for a request or discussion of an ontology term.
Person: Jie Zheng, Chris Stoeckert, Alan Ruttenberg
Person: Jie Zheng, Chris Stoeckert, Alan Ruttenberg
The 'tracker item' can associate a tracker with a specific ontology term.
term tracker item
imported from
For external terms/classes, the ontology from which the term was imported
PERSON:Alan Ruttenberg
PERSON:Melanie Courtot
GROUP:OBI:<http://purl.obolibrary.org/obo/obi>
imported from
OBO foundry unique label
An alternative name for a class or property which is unique across the OBO Foundry.
The intended usage of that property is as follows: OBO foundry unique labels are automatically generated based on regular expressions provided by each ontology, so that SO could specify unique label = 'sequence ' + [label], MA could specify 'mouse ' + [label], etc. Upon importing terms, ontology developers can choose to use the 'OBO foundry unique label' for an imported term or not. The same applies to tools.
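The generation scheme described above can be sketched as follows. This is only an illustration, not the OBO Foundry's actual tooling: the rule table and the function name `obo_foundry_unique_label` are hypothetical, and simple string prefixing stands in for the regular expressions each ontology would provide.

```python
# Hypothetical per-ontology rules: each maps an ontology prefix to a
# function that builds a unique label from a term's ordinary label.
UNIQUE_LABEL_RULES = {
    "SO": lambda label: "sequence " + label,
    "MA": lambda label: "mouse " + label,
}


def obo_foundry_unique_label(ontology_prefix, label):
    """Return the unique label, or the plain label if no rule is registered."""
    rule = UNIQUE_LABEL_RULES.get(ontology_prefix)
    return rule(label) if rule else label


print(obo_foundry_unique_label("SO", "feature"))
print(obo_foundry_unique_label("MA", "heart"))
```

Falling back to the plain label mirrors the statement that developers and tools may choose whether or not to use the unique label for an imported term.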
PERSON:Alan Ruttenberg
PERSON:Bjoern Peters
PERSON:Chris Mungall
PERSON:Melanie Courtot
GROUP:OBO Foundry <http://obofoundry.org/>
OBO foundry unique label
elucidation
person:Alan Ruttenberg
Person:Barry Smith
Primitive terms in a highest-level ontology such as BFO are terms which are so basic to our understanding of reality that there is no way of defining them in a non-circular fashion. For these, therefore, we can provide only elucidations, supplemented by examples and by axioms
elucidation
has associated axiom(nl)
Person:Alan Ruttenberg
Person:Alan Ruttenberg
An axiom associated with a term expressed using natural language
has associated axiom(nl)
has associated axiom(fol)
Person:Alan Ruttenberg
Person:Alan Ruttenberg
An axiom expressed in first order logic using CLIF syntax
has associated axiom(fol)
ISA alternative term
An alternative term used by the ISA tools project (http://isa-tools.org).
Requested by Alejandra Gonzalez-Beltran
https://sourceforge.net/tracker/?func=detail&aid=3603413&group_id=177891&atid=886178
Person: Alejandra Gonzalez-Beltran
Person: Philippe Rocca-Serra
ISA tools project (http://isa-tools.org)
ISA alternative term
IEDB alternative term
An alternative term used by the IEDB.
PERSON:Randi Vita, Jason Greenbaum, Bjoern Peters
IEDB
IEDB alternative term
temporal interpretation
https://github.com/oborel/obo-relations/wiki/ROAndTime
an alternative term used by the STATO statistical ontology and the ISA team
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO alternative term
an R command syntax or a link to R documentation in support of Statistical Ontology Classes or Data Transformations
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
R command
an annotation property to provide a canonical command to invoke a method implementation using the Python programming language
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
Python command
the most common series or system of written mathematical symbols used to represent the entity
AGB
preferred mathematical notation
Examples of a Contributor include a person, an
organisation, or a service. Typically, the name of a
Contributor should be used to indicate the entity.
An entity responsible for making contributions to the
content of the resource.
Contributor
Contributor
Examples of a Creator include a person, an organisation,
or a service. Typically, the name of a Creator should
be used to indicate the entity.
An entity primarily responsible for making the content
of the resource.
Creator
Creator
Typically, Date will be associated with the creation or
availability of the resource. Recommended best practice
for encoding the date value is defined in a profile of
ISO 8601 [W3CDTF] and follows the YYYY-MM-DD format.
A date associated with an event in the life cycle of the
resource.
Date
Date
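The date encoding recommended in the Date comment above (the YYYY-MM-DD form of the W3CDTF profile of ISO 8601) can be produced and parsed with Python's standard library. This is a minimal sketch; the helper names are hypothetical and only the date-only form is handled.

```python
from datetime import date


def to_w3cdtf_date(d):
    """Format a date as YYYY-MM-DD."""
    return d.isoformat()


def parse_w3cdtf_date(text):
    """Parse a YYYY-MM-DD string into a date; raises ValueError if malformed."""
    return date.fromisoformat(text)


print(to_w3cdtf_date(date(2018, 5, 11)))
print(parse_w3cdtf_date("2018-05-11"))
```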
Description may include but is not limited to: an abstract,
table of contents, reference to a graphical representation
of content or a free-text account of the content.
An account of the content of the resource.
Description
Description
Typically, Format may include the media-type or dimensions of
the resource. Format may be used to determine the software,
hardware or other equipment needed to display or operate the
resource. Examples of dimensions include size and duration.
Recommended best practice is to select a value from a
controlled vocabulary (for example, the list of Internet Media
Types [MIME] defining computer media formats).
The physical or digital manifestation of the resource.
Format
Format
The present resource may be derived from the Source resource
in whole or in part. Recommended best practice is to reference
the resource by means of a string or number conforming to a
formal identification system.
A reference to a resource from which the present resource
is derived.
Source
Source
Typically, a Subject will be expressed as keywords,
key phrases or classification codes that describe a topic
of the resource. Recommended best practice is to select
a value from a controlled vocabulary or formal
classification scheme.
The topic of the content of the resource.
Subject and Keywords
Subject and Keywords
Mark Miller
2018-05-11T13:47:29Z
label
label
is part of
my brain is part of my body (continuant parthood, two material entities)
my stomach cavity is part of my stomach (continuant parthood, immaterial entity is part of material entity)
this day is part of this year (occurrent parthood)
a core relation that holds between a part and its whole
Everything is part of itself. Any part of any part of a thing is itself part of that thing. Two distinct things cannot be part of each other.
Occurrents are not subject to change and so parthood between occurrents holds for all the times that the part exists. Many continuants are subject to change, so parthood between continuants will only hold at certain times, but this is difficult to specify in OWL. See https://code.google.com/p/obo-relations/wiki/ROAndTime
Parthood requires the part and the whole to have compatible classes: only an occurrent can be part of an occurrent; only a process can be part of a process; only a continuant can be part of a continuant; only an independent continuant can be part of an independent continuant; only an immaterial entity can be part of an immaterial entity; only a specifically dependent continuant can be part of a specifically dependent continuant; only a generically dependent continuant can be part of a generically dependent continuant. (This list is not exhaustive.)
A continuant cannot be part of an occurrent: use 'participates in'. An occurrent cannot be part of a continuant: use 'has participant'. A material entity cannot be part of an immaterial entity: use 'has location'. A specifically dependent continuant cannot be part of an independent continuant: use 'inheres in'. An independent continuant cannot be part of a specifically dependent continuant: use 'bearer of'.
part_of
part of
http://www.obofoundry.org/ro/#OBO_REL:part_of
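The three properties stated for 'part of' above (everything is part of itself, parts of parts are parts, and two distinct things cannot be part of each other) amount to reflexivity, transitivity and antisymmetry. The sketch below checks them over a small set of asserted pairs. It is an illustration only: the entity names are made up, the function names are hypothetical, and time-indexing of continuant parthood is ignored.

```python
from itertools import product


def parthood_closure(entities, direct_parts):
    """Reflexive-transitive closure of asserted (part, whole) pairs."""
    closure = set(direct_parts) | {(e, e) for e in entities}
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


def is_antisymmetric(relation):
    """True if no two distinct things are related in both directions."""
    return all(a == b or (b, a) not in relation for (a, b) in relation)


entities = {"neuron", "brain", "body"}
asserted = {("neuron", "brain"), ("brain", "body")}
part_of = parthood_closure(entities, asserted)

print(("neuron", "body") in part_of)
print(is_antisymmetric(part_of))
```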
has part
my body has part my brain (continuant parthood, two material entities)
my stomach has part my stomach cavity (continuant parthood, material entity has part immaterial entity)
this year has part this day (occurrent parthood)
a core relation that holds between a whole and its part
Everything has itself as a part. Any part of any part of a thing is itself part of that thing. Two distinct things cannot have each other as a part.
Occurrents are not subject to change and so parthood between occurrents holds for all the times that the part exists. Many continuants are subject to change, so parthood between continuants will only hold at certain times, but this is difficult to specify in OWL. See https://code.google.com/p/obo-relations/wiki/ROAndTime
Parthood requires the part and the whole to have compatible classes: only an occurrent can have an occurrent as part; only a process can have a process as part; only a continuant can have a continuant as part; only an independent continuant can have an independent continuant as part; only a specifically dependent continuant can have a specifically dependent continuant as part; only a generically dependent continuant can have a generically dependent continuant as part. (This list is not exhaustive.)
A continuant cannot have an occurrent as part: use 'participates in'. An occurrent cannot have a continuant as part: use 'has participant'. An immaterial entity cannot have a material entity as part: use 'location of'. An independent continuant cannot have a specifically dependent continuant as part: use 'bearer of'. A specifically dependent continuant cannot have an independent continuant as part: use 'inheres in'.
has_part
has part
realized in
this disease is realized in this disease course
this fragility is realized in this shattering
this investigator role is realized in this investigation
is realized by
realized_in
[copied from inverse property 'realizes'] to say that b realizes c at t is to assert that there is some material entity d & b is a process which has participant d at t & c is a disposition or role of which d is bearer_of at t & the type instantiated by b is correlated with the type instantiated by c. (axiom label in BFO2 Reference: [059-003])
Paraphrase of elucidation: a relation between a realizable entity and a process, where there is some material entity that is bearer of the realizable entity and participates in the process, and the realizable entity comes to be realized in the course of the process
realized in
realizes
this disease course realizes this disease
this investigation realizes this investigator role
this shattering realizes this fragility
to say that b realizes c at t is to assert that there is some material entity d & b is a process which has participant d at t & c is a disposition or role of which d is bearer_of at t & the type instantiated by b is correlated with the type instantiated by c. (axiom label in BFO2 Reference: [059-003])
Paraphrase of elucidation: a relation between a process and a realizable entity, where there is some material entity that is bearer of the realizable entity and participates in the process, and the realizable entity comes to be realized in the course of the process
realizes
preceded by
An example is: translation preceded_by transcription; aging preceded_by development (not however death preceded_by aging). Where derives_from links classes of continuants, preceded_by links classes of processes. Clearly, however, these two relations are not independent of each other. Thus if cells of type C1 derive_from cells of type C, then any cell division involving an instance of C1 in a given lineage is preceded_by cellular processes involving an instance of C. The assertion P preceded_by P1 tells us something about Ps in general: that is, it tells us something about what happened earlier, given what we know about what happened later. Thus it does not provide information pointing in the opposite direction, concerning instances of P1 in general; that is, that each is such as to be succeeded by some instance of P. Note that an assertion to the effect that P preceded_by P1 is rather weak; it tells us little about the relations between the underlying instances in virtue of which the preceded_by relation obtains. Typically we will be interested in stronger relations, for example in the relation immediately_preceded_by, or in relations which combine preceded_by with a condition to the effect that the corresponding instances of P and P1 share participants, or that their participants are connected by relations of derivation, or (as a first step along the road to a treatment of causality) that the one process in some way affects (for example, initiates or regulates) the other.
is preceded by
preceded_by
http://www.obofoundry.org/ro/#OBO_REL:preceded_by
preceded by
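The asymmetry described in the preceded_by comment above can be written out in first-order form. This is a sketch under the assumption of an "all-some" reading of class-level assertions; the predicate name inst is introduced here for illustration and does not appear in the source.

```latex
% Assumed reading of the class-level assertion "P preceded_by P1":
% every instance of P is preceded by some instance of P1.
\forall p \, \bigl( \mathrm{inst}(p, P) \rightarrow
  \exists p_1 \, ( \mathrm{inst}(p_1, P_1) \wedge \mathrm{preceded\_by}(p, p_1) ) \bigr)

% NOT implied: every instance of P1 is succeeded by some instance of P.
\forall p_1 \, \bigl( \mathrm{inst}(p_1, P_1) \rightarrow
  \exists p \, ( \mathrm{inst}(p, P) \wedge \mathrm{preceded\_by}(p, p_1) ) \bigr)
```

Under this reading, "aging preceded_by development" holds, while "death preceded_by aging" fails because some deaths are not preceded by any aging process.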
precedes
precedes
has measurement unit label
This document is about information artifacts and their representations
is_about is a (currently) primitive relation that relates an information artifact to an entity.
7/6/2009 Alan Ruttenberg. Following discussion with Jonathan Rees, and introduction of "mentions" relation. Weaken the is_about relationship to be primitive.
We will try to build it back up by elaborating the various subproperties that are more precisely defined.
Some currently missing phenomena that should be considered "about" are predications ("The only person who knows the answer is sitting beside me"), allegory, satire, and other literary forms that can be topical without explicitly mentioning the topic.
person:Alan Ruttenberg
Smith, Ceusters, Ruttenberg, 2000 years of philosophy
is about
A person's name denotes the person. A variable name in a computer program denotes some piece of memory. Lexically equivalent strings can denote different things, for instance "Alan" can denote different people. In each case of use, there is a case of the denotation relation obtaining, between "Alan" and the person that is being named.
denotes is a primitive, instance-level, relation obtaining between an information content entity and some portion of reality. Denotation is what happens when someone creates an information content entity E in order to specifically refer to something. The only relation between E and the thing is that E can be used to 'pick out' the thing. This relation connects those two together. Freedictionary.com sense 3: To signify directly; refer to specifically
2009-11-10 Alan Ruttenberg. Old definition said the following to emphasize the generic nature of this relation. We no longer have 'specifically denotes', which would have been primitive, so make this relation primitive.
g denotes r =def
r is a portion of reality
there is some c that is a concretization of g
every c that is a concretization of g specifically denotes r
person:Alan Ruttenberg
Conversations with Barry Smith, Werner Ceusters, Bjoern Peters, Michel Dumontier, Melanie Courtot, James Malone, Bill Hogan
denotes
m is a quality measurement of q at t when
q is a quality
there is a measurement process p that has specified output m, a measurement datum, that is about q
8/6/2009 Alan Ruttenberg: The strategy is to be rather specific with this relationship. There are other kinds of measurements that are not of qualities, such as those that measure time. We will add these as separate properties for the moment and see about generalizing later
From the second IAO workshop [Alan Ruttenberg 8/6/2009: not completely current, though bringing in comparison is probably important]
This one is the one we are struggling with at the moment. The issue is what a measurement measures. On the one hand saying that it measures the quality would include it "measuring" the bearer = referring to the bearer in the measurement. However this makes comparisons of two different things not possible. On the other hand not having it inhere in the bearer, on the face of it, breaks the audit trail.
Werner suggests a solution based on "Magnitudes" a proposal for which we are awaiting details.
--
From the second IAO workshop, various comments, [commented on by Alan Ruttenberg 8/6/2009]
unit of measure is a quality, e.g. the length of a ruler.
[We decided to hedge on what units of measure are, instead talking about measurement unit labels, which are the information content entities that are about whatever measurement units are. For IAO we need that information entity in any case. See the term measurement unit label]
[Some struggling with the various subflavors of is_about. We subsequently removed the relation represents, and describes until and only when we have a better theory]
a represents b means either a denotes b or a describes b
describe:
a describes b means a is about b and a allows an inference of at least one quality of b
We have had a long discussion about denotes versus describes.
From the second IAO workshop: An attempt at tying the quality to the measurement datum more carefully.
a is a magnitude means a is a determinate quality particular inhering in some bearer b existing at a time t that can be represented/denoted by an information content entity e that has parts denoting a unit of measure, a number, and b. The unit of measure is an instance of the determinable quality.
From the second meeting on IAO:
An attempt at defining assay using Barry's "reliability" wording
assay:
process and has_input some material entity
and has_output some information content entity
and which is such that instances of this process type reliably generate
outputs that describe the input.
Alan Ruttenberg
is quality measurement of
relating a cartesian spatial coordinate datum to a unit label that together with the values represent a point
has coordinate unit label
relates a process to a time-measurement-datum that represents the duration of the process
Person:Alan Ruttenberg
is duration of
inverse of the relation of is quality measurement of
2009/10/19 Alan Ruttenberg. Named 'junk' relation useful in restrictions, but not a real instance relationship
Person:Alan Ruttenberg
is quality measured as
relates a time stamped measurement datum to the time measurement datum that denotes the time when the measurement was taken
Alan Ruttenberg
has time stamp
relates a time stamped measurement datum to the measurement datum that was measured
Alan Ruttenberg
has measurement datum
is_supported_by_data
The relation between the conclusion "Gene tpbA is involved in EPS production" and the data items produced using two sets of organisms, one being a tpbA knockout, the other being tpbA wildtype, tested in polysaccharide production assays and analyzed using an ANOVA.
The relation between a data item and a conclusion where the conclusion is the output of a data interpreting process and the data item is used as an input to that process
OBI
OBI
Philly 2011 workshop
is_supported_by_data
has_specified_input
has_specified_input
see is_input_of example_of_usage
A relation between a planned process and a continuant participating in that process that is not created during the process. The presence of the continuant during the process is explicitly specified in the plan specification which the process realizes the concretization of.
8/17/09: specified inputs of one process are not necessarily specified inputs of a larger process that it is part of. This is in contrast to how 'has participant' works.
PERSON: Alan Ruttenberg
PERSON: Bjoern Peters
PERSON: Larry Hunter
PERSON: Melanie Courtot
has_specified_input
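The 8/17/09 note above contrasts 'has_specified_input' with 'has participant': specified inputs stay local to the process whose plan names them, while participation carries up to a larger process. A minimal sketch of that difference follows. The class, the example processes and the upward-propagation rule are assumptions made for illustration, not part of OBI.

```python
class Process:
    """Toy planned process with locally specified inputs and sub-processes."""

    def __init__(self, name, specified_inputs=(), parts=()):
        self.name = name
        self.specified_inputs = set(specified_inputs)  # not inherited from parts
        self.parts = list(parts)

    def participants(self):
        """Assumed rule: participants of a part also participate in the whole."""
        found = set(self.specified_inputs)
        for part in self.parts:
            found |= part.participants()
        return found


staining = Process("staining", specified_inputs={"dye"})
assay = Process("assay", specified_inputs={"tissue sample"}, parts=[staining])

print("dye" in assay.participants())
print("dye" in assay.specified_inputs)
```

Here the dye participates in the assay as a whole, but it is a specified input only of the staining step.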
is_specified_input_of
some Autologous EBV (Epstein-Barr virus)-transformed B-LCL (B lymphocyte cell line) is_input_for instance of Chromium Release Assay described at https://wiki.cbil.upenn.edu/obiwiki/index.php/Chromium_Release_assay
A relation between a planned process and a continuant participating in that process that is not created during the process. The presence of the continuant during the process is explicitly specified in the plan specification which the process realizes the concretization of.
Alan Ruttenberg
PERSON:Bjoern Peters
is_specified_input_of
has_specified_output
has_specified_output
A relation between a planned process and a continuant participating in that process. The presence of the continuant at the end of the process is explicitly specified in the objective specification which the process realizes the concretization of.
PERSON: Alan Ruttenberg
PERSON: Bjoern Peters
PERSON: Larry Hunter
PERSON: Melanie Courtot
has_specified_output
is_specified_output_of
is_specified_output_of
A relation between a planned process and a continuant participating in that process. The presence of the continuant at the end of the process is explicitly specified in the objective specification which the process realizes the concretization of.
Alan Ruttenberg
PERSON:Bjoern Peters
is_specified_output_of
is_proxy_for
position on a gel is_proxy_for mass and charge of molecule in a western blot. Fluorescent intensity is_proxy_for amount of protein labeled with GFP. Examples:
A260/A280 (of a DNA sample) is_proxy_for DNA-purity. NMR Sample scan is a proxy for sample quality.
Within the assay mentioned here: https://wiki.cbil.upenn.edu/obiwiki/index.php/Chromium_Release_assay
level of radioactivity is_proxy_for level of toxicity
A relation between continuant instances c1 and c2 where within an experiment/protocol application, measurement of c1 is used to determine what a measurement of c2 would be.
A relation between continuant instances c1 and c2 where within a protocol
application, measurement of c1 is related to what would be the
measurement of c2.
(another definition)
Alan Ruttenberg
is_proxy_for
achieves_planned_objective
A cell sorting process achieves the objective specification 'material separation objective'
This relation obtains between a planned process and an objective specification when the criteria specified in the objective specification are met at the end of the planned process.
BP, AR, PPPB branch
PPPB branch derived
modified according to email thread from 1/23/09 in accordance with DT and PPPB branch
achieves_planned_objective
has grain
the relation of the cells in the finger of the skin to the finger, in which an indeterminate number of grains are parts of the whole by virtue of being grains in a collective that is part of the whole, and in which removing one granular part does not necessarily damage or diminish the whole. Ontological Whether there is a fixed, or nearly fixed number of parts - e.g. fingers of the hand, chambers of the heart, or wheels of a car - such that there can be a notion of a single one being missing, or whether, by contrast, the number of parts is indeterminate - e.g., cells in the skin of the hand, red cells in blood, or rubber molecules in the tread of the tire of the wheel of the car.
Discussion in Karlsruhe with, among others, Alan Rector, Stefan Schulz, Marijke Keet, Melanie Courtot, and Alan Ruttenberg. Definition taken from the definition of granular parthood in the cited paper. Needs work to put into standard form.
PERSON: Alan Ruttenberg
PAPER: Granularity, scale and collectivity: When size does and does not matter, Alan Rector, Jeremy Rogers, Thomas Bittner, Journal of Biomedical Informatics 39 (2006) 333-349
has grain
objective_achieved_by
This relation obtains between an objective specification and a planned process when the criteria specified in the objective specification are met at the end of the planned process.
OBI
OBI
objective_achieved_by
is member of organization
Relating a legal person or organization to an organization in the case where the legal person or organization has a role as member of the organization.
2009/10/01 Alan Ruttenberg. Barry prefers generic is-member-of. Question of what the range should be. For now organization. Is organization a population? Would the same relation be used to record members of a population?
JZ: Discussed on May 7, 2012 OBI dev call. Bjoern points out that we need to allow for organizations to be members of organizations. And agreed by the other OBI developers. So, human and organization were specified in 'Domains'. The textual definition was updated based on it.
Person:Alan Ruttenberg
Person:Helen Parkinson
Person:Alan Ruttenberg
Person:Helen Parkinson
2009/09/28 Alan Ruttenberg. Fucoidan-use-case
is member of organization
has organization member
Relating an organization to a legal person or organization.
See tracker:
https://sourceforge.net/tracker/index.php?func=detail&aid=3512902&group_id=177891&atid=886178
Person: Jie Zheng
has organization member
specifies value of
A relation between a value specification and an entity which the specification is about.
specifies value of
has value specification
A relation between an information content entity and a value specification that specifies its value.
PERSON: James A. Overton
OBI
has value specification
inheres in
this fragility inheres in this vase
this red color inheres in this apple
a relation between a specifically dependent continuant (the dependent) and an independent continuant (the bearer), in which the dependent specifically depends on the bearer for its existence
A dependent inheres in its bearer at all times for which the dependent exists.
inheres_in
inheres in
bearer of
this apple is bearer of this red color
this vase is bearer of this fragility
a relation between an independent continuant (the bearer) and a specifically dependent continuant (the dependent), in which the dependent specifically depends on the bearer for its existence
A bearer can have many dependents, and its dependents can exist for different periods of time, but none of its dependents can exist when the bearer does not exist.
bearer_of
is bearer of
bearer of
participates in
this blood clot participates in this blood coagulation
this input material (or this output material) participates in this process
this investigator participates in this investigation
a relation between a continuant and a process, in which the continuant is somehow involved in the process
participates_in
participates in
has participant
this blood coagulation has participant this blood clot
this investigation has participant this investigator
this process has participant this input material (or this output material)
a relation between a process and a continuant, in which the continuant is somehow involved in the process
Has_participant is a primitive instance-level relation between a process, a continuant, and a time at which the continuant participates in some way in the process. The relation obtains, for example, when this particular process of oxygen exchange across this particular alveolar membrane has_participant this particular sample of hemoglobin at this particular time.
has_participant
http://www.obofoundry.org/ro/#OBO_REL:has_participant
has participant
A journal article is an information artifact that inheres in some number of printed journals. For each copy of the printed journal there is some quality that carries the journal article, such as a pattern of ink. The journal article (a generically dependent continuant) is concretized as the quality (a specifically dependent continuant), and both depend on that copy of the printed journal (an independent continuant).
An investigator reads a protocol and forms a plan to carry out an assay. The plan is a realizable entity (a specifically dependent continuant) that concretizes the protocol (a generically dependent continuant), and both depend on the investigator (an independent continuant). The plan is then realized by the assay (a process).
A relationship between a generically dependent continuant and a specifically dependent continuant, in which the generically dependent continuant depends on some independent continuant in virtue of the fact that the specifically dependent continuant also depends on that same independent continuant. A generically dependent continuant may be concretized as multiple specifically dependent continuants.
is concretized as
A journal article is an information artifact that inheres in some number of printed journals. For each copy of the printed journal there is some quality that carries the journal article, such as a pattern of ink. The quality (a specifically dependent continuant) concretizes the journal article (a generically dependent continuant), and both depend on that copy of the printed journal (an independent continuant).
An investigator reads a protocol and forms a plan to carry out an assay. The plan is a realizable entity (a specifically dependent continuant) that concretizes the protocol (a generically dependent continuant), and both depend on the investigator (an independent continuant). The plan is then realized by the assay (a process).
A relationship between a specifically dependent continuant and a generically dependent continuant, in which the generically dependent continuant depends on some independent continuant in virtue of the fact that the specifically dependent continuant also depends on that same independent continuant. Multiple specifically dependent continuants can concretize the same generically dependent continuant.
concretizes
this catalysis function is a function of this enzyme
a relation between a function and an independent continuant (the bearer), in which the function specifically depends on the bearer for its existence
A function inheres in its bearer at all times for which the function exists, however the function need not be realized at all the times that the function exists.
function_of
is function of
function of
this red color is a quality of this apple
a relation between a quality and an independent continuant (the bearer), in which the quality specifically depends on the bearer for its existence
A quality inheres in its bearer at all times for which the quality exists.
is quality of
quality_of
quality of
this investigator role is a role of this person
a relation between a role and an independent continuant (the bearer), in which the role specifically depends on the bearer for its existence
A role inheres in its bearer at all times for which the role exists, however the role need not be realized at all the times that the role exists.
is role of
role_of
role of
this enzyme has function this catalysis function (more colloquially: this enzyme has this catalysis function)
a relation between an independent continuant (the bearer) and a function, in which the function specifically depends on the bearer for its existence
A bearer can have many functions, and its functions can exist for different periods of time, but none of its functions can exist when the bearer does not exist. A function need not be realized at all the times that the function exists.
has_function
has function
this apple has quality this red color
a relation between an independent continuant (the bearer) and a quality, in which the quality specifically depends on the bearer for its existence
A bearer can have many qualities, and its qualities can exist for different periods of time, but none of its qualities can exist when the bearer does not exist.
has_quality
has quality
this person has role this investigator role (more colloquially: this person has this role of investigator)
a relation between an independent continuant (the bearer) and a role, in which the role specifically depends on the bearer for its existence
A bearer can have many roles, and its roles can exist for different periods of time, but none of its roles can exist when the bearer does not exist. A role need not be realized at all the times that the role exists.
has_role
has role
derives from
this cell derives from this parent cell (cell division)
this nucleus derives from this parent nucleus (nuclear division)
a relation between two distinct material entities, the new entity and the old entity, in which the new entity begins to exist when the old entity ceases to exist, and the new entity inherits the significant portion of the matter of the old entity
This is a very general relation. More specific relations are preferred when applicable, such as 'directly develops from'.
derives_from
derives from
this parent cell derives into this cell (cell division)
this parent nucleus derives into this nucleus (nuclear division)
a relation between two distinct material entities, the old entity and the new entity, in which the new entity begins to exist when the old entity ceases to exist, and the new entity inherits the significant portion of the matter of the old entity
This is a very general relation. More specific relations are preferred when applicable, such as 'directly develops into'. To avoid making statements about a future that may not come to pass, it is often better to use the backward-looking 'derives from' rather than the forward-looking 'derives into'.
derives_into
derives into
is location of
my head is the location of my brain
this cage is the location of this rat
a relation between two independent continuants, the location and the target, in which the target is entirely within the location
Most location relations will only hold at certain times, but this is difficult to specify in OWL. See https://code.google.com/p/obo-relations/wiki/ROAndTime
location_of
location of
located in
my brain is located in my head
this rat is located in this cage
a relation between two independent continuants, the target and the location, in which the target is entirely within the location
Location as a relation between instances: The primitive instance-level relation c located_in r at t reflects the fact that each continuant is at any given time associated with exactly one spatial region, namely its exact location. Following we can use this relation to define a further instance-level location relation - not between a continuant and the region which it exactly occupies, but rather between one continuant and another. c is located in c1, in this sense, whenever the spatial region occupied by c is part_of the spatial region occupied by c1. Note that this relation comprehends both the relation of exact location between one continuant and another which obtains when r and r1 are identical (for example, when a portion of fluid exactly fills a cavity), as well as those sorts of inexact location relations which obtain, for example, between brain and head or between ovum and uterus
Most location relations will only hold at certain times, but this is difficult to specify in OWL. See https://code.google.com/p/obo-relations/wiki/ROAndTime
located_in
http://www.obofoundry.org/ro/#OBO_REL:located_in
located in
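The derived 'located in' relation described above (c is located in c1 whenever the region occupied by c is part of the region occupied by c1) can be sketched as a subset test. This is only an illustrative toy model: the `occupied_region` table, the grid-cell encoding of regions, and the function names are assumptions made for the example, and time indexing is ignored.

```python
# Hypothetical toy model: each continuant occupies a set of grid cells
# that stands in for its spatial region at one fixed time.
occupied_region = {
    "head": {(0, 0), (0, 1), (1, 0), (1, 1)},
    "brain": {(0, 0), (0, 1)},
    "cavity": {(5, 5), (5, 6)},
    "fluid": {(5, 5), (5, 6)},
}


def located_in(c, c1):
    """c is located in c1 when c's region is part of c1's region."""
    return occupied_region[c] <= occupied_region[c1]


def exactly_located_in(c, c1):
    """Exact location: the two occupied regions are identical."""
    return occupied_region[c] == occupied_region[c1]


print(located_in("brain", "head"))
print(located_in("head", "brain"))
print(exactly_located_in("fluid", "cavity"))
```

In this sketch exact location is the special case of 'located in' where the two regions coincide, which mirrors the fluid-in-cavity example in the text.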
move to BFO?
Allen
A relation that holds between two occurrents. This is a grouping relation that collects together all the Allen relations.
temporal relation
property to indicate that a design declares a variable; the inverse property is 'is declared by'
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
declares
property to indicate the variables declared by a design; the inverse property is 'declares'
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
is declared by
the relationship between a fraction and the number above the line
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
AGB
has numerator
relationship between a planned process and the plan specification that it carries out; it is defined as equivalent to the composed relationship (realizes o concretizes)
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
AGB
executes
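The definition of 'executes' as the composition (realizes o concretizes) can be illustrated by treating each relation as a set of pairs. This is a minimal sketch, not how an OWL reasoner evaluates property chains; the individuals `assay_run_1`, `plan_1` and `protocol_A` and the helper `compose` are hypothetical.

```python
def compose(first, second):
    """Return all (a, c) such that (a, b) is in first and (b, c) is in second."""
    return {(a, c) for (a, b) in first for (b2, c) in second if b == b2}


# Hypothetical assertions: a planned process realizes a plan,
# and that plan concretizes a plan specification.
realizes = {("assay_run_1", "plan_1")}
concretizes = {("plan_1", "protocol_A")}

# 'executes' links the planned process directly to the plan specification.
executes = compose(realizes, concretizes)
print(executes)
```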
This is the inverse of 'specifies value of' and it is intended to say things such as 'compound' 'assumes values specified by' 'independent variable specification'
A relation between an entity and a value specification, where the value specification is about the entity.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
AGB
assumes values specified by
relationship between an element and a set it belongs to
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
AGB
is member of
relationship between a set and one of its elements
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
AGB
has member
Inverse relation of 'denotes', where denotation is what happens when someone creates an information content entity E in order to specifically refer to something (from 'denotes' definition).
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
is denoted by
the relationship between a fraction and the number below the line (or divisor)
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
AGB
has denominator
has effect on
has fixed effect on
has interaction effect on
has random effect on
has order in sequence
Relationship between a parameter of a model and the estimate produced by estimation process as used in statistical modeling.
estimate of
computed_from is a relation between two information content entities denoting how one is derived from the other through the application of a data transformation or computation process.
computed from
is model for
is modeled by
has measurement value
has x coordinate value
has y coordinate value
has specified numeric value
A relation between a value specification and a number that quantifies it.
A range of 'real' might be better than 'float'. For now we follow 'has measurement value' until we can consider technical issues with SPARQL queries and reasoning.
PERSON: James A. Overton
OBI
has specified numeric value
has specified value
A relation between a value specification and a literal.
This is not an RDF/OWL object property. It is intended to link a value found in e.g. a database column of 'M' (the literal) to an instance of a value specification class, which can then be linked to indicate that this is about the biological gender of a human subject.
OBI
has specified value
A relationship (data property) between an entity and its value.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
has value
entity
Entity
Julius Caesar
Verdi’s Requiem
the Second World War
your body mass index
BFO 2 Reference: In all areas of empirical inquiry we encounter general terms of two sorts. First are general terms which refer to universals or types: animal; tuberculosis; surgical procedure; disease. Second are general terms used to refer to groups of entities which instantiate a given universal but do not correspond to the extension of any subuniversal of that universal because there is nothing intrinsic to the entities in question by virtue of which they – and only they – are counted as belonging to the given group. Examples are: animal purchased by the Emperor; tuberculosis diagnosed on a Wednesday; surgical procedure performed on a patient from Stockholm; person identified as candidate for clinical trial #2056-555; person who is signatory of Form 656-PPV; painting by Leonardo da Vinci. Such terms, which represent what are called ‘specializations’ in [81
Entity doesn't have a closure axiom because the subclasses don't necessarily exhaust all possibilities. For example Werner Ceusters' 'portions of reality' include 4 sorts, entities (as BFO construes them), universals, configurations, and relations. It is an open question as to whether entities as construed in BFO will at some point also include these other portions of reality. See, for example, 'How to track absolutely everything' at http://www.referent-tracking.com/_RTU/papers/CeustersICbookRevised.pdf
An entity is anything that exists or has existed or will exist. (axiom label in BFO2 Reference: [001-001])
entity
continuant
Continuant
An entity that exists in full at any time in which it exists at all, persists through time while maintaining its identity and has no temporal parts.
BFO 2 Reference: Continuant entities are entities which can be sliced to yield parts only along the spatial dimension, yielding for example the parts of your table which we call its legs, its top, its nails. ‘My desk stretches from the window to the door. It has spatial parts, and can be sliced (in space) in two. With respect to time, however, a thing is a continuant.’ [60, p. 240
Continuant doesn't have a closure axiom because the subclasses don't necessarily exhaust all possibilities. For example, in an expansion involving bringing in some of Ceusters' other portions of reality, questions are raised as to whether universals are continuants.
A continuant is an entity that persists, endures, or continues to exist through time while maintaining its identity. (axiom label in BFO2 Reference: [008-002])
if b is a continuant and if, for some t, c has_continuant_part b at t, then c is a continuant. (axiom label in BFO2 Reference: [126-001])
if b is a continuant and if, for some t, c is continuant_part_of b at t, then c is a continuant. (axiom label in BFO2 Reference: [009-002])
if b is a material entity, then there is some temporal interval (referred to below as a one-dimensional temporal region) during which b exists. (axiom label in BFO2 Reference: [011-002])
(forall (x y) (if (and (Continuant x) (exists (t) (continuantPartOfAt y x t))) (Continuant y))) // axiom label in BFO2 CLIF: [009-002]
(forall (x y) (if (and (Continuant x) (exists (t) (hasContinuantPartOfAt y x t))) (Continuant y))) // axiom label in BFO2 CLIF: [126-001]
(forall (x) (if (Continuant x) (Entity x))) // axiom label in BFO2 CLIF: [008-002]
(forall (x) (if (MaterialEntity x) (exists (t) (and (TemporalRegion t) (existsAt x t))))) // axiom label in BFO2 CLIF: [011-002]
continuant
occurrent
Occurrent
An entity that has temporal parts and that happens, unfolds or develops through time.
BFO 2 Reference: every occurrent that is not a temporal or spatiotemporal region is s-dependent on some independent continuant that is not a spatial region
BFO 2 Reference: s-dependence obtains between every process and its participants in the sense that, as a matter of necessity, this process could not have existed unless these or those participants existed also. A process may have a succession of participants at different phases of its unfolding. Thus there may be different players on the field at different times during the course of a football game; but the process which is the entire game s-depends_on all of these players nonetheless. Some temporal parts of this process will s-depend_on only some of the players.
Occurrent doesn't have a closure axiom because the subclasses don't necessarily exhaust all possibilities. An example would be the sum of a process and the process boundary of another process.
Simons uses different terminology for relations of occurrents to regions: Denote the spatio-temporal location of a given occurrent e by 'spn[e]' and call this region its span. We may say an occurrent is at its span, in any larger region, and covers any smaller region. Now suppose we have fixed a frame of reference so that we can speak not merely of spatio-temporal but also of spatial regions (places) and temporal regions (times). The spread of an occurrent (relative to a frame of reference) is the space it exactly occupies, and its spell is likewise the time it exactly occupies. We write 'spr[e]' and 'spl[e]' respectively for the spread and spell of e, omitting mention of the frame.
An occurrent is an entity that unfolds itself in time or it is the instantaneous boundary of such an entity (for example a beginning or an ending) or it is a temporal or spatiotemporal region which such an entity occupies_temporal_region or occupies_spatiotemporal_region. (axiom label in BFO2 Reference: [077-002])
Every occurrent occupies_spatiotemporal_region some spatiotemporal region. (axiom label in BFO2 Reference: [108-001])
b is an occurrent entity iff b is an entity that has temporal parts. (axiom label in BFO2 Reference: [079-001])
(forall (x) (if (Occurrent x) (exists (r) (and (SpatioTemporalRegion r) (occupiesSpatioTemporalRegion x r))))) // axiom label in BFO2 CLIF: [108-001]
(forall (x) (iff (Occurrent x) (and (Entity x) (exists (y) (temporalPartOf y x))))) // axiom label in BFO2 CLIF: [079-001]
occurrent
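The span, spread and spell terminology attributed to Simons above can be pictured with a small sketch. It assumes, purely for illustration, that a span is encoded as a set of (place, time) cells under one fixed frame of reference, so that spread and spell become projections onto the place and time components; the data and function names are hypothetical.

```python
# Hypothetical span of a football game: the (place, time) cells it occupies
# relative to one fixed frame of reference.
game_span = {("field", 1), ("field", 2), ("locker_room", 3)}


def spread(span):
    """The space exactly occupied: projection of the span onto places."""
    return {place for place, _ in span}


def spell(span):
    """The time exactly occupied: projection of the span onto times."""
    return {time for _, time in span}


print(sorted(spread(game_span)))
print(sorted(spell(game_span)))
```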
ic
IndependentContinuant
a chair
a heart
a leg
a molecule
a spatial region
an atom
an orchestra.
an organism
the bottom right portion of a human torso
the interior of your mouth
A continuant that is a bearer of quality and realizable entity entities, in which other entities inhere and which itself cannot inhere in anything.
b is an independent continuant = Def. b is a continuant which is such that there is no c and no t such that b s-depends_on c at t. (axiom label in BFO2 Reference: [017-002])
For any independent continuant b and any time t there is some spatial region r such that b is located_in r at t. (axiom label in BFO2 Reference: [134-001])
For every independent continuant b and time t during the region of time spanned by its life, there are entities which s-depends_on b during t. (axiom label in BFO2 Reference: [018-002])
(forall (x t) (if (IndependentContinuant x) (exists (r) (and (SpatialRegion r) (locatedInAt x r t))))) // axiom label in BFO2 CLIF: [134-001]
(forall (x t) (if (and (IndependentContinuant x) (existsAt x t)) (exists (y) (and (Entity y) (specificallyDependsOnAt y x t))))) // axiom label in BFO2 CLIF: [018-002]
(iff (IndependentContinuant a) (and (Continuant a) (not (exists (b t) (specificallyDependsOnAt a b t))))) // axiom label in BFO2 CLIF: [017-002]
independent continuant
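The CLIF definition [017-002] above says that something is an independent continuant exactly when it is a continuant and there is no entity and time at which it specifically depends on anything. A minimal sketch of that test over a small hand-written fact base follows; the individuals, the time label `t1`, and the tuple encoding of `specificallyDependsOnAt` are assumptions for the example.

```python
# Hypothetical fact base.
continuants = {"apple", "red_color", "vase", "fragility"}

# (dependent, bearer, time) triples standing in for specificallyDependsOnAt.
specifically_depends_on_at = {
    ("red_color", "apple", "t1"),
    ("fragility", "vase", "t1"),
}


def is_independent_continuant(a):
    """True if a is a continuant that s-depends on nothing at any time."""
    depends_on_something = any(
        dependent == a for dependent, _, _ in specifically_depends_on_at
    )
    return a in continuants and not depends_on_something


print(is_independent_continuant("apple"))
print(is_independent_continuant("red_color"))
```

Note that this checks only the recorded facts, so it behaves like a closed-world query; an OWL reasoner working under the open-world assumption would not conclude independence from missing assertions alone.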
s-region
SpatialRegion
BFO 2 Reference: Spatial regions do not participate in processes.
Spatial region doesn't have a closure axiom because the subclasses don't exhaust all possibilities. An example would be the union of a spatial point and a spatial line that doesn't overlap the point, or two spatial lines that intersect at a single point. In both cases the resultant spatial region is neither 0-dimensional, 1-dimensional, 2-dimensional, nor 3-dimensional.
A spatial region is a continuant entity that is a continuant_part_of spaceR as defined relative to some frame R. (axiom label in BFO2 Reference: [035-001])
All continuant parts of spatial regions are spatial regions. (axiom label in BFO2 Reference: [036-001])
(forall (x y t) (if (and (SpatialRegion x) (continuantPartOfAt y x t)) (SpatialRegion y))) // axiom label in BFO2 CLIF: [036-001]
(forall (x) (if (SpatialRegion x) (Continuant x))) // axiom label in BFO2 CLIF: [035-001]
spatial region
2d-s-region
TwoDimensionalSpatialRegion
an infinitely thin plane in space.
the surface of a sphere-shaped part of space
A two-dimensional spatial region is a spatial region that is of two dimensions. (axiom label in BFO2 Reference: [039-001])
(forall (x) (if (TwoDimensionalSpatialRegion x) (SpatialRegion x))) // axiom label in BFO2 CLIF: [039-001]
two-dimensional spatial region
process
Process
a process of cell-division, a beating of the heart
a process of meiosis
a process of sleeping
the course of a disease
the flight of a bird
the life of an organism
your process of aging.
An occurrent that has temporal proper parts and for some time t, p s-depends_on some material entity at t.
p is a process = Def. p is an occurrent that has temporal proper parts and for some time t, p s-depends_on some material entity at t. (axiom label in BFO2 Reference: [083-003])
BFO 2 Reference: The realm of occurrents is less pervasively marked by the presence of natural units than is the case in the realm of independent continuants. Thus there is here no counterpart of ‘object’. In BFO 1.0 ‘process’ served as such a counterpart. In BFO 2.0 ‘process’ is, rather, the occurrent counterpart of ‘material entity’. Those natural – as contrasted with engineered, which here means: deliberately executed – units which do exist in the realm of occurrents are typically either parasitic on the existence of natural units on the continuant side, or they are fiat in nature. Thus we can count lives; we can count football games; we can count chemical reactions performed in experiments or in chemical manufacturing. We cannot count the processes taking place, for instance, in an episode of insect mating behavior. Even where natural units are identifiable, for example cycles in a cyclical process such as the beating of a heart or an organism’s sleep/wake cycle, the processes in question form a sequence with no discontinuities (temporal gaps) of the sort that we find for instance where billiard balls or zebrafish or planets are separated by clear spatial gaps. Lives of organisms are process units, but they too unfold in a continuous series from other, prior processes such as fertilization, and they unfold in turn in continuous series of post-life processes such as post-mortem decay. Clear examples of boundaries of processes are almost always of the fiat sort (midnight, a time of death as declared in an operating theater or on a death certificate, the initiation of a state of war).
(iff (Process a) (and (Occurrent a) (exists (b) (properTemporalPartOf b a)) (exists (c t) (and (MaterialEntity c) (specificallyDependsOnAt a c t))))) // axiom label in BFO2 CLIF: [083-003]
process
disposition
Disposition
an atom of element X has the disposition to decay to an atom of element Y
certain people have a predisposition to colon cancer
children are innately disposed to categorize objects in certain ways.
the cell wall is disposed to filter chemicals in endocytosis and exocytosis
BFO 2 Reference: Dispositions exist along a strength continuum. Weaker forms of disposition are realized in only a fraction of triggering cases. These forms occur in a significant number of cases of a similar type.
b is a disposition means: b is a realizable entity & b’s bearer is some material entity & b is such that if it ceases to exist, then its bearer is physically changed, & b’s realization occurs when and because this bearer is in some special physical circumstances, & this realization occurs in virtue of the bearer’s physical make-up. (axiom label in BFO2 Reference: [062-002])
If b is a realizable entity then for all t at which b exists, b s-depends_on some material entity at t. (axiom label in BFO2 Reference: [063-002])
(forall (x t) (if (and (RealizableEntity x) (existsAt x t)) (exists (y) (and (MaterialEntity y) (specificallyDependsOnAt x y t))))) // axiom label in BFO2 CLIF: [063-002]
(forall (x) (if (Disposition x) (and (RealizableEntity x) (exists (y) (and (MaterialEntity y) (bearerOfAt x y t)))))) // axiom label in BFO2 CLIF: [062-002]
disposition
realizable
RealizableEntity
the disposition of this piece of metal to conduct electricity.
the disposition of your blood to coagulate
the function of your reproductive organs
the role of being a doctor
the role of this boundary to delineate where Utah and Colorado meet
A specifically dependent continuant that inheres in continuant entities and is not exhibited in full at every time in which it inheres in an entity or group of entities. The exhibition or actualization of a realizable entity is a particular manifestation, functioning or process that occurs under certain circumstances.
To say that b is a realizable entity is to say that b is a specifically dependent continuant that inheres in some independent continuant which is not a spatial region and is of a type instances of which are realized in processes of a correlated type. (axiom label in BFO2 Reference: [058-002])
All realizable dependent continuants have independent continuants that are not spatial regions as their bearers. (axiom label in BFO2 Reference: [060-002])
(forall (x t) (if (RealizableEntity x) (exists (y) (and (IndependentContinuant y) (not (SpatialRegion y)) (bearerOfAt y x t))))) // axiom label in BFO2 CLIF: [060-002]
(forall (x) (if (RealizableEntity x) (and (SpecificallyDependentContinuant x) (exists (y) (and (IndependentContinuant y) (not (SpatialRegion y)) (inheresIn x y)))))) // axiom label in BFO2 CLIF: [058-002]
realizable entity
0d-s-region
ZeroDimensionalSpatialRegion
A zero-dimensional spatial region is a point in space. (axiom label in BFO2 Reference: [037-001])
(forall (x) (if (ZeroDimensionalSpatialRegion x) (SpatialRegion x))) // axiom label in BFO2 CLIF: [037-001]
zero-dimensional spatial region
quality
Quality
the ambient temperature of this portion of air
the color of a tomato
the length of the circumference of your waist
the mass of this piece of gold.
the shape of your nose
the shape of your nostril
a quality is a specifically dependent continuant that, in contrast to roles and dispositions, does not require any further process in order to be realized. (axiom label in BFO2 Reference: [055-001])
If an entity is a quality at any time that it exists, then it is a quality at every time that it exists. (axiom label in BFO2 Reference: [105-001])
(forall (x) (if (Quality x) (SpecificallyDependentContinuant x))) // axiom label in BFO2 CLIF: [055-001]
(forall (x) (if (exists (t) (and (existsAt x t) (Quality x))) (forall (t_1) (if (existsAt x t_1) (Quality x))))) // axiom label in BFO2 CLIF: [105-001]
quality
sdc
SpecificallyDependentContinuant
Reciprocal specifically dependent continuants: the function of this key to open this lock and the mutually dependent disposition of this lock: to be opened by this key
of one-sided specifically dependent continuants: the mass of this tomato
of relational dependent continuants (multiple bearers): John’s love for Mary, the ownership relation between John and this statue, the relation of authority between John and his subordinates.
the disposition of this fish to decay
the function of this heart: to pump blood
the mutual dependence of proton donors and acceptors in chemical reactions [79
the mutual dependence of the role predator and the role prey as played by two organisms in a given interaction
the pink color of a medium rare piece of grilled filet mignon at its center
the role of being a doctor
the shape of this hole.
the smell of this portion of mozzarella
A continuant that inheres in or is borne by other entities. Every instance of A requires some specific instance of B which must always be the same.
b is a relational specifically dependent continuant = Def. b is a specifically dependent continuant and there are n > 1 independent continuants c1, … cn which are not spatial regions, are such that for all 1 ≤ i < j ≤ n, ci and cj share no common parts, and are such that for each 1 ≤ i ≤ n, b s-depends_on ci at every time t during the course of b’s existence. (axiom label in BFO2 Reference: [131-004])
b is a specifically dependent continuant = Def. b is a continuant & there is some independent continuant c which is not a spatial region and which is such that b s-depends_on c at every time t during the course of b’s existence. (axiom label in BFO2 Reference: [050-003])
Specifically dependent continuant doesn't have a closure axiom because the subclasses don't necessarily exhaust all possibilities. We're not sure what else will develop here, but for example there are questions such as what are promises, obligations, etc.
(iff (RelationalSpecificallyDependentContinuant a) (and (SpecificallyDependentContinuant a) (forall (t) (exists (b c) (and (not (SpatialRegion b)) (not (SpatialRegion c)) (not (= b c)) (not (exists (d) (and (continuantPartOfAt d b t) (continuantPartOfAt d c t)))) (specificallyDependsOnAt a b t) (specificallyDependsOnAt a c t)))))) // axiom label in BFO2 CLIF: [131-004]
(iff (SpecificallyDependentContinuant a) (and (Continuant a) (forall (t) (if (existsAt a t) (exists (b) (and (IndependentContinuant b) (not (SpatialRegion b)) (specificallyDependsOnAt a b t))))))) // axiom label in BFO2 CLIF: [050-003]
specifically dependent continuant
role
Role
John’s role of husband to Mary is dependent on Mary’s role of wife to John, and both are dependent on the object aggregate comprising John and Mary as member parts joined together through the relational quality of being married.
the priest role
the role of a boundary to demarcate two neighboring administrative territories
the role of a building in serving as a military target
the role of a stone in marking a property boundary
the role of subject in a clinical trial
the student role
A realizable entity the manifestation of which brings about some result or end that is not essential to a continuant in virtue of the kind of thing that it is but that can be served or participated in by that kind of continuant in some kinds of natural, social or institutional contexts.
BFO 2 Reference: One major family of examples of non-rigid universals involves roles, and ontologies developed for corresponding administrative purposes may consist entirely of representatives of entities of this sort. Thus ‘professor’, defined as follows, b instance_of professor at t =Def. there is some c, c instance_of professor role & c inheres_in b at t, denotes a non-rigid universal and so also do ‘nurse’, ‘student’, ‘colonel’, ‘taxpayer’, and so forth. (These terms are all, in the jargon of philosophy, phase sortals.) By using role terms in definitions, we can create a BFO conformant treatment of such entities drawing on the fact that, while an instance of professor may be simultaneously an instance of trade union member, no instance of the type professor role is also (at any time) an instance of the type trade union member role (any more than any instance of the type color is at any time an instance of the type length). If an ontology of employment positions should be defined in terms of roles following the above pattern, this enables the ontology to do justice to the fact that individuals instantiate the corresponding universals – professor, sergeant, nurse – only during certain phases in their lives.
b is a role means: b is a realizable entity & b exists because there is some single bearer that is in some special physical, social, or institutional set of circumstances in which this bearer does not have to be & b is not such that, if it ceases to exist, then the physical make-up of the bearer is thereby changed. (axiom label in BFO2 Reference: [061-001])
(forall (x) (if (Role x) (RealizableEntity x))) // axiom label in BFO2 CLIF: [061-001]
role
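The role-based definition pattern quoted above (b instance_of professor at t =Def. there is some c, c instance_of professor role & c inheres_in b at t) can be sketched as a lookup over time-indexed facts. The individuals, years, and the dictionary and tuple encodings below are hypothetical and serve only to show why 'professor' behaves as a phase sortal.

```python
# Hypothetical role instances and the role universal each one instantiates.
role_type = {
    "role_1": "professor role",
    "role_2": "trade union member role",
}

# (role instance, bearer, time) triples standing in for inheres_in at t.
inheres_in_at = {
    ("role_1", "mary", 2010),
    ("role_1", "mary", 2011),
    ("role_2", "mary", 2011),
}


def is_professor_at(b, t):
    """b is a professor at t if some professor role inheres in b at t."""
    return any(
        role_type.get(c) == "professor role"
        for c, bearer, time in inheres_in_at
        if bearer == b and time == t
    )


print(is_professor_at("mary", 2010))
print(is_professor_at("mary", 2012))
```

The same person satisfies the definition in some years and not in others, while each role instance keeps a single role type throughout.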
1d-s-region
OneDimensionalSpatialRegion
an edge of a cube-shaped portion of space.
A one-dimensional spatial region is a line or aggregate of lines stretching from one point in space to another. (axiom label in BFO2 Reference: [038-001])
(forall (x) (if (OneDimensionalSpatialRegion x) (SpatialRegion x))) // axiom label in BFO2 CLIF: [038-001]
one-dimensional spatial region
3d-s-region
ThreeDimensionalSpatialRegion
a cube-shaped region of space
a sphere-shaped region of space
A three-dimensional spatial region is a spatial region that is of three dimensions. (axiom label in BFO2 Reference: [040-001])
(forall (x) (if (ThreeDimensionalSpatialRegion x) (SpatialRegion x))) // axiom label in BFO2 CLIF: [040-001]
three-dimensional spatial region
gdc
GenericallyDependentContinuant
The entries in your database are patterns instantiated as quality instances in your hard drive. The database itself is an aggregate of such patterns. When you create the database you create a particular instance of the generically dependent continuant type database. Each entry in the database is an instance of the generically dependent continuant type IAO: information content entity.
the pdf file on your laptop, the pdf file that is a copy thereof on my laptop
the sequence of this protein molecule; the sequence that is a copy thereof in that protein molecule.
A continuant that is dependent on one or other independent continuant bearers. Every instance of A requires some instance of (an independent continuant type) B, but which instance of B serves can change from time to time.
b is a generically dependent continuant = Def. b is a continuant that g-depends_on one or more other entities. (axiom label in BFO2 Reference: [074-001])
(iff (GenericallyDependentContinuant a) (and (Continuant a) (exists (b t) (genericallyDependsOnAt a b t)))) // axiom label in BFO2 CLIF: [074-001]
generically dependent continuant
function
Function
the function of a hammer to drive in nails
the function of a heart pacemaker to regulate the beating of a heart through electricity
the function of amylase in saliva to break down starch into sugar
BFO 2 Reference: In the past, we have distinguished two varieties of function, artifactual function and biological function. These are not asserted subtypes of BFO:function however, since the same function – for example: to pump, to transport – can exist both in artifacts and in biological entities. The asserted subtypes of function that would be needed in order to yield a separate monohierarchy are not artifactual function, biological function, etc., but rather transporting function, pumping function, etc.
A function is a disposition that exists in virtue of the bearer’s physical make-up and this physical make-up is something the bearer possesses because it came into being, either through evolution (in the case of natural biological entities) or through intentional design (in the case of artifacts), in order to realize processes of a certain sort. (axiom label in BFO2 Reference: [064-001])
(forall (x) (if (Function x) (Disposition x))) // axiom label in BFO2 CLIF: [064-001]
function
material
MaterialEntity
a flame
a forest fire
a human being
a hurricane
a photon
a puff of smoke
a sea wave
a tornado
an aggregate of human beings.
an energy wave
an epidemic
the undetached arm of a human being
An independent continuant that is spatially extended whose identity is independent of that of other entities and can be maintained through time.
BFO 2 Reference: Material entities (continuants) can preserve their identity even while gaining and losing material parts. Continuants are contrasted with occurrents, which unfold themselves in successive temporal parts or phases [60
BFO 2 Reference: Object, Fiat Object Part and Object Aggregate are not intended to be exhaustive of Material Entity. Users are invited to propose new subcategories of Material Entity.
BFO 2 Reference: ‘Matter’ is intended to encompass both mass and energy (we will address the ontological treatment of portions of energy in a later version of BFO). A portion of matter is anything that includes elementary particles among its proper or improper parts: quarks and leptons, including electrons, as the smallest particles thus far discovered; baryons (including protons and neutrons) at a higher level of granularity; atoms and molecules at still higher levels, forming the cells, organs, organisms and other material entities studied by biologists, the portions of rock studied by geologists, the fossils studied by paleontologists, and so on. Material entities are three-dimensional entities (entities extended in three spatial dimensions), as contrasted with the processes in which they participate, which are four-dimensional entities (entities extended also along the dimension of time). According to the FMA, material entities may have immaterial entities as parts – including the entities identified below as sites; for example the interior (or ‘lumen’) of your small intestine is a part of your body. BFO 2.0 embodies a decision to follow the FMA here.
A material entity is an independent continuant that has some portion of matter as proper or improper continuant part. (axiom label in BFO2 Reference: [019-002])
Every entity which has a material entity as continuant part is a material entity. (axiom label in BFO2 Reference: [020-002])
Every entity of which a material entity is continuant part is also a material entity. (axiom label in BFO2 Reference: [021-002])
(forall (x) (if (MaterialEntity x) (IndependentContinuant x))) // axiom label in BFO2 CLIF: [019-002]
(forall (x) (if (and (Entity x) (exists (y t) (and (MaterialEntity y) (continuantPartOfAt x y t)))) (MaterialEntity x))) // axiom label in BFO2 CLIF: [021-002]
(forall (x) (if (and (Entity x) (exists (y t) (and (MaterialEntity y) (continuantPartOfAt y x t)))) (MaterialEntity x))) // axiom label in BFO2 CLIF: [020-002]
material entity
immaterial
ImmaterialEntity
BFO 2 Reference: Immaterial entities are divided into two subgroups: boundaries and sites, which bound, or are demarcated in relation, to material entities, and which can thus change location, shape and size as their material hosts move or change shape or size (for example: your nasal passage; the hold of a ship; the boundary of Wales (which moves with the rotation of the Earth) [38, 7, 10
immaterial entity
peptide
Amide derived from two or more amino carboxylic acid molecules (the same or different) by formation of a covalent bond from the carbonyl carbon of one to the nitrogen atom of another with formal loss of water. The term is usually applied to structures formed from alpha-amino acids, but it includes those derived from any amino carboxylic acid. X = OH, OR, NH2, NHR, etc.
peptide
deoxyribonucleic acid
High molecular weight, linear polymers, composed of nucleotides containing deoxyribose and linked by phosphodiester bonds; DNA contains the genetic information of organisms.
deoxyribonucleic acid
molecular entity
Any constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer etc., identifiable as a separately distinguishable entity.
We are assuming that every molecular entity has to be completely connected by chemical bonds. This excludes protein complexes, which are comprised of minimally two separate molecular entities. We will follow up with ChEBI to ensure this is their understanding as well.
molecular entity
atom
A chemical entity constituting the smallest component of an element having the chemical properties of the element.
atom
nucleic acid
A macromolecule made up of nucleotide units and hydrolysable into certain pyrimidine or purine bases (usually adenine, cytosine, guanine, thymine, uracil), D-ribose or 2-deoxy-D-ribose and phosphoric acid.
nucleic acid
ribonucleic acid
High molecular weight, linear polymers, composed of nucleotides containing ribose and linked by phosphodiester bonds; RNA is central to the synthesis of proteins.
ribonucleic acid
macromolecule
A macromolecule is a molecule of high relative molecular mass, the structure of which essentially comprises the multiple repetition of units derived, actually or conceptually, from molecules of low relative molecular mass.
polymer
macromolecule
cell
cell
PMID:18089833. Cancer Res. 2007 Dec 15;67(24):12018-25. "...Epithelial cells were harvested from histologically confirmed adenocarcinomas .."
A material entity of anatomical origin (part of or deriving from an organism) that has as its parts a maximally connected cell compartment surrounded by a plasma membrane.
cell
cell
cultured cell
A cell in vitro that is or has been maintained or propagated as part of a cell culture.
cultured cell
experimentally modified cell in vitro
A cell in vitro that has undergone physical changes as a consequence of a deliberate and specific experimental procedure.
experimentally modified cell in vitro
molecular_function
A molecular process that can be carried out by the action of a single macromolecular machine, usually via direct physical interactions with other molecular entities. Function in this sense denotes an action, or activity, that a gene product (or a complex) performs. These actions are described from two distinct but related perspectives: (1) biochemical activity, and (2) role as a component in a larger system/process.
GO:molecular_function
catalytic activity
Catalysis of a biochemical reaction at physiological temperatures. In biologically catalyzed reactions, the reactants are known as substrates, and the catalysts are naturally occurring macromolecular substances known as enzymes. Enzymes possess specific binding sites for substrates, and are usually composed wholly or largely of protein, but RNA that has catalytic activity (ribozyme) is often also regarded as enzymatic.
catalytic activity
biological_process
A biological process represents a specific objective that the organism is genetically programmed to achieve. Biological processes are often described by their outcome or ending state, e.g., the biological process of cell division results in the creation of two daughter cells (a divided cell) from a single parent cell. A biological process is accomplished by a particular set of molecular functions carried out by specific gene products (or macromolecular complexes), often in a highly regulated manner and in a particular temporal sequence.
biological_process
gene expression
The process in which a gene's sequence is converted into a mature gene product or products (proteins or RNA). This includes the production of an RNA transcript as well as any processing to produce a mature RNA product or an mRNA or circRNA (for protein-coding genes) and the translation of that mRNA or circRNA into protein. Protein maturation is included when required to form an active form of a product from an inactive precursor form.
gene expression
protein complex
A ribosome is a protein complex
A stable macromolecular complex composed (only) of two or more polypeptide subunits along with any covalently attached molecules (such as lipid anchors or oligosaccharide) or non-protein prosthetic groups (such as nucleotides or metal ions). Prosthetic group in this context refers to a tightly bound cofactor. The component polypeptide subunits may be identical.
protein complex
conditional specification
a directive information entity that specifies what should happen if the trigger condition is fulfilled
PlanAndPlannedProcess Branch
OBI branch derived
OBI_0000349
conditional specification
measurement unit label
Examples of measurement unit labels are liters, inches, weight per volume.
A measurement unit label is a label that is part of a scalar measurement datum and denotes a unit of measure.
2009-03-16: provenance: a term measurement unit was
proposed for OBI (OBI_0000176), edited by Chris Stoeckert and
Cristian Cocos, and subsequently moved to IAO where the objective for
which the original term was defined was satisfied with the definition
of this, different, term.
2009-03-16: review of this term done during the OBI workshop winter 2009 and the current definition was considered acceptable for use in OBI. If there is a need to modify this definition please notify OBI.
PERSON: Alan Ruttenberg
PERSON: Melanie Courtot
measurement unit label
objective specification
In the protocol of a ChIP assay the objective specification says to identify protein and DNA interaction.
a directive information entity that describes an intended process endpoint. When part of a plan specification the concretization is realized in a planned process in which the bearer tries to effect the world so that the process endpoint is achieved.
2009-03-16: original definition when imported from OBI read: "objective is an non realizable information entity which can serve as that proper part of a plan towards which the realization of the plan is directed."
2014-03-31: In the example of usage ("In the protocol of a ChIP assay the objective specification says to identify protein and DNA interaction") there is a protocol which is the ChIP assay protocol. In addition to being concretized on paper, the protocol can be concretized as a realizable entity, such as a plan that inheres in a person. The objective specification is the part that says that some protein and DNA interactions are identified. This is a specification of a process endpoint: the boundary in the process before which they are not identified and after which they are. During the realization of the plan, the goal is to get to the point of having the interactions, and participants in the realization of the plan try to do that.
Answers the question, why did you do this experiment?
PERSON: Alan Ruttenberg
PERSON: Barry Smith
PERSON: Bjoern Peters
PERSON: Jennifer Fostel
goal specification
OBI Plan and Planned Process/Roles Branch
OBI_0000217
objective specification
Pour the contents of flask 1 into flask 2
a directive information entity that describes an action the bearer will take
Alan Ruttenberg
OBI Plan and Planned Process branch
action specification
datum label
A label is a symbol that is part of some other datum and is used to either partially define the denotation of that datum or to provide a means for identifying the datum as a member of the set of data with the same label
http://www.golovchenko.org/cgi-bin/wnsearch?q=label#4n
GROUP: IAO
9/22/11 BP: changed the rdfs:label for this class from 'label' to 'datum label' to convey that this class is not intended to cover all kinds of labels (stickers, radiolabels, etc.), and not even all kinds of textual labels, but rather the kind of labels occurring in a datum.
datum label
information carrier
In the case of a printed paperback novel the physicality of the ink and of the paper form part of the information bearer. The qualities of appearing black and having a certain pattern for the ink and appearing white for the paper form part of the information carrier in this case.
A quality of an information bearer that imparts the information content
12/15/09: There is a concern that some ways that carry information may be processes rather than qualities, such as in a 'delayed wave carrier'.
2014-03-10: We are not certain that all information carriers are qualities. There was a discussion of dropping it.
PERSON: Alan Ruttenberg
Smith, Ceusters, Ruttenberg, 2000 years of philosophy
information carrier
data item
Data items include counts of things, analyte concentrations, and statistical summaries.
a data item is an information content entity that is intended to be a truthful statement about something (modulo, e.g., measurement precision or other systematic errors) and is constructed/acquired by a method which reliably tends to produce (approximately) truthful statements.
2/2/2009 Alan and Bjoern discussing FACS run output data. This is a data item because it is about the cell population. Each element records an event and is typically further composed of a set of measurement data items that record the fluorescent intensity stimulated by one of the lasers.
2009-03-16: data item deliberately ambiguous: we merged data set and datum to be one entity, not knowing how to define singular versus plural. So data item is more general than datum.
2009-03-16: removed datum as alternative term as datum specifically refers to singular form, and is thus not an exact synonym.
2014-03-31: See discussion at http://odontomachus.wordpress.com/2014/03/30/aboutness-objects-propositions/
JAR: datum -- well, this will be very tricky to define, but maybe some
information-like stuff that might be put into a computer and that is
meant, by someone, to denote and/or to be interpreted by some
process... I would include lists, tables, sentences... I think I might
defer to Barry, or to Brian Cantwell Smith
JAR: A data item is an approximately justified approximately true approximate belief
PERSON: Alan Ruttenberg
PERSON: Chris Stoeckert
PERSON: Jonathan Rees
data
data item
symbol
a serial number such as "12324X"
a stop sign
a written proper name such as "OBI"
An information content entity that is a mark(s) or character(s) used as a conventional representation of another entity.
20091104, MC: this needs work and will most probably change
2014-03-31: We would like to have a deeper analysis of 'mark' and 'sign' in the future (see https://github.com/information-artifact-ontology/IAO/issues/154).
PERSON: James A. Overton
PERSON: Jonathan Rees
based on Oxford English Dictionary
symbol
information content entity
Examples of information content entities include journal articles, data, graphical layouts, and graphs.
A generically dependent continuant that is about some thing.
2014-03-10: The use of "thing" is intended to be general enough to include universals and configurations (see https://groups.google.com/d/msg/information-ontology/GBxvYZCk1oc/-L6B5fSBBTQJ).
information_content_entity 'is_encoded_in' some digital_entity in obi before split (040907). information_content_entity 'is_encoded_in' some physical_document in obi before split (040907).
Previous. An information content entity is a non-realizable information entity that 'is encoded in' some digital or physical entity.
PERSON: Chris Stoeckert
OBI_0000142
information content entity
1
1
10 feet. 3 ml.
a scalar measurement datum is a measurement datum that is composed of two parts, numerals and a unit label.
2009-03-16: we decided to keep datum singular in scalar measurement datum, as in
this case we explicitly refer to the singular form
Would write this as: has_part some 'measurement unit label' and has_part some numeral and has_part exactly 2, except for the fact that this won't let us take advantage of OWL reasoning over the numbers. Instead use has measurement value property to represent the same. Use has measurement unit label (subproperty of has_part) so we can easily say that there is only one of them.
PERSON: Alan Ruttenberg
PERSON: Melanie Courtot
scalar measurement datum
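A minimal sketch of the two-part structure described above (a numeral plus exactly one unit label), written in Python purely for illustration. The class name and field names are hypothetical stand-ins for the 'has measurement value' and 'has measurement unit label' properties; they are not part of IAO or OBI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalarMeasurementDatum:
    """Hypothetical record: one numeric value plus exactly one unit label."""

    value: float     # stands in for the 'has measurement value' property
    unit_label: str  # stands in for 'has measurement unit label'

    def __str__(self) -> str:
        # Render the numeral followed by its unit label, e.g. "10 feet".
        return f"{self.value:g} {self.unit_label}"


# The two usage examples given for the term: "10 feet" and "3 ml".
length = ScalarMeasurementDatum(10, "feet")
volume = ScalarMeasurementDatum(3, "ml")
```

Keeping the unit label as a single required field mirrors the note that there should be only one unit label per datum.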
An information content entity whose concretizations indicate to their bearer how to realize them in a process.
2009-03-16: provenance: a term realizable information entity was proposed for OBI (OBI_0000337), edited by the PlanAndPlannedProcess branch. Original definition was "is the specification of a process that can be concretized and realized by an actor" with alternative term "instruction". It has been subsequently moved to IAO where the objective for which the original term was defined was satisfied with the definition of this, different, term.
2013-05-30 Alan Ruttenberg: What differentiates a directive information entity from an information concretization is that it can have concretizations that are either qualities or realizable entities. The concretizations that are realizable entities are created when an individual chooses to take up the direction, i.e. has the intention to (try to) realize it.
8/6/2009 Alan Ruttenberg: Changed label from "information entity about a realizable" after discussions at ICBO
Werner pushed back on calling it realizable information entity as it isn't realizable. However this name isn't right either. An example would be a recipe. The realizable entity would be a plan, but the information entity isn't about the plan, it, once concretized, *is* the plan. -Alan
PERSON: Alan Ruttenberg
PERSON: Bjoern Peters
directive information entity
dot plot
Dot plot of SSC-H and FSC-H.
A dot plot is a report graph which is a graphical representation of data where each data point is represented by a single dot placed on coordinates corresponding to data point values in particular dimensions.
person:Allyson Lister
person:Chris Stoeckert
OBI_0000123
group:OBI
dot plot
graph
A diagram that presents one or more tuples of information by mapping those tuples into a two-dimensional space in a non-arbitrary way.
PERSON: Lawrence Hunter
person:Alan Ruttenberg
person:Allyson Lister
OBI_0000240
group:OBI
graph
rule
example to be added
a rule is an executable which guides, defines, or restricts actions
MSI
PRS
OBI_0500021
PRS
rule
algorithm
PMID: 18378114. Genomics. 2008 Mar 28. LINKGEN: A new algorithm to process data in genetic linkage studies.
A plan specification which describes the inputs and output of mathematical functions as well as workflow of execution for achieving a predefined objective. Algorithms are realized usually by means of implementation as computer programs for execution by automata.
Philippe Rocca-Serra
PlanAndPlannedProcess Branch
OBI_0000270
adapted from discussion on OBI list (Matthew Pocock, Christian Cocos, Alan Ruttenberg)
algorithm
curation status specification
The curation status of the term. The allowed values come from an enumerated list of predefined terms. See the specification of these instances for more detailed definitions of each enumerated value.
Better to represent curation as a process with parts and then relate labels to that process (in IAO meeting)
PERSON:Bill Bug
GROUP:OBI:<http://purl.obolibrary.org/obo/obi>
OBI_0000266
curation status specification
data set
Intensity values in a CEL file or from multiple CEL files comprise a data set (as opposed to the CEL files themselves).
A data item that is an aggregate of other data items of the same type that have something in common. Averages and distributions can be determined for data sets.
2009/10/23 Alan Ruttenberg. The intention is that this term represent collections of like data. So this isn't for, e.g. the whole contents of a CEL file, which includes parameters, metadata etc. This is more like Java arrays of a certain rather specific type.
2014-05-05: Data sets are aggregates and thus must include two or more data items. We have chosen not to add logical axioms to make this restriction.
person:Allyson Lister
person:Chris Stoeckert
OBI_0000042
group:OBI
data set
image
An image is an affine projection to a two dimensional surface, of measurements of some quality of an entity or entities repeated at regular intervals across a spatial range, where the measurements are represented as color and luminosity on the projected surface.
person:Alan Ruttenberg
person:Allyson
person:Chris Stoeckert
OBI_0000030
group:OBI
image
data about an ontology part is a data item about a part of an ontology, for example a term
Person:Alan Ruttenberg
data about an ontology part
plan specification
PMID: 18323827. Nat Med. 2008 Mar;14(3):226. New plan proposed to help resolve conflicting medical advice.
A directive information entity with action specifications and objective specifications as parts that, when concretized, is realized in a process in which the bearer tries to achieve the objectives by taking the actions specified.
2009-03-16: provenance: a term a plan was proposed for OBI (OBI_0000344), edited by the PlanAndPlannedProcess branch. Original definition was "a plan is a specification of a process that is realized by an actor to achieve the objective specified as part of the plan". It has been subsequently moved to IAO where the objective for which the original term was defined was satisfied with the definition of this, different, term.
2014-03-31: A plan specification can have other parts, such as conditional specifications.
Alternative previous definition: a plan is a set of instructions that specify how an objective should be achieved
Alan Ruttenberg
OBI Plan and Planned Process branch
OBI_0000344
2/3/2009 Comment from OBI review.
Action specification not well enough specified.
Conditional specification not well enough specified.
Question whether all plan specifications have objective specifications.
Request that IAO either clarify these or change definitions not to use them
plan specification
measurement datum
Examples of measurement data are the recording of the weight of a mouse as {40,mass,"grams"}, the recording of an observation of the behavior of the mouse {,process,"agitated"}, the recording of the expression level of a gene as measured through the process of microarray experiment {3.4,luminosity,}.
A measurement datum is an information content entity that is a recording of the output of a measurement such as produced by a device.
2/2/2009 is_specified_output of some assay?
person:Chris Stoeckert
OBI_0000305
group:OBI
measurement datum
version number
A version number is an information content entity which is a sequence of characters borne by part of each of a class of manufactured products or its packaging and indicates its order within a set of other products having the same name.
Note: we feel that at the moment we are happy with a general version number, and that we will subclass as needed in the future. For example, see 7. genome sequence version
GROUP: IAO
version number
conclusion textual entity
that fucoidan has a small statistically significant effect on AT3 level but no useful clinical effect as in-vivo anticoagulant, a paraphrase of part of the last paragraph of the discussion section of the paper 'Pilot clinical study to evaluate the anticoagulant activity of fucoidan', by Lowenthal et al. PMID:19696660
A textual entity that expresses the results of reasoning about a problem, for instance as typically found towards the end of scientific papers.
2009/09/28 Alan Ruttenberg. Fucoidan-use-case
2009/10/23 Alan Ruttenberg: We need to work on the definition still
Person:Alan Ruttenberg
conclusion textual entity
scatter plot
Comparison of gene expression values in two samples can be displayed in a scatter plot
A scatterplot is a graph which uses Cartesian coordinates to display values for two variables for a set of data. The data is displayed as a collection of points, each having the value of one variable determining the position on the horizontal axis and the value of the other variable determining the position on the vertical axis.
PERSON:Chris Stoeckert
PERSON:James Malone
PERSON:Melanie Courtot
scattergraph
WEB: http://en.wikipedia.org/wiki/Scatterplot
scatter plot
textual entity
Words, sentences, paragraphs, and the written (non-figure) parts of publications are all textual entities
A textual entity is a part of a manifestation (FRBR sense), a generically dependent continuant whose concretizations are patterns of glyphs intended to be interpreted as words, formulas, etc.
AR, (IAO call 2009-09-01): a document as a whole is not typically a textual entity, because it has pictures in it - rather there are parts of it that are textual entities. Examples: The title, paragraph 2 sentence 7, etc.
MC, 2009-09-14 (following IAO call 2009-09-01): textual entities live at the FRBR (http://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records) manifestation level. Everything is significant: line break, pdf and html versions of same document are different textual entities.
PERSON: Lawrence Hunter
text
textual entity
table
  | T F
--+-----
T | T F
F | F F
A textual entity that contains a two-dimensional arrangement of texts repeated at regular intervals across a spatial range, such that the spatial relationships among the constituent texts express propositions
PERSON: Lawrence Hunter
table
figure
Any picture, diagram or table
An information content entity consisting of a two dimensional arrangement of information content entities such that the arrangement itself is about something.
PERSON: Lawrence Hunter
figure
diagram
A molecular structure ribbon cartoon showing helices, turns and sheets and their relations to each other in space.
A figure that expresses one or more propositions
PERSON: Lawrence Hunter
diagram
document
A journal article, patent application, laboratory notebook, or a book
A collection of information content entities intended to be understood together as a whole
PERSON: Lawrence Hunter
document
1
A cartesian spatial coordinate datum is a representation of a point in a spatial region, in which equal changes in the magnitude of a coordinate value denote length qualities with the same magnitude
2009-08-18 Alan Ruttenberg - question to BFO list about whether the BFO sense of the lower dimensional regions is that they are always part of actual space (the three dimensional sort) http://groups.google.com/group/bfo-discuss/browse_thread/thread/9d04e717e39fb617
Alan Ruttenberg
AR notes: We need to discuss whether it should include site.
cartesian spatial coordinate datum
http://groups.google.com/group/bfo-discuss/browse_thread/thread/9d04e717e39fb617
1
A cartesian spatial coordinate datum that uses one value to specify a position along a one dimensional spatial region
Alan Ruttenberg
one dimensional cartesian spatial coordinate datum
1
1
A cartesian spatial coordinate datum that uses two values to specify a position within a two dimensional spatial region
Alan Ruttenberg
two dimensional cartesian spatial coordinate datum
A scalar measurement datum that is the result of measurement of mass quality
2009/09/28 Alan Ruttenberg. Fucoidan-use-case
Person:Alan Ruttenberg
mass measurement datum
A scalar measurement datum that is the result of measuring a temporal interval
2009/09/28 Alan Ruttenberg. Fucoidan-use-case
Person:Alan Ruttenberg
time measurement datum
Recording the current temperature in a laboratory notebook. Writing a journal article. Updating a patient record in a database.
a planned process in which a document is created or added to by including the specified input in it.
6/11/9: Edited at OBI workshop. We need to be able to identify a child form of information artifact which corresponds to something enduring (not brain like). This used to be restricted to physical document or digital entity as the output, but that excludes e.g. an audio cassette tape
Bjoern Peters
wikipedia http://en.wikipedia.org/wiki/Documenting
documenting
line graph
A line graph is a type of graph created by connecting a series of data
points together with a line.
PERSON:Chris Stoeckert
PERSON:Melanie Courtot
line chart
GROUP:OBI
WEB: http://en.wikipedia.org/wiki/Line_chart
line graph
The sentence "The article has Pubmed ID 12345." contains a CRID that has two parts: one part is the CRID symbol, which is '12345'; the other part denotes the CRID registry, which is Pubmed.
A symbol that is part of a CRID and that is sufficient to look up a record from the CRID's registry.
PERSON: Alan Ruttenberg
PERSON: Bill Hogan
PERSON: Bjoern Peters
PERSON: Melanie Courtot
CRID symbol
Original proposal from Bjoern, discussions at IAO calls
centrally registered identifier symbol
The sentence "The article has Pubmed ID 12345." contains a CRID that has two parts: one part is the CRID symbol, which is '12345'; the other part denotes the CRID registry, which is Pubmed.
An information content entity that consists of a CRID symbol and additional information about the CRID registry to which it belongs.
2014-05-05: In defining this term we take no position on what the CRID denotes. In particular do not assume it denotes a *record* in the CRID registry (since the registry might not have 'records').
Alan, IAO call 20101124: potentially the CRID denotes the instance it was associated with during creation.
Note, IAO call 20101124: URIs are not always CRID, as not centrally registered. We acknowledge that CRID is a subset of a larger identifier class, but this subset fulfills our current needs. OBI PURLs are CRID as they are registered with OCLC. UPCs (Universal Product Codes from AC Nielsen) are not CRID as they are not centrally registered.
PERSON: Alan Ruttenberg
PERSON: Bill Hogan
PERSON: Bjoern Peters
PERSON: Melanie Courtot
CRID
Original proposal from Bjoern, discussions at IAO calls
centrally registered identifier
PubMed is a CRID registry. It has a dataset of PubMed identifiers associated with journal articles.
A CRID registry is a dataset of CRID records, each consisting of a CRID symbol and additional information which was recorded in the dataset through an 'assigning a centrally registered identifier' process.
PERSON: Alan Ruttenberg
PERSON: Bill Hogan
PERSON: Bjoern Peters
PERSON: Melanie Courtot
CRID registry
Original proposal from Bjoern, discussions at IAO calls
centrally registered identifier registry
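The three CRID terms above can be illustrated with a minimal Python sketch, using the PubMed example from the definitions. The `Crid` class, the `look_up` helper and the in-memory registry contents are hypothetical names introduced only for illustration; they are not part of IAO, OBI or STATO.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Crid:
    """A CRID: a CRID symbol plus information about its registry."""
    registry: str  # name of the CRID registry, e.g. "PubMed"
    symbol: str    # the CRID symbol, sufficient to look up a record


# Hypothetical in-memory stand-in for a registry: symbol -> record.
pubmed_registry = {"12345": {"title": "(placeholder article record)"}}


def look_up(crid: Crid, registries: dict) -> Optional[dict]:
    """Return the record for the CRID, or None if the registry has none."""
    return registries.get(crid.registry, {}).get(crid.symbol)


registries = {"PubMed": pubmed_registry}
crid = Crid(registry="PubMed", symbol="12345")
record = look_up(crid, registries)
missing = look_up(Crid(registry="PubMed", symbol="99999"), registries)
```

The lookup returns `None` rather than raising, which is one way to reflect the editor note that a CRID should not be assumed to denote a record in its registry.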
time stamped measurement datum
pmid:20604925 - time-lapse live cell microscopy
A data set that is an aggregate of data recording some measurement at a number of time points. The time series data set is an ordered list of pairs of time measurement data and the corresponding measurement data acquired at that time.
Alan Ruttenberg
experimental time series
time sampled measurement data set
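As a rough illustration of the definition above, a time sampled measurement data set can be represented as a list of (time, measurement) pairs ordered by time. The function name and the sample readings below are assumptions made for this sketch only.

```python
from typing import List, Tuple

# Each pair is (time measurement datum, measurement datum acquired at that time).
TimeSeries = List[Tuple[float, float]]


def make_time_series(pairs: TimeSeries) -> TimeSeries:
    """Return the pairs as a list ordered by time point."""
    return sorted(pairs, key=lambda pair: pair[0])


# Hypothetical readings: (minutes since start, measured intensity)
readings = [(10.0, 0.42), (0.0, 0.10), (5.0, 0.27)]
series = make_time_series(readings)
```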
Viruses
Viruses
Euteleostomi
bony vertebrates
Euteleostomi
Bacteria
eubacteria
Bacteria
Archaea
Archaea
Eukaryota
eucaryotes
eukaryotes
Eukaryota
Euarchontoglires
Euarchontoglires
Tetrapoda
tetrapods
Tetrapoda
Amniota
amniotes
Amniota
Opisthokonta
Opisthokonta
Bilateria
Bilateria
Mammalia
mammals
Mammalia
Vertebrata <Metazoa>
Vertebrata
vertebrates
Vertebrata <Metazoa>
Homo sapiens
human
human being
man
Homo sapiens
fluorescent reporter intensity
A measurement datum that represents the output of a scanner measuring the intensity value for each fluorescent reporter.
person:Chris Stoeckert
group:OBI
From the DT branch: This term and definition were originally submitted by the community to our branch, but we thought they best fit DENRIE. However we see several issues with this. First of all the name 'probe' might not be used in OBI. Instead we have a 'reporter' role. Also, albeit the term 'probe intensity' is often used in communities such as the microarray one, the name 'probe' is ambiguous (some use it to refer to what's on the array, some use it to refer to what's hybed to the array). Furthermore, this concept could possibly be encompassed by combining different OBI terms, such as the roles of analyte, detector and reporter (you need something hybed to a probe on the array to get an intensity) and maybe a more general term for 'measuring intensities'. We need to find the right balance between what is consistent with OBI and combinations of its terms and what is user-friendly. Finally, note that 'intensity' is already in the OBI .owl file and is also in PATO. Why didn't OBI import it from PATO? This might be a problem.
fluorescent reporter intensity
planned process
planned process
Injecting mice with a vaccine in order to test its efficacy
A processual entity that realizes a plan which is the concretization of a plan specification.
'Plan' includes a future direction sense. That can be problematic if plans are changed during their execution. There are however implicit contingencies for protocols that an agent has in his mind that can be considered part of the plan, even if the agent didn't have them in mind before. Therefore, a planned process can diverge from what the agent would have said the plan was before executing it, by adjusting to problems encountered during execution (e.g. choosing another reagent with equivalent properties, if the originally planned one has run out.)
We are only considering successfully completed planned processes. A plan may be modified, and details added during execution. For a given planned process, the associated realized plan specification is the one encompassing all changes made during execution. This means that every process in which an agent acts towards achieving some objective is a planned process.
Bjoern Peters
branch derived
6/11/9: Edited at workshop. Used to include: is initiated by an agent
This class merges the previously separated objective driven process and planned process, as the separation proved hard to maintain. (1/22/09, branch call)
planned process
biological feature identification objective
Biological_feature_identification_objective is an objective role carried out by the proposition defining the aim of a study designed to examine or characterize a particular biological feature.
Jennifer Fostel
biological feature identification objective
processed material
Examples include gel matrices, filter paper, parafilm and buffer solutions, mass spectrometer, tissue samples
Is a material entity that is created or changed during material processing.
PERSON: Alan Ruttenberg
processed material
investigation
Lung cancer investigation using expression profiling, a stem cell transplant investigation, biobanking is not an investigation, though it may be part of an investigation
a planned process that consists of parts: planning, study design execution, documentation, and which produces conclusion(s).
Bjoern Peters
OBI branch derived
Could add specific objective specification
Following OBI call November 2012,26th: it was decided there was no need for adding "achieves objective of drawing conclusion" as existing relations were providing equivalent ability. this note closes the issue and validates the class definition to be part of the OBI core
editor = PRS
study
investigation
evaluant role
When a specimen of blood is assayed for glucose concentration, the blood has the evaluant role. When measuring the mass of a mouse, the evaluant is the mouse. When measuring the time of DNA replication, the evaluant is the DNA. When measuring the intensity of light on a surface, the evaluant is the light source.
a role that inheres in a material entity that is realized in an assay in which data is generated about the bearer of the evaluant role
Role call - 17nov-08: JF and MC think an evaluant role is always specified input of a process. Even in the case where we have an assay taking blood as evaluant and outputting blood, the blood is not the specified output at the end of the assay (the concentration of glucose in the blood is)
examples of features that could be described in an evaluant: quality (e.g. "contains 10 pg/ml IL2", or "no glucose detected")
GROUP: Role Branch
OBI
Feb 10, 2009. changes after discussion at OBI Consortium Workshop Feb 2-6, 2009. accepted as core term.
evaluant role
assay
Assay the wavelength of light emitted by excited Neon atoms. Count of geese flying over a house.
A planned process with the objective to produce information about the material entity that is the evaluant, by physically examining it or its proxies.
12/3/12: BP: the reference to the 'physical examination' is included to point out that a prediction is not an assay, as that does not require physical examination.
PlanAndPlannedProcess Branch
measuring
scientific observation
OBI branch derived
study assay
any method
assay
quantitative confidence value
A data item which is used to indicate the degree of uncertainty about a measurement.
person:Chris Stoeckert
group:OBI
quantitative confidence value
culture medium
A growth medium or culture medium is a substance in which microorganisms or cells can grow. Wikipedia, growth medium, Feb 29, 2008
a processed material that provides the needed nourishment for microorganisms or cells grown in vitro.
changed from a role to a processed material based on the Aug 22, 2011 dev call. For details see the tracker item: http://sourceforge.net/tracker/?func=detail&aid=3325270&group_id=177891&atid=886178
Modification made by JZ.
Person: Jennifer Fostel, Jie Zheng
OBI
culture medium
reagent role
Buffer, dye, a catalyst, a solvating agent.
A role inhering in a biological or chemical entity that is intended to be applied in a scientific technique to participate (or have molecular components that participate) in a chemical reaction that facilitates the generation of data about some entity distinct from the bearer, or the generation of some specified material output distinct from the bearer.
PERSON:Matthew Brush
reagent
PERSON:Matthew Brush
Feb 10, 2009. changes after discussion at OBI Consortium Workshop Feb 2-6, 2009. accepted as core term.
May 28 2013. Updated definition taken from ReO based on discussions initiated in Philly 2011 workshop. Former definition described a narrower view of reagents in chemistry that restricts bearers of the role to be chemical entities ("a role played by a molecular entity used to produce a chemical reaction to detect, measure, or produce other substances"). Updated definition allows for broader view of reagents in the domain of biomedical research to include larger materials that have parts that participate chemically in a molecular reaction or interaction.
(copied from ReO)
Reagents are distinguished from instruments or devices that also participate in scientific techniques by the fact that reagents are chemical or biological in nature and necessarily participate in or have parts that participate in some chemical interaction or reaction during their intended participation in some technique. By contrast, instruments do not participate in a chemical reaction/interaction during the technique.
Reagents are distinguished from study subjects/evaluants in that study subjects and evaluants are that about which conclusions are drawn and knowledge is sought in an investigation - while reagents, by definition, are not. It should be noted, however, that reagent and study subject/evaluant roles can be borne by instances of the same type of material entity - but a given instance will realize only one of these roles in the execution of a given assay or technique. For example, taq polymerase can bear a reagent role or an evaluant role. In a DNA sequencing assay aimed at generating sequence data about some plasmid, the reagent role of the taq polymerase is realized. In an assay to evaluate the quality of the taq polymerase itself, the evaluant/study subject role of the taq is realized, but not the reagent role since the taq is the subject about which data is generated.
In regard to the statement that reagents are 'distinct' from the specified outputs of a technique, note that a reagent may be incorporated into a material output of a technique, as long as the IDENTITY of this output is distinct from that of the bearer of the reagent role. For example, dNTPs input into a PCR are reagents that become part of the material output of this technique, but this output has a new identity (i.e. that of a 'nucleic acid molecule') that is distinct from the identity of the dNTPs that comprise it. Similarly, biotin molecules input into a cell labeling technique are reagents that become part of the specified output, but the identity of the output is that of some modified cell specimen which shares identity with the input unmodified cell specimen, and not with the biotin label. Thus, we see that an important criterion of 'reagent-ness' is that the reagent is a facilitator, and not the primary focus of an investigation or material processing technique (i.e. not the specified subject/evaluant about which knowledge is sought, or the specified output material of the technique).
reagent role
material processing
A cell lysis, production of a cloning vector, creating a buffer.
A planned process which results in physical changes in a specified input material
PERSON: Bjoern Peters
PERSON: Frank Gibson
PERSON: Jennifer Fostel
PERSON: Melanie Courtot
PERSON: Philippe Rocca Serra
material transformation
OBI branch derived
material processing
study subject role
Human subjects in a clinical trial, rats in a toxicogenomics study, tissue cultures subjected to drug tests, fish observed in an ecotoxicology study.
Parasite example: people are infected with a parasite which is then extracted; the participant under investigation could be the parasite, the people, or a population of which the people are members, depending on the nature of the study.
Lake example: a lake could realize this role in an investigation that assays pollution levels in samples of water taken from the lake.
A role that is realized through the execution of a study design in which the bearer of the role participates and in which data about that bearer is collected.
A participant can realize both "specimen role" and "participant under investigation role" at the same time. However "participant under investigation role" is distinct from "specimen role", since a specimen could somehow be involved in an investigation without being the thing that is under investigation.
GROUP: Role Branch
OBI
Following OBI call November 2012,26th:
1. it was decided there was no need for moving the children class and making them siblings of study subject role.
2. it also settles the disambiguation about 'study subject'. This is about the individual participating in the investigation/study, Not the 'topic' (as in 'toxicity study') of the investigation/study
This note closes the issue and validates the class definition to be part of the OBI core
editor = PRS
participant under investigation role
specimen role
liver section; a portion of a culture of cells; a nematode or other animal once no longer a subject (generally killed); portion of blood from a patient.
a role borne by a material entity that is gained during a specimen collection process and that can be realized by use of the specimen in an investigation
22Jun09. The definition includes whole organisms, and can include a human. The link between specimen role and study subject role has been removed. A specimen taken as part of a case study is not considered to be a population representative, while a specimen taken as representing a population (e.g. a person taken from a cohort, a blood specimen taken from an animal) would be considered a population representative and would also bear material sample role.
Note: definition is in specimen creation objective which is defined as an objective to obtain and store a material entity for potential use as an input during an investigation.
blood taken from animal: animal continues in study, whereas blood has role specimen.
something taken from study subject, leaves the study and becomes the specimen.
parasite example
- when parasite in people we study people, people are subjects and parasites are specimen
- when parasite extracted, they become subject in the following study
specimen can later be subject.
GROUP: Role Branch
OBI
specimen role
sequence feature identification objective
Sequence_feature_identification_objective is a biological_feature_identification_objective role describing a study designed to examine or characterize molecular features exhibited at the level of a macromolecular sequence, e.g. nucleic acid, protein, polysaccharide.
Jennifer Fostel
sequence feature identification objective
intervention design
PMID: 18208636.Br J Nutr. 2008 Jan 22;:1-11.Effect of vitamin D supplementation on bone and vitamin D status among Pakistani immigrants in Denmark: a randomised double-blinded placebo-controlled intervention study.
An intervention design is a study design in which a controlled process applied to the subjects (the intervention) serves as the independent variable manipulated by the experimentalist. The treatment (perturbation or intervention) can be defined as a combination of values taken by the independent variables manipulated by the experimentalist; treatments are applied to the recruited subjects, who are assigned (possibly by applying specific methods) to treatment groups. The specificity of an intervention design is that independent variables are manipulated and the response of the biological system is evaluated via response variables, possibly monitored by a series of assays.
Philippe Rocca-Serra
OBI branch derived
intervention design
gene list
Gene lists may arise from analysis to determine differentially expressed genes, may be collected from the literature for involvement in a particular process or pathway (e.g., inflammation), or may be the input for gene set enrichment analysis.
A data set of the names or identifiers of genes that are the outcome of an analysis or have been put together for the purpose of an analysis.
person:Chris Stoeckert
group:OBI
kind of report. (alan) need to be careful to distinguish from output of a data transformation or calculation. A gene list is a report when it is published as such? Relates to question of whether report is a whole, or whether it can be a part of some other narrative object.
gene list
molecular feature identification objective
Molecular_feature_identification_objective is a biological_feature_identification_objective role describing a study designed to examine or characterize molecular features of a biological system, e.g. expression profiling, copy number of molecular components, epigenetic modifications.
Jennifer Fostel
molecular feature identification objective
cDNA library
PMID:6110205. collection of cDNA derived from mouse splenocytes.
Mixed population of cDNAs (complementary DNA) made from mRNA from a defined source, usually a specific cell type. This term should be associated only with nucleic acid interactors, not with their protein products. For instance, in 2h screening use living cells (MI:0349) as sample process.
ALT DEF (PRS):: a cDNA library is a collection of host cells, typically but not exclusively E. coli cells, modified by transfer of a plasmid DNA molecule used as a vector containing a fragment or the totality of a cDNA molecule (the insert). A cDNA library may have an array of roles and applications.
PERSON: Luisa Montecchi
PERSON: Philippe Rocca-Serra
GROUP: PSI
PRS: 22022008. class moved under population,
modification of definition and replacement of biomaterials in previous definition with 'material'
addition of has_role restriction
cDNA library
p-value
PMID:19696660
in contrast to the in-vivo data AT-III increased significantly from
113.5% at baseline to 117% after 4 days (n = 10, P-value= 0.02; Table 2).
A quantitative confidence value that represents the probability of obtaining a result at least as extreme as that actually obtained, assuming that the actual value was the result of chance alone.
Addition of restriction 'output of null hypothesis testing' by AGB and PRS while working on STATO
May be outside the scope of OBI long term, is needed so is retained
Alejandra Gonzalez-Beltran
PERSON:Chris Stoeckert
Philippe Rocca-Serra
WEB: http://en.wikipedia.org/wiki/P-value
p
p-value
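The p-value definition above can be made concrete with a small worked example. The sketch below assumes a binomial null hypothesis (a fair coin, `p_null = 0.5`) and computes a one-sided p-value, i.e. the probability of a result at least as extreme as the observed count under chance alone. The function name and the coin-flip scenario are illustrative assumptions, not part of STATO or OBI.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided p-value: P(X >= successes) if the null hypothesis holds."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical observation: 8 heads in 10 flips of a supposedly fair coin.
# P(X >= 8) = (45 + 10 + 1) / 1024 = 0.0546875
p = binomial_p_value(8, 10)
```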
population
PMID12564891. Environ Sci Technol. 2003 Jan 15;37(2):223-8. Effects of historic PCB exposures on the reproductive success of the Hudson River striped bass population.
a population is a collection of individuals from the same taxonomic class living, counted or sampled at a particular site or in a particular area
1/28/2013, BP, on the call it was raised that we may want to switch to an external ontology for all population terms:
http://code.google.com/p/popcomm-ontology/
PERSON: Philippe Rocca-Serra
adapted from Oxford English Dictionary
rem1: collection somehow always involves a selection process
population
imaging assay
An imaging assay is an assay to produce a picture of an entity. definition_source: OBI.
PlanAndPlannedProcess Branch
OBI branch derived
imaging assay
organization
PMID: 16353909.AAPS J. 2005 Sep 22;7(2):E274-80. Review. The joint food and agriculture organization of the United Nations/World Health Organization Expert Committee on Food Additives and its role in the evaluation of the safety of veterinary drug residues in foods.
An entity that can bear roles, has members, and has a set of organization rules. Members of organizations are either organizations themselves or individual people. Members can bear specific organization member roles that are determined in the organization rules. The organization rules also determine how decisions are made on behalf of the organization by the organization members.
BP: The definition summarizes long email discussions on the OBI developer, roles, biomaterial and denrie branches. It leaves open if an organization is a material entity or a dependent continuant, as no consensus was reached on that. The current placement as material is therefore temporary, in order to move forward with development. Here is the entire email summary, on which the definition is based:
1) there are organization_member_roles (president, treasurer, branch
editor), with individual persons as bearers
2) there are organization_roles (employer, owner, vendor, patent holder)
3) an organization has a charter / rules / bylaws, which specify what roles
there are, how they should be realized, and how to modify the
charter/rules/bylaws themselves.
It is debatable what the organization itself is (some kind of dependent
continuant or an aggregate of people). This also determines who/what the
bearers of organization_roles are. My personal favorite is still to define
organization as a kind of 'legal entity', but thinking it through leads to
all kinds of questions that are clearly outside the scope of OBI.
Interestingly enough, it does not seem to matter much where we place
organization itself, as long as we can subclass it (University, Corporation,
Government Agency, Hospital), instantiate it (Affymetrix, NCBI, NIH, ISO,
W3C, University of Oklahoma), and have it play roles.
This leads to my proposal: We define organization through the statements 1 -
3 above, but without an 'is a' statement for now. We can leave it in its
current place in the is_a hierarchy (material entity) or move it up to
'continuant'. We leave further clarifications to BFO, and close this issue
for now.
PERSON: Alan Ruttenberg
PERSON: Bjoern Peters
PERSON: Philippe Rocca-Serra
PERSON: Susanna Sansone
GROUP: OBI
organization
dye role
A molecular label role which inheres in a material entity and which is realized in the process of detecting a molecular dye that imparts color to some material of interest.
Jennifer Fostel
dye
A substance used to color materials www.answers.com/topic/dye 19feb09
dye role
protocol
A PCR protocol has the objective specification 'amplify DNA fragment of interest' and an action specification that describes the amounts of experimental reagents used (e.g. buffers, dNTPs, enzyme) and the temperature and cycle time settings for running the PCR.
A plan specification which has sufficient level of detail and quantitative information to communicate it between investigation agents, so that different investigation agents will reliably be able to independently reproduce the process.
PlanAndPlannedProcess Branch
OBI branch derived + wikipedia (http://en.wikipedia.org/wiki/Protocol_%28natural_sciences%29)
study protocol
protocol
adding a material entity into a target
Injecting a drug into a mouse. Adding IL-2 to a cell culture. Adding NaCl into water.
is a process with the objective to place a material entity bearing the 'material to be added role' into a material bearing the 'target of material addition role'.
Class was renamed from 'administering substance', as this is commonly used only for additions into organisms.
BP
branch derived
adding a material entity into a target
analyte role
Glucose in blood (measured in an assay to determine the concentration of glucose).
A measurand role borne by a molecular entity or an atom and realized in an analyte assay which achieves the objective to measure the magnitude/concentration/amount of the analyte in the entity bearing evaluant role.
interestingly, an analyte is still an analyte even if it is not detected. for this reason it does not bear a specified input role
pH (technically the inverse log of [H+]) may be considered a quality; this remains to be tested.
qualities such as weight, color are not assayed but measured, so they do not fall into this category.
GROUP: Role Branch
OBI
Feb 10, 2009. changes after discussion at OBI Consortium Workshop Feb 2-6, 2009. accepted as core term.
analyte role
material to be added role
drug added to a buffer contained in a tube; substance injected into an animal;
material to be added role is a protocol participant role realized by a material which is added into a material bearing the target of material addition role in a material addition process
Role Branch
OBI
9 March 09 from discussion with PA branch
material to be added role
interpreting data
Concluding that a gene is upregulated in a tissue sample based on the band intensity in a western blot. Concluding that a patient has an infection based on measurement of an elevated body temperature and reported headache. Concluding that there were problems in an investigation because data from PCR and microarray are conflicting. Concluding that 'defects in gene XYZ cause cancer due to improper DNA repair' based on data from experiments in that study that gene XYZ is involved in DNA repair, and the conclusion of a previous study that cancer patients have an increased number of mutations in this gene.
A planned process in which data gathered in an investigation is evaluated in the context of existing knowledge with the objective to generate more general conclusions or to conclude that the data does not allow one to draw general conclusions.
PERSON: Bjoern Peters
PERSON: Jennifer Fostel
Bjoern Peters
drawing a conclusion based on data
planning
The process of a scientist thinking about and deciding what reagents to use as part of a protocol for an experiment. Note that the scientist could be human or a "robot scientist" executing software.
a process of creating or modifying a plan specification
7/18/2011 BP: planning used to itself be a planned process. Barry Smith pointed out that this would lead to an infinite regression, as there would have to be a plan to conduct a planning process, which in itself would be the result of planning etc. Therefore, the restrictions on 'planning' were loosened to allow for informal processes that result in an 'ad hoc plan'. This required changing from 'has_specified_output some plan specification' to 'has_participant some plan specification'.
Bjoern Peters
Bjoern Peters
Plans and Planned Processes Branch
planning
light emission function
A light emission function is an excitation function to excite a material to a specific excitation state such that it emits light.
Bill Bug
Daniel Schober
Frank Gibson
Melanie Courtot
light emission function
contain function
A syringe, a beaker
A contain function is a function to constrain a material entity's location in space
Bill Bug
Daniel Schober
Frank Gibson
Melanie Courtot
contain function
heat function
A heat function is a function that increases the internal kinetic energy of a material
Bill Bug
Daniel Schober
Frank Gibson
Melanie Courtot
heat function
material separation function
A material separation function is a function that increases the resolution between two or more material entities. The distinction between the entities is usually based on some associated physical quality.
Bill Bug
Daniel Schober
Frank Gibson
Melanie Courtot
material separation function
excitation function
An excitation function is a function to inject energy by bombarding a material with energetic particles (e.g., photons) thereby imbuing internal material components such as electrons with additional energy. These internal, 'excited' particles may lead to the rupturing of covalent chemical bonds or may quickly relax back to their unexcited state with an exponential time course thereby locally emitting energy in the form of photons.
Bill Bug
Daniel Schober
Frank Gibson
Melanie Courtot
excitation function
filter function
A filter function is a function to prevent the flow of certain entities based on a quality or qualities of the entity while allowing entities which have different qualities to pass through
Frank Gibson
filter function
cool function
A cool function is a function to decrease the internal kinetic energy of a material below the initial kinetic energy of that type of material.
Daniel Schober
Frank Gibson
Melanie Courtot
cool function
solid support function
Taped, glued, pinned, dried or molecularly bonded to a solid support
A solid support function is a function of a device on which an entity is kept in a defined position and prevented in its movement
Daniel Schober
Frank Gibson
Melanie Courtot
solid support function
environment control function
An environmental control function is a function that regulates a contained environment within specified parameter ranges. For example the control of light exposure, humidity and temperature.
Bill Bug
Daniel Schober
Frank Gibson
Melanie Courtot
environment control function
sort function
A sort function is a function to distinguish material components based on some associated physical quality or entity and to partition the separate components into distinct fractions according to a defined order.
Daniel Schober
Frank Gibson
Melanie Courtot
sort function
cloning vector role
pBluescript plays the role of a cloning vector
A material to be added role played by a small, self-replicating DNA or RNA molecule - usually a plasmid or chromosome - and realized in a process whereby foreign DNA or RNA is inserted into the vector during the process of cloning.
JZ: related tracker: https://sourceforge.net/p/obi/obi-terms/102/
PERSON: Helen Parkinson
cloning vector role
cloning insert role
cloning insert role is a role which inheres in DNA or RNA and is realized by the process of being inserted into a cloning vector in a cloning process.
Feb 20, 2009. from Wikipedia: cloning of any DNA fragment essentially involves four steps: DNA fragmentation with restriction endonucleases, ligation of DNA fragments to a vector, transfection, and screening/selection. There are multiple processes involved, it is not just "cloning process"
GROUP: Role branch
OBI and Wikipedia
cloning insert role
extract
Up-regulation of inflammatory signalings by areca nut extract and role of cyclooxygenase-2 -1195G>a polymorphism reveal risk of oral cancer. Cancer Res. 2008 Oct 15;68(20):8489-98. PMID: 18922923
an extract is a material entity which results from an extraction process
PERSON: Philippe Rocca-Serra
extracted material
GROUP: OBI Biomaterial Branch
extract
transcription profiling assay
Whole genome transcription profiling of Anaplasma phagocytophilum in human and tick host cells by tiling array analysis. BMC Genomics. 2008 Jul 31;9:364. PMID: 18671858
An assay which aims to provide information about gene expression and transcription activity using ribonucleic acids collected from a material entity, using a range of techniques and instruments such as DNA sequencers, DNA microarrays and Northern blots.
Philippe Rocca-Serra
gene expression profiling
OBI
transcription profiling
transcription profiling assay
averaging objective
A mean calculation which has averaging objective is a descriptive statistics calculation in which the mean is calculated by taking the sum of all of the observations in a data set divided by the total number of observations. It gives a measure of the 'center of gravity' for the data set. It is also known as the first moment.
An averaging objective is a data transformation objective where the aim is to perform mean calculations on the input of the data transformation.
Elisabetta Manduchi
James Malone
PERSON: Elisabetta Manduchi
averaging objective
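The mean calculation described above (the sum of all observations divided by the number of observations) can be sketched in Python as follows. The function name and sample data are illustrative assumptions.

```python
from typing import Sequence


def mean(observations: Sequence[float]) -> float:
    """Arithmetic mean: sum of the observations divided by their count."""
    if not observations:
        raise ValueError("mean of an empty data set is undefined")
    return sum(observations) / len(observations)


# Hypothetical data set: (2 + 4 + 6 + 9) / 4 = 5.25
data = [2.0, 4.0, 6.0, 9.0]
center = mean(data)
```

Python's standard library offers the same calculation as `statistics.mean`; the explicit version is shown only to mirror the wording of the definition.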
enzyme
(protein or rna) or has_part (protein or rna) and
has_function some GO:0003824 (catalytic activity)
MC: known issue: enzyme doesn't classify under material entity for now, as it isn't stated that anything that has_part some material entity is a material entity. If we add has_part some material entity and part_of some material entity as equivalent classes to material entity (each one in its own necessary and sufficient block), Pellet in P3 doesn't classify any more.
person: Melanie Courtot
GROUP:OBI
enzyme
adding material objective
creating a mouse infected with LCM virus
is the specification of an objective to add a material into a target material. The adding is asymmetric in the sense that the target material largely retains its identity
BP
adding material objective
genotyping assay
High-throughput genotyping of oncogenic human papilloma viruses with MALDI-TOF mass spectrometry. Clin Chem. 2008 Jan;54(1):86-92. Epub 2007 Nov 2.PMID: 17981923
an assay which generates data about a genotype from a specimen of genomic DNA. A variety of techniques and instruments can be used to produce information about sequence variation at particular genomic positions.
Philippe Rocca-Serra
genotype profiling, SNP genotyping
OBI Biomaterial
SNP analysis
genotyping assay
analyte measurement objective
The objective to measure the concentration of glucose in a blood sample
an assay objective to determine the presence or concentration of an analyte in the evaluant
PERSON: Bjoern Peters
PPPB branch
analyte measurement objective
assay objective
the objective to determine the weight of a mouse.
an objective specification to determine a specified type of information about an evaluated entity (the material entity bearing evaluant role)
PPPB branch
PPPB branch
assay objective
analyte assay
example of usage: In lab test for blood glucose, the test is the assay, the blood bears evaluant_role and glucose bears the analyte role. The evaluant is considered an input to the assay and the information entity that records the measurement of glucose concentration the output
An assay with the objective to capture information about the presence, concentration, or amount of an analyte in an evaluant.
2013-09-23: simplify equivalent axiom
Note: is_realization of some analyte role isn't always true, for example when there is none of the analyte in the evaluant. For the moment we are writing it this way, but when the information ontology is further worked out this will be replaced with a condition discussing the measurement.
logical def modified to remove expression below, as some analyte assays report below the level of detection, and therefore not a scalar measurement datum, replaced by measurement datum
and
('has measurement unit label' some 'measurement unit label') and
('is quality measurement of' some 'molecular concentration'))
PERSON:Bjoern Peters, Helen Parkinson, Philippe Rocca-Serra, Alan Ruttenberg
PERSON:Bjoern Peters
PERSON:Helen Parkinson
PERSON:Philippe Rocca-Serra
PERSON:Alan Ruttenberg
GROUP:OBI Planned process branch
analyte assay
target of material addition role
peritoneum of an animal receiving an intraperitoneal injection; solution in a tube receiving additional material; location of absorbed material following a dermal application.
target of material addition role is a role realized by an entity into which a material is added in a material addition process
From Branch discussion with BP, AR, MC -- there is a need for the recipient to interact with the administered material. for example, a tooth receiving a filling was not considered to be a target role.
GROUP: Role Branch
OBI
target of material addition role
normalized data set
A data set that is produced as the output of a normalization data transformation.
PERSON: James Malone
PERSON: Melanie Courtot
normalized data set
measure function
A glucometer measures blood glucose concentration, the glucometer has a measure function.
Measure function is a function that is borne by a processed material and realized in a process in which information about some entity is expressed relative to some reference.
PERSON: Daniel Schober
PERSON: Helen Parkinson
PERSON: Melanie Courtot
PERSON:Frank Gibson
measure function
material transformation objective
The objective to create a mouse infected with LCM virus. The objective to create a defined solution of PBS.
an objective specification to create a specific output object from input materials.
PERSON: Bjoern Peters
PERSON: Frank Gibson
PERSON: Jennifer Fostel
PERSON: Melanie Courtot
PERSON: Philippe Rocca-Serra
artifact creation objective
GROUP: OBI PlanAndPlannedProcess Branch
material transformation objective
study design execution
injecting a mouse with PBS solution, weighing it, and recording the weight according to a study design.
a planned process that carries out a study design
removed axiom has_part some (assay or 'data transformation') per discussion on protocol application mailing list to improve reasoner performance. The axiom is still desired.
branch derived
6/11/9: edited at workshop. Used to be: study design execution is a process with the objective to generate data according to a concretized study design. The execution of a study design is part of an investigation, and minimally consists of an assay or data transformation.
study design execution
DNA sequencing
Genomic deletions of OFD1 account for 23% of oral-facial-digital type 1 syndrome after negative DNA sequencing. Thauvin-Robinet C, Franco B, Saugier-Veber P, Aral B, Gigot N, Donzel A, Van Maldergem L, Bieth E, Layet V, Mathieu M, Teebi A, Lespinasse J, Callier P, Mugneret F, Masurel-Paulet A, Gautier E, Huet F, Teyssier JR, Tosi M, Frébourg T, Faivre L. Hum Mutat. 2008 Nov 19. PMID: 19023858
DNA sequencing is a sequencing process which uses deoxyribonucleic acid as input and results in the creation of a DNA sequence information artifact using a DNA sequencer instrument.
Philippe Rocca-Serra
OBI Branch derived
nucleotide sequencing
DNA sequencing
material separation objective
The objective to obtain multiple aliquots of an enzyme preparation. The objective to obtain cells contained in a sample of blood.
is an objective to transform a material entity into spatially separated components.
PPPB branch
PPPB branch
material separation objective
clustered data set
A clustered data set is the output of a K means clustering data transformation
A data set that is produced as the output of a class discovery data transformation and consists of a data set with assigned discovered class labels.
PERSON: James Malone
PERSON: Monnie McGee
data set with assigned discovered class labels
AR thinks could be a data item instead
clustered data set
data set of features
A data set that is produced as the output of a descriptive statistical calculation data transformation and consists of producing a data set that represents one or more features of interest about the input data set.
PERSON: James Malone
PERSON: Monnie McGee
data set of features
differential expression analysis data transformation
A differential expression analysis data transformation is a data transformation that has objective differential expression analysis and that consists of
James Malone
Melanie Courtot
Monnie McGee
WEB:
differential expression analysis data transformation
material combination
Mixing two fluids. Adding salt into water. Injecting a mouse with PBS.
is a material processing with the objective to combine two or more material entities as input into a single material entity as output.
created at workshop as parent class for 'adding material into target', which is asymmetric, while combination encompasses all addition processes.
bp
bp
material combination
specimen collection process
drawing blood from a patient for analysis, collecting a piece of a plant for depositing in a herbarium, buying meat from a butcher in order to measure its protein content in an investigation
A planned process with the objective of collecting a specimen.
Note: definition is in specimen creation objective which is defined as an objective to obtain and store a material entity for potential use as an input during an investigation.
Philly2013: A specimen collection can have as part a material entity acquisition, such as ordering from a bank. The distinction is that specimen collection necessarily involves the creation of a specimen role. However ordering cell lines cells from ATCC for use in an investigation is NOT a specimen collection, because the cell lines already have a specimen role.
Philly2013: The specimen_role for the specimen is created during the specimen collection process.
label changed to 'specimen collection process' on 10/27/2014, details see tracker:
http://sourceforge.net/p/obi/obi-terms/716/
Bjoern Peters
specimen collection
5/31/2012: This process is not necessarily an acquisition, as specimens may be collected from materials already in possession
6/9/09: used at workshop
specimen collection process
error corrected data set
A data set that is produced as the output of an error correction data transformation and consists of producing a data set which has had erroneous contributions from the input to the data transformation removed (corrected for).
PERSON: James Malone
PERSON: Monnie McGee
error corrected data set
error correction data transformation
An error correction data transformation is a data transformation that has the objective of error correction, where the aim is to remove (correct for) erroneous contributions from the input to the data transformation.
James Malone
Monnie McGee
EDITORS
error correction data transformation
sample from organism
a material obtained from an organism in order to be a representative of the whole
5/29: This is a helper class for now
we need to work on this: Is taking a urine sample a material separation process? If not, we will need to specify what 'taking a sample from organism' entails. We can argue that the objective to obtain a urine sample from a patient is enough to call it a material separation process, but it could dilute what material separation was supposed to be about.
sample from organism
statistical hypothesis test
"A statistical test provides a mechanism for making quantitative decisions about a process or processes".
A statistical hypothesis test data transformation is a data transformation that has objective statistical hypothesis test.
Alejandra Gonzalez-Beltran
James Malone
Philippe Rocca-Serra
PERSON: James Malone
http://www.itl.nist.gov/div898/handbook/prc/section1/prc13.htm
NHST
Null Hypothesis Statistical Testing
statistical hypothesis testing
statistical hypothesis test
center value
A data item that is produced as the output of a center calculation data transformation and represents the center value of the input data.
PERSON: James Malone
PERSON: Monnie McGee
median
center value
statistical hypothesis test objective
is a data transformation objective where the aim is to estimate statistical significance with the aim of proving or disproving a hypothesis by means of some data transformation
James Malone
Person:Helen Parkinson
hypothesis test objective
WEB: http://en.wikipedia.org/wiki/Statistical_hypothesis_testing
statistical hypothesis test objective
portioning objective
The objective to obtain multiple aliquots of an enzyme preparation.
A material separation objective aiming to separate material into multiple portions, each of which contains a similar composition of the input material.
portioning objective
average value
A data item that is produced as the output of an averaging data transformation and represents the average value of the input data.
PERSON: James Malone
PERSON: Monnie McGee
arithmetic mean
average value
separation into different composition objective
The objective to obtain cells contained in a sample of blood.
A material separation objective aiming to separate a material entity that has parts of different types, and end with at least one output that is a material with parts of fewer types (modulo impurities).
We should be using has the grain relations or concentrations to distinguish the portioning and other sub-objectives
separation into different composition objective
specimen collection objective
The objective to collect bits of excrement in the rainforest. The objective to obtain a blood sample from a patient.
An objective specification to obtain a material entity for potential use as an input during an investigation.
Bjoern Peters
Bjoern Peters
specimen collection objective
material combination objective
is an objective to obtain an output material that contains several input materials.
PPPB branch
bp
material combination objective
paired-end library
PMID: 19339662. Next-generation DNA sequencing of paired-end tags (PET) for transcriptome and genome analyses. Genome Res. 2009 Apr;19(4):521-32. Fullwood MJ, Wei CL, Liu ET, Ruan Y.
is a collection of short paired tags that are extracted from the two ends of DNA fragments and covalently linked as ditag constructs
Philippe Rocca-Serra
mate-paired library
paired-end tag (PET) library
adapted from information provided by Solid web site
paired-end library
k-nearest neighbors
A k-nearest neighbors is a data transformation which achieves a class discovery or partitioning objective, in which an input data object with vector y is assigned a class label based upon the k training data set points closest to y: the label assigned is the one that occurs most often among those k points.
James Malone
k-NN
PERSON: James Malone
k-nearest neighbors
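The usual majority-vote reading of k-nearest neighbors can be sketched as below. This is an illustration under stated assumptions, not a definition from the ontology: Euclidean distance is assumed as the closeness measure, and the names `knn_classify` and `training` are hypothetical.

```python
from collections import Counter
from math import dist


def knn_classify(training, y, k):
    """Assign y the label most frequent among its k closest training points.

    training: list of (vector, label) pairs; y: vector of the same length.
    """
    if not 1 <= k <= len(training):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort the training points by Euclidean distance to y and keep the k closest.
    neighbours = sorted(training, key=lambda item: dist(item[0], y))[:k]
    votes = Counter(label for _, label in neighbours)
    # On a tie, the label encountered first among the nearest neighbours wins.
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    training = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((6, 5), "b"), ((5, 6), "b")]
    print(knn_classify(training, (0.2, 0.3), k=3))
```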
recombinant vector
A recombinant vector is created by a recombinant vector cloning process, and contains nucleic acids that can be amplified. It retains functions of the original cloning vector.
recombinant vector
single fragment library
is a collection of short tags that are extracted from DNA fragments and covalently linked as single tag constructs
Philippe Rocca-Serra
fragment library
single fragment library
cloning vector
A cloning vector is an engineered material that is used as an input material for a recombinant vector cloning process to carry inserted nucleic acids. It contains an origin of replication for a specific destination host organism, encodes for a selectable gene product and contains a cloning site.
cloning vector
1
2
1
true
1
true
1
2
1
Student's t-test
Student's t-test is a data transformation with the objective of a statistical hypothesis test in which the test statistic has a Student's t distribution if the null hypothesis is true. It is applied when the population is assumed to be normally distributed but the sample sizes are small enough that the statistic on which inference is based is not normally distributed because it relies on an uncertain estimate of standard deviation rather than on a precisely known value.
Alejandra Gonzalez-Beltran
James Malone
Philippe Rocca-Serra
t-test
WEB: http://en.wikipedia.org/wiki/T-test
t.test(dependent_variable ~ independent_variable, data = dataset, var.equal = FALSE)
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/t.test.html
Student's t-test
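The R usage recorded for this term sets var.equal = FALSE, which for two samples corresponds to the unequal-variance (Welch) form of the t-test. The sketch below computes only the test statistic and the Welch-Satterthwaite degrees of freedom for that form; it does not compute a p-value. It is an illustration using the Python standard library, and the name `welch_t_statistic` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(sample_a, sample_b):
    """Return (t, df) for a two-sample t-test without assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / sample size).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


if __name__ == "__main__":
    t, df = welch_t_statistic([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
    print(round(t, 4), round(df, 4))
```

Under the null hypothesis the statistic is then compared against a Student's t distribution with df degrees of freedom, which requires a t-distribution table or a statistics library.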
material sample role
a role borne by a portion of blood taken to represent all the blood in an organism; the role borne by a population of humans with HIV enrolled in a study taken to represent patients with HIV in general.
A material sample role is a specimen role borne by a material entity that is the output of a material sampling process.
7/13/09: Note that this is a relational role: between the sample taken and the 'sampled' material of which the sample is thought to be representative.
material sample role
material sampling process
A specimen gathering process with the objective to obtain a specimen that is representative of the input material entity
material sampling process
material sample
blood drawn from patient to measure his systemic glucose level. A population of humans with HIV enrolled in a study taken to represent patients with HIV in general.
A material entity that has the material sample role
OBI: workshop
sample population
sample
material sample
independent variable specification
In a study in which gene expression is measured in patients between 8 months and 4 years old that have mild or severe malaria and in which the hypothesis is that gene expression in that age group is a function of disease status, disease status is the independent variable.
a directive information entity that is part of a study design. Independent variables are entities whose values are selected to determine their relationship to an observed phenomenon (the dependent variable). In such an experiment, an attempt is made to find evidence that the values of the independent variable determine the values of the dependent variable (that which is being measured). The independent variable can be changed as required, and its values do not represent a problem requiring explanation in an analysis, but are taken simply as given. The dependent variable, on the other hand, usually cannot be directly controlled.
2/2/2009 Original definition - In the design of experiments, independent variables are those whose values are controlled or selected by the person experimenting (experimenter) to determine its relationship to an observed phenomenon (the dependent variable). In such an experiment, an attempt is made to find evidence that the values of the independent variable determine the values of the dependent variable (that which is being measured). The independent variable can be changed as required, and its values do not represent a problem requiring explanation in an analysis, but are taken simply as given. The dependent variable on the other hand, usually cannot be directly controlled.
In the Philly 2013 workshop the label was chosen to distinguish it from "dependent variable" as used in statistical modelling. See: http://en.wikipedia.org/wiki/Statistical_modeling
an independent variable is a variable which assumes only values set by the operator according to a plan and which are expected to (or are being tested for) influence the ranges of values assumed by one or more dependent variables (also known as 'response variables').
PERSON: Alan Ruttenberg
PERSON: Bjoern Peters
PERSON: Chris Stoeckert
experimental factor
independent variable
Web: http://en.wikipedia.org/wiki/Dependent_and_independent_variables
2009-03-16: work has been done on this term during the OBI workshop winter 2009 and the current definition was considered acceptable for use in OBI. If there is a need to modify this definition please notify OBI.
study factor
explanatory variable
factor
study design independent variable
dependent variable specification
In a study in which gene expression is measured in patients between 8 months and 4 years old that have mild or severe malaria and in which the hypothesis is that gene expression in that age group is a function of disease status, the gene expression is the dependent variable.
dependent variable specification is part of a study design. The dependent variable is the event studied and expected to change when the independent variable varies.
2/2/2009 In the design of experiments, independent variables are those whose values are controlled or selected by the person experimenting (experimenter) to determine its relationship to an observed phenomenon (the dependent variable). In such an experiment, an attempt is made to find evidence that the values of the independent variable determine the values of the dependent variable (that which is being measured). The independent variable can be changed as required, and its values do not represent a problem requiring explanation in an analysis, but are taken simply as given. The dependent variable on the other hand, usually cannot be directly controlled.
In the Philly 2013 workshop the label was chosen to distinguish it from "dependent variable" as used in statistical modelling. See: http://en.wikipedia.org/wiki/Statistical_modeling
PERSON: Alan Ruttenberg
PERSON: Bjoern Peters
PERSON: Chris Stoeckert
dependent variable
WEB: http://en.wikipedia.org/wiki/Dependent_and_independent_variables
2009-03-16: work has been done on this term during the OBI workshop winter 2009 and the current definition was considered acceptable for use in OBI. If there is a need to modify this definition please notify OBI.
response variable
study design dependent variable
survival rate
A measurement datum that represents the percentage of people or animals in a study or treatment group who are alive for a given period of time after diagnosis or initiation of monitoring.
Oliver He
adapted from wikipedia
http://en.wikipedia.org/wiki/Survival_rate
survival rate
multiple testing correction objective
Application of the Bonferroni correction
A multiple testing correction objective is a data transformation objective where the aim is to correct for a set of statistical inferences considered simultaneously
multiple comparison correction objective
http://en.wikipedia.org/wiki/Multiple_Testing_Correction
multiple testing correction objective
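The Bonferroni correction named in the example above can be sketched as follows: each raw p-value is multiplied by the number of tests considered simultaneously and capped at 1. This is a minimal illustration of that single method, not of multiple testing correction in general; the name `bonferroni_adjust` is hypothetical.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: min(1, p * number_of_tests)."""
    raw = list(p_values)
    n_tests = len(raw)
    for p in raw:
        if not 0.0 <= p <= 1.0:
            raise ValueError("p-values must lie in [0, 1]")
    return [min(1.0, p * n_tests) for p in raw]


if __name__ == "__main__":
    print(bonferroni_adjust([0.01, 0.04, 0.5]))
```

An adjusted value is then compared against the chosen significance level exactly as an unadjusted p-value would be.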
material maintenance objective
An objective specification to maintain some or all of the qualities of a material over time.
PERSON: Bjoern Peters
PERSON: Bjoern Peters
material maintenance objective
primary structure of DNA macromolecule
a quality of a DNA molecule that inheres in its bearer due to the order of its DNA nucleotide residues.
placeholder for SO
BP et al
primary structure of DNA macromolecule
measurement device
A ruler, a microarray scanner, a Geiger counter.
A device in which a measure function inheres.
GROUP:OBI Philly workshop
OBI
measurement device
material maintenance
a process that achieves the objective to maintain some or all of the characteristics of an input material over time
material maintenance
polyA RNA extraction
An RNA extraction process typically involving the use of poly dT oligomers in which the desired output material is polyA RNA.
Person: Chris Stoeckert
Person: Jie Zheng
UPenn Group
polyA RNA extraction
1
2
Likelihood-ratio test
The likelihood-ratio test is a data transformation which tests whether there is evidence of the need to move from a simple model to a more complicated one (where the simple model is nested within the complicated one); it tests the goodness-of-fit between two models.
date: March 2013
AGB and PRS provided a formal definition that expresses the test in terms of output and input, specifying the nature of the variables, the purpose of the test and the distribution used.
Alejandra Gonzalez-Beltran
Philippe Rocca-Serra
Tina Boussard
lrtest()
http://hosho.ees.hokudai.ac.jp/~kubo/Rdoc/library/lmtest/html/lrtest.html
Likelihood-ratio test
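The comparison of a simple model nested within a more complicated one can be sketched from the two maximized log-likelihoods. The sketch below is an illustration only and does not reproduce the lrtest() function referenced above. It assumes the standard large-sample result that the statistic follows a chi-square distribution with degrees of freedom equal to the number of extra parameters, which the definition itself does not state, and it gives a p-value only for the one-extra-parameter case. All function names are hypothetical.

```python
from math import erfc, sqrt


def likelihood_ratio_statistic(loglik_simple, loglik_complex):
    """Return 2 * (log-likelihood of complex model - log-likelihood of simple model)."""
    return 2.0 * (loglik_complex - loglik_simple)


def p_value_one_extra_parameter(statistic):
    """Upper-tail chi-square probability for 1 degree of freedom.

    Uses the identity P(X > x) = erfc(sqrt(x / 2)) for X chi-square with 1 df.
    """
    if statistic < 0:
        raise ValueError("statistic must be non-negative for nested models")
    return erfc(sqrt(statistic / 2.0))


if __name__ == "__main__":
    stat = likelihood_ratio_statistic(-120.5, -115.0)
    print(stat, p_value_one_extra_parameter(stat))
```

A small p-value is read as evidence in favour of moving to the more complicated model.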
survival curve
A survival curve is a report graph which is a graphical representation of data where the percentage of survival is plotted as a function of time.
Alejandra Gonzalez-Beltran
PERSON:Chris Stoeckert
PERSON:James Malone
PERSON:Melanie Courtot
Philippe Rocca-Serra
WEB: http://www.graphpad.com/www/book/survive.htm
survival curve
flow cytometry assay
Using a flow cytometer to quantitate the percent of CD3 positive cells in a population by labeling them with a FITC tagged anti-CD3 antibody.
A cytometry assay in which an input cell population is put in solution, is passed by a laser, and optical sensors are used to detect scattering of the laser light and/or fluorescence of specific markers to count and characterize the particles in solution.
IEDB
IEDB
flow cytometry assay
labeled specimen
A specimen that has been modified in order to be able to detect it in future experiments
added during call 3/1/2010
OBI group
labeled specimen
study intervention
the part of the execution of an intervention design study which is varied between two or more subjects in the study
PERSON: Bjoern Peters
GROUP: OBI
study intervention
material separation device
flow cytometer
A device with a separation function realized in a planned process
material separation device
categorical measurement datum
A measurement datum that is reported on a categorical scale
Bjoern Peters
nominal measurement datum
Bjoern Peters
categorical measurement datum
processed specimen
A tissue sample that has been sliced and stained for a histology study.
A blood specimen that has been centrifuged to obtain the white blood cells.
A specimen that has been intentionally physically modified.
Bjoern Peters
Bjoern Peters
A tissue sample that has been sliced and stained for a histology study.
processed specimen
categorical label
The labels 'positive' vs. 'negative', or 'left handed', 'right handed', 'ambidextrous', or 'strongly binding', 'weakly binding', 'not binding', or '+++', '++', '+', '-' etc. form scales of categorical labels.
A label that is part of a categorical datum and that indicates the value of the data item on the categorical scale.
Bjoern Peters
Bjoern Peters
categorical label
in live cell assay
An assay in which a measurement is made by observing entities located in a live cell.
in live cell assay
container
A device that can be used to restrict the location of material entities over time
03/21/2010: Added to allow classification of children (similar to what we want to do for 'measurement device'). Looking at what classifies here, we may want to reconsider whether a contain function assigned to a part of an entity is necessarily also a function of the whole (e.g. is a centrifuge a container because it has test tubes as parts?)
PERSON: Bjoern Peters
container
device
A voltmeter is a measurement device which is intended to perform some measure function.
An autoclave is a device that sterilizes instruments or contaminated waste by applying high temperature and pressure.
A material entity that is designed to perform a function in a scientific investigation, but is not a reagent.
2012-12-17 JAO: In common lab usage, there is a distinction made between devices and reagents that is difficult to model. Therefore we have chosen to specifically exclude reagents from the definition of "device", and are enumerating the types of roles that a reagent can perform.
2013-6-5 MHB: The following clarifications are outcomes of the May 2013 Philly Workshop. Reagents are distinguished from devices that also participate in scientific techniques by the fact that reagents are chemical or biological in nature and necessarily participate in some chemical interaction or reaction during the realization of their experimental role. By contrast, devices do not participate in such chemical reactions/interactions. Note that there are cases where devices use reagent components during their operation, where the reagent-device distinction is less clear. For example:
(1) An HPLC machine is considered a device, but has a column that holds a stationary phase resin as an operational component. This resin qualifies as a device if it participates purely in size exclusion, but bears a reagent role that is realized in the running of a column if it interacts electrostatically or chemically with the evaluant. The container the resin is in (“the column”) considered alone is a device. So the entire column as well as the entire HPLC machine are devices that have a reagent as an operating part.
(2) A pH meter is a device, but its electrode component bears a reagent role in virtue of its interacting directly with the evaluant in execution of an assay.
(3) A gel running box is a device that has a metallic lead as a component that participates in a chemical reaction with the running buffer when a charge is passed through it. This metallic lead is considered to have a reagent role as a component of this device realized in the running of a gel.
In the examples above, a reagent is an operational component of a device, but the device itself does not realize a reagent role (as bearing a reagent role is not transitive across the part_of relation). In this way, the asserted disjointness between a reagent and device holds, as both roles are never realized in the same bearer during execution of an assay.
PERSON: Helen Parkinson
instrument
OBI development call 2012-12-17.
device
sequence data
example of usage: the representation of a nucleotide sequence in FASTA format used for a sequence similarity search.
A measurement datum representing the primary structure of a macromolecule (its sequence), sometimes associated with an indicator of confidence of that measurement.
Person:Chris Stoeckert
GROUP: OBI
sequence data
dose
An organism has been injected with 1 ml of vaccine
A measurement datum that measures the quantity of something that may be administered to an organism or that an organism may be exposed to. Quantities of nutrients, drugs, vaccines and toxins are referred to as doses.
dose
nucleic acid extract
An extract that is the output of an extraction process in which nucleic acid molecules are isolated from a specimen.
PERSON: Jie Zheng
UPenn Group
nucleic acid extract
light emission device
A light source is an optical subsystem that provides light for use in a distant area using a delivery system (e.g., fiber optics)
a device which has a function to emit light.
Person:Helen Parkinson
OBI
light emission device
environmental control device
A growth chamber is an environmental control device.
An environmental control device is a device which has the function to control some aspect of the environment such as temperature, or humidity.
Helen Parkinson
OBI
environmental control device
labeled nucleic acid extract
a labeled specimen that is the output of a labeling process and has grain labeled nucleic acid for detection of the nucleic acid in future experiments.
Person: Jie Zheng
labeled extract
MO_221 labeledExtract
labeled extract
labeled nucleic acid extract
dose response curve
A data item of paired values, one indicating the dose of a material, the other quantitating a measured effect at that dose. The dosing intervals are chosen so that effect values can be interpolated by plotting a curve.
Bjoern Peters; Randi Vita
Philippe Rocca-Serra, Alejandra Gonzalez-Beltran
dose response curve
genetic population background information
genotype information 'C57BL/6J Hnf1a+/-' in this case, C57BL/6J is the genetic population background information
a genetic characteristics information which is a part of genotype information that identifies the population of organisms
proposed and discussed on San Diego OBI workshop, March 2011
Group: OBI group
Group: OBI group
genetic population background information
FWER adjusted p-value
http://ugrad.stat.ubc.ca/R/library/LPE/html/mt.rawp2adjp.html
A quantitative confidence value resulting from a multiple testing error correction method which adjusts the p-value used as input to control for Type I error in the context of multiple pairwise tests
Addition of restriction 'output of null hypothesis testing' and specified output by AGB and PRS while working on STATO
PERS:Philippe Rocca-Serra
adapted from wikipedia (http://en.wikipedia.org/wiki/Familywise_error_rate)
Family-wise type I error rate
FWER adjusted p-value
RNA-seq assay
An assay in which sequencing technology (e.g. Solexa/454) is used to generate RNA sequence, analyse the transcribed regions of the genome, and/or to quantitate transcript abundance
PERSON: James Malone
transcription profiling by high throughput sequencing
EFO_0002770 transcription profiling by high throughput sequencing
JZ: should be inferred as 'DNA sequencing'. Will check in the future.
an assay that uses high-throughput sequencing technologies to sequence cDNA in order to get information about a sample's RNA content. RNA-Seq provides researchers with efficient ways to measure transcriptome data experimentally, allowing them to get information such as how different alleles of a gene are expressed, detect post-transcriptional mutations or identify gene fusions.
WEB: http://en.wikipedia.org/wiki/RNA-Seq
RNA-seq assay
genotype information
Genotype information can be: Mus musculus wild type (in this case the genetic population background information is Mus musculus), C57BL/6J Hnf1a+/- (in this case, C57BL/6J is the genetic population background information and Hnf1a+/- is the allele information)
a genetic characteristics information that is about the genetic material of an organism and minimally includes information about the genetic background and can in addition contain information about specific alleles, genetic modifications, etc.
discussed on San Diego OBI workshop, March 2011
Group: OBI group
Group: OBI group
genotype information
transcription profiling identification objective
A molecular feature identification objective that aims to characterize the abundance of transcripts
Person: Chris Stoeckert, Jie Zheng
Group: Penn Group
transcription profiling identification objective
allele information
genotype information 'C57BL/6J Hnf1a+/-' in this case, Hnf1a+/- is the allele information
a genetic alteration information that is about one of two or more alternative forms of a gene or marker sequence and differing from other alleles at one or more mutational sites based on sequence. Polymorphisms are included in this definition.
discussed on San Diego OBI workshop, March 2011
Person: Chris Stoeckert, Jie Zheng
MO_58 Allele
allele information
genetic alteration information
a genetic characteristics information that is about known changes or the lack thereof from the genetic background, including allele information, duplication, insertion, deletion, etc.
proposed and discussed on San Diego OBI workshop, March 2011
Group: OBI group
Group: OBI group
genetic alteration information
genetic characteristics information
a data item that is about genetic material including polymorphisms, disease alleles, and haplotypes.
Person: Chris Stoeckert, Jie Zheng
MO_66 IndividualGeneticCharacteristics
MO definition:
The genotype of the individual organism from which the biomaterial was derived. Individual genetic characteristics include polymorphisms, disease alleles, and haplotypes.
examples in ArrayExpress
wild_type
MutaMouse (CD2F1 mice with lambda-gt10LacZ integration)
AlfpCre; SNF5 flox/knockout
p53 knock out
C57Bl/6 gp130lox/lox MLC2vCRE/+
fer-15; fem-1
df/df
pat1-114/pat1-114 ade6-M210/ade6-M216 h+/h+ (cells are diploid)
genetic characteristics information
q-value
PMID: 20483222. Comp Biochem Physiol Part D Genomics Proteomics. 2008 Sep;3(3):234-42. Analysis of Sus scrofa liver proteome and identification of proteins differentially expressed between genders, and conventional and genetically enhanced lines.
"After controlling the false discovery rate (FDR</=0.1) using the Storey q value only four proteins (EPHX1, CAT, PAH, ST13) were shown to be differentially expressed between genders (Males/Females) and two proteins (SELENBP2, TAGLN) were differentially expressed between two lines (Transgenic/Conventional pigs)"
A quantitative confidence value that measures the minimum false discovery rate that is incurred when calling that test significant.
To compute q-values, it is necessary to know the p-value produced by a test and possibly set a false discovery rate level.
Addition of restriction 'output of null hypothesis testing' by AGB and PRS while working on STATO
PERS:Philippe Rocca-Serra
FDR adjusted p-value
Adapted from several sources, including
http://en.wikipedia.org/wiki/False_discovery_rate
http://svitsrv25.epfl.ch/R-doc/library/qvalue.html
q
q-value
genotyping design
A study design that classifies an individual or group of individuals on the basis of alleles, haplotypes, SNPs.
Person: Chris Stoeckert, Jie Zheng
MO_560 genotyping_design
genotyping design
specimen from organism
A specimen that derives from an anatomical part or substance arising from an organism. Examples of tissue specimen include tissue, organ, physiological system, blood, or body location (arm).
PERSON: Chris Stoeckert, Jie Zheng
tissue specimen
MO_954 organism_part
specimen from organism
fluorescence detection assay
Using a laser to stimulate a cell culture that was previously labeled with fluorescent antibodies to detect light emission at a different wavelength in order to determine the presence of surface markers the antibodies are specific for.
An assay in which a material's fluorescence is determined.
IEDB
IEDB
fluorescence detection assay
rate measurement datum
The rate of disassociation of a peptide from a complex with an MHC molecule measured by the ratio of bound and unbound peptide per unit of time.
A scalar measurement datum that represents the number of events occurring over a time interval
PERSON: Bjoern Peters, Randi Vita
IEDB
rate measurement datum
DNA sequence data
The part of a FASTA file that contains the letters ACTGGGAA
A sequence data item that is about the primary structure of DNA
OBI call; Bjoern Peters
OBI call; Melanie Courtot
8/29/11 call: This is added after a request from Melanie and Yu. They should review it further. This should be a child of 'sequence data', and as of the current definition will infer there.
DNA sequence data
selection criterion
rats should be aged between 6 and 8 weeks and weigh between 180 and 250 grams
A directive information entity which defines and states a principle or standard by which a selection process may take place.
Person: Philippe Rocca-Serra
selection rule
OBI discussion summarized under the following tracker item : http://sourceforge.net/p/obi/obi-terms/678/
selection criterion
drawing a conclusion
Concluding that the length of the hypotenuse is equal to the square root of the sum of squares of the other two sides in a right-triangle.
Concluding that a gene is upregulated in a tissue sample based on the band intensity in a western blot. Concluding that a patient has an infection based on measurement of an elevated body temperature and reported headache. Concluding that there were problems in an investigation because data from PCR and microarray are conflicting.
A planned process in which new information is inferred from existing information.
drawing a conclusion
assay array
A device made to be used in an analyte assay for immobilization of substances that bind the analyte at regular spatial positions on a surface.
PERSON: Chris Stoeckert, Jie Zheng, Alan Ruttenberg
Penn Group
assay array
conclusion based on data
The conclusion that a gene is upregulated in a tissue sample based on the band intensity in a western blot. The conclusion that a patient has an infection based on measurement of an elevated body temperature and reported headache. The conclusion that there were problems in an investigation because data from PCR and microarray are conflicting.
The following are NOT conclusions based on data: data themselves; results from pure mathematics, e.g. "13 is prime".
An information content entity that is inferred from data.
In the Philly 2013 workshop, we recognized the limitations of "conclusion textual entity", and we introduced this as more general. The need for the 'textual entity' term going forward is up for future debate.
Group:2013 Philly Workshop group
Group:2013 Philly Workshop group
conclusion based on data
cell freezing medium
A processed material that serves as a liquid vehicle for freezing cells for long term quiescent storage, which contains chemicals needed to sustain cell viability across freeze-thaw cycles.
PERSON: Matthew Brush
cell freezing medium
categorical value specification
A value specification that specifies one category out of a fixed number of nominal categories
PERSON:Bjoern Peters
categorical value specification
1
1
scalar value specification
A value specification that consists of two parts: a numeral and a unit label
PERSON:Bjoern Peters
scalar value specification
value specification
The value of 'positive' in a classification scheme of "positive or negative"; the value of '20g' on the quantitative scale of mass.
An information content entity that specifies a value within a classification scheme or on a quantitative scale.
This term is currently a descendant of 'information content entity', which requires that it 'is about' something. A value specification of '20g' for a measurement data item of the mass of a particular mouse 'is about' the mass of that mouse. However there are cases where a value specification is not clearly about any particular. In the future we may change 'value specification' to remove the 'is about' requirement.
PERSON:Bjoern Peters
value specification
molecular-labeled material
a material entity that is the specified output of an addition of molecular label process that aims to label some molecular target to allow for its detection in a detection of molecular label assay
PERSON:Matthew Brush
OBI developer call, 3-12-12
molecular-labeled material
cytometry assay
An intracellular material detection by flow cytometry assay measuring perforin inside a culture of T cells.
An assay that measures properties of cells.
IEDB
IEDB
cytometry assay
physical store
a freezer. a humidity controlled box.
A container with an environmental control function.
For details see tracker item: http://sourceforge.net/p/obi/obi-terms/793/
Chris Stoeckert
Duke Biobank, OBIB
Biobank
physical store
measurand role
A role borne by a material entity and realized in an assay which achieves the objective to measure the magnitude/concentration/amount of the measurand in the entity bearing evaluant role.
Person: Alan Ruttenberg, Jie Zheng
https://en.wiktionary.org/wiki/measurand
https://github.com/obi-ontology/obi/issues/778
measurand role
organism
animal
fungus
plant
virus
A material entity that is an individual living system, such as animal, plant, bacteria or virus, that is capable of replicating or reproducing, growth and maintenance in the right environment. An organism may be unicellular or made up, like humans, of many billions of cells divided into specialized tissues and organs.
10/21/09: This is a placeholder term, that should ideally be imported from the NCBI taxonomy, but the high level hierarchy there does not suit our needs (includes plasmids and 'other organisms')
13-02-2009:
OBI doesn't take position as to when an organism starts or ends being an organism - e.g. sperm, foetus.
This issue is outside the scope of OBI.
GROUP: OBI Biomaterial Branch
WEB: http://en.wikipedia.org/wiki/Organism
organism
specimen
Biobanking of blood taken and stored in a freezer for potential future investigations stores specimen.
A material entity that has the specimen role.
Note: definition is in specimen creation objective which is defined as an objective to obtain and store a material entity for potential use as an input during an investigation.
PERSON: James Malone
PERSON: Philippe Rocca-Serra
GROUP: OBI Biomaterial Branch
specimen
cultured cell population
A cultured cell population applied in an experiment: "293 cells expressing TrkA were serum-starved for 18 hours and then neurotrophins were added for 10 min before cell harvest." (Lee, Ramee, et al. "Regulation of cell survival by secreted proneurotrophins." Science 294.5548 (2001): 1945-1948).
A cultured cell population maintained in vitro: "Rat cortical neurons from 15 day embryos are grown in dissociated cell culture and maintained in vitro for 8–12 weeks" (Dichter, Marc A. "Rat cortical neurons in cell culture: culture methods, cell morphology, electrophysiology, and synapse formation." Brain Research 149.2 (1978): 279-293).
A processed material comprised of a collection of cultured cells that has been continuously maintained together in culture and shares a common propagation history.
2013-6-5 MHB: This OBI class was formerly called 'cell culture', but label changed and definition updated following CLO alignment efforts in spring 2013, during which the intent of this class was clarified to refer to portions of a culture or line rather than a complete cell culture or line.
PERSON:Matthew Brush
cell culture sample
PERSON:Matthew Brush
The extent of a 'cultured cell population' is restricted only in that all cell members must share a propagation history (ie be derived through a common lineage of passages from an initial culture). In being defined in this way, this class can be used to refer to the populations that researchers actually use in the practice of science - ie are the inputs to culturing, experimentation, and sharing. The cells in such populations will be a relatively uniform population as they have experienced similar selective pressures due to their continuous co-propagation. And this population will also have a single passage number, again owing to their common passaging history. Cultured cell populations represent only a collection of cells (ie do not include media, culture dishes, etc), and include populations of cultured unicellular organisms or cultured multicellular organism cells. They can exist under active culture, stored in a quiescent state for future use, or applied experimentally.
cultured cell population
screening library
PMID: 15615535.J Med Chem. 2004 Dec 30;47(27):6864-74.A screening library for peptide activated G-protein coupled receptors. 1. The test set. [cdna_library, phage display library]
a screening library is a collection of materials engineered to identify qualities of a subset of its members during a screening process?
PRS: 22-02-2008: while working on definition of cDNA library and looking at current example of usage, a screening library should be a defined class -> any material library which has input_role in a screening protocol application
change biomaterial to material in definition
PERSON: Bjoern Peters
GROUP: IEDB
7/13/09: Need to clarify if this meets reagent role definition
screening library
data transformation
The application of a clustering protocol to microarray data or the application of a statistical testing method on a primary data set to determine a p-value.
A planned process that produces output data from input data.
Elisabetta Manduchi
Helen Parkinson
James Malone
Melanie Courtot
Philippe Rocca-Serra
Richard Scheuermann
Ryan Brinkman
Tina Hernandez-Boussard
data analysis
data processing
Branch editors
data transformation
differential expression analysis objective
Analyses implemented by the SAM (http://www-stat.stanford.edu/~tibs/SAM), PaGE (www.cbil.upenn.edu/PaGE) or GSEA (www.broad.mit.edu/gsea/) algorithms and software
A differential expression analysis objective is a data transformation objective whose input consists of expression levels of entities (such as transcripts or proteins), or of sets of such expression levels, under two or more conditions and whose output reflects which of these are likely to have different expression across such conditions.
Elisabetta Manduchi
PERSON: Elisabetta Manduchi
differential expression analysis objective
Benjamini and Hochberg false discovery rate correction method
Statistical significance of the 8 most represented biological processes (GO level 4) among E7 6 month upregulated genes following analysis with DAVID software; Benjamini-Hochberg FDR (false discovery rate)
A data transformation process in which the Benjamini and Hochberg method sequential p-value procedure is applied with the aim of correcting false discovery rate
2011-03-31: [PRS].
specified input and output of dt which were missing
Helen Parkinson
Philippe Rocca-Serra
Helen Parkinson
Benjamini and Hochberg false discovery rate correction method
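The Benjamini and Hochberg entry above names a sequential p-value procedure without spelling it out. The following is a minimal sketch of the usual step-up adjustment, assuming the input is a plain list of p-values from independent tests; the function name `benjamini_hochberg` is hypothetical and is not part of OBI or STATO.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted in increasing order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never decrease as the raw p-value increases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A hypothesis would then be rejected at false discovery rate level alpha when its adjusted value is at most alpha.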
k-means clustering
A k-means clustering is a data transformation which achieves a class discovery or partitioning objective, which takes as input a collection of objects (represented as points in multidimensional space) and which partitions them into a specified number k of clusters. The algorithm attempts to find the centers of natural clusters in the data. The most common form of the algorithm starts by partitioning the input points into k initial sets, either at random or using some heuristic data. It then calculates the mean point, or centroid, of each set. It constructs a new partition by associating each point with the closest centroid. Then the centroids are recalculated for the new clusters, and the algorithm repeated by alternate applications of these two steps until convergence, which is obtained when the points no longer switch clusters (or alternatively centroids are no longer changed).
Elisabetta Manduchi
James Malone
Philippe Rocca-Serra
WEB: http://en.wikipedia.org/wiki/K-means
k-means clustering
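The two alternating steps in the k-means definition above (assign each point to the closest centroid, then recompute centroids) can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: points are tuples of numbers, distance is Euclidean, and the initial centroids are k randomly sampled input points rather than the initial partition mentioned in the definition. The name `k_means` and its parameters are hypothetical.

```python
import math
import random


def k_means(points, k, max_iter=100, seed=0):
    """Partition points (tuples of numbers) into k clusters."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points.
    centroids = rng.sample(points, k)
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its closest centroid.
        new_assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Convergence: no point switched cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return assignment, centroids


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(k_means(data, k=2))
```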
hierarchical clustering
A hierarchical clustering is a data transformation which achieves a class discovery objective, which takes as input data item and builds a hierarchy of clusters. The traditional representation of this hierarchy is a tree (visualized by a dendrogram), with the individual input objects at one end (leaves) and a single cluster containing every object at the other (root).
James Malone
WEB: http://en.wikipedia.org/wiki/Data_clustering#Hierarchical_clustering
hierarchical clustering
average linkage hierarchical clustering
An average linkage hierarchical clustering is an agglomerative hierarchical clustering which generates successive clusters based on a distance measure, where the distance between two clusters is calculated as the average distance between objects from the first cluster and objects from the second cluster.
Elisabetta Manduchi
PERSON: Elisabetta Manduchi
average linkage hierarchical clustering
complete linkage hierarchical clustering
an agglomerative hierarchical clustering which generates successive clusters based on a distance measure, where the distance between two clusters is calculated as the maximum distance between objects from the first cluster and objects from the second cluster.
Elisabetta Manduchi
PERSON: Elisabetta Manduchi
complete linkage hierarchical clustering
single linkage hierarchical clustering
A single linkage hierarchical clustering is an agglomerative hierarchical clustering which generates successive clusters based on a distance measure, where the distance between two clusters is calculated as the minimum distance between objects from the first cluster and objects from the second cluster.
Elisabetta Manduchi
PERSON: Elisabetta Manduchi
single linkage hierarchical clustering
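The average, complete and single linkage entries above differ only in how the distance between two clusters is computed. The sketch below makes that difference concrete, assuming Euclidean distance between points given as tuples; `cluster_distance` and `agglomerate` are hypothetical names chosen for illustration.

```python
import math
from itertools import product


def cluster_distance(a, b, linkage):
    """Distance between clusters a and b (lists of points)."""
    pairwise = [math.dist(p, q) for p, q in product(a, b)]
    if linkage == "single":      # minimum pairwise distance
        return min(pairwise)
    if linkage == "complete":    # maximum pairwise distance
        return max(pairwise)
    if linkage == "average":     # mean pairwise distance
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage: {linkage}")


def agglomerate(points, linkage="average"):
    """Merge the two closest clusters until one remains.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples.
    """
    clusters = [[p] for p in points]  # start with one cluster per point
    merges = []
    while len(clusters) > 1:
        pairs = (
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        i, j = min(
            pairs,
            key=lambda ij: cluster_distance(
                clusters[ij[0]], clusters[ij[1]], linkage
            ),
        )
        d = cluster_distance(clusters[i], clusters[j], linkage)
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [
            c for idx, c in enumerate(clusters) if idx not in (i, j)
        ] + [merged]
    return merges


if __name__ == "__main__":
    for step in agglomerate([(0.0,), (1.0,), (10.0,)], linkage="single"):
        print(step)
```

The merge history is the information a dendrogram would display, with the merge distance as the height of each join.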
Benjamini and Yekutieli false discovery rate correction method
The expression set was compared univariately between the stroke patients and controls, gene list was generated using False Discovery Rate correction (Benjamini and Yekutieli)
A data transformation in which the Benjamini and Yekutieli method is applied with the aim of correcting false discovery rate
2011-03-31: [PRS].
specified input and output of dt which were missing
Helen Parkinson
Philippe Rocca-Serra
Helen Parkinson
Benjamini and Yekutieli false discovery rate correction method
dimensionality reduction
A dimensionality reduction is data partitioning which transforms each input m-dimensional vector (x_1, x_2, ..., x_m) into an output n-dimensional vector (y_1, y_2, ..., y_n), where n is smaller than m.
Elisabetta Manduchi
James Malone
Melanie Courtot
Philippe Rocca-Serra
data projection
PERSON: Elisabetta Manduchi
PERSON: James Malone
PERSON: Melanie Courtot
dimensionality reduction
principal components analysis dimensionality reduction
A principal components analysis dimensionality reduction is a dimensionality reduction achieved by applying principal components analysis and by keeping low-order principal components and excluding higher-order ones.
Elisabetta Manduchi
James Malone
Melanie Courtot
Philippe Rocca-Serra
pca data reduction
PERSON: Elisabetta Manduchi
PERSON: James Malone
PERSON: Melanie Courtot
principal components analysis dimensionality reduction
Holm-Bonferroni family-wise error rate correction method
t-tests were used with the type I error adjusted for multiple comparisons, Holm's correction (HOLM 1979), and false discovery rate, http://www.genetics.org/cgi/content/full/172/2/1179
a data transformation that performs more than one hypothesis test simultaneously, a closed-test procedure, that controls the familywise error rate for all the k hypotheses at level α in the strong sense. Objective: multiple testing correction
2011-03-14: [PRS]. Class Label has been changed to address the conflict with the definition
Also added restriction to specify the output to be a FWER adjusted p-value
The 'editor preferred term' should be removed
Person:Helen Parkinson
Philippe Rocca-Serra
WEB: http://en.wikipedia.org/wiki/Holm%E2%80%93Bonferroni_method
Bonferroni adjustment method
Holm-Bonferroni family-wise error rate correction method
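As a rough illustration of the Holm step-down procedure referenced above, the sketch below computes FWER adjusted p-values from a list of raw p-values. The function name `holm_bonferroni` is hypothetical; the multipliers (m for the smallest p-value, then m-1, and so on) follow the standard description of Holm's method.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted in increasing order.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Step-down multiplier: m for the smallest p-value, then m-1, ...
        candidate = min(1.0, (m - rank) * p_values[i])
        # Adjusted values must not decrease along the sorted order.
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


if __name__ == "__main__":
    print(holm_bonferroni([0.01, 0.04, 0.03, 0.005]))
```

A hypothesis would be rejected at family-wise error rate α when its adjusted value is at most α.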
family wise error rate correction method
A family wise error rate correction method is a multiple testing procedure that controls the probability of at least one false positive.
2011-03-31: [PRS].
creating a defined class by specifying the necessary output of dt
allows correct classification of FWER dt
Monnie McGee
Philippe Rocca-Serra
FWER correction
Dudoit, Sandrine and van der Laan, Mark J. (2008) Multiple Testing Procedures with Applications to Genomics. New York: Springer , p. 19
family wise error rate correction method
descriptive statistical calculation objective
A descriptive statistical calculation objective is a data transformation objective which concerns any calculation intended to describe a feature of a data set, for example, its center or its variability.
Elisabetta Manduchi
James Malone
Melanie Courtot
Monnie McGee
PERSON: Elisabetta Manduchi
PERSON: James Malone
PERSON: Melanie Courtot
PERSON: Monnie McGee
descriptive statistical calculation objective
survival analysis objective
Kaplan meier data transformation
A data transformation objective in which the data transformation aims to model time-to-event data (where events are e.g. death and/or disease recurrence); the purpose of survival analysis is to model the underlying distribution of event times and to assess the dependence of the event time on other explanatory variables
PERSON: James Malone
PERSON: Tina Boussard
survival analysis
http://en.wikipedia.org/wiki/Survival_analysis
survival analysis objective
multiple testing correction method
A multiple testing correction method is a hypothesis test performed simultaneously on M > 1 hypotheses. Multiple testing procedures produce a set of rejected hypotheses that is an estimate for the set of false null hypotheses while controlling for a suitably defined Type I error rate
Monnie McGee
multiple testing procedure
PAPER: Dudoit, Sandrine and van der Laan, Mark J. (2008) Multiple Testing Procedures with Applications to Genomics. New York: Springer , p. 9-10.
multiple testing correction method
logarithmic transformation
A logarithmic transformation is a data transformation consisting in the application of the logarithm function with a given base a (where a>0 and a is not equal to 1) to a (one dimensional) positive real number input. The logarithm function with base a can be defined as the inverse of the exponential function with the same base. See e.g. http://en.wikipedia.org/wiki/Logarithm.
Elisabetta Manduchi
WEB: http://en.wikipedia.org/wiki/Logarithm
logarithmic transformation
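The constraints in the logarithmic transformation definition above (base a > 0, a not equal to 1, positive input) and the inverse relationship with exponentiation can be checked with a short sketch. The function name `log_transform` is hypothetical.

```python
import math


def log_transform(x, base):
    """Logarithm of a positive real number x in the given base."""
    if x <= 0:
        raise ValueError("input must be a positive real number")
    if base <= 0 or base == 1:
        raise ValueError("base must be positive and not equal to 1")
    # Change of base: log_a(x) = ln(x) / ln(a).
    return math.log(x) / math.log(base)


if __name__ == "__main__":
    y = log_transform(8, 2)
    print(y)        # logarithm of 8 in base 2
    print(2 ** y)   # exponentiation with the same base undoes the transform
```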
regression analysis method
Regression analysis is a descriptive statistics technique that examines the relation of a dependent variable (response variable) to specified independent variables (explanatory variables). Regression analysis can be used as a descriptive method of data analysis (such as curve fitting) without relying on any assumptions about underlying processes generating the data.
Date:2013-11-15
Person: AGB,PRS
Adding restrictions, specifying model + parameter estimation process
change of label from 'regression analysis method' to 'regression analysis'
Alejandra Gonzalez-Beltran
Philippe Rocca-Serra
Tina Hernandez-Boussard
BOOK: Richard A. Berk, Regression Analysis: A Constructive Critique, Sage Publications (2004) 978-0761929048
regression analysis
regression analysis method
principal component regression
The Principal Component Regression method is a regression analysis method that combines the Principal Component Analysis (PCA) spectral decomposition with an Inverse Least Squares (ILS) regression method to create a quantitative model for complex samples. Unlike quantitation methods based directly on Beer's Law which attempt to calculate the absorptivity coefficients for the constituents of interest from a direct regression of the constituent concentrations onto the spectroscopic responses, the PCR method regresses the concentrations on the PCA scores.
Tina Hernandez-Boussard
WEB: http://www.thermo.com/com/cda/resources/resources_detail/1,2166,13414,00.html
principal component regression
data visualization
Generation of a heatmap from a microarray dataset
A planned process that creates images, diagrams or animations from the input data.
Elisabetta Manduchi
James Malone
Melanie Courtot
Tina Boussard
data encoding as image
visualization
PERSON: Elisabetta Manduchi
PERSON: James Malone
PERSON: Melanie Courtot
PERSON: Tina Boussard
Possible future hierarchy might include this:
information_encoding
>data_encoding
>>image_encoding
data visualization
mode calculation
A mode calculation is a descriptive statistics calculation in which the mode is calculated which is the most common value in a data set. It is most often used as a measure of center for discrete data.
James Malone
Monnie McGee
PERSON: James Malone
PERSON: Monnie McGee
From Monnie's file comments - need to add center_calculation role but it doesn't exist yet - (editor note added by James Jan 2008)
mode calculation
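A minimal sketch of the mode calculation described above, assuming the data set is a list of hashable values. Because a data set can have several equally common values, this hypothetical `modes` helper returns all of them rather than picking one.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often, in first-seen order."""
    counts = Counter(values)
    if not counts:
        raise ValueError("mode of an empty data set is undefined")
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]


if __name__ == "__main__":
    print(modes([1, 2, 2, 3, 3]))
    print(modes(["a", "b", "a"]))
```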
median calculation
A median calculation is a descriptive statistics calculation in which the midpoint of the data set (the 0.5 quantile) is calculated. First, the observations are sorted in increasing order. For an odd number of observations, the median is the middle value of the sorted data. For an even number of observations, the median is the average of the two middle values.
James Malone
Monnie McGee
PERSON: James Malone
PERSON: Monnie McGee
From Monnie's file comments - need to add center_calculation role but it doesn't exist yet - (editor note added by James Jan 2008)
median calculation
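The steps in the median calculation definition above (sort, then take the middle value or the average of the two middle values) translate directly into code. This is a small sketch for a list of numbers; the function name `median` is chosen for illustration.

```python
def median(values):
    """Return the 0.5 quantile of a non-empty list of numbers."""
    ordered = sorted(values)  # observations in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd number of observations: the middle value.
        return ordered[mid]
    # Even number of observations: average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


if __name__ == "__main__":
    print(median([3, 1, 2]))
    print(median([4, 1, 3, 2]))
```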
agglomerative hierarchical clustering
An agglomerative hierarchical clustering is a hierarchical clustering which starts with separate clusters and then successively combines these clusters until there is only one cluster remaining.
Elisabetta Manduchi
James Malone
bottom-up hierarchical clustering
PERSON: Elisabetta Manduchi
agglomerative hierarchical clustering
divisive hierarchical clustering
A divisive hierarchical clustering is a hierarchical clustering which starts with a single cluster and then successively splits resulting clusters until only clusters of individual objects remain.
Elisabetta Manduchi
James Malone
top-down hierarchical clustering
PERSON: Elisabetta Manduchi
divisive hierarchical clustering
false discovery rate correction method
A false discovery rate correction method is a data transformation used in multiple hypothesis testing to correct for multiple comparisons. It controls the expected proportion of incorrectly rejected null hypotheses (type I errors) in a list of rejected hypotheses. It is a less conservative comparison procedure with greater power than familywise error rate (FWER) control, at a cost of increasing the likelihood of obtaining type I errors.
2011-03-31: [PRS].
creating a defined class by specifying the necessary output of dt
allows correct classification of FDR dt
Monnie McGee
Philippe Rocca-Serra
FDR correction method
Dudoit, Sandrine and van der Laan, Mark J. (2008) Multiple Testing Procedures with Applications to Genomics. New York: Springer , p. 21 and http://www.wikidoc.org/index.php/False_discovery_rate
false discovery rate correction method
data transformation objective
normalize objective
An objective specification to transform input data into output data
Modified definition in 2013 Philly OBI workshop
James Malone
PERSON: James Malone
data transformation objective
data normalization objective
Quantile transformation which has normalization objective can be used for expression microarray assay normalization and it is referred to as "quantile normalization", according to the procedure described e.g. in PMID 12538238.
A normalization objective is a data transformation objective where the aim is to remove systematic sources of variation to put the data on equal footing in order to create a common base for comparisons.
Elisabetta Manduchi
Helen Parkinson
James Malone
PERSON: Elisabetta Manduchi
PERSON: Helen Parkinson
PERSON: James Malone
data normalization objective
correction objective
Type I error correction
A correction objective is a data transformation objective where the aim is to correct for error, noise or other impairments to the input of the data transformation or derived from the data transformation itself
James Malone
PERSON: James Malone
PERSON: Melanie Courtot
correction objective
normalization data transformation
A normalization data transformation is a data transformation that has objective normalization.
James Malone
PERSON: James Malone
normalization data transformation
averaging data transformation
An averaging data transformation is a data transformation that has objective averaging.
James Malone
PERSON: James Malone
averaging data transformation
partitioning data transformation
A partitioning data transformation is a data transformation that has objective partitioning.
James Malone
PERSON: James Malone
partitioning data transformation
partitioning objective
A k-means clustering which has partitioning objective is a data transformation in which the input data is partitioned into k output sets.
A partitioning objective is a data transformation objective where the aim is to generate a collection of disjoint non-empty subsets whose union equals a non-empty input set.
Elisabetta Manduchi
James Malone
PERSON: Elisabetta Manduchi
partitioning objective
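The partitioning objective above requires the output to be disjoint, non-empty subsets whose union equals a non-empty input set. A small check of those three conditions might look like the following sketch; `is_partition` is a hypothetical helper name.

```python
def is_partition(subsets, universe):
    """True if subsets are non-empty, pairwise disjoint and cover universe."""
    universe = set(universe)
    if not universe:
        return False  # the input set must be non-empty
    seen = set()
    for subset in subsets:
        subset = set(subset)
        # Reject empty subsets and any overlap with earlier subsets.
        if not subset or subset & seen:
            return False
        seen |= subset
    # The union of the subsets must equal the input set.
    return seen == universe


if __name__ == "__main__":
    print(is_partition([{1, 2}, {3}], {1, 2, 3}))
    print(is_partition([{1, 2}, {2, 3}], {1, 2, 3}))
```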
class discovery data transformation
A class discovery data transformation (sometimes called unsupervised classification) is a data transformation that has objective class discovery.
James Malone
clustering data transformation
unsupervised classification data transformation
PERSON: James Malone
class discovery data transformation
center calculation objective
A mean calculation which has center calculation objective is a data transformation in which the center of the input data is discovered through the calculation of a mean average.
A center calculation objective is a data transformation objective where the aim is to calculate the center of an input data set.
James Malone
PERSON: James Malone
center calculation objective
class discovery objective
A class discovery objective (sometimes called unsupervised classification) is a data transformation objective where the aim is to organize input data (typically vectors of attributes) into classes, where the number of classes and their specifications are not known a priori. Depending on usage, the class assignment can be definite or probabilistic.
James Malone
clustering objective
discriminant analysis objective
unsupervised classification objective
PERSON: Elisabetta Manduchi
PERSON: James Malone
class discovery objective
center calculation data transformation
A center calculation data transformation is a data transformation that has objective of center calculation.
James Malone
PERSON: James Malone
center calculation data transformation
descriptive statistical calculation data transformation
A descriptive statistical calculation data transformation is a data transformation that has objective descriptive statistical calculation and which concerns any calculation intended to describe a feature of a data set, for example, its center or its variability.
James Malone
PERSON: James Malone
descriptive statistical calculation data transformation
error correction objective
Application of a multiple testing correction method
An error correction objective is a data transformation objective where the aim is to remove (correct for) erroneous contributions arising from the input data, or the transformation itself.
James Malone, Helen Parkinson
PERSON: James Malone
error correction objective
gene list visualization
A data visualization which has input of a gene list and produces an output of a report graph which is capable of rendering data of this type.
James Malone
gene list visualization
survival analysis data transformation
A data transformation which has the objective of performing survival analysis.
James Malone
PERSON: James Malone
survival analysis data transformation
chi square test
The chi-square test is a data transformation with the objective of statistical hypothesis testing, in which the sampling distribution of the test statistic is a chi-square distribution when the null hypothesis is true, or any in which this is asymptotically true, meaning that the sampling distribution (if the null hypothesis is true) can be made to approximate a chi-square distribution as closely as desired by making the sample size large enough.
negotiation with OBI hence definition and definition source are missing from this class
PERSON: James Malone
PERSON: Tina Boussard
chi square test
1
1
true
true
2
1
ANOVA
ANOVA or analysis of variance is a data transformation in which a statistical test of whether the means of several groups are all equal is performed.
AGB and PRS augmented the class with formal definitions as part of STATO extension
Alejandra Gonzalez-Beltran
James Malone
Philippe Rocca-Serra
Analysis of Variance
stat.anova()
ANOVA
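As an illustration of the test described in the ANOVA definition, the following minimal sketch computes the one-way F statistic from between-group and within-group sums of squares. The function name `one_way_anova_f` and the sample data are hypothetical and are not part of STATO; turning F into a p-value would additionally require the F distribution, which is not covered here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists of numeric observations, one list per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical data: three groups of three observations each.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom is evidence against the null hypothesis that all group means are equal.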
observation design
PMID: 12387964. Lancet. 2002 Oct 12;360(9340):1144-9. Deficiency of antibacterial peptides in patients with morbus Kostmann: an observation study.
observation design is a study design in which subjects are monitored in the absence of any active intervention by experimentalists.
Philippe Rocca-Serra
OBI branch derived
observation design
extraction
nucleic acid extraction using phenol chloroform
A material separation in which a desired component of an input material is separated from the remainder
Currently the output of material processing is defined as the molecular entity that is the main component of the output material entity, rather than the material entity that has grain molecular entity.
'nucleic acid extract' is the output of 'nucleic acid extraction' and has grain 'nucleic acid'. However, the output of 'nucleic acid extraction' is 'nucleic acid' rather than 'nucleic acid extract'. We are aware of this issue and will work it out in the future.
Person:Bjoern Peters
Philippe Rocca-Serra
extraction
group randomization
PMID: 18349405. Randomization reveals unexpected acute leukemias in Southwest Oncology Group prostate cancer trial. J Clin Oncol. 2008 Mar 20;26(9):1532-6.
A group assignment which relies on chance to assign materials to a group of materials in order to avoid bias in experimental set up.
Philippe Rocca-Serra
adapted from wikipedia [http://en.wikipedia.org/wiki/Randomization]
group randomization
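A minimal sketch of group randomization as defined above: units are shuffled by chance and then dealt into groups of near-equal size. The function name `randomize_groups`, the group names, and the subject identifiers are hypothetical; a fixed `seed` is used only to make the illustration reproducible.

```python
import random


def randomize_groups(units, group_names, seed=None):
    """Randomly assign each unit to one of the named groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # chance, not the experimenter, decides the order

    groups = {name: [] for name in group_names}
    # Deal the shuffled units round-robin so group sizes differ by at most one.
    for i, unit in enumerate(shuffled):
        groups[group_names[i % len(group_names)]].append(unit)
    return groups


subjects = [f"subject_{i}" for i in range(1, 9)]
assignment = randomize_groups(subjects, ["treatment", "control"], seed=42)
```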
nucleic acid hybridization
PMID: 18555787. Quantitative analysis of DNA hybridization in a flowthrough microarray for molecular testing. Anal Biochem. 2008 May 27.
a planned process by which totally or partially complementary, single-stranded nucleic acids are combined into a single molecule called heteroduplex or homoduplex to an extent depending on the amount of complementarity.
Philippe Rocca-Serra
adapted from wikipedia [http://en.wikipedia.org/wiki/Nucleic_acid_hybridization]
hybridization assay
nucleic acid hybridization
flow cell
Biofilm Flow Cell
Apparatus in the fluidic subsystem where the sheath and sample meet. Can be one of several types: jet-in-air, quartz cuvette, or a hybrid of the two. The sample flows through the center of a fluid column of sheath fluid in the flow cell.
Person:John Quinn
flow_cell
http://www.flocyte.com/FRTP/Resources/flow_cytometry_glossary.htm
flow cell
flow cytometer
FACS Calibur
A flow_cytometer is an instrument for counting, examining and sorting microscopic particles in suspension. It allows simultaneous multiparametric analysis of the physical and/or chemical characteristics of single cells flowing through an optical and/or electronic detection apparatus. A flow cytometer is an instrument that can be used to quantitatively measure the properties of individual cells in a flowing medium.
John Quinn
http://en.wikipedia.org/wiki/Flow_cytometer
flow cytometer
light source
A light source is an optical subsystem that provides light for use in a distant area using a delivery system (e.g., fiber optics). Light sources may include one of a variety of lamps (e.g., xenon, halogen, mercury). Most light sources are operated from line power, but some may be powered from batteries. They are mostly used in endoscopic, microscopic, and other examination and/or in surgical procedures. The light source is part of the optical subsystem. In a flow cytometer the light source directs high intensity light at particles at the interrogation point. The light source in a flow cytometer is usually a laser.
Elizabeth M. Goralczyk
John Quinn
Olga Tchuvatkina
Practical Flow Cytometry 4th Edition, Howard Shapiro, ISBN-10: 0471411256, ISBN-13: 978-0471411253
light source
obscuration bar
obscuration bar in a flow cytometer
An obscuration bar is an optical subsystem which is a strip of metal or other material that serves to block out direct light from the illuminating beam. The obscuration bar prevents the bright light scattered in the forward directions from burning out the collection device.
Daniel Schober
Flow Cytometry: First Principles, by Alice Longobardi Givan, ISBN-10: 0471382248, ISBN-13: 978-0471382249
John Quinn
obscuration bar
optical filter
720 LP filter, 580/30 BP filter
An optical filter is an optical subsystem that selectively transmits light having certain properties (often, a particular range of wavelengths, that is, range of colours of light), while blocking the remainder. They are commonly used in photography, in many optical instruments, and to colour stage lighting. Optical filters can be arranged to segregate and collect light by wavelength.
John Quinn
http://en.wikipedia.org/wiki/Optical_filter
optical filter
photodetector
A photomultiplier tube, a photo diode
A photodetector is a device used to detect and measure the intensity of radiant energy through photoelectric action. In a cytometer, photodetectors measure either the number of photons of laser light scattered on impact with a cell (for example), or the fluorescence emitted by excitation of a fluorescent dye.
John Quinn
http://einstein.stanford.edu/content/glossary/glossary.html
photodetector
DNA sequencer
ABI 377 DNA Sequencer, ABI 310 DNA Sequencer
A DNA sequencer is an instrument that determines the order of deoxynucleotides in deoxyribonucleic acid sequences.
Trish Whetzel
MO
DNA sequencer
hybridization chamber
Glass Array Hybridization Cassette
A device which is used to maintain constant contact of a liquid on an array. This can be either a glass vial or slide.
Trish Whetzel
MO_563 hybridization_chamber
hybridization chamber
cytometer
A cytometer is an instrument for counting and measuring cells.
Melanie Courtot
http://medical.merriam-webster.com/medical/cytometer
cytometer
microarray
An Affymetrix U133 array is a microarray. Microarrays include 1 and 2-color arrays, custom and commercial arrays (e.g., Affymetrix, Agilent, Nimblegen, Illumina, etc.) for expression profiling, DNA variant detection, protein binding, and other genomic and functional genomic assays.
A processed material that is made to be used in an analyte assay. It consists of a physical immobilisation matrix in which substances that bind the analyte are placed in regular spatial position.
Daniel Schober
PERSON: Chris Stoeckert
microarray
DNA microarray
Moran G, Stokes C, Thewes S, Hube B, Coleman DC, Sullivan D (2004). "Comparative genomics using Candida albicans DNA microarrays reveals absence and divergence of virulence-associated genes in Candida dubliniensis". Microbiology 150: 3363-3382. doi:10.1099/mic.0.27221-0. PMID 15470115
A DNA-microarray is a microarray that is used as a physical 2D immobilisation matrix for DNA sequences. DNA microarray-bound DNA fragments are used as targets for a hybridising probed sample.
PERSON: Daniel Schober
PERSON: Frank Gibson
DNA Chip
DNA-array
Web:<http://en.wikipedia.org/wiki/DNA_microarray>@2008/03/03
DNA microarray
droplet sorter
A droplet sorter is part_of a flow cytometer sorter that converts the carrier fluid stream into individual droplets, and these droplets are directed into separate locations for recovery (enriching the original sample for particles of interest based on qualities determined by gating) or disposal.
OBI Instrument branch
OBI Instrument branch
droplet sorter
study design
a matched pairs study design describes criteria by which subjects are identified as pairs which then undergo the same protocols, and the data generated is analyzed by comparing the differences between the paired subjects, which constitute the results of the executed study design.
A plan specification comprised of protocols (which may specify how and what kinds of data will be gathered) that are executed as part of an investigation and is realized during a study design execution.
Editor note: there is at least an implicit restriction on the kind of data transformations that can be done based on the measured data available.
PERSON: Chris Stoeckert
experimental design
rediscussed at length (MC/JF/BP, 12/9/08). The definition was clarified to differentiate it from protocol.
study design
This statement can actually be inferred from 'plan specification', because 'independent variable specification' is a subclass of 'is part of' some 'plan specification'
repeated measure design
PMID: 10959922. J Biopharm Stat. 2000 Aug;10(3):433-45. Equivalence in test assay method comparisons for the repeated-measure, matched-pair design in medical device studies: statistical considerations.
a study design which uses the same individuals and exposes them to a set of conditions. The effects of order and practice can be confounding factors in such designs.
PlanAndPlannedProcess Branch
http://www.holah.karoo.net/experimentaldesigns.htm
repeated measure design
cross over design
PMID: 17601993-Objective: HIV-infected patients with lipodystrophy (HIV-lipodystrophy) are insulin resistant and have elevated plasma free fatty acid (FFA) concentrations. We aimed to explore the mechanisms underlying FFA-induced insulin resistance in patients with HIV-lipodystrophy. Research Design and Methods: Using a randomized placebo-controlled cross-over design, we studied the effects of an overnight acipimox-induced suppression of FFA on glucose and FFA metabolism by using stable isotope labelled tracer techniques during basal conditions and a two-stage euglycemic, hyperinsulinemic clamp (20 mU insulin/m(2)/min; 50 mU insulin/m(2)/min) in nine patients with nondiabetic HIV-lipodystrophy. All patients received antiretroviral therapy. Biopsies from the vastus lateralis muscle were obtained during each stage of the clamp. Results: Acipimox treatment reduced basal FFA rate of appearance by 68.9% (52.6%-79.5%) and decreased plasma FFA concentration by 51.6 % (42.0%-58.9%), (both, P < 0.0001). Endogenous glucose production was not influenced by acipimox. During the clamp the increase in glucose-uptake was significantly greater after acipimox treatment compared to placebo (acipimox: 26.85 (18.09-39.86) vs placebo: 20.30 (13.67-30.13) mumol/kg/min; P < 0.01). Insulin increased phosphorylation of Akt (Thr(308)) and GSK-3beta (Ser(9)), decreased phosphorylation of glycogen synthase (GS) site 3a+b and increased GS-activity (I-form) in skeletal muscle (P < 0.01). Acipimox decreased phosphorylation of GS (site 3a+b) (P < 0.02) and increased GS-activity (P < 0.01) in muscle. Conclusion: The present study provides direct evidence that suppression of lipolysis in patients with HIV-lipodystrophy improves insulin-stimulated peripheral glucose-uptake. The increased glucose-uptake may in part be explained by increased dephosphorylation of GS (site 3a+b) resulting in increased GS activity.
a repeated measure design which ensures that experimental units receive, in sequence, the treatment (or the control), and then, after a specified time interval (aka *wash-out period*), switch to the control (or treatment). In this design, subjects (patients in human context) serve as their own controls, and randomization may be used to determine the order in which a subject receives the treatment and control.
Philippe Rocca-Serra
(source: http://www.sbu.se/Filer/Content0/publikationer/1/literaturesearching_1993/glossary.html)
cross over design
matched pairs design
PMID: 17288613-ABSTRACT: BACKGROUND: Physicians in Canadian emergency departments (EDs) annually treat 185,000 alert and stable trauma victims who are at risk for cervical spine (C-spine) injury. However, only 0.9% of these patients have suffered a cervical spine fracture. Current use of radiography is not efficient. The Canadian C-Spine Rule is designed to allow physicians to be more selective and accurate in ordering C-spine radiography, and to rapidly clear the C-spine without the need for radiography in many patients. The goal of this phase III study is to evaluate the effectiveness of an active strategy to implement the Canadian C-Spine Rule into physician practice. Specific objectives are to: 1) determine clinical impact, 2) determine sustainability, 3) evaluate performance, and 4) conduct an economic evaluation. METHODS: We propose a matched-pair cluster design study that compares outcomes during three consecutive 12-months before, after, and decay periods at six pairs of intervention and control sites. These 12 hospital ED sites will be stratified as teaching or community hospitals, matched according to baseline C-spine radiography ordering rates, and then allocated within each pair to either intervention or control groups. During the after period at the intervention sites, simple and inexpensive strategies will be employed to actively implement the Canadian C-Spine Rule. The following outcomes will be assessed: 1) measures of clinical impact, 2) performance of the Canadian C-Spine Rule, and 3) economic measures. During the 12-month decay period, implementation strategies will continue, allowing us to evaluate the sustainability of the effect. We estimate a sample size of 4,800 patients in each period in order to have adequate power to evaluate the main outcomes. DISCUSSION: Phase I successfully derived the Canadian C-Spine Rule and phase II confirmed the accuracy and safety of the rule, hence, the potential for physicians to improve care.
What remains unknown is the actual change in clinical behaviors that can be affected by implementation of the Canadian C-Spine Rule, and whether implementation can be achieved with simple and inexpensive measures. We believe that the Canadian C-Spine Rule has the potential to significantly reduce health care costs and improve the efficiency of patient flow in busy Canadian EDs.
A matched pair design is a study design which uses groups of individuals associated (hence matched) to each other based on a set of criteria, one member going to one treatment, the other member receiving the other treatment.
Philippe Rocca-Serra
http://www.holah.karoo.net/experimentaldesigns.htm
matched pairs design
parallel group design
PMID: 17408389-Purpose: Proliferative vitreoretinopathy (PVR) is the most important reason for blindness following retinal detachment. Presently, vitreous tamponades such as gas or silicone oil cannot contact the lower part of the retina. A heavier-than-water tamponade displaces the inflammatory and PVR-stimulating environment from the inferior area of the retina. The Heavy Silicone Oil versus Standard Silicone Oil Study (HSO Study) is designed to answer the question of whether a heavier-than-water tamponade improves the prognosis of eyes with PVR of the lower retina. Methods: The HSO Study is a multicentre, randomized, prospective controlled clinical trial comparing two endotamponades within a two-arm parallel group design. Patients with inferiorly and posteriorly located PVR are randomized to either heavy silicone oil or standard silicone oil as a tamponading agent. Three hundred and fifty consecutive patients are recruited per group. After intraoperative re-attachment, patients are randomized to either standard silicone oil (1000 cSt or 5000 cSt) or Densiron((R)) as a tamponading agent. The main endpoint criteria are complete retinal attachment at 12 months and change of visual acuity (VA) 12 months postoperatively compared with the preoperative VA. Secondary endpoints include complete retinal attachment before endotamponade removal, quality of life analysis and the number of retina affecting re-operation within 1 year of follow-up. Results: The design and early recruitment phase of the study are described. Conclusions: The results of this study will uncover whether or not heavy silicone oil improves the prognosis of eyes with PVR.
A parallel group design or independent measure design is a study design which uses unique experimental units in each experimental group; in other words, no individuals are shared between experimental groups, which therefore run in parallel. Subjects of a treatment group receive a unique combination of independent variable values making up a treatment.
Philippe Rocca-Serra
independent measure design
http://www.holah.karoo.net/experimentaldesigns.htm
parallel group design
randomized complete block design
A researcher is carrying out a study of the effectiveness of four different skin creams for the treatment of a certain skin disease. He has eighty subjects and plans to divide them into 4 treatment groups of twenty subjects each. Using a randomised blocks design, the subjects are assessed and put in blocks of four according to how severe their skin condition is; the four most severe cases are the first block, the next four most severe cases are the second block, and so on to the twentieth block. The four members of each block are then randomly assigned, one to each of the four treatment groups. (Source: http://www.stats.gla.ac.uk/steps/glossary/anova.html#rbd)
A randomized complete block design is_a study design which randomly assigns treatments within blocks. The number of units per block equals the number of treatments, so each block receives each treatment exactly once (hence the qualifier 'complete'). The design was originally devised for field trials in agronomy and agriculture, and was later used in other settings. The analysis assumes that there is no interaction between block and treatment. In a randomised complete block design, the subjects are matched according to a variable which the experimenter wishes to control. The subjects are put into groups (blocks) of the same size as the number of treatments. The members of each block are then randomly assigned to different treatment groups.
Philippe Rocca-Serra
http://www.tufts.edu/~gdallal/ranblock.htm
randomized complete block design
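The skin cream example for this class can be sketched as follows: eighty subjects, assumed to be already ranked by severity, are cut into twenty blocks of four, and the four treatments are shuffled independently within each block. The function name `randomized_complete_block`, the subject identifiers, and the cream names are hypothetical.

```python
import random


def randomized_complete_block(units_by_block, treatments, seed=None):
    """Assign every treatment exactly once within each block, in random order."""
    rng = random.Random(seed)
    plan = {}
    for block, units in units_by_block.items():
        if len(units) != len(treatments):
            raise ValueError("a complete block needs one unit per treatment")
        order = list(treatments)
        rng.shuffle(order)  # independent randomization inside each block
        plan[block] = dict(zip(units, order))
    return plan


# Assumption: subjects are listed from most to least severe.
subjects = [f"subject_{i:02d}" for i in range(1, 81)]
blocks = {b: subjects[4 * b: 4 * b + 4] for b in range(20)}
creams = ["cream_A", "cream_B", "cream_C", "cream_D"]
rcbd_plan = randomized_complete_block(blocks, creams, seed=1)
```

Because each block contains all four creams, differences in severity between blocks do not bias the comparison between creams.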
2
latin square design
PMID: 17582121-Our objective was to examine the effects of dietary cation-anion difference (DCAD) with different concentrations of dietary crude protein (CP) on performance and acid-base status in early lactation cows. Six lactating Holstein cows averaging 44 d in milk were used in a 6 x 6 Latin square design with a 2 x 3 factorial arrangement of treatments: DCAD of -3, 22, or 47 milliequivalents (Na + K - Cl - S)/100 g of dry matter (DM), and 16 or 19% CP on a DM basis. Linear increases with DCAD occurred in DM intake, milk fat percentage, 4% fat-corrected milk production, milk true protein, milk lactose, and milk solids-not-fat. Milk production itself was unaffected by DCAD. Jugular venous blood pH, base excess and HCO3(-) concentration, and urine pH increased, but jugular venous blood Cl- concentration, urine titratable acidity, and net acid excretion decreased linearly with increasing DCAD. An elevated ratio of coccygeal venous plasma essential AA to nonessential AA with increasing DCAD indicated that N metabolism in the rumen was affected, probably resulting in more microbial protein flowing to the small intestine. Cows fed 16% CP had lower urea N in milk than cows fed 19% CP; the same was true for urea N in coccygeal venous plasma and urine. Dry matter intake, milk production, milk composition, and acid-base status did not differ between the 16 and 19% CP treatments. It was concluded that DCAD affected DM intake and performance of dairy cows in early lactation. Feeding 16% dietary CP to cows in early lactation, compared with 19% CP, maintained lactation performance while reducing urea N excretion in milk and urine.
Latin square design is_a study design which allows, in its simpler form, controlling 2 nuisance variables (also known as blocking variables). The 2 nuisance factors are divided into a tabular grid with the property that each row and each column receive each treatment exactly once.
Philippe Rocca-Serra
Adapted from: http://www.itl.nist.gov/div898/handbook/pri/section3/pri3321.htm and
latin square design
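The row and column property in the Latin square definition can be illustrated with a minimal sketch: start from a cyclic square and randomly permute its rows and columns, which preserves the property. The function name `latin_square` is hypothetical, and this construction is an assumption made for illustration; it does not sample uniformly from all possible Latin squares.

```python
import random


def latin_square(treatments, seed=None):
    """Return an n x n grid in which each treatment occurs once per row and column."""
    n = len(treatments)
    rng = random.Random(seed)

    # Cyclic square: row r is the treatment list shifted by r positions.
    square = [[treatments[(r + c) % n] for c in range(n)] for r in range(n)]

    # Permuting whole rows and whole columns keeps the Latin property.
    rng.shuffle(square)
    col_order = list(range(n))
    rng.shuffle(col_order)
    return [[row[c] for c in col_order] for row in square]


# Rows and columns stand for the two blocking variables.
design = latin_square(["A", "B", "C", "D"], seed=7)
```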
3
graeco latin square design
PMID: 6846242-Beaton et al (Am J Clin Nutr 1979;32:2546-59) reported on the partitioning of variance in 1-day dietary data for the intake of energy, protein, total carbohydrate, total fat, classes of fatty acids, cholesterol, and alcohol. Using the same food intake data and the expanded National Heart, Lung and Blood Institute food composition data base, these analyses of sources of variance have been expanded to include classes of carbohydrate, vitamin A, vitamin C, thiamin, riboflavin, niacin, calcium, iron, total ash, caffeine, and crude fiber. The analyses relate to observed intakes (replicated six times) of 30 adult males and 30 adult females obtained under a paired Graeco-Latin square design with sequence of interview, interviewer, and day of the week as determinants. Neither sequence nor interviewer made consistent contribution to variance. In females, day of the week had a significant effect for several nutrients. The major partitioning of variance was between interindividual variation (between subjects) and intraindividual variation (within subjects) which included both true day-to-day variation in intake and methodological variation. For all except caffeine, the intraindividual variability of 1-day data was larger than the interindividual variability. For vitamin A, almost all of the variance was associated with day-to-day variability. One day data provide a very inadequate estimate of usual intake of individuals. In the design of nutrition studies it is critical that the intended use of dietary data be a major consideration in deciding on methodology. There is no ideal dietary method. There may be preferred methods for particular purposes.
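Graeco-Latin square design is a study design related to the Latin square design: it superimposes two orthogonal Latin squares, which allows 3 nuisance variables (blocking variables) to be controlled.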
Philippe Rocca-Serra
Adapted from: http://www.itl.nist.gov/div898/handbook/pri/section3/pri3321.htm and
only 2 articles in pubmed ->probably irrelevant
Euler square design
orthogonal latin squares design
graeco latin square design
4
hyper graeco latin square design
PRS to do
Philippe Rocca-Serra
Adapted from: http://www.itl.nist.gov/div898/handbook/pri/section3/pri3321.htm and
no example found in pubmed->not in use in the community
hyper graeco latin square design
1
2
factorial design
PMID: 17582121-Our objective was to examine the effects of dietary cation-anion difference (DCAD) with different concentrations of dietary crude protein (CP) on performance and acid-base status in early lactation cows. Six lactating Holstein cows averaging 44 d in milk were used in a 6 x 6 Latin square design with a 2 x 3 factorial arrangement of treatments: DCAD of -3, 22, or 47 milliequivalents (Na + K - Cl - S)/100 g of dry matter (DM), and 16 or 19% CP on a DM basis. Linear increases with DCAD occurred in DM intake, milk fat percentage, 4% fat-corrected milk production, milk true protein, milk lactose, and milk solids-not-fat. Milk production itself was unaffected by DCAD. Jugular venous blood pH, base excess and HCO3(-) concentration, and urine pH increased, but jugular venous blood Cl- concentration, urine titratable acidity, and net acid excretion decreased linearly with increasing DCAD. An elevated ratio of coccygeal venous plasma essential AA to nonessential AA with increasing DCAD indicated that N metabolism in the rumen was affected, probably resulting in more microbial protein flowing to the small intestine. Cows fed 16% CP had lower urea N in milk than cows fed 19% CP; the same was true for urea N in coccygeal venous plasma and urine. Dry matter intake, milk production, milk composition, and acid-base status did not differ between the 16 and 19% CP treatments. It was concluded that DCAD affected DM intake and performance of dairy cows in early lactation. Feeding 16% dietary CP to cows in early lactation, compared with 19% CP, maintained lactation performance while reducing urea N excretion in milk and urine.
factorial design is_a study design which is used to evaluate two or more factors simultaneously. The treatments are combinations of levels of the factors. The advantage of factorial designs over one-factor-at-a-time experiments is that they are more efficient and they allow interactions to be detected. In statistics, a factorial design experiment is an experiment whose design consists of two or more factors, each with discrete possible values or levels, and whose experimental units take on all possible combinations of these levels across all such factors. Such an experiment allows studying the effect of each factor on the response variable, as well as the effects of interactions between factors on the response variable.
Philippe Rocca-Serra
http://www.stats.gla.ac.uk/steps/glossary/anova.html#facdes and from wikipedia (01/03/2007): http://en.wikipedia.org/wiki/Factorial_experiment
factorial design
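A minimal sketch of how the treatments of a full factorial design are enumerated: every combination of factor levels is one treatment. The levels below are taken from the 2 x 3 arrangement in the example abstract (DCAD of -3, 22, or 47 and CP of 16 or 19%); the function name `full_factorial` and the key `CP_percent` are hypothetical.

```python
from itertools import product


def full_factorial(factors):
    """Return one dict per treatment, covering all combinations of factor levels."""
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*level_lists)]


# 3 levels x 2 levels gives 6 treatment combinations.
runs = full_factorial({"DCAD": [-3, 22, 47], "CP_percent": [16, 19]})
```

The number of treatments is the product of the numbers of levels, which is why fractional designs are used when many factors are involved.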
2
2x2 factorial design
PMID: 17561240-The present experiment evaluates the effects of intermittent exposure to a social stimulus on ethanol and water drinking in rats. Four groups of rats were arranged in a 2x2 factorial design with 2 levels of Social procedure (Intermittent Social vs Continuous Social) and 2 levels of sipper Liquid (Ethanol vs Water). Intermittent Social groups received 35 trials per session. Each trial consisted of the insertion of the sipper tube for 10 s followed by lifting of the guillotine door for 15 s. The guillotine door separated the experimental rat from the conspecific rat in the wire mesh cage during the 60 s inter-trial interval. The Continuous Social groups received similar procedures except that the guillotine door was raised during the entire duration of the session. For the Ethanol groups, the concentrations of ethanol in the sipper [3, 4, 6, 8, 10, 12, 14, and 16% (vol/vol)] increased across sessions, while the Water groups received 0% ethanol (water) in the sipper throughout the experiment. Both Social procedures induced more intake of ethanol than water. The Intermittent Social procedure induced more ethanol intake at the two highest ethanol concentration blocks (10-12% and 14-16%) than the Continuous Social procedure, but this effect was not observed with water. Effects of social stimulation on ethanol drinking are discussed.
a factorial design which has 2 experimental factors (aka independent variables) and 2 factor levels per experimental factor
Philippe Rocca-Serra
PMID: 17561240
2x2 factorial design
fractional factorial design
A fractional factorial design is_a study design in which only an adequately chosen fraction of the treatment combinations required for the complete factorial experiment is selected to be run
Philippe Rocca-Serra
http://www.itl.nist.gov/div898/handbook/pri/section3/pri334.htm From ASQC (1983) Glossary & Tables for Statistical Quality Control
fractional factorial design
dye swap design
PMID: 17411393-Dye-specific bias effects, commonly observed in the two-color microarray platform, are normally corrected using the dye swap design. This design, however, is relatively expensive and labor-intensive. We propose a self-self hybridization design as an alternative to the dye swap design. In this design, the treated and control samples are labeled with Cy5 and Cy3 (or Cy3 and Cy5), respectively, without dye swap, along with a set of self-self hybridizations on the control sample. We compare this design with the dye swap design through investigation of mouse primary hepatocytes treated with three peroxisome proliferator-activated receptor-alpha (PPARalpha) agonists at three dose levels. Using Agilent's Whole Mouse Genome microarray, differentially expressed genes (DEG) were determined for both the self-self hybridization and dye swap designs. The DEG concordance between the two designs was over 80% across each dose treatment and chemical. Furthermore, 90% of DEG-associated biological pathways were in common between the designs, indicating that biological interpretations would be consistent. The reduced labor and expense for the self-self hybridization design make it an efficient substitute for the dye swap design. For example, in larger toxicogenomic studies, only about half the chips are required for the self-self hybridization design compared to that needed in the dye swap design.
An experiment design type where the label orientations are reversed. exact synonym: flip dye, dye flip
Philippe Rocca-Serra on behalf of MO
MO_858
dye swap design
time series design
PMID: 14744830-Microarrays are powerful tools for surveying the expression levels of many thousands of genes simultaneously. They belong to the new genomics technologies which have important applications in the biological, agricultural and pharmaceutical sciences. There are myriad sources of uncertainty in microarray experiments, and rigorous experimental design is essential for fully realizing the potential of these valuable resources. Two questions frequently asked by biologists on the brink of conducting cDNA or two-colour, spotted microarray experiments are 'Which mRNA samples should be competitively hybridized together on the same slide?' and 'How many times should each slide be replicated?' Early experience has shown that whilst the field of classical experimental design has much to offer this emerging multi-disciplinary area, new approaches which accommodate features specific to the microarray context are needed. In this paper, we propose optimal designs for factorial and time course experiments, which are special designs arising quite frequently in microarray experimentation. Our criterion for optimality is statistical efficiency based on a new notion of admissible designs; our approach enables efficient designs to be selected subject to the information available on the effects of most interest to biologists, the number of arrays available for the experiment, and other resource or practical constraints, including limitations on the amount of mRNA probe. We show that our designs are superior to both the popular reference designs, which are highly inefficient, and to designs incorporating all possible direct pairwise comparisons. Moreover, our proposed designs represent a substantial practical improvement over classical experimental designs which work in terms of standard interactions and main effects. 
The latter do not provide a basis for meaningful inference on the effects of most interest to biologists, nor make the most efficient use of valuable and limited resources.
Groups of assays that are related as part of a time series.
PRS-AGB adding formal restriction on independent variable specification about time (March 2013) and making time series design class a defined class.
Philippe Rocca-Serra on behalf of MO
MO_887
time series design
collecting specimen from organism
taking a sputum sample from a cancer patient, taking the spleen from a killed mouse, collecting a urine sample from a patient
a process with the objective to obtain a material entity that was part of an organism for potential future use in an investigation
PERSON:Bjoern Peters
IEDB
collecting specimen from organism
material component separation
Using a cell sorter to separate a mixture of T cells into two fractions; one with surface receptor CD8 and the other lacking the receptor, or purification
a material processing in which components of an input material become segregated in space
Bjoern Peters
IEDB
material component separation
group assignment
Assigning 'to be treated with active ingredient role' to an organism during group assignment. The group is those organisms that have the same role in the context of an investigation.
group assignment is a process which has an organism as specified input and during which a role is assigned
Philippe Rocca-Serra
cohort assignment
study assignment
OBI Plan
group assignment
maintaining cell culture
When harvesting blood from a human, isolating T cells, and then limited dilution cloning of the cells, the maintaining_cell_culture step comprises all steps after the initial dilution and plating of the cells into culture, e.g. placing the culture into an incubator, changing or adding media, and splitting a cell culture
a protocol application in which cells are kept alive in a defined environment outside of an organism. part of cell_culturing
PlanAndPlannedProcess Branch
OBI branch derived
maintaining cell culture
'establishing cell culture'
a process through which a new type of cell culture or cell line is created, either through the isolation and culture of one or more cells from a fresh source, or the deliberate experimental modification of an existing cell culture (e.g. passaging a primary culture to become a secondary culture or line, or the immortalization or stable genetic modification of an existing culture or line).
PERSON:Matthew Brush
PERSON:Matthew Brush
A 'cell culture' as used here refers to a new lineage of cells in culture deriving from a single biological source. New cultures are established through the initial isolation and culturing of cells from an organismal source, or through changes in an existing cell culture or line that result in a new culture with unique characteristics. This can occur through the passaging/selection of a primary culture into a secondary culture or line, or experimental modifications of an existing cell culture or line such as an immortalization process or other stable genetic modification. This class covers establishment of cultures of either multicellular organism cells or unicellular organisms.
establishing cell culture
addition of molecular label
The addition of phycoerythrin label to an anti-CD8 antibody, to label all antibodies. The addition of anti-CD8-PE to a population of cells, to label the subpopulation of cells that are CD8+.
a material processing technique intended to add a molecular label to some input material entity, to allow detection of the molecular target of this label in a detection of molecular label assay
PERSON:Matthew Brush
labeling
OBI developer call, 3-12-12
addition of molecular label
sequencing assay
The use of the Sanger method of DNA sequencing to determine the order of the nucleotides in a DNA template
the use of a chemical or biochemical means to infer the sequence of a biomaterial
has_output should be sequence of input; we don't have sequence well defined yet
PlanAndPlannedProcess Branch
OBI branch derived
sequencing assay
recombinant vector cloning
a planned process with the objective to insert genetic material into a cloning vector for future replication of the inserted material
pa_branch (Alan, Randi, Kevin, Jay, Bjoern)
molecular cloning
OBI branch derived
recombinant vector cloning
RNA extraction
An RNA extraction is a nucleic acid extraction where the desired output material is RNA
PlanAndPlannedProcess Branch
OBI branch derived
requested by Helen Parkinson for MO
RNA extraction
nucleic acid extraction
Phenol/chloroform extraction: dissolution of the protein content, followed by ethanol precipitation of the nucleic acid fraction overnight in the fridge, followed by centrifugation to obtain a nucleic acid pellet.
a material separation to recover the nucleic acid fraction of an input material
PlanAndPlannedProcess Branch
OBI branch derived
requested by Helen Parkinson for MO. Could be defined class
nucleic acid extraction
phage display library
PMID: 15905471.Nucleic Acids Res. 2005 May 19;33(9):e81.Oligonucleotide-assisted cleavage and ligation: a novel directional DNA cloning technology to capture cDNAs. Application in the construction of a human immune antibody phage-display library. [Phage display library encoding fragments of human antibodies. m-rna library encoding for 9-mer peptides]
a phage display library is a collection of materials in which a mixture of genes or gene fragments is expressed and can be individually selected and amplified.
PERSON: Bjoern Peters
PERSON: Philippe Rocca-Serra
display library
WEB: http://www.immuneepitope.org/home.do
PRS: 22022008. class moved under population,
modification of definition and replacement of biomaterials in previous definition with 'material'
addition of has_role restriction
phage display library
material to be added
A mixture of peptides that is being added into a cell culture.
a material that is added to another one in a material combination process
10/26/09: This defined class is used as a 'macro expression' to reduce the size of the IEDB export
2010/02/24 Alan Ruttenberg: I think this might generate confusion as the common use of the term would consider something to be a specimen during the realization of the role, not only if it bears it. However having this class as a probe, or for display, or as a macro might be useful. Ideally we would mark or segregate such classes
IEDB
material to be added
target of material addition
A cell culture into which a mixture of peptides is being added.
A material entity into which another is being added in a material combination process
10/26/09: This defined class is used as a 'macro' to reduce the size of the IEDB export.
IEDB
target of material addition
phenotype
A (combination of) quality(ies) of an organism determined by the interaction of its genetic make-up and environment that differentiates specific instances of a species from other instances of the same species.
phenotype
fluorescence
A luminous flux quality inhering in a bearer by virtue of the bearer's emitting longer wavelength light following the absorption of shorter wavelength radiation; fluorescence is common with aromatic compounds with several rings joined together.
fluorescence
mass
A physical quality that inheres in a bearer by virtue of the proportion of the bearer's amount of matter.
mass
protein
antithrombin III is a protein
An amino acid chain that is produced de novo by ribosome-mediated translation of a genetically-encoded mRNA.
protein
molecular label role
a reagent role inhering in a molecular entity intended to associate with some molecular target to serve as a proxy for the presence, abundance, or location of this target in a detection of molecular label assay.
MHB (9-29-13): 'molecular label role' imported from the Reagent Ontology and replaced OBI:OBI_0000140 (label role)
molecular tracer role
OBI developer call, 3-12-12
molecular label role
molecular label
a molecular reagent intended to associate with some molecular target to serve as a proxy for the presence, abundance, or location of this target in a detection of molecular label assay
molecular tracer
OBI developer call, 3-12-12
molecular label
region
A sequence_feature with an extent greater than zero. A nucleotide region is composed of bases and a polypeptide region is composed of amino acids.
primary structure of sequence macromolecule
sequence
region
digital images may be stored as electronic file in TIFF format on mass memory storage devices
an electronic file is an information content entity which conforms to a specification or format and which is meant to hold data and information in digital form, accessible to software agents
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
digital file
a balanced design is an experimental design where all experimental groups have an equal number of subject observations
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
balanced design
1
a single factor design is a study design which declares exactly 1 independent variable
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
single factor design
x-axis is a cartesian coordinate axis which is orthogonal to the y-axis and the z-axis
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
x-axis
an axis is a line graph used as reference line for the measurement of coordinates.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://www.oxforddictionaries.com/definition/english/axis
axis
y-axis is a cartesian coordinate axis which is orthogonal to the x-axis and the z-axis
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
y-axis
A Cartesian coordinate system is a coordinate system that specifies each point uniquely in a plane by a pair of numerical coordinates, which are the signed distances from the point to two fixed perpendicular directed lines, measured in the same unit of length.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://en.wikipedia.org/wiki/Cartesian_coordinate_system
cartesian coordinate system
In geometry, a coordinate system is a system which uses one or more numbers, or coordinates, to uniquely determine the position of a point or other geometric element on a manifold such as Euclidean space.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://en.wikipedia.org/wiki/Coordinate_system
coordinate system
a cartesian axis is one of the 3 axes in a cartesian coordinate system defining a referential in 3 dimensions. Each of the axes is orthogonal to the other 2
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
rectangular coordinate axis
adapted from Wolfram Alpha:
https://www.wolframalpha.com/input/?i=cartesian+coordinates&lk=4&num=6&lk=4&num=6
cartesian coordinate axis
z-axis is a cartesian coordinate axis which is orthogonal to the x-axis and the y-axis
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
z-axis
a 2 dimensional cartesian coordinate system is a cartesian coordinate system which defines 2 orthogonal one dimensional axes and which may be used to describe a 2 dimensional spatial region.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
two dimensional cartesian coordinate system
In mathematics, a spherical coordinate system is a coordinate system for three-dimensional space where the position of a point is specified by three numbers: the radial distance of that point from a fixed origin, its polar angle measured from a fixed zenith direction, and the azimuth angle of its orthogonal projection on a reference plane that passes through the origin and is orthogonal to the zenith, measured from a fixed reference direction on that plane.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
https://en.wikipedia.org/wiki/Spherical_coordinate_system
spherical coordinate system
A cylindrical coordinate system is a three-dimensional coordinate system that specifies point positions by the distance from a chosen reference axis, the direction from the axis relative to a chosen reference direction, and the distance from a chosen reference plane perpendicular to the axis. The latter distance is given as a positive or negative number depending on which side of the reference plane faces the point.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
https://en.wikipedia.org/wiki/Cylindrical_coordinate_system
cylindrical coordinate system
In mathematics, the polar coordinate system is a two-dimensional coordinate system in which each point on a plane is determined by a distance from a fixed point and an angle from a fixed direction.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://en.wikipedia.org/wiki/Polar_coordinate_system
polar coordinate system
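As an illustration of the polar coordinate system definition above, the following minimal sketch converts between polar and two dimensional cartesian coordinates. The function names are hypothetical, and it assumes the angle is expressed in radians and measured counter-clockwise from the positive x-axis.

```python
import math


def polar_to_cartesian(r, theta):
    """Convert a distance r and an angle theta (radians) to (x, y)."""
    return r * math.cos(theta), r * math.sin(theta)


def cartesian_to_polar(x, y):
    """Convert (x, y) to a distance from the origin and an angle in radians."""
    return math.hypot(x, y), math.atan2(y, x)


if __name__ == "__main__":
    r, theta = cartesian_to_polar(3.0, 4.0)
    print(r, theta)
```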
Wilks' lambda distribution (named for Samuel S. Wilks), is a probability distribution used in multivariate hypothesis testing, especially with regard to the likelihood-ratio test and Multivariate analysis of variance. It is a multivariate generalization of the univariate F-distribution, and generalizes the F-distribution in the same way that the Hotelling's T-squared distribution generalizes Student's t-distribution.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
wikipedia:
last accessed: 2013-09-11
http://en.wikipedia.org/wiki/Wilks%27_lambda_distribution
Wilks' lambda distribution
A cartesian spatial coordinate datum chosen as a fixed point of reference in a three dimensional spatial region.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
three dimensional cartesian spatial coordinate origin
normal distribution hypothesis is a goodness of fit hypothesis stating that the distribution computed from the sample population fits a normal distribution.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
normal distribution hypothesis
A cartesian spatial coordinate datum chosen as a fixed point of reference in a two dimensional spatial region.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
two dimensional cartesian spatial coordinate origin
90
a confidence interval which covers 90% of the sampling distribution, meaning that there is a 10% risk of false positive (type I error)
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
90% confidence interval
A one dimensional cartesian coordinate system is a cartesian coordinate system which defines a one dimensional axis and which may be used to describe a one dimensional spatial region, i.e. a straight line. It is defined by a point O, the origin, a unit of length and the orientation for the one dimensional space.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
one dimensional cartesian coordinate system
http://www.stat.duke.edu/courses/Spring98/sta110c/qtable.html
The studentized range (q) distribution is a probability distribution used by the Tukey Honestly Significant Difference test.
The distribution of the statistic
[x̄(k)- x̄(1)]/(s/√n)
where random samples of size n have been taken from k independent and identically distributed normal populations, with x̄(1) and x̄(k) being, respectively, the smallest and largest of the k sample means, and s² being the pooled estimate of the common variance. This statistic is particularly used in multiple comparison tests.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
q distribution
A Dictionary of Statistics (2 rev ed.), OUP. ISBN-13: 9780199541454
http://www.oxfordreference.com/view/10.1093/acref/9780199541454.001.0001/acref-9780199541454-e-1588
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/Tukey.html
studentized range distribution
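The statistic described in the studentized range distribution definition can be sketched as follows. This is a minimal illustration, not a reference implementation: the function name is hypothetical, and it assumes k groups of equal size n, so that the pooled variance is simply the average of the per-group sample variances.

```python
import math
import statistics


def studentized_range_statistic(groups):
    """Return q = (largest mean - smallest mean) / (s / sqrt(n)).

    Assumes every group has the same size n (an assumption of this sketch).
    """
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("all groups must have the same size n")
    means = [statistics.mean(g) for g in groups]
    # With equal group sizes, the pooled variance is the mean of the variances.
    pooled_variance = statistics.mean([statistics.variance(g) for g in groups])
    return (max(means) - min(means)) / math.sqrt(pooled_variance / n)


if __name__ == "__main__":
    print(studentized_range_statistic([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]))
```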
a three dimensional cartesian coordinate system is a cartesian coordinate system which defines 3 orthogonal one dimensional axes and which may be used to describe a 3 dimensional spatial region.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
three dimensional cartesian coordinate system
A cartesian spatial coordinate datum chosen as a fixed point of reference in a one dimensional spatial region.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
one dimensional cartesian spatial coordinate origin
A cartesian spatial coordinate datum chosen as a fixed point of reference in a spatial region.
placeholder, more work needed
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
cartesian spatial coordinate origin
linkage between 2 categorical variable test is a statistical test which evaluates if there is an association between a predictor variable assuming discrete values and a response variable also assuming discrete values
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
test of association
STATO
test of independence
test of independence between variables
test of association between categorical variables
measure of variation or statistical dispersion is a data item which describes how much a theoretical distribution or dataset is spread.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
measure of dispersion
measure of variation
measure of variation
a measure of central tendency is a data item which attempts to describe a set of data by identifying the value of its centre.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
measure of central tendency
measure of central tendency
Chi-squared statistic is a statistic computed from observations and used to produce a p-value in a statistical test when compared to a Chi-Squared distribution.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
Chi-Squared statistic
binary classification (or binomial classification) is a data transformation which aims to cast members of a set into 2 disjoint groups depending on whether the elements have a given property/feature or not.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
adapted from wikipedia:
http://en.wikipedia.org/wiki/Binary_classifier
last accessed: 2013-11-21
binomial classification
binary classification
The mode is a data item which corresponds to the most frequently occurring number in a set of numbers.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://www.sagepub.com/upm-data/47775_ch_3.pdf
mode
scipy.stats.mode(a, axis=0)
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mode.html#scipy.stats.mode
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/stats.py#L586
mode
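A minimal sketch of the mode as defined above, using only the Python standard library rather than the scipy function cited in this record. The function name is hypothetical; returning every tied value is a choice made for this sketch, since a data set can have more than one most frequent value.

```python
from collections import Counter


def modes(values):
    """Return the most frequently occurring value(s), sorted in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())  # raises ValueError on an empty input
    return sorted(value for value, count in counts.items() if count == highest)


if __name__ == "__main__":
    print(modes([1, 2, 2, 3, 3, 3]))
```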
a model parameter is a data item which is part of a model and which is meant to characterize a theoretical or unknown population. A model parameter may be estimated by considering the properties of samples presumably taken from the theoretical population
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
model parameter
the range is a measure of variation which describes the difference between the lowest score and the highest score in a set of numbers (a data set)
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://www.sagepub.com/upm-data/47775_ch_3.pdf
range(..., na.rm = FALSE)
http://stat.ethz.ch/R-manual/R-patched/library/base/html/range.html
range
Outliers are deviant scores that have been legitimately gathered and are not due to equipment failures.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://www.sagepub.com/upm-data/47775_ch_3.pdf
outlier
http://stats.stackexchange.com/questions/50623/r-calculating-mean-and-standard-error-of-mean-for-factors-with-lm-vs-direct
The standard error of the mean (SEM) is a data item denoting the standard deviation of the sample-mean's estimate of a population mean.
It is calculated by dividing the sample standard deviation (i.e., the sample-based estimate of the standard deviation of the population) by the square root of n , the size (number of observations) of the sample.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
SEM
adapted from wikipedia (https://en.wikipedia.org/wiki/Standard_error)
scipy.stats.sem(a, axis=0, ddof=1)
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.sem.html#scipy.stats.sem
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/stats.py#L1928
standard error of the mean
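The calculation described in the standard error of the mean definition (sample standard deviation divided by the square root of n) can be sketched with the Python standard library. The function name is hypothetical; `statistics.stdev` uses n - 1 in the denominator, which corresponds to `ddof=1` in the scipy signature cited in this record.

```python
import math
import statistics


def standard_error_of_mean(sample):
    """Sample standard deviation divided by the square root of the sample size."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


if __name__ == "__main__":
    print(standard_error_of_mean([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```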
a set of 2 subjects which results from a pairing process which assigns subjects to a set based on a pairing rule/criterion
possibly submit to 'Population and Community Ontology'
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
matched pair of subjects
a statistic is a measurement datum to describe a dataset or a variable. It is generated by a calculation on a set of observed data.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO, adapted from wikipedia (http://en.wikipedia.org/wiki/Statistic).
statistic
statistic
an MA plot is a scatter plot of the log intensity ratios M = log_2(T/R) versus the average log intensities A = log_2(T*R)/2, where T and R represent the signal intensities in the test and reference channels respectively.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
M vs A plot
http://www.stat.berkeley.edu/users/terry/zarray/Software/SMAcode/html/plot.mva.html
MA plot
plot.mva()
MA plot
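A minimal sketch of how the M and A values of an MA plot could be computed from paired channel intensities. The function name is hypothetical, and the sketch assumes the usual convention in which A is the mean of the two log intensities, A = (log_2(T) + log_2(R))/2 = log_2(T*R)/2, with strictly positive intensities.

```python
import math


def ma_values(test, reference):
    """Return (M, A) lists for paired test (T) and reference (R) intensities."""
    pairs = list(zip(test, reference))
    # M: log ratio of the two channels.
    m = [math.log2(t / r) for t, r in pairs]
    # A: average log intensity of the two channels.
    a = [(math.log2(t) + math.log2(r)) / 2 for t, r in pairs]
    return m, a


if __name__ == "__main__":
    print(ma_values([8.0, 16.0], [2.0, 16.0]))
```

Plotting M against A (for instance with a scatter plot function of any plotting library) then gives the MA plot.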
1
The Anderson–Darling test is a statistical test of whether a given sample of data is drawn from a given probability distribution.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://en.wikipedia.org/wiki/Anderson_Darling_test
ad.test(x) function, where x is a numeric vector
scipy.stats.anderson(x, dist='norm')
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.anderson.html#scipy.stats.anderson
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/morestats.py#L1017
Anderson-Darling test
true
true
1
1
one-way anova is an analysis of variance where the different groups being compared are associated with the factor levels of only one independent variable. The null hypothesis is an absence of difference between the means calculated for each of the groups. The test assumes normality and equivariance of the data.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
one factor ANOVA
STATO
http://statland.org/R/R/R1way.htm
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.f_oneway.html#scipy.stats.f_oneway
one-way ANOVA
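To make the one-way ANOVA definition concrete, the F statistic that compares between-group and within-group variability can be sketched as below. This is an illustrative sketch with a hypothetical function name; it returns only the F statistic and does not compute the p-value, which would require the F distribution (for example via the scipy function referenced in this record).

```python
import statistics


def one_way_anova_f(groups):
    """Return the F statistic for k groups defined by one independent variable."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the observations around their own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


if __name__ == "__main__":
    print(one_way_anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]))
```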
true
true
1
2
two-way anova is an analysis of variance where the different groups being compared are associated with the factor levels of exactly 2 independent variables. The null hypothesis is an absence of difference between the means calculated for each of the groups. The test assumes normality and equivariance of the data.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
two factor ANOVA
STATO
http://courses.statistics.com/software/R/Rtwoway.htm
two-way ANOVA
a block design is a kind of study design which declares a blocking variable (also known as nuisance variable) in order to account for a known source of variation and reduce its impact on the acquisition of the signal
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
adapted from several sources including Wikipedia
block design
1
a count of 4 resulting from counting limbs in humans
a count is a data item denoted by an integer and representing the number of instances or occurrences of an entity
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
count
true
true
1
3
Multi-way anova is an analysis of variance where the different groups being compared are associated with the factor levels of more than 2 independent variables. The null hypothesis is an absence of difference between the means calculated for each of the groups. The test assumes normality and equivariance of the data.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
multiway ANOVA
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2581961/
Hardy-Weinberg equilibrium hypothesis is a goodness of fit hypothesis which states that allele and genotype frequencies in a population will remain constant from generation to generation in the absence of other evolutionary influences (non-random mating, mutation, selection, genetic drift, gene flow and meiotic drive).
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO, adapted from wikipedia (http://en.wikipedia.org/wiki/Hardy–Weinberg_principle)
Hardy-Weinberg equilibrium hypothesis
signal to noise ratio is a measurement datum comparing the amount of meaningful, useful or interesting data (the signal) to the amount of irrelevant or false data (the noise). Depending on the field and domain of application, different variables will be used to determine a 'signal to noise ratio'. In statistics, the definition of signal to noise ratio is the ratio of the mean of a measurement to its standard deviation. It thus corresponds to the inverse of the coefficient of variation
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
adapted from Wikipedia:
http://en.wikipedia.org/wiki/Signal-to-noise_ratio#Alternative_definition
last accessed: 2013-10-18
S/N
SNR
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.signaltonoise.html#scipy.stats.signaltonoise
signal to noise ratio
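The statistical definition given for the signal to noise ratio (mean divided by standard deviation) can be sketched as follows. The function name is hypothetical, and the use of the population standard deviation (denominator n) is an assumption of this sketch; the sample standard deviation would be an equally plausible choice.

```python
import statistics


def signal_to_noise_ratio(measurements):
    """Mean of the measurements divided by their population standard deviation."""
    return statistics.mean(measurements) / statistics.pstdev(measurements)


if __name__ == "__main__":
    data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    snr = signal_to_noise_ratio(data)
    # The coefficient of variation is the reciprocal of this ratio.
    print(snr, 1 / snr)
```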
Poisson distribution is a probability distribution used to model the number of events occurring within a given time interval. It is defined by a real number (λ) and an integer k representing the number of events and a function.
The expected value of a Poisson-distributed random variable is equal to λ and so is its variance.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
dpois(x, lambda, log = FALSE)
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/Poisson.html
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.poisson.html#scipy.stats.poisson
NIST: http://www.itl.nist.gov/div898/handbook/eda/section3/eda366j.htm
Poisson distribution
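As a worked illustration of the Poisson distribution record, the sketch below evaluates the probability mass function P(k; λ) = exp(-λ) λ^k / k! and checks numerically that the expected value and the variance are both close to λ. The function name is hypothetical and the sum is truncated at k = 59, which is an approximation chosen for this sketch.

```python
import math


def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the rate is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


if __name__ == "__main__":
    lam = 3.0
    ks = range(60)  # truncated support; the tail beyond 59 is negligible for lam = 3
    mean = sum(k * poisson_pmf(k, lam) for k in ks)
    variance = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)
    print(mean, variance)
```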
true
Z-test is a statistical test which evaluates the null hypothesis that the means of 2 populations are equal and returns a p-value.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://reference.wolfram.com/mathematica/ref/ZTest.html
simple.z.test(x, sigma, conf.level=0.95)
http://www.inside-r.org/packages/cran/UsingR/docs/simple.z.test
Z-test
a false positive rate is a data item which accounts for the proportion of incorrect rejection of a true null hypothesis.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
PRS,AGB adapted from wikipedia and wolfram alpha
significance level
type I error rate
α
false positive rate
homoskedasticity states that all variances under consideration are homogeneous.
definition edited according to the discussion documented in:
https://github.com/ISA-tools/stato/issues/39
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
equality of variance
STATO
homoskedasticity hypothesis
http://www.ncbi.nlm.nih.gov/assembly/model/
chrX:35,000,000-36,000,000.
chromosome coordinate system is a genomic coordinate system which uses the chromosomes of a particular assembly build to define start and end positions. This coordinate system is unstable and will change with each new genome sequence assembly build.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
chromosome coordinate system
a null hypothesis which states that no linkage exists between 2 categorical variables
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
no relationship between the variables
variables are independent
absence of association hypothesis
A null hypothesis is a statistical hypothesis that is tested for possible rejection under the assumption that it is true (usually that observations are the result of chance). The concept was introduced by R. A. Fisher.
The hypothesis contrary to the null hypothesis, usually that the observations are the result of a real effect, is known as the alternative hypothesis.[wolfram alpha]
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://mathworld.wolfram.com/NullHypothesis.html
null hypothesis
goodness of fit hypothesis is a null hypothesis stating that the distribution computed from the sample population fits a theoretical distribution or that a dataset can be correctly explained by a model
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
goodness of fit hypothesis
0
the Student's t distribution is a continuous probability distribution which arises when estimating the mean of a normally distributed population in situations where the sample size is small and population standard deviation is unknown.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO, adapted from wikipedia (http://en.wikipedia.org/wiki/Student's_t-distribution)
t distribution
dt(x, df, ncp, log = FALSE)
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/TDist.html
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html#scipy.stats.t
Student's t distribution
hypergeometric distribution is a probability distribution that describes the probability of k successes in n draws from a finite population of size N containing K successes without replacement
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://en.wikipedia.org/wiki/Hypergeometric_distribution
dhyper(x, m, n, k, log = FALSE)
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/Hypergeometric.html
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.hypergeom.html#scipy.stats.hypergeom
hypergeometric distribution
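The probability described in the hypergeometric distribution definition can be written as C(K, k) C(N-K, n-k) / C(N, n), using the symbols N, K, n and k from that definition. The sketch below is a minimal illustration with a hypothetical function name; it relies on `math.comb`, which requires Python 3.8 or later. Note that the argument order differs from the R `dhyper` signature listed in this record.

```python
import math


def hypergeom_pmf(k, N, K, n):
    """Probability of k successes in n draws without replacement.

    N is the population size and K the number of successes it contains.
    """
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0  # outside the support of the distribution
    return math.comb(K, k) * math.comb(N - K, n - k) / math.comb(N, n)


if __name__ == "__main__":
    print(hypergeom_pmf(1, N=5, K=2, n=2))
```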
It is a null hypothesis stating that there are no differences observed between groups of subjects.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
absence of between group difference hypothesis
It is a null hypothesis stating that there are no differences observed across a series of measurements made on the same subject.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
absence of within subject difference hypothesis
genomic coordinate datum is a data item which denotes a genomic position expressed using a genomic coordinate system
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
genomic coordinate datum
http://left.subtree.org/2012/04/13/counting-the-number-of-reads-in-a-bam-file/
sequence read count is a data item determining how many sequence reads generated by a DNA sequencing assay for a given stretch of DNA can be counted
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
AGB-PRS, STATO
sequence read count
In statistics, a statement that can be tested.[wolfram alpha]
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
http://mathworld.wolfram.com/Hypothesis.html
hypothesis
Cleveland dot plot is a dot plot which plots points that each belong to one of several categories. Such plots are an alternative to bar charts or pie charts, and look somewhat like a horizontal bar chart where the bars are replaced by dots at the values associated with each category. Compared to (vertical) bar charts and pie charts, Cleveland argues that dot plots allow more accurate interpretation of the graph by readers by making the labels easier to read, reducing non-data ink (or graph clutter) and supporting table look-up.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
adapted from Wikipedia:
http://en.wikipedia.org/wiki/Dot_plot_(statistics)
and
Cleveland, William S. (1993). Visualizing Data. Hobart Press. ISBN 0-9634884-0-6. hdl:2027/mdp.39015026891187.
http://stat.ethz.ch/R-manual/R-patched/library/graphics/html/dotchart.html
dotchart(x, labels = NULL, groups = NULL, gdata = NULL,
cex = par("cex"), pch = 21, gpch = 21, bg = par("bg"),
color = par("fg"), gcolor = par("fg"), lcolor = "gray",
xlim = range(x[is.finite(x)]),
main = NULL, xlab = NULL, ylab = NULL, ...)
Cleveland dot plot
a continuous probability distribution is a probability distribution which is defined by a probability density function
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
adapted from Wikipedia
http://en.wikipedia.org/wiki/Probability_distribution#Continuous_probability_distribution
last accessed:
14/01/2014
continuous probability distribution
Skewness is a data item indicating the degree of asymmetry of a distribution.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://mathworld.wolfram.com/Skewness.html
skewness(x, na.rm = FALSE, type = 3)
http://hosho.ees.hokudai.ac.jp/~kubo/Rdoc/library/e1071/html/skewness.html
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.skew.html#scipy.stats.skew
skewness
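One common way to quantify skewness is the moment coefficient m3 / m2^(3/2), where m2 and m3 are the second and third central moments of the data. The sketch below is only an illustration with a hypothetical function name; other estimators apply bias corrections and may give slightly different values on small samples.

```python
def moment_skewness(values):
    """Third central moment divided by the second central moment to the power 3/2."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return m3 / m2 ** 1.5


if __name__ == "__main__":
    print(moment_skewness([1.0, 2.0, 3.0, 4.0, 5.0]))  # symmetric data
    print(moment_skewness([1.0, 1.0, 1.0, 5.0]))       # long right tail
```

A value near zero indicates a symmetric data set, a positive value a longer right tail, and a negative value a longer left tail.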
The number of degrees of freedom is a count evaluating the number of values in a calculation that can vary. In statistics, the number of degrees of freedom ν is equal to N-1 in the case of the direct measurement of a quantity estimated by the arithmetic mean of N independent observations.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://stats.stackexchange.com/questions/16921/how-to-understand-degrees-of-freedom
http://www.optique-ingenieur.org/en/courses/OPI_ang_M07_C01/co/Contenu_07.html
the rank of the quadratic form (mathematical definition)
number of degrees of freedom
2
Yates's corrected Chi-Squared test is a statistical test which is used to test the association/linkage/independence of 2 dichotomous variables while introducing a correction for using the continuous Chi-squared distribution for the test.
To reduce the error in approximation, Frank Yates, an English statistician, suggested a correction for continuity that adjusts the formula for Pearson's chi-squared test by subtracting 0.5 from the difference between each observed value and its expected value in a 2 × 2 contingency table. This reduces the chi-squared value obtained and thus increases its p-value.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO, adapted from wikipedia (http://en.wikipedia.org/wiki/Yates's_correction_for_continuity) polled in June 2013
Yates's correction for continuity
chisq.test(x, y = NULL, correct = TRUE)
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/chisq.test.html
Yates's corrected Chi-Squared test
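The continuity correction described in this record (subtracting 0.5 from each absolute difference between observed and expected counts in a 2 × 2 table) can be sketched as follows. The function name is hypothetical, and the sketch returns only the test statistic; it assumes every |observed - expected| is at least 0.5, whereas statistical packages may cap the correction for smaller differences.

```python
def chi_squared_2x2(table, yates=True):
    """Chi-squared statistic for a 2 x 2 contingency table [[a, b], [c, d]]."""
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    n = sum(row_totals)
    correction = 0.5 if yates else 0.0
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            # Yates: shrink each |observed - expected| by 0.5 before squaring.
            statistic += (abs(table[i][j] - expected) - correction) ** 2 / expected
    return statistic


if __name__ == "__main__":
    table = [[10, 20], [30, 40]]
    print(chi_squared_2x2(table, yates=True), chi_squared_2x2(table, yates=False))
```

The corrected statistic is smaller than the uncorrected one, which is why the correction yields a larger p-value.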
reaction rate is a measurement datum which represents the speed of a chemical reaction turning reactive species into product species, i.e. the number of such conversion events occurring over a time interval
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
reaction rate
substrate concentration is a scalar measurement datum which denotes the amount of molecular entity involved in an enzymatic reaction (or catalytic chemical reaction) and whose role in that reaction is as substrate.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
substrate concentration
1
2
2
5
Fisher's exact test is a statistical test used to determine if there are nonrandom associations between two categorical variables.
duplicate with OBI_0200176. so either MIREOT and add metadata and axioms or move from OBI
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://mathworld.wolfram.com/FishersExactTest.html
fisher.test(x) function, where x is a matrix
scipy.stats.fisher_exact(table, alternative='two-sided')
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.fisher_exact.html#scipy.stats.fisher_exact
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/stats.py#L2485
Fisher's exact test
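As an illustration of what the test computes for a 2 x 2 table, the sketch below sums the hypergeometric probabilities of all tables with the same margins that are no more probable than the observed one. The function name and the example tables are hypothetical; exact fractions are used to avoid floating-point ties, which is a design choice of this sketch and not necessarily how fisher.test or scipy.stats.fisher_exact proceed.

```python
from fractions import Fraction
from math import comb


def fisher_exact_two_sided(table):
    """Two-sided Fisher exact p-value for a 2 x 2 table of counts."""
    (a, b), (c, d) = table
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return Fraction(comb(row1, k) * comb(row2, col1 - k), total)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum over every table that is at most as probable as the observed one.
    p_value = sum(prob(k) for k in range(lowest, highest + 1) if prob(k) <= p_observed)
    return float(p_value)


print(fisher_exact_two_sided([[8, 2], [1, 5]]))
```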
true
2
1
1
2
2
Cochran-Mantel-Haenszel test for repeated tests of independence is a statistical test which allows the comparison of two groups on a dichotomous/categorical response. It is used when the effect of the explanatory variable on the response variable is influenced by covariates that can be controlled. It is often used in observational studies where random assignment of subjects to different treatments cannot be controlled, but influencing covariates can.
The null hypothesis is that the two nominal variables that are tested within each repetition are independent of each other. So there are 3 variables to consider: two categorical variables to be tested for independence of each other, and the third variable identifies the repeats.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO adapted from wikipedia (http://en.wikipedia.org/wiki/Cochran–Mantel–Haenszel_statistics) and from the Handbook of Biological Statistics (http://udel.edu/~mcdonald/statcmh.html)
CMH test
Mantel–Haenszel test
cmh.test(x,y,z)
Cochran-Mantel-Haenszel test for repeated tests of independence
a rarefaction curve is a graph used for estimating species richness in ecology studies
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
>library(vegan)
>rarefaction(x, subsample=5, plot=TRUE, color=TRUE, error=FALSE, legend=TRUE, symbol)
http://hosho.ees.hokudai.ac.jp/~kubo/Rdoc/library/vegan/html/vegan-package.html
rarefaction curve
1
1
2
1
The Mann-Whitney U-test is a null hypothesis statistical testing procedure which allows two groups (or conditions or treatments) to be compared without making the assumption that values are normally distributed.
The Mann-Whitney test is the non-parametric equivalent of the t-test for independent samples
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
U test
Wilcoxon rank-sum test
rank-sum test for the comparison of two samples
adapted from http://udel.edu/~mcdonald/statkruskalwallis.html
and from http://en.wikipedia.org/wiki/Mann%E2%80%93Whitney_U
last accessed [2014-03-04]
Wilcoxon Rank-Sum test
wilcox.test(dependent variable ~ independent variable, data = dataset)
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/wilcox.test.html
scipy.stats.mannwhitneyu(x, y, use_continuity=True)
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mannwhitneyu.html#scipy.stats.mannwhitneyu
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/stats.py#L4049
scipy.stats.ranksums(x, y)
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ranksums.html#scipy.stats.ranksums
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/stats.py#L4103
Mann-Whitney U-test
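A minimal sketch of the U statistic itself, assuming the pair-counting formulation: each pair of observations (one from each group) contributes 1 when the first group's value is larger and 0.5 when the values tie. The function name and sample values are hypothetical, and no p-value is computed here.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y), the Mann-Whitney U statistics of two samples."""
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0   # x wins this pair
            elif xi == yj:
                u_x += 0.5   # ties are split evenly
    # The two statistics always sum to the number of pairs.
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


group_a = [1.1, 2.3, 3.8, 4.0]       # hypothetical measurements
group_b = [2.9, 5.2, 6.1, 7.4, 8.0]  # hypothetical measurements
u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)
```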
Shapiro-Wilk test is a goodness of fit test which evaluates the null hypothesis that the sample is drawn from a population following a normal distribution
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
S-W test
STATO, adapted from wikipedia (https://en.wikipedia.org/wiki/Shapiro–Wilk_test)
shapiro.test(x) function, where x is a numeric vector
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/shapiro.test.html
scipy.stats.shapiro(x, a=None, reta=False)
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.shapiro.html#scipy.stats.shapiro
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/morestats.py#L944
Shapiro-Wilk test
Levene's test is a null hypothesis statistical test which evaluates the null hypothesis of equality of variance in several populations.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://en.wikipedia.org/wiki/Levene_test
levene.test(x) function, where x is a numeric vector
scipy.stats.levene(*args, **kwds)
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.levene.html#scipy.stats.levene
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/morestats.py#L1496
Levene's test
Bartlett's test (see Snedecor and Cochran, 1989) is used to test if k samples are from populations with equal variances. Equal variances across samples is called homoscedasticity or homogeneity of variances. Some statistical tests, for example the analysis of variance, assume that variances are equal across groups or samples. The Bartlett test can be used to verify that assumption.
Bartlett's test is sensitive to departures from normality. That is, if the samples come from non-normal distributions, then Bartlett's test may simply be testing for non-normality. Levene's test and the Brown–Forsythe test are alternatives to the Bartlett test that are less sensitive to departures from normality.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://en.wikipedia.org/wiki/Bartlett_test
bartlett.test(x) function, where x is a numeric vector
scipy.stats.bartlett(*args)
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.bartlett.html#scipy.stats.bartlett
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/morestats.py#L1450
Bartlett's test
the Brown Forsythe test is a statistical test which evaluates whether the variances of different groups are equal. It relies on computing the median rather than the mean, as used in Levene's test for homoscedasticity.
This test may be used, for instance, to ensure that the conditions for applying ANOVA are met.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
adapted from Wikipedia and Brown, M. B., and A. B. Forsythe. 1974a. The small sample behavior of some statistics which test the equality of several means. Technometrics, 16, 129-132.
http://www.statmethods.net/stats/anovaAssumptions.html
The hovPlot( ) function in the HH package provides a graphic test of homogeneity of variances based on Brown-Forsythe. In the following example, y is numeric and G is a grouping factor. Note that G must be of type factor.
# Homogeneity of Variance Plot
library(HH)
hov(y~G, data=mydata)
hovPlot(y~G,data=mydata)
Brown Forsythe test
2
Pearson's Chi-Squared test is a statistical null hypothesis test which is used either to evaluate the goodness of fit of observed frequencies to a theoretical distribution or to test the independence of 2 categorical variables (i.e. absence of association between those variables).
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
Chi2 test for independence
adapted from:
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/chisq.test.html
and
http://en.wikipedia.org/wiki/Pearson's_chi-squared_test
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/chisq.test.html
chisq.test(x, y = NULL, correct = TRUE,
p = rep(1/length(x), length(x)), rescale.p = FALSE,
simulate.p.value = FALSE, B = 2000)
http://www.inside-r.org/packages/cran/nortest/docs/pearson.test
pearson.test(x) function, where x is a numeric vector
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chi2_contingency.html#scipy.stats.chi2_contingency
Pearson's Chi square test of independence between categorical variables
2
1
1
a fixed effect model is a statistical model which represents the observed quantities in terms of explanatory variables that are treated as if the quantities were non-random.
PRS: this is a stub and more work is needed to reconcile conflicting definitions
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
adapted from wikipedia:
http://en.wikipedia.org/wiki/Fixed_effects_model
fixed effect model
Kolmogorov-Smirnov test is a goodness of fit test which evaluates the null hypothesis that a sample is drawn from a population that follows a specific continuous probability distribution.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
K-S test
STATO, adapted from wikipedia (https://en.wikipedia.org/wiki/Kolmogorov–Smirnov_test)
http://www.itl.nist.gov/div898/handbook/eda/section3/eda35g.htm
ks.test(dataset, distribution)
scipy.stats.kstwobign
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kstwobign.html#scipy.stats.kstwobign
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/_continuous_distns.py
scipy.stats.mstats.ks_twosamp(data1, data2,alternative='two-sided')
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.ks_twosamp.html
source code:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/mstats_basic.py#L821
Kolmogorov-Smirnov test
multinomial probit regression model is a model which attempts to explain the data distribution associated with a *polychotomous* response/dependent variable in terms of the values assumed by the independent variable(s), using a function of the predictor/independent variable(s): the function used in this instance of regression modeling is the probit function.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO, adapted from wikipedia (http://en.wikipedia.org/wiki/Multinomial_probit) polled in June 2013
http://cran.r-project.org/web/packages/mlogit/vignettes/mlogit.pdf
multinomial probit regression for analysis of polychotomous dependent variable
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2689604/
effect size estimate is a data item about the direction and strength of the consequences of a causative agent as explored by statistical methods. Those methods produce estimates of the effect size, e.g. confidence interval
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
AGB,PRS
effect size
effect size estimate
an F-test is a statistical test in which the computed test statistic follows an F-distribution under the null hypothesis. The F-test is sensitive to departures from normality. F-tests arise when decomposing the variability in a data set in terms of sums of squares.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
F-test
2
a polychotomous variable is a categorical variable which is defined to have minimally 2 categories or possible values
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
http://udel.edu/~mcdonald/statvartypes.html
polychotomous variable
statistical sample size is a count evaluating the number of individual experimental units
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
AGB-PRS
statistical sample size
study group population size
2
1
a case-control study design is an observational study design which assesses the risk of a particular outcome (a trait or a disease) associated with an event (either an exposure or an endogenous factor). A case-control study design therefore declares an exposure variable which is dichotomous in nature (exposed/non-exposed) and an outcome variable, which is also dichotomous (case or control), thus giving the name to the design. During the execution of the design, a case-control study defines a population and counts the events to determine their frequency.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO, adapted from:
http://www.drcath.net/toolkit/casecontrol.html
case-control study design
2
a dichotomous variable is a categorical variable which is defined to have only 2 categories or possible values
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
AGB-PRS
http://udel.edu/~mcdonald/statvartypes.html
'has part' exactly 1 ('categorical measurement datum'
and ('has category label' exactly 2 'categorical label'))
dichotomous variable
Genome wide association study is a kind of study whose objective is to detect association between genetic markers (SNP or otherwise) across the genome and a trait which may be a disease or another phenotype (e.g. trait of agronomic relevance in animal or plant studies). A genome wide association study compares the allele frequencies in 2 populations, one free of the trait, used as 'control', the other one showing the trait, used as 'case'. GWAS studies implement a case-control design.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
AGB, PRS
GWAS study
whole genome association study
genome-wide association study
1
2
The Wilcoxon signed rank test is a statistical test which tests the null hypothesis that the median difference between pairs of observations is zero. This is the non-parametric analogue to the paired t-test, and should be used if the distribution of differences between pairs may be non-normally distributed.
The procedure involves a ranking, hence the name. The absolute values of the differences between observations are ranked from smallest to largest, with the smallest difference getting a rank of 1, the next larger difference getting a rank of 2, etc. Ties are given average ranks. The ranks of all differences in one direction are summed, and the ranks of all differences in the other direction are summed. The smaller of these two sums is the test statistic, W (sometimes symbolized Ts). Unlike most test statistics, smaller values of W are less likely under the null hypothesis.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://udel.edu/~mcdonald/statsignedrank.html
signrank()
scipy.stats.wilcoxon(x, y=None, zero_method='wilcox', correction=False)
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wilcoxon.html#scipy.stats.wilcoxon
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/stats.py#L4103
Wilcoxon signed rank test
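The ranking procedure described above can be sketched as follows. The helper names and the paired measurements are hypothetical, and dropping zero differences before ranking is an assumption of this sketch (it is one common convention, not the only one). No p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1 upwards, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def wilcoxon_w(before, after):
    """Smaller of the two signed rank sums (the W statistic)."""
    # Assumption: pairs with a zero difference are discarded.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    positive = sum(r for r, d in zip(ranks, diffs) if d > 0)
    negative = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(positive, negative)


before = [10, 12, 9, 14, 11]   # hypothetical paired measurements
after = [12, 11, 13, 15, 11]
print(wilcoxon_w(before, after))
```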
Information about a calendar date or timestamp indicating day, month, year and time of an event.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
date
1
2
1
1
The Kruskal–Wallis test is a null hypothesis statistical testing objective which allows multiple (n>=2) groups (or conditions or treatments) to be compared, without making the assumption that values are normally distributed. The Kruskal–Wallis test is the non-parametric equivalent of the independent samples ANOVA.
The Kruskal–Wallis test is most commonly used when there is one nominal variable and one measurement variable, and the measurement variable does not meet the normality assumption of an anova.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
H test
rank-sum test for the comparison of multiple (more than 2) samples.
http://udel.edu/~mcdonald/statkruskalwallis.html
kruskal.test()
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/kruskal.test.html
scipy.stats.mstats.kruskalwallis(*args)
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.mstats.kruskalwallis.html
source code:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/mstats_basic.py#L800
Kruskal Wallis test
true
1
true
1
paired t-test is a statistical test which is specifically designed to analyse differences between paired observations in studies implementing a repeated measures design with only 2 repeated measurements per subject (before and after treatment, for example)
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://udel.edu/~mcdonald/statpaired.html
http://udel.edu/~mcdonald/statsignedrank.html
t-test for dependent means
t-test for repeated measures
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/t.test.html
t.test(dependent variable ~ independent variable, data = dataset, var.equal = FALSE, paired= TRUE)
scipy.stats.ttest_rel(a, b, axis=0)
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.ttest_rel.html#scipy.stats.ttest_rel
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/stats.py#L3389
paired t-test
2
1
stratification is a planned process which executes a stratification rule, using a population as input, and assigns its members to mutually exclusive subpopulations based on the values defined by the stratification rule
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
PRS+AGB adapted from wikipedia:
http://en.wikipedia.org/wiki/Stratified_sampling
polled on June 7th,2013
stratifying population
population stratification prior to sampling
A statistical test power analysis is a data transformation which aims to determine the size of a statistical sample required to reach a desired significance level given a particular statistical test
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://www.statmethods.net/stats/power.html
http://www.statmethods.net/stats/power.html
statistical test power analysis
2
2
http://arxiv.org/pdf/1007.1094.pdf
Hotelling's T2 test is a statistical test which is a generalization of Student's t-test used to assess whether the means of a set of variables remain unchanged when studying 2 populations. It is a type of multivariate analysis.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
http://svitsrv25.epfl.ch/R-doc/library/rrcov/html/T2.test.html
two sample Hotelling T2 test
1
1
a random effect(s) model, also called a variance components model, is a kind of hierarchical linear model. It assumes that the dataset being analysed consists of a hierarchy of different populations whose differences relate to that hierarchy.
PRS: this is a stub and more work is needed to reconcile conflicting definitions
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
variance components model
adapted from wikipedia:
http://en.wikipedia.org/wiki/Random_effects_model#Qualitative_description
random effect model
2
standardized mean difference is a data item computed by forming the difference between two means, divided by an estimate of the within-group standard deviation.
It is used to provide an estimate of the effect size between two treatments when the predictor (independent variable) is categorical and the response (dependent) variable is continuous
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
SMD
adapted from "Effect size, confidence interval and statistical significance: a practical guide for biologists" Nakagawa and Cuthill
DOI: 10.1111/j.1469-185X.2007.00027.x
adapted from http://htaglossary.net/standardised+mean+difference+(SMD)
Cohen's d statistic
standardized mean difference
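A minimal sketch of the computation defined above, assuming the pooled sample standard deviation is used as the within-group estimate (the usual form of Cohen's d). The function name and the example data are hypothetical.

```python
import math
import statistics


def standardized_mean_difference(group_1, group_2):
    """Difference between two group means divided by the pooled standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    var1 = statistics.variance(group_1)  # sample variance (n - 1 denominator)
    var2 = statistics.variance(group_2)
    # Assumption: within-group spread is estimated by the pooled standard deviation.
    pooled_sd = math.sqrt(((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2))
    return (statistics.mean(group_1) - statistics.mean(group_2)) / pooled_sd


treated = [2.0, 4.0, 6.0]   # hypothetical responses
control = [1.0, 3.0, 5.0]
print(standardized_mean_difference(treated, control))
```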
the multinomial distribution is a probability distribution which gives the probability of any particular combination of numbers of successes for various categories defined in the context of n independent trials each of which leads to a success for exactly one of k categories, with each category having a given fixed success probability.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
adapted from
http://mathworld.wolfram.com/MultinomialDistribution.html
and
http://en.wikipedia.org/wiki/Multinomial_distribution
dmultinom(x, size = NULL, prob, log = FALSE)
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/Multinom.html
multinomial distribution
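The probability described in the definition can be written out directly: the multinomial coefficient n! / (x1! ... xk!) multiplied by the product of each category probability raised to its count. The sketch below is a plain Python illustration with a hypothetical function name; it is not the code behind dmultinom.

```python
from math import factorial


def multinomial_pmf(counts, probabilities):
    """Probability of observing the given category counts in sum(counts) trials."""
    coefficient = factorial(sum(counts))
    for count in counts:
        # Each intermediate quotient is an integer, so floor division is exact.
        coefficient //= factorial(count)
    probability = float(coefficient)
    for count, p in zip(counts, probabilities):
        probability *= p ** count
    return probability


print(multinomial_pmf([2, 1, 1], [0.5, 0.25, 0.25]))
```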
A z-score (also known as z-value, standard score, or normal score) is a measure of the divergence of an individual experimental result from the most probable result, the mean. Z is expressed in terms of the number of standard deviations from the mean value.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
https://controls.engin.umich.edu/wiki/index.php/Basic_statistics:_mean,_median,_average,_standard_deviation,_z-scores,_and_p-value#Z-Scores
normal score
standard score
scipy.stats.zscore(a, axis=0, ddof=0)
http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.zscore.html#scipy.stats.zscore
source:
https://github.com/scipy/scipy/blob/v0.15.1/scipy/stats/stats.py#L1977
z-score
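As a small illustration, each value is expressed as its distance from the mean in units of standard deviation. The sketch assumes the population standard deviation (the ddof=0 default shown in the scipy.stats.zscore signature above); the function name and data are hypothetical.

```python
import statistics


def z_scores(values):
    """Express each value as a number of standard deviations from the mean."""
    mean = statistics.mean(values)
    # Assumption: population standard deviation, matching ddof=0.
    sd = statistics.pstdev(values)
    return [(value - mean) / sd for value in values]


measurements = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical data
print(z_scores(measurements))
```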
log signal intensity ratio is a data item which corresponds to the base-2 logarithm of the ratio between 2 signal intensities, each corresponding to a condition.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
adapted from wikipedia:
http://en.wikipedia.org/wiki/MA_plot
last accessed: 2014-03-13
M-value
log signal intensity ratio
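A one-line illustration of the definition, with a hypothetical function name and made-up intensities. A value of 2 means a four-fold higher signal in the first condition, and swapping the conditions flips the sign.

```python
import math


def log2_intensity_ratio(intensity_a, intensity_b):
    """Base-2 logarithm of the ratio between two signal intensities."""
    return math.log2(intensity_a / intensity_b)


print(log2_intensity_ratio(800.0, 200.0))
print(log2_intensity_ratio(200.0, 800.0))
```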
probit regression model is a model which attempts to explain the data distribution associated with a *dichotomous* response/dependent variable in terms of the values assumed by the independent variable(s), using a function of the predictor/independent variable(s): the function used in this instance of regression modeling is the probit function, i.e. the quantile function (the inverse cumulative distribution function, CDF) associated with the standard normal distribution.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO, adapted from wikipedia (http://en.wikipedia.org/wiki/Probit_model) polled in June 2013
probit regression for analysis of dichotomous dependent variable
a statistical model is an information content entity which is a formalization of relationships between variables in the form of mathematical equations. A statistical model describes how one or more random variables are related to one or more other variables. The model is statistical as the variables are not deterministically but stochastically related.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
adapted from Wikipedia:
http://en.wikipedia.org/wiki/Statistical_model
last accessed: 14/01/2014
statistical model
statistical model
linear regression model is a model which attempts to explain the data distribution associated with a response/dependent variable in terms of the values assumed by the independent variable(s), using a linear function or linear combination of the regression parameters and the predictor/independent variable(s).
linear regression modeling makes a number of assumptions, which include homoscedasticity (constancy of variance)
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO, adapted from wikipedia (http://en.wikipedia.org/wiki/Linear_regression) polled in June 2013
linear regression for analysis of continuous dependent variable
multinomial logistic regression model is a model which attempts to explain the data distribution associated with a *polychotomous* response/dependent variable in terms of the values assumed by the independent variable(s), using a function of the predictor/independent variable(s): the function used in this instance of regression modeling is the logistic function.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO, adapted from wikipedia (http://en.wikipedia.org/wiki/Multinomial_logistic_regression) polled in June 2013
http://cran.r-project.org/web/packages/mlogit/vignettes/mlogit.pdf
multinomial logistic regression for analysis of polychotomous dependent variable
a sequence read is DNA sequence data generated by a DNA sequencer
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
sequence read
a Funnel plot is a scatter plot of treatment effect versus a measure of study size and aims to provide a visual aid to detecting bias or systematic heterogeneity. A symmetric inverted funnel shape arises from a ‘well-behaved’ data set, in which publication bias is unlikely. An asymmetric funnel indicates a relationship between treatment effect and study size.
Known caveats: If high precision studies really are different from low precision studies with respect to effect size (e.g., due to different populations examined) a funnel plot may give a wrong impression of publication bias. The appearance of the funnel plot can change quite dramatically depending on the scale on the y-axis — whether it is the inverse square error or the trial size.
Funnel plot was introduced by Light and Palmer in 1984.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
adapted from Wikipedia:
http://en.wikipedia.org/wiki/Funnel_plot
Funnel plot
variance is a data item about a random variable or probability distribution. It is equivalent to the square of the standard deviation. It is one of several descriptors of a probability distribution, describing how far the numbers lie from the mean (expected value). The variance is the second central moment of a distribution.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
σ²
var(x, y = NULL, na.rm = FALSE, use)
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/cor.html
variance
the process of using statistical analysis for interpreting and communicating "what the data say".
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
From "The strength of statistical evidence" by Richard Royall.
https://www.stat.fi/isi99/proceedings/arkisto/varasto/roya0578.pdf
assess statistical evidence
assess statistical evidence
a discrete probability distribution is a probability distribution which is defined by a probability mass function where the random variable can only assume a finite or countably infinite number of values
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
adapted from Wikipedia
http://en.wikipedia.org/wiki/Probability_distribution#Discrete_probability_distribution
last accessed:
14/01/2014
discrete probability distribution
ranking is a data transformation which turns a non-ordinal variable into an ordinal variable by sorting the values of the input variable and replacing each value by its position in the sorting result
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
ranking
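A minimal sketch of this transformation: values are sorted and each is replaced by its 1-based position in the sorted order. Giving tied values the average of the positions they occupy is an assumption of this sketch (the definition does not say how ties are handled), and the function name is hypothetical.

```python
def rank_values(values):
    """Replace each value by its 1-based position in the sorted order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Find the run of equal values starting at this sorted position.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Assumption: tied values share the average of their positions.
        shared = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


print(rank_values([30, 10, 20, 10]))
```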
model parameter estimation is a data transformation that finds parameter values (the model parameter estimates) most compatible with the data as judged by the model.
textual definition modified following a contribution by Thomas Nichols:
https://github.com/ISA-tools/stato/issues/18
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
model parameter estimation
http://www.r-bloggers.com/boxplots-beyond-iv-beanplots/
beanplot is a plot in which (one or) multiple batches ("beans") are shown. Each bean consists of a density trace, which is mirrored to
form a polygon shape. Next to that, a one-dimensional scatter plot shows all the individual measurements, like in a stripchart.
The name beanplot stems from green beans. The density shape can be seen as the pod of a green bean, while the scatter plot shows the seeds inside the pod.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
http://www.jstatsoft.org/v28/c01/paper
http://cran.r-project.org/web/packages/beanplot/index.html
bean plot
the objective of a data transformation to evaluate a null hypothesis of absence of linkage between variables.
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
association between categorical variables testing objective
a pedigree chart is a graph which plots parent child relations
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO adapted from wikipedia (https://en.wikipedia.org/wiki/Pedigree_chart)
family tree
plot.pedigree {kinship}
http://hosho.ees.hokudai.ac.jp/~kubo/Rdoc/library/kinship/html/plot.pedigree.html
pedigree chart
2
r2 is a correlation coefficient which is computed over the frequencies of 2 dichotomous variables and is used as a measure of Linkage Disequilibrium and as an input data item to the creation of an LD plot
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
R squared measure of LD
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2580747/
r2 measure of LD
r2 measure of linkage disequilibrium
a stratification rule (or criterion) is a criterion used to determine population strata so that a stratification process implementing the rule can result in any member of the total population being assigned to one and only one stratum
Alejandra Gonzalez-Beltran
Orlaith Burke
Philippe Rocca-Serra
STATO
adapted from wikipedia:
http://en.wikipedia.org/wiki/Stratified_sampling
polled on June 7th,2013
stratification rule
The dot plot as a representation of a distribution consists of group of data points plotted on a simple scale. Dot plots are used