Allyson Lister Andy Brown Duncan Hull Helen Parkinson James Malone Jon Ison Nandini Badarinarayan Robert Stevens SWO (The Software Ontology) The Software Ontology (SWO) is a resource for describing software tools, their types, tasks, versions, licences, provenance and associated data. Date of release: 21.10.2019 The Software Ontology (SWO) is a resource for describing software tools, their types, tasks, versions, licensing, provenance and associated data. It contains SWO-specific classes and axioms as well as imports from: 1. BFO 2. IAO 3. OBI 4. EDAM 1.7 'is about' relates an information entity to other entities in which the information entity holds some information which describes some facet of the other entity, such as the arrow direction on a sign. IAO James Malone Alan Ruttenberg is about OBO Foundry Please note that, as stated on one of the RO documentation pages (https://code.google.com/p/obo-relations/wiki/ROAndBFO), some of the relations RO makes use of are either uncontroversial (non-temporalized) parts of BFO2, or will be incorporated in the future. In the case of BFO_0000063, it is not currently in BFO2, but may be in the future. The RO ontologists assume this is the transitive form. For a full definition, refer to the inverse of this property, BFO_0000060. (Allyson Lister) preceded by The assertion P 'followed by' P1 tells us something about Ps in general: that is, it tells us something about what happened later, given what we know about what happened earlier. Thus it does not provide information pointing in the opposite direction, concerning instances of P1 in general; that is, that each is such as to be preceded by some instance of P. Note that assertions using only this property are rather weak. Typically we will be interested in stronger relations, for example in the relation 'directly followed by'. 
Modified by Allyson Lister from the preceded_by (which is the inverse of precedes / 'followed by') definition within RO: http://www.obofoundry.org/ro/ro.owl OBO Foundry Please note that, as stated on one of the RO documentation pages (https://code.google.com/p/obo-relations/wiki/ROAndBFO), some of the relations RO makes use are either uncontroversial (non-temporalized) parts of BFO2, or will be incorporated in the future. In the case of BFO_0000060, it is not currently in BFO2, but may be in the future. The RO ontologists assume this is the transitive form. (Allyson Lister) followed by Andy Brown Please see the official RO definition for the inverse of this property, 'has participant.' participates in The relation obtains, for example, when this particular process of oxygen exchange across this particular alveolar membrane has_participant this particular sample of hemoglobin at this particular time. Has_participant is a primitive instance-level relation between a process, a continuant, and a time at which the continuant participates in some way in the process. http://obo-relations.googlecode.com/svn/trunk/src/ontology/core.owl Andy Brown has participant has_role An entity A is the 'input of' another entity B if A was put into the system, entity or software represented by B. Allyson Lister input of A piece of software can be the output of a particular software company via a software publishing process; data might be the output of a particular piece of data-producing software. An entity A is the 'output of' another entity B if it was produced as a result of the functioning of entity B. Allyson Lister output of The relationship between a software and software developer. is developed by The relationship between software and a software publisher. Marked as obsolete by Allyson Lister. 0.5 This class can be entirely replaced with 'is published by', SWO_0004004. Please use SWO_0004004 instead. 
obsolete_is_published_by true implements is the relationship between software and an algorithm that is defined for use within that software when executed. James Malone implements Linking a type of software to its particular programming language. Is encoded in is an "is about" relationship which describes the type of encoding used for the referenced class. Allyson Lister is encoded in 'is version of' provides a link between a 'version name' and the entity with that version. Allyson Lister For further information, please see http://softwareontology.wordpress.com/2012/06/20/versioning-in-swo/ is version of Cytoscape plugins would be linked to the Cytoscape application via uses software, while Microsoft Excel is linked to Microsoft Office via has_part. This property allows the linkage of two different pieces of software such that one directly executes or uses the other. The has_part relationship should instead be used to describe related but independent members of a larger software package, and the 'uses platform' relationship should be used to describe which operating system(s) a particular piece of software can use. Allyson Lister uses software is implemented by is the relationship between an algorithm and a piece of software which includes an implementation of that algorithm for use when the software is executed. James Malone is implemented by The relationship between a piece of software and the input data which that software is permitted to take. James Malone See also http://softwareontology.wordpress.com/2011/04/15/ins-and-outs-of-software/ has specified data input The relationship between a piece of software and the data that it is possible to output. James Malone See also http://softwareontology.wordpress.com/2011/04/15/ins-and-outs-of-software/ has specified data output The relationship between data and the software which can possibly take this data as input. 
James Malone is specified data input of The relationship between data and the software which can possibly produce this data as output. James Malone is specified data output of Microsoft version 2007 is directly preceded by Microsoft version 2003. Entity A is 'directly preceded by' entity B if there are no intermediate entities temporally between the two entities. Within SWO this property is mainly used to describe versions of entities such as software. Allyson Lister OBO Foundry directly preceded by 'directly followed by' is an object property which further specializes the parent 'followed by' property. The assertion 'C directly followed by C1' says that Cs generally are immediately followed by C1s. Allyson Lister directly followed by 'uses platform' should be used to link a particular piece of software to one or more operating systems which that software can run on. This is in contrast to both 'uses software' (which describes one piece of software directly executing another), and has_part, which can be used to describe related but independent software in a package, for example. Allyson Lister This property, together with 'uses software', is probably best modelled as a child of has_part. Its current position in the property hierarchy is based on simplicity of use. 'uses platform' was deemed an appropriate addition (rather than making use of already extant has_part or 'uses software') due to the already present definitions of those classes which restrict their use to particular situations. uses platform is software for relationship between an entity and a version name or number For further information, please see http://softwareontology.wordpress.com/2012/06/20/versioning-in-swo/ has version Provides a method of asserting what type of interactions are possible for the class in question. The interface must be from the 'software interface' hierarchy. 
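The distinction drawn above between 'directly preceded by' (no intermediate entity) and its transitive parent 'preceded by' can be illustrated with a small sketch. This is only an illustration under stated assumptions, not part of SWO: the dictionary, the function name and the "version 2010" entry are hypothetical; only the 2003-to-2007 link comes from the example in the text.

```python
# Hypothetical 'directly preceded by' assertions between version names.
# Only the "version 2007" -> "version 2003" link is taken from the text;
# "version 2010" is an invented entry used purely for illustration.
directly_preceded_by = {
    "version 2007": "version 2003",
    "version 2010": "version 2007",
}


def preceded_by(version, links):
    """Return every earlier version reachable by following direct links.

    This mimics reading 'preceded by' as the transitive form of
    'directly preceded by'.
    """
    earlier = []
    current = links.get(version)
    # Stop at the first version with no predecessor, or if a cycle appears.
    while current is not None and current not in earlier:
        earlier.append(current)
        current = links.get(current)
    return earlier


if __name__ == "__main__":
    print(preceded_by("version 2010", directly_preceded_by))
```

Under these assumptions, "version 2010" is directly preceded only by "version 2007", while the transitive reading also reaches "version 2003".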
Allyson Lister Andy Brown Andy Brown has interface Has format specification is a type of "is encoded in" relationship which specifically describes the relationship between data and a data format specification. Allyson Lister has format specification 'is published by' elucidates the relationship between a piece of software or a data format specification and its owning organization. Please note that this is not the same as authorship of the software. For instance, affy is published within Bioconductor, but has its own distinct authorship list. Allyson Lister James Malone is published by is executed in defines the relationship between a software class and an appropriate process in the information processing hierarchy. Specifically, it allows the linking of a particular piece of software to a process of a particular purpose, Allyson Lister Allyson Lister OBI is executed in Axioms using the 'has clause' property, e.g. C 'has clause' C1, provide links from the left hand class to the instances within the 'license clause' hierarchy. This provides a way to more precisely assert the constraints of the licensing applied. Allyson Lister has clause The relationship between an entity and the set of legal restrictions, i.e. license, which are applied in using or otherwise interacting with that entity. Eg. relationship between software and a software license. has license The relationship between an entity and the set of legal restrictions, i.e. license, which are applied in using or otherwise interacting with that entity. Eg. relationship between a software license and the software which implements it. is license for The format for BioPAX Manchester OWL syntax is an alternative format of the BioPAX RDF/XML format. With both a domain and range of 'data format specification', this property provides a means of stating that two different data format specifications are valid specifications for the same type of data. 
Allyson Lister is alternative format of 'is compatible license of' provides a method of marking two software licenses as compatible and without conflicts, e.g. that the Apache License version 2 is compatible with GNU GPL version 3. If two licenses are connected with this property, it means code released under one license can be released with code from the other license in a larger program. http://www.gnu.org/licenses/gpl-faq.html#WhatDoesCompatMean, accessed 12 June 2013. Allyson Lister is compatible license of 'has declared status' provides a way to assert the developmental status of a class, such as whether it is stable or under development. Is especially useful for software that might not be complete or stable yet, and when combined with version information. Allyson Lister has declared status Should be used to link a particular class (e.g. a piece of software or an algorithm) with a publication(s) which act as the primary reference(s) for that class. Allyson Lister has documentation The location from where the software can be downloaded. Allyson Lister Allyson Lister has download location Andy Brown The official date of release of software has release date An ontology of biological or bioinformatics concepts and relations, a controlled vocabulary, structured glossary etc. Ontologies Ontology A graphical 2D tabular representation of gene expression data, typically derived from a DNA microarray experiment. Heat map Heatmap Image, hybridisation or some other data arising from a study of gene expression, typically profiling or quantification. 
Gene product profile Gene product quantification data Gene transcription profile Gene transcription quantification data Microarray data Non-coding RNA profile Non-coding RNA quantification data RNA profile RNA quantification data RNA-seq data Transcriptome profile Transcriptome quantification data mRNA profile mRNA quantification data Gene expression data Microarray data Biological or biomedical data that has been rendered into an image, typically for display on screen. Image data Image Groupings of gene expression profiles according to a clustering algorithm. Clustered gene expression profiles PDBML Format of Taverna workflows. Taverna workflow format newick FASTA format Genbank entry format. GenBank GenBank format GFF html, or HyperText Markup Language in full, is a data format specification that is a markup language for web pages and is the publishing language of the World Wide Web. http://www.w3.org/TR/html401/ James Malone HTML Extensible Markup Language (XML) is a standard set of rules for encoding documents in a machine-readable form defined by the W3C. James Malone XML binary format OBO format is the text file format used by OBO-Edit, the open source, platform-independent application for viewing and editing ontologies. OBO flat file format OBO Flat File Format Systems Biology Markup Language (SBML) is a machine-readable format for representing models. It's oriented towards describing systems where biological entities are involved in, and modified by, processes that occur over time. Systems Biology Markup Language http://sbml.org SBML BED format The PSI-MI format is an acronym for the Proteomics Standards Initiative - Molecular Interaction format. It provides an XML standard for molecular interactions and is supported by many molecular interaction databases and tools. PSI-MI format PSI-MIF Modified from http://wiki.cytoscape.org/GettingStarted, accessed 20 June 2012. 
Allyson Lister MAGE-ML MAGE-TAB is a tab delimited data format comprising an ADF file for array design, an IDF for experimental design, an SDRF for sample data relationships, and associated data files. http://www.mged.org/mage-tab/ A simple spreadsheet-based, MIAME-supportive format for microarray data: MAGE-TAB. MAGE-TAB CopasiML, the native format of COPASI. CopasiML Search or query a data resource and retrieve entries and / or annotation. Database retrieval Query Query and retrieval Annotate a genome sequence with terms from a controlled vocabulary. Metagenome annotation Genome annotation Model or simulate some biological entity or system, typically using mathematical techniques including dynamical systems, statistical models, differential equations, and game theoretic models. Mathematical modelling Modelling and simulation data visualization algorithm data cleaning algorithm data selection algorithm data integration algorithm A data processing task is an information processing objective that specifies the objective that a data processing algorithm execution process needs to achieve when executed on a dataset to produce as output a new dataset. data processing task data construction algorithm Allyson Lister A data processing algorithm is an algorithm that has as its objective a data processing task and has data items both as input and output. data processing algorithm entity Entity Julius Caesar Verdi’s Requiem the Second World War your body mass index BFO 2 Reference: In all areas of empirical inquiry we encounter general terms of two sorts. First are general terms which refer to universals or types: animal; tuberculosis; surgical procedure; disease. Second, are general terms used to refer to groups of entities which instantiate a given universal but do not correspond to the extension of any subuniversal of that universal because there is nothing intrinsic to the entities in question by virtue of which they – and only they – are counted as belonging to the given group. 
Examples are: animal purchased by the Emperor; tuberculosis diagnosed on a Wednesday; surgical procedure performed on a patient from Stockholm; person identified as candidate for clinical trial #2056-555; person who is signatory of Form 656-PPV; painting by Leonardo da Vinci. Such terms, which represent what are called ‘specializations’ in [81 Entity doesn't have a closure axiom because the subclasses don't necessarily exhaust all possibilities. For example Werner Ceusters' 'portions of reality' include 4 sorts, entities (as BFO construes them), universals, configurations, and relations. It is an open question as to whether entities as construed in BFO will at some point also include these other portions of reality. See, for example, 'How to track absolutely everything' at http://www.referent-tracking.com/_RTU/papers/CeustersICbookRevised.pdf An entity is anything that exists or has existed or will exist. (axiom label in BFO2 Reference: [001-001]) entity continuant Continuant An entity that exists in full at any time in which it exists at all, persists through time while maintaining its identity and has no temporal parts. BFO 2 Reference: Continuant entities are entities which can be sliced to yield parts only along the spatial dimension, yielding for example the parts of your table which we call its legs, its top, its nails. ‘My desk stretches from the window to the door. It has spatial parts, and can be sliced (in space) in two. With respect to time, however, a thing is a continuant.’ [60, p. 240 Continuant doesn't have a closure axiom because the subclasses don't necessarily exhaust all possibilities. For example, in an expansion involving bringing in some of Ceusters' other portions of reality, questions are raised as to whether universals are continuants A continuant is an entity that persists, endures, or continues to exist through time while maintaining its identity. 
(axiom label in BFO2 Reference: [008-002]) if b is a continuant and if, for some t, c has_continuant_part b at t, then c is a continuant. (axiom label in BFO2 Reference: [126-001]) if b is a continuant and if, for some t, c is continuant_part of b at t, then c is a continuant. (axiom label in BFO2 Reference: [009-002]) if b is a material entity, then there is some temporal interval (referred to below as a one-dimensional temporal region) during which b exists. (axiom label in BFO2 Reference: [011-002]) (forall (x y) (if (and (Continuant x) (exists (t) (continuantPartOfAt y x t))) (Continuant y))) // axiom label in BFO2 CLIF: [009-002] (forall (x y) (if (and (Continuant x) (exists (t) (hasContinuantPartOfAt y x t))) (Continuant y))) // axiom label in BFO2 CLIF: [126-001] (forall (x) (if (Continuant x) (Entity x))) // axiom label in BFO2 CLIF: [008-002] (forall (x) (if (MaterialEntity x) (exists (t) (and (TemporalRegion t) (existsAt x t))))) // axiom label in BFO2 CLIF: [011-002] continuant occurrent Occurrent An entity that has temporal parts and that happens, unfolds or develops through time. BFO 2 Reference: every occurrent that is not a temporal or spatiotemporal region is s-dependent on some independent continuant that is not a spatial region BFO 2 Reference: s-dependence obtains between every process and its participants in the sense that, as a matter of necessity, this process could not have existed unless these or those participants existed also. A process may have a succession of participants at different phases of its unfolding. Thus there may be different players on the field at different times during the course of a football game; but the process which is the entire game s-depends_on all of these players nonetheless. Some temporal parts of this process will s-depend_on only some of the players. Occurrent doesn't have a closure axiom because the subclasses don't necessarily exhaust all possibilities. 
An example would be the sum of a process and the process boundary of another process. Simons uses different terminology for relations of occurrents to regions: Denote the spatio-temporal location of a given occurrent e by 'spn[e]' and call this region its span. We may say an occurrent is at its span, in any larger region, and covers any smaller region. Now suppose we have fixed a frame of reference so that we can speak not merely of spatio-temporal but also of spatial regions (places) and temporal regions (times). The spread of an occurrent, (relative to a frame of reference) is the space it exactly occupies, and its spell is likewise the time it exactly occupies. We write 'spr[e]' and `spl[e]' respectively for the spread and spell of e, omitting mention of the frame. An occurrent is an entity that unfolds itself in time or it is the instantaneous boundary of such an entity (for example a beginning or an ending) or it is a temporal or spatiotemporal region which such an entity occupies_temporal_region or occupies_spatiotemporal_region. (axiom label in BFO2 Reference: [077-002]) Every occurrent occupies_spatiotemporal_region some spatiotemporal region. (axiom label in BFO2 Reference: [108-001]) b is an occurrent entity iff b is an entity that has temporal parts. (axiom label in BFO2 Reference: [079-001]) (forall (x) (if (Occurrent x) (exists (r) (and (SpatioTemporalRegion r) (occupiesSpatioTemporalRegion x r))))) // axiom label in BFO2 CLIF: [108-001] (forall (x) (iff (Occurrent x) (and (Entity x) (exists (y) (temporalPartOf y x))))) // axiom label in BFO2 CLIF: [079-001] occurrent ic IndependentContinuant a chair a heart a leg a molecule a spatial region an atom an orchestra. an organism the bottom right portion of a human torso the interior of your mouth A continuant that is a bearer of quality and realizable entity entities, in which other entities inhere and which itself cannot inhere in anything. b is an independent continuant = Def. 
b is a continuant which is such that there is no c and no t such that b s-depends_on c at t. (axiom label in BFO2 Reference: [017-002]) For any independent continuant b and any time t there is some spatial region r such that b is located_in r at t. (axiom label in BFO2 Reference: [134-001]) For every independent continuant b and time t during the region of time spanned by its life, there are entities which s-depends_on b during t. (axiom label in BFO2 Reference: [018-002]) (forall (x t) (if (IndependentContinuant x) (exists (r) (and (SpatialRegion r) (locatedInAt x r t))))) // axiom label in BFO2 CLIF: [134-001] (forall (x t) (if (and (IndependentContinuant x) (existsAt x t)) (exists (y) (and (Entity y) (specificallyDependsOnAt y x t))))) // axiom label in BFO2 CLIF: [018-002] (iff (IndependentContinuant a) (and (Continuant a) (not (exists (b t) (specificallyDependsOnAt a b t))))) // axiom label in BFO2 CLIF: [017-002] independent continuant process Process a process of cell-division, a beating of the heart a process of meiosis a process of sleeping the course of a disease the flight of a bird the life of an organism your process of aging. An occurrent that has temporal proper parts and for some time t, p s-depends_on some material entity at t. p is a process = Def. p is an occurrent that has temporal proper parts and for some time t, p s-depends_on some material entity at t. (axiom label in BFO2 Reference: [083-003]) BFO 2 Reference: The realm of occurrents is less pervasively marked by the presence of natural units than is the case in the realm of independent continuants. Thus there is here no counterpart of ‘object’. In BFO 1.0 ‘process’ served as such a counterpart. In BFO 2.0 ‘process’ is, rather, the occurrent counterpart of ‘material entity’. 
Those natural – as contrasted with engineered, which here means: deliberately executed – units which do exist in the realm of occurrents are typically either parasitic on the existence of natural units on the continuant side, or they are fiat in nature. Thus we can count lives; we can count football games; we can count chemical reactions performed in experiments or in chemical manufacturing. We cannot count the processes taking place, for instance, in an episode of insect mating behavior.Even where natural units are identifiable, for example cycles in a cyclical process such as the beating of a heart or an organism’s sleep/wake cycle, the processes in question form a sequence with no discontinuities (temporal gaps) of the sort that we find for instance where billiard balls or zebrafish or planets are separated by clear spatial gaps. Lives of organisms are process units, but they too unfold in a continuous series from other, prior processes such as fertilization, and they unfold in turn in continuous series of post-life processes such as post-mortem decay. Clear examples of boundaries of processes are almost always of the fiat sort (midnight, a time of death as declared in an operating theater or on a death certificate, the initiation of a state of war) (iff (Process a) (and (Occurrent a) (exists (b) (properTemporalPartOf b a)) (exists (c t) (and (MaterialEntity c) (specificallyDependsOnAt a c t))))) // axiom label in BFO2 CLIF: [083-003] process realizable RealizableEntity the disposition of this piece of metal to conduct electricity. the disposition of your blood to coagulate the function of your reproductive organs the role of being a doctor the role of this boundary to delineate where Utah and Colorado meet To say that b is a realizable entity is to say that b is a specifically dependent continuant that inheres in some independent continuant which is not a spatial region and is of a type instances of which are realized in processes of a correlated type. 
(axiom label in BFO2 Reference: [058-002]) All realizable dependent continuants have independent continuants that are not spatial regions as their bearers. (axiom label in BFO2 Reference: [060-002]) (forall (x t) (if (RealizableEntity x) (exists (y) (and (IndependentContinuant y) (not (SpatialRegion y)) (bearerOfAt y x t))))) // axiom label in BFO2 CLIF: [060-002] (forall (x) (if (RealizableEntity x) (and (SpecificallyDependentContinuant x) (exists (y) (and (IndependentContinuant y) (not (SpatialRegion y)) (inheresIn x y)))))) // axiom label in BFO2 CLIF: [058-002] realizable entity sdc SpecificallyDependentContinuant Reciprocal specifically dependent continuants: the function of this key to open this lock and the mutually dependent disposition of this lock: to be opened by this key of one-sided specifically dependent continuants: the mass of this tomato of relational dependent continuants (multiple bearers): John’s love for Mary, the ownership relation between John and this statue, the relation of authority between John and his subordinates. the disposition of this fish to decay the function of this heart: to pump blood the mutual dependence of proton donors and acceptors in chemical reactions [79 the mutual dependence of the role predator and the role prey as played by two organisms in a given interaction the pink color of a medium rare piece of grilled filet mignon at its center the role of being a doctor the shape of this hole. the smell of this portion of mozzarella b is a specifically dependent continuant = Def. b is a continuant & there is some independent continuant c which is not a spatial region and which is such that b s-depends_on c at every time t during the course of b’s existence. (axiom label in BFO2 Reference: [050-003]) Specifically dependent continuant doesn't have a closure axiom because the subclasses don't necessarily exhaust all possibilites. 
We're not sure what else will develop here, but for example there are questions such as what are promises, obligation, etc. (iff (SpecificallyDependentContinuant a) (and (Continuant a) (forall (t) (if (existsAt a t) (exists (b) (and (IndependentContinuant b) (not (SpatialRegion b)) (specificallyDependsOnAt a b t))))))) // axiom label in BFO2 CLIF: [050-003] specifically dependent continuant role Role John’s role of husband to Mary is dependent on Mary’s role of wife to John, and both are dependent on the object aggregate comprising John and Mary as member parts joined together through the relational quality of being married. the priest role the role of a boundary to demarcate two neighboring administrative territories the role of a building in serving as a military target the role of a stone in marking a property boundary the role of subject in a clinical trial the student role BFO 2 Reference: One major family of examples of non-rigid universals involves roles, and ontologies developed for corresponding administrative purposes may consist entirely of representatives of entities of this sort. Thus ‘professor’, defined as follows: b instance_of professor at t =Def. there is some c, c instance_of professor role & c inheres_in b at t. denotes a non-rigid universal and so also do ‘nurse’, ‘student’, ‘colonel’, ‘taxpayer’, and so forth. (These terms are all, in the jargon of philosophy, phase sortals.) 
By using role terms in definitions, we can create a BFO conformant treatment of such entities drawing on the fact that, while an instance of professor may be simultaneously an instance of trade union member, no instance of the type professor role is also (at any time) an instance of the type trade union member role (any more than any instance of the type color is at any time an instance of the type length). If an ontology of employment positions should be defined in terms of roles following the above pattern, this enables the ontology to do justice to the fact that individuals instantiate the corresponding universals – professor, sergeant, nurse – only during certain phases in their lives. b is a role means: b is a realizable entity & b exists because there is some single bearer that is in some special physical, social, or institutional set of circumstances in which this bearer does not have to be & b is not such that, if it ceases to exist, then the physical make-up of the bearer is thereby changed. (axiom label in BFO2 Reference: [061-001]) (forall (x) (if (Role x) (RealizableEntity x))) // axiom label in BFO2 CLIF: [061-001] role gdc GenericallyDependentContinuant The entries in your database are patterns instantiated as quality instances in your hard drive. The database itself is an aggregate of such patterns. When you create the database you create a particular instance of the generically dependent continuant type database. Each entry in the database is an instance of the generically dependent continuant type IAO: information content entity. the pdf file on your laptop, the pdf file that is a copy thereof on my laptop the sequence of this protein molecule; the sequence that is a copy thereof in that protein molecule. A continuant that is dependent on one or other independent continuant bearers. For every instance of A requires some instance of (an independent continuant type) B but which instance of B serves can change from time to time. 
b is a generically dependent continuant = Def. b is a continuant that g-depends_on one or more other entities. (axiom label in BFO2 Reference: [074-001]) (iff (GenericallyDependentContinuant a) (and (Continuant a) (exists (b t) (genericallyDependsOnAt a b t)))) // axiom label in BFO2 CLIF: [074-001] generically dependent continuant material MaterialEntity Collection of random bacteria, a chair, dorsal surface of the body. a flame a forest fire a human being a hurricane a photon a puff of smoke a sea wave a tornado an aggregate of human beings. an energy wave an epidemic the undetached arm of a human being An independent continuant [snap:IndependentContinuant] that is spatially extended whose identity is independent of that of other entities and can be maintained through time. Note: Material entity [snap:MaterialEntity] subsumes object [snap:Object], fiat object part [snap:FiatObjectPart], and object aggregate [snap:ObjectAggregate], which assume a three level theory of granularity, which is inadequate for some domains, such as biology. An independent continuant that is spatially extended whose identity is independent of that of other entities and can be maintained through time. BFO 2 Reference: Material entities (continuants) can preserve their identity even while gaining and losing material parts. Continuants are contrasted with occurrents, which unfold themselves in successive temporal parts or phases [60 BFO 2 Reference: Object, Fiat Object Part and Object Aggregate are not intended to be exhaustive of Material Entity. Users are invited to propose new subcategories of Material Entity. BFO 2 Reference: ‘Matter’ is intended to encompass both mass and energy (we will address the ontological treatment of portions of energy in a later version of BFO). 
A portion of matter is anything that includes elementary particles among its proper or improper parts: quarks and leptons, including electrons, as the smallest particles thus far discovered; baryons (including protons and neutrons) at a higher level of granularity; atoms and molecules at still higher levels, forming the cells, organs, organisms and other material entities studied by biologists, the portions of rock studied by geologists, the fossils studied by paleontologists, and so on. Material entities are three-dimensional entities (entities extended in three spatial dimensions), as contrasted with the processes in which they participate, which are four-dimensional entities (entities extended also along the dimension of time). According to the FMA, material entities may have immaterial entities as parts – including the entities identified below as sites; for example the interior (or ‘lumen’) of your small intestine is a part of your body. BFO 2.0 embodies a decision to follow the FMA here. BFO A material entity is an independent continuant that has some portion of matter as proper or improper continuant part. (axiom label in BFO2 Reference: [019-002]) Every entity which has a material entity as continuant part is a material entity. (axiom label in BFO2 Reference: [020-002]) every entity of which a material entity is continuant part is also a material entity. (axiom label in BFO2 Reference: [021-002]) (forall (x) (if (MaterialEntity x) (IndependentContinuant x))) // axiom label in BFO2 CLIF: [019-002] (forall (x) (if (and (Entity x) (exists (y t) (and (MaterialEntity y) (continuantPartOfAt x y t)))) (MaterialEntity x))) // axiom label in BFO2 CLIF: [021-002] (forall (x) (if (and (Entity x) (exists (y t) (and (MaterialEntity y) (continuantPartOfAt y x t)))) (MaterialEntity x))) // axiom label in BFO2 CLIF: [020-002] material entity material entity A language in which source code is written, intended to be executed/run by a software interpreter. 
Programming languages are ways to write instructions that specify what to do, and sometimes, how to do it. IAO programming language data item Data items include counts of things, analyte concentrations, and statistical summaries. A data item is an information content entity that is intended to be a truthful statement about something (modulo, e.g., measurement precision or other systematic errors) and is constructed/acquired by a method which reliably tends to produce (approximately) truthful statements. 2/2/2009 Alan and Bjoern discussing FACS run output data. This is a data item because it is about the cell population. Each element records an event and is typically further composed of a set of measurement data items that record the fluorescent intensity stimulated by one of the lasers. 2009-03-16: data item deliberately ambiguous: we merged data set and datum to be one entity, not knowing how to define singular versus plural. So data item is more general than datum. 2009-03-16: removed datum as alternative term as datum specifically refers to singular form, and is thus not an exact synonym. 2014-03-31: See discussion at http://odontomachus.wordpress.com/2014/03/30/aboutness-objects-propositions/ JAR: datum -- well, this will be very tricky to define, but maybe some information-like stuff that might be put into a computer and that is meant, by someone, to denote and/or to be interpreted by some process... I would include lists, tables, sentences... I think I might defer to Barry, or to Brian Cantwell Smith JAR: A data item is an approximately justified approximately true approximate belief PERSON: Alan Ruttenberg PERSON: Chris Stoeckert PERSON: Jonathan Rees data data item http://www.ontobee.org/browser/rdf.php?o=IAO&iri=http://purl.obolibrary.org/obo/IAO_0000027 symbol a serial number such as "12324X" a stop sign a written proper name such as "OBI" An information content entity that is a mark(s) or character(s) used as a conventional representation of another entity.
20091104, MC: this needs work and will most probably change 2014-03-31: We would like to have a deeper analysis of 'mark' and 'sign' in the future (see https://github.com/information-artifact-ontology/IAO/issues/154). PERSON: James A. Overton PERSON: Jonathan Rees based on Oxford English Dictionary symbol information content entity Examples of information content entities include journal articles, data, graphical layouts, and graphs. A generically dependent continuant that is about some thing. An information content entity is an entity that is generically dependent on some artifact and stands in relation of aboutness to some entity. 2014-03-10: The use of "thing" is intended to be general enough to include universals and configurations (see https://groups.google.com/d/msg/information-ontology/GBxvYZCk1oc/-L6B5fSBBTQJ). information_content_entity 'is_encoded_in' some digital_entity in obi before split (040907). information_content_entity 'is_encoded_in' some physical_document in obi before split (040907). Previous. An information content entity is a non-realizable information entity that 'is encoded in' some digital or physical entity. PERSON: Chris Stoeckert IAO OBI_0000142 information content entity information content entity An information content entity whose concretizations indicate to their bearer how to realize them in a process. 2009-03-16: provenance: a term realizable information entity was proposed for OBI (OBI_0000337), edited by the PlanAndPlannedProcess branch. Original definition was "is the specification of a process that can be concretized and realized by an actor" with alternative term "instruction". It has been subsequently moved to IAO where the objective for which the original term was defined was satisfied with the definition of this, different, term.
2013-05-30 Alan Ruttenberg: What differentiates a directive information entity from an information concretization is that it can have concretizations that are either qualities or realizable entities. The concretizations that are realizable entities are created when an individual chooses to take up the direction, i.e. has the intention to (try to) realize it. 8/6/2009 Alan Ruttenberg: Changed label from "information entity about a realizable" after discussions at ICBO Werner pushed back on calling it realizable information entity as it isn't realizable. However this name isn't right either. An example would be a recipe. The realizable entity would be a plan, but the information entity isn't about the plan, it, once concretized, *is* the plan. -Alan PERSON: Alan Ruttenberg PERSON: Bjoern Peters directive information entity algorithm PMID: 18378114. Genomics. 2008 Mar 28. LINKGEN: A new algorithm to process data in genetic linkage studies. A plan specification which describes inputs, output of mathematical functions as well as workflow of execution for achieving a predefined objective. Algorithms are realized usually by means of implementation as computer programs for execution by automata. A plan specification which describes the inputs and output of mathematical functions as well as workflow of execution for achieving a predefined objective. Algorithms are realized usually by means of implementation as computer programs for execution by automata. An algorithm is a set of instructions for performing a particular calculation. Philippe Rocca-Serra PlanAndPlannedProcess Branch IAO OBI_0000270 adapted from discussion on OBI list (Matthew Pocock, Christian Cocos, Alan Ruttenberg) algorithm algorithm An algorithm is a set of instructions for performing a particular calculation. data format specification A data format specification is the information content borne by the document published defining the specification.
Example: The ISO document specifying what encompasses an XML document; The instructions in an XSD file The syntax by which data is specified which renders it valid for a given format. 2009-03-16: provenance: term imported from OBI_0000187, which had original definition "A data format specification is a plan which organizes information. Example: The ISO document specifying what encompasses an XML document; The instructions in an XSD file" PERSON: Alan Ruttenberg PlanAndPlannedProcess Branch OBI branch derived OBI_0000187 data format specification The syntax by which data is specified which renders it valid for a given format. plan specification PMID: 18323827. Nat Med. 2008 Mar;14(3):226. New plan proposed to help resolve conflicting medical advice. A directive information entity with action specifications and objective specifications as parts that, when concretized, is realized in a process in which the bearer tries to achieve the objectives by taking the actions specified. 2009-03-16: provenance: a term a plan was proposed for OBI (OBI_0000344), edited by the PlanAndPlannedProcess branch. Original definition was "a plan is a specification of a process that is realized by an actor to achieve the objective specified as part of the plan". It has been subsequently moved to IAO where the objective for which the original term was defined was satisfied with the definition of this, different, term. 2014-03-31: A plan specification can have other parts, such as conditional specifications. Alternative previous definition: a plan is a set of instructions that specify how an objective should be achieved Alan Ruttenberg OBI Plan and Planned Process branch OBI_0000344 2/3/2009 Comment from OBI review. Action specification not well enough specified. Conditional specification not well enough specified. Question whether all plan specifications have objective specifications.
Request that IAO either clarify these or change definitions not to use them plan specification version number A version number is an information content entity which is a sequence of characters borne by part of each of a class of manufactured products or its packaging and indicates its order within a set of other products having the same name. Note: we feel that at the moment we are happy with a general version number, and that we will subclass as needed in the future. For example, see 7. genome sequence version GROUP: IAO version name version number planned process planned process Injecting mice with a vaccine in order to test its efficacy A processual entity that realizes a plan which is the concretization of a plan specification. 'Plan' includes a future direction sense. That can be problematic if plans are changed during their execution. There are however implicit contingencies for protocols that an agent has in his mind that can be considered part of the plan, even if the agent didn't have them in mind before. Therefore, a planned process can diverge from what the agent would have said the plan was before executing it, by adjusting to problems encountered during execution (e.g. choosing another reagent with equivalent properties, if the originally planned one has run out.) We are only considering successfully completed planned processes. A plan may be modified, and details added during execution. For a given planned process, the associated realized plan specification is the one encompassing all changes made during execution. This means that every process in which an agent acts towards achieving some objectives is a planned process. Bjoern Peters branch derived 6/11/9: Edited at workshop. Used to include: is initiated by an agent This class merges the previously separated objective driven process and planned process, as the separation proved hard to maintain. (1/22/09, branch call) planned process organization PMID: 16353909. AAPS J.
2005 Sep 22;7(2):E274-80. Review. The joint food and agriculture organization of the United Nations/World Health Organization Expert Committee on Food Additives and its role in the evaluation of the safety of veterinary drug residues in foods. An entity that can bear roles, has members, and has a set of organization rules. Members of organizations are either organizations themselves or individual people. Members can bear specific organization member roles that are determined in the organization rules. The organization rules also determine how decisions are made on behalf of the organization by the organization members. BP: The definition summarizes long email discussions on the OBI developer, roles, biomaterial and denrie branches. It leaves open if an organization is a material entity or a dependent continuant, as no consensus was reached on that. The current placement as material is therefore temporary, in order to move forward with development. Here is the entire email summary, on which the definition is based: 1) there are organization_member_roles (president, treasurer, branch editor), with individual persons as bearers 2) there are organization_roles (employer, owner, vendor, patent holder) 3) an organization has a charter / rules / bylaws, which specify what roles there are, how they should be realized, and how to modify the charter/rules/bylaws themselves. It is debatable what the organization itself is (some kind of dependent continuant or an aggregate of people). This also determines who/what the bearer of organization_roles' are. My personal favorite is still to define organization as a kind of 'legal entity', but thinking it through leads to all kinds of questions that are clearly outside the scope of OBI. 
Interestingly enough, it does not seem to matter much where we place organization itself, as long as we can subclass it (University, Corporation, Government Agency, Hospital), instantiate it (Affymetrix, NCBI, NIH, ISO, W3C, University of Oklahoma), and have it play roles. This leads to my proposal: We define organization through the statements 1 - 3 above, but without an 'is a' statement for now. We can leave it in its current place in the is_a hierarchy (material entity) or move it up to 'continuant'. We leave further clarifications to BFO, and close this issue for now. PERSON: Alan Ruttenberg PERSON: Bjoern Peters PERSON: Philippe Rocca-Serra PERSON: Susanna Sansone GROUP: OBI organization organization data transformation The application of a clustering protocol to microarray data or the application of a statistical testing method on a primary data set to determine a p-value. A planned process that produces output data from input data. Elisabetta Manduchi Helen Parkinson James Malone Melanie Courtot Philippe Rocca-Serra Richard Scheuermann Ryan Brinkman Tina Hernandez-Boussard data analysis data processing Branch editors data transformation data visualization Generation of a heatmap from a microarray dataset A planned process that creates images, diagrams or animations from the input data. Elisabetta Manduchi James Malone Melanie Courtot Tina Boussard data encoding as image visualization PERSON: Elisabetta Manduchi PERSON: James Malone PERSON: Melanie Courtot PERSON: Tina Boussard Possible future hierarchy might include this: information_encoding >data_encoding >>image_encoding data visualization Computer software, or generally just software, is any set of machine-readable instructions (most often in the form of a computer program) that conform to a given syntax (sometimes referred to as a language) that is interpretable by a given processor and that directs a computer's processor to perform specific operations.
James Malone Modified in parts from https://en.wikipedia.org/wiki/Software Robert Stevens software A licence is a legal instrument (usually by way of contract law, with or without printed material) governing the use or redistribution of the resource containing the licence. Modified from http://en.wikipedia.org/wiki/Software_license James Malone licence software license Racket is a Scheme-based language interpreter and programming environment Racket Java binning clustering method: The algorithm uses single-linkage clustering to join compounds into similarity groups, where every member in a cluster shares with at least one other member a similarity value above a user-specified threshold. The algorithm is optimized for speed and memory efficiency by avoiding the calculation of an all-against-all distance matrix. Binning clustering method This role can be borne by any software which is a plugin for another piece of software. It can be used, for example, with an axiom such as "ParentSoftware 'uses software' some (PluginSoftware has_role some Plugin)" Allyson Lister Plugin CEL binary format is a binary data format specification created by Affymetrix where values are stored in little-endian format. The CEL format stores the results of the intensity calculations on the pixel values of the DAT file. This includes an intensity value, standard deviation of the intensity, the number of pixels used to calculate the intensity value, a flag to indicate an outlier as calculated by the algorithm and a user defined flag indicating the feature should be excluded from future analysis. The file stores the previously stated data for each feature on the probe array. http://www.affymetrix.com/support/developer/powertools/changelog/gcos-agcc/cel.html, accessed 22 May 2013. Allyson Lister It is unclear whether, originally, this format was intended to be the ASCII or binary version of the CEL format.
As such, this class has been renamed as the binary format, and a new class created (CEL ASCII format) to make both options available. (22/05/2013) CEL binary format http://dbpedia.org/resource/ActionScript ActionScript http://dbpedia.org/resource/Ada_(programming_language) Ada http://dbpedia.org/resource/AppleScript AppleScript Assembly http://dbpedia.org/resource/C_(programming_language) C http://dbpedia.org/resource/C_Sharp_(programming_language) C Sharp http://dbpedia.org/resource/C++ C++ http://dbpedia.org/resource/COBOL COBOL http://dbpedia.org/resource/ColdFusion_Markup_Language ColdFusion http://dbpedia.org/resource/D_(programming_language) D Delphi http://dbpedia.org/resource/Dylan_(programming_language) Dylan http://dbpedia.org/resource/Eiffel_(programming_language) Eiffel http://dbpedia.org/resource/Forth_(programming_language) Forth http://dbpedia.org/resource/Fortran Fortran http://dbpedia.org/resource/Groovy_(programming_language) Groovy http://dbpedia.org/resource/Haskell_(programming_language) Haskell http://dbpedia.org/resource/JavaScript JavaScript LabVIEW http://dbpedia.org/resource/Lisp_(programming_language) Lisp http://dbpedia.org/resource/Lua_(programming_language) Lua Maple Mathematica http://dbpedia.org/resource/Pascal_(programming_language) Pascal http://dbpedia.org/resource/Perl Perl http://dbpedia.org/resource/PHP PHP http://dbpedia.org/resource/Prolog Prolog Python is a widely used general-purpose, high-level programming language. Its design philosophy emphasizes code readability, and its syntax allows programmers to express concepts in fewer lines of code than would be possible in languages such as C. The language provides constructs intended to enable clear programs on both a small and large scale.Python supports multiple programming paradigms, including object-oriented, imperative and functional programming or procedural styles. http://dbpedia.org/resource/Python_(programming_language), accessed 27 November 2014. 
Python http://dbpedia.org/resource/REXX REXX http://dbpedia.org/resource/Ruby_(programming_language) Ruby SAS http://dbpedia.org/resource/Scala_(programming_language) Scala http://dbpedia.org/resource/Scheme_(programming_language) Scheme Shell http://dbpedia.org/resource/Smalltalk Smalltalk http://dbpedia.org/resource/SQL SQL http://dbpedia.org/resource/Turing_(programming_language) Turing http://dbpedia.org/resource/Verilog Verilog http://dbpedia.org/resource/VHDL VHDL http://dbpedia.org/resource/Visual_Basic Visual Basic MLXTRAN NMTRAN Web content search is the searching for information on the World Wide Web. website content search web content search The storing of digital information. Often this is done for archiving and retrieval purposes. data storage Version 2.6 of the Python programming language. Allyson Lister Allyson Lister Python 2.6 Version 2.7 of the Python programming language. Allyson Lister Allyson Lister Python 2.7 Firth's bias reduction procedure GNU Octave is a high-level interpreted language, primarily intended for numerical computations. It provides capabilities for the numerical solution of linear and nonlinear problems, and for performing other numerical experiments. It also provides extensive graphics capabilities for data visualization and manipulation. Octave is normally used through its interactive command line interface, but it can also be used to write non-interactive programs. The Octave language is quite similar to Matlab so that most programs are easily portable. https://www.gnu.org/software/octave/, accessed 27 March 2015. 
Allyson Lister GNU Octave GWT S language Excel Modified version of the GLAD algorithm (Gain and Loss Analysis of DNA) Gene Recommender algorithm Gamma-Gamma hierarchical model Gaussian locally weighted regression Gene-Set Enrichment Analysis Hexagon binning algorithm Heterogeneous Error Model (HEM) Hubert’s gamma Hidden Markov Model Hierarchical Ordered Partitioning and Collapsing Hybrid (HOPACH) algorithm Technique which predicts a given transcription factor activity and then uses this information to predict its targets. Hidden Variable Dynamic Modelling (HVDM) HaarSeg algorithm Hardy-Weinberg equilibrium Hypergeometric probability Iteratively ReWeighted Least Squares ILLUMINA data Jaccard’s index KGML 'KLD' KS measures how biased the ranks of a subset of items are among the ranks of the entire set Kolmogorov Smirnov rank-sum based algorithm LC-MS data Locally Moderated Weighted-t (LMW) method Lognormal Normal Model Lognormal Normal with Modified Variance Model A publisher role is a role borne by an organization or individual in which they are responsible for making software available to a particular consumer group. Such organizations or individuals do need to be involved in the development of the software. James Malone publisher role Langmuir Isotherm Laplace mixture model Library Search Algorithm Loess algorithm Logic regression Median Average Difference Algorithm MAQC data MATLAB language Multivariate correlation estimator Markov Chain Monte Carlo minimum common regions (MCR) algorithm: Minimal common regions (MCRs) are defined as contiguous spans having at least a recurrence rate defined by a parameter (recurrence) across samples. MCR algorithm 'MI' Mutual information matrix (MIM) MMD describes the distributions of gene expression levels directly via the marginal distributions.
It includes EM algorithm, FDR, it is the percentage of nondifferentially expressed genes among selected genes), false non-discovery rate (denoted as FNDR; it is the percentage of differentially expressed genes among unselected genes),false positive rate (denoted as FPR; it is the percentage of selected genes among nondifferentially expressed genes), and false negative rate (denoted as FNR; it is the percentage of un-selected genes among differentially expressed genes), MMD Mahalanobis distance Misclassification-Penalized Posteriors (MiPP) Mixed model equations 'Needleman-Wunsch' This includes exhaustive enumeration, triple-based inference,pairwise heuristic, module based inference, greedy hillclimbing Nested Effects Models Nonlinear Estimation by Iterative Partial Least Squares Presence-Absence calls with Negative Probesets (PANP) Pearson correlation estimator it is where we fit a model with probe level and chip level parameters on a probeset by probeset basis 'PLM' Probe level Locally moderated Weighted median-t (PLW) method PPC algorithm Included are summarisation, differential expression detection, clustering and PCA methods, together with useful plotting and data manipulation functions Propagation of uncertainty in microarray analysis Power Law Global Error Model (PLGEM) analysis method PLIER (Probe Logarithmic Error Intensity Estimate) method Radial basis function R interface to boost graph library algorithm (RBGL) Random effects model for the Support Vector Machine (SVM), as presented in [3] and the Nearest Shrunken Centroid (NSC) Recursive Feature Elimination (RFE) Regression model RMA RMA+ RMA++ Rnw Robust likelihood-based survival modeling S-Score algorithm Serial Analysis of Gene Expression (SAGE) Note: It is unclear from just the label what is meant by a SAM algorithm. 
It may or may not be related to the SAM sequence alignment software described by the class SWO_0000077 (Allyson Lister) SAM SBMLR format SDF format SNPRMA algorithm Semantic Similarity Measures: Four methods proposed by Resnik [Philip, 1999], Jiang [Jiang and Conrath, 1997], Lin [Lin, 1998] and Schlicker [Schlicker et al., 2006] respectively have been presented to determine the semantic similarities of two GO terms based on the annotation statistics of their common ancestor terms. Wang [Wang et al., 2007] proposed a new method to measure the similarities based on the graph structure of GO. Semantic Similarity Measures SVDimpute algorithm Software developer role is a role borne by an organization or individual in which they are responsible for authoring software. James Malone software developer role An organization or legal entity (including single person) that is responsible for developing software. Developing includes aspects of design, coding and testing. software developer organization An organization or legal entity (including single person) that is responsible for publishing software. Publishing here includes tasks such as designing and producing physical products, technical customer support, licensing arrangements and marketing. software publisher organization Classical multivariate analysis-of-variance tests perform poorly in cases with several highly correlated responses and the tests collapse when the number of responses exceeds the number of observations. This paper presents a new method which handles this problem. The dimensionality of the data is reduced by using principal component decompositions and the final tests are still based on the classical test statistics and their distributions. The methodology is illustrated with an example from the production of sausages with responses from near infrared reflectance spectroscopy. A closely related method for testing relationships in uniresponse regression with collinear explanatory variables is also presented.
The new test, which is called the 50-50 F-test, uses the first k components to calculate SSMODEL. The next d components are not involved in SSERROR and they are called buffer components. Langsrud, Ø. (2002), 50-50 Multivariate Analysis of Variance for Collinear Responses, The Statistician, 51, 305-317. 50-50 MANOVA algorithm Gene list Clustered data set R data frame R language Signaling Pathway Impact Analysis (SPIA) algorithm Similarity score 'Smith-Waterman' Theodore Ts’o’s Variance-stabilizing transformation (VST) algorithm WilcEbam Wilcoxon Xba.CQV and Xba.regions data annotation Annotation data packages Associative T method AvgNRRs Bitmap object Bootstrap cdt chamber slide format Categorical (e.g tumor vs normal) class file format http://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats#CLS:_Categorical_.28e.g_tumor_vs_normal.29_class_file_format_.28.2A.cls.29 cls dataset comparison Concordance covdesc file An objective in which the aim is to create a new database instance. PERSON: James Malone James Malone database creation .data debian control file format dcf design file Dynamic programming algorithm f-test Gene Cluster Text file format http://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats#GCT:_Gene_Cluster_Text_file_format_.28.2A.gct.29 gct Gene array analysis algorithm gene expression analysis Gene expression dataset global test allows the unit of analysis of the microarray experiment to be shifted from the single gene level to the pathway level, where a ‘pathway’ may be any set of genes, e.g. chosen using the Gene Ontology database or from earlier experiments.
Global test gmt format GenePix Pro Results file format gpr format gtr gxl format Hierarchical clustering HTML report Hypergeometric enrichment Iterative local regression and model selection k-cores k-means k-nearest neighbour classification Likelihood method Linear modelling lma Local-pooled-error log file logicFS dataset Logit-t algorithm 'MAS5' mas5 format M-estimation regression Meta data mrnet algorithm Multinomial probit regression with Gaussian Process priors multiple testing which includes controlling the family-wise error rate (FWER), generalized family-wise error rate (gFWER), tail probability of the proportion of false positives (TPPFP), and false discovery rate (FDR). Tests based on a variety of t- and F-statistics (including t-statistics based on regression parameters from linear and survival models as well as those based on correlation parameters) are included. Multiple testing Multivariate t mixture models Negative binomial distribution Neural networks models OMICS data pair file parse pedigree data file Graph plot Position weight matrix (PWM) qPCR data Quantile normalization Quantile regression techniques Rank-invariant set normalization Rank product non-parametric method .raw files rda .rma format Sim method sproc sqlite A statistical test is an algorithm for making quantitative decisions to determine which outcomes of a study would lead to a rejection of the null hypothesis for a pre-specified level of significance. Modified from http://en.wikipedia.org/wiki/Statistical_test, accessed 9 May 2013; http://www.itl.nist.gov/div898/handbook/prc/section1/prc13.htm, accessed 9 May 2013. Statistical tests 't-test' Two-stage measurement error model 2-sample pooled t-test The ACME algorithm is quite straightforward. Using a user-defined quantile of the data, called the threshold, any probes in the data that are above that threshold are considered positive probes. A p-value is then assigned to each probe.
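The thresholding step in the ACME description above can be illustrated with a short sketch. This is not the ACME implementation: the function name `positive_probes`, the nearest-rank quantile rule and the default quantile are assumptions made for illustration, and the p-value assignment is left out because the description does not say how it is computed.

```python
def positive_probes(intensities, quantile=0.95):
    """Flag probes whose value lies above a user-defined quantile of the data.

    Hypothetical helper: the nearest-rank quantile rule used here is an
    assumption, not something stated in the ACME description.
    """
    if not intensities:
        raise ValueError("intensities must not be empty")
    if not 0.0 <= quantile <= 1.0:
        raise ValueError("quantile must lie between 0 and 1")

    ordered = sorted(intensities)
    # Nearest-rank position of the requested quantile, clamped to the last index.
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    threshold = ordered[index]

    # A probe is "positive" when its value is strictly above the threshold.
    flags = [value > threshold for value in intensities]
    return threshold, flags


if __name__ == "__main__":
    threshold, flags = positive_probes([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], quantile=0.8)
    print(threshold, flags)
```

A strict comparison is used because the description speaks of probes "above" the threshold; a different tie-handling rule would be an equally reasonable reading.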
'ACME' Average log expression across arrays (ALE) ALL/AML data set AMDIS The ANCOVA global test is a test for the association between expression values and clinical entities. The test is carried out by comparison of linear models via the extra sum of squares principle. If the mean expression level for at least one gene differs between corresponding models the global null hypothesis, which is the intersection of all single gene null hypotheses, is violated. FDR ANCOVA ANOVA or Analysis of Variance is a hypothesis testing algorithm in which the observed variance of a variable is partitioned into components attributable to different sources of variation. ANNOVA 'ANOVA' AP-MS data ARACNE algorithm ARR AWS algorithm BGL implements Depth First Search, Breadth First Search, Dijkstra's, Bellman Ford's and DAG, Johnson's and Floyd Warshall's. Kruskal's algorithm and Prim's algorithm Cuthill-McKee's algorithm Minimum degree Ordering BGL Iterative Bayesian Model Averaging (BMA) Base-Pair-Distance Kernel BPMAP is a binary data format specification created by Affymetrix where the data is stored in big-endian format. The BPMAP format contains information relating to the design of the Affymetrix tiling arrays. http://www.affymetrix.com/support/developer/powertools/changelog/gcos-agcc/bpmap.html, accessed 22 May 2013. Allyson Lister BPMAP BaldiLongT Bayesian Model The CDF binary format is a binary data format specification created by Affymetrix for faster access and smaller file size in comparison to the CDF ASCII format. The values in the file are stored in little-endian format. CDF binary format describes the layout for an Affymetrix GeneChip array. An array may contain Expression, Genotyping, CustomSeq, Copy Number and/or Tag probe sets. All probe set names within an array are unique. Multiple copies of a probe set may exist on a single array as long as each copy has a unique name. http://www.affymetrix.com/support/developer/powertools/changelog/gcos-agcc/cdf.html, accessed 22 May 2013.
It is unclear whether, originally, this format was intended to be the ASCII or binary version of the CDF format. As such, this class has been renamed as the binary format, and a new class created (CDF ASCII format) to make both options available. CDF binary format CMA: it implements k-fold cross validation, MCMC cross validation, bootstrap and (t.test or welch.test or wilcox.test or f.test or kruskal.test or One-step Recursive Feature Elimination or random forest variable importance measure or lasso or elastic net or componentwise boosting) CMA This algorithm offers improved confidence scores, quality scores for SNPs and batches, higher accuracy on different datasets and better performance. CRLMM algorithm Continuous Wavelet Transform (CWT)-based peak detection algorithm Category analysis PCMG: a bipartite graph in which one set of nodes represents proteins, the other set represents complexes, and an edge from a protein node to a complex node represents membership of the protein in that complex. PCMG Chi-square distance calculation BCRANK is a method that takes a ranked list of genomic regions as input and outputs short DNA sequences that are overrepresented in some part of the list. BCRANK cosmo allows the user to target the motif search by specifying a set of constraints that the unknown position weight matrix must satisfy. The algorithm is based on a probabilistic model that describes the DNA sequences of interest through a two-component multinomial mixture model with estimates of the position weight matrix entries obtained by maximizing the observed data likelihood over the smaller parameter space corresponding to the imposed constraints.
It includes methods such as Probabilistic models and one-occurrence-per-sequence and zero-or-one-occurrence-per-sequence and two-component mixture Cosmo F test Text data set CSV data set non-linear functional regression model with both additive and multiplicative error terms Non-linear functional regression model Complex Estimation Algorithm A DFP version of a FP (fuzzy pattern) only includes those genes that can serve to differentiate it from the rest of the patterns. This algorithm is based on the discretization of float values (gene expression values) stored in an ExpressionSet object into labels combining 'Low', 'Medium' and 'High' Discriminant Fuzzy Pattern Algorithm DFW Digital gene expression (DGE) datasets Expectation-Maximization (EM) algorithm Empirical Bayes rule FACS data is data which describes flow cytometry data sets. http://www.bioconductor.org/packages/2.12/bioc/vignettes/prada/inst/doc/fcs3.html, accessed 29 May 2013. Allyson Lister FACS data FARMS FC 'FDR' Fixed effect model Fischer's Exact Test GASSCO method GEO data type Gamma Gamma Model The circular binary segmentation (CBS) algorithm divides the genome into regions of equal copy number. The algorithm tests for change-points using a maximal t-statistic with a permutation reference distribution to obtain the corresponding p-value. The number of computations required for the maximal test statistic is O(N^2), where N is the number of markers. This makes the full permutation approach computationally prohibitive for the newer arrays that contain tens of thousands of markers and highlights the need for a faster algorithm. CBS CBS algorithm CHP binary format is a binary data format specification created by Affymetrix, stored in little-endian format and used to store expression, resequencing and genotyping results from algorithms implemented in the GCOS 1.2, 1.3 and 1.4 and BRLMM Analysis Tool software applications.
http://www.affymetrix.com/support/developer/powertools/changelog/gcos-agcc/chp.html, http://www.affymetrix.com/support/developer/powertools/changelog/gcos-agcc/chp-xda.html, accessed 22 May 2013. Allyson Lister As this is a child of Binary format, it is assumed that this class is intended to represent the binary form of this format. As described in the definition sources, there are other formats for CHP available. CHP binary format CLR algorithm FoxDimmicT GEO Matrix Series format A licence clause is a component of a licence which defines some aspect of a restriction or conversely permission in how something corresponding to a licence may be legally redistributed, partially redistributed, extended, modified or otherwise used in some way. James Malone clause licence clause An Attribution clause is a license clause intended to provide a specified level of recognition of the licensor as the copyright holder of the work. This can take many forms, including the requirement to preserve any copyright notice, attribution statements and the URL (link) to the original work. The attribution requirement thus serves the dual purpose of ensuring that the publisher receives appropriate credit, and that provenance information is kept intact. There are varying strengths of these clauses, from licensors requesting not to be attributed at all to requiring attribution for all uses of the resource. Modified by Allyson Lister from http://creativecommons.org/tag/attribution Attribution clause Derivatives clauses are license clauses which state what requirements on derivative resource, if any, are attached to a license. The license for a resource may or may not allow the creation of new resources derived from it. If it is allowed, such usage may be restricted in a variety of ways. Allyson Lister Derivatives clause A source code clause is a license clause which states the restrictions placed on the source code for the licensed software, if any. 
The license for a piece of software may or may not allow access to the source code. If such access is allowed, usage may be restricted in a variety of ways. Allyson Lister Source code clause A Platform clause is a license clause which states the platform restrictions for the licensed resource, if any. The license for a resource may or may not allow the use of any platform. If it is allowed, such usage may be restricted in a variety of ways. Allyson Lister Platform clause A Number of installations clause is a license clause which may limit the number of installations a particular licensee may perform. Allyson Lister Number of installations clause A Number of users clause is a license clause which may limit the number of users a particular licensee may allow to use the resource. Allyson Lister Number of users clause A time clause is a license clause which states the restrictions placed on the length of time the licensed resource may be used, if any. The license for a resource may or may not allow access to the resource for an unlimited time. If such access is allowed, usage may be restricted in a variety of ways. Allyson Lister Time clause A usage clause is a license clause which states the restrictions placed on how the licensed resource may be used. The license for a resource may restrict how the licensee may use the software. If such access is allowed, usage may be restricted in a variety of ways. Allyson Lister Usage clause No restrictions on derivatives is a derivatives clause which places no rules or restrictions on how derivative software is created. Allyson Lister No restrictions on derivatives Restrictions on derivative software is a derivatives clause which allows the creation of derivatives but which also places some kind of restriction on how derivative software may be created. 
Allyson Lister Restrictions on derivative software Derivatives not allowed clauses are derivatives clauses which state that derivative resources are never allowed using the licensed resource. Allyson Lister Derivatives not allowed Derivative code same license is a restrictive derivatives clause where derivative software must be released under the same license. Allyson Lister derivative code same license A source code available clause is a source code clause which states that the source code for the licensed software is available to the licensee. However, usage of the source code may or may not be restricted in a variety of ways. Allyson Lister Source code available A source code unavailable clause is a source code clause which states that the source code for the licensed software is not available to the licensee. Allyson Lister Source code unavailable Platform restricted is a platform clause which places restrictions on which platform the licensed resource may be installed on. Allyson Lister Platform restricted Platform unrestricted is a platform clause which does not place any restrictions on which type of platform the resource may be licensed for. Allyson Lister Platform unrestricted A Number of installations restricted clause is a number of installations clause which restricts the number of times the resource may be installed by any given licensee. Allyson Lister Number of installations restricted A Number of installations unrestricted clause is a number of installations clause which does not restrict the number of times the resource may be installed by any given licensee. Allyson Lister Number of installations unrestricted A Number of users restricted clause is a number of users clause which restricts the number of users the resource may have for a particular licensee. This may be number of total users, or number of concurrent users. 
Allyson Lister Number of users restricted A Number of users unrestricted clause is a number of users clause which does not restrict the number of users of the licensed resource. Allyson Lister Number of users unrestricted Time for use restricted is a time clause which places restrictions on the length of time the licensed resource may be used. Allyson Lister Time for use restricted Time for use unrestricted is a time clause which does not place any restrictions on the length of time the resource may be licensed for. Time for use unrestricted Usage unrestricted is a usage clause which places no restrictions on how the licensed resource may be used. Allyson Lister Usage unrestricted Usage restricted is a usage clause which places restrictions on how the licensed resource may be used. These restrictions will vary according to the individual license. Allyson Lister Usage restricted Non-commercial use only is a usage restricted clause which restricts the use of the licensed resource only to licensees who are not commercial entities. Non-commercial use only Academic use only is a usage restricted clause which restricts the use of the licensed resource to academic licensees only. Allyson Lister Academic use only Derivative code linked same license is a restrictive derivatives clause where code may only be linked to in derivative software that is released under the same license. Allyson Lister derivative code linked same license The mode of interaction with a piece of software. software interface A stochastic algorithm for population pharmacology modeling http://www.math.u-bordeaux1.fr/MAS10/DOC/PDF-PRES/Lavielle.pdf SAEM A Dynamic Bayesian Network model is a Bayesian Network which relates variables to each other over adjacent time steps. http://en.wikipedia.org/wiki/Dynamic_Bayesian_network, accessed 27 November 2014. 
dynamic Bayesian network model In mathematics, an ordinary differential equation or ODE is an equation containing a function of one independent variable and its derivatives. The term "ordinary" is used in contrast with the term partial differential equation which may be with respect to more than one independent variable. Ordinary Differential Equation Algorithm http://en.wikipedia.org/wiki/Ordinary_differential_equation, accessed 25 March 2015. Allyson Lister ODE Algorithm Gillespie's Stochastic Simulation Algorithm is an algorithm which generates a statistically correct trajectory (possible solution) of a stochastic equation. It can be used to simulate chemical or biochemical systems of reactions efficiently and accurately using limited computational power. The algorithm is particularly useful for simulating reactions within cells where the number of reagents typically number in the tens of molecules (or less). Mathematically, it is a variety of a dynamic Monte Carlo method and similar to the kinetic Monte Carlo methods. It is used heavily in computational systems biology. http://en.wikipedia.org/wiki/Gillespie_algorithm, accessed 25 March 2015. Allyson Lister Gillespie's Stochastic Simulation Algorithm Monte Carlo is a discrete stochastic simulation algorithm and an estimation procedure algorithm. If it is necessary to know the average value of some random variable and its distribution cannot be stated, and if it is possible to take samples from the distribution, we can estimate it by taking the samples, independently, and averaging them. If there are sufficiently many samples, then the law of large numbers says the average must be close to the true value. The central limit theorem says that the average has a Gaussian distribution around the true value. http://en.wikipedia.org/wiki/Stochastic_simulation#Monte_Carlo_simulation, accessed 25 March 2015, and http://www.async.ece.utah.edu/iBioSim/docs/iBioSim.html, accessed 25 March 2015. 
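The Monte Carlo estimation procedure described above (draw independent samples, then average them) can be illustrated with a minimal sketch. The function name `monte_carlo_mean`, the fixed seed and the example variable X = U^2 are assumptions chosen for illustration; they are not part of SWO or of any tool it describes.

```python
import random
import statistics


def monte_carlo_mean(sample, n, seed=0):
    """Estimate the mean of a random variable from n independent draws.

    `sample` is a hypothetical callable taking a random.Random instance
    and returning one draw from the distribution of interest.
    """
    rng = random.Random(seed)
    draws = [sample(rng) for _ in range(n)]
    # Law of large numbers: the sample average approaches the true mean.
    mean = statistics.fmean(draws)
    # Central limit theorem: the average is roughly Gaussian around the
    # true mean, with standard error shrinking like 1/sqrt(n).
    stderr = statistics.stdev(draws) / n ** 0.5
    return mean, stderr


# Illustrative variable: X = U^2 with U uniform on [0, 1); true mean is 1/3.
estimate, stderr = monte_carlo_mean(lambda rng: rng.random() ** 2, 100_000)
print(f"estimate = {estimate:.4f} +/- {stderr:.4f}")
```

The standard error gives a rough indication of how close the average is likely to be to the true value for the chosen number of samples.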
Allyson Lister Monte Carlo Differential Algebraic equations (DAEs) are a general form of (systems of) differential equations for vector–valued functions in one independent variable. In practical terms, the distinction between DAEs and ODEs is often that the solution of a DAE system depends on the derivatives of the input signal and not just the signal itself as in the case of ODEs. Differential Algebraic Equation Algorithm http://en.wikipedia.org/wiki/Differential_algebraic_equation, accessed 26 March 2015. Allyson Lister DAE Algorithm A partial differential equation (PDE) is a differential equation that contains unknown multivariable functions and their partial derivatives. This is in contrast to ordinary differential equations, which deal with functions of a single variable and their derivatives. Partial Differential Equation Algorithm http://en.wikipedia.org/wiki/Partial_differential_equation, accessed 26 March 2015. Allyson Lister PDE Algorithm A spreadsheet data format is one in which data is organised into a matrix (or matrices) of columns and rows to form cells in which values are entered. James Malone spreadsheet format A spreadsheet data format designed for Microsoft Excel. James Malone XLS spreadsheet A spreadsheet data format in which the structure of the data is described using XML, such as column and row headers and cell identity. James Malone XML spreadsheet James Malone Matlab .m file "Resource Description Framework (RDF) format." [http://edamontology.org] format bioinformatics edam formats The Resource Description Framework (RDF) is a general-purpose language for representing information in the Web. http://www.w3.org/TR/REC-rdf-syntax/ James Malone Jon Ison Data in RDF format can be serialised into XML, textual, or binary format. Merged with now-obsolete EDAM class 'RDF' http://edamontology.org/format_2376 by Allyson Lister. RDF A serialisation of RDF into an XML format. 
James Malone RDF-XML image format DWG ("drawing") is a binary file format used for storing two and three dimensional design data and metadata http://en.wikipedia.org/wiki/.dwg James Malone DWG DXF (Drawing Interchange Format, or Drawing Exchange Format) is a CAD data file format developed by Autodesk for enabling data interoperability between AutoCAD and other programs. http://en.wikipedia.org/wiki/AutoCAD_DXF James Malone DXF The BMP File Format is a Raster graphics image file format used to store bitmap digital images, independently of the display device (such as a graphics adapter). http://www.fileformat.info/format/bmp/egff.htm James Malone BMP Computer Graphics Metafile (CGM) is a free and open international standard file format for 2D vector graphics, raster graphics, and text, and is defined by ISO/IEC 8632. http://en.wikipedia.org/wiki/Computer_Graphics_Metafile James Malone CGM web page specification document exchange format PDF is an open standard for document exchange. Portable Document Format pdf TIFF is a flexible, adaptable file format for handling images and data within a single file, by including the header tags (size, definition, image-data arrangement, applied image compression) defining the image's geometry. Tagged Image File Format TIFF JPEG is a lossy file format for storing images JPG JPEG PNG is a bitmapped image format and video codec that employs lossless data compression. Portable Network Graphics PNG The Graphics Interchange Format (GIF) is a bitmap image format. The format supports up to 8 bits per pixel thus allowing a single image to reference a palette of up to 256 distinct colors. The colors are chosen from the 24-bit RGB color space. It also supports animations and allows a separate palette of 256 colors for each frame. 
The color limitation makes the GIF format unsuitable for reproducing color photographs and other images with continuous color, but it is well-suited for simpler images such as graphics or logos with solid areas of color. [wikipedia] Graphics Interchange Format GIF A raster image is a format for representing a rectangular grid of dots (pixels) which contains information on the specific colour of each pixel. raster image format A vector image is a collection of connected lines and curves that produce objects. This geometric description enables the image to be displayed without loss at any size rendering. vector image format Scalable Vector Graphics SVG Adobe Illustrator format AI PostScript is a format used for describing documents. PostScript tex is a format for documents written in the document markup language and document preparation system LaTeX. LaTeX format tex A format specification for data used or produced by outliner software outline document format A proprietary format for documents created and edited using OmniOutliner outliner software, OmniOutline format OPML (Outline Processor Markup Language) is an XML format for outlines Outline Processor Markup Language OPML JPEG 2000 is a compression standard enabling both lossless and lossy storage. The compression methods used are different from the ones in standard JFIF/JPEG; they improve quality and compression ratios, but also require more computational power to process. [wikipedia] JPEG 2000 word processing document format WordStar format A file format for word processing documents for Microsoft Word. Microsoft Word doc programming language format A source code file format which is specified to be used with the Java programming language. James Malone .java file A format in which a .java file has been compiled into bytecode using a Java compiler and which is specified to be executed using the Java virtual machine. 
James Malone .class file The Web Ontology Language (OWL) in XML serialization OWL-XML Web Ontology Language version 2 in XML Serialization OWL2-XML ASCII format plain text file format Tab delimited file format is a plain text file format where each field value of a record is separated from the next by a tab stop character. http://en.wikipedia.org/wiki/Tab-separated_values, accessed 6 June 2013. Allyson Lister tab delimited file format SIF stands for Simple Interaction Format, and is a text format invented for Cytoscape. If the file contains any tab characters, then tabs are used to delimit the fields and spaces are considered part of the name. If the file contains no tabs, then any spaces are delimiters that separate names (and names cannot contain spaces). http://wiki.cytoscape.org/GettingStarted and http://wiki.cytoscape.org/Cytoscape_User_Manual/Network_Formats, accessed 20 June 2012 Allyson Lister SIF GML stands for Graph Markup Language, and is a standard network file format; supported by multiple generic network software packages http://wiki.cytoscape.org/GettingStarted, accessed 20 June 2012 GML XGMML stands for eXtensible Graph Markup and Modelling Language, and it is an XML standard; similar to but preferred over GML. http://wiki.cytoscape.org/GettingStarted, accessed 20 June 2012 XGMML A type of data which defines interactions between items in the file. This can be simple pairwise interactions or more complex ones. Used to provide a class of data for software requiring specific types of interaction data as input. Allyson Lister Interaction data A knowledge representation role is a role borne by a data format which utilizes formalisms to make complex systems easier to design and build. Knowledge representation is the field of artificial intelligence devoted to representing information about the world in a form that a computer system can utilize to solve complex tasks. 
Modified from http://en.wikipedia.org/wiki/Knowledge_representation, accessed 10 February 2014 Knowledge representation role The CDF ASCII format is an ASCII data format specification created by Affymetrix similar to the Windows INI format. This format describes the layout for an Affymetrix GeneChip array. An array may contain Expression, Genotyping, CustomSeq, Copy Number and/or Tag probe sets. All probe set names within an array are unique. Multiple copies of a probe set may exist on a single array as long as each copy has a unique name. http://www.affymetrix.com/support/developer/powertools/changelog/gcos-agcc/cdf.html, accessed 22 May 2013 Allyson Lister CDF ASCII format BAR is a binary data format specification created by Affymetrix where the data is stored in big-endian format. The format of the file is a header section followed by sequences sections (one section per sequence defined). The BAR file contains one and two sample analysis results (signal and p-values) from the tiling array software. http://www.affymetrix.com/support/developer/powertools/changelog/gcos-agcc/bar.html, accessed 22 May 2013. Allyson Lister BAR CEL ASCII format is an ASCII data format specification created by Affymetrix similar to the Windows INI format. The CEL format stores the results of the intensity calculations on the pixel values of the DAT file. This includes an intensity value, standard deviation of the intensity, the number of pixels used to calculate the intensity value, a flag to indicate an outlier as calculated by the algorithm and a user defined flag indicating the feature should be excluded from future analysis. The file stores the previously stated data for each feature on the probe array. http://www.affymetrix.com/support/developer/powertools/changelog/gcos-agcc/cel.html, accessed 22 May 2013. Allyson Lister CEL ASCII format Affymetrix-compliant data is data produced in a format compatible with Affymetrix software. 
This is a defined class where other data classes will be inferred to be members if they have a data format specification which has been published by Affymetrix. Allyson Lister Affymetrix-compliant data The flow cytometry data file standard provides the specifications needed to completely describe flow cytometry data sets within the confines of the file containing the experimental data. The principal goal of the Standard is to provide a uniform file format allowing files created by one type of acquisition hardware and software to be analyzed by another type. FCS http://www.bioconductor.org/packages/2.12/bioc/vignettes/prada/inst/doc/fcs3.html, accessed 29 May 2013. Allyson Lister Data File Standard for Flow Cytometry FCS3.0 is version 3.0 of the Data File Standard for Flow Cytometry. It contains a mechanism for handling data sets of 100 megabytes and larger, support for UNICODE text for keyword values, support for cyclic redundancy check (CRC) validation for each data set, a requirement for the inclusion of information describing the method of signal amplification and increased support for the inclusion of time as a measurement parameter. http://www.bioconductor.org/packages/2.12/bioc/vignettes/prada/inst/doc/fcs3.html, accessed 29 May 2013. Allyson Lister FCS3.0 NONMEM data format A nucleic acid sequence that indicates the order of nucleotides within some DNA. DNA nucleotide sequence Data about primary biological sequence information, such as DNA nucleotide sequences. biological sequence data Data which contains information about amino acid sequences of proteins. amino acid protein sequence data sequence feature format Tabix indexes a TAB-delimited genome position file and creates an index file when region is absent from the command-line. The input data file must be position sorted and compressed by bgzip which has a gzip like interface. After indexing, tabix is able to quickly retrieve data lines overlapping regions specified in the format "chr:beginPos-endPos". 
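As a rough illustration of the "chr:beginPos-endPos" region syntax mentioned above, the sketch below parses such a string and checks whether a data line overlaps it. This is not tabix's own code: the names `Region`, `parse_region` and `overlaps` are hypothetical, and treating both coordinates as inclusive is an assumption made for the example.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    begin: int
    end: int


# Pattern for the "chr:beginPos-endPos" form (hypothetical helper).
_REGION = re.compile(r"^(?P<chrom>[^:]+):(?P<begin>\d+)-(?P<end>\d+)$")


def parse_region(text: str) -> Region:
    """Split a region string into chromosome name, begin and end."""
    match = _REGION.match(text)
    if match is None:
        raise ValueError(f"not a chr:beginPos-endPos region: {text!r}")
    begin, end = int(match["begin"]), int(match["end"])
    if begin > end:
        raise ValueError(f"begin is after end in region: {text!r}")
    return Region(match["chrom"], begin, end)


def overlaps(region: Region, chrom: str, start: int, end: int) -> bool:
    """True if a data line spanning [start, end] on chrom meets the region.

    Assumes inclusive coordinates on both sides (an illustrative choice).
    """
    return chrom == region.chrom and start <= region.end and end >= region.begin


region = parse_region("chr1:100-200")
print(region, overlaps(region, "chr1", 150, 300))
```

A real index avoids scanning every line; the overlap test here only shows what "data lines overlapping regions" means for a single record.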
Fast data retrieval also works over network if URI is given as a file name and in this case the index file will be downloaded if it is not present locally. http://samtools.sourceforge.net/tabix.shtml tabix file format Genomedata provides a way to store and access large-scale functional genomics data in a format which is both space-efficient and allows efficient random-access. Genomedata archives are implemented as one or more HDF5 files, either as single files or as directory archives. HDF5 archives are self describing, like XML, but may also contain more complex structures such as binary data. http://pmgenomics.ca/hoffmanlab/proj/genomedata/doc/1.3.5/genomedata.html and http://www.hdfgroup.org/why_hdf/, accessed 27 November 2014. Allyson Lister genomedata format The bedGraph format is a line-oriented text file format. Bedgraph data are preceded by a track definition line, which adds a number of options for controlling the default display of this track. Following the track definition line are the track data in four column BED format. The bedGraph format allows display of continuous-valued data in track format. This display type is useful for probability scores and transcriptome data. This track type is similar to the wiggle (WIG) format, but unlike the wiggle format, data exported in the bedGraph format are preserved in their original state. http://genome.ucsc.edu/goldenpath/help/bedgraph.html, accessed December 3, 2014. Allyson Lister BedGraph GMTK parameter data is a type of data which contains the various parameter files required by GMTK to define a dynamic Bayesian network. These parameter files come in multiple types (e.g. structure files and master files), and therefore this is a data type rather than a single format. Graphical Models Toolkit parameter data Allyson Lister http://noble.gs.washington.edu/proj/philius/README, accessed 3 December 2014. Allyson Lister GMTK parameter data 
MathML 2.0 is an XML format which is a low-level specification for describing mathematics as a basis for machine to machine communication. It is a W3C Recommendation and was released on 21 Feb 2001. A product of the W3C Math working group, it provides a much needed foundation for the inclusion of mathematical expressions in Web pages. http://www.w3.org/Math/ Allyson Lister James Malone MathML 2.0 FieldML is an XML-based language for describing time-varying and spatially-varying fields. The aims of the language design process are to keep the language concise, consistent, intuitive and flexible. http://www.physiomeproject.org/xml_languages/fieldml James Malone FieldML WKn is a collective name for a spreadsheet format created for Lotus 1-2-3. http://support.sas.com/documentation/cdl/en/acpcref/63184/HTML/default/viewer.htm#a003103772.htm and http://en.wikipedia.org/wiki/Lotus_1-2-3, accessed 3 December 2014. Allyson Lister Please note that this is a collective class for all versions of the WKn format, and the specific version required should be created as necessary and placed as a child of this class. WKn GZipped format .gz Zipped format .zip audio format Resource Interchange File Format RIFF BigWig format .bw Comma-separated value format .csv MASCOT generic format .mgf MySQL format .mysql SQL format .sql A Web User interface is a Graphical User Interface which is loaded and run via a Web browser rather than within the user's operating system. WUI Web UI Allyson Lister web user interface A Desktop Graphical User interface is a Graphical User Interface which is loaded and run within the user's operating system rather than via a Web browser. Desktop GUI Allyson Lister desktop graphical user interface A SOAP service is a Web service which provides a standard, extensible, composable framework for packaging and exchanging XML messages. 
The service may expose an arbitrary, application-specific set of operations. SOAP Service Modified from http://en.wikipedia.org/wiki/SOAP, accessed 6 June 2013; modified from http://www.w3.org/TR/ws-arch, accessed 6 June 2013. Allyson Lister SOAP service A REST service is a Web service in which the primary purpose of the service is to manipulate XML representations of Web resources using a uniform set of "stateless" operations. RESTful APIs do not require XML-based web service protocols (SOAP and WSDL) to support their light-weight interfaces. REST Service Modified from http://www.w3.org/TR/ws-arch, accessed 6 June 2013, Modified from http://en.wikipedia.org/wiki/Web_service, accessed 6 June 2013. Allyson Lister REST service A web service in which calls invoked return JSON. JSON web service A web service is a software interface which works as a method of communication between two electronic devices over the World Wide Web and which is provided at a particular network address. There are two major classes of Web services: REST-compliant Web services, and arbitrary (or application-specific) Web services. Web Service Modified from http://www.w3.org/TR/ws-arch/, accessed 6 June 2013; Modified from http://en.wikipedia.org/wiki/Web_service, accessed 6 June 2013. Allyson Lister web service A Graphical user interface is a type of software interface that allows users to interact with electronic devices using images rather than text commands. A GUI represents the information and actions available to a user through graphical icons and visual indicators such as secondary notation, as opposed to text-based interfaces, typed command labels or text navigation. https://en.wikipedia.org/wiki/Graphical_user_interface, accessed 6 June 2013. GUI Allyson Lister graphical user interface A command-line interface is a means of interacting with a computer program where the user (or client) issues commands to the program in the form of successive lines of text (command lines). 
Command line Command line interface Command-line http://en.wikipedia.org/wiki/Command-line_interface, accessed 25 November 2014. command-line interface An application programming interface is a set of routines, protocols, and tools for building software applications. An API expresses a software component in terms of its operations, inputs, outputs, and underlying types. An API defines functionalities that are independent of their respective implementations, which allows definitions and implementations to vary without compromising each other. The API specifies how software components should interact. API http://en.wikipedia.org/wiki/Application_programming_interface, accessed 25 November 2014. application programming interface CC Creative Commons Proprietary commercial software license Mozilla Public License Version 1.1 MPL v1.1 Distribution clauses are license clauses which state the requirements on how the licensed resource is redistributed. The license for a resource may or may not allow the redistribution of that resource. If it is allowed, such usage may be restricted in a variety of ways. Allyson Lister Distribution clause Distribution restricted is a distribution clause which places restrictions on how the licensed resource may be distributed by third parties. These restrictions may be complete, e.g. no further redistribution, or partial. Allyson Lister Distribution restricted Distribution unrestricted is a distribution clause which states that the licensed resource can be redistributed by a third party in whatever manner that party wishes. Allyson Lister Distribution unrestricted Derivatives allowed clauses are derivatives clauses which state that derivative resources are allowed using the licensed resource. Even when allowed, such a clause may or may not restrict the usage of the licensed resource in a variety of ways. 
Allyson Lister Derivatives allowed GNU General Public License GNU GPL This is a free software license under the definition of "free" by the GNU Project, and is compatible with version 3 of the GNU GPL. http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013. Allyson Lister Apache License Version 2.0 The Academic Free License is a free software license under the definition of "free" by the GNU Project, is not copyleft, and is incompatible with the GNU GPL. AFL http://directory.fsf.org/wiki/License:AFLv3, accessed 12 June 2013; http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013 Allyson Lister Andy Brown Academic Free License version 3 This is the original BSD license with the advertising clause and another clause removed. (It is also sometimes called the “2-clause BSD license”.) It is a lax, permissive non-copyleft free software license, compatible with the GNU GPL. 2-clause BSD License http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013. Allyson Lister Andy Brown FreeBSD Open source software license License without restrictions on derivatives A license is a free license according to GNU if the users have the four essential freedoms: The freedom to run the program, for any purpose (freedom 0). The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this. The freedom to redistribute copies so you can help your neighbor (freedom 2). The freedom to distribute copies of your modified versions to others (freedom 3). By doing this you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this. A program is free software if users have all of these freedoms. Thus, you should be free to redistribute copies, either with or without modifications, either gratis or charging a fee for distribution, to anyone anywhere. 
Being free to do these things means (among other things) that you do not have to ask or pay for permission to do so. “Free software” does not mean “noncommercial”. A free program must be available for commercial use, commercial development, and commercial distribution. Commercial development of free software is no longer unusual; such free commercial software is very important. You may have paid money to get copies of free software, or you may have obtained copies at no charge. But regardless of how you got your copies, you always have the freedom to copy and change the software, even to sell copies. http://www.gnu.org/philosophy/free-sw.html, accessed 12 June 2013. Allyson Lister Allyson Lister: This class defines those licenses which are free licenses, but which may or may not be compatible with any version of the GNU GPL. GNU Project Free License Type GNU GPL Compatible License Type is a GNU Project Free License Type which is also compatible with one or more versions of the GNU GPL Modified from http://www.gnu.org/copyleft/copyleft.html, accessed 12 June 2013. Allyson Lister GNU GPL Compatible License Type Copyleft is a derivative code same license clause which says that anyone who redistributes the software, with or without changes, must pass along the freedom to further copy and change it. In other words, it requires that all derivative code uses the same license, but further limits the type of license to one which gives everyone the rights to use, modify, and redistribute the program's code, or any program derived from it, but only if the distribution terms are unchanged. To copyleft a program, you first state that it is copyrighted; then the distribution terms described above are added. This makes copyleft a legal instrument ensuring that the code and the freedoms become legally inseparable. Modified from http://www.gnu.org/copyleft/copyleft.html, accessed 12 June 2013. 
Allyson Lister Copyleft The GNU GPL v3 is the latest version of the GNU GPL: a free software license, and a copyleft license. Please note that GPLv3 is not compatible with GPLv2 by itself. However, most software released under GPLv2 allows you to use the terms of later versions of the GPL as well. When this is the case, you can use the code under GPLv3 to make the desired combination. GNU General Public License Version 3 https://www.gnu.org/licenses/license-list.html, accessed on 23 June 2016. Allyson Lister GNU GPL v3 The GNU GPL v2 is an earlier version of the GNU GPL: a free software license, and a copyleft license. Please note that GPLv2 is, by itself, not compatible with GPLv3. However, most software released under GPLv2 allows you to use the terms of later versions of the GPL as well. When this is the case, you can use the code under GPLv3 to make the desired combination. GNU General Public License Version 2 https://www.gnu.org/licenses/license-list.html, accessed 23 June 2016. GNU GPL v2 CC0 is a public domain dedication from Creative Commons. A work released under CC0 is dedicated to the public domain to the fullest extent permitted by law. If that is not possible for any reason, CC0 also provides a lax, permissive license as a fallback. Both public domain works and the lax license provided by CC0 are compatible with the GNU GPL. CC0 1.0 Universal (CC0 1.0) Public Domain Dedication http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013. Allyson Lister CC0 1.0 CC BY 2.0 is a non-copyleft free (free by the definition of the GNU Project) license which lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of Creative Commons licenses offered. Recommended for maximum dissemination and use of licensed materials. The GNU Project recommend it for art, entertainment works, and educational works. 
It is compatible with all versions of the GNU GPL; however, like all CC licenses, it should not be used on software. Creative Commons Attribution 2.0 Generic (CC BY 2.0) Modified from http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013 and 28 June 2016; Modified from http://creativecommons.org/licenses/, accessed 12 June 2013. Allyson Lister CC BY 2.0 Attribution required is an attribution clause which states that attribution of the type specified in the license must be provided whenever the resource is used. Allyson Lister Attribution required CC BY-SA 2.0 is a copyleft free (free by the definition of the GNU Project) license which lets others remix, tweak, and build upon your work even for commercial purposes, as long as they credit you and license their new creations under the identical terms. The GNU Project recommend this license for art, entertainment and educational works, but also recommend that it is not used for software or documentation, since it is tricky as to how exactly this license is compatible with the GNU GPL. Creative Commons Attribution-Sharealike 2.0 Generic Modified from http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013; Modified from http://creativecommons.org/licenses/, accessed 12 June 2013. Allyson Lister CC BY-SA 2.0 LPPL v1.3 is a free software license (by the definition of the GNU Project), with less stringent requirements on distribution than LPPL 1.2. It is still incompatible with the GPL because some modified versions must include a copy of or pointer to an unmodified version. Software projects other than LaTeX rarely use it. Latex Project Public License v1.3c Modified from http://en.wikipedia.org/wiki/LaTeX_Project_Public_License, accessed 12 June 2013; modified from http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013. LPPL v1.3c LPPL v1.2 is a free software license (by the definition of the GNU Project). This license is an incomplete statement of the distribution terms for LaTeX. 
While it is a free software license, it is incompatible with the GPL because it has many requirements that are not in the GPL. Software projects other than LaTeX rarely use it. Latex Project Public License v1.2 Modified from http://en.wikipedia.org/wiki/LaTeX_Project_Public_License, accessed 12 June 2013; modified from http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013. LPPL v1.2 MPL v2.0 is a free software license as defined by the GNU Project. Section 3.3 provides indirect compatibility between this license and the GNU GPL version 2.0, the GNU LGPL version 2.1, the GNU AGPL version 3.0, and all later versions of those licenses. The MPL allows covered source code to be mixed with other files under a different, even proprietary license. However, code files licensed under the MPL must remain under the MPL and freely available in source form. http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013; http://en.wikipedia.org/wiki/Mozilla_Public_License, accessed 12 June 2013. MPL v2.0 Mozilla Public License Version 2.0 Allyson Lister Artistic License The Artistic License v 2.0 is a free software license by the definition of the GNU Project and compatible with the GPL thanks to the relicensing option in section 4(c)(ii) (as compared with the Artistic License 1.0). http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013. Allyson Lister Artistic License 2.0 Purchase cost is a license clause which states whether there is a cost involved with a particular usage or licensing of a resource. Allyson Lister Allyson Lister In some ways, purchase cost is similar to the already-extant usage clause hierarchy, which includes restricted and unrestricted usage. However, a usage limitation is not necessarily due to whether or not something costs money: even if the usage is academic only, it could still be either free or non-free. A license could have multiple usage clauses, e.g. academic only when free, and unrestricted if a fee is paid. 
Purchase cost was created which, together with a usage clause, defines both limitations and cost. Purchase cost Free is a type of purchase cost clause which, when applied, means that there is no cost for the users of the resource to which the license is attached. This clause can be combined with other clauses (such as usage clauses) to specify that only certain usages are free. Allyson Lister Allyson Lister This class refers only to the cost of the resource, not to the definition of "free software" as provided by the GNU Project and which is commonly used to describe software that respects users' freedom and community (http://www.gnu.org/philosophy/free-sw.html). Free Not Free is a type of purchase cost clause which, when applied, means that there is a cost for the users of the resource to which the license is attached. This clause can be combined with other clauses (such as usage clauses) to specify that only certain usages incur a purchase cost. Allyson Lister Allyson Lister This class refers only to the cost of the resource, not to the definition of "free software" as provided by the GNU Project and which is commonly used to describe software that respects users' freedom and community (http://www.gnu.org/philosophy/free-sw.html). Not Free A license which allows any form of usage of the artifact. free to use license The CC BY 4.0 license is a Creative Commons license. This is a non-copyleft free license that is good for art and entertainment works, and educational works. It is compatible with all versions of the GNU GPL; however, like all CC licenses, it should not be used on software. People are free to: Share — copy and redistribute the material in any medium or format; Adapt — remix, transform, and build upon the material for any purpose, even commercially. The licensor cannot revoke these freedoms as long as you follow the license terms. 
But they must conform to the following terms: Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. No additional restrictions — You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits. Creative Commons Attribution 4.0 International (CC BY 4.0) http://creativecommons.org/licenses/by/4.0/ accessed June 22, 2016; https://www.gnu.org/licenses/license-list.html, accessed on June 28, 2016. Allyson Lister Allyson Lister: The only restriction on derivatives is that of the attribution requirement. CC BY 4.0 Attribution not required is an attribution clause which states that no attribution need be provided whenever the resource is used. Allyson Lister Allyson Lister Attribution not required CC BY 2.0 UK is a UK-specific non-copyleft free (free by the definition of the GNU Project) license which lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of Creative Commons licenses offered. Recommended for maximum dissemination and use of licensed materials. The GNU Project recommend it for art, entertainment works, and educational works. It is compatible with all versions of the GNU GPL; however, like all CC licenses, it should not be used on software. Creative Commons Attribution 2.0 UK: England & Wales (CC BY 2.0 UK) Modified from the definition of http://www.ebi.ac.uk/swo/license/SWO_1000050. Allyson Lister CC BY 2.0 UK This is the latest version of the LGPL: a free software license, but not a strong copyleft license, because it permits linking with nonfree modules. It is compatible with GPLv3. It is therefore recommended for special circumstances only. Please note that LGPLv3 is not compatible with GPLv2 by itself. 
However, most software released under GPLv2 allows you to use the terms of later versions of the GPL as well. When this is the case, you can use the code under GPLv3 to make the desired combination. GNU Lesser General Public License (LGPL) version 3 https://www.gnu.org/licenses/license-list.html, accessed 23 June 2016. GNU LGPL v3 GNU AGPL This is a free software, copyleft license. Its terms effectively consist of the terms of GPLv3, with an additional paragraph in section 13 to allow users who interact with the licensed software over a network to receive the source for that program. It is recommended that developers consider using the GNU AGPL for any software which will commonly be run over a network. GNU Affero General Public License (AGPL) version 3 https://www.gnu.org/licenses/license-list.html, accessed on 23 June 2016. Allyson Lister GNU AGPL v3 This is the previous version of the LGPL: a free software license, but not a strong copyleft license, because it permits linking with nonfree modules. It is compatible with GPLv2 and GPLv3. We generally recommend the latest version of the LGPL, for special circumstances only. GNU Lesser General Public License (LGPL) version 2.1 https://www.gnu.org/licenses/license-list.html, accessed 28 June 2016. GNU LGPL v2.1 CC BY 2.1 JP is a Japan-specific non-copyleft free (free by the definition of the GNU Project) license which lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of Creative Commons licenses offered. Recommended for maximum dissemination and use of licensed materials. The GNU Project recommend it for art, entertainment works, and educational works. It is compatible with all versions of the GNU GPL; however, like all CC licenses, it should not be used on software. Creative Commons Attribution 2.1 Japan (CC BY 2.1 JP) Modified from the definition of http://www.ebi.ac.uk/swo/license/SWO_1000050. 
Allyson Lister CC BY 2.1 JP CC BY 2.5 is a non-copyleft free (free by the definition of the GNU Project) license which lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of Creative Commons licenses offered. Recommended for maximum dissemination and use of licensed materials. The GNU Project recommend it for art, entertainment works, and educational works. It is compatible with all versions of the GNU GPL; however, like all CC licenses, it should not be used on software. Creative Commons Attribution 2.5 Generic (CC BY 2.5) Modified from the definition of http://www.ebi.ac.uk/swo/license/SWO_1000050. Allyson Lister CC BY 2.5 CC BY 3.0 AU is an Australia-specific non-copyleft free (free by the definition of the GNU Project) license which lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of Creative Commons licenses offered. Recommended for maximum dissemination and use of licensed materials. The GNU Project recommend it for art, entertainment works, and educational works. It is compatible with all versions of the GNU GPL; however, like all CC licenses, it should not be used on software. Creative Commons Attribution 3.0 Australia (CC BY 3.0 AU) Modified from the definition of http://www.ebi.ac.uk/swo/license/SWO_1000050. Allyson Lister CC BY 3.0 AU CC BY 3.0 is a non-copyleft free (free by the definition of the GNU Project) license which lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of Creative Commons licenses offered. Recommended for maximum dissemination and use of licensed materials. The GNU Project recommend it for art, entertainment works, and educational works. 
It is compatible with all versions of the GNU GPL; however, like all CC licenses, it should not be used on software. Creative Commons Attribution 3.0 Unported (CC BY 3.0) Modified from the definition of http://www.ebi.ac.uk/swo/license/SWO_1000050. Allyson Lister CC BY 3.0 CC BY 3.0 US is a US-specific non-copyleft free (free by the definition of the GNU Project) license which lets others distribute, remix, tweak, and build upon your work, even commercially, as long as they credit you for the original creation. This is the most accommodating of Creative Commons licenses offered. Recommended for maximum dissemination and use of licensed materials. The GNU Project recommend it for art, entertainment works, and educational works. It is compatible with all versions of the GNU GPL; however, like all CC licenses, it should not be used on software. Creative Commons Attribution 3.0 United States (CC BY 3.0 US) Modified from the definition of http://www.ebi.ac.uk/swo/license/SWO_1000050. Allyson Lister CC BY 3.0 US CC BY-ND 3.0 is a nonfree license, as there are restrictions on distributing modified versions. You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. If you remix, transform, or build upon the material, you may not distribute the modified material. This license is generally used for documentation. Creative Commons Attribution-NoDerivs 3.0 Unported (CC BY-ND 3.0) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016; https://creativecommons.org/licenses/by-nd/3.0/ accessed 29 June 2016. Allyson Lister CC BY-ND 3.0 CC BY-ND 4.0 is a nonfree license, as there are restrictions on distributing modified versions. You must give appropriate credit, provide a link to the license, and indicate if changes were made. 
You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. If you remix, transform, or build upon the material, you may not distribute the modified material. This license is generally used for documentation. Creative Commons Attribution-NoDerivatives 4.0 International (CC BY-ND 4.0) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016; https://creativecommons.org/licenses/by-nd/3.0/ accessed 29 June 2016. Allyson Lister CC BY-ND 4.0 CC BY-NC 3.0 is a nonfree license, as there are restrictions on charging money for copies. You must give appropriate credit, provide a link to the license, use the work for non-commercial purposes, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. The GNU Project does not recommend that you use this license for documentation. In addition, it has a drawback for any sort of work: when a modified version has many authors, in practice getting permission for commercial use from all of them would become infeasible. Creative Commons Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016. Allyson Lister CC BY-NC 3.0 CC BY-NC 4.0 is a nonfree license, as there are restrictions on charging money for copies. You must give appropriate credit, provide a link to the license, use the work for non-commercial purposes, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. The GNU Project does not recommend that you use this license for documentation. In addition, it has a drawback for any sort of work: when a modified version has many authors, in practice getting permission for commercial use from all of them would become infeasible. 
Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016. Allyson Lister CC BY-NC 4.0 CC BY-NC-ND 3.0 is a nonfree license, as there are restrictions on distributing modified versions and on charging money for copies. You must give appropriate credit, provide a link to the license, and use the work for non-commercial purposes. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. If you remix, transform, or build upon the material, you may not distribute the modified material. Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported (CC BY-NC-ND 3.0) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016. Allyson Lister CC BY-NC-ND 3.0 CC BY-NC-ND 2.5 is a nonfree license, as there are restrictions on distributing modified versions and on charging money for copies. You must give appropriate credit, provide a link to the license, and use the work for non-commercial purposes. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. If you remix, transform, or build upon the material, you may not distribute the modified material. Creative Commons Attribution-NonCommercial-NoDerivs 2.5 Generic (CC BY-NC-ND 2.5) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016. Allyson Lister CC BY-NC-ND 2.5 CC BY-NC-ND 2.5 CH is a nonfree license, as there are restrictions on distributing modified versions and on charging money for copies. You must give appropriate credit, provide a link to the license, and use the work for non-commercial purposes. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. If you remix, transform, or build upon the material, you may not distribute the modified material. 
Creative Commons Attribution-NonCommercial-NoDerivs 2.5 Switzerland (CC BY-NC-ND 2.5 CH) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016. Allyson Lister CC BY-NC-ND 2.5 CH CC BY-NC-ND 4.0 is a nonfree license, as there are restrictions on distributing modified versions and on charging money for copies. You must give appropriate credit, provide a link to the license, and use the work for non-commercial purposes. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. If you remix, transform, or build upon the material, you may not distribute the modified material. Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International (CC BY-NC-ND 4.0) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016. Allyson Lister CC BY-NC-ND 4.0 CC BY-NC-SA 2.5 is a nonfree license, as there are restrictions on charging money for copies. You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. This license also lets others build upon the original work for non-commercial purposes, as long as they credit you and license their new creations under the identical terms. The GNU Project does not recommend that you use this license for documentation. In addition, it has a drawback for any sort of work: when a modified version has many authors, in practice getting permission for commercial use from all of them would become infeasible. Creative Commons Attribution-NonCommercial-ShareAlike 2.5 Generic (CC BY-NC-SA 2.5) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016. Allyson Lister CC BY-NC-SA 2.5 CC BY-NC-SA 3.0 is a nonfree license, as there are restrictions on charging money for copies. You must give appropriate credit, provide a link to the license, and indicate if changes were made. 
You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. This license also lets others build upon the original work for non-commercial purposes, as long as they credit you and license their new creations under the identical terms. The GNU Project does not recommend that you use this license for documentation. In addition, it has a drawback for any sort of work: when a modified version has many authors, in practice getting permission for commercial use from all of them would become infeasible. Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016. Allyson Lister CC BY-NC-SA 3.0 CC BY-NC-SA 3.0 US is a nonfree license, as there are restrictions on charging money for copies. You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. This license also lets others build upon the original work for non-commercial purposes, as long as they credit you and license their new creations under the identical terms. The GNU Project does not recommend that you use this license for documentation. In addition, it has a drawback for any sort of work: when a modified version has many authors, in practice getting permission for commercial use from all of them would become infeasible. Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States (CC BY-NC-SA 3.0 US) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016. Allyson Lister CC BY-NC-SA 3.0 US CC BY-NC-SA 2.5 IN is a nonfree license, as there are restrictions on charging money for copies. You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. 
This license also lets others build upon the original work for non-commercial purposes, as long as they credit you and license their new creations under the identical terms. The GNU Project does not recommend that you use this license for documentation. In addition, it has a drawback for any sort of work: when a modified version has many authors, in practice getting permission for commercial use from all of them would become infeasible. Creative Commons Attribution-NonCommercial-ShareAlike 2.5 India (CC BY-NC-SA 2.5 IN) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016. Allyson Lister CC BY-NC-SA 2.5 IN CC BY-NC-SA 4.0 is a nonfree license, as there are restrictions on charging money for copies. You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. This license also lets others build upon the original work for non-commercial purposes, as long as they credit you and license their new creations under the identical terms. The GNU Project does not recommend that you use this license for documentation. In addition, it has a drawback for any sort of work: when a modified version has many authors, in practice getting permission for commercial use from all of them would become infeasible. Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) https://www.gnu.org/licenses/license-list.html accessed 29 June 2016. Allyson Lister CC BY-NC-SA 4.0 CC BY-SA 2.1 JP is a copyleft free (free by the definition of the GNU Project) license which lets others remix, tweak, and build upon your work even for commercial purposes, as long as they credit you and license their new creations under the identical terms. 
The GNU Project recommend this license for art, entertainment and educational works, but also recommend that it is not used for software or documentation, since it is tricky as to how exactly this license is compatible with the GNU GPL. Creative Commons Attribution-Sharealike 2.1 Japan Modified from http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013; Modified from http://creativecommons.org/licenses/, accessed 12 June 2013. Allyson Lister CC BY-SA 2.1 JP CC BY-SA 3.0 is a copyleft free (free by the definition of the GNU Project) license which lets others remix, tweak, and build upon your work even for commercial purposes, as long as they credit you and license their new creations under the identical terms. The GNU Project recommend this license for art, entertainment and educational works, but also recommend that it is not used for software or documentation, since it is tricky as to how exactly this license is compatible with the GNU GPL. Creative Commons Attribution-Sharealike 3.0 Unported Modified from http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013; Modified from http://creativecommons.org/licenses/, accessed 12 June 2013. Allyson Lister CC BY-SA 3.0 CC BY-SA 3.0 US is a copyleft free (free by the definition of the GNU Project) license which lets others remix, tweak, and build upon your work even for commercial purposes, as long as they credit you and license their new creations under the identical terms. The GNU Project recommend this license for art, entertainment and educational works, but also recommend that it is not used for software or documentation, since it is tricky as to how exactly this license is compatible with the GNU GPL. Creative Commons Attribution-Sharealike 3.0 United States Modified from http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013; Modified from http://creativecommons.org/licenses/, accessed 12 June 2013. 
Allyson Lister CC BY-SA 3.0 US CC BY-SA 4.0 is a copyleft free (free by the definition of the GNU Project) license which lets others remix, tweak, and build upon your work even for commercial purposes, as long as they credit you and license their new creations under the identical terms. The GNU Project recommend this license for art, entertainment and educational works, but also recommend that it is not used for software or documentation, since it is tricky as to how exactly this license is compatible with the GNU GPL. Creative Commons Attribution-Sharealike 4.0 International Modified from http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013; Modified from http://creativecommons.org/licenses/, accessed 12 June 2013. Allyson Lister CC BY-SA 4.0 Open Data Commons The ODbL v1.0 allows a user of the databases to share, create and adapt the database. You must attribute any public use of the database, or works produced from the database. For redistribution of the database or derivative works, you must make the use of this license clear to others, and you must keep intact any notices on the original database. If you publicly use any adapted version of this database, or works produced from an adapted database, you must also offer that adapted database under the ODbL. If you redistribute the database, or an adapted version of it, then you may use technological measures that restrict the work (such as DRM) as long as you also redistribute a version without such measures. Open Database License (ODbL) v1.0 http://opendatacommons.org/licenses/odbl/summary/, accessed 5 July 2016. Allyson Lister ODbL v1.0 The DbCL v1.0 allows a user of the database contents to share, create and adapt the contents. You must attribute any public use of the contents, or works produced from them. For redistribution or derivative works, you must make the use of this license clear to others, and you must keep intact any notices on the original. 
If you publicly use any adapted version, or works produced from an adapted version, you must also offer it under the DbCL. If you redistribute the contents, or an adapted version of them, then you may use technological measures that restrict the work (such as DRM) as long as you also redistribute a version without such measures. Users of the DbCL must comply with the ODbL v1.0. ODC Database Contents License (DbCL) v1.0 http://opendatacommons.org/licenses/odbl/summary/, accessed 5 July 2016; http://opendatacommons.org/licenses/dbcl/1.0/, accessed 5 July 2016. Allyson Lister DbCL v1.0 The EMBLEM ELM Academic License was developed as the license for the ELM (Eukaryotic Linear Motif) resource. Non-commercial use is allowed under this license; an additional commercial license is also available. This license makes the Licensed Software available free of charge for the licensee, which is a non-profit educational, academic and/or research institution. The software can only be used for academic research projects. This explicitly excludes projects which charge a fee, or projects that are done in collaboration with a third party that is funding the research in whole or in part in exchange for commercial rights on the results and/or possible delay in publication of any relevant results to the academic community. The user and any research assistants, co-workers or other workers who may use the Software agree to not grant licenses on any software that includes the Licensed Software, alone or integrated into other software, to third parties. Modification of the Licensed Software code is prohibited without the prior written consent of EMBLEM. ELM Software License Agreement http://elm.eu.org/media/Elm_academic_license.pdf, accessed 6 July 2017; http://elm.eu.org/infos/about.html accessed 6 July 2016. 
Allyson Lister EMBLEM ELM Academic License The FlowRepository Open Access Terms of Use license allows any individual to access the licensed product (originally the Flow Cytometry Data Repository) for any purpose. There are no restrictions on the use or redistribution of the data associated with this license, though it makes the statement that some data covered may also be included under more restrictive licensing. http://flowrepository.org/terms_of_service, accessed 6 July 2016. Allyson Lister FlowRepository Open Access Terms of Use The ODC-By v1.0 allows a user of the databases to share, create and adapt the database. You must attribute any public use of the database, or works produced from the database. For redistribution of the database or derivative works, you must make the use of this license clear to others, and you must keep intact any notices on the original database. If you redistribute the database, or an adapted version of it, then you may use technological measures that restrict the work (such as DRM) as long as you also redistribute a version without such measures. Open Data Commons Attribution License (ODC-By) v1.0 http://opendatacommons.org/licenses/by/summary/, accessed 6 July 2016. Allyson Lister ODC-By v1.0 The ODC Public Domain Dedication and Licence is a document intended to allow you to freely share, modify, and use this work for any purpose and without any restrictions. This licence is intended for use on databases or their contents (“data”), either together or individually. The goal is to eliminate restrictions held by the original creator of the data and database on the use of it by others. Rightsholders will not be able to “dual license” their work by releasing the same work under different licences. This is because they have allowed anyone to use the work in whatever way they choose. Rightsholders therefore can’t re-license it under copyright or database rights on different terms because they have nothing left to license. 
ODC Public Domain Dedication and Licence http://opendatacommons.org/licenses/pddl/1.0/, accessed 7 July 2016. Allyson Lister PDDL v1.0 A vendor-specific license is a license which was, at least originally, created by a specific organization to be used just on the resources created within that organization. This is a hierarchy of convenience rather than of shared philosophy. Many of the licenses have since been taken up by other groups whose licensing requirements matched those of the originating organization. Allyson Lister Allyson Lister Vendor-specific License The NIDA NIH Data Access Policy specifies under what legal requirements NIDA NIH data may be accessed. Researchers may gain access to clinical data, genetic analysis data, and DNA by obtaining formal approval from the NIDA Genetic Data Access Request Committee. Allyson Lister Allyson Lister NIDA NIH Data Access Policy The Addgene Terms of Use specifies under what legal requirements Addgene data may be accessed. Data is available for use for non-commercial purposes, but users must get express written consent of Addgene to exploit the data for commercial purposes. All copyright, trademark and other proprietary notices must be retained on the data. As consent must be acquired for commercial use, this license only allows non-commercial use (a separate agreement must be entered into for commercial use). https://www.addgene.org/terms-of-use/, accessed 7 July 2016 Allyson Lister Addgene Terms of Use The CAS Information Use Policy specifies under what legal requirements CAS Information may be accessed. Attribution to ACS must be included whenever creating derivatives or redistributing data covered under this license. Each User is permitted to download and retain a maximum of 5,000 Records and a maximum of 5,000 Molfiles at any given time for personal use or to share within a Project team for the life of the Project. 
There are also limits on how long records may be stored with the licensee, and how records may be linked. http://www.cas.org/legal/infopolicy accessed 7 July 2016. Allyson Lister CAS Information Use Policy The created contents and works provided under this license by CellFinder are subject to the German copyright law. Third-party contributions are marked as such. Reproduction, adaptation, dissemination and any kind of exploitation outside the limits of the copyright require the written consent of the author or creator. Downloads and copies of these pages are only permitted for private, non-commercial use. The operators of these pages aim to observe the copyright of others or will refer to their own or license-free works. http://cellfinder.de/contact/disclaimer/, accessed 7 July 2016. Allyson Lister CellFinder Copyright The LINCS Data Policy is a license which allows redistribution and derivative works as long as the original data is attributed correctly. All investigators are encouraged to publish results based on LINCS data. These results may include, but would not be limited to, integrating LINCS data with data from other sources. LINCS data are released with the sole restriction that they must be correctly cited so that others can establish provenance and access the original data; the correct citation will be released with each data set and will comprise either a PMID/PMCID reference or a unique LINCS identifier. http://www.lincsproject.org/data/data-release-policy/, accessed 7 July 2016. Allyson Lister LINCS Data Policy The GeneNetwork Conditions of Use describes the data licensing for the resource covered. Commercial and non-commercial use is allowed, though attribution is requested via either acknowledgement or co-authorship. Further restrictions on the bulk download of as-yet unpublished data are also described. (While mentioned in the document, software licensing is not the focus of these conditions of use and therefore is not modelled here.) 
http://www.genenetwork.org/conditionsofUse.html, accessed 7 July 2016. Allyson Lister GeneNetwork Conditions of Use The GMD Academic License is a vendor-specific license which allows the access and use of the licensed data for non-commercial purposes, as long as the appropriate attribution is used and copyright notices retained. A separate license agreement is required for commercial users and as such, they are not covered by this license. Without written consent by the GMD, no part of the GMD - in its original or in any way processed or reformatted form - may be re-distributed in any way. Golm Metabolome Database Academic License http://gmd.mpimp-golm.mpg.de/termsconditions.aspx, accessed 7 July 2016. GMD Academic License The GeneProf Academic License is a vendor-specific license which allows the access and use of the licensed resource for non-commercial purposes. http://www.geneprof.org/GeneProf/terms_and_conditions.jsp, accessed 7 July 2016. GeneProf Academic License The European Medicines Agency Copyright states that the Agency is the owner of copyright and other intellectual property rights for documents and other content published on their website. Information and documents made available on the Agency's webpages are public and may be reproduced and/or distributed, totally or in part, irrespective of the means and/or the formats used, for non-commercial and commercial purposes, provided that the Agency is always acknowledged as the source of the material. Such acknowledgement must be included in each copy of the material. Citations may be made from such material without prior permission, provided the source is always acknowledged. The above-mentioned permissions do not apply to content supplied by third parties. Therefore, for documents where the copyright vests in a third party, permission for reproduction must be obtained from this copyright holder. 
http://www.ema.europa.eu/ema/index.jsp?curl=pages/regulation/general/general_content_000178.jsp, accessed 12 July 2016. Allyson Lister European Medicines Agency Copyright miRTaBase Data License is a simple vendor license which allows academic users to make use of the data for free. http://mirtarbase.mbc.nctu.edu.tw/cache/download/LICENSE, accessed 12 July 2016 Allyson Lister miRTaBase Data License GBIF Data Sharing Agreement is a simple vendor license which allows all users to make use of the data for free, as long as the source of the data is properly attributed. http://www.gbif.org/terms/licences/data-sharing, accessed 12 July 2016 Allyson Lister GBIF Data Sharing Agreement IUPAC/InChI-Trust InChI Licence No. 1.0 is a vendor license which allows all users to make use of the resource, including the source code, for free, as long as its source is properly attributed. It is broadly compatible with the GNU GPL v3 and v2 in that it states in the license that you can change the license from this one to the GNU GPL if you wish. IUPAC/InChI-Trust Licence for the International Chemical Identifier (InChI) Software version 1.04 http://www.inchi-trust.org/wp/wp-content/uploads/2014/06/LICENCE.pdf, accessed 12 July 2016 Allyson Lister IUPAC/InChI-Trust InChI Licence No. 1.0 The Labome copyright makes the contents of their resource freely available for browsing. Any redistribution or reproduction of part or all of the contents in any form is prohibited other than the following: you may print or download to a local hard disk extracts for your use only, with a daily limit of 50 webpages; and you may copy the content to individual third parties for their use, but only if you acknowledge the website as the source of the material. You may not, except with our express written permission, distribute or commercially exploit the content. http://www.labome.com/about/copyright.html, accessed 12 July 2016. 
Allyson Lister Labome Copyright The CTD Legal Notice and Terms of Data Use specifies under what legal requirements the CTD data may be accessed. Data is available for use for non-commercial purposes as long as the data is properly attributed, but users must get express written consent of CTD to exploit the data for commercial purposes. As consent must be acquired for commercial use, this license only allows non-commercial use (a separate agreement must be entered into for commercial use). Additionally, you must notify CTD and describe your use of their data. For quality control purposes, you must provide CTD with periodic access to your publication of their data. Comparative Toxicogenomics Database Legal Notice and Terms of Use http://ctdbase.org/about/legal.jsp, accessed 12 July 2016 Allyson Lister CTD Legal Notice and Terms of Data Use LOINC RELMA Terms of Use is a vendor license which allows all users to make use of the data for free, as long as the source of the data is properly attributed. An unlimited number of copies of the licensed material may be used. https://loinc.org/terms-of-use/, accessed 12 July 2016 Allyson Lister LOINC RELMA Terms of Use MIACA Full Copyright is a vendor-specific license which allows the document it references to be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. 
However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the MIACA Standards Initiative or other organizations, except as needed for the purpose of developing MIACA Standards Initiative Recommendations in which case the procedures for copyrights defined in the MIACA Document process must be followed, or as required to translate it into languages other than English. http://miaca.sourceforge.net/copyrightNotice.txt, accessed 12 July 2016 Allyson Lister MIACA Full Copyright The ORCID MIT-style license is identical to the MIT License, except for the addition of the following two clauses: 1. "The above license does not apply to the branding or the "look and feel" of the websites located at the orcid.org URL even if elements thereof are contained in the Software."; 2. "Except to provide the copyright notice required above or as allowed under ORCID Inc.'s Trademark Use Policy (available at http://orcid.org, under "Policies"), you may not use the name of ORCID, Inc., ORCID, its marks and logo, to advertise, promote or suggest any affiliation with or endorsement by ORCID, Inc. in connection with your use of the Software.". The MIT License is a lax, permissive non-copyleft free software license, compatible with the GNU GPL. https://github.com/ORCID/ORCID-Source/blob/master/LICENSE.md, accessed 12 July 2016; https://www.gnu.org/licenses/license-list.html, accessed 28 June 2016. ORCID MIT-Style License (MIT) The NIH Genomic Data Sharing Policy specifies under what legal requirements NIH genomic data may be accessed. Access to human data is through a tiered model involving unrestricted- and controlled-data access mechanisms. Requests for controlled-access data are reviewed by NIH Data Access Committees (DACs). https://gds.nih.gov/PDF/NIH_GDS_Policy.pdf Allyson Lister NIH Genomic Data Sharing Policy The Facebase Data Access Policy specifies under what legal requirements Facebase data may be accessed. 
Non-sensitive data is available for use for all purposes and usages. As consent must be acquired for sensitive (or closed) data, this license only covers the use of the open data (a separate agreement must be entered into for restricted access data). https://www.facebase.org/methods/policies/, accessed 12 July 2016 Allyson Lister Facebase Data Access Policy The ORCID Terms of Use is a vendor-specific license for describing how the data in the ORCID resource may be used. ORCID data can be sublicensed, reproduced, stored, transmitted, distributed, publicly performed and publicly displayed for non-commercial and commercial uses. http://orcid.org/content/orcid-terms-use, accessed 12 July 2016 Allyson Lister ORCID Terms of Use The PPDB Academic License specifies under what legal requirements PPDB data may be accessed. Data is available for use for non-commercial purposes, but users must get express written consent of PPDB to exploit the data for commercial purposes. The resource will be used for teaching or not-for-profit research purposes only. The resource will not be further distributed to others. The recipient agrees to acknowledge the source of the material in any publication reporting its use. As consent must be acquired for commercial use, this license only allows non-commercial use (a separate agreement must be entered into for commercial use). Plant Promoter Database Academic License http://ppdb.agr.gifu-u.ac.jp/ppdb/cgi-bin/license.cgi, accessed 12 July 2016 Allyson Lister PPDB Academic License The SciCrunch Terms and Conditions specifies under what legal requirements SciCrunch data may be accessed. Broadly speaking, the site conforms, both for data and for the site documents themselves, to the CC BY 3.0 license. However, particular details for this resource (other than attribution requirements) are unclear. https://neuinfo.org/page/terms, accessed 12 July 2016. 
Allyson Lister SciCrunch Terms and Conditions The ProDom Commercial License is a vendor-specific license which allows the access and use of the licensed data for commercial purposes for a fee. Non-commercial use does not require licensing or a fee, and is covered by a separate license. http://prodom.prabi.fr/prodom/current/html/downcom.php, accessed 12 July 2016. ProDom Commercial License The SBGN Open License with Attribution is a very simple statement of openness for the SBGN standard and related resources available on its website. No one—not the principal investigators, nor the SBGN Editors, nor the members of the SBGN Scientific Committee, nor the funding agencies or anyone else—owns SBGN; it is a free and open community effort that extends beyond any single group, and they view themselves only as organizers and fellow developers. Systems Biology Graphical Notation Open License with Attribution http://www.sbgn.org/About, accessed 12 July 2016. Allyson Lister SBGN Open License with Attribution The NLM Open License with Attribution is a vendor-specific license which is similar to public domain, but which requests attribution. Government information at NLM Web sites is in the public domain. Public domain information may be freely distributed and copied, but it is requested that in any subsequent use the National Library of Medicine (NLM) be given appropriate acknowledgement. When using NLM Web sites, you may encounter documents, illustrations, photographs, or other information resources contributed or licensed by private individuals, companies, or organizations that may be protected by U.S. and foreign copyright laws. Transmission or reproduction of protected items beyond that allowed by fair use as defined in the copyright laws requires the written permission of the copyright owners. Specific NLM Web sites containing protected information provide additional notification of conditions associated with its use. 
National Library of Medicine Open License with Attribution https://www.nlm.nih.gov/copyright.html, accessed 12 July 2016. Allyson Lister NLM Open License with Attribution The RegenBase Terms of Use is a vendor-specific license which requires attribution and which does not meet the requirements for open source code, as there is a statement saying that the user will NOT translate, reverse engineer, decompile or disassemble the SYSTEM, or disclose the SYSTEM or any underlying information or technology to any third party. http://regenbase.org/terms-of-use.html, accessed 12 July 2016 Allyson Lister RegenBase Terms of Use The SABIO-RK Non-Commercial Purpose License covers use of the Database for Non-Commercial Purpose only, and appropriate attribution must be given. Non-Commercial Purpose means the use of the Database solely for internal non-commercial research and academic purposes. As consent must be acquired for commercial use, a separate agreement must be entered into for commercial use and is not covered here. http://sabiork.h-its.org/layouts/content/termscondition.gsp, accessed 13 July 2016 Allyson Lister SABIO-RK Non-Commercial Purpose License The Terms of Use for EMBL-EBI Services reflect EMBL-EBI’s commitment to OpenScience through its mission to provide freely available online services, databases and software relating to data contributed from life science experiments to the largest possible community. They impose no additional constraints on the use of the contributed data than those provided by the data owner.
EMBL-EBI expects attribution (e.g. in publications, services or products) for any of its online services, databases or software in accordance with good scientific practice. The expected attribution will be indicated on the appropriate web page. http://www.ebi.ac.uk/about/terms-of-use, accessed 13 July 2016. Allyson Lister Terms of Use for EMBL-EBI Services The UCUM Terms of Use is a vendor-specific license which allows the resource it references to be used for commercial or non-commercial purposes without restriction of any kind, provided that appropriate attribution is given. However, the resource itself may not be modified in any way. Users may make and distribute an unlimited number of copies of the Licensed Materials. Each copy thereof must include the Copyright Notice and License. The Unified Code for Units of Measure Terms of Use http://unitsofmeasure.org/trac/wiki/TermsOfUse, accessed 13 July 2016 Allyson Lister UCUM Terms of Use The UMLS Metathesaurus License is a vendor-specific license which allows redistribution (as part of a larger computer application) as long as the original data is attributed correctly and a summary of data use provided each year. There is no cost to use the service, and may be used both commercially and non-commercially. Users must inform the resource if it is redistributed within a computer application. https://uts.nlm.nih.gov/license.html, accessed 13 July 2016. Allyson Lister UMLS Metathesaurus License The MHAS Data Policy is a vendor-specific license which requires that users attribute the data properly. The data is accessible free of charge to all commercial and non-commercial users. http://www.mhasweb.org/Data.aspx, accessed 13 July 2016. Allyson Lister MHAS Data Policy The ALFRED Copyright is a vendor-specific license which states that the resource is freely available to the scientific community for statistical analysis, the only condition being that attribution is required. 
The ALlele FREquency Database Copyright https://alfred.med.yale.edu/alfred/fullcopyrightpage.asp, accessed 13 July 2016. Allyson Lister ALFRED Copyright Commercial use only is a usage restricted clause which restricts the use of the licensed resource only to licensees who are commercial entities. Allyson Lister Allyson Lister Allyson Lister: Some licenses are expressly written for their commercial users. In such cases, a vendor will normally have multiple licenses, one for non-commercial use and one for commercial use. Commercial use only Fee-Based Commercial License is a license which, among its license clauses, states that commercial entities may use the resource for a fee. Allyson Lister Allyson Lister Allyson Lister: This is a convenience class which combines multiple license clauses to create a defined class, such that any license type which meets the requirements will be inferred to fall into this category. Fee-Based Commercial License The BRENDA License is a vendor-specific license which allows access to the official website and its data free of charge. Any inclusion of BRENDA components into other data bases, or redistribution of BRENDA requires a license. To obtain commercial in-house versions please contact our distributor Biobase GmbH (http://www.biobase-international.com). In-house versions for academic users are available from Enzymeta GmbH (http://www.enzymeta.de). The copyright of the printed version is held by Springer publishers. Users of the website may retrieve a copy of the data but must not alter the BRENDA data or charge money for a copy of it, and must not distribute any BRENDA data without prior permission of the copyright holder. As there are separate licenses for in-house versions of BRENDA, please note that this license only describes the website-accessible version of BRENDA. http://www.brenda-enzymes.org/copy.php, accessed 13 July 2016. 
Allyson Lister BRENDA License The TCGA Data Use Certification is a vendor-specific license which specifies the certification which must be officially agreed to before data may be accessed. There are a number of requirements in this license including attribution, non-transferability, annual progress updates, and a number of security measures. The Cancer Genome Atlas Data Use Certification http://cancergenome.nih.gov/pdfs/Data_Use_Certv082014, accessed 13 July 2016 Allyson Lister TCGA Data Use Certification Non-Commercial No-Fee License is a license which, among its license clauses, states that non-commercial entities may use the resource without a purchase or usage cost. Allyson Lister Allyson Lister Allyson Lister: This is a convenience class which combines multiple license clauses to create a defined class, such that any license type which meets the requirements will be inferred to fall into this category. Non-Commercial No-Fee License DrugBank is offered to the public as a freely available resource. Use and re-distribution of the data, in whole or in part, for commercial purposes (including internal use) requires a license. We ask that users who download significant portions of the database cite the DrugBank paper in any resulting publications. As consent must be acquired for commercial use, this license only describes the non-commercial aspects of data usage (a separate agreement must be entered into for commercial use). http://www.drugbank.ca/about, accessed 13 July 2016 Allyson Lister DrugBank Academic License The UCSC Genome Browser Commercial Software License is a vendor-specific license which allows the access and use of the licensed data for commercial purposes under specific restrictions for a fee. The licensee shall not have the right to create Derivative Works of the Licensed Products, and the Licensee shall not have the right to sublicense, sell, transfer, or assign the Licensed Products. 
Non-commercial use does not require licensing or a fee, and is covered by a separate license. http://genome.cse.ucsc.edu/license/gblicense.pdf, accessed 13 July 2016. UCSC Genome Browser Commercial Software License Permission required for commercial use is a clause which restricts the use of the licensed resource only to licensees who are commercial entities which have received explicit permission in the manner stipulated by the resource. Allyson Lister Allyson Lister Allyson Lister: Some licenses are expressly written for their commercial users. In such cases, a vendor will normally have multiple licenses, one for non-commercial use and one for commercial use. Permission required for commercial use Permission required for derivative work is a clause which restricts the production of derivative work based on the licensed resource only to licensees who have received explicit permission in the manner stipulated by the resource. Allyson Lister Allyson Lister Permission required for derivative work Data are to be used only for purposes of research and education. Commercial use is prohibited without written permission from Xenbase. The contributor(s) of unpublished data should be explicitly acknowledged in any publication that incorporates, or is based in part or entirely on these data. Xenbase should be explicitly acknowledged when data used in any publication, has been in part or solely derived from analysis of data from this website. As consent must be acquired for commercial use, this license only describes the non-commercial aspects of data usage (a separate agreement must be entered into for commercial use). http://www.xenbase.org/other/static/aboutXenbase.jsp, accessed 13 July 2016 Allyson Lister Xenbase Academic Conditions of Use The RIKEN copyright is a vendor-specific license which states that all text, photographs, diagrams, and other materials on their website are copyrighted by RIKEN unless explicitly specified otherwise on the website. 
It is prohibited to use, reproduce, or modify any of the material without RIKEN's permission. http://www.riken.jp/en/terms/, accessed 14 July 2016 Allyson Lister RIKEN Copyright The Design Science License is a license intended to be a general "copyleft" that can be applied to any kind of work that has protection under copyright. This license states those certain conditions under which a work published under its terms may be copied, distributed, and modified. Unlike other open source licenses, the DSL was intended to be used on any type of copyrightable work, including documentation and source code. It was the first "generalized copyleft" license. This is a free and copyleft license meant for general data. The GNU Project does not recommend it for software or documentation, since it is incompatible with the GNU GPL and with the GNU FDL; however, they consider it appropriate for other kinds of data. The DSL was written by Michael Stutz. The DSL came out in the 1990s, before the formation of the Creative Commons. According to Wikipedia, once the Creative Commons arrived, Stutz considered the DSL experiment "over" and no longer recommended its use. https://www.gnu.org/licenses/dsl.html, accessed 13 July 2016; https://www.gnu.org/licenses/license-list.html, accessed 13 July 2016; https://en.wikipedia.org/wiki/Design_science_license, accessed 13 July 2016. Allyson Lister Design Science License The University of Concordia Terms of Use is a vendor-specific license which states that no document appearing on this website or any other website owned, operated or controlled by Concordia may be copied, sold, reproduced, republished, downloaded, posted, transmitted or distributed by any means with the exception of downloading or printing the contents of the site for personal, non-commercial use. This use must bear in mind that Concordia reserves its copyright and its rights to exclusivity over the material. 
http://www.concordia.ca/web/terms.html, accessed 14 July 2016 Allyson Lister University of Concordia Terms of Use All OME formats and software are freely available, and all OMERO and Bio-Formats source code is available under GNU public "copyleft" licenses or through commercial license from Glencoe Software. FLIMfit is also now available under a GNU public "copyleft" license. For questions about the GNU license refer to the Frequently Asked Questions about the GNU Licenses page. Bio-Formats - under the terms of the GNU public "copyleft" license, any software package linking to Bio-Formats, either directly or indirectly, cannot be distributed unless its source code is also made available under the terms of the GPL. Some components which provide reader and writer implementations for open file formats, are released under a more permissive BSD-2 license which enables non-GPL third party software. For a complete list of which file formats are included in the BSD license, see the BSD column of the supported formats table. Developers of non-GPL software wishing to leverage Bio-Formats components not covered by the BSD license may purchase a commercial license from Glencoe Software; please contact them at bioformats@glencoesoftware.com to discuss your requirements. The core OME Model Schema files (.XSD) use the Creative Commons Attribution 4.0 International License, and as such you are free to share or adapt them as long as you attribute the OME consortium as follows; "This work is derived in part from the OME specification. Copyright (C) 2002-2016 Open Microscopy Environment". See further information about using OME-XML in your work. This website is also covered by the Creative Commons Attribution 4.0 International License - you are free to share or adapt content as long as you credit the Open Microscopy Environment. The exception to this is the screenshots and videos which can only be used for non-commercial purposes, see Attributions below. 
http://www.openmicroscopy.org/site/about/licensing-attribution/licensing, accessed on 23 June 2016. Allyson Lister OME Software Conditions of Use The ImmPort Conditions of Use is a vendor-specific license which allows the use of ImmPort data for any legal purpose except for those prohibited elsewhere in this agreement. There are very few prohibitions, and no requirement for attribution. https://aspera-immport.niaid.nih.gov:9443/displayAgreement, accessed 14 July 2016 Allyson Lister ImmPort Conditions of Use Distribution with notices is a distribution clause in which distribution is unrestricted, except that all distributions must retain certain licence information (e.g., copyright notices). Andy Brown Andy Brown Distribution with notices This is a lax, permissive non-copyleft free software license, compatible with the GNU GPL. It is sometimes ambiguously referred to as the MIT License. For substantial programs it is better to use the Apache 2.0 license since it blocks patent treachery. Expat License https://www.gnu.org/licenses/license-list.html, accessed 28 June 2016. MIT License The GNU Lesser General Public License (LGPL) is a free software license published by the Free Software Foundation (FSF). The license allows developers and companies to use and integrate LGPL software into their own (even proprietary) software without being required by the terms of a strong copyleft license to release the source code of their own software-parts. The license requires that only the LGPL software-parts be modifiable by end-users via source code availability. GNU Lesser General Public License http://en.wikipedia.org/wiki/GNU_Lesser_General_Public_License, accessed 27 March 2015. Allyson Lister GNU LGPL The Artistic License v1.0 is not considered a free software license by the definition of the GNU Project because it is too vague; some passages are "too clever for their own good", and their meaning is not clear. 
The GNU Project recommends that it is not used except as part of the disjunctive license of Perl. http://www.gnu.org/licenses/license-list.html, accessed 13 June 2013. Allyson Lister Andy Brown Artistic License 1.0 The Eclipse Public License is similar to the Common Public License. This is a free software license. Unfortunately, its weak copyleft and choice of law clause make it incompatible with the GNU GPL. The only change is that the EPL removes the broader patent retaliation language regarding patent infringement suits specifically against Contributors to the EPL'd program. Eclipse Public License Version 1.0 https://www.gnu.org/licenses/license-list.html, accessed 28 June 2016. EPL v1 The Modified BSD is a software license based on the original FreeBSD license, modified by removal of the advertising clause. It is a lax, permissive non-copyleft free software license, compatible with the GNU GPL. This license is sometimes referred to as the 3-clause BSD license. According to the GNU Project, the modified BSD license is not bad, as lax permissive licenses go, though the Apache 2.0 license is preferable. However, it is risky to recommend use of “the BSD license”, even for special cases such as small programs, because confusion could easily occur and lead to use of the flawed original BSD license. To avoid this risk, you can suggest the X11 license instead. The X11 license and the modified BSD license are more or less equivalent. According to the GNU project, the Apache 2.0 license is better for substantial programs, since it prevents patent treachery. 3 clause BSD License http://www.gnu.org/licenses/license-list.html, accessed June 12, 2013. Allyson Lister Andy Jones Modified BSD Latex Project Public License LPPL OPL v1.0 is not a free software license according to the definition of the GNU Project because it requires sending every published modified version to a specific initial developer. 
Open Public License Version 1.0 http://www.gnu.org/licenses/license-list.html, accessed 12 June 2013. OPL v1.0 Mozilla Public License MPL Development status is an information content entity which indicates the maturity of a software entity within the context of the software life cycle. Allyson Lister Andy Brown Development status Alpha is a development status which is applied to software by the developer/publisher during initial development and testing. Software designated alpha is commonly unstable and prone to crashing. It may or may not be released publicly. Modified from http://en.wikipedia.org/wiki/Software_release_life_cycle, accessed 11 June 2013. Allyson Lister alpha Beta is a development status which is generally applied to software by the developer/publisher once the majority of features have been implemented, but when the software may still contain bugs or cause crashes or data loss. Software designated beta is often released publicly, either on a general release or to a specific subset of users called beta testers. Modified from http://en.wikipedia.org/wiki/Software_release_life_cycle, accessed 11 June 2013. Allyson Lister beta A release candidate (RC) is a beta version with potential to be a final product, which is ready to release unless significant bugs emerge. http://en.wikipedia.org/wiki/Software_release_life_cycle#Release_candidate, accessed 22 October 2014. Release candidate Live is a development status which is applied to software that has been designated as suitable for production environments by the developer/publisher. If a non-free product, software at this stage is available for purchase. Production Allyson Lister, Jon Ison, and Modified from http://en.wikipedia.org/wiki/Software_release_life_cycle, accessed 11 June 2013. Allyson Lister Live Software is no longer being supplied by the developers/publishers. Obsolete An updated version of the software is available. Andy Brown Superseded The first release of a piece of software. 
A first release does not imply any particular level of maturity other than that this is the first instance of this software to be considered released. James Malone First release The latest release of a piece of software. This does not imply any levels of maturity other than indicating this is the most recent release. James Malone Latest release Software has developers actively maintaining it (fixing bugs). Andy Brown Maintained matrix manipulation spreadsheet editing document outlining image compression ontology engineering word processing simulation and analysis of biochemical networks Manage laboratory information, such as is commonly performed by LIMS software. laboratory information management The objective of performing the actions of an operating system, such as managing the software running on a computer and its interactions with system resources and hardware. manage computer operations Text editing is the objective of editing plain text files. text editing Renders a file in such a way that its contents can be understood by users. file rendering annotation editing biological data processing molecular sequence analysis Andy Brown citation management Allyson Lister sequence alignment Allyson Lister multiple sequence alignment Allyson Lister pairwise sequence alignment An averaging objective is a data transformation objective where the aim is to perform mean calculations on the input of the data transformation. James Malone averaging A center calculation objective is a data transformation objective where the aim is to calculate the center of an input data set. James Malone center calculation A class discovery objective (sometimes called unsupervised classification) is a data transformation objective where the aim is to organize input data (typically vectors of attributes) into classes, where the number of classes and their specifications are not known a priori. Depending on usage, the class assignment can be definite or probabilistic. 
James Malone class discovery A class prediction objective (sometimes called supervised classification) is a data transformation objective where the aim is to create a predictor from training data through a machine learning technique. The training data consist of pairs of objects (typically vectors of attributes) and class labels for these objects. The resulting predictor can be used to attach class labels to any valid novel input object. Depending on usage, the prediction can be definite or probabilistic. A classification is learned from the training data and can then be tested on test data. PERSON: Elisabetta Manduchi PERSON: James Malone class prediction A correction is where the aim is to correct for error, noise or other impairments to the input of the data transformation or derived from the data transformation itself. PERSON: James Malone PERSON: Melanie Courtot correction A background correction is where the aim is to remove irrelevant contributions from the measured signal, e.g. those due to instrument noise or sample preparation. PERSON: Elisabetta Manduchi James Malone background correction An error correction is a data transformation objective where the aim is to remove (correct for) erroneous contributions arising from the input data, or the transformation itself. PERSON: James Malone James Malone error correction cross validation curve fitting is a data transformation in which the aim is to find a curve which matches a series of data points and possibly other constraints. PERSON: Elisabetta Manduchi Elisabetta Manduchi James Malone curve fitting A normalization is a data transformation objective where the aim is to remove systematic sources of variation to put the data on equal footing in order to create a common base for comparisons. 
PERSON: Elisabetta Manduchi PERSON: Helen Parkinson PERSON: James Malone James Malone data normalization A decision tree induction objective is a data transformation objective in which a tree-like graph of edges and nodes is created and from which the selection of each branch requires that some type of logical decision is made. PERSON: James Malone James Malone decision tree induction mean calculation A descriptive statistical calculation objective is a data transformation objective which concerns any calculation intended to describe a feature of a data set, for example, its center or its variability. James Malone PERSON: Elisabetta Manduchi PERSON: James Malone PERSON: Melanie Courtot PERSON: Monnie McGee descriptive statistical calculation differential expression analysis A feature extraction objective is a data transformation objective where the aim of the data transformation is to generate quantified values from a scanned image. feature extraction The process of publishing software. Typically this process involves a software publisher performing acts such as licensing, marketing, publicising and/or providing support for a product. Andy Brown software publishing process The process of developing software, typically involving the design, implementation and testing of software. Andy Brown AL 24.9.19: Merged the now-obsolete http://www.ebi.ac.uk/swo/objective/SWO_4000007 into this class as they were describing the same concept. See also https://github.com/allysonlister/swo/issues/2 software development A data mining algorithm is an algorithm that has as its objective a data mining task and outputs as a result a generalization specified by a generalization specification. data mining algorithm Clustering algorithm is a data mining algorithm that solves a clustering task and as a result produces a clustering. clustering algorithm Estimating the (Joint) Probability Distribution. 
A set of data (of type T) is often assumed to be a sample taken from a population according to a probability distribution. A probability distribution/density function assigns a non-negative probability/density to each object of type T. Probably the most general data mining task (Hand et al. 2001) is the task of estimating the (joint) probability distribution D over type T from a set of data items or a sample drawn from that distribution. As mentioned above, in the most typical case we would have T = Tuple(T1, . . ., Tk), where each of T1, . . ., Tk is Boolean, Discrete(S) or Real. We talk about the joint probability distribution to emphasize the difference to the marginal distributions of each of the variables of type T1, . . ., Tk: the joint distribution captures the interactions among the variables. Representing multi-variate distributions is a non-trivial task. Two approaches are commonly used in data mining. In the density-based clustering paradigm, mixtures of multi-variate Gaussian distributions are typically considered (Hand et al. 2001). Probabilistic graphical models, most notably Bayesian networks, represent graphically the (in)dependencies between the variables: Learning their structure and parameters is an important approach to the problem of estimating the joint probability distribution. probability distribution estimation task The task of pattern discovery is to find all local patterns from a given pattern language that satisfy the required conditions. A prototypical instantiation of this task is the task of finding frequent itemsets (sets of items, such as {bread, butter}), which are often found together in a transaction (e.g., a market basket) (Agrawal et al. 1993). The condition that a pattern (itemset) has to satisfy in this case is to appear in (hold true for) a sufficiently high proportion (called support and denoted by s) of the transactions in the input dataset. 
With the increasing interest in mining complex data, mining frequent patterns is also considered for structured data. We can thus talk about mining frequent subsequences or mining frequent subgraphs in sequence or graph data. We can consider as frequency the multiple occurrences of a pattern in a single data structure (e.g., sequence or graph) or the single occurrences of a pattern in multiple data structures. pattern discovery task A data mining task is an objective specification that specifies the objective that a data mining algorithm needs to achieve when executed on a dataset to produce as output a generalization. data mining task Predictive modeling algorithm is a data mining algorithm that solves a predictive modeling task and as a result produces a predictive model. predictive modeling algorithm Pattern discovery algorithm is a data mining algorithm that solves a pattern discovery task and as a result produces a set of patterns. pattern discovery algorithm Learning a (Probabilistic) Predictive Model. In this task, we are given a dataset that consists of examples of the form (d, c), where each d is of type Td and each c is of type Tc. We will refer to d as the description and c as the class or target. To learn a predictive model means to find a mapping from the description to the target, m :: Td → Tc, that fits the data closely. This means that the observed target values and the target values predicted by the model, i.e., c and ĉ = m(d), have to match closely. predictive modeling task probability distribution estimation algorithm Clustering. Clustering in general is concerned with grouping objects into classes of similar objects (Kaufman and Rousseeuw 1990). Given a set of examples (object descriptions), the task of clustering is to partition these examples into subsets, called clusters. clustering task Ensemble algorithms are algorithms that generate an ensemble when executed on a dataset. 
The ensemble algorithms include a specification of single generalization algorithms which are executed in order to produce the single generalizations that compose the ensemble. Examples of ensemble algorithms include bagging, boosting, stacking, random forests, random subspaces, bagging of random subspaces etc. For example, a bagging of decision trees algorithm includes the specification of a bagging algorithm and a decision tree algorithm for generating the base models composing the ensemble. ensemble algorithm Single generalization algorithms are data mining algorithms that, given a dataset as input, produce a generalization as output. single generalization algorithm
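The relationship between an ensemble algorithm and its single generalization algorithm can be illustrated with a minimal Python sketch of bagging. This is an illustration only, not part of the ontology: the names train_stump and bagging are hypothetical, and a one-feature decision stump stands in for a full decision tree algorithm to keep the example short.

```python
import random
from collections import Counter


def train_stump(examples):
    """Single generalization algorithm (hypothetical stand-in for a decision tree).

    examples: list of (x, label) pairs, x a float and label in {0, 1}.
    Returns a predictor that thresholds x at the best observed value.
    """
    best = None
    for t in sorted({x for x, _ in examples}):
        # Try both orientations of the split at threshold t.
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x < t else right) != y for x, y in examples)
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x < t else right


def bagging(examples, base_learner, n_models, seed=0):
    """Ensemble algorithm: train base_learner on bootstrap samples, then vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(examples) items with replacement.
        sample = [rng.choice(examples) for _ in examples]
        models.append(base_learner(sample))

    def predict(x):
        # Majority vote over the single generalizations in the ensemble.
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]

    return predict


# Toy dataset (made up for illustration): small x values are class 0, large are class 1.
data = [(0.0, 0), (1.0, 0), (2.0, 0), (3.0, 0),
        (7.0, 1), (8.0, 1), (9.0, 1), (10.0, 1)]
ensemble = bagging(data, train_stump, n_models=11)
print(ensemble(0.5), ensemble(10.5))
```

Passing the base learner as an argument mirrors the definition above: the ensemble algorithm carries a specification of the single generalization algorithm it executes, so a different base learner could be swapped in without changing the bagging procedure.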