POWLA, OWL2/DL edition
Potsdamer Austauschformat Linguistischer Annotationen (PAULA)
POWLA

POWLA provides an OWL2/DL implementation of a generic data model for linguistic annotation (Chiarcos 2012a, 2012b). The immediate predecessor of POWLA is the PAULA data model (Chiarcos et al., 2008) and the associated XML standoff format (Götze et al., 2005; Dipper, 2005). PAULA (and POWLA) implements the data model of the Linguistic Annotation Framework (LAF) as described by Ide and Romary (2004). POWLA is thus semantically equivalent to the LAF data model and the GrAF format (Ide and Suderman, 2007), i.e., ISO 24612:2012.

History:
2018-04-03 deprecate powla:rootOfDocument in favor of powla:hasLayer (partial inverse)
2018-04-01 deprecate powla:endPosition, powla:startPosition in favor of powla:end, powla:start; deprecate powla:hasMetadata in favor of a generalization of powla:hasAnnotation
2018-03-27 deprecate powla:nextNode and powla:previousNode in favor of powla:next, powla:previous
2012-02-23 initial release of the OWL/DL implementation
2008-05-15 abstract data model published (PAULA 1.1, Chiarcos et al. 2008)

References:
Chiarcos, C. (2012a). Interoperability of corpora and annotations. In Chiarcos, C. et al. (eds.), Linked Data in Linguistics (pp. 161-179). Springer, Berlin, Heidelberg.
Chiarcos, C. (2012b). POWLA: Modeling linguistic corpora in OWL/DL. In Extended Semantic Web Conference (pp. 225-239). Springer, Berlin, Heidelberg.
Chiarcos, C., Dipper, S., Götze, M., Leser, U., Lüdeling, A., Ritz, J., & Stede, M. (2008). A flexible framework for integrating annotations from different tools and tagsets. Traitement Automatique des Langues, 49(2), 271-293.
Dipper, S. (2005). XML-based Stand-off Representation and Exploitation of Multi-Level Linguistic Annotation. In Berliner XML Tage (pp. 39-50).
Götze, M., Skopeteas, S., Roloff, T., & Stoel, R. (2005). Towards a cross-linguistic production data archive: Structure and exploration.
In International Tbilisi Symposium on Logic, Language, and Computation (pp. 127-138). Springer, Berlin, Heidelberg.
Ide, N., & Romary, L. (2004). International standard for a linguistic annotation framework. Natural Language Engineering, 10(3-4), 211-225.
Ide, N., & Suderman, K. (2007). GrAF: A graph-based format for linguistic annotations. In Proceedings of the Linguistic Annotation Workshop (pp. 1-8). Association for Computational Linguistics.

OPTIONAL PROPERTY for expressing hierarchical annotations with coverage inheritance, e.g., in a tree annotation. Inverse of powla:hasParent, see there for details.

OPTIONAL object property. hasLayer assigns a Relation or a Node to an annotation layer. hasLayer is recommended for Root nodes; a root can have at most one layer.

RECOMMENDED PROPERTY for expressing hierarchical annotations with coverage inheritance, e.g., in a tree annotation. Coverage inheritance means that the string covered by the children must also be covered by the parent node. A typical example is phrase-structure syntax. A typical counter-example is dependency syntax.

OPTIONAL object property. powla:hasRoot is useful for efficient querying, as it allows quickly checking whether two nodes are part of the same tree structure (i.e., they have the same root). However, this is a derived property and SHOULD NOT be provided when exchanging POWLA datasets. This is an optional shorthand for ?node powla:hasParent+ ?root MINUS {?root powla:hasParent [] }

RECOMMENDED datatype property for expressing annotated relations with powla:Relation.

RECOMMENDED datatype property for expressing annotated relations with powla:Relation.

OPTIONAL datatype property for expressing annotated relations with powla:Relation. Inverse of powla:hasSource.

OPTIONAL datatype property for expressing annotated relations with powla:Relation. Inverse of powla:hasTarget.

OBLIGATORY PROPERTY for connecting two powla:Nodes in a sequence.
Note that powla:next is not transitive; it should connect adjacent nodes only. powla:next can be used to connect subsequent strings in *any* annotation, but when applied to an annotation layer with hierarchical annotation (e.g., a syntactic tree), it is recommended to use powla:next to express the order of nodes with the same parent (e.g., elements of a phrase). Note that sibling order may deviate from string order in the case of discontinuous annotations:

[ [What]_PP1 are [ [you]_SBJ talking [about]_PP2 ]_VP ] ?

In the example, the prepositional argument [about what]_PP is split ("preposition stranding"). Depending on the type of syntactic representation, it may be desirable to represent this as a single phrase, and POWLA allows specifying the canonical order of phrase elements regardless of their sequential order in the textual representation, i.e., about > what.

Note that powla:next MUST be cycle-free, so antisequential order SHOULD NOT be applied to externally provided URIs, e.g., NIF URIs or RFC 5147 URIs. As an alternative, each of these SHOULD be assigned a blank node as its powla:hasParent, and these blank nodes are then connected by powla:next.

All children of the same parent node SHOULD be connected by a sequence of powla:next transitions. Non-siblings CAN be connected by powla:next, but this is not recommended except in the absence of hierarchical annotations. Also note that a single powla:Node MAY have multiple powla:next properties *relative to different parent nodes*. These can be disambiguated with reference to the parent node.

DEPRECATED: replaced by powla:next.

OPTIONAL PROPERTY for connecting two powla:Nodes in a sequence. Inverse of powla:next, see there for details.

DEPRECATED in favor of powla:previous.

DEPRECATED object property connecting a DocumentLayer to the roots in the document.
Deprecated in favor of powla:hasLayer (a partial inverse).

OPTIONAL datatype property for powla:Node. RECOMMENDED datatype property for powla:Terminal. powla:end specifies a numerical (integer) index. This may be an offset (as in NIF) or a structure-sensitive index (as in ANNIS). Interpretation is implementation-specific, but end >= start.

DEPRECATED in favor of powla:end.

2018-04-01: generalized to metadata annotation. Corresponds to labels attached to nodes and edges in SALT. ABSTRACT datatype property for linguistic annotations (on Node and Relation) and metadata (on Document and Layer). hasAnnotation represents linguistic annotations. The attribute name is specified in the hasXY property; the value is the string value.

DEPRECATED in favor of (a generalization of) powla:hasAnnotation. Like hasAnnotation, but for Document and Layer. Corresponds to labels attached to Graphs in SALT.

DEPRECATED in favor of powla:string.

OPTIONAL datatype property for powla:Node. RECOMMENDED datatype property for powla:Terminal. powla:start specifies a numerical (integer) index. This may be an offset (as in NIF) or a structure-sensitive index (as in ANNIS). Interpretation is implementation-specific, but end >= start.

DEPRECATED in favor of powla:start.

RECOMMENDED datatype property for powla:Terminal. OPTIONAL datatype property for powla:Node. powla:string carries the string value of the tokens to which this annotation unit (node) applies. For empty strings, this must be left unspecified. Corresponds to nif:anchor.

OPTIONAL type. A Corpus is a Document without superDocument.

OPTIONAL type. A document is a(n annotated piece of) primary data as defined by the respective use case. A document can aggregate other documents (e.g., the Bible consists of several books); we thus extend the notion of document to collections of documents. powla:Document roughly corresponds to SALT Graph and PAULA anno sets.
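The constraint on powla:start and powla:end stated above (end >= start) can be checked mechanically. The following is a minimal sketch, assuming NIF-style character offsets and a plain in-memory dictionary in place of an RDF store; the names `check_spans`, `order_by_start` and the sample data are hypothetical and not part of POWLA.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for terminals: id -> (start, end, string).
# Offsets are assumed to be character offsets; POWLA itself leaves the
# interpretation of powla:start / powla:end implementation-specific.
Span = Tuple[int, int, str]


def check_spans(terminals: Dict[str, Span]) -> List[str]:
    """Return the ids of all terminals that violate end >= start."""
    return [tid for tid, (start, end, _) in sorted(terminals.items())
            if end < start]


def order_by_start(terminals: Dict[str, Span]) -> List[str]:
    """Return terminal ids sorted by (start, end), i.e., in string order."""
    return sorted(terminals, key=lambda tid: terminals[tid][:2])


terminals = {
    "t1": (0, 4, "What"),
    "t2": (5, 8, "are"),
    "t3": (9, 12, "you"),
    "bad": (20, 15, "oops"),  # deliberately invalid: end < start
}

invalid = check_spans(terminals)
valid = {tid: span for tid, span in terminals.items() if tid not in invalid}
print(invalid)
print(order_by_start(valid))
```

With a structure-sensitive index (as in ANNIS) the same check applies, but sorting by start would then reflect the index order rather than character positions.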
OPTIONAL type. Layer within a document.

DEPRECATED object property. Dominance relations are labeled (reified) relations with coverage inheritance, i.e., the target node covers all terminals of the source node(s). Dominance relations are typical for phrase structure syntax. This property, originally motivated by PAULA, is deprecated in POWLA, as dominance relations are sufficiently described by powla:Relation and powla:hasParent. Dominance relations are not actually necessary: a Relation is a DominanceRelation if it coincides with a hasChild property.

OPTIONAL type. A layer describes a group of annotations; this may be either within a document or independent of a particular document.

OPTIONAL type. Markable layers are visualized in a special way. With the current modelling, the status of a layer as a markable layer can be inferred and does not need to be asserted.

RECOMMENDED type. Nodes represent units of linguistic annotation ("markables"). Typical nodes have a defined extension or a position in the annotated data. However, nodes may also be empty or not determined in their position if the annotated category can apply to nodes with a defined extension or position. An example of a non-positionable node with zero extension is the annotation of implicit semantic roles.

El Salvador is now the only Latin American country which still has troops in [Iraq]_LOC. Nicaragua, Honduras and the Dominican Republic have withdrawn their troops [0]_LOC.

In terms of frame semantics, withdrawing troops requires a location element, marked by [0]_LOC, but this is not realized in the local sentence. We can, however, infer its existence from a frame inventory (and, in this case, also connect it with a string in the preceding text, although this is not always possible). However, it is impossible to define its actual string position in the second sentence.
Yet, the element marked by [0]_LOC is a valid powla:Node because, if a phrase like "from Iraq" had occurred in the sentence, it would have been annotated with the same linguistic features as the empty element. Note that rdf:type powla:Node is recommended, but not obligatory, as this can be RDFS-inferred from (the obligatory) powla:next.

OPTIONAL type. A powla:Node which is not a powla:Terminal (see there), RDFS-inferrable from powla:hasParent.

ABSTRACT type for POWLA data structures, not to be directly applied to data.

DEPRECATED object relation. A pointing relation is a labeled (reified) relation without coverage inheritance, i.e., the (terminals covered by the) target nodes do not typically overlap with the (terminals covered by the) source nodes. This property, motivated by PAULA, is deprecated in POWLA, as it is sufficiently described by a relation without accompanying powla:hasParent. Pointing relations are not actually necessary: a relation is a pointing relation if it does not coincide with a hasChild property.

RECOMMENDED type. A relation represents a labelled edge that holds between two POWLA nodes. POWLA employs a reified representation, where source and target are linked with the relation via powla:hasSource and powla:hasTarget (resp., powla:isSourceOf and powla:isTargetOf). Corresponds to SALT Edge and POM Relation. Note: in SALT, there are no edges between Terminals.

OPTIONAL type. A root node is a powla:Node which is not a subject of a powla:hasParent property. A number of properties that organize linguistic annotations into documents and corpora are based on the notion of root nodes, as this limits the number of necessary links between data points and data sets. powla:Root can be RDFS-inferred from these or inferred from the absence of powla:hasParent under a closed world assumption.

OPTIONAL type. Struct layers are visualized as directed acyclic graphs (i.e., generalized trees).
With the current modelling, the status of a layer as a struct layer can be inferred and does not need to be asserted.

OPTIONAL type. A terminal is a node which does not have a child node (cf. hasParent/hasChild). It can be used as a basis for text anchoring. It also corresponds to the single "base segmentation" in LAF and the "(privileged) tokenization layer" in PAULA 1.0, which is used for calculating and querying token distance in corpus information systems. Where non-minimal segments are needed as the "(privileged) tokenization layer", we recommend using external vocabularies, e.g., explicit nif:Word and nif:nextWord annotations. Optional, as it can be inferred from powla:hasParent under a closed world assumption. Roughly corresponds to SALT Terminal, PAULA 1.0 Token and PAULA 1.1 Terminal.

OPTIONAL type. Token layers consist of annotations of terminal nodes only. This class is necessary for visualization only; under the closed world assumption, however, it can be inferred and does not need to be asserted.
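Several descriptions above state that powla:Root, powla:Terminal and powla:hasRoot can be derived from powla:hasParent under a closed world assumption. The following is a minimal sketch of that derivation, assuming the hasParent triples are available as plain (child, parent) pairs rather than in an RDF store; the function names `classify` and `has_root` and the sample node ids are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def classify(has_parent: Iterable[Tuple[str, str]]):
    """Derive roots and terminals from (child, parent) pairs.

    Closed world assumption: the given pairs are taken to be all
    powla:hasParent statements. Nodes that occur in no pair are not seen.
    """
    parents_of: Dict[str, Set[str]] = defaultdict(set)
    nodes, with_parent, with_child = set(), set(), set()
    for child, parent in has_parent:
        parents_of[child].add(parent)
        nodes.update((child, parent))
        with_parent.add(child)
        with_child.add(parent)
    roots = nodes - with_parent      # never the subject of hasParent
    terminals = nodes - with_child   # never the object of hasParent
    return roots, terminals, parents_of


def has_root(node: str, parents_of: Dict[str, Set[str]]) -> Set[str]:
    """Roots reachable from node via one or more hasParent steps."""
    found, seen = set(), set()
    stack = list(parents_of.get(node, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        ups = parents_of.get(current, ())
        if ups:
            stack.extend(ups)
        else:
            found.add(current)  # no parent of its own: a root
    return found


# Simplified tree for "What are you talking about", ids are made up.
edges = [("what", "s"), ("are", "s"), ("vp", "s"),
         ("you", "vp"), ("talking", "vp"), ("about", "vp")]
roots, terminals, parents_of = classify(edges)
print(sorted(roots))
print(sorted(terminals))
print(has_root("you", parents_of) == has_root("what", parents_of))
```

Because the traversal requires at least one hasParent step, a root has no hasRoot value of its own, which mirrors the hasParent+ path in the shorthand quoted earlier. Two nodes belong to the same tree when their `has_root` sets coincide.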