Apache 2 HTTPD and Noark 5 Core Configuration

Long Term Preservation of Archival Data in Noark 5

The ultimate goal of the Noark 5 standardization of archival data is long-term preservation and readable storage.

Understand Level (gui)
Read Level (xml)
Storage Level (sda)

Problems

The following problems sometimes occur:

Government bodies such as Arkivverket publish archive standards such as Noark 5.
Vendors write scripts that are sometimes noncompliant with the standards, and use incompatible software and programming languages.

The Noark 5 Standard

The switch from DTD in Noark 4 to XSD in Noark 5 made the standard stronger, but it still requires correct parsing of endpoints and a correct implementation of the formal standard. With Noark 5, large parts of the standard were tidied up.

Audience

Document Controllers and Record Keepers can point to Noark 5 for best practice.

Separation of Data and Structure with MVC

Model View Controller is a concept conceived by Trygve Reenskaug at Xerox PARC in a note on MVC in 1978. The MVC note defines 4 terms: Model, View, Controller and Editor.

Formal Structure of Noark 5 Extractions

A Noark 5 Extraction

The Noark 5 standard defines the following extraction files in section 5.12:

addml.xsd (https://www.arkivverket.no/forvaltning-og-utvikling/regelverk-og-standarder/andre-arkivstandarder/addml-archival-data-description-markup-language)
arkivstruktur.xml (http://edu.hioa.no/ark2200/h17/resources/xsd-noark5/arkivstruktur.xsd)
arkivuttrekk.xml
endringslogg.xml
endringslogg.xsd
loependeJournal.xml
loependeJournal.xsd
metadatakatalog.xsd
offentligJournal.xml
offentligJournal.xsd

The folder dokumenter/ contains the specific documents.

ADDML (Archival Data Description Markup Language)

https://github.com/arkivverket/schemas/blob/master/ADDML/v8.3/addml.xsd

Examples of Noark 5 Extractions

Below are some examples of the Noark 5 extractions defined in section 5.12 of the Noark 5 standard.
https://github.com/arkivverket/arkade5/

https://raw.githubusercontent.com/arkivverket/arkade5/master/src/Arkivverket.Arkade.Test/TestData/Noark5/ContentClassificationSystem/addml.xsd
https://raw.githubusercontent.com/arkivverket/arkade5/master/src/Arkivverket.Arkade.Test/TestData/Noark5/ContentClassificationSystem/arkivstruktur.xml
https://raw.githubusercontent.com/arkivverket/arkade5/master/src/Arkivverket.Arkade.Test/TestData/Noark5/ContentClassificationSystem/arkivstruktur.xsd
https://raw.githubusercontent.com/arkivverket/arkade5/master/src/Arkivverket.Arkade.Test/TestData/Noark5/ContentClassificationSystem/arkivuttrekk.xml
https://raw.githubusercontent.com/arkivverket/arkade5/master/src/Arkivverket.Arkade.Test/TestData/Noark5/ContentClassificationSystem/metadatakatalog.xsd

https://github.com/KDRS-SA/noark5-validator/

https://raw.githubusercontent.com/KDRS-SA/noark5-validator/master/src/resources/test-uttrekk/uttrekk1/n5uttrekk/loependeJournal.xml
https://raw.githubusercontent.com/KDRS-SA/noark5-validator/master/src/resources/test-uttrekk/uttrekk1/n5uttrekk/offentligJournal.xml
https://raw.githubusercontent.com/KDRS-SA/noark5-validator/master/src/resources/test-uttrekk/uttrekk1/n5uttrekk/endringslogg.xml

https://github.com/documaster/noark-extraction-validator-samples/tree/master/0.2.0/valid-case-archive/extraction

https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/addml.xsd
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/arkivstruktur.xml
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/arkivstruktur.xsd
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/arkivuttrekk.xml
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/business-specific.xsd
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/endringslogg.xml
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/loependeJournal.xml
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/loependeJournal.xsd
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/metadatakatalog.xsd
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/offentligJournal.xml
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/offentligJournal.xsd

https://github.com/SesamResearch/Records-Management-and-Archive-Systems-Research/

https://raw.githubusercontent.com/SesamResearch/Records-Management-and-Archive-Systems-Research/master/samples/arkivstruktur.xml

The Noark 5 standard defines the following files:

addml.xsd
arkivstruktur.xml
arkivstruktur.xsd
arkivuttrekk.xml
endringslogg.xml
endringslogg.xsd
loependeJournal.xml
loependeJournal.xsd
metadatakatalog.xsd
offentligJournal.xml
offentligJournal.xsd

Free Implementation of Noark 5 Core

Thomas Sødring at Oslo Metropolitan University, with assistance from Petter Reinholdtsen at the University of Oslo, implements a free Noark 5 Core.
https://gitlab.com/OsloMet-ABI/nikita-noark5-core
https://lists.nuug.no/mailman/listinfo/nikita-noark
https://gitlab.com/OsloMet-ABI/nikita-noark5-core/issues

HTTPD Configuration

Install Apache 2 and download the core from gitlab.com into /var/www/html/

    cd /var/www/html/
    git clone https://gitlab.com/OsloMet-ABI/nikita-noark5-core

Configure Apache 2 in /etc/apache2/sites-available/000-default.conf

    <VirtualHost www.arkivarium.no:80>
        ServerName www.arkivarium.no
        ServerAdmin webmaster@arkivarium.no
        DocumentRoot /var/www/html/nikita-noark5-core/web/
        ErrorLog ${APACHE_LOG_DIR}/www.arkivarium.no-error.log
        CustomLog ${APACHE_LOG_DIR}/www.arkivarium.no-access.log combined
    </VirtualHost>

    <VirtualHost arkivarium.no:80>
        ServerName arkivarium.no
        ServerAdmin webmaster@arkivarium.no
        DocumentRoot /var/www/html/nikita-noark5-core/web/
        ErrorLog ${APACHE_LOG_DIR}/arkivarium.no-error.log
        CustomLog ${APACHE_LOG_DIR}/arkivarium.no-access.log combined
        RewriteEngine on
        RewriteCond %{SERVER_NAME} =arkivarium.no
        RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
    </VirtualHost>

Replace {www.}arkivarium.no with your own domain name and configure the DNS settings. On the name servers for your domain, add A records that point to the IP address of your web server.

    arkivarium.no      A  178.255.144.179
    www.arkivarium.no  A  178.255.144.179

Remember to replace the domain arkivarium.no and the IP address 178.255.144.179 with the actual domain and IP address of your web server running Apache.

Enable the HTTPD configuration in /etc/apache2/sites-enabled/000-default.conf

Download certbot-auto from https://certbot.eff.org/ and run

    certbot --apache -d arkivarium.no

(replace arkivarium.no with your domain). Install the certificates for arkivarium.no and, when certbot asks, add a redirect to https.
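Replacing the domain by hand in both virtual hosts is easy to get wrong, so the blocks above can also be generated. The following is a minimal sketch, not part of the Noark 5 Core: the function name render_vhosts is hypothetical, and the output deliberately leaves out the RewriteEngine lines, on the assumption that certbot adds the https redirect itself.

```shell
# Hypothetical helper (not part of nikita-noark5-core): print the two
# port-80 virtual hosts shown above for the domain given as $1.
# Assumption: the https redirect is added later by certbot, so the
# Rewrite* lines are not generated here.
render_vhosts() {
    domain="$1"
    for host in "www.$domain" "$domain"; do
        cat <<EOF
<VirtualHost $host:80>
    ServerName $host
    ServerAdmin webmaster@$domain
    DocumentRoot /var/www/html/nikita-noark5-core/web/
    ErrorLog \${APACHE_LOG_DIR}/$host-error.log
    CustomLog \${APACHE_LOG_DIR}/$host-access.log combined
</VirtualHost>

EOF
    done
}
```

Running render_vhosts example.org prints both blocks to standard output. Review the result before copying it into /etc/apache2/sites-available/000-default.conf, and check the syntax with apache2ctl configtest before restarting Apache.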
Install the Apache 2 HTTP daemon and enable the modules ssl, proxy and proxy_http. The RequestHeader directive used in the proxy configuration below is provided by mod_headers, so enable headers as well if it is not already active:

    apt-get install apache2-bin
    a2enmod ssl
    a2enmod proxy
    a2enmod proxy_http
    a2enmod headers

Configure the Apache 2 HTTPD proxy and proxy_http modules in /etc/apache2/sites-available/000-noark5v4.conf so that http://localhost:8092/noark5v4/ is reachable at http://arkivarium.no/noark5v4/ (replace the domain arkivarium.no with your own domain):

    <IfModule mod_proxy.c>
        <Location /noark5v4>
            ProxyPass http://localhost:8092/noark5v4/
            ProxyPassReverse http://localhost:8092/noark5v4/
            RequestHeader set X-Forwarded-Proto "https"
            ProxyPreserveHost On
        </Location>
    </IfModule>

Add a symbolic link from /etc/apache2/sites-available/000-noark5v4.conf to /etc/apache2/sites-enabled/000-noark5v4.conf

    cd /etc/apache2/sites-enabled/
    ln -s /etc/apache2/sites-available/000-noark5v4.conf

Restart Apache to apply the configuration with

    service apache2 restart

Configuration of the free Noark 5 Core

Install Maven and the Java 8 Development Kit

    apt-get install maven
    apt-get install default-jdk openjdk-8-jdk openjdk-8-jre

Download the free Noark 5 core from gitlab.com

    cd /var/www/html/
    git clone https://gitlab.com/OsloMet-ABI/nikita-noark5-core
    cd nikita-noark5-core/

Open nikita-noark5-core/core-webapp/src/main/resources/application.yml in a text editor such as vim, Emacs or gedit and locate the settings shown below.
    vi /var/www/html/nikita-noark5-core/core-webapp/src/main/resources/application.yml

Modify the following and replace {www.}arkivarium.no with your domain name:

    --- a/core-webapp/src/main/resources/application.yml
    +++ b/core-webapp/src/main/resources/application.yml
    @@ -38,7 +38,7 @@ info:
       app.name: OsloMet Noark 5 Core (Demo mode)
       build.version: ${project.version}
     hateoas:
    -  publicAddress: http://localhost:8092/noark5v4
    +  publicAddress: http://www.arkivarium.no:8092/noark5v4
     jwt:
       header: Authorization
    @@ -53,16 +53,16 @@ nikita-noark5-core:
       pagination:
         _maxPageSize: 10
       mail:
    -    from: nikita@example.com
    +    from: webmaster@arkivarium.no
       metrics: # DropWizard Metrics configuration, used by MetricsConfiguration
         jmx.enabled: true
         spark:
           enabled: false
    -      host: localhost
    +      host: www.arkivarium.no
           port: 9999
         graphite:
           enabled: false
    -      host: localhost
    +      host: www.arkivarium.no
           port: 2003
           prefix: nikitaNoark5Core
         logs: # report metrics in the logs
    @@ -73,7 +73,7 @@ nikita-noark5-core:
         ROOT: DEBUG
       logstash: # Forward logs to logstash over a socket, used by LoggingConfiguration
         enabled: false
    -    host: localhost
    +    host: www.arkivarium.no

Configuration of the Web Interface

See the previous chapter for how to download and configure the Noark 5 core in /var/www/html/

    cd /var/www/html/

Open nikita-noark5-core/web/dependencies/internal/config.js in a text editor such as vim, Emacs or gedit and locate the settings shown below.
    vi /var/www/html/nikita-noark5-core/web/dependencies/internal/config.js

Modify the following:

    nikitaOptions = {
        baseUrl: 'http://www.arkivarium.no:8092/noark5v4/',
        guiBaseUrl: 'http://www.arkivarium.no/',
        appUrl: 'http://arkivarium.no:8092/noark5v4/hateoas-api',
        fondsStructureRoot: 'http://www.arkivarium.no:8092/noark5v4/hateoas-api/arkivstruktur/',
        createFondsAddress: 'http://www.arkivarium.no:8092/noark5v4/hateoas-api/arkivstruktur/ny-arkiv',
        createFondsCreatorAddress: 'http://www.arkivarium.no:8092/noark5v4/hateoas-api/arkivstruktur/ny-arkivskaper',
        loginUrl: "http://www.arkivarium.no:8092/noark5v4/auth",
        protocol: 'http',
        appName: 'noark5v4',
        apiName: 'hateoas-api',
        authPoint: 'auth',
        displayFooterNote: true,
        displayBreadcrumb: true,
        enabled: true
    };

Install Debian-specific packages by running setup-debian

    cd /var/www/html/nikita-noark5-core/web/
    make setup-debian

Install screen and node.js

    apt-get install screen
    apt-get install nodejs-legacy nodejs npm

Launch the free Noark 5 Core in a separate terminal with

    cd /var/www/html/nikita-noark5-core; screen make

Launch the web interface in a new terminal with

    cd /var/www/html/nikita-noark5-core/web; screen make run

Terms in the Noark 5 standard as H2 Database tables in Noark 5 Core

The H2 Database Console is available on Noark 5 Core servers at http://localhost:8082/

    Noark 5 term            H2 table
    Arkivdel                SERIES
    Arkiv                   FONDS
    Arkivskaper             FONDS_CREATOR
    Avskrivning             SIGN_OFF
    Basisregistrering       BASIC_RECORD
    Dokumentbeskrivelse     DOCUMENT_DESCRIPTION
    Dokumentflyt            DOCUMENT_FLOW
    Dokumentobjekt          DOCUMENT_OBJECT
    Gradering               CLASSIFIED
    Journalpost             REGISTRY_ENTRY
    Kassasjon               DISPOSAL
    Kassasjonsvedtak        DISPOSAL_DECISION
    Klasse                  CLASS
    Klassifikasjonssystem   CLASSIFICATION_SYSTEM
    Kode                    CODE
    Konvertering            CONVERSION
    Korrespondansepart      CORRESPONDENCE_PART
    Kryssreferanse          CROSS_REFERENCE
    Mappe                   FILE
    Mappetype               FILE_TYPE
    Merknad                 COMMENT
    Nøkkelord               KEYWORD
    Oppbevaringssted        STORAGE_LOCATION
    Presedens               PRECEDENCE
    Registrering            RECORD
    Saksmappe               CASE_FILE
    Sakspart                CASE_PARTY
    Skjerming               SCREENING
    Sletting                DELETION
    Tittel                  TITLE
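
When working in the H2 console it can help to have the mapping above available as a lookup. The sketch below is only an illustration: the function name noark_table is hypothetical and not part of the Noark 5 Core, and the table names are copied from the list above rather than read from a running database.

```shell
# Hypothetical helper: print the H2 table name for a Noark 5 term,
# using the mapping listed above. Returns 1 for an unknown term.
noark_table() {
    case "$1" in
        Arkivdel)              echo SERIES ;;
        Arkiv)                 echo FONDS ;;
        Arkivskaper)           echo FONDS_CREATOR ;;
        Avskrivning)           echo SIGN_OFF ;;
        Basisregistrering)     echo BASIC_RECORD ;;
        Dokumentbeskrivelse)   echo DOCUMENT_DESCRIPTION ;;
        Dokumentflyt)          echo DOCUMENT_FLOW ;;
        Dokumentobjekt)        echo DOCUMENT_OBJECT ;;
        Gradering)             echo CLASSIFIED ;;
        Journalpost)           echo REGISTRY_ENTRY ;;
        Kassasjon)             echo DISPOSAL ;;
        Kassasjonsvedtak)      echo DISPOSAL_DECISION ;;
        Klasse)                echo CLASS ;;
        Klassifikasjonssystem) echo CLASSIFICATION_SYSTEM ;;
        Kode)                  echo CODE ;;
        Konvertering)          echo CONVERSION ;;
        Korrespondansepart)    echo CORRESPONDENCE_PART ;;
        Kryssreferanse)        echo CROSS_REFERENCE ;;
        Mappe)                 echo FILE ;;
        Mappetype)             echo FILE_TYPE ;;
        Merknad)               echo COMMENT ;;
        Nøkkelord)             echo KEYWORD ;;
        Oppbevaringssted)      echo STORAGE_LOCATION ;;
        Presedens)             echo PRECEDENCE ;;
        Registrering)          echo RECORD ;;
        Saksmappe)             echo CASE_FILE ;;
        Sakspart)              echo CASE_PARTY ;;
        Skjerming)             echo SCREENING ;;
        Sletting)              echo DELETION ;;
        Tittel)                echo TITLE ;;
        *)                     return 1 ;;
    esac
}
```

For example, noark_table Journalpost prints REGISTRY_ENTRY, the table to look for in the H2 console.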