Apache 2 HTTPD and Noark 5 Core Configuration
Long Term Preservation of Archival Data in Noark 5
The ultimate goal of the Noark 5 standardization of archival data is long-term preservation and readable storage.
Understand Level (gui)
Read Level (xml)
Storage Level (sda)
Problems
The following problems sometimes occur:
Government bodies such as Arkivverket publish archive standards such as Noark 5.
Vendors write software that sometimes does not comply with the standards or that depends on incompatible software and programming languages.
The Noark 5 Standard
The switch from DTD in Noark 4 to XSD in Noark 5 made the standard stronger, but an implementation still has to parse the endpoints correctly and follow the formal standard.
With Noark 5, large parts of the standard were tidied up.
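Because Noark 5 describes its files with XSD schemas, a file can be checked against its schema on the command line. The following is a minimal sketch, assuming that xmllint is installed (on Debian it is in the package libxml2-utils). The function name validate_against_xsd is made up for this example and is not part of the standard.

```shell
# Sketch: validate one XML file against one XSD schema with xmllint.
# validate_against_xsd is a hypothetical helper name.
validate_against_xsd() {
    # $1 = schema file (.xsd), $2 = document file (.xml)
    # --noout: do not print the parsed document
    # --schema: validate the document against the given XSD
    if xmllint --noout --schema "$1" "$2"; then
        return 0
    fi
    echo "schema validation failed: $2" >&2
    return 1
}

# Example call with two of the file names used later in this document:
#   validate_against_xsd arkivstruktur.xsd arkivstruktur.xml
```

The exit status is 0 only when the document validates, so the function can be used in scripts that stop at the first invalid file.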
Audience
Document Controllers and Record Keepers can refer to Noark 5 for best practice.
Separation of Data and Structure with MVC
Model-View-Controller (MVC) is a concept that Trygve Reenskaug conceived at Xerox PARC in 1978 and described in a note on MVC.
The MVC note defines four terms: Model, View, Controller and Editor.
Formal Structure of Noark 5 Extractions
A Noark 5 Extraction
The Noark 5 standard defines the following extraction files in section 5.12:
addml.xsd (https://www.arkivverket.no/forvaltning-og-utvikling/regelverk-og-standarder/andre-arkivstandarder/addml-archival-data-description-markup-language)
arkivstruktur.xml
arkivstruktur.xsd (http://edu.hioa.no/ark2200/h17/resources/xsd-noark5/arkivstruktur.xsd)
arkivuttrekk.xml
endringslogg.xml
endringslogg.xsd
loependeJournal.xml
loependeJournal.xsd
metadatakatalog.xsd
offentligJournal.xml
offentligJournal.xsd
The folder dokumenter/ contains the specific documents.
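The file list above can be turned into a quick completeness check for an extraction directory. This is only a sketch: the function name check_extraction is made up for this example, and the check looks at file names only, not at the content of the files.

```shell
# Sketch: report which of the files listed above are missing from an
# extraction directory. check_extraction is a hypothetical helper name.
check_extraction() {
    dir=$1
    missing=0
    for f in addml.xsd arkivstruktur.xml arkivstruktur.xsd arkivuttrekk.xml \
             endringslogg.xml endringslogg.xsd loependeJournal.xml \
             loependeJournal.xsd metadatakatalog.xsd offentligJournal.xml \
             offentligJournal.xsd; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # The documents themselves are stored in the folder dokumenter/.
    if [ ! -d "$dir/dokumenter" ]; then
        echo "missing: dokumenter/"
        missing=1
    fi
    return "$missing"
}

# Example call (the path is a placeholder):
#   check_extraction /path/to/extraction
```

The function prints one line per missing entry and returns a non-zero status if anything is missing.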
ADDML (Archival Data Description Markup Language)
https://github.com/arkivverket/schemas/blob/master/ADDML/v8.3/addml.xsd
Examples of Noark 5 Extractions
Below are some examples of Noark 5 Extractions defined in the Noark 5 standard in section 5.12.
https://github.com/arkivverket/arkade5/
https://raw.githubusercontent.com/arkivverket/arkade5/master/src/Arkivverket.Arkade.Test/TestData/Noark5/ContentClassificationSystem/addml.xsd
https://raw.githubusercontent.com/arkivverket/arkade5/master/src/Arkivverket.Arkade.Test/TestData/Noark5/ContentClassificationSystem/arkivstruktur.xml
https://raw.githubusercontent.com/arkivverket/arkade5/master/src/Arkivverket.Arkade.Test/TestData/Noark5/ContentClassificationSystem/arkivstruktur.xsd
https://raw.githubusercontent.com/arkivverket/arkade5/master/src/Arkivverket.Arkade.Test/TestData/Noark5/ContentClassificationSystem/arkivuttrekk.xml
https://raw.githubusercontent.com/arkivverket/arkade5/master/src/Arkivverket.Arkade.Test/TestData/Noark5/ContentClassificationSystem/metadatakatalog.xsd
https://github.com/KDRS-SA/noark5-validator/
https://raw.githubusercontent.com/KDRS-SA/noark5-validator/master/src/resources/test-uttrekk/uttrekk1/n5uttrekk/loependeJournal.xml
https://raw.githubusercontent.com/KDRS-SA/noark5-validator/master/src/resources/test-uttrekk/uttrekk1/n5uttrekk/offentligJournal.xml
https://raw.githubusercontent.com/KDRS-SA/noark5-validator/master/src/resources/test-uttrekk/uttrekk1/n5uttrekk/endringslogg.xml
https://github.com/documaster/noark-extraction-validator-samples/tree/master/0.2.0/valid-case-archive/extraction
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/addml.xsd
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/arkivstruktur.xml
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/arkivstruktur.xsd
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/arkivuttrekk.xml
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/business-specific.xsd
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/endringslogg.xml
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/loependeJournal.xml
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/loependeJournal.xsd
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/metadatakatalog.xsd
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/offentligJournal.xml
https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction/offentligJournal.xsd
https://github.com/SesamResearch/Records-Management-and-Archive-Systems-Research/
https://raw.githubusercontent.com/SesamResearch/Records-Management-and-Archive-Systems-Research/master/samples/arkivstruktur.xml
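To study one of the examples locally, the files can be downloaded with curl. The sketch below fetches the Documaster sample listed above. The base URL and the file names are copied from those links, the function name fetch_sample_extraction is made up for this example, and it is an assumption that the links are still reachable.

```shell
# Sketch: download the Documaster sample extraction listed above.
# fetch_sample_extraction is a hypothetical helper name.
BASE_URL=https://raw.githubusercontent.com/documaster/noark-extraction-validator-samples/master/0.2.0/valid-case-archive/extraction

fetch_sample_extraction() {
    dest=$1
    mkdir -p "$dest" || return 1
    for f in addml.xsd arkivstruktur.xml arkivstruktur.xsd arkivuttrekk.xml \
             business-specific.xsd endringslogg.xml loependeJournal.xml \
             loependeJournal.xsd metadatakatalog.xsd offentligJournal.xml \
             offentligJournal.xsd; do
        # -f: fail on HTTP errors, -sS: quiet but still show errors,
        # -L: follow redirects, -o: write to the named file.
        curl -fsSL -o "$dest/$f" "$BASE_URL/$f" || return 1
    done
}

# Example call (the directory name is a placeholder):
#   fetch_sample_extraction ./sample-extraction
```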
Free Implementation of Noark 5 Core
Thomas Sødring at Oslo Metropolitan University, with assistance from Petter Reinholdtsen at the University of Oslo, implements a free Noark 5 Core.
https://gitlab.com/OsloMet-ABI/nikita-noark5-core
https://lists.nuug.no/mailman/listinfo/nikita-noark
https://gitlab.com/OsloMet-ABI/nikita-noark5-core/issues
HTTPD Configuration
Install Apache 2 and download the core from gitlab.com into /var/www/html/
cd /var/www/html/
git clone https://gitlab.com/OsloMet-ABI/nikita-noark5-core
Configure Apache 2 in /etc/apache2/sites-available/000-default.conf
<VirtualHost www.arkivarium.no:80>
    ServerName www.arkivarium.no
    ServerAdmin webmaster@arkivarium.no
    DocumentRoot /var/www/html/nikita-noark5-core/web/
    ErrorLog ${APACHE_LOG_DIR}/www.arkivarium.no-error.log
    CustomLog ${APACHE_LOG_DIR}/www.arkivarium.no-access.log combined
</VirtualHost>
<VirtualHost arkivarium.no:80>
    ServerName arkivarium.no
    ServerAdmin webmaster@arkivarium.no
    DocumentRoot /var/www/html/nikita-noark5-core/web/
    ErrorLog ${APACHE_LOG_DIR}/arkivarium.no-error.log
    CustomLog ${APACHE_LOG_DIR}/arkivarium.no-access.log combined
    RewriteEngine on
    RewriteCond %{SERVER_NAME} =arkivarium.no
    RewriteRule ^ https://%{SERVER_NAME}%{REQUEST_URI} [END,NE,R=permanent]
</VirtualHost>
Replace arkivarium.no and www.arkivarium.no with your own domain name and configure the DNS settings.
Add DNS A records on the name servers for your domain that point to the IP address of your web server:
arkivarium.no A 178.255.144.179
www.arkivarium.no A 178.255.144.179
Remember to replace the domain arkivarium.no and the IP address 178.255.144.179 with the actual domain and IP address of your web server running Apache.
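Once the records are published, it is worth checking that the names resolve to the expected address before continuing. The sketch below assumes a Linux system with glibc, where getent ahostsv4 is available. The function names resolve_a and check_dns are made up for this example. Note that getent uses the local resolver, so an entry in /etc/hosts is answered as well.

```shell
# Sketch: check that a host name resolves to the expected IPv4 address.
# resolve_a and check_dns are hypothetical helper names.
resolve_a() {
    # getent ahostsv4 prints lines such as "178.255.144.179 STREAM name";
    # keep only the address from the first line.
    getent ahostsv4 "$1" | awk 'NR == 1 { print $1 }'
}

check_dns() {
    name=$1
    expected=$2
    actual=$(resolve_a "$name")
    if [ "$actual" = "$expected" ]; then
        echo "ok: $name -> $actual"
    else
        echo "mismatch: $name -> ${actual:-no answer} (expected $expected)" >&2
        return 1
    fi
}

# Example calls with the placeholder values from this document:
#   check_dns arkivarium.no 178.255.144.179
#   check_dns www.arkivarium.no 178.255.144.179
```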
Enable the HTTPD configuration in /etc/apache2/sites-enabled/000-default.conf
Install Certbot as described at https://certbot.eff.org/ and run certbot --apache -d arkivarium.no (replace arkivarium.no with your domain).
Let Certbot install the certificate for arkivarium.no and add a redirect to HTTPS.
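Install the Apache 2 HTTP daemon and enable the modules ssl, proxy, proxy_http, headers and rewrite (the RequestHeader and RewriteEngine directives in this chapter require mod_headers and mod_rewrite):
apt-get install apache2
a2enmod ssl
a2enmod proxy
a2enmod proxy_http
a2enmod headers
a2enmod rewrite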
Configure the Apache 2 HTTPD proxy and proxy_http modules in /etc/apache2/sites-available/000-noark5v4.conf to make http://localhost:8092/noark5v4/ available at http://arkivarium.no/noark5v4/ (replace the domain arkivarium.no with your own domain):
<IfModule mod_proxy.c>
    <Location /noark5v4>
        # The target has no trailing slash, to match the Location path
        ProxyPass http://localhost:8092/noark5v4
        ProxyPassReverse http://localhost:8092/noark5v4
        RequestHeader set X-Forwarded-Proto "https"
        ProxyPreserveHost On
    </Location>
</IfModule>
Enable the site by adding a symbolic link in /etc/apache2/sites-enabled/ that points to /etc/apache2/sites-available/000-noark5v4.conf (the Debian command a2ensite 000-noark5v4 has the same effect):
cd /etc/apache2/sites-enabled/
ln -s /etc/apache2/sites-available/000-noark5v4.conf
Restart Apache to load the new configuration with service apache2 restart
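A syntax error in one of the configuration files prevents Apache from starting, so it can help to test the configuration before the restart. The following is a minimal sketch for a Debian system. The function name restart_apache_checked is made up for this example.

```shell
# Sketch: restart Apache only if the configuration parses cleanly.
# restart_apache_checked is a hypothetical helper name.
restart_apache_checked() {
    # apache2ctl configtest checks the syntax of the enabled configuration.
    if apache2ctl configtest; then
        service apache2 restart
    else
        echo "configuration error: Apache was not restarted" >&2
        return 1
    fi
}

# Example call (as root):
#   restart_apache_checked
```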
Configuration of the free Noark 5 Core
Install Maven and the Java 8 Development Kit
apt-get install maven
apt-get install default-jdk openjdk-8-jdk openjdk-8-jre
Download the free Noark 5 core from gitlab.com
cd /var/www/html/
git clone https://gitlab.com/OsloMet-ABI/nikita-noark5-core
cd nikita-noark5-core/
Open nikita-noark5-core/core-webapp/src/main/resources/application.yml in a text editor such as vim, Emacs or gedit and locate the settings shown in the diff below.
vi /var/www/html/nikita-noark5-core/core-webapp/src/main/resources/application.yml
Modify the following and replace {www.}arkivarium.no with your domain name:
--- a/core-webapp/src/main/resources/application.yml
+++ b/core-webapp/src/main/resources/application.yml
@@ -38,7 +38,7 @@ info:
app.name: OsloMet Noark 5 Core (Demo mode)
build.version: ${project.version}
hateoas:
- publicAddress: http://localhost:8092/noark5v4
+ publicAddress: http://www.arkivarium.no:8092/noark5v4
jwt:
header: Authorization
@@ -53,16 +53,16 @@ nikita-noark5-core:
pagination:
_maxPageSize: 10
mail:
- from: nikita@example.com
+ from: webmaster@arkivarium.no
metrics: # DropWizard Metrics configuration, used by MetricsConfiguration
jmx.enabled: true
spark:
enabled: false
- host: localhost
+ host: www.arkivarium.no
port: 9999
graphite:
enabled: false
- host: localhost
+ host: www.arkivarium.no
port: 2003
prefix: nikitaNoark5Core
logs: # report metrics in the logs
@@ -73,7 +73,7 @@ nikita-noark5-core:
ROOT: DEBUG
logstash: # Forward logs to logstash over a socket, used by LoggingConfiguration
enabled: false
- host: localhost
+ host: www.arkivarium.no
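The publicAddress change in the diff above can also be scripted instead of edited by hand. The sketch below assumes that the key publicAddress appears once in the file with its value on the same line, and that the new URL contains neither | nor &, which are special in the sed expression. The function name set_public_address is made up for this example.

```shell
# Sketch: replace the value of publicAddress in application.yml.
# set_public_address is a hypothetical helper name.
set_public_address() {
    file=$1
    url=$2
    tmp=$(mktemp) || return 1
    # Keep the indentation and the key, replace everything after the colon.
    sed "s|^\([[:space:]]*publicAddress:\).*|\1 $url|" "$file" > "$tmp" || return 1
    # Write back through cat so that the permissions of the file are kept.
    cat "$tmp" > "$file" || return 1
    rm -f "$tmp"
}

# Example call with the values from this document:
#   set_public_address core-webapp/src/main/resources/application.yml \
#       http://www.arkivarium.no:8092/noark5v4
```

The other settings in the diff (mail.from and the host entries) are left to manual editing, because the key host occurs several times in the file.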
Configuration of the Web Interface
Read the previous chapter on how to download and configure the Noark 5 core in /var/www/html/
cd /var/www/html/
Open nikita-noark5-core/web/dependencies/internal/config.js in a text editor such as vim, Emacs or gedit and locate the settings.
vi /var/www/html/nikita-noark5-core/web/dependencies/internal/config.js
Modify the following:
nikitaOptions = {
    baseUrl: 'http://www.arkivarium.no:8092/noark5v4/',
    guiBaseUrl: 'http://www.arkivarium.no/',
    appUrl: 'http://arkivarium.no:8092/noark5v4/hateoas-api',
    fondsStructureRoot: 'http://www.arkivarium.no:8092/noark5v4/hateoas-api/arkivstruktur/',
    createFondsAddress: 'http://www.arkivarium.no:8092/noark5v4/hateoas-api/arkivstruktur/ny-arkiv',
    createFondsCreatorAddress: 'http://www.arkivarium.no:8092/noark5v4/hateoas-api/arkivstruktur/ny-arkivskaper',
    loginUrl: "http://www.arkivarium.no:8092/noark5v4/auth",
    protocol: 'http',
    appName: 'noark5v4',
    apiName: 'hateoas-api',
    authPoint: 'auth',
    displayFooterNote: true,
    displayBreadcrumb: true,
    enabled: true
};
Install Debian-specific packages by running setup-debian
cd /var/www/html/nikita-noark5-core/web/
make setup-debian
Install screen and node.js
apt-get install screen
apt-get install nodejs-legacy nodejs npm
Launch the free Noark 5 Core in a separate terminal with cd /var/www/html/nikita-noark5-core; screen make
Launch the web interface in a new terminal with cd /var/www/html/nikita-noark5-core/web; screen make run
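The core takes a while to start, so a small loop can wait until it answers before the web interface is opened. The following is a sketch, assuming that curl is installed and that the address answers without authentication. The function name wait_for_url is made up for this example.

```shell
# Sketch: wait until a URL answers, retrying once per second.
# wait_for_url is a hypothetical helper name.
wait_for_url() {
    url=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        # -f: treat HTTP errors as failure, -s: silent,
        # -o /dev/null: discard the response body.
        if curl -fs -o /dev/null "$url"; then
            echo "up: $url"
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    echo "no answer from $url" >&2
    return 1
}

# Example calls with the addresses from this document:
#   wait_for_url http://localhost:8092/noark5v4/
#   wait_for_url http://arkivarium.no/noark5v4/
```

The second argument sets the number of attempts and defaults to 30.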
Terms in the Noark 5 standard as H2 Database tables in Noark 5 Core
The H2 Database Console is available on Noark 5 Core servers at http://localhost:8082/
Noark 5 term            H2 table
Arkivdel                SERIES
Arkiv                   FONDS
Arkivskaper             FONDS_CREATOR
Avskrivning             SIGN_OFF
Basisregistrering       BASIC_RECORD
Dokumentbeskrivelse     DOCUMENT_DESCRIPTION
Dokumentflyt            DOCUMENT_FLOW
Dokumentobjekt          DOCUMENT_OBJECT
Gradering               CLASSIFIED
Journalpost             REGISTRY_ENTRY
Kassasjon               DISPOSAL
Kassasjonsvedtak        DISPOSAL_DECISION
Klasse                  CLASS
Klassifikasjonssystem   CLASSIFICATION_SYSTEM
Kode                    CODE
Konvertering            CONVERSION
Korrespondansepart      CORRESPONDENCE_PART
Kryssreferanse          CROSS_REFERENCE
Mappe                   FILE
Mappetype               FILE_TYPE
Merknad                 COMMENT
Nøkkelord               KEYWORD
Oppbevaringssted        STORAGE_LOCATION
Presedens               PRECEDENCE
Registrering            RECORD
Saksmappe               CASE_FILE
Sakspart                CASE_PARTY
Skjerming               SCREENING
Sletting                DELETION
Tittel                  TITLE
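When writing queries in the H2 Database Console, the list above can be kept at hand as a small lookup function. This is only a sketch: the term and table pairs are copied from the list above, and the function name noark_table_for is made up for this example.

```shell
# Sketch: look up the H2 table name for a Noark 5 term.
# noark_table_for is a hypothetical helper name.
noark_table_for() {
    # Print the second column of the row whose first column equals the term;
    # the exit status is non-zero when the term is not in the list.
    awk -v term="$1" '
        $1 == term { print $2; found = 1 }
        END { exit !found }
    ' <<'EOF'
Arkivdel SERIES
Arkiv FONDS
Arkivskaper FONDS_CREATOR
Avskrivning SIGN_OFF
Basisregistrering BASIC_RECORD
Dokumentbeskrivelse DOCUMENT_DESCRIPTION
Dokumentflyt DOCUMENT_FLOW
Dokumentobjekt DOCUMENT_OBJECT
Gradering CLASSIFIED
Journalpost REGISTRY_ENTRY
Kassasjon DISPOSAL
Kassasjonsvedtak DISPOSAL_DECISION
Klasse CLASS
Klassifikasjonssystem CLASSIFICATION_SYSTEM
Kode CODE
Konvertering CONVERSION
Korrespondansepart CORRESPONDENCE_PART
Kryssreferanse CROSS_REFERENCE
Mappe FILE
Mappetype FILE_TYPE
Merknad COMMENT
Nøkkelord KEYWORD
Oppbevaringssted STORAGE_LOCATION
Presedens PRECEDENCE
Registrering RECORD
Saksmappe CASE_FILE
Sakspart CASE_PARTY
Skjerming SCREENING
Sletting DELETION
Tittel TITLE
EOF
}

# Example call:
#   noark_table_for Journalpost
```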