{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"extensions": {
"jupyter_dashboards": {
"version": 1,
"views": {
"grid_default": {},
"report_default": {}
}
}
}
},
"outputs": [],
"source": [
"import ipywidgets as widgets\n",
"import requests\n",
"from IPython.display import display, clear_output\n",
"from bs4 import BeautifulSoup\n",
"from lxml import etree\n",
"import pandas\n",
"import unicodedata\n",
"import voila\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Test query for DNB data\n",
"\n",
"Here you can query our SRU interface using simple form inputs. First select the catalog you want to query and the metadata format for the output. In the next step, enter your search term. \n",
"\n",
"For the underlying code to run correctly, inputs and button clicks must be made in the order given.\n",
"\n",
"Afterwards, you can view an abridged tabular display of your query results and save them as an XML or CSV file. \n",
"\n",
"**Please note**: \n",
"* This tutorial is intended as an introduction. For performance reasons, only the **first 100 hits** of your query are returned. \n",
"* The metadata formats contain different information. The output tables and files therefore vary in the number of elements and the information they contain. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### First, please select the desired catalog: \n",
"\n",
"* DNB = Title data of the German National Library (Deutsche Nationalbibliothek)\n",
"* DMA = German Music Archive (Deutsches Musikarchiv)\n",
"* GND = Integrated Authority File (Gemeinsame Normdatei) "
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"extensions": {
"jupyter_dashboards": {
"version": 1,
"views": {
"grid_default": {},
"report_default": {}
}
}
}
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "539636aa6e4f478ab188ce53b3923c83",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Dropdown(description='Catalog:', options=('DNB', 'DMA', 'GND'), style=DescriptionStyle(description_width='init…"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"auswahl = widgets.Dropdown(\n",
"    options=['DNB', 'DMA', 'GND'],\n",
"    value='DNB',\n",
"    description='Catalog:',\n",
"    style={'description_width': 'initial'},\n",
"    disabled=False,\n",
")\n",
"\n",
"display(auswahl)\n",
"\n",
"# Fallback SRU endpoint, used if no catalog has been confirmed before searching\n",
"default = \"https://services.dnb.de/sru/dnb\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Please select the metadata format for the output: \n"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"extensions": {
"jupyter_dashboards": {
"version": 1,
"views": {
"grid_default": {},
"report_default": {}
}
}
},
"scrolled": true
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "bff9a51dc19a4056937a1edbd6ff0891",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Dropdown(description='Metadata format:', layout=Layout(width='max-content'), options=(('MARC21-xml', 'MARC21-x…"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"meta = widgets.Dropdown(\n",
"    options=[('MARC21-xml', 'MARC21-xml'), ('DNB Casual (oai_dc)', 'oai_dc'), ('RDF (RDFxml)', 'RDFxml')],\n",
"    value='MARC21-xml',\n",
"    description='Metadata format:',\n",
"    layout={'width': 'max-content'},\n",
"    style={'description_width': 'initial'},\n",
"    disabled=False,\n",
")\n",
"\n",
"display(meta)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "1e7a7bfae620498fbec26ed8b7a0b399",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Button(description='Confirm', style=ButtonStyle())"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "127ec19bf7f1475eb3720d58fd99174e",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Output()"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"button = widgets.Button(description=\"Confirm\")\n",
"output1 = widgets.Output()\n",
"\n",
"display(button, output1)\n",
"\n",
"def on_button_clicked(b):\n",
"    # The chosen SRU endpoint is stored in the global A, which the search cell reads\n",
"    global A\n",
"\n",
"    with output1:\n",
"        clear_output()\n",
"        result = auswahl.value\n",
"        if result == \"DNB\":\n",
"            selected_url = \"https://services.dnb.de/sru/dnb\"\n",
"        elif result == \"DMA\":\n",
"            selected_url = \"https://services.dnb.de/sru/dnb.dma\"\n",
"        elif result == \"GND\":\n",
"            selected_url = \"https://services.dnb.de/sru/authorities\"\n",
"        else:\n",
"            selected_url = \"ERROR: No URL selected\"\n",
"        print(\"Selected catalog URL for\", result, \":\", selected_url)\n",
"        print(\"Selected metadata format:\", meta.value)\n",
"\n",
"        A = selected_url\n",
"\n",
"button.on_click(on_button_clicked)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Now please enter your search term: "
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"extensions": {
"jupyter_dashboards": {
"version": 1,
"views": {
"grid_default": {},
"report_default": {}
}
}
},
"scrolled": true
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "f1619ede1f6041cab581c9c3c36968df",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Text(value='', description='Search term:', placeholder='Enter search term')"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"searchterm = widgets.Text(\n",
"    value='',\n",
"    placeholder='Enter search term',\n",
"    description='Search term:',\n",
"    disabled=False\n",
")\n",
"\n",
"display(searchterm)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"extensions": {
"jupyter_dashboards": {
"version": 1,
"views": {
"grid_default": {},
"report_default": {}
}
}
}
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "66cf7d06a3b24414bec620eb205ce022",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Button(description='Start search', style=ButtonStyle())"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "677a52d0be644bc39488de6c4c493804",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Output()"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"button_search = widgets.Button(description=\"Start search\")\n",
"output2 = widgets.Output()\n",
"\n",
"display(button_search, output2)\n",
"\n",
"def on_search_clicked(b):\n",
"    # The results are kept in globals so that later cells can reuse them\n",
"    global records, records_marc, gndm, r1\n",
"\n",
"    with output2:\n",
"        clear_output()\n",
"        searchtext = searchterm.value\n",
"        print(\"Searching for:\", searchtext)\n",
"\n",
"        # Use the confirmed catalog URL if there is one, otherwise the default endpoint\n",
"        base_url = globals().get('A', default)\n",
"\n",
"        parameter = {'version': '1.1', 'operation': 'searchRetrieve', 'query': searchtext,\n",
"                     'recordSchema': meta.value, 'maximumRecords': '100'}\n",
"\n",
"        r1 = requests.get(base_url, params=parameter)\n",
"\n",
"        # Select the lxml HTML parser explicitly (BeautifulSoup's default when lxml is installed);\n",
"        # it lowercases tag names, which the lookups in this notebook expect\n",
"        response = BeautifulSoup(r1.content, \"lxml\")\n",
"        records = response.find_all('record')\n",
"        records_marc = response.find_all('record', {'type': 'Bibliographic'})\n",
"        gndm = response.find_all('record', {'type': 'Authority'})\n",
"        count = response.find('numberofrecords')\n",
"        print(\"Hits found:\", count.text if count else \"0\")\n",
"\n",
"        # Without any records there is nothing to preview\n",
"        if not records:\n",
"            print(\"No records to preview.\")\n",
"            return\n",
"\n",
"        print(\" \")\n",
"        print(\"Preview of the first hit in the SRU response:\")\n",
"        print(\"\")\n",
"        print(records[0].prettify())\n",
"        print(\"\")\n",
"        print(\" - End of preview - \")\n",
"\n",
"button_search.on_click(on_search_clicked)\n"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"# Function for title data in OAI-DC\n",
"def parse_record_dc(record):\n",
"\n",
"    ns = {\"dc\": \"http://purl.org/dc/elements/1.1/\",\n",
"          \"xsi\": \"http://www.w3.org/2001/XMLSchema-instance\"}\n",
"    xml = etree.fromstring(unicodedata.normalize(\"NFC\", str(record)))\n",
"\n",
"    def first_text(path, fallback=\"N/A\"):\n",
"        # Text of the first element matching the XPath, or the fallback if there is none\n",
"        hits = xml.xpath(path, namespaces=ns)\n",
"        return hits[0].text if hits else fallback\n",
"\n",
"    # IDN: the xsi:type attribute addresses the element directly\n",
"    idn = first_text(\".//dc:identifier[@xsi:type='dnb:IDN']\", fallback=\"fail\")\n",
"\n",
"    # creator\n",
"    creator = first_text(\".//dc:creator\")\n",
"\n",
"    # title\n",
"    titel = first_text(\".//dc:title\")\n",
"\n",
"    # date\n",
"    date = first_text(\".//dc:date\")\n",
"\n",
"    # publisher\n",
"    publ = first_text(\".//dc:publisher\")\n",
"\n",
"    # identifier (ISBN)\n",
"    ids = first_text(\".//dc:identifier[@xsi:type='tel:ISBN']\")\n",
"\n",
"    # URN\n",
"    urn = first_text(\".//dc:identifier[@xsi:type='tel:URN']\")\n",
"\n",
"    meta_dict = {\"IDN\": idn, \"CREATOR\": creator, \"TITLE\": titel, \"DATE\": date,\n",
"                 \"PUBLISHER\": publ, \"URN\": urn, \"ISBN\": ids}\n",
"\n",
"    return meta_dict"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"# Function for title data in MARC21\n",
"def parse_record_marc(item):\n",
"\n",
"    ns = {\"marc\": \"http://www.loc.gov/MARC21/slim\"}\n",
"    xml = etree.fromstring(unicodedata.normalize(\"NFC\", str(item)))\n",
"\n",
"    def subfields(tag, code):\n",
"        # All subfields with the given code inside datafields with the given tag\n",
"        path = \"marc:datafield[@tag = '{}']/marc:subfield[@code = '{}']\".format(tag, code)\n",
"        return xml.findall(path, namespaces=ns)\n",
"\n",
"    def first_text(elements):\n",
"        # Text of the first element, or \"N/A\" if the list is empty\n",
"        return elements[0].text if elements else \"N/A\"\n",
"\n",
"    # IDN\n",
"    idn = first_text(xml.findall(\"marc:controlfield[@tag = '001']\", namespaces=ns))\n",
"\n",
"    # creator: 100 $a, otherwise 110 $a, extended by 110 $e if present\n",
"    creator1 = subfields('100', 'a')\n",
"    creator2 = subfields('110', 'a')\n",
"    subfield = subfields('110', 'e')\n",
"\n",
"    if creator1:\n",
"        creator = creator1[0].text\n",
"    elif creator2:\n",
"        creator = creator2[0].text\n",
"        if subfield:\n",
"            creator = creator + \" [\" + subfield[0].text + \"]\"\n",
"    else:\n",
"        creator = \"N/A\"\n",
"\n",
"    # title: 245 $a, extended by 245 $b if present\n",
"    title = subfields('245', 'a')\n",
"    title2 = subfields('245', 'b')\n",
"\n",
"    if title and title2:\n",
"        titletext = title[0].text + \": \" + title2[0].text\n",
"    elif title:\n",
"        titletext = title[0].text\n",
"    else:\n",
"        titletext = \"N/A\"\n",
"\n",
"    # date\n",
"    date = first_text(subfields('264', 'c'))\n",
"\n",
"    # publisher\n",
"    publ = first_text(subfields('264', 'b'))\n",
"\n",
"    # URN: 856 $u is used only if an 856 $x is present; both must exist to avoid an IndexError\n",
"    testurn = subfields('856', 'x')\n",
"    urn = subfields('856', 'u')\n",
"    urn = urn[0].text if testurn and urn else 'N/A'\n",
"\n",
"    # ISBN: 020 $a, otherwise the identifier in 024 $a\n",
"    isbn_new = subfields('020', 'a')\n",
"    isbn_old = subfields('024', 'a')\n",
"    if isbn_new:\n",
"        isbn = isbn_new[0].text\n",
"    elif isbn_old:\n",
"        isbn = isbn_old[0].text\n",
"    else:\n",
"        isbn = 'N/A'\n",
"\n",
"    meta_dict = {\"IDN\": idn, \"CREATOR\": creator, \"TITLE\": titletext, \"DATE\": date,\n",
"                 \"PUBLISHER\": publ, \"URN\": urn, \"ISBN\": isbn}\n",
"\n",
"    return meta_dict\n"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"# Function for title data in RDF:\n",
"\n",
"def parse_record_rdf(record):\n",
"\n",
"    ns = {\"srw\": \"http://www.loc.gov/zing/srw/\",\n",
"          \"agrelon\": \"https://d-nb.info/standards/elementset/agrelon#\",\n",
"          \"bflc\": \"http://id.loc.gov/ontologies/bflc/\",\n",
"          \"rdau\": \"http://rdaregistry.info/Elements/u/\",\n",
"          \"dc\": \"http://purl.org/dc/elements/1.1/\",\n",
"          \"bibo\": \"http://purl.org/ontology/bibo/\",\n",
"          \"dbp\": \"http://dbpedia.org/property/\",\n",
"          \"dcmitype\": \"http://purl.org/dc/dcmitype/\",\n",
"          \"dcterms\": \"http://purl.org/dc/terms/\",\n",
"          \"dnb_intern\": \"http://dnb.de/\",\n",
"          \"dnbt\": \"https://d-nb.info/standards/elementset/dnb#\",\n",
"          \"ebu\": \"http://www.ebu.ch/metadata/ontologies/ebucore/ebucore#\",\n",
"          \"editeur\": \"https://ns.editeur.org/thema/\",\n",
"          \"foaf\": \"http://xmlns.com/foaf/0.1/\",\n",
"          \"gbv\": \"http://purl.org/ontology/gbv/\",\n",
"          \"geo\": \"http://www.opengis.net/ont/geosparql#\",\n",
"          \"gndo\": \"https://d-nb.info/standards/elementset/gnd#\",\n",
"          \"isbd\": \"http://iflastandards.info/ns/isbd/elements/\",\n",
"          \"lib\": \"http://purl.org/library/\",\n",
"          \"madsrdf\": \"http://www.loc.gov/mads/rdf/v1#\",\n",
"          \"marcrole\": \"http://id.loc.gov/vocabulary/relators/\",\n",
"          \"mo\": \"http://purl.org/ontology/mo/\",\n",
"          \"owl\": \"http://www.w3.org/2002/07/owl#\",\n",
"          \"rdf\": \"http://www.w3.org/1999/02/22-rdf-syntax-ns#\",\n",
"          \"rdfs\": \"http://www.w3.org/2000/01/rdf-schema#\",\n",
"          \"schema\": \"http://schema.org/\",\n",
"          \"sf\": \"http://www.opengis.net/ont/sf#\",\n",
"          \"skos\": \"http://www.w3.org/2004/02/skos/core#\",\n",
"          \"umbel\": \"http://umbel.org/umbel#\",\n",
"          \"v\": \"http://www.w3.org/2006/vcard/ns#\",\n",
"          \"vivo\": \"http://vivoweb.org/ontology/core#\",\n",
"          \"wdrs\": \"http://www.w3.org/2007/05/powder-s#\",\n",
"          \"xsd\": \"http://www.w3.org/2001/XMLSchema#\"}\n",
"\n",
"    xml = etree.fromstring(unicodedata.normalize(\"NFC\", str(record)))\n",
"\n",
"    def soup_text(name):\n",
"        # Text of the first tag with this name in the BeautifulSoup record, or \"N/A\"\n",
"        hits = record.find_all(name)\n",
"        return hits[0].text if hits else \"N/A\"\n",
"\n",
"    # IDN\n",
"    idn = xml.findall(\".//dc:identifier\", namespaces=ns)\n",
"    idn = idn[0].text if idn else 'N/A'\n",
"\n",
"    # creator\n",
"    creator = soup_text('rdau:p60327')\n",
"\n",
"    # title\n",
"    title = soup_text('dc:title')\n",
"\n",
"    # date\n",
"    date = soup_text('dcterms:issued')\n",
"\n",
"    # publisher\n",
"    publ = soup_text('dc:publisher')\n",
"\n",
"    # URN: read from the rdf:resource attribute of umbel:islike\n",
"    urn = record.find_all('umbel:islike')\n",
"    urn = urn[0].get('rdf:resource') if urn else \"N/A\"\n",
"\n",
"    # ISBN: prefer ISBN-13, fall back to ISBN-10\n",
"    isbn13 = xml.findall(\".//bibo:isbn13\", namespaces=ns)\n",
"    isbn10 = xml.findall(\".//bibo:isbn10\", namespaces=ns)\n",
"\n",
"    if isbn13:\n",
"        isbn = isbn13[0].text\n",
"    elif isbn10:\n",
"        isbn = isbn10[0].text\n",
"    else:\n",
"        isbn = \"N/A\"\n",
"\n",
"    meta_dict = {\"IDN\": idn, \"CREATOR\": creator, \"TITLE\": title, \"DATE\": date,\n",
"                 \"PUBLISHER\": publ, \"URN\": urn, \"ISBN\": isbn}\n",
"\n",
"    return meta_dict\n"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"# Function for GND in MARC21:\n",
"\n",
"def parse_record_gndm(record):\n",
"\n",
"    ns = {\"xmlns\": \"http://www.loc.gov/MARC21/slim\"}\n",
"    xml = etree.fromstring(unicodedata.normalize(\"NFC\", str(record)))\n",
"\n",
"    def subfields(tag, code):\n",
"        # All subfields with the given code inside datafields with the given tag\n",
"        path = \"xmlns:datafield[@tag = '{}']/xmlns:subfield[@code = '{}']\".format(tag, code)\n",
"        return xml.findall(path, namespaces=ns)\n",
"\n",
"    def first_text(elements):\n",
"        # Text of the first element, or \"N/A\" if the list is empty\n",
"        return elements[0].text if elements else \"N/A\"\n",
"\n",
"    # Entity type (075 $b): map the one-letter code to a readable label;\n",
"    # unknown codes are passed through unchanged\n",
"    type_labels = {\"p\": \"Person\", \"b\": \"Organisation\", \"u\": \"Work\", \"f\": \"Event\",\n",
"                   \"g\": \"Geographic entity\", \"n\": \"Person\", \"s\": \"Subject heading\"}\n",
"    gndtype = subfields('075', 'b')\n",
"    if gndtype:\n",
"        code = gndtype[0].text\n",
"        gndtype = type_labels.get(code, code)\n",
"    else:\n",
"        gndtype = \"N/A\"\n",
"\n",
"    # Name: $a of the first heading field present (100, 110, 111, 130, 150, 151)\n",
"    main = \"N/A\"\n",
"    for tag in ('100', '110', '111', '130', '150', '151'):\n",
"        hits = subfields(tag, 'a')\n",
"        if hits:\n",
"            main = hits[0].text\n",
"            break\n",
"\n",
"    # title (for works)\n",
"    title = first_text(subfields('100', 't'))\n",
"\n",
"    # IDN\n",
"    idn = first_text(xml.findall(\"xmlns:controlfield[@tag = '001']\", namespaces=ns))\n",
"\n",
"    # Link\n",
"    link1 = first_text(subfields('024', '0'))\n",
"\n",
"    dicty = {\"IDN\": idn, \"TYPE\": gndtype, \"NAME\": main, \"TITLE\": title, \"LINK\": link1}\n",
"    return dicty"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"# Function for GND in OAI-DC:\n",
"\n",
"def parse_record_gndoai(record):\n",
"\n",
"    ns = {\"dc\": \"http://purl.org/dc/elements/1.1/\",\n",
"          \"xsi\": \"http://www.w3.org/2001/XMLSchema-instance\"}\n",
"    xml = etree.fromstring(unicodedata.normalize(\"NFC\", str(record)))\n",
"\n",
"    def first_text(path, fallback=\"N/A\"):\n",
"        # Text of the first element matching the XPath, or the fallback if there is none\n",
"        hits = xml.xpath(path, namespaces=ns)\n",
"        return hits[0].text if hits else fallback\n",
"\n",
"    # IDN: the xsi:type attribute addresses the element directly\n",
"    idn = first_text(\".//dc:identifier[@xsi:type='dnb:IDN']\", fallback=\"fail\")\n",
"\n",
"    # title\n",
"    title = first_text(\".//dc:title\")\n",
"\n",
"    # creator\n",
"    creator = first_text(\".//dc:creator\")\n",
"\n",
"    dicty = {\"IDN\": idn, \"NAME\": creator, \"TITLE\": title}\n",
"    return dicty\n"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [],
"source": [
"# Function for GND in RDF\n",
"\n",
"def parse_record_gndrdf(record):\n",
"\n",
"    ns = {\"srw\": \"http://www.loc.gov/zing/srw/\",\n",
"          \"agrelon\": \"https://d-nb.info/standards/elementset/agrelon#\",\n",
"          \"bflc\": \"http://id.loc.gov/ontologies/bflc/\",\n",
"          \"rdau\": \"http://rdaregistry.info/Elements/u/\",\n",
"          \"dc\": \"http://purl.org/dc/elements/1.1/\",\n",
"          \"bibo\": \"http://purl.org/ontology/bibo/\",\n",
"          \"dbp\": \"http://dbpedia.org/property/\",\n",
"          \"dcmitype\": \"http://purl.org/dc/dcmitype/\",\n",
"          \"dcterms\": \"http://purl.org/dc/terms/\",\n",
"          \"dnb_intern\": \"http://dnb.de/\",\n",
"          \"dnbt\": \"https://d-nb.info/standards/elementset/dnb#\",\n",
"          \"ebu\": \"http://www.ebu.ch/metadata/ontologies/ebucore/ebucore#\",\n",
"          \"editeur\": \"https://ns.editeur.org/thema/\",\n",
"          \"foaf\": \"http://xmlns.com/foaf/0.1/\",\n",
"          \"gbv\": \"http://purl.org/ontology/gbv/\",\n",
"          \"geo\": \"http://www.opengis.net/ont/geosparql#\",\n",
"          \"gndo\": \"https://d-nb.info/standards/elementset/gnd#\",\n",
"          \"isbd\": \"http://iflastandards.info/ns/isbd/elements/\",\n",
"          \"lib\": \"http://purl.org/library/\",\n",
"          \"madsrdf\": \"http://www.loc.gov/mads/rdf/v1#\",\n",
"          \"marcrole\": \"http://id.loc.gov/vocabulary/relators/\",\n",
"          \"mo\": \"http://purl.org/ontology/mo/\",\n",
"          \"owl\": \"http://www.w3.org/2002/07/owl#\",\n",
"          \"rdf\": \"http://www.w3.org/1999/02/22-rdf-syntax-ns#\",\n",
"          \"rdfs\": \"http://www.w3.org/2000/01/rdf-schema#\",\n",
"          \"schema\": \"http://schema.org/\",\n",
"          \"sf\": \"http://www.opengis.net/ont/sf#\",\n",
"          \"skos\": \"http://www.w3.org/2004/02/skos/core#\",\n",
"          \"umbel\": \"http://umbel.org/umbel#\",\n",
"          \"v\": \"http://www.w3.org/2006/vcard/ns#\",\n",
"          \"vivo\": \"http://vivoweb.org/ontology/core#\",\n",
"          \"wdrs\": \"http://www.w3.org/2007/05/powder-s#\",\n",
"          \"xsd\": \"http://www.w3.org/2001/XMLSchema#\"}\n",
"\n",
"    xml = etree.fromstring(unicodedata.normalize(\"NFC\", str(record)))\n",
"\n",
"    # IDN\n",
"    idn = xml.findall(\".//gndo:gndidentifier\", namespaces=ns)\n",
"    idn = idn[0].text if idn else 'N/A'\n",
"\n",
"    # Link: read from the rdf:about attribute of rdf:description\n",
"    link = record.find_all('rdf:description')\n",
"    link = link[0].get('rdf:about') if link else 'N/A'\n",
"\n",
"    # Name: first preferred-name element present\n",
"    name = \"N/A\"\n",
"    for tag in ('gndo:preferrednamefortheperson',\n",
"                'gndo:preferrednameforthecorporatebody',\n",
"                'gndo:preferrednameforthework',\n",
"                'gndo:preferrednamefortheconferenceorevent',\n",
"                'gndo:preferrednameforthesubjectheading'):\n",
"        hits = record.find_all(tag)\n",
"        if hits:\n",
"            name = hits[0].text\n",
"            break\n",
"\n",
"    # Time: first date element present; a date of birth is shown as an open range\n",
"    time = record.find_all('gndo:periodofactivity')\n",
"    time2 = record.find_all('gndo:dateofpublication')\n",
"    time3 = record.find_all('gndo:dateofconferenceorevent')\n",
"    time4 = record.find_all('gndo:dateofbirth')\n",
"\n",
"    if time:\n",
"        time = time[0].text\n",
"    elif time2:\n",
"        time = time2[0].text\n",
"    elif time3:\n",
"        time = time3[0].text\n",
"    elif time4:\n",
"        time = time4[0].text + \"-\"\n",
"    else:\n",
"        time = \"N/A\"\n",
"\n",
"    # Type: extracted here, but not part of the returned dictionary.\n",
"    # The fallback is assigned to gndtype, so a missing type no longer overwrites link.\n",
"    gndtype = record.find_all('rdf:type')\n",
"    gndtype = gndtype[0].get('rdf:resource') if gndtype else 'N/A'\n",
"\n",
"    meta_dict = {\"IDN\": idn, \"NAME\": name, \"TIME\": time, \"LINK\": link}\n",
"\n",
"    return meta_dict\n"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"# Function for DMA in MARC21\n",
"def parse_record_dmamarc(item):\n",
"\n",
"    ns = {\"marc\": \"http://www.loc.gov/MARC21/slim\"}\n",
"    xml = etree.fromstring(unicodedata.normalize(\"NFC\", str(item)))\n",
"\n",
"    def subfields(tag, code):\n",
"        # All subfields with the given code inside datafields with the given tag\n",
"        path = \"marc:datafield[@tag = '{}']/marc:subfield[@code = '{}']\".format(tag, code)\n",
"        return xml.findall(path, namespaces=ns)\n",
"\n",
"    def first_text(elements):\n",
"        # Text of the first element, or \"N/A\" if the list is empty\n",
"        return elements[0].text if elements else \"N/A\"\n",
"\n",
"    # IDN\n",
"    idn = first_text(xml.findall(\"marc:controlfield[@tag = '001']\", namespaces=ns))\n",
"\n",
"    # creator: 100 $a, otherwise the statement of responsibility in 245 $c\n",
"    creator1 = subfields('100', 'a')\n",
"    # creator2 = subfields('110', 'a')\n",
"    subfield = subfields('245', 'c')\n",
"\n",
"    if creator1:\n",
"        creator = creator1[0].text\n",
"    elif subfield:\n",
"        creator = subfield[0].text\n",
"    else:\n",
"        creator = \"N/A\"\n",
"\n",
"    # title: 245 $a, extended by 245 $b if present\n",
"    title = subfields('245', 'a')\n",
"    title2 = subfields('245', 'b')\n",
"\n",
"    if title and title2:\n",
"        titletext = title[0].text + \": \" + title2[0].text\n",
"    elif title:\n",
"        titletext = title[0].text\n",
"    else:\n",
"        titletext = \"N/A\"\n",
"\n",
"    # extent/format (extracted here, but not part of the returned dictionary)\n",
"    art = first_text(subfields('300', 'a'))\n",
"\n",
"    # date\n",
"    date = first_text(subfields('264', 'c'))\n",
"\n",
"    # publisher\n",
"    publ = first_text(subfields('264', 'b'))\n",
"\n",
"    # ISBN: 020 $a, otherwise the identifier in 024 $a\n",
"    isbn_new = subfields('020', 'a')\n",
"    isbn_old = subfields('024', 'a')\n",
"    if isbn_new:\n",
"        isbn = isbn_new[0].text\n",
"    elif isbn_old:\n",
"        isbn = isbn_old[0].text\n",
"    else:\n",
"        isbn = 'N/A'\n",
"\n",
"    meta_dict = {\"IDN\": idn, \"CREATOR\": creator, \"TITLE\": titletext, \"DATE\": date,\n",
"                 \"PUBLISHER\": publ, \"ISBN\": isbn}\n",
"\n",
"    return meta_dict\n"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"# Function for DMA in OAI-DC\n",
"def parse_record_dmadc(record):\n",
"\n",
"    ns = {\"dc\": \"http://purl.org/dc/elements/1.1/\",\n",
"          \"xsi\": \"http://www.w3.org/2001/XMLSchema-instance\"}\n",
"    xml = etree.fromstring(unicodedata.normalize(\"NFC\", str(record)))\n",
"\n",
"    def first_text(path, fallback=\"N/A\"):\n",
"        # Text of the first element matching the XPath, or the fallback if there is none\n",
"        hits = xml.xpath(path, namespaces=ns)\n",
"        return hits[0].text if hits else fallback\n",
"\n",
"    # IDN: the xsi:type attribute addresses the element directly\n",
"    idn = first_text(\".//dc:identifier[@xsi:type='dnb:IDN']\", fallback=\"fail\")\n",
"\n",
"    # creator\n",
"    creator = first_text(\".//dc:creator\")\n",
"\n",
"    # title\n",
"    titel = first_text(\".//dc:title\")\n",
"\n",
"    # date\n",
"    date = first_text(\".//dc:date\")\n",
"\n",
"    # publisher\n",
"    publ = first_text(\".//dc:publisher\")\n",
"\n",
"    # format\n",
"    form = first_text(\".//dc:format\")\n",
"\n",
"    meta_dict = {\"IDN\": idn, \"CREATOR\": creator, \"TITLE\": titel, \"DATE\": date,\n",
"                 \"PUBLISHER\": publ, \"FORMAT\": form}\n",
"\n",
"    return meta_dict"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"# Function for DMA in RDF:\n",
"\n",
"def parse_record_dmardf(record):\n",
"\n",
"    ns = {\"srw\": \"http://www.loc.gov/zing/srw/\",\n",
"          \"agrelon\": \"https://d-nb.info/standards/elementset/agrelon#\",\n",
"          \"bflc\": \"http://id.loc.gov/ontologies/bflc/\",\n",
"          \"rdau\": \"http://rdaregistry.info/Elements/u/\",\n",
"          \"dc\": \"http://purl.org/dc/elements/1.1/\",\n",
"          \"bibo\": \"http://purl.org/ontology/bibo/\",\n",
"          \"dbp\": \"http://dbpedia.org/property/\",\n",
"          \"dcmitype\": \"http://purl.org/dc/dcmitype/\",\n",
"          \"dcterms\": \"http://purl.org/dc/terms/\",\n",
"          \"dnb_intern\": \"http://dnb.de/\",\n",
"          \"dnbt\": \"https://d-nb.info/standards/elementset/dnb#\",\n",
"          \"ebu\": \"http://www.ebu.ch/metadata/ontologies/ebucore/ebucore#\",\n",
"          \"editeur\": \"https://ns.editeur.org/thema/\",\n",
"          \"foaf\": \"http://xmlns.com/foaf/0.1/\",\n",
"          \"gbv\": \"http://purl.org/ontology/gbv/\",\n",
"          \"geo\": \"http://www.opengis.net/ont/geosparql#\",\n",
"          \"gndo\": \"https://d-nb.info/standards/elementset/gnd#\",\n",
"          \"isbd\": \"http://iflastandards.info/ns/isbd/elements/\",\n",
"          \"lib\": \"http://purl.org/library/\",\n",
"          \"madsrdf\": \"http://www.loc.gov/mads/rdf/v1#\",\n",
"          \"marcrole\": \"http://id.loc.gov/vocabulary/relators/\",\n",
"          \"mo\": \"http://purl.org/ontology/mo/\",\n",
"          \"owl\": \"http://www.w3.org/2002/07/owl#\",\n",
"          \"rdf\": \"http://www.w3.org/1999/02/22-rdf-syntax-ns#\",\n",
"          \"rdfs\": \"http://www.w3.org/2000/01/rdf-schema#\",\n",
"          \"schema\": \"http://schema.org/\",\n",
"          \"sf\": \"http://www.opengis.net/ont/sf#\",\n",
"          \"skos\": \"http://www.w3.org/2004/02/skos/core#\",\n",
"          \"umbel\": \"http://umbel.org/umbel#\",\n",
"          \"v\": \"http://www.w3.org/2006/vcard/ns#\",\n",
"          \"vivo\": \"http://vivoweb.org/ontology/core#\",\n",
"          \"wdrs\": \"http://www.w3.org/2007/05/powder-s#\",\n",
"          \"xsd\": \"http://www.w3.org/2001/XMLSchema#\"}\n",
"\n",
"    xml = etree.fromstring(unicodedata.normalize(\"NFC\", str(record)))\n",
"\n",
"    def soup_text(name):\n",
"        # Text of the first tag with this name in the BeautifulSoup record, or \"N/A\"\n",
"        hits = record.find_all(name)\n",
"        return hits[0].text if hits else \"N/A\"\n",
"\n",
"    # IDN\n",
"    idn = xml.findall(\".//dc:identifier\", namespaces=ns)\n",
"    idn = idn[0].text if idn else 'N/A'\n",
"\n",
"    # creator\n",
"    name = soup_text('rdau:p60327')\n",
"\n",
"    # title\n",
"    title = soup_text('dc:title')\n",
"\n",
"    # publisher\n",
"    publ = soup_text('dc:publisher')\n",
"\n",
"    # date\n",
"    time = soup_text('dcterms:issued')\n",
"\n",
"    meta_dict = {\"IDN\": idn, \"NAME\": name, \"TITLE\": title, \"PUBLISHER\": publ, \"DATE\": time}\n",
"\n",
"    return meta_dict\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Displaying and saving the data: "
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"# Function for links:\n",
"def make_clickable(val):\n",
"    if val == \"N/A\":\n",
"        link = \"N/A\"\n",
"    else:\n",
"        # Wrap the value in an HTML anchor so that it renders as a clickable link\n",
"        link = '<a target=\"_blank\" href=\"{}\">{}</a>'.format(val, val)\n",
"\n",
"    return link"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"# Derive the browser URL of the running session:\n",
"import os\n",
"\n",
"# JUPYTERHUB_ACTIVITY_URL is only set on JupyterHub/Binder, so fall back to \"N/A\" elsewhere\n",
"url = os.environ.get(\"JUPYTERHUB_ACTIVITY_URL\")\n",
"if url:\n",
"    url_new = url.replace('/activity', '')\n",
"    url_newer = url_new.replace('http://hub:8081/binder/jupyter/hub/api/users', 'https://notebooks.gesis.org/binder/jupyter/user')\n",
"else:\n",
"    url_newer = \"N/A\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Speichern der Schnittstellenantwort als XML-Datei: \n",
"\n",
"Eine Ergebnisdatei \"data.xml\" wird mit einem Klick auf dem Button in folgendes Verzeichnis gespeichert: "
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"ename": "NameError",
"evalue": "name 'url_newer' is not defined",
"output_type": "error",
"traceback": [
"\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[1;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[1;32m\u001b[0m in \u001b[0;36m\u001b[1;34m\u001b[0m\n\u001b[0;32m 1\u001b[0m \u001b[1;32mfrom\u001b[0m \u001b[0mIPython\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mcore\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mdisplay\u001b[0m \u001b[1;32mimport\u001b[0m \u001b[0mHTML\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 2\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m----> 3\u001b[1;33m \u001b[0mclicky\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mmake_clickable\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0murl_newer\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m 4\u001b[0m \u001b[0mdisplay\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mHTML\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mclicky\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m 5\u001b[0m \u001b[1;33m\u001b[0m\u001b[0m\n",
"\u001b[1;31mNameError\u001b[0m: name 'url_newer' is not defined"
]
}
],
"source": [
"from IPython.core.display import HTML\n",
"\n",
"clicky = make_clickable(url_newer)\n",
"display(HTML(clicky))\n",
" \n",
"#print(url_newer) "
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "061603c58c964fa589dd905de2673336",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Button(description='XML speichern', style=ButtonStyle())"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "8f1be23c7e774183a4ef7d80f1c05bca",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Output()"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"button_xml = widgets.Button(description=\"XML speichern\")\n",
"output_xml = widgets.Output()\n",
"\n",
"display(button_xml, output_xml)\n",
"\n",
"def on_button_clicked(b):\n",
" with output_xml:\n",
" \n",
" with open('data.xml', 'w', encoding='utf-8') as f:\n",
" print(r1.text, file=f)\n",
" \n",
" \n",
"button_xml.on_click(on_button_clicked)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Kopieren Sie am besten den Link in ein neuens Browser-Tab. Sie können die Datei daraufhin bei sich speichern, indem Sie die Datei **\"data.xml\"** in der linken Navigationsleiste suchen und diese markieren (Häkchen setzen), woraufhin in der oberen Leiste ein Button mit der Beschriftung \"Download\" erscheint. Klicken Sie diesen an und wählen Sie einen lokalen Speicherort, um die Datei dauerhaft zu sichern. \n",
"Wenn Sie die Datei nicht herunterladen, steht Sie Ihnen nur temporär für die Dauer Ihrer aktuellen Sitzung zur Verfügung."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Darstellung der Daten in tabellarischer Form:\n",
" "
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "6f99448385ba4bc9bb25efea61b5cf0a",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Button(description='Ausgabe als Tabelle', style=ButtonStyle())"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "dbc4cd47d2e34ad0a12563356be8447a",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Output()"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"button_df = widgets.Button(description=\"Ausgabe als Tabelle\")\n",
"output3 = widgets.Output()\n",
"\n",
"display(button_df, output3)\n",
"\n",
"def on_button_clicked(b):\n",
" global df\n",
" with output3:\n",
" clear_output()\n",
" #für Titeldaten:\n",
" if auswahl.value == \"DNB\" and meta.value == \"oai_dc\":\n",
" result = [parse_record_dc(record) for record in records]\n",
" df = pandas.DataFrame(result)\n",
" df1 = (df.style\n",
" .format({'URN': make_clickable})\n",
" .set_properties(**{'text-align': 'left'})\n",
" .set_table_styles([dict(selector = 'th', props=[('text-align', 'left')])]) ) \n",
" display(df1)\n",
" elif auswahl.value == \"DNB\" and meta.value == \"MARC21-xml\":\n",
" result2 = [parse_record_marc(item) for item in records_marc]\n",
" df = pandas.DataFrame(result2)\n",
" df1 = (df.style\n",
" .format({'URN': make_clickable})\n",
" .set_properties(**{'text-align': 'left'})\n",
" .set_table_styles([dict(selector = 'th', props=[('text-align', 'left')])]) ) \n",
" display(df1)\n",
" elif auswahl.value == \"DNB\" and meta.value == \"RDFxml\":\n",
" result3 = [parse_record_rdf(item) for item in records]\n",
" df = pandas.DataFrame(result3)\n",
" df1 = (df.style\n",
" .format({'URN': make_clickable})\n",
" .set_properties(**{'text-align': 'left'})\n",
" .set_table_styles([dict(selector = 'th', props=[('text-align', 'left')])]) ) \n",
" display(df1)\n",
" \n",
" #für GND:\n",
" elif auswahl.value == \"GND\" and meta.value == \"MARC21-xml\":\n",
" result4 = [parse_record_gndm(item) for item in gndm]\n",
" df = pandas.DataFrame(result4) \n",
" df1 = (df.style\n",
" .format({'Link': make_clickable})\n",
" .set_properties(**{'text-align': 'left'})\n",
" .set_table_styles([dict(selector = 'th', props=[('text-align', 'left')])]) ) \n",
" display(df1)\n",
" elif auswahl.value == \"GND\" and meta.value == \"oai_dc\":\n",
" result5 = [parse_record_gndoai(item) for item in records]\n",
" df = pandas.DataFrame(result5)\n",
" print('Bitte beachten Sie, dass sich das Format \"DNB Casual (oai_dc)\" nur bedingt für GND-Datensätze eignet.')\n",
" print('Für eine Darstellung mit mehr Informationen wählen Sie bitte das Format \"MARC21-xml\".')\n",
" display(df)\n",
" elif auswahl.value == \"GND\" and meta.value == \"RDFxml\":\n",
" result6 = [parse_record_gndrdf(item) for item in records]\n",
" df = pandas.DataFrame(result6)\n",
" df1 = (df.style\n",
" .format({'LINK': make_clickable})\n",
" .set_properties(**{'text-align': 'left'})\n",
" .set_table_styles([dict(selector = 'th', props=[('text-align', 'left')])]) ) \n",
" display(df1)\n",
" \n",
" #für DMA:\n",
" elif auswahl.value == \"DMA\" and meta.value == \"MARC21-xml\":\n",
" result7 = [parse_record_dmamarc(item) for item in records_marc]\n",
" df = pandas.DataFrame(result7) \n",
" df1 = (df.style\n",
" .format({'URN': make_clickable})\n",
" .set_properties(**{'text-align': 'left'})\n",
" .set_table_styles([dict(selector = 'th', props=[('text-align', 'left')])]) ) \n",
" display(df1)\n",
" elif auswahl.value == \"DMA\" and meta.value == \"oai_dc\":\n",
" result8 = [parse_record_dmadc(record) for record in records]\n",
" df = pandas.DataFrame(result8)\n",
" df1 = (df.style\n",
" .format({'URN': make_clickable})\n",
" .set_properties(**{'text-align': 'left'})\n",
" .set_table_styles([dict(selector = 'th', props=[('text-align', 'left')])]) ) \n",
" display(df1)\n",
" elif auswahl.value == \"DMA\" and meta.value == \"RDFxml\":\n",
" result9 = [parse_record_dmardf(record) for record in records]\n",
" df = pandas.DataFrame(result9)\n",
" df1 = (df.style\n",
" .format({'URN': make_clickable})\n",
" .set_properties(**{'text-align': 'left'})\n",
" .set_table_styles([dict(selector = 'th', props=[('text-align', 'left')])]) ) \n",
" display(df1)\n",
" else:\n",
" print(\"ERROR\")\n",
" \n",
"\n",
" \n",
"button_df.on_click(on_button_clicked)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Tabelle in .csv-Datei überführen: \n",
"\n",
"Sie finden die hier erstellte Datei im selben Verzeichnis wie oben bereits angegeben - die hier erzeugte Datei erscheint als **\"Tabelle.csv\"**. "
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "a58ec01e973643e5aec59d9e89019163",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Button(description='Als CSV speichern', style=ButtonStyle())"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "750e5b222ca34301875f7ab61f6fef27",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Output()"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"button_csv = widgets.Button(description=\"Als CSV speichern\")\n",
"output4 = widgets.Output()\n",
"\n",
"display(button_csv, output4)\n",
"\n",
"def on_button_clicked(b):\n",
" with output4:\n",
" df.to_csv(\"Tabelle.csv\")\n",
" \n",
"button_csv.on_click(on_button_clicked)"
]
}
],
"metadata": {
"extensions": {
"jupyter_dashboards": {
"activeView": "report_default",
"version": 1,
"views": {
"grid_default": {
"cellMargin": 10,
"defaultCellHeight": 20,
"maxColumns": 12,
"name": "grid",
"type": "grid"
},
"report_default": {
"name": "report",
"type": "report"
}
}
}
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}