{ "cells": [ { "cell_type": "markdown", "id": "d06f1e82-1895-4e3e-b88d-b67f857ddb56", "metadata": {}, "source": [ "# Test/create features 'unaccented' and 'transliteration'" ] }, { "cell_type": "markdown", "id": "c692f7e1-df4f-43bf-b80f-0b0e4fa1e329", "metadata": {}, "source": [ "This is a playground for the development of two new features.\n", "\n", "It depends on the Python library:\n", "\n", "* unidecode: ASCII transliteration of Unicode text" ] }, { "cell_type": "code", "execution_count": 5, "id": "ebc96637-30a9-4561-9c46-44f60330804a", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "en arkhe en o logos, kai o logos en pros ton theon, kai theos en o logos.\n", "εν αρχη ην ο λογος, και ο λογος ην προς τον θεον, και θεος ην ο λογος.\n", "ἐν ἀρχῇ ἦν ὁ λόγος, καὶ ὁ λόγος ἦν πρὸς τὸν θεόν, καὶ θεὸς ἦν ὁ λόγος.\n" ] } ], "source": [ "import unicodedata\n", "\n", "from unidecode import unidecode\n", "\n", "def make_transliteration(text):\n", "    \"\"\"Return an ASCII transliteration of the Unicode text.\"\"\"\n", "    return unidecode(text)\n", "\n", "def remove_accents(text):\n", "    \"\"\"Decompose to NFD, then drop all combining marks (category 'Mn').\"\"\"\n", "    return ''.join(c for c in unicodedata.normalize('NFD', text) if unicodedata.category(c) != 'Mn')\n", "\n", "# test with John 1:1\n", "\n", "greek_text_with_accents = \"ἐν ἀρχῇ ἦν ὁ λόγος, καὶ ὁ λόγος ἦν πρὸς τὸν θεόν, καὶ θεὸς ἦν ὁ λόγος.\"\n", "transliterated_text = make_transliteration(greek_text_with_accents)\n", "unaccented_text = remove_accents(greek_text_with_accents)\n", "\n", "print(transliterated_text)  # in TF>0.5 use with: fmt='text-transliterated'\n", "print(unaccented_text)  # in TF>0.5 use with: fmt='text-unaccented'\n", "print(greek_text_with_accents)" ] }, { "cell_type": "code", "execution_count": null, "id": "91314721-c5d5-4899-bc6d-ab5d3ac7446e", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": 
"text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.5" } }, "nbformat": 4, "nbformat_minor": 5 }