{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## `genalog.generation` module:\n", "\n", "This module is responsible for:\n", "- Analog document generation\n", "\n", "There are two important concepts to be familiar with:\n", "1. **Template** - controls the layout of the document (i.e. font, langauge, position of the content, etc)\n", "1. **Content** - items to be used to fill the template (i.e. text, images, tables, lists, etc)\n", "\n", "We are using a HTML templating engine (**Jinja2**) to build our html templates, and a html-pdf converter (**Weasyprint**) to print the html as a pdf or an image.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from genalog.generation.document import DocumentGenerator\n", "from genalog.generation.content import CompositeContent, ContentType\n", "\n", "with open(\"sample/generation/example.txt\", 'r') as f:\n", " text = f.read()\n", "\n", "print(text[:1000])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## `genalog.generation.content` Submodule\n", "\n", "This module is used to initialize a `Content` object for populating content into a template. It has the following classes:\n", "1. **ContentType** - an enumeration dictating the supported content type (ex. ContentType.PARAGRAPH, ContentType.TITLE, ContentType.COMPOSITE)\n", "1. **Content** - a base class for inheritance. All `Content` classes should be printable, iterable and indexable.\n", "1. **Paragraph** - a class inherited from `Content`. It represents a paragraph block.\n", "1. **Title** - a class inherited from `Content`. It represents a title of a block.\n", "1. **CompositeContent** - a class inherited from `Content`. It is a composite class designed to hold a hybrid collection of extended `Content` classes in the order they would populate the template. It is the base class used to initalize a document." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Initialize Content Object\n", "paragraphs = text.split('\\n\\n')\n", "content_types = [ContentType.PARAGRAPH] * len(paragraphs)\n", "content = CompositeContent(paragraphs, content_types)\n", "\n", "print(f\"Object represention: {repr(content)}\\n\")\n", "# you can also print the content object like a string\n", "print(f\"Transparent string: {content}\"[:400])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Or iterate the content obj like an array\n", "for paragraph in content:\n", " print(paragraph[:100])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## `genalog.generation.document` Submodule\n", "\n", "This module initializes a document by loading the html template and setting up the document styles. It has two classes:\n", "1. **Document** - given a template and the content, this class populates the content into the template and controls the document style. It can print the document as PDF or PNG.\n", "1. **DocumentGenerator** - a factory class that can generate `Document` classes. This will be the primary object used to generate documents." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Initalize DocumentGenerator\n", "# The default setting allow to use the built-in templates from the `genalog` package\n", "default_generator = DocumentGenerator()\n", "\n", "print(f\"Available default templates: {default_generator.template_list}\")\n", "print(f\"Default styles to generate: {default_generator.styles_to_generate}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.core.display import Image, display\n", "\n", "# Select specific template, content and create the generator\n", "doc_gen = default_generator.create_generator(content, ['text_block.html.jinja']) \n", "# we will use the `content` object initialized from above \n", "\n", "for doc in doc_gen:\n", " print(doc.styles)\n", " print(doc.template.name)\n", " image_byte = doc.render_png(resolution=100)\n", " display(Image(image_byte))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Alternative settings:\n", "\n", "Apart from the default behaviors, you can customize the tool to your need:\n", "1. Load custom templates\n", "1. Add custom style configurations\n", "1. Save individual pages as separate PNG file" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Load custom templates in local file path:\n", "custom_document_generator = DocumentGenerator(template_path=\"../genalog/generation/templates\")\n", "\n", "print(custom_document_generator.template_list)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Create new content with titles and paragraphs\n", "sections = []\n", "section_content_types = []\n", "for index, p in enumerate(paragraphs):\n", " sections.append(f\"Section {index}:\")\n", " section_content_types.append(ContentType.TITLE)\n", " sections.append(p)\n", " section_content_types.append(ContentType.PARAGRAPH)\n", "\n", "titled_content = CompositeContent(sections, section_content_types)\n", "\n", "for c in titled_content:\n", " print(c[:200])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# List the available tempaltes:\n", "print(default_generator.template_list)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set new styles to generate\n", "new_style_combinations = {\n", " \"hyphenate\": [True],\n", " \"font_size\": [\"11px\"],\n", " \"font_family\": [\"Times\"],\n", " \"text_align\": [\"justify\"]\n", "}\n", "default_generator.set_styles_to_generate(new_style_combinations)\n", "print(f\"Styles to generate: {default_generator.styles_to_generate}\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from IPython.core.display import Image, display\n", "doc_gen = default_generator.create_generator(titled_content, [\"columns.html.jinja\", \"letter.html.jinja\"])\n", "\n", "for doc in doc_gen:\n", " print(doc.styles)\n", " print(doc.template.name)\n", " # Save with different resolution\n", " image_byte = doc.render_png(resolution=90)\n", " display(Image(image_byte))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "doc_gen = default_generator.create_generator(titled_content, [\"columns.html.jinja\", \"letter.html.jinja\"])\n", "default_generator.set_styles_to_generate(new_style_combinations)\n", "# Or you can save the document as a separate file\n", "for doc in doc_gen:\n", " template_name = doc.template.name.replace(\".html.jinja\", \"\")\n", " pdf_name = \"sample/generation/\" + template_name + \"_\" + doc.styles[\"font_family\"] + \"_\" + doc.styles[\"font_size\"] + \".pdf\"\n", " png_name = pdf_name.replace(\".pdf\", \".png\")\n", " doc.render_pdf(target=pdf_name, zoom=2)\n", " doc.render_png(target=png_name, resolution=100)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Save as separate PNG files\n", "doc_gen = default_generator.create_generator(content, [\"text_block.html.jinja\"])\n", "default_generator.set_styles_to_generate(new_style_combinations)\n", "# Configure image resolution in dots per inch (dpi)\n", "resolution = 300\n", "for doc in doc_gen:\n", " template_name = doc.template.name.replace(\".html.jinja\", \"\")\n", " png_name = \"sample/generation/\" + template_name + \"_\" + doc.styles[\"font_family\"] + \"_\" + doc.styles[\"font_size\"] + \".png\"\n", " doc.render_png(target=png_name, split_pages=True, resolution=resolution)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.9" } }, "nbformat": 4, "nbformat_minor": 4 }