{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Best practices in development" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## One virtual environment per project\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why\n", "* Isolation\n", "* Different projects have different dependency versions\n", "* You don't want to mess up the system Python" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tooling\n", "`poetry` is currently the recommended tool for dependency management but [\"consider other tools such as pip when poetry does not meet your use case\"](https://packaging.python.org/guides/tool-recommendations/#application-dependency-management). If `poetry` doesn't fit your use case, I recommend using `virtualenvwrapper` for managing virtual environments and `pip` for package management. On the other hand, if you're working on only a couple of projects, built-in [`venv`](https://docs.python.org/3/library/venv.html) will do just fine.\n", "#### [`poetry`](https://python-poetry.org/docs/)\n", "* Basically combines `pip`, `virtualenv`, and packaging facilities under single CLI\n", "* [pyproject.toml](https://python-poetry.org/docs/pyproject/) which replaces the need for requirements.txt and requirements-dev.txt \n", "* poetry.lock which pins dependencies, this means deterministic builds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### [`virtualenvwrapper`](https://virtualenvwrapper.readthedocs.io/en/latest/)\n", "* If you are using Windows command prompt: [`virtualenvwrapper-win`](https://pypi.org/project/virtualenvwrapper-win/)\n", "* Like the name suggests, a wrapper around [`virtualenv`](https://pypi.org/project/virtualenv/)\n", "* Eases the workflow for creating, deleting, and (de)activating your virtual environments" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### [`pyenv`](https://github.com/pyenv/pyenv)\n", "* Easily change global / per project Python version \n", "* Also a tool for installing different Python versions (also different runtimes available, e.g. [PyPy](https://pypy.org/))\n", "* Useful if you'll need to work with different Python versions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Test your code\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why\n", "* No surprises (especially in production)\n", "* Make sure that everything works as expected\n", "* Make sure that old stuff works as expected after introducing new features (regression)\n", "* Tests give you confidence while refactoring\n", "* Good tests demonstrate the use cases of application, i.e. they also document the implementation \n", "* ..." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tooling\n", "#### [`pytest`](https://docs.pytest.org/en/latest/)\n", "There's [`unittest`](https://docs.python.org/3/library/unittest.html) module in Python Standard Library but the go-to test runner nowadays is definitely `pytest`.\n", "\n", "Some reasons to use `pytest`:\n", "* [`fixtures`](https://docs.pytest.org/en/latest/fixture.html#fixture) for writing reusable testing code\n", "* [`markers`](https://docs.pytest.org/en/latest/example/markers.html) for splitting tests to different groups (e.g. smoke, run only on CI machine, etc) or skipping tests in certain conditions\n", "* [Automatic test discovery](https://docs.pytest.org/en/latest/goodpractices.html#test-discovery)\n", "* [Configurability](https://docs.pytest.org/en/latest/customize.html)\n", "* Active development of plugins, to mention a few:\n", " * [`pytest-cov`](https://pytest-cov.readthedocs.io/en/latest/) for coverage reporting\n", " * [`pytest-xdist`](https://github.com/pytest-dev/pytest-xdist) for speeding up test suit run time with parallelization\n", " * see [complete list](https://github.com/pytest-dev) of plugins maintained by `pytest`\n", "* Ease of [writing own plugins](https://docs.pytest.org/en/latest/writing_plugins.html)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### [`tox`](https://tox.readthedocs.io/en/latest/)\n", "`tox` makes it simple to test your code against different Python interpreter/runtime and dependency versions. This is important when you're writing software which should support different Python versions, which is usually the case with library-like packages. On the other hand, if you're developing, say, a web application which will be deployed to a known target platform, testing against multiple different versions is usually not necessary. However, `tox` makes it also possible to configure, for example, linting as part of `tox` run. Thus, `tox` may simplify the development workflow significantly by wrapping multiple different operations under a single command." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Write high quality code\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why\n", "* Easy to read\n", "* Better maintainability\n", "* Better quality == less bugs" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import this" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tooling - code formatters\n", "[PEP8](https://www.python.org/dev/peps/pep-0008/?) (see also [\"for humans version\"](https://pep8.org/)) describes the style guidelines for Python code, you should follow them. Luckily, there are awesome tools that handle this for you while you focus on writing code, not formatting it." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### [`black`](https://black.readthedocs.io/en/stable/)\n", "* This is the go-to formatter of the Python community" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tooling - linters\n", "Automatic code formatting is great but in addition to that, you should use static analyzer (linter) to detect potential pitfalls in your code." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### [`ruff`](https://beta.ruff.rs/docs/)\n", "* Most comprehensive linter. As a bonus, it's extremely fast." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tooling - pre-commit\n", "#### [`pre-commit`](https://pre-commit.com/)\n", "Ideally, all project contributors should follow the best practices of the project, let it be e.g. respecting PEP8 or making sure there's no linting errors or security vulnerabilities in their change sets. However, as code formatters and linters are (mainly) local tools, this can not be guaranteed. `pre-commit` let's you configure (.pre-commit-config.yaml file) a set of hooks that will be run as pre actions to a commit/push. After a developer has called once `pre-commit install`, these hooks will be then automatically ran prior each commit/push.\n", "* Run formatting before commit\n", "* Fail the commit in case linting errors\n", "* Even exercise the test suite before the code ends up to remote (might be time consuming in most scenarios though)\n", "* Easy to configure [your own hooks](https://pre-commit.com/#new-hooks) \n", "* And use the [existing ones](https://github.com/pre-commit/pre-commit-hooks)\n", "* There's also [pre-push option](https://pre-commit.com/#pre-commit-during-push)\n", "* Written in Python but supports also other languages, such as Ruby, Go, and Rust\n", "* Less failed CI builds!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Structure your code and projects\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why\n", "* Package and module structure gives an overview about the project\n", "* Modular design == better reusability" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### How\n", "Some general guidelines:\n", "* Don't put too much stuff into one module\n", "* Split project into packages\n", "* Be consistent with your naming conventions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A few words about structuring your projects. If you're developing, say, a relative big business application, it makes sense to separate some of the non-core business logic packages into a separate project and publish that as separate package. This way the \"main\" repository doesn't get bloated and it's more approachable for newcomers. Additionally, there's a change that you (or someone else) can easily reuse this \"separated\" package in the future, which is often the case e.g. for different kinds of utility functionalities. \n", "\n", "Let's take a practical example. If your team has two different applications which interact with the same third party, it's beneficial to implement a separate client library for communication with it. This way a change is needed only in one place (in the client library) if the third party decides to make a backwards incompatible change in their API. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tooling\n", "#### [`cookiecutter`](https://cookiecutter.readthedocs.io/en/latest/)\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Cookiecutter is a tool which let's you create projects from predefined templates. \n", "\n", "* Rapid set-up of new projects, no need to copy paste from existing ones\n", "* Consistent development practices across all projects (project structure as well as e.g. `pre-commit`, `tox`, and CI configuration)\n", "* You can create one yourself or use some of the [existing ones](https://cookiecutter.readthedocs.io/en/latest/readme.html#python)\n", "* Written in Python but is applicable for non-Python projects as well, even non-programming related directory and file structures" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Use continuous integration and deployment\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "CI & CD belong to the best practices of software development without controversy, no matter what is the technology stack used for development. From Python point of view, CI is the place where we want to make sure that the other best practices described above are followed. For example, in bigger projects, it may not be even practical/possible to run the full test suite on developer's machine." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why\n", "* Make sure the tests pass\n", "* CI is the place where it's possible to run also some time consuming tests which the impatient developers prefer to skip on their local machines\n", "* Make sure there's no linting errors\n", "* Ideally, the place to test against all target versions and platforms\n", "* Overall, CI is the last resort for automatically ensuring the quality \n", "* Manual deployments are time consuming and error-prone, CD is automated and deterministic\n", "* You want to automate as much as possible, human time is expensive\n", "* Minimize the time required for code reviews - what could be detected with automatic tools, should be detected by using those tools. Human time is expensive. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tooling\n", "Tooling depends on which git repository manager option you've chosen and what kind of requirements you have. For example:\n", "* [GitHub Actions](https://github.com/features/actions)\n", "* If you're using Gitlab, it also has [integrated CI/CD](https://about.gitlab.com/features/gitlab-ci-cd/)\n", "* Same for [BitBucket](https://www.atlassian.com/continuous-delivery/continuous-integration-tutorial)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Utilize the capabitilies of your editor" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why\n", "* Efficient and fluent development\n", "* There's plenty of tools to make your daily programming easier, why would you not use them" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tooling\n", "As there's a number of different editors and IDEs available, not to mention that everyone has their own preferences, I'll just focus on highlighting some of the features of my favorite IDE, PyCharm, which I highly recommend for Python development.\n", "\n", "#### [PyCharm](https://www.jetbrains.com/help/pycharm/quick-start-guide.html)\n", "* Good integration with `pytest`, e.g. run single tests / test classes / test modules\n", "* Git integration (in case you don't like command line)\n", "* Easy to configure to use automatic formatting, e.g [`black`](https://github.com/ambv/black#pycharm)\n", "* Intuitive searching capabilities\n", "* Refactoring features\n", "* Debugger\n", "* Jupyter Notebook integration\n", "* Free community edition already contains all you need" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Use existing solutions\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why\n", "* Python Standard Library is extensive - and stable!\n", "* There are over 150k packages in [PyPI](https://pypi.org/)\n", "* Someone has most likely solved the problem you're trying to solve\n", "* Spend 5 minutes doing a google research before starting to solve a new problem, e.g. [stackoverflow](https://stackoverflow.com/) is a magical place." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Learn how to debug efficiently\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Why\n", "* You won't write completely stable code anyway - impossible looking conditions will occur. \n", "* When something is not working as expected, there are plenty of tools out there to help you figure out what's going on. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tooling - debuggers\n", "#### [`pdb`](https://docs.python.org/3/library/pdb.html)\n", "* Part of the Standard Library\n", "* Sufficient for most use cases" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### [`ipdb`](https://pypi.org/project/ipdb/)\n", "* Feature rich `pdb` with similar API" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### [`pdb++`](https://pypi.org/project/pdbpp/)\n", "* Drop-in replacement for `pdb` with additional features" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tooling - profilers" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### [`memray`](https://bloomberg.github.io/memray/)\n", "* Probably the only memory profiler you'll ever need" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### [`py-spy`](https://github.com/benfred/py-spy)\n", "* Profile running Python program without the need for modifying the source code or restarting the target process\n", "* Potential tool for identifying problems of e.g. a web application in production " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tooling - runtime error tracking\n", "These are especially useful with web applications as you'll get reports - and notifications - of runtime exceptions with full tracebacks and variable values. This information is often enough for identifying the root cause of the problem, which is a huge benefit considering the time required for implementing and deploying the fix." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### [Sentry](https://docs.sentry.io/?platform=python)\n", "* Complete stack traces with relevant variable (`locals()`) values\n", "* Browser and OS information of the client\n", "* Support for other languages as well" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Tooling - misc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Use logging instead of prints\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* [`logging`](https://docs.python.org/3/library/logging.html) is part of the Standard Library\n", "* With logging you can redirect the output to a file\n", "* Logs are usually the first place to look at after an end user reports an issue\n", "* You can specify the runtime level - no need to remove the debug prints" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### General guidelines\n", "* If you're building applications, use the latest Python.\n", "* If you're building libraries, make sure they support also older Python versions. \n", "* Develop in branches. Even if you're the only person in the project, branching makes it possible to easily switch between different features / bug fixes.\n", "* If you're not developing alone, practice code reviews. It's one of the best ways to learn for both parties.\n", "* Document your master pieces" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.0" } }, "nbformat": 4, "nbformat_minor": 2 }