{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Transformation toolbox" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This how-to guide explores the different tools available in Mitsuba 3 to manipulate cartesian coordinate systems. When generating datasets, researching advanced light transport algorithms, or developing new appearance models, you will quickly realize how essential those tools are, so we strongly recommend all users to go through this guide." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import mitsuba as mi\n", "\n", "mi.set_variant(\"scalar_rgb\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Frame\n", "\n", "The [Frame3f][1] class stores a three-dimensional orthonormal coordinate frame. This class is very handy when you wish to convert vectors between different cartesian coordinates systems.\n", "\n", "[1]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.Frame3f" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Frame initialization\n", "\n", "A `Frame3f` can be initialized in different ways as shown below. When given a single vector, it will make use of [coordinate_system()][1] to compute the other two basis vectors.\n", "\n", "[1]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.coordinate_system" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "Frame[\n", " s = [1, -0, -0],\n", " t = [-0, 0, -1],\n", " n = [0, 1, 0]\n", "]" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mi.Frame3f() # Empty frame\n", "\n", "mi.Frame3f(\n", " [1, 0, 0], # s\n", " [0, 1, 0], # t\n", " [0, 0, 1], # n\n", ")\n", "\n", "mi.Frame3f([0, 1, 0]) # n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Converting to/from local frames\n", "\n", "The two methods below are the main operations you will be using to convert between different coordinate frames.\n", "\n", "- [Frame3f.to_local()][1]\n", "- [Frame3f.to_world()][2]\n", "\n", "[1]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.Frame3f.to_local\n", "[2]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.Frame3f.to_world" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[1.0, 2.0, 3.0]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "frame = mi.Frame3f(\n", " [0, 0, 1],\n", " [0, 1, 0],\n", " [1, 0, 0],\n", ")\n", "\n", "world_vector = mi.Vector3f([3, 2, 1]) # In world frame\n", "local_vector = frame.to_local(world_vector)\n", "local_vector" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Spherical coordinates\n", "\n", "Mitsuba 3 provides convenience methods to efficiently compute certain trigonometric evaluations of spherical coordinates with respect to a `Frame3f`. We use the naming convention that *theta* is the elevation and *phi* is the azimuth.\n", "For example, you can call `Frame3f.sin_theta_2()` or `Frame3f.cos_phi()`. As always, the full list of methods is availble in the [reference API][1].\n", "\n", "[1]: https://mitsuba.readthedocs.io/en/latest/src/api_reference.html#mitsuba.Frame3f" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Transform\n", "\n", "The `Transform4f` and `Transform3f` classes provides several static functions to create common transformations, such as `translate`, `scale`, `rotate` and `look_at`. These are often used for setting `\"to_world\"` object parameters in Python using `load_dict()`. As we will see later, those transformations can also be applied to a `Vector`, `Point`, `Normal` and even a `Ray3f`. \n", "\n", "Note that all transforms are in homogenous coordiantes. `Transform4f` can therefore be applied to 3-dimensional objects and `Transform3f` to 2-dimensional objects.\n", "\n", "The `Transform4f` and `Transform3f` objects hold both the transformation matrix and its transpose of inverse. For convenience, there is also a `Transform4f.inverse()` method. All put together, this makes transforming back and forth straightforward.\n", "\n", "
\n", "\n", "🗒 **Note**\n", "\n", "Often when working with a vectorized variant of Mitsuba (e.g. `llvm_ad_rgb`), we still want to work with scalar transformation. For instance in the context of scene loading when setting `to_world` transformations. Mitsuba data-structure types such `Transform4f` can be prefixed with `Scalar` to indicate that no matter which variant of Mitsuba is enabled, this type should always refer to the CPU scalar version (which can also be accessed with `mitsuba.scalar_rgb.Transform4f`. The same applies to all basic types (e.g. `Float`, `UInt32`) and other data-structure types (e.g. `Ray3f`, `SurfaceInteraction3f`) which can all be prefixed with `Scalar`.\n", "\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Transform initialization\n", "\n", "There are several ways to instanciate a transformation object. For example one can create a `Transform4f` from a `numpy` array directly, or a simple Python `list`." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[3, 3, 3],\n", " [2, 2, 2],\n", " [3, 3, 3]]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "\n", "# Default constructor is identity matrix\n", "identity = mi.Transform4f() \n", "\n", "np_mat = np.array(\n", " [\n", " [1, 2, 3],\n", " [4, 5, 6],\n", " [7, 8, 9],\n", " ]\n", ")\n", "mi_mat = mi.Matrix3f(\n", " [\n", " [1, 2, 3],\n", " [4, 5, 6],\n", " [7, 8, 9],\n", " ]\n", ")\n", "list_mat = [\n", " [1, 2, 3],\n", " [4, 5, 6],\n", " [7, 8, 9],\n", "]\n", "\n", "# Build from different types\n", "t_from_np = mi.Transform3f(np_mat)\n", "t_from_mi = mi.Transform3f(mi_mat)\n", "t_from_list = mi.Transform3f(list_mat)\n", "\n", "# Broadcasting\n", "t_from_value = mi.Transform3f(3) # Scaled identity matrix\n", "t_from_row = mi.Transform3f([3, 2, 3]) # Broadcast over matrix columns\n", "t_from_row" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We then have a few static function helpful to construct common transformations:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Translate" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[1, 0, 0, 10],\n", " [0, 1, 0, 20],\n", " [0, 0, 1, 30],\n", " [0, 0, 0, 1]]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mi.Transform4f.translate([10, 20, 30])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Scale" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[10, 0, 0, 0],\n", " [0, 20, 0, 0],\n", " [0, 0, 30, 0],\n", " [0, 0, 0, 1]]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mi.Transform4f.scale([10, 20, 30])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Rotate" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[-4.37114e-08, 0, 1, 0],\n", " [0, 1, 0, 0],\n", " [-1, 0, -4.37114e-08, 0],\n", " [0, 0, 0, 1]]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mi.Transform4f.rotate(axis=[0, 1, 0], angle=90)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Look at" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[-1, 0, 0, 0],\n", " [0, 1, 0, 0],\n", " [0, 0, -1, 2],\n", " [0, 0, 0, 1]]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mi.Transform4f.look_at(origin=[0, 0, 2], target=[0, 0, 0], up=[0, 1, 0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Perspective" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The perspective projection does the following:\n", "- (1) Project camera space points onto $z=1$ plane, and non-linearly map $z$-coordinates from $[\\text{near}, \\text{far}]$ to $[0, 1]$,\n", "- (2) Scale $(x, y)$ such that the visible region specified by `fov` lies in $[-1, 1]\\times [-1, 1]$." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1, 0, 0, 0],\n", " [0, 1, 0, 0],\n", " [0, 0, 1.0101, -0.10101],\n", " [0, 0, 1, 0]]\n" ] }, { "data": { "text/plain": [ "[0.9999998807907104, -0.9999998807907104, 0.9595960378646851]" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "trafo = mi.Transform4f.perspective(fov=90, near=0.1, far=10)\n", "print(trafo)\n", "trafo @ mi.Point3f(2, -2, 2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Orthographic" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The orthographic projection maps the $z$-coordinate to $[0, 1]$." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1, 0, 0, 0],\n", " [0, 1, 0, 0],\n", " [0, 0, 0.10101, -0.010101],\n", " [0, 0, 0, 1]]\n" ] }, { "data": { "text/plain": [ "[1.0, 2.0, 0.2929292917251587]" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "trafo = mi.Transform4f.orthographic(near=0.1, far=10)\n", "print(trafo)\n", "trafo @ mi.Point3f(1, 2, 3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### From/to frame\n", "\n", "
\n", "\n", "⚠️ Only available for Transform4f\n", " \n", "
\n", "\n", "`mi.Transform4f.to_frame(frame)` is the matrix representation of the function `frame.to_local()`" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[0, 0, 3, 0],\n", " [0, 2, 0, 0],\n", " [1, 0, 0, 0],\n", " [0, 0, 0, 1]]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "frame = mi.Frame3f(\n", " [0, 0, 1],\n", " [0, 2, 0],\n", " [3, 0, 0],\n", ")\n", "mi.Transform4f.to_frame(frame)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Applying transforms\n", "\n", "The Python `@` (`__matmul__`) operator can be used to apply `Transform` objects to points, vectors, normals and rays or multiply transforms with other transforms. Depending on the operand's type, the operation has a different effect.\n", "\n", "- `Vector3f`: A typical matrix multiplication ignoring the homogenous coordinates (e.g. translation)\n", "- `Point3f`: Adjusted matrix multiplication taking into account homogenous coordinates\n", "- `Normal3f`: Matrix multiplication using the inverse transpose to handle non-uniform scaling of surface normals\n", "- `Ray3f`: Both the ray origin (`mi.Point`) and the ray direction (`mi.Vector`) are transformed with the `@` operator\n", "- `Transform4f`: Combine both transformation. " ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "t @ v=[3.0, 8.0, 15.0]\n", "t @ p=[3.0, 9.0, 17.0]\n", "t @ n=[1.0, 0.0, 0.0]\n" ] } ], "source": [ "t = mi.Transform4f.translate([0, 1, 2])\n", "t = t @ mi.Transform4f.scale([1, 2, 3])\n", "v = mi.Vector3f([3, 4, 5])\n", "p = mi.Point3f([3, 4, 5])\n", "n = mi.Normal3f([1, 0, 0])\n", "\n", "print(f\"{t @ v=}\")\n", "print(f\"{t @ p=}\")\n", "print(f\"{t @ n=}\")" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "#### Transformation order\n", "\n", "Transformations in Mitsuba are applied from right to left, similar to how such operations would be written in mathematical form. This means that when multiple transformations are chained together, the net transformation is equivalent to first performing the rightmost transformation, followed by the second rightmost transformation, and so on.\n", "\n", "In the following example, the point will first be scaled and then transposed." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[6.0, 2.0, 2.0]\n" ] } ], "source": [ "S = mi.Transform4f.scale(2.0)\n", "T = mi.Transform4f.translate([4, 0, 0])\n", "v = mi.Point3f([1, 1, 1])\n", "\n", "trasfo = T @ S\n", "\n", "print(trasfo @ v)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Chaining transforms\n", "\n", "For convinience, it is also possible to chain transformation intialization as follows:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "tags": [] }, "outputs": [ { "data": { "text/plain": [ "[[2, 0, 0, 2],\n", " [0, 2, 0, 0],\n", " [0, 0, 2, 0],\n", " [0, 0, 0, 1]]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mi.Transform4f.scale(2.0).translate([1, 0, 0])" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "The code above is equivalent to:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "[[2, 0, 0, 2],\n", " [0, 2, 0, 0],\n", " [0, 0, 2, 0],\n", " [0, 0, 0, 1]]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mi.Transform4f.scale(2.0) @ mi.Transform4f.translate([1, 0, 0])" ] } ], "metadata": { "interpreter": { "hash": "324262bda25e4aeb89fac5521e5e52d6dea4600b0315b63007798d9c65d5c62c" }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 4 }