{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# \"fast.ai\" layers\n", "\n", "This notebook defines layers similar to those defined in the [Swift for TensorFlow Deep Learning Library](https://github.com/tensorflow/swift-apis), but with some experimental extra features for the fast.ai course." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Installing packages:\n", "\t.package(path: \"/home/jupyter/notebooks/swift/FastaiNotebook_01_matmul\")\n", "\t\tFastaiNotebook_01_matmul\n", "With SwiftPM flags: []\n", "Working in: /tmp/tmp9uv0vajs/swift-install\n", "[1/3] Compiling FastaiNotebook_01_matmul 01_matmul.swift\n", "[2/3] Compiling FastaiNotebook_01_matmul 00_load_data.swift\n", "[3/4] Merging module FastaiNotebook_01_matmul\n", "[4/5] Compiling jupyterInstalledPackages jupyterInstalledPackages.swift\n", "[5/6] Merging module jupyterInstalledPackages\n", "[6/6] Linking libjupyterInstalledPackages.so\n", "Initializing Swift...\n", "Installation complete!\n" ] } ], "source": [ "%install-location $cwd/swift-install\n", "%install '.package(path: \"$cwd/FastaiNotebook_01_matmul\")' FastaiNotebook_01_matmul" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "//export\n", "import Path\n", "import TensorFlow" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import FastaiNotebook_01_matmul" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Tensor extensions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "All the fastai layers will use kaiming initialization as default, so we write an extension to do this fast." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "// export\n", "public extension Tensor where Scalar: TensorFlowFloatingPoint {\n", " init(kaimingNormal shape: TensorShape, negativeSlope: Double = 1.0) {\n", " // Assumes Leaky ReLU nonlinearity\n", " let gain = Scalar.init(TensorFlow.sqrt(2.0 / (1.0 + TensorFlow.pow(negativeSlope, 2))))\n", " let spatialDimCount = shape.count - 2\n", " let receptiveField = shape[0..(gain/TensorFlow.sqrt(Scalar(fanIn)))\n", " }\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We shorten the name of standardDeviation to match numpy/PyTorch since we'll have to write it a lot in the notebooks." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "// export\n", "public extension Tensor where Scalar: TensorFlowFloatingPoint {\n", " func std() -> Tensor { return standardDeviation() }\n", " func std(alongAxes a: [Int]) -> Tensor { return standardDeviation(alongAxes: a) }\n", " func std(alongAxes a: Tensor) -> Tensor { return standardDeviation(alongAxes: a) }\n", " func std(alongAxes a: Int...) -> Tensor { return standardDeviation(alongAxes: a) }\n", " func std(squeezingAxes a: [Int]) -> Tensor { return standardDeviation(squeezingAxes: a) }\n", " func std(squeezingAxes a: Tensor) -> Tensor { return standardDeviation(squeezingAxes: a) }\n", " func std(squeezingAxes a: Int...) -> Tensor { return standardDeviation(squeezingAxes: a) }\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Introducing Protocols\n", "\n", "Just like PyTorch, the S4TF base library provides a standard layer type. 
{ "cell_type": "markdown", "metadata": {}, "source": [ "# Introducing Protocols\n", "\n", "Just like PyTorch, the S4TF base library provides a standard layer type. It is defined in the [TensorFlow/swift-apis](https://github.com/tensorflow/swift-apis/) repository on GitHub in [Layer.swift](https://github.com/tensorflow/swift-apis/blob/master/Sources/TensorFlow/Layer.swift) and it looks like this:\n", "\n", "```swift\n", "protocol Layer: Differentiable {\n", " /// The input type of the layer.\n", " associatedtype Input: Differentiable\n", " /// The output type of the layer.\n", " associatedtype Output: Differentiable\n", "\n", " /// Returns the output obtained from applying the layer to the given input.\n", " ///\n", " /// - Parameter input: The input to the layer.\n", " /// - Returns: The output.\n", " @differentiable\n", " func call(_ input: Input) -> Output\n", "}\n", "```\n", "\n", "There is a lot going on here. Before we dive into the details of `Layer`, let's talk about what protocols are:\n", "\n", "**Slides**: [Protocols in Swift](https://docs.google.com/presentation/d/1dc6o2o-uYGnJeCeyvgsgyk05dBMneArxdICW5vF75oU/edit#slide=id.g5674d3ead7_0_474)\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Explaining `Layer`\n", "\n", "Let's break down the `Layer` protocol into its pieces:\n", "\n", "```swift\n", "protocol Layer: Differentiable {\n", " ...\n", "}\n", "```\n", "\n", "This declares a protocol named `Layer` and says that all layers are also `Differentiable`. `Differentiable` is yet another protocol, [defined in the Swift standard library](https://github.com/apple/swift/blob/tensorflow/stdlib/public/core/AutoDiff.swift#L99).\n", "\n", "```swift\n", " /// Returns the output obtained from applying the layer to the given input.\n", " ///\n", " /// - Parameter input: The input to the layer.\n", " /// - Returns: The output.\n", " @differentiable\n", " func call(_ input: Input) -> Output\n", "```\n", "\n", "These lines say that all `Layer`s are required to have a method named `call`, that it must be differentiable, and that it can take and return arbitrary `Input` and `Output` types.\n", "\n", "```swift\n", " /// The input type of the layer.\n", " associatedtype Input: Differentiable\n", " /// The output type of the layer.\n", " associatedtype Output: Differentiable\n", "```\n", "\n", "These two lines say that each `Layer` has to declare an `Input` and an `Output` type, and that those types must themselves be differentiable.\n" ] },
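{ "cell_type": "markdown", "metadata": {}, "source": [ "To make the `associatedtype` machinery concrete, here is a toy protocol together with a conforming type. (This is an illustration of the language feature, not part of any library; the names are made up.)\n", "\n", "```swift\n", "// Each conforming type picks its own concrete `Value` type.\n", "protocol Scaler {\n", " associatedtype Value\n", " func scale(_ x: Value, by factor: Value) -> Value\n", "}\n", "\n", "// Conforming fixes the associated type: `Value` is inferred to be `Int`.\n", "struct IntScaler: Scaler {\n", " func scale(_ x: Int, by factor: Int) -> Int { return x * factor }\n", "}\n", "```\n", "\n", "`Layer` works the same way: each concrete layer picks its own `Input` and `Output`, and the compiler checks them wherever layers are chained together." ] },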
{ "cell_type": "markdown", "metadata": {}, "source": [ "# Taking `Layer` further: `FALayer`\n", "\n", "Jeremy likes to have nice things, and in particular wants to be able to install delegate callbacks on `Layer`s (to mimic PyTorch's [hooks](https://pytorch.org/tutorials/beginner/former_torchies/nn_tutorial.html#forward-and-backward-function-hooks)). It might seem that we have to give up on using the built-in layer, but that isn't so! We can extend the existing `Layer` and give it new functionality with `FALayer`. The callbacks themselves are just closures that receive the layer's output, so we don't need a separate type for them." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can now define `FALayer`, which is just a layer that carries a list of delegate callbacks. Types that implement `FALayer` will implement a new `forward` method instead of `func call`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "//export\n", "\n", "// FALayer is a layer that supports callbacks through its delegates.\n", "public protocol FALayer: Layer {\n", " var delegates: [(Output) -> ()] { get set }\n", " \n", " // FALayers will implement this instead of `func call`.\n", " @differentiable\n", " func forward(_ input: Input) -> Output\n", " \n", " associatedtype Input\n", " associatedtype Output\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It would be a tremendous amount of boilerplate to require every `FALayer` to implement `func call` and invoke its delegates, so we do that once for all `FALayer`s right here in an extension. We implement it by calling the `forward` function and then invoking each delegate on the result:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "//export\n", "public extension FALayer {\n", " @differentiable(vjp: callGrad)\n", " func callAsFunction(_ input: Input) -> Output {\n", " let activation = forward(input)\n", " for d in delegates { d(activation) }\n", " return activation\n", " }\n", " \n", " // NOTE: AutoDiff synthesizes a leaking VJP for this, so we define a custom VJP.\n", " // TF-475: https://bugs.swift.org/browse/TF-475\n", " // NOTE: If we use `@differentiating`, then there is a linker error. So we use `@differentiable` instead.\n", " // TF-476: https://bugs.swift.org/browse/TF-476\n", " func callGrad(_ input: Input) ->\n", " (Output, (Self.Output.TangentVector) -> (Self.TangentVector, Self.Input.TangentVector)) {\n", " return Swift.valueWithPullback(at: self, input) { (m, i) in m.forward(i) }\n", " }\n", " \n", " // A default implementation of `delegates` would let conforming layers skip declaring it,\n", " // but it is disabled for now, so each layer declares its own stored `delegates` property:\n", " //var delegates: [(Output) -> ()] { \n", " // get { return [] }\n", " // set {}\n", " //}\n", " \n", " // Convenience for installing a delegate callback.\n", " mutating func addDelegate(_ d: @escaping (Output) -> ()) { delegates.append(d) }\n", "}\n" ] },
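{ "cell_type": "markdown", "metadata": {}, "source": [ "With this in place, installing a hook on any concrete `FALayer` is just a matter of appending a closure to `delegates`. A minimal sketch, assuming the `FADense` layer defined below:\n", "\n", "```swift\n", "var layer = FADense<Float>(784, 10)\n", "// Print the shape of every activation this layer produces.\n", "layer.addDelegate { activations in print(activations.shape) }\n", "```" ] },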
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "//export\n", "\n", "@frozen\n", "public struct FADense: FALayer {\n", " // Note: remove the explicit typealiases after TF-603 is resolved.\n", " public typealias Input = Tensor\n", " public typealias Output = Tensor\n", " public var weight: Tensor\n", " public var bias: Tensor\n", " public typealias Activation = @differentiable (Tensor) -> Tensor\n", " @noDerivative public var delegates: [(Output) -> ()] = []\n", " @noDerivative public let activation: Activation\n", "\n", " public init(\n", " weight: Tensor,\n", " bias: Tensor,\n", " activation: @escaping Activation\n", " ) {\n", " self.weight = weight\n", " self.bias = bias\n", " self.activation = activation\n", " }\n", "\n", " @differentiable\n", " public func forward(_ input: Tensor) -> Tensor {\n", " return activation(input • weight + bias)\n", " }\n", "}\n", "\n", "public extension FADense {\n", " init(_ nIn: Int, _ nOut: Int, activation: @escaping Activation = identity) {\n", " self.init(weight: Tensor(kaimingNormal: [nIn, nOut], negativeSlope: 1.0),\n", " bias: Tensor(zeros: [nOut]),\n", " activation: activation)\n", " }\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Conv2D" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "//export\n", "\n", "@frozen\n", "public struct FANoBiasConv2D: FALayer {\n", " // TF-603 workaround.\n", " public typealias Input = Tensor\n", " public typealias Output = Tensor\n", " \n", " public var filter: Tensor\n", " public typealias Activation = @differentiable (Tensor) -> Tensor\n", " @noDerivative public let activation: Activation\n", " @noDerivative public let strides: (Int, Int)\n", " @noDerivative public let padding: Padding\n", " @noDerivative public var delegates: [(Output) -> ()] = []\n", "\n", " public init(\n", " filter: Tensor,\n", " activation: @escaping Activation,\n", " strides: (Int, Int),\n", " padding: Padding\n", " ) {\n", " self.filter = filter\n", " self.activation = activation\n", " self.strides = strides\n", " self.padding = padding\n", " }\n", "\n", " @differentiable\n", " public func forward(_ input: Tensor) -> Tensor {\n", " return activation(conv2D(input, filter: filter,\n", " strides: (1, strides.0, strides.1, 1),\n", " padding: padding))\n", " }\n", "}\n", "\n", "public extension FANoBiasConv2D {\n", " init(\n", " filterShape: (Int, Int, Int, Int),\n", " strides: (Int, Int) = (1, 1),\n", " padding: Padding = .same,\n", " activation: @escaping Activation = identity\n", " ) {\n", " let filterTensorShape = TensorShape([\n", " filterShape.0, filterShape.1,\n", " filterShape.2, filterShape.3])\n", " self.init(\n", " filter: Tensor(kaimingNormal: filterTensorShape, negativeSlope: 1.0),\n", " activation: activation,\n", " strides: strides,\n", " padding: padding)\n", " }\n", "}\n", "\n", "public extension FANoBiasConv2D {\n", " init(_ cIn: Int, _ cOut: Int, ks: Int, stride: Int = 1, padding: Padding = .same,\n", " activation: @escaping Activation = identity){\n", " self.init(filterShape: (ks, ks, cIn, cOut),\n", " strides: (stride, stride),\n", " padding: padding,\n", " activation: activation)\n", " }\n", "}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "//export\n", "\n", "@frozen\n", "public struct FAConv2D: FALayer {\n", " // Note: remove the explicit typealiases after TF-603 is resolved.\n", " public typealias Input = Tensor\n", " public typealias Output = Tensor\n", 
" \n", " public var filter: Tensor\n", " public var bias: Tensor\n", " public typealias Activation = @differentiable (Tensor) -> Tensor\n", " @noDerivative public let activation: Activation\n", " @noDerivative public let strides: (Int, Int)\n", " @noDerivative public let padding: Padding\n", " @noDerivative public var delegates: [(Output) -> ()] = []\n", "\n", " public init(\n", " filter: Tensor,\n", " bias: Tensor,\n", " activation: @escaping Activation,\n", " strides: (Int, Int),\n", " padding: Padding\n", " ) {\n", " self.filter = filter\n", " self.bias = bias\n", " self.activation = activation\n", " self.strides = strides\n", " self.padding = padding\n", " }\n", "\n", " @differentiable\n", " public func forward(_ input: Tensor) -> Tensor {\n", " return activation(conv2D(input, filter: filter,\n", " strides: (1, strides.0, strides.1, 1),\n", " padding: padding) + bias)\n", " }\n", "}\n", "\n", "public extension FAConv2D {\n", " init(\n", " filterShape: (Int, Int, Int, Int),\n", " strides: (Int, Int) = (1, 1),\n", " padding: Padding = .same,\n", " activation: @escaping Activation = identity\n", " ) {\n", " let filterTensorShape = TensorShape([\n", " filterShape.0, filterShape.1,\n", " filterShape.2, filterShape.3])\n", " self.init(\n", " filter: Tensor(kaimingNormal: filterTensorShape, negativeSlope: 1.0),\n", " bias: Tensor(zeros: TensorShape([filterShape.3])),\n", " activation: activation,\n", " strides: strides,\n", " padding: padding)\n", " }\n", "}\n", "\n", "public extension FAConv2D {\n", " init(_ cIn: Int, _ cOut: Int, ks: Int, stride: Int = 1, padding: Padding = .same,\n", " activation: @escaping Activation = identity){\n", " self.init(filterShape: (ks, ks, cIn, cOut),\n", " strides: (stride, stride),\n", " padding: padding,\n", " activation: activation)\n", " }\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# AvgPool2D" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "//export\n", "\n", "@frozen\n", "public struct FAAvgPool2D: FALayer {\n", " // TF-603 workaround.\n", " public typealias Input = Tensor\n", " public typealias Output = Tensor\n", " \n", " @noDerivative let poolSize: (Int, Int, Int, Int)\n", " @noDerivative let strides: (Int, Int, Int, Int)\n", " @noDerivative let padding: Padding\n", " @noDerivative public var delegates: [(Output) -> ()] = []\n", "\n", " public init(\n", " poolSize: (Int, Int, Int, Int),\n", " strides: (Int, Int, Int, Int),\n", " padding: Padding\n", " ) {\n", " self.poolSize = poolSize\n", " self.strides = strides\n", " self.padding = padding\n", " }\n", "\n", " public init(poolSize: (Int, Int), strides: (Int, Int), padding: Padding = .valid) {\n", " self.poolSize = (1, poolSize.0, poolSize.1, 1)\n", " self.strides = (1, strides.0, strides.1, 1)\n", " self.padding = padding\n", " }\n", " \n", " public init(_ sz: Int, padding: Padding = .valid) {\n", " poolSize = (1, sz, sz, 1)\n", " strides = (1, sz, sz, 1)\n", " self.padding = padding\n", " }\n", "\n", " @differentiable\n", " public func forward(_ input: Tensor) -> Tensor {\n", " return avgPool2D(input, filterSize: poolSize, strides: strides, padding: padding)\n", " }\n", "}" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "//export\n", "\n", "@frozen\n", "public struct FAGlobalAvgPool2D: FALayer {\n", " // TF-603 workaround.\n", " public typealias Input = Tensor\n", " public typealias Output = Tensor\n", " @noDerivative public var delegates: [(Output) -> ()] = []\n", " \n", " 
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Make Array conform to Layer" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "//export\n", "extension Array: Layer where Element: Layer, Element.Input == Element.Output {\n", " // Note: remove the explicit typealiases after TF-603 is resolved.\n", " public typealias Input = Element.Input\n", " public typealias Output = Element.Output\n", " \n", " @differentiable(vjp: _vjpApplied)\n", " public func callAsFunction(_ input: Input) -> Output {\n", " var activation = input\n", " for layer in self {\n", " activation = layer(activation)\n", " }\n", " return activation\n", " }\n", " \n", " public func _vjpApplied(_ input: Input)\n", " -> (Output, (Output.TangentVector) -> (Array.TangentVector, Input.TangentVector))\n", " {\n", " var activation = input\n", " var pullbacks: [(Input.TangentVector) -> (Element.TangentVector, Input.TangentVector)] = []\n", " for layer in self {\n", " let (newActivation, newPullback) = layer.valueWithPullback(at: activation) { $0($1) }\n", " activation = newActivation\n", " pullbacks.append(newPullback)\n", " }\n", " func pullback(_ v: Input.TangentVector) -> (Array.TangentVector, Input.TangentVector) {\n", " var activationGradient = v\n", " var layerGradients: [Element.TangentVector] = []\n", " // Run the pullbacks in reverse order, accumulating per-layer gradients.\n", " for pullback in pullbacks.reversed() {\n", " let (newLayerGradient, newActivationGradient) = pullback(activationGradient)\n", " activationGradient = newActivationGradient\n", " layerGradients.append(newLayerGradient)\n", " }\n", " return (Array.TangentVector(layerGradients.reversed()), activationGradient)\n", " }\n", " return (activation, pullback)\n", " }\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Simplify some names" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "//export \n", "extension KeyPathIterable {\n", " public var keyPaths: [WritableKeyPath<Self, Tensor<Float>>] {\n", " return recursivelyAllWritableKeyPaths(to: Tensor<Float>.self)\n", " }\n", "}\n", "\n", "extension Layer {\n", " public var variables: AllDifferentiableVariables {\n", " get { return allDifferentiableVariables }\n", " set { allDifferentiableVariables = newValue }\n", " }\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## power operator" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "// export\n", "public func ** (lhs: Int, rhs: Int) -> Int {\n", " return Int(pow(Double(lhs), Double(rhs)))\n", "}\n", "\n", "public func ** (lhs: Double, rhs: Double) -> Double {\n", " return pow(lhs, rhs)\n", "}\n", "\n", "public func **<T: BinaryFloatingPoint>(_ x: T, _ y: T) -> T {\n", " return T(pow(Double(x), Double(y)))\n", "}\n", "\n", "public func **<T>(_ x: Tensor<T>, _ y: Tensor<T>) -> Tensor<T>\n", " where T : TensorFlowFloatingPoint { return pow(x, y)}\n", "\n", "public func **<T>(_ x: T, _ y: Tensor<T>) -> Tensor<T>\n", " where T : TensorFlowFloatingPoint { return pow(x, y)}\n", "\n", "public func **<T>(_ x: Tensor<T>, _ y: T) -> Tensor<T>\n", " where T : TensorFlowFloatingPoint { return pow(x, y)}" ] },
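{ "cell_type": "markdown", "metadata": {}, "source": [ "Note that the scalar overloads round-trip through `Double`, which is fine for hyperparameter arithmetic but not for precision-critical work. A quick check (illustrative, not exported):" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(2 ** 10) // 1024\n", "print(2.0 ** 0.5) // 1.4142...\n", "print(Tensor<Float>([1, 2, 3]) ** 2.0) // [1.0, 4.0, 9.0]" ] },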
{ "cell_type": "markdown", "metadata": {}, "source": [ "### Compose\n", "\n", "`compose` chains up to six layers in order; it is a thin wrapper around `sequenced(through:)` that reads more naturally at a call site." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "//export\n", "public extension Differentiable {\n", " @differentiable\n", " func compose<L1: Layer, L2: Layer>(_ l1: L1, _ l2: L2) -> L2.Output\n", " where L1.Input == Self, L1.Output == L2.Input {\n", " return sequenced(through: l1, l2)\n", " }\n", " \n", " @differentiable\n", " func compose<L1: Layer, L2: Layer, L3: Layer>(_ l1: L1, _ l2: L2, _ l3: L3) -> L3.Output\n", " where L1.Input == Self, L1.Output == L2.Input, L2.Output == L3.Input {\n", " return sequenced(through: l1, l2, l3)\n", " }\n", " \n", " @differentiable\n", " func compose<L1: Layer, L2: Layer, L3: Layer, L4: Layer>(\n", " _ l1: L1, _ l2: L2, _ l3: L3, _ l4: L4\n", " ) -> L4.Output\n", " where L1.Input == Self, L1.Output == L2.Input, L2.Output == L3.Input,\n", " L3.Output == L4.Input {\n", " return sequenced(through: l1, l2, l3, l4)\n", " }\n", " \n", " @differentiable\n", " func compose<L1: Layer, L2: Layer, L3: Layer, L4: Layer, L5: Layer>(\n", " _ l1: L1, _ l2: L2, _ l3: L3, _ l4: L4, _ l5: L5\n", " ) -> L5.Output\n", " where L1.Input == Self, L1.Output == L2.Input, L2.Output == L3.Input, L3.Output == L4.Input,\n", " L4.Output == L5.Input {\n", " return sequenced(through: l1, l2, l3, l4, l5)\n", " }\n", " \n", " @differentiable\n", " func compose<L1: Layer, L2: Layer, L3: Layer, L4: Layer, L5: Layer, L6: Layer>(\n", " _ l1: L1, _ l2: L2, _ l3: L3, _ l4: L4, _ l5: L5, _ l6: L6\n", " ) -> L6.Output\n", " where L1.Input == Self, L1.Output == L2.Input, L2.Output == L3.Input, L3.Output == L4.Input,\n", " L4.Output == L5.Input, L5.Output == L6.Input {\n", " return sequenced(through: l1, l2, l3, l4, l5, l6)\n", " }\n", "}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Export" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "success\r\n" ] } ], "source": [ "import NotebookExport\n", "let exporter = NotebookExport(Path.cwd/\"01a_fastai_layers.ipynb\")\n", "print(exporter.export(usingPrefix: \"FastaiNotebook_\"))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Swift", "language": "swift", "name": "swift" } }, "nbformat": 4, "nbformat_minor": 2 }