# RDMA Environment Preparation and Installation

This chapter introduces the parameters and instructions for RDMA when installing Spiderpool. Currently, RDMA supports the following two usage modes:

- Based on the Macvlan/IPVLAN CNI, using **RDMA shared mode** to expose the RoCE NIC on the host to the pod. The [Shared Device Plugin](https://github.com/Mellanox/k8s-rdma-shared-dev-plugin) must be deployed to expose the RDMA NIC resources and schedule pods.
- Based on the SR-IOV CNI, using **RDMA exclusive mode** to expose the RoCE NIC on the host to the pod. The [RDMA CNI](https://github.com/k8snetworkplumbingwg/rdma-cni) is required to achieve RDMA device isolation.

## Prerequisites

1. Make sure that RDMA devices are available in the cluster environment.

2. Make sure that the nodes in the cluster have Mellanox NICs with RoCE support. This example uses Mellanox ConnectX-5 NICs with the proper OFED driver installed; if the driver is missing, refer to [Install OFED Driver](./ofed_driver.md).

## Exposing RoCE NIC based on Macvlan/IPVLAN

1. When exposing the RoCE NIC based on Macvlan/IPVLAN, make sure that the RDMA subsystem on the host is working in **shared mode**; otherwise, switch it to **shared mode**.

    ```sh
    # Check the current mode of the RDMA subsystem
    $ rdma system
    netns shared copy-on-fork on

    # Switch to shared mode
    $ rdma system set netns shared
    ```

1. Confirm the information of the RDMA NIC for subsequent device plugin discovery. Run the following command to view the NIC vendor, which is 15b3, and the deviceID, which is 1017. This information is required when deploying Spiderpool.

    ```sh
    $ lspci -nn | grep Ethernet
    af:00.0 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
    af:00.1 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
    ```

1. Install Spiderpool and configure the parameters related to the [Shared Device Plugin](https://github.com/Mellanox/k8s-rdma-shared-dev-plugin). Refer to [Installing Spiderpool](install.md) for deployment details.

    | Parameter | Value | Description |
    | --- | --- | --- |
    | rdmaSharedDevicePlugin.install | true | Whether to enable the **RdmaSharedDevicePlugin** |
    | rdmaSharedDevicePlugin.deviceConfig.resourceName | hca_shared_devices | Name of the shared RDMA device resource, which is referenced when creating workloads |
    | rdmaSharedDevicePlugin.deviceConfig.deviceIDs | 1017 | Device ID, consistent with the information queried in the previous step |
    | rdmaSharedDevicePlugin.deviceConfig.vendors | 15b3 | NIC vendor, consistent with the information queried in the previous step |

    After successful deployment, you can view the installed components.

1. After the installation is complete, log in to the controller node to view the reported RDMA device resources.

    ```sh
    $ kubectl get no -o json | jq -r '[.items[] | {name:.metadata.name, allocable:.status.allocatable}]'
    [
      {
        "name": "10-20-1-10",
        "allocable": {
          "cpu": "40",
          "memory": "263518036Ki",
          "pods": "110",
          "spidernet.io/hca_shared_devices": "500",  # number of available hca_shared_devices
          ...
        }
      },
      ...
    ]
    ```

    If the reported resource count is 0, possible reasons are:

    - The **vendors** and **deviceIDs** in the ConfigMap **spiderpool-rdma-shared-device-plugin** do not match the actual hardware; confirm that they do.
    - The **rdma-shared-device-plugin** is failing; check its logs, as sketched below.
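    The commands below are a minimal troubleshooting sketch for both checks. The ConfigMap name comes from the note above; the `kube-system` namespace and the pod name placeholder are assumptions based on a default Spiderpool install and may differ in your cluster.

    ```sh
    # Inspect the device plugin configuration and compare vendors/deviceIDs
    # with the lspci output; the namespace assumes a default install
    kubectl -n kube-system get configmap spiderpool-rdma-shared-device-plugin -o yaml

    # Locate the device plugin pod running on the affected node, then read its logs
    kubectl -n kube-system get pods -o wide | grep rdma-shared
    kubectl -n kube-system logs <rdma-shared-device-plugin-pod-name>
    ```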
    For NICs that support RDMA, if the following error is found in the log, try running `apt-get install rdma-core` or `dnf install rdma-core` on the host to install rdma-core.

    ```console
    error creating new device: "missing RDMA device spec for device 0000:04:00.0, RDMA device \"issm\" not found"
    ```

1. If Spiderpool has been deployed and the device resources have been discovered successfully, complete the following steps:

    - Create the Multus instances, refer to [Creating Multus CR](../../../config/multus-cr.md)
    - Create the IP pool, refer to [Creating Subnets and IP Pool](../../../config/ippool/createpool.md)

1. After the creation is complete, you can use this resource pool to create workloads. Refer to [Using RDMA in Workloads](../../../config/userdma.md) for details. For more usage methods, refer to [Using IP Pool in Workloads](../../../config/use-ippool/usage.md).

## Using RoCE NIC based on SR-IOV

1. When exposing the RoCE NIC based on SR-IOV, make sure that the RDMA subsystem on the host is working in **exclusive mode**; otherwise, switch it to **exclusive mode**.

    ```sh
    # Switch to exclusive mode; this setting is lost after the host restarts
    $ rdma system set netns exclusive

    # To persist the configuration, restart the machine after this modification
    $ echo "options ib_core netns_mode=0" >> /etc/modprobe.d/ib_core.conf

    # Verify the mode
    $ rdma system
    netns exclusive copy-on-fork on
    ```

1. Confirm that the NIC supports SR-IOV and check the maximum number of VFs supported:

    ```sh
    cat /sys/class/net/ens6f0np0/device/sriov_totalvfs
    ```

    The output is similar to:

    ```output
    8
    ```

1. Confirm the information of the RDMA NIC for subsequent device plugin discovery. In this demo environment, the NIC **vendors** is 15b3 and the NIC **deviceIDs** is 1017. This information will be used when creating the **SriovNetworkNodePolicy** in a later step.

    ```sh
    lspci -nn | grep Ethernet
    ```

    The output is similar to:

    ```output
    04:00.0 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
    04:00.1 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
    ```
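    Optionally, confirm which RDMA device maps to which network interface before provisioning VFs. A minimal sketch, assuming the Mellanox OFED tools from the prerequisites are installed; the device and interface names in the output are illustrative:

    ```sh
    # ibdev2netdev ships with the Mellanox OFED driver and prints the
    # RDMA-device-to-netdev mapping (output below is illustrative)
    $ ibdev2netdev
    mlx5_0 port 1 ==> ens6f0np0 (Up)
    mlx5_1 port 1 ==> ens6f1np1 (Up)
    ```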
1. Install Spiderpool and enable the RDMA CNI and SR-IOV CNI. Refer to [Installing Spiderpool](install.md) for installation details. A sketch of the corresponding helm command follows the table.

    | Parameter | Value | Description |
    | --- | --- | --- |
    | multus.multusCNI.defaultCniCRName | sriov-rdma | Default CNI name; specifies the name of the NetworkAttachmentDefinition instance used by Multus by default |
    | sriov.install | true | Enable SR-IOV CNI |
    | plugins.installRdmaCNI | true | Enable RDMA CNI |
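    The snippet below is a minimal sketch of passing the parameters above to the chart; the repository alias, chart name, and `kube-system` namespace are assumptions based on a default install and may differ in your environment.

    ```sh
    # Repo alias, chart name, and namespace are assumptions; the --set values
    # map one-to-one onto the parameters in the table above
    helm install spiderpool spiderpool/spiderpool -n kube-system \
      --set multus.multusCNI.defaultCniCRName=sriov-rdma \
      --set sriov.install=true \
      --set plugins.installRdmaCNI=true
    ```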
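1. After Spiderpool is installed, create the **SriovNetworkNodePolicy** referenced in the earlier step so that the SR-IOV operator provisions VFs and reports them as node resources. The sketch below is an assumption-laden example rather than a definitive manifest: the policy name, namespace, and resource name are hypothetical, while the vendor, deviceID, and VF count come from the values queried above.

    ```sh
    cat <<EOF | kubectl apply -f -
    apiVersion: sriovnetwork.openshift.io/v1
    kind: SriovNetworkNodePolicy
    metadata:
      name: sriov-rdma-policy           # hypothetical name
      namespace: kube-system            # assumed namespace of the SR-IOV operator
    spec:
      resourceName: sriov_rdma_devices  # hypothetical resource name
      nodeSelector:
        kubernetes.io/os: "linux"
      numVfs: 8                         # within the sriov_totalvfs queried earlier
      nicSelector:
        vendor: "15b3"                  # from the lspci output above
        deviceID: "1017"                # from the lspci output above
      deviceType: netdevice
      isRdma: true                      # keep the RDMA capability on the VFs
    EOF
    ```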