---
title: Getting started
weight: 10
aliases: /gaudi-rag-chat-qna/gaudi-rag-chat-qna-getting-started/
---

:toc:
:imagesdir: /images
:_content-type: ASSEMBLY
include::modules/comm-attributes.adoc[]

[id="deploying-grcq-pattern"]
= Deploying the {gcqna-pattern}

.Prerequisites

* An {ocp} cluster
** To create an {ocp} cluster, go to the https://console.redhat.com/[Red Hat Hybrid Cloud console] and select *Services \-> Containers \-> Create cluster*.
** The cluster must have a dynamic `StorageClass` to provision `PersistentVolumes`. The pattern was tested with the ODF (OpenShift Data Foundation) or LVM Storage solutions. CephFS should be set as the default storage class; see the https://docs.openshift.com/container-platform/4.16/storage/container_storage_interface/persistent-storage-csi-sc-manage.html#change-default-storage-class_persistent-storage-csi-sc-manage[Setup Guide].
** link:../../gaudi-rag-chat-qna/gaudi-rag-chat-qna-required-hardware[Required hardware].
* The {ocp} cluster must have a configured image registry; see the https://docs.openshift.com/container-platform/4.16/registry/configuring_registry_storage/configuring-registry-storage-rhodf.html[Setup Guide].
* A GitHub account and a token for it with repositories permissions, to read from and write to your forks.
* A HuggingFace account and a User Access Token, which allows you to download AI models. For more information about User Access Tokens, see the https://huggingface.co/docs/hub/en/security-tokens[official HuggingFace website].
//Replaced git and podman prereqs with the tooling dependencies page
* https://validatedpatterns.io/learn/quickstart/[Install the tooling dependencies].
* https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html#getting-started-install-instructions[Install the AWS CLI tool] to check the status of the S3 bucket (RGW storage).

If you do not have a running Red Hat OpenShift cluster, you can start one on a public or private cloud by using https://console.redhat.com/openshift/create[Red Hat Hybrid Cloud Console].
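
The storage prerequisite above can be checked from a terminal before you start. The following is a minimal sketch, not part of the pattern: `default_storage_class` is a hypothetical helper name, and it assumes that `oc get storageclass` marks the default class with `(default)` after its name, as current `oc` and `kubectl` releases do.

[source,terminal]
----
# Hypothetical helper: read the table printed by `oc get storageclass`
# on stdin and print the name of every class marked "(default)".
default_storage_class() {
  awk '/\(default\)/ { print $1 }'
}

# Usage (requires a logged-in `oc` session):
#   oc get storageclass | default_storage_class
----

If the command prints nothing, no default storage class is set. If it prints a class other than the CephFS one, follow the Setup Guide linked above to change the default.
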
.Procedure

. Fork the https://github.com/validatedpatterns-sandbox/qna-chat-gaudi[qna-chat-gaudi] repository on GitHub.

. Clone the forked copy of this repository.
+
[source,terminal]
----
git clone git@github.com:your-username/qna-chat-gaudi.git
----

. Create a local copy of the secret values file that can safely include credentials. Run the following commands:
+
[source,terminal]
----
cp values-secret.yaml.template ~/values-gaudi-rag-chat-qna.yaml
----
+
[source,terminal]
----
vi ~/values-gaudi-rag-chat-qna.yaml
----
+
[WARNING]
====
Do not commit this file. You do not want to push personal credentials to GitHub. If you do not need to customize the secrets, you can skip this step. A prompt at the beginning of the installation process asks for the required HuggingFace User Access Token.
====

. (Optional) If the cluster is behind a proxy, `values-global.yaml` should be similar to the following:
+
[WARNING]
====
If the cluster is behind a proxy, remember to change the proxy values of the `gaudillm.build_envs` and `gaudillm.runtime_envs` fields in the `values-global.yaml` file to the appropriate ones.
====
+
[source,yaml]
----
gaudillm:
  namespace: gaudi-llm
  build_envs:
    - name: http_proxy
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: https_proxy
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: HTTP_PROXY
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: HTTPS_PROXY
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: no_proxy
      value: .cluster.local,.gaudi.internal,.cluster1.gaudi.internal,.svc,192.168.122.0/24,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.cluster1.gaudi.internal,localhost
    - name: NO_PROXY
      value: .cluster.local,.gaudi.internal,.cluster1.gaudi.internal,.svc,192.168.122.0/24,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.cluster1.gaudi.internal,localhost
  runtime_envs:
    - name: http_proxy
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: https_proxy
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: HTTP_PROXY
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: HTTPS_PROXY
      value: http://proxy-internal.cluster1.gaudi.internal:912
    - name: no_proxy
      value: .cluster.local,.gaudi.internal,.cluster1.gaudi.internal,.svc,192.168.122.0/24,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.cluster1.gaudi.internal,localhost
    - name: NO_PROXY
      value: .cluster.local,.gaudi.internal,.cluster1.gaudi.internal,.svc,192.168.122.0/24,10.128.0.0/14,127.0.0.1,172.30.0.0/16,api-int.cluster1.gaudi.internal,localhost
----

. Customize the deployment for your cluster. Run the following commands:
+
[source,terminal]
----
git checkout -b my-branch
----
+
[source,terminal]
----
vi values-global.yaml
----
+
[source,terminal]
----
git add values-global.yaml
----
+
[source,terminal]
----
git commit values-global.yaml
----
+
[source,terminal]
----
git push origin my-branch
----

. Deploy the pattern by running `./pattern.sh make install` or by using the link:/infrastructure/using-validated-pattern-operator/[Validated Patterns Operator].
+
[WARNING]
====
If you have not set the HuggingFace token in the secrets file, you are prompted to enter it.
====

[id="deploying-cluster-using-patternsh-file"]
== Deploying the cluster by using the pattern.sh file

To deploy the cluster by using the `pattern.sh` file, complete the following steps:

. Log in to your cluster by running the following command:
+
[source,terminal]
----
oc login
----
+
Optional: Set the `KUBECONFIG` variable for the `kubeconfig` file path:
+
[source,terminal]
----
export KUBECONFIG=~/
----

. Deploy the pattern to your cluster. Run the following command:
+
[source,terminal]
----
./pattern.sh make install
----

. At the beginning of the installation, a prompt asks for the HuggingFace token. It looks like this:
+
[source,terminal]
----
Insert HuggingFace Token:
----
+
[WARNING]
====
The Validated Pattern can take a while to install fully because it requires a couple of reboots to apply MachineConfigs to the worker nodes.
====
+
As part of this pattern, HashiCorp Vault has been installed. Refer to the section on https://validatedpatterns.io/secrets/vault/[Vault].

[id="pattern-verification"]
== Verification

. Verify that the Operators have been installed.
.. To verify, in the OpenShift Container Platform web console, navigate to the *Operators → Installed Operators* page.
.. Check that the Operators are installed in the `openshift-operators` namespace and that their status is `Succeeded`.

. Verify that the S3 bucket was created successfully.
Run the following commands:
+
[source,terminal]
----
export AWS_ACCESS_KEY_ID=$(oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d)
export AWS_SECRET_ACCESS_KEY=$(oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d)
export CEPH_RGW_ENDPOINT=http://$(oc -n openshift-storage get route s3-rgw -ojsonpath='{.spec.host}')
aws --endpoint-url "${CEPH_RGW_ENDPOINT}" s3api list-buckets
----
+
The expected response shows that a bucket named `model-bucket` exists:
+
[source,terminal]
----
{
    "Buckets": [
        {
            "Name": "model-bucket",
            "CreationDate": "2024-06-27T08:44:09.451000+00:00"
        }
    ],
    "Owner": {
        "DisplayName": "ocs-storagecluster",
        "ID": "ocs-storagecluster-cephobjectstoreuser"
    }
}
----

. Verify that the MachineConfigPool is ready. To check the status, run the following command:
+
[source,terminal]
----
oc get mcp
----

. Verify that all applications are healthy and synchronized. Under the project `gaudi-rag-chat-qna`, click the URL for the `hub` gitops `server`.
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-argocd.png[Gaudi RAG Chat QnA GitOps Hub]

[id="setup-rhoai"]
== Set up Red Hat OpenShift AI

After all components are deployed, you can set up Red Hat OpenShift AI. The procedure consists of two manual steps:

. Uploading the AI model to the S3 bucket (RGW storage) on the cluster
. Deploying TGI

[id="setup-rhoai-ai-model"]
=== Upload AI model

. Go to the `RHOAI dashboard/Data Science Projects` tab and select the `gaudi-llm` project. If the `gaudi-llm` project is missing, check whether the application is ready in the ArgoCD dashboard.
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-choose-project.png[Select RHOAI project]

. Go to the `Workbenches` tab and click `Create workbench`:
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-create-workbench.png[Create RHOAI workbench]

. Fill in the input form as shown in the following images:
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-deploy-notebook-1.png[Deploy Jupyter notebook 1]
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-deploy-notebook-2.png[Deploy Jupyter notebook 2]

. After the workbench is created, open its Jupyter notebook dashboard and upload the Jupyter notebook file `/download-model.ipynb` to the file explorer so that it looks like this:
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-notebook-view.png[Jupyter notebook view]

. When `download-model` is uploaded, run all of the notebook's commands. After the notebook has run, the model should be uploaded to the S3 bucket. To check that the model is present, run the following commands:
+
[source,terminal]
----
export AWS_ACCESS_KEY_ID=$(oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d)
export AWS_SECRET_ACCESS_KEY=$(oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d)
export CEPH_RGW_ENDPOINT=http://$(oc -n openshift-storage get route s3-rgw -ojsonpath='{.spec.host}')
aws --endpoint-url "${CEPH_RGW_ENDPOINT}" s3 ls model-bucket/models/
----
+
The response should show that the model directory is present.

[id="setup-rhoai-tgi"]
=== Deploy TGI

. Go to the `RHOAI dashboard/Data Science Projects` tab and select the `gaudi-llm` project:
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-choose-project.png[Select RHOAI project]

. Select the `Data connections` tab and click `Add data connection`:
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-add-data-connection.png[Add Data connection]

. The `Add data connection` form appears with several inputs. The form should look similar to this:
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-data-connection-details.png[Complete Data connection details]

.. To get the value for `Access key`, run the following command:
+
[source,terminal]
----
oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_ACCESS_KEY_ID}' | base64 -d
----

.. To get the value for `Secret key`, run the following command:
+
[source,terminal]
----
oc -n openshift-storage get secret s3-secret-bck -ojsonpath='{.data.AWS_SECRET_ACCESS_KEY}' | base64 -d
----

.. To get the value for `Endpoint`, run the following command:
+
[source,terminal]
----
echo "http://$(oc -n openshift-storage get route s3-rgw -ojsonpath='{.spec.host}')"
----

. Go to the `Models` tab and click `Deploy model` in the `Single-model serving platform` section:
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-create-single-model-serving.png[Create Single Model Serving]

. The `Deploy model` form appears. Fill in all the inputs as shown in the following images, and then click the `Deploy` button:
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-deploy-model-1.png[Deploy model 1]
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-deploy-model-2.png[Deploy model 2]

. If everything is set up correctly, go to the `Models` tab again to check the status of TGI. It should look like this:
+
image::/images/gaudi-rag-chat-qna/gaudi-rag-chat-qna-tgi-status.png[Status of TGI]

[id="start-rag-chat-qna-demo"]
== RAG Chat demo

After the whole setup is complete, the demo application is ready to use. To obtain the Chat QnA UI address, run the following command:

[source,terminal]
----
echo "http://$(oc -n gaudi-llm get route chatqna-gaudi-ui-server -ojsonpath='{.spec.host}')"
----
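
If the UI address does not respond right away, a small polling helper can confirm when it becomes reachable. The following is only a sketch under stated assumptions: `wait_for_url` is a hypothetical helper, the defaults of 30 attempts and a 10-second delay are illustrative, `curl` must be installed, and the UI is assumed to answer a plain GET on its root path with a successful HTTP status.

[source,terminal]
----
# Hypothetical helper: poll a URL until it answers successfully.
# Arguments: URL, number of attempts (default 30), delay in seconds (default 10).
wait_for_url() {
  _url=$1
  _attempts=${2:-30}
  _delay=${3:-10}
  _i=1
  while [ "$_i" -le "$_attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; the body is discarded.
    if curl -fsS -o /dev/null "$_url"; then
      echo "ready: $_url"
      return 0
    fi
    sleep "$_delay"
    _i=$((_i + 1))
  done
  echo "not ready after $_attempts attempts: $_url" >&2
  return 1
}

# Usage (requires a logged-in `oc` session):
#   wait_for_url "http://$(oc -n gaudi-llm get route chatqna-gaudi-ui-server -ojsonpath='{.spec.host}')"
----

The helper returns a non-zero exit status when the URL never responds, so it can gate later steps in a script.
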