--- name: gke-creation description: Guides the user through creating GKE clusters using pre-defined templates (Standard, Autopilot, GPU/AI). --- # GKE Cluster Creation Skill This skill helps users create Google Kubernetes Engine (GKE) clusters by providing a set of best-practice templates and guiding them through the customization process. ## core_behavior 1. **Template Selection**: - Present the available templates to the user if they haven't specified one. - Explain the trade-offs (e.g., Cost vs. Availability, Autopilot vs. Standard). 2. **Customization**: - Once a template is selected, present the default configuration (JSON/YAML). - Ask the user for essential missing information: `project_id`, `location`, `cluster_name`. - Ask if they want to modify optional fields (e.g., `machineType`, `nodeCount`, `network`). 3. **Validation**: - Ensure `project_id`, `location`, and `cluster_name` are set. - Ensure the configuration matches the `create_cluster` MCP tool schema. 4. **Execution**: - Call the `create_cluster` MCP tool with the final configuration. ## best_practices When guiding the user or generating configurations, adhere to the following GKE cluster creation best practices: ### Security 1. **Private Clusters**: Default to private clusters with a private control plane and restricted public endpoints to minimize attack surface. 2. **VPC-Native Networking**: Use VPC-native clusters to enable alias IP ranges, which allows pod-level firewall rules and better network security. 3. **Workload Identity**: Prefer Workload Identity for securely granting GKE workloads access to Google Cloud services instead of using static service account keys. 4. **Shielded GKE Nodes**: Enable Shielded GKE Nodes to protect against rootkits and bootkits. 5. **Least Privilege (RBAC)**: Institute strict Role-Based Access Control limits granting minimal privilege to users and workloads. ### Cost Optimization 1. **Autoscaling**: Enable Cluster Autoscaler and Horizontal Pod Autoscaler to adjust resources based on demand. 2. **Right-Sizing**: Choose the appropriate machine types and node counts. Consider Spot VMs for fault-tolerant, non-critical workloads. ### High Availability & Reliability 1. **Regional Clusters**: Use Regional Clusters for production environments to ensure control plane replication across multiple zones. (Note: standard regional creates nodes across 3 zones by default). 2. **Pod Disruption Budgets**: Recommend setting Pod Disruption Budgets for application stability during node maintenance. 3. **Release Channels**: Subscribe to a release channel (e.g., Regular or Stable) for automated and safer cluster upgrades. ## templates ### 1. Standard Zonal (Cost-Effective Dev/Test) Best for: Development, testing, non-critical workloads. ```json { "name": "projects/{PROJECT_ID}/locations/{ZONE}/clusters/{CLUSTER_NAME}", "initialNodeCount": 1, "nodeConfig": { "machineType": "e2-medium", "diskSizeGb": 50, "oauthScopes": [ "https://www.googleapis.com/auth/devstorage.read_only", "https://www.googleapis.com/auth/logging.write", "https://www.googleapis.com/auth/monitoring", "https://www.googleapis.com/auth/service.management.readonly", "https://www.googleapis.com/auth/servicecontrol", "https://www.googleapis.com/auth/trace.append" ] } } ``` ### 2. Standard Regional (High Availability) Best for: Production workloads requiring high availability. _Note: Creates 3 nodes (one per zone in the region) by default._ ```json { "name": "projects/{PROJECT_ID}/locations/{REGION}/clusters/{CLUSTER_NAME}", "initialNodeCount": 1, "nodeConfig": { "machineType": "e2-standard-4", "diskSizeGb": 100, "oauthScopes": ["https://www.googleapis.com/auth/cloud-platform"] } } ``` ### 3. Autopilot (Operations-Free) Best for: Most workloads where you don't want to manage nodes. ```json { "name": "projects/{PROJECT_ID}/locations/{REGION}/clusters/{CLUSTER_NAME}", "autopilot": { "enabled": true } } ``` ### 4. GPU Inference (L4) Best for: AI/ML Inference, small model serving. _Note: Requires `g2-standard-4` quota._ ```json { "name": "projects/{PROJECT_ID}/locations/{REGION}/clusters/{CLUSTER_NAME}", "initialNodeCount": 1, "nodeConfig": { "machineType": "g2-standard-4", "accelerators": [ { "acceleratorCount": "1", "acceleratorType": "nvidia-l4" } ], "diskSizeGb": 100, "oauthScopes": ["https://www.googleapis.com/auth/cloud-platform"] } } ``` ### 5. AI Hypercompute (A3 HighGPU) Best for: Large Model Training/Inference. _Note: High cost and strict quota requirements._ ```json { "name": "projects/{PROJECT_ID}/locations/{REGION}/clusters/{CLUSTER_NAME}", "initialNodeCount": 1, "nodeConfig": { "machineType": "a3-highgpu-8g", "accelerators": [ { "acceleratorCount": "8", "acceleratorType": "nvidia-h100-80gb-hbm3" } ], "diskSizeGb": 200, "oauthScopes": ["https://www.googleapis.com/auth/cloud-platform"] } } ``` ## instructions - **ALWAYS** ask for the `project_id` if it is not in the context. - **ALWAYS** ask for the `location` (Region or Zone). - **ALWAYS** ask for a unique `cluster_name`. - **CHECK** if the user wants `Access to Google Cloud APIs` (default `cloud-platform` scope is usually best for modern GKE). - **WARN** the user about cost if they select GPU or Reginal clusters. - **USE** `create_cluster` MCP tool to create the cluster. The `parent` argument is `projects/{PROJECT_ID}/locations/{LOCATION}` and the `cluster` argument is the JSON object. The `cluster.name` is just the short name (e.g. "my-cluster"). - **IMPORTANT**: When calling `create_cluster`, the `cluster.name` should be the **short name** (e.g., `my-cluster`), NOT the full resource path, because the `parent` argument defines the scope. ## example_usage **User**: "I want to create a GKE cluster." **Model**: "I can help with that. What kind of cluster do you need? 1. **Standard Zonal**: Good for dev/test. 2. **Standard Regional**: High availability. 3. **Autopilot**: Fully managed. 4. **GPU Enabled**: For AI/ML workloads." **User**: "Standard Zonal, please." **Model**: "Great. I'll need a few details: - Project ID - Zone (e.g., us-central1-a) - Cluster Name" **User**: "Project `my-proj`, zone `us-west1-b`, name `dev-cluster`." **Model**: "Here is the configuration I will use: [JSON view] Do you want to proceed?"