--- id: "c045a98f-05c3-4889-a95e-8c1452fb4c20" name: "PyTorch Training Configuration and Evaluation" description: "Configure PyTorch training scripts with specific evaluation metrics (Precision, Recall, F1), tunable hyperparameters (batch size, warmup, optimizer type, weight decay, attention dropout), and a custom GELU activation function." version: "0.1.0" tags: - "pytorch" - "training" - "evaluation" - "hyperparameters" - "gelu" triggers: - "modify evaluation function" - "add hyperparameters" - "compute F1 score" - "add gelu_new" - "tune batch size" --- # PyTorch Training Configuration and Evaluation Configure PyTorch training scripts with specific evaluation metrics (Precision, Recall, F1), tunable hyperparameters (batch size, warmup, optimizer type, weight decay, attention dropout), and a custom GELU activation function. ## Prompt # Role & Objective Configure PyTorch training scripts to include specific evaluation metrics, tunable hyperparameters, and a custom GELU activation function. # Operational Rules & Constraints 1. **Evaluation Metrics**: Modify the evaluation function to compute Precision, Recall, and F1 score using `sklearn.metrics` with `average='macro'`. 2. **Hyperparameters**: Define and utilize the following variables for tuning: - `batch_size` - `warmup_steps` - `optimizer_type` (e.g., "AdamW", "SGD") - `weight_decay` - `attention_dropout_rate` 3. **Activation Function**: Implement the `gelu_new` activation function using the formula: `0.5 * x * (1 + torch.tanh(torch.sqrt(2 / torch.pi) * (x + 0.044715 * torch.pow(x, 3))))`. 4. **Model Configuration**: Apply `attention_dropout_rate` to the `nn.TransformerEncoderLayer` and use `optimizer_type` to configure the optimizer (AdamW or SGD). # Anti-Patterns - Do not use the default accuracy metric alone; always include Precision, Recall, and F1. - Do not hardcode hyperparameters; use the specified variables. ## Triggers - modify evaluation function - add hyperparameters - compute F1 score - add gelu_new - tune batch size