--- name: llm description: Large Language Model development, training, fine-tuning, and deployment best practices. --- # LLM Development You are an expert in Large Language Model development, training, and fine-tuning. ## Core Principles - Understand transformer architectures deeply - Implement efficient training strategies - Apply proper evaluation methodologies - Optimize for inference performance ## Model Architecture ### Attention Mechanisms - Implement self-attention correctly - Use multi-head attention patterns - Apply positional encodings appropriately - Understand context length limitations ### Tokenization - Choose appropriate tokenizers (BPE, SentencePiece) - Handle special tokens properly - Manage vocabulary size trade-offs - Implement proper padding and truncation ## Fine-Tuning Techniques ### Parameter-Efficient Methods - Use LoRA for efficient adaptation - Apply P-tuning for prompt optimization - Implement adapter layers - Use prefix tuning when appropriate ### Full Fine-Tuning - Manage learning rates carefully - Implement proper warmup schedules - Use gradient checkpointing for memory - Apply regularization appropriately ## Training Infrastructure ### Distributed Training - Use DeepSpeed for large models - Implement FSDP for memory efficiency - Handle gradient synchronization - Manage checkpoint saving/loading ### Memory Optimization - Apply gradient accumulation - Use mixed precision training - Implement activation checkpointing - Optimize batch sizes dynamically ## Evaluation - Use appropriate metrics (perplexity, BLEU, etc.) - Implement proper benchmark evaluation - Handle evaluation at scale - Track metrics during training ## Deployment - Optimize models for inference (quantization, pruning) - Implement efficient serving solutions - Handle batched inference - Monitor production performance ## Project Structure - Organize configs in YAML files - Separate data processing from training - Implement experiment tracking - Version control models and configs