---
name: aws-sdk-java-v2-bedrock
description: Provides Amazon Bedrock patterns using AWS SDK for Java 2.x. Invokes foundation models (Claude, Llama, Titan), generates text and images, creates embeddings for RAG, streams real-time responses, and configures Spring Boot integration. Use when asking about Bedrock integration, Java SDK for AI models, AWS generative AI, Claude/Llama invocation, embeddings for RAG, or Spring Boot AI setup.
allowed-tools: Read, Write, Edit, Bash, Glob, Grep
---

# AWS SDK for Java 2.x - Amazon Bedrock

## Overview

Invokes foundation models through AWS SDK for Java 2.x. Configures clients, builds model-specific JSON payloads, handles streaming responses with error recovery, creates embeddings for RAG, integrates generative AI into Spring Boot applications, and implements exponential backoff for resilience.

## When to Use

- Invoke Claude, Llama, Titan, or Stable Diffusion for text/image generation
- Configure BedrockClient and BedrockRuntimeClient instances
- Build and parse model-specific payloads (Claude, Titan, Llama formats)
- Stream real-time AI responses with async handlers and error recovery
- Create embeddings for retrieval-augmented generation
- Integrate generative AI into Spring Boot microservices
- Handle throttling with exponential backoff retry logic

## Quick Start

### Dependencies

```xml
<!-- AWS SDK versions are assumed to be managed elsewhere (for example, by the AWS SDK BOM) -->
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>bedrock</artifactId>
</dependency>
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>bedrockruntime</artifactId>
</dependency>
<dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20231013</version>
</dependency>
```

### Client Setup

```java
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.bedrock.BedrockClient;
import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeClient;

// Model management client
BedrockClient bedrockClient = BedrockClient.builder()
    .region(Region.US_EAST_1)
    .build();

// Model invocation client
BedrockRuntimeClient bedrockRuntimeClient = BedrockRuntimeClient.builder()
    .region(Region.US_EAST_1)
    .build();
```

## Instructions

Follow these steps for production-ready Bedrock integration:

1. **Configure AWS Credentials** - Set up IAM roles with Bedrock permissions (avoid access keys)
2. **Enable Model Access** - Request access to specific foundation models in AWS Console
3. **Initialize Clients** - Create reusable `BedrockClient` and `BedrockRuntimeClient` instances
4. **Validate Model Availability** - Test with a simple invocation before production use
5. **Build Payloads** - Create model-specific JSON payloads with proper format
6. **Handle Responses** - Parse response structure and extract content
7. **Implement Streaming** - Use response stream handlers for real-time generation
8. **Add Error Handling** - Implement retry logic with exponential backoff

**Validation Checkpoint**: Always test with a simple prompt (e.g., "Hello") before production use to verify model access and response parsing.

## Examples

### Text Generation with Claude

```java
import org.json.JSONArray;
import org.json.JSONObject;
import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeClient;
import software.amazon.awssdk.services.bedrockruntime.model.InvokeModelRequest;
import software.amazon.awssdk.services.bedrockruntime.model.InvokeModelResponse;

public String generateWithClaude(BedrockRuntimeClient client, String prompt) {
    // Anthropic Messages API payload: a list of role/content messages
    JSONObject payload = new JSONObject()
        .put("anthropic_version", "bedrock-2023-05-31")
        .put("max_tokens", 1000)
        .put("messages", new JSONArray()
            .put(new JSONObject().put("role", "user").put("content", prompt)));

    InvokeModelResponse response = client.invokeModel(InvokeModelRequest.builder()
        .modelId("anthropic.claude-sonnet-4-5-20250929-v1:0")
        .body(SdkBytes.fromUtf8String(payload.toString()))
        .build());

    // The generated text is in the first content block of the response
    JSONObject responseBody = new JSONObject(response.body().asUtf8String());
    return responseBody.getJSONArray("content")
        .getJSONObject(0)
        .getString("text");
}
```

### Model Discovery

```java
import java.util.List;
import software.amazon.awssdk.services.bedrock.BedrockClient;
import software.amazon.awssdk.services.bedrock.model.*;

public List<FoundationModelSummary> listFoundationModels(BedrockClient bedrockClient) {
    // Empty request: no provider or modality filters applied
    return bedrockClient.listFoundationModels(r -> {}).modelSummaries();
}
```

### Multi-Model Invocation

```java
public String invokeModel(BedrockRuntimeClient client, String modelId, String prompt) {
    JSONObject payload = createPayload(modelId, prompt);

    InvokeModelResponse response = client.invokeModel(request -> request
        .modelId(modelId)
        .body(SdkBytes.fromUtf8String(payload.toString())));

    return extractTextFromResponse(modelId, response.body().asUtf8String());
}

// Builds the request body in the format each model family expects
private JSONObject createPayload(String modelId, String prompt) {
    if (modelId.startsWith("anthropic.claude")) {
        return new JSONObject()
            .put("anthropic_version", "bedrock-2023-05-31")
            .put("max_tokens", 1000)
            .put("messages", new JSONArray()
                .put(new JSONObject().put("role", "user").put("content", prompt)));
    } else if (modelId.startsWith("amazon.titan")) {
        return new JSONObject()
            .put("inputText", prompt)
            .put("textGenerationConfig", new JSONObject()
                .put("maxTokenCount", 512)
                .put("temperature", 0.7));
    } else if (modelId.startsWith("meta.llama")) {
        return new JSONObject()
            .put("prompt", "[INST] " + prompt + " [/INST]")
            .put("max_gen_len", 512)
            .put("temperature", 0.7);
    }
    throw new IllegalArgumentException("Unsupported model: " + modelId);
}

// Reads the generated text from each model family's response format
private String extractTextFromResponse(String modelId, String responseBody) {
    JSONObject json = new JSONObject(responseBody);
    if (modelId.startsWith("anthropic.claude")) {
        return json.getJSONArray("content").getJSONObject(0).getString("text");
    } else if (modelId.startsWith("amazon.titan")) {
        return json.getJSONArray("results").getJSONObject(0).getString("outputText");
    } else if (modelId.startsWith("meta.llama")) {
        return json.getString("generation");
    }
    throw new IllegalArgumentException("Unsupported model: " + modelId);
}
```

### Streaming Response with Error Handling

```java
import software.amazon.awssdk.services.bedrockruntime.BedrockRuntimeAsyncClient;
import software.amazon.awssdk.services.bedrockruntime.model.*;

// Response streaming is exposed on the async client, so this method takes
// a BedrockRuntimeAsyncClient rather than the sync BedrockRuntimeClient.
public String streamResponseWithRetry(BedrockRuntimeAsyncClient client, String modelId,
                                      String prompt, int maxRetries) {
    int attempt = 0;
    while (attempt < maxRetries) {
        try {
            JSONObject payload = createPayload(modelId, prompt);
            StringBuilder fullResponse = new StringBuilder();

            InvokeModelWithResponseStreamRequest request =
                InvokeModelWithResponseStreamRequest.builder()
                    .modelId(modelId)
                    .body(SdkBytes.fromUtf8String(payload.toString()))
                    .build();

            InvokeModelWithResponseStreamResponseHandler handler =
                InvokeModelWithResponseStreamResponseHandler.builder()
                    .subscriber(InvokeModelWithResponseStreamResponseHandler.Visitor.builder()
                        // Each chunk is a model-specific JSON event, appended here unparsed
                        .onChunk(chunk -> fullResponse.append(chunk.bytes().asUtf8String()))
                        .build())
                    .onError(e -> System.err.println("Stream error: " + e.getMessage()))
                    .build();

            // join() blocks until the stream completes and rethrows any failure
            client.invokeModelWithResponseStream(request, handler).join();
            return fullResponse.toString();
        } catch (Exception e) {
            attempt++;
            if (attempt >= maxRetries) {
                throw new RuntimeException("Stream failed after " + maxRetries + " attempts", e);
            }
            try {
                Thread.sleep((long) Math.pow(2, attempt) * 1000); // Exponential backoff
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new RuntimeException("Interrupted during retry", ie);
            }
        }
    }
    throw new RuntimeException("Unexpected error in streaming");
}
```

### Exponential Backoff for Throttling

```java
import java.util.function.Supplier;
import software.amazon.awssdk.awscore.exception.AwsServiceException;

public <T> T invokeWithRetry(Supplier<T> invocation, int maxRetries) {
    int attempt = 0;
    while (attempt < maxRetries) {
        try {
            return invocation.get();
        } catch (AwsServiceException e) {
            // Retry only throttling (429) and server-side (5xx) errors
            if (e.statusCode() == 429 || e.statusCode() >= 500) {
                attempt++;
                if (attempt >= maxRetries) throw e;
                // Exponential delay plus up to 1s of jitter, capped at 30s
                long delayMs = Math.min(1000 * (1L << attempt) + (long) (Math.random() * 1000), 30000);
                try {
                    Thread.sleep(delayMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException("Interrupted during retry", ie);
                }
            } else {
                throw e;
            }
        }
    }
    throw new IllegalStateException("Should not reach here");
}
```

### Text Embeddings

```java
public double[] createEmbeddings(BedrockRuntimeClient client, String text) {
    String modelId = "amazon.titan-embed-text-v1";

    JSONObject payload = new JSONObject().put("inputText", text);

    InvokeModelResponse response = client.invokeModel(request -> request
        .modelId(modelId)
        .body(SdkBytes.fromUtf8String(payload.toString())));

    JSONObject responseBody = new JSONObject(response.body().asUtf8String());
    JSONArray embeddingArray = responseBody.getJSONArray("embedding");

    // Copy the JSON array into a primitive vector
    double[] embeddings = new double[embeddingArray.length()];
    for (int i = 0; i < embeddingArray.length(); i++) {
        embeddings[i] = embeddingArray.getDouble(i);
    }
    return embeddings;
}
```

### Spring Boot Integration

```java
@Configuration
public class BedrockConfiguration {

    @Bean
    public BedrockClient bedrockClient() {
        return BedrockClient.builder()
            .region(Region.US_EAST_1)
            .build();
    }

    @Bean
    public BedrockRuntimeClient bedrockRuntimeClient() {
        return BedrockRuntimeClient.builder()
            .region(Region.US_EAST_1)
            .build();
    }
}

@Service
public class BedrockAIService {

    private final BedrockRuntimeClient bedrockRuntimeClient;
    private final ObjectMapper mapper;

    @Value("${bedrock.default-model-id:anthropic.claude-sonnet-4-5-20250929-v1:0}")
    private String defaultModelId;

    public BedrockAIService(BedrockRuntimeClient bedrockRuntimeClient, ObjectMapper mapper) {
        this.bedrockRuntimeClient = bedrockRuntimeClient;
        this.mapper = mapper;
    }

    public String generateText(String prompt) {
        Map<String, Object> payload = Map.of(
            "anthropic_version", "bedrock-2023-05-31",
            "max_tokens", 1000,
            "messages", List.of(Map.of("role", "user", "content", prompt))
        );

        try {
            InvokeModelResponse response = bedrockRuntimeClient.invokeModel(
                InvokeModelRequest.builder()
                    .modelId(defaultModelId)
                    .body(SdkBytes.fromUtf8String(mapper.writeValueAsString(payload)))
                    .build());
            return extractText(response.body().asUtf8String());
        } catch (JsonProcessingException e) {
            throw new IllegalStateException("Failed to serialize or parse Bedrock JSON", e);
        }
    }

    // Reads the first content block's text from a Claude response
    private String extractText(String responseBody) throws JsonProcessingException {
        return mapper.readTree(responseBody).path("content").path(0).path("text").asText();
    }
}
```

See [AWS SDK Examples](references/aws-sdk-examples.md) for more usage patterns.

## Best Practices

### Model Selection

- **Claude Sonnet 4.5**: Complex reasoning, analysis, and creative tasks
- **Claude Haiku 4.5**: Fast and affordable for real-time applications
- **Llama 3.1**: Open-source alternative for general tasks
- **Titan**: AWS native, cost-effective for simple text generation

### Performance

- Reuse client instances (avoid creating new clients per request)
- Use async clients (`BedrockRuntimeAsyncClient`) for non-blocking I/O and streaming
- Implement streaming for long responses
- Cache foundation model lists

### Security

- Never log sensitive prompt data
- Use IAM roles for authentication
- Sanitize user inputs to prevent prompt injection
- Implement rate limiting for public applications

## Constraints and Warnings

- **Cost Management**: Bedrock API calls incur charges per token; implement usage monitoring and budget alerts.
- **Model Access**: Foundation models must be enabled in AWS Console; verify region availability.
- **Rate Limits**: Implement exponential backoff for throttling; check per-model limits.
- **Payload Size**: Maximum payload size varies by model; use chunking for large documents.
- **Streaming Complexity**: Handle partial content and error recovery carefully.
- **Data Privacy**: Prompts and responses are captured when model invocation logging is enabled; review your logging configuration and AWS data policies.
- **Credentials**: Never embed credentials in code; use IAM roles for EC2/Lambda.

## Common Model IDs

- Claude Sonnet 4.5: `anthropic.claude-sonnet-4-5-20250929-v1:0`
- Claude Haiku 4.5: `anthropic.claude-haiku-4-5-20251001-v1:0`
- Llama 3.1 70B: `meta.llama3-1-70b-instruct-v1:0`
- Titan Embeddings: `amazon.titan-embed-text-v1`

See [Model Reference](references/model-reference.md) for the complete list.

## References

- [Advanced Topics](references/advanced-topics.md) - Multi-model patterns, advanced error handling
- [Model Reference](references/model-reference.md) - Detailed specifications, payload formats
- [Testing Strategies](references/testing-strategies.md) - Unit testing, LocalStack integration
- [AWS Bedrock User Guide](references/aws-bedrock-user-guide.md)
- [AWS SDK Examples](references/aws-sdk-examples.md)
- [Supported Models](references/bedrock-models-supported.md)

## Related Skills

- `aws-sdk-java-v2-core` - Core AWS SDK patterns
- `langchain4j-ai-services-patterns` - LangChain4j integration
- `spring-boot-dependency-injection` - Spring DI patterns
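
As a supplement to the Text Embeddings example, the sketch below shows one way the `double[]` vectors it returns might be compared during the retrieval step of RAG. It is a minimal illustration, not part of the AWS SDK: the class `EmbeddingSimilarity` and its methods `cosineSimilarity` and `mostSimilar` are hypothetical names, and the toy three-dimensional vectors stand in for real embeddings.

```java
import java.util.List;

// Hypothetical helper (not an AWS SDK class): ranks candidate vectors by cosine similarity.
public class EmbeddingSimilarity {

    /** Cosine similarity of two equal-length vectors; returns 0.0 if either has zero norm. */
    public static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Index of the candidate most similar to the query, or -1 when there are no candidates. */
    public static int mostSimilar(double[] query, List<double[]> candidates) {
        int best = -1;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < candidates.size(); i++) {
            double score = cosineSimilarity(query, candidates.get(i));
            if (score > bestScore) {
                bestScore = score;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy vectors standing in for embeddings returned by createEmbeddings
        double[] query = {1.0, 0.0, 0.0};
        List<double[]> documents = List.of(
            new double[]{0.0, 1.0, 0.0},
            new double[]{0.9, 0.1, 0.0},
            new double[]{-1.0, 0.0, 0.0});
        System.out.println("Best match index: " + mostSimilar(query, documents));
    }
}
```

In a real pipeline the document vectors would typically be computed once and stored, with only the query embedded per request; a linear scan like this is reasonable for small collections, while larger ones usually call for a vector index.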