---
name: azure-ai-inference-java
description: |
  Azure AI Inference SDK for Java. Use for chat completions, embeddings, and model inference with Azure AI Foundry endpoints.
  Triggers: "ChatCompletionsClient java", "EmbeddingsClient java", "azure inference java", "model inference java", "Foundry models java".
package: com.azure:azure-ai-inference
---
# Azure AI Inference SDK for Java
Client library for Azure AI model inference with chat completions and embeddings.
## Installation
```xml
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-ai-inference</artifactId>
    <version>1.0.0-beta.5</version>
</dependency>
```
## Environment Variables
```bash
AZURE_INFERENCE_ENDPOINT=https://<your-resource>.services.ai.azure.com/models
AZURE_INFERENCE_CREDENTIAL=<your-api-key>
AZURE_INFERENCE_MODEL=gpt-4o-mini
```
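Because every client below reads these variables, it can help to fail fast when one is missing. The following is a minimal sketch using only the JDK; `InferenceEnv` and its methods are hypothetical helpers, not part of the SDK.

```java
import java.util.Map;

// Hypothetical helper (not part of the Azure SDK): reads settings from a
// map such as System.getenv() and reports missing values clearly.
final class InferenceEnv {
    private InferenceEnv() { }

    // Returns the trimmed value, or throws if the variable is unset or blank.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing required environment variable: " + name);
        }
        return value.trim();
    }

    // Returns the trimmed value, or the fallback if the variable is unset or blank.
    static String optional(Map<String, String> env, String name, String fallback) {
        String value = env.get(name);
        return (value == null || value.isBlank()) ? fallback : value.trim();
    }
}
```

A call such as `InferenceEnv.require(System.getenv(), "AZURE_INFERENCE_ENDPOINT")` could then replace a bare `System.getenv(...)` lookup when building a client.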
## Authentication
### API Key
```java
import com.azure.ai.inference.ChatCompletionsClient;
import com.azure.ai.inference.ChatCompletionsClientBuilder;
import com.azure.core.credential.AzureKeyCredential;

ChatCompletionsClient client = new ChatCompletionsClientBuilder()
    .endpoint(System.getenv("AZURE_INFERENCE_ENDPOINT"))
    .credential(new AzureKeyCredential(System.getenv("AZURE_INFERENCE_CREDENTIAL")))
    .buildClient();
```
### DefaultAzureCredential (Recommended)
```java
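// Requires the com.azure:azure-identity dependency in addition to azure-ai-inference.
import com.azure.ai.inference.ChatCompletionsClient;
import com.azure.ai.inference.ChatCompletionsClientBuilder;
import com.azure.identity.DefaultAzureCredentialBuilder;

ChatCompletionsClient client = new ChatCompletionsClientBuilder()
    .endpoint(System.getenv("AZURE_INFERENCE_ENDPOINT"))
    .credential(new DefaultAzureCredentialBuilder().build())
    .buildClient();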
```
## Client Types
| Client | Purpose |
|--------|---------|
| `ChatCompletionsClient` | Chat and text completions (sync) |
| `ChatCompletionsAsyncClient` | Chat completions (async) |
| `EmbeddingsClient` | Text embeddings (sync) |
| `EmbeddingsAsyncClient` | Text embeddings (async) |
## Chat Completions
### Basic Completion
```java
import com.azure.ai.inference.models.*;
import java.util.ArrayList;
import java.util.List;

// `client` is a ChatCompletionsClient built as shown under Authentication.
List<ChatRequestMessage> messages = new ArrayList<>();
messages.add(new ChatRequestSystemMessage("You are a helpful assistant."));
messages.add(new ChatRequestUserMessage("What is Azure AI?"));

ChatCompletions response = client.complete(new ChatCompletionsOptions(messages));
for (ChatChoice choice : response.getChoices()) {
    System.out.println(choice.getMessage().getContent());
}
```
### Streaming Completions
```java
import com.azure.ai.inference.models.*;
import com.azure.core.util.CoreUtils;
import java.util.ArrayList;
import java.util.List;

List<ChatRequestMessage> messages = new ArrayList<>();
messages.add(new ChatRequestSystemMessage("You are a helpful assistant."));
messages.add(new ChatRequestUserMessage("Write a poem about Azure."));

client.completeStream(new ChatCompletionsOptions(messages))
    .forEach(chatCompletions -> {
        // Some updates carry no choices; skip them.
        if (!CoreUtils.isNullOrEmpty(chatCompletions.getChoices())) {
            StreamingChatResponseMessageUpdate delta = chatCompletions.getChoice().getDelta();
            // Print each content fragment as it arrives.
            if (delta.getContent() != null) {
                System.out.print(delta.getContent());
            }
        }
    });
```
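If the complete reply is needed after streaming (for example, to add it to the conversation history), the fragments can be accumulated while they are printed. The sketch below uses plain strings in place of the `delta.getContent()` values so that it runs without the SDK; `StreamedText` is a hypothetical helper, and the simulated input is illustrative only.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of the Azure SDK): prints streamed text
// fragments as they arrive and keeps the concatenated result.
final class StreamedText {
    private StreamedText() { }

    // Null fragments are skipped, mirroring the null check on delta.getContent().
    static String printAndCollect(Iterable<String> deltas) {
        StringBuilder full = new StringBuilder();
        for (String delta : deltas) {
            if (delta == null) {
                continue;
            }
            System.out.print(delta);
            full.append(delta);
        }
        System.out.println();
        return full.toString();
    }

    public static void main(String[] args) {
        // Simulated fragments standing in for streamed updates.
        List<String> simulated = Arrays.asList(null, "Azure ", "in ", null, "verse");
        String text = printAndCollect(simulated);
        System.out.println("Characters received: " + text.length());
    }
}
```

With the real client, the same idea amounts to appending each non-null `delta.getContent()` to a `StringBuilder` inside the `forEach` callback.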
### Chat with Image URL
```java
// Replace <image-url> with a publicly reachable image URL.
List<ChatMessageContentItem> contentItems = new ArrayList<>();
contentItems.add(new ChatMessageTextContentItem("Describe the image."));
contentItems.add(new ChatMessageImageContentItem(new ChatMessageImageUrl("<image-url>")));

List<ChatRequestMessage> messages = new ArrayList<>();
messages.add(new ChatRequestSystemMessage("You are a helpful assistant."));
messages.add(ChatRequestUserMessage.fromContentItems(contentItems));

ChatCompletions response = client.complete(new ChatCompletionsOptions(messages));
System.out.println(response.getChoice().getMessage().getContent());
```
## Embeddings
```java
import com.azure.ai.inference.EmbeddingsClient;
import com.azure.ai.inference.EmbeddingsClientBuilder;
import com.azure.ai.inference.models.*;
import com.azure.core.credential.AzureKeyCredential;
import java.util.List;

EmbeddingsClient embeddingsClient = new EmbeddingsClientBuilder()
    .endpoint(System.getenv("AZURE_INFERENCE_ENDPOINT"))
    .credential(new AzureKeyCredential(System.getenv("AZURE_INFERENCE_CREDENTIAL")))
    .buildClient();

List<String> input = List.of("Your text string goes here");
EmbeddingsResult result = embeddingsClient.embed(input);

// Each item holds the embedding vector for one input string.
for (EmbeddingItem item : result.getData()) {
    System.out.println("Dimensions: " + item.getEmbeddingList().size());
}
```
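Embedding vectors are usually compared rather than printed. As a minimal sketch, assuming two vectors of the `List<Float>` shape returned by `getEmbeddingList()`, cosine similarity can be computed with the JDK alone; `EmbeddingMath` is a hypothetical helper, not part of the SDK.

```java
import java.util.List;

// Hypothetical helper (not part of the Azure SDK): cosine similarity
// between two embedding vectors of equal length.
final class EmbeddingMath {
    private EmbeddingMath() { }

    static double cosineSimilarity(List<Float> a, List<Float> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("Vectors must have the same number of dimensions");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.size(); i++) {
            double x = a.get(i);
            double y = b.get(i);
            dot += x * y;
            normA += x * x;
            normB += y * y;
        }
        // A zero vector has no direction; treat its similarity as 0.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Values near 1 indicate vectors pointing in a similar direction, and values near 0 indicate unrelated directions.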
## Message Types
| Type | Description |
|------|-------------|
| `ChatRequestSystemMessage` | System instructions |
| `ChatRequestUserMessage` | User input (text, images) |
| `ChatRequestAssistantMessage` | Model responses |
| `ChatRequestToolMessage` | Tool execution results |
## Model Information
```java
ModelInfo modelInfo = client.getModelInfo();
System.out.println("Model: " + modelInfo.getModelName());
System.out.println("Provider: " + modelInfo.getModelProviderName());
System.out.println("Type: " + modelInfo.getModelType());
```
## Best Practices
1. **Use DefaultAzureCredential** for production
2. **Use streaming** for long responses to improve UX
3. **Specify model** when endpoint serves multiple deployments
4. **Close async clients** explicitly or use try-with-resources
5. **Handle rate limits** with appropriate retry logic
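One way to approach item 5 is a small retry wrapper with capped exponential backoff. The sketch below uses only the JDK; `RetrySketch`, its parameters, and the delay values are illustrative assumptions, not SDK APIs or recommended settings.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

// Hypothetical helper (not part of the Azure SDK): retries a call with
// capped exponential backoff when the failure is judged retryable.
final class RetrySketch {
    private RetrySketch() { }

    // Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMillis.
    static long backoffMillis(int attempt, long baseMillis, long maxMillis) {
        long factor = 1L << Math.min(attempt, 20); // clamp the shift to keep the product small
        return Math.min(maxMillis, baseMillis * factor);
    }

    static <T> T callWithRetry(Callable<T> call, Predicate<Exception> isRetryable,
                               int maxAttempts, long baseMillis, long maxMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 0; ; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                boolean lastAttempt = attempt + 1 >= maxAttempts;
                // Give up on the final attempt or on errors that retrying cannot fix.
                if (lastAttempt || !isRetryable.test(e)) {
                    throw e;
                }
                Thread.sleep(backoffMillis(attempt, baseMillis, maxMillis));
            }
        }
    }
}
```

With the client from this document, the call could be `() -> client.complete(options)` and the predicate could check for an `HttpResponseException` whose status code is 429.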
## Error Handling
```java
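import com.azure.ai.inference.models.ChatCompletions;
import com.azure.core.exception.HttpResponseException;

// `client` and `options` (a ChatCompletionsOptions) are built as in the earlier examples.
try {
    ChatCompletions response = client.complete(options);
} catch (HttpResponseException e) {
    // The status code distinguishes auth, rate-limit, and server errors.
    System.err.println("Error: " + e.getResponse().getStatusCode());
    System.err.println("Message: " + e.getMessage());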
}
```
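The status code printed above can drive what happens next. The following sketch maps codes to a suggested action using general HTTP semantics; `StatusAdvice` and its categories are hypothetical, and the service may use specific codes differently.

```java
// Hypothetical helper (not part of the Azure SDK): maps an HTTP status code
// to a coarse follow-up action, based on general HTTP semantics.
final class StatusAdvice {
    private StatusAdvice() { }

    enum Action { FIX_CREDENTIALS, CHECK_ENDPOINT_OR_MODEL, RETRY_WITH_BACKOFF, FIX_REQUEST, UNKNOWN }

    static Action forStatus(int statusCode) {
        if (statusCode == 401 || statusCode == 403) {
            return Action.FIX_CREDENTIALS;          // unauthenticated or not permitted
        }
        if (statusCode == 404) {
            return Action.CHECK_ENDPOINT_OR_MODEL;  // resource not found
        }
        if (statusCode == 408 || statusCode == 429 || (statusCode >= 500 && statusCode < 600)) {
            return Action.RETRY_WITH_BACKOFF;       // timeout, rate limit, or server-side failure
        }
        if (statusCode >= 400 && statusCode < 500) {
            return Action.FIX_REQUEST;              // other client errors
        }
        return Action.UNKNOWN;
    }
}
```

Inside the `catch` block, `StatusAdvice.forStatus(e.getResponse().getStatusCode())` would give the category for the failed call.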
## Reference Links
| Resource | URL |
|----------|-----|
| API Reference | https://aka.ms/azsdk/azure-ai-inference/java/reference |
| GitHub Source | https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/ai/azure-ai-inference |
| Samples | https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/ai/azure-ai-inference/src/samples |