# Datahike Java API
**Status: Beta** - The Java API is functional and tested, but may receive breaking changes as we gather feedback from production use.
Datahike provides a comprehensive Java API that enables you to use the full power of Datalog databases from Java applications without writing Clojure code. The API offers both high-level convenience methods and low-level access for advanced use cases.
## Features
- **Type-Safe Configuration** - Fluent builder pattern with compile-time checks
- **Modern Java API** - Works with Java Maps, UUIDs, and standard collections
- **Full Datalog Support** - Expressive declarative queries with joins, aggregates, and rules
- **Time Travel** - Query database history and point-in-time snapshots
- **Pull API** - Recursive pattern-based entity retrieval
- **Schema Support** - Optional strict or flexible schema enforcement
- **Multiple Backends** - Memory, file system, and extensible to custom stores
## Installation
### Maven
Add the Clojars repository and Datahike dependency to your `pom.xml`:
```xml
<repositories>
  <repository>
    <id>clojars</id>
    <name>Clojars</name>
    <url>https://repo.clojars.org/</url>
  </repository>
</repositories>

<dependencies>
  <dependency>
    <groupId>org.replikativ</groupId>
    <artifactId>datahike</artifactId>
    <!-- Check https://clojars.org/org.replikativ/datahike for the latest version -->
    <version>CURRENT</version>
  </dependency>
</dependencies>
```
### Gradle
Add to your `build.gradle`:
```gradle
repositories {
maven { url "https://repo.clojars.org/" }
mavenCentral()
}
dependencies {
implementation 'org.replikativ:datahike:CURRENT' // Check https://clojars.org/org.replikativ/datahike for latest
}
```
## Quick Start
```java
import datahike.java.Datahike;
import datahike.java.Database;
import datahike.java.SchemaFlexibility;
import java.util.*;
// Create and connect to database
Map config = Database.memory(UUID.randomUUID())
.schemaFlexibility(SchemaFlexibility.READ)
.build();
Datahike.createDatabase(config);
Object conn = Datahike.connect(config);
// Transact data using Java Maps
Datahike.transact(conn, List.of(
Map.of("name", "Alice", "age", 30),
Map.of("name", "Bob", "age", 25)
));
// Query with Datalog
Set<List<Object>> results = (Set<List<Object>>) Datahike.q(
"[:find ?name ?age :where [?e :name ?name] [?e :age ?age]]",
Datahike.deref(conn)
);
System.out.println(results);
// => #{["Alice" 30] ["Bob" 25]}
// Cleanup
Datahike.deleteDatabase(config);
```
## Configuration
Datahike offers three approaches to configuration, each suited for different needs.
### Approach 1: Database Builder (Recommended)
The fluent builder pattern provides type safety and IDE autocompletion:
```java
import datahike.java.Database;
import datahike.java.SchemaFlexibility;
import java.util.UUID;
// In-memory database (requires UUID)
Map config = Database.memory(UUID.randomUUID())
.keepHistory(true)
.schemaFlexibility(SchemaFlexibility.READ)
.build();
// File-based database
Map config = Database.file("/var/lib/mydb")
.keepHistory(true)
.name("production-db")
.build();
// With initial schema
import static datahike.java.Keywords.*;
import static datahike.java.Util.*;
Object schema = vec(
map(DB_IDENT, kwd(":person/name"),
DB_VALUE_TYPE, STRING,
DB_CARDINALITY, ONE)
);
Map config = Database.memory(UUID.randomUUID())
.initialTx(schema)
.build();
```
**When to use:** New projects, when you want type safety and clear code.
### Approach 2: Java Maps
Use standard Java collections with string keys (automatically converted to Clojure keywords):
```java
import java.util.*;
// Configuration with nested maps
Map config = Map.of(
"store", Map.of(
"backend", ":memory", // : prefix makes it a keyword
"id", UUID.randomUUID()
),
"schema-flexibility", ":read",
"keep-history?", true
);
// Custom backends
Map config = Map.of(
"store", Map.of(
"backend", ":pg",
"host", "localhost",
"port", 5432,
"username", "user",
"password", "secret"
)
);
```
**When to use:** Dynamic configuration, config files (JSON/YAML → Map), custom backends.
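For example, a configuration read from a JSON file can be handed to Datahike as-is once it is parsed into a `Map`. The sketch below is only illustrative: it assumes Jackson (`com.fasterxml.jackson.databind`) is on the classpath, uses a hypothetical `config.json` that mirrors the custom-backend map above, and assumes the corresponding backend plugin is available. Any JSON/YAML parser that produces nested `Map`s works the same way.

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import datahike.java.Datahike;
import java.io.File;
import java.util.Map;

// Hypothetical config.json, mirroring the map above:
// {"store": {"backend": ":pg", "host": "localhost", "port": 5432,
//            "username": "user", "password": "secret"}}
ObjectMapper mapper = new ObjectMapper();
Map<String, Object> config = mapper.readValue(new File("config.json"), Map.class);
// readValue declares IOException - handle or rethrow it in real code

Datahike.createDatabase(config);
Object conn = Datahike.connect(config);
```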
### Approach 3: EDN Strings (Advanced)
For advanced use cases, work directly with Clojure's Extensible Data Notation:
```java
import static datahike.java.Util.*;
// Parse EDN string
Object config = ednFromString(
"{:store {:backend :memory :id #uuid \"550e8400-e29b-41d4-a716-446655440000\"}}"
);
// Build EDN programmatically
Object config = map(
kwd(":store"), map(
kwd(":backend"), kwd(":memory"),
kwd(":id"), UUID.randomUUID()
)
);
```
**When to use:** Interop with Clojure code, advanced EDN features, maximum control.
## Database Lifecycle
### Creating and Connecting
```java
import datahike.java.Datahike;
// Create database (idempotent - safe to call multiple times)
Datahike.createDatabase(config);
// Check if database exists
boolean exists = Datahike.databaseExists(config);
// Connect to database (returns connection object)
Object conn = Datahike.connect(config);
// Get current database value
Object db = Datahike.deref(conn);
```
### Deleting Databases
```java
// Delete all database files/data
Datahike.deleteDatabase(config);
// Release connection (required for some backends like LevelDB)
Datahike.release(conn);
```
## Transactions
Transactions are atomic and consistent; use them to add, update, or retract data.
### Simple Transactions
```java
import java.util.*;
// Add entities with auto-generated IDs
Datahike.transact(conn, List.of(
Map.of("name", "Alice", "age", 30),
Map.of("name", "Bob", "age", 25)
));
// Update existing entity (requires :db/id)
Datahike.transact(conn, List.of(
Map.of(":db/id", 1, "age", 31)
));
// Retract attribute
import static datahike.java.Util.*;
Datahike.transact(conn, vec(
vec(kwd(":db/retract"), 1, kwd(":age"), 30)
));
// Retract entire entity
Datahike.transact(conn, vec(
vec(kwd(":db.fn/retractEntity"), 1)
));
```
### EDN Conversion Rules
Datahike automatically converts between Java and Clojure data:
| Java Type | EDN Type | Example |
|-----------|----------|---------|
| `String` starting with `:` | Keyword | `":memory"` → `:memory` |
| Other `String` | String | `"Alice"` → `"Alice"` |
| `Integer`, `Long` | Long | `42` → `42` |
| `Boolean` | Boolean | `true` → `true` |
| `Map` | Map | `{"a": 1}` → `{:a 1}` |
| `List<?>`, `Object[]` | Vector | `[1, 2]` → `[1 2]` |
| `UUID` | UUID | `UUID` → `#uuid "..."` |
**Important:** Map keys are always converted to keywords. Use a `:` prefix on string values to create keyword values.
See [EDN Conversion Documentation](bindings/edn-conversion.md) for complete rules and edge cases.
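As a concrete illustration of these rules (the attribute names are only examples, and `conn` is an open connection as in the Quick Start), here is a Java map and the EDN it converts to:

```java
import java.util.*;

Map<String, Object> entity = Map.of(
    "user/name", "Alice",         // plain string stays a string
    "user/role", ":admin",        // ":"-prefixed string becomes the keyword :admin
    "user/age", 30,               // Integer becomes a Long
    "user/id", UUID.randomUUID()  // UUID becomes #uuid "..."
);
// Keys are always keywordized, so this is transacted as the EDN map:
// {:user/name "Alice", :user/role :admin, :user/age 30, :user/id #uuid "..."}
Datahike.transact(conn, List.of(entity));
```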
## Queries
Datahike uses Datalog, a declarative query language similar to SQL but more expressive.
### Basic Queries
```java
// Find all names
Set<List<Object>> results = (Set<List<Object>>) Datahike.q(
"[:find ?name :where [?e :name ?name]]",
Datahike.deref(conn)
);
// Find with conditions
Set<List<Object>> results = (Set<List<Object>>) Datahike.q(
"[:find ?name ?age :where [?e :name ?name] [?e :age ?age] [(>= ?age 25)]]",
Datahike.deref(conn)
);
// Joins across entities
Set<List<Object>> results = (Set<List<Object>>) Datahike.q(
"""
[:find ?person-name ?friend-name
:where
[?p :person/name ?person-name]
[?p :person/friends ?f]
[?f :person/name ?friend-name]]
""",
Datahike.deref(conn)
);
```
### Parameterized Queries
```java
// Query with input parameters
Set<List<Object>> results = (Set<List<Object>>) Datahike.q(
"[:find ?e :in $ ?name :where [?e :name ?name]]",
Datahike.deref(conn),
"Alice"
);
// Multiple databases
Object conn2 = Datahike.connect(otherConfig);
Set<List<Object>> results = (Set<List<Object>>) Datahike.q(
"[:find ?name :in $ $2 :where [$ ?e :name ?name] [$2 ?e :active true]]",
Datahike.deref(conn),
Datahike.deref(conn2)
);
```
### Aggregates
```java
// Count, sum, min, max, avg
Set<List<Object>> results = (Set<List<Object>>) Datahike.q(
"[:find (count ?e) (avg ?age) :where [?e :age ?age]]",
Datahike.deref(conn)
);
// Group by
Set<List<Object>> results = (Set<List<Object>>) Datahike.q(
"""
[:find ?department (avg ?salary)
:where
[?e :employee/department ?department]
[?e :employee/salary ?salary]]
""",
Datahike.deref(conn)
);
```
## Pull API
The Pull API retrieves entities with nested relationships using pattern-based selectors.
### Basic Pull
```java
// Pull single entity by ID
Map<?, ?> entity = (Map<?, ?>) Datahike.pull(
Datahike.deref(conn),
"[:name :age]",
1
);
// => {:name "Alice" :age 30}
// Pull all attributes
Map<?, ?> entity = (Map<?, ?>) Datahike.pull(
Datahike.deref(conn),
"[*]",
1
);
// Pull multiple entities
List<Map<?, ?>> entities = (List<Map<?, ?>>) Datahike.pullMany(
Datahike.deref(conn),
"[:name :age]",
List.of(1, 2, 3)
);
```
### Nested Pull
```java
// Pull with nested relationships
Map<?, ?> person = (Map<?, ?>) Datahike.pull(
Datahike.deref(conn),
"""
[:person/name
{:person/friends [:person/name :person/email]}]
""",
1
);
// => {:person/name "Alice"
// :person/friends [{:person/name "Bob" :person/email "bob@example.com"}]}
// Recursive pull (follow references up to 3 levels)
Map<?, ?> org = (Map<?, ?>) Datahike.pull(
Datahike.deref(conn),
"[:org/name {:org/parent 3}]", // 3 = recursion depth
orgId
);
```
## Time Travel
Query database state at any point in history.
### Historical Queries
```java
import java.time.Instant;
import java.util.Date;
// Query as of specific time
Date timestamp = Date.from(Instant.parse("2024-01-01T00:00:00Z"));
Object pastDb = Datahike.asOf(Datahike.deref(conn), timestamp);
Set<List<Object>> results = (Set<List<Object>>) Datahike.q(
"[:find ?name :where [?e :name ?name]]",
pastDb
);
// Query changes since timestamp
Object recentDb = Datahike.since(Datahike.deref(conn), timestamp);
Set<List<Object>> changes = (Set<List<Object>>) Datahike.q(
"[:find ?name :where [?e :name ?name]]",
recentDb
);
// Query full history (includes all assertions and retractions)
Object historyDb = Datahike.history(Datahike.deref(conn));
Set<List<Object>> allValues = (Set<List<Object>>) Datahike.q(
"[:find ?name :where [?e :name ?name]]",
historyDb
);
```
### Transaction Metadata
```java
import static datahike.java.Util.*;
// Add metadata to transaction
Datahike.transact(conn, Map.of(
":tx-data", List.of(Map.of("name", "Alice")),
":tx-meta", Map.of("author", "user@example.com", "reason", "user signup")
));
// Query transaction metadata
Set<List<Object>> txData = (Set<List<Object>>) Datahike.q(
"""
[:find ?tx ?author ?time
:where
[?tx :author ?author]
[?tx :db/txInstant ?time]]
""",
Datahike.deref(conn)
);
```
## Schema Definition
Schemas define attributes and their properties, enabling validation and optimizations.
### Defining Schema
```java
import static datahike.java.Keywords.*;
import static datahike.java.Util.*;
// Define schema attributes
Object schema = vec(
map(
DB_IDENT, kwd(":person/name"),
DB_VALUE_TYPE, STRING,
DB_CARDINALITY, ONE,
DB_DOC, "Person's full name",
DB_UNIQUE, UNIQUE_IDENTITY
),
map(
DB_IDENT, kwd(":person/age"),
DB_VALUE_TYPE, LONG,
DB_CARDINALITY, ONE
),
map(
DB_IDENT, kwd(":person/friends"),
DB_VALUE_TYPE, REF,
DB_CARDINALITY, MANY,
DB_DOC, "Person's friends (entity references)"
)
);
// Option 1: Set schema at database creation
Map config = Database.memory(UUID.randomUUID())
.initialTx(schema)
.build();
Datahike.createDatabase(config);
// Option 2: Transact schema after creation
Datahike.createDatabase(config);
Object conn = Datahike.connect(config);
Datahike.transact(conn, schema);
```
### Schema Flexibility
```java
import datahike.java.SchemaFlexibility;
// Schema-on-write (the default): only attributes defined in the schema can be transacted
Map config = Database.memory(UUID.randomUUID())
.build(); // Default: strict, schema-on-write
// Schema-on-read: no schema required; undeclared attributes can be transacted freely
Map config = Database.memory(UUID.randomUUID())
.schemaFlexibility(SchemaFlexibility.READ)
.build();
// Schema-on-write, set explicitly (same as the default): undeclared attributes are rejected
Map config = Database.memory(UUID.randomUUID())
.schemaFlexibility(SchemaFlexibility.WRITE)
.build();
```
### Schema Constants
Use the `Keywords` class for type-safe schema definitions:
```java
import static datahike.java.Keywords.*;
// Entity attributes
DB_ID, DB_IDENT
// Schema definition
DB_VALUE_TYPE, DB_CARDINALITY, DB_DOC, DB_UNIQUE, DB_INDEX
// Value types
STRING, BOOLEAN, LONG, BIGINT, FLOAT, DOUBLE, BIGDEC,
INSTANT, UUID_TYPE, KEYWORD_TYPE, SYMBOL_TYPE, REF, BYTES
// Cardinality
ONE, MANY
// Uniqueness
UNIQUE_VALUE, UNIQUE_IDENTITY
// Schema flexibility
SCHEMA_READ, SCHEMA_WRITE
// Storage backends
BACKEND_MEMORY, BACKEND_FILE
```
## Advanced Features
### Index Access
```java
// Get datoms from index (EAVT, AEVT, AVET)
Iterable<?> datoms = Datahike.datoms(
Datahike.deref(conn),
":eavt"
);
// Seek to position in index
import static datahike.java.Util.*;
Iterable<?> datoms = Datahike.seekDatoms(
Datahike.deref(conn),
":avet",
kwd(":name"), "Alice"
);
// Get index range
Iterable<?> range = Datahike.indexRange(
Datahike.deref(conn),
":name",
"A", "M" // Lexicographic range
);
```
### Entity API
```java
import datahike.java.IEntity;
import static datahike.java.Util.*;
// Get entity by ID
IEntity entity = (IEntity) Datahike.entity(Datahike.deref(conn), 1);
// Access attributes
String name = (String) entity.valAt(kwd(":name"));
Long age = (Long) entity.valAt(kwd(":age"));
// Touch entity (load all attributes)
Object touchedEntity = Datahike.touch(entity);
```
### Database Metrics
```java
// Get database statistics
Map<?, ?> metrics = (Map<?, ?>) Datahike.metrics(
Datahike.deref(conn)
);
System.out.println(metrics);
// => {:datoms 1000 :indexed-datoms 1000 ...}
```
### Schema Introspection
```java
// Get current schema
Map<?, ?> schema = (Map<?, ?>) Datahike.schema(
Datahike.deref(conn)
);
// Get reverse schema (ident -> attribute map)
Map<?, ?> reverseSchema = (Map<?, ?>) Datahike.reverseSchema(
Datahike.deref(conn)
);
```
## Storage Backends
### Memory Backend
Fast in-memory storage; requires a UUID identifier:
```java
Map config = Database.memory(UUID.randomUUID())
.build();
```
**Use cases:** Testing, caching, temporary data, development.
### File Backend
Persistent file-based storage:
```java
Map config = Database.file("/var/lib/myapp/db")
.build();
```
**Use cases:** Local applications, single-server deployments, development persistence.
### Custom Backends
Extend Datahike with custom storage implementations:
```java
Map config = Database.custom(Map.of(
"backend", ":my-backend",
"custom-option-1", "value1",
"custom-option-2", "value2"
)).build();
```
Available via plugins: PostgreSQL, S3, Redis, and more.
## Examples
### Complete Application
```java
import datahike.java.*;
import static datahike.java.Keywords.*;
import static datahike.java.Util.*;
import java.util.*;
public class DatahikeExample {
public static void main(String[] args) {
// 1. Configure database
Object schema = vec(
map(DB_IDENT, kwd(":user/email"),
DB_VALUE_TYPE, STRING,
DB_CARDINALITY, ONE,
DB_UNIQUE, UNIQUE_IDENTITY),
map(DB_IDENT, kwd(":user/name"),
DB_VALUE_TYPE, STRING,
DB_CARDINALITY, ONE)
);
Map config = Database.file("/tmp/app-db")
.initialTx(schema)
.schemaFlexibility(SchemaFlexibility.READ)
.keepHistory(true)
.build();
// 2. Create and connect
Datahike.createDatabase(config);
Object conn = Datahike.connect(config);
// 3. Add data
Datahike.transact(conn, List.of(
Map.of(":user/email", "alice@example.com",
":user/name", "Alice"),
Map.of(":user/email", "bob@example.com",
":user/name", "Bob")
));
// 4. Query data
Set<List<Object>> users = (Set<List<Object>>) Datahike.q(
"[:find ?email ?name :where [?e :user/email ?email] [?e :user/name ?name]]",
Datahike.deref(conn)
);
System.out.println("Users: " + users);
// 5. Update data
Datahike.transact(conn, List.of(
Map.of(":user/email", "alice@example.com", // Upsert by unique attr
":user/name", "Alice Smith")
));
// 6. Time travel
var pastDb = Datahike.asOf(Datahike.deref(conn), new Date(0));
var pastUsers = Datahike.q(
"[:find ?name :where [?e :user/name ?name]]",
pastDb
);
System.out.println("Past users: " + pastUsers);
// 7. Cleanup
Datahike.deleteDatabase(config);
}
}
```
## API Reference
Full Javadoc available at: `https://javadoc.io/doc/org.replikativ/datahike/latest/`
### Key Classes
- **`Datahike`** - Main API with all database operations
- **`Database`** - Fluent builder for configuration
- **`Keywords`** - Pre-defined constants for schema and configuration
- **`EDN`** - EDN data type constructors and conversion
- **`Util`** - Low-level utilities (map, vec, kwd, etc.)
- **`SchemaFlexibility`** - Enum for schema modes
- **`IEntity`** - Entity interface for direct attribute access
### Core Methods
See the auto-generated bindings in `Datahike.java` for the complete list; all Datahike API functions are available.
## Comparison with Other JVM Databases
### vs Datomic
Datahike is Datomic-compatible with similar semantics:
- ✅ Same query language and API
- ✅ Same time-travel capabilities
- ✅ Similar schema system
- ✅ Open source and free
- ✅ Multiple storage backends
- ⚠️ Smaller community and ecosystem, but API compatible
### vs SQL Databases
| Feature | Datahike | SQL |
|---------|----------|-----|
| Query Language | Declarative Datalog | Declarative SQL |
| Schema | Optional, flexible | Usually required |
| Joins | Implicit, natural | Explicit with JOIN |
| Time Travel | Built-in | Requires audit tables |
| Immutability | Yes, all data versioned | No, updates in-place |
| Transactions | ACID | ACID |
## Performance Tips
1. **Use indexes** - Define `:db/index true` for frequently queried attributes
2. **Batch transactions** - Transact multiple entities at once (see the sketch after this list)
3. **Disable history** - Set `:keep-history? false` for write-heavy workloads
4. **Pull API** - More efficient than multiple queries for related data
5. **Index selection** - Use appropriate index (:eavt, :aevt, :avet) for datoms access
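The batching tip is easy to apply: collect entity maps into one list and submit them in a single `transact` call rather than calling `transact` once per entity. A minimal sketch (attribute names are illustrative, `conn` as in the earlier examples):

```java
import java.util.*;

// One transaction for the whole batch: a single commit instead of 1000
List<Map<String, Object>> batch = new ArrayList<>();
for (int i = 0; i < 1000; i++) {
    batch.add(Map.of("user/name", "user-" + i, "user/age", 20 + (i % 50)));
}
Datahike.transact(conn, batch);

// Avoid: one transact call (and one commit) per entity
// for (Map<String, Object> entity : batch) {
//     Datahike.transact(conn, List.of(entity));
// }
```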
## Troubleshooting
### Common Issues
**"Could not locate Clojure runtime"**
- Ensure Clojure is on your classpath
- Datahike includes it transitively, but check for conflicts
**"Memory backend requires UUID"**
- Use `UUID.randomUUID()` not string IDs
- Required by konserve store for distributed tracking
**"Schema validation failed"**
- Check schema flexibility setting
- Verify attribute definitions match data types
**ClassCastException in results**
- Query results are Clojure collections (Set, List, Map)
- Cast appropriately: `(Set<List<Object>>) Datahike.q(...)`
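One way to keep the casts manageable is to treat the result as an `Iterable` and cast each row to `java.util.List`, since the returned Clojure collections implement the standard Java collection interfaces. A short sketch using the Quick Start data:

```java
import datahike.java.Datahike;
import java.util.List;

Object raw = Datahike.q(
    "[:find ?name ?age :where [?e :name ?name] [?e :age ?age]]",
    Datahike.deref(conn));

for (Object row : (Iterable<?>) raw) {
    List<?> tuple = (List<?>) row;  // each row is one tuple of the :find variables
    System.out.println(tuple.get(0) + " is " + tuple.get(1));
}
```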
## Further Reading
- [Main README](../README.md) - Project overview and installation
- [Schema Documentation](schema.md) - Detailed schema guide
- [Time Travel Guide](time_variance.md) - Historical queries
- [Storage Backends](storage-backends.md) - Backend configuration
- [EDN Conversion](bindings/edn-conversion.md) - Java ↔ EDN mapping
- [Datalog Tutorial](https://docs.datomic.com/on-prem/query.html) - Query language guide
## License
Eclipse Public License 1.0 (EPL-1.0)
## Support
- **GitHub Issues**: https://github.com/replikativ/datahike/issues
- **Discussions**: https://github.com/replikativ/datahike/discussions
- **Professional Support**: Contact christian@weilbach.name for consulting