To understand what happens when individual servers stop responding because of a network partition or a crash, note that OpenDJ can provide both a directory service and a replication service. The two services are distinct, even though they can run alongside each other in the same OpenDJ server, in the same Java Virtual Machine.
Replication relies on the replication service provided by OpenDJ replication servers, where OpenDJ directory servers publish changes made to their data, and subscribe to changes published by other OpenDJ directory servers. A replication server manages replication data only, handling replication traffic with directory servers and with other replication servers, receiving, sending, and storing only changes to directory data rather than directory data itself. Once a replication server is connected to a replication topology, it maintains connections to all other replication servers in that topology.
A directory server handles directory data. It responds to requests and
stores directory data and historical information. For each replicated
suffix, such as dc=example,dc=com,
cn=schema and cn=admin data, the
directory server publishes changes to a replication server, and subscribes
to changes from that replication server. (Directory servers do not publish
changes to other directory servers.) A directory server also resolves any
conflicts that arise when reconciling changes from other directory servers,
using the historical information about changes to resolve the conflicts.
(Conflict resolution is the responsibility of the directory server rather
than the replication server.)
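The division of labor between the two services can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the class and method names are hypothetical and are not part of any OpenDJ API, a single suffix is modeled, and conflict resolution is left out. The point is that the replication server keeps only a log of changes, while the directory server keeps the entries themselves.

```python
class ReplicationServer:
    """Hypothetical model: stores and forwards changes, never directory data."""

    def __init__(self, name):
        self.name = name
        self.changelog = []      # changes only, not entries
        self.peers = []          # all other replication servers
        self.subscribers = []    # directory servers connected to this server

    def connect_peer(self, other):
        # Replication servers keep connections to every other replication server.
        if other is not self and other not in self.peers:
            self.peers.append(other)
            other.peers.append(self)

    def publish(self, change, origin):
        # A directory server publishes a change; relay it to local subscribers
        # and to every peer (one hop is enough because peers are fully meshed).
        self._store_and_deliver(change, origin)
        for peer in self.peers:
            peer._store_and_deliver(change, origin)

    def _store_and_deliver(self, change, origin):
        self.changelog.append(change)
        for ds in self.subscribers:
            if ds is not origin:
                ds.apply(change)


class DirectoryServer:
    """Hypothetical model: holds directory data, talks to one replication server."""

    def __init__(self, name):
        self.name = name
        self.entries = {}        # directory data lives here
        self.rs = None

    def connect(self, rs):
        self.rs = rs
        rs.subscribers.append(self)

    def modify(self, dn, attrs):
        # Apply locally, then publish to the replication server only.
        self.entries[dn] = attrs
        self.rs.publish((dn, attrs), origin=self)

    def apply(self, change):
        # Replay a change received from the replication server.
        dn, attrs = change
        self.entries[dn] = attrs


rs1, rs2 = ReplicationServer("rs1"), ReplicationServer("rs2")
rs1.connect_peer(rs2)
ds1, ds2 = DirectoryServer("ds1"), DirectoryServer("ds2")
ds1.connect(rs1)
ds2.connect(rs2)
ds1.modify("uid=bjensen,ou=People,dc=example,dc=com", {"mail": "bjensen@example.com"})
```

In this sketch `ds1` never contacts `ds2` directly: the change reaches `ds2` only through `rs1` and `rs2`.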
Once a directory server is connected to a replication topology for a particular suffix, it connects to one replication server at a time for that suffix. The replication server provides the directory server with a list of all replication servers for that suffix. Given the list of possible replication servers to which it can connect, the directory server can determine which replication server to connect to when starting up, or when the current connection is lost or becomes unresponsive.
For each replicated suffix, a directory server prefers to connect to a replication server:
- In the same group as the directory server
- Having the same initial data for the suffix as the directory server
- If initial data were the same, having all the latest changes from the directory server
- Running in the same Java Virtual Machine as the directory server
- Having the most available capacity relative to other eligible replication servers
Available capacity depends on how many directory servers in the topology are already connected to a replication server, and what proportion of all directory servers in the topology ought to be connected to the replication server.
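One way to read the preference list is as a series of successive filters, applied in the order given, with available capacity as the final tie-breaker. The following sketch assumes that reading; the `Candidate` type, its field names, and the function name are hypothetical and do not reflect OpenDJ internals.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical view of one replication server, as seen by a directory server."""
    name: str
    same_group: bool
    same_initial_data: bool
    has_latest_changes: bool
    same_jvm: bool
    available_capacity: float


def choose_replication_server(candidates):
    """Narrow the candidates criterion by criterion, then pick by capacity."""
    remaining = list(candidates)
    if not remaining:
        raise ValueError("no replication servers to choose from")

    criteria = [
        lambda c: c.same_group,
        lambda c: c.same_initial_data,
        # Only meaningful when the initial data were the same.
        lambda c: c.same_initial_data and c.has_latest_changes,
        lambda c: c.same_jvm,
    ]
    for criterion in criteria:
        preferred = [c for c in remaining if criterion(c)]
        # A criterion that no remaining candidate satisfies is skipped,
        # so the directory server always ends up with some server.
        if preferred:
            remaining = preferred

    return max(remaining, key=lambda c: c.available_capacity)
```

Under this assumption, a replication server in the same JVM wins over a remote server with more spare capacity, as long as both pass the earlier criteria.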
To determine what proportion of the total number of directory servers should be connected to a replication server, OpenDJ uses replication server weight. When configuring a replication server, you can assign it a weight (default: 1). The weight property takes an integer that indicates capacity to provide replication service relative to other servers. For example, a weight of 2 would indicate a replication server that can handle twice as many connected servers as a replication server with weight 1.
The proportion of directory servers in a topology that should be connected to a given replication server is equal to (replication server weight)/(sum of replication server weights). For example, if there are 4 replication servers in a topology, each with the default weight, the proportion for each replication server is 1/4.
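The weight formula is simple enough to check by hand, but a short calculation makes the effect of non-default weights easier to see. The function name and the server names below are illustrative only.

```python
from fractions import Fraction


def target_proportions(weights):
    """Map each replication server to weight / (sum of all weights)."""
    total = sum(weights.values())
    return {name: Fraction(weight, total) for name, weight in weights.items()}


# Four replication servers with the default weight of 1.
default = target_proportions({"A": 1, "B": 1, "C": 1, "D": 1})

# Same topology, but A is given a weight of 2.
weighted = target_proportions({"A": 2, "B": 1, "C": 1, "D": 1})
```

With default weights each server's share is 1/4. Giving A a weight of 2 raises its share to 2/5 and lowers the others to 1/5 each, so A should carry twice as many directory servers as any other replication server.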
Consider a situation where 7 directory servers are connected to
replication servers A, B, C, and D for dc=example,dc=com
data. Suppose 2 directory servers each are connected to A, B, and C, and 1
directory server is connected to replication server D. Replication server D
is therefore the server with the most available capacity relative to other
replication servers in the topology. All other criteria being equal,
replication server D is the server to connect to when an 8th directory
server joins the topology.
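The example can be worked through numerically. The text does not give the exact formula OpenDJ uses for available capacity, so this sketch assumes a simple measure: the number of directory servers a replication server ought to have (its weight share times the total number of connected directory servers) minus the number it currently has. It also assumes all four replication servers have the default weight of 1. The function names are hypothetical.

```python
from fractions import Fraction


def available_capacity(weights, connected):
    """Assumed measure: target number of directory servers minus current number."""
    total_weight = sum(weights.values())
    total_ds = sum(connected.values())
    return {
        name: Fraction(weights[name], total_weight) * total_ds - connected[name]
        for name in weights
    }


def most_available(weights, connected):
    """Return the replication server with the largest available capacity."""
    capacity = available_capacity(weights, connected)
    return max(capacity, key=capacity.get)


weights = {"A": 1, "B": 1, "C": 1, "D": 1}      # default weights (assumed)
connected = {"A": 2, "B": 2, "C": 2, "D": 1}    # 7 directory servers in total

capacity = available_capacity(weights, connected)
choice = most_available(weights, connected)
```

Each server's target is 7/4 directory servers. A, B, and C are each 1/4 over that target, while D is 3/4 under it, so D is the choice for the 8th directory server under this measure.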
The directory server regularly updates the list of replication servers in case it must reconnect. Because the available capacity of replication servers in each replication topology can change dynamically, a directory server can potentially reconnect to another replication server to balance the replication load in the topology. For this reason, a directory server can also end up connected to different replication servers for different suffixes.

