25.7.1. Troubleshooting Certificates & SSL Authentication

Replication uses SSL to protect directory data on the network. In some configurations, replicas can fail to connect to each other because of SSL handshake errors. Such failures produce error log messages like the following.

[21/Nov/2011:13:03:20 -0600] category=SYNC severity=NOTICE
 msgID=15138921 msg=SSL connection attempt from myserver (123.456.789.012)
 failed: Remote host closed connection during handshake

Notice these problem characteristics in the message above.

  • The host name, myserver, is not fully qualified.

    You should not see host names that are not fully qualified in the error logs. Such host names are a sign that an OpenDJ server has not been configured properly.

    Always install and configure OpenDJ using fully qualified host names. The OpenDJ administration connector, which is used by the dsconfig command, and replication both depend on SSL and, more specifically, on self-signed certificates for establishing SSL connections. If the host name used to establish the connection does not correspond to the host name stored in the SSL certificate, then the SSL handshake can fail. For the purposes of establishing the SSL connection, a host name like myserver does not match myserver.example.com, and vice versa.

  • The connection succeeded, but the SSL handshake failed, suggesting a problem with authentication or with the cipher or protocol negotiation. As most deployments use the same Java Virtual Machine, and the same JVM configuration for each replica, the problem is likely not related to SSL cipher or protocol negotiation, but instead lies with authentication.
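
The first characteristic can be checked mechanically. The following is a minimal sketch, assuming log messages shaped like the one above; the function names is_fqdn and host_from_log are hypothetical helpers, not part of OpenDJ, and the test for "fully qualified" is simply that the name contains a dot between non-empty labels.

```shell
# Hypothetical helper: succeed only if the name looks fully qualified,
# meaning it contains at least one dot between non-empty labels.
# A trailing dot is rejected here for simplicity.
is_fqdn() {
  case "$1" in
    ""|.*|*.|*..*) return 1 ;;   # empty, leading or trailing dot, empty label
    *.*) return 0 ;;             # at least one dot between labels
    *) return 1 ;;               # no dot at all, for example "myserver"
  esac
}

# Hypothetical helper: read log text on stdin and print the host name from
# each "SSL connection attempt from HOST (ADDRESS)" message.
host_from_log() {
  sed -n 's/.*SSL connection attempt from \([^ ]*\) (.*/\1/p'
}
```

For the message above, host_from_log prints myserver, and is_fqdn myserver fails, which points to the host name problem described in the first bullet.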

Follow these steps on each OpenDJ server to check whether the problem lies with the host name configuration.

  1. Make sure each OpenDJ server uses only fully qualified host names in the replication configuration. You can obtain a quick summary by running the following command against each server's configuration.

    $ grep ds-cfg-replication-server: config/config.ldif | sort | uniq

  2. Make sure that the OpenDJ certificates also contain fully qualified host names, and that these correspond to the host names found in the previous step.

    # Examine the certificates used for the administration connector.
    $ keytool -list -v -keystore config/admin-truststore \
     -storepass `cat config/admin-keystore.pin` | grep "^Owner:"

    # Examine the certificates used for replication.
    $ keytool -list -v -keystore config/ads-truststore \
     -storepass `cat config/ads-truststore.pin` | grep "^Owner:"


Sample output for a server on host opendj.example.com follows.

$ grep ds-cfg-replication-server: config/config.ldif | sort | uniq
ds-cfg-replication-server: opendj.example.com:8989
ds-cfg-replication-server: opendj.example.com:9989

$ keytool -list -v -keystore config/admin-truststore \
 -storepass `cat config/admin-keystore.pin` | grep "^Owner:"
Owner: CN=opendj.example.com, O=Administration Connector Self-Signed Certificate

$ keytool -list -v -keystore config/ads-truststore \
 -storepass `cat config/ads-truststore.pin` | grep "^Owner:"
Owner: CN=opendj.example.com, O=OpenDJ Certificate
Owner: CN=opendj.example.com, O=OpenDJ Certificate
Owner: CN=opendj.example.com, O=OpenDJ Certificate
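
Comparing this output by eye becomes tedious with many replicas. The following is a minimal sketch of filters that reduce the output to bare host names, assuming the line formats shown in the sample output; replication_hosts, owner_cns, and short_names are hypothetical helper names, not OpenDJ commands.

```shell
# Hypothetical helper: list the distinct host names found in
# ds-cfg-replication-server lines read on stdin (the port is dropped).
replication_hosts() {
  sed -n 's/^ds-cfg-replication-server: *\(.*\):[0-9][0-9]*$/\1/p' | sort -u
}

# Hypothetical helper: list the distinct CN values found in keytool
# "Owner:" lines read on stdin.
owner_cns() {
  sed -n 's/^Owner: CN=\([^,]*\).*/\1/p' | sort -u
}

# Hypothetical helper: print only the names from stdin that contain no dot,
# that is, names that are not fully qualified. Prints nothing if all are fine.
short_names() {
  grep -v '\.' || true
}

# Possible usage, run from the server installation directory:
#   grep ds-cfg-replication-server: config/config.ldif | replication_hosts | short_names
#   keytool -list -v -keystore config/ads-truststore \
#    -storepass `cat config/ads-truststore.pin` | owner_cns | short_names
```

If either pipeline prints a name, that name is not fully qualified and needs attention. The lists printed by replication_hosts and owner_cns should also contain the same host names.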

Unfortunately, there is no easy solution to badly configured host names. It is often easier and quicker simply to reinstall your OpenDJ servers, remembering to use fully qualified host names everywhere.

  • When using the setup tool to install and configure a server, make sure that you include the -h option, and that it specifies the fully qualified host name. Include this option even if you are not enabling SSL/StartTLS LDAP connections (see OPENDJ-363).

    If you are using the GUI installer, then make sure you specify the fully qualified host name on the first page of the wizard.

  • When using the dsreplication tool to enable replication, make sure that any --host options include the fully qualified host name.

If you cannot reinstall the server, follow these steps.

  1. Disable replication in each replica.

    $ dsreplication \
     disable \
     --disableAll \
     --port adminPort \
     --hostname hostName \
     --bindDN "cn=Directory Manager" \
     --adminPassword password \
     --trustAll \
     --no-prompt

  2. Stop and restart each server in order to clear the in-memory ADS trust store backend.

  3. Enable replication, making certain that fully qualified host names are used throughout.

    $ dsreplication \
     enable \
     --adminUID admin \
     --adminPassword password \
     --baseDN dc=example,dc=com \
     --host1 hostName1 \
     --port1 adminPort1 \
     --bindDN1 "cn=Directory Manager" \
     --bindPassword1 password \
     --replicationPort1 replPort1 \
     --host2 hostName2 \
     --port2 adminPort2 \
     --bindDN2 "cn=Directory Manager" \
     --bindPassword2 password \
     --replicationPort2 replPort2 \
     --trustAll \
     --no-prompt

  4. Repeat the previous step for each remaining replica. In other words, enable replication for host1 with host2, host1 with host3, host1 with host4, ..., host1 with hostN.

  5. Initialize all remaining replicas with the data from host1.

    $ dsreplication \
     initialize-all \
     --adminUID admin \
     --adminPassword password \
     --baseDN dc=example,dc=com \
     --hostname hostName1 \
     --port 4444 \
     --trustAll \
     --no-prompt

  6. Check that the host names are correct in the configuration and in the key stores by following the steps you used to check for host name problems. The only broken host name remaining should be in the key and trust stores for the administration connector.

    $ keytool -list -v -keystore config/admin-truststore \
     -storepass `cat config/admin-keystore.pin` | grep "^Owner:"

  7. Stop each server, and then fix the remaining admin connector certificate as described in the procedure To Replace a Server Key Pair.
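
The pairing in step 4 (host1 with each remaining replica) is easy to get wrong by hand when there are many servers. The following is a minimal sketch of a dry-run generator that prints, but does not run, one dsreplication enable command per pair. The function name print_enable_commands is hypothetical, and the ports and passwords are the same placeholders used in step 3, so edit each printed command before running it.

```shell
# Hypothetical helper, not an OpenDJ tool: print (do not run) one
# "dsreplication enable" command pairing the first host with each other host.
# adminPort1, replPort1, password, and so on are placeholders from step 3.
print_enable_commands() {
  first=$1   # plain variables, because POSIX sh has no "local"
  shift
  for other in "$@"; do
    printf '%s' "dsreplication enable --adminUID admin --adminPassword password"
    printf '%s' " --baseDN dc=example,dc=com"
    printf ' --host1 %s --port1 adminPort1' "$first"
    printf '%s' ' --bindDN1 "cn=Directory Manager" --bindPassword1 password'
    printf '%s' " --replicationPort1 replPort1"
    printf ' --host2 %s --port2 adminPort2' "$other"
    printf '%s' ' --bindDN2 "cn=Directory Manager" --bindPassword2 password'
    printf '%s\n' " --replicationPort2 replPort2 --trustAll --no-prompt"
  done
}
```

For example, print_enable_commands host1.example.com host2.example.com host3.example.com prints two commands, one for host1 with host2 and one for host1 with host3. Printing rather than executing keeps the sketch safe to try, and gives you a chance to confirm that every host name is fully qualified first.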