Tuesday, November 22, 2011

BEA-000386 LDAP file (maybe) corrupted

We had some OutOfMemory errors on our server and, after fixing the memory problems, we were unable to start the Admin Server because of this error:

####<Nov 21, 2011 1:12:20 PM CET> <Critical> <WebLogicServer> <hqchnesoa102> <osbdv1as> <main> <<WLS Kernel>> <> <> <1321877540974> <BEA-000386> <Server subsystem failed. Reason: java.lang.NumberFormatException: null
java.lang.NumberFormatException: null
at java.lang.Integer.parseInt(Integer.java:417)
at java.lang.Integer.<init>(Integer.java:660)
at com.octetstring.vde.replication.Replication.initAgreements(Replication.java:146)
at com.octetstring.vde.replication.Replication.init(Replication.java:87)
at weblogic.ldap.EmbeddedLDAP.initReplication(EmbeddedLDAP.java:1304)
at weblogic.ldap.EmbeddedLDAP.start(EmbeddedLDAP.java:344)
at weblogic.t3.srvr.SubsystemRequest.run(SubsystemRequest.java:64)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:209)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:178)
>


Moving the AdminServer/data/ldap directory to LDAP_BACKUP and restarting the server solved the problem. Of course, we lost all the users.

I assume that the LDIF files occasionally get corrupted when the server runs out of memory. This is quite annoying.

Further digging showed that the file $DOMAIN_HOME/servers/AdminServer/data/ldap/conf/replicas.prop was empty, while it should contain something like this:


#Generated property file
#Mon Nov 21 13:38:21 CET 2011
replica.num=1
replica.0.name=osbdv1ms1
replica.0.base=dc\=osbdv1do
replica.0.port=8001
replica.0.hostname=soa102.acme.com
replica.0.masterurl=ldap\://soa102.acme.com\:7001/
replica.0.masterid=osbdv1as
replica.0.secure=0
replica.0.binddn=cn\=Admin
replica.0.consumerid=osbdv1ms1
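
To catch this before a restart, a quick sanity check of the file can help. Below is a minimal sketch in Python, not anything shipped with WebLogic: the function names are made up, the parser only handles simple key=value lines, and the idea that the NumberFormatException comes from a missing replica.num is my assumption based on the stack trace, not something I verified in the code.

```python
from pathlib import Path


def parse_props(text):
    """Parse simple key=value lines; '#' lines are comments.

    This is NOT a full Java .properties parser (no escapes, no continuations).
    """
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_replicas_prop(path):
    """Return a list of problems found; an empty list means the file looks sane."""
    path = Path(path)
    if not path.is_file():
        # A missing file may simply be regenerated by the server; report it anyway.
        return [f"{path} does not exist"]
    props = parse_props(path.read_text(encoding="latin-1"))
    if not props:
        return [f"{path} is empty or contains only comments"]
    try:
        # Assumption: this is the value the server fails to parse as an integer.
        count = int(props.get("replica.num", ""))
    except ValueError:
        return [f"replica.num is missing or not an integer: {props.get('replica.num')!r}"]
    problems = []
    for i in range(count):
        if f"replica.{i}.name" not in props:
            problems.append(f"replica.{i}.name is missing")
    return problems
```

Calling check_replicas_prop() on the replicas.prop path before starting the Admin Server prints nothing alarming for a file like the one above, and flags the empty file we ended up with.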


How the file managed to get nuked is anybody's guess.

The message is: keep a backup of the AdminServer/data directory...
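
A backup can be as simple as copying the directory to a timestamped folder. Here is a rough sketch in Python; the function name and the backup location are placeholders of mine, and I am assuming it is run while the server is stopped (or at least idle) so the files are not being written during the copy.

```python
import shutil
import time
from pathlib import Path


def backup_data_dir(data_dir, backup_root):
    """Copy data_dir into backup_root/<name>-<timestamp> and return the new path."""
    data_dir = Path(data_dir)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{data_dir.name}-{stamp}"
    # copytree creates missing parent directories and fails if target already exists
    shutil.copytree(data_dir, target)
    return target
```

For example, backup_data_dir("servers/AdminServer/data", "/some/backup/dir") run from the domain home, where the backup directory is whatever location suits you.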
