Notes on a Problem Caused by Integrating log4j with MongoDB
Problem background
The key call-chain logs of the project's critical applications needed to be written to MongoDB in a structured, unified form. The project uses log4j for logging, so the plan was to feed the MongoDB output directly through a log4j Appender, with MongoDB running in cluster mode. Because the logs have special structuring requirements, the log4mongo library was not used.
MongoDB driver version: 3.5.0
log4j version: 1.2.17
Symptom
Logging at the error and warn levels worked without any problem, but when logging at the info level the save failed and the process blocked; after a while the console printed the following:
Exception in thread "main" com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches WritableServerSelector. Client view of cluster state is {type=UNKNOWN, servers=[{address=127.0.0.1:27017, type=UNKNOWN, state=CONNECTING}]
Investigation
1. First, the MongoDB process and port were checked and both were fine. Switching back to warn and error levels, everything worked again and the documents were inserted into MongoDB successfully, which ruled out the MongoDB server itself and pointed squarely at the integration of log4j with the MongoDB driver.
2. The info-level logging was run again, and while the process was unresponsive a thread dump was taken with jstack. It turned out the hang was a deadlock caused by thread blocking. As shown below, the main thread holds the lock 0x0000000797b8f4d0, while the cluster-ClusterId thread is waiting for that lock to be released.
"cluster-ClusterId{value='59a3c74a936270211a375fc6', description='null'}-127.0.0.1:27017" #12 daemon prio=5 os_prio=31 tid=0x00007f97c2263800 nid=0x570f waiting for monitor entry [0x0000700001452000]
java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.log4j.Category.callAppenders(Category.java:202)
- waiting to lock <0x0000000797b8f4d0> (a org.apache.log4j.spi.RootLogger)
at org.apache.log4j.Category.forcedLog(Category.java:388)
at org.apache.log4j.Category.log(Category.java:853)
at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:300)
at com.mongodb.diagnostics.logging.SLF4JLogger.info(SLF4JLogger.java:71)
at com.mongodb.connection.InternalStreamConnection.open(InternalStreamConnection.java:110)
at com.mongodb.connection.DefaultServerMonitor$ServerMonitorRunnable.run(DefaultServerMonitor.java:111)
- locked <0x0000000797b01148> (a com.mongodb.connection.DefaultServerMonitor$ServerMonitorRunnable)
"main" #1 prio=5 os_prio=31 tid=0x00007f97c202f000 nid=0x1003 waiting on condition [0x0000700000182000]
java.lang.Thread.State: TIMED_WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x0000000797b03588> (a java.util.concurrent.CountDownLatch$Sync)
at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:114)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
at com.mongodb.binding.ClusterBinding.getWriteConnectionSource(ClusterBinding.java:68)
at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:413)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:168)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:74)
at com.mongodb.Mongo.execute(Mongo.java:819)
at com.mongodb.Mongo$2.execute(Mongo.java:802)
at com.mongodb.MongoCollectionImpl.executeSingleWriteRequest(MongoCollectionImpl.java:550)
at com.mongodb.MongoCollectionImpl.insertOne(MongoCollectionImpl.java:317)
at com.mongodb.MongoCollectionImpl.insertOne(MongoCollectionImpl.java:307)
at com.test.mongo.core.MongoLogAgent.saveLog(MongoLogAgent.java:52)
at com.test.log.log4j.MongoAppender.append(MongoAppender.java:86)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:230)
- locked <0x0000000797b8f3e8> (a com.test.log.log4j.MongoAppender)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:65)
at org.apache.log4j.Category.callAppenders(Category.java:203)
- locked <0x0000000797b8f4d0> (a org.apache.log4j.spi.RootLogger)
at org.apache.log4j.Category.forcedLog(Category.java:388)
at org.apache.log4j.Category.log(Category.java:853)
3. Test code and log4j.properties configuration
private final static Logger logger = LoggerFactory.getLogger(Test.class);

public static void main(String[] args) throws Exception {
    logger.info("test message", new Exception("something went wrong"));
}
log4j.rootLogger=info,console,file,mongo
log4j.appender.mongo=com.test.log.log4j.MongoAppender
PS: mongo is the Appender we implemented ourselves. Since it contains nothing out of the ordinary it is not reproduced here in full; essentially it takes the logging event and writes it into MongoDB. A rough sketch follows.
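For context, here is a minimal sketch of what such an appender might look like. The class layout, database and collection names are illustrative assumptions; the actual implementation in this project delegates to a MongoLogAgent, as the stack trace above shows.

import java.util.Date;

import org.apache.log4j.AppenderSkeleton;
import org.apache.log4j.spi.LoggingEvent;
import org.bson.Document;

import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;

// Illustrative sketch only; database and collection names are assumptions.
public class MongoAppender extends AppenderSkeleton {

    // MongoClient is thread-safe; create it once and reuse it.
    private final MongoCollection<Document> collection =
            new MongoClient("127.0.0.1", 27017)
                    .getDatabase("applog")
                    .getCollection("trace_log");

    @Override
    protected void append(LoggingEvent event) {
        // Build the structured document required by the project and insert it.
        Document doc = new Document("level", event.getLevel().toString())
                .append("logger", event.getLoggerName())
                .append("message", String.valueOf(event.getMessage()))
                .append("timestamp", new Date(event.getTimeStamp()));
        collection.insertOne(doc);
    }

    @Override
    public void close() {
    }

    @Override
    public boolean requiresLayout() {
        return false;
    }
}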
4. With the cause located, look directly at the source code.
In log4j's Category class, callAppenders iterates over the Category and its parent categories and synchronizes on each one in turn. This synchronized(c) is exactly what makes the main thread in the thread dump above hold the lock ("- locked <0x0000000797b8f4d0> (a org.apache.log4j.spi.RootLogger)").
AppenderAttachableImpl aai;

// intermediate code omitted
public void callAppenders(LoggingEvent event) {
    int writes = 0;
    for (Category c = this; c != null; c = c.parent) {
        synchronized (c) {
            if (c.aai != null) {
                writes += c.aai.appendLoopOnAppenders(event);
            }
            if (!c.additive) {
                break;
            }
        }
    }
    if (writes == 0) {
        this.repository.emitNoAppenderWarning(this);
    }
}
According to the log4j.properties configuration, the MongoAppender object is registered into the Category's aai when log4j reads the configuration file. The appendLoopOnAppenders call above loops over the three appenders console, file and mongo and asks each of them to write the record; inside MongoAppender this means obtaining the MongoClient and performing the document insert.
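appendLoopOnAppenders itself is simple; roughly (paraphrasing the log4j 1.2.x source), it walks the list of attached appenders and calls doAppend on each one. Note that AppenderSkeleton.doAppend is a synchronized method, which accounts for the second lock (0x0000000797b8f3e8, the MongoAppender instance) visible in the main thread's stack.

// Roughly what AppenderAttachableImpl.appendLoopOnAppenders does (paraphrased from log4j 1.2.x)
public int appendLoopOnAppenders(LoggingEvent event) {
    int size = 0;
    if (appenderList != null) {
        size = appenderList.size();
        for (int i = 0; i < size; i++) {
            Appender appender = (Appender) appenderList.elementAt(i);
            appender.doAppend(event); // AppenderSkeleton.doAppend is synchronized on the appender instance
        }
    }
    return size;
}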
So what is the cluster-ClusterId thread doing?
It is a standalone thread the MongoDB client driver uses to open and monitor socket connections and to populate the connection pool. When a MongoClient is created with several MongoDB server addresses, the driver builds a MultiServerCluster plus a DefaultServer for each address. Each DefaultServer is bound to a default monitor, DefaultServerMonitor, whose constructor creates a thread; that thread is where the database connection is actually established, and the DefaultServer constructor is responsible for starting it. Once a connection is established it is handed to the pool.
DefaultServer(final ServerId serverId, final ClusterConnectionMode clusterConnectionMode, final ConnectionPool connectionPool,
              final ConnectionFactory connectionFactory, final ServerMonitorFactory serverMonitorFactory,
              final ServerListener serverListener, final CommandListener commandListener) {
    this.serverListener = notNull("serverListener", serverListener);
    this.commandListener = commandListener;
    notNull("serverAddress", serverId);
    notNull("serverMonitorFactory", serverMonitorFactory);
    this.clusterConnectionMode = notNull("clusterConnectionMode", clusterConnectionMode);
    this.connectionFactory = notNull("connectionFactory", connectionFactory);
    this.connectionPool = notNull("connectionPool", connectionPool);
    this.serverStateListener = new DefaultServerStateListener();
    this.serverId = serverId;
    serverListener.serverOpening(new ServerOpeningEvent(this.serverId));
    description = ServerDescription.builder().state(CONNECTING).address(serverId.getAddress()).build();
    // construct and start the monitor thread that opens the connection
    serverMonitor = serverMonitorFactory.create(serverStateListener);
    serverMonitor.start();
}
// irrelevant code removed
class DefaultServerMonitor implements ServerMonitor {

    DefaultServerMonitor(final ServerId serverId, final ServerSettings serverSettings,
                         final ChangeListener serverStateListener,
                         final InternalConnectionFactory internalConnectionFactory, final ConnectionPool connectionPool) {
        this.serverSettings = serverSettings;
        this.serverId = serverId;
        this.serverMonitorListener = getServerMonitorListener(serverSettings);
        this.serverStateListener = serverStateListener;
        this.internalConnectionFactory = internalConnectionFactory;
        this.connectionPool = connectionPool;
        monitor = new ServerMonitorRunnable();
        monitorThread = new Thread(monitor, "cluster-" + this.serverId.getClusterId() + "-" + this.serverId.getAddress());
        monitorThread.setDaemon(true);
        isClosed = false;
    }

    // an inner-class Runnable that actually establishes the connection
    class ServerMonitorRunnable implements Runnable {
        private final ExponentiallyWeightedMovingAverage averageRoundTripTime = new ExponentiallyWeightedMovingAverage(0.2);

        @Override
        @SuppressWarnings("unchecked")
        public synchronized void run() {
            InternalConnection connection = null;
            try {
                ServerDescription currentServerDescription = getConnectingServerDescription(null);
                while (!isClosed) {
                    ServerDescription previousServerDescription = currentServerDescription;
                    try {
                        if (connection == null) {
                            connection = internalConnectionFactory.create(serverId);
                            try {
                                connection.open();
                            } catch (Throwable t) {
                                connection = null;
                                throw t;
                            }
                        }
                        try {
                            currentServerDescription = lookupServerDescription(connection);
                        } catch (MongoSocketException e) {
                            connectionPool.invalidate();
                            connection.close();
                            connection = null;
                            connection = internalConnectionFactory.create(serverId);
                            try {
                                connection.open();
                            } catch (Throwable t) {
                                connection = null;
                                throw t;
                            }
                            try {
                                currentServerDescription = lookupServerDescription(connection);
                            } catch (MongoSocketException e1) {
                                connection.close();
                                connection = null;
                                throw e1;
                            }
                        }
                    } catch (Throwable t) {
                        averageRoundTripTime.reset();
                        currentServerDescription = getConnectingServerDescription(t);
                    }
                    if (!isClosed) {
                        try {
                            logStateChange(previousServerDescription, currentServerDescription);
                            serverStateListener.stateChanged(new ChangeEvent(previousServerDescription, currentServerDescription));
                        } catch (Throwable t) {
                            LOGGER.warn("Exception in monitor thread during notification of server description state change", t);
                        }
                        waitForNext();
                    }
                }
            } finally {
                if (connection != null) {
                    connection.close();
                }
            }
        }
    }
}
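All of this monitor machinery is started as a side effect of simply constructing the client. For reference, a client pointed at a cluster is typically created along the lines below; the host names are placeholders.

// Placeholder host names; one "cluster-ClusterId{...}-host:port" monitor thread is started
// per seed address as soon as the MongoClient is constructed.
public static MongoClient createClient() {
    return new MongoClient(Arrays.asList(
            new ServerAddress("mongo-1", 27017),
            new ServerAddress("mongo-2", 27017),
            new ServerAddress("mongo-3", 27017)));
}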
As the driver source above shows, the MongoDB client establishes connections from this separate monitor thread, while the main thread only operates on the MongoClient object. When the main thread gets ahead and the insert cannot obtain a connection, it loops in BaseCluster.selectServer waiting for a server to become available; once the loop has waited longer than maxWaitTimeNanos it throws MongoTimeoutException. That is why the problem surfaced as a timeout and why the driver masks the real cause. The upside is that the calling thread is never left hanging forever and dragging the application down with it; the downside is that the error message is misleading.
public Server selectServer(final ServerSelector serverSelector) {
    isTrue("open", !isClosed());
    try {
        CountDownLatch currentPhase = phase.get();
        ClusterDescription curDescription = description;
        ServerSelector compositeServerSelector = getCompositeServerSelector(serverSelector);
        Server server = selectRandomServer(compositeServerSelector, curDescription);

        boolean selectionFailureLogged = false;

        long startTimeNanos = System.nanoTime();
        long curTimeNanos = startTimeNanos;
        long maxWaitTimeNanos = getMaxWaitTimeNanos();

        while (true) {
            throwIfIncompatible(curDescription);
            if (server != null) {
                return server;
            }
            if (curTimeNanos - startTimeNanos > maxWaitTimeNanos) {
                throw createTimeoutException(serverSelector, curDescription);
            }
            if (!selectionFailureLogged) {
                logServerSelectionFailure(serverSelector, curDescription);
                selectionFailureLogged = true;
            }
            connect();
            currentPhase.await(Math.min(maxWaitTimeNanos - (curTimeNanos - startTimeNanos), getMinWaitTimeNanos()), NANOSECONDS);
            curTimeNanos = System.nanoTime();
            currentPhase = phase.get();
            curDescription = description;
            server = selectRandomServer(compositeServerSelector, curDescription);
        }
    } catch (InterruptedException e) {
        throw new MongoInterruptedException(format("Interrupted while waiting for a server that matches %s", serverSelector), e);
    }
}
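As a side note, the 30000 ms in the original exception is the driver's server selection timeout, which is where maxWaitTimeNanos above comes from and which defaults to 30 seconds. It can be shortened via MongoClientOptions so that the failure surfaces faster while investigating, though this only changes how quickly the timeout appears, not the deadlock itself. A sketch with illustrative values:

// Shorten server selection from the default 30 000 ms to 5 000 ms (illustrative values).
MongoClientOptions options = MongoClientOptions.builder()
        .serverSelectionTimeout(5000)
        .build();
MongoClient client = new MongoClient(new ServerAddress("127.0.0.1", 27017), options);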
How the deadlock arises
In log4j.properties all of the appenders are attached to root. With the thread dump and the source in hand the picture is clear: when the main thread runs the MongoDB appender it is holding the rootLogger's monitor, and inside BaseCluster each attempt to obtain a connection parks the thread on a CountDownLatch, so the main thread waits without ever releasing that monitor. Meanwhile, the driver code running in the cluster-ClusterId thread also emits info-level output through slf4j; when that output reaches callAppenders it resolves to the very same rootLogger object and tries to acquire the same monitor. Because these are two different threads, the cluster-ClusterId thread blocks, the connection is never opened, and the latch the main thread is waiting on is never counted down.
The fix
Point the logger used by the MongoDB driver packages at something other than rootLogger, so that the Category object locked by the cluster-ClusterId thread is no longer the same object already held by the main thread.
log4j.rootLogger=debug,console,file,mongo
# stop org.mongodb log events from propagating up to the root logger's appenders
log4j.additivity.org.mongodb=false
log4j.logger.org.mongodb=info,console
With this configuration, all logging from the org.mongodb package goes through a new Logger object that has only the console appender attached. The parent of this Logger is still rootLogger, but because additivity is switched off the event never travels up to root.
AppenderAttachableImpl aai;

// intermediate code omitted
public void callAppenders(LoggingEvent event) {
    int writes = 0;
    for (Category c = this; c != null; c = c.parent) {
        // in the cluster-ClusterId thread, the object locked here is now the new org.mongodb logger
        synchronized (c) {
            if (c.aai != null) {
                writes += c.aai.appendLoopOnAppenders(event);
            }
            if (!c.additive) {
                // additive is false for org.mongodb, so the loop exits before ever reaching rootLogger
                break;
            }
        }
    }
    if (writes == 0) {
        this.repository.emitNoAppenderWarning(this);
    }
}
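For completeness, the same isolation can also be applied programmatically at startup, which helps when the properties file is not easy to change. This is a sketch using the log4j 1.x API; it assumes the console appender attached to root is named "console", as in the configuration above.

import org.apache.log4j.Appender;
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public final class MongoLoggingIsolation {
    // Call once at application startup, before the first MongoClient is created.
    public static void apply() {
        Logger mongoLogger = Logger.getLogger("org.mongodb");
        mongoLogger.setLevel(Level.INFO);
        mongoLogger.setAdditivity(false); // same effect as log4j.additivity.org.mongodb=false
        Appender console = Logger.getRootLogger().getAppender("console");
        if (console != null) {
            mongoLogger.addAppender(console); // reuse the console appender already attached to root
        }
    }
}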
Summary
Similar log4j deadlock reports can be found online, and they have little to do with ordinary multi-threaded contention. The root cause is a thread that is in the middle of writing a log record, and therefore holding the logger's monitor, starting or waiting on another thread, while that other thread also tries to log and requests the very same monitor.
If it were all one thread, synchronized(c) would simply be a reentrant acquisition and no blocking or deadlock would occur.
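Here is a stripped-down illustration of the pattern, purely illustrative and not taken from the project's code: thread A takes a monitor and then waits for thread B, while thread B needs that same monitor to make progress, so neither side can continue.

import java.util.concurrent.CountDownLatch;

// Minimal illustration of the lock-and-wait pattern described above.
public class LockAndWaitDeadlock {
    private static final Object LOGGER_MONITOR = new Object();                // plays the role of the RootLogger monitor
    private static final CountDownLatch connected = new CountDownLatch(1);    // plays the role of BaseCluster's phase latch

    public static void main(String[] args) throws InterruptedException {
        synchronized (LOGGER_MONITOR) {              // "main" takes the logger lock (callAppenders)
            Thread monitor = new Thread(() -> {
                synchronized (LOGGER_MONITOR) {      // "cluster-ClusterId" also wants the logger lock and blocks here
                    connected.countDown();           // never reached, so the latch is never released
                }
            }, "cluster-ClusterId");
            monitor.start();
            connected.await();                       // "main" waits for the latch while still holding the lock
        }
    }
}

Running this reproduces the hang: the main thread parks on the latch while holding the monitor that the second thread needs, which is exactly the shape of the log4j and MongoDB driver interaction analyzed above.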