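Misusing the .NET Redis client CSRedisCore: I dug the pit, I filled it in
This article was first published on Cnblogs in mid-2019. The CSRedisCore troubleshooting approach drew a strong response at the time and was featured on Team Lead Zhang's official account; it is now republished on this official account.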
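Background
After the previous Redis MQ distributed refactoring, the orchestrated containers ran stably for over a month. Yesterday a colleague on the ETL side suddenly told me that no parsed logs were being collected.
I logged in to the server and ran docker ps to check the containers:
the ReceiverApp container, which receives the data, had died;
I tried docker container start [containerid], and a few minutes later the container crashed again.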
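Redis connection limit exceeded
Checking the container log with docker logs [containerid] showed that the number of clients connected to the Redis server had exceeded the limit.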
CSRedis.RedisException: ERR max number of clients reached.
Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker[2]
      Executed action EqidManager.Controllers.EqidController.BatchPutEqidAndProfileIds (EqidReceiver) in 7.1767ms
fail: Microsoft.AspNetCore.Server.Kestrel[13]
      Connection id "0HLPR3AP8ODKH", Request id "0HLPR3AP8ODKH:00000001": An unhandled exception was thrown by the application.
CSRedis.RedisException: ERR max number of clients reached
   at CSRedis.CSRedisClient.GetAndExecute[T](RedisClientPool pool, Func`2 handler, Int32 jump, Int32 errtimes)
   at CSRedis.CSRedisClient.ExecuteScalar[T](String key, Func`3 hander)
   at CSRedis.CSRedisClient.LPush[T](String key, T[] value)
   at RedisHelper.LPush[T](String key, T[] value)
   at EqidManager.Controllers.EqidController.BatchPutEqidAndProfileIds(List`1 eqidPairs) in /home/gitlab-runner/builds/haD2h5xC/0/webdissector/datasource/eqid-manager/src/EqidReceiver/Controllers/EqidController.cs:line 31
   at lambda_method(Closure , Object )
   at Microsoft.Extensions.Internal.ObjectMethodExecutorAwaitable.Awaiter.GetResult()
   at Microsoft.AspNetCore.Mvc.Internal.ActionMethodExecutor.AwaitableResultExecutor.Execute(IActionResultTypeMapper mapper, ObjectMethodExecutor executor, Object controller, Object[] arguments)
   at System.Threading.Tasks.ValueTask`1.get_Result()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeActionMethodAsync()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeNextActionFilterAsync()
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Rethrow(ActionExecutedContext context)
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
   at Microsoft.AspNetCore.Mvc.Internal.ControllerActionInvoker.InvokeInnerFilterAsync()
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeNextResourceFilter()
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Rethrow(ResourceExecutedContext context)
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.Next(State& next, Scope& scope, Object& state, Boolean& isCompleted)
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeFilterPipelineAsync()
   at Microsoft.AspNetCore.Mvc.Internal.ResourceInvoker.InvokeAsync()
   at Microsoft.AspNetCore.Builder.RouterMiddleware.Invoke(HttpContext httpContext)
   at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol.ProcessRequests[TContext](IHttpApplication`1 application)
info: Microsoft.AspNetCore.Hosting.Internal.WebHost[2]
      Request finished in 8.9549ms 500
[dockerhost:6379/0] still unavailable, next recovery check at: 09/17/2019 03:11:25, error: (ERR max number of clients reached)
(the line above appears seven times in a row in the log)
Quick thought: one of the orchestrated containers uses CSRedisCore to instantiate 16 clients, one for each of the 16 Redis DBs, but surely the Redis server cannot be that fragile.
I went straight to redis.io to gather information.
After the client is initialized, Redis checks if we are already at the limit of the number of clients that it is possible to handle simultaneously (this is configured using the maxclients configuration directive, see the next section of this document for further information).
In case it can't accept the current client because the maximum number of clients was already accepted, Redis tries to send an error to the client in order to make it aware of this condition, and closes the connection immediately. The error message will be able to reach the client even if the connection is closed immediately by Redis because the new socket output buffer is usually big enough to contain the error, so the kernel will handle the transmission of the error.
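In short: maxclients sets the maximum number of client connections the Redis server allows. If that limit has already been reached, Redis sends an error message back to the new client and closes its connection immediately.
I logged in to the Redis host to check the default configuration and confirmed that maxclients for this Redis server is 10000 (the effective value is dynamic: it is determined by maxclients and by the maximum number of file descriptors available to the process).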
# Set the max number of connected clients at the same time. By default
# this limit is set to 10000 clients, however if the Redis server is not
# able to configure the process file limit to allow for the specified limit
# the max number of allowed clients is set to the current file limit
# minus 32 (as Redis reserves a few file descriptors for internal uses).
#
# Once the limit is reached Redis will close all the new connections sending
# an error 'max number of clients reached'.
# maxclients 10000
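The rule described in this redis.conf comment can be sketched as a small function. This is only an illustration of the comment's wording, not Redis's actual startup code; the function name `effective_maxclients` and the sample file limits are assumptions made for the example.

```python
RESERVED_FDS = 32  # descriptors Redis reserves for internal use, per the redis.conf comment


def effective_maxclients(configured: int, fd_limit: int) -> int:
    """Return the client limit in force, following the redis.conf comment.

    If the process file limit can cover the configured value plus the
    reserved descriptors, the configured value wins. Otherwise the limit
    falls back to the file limit minus the reserved descriptors.
    """
    if fd_limit >= configured + RESERVED_FDS:
        return configured
    return max(fd_limit - RESERVED_FDS, 0)


# Hypothetical file limits, chosen only to exercise both branches.
print(effective_maxclients(10000, 65535))
print(effective_maxclients(10000, 1024))
```

With a generous file limit the configured 10000 applies, which matches what was observed on this server.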
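When I logged in to the Redis server with redis-cli, I was kicked off immediately.
It was fairly safe to conclude that the Redis client was being used incorrectly.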
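How CSRedisCore was used
According to the official Redis documentation, the redis-cli commands info clients and client list can be used to analyze client connections.
info clients showed that there were indeed 10000 connections at the time;
the official explanation of the client list output fields is: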
addr: The client address, that is, the client IP and the remote port number it used to connect with the Redis server.
fd: The client socket file descriptor number.
name: The client name as set by CLIENT SETNAME.
age: The number of seconds the connection existed for.
idle: The number of seconds the connection is idle.
flags: The kind of client (N means normal client, check the full list of flags).
omem: The amount of memory used by the client for the output buffer.
cmd: The last executed command.
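Read against these field definitions, the client list output showed that the Redis server had received a large number of client connections from ip=172.16.1.3 (the failed container's IP address on the bridge network), and that the last command each of these connections issued was ping (a test command).
The Redis client used by the failed container is CSRedisCore, and the container does nothing more than write messages into a Redis list. A related GitHub issue on CSRedisCore gave me some hints.
I found that I had put the CSRedisClient instantiation code in the constructor of a .NET Core API Controller. A new Redis client was therefore instantiated every time a Controller was constructed for a request, until the number of Redis client connections reached the maximum allowed.
Dependency injection has three lifetimes: singleton (a single instance in the system, injected once); transient (a new instance is created and injected each time one is requested); scoped (a custom scope).
For why dotnet API Controllers are injected as transient, see the links at the end of the article.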
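The grouping done by eye on the client list output can be sketched in a few lines. This is a minimal illustration, assuming the output has been saved as text with one client per line and space-separated key=value fields; the sample lines below, including the address 172.16.1.4, are made up for the example and are not the real output from the incident.

```python
from collections import Counter


def parse_client_list(output: str) -> list[dict[str, str]]:
    """Turn CLIENT LIST text into one dict of fields per client line."""
    clients = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = {}
        for token in line.split(" "):
            key, sep, value = token.partition("=")
            if sep:  # skip anything that is not key=value
                fields[key] = value
        clients.append(fields)
    return clients


def summarize(clients: list[dict[str, str]]) -> Counter:
    """Count connections per (client IP, last command)."""
    counts = Counter()
    for client in clients:
        ip = client.get("addr", "").rsplit(":", 1)[0]  # drop the remote port
        counts[(ip, client.get("cmd", ""))] += 1
    return counts


# Hypothetical sample data in the documented field layout.
SAMPLE = """\
addr=172.16.1.3:40001 fd=8 name= age=3600 idle=3600 flags=N omem=0 cmd=ping
addr=172.16.1.3:40002 fd=9 name= age=3590 idle=3590 flags=N omem=0 cmd=ping
addr=172.16.1.4:51000 fd=10 name= age=120 idle=1 flags=N omem=0 cmd=lpush
"""

summary = summarize(parse_client_list(SAMPLE))
for (ip, cmd), count in summary.most_common():
    print(ip, cmd, count)
```

A single IP dominating the counts with a high idle time and the same last command is the pattern that pointed at the failed container.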
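The difference between the transient and singleton lifetimes can be illustrated with a toy container. This is a sketch in Python rather than ASP.NET Core code: `Container`, `FakeRedisClient` and `simulate` are hypothetical names invented for the example, and resolving the client once per request stands in for constructing it inside a Controller constructor.

```python
from typing import Any, Callable


class FakeRedisClient:
    """Stand-in for a Redis client; only counts how many were created."""

    created = 0

    def __init__(self) -> None:
        FakeRedisClient.created += 1


class Container:
    """Tiny hypothetical DI container with singleton and transient lifetimes."""

    def __init__(self) -> None:
        self._factories: dict[str, tuple[str, Callable[[], Any]]] = {}
        self._singletons: dict[str, Any] = {}

    def add_transient(self, key: str, factory: Callable[[], Any]) -> None:
        self._factories[key] = ("transient", factory)

    def add_singleton(self, key: str, factory: Callable[[], Any]) -> None:
        self._factories[key] = ("singleton", factory)

    def resolve(self, key: str) -> Any:
        lifetime, factory = self._factories[key]
        if lifetime == "transient":
            return factory()  # a fresh instance on every resolve
        if key not in self._singletons:
            self._singletons[key] = factory()  # created once, then reused
        return self._singletons[key]


def simulate(lifetime: str, requests: int) -> int:
    """Resolve the client once per simulated request; return instances created."""
    FakeRedisClient.created = 0
    container = Container()
    if lifetime == "singleton":
        container.add_singleton("redis", FakeRedisClient)
    else:
        container.add_transient("redis", FakeRedisClient)
    for _ in range(requests):
        container.resolve("redis")
    return FakeRedisClient.created


print(simulate("transient", 1000))
print(simulate("singleton", 1000))
```

In the transient case the number of client instances grows with the number of requests, and if each instance keeps its own connections open, the server-side connection count grows with it.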
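One more question
Why didn't the Redis server release the idle client connections? Had idle connections been released, even my sloppy code would not have led to this.
From the official documentation: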
By default recent versions of Redis don't close the connection with the client if the client is idle for many seconds: the connection will remain open forever.
However if you don't like this behavior, you can configure a timeout, so that if the client is idle for more than the specified number of seconds, the client connection will be closed.
You can configure this limit via redis.conf or simply using CONFIG SET timeout <value>.
In short, recent versions of Redis do not release idle client connections by default.
# Close the connection after a client is idle for N seconds (0 to disable)
timeout 0
Changing this Redis server setting makes it release idle client connections.
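The effect of the timeout setting can be modeled in a few lines. This is a simplified sketch of the documented behavior, not Redis's implementation; `Connection` and `reap_idle` are names invented for the example, and the timestamps are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    addr: str
    last_active: float  # seconds, on the same clock as `now`


def reap_idle(connections: list[Connection], timeout: int, now: float) -> list[Connection]:
    """Return the connections that stay open.

    timeout == 0 disables reaping, so every connection survives.
    Otherwise a connection idle for more than `timeout` seconds is closed.
    """
    if timeout == 0:
        return list(connections)
    return [c for c in connections if now - c.last_active <= timeout]


# Hypothetical connections: one idle for 100 seconds, one idle for 10 seconds.
conns = [Connection("172.16.1.3:40001", 0.0), Connection("172.16.1.3:40002", 90.0)]
print(len(reap_idle(conns, timeout=0, now=100.0)))
print([c.addr for c in reap_idle(conns, timeout=60, now=100.0)])
```

With the default of 0, leaked connections are never reclaimed by the server, so they accumulate until maxclients is hit.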
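Of course, the best practice here is not to change the Redis idle timeout setting. The root of the problem is that I instantiated many clients, so I moved the CSRedisCore instantiation code into startup.cs and registered it as a singleton.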
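Verification
info clients now shows a steady 53 Redis connections.
client list shows that 172.16.1.3 (the failed container) holds 50 client connections, webapp (another container in the orchestration) holds 2, and the redis-cli session logged in to the server holds 1.
So here is the question: after the fix, why does the ReceiverApp container still hold a steady 50 Redis connections?
After talking further with the original author of CSRedisCore, I confirmed that CSRedisCore has a warm-up mechanism: by default it pre-warms 50 connections in its connection pool.
Bingo: the failure and the remaining puzzle are both fully explained.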
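Why a single pre-warmed pool settles at a fixed connection count can be sketched as follows. This is a toy model, not CSRedisCore's implementation: `PrewarmedPool` is a hypothetical class, connections are represented by integers, and the only figures taken from the article are the 50 pre-warmed connections, the 2 held by webapp and the 1 held by redis-cli.

```python
import queue


class PrewarmedPool:
    """Toy connection pool that opens all of its connections up front."""

    def __init__(self, size: int = 50) -> None:
        self.opened = 0
        self._idle: queue.Queue = queue.Queue()
        for _ in range(size):
            self._idle.put(self._open())

    def _open(self) -> int:
        """Pretend to open a connection; return its id."""
        self.opened += 1
        return self.opened

    def execute(self, command: str) -> str:
        """Borrow a connection, run the command, and always return the connection."""
        conn = self._idle.get()
        try:
            return f"conn {conn} ran {command}"
        finally:
            self._idle.put(conn)


pool = PrewarmedPool()  # one pool for the whole process, as with a singleton
for _ in range(1000):
    pool.execute("LPUSH")

total = pool.opened + 2 + 1  # ReceiverApp pool + webapp + redis-cli
print(pool.opened, total)
```

Because requests borrow and return connections instead of opening new ones, the count stays at the pool size no matter how many requests are served.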
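Summary
After this incident, there are two things to understand thoroughly when using the CSRedisCore client:
① StackExchange.Redis uses a multiplexed connection (which naturally suggests registering it as a singleton), whereas the open-source CSRedisCore library uses a connection pool. In high-concurrency scenarios, registering it as a singleton is strongly recommended. Otherwise it is easy in production to mistakenly instantiate it per transient request, which exhausts the Redis connections within a matter of days.
② CSRedisCore creates a connection pool by default and pre-warms 50 connections; developers should keep this in mind.
A bit of extra methodology: try not to look for answers on a certain domestic search engine. Learn to ask good questions, and look for answers in official documentation, on Stack Overflow and in the GitHub community. The pit you fell into may well have been dug and filled in by someone else long ago.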
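Update
Many readers said the problem was that I had not read the official CSRedisCore readme carefully (the readme recommends using a singleton). Indeed, I did not use it as a singleton:
③ Connection pools generally have a mechanism for releasing and reclaiming idle connections (CSRedisCore is also pool-based), so at the time I did not take the singleton advice to heart.
④ The key takeaway this time: by default Redis does not release idle client connections (yet it does enforce a maximum number of connections), which directly led to this container crash.
Well, I dug the pit myself.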
+ https://stackoverflow.com/questions/57553401/net-core-are-mvc-controllers-default-singleton
+ https://redis.io/topics/clients
+ https://github.com/2881099/csredis/issues/115