The Nacos version is 2.0.2, deployed on Kubernetes as a 3-node cluster.
Between 2021-11-20 13:15:51 and 2021-11-20 13:55:50, all three Nacos nodes reported errors. config-fatal.log shows the following:
2021-11-20 13:15:07,724 ERROR [db-error] org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 3000ms.
org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 3000ms.
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:82)
    at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:612)
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:669)
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:700)
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:712)
    at org.springframework.jdbc.core.JdbcTemplate.queryForObject(JdbcTemplate.java:783)
    at org.springframework.jdbc.core.JdbcTemplate.queryForObject(JdbcTemplate.java:804)
    at com.alibaba.nacos.config.server.service.repository.extrnal.ExternalStoragePaginationHelperImpl.fetchPage(ExternalStoragePaginationHelperImpl.java:68)
    at com.alibaba.nacos.config.server.service.repository.extrnal.ExternalStoragePaginationHelperImpl.fetchPage(ExternalStoragePaginationHelperImpl.java:57)
    at com.alibaba.nacos.config.server.auth.ExternalPermissionPersistServiceImpl.getPermissions(ExternalPermissionPersistServiceImpl.java:74)
    at com.alibaba.nacos.console.security.nacos.roles.NacosRoleServiceImpl.getPermissionsFromDatabase(NacosRoleServiceImpl.java:220)
    at com.alibaba.nacos.console.security.nacos.roles.NacosRoleServiceImpl.getPermissions(NacosRoleServiceImpl.java:181)
    at com.alibaba.nacos.console.security.nacos.roles.NacosRoleServiceImpl.hasPermission(NacosRoleServiceImpl.java:143)
    at com.alibaba.nacos.console.security.nacos.NacosAuthManager.auth(NacosAuthManager.java:144)
    at com.alibaba.nacos.core.auth.RemoteRequestAuthFilter.filter(RemoteRequestAuthFilter.java:79)
    at com.alibaba.nacos.core.remote.RequestHandler.handleRequest(RequestHandler.java:49)
    at com.alibaba.nacos.core.remote.grpc.GrpcRequestAcceptor.request(GrpcRequestAcceptor.java:168)
    at com.alibaba.nacos.core.remote.grpc.BaseGrpcServer.lambda$addServices$0(BaseGrpcServer.java:177)
    at io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:172)
    at io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
    at io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
    at io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
    at io.grpc.Contexts$ContextualizedServerCallListener.onHalfClose(Contexts.java:86)
    at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:331)
    at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:814)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 3000ms.
    at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:689)
    at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:196)
    at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:161)
    at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128)
    at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:158)
    at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:116)
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:79)
    ... 29 common frames omitted
2021-11-20 13:15:10,920 ERROR [db-error] org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 3000ms.
org.springframework.jdbc.CannotGetJdbcConnectionException: Failed to obtain JDBC Connection; nested exception is java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 3000ms.
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:82)
    at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:612)
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:669)
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:700)
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:712)
    at org.springframework.jdbc.core.JdbcTemplate.queryForObject(JdbcTemplate.java:783)
    at org.springframework.jdbc.core.JdbcTemplate.queryForObject(JdbcTemplate.java:804)
    at com.alibaba.nacos.config.server.service.repository.extrnal.ExternalStoragePaginationHelperImpl.fetchPage(ExternalStoragePaginationHelperImpl.java:68)
    at com.alibaba.nacos.config.server.service.repository.extrnal.ExternalStoragePaginationHelperImpl.fetchPage(ExternalStoragePaginationHelperImpl.java:57)
    at com.alibaba.nacos.config.server.auth.ExternalRolePersistServiceImpl.getRolesByUserName(ExternalRolePersistServiceImpl.java:103)
    at com.alibaba.nacos.console.security.nacos.roles.NacosRoleServiceImpl.getRolesFromDatabase(NacosRoleServiceImpl.java:171)
    at com.alibaba.nacos.console.security.nacos.roles.NacosRoleServiceImpl.getRoles(NacosRoleServiceImpl.java:162)
    at com.alibaba.nacos.console.security.nacos.NacosAuthManager.loginRemote(NacosAuthManager.java:126)
    at com.alibaba.nacos.core.auth.RemoteRequestAuthFilter.filter(RemoteRequestAuthFilter.java:79)
    at com.alibaba.nacos.core.remote.RequestHandler.handleRequest(RequestHandler.java:49)
    at com.alibaba.nacos.core.remote.grpc.GrpcRequestAcceptor.request(GrpcRequestAcceptor.java:168)
    at com.alibaba.nacos.core.remote.grpc.BaseGrpcServer.lambda$addServices$0(BaseGrpcServer.java:177)
    at io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:172)
    at io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
    at io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
    at io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
    at io.grpc.Contexts$ContextualizedServerCallListener.onHalfClose(Contexts.java:86)
    at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:331)
    at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:814)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.sql.SQLTransientConnectionException: HikariPool-1 - Connection is not available, request timed out after 3000ms.
    at com.zaxxer.hikari.pool.HikariPool.createTimeoutException(HikariPool.java:689)
    at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:196)
    at com.zaxxer.hikari.pool.HikariPool.getConnection(HikariPool.java:161)
    at com.zaxxer.hikari.HikariDataSource.getConnection(HikariDataSource.java:128)
    at org.springframework.jdbc.datasource.DataSourceUtils.fetchConnection(DataSourceUtils.java:158)
    at org.springframework.jdbc.datasource.DataSourceUtils.doGetConnection(DataSourceUtils.java:116)
    at org.springframework.jdbc.datasource.DataSourceUtils.getConnection(DataSourceUtils.java:79)
    ... 28 common frames omitted
2021-11-20 13:15:11,038 ERROR [db-error] master db down.
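To double-check the error window quoted above against the log itself, a small script along these lines could report the first and last `[db-error]` timestamps and the number of entries. This is only a sketch: it assumes every entry starts with a `YYYY-MM-DD HH:MM:SS,mmm` timestamp as in the excerpt, and the function name `db_error_window` is made up for illustration.

```python
import re
from datetime import datetime

# Matches the start of a config-fatal.log entry such as
# "2021-11-20 13:15:07,724 ERROR [db-error] ..." (format taken from the excerpt).
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} ERROR \[db-error\]")


def db_error_window(lines):
    """Return (first, last, count) for [db-error] entries, or None if there are none.

    Hypothetical helper, not part of Nacos. Stack-trace lines are skipped
    because they do not start with a timestamp.
    """
    stamps = []
    for line in lines:
        match = ENTRY.match(line)
        if match:
            stamps.append(datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S"))
    if not stamps:
        return None
    return min(stamps), max(stamps), len(stamps)
```

It could be run per node, for example `db_error_window(open("config-fatal.log", encoding="utf-8"))`, to compare when each of the three nodes started and stopped failing.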
nacos.log shows the following:
2021-11-20 13:06:47,659 INFO [capacityManagement] end correct usage, cost: 0s
2021-11-20 13:15:07,470 WARN HikariPool-1 - Connection com.mysql.cj.jdbc.ConnectionImpl@15ca2acf marked as broken because of SQLSTATE(null), ErrorCode(0)
com.mysql.cj.jdbc.exceptions.MySQLTimeoutException: Statement cancelled due to timeout or client request
    at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:113)
    at com.mysql.cj.jdbc.StatementImpl.checkCancelTimeout(StatementImpl.java:2191)
    at com.mysql.cj.protocol.a.NativeProtocol.sendQueryPacket(NativeProtocol.java:1022)
    at com.mysql.cj.NativeSession.execSQL(NativeSession.java:1075)
    at com.mysql.cj.jdbc.ClientPreparedStatement.executeInternal(ClientPreparedStatement.java:930)
    at com.mysql.cj.jdbc.ClientPreparedStatement.executeQuery(ClientPreparedStatement.java:1003)
    at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52)
    at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java)
    at org.springframework.jdbc.core.JdbcTemplate$1.doInPreparedStatement(JdbcTemplate.java:678)
    at org.springframework.jdbc.core.JdbcTemplate.execute(JdbcTemplate.java:617)
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:669)
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:700)
    at org.springframework.jdbc.core.JdbcTemplate.query(JdbcTemplate.java:712)
    at org.springframework.jdbc.core.JdbcTemplate.queryForObject(JdbcTemplate.java:783)
    at org.springframework.jdbc.core.JdbcTemplate.queryForObject(JdbcTemplate.java:804)
    at com.alibaba.nacos.config.server.service.repository.extrnal.ExternalStoragePaginationHelperImpl.fetchPage(ExternalStoragePaginationHelperImpl.java:68)
    at com.alibaba.nacos.config.server.service.repository.extrnal.ExternalStoragePaginationHelperImpl.fetchPage(ExternalStoragePaginationHelperImpl.java:57)
    at com.alibaba.nacos.config.server.auth.ExternalRolePersistServiceImpl.getRolesByUserName(ExternalRolePersistServiceImpl.java:103)
    at com.alibaba.nacos.console.security.nacos.roles.NacosRoleServiceImpl.getRolesFromDatabase(NacosRoleServiceImpl.java:171)
    at com.alibaba.nacos.console.security.nacos.roles.NacosRoleServiceImpl.getRoles(NacosRoleServiceImpl.java:162)
    at com.alibaba.nacos.console.security.nacos.NacosAuthManager.loginRemote(NacosAuthManager.java:126)
    at com.alibaba.nacos.core.auth.RemoteRequestAuthFilter.filter(RemoteRequestAuthFilter.java:79)
    at com.alibaba.nacos.core.remote.RequestHandler.handleRequest(RequestHandler.java:49)
    at com.alibaba.nacos.core.remote.grpc.GrpcRequestAcceptor.request(GrpcRequestAcceptor.java:168)
    at com.alibaba.nacos.core.remote.grpc.BaseGrpcServer.lambda$addServices$0(BaseGrpcServer.java:177)
    at io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:172)
    at io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
    at io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
    at io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
    at io.grpc.Contexts$ContextualizedServerCallListener.onHalfClose(Contexts.java:86)
    at io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:331)
    at io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:814)
    at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2021-11-20 13:15:07,647 WARN HikariPool-1 - Connection com.mysql.cj.jdbc.ConnectionImpl@1927dc0f marked as broken because of SQLSTATE(08S01), ErrorCode(0)
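After 2021-11-20 13:55:50, config-fatal.log and nacos.log stopped reporting these errors, and Nacos was not restarted during the incident. However, some applications could no longer be found in the Nacos registry. netstat showed that these applications still had established connections to Nacos port 9948, and a packet capture showed traffic on those connections. All of these applications had been running for some time, some of them for several months. In the end we restarted the applications whose registration had failed, and the Nacos console then showed them registered again.
We now have a few questions:
1. The log errors suggest that no valid connection could be obtained from HikariPool. Why would this error appear so suddenly?
2. We only use Nacos as a registry, not as a config center. Why did some applications drop out of Nacos after this error? Whether an application is a consumer or a provider, once it has established a connection to Nacos, problems between Nacos and MySQL should not affect registration and discovery, should they?
3. The metrics exposed to Prometheus contain nothing about HikariPool. Could these be exposed in a future release so that the pool can be monitored?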
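To make question 1 more concrete, here is a minimal sketch of the situation that the "Connection is not available, request timed out after 3000ms" message describes: every connection in a bounded pool is held by a slow or stuck query, so further callers wait for the timeout and then fail. It is not Nacos or HikariCP code. `TinyPool`, `PoolTimeout`, `simulate` and all the numbers are hypothetical, and it does not explain why the queries became slow in our case.

```python
import threading
import time


class PoolTimeout(Exception):
    """Raised when no connection frees up within the timeout."""


class TinyPool:
    """Hypothetical bounded pool, standing in for a real connection pool."""

    def __init__(self, size, timeout_ms):
        self._slots = threading.BoundedSemaphore(size)
        self._timeout_ms = timeout_ms

    def acquire(self):
        # Wait up to timeout_ms for a free slot, then give up.
        if not self._slots.acquire(timeout=self._timeout_ms / 1000):
            raise PoolTimeout(
                "Connection is not available, "
                f"request timed out after {self._timeout_ms}ms."
            )

    def release(self):
        self._slots.release()


def run_query(pool, query_s, results):
    try:
        pool.acquire()
    except PoolTimeout as exc:
        results.append(str(exc))
        return
    try:
        time.sleep(query_s)  # simulated slow or hung query
        results.append("ok")
    finally:
        pool.release()


def simulate(pool_size=2, callers=4, query_s=0.5, timeout_ms=100):
    """Start `callers` concurrent queries against a pool of `pool_size`."""
    pool = TinyPool(pool_size, timeout_ms)
    results = []
    threads = [
        threading.Thread(target=run_query, args=(pool, query_s, results))
        for _ in range(callers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With the defaults, two callers hold both slots for 0.5 s while the other two give up after 100 ms, which matches the pattern in nacos.log of statements being cancelled on timeout shortly before the pool timeouts appear in config-fatal.log.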
- OS: centos
- Version: nacos-server 2.0.2, nacos-client (several versions in use)
- Module: naming