原理回顾
- Eureka Server 提供服务注册服务,各个节点启动后,会在Eureka Server中进行注册,这样Eureka Server中的服务注册表中将会存储所有可用服务节点的信息,服务节点的信息可以在界面中直观的看到。
- Eureka Client 是一个Java 客户端,用于简化与Eureka Server的交互,客户端同时也具备一个内置的、使用轮询负载算法的负载均衡器。
- 在应用启动后,将会向Eureka Server发送心跳(默认周期为30秒),如果Eureka Server在多个心跳周期(默认3个心跳周期=90秒)没有收到某个节点的心跳,Eureka Server将会从服务注册表中把这个服务节点移除。
- 高可用情况下的:Eureka Server之间将会通过复制的方式完成数据的同步;
- Eureka Client具有缓存的机制,即使所有的Eureka Server 都挂掉的话,客户端依然可以利用缓存中的信息消费其它服务的API;
EurekaServer 启动流程分析
EurekaServer 处理服务注册、集群数据复制
EurekaClient 是如何注册到 EurekaServer 的?
刚才在org.springframework.cloud.netflix.eureka.server.InstanceRegistry 的每个方法都打了一个断点,而且现在EurekaServer已经处于Debug运行状态,那么我们就随便找一个被 @EnableEurekaClient 的微服务启动试试微服务来试试吧,直接Run。
- 当启动后,就一定会调用注册register方法,那么就接着往下看,拭目以待;
实例注册方法机制
InstanceRegistry.register(final InstanceInfo info, final boolean isReplication) 方法进断点了。
复制代码
- InstanceRegistry.register顺着堆栈信息往上看,是 ApplicationResource.addInstance 方法被调用了,分析addInstance;
ApplicationResource 类
主要是处理接收 Http 的服务请求。
public Response addInstance(InstanceInfo info,
String isReplication) {
logger.debug("Registering instance {} (replication={})", info.getId(), isReplication);
if (isBlank(info.getId())) {
return Response.status(400).entity("Missing instanceId").build();
} else if (isBlank(info.getHostName())) {
return Response.status(400).entity("Missing hostname").build();
} else if (isBlank(info.getAppName())) {
return Response.status(400).entity("Missing appName").build();
} else if (!appName.equals(info.getAppName())) {
return Response.status(400).entity("Mismatched appName, expecting " + appName + " but was " + info.getAppName()).build();
} else if (info.getDataCenterInfo() == null) {
return Response.status(400).entity("Missing dataCenterInfo").build();
} else if (info.getDataCenterInfo().getName() == null) {
return Response.status(400).entity("Missing dataCenterInfo Name").build();
}
DataCenterInfo dataCenterInfo = info.getDataCenterInfo();
if (dataCenterInfo instanceof UniqueIdentifier) {
String dataCenterInfoId = ((UniqueIdentifier) dataCenterInfo).getId();
if (isBlank(dataCenterInfoId)) {
boolean experimental = "true".equalsIgnoreCase(serverConfig.getExperimental("registration.validation.dataCenterInfoId"));
if (experimental) {
String entity = "DataCenterInfo of type " + dataCenterInfo.getClass() + " must contain a valid id";
return Response.status(400).entity(entity).build();
} else if (dataCenterInfo instanceof AmazonInfo) {
AmazonInfo amazonInfo = (AmazonInfo) dataCenterInfo;
String effectiveId = amazonInfo.get(AmazonInfo.MetaDataKey.instanceId);
if (effectiveId == null) {
amazonInfo.getMetadata().put(AmazonInfo.MetaDataKey.instanceId.getName(), info.getId());
}
} else {
logger.warn("Registering DataCenterInfo of type {} without an appropriate id", dataCenterInfo.getClass());
}
}
}
registry.register(info, "true".equals(isReplication));
return Response.status(204).build();
}
复制代码
- 这里的写法貌似看起来和我们之前 Controller 的 RESTFUL 写法有点不一样,仔细一看,原来是Jersey RESTful 框架,是一个产品级的RESTful service 和 client 框架。与Struts类似,它同样可以和hibernate,spring框架整合。
- 看到 registry.register(info, “true”.equals(isReplication)); 注册啊,原来EurekaClient客户端启动后会调用会通过Http(s)请求,直接调到ApplicationResource.addInstance 方法,只要是和注册有关的,都会调用这个方法。
- 接着我们深入 registry.register(info, “true”.equals(isReplication)) 查看;
public void register(final InstanceInfo info, final boolean isReplication) {
handleRegistration(info, resolveInstanceLeaseDuration(info), isReplication);
super.register(info, isReplication);
}
复制代码
- handleRegistration(info, resolveInstanceLeaseDuration(info), isReplication) 方法;
private void handleRegistration(InstanceInfo info, int leaseDuration,
boolean isReplication) {
log("register " + info.getAppName() + ", vip " + info.getVIPAddress()
+ ", leaseDuration " + leaseDuration + ", isReplication "
+ isReplication);
publishEvent(new EurekaInstanceRegisteredEvent(this, info, leaseDuration,
isReplication));
}
复制代码
- 然后通过 ApplicationContext 发布了一个事件 EurekaInstanceRegisteredEvent 服务注册事件,可以给 EurekaInstanceRegisteredEvent 添加监听事件,那么用户就可以在此刻实现自己想要的一些业务逻辑。
- 然后我们再来看看 super.register(info, isReplication) 方法,该方法是 InstanceRegistry 的父类 PeerAwareInstanceRegistryImpl 的方法。
服务户厕机制
进入PeerAwareInstanceRegistryImpl 类的 register(final InstanceInfo info, final boolean isReplication) 方法;
public void register(final InstanceInfo info, final boolean isReplication) {
int leaseDuration = Lease.DEFAULT_DURATION_IN_SECS;
if (info.getLeaseInfo() != null && info.getLeaseInfo().getDurationInSecs() > 0) {
leaseDuration = info.getLeaseInfo().getDurationInSecs();
}
super.register(info, leaseDuration, isReplication);
replicateToPeers(Action.Register, info.getAppName(), info.getId(), info, null, isReplication);
}
复制代码
进入super.register(info, leaseDuration, isReplication),如何写入EurekaServer 的注册表的,进入AbstractInstanceRegistry.register(InstanceInfo registrant, int leaseDuration, boolean isReplication) 方法。
public void register(InstanceInfo registrant, int leaseDuration, boolean isReplication) {
try {
read.lock();
Map> gMap = registry.get(registrant.getAppName());
REGISTER.increment(isReplication);
if (gMap == null) {
final ConcurrentHashMap> gNewMap = new ConcurrentHashMap>();
gMap = registry.putIfAbsent(registrant.getAppName(), gNewMap);
if (gMap == null) {
gMap = gNewMap;
}
}
Lease existingLease = gMap.get(registrant.getId());
if (existingLease != null && (existingLease.getHolder() != null)) {
Long existingLastDirtyTimestamp = existingLease.getHolder().getLastDirtyTimestamp();
Long registrationLastDirtyTimestamp = registrant.getLastDirtyTimestamp();
logger.debug("Existing lease found (existing={}, provided={}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
if (existingLastDirtyTimestamp > registrationLastDirtyTimestamp) {
logger.warn("There is an existing lease and the existing lease's dirty timestamp {} is greater" +
" than the one that is being registered {}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
logger.warn("Using the existing instanceInfo instead of the new instanceInfo as the registrant");
registrant = existingLease.getHolder();
}
} else {
synchronized (lock) {
if (this.expectedNumberOfRenewsPerMin > 0) {
this.expectedNumberOfRenewsPerMin = this.expectedNumberOfRenewsPerMin + 2;
this.numberOfRenewsPerMinThreshold =
(int) (this.expectedNumberOfRenewsPerMin * serverConfig.getRenewalPercentThreshold());
}
}
logger.debug("No previous lease information found; it is new registration");
}
Lease lease = new Lease(registrant, leaseDuration);
if (existingLease != null) {
lease.setServiceUpTimestamp(existingLease.getServiceUpTimestamp());
}
gMap.put(registrant.getId(), lease);
synchronized (recentRegisteredQueue) {
recentRegisteredQueue.add(new Pair(
System.currentTimeMillis(),
registrant.getAppName() + "(" + registrant.getId() + ")"));
}
if (!InstanceStatus.UNKNOWN.equals(registrant.getOverriddenStatus())) {
logger.debug("Found overridden status {} for instance {}. Checking to see if needs to be add to the "
+ "overrides", registrant.getOverriddenStatus(), registrant.getId());
if (!overriddenInstanceStatusMap.containsKey(registrant.getId())) {
logger.info("Not found overridden id {} and hence adding it", registrant.getId());
overriddenInstanceStatusMap.put(registrant.getId(), registrant.getOverriddenStatus());
}
}
InstanceStatus overriddenStatusFromMap = overriddenInstanceStatusMap.get(registrant.getId());
if (overriddenStatusFromMap != null) {
logger.info("Storing overridden status {} from map", overriddenStatusFromMap);
registrant.setOverriddenStatus(overriddenStatusFromMap);
}
InstanceStatus overriddenInstanceStatus = getOverriddenInstanceStatus(registrant, existingLease, isReplication);
registrant.setStatusWithoutDirty(overriddenInstanceStatus);
if (InstanceStatus.UP.equals(registrant.getStatus())) {
lease.serviceUp();
}
registrant.setActionType(ActionType.ADDED);
recentlyChangedQueue.add(new RecentlyChangedItem(lease));
registrant.setLastUpdatedTimestamp();
invalidateCache(registrant.getAppName(), registrant.getVIPAddress(), registrant.getSecureVipAddress());
logger.info("Registered instance {}/{} with status {} (replication={})",
registrant.getAppName(), registrant.getId(), registrant.getStatus(), isReplication);
} finally {
read.unlock();
}
}
复制代码
- *发现这个方法有点长,大致阅读,主要更新了注册表的时间之外,还更新了缓存等其它东西,大家有兴趣的可以深究阅读该方法;
集群之间的复制
replicateToPeers(Action.Register, info.getAppName(), info.getId(), info, null, isReplication) 的这个方法。
private void replicateToPeers(Action action, String appName, String id,
InstanceInfo info ,
InstanceStatus newStatus , boolean isReplication) {
Stopwatch tracer = action.getTimer().start();
try {
if (isReplication) {
numberOfReplicationsLastMin.increment();
}
if (peerEurekaNodes == Collections.EMPTY_LIST || isReplication) {
return;
}
for (final PeerEurekaNode node : peerEurekaNodes.getPeerEurekaNodes()) {
if (peerEurekaNodes.isThisMyUrl(node.getServiceUrl())) {
continue;
}
replicateInstanceActionsToPeers(action, appName, id, info, newStatus, node);
}
} finally {
tracer.stop();
}
}
复制代码
- 每当有注册请求,首先更新 EurekaServer 的注册表,然后再将信息同步到其它EurekaServer的节点上去;
- 接下来我们看看 node 节点是如何进行复制操作的,进入 replicateInstanceActionsToPeers 方法。
private void replicateInstanceActionsToPeers(Action action, String appName,
String id, InstanceInfo info, InstanceStatus newStatus,
PeerEurekaNode node) {
try {
InstanceInfo infoFromRegistry = null;
CurrentRequestVersion.set(Version.V2);
switch (action) {
case Cancel:
node.cancel(appName, id);
break;
case Heartbeat:
InstanceStatus overriddenStatus = overriddenInstanceStatusMap.get(id);
infoFromRegistry = getInstanceByAppAndId(appName, id, false);
node.heartbeat(appName, id, infoFromRegistry, overriddenStatus, false);
break;
case Register:
node.register(info);
break;
case StatusUpdate:
infoFromRegistry = getInstanceByAppAndId(appName, id, false);
node.statusUpdate(appName, id, newStatus, infoFromRegistry);
break;
case DeleteStatusOverride:
infoFromRegistry = getInstanceByAppAndId(appName, id, false);
node.deleteStatusOverride(appName, id, infoFromRegistry);
break;
}
} catch (Throwable t) {
logger.error("Cannot replicate information to {} for action {}", node.getServiceUrl(), action.name(), t);
}
}
复制代码
- 节点之间的复制状态操作,都在这里体现的淋漓尽致,那么我们就拿 Register 类型 node.register(info) 来看,我们来看看 node 究竟是如何做到同步信息的,进入 node.register(info) 方法看看;
同级之间的复制机制
PeerEurekaNode.register(final InstanceInfo info) 方法,一窥究竟如何同步数据。
public void register(final InstanceInfo info) throws Exception {
long expiryTime = System.currentTimeMillis() + getLeaseRenewalOf(info);
batchingDispatcher.process(
taskId("register", info),
new InstanceReplicationTask(targetHost, Action.Register, info, null, true) {
public EurekaHttpResponse execute() {
return replicationClient.register(info);
}
},
expiryTime
);
}
复制代码
- 这里涉及到了 Eureka 的任务批处理,通常情况下Peer之间的同步需要调用多次,如果EurekaServer一多的话,那么将会有很多http请求,所 以自然而然的孕育出了任务批处理,但是也在一定程度上导致了注册和下线的一些延迟,突出优势的同时也势必会造成一些劣势,但是这些延迟情况还是能符合 常理在容忍范围之内的。
- 在 expiryTime 超时时间之内,批次处理要做的事情就是合并任务为一个List,然后发送请求的时候,将这个批次List直接打包发送请求出去,这样的话,在这个批次的List里面,可能包含取消、注册、心跳、状态等一系列状态的集合List。
- 我们再接着看源码,batchingDispatcher.process 这么一调用,然后我们就直接看这个 TaskDispatchers.createBatchingTaskDispatcher 方法。
public static TaskDispatcher createBatchingTaskDispatcher(String id,
int maxBufferSize,
int workloadSize,
int workerCount,
long maxBatchingDelay,
long congestionRetryDelayMs,
long networkFailureRetryMs,
TaskProcessor taskProcessor) {
final AcceptorExecutor acceptorExecutor = new AcceptorExecutor<>(
id, maxBufferSize, workloadSize, maxBatchingDelay, congestionRetryDelayMs, networkFailureRetryMs
);
final TaskExecutors taskExecutor = TaskExecutors.batchExecutors(id, workerCount, taskProcessor, acceptorExecutor);
return new TaskDispatcher() {
public void process(ID id, T task, long expiryTime) {
acceptorExecutor.process(id, task, expiryTime);
}
public void shutdown() {
acceptorExecutor.shutdown();
taskExecutor.shutdown();
}
};
}
复制代码
- 这里的 process 方法会将任务添加到队列中,有入队列自然有出队列,具体怎么取任务,我就不一一给大家讲解了,我就讲讲最后是怎么触发任务的。进入 final TaskExecutors
static TaskExecutors batchExecutors(final String name,
int workerCount,
final TaskProcessor processor,
final AcceptorExecutor acceptorExecutor) {
final AtomicBoolean isShutdown = new AtomicBoolean();
final TaskExecutorMetrics metrics = new TaskExecutorMetrics(name);
return new TaskExecutors<>(new WorkerRunnableFactory() {
public WorkerRunnable create(int idx) {
return new BatchWorkerRunnable<>("TaskBatchingWorker-" +name + '-' + idx, isShutdown, metrics, processor, acceptorExecutor);
}
}, workerCount, isShutdown);
}
复制代码
- 我们发现 TaskExecutors 类中的 batchExecutors 这个静态方法,有个 BatchWorkerRunnable 返回的实现类,因此我们再次进入 BatchWorkerRunnable 类看看究竟,而且既然是 Runnable,那么势必会有 run 方法。
public void run() {
try {
while (!isShutdown.get()) {
List> holders = getWork();
metrics.registerExpiryTimes(holders);
List tasks = getTasksOf(holders);
ProcessingResult result = processor.process(tasks);
switch (result) {
case Success:
break;
case Congestion:
case TransientError:
taskDispatcher.reprocess(holders, result);
break;
case PermanentError:
logger.warn("Discarding {} tasks of {} due to permanent error", holders.size(), workerName);
}
metrics.registerTaskResult(result, tasks.size());
}
} catch (InterruptedException e) {
} catch (Throwable e) {
logger.warn("Discovery WorkerThread error", e);
}
}
复制代码
- 这就是我们 BatchWorkerRunnable 类的 run 方法,这里面首先要获取信号量释放,才能获得任务集合,一旦获取到了任务集合的话,那么就直接调用 processor.process(tasks) 方法请求 Peer 节点同步数据,接下来我们看看 ReplicationTaskProcessor.process 方法;
public ProcessingResult process(List tasks) {
ReplicationList list = createReplicationListOf(tasks);
try {
EurekaHttpResponse response = replicationClient.submitBatchUpdates(list);
int statusCode = response.getStatusCode();
if (!isSuccess(statusCode)) {
if (statusCode == 503) {
logger.warn("Server busy (503) HTTP status code received from the peer {}; rescheduling tasks after delay", peerId);
return ProcessingResult.Congestion;
} else {
logger.error("Batch update failure with HTTP status code {}; discarding {} replication tasks", statusCode, tasks.size());
return ProcessingResult.PermanentError;
}
} else {
handleBatchResponse(tasks, response.getEntity().getResponseList());
}
} catch (Throwable e) {
if (isNetworkConnectException(e)) {
logNetworkErrorSample(null, e);
return ProcessingResult.TransientError;
} else {
logger.error("Not re-trying this exception because it does not seem to be a network exception", e);
return ProcessingResult.PermanentError;
}
}
return ProcessingResult.Success;
}
复制代码
- 感觉快要见到真相了,所以我们迫不及待的进入 JerseyReplicationClient.submitBatchUpdates(ReplicationList replicationList) 方法一窥究竟。
public EurekaHttpResponse submitBatchUpdates(ReplicationList replicationList) {
ClientResponse response = null;
try {
response = jerseyApacheClient.resource(serviceUrl)
.path(PeerEurekaNode.BATCH_URL_PATH)
.accept(MediaType.APPLICATION_JSON_TYPE)
.type(MediaType.APPLICATION_JSON_TYPE)
.post(ClientResponse.class, replicationList);
if (!isSuccess(response.getStatus())) {
return anEurekaHttpResponse(response.getStatus(), ReplicationListResponse.class).build();
}
ReplicationListResponse batchResponse = response.getEntity(ReplicationListResponse.class);
return anEurekaHttpResponse(response.getStatus(), batchResponse).type(MediaType.APPLICATION_JSON_TYPE).build();
} finally {
if (response != null) {
response.close();
}
}
}
复制代码
- 看到了相对路径地址,我们搜索下"batch"这样的字符串看看有没有对应的接收方法或者被@Path注解进入的;在 eureka-core-1.4.12.jar 这个包下面,果然搜到到了 @Path(“batch”) 这样的字样,直接进入,发现这是 PeerReplicationResource 类的方法 batchReplication,我们进入这方法看看。
public Response batchReplication(ReplicationList replicationList) {
try {
ReplicationListResponse batchResponse = new ReplicationListResponse();
for (ReplicationInstance instanceInfo : replicationList.getReplicationList()) {
try {
batchResponse.addResponse(dispatch(instanceInfo));
} catch (Exception e) {
batchResponse.addResponse(new ReplicationInstanceResponse(Status.INTERNAL_SERVER_ERROR.getStatusCode(), null));
logger.error(instanceInfo.getAction() + " request processing failed for batch item "
+ instanceInfo.getAppName() + '/' + instanceInfo.getId(), e);
}
}
return Response.ok(batchResponse).build();
} catch (Throwable e) {
logger.error("Cannot execute batch Request", e);
return Response.status(Status.INTERNAL_SERVER_ERROR).build();
}
}
复制代码
- 看到了循环一次遍历任务进行处理,不知不觉觉得心花怒放,胜利的重点马上就要到来了,我们进入 PeerReplicationResource.dispatch 方法看看。
private ReplicationInstanceResponse dispatch(ReplicationInstance instanceInfo) {
ApplicationResource applicationResource = createApplicationResource(instanceInfo);
InstanceResource resource = createInstanceResource(instanceInfo, applicationResource);
String lastDirtyTimestamp = toString(instanceInfo.getLastDirtyTimestamp());
String overriddenStatus = toString(instanceInfo.getOverriddenStatus());
String instanceStatus = toString(instanceInfo.getStatus());
Builder singleResponseBuilder = new Builder();
switch (instanceInfo.getAction()) {
case Register:
singleResponseBuilder = handleRegister(instanceInfo, applicationResource);
break;
case Heartbeat:
singleResponseBuilder = handleHeartbeat(resource, lastDirtyTimestamp, overriddenStatus, instanceStatus);
break;
case Cancel:
singleResponseBuilder = handleCancel(resource);
break;
case StatusUpdate:
singleResponseBuilder = handleStatusUpdate(instanceInfo, resource);
break;
case DeleteStatusOverride:
singleResponseBuilder = handleDeleteStatusOverride(instanceInfo, resource);
break;
}
return singleResponseBuilder.build();
}
复制代码
- 随便抓一个类型,那我们也拿 Register 类型来看,进入 PeerReplicationResource.handleRegister 看看。
private static Builder handleRegister(ReplicationInstance instanceInfo, ApplicationResource applicationResource) {
applicationResource.addInstance(instanceInfo.getInstanceInfo(), REPLICATION);
return new Builder().setStatusCode(Status.OK.getStatusCode());
}
复制代码
- Peer节点的同步旅程终于结束了,最终又回调到了 ApplicationResource.addInstance 这个方法,这个方法在最终是EurekaClient启动后注册调用的方法,然而Peer节点的信息同步也调用了这个方法,仅仅只是通过一个变量 isReplication 为true还是false来判断是否是节点复制。剩下的ApplicationResource.addInstance流程前面已经提到过了,相信大家已经明白了注册的流程是如何扭转的,包括批量任务是如何处理EurekaServer节点之间的信息同步的了。
EurekaClient 启动流程分析
调换运行模式
Run运行discovery-eureka服务,Debug 运行 provider-user 服务,先观察日志先;
2017-10-23 19:43:07.688 INFO 1488 --- [ main] o.s.c.support.DefaultLifecycleProcessor : Starting beans in phase 0
2017-10-23 19:43:07.694 INFO 1488 --- [ main] o.s.c.n.eureka.InstanceInfoFactory : Setting initial instance status as: STARTING
2017-10-23 19:43:07.874 INFO 1488 --- [ main] c.n.d.provider.DiscoveryJerseyProvider : Using JSON encoding codec LegacyJacksonJson
2017-10-23 19:43:07.874 INFO 1488 --- [ main] c.n.d.provider.DiscoveryJerseyProvider : Using JSON decoding codec LegacyJacksonJson
2017-10-23 19:43:07.971 INFO 1488 --- [ main] c.n.d.provider.DiscoveryJerseyProvider : Using XML encoding codec XStreamXml
2017-10-23 19:43:07.971 INFO 1488 --- [ main] c.n.d.provider.DiscoveryJerseyProvider : Using XML decoding codec XStreamXml
2017-10-23 19:43:08.134 INFO 1488 --- [ main] c.n.d.s.r.aws.ConfigClusterResolver : Resolving eureka endpoints via configuration
2017-10-23 19:43:08.344 INFO 1488 --- [ main] com.netflix.discovery.DiscoveryClient : Disable delta property : false
2017-10-23 19:43:08.344 INFO 1488 --- [ main] com.netflix.discovery.DiscoveryClient : Single vip registry refresh property : null
2017-10-23 19:43:08.344 INFO 1488 --- [ main] com.netflix.discovery.DiscoveryClient : Force full registry fetch : false
2017-10-23 19:43:08.344 INFO 1488 --- [ main] com.netflix.discovery.DiscoveryClient : Application is null : false
2017-10-23 19:43:08.344 INFO 1488 --- [ main] com.netflix.discovery.DiscoveryClient : Registered Applications size is zero : true
2017-10-23 19:43:08.344 INFO 1488 --- [ main] com.netflix.discovery.DiscoveryClient : Application version is -1: true
2017-10-23 19:43:08.345 INFO 1488 --- [ main] com.netflix.discovery.DiscoveryClient : Getting all instance registry info from the eureka server
2017-10-23 19:43:08.630 INFO 1488 --- [ main] com.netflix.discovery.DiscoveryClient : The response status is 200
2017-10-23 19:43:08.631 INFO 1488 --- [ main] com.netflix.discovery.DiscoveryClient : Starting heartbeat executor: renew interval is: 30
2017-10-23 19:43:08.634 INFO 1488 --- [ main] c.n.discovery.InstanceInfoReplicator : InstanceInfoReplicator onDemand update allowed rate per min is 4
2017-10-23 19:43:08.637 INFO 1488 --- [ main] com.netflix.discovery.DiscoveryClient : Discovery Client initialized at timestamp 1508758988637 with initial instances count: 0
2017-10-23 19:43:08.657 INFO 1488 --- [ main] c.n.e.EurekaDiscoveryClientConfiguration : Registering application springms-provider-user with eureka with status UP
2017-10-23 19:43:08.658 INFO 1488 --- [ main] com.netflix.discovery.DiscoveryClient : Saw local status change event StatusChangeEvent [timestamp=1508758988658, current=UP, previous=STARTING]
2017-10-23 19:43:08.659 INFO 1488 --- [nfoReplicator-0] com.netflix.discovery.DiscoveryClient : DiscoveryClient_SPRINGMS-PROVIDER-USER/springms-provider-user:192.168.3.101:7900: registering service...
2017-10-23 19:43:08.768 INFO 1488 --- [ main] s.b.c.e.t.TomcatEmbeddedServletContainer : Tomcat started on port(s): 7900 (http)
2017-10-23 19:43:08.768 INFO 1488 --- [ main] c.n.e.EurekaDiscoveryClientConfiguration : Updating port to 7900
2017-10-23 19:43:08.773 INFO 1488 --- [ main] c.s.cloud.MsProviderUserApplication : Started ProviderApplication in 882.1 seconds (JVM running for 10.398)
复制代码
服务提供方主体加载流程
- 【1】:仔细查看下日志,先是 DefaultLifecycleProcessor 类处理了一些 bean,然后接下来肯定会调用一些实现 SmartLifecycle 类的start 方法;
- 【2】: 接着初始化设置了EurekaClient的状态为 STARTING,初始化编码使用的格式,哪些用JSON,哪些用XML;
- 【3】: 紧接着打印了强制获取注册信息状态为false,已注册的应用大小为0,客户端发送心跳续约,心跳续约间隔为30秒,最后打印Client初始化完成;
EnableEurekaClient 组件。
public EnableEurekaClient {}
复制代码
@EnableEurekaClient
这个注解类竟然也使用了注解 @EnableDiscoveryClient,那么我们有必要去这个注解类看看。
public EnableDiscoveryClient {}
复制代码
@EnableDiscoveryClient
这个注解类有个比较特殊的注解 @Import,由此我们猜想,这里的大多数逻辑是不是都写在这个 EnableDiscoveryClientImportSelector 类呢?
EnableDiscoveryClientImportSelector
public class EnableDiscoveryClientImportSelector
extends SpringFactoryImportSelector {
protected boolean isEnabled() {
return new RelaxedPropertyResolver(getEnvironment()).getProperty(
"spring.cloud.discovery.enabled", Boolean.class, Boolean.TRUE);
}
protected boolean hasDefaultFactory() {
return true;
}
}
复制代码
EnableDiscoveryClientImportSelector 类继承了 SpringFactoryImportSelector 类,但是重写了一个 isEnabled() 方法,默认值返回 true,为什么会返回true。
public String[] selectImports(AnnotationMetadata metadata) {
if (!isEnabled()) {
return new String[0];
}
AnnotationAttributes attributes = AnnotationAttributes.fromMap(
metadata.getAnnotationAttributes(this.annotationClass.getName(), true));
Assert.notNull(attributes, "No " + getSimpleName() + " attributes found. Is "
+ metadata.getClassName() + " annotated with @" + getSimpleName() + "?");
List factories = new ArrayList<>(new LinkedHashSet<>(SpringFactoriesLoader
.loadFactoryNames(this.annotationClass, this.beanClassLoader)));
if (factories.isEmpty() && !hasDefaultFactory()) {
throw new IllegalStateException("Annotation @" + getSimpleName()
+ " found, but there are no implementations. Did you forget to include a starter?");
}
if (factories.size() > 1) {
log.warn("More than one implementation " + "of @" + getSimpleName()
+ " (now relying on @Conditionals to pick one): " + factories);
}
return factories.toArray(new String[factories.size()]);
}
复制代码
EnableDiscoveryClientImportSelector.selectImports
首先通过注解获取了一些属性,然后加载了一些类名称,我们进入loadFactoryNames 方法看看。
public static List loadFactoryNames(Class factoryClass, ClassLoader classLoader) {
String factoryClassName = factoryClass.getName();
try {
Enumeration urls = (classLoader != null ? classLoader.getResources(FACTORIES_RESOURCE_LOCATION) :
ClassLoader.getSystemResources(FACTORIES_RESOURCE_LOCATION));
List result = new ArrayList();
while (urls.hasMoreElements()) {
URL url = urls.nextElement();
Properties properties = PropertiesLoaderUtils.loadProperties(new UrlResource(url));
String factoryClassNames = properties.getProperty(factoryClassName);
result.addAll(Arrays.asList(StringUtils.commaDelimitedListToStringArray(factoryClassNames)));
}
return result;
}
catch (IOException ex) {
throw new IllegalArgumentException("Unable to load [" + factoryClass.getName() +
"] factories from location [" + FACTORIES_RESOURCE_LOCATION + "]", ex);
}
}
复制代码
加载了一个配置文件,配置文件里面写了啥呢?打开SpringFactoryImportSelector该文件所在的jar包的spring.factories文件一看。
org.springframework.boot.autoconfigure.EnableAutoConfiguration=\
org.springframework.cloud.client.CommonsClientAutoConfiguration,\
org.springframework.cloud.client.discovery.noop.NoopDiscoveryClientAutoConfiguration,\
org.springframework.cloud.client.hypermedia.CloudHypermediaAutoConfiguration,\
org.springframework.cloud.client.loadbalancer.LoadBalancerAutoConfiguration,\
org.springframework.cloud.commons.util.UtilAutoConfiguration
org.springframework.boot.env.EnvironmentPostProcessor=\
org.springframework.cloud.client.HostInfoEnvironmentPostProcessor
复制代码
都是一些 Configuration 后缀的类名,所以这些都是加载的一堆堆的配置文件类。 factories对象里面只有一个类名路径为 org.springframework.cloud.netflix.eureka.EurekaDiscoveryClientConfiguration 。
EurekaDiscoveryClientConfiguration
public class EurekaDiscoveryClientConfiguration implements SmartLifecycle, Ordered {
public void start() {
if (this.port.get() != 0 && this.instanceConfig.getNonSecurePort() == 0) {
this.instanceConfig.setNonSecurePort(this.port.get());
}
if (!this.running.get() && this.instanceConfig.getNonSecurePort() > 0) {
maybeInitializeClient();
if (log.isInfoEnabled()) {
log.info("Registering application " + this.instanceConfig.getAppname()
+ " with eureka with status "
+ this.instanceConfig.getInitialStatus());
}
this.applicationInfoManager
.setInstanceStatus(this.instanceConfig.getInitialStatus());
if (this.healthCheckHandler != null) {
this.eurekaClient.registerHealthCheck(this.healthCheckHandler);
}
this.context.publishEvent(
new InstanceRegisteredEvent<>(this, this.instanceConfig));
this.running.set(true);
}
}
}
复制代码
- 首先看到该类实现了SmartLifecycle 接口,那么就肯定会实现 start 方法,而且这个 start 方法感觉应在会被加载执行的。 this.applicationInfoManager.setInstanceStatus(this.instanceConfig.getInitialStatus()) 这段代码有一个观察者模式的回调存在。
public synchronized void setInstanceStatus(InstanceStatus status) {
InstanceStatus prev = instanceInfo.setStatus(status);
if (prev != null) {
for (StatusChangeListener listener : listeners.values()) {
try {
listener.notify(new StatusChangeEvent(prev, status));
} catch (Exception e) {
logger.warn("failed to notify listener: {}", listener.getId(), e);
}
}
}
}
复制代码
- 这个方法会因为状态的改变而回调所有实现 StatusChangeListener 这个类的地方,前提得先注册到 listeners 中去才行。
- 于是乎,我们断定,若想要回调,那么就必须有地方先注册这个事件,而且这个注册还必须提前执行在 start 方法前执行,于是我们得先在ApplicationInfoManager 这个类中找到注册到 listeners 的这个方法。
public void registerStatusChangeListener(StatusChangeListener listener) {
listeners.put(listener.getId(), listener);
}
复制代码
- 于是我们逆向找下 registerStatusChangeListener 被调用的地方。
- 很不巧的是,尽然只有1个地方被调用,这个地方就是 DiscoveryClient.initScheduledTasks 方法,而且 initScheduledTasks 方法又是在 DiscoveryClient 的构造函数里面调用的,同时我们也对 initScheduledTasks 以及 initScheduledTasks 被调用的构造方法地方 打上断点。
果不其然,EurekaDiscoveryClientConfiguration.start 方法被调用了,紧接着 this.applicationInfoManager.setInstanceStatus(this.instanceConfig.getInitialStatus()) 也进入断点,然后在往下走,又进入的 DiscoveryClient.initScheduledTasks 方法中的 notify 回调处。
- 看着断点依次经过我们上述分析的地方,然后也符合日志打印的顺序,所以我们现在应该是有必要好好看看 DiscoveryClient.initScheduledTasks 这个方法究竟干了什么伟大的事情。然而又想了想,还不如看看 initScheduledTasks 被调用的构造方法。
DiscoveryClient 经过 @Inject 注解过的构造方法。
DiscoveryClient(ApplicationInfoManager applicationInfoManager, EurekaClientConfig config, DiscoveryClientOptionalArgs args, Provider backupRegistryProvider) {
if (args != null) {
this.healthCheckHandlerProvider = args.healthCheckHandlerProvider;
this.healthCheckCallbackProvider = args.healthCheckCallbackProvider;
this.eventListeners.addAll(args.getEventListeners());
} else {
this.healthCheckCallbackProvider = null;
this.healthCheckHandlerProvider = null;
}
this.applicationInfoManager = applicationInfoManager;
InstanceInfo myInfo = applicationInfoManager.getInfo();
clientConfig = config;
staticClientConfig = clientConfig;
transportConfig = config.getTransportConfig();
instanceInfo = myInfo;
if (myInfo != null) {
appPathIdentifier = instanceInfo.getAppName() + "/" + instanceInfo.getId();
} else {
logger.warn("Setting instanceInfo to a passed in null value");
}
this.backupRegistryProvider = backupRegistryProvider;
this.urlRandomizer = new EndpointUtils.InstanceInfoBasedUrlRandomizer(instanceInfo);
localRegionApps.set(new Applications());
fetchRegistryGeneration = new AtomicLong(0);
remoteRegionsToFetch = new AtomicReference(clientConfig.fetchRegistryForRemoteRegions());
remoteRegionsRef = new AtomicReference<>(remoteRegionsToFetch.get() == null ? null : remoteRegionsToFetch.get().split(","));
if (config.shouldFetchRegistry()) {
this.registryStalenessMonitor = new ThresholdLevelsMetric(this, METRIC_REGISTRY_PREFIX + "lastUpdateSec_", new long[]{15L, 30L, 60L, 120L, 240L, 480L});
} else {
this.registryStalenessMonitor = ThresholdLevelsMetric.NO_OP_METRIC;
}
if (config.shouldRegisterWithEureka()) {
this.heartbeatStalenessMonitor = new ThresholdLevelsMetric(this, METRIC_REGISTRATION_PREFIX + "lastHeartbeatSec_", new long[]{15L, 30L, 60L, 120L, 240L, 480L});
} else {
this.heartbeatStalenessMonitor = ThresholdLevelsMetric.NO_OP_METRIC;
}
if (!config.shouldRegisterWithEureka() && !config.shouldFetchRegistry()) {
logger.info("Client configured to neither register nor query for data.");
scheduler = null;
heartbeatExecutor = null;
cacheRefreshExecutor = null;
eurekaTransport = null;
instanceRegionChecker = new InstanceRegionChecker(new PropertyBasedAzToRegionMapper(config), clientConfig.getRegion());
DiscoveryManager.getInstance().setDiscoveryClient(this);
DiscoveryManager.getInstance().setEurekaClientConfig(config);
initTimestampMs = System.currentTimeMillis();
logger.info("Discovery Client initialized at timestamp {} with initial instances count: {}",
initTimestampMs, this.getApplications().size());
return;
}
try {
scheduler = Executors.newScheduledThreadPool(3,
new ThreadFactoryBuilder()
.setNameFormat("DiscoveryClient-%d")
.setDaemon(true)
.build());
heartbeatExecutor = new ThreadPoolExecutor(
1, clientConfig.getHeartbeatExecutorThreadPoolSize(), 0, TimeUnit.SECONDS,
new SynchronousQueue(),
new ThreadFactoryBuilder()
.setNameFormat("DiscoveryClient-HeartbeatExecutor-%d")
.setDaemon(true)
.build()
);
cacheRefreshExecutor = new ThreadPoolExecutor(
1, clientConfig.getCacheRefreshExecutorThreadPoolSize(), 0, TimeUnit.SECONDS,
new SynchronousQueue(),
new ThreadFactoryBuilder()
.setNameFormat("DiscoveryClient-CacheRefreshExecutor-%d")
.setDaemon(true)
.build()
);
eurekaTransport = new EurekaTransport();
scheduleServerEndpointTask(eurekaTransport, args);
AzToRegionMapper azToRegionMapper;
if (clientConfig.shouldUseDnsForFetchingServiceUrls()) {
azToRegionMapper = new DNSBasedAzToRegionMapper(clientConfig);
} else {
azToRegionMapper = new PropertyBasedAzToRegionMapper(clientConfig);
}
if (null != remoteRegionsToFetch.get()) {
azToRegionMapper.setRegionsToFetch(remoteRegionsToFetch.get().split(","));
}
instanceRegionChecker = new InstanceRegionChecker(azToRegionMapper, clientConfig.getRegion());
} catch (Throwable e) {
throw new RuntimeException("Failed to initialize DiscoveryClient!", e);
}
if (clientConfig.shouldFetchRegistry() && !fetchRegistry(false)) {
fetchRegistryFromBackup();
}
initScheduledTasks();
try {
Monitors.registerObject(this);
} catch (Throwable e) {
logger.warn("Cannot register timers", e);
}
DiscoveryManager.getInstance().setDiscoveryClient(this);
DiscoveryManager.getInstance().setEurekaClientConfig(config);
initTimestampMs = System.currentTimeMillis();
logger.info("Discovery Client initialized at timestamp {} with initial instances count: {}",
initTimestampMs, this.getApplications().size());
}
复制代码
- 从往下看,initScheduledTasks 这个方法顾名思义就是初始化调度任务,所以这里面的内容应该就是重头戏,进入看看。
private void initScheduledTasks() {
if (clientConfig.shouldFetchRegistry()) {
int registryFetchIntervalSeconds = clientConfig.getRegistryFetchIntervalSeconds();
int expBackOffBound = clientConfig.getCacheRefreshExecutorExponentialBackOffBound();
scheduler.schedule(
new TimedSupervisorTask(
"cacheRefresh",
scheduler,
cacheRefreshExecutor,
registryFetchIntervalSeconds,
TimeUnit.SECONDS,
expBackOffBound,
new CacheRefreshThread()
),
registryFetchIntervalSeconds, TimeUnit.SECONDS);
}
if (clientConfig.shouldRegisterWithEureka()) {
int renewalIntervalInSecs = instanceInfo.getLeaseInfo().getRenewalIntervalInSecs();
int expBackOffBound = clientConfig.getHeartbeatExecutorExponentialBackOffBound();
logger.info("Starting heartbeat executor: " + "renew interval is: " + renewalIntervalInSecs);
scheduler.schedule(
new TimedSupervisorTask(
"heartbeat",
scheduler,
heartbeatExecutor,
renewalIntervalInSecs,
TimeUnit.SECONDS,
expBackOffBound,
new HeartbeatThread()
),
renewalIntervalInSecs, TimeUnit.SECONDS);
instanceInfoReplicator = new InstanceInfoReplicator(
this,
instanceInfo,
clientConfig.getInstanceInfoReplicationIntervalSeconds(),
2);
statusChangeListener = new ApplicationInfoManager.StatusChangeListener() {
public String getId() {
return "statusChangeListener";
}
public void notify(StatusChangeEvent statusChangeEvent) {
if (InstanceStatus.DOWN == statusChangeEvent.getStatus() ||
InstanceStatus.DOWN == statusChangeEvent.getPreviousStatus()) {
logger.warn("Saw local status change event {}", statusChangeEvent);
} else {
logger.info("Saw local status change event {}", statusChangeEvent);
}
instanceInfoReplicator.onDemandUpdate();
}
};
if (clientConfig.shouldOnDemandUpdateStatusChange()) {
applicationInfoManager.registerStatusChangeListener(statusChangeListener);
}
instanceInfoReplicator.start(clientConfig.getInitialInstanceInfoReplicationIntervalSeconds());
} else {
logger.info("Not registering with Eureka server per configuration");
}
}
复制代码
- 在这个方法从上往下一路注释分析下来,干了EurekaClient我们最想知道的一些事情,定时任务获取注册信息,定时任务刷新缓存,定时任务心跳续约,定时任务同步数据中心数据,状态变化监听回调等。但是唯独没看到注册,这是怎么回事呢?
- instanceInfoReplicator.onDemandUpdate() 就是在状态改变的时候。
public boolean onDemandUpdate() {
if (rateLimiter.acquire(burstSize, allowedRatePerMinute)) {
scheduler.submit(new Runnable() {
public void run() {
logger.debug("Executing on-demand update of local InstanceInfo");
Future latestPeriodic = scheduledPeriodicRef.get();
if (latestPeriodic != null && !latestPeriodic.isDone()) {
logger.debug("Canceling the latest scheduled update, it will be rescheduled at the end of on demand update");
latestPeriodic.cancel(false);
}
InstanceInfoReplicator.this.run();
}
});
return true;
} else {
logger.warn("Ignoring onDemand update due to rate limiter");
return false;
}
}
复制代码
- onDemandUpdate 这个方法,唯独 InstanceInfoReplicator.this.run() 这个方法还有点用,而且还是 run 方法呢,感情 InstanceInfoReplicator 这个类还是实现了 Runnable 接口?经过查看这个类,还真是实现了 Runnable 接口。
- 这个方法应该我们要找的注册所在的地方。
public void run() {
try {
discoveryClient.refreshInstanceInfo();
Long dirtyTimestamp = instanceInfo.isDirtyWithTime();
if (dirtyTimestamp != null) {
discoveryClient.register();
instanceInfo.unsetIsDirty(dirtyTimestamp);
}
} catch (Throwable t) {
logger.warn("There was a problem with the instance info replicator", t);
} finally {
Future next = scheduler.schedule(this, replicationIntervalSeconds, TimeUnit.SECONDS);
scheduledPeriodicRef.set(next);
}
}
复制代码
- discoveryClient.register() 这个 register 方法,原来注册方法就是这个。
boolean register() throws Throwable {
logger.info(PREFIX + appPathIdentifier + ": registering service...");
EurekaHttpResponse httpResponse;
try {
httpResponse = eurekaTransport.registrationClient.register(instanceInfo);
} catch (Exception e) {
logger.warn("{} - registration failed {}", PREFIX + appPathIdentifier, e.getMessage(), e);
throw e;
}
if (logger.isInfoEnabled()) {
logger.info("{} - registration status: {}", PREFIX + appPathIdentifier, httpResponse.getStatusCode());
}
return httpResponse.getStatusCode() == 204;
}
复制代码
- 原来调用了 EurekaHttpClient 封装的客户端请求对象来进行注册的,再继续深探 registrationClient.register 方法,于是我们来到了 AbstractJerseyEurekaHttpClient.register 方法。
public EurekaHttpResponse register(InstanceInfo info) {
String urlPath = "apps/" + info.getAppName();
ClientResponse response = null;
try {
Builder resourceBuilder = jerseyClient.resource(serviceUrl).path(urlPath).getRequestBuilder();
addExtraHeaders(resourceBuilder);
response = resourceBuilder
.header("Accept-Encoding", "gzip")
.type(MediaType.APPLICATION_JSON_TYPE)
.accept(MediaType.APPLICATION_JSON)
.post(ClientResponse.class, info);
return anEurekaHttpResponse(response.getStatus()).headers(headersOf(response)).build();
} finally {
if (logger.isDebugEnabled()) {
logger.debug("Jersey HTTP POST {}/{} with instance {}; statusCode={}", serviceUrl, urlPath, info.getId(),
response == null ? "N/A" : response.getStatus());
}
if (response != null) {
response.close();
}
}
}
复制代码
- *调用的是 Jersey RESTful 框架来进行请求的,然后在 EurekaServer 那边就会在 ApplicationResource.addInstance 方法接收客户端的注册请求,因此我们的 EurekaClient 是如何注册的就到此为止了。
分享资源
获取以上资源请访问开源项目 点击跳转