seata源码-全局事务提交服务端源码

前面的博客中，我们介绍了，发起全局事务时，是如何进行全局事务提交的，这篇博客，主要记录，在seata分布式事务中，全局事务提交的时候，服务端是如何进行处理的

发起全局事务提交操作

事务发起者，在所有分支事务执行完毕之后，如果没有发生异常，会进行全局事务提交

在这里插入图片描述

这里就不做过多解释了，前面seata入门和全局事务begin的博客中，有介绍过这里入参的request对象的重要性

服务端接收到请求

前面全局事务begin的源码，介绍过，netty服务端接收到请求之后，是如何执行到这里的，在这里会根据request请求的类型，交给不同的handler来处理
在这里插入图片描述

io.seata.server.AbstractTCInboundHandler#handle(io.seata.core.protocol.transaction.GlobalCommitRequest, io.seata.core.rpc.RpcContext)

在这里插入图片描述

io.seata.server.coordinator.DefaultCoordinator#doGlobalCommit

在这里插入图片描述

上面也没有太多的业务逻辑，没什么好说的，我们直接来看core.commit()方法的逻辑

io.seata.server.coordinator.DefaultCore#commit


@Override
public GlobalStatus commit(String xid) throws TransactionException {
    /**
     * 1.获取全局session：globalSession
     */
    GlobalSession globalSession = SessionHolder.findGlobalSession(xid);
    if (globalSession == null) {
        return GlobalStatus.Finished;
    }
    /**
     * 2.给globalSession添加监听
     */
    globalSession.addSessionLifecycleListener(SessionHolder.getRootSessionManager());
    // just lock changeStatus
    /**
     * 3.对globalSession进行一些处理
     */
    boolean shouldCommit = SessionHolder.lockAndExecute(globalSession, () -> {
        // Highlight: Firstly, close the session, then no more branch can be registered.
        /**
         * 3.1 globalSession.setActive(false);将全局session的active设置为false
         * 在clean()方法中会把lockTable中本次加锁的记录（分支事务相关锁信息）删除
         */
        globalSession.closeAndClean();
        if (globalSession.getStatus() == GlobalStatus.Begin) {
            /**
             * 对于AT模式，这里永远返回false
             * 对于AT模式，这里会执行if的逻辑，将globalSession的status设置为AsyncCommitting
             */
            if (globalSession.canBeCommittedAsync()) {
                globalSession.asyncCommit();
                return false;
            } else {
                /**
                 * 将globalSession的状态设置为committing状态
                 */
                globalSession.changeStatus(GlobalStatus.Committing);
                return true;
            }
        }
        return false;
    });

    /**
     * 4.通知分支事务进行提交，如果是AT模式，不会进入到这里执行，因为shouldCommit是false
     * 是通过一个异步线程来进行调用doGlobalCommit()方法的
     */
    if (shouldCommit) {
        boolean success = doGlobalCommit(globalSession, false);
        //If successful and all remaining branches can be committed asynchronously, do async commit.
        if (success && globalSession.hasBranch() && globalSession.canBeCommittedAsync()) {
            globalSession.asyncCommit();
            return GlobalStatus.Committed;
        } else {
            return globalSession.getStatus();
        }
    } else {
        return globalSession.getStatus() == GlobalStatus.AsyncCommitting ? GlobalStatus.Committed : globalSession.getStatus();
    }
}

上面贴的这段代码，是netty服务端接收到客户端全局事务请求之后，最核心的一个入口，我们拆开来看

commit

closeAndClean()

上面代码中注释1、2、3就不细看了，是一些不重要的逻辑，我们直接来看3.1这个注释对应的方法
在这个方法中，一个close()方法，一个clean()方法
在这里插入图片描述

close()方法中的逻辑比较简单，就是把globalSession的active属性设置为了false
在这里插入图片描述

我们接着来看clean()方法：
在这里插入图片描述

clean()内部调用的这个方法，如果看过前面全局事务回滚的代码，就会发现这个代码很眼熟，就是把分支事务对应的lockTable中指定的加锁的资源进行释放
在这里插入图片描述

shouldCommit()

我们接着来看shouldCommit参数的赋值逻辑，可以着重看下第三点的注释，这里的shouldCommit参数，如果是AT模式，永远false，原因是在这里
在这里插入图片描述
在canBeCommittedAsync方法中，下面两张截图我们结合起来看，如果是AT模式，canBeCommitedAsync()返回的一定是true；如果这里返回true，那上面截图，一定会进入到if()的逻辑中

所以，对于AT模式，shouldCommit一定是false，并且会调用globalSession.asyncCommit();
在这里插入图片描述

这段代码的整体逻辑，就是说，如果当前事务是允许异步处理的，那就给shouldCommit赋值为false，同时把globalSession的status修该为AsyncCommitting；这个状态很重要，这里的意思，我认为是说，当前事务是需要异步处理的，当前代码中，就不同步处理了，接来下的逻辑，可以证明
在这里插入图片描述

可以发现，当是AT模式的时候，直接执行了else的逻辑，那我们接下来看下，对于netty服务端，真正去处理分支事务的代码

init()

这段初始化的代码，和上面commit()有点关联，我们截止到这里，需要知道上面commit()方法，如果是AT模式的时候，只是把当前globalSession的状态改成了AsyncCommitting状态

io.seata.server.coordinator.DefaultCoordinator#init

这个方法，是在服务端启动的时候，会在这里初始化一批异步线程，其中有一个和本篇博客有关系
在这里插入图片描述

在这里插入图片描述

查找所有AsyncCommitting状态的globalSession

io.seata.server.storage.db.session.DataBaseSessionManager#allSessions

在这里插入图片描述

这里就不再继续往下贴代码了，逻辑比较简单，可以自己看下，简单来说，就是根据当前入参中的AsyncCommitting，从globalTable中，根据状态进行查询，然后找到所有待异步处理的globalSession

core.doGlobalCommit

io.seata.server.coordinator.DefaultCore#doGlobalCommit

这个方法，是服务端进行全局事务提交的处理逻辑，中间绕了这么一大圈，现在逻辑应该有点清晰明了了，其实就是在netty服务端接收到同步请求的时候，只会先把lockTable中加锁的数据删除，然后修改globalSession的状态，最后通过异步定时执行的线程池去执行全局事务提交的逻辑

public boolean doGlobalCommit(GlobalSession globalSession, boolean retrying) throws TransactionException {
    boolean success = true;
    // start committing event
    eventBus.post(new GlobalTransactionEvent(globalSession.getTransactionId(), GlobalTransactionEvent.ROLE_TC,
        globalSession.getTransactionName(), globalSession.getBeginTime(), null, globalSession.getStatus()));

    if (globalSession.isSaga()) {
        success = getCore(BranchType.SAGA).doGlobalCommit(globalSession, retrying);
    } else {
        for (BranchSession branchSession : globalSession.getSortedBranches()) {
            // if not retrying, skip the canBeCommittedAsync branches
            if (!retrying && branchSession.canBeCommittedAsync()) {
                continue;
            }

            BranchStatus currentStatus = branchSession.getStatus();
            /**
             * 如果二阶段分支事务状态是失败，就无需执行下面的逻辑，直接remove即可
             */
            if (currentStatus == BranchStatus.PhaseOne_Failed) {
                globalSession.removeBranch(branchSession);
                continue;
            }
            try {
                /**
                 * 这里是服务端在全局事务提交的时候，会通知RM去对本地的branch事务进行处理，是通过netty去交互的
                 * 1.如果RM删除成功，就将branchSession移除，并释放锁
                 * 2.如果删除失败
                 */
                BranchStatus branchStatus = getCore(branchSession.getBranchType()).branchCommit(globalSession, branchSession);

                switch (branchStatus) {
                    case PhaseTwo_Committed:
                        globalSession.removeBranch(branchSession);
                        continue;
                    case PhaseTwo_CommitFailed_Unretryable:
                        if (globalSession.canBeCommittedAsync()) {
                            LOGGER.error(
                                "Committing branch transaction[{}], status: PhaseTwo_CommitFailed_Unretryable, please check the business log.", branchSession.getBranchId());
                            continue;
                        } else {
                            SessionHelper.endCommitFailed(globalSession);
                            LOGGER.error("Committing global transaction[{}] finally failed, caused by branch transaction[{}] commit failed.", globalSession.getXid(), branchSession.getBranchId());
                            return false;
                        }
                    default:
                        if (!retrying) {
                            globalSession.queueToRetryCommit();
                            return false;
                        }
                        if (globalSession.canBeCommittedAsync()) {
                            LOGGER.error("Committing branch transaction[{}], status:{} and will retry later",
                                branchSession.getBranchId(), branchStatus);
                            continue;
                        } else {
                            LOGGER.error(
                                "Committing global transaction[{}] failed, caused by branch transaction[{}] commit failed, will retry later.", globalSession.getXid(), branchSession.getBranchId());
                            return false;
                        }
                }
            } catch (Exception ex) {
                StackTraceLogger.error(LOGGER, ex, "Committing branch transaction exception: {}",
                    new String[] {branchSession.toString()});
                if (!retrying) {
                    globalSession.queueToRetryCommit();
                    throw new TransactionException(ex);
                }
            }
        }
        //If has branch and not all remaining branches can be committed asynchronously,
        //do print log and return false
        if (globalSession.hasBranch() && !globalSession.canBeCommittedAsync()) {
            LOGGER.info("Committing global transaction is NOT done, xid = {}.", globalSession.getXid());
            return false;
        }
    }
    //If success and there is no branch, end the global transaction.
    /**
     * 这里的sessionHelper.encCommitted会将globalSession中的记录删除
     */
    if (success && globalSession.getBranchSessions().isEmpty()) {
        SessionHelper.endCommitted(globalSession);

        // committed event
        eventBus.post(new GlobalTransactionEvent(globalSession.getTransactionId(), GlobalTransactionEvent.ROLE_TC,
            globalSession.getTransactionName(), globalSession.getBeginTime(), System.currentTimeMillis(),
            globalSession.getStatus()));

        LOGGER.info("Committing global transaction is successfully done, xid = {}.", globalSession.getXid());
    }
    return success;
}

上面这段代码，是全局事务提交的逻辑，本质上，和全局事务回滚的逻辑，没太大的区别，只是底层一个调用的是commit，一个调用的是rollback；针对上面这段代码，我们需要着重关注的是：branchCommit这个方法，这个方法，我们下面单独说，先来看下这段代码的逻辑

getCore(branchSession.getBranchType()).branchCommit(globalSession, branchSession); 通过这个方法，发起netty请求，通知各个分支事务，进行全局事务的提交
根据分支事务的返回状态，进行不同的处理；如果分支事务处理成功，在seata服务端这里，会把分支事务删除，并且把内存中的分支事务id删除（在seata服务端，每个全局事务都会维护一个集合，存储当前全局事务对应的所有分支事务）
在所有的分支事务，处理完毕之后，SessionHelper.endCommitted(globalSession); 通过这个方法，结束所有的逻辑，其实就是把globalSession从mysql数据库中删除

针对第二点和第三点，就不点进去细看了，其实本质上和全局事务回滚时，做的逻辑几乎上是一致的，我们主要来看下第一点，分支事务是如何处理分支事务提交的逻辑，因为这段逻辑和分支事务回滚的逻辑不一样

分支事务提交

io.seata.rm.datasource.AsyncWorker#branchCommit

这是rm这一端接收到分支事务提交的处理逻辑，但是我们会发现，这段代码很简单：
在这里插入图片描述
这里可以看到，只是将当前请求信息，塞到了一个queue中

io.seata.rm.datasource.AsyncWorker#init

这里可以看到，在rm启动的时候，会初始化一个定时执行的线程池，在这个线程池中，会定时的调用doBranchCommit()方法
在这里插入图片描述

io.seata.rm.datasource.AsyncWorker#doBranchCommits

private void doBranchCommits() {
    if (ASYNC_COMMIT_BUFFER.isEmpty()) {
        return;
    }

    /**
     * 1.mappedContexts存储的是从阻塞队列中获取到的要处理的分支事务
     */
    Map<String, List<Phase2Context>> mappedContexts = new HashMap<>(DEFAULT_RESOURCE_SIZE);
    List<Phase2Context> contextsGroupedByResourceId;
    while (!ASYNC_COMMIT_BUFFER.isEmpty()) {
        Phase2Context commitContext = ASYNC_COMMIT_BUFFER.poll();
        contextsGroupedByResourceId = CollectionUtils.computeIfAbsent(mappedContexts, commitContext.resourceId, key -> new ArrayList<>());
        contextsGroupedByResourceId.add(commitContext);
    }

    /**
     * 2.遍历mappedContexts
     */
    for (Map.Entry<String, List<Phase2Context>> entry : mappedContexts.entrySet()) {
        Connection conn = null;
        DataSourceProxy dataSourceProxy;
        try {
            try {
                DataSourceManager resourceManager = (DataSourceManager) DefaultResourceManager.get()
                    .getResourceManager(BranchType.AT);
                dataSourceProxy = resourceManager.get(entry.getKey());
                if (dataSourceProxy == null) {
                    throw new ShouldNeverHappenException("Failed to find resource on " + entry.getKey());
                }
                conn = dataSourceProxy.getPlainConnection();
            } catch (SQLException sqle) {
                LOGGER.warn("Failed to get connection for async committing on " + entry.getKey(), sqle);
                continue;
            }
            contextsGroupedByResourceId = entry.getValue();
            Set<String> xids = new LinkedHashSet<>(UNDOLOG_DELETE_LIMIT_SIZE);
            Set<Long> branchIds = new LinkedHashSet<>(UNDOLOG_DELETE_LIMIT_SIZE);
            /**
             * 3.判断要处理的分支事务数量是否达到了批量处理的阈值
             * 如果到了，就批量进行删除
             * 否则的话，就清空xids和branchIds 然后return
             */
            for (Phase2Context commitContext : contextsGroupedByResourceId) {
                xids.add(commitContext.xid);
                branchIds.add(commitContext.branchId);
                int maxSize = Math.max(xids.size(), branchIds.size());
                /**
                 * 并不是在每次全局事务提交的时候，就会执行下面的sql
                 * 而是在达到一定的阈值的时候，才会批量执行，阈值默认是1000
                 *
                 * 删除undoLog日志
                 */
                if (maxSize == UNDOLOG_DELETE_LIMIT_SIZE) {
                    try {
                        UndoLogManagerFactory.getUndoLogManager(dataSourceProxy.getDbType()).batchDeleteUndoLog(
                            xids, branchIds, conn);
                    } catch (Exception ex) {
                        LOGGER.warn("Failed to batch delete undo log [" + branchIds + "/" + xids + "]", ex);
                    }
                    xids.clear();
                    branchIds.clear();
                }
            }

            if (CollectionUtils.isEmpty(xids) || CollectionUtils.isEmpty(branchIds)) {
                return;
            }

            try {
                UndoLogManagerFactory.getUndoLogManager(dataSourceProxy.getDbType()).batchDeleteUndoLog(xids,
                    branchIds, conn);
            } catch (Exception ex) {
                LOGGER.warn("Failed to batch delete undo log [" + branchIds + "/" + xids + "]", ex);
            }

            if (!conn.getAutoCommit()) {
                conn.commit();
            }
        } catch (Throwable e) {
            LOGGER.error(e.getMessage(), e);
            try {
                if (conn != null) {
                    conn.rollback();
                }
            } catch (SQLException rollbackEx) {
                LOGGER.warn("Failed to rollback JDBC resource while deleting undo_log ", rollbackEx);
            }
        } finally {
            if (conn != null) {
                try {
                    conn.close();
                } catch (SQLException closeEx) {
                    LOGGER.warn("Failed to close JDBC resource while deleting undo_log ", closeEx);
                }
            }
        }
    }
}

上面这端代码，我们可以看下，没1000ms执行一次，如果这1000ms之内，queue中待指定分支事务commit的请求达到了一定的阈值(UNDOLOG_DELETE_LIMIT_SIZE)，就会执行commit请求（UndoLogManagerFactory.getUndoLogManager(dataSourceProxy.getDbType()).batchDeleteUndoLog(
xids, branchIds, conn);）；如果没有达到阈值，也会去执行commit的操作

UndoLogManagerFactory.getUndoLogManager 这里执行提交的逻辑，其实就是把undoLog数据给删除，因为对于全局事务提交时，其实在第一阶段，mysql已经真正执行了commit的操作，在第二阶段，只需要把undoLog给删除即可

总结

以上，就是全局事务提交的逻辑，整体看下来，我们可以发现，对于全局事务提交的时候，分支事务在处理的时候，是异步来处理的，这是和回滚逻辑有很大的区别，因为上篇博客中，我们有看到，全局事务回滚时，分支事务在第二阶段，是同步处理的，在接收到请求之后，会根据undoLog生成回滚sql，并执行，然后删除undoLog数据，但是对于全局事务提交的第二阶段，会发现，接收到请求之后，直接塞到了队列中，通过异步的请求，没1000ms执行一次提交的逻辑