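Advances in Brain Network Analysis: Diagnosis with Large-Scale Self-Supervised Learning | Literature Express: Advanced Deep Learning for Disease Diagnosis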

Title

BrainMass: Advancing Brain Network Analysis for Diagnosis with Large-scale Self-Supervised Learning

Introduction

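Functional magnetic resonance imaging (fMRI), which exploits the blood-oxygen-level-dependent (BOLD) effect, has become an essential tool in neuroscience. It offers a unique opportunity to map the neural substrates of cognition in vivo. Recently, fMRI has been widely used to analyze brain dysfunction and can reveal networks of interacting brain regions. Many brain disorders appear to arise from disruptions of specific brain functions rather than from localized structural lesions. A key outcome of this paradigm is the development of functional brain networks, which estimate neural interactions and temporal synchrony from the correlations between the BOLD signals of different regions of interest (ROIs). These networks have become indispensable for studying brain disorders and for examining the connectome underlying various diseases.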

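In recent years, brain functional network analysis has been strongly shaped by deep learning methods, which characterize the complex interactions among ROIs through nonlinear, deep embedded representations and markedly improve diagnostic performance. These methods include convolutional neural networks (CNNs), graph neural networks (GNNs), and Transformer networks. Despite this progress, a common limitation of these studies is their limited generalizability and adaptability [18]. Most still rely on task-specific models that are constrained by the small number of annotated samples and are difficult to adapt to other tasks. They also lack few-shot or zero-shot learning ability, which limits their use in clinical scenarios where only a few annotated MRI scans are available. Data heterogeneity further limits generalizability.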

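Large-scale self-supervised learning (SSL) is one way to address this problem, as it can produce homogeneous and general-purpose representations. The approach has shown promise, delivering significant performance gains on a variety of downstream tasks in other fields. Unlike conventional pretrained models, foundation models pretrained on large-scale datasets can handle diverse tasks with a single set of model weights. However, developing foundation models for medical image analysis, and for brain networks in particular, remains challenging, mainly because of limited data samples and insufficiently developed self-supervised learning. Existing studies that apply SSL to brain networks have only reached performance comparable to non-SSL methods. A foundation model specific to brain networks is therefore urgently needed.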

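To this end, we aim to fill the gap in brain network foundation models. In this paper, we collect a large multi-center cohort comprising 70,781 samples from 46,686 participants. We also introduce an augmentation method that creates additional brain networks by randomly dropping timepoints from the BOLD signal, yielding pseudo-functional connectivity (pFC). In addition, we propose BrainMass, the first foundation model designed specifically for brain network analysis, which pretrains a Transformer encoder through mask modeling and representation alignment in self-supervised learning:

(1) MRM: Mask-ROI Modeling (MRM) randomly masks some ROIs and predicts the masked features from the remaining ROIs. Specifically, a classification head predicts the meta-label (the index of the masked ROI), and a reconstruction head estimates the features of the masked ROI. This helps capture intra-network dependencies and strengthens regional specificity for downstream tasks.

(2) LRA: Latent Representation Alignment (LRA) uses a two-branch network to extract representations of two pFCs derived from the same BOLD signal and regularizes them toward similar latent embeddings. The design reflects the assumption that augmented brain networks from the same participant should yield similar latent representations.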
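The masking step of MRM can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `mask_rois`, the 15% mask ratio, the use of zeros as the mask placeholder, and the choice to treat each row of the connectivity matrix as one ROI's feature vector are all assumptions made for illustration. The sketch only prepares the inputs and targets that a classification head (ROI index) and a reconstruction head (ROI features) would be trained against.

```python
import numpy as np

def mask_rois(fc, mask_ratio=0.15, rng=None):
    """Randomly mask ROI rows of a (V, V) connectivity matrix.

    Returns the masked matrix, the indices of the masked ROIs (targets for
    an index-classification head), and the original rows (targets for a
    reconstruction head). All names and defaults here are hypothetical.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_rois = fc.shape[0]
    n_mask = max(1, int(round(n_rois * mask_ratio)))
    # Choose distinct ROIs to hide from the encoder.
    masked_idx = np.sort(rng.choice(n_rois, size=n_mask, replace=False))
    masked_fc = fc.copy()
    # Zero placeholder; a learnable mask token is another common choice.
    masked_fc[masked_idx, :] = 0.0
    targets = fc[masked_idx, :].copy()
    return masked_fc, masked_idx, targets

# Toy example: 20 ROIs, connectivity estimated from random signals.
rng = np.random.default_rng(0)
fc = np.corrcoef(rng.standard_normal((100, 20)), rowvar=False)
masked_fc, masked_idx, targets = mask_rois(fc, mask_ratio=0.15, rng=rng)
```

In a full model, the encoder would see `masked_fc`, and the two heads would be trained to recover `masked_idx` and `targets` respectively.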

Abstract

Foundation models pretrained on large-scale datasets via self-supervised learning demonstrate exceptional versatility across various tasks. Because medical data are heterogeneous and hard to collect, this approach is especially beneficial for medical image analysis and neuroscience research, as it streamlines broad downstream tasks without the need for numerous costly annotations. However, there has been limited investigation into brain network foundation models, limiting their adaptability and generalizability for broad neuroscience studies. In this study, we aim to bridge this gap. In particular, (1) we curated a comprehensive dataset by collating images from 30 datasets, which comprises 70,781 samples of 46,686 participants. Moreover, we introduce pseudo-functional connectivity (pFC) to further generate millions of augmented brain networks by randomly dropping certain timepoints of the BOLD signal. (2) We propose the BrainMass framework for brain network self-supervised learning via mask modeling and feature alignment. BrainMass employs Mask-ROI Modeling (MRM) to bolster intra-network dependencies and regional specificity. Furthermore, a Latent Representation Alignment (LRA) module is utilized to regularize augmented brain networks of the same participant, which have similar topological properties, to yield similar latent representations by aligning their latent embeddings. Extensive experiments on eight internal tasks and seven external brain disorder diagnosis tasks show BrainMass's superior performance, highlighting its significant generalizability and adaptability. In addition, BrainMass demonstrates powerful few/zero-shot learning abilities and exhibits meaningful interpretation to various diseases, showcasing its potential use for clinical applications.
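The pFC augmentation described above (randomly dropping timepoints of the BOLD signal, then estimating connectivity) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the function name `pseudo_fc`, the 10% drop rate, the `(timepoints, ROIs)` array layout, and the use of Pearson correlation as the connectivity measure are choices made here for the example.

```python
import numpy as np

def pseudo_fc(bold, drop_rate=0.1, rng=None):
    """Build one pseudo-functional-connectivity matrix.

    bold: array of shape (T, V), T timepoints for V ROIs (assumed layout).
    A random subset of timepoints is dropped, and the Pearson correlation
    between ROI time series is computed on what remains.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_timepoints = bold.shape[0]
    n_keep = max(2, int(round(n_timepoints * (1.0 - drop_rate))))
    # Keep the surviving timepoints in their original temporal order.
    keep = np.sort(rng.choice(n_timepoints, size=n_keep, replace=False))
    # rowvar=False: columns (ROIs) are the variables.
    return np.corrcoef(bold[keep], rowvar=False)

# Toy example: two augmented networks from the same synthetic BOLD signal.
rng = np.random.default_rng(0)
bold = rng.standard_normal((200, 10))
pfc_a = pseudo_fc(bold, drop_rate=0.1, rng=rng)
pfc_b = pseudo_fc(bold, drop_rate=0.1, rng=rng)
```

Because each call drops a different random subset, one scan can yield many slightly different networks, which is how a cohort of tens of thousands of samples can be expanded into millions of augmented inputs.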

Method

The brain functional networks $X$ are derived by mapping processed neuroimages onto a template with $V$ Regions of Interest (ROIs). These networks are symmetric positive definite matrices, $X \in \mathbb{R}^{V \times V}$. For diagnosis purposes, the goal is to develop a mapping function $f: X \to y$, where $y$ represents the predicted diagnosis phenotype for each subject.

In this study, we first generate two pFCs for each participant and feed them into the BrainMass framework to pre-train a brain network Transformer (BrainTF) encoder. During the downstream classification phase, we freeze the BrainTF and use it to extract latent representations, $Z$, for each participant. The learned latent representations are then fed into a Support Vector Machine (SVM) classifier for downstream prediction. This process is shown in Fig. 1.

Note that, in the training phase, BrainMass consists of three components: the MRM network, the online network, and the target network. Each network features a BrainTF encoder with the same architectural design. The BrainTFs in the MRM and online networks share the same weights, while the BrainTF in the target network is updated by an exponential moving average based on the online network.
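The relationship between the online and target branches can be sketched in a few lines. The text states only that the target encoder is updated by an exponential moving average of the online encoder and that the two latent embeddings are aligned; the specific loss below (squared distance between L2-normalized embeddings, as used in BYOL-style training), the momentum value `tau`, and the function names are assumptions made for illustration.

```python
import numpy as np

def alignment_loss(z_online, z_target, eps=1e-8):
    """Mean squared distance between L2-normalized embeddings.

    For unit vectors this equals 2 - 2 * cosine_similarity, so it is 0
    when the embeddings point the same way and 4 when they are opposite.
    (Assumed loss form; the source does not specify it.)
    """
    z_online = z_online / (np.linalg.norm(z_online, axis=-1, keepdims=True) + eps)
    z_target = z_target / (np.linalg.norm(z_target, axis=-1, keepdims=True) + eps)
    return float(np.mean(np.sum((z_online - z_target) ** 2, axis=-1)))

def ema_update(target_params, online_params, tau=0.99):
    """Return new target weights: tau * target + (1 - tau) * online.

    Parameters are plain dicts of arrays here; tau is a hypothetical
    momentum value, not one taken from the paper.
    """
    return {
        name: tau * target_params[name] + (1.0 - tau) * online_params[name]
        for name in target_params
    }

# Toy example: embeddings of two pFCs from the same participants.
z = np.array([[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]])
loss_same = alignment_loss(z, 3.0 * z)   # same direction, different scale
loss_opposite = alignment_loss(z, -z)

target = {"w": np.zeros(2)}
online = {"w": np.ones(2)}
target = ema_update(target, online, tau=0.9)
```

In this kind of setup, gradients would flow only through the online branch, while the target branch changes slowly through `ema_update`, giving the online network a stable embedding to match.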

Conclusion

In this study, we propose BrainMass, the first foundation model specifically designed for brain network analysis and disease diagnosis through functional measurements. BrainMass leverages the MRM and LRA modules to pre-train the Transformer encoder, focusing on intra-network dependencies and bootstrapped regularized latent representations. Our BrainMass model fosters generalizable and homogeneous representations, facilitating a wide range of brain disorder diagnoses using a single model set. Moreover, visualizations of the attention maps and multivariate analysis of the latent representations demonstrate the model's potential emergent ability to discriminate between abnormal and normal states. This highlights its potential for clinical application with robust zero-shot and few-shot learning capabilities. Our study provides new insights into the application of large-scale self-supervised learning in the realm of brain functional network analysis and addresses the lack of large models in brain network analysis.

Results

For comparison, two categories of baseline models are included: those with SSL and those without SSL. The baseline models without SSL include BrainNetCNN, DHGNN, BrainGNN, Semi-GCN, vanilla Transformer (vanillaTF), and BrainNetTransformer (BrainNetTF). For SSL comparisons, powerful SSL frameworks like BYOL and MOCO are included. Furthermore, we considered two existing works: BrainNPT and BrainGSLs.

Figure

Fig. 1: Illustration of (i) the construction of pFC, (ii) the training phase of the BrainMass method, including an MRM module (an MRM network) and an LRA module (an online network and a target network), and (iii) the inference phase of BrainMass.

Fig. 2: The effect of the dropping rate on eight internal tasks.

Fig. 3: The effect of the model size.

Fig. 4: Accuracy on seven external tasks.

Fig. 5: The workflow of zero/few-shot learning for BrainMass.

Fig. 6: Heatmaps of the Transformer encoder attention maps on 7 tasks, including the averaged attention maps (the first row), those of the first layer (the second row), and those of the last layer (the third row). The values in the heatmaps are normalized to the range 0 to 1.

Fig. 7: Visualization of the ten key regions. The key regions are colored according to their corresponding sub-network. Temp: the temporal. Par: the parietal. Cing: the cingulate. Med: the medial. PFC: the prefrontal cortex. pCun: the precuneus. PCC: the posterior cingulate cortex. OFC: the orbital frontal cortex.

Table

TABLE I: Demographic information for the 30 datasets.

TABLE II: Classification results of different approaches on 8 tasks of 6 internal datasets in terms of accuracy (Acc), sensitivity (Sen), and specificity (Spe). SSL indicates that the model is pretrained by self-supervised learning.
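For readers less familiar with the three metrics reported in Table II, the sketch below shows how accuracy, sensitivity, and specificity are computed from binary predictions using their standard definitions. The function name `binary_metrics`, the convention that label 1 marks the patient class, and the toy labels are assumptions for illustration; none of the numbers come from the paper.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity for binary labels.

    Label 1 is treated as the positive (patient) class, 0 as the control
    class; this labeling convention is an assumption of the example.
    """
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = int(np.sum(y_true & y_pred))     # patients correctly detected
    tn = int(np.sum(~y_true & ~y_pred))   # controls correctly rejected
    fp = int(np.sum(~y_true & y_pred))    # controls flagged as patients
    fn = int(np.sum(y_true & ~y_pred))    # patients missed
    acc = (tp + tn) / (tp + tn + fp + fn)
    sen = tp / (tp + fn) if (tp + fn) else float("nan")
    spe = tn / (tn + fp) if (tn + fp) else float("nan")
    return acc, sen, spe

# Toy labels: 4 patients and 4 controls, with one error in each group.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
acc, sen, spe = binary_metrics(y_true, y_pred)
```

Sensitivity and specificity are reported alongside accuracy because diagnostic datasets are often imbalanced, and accuracy alone can hide a model that misses most patients or most controls.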