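Literature Digest: Deep Learning for Breast Cancer Diagnosis - Deep learning-based image analysis predicts PD-L1 status from H&E-stained histopathology images in breast cancer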

news2025/1/18 6:17:52

Title

Deep learning-based image analysis predicts PD-L1 status from H&E-stained histopathology images in breast cancer

01

Introduction

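Programmed death ligand-1 (PD-L1) has recently been adopted in breast cancer as a predictive biomarker for immunotherapy. The cost, time, and variability of PD-L1 quantification by immunohistochemistry (IHC) are a challenge. In contrast, hematoxylin and eosin (H&E) is a robust stain used routinely for cancer diagnosis. Here, we show that PD-L1 expression can be predicted from H&E-stained images by employing state-of-the-art deep learning techniques. With the help of two expert pathologists and purpose-built annotation software, we constructed a dataset to assess the feasibility of predicting PD-L1 from breast cancer H&E. In a cohort of 3,376 patients, our system predicts PD-L1 status with an area under the curve (AUC) of 0.91 to 0.93. The system was validated on two external datasets, including an independent clinical trial cohort, and showed consistent prediction performance. Furthermore, the proposed system predicts which cases are prone to misinterpretation by pathologists, showing that it can serve as a decision support and quality assurance system in clinical practice.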

Results

PD-L1 in the BCCA and MA31 cohorts

The study was based on breast cancer tissue samples and clinicopathological data of 5596 patients with 26,763 TMA images from two independent cohorts: the British Columbia Cancer Agency (BCCA) and the MA31 (Table 1). The BCCA cohort is composed of 4,944 women with newly diagnosed invasive breast cancer in British Columbia, whose tumor specimens were processed by a central laboratory at Vancouver General Hospital between 1986 and 1992. Each woman had three H&E-stained TMA cores, one IHC-stained TMA for PD-L1, and one for PD-1.

The MA31 cohort is a clinical trial of the Canadian Cancer Trials Group, conducted from January 17, 2008, through December 1, 2011, and was designed to evaluate the prognostic and predictive biomarker utility of pretreatment serum PD-L1 levels. This cohort consists of 652 recruited patients with ERBB2-positive metastatic breast cancer from 21 countries. Each woman had between 1 and 4 H&E-stained images, and one PD-L1-stained image corresponding to each H&E image (Table 1).

An expert pathologist annotated the entire data, consisting of both BCCA and MA31 cohorts, for PD-L1 positive or negative status.

Methods

Characteristics of the patients and the stains

The dataset used in this study consists of two independent cohorts: BCCA and MA31 (Table 1). Each cohort contains breast cancer tissue samples and clinicopathological data with TMA images. Each patient in the BCCA cohort had 3 H&E-stained TMA cores, one IHC-stained TMA for PD-L1, and one for PD-1. Each patient in the MA31 cohort had between 1 and 4 H&E-stained images, and one PD-L1-stained image corresponding to each H&E image. An expert pathologist annotated the data for PD-L1 positive or negative status by going through all available H&E and IHC-stained TMA images (Fig. 1a). Some of the samples were annotated to be excluded from the analysis (Table 1), while the rest of the patients were classified as either negative or positive for PD-L1 status. In BCCA, median follow-up was 12.4 years and age at diagnosis was 62 years. In MA31, median follow-up was 21.5 months and mean age at diagnosis was 55 years. The TMA images from both cohorts contain 0.6-mm-diameter cores and were scanned using the Bacus Laboratories, Inc. Slide Scanner (Bliss) at a resolution of 2256 × 1440 pixels.

Fig

Fig. 1 | Overview of the proposed framework. The annotation, training, and inference methodologies. a An expert pathologist used our designed computer-aided annotation software to annotate patients for PD-L1 status, based on their H&E and corresponding IHC-stained TMA images. Patients with no TMAs, unclear images, deficient staining, or insufficient tissue or tumor were excluded from the analysis. The rest of the patients were each assigned a PD-L1 positive or negative label, resulting in 2516 annotated patients in the BCCA training set, 860 in the BCCA test set, and 275 in the MA31 external test set. b H&E images of the included patients were assigned the annotation of their corresponding patients. The H&E images in the BCCA training set were used to train and validate the CNN in a 5-fold cross-validation manner, using the ground truth PD-L1 annotations. The model was then applied to the validation folds, the BCCA test set, and the external MA31 test set, to produce a prediction score for each H&E image. The prediction score per patient was defined as the maximum over its corresponding H&E prediction scores. The prediction scores at the patient level were then compared to the ground truth PD-L1 annotations to produce statistical analyses.
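The patient-level aggregation described in Fig. 1b (the maximum over a patient's per-image prediction scores) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the function name `patient_scores`, the patient IDs, and the scores are all hypothetical.

```python
from collections import defaultdict


def patient_scores(image_predictions):
    """Aggregate per-image scores into one score per patient using the maximum.

    image_predictions: iterable of (patient_id, score) pairs, one per H&E image.
    Returns a dict mapping patient_id to the highest score among its images.
    """
    per_patient = defaultdict(list)
    for patient_id, score in image_predictions:
        per_patient[patient_id].append(score)
    return {pid: max(scores) for pid, scores in per_patient.items()}


# Hypothetical scores: three H&E cores for patient "A", one for patient "B".
example = [("A", 0.12), ("A", 0.81), ("A", 0.40), ("B", 0.05)]
print(patient_scores(example))
```

Taking the maximum means a single high-scoring core is enough to give the patient a high score, which seems consistent with the emphasis on sensitivity elsewhere in the paper.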

Fig. 2 | Convolutional neural networks achieve high performance in the prediction of PD-L1 and PD-1 expression. Receiver operating characteristic (ROC) curves for the performance of the proposed models, in terms of AUC, for PD-L1 and PD-1 prediction in the BCCA and MA31 cohorts. a The model obtained high prediction accuracies for both the BCCA cross-validation (0.911) and BCCA test set (0.915). When analyzing only concordant cases between pathologists, AUC performance was further increased (0.928). b For the external MA31 cohort, the performance dropped to 0.854, showing that a calibration step may benefit the application of the system to new cohorts. Indeed, the calibration step increased the AUC on MA31 to 0.886, which was further increased to 0.919 after removing the discordant cases. c The AUC performance results for PD-1 prediction were lower than for PD-L1. The PD-1 AUC results were high, however, given the extremely imbalanced nature of the data (only 3% positives), which poses optimization difficulties due to the very few positive samples available to train the system.
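For readers less familiar with the AUC values reported in Fig. 2, the metric can be read as the probability that a randomly chosen positive patient receives a higher prediction score than a randomly chosen negative one. The sketch below computes it directly from that definition; it is a minimal illustration with hypothetical labels and scores, and the helper name `roc_auc` is an assumption, not something taken from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 0/1 ground-truth status per patient.
    scores: prediction score per patient. Tied pairs count as half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 2 positives, 3 negatives; 5 of the 6 pairs are ordered correctly.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.6, 0.1]
print(roc_auc(labels, scores))
```

The pairwise loop is quadratic and meant only to make the definition visible; a library routine would normally be used on cohorts of this size.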

Fig. 3 | Impact of the proposed system on clinical practice. a The threshold for splitting the patients' prediction scores into low and high is tuned in the BCCA cross-validation. Bottom: The sorted prediction scores of the patients, versus the percentage of patients classified below the threshold. Top: The cross-validation sensitivity of the system, versus the percentage of patients classified below the threshold (i.e., classified as low-PS), showing a trade-off between the two. The threshold was selected as 0.5, resulting in a sensitivity of 0.92 for BCCA-CV with 58% of the patients in the low-PS group. b Applying the selected threshold to the BCCA test patients (top) and MA31 patients (bottom). Following the system's predictions allows the pathologists to focus on reviewing the cases classified as low-PS by the system and positive by the pathologist, which may be prone to misinterpretation or deficient PD-L1 staining. After removing the discordant cases from the analysis, the sensitivity was increased (BCCA-test-con and MA31-con), revealing the inter-pathologist variability. In addition to quality assurance, the system could be used to allow pathologists to spare IHC staining and interpretation from more than 70% of the patients while retaining 100% sensitivity for PD-L1 expression in MA31.
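The trade-off in Fig. 3a between sensitivity and the share of patients placed in the low-PS group can be illustrated with a small sketch. The function name `triage_summary`, the sample data, and the convention that a score exactly at the threshold counts as high-PS are assumptions made for illustration; the paper only states that the threshold was selected as 0.5.

```python
def triage_summary(labels, scores, threshold=0.5):
    """Return (sensitivity, fraction of patients classified as low-PS).

    labels: 0/1 pathologist PD-L1 status per patient.
    scores: patient-level prediction scores.
    A patient is low-PS when the score is strictly below the threshold
    (the handling of scores equal to the threshold is an assumption).
    """
    if not labels or len(labels) != len(scores):
        raise ValueError("labels and scores must be non-empty and equal length")
    positives = sum(labels)
    if positives == 0:
        raise ValueError("sensitivity is undefined without positive patients")
    detected = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    low_ps = sum(1 for s in scores if s < threshold)
    return detected / positives, low_ps / len(scores)


# Hypothetical cohort of 8 patients, 3 of them PD-L1 positive.
labels = [1, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.7, 0.2, 0.6, 0.1, 0.3, 0.4, 0.05]
sensitivity, low_fraction = triage_summary(labels, scores)
print(sensitivity, low_fraction)
```

Raising the threshold moves more patients into the low-PS group (fewer IHC stains to review) at the cost of missing more positives, which is the trade-off the figure describes.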

Fig. 4 | Low-prediction score cases classified positive. Tissue images of patients classified positive by the first pathologist and low-PS by the system. The BCCA-test patients are shown on the left (by PD-L1) and right (by PD-1), and the MA31 patients are shown in the middle (by PD-L1). For each patient, a representative H&E image and its corresponding IHC image are displayed one below the other. The classification of the second pathologist is registered below each sample, showing that most of the low-PS cases that were classified positive by the first pathologist were classified otherwise by the second one.

Fig. 5 | t-SNE embedding for visualization of feature space. a A 2D visualization of the image feature vectors by applying t-SNE. Each point represents a single patient in the BCCA test set. The t-SNE embedding maps patients with similar image features to near points, and patients with dissimilar image features to far points. The points are colored by the PD-L1 prediction scores of their corresponding patients. The 8 patients that were classified positive by the first pathologist and low-PS by the system are marked and their classifications by both pathologists are noted. b The TMA images corresponding to the t-SNE embedding are presented. Several examples of low and high prediction score images are shown, to demonstrate the characteristics observed by the pathologists. Examples of partially missing tissues are shown at the bottom.

Table

Table 1 | Patients and TMAs included and excluded in each data group

Table 2 | Summary of the system’s performance and statistics for PD-L1 and PD-1 in the BCCA and MA31 cohorts

Table 3 | a Concordance matrix for the agreement of the two expert pathologists for PD-L1 status in the MA31 cohort, at patient level. b: Re-classification of a second pathologist for PD-L1 and PD-1 status in the BCCA and MA31 cohorts
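A patient-level concordance matrix of the kind summarized in Table 3a can be built along the following lines. This is a hedged sketch: the function names `concordance_matrix` and `agreement_rate` and the labels below are hypothetical, and the actual counts in Table 3 are not reproduced here.

```python
def concordance_matrix(first, second):
    """2x2 counts of agreement between two pathologists.

    first, second: 0/1 PD-L1 labels per patient from each pathologist.
    matrix[a][b] is the number of patients labeled a by the first
    pathologist and b by the second.
    """
    if len(first) != len(second):
        raise ValueError("both pathologists must label the same patients")
    matrix = [[0, 0], [0, 0]]
    for a, b in zip(first, second):
        matrix[a][b] += 1
    return matrix


def agreement_rate(matrix):
    """Fraction of patients on which the two pathologists agree (the diagonal)."""
    total = sum(sum(row) for row in matrix)
    return (matrix[0][0] + matrix[1][1]) / total


# Hypothetical labels for six patients; the pathologists disagree on one.
first = [1, 0, 0, 1, 0, 1]
second = [1, 0, 0, 0, 0, 1]
matrix = concordance_matrix(first, second)
print(matrix, agreement_rate(matrix))
```

The off-diagonal cells correspond to the discordant cases that the figure captions describe removing before recomputing AUC and sensitivity.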
