mistyR official tutorial: spatial transcriptomics


Modeling spatially resolved omics with mistyR • mistyR (saezlab.github.io)

mistyR and data formats • mistyR (saezlab.github.io)

Heidelberg University and Heidelberg University Hospital, Heidelberg, Germany
Jožef Stefan Institute, Ljubljana, Slovenia
jovan.tanevski@uni-heidelberg.de

2023-07-26

Source: vignettes/mistyR.Rmd

Introduction

The use of mistyR is conceptualized around building a workflow for analysis of spatial omics data by four classes of functions:

  • View composition

  • Model training

  • Result processing

  • Plotting

mistyR is designed with pipe operators in mind (for example the operator %>% from magrittr), so that a workflow can be constructed by chaining functions.

When loading mistyR please consider configuring a future::multisession() parallel execution plan. mistyR will then use all available cores for execution of computationally demanding functions. It is recommended that the user modifies the future::plan() according to their needs.

# MISTy
library(mistyR)
library(future)

# data manipulation
library(dplyr)
library(purrr)
library(distances)

# plotting
library(ggplot2)

plan(multisession)
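If using all available cores is not desirable, the plan can be restricted. The following is a minimal sketch that only uses functions from the future package; leaving one core free is an illustrative choice, not a recommendation from the mistyR authors.

```r
library(future)

# Illustrative choice: leave one core free for other work (at least 1 worker).
n_requested <- max(1, availableCores() - 1)
plan(multisession, workers = n_requested)

# Check how many workers the current plan provides.
n_workers <- nbrOfWorkers()
n_workers

# Return to sequential execution when parallel execution is no longer needed.
plan(sequential)
```

After the analysis, plan(sequential) shuts down the background R sessions started by multisession.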

The following example uses the synthetically generated benchmark data synthetic that is included in the package. The dataset is a list of 10 tibbles, each representing data generated from a random layout of four cell types and empty space on a 100-by-100 grid.

The data was generated by simulating a two-dimensional cellular automata model that focuses on signaling events. The model simulates the production, diffusion, degradation and interactions of 11 molecular species. Note that the dataset contains simulated measurements only for the non-empty spaces.

data("synthetic")

ggplot(synthetic[[1]], aes(x = col, y = row, color = type)) +
  geom_point(shape = 15, size = 0.7) +
  scale_color_manual(values = c("#e9eed3", "#dcc38d", "#c9e2ad", "#a6bab6")) +
  theme_void()


str(synthetic[[1]], give.attr = FALSE)
#> spc_tbl_ [4,205 × 14] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
#>  $ row  : num [1:4205] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ col  : num [1:4205] 100 11 13 14 15 20 23 24 26 32 ...
#>  $ ECM  : num [1:4205] 0.0385 0.0327 0.1444 0.387 0.1635 ...
#>  $ ligA : num [1:4205] 0.834 0.119 0.525 0.269 0.195 ...
#>  $ ligB : num [1:4205] 0.0157 0.0104 0.014 0.0367 0.1176 ...
#>  $ ligC : num [1:4205] 0.236 0.804 0.334 0.502 0.232 ...
#>  $ ligD : num [1:4205] 1.183 0.101 0.434 0.241 0.203 ...
#>  $ protE: num [1:4205] 1.18 0 1.67 0 0 ...
#>  $ protF: num [1:4205] 2.547 0.386 1.614 0.913 0.162 ...
#>  $ prodA: num [1:4205] 0.382 0 0.472 0 0 ...
#>  $ prodB: num [1:4205] 0 0 0 0 0.16 ...
#>  $ prodC: num [1:4205] 0 0.536 0 0.418 0 ...
#>  $ prodD: num [1:4205] 0.588 0 0.379 0 0 ...
#>  $ type : chr [1:4205] "CT1" "CT2" "CT1" "CT2" ...

For more information about the underlying model and data generation see help(synthetic) or the publication.

View composition

The mistyR workflow always starts by defining an intraview (create_initial_view()) containing measurements of the markers that are the target of the modeling at each cell of interest. For the first sample from the synthetic dataset we select all markers except for the ligands for all available cells.

expr <- synthetic[[1]] %>% select(-c(row, col, type, starts_with("lig")))

misty.intra <- create_initial_view(expr)

summary(misty.intra)
#>                Length Class  Mode     
#> intraview      2      -none- list     
#> misty.uniqueid 1      -none- character
summary(misty.intra$intraview)
#>        Length Class       Mode     
#> abbrev 1      -none-      character
#> data   7      spec_tbl_df list

From the intrinsic view (intraview), mistyR will model the expression of each marker as a function of the expression of other markers within the cell. We are interested in exploring marker expressions coming from different spatial contexts that are complementary, i.e., that are distinguishable and contribute to the explanation of the overall expression of the markers.

mistyR includes two default helper functions for calculating and adding views that take into account the spatial context of the data: add_juxtaview() and add_paraview(). The juxtaview represents a local spatial view and captures the expression of all markers available in the intraview within the immediate neighborhood of each cell. The paraview captures the expression of all markers available in the intraview in the broader tissue structure, where the importance of the influence is proportional to the inverse of the distance between two cells. To add a paraview to the view composition, we first need information about the location of each cell from the intraview. Using this information we can create and add a paraview with an importance radius of 10 to the view composition.

pos <- synthetic[[1]] %>% select(row, col)

misty.views <- misty.intra %>% add_paraview(pos, l = 10)
#> 
#> Generating paraview

summary(misty.views)
#>                Length Class  Mode     
#> intraview      2      -none- list     
#> misty.uniqueid 1      -none- character
#> paraview.10    2      -none- list
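To make the idea of a distance-weighted view concrete, the following base R sketch aggregates a single marker over all other cells, with closer cells weighted more heavily. It is a toy illustration only: the Gaussian-shaped weights and the variable names are assumptions made here, and the weighting actually used by add_paraview() is described in help(add_paraview).

```r
# Toy data: four cells with grid positions and one measured marker.
pos_toy <- data.frame(row = c(1, 1, 2, 5), col = c(1, 2, 1, 5))
marker <- c(1, 2, 3, 4)

# Pairwise Euclidean distances between all cells.
d <- as.matrix(dist(pos_toy))

# ASSUMPTION: Gaussian-shaped weights with radius l (hypothetical choice for
# this sketch; see help(add_paraview) for the kernel mistyR actually uses).
l <- 10
w <- exp(-d^2 / l^2)
diag(w) <- 0 # a cell does not contribute to its own spatial context

# Weighted sum of the marker over all other cells, one value per cell.
para_marker <- as.vector(w %*% marker)
para_marker
```

Increasing l flattens the weights so that distant cells contribute more; decreasing it concentrates the view on the immediate surroundings.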

The calculation of a juxtaview and a paraview can be computationally intensive when there is a large number of cells in the sample. Therefore the calculation is run in parallel according to the configured future::plan(). The computational time needed for the calculation of the paraview can also be significantly reduced by approximation. Refer to the documentation of this function (help(add_paraview)) for more details.

Other relevant and custom views can be created (create_view()) from an external resource (data.frame, tibble) and added to the view composition. The data should contain one row per cell, in the same order as in the intraview. For example we can create a view that captures the mean expression of the 10 nearest neighbors of each cell.

# find the 10 nearest neighbors
neighbors <- nearest_neighbor_search(distances(as.matrix(pos)), k = 11)[-1, ]

# calculate the mean expression of the nearest neighbors for all markers
# for each cell in expr
nnexpr <- seq_len(nrow(expr)) %>%
  map_dfr(~ expr %>%
    slice(neighbors[, .x]) %>%
    colMeans())

nn.view <- create_view("nearest", nnexpr, "nn")

nn.view
#> $nearest
#> $nearest$abbrev
#> [1] "nn"
#> 
#> $nearest$data
#> # A tibble: 4,205 × 7
#>      ECM protE protF  prodA  prodB  prodC  prodD
#>    <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
#>  1 0.169 0.337 1.07  0.120  0.0138 0.146  0.165 
#>  2 0.346 0.676 0.549 0.0969 0.0140 0.190  0.0766
#>  3 0.219 0.304 0.495 0.0496 0.0288 0.236  0.0387
#>  4 0.238 0.607 0.651 0.132  0.0288 0.0954 0.122 
#>  5 0.313 0.688 0.835 0.166  0.0297 0.0837 0.173 
#>  6 0.527 0.743 0.616 0.0722 0.0184 0.135  0.0964
#>  7 0.278 0.399 0.501 0.0413 0.0632 0.160  0.0604
#>  8 0.266 0.537 0.624 0.0738 0.0463 0.154  0.117 
#>  9 0.356 0.564 0.565 0.0696 0.0415 0.208  0.106 
#> 10 0.625 0.863 0.458 0.0350 0.0823 0.230  0.0576
#> # ℹ 4,195 more rows
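The neighbor-mean calculation can be checked by hand on a tiny example. The sketch below uses base R only (dist() instead of the distances package) and hypothetical toy data with k = 2, so it illustrates the computation rather than reproducing the result above.

```r
# Toy data: four cells on a line and two markers (hypothetical values).
pos_toy <- data.frame(row = c(1, 1, 1, 1), col = c(1, 2, 3, 10))
expr_toy <- data.frame(A = c(1, 2, 3, 4), B = c(10, 20, 30, 40))
k <- 2

# Pairwise Euclidean distances between cells.
d <- as.matrix(dist(pos_toy))

# For each cell: sort all cells by distance, drop the cell itself,
# keep the k closest and average their marker values.
nn_mean <- t(sapply(seq_len(nrow(pos_toy)), function(i) {
  idx <- setdiff(order(d[i, ]), i)[seq_len(k)]
  colMeans(expr_toy[idx, , drop = FALSE])
}))

nn_mean
```

The first cell (col = 1) has the cells at col = 2 and col = 3 as its two nearest neighbors, so its row in nn_mean holds the means of their values, 2.5 for A and 25 for B. The result has one row per cell in the original order, which is the layout create_view() expects.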

The created view(s) can be added (add_views()) to an existing view composition one by one or by providing them in the form of a list. Other examples of creating and adding custom views to the composition can be found in the resources in See also.

extended.views <- misty.views %>% add_views(nn.view)

summary(extended.views)
#>                Length Class  Mode     
#> intraview      2      -none- list     
#> misty.uniqueid 1      -none- character
#> paraview.10    2      -none- list     
#> nearest        2      -none- list

Views can also be removed from the composition by providing one or more names of views to remove_views(). The intraview and misty.uniqueid cannot be removed with this function.

extended.views %>%
  remove_views("nearest") %>%
  summary()
#>                Length Class  Mode     
#> intraview      2      -none- list     
#> misty.uniqueid 1      -none- character
#> paraview.10    2      -none- list

extended.views %>%
  remove_views("intraview") %>%
  summary()
#>                Length Class  Mode     
#> intraview      2      -none- list     
#> misty.uniqueid 1      -none- character
#> paraview.10    2      -none- list     
#> nearest        2      -none- list

Model training

Once the view composition is created, the model training is managed by the function run_misty(). By default, models are trained for each marker available in the intraview for each view independently. The results of the model training will be stored in a folder named “results”.

misty.views %>% run_misty()
#> 
#> Training models
#> [1] "/home/runner/work/mistyR/mistyR/vignettes/results"

The workflow that we used for the first sample from the synthetic dataset can be easily extended to be applied to all 10 samples to completely reproduce one of the results reported in the publication. The results for each sample will be stored in a subfolder of the folder “results”.

result.folders <- synthetic %>% imap_chr(function(sample, name) {
  sample.expr <- sample %>% select(-c(row, col, type, starts_with("lig")))
  sample.pos <- sample %>% select(row, col)
  
  create_initial_view(sample.expr) %>% add_paraview(sample.pos, l = 10) %>%
    run_misty(results.folder = paste0("results", .Platform$file.sep, name))
})
#> 
#> Generating paraview
#> 
#> Training models
#> 
#> Generating paraview
#> 
#> Training models
#> 
#> Generating paraview
#> 
#> Training models
#> 
#> Generating paraview
#> 
#> Training models
#> 
#> Generating paraview
#> 
#> Training models
#> 
#> Generating paraview
#> 
#> Training models
#> 
#> Generating paraview
#> 
#> Training models
#> 
#> Generating paraview
#> 
#> Training models
#> 
#> Generating paraview
#> 
#> Training models
#> 
#> Generating paraview
#> 
#> Training models

result.folders
#>                                                      synthetic1 
#>  "/home/runner/work/mistyR/mistyR/vignettes/results/synthetic1" 
#>                                                     synthetic10 
#> "/home/runner/work/mistyR/mistyR/vignettes/results/synthetic10" 
#>                                                      synthetic2 
#>  "/home/runner/work/mistyR/mistyR/vignettes/results/synthetic2" 
#>                                                      synthetic3 
#>  "/home/runner/work/mistyR/mistyR/vignettes/results/synthetic3" 
#>                                                      synthetic4 
#>  "/home/runner/work/mistyR/mistyR/vignettes/results/synthetic4" 
#>                                                      synthetic5 
#>  "/home/runner/work/mistyR/mistyR/vignettes/results/synthetic5" 
#>                                                      synthetic6 
#>  "/home/runner/work/mistyR/mistyR/vignettes/results/synthetic6" 
#>                                                      synthetic7 
#>  "/home/runner/work/mistyR/mistyR/vignettes/results/synthetic7" 
#>                                                      synthetic8 
#>  "/home/runner/work/mistyR/mistyR/vignettes/results/synthetic8" 
#>                                                      synthetic9 
#>  "/home/runner/work/mistyR/mistyR/vignettes/results/synthetic9"

Note that by default, mistyR caches calculated views and trained models, such that when the workflow is run repeatedly they will be retrieved instead of recalculated, thus saving significant computational time. However, the size of the generated cache files can be large. Therefore, the functions that can work with cached files, such as run_misty(), have a parameter named cached that can be set to FALSE. Additionally, the function clear_cache() provides the means to remove cache files.

Result processing

The raw mistyR results are stored in several text files in the output folder for each analyzed sample. The results from one or more samples can be collected, aggregated and converted to an R object with the function collect_results(), by providing path(s) to folder(s) containing results generated by run_misty().

misty.results <- collect_results(result.folders)
#> 
#> Collecting improvements
#> 
#> Collecting contributions
#> 
#> Collecting importances
#> 
#> Aggregating

summary(misty.results)
#>                        Length Class  Mode
#> improvements           4      tbl_df list
#> contributions          4      tbl_df list
#> importances            5      tbl_df list
#> improvements.stats     5      tbl_df list
#> contributions.stats    6      tbl_df list
#> importances.aggregated 5      tbl_df list

See help(collect_results) for more information on the structure of misty.results.

Plotting

MISTy gives explanatory answers to three general questions. Each question can be answered by looking at the corresponding plot.

1. How much can the broader spatial context explain the expression of markers (in contrast to the intraview)?

This can be observed in the gain in R2 (absolute percentage), or in the relative percentage decrease of RMSE, when using the multiview model in contrast to the single intraview-only model.

misty.results %>%
  plot_improvement_stats("gain.R2") %>%
  plot_improvement_stats("gain.RMSE")
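As a rough intuition for what a gain in R2 measures, the following base R sketch compares a model that uses only an intrinsic predictor with a model that also uses a spatial-context predictor. This is a simplified illustration on simulated data with ordinary linear models evaluated in-sample; it is not how mistyR trains or evaluates its models, and all variable names are hypothetical.

```r
set.seed(42)
n <- 200

# Simulated predictors: one from the cell itself, one from its spatial context.
intra <- rnorm(n)
para <- rnorm(n)

# The target depends on both predictors plus noise.
target <- 0.8 * intra + 0.5 * para + rnorm(n, sd = 0.3)

# Helper: proportion of variance explained by a fitted linear model.
r2 <- function(model) summary(model)$r.squared

intra_only <- lm(target ~ intra)
multi_view <- lm(target ~ intra + para)

# Gain in R2, expressed in absolute percentage points.
gain_r2 <- 100 * (r2(multi_view) - r2(intra_only))
gain_r2
```

A clearly positive gain indicates that the spatial-context predictor explains variance in the target that the intrinsic predictor alone does not.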

We can further inspect the significance of the gain in variance explained, by the assigned p-value of improvement based on cross-validation.

misty.results$improvements %>%
  filter(measure == "p.R2") %>%
  group_by(target) %>% 
  summarize(mean.p = mean(value)) %>%
  arrange(mean.p)
#> # A tibble: 7 × 2
#>   target mean.p
#>   <chr>   <dbl>
#> 1 ECM    0.0184
#> 2 protF  0.0496
#> 3 protE  0.421 
#> 4 prodA  0.460 
#> 5 prodB  0.499 
#> 6 prodC  0.503 
#> 7 prodD  0.505

In general, a significant gain in R2 can be interpreted as follows:

“We can better explain the expression of marker X, when we consider additional views, other than the intrinsic view.”

2. How much do different view components contribute to explaining the expression?

misty.results %>% plot_view_contributions()

As expected, most of the contribution to the prediction of the expression of the markers comes from the intraview. However, for the markers for which we observed a significant improvement in variance explained, we can also observe a proportional estimated contribution of the paraview.

3. What are the specific relations that can explain the contributions?

To explain the contributions, we can visualize the importances of markers coming from each view separately as predictors of the expression of all markers.

First, the intraview importances.

misty.results %>% plot_interaction_heatmap(view = "intra", cutoff = 0.8)

These importances are associated with the relationship between markers in the same cell. As we did not use the information about the cell types in any way during the process of modeling, the significant interactions that we see in the heatmap may come from any of the cell types.

Second, the paraview importances.

misty.results %>% plot_interaction_heatmap(view = "para.10", cutoff = 0.5)

These importances are associated with the relationship between markers in the cell and markers in the broader structure (controlled by our parameter l).

We can observe that some interactions in the paraview might be redundant, i.e., they are also found to be important in the intraview. To focus on the interactions coming from the paraview only we can plot the contrast between these results.

misty.results %>% plot_contrast_heatmap("intra", "para.10", cutoff = 0.5)
#> Warning: Specifying the `id_cols` argument by position was deprecated in tidyr 1.3.0.
#> ℹ Please explicitly name `id_cols`, like `id_cols = -c(view, nsamples)`.
#> ℹ The deprecated feature was likely used in the mistyR package.
#>   Please report the issue at <https://github.com/saezlab/mistyR/issues>.
#> This warning is displayed once every 8 hours.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.
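The idea behind the contrast can be sketched with two small importance matrices: keep the predictor-target pairs that pass the cutoff in the paraview but not in the intraview. The values below are made up for illustration, and the logic is an assumed reading of what the contrast shows, not the package's internal implementation.

```r
markers <- c("ECM", "protE", "protF")

# Hypothetical importances: rows are predictors, columns are targets.
intra_imp <- matrix(
  c(0.0, 1.2, 0.1,
    0.9, 0.0, 0.2,
    0.3, 0.1, 0.0),
  nrow = 3, byrow = TRUE,
  dimnames = list(predictor = markers, target = markers)
)
para_imp <- matrix(
  c(0.2, 1.0, 0.7,
    0.1, 0.3, 0.1,
    0.8, 0.2, 0.1),
  nrow = 3, byrow = TRUE,
  dimnames = list(predictor = markers, target = markers)
)

cutoff <- 0.5

# Pairs that are important in the paraview but not in the intraview.
only_para <- (para_imp >= cutoff) & !(intra_imp >= cutoff)
which(only_para, arr.ind = TRUE)
```

In this toy example the pair with predictor ECM and target protE passes the cutoff in both views and is therefore dropped, while two pairs remain that are specific to the paraview.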

Furthermore, since the predictor and target markers in both views are the same, we can plot the interaction communities that can be extracted from the estimated interaction pairs from the intraview

misty.results %>% plot_interaction_communities("intra")

and the paraview.

misty.results %>% plot_interaction_communities("para.10", cutoff = 0.5)

When interpreting the results and the plots it is important to note that the relationships captured in the importances are not to be assumed or interpreted as linear or causal. Furthermore, the estimated importance of a single predictor-marker pair should not be interpreted in isolation but in the context of the other predictors, since training MISTy models is a multivariate predictive task.

See also

More examples

browseVignettes("mistyR")

Online articles

Publication

Jovan Tanevski, Ricardo Omar Ramirez Flores, Attila Gabor, Denis Schapiro, Julio Saez-Rodriguez. Explainable multiview framework for dissecting spatial relationships from highly multiplexed data. Genome Biology 23, 97 (2022).
