禁止商业或二改转载,仅供自学使用,侵权必究,如需截取部分内容请后台联系作者!
文章目录
- 介绍
- 加载R包
- 数据下载
- Load data
- Figure 1
- Fig 1B: functional assays adhension
- FIG 1C: Functional assays OPK
- Figure 2
- Fig 2C: Settings and function for heatmap
- Fig 2C (continued): automate comparisons for IgG (1 month)
- Fig 2C (continued): Automate comparisons for IgA (1week)
- Related to Fig 2C - Supp Figure 1: heatmaps for IgG 1wk and IgA 1 mo.
- Fig 2D and E: Column graphs of the #antigens each individual responded to for serum IgG
- Figure 2D and E (continued): Column graph of the #antigens each individual responded to for saliva IgA
- Fig 2F: graphing significant antigens raw values
- Related to Fig 2F - Supplementary Fig S3 and S4: Graph each antigen seperately - raw titres
- Fig 2G : Heatmap of correlation coefficients between fold change in IgA and IgG.
- Fig 3: Luminex data over 3 months
- Fig 4: luminex compare responses to pharyngitis children
- Figure 5: Luminex: Correlation of pre-challenge titre with 1 month fold change:
- Fig 6 and 7: multivariate analysis - data preparation
- Fig 6: Multidimensional scaling
- Fig 6 (continued) : compare MDS1
- Fig 7: Correlation with outcome variables
- Figure 7 (continued) : sPLS-DA
- 系统信息
介绍
【科研绘图系列】R语言绘制SCI论文图合集
加载R包
library(ggplot2)
library(tidyverse)
library(reshape2)
library(ggbeeswarm)
library(pheatmap)
library(RColorBrewer)
library(viridis)
library(vegan)
library(ggpubr)
library(gghalves)
library(Polychrome)
library(uwot)
library(mixOmics)
library(cowplot)
数据下载
所需要的数据的下载链接:
- 百度网盘链接: 从百度网盘下载
- 提取码: t2hs
Load data
load("RawData.RData")
Figure 1
The following code can be used to generate figures for the saliva adhesion assay and opsonic phagocytosis assay (OPK)
Fig 1B: functional assays adhension
#split out timepoints
Adhesion.pre <- data[,c("id","Pharyngitis","Adehesion_percent_Pre_saliva")]
Adhesion.post <- data[,c("id","Pharyngitis","Adehesion_percent_1_week_saliva")]
#define new timepoint variable
Adhesion.pre$time_point<- "Pre.challenge"
Adhesion.post$time_point<- "1week"
#rename vars
names(Adhesion.pre) <- c("id" ,"Pharyngitis","Adhension Percent", "time_point")
names(Adhesion.post) <- c("id" ,"Pharyngitis","Adhension Percent", "time_point")
#bind back together
Adhesion <- rbind(Adhesion.pre, Adhesion.post)
Adhesion$time_point <- factor(Adhesion$time_point, levels = c("Pre.challenge","1week"))
#make new grouping variable for use with graphing.
Adhesion$Group <- factor(paste(Adhesion$Pharyngitis,Adhesion$time_point, sep = "_"), levels = c("no pharyngitis_Pre.challenge", "no pharyngitis_1week", "pharyngitis_Pre.challenge", "pharyngitis_1week"))
#plot the data
ggplot(Adhesion, aes(Group,Adhesion$`Adhension Percent`,fill = Pharyngitis))+
geom_violin(aes(alpha = time_point), draw_quantiles = c(0.5), scale= "width", position = "dodge")+
geom_beeswarm(size = 3, alpha = 0.75, cex =1, shape = 21)+
geom_line(aes(group = id, color = Pharyngitis), size = 0.35, alpha = 0.8)+
scale_fill_manual(values = c("blue", "red"))+
scale_alpha_manual(values = c(0.15, 0.5))+
scale_color_manual(values = c("blue", "red"))+
ylab("Percent Adhesion")+
xlab("")+
scale_y_continuous(expand = c(0, 0))+
coord_cartesian(ylim = c(0,200))+
custom_theme()
#Save the plot to a PDF file
ggsave(file = "Adhesion.pdf", width = 4, height = 4 )
# calculate p-value for pharyngitis group, pre and post, paired Wilcoxon-signed rank test
wilcox.test(Adhesion[Adhesion$Pharyngitis == "pharyngitis" & Adhesion$time_point == "Pre.challenge",]$`Adhension Percent`, Adhesion[Adhesion$Pharyngitis == "pharyngitis"& Adhesion$time_point == "1week",]$`Adhension Percent`, paired = T)
# calculate p-value for no pharyngitis group, pre and post, paired Wilcoxon-signed rank test
wilcox.test(Adhesion[Adhesion$Pharyngitis == "no pharyngitis" & Adhesion$time_point == "Pre.challenge",]$`Adhension Percent`, Adhesion[Adhesion$Pharyngitis == "no pharyngitis"& Adhesion$time_point == "1week",]$`Adhension Percent`, paired = T)
# calculate p-value for pre challenge, pharyngitis vs no pharyngitis, Mann-Whitney test
wilcox.test(Adhesion[Adhesion$Pharyngitis == "pharyngitis" & Adhesion$time_point == "Pre.challenge",]$`Adhension Percent`, Adhesion[Adhesion$Pharyngitis == "no pharyngitis"& Adhesion$time_point == "Pre.challenge",]$`Adhension Percent`, paired = F)
# calculate p-value for 1 month, pharyngitis vs no pharyngitis, Mann-Whitney test
wilcox.test(Adhesion[Adhesion$Pharyngitis == "no pharyngitis" & Adhesion$time_point == "1week",]$`Adhension Percent`, Adhesion[Adhesion$Pharyngitis == "pharyngitis"& Adhesion$time_point == "1week",]$`Adhension Percent`, paired = F)
#remove dataframes that wont be used again
rm(Adhesion, Adhesion.post, Adhesion.pre)
FIG 1C: Functional assays OPK
#Split out time-points into seperate dataframe
OPK.pre <- data[,c("id","Pharyngitis","Opsonic_index_pre")]
OPK.post <- data[,c("id","Pharyngitis","Opsonic_index_1mo")]
# Add time pont variable
OPK.pre$time_point<- "Pre.challenge"
OPK.post$time_point<- "1month"
#rename columns for consistency
names(OPK.pre) <- c("id" ,"Pharyngitis","Opsonic_index", "time_point")
names(OPK.post) <- c("id" ,"Pharyngitis","Opsonic_index", "time_point")
#bind the the pre and post timepoints back together
OPK <- rbind(OPK.pre, OPK.post)
# Make time_point a factor with specified levels
OPK$time_point <- factor(OPK$time_point, levels = c("Pre.challenge","1month"))
# Create a binary variable to distinguish two individuals with OPK response
OPK$responders <- ifelse(OPK$id %in% c("SN010", "SN013"), "Y", "N")
# Create a new grouping variable that distinguishes based on responders and pharyngitis variables
OPK$Group <- factor(paste(OPK$responders, OPK$Pharyngitis, OPK$time_point, sep = "_"),
levels = c("Y_no pharyngitis_Pre.challenge", "Y_no pharyngitis_1month",
"Y_pharyngitis_Pre.challenge", "Y_pharyngitis_1month",
"N_no pharyngitis_Pre.challenge", "N_no pharyngitis_1month",
"N_pharyngitis_Pre.challenge", "N_pharyngitis_1month"))
#plot data
ggplot(OPK, aes(Group, Opsonic_index, fill = Pharyngitis))+
geom_col(stat="identity",position = "dodge", aes(alpha = time_point))+
geom_beeswarm(size = 3, alpha = 0.75, cex =0.5, shape = 21)+
scale_fill_manual(values = c("blue", "red"))+
scale_alpha_manual(values = c(0.15, 0.5))+
ylab("OPK index")+
xlab("")+
scale_y_continuous(expand = c(0, 0))+
coord_cartesian(ylim = c(0,100))+
custom_theme()+
theme( axis.text.x = element_text(colour = "black", size = 17, angle = 90)) # change orientation of X-axis text
# Save the plot to a PDF file
ggsave(file = "OPK.pdf", width = 8, height = 7 )
#remove datasets that dont get used again
rm(OPK, OPK.pre, OPK.post)
Figure 2
The following code can be used to generate figures for the the analysis of ELISA data shown in figure 2 and linked supplementary figures
#Fig 2B: Heatmap Saliva IgA and Serum IgG and Saliva IgA
Note: Column order in heatmap was manually altered in Adobe Illustrator
#use this for both pheatmap plotting objects
breaksList <- seq(from =-2.5, to = 2.5, by = 0.1)
#Generate new dataframe subsetting into pre-challenge and 1 month data for IgA
plotting <- data_long[data_long$timepoint %in% c("Pre.challenge") & data_long$Type == "IgA",]
plotting_1 <- data_long[data_long$timepoint %in% c("1week") & data_long$Type == "IgA",]
#add other variables back in
plotting <- left_join(plotting, outcome, by = "id")
plotting_1 <- left_join(plotting_1, outcome, by = "id")
#bind them back together
plotting <- rbind(plotting, plotting_1)
plotting <-(plotting[complete.cases(plotting[,colnames(plotting) %in% Antigen.Order]),])
#transpose
plotting_t <- t(plotting[,c(colnames(plotting) %in% Antigen.Order)])
#add column names
colnames(plotting_t) <- paste(plotting$id, plotting$timepoint, sep = "_")
#generate annotation variables for heatmap
mat_col <- data.frame(outcome=as.factor(ifelse(plotting$id %in% pharyngitis, "pharyngitis", "no pharyngitis")))
rownames(mat_col) <- colnames(plotting_t)
mat_colors <- list(outcome = c("blue","red"))
names(mat_colors$outcome) <- levels(mat_col$outcome)
#Plot data
pdf(file = "heatmap_raw_titres_IgA.pdf", height = 4, width = 6)
pheatmap(plotting_t,
breaks = breaksList,
color = viridis(50),
cellwidth = 2.5, cellheight =9,
border_color = "black",
scale = "row",
cluster_rows = F,
cluster_cols = F,
clustering_distance_cols = "correlation",
clustering_method = "ward.D2",
annotation_col = mat_col, annotation_colors = mat_colors,
annotation_legend = TRUE,
show_rownames = T, show_colnames = F,
fontsize = 8, display_numbers = F, number_format = "%.2f")
dev.off()
# Now lets plot IgG
#Generate new dataframe subsetting into pre-challenge and 1 month data
plotting <- data_long[data_long$timepoint %in% c("Pre.challenge") & data_long$Type == "IgG",]
plotting_1 <- data_long[data_long$timepoint %in% c("1month") & data_long$Type == "IgG",]
#add other variables back in
plotting <- left_join(plotting, outcome, by = "id")
plotting_1 <- left_join(plotting_1, outcome, by = "id")
#bind then back together
plotting <- rbind(plotting, plotting_1)
#remove the two individuals for IgG without 1 month data
plotting <-(plotting[complete.cases(plotting[,colnames(plotting) %in% Antigen.Order]),])
#transform the data
plotting_t <- t(plotting[,c(colnames(plotting) %in% Antigen.Order)])
colnames(plotting_t) <- paste(plotting$id, plotting$timepoint, sep = "_")
colnames(plotting_t) <- paste(plotting$id, plotting$timepoint, sep = "_")
#generate annotation variables for heatmap
mat_col <- data.frame(outcome=as.factor(ifelse(plotting$id %in% pharyngitis, "pharyngitis", "no pharyngitis")))
rownames(mat_col) <- colnames(plotting_t)
mat_colors <- list(outcome = c("blue","red"))
names(mat_colors$outcome) <- levels(mat_col$outcome)
pdf(file = "heatmap_raw_titres_IgG.pdf", height = 4, width = 6)
pheatmap(plotting_t,
breaks = breaksList,
color = viridis(length(breaksList)),,
cellwidth = 2.5, cellheight =9,
border_color = "black",
scale = "row",
cluster_rows = F,
cluster_cols = F,
clustering_distance_cols = "correlation",
clustering_method = "ward.D2",
annotation_col = mat_col, annotation_colors = mat_colors,
annotation_legend = TRUE,
show_rownames = T, show_colnames = F,
fontsize = 8, display_numbers = F, number_format = "%.2f")
dev.off()
#remove dataframes/variables no longer needed
rm(mat_col, mat_colors, plotting, plotting_1, plotting_t, breaksList)
Fig 2C: Settings and function for heatmap
# define colour scheme
breaksList = seq(from = 0.5, to = 1.5, by = 0.005)
heatpal <- c("seagreen","white", "mediumpurple3")
# heatmap function:
heatIT <- function(heated) {
pheatmap(heated,
kmeans_k = NA,
breaks = breaksList,
color = colorRampPalette(heatpal)(length(breaksList)),
border_color = "black",
cellwidth = 11, cellheight =11,
# display_numbers = test_labels,
number_format = "%",
scale = "none",
cluster_rows = F,
cluster_cols = F,
number_color = "grey",
show_rownames = T, show_colnames = F,
fontsize = 8, display_numbers = F,
filename = NA, silent = FALSE)
}
Fig 2C (continued): automate comparisons for IgG (1 month)
#Select IgG variable names
Ig.Names <- names(data[,32:150])
#Select within Ig.names for these variables at the following time-points
PreChallenge <- Ig.Names[grepl("_Pre.challenge", Ig.Names)]
OneMonth <- Ig.Names[grepl("1month", Ig.Names)]
OneMonth <- OneMonth[!grepl("1monthFC", OneMonth)]
FoldChange <- Ig.Names[grepl("1monthFC", Ig.Names)]
#subset data into each timepoint
PreChallenge <- data[,PreChallenge]
OneMonth <- data[,OneMonth]
FoldChange <- data[,FoldChange]
#tidy variable names
names(PreChallenge) <- gsub("IgG_", "", names(PreChallenge))
names(PreChallenge) <- gsub("_Pre.challenge", "", names(PreChallenge))
names(OneMonth) <- gsub("IgG_", "", names(OneMonth))
names(OneMonth) <- gsub("_1month", "", names(OneMonth))
names(FoldChange) <- gsub("IgG_", "", names(FoldChange))
names(FoldChange) <- gsub("_1monthFC", "", names(FoldChange))
#add outcome variable
PreChallenge$Pharyngitis <- data$Pharyngitis
OneMonth$Pharyngitis <- data$Pharyngitis
FoldChange$Pharyngitis <- data$Pharyngitis
Postchallenge = matrix(data = 0, nrow = 17, ncol = 11) #initialize matrix
for(i in 1:17){
Postchallenge[i,1] <- median(FoldChange[c(1:19),i],na.rm = T) #median FC pharyngitis participants
Postchallenge[i,2] <- median(FoldChange[c(20:25),i],na.rm = T) #median FC non pharyngitis participants
Postchallenge[i,3] <- wilcox.test(FoldChange[c(1:19),i],FoldChange[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value
Postchallenge[i,4] <- wilcox.test(PreChallenge[c(1:19),i],OneMonth[c(1:19),i], paired = T, alternative = "two.sided", na.rm = T)$p.value
Postchallenge[i,5] <- wilcox.test(PreChallenge[c(20:25),i],OneMonth[c(20:25),i], paired = T, alternative = "two.sided", na.rm = T)$p.value
Postchallenge[i,6] <- wilcox.test(PreChallenge[c(1:19),i],PreChallenge[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value
Postchallenge[i,7] <- wilcox.test(OneMonth[c(1:19),i],OneMonth[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value
}
for(i in 1:17){
Postchallenge[i,c(8:9)] <- p.adjust(Postchallenge[i,c(4:5)], method = "fdr" )
Postchallenge[i,c(10:11)] <- p.adjust(Postchallenge[i,c(6:7)], method = "fdr" )
}
#make into a data-frame
Postchallenge<- as.data.frame(Postchallenge)
#define column names
names(Postchallenge) <- c("FC_Pharyn", "FC_NoPharyn","p_FcvsFC","p_0vs1_pharyn","p_0vs1_nopharyn","p_0vs0", "p_1vs1","p_0vs1_pharyn.adj","p_0vs1_nopharyn.adj","p_0vs0.adj", "p_1vs1.adj")
#name rows with antigen names
rownames(Postchallenge) <- names(FoldChange)[1:17]
#name rows with antigen names
Postchallenge <- Postchallenge[c(Antigen.Order,additional),]#reorder
# export table of FC and p-values to use to annotate figures.
write.csv(file = "Postchallenge.IgG.csv", Postchallenge)
# Make heatmap - Main antigens
heated <- (Postchallenge[Antigen.Order,c(2,1)])
pdf(file = "IgG_FcHeatmap.pdf", width = 3, height = c((nrow(heated)/12)+1))
heatIT(heated)
dev.off()
# Make heatmap - additional antigens
heated <- (Postchallenge[additional,c(2,1)])
pdf(file = "IgG_FcHeatmap.additional.pdf", width = 3, height = c((nrow(heated)/12)+1))
heatIT(heated)
dev.off()
Fig 2C (continued): Automate comparisons for IgA (1week)
# Select IgA variables
Ig.Names <- names(data[,151:201])
#subset variable names for each timepoint
PreChallenge <- Ig.Names[grepl("_Pre.challenge", Ig.Names)] # extract variable names
OneWeek <- Ig.Names[grepl("1week", Ig.Names)] # extract variable names
OneWeek <- OneWeek[!grepl("1weekFC", OneWeek)] # extract variable names
FoldChange <- Ig.Names[grepl("1weekFC", Ig.Names)] # extract variable names
#subset data by names
PreChallenge <- data[,PreChallenge] #extract prechallenge data
OneWeek <- data[,OneWeek] #extract fold change data
FoldChange <- data[,FoldChange] #extract fold change data
#tidy variable names
names(PreChallenge) <- gsub("IgA_", "", names(PreChallenge))
names(PreChallenge) <- gsub("_Pre.challenge", "", names(PreChallenge))
names(OneWeek) <- gsub("IgA_", "", names(OneWeek))
names(OneWeek) <- gsub("_1week", "", names(OneWeek))
names(FoldChange) <- gsub("IgA_", "", names(FoldChange))
names(FoldChange) <- gsub("_1weekFC", "", names(FoldChange))
#ADD pharyngitis variable
PreChallenge$Pharyngitis <- data$Pharyngitis
OneWeek$Pharyngitis <- data$Pharyngitis
FoldChange$Pharyngitis <- data$Pharyngitis
#double check that column names are identical
names(FoldChange) == names(PreChallenge)
names(PreChallenge) == names(OneWeek)
#generate fold change and p-value matrix
Postchallenge.IgA = matrix(data = 0, nrow = 17, ncol = 11) #initialize matrix
for(i in 1:17){
Postchallenge.IgA[i,1] <- median(FoldChange[c(1:19),i],na.rm = T) #median FC pharyngitis participants
Postchallenge.IgA[i,2] <- median(FoldChange[c(20:25),i],na.rm = T) #median FC non pharyngitis participants
Postchallenge.IgA[i,3] <- wilcox.test(FoldChange[c(1:19),i],FoldChange[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value
Postchallenge.IgA[i,4] <- wilcox.test(PreChallenge[c(1:19),i],OneWeek[c(1:19),i], paired = T, alternative = "two.sided", na.rm = T)$p.value
Postchallenge.IgA[i,5] <- wilcox.test(PreChallenge[c(20:25),i],OneWeek[c(20:25),i], paired = T, alternative = "two.sided", na.rm = T)$p.value
Postchallenge.IgA[i,6] <- wilcox.test(PreChallenge[c(1:19),i],PreChallenge[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value
Postchallenge.IgA[i,7] <- wilcox.test(OneWeek[c(1:19),i],OneWeek[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value
}
for(i in 1:17){
Postchallenge.IgA[i,c(8:9)] <- p.adjust(Postchallenge.IgA[i,c(4:5)], method = "fdr" )
Postchallenge.IgA[i,c(10:11)] <- p.adjust(Postchallenge.IgA[i,c(6:7)], method = "fdr" )
}
#make into a dataframe
Postchallenge.IgA<- as.data.frame(Postchallenge.IgA)
#name columns
names(Postchallenge.IgA) <- c("FC_Pharyn", "FC_NoPharyn","p_FcvsFC","p_0vs1_pharyn","p_0vs1_nopharyn","p_0vs0", "p_1vs1","p_0vs1_pharyn.adj","p_0vs1_nopharyn.adj","p_0vs0.adj", "p_1vs1.adj")
#name rows
rownames(Postchallenge.IgA) <- names(OneWeek)[1:17]
#export p-values for annoating graph.
write.csv(file = "Postchallenge.IgA.csv", Postchallenge.IgA)
#generate heatmap for IgA main antigens
heated <- (Postchallenge.IgA[Antigen.Order,c(2,1)])
pdf(file = "IgA_FcHeatmap.pdf", width = 3, height = c((nrow(heated)/12)+1))
heatIT(heated)
dev.off()
#generate heatmap for IgA additional antigens
heated <- (Postchallenge.IgA[additional,c(2,1)])
pdf(file = "IgA_FcHeatmap.additional.pdf", width = 3, height = c((nrow(heated)/12)+1))
heatIT(heated)
dev.off()
Related to Fig 2C - Supp Figure 1: heatmaps for IgG 1wk and IgA 1 mo.
#IgA 1 month data. used only for Supp fig x:
FigS1_IgA$Mrp24 <- FigS1_IgA$Mrp24 +1
FigS1_IgA$P145 <-FigS1_IgA$P145 +1
pre <- FigS1_IgA[FigS1_IgA$X %in% "pre",]
mo1 <- FigS1_IgA[FigS1_IgA$X %in% "1mo",]
Foldchange_1mo <- mo1[,c(3:19)]/ pre[,c(3:19)]
mo1$Saliva.IgA.AU <- pre$Saliva.IgA.AU # add participant ID
Foldchange_1mo$Saliva.IgA.AU <- pre$Saliva.IgA.AU
#keep only the 18 pharyngitis participants
pre <- pre[pre$Saliva.IgA.AU %in% pharyngitis,]
mo1 <- mo1[mo1$Saliva.IgA.AU %in% pharyngitis,]
Foldchange_1mo <- Foldchange_1mo[pre$Saliva.IgA.AU %in% pharyngitis,]
#generate fold change and p-value matrix
Postchallenge.IgA.1month = matrix(data = 0, nrow = 17, ncol = 3) #initialize matrix
for(i in 1:17){
Postchallenge.IgA.1month[i,1] <- median(Foldchange_1mo[,i],na.rm = T) #median FC pharyngitis
Postchallenge.IgA.1month[i,2] <- wilcox.test(pre[,i+2],mo1[,i+2], paired = T, alternative = "two.sided", na.rm = T)$p.value
Postchallenge.IgA.1month[i,3] <- p.adjust(Postchallenge.IgA.1month[i,2], method = "fdr" )
}
#make into a dataframe
Postchallenge.IgA.1month<- as.data.frame(Postchallenge.IgA.1month)
#name columns
names(Postchallenge.IgA.1month) <- c("FC_Pharyn", "p_0vs1_pharyn","p_0vs1_nopharyn.adj")
#name rows
rownames(Postchallenge.IgA.1month) <- names(Foldchange_1mo[,1:17])
#export p-values for annoating graph.
write.csv(file = "Postchallenge.IgA.1month.csv", Postchallenge.IgA.1month)
#generate heatmap for IgA main antigens
heated <- (Postchallenge.IgA.1month[Antigen.Order,])
pdf(file = "IgA_FcHeatmap_1mo.pdf", width = 3, height = c((nrow(heated)/12)+1))
heatIT(heated)
dev.off()
#generate heatmap for IgA additional antigens
heated <- (Postchallenge.IgA.1month[additional,])
pdf(file = "IgA_FcHeatmap.additional_1mo.pdf", width = 3, height = c((nrow(heated)/12)+1))
heatIT(heated[,])
dev.off()
# Supp Fig 1 (continued) IgG 1 week.
Ig.Names <- names(data[,32:150])
PreChallenge <- Ig.Names[grepl("_Pre.challenge", Ig.Names)]
OneWeek <- Ig.Names[grepl("1week", Ig.Names)]
OneWeek <- OneWeek[!grepl("1weekFC", OneWeek)]
FoldChange <- Ig.Names[grepl("1weekFC", Ig.Names)]
PreChallenge <- data[,PreChallenge]
OneWeek <- data[,OneWeek]
FoldChange <- data[,FoldChange]
names(PreChallenge) <- gsub("IgG_", "", names(PreChallenge))
names(PreChallenge) <- gsub("_Pre.challenge", "", names(PreChallenge))
PreChallenge$Pharyngitis <- data$Pharyngitis
names(OneWeek) <- gsub("IgG_", "", names(OneWeek))
names(OneWeek) <- gsub("_1week", "", names(OneWeek))
OneWeek$Pharyngitis <- data$Pharyngitis
names(FoldChange) <- gsub("IgG_", "", names(FoldChange))
names(FoldChange) <- gsub("_1weekFC", "", names(FoldChange))
FoldChange$Pharyngitis <- data$Pharyngitis
names(FoldChange) == names(PreChallenge) # double check
names(PreChallenge) == names(OneWeek) # double check
Postchallenge_1wk = matrix(data = 0, nrow = 17, ncol = 11) #initialize matrix
for(i in 1:17){
Postchallenge_1wk[i,1] <- median(FoldChange[c(1:19),i],na.rm = T) #median FC pharyngitis participants
Postchallenge_1wk[i,2] <- median(FoldChange[c(20:25),i],na.rm = T) #median FC non pharyngitis participants
Postchallenge_1wk[i,3] <- wilcox.test(FoldChange[c(1:19),i],FoldChange[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value
Postchallenge_1wk[i,4] <- wilcox.test(PreChallenge[c(1:19),i],OneWeek[c(1:19),i], paired = T, alternative = "two.sided", na.rm = T)$p.value
Postchallenge_1wk[i,5] <- wilcox.test(PreChallenge[c(20:25),i],OneWeek[c(20:25),i], paired = T, alternative = "two.sided", na.rm = T)$p.value
Postchallenge_1wk[i,6] <- wilcox.test(PreChallenge[c(1:19),i],PreChallenge[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value
Postchallenge_1wk[i,7] <- wilcox.test(OneWeek[c(1:19),i],OneWeek[c(20:25),i], paired = F, alternative = "two.sided", na.rm = T)$p.value
}
for(i in 1:17){
Postchallenge_1wk[i,c(8:9)] <- p.adjust(Postchallenge_1wk[i,c(4:5)], method = "fdr" )
Postchallenge_1wk[i,c(10:11)] <- p.adjust(Postchallenge_1wk[i,c(6:7)], method = "fdr" )
}
#make into a data-frame
Postchallenge_1wk<- as.data.frame(Postchallenge_1wk)
#define column names
names(Postchallenge_1wk) <- c("FC_Pharyn", "FC_NoPharyn","p_FcvsFC","p_0vs1_pharyn","p_0vs1_nopharyn","p_0vs0", "p_1vs1","p_0vs1_pharyn.adj","p_0vs1_nopharyn.adj","p_0vs0.adj", "p_1vs1.adj")
rownames(Postchallenge_1wk) <- names(FoldChange)[1:17] #name rows with antigen names
Postchallenge_1wk <- Postchallenge_1wk[c(Antigen.Order,additional),]#reorder
# export table of FC and p-values to use to annotate figures.
write.csv(file = "Postchallenge.IgG_1wk.csv", Postchallenge_1wk)
# Make heatmap - Main antigens
heated <- (Postchallenge_1wk[Antigen.Order,c(2,1)])
pdf(file = "IgG_FcHeatmap.pdf_1wk.pdf", width = 3, height = c((nrow(heated)/12)+1))
heatIT(heated)
dev.off()
# Make heatmap - additional antigens
heated <- (Postchallenge_1wk[additional,c(2,1)])
pdf(file = "IgG_FcHeatmap.additional_1wk.pdf", width = 3, height = c((nrow(heated)/12)+1))
heatIT(heated)
dev.off()
Fig 2D and E: Column graphs of the #antigens each individual responded to for serum IgG
# Specifying the antigens of interest
Filtered_Ags <- c("IgG_M75_1monthFC", "IgG_GAC_1monthFC", "IgG_SLO_1monthFC", "IgG_ScpA_1monthFC",
"IgG_SPYAD_1monthFC", "IgG_SPYCEP_1monthFC", "IgG_ADI_1monthFC", "IgG_Enn336_1monthFC",
"IgG_Mrp24_1monthFC", "IgG_TF_1monthFC", "IgG_T25_1monthFC")
#remove the two individuals without 1 month data
FC.vars <- data[!data$id %in% c("SN017","SN075"),]
# Initializing a binary matrix to store response data
binary <- matrix(nrow = 23, ncol = 11)
# Populating the binary matrix based on a 25% increase threshold
for(i in 1:11){
binary[,i] <- if_else(FC.vars[, Filtered_Ags[i]] > 1.25, 1, 0)
}
# Converting the binary matrix to a data frame
binary <- as.data.frame(binary)
names(binary) <- Filtered_Ags
# Counting the number of positive responses (value == 1) for each row
binary$count.1 <- apply(binary, 1, function(x) length(which(x == 1)))
# Setting row names to match the IDs
row.names(binary) <- FC.vars$id
# Adding the Pharyngitis column from the original dataset
binary$Pharyngitis <- FC.vars$Pharyngitis
# Adding a new column to the original dataset to tally IgG responses
data$Tally_IgG <- ifelse(data$id %in% rownames(binary), binary$count.1, NA)
# Creating a subset for specific responders to use in MDS analysis (Fig 6)
four.responders <- binary[,c("IgG_SPYCEP_1monthFC","IgG_GAC_1monthFC" ,"IgG_SLO_1monthFC","IgG_ScpA_1monthFC")]
four.responders$count <- apply(four.responders, 1, function(x) length(which(x==1)))
four.responders$count_2 <- ifelse(four.responders$count <2, "no", "yes")
four.responders <- rownames(four.responders[four.responders$count >2,])
# plot results (Fig2E IgG)
ggplot(binary, aes(Pharyngitis, count.1, fill = Pharyngitis))+
geom_violin(draw_quantiles = c(0.5), scale= "width", alpha = 0.25)+
geom_beeswarm(size = 3, alpha = 0.75, cex =5, shape = 21)+
scale_fill_manual(values = c("blue", "red"))+
ylab("Antigen response (#)")+
xlab("")+
stat_compare_means(method = "wilcox.test", tip.length = 0.005, bracket.size = 0.5, vjust = -1, paired = F,
aes(label = paste0("p = ", ..p.format..)))+
coord_cartesian( ylim = c(0, 11))+
scale_y_continuous(breaks = seq(0, 11, len = 12))+
custom_theme()
#Saving the plot to a PDF file
ggsave(file = "Antigen_response_25pc_IgG_Tally.pdf", width = 3.25, height = 4 )
# Calculate the count of positive responses for each antigen in the "pharyngitis" group
Antigen.Tally <- apply(binary[binary$Pharyngitis == "pharyngitis",1:11], 2, function(x) length(which(x==1)))
# Define and clean antigen names
Antigen.Tally.Names <-Filtered_Ags
Antigen.Tally.Names <- gsub("_1monthFC", "", Antigen.Tally.Names)
Antigen.Tally.Names <- gsub("IgG_", "", Antigen.Tally.Names)
# Create a data frame for antigen tally
Antigen.Tally <- data.frame(Antigen = Antigen.Tally.Names, Count = Antigen.Tally)
Antigen.Tally$Antigen <- factor(Antigen.Tally$Antigen, levels = Antigen.Order)
Antigen.Tally$Percent <- (Antigen.Tally$Count/nrow(binary[binary$Pharyngitis == "pharyngitis",]))*100
Antigen.Tally$Pharyngitis <- "pharyngitis"
# Repeat the same process for the "no pharyngitis" group
Antigen.Tally.np <- apply(binary[binary$Pharyngitis == "no pharyngitis",1:11], 2, function(x) length(which(x==1)))
# Create a data frame for antigen tally in the "no pharyngitis" group
Antigen.Tally.np <- data.frame(Antigen = Antigen.Tally.Names, Count = Antigen.Tally.np)
Antigen.Tally.np$Antigen <- factor(Antigen.Tally.np$Antigen, levels = Antigen.Order)
Antigen.Tally.np$Percent <- (Antigen.Tally.np$Count/nrow(binary[binary$Pharyngitis == "no pharyngitis",]))*100
Antigen.Tally.np$Pharyngitis <- "no pharyngitis"
# Combine both groups into a single data frame
Antigen.Tally<- rbind(Antigen.Tally, Antigen.Tally.np)
Antigen.Tally$Pharyngitis <- factor(Antigen.Tally$Pharyngitis, levels = c("no pharyngitis","pharyngitis"))
# Reorder antigens by count for visualization
Antigen.Tally.Order<- Antigen.Tally.Names[rev(order(Antigen.Tally$Count))]
Antigen.Tally$Antigen <- factor(Antigen.Tally$Antigen, levels = Antigen.Order)
# Graph % and faceted by outcome (Fig2D IgG)
ggplot(Antigen.Tally, aes(x = Antigen, y = Percent, fill = Pharyngitis)) +
geom_col(alpha = 0.75, color= "black") +
geom_text(aes(label=Count), nudge_y = 3)+
scale_fill_manual(values = c("blue","red"))+
xlab("")+
ylab("Responders (%)")+
scale_y_continuous(expand = c(0, 0))+
coord_cartesian( ylim = c(0, 100))+
facet_grid(. ~ Pharyngitis)+
custom_theme()+
theme(axis.text.x=element_text(angle = 90, vjust = 0.5, hjust=1))
#Save the percentage plot to a PDF file
ggsave(file = "Percent.responders.core.pdf", width = 5, height = 4)
rm(binary)
rm(FC.vars)
Figure 2D and E (continued): Column graph of the #antigens each individual responded to for saliva IgA
# Specifying the antigens of interest
Filtered_Ags <- c("IgA_M75_1weekFC", "IgA_GAC_1weekFC", "IgA_SLO_1weekFC", "IgA_ScpA_1weekFC",
"IgA_SPYAD_1weekFC", "IgA_SPYCEP_1weekFC", "IgA_ADI_1weekFC", "IgA_Enn336_1weekFC",
"IgA_Mrp24_1weekFC", "IgA_TF_1weekFC", "IgA_T25_1weekFC")
# build FC.vars dataframe
FC.vars <- data
# Initializing a binary matrix to store response data
binary <- matrix(nrow = 25, ncol = 11)
# Populating the binary matrix based on a 25% increase threshold
for(i in 1:11){
binary[,i] <- if_else(FC.vars[, Filtered_Ags[i]] > 1.25, 1, 0)
}
# Converting the binary matrix to a data frame
binary <- as.data.frame(binary)
names(binary) <- Filtered_Ags
# Counting the number of positive responses (value == 1) for each row
binary$count.1 <- apply(binary, 1, function(x) length(which(x == 1)))
# Setting row names to match the IDs
row.names(binary) <- FC.vars$id
# Adding the Pharyngitis column from the original dataset
binary$Pharyngitis <- FC.vars$Pharyngitis
# Adding a new column to the original dataset to tally IgA responses
data$Tally_IgA <- ifelse(data$id %in% rownames(binary), binary$count.1, NA)
# plot results (Fig2E IgA)
ggplot(binary, aes(Pharyngitis, count.1, fill = Pharyngitis))+
geom_violin(draw_quantiles = c(0.5), scale= "width", alpha = 0.25)+
geom_beeswarm(size = 3, alpha = 0.75, cex =5, shape = 21)+
scale_fill_manual(values = c("blue", "red"))+
ylab("Antigen response (#)")+
xlab("")+
stat_compare_means(method = "wilcox.test", tip.length = 0.005, bracket.size = 0.5, vjust = -1, paired = F,
aes(label = paste0("p = ", ..p.format..)))+
coord_cartesian( ylim = c(0, 11))+
scale_y_continuous(breaks = seq(0, 11, len = 12))+
custom_theme()
#Saving the plot to a PDF file
ggsave(file = "Antigen_response_25pc_IgA_Tally.pdf", width = 3.25, height = 4 )
# Calculate the count of positive responses for each antigen in the "pharyngitis" group
Antigen.Tally <- apply(binary[binary$Pharyngitis == "pharyngitis",1:11], 2, function(x) length(which(x==1)))
# Define and clean antigen names
Antigen.Tally.Names <-Filtered_Ags
Antigen.Tally.Names <- gsub("_1weekFC", "", Antigen.Tally.Names)
Antigen.Tally.Names <- gsub("IgA_", "", Antigen.Tally.Names)
# Create a data frame for antigen tally
Antigen.Tally <- data.frame(Antigen = Antigen.Tally.Names, Count = Antigen.Tally)
Antigen.Tally$Antigen <- factor(Antigen.Tally$Antigen, levels = Antigen.Order)
Antigen.Tally$Percent <- (Antigen.Tally$Count/nrow(binary[binary$Pharyngitis == "pharyngitis",]))*100
Antigen.Tally$Pharyngitis <- "pharyngitis"
# Repeat the same process for the "no pharyngitis" group
Antigen.Tally.np <- apply(binary[binary$Pharyngitis == "no pharyngitis",1:11], 2, function(x) length(which(x==1)))
# Create a data frame for antigen tally in the "no pharyngitis" group
Antigen.Tally.np <- data.frame(Antigen = Antigen.Tally.Names, Count = Antigen.Tally.np)
Antigen.Tally.np$Antigen <- factor(Antigen.Tally.np$Antigen, levels = Antigen.Order)
Antigen.Tally.np$Percent <- (Antigen.Tally.np$Count/nrow(binary[binary$Pharyngitis == "no pharyngitis",]))*100
Antigen.Tally.np$Pharyngitis <- "no pharyngitis"
# Combine both groups into a single data frame
Antigen.Tally<- rbind(Antigen.Tally, Antigen.Tally.np)
Antigen.Tally$Pharyngitis <- factor(Antigen.Tally$Pharyngitis, levels = c("no pharyngitis","pharyngitis"))
# Reorder antigens by count for visualization
Antigen.Tally.Order<- Antigen.Tally.Names[rev(order(Antigen.Tally$Count))]
Antigen.Tally$Antigen <- factor(Antigen.Tally$Antigen, levels = Antigen.Order)
# Graph % and faceted by outcome (Fig2D IgA)
ggplot(Antigen.Tally, aes(x = Antigen, y = Percent, fill = Pharyngitis)) +
geom_col(alpha = 0.75, color= "black") +
geom_text(aes(label=Count), nudge_y = 3)+
scale_fill_manual(values = c("blue","red"))+
xlab("")+
ylab("Responders (%)")+
scale_y_continuous(expand = c(0, 0))+
coord_cartesian( ylim = c(0, 100))+
facet_grid(. ~ Pharyngitis)+
custom_theme()+
theme(axis.text.x=element_text(angle = 90, vjust = 0.5, hjust=1))
#Save the percentage plot to a PDF file
ggsave(file = "Percent.responders.core.pdf", width = 5, height = 4)
#remove objects and dataframes that wont be used again
rm(binary)
rm(FC.vars)
rm(Filtered_Ags)
Fig 2F: graphing significant antigens raw values
# Define a function for plotting
plot_antigen <- function(data, antigen, antibody_type, timepoints, file_suffix) {
# Filter data
plotting <- data %>%
filter(Antigen == antigen, timepoint %in% timepoints, Type == antibody_type)
# Define group variable
plotting$Group <- factor(paste(plotting$Pharyngitis, plotting$timepoint, sep= "_"),
levels = c(paste("no pharyngitis", timepoints, sep="_"),
paste("pharyngitis", timepoints, sep="_")))
plotting$id <- factor(plotting$id)
# Plot the graph
p <- ggplot(plotting, aes(Group, value, fill = Pharyngitis)) +
geom_violin(aes(alpha = timepoint), draw_quantiles = c(0.5), scale= "width", position = "dodge") +
geom_beeswarm(size = 3, alpha = 0.75, cex =2, shape = 21) +
geom_line(aes(group = id, color = Pharyngitis), size = 0.35, alpha = 0.8) +
scale_fill_manual(values = c("blue", "red")) +
scale_alpha_manual(values = c(0.15, 0.5)) +
scale_color_manual(values = c("blue", "red")) +
ylab(paste(antigen, "(ELISA AU)")) +
xlab("") +
coord_cartesian(ylim = c(0, max(plotting$value, na.rm = TRUE) * 1.25)) +
custom_theme()
# Save the plot
ggsave(file = paste(antigen, file_suffix, ".pdf", sep = "_"), plot = p, width = 4, height = 4)
}
# Prepare data for IgG
rep.plotting_igg <- data_melt %>%
filter(timepoint %in% c("Pre.challenge", "1month"), Type == "IgG")
# Antigens to plot for IgG
antigens_igg <- c("SPYCEP", "SLO", "ScpA", "GAC", "TF")
# Plot IgG antigens
for (antigen in antigens_igg) {
plot_antigen(rep.plotting_igg, antigen, "IgG", c("Pre.challenge", "1month"), "rawtitres")
}
# Prepare IgA data
rep.plotting_iga <- data_melt %>%
filter(timepoint %in% c("Pre.challenge", "1week"), Type == "IgA")
# Antigens to plot for IgA
antigens_iga <- c("GAC", "M75.HVR")
# Plot IgA antigens
for (antigen in antigens_iga) {
plot_antigen(rep.plotting_iga, antigen, "IgA", c("Pre.challenge", "1week"), "IgA.rawtitres")
}
#remove variables and dataframes not used again
rm(rep.plotting_iga, rep.plotting_igg, antigens_iga, antigens_igg)
Related to Fig 2F - Supplementary Fig S3 and S4: Graph each antigen seperately - raw titres
#make a group variable with short names for the graph
rep.plotting <- data_melt[data_melt$timepoint %in% c("Pre.challenge","1month")& data_melt$Type == "IgG",]
rep.plotting$Group <- factor(paste(rep.plotting$Pharyngitis, rep.plotting$timepoint, sep= "_"),
levels = c("no pharyngitis_Pre.challenge", "no pharyngitis_1month",
"pharyngitis_Pre.challenge", "pharyngitis_1month"))
rep.plotting$Group <- ifelse(rep.plotting$Group == "no pharyngitis_Pre.challenge", "NP.Pre", ifelse(rep.plotting$Group == "no pharyngitis_1month", "NP.1mo",ifelse(rep.plotting$Group == "pharyngitis_Pre.challenge", "P.Pre", "P.1mo")))
rep.plotting$Group <- factor(rep.plotting$Group,
levels = c("NP.Pre","NP.1mo","P.Pre", "P.1mo" ))
rep.plotting$id <- factor(rep.plotting$id)
#make the list
Antigen.order.all <- c(Antigen.Order, additional)
rep.plotting$Antigen <- factor(rep.plotting$Antigen , levels = Antigen.order.all )
antigen_levels <- unique(levels(rep.plotting$Antigen))
removeAgs <- c("GAC", "SLO", "ScpA", "SPYCEP") # These are already plotted
is_in_removeAgs <- antigen_levels %in% removeAgs
antigen_levels <- antigen_levels[!is_in_removeAgs]
#IgG data first
plot_list1 <- lapply(antigen_levels, function(antigen) {
# Subset the data for the current level of 'Antigen'
subset_df <- rep.plotting[rep.plotting$Antigen == antigen, ]
ggplot(subset_df, aes(Group, value, fill = Pharyngitis))+
geom_violin(aes(alpha = timepoint), draw_quantiles = c(0.5), scale= "width", position = "dodge")+
geom_point(size = 3, alpha = 0.75, shape = 21)+
geom_line(aes(group = id, color = Pharyngitis), size = 0.35, alpha = 0.8)+
scale_fill_manual(values = c("blue", "red"))+
scale_alpha_manual(values = c(0.15, 0.5))+
scale_color_manual(values = c("blue", "red"))+
ylab(paste(antigen, "(ELISA AU)"))+
xlab("")+
coord_cartesian( ylim = c(0, max(subset_df$value, na.rm = T)*1.25))+
theme_bw()+
theme(legend.position = "none", axis.text.x=element_text(colour="black", size = 9), axis.text.y=element_text(colour="black", size = 9), panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), axis.title.y = element_text(size = 12))
})
#IgA data next
rep.plotting <- data_melt[data_melt$timepoint %in% c("Pre.challenge","1week")& data_melt$Type == "IgA",]
rep.plotting$Group <- factor(paste(rep.plotting$Pharyngitis, rep.plotting$timepoint, sep= "_"),
levels = c("no pharyngitis_Pre.challenge", "no pharyngitis_1week",
"pharyngitis_Pre.challenge", "pharyngitis_1week"))
rep.plotting$Group <- ifelse(rep.plotting$Group == "no pharyngitis_Pre.challenge", "NP.Pre", ifelse(rep.plotting$Group == "no pharyngitis_1week", "NP.1wk",ifelse(rep.plotting$Group == "pharyngitis_Pre.challenge", "P.Pre", "P.1wk")))
rep.plotting$Group <- factor(rep.plotting$Group,
levels = c("NP.Pre","NP.1wk","P.Pre", "P.1wk" ))
rep.plotting$id <- factor(rep.plotting$id)
rep.plotting$Antigen <- factor(rep.plotting$Antigen , levels = Antigen.order.all )
antigen_levels <- unique(levels(rep.plotting$Antigen))
removeAgs <- c("GAC", "M75.HVR") # These are already plotted
is_in_removeAgs <- antigen_levels %in% removeAgs
antigen_levels <- antigen_levels[!is_in_removeAgs]
plot_list2 <- lapply(antigen_levels, function(antigen) {
# Subset the data for the current level of 'Antigen'
subset_df <- rep.plotting[rep.plotting$Antigen == antigen, ]
ggplot(subset_df, aes(Group, value, fill = Pharyngitis))+
geom_violin(aes(alpha = timepoint), draw_quantiles = c(0.5), scale= "width", position = "dodge")+
geom_point(size = 3, alpha = 0.75, shape = 21)+
geom_line(aes(group = id, color = Pharyngitis), size = 0.35, alpha = 0.8)+
scale_fill_manual(values = c("blue", "red"))+
scale_alpha_manual(values = c(0.15, 0.5))+
scale_color_manual(values = c("blue", "red"))+
ylab(paste(antigen, "(ELISA AU)"))+
xlab("")+
coord_cartesian( ylim = c(0, max(subset_df$value, na.rm = T)*1.25))+
theme_bw()+
theme(legend.position = "none", axis.text.x=element_text(colour="black", size = 9), axis.text.y=element_text(colour="black", size = 9), panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), axis.title.y = element_text(size = 12))
})
# Step 2: Use lapply() to generate a list of ggplots for each 'Antigen' level
combined_plot1<- plot_grid(plotlist= plot_list1[1:9], ncol = 3, align = 'h')
combined_plot2<- plot_grid(plotlist = plot_list1[10:17], ncol = 3, align = 'h')
combined_plot3<- plot_grid(plotlist = plot_list2[1:9], ncol = 3, align = 'h')
combined_plot4<- plot_grid(plotlist = plot_list2[10:17], ncol = 3, align = 'h')
# Now, use ggsave to save the combined plot to a file
ggsave("combined_plots1_IgG.pdf", plot = combined_plot1, width = 8, height = 8, dpi = 300)
ggsave("combined_plots2_IgG.pdf", plot = combined_plot2, width = 8, height = 8, dpi = 300)
ggsave("combined_plots3_IgA.pdf", plot = combined_plot3, width = 8, height = 8, dpi = 300)
ggsave("combined_plots4_IgA.pdf", plot = combined_plot4, width = 8, height = 8, dpi = 300)
rm(plot_list1, plot_list2, combined_plot1, combined_plot2, combined_plot3, combined_plot4)
Fig 2G : Heatmap of correlation coefficients between fold change in IgA and IgG.
# heatmap function:
breaksList = c(-1.0, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1) # a ramp of vales for log2 FC
heatpal <- c("#2166AC" ,"#3F8DC0", "#80B9D8" ,"#BBDAEA" ,"#F7F7F7" ,"#F4A582", "#D6604D", "#B2182B", "#67001F")
heatIT <- function(heated) {pheatmap(heated,
kmeans_k = NA,
breaks = breaksList,
color = colorRampPalette(heatpal)(11),
border_color = "black",
cellwidth = 11, cellheight =11,
# display_numbers = test_labels,
number_format = "%",
scale = "none",
cluster_rows = F,
cluster_cols = F,
number_color = "grey",
show_rownames = T, show_colnames = F,
fontsize = 8, display_numbers = F,
filename = NA, silent = FALSE)
}
#extract variables
FC.Names <- names(data[,grepl("FC", names(data))])
#further subset variable names
IgA <- FC.Names[grepl("IgA", FC.Names)]
OneWeek <- FC.Names[grepl("week", FC.Names) ]
OneWeek <- OneWeek[!grepl("IgA", OneWeek) ]
OneMonth <- FC.Names[grepl("1month", FC.Names)]
#subset data by timepoint
IgA <- data[,IgA]
OneWeek <- data[,OneWeek]
OneMonth <- data[,OneMonth]
#tidy variable names
names(IgA) <- gsub("IgA_", "", names(IgA))
names(IgA) <- gsub("_1weekFC", "", names(IgA))
names(OneWeek) <- gsub("IgG_", "", names(OneWeek))
names(OneWeek) <- gsub("_1weekFC", "", names(OneWeek))
names(OneMonth) <- gsub("IgG_", "", names(OneMonth))
names(OneMonth) <- gsub("_1monthFC", "", names(OneMonth))
# keep main antigens only
IgA <- IgA[,Antigen.Order]
OneWeek <- OneWeek[,Antigen.Order]
OneMonth <- OneMonth[,Antigen.Order]
#initiate matrix
Postchallenge = matrix(data = 0, nrow = 11, ncol = 4) #initialize matrix
for(i in 1:11){
Postchallenge[i,1] <- cor.test(IgA[,i], OneWeek[,i], method = "spearman")$estimate #median FC pharyngitis participants
Postchallenge[i,2] <- cor.test(IgA[,i], OneMonth[,i], method = "spearman")$estimate #median FC non pharyngitis participants
Postchallenge[i,3] <- cor.test(IgA[,i], OneWeek[,i], method = "spearman")$p.value
Postchallenge[i,4] <- cor.test(IgA[,i], OneMonth[,i], method = "spearman")$p.value
}
#make into a dataframe and name columns and rows
Postchallenge<- as.data.frame(Postchallenge)
names(Postchallenge) <- c("Rho.1week", "Rho.1month", "pvalOneweek", "pvalOneMonth")
row.names(Postchallenge) <- names(IgA)
#graph it
heated <- (Postchallenge[Antigen.Order,c(1,2)])
pdf(file = "IgA.correlationFcHeatmap.pdf", width = 3, height = c((nrow(heated)/12)+1))
heatIT(heated)
dev.off()
#remove dataframes and variables no longer needed
rm(mo1, heated, Postchallenge, Postchallenge.IgA, Postchallenge_1wk, Postchallenge.IgA.1month, rep.plotting_iga, breaksList, is_in_removeAgs,heatpal)
Fig 3: Luminex data over 3 months
#define plotting function
PlotIT <-function(d){ggplot(data=d, aes(y=y)) +
geom_line(aes(x=xj, group=id), color= "black", alpha = .8) +
geom_point(data = d %>% filter(x=="1"), aes(x=xj), shape = 21, fill = 'darkgrey', size = 2.5,
alpha = .7) +
geom_point(data = d %>% filter(x=="2"), aes(x=xj), shape = 21, fill= '#F0E442', size = 2.5,
alpha = .7) +
geom_point(data = d %>% filter(x=="3"), aes(x=xj), shape = 21, fill= '#1FAF9D', size = 2.5,
alpha = .7) +
geom_point(data = d %>% filter(x=="4"), aes(x=xj), shape = 21, fill = '#521ADA', size = 2.5,
alpha = .7) +
geom_half_violin(
data = d %>% filter(x=="1"),aes(x = x, y = y), draw_quantiles = c(0.5),position = position_nudge(x = 3.7),
side = "l", fill = 'darkgrey') +
geom_half_violin(
data = d %>% filter(x=="2"),aes(x = x, y = y), draw_quantiles = c(0.5),position = position_nudge(x = 2.9),
side = "l", fill = "#F0E442") +
geom_half_violin(
data = d %>% filter(x=="3"),aes(x = x, y = y), draw_quantiles = c(0.5),position = position_nudge(x = 2.1),
side = "l", fill = '#1FAF9D') +
geom_half_violin(
data = d %>% filter(x=="4"),aes(x = x, y = y), draw_quantiles = c(0.5),position = position_nudge(x = 1.3),
side = "l", fill = '#521ADA')+
#Define additional settings
scale_x_continuous(breaks=c(1,2,3,4), labels=c("Pre", "1wk","1mo", "3mo"), limits=c(0.75, 5.3)) +
xlab("") + ylab(paste(var, "log10 ug/mL")) +
coord_cartesian(ylim= ylimits)+
ggtitle(paste(group, "over.time")) +
custom_theme()+
theme(axis.text.x=element_text(angle = 90))
}
# Function to adjust p-values
p_values <- function(d, e) {
p <- c(
wilcox.test(d[d$x == 1,]$y, d[d$x == 2,]$y, paired = TRUE)$p.value,
wilcox.test(d[d$x == 1,]$y, d[d$x == 3,]$y, paired = TRUE)$p.value,
wilcox.test(d[d$x == 1,]$y, d[d$x == 4,]$y, paired = TRUE)$p.value
)
p.adj.paired <- p.adjust(p, method = "fdr", n = length(p))
p <- c(
wilcox.test(d[d$x == 1,]$y, e[e$x == 1,]$y, paired = FALSE)$p.value,
wilcox.test(d[d$x == 2,]$y, e[e$x == 2,]$y, paired = FALSE)$p.value,
wilcox.test(d[d$x == 3,]$y, e[e$x == 3,]$y, paired = FALSE)$p.value,
wilcox.test(d[d$x == 4,]$y, e[e$x == 4,]$y, paired = FALSE)$p.value
)
p.adj.groups <- p.adjust(p, method = "fdr", n = length(p))
return(list(paired = p.adj.paired, groups = p.adj.groups))
}
#luminex data split out timepoints
OneMonth <-luminex[luminex$timepoint == "1month",]
PreChallenge <-luminex[luminex$timepoint == "Pre.challenge",]
#remove the 2 individuals with missing one month datapoints
PreChallenge <-PreChallenge[PreChallenge$id %in% OneMonth$id,]
#Create normalised dataframe
OneMonthFC <-OneMonth # to get the other vars
OneMonthFC[,c(6:11)] <- OneMonth[,c(6:11)] / PreChallenge[,c(6:11)]
# Keep data for participants with samples present at all time-points
keep.Ids <- luminex %>% group_by(id) %>% tally() %>% filter(n == 4) %>% pull(id)
#SpyCEP responders
var <- "SPYCEP"
#Split into group based on a 'non-response' of < 1.25 fold change
SpyCEP.non.responders <- OneMonthFC[which(OneMonthFC$SpyCEP < 1.25),]$id
#use this to generate new variable
luminex$SPYCEP.Resp <- if_else(luminex$id %in% SpyCEP.non.responders, "no","yes")
#define graph ylimits
ylimits <- c(-0.25, 2.25)
#graph SpyCEPresponders
group <- "responders"
d <- luminex[luminex$SPYCEP.Resp == "yes" & luminex$id %in% keep.Ids ,]
d$x <-as.numeric(d$timepoint)
d$xj <-jitter(d$x, amount = .09)
d$y <-log(d$SpyCEP,10)
PlotIT(d)
ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0)
#graph SpyCEP non-responders
group <- "non-responders"
e <- luminex[luminex$SPYCEP.Resp == "no"& luminex$id %in% keep.Ids ,]
e$x <-as.numeric(e$timepoint)
e$xj <-jitter(e$x, amount = .09)
e$y <-log(e$SpyCEP,10)
PlotIT(e)
ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0)
#write out p-values
print(p_values(d,e))
#SLO
var <- "SLO"
#Split into group based on a 'non-response' of < 1.25 fold change
SLO.non.responders <- OneMonthFC[which(OneMonthFC$SLO < 1.25),]$id
luminex$SLO.Resp <- if_else(luminex$id %in% SLO.non.responders, "no","yes")
#set graph limits
ylimits <- c(0.5, 2.5)
#graph responders
group <- "responders"
d <- luminex[luminex$SLO.Resp == "yes"& luminex$id %in% keep.Ids,]
d$x <-as.numeric(d$timepoint)
d$xj <-jitter(d$x, amount = .09)
d$y <-log(d$SLO,10)
PlotIT(d)
ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0)
group <- "non-responders"
e <- luminex[luminex$SLO.Resp == "no"& luminex$id %in% keep.Ids ,]
e$x <-as.numeric(e$timepoint)
e$xj <-jitter(e$x, amount = .09)
e$y <-log(e$SLO,10)
PlotIT(e)
ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0)
#write out p-values
print(p_values(d,e))
#SCPA
var <- "SCPA"
ylimits <- c(1, 2.5)
SCPA.non.responders <- OneMonthFC[which(OneMonthFC$SCPA < 1.25),]$id
luminex$SCPA.Resp <- if_else(luminex$id %in% SCPA.non.responders, "no","yes")
luminex %>% filter(timepoint == "Pre.challenge" & id %in% keep.Ids) %>% count(SCPA.Resp)
group <- "responders"
d <- luminex[luminex$SCPA.Resp == "yes" & luminex$id %in% keep.Ids,]
d$x <-as.numeric(d$timepoint)
d$xj <-jitter(d$x, amount = .09)
d$y <-log(d$SCPA,10)
PlotIT(d)
ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0)
group <- "non-responders"
e <- luminex[luminex$SCPA.Resp == "no"& luminex$id %in% keep.Ids ,]
e$x <-as.numeric(e$timepoint)
e$xj <-jitter(e$x, amount = .09)
e$y <-log(e$SCPA,10)
PlotIT(e)
ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0)
#write out p-values
print(p_values(d,e))
##GAC
var <- "GAC"
ylimits <- c(-1.5,1)
GAC.non.responders <- OneMonthFC[which(OneMonthFC$GAC < 1.25),]$id
luminex$GAC.Resp <- if_else(luminex$id %in% GAC.non.responders, "no","yes")
luminex %>% filter(timepoint == "Pre.challenge"& id %in% keep.Ids) %>% count(GAC.Resp)
group <- "responders"
d <- luminex[luminex$GAC.Resp == "yes" & luminex$id %in% keep.Ids,]
d$x <-as.numeric(d$timepoint)
d$xj <-jitter(d$x, amount = .09)
d$y <-log(d$GAC,10)
PlotIT(d)
ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0)
group <- "non-responders"
e <- luminex[luminex$GAC.Resp == "no" & luminex$id %in% keep.Ids,]
e$x <-as.numeric(e$timepoint)
e$xj <-jitter(e$x, amount = .09)
e$y <-log(e$GAC,10)
PlotIT(e)
ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0)
#write out p-values
print(p_values(d,e))
#DNAseB
var <- "DnaseB"
DnaseB.non.responders <- OneMonthFC[which(OneMonthFC$DnaseB < 1.25),]$id
luminex$Dnase.Resp <- if_else(luminex$id %in% DnaseB.non.responders, "no","yes")
luminex %>% filter(timepoint == "Pre.challenge"& id %in% keep.Ids) %>% count(Dnase.Resp)
ylimits <- c(-1,2.5)
group <- "responders"
d <- luminex[luminex$Dnase.Resp == "yes" & luminex$id %in% keep.Ids,]
d$x <-as.numeric(d$timepoint)
d$xj <-jitter(d$x, amount = .09)
d$y <-log(d$DnaseB,10)
PlotIT(d)
ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0)
group <- "non-responders"
e <- luminex[luminex$Dnase.Resp == "no" & luminex$id %in% keep.Ids,]
e$x <-as.numeric(e$timepoint)
e$xj <-jitter(e$x, amount = .09)
e$y <-log(e$DnaseB,10)
PlotIT(e)
ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0)
#write out p-values
print(p_values(d,e))
##SPNA
var <- "SpnA"
SpnA.non.responders <- OneMonthFC[which(OneMonthFC$SpnA < 1.25),]$id
luminex$SpnA.Resp <- if_else(luminex$id %in% SpnA.non.responders, "no","yes")
luminex %>% filter(timepoint == "Pre.challenge"& id %in% keep.Ids) %>% count(SpnA.Resp)
ylimits <- c(-1,2)
group <- "responders"
d <- luminex[luminex$SpnA.Resp == "yes" & luminex$id %in% keep.Ids,]
d$x <-as.numeric(d$timepoint)
d$xj <-jitter(d$x, amount = .09)
d$y <-log(d$SpnA,10)
PlotIT(d)
ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0)
group <- "non-responders"
e <- luminex[luminex$SpnA.Resp == "no" & luminex$id %in% keep.Ids,]
e$x <-as.numeric(e$timepoint)
e$xj <-jitter(e$x, amount = .09)
e$y <-log(e$SpnA,10)
PlotIT(e)
ggsave(file = paste(var, group, "lumi_over_time.pdf", sep = "_"), width = 2.5, height = 3.0)
#write out p-values
print(p_values(d,e))
rm(d, e, group, var, DnaseB.non.responders, GAC.non.responders, SCPA.non.responders, SLO.non.responders, SpnA.non.responders, SpyCEP.non.responders)
Fig 4: luminex compare responses to pharyngitis children
#join children dataset to pharyngitis participants from Challenge data
luminex.keep <- luminex[
luminex$timepoint %in% c("Pre.challenge", "1month") &
luminex$pharyngitis == "pharyngitis" &
!luminex$id %in% c("SN017", "SN075"),
c("id", "timepoint", "SCPA", "SpyCEP", "GAC", "SLO", "DnaseB", "SpnA")
]
names(luminex.children) <- names(luminex.keep)
luminex.keep <- rbind(luminex.keep, luminex.children)
luminex.keep$timepoint <- factor(luminex.keep$timepoint, levels = c("Pre.challenge", "1month", "healthy_child", "pharyngitis_child"))
# List of antigens
antigens <- c("SpyCEP", "SLO", "SCPA", "GAC", "DnaseB", "SpnA")
# Function to plot data for each antigen
plot_antigen_children <- function(data, antigen) {
d <- data
d$x <- as.numeric(d$timepoint)
d$xj <- jitter(d$x, amount = .09)
d$y <- log(d[[antigen]], 10)
p <- ggplot(d, aes(timepoint, y, fill = timepoint)) +
geom_violin(aes(alpha = timepoint), draw_quantiles = c(0.5), scale = "width", position = "dodge") +
geom_beeswarm(size = 3, alpha = 0.65, cex = 2, shape = 21) +
scale_fill_manual(values = c("red", "red", "#FFC107", "darkorange3")) +
scale_alpha_manual(values = c(0.15, 0.5, 0.5, 0.5)) +
scale_color_manual(values = c("red", "red", "#FFC107", "darkorange3")) +
ylab(antigen) +
xlab("") +
coord_cartesian(ylim = c(min(d$y, na.rm = TRUE) * 1.25, max(d$y, na.rm = TRUE) * 1.25)) +
custom_theme()
ggsave(file = paste(antigen, "children.pdf", sep = "_"), plot = p, width = 4, height = 5)
}
# Iterate through each antigen and plot
for (antigen in antigens) {
plot_antigen_children(luminex.keep, antigen)
}
#annotated p-value luminex healthy
luminex_pvals = matrix(data = 0, nrow = 6, ncol = 12) #initialize matrix
for(i in 1:6){
luminex_pvals[i,1] <- wilcox.test(luminex.keep[c(1:17),c(i+2)],luminex.keep[c(18:34),c(i+2)], paired = T, alternative = "two.sided", na.rm = T)$p.value
luminex_pvals[i,2] <- wilcox.test(luminex.keep[c(1:17),c(i+2)],luminex.keep[c(35:49),c(i+2)], paired = F, alternative = "two.sided", na.rm = T)$p.value
luminex_pvals[i,3] <- wilcox.test(luminex.keep[c(1:17),c(i+2)],luminex.keep[c(50:73),c(i+2)], paired = F, alternative = "two.sided", na.rm = T)$p.value
luminex_pvals[i,4] <- wilcox.test(luminex.keep[c(18:34),c(i+2)],luminex.keep[c(35:49),c(i+2)], paired = F, alternative = "two.sided", na.rm = T)$p.value
luminex_pvals[i,5] <- wilcox.test(luminex.keep[c(18:34),c(i+2)],luminex.keep[c(50:73),c(i+2)], paired = F, alternative = "two.sided", na.rm = T)$p.value
luminex_pvals[i,6] <- wilcox.test(luminex.keep[c(35:49),c(i+2)],luminex.keep[c(50:73),c(i+2)], paired = F, alternative = "two.sided", na.rm = T)$p.value
}
for(i in 1:6){
luminex_pvals[i,c(7:12)] <- p.adjust(luminex_pvals[i,c(1:6)], method = "fdr" )
}
luminex_pvals<- as.data.frame(luminex_pvals)
names(luminex_pvals) <- c("prevspost","pre_vs_healthy","pre_vs_pharyn","post_vs_healthy","post_vs_pharyn", "healthy_vs_pharyn",
"prevspost.adj","pre_vs_healthy.adj","pre_vs_pharyn.adj","post_vs_healthy.adj","post_vs_pharyn.adj",
"healthy_vs_pharyn.adj")
rownames(luminex_pvals) <- names(luminex.keep)[3:8]
write.csv(file = "children.luminex.csv", luminex_pvals)
rm(luminex_pvals, antigens, luminex.keep)
Figure 5: Luminex: Correlation of pre-challenge titre with 1 month fold change:
#define plotting function
plot_and_save <- function(data, fc_data, antigen, file_name, xlim, ylim) {
p <- ggplot(data, aes(y = log(data[[antigen]], 10), x = log(fc_data[[antigen]], 10))) +
geom_point(shape = 21, cex = 3, alpha = 0.75, aes(fill = pharyngitis)) +
scale_fill_manual(values = c("blue", "red")) +
ylab("log10 Pre challenge") +
xlab("log10 FC IgG 1month") +
geom_smooth(method = 'lm', se = FALSE, colour = "black") +
coord_cartesian(ylim = ylim, xlim = xlim) +
custom_theme()
ggsave(file = file_name, plot = p, width = 3, height = 3)
# Spearman Correlation p-value
cor_result <- cor.test(data[[antigen]], fc_data[[antigen]], method = "spearman")
print(cor_result)
return(cor_result)
}
antigens <- c("SpyCEP", "SLO", "SCPA", "GAC", "SpnA", "DnaseB")
xlims <- list(c(-0.5, 2.5), c(-0.1, 0.8), c(-0.1, 0.8), c(-0.5, 1.5), c(-0.1, 1.5), c(-0.5, 2.5))
ylims <- list(c(-0.5, 2.5), c(0.5, 2.5), c(0.5, 2.5), c(-1.5, 1.5), c(-1.5, 1.65), c(-1, 2.5))
# Loop through each antigen and plot
for (i in seq_along(antigens)) {
antigen <- antigens[i]
xlim <- xlims[[i]]
ylim <- ylims[[i]]
file_name <- paste("PrechallengevsFC_", antigen, "_luminex.pdf", sep = "")
plot_and_save(PreChallenge, OneMonthFC, antigen, file_name, xlim, ylim)
}
# now repeat for M75 IgG and for ScpA IgA, bot ELISA data
ggplot(data, aes(y = log(IgA_ScpA_Pre.challenge,10), x = log(data$IgA_ScpA_1weekFC,10)))+
geom_point(shape = 21, cex = 3, alpha = 0.75, aes(fill =Pharyngitis))+
scale_fill_manual(values = c("blue", "red"))+
ylab("log10 Pre challenge")+
xlab("log 10 FC IgG 1month")+
geom_smooth(method='lm', se = FALSE, colour = "black")+
coord_cartesian(ylim = c(0.5,2.5), xlim= c(-0.5, 0.5))+
custom_theme()
ggsave(file = "PrechallengevsFC_IgA_Scpa.pdf", width = 3, height = 3 )
ggplot(data, aes(y = log(IgG_M75_Pre.challenge,10), x = log(data$IgG_M75_1weekFC,10)))+
geom_point(shape = 21, cex = 3, alpha = 0.75, aes(fill =Pharyngitis))+
scale_fill_manual(values = c("blue", "red"))+
ylab("log10 Pre challenge")+
xlab("log10 FC IgG 1month")+
geom_smooth(method='lm', se = FALSE, colour = "black")+
coord_cartesian(ylim = c(1,4), xlim= c(-0.25, 0.5))+
custom_theme()
ggsave(file = "PrechallengevsFC_M75.pdf", width = 3, height = 3 )
rm(PreChallenge, OneMonth, OneMonthFC)
Fig 6 and 7: multivariate analysis - data preparation
#build a big dataframe of pre-challenge data variables, and substitute luminex for ELISA where antigens are in common.
#keep baseline only. Substitute luminex for ELISA
PreChallenge <- data[,grepl("Pre.challenge",names(data))]
# add in clinical variables to include in Figure 7.
PreChallenge <-cbind(PreChallenge, data[,c(204,1,2,208:ncol(data))])
#reorder variables
PreChallenge <- PreChallenge[,c(36:ncol(PreChallenge), 1:35)]
#create new dataframe for removing non-IgG variables
PreChallenge.lumi <-luminex[luminex$timepoint == "Pre.challenge",c(1, 6:11)]
PreChallenge <- PreChallenge[,!names(PreChallenge) %in% c("IgG_SPYCEP_Pre.challenge","IgG_GAC_Pre.challenge","IgG_ScpA_Pre.challenge","IgG_SLO_Pre.challenge")]
PreChallenge <- left_join(PreChallenge, PreChallenge.lumi, by = "id")
Fig 6: Multidimensional scaling
set.seed(36) # ensures same plots each time for mds
#remove anything thats not IgG
PreChallenge_trim <- PreChallenge[,!grepl("IgA_", names(PreChallenge))]
PreChallenge_trim <- PreChallenge_trim[,!grepl("Adehesion_percent_Pre_saliva", names(PreChallenge_trim))]
# remove clinical variables and antigen response tally. Create the Prechallenge.z dataframe for use later
PreChallenge.z <- PreChallenge_trim[,c(1,14,17:ncol(PreChallenge_trim))]
#tidy names
names(PreChallenge.z) <- gsub("IgG_", "", names(PreChallenge.z))
names(PreChallenge.z) <- gsub("_Pre.challenge", "", names(PreChallenge.z))
#Select one month data
OneMonth <- data[,grepl("1month",names(data))]
OneMonth <- OneMonth[,!grepl("FC",names(OneMonth))]
OneMonth <-cbind(OneMonth, data[,c(1,219)])
#reorder
OneMonth <- OneMonth[,c(18:ncol(OneMonth), 1:17)]
OneMonth.lumi <-luminex[luminex$timepoint == "1month", c(1, 6:11)]
#remove ELISA vars
OneMonth <- OneMonth[,!names(OneMonth) %in% c("IgG_SPYCEP_1month","IgG_GAC_1month","IgG_ScpA_1month","IgG_SLO_1month")]
OneMonth <- left_join(OneMonth, OneMonth.lumi, by = "id")
#Create the OneMonth.z dataframe for use later
One.Month.z <- OneMonth
#Tidy names
names(One.Month.z) <- gsub("IgG_", "", names(One.Month.z))
names(One.Month.z) <- gsub("_1month", "", names(One.Month.z))
#select 3 months data
ThreeMonths <- data[,grepl("3months",names(data))]
ThreeMonths <- ThreeMonths[,!grepl("FC",names(ThreeMonths))]
ThreeMonths <-cbind(ThreeMonths, data[,c(1,219)])
#reorder
ThreeMonths <- ThreeMonths[,c(18:ncol(ThreeMonths), 1:17)]
ThreeMonths.lumi <-luminex[luminex$timepoint == "3months", c(1, 6:11)]
#remove ELISA vars
ThreeMonths <- ThreeMonths[,!names(ThreeMonths) %in% c("IgG_SPYCEP_3months","IgG_GAC_3months","IgG_ScpA_3months","IgG_SLO_3months")]
ThreeMonths <- left_join(ThreeMonths, ThreeMonths.lumi, by = "id")
#Create the Threemonths.z dataframe for use later
ThreeMonths.z <- ThreeMonths
names(ThreeMonths.z) <- gsub("IgG_", "", names(ThreeMonths.z))
names(ThreeMonths.z) <- gsub("_3months", "", names(ThreeMonths.z))
#double check names are consistent
names(PreChallenge.z) == names(One.Month.z)
names(ThreeMonths.z) == names(One.Month.z)
#add timepoint variable
PreChallenge.z$timepoint <- "Pre"
One.Month.z$timepoint <- "1mo"
ThreeMonths.z$timepoint <- "3mo"
#bind together
data.z <- rbind(PreChallenge.z, One.Month.z, ThreeMonths.z)
data.z$timepoint <- factor(data.z$timepoint, levels =c("Pre","1mo", "3mo"))
#reorder
data.z <- data.z[,c(1, 2, 22, 3:21)]
#Zscore transform data
norm.study <- as.data.frame(data.z[,c(4:22)])
# define variables with which to normalise data using the 5-95% intervals of the data
conf.Int <- matrix(nrow = ncol(norm.study), ncol =4)
for(i in 1:19){
conf.Int[i,1] <- quantile(norm.study[,i], probs=c(0.05), na.rm= T) # 5th percentile
conf.Int[i,2] <- quantile(norm.study[,i], probs=c(0.95), na.rm= T) # 95th Percentile
conf.Int[i,3] <- mean(norm.study[,i], na.rm= T) # determine mean of data within 90th percentile
conf.Int[i,4] <- sd(norm.study[,i], na.rm= T) # determine st dev of data within 90th percentile
}
#normalise the data
for(i in 1:19){
norm.study[,i] <- (norm.study[,i] - conf.Int[i,3])/conf.Int[i,4] #normalise data using this mean and stdev
}
#replace transformed data
data.z[,c(4:22)] <- norm.study
#filter data for graphing
plotting <- data.z
plotting <- data.z[complete.cases(data.z[,c(4:22)]),] # 2 individuals with missing 1 month and 3 month samples are removed
plotting <- plotting[plotting$timepoint %in% c("Pre", "1mo"),]
Y <- plotting$Pharyngitis #outcome variable
X <- plotting[,c(4:22)] #only numerical data
Z <- plotting$timepoint
#add in response variable (see line 562 )
plotting$four_responders <- if_else(plotting$id %in% four.responders, "Y", "N")
#MDS
data.cor <- cor(t(X), method = "spearman")
# cluster
d <- dist(data.cor) # determine distance matrix
xcMDS <- monoMDS(d, k =2) # reduces dimensionality to 2 dimensions.
#plot Multidimensional scaled data
ggplot(data = plotting, aes(x = xcMDS$points[,1], y = xcMDS$points[,2]))+
theme_classic() +
geom_point(show.legend = F, cex = 4, alpha = 0.75, aes(color = Y, fill = Y, shape = Z))+
scale_fill_manual(values = c("blue", "red"))+
scale_color_manual(values = c("blue", "red"))+
scale_shape_manual(values = c(21, 22, 23, 24, 25))+
guides(fill = FALSE, color = FALSE)+
ylab("MDS dim 2")+
xlab("MDS dim 1")
ggsave("MDS.color.by.outcome.seed36.pdf", width = 3, height = 2.5)
#plot MDS plot coloured by ID with connecting lines
#define colour palette
P25 = createPalette(25, c("#ff0000", "#00ff00", "#0000ff"))
names(P25) <- NULL
ggplot(data = plotting, aes(x = xcMDS$points[,1], y = xcMDS$points[,2]))+
theme_classic() +
geom_line(aes(color = id), size =2)+
geom_point(show.legend = F, cex = 4, alpha = 0.75, aes( fill = id, shape = Z)) +
scale_fill_manual(values = P25)+
scale_color_manual(values = P25)+
scale_shape_manual(values = c(21, 22, 23, 24, 25))+
guides(fill = FALSE, color = FALSE)+
ylab("MDS dim 2")+
xlab("MDS dim 1")
ggsave("MDS.color.by.id.seed36.pdf", width = 3, height = 2.5)
#plot MDS data coloured by four_responders
ggplot(data = plotting, aes(x = xcMDS$points[,1], y = xcMDS$points[,2]))+
theme_classic() +
geom_point(show.legend = F, cex = 4, alpha = 0.75, aes(fill = plotting$four_responders, shape = timepoint)) +
scale_fill_manual(values = c("white", "dark green"))+
scale_shape_manual(values = c(21, 22, 23))+
# guides(fill = F)+
ylab("MDS dim 2")+
xlab("MDS dim 1")
ggsave("MDS.color.by.count.pdf", width = 3, height = 2.5)
#remove datasets not used again
rm(PreChallenge.lumi, PreChallenge_trim, OneMonth.lumi, ThreeMonths.lumi, ThreeMonths, OneMonth, PreChallenge.z, One.Month.z, ThreeMonths.z, norm.study, X, Y, Z, data.cor, d, P25)
Fig 6 (continued) : compare MDS1
# add new variable to plotting dataframe
plotting$mds1 <- xcMDS$points[,1]
#subset plotting data into each timepoint
mds1.pre <- plotting[plotting$timepoint %in% "Pre",]
mds1.post <- plotting[plotting$timepoint %in% "1mo",]
#create a post-pre MDS1 variable
mds1.post$mds1 <- mds1.post$mds1-mds1.pre[mds1.pre$id %in% mds1.post$id,]$mds1
#plots for 7B
ggplot(mds1.pre, aes(Pharyngitis, mds1, fill = Pharyngitis))+
geom_violin(draw_quantiles = c(0.5), scale= "width", alpha = 0.25)+
geom_beeswarm(size = 3, alpha = 0.75, cex =5, shape = 21)+
scale_fill_manual(values = c("blue", "red"))+
ylab("NMDS dim 1")+
xlab("")+
stat_compare_means(method = "wilcox.test", tip.length = 0.005, bracket.size = 0.5, vjust = -1, paired = F,
aes(label = paste0("p = ", ..p.format..)))+
coord_cartesian( ylim = c(-1.50, 1.5))+
theme_bw()+
theme(legend.position = "none", axis.text.x=element_text(colour="black", size = 14), axis.text.y=element_text(colour="black", size = 10), panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), axis.title.y = element_text(size = 14))
ggsave(file = "NMDS_pre.pdf", width = 2.25, height = 3 )
ggplot(mds1.post, aes(Pharyngitis, mds1, fill = Pharyngitis))+
geom_violin(draw_quantiles = c(0.5), scale= "width", alpha = 0.25)+
geom_beeswarm(size = 3, alpha = 0.75, cex =5, shape = 21)+
scale_fill_manual(values = c("blue", "red"))+
ylab("dim 1 change (1mo-Pre)")+
xlab("")+
stat_compare_means(method = "wilcox.test", tip.length = 0.005, bracket.size = 0.5, vjust = -1, paired = F,
aes(label = paste0("p = ", ..p.format..)))+
coord_cartesian( ylim = c(-0.50, 1.5))+
theme_bw()+
theme(legend.position = "none", axis.text.x=element_text(colour="black", size = 14), axis.text.y=element_text(colour="black", size = 10), panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), axis.title.y = element_text(size = 14))
ggsave(file = "NMDS_post.pdf", width = 2.25, height = 3)
# Fig 6C:
mds.Corr = matrix(data = 0, nrow =19 , ncol = 2) #initialize matrix
for(i in 1:19){
mds.Corr[i,1] <- cor.test(plotting[,c(i+3)], plotting$mds1, method = "spearman")$estimate
mds.Corr[i,2] <- cor.test(plotting[,c(i+3)], plotting$mds1, method = "spearman")$p.value
}
rownames(mds.Corr) <- names(plotting[,c(4:22)])
mds.Corr <- as.data.frame(mds.Corr)
mds.Corr$adj <- p.adjust(mds.Corr[,2], method = "fdr")
mds.Corr <- mds.Corr[mds.Corr$adj <0.05,]
mds.Corr <- mds.Corr[order(mds.Corr$V1),]
heatpal <- c("seagreen","white", "mediumpurple3")
#plot a heatmap or Rho values, but ignore columns 2 and 3.
pdf(file = "Correlation.with.mds.pdf")
pheatmap(mds.Corr[,],
kmeans_k = NA,
# breaks = breaksList,
color = colorRampPalette(heatpal)(1000),
border_color = "black",
cellwidth = 11, cellheight =11,
# display_numbers = test_labels,
number_format = "%",
scale = "none",
cluster_rows = F,
cluster_cols = F,
number_color = "grey",
show_rownames = T, show_colnames = T,
fontsize = 8, display_numbers = F,
filename = NA, silent = FALSE)
dev.off()
#remove variables no longer needed
rm(plotting, mds1.pre, mds1.post, mds.Corr)
Fig 7: Correlation with outcome variables
#automate comparisons between pre-challenge immune variables and outcome vars
Outcome.Corr = matrix(data = 0, nrow =37 , ncol = 16) #initialize matrix
for(i in 1:37){
Outcome.Corr[i,1] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Time_to_diagnosis, method = "spearman")$estimate
Outcome.Corr[i,2] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Peak_Ct, method = "spearman")$estimate
Outcome.Corr[i,3] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Tonsil, method = "spearman")$estimate
Outcome.Corr[i,4] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Adenopathy, method = "spearman")$estimate
Outcome.Corr[i,5] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Temp_max, method = "spearman")$estimate
Outcome.Corr[i,6] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$CRP, method = "spearman")$estimate
Outcome.Corr[i,7] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Pain, method = "spearman")$estimate
Outcome.Corr[i,8] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Headache, method = "spearman")$estimate
Outcome.Corr[i,9] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Time_to_diagnosis, method = "spearman")$p.value
Outcome.Corr[i,10] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Peak_Ct, method = "spearman")$p.value
Outcome.Corr[i,11] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Tonsil, method = "spearman")$p.value
Outcome.Corr[i,12] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Adenopathy, method = "spearman")$p.value
Outcome.Corr[i,13] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Temp_max, method = "spearman")$p.value
Outcome.Corr[i,14] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$CRP, method = "spearman")$p.value
Outcome.Corr[i,15] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Pain, method = "spearman")$p.value
Outcome.Corr[i,16] <- cor.test(PreChallenge[,c(i+16)], PreChallenge$Headache, method = "spearman")$p.value
}
#make into a dataframe and rename columns and rows
Outcome.Corr <- as.data.frame(Outcome.Corr)
names(Outcome.Corr) <- c("Rho_time","Rho_CT","Rho_Tonsil","Rho_Adeno","Rho_Temp","Rho_CRP","Rho_Pain","Rho_Head",
"p_time","p_CT","p_Tonsil", "p_Adeno", "p_Temp","p_CRP","p_Pain","p_Head")
rownames(Outcome.Corr) <- names(PreChallenge[,c(17:ncol(PreChallenge))])
# save a csv of the p-values
write.csv(Outcome.Corr, file = "outcomeCorr2.csv")
#Keep any variables with a p<0.05 for any correlation
plotting <- Outcome.Corr[c("IgG_ADI_Pre.challenge","IgG_TF_Pre.challenge","IgG_J8_Pre.challenge","IgG_K4S2_Pre.challenge","DnaseB","IgG_SPYAD_Pre.challenge","IgA_P145_Pre.challenge","SpyCEP","SpnA","IgG_T25_Pre.challenge"),1:8]
#simplify names
rownames(plotting) <-c("ADI","TF","J8","K4S2","DnaseB","SpyAD","P145_IgA","SpyCEP","SpnA","T25")
#reorder by correlation coefficient for 'Time to diagnosis'
names(plotting) <- gsub("Rho_", "",names(plotting) )
# graph it
heatpal <- c("seagreen","white", "mediumpurple3")
pdf(file = "Correlation.with.outcome.pdf")
pheatmap(plotting,
kmeans_k = NA,
color = colorRampPalette(heatpal)(1000),
border_color = "black",
cellwidth = 11, cellheight =11,
# display_numbers = test_labels,
number_format = "%",
scale = "none",
cluster_rows = F,
cluster_cols = F,
number_color = "grey",
show_rownames = T, show_colnames = T,
fontsize = 8, display_numbers = F,
filename = NA, silent = FALSE)
dev.off()
#remove variables not used elsewhere:
rm(plotting, Outcome.Corr, heatpal)
Figure 7 (continued) : sPLS-DA
set.seed(36)
Y <- PreChallenge$Pharyngitis #outcome variable
X <- PreChallenge[,c(17:ncol(PreChallenge))] #only numerical data
#initial model
srbct.splsda <- splsda(X, Y, ncomp = 10) # set ncomp to 10 for performance assessment later
# plot the samples projected onto the first two components of the PLS-DA subspace
plotIndiv(srbct.splsda , comp = 1:2,
group = PreChallenge$Pharyngitis, ind.names = FALSE, # colour points by class
ellipse = TRUE, # include 95% confidence ellipse for each class
legend = TRUE, title = 'sPLS-DA untuned')
# undergo performance evaluation in order to tune the number of components to use
perf.splsda.srbct <- perf(srbct.splsda, validation = "Mfold", folds = 3,
progressBar = F, auc = TRUE, nrepeat = 100)
perf.splsda.srbct$choice.ncomp # what is the optimal value of components according to perf()
# grid of possible keepX values that will be tested for each component
list.keepX <- c(1:37)
# undergo the tuning process to determine the optimal number of variables
tune.splsda.srbct <- tune.splsda(X, Y, ncomp = 2, validation = 'Mfold', folds = 3,
progressBar = F, dist = 'mahalanobis.dist', measure = "BER",
test.keepX = list.keepX, nrepeat = 200, cpus=3) # allow for parallelisation to decrease runtime
#run final model
final.splsda <- splsda(X, Y,
ncomp = 2,
keepX = 5)
pdf(file = "sPLSDA.2dim.pdf", width = 3, height = 3)
plotIndiv(final.splsda , comp = 1:2,
group = PreChallenge$Pharyngitis, ind.names = FALSE, # colour points by class
ellipse = TRUE, ellipse.level = 0.95, # include 95% confidence ellipse for each class
legend = F, title = 'sPLS-DA')
dev.off()
pdf(file = "sPLSDA.loadings.pdf", width = 4.5, height = 3)
ContribComp1 <- plotLoadings(final.splsda, comp = 1,
contrib = 'max', method = 'mean', xlim = c(-0.4, 0.8))
dev.off()
selectVar(final.splsda, comp=1)$name[which(ContribComp1$GroupContrib == "no pharyngitis")]
selectVar(final.splsda, comp=1)$name[which(ContribComp1$GroupContrib == "pharyngitis")]
# Graph up raw data of results
Select.var <- "IgG_TF_Pre.challenge"
# Select.var <- "IgG_ADI_Pre.challenge"
# Select.var <- "IgA_P145_Pre.challenge"
# Select.var <- "SpnA"
# Select.var <- "IgG_T25_Pre.challenge"
ggplot(PreChallenge, aes(Pharyngitis, log(PreChallenge[,Select.var],10), fill = Pharyngitis))+
geom_violin(draw_quantiles = c(0.5), scale= "width", alpha = 0.25)+
geom_beeswarm(size = 3, alpha = 0.75, cex =5, shape = 21)+
scale_fill_manual(values = c("blue", "red"))+
ylab("")+
xlab("")+
stat_compare_means(method = "wilcox.test", tip.length = 0.005, bracket.size = 0.5, vjust = -1, paired = F,
aes(label = paste0("p = ", ..p.format..)))+
coord_cartesian( ylim = c(0, 3))+
theme_bw()+
theme(legend.position = "none", axis.text.x=element_text(colour="black", size = 14), axis.text.y=element_text(colour="black", size = 10), panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), axis.title.y = element_text(size = 14))
ggsave(file = paste(Select.var, "Pre.challenge.pdf", sep = ""), width = 2, height = 4)
系统信息
R version 4.3.3 (2024-02-29)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.2
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Asia/Shanghai
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] cowplot_1.1.3 mixOmics_6.26.0 MASS_7.3-60.0.1 uwot_0.1.16 Matrix_1.6-5 Polychrome_1.5.1 gghalves_0.1.4
[8] ggpubr_0.6.0 vegan_2.6-4 lattice_0.22-6 permute_0.9-7 viridis_0.6.5 viridisLite_0.4.2 RColorBrewer_1.1-3
[15] pheatmap_1.0.12 ggbeeswarm_0.7.2 reshape2_1.4.4 lubridate_1.9.3 forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4
[22] purrr_1.0.2 readr_2.1.5 tidyr_1.3.1 tibble_3.2.1 tidyverse_2.0.0 ggplot2_3.5.1
loaded via a namespace (and not attached):
[1] tidyselect_1.2.1 vipor_0.4.7 fastmap_1.1.1 digest_0.6.35 timechange_0.3.0 lifecycle_1.0.4
[7] cluster_2.1.6 magrittr_2.0.3 compiler_4.3.3 rlang_1.1.4 tools_4.3.3 igraph_2.0.3
[13] utf8_1.2.4 yaml_2.3.8 knitr_1.49 ggsignif_0.6.4 rARPACK_0.11-0 scatterplot3d_0.3-44
[19] plyr_1.8.9 abind_1.4-5 BiocParallel_1.36.0 withr_3.0.2 grid_4.3.3 fansi_1.0.6
[25] colorspace_2.1-0 scales_1.3.0 cli_3.6.3 ellipse_0.5.0 rmarkdown_2.26 generics_0.1.3
[31] RSpectra_0.16-1 rstudioapi_0.16.0 tzdb_0.4.0 splines_4.3.3 parallel_4.3.3 matrixStats_1.2.0
[37] vctrs_0.6.5 carData_3.0-5 car_3.1-2 hms_1.1.3 rstatix_0.7.2 ggrepel_0.9.6
[43] beeswarm_0.4.0 glue_1.8.0 codetools_0.2-19 stringi_1.8.3 gtable_0.3.4 munsell_0.5.0
[49] pillar_1.9.0 htmltools_0.5.8 R6_2.5.1 evaluate_0.23 backports_1.4.1 broom_1.0.5
[55] corpcor_1.6.10 Rcpp_1.0.12 gridExtra_2.3 nlme_3.1-164 mgcv_1.9-1 xfun_0.50
[61] pkgconfig_2.0.3