概率密度函数可视化
文章目录
- 概率密度函数可视化
- @[toc]
- 1 一维随机变量情形
- 2 二维随机变量情形
文章目录
- 概率密度函数可视化
- @[toc]
- 1 一维随机变量情形
- 2 二维随机变量情形
1 一维随机变量情形
以正态概率密度函数为例,其中位置参数为
μ
\mu
μ,尺度参数为
σ
\sigma
σ,
f
(
x
)
=
1
2
π
σ
e
−
(
x
−
μ
)
2
2
σ
2
,
x
∈
R
f(x) = \dfrac{1}{\sqrt{2\pi}\sigma}e^{-\dfrac{(x-\mu)^2}{2\sigma^2}},x\in R
f(x)=2πσ1e−2σ2(x−μ)2,x∈R
library(tidyverse)
# 一维情形
set.seed(1)
N <- 10000
data <- data.frame(v1 = rnorm(N, 1, 1))
data %>% ggplot(aes(x = v1)) +
geom_density(colour = 2, size = 1.5, fill = "#009933") +
xlab("X") +
ylab("Density") +
theme_bw() +
theme(axis.title.x = element_text(vjust = 1, size = 15)) +
theme(axis.title.y = element_text(vjust = 1, size = 15)) +
theme(axis.text.x = element_text(vjust = 1, size = 15)) +
theme(axis.text.y = element_text(vjust = 1, size = 15)) +
annotate("text",
x = 3, y = 0.3, parse = TRUE, size = 6,
label = "y==frac(1, sqrt(2*pi)) * e^{frac(-(x-1)^2 , 2)}"
)
2 二维随机变量情形
以多元正态概率密度函数为例,随机变量
X
1
…
X
n
X_1\dots X_n
X1…Xn的联合正态概率密度函数为
p
(
x
)
=
p
(
x
1
,
x
2
,
…
,
x
n
)
=
1
(
2
π
)
n
/
2
∣
Σ
∣
1
/
2
exp
(
−
1
2
(
x
−
μ
)
T
Σ
−
1
(
x
−
μ
)
)
p(x)=p\left(x_{1}, x_{2}, \ldots, x_{n}\right)=\frac{1}{(2 \pi)^{n / 2}|\Sigma|^{1 / 2}} \exp \left(-\frac{1}{2}(x-\mu)^{T} \Sigma^{-1}(x-\mu)\right)
p(x)=p(x1,x2,…,xn)=(2π)n/2∣Σ∣1/21exp(−21(x−μ)TΣ−1(x−μ))
其中
Σ
\Sigma
Σ为随机变量
X
1
…
X
n
X_1\dots X_n
X1…Xn的方差协方差矩阵,
μ
\mu
μ为各随机变量均值向量。当
n
=
2
n=2
n=2时,概率密度函数为
p
(
x
)
=
1
2
π
σ
1
σ
2
1
−
ρ
2
exp
(
−
1
2
(
x
1
−
μ
1
σ
1
)
2
−
2
ρ
(
x
1
−
μ
1
σ
1
)
(
x
2
−
μ
2
σ
2
)
+
(
x
2
−
μ
2
σ
2
)
2
1
−
ρ
2
)
p(x) = \frac{1}{2 \pi \sigma_{1} \sigma_{2} \sqrt{1-\rho^{2}}} \exp \left(-\frac{1}{2} \frac{\left(\frac{x_{1}-\mu_{1}}{\sigma_{1}}\right)^{2}-2 \rho\left(\frac{x_{1}-\mu_{1}}{\sigma_{1}}\right)\left(\frac{x_{2}-\mu_{2}}{\sigma_{2}}\right)+\left(\frac{x_{2}-\mu_{2}}{\sigma_{2}}\right)^{2}}{1-\rho^{2}}\right)
p(x)=2πσ1σ21−ρ21exp
−211−ρ2(σ1x1−μ1)2−2ρ(σ1x1−μ1)(σ2x2−μ2)+(σ2x2−μ2)2
其中
ρ
\rho
ρ表示
x
1
x_1
x1和
x
2
x_2
x2的相关系数。
# 二维情形
rm(list = ls())
n <- 10000
mean <- c(-1, 1)
sigma <- matrix(c(1, 0.7, 0.7, 1), nrow = 2, ncol = 2)
mydata <- as.data.frame(MASS::mvrnorm(n, mean, sigma))
# 二维变量散点图
mydata %>% ggplot( aes(x = V1, y = V2)) +
geom_bin2d(bins = 80) +
scale_fill_viridis_c(option = "D") +
theme_bw()+
guides(fill = guide_colorbar(title = "频数"))+
theme(legend.position = c(0.9,0.2), legend.title = element_text(size = 15))
# 二维随机变量联合密度
gg <- mydata %>% ggplot( aes(x = V1, y = V2)) +
stat_density_2d(aes(fill = stat(nlevel)), geom = "polygon", colour = "white") +
# 轴标签
xlab("随机变量1") +
ylab("随机变量2") +
# 背景颜色
theme_bw() +
# 填充颜色设置
scale_fill_viridis_c(option = "D") +
# x轴标签设置
theme(axis.title.x = element_text(vjust = -1, size = 30)) +
# y轴标签设置
theme(axis.title.y = element_text(vjust = 1, size = 30)) +
# x轴刻度尺标签设置
theme(axis.text.x = element_text(vjust = 1, size = 30)) +
# y轴刻度尺标签设置
theme(axis.text.y = element_text(vjust = 1, size = 30)) +
# 图例设置
guides(fill = guide_colorbar(title = " ")) +
theme(legend.position = c(0.15, 0.8), legend.title = element_text(size = 15))
rayshader::plot_gg(gg, multicore = TRUE, width = 10, height = 10, scale = 1000)
rgl:rgl.postscript("rgl1.svg", fmt = "svg",drawText = TRUE)
u1 = 0
u2 = 0
sigma1 = 1
sigma2 = 1
rho = 0.7
f <- function(X,Y){
return( 1/(2*pi*sigma1*sigma2*sqrt(1-rho^2))*exp(-1/(2*sqrt(1-rho^2))*(X^2+Y^2-2*rho*X*Y)))
}
X = seq(-4,4,0.1)
Y = seq(-4,4,0.1)
Z = outer(X,Y,f)
library(shape)
persp(X,Y,Z,theta = -40,phi=20,expand = 0.6,r = 10,shade = 0.01,
ticktype = "detailed",xlab = "X",ylab = "Y",zlab = expression(f(X,Y)),
col = drapecol(Z))