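Bilibili course video:
https://www.bilibili.com/video/BV19x411X7C6?p=1
Tencent Classroom (the newest version, but paid; I bought it for 99 yuan 😢😢. The teaching itself is fine, though the topics are loosely organized and it is somewhat wordy):
https://ke.qq.com/course/3707827#term_id=103855009
The earlier notes in this series follow the Bilibili videos; the later notes on plotting follow the paid course.
I reordered the notes into a sequence that reads more logically to me, dropped some less important material, and added my own understanding.
Series index (continuously updated): https://blog.csdn.net/weixin_42214698/category_11393896.html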
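Contents
- 1. One-way frequency counts from a data frame
- 2. Two-way frequency counts from a data frame
- 3. Three-way frequency counts from a data frame
- 4. Marginal frequencies of a contingency table, by row and by column
- 5. Proportions: prop.table( )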
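1. One-way frequency counts from a data frame
A factor is R's type for a grouping variable: its levels define the groups, and a frequency count is simply the number of rows in each group. table() will coerce a plain vector to a factor on its own, but converting with as.factor first makes the grouping explicit and lets functions such as split() reuse the same levels.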
> mtcars$cyl <- as.factor(mtcars$cyl)
> table(mtcars$cyl)   # frequency counts

 4  6  8 
11  7 14 
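As a small check of how table() treats its input, the sketch below compares counts taken from the raw numeric column with counts taken from an explicit factor. It uses the built-in mtcars data; the variable names are illustrative only.

```r
# Work on the built-in data set without modifying it
counts_raw    <- table(datasets::mtcars$cyl)             # numeric vector, coerced internally
counts_factor <- table(as.factor(datasets::mtcars$cyl))  # explicit factor

# Both calls give the same counts per group
same_counts <- all(as.vector(counts_raw) == as.vector(counts_factor))
print(same_counts)

# The groups are the factor levels
cyl_levels <- levels(as.factor(datasets::mtcars$cyl))
print(cyl_levels)
```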
# seq(from = , to = , by = bin width)
> table(cut(mtcars$mpg, seq(10, 50, 10)))   # mpg is continuous, so bin it into intervals first, then count

(10,20] (20,30] (30,40] (40,50] 
     18      10       4       0 
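The binning step can be explored a little further. The sketch below, using the built-in mtcars data and illustrative variable names, shows the default right-closed intervals next to left-closed ones (right = FALSE), and checks whether any value falls outside the breaks, since cut() turns such values into NA and table() drops NA by default.

```r
mpg <- datasets::mtcars$mpg
breaks <- seq(10, 50, by = 10)

right_closed <- table(cut(mpg, breaks))                 # intervals like (10,20]
left_closed  <- table(cut(mpg, breaks, right = FALSE))  # intervals like [10,20)
print(right_closed)
print(left_closed)

# Values outside the breaks become NA; count them to be sure nothing was lost
n_outside <- sum(is.na(cut(mpg, breaks)))
print(n_outside)
```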
To see the actual rows that belong to each factor level:
> split(mtcars, mtcars$cyl)   # result is grouped by the levels of cyl
$`4`
                mpg cyl  disp hp drat    wt  qsec vs am gear carb
Datsun 710     22.8   4 108.0 93 3.85 2.320 18.61  1  1    4    1
Merc 240D      24.4   4 146.7 62 3.69 3.190 20.00  1  0    4    2
Merc 230       22.8   4 140.8 95 3.92 3.150 22.90  1  0    4    2
Fiat 128       32.4   4  78.7 66 4.08 2.200 19.47  1  1    4    1
(7 more rows ...)
$`6`
                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4      21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag  21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
(5 more rows ...)
$`8`
                   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
Duster 360        14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
(12 more rows ...)
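The link between split() and table() can be checked directly: the number of rows in each piece returned by split() should equal the frequency count for that level. A minimal sketch on the built-in mtcars data, with illustrative names:

```r
cars <- datasets::mtcars
cyl <- as.factor(cars$cyl)

groups <- split(cars, cyl)          # list of data frames, one per level
group_sizes <- sapply(groups, nrow) # rows in each piece
print(group_sizes)

counts <- table(cyl)                # frequency counts for the same factor
sizes_match <- all(group_sizes == as.vector(counts))
print(sizes_match)
```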
2. Two-way frequency counts from a data frame
# Demonstrated with the Arthritis data set (a rheumatoid arthritis trial) from the vcd package
> library(vcd)
> # Counting two variables together returns a two-way contingency table
> table(Arthritis$Treatment, Arthritis$Improved)
         
          None Some Marked
  Placebo   29    7      7
  Treated   13    7     21
Or:
> with(data = Arthritis, table(Treatment, Improved))
         Improved
Treatment None Some Marked
  Placebo   29    7      7
  Treated   13    7     21
Or:
> xtabs(~ Treatment + Improved, data = Arthritis)
         Improved
Treatment None Some Marked
  Placebo   29    7      7
  Treated   13    7     21
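The three calls above differ mainly in how the variables are named and labelled. The sketch below repeats the comparison on the built-in mtcars data (cyl by gear) so that it runs without the vcd package; that substitution and the variable names are choices made for this illustration.

```r
cars <- datasets::mtcars

t_dollar <- table(cars$cyl, cars$gear)          # dimensions are left unnamed
t_with   <- with(cars, table(cyl, gear))        # dimensions named cyl and gear
t_xtabs  <- xtabs(~ cyl + gear, data = cars)    # formula interface, also named

# The counts agree cell by cell
same_cells <- all(as.vector(t_dollar) == as.vector(t_with)) &&
  all(as.vector(t_with) == as.vector(t_xtabs))
print(same_cells)

# Only the labelled versions carry the variable names
print(names(dimnames(t_with)))
```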
3. Three-way frequency counts from a data frame
> y <- xtabs(~ Treatment + Improved + Sex, data = Arthritis)
> y
, , Sex = Female

         Improved
Treatment None Some Marked
  Placebo   19    7      6
  Treated    6    5     16

, , Sex = Male

         Improved
Treatment None Some Marked
  Placebo   10    0      1
  Treated    7    2      5

--------------------------------- Convert the result to a flat contingency table
> ftable(y)
                   Sex Female Male
Treatment Improved                
Placebo   None             19   10
          Some              7    0
          Marked            6    1
Treated   None              6    7
          Some              5    2
          Marked           16    5
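By default ftable() puts the last variable in the columns and the rest in the rows; the row.vars argument changes that layout. The sketch below rebuilds the three-way table from the counts printed above, so it runs without the vcd package. The array construction and variable names are assumptions made for this illustration.

```r
# Counts copied from the printed table, in column-major order
y <- as.table(array(
  c(19, 6, 7, 5, 6, 16,    # Sex = Female
    10, 7, 0, 2, 1, 5),    # Sex = Male
  dim = c(2, 3, 2),
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"),
                  Sex       = c("Female", "Male"))))

flat_default <- ftable(y)                    # Sex in columns, 6 rows
flat_by_sex  <- ftable(y, row.vars = "Sex")  # Sex in rows, 6 columns
print(flat_default)
print(flat_by_sex)
```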
4. Marginal frequencies of a contingency table, by row and by column
1️⃣ Marginal frequencies: margin.table( )
> x <- xtabs(~ Treatment + Improved, data = Arthritis)
> x
         Improved
Treatment None Some Marked
  Placebo   29    7      7
  Treated   13    7     21
--------------------- Marginal frequencies by row -----------------------
> margin.table(x, 1)
Treatment
Placebo Treated 
     43      41 
--------------------- Marginal frequencies by column -----------------------
> margin.table(x, 2)
Improved
  None   Some Marked 
    42     14     28 
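For a two-way table, margin.table(x, 1) and margin.table(x, 2) give the same numbers as rowSums() and colSums(), and calling margin.table() with no margin returns the grand total. The sketch below rebuilds the table from the counts printed above so that it runs without the vcd package; the construction and names are assumptions made for this illustration.

```r
x <- as.table(matrix(c(29, 13, 7, 7, 7, 21), nrow = 2,
                     dimnames = list(Treatment = c("Placebo", "Treated"),
                                     Improved  = c("None", "Some", "Marked"))))

row_totals  <- margin.table(x, 1)   # one total per Treatment
col_totals  <- margin.table(x, 2)   # one total per Improved level
grand_total <- margin.table(x)      # sum of every cell

print(row_totals)
print(col_totals)
print(grand_total)

# Same numbers as the base summing helpers
rows_agree <- all(as.vector(row_totals) == rowSums(x))
cols_agree <- all(as.vector(col_totals) == colSums(x))
```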
2️⃣ Append the marginal totals to the frequency table: addmargins( )
> addmargins(x)
         Improved
Treatment None Some Marked Sum
  Placebo   29    7      7  43
  Treated   13    7     21  41
  Sum       42   14     28  84
------------------------ Adds a final row ------------------------
> addmargins(x, 1)
         Improved
Treatment None Some Marked
  Placebo   29    7      7
  Treated   13    7     21
  Sum       42   14     28
------------------------ Adds a final column ------------------------
> addmargins(x, 2)
         Improved
Treatment None Some Marked Sum
  Placebo   29    7      7  43
  Treated   13    7     21  41
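The margin argument of addmargins() names the dimension that gets extended, which is easy to mix up with margin.table(): addmargins(x, 1) adds a Sum level to dimension 1, so the new row holds the column totals. A minimal sketch, rebuilding the table from the counts printed above (the construction and names are assumptions made for this illustration):

```r
x <- as.table(matrix(c(29, 13, 7, 7, 7, 21), nrow = 2,
                     dimnames = list(Treatment = c("Placebo", "Treated"),
                                     Improved  = c("None", "Some", "Marked"))))

with_sum_row <- addmargins(x, 1)   # extends dimension 1: extra row
with_sum_col <- addmargins(x, 2)   # extends dimension 2: extra column

print(dim(with_sum_row))
print(dim(with_sum_col))

# The extra row equals the column totals, the extra column the row totals
sum_row <- as.vector(with_sum_row["Sum", ])
sum_col <- as.vector(with_sum_col[, "Sum"])
print(sum_row)
print(sum_col)
```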
5. Proportions: prop.table( )
Wrap any of the frequency-count calls above in prop.table( ) to turn the counts into proportions.
1️⃣ One-way
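For the one-way case, a minimal sketch on the built-in mtcars data might look like the following; the variable names are illustrative. Each proportion is the group count divided by the total number of rows.

```r
cyl_counts <- table(datasets::mtcars$cyl)   # counts per cylinder group
cyl_props  <- prop.table(cyl_counts)        # counts divided by their total
print(cyl_props)

# Proportions of a one-way table sum to 1; multiply by 100 for percentages
total_prop <- sum(cyl_props)
print(cyl_props * 100)
```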
2️⃣ Two-way
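For a two-way table, the optional margin argument decides what each cell is divided by: nothing (the grand total), 1 (row totals), or 2 (column totals). The sketch below rebuilds the Treatment by Improved table from the counts shown in section 4, so it runs without the vcd package; the construction and names are assumptions made for this illustration.

```r
x <- as.table(matrix(c(29, 13, 7, 7, 7, 21), nrow = 2,
                     dimnames = list(Treatment = c("Placebo", "Treated"),
                                     Improved  = c("None", "Some", "Marked"))))

p_all <- prop.table(x)      # each cell / grand total
p_row <- prop.table(x, 1)   # each cell / its row total
p_col <- prop.table(x, 2)   # each cell / its column total

print(p_all)
print(p_row)
print(p_col)

# Row proportions sum to 1 within each row, column proportions within each column
row_sums <- rowSums(p_row)
col_sums <- colSums(p_col)
```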
3️⃣ Three-way
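For a three-way table, margin can be a vector of dimensions. With margin = c(1, 3), for example, the proportions of Improved sum to 1 within every Treatment and Sex combination. The sketch below rebuilds the table from the counts shown in section 3, so it runs without the vcd package; the construction, the choice of margin, and the names are assumptions made for this illustration.

```r
y <- as.table(array(
  c(19, 6, 7, 5, 6, 16,    # Sex = Female
    10, 7, 0, 2, 1, 5),    # Sex = Male
  dim = c(2, 3, 2),
  dimnames = list(Treatment = c("Placebo", "Treated"),
                  Improved  = c("None", "Some", "Marked"),
                  Sex       = c("Female", "Male"))))

p_all    <- prop.table(y)            # each cell / grand total
p_within <- prop.table(y, c(1, 3))   # within each Treatment x Sex group

# A flat layout is easier to read for three dimensions
print(ftable(p_within))

# Every Treatment x Sex group sums to 1 over the Improved levels
group_sums <- apply(p_within, c(1, 3), sum)
print(group_sums)
```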