Network Security: Filtering Logs Within a Given Range

2024/11/27 13:30:29


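Note: read the previous post's introduction to awk before this one.

I. The file

Use the Apache access log from your own installation as the example. I picked a fairly large one here: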

127.0.0.1 - - [30/Jul/2023:08:34:54 +0800] "GET /less02/index.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:34:54 +0800] "GET /favicon.ico HTTP/1.1" 404 2659
127.0.0.1 - - [30/Jul/2023:08:36:05 +0800] "GET /less02/index.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:36:55 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:36:55 +0800] "GET /less02/js.js HTTP/1.1" 200 211
127.0.0.1 - - [30/Jul/2023:08:37:55 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:08:38:17 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:17 +0800] "GET /less02/js.js HTTP/1.1" 200 226
127.0.0.1 - - [30/Jul/2023:08:38:20 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:20 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:21 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:21 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:21 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:21 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:35 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:35 +0800] "GET /less02/js.js HTTP/1.1" 200 226
127.0.0.1 - - [30/Jul/2023:08:38:36 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:36 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:36 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:36 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:36 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:37 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:37 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:37 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:37 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:37 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:37 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:37 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:38 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:38 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:39 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:39 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:39 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:59 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:38:59 +0800] "GET /less02/js.js HTTP/1.1" 200 249
127.0.0.1 - - [30/Jul/2023:08:39:59 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:08:42:20 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:42:20 +0800] "GET /less02/js.js HTTP/1.1" 200 178
127.0.0.1 - - [30/Jul/2023:08:43:20 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:08:44:50 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:44:50 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:45:50 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:08:50:04 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:50:04 +0800] "GET /less02/js.js HTTP/1.1" 200 271
127.0.0.1 - - [30/Jul/2023:08:50:08 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:50:08 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:51:04 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:08:58:41 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:58:41 +0800] "GET /less02/js.js HTTP/1.1" 200 472
127.0.0.1 - - [30/Jul/2023:08:58:47 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:58:47 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:58:48 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:58:48 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:58:48 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:08:59:47 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:14:40:28 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:40:28 +0800] "GET /less02/js.js HTTP/1.1" 200 180
127.0.0.1 - - [30/Jul/2023:14:40:28 +0800] "GET /favicon.ico HTTP/1.1" 404 2659
127.0.0.1 - - [30/Jul/2023:14:40:53 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:40:53 +0800] "GET /less02/js.js HTTP/1.1" 200 180
127.0.0.1 - - [30/Jul/2023:14:40:54 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:40:54 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:41:39 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:41:39 +0800] "GET /less02/js.js HTTP/1.1" 200 180
127.0.0.1 - - [30/Jul/2023:14:41:39 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:41:39 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:41:40 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:41:40 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:41:40 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:42:39 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:14:42:51 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:42:51 +0800] "GET /less02/js.js HTTP/1.1" 200 189
127.0.0.1 - - [30/Jul/2023:14:43:51 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:14:44:03 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:44:03 +0800] "GET /less02/js.js HTTP/1.1" 200 231
127.0.0.1 - - [30/Jul/2023:14:45:03 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:14:48:51 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:48:51 +0800] "GET /less02/js.js HTTP/1.1" 200 253
127.0.0.1 - - [30/Jul/2023:14:48:52 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:48:52 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:49:51 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:14:52:27 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:14:52:27 +0800] "GET /less02/js.js HTTP/1.1" 200 281
127.0.0.1 - - [30/Jul/2023:21:56:45 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:56:45 +0800] "GET /less02/js.js HTTP/1.1" 200 36
127.0.0.1 - - [30/Jul/2023:21:57:15 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:15 +0800] "GET /less02/js.js HTTP/1.1" 200 34
127.0.0.1 - - [30/Jul/2023:21:57:36 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:36 +0800] "GET /less02/js.js HTTP/1.1" 200 33
127.0.0.1 - - [30/Jul/2023:21:57:38 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:38 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:38 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:38 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:38 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:38 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:57 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:57 +0800] "GET /less02/js.js HTTP/1.1" 200 31
127.0.0.1 - - [30/Jul/2023:21:57:58 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:58 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:58 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:58 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:58 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:58 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:59 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:59 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:59 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:59 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:59 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:57:59 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:00 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:00 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:00 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:00 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:00 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:00 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:01 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:01 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:01 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:01 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:01 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:01 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:01 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:02 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:02 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:02 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:02 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:02 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:02 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:03 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:03 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:58:57 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:21:59:25 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:25 +0800] "GET /less02/js.js HTTP/1.1" 200 32
127.0.0.1 - - [30/Jul/2023:21:59:26 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:26 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:26 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:26 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:26 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:26 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:26 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:27 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:27 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:27 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:27 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:27 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:45 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:45 +0800] "GET /less02/js.js HTTP/1.1" 200 32
127.0.0.1 - - [30/Jul/2023:21:59:46 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:21:59:46 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:22:00:34 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:22:00:34 +0800] "GET /less02/js.js HTTP/1.1" 200 33
127.0.0.1 - - [30/Jul/2023:22:00:34 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:22:00:34 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:22:00:37 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:22:00:37 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:22:01:34 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:22:05:39 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:22:05:39 +0800] "GET /less02/js.js HTTP/1.1" 200 51
127.0.0.1 - - [30/Jul/2023:22:05:39 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:22:05:39 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:22:05:40 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:22:05:41 +0800] "GET /less02/js.js HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:22:05:41 +0800] "GET /less02/index1.html HTTP/1.1" 304 -
127.0.0.1 - - [30/Jul/2023:22:06:39 +0800] "-" 408 -
127.0.0.1 - - [30/Jul/2023:23:35:11 +0800] "GET /less02/tools HTTP/1.1" 404 2659
127.0.0.1 - - [30/Jul/2023:23:35:27 +0800] "GET /tools/ HTTP/1.1" 200 56719
127.0.0.1 - - [30/Jul/2023:23:35:27 +0800] "GET /tools/assets/main.css HTTP/1.1" 200 626596
127.0.0.1 - - [30/Jul/2023:23:35:27 +0800] "GET /tools/images/file-32x32.png HTTP/1.1" 200 1946
127.0.0.1 - - [30/Jul/2023:23:35:27 +0800] "GET /tools/images/file-128x128.png HTTP/1.1" 200 19378
127.0.0.1 - - [30/Jul/2023:23:35:27 +0800] "GET /tools/images/cook_male-32x32.png HTTP/1.1" 200 1624
127.0.0.1 - - [30/Jul/2023:23:35:27 +0800] "GET /tools/assets/main.js HTTP/1.1" 200 4237575
127.0.0.1 - - [30/Jul/2023:23:41:08 +0800] "GET /less/index.html HTTP/1.1" 404 2659
127.0.0.1 - - [30/Jul/2023:23:41:20 +0800] "GET /less01/index.html HTTP/1.1" 200 790
127.0.0.1 - - [30/Jul/2023:23:41:20 +0800] "GET /less01/css/style.css HTTP/1.1" 200 55
127.0.0.1 - - [30/Jul/2023:23:41:34 +0800] "GET /less02/index.html HTTP/1.1" 200 323
127.0.0.1 - - [30/Jul/2023:23:42:34 +0800] "-" 408 -

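II. First method

Count how many times each IP received status code 304 in the log.

1. Steps

First, check that the client IP ($1) and the status code ($9) print correctly:

awk '{print $1, $9}' access.log

Next, count the 304 responses per IP.

Note: my local Apache is only used for testing, so every request comes from localhost. A real log would contain many different client IPs.

When a single IP accounts for an unusually large number of requests, it is likely a brute-force or scanning source and should be blocked promptly.

Using my own log as the example:

awk '$9==304{arr[$1]++}END{for(i in arr){print arr[i],i}}' access.log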


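III. Second method:

awk code (requires gawk for PROCINFO["sorted_in"]):

# Count requests per client IP for status code 200
$9 == 200 {
    arr[$1]++
}
END {
    # gawk only: traverse arr by value, numerically, in descending order
    PROCINFO["sorted_in"] = "@val_num_desc"
    for (i in arr) {
        # stop once the 10 highest counts have been printed
        if (cnt++ == 10) {
            exit
        }
        print arr[i], i
    }
}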

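Result:

Handling IPs with too many requests: automatic blocking

How to solve it:

Count requests per IP for the chosen status code and keep the 10 IPs with the most hits. The script above filters on $9 == 200; change the condition to $9 != 200 to count non-200 responses instead.

gawk's array-sorting functions: asort, asorti

Set the traversal order with PROCINFO["sorted_in"]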
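Where gawk's PROCINFO["sorted_in"] is not available, the same ranking can be obtained by letting sort and head do the ordering. A minimal sketch, assuming the log layout shown earlier; the helper name top_ips and the 10.0.0.5 sample line are made up for illustration:

```shell
# Hypothetical helper: top_ips STATUS N < access.log
# Counts requests per client IP for one status code, then ranks the
# counts with sort(1) and head(1) instead of PROCINFO["sorted_in"].
top_ips() {
    awk -v status="$1" '$9 == status { hits[$1]++ }
        END { for (ip in hits) print hits[ip], ip }' |
    sort -k1,1nr -k2,2 |
    head -n "$2"
}

# Usage with three sample lines (the 10.0.0.5 entry is invented):
printf '%s\n' \
  '127.0.0.1 - - [30/Jul/2023:08:34:54 +0800] "GET /a HTTP/1.1" 304 -' \
  '127.0.0.1 - - [30/Jul/2023:08:34:55 +0800] "GET /b HTTP/1.1" 304 -' \
  '10.0.0.5 - - [30/Jul/2023:08:34:56 +0800] "GET /a HTTP/1.1" 304 -' |
top_ips 304 10
# prints:
# 2 127.0.0.1
# 1 10.0.0.5
```

Passing the status code as an argument makes it easy to rerun the same ranking for 404 or 408 without editing the awk program.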

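IV. Second requirement

Count unique IPs

Requirement: count how many unique (deduplicated) IPs accessed each URL, and write one file per URL.

Procedure:

1. Create a file with the following test content: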

a.com.cn|202.109.134.23|2015-11-20 20:34:43|guest
b.com.cn|202.109.134.23|2015-11-20 20:34:48|guest
c.com.cn|202.109.134.24|2015-11-20 20:34:48|guest
a.com.cn|202.109.134.23|2015-11-20 20:34:43|guest
a.com.cn|202.109.134.24|2015-11-20 20:34:43|guest
b.com.cn|202.109.134.25|2015-11-20 20:34:48|guest

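2. Write the following awk code:

BEGIN {
        FS = "|"
}

# First time this (URL, IP) pair is seen: count one more unique IP for the URL
!arr[$1,$2]++ {
        arr1[$1]++
}

END {
        # Write "URL count" to a file named after the URL, e.g. a.com.cn.txt
        for (i in arr1) {
                print i, arr1[i] > (i ".txt")
        }
}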

3. Before running:

4. After running: one .txt file per URL

5. Check the contents of each generated file
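One way to cross-check the generated counts is a plain sort/uniq pipeline that never touches awk arrays. A minimal sketch, assuming the same |-separated layout as the test data; the function name unique_ips_per_url is made up for illustration:

```shell
# Hypothetical helper: print "URL unique-IP-count" for |-separated records on stdin.
unique_ips_per_url() {
    cut -d'|' -f1,2 |       # keep only URL|IP
    sort -u |               # drop duplicate URL|IP pairs
    cut -d'|' -f1 |         # keep the URL of each unique pair
    uniq -c |               # count pairs per URL (input is already sorted)
    awk '{ print $2, $1 }'  # print "URL count"
}

# Usage with the test content from step 1:
unique_ips_per_url <<'EOF'
a.com.cn|202.109.134.23|2015-11-20 20:34:43|guest
b.com.cn|202.109.134.23|2015-11-20 20:34:48|guest
c.com.cn|202.109.134.24|2015-11-20 20:34:48|guest
a.com.cn|202.109.134.23|2015-11-20 20:34:43|guest
a.com.cn|202.109.134.24|2015-11-20 20:34:43|guest
b.com.cn|202.109.134.25|2015-11-20 20:34:48|guest
EOF
# prints:
# a.com.cn 2
# b.com.cn 2
# c.com.cn 1
```

If the numbers here differ from those in the per-URL files, the awk script is the first place to look.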

V. Third requirement

Handling data with missing fields

Content of a.txt:

ID  name    gender  age  email          phone
1   Bob     male    28   abc@qq.com     18023394012
2   Alice   female  24   def@gmail.com  18084925203
3   Tony    male    21                  17048792503
4   Kevin   male    21   bbb@189.com    17023929033
5   Alex    male    18   ccc@xyz.com    18185904230
6   Andy    female       ddd@139.com    18923902352
7   Jerry   female  25   exdsa@189.com  18785234906
8   Peter   male    20   bax@qq.com     17729348758
9   Steven          23   bc@sohu.com    15947893212
10  Bruce   female  27   bcbd@139.com   13942943905

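1. Problem:

When a field is missing, the default whitespace splitting shifts the later fields to the left, so the wrong column is printed.

2. An odd idea: rebuilding the record (does not solve the problem)

3. Trick: split by fixed column widths so that the blank part is kept

awk '{print $0}' FIELDWIDTHS="2 2:6 2:6 2:3 2:13 2:11" a.txt

FIELDWIDTHS (a gawk variable) lists the width of each field in characters. The first field, ID, is 2 characters wide. The second field is at most 6 characters wide, but two spaces separate it from ID, so you can either give it a width of 8 or skip the two characters with 2:6 (the skip:width form needs gawk 4.2 or later).

awk 'NR==4{print $5}' FIELDWIDTHS="2 2:6 2:6 2:3 2:13 2:11" a.txt

Line 4 of the file is Tony's row, which has no email, so this prints a blank field rather than the phone number.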
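FIELDWIDTHS is specific to gawk. On other awk implementations a similar effect can be had with substr(), which cuts columns by position. A minimal sketch of that swapped-in technique, assuming the column layout of the table above (widths 2, 6, 6, 3, 13, 11 with two spaces between columns); the helper name name_and_email and the "(missing)" marker are made up for illustration:

```shell
# Hypothetical helper: print "name<TAB>email" from the fixed-width table on stdin.
name_and_email() {
    awk 'NR > 1 {
        name  = substr($0, 5, 6)     # columns 5-10
        email = substr($0, 26, 13)   # columns 26-38
        sub(/ +$/, "", name)         # trim the padding
        sub(/ +$/, "", email)
        print name "\t" (email == "" ? "(missing)" : email)
    }'
}

# Usage: rebuild the header and two rows with printf so the padding is exact.
{
    printf '%-2s  %-6s  %-6s  %-3s  %-13s  %-11s\n' ID name gender age email phone
    printf '%-2s  %-6s  %-6s  %-3s  %-13s  %-11s\n' 2 Alice female 24 def@gmail.com 18084925203
    printf '%-2s  %-6s  %-6s  %-3s  %-13s  %-11s\n' 3 Tony male 21 '' 17048792503
} | name_and_email
```

The offsets are hard-coded, so this only holds while the file keeps exactly this column layout.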

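4. See how the next row, which does have a value, prints:

5. Solution:

FPAT (a gawk variable) collects the text matched by a regular expression and stores each match in a field. (grep highlights the part of a line that matched; with FPAT, each matched part is stored in $1, $2, $3, ... instead.)

Summary: a comma inside a quoted field is no longer treated as a separator, so the fields print correctly.

awk 'BEGIN{FPAT="[^,]+|\"[^\"]*\""}{print $1, $3}' demo2.txt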
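The contents of demo2.txt are not shown here, so the following is only a sketch with an invented CSV line. It assumes gawk is installed under the name gawk (FPAT does not exist in other awk implementations), and the helper name quoted_csv_field is hypothetical:

```shell
# Hypothetical helper: print the field count and field N of each CSV line on stdin.
# A field is either a run of non-comma characters or a double-quoted string.
quoted_csv_field() {
    gawk -v n="$1" 'BEGIN { FPAT = "[^,]+|\"[^\"]*\"" }
                    { print NF ": " $n }'
}

# Usage with a made-up record whose second field contains a comma:
printf '%s\n' 'Robbins,"1234 A Pretty Street, NE",MyTown' | quoted_csv_field 2
# prints:
# 3: "1234 A Pretty Street, NE"
```

With FS="," the same line would split into four fields, because the comma inside the quotes would count as a separator.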

 

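VI. Fourth requirement

Filtering logs within a given time range

The problem:

Filtering a log with regular expressions in grep, sed, or awk becomes very hard once the range has to be exact down to the hour, minute, or second.

gawk, however, provides the mktime() function, which converts a date and time into an epoch value (seconds since 1970-01-01 00:00:00 UTC).

With it, you can take the timestamp string from each log line, extract the year, month, day, hour, minute, and second, and pass them to mktime() in the form "YYYY MM DD HH MM SS" to build the corresponding epoch value. Because epoch values are plain numbers, they can be compared directly to decide which time is earlier or later.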
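Applied to the Apache log from section I, the idea can be sketched as follows. This is an illustration under assumptions, not a finished tool: the helper name logs_between is made up, awk must be an implementation that has mktime() (such as gawk), and the +0800 offset in $5 is ignored, so both bounds are given in the same wall-clock time the log uses.

```shell
# Hypothetical helper: logs_between "YYYY MM DD HH MM SS" "YYYY MM DD HH MM SS" < access.log
# Prints the lines whose timestamp falls inside the inclusive range.
# TZ=UTC keeps mktime() free of daylight-saving gaps.
logs_between() {
    TZ=UTC awk -v from="$1" -v to="$2" '
    BEGIN {
        # Map month abbreviations to numbers: Jan -> 1 ... Dec -> 12
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) month[names[i]] = i
        lo = mktime(from)
        hi = mktime(to)
    }
    {
        # $4 looks like "[30/Jul/2023:08:34:54"; drop the "[" and split on "/" and ":"
        split(substr($4, 2), t, "[/:]")
        # t[1]=day t[2]=month name t[3]=year t[4]=hour t[5]=minute t[6]=second
        ts = mktime(t[3] " " month[t[2]] " " t[1] " " t[4] " " t[5] " " t[6])
        if (ts >= lo && ts <= hi) print
    }'
}

# Usage: keep only the requests made between 14:00:00 and 15:00:00 on 30 Jul 2023.
printf '%s\n' \
  '127.0.0.1 - - [30/Jul/2023:08:59:47 +0800] "-" 408 -' \
  '127.0.0.1 - - [30/Jul/2023:14:40:28 +0800] "GET /less02/index1.html HTTP/1.1" 304 -' \
  '127.0.0.1 - - [30/Jul/2023:21:56:45 +0800] "GET /less02/index1.html HTTP/1.1" 304 -' |
logs_between "2023 07 30 14 00 00" "2023 07 30 15 00 00"
```

Lines whose timestamp cannot be parsed make mktime() return -1, which falls below any realistic lower bound, so they are dropped rather than printed.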

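Background

awk is a language for text processing and data extraction: it matches patterns against the lines and fields of a text file and runs actions on them. Timestamps come up whenever that text carries time information.

In gawk, the built-in function systime() returns the current Unix timestamp, that is, the number of seconds from the epoch (1970-01-01 00:00:00 UTC) to now. It is the starting point for any calculation relative to the current time.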
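As a small illustration of systime(), the sketch below reports how many seconds have passed since a given moment. It assumes an awk with systime() and mktime() (such as gawk); the helper name seconds_since is made up:

```shell
# Hypothetical helper: seconds_since "YYYY MM DD HH MM SS"
# Prints the number of seconds between the given local time and now.
seconds_since() {
    awk -v spec="$1" 'BEGIN {
        now  = systime()      # current time as epoch seconds
        past = mktime(spec)   # the given local date and time as epoch seconds
        print now - past
    }'
}

# Usage: age in seconds of the first record in the example that follows.
seconds_since "2023 08 01 15 30 45"
```

The printed value changes on every run, which is why no fixed output is shown.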

Worked example:

Contents of data.txt:

John 2023-08-01 15:30:45
Alice 2023-08-02 12:45:00
Bob 2023-08-03 09:15:30

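awk code:

awk '{
    # Build a shell command such as: date -d "2023-08-01 15:30:45" +%s
    cmd = "date -d \"" $2 " " $3 "\" +%s"
    # Run it and read its single line of output; fall back to "NA" on failure
    if ((cmd | getline timestamp) <= 0) timestamp = "NA"
    close(cmd)
    print $1, timestamp
}' data.txt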

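Output:

Explanation:

$2 and $3 are the second and third fields of the input line, that is, the date and the time. The date -d command (GNU date) converts them to a Unix timestamp, and the +%s format prints the result in seconds. The cmd | getline form runs the external command and reads one line of its output into the timestamp variable, close(cmd) releases the pipe so that the next record can start a new command, and print then outputs the name with its Unix timestamp.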
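Starting one date process per line gets slow on large files. A possible alternative is to do the conversion inside awk with mktime(). A minimal sketch, assuming an awk with mktime() (such as gawk); the helper name to_epoch is made up, and TZ=UTC is set so that the result does not depend on the local time zone, which differs from the date -d call above, where the local zone applies:

```shell
# Hypothetical helper: print "name epoch" for "name YYYY-MM-DD HH:MM:SS" lines on stdin.
to_epoch() {
    TZ=UTC awk '{
        spec = $2 " " $3          # e.g. "2023-08-01 15:30:45"
        gsub(/[-:]/, " ", spec)   # becomes "2023 08 01 15 30 45"
        print $1, mktime(spec)
    }'
}

# Usage with the first record of data.txt:
printf '%s\n' 'John 2023-08-01 15:30:45' | to_epoch
# prints:
# John 1690903845
```

Turning the dashes and colons into spaces is all that is needed, because the fields are already in the year, month, day, hour, minute, second order that mktime() expects.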
