Implementing a Bloom Filter in Go with Redis Bitmaps (Final Version)
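To stop attackers from hammering an API with requests for data that does not exist at all, the common approaches are:
- Limit requests per IP (rate limiting)
- Cache non-existent keys in Redis
- Put a Bloom filter in front of Redis
Complete code: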
https://github.com/ziyifast/ziyifast-code_instruction/tree/main/blond_filter
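1 Concepts
1.1 Essence: a very large bit array
- Principle: a Bloom filter consists of a bit array initialized to all zeros plus several hash functions, and is used to quickly test whether an element belongs to a set. Using several hash functions is like needing several keys to open one lock: an element is reported as present only when every hashed position agrees, which improves the filter's accuracy.
- Three steps to use it: initialize the bitmap -> add elements to the bitmap (claim their slots) -> check whether an element exists
- Hash collisions: to reduce the impact of hash collisions, map each element through several hash functions. For example, map player:1982 to several offsets, one per hash function; a lookup then has to check that the bits at all of those offsets are set. (A single hash function may collide fairly often, but requiring several different hashes to collide at the same time makes a false match much less likely.)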
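The three steps above (initialize, add, check) can be sketched with a purely in-memory filter. This is a minimal illustration, not the Redis-based code from this article: the `BloomFilter` type, its method names, and the use of seeded FNV-1a as the k hash functions are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// BloomFilter is a minimal in-memory sketch: a bit array plus k hash functions.
// The type and method names are illustrative and not part of the article's project.
type BloomFilter struct {
	bits []byte // bit array, all zeros initially
	m    uint64 // number of bits
	k    int    // number of hash functions
}

// NewBloomFilter allocates a filter with m bits and k hash functions.
func NewBloomFilter(m uint64, k int) *BloomFilter {
	return &BloomFilter{bits: make([]byte, (m+7)/8), m: m, k: k}
}

// offset simulates the i-th hash function by feeding a seed byte before the key.
func (f *BloomFilter) offset(key string, i int) uint64 {
	h := fnv.New64a()
	h.Write([]byte{byte(i)})
	h.Write([]byte(key))
	return h.Sum64() % f.m
}

// Add sets the k bits that the key maps to.
func (f *BloomFilter) Add(key string) {
	for i := 0; i < f.k; i++ {
		o := f.offset(key, i)
		f.bits[o/8] |= 1 << (o % 8)
	}
}

// MightContain returns false only if the key was definitely never added.
func (f *BloomFilter) MightContain(key string) bool {
	for i := 0; i < f.k; i++ {
		o := f.offset(key, i)
		if f.bits[o/8]&(1<<(o%8)) == 0 {
			return false
		}
	}
	return true
}

func main() {
	f := NewBloomFilter(1<<16, 3) // step 1: initialize
	f.Add("player:1982")          // step 2: add
	// step 3: check. Added keys are always reported as present.
	fmt.Println(f.MightContain("player:1982"))
	// A key that was never added is normally reported as absent;
	// a "true" here would be a false positive.
	fmt.Println(f.MightContain("player:28"))
}
```

Because all k bits must be set for a positive answer, two keys have to collide on every hash function at once before one is mistaken for the other.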
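Notes 📢:
- Does the element exist?
  - "Yes" only means "possibly yes", because of hash collisions. Suppose I record that Wang Wu came to work on the 1st, and Wang Wu and Li Si happen to have the same hash value. When I later look up Li Si, the bit at the offset his hash points to is already 1, so I wrongly conclude that Li Si came to work too.
  - "No" means "definitely no": the element is 100% absent.
- Make the bit array generously large up front, because a Bloom filter cannot be resized in place. Once the actual number of elements exceeds the capacity it was initialized for, rebuild it: allocate a new, larger filter and add all historical elements to it in bulk.
- Avoid deleting elements, to prevent collateral deletions. Because of hash collisions, clearing the bits for Li Si's record may also wipe out Wang Wu's ("guilt by association").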
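To put a number on "generously large", the standard Bloom filter sizing formulas can be used: for n expected elements and a target false-positive rate p, the bit count is m = -n·ln(p)/(ln 2)^2 and the number of hash functions is k = (m/n)·ln 2. The helper below is a sketch of that calculation; the function name `optimalParams` and the sample inputs are illustrative.

```go
package main

import (
	"fmt"
	"math"
)

// optimalParams returns the bit-array size m and the hash-function count k
// for n expected elements and a target false-positive rate p (0 < p < 1),
// using m = -n*ln(p)/(ln 2)^2 and k = (m/n)*ln 2.
func optimalParams(n uint64, p float64) (m uint64, k int) {
	mf := math.Ceil(-float64(n) * math.Log(p) / (math.Ln2 * math.Ln2))
	k = int(math.Round(mf / float64(n) * math.Ln2))
	if k < 1 {
		k = 1
	}
	return uint64(mf), k
}

func main() {
	// One million expected players, 1% acceptable false-positive rate.
	m, k := optimalParams(1_000_000, 0.01)
	fmt.Printf("bits=%d (~%.1f MiB), hashes=%d\n", m, float64(m)/8/1024/1024, k)
}
```

For these inputs the result is roughly 9.6 million bits (a little over 1 MiB) and 7 hash functions, which is far below the 512 MB upper limit of a Redis bitmap.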
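1.2 Use case: preventing Redis cache penetration (membership tests over massive data sets)
- Where it goes: in front of the database and Redis.
- Before each query, check the Bloom filter first. If it says the element does not exist, return immediately. If it says the element may exist, go on to query Redis and then the database to see whether it really exists.
This keeps cache penetration from knocking the database over.
- It also blocks malicious requests that hammer the API.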
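The lookup order described above can be sketched without any external services. Everything in this example is a hypothetical stand-in: plain maps play the roles of the Bloom filter, Redis, and Postgres, and the `layers` type exists only to count how many lookups reach the "database". A real Bloom filter may also return false positives, which an exact map does not.

```go
package main

import "fmt"

// layers holds in-memory stand-ins for the three tiers (all names are hypothetical).
type layers struct {
	filter map[int64]bool   // stand-in for the Bloom filter
	cache  map[int64]string // stand-in for Redis
	db     map[int64]string // stand-in for Postgres
	dbHits int              // number of lookups that reached the database
}

// find checks the filter first, then the cache, then the database.
func (l *layers) find(id int64) (string, bool) {
	if !l.filter[id] {
		// Definitely absent: return without touching the cache or the database.
		return "", false
	}
	if v, ok := l.cache[id]; ok {
		return v, true
	}
	l.dbHits++
	v, ok := l.db[id]
	if ok {
		l.cache[id] = v // cache the row for later requests
	}
	return v, ok
}

func main() {
	l := &layers{
		filter: map[int64]bool{1: true},
		cache:  map[int64]string{},
		db:     map[int64]string{1: "player-1"},
	}
	// Simulated malicious requests for ids that do not exist.
	for id := int64(1000); id < 1100; id++ {
		l.find(id)
	}
	name, ok := l.find(1)
	fmt.Println(name, ok, "db hits:", l.dbHits) // prints "player-1 true db hits: 1"
}
```

The hundred bogus lookups are all rejected at the first tier, so the only database access comes from the one legitimate request.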
2 Environment setup
2.1 Install Docker
yum install -y yum-utils
yum-config-manager \
    --add-repo \
    https://download.docker.com/linux/centos/docker-ce.repo
yum install -y docker-ce docker-ce-cli containerd.io
systemctl start docker
2.2 Set up Postgres
docker run -d \
    -p 5432:5432 \
    -e POSTGRES_USER=postgres \
    -e POSTGRES_PASSWORD=postgres \
    -v /Users/ziyi2/docker-home/pg:/var/lib/postgresql/data \
    --name pg \
    --restart always \
    docker.io/postgres:9.6-alpine
# -p (publish): map a port so the service inside the container is reachable through a host port
# -d (detach): keep the container running in the background
# -e (env): set an environment variable
# -v (volume): mount a file or directory
2.3 Set up Redis
docker run -d \
    --name redis \
    -v /Users/ziyi2/docker-home/redis:/data \
    -p 6379:6379 redis
3 Code implementation
Complete code:
https://github.com/ziyifast/ziyifast-code_instruction/tree/main/blond_filter
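3.1 Approach
Idea:
- First set up Iris + Postgres, then put Redis in front of the database as a cache
- Then add a Bloom filter in front of Redis. The resulting flow:
Request -> Bloom filter -> Redis -> Postgres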
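Code structure (files used in this article):
blond_filter/
    controller/player_controller.go
    dao/player_dao.go
    model/player.go
    pg/pg.go
    redis/redis.go
    service/player_service.go
    util/check_blond_util.go
    util/player_cache.go
    main.go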
3.2 Iris+Redis+Postgres
Note: parts of the sample code are not production-grade; it is mainly meant for demonstration.
①blond_filter/pg/pg.go
package pg

import (
	"fmt"
	"time"

	_ "github.com/lib/pq" // registers the "postgres" driver
	"github.com/ziyifast/log"
	"xorm.io/xorm"
)

const (
	host     = "localhost"
	port     = 5432
	user     = "postgres"
	password = "postgres"
	dbName   = "postgres"
)

// Engine is the shared xorm engine used by the DAO layer.
var Engine *xorm.Engine

func init() {
	psqlInfo := fmt.Sprintf("host=%s port=%d user=%s password=%s dbname=%s sslmode=disable", host, port, user, password, dbName)
	engine, err := xorm.NewEngine("postgres", psqlInfo)
	if err != nil {
		log.Fatal(err)
	}
	engine.ShowSQL(true)
	engine.SetMaxIdleConns(10)
	engine.SetMaxOpenConns(20)
	engine.SetConnMaxLifetime(time.Minute * 10)
	engine.Cascade(true)
	if err = engine.Ping(); err != nil {
		log.Fatalf("%v", err)
	}
	Engine = engine
	log.Infof("connect postgresql success")
}
②blond_filter/redis/redis.go
package redis

import "github.com/go-redis/redis"

var (
	Client       *redis.Client
	PlayerPrefix = "player:"
)

func init() {
	Client = redis.NewClient(&redis.Options{
		Addr:     "127.0.0.1:6379",
		Password: "", // no password set
		DB:       0,  // use default DB
	})
}
③blond_filter/model/player.go
package model

type Player struct {
	Id   int64  `xorm:"id" json:"id"`
	Name string `xorm:"name" json:"name"`
	Age  int    `xorm:"age" json:"age"`
}

func (p *Player) TableName() string {
	return "player"
}
④blond_filter/dao/player_dao.go
package dao

import (
	"time"

	"github.com/aobco/log"
	"myTest/demo_home/blond_filter/model"
	"myTest/demo_home/blond_filter/pg"
)

type playerDao struct {
}

var PlayerDao = new(playerDao)

func (p *playerDao) InsertOne(player model.Player) (int64, error) {
	return pg.Engine.InsertOne(player)
}

// GetById returns (nil, nil) when no row matches the id.
func (p *playerDao) GetById(id int64) (*model.Player, error) {
	log.Infof("query postgres,time:%v", time.Now().String())
	player := new(model.Player)
	get, err := pg.Engine.Where("id=?", id).Get(player)
	if err != nil {
		log.Errorf("%v", err)
		return nil, err // do not swallow query errors
	}
	if !get {
		return nil, nil
	}
	return player, nil
}
⑤blond_filter/service/player_service.go
package service

import (
	"github.com/ziyifast/log"
	"myTest/demo_home/blond_filter/dao"
	"myTest/demo_home/blond_filter/model"
	"myTest/demo_home/blond_filter/util"
)

type playerService struct {
}

var PlayerService = new(playerService)

func (s *playerService) FindById(id int64) (*model.Player, error) {
	// query bloom filter (disabled in this first version)
	//if !util.CheckExist(id) {
	//	return nil, nil
	//}
	// query redis
	player, err := util.PlayerCache.GetById(id)
	if err != nil {
		return nil, err
	}
	if player != nil {
		return player, nil
	}
	// query db and cache result
	p, err := dao.PlayerDao.GetById(id)
	if err != nil {
		log.Errorf("%v", err)
		return nil, err
	}
	if p != nil {
		// a failed cache write is logged but does not fail the request
		if err = util.PlayerCache.Put(p); err != nil {
			log.Errorf("%v", err)
		}
	}
	return p, nil
}
⑥blond_filter/controller/player_controller.go
package controller

import (
	"encoding/json"
	"net/http"
	"strconv"

	"github.com/kataras/iris/v12"
	"github.com/kataras/iris/v12/mvc"
	"myTest/demo_home/blond_filter/service"
)

type PlayerController struct {
	Ctx iris.Context
}

func (p *PlayerController) BeforeActivation(b mvc.BeforeActivation) {
	b.Handle("GET", "/find/{id}", "FindById")
}

func (p *PlayerController) FindById() mvc.Result {
	defer p.Ctx.Next()
	pId := p.Ctx.Params().Get("id")
	id, err := strconv.ParseInt(pId, 10, 64)
	if err != nil {
		return mvc.Response{
			Code:        http.StatusBadRequest,
			Content:     []byte(err.Error()),
			ContentType: "application/json",
		}
	}
	player, err := service.PlayerService.FindById(id)
	if err != nil {
		return mvc.Response{
			Code:        http.StatusInternalServerError,
			Content:     []byte(err.Error()),
			ContentType: "application/json",
		}
	}
	marshal, err := json.Marshal(player)
	if err != nil {
		return mvc.Response{
			Code:        http.StatusInternalServerError,
			Content:     []byte(err.Error()),
			ContentType: "application/json",
		}
	}
	return mvc.Response{
		Code:        http.StatusOK,
		Content:     marshal,
		ContentType: "application/json",
	}
}
⑦blond_filter/util/player_cache.go
Redis cache module
package util

import (
	"encoding/json"
	"strconv"
	"time"

	"github.com/go-redis/redis"
	"github.com/ziyifast/log"
	"myTest/demo_home/blond_filter/model"
	redis2 "myTest/demo_home/blond_filter/redis"
)

type playerCache struct {
}

var (
	PlayerCache = new(playerCache)
	PlayerKey   = "player"
)

// GetById returns (nil, nil) on a cache miss.
func (c *playerCache) GetById(id int64) (*model.Player, error) {
	log.Infof("query redis,time:%v", time.Now().String())
	result, err := redis2.Client.HGet(PlayerKey, strconv.FormatInt(id, 10)).Result()
	if err != nil && err != redis.Nil {
		log.Errorf("%v", err)
		return nil, err
	}
	if result == "" {
		return nil, nil
	}
	p := new(model.Player)
	err = json.Unmarshal([]byte(result), p)
	if err != nil {
		log.Errorf("%v", err)
		return nil, err
	}
	return p, nil
}

// Put stores the player as JSON in the "player" hash, keyed by id.
func (c *playerCache) Put(player *model.Player) error {
	marshal, err := json.Marshal(player)
	if err != nil {
		log.Errorf("%v", err)
		return err
	}
	_, err = redis2.Client.HSet(PlayerKey, strconv.FormatInt(player.Id, 10), string(marshal)).Result()
	if err != nil {
		log.Errorf("%v", err)
		return err
	}
	return nil
}
⑧blond_filter/main.go
package main

import (
	"github.com/kataras/iris/v12"
	"github.com/kataras/iris/v12/mvc"
	"myTest/demo_home/blond_filter/controller"
)

func main() {
	//pg.Engine.Sync(new(model.Player))
	app := iris.New()
	pMvc := mvc.New(app.Party("player"))
	pMvc.Handle(new(controller.PlayerController))
	//util.InitBlondFilter()
	app.Listen(":9999", nil)
}
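Demo
When a request arrives we first query Redis, and only if Redis has no entry do we query Postgres. But if an attacker keeps requesting data that does not exist at all, Redis never has a hit and every one of those requests falls through to Postgres.
- This drives the Postgres load too high
- Request a player that does not exist
- Check the logs: the request misses Redis and then queries Postgres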
3.3 Adding the Bloom filter (implemented with a Redis bitmap)
Add a Bloom filter and place it in front of Redis.
- Request flow: Request -> Bloom filter -> Redis -> database
①blond_filter/util/check_blond_util.go
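Implement a simple hashCode.
- To reduce the impact of hash collisions, an element can be mapped through several hash functions, for example mapping player:1982 to several offsets and requiring all of them to be set on lookup. A single hash function may collide fairly often; several different hashes colliding at once is much less likely. For simplicity, the demo code here uses a single hash function.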
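package util

import (
	"fmt"

	"github.com/ziyifast/log"
	"myTest/demo_home/blond_filter/redis"
)

// base is the number of bits in the bitmap. Redis bitmap offsets must be
// smaller than 2^32.
const base = 1 << 32

// bloomKey is the single Redis key holding the bitmap shared by all players.
// Using one key per player would give every player a separate bitmap.
const bloomKey = "player:bloom"

// InitBlondFilter preloads the bloom filter. The demo only adds the player
// with id 1; a real service would add every existing player id.
func InitBlondFilter() {
	key := fmt.Sprintf("%s%d", redis.PlayerPrefix, 1)
	_, err := redis.Client.SetBit(bloomKey, getOffset(key), 1).Result()
	if err != nil {
		panic(err)
	}
}

// getOffset maps a key to a bit offset in [0, base).
func getOffset(key string) int64 {
	hashCode := int64(getHashCode(key))
	if hashCode < 0 {
		hashCode = -hashCode
	}
	return hashCode % base
}

// getHashCode is a simple 31-multiplier string hash; int32 overflow wraps around.
func getHashCode(str string) int {
	var hash int32 = 17
	for i := 0; i < len(str); i++ {
		hash = hash*31 + int32(str[i])
	}
	return int(hash)
}

// CheckExist reports whether the player id may exist (false means definitely not).
func CheckExist(id int64) bool {
	key := fmt.Sprintf("%s%d", redis.PlayerPrefix, id)
	res, err := redis.Client.GetBit(bloomKey, getOffset(key)).Result()
	if err != nil {
		log.Errorf("%v", err)
		// A filter must never report an existing id as absent, so on a
		// Redis error let the request continue to the next layer.
		return true
	}
	log.Infof("%v", res)
	return res == 1
}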
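The text above recommends several hash functions, while the demo code uses only one. A possible way to extend it is sketched below: k offsets are derived from one 64-bit FNV-1a hash by double hashing, and every offset is written to (or read from) a single shared bitmap. To keep the sketch runnable without Redis, the SETBIT/GETBIT calls go through a hypothetical `BitStore` interface with an in-memory implementation; the interface, the `memStore` type, the key name `player:bloom`, and the helper names are all assumptions for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// BitStore mirrors the two Redis commands the filter needs (SETBIT and GETBIT).
// It is a hypothetical interface; a thin wrapper around a Redis client could implement it.
type BitStore interface {
	SetBit(key string, offset int64, value int) error
	GetBit(key string, offset int64) (int64, error)
}

// memStore is an in-memory stand-in so the example runs without Redis.
type memStore map[string]map[int64]bool

func (s memStore) SetBit(key string, offset int64, value int) error {
	if s[key] == nil {
		s[key] = map[int64]bool{}
	}
	s[key][offset] = value == 1
	return nil
}

func (s memStore) GetBit(key string, offset int64) (int64, error) {
	if s[key][offset] {
		return 1, nil
	}
	return 0, nil
}

// offsets derives k bit positions in [0, m) from one 64-bit FNV-1a hash
// using double hashing: position_i = (h1 + i*h2) mod m.
func offsets(item string, k int, m uint64) []int64 {
	h := fnv.New64a()
	h.Write([]byte(item))
	sum := h.Sum64()
	h1, h2 := sum>>32, (sum&0xffffffff)|1 // force h2 odd so it is never zero
	out := make([]int64, k)
	for i := range out {
		out[i] = int64((h1 + uint64(i)*h2) % m)
	}
	return out
}

// bloomKey is one shared bitmap for all players (assumed name).
const bloomKey = "player:bloom"

// add sets all k bits for the item.
func add(s BitStore, item string, k int, m uint64) error {
	for _, o := range offsets(item, k, m) {
		if err := s.SetBit(bloomKey, o, 1); err != nil {
			return err
		}
	}
	return nil
}

// mightContain returns false as soon as one of the k bits is unset.
func mightContain(s BitStore, item string, k int, m uint64) (bool, error) {
	for _, o := range offsets(item, k, m) {
		bit, err := s.GetBit(bloomKey, o)
		if err != nil {
			return false, err
		}
		if bit == 0 {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	s := memStore{}
	const k, m = 3, 1 << 32
	if err := add(s, "player:1", k, m); err != nil {
		panic(err)
	}
	ok, _ := mightContain(s, "player:1", k, m)
	fmt.Println(ok) // true: an added item is always reported as possibly present
}
```

With a real Redis client, the k SETBIT or GETBIT commands for one item could be sent in a single pipeline to avoid k round trips.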
②blond_filter/service/player_service.go
Before querying Redis, first check whether the Bloom filter contains the id.
package service

import (
	"github.com/ziyifast/log"
	"myTest/demo_home/blond_filter/dao"
	"myTest/demo_home/blond_filter/model"
	"myTest/demo_home/blond_filter/util"
)

type playerService struct {
}

var PlayerService = new(playerService)

func (s *playerService) FindById(id int64) (*model.Player, error) {
	// query bloom filter
	if !util.CheckExist(id) {
		log.Infof("the player does not exist in the bloom filter, return directly")
		return nil, nil
	}
	// query redis
	player, err := util.PlayerCache.GetById(id)
	if err != nil {
		return nil, err
	}
	if player != nil {
		return player, nil
	}
	// query db and cache result
	p, err := dao.PlayerDao.GetById(id)
	if err != nil {
		log.Errorf("%v", err)
		return nil, err
	}
	if p != nil {
		// a failed cache write is logged but does not fail the request
		if err = util.PlayerCache.Put(p); err != nil {
			log.Errorf("%v", err)
		}
	}
	return p, nil
}
③blond_filter/main.go
package main

import (
	"github.com/kataras/iris/v12"
	"github.com/kataras/iris/v12/mvc"
	"myTest/demo_home/blond_filter/controller"
	"myTest/demo_home/blond_filter/util"
)

func main() {
	//pg.Engine.Sync(new(model.Player))
	app := iris.New()
	pMvc := mvc.New(app.Party("player"))
	pMvc.Handle(new(controller.PlayerController))
	util.InitBlondFilter()
	app.Listen(":9999", nil)
}
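Demo
- Request a player that does not exist
- Check the logs: the request is intercepted by the Bloom filter, so malicious requests never reach Redis or Postgres
When the queried id is present in the Bloom filter, the lookup continues to Redis and Postgres to check whether the data really exists, because hash collisions can produce false positives.
- For example, suppose id=1982 and id=28 produce the same hash value, but only 28 actually exists in Redis. When we look up 1982 by its hash, the bit at that offset is 1, which signals "exists", even though Redis only holds the data for 28. So we have to keep going and ask Redis and Postgres whether 1982 really exists.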
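A collision like this is easy to reproduce with the `getHashCode` function from check_blond_util.go. The ids in this article are integers, so the two string keys below are hypothetical; they are chosen because "Aa" and "BB" are a well-known colliding pair for 31-multiplier hashes.

```go
package main

import "fmt"

// getHashCode is the same 31-multiplier string hash used in check_blond_util.go.
func getHashCode(str string) int {
	var hash int32 = 17
	for i := 0; i < len(str); i++ {
		hash = hash*31 + int32(str[i])
	}
	return int(hash)
}

func main() {
	// Both keys share the prefix "player:", and the suffixes contribute the
	// same amount: 31*'A' + 'a' == 31*'B' + 'B' == 2112.
	a, b := "player:Aa", "player:BB" // hypothetical keys
	fmt.Println(getHashCode(a) == getHashCode(b)) // true
}
```

If only one of the two keys had been added, a single-hash filter would still report the other as possibly present, which is exactly why the request must continue to Redis and Postgres.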