Programming Computer Vision with Python: 7. Image Search

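7. Image Search
- 7.1 Content-based image retrieval (CBIR)
  - Inspiration from text mining: the vector space model (BoW representation)
- 7.2 Visual words
  - Idea
  - Creating a vocabulary
- 7.3 Indexing images
  - 7.3.1 Setting up the database
  - 7.3.2 Adding images
- 7.4 Searching the database for images
  - 7.4.1 Using the index to get candidate images
  - 7.4.2 Querying with an image
  - 7.4.3 Benchmarking and plotting the results
- 7.5 Ranking results using geometry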

7. Image Search

This chapter uses techniques from text mining to search for images based on their visual content.

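7.1 Content-based image retrieval (CBIR)

CBIR retrieves images that are visually similar: similar in color, similar in texture, or showing similar objects or scenes.

Comparing the query image in full against every image in the database (feature matching) is usually not feasible, because with a large database it takes far too long.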

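Inspiration from text mining: the vector space model (BoW representation)

The vector space model is a model for representing and searching text documents. Its vectors are word-frequency histograms of the text; in other words, each vector holds the number of times every word occurs.

A document is indexed by counting its words to build the document histogram vector v. Very common words such as "this", "and" and "is" are usually left out of the count; these are called stop words. Because documents differ in length, the vector is normalized to unit length by dividing by the histogram sum. Each element of the histogram vector is then usually weighted by the importance of its word: the importance of a word grows in proportion to how often it occurs in the document and shrinks with how often it occurs in the corpus (the data set).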

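The term frequency of word w in document d is
${\rm tf}_{w,d}=\frac{\text{number of occurrences of } w \text{ in } d}{\text{total number of words in } d}=\frac{n_w}{\sum_j n_j}$
and the inverse document frequency is
${\rm idf}_{w,d}=\log\frac{\text{number of documents in corpus } D}{\text{number of documents containing } w}=\log\frac{|D|}{|\{d:w\in d\}|}$
Multiplying the two gives the tf-idf weight of word w in the histogram vector of document d.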
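
As a rough illustration of the two formulas, the sketch below computes tf and idf on a toy corpus and multiplies them into a tf-idf weight. The corpus and the helper name `tf_idf` are made up for this example, and only the standard library is used.

```python
import math
from collections import Counter

def tf_idf(corpus):
    """Return one {word: weight} dict per document (hypothetical helper)."""
    n_docs = len(corpus)
    # document frequency: number of documents that contain each word
    df = Counter(word for doc in corpus for word in set(doc))
    weights = []
    for doc in corpus:
        counts = Counter(doc)            # n_w for every word in this document
        total = sum(counts.values())     # sum_j n_j
        weights.append({w: (n / total) * math.log(n_docs / df[w])
                        for w, n in counts.items()})
    return weights

# toy corpus: each document is a list of words
corpus = [["cat", "sat", "mat"], ["cat", "ate", "fish"], ["dog", "sat"]]
weights = tf_idf(corpus)
```

A word that appears in every document has idf = log(1) = 0, so it contributes nothing to the weighted vector, which is the effect the weighting is meant to have.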

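7.2 Visual words

To apply text-mining techniques to images, we first need a visual equivalent of words. This is usually built from local descriptors such as SIFT.

Idea
1. Feature extraction:
  - Extract local features (SIFT) from the images.
2. Feature clustering:
  - Group the local feature descriptors into visual words with a clustering algorithm such as k-means. Each cluster center represents one visual word.
3. Feature representation:
  - Map every local feature of an image to its nearest visual word, which gives the visual-word histogram of the image.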
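
Step 3 can be sketched in a few lines. The example assumes the vocabulary (the cluster centers from step 2) is already given and uses made-up 2-D points in place of 128-dimensional SIFT descriptors; `nearest_word` and `word_histogram` are hypothetical helper names.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to descriptor (squared Euclidean)."""
    dists = [sum((a - b) ** 2 for a, b in zip(descriptor, word))
             for word in vocabulary]
    return dists.index(min(dists))

def word_histogram(descriptors, vocabulary):
    """Count how many descriptors fall on each visual word."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    return hist

# toy 2-D "descriptors"; real SIFT descriptors have 128 dimensions
vocabulary = [(0.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
descriptors = [(1.0, 0.5), (9.0, 11.0), (0.2, 0.1), (1.0, 9.0)]
hist = word_histogram(descriptors, vocabulary)
```

Here two descriptors land on word 0 and one each on words 1 and 2, so the image is represented by the histogram [2, 1, 1].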

Creating a vocabulary

The visual-word vocabulary is created from SIFT feature descriptors.

import os
import pickle

import numpy as np
from PIL import Image
from scipy.cluster.vq import kmeans, vq

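The following two functions are needed. The first converts an image to a grayscale PGM file if necessary, runs the external sift.exe binary on it and writes the features to a file; the second lists the JPEG images in a directory.

def process_image(imagename, resultname, params="--edge-thresh 10 --peak-thresh 5"):
    """Run sift.exe on an image and save the features to resultname."""
    if imagename[-3:] != 'pgm':
        # convert to a grayscale PGM file before calling sift.exe
        im = Image.open(imagename).convert('L')
        im.save('tmp.pgm')
        imagename = 'tmp.pgm'
    # raw string so that the backslash in the path is kept literally
    cmmd = r".\sift.exe " + imagename + " --output=" + resultname + " " + params
    os.system(cmmd)
    print('processed', imagename, 'to', resultname)

def get_imlist(path):
    """Return a list of all JPEG images in a directory."""
    return [os.path.join(path, f) for f in os.listdir(path) if f.endswith('.jpg')]

imlist = get_imlist(r'.\first1000')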

nbr_images=len(imlist)
featlist=[imlist[i][:-3]+'sift' for i in range(nbr_images)]

for i in range(nbr_images):
    process_image(imlist[i],featlist[i])

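Next we create a class named Vocabulary with two methods, train and project.
train: reads the feature descriptors from the feature files, runs k-means clustering to generate the visual words, and computes the visual-word histogram of every image. It then computes the IDF of each visual word and stores the paths of the training files.
- Steps:
  1. Read the features:
    - Read the descriptors from the first feature file and store them in descriptors.
    - Read the remaining feature files one by one and stack their descriptors vertically onto descriptors.
  2. Train the visual words:
    - Cluster the descriptors with kmeans to obtain the visual words.
    - descriptors[::subsampling, :] subsamples the descriptors to reduce the computation time.
    - self.voc stores the visual words and distortion stores the clustering distortion.
  3. Compute the visual-word histograms:
    - Compute the visual-word histogram of each image.
    - Count the number of images in which each visual word occurs and compute the IDF values.
  4. Store the training data:
    - Save the feature file paths in self.trainingdata.
project: computes the visual-word histogram of a single image.
- Steps:
  - Map the descriptors to visual words:
    - Use vq (vector quantization) to map each descriptor to its nearest visual word.
    - words holds the visual-word index of every descriptor.
  - Build the visual-word histogram:
    - Loop over words and count the occurrences of each visual word.
    - Return the visual-word histogram imhist of the image.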

# read_features_from_file(filename) is assumed to be the book's SIFT helper
# that returns (locations, descriptors) for a feature file.

class Vocabulary(object):
    def __init__(self, name):
        self.name = name        # name of the vocabulary
        self.voc = []           # visual words (cluster centers)
        self.idf = []           # inverse document frequency of each visual word
        self.trainingdata = []  # list of feature files used for training
        self.nbr_words = 0      # number of visual words

    def train(self, featurefiles, k=100, subsampling=10):
        """Train a vocabulary from the features in featurefiles.

        featurefiles: list of feature file paths, one file per image
        k: number of visual words, 100 by default
        subsampling: keep every subsampling-th descriptor for k-means, 10 by default
        """
        nbr_images = len(featurefiles)
        # read the descriptors of every image and stack them vertically
        descr = []
        descr.append(read_features_from_file(featurefiles[0])[1])
        descriptors = descr[0]
        for i in range(1, nbr_images):
            descr.append(read_features_from_file(featurefiles[i])[1])
            descriptors = np.vstack((descriptors, descr[i]))

        # k-means on a subsample of the descriptors: self.voc holds the
        # visual words, distortion the clustering distortion
        self.voc, distortion = kmeans(descriptors[::subsampling, :], k, 1)
        self.nbr_words = self.voc.shape[0]

        # one visual-word histogram per image, one image per row
        imwords = np.zeros((nbr_images, self.nbr_words))
        for i in range(nbr_images):
            imwords[i] = self.project(descr[i])

        # number of images in which each visual word occurs
        nbr_occurences = np.sum((imwords > 0) * 1, axis=0)
        # IDF down-weights words that occur in many images; +1 avoids division by zero
        self.idf = np.log((1.0 * nbr_images) / (1.0 * nbr_occurences + 1))
        self.trainingdata = featurefiles  # keep the training file list for later use

    def project(self, descriptors):
        """Project descriptors onto the vocabulary to create a word histogram."""
        imhist = np.zeros((self.nbr_words))
        # vq assigns each descriptor the index of its nearest visual word
        words, distance = vq(descriptors, self.voc)
        for w in words:
            imhist[w] += 1
        return imhist

voc=Vocabulary('ukbenchtest')
voc.train(featlist,1000,10) # train on the feature files in featlist with 1000 visual words and a subsampling factor of 10

with open('vocabulary.pkl','wb') as f:
    pickle.dump(voc,f)
print('vocabulary is:',voc.name,voc.nbr_words)

7.3 Indexing images

7.3.1 Setting up the database

from numpy import *
import pickle
import sqlite3 as sqlite

class Indexer(object):
    
    def __init__(self,db,voc):
        """ Initialize with the name of the database 
            and a vocabulary object. """
            
        self.con = sqlite.connect(db)  # connection to the SQLite database file db
        self.voc = voc  # vocabulary object used to project descriptors later on
    
    def __del__(self):
        self.con.close()  # close the database connection when the object is destroyed
    
    def db_commit(self):
        self.con.commit()  # commit the current transaction so that all changes are saved
    
    def create_tables(self): 
        """ Create the database tables. """
        
        # imlist: the filenames of all indexed images
        self.con.execute('create table imlist(filename)')
        # imwords: which visual words (wordid) occur in which image (imid), per vocabulary
        self.con.execute('create table imwords(imid,wordid,vocname)')
        # imhistograms: the full visual-word histogram of each image
        self.con.execute('create table imhistograms(imid,histogram,vocname)')
        # indexes to speed up lookups by filename, word id and image id
        self.con.execute('create index im_idx on imlist(filename)')
        self.con.execute('create index wordid_idx on imwords(wordid)')
        self.con.execute('create index imid_idx on imwords(imid)')
        self.con.execute('create index imidhist_idx on imhistograms(imid)')

7.3.2 Adding images

To add images to the index, the Indexer class also needs the following methods.

    def add_to_index(self,imname,descr):
        """ Add an image and its feature descriptors to the database. """
            
        if self.is_indexed(imname):
            return  # already in the database, do not index it twice
        print('indexing', imname)
        
        # get the image id
        imid = self.get_id(imname)
        
        # project the descriptors onto the vocabulary to get the word histogram
        imwords = self.voc.project(descr)
        
        # link each visual word that occurs in the image to the image;
        # wordid is the word number itself (the index into the histogram)
        for i in imwords.nonzero()[0]:
            self.con.execute("insert into imwords(imid,wordid,vocname) values (?,?,?)", (imid,int(i),self.voc.name))
            
        # store the word histogram; pickle.dumps serializes the NumPy array to bytes
        self.con.execute("insert into imhistograms(imid,histogram,vocname) values (?,?,?)", (imid,pickle.dumps(imwords),self.voc.name))

    def is_indexed(self,imname):
        """ Return True if imname has already been indexed. """
        
        # fetchone() returns the first matching row, or None if there is no match
        im = self.con.execute(
            "select rowid from imlist where filename=?", (imname,)).fetchone()
        return im is not None

    def get_id(self,imname):
        """ Get the id of an image; if it is not in the database yet, add it and return the new id. """
        
        # look up the rowid of the entry in imlist that matches imname
        cur = self.con.execute(
            "select rowid from imlist where filename=?", (imname,))
        res = cur.fetchone()
        if res is None:
            cur = self.con.execute(
                "insert into imlist(filename) values (?)", (imname,))
            return cur.lastrowid
        else:
            return res[0]
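
To see how the three tables work together, here is a minimal, self-contained sketch on an in-memory database. It assumes that wordid stores the index of each visual word that occurs in the image and uses plain Python lists instead of NumPy histograms; `add_image` is a hypothetical helper, not part of the Indexer class.

```python
import pickle
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table imlist(filename)")
con.execute("create table imwords(imid,wordid,vocname)")
con.execute("create table imhistograms(imid,histogram,vocname)")

def add_image(con, filename, hist, vocname="toy"):
    """Store one image and its word histogram (hypothetical helper)."""
    imid = con.execute("insert into imlist(filename) values (?)",
                       (filename,)).lastrowid
    # one imwords row per visual word that actually occurs in the image
    for wordid, count in enumerate(hist):
        if count > 0:
            con.execute("insert into imwords(imid,wordid,vocname) values (?,?,?)",
                        (imid, wordid, vocname))
    # the full histogram is pickled to bytes and stored as a blob
    con.execute("insert into imhistograms(imid,histogram,vocname) values (?,?,?)",
                (imid, pickle.dumps(hist), vocname))
    return imid

first = add_image(con, "a.jpg", [2, 0, 1])
second = add_image(con, "b.jpg", [0, 3, 1])

# ids of the images that contain visual word 2
ids = [r[0] for r in con.execute(
    "select distinct imid from imwords where wordid=? order by imid", (2,))]
# read a histogram back and unpickle it
stored = pickle.loads(con.execute(
    "select histogram from imhistograms where imid=?", (first,)).fetchone()[0])
```

The `?` placeholders let sqlite3 handle quoting, which matters for filenames that contain quotes or backslashes.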

Next we loop over all the sample images and add them to the index.

nbr_images=len(imlist)
with open('vocabulary.pkl','rb') as f:
    voc=pickle.load(f)

indx=Indexer('test.db',voc)
indx.create_tables()

for i in range(nbr_images)[:1000]:
    locs,descr=read_features_from_file(featlist[i])
    indx.add_to_index(imlist[i],descr)

indx.db_commit()

con=sqlite.connect('test.db')
print(con.execute('select count (filename) from imlist').fetchone())
print(con.execute('select * from imlist').fetchone())

7.4 Searching the database for images

To implement the search, we create a Searcher class.

class Searcher(object):
    
    def __init__(self,db,voc):
        """ Initialize with the name of the database. """
        self.con = sqlite.connect(db)
        self.voc = voc
    
    def __del__(self):
        self.con.close()

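7.4.1 Using the index to get candidate images

We use the index built above to find all images that contain a particular visual word, so we add the candidates_from_word method to the Searcher class.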

    def candidates_from_word(self,imword):
        """ Get list of images containing imword. """
        
        im_ids = self.con.execute(
            "select distinct imid from imwords where wordid=%d" % imword).fetchall()
        return [i[0] for i in im_ids]

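To get candidates for a whole histogram, we merge the candidate lists of all its words and keep track of how many times each image id occurs in the merged list.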

    def candidates_from_histogram(self,imwords):
        """ Get list of images with similar words. """
        
        # get the word ids
        words = imwords.nonzero()[0]
        
        # find candidates
        candidates = []
        for word in words:
            c = self.candidates_from_word(word)
            candidates+=c
        
        # take all unique image ids and sort them by number of occurrences, highest first
        tmp = [(w,candidates.count(w)) for w in set(candidates)]
        tmp.sort(key=lambda x: x[1], reverse=True)
        
        # return sorted list, best matches first    
        return [w[0] for w in tmp]

src=Searcher('test.db',voc)
locs,descr=read_features_from_file(featlist[0])
iw=voc.project(descr)

print('ask using a histogram...')
print(src.candidates_from_histogram(iw)[:10])
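
The voting step can also be written with collections.Counter, which avoids rescanning the whole candidate list for every unique image id. This is only a sketch of the idea: the `postings` dictionary stands in for the imwords table, and `rank_candidates` is a hypothetical name.

```python
from collections import Counter

def rank_candidates(postings, query_words):
    """postings: {wordid: [imid, ...]}; query_words: word ids present in the query."""
    votes = Counter()
    for w in query_words:
        # every image that contains word w gets one vote
        votes.update(postings.get(w, []))
    # most_common() sorts by vote count, highest first
    return [imid for imid, _ in votes.most_common()]

# toy inverted index: word id -> ids of the images containing that word
postings = {0: [1, 2], 1: [2], 2: [2, 3], 3: [1]}
ranked = rank_candidates(postings, [0, 1, 2])
```

Image 2 shares all three query words and therefore comes first; images 1 and 3 share one word each.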

7.4.2 Querying with an image

Add the get_imhistogram method to the Searcher class.

    def get_imhistogram(self,imname):
        """ Return the word histogram for an image. """
        im_id = self.con.execute(
            "select rowid from imlist where filename='%s'" % imname).fetchone()
        s = self.con.execute(
            "select histogram from imhistograms where rowid='%d'" % im_id).fetchone()
        # use pickle to decode the NumPy array from the stored bytes
        return pickle.loads(s[0])

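Add the query method to the Searcher class. It fetches the candidates for the query histogram and ranks them by the idf-weighted Euclidean distance between their histograms and the query histogram.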

    def query(self,imname):
        """ Find a list of matching images for imname. """
        
        h = self.get_imhistogram(imname)
        candidates = self.candidates_from_histogram(h)
        
        matchscores = []
        for imid in candidates:
            # get the name
            cand_name = self.con.execute(
                "select filename from imlist where rowid=%d" % imid).fetchone()
            cand_h = self.get_imhistogram(cand_name)
            cand_dist = sqrt( sum( self.voc.idf*(h-cand_h)**2 ) )
            matchscores.append( (cand_dist,imid) )
        
        # return a sorted list of distances and database ids
        matchscores.sort()
        return matchscores

src=Searcher('test.db',voc)
print('try a query...')
print(src.query(imlist[0])[:10])
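
The distance used in query weights each squared histogram difference by the idf of the corresponding word. The following sketch reproduces that formula on plain lists with made-up numbers; `weighted_distance` is a hypothetical helper name.

```python
import math

def weighted_distance(h1, h2, idf):
    """sqrt(sum_i idf_i * (h1_i - h2_i)^2) for two word histograms."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(h1, h2, idf)))

idf = [0.1, 2.0]      # word 0 is common, word 1 is rare (made-up values)
query = [5, 1]
cand_a = [2, 1]       # differs from the query only in the common word
cand_b = [5, 3]       # differs from the query only in the rare word
d_a = weighted_distance(query, cand_a, idf)
d_b = weighted_distance(query, cand_b, idf)
```

Candidate A ends up closer than candidate B although its raw count difference is larger, because differences in common words count for less. Note that with the idf formula used in train, log(N / (n + 1)), a word that occurs in every image gets a slightly negative weight, so it may be worth clamping the weights at zero before taking the square root.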

7.4.3 Benchmarking and plotting the results

def compute_ukbench_score(src,imlist):
    """ Returns the average number of correct
        images on the top four results of queries. """
        
    nbr_images = len(imlist)
    pos = zeros((nbr_images,4))
    # get first four results for each image
    for i in range(nbr_images):
        pos[i] = [w[1]-1 for w in src.query(imlist[i])[:4]]
    
    # compute score and return average
    score = array([ (pos[i]//4)==(i//4) for i in range(nbr_images)])*1.0
    return sum(score) / (nbr_images)

compute_ukbench_score(src,imlist)
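
The score counts, for each query, how many of the top four results belong to the same group of four as the query, and averages this over all queries. As the integer division by 4 implies, this assumes that the images are stored in consecutive groups of four showing the same object. A toy version on hand-written result lists (all values made up, `ukbench_score` is a hypothetical name):

```python
def ukbench_score(results, group_size=4):
    """results[i]: 0-based image indices of the top hits for query image i."""
    total = 0
    for i, hits in enumerate(results):
        # a hit is correct if it falls into the same group of four as the query
        total += sum(1 for j in hits[:group_size]
                     if j // group_size == i // group_size)
    return total / len(results)

results = [
    [0, 1, 2, 3],   # all four correct
    [1, 0, 5, 2],   # three correct
    [2, 3, 6, 7],   # two correct
    [3, 4, 5, 6],   # one correct
]
score = ukbench_score(results)
```

The best possible score is 4, and since the query image itself is part of the database, a working search should score at least 1.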

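Define the plot_results function. It uses matplotlib for display and calls src.get_filename, which must return the filename for a given image id; that method is not among the Searcher methods listed above.

from matplotlib.pyplot import figure, subplot, imshow, axis, show

def plot_results(src,res):
    """ Show images in result list 'res'. """
    
    figure()
    nbr_results = len(res)
    for i in range(nbr_results):
        imname = src.get_filename(res[i])
        subplot(1,nbr_results,i+1)
        imshow(array(Image.open(imname)))
        axis('off')
    show()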

src=Searcher('test.db',voc)
nbr_results=6
res=[w[1] for w in src.query(imlist[0])[:nbr_results]]
plot_results(src,res)
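
plot_results depends on a get_filename lookup that maps an image id back to its filename. A minimal sketch of such a lookup, assuming the imlist table defined earlier and written as a standalone function on an in-memory database so that it can be run on its own:

```python
import sqlite3

def get_filename(con, imid):
    """Return the filename stored for rowid imid, or None if there is none."""
    row = con.execute("select filename from imlist where rowid=?",
                      (imid,)).fetchone()
    return row[0] if row else None

# tiny in-memory database with a single image entry
con = sqlite3.connect(":memory:")
con.execute("create table imlist(filename)")
con.execute("insert into imlist(filename) values (?)", ("ukbench00000.jpg",))
name = get_filename(con, 1)
```

Inside the Searcher class the same query would be a method that takes self and uses self.con instead of a con argument.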

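7.5 Ranking results using geometry

The top results of a regular query are re-ranked by fitting a homography between the query image and each candidate with RANSAC and sorting the candidates by their number of inliers.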

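# make_homog, RansacModel and H_from_ransac (homography) and match
# (SIFT descriptor matching) are assumed to be available from the book's modules

# load the image list
imlist=get_imlist(r'.\first1000')
nbr_images = len(imlist)
# load the feature file list
featlist = [imlist[i][:-3]+'sift' for i in range(nbr_images)]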

with open('vocabulary.pkl','rb') as f:
    voc=pickle.load(f)

src=Searcher('test.db',voc)

q_ind=50
nbr_results=20

res_reg=[w[1] for w in src.query(imlist[q_ind])[:nbr_results]]
print('top matches (regular):',res_reg)

q_locs,q_descr=read_features_from_file(featlist[q_ind])
fp=make_homog(q_locs[:,:2].T)

model=RansacModel()

rank={}
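for ndx in res_reg[1:]:
    # ndx is a database rowid, which starts at 1, while featlist is 0-based
    locs,descr=read_features_from_file(featlist[ndx-1])
    
    # match the query descriptors against the candidate descriptors
    matches=match(q_descr,descr)
    ind=matches.nonzero()[0]
    ind2=matches[ind]
    tp=make_homog(locs[:,:2].T)
    
    # fit a homography with RANSAC; count no inliers if the fit fails
    try:
        H,inliers=H_from_ransac(fp[:,ind],tp[:,ind2],model,match_theshold=4)
    except Exception:
        inliers=[]
    
    # store the inlier count as the geometric score of this candidate
    rank[ndx]=len(inliers)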

sorted_rank=sorted(rank.items(),key=lambda t:t[1],reverse=True)
res_geom=[res_reg[0]]+[s[0] for s in sorted_rank]
print('top matches (homography):',res_geom)

plot_results(src,res_reg[:8])
plot_results(src,res_geom[:8])
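
The re-ranking itself is just a sort on the inlier counts, with the query image kept in first place. A small sketch with made-up ids and counts (`rerank` is a hypothetical helper name):

```python
def rerank(res_reg, inlier_counts):
    """Keep the first entry (the query itself) and sort the rest by inlier count."""
    rest = sorted(res_reg[1:], key=lambda i: inlier_counts.get(i, 0), reverse=True)
    return [res_reg[0]] + rest

res_reg = [51, 12, 7, 90]              # regular ranking, query image first
inlier_counts = {12: 4, 7: 35, 90: 0}  # RANSAC inliers per candidate (made up)
res_geom = rerank(res_reg, inlier_counts)
```

Image 7 moves ahead of image 12 because its homography is supported by many more matches, while image 90, with no inliers, stays last.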