图神经网络：(处理点云)PPFNet的实现

news2025/10/21 16:29:29

文章说明：
1)参考资料：PYG官方文档。超链。
2)博主水平不高，如有错误还望批评指正。
3)我在百度网盘上传了这篇文章的jupyter notebook和有关文献。超链。提取码8848。

文章目录

- 前言
- 文献阅读
- 代码实操
- 历史遗留问题

前言

本篇文章接上一篇文章，超链，PointNet++无法解决旋转变化，所以我们使用PPFNet解决问题。

文献阅读

重要！重要！重要！我的理解不一定对！水平低请见谅！谨慎食用！
参考文献： PPFNet: Global Context Aware Local Features for Robust 3D Point Matching
文章概述： 参考文献提出Point Pair Feature NETwork模型，模型在点云中能够感知全局并且能够描述局部。模型受到PointNet的启发(PointNet++也是受到PointNet启发)，采用新颖的N-tuple损失以及框架将全局信息嵌入于局部，具有置换不变性质，高召回率高鲁棒性。乃是3D点云提取特征重要工作文献。
理论概述：
11. 考虑两个点的集合 $X$ 以及 $Y$ 。 $x_{i},y_{i}$ 分别表示第 $i$ 个点坐标。假设它们存在某种关系并且是双射的，按照这篇The 3d-3d registration problem revisited的方法并且假定是刚性的。它们之间的关系可以分别表示为排列矩阵刚性变换。所以点集配准的L2可表示为： $d(X,Y|R,t,P)=\frac{1}{n}\sum_{i=1}^n||x_i-Ry_{i(P)}-t||^2$ 。符号说明： $P$ 指排列矩阵， $T$ 指刚性变换， $T=\{{R\in SO(3)},t \in R^3\}$ 。这里还是把L2理解为距离吧。
1). 排列矩阵是什么呢？如下一个矩阵便是一个44排列矩阵。表示集合{2,1,4,3}。第n行第m列为1，n在第m个的位置。
$\begin{bmatrix}0&1&0&0\\1&0&0&0\\0&0&0&1\\0&0&1&0\end{bmatrix}$
2). 刚性变换是什么呢？在几何学之中表示，使物体的大小形状不变变换。例如，平移，旋转，镜像。
12. 如果两个点集是数量相同的上述公式简化为 $d(X,Y|T,P)=\frac{1}{n}||X-PYT^\mathcal{T}||^2$ 。符号说明： $\mathcal{T}$ 是转置。如果两个点集匹配，那么有 $d(X,Y|T,P)\approx0$ 。得出结论：模型进行有效学习应该保持嵌入空间相似距离并且让 $d (X, Y ∣ T, P)$ 尽量为0。 $d_f(X,Y|T,P)=\frac{1}{n}||f(X)-f(PYT^\mathcal{T})||^2$ 为保持不变性(即让它约为0)， $f$ 应该对排列矩阵刚性变换极不敏感。但原文却说对刚性变换敏感。原文是“Ideally we would like to learn f being invariant to permutations P and as intolerant as possible to rigid transformations T”。还请大佬能够确认。
2. PPF stands for Point Pair Feature。我们对Point Pair Feature进行描述，它是一个4维的描述子。 $p_j−p_i∥^2,∠(n_i,p_j−p_i),∠(n_j,p_j−p_i),∠(n_i,n_j))$ 符号说明： $P_i$ 是点集 $X_1$ 第 $i$ 个点的特征向量， $P_j$ 是点集 $X_2$ 第 $j$ 个点的特征向量。 $n$ 指法线。 $||\cdot||$ 维欧几里得空间。 $∠(v_1,v_2) = atan2(||v_1\times v_2||,v_1 \cdot v_2)$ 。
3. 这篇文章受到PointNet启发，鉴于篇幅假定你已经彻底搞懂了PointNet并且知道PointNet能够有效聚合全局信息但是同时难以捕捉局部信息。
流程概述：
1.局部区域点云编码。 先看看图因为图很好理解的。 $F_r$ 为输入PPFNet的数据。
在这里插入图片描述
2.PPFNet的结构。 不难理解不再赘述。

3.N-tuple loss。 这是一个损失函数。如有兴趣建议自行阅读论文这一部分。博主没看因为我受不了。好了给我速速结束这一部分。

代码实操

导库

import torch
from torch.nn import Sequential,Linear,ReLU
from torch_geometric.nn import PPFConv
from torch_cluster import knn_graph
from torch_geometric.nn import global_max_pool

定义PPFNet框架
注意！注意！注意！注意！搭建PPFConv时输入维度必须加4，为什么呢?因为上面理论已经说了。输出维度还是不管你随意吧。

class PPFNet(torch.nn.Module):
    
    def __init__(self):
        super().__init__()
        mlp1=Sequential(Linear(4,20),ReLU(),Linear(20,24))
        self.conv1=PPFConv(mlp1)
        mlp2=Sequential(Linear(28,32),ReLU(),Linear(32,32))
        self.conv2=PPFConv(mlp2)
        self.classifier=Linear(32,40)
        
    def forward(self,pos,normal,batch):
        edge_index=knn_graph(pos,k=16,batch=batch,loop=False)
        x=self.conv1(x=None,pos=pos,normal=normal,edge_index=edge_index)
        x=x.relu()
        x=self.conv2(x=x,pos=pos,normal=normal,edge_index=edge_index)
        x=x.relu()
        x=global_max_pool(x,batch)
        return self.classifier(x)
    
model = PPFNet()
print(model)
#输出如下：
#PPFNet(
#  (conv1): PPFConv(local_nn=Sequential(
#    (0): Linear(in_features=4, out_features=20, bias=True)
#    (1): ReLU()
#    (2): Linear(in_features=20, out_features=24, bias=True)
#  ), global_nn=None)
#  (conv2): PPFConv(local_nn=Sequential(
#    (0): Linear(in_features=28, out_features=32, bias=True)
#    (1): ReLU()
#    (2): Linear(in_features=32, out_features=32, bias=True)
#  ), global_nn=None)
#  (classifier): Linear(in_features=32, out_features=40, bias=True)
#)

导库

from torch_geometric.transforms import Compose, RandomRotate,SamplePoints

定义旋转操作

random_rotate = Compose([
    RandomRotate(degrees=180, axis=0),
    RandomRotate(degrees=180, axis=1),
    RandomRotate(degrees=180, axis=2),
])
test_transform = Compose([
    random_rotate,
    SamplePoints(num=128, include_normals=True),
])

导库，导入数据数据变换，训测拆分打乱顺序

from torch_geometric.datasets import GeometricShapes
from torch_geometric.loader import DataLoader
train_dataset=GeometricShapes(root='/DATA/GeometricShapes',train=False,transform=SamplePoints(128,include_normals=True))
test_dataset=GeometricShapes(root='/DATA/GeometricShapes',train=False,transform=test_transform)
train_loader=DataLoader(train_dataset,batch_size=10,shuffle=True)
test_loader=DataLoader(test_dataset,batch_size=10)

jupyter nootbook内的输出如下
在这里插入图片描述

历史遗留问题

上篇文章我们使用K最邻近算法构边建图，好像这篇文章也是K最邻近算法构边建图。PointNet++那一篇文章指出这样不好(数据不均匀效果就不好，主观很好理解)。所以使用FPS算法来寻找质心克服问题。简单来说1.开始随便找个质心2.质心张开一个固定半径邻域3.再在剩下所有点内(质心以及质心邻域内的不算)寻找到离所有质心最远的点成为下个质心4迭代直到没有点了。
开摆

import matplotlib.pyplot as plt

def visualize_points(pos,edge_index=None,index=None):
    fig=plt.figure(figsize=(4, 4))
    if edge_index is not None:
        for (src,dst) in edge_index.t().tolist():
            src=pos[src].tolist()
            dst=pos[dst].tolist()
            plt.plot([src[0],dst[0]],[src[1],dst[1]],linewidth=1,color='black')
    if index is None:
        plt.scatter(pos[:,0],pos[:,1],s=50,zorder=1000)
    else:
       mask=torch.zeros(pos.size(0),dtype=torch.bool)
       mask[index]=True
       plt.scatter(pos[~mask,0],pos[~mask,1],s=50,color='lightgray',zorder=1000)
       plt.scatter(pos[mask,0],pos[mask,1],s=50,zorder=1000)
    plt.axis('off')
    plt.show()

from torch_cluster import fps

dataset=GeometricShapes(root='/DATA//GeometricShapes',transform=SamplePoints(128))
data=dataset[0]
index=fps(data.pos,ratio=0.25)
visualize_points(data.pos)
visualize_points(data.pos,index=index)