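Dataset: the stations produced in the post "Processing GeoLife data" (CSDN blog). Goal: for each station, find the 10 nearest other stations.
1 Read the data
Assume the data has already been read into a DataFrame named locations.
2 Keep only the useful columns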
locations.drop(['center','user_id'],axis=1,inplace=True)
locations
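3 Load the geometries
Use shapely's wkt module to load geometries directly from Well-Known Text (WKT) strings, so there is no need to parse the strings by hand.
The shapely.wkt.loads function converts a WKT string into a shapely geometry object.
from shapely import wkt
locations['extent'] = locations['extent'].apply(wkt.loads)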
4 Convert to a GeoDataFrame
import geopandas as gpd
locations=gpd.GeoDataFrame(locations,geometry='extent')
locations
locations['centroid'] = locations.geometry.centroid
locations
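To make the centroid step less of a black box, here is a minimal pure-Python sketch of the area-weighted centroid of a simple polygon without holes (the shoelace formula). The function name polygon_centroid and the sample rings are illustrative only and are not part of the original post; shapely does this work internally and also handles holes and multi-part geometries.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...]."""
    ring = list(ring)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # close the ring if needed
    area2 = 0.0  # twice the signed area
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon (zero area)")
    # 6 * area == 3 * area2
    return cx / (3.0 * area2), cy / (3.0 * area2)


# Hypothetical sample shapes
square = [(0, 0), (2, 0), (2, 2), (0, 2)]
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
square_c = polygon_centroid(square)
l_shape_c = polygon_centroid(l_shape)
```

For the L-shaped ring the centroid is pulled toward the larger part of the shape, so it differs from the plain average of the vertices.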
5 Build the KD-tree
5.1 Extract the centroid coordinates
import numpy as np
coordinates = np.array(list(locations['centroid'].apply(lambda x: (x.x, x.y))))
coordinates
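5.2 Create the KD-tree
from scipy.spatial import KDTree  # import not shown in the original; scipy's KDTree is assumed here
kdtree = KDTree(coordinates)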
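One caveat: a KD-tree measures straight-line (Euclidean) distance on whatever numbers it is given. The post does not state the coordinate reference system; if the centroids are longitude/latitude in degrees (an assumption), the distances come out in degrees, and east-west gaps are overstated relative to north-south ones. The sketch below shows one rough way around this, an equirectangular projection to local meters applied before building the tree. The function name, the Earth radius constant, and the sample points are all illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, an approximation


def lonlat_to_local_xy(lon, lat, lat0):
    """Project lon/lat (degrees) to approximate local meters.

    lat0 is a reference latitude near the data; the approximation is
    only reasonable over a small area such as a single city.
    """
    x = math.radians(lon) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    y = math.radians(lat) * EARTH_RADIUS_M
    return x, y


# Hypothetical centroids as (lon, lat)
points = [(116.30, 39.98), (116.31, 39.98), (116.30, 39.99)]
lat0 = sum(lat for _, lat in points) / len(points)
xy = [lonlat_to_local_xy(lon, lat, lat0) for lon, lat in points]

east_west_m = math.dist(xy[0], xy[1])    # 0.01 degrees of longitude
north_south_m = math.dist(xy[0], xy[2])  # 0.01 degrees of latitude
```

At this latitude the same 0.01 degree step is noticeably shorter east-west than north-south, which is exactly the distortion a degree-based tree ignores. With geopandas, reprojecting the GeoDataFrame with to_crs to a suitable projected CRS before taking centroids would be the more usual route.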
6 Query the 11 nearest neighbors (each point itself included)
distances, indices = kdtree.query(coordinates, k=11)
indices
'''
array([[ 0, 518, 603, ..., 526, 345, 559],
[ 1, 12, 7, ..., 204, 56, 5],
[ 2, 124, 245, ..., 133, 96, 93],
...,
[904, 852, 386, ..., 57, 253, 147],
[905, 380, 10, ..., 154, 117, 739],
[906, 646, 880, ..., 465, 129, 219]], dtype=int64)
'''
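The first column of indices equals the row number because each point is at distance 0 from itself. A brute-force sketch in pure Python makes this visible; it computes the same kind of result as the KD-tree query, only much more slowly. The function name brute_force_knn and the sample points are hypothetical.

```python
import math


def brute_force_knn(points, k):
    """For each point, return the indices of its k nearest points, nearest first."""
    result = []
    for p in points:
        # Sort all indices by distance to p; the point itself has distance 0.
        order = sorted(range(len(points)), key=lambda j: math.dist(p, points[j]))
        result.append(order[:k])
    return result


# Hypothetical 2-D points
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.0, 2.0)]
knn = brute_force_knn(points, k=3)
# knn == [[0, 1, 3], [1, 0, 3], [2, 3, 1], [3, 0, 1]]
```

Note that if two stations share exactly the same centroid, the first column is no longer guaranteed to be the row's own index, so slicing off column 0 would then need a check.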
# Drop each point itself (column 0)
nearest_indices = indices[:, 1:]
nearest_indices
'''
array([[518, 603, 341, ..., 526, 345, 559],
[ 12, 7, 59, ..., 204, 56, 5],
[124, 245, 542, ..., 133, 96, 93],
...,
[852, 386, 679, ..., 57, 253, 147],
[380, 10, 3, ..., 154, 117, 739],
[646, 880, 429, ..., 465, 129, 219]], dtype=int64)
'''
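The values in nearest_indices are row positions in coordinates, which are not necessarily the index labels of locations. A minimal sketch of mapping positions back to station identifiers follows, assuming a list of IDs in the same row order as the coordinates; the IDs and the small neighbor table are made up for illustration.

```python
# Hypothetical station IDs, in the same row order as the coordinates array
station_ids = ["s10", "s11", "s12", "s13"]

# Hypothetical neighbor positions (self already removed), 2 neighbors per station
nearest_indices = [[1, 3], [0, 3], [3, 1], [0, 1]]

# Map each station ID to the IDs of its nearest neighbors, nearest first
neighbors_by_id = {
    station_ids[i]: [station_ids[j] for j in row]
    for i, row in enumerate(nearest_indices)
}
```

The same comprehension works when nearest_indices is a NumPy array, since iterating over it yields one row of positions per station.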