0. Data sources
https://tjgb.hongheiku.com/
https://www.hongheiku.com/sichuan/55201.html
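Collected and organized by hand.
Data preview
Data sharing
Only the population data is shared. The geographic data may raise privacy concerns, so it is not shared for now; if you need it, contact uncodong@qq.com by email.
Link: https://pan.baidu.com/s/1tWyVmcmho62I248zP1rIdQ
Extraction code: iknp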
1. Consolidating the data into Excel and mapping it with geopandas
Reading the Sixth National Population Census data
import pandas as pd
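# sheet_name=None reads every sheet and returns a dict of {sheet name: DataFrame}
df = pd.read_excel("深圳市第六次人口普查数据记录.xlsx", sheet_name=None)
# Listing the dict yields its keys, i.e. the sheet names
print(list(df))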
Counting the population and concatenating the sheets
import pandas as pd
# dropna: https://blog.csdn.net/qq_17753903/article/details/89817371
# Merging multiple sheets of an Excel file with pandas: https://blog.csdn.net/qiuqiuit/article/details/120596158
# concat: https://blog.csdn.net/qsx123432/article/details/111931323
# pd.concat([df1, df2, ...], axis=0 or 1, join=join method, keys=labels identifying each source)
## 1. Read and concatenate the data
dfs = [sheet.dropna(axis=0, how='any') for sheet in pd.read_excel("深圳市第六次人口普查数据记录.xlsx", sheet_name=None).values()]
dfs_concat = pd.concat(dfs)
print(len(dfs))
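## 2. Iterate over the rows and record each area's population
# concat keeps each sheet's own index, so labels repeat; reset it so to_dict("index") can be used
dfs_concat = dfs_concat.reset_index(drop=True)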
time6_area_name2people_num_dic_list = []
for each in dfs_concat.to_dict("index").values():
    area_name = each["地区"]
    people_num = each["常住人口"]
    time6_area_name2people_num_dic_list.append({
        # Strip the 街道 / 办事处 suffixes from the area name
        "地区": area_name.replace("街道", "").replace("办事处", ""),
        # Drop the trailing 人 and parse the count as an integer
        "人口数量": int(people_num.replace("人", "")),
    })
time6_df = pd.DataFrame(time6_area_name2people_num_dic_list)
time6_df
# time6_area_name2people_num_dic
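The row-by-row loop above can also be written with pandas' vectorized string methods. The following is only a minimal sketch, assuming the `地区` and `常住人口` columns hold strings as they do in the loop; `clean_census` is a hypothetical helper name, and the sample rows and numbers are made up for illustration, not census values.

```python
import pandas as pd

# Hypothetical sample rows standing in for dfs_concat; the numbers are made up.
sample = pd.DataFrame({
    "地区": ["南头街道办事处", "粤海街道", "沙河街道"],
    "常住人口": ["1000人", "2000人", "3000人"],
})

def clean_census(df: pd.DataFrame) -> pd.DataFrame:
    """Strip the 街道/办事处 suffixes and parse the population as an integer."""
    area = (
        df["地区"]
        .str.replace("街道", "", regex=False)
        .str.replace("办事处", "", regex=False)
    )
    people = df["常住人口"].str.replace("人", "", regex=False).astype(int)
    return pd.DataFrame({"地区": area, "人口数量": people}).reset_index(drop=True)

cleaned = clean_census(sample)
print(cleaned)
```

Passing `regex=False` keeps the replacements literal, which matches what `str.replace` does in the loop version.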
Reading the subdistrict-level data and merging
import geopandas as gpd
shenzhen_subdistrict = gpd.read_file("shenzhen_subdistrict.shp")
print(shenzhen_subdistrict)
shenzhen_subdistrict_time6 = shenzhen_subdistrict.merge(time6_df, left_on='JDNAME', right_on='地区', how='left')
shenzhen_subdistrict_time6.to_file('深圳-第6次人口普查-街道.shp', driver='ESRI Shapefile', encoding='gbk')
shenzhen_subdistrict_time6
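With `how='left'`, any subdistrict whose `JDNAME` has no exact match in `地区` ends up with a missing population and no warning. One possible check, sketched here with plain pandas frames and made-up names and numbers in place of the real shapefile attributes, uses the `indicator` option of `merge`:

```python
import pandas as pd

# Hypothetical stand-ins: `shapes` mimics the shapefile's attribute table,
# `census` mimics time6_df. Names and numbers are illustrative only.
shapes = pd.DataFrame({"JDNAME": ["南头", "粤海", "沙河"]})
census = pd.DataFrame({"地区": ["南头", "粤海"], "人口数量": [1000, 2000]})

merged = shapes.merge(
    census, left_on="JDNAME", right_on="地区", how="left", indicator=True
)

# Rows marked "left_only" found no census record under that name.
unmatched = merged.loc[merged["_merge"] == "left_only", "JDNAME"].tolist()
print(unmatched)
```

Any names listed this way probably need a manual fix (for example a renamed or split subdistrict) before the map is drawn. The `_merge` column should be dropped again before writing the result to a shapefile.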
2. Displaying the geographic data
Darker colors indicate a larger population.
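A map of this kind can be drawn directly from a GeoDataFrame. The snippet below is a rough sketch rather than the code behind the original figure: it assumes matplotlib and shapely are installed, and it uses three made-up square "subdistricts" with invented population figures in place of `shenzhen_subdistrict_time6`.

```python
import geopandas as gpd
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
from shapely.geometry import box

# Hypothetical stand-in for shenzhen_subdistrict_time6: three unit squares
# with made-up population figures.
demo = gpd.GeoDataFrame(
    {"JDNAME": ["A", "B", "C"], "人口数量": [1000, 2000, 3000]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
)

fig, ax = plt.subplots(figsize=(6, 3))
# A sequential colormap such as "Reds" maps larger values to darker shades.
demo.plot(column="人口数量", cmap="Reds", legend=True, edgecolor="black", ax=ax)
ax.set_axis_off()
```

Calling `fig.savefig(...)` afterwards would write the figure to disk. With the real data, subdistricts that have no population value after the merge can be given an explicit style through the `missing_kwds` argument of `plot`.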