读写和解析简单的 nc 文件

news2025/3/4 15:51:39

NetCDF 文件格式在气象数据工程领域占据着举足轻重的地位,其结构灵活、强兼容性等优势使其成为该领域的一个标准。无论是从事学术研究还是工程实践,掌握这种数据格式变得越发重要。其次,我注意到目前社区中气象编程大多数课程都聚焦于某个特定库的使用方法,而鲜有以数据格式本身为中心的内容。因此,我决定将 NetCDF 数据格式置于核心位置,同时辅以 xarray、netCDF4、nco、cdo 等工具,共同构建本次培训的内容框架。

关卡 1 我们会由浅入深,假设从我们刚拿到一个陌生的 nc 文件开始,一步一步教大家如何快速查看 nc 文件的元信息、简单读取和创建 nc 文件,同时也会教大家理解 nc 文件中的一些重要的特点,帮助大家避免一些初学者常遇到的坑。

1.1 查看文件信息:使用 ncdump/ncinfo 等命令行

当我们拿到一个 netcdf 格式的文件,想立刻快速看一下这个文件的元信息(也就是有哪些变量,有哪些维度等),最快的方式是使用命令行。我们看一下下面这个例子:

ERA5_FP = '/home/mw/input/training_camp9055/era5.nc'
!ncinfo $ERA5_FP
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    CDI: Climate Data Interface version 2.3.0 (https://mpimet.mpg.de/cdi)
    Conventions: CF-1.6
    history: Fri Dec 01 02:13:30 2023: cdo setattribute,latitude@units=degrees_north era5.nc era5_1.nc
2023-09-16 06:32:48 GMT by grib_to_netcdf-2.25.1: /opt/ecmwf/mars-client/bin/grib_to_netcdf.bin -S param -o /cache/data7/adaptor.mars.internal-1694845966.6698666-25558-2-ac87386c-9688-494c-aad2-3f57cee71470.nc /cache/tmp/ac87386c-9688-494c-aad2-3f57cee71470-adaptor.mars.internal-1694845965.2818055-25558-3-tmp.grib
    CDO: Climate Data Operators version 2.3.0 (https://mpimet.mpg.de/cdo)
    dimensions(sizes): time(80), longitude(1440), latitude(721)
    variables(dimensions): int32 time(time), float32 longitude(longitude), float32 latitude(latitude), int16 u10(time, latitude, longitude), int16 v10(time, latitude, longitude)
    groups: 

上面这个命令展示的是在 Jupyter 环境里执行终端命令的例子,也就是 ! 语法,即你在 Jupyter 中如果想要执行一个 shell 命令,只需要在前面增加一个 ! 即可。上面的例子里也就是相当于直接在终端执行 ncinfo ./era5.nc

可以看到,返回的结果里显示了这个文件的组织形式和关键信息。这个命令行得以正常运行的前提是已经安装了 netCDF4 这个包,你可以使用 conda install -c conda-forge netCDF4 等方式安装。我们看一下文件中比较重要的内容1:

  dimensions(sizes): time(80), longitude(1440), latitude(721)    
    variables(dimensions): int32 time(time), float32 longitude(longitude), float32 latitude(latitude), int16 u10(time, latitude, longitude), int16 v10(time, latitude, longitude)    

以上信息显示文件维度包含了时间 80 个元素、经度 1440 个元素,纬度 721 个元素。文件中存储的变量包含经度、纬度、时间、u10(10 米风的纬向分量)、v10(10 米风的经向分量)。变量名后面括号内是该变量的维度结构,比如 u10 变量的数组是由(时间、纬度、经度)的三维结构组成,也就是可以推断出它的数组形状应该为(80, 721, 1440),而 time 变量仅由时间维单独组成,即(80,)。

  1. 🐳:预览下方代码块时使光标悬浮在代码块上并左右拖动,即可完整显示内容↩

当然上面的命令只显示了最简略的信息,而变量内部的属性信息被省略了。如果你想查看变量内部的更多信息,可以执行下面的命令:

!ncinfo -v u10 $ERA5_FP  # 查看 u10 变量的详细元信息
# 本地执行命令为 ncinfo -v u10 ./era5.nc
<class 'netCDF4._netCDF4.Variable'>
int16 u10(time, latitude, longitude)
    long_name: 10 metre U wind component
    units: m s**-1
    add_offset: -2.1228631327102883
    scale_factor: 0.000929512490889013
    _FillValue: -32767
    missing_value: -32767
unlimited dimensions: time
current shape = (80, 721, 1440)
filling off

除了 ncinfo 命令以外,我们还需要知道另一个很方便的命令可以实现类似的效果,那就是 ncdump :

!ncdump -h $ERA5_FP
netcdf era5 {
dimensions:
	time = UNLIMITED ; // (80 currently)
	longitude = 1440 ;
	latitude = 721 ;
variables:
	int time(time) ;
		time:standard_name = "time" ;
		time:long_name = "time" ;
		time:units = "hours since 1900-01-01" ;
		time:calendar = "gregorian" ;
		time:axis = "T" ;
	float longitude(longitude) ;
		longitude:axis = "X" ;
		longitude:units = "degrees_east" ;
		longitude:long_name = "longitude" ;
	float latitude(latitude) ;
		latitude:long_name = "latitude" ;
		latitude:units = "degrees_north" ;
		latitude:axis = "Y" ;
	short u10(time, latitude, longitude) ;
		u10:long_name = "10 metre U wind component" ;
		u10:units = "m s**-1" ;
		u10:add_offset = -2.12286313271029 ;
		u10:scale_factor = 0.000929512490889013 ;
		u10:_FillValue = -32767s ;
		u10:missing_value = -32767s ;
	short v10(time, latitude, longitude) ;
		v10:long_name = "10 metre V wind component" ;
		v10:units = "m s**-1" ;
		v10:add_offset = 1.84205168976882 ;
		v10:scale_factor = 0.000878920267046397 ;
		v10:_FillValue = -32767s ;
		v10:missing_value = -32767s ;

// global attributes:
		:CDI = "Climate Data Interface version 2.3.0 (https://mpimet.mpg.de/cdi)" ;
		:Conventions = "CF-1.6" ;
		:history = "Fri Dec 01 02:13:30 2023: cdo setattribute,latitude@units=degrees_north era5.nc era5_1.nc\n2023-09-16 06:32:48 GMT by grib_to_netcdf-2.25.1: /opt/ecmwf/mars-client/bin/grib_to_netcdf.bin -S param -o /cache/data7/adaptor.mars.internal-1694845966.6698666-25558-2-ac87386c-9688-494c-aad2-3f57cee71470.nc /cache/tmp/ac87386c-9688-494c-aad2-3f57cee71470-adaptor.mars.internal-1694845965.2818055-25558-3-tmp.grib" ;
		:CDO = "Climate Data Operators version 2.3.0 (https://mpimet.mpg.de/cdo)" ;
}

ncdump 从名字可以看出来,它是要把数据内容“倒出来”。当我们增加 -h 的属性时(head的缩写),它就会只显示元信息。可以看到,ncdump -h 显示的信息很全面,除了基本的变量信息以外,变量内部的元信息也一并显示出来了。当然这种方式会增加篇幅,如果你有一个变量巨多的 nc 文件,可能还是要考虑先用 ncinfo 浅看一下。

当然了 ncdump 这个命令并不是只能看这种程度的元信息,它其实是可以把整个文件内容都给你“倒出来”的。在我们查看 nc 文件的时候,很多时候需要大致看一眼它的时间范围、空间范围。这个时候,我们就可以用 -c 属性:

!ncdump -c $ERA5_FP
netcdf era5 {
dimensions:
	time = UNLIMITED ; // (80 currently)
	longitude = 1440 ;
	latitude = 721 ;
variables:
	int time(time) ;
		time:standard_name = "time" ;
		time:long_name = "time" ;
		time:units = "hours since 1900-01-01" ;
		time:calendar = "gregorian" ;
		time:axis = "T" ;
	float longitude(longitude) ;
		longitude:axis = "X" ;
		longitude:units = "degrees_east" ;
		longitude:long_name = "longitude" ;
	float latitude(latitude) ;
		latitude:long_name = "latitude" ;
		latitude:units = "degrees_north" ;
		latitude:axis = "Y" ;
	short u10(time, latitude, longitude) ;
		u10:long_name = "10 metre U wind component" ;
		u10:units = "m s**-1" ;
		u10:add_offset = -2.12286313271029 ;
		u10:scale_factor = 0.000929512490889013 ;
		u10:_FillValue = -32767s ;
		u10:missing_value = -32767s ;
	short v10(time, latitude, longitude) ;
		v10:long_name = "10 metre V wind component" ;
		v10:units = "m s**-1" ;
		v10:add_offset = 1.84205168976882 ;
		v10:scale_factor = 0.000878920267046397 ;
		v10:_FillValue = -32767s ;
		v10:missing_value = -32767s ;

// global attributes:
		:CDI = "Climate Data Interface version 2.3.0 (https://mpimet.mpg.de/cdi)" ;
		:Conventions = "CF-1.6" ;
		:history = "Fri Dec 01 02:13:30 2023: cdo setattribute,latitude@units=degrees_north era5.nc era5_1.nc\n2023-09-16 06:32:48 GMT by grib_to_netcdf-2.25.1: /opt/ecmwf/mars-client/bin/grib_to_netcdf.bin -S param -o /cache/data7/adaptor.mars.internal-1694845966.6698666-25558-2-ac87386c-9688-494c-aad2-3f57cee71470.nc /cache/tmp/ac87386c-9688-494c-aad2-3f57cee71470-adaptor.mars.internal-1694845965.2818055-25558-3-tmp.grib" ;
		:CDO = "Climate Data Operators version 2.3.0 (https://mpimet.mpg.de/cdo)" ;
data:

 time = 1083792, 1083798, 1083804, 1083810, 1083816, 1083822, 1083828, 
    1083834, 1083840, 1083846, 1083852, 1083858, 1083864, 1083870, 1083876, 
    1083882, 1083888, 1083894, 1083900, 1083906, 1083912, 1083918, 1083924, 
    1083930, 1083936, 1083942, 1083948, 1083954, 1083960, 1083966, 1083972, 
    1083978, 1083984, 1083990, 1083996, 1084002, 1084008, 1084014, 1084020, 
    1084026, 1084032, 1084038, 1084044, 1084050, 1084056, 1084062, 1084068, 
    1084074, 1084080, 1084086, 1084092, 1084098, 1084104, 1084110, 1084116, 
    1084122, 1084128, 1084134, 1084140, 1084146, 1084152, 1084158, 1084164, 
    1084170, 1084176, 1084182, 1084188, 1084194, 1084200, 1084206, 1084212, 
    1084218, 1084224, 1084230, 1084236, 1084242, 1084248, 1084254, 1084260, 
    1084266 ;

 longitude = -180, -179.75, -179.5, -179.25, -179, -178.75, -178.5, -178.25, 
    -178, -177.75, -177.5, -177.25, -177, -176.75, -176.5, -176.25, -176, 
    -175.75, -175.5, -175.25, -175, -174.75, -174.5, -174.25, -174, -173.75, 
    -173.5, -173.25, -173, -172.75, -172.5, -172.25, -172, -171.75, -171.5, 
    -171.25, -171, -170.75, -170.5, -170.25, -170, -169.75, -169.5, -169.25, 
    -169, -168.75, -168.5, -168.25, -168, -167.75, -167.5, -167.25, -167, 
    -166.75, -166.5, -166.25, -166, -165.75, -165.5, -165.25, -165, -164.75, 
    -164.5, -164.25, -164, -163.75, -163.5, -163.25, -163, -162.75, -162.5, 
    -162.25, -162, -161.75, -161.5, -161.25, -161, -160.75, -160.5, -160.25, 
    -160, -159.75, -159.5, -159.25, -159, -158.75, -158.5, -158.25, -158, 
    -157.75, -157.5, -157.25, -157, -156.75, -156.5, -156.25, -156, -155.75, 
    -155.5, -155.25, -155, -154.75, -154.5, -154.25, -154, -153.75, -153.5, 
    -153.25, -153, -152.75, -152.5, -152.25, -152, -151.75, -151.5, -151.25, 
    -151, -150.75, -150.5, -150.25, -150, -149.75, -149.5, -149.25, -149, 
    -148.75, -148.5, -148.25, -148, -147.75, -147.5, -147.25, -147, -146.75, 
    -146.5, -146.25, -146, -145.75, -145.5, -145.25, -145, -144.75, -144.5, 
    -144.25, -144, -143.75, -143.5, -143.25, -143, -142.75, -142.5, -142.25, 
    -142, -141.75, -141.5, -141.25, -141, -140.75, -140.5, -140.25, -140, 
    -139.75, -139.5, -139.25, -139, -138.75, -138.5, -138.25, -138, -137.75, 
    -137.5, -137.25, -137, -136.75, -136.5, -136.25, -136, -135.75, -135.5, 
    -135.25, -135, -134.75, -134.5, -134.25, -134, -133.75, -133.5, -133.25, 
    -133, -132.75, -132.5, -132.25, -132, -131.75, -131.5, -131.25, -131, 
    -130.75, -130.5, -130.25, -130, -129.75, -129.5, -129.25, -129, -128.75, 
    -128.5, -128.25, -128, -127.75, -127.5, -127.25, -127, -126.75, -126.5, 
    -126.25, -126, -125.75, -125.5, -125.25, -125, -124.75, -124.5, -124.25, 
    -124, -123.75, -123.5, -123.25, -123, -122.75, -122.5, -122.25, -122, 
    -121.75, -121.5, -121.25, -121, -120.75, -120.5, -120.25, -120, -119.75, 
    -119.5, -119.25, -119, -118.75, -118.5, -118.25, -118, -117.75, -117.5, 
    -117.25, -117, -116.75, -116.5, -116.25, -116, -115.75, -115.5, -115.25, 
    -115, -114.75, -114.5, -114.25, -114, -113.75, -113.5, -113.25, -113, 
    -112.75, -112.5, -112.25, -112, -111.75, -111.5, -111.25, -111, -110.75, 
    -110.5, -110.25, -110, -109.75, -109.5, -109.25, -109, -108.75, -108.5, 
    -108.25, -108, -107.75, -107.5, -107.25, -107, -106.75, -106.5, -106.25, 
    -106, -105.75, -105.5, -105.25, -105, -104.75, -104.5, -104.25, -104, 
    -103.75, -103.5, -103.25, -103, -102.75, -102.5, -102.25, -102, -101.75, 
    -101.5, -101.25, -101, -100.75, -100.5, -100.25, -100, -99.75, -99.5, 
    -99.25, -99, -98.75, -98.5, -98.25, -98, -97.75, -97.5, -97.25, -97, 
    -96.75, -96.5, -96.25, -96, -95.75, -95.5, -95.25, -95, -94.75, -94.5, 
    -94.25, -94, -93.75, -93.5, -93.25, -93, -92.75, -92.5, -92.25, -92, 
    -91.75, -91.5, -91.25, -91, -90.75, -90.5, -90.25, -90, -89.75, -89.5, 
    -89.25, -89, -88.75, -88.5, -88.25, -88, -87.75, -87.5, -87.25, -87, 
    -86.75, -86.5, -86.25, -86, -85.75, -85.5, -85.25, -85, -84.75, -84.5, 
    -84.25, -84, -83.75, -83.5, -83.25, -83, -82.75, -82.5, -82.25, -82, 
    -81.75, -81.5, -81.25, -81, -80.75, -80.5, -80.25, -80, -79.75, -79.5, 
    -79.25, -79, -78.75, -78.5, -78.25, -78, -77.75, -77.5, -77.25, -77, 
    -76.75, -76.5, -76.25, -76, -75.75, -75.5, -75.25, -75, -74.75, -74.5, 
    -74.25, -74, -73.75, -73.5, -73.25, -73, -72.75, -72.5, -72.25, -72, 
    -71.75, -71.5, -71.25, -71, -70.75, -70.5, -70.25, -70, -69.75, -69.5, 
    -69.25, -69, -68.75, -68.5, -68.25, -68, -67.75, -67.5, -67.25, -67, 
    -66.75, -66.5, -66.25, -66, -65.75, -65.5, -65.25, -65, -64.75, -64.5, 
    -64.25, -64, -63.75, -63.5, -63.25, -63, -62.75, -62.5, -62.25, -62, 
    -61.75, -61.5, -61.25, -61, -60.75, -60.5, -60.25, -60, -59.75, -59.5, 
    -59.25, -59, -58.75, -58.5, -58.25, -58, -57.75, -57.5, -57.25, -57, 
    -56.75, -56.5, -56.25, -56, -55.75, -55.5, -55.25, -55, -54.75, -54.5, 
    -54.25, -54, -53.75, -53.5, -53.25, -53, -52.75, -52.5, -52.25, -52, 
    -51.75, -51.5, -51.25, -51, -50.75, -50.5, -50.25, -50, -49.75, -49.5, 
    -49.25, -49, -48.75, -48.5, -48.25, -48, -47.75, -47.5, -47.25, -47, 
    -46.75, -46.5, -46.25, -46, -45.75, -45.5, -45.25, -45, -44.75, -44.5, 
    -44.25, -44, -43.75, -43.5, -43.25, -43, -42.75, -42.5, -42.25, -42, 
    -41.75, -41.5, -41.25, -41, -40.75, -40.5, -40.25, -40, -39.75, -39.5, 
    -39.25, -39, -38.75, -38.5, -38.25, -38, -37.75, -37.5, -37.25, -37, 
    -36.75, -36.5, -36.25, -36, -35.75, -35.5, -35.25, -35, -34.75, -34.5, 
    -34.25, -34, -33.75, -33.5, -33.25, -33, -32.75, -32.5, -32.25, -32, 
    -31.75, -31.5, -31.25, -31, -30.75, -30.5, -30.25, -30, -29.75, -29.5, 
    -29.25, -29, -28.75, -28.5, -28.25, -28, -27.75, -27.5, -27.25, -27, 
    -26.75, -26.5, -26.25, -26, -25.75, -25.5, -25.25, -25, -24.75, -24.5, 
    -24.25, -24, -23.75, -23.5, -23.25, -23, -22.75, -22.5, -22.25, -22, 
    -21.75, -21.5, -21.25, -21, -20.75, -20.5, -20.25, -20, -19.75, -19.5, 
    -19.25, -19, -18.75, -18.5, -18.25, -18, -17.75, -17.5, -17.25, -17, 
    -16.75, -16.5, -16.25, -16, -15.75, -15.5, -15.25, -15, -14.75, -14.5, 
    -14.25, -14, -13.75, -13.5, -13.25, -13, -12.75, -12.5, -12.25, -12, 
    -11.75, -11.5, -11.25, -11, -10.75, -10.5, -10.25, -10, -9.75, -9.5, 
    -9.25, -9, -8.75, -8.5, -8.25, -8, -7.75, -7.5, -7.25, -7, -6.75, -6.5, 
    -6.25, -6, -5.75, -5.5, -5.25, -5, -4.75, -4.5, -4.25, -4, -3.75, -3.5, 
    -3.25, -3, -2.75, -2.5, -2.25, -2, -1.75, -1.5, -1.25, -1, -0.75, -0.5, 
    -0.25, 0, 0.25, 0.5, 0.75, 1, 1.25, 1.5, 1.75, 2, 2.25, 2.5, 2.75, 3, 
    3.25, 3.5, 3.75, 4, 4.25, 4.5, 4.75, 5, 5.25, 5.5, 5.75, 6, 6.25, 6.5, 
    6.75, 7, 7.25, 7.5, 7.75, 8, 8.25, 8.5, 8.75, 9, 9.25, 9.5, 9.75, 10, 
    10.25, 10.5, 10.75, 11, 11.25, 11.5, 11.75, 12, 12.25, 12.5, 12.75, 13, 
    13.25, 13.5, 13.75, 14, 14.25, 14.5, 14.75, 15, 15.25, 15.5, 15.75, 16, 
    16.25, 16.5, 16.75, 17, 17.25, 17.5, 17.75, 18, 18.25, 18.5, 18.75, 19, 
    19.25, 19.5, 19.75, 20, 20.25, 20.5, 20.75, 21, 21.25, 21.5, 21.75, 22, 
    22.25, 22.5, 22.75, 23, 23.25, 23.5, 23.75, 24, 24.25, 24.5, 24.75, 25, 
    25.25, 25.5, 25.75, 26, 26.25, 26.5, 26.75, 27, 27.25, 27.5, 27.75, 28, 
    28.25, 28.5, 28.75, 29, 29.25, 29.5, 29.75, 30, 30.25, 30.5, 30.75, 31, 
    31.25, 31.5, 31.75, 32, 32.25, 32.5, 32.75, 33, 33.25, 33.5, 33.75, 34, 
    34.25, 34.5, 34.75, 35, 35.25, 35.5, 35.75, 36, 36.25, 36.5, 36.75, 37, 
    37.25, 37.5, 37.75, 38, 38.25, 38.5, 38.75, 39, 39.25, 39.5, 39.75, 40, 
    40.25, 40.5, 40.75, 41, 41.25, 41.5, 41.75, 42, 42.25, 42.5, 42.75, 43, 
    43.25, 43.5, 43.75, 44, 44.25, 44.5, 44.75, 45, 45.25, 45.5, 45.75, 46, 
    46.25, 46.5, 46.75, 47, 47.25, 47.5, 47.75, 48, 48.25, 48.5, 48.75, 49, 
    49.25, 49.5, 49.75, 50, 50.25, 50.5, 50.75, 51, 51.25, 51.5, 51.75, 52, 
    52.25, 52.5, 52.75, 53, 53.25, 53.5, 53.75, 54, 54.25, 54.5, 54.75, 55, 
    55.25, 55.5, 55.75, 56, 56.25, 56.5, 56.75, 57, 57.25, 57.5, 57.75, 58, 
    58.25, 58.5, 58.75, 59, 59.25, 59.5, 59.75, 60, 60.25, 60.5, 60.75, 61, 
    61.25, 61.5, 61.75, 62, 62.25, 62.5, 62.75, 63, 63.25, 63.5, 63.75, 64, 
    64.25, 64.5, 64.75, 65, 65.25, 65.5, 65.75, 66, 66.25, 66.5, 66.75, 67, 
    67.25, 67.5, 67.75, 68, 68.25, 68.5, 68.75, 69, 69.25, 69.5, 69.75, 70, 
    70.25, 70.5, 70.75, 71, 71.25, 71.5, 71.75, 72, 72.25, 72.5, 72.75, 73, 
    73.25, 73.5, 73.75, 74, 74.25, 74.5, 74.75, 75, 75.25, 75.5, 75.75, 76, 
    76.25, 76.5, 76.75, 77, 77.25, 77.5, 77.75, 78, 78.25, 78.5, 78.75, 79, 
    79.25, 79.5, 79.75, 80, 80.25, 80.5, 80.75, 81, 81.25, 81.5, 81.75, 82, 
    82.25, 82.5, 82.75, 83, 83.25, 83.5, 83.75, 84, 84.25, 84.5, 84.75, 85, 
    85.25, 85.5, 85.75, 86, 86.25, 86.5, 86.75, 87, 87.25, 87.5, 87.75, 88, 
    88.25, 88.5, 88.75, 89, 89.25, 89.5, 89.75, 90, 90.25, 90.5, 90.75, 91, 
    91.25, 91.5, 91.75, 92, 92.25, 92.5, 92.75, 93, 93.25, 93.5, 93.75, 94, 
    94.25, 94.5, 94.75, 95, 95.25, 95.5, 95.75, 96, 96.25, 96.5, 96.75, 97, 
    97.25, 97.5, 97.75, 98, 98.25, 98.5, 98.75, 99, 99.25, 99.5, 99.75, 100, 
    100.25, 100.5, 100.75, 101, 101.25, 101.5, 101.75, 102, 102.25, 102.5, 
    102.75, 103, 103.25, 103.5, 103.75, 104, 104.25, 104.5, 104.75, 105, 
    105.25, 105.5, 105.75, 106, 106.25, 106.5, 106.75, 107, 107.25, 107.5, 
    107.75, 108, 108.25, 108.5, 108.75, 109, 109.25, 109.5, 109.75, 110, 
    110.25, 110.5, 110.75, 111, 111.25, 111.5, 111.75, 112, 112.25, 112.5, 
    112.75, 113, 113.25, 113.5, 113.75, 114, 114.25, 114.5, 114.75, 115, 
    115.25, 115.5, 115.75, 116, 116.25, 116.5, 116.75, 117, 117.25, 117.5, 
    117.75, 118, 118.25, 118.5, 118.75, 119, 119.25, 119.5, 119.75, 120, 
    120.25, 120.5, 120.75, 121, 121.25, 121.5, 121.75, 122, 122.25, 122.5, 
    122.75, 123, 123.25, 123.5, 123.75, 124, 124.25, 124.5, 124.75, 125, 
    125.25, 125.5, 125.75, 126, 126.25, 126.5, 126.75, 127, 127.25, 127.5, 
    127.75, 128, 128.25, 128.5, 128.75, 129, 129.25, 129.5, 129.75, 130, 
    130.25, 130.5, 130.75, 131, 131.25, 131.5, 131.75, 132, 132.25, 132.5, 
    132.75, 133, 133.25, 133.5, 133.75, 134, 134.25, 134.5, 134.75, 135, 
    135.25, 135.5, 135.75, 136, 136.25, 136.5, 136.75, 137, 137.25, 137.5, 
    137.75, 138, 138.25, 138.5, 138.75, 139, 139.25, 139.5, 139.75, 140, 
    140.25, 140.5, 140.75, 141, 141.25, 141.5, 141.75, 142, 142.25, 142.5, 
    142.75, 143, 143.25, 143.5, 143.75, 144, 144.25, 144.5, 144.75, 145, 
    145.25, 145.5, 145.75, 146, 146.25, 146.5, 146.75, 147, 147.25, 147.5, 
    147.75, 148, 148.25, 148.5, 148.75, 149, 149.25, 149.5, 149.75, 150, 
    150.25, 150.5, 150.75, 151, 151.25, 151.5, 151.75, 152, 152.25, 152.5, 
    152.75, 153, 153.25, 153.5, 153.75, 154, 154.25, 154.5, 154.75, 155, 
    155.25, 155.5, 155.75, 156, 156.25, 156.5, 156.75, 157, 157.25, 157.5, 
    157.75, 158, 158.25, 158.5, 158.75, 159, 159.25, 159.5, 159.75, 160, 
    160.25, 160.5, 160.75, 161, 161.25, 161.5, 161.75, 162, 162.25, 162.5, 
    162.75, 163, 163.25, 163.5, 163.75, 164, 164.25, 164.5, 164.75, 165, 
    165.25, 165.5, 165.75, 166, 166.25, 166.5, 166.75, 167, 167.25, 167.5, 
    167.75, 168, 168.25, 168.5, 168.75, 169, 169.25, 169.5, 169.75, 170, 
    170.25, 170.5, 170.75, 171, 171.25, 171.5, 171.75, 172, 172.25, 172.5, 
    172.75, 173, 173.25, 173.5, 173.75, 174, 174.25, 174.5, 174.75, 175, 
    175.25, 175.5, 175.75, 176, 176.25, 176.5, 176.75, 177, 177.25, 177.5, 
    177.75, 178, 178.25, 178.5, 178.75, 179, 179.25, 179.5, 179.75 ;

 latitude = 90, 89.75, 89.5, 89.25, 89, 88.75, 88.5, 88.25, 88, 87.75, 87.5, 
    87.25, 87, 86.75, 86.5, 86.25, 86, 85.75, 85.5, 85.25, 85, 84.75, 84.5, 
    84.25, 84, 83.75, 83.5, 83.25, 83, 82.75, 82.5, 82.25, 82, 81.75, 81.5, 
    81.25, 81, 80.75, 80.5, 80.25, 80, 79.75, 79.5, 79.25, 79, 78.75, 78.5, 
    78.25, 78, 77.75, 77.5, 77.25, 77, 76.75, 76.5, 76.25, 76, 75.75, 75.5, 
    75.25, 75, 74.75, 74.5, 74.25, 74, 73.75, 73.5, 73.25, 73, 72.75, 72.5, 
    72.25, 72, 71.75, 71.5, 71.25, 71, 70.75, 70.5, 70.25, 70, 69.75, 69.5, 
    69.25, 69, 68.75, 68.5, 68.25, 68, 67.75, 67.5, 67.25, 67, 66.75, 66.5, 
    66.25, 66, 65.75, 65.5, 65.25, 65, 64.75, 64.5, 64.25, 64, 63.75, 63.5, 
    63.25, 63, 62.75, 62.5, 62.25, 62, 61.75, 61.5, 61.25, 61, 60.75, 60.5, 
    60.25, 60, 59.75, 59.5, 59.25, 59, 58.75, 58.5, 58.25, 58, 57.75, 57.5, 
    57.25, 57, 56.75, 56.5, 56.25, 56, 55.75, 55.5, 55.25, 55, 54.75, 54.5, 
    54.25, 54, 53.75, 53.5, 53.25, 53, 52.75, 52.5, 52.25, 52, 51.75, 51.5, 
    51.25, 51, 50.75, 50.5, 50.25, 50, 49.75, 49.5, 49.25, 49, 48.75, 48.5, 
    48.25, 48, 47.75, 47.5, 47.25, 47, 46.75, 46.5, 46.25, 46, 45.75, 45.5, 
    45.25, 45, 44.75, 44.5, 44.25, 44, 43.75, 43.5, 43.25, 43, 42.75, 42.5, 
    42.25, 42, 41.75, 41.5, 41.25, 41, 40.75, 40.5, 40.25, 40, 39.75, 39.5, 
    39.25, 39, 38.75, 38.5, 38.25, 38, 37.75, 37.5, 37.25, 37, 36.75, 36.5, 
    36.25, 36, 35.75, 35.5, 35.25, 35, 34.75, 34.5, 34.25, 34, 33.75, 33.5, 
    33.25, 33, 32.75, 32.5, 32.25, 32, 31.75, 31.5, 31.25, 31, 30.75, 30.5, 
    30.25, 30, 29.75, 29.5, 29.25, 29, 28.75, 28.5, 28.25, 28, 27.75, 27.5, 
    27.25, 27, 26.75, 26.5, 26.25, 26, 25.75, 25.5, 25.25, 25, 24.75, 24.5, 
    24.25, 24, 23.75, 23.5, 23.25, 23, 22.75, 22.5, 22.25, 22, 21.75, 21.5, 
    21.25, 21, 20.75, 20.5, 20.25, 20, 19.75, 19.5, 19.25, 19, 18.75, 18.5, 
    18.25, 18, 17.75, 17.5, 17.25, 17, 16.75, 16.5, 16.25, 16, 15.75, 15.5, 
    15.25, 15, 14.75, 14.5, 14.25, 14, 13.75, 13.5, 13.25, 13, 12.75, 12.5, 
    12.25, 12, 11.75, 11.5, 11.25, 11, 10.75, 10.5, 10.25, 10, 9.75, 9.5, 
    9.25, 9, 8.75, 8.5, 8.25, 8, 7.75, 7.5, 7.25, 7, 6.75, 6.5, 6.25, 6, 
    5.75, 5.5, 5.25, 5, 4.75, 4.5, 4.25, 4, 3.75, 3.5, 3.25, 3, 2.75, 2.5, 
    2.25, 2, 1.75, 1.5, 1.25, 1, 0.75, 0.5, 0.25, 0, -0.25, -0.5, -0.75, -1, 
    -1.25, -1.5, -1.75, -2, -2.25, -2.5, -2.75, -3, -3.25, -3.5, -3.75, -4, 
    -4.25, -4.5, -4.75, -5, -5.25, -5.5, -5.75, -6, -6.25, -6.5, -6.75, -7, 
    -7.25, -7.5, -7.75, -8, -8.25, -8.5, -8.75, -9, -9.25, -9.5, -9.75, -10, 
    -10.25, -10.5, -10.75, -11, -11.25, -11.5, -11.75, -12, -12.25, -12.5, 
    -12.75, -13, -13.25, -13.5, -13.75, -14, -14.25, -14.5, -14.75, -15, 
    -15.25, -15.5, -15.75, -16, -16.25, -16.5, -16.75, -17, -17.25, -17.5, 
    -17.75, -18, -18.25, -18.5, -18.75, -19, -19.25, -19.5, -19.75, -20, 
    -20.25, -20.5, -20.75, -21, -21.25, -21.5, -21.75, -22, -22.25, -22.5, 
    -22.75, -23, -23.25, -23.5, -23.75, -24, -24.25, -24.5, -24.75, -25, 
    -25.25, -25.5, -25.75, -26, -26.25, -26.5, -26.75, -27, -27.25, -27.5, 
    -27.75, -28, -28.25, -28.5, -28.75, -29, -29.25, -29.5, -29.75, -30, 
    -30.25, -30.5, -30.75, -31, -31.25, -31.5, -31.75, -32, -32.25, -32.5, 
    -32.75, -33, -33.25, -33.5, -33.75, -34, -34.25, -34.5, -34.75, -35, 
    -35.25, -35.5, -35.75, -36, -36.25, -36.5, -36.75, -37, -37.25, -37.5, 
    -37.75, -38, -38.25, -38.5, -38.75, -39, -39.25, -39.5, -39.75, -40, 
    -40.25, -40.5, -40.75, -41, -41.25, -41.5, -41.75, -42, -42.25, -42.5, 
    -42.75, -43, -43.25, -43.5, -43.75, -44, -44.25, -44.5, -44.75, -45, 
    -45.25, -45.5, -45.75, -46, -46.25, -46.5, -46.75, -47, -47.25, -47.5, 
    -47.75, -48, -48.25, -48.5, -48.75, -49, -49.25, -49.5, -49.75, -50, 
    -50.25, -50.5, -50.75, -51, -51.25, -51.5, -51.75, -52, -52.25, -52.5, 
    -52.75, -53, -53.25, -53.5, -53.75, -54, -54.25, -54.5, -54.75, -55, 
    -55.25, -55.5, -55.75, -56, -56.25, -56.5, -56.75, -57, -57.25, -57.5, 
    -57.75, -58, -58.25, -58.5, -58.75, -59, -59.25, -59.5, -59.75, -60, 
    -60.25, -60.5, -60.75, -61, -61.25, -61.5, -61.75, -62, -62.25, -62.5, 
    -62.75, -63, -63.25, -63.5, -63.75, -64, -64.25, -64.5, -64.75, -65, 
    -65.25, -65.5, -65.75, -66, -66.25, -66.5, -66.75, -67, -67.25, -67.5, 
    -67.75, -68, -68.25, -68.5, -68.75, -69, -69.25, -69.5, -69.75, -70, 
    -70.25, -70.5, -70.75, -71, -71.25, -71.5, -71.75, -72, -72.25, -72.5, 
    -72.75, -73, -73.25, -73.5, -73.75, -74, -74.25, -74.5, -74.75, -75, 
    -75.25, -75.5, -75.75, -76, -76.25, -76.5, -76.75, -77, -77.25, -77.5, 
    -77.75, -78, -78.25, -78.5, -78.75, -79, -79.25, -79.5, -79.75, -80, 
    -80.25, -80.5, -80.75, -81, -81.25, -81.5, -81.75, -82, -82.25, -82.5, 
    -82.75, -83, -83.25, -83.5, -83.75, -84, -84.25, -84.5, -84.75, -85, 
    -85.25, -85.5, -85.75, -86, -86.25, -86.5, -86.75, -87, -87.25, -87.5, 
    -87.75, -88, -88.25, -88.5, -88.75, -89, -89.25, -89.5, -89.75, -90 ;
}

可以看到,它把维度上的值都给你打出来了,而没有打出变量的值,这个命令可以让你快速了解你所要处理的 nc 文件的经纬度坐标大概是在哪个范围。当然,如果你的 nc 文件经纬度坐标非常的巨大,那么不建议直接像上面这样执行,否则就被刷屏了。可以尝试使用管道符 |加 more 来限制一次性输出:ncdump -c ./era5.nc | more

如果要查看某一个变量的值,可以执行:

!ncdump -v u10 $ERA5_FP | head -n 50
netcdf era5 {
dimensions:
	time = UNLIMITED ; // (80 currently)
	longitude = 1440 ;
	latitude = 721 ;
variables:
	int time(time) ;
		time:standard_name = "time" ;
		time:long_name = "time" ;
		time:units = "hours since 1900-01-01" ;
		time:calendar = "gregorian" ;
		time:axis = "T" ;
	float longitude(longitude) ;
		longitude:axis = "X" ;
		longitude:units = "degrees_east" ;
		longitude:long_name = "longitude" ;
	float latitude(latitude) ;
		latitude:long_name = "latitude" ;
		latitude:units = "degrees_north" ;
		latitude:axis = "Y" ;
	short u10(time, latitude, longitude) ;
		u10:long_name = "10 metre U wind component" ;
		u10:units = "m s**-1" ;
		u10:add_offset = -2.12286313271029 ;
		u10:scale_factor = 0.000929512490889013 ;
		u10:_FillValue = -32767s ;
		u10:missing_value = -32767s ;
	short v10(time, latitude, longitude) ;
		v10:long_name = "10 metre V wind component" ;
		v10:units = "m s**-1" ;
		v10:add_offset = 1.84205168976882 ;
		v10:scale_factor = 0.000878920267046397 ;
		v10:_FillValue = -32767s ;
		v10:missing_value = -32767s ;

// global attributes:
		:CDI = "Climate Data Interface version 2.3.0 (https://mpimet.mpg.de/cdi)" ;
		:Conventions = "CF-1.6" ;
		:history = "Fri Dec 01 02:13:30 2023: cdo setattribute,latitude@units=degrees_north era5.nc era5_1.nc\n2023-09-16 06:32:48 GMT by grib_to_netcdf-2.25.1: /opt/ecmwf/mars-client/bin/grib_to_netcdf.bin -S param -o /cache/data7/adaptor.mars.internal-1694845966.6698666-25558-2-ac87386c-9688-494c-aad2-3f57cee71470.nc /cache/tmp/ac87386c-9688-494c-aad2-3f57cee71470-adaptor.mars.internal-1694845965.2818055-25558-3-tmp.grib" ;
		:CDO = "Climate Data Operators version 2.3.0 (https://mpimet.mpg.de/cdo)" ;
data:

 u10 =
  8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 
    8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 
    8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 
    8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 
    8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 
    8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 
    8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 8845, 

由于数据量过于巨大,因此上面的命令使用 Unix 管道符将输出结果限制在了 50 行以内。

可以看到输出的 u10 的数值好像很反直觉(值全是 8845),但是事实上这个数据并非错误,而是涉及到一个打包和解包的过程,这里暂时不表,后面会专门讲到。

1.2 读取:使用 netcdf4-python、xarray 等常⻅工具

在简要查看了 nc 文件之后,我们如何在 Python 程序中读取 nc 文件呢?其实方法有很多,在这里我介绍两个比较主流的方法:使用 xarray 和 netCDF4-python。

使用 netCDF4-python

我们先来介绍基于 netCDF4-python 包的方法,netCDF4-python 包是由 C 版本 netCDF 模块包的 Python 版 API 接口。在 Python 环境下安装方法是 pip install netCDF4

我们来看看如何用 netCDF4 简单读取 nc 文件:

import netCDF4 as nc

ds = nc.Dataset(ERA5_FP)
ds

 

<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    CDI: Climate Data Interface version 2.3.0 (https://mpimet.mpg.de/cdi)
    Conventions: CF-1.6
    history: Fri Dec 01 02:13:30 2023: cdo setattribute,latitude@units=degrees_north era5.nc era5_1.nc
2023-09-16 06:32:48 GMT by grib_to_netcdf-2.25.1: /opt/ecmwf/mars-client/bin/grib_to_netcdf.bin -S param -o /cache/data7/adaptor.mars.internal-1694845966.6698666-25558-2-ac87386c-9688-494c-aad2-3f57cee71470.nc /cache/tmp/ac87386c-9688-494c-aad2-3f57cee71470-adaptor.mars.internal-1694845965.2818055-25558-3-tmp.grib
    CDO: Climate Data Operators version 2.3.0 (https://mpimet.mpg.de/cdo)
    dimensions(sizes): time(80), longitude(1440), latitude(721)
    variables(dimensions): int32 time(time), float32 longitude(longitude), float32 latitude(latitude), int16 u10(time, latitude, longitude), int16 v10(time, latitude, longitude)
    groups: 

从上面可以看到 dataset 数据中的变量和维度信息,这个信息和之前 !ncinfo ./era5.nc 显示的结果一样。

如果我们想要读取某一个变量,可以按以下方式读取:

u10 = ds.variables['u10']
u10
<class 'netCDF4._netCDF4.Variable'>
int16 u10(time, latitude, longitude)
    long_name: 10 metre U wind component
    units: m s**-1
    add_offset: -2.1228631327102883
    scale_factor: 0.000929512490889013
    _FillValue: -32767
    missing_value: -32767
unlimited dimensions: time
current shape = (80, 721, 1440)
filling off

上述代码选取了 u10 这个要素,当我们直接打印时显示的是该要素的元信息。如果我们要取其中的值,可以这样执行:

u10_data = u10[:]
u10_data
masked_array(
  data=[[[ 6.09867485e+00,  6.09867485e+00,  6.09867485e+00, ...,
           6.09867485e+00,  6.09867485e+00,  6.09867485e+00],
         [ 3.15119074e+00,  3.13724805e+00,  3.12237585e+00, ...,
           3.19394832e+00,  3.18000563e+00,  3.16606294e+00],
         [ 2.89185676e+00,  2.86397138e+00,  2.83515649e+00, ...,
           2.97830142e+00,  2.94948653e+00,  2.92160116e+00],
         ...,
         [ 1.26613941e+00,  1.26056233e+00,  1.25312623e+00, ...,
           1.28565917e+00,  1.27915258e+00,  1.27264600e+00],
         [ 1.16575206e+00,  1.16296352e+00,  1.16017499e+00, ...,
           1.17504718e+00,  1.17225865e+00,  1.16854060e+00],
         [ 3.29991274e+00,  3.29991274e+00,  3.29991274e+00, ...,
           3.29991274e+00,  3.29991274e+00,  3.29991274e+00]],

        [[ 4.51478556e+00,  4.51478556e+00,  4.51478556e+00, ...,
           4.51478556e+00,  4.51478556e+00,  4.51478556e+00],
         [ 2.25142265e+00,  2.23933899e+00,  2.22725532e+00, ...,
           2.28767364e+00,  2.27651949e+00,  2.26350631e+00],
         [ 2.60184886e+00,  2.57675202e+00,  2.55258470e+00, ...,
           2.67435083e+00,  2.65018351e+00,  2.62601618e+00],
         ...,
         [ 8.71096600e-01,  8.69237575e-01,  8.66449038e-01, ...,
           8.79462213e-01,  8.76673675e-01,  8.73885138e-01],
         [ 1.42601556e+00,  1.42415653e+00,  1.42229751e+00, ...,
           1.42973361e+00,  1.42787458e+00,  1.42694507e+00],
         [ 3.27202736e+00,  3.27202736e+00,  3.27202736e+00, ...,
           3.27202736e+00,  3.27202736e+00,  3.27202736e+00]],

        [[ 3.90409586e+00,  3.90409586e+00,  3.90409586e+00, ...,
           3.90409586e+00,  3.90409586e+00,  3.90409586e+00],
         [ 1.93167035e+00,  1.92144572e+00,  1.91122108e+00, ...,
           1.96420329e+00,  1.95304914e+00,  1.94282450e+00],
         [ 2.01439696e+00,  1.99301818e+00,  1.97163939e+00, ...,
           2.07853333e+00,  2.05715454e+00,  2.03484624e+00],
         ...,
         [ 1.34235943e+00,  1.33213480e+00,  1.32098065e+00, ...,
           1.37210383e+00,  1.36280871e+00,  1.35165456e+00],
         [ 1.76993518e+00,  1.76435810e+00,  1.75971054e+00, ...,
           1.78573689e+00,  1.77923030e+00,  1.77458274e+00],
         [ 4.89867422e+00,  4.89867422e+00,  4.89867422e+00, ...,
           4.89867422e+00,  4.89867422e+00,  4.89867422e+00]],

        ...,

        [[-3.65191118e+00, -3.65191118e+00, -3.65191118e+00, ...,
          -3.65191118e+00, -3.65191118e+00, -3.65191118e+00],
         [-2.23068658e+00, -2.22510951e+00, -2.21767341e+00, ...,
          -2.24927683e+00, -2.24369976e+00, -2.23719317e+00],
         [-1.79288620e+00, -1.78080254e+00, -1.76778936e+00, ...,
          -1.83006670e+00, -1.81798304e+00, -1.80496986e+00],
         ...,
         [-1.64044615e+00, -1.63858712e+00, -1.63486907e+00, ...,
          -1.64974127e+00, -1.64695274e+00, -1.64416420e+00],
         [-4.04194537e-01, -4.03265025e-01, -4.01406000e-01, ...,
          -4.09771612e-01, -4.07912587e-01, -4.06983075e-01],
         [-2.70344738e-01, -2.70344738e-01, -2.70344738e-01, ...,
          -2.70344738e-01, -2.70344738e-01, -2.70344738e-01]],

        [[-2.39335127e+00, -2.39335127e+00, -2.39335127e+00, ...,
          -2.39335127e+00, -2.39335127e+00, -2.39335127e+00],
         [-3.16820363e-01, -3.08454750e-01, -2.98230113e-01, ...,
          -3.44705738e-01, -3.34481100e-01, -3.26115488e-01],
         [ 4.29009711e-02,  5.96321959e-02,  7.91519582e-02, ...,
          -1.28697784e-02,  5.72047143e-03,  2.43107212e-02],
         ...,
         [-1.28129327e-01, -1.17904690e-01, -1.07680052e-01, ...,
          -1.59732752e-01, -1.49508115e-01, -1.38353965e-01],
         [-1.94124714e-01, -1.90406664e-01, -1.84829589e-01, ...,
          -2.10855939e-01, -2.05278864e-01, -1.99701789e-01],
         [ 3.84032055e-01,  3.84032055e-01,  3.84032055e-01, ...,
           3.84032055e-01,  3.84032055e-01,  3.84032055e-01]],

        [[-3.02263122e+00, -3.02263122e+00, -3.02263122e+00, ...,
          -3.02263122e+00, -3.02263122e+00, -3.02263122e+00],
         [-6.78400722e-01, -6.66317059e-01, -6.55162910e-01, ...,
          -7.14651709e-01, -7.03497559e-01, -6.90484384e-01],
         [-6.42149735e-01, -6.17052897e-01, -5.94744598e-01, ...,
          -7.12792684e-01, -6.89554872e-01, -6.65387547e-01],
         ...,
         [ 3.15026123e+00,  3.14747269e+00,  3.14375464e+00, ...,
           3.16234489e+00,  3.15862684e+00,  3.15490879e+00],
         [ 2.87419602e+00,  2.87140748e+00,  2.86954846e+00, ...,
           2.87791407e+00,  2.87605504e+00,  2.87512553e+00],
         [ 1.97535744e+00,  1.97535744e+00,  1.97535744e+00, ...,
           1.97535744e+00,  1.97535744e+00,  1.97535744e+00]]],
  mask=False,
  fill_value=1e+20)

如果我们想要获取变量的属性,可以按照以下两种方法获取:

print(u10.getncattr('units'))  # 函数法
print(u10.units)               # 属性法
m s**-1
m s**-1

这里我比较推荐使用函数法,因为这种方法可以将属性名称作为变量传入而避免成为硬编码,属性法的写法是一种硬编码。

硬编码是一种有缺陷的编程习惯,它会降低代码模块的可复用性。以上述获取变量属性为例,如果我们使用硬编码的属性方式获取units属性,如果后续想要获取其他属性,则需要通过“修改源码”的方式进行,但如果我们使用函数法获取属性,那么属性的 key 就可以在封装时作为一种参数进行传递,在需要获取的时候只需要更换为想要的 key 即可,如果辅以循环逻辑,可以非常灵活地变更所要获取属性的内容和数量。类似地,当我们要给一个变量赋值属性时,如果使用属性法,就需要根据需要把每一个属性都写在源码里,并且一旦属性内容变更,就要修改源码;而如果我们使用函数法,可以将属性内容以字典的方式通过参数传入,如果属性内容有变更,无需修改源码只需要修改传递的参数即可。这种设计思路也是依据软件编程范式中大名鼎鼎的开闭原则

使用 xarray

下面我们再来使用 xarray 读取 nc 文件。xarray 是基于各种格点数据底层 IO 包的高级封装。在读取 netcdf 数据时,通常也是基于 netcdf4 包作为依赖。

import xarray as xr

ds = xr.open_dataset(ERA5_FP)
ds
/opt/conda/lib/python3.9/site-packages/xarray/backends/plugins.py:71: RuntimeWarning: Engine 'cfradial1' loading failed:
cannot import name 'ParasiteAxesAuxTrans' from 'mpl_toolkits.axisartist' (/opt/conda/lib/python3.9/site-packages/mpl_toolkits/axisartist/__init__.py)
  warnings.warn(f"Engine {name!r} loading failed:\n{ex}", RuntimeWarning)

要获取某个要素:

u10 = ds['u10']
u10
如果要获取要素的数组:
print(u10.values)

 得到的数组实质上是 numpy 的 ndarray 类型数组

print(type(u10.values))

 <class 'numpy.ndarray'>

如果要获取要素的属性,方法和 netCDF4 包里的方法类似,有两种方法:

print(u10.attrs.get('units'))  # 函数法
print(u10.units)               # 属性法
m s**-1
m s**-1

netCDF-4 ,还是 xarray?

xarray 相对 netcdf4 来说,有更高级和方便的封装,例如通过直接传递经纬度的方式在空间取值:

u10.sel(longitude=130, latitude=40, method='nearest')

或者用时间信息直接取值:

u10.sel(time='2023-08-22 00:00:00')

这些高级封装都让使用者获得了极其舒适的使用体验,这也让 xarray 成为科研人员处理格点数据时的首选模块包。对于科研人员来说,面对的数据处理情况相对简单,但对于工程师来说,处理 nc 文件所要面临的问题会更复杂,因此如果只掌握 xarray 是无法解决工作中遇到的所有问题的,所以仍需要学会更底层的 netcdf 包的相关技巧。

1.3 解析:时间维度、数据值

初学者在读取 nc 文件时,可能会在时间信息解析或者数据解包问题上遇到麻烦,下面我们就来看一下如果正确解析时间信息以及如何对数据进行正确的解包操作。

使用 netCDF4 时,正确解析时间维度信息的方法

上面我们可以看到使用 xarray 读取的数据,在对时间的选取上相对直观,但是在 netcdf4 包里,对时间的处理就会略显复杂。

ds = nc.Dataset(ERA5_FP)
time = ds.variables['time'][:]
time

 

masked_array(data=[1083792, 1083798, 1083804, 1083810, 1083816, 1083822,
                   1083828, 1083834, 1083840, 1083846, 1083852, 1083858,
                   1083864, 1083870, 1083876, 1083882, 1083888, 1083894,
                   1083900, 1083906, 1083912, 1083918, 1083924, 1083930,
                   1083936, 1083942, 1083948, 1083954, 1083960, 1083966,
                   1083972, 1083978, 1083984, 1083990, 1083996, 1084002,
                   1084008, 1084014, 1084020, 1084026, 1084032, 1084038,
                   1084044, 1084050, 1084056, 1084062, 1084068, 1084074,
                   1084080, 1084086, 1084092, 1084098, 1084104, 1084110,
                   1084116, 1084122, 1084128, 1084134, 1084140, 1084146,
                   1084152, 1084158, 1084164, 1084170, 1084176, 1084182,
                   1084188, 1084194, 1084200, 1084206, 1084212, 1084218,
                   1084224, 1084230, 1084236, 1084242, 1084248, 1084254,
                   1084260, 1084266],
             mask=False,
       fill_value=999999,
            dtype=int32)

可以看到,netcdf4 包读取的时间直接读取是一串整数,并不是直观的时间字符串。如果要正确解析它的时间信息,需要用一些专门的方法进行转换:

dts = nc.num2date(time, ds.variables['time'].units)
dts
masked_array(data=[cftime.DatetimeGregorian(2023, 8, 22, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 22, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 22, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 22, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 23, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 23, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 23, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 23, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 24, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 24, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 24, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 24, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 25, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 25, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 25, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 25, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 26, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 26, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 26, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 26, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 27, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 27, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 27, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 27, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 28, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 28, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 28, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 28, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 29, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 29, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 29, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 29, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 30, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 30, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 30, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 30, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 31, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 31, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 31, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 8, 31, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 1, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 1, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 1, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 1, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 2, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 2, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 2, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 2, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 3, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 3, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 3, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 3, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 4, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 4, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 4, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 4, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 5, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 5, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 5, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 5, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 6, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 6, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 6, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 6, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 7, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 7, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 7, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 7, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 8, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 8, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 8, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 8, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 9, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 9, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 9, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 9, 18, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 10, 0, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 10, 6, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 10, 12, 0, 0, 0, has_year_zero=False),
                   cftime.DatetimeGregorian(2023, 9, 10, 18, 0, 0, 0, has_year_zero=False)],
             mask=False,
       fill_value='?',
            dtype=object)

如果想以 Python 的 datetime 对象形式来存储返回值,可以增加 only_use_python_datetimes 和 only_use_cftime_datetimes 来强制将结果按照符合 Python datetime 对象的形式返回。

dts = nc.num2date(time, ds.variables['time'].units, only_use_python_datetimes=True, only_use_cftime_datetimes=False)
dts
masked_array(data=[real_datetime(2023, 8, 22, 0, 0),
                   real_datetime(2023, 8, 22, 6, 0),
                   real_datetime(2023, 8, 22, 12, 0),
                   real_datetime(2023, 8, 22, 18, 0),
                   real_datetime(2023, 8, 23, 0, 0),
                   real_datetime(2023, 8, 23, 6, 0),
                   real_datetime(2023, 8, 23, 12, 0),
                   real_datetime(2023, 8, 23, 18, 0),
                   real_datetime(2023, 8, 24, 0, 0),
                   real_datetime(2023, 8, 24, 6, 0),
                   real_datetime(2023, 8, 24, 12, 0),
                   real_datetime(2023, 8, 24, 18, 0),
                   real_datetime(2023, 8, 25, 0, 0),
                   real_datetime(2023, 8, 25, 6, 0),
                   real_datetime(2023, 8, 25, 12, 0),
                   real_datetime(2023, 8, 25, 18, 0),
                   real_datetime(2023, 8, 26, 0, 0),
                   real_datetime(2023, 8, 26, 6, 0),
                   real_datetime(2023, 8, 26, 12, 0),
                   real_datetime(2023, 8, 26, 18, 0),
                   real_datetime(2023, 8, 27, 0, 0),
                   real_datetime(2023, 8, 27, 6, 0),
                   real_datetime(2023, 8, 27, 12, 0),
                   real_datetime(2023, 8, 27, 18, 0),
                   real_datetime(2023, 8, 28, 0, 0),
                   real_datetime(2023, 8, 28, 6, 0),
                   real_datetime(2023, 8, 28, 12, 0),
                   real_datetime(2023, 8, 28, 18, 0),
                   real_datetime(2023, 8, 29, 0, 0),
                   real_datetime(2023, 8, 29, 6, 0),
                   real_datetime(2023, 8, 29, 12, 0),
                   real_datetime(2023, 8, 29, 18, 0),
                   real_datetime(2023, 8, 30, 0, 0),
                   real_datetime(2023, 8, 30, 6, 0),
                   real_datetime(2023, 8, 30, 12, 0),
                   real_datetime(2023, 8, 30, 18, 0),
                   real_datetime(2023, 8, 31, 0, 0),
                   real_datetime(2023, 8, 31, 6, 0),
                   real_datetime(2023, 8, 31, 12, 0),
                   real_datetime(2023, 8, 31, 18, 0),
                   real_datetime(2023, 9, 1, 0, 0),
                   real_datetime(2023, 9, 1, 6, 0),
                   real_datetime(2023, 9, 1, 12, 0),
                   real_datetime(2023, 9, 1, 18, 0),
                   real_datetime(2023, 9, 2, 0, 0),
                   real_datetime(2023, 9, 2, 6, 0),
                   real_datetime(2023, 9, 2, 12, 0),
                   real_datetime(2023, 9, 2, 18, 0),
                   real_datetime(2023, 9, 3, 0, 0),
                   real_datetime(2023, 9, 3, 6, 0),
                   real_datetime(2023, 9, 3, 12, 0),
                   real_datetime(2023, 9, 3, 18, 0),
                   real_datetime(2023, 9, 4, 0, 0),
                   real_datetime(2023, 9, 4, 6, 0),
                   real_datetime(2023, 9, 4, 12, 0),
                   real_datetime(2023, 9, 4, 18, 0),
                   real_datetime(2023, 9, 5, 0, 0),
                   real_datetime(2023, 9, 5, 6, 0),
                   real_datetime(2023, 9, 5, 12, 0),
                   real_datetime(2023, 9, 5, 18, 0),
                   real_datetime(2023, 9, 6, 0, 0),
                   real_datetime(2023, 9, 6, 6, 0),
                   real_datetime(2023, 9, 6, 12, 0),
                   real_datetime(2023, 9, 6, 18, 0),
                   real_datetime(2023, 9, 7, 0, 0),
                   real_datetime(2023, 9, 7, 6, 0),
                   real_datetime(2023, 9, 7, 12, 0),
                   real_datetime(2023, 9, 7, 18, 0),
                   real_datetime(2023, 9, 8, 0, 0),
                   real_datetime(2023, 9, 8, 6, 0),
                   real_datetime(2023, 9, 8, 12, 0),
                   real_datetime(2023, 9, 8, 18, 0),
                   real_datetime(2023, 9, 9, 0, 0),
                   real_datetime(2023, 9, 9, 6, 0),
                   real_datetime(2023, 9, 9, 12, 0),
                   real_datetime(2023, 9, 9, 18, 0),
                   real_datetime(2023, 9, 10, 0, 0),
                   real_datetime(2023, 9, 10, 6, 0),
                   real_datetime(2023, 9, 10, 12, 0),
                   real_datetime(2023, 9, 10, 18, 0)],
             mask=False,
       fill_value='?',
            dtype=object)

基于 scale_factor 和 add_offset 参数,正确解析数据值

还记得我们在 1.1 章节里提到的 scale_factor 和 add_offset 吗?

这是一种压缩参数,使用这两个参数对原始数据值进行一个公式计算,就可以把它转为更节省空间的整型数据类型进行存储以节约存储空间(打包),而在读取数据的时候,再使用这两个参数反向计算一下获取真实的原始数据(解包)。

ds.variables['u10']
<class 'netCDF4._netCDF4.Variable'>
int16 u10(time, latitude, longitude)
    long_name: 10 metre U wind component
    units: m s**-1
    add_offset: -2.1228631327102883
    scale_factor: 0.000929512490889013
    _FillValue: -32767
    missing_value: -32767
unlimited dimensions: time
current shape = (80, 721, 1440)
filling off

我们在 u10 要素的属性中,可以看到 add_offset 和 scale_factor 两个参数。

在最新版的 netcdf4 包以及依赖于 netcdf4 包的 xarray 中,通常是不需要考虑 scale_factor 和 add_offset 的影响的,因为 netcdf4 包已经可以自动解包数据了。例如我们在 1.2 章节里,无论使用 xarray 还是 netcdf4,读取出来的 u10 的值看起来都是在比较合理的范围内。我们根本感觉不到解包的过程。

但是如果我们使用 h5py 等其他并不自动解包的库来处理数据时,就需要手动解包:

import h5py

ds = h5py.File(ERA5_FP)
ds['u10'][:]
array([[[ 8845,  8845,  8845, ...,  8845,  8845,  8845],
        [ 5674,  5659,  5643, ...,  5720,  5705,  5690],
        [ 5395,  5365,  5334, ...,  5488,  5457,  5427],
        ...,
        [ 3646,  3640,  3632, ...,  3667,  3660,  3653],
        [ 3538,  3535,  3532, ...,  3548,  3545,  3541],
        [ 5834,  5834,  5834, ...,  5834,  5834,  5834]],

       [[ 7141,  7141,  7141, ...,  7141,  7141,  7141],
        [ 4706,  4693,  4680, ...,  4745,  4733,  4719],
        [ 5083,  5056,  5030, ...,  5161,  5135,  5109],
        ...,
        [ 3221,  3219,  3216, ...,  3230,  3227,  3224],
        [ 3818,  3816,  3814, ...,  3822,  3820,  3819],
        [ 5804,  5804,  5804, ...,  5804,  5804,  5804]],

       [[ 6484,  6484,  6484, ...,  6484,  6484,  6484],
        [ 4362,  4351,  4340, ...,  4397,  4385,  4374],
        [ 4451,  4428,  4405, ...,  4520,  4497,  4473],
        ...,
        [ 3728,  3717,  3705, ...,  3760,  3750,  3738],
        [ 4188,  4182,  4177, ...,  4205,  4198,  4193],
        [ 7554,  7554,  7554, ...,  7554,  7554,  7554]],

       ...,

       [[-1645, -1645, -1645, ..., -1645, -1645, -1645],
        [ -116,  -110,  -102, ...,  -136,  -130,  -123],
        [  355,   368,   382, ...,   315,   328,   342],
        ...,
        [  519,   521,   525, ...,   509,   512,   515],
        [ 1849,  1850,  1852, ...,  1843,  1845,  1846],
        [ 1993,  1993,  1993, ...,  1993,  1993,  1993]],

       [[ -291,  -291,  -291, ...,  -291,  -291,  -291],
        [ 1943,  1952,  1963, ...,  1913,  1924,  1933],
        [ 2330,  2348,  2369, ...,  2270,  2290,  2310],
        ...,
        [ 2146,  2157,  2168, ...,  2112,  2123,  2135],
        [ 2075,  2079,  2085, ...,  2057,  2063,  2069],
        [ 2697,  2697,  2697, ...,  2697,  2697,  2697]],

       [[ -968,  -968,  -968, ...,  -968,  -968,  -968],
        [ 1554,  1567,  1579, ...,  1515,  1527,  1541],
        [ 1593,  1620,  1644, ...,  1517,  1542,  1568],
        ...,
        [ 5673,  5670,  5666, ...,  5686,  5682,  5678],
        [ 5376,  5373,  5371, ...,  5380,  5378,  5377],
        [ 4409,  4409,  4409, ...,  4409,  4409,  4409]]], dtype=int16)

使用 h5py 读取 u10 时得到的是原始值,而我们需要从它的属性中读取解包参数然后使用相应的解包公式来完成解包:

add_offset = ds['u10'].attrs.get('add_offset', None)
scale_factor = ds['u10'].attrs.get('scale_factor', None)
real_u10 = ds['u10'][:] * scale_factor + add_offset    # 解包公式
real_u10
array([[[ 6.09867485e+00,  6.09867485e+00,  6.09867485e+00, ...,
          6.09867485e+00,  6.09867485e+00,  6.09867485e+00],
        [ 3.15119074e+00,  3.13724805e+00,  3.12237585e+00, ...,
          3.19394832e+00,  3.18000563e+00,  3.16606294e+00],
        [ 2.89185676e+00,  2.86397138e+00,  2.83515649e+00, ...,
          2.97830142e+00,  2.94948653e+00,  2.92160116e+00],
        ...,
        [ 1.26613941e+00,  1.26056233e+00,  1.25312623e+00, ...,
          1.28565917e+00,  1.27915258e+00,  1.27264600e+00],
        [ 1.16575206e+00,  1.16296352e+00,  1.16017499e+00, ...,
          1.17504718e+00,  1.17225865e+00,  1.16854060e+00],
        [ 3.29991274e+00,  3.29991274e+00,  3.29991274e+00, ...,
          3.29991274e+00,  3.29991274e+00,  3.29991274e+00]],

       [[ 4.51478556e+00,  4.51478556e+00,  4.51478556e+00, ...,
          4.51478556e+00,  4.51478556e+00,  4.51478556e+00],
        [ 2.25142265e+00,  2.23933899e+00,  2.22725532e+00, ...,
          2.28767364e+00,  2.27651949e+00,  2.26350631e+00],
        [ 2.60184886e+00,  2.57675202e+00,  2.55258470e+00, ...,
          2.67435083e+00,  2.65018351e+00,  2.62601618e+00],
        ...,
        [ 8.71096600e-01,  8.69237575e-01,  8.66449038e-01, ...,
          8.79462213e-01,  8.76673675e-01,  8.73885138e-01],
        [ 1.42601556e+00,  1.42415653e+00,  1.42229751e+00, ...,
          1.42973361e+00,  1.42787458e+00,  1.42694507e+00],
        [ 3.27202736e+00,  3.27202736e+00,  3.27202736e+00, ...,
          3.27202736e+00,  3.27202736e+00,  3.27202736e+00]],

       [[ 3.90409586e+00,  3.90409586e+00,  3.90409586e+00, ...,
          3.90409586e+00,  3.90409586e+00,  3.90409586e+00],
        [ 1.93167035e+00,  1.92144572e+00,  1.91122108e+00, ...,
          1.96420329e+00,  1.95304914e+00,  1.94282450e+00],
        [ 2.01439696e+00,  1.99301818e+00,  1.97163939e+00, ...,
          2.07853333e+00,  2.05715454e+00,  2.03484624e+00],
        ...,
        [ 1.34235943e+00,  1.33213480e+00,  1.32098065e+00, ...,
          1.37210383e+00,  1.36280871e+00,  1.35165456e+00],
        [ 1.76993518e+00,  1.76435810e+00,  1.75971054e+00, ...,
          1.78573689e+00,  1.77923030e+00,  1.77458274e+00],
        [ 4.89867422e+00,  4.89867422e+00,  4.89867422e+00, ...,
          4.89867422e+00,  4.89867422e+00,  4.89867422e+00]],

       ...,

       [[-3.65191118e+00, -3.65191118e+00, -3.65191118e+00, ...,
         -3.65191118e+00, -3.65191118e+00, -3.65191118e+00],
        [-2.23068658e+00, -2.22510951e+00, -2.21767341e+00, ...,
         -2.24927683e+00, -2.24369976e+00, -2.23719317e+00],
        [-1.79288620e+00, -1.78080254e+00, -1.76778936e+00, ...,
         -1.83006670e+00, -1.81798304e+00, -1.80496986e+00],
        ...,
        [-1.64044615e+00, -1.63858712e+00, -1.63486907e+00, ...,
         -1.64974127e+00, -1.64695274e+00, -1.64416420e+00],
        [-4.04194537e-01, -4.03265025e-01, -4.01406000e-01, ...,
         -4.09771612e-01, -4.07912587e-01, -4.06983075e-01],
        [-2.70344738e-01, -2.70344738e-01, -2.70344738e-01, ...,
         -2.70344738e-01, -2.70344738e-01, -2.70344738e-01]],

       [[-2.39335127e+00, -2.39335127e+00, -2.39335127e+00, ...,
         -2.39335127e+00, -2.39335127e+00, -2.39335127e+00],
        [-3.16820363e-01, -3.08454750e-01, -2.98230113e-01, ...,
         -3.44705738e-01, -3.34481100e-01, -3.26115488e-01],
        [ 4.29009711e-02,  5.96321959e-02,  7.91519582e-02, ...,
         -1.28697784e-02,  5.72047143e-03,  2.43107212e-02],
        ...,
        [-1.28129327e-01, -1.17904690e-01, -1.07680052e-01, ...,
         -1.59732752e-01, -1.49508115e-01, -1.38353965e-01],
        [-1.94124714e-01, -1.90406664e-01, -1.84829589e-01, ...,
         -2.10855939e-01, -2.05278864e-01, -1.99701789e-01],
        [ 3.84032055e-01,  3.84032055e-01,  3.84032055e-01, ...,
          3.84032055e-01,  3.84032055e-01,  3.84032055e-01]],

       [[-3.02263122e+00, -3.02263122e+00, -3.02263122e+00, ...,
         -3.02263122e+00, -3.02263122e+00, -3.02263122e+00],
        [-6.78400722e-01, -6.66317059e-01, -6.55162910e-01, ...,
         -7.14651709e-01, -7.03497559e-01, -6.90484384e-01],
        [-6.42149735e-01, -6.17052897e-01, -5.94744598e-01, ...,
         -7.12792684e-01, -6.89554872e-01, -6.65387547e-01],
        ...,
        [ 3.15026123e+00,  3.14747269e+00,  3.14375464e+00, ...,
          3.16234489e+00,  3.15862684e+00,  3.15490879e+00],
        [ 2.87419602e+00,  2.87140748e+00,  2.86954846e+00, ...,
          2.87791407e+00,  2.87605504e+00,  2.87512553e+00],
        [ 1.97535744e+00,  1.97535744e+00,  1.97535744e+00, ...,
          1.97535744e+00,  1.97535744e+00,  1.97535744e+00]]])

1.4 简单输出:用 netcdf4-python、xarray 等常⻅工具

知道了如何读取 nc 文件,也需要知道如何输出和保存 nc 文件,我们还是以 netcdf4-python 和 xarray 两个库为例,来看一看如何保存一个简单结构的 nc 文件。

首先还是以 netcdf4 为例,我们从前面的文件里,把 u10、v10 数据计算为10米风速 wspd ,然后保存为 nc 文件。简单起见,这里我们省去对时间维度的处理。

ds = nc.Dataset(ERA5_FP)

u10 = ds.variables['u10'][0]  # 取第 0 个时间点,从而忽略时间维
v10 = ds.variables['v10'][0]  # 同上
wspd = (u10 ** 2 + v10 ** 2 ) ** 0.5  # 勾股定理计算风速值
wspd
masked_array(
  data=[[7.45549172, 7.45549172, 7.45549172, ..., 7.45549172, 7.45549172,
         7.45549172],
        [7.58922772, 7.59065141, 7.59092519, ..., 7.58714413, 7.58766872,
         7.58822541],
        [7.33959868, 7.34241225, 7.34581297, ..., 7.33310267, 7.33512501,
         7.33766044],
        ...,
        [3.81937921, 3.81753401, 3.81508504, ..., 3.82589441, 3.82371284,
         3.8215411 ],
        [3.65374161, 3.65285287, 3.65196603, ..., 3.6567179 , 3.65582279,
         3.65463227],
        [3.9934181 , 3.9934181 , 3.9934181 , ..., 3.9934181 , 3.9934181 ,
         3.9934181 ]],
  mask=False,
  fill_value=1e+20)
wspd.shape # 查看风速矩阵的形状
(721, 1440)

使用 netcdf4 创建和保存 nc 文件的基本步骤

with nc.Dataset('./output.nc', 'w') as dsout:  # 以“写”模式创建 Dataset 实例
    dsout.createDimension('lat', wspd.shape[0])  # 创建纬度维
    dsout.createDimension('lon', wspd.shape[1])  # 创建经度维
    dsout.createVariable('longitude', float, ('lon',))
    dsout.createVariable('latitude', float, ('lat',))
    dsout.createVariable('wspd', float, ('lat', 'lon'))

    dsout.variables['longitude'][:] = ds.variables['longitude'][:]
    dsout.variables['latitude'][:] = ds.variables['latitude'][:]
    dsout.variables['wspd'][:] = wspd

看一下生成的数据

with nc.Dataset('./output.nc') as new_ds:
    print(new_ds)
<class 'netCDF4._netCDF4.Dataset'>
root group (NETCDF4 data model, file format HDF5):
    dimensions(sizes): lat(721), lon(1440)
    variables(dimensions): float64 longitude(lon), float64 latitude(lat), float64 wspd(lat, lon)
    groups: 

下面用 xarray 来实现上面的步骤

ds = xr.open_dataset(ERA5_FP)
dsout = xr.DataArray(wspd,
                        name='wspd',
                        dims=['lat', 'lon'], 
                        coords={'latitude': ('lat', ds['latitude'].values), 
                                'longitude': ('lon', ds['longitude'].values)})
dsout.to_netcdf('./output_by_xarray.nc')
dsout

可以看出,实现相同功能的情况下,使用 xarray 比使用 netcdf4 的代码量要少得多。

✍️ 练习:创建纬度和经度范围的 NetCDF 文件并填充随机数据

使用随机数组生成一个简单结构的 netcdf 格式文件,要素的名叫 myvar,数组尺寸为 (181, 360),纬度为90~-90,长度181间隔为-1°,经度为-179~180,长度360间隔为1°。

# import netCDF4 as nc
# import xarray as xr
# import numpy as np

# filename = "simple_netcdf.nc"  
# with nc.Dataset(filename, "w", format="NETCDF4") as dataset:
#     lat_dim = dataset.createDimension('latitude', 181)  
#     lon_dim = dataset.createDimension('longitude', 360) 

#     latitude = dataset.createVariable('latitude', np.float32, ('latitude',))
#     longitude = dataset.createVariable('longitude', np.float32, ('longitude',))

#     latitude[:] = np.arange(90, -91, -1) 
#     longitude[:] = np.arange(-179, 181)  

#     myvar = dataset.createVariable('myvar', np.float32, ('latitude', 'longitude'))

#     myvar[:, :] = np.random.rand(181, 360)

# ds = xr.open_dataset('simple_netcdf.nc')
# ds

1.5 修改并覆写:针对简单结构的 netcdf 文件

如果我们有一个已经存在的 nc 文件,想要在不创建新的文件的情况下直接原地修改文件,要怎么做呢?这个时候,xarray 就排不上用场了,只能用 netcdf4 来实现。我们就拿刚才创建的 output.nc 文件为例,来看一下如果我们想把所有风速小于 5 的值都改为 0。

ds = nc.Dataset('./output.nc', 'r+')  # 这里使用 r+ 模式打开,即覆写模式
wspd = ds.variables['wspd'][:]
wspd

 

masked_array(
  data=[[7.45549172, 7.45549172, 7.45549172, ..., 7.45549172, 7.45549172,
         7.45549172],
        [7.58922772, 7.59065141, 7.59092519, ..., 7.58714413, 7.58766872,
         7.58822541],
        [7.33959868, 7.34241225, 7.34581297, ..., 7.33310267, 7.33512501,
         7.33766044],
        ...,
        [3.81937921, 3.81753401, 3.81508504, ..., 3.82589441, 3.82371284,
         3.8215411 ],
        [3.65374161, 3.65285287, 3.65196603, ..., 3.6567179 , 3.65582279,
         3.65463227],
        [3.9934181 , 3.9934181 , 3.9934181 , ..., 3.9934181 , 3.9934181 ,
         3.9934181 ]],
  mask=False,
  fill_value=1e+20)

我们使用 numpy 的数值筛选功能,把所有小于 5 的元素全部改为 0:

wspd[wspd<5] = 0
wspd
masked_array(
  data=[[7.45549172, 7.45549172, 7.45549172, ..., 7.45549172, 7.45549172,
         7.45549172],
        [7.58922772, 7.59065141, 7.59092519, ..., 7.58714413, 7.58766872,
         7.58822541],
        [7.33959868, 7.34241225, 7.34581297, ..., 7.33310267, 7.33512501,
         7.33766044],
        ...,
        [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
         0.        ],
        [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
         0.        ],
        [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
         0.        ]],
  mask=False,
  fill_value=1e+20)

最后把修改后的值赋值给原数组,并关闭对象:

ds.variables['wspd'][:] = wspd  # 赋值给 dataset 对象
ds.close()   # 关闭对象

完成覆写修改以后,我们重新打开文件看看:

ds = nc.Dataset('./output.nc')
wspd = ds.variables['wspd'][:]
wspd
masked_array(
  data=[[7.45549172, 7.45549172, 7.45549172, ..., 7.45549172, 7.45549172,
         7.45549172],
        [7.58922772, 7.59065141, 7.59092519, ..., 7.58714413, 7.58766872,
         7.58822541],
        [7.33959868, 7.34241225, 7.34581297, ..., 7.33310267, 7.33512501,
         7.33766044],
        ...,
        [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
         0.        ],
        [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
         0.        ],
        [0.        , 0.        , 0.        , ..., 0.        , 0.        ,
         0.        ]],
  mask=False,
  fill_value=1e+20)

可以看到文件已经被修改完成了。

在工程实践中,以覆写方式修改已存在的文件可以提高处理大数据集的性能,减少代码的复杂程度。

通过上面的讲解,我们已经初步了解了如何通过各种工具对 nc 文件进行查看、读写,同时也了解了 nc 文件的时间信息存储的特点及正确的解析方式,此外还对 nc 数据中的打包和解包有了初步的认识。下面我们准备了几道闯关题,来测测你对知识的吸收程度如何吧。

闯关题

STEP1:请根据要求完成题目

Q1:读取 ERA5 样例数据,获取特定时间和坐标点的值

读取样例文件 /home/mw/input/training_camp9055/era5-temp.nc 文件,从中取出 2023 年 8 月 26 日 10:00 时刻的数据,并获取坐标点为 lon=110, lat=40 处的值,保留 1 位小数后赋值给 a1。

### ...your code here...

# a1 =
Q2:修改一个存储气温的 nc 文件,将开氏温度改为摄氏度

将 /home/mw/input/training_camp9055/era5-temp-float.nc 拷贝到工作目录,并原地修改拷贝过来的 era5-temp-float.nc,将原本的开氏温度转为摄氏度,然后计算转换后气温整个数据矩阵的平均值,保留 1 位小数后赋值给 a2。

In [38]:

### ...your code here...

# a2 =

STEP2:将结果保存为 csv 文件

csv 需要有两列,列名:id、answer。其中,id 列为题号,如 q1、q2;answer 列为 STEP1 中各题你计算出来的结果。💡 这一步的代码你无需修改,直接运行即可。

# 生成 csv 作业答案文件
def save_csv(answers):
    import pandas as pd
    df = pd.DataFrame({"id": ["q1", "q2"], "answer": answers})
    df.to_csv("answer_MeteoEng_1_1.csv", index=None)

save_csv([a1,a2])  # 该csv文件在左侧文件树project工作区下,你可以自行右击下载或者读取查看

 

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.coloradmin.cn/o/2283904.html

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈,一经查实,立即删除!

相关文章

LLM:BERT or BART 之BERT

文章目录 前言一、BERT1. Decoder-only2. Encoder-only3. Use of Bidirectional Context4. Masked Language Model (MLM)5. Next Sentence Prediction (NSP)6. Fine-tune1、情感分析2、句对分析3、命名实体识别&#xff08;NER&#xff09; 7. BERT总结 总结 前言 NLP选手对这…

【力扣:新动计划,编程入门 —— 题解 ③】

—— 25.1.26 231. 2 的幂 给你一个整数 n&#xff0c;请你判断该整数是否是 2 的幂次方。如果是&#xff0c;返回 true &#xff1b;否则&#xff0c;返回 false 。 如果存在一个整数 x 使得 n 2x &#xff0c;则认为 n 是 2 的幂次方。 示例 1&#xff1a; 输入&#xff1a;…

Centos7系统php8编译安装ImageMagick/Imagick扩展教程整理

Centos7系统php8编译安装ImageMagick/Imagick扩展教程整理 安装php8安装ImageMagick1、下载ImageMagick2、解压并安装3、查看是否安装成功 安装imagick扩展包 安装php8 点我安装php8 安装ImageMagick 1、下载ImageMagick wget https://www.imagemagick.org/download/ImageMa…

android的gradle

Gradle User Manual gradle官网 这里有个gradlew很有用&#xff0c;因为这个可以在窗口中运行gradlew脚本 gradlew 和 gradlew.bat 都是 Gradle Wrapper&#xff08;Gradle 包装器&#xff09; 的一部分&#xff0c;它们的作用是让项目可以使用 Gradle 而无需提前在系统中…

2025美赛MCM数学建模A题:《石头台阶的“记忆”:如何用数学揭开历史的足迹》(全网最全思路+模型)

✨个人主页欢迎您的访问 ✨期待您的三连 ✨ 《石头台阶的“记忆”&#xff1a;如何用数学揭开历史的足迹》 目录 《石头台阶的“记忆”&#xff1a;如何用数学揭开历史的足迹》 ✨摘要✨ ✨引言✨ 1. 引言的结构 2. 撰写步骤 &#xff08;1&#xff09;研究背景 &#…

【S32K3 RTD LLD篇7】K344中心对齐PWM中心点触发ADC BCTU采样

【S32K3 RTD LLD篇7】K344中心对齐PWM中心点触发ADC BCTU采样 一&#xff0c;文档简介二&#xff0c;中心对齐PWM中心点触发ADC原理2.1 如何生成中心对齐的PWM2.2 如何生成PWM中心点触发标志 三&#xff0c; 软件配置与实现3.1 Demo CT 模块配置3.1.1 引脚配置3.1.2 时钟配置3.…

14-6-3C++STL的list

&#xff08;一&#xff09;list的插入 1.list.insert(pos,elem);//在pos位置插入一个elem元素的拷贝&#xff0c;返回新数据的位置 #include <iostream> #include <list> using namespace std; int main() { list<int> lst; lst.push_back(10); l…

unity学习20:time相关基础 Time.time 和 Time.deltaTime

目录 1 unity里的几种基本时间 1.1 time 相关测试脚本 1.2 游戏开始到现在所用的时间 Time.time 1.3 时间缩放值 Time.timeScale 1.4 固定时间间隔 Time.fixedDeltaTime 1.5 两次响应时间之间的间隔&#xff1a;Time.deltaTime 1.6 对应测试代码 1.7 需要关注的2个基本…

HarmonyOS:创建应用静态快捷方式

一、前言 静态快捷方式是一种在系统中创建的可以快速访问应用程序或特定功能的链接。它通常可以在长按应用图标&#xff0c;以图标和相应的文字出现在应用图标的上方&#xff0c;用户可以迅速启动对应应用程序的组件。使用快捷方式&#xff0c;可以提高效率&#xff0c;节省了查…

mysql 学习6 DQL语句,对数据库中的表进行 查询 操作

前期准备数据 重新create 一张表 create table emp(id int comment 编号,workno varchar(10) comment 工号,name varchar(10) comment 姓名,gender char comment 性别,ager tinyint unsigned comment 年龄,idcard char(18) comment 身份证号,workaddress varchar(10) c…

【ES实战】治理项之索引模板相关治理

索引模板治理 文章目录 索引模板治理问题现象分析思路操作步骤问题程序化方案索引与索引模板增加分片数校验管理 彩蛋如何查询Flink on Yarn 模式下的Task Manager日志相关配置查询已停止的Flink任务查询未停止的Flink任务 问题现象 在集群索引新建时&#xff0c;索引的分片比…

springboot3 集成 knife4j(接口文档)

提示&#xff1a;文章是集成 knife4j&#xff0c;而非 swagger2 或者 swagger3&#xff0c;效果如图 文章目录 前言一、添加依赖二、如何集成1.配置文件2.注解部分1.Tag2.Operation3.Parameter4.Schema 3.使用 总结 前言 提示&#xff1a;&#xff1a;大家在开发阶段&#xff…

51单片机开发:独立键盘实验

实验目的&#xff1a;按下键盘1时&#xff0c;点亮LED灯1。 键盘原理图如下图所示&#xff0c;可见&#xff0c;由于接GND&#xff0c;当键盘按下时&#xff0c;P3相应的端口为低电平。 键盘按下时会出现抖动&#xff0c;时间通常为5-10ms&#xff0c;代码中通过延时函数delay…

Flutter_学习记录_Tab的简单Demo~真的很简单

1. Tab的简单使用了解 要实现tab(选项卡或者标签视图)需要用到三个组件&#xff1a; TabBarTabBarViewTabController 这一块&#xff0c;我也不知道怎么整理了&#xff0c;直接提供代码吧&#xff1a; import package:flutter/material.dart;void main() {runApp(MyApp());…

GESP2024年3月认证C++六级( 第三部分编程题(1)游戏)

参考程序&#xff1a; #include <cstdio> using namespace std; const int N 2e5 5; const int mod 1e9 7; int n, a, b, c; int f[N << 1]; int ans; int main() {scanf("%d%d%d%d", &n, &a, &b, &c);f[N n] 1;for (int i n; i…

数据结构测试题2

一、单选题&#xff08;每题 2 分&#xff0c;共20分&#xff09; 1. 栈和队列的共同特点是( A )。 A.只允许在端点处插入和删除元素 B.都是先进后出 C.都是先进先出 D.没有共同点 2. 用链接方式存储的队列&#xff0c;在进行插入运算时( C ) A. 仅修改头指针 B. 头…

项目概述与规划 (I)

项目概述与规划 (I) JavaScript的学习已经接近尾声了&#xff0c;最后我们将通过一个项目来讲我们在JavaScript中学习到的所有都在这个项目中展现出来&#xff0c;这个项目的DEMO来自于Udemy中的课程&#xff0c;作者是Jonas Schmedtmann&#xff1b; 项目规划 项目步骤 用户…

【WebRTC - STUN/TURN服务 - COTURN配置】

在WebRTC中&#xff0c;对于通信的两端不在同一个局域网的情况下&#xff0c;通信两端往往无法P2P直接连接&#xff0c;需要一个TURN中继服务&#xff0c;而中继服务可以选用 COTURN 构建。 注&#xff1a;COTURN 是一个开源的 TURN&#xff08;Traversal Using Relays around…

【HuggingFace项目】:Open-R1 - DeepSeek-R1 大模型开源复现计划

项目链接&#xff1a;https://github.com/huggingface/open-r1 概述 Open-R1 是由 HuggingFace 发布的一个完全开放的项目&#xff0c;旨在通过三个主要步骤复现 DeepSeek-R1 的完整训练流程。这个项目的目标是让更多人能够理解和使用 DeepSeek-R1 的技术方案&#xff0c;从而…

On to OpenGL and 3D computer graphics

2. On to OpenGL and 3D computer graphics 声明&#xff1a;该代码来自&#xff1a;Computer Graphics Through OpenGL From Theory to Experiments&#xff0c;仅用作学习参考 2.1 First Program Square.cpp完整代码 /// // square.cpp // // OpenGL program to draw a squ…