Flashes of Insight While Learning NumPy

news2025/7/1 6:39:23

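This post records the stray ideas that came to me while learning numpy, and sketches an outline of the knowledge framework.
Recommended resource: https://numpy.org/doc/stable/user/absolute_beginners.html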

💡 Insights

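Why does the shape of a numpy array look reversed compared with a pytorch tensor?
- When an RGB image is read in, a pytorch tensor is usually (batch, channel, height, width), while a numpy array is usually (height, width, channel).
- To convert an array into a tensor, just use transforms.ToTensor(). Going the other way, before a tensor is converted back to an array and displayed with matplotlib, the dimensions must be reordered.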
```
import matplotlib.pyplot as plt
from PIL import Image
from torchvision import transforms

# Define the transform pipeline that turns an image into a tensor
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
])

# Load a single image (convert to RGB so there are exactly 3 channels)
image_pil = Image.open('path_to_image.jpg').convert('RGB')
image_tensor = transform(image_pil)

# Show the tensor shape
print(image_tensor.shape)  # torch.Size([3, 256, 256]): (channels, height, width)

# Note: permute the tensor to (height, width, channels) before matplotlib displays it
plt.imshow(image_tensor.permute(1, 2, 0).numpy())
plt.show()
```
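- 💡 A guess at how numpy works out an array's shape: given a nested list, it might first take its len, which gives dimension 0, then take the len of the first element, which gives dimension 1, and so on. It is like peeling an onion, one layer at a time. (This is only a mental model, and it assumes every sublist at the same depth has the same length.)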
```
import numpy as np
a = [[[1,2,3],[4,5,6]]]
print(len(a)) # 1
print(len(a[0])) # 2
print(len(a[0][0])) # 3
print(np.array(a).shape) # (1,2,3)
```
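- When an image is stored as raw pixels, the R, G, and B values of each pixel sit together as one color, e.g. #FFFFFF, so for numpy the channel ends up as the last axis.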
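To make the channel-last layout concrete, here is a minimal numpy-only sketch. The tiny 2x2 image and the variable names are made up for illustration. `np.transpose` is used to imitate the (height, width, channel) to (channel, height, width) reordering that `ToTensor()` performs; `ToTensor()` also rescales uint8 values to [0, 1], which is skipped here.

```python
import numpy as np

# A hypothetical 2x2 RGB image in (height, width, channel) layout.
img_hwc = np.array([
    [[255, 255, 255], [255, 0, 0]],
    [[0, 255, 0], [0, 0, 255]],
], dtype=np.uint8)

print(img_hwc.shape)   # (2, 2, 3)
print(img_hwc[0, 0])   # [255 255 255], i.e. #FFFFFF for the top-left pixel

# Move the channel axis to the front: (H, W, C) -> (C, H, W).
img_chw = np.transpose(img_hwc, (2, 0, 1))
print(img_chw.shape)   # (3, 2, 2)

# Move it back before handing the array to matplotlib: (C, H, W) -> (H, W, C).
restored = np.transpose(img_chw, (1, 2, 0))
print(np.array_equal(restored, img_hwc))  # True
```

Indexing `img_hwc[y, x]` returns the whole color of one pixel, which is why this layout feels natural for image files, while `img_chw[c]` returns one full color plane.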
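⚠️ Slicing a list returns a copy (a shallow copy: a new list holding the same elements), while basic slicing of an ndarray returns a view that shares memory with the original
- Here is an example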
```
import numpy as np
a_list = [1,2,3,4,5,6]
a_array = np.array(a_list)

b_list = a_list[0:4]
b_list[0] = 100
print(b_list, a_list) 
# [100, 2, 3, 4] [1, 2, 3, 4, 5, 6]

b_array = a_array[0:4]
b_array[0] = 100
print(b_array, a_array) 
# [100   2   3   4] [100   2   3   4   5   6]
```
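- The same distinction separates .flatten() and .ravel() when flattening an array. flatten() always returns a copy. ravel() returns a reference to the parent array (a "view") whenever possible, so any change to the new array also affects the parent. Because ravel() avoids copying in that case, it is memory efficient. When no view can represent the result (for example on a non-contiguous array), ravel() falls back to a copy.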
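The difference between the two can be checked directly. The sketch below uses `np.shares_memory` to test whether two arrays overlap in memory. The array contents and variable names are arbitrary.

```python
import numpy as np

parent = np.arange(6).reshape(2, 3)

flat_copy = parent.flatten()  # always a new array
flat_view = parent.ravel()    # a view here, because parent is contiguous

print(np.shares_memory(parent, flat_copy))  # False
print(np.shares_memory(parent, flat_view))  # True

flat_copy[0] = -1   # does not touch parent
flat_view[1] = 100  # writes through to parent
print(parent)
# [[  0 100   2]
#  [  3   4   5]]

# ravel() has to copy when no view can represent the result,
# for example on a transposed (non-contiguous) array.
print(np.shares_memory(parent.T, parent.T.ravel()))  # False
```

When it is unclear whether a result is a view, checking with `np.shares_memory` before writing into it is a cheap safeguard.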
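⚠️ empty is not really empty
- np.empty does not fill the array with 0. It only allocates the memory and leaves it uninitialized, so the contents are whatever happened to be there: possibly all 0, possibly not.
- np.zeros is what creates real zeros.
- np.random.rand is what creates real random values (uniform samples in [0, 1)).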
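A short sketch of the three constructors side by side. The (2, 3) shape is arbitrary, and the values of the `np.empty` array are deliberately not printed as an expected result because they are unspecified.

```python
import numpy as np

uninit = np.empty((2, 3))       # allocated only; contents are arbitrary
zeros = np.zeros((2, 3))        # every element is 0.0
samples = np.random.rand(2, 3)  # uniform random samples in [0, 1)

print(uninit.shape)  # (2, 3), but never rely on the values inside
print(zeros)
# [[0. 0. 0.]
#  [0. 0. 0.]]
print(samples.min() >= 0.0, samples.max() < 1.0)  # True True
```

`np.empty` is mainly useful when every element will be overwritten right away, since it skips the cost of filling the array.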

NumPy module structure

https://numpy.org/doc/stable/reference/module_structure.html

[Recommended] Main namespaces (Regular/recommended user-facing namespaces for general use)
- numpy
- numpy.exceptions
- numpy.fft
- numpy.linalg
- numpy.polynomial
- numpy.random
- numpy.strings
- numpy.testing
- numpy.typing
[Recommended] Special-purpose namespaces
- numpy.ctypeslib - interacting with NumPy objects with ctypes
- numpy.dtypes - dtype classes (typically not used directly by end users)
- numpy.emath - mathematical functions with automatic domain
- numpy.lib - utilities & functionality which do not fit the main namespace
- numpy.rec - record arrays (largely superseded by dataframe libraries)
- numpy.version - small module with more detailed version info
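As an illustration of what "automatic domain" means for numpy.emath, the sketch below compares `np.sqrt` with `np.emath.sqrt` on a negative input. The input values are chosen only for demonstration.

```python
import numpy as np

# The main-namespace sqrt stays in the real domain: sqrt(-1) gives nan
# (and a RuntimeWarning, silenced here).
with np.errstate(invalid="ignore"):
    real_result = np.sqrt(-1.0)
print(real_result)  # nan

# numpy.emath switches to the complex domain when the input requires it.
complex_result = np.emath.sqrt(-1.0)
print(complex_result)  # 1j

# For inputs that are already inside the real domain, the result stays real.
print(np.emath.sqrt(4.0))  # 2.0
```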
[Not recommended] Legacy namespaces (Prefer not to use these namespaces for new code. There are better alternatives and/or this code is deprecated or isn't reliable.)
- numpy.char - legacy string functionality, only for fixed-width strings
- numpy.distutils (deprecated) - build system support
- numpy.f2py - Fortran binding generation (usually used from the command line only)
- numpy.ma - masked arrays (not very reliable, needs an overhaul)
- numpy.matlib (pending deprecation) - functions supporting matrix instances