A Tour of Cool Python Libraries: The Third-Party Library Pandas (124)

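I. Usage in Detail

551. pandas.DataFrame.notna method

551-1. Syntax

551-2. Parameters

551-3. Purpose

551-4. Return value

551-5. Notes

551-6. Usage

551-6-1. Data preparation

551-6-2. Code example

551-6-3. Output

552. pandas.DataFrame.notnull method

552-1. Syntax

552-2. Parameters

552-3. Purpose

552-4. Return value

552-5. Notes

552-6. Usage

552-6-1. Data preparation

552-6-2. Code example

552-6-3. Output

553. pandas.DataFrame.pad method

553-1. Syntax

553-2. Parameters

553-3. Purpose

553-4. Return value

553-5. Notes

553-6. Usage

553-6-1. Data preparation

553-6-2. Code example

553-6-3. Output

554. pandas.DataFrame.replace method

554-1. Syntax

554-2. Parameters

554-3. Purpose

554-4. Return value

554-5. Notes

554-6. Usage

554-6-1. Data preparation

554-6-2. Code example

554-6-3. Output

555. pandas.DataFrame.droplevel method

555-1. Syntax

555-2. Parameters

555-3. Purpose

555-4. Return value

555-5. Notes

555-6. Usage

555-6-1. Data preparation

555-6-2. Code example

555-6-3. Output

II. Recommended Reading

1. Python Foundations Tour

2. Python Functions Tour

3. Python Algorithms Tour

4. Python Magic Tour

5. Blog home page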

I. Usage in Detail

551. pandas.DataFrame.notna method

551-1. Syntax

# 551. pandas.DataFrame.notna method
pandas.DataFrame.notna()
Detect existing (non-missing) values.

Return a boolean same-sized object indicating if the values are not NA. Non-missing values get mapped to True. Characters such as empty strings '' or numpy.inf are not considered NA values (unless you set pandas.options.mode.use_inf_as_na = True). NA values, such as None or numpy.NaN, get mapped to False values.

Returns:
DataFrame
Mask of bool values for each element in DataFrame that indicates whether an element is not an NA value.

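551-2. Parameters

None.

551-3. Purpose

This method checks the DataFrame element by element and reports whether each element is a non-missing value.

551-4. Return value

Returns a boolean DataFrame with the same shape as the original. Each element is True if the corresponding element of the original is not missing, and False if it is missing.

551-5. Notes

None.

551-6. Usage

551-6-1. Data preparation

None.

551-6-2. Code example

# 551. pandas.DataFrame.notna method
import pandas as pd
import numpy as np
# Create a sample DataFrame
data = {
    'A': [1, 2, np.nan],
    'B': [np.nan, 3, 4],
    'C': [5, np.nan, 6]
}
df = pd.DataFrame(data)
# Call notna() to mark the non-missing values
not_na_df = df.notna()
print(not_na_df)

551-6-3. Output

# 551. pandas.DataFrame.notna method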
#        A      B      C
# 0   True  False   True
# 1   True   True  False
# 2  False   True   True
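
The syntax description above notes that empty strings and numpy.inf are not treated as NA, while None and NaN are. A minimal sketch of that behaviour under pandas' default options follows; the column names and sample values are made up for illustration and are not part of the original example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data mixing real gaps with "empty-looking" values.
df = pd.DataFrame({
    "text": ["a", "", None],        # '' is a value, None is missing
    "num": [1.0, np.inf, np.nan],   # inf is a value, NaN is missing
})

mask = df.notna()
print(mask)

# Summing the boolean mask counts the non-missing values per column.
counts = mask.sum()
print(counts)

# Keep only the rows in which every column holds a value.
complete_rows = df[mask.all(axis=1)]
print(complete_rows)
```

Here only the last row is dropped, because it is the only one that contains None or NaN.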

552. pandas.DataFrame.notnull method

552-1. Syntax

# 552. pandas.DataFrame.notnull method
pandas.DataFrame.notnull()
DataFrame.notnull is an alias for DataFrame.notna.

Detect existing (non-missing) values.

Return a boolean same-sized object indicating if the values are not NA. Non-missing values get mapped to True. Characters such as empty strings '' or numpy.inf are not considered NA values (unless you set pandas.options.mode.use_inf_as_na = True). NA values, such as None or numpy.NaN, get mapped to False values.

Returns:
DataFrame
Mask of bool values for each element in DataFrame that indicates whether an element is not an NA value.

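552-2. Parameters

None.

552-3. Purpose

This method checks the DataFrame element by element and reports whether each element is a non-missing value.

552-4. Return value

Returns a boolean DataFrame with the same shape as the original. For each position:

True: the value at that position is valid (not missing).
False: the value at that position is missing (for example NaN).

552-5. Notes

None.

552-6. Usage

552-6-1. Data preparation

None.

552-6-2. Code example

# 552. pandas.DataFrame.notnull method
import pandas as pd
import numpy as np
# Create a sample DataFrame
data = {
    'A': [1, 2, np.nan],
    'B': [np.nan, 3, 4],
    'C': [5, np.nan, 6]
}
df = pd.DataFrame(data)
# Call notnull() to mark the non-missing values
not_null_df = df.notnull()
print(not_null_df)

552-6-3. Output

# 552. pandas.DataFrame.notnull method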
#        A      B      C
# 0   True  False   True
# 1   True   True  False
# 2  False   True   True
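
Since notnull is documented as an alias for notna, the two calls should always agree. The short check below is only a sketch on a made-up DataFrame; it compares the two results and also confirms that both are the element-wise inverse of isna().

```python
import numpy as np
import pandas as pd

# Small illustrative frame with a gap in each column.
df = pd.DataFrame({"A": [1, 2, np.nan], "B": [np.nan, 3, 4]})

# notnull() and notna() produce identical boolean masks.
same_as_notna = df.notnull().equals(df.notna())

# Both are the logical negation of isna().
inverse_of_isna = df.notnull().equals(~df.isna())

print(same_as_notna, inverse_of_isna)
```

In new code, notna is the more common spelling, but either name can be used.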

553. pandas.DataFrame.pad method

553-1. Syntax

# 553. pandas.DataFrame.pad method
pandas.DataFrame.pad(*, axis=None, inplace=False, limit=None, downcast=_NoDefault.no_default)
Fill NA/NaN values by propagating the last valid observation to next valid.

Deprecated since version 2.0: Series/DataFrame.pad is deprecated. Use Series/DataFrame.ffill instead.

Returns:
Series/DataFrame or None
Object with missing values filled or None if inplace=True.

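553-2. Parameters

553-2-1. axis (optional, default None): {0 or 'index', 1 or 'columns'}, the axis along which values are propagated:

0 or 'index': propagate values down the rows, so each column is filled from the entries above.
1 or 'columns': propagate values across the columns, so each row is filled from the entries to its left.
None: behaves like 0 for a DataFrame.

553-2-2. inplace (optional, default False): bool. If True, the original DataFrame is filled directly and no new object is returned; if False, a new filled DataFrame is returned and the original is left unchanged.

553-2-3. limit (optional, default None): int, the maximum number of consecutive missing values to fill along the chosen axis. If it is not set, every missing value that has a valid value before it is filled.

553-2-4. downcast (optional): a dict mapping items to the dtype they should be downcast to where possible (for example float to int), or the string 'infer'.

553-3. Purpose

Fills missing values in a DataFrame by forward filling: the last valid observation is carried forward to replace the missing values that follow it. Missing values with no valid value before them stay missing.

553-4. Return value

Returns a new DataFrame with the missing values filled if inplace=False, or modifies the original DataFrame and returns None if inplace=True. The result keeps the index and column labels of the original DataFrame.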
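
Because the syntax block above marks pad as deprecated in favour of ffill, the sketch below uses ffill to illustrate the two modes of operation: the default call that leaves the original untouched, and inplace=True that fills the original. The sample data is made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, np.nan, 3], "B": [4, 5, np.nan]})

# Default: a new DataFrame is returned and df keeps its gaps.
filled = df.ffill()
original_still_has_gaps = bool(df.isna().any().any())
print(filled)
print(original_still_has_gaps)

# inplace=True: df itself is filled.
df.ffill(inplace=True)
matches_filled_copy = df.equals(filled)
print(matches_filled_copy)
```

The non-inplace form is usually easier to reason about, since the unfilled data stays available for comparison.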

553-5. Notes

None.

553-6. Usage

553-6-1. Data preparation

None.

553-6-2. Code example

# 553. pandas.DataFrame.pad method
import pandas as pd
import numpy as np
# Create a sample DataFrame
data = {
    'A': [1, np.nan, 3],
    'B': [4, 5, np.nan],
    'C': [np.nan, np.nan, 6]
}
df = pd.DataFrame(data)
# Forward-fill missing values with pad (deprecated in favour of ffill, see 553-1)
filled_df = df.pad()
print(filled_df)

553-6-3. Output

# 553. pandas.DataFrame.pad method
#      A    B    C
# 0  1.0  4.0  NaN
# 1  1.0  5.0  NaN
# 2  3.0  5.0  6.0
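
The limit and axis parameters described in 553-2 are easier to see on a frame with longer gaps. The following is a minimal sketch using ffill (the documented replacement for pad) on made-up data; it is not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1.0, np.nan, np.nan, np.nan],
    "B": [np.nan, 2.0, np.nan, 4.0],
})

# limit=1: at most one consecutive NaN is filled after each valid value.
limited = df.ffill(limit=1)
print(limited)

# axis=1: values are carried from left to right within each row.
across = df.ffill(axis=1)
print(across)
```

With limit=1, column A keeps two NaN values at the bottom, and the leading NaN in column B is never filled because nothing precedes it. With axis=1, only row 0 changes, since it is the only row with a valid value to the left of a gap.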

554. pandas.DataFrame.replace method

554-1. Syntax

# 554. pandas.DataFrame.replace method
pandas.DataFrame.replace(to_replace=None, value=_NoDefault.no_default, *, inplace=False, limit=None, regex=False, method=_NoDefault.no_default)
Replace values given in to_replace with value.

Values of the Series/DataFrame are replaced with other values dynamically. This differs from updating with .loc or .iloc, which require you to specify a location to update with some value.

Parameters:
to_replace : str, regex, list, dict, Series, int, float, or None
How to find the values that will be replaced.

numeric, str or regex:

numeric: numeric values equal to to_replace will be replaced with value

str: string exactly matching to_replace will be replaced with value

regex: regexs matching to_replace will be replaced with value

list of str, regex, or numeric:

First, if to_replace and value are both lists, they must be the same length.

Second, if regex=True then all of the strings in both lists will be interpreted as regexs otherwise they will match directly. This doesn’t matter much for value since there are only a few possible substitution regexes you can use.

str, regex and numeric rules apply as above.

dict:

Dicts can be used to specify different replacement values for different existing values. For example, {'a': 'b', 'y': 'z'} replaces the value ‘a’ with ‘b’ and ‘y’ with ‘z’. To use a dict in this way, the optional value parameter should not be given.

For a DataFrame a dict can specify that different values should be replaced in different columns. For example, {'a': 1, 'b': 'z'} looks for the value 1 in column ‘a’ and the value ‘z’ in column ‘b’ and replaces these values with whatever is specified in value. The value parameter should not be None in this case. You can treat this as a special case of passing two lists except that you are specifying the column to search in.

For a DataFrame nested dictionaries, e.g., {'a': {'b': np.nan}}, are read as follows: look in column ‘a’ for the value ‘b’ and replace it with NaN. The optional value parameter should not be specified to use a nested dict in this way. You can nest regular expressions as well. Note that column names (the top-level dictionary keys in a nested dictionary) cannot be regular expressions.

None:

This means that the regex argument must be a string, compiled regular expression, or list, dict, ndarray or Series of such elements. If value is also None then this must be a nested dictionary or Series.

See the examples section for examples of each of these.

value : scalar, dict, list, str, regex, default None
Value to replace any values matching to_replace with. For a DataFrame a dict of values can be used to specify which value to use for each column (columns not in the dict will not be filled). Regular expressions, strings and lists or dicts of such objects are also allowed.

inplace : bool, default False
If True, performs operation inplace and returns None.

limit : int, default None
Maximum size gap to forward or backward fill.

Deprecated since version 2.1.0.

regex : bool or same types as to_replace, default False
Whether to interpret to_replace and/or value as regular expressions. Alternatively, this could be a regular expression or a list, dict, or array of regular expressions in which case to_replace must be None.

method : {‘pad’, ‘ffill’, ‘bfill’}
The method to use when for replacement, when to_replace is a scalar, list or tuple and value is None.

Deprecated since version 2.1.0.

Returns:
Series/DataFrame
Object after replacement.

Raises:
AssertionError
If regex is not a bool and to_replace is not None.

TypeError
If to_replace is not a scalar, array-like, dict, or None

If to_replace is a dict and value is not a list, dict, ndarray, or Series

If to_replace is None and regex is not compilable into a regular expression or is a list, dict, ndarray, or Series.

When replacing multiple bool or datetime64 objects and the arguments to to_replace does not match the type of the value being replaced

ValueError
If a list or an ndarray is passed to to_replace and value but they are not the same length.

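554-2. Parameters

554-2-1. to_replace (optional, default None): scalar, list-like, dict, or regular expression. The value(s) to be replaced: a single value, a list of values, a dict (keys are old values, values are new values), or a regular expression.

554-2-2. value (optional): scalar, list-like, or dict. The new value(s) that replace whatever to_replace matches. If to_replace is a dict that maps old values to new values, or a nested dict, value should not be given; if to_replace is a dict of the form {column: value to look for}, value supplies the replacement.

554-2-3. inplace (optional, default False): bool. If True, the replacement is applied directly to the original DataFrame and no new object is returned; if False, a new DataFrame with the replacements is returned and the original is left unchanged.

554-2-4. limit (optional, default None): int, the maximum size gap to forward or backward fill. It is not a cap on the number of replacements, and it is deprecated since version 2.1.0.

554-2-5. regex (optional, default False): bool, or the same types as to_replace. If True, to_replace (and/or value) is interpreted as a regular expression. A regular expression, or a list, dict, or array of regular expressions, can also be passed here directly, in which case to_replace must be None.

554-2-6. method (optional): {'pad', 'ffill', 'bfill'}, the fill method used when to_replace is a scalar, list, or tuple and value is None. Deprecated since version 2.1.0.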
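
The different shapes that to_replace and value can take are easier to compare side by side. The following is a minimal sketch on a made-up DataFrame; the variable names r1 to r4 are only labels for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [0, 1, 2],
    "B": ["foo", "bar", "baz"],
})

# Scalar -> scalar: every 0 becomes -1.
r1 = df.replace(0, -1)

# List -> list of the same length: pairs are matched by position (1 -> 10, 2 -> 20).
r2 = df.replace([1, 2], [10, 20])

# Nested dict: look only in column "B" for "foo" and replace it with "qux".
r3 = df.replace({"B": {"foo": "qux"}})

# Regular expression: three-letter strings starting with "ba" become "new".
r4 = df.replace(to_replace=r"^ba.$", value="new", regex=True)

for result in (r1, r2, r3, r4):
    print(result, end="\n\n")
```

Each call returns a new DataFrame, so df itself is unchanged after all four replacements.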

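554-3. Purpose

Replaces specified values in a DataFrame with new values. Several styles of replacement are supported, such as element-wise replacement and replacement by regular expression.

554-4. Return value

Returns a new DataFrame with the specified values replaced if inplace=False, or modifies the original DataFrame and returns None if inplace=True. The result keeps the index and column labels of the original DataFrame.

554-5. Notes

None.

554-6. Usage

554-6-1. Data preparation

None.

554-6-2. Code example

# 554. pandas.DataFrame.replace method
import pandas as pd
# Create a sample DataFrame
data = {
    'A': [1, 2, 3],
    'B': ['apple', 'banana', 'apple'],
    'C': [3.5, 4.5, 5.5]
}
df = pd.DataFrame(data)
# Replace values with a dict that maps old values to new values
df_replaced = df.replace({'apple': 'orange', 2: 20})
print(df_replaced)

554-6-3. Output

# 554. pandas.DataFrame.replace method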
#     A       B    C
# 0   1  orange  3.5
# 1  20  banana  4.5
# 2   3  orange  5.5

555. pandas.DataFrame.droplevel method

555-1. Syntax

# 555. pandas.DataFrame.droplevel method
pandas.DataFrame.droplevel(level, axis=0)
Return Series/DataFrame with requested index / column level(s) removed.

Parameters:
level : int, str, or list-like
If a string is given, must be the name of a level. If list-like, elements must be names or positional indexes of levels.

axis : {0 or ‘index’, 1 or ‘columns’}, default 0
Axis along which the level(s) is removed:

0 or ‘index’: remove level(s) from the row index.

1 or ‘columns’: remove level(s) from the column index.

For Series this parameter is unused and defaults to 0.

Returns:
Series/DataFrame
Series/DataFrame with requested index / column level(s) removed.

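555-2. Parameters

555-2-1. level (required): int, str, or list-like. The level(s) to drop, given as one of:

The integer position of a level (for example, 0 is the first level).
The name of a level (if the index levels are named).
A list of the above, to drop several levels at once.

555-2-2. axis (optional, default 0): {0 or 'index', 1 or 'columns'}, the axis whose level(s) are dropped:

0 or 'index': drop the level(s) from the row index.
1 or 'columns': drop the level(s) from the column index.

555-3. Purpose

Removes the given level(s) from a MultiIndex, which simplifies the index structure of the DataFrame. The remaining levels are kept and the data is unchanged; only the structure of the index changes.

555-4. Return value

Returns a new DataFrame with the specified level(s) removed from the row index or column index. The original DataFrame is not modified.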
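
The level parameter accepts a position, a name, or a list. The sketch below shows the positional and list forms on a made-up three-level index (the level names letter, number, and code are assumptions for this illustration) and confirms that the original frame keeps all its levels.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("A", "one", "x"), ("A", "two", "y"), ("B", "one", "x")],
    names=["letter", "number", "code"],
)
df = pd.DataFrame({"value": [1, 2, 3]}, index=index)

# Drop by integer position: level 0 is 'letter'.
by_position = df.droplevel(0)
print(by_position)

# Drop several levels at once by name; only 'number' remains.
by_list = df.droplevel(["letter", "code"])
print(by_list)

# The original DataFrame still has all three levels.
print(df.index.nlevels)
```

When only one level is left, the result carries a regular Index rather than a MultiIndex.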

555-5. Notes

None.

555-6. Usage

555-6-1. Data preparation

None.

555-6-2. Code example

# 555. pandas.DataFrame.droplevel method
import pandas as pd
# Create a DataFrame with a MultiIndex
arrays = [
    ['A', 'A', 'B', 'B'],
    ['one', 'two', 'one', 'two']
]
index = pd.MultiIndex.from_arrays(arrays, names=('letter', 'number'))
data = pd.DataFrame({'value': [1, 2, 3, 4]}, index=index)
# Print the original DataFrame
print("Original DataFrame:")
print(data)
# Drop one of the levels with droplevel
data_dropped = data.droplevel(level='number', axis=0)
# Print the resulting DataFrame
print("\nDataFrame after dropping the level:")
print(data_dropped)

555-6-3. Output

# 555. pandas.DataFrame.droplevel method
# Original DataFrame:
#                value
# letter number       
# A      one         1
#        two         2
# B      one         3
#        two         4
# 
# DataFrame after dropping the level:
#         value
# letter       
# A           1
# A           2
# B           3
# B           4
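
The example above drops a level from the row index. For completeness, here is a minimal sketch of the axis=1 case, where the level is removed from a column MultiIndex instead; the level names group and field and the sample values are made up for illustration.

```python
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [("metrics", "height"), ("metrics", "weight")],
    names=["group", "field"],
)
df = pd.DataFrame([[170, 60], [180, 75]], columns=columns)
print(df)

# axis=1 targets the column index; the outer 'group' level is removed.
flat = df.droplevel("group", axis=1)
print(flat)
```

This is a common way to flatten columns after an aggregation that leaves a redundant outer level.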