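Reproducing Paper C155, an Officially Showcased Paper for Problem C of the 2022 CUMCM (China Undergraduate Mathematical Contest in Modeling)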

news2024/9/17 8:37:12

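Reproducing Paper C155 for Problem C of the 2022 CUMCM

  • 1. Comparison of results
  • 2. Reproduction code for Question 1, Part 2
    • 2.1 Merging the sheets
    • 2.2 Normality test of the data
      • 2.2.1 Normality test plots
    • 2.3 Normality not satisfied: applying the centered log-ratio (CLR) transformation
      • 2.3.1 Key step: replacing -inf with 0
      • 2.3.2 CLR transformation plots
    • 2.4 Descriptive statistics
    • 2.5 Box plots

See the full reproduction on GitHub.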

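1. Comparison of results

Box plot comparison
CUMCM paper C155:
[Figure: box plot from paper C155]
Reproduction:
[Figure: box plot from this reproduction]

2. Reproduction code for Question 1, Part 2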

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
plt.rcParams['font.sans-serif'] = ['SimHei']

# Load all sheets of the Excel file
xl_file = pd.ExcelFile("E:\\数学建模国赛\\2022数学建模赛题\\C题\\附件.xlsx")

# Load individual sheets with correct names
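sheet1 = xl_file.parse('表单1')  # Sheet 1: basic information on the glass artifacts
sheet2 = xl_file.parse('表单2')  # Sheet 2: chemical composition (%) of the classified artifacts
sheet3 = xl_file.parse('表单3')  # Sheet 3: chemical composition (%) of the unclassified artifacts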

# Show the first few rows of each sheet
sheet1.head(), sheet2.head(), sheet3.head()


(   文物编号 纹饰  类型  颜色 表面风化
 0     1  C  高钾  蓝绿  无风化
 1     2  A  铅钡  浅蓝   风化
 2     3  A  高钾  蓝绿  无风化
 3     4  A  高钾  蓝绿  无风化
 4     5  A  高钾  蓝绿  无风化,
    文物采样点  二氧化硅(SiO2)  氧化钠(Na2O)  氧化钾(K2O)  氧化钙(CaO)  氧化镁(MgO)  氧化铝(Al2O3)  \
 0     01       69.33        NaN      9.99      6.32      0.87        3.93   
 1     02       36.28        NaN      1.05      2.34      1.18        5.73   
 2  03部位1       87.05        NaN      5.19      2.01       NaN        4.06   
 3  03部位2       61.71        NaN     12.37      5.87      1.11        5.50   
 4     04       65.88        NaN      9.67      7.12      1.56        6.44   
 
    氧化铁(Fe2O3)  氧化铜(CuO)  氧化铅(PbO)  氧化钡(BaO)  五氧化二磷(P2O5)  氧化锶(SrO)  氧化锡(SnO2)  \
 0        1.74      3.87       NaN       NaN         1.17       NaN        NaN   
 1        1.86      0.26     47.43       NaN         3.57      0.19        NaN   
 2         NaN      0.78      0.25       NaN         0.66       NaN        NaN   
 3        2.16      5.09      1.41      2.86         0.70      0.10        NaN   
 4        2.06      2.18       NaN       NaN         0.79       NaN        NaN   
 
    二氧化硫(SO2)  
 0       0.39  
 1        NaN  
 2        NaN  
 3        NaN  
 4       0.36  ,
   文物编号 表面风化  二氧化硅(SiO2)  氧化钠(Na2O)  氧化钾(K2O)  氧化钙(CaO)  氧化镁(MgO)  氧化铝(Al2O3)  \
 0   A1  无风化       78.45        NaN       NaN      6.08      1.86        7.23   
 1   A2   风化       37.75        NaN       NaN      7.63       NaN        2.33   
 2   A3  无风化       31.95        NaN      1.36      7.19      0.81        2.93   
 3   A4  无风化       35.47        NaN      0.79      2.89      1.05        7.07   
 4   A5   风化       64.29        1.2      0.37      1.64      2.34       12.75   
 
    氧化铁(Fe2O3)  氧化铜(CuO)  氧化铅(PbO)  氧化钡(BaO)  五氧化二磷(P2O5)  氧化锶(SrO)  氧化锡(SnO2)  \
 0        2.15      2.11       NaN       NaN         1.06      0.03        NaN   
 1         NaN       NaN     34.30       NaN        14.27       NaN        NaN   
 2        7.06      0.21     39.58      4.69         2.68      0.52        NaN   
 3        6.45      0.96     24.28      8.31         8.45      0.28        NaN   
 4        0.81      0.94     12.23      2.16         0.19      0.21       0.49   
 
    二氧化硫(SO2)  
 0       0.51  
 1        NaN  
 2        NaN  
 3        NaN  
 4        NaN  )
sheet2
    文物采样点        二氧化硅(SiO2)  氧化钠(Na2O)  氧化钾(K2O)  氧化钙(CaO)  氧化镁(MgO)  氧化铝(Al2O3)  氧化铁(Fe2O3)  氧化铜(CuO)  氧化铅(PbO)  氧化钡(BaO)  五氧化二磷(P2O5)  氧化锶(SrO)  氧化锡(SnO2)  二氧化硫(SO2)
 0  01                  69.33        NaN      9.99      6.32      0.87        3.93        1.74      3.87       NaN       NaN         1.17       NaN        NaN       0.39
 1  02                  36.28        NaN      1.05      2.34      1.18        5.73        1.86      0.26     47.43       NaN         3.57      0.19        NaN        NaN
 2  03部位1             87.05        NaN      5.19      2.01       NaN        4.06         NaN      0.78      0.25       NaN         0.66       NaN        NaN        NaN
 3  03部位2             61.71        NaN     12.37      5.87      1.11        5.50        2.16      5.09      1.41      2.86         0.70      0.10        NaN        NaN
 4  04                  65.88        NaN      9.67      7.12      1.56        6.44        2.06      2.18       NaN       NaN         0.79       NaN        NaN       0.36
..  ...
64  54严重风化点        17.11        NaN       NaN       NaN      1.11        3.65         NaN      1.34     58.46       NaN        14.13      1.12        NaN        NaN
65  55                  49.01       2.71       NaN      1.13       NaN        1.45         NaN      0.86     32.92      7.95         0.35       NaN        NaN        NaN
66  56                  29.15        NaN       NaN      1.21       NaN        1.85         NaN      0.79     41.25     15.45         2.54       NaN        NaN        NaN
67  57                  25.42        NaN       NaN      1.31       NaN        2.18         NaN      1.16     45.10     17.30          NaN       NaN        NaN        NaN
68  58                  30.39        NaN      0.34      3.49      0.79        3.52        0.86      3.13     39.35      7.66         8.99      0.24        NaN        NaN

69 rows × 15 columns

component_cols = ['二氧化硅(SiO2)', '氧化钠(Na2O)', '氧化钾(K2O)', '氧化钙(CaO)', '氧化镁(MgO)', 
                  '氧化铝(Al2O3)', '氧化铁(Fe2O3)', '氧化铜(CuO)', '氧化铅(PbO)', '氧化钡(BaO)', 
                  '五氧化二磷(P2O5)', '氧化锶(SrO)', '氧化锡(SnO2)', '二氧化硫(SO2)']

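# Total of all component percentages per sample (sum skips NaN by default)
sheet2['成分总和'] = sheet2[component_cols].sum(axis=1)
sheet2['成分总和']

# Keep only the samples whose components sum to between 85 and 105
sheet2 = sheet2[(sheet2['成分总和'] >= 85) & (sheet2['成分总和'] <= 105)]
sheet2
# Treat undetected components (NaN) as 0
sheet2 = sheet2.fillna(0)
# Normalize the chemical components to sum up to 100%
sheet2[component_cols] = sheet2[component_cols].div(sheet2[component_cols].sum(axis=1), axis=0) * 100

# Recompute the total to confirm that every row now sums to 100
sheet2['成分总和'] = sheet2[component_cols].sum(axis=1)
sheet2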
    文物采样点     二氧化硅(SiO2)  氧化钠(Na2O)  氧化钾(K2O)  氧化钙(CaO)  氧化镁(MgO)  氧化铝(Al2O3)  氧化铁(Fe2O3)  氧化铜(CuO)  氧化铅(PbO)  氧化钡(BaO)  五氧化二磷(P2O5)  氧化锶(SrO)  氧化锡(SnO2)  二氧化硫(SO2)  成分总和
 0  01             71.027559   0.000000  10.234607   6.474746   0.891302   4.026227   1.782604   3.964758   0.000000   0.000000   1.198648   0.000000        0.0   0.399549      100.0
 1  02             36.319952   0.000000   1.051156   2.342577   1.181299   5.736310   1.862048   0.260286  47.482230   0.000000   3.573931   0.190209        0.0   0.000000      100.0
 2  03部位1        87.050000   0.000000   5.190000   2.010000   0.000000   4.060000   0.000000   0.780000   0.250000   0.000000   0.660000   0.000000        0.0   0.000000      100.0
 3  03部位2        62.408981   0.000000  12.510113   5.936489   1.122573   5.562298   2.184466   5.147654   1.425971   2.892395   0.707929   0.101133        0.0   0.000000      100.0
 4  04             68.582136   0.000000  10.066625   7.412034   1.623985   6.704143   2.144493   2.269415   0.000000   0.000000   0.822403   0.000000        0.0   0.374766      100.0
..  ...
64  54严重风化点   17.653735   0.000000   0.000000   0.000000   1.145274   3.765993   0.000000   1.382584  60.317788   0.000000  14.579034   1.155592        0.0   0.000000      100.0
65  55             50.850799   2.811787   0.000000   1.172442   0.000000   1.504462   0.000000   0.892301  34.156464   8.248599   0.363146   0.000000        0.0   0.000000      100.0
66  56             31.602342   0.000000   0.000000   1.311795   0.000000   2.005637   0.000000   0.856461  44.720295  16.749783   2.753686   0.000000        0.0   0.000000      100.0
67  57             27.489997   0.000000   0.000000   1.416676   0.000000   2.357521   0.000000   1.254461  48.772575  18.708770   0.000000   0.000000        0.0   0.000000      100.0
68  58             30.771567   0.000000   0.344269   3.533819   0.799919   3.564196   0.870798   3.169299  39.844066   7.756177   9.102876   0.243013        0.0   0.000000      100.0

67 rows × 16 columns
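
The filter-then-normalize step above can be condensed into a single helper. What follows is only a minimal sketch on a made-up three-row table: the function name `close_composition`, the `low`/`high` parameters and the toy values are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Toy composition table; the values are invented for illustration
toy = pd.DataFrame(
    {
        "SiO2": [70.0, 40.0, 30.0],
        "K2O": [20.0, np.nan, 5.0],
        "PbO": [np.nan, 50.0, 10.0],
    },
    index=["s1", "s2", "s3"],
)
cols = ["SiO2", "K2O", "PbO"]


def close_composition(df, cols, low=85.0, high=105.0):
    """Keep rows whose parts sum to [low, high], set NaN to 0, rescale to 100."""
    total = df[cols].sum(axis=1)                  # NaN is skipped by default
    valid = df.loc[total.between(low, high)].copy()
    valid[cols] = valid[cols].fillna(0)
    valid[cols] = valid[cols].div(valid[cols].sum(axis=1), axis=0) * 100
    return valid


closed = close_composition(toy, cols)
print(closed.round(2))
```

In this toy table the third row sums to 45 and is dropped, while the two remaining rows are rescaled so that each sums to 100. Working on an explicit `.copy()` of the filtered frame also avoids pandas' chained-assignment warnings.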

sheet2_copy = sheet2.copy()
sheet2=sheet2_copy

# Define the new column names
new_component_cols = ['SiO2', 'Na2O', 'K2O', 'CaO', 'MgO', 
                      'Al2O3', 'Fe2O3', 'CuO', 'PbO', 'BaO', 
                      'P2O5', 'SrO', 'SnO2', 'SO2']

# Create a mapping from old column names to new column names
rename_dict = dict(zip(component_cols, new_component_cols))

# Rename the columns
sheet2.rename(columns=rename_dict, inplace=True)

# Check the updated column names
sheet2.columns
Index(['文物采样点', 'SiO2', 'Na2O', 'K2O', 'CaO', 'MgO', 'Al2O3', 'Fe2O3', 'CuO',
       'PbO', 'BaO', 'P2O5', 'SrO', 'SnO2', 'SO2', '成分总和'],
      dtype='object')

2.1 Merging the sheets

# Merge sheet1 and sheet2 on 文物编号 (artifact number)
# First, we need to extract the 文物编号 from the 文物采样点 in sheet2
# We assume that the 文物编号 is the numeric part before any non-numeric character in the 文物采样点

# Import regular expression library
import re

# Define a function to extract 文物编号 from 文物采样点
def extract_number(s):
    match = re.match(r"(\d+)", s)
    return int(match.group()) if match else None

# Apply the function to the 文物采样点 column
sheet2['文物编号'] = sheet2['文物采样点'].apply(extract_number)

# Merge sheet1 and sheet2
data = pd.merge(sheet1, sheet2, on='文物编号')
# NaN values in the component columns were already replaced with 0 above

data
 
    文物编号  纹饰  类型  颜色  表面风化  文物采样点            SiO2       Na2O        K2O        CaO  ...      Al2O3      Fe2O3        CuO        PbO        BaO       P2O5        SrO       SnO2        SO2   成分总和
 0     1     C   高钾  蓝绿  无风化   01             71.027559   0.000000  10.234607   6.474746  ...   4.026227   1.782604   3.964758   0.000000   0.000000   1.198648   0.000000        0.0   0.399549      100.0
 1     2     A   铅钡  浅蓝  风化     02             36.319952   0.000000   1.051156   2.342577  ...   5.736310   1.862048   0.260286  47.482230   0.000000   3.573931   0.190209        0.0   0.000000      100.0
 2     3     A   高钾  蓝绿  无风化   03部位1        87.050000   0.000000   5.190000   2.010000  ...   4.060000   0.000000   0.780000   0.250000   0.000000   0.660000   0.000000        0.0   0.000000      100.0
 3     3     A   高钾  蓝绿  无风化   03部位2        62.408981   0.000000  12.510113   5.936489  ...   5.562298   2.184466   5.147654   1.425971   2.892395   0.707929   0.101133        0.0   0.000000      100.0
 4     4     A   高钾  蓝绿  无风化   04             68.582136   0.000000  10.066625   7.412034  ...   6.704143   2.144493   2.269415   0.000000   0.000000   0.822403   0.000000        0.0   0.374766      100.0
..   ...
62    54     C   铅钡  浅蓝  风化     54严重风化点   17.653735   0.000000   0.000000   0.000000  ...   3.765993   0.000000   1.382584  60.317788   0.000000  14.579034   1.155592        0.0   0.000000      100.0
63    55     C   铅钡  绿    无风化   55             50.850799   2.811787   0.000000   1.172442  ...   1.504462   0.000000   0.892301  34.156464   8.248599   0.363146   0.000000        0.0   0.000000      100.0
64    56     C   铅钡  蓝绿  风化     56             31.602342   0.000000   0.000000   1.311795  ...   2.005637   0.000000   0.856461  44.720295  16.749783   2.753686   0.000000        0.0   0.000000      100.0
65    57     C   铅钡  蓝绿  风化     57             27.489997   0.000000   0.000000   1.416676  ...   2.357521   0.000000   1.254461  48.772575  18.708770   0.000000   0.000000        0.0   0.000000      100.0
66    58     C   铅钡  NaN   风化     58             30.771567   0.000000   0.344269   3.533819  ...   3.564196   0.870798   3.169299  39.844066   7.756177   9.102876   0.243013        0.0   0.000000      100.0

67 rows × 21 columns
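
The merge relies on every sampling-point label starting with its artifact number. As a hedged sketch of one way to make that assumption checkable, the toy example below derives the key with `Series.str.extract` and lets `merge` itself report problems. The frames `info` and `samples` are small stand-ins built from values shown above; the `how="left"`, `validate` and `indicator` options are additions that the original notebook does not use.

```python
import pandas as pd

# Toy stand-ins for sheet1 (one row per artifact) and sheet2 (one row per sampling point)
info = pd.DataFrame({"文物编号": [3, 54], "类型": ["高钾", "铅钡"]})
samples = pd.DataFrame(
    {"文物采样点": ["03部位1", "03部位2", "54严重风化点"], "SiO2": [87.05, 61.71, 17.11]}
)

# Leading digits of the sampling-point label give the artifact number
samples["文物编号"] = pd.to_numeric(
    samples["文物采样点"].str.extract(r"^(\d+)", expand=False)
)

merged = samples.merge(
    info,
    on="文物编号",
    how="left",               # keep every sampling point, matched or not
    validate="many_to_one",   # raise if an artifact number repeats in `info`
    indicator=True,           # adds a `_merge` column: 'both' or 'left_only'
)
print(merged)
```

Any row whose `_merge` value is `left_only` would be a sampling point with no matching artifact, which the inner join in the notebook would drop silently.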

data.drop(['颜色','纹饰','文物编号','成分总和'],axis=1,inplace=True)
data
    类型  表面风化  文物采样点            SiO2       Na2O        K2O        CaO        MgO      Al2O3      Fe2O3        CuO        PbO        BaO       P2O5        SrO       SnO2        SO2
 0  高钾  无风化   01             71.027559   0.000000  10.234607   6.474746   0.891302   4.026227   1.782604   3.964758   0.000000   0.000000   1.198648   0.000000        0.0   0.399549
 1  铅钡  风化     02             36.319952   0.000000   1.051156   2.342577   1.181299   5.736310   1.862048   0.260286  47.482230   0.000000   3.573931   0.190209        0.0   0.000000
 2  高钾  无风化   03部位1        87.050000   0.000000   5.190000   2.010000   0.000000   4.060000   0.000000   0.780000   0.250000   0.000000   0.660000   0.000000        0.0   0.000000
 3  高钾  无风化   03部位2        62.408981   0.000000  12.510113   5.936489   1.122573   5.562298   2.184466   5.147654   1.425971   2.892395   0.707929   0.101133        0.0   0.000000
 4  高钾  无风化   04             68.582136   0.000000  10.066625   7.412034   1.623985   6.704143   2.144493   2.269415   0.000000   0.000000   0.822403   0.000000        0.0   0.374766
..  ...
62  铅钡  风化     54严重风化点   17.653735   0.000000   0.000000   0.000000   1.145274   3.765993   0.000000   1.382584  60.317788   0.000000  14.579034   1.155592        0.0   0.000000
63  铅钡  无风化   55             50.850799   2.811787   0.000000   1.172442   0.000000   1.504462   0.000000   0.892301  34.156464   8.248599   0.363146   0.000000        0.0   0.000000
64  铅钡  风化     56             31.602342   0.000000   0.000000   1.311795   0.000000   2.005637   0.000000   0.856461  44.720295  16.749783   2.753686   0.000000        0.0   0.000000
65  铅钡  风化     57             27.489997   0.000000   0.000000   1.416676   0.000000   2.357521   0.000000   1.254461  48.772575  18.708770   0.000000   0.000000        0.0   0.000000
66  铅钡  风化     58             30.771567   0.000000   0.344269   3.533819   0.799919   3.564196   0.870798   3.169299  39.844066   7.756177   9.102876   0.243013        0.0   0.000000

67 rows × 17 columns

data.shape
(67, 17)
#data.to_excel('E:\\数学建模国赛\\2022数学建模赛题\\C题\\一二表单合并数据.xlsx', index=True)

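2.2 Normality test of the data

"""
For some statistical analyses, such as regression, normality of the data is a key assumption.
Whether a transformation is needed, however, depends on the data themselves and on the goal of the analysis.
Let us first look at the data.
These are chemical composition data, and the analysis shows that their distributions are not exactly normal,
so the analysis is carried out after a centered log-ratio (CLR) transformation. This helps the data meet the assumptions of the statistical analysis and handles the properties of compositional data better.
"""
# Normality check: look at the distribution of each chemical component.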
import matplotlib.pyplot as plt

# Select only the columns that are numeric and not categorical
numeric_cols = data.select_dtypes(include='number').columns

2.2.1 Normality test plots

# Plot histograms for each numeric column
fig, axs = plt.subplots(len(numeric_cols), figsize=(10, len(numeric_cols)*3))

for i, col in enumerate(numeric_cols):
    axs[i].hist(data[col].dropna(), bins=30, color='skyblue', edgecolor='black', alpha=0.7)
    axs[i].set_title(f'Histogram of {col}')

plt.tight_layout()
plt.show()

[Figure: histogram of each numeric column (output_12_0.png)]

data_raw=data.copy()

"""
Normality test: we use the Shapiro-Wilk test to check the normality of each chemical component.
It is a commonly used normality test whose null hypothesis is that the data come from a normal distribution.
If the p-value is below 0.05, we reject the null hypothesis and conclude that the data are not normally distributed.
"""
from scipy.stats import shapiro, levene

# Collect one result row (a dict) per variable
rows = []

# Loop over each numeric column
for col in numeric_cols:
    # Initialize a dict to store the results for this variable
    col_results = {'Variable': col}
    
    # Normality test
    # Drop NA values before performing the test
    _, p_normal = shapiro(data[col].dropna())
    col_results['Normality p-value'] = p_normal
    col_results['Normal'] = p_normal > 0.05
    
    # Variance equality test (only if the data is normal)
    if col_results['Normal']:
        _, p_equal_var = levene(data.loc[data['表面风化'] == '无风化', col].dropna(), 
                                data.loc[data['表面风化'] == '风化', col].dropna())
        col_results['Equal var p-value'] = p_equal_var
        col_results['Equal var'] = p_equal_var > 0.05
    
    # Store the results for this variable
    rows.append(col_results)

# DataFrame.append was deprecated in pandas 1.4 and removed in 2.0,
# so the frame is built from the list of dicts instead
test_results = pd.DataFrame(rows)

# Now, the test_results dataframe contains the p-values for normality and equal variances
# for each numeric variable, without any transformation applied to the data.

test_results
   Variable  Normality p-value  Normal  Equal var p-value  Equal var
0      SiO2       5.434923e-02    True           0.009129      False
1      Na2O       5.631047e-13   False                NaN        NaN
2       K2O       2.218287e-13   False                NaN        NaN
3       CaO       8.905178e-06   False                NaN        NaN
4       MgO       1.066307e-05   False                NaN        NaN
5     Al2O3       1.085733e-06   False                NaN        NaN
6     Fe2O3       1.809425e-09   False                NaN        NaN
7       CuO       3.633815e-09   False                NaN        NaN
8       PbO       7.531955e-04   False                NaN        NaN
9       BaO       7.773099e-08   False                NaN        NaN
10     P2O5       4.346846e-09   False                NaN        NaN
11      SrO       6.648307e-06   False                NaN        NaN
12     SnO2       8.658932e-17   False                NaN        NaN
13      SO2       5.878219e-17   False                NaN        NaN
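
To get a feel for how the Shapiro-Wilk p-value behaves at this sample size, here is a minimal sketch on synthetic data. The two columns `normal_like` and `skewed`, the seed and the distribution parameters are all assumptions made for illustration; nothing here comes from the contest attachment.

```python
import numpy as np
import pandas as pd
from scipy.stats import shapiro

rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {
        "normal_like": rng.normal(loc=50, scale=5, size=67),   # roughly bell-shaped
        "skewed": rng.exponential(scale=2.0, size=67),         # strongly right-skewed
    }
)

rows = []
for col in demo.columns:
    stat, p = shapiro(demo[col])        # returns (W statistic, p-value)
    rows.append(
        {"Variable": col, "W": stat, "Normality p-value": p, "Normal": p > 0.05}
    )

summary = pd.DataFrame(rows)
print(summary)
```

A strongly skewed column yields a p-value far below 0.05, which is the same pattern the zero-inflated oxide columns show in the table above.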

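2.3 Normality not satisfied: applying the centered log-ratio (CLR) transformation

from scipy.stats.mstats import gmean

data_centralized = data.copy()

# Select the numeric columns
numeric_data = data_centralized.select_dtypes(include='number')

# Geometric mean of the non-zero entries in each row
geo_means = []
for index, row in numeric_data.iterrows():
    non_zero_values = row[row > 0]
    geo_mean = gmean(non_zero_values) if len(non_zero_values) > 0 else 1e-6
    geo_means.append(geo_mean)

# Divide each value by the geometric mean of its row's non-zero entries, then take the log
# (zero entries become log(0) = -inf, which triggers the RuntimeWarning below)
for col in numeric_data.columns:
    data_centralized[col] = np.log(numeric_data[col] / geo_means)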

data_centralized.head()

D:\py1.1\envs\pytorch\lib\site-packages\pandas\core\arraylike.py:402: RuntimeWarning: divide by zero encountered in log
  result = getattr(ufunc, method)(*inputs, **kwargs)
    类型  表面风化  文物采样点            SiO2       Na2O        K2O        CaO        MgO      Al2O3      Fe2O3        CuO        PbO        BaO       P2O5        SrO       SnO2        SO2
 0  高钾  无风化   01              3.045978       -inf   1.108685   0.650820  -1.332161   0.175740  -0.639014   0.160355       -inf       -inf  -1.035896       -inf       -inf  -2.134508
 1  铅钡  风化     02              2.676664       -inf  -0.865813  -0.064452  -0.749089   0.831113  -0.294026  -2.261677   2.944652       -inf   0.357963  -2.575334       -inf       -inf
 2  高钾  无风化   03部位1         3.586159       -inf   0.766410  -0.182189       -inf   0.520860       -inf  -1.128785  -2.266618       -inf  -1.295839       -inf       -inf       -inf
 3  高钾  无风化   03部位2         3.090699       -inf   1.483527   0.738107  -0.927387   0.673001  -0.261639   0.595531  -0.688158   0.019074  -1.388422  -3.334332       -inf       -inf
 4  高钾  无风化   04              2.968764       -inf   1.049957   0.743836  -0.774386   0.643457  -0.496365  -0.439747       -inf       -inf  -1.454794       -inf       -inf  -2.240723
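
The standard CLR maps each part x_i to ln(x_i / g(x)), where g(x) is the geometric mean of the row. The notebook uses a variant in which g is taken over the non-zero parts only. The sketch below is a vectorised version of that variant, checked against sampling point 01; the function name `clr_nonzero` is hypothetical, and leaving zero parts as NaN (instead of -inf) is a choice made here, not in the original code.

```python
import numpy as np
import pandas as pd


def clr_nonzero(df):
    """CLR against the geometric mean of each row's strictly positive parts."""
    positive = df.where(df > 0)              # zeros become NaN
    log_parts = np.log(positive)             # NaN stays NaN, so no -inf appears
    log_gmean = log_parts.mean(axis=1)       # mean of logs = log of geometric mean
    return log_parts.sub(log_gmean, axis=0)


# Normalized composition of sampling point 01, copied from the tables above
row01 = pd.DataFrame(
    [[71.027559, 0.0, 10.234607, 6.474746, 0.891302, 4.026227, 1.782604,
      3.964758, 0.0, 0.0, 1.198648, 0.0, 0.0, 0.399549]],
    columns=["SiO2", "Na2O", "K2O", "CaO", "MgO", "Al2O3", "Fe2O3",
             "CuO", "PbO", "BaO", "P2O5", "SrO", "SnO2", "SO2"],
    index=["01"],
)
clr01 = clr_nonzero(row01)
print(clr01.round(6).T)
```

The finite values agree with row 0 of the output above (for example SiO2 is about 3.045978), and the parts that were 0 show up as NaN rather than -inf.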

2.3.1 Key step: replacing -inf with 0

# Replace -inf values (produced by log(0)) with 0 so the columns can be plotted and summarized
#plt.rcParams['font.family'] = 'DejaVu Sans'
selected_cols=new_component_cols
data_centralized.replace(-np.inf, 0, inplace=True)
data_centralized

    类型  表面风化  文物采样点            SiO2       Na2O        K2O        CaO        MgO      Al2O3      Fe2O3        CuO        PbO        BaO       P2O5        SrO       SnO2        SO2
 0  高钾  无风化   01              3.045978   0.000000   1.108685   0.650820  -1.332161   0.175740  -0.639014   0.160355   0.000000   0.000000  -1.035896   0.000000        0.0  -2.134508
 1  铅钡  风化     02              2.676664   0.000000  -0.865813  -0.064452  -0.749089   0.831113  -0.294026  -2.261677   2.944652   0.000000   0.357963  -2.575334        0.0   0.000000
 2  高钾  无风化   03部位1         3.586159   0.000000   0.766410  -0.182189   0.000000   0.520860   0.000000  -1.128785  -2.266618   0.000000  -1.295839   0.000000        0.0   0.000000
 3  高钾  无风化   03部位2         3.090699   0.000000   1.483527   0.738107  -0.927387   0.673001  -0.261639   0.595531  -0.688158   0.019074  -1.388422  -3.334332        0.0   0.000000
 4  高钾  无风化   04              2.968764   0.000000   1.049957   0.743836  -0.774386   0.643457  -0.496365  -0.439747   0.000000   0.000000  -1.454794   0.000000        0.0  -2.240723
..  ...
62  铅钡  风化     54严重风化点    1.216607   0.000000   0.000000   0.000000  -1.518696  -0.328329   0.000000  -1.330386   2.445287   0.000000   1.025244  -1.509727        0.0   0.000000
63  铅钡  无风化   55              2.673354  -0.221722   0.000000  -1.096453   0.000000  -0.847107   0.000000  -1.369493   2.275410   0.854502  -2.268492   0.000000        0.0   0.000000
64  铅钡  风化     56              1.753603   0.000000   0.000000  -1.428231   0.000000  -1.003666   0.000000  -1.854574   2.100799   1.118757  -0.686688   0.000000        0.0   0.000000
65  铅钡  风化     57              1.386720   0.000000   0.000000  -1.578789   0.000000  -1.069491   0.000000  -1.700396   1.960066   1.001890   0.000000   0.000000        0.0   0.000000
66  铅钡  风化     58              2.316326   0.000000  -2.176597   0.152115  -1.333510   0.160674  -1.248610   0.043246   2.574709   0.938225   1.098326  -2.524904        0.0   0.000000

67 rows × 17 columns
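
One thing to keep in mind about this replacement: a CLR value of 0 normally means that the part equals the row's geometric mean, so after the replacement an undetected component and an "average" component look the same. A small, hedged sketch of one way to keep that information is to record a mask before replacing. The names `was_zero` and `zero_share` are illustrative, and the three columns are a cut-down copy of the first two rows shown above.

```python
import numpy as np
import pandas as pd

# First two rows of three CLR columns, copied from the output above
clr = pd.DataFrame(
    {
        "SiO2": [3.045978, 2.676664],
        "Na2O": [-np.inf, -np.inf],
        "PbO": [-np.inf, 2.944652],
    },
    index=["01", "02"],
)

was_zero = clr.eq(-np.inf)             # True where the raw percentage was 0
filled = clr.replace(-np.inf, 0.0)     # same replacement as in the notebook
zero_share = was_zero.mean()           # fraction of zero entries per column

print(filled)
print(zero_share)
```

Columns with a large `zero_share`, such as Na2O here, will have histograms and summary statistics dominated by the substituted zeros, which is worth remembering when reading the plots and tables that follow.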


2.3.2 CLR transformation plots

# Visual comparison between raw data and centralized log ratio transformed data for selected columns
plt.rcParams['font.family'] = 'DejaVu Sans'
fig, axs = plt.subplots(len(selected_cols), 2, figsize=(15, len(selected_cols)*3))

for i, col in enumerate(selected_cols):
    # Plot raw data
    axs[i, 0].hist(data_raw[col].dropna(), bins=30, color='skyblue', edgecolor='black', alpha=0.7)
    axs[i, 0].set_title(f'Raw data: {col}')
    
    # Plot centralized log ratio transformed data
    axs[i, 1].hist(data_centralized[col].dropna(), bins=30, color='salmon', edgecolor='black', alpha=0.7)
    axs[i, 1].set_title(f'Centralized Log Ratio: {col}')

plt.tight_layout()
plt.show()

[Figure: histograms of the raw data (left) and the CLR-transformed data (right) for each component (output_21_0.png)]

#data_centralized.to_excel('E:\\数学建模国赛\\2022数学建模赛题\\C题\\一二表单合并对数中心化转换数据.xlsx', index=True)
data=data_centralized
# Count the unique values in the '类型' and '表面风化' columns
glass_types = data['类型'].unique()
weathering_states = data['表面风化'].unique()

glass_types, weathering_states

(array(['高钾', '铅钡'], dtype=object), array(['无风化', '风化'], dtype=object))
# Initialize an empty DataFrame to store the results
grouped_stats = pd.DataFrame()


component_cols = ['SiO2', 'Na2O', 'K2O', 'CaO', 'MgO', 
                      'Al2O3', 'Fe2O3', 'CuO', 'PbO', 'BaO', 
                      'P2O5', 'SrO', 'SnO2', 'SO2']
# Calculate descriptive statistics for each chemical component
for component in component_cols:
    component_data = data.groupby(['类型', '表面风化'])[component]
    stats = component_data.agg(['mean', 'max', 'min', 'std', 'var', 'skew'])
    # Each group is a Series, so apply the Series kurtosis method
    stats['kurt'] = component_data.apply(pd.Series.kurt)
    stats['cv'] = stats['std'] / stats['mean']  # coefficient of variation (NaN when std and mean are both 0)
    # Add a level to column names
    stats.columns = pd.MultiIndex.from_product([[component], stats.columns])
    grouped_stats = pd.concat([grouped_stats, stats], axis=1)

grouped_stats
                      SiO2                                                                                    Na2O             ...  \
                      mean        max        min        std        var       skew       kurt         cv       mean        max  ...   
类型  表面风化
铅钡  无风化      3.013743   3.871521   1.859524   0.646195   0.417567  -0.301305  -0.956815   0.214416   0.071131   0.876318  ...   
      风化        2.242329   3.937307  -0.131353   0.923780   0.853370  -0.584811   0.650707   0.411973   0.013371   1.043858  ...   
高钾  无风化      3.165687   3.712288   2.266609   0.363205   0.131918  -1.093726   3.036563   0.114732  -0.013585   0.320182  ...   
      风化        4.187045   4.372977   3.830498   0.187388   0.035114  -1.731995   3.641136   0.044754   0.000000   0.000000  ...   

                      SnO2                   SO2
                      kurt         cv       mean        max        min        std        var       skew       kurt         cv
类型  表面风化
铅钡  无风化      3.253187  -2.441987   0.020569   0.267396   0.000000   0.074162   0.005500   3.605551  13.000000   3.605551
      风化       13.632917  -3.664983   0.028021   1.369229  -0.796562   0.336451   0.113199   2.108909   9.857280  12.007019
高钾  无风化     12.000000  -3.464102  -0.507620   0.000000  -2.240723   0.925901   0.857292  -1.388056  -0.011455  -1.824002
      风化        0.000000        NaN   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000   0.000000        NaN

4 rows × 112 columns
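
The per-group statistics can also be computed in one call with named aggregation. The following is a minimal sketch on an invented eight-row table, not the notebook's data. The column names mirror the notebook, and the lambda is used for kurtosis because it is not passed as a string in the notebook's `agg` list either.

```python
import pandas as pd

# Toy data: two groups of four observations each (values invented for illustration)
toy = pd.DataFrame(
    {
        "类型": ["高钾"] * 4 + ["铅钡"] * 4,
        "表面风化": ["无风化"] * 4 + ["风化"] * 4,
        "SiO2": [1.0, 2.0, 3.0, 10.0, 2.0, 2.5, 3.0, 3.5],
    }
)

grouped = toy.groupby(["类型", "表面风化"])["SiO2"]
stats = grouped.agg(
    mean="mean",
    max="max",
    min="min",
    std="std",
    var="var",
    skew="skew",
    kurt=lambda s: s.kurt(),   # excess kurtosis; needs at least 4 observations
)
stats["cv"] = stats["std"] / stats["mean"]   # coefficient of variation
print(stats.round(3))
```

Because CLR values are centered around 0, group means can be close to zero, and the coefficient of variation then becomes very large or changes sign, as several `cv` entries in the table above show. It is probably best read with caution on transformed data.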

# Adjusting the code to avoid renaming columns, instead we will capture the group information in the DataFrame index
tables_dict = {}

for glass_type in glass_types:
    for weathering_state in weathering_states:
        subset = grouped_stats.loc[glass_type, weathering_state].unstack().T
        table_name = f"{glass_type}_{weathering_state}"
        tables_dict[table_name] = pd.DataFrame(subset)  # explicitly convert to pd.DataFrame
        
# Looping through the tables_dict and outputting each DataFrame

tables_dict

{'高钾_无风化':          Al2O3       BaO       CaO       CuO     Fe2O3       K2O       MgO  \
 cv    0.664393 -1.972230  0.893838 -2.321136 -1.626433  0.473990 -0.700958   
 kurt -1.409964  3.016385 -0.156702  1.577446  0.472540  1.635379 -1.292382   
 max   1.508084  0.019074  1.647769  0.595531  0.747950  2.210662  0.000000   
 mean  0.776104 -0.179823  0.599071 -0.262942 -0.390464  1.145963 -0.674968   
 min   0.006978 -1.080913 -0.182189 -1.652716 -1.590841  0.000000 -1.332161   
 skew -0.031480 -1.906416  0.378894 -1.180633 -0.394538 -0.184857  0.061519   
 std   0.515638  0.354653  0.535473  0.610324  0.635064  0.543175  0.473124   
 var   0.265882  0.125778  0.286731  0.372495  0.403306  0.295039  0.223846   
 
            Na2O      P2O5       PbO       SO2      SiO2       SnO2       SrO  
 cv   -19.285768 -0.979906 -1.116780 -1.824002  0.114732  -3.464102 -1.050200  
 kurt   7.015733  0.317255 -1.629147 -0.011455  3.036563  12.000000 -2.376521  
 max    0.320182  0.526955  0.000000  0.000000  3.712288   0.000000  0.000000  
 mean  -0.013585 -0.938500 -0.987338 -0.507620  3.165687  -0.007795 -1.723790  
 min   -0.760277 -2.730275 -2.672140 -2.240723  2.266609  -0.093536 -3.774602  
 skew  -2.150622  0.057567 -0.552251 -1.388056 -1.093726  -3.464102 -0.037176  
 std    0.262001  0.919641  1.102639  0.925901  0.363205   0.027002  1.810324  
 var    0.068645  0.845740  1.215812  0.857292  0.131918   0.000729  3.277274  ,
 '高钾_风化':          Al2O3  BaO       CaO       CuO     Fe2O3       K2O       MgO  Na2O  \
 cv    2.498627  NaN -0.962261 -8.191497 -0.250545 -0.997049 -1.572791   NaN   
 kurt  0.025390  0.0  2.287842  0.619598  1.095297 -0.867476 -1.112631   0.0   
 max   0.961580  0.0  0.215634  0.477459 -1.341006  0.000000  0.000000   0.0   
 mean  0.194529  0.0 -0.664817 -0.060020 -1.714985 -0.328478 -0.286859   0.0   
 min  -0.410081  0.0 -1.760008 -0.889020 -2.470072 -0.824068 -0.983686   0.0   
 skew  0.669913  0.0 -0.709483 -1.043688 -1.369695 -0.588570 -1.095736   0.0   
 std   0.486056  0.0  0.639727  0.491651  0.429681  0.327508  0.451170   0.0   
 var   0.236251  0.0  0.409251  0.241720  0.184626  0.107262  0.203554   0.0   
 
           P2O5  PbO  SO2      SiO2  SnO2  SrO  
 cv   -0.562597  NaN  NaN  0.044754   NaN  NaN  
 kurt  2.101884  0.0  0.0  3.641136   0.0  0.0  
 max   0.000000  0.0  0.0  4.372977   0.0  0.0  
 mean -1.326415  0.0  0.0  4.187045   0.0  0.0  
 min  -2.178840  0.0  0.0  3.830498   0.0  0.0  
 skew  1.134407  0.0  0.0 -1.731995   0.0  0.0  
 std   0.746238  0.0  0.0  0.187388   0.0  0.0  
 var   0.556871  0.0  0.0  0.035114   0.0  0.0  ,
 '铅钡_无风化':          Al2O3       BaO       CaO       CuO     Fe2O3       K2O       MgO  \
 cv    3.716292  0.352188 -0.987216 -1.103642 -2.376125 -0.899079 -1.163923   
 kurt  0.214284  1.405046 -0.671685 -0.661301  4.165086 -1.951127 -0.717171   
 max   0.901223  2.031090  0.340114  0.899535  0.554504  0.000000  0.000000   
 mean  0.138882  1.245669 -0.714861 -0.925721 -0.306467 -1.288085 -0.541147   
 min  -0.847107  0.260264 -1.990837 -2.580097 -2.264904 -2.915489 -1.822866   
 skew -0.716711 -0.562582  0.062455  0.086620 -1.989760  0.104047 -0.750761   
 std   0.516125  0.438710  0.705723  1.021664  0.728205  1.158091  0.629853   
 var   0.266385  0.192466  0.498044  1.043798  0.530282  1.341175  0.396715   
 
           Na2O      P2O5       PbO        SO2      SiO2      SnO2       SrO  
 cv    3.684555 -0.818040  0.266446   3.605551  0.214416 -2.441987 -0.893422  
 kurt  8.623783 -1.684970  6.556376  13.000000 -0.956815  3.253187 -2.023534  
 max   0.876318  0.000000  2.610837   0.267396  3.871521  0.000000  0.000000  
 mean  0.071131 -1.449052  2.160856   0.020569  3.013743 -0.311426 -1.114090  
 min  -0.221722 -3.201927  0.468937   0.000000  1.859524 -2.078030 -2.211561  
 skew  2.741762 -0.069394 -2.363412   3.605551 -0.301305 -2.182647  0.129023  
 std   0.262087  1.185383  0.575751   0.074162  0.646195  0.760497  0.995352  
 var   0.068690  1.405133  0.331490   0.005500  0.417567  0.578356  0.990726  ,
 '铅钡_风化':           Al2O3       BaO       CaO       CuO     Fe2O3       K2O       MgO  \
 cv   -11.231984  0.609170 -1.725044 -1.063677 -1.111626 -1.185000 -0.978333   
 kurt  -0.288489 -0.601793 -0.712685 -0.599524 -0.418132 -1.660810 -1.486424   
 max    2.042802  2.167893  0.497358  0.888513  0.000000  0.000000  0.000000   
 mean  -0.087576  1.035546 -0.375654 -0.824426 -0.723172 -0.967980 -0.693444   
 min   -1.826182 -0.181275 -1.877738 -2.764779 -2.575747 -2.970023 -1.841063   
 skew   0.155720 -0.126683 -0.661062  0.043889 -0.837263 -0.456970 -0.340389   
 std    0.983655  0.630823  0.648019  0.876923  0.803897  1.147056  0.678420   
 var    0.967578  0.397938  0.419929  0.768995  0.646251  1.315737  0.460253   
 
            Na2O       P2O5       PbO        SO2      SiO2       SnO2       SrO  
 cv    27.248350 -11.063796  0.221513  12.007019  0.411973  -3.664983 -0.415568  
 kurt   3.996993   1.120939 -0.753508   9.857280  0.650707  13.632917  1.634188  
 max    1.043858   1.188784  3.510396   1.369229  3.937307   0.000000  0.000000  
 mean   0.013371  -0.102296  2.402080   0.028021  2.242329  -0.119384 -1.827413  
 min   -1.093837  -3.229330  1.389649  -0.796562 -0.131353  -1.944122 -2.930869  
 skew  -0.038016  -1.253158  0.239303   2.108909 -0.584811  -3.788951  1.424906  
 std    0.364329   1.131785  0.532092   0.336451  0.923780   0.437542  0.759414  
 var    0.132736   1.280938  0.283122   0.113199  0.853370   0.191443  0.576710  }
'''
with pd.ExcelWriter('E:\\数学建模国赛\\2022数学建模赛题\\C题\\一二表单合并数据统计性分析.xlsx') as writer:
    for sheet_name, df in tables_dict.items():
        df.to_excel(writer, sheet_name=sheet_name,index=True)
'''

2.4 Descriptive statistics

tables_dict['高钾_无风化']
         Al2O3       BaO       CaO       CuO     Fe2O3       K2O       MgO  \
cv    0.664393 -1.972230  0.893838 -2.321136 -1.626433  0.473990 -0.700958
kurt -1.409964  3.016385 -0.156702  1.577446  0.472540  1.635379 -1.292382
max   1.508084  0.019074  1.647769  0.595531  0.747950  2.210662  0.000000
mean  0.776104 -0.179823  0.599071 -0.262942 -0.390464  1.145963 -0.674968
min   0.006978 -1.080913 -0.182189 -1.652716 -1.590841  0.000000 -1.332161
skew -0.031480 -1.906416  0.378894 -1.180633 -0.394538 -0.184857  0.061519
std   0.515638  0.354653  0.535473  0.610324  0.635064  0.543175  0.473124
var   0.265882  0.125778  0.286731  0.372495  0.403306  0.295039  0.223846

           Na2O      P2O5       PbO       SO2      SiO2       SnO2       SrO
cv   -19.285768 -0.979906 -1.116780 -1.824002  0.114732  -3.464102 -1.050200
kurt   7.015733  0.317255 -1.629147 -0.011455  3.036563  12.000000 -2.376521
max    0.320182  0.526955  0.000000  0.000000  3.712288   0.000000  0.000000
mean  -0.013585 -0.938500 -0.987338 -0.507620  3.165687  -0.007795 -1.723790
min   -0.760277 -2.730275 -2.672140 -2.240723  2.266609  -0.093536 -3.774602
skew  -2.150622  0.057567 -0.552251 -1.388056 -1.093726  -3.464102 -0.037176
std    0.262001  0.919641  1.102639  0.925901  0.363205   0.027002  1.810324
var    0.068645  0.845740  1.215812  0.857292  0.131918   0.000729  3.277274
tables_dict['高钾_风化']
         Al2O3  BaO       CaO       CuO     Fe2O3       K2O       MgO  Na2O  \
cv    2.498627  NaN -0.962261 -8.191497 -0.250545 -0.997049 -1.572791   NaN
kurt  0.025390  0.0  2.287842  0.619598  1.095297 -0.867476 -1.112631   0.0
max   0.961580  0.0  0.215634  0.477459 -1.341006  0.000000  0.000000   0.0
mean  0.194529  0.0 -0.664817 -0.060020 -1.714985 -0.328478 -0.286859   0.0
min  -0.410081  0.0 -1.760008 -0.889020 -2.470072 -0.824068 -0.983686   0.0
skew  0.669913  0.0 -0.709483 -1.043688 -1.369695 -0.588570 -1.095736   0.0
std   0.486056  0.0  0.639727  0.491651  0.429681  0.327508  0.451170   0.0
var   0.236251  0.0  0.409251  0.241720  0.184626  0.107262  0.203554   0.0

          P2O5  PbO  SO2      SiO2  SnO2  SrO
cv   -0.562597  NaN  NaN  0.044754   NaN  NaN
kurt  2.101884  0.0  0.0  3.641136   0.0  0.0
max   0.000000  0.0  0.0  4.372977   0.0  0.0
mean -1.326415  0.0  0.0  4.187045   0.0  0.0
min  -2.178840  0.0  0.0  3.830498   0.0  0.0
skew  1.134407  0.0  0.0 -1.731995   0.0  0.0
std   0.746238  0.0  0.0  0.187388   0.0  0.0
var   0.556871  0.0  0.0  0.035114   0.0  0.0
tables_dict['铅钡_无风化']
         Al2O3       BaO       CaO       CuO     Fe2O3       K2O       MgO  \
cv    3.716292  0.352188 -0.987216 -1.103642 -2.376125 -0.899079 -1.163923
kurt  0.214284  1.405046 -0.671685 -0.661301  4.165086 -1.951127 -0.717171
max   0.901223  2.031090  0.340114  0.899535  0.554504  0.000000  0.000000
mean  0.138882  1.245669 -0.714861 -0.925721 -0.306467 -1.288085 -0.541147
min  -0.847107  0.260264 -1.990837 -2.580097 -2.264904 -2.915489 -1.822866
skew -0.716711 -0.562582  0.062455  0.086620 -1.989760  0.104047 -0.750761
std   0.516125  0.438710  0.705723  1.021664  0.728205  1.158091  0.629853
var   0.266385  0.192466  0.498044  1.043798  0.530282  1.341175  0.396715

          Na2O      P2O5       PbO        SO2      SiO2      SnO2       SrO
cv    3.684555 -0.818040  0.266446   3.605551  0.214416 -2.441987 -0.893422
kurt  8.623783 -1.684970  6.556376  13.000000 -0.956815  3.253187 -2.023534
max   0.876318  0.000000  2.610837   0.267396  3.871521  0.000000  0.000000
mean  0.071131 -1.449052  2.160856   0.020569  3.013743 -0.311426 -1.114090
min  -0.221722 -3.201927  0.468937   0.000000  1.859524 -2.078030 -2.211561
skew  2.741762 -0.069394 -2.363412   3.605551 -0.301305 -2.182647  0.129023
std   0.262087  1.185383  0.575751   0.074162  0.646195  0.760497  0.995352
var   0.068690  1.405133  0.331490   0.005500  0.417567  0.578356  0.990726
tables_dict['铅钡_风化']
          Al2O3       BaO       CaO       CuO     Fe2O3       K2O       MgO  \
cv   -11.231984  0.609170 -1.725044 -1.063677 -1.111626 -1.185000 -0.978333
kurt  -0.288489 -0.601793 -0.712685 -0.599524 -0.418132 -1.660810 -1.486424
max    2.042802  2.167893  0.497358  0.888513  0.000000  0.000000  0.000000
mean  -0.087576  1.035546 -0.375654 -0.824426 -0.723172 -0.967980 -0.693444
min   -1.826182 -0.181275 -1.877738 -2.764779 -2.575747 -2.970023 -1.841063
skew   0.155720 -0.126683 -0.661062  0.043889 -0.837263 -0.456970 -0.340389
std    0.983655  0.630823  0.648019  0.876923  0.803897  1.147056  0.678420
var    0.967578  0.397938  0.419929  0.768995  0.646251  1.315737  0.460253

           Na2O       P2O5       PbO        SO2      SiO2       SnO2       SrO
cv    27.248350 -11.063796  0.221513  12.007019  0.411973  -3.664983 -0.415568
kurt   3.996993   1.120939 -0.753508   9.857280  0.650707  13.632917  1.634188
max    1.043858   1.188784  3.510396   1.369229  3.937307   0.000000  0.000000
mean   0.013371  -0.102296  2.402080   0.028021  2.242329  -0.119384 -1.827413
min   -1.093837  -3.229330  1.389649  -0.796562 -0.131353  -1.944122 -2.930869
skew  -0.038016  -1.253158  0.239303   2.108909 -0.584811  -3.788951  1.424906
std    0.364329   1.131785  0.532092   0.336451  0.923780   0.437542  0.759414
var    0.132736   1.280938  0.283122   0.113199  0.853370   0.191443  0.576710
'''
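Mean:
SiO2 (silicon dioxide): On the centered log-ratio (CLR) scale of the tables above, the mean SiO2 of unweathered high-potassium glass is only slightly higher than that of unweathered lead-barium glass (3.17 vs. 3.01).
After weathering the gap widens rather than narrows (4.19 vs. 2.24), which indicates that weathering changes the relative SiO2 share.
Al2O3 (aluminium oxide): In unweathered glass, the mean Al2O3 of high-potassium glass is higher than that of lead-barium glass (0.78 vs. 0.14).
After weathering both means drop and high-potassium glass remains higher (0.19 vs. -0.09), which may reflect an effect of weathering on Al2O3.
Standard deviation (Std) and coefficient of variation (CV):
Na2O (sodium oxide): In unweathered glass, lead-barium glass has the higher mean Na2O (0.07 vs. -0.01) at the same spread (std 0.26 for both). In weathered high-potassium glass every Na2O entry is 0, so the std is 0 and the CV is undefined (NaN).
Because the Na2O means are close to zero, the CV becomes very large in magnitude (-19.3 in unweathered high-potassium glass, 27.2 in weathered lead-barium glass) and says little about the actual spread.
CaO (calcium oxide): The ordering of the mean CaO reverses with weathering: high-potassium glass is higher when unweathered (0.60 vs. -0.71), and lead-barium glass is higher when weathered (-0.38 vs. -0.66).
Skewness (Skew) and kurtosis (Kurt):
PbO (lead oxide) and BaO (barium oxide): The skewness and kurtosis of these components differ markedly between high-potassium and lead-barium glass (for example, unweathered PbO has a kurtosis of -1.63 in high-potassium glass and 6.56 in lead-barium glass).
This may reflect the different composition of the two glass types and the different effects of weathering on them.
Component-specific observations:
Silicon dioxide (SiO2): Weathering raises the CLR mean in high-potassium glass (3.17 to 4.19) and lowers it in lead-barium glass (3.01 to 2.24).
The two glass types therefore appear to respond to weathering in opposite directions.
Aluminium oxide (Al2O3): Weathering lowers the mean in both types, and in lead-barium glass the standard deviation almost doubles (0.52 to 0.98).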
'''
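As a rough sketch of how a table with these eight rows could be produced, the snippet below computes the same statistics with pandas and shows why the CV becomes unstable when the mean is close to zero. The function name describe_components and the demo values are hypothetical; they are not taken from the original notebook or from the competition data.

```python
import numpy as np
import pandas as pd


def describe_components(df: pd.DataFrame) -> pd.DataFrame:
    """Return one column per component and one row per statistic."""
    stats = df.agg(['mean', 'std', 'var', 'min', 'max', 'skew', 'kurt'])
    # Coefficient of variation = std / mean; left as NaN when the mean is exactly 0
    stats.loc['cv'] = stats.loc['std'] / stats.loc['mean'].replace(0, np.nan)
    # Sort the row labels alphabetically: cv, kurt, max, mean, min, skew, std, var
    return stats.sort_index()


# Hypothetical CLR-scale values for illustration only
demo = pd.DataFrame({
    'SiO2': [3.0, 3.2, 3.4, 3.1],      # mean far from zero: small, stable CV
    'Na2O': [0.1, -0.1, 0.05, -0.04],  # mean near zero: very large CV
    'PbO': [0.0, 0.0, 0.0, 0.0],       # constant column: CV undefined
})

summary = describe_components(demo)
print(summary.round(3))
```

On log-ratio data the mean can sit near zero or change sign, so the standard deviation is usually the safer measure of spread to compare across groups.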

2.5 Box plots

import matplotlib.pyplot as plt
import seaborn as sns

# Use DejaVu Sans, or another font that supports the characters in the labels
plt.rcParams['font.family'] = 'DejaVu Sans'

# Split the data by glass type and weathering state
data_high_potassium_erosion = data[(data['类型'] == '高钾') & (data['表面风化'] == '风化')]
data_high_potassium_no_erosion = data[(data['类型'] == '高钾') & (data['表面风化'] == '无风化')]
data_lead_barium_erosion = data[(data['类型'] == '铅钡') & (data['表面风化'] == '风化')]
data_lead_barium_no_erosion = data[(data['类型'] == '铅钡') & (data['表面风化'] == '无风化')]

# Reshape each subset to long format (columns 'variable' and 'value') for seaborn
boxplot_data_high_potassium_erosion = data_high_potassium_erosion.melt(id_vars=['类型', '表面风化'], value_vars=component_cols)
boxplot_data_high_potassium_no_erosion = data_high_potassium_no_erosion.melt(id_vars=['类型', '表面风化'], value_vars=component_cols)
boxplot_data_lead_barium_erosion = data_lead_barium_erosion.melt(id_vars=['类型', '表面风化'], value_vars=component_cols)
boxplot_data_lead_barium_no_erosion = data_lead_barium_no_erosion.melt(id_vars=['类型', '表面风化'], value_vars=component_cols)

# Create a 2x2 grid of subplots; a separate plt.figure() call beforehand
# would only leave an empty extra figure in the output
fig, axs = plt.subplots(2, 2, figsize=(12, 8))

# Panel order: lead-barium glass in the top row, high-potassium glass in the bottom row
data_list = [boxplot_data_lead_barium_erosion, boxplot_data_lead_barium_no_erosion, boxplot_data_high_potassium_erosion, boxplot_data_high_potassium_no_erosion]
titles = ['Lead Barium Glass with Erosion', 'Lead Barium Glass without Erosion', 'High Potassium Glass with Erosion', 'High Potassium Glass without Erosion']

# Draw one horizontal boxplot per panel; the loop variable is named plot_df
# so that it does not overwrite the global DataFrame `data`
for ax, plot_df, title in zip(axs.flatten(), data_list, titles):
    sns.boxplot(y='variable', x='value', data=plot_df, ax=ax, orient="h")
    ax.set_ylabel('Chemical Component')
    ax.set_xlabel('Content (%)')
    ax.set_title(title)
    ax.invert_yaxis()  # Reverse the order of the components on the y-axis

# Adjust layout
plt.tight_layout()
plt.show()

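[Figure (output_33_1.png): 2x2 grid of horizontal box plots of the chemical components. Panels: Lead Barium Glass with Erosion, Lead Barium Glass without Erosion, High Potassium Glass with Erosion, High Potassium Glass without Erosion.]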
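The melt calls in the plotting code reshape each subset from wide to long format, which is the layout that sns.boxplot needs for y='variable' and x='value'. A minimal illustration follows; the two-row frame and its values are made up for this example, and only the column names '类型' and '表面风化' come from the code above.

```python
import pandas as pd

# Hypothetical wide table: one row per sample, one column per component
wide = pd.DataFrame({
    '类型': ['高钾', '铅钡'],
    '表面风化': ['无风化', '风化'],
    'SiO2': [3.2, 2.2],
    'PbO': [0.0, 2.4],
})

# Long table: one row per (sample, component) pair.
# melt() stores the former column name in 'variable' and the number in 'value'.
long_df = wide.melt(id_vars=['类型', '表面风化'], value_vars=['SiO2', 'PbO'])
print(long_df)
```

Each component then becomes one category on the box plot's y-axis, with its values spread along the x-axis.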

'''
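Lead-barium glass:
Changes before and after weathering:

Lower medians: The medians of most chemical components decrease with weathering, in particular Al2O3, K2O, SiO2, CaO, MgO, and Na2O.
This may reflect a loss of these components during weathering.
Lower dispersion: The spread of these components also narrows with weathering, which suggests that weathering may make their contents more uniform.
Observations on specific components:

Aluminium oxide (Al2O3): Weathering lowers the median and tightens the distribution.
Silicon dioxide (SiO2): Weathering lowers the median and also tightens the distribution.
Potassium oxide (K2O) and sodium oxide (Na2O): The distributions tighten and the medians decrease.
High-potassium glass:
Changes before and after weathering:

Lower medians: The medians of most chemical components also decrease with weathering, especially K2O and Na2O, similar to lead-barium glass.
Change in dispersion: Unlike lead-barium glass, the distributions of some components become wider after weathering, for example silicon dioxide (SiO2) and potassium oxide (K2O).
Observations on specific components:

Aluminium oxide (Al2O3): The distribution becomes wider after weathering.
Silicon dioxide (SiO2): Weathering does not appear to change the median noticeably, but the distribution becomes wider.
Potassium oxide (K2O) and sodium oxide (Na2O): The medians drop sharply and the distributions become wider.
Summary:
These box plots show how weathering affects the composition of the glass.
In both lead-barium and high-potassium glass, weathering may lead to a loss of certain components, but the size of the effect depends on the glass type and on the component.
These observations help explain how weathering affects the chemical composition of different glass types and can inform the conservation and restoration of the artifacts.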
'''
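A box-plot reading like the one above can be cross-checked numerically: the center line of each box is the group median and the box width is the interquartile range (IQR). The sketch below computes both per glass type and weathering state. The helper name median_and_iqr and the demo values are hypothetical; only the column names '类型' and '表面风化' are taken from the code in this section.

```python
import pandas as pd


def median_and_iqr(df, group_cols, value_cols):
    """Return (median, IQR) tables with one row per group."""
    grouped = df.groupby(group_cols)[value_cols]
    median = grouped.median()
    # IQR = 75th percentile minus 25th percentile, i.e. the width of the box
    iqr = grouped.quantile(0.75) - grouped.quantile(0.25)
    return median, iqr


# Hypothetical values for illustration only
demo = pd.DataFrame({
    '类型': ['铅钡'] * 6,
    '表面风化': ['无风化'] * 3 + ['风化'] * 3,
    'SiO2': [3.0, 3.2, 3.4, 2.0, 2.2, 2.3],
})

median, iqr = median_and_iqr(demo, ['类型', '表面风化'], ['SiO2'])
print(median)
print(iqr)
```

A lower median together with a smaller IQR in the weathered group would support the reading "median decreases, distribution tightens"; a larger IQR would contradict it.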

