Topic
Cleaning an xlsx file
Step 1
Copy the headers in G2 through Y2 to G4 through Y4
Install the library
pip install openpyxl
The download failed, so switch to the Aliyun mirror
pip install library -i http://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
The download still failed
Switch mirrors again
pip install openpyxl -i https://pypi.doubanio.com/simple
The download succeeded
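A quick way to confirm that the install actually worked is to import the package and print its version. This is only a minimal check and assumes it runs in the same Python environment that pip installed into:

```python
import openpyxl

# If this import succeeds, the package is installed in the active environment
print(openpyxl.__version__)
```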
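Try copying a row
Reference: copying an entire row with Python openpyxl https://blog.51cto.com/u_16213363/7065319
Open an existing xlsx file
from openpyxl import load_workbook

# Load the workbook (the creation script in these notes saves it under ./src)
wb2 = load_workbook('./src/Mytest.xlsx')
# Get the sheet by name
ws2 = wb2['mytest']
# wb2.get_sheet_by_name('mytest') is deprecated; wb2['mytest'] is the preferred form
# Print the sheet's tab color property
print('color:', ws2.sheet_properties.tabColor)
wb2.close()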
Create a workbook object
from openpyxl import Workbook

if __name__ == '__main__':
    # Create a workbook object
    wb = Workbook()
    # Create a sheet named mytest at index 0
    ws = wb.create_sheet('mytest', 0)
    # Set the sheet tab color (a hexadecimal RGB value)
    ws.sheet_properties.tabColor = 'ff72BA'
    # Save the workbook as Mytest.xlsx (the ./src directory must already exist)
    wb.save('./src/Mytest.xlsx')
    # Finally, close the workbook
    wb.close()
Rename the first sheet to 销售明细 (sales details)
from openpyxl import load_workbook

if __name__ == '__main__':
    # Load the workbook
    wb = load_workbook('./src/原始数据.xlsx')
    # Get the first sheet and rename it
    name1 = wb.sheetnames[0]
    ws = wb[name1]
    ws.title = "销售明细python"
    wb.save('./src/处理数据.xlsx')
    # wb.close()
Copy the sizes
Copy I2 through V2 to I5 through V5 (row 5 becomes the header row once the rows above it are deleted)
name1 = wb.sheetnames[0]
ws1 = wb[name1]
ws1.title = "销售明细python"
# Copy the sizes in row 2 to row 5 (columns 9 to 22, i.e. I to V)
for i in range(9, 23):
    values = ws1.cell(2, i).value
    ws1.cell(5, i).value = values
wb.save('./src/处理数据.xlsx')
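The column numbers 9 and 22 in the loop correspond to the letters I and V. As a minimal sketch, assuming an in-memory workbook filled with made-up size labels rather than the real data file, the same copy can be written with openpyxl's `column_index_from_string` helper so that the letters stay visible in the code:

```python
from openpyxl import Workbook
from openpyxl.utils import column_index_from_string

# Hypothetical in-memory sheet standing in for the real data file
wb = Workbook()
ws = wb.active

first_col = column_index_from_string('I')  # 9
last_col = column_index_from_string('V')   # 22

# Fill row 2 with placeholder size labels (illustrative values only)
for col in range(first_col, last_col + 1):
    ws.cell(row=2, column=col).value = f'size-{col}'

# Copy row 2 to row 5 across columns I through V
for col in range(first_col, last_col + 1):
    ws.cell(row=5, column=col).value = ws.cell(row=2, column=col).value

print(ws['I5'].value, ws['V5'].value)
```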
Step 2
Delete the first four rows
# Delete four rows starting at row 1 (openpyxl rows are 1-indexed, not 0-indexed)
ws1.delete_rows(1, 4)
wb.save('./src/处理数据.xlsx')
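openpyxl numbers rows from 1, so deleting the first four rows means starting at index 1 with an amount of 4. The sketch below uses a small in-memory sheet (the labels are placeholders, not the real data) to show that the old row 5 ends up as row 1:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# Rows 1 to 6, each labelled with its original row number
for r in range(1, 7):
    ws.cell(row=r, column=1).value = f'row {r}'

# Delete 4 rows starting at row 1
ws.delete_rows(1, 4)

# The former row 5 is now the first row
print(ws['A1'].value)
print(ws['A2'].value)
```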
Step 3
Sum each row and add a total column
https://www.cnblogs.com/cherishthepresent/p/17580255.html
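As a rough illustration of the row-sum idea, the sketch below adds a 合计 (total) column and writes one SUM formula per data row. The three-column layout and the values are assumptions made for the example, not the layout of the real file:

```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter

wb = Workbook()
ws = wb.active

# Hypothetical layout: header in row 1, quantities in columns A to C
ws.append(['S', 'M', 'L'])
ws.append([1, 2, 3])
ws.append([4, 5, 6])

last_data_col = ws.max_column   # last column that holds quantities
total_col = last_data_col + 1   # the new total column goes right after it
ws.cell(row=1, column=total_col).value = '合计'

first_letter = get_column_letter(1)
last_letter = get_column_letter(last_data_col)
for row in range(2, ws.max_row + 1):
    # openpyxl stores the formula text; the spreadsheet application evaluates it
    formula = f'=SUM({first_letter}{row}:{last_letter}{row})'
    ws.cell(row=row, column=total_col).value = formula

print(ws['D2'].value)
```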
Step 4
Sum each column
from openpyxl import load_workbook

if __name__ == '__main__':
    # Load the workbook
    wb = load_workbook('./src/原始数据.xlsx')
    # Get the first sheet and rename it
    name1 = wb.sheetnames[0]
    ws1 = wb[name1]
    ws1.title = "销售明细python"
    # Copy the sizes in row 2 to row 5 (columns 9 to 22, i.e. I to V)
    for i in range(9, 23):
        values = ws1.cell(2, i).value
        ws1.cell(5, i).value = values
    # Delete the first four rows (openpyxl rows are 1-indexed, not 0-indexed)
    ws1.delete_rows(1, 4)
    # Add the total (合计) header in column 23 (W)
    ws1.cell(1, 23).value = '合计'
    min_row = ws1.min_row
    max_row = ws1.max_row
    min_col = ws1.min_column
    max_col = ws1.max_column
    # Row sums: write a SUM formula into the last column of each row
    for row in range(min_row + 2, max_row + 1):
        key = ws1.cell(row=row, column=max_col).coordinate
        # First cell of the range to sum
        start = ws1.cell(row=row, column=min_col + 1).coordinate
        # Last cell of the range to sum
        end = ws1.cell(row=row, column=max_col - 1).coordinate
        ws1[key] = f'=SUM({start}:{end})'
    # Column sums: write a SUM formula into the row below the data
    for col in range(min_col + 8, max_col + 1):
        key = ws1.cell(row=max_row + 1, column=col).coordinate
        # First cell of the range to sum
        start = ws1.cell(row=min_row + 1, column=col).coordinate
        # Last cell of the range to sum
        end = ws1.cell(row=max_row, column=col).coordinate
        ws1[key] = f'=SUM({start}:{end})'
    wb.save('./src/处理数据.xlsx')
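One point worth keeping in mind: openpyxl writes the SUM formulas as text and does not evaluate them, so the totals only show up once a spreadsheet application recalculates the file. If plain numbers are wanted instead, the column totals can be computed in Python. The following is a minimal sketch under an assumed two-column layout with made-up values:

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

# Hypothetical layout: header in row 1, numeric data below it
ws.append(['S', 'M'])
ws.append([1, 2])
ws.append([3, 4])

last_row = ws.max_row
last_col = ws.max_column
for col in range(1, last_col + 1):
    values = [ws.cell(row=r, column=col).value for r in range(2, last_row + 1)]
    # Skip empty or text cells so only numbers are added up
    total = sum(v for v in values if isinstance(v, (int, float)))
    ws.cell(row=last_row + 1, column=col).value = total

print(ws['A4'].value, ws['B4'].value)
```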
Result
Step 5
Adjust the formatting
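The notes do not say which formatting changes are intended, so the sketch below is only a guess at typical ones: a bold, centered header row, a wider first column, and a frozen header. The sheet contents are placeholders:

```python
from openpyxl import Workbook
from openpyxl.styles import Alignment, Font

wb = Workbook()
ws = wb.active

# Placeholder data: one header row and one data row
ws.append(['商品编号', 'S', 'M', '合计'])
ws.append(['A001', 1, 2, '=SUM(B2:C2)'])

# Make the header row bold and centered
for cell in ws[1]:
    cell.font = Font(bold=True)
    cell.alignment = Alignment(horizontal='center')

# Widen the first column and keep the header visible while scrolling
ws.column_dimensions['A'].width = 16
ws.freeze_panes = 'A2'
```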
Step 6
Create a new sheet: ranking by individual product
Group the rows by product number with pandas
Data processing
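A possible shape for the per-product ranking is a pandas group-by followed by a sort. The column names 商品编号, 合计 and 排名 and the sample values below are assumptions for illustration; the real sheet may use different headers:

```python
import pandas as pd

# Hypothetical sales rows; column names and values are assumptions
df = pd.DataFrame({
    '商品编号': ['A001', 'A002', 'A001', 'A003', 'A002'],
    '合计': [10, 5, 7, 12, 1],
})

# Sum the totals per product number, then order from highest to lowest
ranking = (
    df.groupby('商品编号', as_index=False)['合计']
      .sum()
      .sort_values('合计', ascending=False)
      .reset_index(drop=True)
)

# Rank 1 is the best seller; ties share the lowest rank number
ranking['排名'] = ranking['合计'].rank(ascending=False, method='min').astype(int)
print(ranking)
```

The resulting frame could then be written to a new sheet, for example with `DataFrame.to_excel` and a `pandas.ExcelWriter` that uses the openpyxl engine.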