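Batch-compressing CSV files into ZIP archives with Python on Windows and uploading them asynchronously to Box enterprise cloud storage requires combining file handling, asynchronous tasks, configuration management, and logging.
This approach uses a thread pool for asynchronous uploads. Each file is compressed on its own, and when one file fails the remaining tasks still run. The logging system records a full trail of operations, and the configuration file lets paths and credentials be changed without touching the code. For a real deployment, the folder-creation logic needs to be adapted to the permissions of the enterprise Box account. The key points of the implementation are: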
1. Configuration file management (config.ini)
[Directories]
source_dir = C:\csv_source
output_dir = C:\zip_output
log_file = C:\logs\app.log
[Box]
client_id = your_client_id
client_secret = your_client_secret
access_token = your_access_token
upload_path = /target_folder
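The main program calls a read_config() helper that is not shown. A minimal sketch of what it could look like, assuming the standard-library configparser and the section and key names from the config.ini above; the REQUIRED table and the early validation are additions for illustration, not part of the original design:

```python
import configparser

# Sections and keys taken from the sample config.ini above.
REQUIRED = {
    "Directories": ("source_dir", "output_dir", "log_file"),
    "Box": ("client_id", "client_secret", "access_token", "upload_path"),
}

def read_config(path="config.ini"):
    """Load config.ini and fail early if a section or key is missing."""
    # interpolation=None keeps '%' characters in paths or tokens literal.
    config = configparser.ConfigParser(interpolation=None)
    if not config.read(path, encoding="utf-8"):
        raise FileNotFoundError(f"config file not found: {path}")
    for section, keys in REQUIRED.items():
        if not config.has_section(section):
            raise KeyError(f"missing section [{section}]")
        for key in keys:
            if not config.has_option(section, key):
                raise KeyError(f"missing key {key!r} in [{section}]")
    return config
```

Values are then read as config['Directories']['source_dir'], config['Box']['upload_path'], and so on.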
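2. Core functional modules
- Compression module
Uses zipfile to compress each file separately, keeping the original file name inside the archive: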
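from pathlib import Path
from zipfile import ZIP_DEFLATED, ZipFile

def compress_csv(csv_path, zip_dir):
    """Compress one CSV into <stem>.zip inside zip_dir and return the archive path."""
    zip_path = Path(zip_dir) / (Path(csv_path).stem + '.zip')
    # ZipFile stores files uncompressed by default, so request DEFLATE explicitly.
    with ZipFile(zip_path, 'w', compression=ZIP_DEFLATED) as zf:
        zf.write(csv_path, arcname=Path(csv_path).name)
    return zip_path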
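Before an archive is uploaded it can be worth checking that it reads back cleanly. The following is a small sketch of such a check; verify_zip is a hypothetical helper name, not something the original code defines:

```python
from zipfile import BadZipFile, ZipFile

def verify_zip(zip_path):
    """Return True if the archive opens and every member passes its CRC check."""
    try:
        with ZipFile(zip_path) as zf:
            # testzip() returns the name of the first corrupt member, or None.
            return zf.testzip() is None
    except BadZipFile:
        # Raised when the file is not a readable ZIP archive at all.
        return False
```

A caller could skip the upload and log an error when verify_zip(zip_path) returns False.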
- Asynchronous upload module
Uploads run in parallel on a thread pool:
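import logging
from boxsdk.exception import BoxAPIException

def async_upload(zip_path, box_client, upload_path):
    """Upload one archive to upload_path; meant to run in a worker thread."""
    try:
        folder = ensure_box_folder(box_client, upload_path)
        folder.upload(str(zip_path))  # pass a plain string path to the SDK
        logging.info(f"Uploaded {zip_path}")
    except BoxAPIException as e:
        logging.error(f"Upload failed {zip_path}: {e.context_info}")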
3. Key points of the Box integration
- OAuth2 authentication flow initialization
- Automatic creation of the upload path:
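def ensure_box_folder(client, path):
    """Walk path from the Box root folder ('0'), creating missing subfolders."""
    current_folder = client.folder('0')
    # Split on '/' and skip empty parts; Path(path).parts would also yield
    # the leading root separator, which would be created as a folder name.
    for part in filter(None, path.split('/')):
        subfolder = next((item for item in current_folder.get_items()
                          if item.name == part and item.type == 'folder'), None)
        current_folder = subfolder or current_folder.create_subfolder(part)
    return current_folder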
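The OAuth2 initialization listed above is not shown, although main() relies on an init_box_client(config) helper. One possible sketch, assuming the boxsdk OAuth2 and Client classes and the [Box] keys from config.ini; the resolve_credentials helper and the BOX_* environment variable names are hypothetical and chosen only for this example:

```python
import os

def resolve_credentials(box_section, env=os.environ):
    """Read Box credentials, letting environment variables override config.ini."""
    creds = {}
    for key in ("client_id", "client_secret", "access_token"):
        # BOX_CLIENT_ID, BOX_CLIENT_SECRET, BOX_ACCESS_TOKEN are assumed names.
        creds[key] = env.get(f"BOX_{key.upper()}") or box_section[key]
    return creds

def init_box_client(config):
    """Build a boxsdk Client from the [Box] section of the configuration."""
    # Imported here so resolve_credentials stays usable without the SDK installed.
    from boxsdk import Client, OAuth2
    creds = resolve_credentials(config["Box"])
    oauth = OAuth2(
        client_id=creds["client_id"],
        client_secret=creds["client_secret"],
        access_token=creds["access_token"],
    )
    return Client(oauth)
```

A bare access token without a refresh token cannot be renewed by the SDK, so a long-running or scheduled job may need a different authentication setup than this sketch.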
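4. Exception handling
- File operation errors (FileNotFoundError, PermissionError)
- Archive errors (zipfile.BadZipFile, raised when a corrupt archive is read back)
- Network transfer errors (requests.exceptions.ConnectionError)
- Box API errors (boxsdk.BoxAPIException)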
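File and archive errors are usually permanent for a given file, while network errors are often transient. As an illustration of treating the two differently, here is a sketch of a small retry wrapper; call_with_retry and its parameters are hypothetical, and retrying is an assumption of this example rather than part of the original design:

```python
import logging
import time

def call_with_retry(func, *args, retry_on=(ConnectionError,), attempts=3, delay=1.0):
    """Call func(*args); retry on the given exception types with a doubling delay."""
    for attempt in range(1, attempts + 1):
        try:
            return func(*args)
        except retry_on as exc:
            if attempt == attempts:
                raise  # out of attempts: let the caller log and move on
            logging.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            time.sleep(delay)
            delay *= 2
```

Note that requests.exceptions.ConnectionError is not a subclass of the built-in ConnectionError, so it has to be passed explicitly, for example retry_on=(requests.exceptions.ConnectionError,).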
5. Logging configuration
logging.basicConfig(
    filename=config['Directories']['log_file'],
    level=logging.INFO,
    format='%(asctime)s [%(levelname)s] %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S'
)
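basicConfig raises FileNotFoundError if the log directory (C:\logs in the sample configuration) does not exist yet. A sketch of a wrapper that creates it first; setup_logging is a hypothetical name, and the explicit UTF-8 encoding and the thread name in the format are additions assumed to be useful here, not part of the original configuration:

```python
import logging
from pathlib import Path

def setup_logging(log_file):
    """Configure file logging, creating the log directory if it is missing."""
    Path(log_file).parent.mkdir(parents=True, exist_ok=True)
    logging.basicConfig(
        filename=str(log_file),
        encoding="utf-8",  # requires Python 3.9+
        level=logging.INFO,
        # threadName helps tell the upload workers apart in the log.
        format="%(asctime)s [%(levelname)s] %(threadName)s %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
        force=True,  # replace handlers installed by an earlier basicConfig call
    )
```

It would be called once at startup, for example setup_logging(config['Directories']['log_file']).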
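6. Main program flow
import logging
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def main():
    config = read_config()
    dirs = config['Directories']  # paths live in the [Directories] section
    box_client = init_box_client(config)
    Path(dirs['output_dir']).mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=4) as executor:
        for csv_file in Path(dirs['source_dir']).glob('*.csv'):
            try:
                zip_path = compress_csv(csv_file, dirs['output_dir'])
                executor.submit(async_upload, zip_path, box_client,
                                config['Box']['upload_path'])
            except Exception as e:
                # Log and continue so one bad file does not stop the batch.
                logging.error(f"Processing failed {csv_file}: {e}")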
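Futures returned by executor.submit are discarded in the loop above, so an exception raised inside a worker that the upload function does not catch itself would go unnoticed. A sketch of collecting the results instead, assuming a generic upload callable; run_uploads is a hypothetical helper, not part of the original program:

```python
import logging
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_uploads(upload, zip_paths, max_workers=4):
    """Run upload(path) for each path in a pool; return (succeeded, failed) lists."""
    succeeded, failed = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = {executor.submit(upload, p): p for p in zip_paths}
        for future in as_completed(futures):
            path = futures[future]
            try:
                future.result()  # re-raises any exception from the worker
                succeeded.append(path)
            except Exception as exc:
                logging.error("Upload failed %s: %s", path, exc)
                failed.append(path)
    return succeeded, failed
```

With this shape the target folder could be resolved once before the pool starts and passed into the callable, which would also avoid several workers looking up or creating the same Box folder at the same time.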
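7. Deployment notes
- Install the required dependencies:
pip install boxsdk python-dotenv
- Protecting the configuration file: sensitive fields such as client_secret are better supplied through environment variables
- Network proxy: enterprise environments may require a proxy server to be configured
- Permissions: make sure the program has write access to the file system and access to the network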