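Overview
Importing Excel or Word files is a common requirement in web applications. This demo shows in detail how to upload a Word document and read its contents.
Details
I. Running result
II. Implementation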
① First, use Maven to quickly set up a Spring Boot project. The starter dependency below declares no version, so the snippet assumes the project inherits from spring-boot-starter-parent (or imports the Spring Boot BOM).
<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
    <java.version>1.8</java.version>
</properties>
<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
</dependencies>
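② Add the main application class
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
@SpringBootApplication
public class WordParseApplication {
    public static void main(String[] args) {
        SpringApplication.run(WordParseApplication.class, args);
    }
}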
③ Add the POI dependencies. poi-ooxml provides XWPF for .docx files, and poi-scratchpad provides HWPF for .doc files.
<!-- Apache POI -->
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>3.17</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml</artifactId>
    <version>3.17</version>
</dependency>
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-scratchpad</artifactId>
    <version>3.17</version>
</dependency>
④ To make testing easier, add Swagger so the endpoint can be called directly from a graphical interface
<!-- Swagger (Springfox) -->
<dependency>
    <groupId>io.springfox</groupId>
    <artifactId>springfox-swagger2</artifactId>
    <version>2.8.0</version>
</dependency>
<dependency>
    <groupId>io.springfox</groupId>
    <artifactId>springfox-swagger-ui</artifactId>
    <version>2.8.0</version>
</dependency>
⑤ Create the Swagger configuration class
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import springfox.documentation.builders.ApiInfoBuilder;
import springfox.documentation.builders.PathSelectors;
import springfox.documentation.builders.RequestHandlerSelectors;
import springfox.documentation.service.ApiInfo;
import springfox.documentation.service.Contact;
import springfox.documentation.spi.DocumentationType;
import springfox.documentation.spring.web.plugins.Docket;
import springfox.documentation.swagger2.annotations.EnableSwagger2;
@Configuration // lets Spring load this class as configuration
@EnableSwagger2 // enables Swagger 2
public class SwaggerConfig {
    @Bean
    public Docket dataManagerApi() {
        return new Docket(DocumentationType.SWAGGER_2)
                .groupName("Parse Word documents with POI")
                .apiInfo(apiInfo())
                .select()
                // only controllers in this package are documented
                .apis(RequestHandlerSelectors.basePackage("com.hello.word.controller"))
                .paths(PathSelectors.any()).build();
    }
    private ApiInfo apiInfo() {
        return new ApiInfoBuilder()
                .title("Parse Word documents with POI")
                .description("Parse Word documents with POI, covering both the doc and docx formats")
                .termsOfServiceUrl("-----")
                .contact(new Contact("zhangwq", "----", "lion_qiang@163.com"))
                .version("1.0").build();
    }
}
⑥ Add the following method to PoiWordController. The doc and docx formats need different POI components: HWPF for doc and XWPF for docx.
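// Imports used by this method (place them at the top of PoiWordController.java)
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import io.swagger.annotations.ApiOperation;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.multipart.MultipartFile;

/**
 * Parses an uploaded Word document paragraph by paragraph.
 * @param file the uploaded .doc or .docx file
 * @return the paragraph texts keyed by paragraph number, in document order
 */
@ApiOperation(value = "Parse a Word document", notes = "Parses a Word document paragraph by paragraph")
@RequestMapping(value = "upload", method = RequestMethod.POST)
public Map<String, String> uploadFile(@RequestParam(value = "file", required = true) MultipartFile file) {
    String fileName = file.getOriginalFilename();
    Map<String, String> wordMap = new LinkedHashMap<>(); // holds the Word content in document order
    if (fileName == null) {
        return wordMap; // no file name, so the format cannot be determined
    }
    String lowerName = fileName.toLowerCase(Locale.ROOT); // also accept .DOC / .DOCX
    try {
        if (lowerName.endsWith(".doc")) {
            // HWPF: WordExtractor extracts text or paragraphs from a binary .doc file
            try (InputStream in = file.getInputStream();
                 WordExtractor wordExtractor = new WordExtractor(in)) {
                int i = 1;
                for (String words : wordExtractor.getParagraphText()) { // paragraph contents
                    wordMap.put("DOC document, paragraph (" + i + ")", words);
                    i++;
                }
            }
        } else if (lowerName.endsWith(".docx")) {
            // XWPF: XWPFDocument reads the upload stream directly, so no temporary
            // file is needed (a fixed "tempFile.docx" would clash between concurrent requests)
            try (InputStream in = file.getInputStream();
                 XWPFDocument document = new XWPFDocument(in)) {
                int i = 1;
                for (XWPFParagraph paragraph : document.getParagraphs()) {
                    wordMap.put("DOCX document, paragraph (" + i + ")", paragraph.getText());
                    i++;
                }
            }
        }
    } catch (Exception e) {
        e.printStackTrace(); // demo only; a real service would log and return an error response
    }
    System.out.println(wordMap);
    return wordMap;
}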
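The method above picks the parser from the file extension, which a client can set to anything. As a possible hardening step, the sketch below inspects the leading bytes of the upload instead. WordFormatSniffer and its Format enum are hypothetical names and are not part of the original demo or of POI. The two signatures are the standard OLE2 compound-file header (the container used by .doc) and the ZIP local-file header (the container used by .docx). Other Office formats share these containers, so a match only narrows the choice of parser; it does not prove the file is a Word document.

```java
// Hypothetical helper: guesses the container format of an upload from its first bytes.
final class WordFormatSniffer {

    enum Format { OLE2, OOXML_ZIP, UNKNOWN }

    // Header of an OLE2 compound file (legacy .doc, but also .xls and .ppt).
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // Header of a ZIP local file entry (.docx, but also .xlsx and plain .zip).
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    static Format detect(byte[] content) {
        if (startsWith(content, OLE2_MAGIC)) {
            return Format.OLE2;
        }
        if (startsWith(content, ZIP_MAGIC)) {
            return Format.OOXML_ZIP;
        }
        return Format.UNKNOWN;
    }

    // Compares the leading bytes of content with the expected signature.
    private static boolean startsWith(byte[] content, int[] magic) {
        if (content == null || content.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((content[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] zipLike = {0x50, 0x4B, 0x03, 0x04, 0x14, 0x00};
        System.out.println(detect(zipLike)); // prints OOXML_ZIP
    }
}
```

In the controller, the result of detect(file.getBytes()) could replace the two endsWith checks: OLE2 would go to WordExtractor and OOXML_ZIP to XWPFDocument, with UNKNOWN rejected up front.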
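III. Project structure
IV. Additional notes
In real applications, individual paragraphs may serve specific purposes. You can prefix them with special marker characters to classify them, then process each category with your own logic.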
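One way to read the note above is sketched here: each paragraph returned by the parser is checked for a leading marker and grouped by category. The class name ParagraphClassifier, the markers #T#, #Q# and #A#, and the category names are all hypothetical, chosen only for illustration. The original demo does not define any marker convention.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: groups paragraph texts by an assumed leading marker.
final class ParagraphClassifier {

    // Assumed marker-to-category mapping; adapt it to your own documents.
    private static final Map<String, String> MARKERS = new LinkedHashMap<>();
    static {
        MARKERS.put("#T#", "title");
        MARKERS.put("#Q#", "question");
        MARKERS.put("#A#", "answer");
    }

    static Map<String, List<String>> classify(List<String> paragraphs) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (String raw : paragraphs) {
            String text = raw == null ? "" : raw.trim();
            if (text.isEmpty()) {
                continue; // skip blank paragraphs
            }
            String category = "other"; // paragraphs without a known marker
            for (Map.Entry<String, String> marker : MARKERS.entrySet()) {
                if (text.startsWith(marker.getKey())) {
                    category = marker.getValue();
                    text = text.substring(marker.getKey().length()).trim(); // drop the marker
                    break;
                }
            }
            grouped.computeIfAbsent(category, k -> new ArrayList<>()).add(text);
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String> paragraphs = Arrays.asList(
                "#T#Quarterly report", "", "#Q#What changed?", "#A#Revenue grew.", "Plain text");
        // prints {title=[Quarterly report], question=[What changed?], answer=[Revenue grew.], other=[Plain text]}
        System.out.println(classify(paragraphs));
    }
}
```

The values of the map returned by the upload endpoint could be passed to classify as a list, so the marker handling stays separate from the POI parsing code.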