Converting between HTML and Word in Java with POI

2021-06-23 14:05 | 追逐盛夏流年 | Java tutorials

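This article walks through converting between HTML and Word documents in Java using POI. It may serve as a useful reference for anyone working on a similar task.

The project's back end uses Spring Boot and Maven, and the front end uses the CKEditor rich text editor. At present the Word file produced from HTML is in doc format, while the image handling only supports docx, so the doc file has to be saved as docx by hand before the images can be replaced.

1. Add the Maven dependencies

The main dependencies are the POI-related ones below. jsoup is also included to make it easier to extract the image elements from the HTML: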

<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi</artifactId>
  <version>3.14</version>
</dependency>

<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi-scratchpad</artifactId>
  <version>3.14</version>
</dependency>

<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi-ooxml</artifactId>
  <version>3.14</version>
</dependency>

<dependency>
  <groupId>fr.opensagres.xdocreport</groupId>
  <artifactId>xdocreport</artifactId>
  <version>1.0.6</version>
</dependency>

<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi-ooxml-schemas</artifactId>
  <version>3.14</version>
</dependency>

<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>ooxml-schemas</artifactId>
  <version>1.3</version>
</dependency>

<dependency>
  <groupId>org.jsoup</groupId>
  <artifactId>jsoup</artifactId>
  <version>1.11.3</version>
</dependency>

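2. Converting Word to HTML

Create a static folder under the resources directory of the Spring Boot project and copy the Word file to be converted into it (the methods below read test.doc and test.docx). Because static is one of Spring Boot's default static resource locations, no extra configuration is needed. If you use a different folder name, configure it in application.yml.

Converting the doc format to HTML: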

// Uses HWPFDocument and WordToHtmlConverter (poi-scratchpad), ResourceUtils (Spring),
// and the javax.xml.parsers / javax.xml.transform classes from the JDK.
public static String docToHtml() throws Exception {
  File path = new File(ResourceUtils.getURL("classpath:").getPath());
  String imagePathStr = path.getAbsolutePath() + "\\static\\image\\";
  String sourceFileName = path.getAbsolutePath() + "\\static\\test.doc";
  String targetFileName = path.getAbsolutePath() + "\\static\\test2.html";
  // Make sure the image output folder exists
  File file = new File(imagePathStr);
  if (!file.exists()) {
    file.mkdirs();
  }
  HWPFDocument wordDocument;
  try (FileInputStream in = new FileInputStream(sourceFileName)) {
    wordDocument = new HWPFDocument(in);
  }
  org.w3c.dom.Document document = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
  WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(document);
  // Save each picture to disk and return its path relative to the HTML file
  wordToHtmlConverter.setPicturesManager((content, pictureType, name, width, height) -> {
    try (FileOutputStream out = new FileOutputStream(imagePathStr + name)) {
      out.write(content);
    } catch (Exception e) {
      e.printStackTrace();
    }
    return "image/" + name;
  });
  wordToHtmlConverter.processDocument(wordDocument);
  org.w3c.dom.Document htmlDocument = wordToHtmlConverter.getDocument();
  // Serialize the DOM tree to an HTML file
  DOMSource domSource = new DOMSource(htmlDocument);
  StreamResult streamResult = new StreamResult(new File(targetFileName));
  TransformerFactory tf = TransformerFactory.newInstance();
  Transformer serializer = tf.newTransformer();
  serializer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
  serializer.setOutputProperty(OutputKeys.INDENT, "yes");
  serializer.setOutputProperty(OutputKeys.METHOD, "html");
  serializer.transform(domSource, streamResult);
  return targetFileName;
}
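
The paths in these methods are built by concatenating "\\static\\..." strings, which ties the code to Windows. As a minimal sketch, assuming the same folder layout, the JDK's Path API can assemble the same locations with the platform's own separator. StaticPaths and inStatic are hypothetical names introduced only for this illustration, and "build" stands in for the classpath root.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper (not part of the original article): resolves files under
// the "static" folder without hard-coding the Windows "\\" separator.
final class StaticPaths {
    private StaticPaths() {}

    // classpathRoot is assumed to be the directory that
    // ResourceUtils.getURL("classpath:").getPath() points to.
    static Path inStatic(String classpathRoot, String... parts) {
        Path p = Paths.get(classpathRoot, "static");
        for (String part : parts) {
            p = p.resolve(part);
        }
        return p;
    }

    public static void main(String[] args) {
        // "build" is a placeholder root used only for this demo
        System.out.println(inStatic("build", "test.doc"));
        System.out.println(inStatic("build", "image", "picture1.png"));
    }
}
```

The resulting Path can be passed to File-based APIs through toFile() or toString().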

Converting the docx format to HTML:

// Uses XWPFDocument (poi-ooxml) and the XHTML converter classes shipped with xdocreport.
public static String docxToHtml() throws Exception {
  File path = new File(ResourceUtils.getURL("classpath:").getPath());
  String imagePath = path.getAbsolutePath() + "\\static\\image";
  String sourceFileName = path.getAbsolutePath() + "\\static\\test.docx";
  String targetFileName = path.getAbsolutePath() + "\\static\\test.html";

  OutputStreamWriter outputStreamWriter = null;
  try {
    XWPFDocument document = new XWPFDocument(new FileInputStream(sourceFileName));
    XHTMLOptions options = XHTMLOptions.create();
    // Folder the extracted images are written to
    options.setExtractor(new FileImageExtractor(new File(imagePath)));
    // Path prefix used for the images inside the generated HTML
    options.URIResolver(new BasicURIResolver("image"));
    outputStreamWriter = new OutputStreamWriter(new FileOutputStream(targetFileName), "utf-8");
    XHTMLConverter xhtmlConverter = (XHTMLConverter) XHTMLConverter.getInstance();
    xhtmlConverter.convert(document, outputStreamWriter, options);
  } finally {
    if (outputStreamWriter != null) {
      outputStreamWriter.close();
    }
  }
  return targetFileName;
}

A successful conversion produces the corresponding HTML file. To display it in the front end, read the file into a String and return it:

// Uses java.nio.file.Files, java.nio.file.Paths and java.nio.charset.StandardCharsets.
public static String readFile(String filePath) {
  try {
    // Decode the whole file in one step so that a multi-byte UTF-8 character
    // is never split across two buffer reads
    byte[] bytes = Files.readAllBytes(Paths.get(filePath));
    return new String(bytes, StandardCharsets.UTF_8);
  } catch (IOException e) {
    // A missing or unreadable file yields an empty result
    e.printStackTrace();
    return "";
  }
}
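
When a file is read through a fixed-size byte buffer, decoding each chunk on its own can break characters whose UTF-8 bytes straddle two chunks. The sketch below uses hypothetical names and a deliberately tiny buffer to show the effect, along with one way around it using an InputStreamReader. It is an illustration only, not code from the original project.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class Utf8ReadSketch {
    private Utf8ReadSketch() {}

    // Decodes every byte chunk independently of the others.
    static String decodePerChunk(byte[] data, int bufferSize) {
        StringBuilder sb = new StringBuilder();
        byte[] bytes = new byte[bufferSize];
        try (InputStream in = new ByteArrayInputStream(data)) {
            for (int n; (n = in.read(bytes)) != -1;) {
                sb.append(new String(bytes, 0, n, StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Lets a Reader do the decoding, so partial characters are carried over.
    static String decodeWithReader(byte[] data, int bufferSize) {
        StringBuilder sb = new StringBuilder();
        char[] chars = new char[bufferSize];
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8)) {
            for (int n; (n = reader.read(chars)) != -1;) {
                sb.append(chars, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u56fe\u7247"; // two CJK characters, three UTF-8 bytes each
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        // A 4-byte buffer cuts the second character in half
        System.out.println(text.equals(decodePerChunk(data, 4)));   // false
        System.out.println(text.equals(decodeWithReader(data, 4))); // true
    }
}
```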

(Screenshot: the converted HTML rendered in the CKEditor rich text editor.)

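3. Converting HTML to Word

The approach is to extract every image element from the HTML and replace each one with a placeholder variable, ${imgreplace1}, ${imgreplace2} and so on in order of appearance, then generate the corresponding doc file. (Generating a docx file directly produced a file that would not open, and I have not yet found a good solution for this.) The doc file is then saved as docx by hand, after which the placeholders can be replaced with the images.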
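
The placeholder step on its own can be sketched with nothing but the JDK. Note that this sketch swaps in a regular expression where the article uses jsoup, so it is a simplification rather than the article's method, and ImgPlaceholderSketch and replaceImages are hypothetical names. Only the ${imgreplaceN} naming comes from the article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ImgPlaceholderSketch {
    private ImgPlaceholderSketch() {}

    // Matches both <img ...> and <img .../>. A regex is a simplification here
    // and is not a substitute for a real HTML parser.
    private static final Pattern IMG =
            Pattern.compile("<img\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Replaces the N-th img tag with ${imgreplaceN} and collects the removed
    // tags in document order.
    static String replaceImages(String html, List<String> removedTags) {
        Matcher m = IMG.matcher(html);
        StringBuffer out = new StringBuffer();
        int count = 0;
        while (m.find()) {
            count++;
            removedTags.add(m.group());
            // quoteReplacement keeps "$" from being read as a group reference
            m.appendReplacement(out,
                    Matcher.quoteReplacement("${imgreplace" + count + "}"));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        List<String> tags = new ArrayList<>();
        String html = "<p>a<img src=\"image/1.jpg\"/>b<img src=\"image/2.jpg\">c</p>";
        System.out.println(replaceImages(html, tags));
        System.out.println(tags);
    }
}
```

Numbering the placeholders in document order is what later lets each one be matched to its own image data.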

public static String writeWordFile(String content) {
    String path = "d:/wordfile";
    Map<String, Object> param = new HashMap<String, Object>();

    if (!"".equals(path)) {
      File fileDir = new File(path);
      if (!fileDir.exists()) {
        fileDir.mkdirs();
      }
      content = HtmlUtils.htmlUnescape(content);
      List<HashMap<String, String>> imgs = getImgStr(content);
      int count = 0;
      for (HashMap<String, String> img : imgs) {
        count++;
        // Replace img tags that end with "/>"
        content = content.replace(img.get("img"), "${imgreplace" + count + "}");
        // Replace img tags that end with ">"
        content = content.replace(img.get("img1"), "${imgreplace" + count + "}");
        Map<String, Object> header = new HashMap<String, Object>();

        try {
          File filePath = new File(ResourceUtils.getURL("classpath:").getPath());
          String imagePath = filePath.getAbsolutePath() + "\\static\\";
          imagePath += img.get("src").replaceAll("/", "\\\\");
          // Default to 400*300 when width or height is missing
          if (img.get("width") == null || img.get("height") == null) {
            header.put("width", 400);
            header.put("height", 300);
          } else {
            header.put("width", (int) (Double.parseDouble(img.get("width"))));
            header.put("height", (int) (Double.parseDouble(img.get("height"))));
          }
          // The picture type is fixed to jpg here
          header.put("type", "jpg");
          header.put("content", OfficeUtil.inputStream2ByteArray(new FileInputStream(imagePath), true));
        } catch (FileNotFoundException e) {
          e.printStackTrace();
        }
        param.put("${imgreplace" + count + "}", header);
      }
      try {
        // Write a Word file in doc format; it has to be saved as docx by hand
        byte by[] = content.getBytes("utf-8");
        ByteArrayInputStream bais = new ByteArrayInputStream(by);
        POIFSFileSystem poifs = new POIFSFileSystem();
        DirectoryEntry directory = poifs.getRoot();
        DocumentEntry documentEntry = directory.createDocument("WordDocument", bais);
        FileOutputStream ostream = new FileOutputStream("d:\\wordfile\\temp.doc");
        poifs.writeFilesystem(ostream);
        bais.close();
        ostream.close();

        // Intermediate file: the docx saved by hand from temp.doc.
        // It must already exist when this line runs.
        CustomXWPFDocument doc = OfficeUtil.generateWord(param, "d:\\wordfile\\temp.docx");
        // Final Word file with the images in place
        FileOutputStream fopts = new FileOutputStream("d:\\wordfile\\final.docx");
        doc.write(fopts);
        fopts.close();
      } catch (Exception e) {
        e.printStackTrace();
      }

    }
    return "d:/wordfile/final.docx";
  }

  // Collect information about the image elements in the HTML
  public static List<HashMap<String, String>> getImgStr(String htmlStr) {
    List<HashMap<String, String>> pics = new ArrayList<HashMap<String, String>>();

    Document doc = Jsoup.parse(htmlStr);
    Elements imgs = doc.select("img");
    for (Element img : imgs) {
      HashMap<String, String> map = new HashMap<String, String>();
      // The last two characters are dropped, which assumes a unit suffix such as "px"
      if (!"".equals(img.attr("width"))) {
        map.put("width", img.attr("width").substring(0, img.attr("width").length() - 2));
      }
      if (!"".equals(img.attr("height"))) {
        map.put("height", img.attr("height").substring(0, img.attr("height").length() - 2));
      }
      // "img" is the self-closing form of the tag, "img1" the form that jsoup prints
      map.put("img", img.toString().substring(0, img.toString().length() - 1) + "/>");
      map.put("img1", img.toString());
      map.put("src", img.attr("src"));
      pics.add(map);
    }
    return pics;
  }
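
getImgStr drops the last two characters of the width and height values, which works when the editor writes them as "400px" but not when it writes a bare "400". A more defensive parser might look like the sketch below. ImgSizeSketch and parsePixels are hypothetical names, and the fallback values are the 400 by 300 defaults the article already uses.

```java
import java.util.Locale;

final class ImgSizeSketch {
    private ImgSizeSketch() {}

    // Parses values such as "400px", "400" or "12.5px" into whole pixels.
    // Anything that cannot be parsed (null, "", "50%") yields the fallback.
    static int parsePixels(String raw, int fallback) {
        if (raw == null) {
            return fallback;
        }
        String s = raw.trim().toLowerCase(Locale.ROOT);
        if (s.endsWith("px")) {
            s = s.substring(0, s.length() - 2).trim();
        }
        if (s.isEmpty()) {
            return fallback;
        }
        try {
            return (int) Double.parseDouble(s); // fractions are truncated
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parsePixels("640px", 400)); // 640
        System.out.println(parsePixels("640", 400));   // 640
        System.out.println(parsePixels("50%", 300));   // 300
    }
}
```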

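The OfficeUtil utility class follows. The versions I found online only support replacing a single image and fail when there are several. The cause is that adding a picture changes the size of the runs list inside processParagraphs while that list is being iterated, which raises an ArrayList iteration exception (typically a ConcurrentModificationException), just as removing elements from a list while looping over it does. The fix is to loop over a fresh ArrayList copy of the runs, as processParagraphs in the class below does.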
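
The failure and the fix can be reproduced with plain strings standing in for the runs of a paragraph. This is a minimal sketch with hypothetical names, not POI code: the "picture" entry plays the role of the extra run that appears when an image is inserted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

final class CopyBeforeIterateSketch {
    private CopyBeforeIterateSketch() {}

    // Adds to the list while iterating over that same list.
    static boolean failsOnLiveList() {
        List<String> runs = new ArrayList<>(Arrays.asList("text", "img", "text"));
        try {
            for (String run : runs) {
                if (run.equals("img")) {
                    runs.add("picture"); // structural change during iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates over a snapshot, so the live list can grow safely.
    static List<String> worksOnCopy() {
        List<String> runs = new ArrayList<>(Arrays.asList("text", "img", "text"));
        for (String run : new ArrayList<>(runs)) {
            if (run.equals("img")) {
                runs.add("picture");
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(failsOnLiveList()); // true
        System.out.println(worksOnCopy());     // [text, img, text, picture]
    }
}
```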

 

package com.example.demo.util;

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;

import org.apache.poi.POIXMLDocument;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;
import org.apache.poi.xwpf.usermodel.XWPFTable;
import org.apache.poi.xwpf.usermodel.XWPFTableCell;
import org.apache.poi.xwpf.usermodel.XWPFTableRow;

/**
 * For Word 2007 (docx) documents
 */
public class OfficeUtil {

  /**
   * Generates a Word document from a template and a set of replacement values
   * @param param variables to replace
   * @param template path of the template file
   */
  public static CustomXWPFDocument generateWord(Map<String, Object> param, String template) {
    CustomXWPFDocument doc = null;
    try {
      OPCPackage pack = POIXMLDocument.openPackage(template);
      doc = new CustomXWPFDocument(pack);
      if (param != null && param.size() > 0) {

        // Process the body paragraphs
        List<XWPFParagraph> paragraphList = doc.getParagraphs();
        processParagraphs(paragraphList, param, doc);

        // Process the paragraphs inside tables
        Iterator<XWPFTable> it = doc.getTablesIterator();
        while (it.hasNext()) {
          XWPFTable table = it.next();
          List<XWPFTableRow> rows = table.getRows();
          for (XWPFTableRow row : rows) {
            List<XWPFTableCell> cells = row.getTableCells();
            for (XWPFTableCell cell : cells) {
              List<XWPFParagraph> paragraphListTable = cell.getParagraphs();
              processParagraphs(paragraphListTable, param, doc);
            }
          }
        }
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
    return doc;
  }

  /**
   * Replaces the variables found in the given paragraphs
   * @param paragraphList paragraphs to process
   * @param param variables to replace
   * @param doc document the pictures are added to
   */
  public static void processParagraphs(List<XWPFParagraph> paragraphList, Map<String, Object> param, CustomXWPFDocument doc) {
    if (paragraphList != null && paragraphList.size() > 0) {
      for (XWPFParagraph paragraph : paragraphList) {
        // The spacing produced by the POI conversion is too large, so reset it
        if (paragraph.getSpacingBefore() >= 1000 || paragraph.getSpacingAfter() > 1000) {
          paragraph.setSpacingBefore(0);
          paragraph.setSpacingAfter(0);
        }
        // Left and right indentation in the Word document
        paragraph.setIndentationLeft(0);
        paragraph.setIndentationRight(0);
        List<XWPFRun> runs = paragraph.getRuns();
        // Adding a picture changes the size of the paragraph's runs,
        // so iterate over a copy instead of runs itself
        List<XWPFRun> allRuns = new ArrayList<XWPFRun>(runs);
        for (XWPFRun run : allRuns) {
          String text = run.getText(0);
          if (text != null) {
            boolean isSetText = false;
            for (Entry<String, Object> entry : param.entrySet()) {
              String key = entry.getKey();
              if (text.indexOf(key) != -1) {
                isSetText = true;
                Object value = entry.getValue();
                if (value instanceof String) { // text replacement
                  text = text.replace(key, value.toString());
                } else if (value instanceof Map) { // picture replacement
                  text = text.replace(key, "");
                  Map<?, ?> pic = (Map<?, ?>) value;
                  int width = Integer.parseInt(pic.get("width").toString());
                  int height = Integer.parseInt(pic.get("height").toString());
                  int picType = getPictureType(pic.get("type").toString());
                  byte[] byteArray = (byte[]) pic.get("content");
                  ByteArrayInputStream byteInputStream = new ByteArrayInputStream(byteArray);
                  try {
                    String blipId = doc.addPictureData(byteInputStream, picType);
                    doc.createPicture(blipId, doc.getNextPicNameNumber(picType), width, height, paragraph);
                  } catch (Exception e) {
                    e.printStackTrace();
                  }
                }
              }
            }
            if (isSetText) {
              run.setText(text, 0);
            }
          }
        }
      }
    }
  }

  /**
   * Maps a picture type name to the matching POI picture type code
   * @param picType type name such as "png" or "jpg"
   * @return the picture type code, PICT when the name is not recognized
   */
  private static int getPictureType(String picType) {
    int res = CustomXWPFDocument.PICTURE_TYPE_PICT;
    if (picType != null) {
      if (picType.equalsIgnoreCase("png")) {
        res = CustomXWPFDocument.PICTURE_TYPE_PNG;
      } else if (picType.equalsIgnoreCase("dib")) {
        res = CustomXWPFDocument.PICTURE_TYPE_DIB;
      } else if (picType.equalsIgnoreCase("emf")) {
        res = CustomXWPFDocument.PICTURE_TYPE_EMF;
      } else if (picType.equalsIgnoreCase("jpg") || picType.equalsIgnoreCase("jpeg")) {
        res = CustomXWPFDocument.PICTURE_TYPE_JPEG;
      } else if (picType.equalsIgnoreCase("wmf")) {
        res = CustomXWPFDocument.PICTURE_TYPE_WMF;
      }
    }
    return res;
  }

  /**
   * Reads the content of an input stream into a byte array
   * @param in stream to read
   * @param isClose whether to close the stream afterwards
   * @return the bytes read, or null if reading failed
   */
  public static byte[] inputStream2ByteArray(InputStream in, boolean isClose) {
    byte[] byteArray = null;
    try {
      // available() is only an estimate and a single read() may return early,
      // so keep reading until the end of the stream
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buffer = new byte[4096];
      for (int n; (n = in.read(buffer)) != -1;) {
        out.write(buffer, 0, n);
      }
      byteArray = out.toByteArray();
    } catch (IOException e) {
      e.printStackTrace();
    } finally {
      if (isClose) {
        try {
          in.close();
        } catch (Exception e2) {
          System.out.println("Failed to close the stream");
        }
      }
    }
    return byteArray;
  }
}
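
getPictureType accepts a type name such as "png" or "jpg", while writeWordFile always passes "jpg". One possible refinement, sketched here under the assumption that the src attribute ends in a meaningful file extension, is to derive the type name from the image path. PictureTypeSketch and extensionOf are hypothetical names and are not part of the original code.

```java
import java.util.Locale;

final class PictureTypeSketch {
    private PictureTypeSketch() {}

    // Returns the lower-case extension of a path or URL-like src value,
    // or the fallback when no extension can be found.
    static String extensionOf(String src, String fallback) {
        if (src == null) {
            return fallback;
        }
        String s = src;
        int query = s.indexOf('?'); // drop a query string if there is one
        if (query >= 0) {
            s = s.substring(0, query);
        }
        int slash = Math.max(s.lastIndexOf('/'), s.lastIndexOf('\\'));
        int dot = s.lastIndexOf('.');
        // The dot must belong to the file name and must not be the last character
        if (dot <= slash || dot == s.length() - 1) {
            return fallback;
        }
        return s.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(extensionOf("image/photo.PNG", "jpg")); // png
        System.out.println(extensionOf("image/photo", "jpg"));     // jpg
    }
}
```

The returned name could then be stored under the "type" key in place of the fixed "jpg" value.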

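In my view, Word 2003 does not support image replacement mainly because HWPFDocument, the class that handles the 2003 format, is declared final, so its methods cannot be overridden. XWPFDocument, the class for the 2007 format, can be extended. Extending XWPFDocument and supplying a custom createPicture method makes image replacement possible. The corresponding CustomXWPFDocument class: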

package com.example.demo.util;

import java.io.IOException;
import java.io.InputStream;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlToken;
import org.openxmlformats.schemas.drawingml.x2006.main.CTNonVisualDrawingProps;
import org.openxmlformats.schemas.drawingml.x2006.main.CTPositiveSize2D;
import org.openxmlformats.schemas.drawingml.x2006.wordprocessingDrawing.CTInline;

/**
 * Custom XWPFDocument with its own createPicture() method
 */
public class CustomXWPFDocument extends XWPFDocument {
  public CustomXWPFDocument(InputStream in) throws IOException {
    super(in);
  }

  public CustomXWPFDocument() {
    super();
  }

  public CustomXWPFDocument(OPCPackage pkg) throws IOException {
    super(pkg);
  }

  /**
   * @param blipId relationship id returned by addPictureData
   * @param ind picture number, used as the drawing id
   * @param width width in pixels
   * @param height height in pixels
   * @param paragraph paragraph the picture is added to
   */
  public void createPicture(String blipId, int ind, int width, int height, XWPFParagraph paragraph) {
    // Pixels are converted to EMU, the unit used by the drawing XML
    final int EMU = 9525;
    width *= EMU;
    height *= EMU;
    CTInline inline = paragraph.createRun().getCTR().addNewDrawing().addNewInline();
    String picXml = ""
        + "<a:graphic xmlns:a=\"http://schemas.openxmlformats.org/drawingml/2006/main\">"
        + "  <a:graphicData uri=\"http://schemas.openxmlformats.org/drawingml/2006/picture\">"
        + "   <pic:pic xmlns:pic=\"http://schemas.openxmlformats.org/drawingml/2006/picture\">"
        + "     <pic:nvPicPr>" + "      <pic:cNvPr id=\""
        + ind
        + "\" name=\"Generated\"/>"
        + "      <pic:cNvPicPr/>"
        + "     </pic:nvPicPr>"
        + "     <pic:blipFill>"
        + "      <a:blip r:embed=\""
        + blipId
        + "\" xmlns:r=\"http://schemas.openxmlformats.org/officeDocument/2006/relationships\"/>"
        + "      <a:stretch>"
        + "        <a:fillRect/>"
        + "      </a:stretch>"
        + "     </pic:blipFill>"
        + "     <pic:spPr>"
        + "      <a:xfrm>"
        + "        <a:off x=\"0\" y=\"0\"/>"
        + "        <a:ext cx=\""
        + width
        + "\" cy=\""
        + height
        + "\"/>"
        + "      </a:xfrm>"
        + "      <a:prstGeom prst=\"rect\">"
        + "        <a:avLst/>"
        + "      </a:prstGeom>"
        + "     </pic:spPr>"
        + "   </pic:pic>"
        + "  </a:graphicData>" + "</a:graphic>";

    inline.addNewGraphic().addNewGraphicData();
    XmlToken xmlToken = null;
    try {
      xmlToken = XmlToken.Factory.parse(picXml);
    } catch (XmlException xe) {
      xe.printStackTrace();
    }
    inline.set(xmlToken);

    // No extra spacing around the picture
    inline.setDistT(0);
    inline.setDistB(0);
    inline.setDistL(0);
    inline.setDistR(0);

    CTPositiveSize2D extent = inline.addNewExtent();
    extent.setCx(width);
    extent.setCy(height);

    CTNonVisualDrawingProps docPr = inline.addNewDocPr();
    docPr.setId(ind);
    docPr.setName("Picture " + ind);
    docPr.setDescr("Test");
  }
}
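
The 9525 factor in createPicture converts pixels to EMU (English Metric Units), the length unit used in the drawing XML. There are 914400 EMU per inch, so at an assumed resolution of 96 pixels per inch one pixel equals 9525 EMU. The sketch below only spells out that arithmetic. EmuSketch and its members are hypothetical names.

```java
final class EmuSketch {
    private EmuSketch() {}

    static final int EMU_PER_INCH = 914_400;
    static final int PIXELS_PER_INCH = 96; // assumed display resolution
    static final int EMU_PER_PIXEL = EMU_PER_INCH / PIXELS_PER_INCH; // 9525

    // Converts a pixel length to EMU; long avoids overflow for large values.
    static long pixelsToEmu(int pixels) {
        return (long) pixels * EMU_PER_PIXEL;
    }

    public static void main(String[] args) {
        // The article's default picture size of 400 by 300 pixels
        System.out.println(pixelsToEmu(400)); // 3810000
        System.out.println(pixelsToEmu(300)); // 2857500
    }
}
```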

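That covers converting between HTML and Word with POI. The problem that HTML cannot be converted directly into a readable docx file remains unsolved. If you have a good solution, feel free to share it.

Original article: https://blog.csdn.net/j1231230/article/details/80712531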
