背景: 系统支持excel的两种文件上传 xlsx 和 xls
问题:今天上传了一个 xlsx的文档 ,系统错误为
org.apache.poi.openxml4j.exceptions.OLE2NotOfficeXmlFileException: The supplied data appears to be in the OLE2 Format. You are calling the part of POI that deals with OOXML (Office Open XML) Documents. You need to call a different part of POI to process this data (eg HSSF instead of XSSF) at org.apache.poi.openxml4j.opc.internal.ZipHelper.verifyZipHeader(ZipHelper.java:147) ~[poi-ooxml-4.1.0.jar!/:4.1.0] at org.apache.poi.openxml4j.opc.internal.ZipHelper.openZipStream(ZipHelper.java:175) ~[poi-ooxml-4.1.0.jar!/:4.1.0] at org.apache.poi.openxml4j.opc.ZipPackage.<init>(ZipPackage.java:104) ~[poi-ooxml-4.1.0.jar!/:4.1.0] at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:301) ~[poi-ooxml-4.1.0.jar!/:4.1.0] at org.apache.poi.ooxml.util.PackageHelper.open(PackageHelper.java:37) ~[poi-ooxml-4.1.0.jar!/:4.1.0] at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:303) ~[poi-ooxml-4.1.0.jar!/:4.1.0正常系统逻辑处理为:
private Workbook getWorkbook(String filePath, InputStream is) throws IOException { String fileExt = this.fileExt(filePath); return "xlsx".equals(fileExt) ? new XSSFWorkbook(is) : "xls".equals(fileExt) ? new HSSFWorkbook(is) : null; }这是线上一个问题,客户说上传错误 但是不肯将文件发我们,通过分析错误原因 看到,是后缀名称和解析方式出了问题,
也就是手动修改文件名后不匹配导致的。
找到原因就好说了,尝试catch里面处理异常
if( "xlsx".equals(fileExt)){ try { workbook = new XSSFWorkbook(is); }catch (Exception e){ log.error("我草,xlsx是手动改的后缀 而不是通过另存为的方式"); workbook = new HSSFWorkbook(is); } } else if ("xls".equals(fileExt) ){ try { workbook = new HSSFWorkbook(is); }catch (Exception e){ log.error("我草,xls是手动改的后缀 而不是通过另存为的方式"); workbook = new XSSFWorkbook(is); } }尝试仍然不好用,其实这种事指标不治本,文件的本身字节码并没有改变,但是通过改变workboor获取方式,仍是存在问题的。
再次优化一下,使用工厂模式创建Workbook:
// 通过文件流工厂方式来处理 Workbook workbook = WorkbookFactory.create(is);看一下POI的源码:
public static Workbook create(InputStream inp, String password) throws IOException, InvalidFormatException, EncryptedDocumentException { InputStream is = FileMagic.prepareToCheckMagic(inp); FileMagic fm = FileMagic.valueOf(is); switch(fm) { case OLE2: NPOIFSFileSystem fs = new NPOIFSFileSystem(is); return create(fs, password); case OOXML: return new XSSFWorkbook(OPCPackage.open(is)); default: throw new InvalidFormatException("Your InputStream was neither an OLE2 stream, nor an OOXML stream"); } }通过debug调试 可得出
xlsx 对应的FileMagic为 OOXML xls 对应的FileMagic为 OLE2
FileMagic 类的代码可参考如下:
public enum FileMagic { /** OLE2 / BIFF8+ stream used for Office 97 and higher documents */ OLE2(HeaderBlockConstants._signature), /** OOXML / ZIP stream */ OOXML(OOXML_FILE_HEADER), /** XML file */ XML(RAW_XML_FILE_HEADER), /** BIFF2 raw stream - for Excel 2 */ BIFF2(new byte[]{ 0x09, 0x00, // sid=0x0009 0x04, 0x00, // size=0x0004 0x00, 0x00, // unused 0x70, 0x00 // 0x70 = multiple values }), /** BIFF3 raw stream - for Excel 3 */ BIFF3(new byte[]{ // 省略 }
OLE2类型的orkbook实现方式:
里面有个password 默认为 null ,但是我也没有找到 何时调用password不为空的时候 估计是预留的吧,有兴趣的可以自己看看
// 第一步先转化为POIFSFileSystem POIFSFileSystem fs = new POIFSFileSystem(is); // 第二步调用this.create(fs, password); public static Workbook create(final DirectoryNode root, String password) throws IOException { // Encrypted OOXML files go inside OLE2 containers, is this one? if (root.hasEntry(Decryptor.DEFAULT_POIFS_ENTRY)) { InputStream stream = null; try { stream = DocumentFactoryHelper.getDecryptedStream(root, password); return createXSSFWorkbook(stream); } finally { IOUtils.closeQuietly(stream); // as we processed the full stream already, we can close the filesystem here // otherwise file handles are leaked root.getFileSystem().close(); } } // If we get here, it isn't an encrypted PPTX file // So, treat it as a regular HSLF PPT one boolean passwordSet = false; if (password != null) { Biff8EncryptionKey.setCurrentUserPassword(password); passwordSet = true; } try { return createHSSFWorkbook(root); } finally { if (passwordSet) { Biff8EncryptionKey.setCurrentUserPassword(null); } } }OOXML类型的orkbook实现方式:
private static Workbook createWorkbook(String factoryClass, Object[] args) throws IOException, EncryptedDocumentException { try { Class<?> clazz = WorkbookFactory.class.getClassLoader().loadClass(factoryClass); Class<?>[] argsClz = new Class<?>[args.length]; int i=0; for (Object o : args) { Class<?> c = o.getClass(); if (Boolean.class.isAssignableFrom(c)) { c = boolean.class; } else if (InputStream.class.isAssignableFrom(c)) { c = InputStream.class; } else if (File.class.isAssignableFrom(c)) { c = File.class; } argsClz[i++] = c; } Method m = clazz.getMethod("createWorkbook", argsClz); return (Workbook)m.invoke(null, args); } catch (InvocationTargetException e) { Throwable t = e.getCause(); if (t instanceof IOException) { throw (IOException)t; } else if (t instanceof EncryptedDocumentException) { throw (EncryptedDocumentException)t; } else if (t instanceof OldFileFormatException) { throw (OldFileFormatException)t; } else if (t instanceof RuntimeException) { throw (RuntimeException)t; } else { throw new IOException(t.getMessage(), t); } } catch (Exception e) { throw new IOException(e); } }