The Apache POI Project's mission is to create and maintain Java APIs for manipulating various file formats based upon the Office Open XML standards (OOXML) and Microsoft's OLE 2 Compound Document format (OLE2). In short, you can read and write MS Excel files using Java. In addition, you can read and write MS Word and MS PowerPoint files using Java. Apache POI is your Java Excel solution (for Excel 97-2008). We have a complete API for porting other OOXML and OLE2 formats and welcome others to participate.

Switch off logging

From the documentation at http://poi.apache.org/utils/logging.html

Logging in POI is used only as a debugging mechanism, not a normal runtime logging system. Logging is ONLY for autopsy type debugging, and should NEVER be enabled on a production system. Enabling logging will reduce performance by at least a factor of 100. If you are not developing POI or trying to debug why POI isn't reading a file correctly, then DO NOT enable logging. You've been warned.

To effectively disable the logging functionality in Apache POI, you must use an alternative logger. This is done by passing a property to the POILogFactory that overrides the default logger. You can add one of these -D options to your JVM settings:

-Dorg.apache.poi.util.POILogger=org.apache.poi.util.NullLogger
-Dorg.apache.poi.util.POILogger=org.apache.commons.logging.impl.NoOpLog

I found Apache POI to perform slightly better with the NoOpLog from Apache Commons Logging!
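
If you cannot change the JVM command line, the same property can be set from code. The following is only a minimal sketch: the class name DisablePoiLogging and the method configure() are made up for this example, and it assumes that POI reads the property when it first creates a logger, so it has to run before any POI class is used.

```java
public class DisablePoiLogging {
    // Property name and logger class are the ones used in the -D options above.
    static final String POI_LOGGER_PROPERTY = "org.apache.poi.util.POILogger";
    static final String NULL_LOGGER_CLASS = "org.apache.poi.util.NullLogger";

    /**
     * Sets the logger property unless a -D option already supplied a value,
     * and returns the value that is in effect afterwards.
     */
    static String configure() {
        if (System.getProperty(POI_LOGGER_PROPERTY) == null) {
            System.setProperty(POI_LOGGER_PROPERTY, NULL_LOGGER_CLASS);
        }
        return System.getProperty(POI_LOGGER_PROPERTY);
    }

    public static void main(String[] args) {
        // Assumption: this runs before the first POI class is loaded.
        System.out.println(POI_LOGGER_PROPERTY + "=" + configure());
    }
}
```

An explicit -D option still wins here, because the code only fills in the value when none is present.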

Recompile POI with better-suited settings

You can create a custom build of Apache POI 3.8 and alter the following constants to better match the size of the Excel files you are generating or reading:

  • org.apache.poi.hssf.usermodel.HSSFRow#INITIAL_CAPACITY = 5;
  • org.apache.poi.hssf.usermodel.HSSFSheet#INITIAL_CAPACITY = 20; // used for compile-time optimization. This is the initial size for the collection of rows. It is currently set to 20. If you generate larger sheets you may benefit by setting this to a higher number and recompiling a custom edition of HSSFSheet.
  • org.apache.poi.hssf.usermodel.HSSFWorkbook#INITIAL_CAPACITY = 3; // used for compile-time performance/memory optimization. This determines the initial capacity for the sheet collection. It's currently set to 3. Changing it in this release will decrease performance since you're never allowed to have more or less than three sheets!
  • http://poi.apache.org/apidocs/org/apache/poi/hssf/usermodel/HSSFWorkbook.html#INITIAL_CAPACITY
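
To get a feeling for why an initial capacity can matter, here is a toy model. It is not POI code: the class CapacityDemo and the doubling growth policy are assumptions made for illustration, and POI's real collections may grow differently. It simply counts how often a growable array has to be reallocated and copied when it starts too small.

```java
import java.util.Arrays;

public class CapacityDemo {
    /**
     * Counts how many times an array that doubles when full must be
     * reallocated to hold itemCount items, starting at initialCapacity.
     */
    static int reallocations(int initialCapacity, int itemCount) {
        if (initialCapacity <= 0) {
            throw new IllegalArgumentException("initialCapacity must be positive");
        }
        Object[] backing = new Object[initialCapacity];
        int size = 0;
        int reallocs = 0;
        for (int i = 0; i < itemCount; i++) {
            if (size == backing.length) {
                // Array is full: allocate a bigger one and copy everything over.
                backing = Arrays.copyOf(backing, backing.length * 2);
                reallocs++;
            }
            backing[size++] = Integer.valueOf(i);
        }
        return reallocs;
    }

    public static void main(String[] args) {
        System.out.println(reallocations(20, 10_000));     // prints 9
        System.out.println(reallocations(10_000, 10_000)); // prints 0
    }
}
```

In this model, a sheet of 10,000 rows that starts at the default of 20 pays for nine copies, while one sized correctly up front pays for none. Whether that is measurable in a real workbook is something to benchmark, not assume.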

Don’t use xlsx, prefer xls!

This only works if you stay within the xls limitations, which may rule out such an extreme solution for you. XLS is not compressed (XLSX is XML-based and compressed), so your workbook may double in size as a result!

For example, data beyond 256 (IV) columns by 65,536 rows will not be saved in xls! In Excel 2010 and Excel 2007, the worksheet size is 16,384 columns by 1,048,576 rows, but the worksheet size of Excel 97-2003 is only 256 columns by 65,536 rows. Data in cells outside of this column and row limit is lost in Excel 97-2003. There are many more limitations listed at office.com.
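
Before switching a report to xls, it is worth checking that the data actually fits. Here is a small sketch using only the limits quoted above; the class XlsLimits and its methods are hypothetical helpers written for this post, not part of POI.

```java
public class XlsLimits {
    // Worksheet limits quoted above.
    static final int XLS_MAX_ROWS = 65_536;       // Excel 97-2003
    static final int XLS_MAX_COLUMNS = 256;
    static final int XLSX_MAX_ROWS = 1_048_576;   // Excel 2007 and 2010
    static final int XLSX_MAX_COLUMNS = 16_384;

    /** Returns true if a sheet with these dimensions can be stored as xls. */
    static boolean fitsInXls(int rowCount, int columnCount) {
        return rowCount <= XLS_MAX_ROWS && columnCount <= XLS_MAX_COLUMNS;
    }

    /** Converts a 1-based column number to its Excel letters (1 -> A, 27 -> AA). */
    static String columnLetters(int columnNumber) {
        StringBuilder letters = new StringBuilder();
        int n = columnNumber;
        while (n > 0) {
            int remainder = (n - 1) % 26;
            letters.insert(0, (char) ('A' + remainder));
            n = (n - 1) / 26;
        }
        return letters.toString();
    }

    public static void main(String[] args) {
        System.out.println(columnLetters(XLS_MAX_COLUMNS));  // prints IV
        System.out.println(columnLetters(XLSX_MAX_COLUMNS)); // prints XFD
        System.out.println(fitsInXls(70_000, 10));           // prints false
    }
}
```

The last xls column comes out as IV, matching the limit mentioned above, and a 70,000-row export is rejected before any data is silently lost.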

The biggest side effect was that my Excel file went from 354 KB to 967 KB, but the speed increase was quite interesting: more than 44% less evaluation time.

Small localized optimizations

I don’t think these bring a lot of speed, since the JIT should optimize this bad piece of code for us, but it is always worth trying. See "Speeding up org.apache.poi.hssf.usermodel.HSSFRow.compareTo()" at http://affy.blogspot.ch/2004/04/poi-optimization-speeding-_108265938673224937.html
