D25008: Add XLSX spreadsheets import optimisations for small/readonly devices

David Llewellyn-Jones noreply at phabricator.kde.org
Mon Oct 28 15:08:37 GMT 2019


davidllewellynjones created this revision.
davidllewellynjones added reviewers: Calligra: 3.0, pvuorela, dcaliste.
davidllewellynjones added a project: Calligra: 3.0.
Herald added a subscriber: Calligra-Devel-list.
davidllewellynjones requested review of this revision.

REVISION SUMMARY
  Loading large XLSX files on resource-constrained devices can be very slow. The problem is made worse because spreadsheet applications make it easy to apply styles to large numbers of rows and columns (e.g. by selecting an entire column or row and applying a style), even when the number of cells with actual content is quite small. In this case it's much faster to convert only the cells inside the boundary containing values, and to ignore the cells beyond it which only have styles applied.
  
  This patch adds a number of compile-time defines that can be used to optimise loading of XLSX spreadsheet documents on restricted devices.
  
  For the desktop version I expect none of these flags will be particularly interesting, so if they're not provided to cmake, the current behaviour is left unchanged.
  
  `MSOOXML_MAX_SPREADSHEET_COLS=<integer>`
  
  This controls the maximum number of columns that the importer will load. Any columns outside the range will be ignored. The default value is 0x7FFF, the maximum number of columns supported by Calligra.
  
  `MSOOXML_MAX_SPREADSHEET_ROWS=<integer>`
  
  This controls the maximum number of rows that the importer will load. Any rows outside the range will be ignored. The default value is 0xFFFFF, the maximum number of rows supported by Calligra.
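
  As a rough illustration of how the two limits could be applied, the sketch below falls back to the documented defaults when no value is supplied at build time and then tests whether a cell lies inside the limits. The helper name `withinImportLimits` and the use of 1-based row and column indices are assumptions made for this example; they are not taken from the patch.

```cpp
// Fallback defaults, used only when the build did not supply a value.
// The numbers are the defaults quoted in the revision summary.
#ifndef MSOOXML_MAX_SPREADSHEET_COLS
#define MSOOXML_MAX_SPREADSHEET_COLS 0x7FFF
#endif
#ifndef MSOOXML_MAX_SPREADSHEET_ROWS
#define MSOOXML_MAX_SPREADSHEET_ROWS 0xFFFFF
#endif

// Hypothetical helper (not from the patch): returns true when a cell at
// the given 1-based position lies inside the configured limits.
// Cells for which this returns false would simply be skipped.
constexpr bool withinImportLimits(int row, int column)
{
    return row >= 1 && row <= MSOOXML_MAX_SPREADSHEET_ROWS
        && column >= 1 && column <= MSOOXML_MAX_SPREADSHEET_COLS;
}
```

  Because the check only uses compile-time constants, a build without the flags keeps the full range and behaves as before.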
  
  `MSOOXML_SPREADSHEET_CONTENT_BORDER=<integer>`
  
  If this value is set to a positive value, the importer will calculate the smallest spreadsheet size that can accommodate all cells containing values or formulae. It adds a border of the given number of cells and uses that for the maximum bounds of the imported spreadsheet. This can be useful because it's easy to create spreadsheets where large numbers of cells have style attributes set, but which in practice only have content in a smaller area of the spreadsheet. Setting this value therefore provides an optimised way to import the spreadsheet, but with style data lost outside of the bounded region.
  
  If this value is left unset, or is negative, the full spreadsheet will be imported.
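
  The bounds calculation described above might look roughly like the following sketch. The types `CellRef` and `Bounds` and the function `importBounds` are hypothetical names invented for illustration. The sketch also assumes 1-based indices, assumes the result is clamped to the row and column limits, and treats a border of zero like a positive one, which the summary does not specify.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical types for illustration; indices are assumed to be 1-based.
struct CellRef { int row; int column; bool hasContent; };
struct Bounds { int maxRow; int maxColumn; };

// Returns the last row and column to import. A negative border disables
// the optimisation, so the full limits are returned unchanged.
Bounds importBounds(const std::vector<CellRef>& cells, int border, Bounds limit)
{
    if (border < 0)
        return limit;

    Bounds used{0, 0};
    for (const CellRef& cell : cells) {
        if (!cell.hasContent)
            continue; // style-only cells do not enlarge the content area
        used.maxRow = std::max(used.maxRow, cell.row);
        used.maxColumn = std::max(used.maxColumn, cell.column);
    }

    // Add the border, but never exceed the configured limits (assumption).
    return { std::min(used.maxRow + border, limit.maxRow),
             std::min(used.maxColumn + border, limit.maxColumn) };
}
```

  With content reaching row 10 and column 257 and a border of 5, this sketch yields bounds of row 15 and column 262, regardless of how far the styled-only cells extend.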
  
  `MSOOXML_IMPORT_READ_ONLY=<true|false>`
  
  Formula cells in XLSX documents store both the formula and the calculated value at the time the document was saved. When importing a file for read-only viewing, the value is more useful than the formula. By setting this flag, the importer can optimise by using the stored values rather than the formulae.
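
  A minimal sketch of the choice the read-only flag implies is given below. The struct `FormulaCell` and the function `importedContent` are hypothetical and only model the decision; the actual reader works on the XML stream, where a cell's formula and cached value are stored in its `f` and `v` child elements.

```cpp
#include <string>

// Hypothetical model of a cell as stored in the file: the formula text
// (empty for plain value cells) and the value cached when it was saved.
struct FormulaCell {
    std::string formula;
    std::string cachedValue;
};

// Returns what the importer would keep for this cell. In read-only mode
// the cached value is used directly, so the formula need not be
// converted or evaluated.
std::string importedContent(const FormulaCell& cell, bool readOnly)
{
    if (readOnly || cell.formula.empty())
        return cell.cachedValue;
    return cell.formula;
}
```

  The trade-off is that the cached value reflects the state of the document when it was last saved, which is acceptable for viewing but not for editing.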

TEST PLAN
  The following example spreadsheet has data in columns IR (252) and IW (257):
  
  http://www.flypig.co.uk/dnload/dnload/other/calligra-optimise01.zip
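
  The column numbers quoted for IR and IW follow from reading the letters as a base-26 number with A = 1. The helper below is a hypothetical illustration of that arithmetic, assuming upper-case ASCII labels; it is not part of the patch.

```cpp
#include <string>

// Converts a spreadsheet column label such as "IR" to its 1-based column
// number. Assumes the label contains only the letters 'A' to 'Z'.
int columnNumber(const std::string& label)
{
    int number = 0;
    for (char letter : label)
        number = number * 26 + (letter - 'A' + 1);
    return number;
}
```

  For example, IR is 9 * 26 + 18 = 252 and IW is 9 * 26 + 23 = 257. If the column limit is read as a 1-based count, a limit of 0xFF (255) would keep column IR and drop column IW.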
  
  1. Load the file and notice there's data in columns IR and IW.
  
  2. Following the standard instructions <https://community.kde.org/Calligra/Building/3#Build_Calligra>, rebuild Calligra using the additional flags:
  
    cd ~/kde/src/calligra
    cmake -DCMAKE_INSTALL_PREFIX=$HOME/kde/inst5 $HOME/kde/src/calligra -DCMAKE_BUILD_TYPE=RelWithDebInfo -DPRODUCTSET=SHEETS -DMSOOXML_MAX_SPREADSHEET_COLS=0xFF -DMSOOXML_MAX_SPREADSHEET_ROWS=0x1400 -DMSOOXML_SPREADSHEET_CONTENT_BORDER=5 -DMSOOXML_IMPORT_READ_ONLY=true
    make -j6
    make install -j6
  
  
  
  3. Open the file with the newly built version of Calligra and notice that less data has been loaded.
  
  Experimenting with the file and with different values for the flags should change how much of the spreadsheet is loaded.

REPOSITORY
  R8 Calligra

REVISION DETAIL
  https://phabricator.kde.org/D25008

AFFECTED FILES
  CMakeLists.txt
  filters/libmsooxml/MsooXmlGlobal.cpp
  filters/libmsooxml/MsooXmlGlobal.h
  filters/sheets/xlsx/XlsxXmlWorksheetReader.cpp
  filters/sheets/xlsx/XlsxXmlWorksheetReader_p.h

To: davidllewellynjones, #calligra:_3.0, pvuorela, dcaliste
Cc: Calligra-Devel-list, davidllewellynjones, dcaliste, cochise, vandenoever