I wanted to load an Excel XML file (a.k.a. Microsoft Office Excel 2002 and Excel 2003 XML Format — SpreadsheetML) in another Excel file using Power Query M formula language. It turns out that it's trickier than I expected so I decided to blog my solution.
Many SAP related tools export this Microsoft 2003 XML format. You can save an excel file in that format by selecting the "XML Spreadsheet 2003 (*.xml)" option in the save as dialog:
| <?xml version="1.0"?> | |
| <?mso-application progid="Excel.Sheet"?> | |
| <Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" | |
| xmlns:o="urn:schemas-microsoft-com:office:office" | |
| xmlns:x="urn:schemas-microsoft-com:office:excel" | |
| xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" | |
| xmlns:html="http://www.w3.org/TR/REC-html40"> | |
| <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office"> | |
| <Author>Andreas Botsikas</Author> | |
| <LastAuthor>Andreas Botsikas</LastAuthor> | |
| <Created>2024-10-17T20:45:19Z</Created> | |
| <Version>16.00</Version> | |
| </DocumentProperties> | |
| <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office"> | |
| <AllowPNG/> | |
| </OfficeDocumentSettings> | |
| <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel"> | |
| <WindowHeight>7524</WindowHeight> | |
| <WindowWidth>23040</WindowWidth> | |
| <WindowTopX>32767</WindowTopX> | |
| <WindowTopY>32767</WindowTopY> | |
| <ProtectStructure>False</ProtectStructure> | |
| <ProtectWindows>False</ProtectWindows> | |
| </ExcelWorkbook> | |
| <Styles> | |
| <Style ss:ID="Default" ss:Name="Normal"> | |
| <Alignment ss:Vertical="Bottom"/> | |
| <Borders/> | |
| <Font ss:FontName="Aptos Narrow" x:CharSet="161" x:Family="Swiss" ss:Size="11" | |
| ss:Color="#000000"/> | |
| <Interior/> | |
| <NumberFormat/> | |
| <Protection/> | |
| </Style> | |
| </Styles> | |
| <Worksheet ss:Name="Sheet1"> | |
| <Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="3" x:FullColumns="1" | |
| x:FullRows="1" ss:DefaultRowHeight="14.4"> | |
| <Column ss:Width="55.2"/> | |
| <Column ss:Width="54"/> | |
| <Row> | |
| <Cell><Data ss:Type="String">First Name</Data></Cell> | |
| <Cell><Data ss:Type="String">Last Name</Data></Cell> | |
| </Row> | |
| <Row> | |
| <Cell><Data ss:Type="String">A</Data></Cell> | |
| <Cell><Data ss:Type="String">B</Data></Cell> | |
| </Row> | |
| <Row> | |
| <Cell><Data ss:Type="String">C</Data></Cell> | |
| <Cell><Data ss:Type="String">D</Data></Cell> | |
| </Row> | |
| </Table> | |
| <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel"> | |
| <PageSetup> | |
| <Header x:Margin="0.3"/> | |
| <Footer x:Margin="0.3"/> | |
| <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/> | |
| </PageSetup> | |
| <Selected/> | |
| <Panes> | |
| <Pane> | |
| <Number>3</Number> | |
| <ActiveRow>3</ActiveRow> | |
| </Pane> | |
| </Panes> | |
| <ProtectObjects>False</ProtectObjects> | |
| <ProtectScenarios>False</ProtectScenarios> | |
| </WorksheetOptions> | |
| </Worksheet> | |
| </Workbook> |
Notice the "<?mso-application progid="Excel.Sheet"?>" line which gives out the type of this xml file.
Once you have this file, let's assume you place it in C:\tmp\ExcelXmlExport.xml and you need to load it in an excel file. Open up Excel and navigate to the Data tab and then select Get Data -> From Other Sources -> Blank Query.
From the ribbon select the advance editor:
In the editor paste the following M script:
| let | |
| // Load the XML file contents | |
| Source = Xml.Tables(File.Contents("C:\tmp\ExcelXmlExport.xml")), | |
| // Filter rows to find one named "Worksheet" | |
| WorksheetRows = Table.SelectRows(Source, each Text.Contains([Name], "Worksheet")), | |
| // Assuming you find one, select the Table | |
| WorksheetTable = WorksheetRows{0}[Table], | |
| // Filter rows within the "Worksheet" table to find the row named "Table" | |
| TableRows = Table.SelectRows(WorksheetTable, each Text.Contains([Name], "Table")), | |
| // Assuming you find one, select the Table | |
| TableTable = TableRows{0}[Table], | |
| // Filter rows within the "Table" table to find the row named "Row" | |
| RowRows = Table.SelectRows(TableTable, each Text.Contains([Name], "Row")), | |
| // Assuming you find one, select the Table | |
| RowTable = RowRows{0}[Table], | |
| // Transform the rows by adding a new column "RowData" | |
| // The transformation takes each "Cell" and extracts the "Element:Text" element | |
| // If the extraction fails (e.g., due to missing data), it defaults to an empty string | |
| TransformRows = Table.AddColumn(RowTable, "RowData", each List.Transform( | |
| [Cell][Data], | |
| (cellData) => try cellData[#"Element:Text"]{0} otherwise "" | |
| )), | |
| // Convert the list of transformed rows into a table | |
| ExpandedRows = Table.FromRows(TransformRows[RowData]), | |
| // Promote the first row of the table to headers | |
| TableWithHeaders = Table.PromoteHeaders(ExpandedRows, [PromoteAllScalars=true]) | |
| in | |
| // Return the final table with headers | |
| TableWithHeaders |
This will load the Excel XML file and you should have all data loaded as strings:
From there on, you can Close & Load and start building your dashboard using the data you just loaded from an external file.
No comments:
Post a Comment