I wanted to load an Excel XML file (a.k.a. Microsoft Office Excel 2002 and Excel 2003 XML Format — SpreadsheetML) in another Excel file using Power Query M formula language. It turns out that it's trickier than I expected so I decided to blog my solution.
Many SAP related tools export this Microsoft 2003 XML format. You can save an excel file in that format by selecting the "XML Spreadsheet 2003 (*.xml)" option in the save as dialog:
<?xml version="1.0"?> | |
<?mso-application progid="Excel.Sheet"?> | |
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet" | |
xmlns:o="urn:schemas-microsoft-com:office:office" | |
xmlns:x="urn:schemas-microsoft-com:office:excel" | |
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet" | |
xmlns:html="http://www.w3.org/TR/REC-html40"> | |
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office"> | |
<Author>Andreas Botsikas</Author> | |
<LastAuthor>Andreas Botsikas</LastAuthor> | |
<Created>2024-10-17T20:45:19Z</Created> | |
<Version>16.00</Version> | |
</DocumentProperties> | |
<OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office"> | |
<AllowPNG/> | |
</OfficeDocumentSettings> | |
<ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel"> | |
<WindowHeight>7524</WindowHeight> | |
<WindowWidth>23040</WindowWidth> | |
<WindowTopX>32767</WindowTopX> | |
<WindowTopY>32767</WindowTopY> | |
<ProtectStructure>False</ProtectStructure> | |
<ProtectWindows>False</ProtectWindows> | |
</ExcelWorkbook> | |
<Styles> | |
<Style ss:ID="Default" ss:Name="Normal"> | |
<Alignment ss:Vertical="Bottom"/> | |
<Borders/> | |
<Font ss:FontName="Aptos Narrow" x:CharSet="161" x:Family="Swiss" ss:Size="11" | |
ss:Color="#000000"/> | |
<Interior/> | |
<NumberFormat/> | |
<Protection/> | |
</Style> | |
</Styles> | |
<Worksheet ss:Name="Sheet1"> | |
<Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="3" x:FullColumns="1" | |
x:FullRows="1" ss:DefaultRowHeight="14.4"> | |
<Column ss:Width="55.2"/> | |
<Column ss:Width="54"/> | |
<Row> | |
<Cell><Data ss:Type="String">First Name</Data></Cell> | |
<Cell><Data ss:Type="String">Last Name</Data></Cell> | |
</Row> | |
<Row> | |
<Cell><Data ss:Type="String">A</Data></Cell> | |
<Cell><Data ss:Type="String">B</Data></Cell> | |
</Row> | |
<Row> | |
<Cell><Data ss:Type="String">C</Data></Cell> | |
<Cell><Data ss:Type="String">D</Data></Cell> | |
</Row> | |
</Table> | |
<WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel"> | |
<PageSetup> | |
<Header x:Margin="0.3"/> | |
<Footer x:Margin="0.3"/> | |
<PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/> | |
</PageSetup> | |
<Selected/> | |
<Panes> | |
<Pane> | |
<Number>3</Number> | |
<ActiveRow>3</ActiveRow> | |
</Pane> | |
</Panes> | |
<ProtectObjects>False</ProtectObjects> | |
<ProtectScenarios>False</ProtectScenarios> | |
</WorksheetOptions> | |
</Worksheet> | |
</Workbook> |
Notice the "<?mso-application progid="Excel.Sheet"?>" line which gives out the type of this xml file.
Once you have this file, let's assume you place it in C:\tmp\ExcelXmlExport.xml and you need to load it in an excel file. Open up Excel and navigate to the Data tab and then select Get Data -> From Other Sources -> Blank Query.
From the ribbon select the advance editor:
In the editor paste the following M script:
let | |
// Load the XML file contents | |
Source = Xml.Tables(File.Contents("C:\tmp\ExcelXmlExport.xml")), | |
// Filter rows to find one named "Worksheet" | |
WorksheetRows = Table.SelectRows(Source, each Text.Contains([Name], "Worksheet")), | |
// Assuming you find one, select the Table | |
WorksheetTable = WorksheetRows{0}[Table], | |
// Filter rows within the "Worksheet" table to find the row named "Table" | |
TableRows = Table.SelectRows(WorksheetTable, each Text.Contains([Name], "Table")), | |
// Assuming you find one, select the Table | |
TableTable = TableRows{0}[Table], | |
// Filter rows within the "Table" table to find the row named "Row" | |
RowRows = Table.SelectRows(TableTable, each Text.Contains([Name], "Row")), | |
// Assuming you find one, select the Table | |
RowTable = RowRows{0}[Table], | |
// Transform the rows by adding a new column "RowData" | |
// The transformation takes each "Cell" and extracts the "Element:Text" element | |
// If the extraction fails (e.g., due to missing data), it defaults to an empty string | |
TransformRows = Table.AddColumn(RowTable, "RowData", each List.Transform( | |
[Cell][Data], | |
(cellData) => try cellData[#"Element:Text"]{0} otherwise "" | |
)), | |
// Convert the list of transformed rows into a table | |
ExpandedRows = Table.FromRows(TransformRows[RowData]), | |
// Promote the first row of the table to headers | |
TableWithHeaders = Table.PromoteHeaders(ExpandedRows, [PromoteAllScalars=true]) | |
in | |
// Return the final table with headers | |
TableWithHeaders |
This will load the Excel XML file and you should have all data loaded as strings:
From there on, you can Close & Load and start building your dashboard using the data you just loaded from an external file.
No comments:
Post a Comment