Importing from XML into Excel
Overview
XML files are a very useful way of importing data into Excel.
The best way to do this is to use a SAXParser since this does not
import all the data and hence works better with large data files.
Instead the file is opened much as with a browser and then read by
excel code that must be customised by the user.
The rest of this page explains how this process works with a
step by step method. We can also provide support on this feature as it
makes data access much easier than might otherwise be the case.
An excel file based upon this tutorial can be downloaded from the following link. To use this excel version you only
need to change the strUsername = "myuser" and strPassword = "mypass" entries in the VBA code.
How To Import
In a new Excel file press Alt-F11 to open the VBA environment.
Then go to Tools > References and tick 'Microsoft XML, v6.0' (on some
versions including some of the most recent Excel versions you may need
v3.0 instead). This will allow access to functionality that will allow
for imports into Excel from XML sources.
We now need to setup the excel file so create a new module
(right click Microsoft Excel Objects and select Insert > Module). Then
we need to enter the following text
Sub xmlDataDownload()
Dim saxReader As SAXXMLReader60 'This is the object used to
parse XML files
Dim saxHandler As ContentHandlerData 'This is the file we will
create that will tell the system how to parse the data
Dim ws As Worksheet
Dim strUsername As String
Dim strPassword As String
Dim i As Date
Dim strXMLDate As String
Dim strURL As String
Set saxReader = New SAXXMLReader60
Set saxHandler = New ContentHandlerData
Set saxReader.contentHandler = saxHandler
Set ws = ThisWorkbook.Sheets("Sheet1")
Application.Calculation = xlCalculationManual
ws.Cells.ClearContents
ws.Cells(1,7) = 2
Calculate
For i = DateSerial(2015,05,01) To DateSerial(2015,05,02)
strXMLDate = Format(i, "yyyymmdd")
strURL = [YOU MUST FILL THIS IN WITH THE URL - GET THIS BY
LOOKING THROUGH THE 'XML-REST Services' DOCUMENTATION]
saxReader.parseURL strURL
Next i
Application.Calculation = xlCalculationAutomatic
End Sub
The URL must be filled in but otherwise this is the basic
structure used to parse the URL. It is best to open the XML in a web
browser and then you know the link is right and can understand the
structure.
To get such a link go to Backend > Services > realtime > Fuel
Mix. This offers you a URL
'https://www.netareports.com/dataService?rt=realtime&username=auser&password=apasswd&message=BMRS.FUELHH&querytype=data&sd=20100401'
from which you merely need to replace the username and set it up to update the
date. This would look as follows:
strURL =
"https://www.netareports.com/dataService?rt=realtime&username=" &
strUsername & "&password=" & strPassword &
"&message=BMRS.FUELHH&querytype=data&sd=" & strXMLDate
Now we just need to define "strUsername = myUsername" and "strPassword = myPasword"; place this code somewhere near the top of the module putting in your username and password as appropriate.
Finally we must write the script to navigate through the XML
file.
To do this we must create a new Class Module much like creating
a module, but selecting 'Class Module' instead (right click Microsoft Excel Objects and select Insert > Class Module). Then in the properties
box select the '(Name)' entry and rename the class module
'ContentHandlerData'.
On the first line of the class module type:
Implements IVBSAXContentHandler
Press enter and then you can select the dropdown above that says
'(General)' and select IVBSAXContentHandler (if this option does not appear be sure you have ticked 'Microsoft XML, v6.0' under Tools > References). Once selected you can
select all the items on the right dropdown at the top and this will
setup all the methods. If you have trouble here feel fee to get in
touch and we can guide you through.
At the very top of the class module just below the
'Implements...' paste in the following:
Dim ws As Worksheet
Dim strDate As String, strSP As String, strValue As String
Dim i As Long
By this point you will have methods and the next step will be to
replace the 'Private Sub IVBSAXContentHandler_endDocument()' method
with:
Private Sub IVBSAXContentHandler_endDocument()
ws.Cells(1, 7) = i 'This will keep track of your row as you
change date
End Sub
Finally the 'IVBSAXContentHandler_startElement' method must be
updated to handle the import. The following will look roughly like the
code you must use within this method, but must be adapted to the
specific XML:
Private Sub IVBSAXContentHandler_startElement(strNamespaceURI
As String, strLocalName As String, strQName As String, ByVal
oAttributes As MSXML2.IVBSAXAttributes)
If strQName = "response" Then 'When we hit the response tag at
the top
Set ws = Worksheets("Sheet1") 'We now have access to the sheet
ws.Cells(1, 1) = "Date"
ws.Cells(1, 2) = "Settlement Period"
ws.Cells(1, 3) = "Value" 'Change these to what you want as
headers
i = ws.Cells(1, 7) 'Get latest row number (needed when doing
more than one date)
ElseIf strQName = "hhfuelmix" Then 'Or whatever the tag
we are processing has as its name
strDate = oAttributes.getValueFromQName("settlementdate") 'Gets
the entry settlementdate="2015-12-01 T12:00:00"
strSP = oAttributes.getValueFromQName("settlementperiod") 'Gets
the entry settlementperiod="1"
strValue = oAttributes.getValueFromQName("generation") 'Gets the
entry generation="6.5"
ws.Cells(i,1) = strDate
ws.Cells(i,2) = strSP
ws.Cells(i,3) = strValue
i=i+1
End If
End Sub
This section is typically easiest to fill in while stepping through a run
as you can see what is happening and see a failure as it happens if
you have made a mistake, but for now this tutorial should guide you through
setting up a first downloader.
Note that this approach has a weakness if a tag is missing (say
if settlementperiod=x was missing). This can generally be ignored (as in this case
where our data will not let us down), but for future reference
would look as follows:
blnHasSPTag = False
For i = 0 To oAttributes.Length -1
strIthName = oAttributes.getQName(i)
If strIthName = "settlementperiod" Then blnHasSPTag = True
Next i
If blnHasSPTag = True Then
strSP = oAttributes.getValueFromQName("settlementperiod") 'Gets
the entry settlementperiod="1"
ws.Cells(i,2) = strSP
End If
For now we can ignore this code and start to step through. If you have two screens it is best to
have excel and the XML in a browser on one screen (switch between as
you go) and the VBA code in the other (perhaps print these instructions out first).
When you run through it is best to run through by pressing F8
repeatedly to step through and then you can see what is going on by
placing the mouse over any completed object assignments.
We can start this off by going back to our module, clicking between 'Sub xmlDataDownload()' and 'End Sub' and hitting F8. A yellow
highlighted band will appear and as we continue to hit F8 this band will step through our method.
Once down to 'strXMLDate = Format(i, "yyyymmdd")' you can put your mouse over the variable i to see that it shows the data '01/05/2015' and after 'strURL = ....' we can check to see that our XML link is fully formed.
If all steps have been carried out properly we should move into the class module and start stepping through that code. As we go through here we will pick up the
data from the XML and start building up the data table in the excel sheet.
Don't forget that as you run placing the mouse over strURL on
'saxReader.parseURL strURL' as it is highlighted will give you the
exact link you are going to try and download. This link can be pasted into a
browser and then you can see the data you are importing.
If all goes right data should be imported. The process takes a
bit of getting used to but once learnt can be setup in a matter of
minutes to import data using the above approach.
To start with it will be best to modify parts of the code, but in time once a few runs have
been compeleted it should become easy to setup new XML imports.
The next step would be to split up the fuel type imports replacing 'ws.Cells(1,3) = "Value"' with 'ws.Cells(1,3) = "CCGT"' and 'ws.Cells(1,4) = "OCGT"' and then
getting the fuel type using 'strFuel = oAttributes.getValueFromQName("fueltype")'.
The import can then be better managed by replacing 'ws.Cells(i,1) = strDate' to 'i = i+1' with the following:
If strFuel = "CCGT" Then
ws.Cells(i, 1) = strDate
ws.Cells(i, 2) = strSP
ws.Cells(i, 3) = strValue
ElseIf strFuel = "OCGT" Then
ws.Cells(i, 4) = strValue
i = i + 1
End If
Once this is importing CCGT and OCGT data in two columns, this can then be extended out to handle all the different fuel types avaialable to download.
Having set this up you should be set to start experimenting with more XML calls and will soon be able to get all your excel models automatically importing the data you
need without having to copy and paste new data in.
Support
If you need any support phone us on 01642 671111 or email us
with any questions.