Create a Pivot Table from multiple worksheets in different workbooks

{ 67 Comments }

Assume two workbooks named Book1.xls and Book2.xls.  In Book1.xls, there is a sheet named Jan and in the Book2.xls, there is a worksheet named Feb.  In both the worksheets – Jan and Feb, the following exist

1. Same number of columns on both worksheets; and
2. Same order of columns on both worksheets; and
3. Same spellings of headings on both worksheets

As you can observe in zipped file, all three aspects mentioned above are in place.  One may want to perform any of the following tasks:

1. Create one pivot table from both worksheets (which are in different workbooks); and/or
2. Consolidate data from both worksheets (which are in different workbooks) one below the other

To curtail the workbook size, I have deleted rows of data.  In each workbook, there were 65,000 rows of data on each Jan worksheet and Feb worksheet.

While one simple way would be to copy and paste data from Jan and Feb worksheet in a third worksheet and then create a pivot table, the following shortcomings exist with this method

1. Copying and pasting data from multiple workbooks into one is a manual process; and
2. Since the pivot table will be based on the manually created sheet (by copying and pasting), changes in the two original workbooks will not update the pivot table on refreshing unless the same changes are carried out in the consolidated sheet (created by copying and pasting above) as well.  So this leads to duplication of effort.

Both problems outlined above can be resolved by using MS Query.  The technique mentioned below will work as long as the three conditions mentioned above are satisfied.  Furthermore, since MS Query will only recognize named ranged with rows up to 65,536, the number of rows of data in each of the worksheets (Jan and Feb) should not exceed 65,536.  The combined rows in both worksheets could be any number.

Depending upon the version of MS Excel which you are using, there could be two ways to solve this problem

Solution for MS Excel 2010 and higher versions

If you are using the Power Query add-in, then a few simple steps (no formulas at all) can solve this problem.  The result will be dynamic and refreshable (just as in a Pivot Table).  In order for the Power Query solution to work, all files will have to be saved as CSV files in the same folder (Since a CSV files can have only one worksheet, the Jan and Feb worksheets in the other workbook will have to become two workbooks – Jan.csv and Feb.csv).  In Power Query, there is a feature to append data from multiple CSV files lying in a specific folder into one running range.  Once this is done, the running range can either be transferred to an MS Excel worksheet (if the number or rows are up to 1 million) from where a Pivot Table can be created OR the running range can be loaded to the Data Model (Excel 2013) [the Data Model will be able to accommodate more than 1 million rows depending upon the hardware and software configuration of the machine] from where a Pivot Table can be created.

You may watch a short video here:

Solution for all versions of MS Excel

The steps for creating a pivot table from multiple worksheets (both in the same file) are:

1. Both workbooks – Book1.xls and Book2.xls are saved on the desktop.
2. Open Book1.xls, select the data on the Jan sheet (including the first row as the header row – on the Jan sheet, it will be A1:S4.  Ensure that the header row has some distinctive formatting such as Bold or some colour) and press Ctrl+F3 > New.  In the Name box, type Dummy and click on OK > Close.
3. To cross check that the name assigned above has indeed been assigned correctly, select the data range once again and in the Name box (left of the formula bar), Dummy should appear.
4. Select A1:S4 of the Jan sheet again and press Ctrl+T to convert this range into a Table.  Ensure that the “My Table has headers” box is checked.  Save and close Book1.xks
5. Repeat steps 2 – 4 for the Feb sheet on Book2.xls as well.  In step 3, just change the name to Dummy1.  Save and close Book2.xls
6. Open a new workbook and go to Data > From Other Sources > From Microsoft Query
7. Under Databases, select Excel files > OK
8. In the Directories dialog box, navigate to the folder on the desktop where the Book1.xls file is saved.  So for me, it is saved under C:\Users\Ashish\Desktop\ and double click on the folder where Book1.xls file is saved
9. In the left hand side window, select the Book1.xls file and click on OK
10. With Dummy selected, click on the > symbol to bring over all columns of this named range to the right hand side box
11. Click on Next three times
12. Select the option of Return Data to Microsoft Excel
13. At this stage, if you wish to get data in a Table form then select Table.  If you wish to create a pivot table, select the second option button – Pivot Table.  Select any cell where you would like to the result to appear, say cell A1.
14. Click on OK.  A counter will run at the bottom left hand side with the title of Reading Data
15. If you had selected pivot table in step 13 above, then the pivot table grid/layout will appear
16. You may now drag fields to create a pivot table

So the Table or pivot table which you have created so far is only from Jan sheet of Book1.xls.  We have to also include data from Feb sheet of Book2.xls.

17. If you had created a Table in step 13 above, then click on any cell in the Table and under Table Tools (yellow button in the MS Excel title bar) > Options, click on the small drop down arrow under Refresh and select Connection Properties.  If you had created a Pivot Table in step 13 above, then select any cell in the pivot table and under Pivot Table Tools (red button in the MS Excel title bar) > Options, click on the small drop down arrow under Refresh and select Connection Properties.
18. In the Command text box, you see something like this

SELECT dummy.Invoice_number, dummy.Date, dummy.`Article Number`, dummy.`Article Name`, dummy.`Tax Type`, dummy.Brand, dummy.Category, dummy.`Sports Code`, dummy.`Product Type`, dummy.MRP, dummy.`Customer #`, dummy.`Customer Name`, dummy.Region, dummy.`Area Code`, dummy.Quantity, dummy.`Line Discount`, dummy.`Sales Value`, dummy.`Before Tax`, dummy.`VAT Amount` FROM `C:\Users\Ashish\Desktop\Book1.xls`.dummy dummy

Replace this lengthy statement with the following shorter one

SELECT * from `C:\Users\Ashish\Desktop\Book1.xls`.dummy
Union all
SELECT * from `C:\Users\Ashish\Desktop\Book2.xls`.dummy1

19. Click on OK and the Table/Pivot Table should update
20. Save the file as Consolidated.xls and close it.

Update Table/Pivot Table for addition/editing in existing named ranges

To see the effect of a dynamic Table/pivot table, edit data in Jan and/or Feb sheets of Book1.xls and/or Book2.xls.  Save and close the file(s).  Open Consolidated.xls and refresh the Table/Pivot Table (Right click and Refresh).  At the bottom right had side the counter will run again and (step 14 above) and once it has read all data, the Table/pivot table will update.

Furthermore, since the two named ranges (Dummy and Dummy1) have been converted into Tables, even if you add data by rows (with no row being left entirely blank), when you right click to Refresh the pivot Table, data of new rows will appear.

Update Table/Pivot Table for addition of new worksheets

Assume you now want to add data from a worksheet titled March in Book3.xls.  To include this sheet in the pivot Table, follow the under mentioned steps:

1. Open Book3.xls

2. Follow steps 2 – 4 mentioned above for the March sheet.  In step 3, just change the name to Dummy3

3. Save and close the workbook

4. Open Consolidated.xls

5. Select any cell in the pivot table and under Pivot Table Tools (red button in the MS Excel title bar) > Options, click on the small drop down arrow under Refresh and select Connection Properties.  If you selected Table (instead of Pivot Table) in step 13 above, click on any cell in the Table and under Table Tools (yellow button in the MS Excel title bar) > Options, click on the small drop down arrow under Refresh and select Connection Properties.

6. Click on the Definition tab and under Command Text, add the following at the end of the SQL query:

Union all
SELECT * from `C:\Users\Ashish\Desktop\Book3.xls`.dummy3

7. When you click on OK, the counter will run again and the pivot should reflect data from March sheet.

Leave a Comment

Your email address will not be published.

*

  • Thank you for this….I tried it and it works! But once I create the “unioned” table…i cannot seem to have the data give me a sum of one of the columns of data.

    My tables are simple….3 columns
    ITEMNO DESC QTY

    Once I have the 2 sheets merged into one table, I am try to get a sum for the qty of each item, but for some reason it only returns 0’s . Any idea why?

    Thanks for making this available!

    Pete

    • Hi,

      You are welcome. This is happening because at least on of the tables has the Qty column as all blank values. Enter a value of 0 in the first cell of the Qty column of the table which has all blank values and refresh the Pivot Table.

      Does this work?

  • First of all thanks for the wonderful article. I tried your example with the sample xls file and it worked like a charm.

    In my case, i have 3 different workbook with exactly same no/type of columns. And in each workbook there are around 2000 rows. My aim is to consolidate for a range of 2000 rows from all 3 workbook and make a pivot report of it.

    I selected the entire range (2000 rows) while using the Name manager and the creating the Table.

    I use Excel 2010 (all 3 worksheets are with xlsx extention) and files are placed on the network. The MS Query Wizard doesn’t respond after Step12 when I click Finish. Nothing happens.

    Pls advice!

    • Hi,

      You are most welcome. If the Select Data box does not appear after 5-6 seconds of returning data to MS Excel, then press Alt+Tab.

      Does the Select Data box appear now?

  • @ashish,

    I tried the following workarounds:

    In step2, I had earlier selected 2000rows as per my requirement. My sample sheet had data until 45rows. So I again created the name manager and table for 45rows in all three workbook and saved them. Now when I tried using MS query wizard, it worked fine.

    The MS Query Wizard responded perfectly after Step12 when I clicked Finish.

    But when I modified the command text.

    SELECT * from `C:\Documents and Settings\Desktop\Trial2\130419_Final_2B.xlsx`.data2b
    Union all
    SELECT * from `C:\Documents and Settings\Desktop\Trial2\130419_Final_2E.xlsx`.data2e
    Union all
    SELECT * from `C:\Documents and Settings\Desktop\Trial2\130419_Final_2H.xlsx`.data2h

    I got an error “[Microsoft][ODBC Excel Driver] Too many fields defined”. And then when I click OK a error msg saying “Problem obtaining data” appears.

    I then tried them limiting to 2 files

    SELECT * from `C:\Documents and Settings\Desktop\Trial2\130419_Final_2B.xlsx`.data2b
    Union all
    SELECT * from `C:\Documents and Settings\Desktop\Trial2\130419_Final_2E.xlsx`.data2e

    It worked perfectly.

    This means that your method works only if

    1) I select the filled data rows (45 rows) instead of 2000 rows (but my requirement is 2000rows)
    2) I use the command text to add only two files at a time (but my requiment is 3 files)

    I am almost there except for the above two bottlenecks. Pls advice!

    • Hi,

      There seems to be some problem in the named range data2h. Check the following in the named range data2h

      1. It should contain the same number of columns as in data2b and data2e; and
      2. The order of column should match the order of columns in data2b and data2e; and
      3. Spelling of headings should be the same in all the three named ranges

      In each workbook, the named range should only exist till where data is. So if data in data2b is only till row 45, then define the named range only till row 45. If data on data2e is only till row 57, then define the named range only till row 57 and so on.

      Since you have converted all ranges into Tables, the named range will expand as and when data gets added by rows.

      • @ashish,

        Yes. I followed your instructions. All these files were copied and made from a single file (only changes are in file name and row data). And I should have mentioned earlier that I has already tried combining two files like data2b & data2e works, data2e & data2h works, and data2b & data2h works.

        Only when I combine the 3rd using “union all” I seem to have a problem.

        “Since you have converted all ranges into Tables, the named range will expand as and when data gets added by rows.”: Its a relief 🙂

        Now i just have to find a way to add the 3rd file!

        • Hi,

          Upload all four files (3 individual data files and 1 file where you are trying to create a Pivot Table) to SkyDrive and share the link of the uploaded file here.

  • I am running Excel 2007 on Windows 7 on a 64bit machine.

    I have +200 files (in tab separated value format, single sheet per worksheet) that I would like to interrogate via pivot table in a separate worksheet. The files are all identically formatted with a header line and 15 fields with between 25000 and 45000 lines of data. I cannot open all the files at once nor can I join them into a single file (5.4 million lines).

    Any suggestions how to set up a Pivot table to interrogate all the files?

    Your site is excellent!

    • Hi,

      Thank you for your kind comments about my website. Since you are using Excel 2007, I cannot think of any “practically feasible” MS Excel based solution to this problem. The only think I can suggest is using MS Access which will allow you to append data from 200+ files into one database and then create a Pivot Table from it.

      If the following three conditions were true, I would have tried something:

      1. You were using MS Excel 2010 or higher version
      2. The 200+ files were saved as CSV files
      3. The 200+ files were all saved in one folder

      If these conditions are true, then I think we can try to use a combination of PowerQuery and PowerPivot to atleast attempt solving this problem.

  • This was a great guide! Very helpful. I just have a query. When I try to get the sum of the values as a summary in the pivot table, all I get are 0’s. When I select summarize data (by right clicking on the pivot table), the columns I need to sum up seems to be converted to text format (hence why there are 0s).

    Any advise as to convert them to numbers?

    Thanks!

    • Thank you. I am glad that you liked the solution. This is happening because of one of the following reasons:

      1. Any one of the numeric columns has numbers stored as text. In all workbooks, select the column of numbers and check to see if you get the SUM of numbers in the task bar
      2. There are no numeric values in the column (which you are dragging inside the Pivot Table) of one of the worksheets. If there is any such worksheet, then type 0 in the first cell (under the header row) of that column

      Refresh the Pivot Table.

      Does this help?

      • Sadly no. In the actual reference filed with the raw data, the numbers in the column are results from a formula, so I’m pretty sure that they are stored as numbers. It also shows the sum when you manually try to get it.

        The only value thats not numeric is the header.

  • That was it! Some of the fields were blank because an if statement were returning “”. When I changed it to return 0 instead, the values were correctly adding up!

    Thanks!

  • i am getting error “address..” is not a valid name.Make sure that it does not include invalid characters or punctuation and that it is not too long..

    I’ve used workbook names as book1 & book2.

    I’ve checked with different formats like .xls, .xlsx, .xlsb but no use.

    Below is the link
    https://onedrive.live.com/redir?resid=643304E60729B7A4!298&authkey=!AB1GFNH_97F7qCE&ithint=folder%2c.xlsx

    i am using excel 2007

    Its a life savior for me.. thank you for the article..

    • Hi,

      I believe the error mesage says it all. It looks like the files which you want to create a Pivot Table from have either a lengthy path or contain special characters. To try this, copy the files on a certain folder on the desktop.

      Does it work then?

      • No use mate..
        This time i did it on another system. File path is very simple. query is below.

        select * from ‘D:\summary\book1.xlsb’.dummy1
        union all
        select * from ‘D:\Summary\book2.xlsb’.dummy2

        Error message below
        [microsoft][odbc excel driver] “D:\summary\book1.xlsb” is not a valid name. make sure that it does not include invalid characters…….

        have you tried on my data??

    • Hi,

      I cannot try my workaround steps on your shared files right now because i do not have access to my computer at this time. Anyways, the reason i think this is not working is that the number of rows in any one file (in your case unfortunately both) exceeds 65,536. MS Query (being a primitive application) does not support Named Ranges of more than 65,536 rows (I have mentioned this limitation in the original Blog article itself).

      Anyways, i can think of two workarounds to this problem

      1. Ensure that no sheet has a named range exceeding 65,000 rows of data. So if any sheet has 85,000 rows, then split that worksheet into two worksheets with only upto 65,000 rows of data on each worksheet. You will have to do this for all other sheets/workbooks as well. Now follow the steps outlined in my original Blog article and let me know if the method works.

      2. If the solution outlined in 1 above is impractical to implement (because it may be a time consuming exercise to split a file into multiple files), then we may use Power Query (a free add-in by Microsoft for Excel 2010 and higher versions). In Power Query, there is a way to append rows of data from data lying in multiple files (creating a Pivot Table from this is then very simple) provided the following conditions are satisfied:

      a. All files have the same structure i.e. same number of columns, same order of columns and same spelling of headings; and
      b. All files are lying in one foler; and
      c. All files are saved as .CSV files (which by default means that all files will only have one worksheet); and
      d. You should be using Excel 2010 or a higher version; and
      e. You should have Power Query (a free add-in from Microsoft) installed on your system.

      If the above conditions can get satisfied, then there is a one click solution to append data from multiple files to one. The icing on the cake here will be that if more files are added to the folder or files are deleted from the folder or data is edited in existing files, then the appended data (an therefore the Pivot Table) will, on Refreh, show you the updated result.

  • Please note that you may get an error if you take steps 2,3 & 4 in order 4, 2 & 3

    (I am using MS Office 2013, Wn7)

  • Hi!
    I’m trying your solution, but at step 12/13 I get a problem: I can only select the options “Return data to Microsoft Excel” or “View data or edit query in Microsoft Query” – no matter which option I select,

    • (sorry – accidentally hit submit)

      … no matter what option I select, I get the error message “Too few parameters. Expected 9.” and then I can’t continue with anything. I followed exactly your steps, jsut applied it to my data sets (exactly same structure and column labels in both sheets of course). I’m using Excel 2010 64bit on Windows 7..

      Thanks a lot!!
      Best,
      Tilmann

  • Hi, Thank you for this. I tried it and it works greatly! I would like to ask what about creating a Pivot Table from multiple worksheets of the same workbook? For example, the workbook contains different tab (Tab1 = Jan data, Tab2 = Feb Data). The pivot table data is from the different tab. Is there any way to do it as well as ways to continue to update the data?

    Thanks!

  • Hi. First of all, I’d like to thank Ashish for the article. I’ve been looking for solution that suits me for a long time, and this approach seemed to be the best. The only problem that it doesn’t work when I consolidate more than 2 workbooks. I’ve got ‘too many fields defined’ error. What I am trying to do is to consolidate 11 workbooks. Each workbook (table) has 91 columns and 3 000 to 7 000 raws (I followed carefully your instructions and ensured that there is equal number of columns and they have the same order and names in all files). I’ve read about this issue in comments section. Man called ‘vveekk’ wrote about the same problem. I wonder if he has solved it. Please, give me advice or link to other techniques.

  • Hi, Excellent explanation, but it’s not quite what I’m looking for. The data I am working with is in three columns but the thrid column title changes over the course of the 20 workbooks from “2007 Charges” to “2013 Charges”. When i add the additional file name through the description, the Pivot Table still only has the original 2007 Charges as a Field and I can’t see how to get the new Fields to appear so I can make them Columns in the Pivot Table. I tried to simply add additional empty columns to the first spreadsheet so that there are fields in the Pivot table awaiting data but it seems as though all of the data sets must have exactly the same columns in each of them. I was hoping that adding the additional data might dynamically add the additional fields to the Pivot Table tool so I could insert them in the Pivot Table once the data was linked. This is my first experience with MS Query.

  • Hi! Thank you for the solution. Is very useful.
    However I have some problems because of the location of the excel document.
    In our company the documents are placed on a platform for document management. I don’t know exactly how to explain. Is like a web place where we keep documents and every one can have access through links and see them and make changes. I hope you know what I mean. Then when I have to look for the document I cannot get to the location as the location is not in my computer.
    Is it possible to make the SQL extraction and create the pivot table in this case?

    • Hi,

      No matter where the document is saved, it has to have a traceable path. That patch has to be input at the location which I have shared in my original Blog post.

      • Hello again Ashish,
        Thank you for your prompt answer.
        I have watched again your tutorial and I insist that I cannot put the location of the document as it is not a location from my computer.
        For instance in the point 8 of your tutorial above you say:
        “8. In the Directories dialog box, navigate to the folder on the desktop where the Book1.xls file is saved. So for me, it is saved under C:\Users\Ashish\Desktop\ and double click on the folder where Book1.xls file is saved”
        In my case the location would be: ShareDox/Repository/OD/etc.This location I can not find in the Directories Dialog box you mention above as I get to ShareDox/Repository/OD through Internet Explorer.
        Is it possible to just copy the internet location an paste it into the Directories dialog box? I am not sure it is possible.
        Than you very much for your help and apologise for insisting.

        • Hi,

          check if the internet service which you are using to save your workbook has a provision to sync files to your computer, such as OneDrive does. If there is a such a feature then you will be able to give a path. I am now out of ideas.

          • I am using a lot this Excel function and I want to thank you very much for your tutorial.
            In my last project I have faced a problem similar to what you are showing in the tutorial when the pivot table is not summing the sales values from the three tables.
            I am trying to join 16 different tables but when I do some numerical data from certain columns is not recognize as such (though I can see the data) and when I want to sum values from those columns the result is 0.
            I have changed the format of all source tables as well as the final one but still not working.
            I am out of solutions and I don’t know what else I should do.
            Could you help me with that?
            Thank you for your help.
            Regards,

          • Hi,

            You are welcome. Formatting of cells does not change the data type. Perhaps the data type of your numeric columns is text. Convert the Data Type to Numbers first.

  • Hi,
    I got your response but let me explain further
    each workbook has several sheets which is summarized on another tab names summary..I have about 4 workbooks like this.

    What i intend to do is consolidate the summary sheets only.

    and i want to be able to automatically update the pivot whenever there is any change in the worksheets in each workbook

  • great solution. thank you. use a sample of data and worked like a charm. problem however is size. I have a client database (sql) of 1.5m records. I deliver in excel to client. all this is fine – break data in logical portions. the test data I used that worked is 3 worksheets consisting of 8779, 82 897 and 305 067 records (lines). It worked 100%. Using the full data set, I break up large sheets to max 200 000 lines (based on the sample of 305 067 lines that worked). Excel (wizard) stop working at process no 12 after the 3rd file I tried to link. Woking on office 32bit with excel’s max 2GIG processing capabilities- will office 64bit resolve this and/or the Power Query option

  • Hi Ashish Mathur,

    i’m johannes.
    i got a job to create a report for my leader from several supervisor,
    let me describe it.
    – the report on excel format and put it on google drive(all workbook) so anytime can update and check it.
    – we have 3-5 supervisor that have to report their job everyday to my leader
    – i create workbook(1 workbook have 2-3worksheet an every worksheet have at least 4 column) for each supervisor and share the access to update everyday.
    – for my leader he want to check the report in one workbook(master report).
    – and when my leader see the report, if there is something wrong, he will give comment. and could it? (it’s mean comment on master workbook than the source workbook can be notify. (optional if can do it)

    i want to create a master report that the data get from the supervisor workbooks.
    may i create it? and how to do it?
    and the master report should can filter by each supervisor, job desk, and date.
    also auto refresh/update for the master report.

    please give me an advice. and big hope that i can do it.
    kindly wait for you reply Ashish Mathur,
    thank you so much for helping

    • Hi,

      Thank you for writing in. I may be able to solve this problem using Power Query a.k.a Get & Transform in Excel 2016. Also, the workbooks will have to be stored on your local drive or network somewhere. Share the link from where i can download a sample workbook. Also, show me the exact format in which you would like to see your output.

  • Hello Ashish…thank you for this solution…I am stuck though…i have linked data sheets on 4 different workbook. The resultant pivot is reading from all the workbooks but is not calculating any values. Its just giving me a 0 in the columns where it is supposed to give me a sum total…any ideas on how to fix this?
    I have entered values (either 0 or 1) in all the 4 data sheets…

    • You are welcome. Getting a 0 means that it is treating the entries in the numeric columns as text entries. Ensure that in the first cell of all numeric column of all workbooks, there is number or a 0. Then refresh the Pivot Table. Furthermore, if you want to build a pivot Table from multiple workbooks, it would be ideal to use Power Query to first append data from multiple workbooks and then push the data to the PowerPivot. From the PowerPivot, we can then build our desired Pivot Table.

  • Hi Ashish. First of all I would like to thanks you for sharing such a useful solution which not only me but several other were wondering. Thanks again 🙂
    I have a short question, I am using MS Query for bringing data from mulitple Tabels and it works beautifully and it brings all rows including blank\empty rows. I am using this SQL statement
    SELECT * from dummy
    Union all
    SELECT * from dummy1
    Is there any way to avoide BLANK rows in output tabel?
    regards
    Naghman

    • You are welcome. If the dummy and dummy1 named ranges include blank rows, then they will appear in your dataset. If you trying to build a Pivot Table from multiple workbooks, then a much better solution to use is Power Query a.k.a. Get & Transform in Excel 2016 and Power Pivot.

      • Hi again. and thanks for reply. I am unsure about lot of things about Power Query due to it’s not easy to change my whole set-up.
        By using MS query, mostly my all files are connected with each other and it is very convenient in use.
        Is there any other SQL statement which can allow me to select specific columns\rows from input files (instead of Select * \ Union all)?

  • i am using two csv files kept in a folder. Used New query-From FILE–From folder and selected both files. After i load the file and insert Pivot, i get total of Count instead of Total of SUM.Please help in this case.I need Total of SUM