Comparing Excel: A Practical Guide to Spreadsheet Comparison


This is an impartial and practical guide for users wishing to compare Excel workbooks for auditing or analysis purposes. It explains the preparation required, and shows various methods by which the user can undertake workbook comparisons with or without third party tools. It goes on to discuss the problems & pitfalls faced by both manual and automated comparison methods, and finally recommends some changes users may wish to make to their spreadsheets or governance procedures once the comparison is complete.



Contents:
1) What do I really want to achieve?
2) Simple spreadsheet comparison methods
3) Advanced user spreadsheet comparison methods
4) Third party spreadsheet comparisons tools
5) Preparing spreadsheets for comparison
6) Potential pitfalls
7) And finally


1) What do I really want to achieve?
This fundamental question is the key driver of which workbook comparison method is most suited to your individual purpose. If you are simply looking to ascertain if a file has changed, then a simpler approach can be taken then if you want to establish if the fundamental calculation method or results in a spreadsheet has changed. Likewise, if you need to determine changes to your company’s asset register, then a more diligent methodology is likely to be required then if you just want to discover the extent of change in an internal telephone number list. We do not apologize for the obvious nature of this paragraph, because your answer will help determine not only the best approach, but also the amount of time, effort & resource you should consider allocating to your project: Keep this question in mind when considering your best options.


2) Simple spreadsheet comparison methods
In many circumstances, users with a basic understanding of Excel can quickly and simply compare Excel workbooks for themselves with no additional worksheet comparison tools. Here are two simple methods that may meet your needs…

The subtraction method:
Many simple (and some complex) financial or numerical spreadsheets’ values can be compared using the subtraction method. The method relies on subtracting the new spreadsheet figures from the old. The result of any numerical changes will be non-zero.

Notes on this method:
Text changes are not shown. This method will only work if the rows and columns are already aligned, or are pre-aligned (see ‘Preparing spreadsheets for comparison’ section if required)


  • a) Copy the entire contents of the original spreadsheet to the clipboard (Select the small square above row 1 and to the left of column 1, then press Ctrl-C)

  • b) On a new worksheet paste the contents as VALUES (Select cell A1 then Paste-Special as Values)

  • c) Copy the entire contents of the new spreadsheet to the clipboard

  • d) On the new worksheet SUBTRACT the clipboard (Select cell A1 then Paste-Special operation:Subtract)

  • e) All unchanged numerical values will now be shown as zero, and changed numbers as non-zero. If required, conditional formatting can be used to highlight the non-zero numbers.


The IF(Different) Method

This method requires basic knowledge of Excel functions. It uses the IF() function to highlight numerical and textual value spreadsheet changes, and can be adapted to more advanced Excel comparisons (see next chapter).

Notes on this method:

This method will only work if the rows and columns are already aligned, or are pre-aligned (see ‘Preparing spreadsheets for comparison’ section if required)

  • a) If your two spreadsheets are in the same workbook, on cell A1 of a new worksheet insert the following IF() formula (replace the worksheet names Sheet1 and Sheet2 as appropriate)

    =IF(Sheet1!A1<>Sheet2!A1,"DIFF:"&Sheet2!A1,"")

    If your two spreadsheets are in different workbooks, create a blank new workbook and on cell A1 of the first worksheet use the following IF() formula  (replace the workbook names Book1 and Book2, and worksheet names Sheet1 and Sheet2 as appropriate, taking care to keep the brackets)

    =IF([Book1]Sheet1!A1<>[Book2]Sheet2!A1,"DIFF:"&[Book2]Sheet2!A1,"")

  • b) Create an instance of this formula in every cell that is occupied in the new or original spreadsheets by dragging the formula down the required number of rows, and then across the required number of columns.

  • c) Each cell value that has changed will be start with the text ‘DIFF:’ and then display the new value (Note that a prefix such as DIFF: is required in case the original value has been deleted completely in the new version)


3) Advanced user spreadsheet comparison methods

Once the basics of the IF(Different) method is understood it can be extended to compare workbooks for numerous requirements. Conditional formatting can be used to further enhance the differences. Here are a few samples to whet your appetite …


Only Show changes to text items

=IF(ISTEXT(Sheet1!A1),IF(Sheet1!A1<>Sheet2!A1,Sheet2!A1,""),"")


Only show changes that have resulted in new errors

=IF(ISERROR(Sheet2!A1),IF(ISERROR(Sheet1!A1),"",Sheet2!A1),"")


Only show numerical differences over 10%

=IF(ISERROR(Sheet1!A1/Sheet2!A1),"",IF(ABS((Sheet1!A1-Sheet2!A1)/Sheet1!A1)>10%,Sheet2!A1,""))


Limitations of the IF(Different) method:

This method is limited to comparing the values of spreadsheets. VBA or user defined functions must be used for the comparison of formulae. Whilst VBA/UDFs are out of scope for this document, those with the required knowledge may consider implementing the following UDF to return the formula of a cell, which can subsequently be compared using one of the methods described above.


Public Function CellAsFormula(rngCell As Range) As String
    CellAsFormula = CStr(rngCell.Formula)
End Function


4) Third party spreadsheet comparisons tools

There are numerous third party Excel comparison and worksheet comparison tools available. These range from straight file utilities, not specifically designed for Excel comparison, through to enterprise level dedicated spreadsheet monitoring tools.


Unlike the comparison methods above, nearly all of the third party Excel comparison tools offer a basic difference report and a simple menu that mean that a quick comparison report is always available from the standard Excel interface. For many users these may be significant deciding factors in choosing whether to compare Excel yourself or purchase a software tool.


At any price point, the variation in quality is surprisingly significant, so before selection it is advisable to avail yourself of one or more of the many ‘free-trial’ offers available from most of the vendors, then test it fully against a variety of your spreadsheets. Be sure that any trials that only report the ‘first’ x number of differences or ‘x’ rows of data will definitely meet your needs when they potentially come to swamp you with all the differences.


At the time of writing (early 2010) many companies are in a transitional stage from the established Excel 2003 to the implementation of Excel 2007 or Excel 2010. if you are currently using Excel 2003 or earlier, ensure the software will work not just in 2007/10 but also with Excel 2007 / 2010 files (which are a completely different format and can exceed 1m rows)


5) Preparing spreadsheets for comparison

If using one of the more advanced third party comparison toolkits, then no preparation of your comparison workbooks should be required, but for those undertaking their own Excel comparisons or using less expensive toolkits, the following items should be given consideration …


Mis-matched rows & columns


This is the most common barrier to successful like-for-like comparisons. Auditors Assistant is one of the few workbook comparison toolkits that provide fully automated row & column matching, although more and more third party tools offer ‘semi’ automated matching, where the user identifies the location of  new / deleted  / moved rows or columns for the software in advance. However if you are undertaking your own comparison and need to allow for changes in rows or columns, all is not lost!


Option 1: Manually match rows and columns

Although this can be arduous, it is ultimately the simplest approach to enable a comparison of worksheets that have dissimilar row and column structures. It has the advantage of subsequently being able to be compared quickly using one of the simple comparison techniques described at the beginning of this article, with any structural mistakes being easy to rectify.

  • a) Make a copy of the two worksheets for comparison (ie leave the originals untouched).

  • b) Show the original spreadsheet in one window, and the new worksheet in a second window.

  • c) Arrange the windows, side by side, then move, add or delete rows and columns as appropriate until the two structures are the same.

  • d) Run the comparison manually or through your comparison toolkit as normal.


Option 2: Use an index row / column

When there are numerous row / column changes or when moving / inserting / removing columns is impractical then it may be easier to use an index row and/or column.

  • a) Make a copy of the two worksheets for comparison (ie leave the originals untouched)

  • b) For row changes, insert two blank columns in columns 1 & 2

  • c) The first column will be used to hold the row number: Enter 1 in cell A1 and 2 in cell A2 … drag the numbers down so that every row containing data has its ‘row’ number in the first column

  • d) The second blank column will be known as the ‘index’ column: Enter the number ‘1’ in the index column of the original worksheet, at the first row containing data.

  • e) Enter a corresponding ‘1’ in the index column of the new worksheet, in the row that should be matched to the original worksheet.

  • f) Enter the number 2 in the corresponding rows of the index column of the two worksheets.

  • g) Continue entering corresponding numbers to each row that should be matched to each another.

  • h) Once all corresponding rows have been matched, sort both worksheets using the index column (column 2) as the key: There may or may not be some new / deleted rows left over at the bottom.

  • i) If columns also need to be mapped, repeat steps b) to h) for column matches (by inserting 2 blank rows)

  • j) Use your workbook comparison software or manual comparison method to highlight the changes … you can identify the row where the change was made from the row number in the first column.

  • k) If required, the data can be re-sorted to its original order by using the sort command with the first column and/or row as the key.


Option 3: Sort the worksheets

Where one column contains headers or data that are known to be the same in both original and new worksheets, then a simplified version of the index row/column can often be used. Simply sort both worksheets alphabetically using the unique header column or row.


Hidden Rows & Columns

Many spreadsheets are designed with hidden rows or columns. There is a relatively high incidence of error in these hidden rows or columns, as changes made to the spreadsheet over time often overlook the need for a corresponding change in a hidden row or column. The safest practice is to unhide all hidden rows or columns before comparing Excel worksheets.


Hidden & Very Hidden Worksheets

If a full or audit level workbook comparison is required, then consideration should be given to worksheets that are hidden or very hidden. Hidden worksheets are easily identified and made visible: Simply right-click on any worksheet tab - If there are any hidden sheets the menu item ‘Unhide’ will be enabled, which when pressed will show the list of hidden sheets.

‘Very hidden’ sheets are a feature available only within VBA or the Visual Basic Editor. Unless your spreadsheet has been professionally developed it is highly unlikely they will contain hidden worksheets. However, if certainty is required, a check can be made by taking the following steps …

  • a) From within Excel, display the Visual Basic Editor by pressing Alt-F11

  • b) In the ‘Project’ section of the window, select the workbook being reviewed, then expand the tree until you see ‘Microsoft Excel Objects’. Expand this section to show a list of all the worksheets in the workbook.

  • c) Press F4 to display the ‘Properties’ box

  • d) Each worksheet will have a ‘Visible’ property, set as xlSheetVisible, xlSheetHidden or xlSheetVeryHidden: If necessary change any sheet(s) to xlSheetVisible


6) Pitfalls

This is a checklist of potential pitfalls when comparing data.  Due the vast nature of spreadsheet structure, content and requirements, it is not possible to make generic recommendations, however if any of your spreadsheets contain these items, beware of drawing absolute conclusions from your comparisons.

  • a) Hidden rows & columns

    Frequently the source of forgotten changes

  • b) Formulae

    The comparison is best undertaken by somebody of expert Excel / VBA knowledge or with a third party Excel compare toolkit. A change in formula can often give the same result, leading to incorrect workbook comparison conclusions, so this is an extremely worthwhile comparison to make for important spreadsheets.

  • c) Linked worksheets

    If comparing formulae that references another workbook, leave the linked workbook(s) CLOSED. That way the full path of the link is held in the formula, and path differences can also be ascertained.  Consider extending your comparison project to the linked files themselves.

  • d) Macros/VBA

    One single accidental character change can completely alter workbook results. If comparing business critical workbooks that contain macros/VBA then this type of comparison should definitely be undertaken if at all possible. Be particularly wary of code that needs to be changed every year or every time the report needs to be run—these have a higher incidence of potential error—Professional code is very rarely designed this way.

  • e) Protected structures

    If the structure of the worksheet has been password protected, this may prevent the user form viewing / un-hiding data. If performing a manual comparison consider using the IF(Different) formula on a new (unprotected) sheet to avoid any hidden data issues.

  • f) Shared Workbooks

    It is not possible to review the macros in a shared workbook. If a VBA comparison needs to be undertaken, un-share a copy of the workbook, then compare the VBA.

  • g) Range names

    Check any range names used are still referencing or returning the same data or dataset. Be particularly wary or range names referencing linked files or external data sources.

  • h) Hidden worksheets and very hidden worksheets

    Discussed in the previous chapter.

  • i) Mixed Pre and Post 2007 workbooks

    Comparing 2007/2010 workbooks against  pre Excel 2007 workbooks may cause problems for some third party products. It may be necessary to re-save an old file to the new file format before undertaking the comparison of the Excel files.


7) And finally ….

Once you have compared your workbook, you may have identified certain worksheets, rows, columns, sections of data or even VBA that you do not want changed in future. Consideration should be given to protecting this data from accidental (or deliberate) change …. this is a worthwhile time investment for important spreadsheets. A link to trustworthy information and directions on workbook protection is listed at the bottom of this sheet


We hope this short article has given you an insight into the methods, options and potential pitfalls of comparing Excel files. Comparing the structure and results of a known ‘correct’ spreadsheet to a newly developed or frequently saved workbooks a key element in corporate Excel governance, and you may be interested to read our guide on how Auditors Assistant can be used to gain and maintain control of the ever changing critical spreadsheets in your organization elsewhere on this website.



Further reading: Workbook Protection

http://spreadsheetpage.com/index.php/tip/protecting_cells_sheets_workbooks_and_files

Copyright AuditorsAssistant.com 2009-10. This guide is not to be copied in whole or part, although linking to this webpage is permitted.