Need advice: Excel data comparison

mandala · March 28, 2014, 4:54am

Background:
I have a huge Excel spreadsheet with something like 5000 rows and 70 columns packed with dense data (text and numbers, no formulae). That was an extract from a data source about 3 months ago. I am just getting another huger spreadsheet current as of yesterday. I am supposed to transform this data in several ways and even upload a portion of it back to another database.

Is there an easy way to compare these two Excel sheets, row by row, highlighting cells that are not the same? I can then work on only those (changed) cells, hugely saving my time.

Is there an easy way to accomplish this? I am no Excel or VBA expert, but can handle small macros or VBA procedures. Any 3rd party tools available?

Sincerely appreciate your help!

Dave_Hartwick · March 28, 2014, 5:26am

That’s not all that much data.

Paste the first extract in a new workbook, Sheet1.

Paste the second to Sheet2.

On Sheet3 in A1, type “=Sheet1!A1=Sheet2!A1”.

Copy that to all the cells that apply. Any that aren’t exactly the same will return false. Use conditional formatting to make the bad results stand out.

You could use “=IF(Sheet1!A1=Sheet2!A1,0,1)”, which will return a 0 if the two cells match and a 1 if they differ. Sum each row and use autofilter to see only the rows with variant records.

Or return the variant data itself by changing the final “1” in that formula to “Sheet2!A1”.
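
For anyone who would rather script this than fill a third sheet with formulas, here is a minimal Python sketch of the same cell-by-cell idea. The two small lists are made-up stand-ins for Sheet1 and Sheet2, and `diff_flags` is a hypothetical helper name, not anything built into Excel. Like the formulas, it assumes both extracts are in the same row order.

```python
# Hypothetical stand-ins for Sheet1 and Sheet2: each sheet is a list of rows.
sheet1 = [
    ["E001", "Alice", 50000],
    ["E002", "Bob", 42000],
]
sheet2 = [
    ["E001", "Alice", 52000],
    ["E002", "Bob", 42000],
]


def diff_flags(old, new):
    """Return a grid of flags: 0 where the cells match, 1 where they differ.

    zip() stops at the shorter sheet, so extra rows in the longer
    sheet are not compared at all.
    """
    return [
        [0 if a == b else 1 for a, b in zip(old_row, new_row)]
        for old_row, new_row in zip(old, new)
    ]


flags = diff_flags(sheet1, sheet2)

# 1-based row numbers (as Excel counts them) whose flags sum to more than zero.
changed_rows = [i + 1 for i, row in enumerate(flags) if sum(row) > 0]

print(flags)
print(changed_rows)
```

Summing each row of flags plays the same role as the sum-and-autofilter step: only rows with a non-zero total need attention.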

mandala · March 28, 2014, 5:35am

Thanks for this! I will certainly try it.

mcgato · March 28, 2014, 12:28pm

I use something like:
=IF(Sheet1!A1<>Sheet2!A1,"Different","")

That way, the cells that are the same are blank. It just makes it easier to find the different cells. Though conditional formatting would do the same thing, just one more step.

gigi · March 28, 2014, 3:28pm

Can you guys expand on this a little? Do these commands compare the exact same cell in both sheets, requiring that both sheets be sorted the same beforehand? If the second sheet is “huger”, aren’t whole sections of it going to be different from the first one? I usually do some sort of lookup and then an +EXACT(x,y) so I know I am comparing info on the same key data point, if that makes sense.

BrokenBriton · March 28, 2014, 3:36pm

Yep, what has been described here is cell vs. cell.

VLookup is a far better option at any time, but especially if the two ranges are of different sizes.

CookingWithGas · March 29, 2014, 1:55am

For two sheets, put them both in the same file (if they’re not already) and use conditional formatting to highlight the cells that don’t match.

In the first sheet, select the entire sheet (click upper left corner). Then add a conditional formatting rule (“Use a formula…”) that looks like

=A1<>Sheet2!A1

Then in Sheet1, all cells that don’t match Sheet2 will be highlighted with whatever formatting you chose for the rule.

Whether this works across sheets depends on your version of Excel.

gigi · March 29, 2014, 5:02pm

But again, it means you are comparing the exact same rows. I’m concerned that the OP said the second file was “huger” so it may mean they no longer line up…?

Dave_Hartwick · March 30, 2014, 2:34am

Could be. Could be that the size difference is due to appended rows. OP asked for cell to cell comparison, so I went with a way to check that.

Could be that a concatenated match or something is required, due to a lack of unique IDs and no good way to sort the records. It’s impossible to know the requirements from the average message board Excel query, so I tend to just guess a simple solution and see if that works. Often the OP never comes back because the suggested solution worked, a solution was found elsewhere, the project got chucked, the OP figured it out on her own, etc.

StGermain · March 30, 2014, 4:35am

Start by identifying a unique value, such as invoice number. If there are no unique values, you may have to create one. For example, take invoice number and item number and concatenate them into one cell (=CONCATENATE(A2,"-",B2) will give you the value in A2, a dash, then the value in B2: A2-B2). Then you have a unique value for every item in every invoice. Do that to the other sheet, using the same values. Then run a V-lookup on the unique value to the column you’ve created.

Do the V-lookup on both sheets, not just the larger one, so you can tell if there’s any info on the smaller sheet that doesn’t match. People too often only check the sheet they’re expecting discrepancies in, and forget that there can be errors on both sheets.
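
As a rough illustration of the two-way check, here is a small Python sketch. It assumes invoice number and item number sit in the first two columns; the data, `make_key` and `missing_keys` are all invented for the example.

```python
def make_key(row, key_columns):
    """Join the chosen columns with a dash, like =CONCATENATE(A2,"-",B2)."""
    return "-".join(str(row[c]) for c in key_columns)


def missing_keys(source, target, key_columns):
    """Return the keys found in source that have no match in target."""
    target_keys = {make_key(r, key_columns) for r in target}
    source_keys = {make_key(r, key_columns) for r in source}
    return sorted(source_keys - target_keys)


# Hypothetical data: invoice number, item number, amount.
old_sheet = [
    [1001, 1, 25.0],
    [1001, 2, 40.0],
    [1002, 1, 15.0],
]
new_sheet = [
    [1001, 1, 25.0],
    [1002, 1, 15.0],
    [1003, 1, 60.0],
]

# Run the lookup in both directions, not just against the larger sheet.
only_in_old = missing_keys(old_sheet, new_sheet, [0, 1])
only_in_new = missing_keys(new_sheet, old_sheet, [0, 1])

print(only_in_old)
print(only_in_new)
```

Checking in one direction only would have missed the item that dropped out of the newer sheet.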

StG

mandala · March 30, 2014, 4:45am

Thanks to all responders. The second file contains a few hundred additional rows, including duplicates in what I expected would be unique values (employee ID). Since this extract is from a Lotus Notes non-relational database, anything goes.

Now I was looking to simplify my work by comparing matching rows (based on employee ID) and seeing if any cells were changed. Now since rows contain duplicated identifiers this simple scheme will no longer work.
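
A quick way to see how bad the duplication is would be to count how often each ID occurs. This is only a sketch in Python with invented data; `duplicated_ids` is a hypothetical helper and it assumes the employee ID is in the first column.

```python
from collections import Counter


def duplicated_ids(rows, id_column=0):
    """Return {id: count} for every ID that appears more than once."""
    counts = Counter(row[id_column] for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical extract: employee ID, name.
extract = [
    ["E001", "Alice"],
    ["E002", "Bob"],
    ["E002", "Robert"],
    ["E003", "Carol"],
]

dupes = duplicated_ids(extract)
print(dupes)
```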

Dave_Hartwick · March 30, 2014, 7:57am

As mentioned, you can concatenate fields to create a unique ID. Employee number and date/time is a typical method. It’s simple: =A2&B2 combines the contents of the references.

People have also mentioned VLOOKUP. I prefer OFFSET and MATCH. Either way the methods mentioned already will work. I’ll use VLOOKUP because people know it better:

=VLOOKUP(Sheet3!A2,Sheet1!$A$2:$B$5,COLUMN(Sheet3!B2),FALSE)=VLOOKUP(Sheet3!A2,Sheet2!$A$2:$B$5,COLUMN(Sheet3!B2),FALSE)

…Returns a true or false. I’d suggest that you determine a concatenated identifier, add the column to both sets of data, copy BOTH sets of IDs into a single column on another sheet, and then remove any duplicates, thereby ensuring that you’ve got them all. You’ll get an #N/A if the ID does not appear in one of the sets. You could add IF plus ISNA to add a flag telling you which set is missing the record.

All this assumes that it is indeed possible to create a unique ID. In my experience, it always has been.
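
For what it’s worth, the same logic can be sketched outside Excel. The Python below is an illustration with made-up data, not a finished tool. It assumes each set has already been keyed by a unique ID (here a dict from ID to row), and `compare_by_id` is a hypothetical name.

```python
def compare_by_id(old, new, value_column):
    """Compare one column across two ID-keyed sets of rows.

    Returns a dict of ID -> True/False when the ID exists in both sets,
    or a text flag saying which set is missing the record (the
    counterpart of wrapping the lookup in IF plus ISNA).
    """
    result = {}
    # Union of both ID lists, so no record is overlooked.
    for key in sorted(set(old) | set(new)):
        if key not in old:
            result[key] = "missing from first set"
        elif key not in new:
            result[key] = "missing from second set"
        else:
            result[key] = old[key][value_column] == new[key][value_column]
    return result


# Hypothetical data: ID, department.
first = {
    "E001": ["E001", "Sales"],
    "E002": ["E002", "Ops"],
    "E003": ["E003", "HR"],
}
second = {
    "E001": ["E001", "Sales"],
    "E002": ["E002", "Finance"],
    "E004": ["E004", "IT"],
}

report = compare_by_id(first, second, value_column=1)
for key, status in report.items():
    print(key, status)
```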

Reply · March 30, 2014, 9:32am

You can also upload it to a Google Sheet and use the free Remove Duplicates addon.

don_t_ask · March 30, 2014, 9:46am

The same function exists in Excel.

Simply copy both lots of data to one sheet, sort them by employee ID and then use the function. You will now have only one row for the unchanged ones and multiple rows for the changed ones. The next step depends on what you are actually trying to output.
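
A minimal sketch of that combine, sort and de-duplicate approach in Python, using invented data and a hypothetical `changed_ids` helper. It assumes the ID is in the first column and that all cells are text, so the rows sort cleanly.

```python
from collections import defaultdict


def changed_ids(old, new, id_column=0):
    """Return {id: rows} for IDs left with more than one distinct row."""
    # Combine both extracts and drop rows identical in every column.
    unique_rows = {tuple(row) for row in old + new}
    rows_per_id = defaultdict(list)
    for row in sorted(unique_rows):
        rows_per_id[row[id_column]].append(row)
    return {key: rows for key, rows in rows_per_id.items() if len(rows) > 1}


# Hypothetical extracts: employee ID, department.
old_extract = [
    ["E001", "Sales"],
    ["E002", "Ops"],
]
new_extract = [
    ["E001", "Sales"],
    ["E002", "Finance"],
    ["E003", "IT"],
]

changed = changed_ids(old_extract, new_extract)
print(changed)
```

Note that an ID left with a single row is either unchanged or present in only one extract, so new records need a separate check.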
