Ok Ok! I’m a good Uncle. My neice called me up a few minutes ago from work. Thinking I would have the answer to her problems. However, I rarely if ever use MS Excel so I could not be of much help to her. This is what she needs to do. And she is in no hurry.
She needs to take out all the duplicate info in a large excel document. It goes like this if I understand corectly.
First Name - Last Name - Address1 - Address2 - City - State - Zip
Apparently there are duplicates in a 5000 name list. She need’s to weed them out.
Can this be done?
Also, separate issue, is it possible to separate out zip codes from an address line? How does one take info from one cell and separate it out into two cells?
Folks if you can help it would be wondrous, she’s pretty bumbed to have to do it all on her own. Also if there is aprogram to buy that is fine too, but if a formula would do it that would be best. She’s running Excel 2000. Thanks folks.
Manually, I’d highlight all the data then “sort” by one of the columns that would show duplicates (maybe by name first, then address.)
The duplicates will then appear next to each other, and you could simply highlight & delete them. If it’s a huge list, that may be a little time consuming. (Though scrolling down, duplicates should stand out visually).
There’s probably a way to do this in Access too.
A trick to split up info in one cell might be to save the file as a tab delimited format, then re-import it into Excel. That way, you can put column breaks wherever you please.
I know you can use the formulas LEFT and RIGHT to select a certain number of characters from a string of characters. This would only work if all zip codes have the same number of characters though, and being in the UK, my knowledge of zip codes is, well, zip.
If she can get the data into Access, she could set up a query to find duplicates.
If you import the Sheet into Access, you can create a “find duplicates” query. If just using Excel, I’ve always just sorted and removed duplicates visually.
She does have access. Hmm. So there is no way to do it right in excel? The problem I think she is having is that all the first names have a Mr. or Mrs. in front of them. And some of the names are already together. ex: Mr. and Mrs. John Q. Pinochle
The manual approach is good - but not really for 5K rows. Here’s a method I have used a few times.
First, I’m assuming the data is stored in a single column - columnA
sort columnA
in cell B2, type this formula =IF(A2=A1, 1, 0)
if the contents of row2 = row1, B2 will be set to ‘1’, otherwise ‘0’ - which may or may not display, depending on your settings
Set B1 = 0, as it can’t be a duplicate
Copy B2 down 5000 rows to the end of your data
Column B now shows a flag for every duplicate row.
select column B and copy it. paste the values over the top of the formula - use Paste Special and choose Paste Values
Now select columns A & B and sort them both based on ColumnB
All of the duplicates will be at the bottom of the list - go ahead and delete them.
Russell
What level of duplicates does she need to remove?
e.g. is Mr. and Mrs. John Q. Pinochle at some address the same antry as Mr. John Q. Pinochle at the same address
Russell
Yes, they are the same. Mr. P at 1234 Good Street, Essex, CT 12345. She needs to know if there are duplicates within the 5k List. the names, and Addy’s would be the same. I think. I’m going to copy/paste this into an email and see if it works for her.