Part 4 in a mini-series exploring the great, big world of “de-duping” – or, how to handle repeated info in a spreadsheet or database. In this installment: de-duping rows in a Google Sheets spreadsheet (and some more thoughts about de-duping in general). To catch up on this series, check out Duped Again!, Re-member-ships, and Dupes / dupes / dupes (Excel).
When it comes to having children, multiples are statistically unlikely. Twins roll in at a rate of around 33/1,000 births (and that’s after a 76% increase from 1980-2011!). Triplets are even more unusual – 104/100,000 births. Ah, if only multiples were as rare in databases as they are in the general public. Then again, I might be out of a job if that was true!
I think changemakers are more like bears. According to the National Bear Association (yes, I checked!), bears tend to have litters of 2-3 cub-lets, sometimes as many as 5. (This rings true duplicate dilemmas that some of my clients face!). Just like bears, changemakers are cute, protective of our young, and fearsome if you get on our bad sides.
Bear with me
GoogleSheets doesn’t have nearly the suite of de-dupin’ options that Excel offers, so bear with me (#PunIntended) while I get theoretical for a moment.
When it comes to de-duping, it all depends on what you select.
What you select? You might be scratching your head right now. And I don’t blame you! “Selecting” is spreadsheet jargon for clicking or highlighting. When you click anywhere in a spreadsheet, you’ll see your cell light up with a border around it. If you click and drag, you’ll make a box around any cells that fall inside of the area you dragged. In both of those cases, you made a selection. You can also select an entire row or column at a time, or even an entire sheet!
When you are de-duping, what you select makes a BIG OLE DIFFERENCE! If you select only one column, Google Sheets AND Microsoft Excel will de-dupe that column’s data without looking at the data in the adjacent columns. Why is this important?
Let’s say you had a spreadsheet that looked like this:
|Malcolm X||(xxx) xxx-xxxx|
|Malcolm X||(yyy) yyy-yyyy|
If you had selected JUST the “Name” column, then your spreadsheet would return just “one” Malcolm X (he was truly one of a kind!) However, if you had selected both rows and both columns, and then de-duped, your spreadsheet would return… well… all of your data, because the rows have different phone numbers. (By the way, happy birthday, Malcolm X. Rest in power!).
Moral of the story: be intentional with selecting your data. Google Sheets and Microsoft Excel will look across the whole row (within the selection) to compare against other rows for duplicates.
I’ll be honest with you: There is no classic “de-dupe” function in GS. There’s no magic button you can push. Some folks have written scripts for GS extenders, but they didn’t work well for me, plus, who wants to download an extender?? Fortunately, I have an answer for you. It’s called the “Unique” function – and you can watch a 90-second tutorial here. In a nutshell, all you have to do is click in any blank cell (I usually start in a different tab), then type =Unique( ) and inside of the parenthesis, you’ll put in your data selection — in the form of cell codes like A1:C5). In the example in the video, the YouTuber de-dupes based on data in only one column. If he had used a selection (aka range) that included other columns, GS would have copied data in the adjacent columns, too.
And that, my friends, is how you de-dupe in Google Sheets! Go ahead and give it a try – and don’t forget to write in to Dear Spreadsheet Whisperer if you get stuck along the way. I’m standing by (or if you know me well, the truth is, I’m lounging around if at all possible) and ready to help.
Ok, faithful TDAA readers, I’m going to leave you with a thought provoking quote from Malcolm X:
In the past, the greatest weapon the white man has had has been his ability to divide and conquer. If I take my hand and slap you, you don’t even feel it. It might sting you because these digits are separated. But all I have to do to put you back in your place is bring those digits together.