A few years ago, I wrote about a “lady and the tramp” metaphor for solving spreadsheet formula data discrepancies. Like eating both ends of the noodle until you smooch your boo, sometimes the only way to solve the problem is to fix the data and the formula iteratively until they meet in the middle. Well, this week I’ve been DEEP, DEEP in this metaphor. Except without noodles. Harrumph.

Sometimes fixing data discrepancies can be more like finding a hay in a needlestack (It’s always a good day on TDAA when I get to borrow from the Spongebob archive). A pokey, painful, hyperbolic, inverted, not-at-all-fun-or-adorable situation. Ouch!

e-MERGE-ing discrepancies

For example, this week we found an error in my beloved Budget Merge spreadsheet opus.

Why does the tool show that we have more awarded grant payments than we actually do? When I checked the formula, nothing looked out of place. Plus, the same formula has been chugging along for years without complaint. (Gasp – the nagging implication that this could have been wrong the whole time. Ok, squishing that thought away for now). The formula basically looks at the Grant Award Date, Grant Payment Date, Grant Strategy, and if all of those attributes match the formula criteria, then add up the payment amount. Nothing seemed out of place there.

Next, I looked at the data records themselves. It turns out that a few grants had been prematurely populated with “award dates” before the decision had transpired. It’s an honest mistake – the staff member in question was trying to be proactive because the grant is expected to be awarded on a certain day. These few grants had a “status” of “pre-approved” but an award date that was pre-populated! This simple problem was enough to wreak havoc on our formulas – and it was not so easy to find that this was the problem, since Award Date is automatically entered when a grant is Awarded, it’s not a field that I typically check for accuracy! I had to create lists of payments and review them “by hand” (by eyeball?) until I found something that looked wrong.

Photo by Negative Space on Pexels.com

Unfortunately, like solving a Rubix Cube, changing one row to get us closer to a solution can sometimes make some other formulas go out of whack. Let’s look at an example where this happened in another project…

Conflicting Criteria

In one project I’ve been working on, we have a data coverpage with overall summaries, like #People Served, #Adults, #Cases Closed, #Average Case Timespan, etc. Then, on other tabs, we calculate some of those measures in more detail, for example, broken out by age or gender. At the end of the day, the detail tabs and the summary tabs are supposed to agree. However, the instructions use different definitions! This created a problem where if we changed one of the equations, it became out of sync with the other ones. If we changed the rows themselves to meet the criteria better, then some other formula summaries went out the window. What to do????

The answer, in this case, was to slow down and go deep. Instead of looking at the spreadsheets and trying to make the rows make sense, we went back to the database where we could see more notes and details about the people and situations that the data represent. In some cases, we found some anomalies and we could directly ask the staff what’s going on. In other cases, we realized that there is a process issue that needs to be addressed. Maybe data entry is happening too soon or too late in order to get the most accurate impressions. Maybe staff need a refresher on what certain drop down options mean. Maybe there is a typo which means that our “find and replace” operations aren’t catching everything.

Aw who am I kidding – all of these issues came up in this project! I kinda felt like I did 6 loads of laundry, came out with socks I didn’t even know I had, and then figured out how to match them up. Maybe this time I’ll finally start rolling my socks into pairs BEFORE they go in the hamper so that they come out together. Can this 28 year old dog learn new tricks?

Valid Volunteers?

In another event of Spreadsheet Olympics, the challenger (me) faced a spreadsheet of volunteer records. The goal? Figure out how many individual volunteers showed up or no-showed. So, what happens if someone is BOTH AT THE SAME TIME? Houston, we have a problem.

Now, OF COURSE sometimes volunteers come and sometimes they do not. That’s the way of the world, and that’s why being a volunteer manager at a nonprofit is such a complex, compelling, highly-skilled job. (Shoutouts! You know who you are!) So for this formula, we went for whether or not a volunteer had attended the majority of their scheduled volunteer shifts within a particular commitment. It should be a pretty cut-and-dry metric, right? For each volunteer, we can use logic to get a clear answer.

The problem, of course, was that some volunteers were in the list twice due to some data entry glitches. I could not deduplicate by Contact Name since sometimes there really are two people with the same name! So instead, I deduplicated by Volunteer Commitment. Each Volunteer Commitment is connected to a roster of Volunteer Shifts. So each Volunteer Commitment can accurately tell us the Volunteer’s Status. Over time, the Volunteer might have many Volunteer Commitments (like, one for each semester for our really loyal volunteers) – but in general, the system is set up so that they should only have one active commitment at a time. The organization doesn’t offer multiple programs, so there wouldn’t be a situation where someone has one Commitment for trail clean up and a different Commitment for “tots in nature.”

Since one of the Volunteer Commitments was essentially empty (none of the Volunteer Shifts were attended, since it was a duplicate or a mistake), but the other one was active, the same volunteer could essentially be counted as Showed Up and No Showed at the same time!

Oy vey, what a headache!

What to do when the data aren’t alright

Here are a couple of strategies I’ve used recently to help get on track:

  • Deep breath and gratitude. So. There’s a data problem. They are a normal part of life and it’s SO MUCH better to know about them than to be in the dark. All data are fixable! In fact, by getting to this point, you are halfway there! Take a moment to put your hand on your heart or your belly and thank yourself for taking such good care of your data, and asking such good questions, that you have arrived at the point of identifying a discrepancy. I’m personally cheering for you.
  • Talk it out with a friend. I use this outline:
    • My columns represent…
    • My rows represent…
    • The question I need to answer is…
    • The formula steps are…
    • Other methods I have tried are…
    • My frustration level is … right now
  • Filter or summarize could become your new bff
    • With data discrepancies, sometimes you need to find rows that are missing key information. My go-to tool for that is using a Filter, which anyone can do in Excel or Google Sheets without even furrowing your brow. Use a filter on any column to find “blanks” – this might be just the cleanup the doctor ordered!
    • With data discrepancies, sometimes you have duplicates or “abandoned” rows that shouldn’t be there and are create misleading conclusions. (Think… donations for $0… or mistaken volunteer sign ups that hover like a ghost in the back of your system). My go-to tool for that is using Pivot Tables to summarize data into categories or “groups” by email, date, etc. Then, I can see how many rows are attributed to each. If I am expecting 1 per Email, then I will quickly be able to notice where there might be doubles, triples, or worse!
    • I also use Pivot Tables AND Filters to DIY check my formulas. Which ever way you start (Pivot first, formulas to check) or (formulas first, pivots to check) – they should always agree with each other, like socks coming out of the dryer. JK – we’ve already been through this. 😉
  • Take a break (and then come back)
    • My friend and I spent more than an hour talking in circles about a formula discrepancy that was off by one. One little, tiny, 1!!!! (Does this sound like your worst nightmare? It’s not mine! I had a good time problem solving with my friend, we cheered/challenged each other on, and we knew that we were better off together than alone!). Eventually, I suggested that we take a break. When I came back after a few hours, I figured out the problem in less than 5 minutes!

Once again, I want to emphasize that occasional data discrepancies are normal and okay! They will happen in any human system, even more so if the system is complicated, even more more so when we are tired and resource-constrained. I see these experiences as a learning opportunity – learning how to refine our logic, our process, our data integrity procedures, and most importantly, how to take care of our systems and our people. I’m not going to say, “don’t freak out!” because heck, we all freak out sometimes! But I will say that I have confidence in you, us, and our ability to get through this. On the other side? Pride, relief, actionable insights/conclusions, and a good night of sleep.

Leave a Reply