Dear Spreadsheet Whisperer,

I am collecting data for my school to determine and address inequities. This ranges from test scores and discipline data (easily found) to who participated in clubs and who is an officer to who gets school-selected awards. Staff and student demographics. Some will have a common ID number to connect them, but not all. We will be removing names and using percentages and likelihood scenarios ultimately for big picture. Any setup advice?

Sincerely, Gettin’ Schooled

Dear GS,

First of all – WOW! What a wonderful undertaking and what a sorely needed and sure-to-be-impactful effort. I am truly honored that you brought me in to offer advice. As ‘luck’ would have it, I *do* have ideas for how you can set this up for success. I am also going to need to make some assumptions about your data and your school, so first let me make those clear. If I’m off-base here, I hope you’ll correct me so that this advice can be as helpful as possible!

First of all, I’m guessing that you are at a public school because you mentioned standardized tests. Second of all, I’m guessing that your school is racially diverse, which, at least in the School District of Philadelphia, is notable. Because schools are so deeply segregated, most individual schools have a lot less racial diversity than I would wish or expect. So one piece of (this part is unsolicited…) advice that I would offer is that you look at median income (or another metric that points to socio-economic status) and race, to the extent that you have access to these data and tools to analyze them, and not race alone. Also, unless race is self-reported (which in a lot of schools, unfortunately, it isn’t), it is easy to miss out on important distinctions, like that multiracial is a different identity/experience than Black or African-American, and not easy to tell just by sight (which is true of all demographics, of course, but this is super duper true and needs to be further emphasized!).

Next, I want to ask you about the data you are working with. We need to figure out how to organize and consolidate it to a few spreadsheets or tabs. Another assumption here – I am assuming you are doing most/all of the work in Google Sheets or Microsoft Excel (not R, Stata, or other statistic/analysis platforms). If yes, you’re in good company. That’s exactly what I do!

First thing’s first. You need a list of your students. What are the columns in that sheet going to be? Well, from your initial note, I can extrapolate:

  • Name or Identifier
  • Address (street, state, city, zip)
  • Race or ethnicity
  • Date of birth or age
  • Grade (year in school)
  • Etc

Each row represents one student.

Beyond students, we get into thornier questions that database managers call “one-to-many” relationships. For example… test scores! Each student has more than one test score… one from 2020, one from 2019, etc. If there’s truly only one test score per student, you can add them as a column to the Students spreadsheet. But … if there are multiple tests per year, or multiple subjects per test… we would quickly add WAY too many columns for this to be a useful spreadsheet. Once you hit this tipping point, the next step is to create a separate tab or sheet that represents test scores over many years, ideally with a student ID that links back to the Student tab.

Let’s assume that you have many years of test scores for each student. Your end goal is to have a spreadsheet with columns that include:

  • Student name or identifier
  • Name of test
  • Year of test
  • Test score

Each row represents one student’s results for one exam “sitting.” A student can take the same exam more than once, or different exams entirely. You can quickly end up with quite a few rows! This is not a problem – things get unwieldy, in my experience, when you begin to have too many columns. Rows are fairly unlimited!

I would assume that you would have separate spreadsheets for each “thing” you are measuring. All of these would use the similar “one to many” concept. Such as:

  • Extracurriculars (columns might include: Student name or identifier; Activity type; Activity Name; Season or Semester; Student role)
  • Discipline (columns might include: Student name or identifier; Discipline Type; Incident; Date; Consequence; Parent contact; Staff/Admin Point Person)
  • Awards (columns might include: Student name or identifier; Award Name; Date)
  • Standardized Tests
  • Report cards (or GPA)
  • Other things you want to track

Unfortunately, you might need to do some data cleaning/standardizing to get things into a consistent and usable format, but once you do, it should be pretty smooth sailin’ from there.

I’m going to introduce some spreadsheet steps at a high level here. If you want to get down and dirty with formulas, or you want to set up an hour with me to do some of this work together, let me know! Otherwise, I’ll stick to strategy and principles so that this blog post can hopefully help other people, too.

I would start by storing your Student Spreadsheet as your “Reference Table.” You can do this either by making a mental shift, changing the color of the spreadsheet tab, or giving it a “name” using a spreadsheet tip called Named Ranges. You will want to reference this table in all of your other tabs… and you can do this with a VLOOKUP function as long as there is a “unique identifier” for each student that re-appears in each tab. We both know that that won’t always be possible, but where it is possible, it will help you get way further, way faster.

With your handy dandy VLOOKUP, you can “bring” data from your Student Reference Table over to your other tables. For example, you can find “Race/Ethnicity” for each person and make it appear on each row, in each tab. This will help you be able to make data insights (summaries, charts, graphs, etc) without having to re-type the student’s racial identity each time… which is otherwise both time consuming and error prone! Yippee!

I just explained how to take data from the Student Reference Tab and populate data in the other tabs. But what if you want to go in the other direction? Remember, the other tabs exist because we are trying to represent a “one to many” relationship. A student might have many extracurriculars, many test scores, many awards, or many discipline entries. It would be very helpful to summarize those for easy access on the Student Reference Tab, don’t you think?

The steps to do that would require using a “Sum If” formula. You can create a new column in the Student Reference Tab… let’s call it “Number of Awards.” Then, you can count the number of times that the student’s name or unique ID shows up on the Awards tab. If you add to the Awards Tab, this formula can be smart enough to automatically update. Pretty snazzy, right? And we can go beyond sum/counts! What if you want average test scores? Best test score (asset framing, not deficit thinking… ya with me?). Spreadsheets can handle all of that math for you, as long as you feed the spreadsheet the data and the formula properly. This can be tricky at first, but I know you are up to the challenge!

There’s one more piece of advice I want to leave you with (shout out to my partner who has been hammering this concept home with his students!). In determining if something is equitable or not, it’s important to have some sort of baseline to compare your data to. In this case, are you comparing your data to your overall student demographics in your school? To demographics of youth under 18 in the city or county where you work? To national test statistics? I hope that you go into your analysis with a degree of humility and curiosity, which I can already tell you have from your note. And while I want you to maintain an open mind, I think it’s also helpful to have a hypothesis or working definition going into it – which will help you draw a conclusion about whether or not your results demonstrate a lack of equity. For example, does equity mean that the ratio of student discipline records by race is the same as student population by race? If yes, what happens if your results show inequity? Do you know that all of the discipline records were recorded? Do some students misbehave more or are they just punished more severely for the same behavior as other students? Do you think that the consequences are metered out in a fair and consistent way? Do you need to change the behavior of the students or the behavior of the disciplinarians? These are juicy questions and data can HELP you answer them, but I don’t think data can answer them solo.

Let me offer an example. At some private prep schools, there are rampant problems with student theft. In my made-up prep school, the population is 50% white and 50% students of color. White students are frequently caught stealing tennis rackets, electronics, and fancy shoes from their peers. Their consequence is typically a stern talking to and an apology letter. Recently, a student of color was caught stealing an iPod and was disciplined with a detention. What would the data tell us? The data would tell us: 1. White students are stealing more than students of color. They are disproportionately guilty of theft! 2. Students of color are receiving more harsh punishments for the same action. They are disproportionately punished! Do you see how the same data can lead you to very different conclusions? They both represent lack of equity compared to an arbitrary amount of stealing (which should be the same across racial groups in the population) but what you do about it could be very different, and that will be informed by the way you ask questions!

So, Schooled, my dear, I hope that this blog post sets you up for success and I hope that I can be a resource to you as you go through this project. I would love to hear about what you come up with! In the mean time, I am rooting for you and cheering you on!

With spreadsheet whispers and utmost confidence in you, s.

2 thoughts on “Dear Spreadsheet Whisperer: Gettin’ Schooled

Leave a Reply