The Pigeonhole Principle

Emily DeHoff
5 min readMay 3, 2022

This is the second article in So You Think You Can Count: A Series on Combinatorics.

Pigeons sitting on a roof.
Photo by Julie Boyne on Unsplash

It’s early in the morning. It’s cold. It’s dark. You want to put on some socks but you’re definitely not ready to turn the lights on yet. You know there are three colors of socks in the drawer. How many individual socks do you need to pull out to guarantee a matching pair? (You know, if that’s important to you.)

This is the kind of question the pigeonhole principle (PHP) was designed to answer. The basic pigeonhole principle states the following:

If more than n objects are distributed among n boxes, then some box must contain at least two objects.

As a classic example, imagine you have 5 pigeons and 4 pigeonholes (those little cubbies the carrier pigeons nest in). We can start by putting one pigeon in each hole, but no matter how we arrange them, we’re always going to have the fifth pigeon left over. This means some hole will need to have two pigeons in it.

Four pigeons each in their own cubby, with a fifth sitting outside. Then all five pigeons in cubbies with one cubby containing two birds.

So how do we actually use the PHP? Let’s break it down in the context of our sock example.

  1. Determine the worst case scenario.
    The idea here is to imagine getting as far as you possibly can without reaching the desired outcome. In the case of our sock conundrum, we could, in theory, grab one of each color, giving us three socks in our hand and no matches. Notice that as soon as we pull a fourth sock, it’s going to have to match one of the three socks we already grabbed, so pulling three unmatched socks is our worst case scenario.
  2. Add one to the result from step 1.
    Take the number we found above (three for our socks) and add one. This is now the number of socks we need to pull out in order to guarantee a match. This doesn’t mean we’ll always have to grab four. Sometimes we might get a match with the first two socks. Other times it might take three picks to get a pair. Four is just the fewest number of socks that will guarantee we have a matching pair.

Here’s what really tripped me up in the beginning: the number generated by the PHP isn’t necessarily the exact moment when you’ll reach your desired outcome. This has nothing to do with probability or the likelihood of getting your desired outcome. This is an all or nothing kind of game. We only want to know when there is a 100% chance that we’ll have a pair of socks. Remember, this is discrete math. We’re either guaranteed to have a match or we’re not. We don’t care about the gradation in the middle.

The cool thing about the PHP is that it shows up in a lot of very different situations. Let’s play with a few examples so we can start to build some intuition around how this principle works:

If you’re given a standard, shuffled, deck of cards, how many cards must you draw before you’re guaranteed to hold a spade?

First, what is the worst possible scenario? In this situation, we could rephrase this by asking what is the largest number of cards we could possibly draw without getting a spade? In theory, we could draw all of the hearts (there’s 13 of those), all of the clubs (another 13), and all of the diamonds (again, 13). At this point, we’ve run out of non spade cards that we could draw. So, our worst case scenario involves drawing 39 cards.

Once you’ve drawn 39 cards without coming across a spade, the only cards left in the deck must be spades. So by the time you draw your 40th card, you are guaranteed to have a spade in your hand. We can’t say the same about your 14th card, or even your 39th card. This only happens at that tipping point, when you’ve exhausted all your other options and there’s no longer any way to avoid your desired result.

Now let’s try something a little more complex.

In any group of 6 people, there are either 3 mutual friends or 3 mutual strangers.

This comes from an area of discrete math called Ramsey theory. Simon Pampena does a wonderful job of explaining this in the video below (we all know by now how much I love Numberphile videos). He doesn’t explicitly mention the pigeonhole principle, but see if you can spot how it shows up in his solution.

A natural thing to do when you discover a cool new math toy is to use it on everything. In developing you mathematical intuition, it’s often just as helpful to see nonexamples as it is to see examples. Here is a situation where you can’t use the pigeonhole principle (even though you might want to).

Given a fair 6-sided dice, how many times must you roll it before you’re guaranteed to get a three?

See how close this sounds to the question about drawing a spade from a deck of cards? The trick here is that the acts of drawing cards from a deck are dependent on each other. You’re actually changing the pool of cards you’re drawing from each time you take a card. This is not the case when rolling a dice. Each time you roll, the outcome has absolutely nothing to do with the result of the previous roll. You may be thinking to yourself, “But wait, isn’t a three supposed to show up roughly 1/6 of the time? Shouldn’t that have something to do with this?”. And this, my friend, is the dangerous mudslide called the Law of Large Numbers. Yes, when you roll a dice hundreds of thousands of times, you will get a three approximately 1/6 of the time. However that is only true when you zoom way out and look at a massive number of rolls. When you zoom back in to our time scale, there is nothing governing when and how often a three will show up. In theory, you could go 20 or 30 rolls without ever seeing a three.

Bonus Round

If you’re ready to take your pigeonholing skills to the next level, see if you can explain why the following statement is always true (and then compare notes with Jonathan Kim Sing):

If you pick 7 numbers between 1 and 11, there will always be a pair of numbers that sums to 12.

And just for fun, here’s one of my favorite examples of the PHP, from the 2002 Putnam exam.

Given any 5 distinct points on the surface of a sphere, show that we can find a closed hemisphere which contains at least 4 of them.

Can you think of more examples of the pigeonhole principle in action? Throw them in the comments and see if others can work them out too!

--

--