Find the Magic Word(le) Like a Data Scientist
Yes, we’re still playing Wordle, even if we’ve dialed back the obsessive sharing of results on Facebook. Regular players have their strategies and favorite words to kick off the game to maximize their chances. As data scientists, we wanted to know if there was a better strategy, one more rooted in data…so we did the nerdiest thing ever. We created a graph showing the relationships between every five-letter word in the Wordle dictionary and started testing.
Could we find the optimal starting word?
Wordle is a five-letter guessing game where you have six opportunities to guess the secret word. Pretty simple. But did you know there are 2,315 possible solution words in the Wordle dictionary as well as an additional 10,657 words that you can use as guesses? That leaves each player with 12,972 words to choose from each day…and 10,657 words that’ll never be the answer.
A straightforward goal with a massive number of possibilities to explore. But there are a few common strategies. Do you see yourself in any of these?
- The Gambler. This Wordle player likes to use words that have uncommon letters like X, Z, J and Q paired with common letters in hopes of getting lucky and narrowing down the number of possible words significantly.
- The Minimalist. This strategy involves choosing the most common word that comes to mind at every turn. While this approach is probably what creator Josh Wardle had in mind, it can take the fun out of the game.
- The Strategist. This person plays what’s known as “hard mode” where you commit to using the correct letters you already know. It prevents you from eliminating unknown letters as quickly and requires more mental work to select your next guess.
Whatever your strategy, Wordle is all about probability—the probability the word will contain or not contain a letter. Players cycle through their mental dictionary to find words that fit the pattern set so far, then try to estimate the likelihood their next guess will be the right word, or will get them closer to the answer.
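The "fit the pattern" step can be written down as a minimal sketch. The function names `feedback` and `still_possible` and the G/Y/- encoding are our own illustrative choices, not part of any Wordle API; the scoring follows the commonly described rule that greens are assigned first and each yellow uses up one unmatched copy of a letter.

```python
from collections import Counter

def feedback(guess, answer):
    """Score a guess: G = right letter, right spot; Y = right letter, wrong spot; - = not used."""
    guess, answer = guess.lower(), answer.lower()
    pattern = ["-"] * len(guess)
    # Answer letters that were not matched exactly are the only ones available for yellows.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
    for i, g in enumerate(guess):
        if pattern[i] == "-" and remaining[g] > 0:
            pattern[i] = "Y"
            remaining[g] -= 1
    return "".join(pattern)

def still_possible(candidates, guess, pattern):
    """Keep only the candidates that would have produced the observed pattern."""
    return [word for word in candidates if feedback(guess, word) == pattern]

# The second E in CHEER is green, so the first E gets no credit at all.
print(feedback("cheer", "other"))  # -Y-GG
print(still_possible(["react", "trace", "cater", "plain"], "crane", "YYG-Y"))  # ['react']
```

Filtering this way after every guess is what turns the mental-dictionary search into something you can count.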
But what if instead of seeing a list of possible words, you could see the pathways between those words and visualize whether you’re getting closer or farther away from the solution?
Have you ever wondered how many options there are in Wordle? This network graph illustrates every possible word—all 12,972 of them—and shows how they’re related to each other by similar letters and letter placement.
Network graphs like this one help you see multiple connections in meaningful ways…so we decided to use our AI Platform to help us solve Wordle puzzles by creating network graphs illustrating the relationships between five-letter words. By visualizing and exploring the connections between words (the shared letters AND the shared placement of letters) we can understand what options are left, narrow down the possibilities, and march toward the solution with confidence.
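To make "shared letters AND shared placement" concrete, here is a rough sketch of how two words could be connected. The weighting (one point per shared letter, plus a hypothetical `position_bonus` for each letter in the same spot) is an assumption for illustration; the actual weighted edge function used on the platform is not described in this post.

```python
from itertools import combinations

def edge_weight(a, b, position_bonus=2):
    """Assumed weighting: shared distinct letters, plus a bonus per same-position match."""
    shared_letters = len(set(a) & set(b))
    shared_positions = sum(x == y for x, y in zip(a, b))
    return shared_letters + position_bonus * shared_positions

def build_graph(words, min_weight=1):
    """Adjacency dict {word: {neighbor: weight}}, keeping edges at or above min_weight."""
    graph = {word: {} for word in words}
    for a, b in combinations(words, 2):
        weight = edge_weight(a, b)
        if weight >= min_weight:
            graph[a][b] = weight
            graph[b][a] = weight
    return graph

graph = build_graph(["plain", "plane", "clang", "viola"])
print(graph["plain"])  # {'plane': 10, 'clang': 7, 'viola': 3}
```

PLAIN and PLANE share four letters, three of them in place, so they end up tightly linked, while VIOLA shares letters with PLAIN but none in the same position.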
Here’s what it looks like to play Wordle as a data scientist with an AI sidekick!
Choose a Starting Word for Wordle
How do analytical pros choose their starting word?
We asked on LinkedIn, and 71 percent of respondents said they use the same word to start every day. So we’d better make it a good one, right?
Most players pick a word with common letters and multiple vowels—think R, S, T, L, N and E like on Wheel of Fortune. This choice is based on logic more than a hypothesis because we have statistics for the most common letters in five-letter words. But it’s a pretty high-level strategy, so we see plenty of favorite words that are the personal preference of the player. Bias influences us all the time, even when we’re playing games!
Intelligent Exploration, the process of having AI guide us through our data, prevents bias from limiting our progress. For instance, even if CHEER is your favorite word, exploring the data of the Wordle dictionary tells us that starting with a double-E word isn’t going to eliminate as many options as we’d like. We use our exploration to find a starting word that logically makes sense, adding AI to help us take that exploration one step further.
At first guess, most people aren’t considering the placement of each letter, or common letter combinations, because we just don’t have enough information yet. But by using AI to explore our options we find that SALET is our best starting word based on the frequency of the letters and their placement within the word. You don’t need to know the meaning of a word to use it (a salet, more often spelled sallet, is a type of medieval helmet, which we had to look up).
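One simple way to score a starting word on letter frequency and placement is sketched below. This is an assumed heuristic, not the method that produced SALET: each letter earns the number of words in the list that have the same letter in the same position, and a repeated letter (the second E in CHEER) earns nothing. The `sample` list is a tiny hand-picked stand-in for the real dictionary, so its ranking will not match a run over all 12,972 words.

```python
from collections import Counter

def position_counts(words):
    """One Counter per position: how often each letter appears in that slot."""
    counts = [Counter() for _ in range(5)]
    for word in words:
        for i, ch in enumerate(word):
            counts[i][ch] += 1
    return counts

def score(word, counts):
    """Sum positional frequencies, counting each distinct letter only once."""
    seen = set()
    total = 0
    for i, ch in enumerate(word):
        if ch not in seen:
            total += counts[i][ch]
            seen.add(ch)
    return total

# Hypothetical sample; swap in the full Wordle word list for a real ranking.
sample = ["salet", "slate", "cheer", "crane", "plain", "lunar"]
counts = position_counts(sample)
ranking = sorted(sample, key=lambda w: score(w, counts), reverse=True)
print(ranking)
```

Skipping repeated letters is the design choice that penalizes double-letter openers: a second E can never rule out anything the first E did not.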
As data scientists, we’re operating on two things we know to be true:
- The more you explore your data before you start, the better your first effort to find a solution will be.
- The results of your first move should be the new starting point for even deeper exploration.
And now we’re going to prove it by using SALET in a game.
Play Wordle with our Data Scientist
So you’ve got the perfect word to start with, SALET. What’s your next move? We can’t help you there, but we can show you how our Data Science Intern Max set about solving the December 12th puzzle.
Starting with SALET, Max found A and L were in the solution, but in the wrong spots. So he eliminated some possibilities, but what does that really look like? By running an algorithm to map the potential words into a network graph with our weighted edge function, this is what Max saw:
By creating a network graph, Max can see there are communities featuring common letters that can help drive his next guess. He’s also able to see the likely position of the letters representing the strongest connections. For example, the Yellow Community contains words where the third letter of the word is always an A.
The Louvain community detection algorithm built into our network graph solution picked up on some pretty cool patterns. It recognized common letter positions and the most notable letter frequencies, and broke the words into different sub-graphs. Here our algorithm surfaced three potential words for our next guess: VIOLA, PLAIN, and CLANG. VIOLA (in the blue community above) is the most eccentric, with the most unusual combination of letters. PLAIN (in the yellow community above) has the highest weighted degree: it’s most closely related to other words because it has a lot of common letters. CLANG (also in the yellow community) is a mixture of the two.
VIOLA is the most eccentric option because of the V and the placement of the other letters. It could be the solution, but there are a lot of other possibilities that don’t include such a unique letter.
You can see that PLAIN has a lot of commonality with other words, illustrated here by the number of connections.
Finally, CLANG is the choice that’s the most middle of the road. Not too unique or eccentric, but with letters less common than PLAIN and therefore with fewer connections.
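Weighted degree, the measure that puts PLAIN ahead of CLANG and VIOLA, is just the sum of a word's edge weights to every other remaining candidate. The sketch below assumes a simple weight (shared distinct letters plus same-position matches) and a hand-picked list of five candidates, so the numbers are illustrative only; it does not run Louvain community detection.

```python
def weight(a, b):
    """Assumed edge weight: shared distinct letters plus same-position matches."""
    return len(set(a) & set(b)) + sum(x == y for x, y in zip(a, b))

def weighted_degree(word, candidates):
    """Total connection strength between one word and all other candidates."""
    return sum(weight(word, other) for other in candidates if other != word)

# Hypothetical candidates that all fit the SALET feedback (A and L present, in new spots).
candidates = ["viola", "plain", "clang", "plaid", "lunar"]
ranked = sorted(
    ((word, weighted_degree(word, candidates)) for word in candidates),
    key=lambda pair: pair[1],
    reverse=True,
)
for word, degree in ranked:
    print(f"{word.upper()}: {degree}")
```

On this sample PLAIN comes out on top, CLANG lands in the middle, and VIOLA sits at the bottom, the same ordering the graph shows.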
That leaves us with a strategic choice: do we gamble on the guess that’s more eccentric and unique, or do we use the guess that has more common letters? Unique letters might not be in the word at all but if they are, we’ll get to the answer in a flash. A word with common letters will help us confirm our choices, but could leave us with so many possible solutions we run out of chances.
Given there were still quite a few possible words remaining, Max chose PLAIN to narrow down the field. He probably won’t get a big win, but this choice has a higher likelihood of confirming some letters.
His next result told him that the L, A, and N were in the word, but none of them were in the right spot. Max now knows the solution must contain L, A and N but not in spots he’s already tried. Following the relationships in the network graph to their logical conclusion leaves only three options in the Wordle list that use our known letters in the right way:
ZONAL, LUNAR and ANNUL
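The narrowing after SALET and PLAIN can be sketched as a plain constraint filter. The gray letters (S, E, T, P, I) are inferred from the feedback described above, the `sample` list is a small hypothetical stand-in for the full dictionary, and treating every gray letter as absent ignores the duplicate-letter corner case.

```python
# Yellow letters mapped to the 0-based positions where they were tried and rejected.
# SALET put A in slot 1 and L in slot 2; PLAIN put L in slot 1, A in slot 2, N in slot 4.
yellow = {"a": {1, 2}, "l": {1, 2}, "n": {4}}
# Letters assumed absent because they came back gray.
gray = set("setpi")

def fits(word):
    """True if the word uses every yellow letter, in a new spot, and no gray letter."""
    if any(ch in gray for ch in word):
        return False
    for letter, banned in yellow.items():
        if letter not in word:
            return False
        if any(word[i] == letter for i in banned):
            return False
    return True

# Hypothetical sample of five-letter words, not the real Wordle list.
sample = ["zonal", "lunar", "annul", "banal", "llama", "canal", "nasal", "union", "lanky"]
remaining = [word for word in sample if fits(word)]
print(remaining)  # ['zonal', 'lunar', 'annul']
```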
You’d think ZONAL maximizes eccentricity here, but LUNAR or ANNUL is the better guess. ZONAL is well connected to both of the others: it shares an N and an A in the same positions with LUNAR, and an N and an L in the same positions with ANNUL. LUNAR and ANNUL, by contrast, share only an N in the same position.
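Those positional overlaps are easy to check by hand or with a few lines of code. The helper name `same_position_letters` is ours; it simply lists the letters two words share in the same slot, using 1-based positions.

```python
from itertools import combinations

def same_position_letters(a, b):
    """List (position, letter) pairs where both words have the same letter in the same slot."""
    return [(i + 1, x.upper()) for i, (x, y) in enumerate(zip(a, b)) if x == y]

overlaps = {
    (a, b): same_position_letters(a, b)
    for a, b in combinations(["zonal", "lunar", "annul"], 2)
}
for (a, b), matches in overlaps.items():
    print(a.upper(), b.upper(), matches)
# ZONAL LUNAR [(3, 'N'), (4, 'A')]
# ZONAL ANNUL [(3, 'N'), (5, 'L')]
# LUNAR ANNUL [(3, 'N')]
```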
Max picked LUNAR and got lucky! But either way, he had enough guesses remaining to try all three of the possible solutions left. He was going to win no matter which of the three remaining words he guessed first.
Intelligent Exploration in (Wordle) Network Graphs
Let’s face it, sometimes our first Wordle guess looks like it’s going to go well (with three green letters) but then the letter combination is so common there are still tons of possibilities left, and not enough guesses to eliminate all the noise. Chances are good you’ve had a project or initiative that experienced a similar fate, where what seemed like a good idea at first glance didn’t have the impact or result you wanted.
So, what are we supposed to do?
Not trust our gut at all?
Assume we’re doomed from the start?
That uncertainty is why we use AI to explore our data before we begin and each time we guess. With a solid foundation for our strategy and ongoing monitoring of our progress, we’re able to make sure the choices we make are actually leading somewhere.
We should also appreciate finding eccentricities and anomalies—the first Wordle guess that results in all gray blocks (i.e., no letters correct) actually eliminates tons of possibilities. Now you know those most popular letters, our educated hypothesis, aren’t what you’re looking for. That drives down the list of possible solutions significantly. Finding the right path or project is always the goal of Intelligent Exploration, but there’s so much value in finding out what you don’t want to do, too.
AI as a Sidekick, You as the Hero
The Wordle solution of the day is pretty public. People don’t play because they’re dying to know what the word is—they play because they want to see how quickly they can discover the path to the solution.
Our AI platform isn’t about handing users a single, unexplained answer. Our solution is designed to explore and visualize complex data in ways data scientists and other analysts and business users can see, understand, explore even further, and share with stakeholders so you can build and execute winning strategies. You and your team are the heroes—we help you find the way.