Recently, Wordle became a phenomenon that took over the internet. It's a pretty simple game, but still fun to play. So naturally, I decided I wanted to create an algorithm to solve Wordles for me.
It wasn't because I didn't enjoy solving them (I'm easily entertained); it was more that I saw creating an algorithm to solve them as a challenge. Could I create something that I couldn't even beat? TBH, the answer to that is usually that I probably can - not because I'm the greatest developer, but because even something as simple as a Wordle can usually beat me.
I approached it knowing that there were two main things to tackle:
- First, I needed a way to prioritize words so I could decide which ones to guess first.
- Second, I needed to filter out words based on the letters in the word (both those whose position I knew and those whose position I didn't) and the letters not in the word.
Now, before I walk you through my solution, I'll give you the link to the CodePen - I imagine some of you will want to look at it (or maybe break it).
Prioritizing the words would prove to be the easier part. I only needed to sort the words once, at the start - and I'd do that by giving each letter a score based on how common it is, then summing the letter scores for each word.
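Here's a minimal sketch of that scoring idea in Python (not the actual CodePen code). It assumes letter commonness is counted from the word list itself and that every letter in a word counts toward its total, repeats included; the function names are made up for illustration.

```python
from collections import Counter


def letter_scores(words):
    """Count how often each letter appears across the whole word list."""
    return Counter(ch for word in words for ch in word)


def word_score(word, scores):
    """Sum the scores of every letter in the word (repeats included)."""
    return sum(scores[ch] for ch in word)


def prioritize(words):
    """Return the words sorted from highest to lowest total letter score."""
    scores = letter_scores(words)
    return sorted(words, key=lambda w: word_score(w, scores), reverse=True)


if __name__ == "__main__":
    print(prioritize(["tiger", "rider", "fuzzy"]))
```

One possible variation is to score each distinct letter only once per word, which would push words with repeated letters further down the list.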
The next step was to start making guesses and filtering based on which letters were correct. I'd need to track the letters whose position I knew, the letters whose position I didn't know, and the ones I knew weren't in the word (since I'd rather not waste guesses on words containing letters I'd already ruled out - maximum letter coverage and all).
On making a guess, I'd input which letters were grey, yellow, and green - for yellow and green I'd want to save both the position and the character. Once I had those saved, it was time for the filtering.
So, to recap up to this point, I have an array of all words. Then, I have an array of green letters, yellow letters, and grey letters.
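To make that recap concrete, here's one way the three lists could be filled in after a guess. This is a hedged Python sketch, not the CodePen's code: the feedback encoding ("g" for green, "y" for yellow, "x" for grey) and the names `new_state` and `record_feedback` are assumptions for illustration.

```python
def new_state():
    """Start with nothing known about the answer."""
    return {"green": [], "yellow": [], "grey": []}


def record_feedback(state, guess, colours):
    """Store the result of one guess.

    colours is a string the same length as guess, using the assumed
    encoding: "g" = green, "y" = yellow, anything else = grey.
    Green and yellow entries keep (position, letter); grey keeps the letter.
    """
    for pos, (letter, colour) in enumerate(zip(guess, colours)):
        if colour == "g":
            state["green"].append((pos, letter))
        elif colour == "y":
            state["yellow"].append((pos, letter))
        else:
            state["grey"].append(letter)
    return state


if __name__ == "__main__":
    # Guessing "rider" when the answer is "tiger".
    print(record_feedback(new_state(), "rider", "xgxgg"))
```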
Filtering by green letters (letters I knew the position of) would end up being the easiest. I'd filter my word list and, for each word, check that it has each green letter at its known position. Easy enough. Yellow letters would also be fairly straightforward - I'd need to check that the word contains the letter, but NOT at the position where I guessed it. Turns out, it was actually the grey letters that would end up being a pain.
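The green and yellow checks can be sketched in a few lines of Python. This is only an illustration of the two rules described above, assuming green and yellow letters are stored as (position, letter) pairs; the function names are hypothetical.

```python
def matches_green(word, green):
    """Every green letter must sit at its known position."""
    return all(word[pos] == letter for pos, letter in green)


def matches_yellow(word, yellow):
    """Every yellow letter must be in the word, but not where it was guessed."""
    return all(letter in word and word[pos] != letter for pos, letter in yellow)


def filter_known(words, green, yellow):
    """Keep only the words consistent with the green and yellow letters."""
    return [w for w in words if matches_green(w, green) and matches_yellow(w, yellow)]


if __name__ == "__main__":
    words = ["tiger", "rider", "timer", "liver"]
    green = [(1, "i"), (3, "e"), (4, "r")]
    yellow = [(2, "t")]  # there is a "t", just not in the middle
    print(filter_known(words, green, yellow))
```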
You see, if I have a grey letter, I can't just filter out every word that contains that letter. Why? Because of words with duplicates. Say the word is "tiger"; it only has one "r". But say the word I guess is "rider". Well, one "r" is correct (the last one comes back green), but the other isn't (the first one comes back grey). If I just filter out all words with an "r" then… the word I'm looking for suddenly isn't in my master list. So instead I need to build a tolerance for a letter - a threshold, say. If I have one green or yellow occurrence of a letter and a grey occurrence of the same letter, then I need to filter out words with more than one of that letter. So back to the example: with "rider" one "r" is correct, so my threshold is one. All words with more than one "r" get filtered out.
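Here's a small Python sketch of that threshold idea, under the assumption that feedback for a guess arrives as a string of "g" (green), "y" (yellow), and "x" (grey). The names `letter_limits` and `within_limits` are hypothetical, and this may differ from how the CodePen does it.

```python
from collections import Counter


def letter_limits(guess, colours):
    """For each grey letter, work out the most times it may appear.

    The limit is the number of green or yellow occurrences of that same
    letter in the guess, which is 0 when the letter was only ever grey.
    """
    confirmed = Counter(ch for ch, c in zip(guess, colours) if c in "gy")
    return {ch: confirmed[ch] for ch, c in zip(guess, colours) if c == "x"}


def within_limits(word, limits):
    """Reject words that use a grey letter more often than its limit allows."""
    return all(word.count(ch) <= limit for ch, limit in limits.items())


if __name__ == "__main__":
    # Guess "rider" against the answer "tiger": the first "r" and the "d" are grey.
    limits = letter_limits("rider", "xgxgg")
    for word in ["tiger", "rider", "timer"]:
        print(word, within_limits(word, limits))
```

With the "rider" guess, the limit for "r" is one and the limit for "d" is zero, so "tiger" survives while "rider" itself is dropped.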
And then I suddenly found myself with a working script that could guess Wordles.
Is it the most sophisticated or perfect algorithm? No - it's a simple decision tree, nothing too fancy. In all likelihood there are some words that slip right through the decision tree (or, more accurately, are just too far down the decision tree given the limit on the number of guesses). Also, this was the beta version of a weekend project, so performance wasn't my top concern. Looking over the code, there are several places where it can be optimized. However, it is a working proof of concept and I'm quite happy with it. I've thrown a good number of words at it, and it seems to be getting them right, even if it does come down to the final guess.
Oh, one final note. The dictionary I used sometimes includes special characters or proper nouns - I've added a "skip guess" button to get around those words. At some point I'll clear them all out.