Search a string for a list of matches in Excel
When analysing marketing data you often find yourself working around the formulas that Excel provides – massive nested formulas, chaining operations across multiple columns etc. One such example is when you want to determine whether a list of words appear in some text (perhaps you are differentiating between brand or non-brand keywords, or segmenting long tail search phrases from converting ones). You can do this a number of different ways, but they all feel a bit clunky, like it should be more straightforward.
And it can be – if you are prepared to learn a bit about macros and programming, you can make Excel do almost anything you want. So, I’m going to demonstrate:
- How to create your very own Excel formula that does exactly what you need it to do. These are known as UDFs (User Defined Functions).
- How to search within text for a list of words, without even using regular expression(we’ll get to that).
Here’s an example: we all love Vanilla Ice. Imagine we love him so badly we want to do some text analysis on the lyrics using Excel. This might involve determining whether certain word(s) exist, how often, etc. So, say you have a cell with with a portion of the lyrics in it.
Now, if you have a single word to check, you can use the Excel Search or Find formulas (Search is case-insensitive). These formulas return the character number of any match. However, that’s not really powerful enough for the type of text analysis we’re doing – what if you have a list of words that you want to check? What if you want to know how many of them match, or which ones in particular match, or how frequently? We’re getting into that ugliness I mention above, which can lead to mistakes and end up with you talking nonsense about the Vanilla man.
Some of the more technical folks out there will immediately recognise this as a problem that can be solved with regular expressions. And you would be right. However:
- Many people don’t know what the hell regular expressions are. If you care to know, then regular expressions are a set of rules and syntax for performing powerful pattern matching operations on strings. If that’s not good enough for you, go while away the time on Wikipedia, then StackOverflow, and download Regular Expression Designerwhile you’re at it.
- Excel doesn’t have a regular expression formula. Unbelievable. I don’t know why.
The wise SEO/PPC folks out there will point to the excellent SEO Tools pluginfor Excel by Niels Bosma. This introduces regular expressions along with a wealth of other very handy functions to Excel. If you haven’t heard of it check it out – even if you don’t do SEO!
That still leaves us with the first problem, and the fact that any out-of-the-box function has limitation. If we want to analyse the lyrics we have in cell A1 to understand how much Vanilla raps about communication and sharing, perhaps we need to check for occurrences of words such as “collaborate” and “listen”. What would be really handy is if I could provide a list of these words and see which ones matched. And without anyone learning anything about regular expressions.
Let’s free ourselves – let’s imagine a formula that does exactly what we want. That’s right – just imagine it first. Mine would be something like this:
=ListSearch(text-to-search, text-to-search-for, separator)
And it would return results like below:
After imaging it, I made it so. I did that by writing my very own functions. Yes, it involves programming in VBA. I’m not going to cover programming 101 in this post so if you’re not ready to write your own functions, take from this that it’s worth learning and copy and paste my functions. Over time you can learn by changing them to do what you need them to do.
Writing A Custom Function In Excel
First of all, we’re getting in about the VBA component of Excel so you need to look in the developer tab in the ribbon and click on ‘Visual Basic’:
You are faced with the developer window. From here you can insert ‘Modules’ (places to store code) and write custom ‘Subs’ (an automated task) and ‘Functions’ (custom formulas). You will be looking at something like the below:
I then wrote two functions, ListSearchA and ListSearchB. They are very similar. ListSearchA returns the number of matches.
Function ListSearchA(text As String, wordlist As String, separator As String, Optional caseSensitive As Boolean = False) Dim intMatches As Integer Dim res As Variant Dim arrWords() As String intMatches = 0 arrWords = Split(wordlist, separator) On Error Resume Next Err.Clear For Each word In arrWords If caseSensitive = False Then res = InStr(LCase(text), LCase(word)) Else res = InStr(text, word) End If If res > 0 Then intMatches = intMatches + 1 End If Next word ListSearchA = intMatches End Function
ListSearchA loops through every word that you supply in the list, split by the character you declare as the seperator. It then runs either Search of Find on each one (depending on whether you declare ignorecase to be true or false), and counts the number of matches.
Edit:Based on feedback, the functions have been updated to use InStr() instead of Search()/Find() and can take an optional TRUE or FALSE parameter to select case sensitivity(defaults to FALSE = case insensitive).
You can go ahead and copy the entire function into your module and save your workbook as a macro-enabled workbook(.xlsm) – this is essential, VBA functionality does not work in a standard workbook. When you return to the standard spreadsheet view(there will be two Excel windows open in your taskbar: one for developer, one for the spreadsheet), you’ll see that you can now use the function in the workbook itself:
ListSearchB returns a list of the words that match – this can be really useful for sanity checking, or for further classification or pivot table operations.
Function ListSearchB(text As String, wordlist As String, separator As String, Optional caseSensitive As Boolean = False) Dim strMatches As String Dim res As Variant Dim arrWords() As String arrWords = Split(wordlist, seperator) On Error Resume Next Err.Clear For Each word In arrWords If caseSensitive = False Then res = InStr(LCase(text), LCase(word)) Else res = InStr(text, word) End If If res > 0 Then strMatches = strMatches & separator & word End If Next word If Len(strMatches) <> 0 Then strMatches = Right(strMatches, Len(strMatches) - Len(separator)) End If ListSearchB = strMatches End Function
And we use the formula in a similar way to return the words that match:
There may be those that immediately respond with “You can do this with Excel formula =blah1(blah2(X,Y,Z), blah3(extrablablah(X, Y), BlahDeBlah(X,Y,Z))). That’s fine – feel free to do so. However my point is that it’s best to understand all ways of tackling a problem before you can understand the best way and it’s hugely empowering (and time saving) to know that you can make Excel do exactly what you need it to do.
Own your marketing data & simplify your tech stack.
Have you read?
It’s been 4 years since Google first announced it will be removing third-party cookies from Chrome. Since then, we’ve seen multiple delays on the plan which has left marketers doubting...
The deadline to move to Google Analytics 4 is just around the corner. So, how can you prepare to make the change, with minimal impact on your marketing measurement? In...
Google Optimize is a core part of the CRO toolkit. But with the news that Google will soon be sunsetting it, what Google Optimize alternatives are available? We’ve put together...