In 2024, Microsoft added three new functions in Excel that rely on Regular Expressions or REGEX to identify and extract complex patterns from the provided data. These functions include the REGEXTEST, REGEXEXTRACT, and REGEXREPLACE functions, which are available in the web version of Excel and in Microsoft 365.
Using the REGEXTEST function
With the REGEXTEST function, you can find out whether your data contains a specific pattern and can show the output as TRUE or FALSE. The syntax for this function is REGEXTEST(a,b,c)
. Here 'a' refers to the value or text you want to test, while 'b' refers to the pattern used for the test, and 'c' is optional and can be zero when you need the test to be case-sensitive, or one otherwise. For instance, you can check whether a particular value, such as product codes or account codes contains any numbers.
- In this example, we have an Excel sheet containing a few product/account codes.
data:image/s3,"s3://crabby-images/4891c/4891c2790ecc6bc359bd0bcd3b572fec91ca9802" alt=""
- To check whether the code contains any numbers, we click inside Cell C4 and type
=REGEXTEST(A4,"[0-9]"
data:image/s3,"s3://crabby-images/2f0d6/2f0d6ed1605ec6f1f465817ce9d044a22528adf6" alt=""
- If the code contains any numbers, you should see the output as 'TRUE'.
data:image/s3,"s3://crabby-images/cd57c/cd57c7ce117d50f7090cfc525d3adb99219918f0" alt=""
- Now, you can click at the edge of the cell where you entered the formula and drag it down to fill the rest of the cells and see the output automatically.
data:image/s3,"s3://crabby-images/6c518/6c5187fa9805504c80b789bd9b9ec5879a7169d9" alt=""
- In the same way, you can use the REGEXTEST function for more complex codes, such as identifying telephone numbers in scraped data.
data:image/s3,"s3://crabby-images/121c8/121c84102def3ccdc0e058c51613744a57c1e423" alt=""
- In this case, you can use the
=REGEXTEST(A4,"\(?\d{3}\)?[-.\s]?\d{4}"
function to determine whether the data contains any phone numbers.
data:image/s3,"s3://crabby-images/692f5/692f54457203b135421e24d1a32050f29021762b" alt=""
- Again, you can drag the formula down to check for all the values on the left.
data:image/s3,"s3://crabby-images/201f8/201f85e30183aa9d2dcf75697207b5a258fed945" alt=""
Using the REGEXEXTRACT function
The REGEXEXTRACT function helps you find specific pieces of text in the provided data and has the REGEXEXTRACT(d,e,f,g)
syntax. Here, 'd' is the cell reference or value from which you want to extract the required text and 'e' refers to the pattern to be extracted. You can use 'f' if you want to only find the first match, '1' if you want to find all possible matches, and '2' to find groups from the first match. You can also make the extraction case-sensitive with '0' or '1' if you don't want that.
- We can use the previous example to extract phone numbers from the provided data.
data:image/s3,"s3://crabby-images/fc4f2/fc4f27fc13f6440e8d42bd2d1f71df949812fb0b" alt=""
- We can use the
=REGEXEXTRACT(B6, "your_phone_number_pattern")
function by typing=REGEXEXTRACT(A4,"(?\d{3})?[-.\s]?\d{3}[-.s]?\d{4}"
in Cell C4.
data:image/s3,"s3://crabby-images/285d3/285d3f0fcfc3637243a380e0c976f75c35c379bf" alt=""
- If the reference cell contains a phone number, you will see it as the output. You can then drag the formula cell down to get the output for the other reference cells.
data:image/s3,"s3://crabby-images/5688e/5688e627dbe0062c1dec9c7add438e5807e0d1d4" alt=""
- If you want to get all matches, you can modify the function as
=REGEXEXTRACT(A4, "your_phone_number_pattern", 1)
. So, the function will be=REGEXEXTRACT(A4,"(?\d{3})?[-.\s]?\d{3}[-.s]?\d{4}",1
.
data:image/s3,"s3://crabby-images/b4ad0/b4ad0def01d70f2d1be52c7834439cfbeb0ada5b" alt=""
TRANSPOSE(REGEXEXTRACT(Reference Cell, "your_phone_number_pattern", 1))
to avoid spill errors. Using the REGEXREPLACE function
The REGEXREPLACE function lets you find and replace a pattern with a different one. It has the syntax REGEXREPLACE(h,i,j,k,l)
, where 'h' refers to the reference cell or value to be replaced, 'i' is the pattern to be replaced, and 'j' is the replacement pattern. 'k' is optional and refers to how the pattern to be replaced occurs and 'l' can be zero if you want the replacement to be case-sensitive or one otherwise.
- We can use the data from the previous example to try out the REGEXREPLACE function using
=REGEXREPLACE(A4, "your_phone_number_pattern", "replacement_format").
- Type
=REGEXREPLACE(A4,"(?(\d{3}))?[-.\s]?(\d{3})[-.\s]?(\d{4})""$1-$2-$3")
in Cell C4. This will standardize the phone number formats in your Excel sheet.
data:image/s3,"s3://crabby-images/a45a5/a45a5e1609a214ac4dc25780f8dc2d44ee52f5d1" alt=""
Things to know
- Characters like '{}' and '[]' are called tokens and are just two examples of the several tokens that you can use with the REGEX functions.
- To avoid running into issues when using REGEX functions, make sure your data is plainly formatted.
- You can also use conditional formatting with REGEX functions in Excel and can also enhance the data manipulation process.
- Excel also offers other ways by which you extract data, such as the Flash Fill tool and the TEXTSPLIT function.
Member discussion