Regex Tester

Patterns and test text run 100% in your browser. Nothing is uploaded.

Quick reference

.                 Any character except newline
\d \w \s          Digit, word character, whitespace
\D \W \S          Not digit, not word, not whitespace
[abc]             Any one of a, b, or c
[^abc]            Any character except a, b, or c
[a-z]             Any character in the range a to z
* + ?             Zero or more, one or more, zero or one
{2} {2,} {2,5}    Exactly 2, 2 or more, between 2 and 5
^ $               Start and end of string (or line with m)
\b                Word boundary
( )               Capturing group
(?: )             Non-capturing group
a|b               Match a or b (alternation)
\                 Escape a special character

This free regex tester lets you build, test, and understand regular expressions in real time, entirely in your browser. Type a pattern and a test string and matches are highlighted instantly, capture groups are broken out, and a plain-English explanation of your pattern appears as you type so you can see exactly what each piece does. A quick reference and a library of common patterns are built in. Below the tool is a complete guide to regular expressions: the syntax, how the engine actually matches, the concepts that trip people up, and practical patterns you can adapt.

What is a regular expression?

A regular expression, almost always shortened to regex or regexp, is a sequence of characters that defines a search pattern. Instead of looking for one fixed piece of text, a regex describes a shape that text might take, such as “an email address,” “a date,” or “a word that starts with a capital letter.” A regex engine then scans a string and finds every place where that shape occurs. This makes regular expressions one of the most powerful tools a developer has for searching, validating, extracting, and transforming text.

Regular expressions appear in virtually every programming language and in countless tools: text editors, command-line utilities like grep, database queries, log analyzers, and form validators all speak some dialect of regex. The core syntax is broadly shared, which means the pattern you build and test here will, with minor variations, work across most environments. The tool above uses the JavaScript regex engine, which is one of the most common dialects and very close to what you will use day to day.

Literal characters and metacharacters

At its simplest, a regex is just literal text. The pattern cat matches the letters c, a, t in sequence, wherever they appear. The power comes from metacharacters, which are characters that carry special meaning rather than matching themselves. The dot . is the most famous: it matches any single character except a newline. So c.t matches “cat,” “cot,” “cut,” and even “c9t.” When you actually want a literal dot, you escape it with a backslash, writing \. to mean a real period rather than “any character.” This distinction between a literal and a metacharacter is the foundation of everything else.

The characters with special meaning include . ^ $ * + ? ( ) [ ] { } | and \. To match any of them literally, put a backslash in front. This is why a pattern for a price might use \$\d+: the \$ is a literal dollar sign, escaped because a bare $ means “end of string,” and \d+ means one or more digits.
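As a small sketch of the literal-versus-metacharacter distinction, the snippet below runs the patterns from this section in JavaScript. The sample strings and variable names are illustrative assumptions, not taken from the tool.

```javascript
// The dot is a metacharacter: it stands for any single character except a newline.
const anyMiddle = /c.t/;
const words = ["cat", "cot", "c9t", "ct"];
const dotResults = words.map((w) => anyMiddle.test(w)); // "ct" fails: nothing between c and t

// An escaped dot matches only a real period.
const literalDot = /3\.14/;
const matchesPi = literalDot.test("3.14");
const matchesOther = literalDot.test("3x14");

// \$ is a literal dollar sign; \d+ is one or more digits.
const price = /\$\d+/;
const found = "Total: $45 plus tax".match(price);
const priceText = found ? found[0] : null;
```

Here priceText ends up as "$45", and the unescaped dot accepts every three-character sample while the escaped one rejects "3x14".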

Character classes

A character class, written with square brackets, matches any single character from a set you define. The pattern [aeiou] matches any one vowel. Inside a class you can use a hyphen to express a range, so [a-z] matches any lowercase letter, [0-9] any digit, and [A-Za-z0-9] any letter or digit. Placing a caret at the start negates the class: [^0-9] matches any character that is not a digit.

Because some classes are so common, regex provides shorthand for them. \d means any digit, equivalent to [0-9]. \w means any “word” character, which is letters, digits, and the underscore. \s means any whitespace, including spaces, tabs, and newlines. Each has an uppercase counterpart that means the opposite: \D is any non-digit, \W any non-word character, and \S any non-whitespace. These shorthands make patterns far more readable, which is why the email pattern loaded in the tool above leans on \w rather than spelling out every allowed character.
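The classes and shorthands above can be tried out with a few one-liners. This is a minimal sketch with made-up sample strings; in JavaScript without the u or v flags, \d and \w cover ASCII characters only.

```javascript
// A class matches exactly one character from the set.
const vowels = "regular expression".match(/[aeiou]/g);

// A negated class: remove everything that is not a digit.
const digitsOnly = "Tel: 555-0199".replace(/[^0-9]/g, "");

// \d and [0-9] find the same characters here.
const digitsA = "Room 101".match(/\d/g);
const digitsB = "Room 101".match(/[0-9]/g);

// \w covers letters, digits, and underscore, so the hyphen splits the runs.
const wordRuns = "snake_case-9".match(/\w+/g);

// \s matches a space, a tab, and a newline alike.
const pieces = "a b\tc\nd".split(/\s/);

// \W is the opposite of \w.
const nonWord = "hi, there!".match(/\W/g);
```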

Quantifiers: how many times

Quantifiers control how many times the preceding element may repeat, and they are where regex gains most of its flexibility. The three basic quantifiers are the asterisk, the plus, and the question mark. * means zero or more, + means one or more, and ? means zero or one, making the preceding element optional. So colou?r matches both “color” and “colour,” because the u is optional.

For precise counts, braces let you specify exact numbers. \d{4} matches exactly four digits, useful for a year. \d{2,4} matches between two and four digits. \d{2,} matches two or more with no upper limit. Combining quantifiers with character classes is the heart of pattern building: [A-Z][a-z]+ matches a capital letter followed by one or more lowercase letters, which is the shape of a capitalized word.
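To see the quantifiers from this section in action, here is a short sketch in JavaScript. The test strings are illustrative assumptions chosen to show each count.

```javascript
// ? makes the preceding element optional.
const colour = /colou?r/;
const spellings = ["color", "colour", "colouur"].map((w) => colour.test(w));

// {4}: exactly four digits.
const year = "Released in 2019".match(/\d{4}/)[0];

// {2,4}: between two and four digits, taking as many as allowed first.
const chunks = "123456".match(/\d{2,4}/g);

// {2,}: two or more digits, so the lone 7 is skipped.
const longRuns = "7 42 12345".match(/\d{2,}/g);

// A capital letter followed by one or more lowercase letters.
const capitalized = "Ada met Grace at NASA".match(/[A-Z][a-z]+/g);
```

Note that "NASA" is not reported by the last pattern, because no capital in it is followed by a lowercase letter.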

Greedy versus lazy matching

This is one of the most common sources of confusion, and understanding it separates beginners from confident regex users. By default, quantifiers are greedy, meaning they match as much text as they possibly can while still allowing the overall pattern to succeed. Consider the pattern <.+> applied to a text such as <b>bold</b>. You might expect it to match just <b>, but because .+ is greedy, it grabs everything up to the last >, matching the entire <b>bold</b> in one go.

To make a quantifier lazy, matching as little as possible, you add a question mark after it. The pattern <.+?> now matches just <b>, then separately </b>, because the lazy .+? stops at the first > it can. Knowing when to reach for lazy quantifiers prevents a whole class of bugs where a pattern matches far more than intended. You can test this directly in the tool above by toggling a question mark on and off after a quantifier and watching the highlights change.

Rule of thumb: if a pattern is matching more than you expected, often spanning across things it should treat separately, a greedy quantifier is usually the cause, and making it lazy with a trailing ? is usually the fix.
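The difference is easy to reproduce in code. The sketch below assumes a sample text of <b>bold</b>, which is an illustrative string rather than anything loaded in the tool, and also shows a negated character class as one possible alternative to a lazy dot.

```javascript
const sample = "<b>bold</b>";

// Greedy: .+ runs to the end, then backs off to the last > it can use.
const greedy = sample.match(/<.+>/g);

// Lazy: .+? stops at the first > that lets the pattern succeed.
const lazy = sample.match(/<.+?>/g);

// Alternative sketch: a negated class can never run past a >, greedy or not.
const negated = sample.match(/<[^>]+>/g);
```

The greedy version yields one match covering the whole string, while the lazy and negated-class versions each yield the two tags separately.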

Anchors and word boundaries

Anchors do not match characters; they match positions. The caret ^ matches the start of the string, and the dollar sign $ matches the end. So ^abc$ matches only a string that is exactly “abc,” nothing more. Anchors are essential for validation, where you usually want the entire input to match a pattern rather than just containing it somewhere.

The word boundary \b is a subtler but extremely useful anchor. It matches the position between a word character and a non-word character, which lets you match whole words. The pattern \bcat\b matches “cat” in “the cat sat” but not the “cat” inside “category” or “concatenate,” because in those cases there is no boundary on both sides. The email pattern in the tool uses \b at each end to avoid grabbing fragments of larger strings.
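A brief sketch of anchors and word boundaries, using the examples from this section. The sentence used for the word-boundary test is an assumption made up for illustration.

```javascript
// ^ and $ pin the pattern to the whole string.
const exact = /^abc$/;
const anchored = ["abc", "xabc", "abcd"].map((s) => exact.test(s));

// \b limits the match to "cat" as a whole word.
const text = "the cat sat on the category concatenate";
const wholeWords = text.match(/\bcat\b/g);

// Without \b, the letters c-a-t are found inside longer words too.
const anywhere = text.match(/cat/g);
```

Only the standalone word is reported by the \b version, whereas the plain pattern finds three occurrences.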

Groups and capturing

Parentheses create groups, and groups do two jobs. First, they let a quantifier apply to several characters at once: (ab)+ matches “ab,” “abab,” “ababab,” and so on, because the + applies to the whole group. Second, and just as importantly, a group captures the text it matched so you can extract it afterward. When you match a date pattern like (\d{4})-(\d{2})-(\d{2}), the three groups capture the year, month, and day separately, which the tool above displays as individual captured groups under each match.

Sometimes you want a group only for applying a quantifier, without the overhead of capturing. A non-capturing group, written (?:...), does exactly that. It groups without recording the result, which keeps your captured groups clean and is slightly more efficient. You will see non-capturing groups in several of the library patterns above, used to group alternatives without cluttering the output.
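The following sketch shows how captured groups surface in JavaScript, and how a non-capturing group changes the result. The input strings and variable names are illustrative assumptions.

```javascript
// Three capturing groups: year, month, day.
const dateParts = /(\d{4})-(\d{2})-(\d{2})/.exec("Due 2024-03-15");
const full = dateParts[0];
const year = dateParts[1];
const month = dateParts[2];
const day = dateParts[3];

// A group lets + repeat "ab" as a unit.
const repeated = ["ab", "ababab", "aba"].map((s) => /^(ab)+$/.test(s));

// Making the year group non-capturing removes it from the captured results.
const withoutYear = /(?:\d{4})-(\d{2})-(\d{2})/.exec("Due 2024-03-15");
const capturedCount = withoutYear.length - 1; // index 0 holds the full match
```

With the non-capturing variant the full match is unchanged, but only the month and day remain as captured groups.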

Alternation

The pipe character | means “or,” allowing a pattern to match one of several alternatives. The pattern cat|dog|bird matches any of those three words. Alternation is often combined with groups to limit its scope: ^(cat|dog)s?$ matches “cat,” “cats,” “dog,” or “dogs,” because the group confines the alternation to just those two words before the optional s and the end anchor. Without the group, ^cat|dogs?$ splits into two separate alternatives, ^cat and dogs?$, so it would match any string that starts with “cat” or ends with “dog” or “dogs,” which is rarely what you want and is a common beginner mistake worth watching for.
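To make the scoping issue concrete, here is a small sketch comparing the grouped pattern with an ungrouped one. The test words are illustrative assumptions.

```javascript
// Grouped: the alternation is confined to cat or dog.
const grouped = /^(cat|dog)s?$/;
const accepted = ["cat", "cats", "dog", "dogs"].every((w) => grouped.test(w));
const rejected = ["bird", "category", "hotdog"].every((w) => !grouped.test(w));

// Ungrouped: this reads as (^cat) OR (dogs?$), so each anchor
// applies to only one alternative.
const ungrouped = /^cat|dogs?$/;
const leaksStart = ungrouped.test("category"); // starts with "cat"
const leaksEnd = ungrouped.test("hotdog");     // ends with "dog"
```

Both leak checks come out true for the ungrouped pattern, which is exactly the over-matching the group prevents.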

Flags that change matching behavior

Flags are options that modify how the entire pattern behaves, and the tool above exposes the four most useful ones as checkboxes. The global flag g finds all matches rather than stopping at the first, which is why you usually want it on when testing. The ignore-case flag i makes the pattern match regardless of letter case, so cat with the i flag also matches “Cat” and “CAT.” The multiline flag m changes the anchors ^ and $ so they match the start and end of each line rather than the whole string, which matters when your test text spans several lines. The dotall flag s makes the dot match newlines too, which it normally does not. Toggling these in the tool and watching the matches update is the fastest way to build an intuition for what each does.
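The effect of each flag can also be checked in code. In this sketch the flags are written after the closing slash of a JavaScript regex literal; the sample strings are assumptions for illustration.

```javascript
const text = "cat Cat CAT";

// No g flag: match() reports only the first match.
const firstOnly = text.match(/cat/);

// g finds every match; adding i ignores letter case.
const allExact = text.match(/cat/g);
const allAnyCase = text.match(/cat/gi);

// m makes ^ and $ match at the start and end of each line.
const lines = "one\ntwo\nthree";
const perLine = lines.match(/^\w+$/gm);
const wholeString = lines.match(/^\w+$/g); // null: the newlines are not word characters

// s lets the dot match a newline.
const dotDefault = /a.b/.test("a\nb");
const dotAll = /a.b/s.test("a\nb");
```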

How to use this regex tester

Type your pattern into the field between the slashes at the top, and toggle the flags you need. Enter or paste your sample text into the test string box, and matches light up immediately, with alternating highlight colors so adjacent matches are easy to tell apart. Below the test box, every match is listed with its position and any captured groups broken out individually, so you can confirm not just that something matched but exactly what each group captured. The explanation strip describes your pattern in plain English as you build it, and if the pattern is invalid it tells you what went wrong. The library buttons load tested patterns for common tasks like emails, URLs, dates, phone numbers, hex colors, and IP addresses, which you can use directly or adapt as a starting point. Everything runs locally, so you can test patterns against real, sensitive data without it leaving your browser.

Common patterns explained

The library buttons above load patterns for the tasks developers reach for most often. Understanding how they are built makes them easy to adapt rather than copy blindly.

Matching an email address

The email pattern \b[\w.+-]+@[\w-]+\.[\w.-]+\b reads as: a word boundary, then one or more characters that are word characters, dots, plus signs, or hyphens, then a literal at-sign, then the domain made of word characters and hyphens, a literal dot, and the top-level part. It is deliberately pragmatic rather than exhaustive. A fully standards-compliant email regex is famously enormous and almost never worth it; a practical pattern like this catches the addresses you actually encounter while staying readable.
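As a rough sketch of how this pattern might be used to pull addresses out of running text, the snippet below applies it with the g flag. The sample sentence and its addresses are made up for illustration.

```javascript
// The email pattern from the text, with g to collect every match.
const emailPattern = /\b[\w.+-]+@[\w-]+\.[\w.-]+\b/g;

const message = "Contact ada@example.com or grace.hopper+news@mail.example.org.";
const addresses = message.match(emailPattern);

// Text with no address in it produces null rather than an empty array.
const none = "not an email".match(emailPattern);
```

The trailing \b is what keeps the sentence-ending period out of the second address: the match has to end on a word character.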

Matching a URL

The URL pattern https?://[\w.-]+(?:/[\w./?%&=-]*)? uses the optional s in https? to accept both http and https, matches the host, and then has an optional non-capturing group for the path and query portion. The optional group is why it matches both a bare domain and a full path with query parameters. If you extract a URL with it, the URL encoder is the natural next step for handling its query values.
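Here is a minimal sketch of the URL pattern in JavaScript. One assumption about embedding: inside a regex literal the forward slashes are escaped as \/ so they do not end the literal early. The sample sentence is illustrative.

```javascript
// The URL pattern from the text; each / is written as \/ inside the literal.
const urlPattern = /https?:\/\/[\w.-]+(?:\/[\w.\/?%&=-]*)?/g;

const note = "See http://example.com and https://example.org/docs/page?id=7&lang=en for details.";
const urls = note.match(urlPattern);
```

The first URL matches without the optional group, and the second uses it for the path and query. Because the host class includes the dot, a URL placed directly before a sentence-ending period would pick that period up, so trimming may be needed in practice.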

Matching a date

The date pattern \b\d{4}[-/]\d{2}[-/]\d{2}\b matches a four-digit year, a separator that can be a hyphen or a slash, two digits, a second separator (again a hyphen or a slash, not necessarily the same one as the first), and two more digits. Note that it validates shape, not real-world validity: it will happily match an impossible date like a thirteenth month. Regex is excellent at recognizing the form of data and poor at understanding its meaning, which is a distinction worth internalizing.

That last point generalizes. Regex answers “does this text have the right shape,” not “is this value actually valid or sensible.” For real validation you often pair a regex shape check with logic in your code that confirms the meaning, such as checking that a matched date is a real calendar date. Using regex for what it is good at, and code for the rest, produces far more reliable results than trying to cram every rule into one monstrous pattern.
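One way to pair a shape check with a meaning check is sketched below. The function name isRealDate is hypothetical, and the pattern is an anchored variant of the date pattern above with capturing groups added, so it validates a whole input rather than searching inside text.

```javascript
// Shape check: anchored variant of the date pattern, with groups for the parts.
const dateShape = /^(\d{4})[-\/](\d{2})[-\/](\d{2})$/;

// Hypothetical helper: true only if the text has the right shape AND
// names a real calendar date.
function isRealDate(text) {
  const m = dateShape.exec(text);
  if (!m) return false;
  const year = Number(m[1]);
  const month = Number(m[2]);
  const day = Number(m[3]);
  // Date.UTC rolls impossible values over (month 13 becomes January of the
  // next year), so a round trip that changes any part means the date is not real.
  // Caveat: Date.UTC maps years 0-99 to 1900-1999, so this sketch rejects them.
  const d = new Date(Date.UTC(year, month - 1, day));
  return (
    d.getUTCFullYear() === year &&
    d.getUTCMonth() === month - 1 &&
    d.getUTCDate() === day
  );
}
```

With this split, "2024-13-01" and "2023-02-29" pass the regex but fail the calendar check, while "2024-02-29" passes both.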

Regex performance and catastrophic backtracking

Most regular expressions run in negligible time, but it is possible to write a pattern that performs catastrophically on certain inputs, and this is worth understanding because it can become a real security and reliability issue. The problem, known as catastrophic backtracking, arises when a pattern can match the same text in an enormous number of ways, forcing the engine to try a combinatorial explosion of possibilities before giving up.

The classic trigger is nested quantifiers, such as (a+)+$ applied to a long string of a’s followed by a character that fails the match. The engine tries every possible way to split the a’s between the inner and outer quantifier, and the number of combinations grows exponentially with the input length, so a string of a few dozen characters can hang the engine for seconds or longer. When this happens in a web server validating user input, an attacker can deliberately send such a string to exhaust resources, a denial-of-service attack sometimes called ReDoS.

The defenses are straightforward once you know to look. Avoid nesting quantifiers where one already covers the other, prefer specific character classes over the very permissive dot when you can, and be cautious with patterns that allow overlapping ways to match the same text. If you are validating untrusted input with a complex pattern, test it against deliberately awkward strings, long runs of repeated characters in particular, and watch that it stays fast. The tool above includes a guard against zero-width infinite loops, but the broader lesson is to keep validation patterns simple and unambiguous.
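A sketch of that kind of check follows. It compares the nested pattern with a rewrite that uses a single quantifier, and includes a hypothetical timeMatch helper for timing a pattern against an awkward input. The input is kept deliberately short, since in a backtracking engine each extra character roughly doubles the work for the nested version.

```javascript
// The nested quantifier from the text, and a rewrite with one quantifier.
const risky = /(a+)+$/;
const safer = /a+$/;

// Hypothetical helper: run one test() call and report how long it took.
function timeMatch(regex, input) {
  const start = Date.now();
  const matched = regex.test(input);
  return { matched, ms: Date.now() - start };
}

// The two patterns agree on what matches; only the cost differs.
const samples = ["aaaa", "aaaa!", "", "baaa"];
const agree = samples.every((s) => risky.test(s) === safer.test(s));

// Awkward input: a run of a's followed by a character that makes the match fail.
const awkward = "a".repeat(18) + "!";
const riskyRun = timeMatch(risky, awkward);
const saferRun = timeMatch(safer, awkward);
```

Comparing riskyRun.ms and saferRun.ms while lengthening the run of a's a few characters at a time is one way to see the growth without hanging the page.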

Regex across different languages

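The core regex syntax covered here is largely consistent across programming languages, which is why a pattern tested in this tool usually transfers well. Character classes, quantifiers, anchors, groups, and alternation behave the same way, or very nearly so, in JavaScript, Python, Java, PHP, Ruby, and Go. Command-line tools are a partial exception: grep, for example, defaults to an older basic syntax in which several of these metacharacters are written differently, and needs the -E option for the extended syntax closest to what is described here. That shared foundation is what makes regex a genuinely transferable skill rather than something you relearn for each language.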

The differences are mostly at the edges. Some languages support advanced features like lookbehind, named groups, or recursion that others lack or implement differently, and the exact flags and the way you write a pattern in source code vary. In JavaScript a pattern can be written between slashes as a literal; in Python it is usually a raw string passed to the re module; in Java it is a string with doubled backslashes because the language also uses the backslash as an escape. These are surface differences in how you embed the pattern, not in how the pattern itself matches.
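The same embedding question comes up inside JavaScript itself, which makes for a compact illustration. The sketch below, with illustrative variable names, builds one pattern three ways: as a literal, as an ordinary string with doubled backslashes, and as a raw template string.

```javascript
// 1. Regex literal: backslashes are written once.
const fromLiteral = /\b\d{4}\b/g;

// 2. Ordinary string: the string syntax consumes one level of backslashes,
//    so each one is doubled (the same reason Java patterns double them).
const fromString = new RegExp("\\b\\d{4}\\b", "g");

// 3. Raw template string: backslashes pass through untouched.
const fromRaw = new RegExp(String.raw`\b\d{4}\b`, "g");

// All three compile to the same pattern.
const sameSource =
  fromLiteral.source === fromString.source &&
  fromString.source === fromRaw.source;

// Forgetting to double the backslashes silently builds a different pattern:
// "\b" in a string is a backspace character and "\d" is just "d".
const broken = new RegExp("\b\d{4}\b");
const brokenMatches = broken.test("2024");

const years = "Built 1999, rebuilt 2024".match(fromLiteral);
```

The broken variant does not throw an error, it just fails to match, which is why this kind of embedding mistake can be hard to spot.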

The practical workflow this enables is to build and verify your pattern here, where you get instant visual feedback, and then drop the verified pattern into your language of choice, adjusting the surrounding syntax for how that language expects a regex to be written. For the common patterns covered above, the matching behavior is close enough that a pattern that works in the tester will usually behave the same way once embedded, which saves a great deal of guesswork. A few defaults do differ, though: in Python 3, for instance, \d and \w match Unicode digits and letters unless you ask for ASCII-only matching, and in Ruby ^ and $ always match at line boundaries. When you rely on a dialect-specific feature or on an edge case like these, run a quick check in the target language before trusting the result.

Lookahead and lookbehind

Once you are comfortable with the basics, lookarounds are the next step up in expressiveness. A lookahead lets you assert that something does or does not follow, without including it in the match. A positive lookahead is written (?=...) and a negative lookahead (?!...). For example, \d+(?= dollars) matches a number only when it is followed by the word “dollars,” but the word itself is not part of the match. This is invaluable when you want to find text based on its context but only capture part of it.

Lookbehind works the same way in the other direction, asserting what comes before. A positive lookbehind (?<=...) and negative lookbehind (?<!...) let you match something only when it is or is not preceded by a given pattern. A common use is matching a number after a currency symbol without consuming the symbol, such as (?<=\$)\d+ to grab the digits after a dollar sign. Support for lookbehind has historically varied between regex flavors more than lookahead, so it is one of the features worth confirming in your target language, though modern JavaScript, which powers the tool above, supports both.

Lookarounds are powerful but can make patterns harder to read, so the practical advice is to use them when they genuinely simplify the problem, typically when the surrounding context matters but should not be captured, and to reach for simpler constructs the rest of the time. As with everything in regex, testing the pattern against real examples in the tool above is the fastest way to confirm a lookaround does what you intend before you commit it to code.
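The four lookaround forms can be compared side by side in a short sketch. The sample sentences are illustrative assumptions, and the negative examples add \b so the engine cannot satisfy the assertion by matching only part of a number.

```javascript
const payment = "I paid 30 dollars and 15 euros";

// Positive lookahead: digits only when " dollars" follows; the word is not matched.
const dollars = payment.match(/\d+(?= dollars)/g);

// Negative lookahead: whole numbers NOT followed by " dollars".
const notDollars = payment.match(/\b\d+\b(?! dollars)/g);

const receipt = "Cost: $45 for 3 items";

// Positive lookbehind: digits only when a dollar sign comes right before.
const afterSign = receipt.match(/(?<=\$)\d+/g);

// Negative lookbehind: whole numbers NOT preceded by a dollar sign.
const noSign = receipt.match(/(?<!\$)\b\d+/g);
```

In each case the surrounding context decides whether a number matches, but the result contains only the digits.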

A workflow for building patterns

Beginners often try to write a complex pattern all at once and then puzzle over why it fails. A far more reliable approach is to build incrementally. Start with the simplest pattern that matches part of your target, confirm it highlights what you expect in the tester, then extend it one piece at a time, checking after each addition. If a change suddenly breaks the matching or grabs too much, you know exactly which addition caused it. This tight feedback loop, edit the pattern, watch the highlights, read the plain-English explanation, is precisely what a live tester is for, and it turns regex from frustrating guesswork into a quick, almost conversational process.

Frequently asked questions

What is a regex tester used for?

It lets you write a regular expression and immediately see what it matches in sample text, including captured groups, before you use the pattern in your code. This makes building and debugging patterns far faster than trial and error inside a program.

Why is my regex matching too much text?

Most often because of a greedy quantifier. By default * and + match as much as possible. Add a ? after the quantifier to make it lazy and match as little as possible, which usually fixes over-matching.

What does \b mean in regex?

It is a word boundary anchor. It matches the position between a word character and a non-word character, letting you match whole words. \bcat\b matches “cat” as a word but not inside “category.”

What is the difference between a capturing and non-capturing group?

A capturing group (...) groups part of a pattern and records what it matched so you can extract it. A non-capturing group (?:...) groups without recording, which keeps your captured output clean when you only need the grouping for a quantifier or alternation.

Does this regex tester support all regex flavors?

It uses the JavaScript regex engine, which is one of the most widely used dialects. The core syntax, classes, quantifiers, groups, and anchors are shared across nearly all flavors, so patterns you build here transfer well, though some advanced features differ between languages.

Is my pattern or test text sent anywhere?

No. Both the pattern and the test string are processed entirely in your browser. Nothing is uploaded, logged, or stored.
