dev-resources.site

Intro to Regular Expressions

Published at: 9/12/2024
Categories: regex, beginners
Author: Jae Jeong

Introduction

A regular expression, or regex, is a sequence of characters that defines a search pattern for matching text within strings. Regexes are commonly used to validate, search, replace, or extract text, for example to find or check strings that must follow a required format such as phone numbers, email addresses, or dates.
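
The four uses named above can be sketched with Python's standard `re` module. The sample text and the seven-digit phone pattern are made up for illustration and are not taken from a real validation rule.

```python
import re

text = "Call 555-1234 or 555-9876 before 10/31/24."

# Hypothetical pattern for illustration: three digits, a hyphen, four digits.
phone = re.compile(r"\d{3}-\d{4}")

# Validating: does the whole string fit the format?
is_valid = phone.fullmatch("555-1234") is not None

# Searching: find the first occurrence anywhere in the text.
first = phone.search(text).group()

# Extracting: collect every occurrence.
all_numbers = phone.findall(text)

# Replacing: mask every occurrence.
masked = phone.sub("XXX-XXXX", text)

print(is_valid, first, all_numbers, masked, sep="\n")
```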

Character Classes

Character Class  Syntax  Description
Character Set    [ABC]   Matches any character in the set
Negated Set      [^ABC]  Matches any character not in the set
Range            [A-Z]   Matches any character between the two specified characters, inclusive
Dot              .       Matches any character except line breaks
Word             \w      Matches any word character (alphanumeric or underscore); equivalent to [A-Za-z0-9_]
Not Word         \W      Matches any character that is not a word character
Digit            \d      Matches any digit (0-9)
Not Digit        \D      Matches any character that is not a digit
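
As a rough illustration of the classes in the table, the sketch below runs each one against a made-up sample string using Python's `re` module. One assumption worth flagging: in Python, `\w` and `\d` also match non-ASCII letters and digits by default, so the `re.ASCII` flag is used where the table's `[A-Za-z0-9_]` equivalence is meant literally.

```python
import re

sample = "Cab_42 x-Ray!"  # made-up sample text

# Character set: any one of a, b, or c.
in_set = re.findall(r"[abc]", sample)

# Negated set: anything that is not a letter.
not_letters = re.findall(r"[^A-Za-z]", sample)

# Range: uppercase letters only.
uppercase = re.findall(r"[A-Z]", sample)

# \w+ groups runs of word characters; re.ASCII limits \w to [A-Za-z0-9_].
words = re.findall(r"\w+", sample, flags=re.ASCII)

# \W is the complement of \w.
non_word = re.findall(r"\W", sample, flags=re.ASCII)

# \d picks out digits; \D removes everything that is not a digit.
digits = re.findall(r"\d", sample)
digits_only = re.sub(r"\D", "", sample)

# The dot skips line breaks (unless a flag such as re.DOTALL is set).
dot_hits = re.findall(r".", "a\nb")

print(in_set, not_letters, uppercase, words, non_word, digits, digits_only, dot_hits)
```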

Anchors

Anchor             Syntax  Description
Beginning          ^       Matches the beginning of the string
End                $       Matches the end of the string
Word Boundary      \b      Matches a word boundary: a position between a word character and a non-word character (or the start or end of the string)
Not Word Boundary  \B      Matches any position that is not a word boundary
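
Anchors match positions rather than characters, which is easiest to see by looking at where matches start. The following is a small sketch in Python with a made-up sample string; note that Python's `$` also matches just before a trailing newline, a detail that may differ in other regex engines.

```python
import re

text = "cat concat category"  # made-up sample text

# ^ and $ tie the pattern to the start and end of the string.
starts_with_cat = re.search(r"^cat", text) is not None
starts_with_con = re.search(r"^con", text) is not None
ends_with_category = re.search(r"category$", text) is not None

# \b requires a word boundary; \B requires the absence of one.
# Each list holds the index at which a match begins.
whole_word = [m.start() for m in re.finditer(r"\bcat\b", text)]
word_start = [m.start() for m in re.finditer(r"\bcat", text)]
inside_word = [m.start() for m in re.finditer(r"\Bcat", text)]

print(starts_with_cat, starts_with_con, ends_with_category)
print(whole_word, word_start, inside_word)
```

Here only the standalone "cat" counts as a whole word, "category" joins it when just the leading boundary is required, and the "cat" inside "concat" is the only match for `\Bcat`.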

Reserved Characters

In regular expressions, some characters (+ * ? ^ $ \ . [ ] { } ( ) |) are reserved for specific purposes, as seen above. To match one of these characters literally, precede it with a backslash (\). The forward slash (/) also needs escaping wherever it serves as the pattern delimiter, as in JavaScript regex literals.
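
A short sketch of why escaping matters, using Python's `re` module and a made-up price string. `re.escape` is the standard-library helper that adds the backslashes automatically.

```python
import re

price = "Total: $5.00 (approx.)"  # made-up sample text

# Unescaped, "$" is an end anchor and "." matches any character,
# so this pattern cannot find the literal text "$5.00".
unescaped = re.search(r"$5.00", price)

# Escaping both reserved characters matches the text literally.
escaped = re.search(r"\$5\.00", price).group()

# An unescaped dot is too permissive: it accepts any character.
too_loose = re.search(r"5.00", "5x00") is not None

# re.escape backslash-escapes the reserved characters for us.
auto_escaped = re.escape("$5.00")

print(unescaped, escaped, too_loose, auto_escaped)
```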

Quantifiers & Alternation

Name         Syntax  Description
Plus         +       Matches 1 or more of the preceding token
Star         *       Matches 0 or more of the preceding token
Quantifier   {1,3}   Matches the specified quantity of the preceding token: {1,3} matches 1 to 3, {3} matches exactly 3, and {3,} matches 3 or more
Optional     ?       Matches 0 or 1 of the preceding token
Lazy         ?       Makes the preceding quantifier lazy, so it matches as few characters as possible
Alternation  |       Matches the expression before or after the |
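
The difference between these quantifiers, and between greedy and lazy matching in particular, is easier to see on concrete input. The sketch below uses Python's `re` module; all sample strings are invented for illustration.

```python
import re

letters = "a ab abbb"  # made-up sample text

# * allows zero b's, + needs at least one, {3} needs exactly three.
star = re.findall(r"ab*", letters)
plus = re.findall(r"ab+", letters)
exactly_three = re.findall(r"ab{3}", letters)

# {1,3} takes up to three digits at a time, so a long run is split.
ranged = re.findall(r"\d{1,3}", "7 42 12345")

# ? makes the preceding token optional.
optional = [bool(re.fullmatch(r"colou?r", w)) for w in ("color", "colour", "colouur")]

# Greedy .+ runs to the last ">"; lazy .+? stops at the first one.
html = "<b>bold</b>"
greedy = re.findall(r"<.+>", html)
lazy = re.findall(r"<.+?>", html)

# Alternation matches either side of the |.
either = re.findall(r"cat|dog", "cat, dog, bird")

print(star, plus, exactly_three, ranged, optional, greedy, lazy, either, sep="\n")
```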

Examples

Example          Syntax
Phone Number     ^(\d{3})\s?\d{3}[-\s]?\d{4}$
Email Address    ^\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,3}$
Date (MM/DD/YY)  ^(0[1-9]|1[0-2])/(0[1-9]|[12]\d|3[01])/(\d{2})$
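
One way to try these patterns is to run them against a few sample inputs. The sketch below uses Python's `re` module; the helper name `is_match` and all test strings are hypothetical. The email pattern here escapes the dot before the top-level domain (`\.`) so that it matches a literal period, and the parentheses in the phone pattern are treated as a capturing group, so an input such as "(555) 123-4567" is not accepted.

```python
import re

# Patterns from the examples table; the email dot is escaped (assumption
# explained above) so that it matches only a literal period.
PATTERNS = {
    "phone": r"^(\d{3})\s?\d{3}[-\s]?\d{4}$",
    "email": r"^\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,3}$",
    "date": r"^(0[1-9]|1[0-2])/(0[1-9]|[12]\d|3[01])/(\d{2})$",
}


def is_match(kind: str, value: str) -> bool:
    """Return True if value matches the named pattern (hypothetical helper)."""
    return re.match(PATTERNS[kind], value) is not None


# Phone: ten digits with an optional space and an optional hyphen or space.
print(is_match("phone", "555 123-4567"))    # accepted
print(is_match("phone", "(555) 123-4567"))  # rejected: literal parentheses

# Email: word characters, "@", letters, a period, and 2 to 3 letters.
print(is_match("email", "jae@example.com"))   # accepted
print(is_match("email", "jae@exampleXcom"))   # rejected: no period

# Date: the three groups capture month, day, and two-digit year.
month, day, year = re.match(PATTERNS["date"], "09/12/24").groups()
print(month, day, year)
print(is_match("date", "13/01/24"))  # rejected: month out of range
```

These patterns are deliberately simple. For instance, the email pattern rejects addresses with dots before the "@" or with top-level domains longer than three letters, so stricter or looser rules may be needed in practice.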
