Tuesday, 10 March 2026

Regular expression - basic

Mastering Regular Expressions (Regex) in Bash

Regular expressions (regex) are a powerful tool for pattern matching in Bash. They let you search, transform, and validate text efficiently. Regex is commonly used with:

  • grep

  • sed

  • awk

  • [[ string =~ regex ]] (Bash built-in)

  • find (through its -regex test)

A regular expression is essentially a pattern used to match text. In Bash, regex is primarily used in two ways:

  1. Inside the [[ ... ]] test construct

  2. Through external utilities like grep, sed, and awk
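As a rough illustration of these two ways, the sketch below runs the same start-of-line pattern once with the [[ ... ]] built-in and once through grep. The sample text and variable names are made up for the example.

```shell
#!/usr/bin/env bash
# Way 1: the [[ ... =~ ... ]] built-in tests a single string inside the shell.
line='error: disk full'          # hypothetical sample text
if [[ $line =~ ^error ]]; then
    builtin_result='match'
else
    builtin_result='no match'
fi
echo "built-in: $builtin_result"

# Way 2: an external utility (grep here) filters lines from a stream or file.
grep_result=$(printf '%s\n' 'error: disk full' 'info: ok' | grep '^error')
echo "grep: $grep_result"
```

The built-in is convenient for validating one value; grep, sed, and awk are the better fit when processing many lines.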


1. The Two Main Flavors of Regex

BRE (Basic Regular Expressions)

  • Default for grep and sed.

  • The characters +, ?, {, and ( are literal unless escaped with a backslash (\). \{ and \( are standard POSIX BRE; \+ and \? are GNU extensions.

  • Example: grep 'a\+b' file.txt matches one or more a followed by b (GNU grep).

ERE (Extended Regular Expressions)

  • Used by grep -E (the older egrep is deprecated), sed -E, awk, and Bash’s [[ =~ ]] operator.

  • +, ?, {, (, and | act as metacharacters without backslashes.

  • Example: grep -E 'a+b' file.txt
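To make the BRE/ERE difference concrete, here is a small sketch that counts matching lines for the same pattern under both flavors. The three-line sample input and the variable names are assumptions for the example.

```shell
#!/usr/bin/env bash
sample=$'ab\naab\nb'   # hypothetical three-line input

# BRE (plain grep): an unescaped + is a literal plus sign, so no line matches.
# "|| true" only hides grep's non-zero exit status when the count is 0.
bre_literal=$(printf '%s\n' "$sample" | grep -c 'a+b' || true)

# ERE (grep -E): + means "one or more of the preceding item".
ere_plus=$(printf '%s\n' "$sample" | grep -cE 'a+b' || true)

# BRE with the POSIX interval \{1,\} expresses "one or more" portably.
bre_interval=$(printf '%s\n' "$sample" | grep -c 'a\{1,\}b' || true)

echo "BRE 'a+b':        $bre_literal matching lines"
echo "ERE 'a+b':        $ere_plus matching lines"
echo "BRE 'a\{1,\}b':   $bre_interval matching lines"
```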

Tip: For more advanced features (lookaheads, lookbehinds), GNU grep's -P option enables Perl-compatible regular expressions (PCRE), provided grep was built with PCRE support.


2. Core Syntax Cheat Sheet

Anchors & Wildcard

  • ^ : Matches the start of a line

  • $ : Matches the end of a line

  • . : Matches any single character (in line-based tools such as grep, any character within the line)
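A minimal sketch of these three symbols using Bash's [[ =~ ]]. The matches helper is a made-up function for this example, not a built-in.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints "yes" if the string matches the regex, else "no".
matches() {
    local string=$1 regex=$2
    if [[ $string =~ $regex ]]; then echo yes; else echo no; fi
}

matches 'catalog' '^cat'    # yes: "cat" is at the start
matches 'bobcat'  '^cat'    # no:  "cat" is not at the start
matches 'bobcat'  'cat$'    # yes: "cat" is at the end
matches 'cat'     '^c.t$'   # yes: . stands for the single character "a"
matches 'ct'      '^c.t$'   # no:  . must consume exactly one character
```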

Quantifiers (How many times?)

  • * : 0 or more of the preceding item

  • + : 1 or more of the preceding item (ERE)

  • ? : 0 or 1 of the preceding item (ERE)

  • {n,m} : Between n and m occurrences (ERE; written \{n,m\} in BRE)
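The quantifiers can be compared side by side with a short sketch. It uses grep -E with -x (whole-line match) on a made-up word list; the filter helper and the word list are assumptions for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints, on one line, every word that the ERE matches in full.
filter() {
    local regex=$1; shift
    printf '%s\n' "$@" | grep -Ex "$regex" | paste -sd' ' -
}

words=(ac abc abbc abbbc abbbbc)

star=$(filter 'ab*c' "${words[@]}")        # zero or more b
plus=$(filter 'ab+c' "${words[@]}")        # one or more b
opt=$(filter 'ab?c' "${words[@]}")         # zero or one b
range=$(filter 'ab{2,3}c' "${words[@]}")   # two to three b

echo "ab*c      : $star"
echo "ab+c      : $plus"
echo "ab?c      : $opt"
echo "ab{2,3}c  : $range"
```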

Character Classes

  • [abc] : Matches any one of a, b, or c

  • [^abc] : Matches any character except a, b, or c

  • [0-9] : Matches any digit

  • [a-z] : Matches any lowercase letter
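A short sketch of the character classes in [[ =~ ]]. The matches helper is hypothetical. Setting LC_ALL=C is a precaution added for this example, because how ranges such as [a-z] are interpreted can depend on the locale.

```shell
#!/usr/bin/env bash
# Precaution (an assumption of this sketch): use the C locale so that
# [a-z] means exactly the ASCII lowercase letters.
LC_ALL=C

# Hypothetical helper: prints "yes" if the string matches the regex, else "no".
matches() {
    local string=$1 regex=$2
    if [[ $string =~ $regex ]]; then echo yes; else echo no; fi
}

matches 'bat'    '^[abc]at$'        # yes: b is in the set
matches 'rat'    '^[abc]at$'        # no:  r is not in the set
matches 'rat'    '^[^abc]at$'       # yes: r is anything except a, b, c
matches 'room42' '^[a-z]+[0-9]+$'   # yes: lowercase letters, then digits
matches 'Room42' '^[a-z]+[0-9]+$'   # no:  R is uppercase
```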


3. Basic Examples in Bash

Example 1: Matching Names

[root@oel01db ~]# cat re_simple.sh
re='^(dave|joe)'
input=$1

if [[ $input =~ $re ]]; then
    echo match
else
    echo no match
fi
[root@oel01db ~]# bash re_simple.sh dave
match
[root@oel01db ~]# bash re_simple.sh davejohn
match

Here, the string must start with dave or joe. Anything may follow, which is why davejohn also matches.


Example 2: Exact Match at Start and End

[root@oel01db ~]# cat re_simple.sh
re='^(dave|joe)$'
input=$1

if [[ $input =~ $re ]]; then
    echo match
else
    echo no match
fi
[root@oel01db ~]# bash re_simple.sh davejohn
no match
[root@oel01db ~]# bash re_simple.sh dave
match

Adding $ anchors the end of the pattern, so the entire string must be exactly dave or joe.
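To see the effect of each anchor in one place, the sketch below tests the same inputs against an unanchored, a start-anchored, and a fully anchored version of the pattern. The check helper, the variable names, and the xdave input are made up for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints "match" or "no match" for a string and a regex.
check() {
    local input=$1 re=$2
    if [[ $input =~ $re ]]; then echo 'match'; else echo 'no match'; fi
}

unanchored='(dave|joe)'      # may match anywhere in the string
start_only='^(dave|joe)'     # must match at the start
whole='^(dave|joe)$'         # must match the whole string

for input in dave davejohn xdave; do
    printf '%-9s unanchored=%-9s start=%-9s whole=%s\n' "$input" \
        "$(check "$input" "$unanchored")" \
        "$(check "$input" "$start_only")" \
        "$(check "$input" "$whole")"
done
```

Only dave passes all three; xdave passes only the unanchored pattern.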


4. Using ${BASH_REMATCH}

Bash automatically populates a special array variable called BASH_REMATCH whenever [[ string =~ regex ]] matches.

  • ${BASH_REMATCH[0]} → The entire matched string

  • ${BASH_REMATCH[1]} → The first capture group

  • ${BASH_REMATCH[2]} → The second capture group, and so on

[root@oel01db ~]# cat re_simple.sh
re='^(dave|joe)$'
input=$1

if [[ $input =~ $re ]]; then
    echo match
    printf '%s\n' "${BASH_REMATCH[@]}"
else
    echo no match
fi
[root@oel01db ~]# bash re_simple.sh joe
match
joe
joe

Why does “joe” appear twice?

  • BASH_REMATCH[0] → Entire match = joe

  • BASH_REMATCH[1] → Captured group (dave|joe) = joe

Parentheses () in regex create capture groups.
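With more than one pair of parentheses, each group gets its own index. A minimal sketch, assuming a date string as the input (the pattern and variable names are invented for this example):

```shell
#!/usr/bin/env bash
# Hypothetical example: split an ISO-style date into its parts.
re='^([0-9]{4})-([0-9]{2})-([0-9]{2})$'
input='2026-03-10'

if [[ $input =~ $re ]]; then
    whole=${BASH_REMATCH[0]}   # entire match
    year=${BASH_REMATCH[1]}    # first capture group
    month=${BASH_REMATCH[2]}   # second capture group
    day=${BASH_REMATCH[3]}     # third capture group
    echo "whole=$whole year=$year month=$month day=$day"
else
    echo 'no match'
fi
```

Groups are numbered by the position of their opening parenthesis, left to right.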


5. More Complex Example

[root@oel01db ~]# cat re_simple.sh
re='^(d|j).*$'
input=$1

if [[ $input =~ $re ]]; then
    echo match
    printf '%s\n' "${BASH_REMATCH[@]}"
else
    echo no match
fi
[root@oel01db ~]# bash re_simple.sh john
match
john
j

Here:

  • The string must start with d or j, followed by anything (.*)

  • Capture group (d|j) captures only the first character

Valid Examples:

  • john → matches

  • jack → matches

  • dave → matches

  • dog → matches

  • j → matches

  • d → matches
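For contrast, a small sketch that runs the same pattern over a few inputs, including some that should not match. The first_letter helper and the extra inputs (mike, Dave, and the empty string) are assumptions added for illustration.

```shell
#!/usr/bin/env bash
re='^(d|j).*$'

# Hypothetical helper: prints the first capture group, or "no match".
first_letter() {
    if [[ $1 =~ $re ]]; then echo "${BASH_REMATCH[1]}"; else echo 'no match'; fi
}

for name in john d mike Dave ''; do
    printf '%-8s -> %s\n' "'$name'" "$(first_letter "$name")"
done
```

Matching is case-sensitive by default, so Dave with an uppercase D does not match, and the empty string fails because at least one character (d or j) is required.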


6. Important Additional Points

  1. Escaping in BRE vs ERE

    • BRE: +, ?, {}, and () are literal unless backslash-escaped (\+, \?, \{ \}, \( \)); \+ and \? are GNU extensions

    • ERE: No escape needed for +, ?, {}, ()

  2. Testing for Regex in Bash

    • Store the regex in a variable and leave it unquoted on the right-hand side of =~. Quoting it makes Bash match it as a literal string rather than as a pattern:

      [[ $input =~ $re ]]
  3. Capture Groups vs Non-Capturing Groups

    • Capturing: (abc) → stored in ${BASH_REMATCH[n]}

    • Non-capturing (not supported in Bash [[ =~ ]]): (?:abc)

  4. Perl-Compatible Regex

    • grep -P allows lookahead/lookbehind and more advanced patterns:

      grep -P '(?<=foo)bar' file.txt
  5. Always Test Your Regex

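    • Tools like regex101.com help visualize capture groups and matches before using them in Bash. Such tools usually default to PCRE-style syntax, which differs from POSIX ERE in places, so confirm the final pattern in Bash itself.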
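On the quoting point in item 2, the following sketch shows how quoting the right-hand side of =~ changes the result in current Bash versions (3.2 and later). The variable names are made up for the example.

```shell
#!/usr/bin/env bash
re='^d.*e$'
input='dave'

# Unquoted: $re is interpreted as a regular expression.
if [[ $input =~ $re ]]; then unquoted='match'; else unquoted='no match'; fi

# Quoted: "$re" is treated as literal text, so "dave" would have to
# contain the characters ^d.*e$ themselves.
if [[ $input =~ "$re" ]]; then quoted='match'; else quoted='no match'; fi

# The quoted form does match a string that literally contains ^d.*e$.
if [[ '^d.*e$' =~ "$re" ]]; then literal='match'; else literal='no match'; fi

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
echo "literal:  $literal"
```

Keeping the pattern in a variable also avoids having to escape spaces and special characters inside [[ ... ]].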
