Tabula Documentation

Using Regex

Introduction

What are Regular Expressions?

Regular expressions, or regex for short, are a powerful and flexible tool for working with text. A regular expression is a sequence of characters that defines a search pattern, which can be used to search, match, and manipulate strings. Regex support is available in most programming languages, including Python, JavaScript, and Java, as well as in many SQL dialects.

Regex provides a concise and expressive way to represent patterns in text. It can be used for simple tasks like finding words or for more complex operations like validating email addresses or parsing log files.

Applications of Regex

Regex is commonly used in various applications, such as:

  • Text processing and analysis

  • Data extraction and parsing

  • Input validation

  • Search and replace operations

  • String manipulation and transformation

  • Syntax highlighting and code formatting

Some real-world examples of using regex include:

  • Searching for specific words or phrases in a document

  • Extracting dates, phone numbers, or URLs from a text file

  • Validating user input, such as email addresses or passwords

  • Replacing specific patterns in a text with other values
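
As a rough illustration of the extraction and replacement tasks listed above, here is a minimal sketch using Python's standard `re` module. Python is used only for demonstration; the sample text and the YYYY-MM-DD date format are assumptions, and the regex dialect used inside Tabula may differ in details.

```python
import re

# Sample text (made up for this illustration).
text = "Invoices were sent on 2024-01-15 and 2024-02-03."

# Extract every date written as YYYY-MM-DD.
dates = re.findall(r"\d{4}-\d{2}-\d{2}", text)
print(dates)   # ['2024-01-15', '2024-02-03']

# Replace each date with a placeholder.
masked = re.sub(r"\d{4}-\d{2}-\d{2}", "<date>", text)
print(masked)  # Invoices were sent on <date> and <date>.
```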

Basic concepts

A regular expression is a sequence of characters that defines a search pattern. This pattern can be used to match strings or parts of strings.

Here are some basic terms to understand when working with regex:

  • Literal characters: Ordinary characters that match themselves, e.g., abc would match the string abc.

  • Metacharacters: Special characters that have a specific meaning, e.g., . matches any single character except a newline.

  • Character classes: Define a set of characters to match, e.g., [a-z] would match any lowercase letter.

  • Quantifiers: Specify how many times a character or group should appear, e.g., a{3} would match aaa.
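
The four terms above can be tried out with a short sketch in Python's standard `re` module. The sample strings are made up for illustration, and Python is used only as a convenient way to test patterns; it is not necessarily how Tabula evaluates them.

```python
import re

# Literal characters: "abc" matches itself.
literal = re.search(r"abc", "xxabcxx").group()
print(literal)   # abc

# Metacharacter: "." stands for any single character (except a newline).
meta = re.findall(r"c.t", "cat cot cut ct")
print(meta)      # ['cat', 'cot', 'cut']

# Character class: [a-z] matches one lowercase letter.
lower = re.findall(r"[a-z]", "A1b2C3d")
print(lower)     # ['b', 'd']

# Quantifier: a{3} matches exactly three consecutive "a" characters.
triple = re.search(r"a{3}", "baaad").group()
print(triple)    # aaa
```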

Common regex metacharacters

Here are some common metacharacters and their meanings:

  • .: Matches any single character except a newline.

  • ^: Matches the start of a string.

  • $: Matches the end of a string.

  • *: Matches zero or more occurrences of the preceding character.

  • +: Matches one or more occurrences of the preceding character.

  • ?: Matches zero or one occurrence of the preceding character.

  • {n}: Matches exactly n occurrences of the preceding character.

  • {n,}: Matches n or more occurrences of the preceding character.

  • {n,m}: Matches between n and m occurrences of the preceding character.

  • [...]: Defines a character class, matching any single character within the brackets.

  • [^...]: Negated character class, matching any single character NOT within the brackets.

  • |: Alternation, matches either the expression before or after the symbol.

  • (...): Grouping, allows applying quantifiers to the entire group.
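
A few of these metacharacters can be seen in action in the following sketch, written with Python's standard `re` module. The helper `matches` and the sample strings are hypothetical, introduced only for this illustration; other regex engines may behave slightly differently.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True if the pattern is found anywhere in the text."""
    return re.search(pattern, text) is not None

# ^ and $ anchor the pattern to the start and end of the string.
anchored = [matches(r"^cat$", s) for s in ["cat", "concat", "cats"]]
print(anchored)    # [True, False, False]

# {2,3} allows two or three repetitions of the preceding element.
bounded = [matches(r"^[0-9]{2,3}$", s) for s in ["7", "42", "123", "4567"]]
print(bounded)     # [False, True, True, False]

# [^...] matches any single character that is NOT listed.
non_digits = re.findall(r"[^0-9]", "a1b2")
print(non_digits)  # ['a', 'b']

# | chooses between alternatives.
either = re.findall(r"cat|dog", "cat, dog, bird")
print(either)      # ['cat', 'dog']

# (...) groups characters so that + applies to the whole group.
repeated = re.search(r"(ab)+", "xxababab!").group()
print(repeated)    # ababab
```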

Examples

Here are some simple examples to illustrate regex patterns:

  1. ^hello: Matches strings that start with "hello".

  2. world$: Matches strings that end with "world".

  3. a.b: Matches strings containing "a", then any single character, then "b".

  4. ab*c: Matches strings containing "a", followed by zero or more "b"s, and then "c".

  5. ab+c: Matches strings containing "a", followed by one or more "b"s, and then "c".

  6. ab?c: Matches strings containing "a", followed by zero or one "b", and then "c".

  7. [A-Za-z0-9]: Matches any single alphanumeric character.
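
One way to check the seven patterns above is to pair each with a string that should match and one that should not. The sketch below does this with Python's standard `re` module; the test strings are assumptions chosen for illustration.

```python
import re

# Each pattern from the list, with one matching and one non-matching string.
cases = [
    (r"^hello",      "hello there", "say hello"),
    (r"world$",      "hello world", "world peace"),
    (r"a.b",         "a-b",         "ab"),
    (r"ab*c",        "ac",          "adc"),
    (r"ab+c",        "abbc",        "ac"),
    (r"ab?c",        "abc",         "abbc"),
    (r"[A-Za-z0-9]", "x",           "?!"),
]

# For every pattern, record whether each of the two strings matched.
results = [
    (pattern, bool(re.search(pattern, hit)), bool(re.search(pattern, miss)))
    for pattern, hit, miss in cases
]

for pattern, hit_matched, miss_matched in results:
    print(pattern, hit_matched, miss_matched)  # each line ends with: True False
```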

Example 1

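String: "The quick brown fox jumps over the lazy dog"

Regex: \b\w{5}\b

Replace To: "0"

Result: "The 0 0 fox 0 over the lazy dog"

Here \b marks a word boundary and \w{5} matches exactly five word characters, so each five-letter word (quick, brown, jumps) is replaced with 0.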
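
The same replacement can be reproduced outside Tabula. This is a minimal sketch using `re.sub` from Python's standard library, shown only to make the example easy to verify; it is not Tabula's own implementation.

```python
import re

text = "The quick brown fox jumps over the lazy dog"

# \b is a word boundary and \w{5} is exactly five word characters,
# so only whole five-letter words are replaced.
result = re.sub(r"\b\w{5}\b", "0", text)
print(result)  # The 0 0 fox 0 over the lazy dog
```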

Example 2

This example uses regex to find runs of multiple spaces in a text and replace each run with a single space.

Text: "This  is   an example   with  multiple    spaces."

Regex: \s{2,}

Replace To: " "

Result: "This is an example with multiple spaces."

In this example, the pattern is \s{2,}. It matches any sequence of two or more whitespace characters. The replacement is a single space character.
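
For readers who want to test this pattern in code, here is a minimal sketch with Python's standard `re` module. The input string and its exact spacing are assumptions made for the illustration.

```python
import re

# Input with runs of two or more spaces (spacing chosen for illustration).
text = "This   is an  example with    multiple spaces."

# Collapse every run of two or more whitespace characters into one space.
result = re.sub(r"\s{2,}", " ", text)
print(result)  # This is an example with multiple spaces.
```

Single spaces are left untouched because the quantifier {2,} requires at least two whitespace characters.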


Last updated 1 year ago


See the full list of supported tokens.