Specification: Ballerina Regex Library

Owners: @daneshk @kalaiyarasiganeshalingam
Reviewers: @daneshk
Created: 2021/12/07
Updated: 2022/02/17
Edition: Swan Lake

Introduction

This is the specification for the Regex standard library of the Ballerina language, which provides functionality such as matching, replacing, and splitting strings based on regular expressions.

The Regex library specification has evolved and may continue to evolve in the future. The released versions of the specification can be found under the relevant GitHub tag.

If you have any feedback or suggestions about the library, start a discussion via a GitHub issue or in the Slack channel. Based on the outcome of the discussion, the specification and implementation can be updated. Community feedback is always welcome. Any accepted proposal that affects the specification is stored under /docs/proposals. Proposals under discussion can be found with the label type/proposal in GitHub.

The conforming implementation of the specification is released and included in the distribution. Any deviation from the specification is considered a bug.

Contents

  1. Overview
  2. Operations

1. Overview

This library is based on regular expressions, a notation for describing sets of character strings that is used to specify search patterns. It supports the regular expression patterns of Java.

2. Operations

2.1. Matches

This is used to check whether a string matches the provided regex.
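Because the library supports Java regular expression patterns, the matching behaviour can be illustrated with a small Java sketch. This is not the Ballerina API: the class and method names below are hypothetical, and the sketch assumes that "matches" means the whole string must match, as with Java's `Pattern.matches`. A substring search is shown alongside for contrast.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of Java regex matching semantics; not the Ballerina API.
final class MatchesSketch {

    // True only when the entire input matches the pattern (assumed semantics).
    static boolean matchesWhole(String input, String regex) {
        return Pattern.matches(regex, input);
    }

    // True when at least one substring of the input matches the pattern.
    static boolean containsMatch(String input, String regex) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("Ballerina", "[A-Za-z]+"));       // true
        System.out.println(matchesWhole("Ballerina 2201", "[A-Za-z]+"));  // false
        System.out.println(containsMatch("Ballerina 2201", "[A-Za-z]+")); // true
    }
}
```

The distinction matters in practice: a pattern such as `[A-Za-z]+` rejects an input containing a space under whole-string matching even though part of the input matches.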

2.2. Replace

The replace APIs replace one or more substrings of the original string that match the provided regex with either the provided replacement string or the string returned by the provided function. The following function and type are used to provide the constant or dynamic replacement string.

The following APIs are provided to replace the matches:

  • To replace all substrings of the original string that match the provided regex with the provided replacement string or the string returned by the provided function.
  • To replace only the first substring, searching from the start index, that matches the provided regex with the provided replacement string or the string returned by the provided function.
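The two replacement modes described above (a constant string or a function that computes the replacement) can be sketched with Java's regex classes. This is an illustration under assumptions rather than the library's implementation: the names `ReplaceSketch`, `replaceAllConstant`, `replaceAllDynamic`, and `replaceFirstFrom` are hypothetical, the replacement is treated as literal text (no `$1` group references), and the function-based variant needs Java 9 or later.

```java
import java.util.function.Function;
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of constant and dynamic replacement; not the Ballerina API.
final class ReplaceSketch {

    // Replaces every match with a constant string, taken literally (assumption).
    static String replaceAllConstant(String original, String regex, String replacement) {
        return Pattern.compile(regex).matcher(original)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    // Replaces every match with the string returned by the replacer function.
    static String replaceAllDynamic(String original, String regex,
                                    Function<MatchResult, String> replacer) {
        return Pattern.compile(regex).matcher(original)
                .replaceAll(m -> Matcher.quoteReplacement(replacer.apply(m)));
    }

    // Replaces only the first match found at or after startIndex.
    static String replaceFirstFrom(String original, String regex,
                                   String replacement, int startIndex) {
        Matcher m = Pattern.compile(regex).matcher(original);
        if (!m.find(startIndex)) {
            return original; // nothing to replace
        }
        return original.substring(0, m.start()) + replacement + original.substring(m.end());
    }

    public static void main(String[] args) {
        System.out.println(replaceAllConstant("10 apples, 20 pears", "\\d+", "N"));
        System.out.println(replaceAllDynamic("a1b22", "\\d+", m -> "<" + m.group() + ">"));
        System.out.println(replaceFirstFrom("one two one", "one", "1", 1));
    }
}
```

Accepting either a string or a function keeps the simple case simple while still allowing the replacement to depend on the matched text.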

2.3. Split

This splits a string into an array of substrings, using the provided regex as the delimiter.
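Splitting on a regex delimiter can be sketched with Java's `Pattern.split`, whose pattern syntax the library supports. The class and method names are hypothetical. Note that Java drops trailing empty strings from the result; whether the Ballerina library does the same is not stated here, so treat that detail as a property of this sketch only.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical illustration of regex-based splitting; not the Ballerina API.
final class SplitSketch {

    // Splits the input around every match of the delimiter regex.
    static String[] split(String input, String delimiterRegex) {
        return Pattern.compile(delimiterRegex).split(input);
    }

    public static void main(String[] args) {
        // A delimiter of a comma with optional surrounding whitespace.
        System.out.println(Arrays.toString(split("a, b ,c", "\\s*,\\s*")));
        // A plain literal delimiter also works.
        System.out.println(Arrays.toString(split("2022-02-17", "-")));
    }
}
```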

2.4. Search

The search APIs extract one or more substrings of a string that match the provided regex. They provide details of each match, such as the substring value, start index, end index, and matched regex groups.

The following records are used to hold the results of a match against a regular expression.

The Groups object provides access to the regex groups captured by a match.

The following APIs are provided by the regex module to extract substrings:

  • To get all substrings in the given string that match the regex.
  • To get the first substring, searching from the start index, in the given string that matches the regex.
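The match details listed in this section (substring value, start index, end index, and groups) map naturally onto Java's `Matcher`. The sketch below is an assumption-laden illustration, not the library's record or object definitions: `SearchSketch`, `MatchInfo`, `searchAll`, and `searchFirst` are hypothetical names, the end index is exclusive as in Java, and group 0 holds the whole match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of search results; not the Ballerina API.
final class SearchSketch {

    // Hypothetical holder for the details of one match.
    static final class MatchInfo {
        final String matched;
        final int startIndex;
        final int endIndex;        // exclusive, following Java's Matcher.end()
        final List<String> groups; // index 0 is the whole match

        MatchInfo(String matched, int startIndex, int endIndex, List<String> groups) {
            this.matched = matched;
            this.startIndex = startIndex;
            this.endIndex = endIndex;
            this.groups = groups;
        }
    }

    // Copies the current match of the matcher into a MatchInfo.
    private static MatchInfo toInfo(Matcher m) {
        List<String> groups = new ArrayList<>();
        for (int i = 0; i <= m.groupCount(); i++) {
            groups.add(m.group(i));
        }
        return new MatchInfo(m.group(), m.start(), m.end(), groups);
    }

    // Returns every non-overlapping match in the input, in order.
    static List<MatchInfo> searchAll(String input, String regex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<MatchInfo> results = new ArrayList<>();
        while (m.find()) {
            results.add(toInfo(m));
        }
        return results;
    }

    // Returns the first match found at or after startIndex, if any.
    static Optional<MatchInfo> searchFirst(String input, String regex, int startIndex) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find(startIndex) ? Optional.of(toInfo(m)) : Optional.empty();
    }

    public static void main(String[] args) {
        for (MatchInfo info : searchAll("k1=v1;k2=v2", "(\\w+)=(\\w+)")) {
            System.out.println(info.matched + " [" + info.startIndex + ", "
                    + info.endIndex + ") groups=" + info.groups);
        }
    }
}
```

Returning an empty `Optional` for "no match" is a design choice of this sketch; it simply makes the absence of a match explicit to the caller.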