Language-integrated queries specify the logic in SQL-like syntax to process data and events. They are easy to write and understand due to the simplicity of the syntax. See how Ballerina provides first-class support for writing queries that process data.
As of now, language-integrated queries are supported for iterator implementations such as an array, map, stream, and table. There are two kinds of integrated queries that can be written in Ballerina:
- Query expression: allows generating a list, table, string, or XML
- Query action: executes a set of statements for each element of the iterator
Query expressions allow you to generate a list, stream, table, string, or XML. The sections below look at the clauses you can use in a query expression.
Query expressions contain a set of clauses similar to SQL to process the data. They must start with the
from clause and can perform various operations such as filter, join, sort, limit, and projection. There are various SQL-like clauses below to perform these operations.
from clause is used to define the input iterator source that is considered for processing the data. Similar to the
for each statement, you can specify any iterator as the expression. Similar to a list and stream, you can use all kinds of iterators as the input in the
where clause allows you to filter by condition. You can define any conditional expression, which returns a boolean. A
where clause contains logical operators, equality, and range checks.
let clause allows you to define variables that can be used only within the query expression scope. These variables cannot be accessed out of the query expression. You can define one or more variables in the
let clause and use them within query expressions.
join clause performs an inner or left outer equijoin. In the
join clause, there are two input iterators. During the joining process, each value of an iterator is matched against all the values in the other iterator based on the given condition, and the output values are generated for all the matching event pairs. Here, you can only perform equality checks as the joining condition.
order by clause
order by clause allows ordering the result in the ascending and/or descending order based on the specified attributes. Ordering will be done in an ascending manner by default. You can use the
descending keyword to order in a descending manner. Here, attributes that are considered for the
order by operations are order-keys, which should be an ordered type. You can define more than one order key in the
order by clause and it’s possible to have more than one
order by clause.
limit clause limits the number of frames/values emitted by a query pipeline. You should define an integer value to specify the number of output values.
select clause is a mandatory clause in query expressions that is used for projection. You can use this clause to create the values required to generate iterators such as list, table, XML, string, and stream.
on conflict clause
on conflict clause is only allowed for a query expression that constructs a table with a key sequence. The expression is evaluated when the
select clause emits a value that conflicts with a previous value, in the sense, that both values have the same key value in the table. The
on conflict clause gets executed when the
select clause emits a row that has the same key as a row that it emitted earlier. It gives an
onConflictError error if there is a key conflict.
Query actions are executed in the same way as the clauses in the query expression. However, it doesn't generate an output such as a list. Rather, it executes a set of statements defined by you. The block inside the
do clause is executed in each iteration.
Query action example
Write integrated queries
The example below provides in-depth knowledge on how to utilize the capabilities of the Ballerina query expressions. This example explains the use case of finding popular books in a store.
In the above code, the necessary custom types and variables are created to perform the data manipulation.
Here, the author details are represented as JSON elements. In this method, the respective author JSON elements are converted into tabular data using Ballerina query expressions for further processing.
The above method iterates through a
categories XML array and constructs a table with the category ID and name. Query expression clauses such as
on conflict are used in it.
The above method returns the book category names according to the given category IDs. As written in the query expression, a
join clause iterates through the
categories array to find the respective category name according to the ID.
Here, the author names are identified according to the author IDs. As similar to the previous method, query joins are utilized to create the
Now, the author details and book category details exist as individual table values. The book details exist as an array. Further, this array has the necessary keys to identify the respective authors and book categories. Hence, there is a requirement to generate a table, which contains the relevant book details, author names, and category names. As shown in the above example, Ballerina query expressions can be utilized to create such table values.
Now, you have the book details as a table value and there is a requirement to identify the popular books based on the number of sales copies. As shown in the above example, you have to combine the
sales array and the
books table to identify the popular books. In this example, Ballerina query clauses such as
order by, and
limit are utilized to cater to the requirement.
main method is responsible for identifying the popular books according to the number of copies sold. Hence, the respective
getPopularBooks method gets called with the required parameters.