This material is sponsored and created with the sponsorship and support of IBSurgeon https://www.ib-aid.com, vendor of HQbird (advanced distribution of Firebird) and supplier of performance optimization, migration and technical support services for Firebird.
The material is licensed under the Public Documentation License https://www.firebirdsql.org/file/documentation/html/en/licenses/pdl/public-documentation-license.html
Preface
In the first part we considered improvements in the query optimizer: https://ib-aid.com/en/articles/detailed-new-features-of-firebird-5-part-1-improvements-in-optimizer
Now let’s see what’s new in the SQL language of Firebird 5.0.
New features in SQL language
Support for the WHEN NOT MATCHED BY SOURCE clause in the MERGE statement
The MERGE statement merges records from a source data set into a target table (or updatable view). During execution of a MERGE statement, the source records are read, and then INSERT, UPDATE or DELETE is performed on the target table depending on the conditions.
Syntax of MERGE:

MERGE INTO target [[AS] target_alias]
USING <source> [[AS] source_alias]
ON <join condition>
<merge when> [<merge when> ...]
[<plan clause>]
[<order by clause>]
[<returning clause>]

<source> ::= tablename | (<select_stmt>)

<merge when> ::=
    <merge when matched>
  | <merge when not matched by target>
  | <merge when not matched by source>

<merge when matched> ::=
  WHEN MATCHED [ AND <condition> ] THEN
  { UPDATE SET <assignment_list> | DELETE }

<merge when not matched by target> ::=
  WHEN NOT MATCHED [ BY TARGET ] [ AND <condition> ] THEN
  INSERT [ <left paren> <column_list> <right paren> ]
  VALUES <left paren> <value_list> <right paren>

<merge when not matched by source> ::=
  WHEN NOT MATCHED BY SOURCE [ AND <condition> ] THEN
  { UPDATE SET <assignment_list> | DELETE }
Firebird 5.0 introduced the conditional branch <merge when not matched by source>, which allows you to update or delete records of the target table if they are not present in the data source. Now the MERGE statement is a truly universal tool for any modification of the target table based on a given data set.
The data source can be a table, view, stored procedure, or derived table. When a MERGE statement is executed, a join is made between the source (USING) and the target table.
The join type depends on the presence of WHEN NOT MATCHED clauses:

- <merge when not matched by target> and <merge when not matched by source> — FULL JOIN
- only <merge when not matched by source> — RIGHT JOIN
- only <merge when not matched by target> — LEFT JOIN
- only <merge when matched> — INNER JOIN
The action on the target table, as well as the condition under which it is performed, is described in the WHEN clause.
Several WHEN MATCHED, WHEN NOT MATCHED [BY TARGET] and WHEN NOT MATCHED BY SOURCE clauses are allowed.
If the condition in a WHEN clause is not met, Firebird skips it and moves on to the next clause.
This continues until the condition of one of the WHEN clauses is met; the action associated with that WHEN clause is then performed, and processing moves on to the next record of the join result between the source (USING) and the target table. Only one action is performed for each record of the join result.
WHEN MATCHED
Specifies that all target rows that match the rows returned by the <source> ON <join condition> expression and satisfy additional search conditions are updated (UPDATE clause) or deleted (DELETE clause), as described in the <merge when matched> clause.
Multiple WHEN MATCHED clauses are allowed. If more than one WHEN MATCHED clause is specified, all of them except the last one must have an additional search condition.
A MERGE statement cannot update the same row more than once, nor can it both update and delete the same row.
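As a sketch (using hypothetical price and tmp_price tables), two WHEN MATCHED branches can coexist as long as every branch except the last one carries its own condition:

```sql
MERGE INTO price
USING tmp_price
ON price.good_id = tmp_price.good_id
-- every WHEN MATCHED branch except the last one needs an extra condition
WHEN MATCHED AND tmp_price.cost = 0 THEN
  DELETE
-- the last branch may be unconditional
WHEN MATCHED THEN
  UPDATE SET cost = tmp_price.cost;
```

For each matched row, only the first branch whose condition is met fires: a row whose new cost is 0 is deleted, and any other matched row is updated.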
WHEN NOT MATCHED [BY TARGET]
Specifies that all source rows that do not match the target rows via the <source> ON <join condition> expression and satisfy additional search conditions are inserted into the target table (INSERT clause) according to the <merge when not matched by target> clause.
Multiple WHEN NOT MATCHED [BY TARGET] clauses are allowed. If more than one WHEN NOT MATCHED [BY TARGET] clause is specified, all of them except the last one must have an additional search condition.
WHEN NOT MATCHED BY SOURCE
Specifies that all target rows that do not match the rows returned by the <source> ON <join condition> expression and satisfy additional search conditions are updated (UPDATE clause) or deleted (DELETE clause) according to the <merge when not matched by source> clause. The WHEN NOT MATCHED BY SOURCE clause became available in Firebird 5.0.
Multiple WHEN NOT MATCHED BY SOURCE clauses are allowed. If more than one WHEN NOT MATCHED BY SOURCE clause is specified, all of them except the last one must have an additional search condition.
Note: In the SET list of an UPDATE clause, it makes no sense to use expressions that refer to <source>, since no <source> records match the target records.
Example of using MERGE with the WHEN NOT MATCHED BY SOURCE clause
Let’s say you have a price list in the temporary table tmp_price, and you need to update the current price list so that:

- if a product is not in the current price list, it is added;
- if a product is in the current price list, its price is updated;
- if a product is in the current price list but not in the new one, its price line is deleted.

All these actions can be done in a single SQL statement:
MERGE INTO price
USING tmp_price
ON price.good_id = tmp_price.good_id
-- add the product if it was not there
WHEN NOT MATCHED THEN
  INSERT (good_id, name, cost)
  VALUES (tmp_price.good_id, tmp_price.name, tmp_price.cost)
-- update the price if the product is in the new price list and the price differs
WHEN MATCHED AND price.cost <> tmp_price.cost THEN
  UPDATE SET cost = tmp_price.cost
-- if the product is not in the new price list, remove it from the current price list
WHEN NOT MATCHED BY SOURCE THEN
  DELETE;
Note: In this example, instead of the temporary table tmp_price there could be an arbitrarily complex SELECT query or a stored procedure. Please note that since both the WHEN NOT MATCHED [BY TARGET] and WHEN NOT MATCHED BY SOURCE clauses are present, the join between the target table and the data source is a FULL JOIN. In the current version of Firebird, a FULL JOIN will be very slow if indexes cannot be used on both the left and the right side.
The SKIP LOCKED clause
Firebird 5.0 introduced the SKIP LOCKED clause, which can be used in SELECT .. WITH LOCK, UPDATE, and DELETE statements.
Using this clause causes the engine to skip records locked by other transactions instead of waiting for them or raising update conflict errors.
Using SKIP LOCKED is useful for implementing work queues, in which one or more processes submit work to a table and emit an event, while worker (executor) threads listen for events and read/remove items from the table. Using SKIP LOCKED, multiple workers can receive exclusive jobs from a table without conflicts.
SELECT
  [FIRST ...] [SKIP ...]
  FROM <sometable>
  [WHERE ...]
  [PLAN ...]
  [ORDER BY ...]
  [{ROWS ...} | {OFFSET ...} | {FETCH ...}]
  [FOR UPDATE [OF ...]]
  [WITH LOCK [SKIP LOCKED]]

UPDATE <sometable>
  SET ...
  [WHERE ...]
  [PLAN ...]
  [ORDER BY ...]
  [ROWS ...]
  [SKIP LOCKED]
  [RETURNING ...]

DELETE FROM <sometable>
  [WHERE ...]
  [PLAN ...]
  [ORDER BY ...]
  [ROWS ...]
  [SKIP LOCKED]
  [RETURNING ...]
Note: As with FIRST/SKIP/ROWS/OFFSET/FETCH, record locking (and the "skip locked" check) occurs between the skip (SKIP/ROWS/OFFSET) and the restriction (FIRST/ROWS/FETCH).
Example:
- Create the table and trigger:

create table emails_queue (
  subject varchar(60) not null,
  text blob sub_type text not null
);

set term !;

create trigger emails_queue_ins
after insert on emails_queue
as
begin
  post_event('EMAILS_QUEUE');
end!

set term ;!
- Sending a message by an application:

insert into emails_queue (subject, text)
values ('E-mail subject', 'E-mail text...');
commit;
- Client application:

-- The client application can listen for the EMAILS_QUEUE event
-- and send e-mails using this command:
delete from emails_queue
rows 10
skip locked
returning subject, text;
More than one instance of an application can be running, for example for load balancing.
Note: The use of SKIP LOCKED for organizing queues will be discussed in a separate article.
Support for returning multiple records from statements with the RETURNING clause
Since Firebird 5.0, the client-side modification statements INSERT .. SELECT, UPDATE, DELETE, UPDATE OR INSERT and MERGE with the RETURNING clause return a cursor: they are able to return multiple rows instead of throwing the "multiple rows in singleton select" error as previously.
These queries are now described as isc_info_sql_stmt_select, whereas in previous versions they were described as isc_info_sql_stmt_exec_procedure.
Singleton statements INSERT .. VALUES, and positioned statements UPDATE and DELETE (those containing a WHERE CURRENT OF clause), retain the existing behavior and are described as isc_info_sql_stmt_exec_procedure.
However, all of these statements, when used in PSQL with the RETURNING clause, are still treated as singletons.
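The PSQL singleton behavior can be illustrated with a sketch using a hypothetical dest table (its column types are assumptions):

```sql
-- In DSQL the same UPDATE ... RETURNING would return a cursor;
-- in PSQL, RETURNING ... INTO is a singleton and raises
-- "multiple rows in singleton select" if more than one row is affected
execute block returns (id integer, a integer)
as
begin
  update dest
  set a = a + 1
  where id = 1          -- must match at most one row here
  returning id, a
  into :id, :a;
  suspend;
end
```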
Examples of modifying statements containing RETURNING and returning a cursor:
INSERT INTO dest(name, val)
SELECT desc, num + 1 FROM src WHERE id_parent = 5
RETURNING id, name, val;
UPDATE dest
SET a = a + 1
RETURNING id, a;
DELETE FROM dest
WHERE price < 0.52
RETURNING id;
MERGE INTO PRODUCT_INVENTORY AS TARGET
USING (
SELECT
SL.ID_PRODUCT,
SUM(SL.QUANTITY)
FROM
SALES_ORDER_LINE SL
JOIN SALES_ORDER S ON S.ID = SL.ID_SALES_ORDER
WHERE S.BYDATE = CURRENT_DATE
AND SL.ID_PRODUCT = :ID_PRODUCT
GROUP BY 1
) AS SRC(ID_PRODUCT, QUANTITY)
ON TARGET.ID_PRODUCT = SRC.ID_PRODUCT
WHEN MATCHED AND TARGET.QUANTITY - SRC.QUANTITY <= 0 THEN
DELETE
WHEN MATCHED THEN
UPDATE SET
TARGET.QUANTITY = TARGET.QUANTITY - SRC.QUANTITY,
TARGET.BYDATE = CURRENT_DATE
RETURNING OLD.QUANTITY, NEW.QUANTITY, SRC.QUANTITY;
Partial indices
In Firebird 5.0, when creating an index, it became possible to specify an optional WHERE clause, which specifies a search condition that limits the subset of table records to be indexed. Such indexes are called partial indexes. The search condition must contain one or more table columns.
The partial index definition may include a UNIQUE specification. In this case, each key in the index must be unique. This allows you to ensure uniqueness for a certain subset of table rows.
The definition of a partial index can also include a COMPUTED BY clause, i.e. a partial index can be a computed index.
So the complete syntax for creating an index is as follows:
CREATE [UNIQUE] [ASC[ENDING] | DESC[ENDING]] INDEX indexname
ON tablename
{(<column_list>) | COMPUTED [BY] (<value_expression>)}
[WHERE <search_condition>]

<column_list> ::= col [, col ...]
The optimizer can use a partial index only in the following cases:

- the WHERE clause of the query includes exactly the same logical expression as the one defined for the index;
- the search condition defined for the index contains logical expressions combined with OR, and one of them is explicitly included in the WHERE clause of the query;
- the search condition defined for the index specifies IS NOT NULL, and the WHERE clause of the query includes an expression on the same field that is known to ignore NULL.
If a regular index and a partial index exist for the same set of fields, the optimizer will choose the regular index, even if the WHERE clause includes the same expression as the one defined in the partial index.
The reason for this behavior is that the regular index has better selectivity than the partial index.
There is an exception to this rule: using predicates with poor selectivity on indexed fields, such as <>, IS DISTINCT FROM or IS NOT NULL, provided that the predicate matches the condition of the partial index.
Note: Partial indexes cannot be used to enforce a primary key or a foreign key constraint; that is, a USING INDEX clause cannot specify a partial index definition.
Let’s see when partial indexes are useful.
Let’s say we have a table storing a person’s email address.
CREATE TABLE MAN_EMAILS (
CODE_MAN_EMAIL BIGINT GENERATED BY DEFAULT AS IDENTITY,
CODE_MAN BIGINT NOT NULL,
EMAIL VARCHAR(50) NOT NULL,
DEFAULT_FLAG BOOLEAN DEFAULT FALSE NOT NULL,
CONSTRAINT PK_MAN_EMAILS PRIMARY KEY(CODE_MAN_EMAIL),
CONSTRAINT FK_EMAILS_REF_MAN FOREIGN KEY(CODE_MAN) REFERENCES MAN(CODE_MAN)
);
One person can have many email addresses, but only one can be the default address. A regular unique index or constraint will not work here, since it would limit each person to only two addresses (one with DEFAULT_FLAG equal to FALSE and one with it equal to TRUE).
Here we can use the partial unique index:
CREATE UNIQUE INDEX IDX_UNIQUE_DEFAULT_MAN_EMAIL
ON MAN_EMAILS(CODE_MAN) WHERE DEFAULT_FLAG IS TRUE;
Thus, for one person we allow any number of addresses with DEFAULT_FLAG=FALSE and only one address with DEFAULT_FLAG=TRUE.
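A quick sketch (with hypothetical data, assuming a person with CODE_MAN = 1 exists in MAN) of how this unique partial index behaves:

```sql
-- any number of non-default addresses per person is allowed
INSERT INTO MAN_EMAILS (CODE_MAN, EMAIL)
VALUES (1, 'first@example.com');
INSERT INTO MAN_EMAILS (CODE_MAN, EMAIL)
VALUES (1, 'second@example.com');

-- the first default address for this person is accepted
INSERT INTO MAN_EMAILS (CODE_MAN, EMAIL, DEFAULT_FLAG)
VALUES (1, 'main@example.com', TRUE);

-- a second default address for the same person violates
-- IDX_UNIQUE_DEFAULT_MAN_EMAIL and is rejected
INSERT INTO MAN_EMAILS (CODE_MAN, EMAIL, DEFAULT_FLAG)
VALUES (1, 'other@example.com', TRUE);
```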
Partial indexes can be used simply to make the index more compact.
Suppose you have a horse table HORSE in your database and it has the IS_ANCESTOR field, which is used to indicate whether the horse is the ancestor of a line or family. Obviously, there are hundreds of times fewer ancestors than other horses — see the result of the query below:
SELECT
COUNT(*) FILTER(WHERE IS_ANCESTOR IS TRUE) AS CNT_ANCESTOR,
COUNT(*) FILTER(WHERE IS_ANCESTOR IS FALSE) AS CNT_OTHER
FROM HORSE
         CNT_ANCESTOR             CNT_OTHER
===================== =====================
                 1426                518197
The goal is to quickly obtain a list of ancestors. From the statistics above it is also obvious that for the IS_ANCESTOR IS FALSE option, using an index is practically useless.
Let’s try to create a regular index:
CREATE INDEX IDX_HORSE_ANCESTOR ON HORSE(IS_ANCESTOR);
But in this case such an index would be excessive. Let’s look at its statistics using the gstat tool:
Index IDX_HORSE_ANCESTOR (26)
  Root page: 163419, depth: 2, leaf buckets: 159, nodes: 519623
  Average node length: 4.94, total dup: 519621, max dup: 518196
  Average key length: 2.00, compression ratio: 0.50
  Average prefix length: 1.00, average data length: 0.00
  Clustering factor: 9809, ratio: 0.02
  Fill distribution:
       0 - 19% = 0
      20 - 39% = 1
      40 - 59% = 0
      60 - 79% = 0
      80 - 99% = 158
Instead of a regular index, we can create a partial index (the previous one must be deleted):
CREATE INDEX IDX_HORSE_ANCESTOR ON HORSE(IS_ANCESTOR) WHERE IS_ANCESTOR IS TRUE;
Let’s compare the statistics using gstat tool:
Index IDX_HORSE_ANCESTOR (26)
  Root page: 163417, depth: 1, leaf buckets: 1, nodes: 1426
  Average node length: 4.75, total dup: 1425, max dup: 1425
  Average key length: 2.00, compression ratio: 0.50
  Average prefix length: 1.00, average data length: 0.00
  Clustering factor: 764, ratio: 0.54
  Fill distribution:
       0 - 19% = 0
      20 - 39% = 0
      40 - 59% = 1
      60 - 79% = 0
      80 - 99% = 0
As you can see, the partial index is much more compact: 1426 nodes instead of the 519623 in the regular index. Let’s check that it can be used to obtain ancestors:
SELECT COUNT(*)
FROM HORSE
WHERE IS_ANCESTOR IS TRUE;
Select Expression
    -> Aggregate
        -> Filter
            -> Table "HORSE" Access By ID
                -> Bitmap
                    -> Index "IDX_HORSE_ANCESTOR" Full Scan

                COUNT
=====================
                 1426

Current memory = 556868928
Delta memory = 176
Max memory = 575376064
Elapsed time = 0.007 sec
Buffers = 32768
Reads = 0
Writes = 0
Fetches = 2192

Per table statistics:
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
 Table name                     | Natural | Index   | Insert  | Update  | Delete  | Backout | Purge   | Expunge |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
HORSE                           |         |     1426|         |         |         |         |         |         |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
Please note that if you specify WHERE IS_ANCESTOR or WHERE IS_ANCESTOR = TRUE in the query, the index will not be used. The expression that filters the index must exactly match the expression in the WHERE clause of your query.
Another case when partial indices can be useful is when using them with non-selective predicates.
Suppose we need to get all dead horses for which the date of death is known. A horse is definitely dead if it has a date of death, but it often happens that it is not listed or is simply unknown. Moreover, the number of unknown dates of death is much greater than the known ones. To do this, we will write the following query:
SELECT COUNT(*)
FROM HORSE
WHERE DEATHDATE IS NOT NULL;
We want to get this list as quickly as possible, so we’ll try to create an index on the DEATHDATE field.
CREATE INDEX IDX_HORSE_DEATHDATE
ON HORSE(DEATHDATE);
Now let’s try to run the query above and look at its plan and statistics:
Select Expression
    -> Aggregate
        -> Filter
            -> Table "HORSE" Full Scan

                COUNT
=====================
                16234

Current memory = 2579550800
Delta memory = 176
Max memory = 2596993840
Elapsed time = 0.196 sec
Buffers = 153600
Reads = 0
Writes = 0
Fetches = 555810

Per table statistics:
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
 Table name                     | Natural | Index   | Insert  | Update  | Delete  | Backout | Purge   | Expunge |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
HORSE                           |   519623|         |         |         |         |         |         |         |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
As you can see, the index could not be used. The reason is that the predicates IS NOT NULL, <> and IS DISTINCT FROM have low selectivity.
Currently, Firebird does not maintain histograms of the distribution of index key values, so the distribution is assumed to be uniform. With a uniform distribution, there is no point in using an index for such predicates, so the optimizer does not use it.
Now let’s try to delete the previously created index and create a partial index instead:
DROP INDEX IDX_HORSE_DEATHDATE;
CREATE INDEX IDX_HORSE_DEATHDATE
ON HORSE(DEATHDATE) WHERE DEATHDATE IS NOT NULL;
Now let’s repeat the query above:
Select Expression
    -> Aggregate
        -> Filter
            -> Table "HORSE" Access By ID
                -> Bitmap
                    -> Index "IDX_HORSE_DEATHDATE" Full Scan

                COUNT
=====================
                16234

Current memory = 2579766848
Delta memory = 176
Max memory = 2596993840
Elapsed time = 0.017 sec
Buffers = 153600
Reads = 0
Writes = 0
Fetches = 21525

Per table statistics:
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
 Table name                     | Natural | Index   | Insert  | Update  | Delete  | Backout | Purge   | Expunge |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
HORSE                           |         |    16234|         |         |         |         |         |         |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
As you can see, the optimizer managed to use our index. Even more interestingly, the index will continue to work with other date comparison predicates (but it will not work for IS NULL). See the example below:
SELECT COUNT(*)
FROM HORSE
WHERE DEATHDATE = DATE'01.01.2005';
Select Expression
    -> Aggregate
        -> Filter
            -> Table "HORSE" Access By ID
                -> Bitmap
                    -> Index "IDX_HORSE_DEATHDATE" Range Scan (full match)

                COUNT
=====================
                  190

Current memory = 2579872992
Delta memory = 192
Max memory = 2596993840
Elapsed time = 0.004 sec
Buffers = 153600
Reads = 0
Writes = 0
Fetches = 376

Per table statistics:
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
 Table name                     | Natural | Index   | Insert  | Update  | Delete  | Backout | Purge   | Expunge |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
HORSE                           |         |      190|         |         |         |         |         |         |
--------------------------------+---------+---------+---------+---------+---------+---------+---------+---------+
The optimizer in this case realized that the IS NOT NULL filter condition in the partial index covers any other predicates that do not compare to NULL.
It is important to note that if the partial index specifies the condition FIELD > 1, and the query contains the search condition FIELD > 2, then the partial index will not be used, despite the fact that any number greater than 2 is also greater than 1 (so every row matching the query is present in the index). The optimizer is not smart enough to derive this implication.
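This limitation can be sketched with a hypothetical table T and integer column F:

```sql
CREATE INDEX IDX_T_F ON T(F) WHERE F > 1;

-- the index can be used: the search condition matches the index condition exactly
SELECT * FROM T WHERE F > 1;

-- the index will NOT be used, even though F > 2 implies F > 1,
-- so every qualifying row is present in the index
SELECT * FROM T WHERE F > 2;
```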
Functions UNICODE_CHAR and UNICODE_VAL
Firebird 2.1 introduced a pair of functions: ASCII_CHAR, which returns a character by its code in the ASCII table, and ASCII_VAL, which returns the ASCII code of a character. These functions only apply to single-byte encodings; there was nothing similar for UTF-8. Firebird 5.0 adds two more functions that work with multibyte encodings:
UNICODE_CHAR (number)
UNICODE_VAL (string)
The UNICODE_CHAR function returns the UNICODE character for the given code point.
The UNICODE_VAL function returns the UTF-32 code point for the first character in a string. For an empty string, 0 is returned.
SELECT
UNICODE_VAL(UNICODE_CHAR(0x1F601)) AS CP_VAL,
UNICODE_CHAR(0x1F601) AS CH
FROM RDB$DATABASE
Query expressions in parentheses
In 5.0, the DML syntax was expanded to allow the use of a query expression within parentheses (SELECT, including order by
, offset
and fetch
clauses, but without with
clause), where previously only the query specification was allowed (SELECT without clauses with
, order by
, offset
and fetch
).
This allows you to write clearer queries, especially in UNION statements, and provides greater compatibility with statements generated by some ORMs.
Note: Using query expressions in parentheses is not free from the Firebird engine’s point of view, since they require an additional query context compared to a plain query specification. The maximum number of query contexts in a statement is limited to 255.
Example:
(
select emp_no, salary, 'lowest' as type
from employee
order by salary asc
fetch first row only
)
union all
(
select emp_no, salary, 'highest' as type
from employee
order by salary desc
fetch first row only
);
Improved Literals
Full syntax of string literals
The character string literal syntax has been changed to support full standard SQL syntax. This means that the literal can be "interrupted" by spaces or comments. This can be used, for example, to split a long literal across multiple lines or to provide inline comments.
String literal syntax according to ISO/IEC 9075-2:2016 SQL - Part 2: Foundation
<character string literal> ::=
  [ <introducer> <character set specification> ]
  <quote> [ <character representation>... ] <quote>
  [ { <separator> <quote> [ <character representation>... ] <quote> }... ]

<separator> ::= { <comment> | <white space> }...
Example:
-- spaces between literals
select 'ab'
'cd'
from RDB$DATABASE;
-- output: 'abcd'
-- comment and spaces between literals
select 'ab' /* comment */ 'cd'
from RDB$DATABASE;
-- output: 'abcd'
Complete syntax for binary literals
The syntax for binary string literals has been changed to support full standard SQL syntax. This means that the literal can contain spaces to separate hexadecimal characters and can be "interrupted" by spaces or comments. This can be used, for example, to make a hexadecimal string more readable by grouping characters, or to split a long literal over multiple lines, or to provide inline comments.
<binary string literal> ::=
  {X|x} <quote> [ <space>... ]
  [ { <hexit> [ <space>... ] <hexit> [ <space>... ] }... ]
  <quote>
  [ { <separator> <quote> [ <space>... ]
      [ { <hexit> [ <space>... ] <hexit> [ <space>... ] }... ]
      <quote> }... ]
Examples:
-- Grouping by bytes (spaces within a literal)
select _win1252 x'42 49 4e 41 52 59'
from RDB$DATABASE;
-- output: BINARY
-- spaces between literals
select _win1252 x'42494e'
'415259'
from RDB$DATABASE;
-- output: BINARY
Improved IN predicate
Prior to Firebird 5.0, the IN predicate with a list of constants was limited to 1500 elements because it was processed by recursively converting the original expression into an equivalent form.
Thus,

F IN (V1, V2, ... VN)

was transformed into

(F = V1) OR (F = V2) OR ... OR (F = VN)
Since Firebird 5.0, processing of IN <list> predicates is linear. The 1500 item limit has been increased to 65535 items. In addition, queries using the IN predicate with a list of constants are processed much faster. This was discussed in detail in the first part.
Package RDB$BLOB_UTIL
Operations with BLOBs inside PSQL were not fast, because any modification of a BLOB always creates a new temporary BLOB, which leads to additional memory consumption and, in some cases, to a larger database file for storing temporary BLOBs. In Firebird 4.0.2, a built-in function, BLOB_APPEND, was added to solve BLOB concatenation problems. Firebird 5.0 adds a built-in package RDB$BLOB_UTIL with procedures and functions for more efficient BLOB manipulation.
We will show several practical examples of how to use the functions from the RDB$BLOB_UTIL package. The full description can be found in the Release Notes and in the Firebird 5.0 SQL Language Reference.
Using the function RDB$BLOB_UTIL.NEW_BLOB
The RDB$BLOB_UTIL.NEW_BLOB function creates a new BLOB SUB_TYPE BINARY. It returns a BLOB suitable for appending data, similar to BLOB_APPEND.
The difference from BLOB_APPEND is that you can set the SEGMENTED and TEMP_STORAGE parameters.
The BLOB_APPEND function always creates BLOBs in temporary storage, which is not always the best approach if the created BLOB is going to be stored in a permanent table, because that would require a copy operation.
The BLOB returned by RDB$BLOB_UTIL.NEW_BLOB can be used with BLOB_APPEND to append data, even if TEMP_STORAGE = FALSE.
Input parameters:

Parameter | Type | Description
---|---|---
SEGMENTED | BOOLEAN | Type of BLOB. If TRUE, a segmented BLOB is created; if FALSE, a stream BLOB.
TEMP_STORAGE | BOOLEAN | Storage in which the BLOB is created. If TRUE, temporary storage; if FALSE, permanent storage (for writing to a regular table).

Return type: BLOB SUB_TYPE BINARY.
Example:
execute block
as
  declare b blob sub_type text;
begin
  -- create a stream, non-temporary BLOB, since it will then be stored in a table
b = rdb$blob_util.new_blob(false, false);
b = blob_append(b, 'abcde');
b = blob_append(b, 'fghikj');
update t
set some_field = :b
where id = 1;
end
Reading BLOBs in chunks
Previously, when you needed to read part of a BLOB, you used the SUBSTRING function, but this function has one significant drawback: it always returns a new temporary BLOB. Since Firebird 5.0 you can use the RDB$BLOB_UTIL.READ_DATA function for this purpose.
Input parameters:

Parameter | Type | Description
---|---|---
HANDLE | INTEGER | Handle of an opened BLOB.
LENGTH | INTEGER | Number of bytes to read.

Return type: VARBINARY(32765).
The RDB$BLOB_UTIL.READ_DATA function is used to read chunks of data from a BLOB handle opened with RDB$BLOB_UTIL.OPEN_BLOB. When the BLOB has been read completely and there is no more data, it returns NULL.

If the LENGTH parameter value is a positive number, a VARBINARY of maximum length LENGTH is returned.

If NULL is passed to LENGTH, a BLOB segment with a maximum length of 32765 bytes is returned.

When you are done with a BLOB handle, you must close it using the RDB$BLOB_UTIL.CLOSE_HANDLE procedure.
execute block returns (s varchar(10))
as
declare b blob = '1234567';
declare bhandle integer;
begin
-- opens a BLOB for reading and returns its handle.
bhandle = rdb$blob_util.open_blob(b);
-- Getting the blob in parts
s = rdb$blob_util.read_data(bhandle, 3);
suspend;
s = rdb$blob_util.read_data(bhandle, 3);
suspend;
s = rdb$blob_util.read_data(bhandle, 3);
suspend;
-- When there is no more data, NULL is returned.
s = rdb$blob_util.read_data(bhandle, 3);
suspend;
-- Close the BLOB handle.
execute procedure rdb$blob_util.close_handle(bhandle);
end
By passing NULL as the LENGTH parameter, you can read a BLOB segment by segment, as long as its segments do not exceed 32765 bytes.
Let’s write a procedure that returns a BLOB segment by segment:
CREATE OR ALTER PROCEDURE SP_GET_BLOB_SEGEMENTS (
TXT BLOB SUB_TYPE TEXT CHARACTER SET NONE
)
RETURNS (
SEG VARCHAR(32765) CHARACTER SET NONE
)
AS
DECLARE H INTEGER;
BEGIN
H = RDB$BLOB_UTIL.OPEN_BLOB(TXT);
SEG = RDB$BLOB_UTIL.READ_DATA(H, NULL);
WHILE (SEG IS NOT NULL) DO
BEGIN
SUSPEND;
SEG = RDB$BLOB_UTIL.READ_DATA(H, NULL);
END
EXECUTE PROCEDURE RDB$BLOB_UTIL.CLOSE_HANDLE(H);
END
It can be used, for example, like this:
WITH
T AS (
SELECT LIST(CODE_HORSE) AS B
FROM HORSE
)
SELECT
S.SEG
FROM T
LEFT JOIN SP_GET_BLOB_SEGEMENTS(T.B) S ON TRUE
Summary
The new SQL features in Firebird 5.0 are focused on increased convenience for developers (MERGE, SKIP LOCKED), support of large databases (partial indices) and increased performance when working with very large tables (RDB$BLOB_UTIL).