Biblioteca

Detailed New Features Of Firebird 5, Part 6: Profiler

1. SQL and PSQL Profiling

One of the tasks of a database developer or administrator is to determine the causes of "slowdowns" in the information system.

Starting from Firebird 2.5, a powerful tracing tool has been added to their arsenal. Tracing is an indispensable tool for finding application bottlenecks, evaluating resources consumed during query execution, and determining the frequency of certain actions. Tracing shows statistics in the most detailed form (unlike the statistics available in ISQL, for example). The statistics do not take into account the costs of query preparation and data transmission over the network, which makes it "cleaner" than the data shown by ISQL. At the same time, tracing has a very insignificant impact on performance. Even with intensive logging, it usually results in no more than a 2-3% drop in the speed of executed queries.

After slow queries are "caught" by tracing, you can start optimizing them. However, such queries can be quite complex, and sometimes even call stored procedures, so a profiling tool is needed to help identify bottlenecks in the query itself or in the called PSQL module. Starting from Firebird 5.0, such a tool has appeared.

The profiler allows users to measure the performance costs of SQL and PSQL code. This is implemented using a system package in the engine that transmits data to the profiler plugin.

In this document, the engine and plugin are considered as a whole. Additionally, it is assumed that the default profiler plugin (Default_Profiler) is used.

The RDB$PROFILER package can profile the execution of PSQL code, collecting statistics on how many times each line was executed, as well as its minimum, maximum, and total execution time (accurate to nanoseconds), as well as statistics on opening and fetching records from implicit and explicit SQL cursors. In addition, you can get statistics on SQL cursors in terms of data sources (access methods) of the expanded query plan.

Note

Although the execution time is measured with nanosecond accuracy, this result should not be trusted. The process of measuring execution time has certain overhead costs, which the profiler tries to compensate for. Accordingly, the measured time cannot be accurate, but it allows for a comparative analysis of the costs of executing individual sections of PSQL code among themselves, and identifying bottlenecks.

To collect profiling data, the user must first start a profiling session using the RDB$PROFILER.START_SESSION function. This function returns a profiling session identifier, which is later saved in the profiler snapshot tables. Later, you can execute queries to these tables for user analysis. A profiler session can be local (same connection) or remote (another connection).

Remote profiling simply redirects session control commands to the remote connection. Thus, it is possible for a client to profile multiple connections simultaneously. It is also possible that a locally or remotely started profiling session contains commands issued by another connection.

Remotely executed session control commands require the target connection to be in a waiting state, i.e., not executing other queries. When the target connection is not in waiting mode, the call is blocked waiting for this state.

If the remote connection comes from another user, the calling user must have the PROFILE_ANY_ATTACHMENT system privilege.

After starting a session, statistics for PSQL and SQL statements are collected in memory. The profiling session only collects data about statements executed in the connection associated with the session. Data is aggregated and saved for each query (i.e., executed statement). When querying snapshot tables, the user can perform additional aggregation for each statement or use auxiliary views that do this automatically.

The session can be paused to temporarily disable statistics collection. Later, it can be resumed to return statistics collection in the same session.

To analyze the collected data, the user must flush the data to snapshot tables, which can be done by finishing or pausing the session (with the FLUSH parameter set to TRUE), or by calling RDB$PROFILER.FLUSH. Data is flushed using an autonomous transaction (a transaction starts and ends with the specific purpose of updating profiler data).

All procedures and functions of the RDB$PROFILER package contain an ATTACHMENT_ID parameter, which should be specified if you want to manage a remote profiling session. If this parameter is NULL or not specified, the procedures and functions manage the local profiling session.

1.1. Starting a Profiling Session

To start a profiling session, you need to call the RDB$PROFILER.START_SESSION function, which returns the profiling session identifier.

This function has the following parameters:

  • DESCRIPTION of type VARCHAR(255) CHARACTER SET UTF8, default is NULL;

  • FLUSH_INTERVAL of type INTEGER, default is NULL;

  • ATTACHMENT_ID of type BIGINT, default is NULL (which means CURRENT_CONNECTION);

  • PLUGIN_NAME of type VARCHAR(255) CHARACTER SET UTF8, default is NULL;

  • PLUGIN_OPTIONS of type VARCHAR(255) CHARACTER SET UTF8, default is NULL.

You can pass an arbitrary text description of the session to the DESCRIPTION parameter.

If the FLUSH_INTERVAL parameter is not NULL, it sets the interval for automatic flushing of statistics to snapshot tables, as when manually calling the RDB$PROFILER.SET_FLUSH_INTERVAL procedure. If FLUSH_INTERVAL is greater than zero, automatic statistics flushing is enabled; otherwise, it is disabled. The FLUSH_INTERVAL parameter is measured in seconds.

If ATTACHMENT_ID is not NULL, the profiling session is started for a remote connection; otherwise, the session starts for the current connection.

The PLUGIN_NAME parameter is used to specify which profiling plugin is used for the profiling session. If it is NULL, the profiling plugin specified in the DefaultProfilerPlugin configuration parameter is used.

Each profiling plugin may have its own options, which can be passed to the PLUGIN_OPTIONS parameter. For the Default_Profiler plugin included in the standard Firebird 5.0 distribution, the following values are allowed: NULL or 'DETAILED_REQUESTS'.

When DETAILED_REQUESTS is used, the PLG$PROF_REQUESTS table will store detailed query data, i.e., one record for each SQL statement call. This can result in the creation of a large number of records, which will lead to slow operation of RDB$PROFILER.FLUSH.

When DETAILED_REQUESTS is not used (default), the PLG$PROF_REQUESTS table saves an aggregated record for each SQL statement, using REQUEST_ID = 0.

Note

Here, an SQL statement refers to a prepared SQL query stored in the prepared query cache. Queries are considered the same if they match exactly, character by character. So if you have semantically identical queries that differ in comments, they are different queries for the prepared query cache. Prepared queries can be executed multiple times with different sets of input parameters.

1.2. Pausing a Profiling Session

The RDB$PROFILER.PAUSE_SESSION procedure pauses the current profiler session (with the given ATTACHMENT_ID). For a paused session, execution statistics for subsequent SQL statements are not collected.

If the FLUSH parameter is TRUE, the snapshot tables are updated with profiling data up to the current moment; otherwise, the data remains only in memory for subsequent updating.

Calling RDB$PROFILER.PAUSE_SESSION(TRUE) has the same effect as calling RDB$PROFILER.PAUSE_SESSION(FALSE) followed by a call to RDB$PROFILER.FLUSH (using the same ATTACHMENT_ID).

Input parameters:

  • FLUSH of type BOOLEAN NOT NULL, default is FALSE;

  • ATTACHMENT_ID of type BIGINT, default is NULL (which means CURRENT_CONNECTION).

1.3. Resuming a Profiling Session

The RDB$PROFILER.RESUME_SESSION procedure resumes the current profiler session (with the given ATTACHMENT_ID) if it was paused. After resuming the session, execution statistics for subsequent SQL statements are collected again.

Input parameters:

  • ATTACHMENT_ID of type BIGINT, default is NULL (which means CURRENT_CONNECTION).

1.4. Finishing a Profiling Session

The RDB$PROFILER.FINISH_SESSION procedure finishes the current profiler session (with the given ATTACHMENT_ID).

If the FLUSH parameter is TRUE, the snapshot tables are updated with data from the finished session (and old finished sessions not yet present in the snapshot); otherwise, the data remains only in memory for subsequent updating.

Calling RDB$PROFILER.FINISH_SESSION(TRUE) has the same effect as calling RDB$PROFILER.FINISH_SESSION(FALSE) followed by a call to RDB$PROFILER.FLUSH (using the same ATTACHMENT_ID).

Input parameters:

  • FLUSH of type BOOLEAN NOT NULL, default is TRUE;

  • ATTACHMENT_ID of type BIGINT, default is NULL (which means CURRENT_CONNECTION).

1.5. Canceling a Profiling Session

The RDB$PROFILER.PAUSE_SESSION procedure pauses the current profiler session (with the given ATTACHMENT_ID). For a paused session, execution statistics for subsequent SQL statements are not collected.

If the FLUSH parameter is TRUE, the snapshot tables are updated with profiling data up to the current moment; otherwise, the data remains only in memory for subsequent updating.

Calling RDB$PROFILER.PAUSE_SESSION(TRUE) has the same effect as calling RDB$PROFILER.PAUSE_SESSION(FALSE) followed by a call to RDB$PROFILER.FLUSH (using the same ATTACHMENT_ID).

Input parameters:

  • FLUSH of type BOOLEAN NOT NULL, default is FALSE;

  • ATTACHMENT_ID of type BIGINT, default is NULL (which means CURRENT_CONNECTION).

1.6. Resuming a Profiling Session

The RDB$PROFILER.RESUME_SESSION procedure resumes the current profiler session (with the given ATTACHMENT_ID) if it was paused. After resuming the session, execution statistics for subsequent SQL statements are collected again.

Input parameters:

  • ATTACHMENT_ID of type BIGINT, default is NULL (which means CURRENT_CONNECTION).

1.7. Finishing a Profiling Session

The RDB$PROFILER.FINISH_SESSION procedure finishes the current profiler session (with the given ATTACHMENT_ID).

If the FLUSH parameter is TRUE, the snapshot tables are updated with data from the finished session (and old finished sessions not yet present in the snapshot); otherwise, the data remains only in memory for subsequent updating.

Calling RDB$PROFILER.FINISH_SESSION(TRUE) has the same effect as calling RDB$PROFILER.FINISH_SESSION(FALSE) followed by a call to RDB$PROFILER.FLUSH (using the same ATTACHMENT_ID).

Input parameters:

  • FLUSH of type BOOLEAN NOT NULL, default is TRUE;

  • ATTACHMENT_ID of type BIGINT, default is NULL (which means CURRENT_CONNECTION).

1.8. Canceling a Profiling Session

The RDB$PROFILER.CANCEL_SESSION procedure cancels the current profiling session (with the given ATTACHMENT_ID).

All session data present in the profiler plugin’s memory is destroyed and not flushed to the snapshot tables.

Already flushed data is not automatically deleted.

Input parameters:

  • ATTACHMENT_ID of type BIGINT, default is NULL (which means CURRENT_CONNECTION).

1.9. Discarding Profiling Sessions

The RDB$PROFILER.DISCARD procedure removes all sessions (with the given ATTACHMENT_ID) from memory without flushing them to the snapshot tables.

If there is an active profiling session, it is canceled.

Input parameters:

  • ATTACHMENT_ID of type BIGINT, default is NULL (which means CURRENT_CONNECTION).

1.10. Flushing Profiling Session Statistics to Snapshot Tables

The RDB$PROFILER.FLUSH procedure updates the snapshot tables with data from profiling sessions (with the given ATTACHMENT_ID).

After flushing the statistics, the data is saved in the tables PLG$PROF_SESSIONS, PLG$PROF_STATEMENTS, PLG$PROF_RECORD_SOURCES, PLG$PROF_REQUESTS, PLG$PROF_PSQL_STATS, and PLG$PROF_RECORD_SOURCE_STATS and can be read and analyzed by the user.

Data is updated using an autonomous transaction, so if the procedure is called in a snapshot transaction, the data will not be available for immediate reading in the same transaction.

After flushing the statistics, completed profiling sessions are removed from memory.

Input parameters:

  • ATTACHMENT_ID of type BIGINT, default is NULL (which means CURRENT_CONNECTION).

1.11. Setting the Statistics Flush Interval

The RDB$PROFILER.SET_FLUSH_INTERVAL procedure enables periodic automatic flushing of statistics (when FLUSH_INTERVAL is greater than 0) or disables it (when FLUSH_INTERVAL is 0).

The FLUSH_INTERVAL parameter is interpreted as the number of seconds.

Input parameters:

  • FLUSH_INTERVAL of type INTEGER NOT NULL;

  • ATTACHMENT_ID of type BIGINT, default is NULL (which means CURRENT_CONNECTION).

1.12. Snapshot Tables

Snapshot tables (as well as views and sequences) are created automatically when the profiler is first used. They belong to the database owner with read/write permissions for PUBLIC.

When a profiling session is deleted, the associated data in other profiler snapshot tables is automatically deleted using foreign keys with the DELETE CASCADE option.

Below is a list of tables that store profiling data.

1.12.1. Table PLG$PROF_SESSIONS

The PLG$PROF_SESSIONS table contains information about profiling sessions.

Table 1. Description of columns in the PLG$PROF_SESSIONS table
Column Name Data Type Description

PROFILE_ID

BIGINT

Profiling session identifier.

ATTACHMENT_ID

BIGINT

Connection identifier for which the profiling session was started.

USER_NAME

CHAR(63)

Name of the user who started the profiling session.

DESCRIPTION

VARCHAR(255)

Description passed in the DESCRIPTION parameter when calling RDB$PROFILER.START_SESSION.

START_TIMESTAMP

TIMESTAMP WITH TIME ZONE

Start time of the profiling session.

FINISH_TIMESTAMP

TIMESTAMP WITH TIME ZONE

End time of the profiling session (NULL if the session is not finished).

Primary key: PROFILE_ID.

1.12.2. Table PLG$PROF_STATEMENTS

The PLG$PROF_STATEMENTS table contains information about statements.

Table 2. Description of columns in the PLG$PROF_STATEMENTS table
Column Name Data Type Description

PROFILE_ID

BIGINT

Profiling session identifier.

STATEMENT_ID

BIGINT

Statement identifier.

PARENT_STATEMENT_ID

BIGINT

Parent statement identifier. Relates to subprograms.

STATEMENT_TYPE

VARCHAR(20)

Statement type: BLOCK, FUNCTION, PROCEDURE, or TRIGGER.

PACKAGE_NAME

CHAR(63)

Package name.

ROUTINE_NAME

CHAR(63)

Function, procedure, or trigger name.

SQL_TEXT

BLOB SUB_TYPE TEXT

SQL text for BLOCK type statements.

Primary key: PROFILE_ID, STATEMENT_ID.

1.12.3. Table PLG$PROF_REQUESTS

The PLG$PROF_REQUESTS table contains statistics on SQL query execution.

If the profiler is started with the DETAILED_REQUESTS option, the PLG$PROF_REQUESTS table will store detailed query data, i.e., one record for each statement call. This can result in the creation of a large number of records, which will lead to slow operation of RDB$PROFILER.FLUSH.

When DETAILED_REQUESTS is not used (default), the PLG$PROF_REQUESTS table saves an aggregated record for each statement, using REQUEST_ID = 0.

Table 3. Description of columns in the PLG$PROF_REQUESTS table
Column Name Data Type Description

PROFILE_ID

BIGINT

Profiling session identifier.

STATEMENT_ID

BIGINT

Statement identifier.

REQUEST_ID

BIGINT

Request identifier.

CALLER_STATEMENT_ID

BIGINT

Caller statement identifier.

CALLER_REQUEST_ID

BIGINT

Caller request identifier.

START_TIMESTAMP

TIMESTAMP WITH TIME ZONE

Request start time.

FINISH_TIMESTAMP

TIMESTAMP WITH TIME ZONE

Request finish time.

TOTAL_ELAPSED_TIME

BIGINT

Accumulated request execution time (in nanoseconds).

Primary key: PROFILE_ID, STATEMENT_ID, REQUEST_ID.

1.12.4. Table PLG$PROF_CURSORS

The PLG$PROF_CURSORS table contains information about cursors.

Table 4. Description of columns in the PLG$PROF_CURSORS table
Column Name Data Type Description

PROFILE_ID

BIGINT

Profiling session identifier.

STATEMENT_ID

BIGINT

Statement identifier.

CURSOR_ID

BIGINT

Cursor identifier.

NAME

CHAR(63)

Name of the explicitly declared cursor.

LINE_NUM

INTEGER

PSQL line number where the cursor is defined.

COLUMN_NUM

INTEGER

PSQL column number where the cursor is defined.

Primary key: PROFILE_ID, STATEMENT_ID, CURSOR_ID.

1.12.5. Table PLG$PROF_RECORD_SOURCES

The PLG$PROF_RECORD_SOURCES table contains information about data sources.

Table 5. Description of columns in the PLG$PROF_RECORD_SOURCES table
Column Name Data Type Description

PROFILE_ID

BIGINT

Profiling session identifier.

STATEMENT_ID

BIGINT

Statement identifier.

CURSOR_ID

BIGINT

Cursor identifier.

RECORD_SOURCE_ID

BIGINT

Data source identifier.

PARENT_RECORD_SOURCE_ID

BIGINT

Parent data source identifier.

LEVEL

INTEGER

Indent level for the data source. Necessary when constructing a detailed plan.

ACCESS_PATH

BLOB SUB_TYPE TEXT

Description of the access method for the data source.

Primary key: PROFILE_ID, STATEMENT_ID, CURSOR_ID, RECORD_SOURCE_ID.

1.12.6. Table PLG$PROF_RECORD_SOURCE_STATS

The PLG$PROF_RECORD_SOURCES table contains statistics on data sources.

Table 6. Description of columns in the PLG$PROF_RECORD_SOURCE_STATS table
Column Name Data Type Description

PROFILE_ID

BIGINT

Profiling session identifier.

STATEMENT_ID

BIGINT

Statement identifier.

REQUEST_ID

BIGINT

Request identifier.

CURSOR_ID

BIGINT

Cursor identifier.

RECORD_SOURCE_ID

BIGINT

Data source identifier.

OPEN_COUNTER

BIGINT

Number of times the data source was opened.

OPEN_MIN_ELAPSED_TIME

BIGINT

Minimum time to open the data source (in nanoseconds).

OPEN_MAX_ELAPSED_TIME

BIGINT

Maximum time to open the data source (in nanoseconds).

OPEN_TOTAL_ELAPSED_TIME

BIGINT

Accumulated time to open the data source (in nanoseconds).

FETCH_COUNTER

BIGINT

Number of fetches from the data source.

FETCH_MIN_ELAPSED_TIME

BIGINT

Minimum time to fetch a record from the data source (in nanoseconds).

FETCH_MAX_ELAPSED_TIME

BIGINT

Maximum time to fetch a record from the data source (in nanoseconds).

FETCH_TOTAL_ELAPSED_TIME

BIGINT

Accumulated time to fetch records from the data source (in nanoseconds).

Primary key: PROFILE_ID, STATEMENT_ID, REQUEST_ID, CURSOR_ID, RECORD_SOURCE_ID.

1.12.7. Table PLG$PROF_PSQL_STATS

The PLG$PROF_PSQL_STATS table contains PSQL statistics.

Table 7. Description of columns in the PLG$PROF_PSQL_STATS table
Column Name Data Type Description

PROFILE_ID

BIGINT

Profiling session identifier.

STATEMENT_ID

BIGINT

Statement identifier.

REQUEST_ID

BIGINT

Request identifier.

LINE_NUM

INTEGER

PSQL line number for the statement.

COLUMN_NUM

INTEGER

PSQL column number for the statement.

COUNTER

BIGINT

Number of executions for the line/column number.

MIN_ELAPSED_TIME

BIGINT

Minimum execution time (in nanoseconds) for the line/column.

MAX_ELAPSED_TIME

BIGINT

Maximum execution time (in nanoseconds) for the line/column.

TOTAL_ELAPSED_TIME

BIGINT

Accumulated execution time (in nanoseconds) for the line/column.

Primary key: PROFILE_ID, STATEMENT_ID, REQUEST_ID, LINE_NUM, COLUMN_NUM.

1.13. Auxiliary Views

In addition to snapshot tables, the profiling plugin creates auxiliary views. These views help extract profiling data aggregated at the statement level.

Auxiliary views are the preferred way to analyze profiling data for quickly finding "hot spots". They can also be used in conjunction with snapshot tables. Once "hot spots" are found, you can drill down into the data at the query level using the tables.

Below is a list of views for the Default_Profiler.

PLG$PROF_PSQL_STATS_VIEW

aggregated PSQL statistics in a profiling session.

PLG$PROF_RECORD_SOURCE_STATS_VIEW

aggregated statistics on data sources in a profiling session.

PLG$PROF_STATEMENT_STATS_VIEW

aggregated statistics of SQL statements in a profiling session.

In this document, I will not provide the source code for these views and a description of their columns; you can always view the text of these views yourself. A description of the columns can be found in the "Firebird 5.0 SQL Language Reference".

1.14. Profiler Launch Modes

Before we move on to real-world examples of using the profiler, I’ll demonstrate the difference between the different modes of running the profiling plugin.

1.14.1. Option DETAILED_REQUESTS

Note

The profiling statistics also include the SELECT RDB$PROFILER.START_SESSION() …​ query and the RDB$PROFILER.START_SESSION function startup statistics.

In order to limit the statistics output, I add a filter by the query text, in this case I output statistics only for those queries that contain 'FROM HORSE'.

SELECT RDB$PROFILER.START_SESSION('Profile without "DETAILED_REQUESTS"')
FROM RDB$DATABASE;

SELECT COUNT(*) FROM HORSE;

SELECT COUNT(*) FROM HORSE;

SELECT RDB$PROFILER.START_SESSION('Profile with "DETAILED_REQUESTS"',
  NULL, NULL, NULL, 'DETAILED_REQUESTS')
FROM RDB$DATABASE;

SELECT COUNT(*) FROM HORSE;

SELECT COUNT(*) FROM HORSE;

EXECUTE PROCEDURE RDB$PROFILER.FINISH_SESSION;

COMMIT;

SELECT
  S.DESCRIPTION,
  V.*
FROM PLG$PROF_STATEMENT_STATS_VIEW V
JOIN PLG$PROF_SESSIONS S ON S.PROFILE_ID = V.PROFILE_ID
WHERE V.SQL_TEXT CONTAINING 'FROM HORSE';
DESCRIPTION                     Profile without "DETAILED_REQUESTS"
PROFILE_ID                      12
STATEMENT_ID                    2149
STATEMENT_TYPE                  BLOCK
PACKAGE_NAME                    <null>
ROUTINE_NAME                    <null>
PARENT_STATEMENT_ID             <null>
PARENT_STATEMENT_TYPE           <null>
PARENT_ROUTINE_NAME             <null>
SQL_TEXT                        13e:9
SELECT COUNT(*) FROM HORSE
COUNTER                         1
MIN_ELAPSED_TIME                <null>
MAX_ELAPSED_TIME                <null>
TOTAL_ELAPSED_TIME              <null>
AVG_ELAPSED_TIME                <null>

DESCRIPTION                     Profile with "DETAILED_REQUESTS"
PROFILE_ID                      13
STATEMENT_ID                    2149
STATEMENT_TYPE                  BLOCK
PACKAGE_NAME                    <null>
ROUTINE_NAME                    <null>
PARENT_STATEMENT_ID             <null>
PARENT_STATEMENT_TYPE           <null>
PARENT_ROUTINE_NAME             <null>
SQL_TEXT                        13e:b
SELECT COUNT(*) FROM HORSE
COUNTER                         2
MIN_ELAPSED_TIME                165498198
MAX_ELAPSED_TIME                235246029
TOTAL_ELAPSED_TIME              400744227
AVG_ELAPSED_TIME                200372113

As you can see, if we run a profiler session without the 'DETAILED_REQUESTS' option, the view gives us less detail. At a minimum, there are no query execution statistics, and it appears as if the query was executed once. Let’s try to detail this data using a query to the snapshot tables.

First, let’s look at the session details without the 'DETAILED_REQUESTS' option.

SELECT
  S.PROFILE_ID,
  S.DESCRIPTION,
  R.REQUEST_ID,
  STMT.STATEMENT_ID,
  STMT.STATEMENT_TYPE,
  STMT.PACKAGE_NAME,
  STMT.ROUTINE_NAME,
  STMT.SQL_TEXT,
  R.CALLER_STATEMENT_ID,
  R.CALLER_REQUEST_ID,
  R.START_TIMESTAMP,
  R.FINISH_TIMESTAMP,
  R.TOTAL_ELAPSED_TIME
FROM PLG$PROF_SESSIONS S
JOIN PLG$PROF_STATEMENTS STMT ON STMT.PROFILE_ID = S.PROFILE_ID
JOIN PLG$PROF_REQUESTS R ON R.PROFILE_ID = S.PROFILE_ID AND R.STATEMENT_ID = STMT.STATEMENT_ID
WHERE S.PROFILE_ID = 12
  AND STMT.SQL_TEXT CONTAINING 'FROM HORSE';
PROFILE_ID                      12
DESCRIPTION                     Profile without "DETAILED_REQUESTS"
REQUEST_ID                      0
STATEMENT_ID                    2149
STATEMENT_TYPE                  BLOCK
PACKAGE_NAME                    <null>
ROUTINE_NAME                    <null>
SQL_TEXT                        13e:9
SELECT COUNT(*) FROM HORSE
CALLER_STATEMENT_ID             <null>
CALLER_REQUEST_ID               <null>
START_TIMESTAMP                 2023-11-09 15:48:59.2250
FINISH_TIMESTAMP                <null>
TOTAL_ELAPSED_TIME              <null>

Now let’s compare it with a session with the 'DETAILED_REQUESTS' option.

SELECT
  S.PROFILE_ID,
  S.DESCRIPTION,
  R.REQUEST_ID,
  STMT.STATEMENT_ID,
  STMT.STATEMENT_TYPE,
  STMT.PACKAGE_NAME,
  STMT.ROUTINE_NAME,
  STMT.SQL_TEXT,
  R.CALLER_STATEMENT_ID,
  R.CALLER_REQUEST_ID,
  R.START_TIMESTAMP,
  R.FINISH_TIMESTAMP,
  R.TOTAL_ELAPSED_TIME
FROM PLG$PROF_SESSIONS S
JOIN PLG$PROF_STATEMENTS STMT ON STMT.PROFILE_ID = S.PROFILE_ID
JOIN PLG$PROF_REQUESTS R ON R.PROFILE_ID = S.PROFILE_ID AND R.STATEMENT_ID = STMT.STATEMENT_ID
WHERE S.PROFILE_ID = 13
  AND STMT.SQL_TEXT CONTAINING 'FROM HORSE';
PROFILE_ID                      13
DESCRIPTION                     Profile with "DETAILED_REQUESTS"
REQUEST_ID                      2474
STATEMENT_ID                    2149
STATEMENT_TYPE                  BLOCK
PACKAGE_NAME                    <null>
ROUTINE_NAME                    <null>
SQL_TEXT                        13e:b
SELECT COUNT(*) FROM HORSE
CALLER_STATEMENT_ID             <null>
CALLER_REQUEST_ID               <null>
START_TIMESTAMP                 2023-11-09 15:49:01.6540
FINISH_TIMESTAMP                2023-11-09 15:49:02.8360
TOTAL_ELAPSED_TIME              165498198

PROFILE_ID                      13
DESCRIPTION                     Profile with "DETAILED_REQUESTS"
REQUEST_ID                      2475
STATEMENT_ID                    2149
STATEMENT_TYPE                  BLOCK
PACKAGE_NAME                    <null>
ROUTINE_NAME                    <null>
SQL_TEXT                        13e:b
SELECT COUNT(*) FROM HORSE
CALLER_STATEMENT_ID             <null>
CALLER_REQUEST_ID               <null>
START_TIMESTAMP                 2023-11-09 15:49:02.8470
FINISH_TIMESTAMP                2023-11-09 15:49:04.0980
TOTAL_ELAPSED_TIME              235246029

Thus, if the profiler is run without the 'DETAILED_REQUESTS' option, all runs of the same SQL statement will appear as a single run, in which statistics are accumulated. Directly in the PLG$PROF_REQUESTS table, statistics are not taken into account at all without the 'DETAILED_REQUESTS' option, but they are aggregated in other tables.

SELECT
  R.PROFILE_ID,
  R.STATEMENT_ID,
  RS.CURSOR_ID,
  RS.RECORD_SOURCE_ID,
  RS.PARENT_RECORD_SOURCE_ID,
  RS."LEVEL",
  RS.ACCESS_PATH,
  RSS.OPEN_COUNTER,
  RSS.OPEN_MIN_ELAPSED_TIME,
  RSS.OPEN_MAX_ELAPSED_TIME,
  RSS.OPEN_TOTAL_ELAPSED_TIME,
  RSS.FETCH_COUNTER,
  RSS.FETCH_MIN_ELAPSED_TIME,
  RSS.FETCH_MAX_ELAPSED_TIME,
  RSS.FETCH_TOTAL_ELAPSED_TIME
FROM
  PLG$PROF_REQUESTS R
  JOIN PLG$PROF_RECORD_SOURCES RS
    ON RS.PROFILE_ID = R.PROFILE_ID AND
       RS.STATEMENT_ID = R.STATEMENT_ID
  JOIN PLG$PROF_RECORD_SOURCE_STATS RSS
    ON RSS.PROFILE_ID = R.PROFILE_ID AND
       RSS.STATEMENT_ID = R.STATEMENT_ID AND
       RSS.REQUEST_ID = R.REQUEST_ID AND
       RSS.CURSOR_ID = RS.CURSOR_ID AND
       RSS.RECORD_SOURCE_ID = RS.RECORD_SOURCE_ID
WHERE R.PROFILE_ID = 12
  AND R.STATEMENT_ID = 2149
ORDER BY RSS.REQUEST_ID, RSS.RECORD_SOURCE_ID
PROFILE_ID                      12
STATEMENT_ID                    2149
CURSOR_ID                       1
RECORD_SOURCE_ID                1
PARENT_RECORD_SOURCE_ID         <null>
LEVEL                           0
ACCESS_PATH                     140:f
Select Expression
OPEN_COUNTER                    2
OPEN_MIN_ELAPSED_TIME           10266
OPEN_MAX_ELAPSED_TIME           10755
OPEN_TOTAL_ELAPSED_TIME         21022
FETCH_COUNTER                   4
FETCH_MIN_ELAPSED_TIME          0
FETCH_MAX_ELAPSED_TIME          191538868
FETCH_TOTAL_ELAPSED_TIME        356557956

PROFILE_ID                      12
STATEMENT_ID                    2149
CURSOR_ID                       1
RECORD_SOURCE_ID                2
PARENT_RECORD_SOURCE_ID         1
LEVEL                           1
ACCESS_PATH                     140:10
-> Aggregate
OPEN_COUNTER                    2
OPEN_MIN_ELAPSED_TIME           9777
OPEN_MAX_ELAPSED_TIME           9777
OPEN_TOTAL_ELAPSED_TIME         19555
FETCH_COUNTER                   4
FETCH_MIN_ELAPSED_TIME          0
FETCH_MAX_ELAPSED_TIME          191538379
FETCH_TOTAL_ELAPSED_TIME        356556489

PROFILE_ID                      12
STATEMENT_ID                    2149
CURSOR_ID                       1
RECORD_SOURCE_ID                3
PARENT_RECORD_SOURCE_ID         2
LEVEL                           2
ACCESS_PATH                     140:11
-> Table "HORSE" Full Scan
OPEN_COUNTER                    2
OPEN_MIN_ELAPSED_TIME           2444
OPEN_MAX_ELAPSED_TIME           3911
OPEN_TOTAL_ELAPSED_TIME         6355
FETCH_COUNTER                   1039248
FETCH_MIN_ELAPSED_TIME          0
FETCH_MAX_ELAPSED_TIME          905422
FETCH_TOTAL_ELAPSED_TIME        330562264

Here you can see that all statistics are "doubled" because the request was run twice. Now let’s look at the statistics with 'DETAILED_REQUESTS'.

SELECT
  R.PROFILE_ID,
  R.STATEMENT_ID,
  RS.CURSOR_ID,
  RS.RECORD_SOURCE_ID,
  RS.PARENT_RECORD_SOURCE_ID,
  RS."LEVEL",
  RS.ACCESS_PATH,
  RSS.OPEN_COUNTER,
  RSS.OPEN_MIN_ELAPSED_TIME,
  RSS.OPEN_MAX_ELAPSED_TIME,
  RSS.OPEN_TOTAL_ELAPSED_TIME,
  RSS.FETCH_COUNTER,
  RSS.FETCH_MIN_ELAPSED_TIME,
  RSS.FETCH_MAX_ELAPSED_TIME,
  RSS.FETCH_TOTAL_ELAPSED_TIME
FROM
  PLG$PROF_REQUESTS R
  JOIN PLG$PROF_RECORD_SOURCES RS
    ON RS.PROFILE_ID = R.PROFILE_ID AND
       RS.STATEMENT_ID = R.STATEMENT_ID
  JOIN PLG$PROF_RECORD_SOURCE_STATS RSS
    ON RSS.PROFILE_ID = R.PROFILE_ID AND
       RSS.STATEMENT_ID = R.STATEMENT_ID AND
       RSS.REQUEST_ID = R.REQUEST_ID AND
       RSS.CURSOR_ID = RS.CURSOR_ID AND
       RSS.RECORD_SOURCE_ID = RS.RECORD_SOURCE_ID
WHERE R.PROFILE_ID = 13
  AND R.STATEMENT_ID = 2149
ORDER BY RSS.REQUEST_ID, RSS.RECORD_SOURCE_ID
PROFILE_ID                      13
STATEMENT_ID                    2149
CURSOR_ID                       1
RECORD_SOURCE_ID                1
PARENT_RECORD_SOURCE_ID         <null>
LEVEL                           0
ACCESS_PATH                     140:14
Select Expression
OPEN_COUNTER                    1
OPEN_MIN_ELAPSED_TIME           20044
OPEN_MAX_ELAPSED_TIME           20044
OPEN_TOTAL_ELAPSED_TIME         20044
FETCH_COUNTER                   2
FETCH_MIN_ELAPSED_TIME          0
FETCH_MAX_ELAPSED_TIME          165438065
FETCH_TOTAL_ELAPSED_TIME        165438065

PROFILE_ID                      13
STATEMENT_ID                    2149
CURSOR_ID                       1
RECORD_SOURCE_ID                2
PARENT_RECORD_SOURCE_ID         1
LEVEL                           1
ACCESS_PATH                     140:15
-> Aggregate
OPEN_COUNTER                    1
OPEN_MIN_ELAPSED_TIME           19066
OPEN_MAX_ELAPSED_TIME           19066
OPEN_TOTAL_ELAPSED_TIME         19066
FETCH_COUNTER                   2
FETCH_MIN_ELAPSED_TIME          0
FETCH_MAX_ELAPSED_TIME          165437576
FETCH_TOTAL_ELAPSED_TIME        165437576

PROFILE_ID                      13
STATEMENT_ID                    2149
CURSOR_ID                       1
RECORD_SOURCE_ID                3
PARENT_RECORD_SOURCE_ID         2
LEVEL                           2
ACCESS_PATH                     140:16
-> Table "HORSE" Full Scan
OPEN_COUNTER                    1
OPEN_MIN_ELAPSED_TIME           2444
OPEN_MAX_ELAPSED_TIME           2444
OPEN_TOTAL_ELAPSED_TIME         2444
FETCH_COUNTER                   519624
FETCH_MIN_ELAPSED_TIME          0
FETCH_MAX_ELAPSED_TIME          892222
FETCH_TOTAL_ELAPSED_TIME        161990420

PROFILE_ID                      13
STATEMENT_ID                    2149
CURSOR_ID                       1
RECORD_SOURCE_ID                1
PARENT_RECORD_SOURCE_ID         <null>
LEVEL                           0
ACCESS_PATH                     140:14
Select Expression
OPEN_COUNTER                    1
OPEN_MIN_ELAPSED_TIME           12711
OPEN_MAX_ELAPSED_TIME           12711
OPEN_TOTAL_ELAPSED_TIME         12711
FETCH_COUNTER                   2
FETCH_MIN_ELAPSED_TIME          488
FETCH_MAX_ELAPSED_TIME          235217674
FETCH_TOTAL_ELAPSED_TIME        235218163

PROFILE_ID                      13
STATEMENT_ID                    2149
CURSOR_ID                       1
RECORD_SOURCE_ID                2
PARENT_RECORD_SOURCE_ID         1
LEVEL                           1
ACCESS_PATH                     140:15
-> Aggregate
OPEN_COUNTER                    1
OPEN_MIN_ELAPSED_TIME           11244
OPEN_MAX_ELAPSED_TIME           11244
OPEN_TOTAL_ELAPSED_TIME         11244
FETCH_COUNTER                   2
FETCH_MIN_ELAPSED_TIME          0
FETCH_MAX_ELAPSED_TIME          235217674
FETCH_TOTAL_ELAPSED_TIME        235217674

PROFILE_ID                      13
STATEMENT_ID                    2149
CURSOR_ID                       1
RECORD_SOURCE_ID                3
PARENT_RECORD_SOURCE_ID         2
LEVEL                           2
ACCESS_PATH                     140:16
-> Table "HORSE" Full Scan
OPEN_COUNTER                    1
OPEN_MIN_ELAPSED_TIME           2444
OPEN_MAX_ELAPSED_TIME           2444
OPEN_TOTAL_ELAPSED_TIME         2444
FETCH_COUNTER                   519624
FETCH_MIN_ELAPSED_TIME          0
FETCH_MAX_ELAPSED_TIME          675155
FETCH_TOTAL_ELAPSED_TIME        196082602

For a session with the 'DETAILED_REQUESTS' option, we see that statistics are collected separately for each SQL statement run.

Note

The 'DETAILED_REQUESTS' option creates one record in the PLG$PROF_REQUESTS table not only for each top-level SQL query, but also for each stored procedure or function call. Thus, if you have a PSQL function called on some field in the SELECT clause, then PLG$PROF_REQUESTS will have as many records with the call to this function as there were records snapped. This can significantly slow down the reset of profiling statistics. Therefore, you should not use the 'DETAILED_REQUESTS' option for any profiling session. To find "bottlenecks" in a particular query, it will be sufficient to leave the PLUGIN_OPTIONS parameter at the default value.

1.14.2. Running the profiler in a remote connection

To start a profiling session on a remote connection, you need to know the ID of that connection. You can do this using the MON$ATTACHMENTS monitoring table or by running a query with the CURRENT_CONNECTION context variable in the remote session, and set this ID as the value of the ATTACHMENT_ID parameter of the RDB$PROFILER.START_SESSION function.

Requesting connection ID in session 1
select current_connection from rdb$database;
   CURRENT_CONNECTION
=====================
                   29
Starting a profiling session on a remote computer in session 2
SELECT RDB$PROFILER.START_SESSION('Profile with "DETAILED_REQUESTS"',
  NULL, 29, NULL, 'DETAILED_REQUESTS')
FROM RDB$DATABASE;
The request or requests that we profile in session 1
select current_connection from rdb$database;
Stopping remote profiling in session 2
EXECUTE PROCEDURE RDB$PROFILER.FINISH_SESSION(TRUE, 29);

COMMIT;

Now we can see the profiling result in any connection.

SELECT
  S.PROFILE_ID,
  S.ATTACHMENT_ID,
  S.START_TIMESTAMP AS SESSION_START,
  S.FINISH_TIMESTAMP AS SESSION_FINISH,
  R.REQUEST_ID,
  STMT.STATEMENT_ID,
  STMT.STATEMENT_TYPE,
  STMT.PACKAGE_NAME,
  STMT.ROUTINE_NAME,
  STMT.SQL_TEXT,
  R.CALLER_STATEMENT_ID,
  R.CALLER_REQUEST_ID,
  R.START_TIMESTAMP,
  R.FINISH_TIMESTAMP,
  R.TOTAL_ELAPSED_TIME
FROM PLG$PROF_SESSIONS S
JOIN PLG$PROF_STATEMENTS STMT ON STMT.PROFILE_ID = S.PROFILE_ID
JOIN PLG$PROF_REQUESTS R ON R.PROFILE_ID = S.PROFILE_ID AND R.STATEMENT_ID = STMT.STATEMENT_ID
WHERE S.ATTACHMENT_ID = 29
  AND STMT.SQL_TEXT CONTAINING 'FROM HORSE';
PROFILE_ID                      14
ATTACHMENT_ID                   29
SESSION_START                   2023-11-09 16:56:39.1640
SESSION_FINISH                  2023-11-09 16:57:41.0010
REQUEST_ID                      3506
STATEMENT_ID                    3506
STATEMENT_TYPE                  BLOCK
PACKAGE_NAME                    <null>
ROUTINE_NAME                    <null>
SQL_TEXT                        13e:1
SELECT COUNT(*) FROM HORSE
CALLER_STATEMENT_ID             <null>
CALLER_REQUEST_ID               <null>
START_TIMESTAMP                 2023-11-09 16:57:29.1010
FINISH_TIMESTAMP                2023-11-09 16:57:30.4800
TOTAL_ELAPSED_TIME              82622

1.15. Examples of using the profiler to find "bottlenecks"

Now that you’ve become familiar with the different modes for launching a profiling session, it’s time to show how the profiler can help you find “bottlenecks” in your SQL queries and PSQL modules.

Let’s say using tracing you found such a slow query:

SELECT
 CODE_HORSE,
 BYDATE,
 HORSENAME,
 FRISK,
 OWNERNAME
FROM SP_SOME_STAT(58, '2.00,0', 2020, 2023)

We need to understand the reasons for the slowdown of the query and fix them. The first thing to do is eliminate the time it takes to fetch records for the client. To do this, you can simply wrap this query in another query that will simply calculate the number of records. Thus, we are guaranteed to read all records, but at the same time we need to send only 1 record to the client.

SELECT COUNT(*)
FROM (
  SELECT
    CODE_HORSE,
    BYDATE,
    HORSENAME,
    FRISK,
    OWNERNAME
  FROM SP_SOME_STAT(58, '2.00,0', 2020, 2023)
);

The execution statistics for this SQL query look like this:

                COUNT
=====================
                  240

Current memory = 554444768
Delta memory = 17584
Max memory = 554469104
Elapsed time = 2.424 sec
Buffers = 32768
Reads = 0
Writes = 0
Fetches = 124985
Note

In this case, the list of fields could simply be replaced with COUNT(*) without wrapping the query in a derived table, however, in the general case, the SELECT clause can contain various expressions, including subqueries, so it is better to do as I showed .

Now you can run the query in the profiler:

SELECT RDB$PROFILER.START_SESSION('Profile procedure SP_SOME_STAT')
FROM RDB$DATABASE;

SELECT COUNT(*)
FROM (
  SELECT
    CODE_HORSE,
    BYDATE,
    HORSENAME,
    FRISK,
    OWNERNAME
  FROM SP_SOME_STAT(58, '2.00,0', 2020, 2023)
);

EXECUTE PROCEDURE RDB$PROFILER.FINISH_SESSION;

COMMIT;

First, let’s look at the PSQL statistics:

SELECT
  ROUTINE_NAME,
  LINE_NUM,
  COLUMN_NUM,
  COUNTER,
  TOTAL_ELAPSED_TIME,
  AVG_ELAPSED_TIME
FROM PLG$PROF_PSQL_STATS_VIEW STAT
WHERE STAT.PROFILE_ID = 5;
ROUTINE_NAME             LINE_NUM   COLUMN_NUM         COUNTER    TOTAL_ELAPSED_TIME      AVG_ELAPSED_TIME
==================== ============ ============ =============== ===================== =====================
SF_RACETIME_TO_SEC             22            5           67801            1582095600                 23334
SF_RACETIME_TO_SEC             18            3           67801              90068700                  1328
SF_RACETIME_TO_SEC             27            5           67801              53903300                   795
SF_RACETIME_TO_SEC             31            5           67801              49835400                   735
SP_SOME_STAT                   16            3               1              43414600              43414600
SF_RACETIME_TO_SEC             25            5           67801              42623200                   628
SF_RACETIME_TO_SEC             34            5           67801              37339200                   550
SF_RACETIME_TO_SEC             14            3           67801              35822000                   528
SF_RACETIME_TO_SEC             29            5           67801              34874400                   514
SF_RACETIME_TO_SEC             32            5           67801              24093200                   355
SF_RACETIME_TO_SEC             15            3           67801              23832900                   351
SF_RACETIME_TO_SEC              6            1           67801              15985600                   235
SF_RACETIME_TO_SEC             26            5           67801              15625500                   230
SP_SOME_STAT                   38            5             240               3454800                 14395
SF_SEC_TO_RACETIME             20            3             240                549900                  2291
SF_SEC_TO_RACETIME             31            3             240                304100                  1267
SF_SEC_TO_RACETIME             21            3             240                294200                  1225
SF_SEC_TO_RACETIME             16            3             240                293900                  1224
SF_RACETIME_TO_SEC              7            1           67801                202400                     2
SF_RACETIME_TO_SEC              8            1           67801                186100                     2

ROUTINE_NAME             LINE_NUM   COLUMN_NUM         COUNTER    TOTAL_ELAPSED_TIME      AVG_ELAPSED_TIME
==================== ============ ============ =============== ===================== =====================
SF_RACETIME_TO_SEC             20            3           67801                168400                     2
SF_RACETIME_TO_SEC              9            1           67801                156700                     2
SF_RACETIME_TO_SEC             12            1           67801                153900                     2
SF_RACETIME_TO_SEC             10            1           67801                153300                     2
SF_SEC_TO_RACETIME             18            3             240                148600                   619
SF_RACETIME_TO_SEC             16            3           67801                127100                     1
SF_SEC_TO_RACETIME             17            3             240                 92100                   383
SF_SEC_TO_RACETIME              8            1             240                 89200                   371
SF_RACETIME_TO_SEC             11            1           67801                 69500                     1
SF_SEC_TO_RACETIME             28            3             240                 16600                    69
SF_RACETIME_TO_SEC              5            1           67801                  7800                     0
SF_SEC_TO_RACETIME             11            1             240                  2000                     8
SF_SEC_TO_RACETIME             10            1             240                  1800                     7
SF_SEC_TO_RACETIME              9            1             240                  1200                     5
SP_SOME_STAT                   37            5             240                   500                     2
SF_SEC_TO_RACETIME             13            3             240                   500                     2
SF_SEC_TO_RACETIME              7            1             240                   400                     1

From these statistics it is clear that the leader in total execution time is the operator located in line 22 of the SF_RACETIME_TO_SEC function. The average execution time of this statement is low, but it is called 67801 times. There are two options for optimization: either optimize the SF_RACETIME_TO_SEC function itself (operator on line 22), or reduce the number of calls to this function.

Let’s look at the contents of the SP_SOME_STAT procedure.

CREATE OR ALTER PROCEDURE SP_SOME_STAT (
    A_CODE_BREED INTEGER,
    A_MIN_FRISK  VARCHAR(9),
    A_YEAR_BEGIN SMALLINT,
    A_YEAR_END   SMALLINT
)
RETURNS (
    CODE_HORSE BIGINT,
    BYDATE     DATE,
    HORSENAME  VARCHAR(50),
    FRISK      VARCHAR(9),
    OWNERNAME  VARCHAR(120)
)
AS
BEGIN
  FOR
    SELECT
      TL.CODE_HORSE,
      TRIAL.BYDATE,
      H.NAME,
      SF_SEC_TO_RACETIME(TL.TIME_PASSED_SEC) AS FRISK
    FROM
      TRIAL_LINE TL
      JOIN TRIAL ON TRIAL.CODE_TRIAL = TL.CODE_TRIAL
      JOIN HORSE H ON H.CODE_HORSE = TL.CODE_HORSE
    WHERE TL.TIME_PASSED_SEC <= SF_RACETIME_TO_SEC(:A_MIN_FRISK)
      AND TRIAL.CODE_TRIALTYPE = 2
      AND H.CODE_BREED = :A_CODE_BREED
      AND EXTRACT(YEAR FROM TRIAL.BYDATE) BETWEEN :A_YEAR_BEGIN AND :A_YEAR_END
    INTO
      CODE_HORSE,
      BYDATE,
      HORSENAME,
      FRISK
  DO
  BEGIN
    OWNERNAME = NULL;
    SELECT
      FARM.NAME
    FROM
      (
        SELECT
          R.CODE_FARM
        FROM REGISTRATION R
        WHERE R.CODE_HORSE = :CODE_HORSE
          AND R.CODE_REGTYPE = 6
          AND R.BYDATE <= :BYDATE
        ORDER BY R.BYDATE DESC
        FETCH FIRST ROW ONLY
      ) OWN
      JOIN FARM ON FARM.CODE_FARM = OWN.CODE_FARM
    INTO OWNERNAME;

    SUSPEND;
  END
END

The function SF_RACETIME_TO_SEC calls on line number 21:

    WHERE TL.TIME_PASSED_SEC <= SF_RACETIME_TO_SEC(:A_MIN_FRISK)

Obviously this condition will be checked multiple times for each record. Repeated execution of the SF_RACETIME_TO_SEC function contributes significantly to the overall execution time. If you look at the function call itself, you will notice that the function arguments do not depend on the data source, that is, the function value is invariant. This means that we can move its calculation outside the query. So we can rewrite our procedure like this:

CREATE OR ALTER PROCEDURE SP_SOME_STAT (
    A_CODE_BREED INTEGER,
    A_MIN_FRISK  VARCHAR(9),
    A_YEAR_BEGIN SMALLINT,
    A_YEAR_END   SMALLINT
)
RETURNS (
    CODE_HORSE BIGINT,
    BYDATE     DATE,
    HORSENAME  VARCHAR(50),
    FRISK      VARCHAR(9),
    OWNERNAME  VARCHAR(120)
)
AS
  DECLARE TIME_PASSED NUMERIC(18, 3);
BEGIN
  TIME_PASSED = SF_RACETIME_TO_SEC(:A_MIN_FRISK);
  FOR
    SELECT
      TL.CODE_HORSE,
      TRIAL.BYDATE,
      H.NAME,
      SF_SEC_TO_RACETIME(TL.TIME_PASSED_SEC) AS FRISK
    FROM
      TRIAL_LINE TL
      JOIN TRIAL ON TRIAL.CODE_TRIAL = TL.CODE_TRIAL
      JOIN HORSE H ON H.CODE_HORSE = TL.CODE_HORSE
    WHERE TL.TIME_PASSED_SEC <= :TIME_PASSED
      AND TRIAL.CODE_TRIALTYPE = 2
      AND H.CODE_BREED = :A_CODE_BREED
      AND EXTRACT(YEAR FROM TRIAL.BYDATE) BETWEEN :A_YEAR_BEGIN AND :A_YEAR_END
    INTO
      CODE_HORSE,
      BYDATE,
      HORSENAME,
      FRISK
  DO
  BEGIN
    OWNERNAME = NULL;
    SELECT
      FARM.NAME
    FROM
      (
        SELECT
          R.CODE_FARM
        FROM REGISTRATION R
        WHERE R.CODE_HORSE = :CODE_HORSE
          AND R.CODE_REGTYPE = 6
          AND R.BYDATE <= :BYDATE
        ORDER BY R.BYDATE DESC
        FETCH FIRST ROW ONLY
      ) OWN
      JOIN FARM ON FARM.CODE_FARM = OWN.CODE_FARM
    INTO OWNERNAME;

    SUSPEND;
  END
END

Let’s try to execute an SQL query after changing the SP_SOME_STAT procedure:

SELECT COUNT(*)
FROM (
  SELECT
    CODE_HORSE,
    BYDATE,
    HORSENAME,
    FRISK,
    OWNERNAME
  FROM SP_SOME_STAT(58, '2.00,0', 2020, 2023)
);
                COUNT
=====================
                  240

Current memory = 555293472
Delta memory = 288
Max memory = 555359872
Elapsed time = 0.134 sec
Buffers = 32768
Reads = 0
Writes = 0
Fetches = 124992

2.424 sec vs 0.134 sec - the acceleration is significant. Is it possible to do better? Let’s run the profiling session again. The new session ID is 6.

Let’s look at the PSQL statistics:

SELECT
  ROUTINE_NAME,
  LINE_NUM,
  COLUMN_NUM,
  COUNTER,
  TOTAL_ELAPSED_TIME,
  AVG_ELAPSED_TIME
FROM PLG$PROF_PSQL_STATS_VIEW STAT
WHERE STAT.PROFILE_ID = 6;
ROUTINE_NAME             LINE_NUM   COLUMN_NUM         COUNTER    TOTAL_ELAPSED_TIME      AVG_ELAPSED_TIME
==================== ============ ============ =============== ===================== =====================
SP_SOME_STAT                   18            3               1              10955200              10955200
SP_SOME_STAT                   40            5             240               3985800                 16607
SF_SEC_TO_RACETIME             20            3             240                508100                  2117
SF_SEC_TO_RACETIME             31            3             240                352700                  1469
SF_SEC_TO_RACETIME             21            3             240                262200                  1092
SF_SEC_TO_RACETIME             16            3             240                257300                  1072
SF_SEC_TO_RACETIME             18            3             240                156400                   651
SP_SOME_STAT                   17            3               1                141500                141500
SF_SEC_TO_RACETIME              8            1             240                125700                   523
SF_SEC_TO_RACETIME             17            3             240                 94100                   392
SF_RACETIME_TO_SEC             22            5               1                 83400                 83400
SF_SEC_TO_RACETIME             28            3             240                 38700                   161
SF_SEC_TO_RACETIME             10            1             240                 20800                    86
SF_SEC_TO_RACETIME             11            1             240                 20200                    84
SF_SEC_TO_RACETIME              9            1             240                 16200                    67
SF_RACETIME_TO_SEC              6            1               1                  7100                  7100
SF_SEC_TO_RACETIME              7            1             240                  6600                    27
SF_RACETIME_TO_SEC             27            5               1                  5800                  5800
SF_RACETIME_TO_SEC             18            3               1                  5700                  5700
SF_SEC_TO_RACETIME             13            3             240                  5700                    23

ROUTINE_NAME             LINE_NUM   COLUMN_NUM         COUNTER    TOTAL_ELAPSED_TIME      AVG_ELAPSED_TIME
==================== ============ ============ =============== ===================== =====================
SF_RACETIME_TO_SEC             32            5               1                  4600                  4600
SP_SOME_STAT                   39            5             240                  4400                    18
SF_RACETIME_TO_SEC             14            3               1                  4300                  4300
SF_RACETIME_TO_SEC             34            5               1                  3500                  3500
SF_RACETIME_TO_SEC             25            5               1                  3300                  3300
SF_RACETIME_TO_SEC             31            5               1                  2900                  2900
SF_RACETIME_TO_SEC             29            5               1                  2800                  2800
SF_RACETIME_TO_SEC             15            3               1                  1600                  1600
SF_RACETIME_TO_SEC             26            5               1                  1000                  1000
SF_RACETIME_TO_SEC              7            1               1                   800                   800
SF_RACETIME_TO_SEC             20            3               1                   800                   800
SF_RACETIME_TO_SEC              5            1               1                   400                   400
SF_RACETIME_TO_SEC              8            1               1                   400                   400
SF_RACETIME_TO_SEC             10            1               1                   400                   400
SF_RACETIME_TO_SEC             11            1               1                   400                   400
SF_RACETIME_TO_SEC             12            1               1                   400                   400
SF_RACETIME_TO_SEC             16            3               1                   400                   400
SP_SOME_STAT                   15            3               1                   300                   300
SF_RACETIME_TO_SEC              9            1               1                   300                   300

Line 18 in the SP_SOME_STAT procedure takes the longest time - this is the top-level cursor itself, but this cursor is opened once. It is important to note here that the total time for retrieving all records from the cursor is affected by the operators executed inside the statement block for processing each cursor record, that is

  FOR
    SELECT
...
  DO
  BEGIN
    -- everything that is done here affects the time it takes to retrieve all records from the cursor
    ...
  END

Let’s look at what makes the most significant contribution within this block. This is line number 40 of the SP_SOME_STAT procedure, which is called 240 times. Here is the content of the statement that is called:

    SELECT
      FARM.NAME
    FROM
      (
        SELECT
          R.CODE_FARM
        FROM REGISTRATION R
        WHERE R.CODE_HORSE = :CODE_HORSE
          AND R.CODE_REGTYPE = 6
          AND R.BYDATE <= :BYDATE
        ORDER BY R.BYDATE DESC
        FETCH FIRST ROW ONLY
      ) OWN
      JOIN FARM ON FARM.CODE_FARM = OWN.CODE_FARM
    INTO OWNERNAME;

Now let’s look at the plan of the procedure SP_SOME_STAT:

SELECT CS.MON$EXPLAINED_PLAN
FROM MON$COMPILED_STATEMENTS CS
JOIN PLG$PROF_STATEMENTS S ON S.STATEMENT_ID = CS.MON$COMPILED_STATEMENT_ID
WHERE CS.MON$OBJECT_NAME = 'SP_SOME_STAT'
  AND S.PROFILE_ID = 6;
==============================================================================
MON$EXPLAINED_PLAN:

Select Expression (line 40, column 5)
    -> Singularity Check
        -> Nested Loop Join (inner)
            -> First N Records
                -> Refetch
                    -> Sort (record length: 28, key length: 8)
                        -> Filter
                            -> Table "REGISTRATION" as "OWN R" Access By ID
                                -> Bitmap
                                    -> Index "REGISTRATION_IDX_HORSE_REGTYPE" Range Scan (full match)
            -> Filter
                -> Table "FARM" Access By ID
                    -> Bitmap
                        -> Index "PK_FARM" Unique Scan
Select Expression (line 18, column 3)
    -> Nested Loop Join (inner)
        -> Filter
            -> Table "TRIAL" Access By ID
                -> Bitmap And
                    -> Bitmap
                        -> Index "IDX_TRIAL_BYYEAR" Range Scan (lower bound: 1/1, upper bound: 1/1)
                    -> Bitmap
                        -> Index "FK_TRIAL_TRIALTYPE" Range Scan (full match)
        -> Filter
            -> Table "TRIAL_LINE" as "TL" Access By ID
                -> Bitmap
                    -> Index "FK_TRIAL_LINE_TRIAL" Range Scan (full match)
        -> Filter
            -> Table "HORSE" as "H" Access By ID
                -> Bitmap
                    -> Index "PK_HORSE" Unique Scan
==============================================================================

So the inner query plan is:

Select Expression (line 40, column 5)
    -> Singularity Check
        -> Nested Loop Join (inner)
            -> First N Records
                -> Refetch
                    -> Sort (record length: 28, key length: 8)
                        -> Filter
                            -> Table "REGISTRATION" as "OWN R" Access By ID
                                -> Bitmap
                                    -> Index "REGISTRATION_IDX_HORSE_REGTYPE" Range Scan (full match)
            -> Filter
                -> Table "FARM" Access By ID
                    -> Bitmap
                        -> Index "PK_FARM" Unique Scan

As you can see, the plan is not the most effective. An index is used to filter the data, and then sort the resulting keys and Refetch. Let’s look at the REGISTRATION_IDX_HORSE_REGTYPE index:

SQL> SHOW INDEX REGISTRATION_IDX_HORSE_REGTYPE;
REGISTRATION_IDX_HORSE_REGTYPE INDEX ON REGISTRATION(CODE_HORSE, CODE_REGTYPE)

Only the CODE_HORSE and CODE_REGTYPE fields are included in the index, so index navigation cannot be used to determine the last record on a date. Let’s try to create another composite index:

CREATE DESCENDING INDEX IDX_REG_HORSE_OWNER ON REGISTRATION(CODE_HORSE, CODE_REGTYPE, BYDATE);

Let’s run the query again:

SELECT COUNT(*)
FROM (
  SELECT
    CODE_HORSE,
    BYDATE,
    HORSENAME,
    FRISK,
    OWNERNAME
  FROM SP_SOME_STAT(58, '2.00,0', 2020, 2023)
);
                COUNT
=====================
                  240

Current memory = 554429808
Delta memory = 288
Max memory = 554462400
Elapsed time = 0.125 sec
Buffers = 32768
Reads = 0
Writes = 0
Fetches = 124165

Just in case, let’s check that the procedure plan has changed.

SELECT CS.MON$EXPLAINED_PLAN
FROM MON$COMPILED_STATEMENTS CS
WHERE CS.MON$OBJECT_NAME = 'SP_SOME_STAT'
ORDER BY CS.MON$COMPILED_STATEMENT_ID DESC
FETCH FIRST ROW ONLY;
==============================================================================
MON$EXPLAINED_PLAN:

Select Expression (line 38, column 5)
    -> Singularity Check
        -> Nested Loop Join (inner)
            -> First N Records
                -> Filter
                    -> Table "REGISTRATION" as "OWN R" Access By ID
                        -> Index "IDX_REG_HORSE_OWNER" Range Scan (lower bound: 3/3, upper bound: 2/3)
            -> Filter
                -> Table "FARM" Access By ID
                    -> Bitmap
                        -> Index "PK_FARM" Unique Scan
Select Expression (line 16, column 3)
    -> Nested Loop Join (inner)
        -> Filter
            -> Table "TRIAL" Access By ID
                -> Bitmap And
                    -> Bitmap
                        -> Index "IDX_TRIAL_BYYEAR" Range Scan (lower bound: 1/1, upper bound: 1/1)
                    -> Bitmap
                        -> Index "FK_TRIAL_TRIALTYPE" Range Scan (full match)
        -> Filter
            -> Table "TRIAL_LINE" as "TL" Access By ID
                -> Bitmap
                    -> Index "FK_TRIAL_LINE_TRIAL" Range Scan (full match)
        -> Filter
            -> Table "HORSE" as "H" Access By ID
                -> Bitmap
                    -> Index "PK_HORSE" Unique Scan
==============================================================================

Yes, the plan got better. Navigation by index IDX_REG_HORSE_OWNER is now used with Range Scan. If we compare the execution time, we get 0.134 seconds vs 0.125 seconds and 124992 vs 124165 fetches. The improvements are very minor. In principle, relative to the original version, our procedure has already become 19 times faster, so optimization can be completed.

1.16. Conclusion

In this article, you were introduced to a new tool - the PSQL profiler, which allows you to find bottlenecks in your SQL and PSQL code. Now it will be easier for you to find the reasons for slowness not only among top-level queries, but also in PSQL modules.

This concludes the review of the new features of Firebird 5.0. Migrate to Firebird 5.0 and get all the features described above.