One of the tasks of a database developer or administrator is to determine the causes of "slowdowns" in the information system.
Starting from Firebird 2.5, a powerful tracing tool has been added to their arsenal. Tracing is an indispensable tool for finding application bottlenecks, evaluating resources consumed during query execution, and determining the frequency of certain actions. Tracing shows statistics in the most detailed form (unlike the statistics available in ISQL, for example). The statistics do not take into account the costs of query preparation and data transmission over the network, which makes it "cleaner" than the data shown by ISQL. At the same time, tracing has a very insignificant impact on performance. Even with intensive logging, it usually results in no more than a 2-3% drop in the speed of executed queries.
After slow queries are "caught" by tracing, you can start optimizing them. However, such queries can be quite complex, and sometimes even call stored procedures, so a profiling tool is needed to help identify bottlenecks in the query itself or in the called PSQL module. Starting from Firebird 5.0, such a tool has appeared.
The profiler allows users to measure the performance costs of SQL and PSQL code. This is implemented using a system package in the engine that transmits data to the profiler plugin.
In this document, the engine and plugin are considered as a whole. Additionally, it is assumed that the default profiler plugin (Default_Profiler) is used.
The RDB$PROFILER package can profile the execution of PSQL code, collecting statistics on how many times each line was executed, along with its minimum, maximum, and total execution time (with nanosecond accuracy), as well as statistics on opening and fetching records from implicit and explicit SQL cursors. In addition, you can obtain statistics on SQL cursors in terms of the data sources (access methods) of the expanded query plan.
Note
Although the execution time is measured with nanosecond accuracy, this result should not be taken literally. Measuring execution time has overhead, which the profiler tries to compensate for. The measured time therefore cannot be exact, but it does allow you to compare the relative costs of individual sections of PSQL code and identify bottlenecks.
To collect profiling data, the user must first start a profiling session using the RDB$PROFILER.START_SESSION function. This function returns a profiling session identifier, which is later saved in the profiler snapshot tables; you can subsequently query these tables to analyze the results. A profiling session can be local (same connection) or remote (another connection).
Remote profiling simply redirects session control commands to the remote connection. Thus, a client can profile multiple connections simultaneously. It is also possible for a locally or remotely started profiling session to contain commands issued by another connection.
Remotely executed session control commands require the target connection to be idle, i.e., not executing other queries. When the target connection is not idle, the call blocks until that state is reached.
If the remote connection belongs to another user, the calling user must have the PROFILE_ANY_ATTACHMENT system privilege.
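As a sketch, such a privilege can be packaged into a role and granted to the user who will manage remote profiling sessions (the role and user names here are arbitrary examples):

```sql
-- create a role that carries the PROFILE_ANY_ATTACHMENT system privilege
CREATE ROLE PROFILER_ADMIN SET SYSTEM PRIVILEGES TO PROFILE_ANY_ATTACHMENT;
-- grant it to the user who will profile other users' connections
GRANT PROFILER_ADMIN TO USER ALICE;
```

The user then connects with this role to gain the privilege.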
After starting a session, statistics for PSQL and SQL statements are collected in memory. The profiling session only collects data about statements executed in the connection associated with the session. Data is aggregated and saved for each query (i.e., executed statement). When querying snapshot tables, the user can perform additional aggregation for each statement or use auxiliary views that do this automatically.
The session can be paused to temporarily disable statistics collection, and later resumed to continue collecting statistics within the same session.
To analyze the collected data, the user must flush the data to the snapshot tables. This can be done by finishing or pausing the session (with the FLUSH parameter set to TRUE), or by calling RDB$PROFILER.FLUSH. Data is flushed using an autonomous transaction (a transaction that starts and ends for the specific purpose of writing profiler data).
All procedures and functions of the RDB$PROFILER package have an ATTACHMENT_ID parameter, which should be specified when you want to manage a remote profiling session. If this parameter is NULL or omitted, the procedures and functions manage the local profiling session.
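Putting these pieces together, a minimal local profiling session looks like this (SOME_PROC is a placeholder for the code you want to profile):

```sql
-- start a local profiling session
SELECT RDB$PROFILER.START_SESSION('my session') FROM RDB$DATABASE;

-- run the code to be profiled (SOME_PROC is a placeholder)
EXECUTE PROCEDURE SOME_PROC;

-- finish the session and flush statistics to the snapshot tables
EXECUTE PROCEDURE RDB$PROFILER.FINISH_SESSION;
COMMIT;
```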
1.1. Starting a Profiling Session
To start a profiling session, call the RDB$PROFILER.START_SESSION function, which returns the profiling session identifier.
This function has the following parameters:
- DESCRIPTION of type VARCHAR(255) CHARACTER SET UTF8, default NULL;
- FLUSH_INTERVAL of type INTEGER, default NULL;
- ATTACHMENT_ID of type BIGINT, default NULL (meaning CURRENT_CONNECTION);
- PLUGIN_NAME of type VARCHAR(255) CHARACTER SET UTF8, default NULL;
- PLUGIN_OPTIONS of type VARCHAR(255) CHARACTER SET UTF8, default NULL.
You can pass an arbitrary text description of the session in the DESCRIPTION parameter.
If the FLUSH_INTERVAL parameter is not NULL, it sets the interval for automatic flushing of statistics to the snapshot tables, as when calling the RDB$PROFILER.SET_FLUSH_INTERVAL procedure manually. If FLUSH_INTERVAL is greater than zero, automatic flushing is enabled; otherwise, it is disabled. FLUSH_INTERVAL is measured in seconds.
If ATTACHMENT_ID is not NULL, the profiling session is started for a remote connection; otherwise, the session starts for the current connection.
The PLUGIN_NAME parameter specifies which profiling plugin is used for the session. If it is NULL, the plugin specified in the DefaultProfilerPlugin configuration parameter is used.
Each profiling plugin may have its own options, which can be passed in the PLUGIN_OPTIONS parameter. For the Default_Profiler plugin included in the standard Firebird 5.0 distribution, the allowed values are NULL and 'DETAILED_REQUESTS'.
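For instance, a session with automatic flushing every 5 seconds that explicitly names the default plugin could be started like this (the description text and interval are arbitrary examples):

```sql
SELECT RDB$PROFILER.START_SESSION('batch import profiling', 5, NULL, 'Default_Profiler', NULL)
FROM RDB$DATABASE;
```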
When DETAILED_REQUESTS is used, the PLG$PROF_REQUESTS table stores detailed query data, i.e., one record for each SQL statement call. This can produce a large number of records, which slows down RDB$PROFILER.FLUSH.
When DETAILED_REQUESTS is not used (the default), the PLG$PROF_REQUESTS table saves one aggregated record per SQL statement, with REQUEST_ID = 0.
Note
Here, an SQL statement refers to a prepared SQL query stored in the prepared statement cache. Queries are considered the same only if they match exactly, character by character: semantically identical queries that differ, say, in comments are different queries for the cache. A prepared query can be executed multiple times with different sets of input parameters.
1.2. Pausing a Profiling Session
The RDB$PROFILER.PAUSE_SESSION procedure pauses the current profiling session (with the given ATTACHMENT_ID). For a paused session, execution statistics for subsequent SQL statements are not collected.
If the FLUSH parameter is TRUE, the snapshot tables are updated with the profiling data collected so far; otherwise, the data remains only in memory for a later flush.
Calling RDB$PROFILER.PAUSE_SESSION(TRUE) has the same effect as calling RDB$PROFILER.PAUSE_SESSION(FALSE) followed by RDB$PROFILER.FLUSH (with the same ATTACHMENT_ID).
The procedure has the following parameters:
- FLUSH of type BOOLEAN NOT NULL, default FALSE;
- ATTACHMENT_ID of type BIGINT, default NULL (meaning CURRENT_CONNECTION).
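For example, to pause the local session and flush the statistics collected so far:

```sql
EXECUTE PROCEDURE RDB$PROFILER.PAUSE_SESSION(TRUE);
```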
1.3. Resuming a Profiling Session
The RDB$PROFILER.RESUME_SESSION procedure resumes the current profiling session (with the given ATTACHMENT_ID) if it was paused. After the session is resumed, execution statistics for subsequent SQL statements are collected again.
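A paused local session is resumed with a single call:

```sql
EXECUTE PROCEDURE RDB$PROFILER.RESUME_SESSION;
```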
1.4. Finishing a Profiling Session
The RDB$PROFILER.FINISH_SESSION procedure finishes the current profiling session (with the given ATTACHMENT_ID).
If the FLUSH parameter is TRUE, the snapshot tables are updated with data from the finished session (and from old finished sessions not yet present in the snapshot); otherwise, the data remains only in memory for a later flush.
Calling RDB$PROFILER.FINISH_SESSION(TRUE) has the same effect as calling RDB$PROFILER.FINISH_SESSION(FALSE) followed by RDB$PROFILER.FLUSH (with the same ATTACHMENT_ID).
The procedure has the following parameters:
- FLUSH of type BOOLEAN NOT NULL, default TRUE;
- ATTACHMENT_ID of type BIGINT, default NULL (meaning CURRENT_CONNECTION).
1.8. Canceling a Profiling Session
The RDB$PROFILER.CANCEL_SESSION procedure cancels the current profiling session (with the given ATTACHMENT_ID).
All session data present in the profiler plugin's memory is destroyed and is not flushed to the snapshot tables. Data that has already been flushed is not deleted automatically.
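Canceling the local session is a single call:

```sql
EXECUTE PROCEDURE RDB$PROFILER.CANCEL_SESSION;
```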
1.9. Discarding Profiling Sessions
The RDB$PROFILER.DISCARD procedure removes all sessions (with the given ATTACHMENT_ID) from memory without flushing them to the snapshot tables. If there is an active profiling session, it is canceled as well.
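For example, to drop all in-memory sessions of the local connection, or of a remote one (29 here stands for a connection identifier taken, say, from MON$ATTACHMENTS):

```sql
-- discard sessions of the current connection
EXECUTE PROCEDURE RDB$PROFILER.DISCARD;
-- discard sessions of a remote connection
EXECUTE PROCEDURE RDB$PROFILER.DISCARD(29);
```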
1.10. Flushing Profiling Session Statistics to Snapshot Tables
The RDB$PROFILER.FLUSH procedure updates the snapshot tables with data from profiling sessions (with the given ATTACHMENT_ID).
After the statistics are flushed, the data is stored in the tables PLG$PROF_SESSIONS, PLG$PROF_STATEMENTS, PLG$PROF_RECORD_SOURCES, PLG$PROF_REQUESTS, PLG$PROF_PSQL_STATS, and PLG$PROF_RECORD_SOURCE_STATS, and can be read and analyzed by the user.
Data is updated using an autonomous transaction, so if the procedure is called in a snapshot transaction, the data will not be visible to an immediate read in the same transaction.
After the statistics are flushed, finished profiling sessions are removed from memory.
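A typical flush followed by reading the results might look like this:

```sql
EXECUTE PROCEDURE RDB$PROFILER.FLUSH;
COMMIT;

-- the flushed sessions are now visible
SELECT PROFILE_ID, DESCRIPTION, START_TIMESTAMP, FINISH_TIMESTAMP
FROM PLG$PROF_SESSIONS;
```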
1.11. Setting the Statistics Flush Interval
The RDB$PROFILER.SET_FLUSH_INTERVAL procedure enables periodic automatic flushing of statistics (when FLUSH_INTERVAL is greater than 0) or disables it (when FLUSH_INTERVAL is 0). The FLUSH_INTERVAL parameter is interpreted as a number of seconds.
The procedure has the following parameters:
- FLUSH_INTERVAL of type INTEGER NOT NULL;
- ATTACHMENT_ID of type BIGINT, default NULL (meaning CURRENT_CONNECTION).
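For example, to enable a 10-second automatic flush for the local session and later turn it off:

```sql
-- flush statistics automatically every 10 seconds
EXECUTE PROCEDURE RDB$PROFILER.SET_FLUSH_INTERVAL(10);
-- disable automatic flushing
EXECUTE PROCEDURE RDB$PROFILER.SET_FLUSH_INTERVAL(0);
```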
1.12. Snapshot Tables
Snapshot tables (as well as views and sequences) are created automatically when the profiler is used for the first time. They belong to the database owner, with read/write permissions for PUBLIC.
When a profiling session is deleted, the associated data in the other profiler snapshot tables is deleted automatically via foreign keys with the ON DELETE CASCADE option.
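Because of the cascading foreign keys, clearing out an old session is a single statement (12 here is a hypothetical session identifier):

```sql
-- removes the session and all statistics that reference it
DELETE FROM PLG$PROF_SESSIONS WHERE PROFILE_ID = 12;
COMMIT;
```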
Below is a list of tables that store profiling data.
1.12.1. Table PLG$PROF_SESSIONS
The PLG$PROF_SESSIONS table contains information about profiling sessions.

Table 1. Description of columns in the PLG$PROF_SESSIONS table

Column Name      | Data Type                | Description
PROFILE_ID       | BIGINT                   | Profiling session identifier.
ATTACHMENT_ID    | BIGINT                   | Identifier of the connection for which the profiling session was started.
USER_NAME        | CHAR(63)                 | Name of the user who started the profiling session.
DESCRIPTION      | VARCHAR(255)             | Description passed in the DESCRIPTION parameter of RDB$PROFILER.START_SESSION.
START_TIMESTAMP  | TIMESTAMP WITH TIME ZONE | Start time of the profiling session.
FINISH_TIMESTAMP | TIMESTAMP WITH TIME ZONE | End time of the profiling session (NULL if the session is not finished).
1.12.2. Table PLG$PROF_STATEMENTS
The PLG$PROF_STATEMENTS table contains information about statements.

Table 2. Description of columns in the PLG$PROF_STATEMENTS table

Column Name         | Data Type          | Description
PROFILE_ID          | BIGINT             | Profiling session identifier.
STATEMENT_ID        | BIGINT             | Statement identifier.
PARENT_STATEMENT_ID | BIGINT             | Identifier of the parent statement. Applies to subroutines.
STATEMENT_TYPE      | VARCHAR(20)        | Statement type: BLOCK, FUNCTION, PROCEDURE, or TRIGGER.
PACKAGE_NAME        | CHAR(63)           | Package name.
ROUTINE_NAME        | CHAR(63)           | Function, procedure, or trigger name.
SQL_TEXT            | BLOB SUB_TYPE TEXT | SQL text for statements of type BLOCK.

Primary key: PROFILE_ID, STATEMENT_ID.
1.12.3. Table PLG$PROF_REQUESTS
The PLG$PROF_REQUESTS table contains statistics on SQL statement execution.
If the profiler is started with the DETAILED_REQUESTS option, the PLG$PROF_REQUESTS table stores detailed query data, i.e., one record for each statement call. This can produce a large number of records, which slows down RDB$PROFILER.FLUSH.
When DETAILED_REQUESTS is not used (the default), the PLG$PROF_REQUESTS table saves one aggregated record per statement, with REQUEST_ID = 0.

Table 3. Description of columns in the PLG$PROF_REQUESTS table

Column Name         | Data Type                | Description
PROFILE_ID          | BIGINT                   | Profiling session identifier.
STATEMENT_ID        | BIGINT                   | Statement identifier.
REQUEST_ID          | BIGINT                   | Request identifier.
CALLER_STATEMENT_ID | BIGINT                   | Identifier of the calling statement.
CALLER_REQUEST_ID   | BIGINT                   | Identifier of the calling request.
START_TIMESTAMP     | TIMESTAMP WITH TIME ZONE | Request start time.
FINISH_TIMESTAMP    | TIMESTAMP WITH TIME ZONE | Request finish time.
TOTAL_ELAPSED_TIME  | BIGINT                   | Accumulated request execution time (in nanoseconds).

Primary key: PROFILE_ID, STATEMENT_ID, REQUEST_ID.
1.12.4. Table PLG$PROF_CURSORS
The PLG$PROF_CURSORS table contains information about cursors.

Table 4. Description of columns in the PLG$PROF_CURSORS table

Column Name  | Data Type | Description
PROFILE_ID   | BIGINT    | Profiling session identifier.
STATEMENT_ID | BIGINT    | Statement identifier.
CURSOR_ID    | BIGINT    | Cursor identifier.
NAME         | CHAR(63)  | Name of an explicitly declared cursor.
LINE_NUM     | INTEGER   | PSQL line number where the cursor is defined.
COLUMN_NUM   | INTEGER   | PSQL column number where the cursor is defined.

Primary key: PROFILE_ID, STATEMENT_ID, CURSOR_ID.
1.12.5. Table PLG$PROF_RECORD_SOURCES
The PLG$PROF_RECORD_SOURCES table contains information about data sources.

Table 5. Description of columns in the PLG$PROF_RECORD_SOURCES table

Column Name             | Data Type          | Description
PROFILE_ID              | BIGINT             | Profiling session identifier.
STATEMENT_ID            | BIGINT             | Statement identifier.
CURSOR_ID               | BIGINT             | Cursor identifier.
RECORD_SOURCE_ID        | BIGINT             | Data source identifier.
PARENT_RECORD_SOURCE_ID | BIGINT             | Identifier of the parent data source.
LEVEL                   | INTEGER            | Indentation level of the data source; needed when reconstructing the detailed plan.
ACCESS_PATH             | BLOB SUB_TYPE TEXT | Description of the access method of the data source.

Primary key: PROFILE_ID, STATEMENT_ID, CURSOR_ID, RECORD_SOURCE_ID.
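Using the LEVEL column, the expanded plan of a statement can be reconstructed with a query along these lines; this is only a sketch, assuming record sources are listed in plan order by RECORD_SOURCE_ID (as in the examples later in this document), and the PROFILE_ID and STATEMENT_ID values are placeholders:

```sql
SELECT
  RS.CURSOR_ID,
  -- indent each access method according to its level in the plan tree
  LPAD('', RS."LEVEL" * 2, ' ') || RS.ACCESS_PATH AS PLAN_LINE
FROM PLG$PROF_RECORD_SOURCES RS
WHERE RS.PROFILE_ID = 12
  AND RS.STATEMENT_ID = 2149
ORDER BY RS.CURSOR_ID, RS.RECORD_SOURCE_ID;
```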
1.12.6. Table PLG$PROF_RECORD_SOURCE_STATS
The PLG$PROF_RECORD_SOURCE_STATS table contains statistics on data sources.

Table 6. Description of columns in the PLG$PROF_RECORD_SOURCE_STATS table

Column Name              | Data Type | Description
PROFILE_ID               | BIGINT    | Profiling session identifier.
STATEMENT_ID             | BIGINT    | Statement identifier.
REQUEST_ID               | BIGINT    | Request identifier.
CURSOR_ID                | BIGINT    | Cursor identifier.
RECORD_SOURCE_ID         | BIGINT    | Data source identifier.
OPEN_COUNTER             | BIGINT    | Number of times the data source was opened.
OPEN_MIN_ELAPSED_TIME    | BIGINT    | Minimum time to open the data source (in nanoseconds).
OPEN_MAX_ELAPSED_TIME    | BIGINT    | Maximum time to open the data source (in nanoseconds).
OPEN_TOTAL_ELAPSED_TIME  | BIGINT    | Accumulated time to open the data source (in nanoseconds).
FETCH_COUNTER            | BIGINT    | Number of fetches from the data source.
FETCH_MIN_ELAPSED_TIME   | BIGINT    | Minimum time to fetch a record from the data source (in nanoseconds).
FETCH_MAX_ELAPSED_TIME   | BIGINT    | Maximum time to fetch a record from the data source (in nanoseconds).
FETCH_TOTAL_ELAPSED_TIME | BIGINT    | Accumulated time to fetch records from the data source (in nanoseconds).

Primary key: PROFILE_ID, STATEMENT_ID, REQUEST_ID, CURSOR_ID, RECORD_SOURCE_ID.
1.12.7. Table PLG$PROF_PSQL_STATS
The PLG$PROF_PSQL_STATS table contains PSQL statistics.

Table 7. Description of columns in the PLG$PROF_PSQL_STATS table

Column Name        | Data Type | Description
PROFILE_ID         | BIGINT    | Profiling session identifier.
STATEMENT_ID       | BIGINT    | Statement identifier.
REQUEST_ID         | BIGINT    | Request identifier.
LINE_NUM           | INTEGER   | PSQL line number of the statement.
COLUMN_NUM         | INTEGER   | PSQL column number of the statement.
COUNTER            | BIGINT    | Number of executions of the line/column.
MIN_ELAPSED_TIME   | BIGINT    | Minimum execution time (in nanoseconds) of the line/column.
MAX_ELAPSED_TIME   | BIGINT    | Maximum execution time (in nanoseconds) of the line/column.
TOTAL_ELAPSED_TIME | BIGINT    | Accumulated execution time (in nanoseconds) of the line/column.

Primary key: PROFILE_ID, STATEMENT_ID, REQUEST_ID, LINE_NUM, COLUMN_NUM.
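As a sketch of the per-statement aggregation mentioned earlier, a "top 10 hottest PSQL lines" query over this table might look like this, joined to PLG$PROF_STATEMENTS for the routine names (the session identifier 5 is a placeholder):

```sql
SELECT
  STMT.ROUTINE_NAME,
  PS.LINE_NUM,
  PS.COLUMN_NUM,
  -- aggregate over all requests of the statement
  SUM(PS.COUNTER) AS COUNTER,
  SUM(PS.TOTAL_ELAPSED_TIME) AS TOTAL_ELAPSED_TIME
FROM PLG$PROF_PSQL_STATS PS
JOIN PLG$PROF_STATEMENTS STMT
  ON STMT.PROFILE_ID = PS.PROFILE_ID AND STMT.STATEMENT_ID = PS.STATEMENT_ID
WHERE PS.PROFILE_ID = 5
GROUP BY STMT.ROUTINE_NAME, PS.LINE_NUM, PS.COLUMN_NUM
ORDER BY TOTAL_ELAPSED_TIME DESC
ROWS 10;
```

In practice the ready-made view PLG$PROF_PSQL_STATS_VIEW, described in the next section, performs this kind of aggregation for you.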
1.13. Auxiliary Views
In addition to the snapshot tables, the profiling plugin creates auxiliary views. These views help extract profiling data aggregated at the statement level.
The auxiliary views are the preferred starting point for analyzing profiling data and quickly finding "hot spots". They can also be used in conjunction with the snapshot tables: once the hot spots are found, you can drill down to the request level using the tables.
The Default_Profiler provides the following views:
- PLG$PROF_PSQL_STATS_VIEW - aggregated PSQL statistics in a profiling session.
- PLG$PROF_RECORD_SOURCE_STATS_VIEW - aggregated statistics on data sources in a profiling session.
- PLG$PROF_STATEMENT_STATS_VIEW - aggregated statistics of SQL statements in a profiling session.
This document does not reproduce the source code of these views or describe their columns; you can always inspect the view definitions yourself, and a description of the columns can be found in the "Firebird 5.0 SQL Language Reference".
1.14. Profiler Launch Modes
Before moving on to real-world examples of using the profiler, I'll demonstrate the difference between the modes in which the profiling plugin can run.
1.14.1. Option DETAILED_REQUESTS
Note
The profiling statistics also include the SELECT RDB$PROFILER.START_SESSION() … query itself and the statistics of the RDB$PROFILER.START_SESSION function call.
To limit the output, I add a filter on the query text; in this case, statistics are shown only for queries containing 'FROM HORSE'.
SELECT RDB$PROFILER.START_SESSION('Profile without "DETAILED_REQUESTS"')
FROM RDB$DATABASE;
SELECT COUNT(*) FROM HORSE;
SELECT COUNT(*) FROM HORSE;
SELECT RDB$PROFILER.START_SESSION('Profile with "DETAILED_REQUESTS"',
NULL, NULL, NULL, 'DETAILED_REQUESTS')
FROM RDB$DATABASE;
SELECT COUNT(*) FROM HORSE;
SELECT COUNT(*) FROM HORSE;
EXECUTE PROCEDURE RDB$PROFILER.FINISH_SESSION;
COMMIT;
SELECT
S.DESCRIPTION,
V.*
FROM PLG$PROF_STATEMENT_STATS_VIEW V
JOIN PLG$PROF_SESSIONS S ON S.PROFILE_ID = V.PROFILE_ID
WHERE V.SQL_TEXT CONTAINING 'FROM HORSE';
DESCRIPTION Profile without "DETAILED_REQUESTS"
PROFILE_ID 12
STATEMENT_ID 2149
STATEMENT_TYPE BLOCK
PACKAGE_NAME <null>
ROUTINE_NAME <null>
PARENT_STATEMENT_ID <null>
PARENT_STATEMENT_TYPE <null>
PARENT_ROUTINE_NAME <null>
SQL_TEXT 13e:9
SELECT COUNT(*) FROM HORSE
COUNTER 1
MIN_ELAPSED_TIME <null>
MAX_ELAPSED_TIME <null>
TOTAL_ELAPSED_TIME <null>
AVG_ELAPSED_TIME <null>
DESCRIPTION Profile with "DETAILED_REQUESTS"
PROFILE_ID 13
STATEMENT_ID 2149
STATEMENT_TYPE BLOCK
PACKAGE_NAME <null>
ROUTINE_NAME <null>
PARENT_STATEMENT_ID <null>
PARENT_STATEMENT_TYPE <null>
PARENT_ROUTINE_NAME <null>
SQL_TEXT 13e:b
SELECT COUNT(*) FROM HORSE
COUNTER 2
MIN_ELAPSED_TIME 165498198
MAX_ELAPSED_TIME 235246029
TOTAL_ELAPSED_TIME 400744227
AVG_ELAPSED_TIME 200372113
As you can see, if we run a profiler session without the 'DETAILED_REQUESTS' option, the view gives us less detail. At a minimum, there are no query execution statistics, and it appears as if the query was executed once. Let’s try to detail this data using a query to the snapshot tables.
First, let’s look at the session details without the 'DETAILED_REQUESTS' option.
SELECT
S.PROFILE_ID,
S.DESCRIPTION,
R.REQUEST_ID,
STMT.STATEMENT_ID,
STMT.STATEMENT_TYPE,
STMT.PACKAGE_NAME,
STMT.ROUTINE_NAME,
STMT.SQL_TEXT,
R.CALLER_STATEMENT_ID,
R.CALLER_REQUEST_ID,
R.START_TIMESTAMP,
R.FINISH_TIMESTAMP,
R.TOTAL_ELAPSED_TIME
FROM PLG$PROF_SESSIONS S
JOIN PLG$PROF_STATEMENTS STMT ON STMT.PROFILE_ID = S.PROFILE_ID
JOIN PLG$PROF_REQUESTS R ON R.PROFILE_ID = S.PROFILE_ID AND R.STATEMENT_ID = STMT.STATEMENT_ID
WHERE S.PROFILE_ID = 12
AND STMT.SQL_TEXT CONTAINING 'FROM HORSE';
PROFILE_ID 12
DESCRIPTION Profile without "DETAILED_REQUESTS"
REQUEST_ID 0
STATEMENT_ID 2149
STATEMENT_TYPE BLOCK
PACKAGE_NAME <null>
ROUTINE_NAME <null>
SQL_TEXT 13e:9
SELECT COUNT(*) FROM HORSE
CALLER_STATEMENT_ID <null>
CALLER_REQUEST_ID <null>
START_TIMESTAMP 2023-11-09 15:48:59.2250
FINISH_TIMESTAMP <null>
TOTAL_ELAPSED_TIME <null>
Now let’s compare it with a session with the 'DETAILED_REQUESTS' option.
SELECT
S.PROFILE_ID,
S.DESCRIPTION,
R.REQUEST_ID,
STMT.STATEMENT_ID,
STMT.STATEMENT_TYPE,
STMT.PACKAGE_NAME,
STMT.ROUTINE_NAME,
STMT.SQL_TEXT,
R.CALLER_STATEMENT_ID,
R.CALLER_REQUEST_ID,
R.START_TIMESTAMP,
R.FINISH_TIMESTAMP,
R.TOTAL_ELAPSED_TIME
FROM PLG$PROF_SESSIONS S
JOIN PLG$PROF_STATEMENTS STMT ON STMT.PROFILE_ID = S.PROFILE_ID
JOIN PLG$PROF_REQUESTS R ON R.PROFILE_ID = S.PROFILE_ID AND R.STATEMENT_ID = STMT.STATEMENT_ID
WHERE S.PROFILE_ID = 13
AND STMT.SQL_TEXT CONTAINING 'FROM HORSE';
PROFILE_ID 13
DESCRIPTION Profile with "DETAILED_REQUESTS"
REQUEST_ID 2474
STATEMENT_ID 2149
STATEMENT_TYPE BLOCK
PACKAGE_NAME <null>
ROUTINE_NAME <null>
SQL_TEXT 13e:b
SELECT COUNT(*) FROM HORSE
CALLER_STATEMENT_ID <null>
CALLER_REQUEST_ID <null>
START_TIMESTAMP 2023-11-09 15:49:01.6540
FINISH_TIMESTAMP 2023-11-09 15:49:02.8360
TOTAL_ELAPSED_TIME 165498198
PROFILE_ID 13
DESCRIPTION Profile with "DETAILED_REQUESTS"
REQUEST_ID 2475
STATEMENT_ID 2149
STATEMENT_TYPE BLOCK
PACKAGE_NAME <null>
ROUTINE_NAME <null>
SQL_TEXT 13e:b
SELECT COUNT(*) FROM HORSE
CALLER_STATEMENT_ID <null>
CALLER_REQUEST_ID <null>
START_TIMESTAMP 2023-11-09 15:49:02.8470
FINISH_TIMESTAMP 2023-11-09 15:49:04.0980
TOTAL_ELAPSED_TIME 235246029
Thus, if the profiler is run without the 'DETAILED_REQUESTS' option, all runs of the same SQL statement appear as a single run in which statistics are accumulated. Without 'DETAILED_REQUESTS', no execution statistics are recorded in the PLG$PROF_REQUESTS table itself, but they are aggregated in the other tables.
SELECT
R.PROFILE_ID,
R.STATEMENT_ID,
RS.CURSOR_ID,
RS.RECORD_SOURCE_ID,
RS.PARENT_RECORD_SOURCE_ID,
RS."LEVEL",
RS.ACCESS_PATH,
RSS.OPEN_COUNTER,
RSS.OPEN_MIN_ELAPSED_TIME,
RSS.OPEN_MAX_ELAPSED_TIME,
RSS.OPEN_TOTAL_ELAPSED_TIME,
RSS.FETCH_COUNTER,
RSS.FETCH_MIN_ELAPSED_TIME,
RSS.FETCH_MAX_ELAPSED_TIME,
RSS.FETCH_TOTAL_ELAPSED_TIME
FROM
PLG$PROF_REQUESTS R
JOIN PLG$PROF_RECORD_SOURCES RS
ON RS.PROFILE_ID = R.PROFILE_ID AND
RS.STATEMENT_ID = R.STATEMENT_ID
JOIN PLG$PROF_RECORD_SOURCE_STATS RSS
ON RSS.PROFILE_ID = R.PROFILE_ID AND
RSS.STATEMENT_ID = R.STATEMENT_ID AND
RSS.REQUEST_ID = R.REQUEST_ID AND
RSS.CURSOR_ID = RS.CURSOR_ID AND
RSS.RECORD_SOURCE_ID = RS.RECORD_SOURCE_ID
WHERE R.PROFILE_ID = 12
AND R.STATEMENT_ID = 2149
ORDER BY RSS.REQUEST_ID, RSS.RECORD_SOURCE_ID
PROFILE_ID 12
STATEMENT_ID 2149
CURSOR_ID 1
RECORD_SOURCE_ID 1
PARENT_RECORD_SOURCE_ID <null>
LEVEL 0
ACCESS_PATH 140:f
Select Expression
OPEN_COUNTER 2
OPEN_MIN_ELAPSED_TIME 10266
OPEN_MAX_ELAPSED_TIME 10755
OPEN_TOTAL_ELAPSED_TIME 21022
FETCH_COUNTER 4
FETCH_MIN_ELAPSED_TIME 0
FETCH_MAX_ELAPSED_TIME 191538868
FETCH_TOTAL_ELAPSED_TIME 356557956
PROFILE_ID 12
STATEMENT_ID 2149
CURSOR_ID 1
RECORD_SOURCE_ID 2
PARENT_RECORD_SOURCE_ID 1
LEVEL 1
ACCESS_PATH 140:10
-> Aggregate
OPEN_COUNTER 2
OPEN_MIN_ELAPSED_TIME 9777
OPEN_MAX_ELAPSED_TIME 9777
OPEN_TOTAL_ELAPSED_TIME 19555
FETCH_COUNTER 4
FETCH_MIN_ELAPSED_TIME 0
FETCH_MAX_ELAPSED_TIME 191538379
FETCH_TOTAL_ELAPSED_TIME 356556489
PROFILE_ID 12
STATEMENT_ID 2149
CURSOR_ID 1
RECORD_SOURCE_ID 3
PARENT_RECORD_SOURCE_ID 2
LEVEL 2
ACCESS_PATH 140:11
-> Table "HORSE" Full Scan
OPEN_COUNTER 2
OPEN_MIN_ELAPSED_TIME 2444
OPEN_MAX_ELAPSED_TIME 3911
OPEN_TOTAL_ELAPSED_TIME 6355
FETCH_COUNTER 1039248
FETCH_MIN_ELAPSED_TIME 0
FETCH_MAX_ELAPSED_TIME 905422
FETCH_TOTAL_ELAPSED_TIME 330562264
Here you can see that all statistics are "doubled" because the request was run twice. Now let’s look at the statistics with 'DETAILED_REQUESTS'.
SELECT
R.PROFILE_ID,
R.STATEMENT_ID,
RS.CURSOR_ID,
RS.RECORD_SOURCE_ID,
RS.PARENT_RECORD_SOURCE_ID,
RS."LEVEL",
RS.ACCESS_PATH,
RSS.OPEN_COUNTER,
RSS.OPEN_MIN_ELAPSED_TIME,
RSS.OPEN_MAX_ELAPSED_TIME,
RSS.OPEN_TOTAL_ELAPSED_TIME,
RSS.FETCH_COUNTER,
RSS.FETCH_MIN_ELAPSED_TIME,
RSS.FETCH_MAX_ELAPSED_TIME,
RSS.FETCH_TOTAL_ELAPSED_TIME
FROM
PLG$PROF_REQUESTS R
JOIN PLG$PROF_RECORD_SOURCES RS
ON RS.PROFILE_ID = R.PROFILE_ID AND
RS.STATEMENT_ID = R.STATEMENT_ID
JOIN PLG$PROF_RECORD_SOURCE_STATS RSS
ON RSS.PROFILE_ID = R.PROFILE_ID AND
RSS.STATEMENT_ID = R.STATEMENT_ID AND
RSS.REQUEST_ID = R.REQUEST_ID AND
RSS.CURSOR_ID = RS.CURSOR_ID AND
RSS.RECORD_SOURCE_ID = RS.RECORD_SOURCE_ID
WHERE R.PROFILE_ID = 13
AND R.STATEMENT_ID = 2149
ORDER BY RSS.REQUEST_ID, RSS.RECORD_SOURCE_ID
PROFILE_ID 13
STATEMENT_ID 2149
CURSOR_ID 1
RECORD_SOURCE_ID 1
PARENT_RECORD_SOURCE_ID <null>
LEVEL 0
ACCESS_PATH 140:14
Select Expression
OPEN_COUNTER 1
OPEN_MIN_ELAPSED_TIME 20044
OPEN_MAX_ELAPSED_TIME 20044
OPEN_TOTAL_ELAPSED_TIME 20044
FETCH_COUNTER 2
FETCH_MIN_ELAPSED_TIME 0
FETCH_MAX_ELAPSED_TIME 165438065
FETCH_TOTAL_ELAPSED_TIME 165438065
PROFILE_ID 13
STATEMENT_ID 2149
CURSOR_ID 1
RECORD_SOURCE_ID 2
PARENT_RECORD_SOURCE_ID 1
LEVEL 1
ACCESS_PATH 140:15
-> Aggregate
OPEN_COUNTER 1
OPEN_MIN_ELAPSED_TIME 19066
OPEN_MAX_ELAPSED_TIME 19066
OPEN_TOTAL_ELAPSED_TIME 19066
FETCH_COUNTER 2
FETCH_MIN_ELAPSED_TIME 0
FETCH_MAX_ELAPSED_TIME 165437576
FETCH_TOTAL_ELAPSED_TIME 165437576
PROFILE_ID 13
STATEMENT_ID 2149
CURSOR_ID 1
RECORD_SOURCE_ID 3
PARENT_RECORD_SOURCE_ID 2
LEVEL 2
ACCESS_PATH 140:16
-> Table "HORSE" Full Scan
OPEN_COUNTER 1
OPEN_MIN_ELAPSED_TIME 2444
OPEN_MAX_ELAPSED_TIME 2444
OPEN_TOTAL_ELAPSED_TIME 2444
FETCH_COUNTER 519624
FETCH_MIN_ELAPSED_TIME 0
FETCH_MAX_ELAPSED_TIME 892222
FETCH_TOTAL_ELAPSED_TIME 161990420
PROFILE_ID 13
STATEMENT_ID 2149
CURSOR_ID 1
RECORD_SOURCE_ID 1
PARENT_RECORD_SOURCE_ID <null>
LEVEL 0
ACCESS_PATH 140:14
Select Expression
OPEN_COUNTER 1
OPEN_MIN_ELAPSED_TIME 12711
OPEN_MAX_ELAPSED_TIME 12711
OPEN_TOTAL_ELAPSED_TIME 12711
FETCH_COUNTER 2
FETCH_MIN_ELAPSED_TIME 488
FETCH_MAX_ELAPSED_TIME 235217674
FETCH_TOTAL_ELAPSED_TIME 235218163
PROFILE_ID 13
STATEMENT_ID 2149
CURSOR_ID 1
RECORD_SOURCE_ID 2
PARENT_RECORD_SOURCE_ID 1
LEVEL 1
ACCESS_PATH 140:15
-> Aggregate
OPEN_COUNTER 1
OPEN_MIN_ELAPSED_TIME 11244
OPEN_MAX_ELAPSED_TIME 11244
OPEN_TOTAL_ELAPSED_TIME 11244
FETCH_COUNTER 2
FETCH_MIN_ELAPSED_TIME 0
FETCH_MAX_ELAPSED_TIME 235217674
FETCH_TOTAL_ELAPSED_TIME 235217674
PROFILE_ID 13
STATEMENT_ID 2149
CURSOR_ID 1
RECORD_SOURCE_ID 3
PARENT_RECORD_SOURCE_ID 2
LEVEL 2
ACCESS_PATH 140:16
-> Table "HORSE" Full Scan
OPEN_COUNTER 1
OPEN_MIN_ELAPSED_TIME 2444
OPEN_MAX_ELAPSED_TIME 2444
OPEN_TOTAL_ELAPSED_TIME 2444
FETCH_COUNTER 519624
FETCH_MIN_ELAPSED_TIME 0
FETCH_MAX_ELAPSED_TIME 675155
FETCH_TOTAL_ELAPSED_TIME 196082602
For a session with the 'DETAILED_REQUESTS' option, we see that statistics are collected separately for each SQL statement run.
Note
The 'DETAILED_REQUESTS' option creates one record in the PLG$PROF_REQUESTS table not only for each top-level SQL query, but also for each stored procedure or function call. So if a PSQL function is called on some column in the SELECT clause, PLG$PROF_REQUESTS will contain as many records for that function as there were records fetched. This can significantly slow down the flushing of profiling statistics. Therefore, you should not enable 'DETAILED_REQUESTS' for every profiling session; to find "bottlenecks" in a particular query, it is usually sufficient to leave the PLUGIN_OPTIONS parameter at its default value.
1.14.2. Running the Profiler in a Remote Connection
To start a profiling session on a remote connection, you need to know the identifier of that connection. You can obtain it from the MON$ATTACHMENTS monitoring table, or by querying the CURRENT_CONNECTION context variable in the remote session, and then pass this identifier in the ATTACHMENT_ID parameter of the RDB$PROFILER.START_SESSION function.
Requesting connection ID in session 1
select current_connection from rdb$database;
CURRENT_CONNECTION
=====================
29
Starting a profiling session for the remote connection in session 2
SELECT RDB$PROFILER.START_SESSION('Profile with "DETAILED_REQUESTS"',
NULL, 29, NULL, 'DETAILED_REQUESTS')
FROM RDB$DATABASE;
The request or requests that we profile in session 1
select current_connection from rdb$database;
Stopping remote profiling in session 2
EXECUTE PROCEDURE RDB$PROFILER.FINISH_SESSION(TRUE, 29);
COMMIT;
Now we can see the profiling result in any connection.
SELECT
S.PROFILE_ID,
S.ATTACHMENT_ID,
S.START_TIMESTAMP AS SESSION_START,
S.FINISH_TIMESTAMP AS SESSION_FINISH,
R.REQUEST_ID,
STMT.STATEMENT_ID,
STMT.STATEMENT_TYPE,
STMT.PACKAGE_NAME,
STMT.ROUTINE_NAME,
STMT.SQL_TEXT,
R.CALLER_STATEMENT_ID,
R.CALLER_REQUEST_ID,
R.START_TIMESTAMP,
R.FINISH_TIMESTAMP,
R.TOTAL_ELAPSED_TIME
FROM PLG$PROF_SESSIONS S
JOIN PLG$PROF_STATEMENTS STMT ON STMT.PROFILE_ID = S.PROFILE_ID
JOIN PLG$PROF_REQUESTS R ON R.PROFILE_ID = S.PROFILE_ID AND R.STATEMENT_ID = STMT.STATEMENT_ID
WHERE S.ATTACHMENT_ID = 29
AND STMT.SQL_TEXT CONTAINING 'FROM HORSE';
PROFILE_ID 14
ATTACHMENT_ID 29
SESSION_START 2023-11-09 16:56:39.1640
SESSION_FINISH 2023-11-09 16:57:41.0010
REQUEST_ID 3506
STATEMENT_ID 3506
STATEMENT_TYPE BLOCK
PACKAGE_NAME <null>
ROUTINE_NAME <null>
SQL_TEXT 13e:1
SELECT COUNT(*) FROM HORSE
CALLER_STATEMENT_ID <null>
CALLER_REQUEST_ID <null>
START_TIMESTAMP 2023-11-09 16:57:29.1010
FINISH_TIMESTAMP 2023-11-09 16:57:30.4800
TOTAL_ELAPSED_TIME 82622
1.15. Examples of using the profiler to find "bottlenecks"
Now that you’ve become familiar with the different modes for launching a profiling session, it’s time to show how the profiler can help you find “bottlenecks” in your SQL queries and PSQL modules.
Suppose that, using tracing, you have found the following slow query:
SELECT
CODE_HORSE,
BYDATE,
HORSENAME,
FRISK,
OWNERNAME
FROM SP_SOME_STAT(58, '2.00,0', 2020, 2023)
We need to understand why the query is slow and fix it. The first step is to eliminate the time spent sending records to the client. To do this, wrap the query in an outer query that simply counts the records: all records are still read, but only one row is sent to the client.
SELECT COUNT(*)
FROM (
SELECT
CODE_HORSE,
BYDATE,
HORSENAME,
FRISK,
OWNERNAME
FROM SP_SOME_STAT(58, '2.00,0', 2020, 2023)
);
The execution statistics for this SQL query look like this:
COUNT
=====================
240
Current memory = 554444768
Delta memory = 17584
Max memory = 554469104
Elapsed time = 2.424 sec
Buffers = 32768
Reads = 0
Writes = 0
Fetches = 124985
Note
|
In this case, the list of fields could simply be replaced with COUNT(*) without wrapping the query in a derived table; however, in the general case the SELECT clause can contain various expressions, including subqueries, so it is safer to wrap the query in a derived table as shown above.
|
Now you can run the query in the profiler:
SELECT RDB$PROFILER.START_SESSION('Profile procedure SP_SOME_STAT')
FROM RDB$DATABASE;
SELECT COUNT(*)
FROM (
SELECT
CODE_HORSE,
BYDATE,
HORSENAME,
FRISK,
OWNERNAME
FROM SP_SOME_STAT(58, '2.00,0', 2020, 2023)
);
EXECUTE PROCEDURE RDB$PROFILER.FINISH_SESSION;
COMMIT;
First, let’s look at the PSQL statistics:
SELECT
ROUTINE_NAME,
LINE_NUM,
COLUMN_NUM,
COUNTER,
TOTAL_ELAPSED_TIME,
AVG_ELAPSED_TIME
FROM PLG$PROF_PSQL_STATS_VIEW STAT
WHERE STAT.PROFILE_ID = 5;
ROUTINE_NAME LINE_NUM COLUMN_NUM COUNTER TOTAL_ELAPSED_TIME AVG_ELAPSED_TIME
==================== ============ ============ =============== ===================== =====================
SF_RACETIME_TO_SEC 22 5 67801 1582095600 23334
SF_RACETIME_TO_SEC 18 3 67801 90068700 1328
SF_RACETIME_TO_SEC 27 5 67801 53903300 795
SF_RACETIME_TO_SEC 31 5 67801 49835400 735
SP_SOME_STAT 16 3 1 43414600 43414600
SF_RACETIME_TO_SEC 25 5 67801 42623200 628
SF_RACETIME_TO_SEC 34 5 67801 37339200 550
SF_RACETIME_TO_SEC 14 3 67801 35822000 528
SF_RACETIME_TO_SEC 29 5 67801 34874400 514
SF_RACETIME_TO_SEC 32 5 67801 24093200 355
SF_RACETIME_TO_SEC 15 3 67801 23832900 351
SF_RACETIME_TO_SEC 6 1 67801 15985600 235
SF_RACETIME_TO_SEC 26 5 67801 15625500 230
SP_SOME_STAT 38 5 240 3454800 14395
SF_SEC_TO_RACETIME 20 3 240 549900 2291
SF_SEC_TO_RACETIME 31 3 240 304100 1267
SF_SEC_TO_RACETIME 21 3 240 294200 1225
SF_SEC_TO_RACETIME 16 3 240 293900 1224
SF_RACETIME_TO_SEC 7 1 67801 202400 2
SF_RACETIME_TO_SEC 8 1 67801 186100 2
ROUTINE_NAME LINE_NUM COLUMN_NUM COUNTER TOTAL_ELAPSED_TIME AVG_ELAPSED_TIME
==================== ============ ============ =============== ===================== =====================
SF_RACETIME_TO_SEC 20 3 67801 168400 2
SF_RACETIME_TO_SEC 9 1 67801 156700 2
SF_RACETIME_TO_SEC 12 1 67801 153900 2
SF_RACETIME_TO_SEC 10 1 67801 153300 2
SF_SEC_TO_RACETIME 18 3 240 148600 619
SF_RACETIME_TO_SEC 16 3 67801 127100 1
SF_SEC_TO_RACETIME 17 3 240 92100 383
SF_SEC_TO_RACETIME 8 1 240 89200 371
SF_RACETIME_TO_SEC 11 1 67801 69500 1
SF_SEC_TO_RACETIME 28 3 240 16600 69
SF_RACETIME_TO_SEC 5 1 67801 7800 0
SF_SEC_TO_RACETIME 11 1 240 2000 8
SF_SEC_TO_RACETIME 10 1 240 1800 7
SF_SEC_TO_RACETIME 9 1 240 1200 5
SP_SOME_STAT 37 5 240 500 2
SF_SEC_TO_RACETIME 13 3 240 500 2
SF_SEC_TO_RACETIME 7 1 240 400 1
From these statistics it is clear that the leader in total execution time is the statement at line 22 of the SF_RACETIME_TO_SEC function. Its average execution time is low, but it is called 67801 times. There are two options for optimization: either optimize the SF_RACETIME_TO_SEC function itself (the statement at line 22), or reduce the number of calls to this function.
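When deciding between these options, it can help to aggregate the per-line statistics by routine to see which PSQL module dominates the total time. A minimal sketch against the same profiler view (using session ID 5, as above):

```sql
-- Sum the per-line statistics by routine for profile session 5;
-- the routine with the largest total elapsed time comes first.
SELECT
  ROUTINE_NAME,
  SUM(COUNTER) AS LINES_EXECUTED,
  SUM(TOTAL_ELAPSED_TIME) AS TOTAL_ELAPSED_TIME
FROM PLG$PROF_PSQL_STATS_VIEW
WHERE PROFILE_ID = 5
GROUP BY ROUTINE_NAME
ORDER BY 3 DESC;
```

Here SF_RACETIME_TO_SEC would clearly come out on top, confirming that its repeated calls are the main cost.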
Let’s look at the contents of the SP_SOME_STAT
procedure.
CREATE OR ALTER PROCEDURE SP_SOME_STAT (
A_CODE_BREED INTEGER,
A_MIN_FRISK VARCHAR(9),
A_YEAR_BEGIN SMALLINT,
A_YEAR_END SMALLINT
)
RETURNS (
CODE_HORSE BIGINT,
BYDATE DATE,
HORSENAME VARCHAR(50),
FRISK VARCHAR(9),
OWNERNAME VARCHAR(120)
)
AS
BEGIN
FOR
SELECT
TL.CODE_HORSE,
TRIAL.BYDATE,
H.NAME,
SF_SEC_TO_RACETIME(TL.TIME_PASSED_SEC) AS FRISK
FROM
TRIAL_LINE TL
JOIN TRIAL ON TRIAL.CODE_TRIAL = TL.CODE_TRIAL
JOIN HORSE H ON H.CODE_HORSE = TL.CODE_HORSE
WHERE TL.TIME_PASSED_SEC <= SF_RACETIME_TO_SEC(:A_MIN_FRISK)
AND TRIAL.CODE_TRIALTYPE = 2
AND H.CODE_BREED = :A_CODE_BREED
AND EXTRACT(YEAR FROM TRIAL.BYDATE) BETWEEN :A_YEAR_BEGIN AND :A_YEAR_END
INTO
CODE_HORSE,
BYDATE,
HORSENAME,
FRISK
DO
BEGIN
OWNERNAME = NULL;
SELECT
FARM.NAME
FROM
(
SELECT
R.CODE_FARM
FROM REGISTRATION R
WHERE R.CODE_HORSE = :CODE_HORSE
AND R.CODE_REGTYPE = 6
AND R.BYDATE <= :BYDATE
ORDER BY R.BYDATE DESC
FETCH FIRST ROW ONLY
) OWN
JOIN FARM ON FARM.CODE_FARM = OWN.CODE_FARM
INTO OWNERNAME;
SUSPEND;
END
END
The SF_RACETIME_TO_SEC function is called at line 21:
WHERE TL.TIME_PASSED_SEC <= SF_RACETIME_TO_SEC(:A_MIN_FRISK)
Obviously, this condition is checked for every record read from the data source, so the repeated execution of the SF_RACETIME_TO_SEC function contributes significantly to the overall execution time. If you look at the call itself, you will notice that the function's arguments do not depend on the data source; that is, the function value is invariant. This means its calculation can be moved outside the query, so the procedure can be rewritten like this:
CREATE OR ALTER PROCEDURE SP_SOME_STAT (
A_CODE_BREED INTEGER,
A_MIN_FRISK VARCHAR(9),
A_YEAR_BEGIN SMALLINT,
A_YEAR_END SMALLINT
)
RETURNS (
CODE_HORSE BIGINT,
BYDATE DATE,
HORSENAME VARCHAR(50),
FRISK VARCHAR(9),
OWNERNAME VARCHAR(120)
)
AS
DECLARE TIME_PASSED NUMERIC(18, 3);
BEGIN
TIME_PASSED = SF_RACETIME_TO_SEC(:A_MIN_FRISK);
FOR
SELECT
TL.CODE_HORSE,
TRIAL.BYDATE,
H.NAME,
SF_SEC_TO_RACETIME(TL.TIME_PASSED_SEC) AS FRISK
FROM
TRIAL_LINE TL
JOIN TRIAL ON TRIAL.CODE_TRIAL = TL.CODE_TRIAL
JOIN HORSE H ON H.CODE_HORSE = TL.CODE_HORSE
WHERE TL.TIME_PASSED_SEC <= :TIME_PASSED
AND TRIAL.CODE_TRIALTYPE = 2
AND H.CODE_BREED = :A_CODE_BREED
AND EXTRACT(YEAR FROM TRIAL.BYDATE) BETWEEN :A_YEAR_BEGIN AND :A_YEAR_END
INTO
CODE_HORSE,
BYDATE,
HORSENAME,
FRISK
DO
BEGIN
OWNERNAME = NULL;
SELECT
FARM.NAME
FROM
(
SELECT
R.CODE_FARM
FROM REGISTRATION R
WHERE R.CODE_HORSE = :CODE_HORSE
AND R.CODE_REGTYPE = 6
AND R.BYDATE <= :BYDATE
ORDER BY R.BYDATE DESC
FETCH FIRST ROW ONLY
) OWN
JOIN FARM ON FARM.CODE_FARM = OWN.CODE_FARM
INTO OWNERNAME;
SUSPEND;
END
END
Let’s execute the SQL query again after changing the SP_SOME_STAT procedure:
SELECT COUNT(*)
FROM (
SELECT
CODE_HORSE,
BYDATE,
HORSENAME,
FRISK,
OWNERNAME
FROM SP_SOME_STAT(58, '2.00,0', 2020, 2023)
);
COUNT
=====================
240
Current memory = 555293472
Delta memory = 288
Max memory = 555359872
Elapsed time = 0.134 sec
Buffers = 32768
Reads = 0
Writes = 0
Fetches = 124992
2.424 sec vs 0.134 sec - the acceleration is significant. Is it possible to do better? Let’s run the profiling session again. The new session ID is 6.
Let’s look at the PSQL statistics:
SELECT
ROUTINE_NAME,
LINE_NUM,
COLUMN_NUM,
COUNTER,
TOTAL_ELAPSED_TIME,
AVG_ELAPSED_TIME
FROM PLG$PROF_PSQL_STATS_VIEW STAT
WHERE STAT.PROFILE_ID = 6;
ROUTINE_NAME LINE_NUM COLUMN_NUM COUNTER TOTAL_ELAPSED_TIME AVG_ELAPSED_TIME
==================== ============ ============ =============== ===================== =====================
SP_SOME_STAT 18 3 1 10955200 10955200
SP_SOME_STAT 40 5 240 3985800 16607
SF_SEC_TO_RACETIME 20 3 240 508100 2117
SF_SEC_TO_RACETIME 31 3 240 352700 1469
SF_SEC_TO_RACETIME 21 3 240 262200 1092
SF_SEC_TO_RACETIME 16 3 240 257300 1072
SF_SEC_TO_RACETIME 18 3 240 156400 651
SP_SOME_STAT 17 3 1 141500 141500
SF_SEC_TO_RACETIME 8 1 240 125700 523
SF_SEC_TO_RACETIME 17 3 240 94100 392
SF_RACETIME_TO_SEC 22 5 1 83400 83400
SF_SEC_TO_RACETIME 28 3 240 38700 161
SF_SEC_TO_RACETIME 10 1 240 20800 86
SF_SEC_TO_RACETIME 11 1 240 20200 84
SF_SEC_TO_RACETIME 9 1 240 16200 67
SF_RACETIME_TO_SEC 6 1 1 7100 7100
SF_SEC_TO_RACETIME 7 1 240 6600 27
SF_RACETIME_TO_SEC 27 5 1 5800 5800
SF_RACETIME_TO_SEC 18 3 1 5700 5700
SF_SEC_TO_RACETIME 13 3 240 5700 23
ROUTINE_NAME LINE_NUM COLUMN_NUM COUNTER TOTAL_ELAPSED_TIME AVG_ELAPSED_TIME
==================== ============ ============ =============== ===================== =====================
SF_RACETIME_TO_SEC 32 5 1 4600 4600
SP_SOME_STAT 39 5 240 4400 18
SF_RACETIME_TO_SEC 14 3 1 4300 4300
SF_RACETIME_TO_SEC 34 5 1 3500 3500
SF_RACETIME_TO_SEC 25 5 1 3300 3300
SF_RACETIME_TO_SEC 31 5 1 2900 2900
SF_RACETIME_TO_SEC 29 5 1 2800 2800
SF_RACETIME_TO_SEC 15 3 1 1600 1600
SF_RACETIME_TO_SEC 26 5 1 1000 1000
SF_RACETIME_TO_SEC 7 1 1 800 800
SF_RACETIME_TO_SEC 20 3 1 800 800
SF_RACETIME_TO_SEC 5 1 1 400 400
SF_RACETIME_TO_SEC 8 1 1 400 400
SF_RACETIME_TO_SEC 10 1 1 400 400
SF_RACETIME_TO_SEC 11 1 1 400 400
SF_RACETIME_TO_SEC 12 1 1 400 400
SF_RACETIME_TO_SEC 16 3 1 400 400
SP_SOME_STAT 15 3 1 300 300
SF_RACETIME_TO_SEC 9 1 1 300 300
The statement at line 18 of the SP_SOME_STAT procedure takes the longest time: this is the top-level cursor itself, which is opened only once. It is important to note that the total time for retrieving all records from the cursor includes the statements executed inside the block that processes each cursor record, that is:
FOR
SELECT
...
DO
BEGIN
-- everything that is done here affects the time it takes to retrieve all records from the cursor
...
END
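To see which statements inside that block are the most expensive, the same statistics view can be filtered to the lines of the procedure itself. A sketch, using profile session 6 from this run:

```sql
-- Per-line statistics for SP_SOME_STAT only (profile session 6),
-- most expensive lines first.
SELECT
  LINE_NUM,
  COLUMN_NUM,
  COUNTER,
  TOTAL_ELAPSED_TIME,
  AVG_ELAPSED_TIME
FROM PLG$PROF_PSQL_STATS_VIEW
WHERE PROFILE_ID = 6
  AND ROUTINE_NAME = 'SP_SOME_STAT'
ORDER BY TOTAL_ELAPSED_TIME DESC;
```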
Let’s look at what makes the most significant contribution within this block: line 40 of the SP_SOME_STAT procedure, which is executed 240 times. Here is the statement at that line:
SELECT
FARM.NAME
FROM
(
SELECT
R.CODE_FARM
FROM REGISTRATION R
WHERE R.CODE_HORSE = :CODE_HORSE
AND R.CODE_REGTYPE = 6
AND R.BYDATE <= :BYDATE
ORDER BY R.BYDATE DESC
FETCH FIRST ROW ONLY
) OWN
JOIN FARM ON FARM.CODE_FARM = OWN.CODE_FARM
INTO OWNERNAME;
Now let’s look at the plan of the SP_SOME_STAT procedure:
SELECT CS.MON$EXPLAINED_PLAN
FROM MON$COMPILED_STATEMENTS CS
JOIN PLG$PROF_STATEMENTS S ON S.STATEMENT_ID = CS.MON$COMPILED_STATEMENT_ID
WHERE CS.MON$OBJECT_NAME = 'SP_SOME_STAT'
AND S.PROFILE_ID = 6;
==============================================================================
MON$EXPLAINED_PLAN:
Select Expression (line 40, column 5)
-> Singularity Check
-> Nested Loop Join (inner)
-> First N Records
-> Refetch
-> Sort (record length: 28, key length: 8)
-> Filter
-> Table "REGISTRATION" as "OWN R" Access By ID
-> Bitmap
-> Index "REGISTRATION_IDX_HORSE_REGTYPE" Range Scan (full match)
-> Filter
-> Table "FARM" Access By ID
-> Bitmap
-> Index "PK_FARM" Unique Scan
Select Expression (line 18, column 3)
-> Nested Loop Join (inner)
-> Filter
-> Table "TRIAL" Access By ID
-> Bitmap And
-> Bitmap
-> Index "IDX_TRIAL_BYYEAR" Range Scan (lower bound: 1/1, upper bound: 1/1)
-> Bitmap
-> Index "FK_TRIAL_TRIALTYPE" Range Scan (full match)
-> Filter
-> Table "TRIAL_LINE" as "TL" Access By ID
-> Bitmap
-> Index "FK_TRIAL_LINE_TRIAL" Range Scan (full match)
-> Filter
-> Table "HORSE" as "H" Access By ID
-> Bitmap
-> Index "PK_HORSE" Unique Scan
==============================================================================
So the inner query plan is:
Select Expression (line 40, column 5)
-> Singularity Check
-> Nested Loop Join (inner)
-> First N Records
-> Refetch
-> Sort (record length: 28, key length: 8)
-> Filter
-> Table "REGISTRATION" as "OWN R" Access By ID
-> Bitmap
-> Index "REGISTRATION_IDX_HORSE_REGTYPE" Range Scan (full match)
-> Filter
-> Table "FARM" Access By ID
-> Bitmap
-> Index "PK_FARM" Unique Scan
As you can see, the plan is not the most efficient: an index is used to filter the data, but the resulting keys then have to be sorted, followed by a Refetch. Let’s look at the REGISTRATION_IDX_HORSE_REGTYPE index:
SQL> SHOW INDEX REGISTRATION_IDX_HORSE_REGTYPE;
REGISTRATION_IDX_HORSE_REGTYPE INDEX ON REGISTRATION(CODE_HORSE, CODE_REGTYPE)
Only the CODE_HORSE and CODE_REGTYPE fields are included in the index, so index navigation cannot be used to find the most recent record by date. Let’s try to create another composite index:
CREATE DESCENDING INDEX IDX_REG_HORSE_OWNER ON REGISTRATION(CODE_HORSE, CODE_REGTYPE, BYDATE);
Let’s run the query again:
SELECT COUNT(*)
FROM (
SELECT
CODE_HORSE,
BYDATE,
HORSENAME,
FRISK,
OWNERNAME
FROM SP_SOME_STAT(58, '2.00,0', 2020, 2023)
);
COUNT
=====================
240
Current memory = 554429808
Delta memory = 288
Max memory = 554462400
Elapsed time = 0.125 sec
Buffers = 32768
Reads = 0
Writes = 0
Fetches = 124165
Just in case, let’s check that the procedure plan has changed.
SELECT CS.MON$EXPLAINED_PLAN
FROM MON$COMPILED_STATEMENTS CS
WHERE CS.MON$OBJECT_NAME = 'SP_SOME_STAT'
ORDER BY CS.MON$COMPILED_STATEMENT_ID DESC
FETCH FIRST ROW ONLY;
==============================================================================
MON$EXPLAINED_PLAN:
Select Expression (line 38, column 5)
-> Singularity Check
-> Nested Loop Join (inner)
-> First N Records
-> Filter
-> Table "REGISTRATION" as "OWN R" Access By ID
-> Index "IDX_REG_HORSE_OWNER" Range Scan (lower bound: 3/3, upper bound: 2/3)
-> Filter
-> Table "FARM" Access By ID
-> Bitmap
-> Index "PK_FARM" Unique Scan
Select Expression (line 16, column 3)
-> Nested Loop Join (inner)
-> Filter
-> Table "TRIAL" Access By ID
-> Bitmap And
-> Bitmap
-> Index "IDX_TRIAL_BYYEAR" Range Scan (lower bound: 1/1, upper bound: 1/1)
-> Bitmap
-> Index "FK_TRIAL_TRIALTYPE" Range Scan (full match)
-> Filter
-> Table "TRIAL_LINE" as "TL" Access By ID
-> Bitmap
-> Index "FK_TRIAL_LINE_TRIAL" Range Scan (full match)
-> Filter
-> Table "HORSE" as "H" Access By ID
-> Bitmap
-> Index "PK_HORSE" Unique Scan
==============================================================================
Yes, the plan has improved: navigation by the IDX_REG_HORSE_OWNER index with a Range Scan is now used, without a separate sort and Refetch. Comparing the execution statistics, we get 0.134 vs 0.125 seconds and 124992 vs 124165 fetches: a very minor improvement. Still, relative to the original version, the procedure has become roughly 19 times faster (2.424 / 0.125 ≈ 19.4), so optimization can be considered complete.
1.16. Conclusion
In this article, you were introduced to a new tool, the PSQL profiler, which allows you to find bottlenecks in your SQL and PSQL code. It should now be easier to find the causes of slowdowns not only in top-level queries, but also inside PSQL modules.
This concludes the review of the new features of Firebird 5.0. Migrate to Firebird 5.0 and take advantage of all the features described above.