Library

How to create and fill artificial primary keys (often needed for replication)

Em Portuguese: Como criar e preencher chaves primárias artificiais (geralmente necessárias para replicação)

Many people who tried to configure native replication in Firebird have faced the unexpected problem – absence of primary or unique key for the tables they want to replicate.

Despite the fact that having primary key constraint for the table is the very basic requirement of relational theory, we have seen many databases where tables to be replicated did not have primary or unique keys.

Usually the reasons for the absence of primary key are very simple: developers forgot to create primary key or considered created table as “temporary”, but it stayed in the schema, became the part of business logic, etc.

Few people also mistakenly believed that primary key can be substituted with combination of CHECK constraint and/or trigger Before Insert with “Select…. Where Exists” statement to check existence of the same key. This is not true, such schema does not give the warranty of uniqueness of the value.

The best way to solve the described problem is to create primary key for existing field/fields which uniquely identify each record in the table. However, it could be non-trivial and time consuming task, especially if database is created by third-party vendor or if it has complex structure.

The alternative could be creation of artificial primary keys: i.e., column with automatically filled value, with associated sequence (generator) and trigger. It is necessary to create these objects, fill the values for existing records, create primary key, and then create unique values for this field for every new record.

For this purpose we have created the following instruction and templates.

Instruction: how to create artificial primary keys

Please note: all operations below require exclusive access, without connected users!

1) We need to find all tables without primary or unique keys. Use the following SQL:

with R
as (select R.RDB$RELATION_NAME
    from RDB$RELATIONS R
    where R.RDB$RELATION_TYPE = 0 and
          R.RDB$SYSTEM_FLAG is distinct from 1),
B
as (select distinct R.RDB$RELATION_NAME, RI.RDB$INDEX_NAME, RI.RDB$UNIQUE_FLAG, RI.RDB$INDEX_INACTIVE
    from RDB$INDICES RI
    join RDB$RELATIONS R on RI.RDB$RELATION_NAME = R.RDB$RELATION_NAME
    where R.RDB$SYSTEM_FLAG is distinct from 1 and
          RI.RDB$UNIQUE_FLAG = 1 and
          RI.RDB$INDEX_INACTIVE is distinct from 1)
select R.*
from R
left join B on R.RDB$RELATION_NAME = B.RDB$RELATION_NAME
where B.RDB$RELATION_NAME is null;

2) Save the output of SQL above to the file, and then assign to each table the number (##### in template).
As a result, you will have list with 2 columns, something like this:

### _TABLE_
001 Table1
002 Table2
..
099 Table99

3) For each table in the list you need to prepare the SQL script, using following template – replace ##### and _Table_ with the actual number and table name from the script:


create sequence RPL_#####_SEQ;   -- create sequence/generator
alter sequence RPL_#####_SEQ restart with -1;   -- make sure it starts from 0

---- add column to the table with default value = -1. 
---- The name of the column is RPL_#####_ID
alter table _TABLE_
 add RPL_#####_ID integer default -1 not null;

---- change terminator, in order to be able to execute this script in isql
set terminator ^;

---- create trigger Before Insert to autofill new values
create trigger RPL_#####_TRG
  before insert
  on _TABLE_
as
 begin
  if (new.RPL_#####_ID is null) then
   new.RPL_#####_ID = next value for RPL_#####_SEQ;
 end;
^
set terminator ;^

---- create or alter sequence to fill existing records – starting from -2B
create or alter sequence tmp_seq start with -2000000000;
commit;

---- actually updating. Time to perform this operation depends on the number of records in _TABLE_

update _TABLE_
 set RPL_#####_ID = next value for tmp_seq
where RPL_#####_ID = -1;
commit;

---- remove default -1 from the _Table_
alter table _TABLE_
 alter RPL_#####_ID
  drop default;

---- Finally, add primary key 

alter table _TABLE_
 add constraint RPL_#####_PK
  primary key (RPL_#####_ID)
   using index RPL_#####_IDX;
commit;

4) Execute script in isql - open and copy-paste or isqil -i script_name.sql

5) It is possible that it will be necessary to drop artificial primary key and associated objects, for example, due to the changes from the vendor.

For dropping the keys and triggers we have the following script template.

alter table _TABLE_ drop constraint RPL_#####_PK;
drop trigger RPL_#####_TRG;
drop sequence RPL_#####_SEQ;
alter table _TABLE_drop RPL_#####_ID;
commit;

As you can see, both scripts ( for creating keys and dropping) change the metadata, it means that they must run in the exclusive mode, when no users connected.

Frequently Asked Questions

1. Should we use primary key or unique key for the replication?

It is possible to create unique key as an artificial key for replication. In general, unique key can be built on the nullable field, so such key would allow storing NULL values; and if there will be more than one record with NULL in the unique key in the table, there could be ambiguous conflict on replica side when update or delete will arrive to the replica database.
So, we recommend to create artificial primary key or unique key with NOT NULL field for replication purposes.

2. How to exclude tables from the replication?

In order to exclude specific tables, add to the replication configuration the following filter:

exclude_filter=TESTTABLEWITHOUTPK|TESTTABLE2

or, it is possible to use wildcards like in Similar To

exclude_filter=TEST%

To see what tables will be exlcuded by the filter, use the following statement:

select rdb$relation_name from rdb$relations where rdb$relation_name similar to 'TEST%';

RDB$RELATION_NAME
========================================================
TESTTABLEWITHOUTPK
TESTTABLE2

Also, in Firebird 4 you can use SQL command to exclude table from publication:

ALTER DATABASE EXCLUDE MyTable1 FROM PUBLICATION