Biblioteca

How to create and fill artificial primary keys (often needed for replication)Instruction and

Many people who tried to configure native replication in Firebird have faced the unexpected problem – absence of primary or unique key for the tables they want to replicate.

Despite the fact that having primary key constraint for the table is the very basic requirement of relational theory, we have seen many databases where tables to be replicated did not have primary or unique keys.

Usually the reasons for the absence of primary key are very simple: developers forgot to create primary key or considered created table as “temporary”, but it stayed in the schema, became the part of business logic, etc.

Few people also mistakenly believed that primary key can be substituted with combination of CHECK constraint and/or trigger Before Insert with “Select…. Where Exists” statement to check existence of the same key. This is not true, such schema does not give the warranty of uniqueness of the value.

The best way to solve the described problem is to create primary key for existing field/fields which uniquely identify each record in the table. However, it could be non-trivial and time consuming task, especially if database is created by third-party vendor or if it has complex structure.

The alternative could be creation of artificial primary keys: i.e., column with automatically filled value, with associated sequence (generator) and trigger. It is necessary to create these objects, fill the values for existing records, create primary key, and then create unique values for this field for every new record.

For this purpose we have created the following instruction and templates.

Instruction: how to create artificial primary keys

Please note: all operations below require exclusive access, without connected users!

1) We need to find all tables without primary or unique keys. Use the following SQL:

with C
as (select min(RDB$RELATION_NAME) RELATION_NAME
    from RDB$RELATION_CONSTRAINTS
    where RDB$CONSTRAINT_TYPE in ('PRIMARY KEY', 'UNIQUE')
    group by RDB$RELATION_NAME)
select R.RDB$RELATION_NAME RN
from RDB$RELATIONS R
left join C on R.RDB$RELATION_NAME = C.RELATION_NAME
where R.RDB$SYSTEM_FLAG = 0 and
      R.RDB$RELATION_TYPE = 0 and
      C.RELATION_NAME is null
order by 1;

2) Save the output of SQL above to the file, and then assign to each table the number (##### in template).
As a result, you will have list with 2 columns, something like this:

### __TABLE_
001 Table1
002 Table2
..
099 Table99
3) For each table in the list you need to prepare the SQL script, using following template – replace ##### and _Table_ with the actual number and table name from the script:

create sequence RPL_#####_SEQ;   – create sequence/generator
alter sequence RPL_#####_SEQ restart with -1;   – make sure it starts from 0

--- add column to the table with default value = -1. 
--- The name of the column is RPL_#####_ID
alter table _TABLE_
 add RPL_#####_ID integer default -1 not null;

– change terminator, in order to be able to execute this script in isql
set terminator ^;

– create trigger Before Insert to autofill new values
create trigger RPL_#####_TRG
  before insert
  on _TABLE_
as
 begin
  if (new.RPL_#####_ID is null) then
   new.RPL_#####_ID = next value for RPL_#####_SEQ;
 end;
^
set terminator ;^

– create or alter sequence to fill existing records – starting from -2B
create or alter sequence tmp_seq restart with -2000000000;
commit;

– actually updating. Time to perform this operation depends on the number of records in _TABLE_

update _TABLE_
 set RPL_#####_ID = next value for tmp_seq
where RPL_#####_ID = -1;
commit;

-– remove default -1 from the _Table_
alter table __TABLE__
 alter RPL_#####_ID
  drop default;
– Finally, add primary key 

alter table __TABLE__
 add constraint RPL_#####_PK
  primary key (RPL_#####_ID)
   using index RPL_#####_IDX;
commit;

4) Execute script in isql - open and copy-paste or isqil -i script_name.sql

5) It is possible that it will be necessary to drop artificial primary key and associated objects, for example, due to the changes from the vendor.

For dropping the keys and triggers we have the following script template.

 

alter table _TABLE_drop constraint RPL_#####_PK;
drop trigger RPL_#####_TRG;
drop sequence RPL_#####_SEQ;
alter table _TABLE_drop RPL_#####_ID;
commit;

As you can see, both scripts ( for creating keys and dropping) change the metadata, it means that they must run in the exclusive mode, when no users connected.

Frequently Asked Questions

1. Should we use primary key or unique key for the replication?

Theoretically, it is possible to create unique key, but unique key allows storing NULL value, and if there will be more than one record with NULL in the table, there could be conflict on replica side when update or delete will arrive to the replica database.

2. How to exclude tables from the replication?

In order to exclude specific tables, add to the replication configuration the following filter:
exclude_filter=TESTTABLEWITHOUTPK|TESTTABLE2
or, it is possible to use wildcards like in Similar To
exclude_filter=TEST%
To see what tables will be exlcuded by the filter, use the following statement:
select rdb$relation_name from rdb$relations where rdb$relation_name similar to 'TEST%';

RDB$RELATION_NAME
========================================================
TESTTABLEWITHOUTPK
TESTTABLE2
To exclude all tables without primary or unique key, we recommend to use the following option:
exclude_without_pk=true  

3. Replica gives error on inserting data, but table on the master has Primary or Unique Key, why?

If you have create table while users are active, inserted some records, and then created primary key, it is possible that replication will break at the moment when table did not yet have primary key.

We recommend to create table with primary key in the single command if you need to create when users are active, i.e.:

create table MYTABLE (id1 integer not null primary key, …);
or, better, always create tables and, in general, modify DDL, in the exclusive mode.