How to implement backups with InterBase XE7 online dump

Dmitry Kuzmenko, 08-SEP-2016

InterBase has supported online dump, an online copy of the database file, since version 2007. Unlike gbak -b/-c, a dump gives you a ready-to-use database right after copying: you do not need to “restore” it from a backup (something that is not a database). Online dump is very fast, almost as fast as copying the file with the operating system.

Use the following command to make an online dump of the database:

gbak -d [options] database target

(You can find the full description of the dump in the documentation, Doc\OpGuide.pdf.)

As with the database file, the target may have any name and extension you like. The result of the command is a dump file that is equivalent to the original database, but in read-only mode.

The first run of this command takes the time needed to scan (read) the source database plus the time needed to write the target file.

While the target is in read-only mode, it stays linked to the database file. The first dump reads the full database file and copies it to the target. The second and subsequent dumps write only the changed pages to the target. This is called an “incremental dump”.

Important! On each repeated dump command (with the same file names, of course), InterBase XE7 reads only the changed pages of the database and writes them to the target. Versions before XE7 read the full source database, so XE7 is much faster. If no pages were changed in the source database, an XE7 incremental dump takes about 1 second, while InterBase 2007-XE3 spends the time needed to read the whole database file. That time depends on the source database size and the storage speed. For example, if the storage speed is around 400 MB/s, a 100 GB database is scanned in 250 seconds (4 min 10 sec).
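
The arithmetic in this example can be reproduced with a small shell sketch. The function names scan_seconds and format_duration are made up for illustration, and the calculation assumes decimal units (1 GB = 1000 MB) and integer division, which is what the 250-second figure implies.

```shell
#!/bin/sh
# Estimate the time of a full scan of the source database.
# $1 = database size in MB, $2 = storage read speed in MB/s.
# Assumption: decimal units (100 GB = 100000 MB); result is rounded down.
scan_seconds() {
    size_mb=$1
    speed_mb_s=$2
    echo $(( size_mb / speed_mb_s ))
}

# Format a number of seconds as "M min S sec".
format_duration() {
    echo "$(( $1 / 60 )) min $(( $1 % 60 )) sec"
}

# 100 GB database on 400 MB/s storage
format_duration "$(scan_seconds 100000 400)"   # prints: 4 min 10 sec
```

This only estimates the read time. The first dump also has to write the whole target file, so its real duration will be longer.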

Note. InterBase XE7 supports the database file formats of XE7 (ODS 16), XE/XE3 (ODS 15) and 2009 (ODS 13). The smart scanning feature mentioned above works only with the XE7 (ODS 16) database format.

If you need to switch the target to read-write mode (normal mode), use the command

gfix target -mode read_write

After that, however, the target loses its link to the database, and subsequent runs of gbak -d database target are no longer possible because the target is considered “another” database.

If you want to fully overwrite the target instead of making an incremental dump, use the -ov option

gbak -d -ov database target
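
The choice between an incremental dump and a full overwrite can be wrapped in one small function. This is only a sketch: online_dump and the GBAK variable are hypothetical names, connection options (user, password) are left out, and a POSIX shell is assumed. Only the -d and -ov options described above are used.

```shell
#!/bin/sh
# GBAK is a hypothetical override for the gbak executable.
# Setting GBAK=echo gives a dry run that only prints the arguments.
GBAK="${GBAK:-gbak}"

# online_dump DATABASE TARGET [full]
# Without the third argument an incremental dump is made;
# with "full" the target is completely overwritten (-ov).
online_dump() {
    db=$1
    target=$2
    mode=${3:-incremental}
    if [ "$mode" = "full" ]; then
        "$GBAK" -d -ov "$db" "$target"
    else
        "$GBAK" -d "$db" "$target"
    fi
}
```

For example, online_dump database target repeats the incremental dump, and online_dump database target full rewrites the target from scratch.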

Dump of the dump

The first dump of the dump will work, but the subsequent increments will not. For example, first

gbak -d database target1

Here we get target1 as a fully dumped database in read-only mode.

Next,

gbak -d target1 target2

As you see, we create a dump of the dump. And yes, this command copies target1 to target2. Both targets are in read-only mode. But if we repeat these two commands, the changes that went from the database to target1 are not copied to target2. So target2 stays in the state it had after the first, initial copy. And there are no error or warning messages.

So, if you ever want to make a dump of a dump, you should use only a full dump

gbak -d -ov target1 target2

The -ov option is mandatory to ensure that target2 is written (and overwritten) with the contents of target1.
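
The two steps can be chained so that the second one always runs as a full overwrite. A minimal sketch, assuming a POSIX shell; dump_chain and GBAK are hypothetical names, and connection options are omitted.

```shell
#!/bin/sh
# GBAK is a hypothetical override for the gbak executable (GBAK=echo for a dry run).
GBAK="${GBAK:-gbak}"

# dump_chain DATABASE TARGET1 TARGET2
dump_chain() {
    db=$1
    t1=$2
    t2=$3
    # Step 1: normal (incremental) dump from the live database to the first target.
    "$GBAK" -d "$db" "$t1" || return 1
    # Step 2: dump of the dump. Always a full overwrite, because an incremental
    # dump of a dump would leave the second target unchanged without any message.
    "$GBAK" -d -ov "$t1" "$t2"
}
```

If the first dump fails, the function stops and target2 is left untouched.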

Backup scheme 1

Example of dumps at different time intervals

gbak -d database target

is run, for example, every hour. Here we have the production source database and the “backup” copy target, up to one hour behind.

In addition, every 24 hours we can run

gbak -d database target2

Now we have the production database, the target copy up to 1 hour behind, and the target2 copy up to 24 hours behind.

You can make any number of dumps from one database.

Needless to say, the target dumps can be used as read-only databases for any purpose (reporting, analytics, etc.), for tasks that do not need to look at the live database.

Pro: Each dump can be independently scheduled.

Con: Lags between newest and oldest dump.
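
Scheme 1 could be driven by a single script that the scheduler starts once an hour. The sketch below is an assumption about how one might wire it up, not part of InterBase: scheme1_run and GBAK are hypothetical names, the daily dump is arbitrarily tied to the 00 hour, and a POSIX shell is assumed.

```shell
#!/bin/sh
# GBAK is a hypothetical override for the gbak executable (GBAK=echo for a dry run).
GBAK="${GBAK:-gbak}"

# scheme1_run HOUR
# HOUR is the current hour as two digits ("00".."23"), e.g. from `date +%H`.
scheme1_run() {
    hour=$1
    # Hourly dump: the copy that stays close to the production database.
    "$GBAK" -d database target || return 1
    # Daily dump: assumed here to run together with the midnight hourly dump.
    if [ "$hour" = "00" ]; then
        "$GBAK" -d database target2
    fi
}
```

A scheduler entry would then call scheme1_run "$(date +%H)" at the start of every hour. Two separate scheduled jobs work just as well, since each dump can be scheduled independently.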

Backup scheme 2

Sequential dumps to different targets. In this case, you need to configure a scheduler (OS or custom) to run the following commands, one per specified time interval:

gbak -d database target1

gbak -d database target2

gbak -d database target3

If you use a 1-hour interval between these commands, you will have dumps (backups) like this:

target1 at 12:00, target2 at 13:00, target3 at 14:00. On the next cycle, target1 is updated at 15:00, and so on. As a result, we have copies of the database for each of the last 3 hours.

Pro: we have dumps for several hours that stay close to the original database

Con: these commands are a bit harder to schedule. In this example, the dump commands need to be scheduled at exact times:

Dump to target1 at 00:00, 03:00, 06:00…

Dump to target2 at 01:00, 04:00, 07:00…

Dump to target3 at 02:00, 05:00, 08:00…
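
One way to avoid three separate schedules is to run a single hourly job that picks the target from the current hour. The sketch below follows the timetable above (hour modulo 3); target_for_hour, scheme2_run and GBAK are hypothetical names, and a POSIX shell is assumed.

```shell
#!/bin/sh
# GBAK is a hypothetical override for the gbak executable (GBAK=echo for a dry run).
GBAK="${GBAK:-gbak}"

# target_for_hour HOUR
# Prints target1 for 00, 03, 06..., target2 for 01, 04, 07...,
# target3 for 02, 05, 08...
target_for_hour() {
    hour=${1#0}        # strip one leading zero so "08" is not read as octal
    hour=${hour:-0}    # a plain "0" becomes empty after stripping; restore it
    echo "target$(( hour % 3 + 1 ))"
}

# scheme2_run HOUR
# Dumps the database to the target that belongs to the given hour.
scheme2_run() {
    "$GBAK" -d database "$(target_for_hour "$1")"
}
```

Called every hour as scheme2_run "$(date +%H)", this updates target1 at 12:00, target2 at 13:00, target3 at 14:00 and target1 again at 15:00, as in the example above.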

Summary

gbak -d can be used for a dump to local storage and for a dump over the network, since an incremental dump sends only the changed pages to the target. So the target can be placed on remote network storage. But, of course, the network must have good bandwidth, comparable to that of local storage. Otherwise, writes to the target will be slow.

Online dump can be used not only as a tool to make an online backup copy of the database but also as a tool for “horizontal scaling” of the system, to balance the load of the production, reporting, and analytics applications.

Although online dump is the fastest way to get an online copy of the database (compared with gbak -b/-c), it operates on pages and can therefore miss some page-level damage if the database is corrupted. Thus, you still need to check database consistency with the good old gbak -b/-c, but you can do it less often than before.
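
The periodic consistency check can be scripted as a backup followed by a test restore. A minimal sketch, assuming a POSIX shell: check_by_backup_restore and GBAK are hypothetical names, the file names are placeholders, connection options are omitted, and the restored database file is assumed not to exist yet (gbak -c creates a new database).

```shell
#!/bin/sh
# GBAK is a hypothetical override for the gbak executable (GBAK=echo for a dry run).
GBAK="${GBAK:-gbak}"

# check_by_backup_restore DATABASE BACKUP_FILE RESTORED_DB
check_by_backup_restore() {
    db=$1
    backup=$2
    restored=$3
    # Logical backup: reads the data record by record, not page by page.
    "$GBAK" -b "$db" "$backup" || { echo "backup failed: $db" >&2; return 1; }
    # Test restore into a new database file.
    "$GBAK" -c "$backup" "$restored" || { echo "restore failed: $backup" >&2; return 1; }
    echo "backup and restore of $db completed"
}
```

A non-zero exit status from either step is a signal to look at the database before relying on further dumps.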