Database maintenance can be a full -time job when working within a large corporation. Cleanup or deletion of records can especially be time-consuming and error prone when dealing with large data tables with millions of records. Often developers will try to run a basic delete statement without knowing about batching or truncating the activity log. Below is a simple method for performing a batch delete.

Define you DELETE statement

The first step is to define your DELETE statement with the appropriate WHERE clause. If for example you want to delete all records before a set date then your query would look similar to:

DELETE FROM my_table WHERE action_ts < '06/01/2009'

Implementing the basic batch loop

To implement a basic loop we need to define some variables to keep track of a few variables. First we define @err to obtain the error id if an error occours, and @nCount to hold the total number of rows deleted.

DECLARE @err INT, @nCount INT
SELECT @nCount=0

Next we implement or while statement to execute or delete statement while more records are available to be deleted. In this example we will test if any rows exist matching or deletion statement.

WHILE EXISTS (SELECT * FROM my_table WHERE action_ts < '06/01/2009')
BEGIN
    ... 
END

In our while loop we will delete our records in batches by use of the TOP command that limits how many records are effected. We will be using a batch size of 10000.

DELETE TOP 10000 FROM my_table WHERE action_ts < '06/01/2009'

After our delete query we want to check if any errors occurred and if so break out of our batch loop. This is done by obtaining the current error status and if an error occurred then breaking out of the loop by the break command.

SELECT @err=@@ERROR, @nCount=@nCount+@@ROWCOUNT
IF @err=0
BEGIN
    ...
END
ELSE
BEGIN
    RAISERROR 20001 'Failed to delete date from table my_table'
    BREAK
END

Dumping the activity log and output statements

To be notified on how many rows are being deleted we can output the @nCount variable that is udpated after every deletion with the number of rows affected by the last delete statement. We can use the print statement to output the number.

PRINT '%l! rows deleted', @nCount

Also when dealing with a large amount of rows it is possible for the activity log to get full for the database. To clean up the log use the dump tran command with the truncate_only option.

DUMP TRAN my_db WITH TRUNCATE_ONLY

The Full Batch Delete Script

Putting the entire script together we end up with:

DECLARE @err INT, @nCount INT
SELECT @nCount=0
 
WHILE EXISTS (SELECT * FROM my_table WHERE action_ts < '06/01/2009')
BEGIN
 
    DELETE TOP 10000 FROM my_table WHERE action_ts < '06/01/2009'
 
    SELECT @err=@@ERROR, @nCount=@nCount+@@ROWCOUNT
    IF @err=0
        BEGIN
            DUMP TRAN my_db WITH TRUNCATE_ONLY
            PRINT '%l! rows deleted', @nCount
        END
    ELSE
        BEGIN
            RAISERROR 20001 'Failed to delete date from table my_table'
            BREAK 
        END
END

For those that worry about the extra time needed for doing the exists select clause in the while loop you could optionally use a separate variable to hold the latest row count and test for if the row count is greater than 0.

, , , , , ,

It is often the case that developers new to databases will try to perform large data modifications without batching. When this happens in most cases the modification will fail with the error of the transaction log being full after you are presented with the message The transaction log in the database is almost full. Your transaction is suspended until space is made available in the log. What is happening is that the update/insert modification is trying to change too many rows without committing so the transaction log is getting filled up and once it is full the modification fails and no changes get saved to the database.

To correct this it is good practice to perform your database modifications in batches, and upon each iteration of the batch to clear your log transaction file. The dump tran command notifies Sybase that you want to clear the transaction log of a database and with which options. In these cases you simply want to clear it without backing up the log to a file so you would use the with turncate_only option.

DUMP TRAN MY_TABLE WITH truncate_only

Resources

, , , , ,

Transferring data between tables is a common task of database maintenance. It is used when updating database schemas or when you need to migrate data to a log or history table. There are 2 common methods for performing the transfer the first is done by use of a WHERE NOT EXISTS sub query, while the less common known method is by setting up and utilizing a Trigger, which removes rows from the old table as soon as they get inserted into the new table.

WHERE NOT EXISTS Method

The most common method of transferring data between tables in a database that i have seen is the combination of ROWCOUNT and a WHERE NOT EXISTS clause. How this method works is that you set your ROWCOUNT to the batch size then you insert into your table to be transferred to from a select of your old table where the current record does not exist in your transfer to table. A sample code snippet showing this method is:

DECLARE @ErrorId INTEGER,
    @RowCtrInserted INTEGER
 
SET ROWCOUNT 10000
 
SELECT @RowCtrInserted = 1, @ErrorId = @@ERROR
 
WHILE (@ErrorId = 0 AND @RowCtrInserted > 0)
BEGIN			
    INSERT INTO LOG_HIST	
        (
        ID,
        USER_EMP_ID,
        APP_NM,
        ACTION_NM,
        ACTION_TS,
        IP_AD
        )
    SELECT 
        ID,
        USER_EMP_ID,
        APP_NM,
        ACTION_NM,
        ACTION_TS,
        IP_AD
    FROM LOG LOG
    WHERE NOT EXISTS (
        SELECT * 
        FROM LOG_HIST hist
        WHERE LOG.ID = hist.ID 
     )
 
    SELECT @RowCtrInserted = @@ROWCOUNT, @ErrorId = @@ERROR
END
 
IF @ErrorId=0 
BEGIN
    TRUNCATE TABLE LOG
END

As can be seen in the code this method loops while a record exists in the log table that is not present in the log_hist table. This requires an additional query every time the batch size is met, and can be a performance problem on large database tables.

Temporary Trigger Method

The second method that can be used to transfer data between two tables is the use of a temporary insert trigger on the table being transferred to. How it works is that the trigger is defined to delete the newly inserted record from the old table. What this means is that for every iteration of the insert loop as determined by the batch size a sub query to determine if a record already exists in the new table is not necessary as records are deleted as soon as they are transferred. Below is a sample sql script setting up the trigger and performing the transfer, in addition it contains error checking and print statements.

CREATE TRIGGER LOG_HIST_TR ON LOG_HIST FOR INSERT
AS
DELETE LOG FROM LOG, inserted
  WHERE 1=1
    AND LOG.ID = inserted.ID
go
 
PRINT "Transferring data from LOG to LOG_HIST"
go
 
DECLARE @err INT, @n INT
SELECT @n=0
PRINT 'Will use batch size of 10000 rows'
WHILE exists (SELECT * FROM LOG)
BEGIN
    INSERT LOG_HIST (
        ID,
        USER_EMP_ID,
        APP_NM,
        ACTION_NM,
        ACTION_TS,
        IP_AD
    )
    SELECT TOP 10000
        ID,
        USER_EMP_ID,
        APP_NM,
        ACTION_NM,
        ACTION_TS,
        IP_AD
    FROM LOG
 
    SELECT @err=@@ERROR, @n=@n+@@ROWCOUNT
 
    IF @err=0
    BEGIN
        DUMP TRAN USERACTIVITY WITH truncate_only
        PRINT '%1! rows transferred', @n
    END
    ELSE
    BEGIN
        RAISERROR 20001 'Failed to transfer data from table LOG to table LOG_HIST. Please recover data from these two tables'
        BREAK
    END
END
 
go
PRINT "Checking data transfer completion"
go
IF not exists (SELECT * FROM LOG)
BEGIN
    DROP TRIGGER LOG_HIST_TR
END
ELSE
BEGIN
    RAISERROR 20001 'Failed to transfer data from LOG into LOG_HIST'
END

In the above example after each batch iteration we output the total number of records transferred, as well as cleaning up the database log so that we don’t max out the allocated space. Once an error is thrown or no more records exists we drop our temporary trigger. You will notice that no truncating of the table we transferred data from is needed as the trigger was responsible for removing them. The additional benefit to using this method is that if a transfer is interrupted no time or data is loss and the transfer can pick up right were it left off without performing an additional WHERE NOT EXISTS sub query.

, , , , , , ,