It is generally not faster to commit frequently—it is almost always faster to do the work in a single SQL statement. By way of a small example, say we have a table, T, with lots of rows, and we want to update a column value for every row in that table. We’ll use this to set up such a table (run these four setup steps before each of the following three cases):
$ sqlplus eoda/foo@PDB1
SQL> drop table t;Table dropped.
SQL> create table t as select * from all_objects; Table created.
In this simple example, it is many times slower to loop in order to commit frequently. If you can do it in a single SQL statement, do it that way, as it is almost certainly faster. Even if we “optimize” the procedural code, using bulk processing for the updates (as follows), it is in fact much faster, but still much slower than it could be:
SQL> exec :n := dbms_utility.get_cpu_time; PL/SQL procedure successfully completed.
Not only that, but you should notice that the code is getting more and more complex. From the sheer simplicity of a single UPDATE statement, to procedural code, to even more complex procedural code—we are going in the wrong direction!
Furthermore (yes, there is more to complain about), the preceding procedural code is not done yet. It doesn’t deal with “what happens when we fail” (not if we but rather when we). What happens if this code gets halfway done and then the system fails?
How do you restart the procedural code with a commit? You’d have to add yet more code so you knew where to pick up and continue processing. With the single UPDATE statement, we just reissue the UPDATE. We know that it will entirely succeed or entirely fail; there will not be partial work to worry about. We visit this point more in the section “Restartable Processes Require Complex Logic.”
Now, just to supply a counterpoint to this discussion, recall in the previous chapter when we discussed the concept of write consistency and how an UPDATE statement, for example, could be made to restart. In the event that the preceding UPDATE statement was to be performed against a subset of the rows (it had a WHERE clause, and other users were modifying the columns this UPDATE was using in the WHERE clause), then there would be a case either for using a series of smaller transactions rather than one large transaction or for locking the table prior to performing the mass update. The goal here would be to reduce the opportunity for restarts to occur.
If we were to UPDATE the vast majority of the rows in the table, that would lead us toward using the LOCK TABLE command. In my experience, however, these sorts of large mass updates or mass deletes (the only statement types really that would be subject to the restart) are done in isolation. That large, one-time bulk update or the purge of old data generally is not done during a period of high activity. Indeed, the purge of data should not be affected by this at all, since you would typically use some date field to locate the information to purge, and other applications would not modify this data.