Oracle Database Backup Optimization And Upgrade Project

This chapter mainly introduces how we help customers optimize the existing backup strategy and backup architecture to solve problems such as long backup/recovery time, large archive log volume, untimely backup, scattered backup jobs, data loss risks, and difficult backup management.

ORACLE Database Backup Optimization and Upgrade Project

Oracle Database Backup Optimization Architecture

Project Background

With the continuous development of customer business operations, the data volume in each database has gradually increased. Concurrently, due to rapid growth in new business, approximately 3–8 new databases are deployed monthly. Database backup and recovery face significant challenges:
(1) Prolonged Backup Time: Continuous data growth extends backup durations. Backup operations consume significant I/O and CPU resources, impacting database performance.
(2) Recovery Risks: Currently, redo logs and data files are backed up together once daily. In emergency recovery scenarios, restoring from tape libraries requires applying redo logs generated after the backup point. If these logs (stored on local disks) are corrupted, lost due to storage failure, or accidentally deleted, zero-data-loss recovery becomes impossible.
(3) Archive Log Overhead: Backups include data files and archive logs. Databases with high transaction volumes generate excessive archive logs, further extending backup times.
(4) Slow Tape-Based Recovery: Restoring a 1.7 TB database (e.g., the Customer Information System) from tape libraries, followed by redo log application, exceeds 2 hours. Prolonged recovery for critical transaction databases impacts user experience and organizational reputation.
(5) Lack of Backup Validation: Limited tape library resources prevent daily validation of backup integrity, risking undetected corrupted backups during recovery.
(6) Manual Management: Backup configuration, monitoring, execution, and recovery rely on manual scripts, lacking centralized tools for streamlined management.
In summary, existing ORACLE database backups face issues including extended backup/recovery times, log loss risks, unverified backup validity, and inefficient management. This project addresses these challenges.

Project Goals

Upgrade the ORACLE backup system to achieve:
(1) Backup Strategy Optimization: Shift from daily full backups to an initial full backup followed by daily incremental backups.
(2) Zero Data Loss: Implement real-time backup of database logs.
(3) Faster Recovery: Achieve recovery speeds up to 1 GB/s—over twice as fast as tape-based restoration—significantly reducing downtime.
(4) Centralized Management:
     - Use ORACLE OEM for unified backup policy configuration, record retention, and capacity management.
     - Automate backup script generation/execution on the backup appliance, eliminating local server scripting.

Project Implementation Status

1. Achieved Goals

Post-implementation results include:
(1) Backup Strategy: Initial full backup + daily incremental backups deployed.
(2) Enhanced Speed: Notable improvement in daily backup/recovery performance.
(3) Centralized Management: Unified backup policy configuration, task scheduling, and result monitoring via OEM.
(4) Zero Data Loss (RPO=0):
     - By enabling the real-time backup function of the ORACLE database activity log, the database log is transferred to the Oracle Exadata backup machine in real time for storage, so that the database can be restored from the Exadata machine at any time when it goes down, and the RPO is 0, that is, there is no data loss.
     - At present, the zero data loss function of the database has been fully tested and verified in the test environment and production environment, and meets the functional requirements.

2. Unachieved Goal

Tape Integration: Contract required integrating the backup appliance with the physical tape library (TS4500) for backups/restores. This was modified during planning:
     - Traditional tape backups retained for long-term data preservation.
     - New backup appliance connected via 10GbE to Exadata/PC servers for direct database backups.

3. Milestone Summary

(1) Phase I: Hardware procurement, installation, power-on, and software validation completed.
(2) Ph(1) ase II: Backup/recovery functionality thoroughly tested in non-production environments.
(3) Phase III: All ORACLE databases upgraded with optimized backups (incremental strategy, accelerated recovery, centralized management) and deployed stably.

4. Key Deliverables

(1) Production Deployment: 48 ORACLE databases automated with scheduled backups (initial full → daily incremental), reducing daily backup volume.
(2) Performance Gains:
     - Example: Statistical Reporting Database recovery speed increased from 178 MB/s to 475 MB/s, cutting recovery time from 16 minutes to 6 minutes.
(3) OEM Centralization: Unified platform for backup policy configuration, task customization, and result visualization.