We performed four backup tests and four recovery tests. , (mbytes_processed/dbsize_mbytes*100) complete The RMAN channel allocates a variable number of disk buffers of size 512 kilobytes (KB) so that the total buffer size for all the input files is less than 16 MB. RMAN VALIDATE was used to simulate the read I/O load of an RMAN database backup (essentially, the write phase of an RMAN backup is missing; for details check the graphic below). where status='RUNNING' The fastest channel of any of them is pushing the blazing-fast speed of 37.47 MB per second. Four database files are read simultaneously with the pread system call and a request/read size of 1024 KB, as you can see from the open file handles (273 to 276) and the corresponding read size (1048576). We also validated the cloned environment by running queries against the cloned database while creating a clone of another database from the backup. Even though RMAN allows active-database duplication, our requirement is to take advantage of the backup that is already available on the FlashBlade; hence our recommendation is to use backup-based duplication, preferably with a recovery catalog database, which you might already be using. The Oracle (RMAN) shadow process itself no longer reads any database files. Ideally, the Oracle backups would be used to create the database clones for testers and developers. , dbsize_mbytes Even though the source database is on block-based storage, when RMAN is invoked, any writes to FlashBlade are managed through dNFS. By default, the value is set to TRUE. The industry's first cloud-era flash, purpose-built for modern analytics and designed to drive tomorrow's discoveries, insights, and creations. In addition, customers face challenges keeping up with demand in their test and development environments, where clones of production databases must be provisioned quickly for DBAs to iterate on.
Set up multiple subnets on FlashBlade (preferably four, as Oracle dNFS supports at most four network paths to the storage system). Apply the latest archived logs to the new instance. That showed the throughput is not cumulative. Have you ever switched from an Oracle flat file database backup to RMAN and seen the read I/O throughput for the backup decrease drastically? The speed at which an enterprise can restore data in the event of a failure, or quickly provision a test/dev environment, defines its business edge and also defines a new class of solution area, called Fast Backup/Restore. FlashBlade is the industry's first all-flash storage purpose-built for modern analytics, architected from the ground up to deliver a powerful cloud-era data platform that's fast, big, and simple. We tested the cloning process in our lab and documented the steps in a white paper, which will be available under the Data Protection section of our website. With the introduction of FlashBlade, a ground-breaking scale-out flash storage system from Pure Storage, the backup and restore challenges of enterprise customers can be addressed easily. In the restore scenario, the read throughput of 3.2 GBps is still impressive, but it is limited by the target FlashArray's write performance. Set up filesystems (or volumes) on FlashBlade and mount them on the database host over the NFS protocol. , (select sum(bytes)/1024/1024 dbsize_mbytes from v$datafile) When the maintenance window starts, the following steps can be completed in less than an hour. Accelerated RMAN recipes for Oracle on Pure Storage's scale-out FlashBlade. The database was 1.01 TB in size. col dbsize_mbytes for 99,999,990.00 justify right head DBSIZE_MB
Oracle customers can now direct their RMAN backups to FlashBlade and use dNFS (Direct NFS) to accelerate backups significantly. and output_device_type is not null You can control disk I/O slaves by setting the DBWR_IO_SLAVES initialization parameter, which is not dynamic. This not only lets the customer take advantage of otherwise dormant backups but also helps validate their restore/recovery process. In addition, Direct NFS can perform concurrent direct I/O by bypassing operating-system-level caches. Greater than 4 but less than or equal to 8. The clones were created in under 20 minutes in both cases. To address the speed of data protection, storage and backup vendors have come up with purpose-built appliances. These may have shortened backup times, but they did not meet the recovery time objective (RTO) during restore, because data rehydration during restore operations generates large random I/O access patterns on disk drives, resulting in poor performance. Asynchronous I/O is available only with an Automatic Storage Management disk group, which uses raw partitions as the storage option for database files. , output_bytes/1024/1024 output_mbytes For the remote server, the key requirement is to make the backups available. Let's start with some basic information about how RMAN performs a backup and how we can influence its behavior or throughput with different buffer sizes (without hidden parameters). However, if the asynchronous I/O implementation is not stable, you can set this parameter to false to disable asynchronous I/O. FROM BIG DATA TO BIG INTELLIGENCE In the new age of big data, modern applications and compute technologies leverage massively parallel architecture for performance. I ran the query multiple times, and the average seemed to be about 35 MB/s.
, input_bytes/1024/1024 input_mbytes The query on server oradb01 achieved over 2.7 GBps of read throughput, while the RMAN duplicate on server oradb02 achieved read and write bandwidth of 1.08 GBps. from v$rman_status rs In conjunction with the data growth, customers are now challenged by daily backups that take over 24 hours, breaching their service level agreements. This would certainly require an RMAN recovery catalog database. To illustrate the advantages of FlashBlade in supporting Fast Backup/Restore, we tested Oracle RMAN backup and recovery scenarios with and without dNFS. Enable dNFS on the database. The scale-out FlashBlade system is file/object based and supports applications using the NFS, S3/object, SMB, and HTTP protocols. For more detailed information on Fast Backup/Restore, not just for Oracle but for other databases, please visit Next-Gen Data Protection. sysdate + TIME_REMAINING/3600/24 end_at col SID for 99999 If your platform supports asynchronous I/O to disk, Oracle recommends that you leave this parameter set to its default value. The actual time turned out to be 6.9 hours, close enough. AND opname NOT LIKE '%aggregate%' col input_mbytes for 99,999,990.00 justify right head READ_MB As a consequence, backup performance dropped drastically and the backup time windows were no longer sufficient. RMAN VALIDATE with DISK_ASYNCH_IO = FALSE, MAXOPENFILES = 4 and DBWR_IO_SLAVES = 0. This will be the target for the RMAN backups. set lines 300 The following tests were performed on Solaris x86 with attached enterprise SAN storage and an Oracle 11.2.0.3 database. Each channel (in our test case we have only one channel) reads the data into the input buffers, processes the data while copying it from the input buffers to the output buffers, and then writes the data from the output buffers to tape (the write phase is skipped by an RMAN VALIDATE). The duplicate or clone process can be performed on the same server or on a remote server.
, to_char(start_time + (sysdate-start_time)/(mbytes_processed/dbsize_mbytes),'DD-MON-YYYY HH24:MI:SS') est_complete See, I can restore the database onto the new host any day of the week. This parameter must be set to TRUE only when the database files reside on raw partitions. You can of course scale the I/O read throughput up to the enterprise storage maximum with a parallel RMAN backup, but the main focus was on increasing the I/O throughput of each channel. In our case we discuss only the read and copy phases, as we perform an RMAN VALIDATE of the whole database. Good stuff, going through something similar trying to get the performance for rman backups higher. If your source database is very busy and running RMAN backups would impact database performance, an easy solution is to take FlashRecover snapshots of the source database, mount them on a second host (the mount host), and perform backups from the mount host onto FlashBlade. The growth of data in recent years has been astounding. From scientific research and movie rendering to artificial intelligence, applications push the limits on thousands of GPU cores or thousands of CPU servers. RMAN VALIDATE with DISK_ASYNCH_IO = FALSE, MAXOPENFILES = 4 and DBWR_IO_SLAVES = 4. col complete for 990.00 justify right head "COMPLETE %" Another key feature of Direct NFS Client is high availability. col output_device_type for a10 justify left head DEVICE The RMAN channel allocates 16 buffers of size 1 megabyte (MB) so that the total buffer size for all the input files is 16 MB. Now that we have all this detailed information, let's test the various settings and check the corresponding system calls (to understand how it works internally). RMAN is designed to take advantage of asynchronous I/O. 1.3.4 DISK_ASYNCH_IO Initialization Parameter (HP-UX). If you answered one of the questions with yes, then this blog post is worth reading to get more insight into synchronous I/O and RMAN backups.
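The est_complete expression in the query fragments is a linear extrapolation: elapsed time divided by the fraction of the database already processed. A minimal Python sketch of the same arithmetic follows; the function names and sample figures are illustrative assumptions, not part of the original script, and the estimate is only as good as the assumption that the remaining work runs at the average rate seen so far.

```python
from datetime import datetime, timedelta


def percent_complete(mb_processed, db_size_mb):
    """Mirror of (mbytes_processed/dbsize_mbytes*100) from the query."""
    return mb_processed / db_size_mb * 100


def estimate_completion(start_time, now, mb_processed, db_size_mb):
    """Mirror of start_time + (sysdate-start_time)/(mbytes_processed/dbsize_mbytes).

    Assumes the rest of the job runs at the average rate observed so far.
    """
    if mb_processed <= 0 or db_size_mb <= 0:
        raise ValueError("need positive processed and total sizes")
    fraction_done = mb_processed / db_size_mb
    return start_time + (now - start_time) / fraction_done


# Hypothetical figures: 2 hours in, a quarter of the data processed.
start = datetime(2024, 1, 1, 0, 0, 0)
now = start + timedelta(hours=2)
print(percent_complete(250.0, 1000.0))
print(estimate_completion(start, now, 250.0, 1000.0))
```

Because the extrapolation is linear, an early estimate can drift when throughput changes mid-run, which is consistent with the text treating a 6.9 hour actual time as close enough.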
There are always 4 slaves spawned initially per channel, but slaves die if idle for more than 60 seconds. The performance in these test cases was limited by the compute and network resources on the hosts, leaving a lot of performance headroom in FlashBlade to accommodate further workloads. As you can see from the above diagram, the clone or duplicate process happens on the FlashBlade, meaning the backups are read from FlashBlade and the new database is created on FlashBlade. col est_complete for a20 head "ESTIMATED COMPLETION" We had two databases, OLTP (1.2 TB) and DW (1.8 TB), that were backed up onto FlashBlade, and clones were created from these two backups. SAP note #1431798 Oracle 11.2.0: Database Parameter Settings, DISK_ASYNCH_IO FALSE (only on HP-UX, only for standard file systems, not for OnlineJFS (VxFS 5.x), not for ASM, not for raw devices, see SAP note 798194). Data is the new currency in the modern era. AND opname like 'RMAN%';
Allocation of Input Disk Buffers / Level of Multiplexing. This helps, thanks!! These tests include single vs. multiple mount points across Kernel NFS and Direct NFS. In our use case for Oracle RMAN backup and restore, the filesystems from FlashBlade are mounted on the host using the NFS protocol. If one network path fails, Direct NFS Client reissues I/O commands over any remaining paths, ensuring fault tolerance and high availability. where totalwork > sofar The backup was performed with one channel only (no parallel backup). Modernization of database platforms and analytics, along with big data, has made data the primary asset of enterprises. Channel 1 writes the data to a locally attached tape drive, whereas channel 2 sends the data over the network to a remote media server. I had to talk with my storage administrators about that. This solution also puts the dormant backups to use by speeding up database clones. It also performs asynchronous I/O, which allows processing to continue while the I/O request is submitted and processed. select SID, to_char(START_TIME,'dd-mm-yy hh24:mi:ss') START_TIME,TOTALWORK, sofar, (sofar/totalwork) * 100 done, The I/O slaves perform the same pread system call as the RMAN channel shadow process did before (compare the request/read size, for example). While companies are modernizing their primary storage by adopting all-flash systems like Pure FlashArray, their backup and restore systems still use traditional disk-based solutions. , (output_bytes/input_bytes*100) compression The current generation of the FlashBlade system can support a write rate of 4.5 GBps and a read rate of 16 GBps in a single 4U chassis. We have checked how Oracle handles the I/O requests internally with various settings, but how does this affect the I/O read throughput of a classic RMAN database backup?
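The input disk buffer rules quoted in this article (16 buffers of 1 MB when the level of multiplexing is 4 or less, a variable number of 512 KB buffers totalling less than 16 MB when it is greater than 4 but at most 8) can be written down as a small lookup. The sketch below is illustrative only: the function name and return shape are invented here, and the tier for more than 8 files is not stated in the text; it is included as an assumption recalled from Oracle's RMAN tuning documentation and should be verified for your release.

```python
KB = 1024
MB = 1024 * KB


def input_buffer_plan(maxopenfiles):
    """Return the input buffer size RMAN is described as using per channel."""
    if maxopenfiles < 1:
        raise ValueError("level of multiplexing must be at least 1")
    if maxopenfiles <= 4:
        # From the text: 16 buffers of 1 MB, 16 MB in total.
        return {"buffer_bytes": 1 * MB, "buffers": 16, "total": "16 MB per channel"}
    if maxopenfiles <= 8:
        # From the text: variable number of 512 KB buffers, total below 16 MB.
        return {"buffer_bytes": 512 * KB, "buffers": None, "total": "less than 16 MB per channel"}
    # ASSUMPTION (not in the article): 4 buffers of 128 KB for each file.
    return {"buffer_bytes": 128 * KB, "buffers": 4, "total": "512 KB per file"}


for level in (4, 8):
    plan = input_buffer_plan(level)
    print(level, plan["buffer_bytes"], plan["total"])
```

The two tiers taken from the text line up with the truss observations: a read size of 1048576 bytes with MAXOPENFILES = 4 and 524288 bytes with MAXOPENFILES = 8.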
If you have any further questions, please feel free to ask, or get in contact directly if you need assistance troubleshooting Oracle database performance issues. Basically, the root cause of the issue looked like this (found by running truss on the specific Oracle processes): My client set the initialization parameter DISK_ASYNCH_IO to FALSE to avoid such aio errors and perform only pread/pwrite calls. col recid for 9999999 head ID, select recid set pages 1000 Disk I/O slaves should always be used to simulate asynchronous I/O when native asynchronous I/O is disabled. This is very easy to accomplish with FlashBlade by mounting the backup filesystems on the remote host at the same mount point location as on the source. Because FlashBlade's massively distributed architecture performs best with parallel connections to the blades, dNFS is highly recommended: it makes separate connections to the storage system for every server process. Using I/O slaves (to simulate asynchronous I/O) can improve the I/O throughput drastically if you are running an Oracle database on an OS platform that does not support asynchronous I/O at all, or if you need to disable asynchronous I/O for other reasons (such as bugs or file system designs). If you set DISK_ASYNCH_IO to false, then you should also set DBWR_IO_SLAVES to a value other than its default of zero in order to simulate asynchronous I/O. The challenge with Kernel NFS is that it allows only a single connection to storage for every mount from the host. 100* sum (long_waits) / sum (io_count) as "LONG_WAIT_PCT", sum (effective_bytes_per_second)/1024/1024 as "MB_PER_S", This will produce a result listing such as the following, ----------------------- --- ------- ------- --------- ----- -------------. The DISK_ASYNCH_IO initialization parameter determines whether the database files reside on raw disks or file systems.
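The LONG_WAIT_PCT and MB_PER_S expressions in the query fragment are plain aggregates over per-file I/O rows. The following Python sketch reproduces that arithmetic on in-memory rows. The view being queried is not named in the text; the column names suggest V$BACKUP_ASYNC_IO, which is an assumption, and the sample rows are made up for illustration.

```python
def summarize_backup_io(rows):
    """Aggregate per-file rows the way the query fragment does.

    long_wait_pct = 100 * sum(long_waits) / sum(io_count)
    mb_per_s      = sum(effective_bytes_per_second) / 1024 / 1024
    """
    io_count = sum(r["io_count"] for r in rows)
    long_waits = sum(r["long_waits"] for r in rows)
    bytes_per_s = sum(r["effective_bytes_per_second"] for r in rows)
    # Guard against division by zero when no I/O has been recorded yet.
    long_wait_pct = 100 * long_waits / io_count if io_count else 0.0
    return {"long_wait_pct": long_wait_pct, "mb_per_s": bytes_per_s / 1024 / 1024}


# Hypothetical rows for two datafiles read by one channel.
sample = [
    {"io_count": 100, "long_waits": 5, "effective_bytes_per_second": 10 * 1024 * 1024},
    {"io_count": 100, "long_waits": 15, "effective_bytes_per_second": 20 * 1024 * 1024},
]
print(summarize_backup_io(sample))
```

A high long-wait percentage points at the storage path rather than at RMAN itself, which is the kind of signal the throughput comparisons in this article rely on.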
One day I had to (was granted the privilege to!) rehost a production database onto a new machine. The I/O requests are handed off to the I/O slaves and replaced by semaphore operations (semctl, semtimedop). DISK_ASYNCH_IO controls whether I/O to datafiles, control files, and logfiles is asynchronous (that is, whether parallel server processes can overlap I/O requests with CPU processing during table scans).
If your platform does not support asynchronous I/O to disk, this parameter has no effect. The DISK_ASYNCH_IO parameter must be set to FALSE when the database files reside on a file system. Eight database files are read simultaneously with the pread system call and a request/read size of 512 KB, as you can see from the open file handles (273 to 280) and the corresponding read size (524288). The parameter specifies the number of I/O server processes used by the database writer process (DBWR). Let's start with an explanation and some SAP information about DISK_ASYNCH_IO before we go on with research and performance measurements. Parallel compute demands parallel storage. col START_TIME for a20 Oracle Documentation Parameter DISK_ASYNCH_IO, Oracle Documentation DISK_ASYNCH_IO Initialization Parameter (HP-UX), Oracle Documentation Tuning RMAN Performance, MOS ID 360443.1 RMAN Backup Performance. It must be analyzed, backed up, recovered, and iterated upon at the speed of modern businesses. With more FlashArrays, FlashBlade can deliver higher performance.
col output_mbytes for 99,999,990.00 justify right head WRITTEN_MB col compression for 990.00 justify right head "COMPRESS|% ORIG" While Pure FlashArray's FlashRecover snapshot functionality allows customers to clone databases in seconds at no additional storage cost, many customers want to segregate their non-production environments from production and hence end up deploying additional infrastructure, including storage, to support test and development environments. The DISK_ASYNCH_IO parameter can be set to TRUE or FALSE depending on where the files reside. Run alter system switch logfile N times, and let the archiver save the online logs. , output_device_type The Oracle 11g Backup and Recovery User's Guide explains how to DUPLICATE a database. RMAN VALIDATE with DISK_ASYNCH_IO = FALSE, MAXOPENFILES = 8 (default setting) and DBWR_IO_SLAVES = 0. The high bandwidth of FlashBlade, together with RMAN's DUPLICATE feature, can be used to clone databases very quickly from the periodic backups. from v$session_longops By default, the value is 0 and I/O server processes are not used. I came across this issue recently, as one of my clients had to disable asynchronous I/O on Solaris with ZFS due to I/O performance issues and high CPU usage.
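The recommendation running through this article (when DISK_ASYNCH_IO is FALSE, set DBWR_IO_SLAVES to a non-zero value so that I/O slaves simulate asynchronous I/O) is easy to express as a sanity check. The helper below is a hypothetical sketch, not an Oracle tool; it takes the two parameter values as plain Python arguments and returns any warnings.

```python
def check_io_settings(disk_asynch_io, dbwr_io_slaves):
    """Flag the combination the article warns about.

    disk_asynch_io: bool, value of DISK_ASYNCH_IO (default TRUE per the text)
    dbwr_io_slaves: int, value of DBWR_IO_SLAVES (default 0 per the text)
    """
    if dbwr_io_slaves < 0:
        raise ValueError("DBWR_IO_SLAVES cannot be negative")
    warnings = []
    if not disk_asynch_io and dbwr_io_slaves == 0:
        warnings.append(
            "DISK_ASYNCH_IO is FALSE and DBWR_IO_SLAVES is 0: "
            "RMAN reads are fully synchronous; consider enabling I/O slaves"
        )
    return warnings


# The three VALIDATE configurations tested in the article use
# DBWR_IO_SLAVES = 0 or 4 with DISK_ASYNCH_IO = FALSE.
for slaves in (0, 4):
    print(slaves, check_io_settings(False, slaves))
```

Only the first combination produces a warning, matching the test runs where synchronous pread calls were issued by the RMAN shadow process itself.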