Tuesday, July 13, 2010

Oracle RAC and TAF to Guarantee availability

One of the most exciting new features in Oracle Database is Real Application Clusters (RAC). The Oracle RAC solution delivers 24/7 database availability, performance, and scalability. Cache Fusion is the key memory feature that enables Oracle RAC performance, and the new Transparent Application Failover (TAF) is what applications use to sync up with Oracle RAC availability. This article explores the cooperation between Oracle RAC, Cache Fusion, and TAF and offers insights into the architecture and use of these tools for continuous availability and infinite scalability.

Oracle RAC Architecture

Oracle has long recognized that a clustered environment is the best protection against hardware and software failure. In a clustered environment, many Oracle instances exist on separate servers, each with direct connectivity to a single Oracle database. Should any single server or instance fail, processing continues on the surviving servers.

Cache Fusion and Oracle RAC

The introduction of the Cache Fusion shared RAM cache for multiple Oracle instances is a breakthrough in clustered solutions. Oracle RAC fully implements Cache Fusion, which both provides high performance and enables continuous cluster availability. The high-availability capability of Oracle RAC is almost unfathomable. It's estimated that in a 12-computer configuration, any application running on Oracle RAC will not experience a catastrophic failure for well over 100,000 years.

Cache Fusion technology changes the internal configuration of the Oracle system global area (SGA). Cache Fusion moves the RAM data buffers from local RAM storage into a shared RAM area accessible by all Oracle instances.

Beyond high performance and high availability, Oracle RAC offers significant benefits as a scalability tool. Whenever the processing load becomes excessive in an existing Oracle RAC cluster, you can add additional processors—each with its own Oracle instance—to the Oracle RAC configuration. This allows companies to start small and scale infinitely as processing demands increase.

Oracle RAC and Hardware Failover

To detect a node failure, the Cluster Manager uses a background process—Global Enqueue Service Monitor (LMON)—to monitor the health of the cluster. When a node fails, the Cluster Manager reports the change in the cluster's membership to Global Cache Services (GCS) and Global Enqueue Service (GES). These services are then remastered based on the current membership of the cluster.

To successfully remaster the cluster services, Oracle RAC keeps track of all resources and resource states on each node and then uses this information to restart these resources on a backup node.

These processes also manage the state of in-flight transactions and work with TAF to either restart or resume the transactions on the new node. Now let's see how Oracle RAC and TAF work together to ensure that a server failure does not cause an unplanned service interruption.

Using Transparent Application Failover

After an Oracle RAC node crashes—usually from a hardware failure—all new application transactions are automatically rerouted to a specified backup node. The challenge in rerouting is to not lose transactions that were "in flight" at the exact moment of the crash. One of the requirements of continuous availability is the ability to restart in-flight application transactions, allowing a failed node to resume processing on another server without interruption. Oracle's answer to application failover is a new Oracle Net mechanism dubbed Transparent Application Failover. TAF allows the DBA to configure the type and method of failover for each Oracle Net client.

For an application to use TAF, it must use failover-aware API calls from the Oracle Call Interface (OCI). Inside OCI are TAF callback routines that can be used to make any application failover-aware.

While the concept of failover is simple, providing an apparent instant failover can be extremely complex, because there are many ways to restart in-flight transactions. The TAF architecture offers the ability to restart transactions at either the transaction (SELECT) or session level:

SELECT failover. With SELECT failover, Oracle Net keeps track of all SELECT statements issued during the transaction, tracking how many rows have been fetched back to the client for each cursor associated with a SELECT statement. If the connection to the instance is lost, Oracle Net establishes a connection to another Oracle RAC node and re-executes the SELECT statements, repositioning the cursors so the client can continue fetching rows as if nothing has happened. The SELECT failover approach is best for data warehouse systems that perform complex and time-consuming transactions.
SESSION failover. When the connection to an instance is lost, SESSION failover results only in the establishment of a new connection to another Oracle RAC node; any work in progress is lost. SESSION failover is ideal for online transaction processing (OLTP) systems, where transactions are small.
Oracle TAF also offers choices on how to restart a failed transaction. The Oracle DBA may choose one of the following failover methods:

BASIC failover. In this approach, the application connects to a backup node only after the primary connection fails. This approach has low overhead, but the end user experiences a delay while the new connection is created.
PRECONNECT failover. In this approach, the application simultaneously connects to both a primary and a backup node. This offers faster failover, because a pre-spawned connection is ready to use. But the extra connection adds everyday overhead by duplicating connections.
Currently, TAF will fail over standard SQL SELECT statements that have been caught during a node crash in an in-flight transaction failure. In the current release of TAF, however, TAF must restart some types of transactions from the beginning of the transaction.

The following types of transactions do not automatically fail over and must be restarted by TAF:

Transactional statements. Transactions involving INSERT, UPDATE, or DELETE statements are not supported by TAF.
ALTER SESSION statements. ALTER SESSION and SQL*Plus SET statements do not fail over.
The following do not fail over and cannot be restarted:
Temporary objects. Transactions using temporary segments in the TEMP tablespace and global temporary tables do not fail over.
PL/SQL package states. PL/SQL package states are lost during failover.
Using Oracle RAC and TAF Together

The continuous availability features of Oracle RAC and TAF come together when these products cooperate in restarting failed transactions. Let's take a closer look at how this works.

Within each connected Oracle Net client, tnsnames.ora file parameters define the failover types and methods for that client. The parameters direct Oracle RAC and TAF on how to restart any transactions that may be in-flight during a hardware failure on the node.

It is important to note that TAF failover control is external to the Oracle RAC cluster, and each Oracle Net client may have unique failover types and methods, depending on processing requirements. The following is a client tnsnames.ora file entry for a node, including its current TAF failover parameters:



bubba.world =
(DESCRIPTION_LIST =
(FAILOVER = true)
(LOAD_BALANCE = true)
(DESCRIPTION =
(ADDRESS =
(PROTOCOL = TCP)
(HOST = redneck)(PORT = 1521))
(CONNECT_DATA =
(SERVICE_NAME = bubba)
(SERVER = dedicated)
(FAILOVER_MODE =
(BACKUP=cletus)
(TYPE=select)
(METHOD=preconnect)
(RETRIES=20)
(DELAY=3)
)
)
)



The failover_mode section of the tnsnames.ora file lists the parameters and their values:

BACKUP=cletus. This names the backup node that will take over failed connections when a node crashes. In this example, the primary server is bubba, and TAF will reconnect failed transactions to the cletus instance in case of server failure.

TYPE=select. This tells TAF to restart all in-flight transactions from the beginning of the transaction (and not to track cursor states within each transaction).

METHOD=preconnect. This directs TAF to create two connections at transaction startup time: one to the primary bubba database and a backup connection to the cletus database. In case of instance failure, the cletus database will be ready to resume the failed transaction.

RETRIES=20. This directs TAF to retry a failover connection up to 20 times.

DELAY=3. This tells TAF to wait three seconds between connection retries.

Remember, you must set these TAF parameters in every tnsnames.ora file on every Oracle Net client that needs transparent failover.

Putting It All Together

An Oracle Net client can be a single PC or a huge application server. In the architectures of giant Oracle RAC systems, each application server has a customized tnsnames.ora file that governs the failover method for all connections that are routed to that application server.

Watching TAF in Action

The transparency of TAF operation is a tremendous advantage to application users, but DBAs need to quickly see what has happened and where failover traffic is going, and they need to be able to get the status of failover transactions. To provide this capability, the Oracle data dictionary has several new columns in the V$SESSION view that give the current status of failover transactions.

The following query calls the new FAILOVER_TYPE, FAILOVER_METHOD, and FAILED_OVER columns of the V$SESSION view. Be sure to note that the query is restricted to nonsystem sessions, because Oracle data definition language (DDL) and data manipulation language (DML) are not recoverable with TAF.



select
username,
sid,
serial#,
failover_type,
failover_method,
failed_over
from
v$session
where
username not in ('SYS','SYSTEM',
'PERFSTAT')
and
failed_over = 'YES';



You can run this script against the backup node after an instance failure to see those transactions that have been reconnected with TAF. Remember, TAF will quickly redirect transactions, so you'll only see entries for a short period of time immediately after the failover. A backup node can have a variety of concurrent failover transactions, because the tnsnames.ora file on each Oracle Net client specifies the backup node, the failover type, and the failover method.

Conclusion

Oracle RAC, TAF, and Cache Fusion work together to guarantee continuous availability and infinite scalability. To summarize, here's a short description of each component:

Oracle RAC. The clustering component of Oracle that allows the creation of multiple, independent Oracle instances, all sharing a single database.

Cache Fusion. The shared RAM component of Oracle RAC that provides fast interchange of Oracle data blocks between SGA regions.

TAF. The failover method implemented on the Oracle Net client to restart in-flight transactions when a node crashes.

Monday, May 31, 2010

Convert a Single Instance Database to Oracle RAC 10g on RHEL3

Overview
The RAC cluster comprises two Intel x86 servers running on RHEL3 (Kernel 2.4.21-27). Each node has access to a shared storage and connectivity to the public and private network.
This guide is structured into the following steps:
Preliminary Installation
Migrating Your Database to ASM
Installing Oracle Cluster Ready Services (CRS) Software
Installing Oracle RAC Software
Post Installation
Testing Transparent Application Failover (TAF)
Unless otherwise specified, you should execute all steps on both nodes.
Here's an overview of our single-instance database environment before converting to RAC:


Host Name Instance Name Database Name $ORACLE_BASE Database File Storage
salmon1 prod1 prod1 /u01/app/oracle ext3

And an overview of the RAC database environment:


Host Name Instance Name Database Name $ORACLE_BASE Database File Storage OCR & CRS Voting Disk
salmon1 prod1a prod1 /u01/app/oracle ASM Raw
salmon2 prod1b prod1 /u01/app/oracle ASM Raw

You'll install the Oracle Home on each node for redundancy.The ASM and RAC instances share the same Oracle Home on each node.
Step 1: Preliminary Installation
1a. Verify software package versions.
Install the required packages. Additional information can be obtained from the documentation.
[root@salmon1]# rpm -qa | grep -i compat
compat-libstdc++-7.3-2.96.128
compat-gcc-c++-7.3-2.96.128
compat-libstdc++-devel-7.3-2.96.128
compat-db-4.0.14-5
compat-glibc-7.x-2.2.4.32.6
compat-slang-1.4.5-5
compat-gcc-7.3-2.96.128
compat-pwdb-0.62-3
[root@salmon1]#
[root@salmon1]# rpm -qa | grep openmotif
openmotif-2.2.3-3.RHEL3
[root@salmon1]#
[root@salmon1]# rpm -qa | grep -i gcc
gcc-gnat-3.2.3-42
gcc-c++-ssa-3.5ssa-0.20030801.48
compat-gcc-c++-7.3-2.96.128
libgcc-ssa-3.5ssa-0.20030801.48
gcc-3.2.3-42
gcc-g77-3.2.3-42
gcc-java-3.2.3-42
gcc-ssa-3.5ssa-0.20030801.48
gcc-g77-ssa-3.5ssa-0.20030801.48
gcc-objc-ssa-3.5ssa-0.20030801.48
libgcc-3.2.3-42
gcc-c++-3.2.3-42
gcc-objc-3.2.3-42
gcc-java-ssa-3.5ssa-0.20030801.48
compat-gcc-7.3-2.96.128
1b. Verify kernel parameters.
Verify the following kernel parameters. Additional information can be obtained from the documentation.
[root@salmon1]# sysctl -a | grep shm
kernel.shmmni = 4096
kernel.shmall = 2097152
kernel.shmmax = 2147483648
[root@salmon1]# sysctl -a | grep sem
kernel.sem = 250 32000 100 128
[root@salmon1]# sysctl -a | grep -i ip_local
net.ipv4.ip_local_port_range = 1024 65000
[root@salmon1]# sysctl -a | grep -i file-max
fs.file-max = 65536
1c. Create the Oracle Base directory, oracle user, and groups.
Using the information below on node 1, create the oracle user and the oinstall and dba groups on the second node.
[oracle@salmon1]$ hostname
salmon1.dbsconsult.com
[oracle@salmon1]$
[oracle@salmon1]$ id
uid=500(oracle) gid=500(dba) groups=500(dba),501(oinstall)
[oracle@salmon1]$
[oracle@salmon1]$ echo $ORACLE_BASE
/u01/app/oracle
1d. Edit the oracle user environment file.
[oracle@salmon1]$ more .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi

export PATH=$PATH:$HOME/bin
export ORACLE_SID=prod1
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.1.0
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export PATH=$ORACLE_HOME/bin:$PATH
umask 022
1e. Configure the oracle user shell limits.
[root@salmon1]# more /etc/security/limits.conf
* soft nproc 2047
* hard nproc 16384
* soft nofile 1024
* hard nofile 65536
[root@salmon1]# grep pam_limits /etc/pam.d/login
session required /lib/security/pam_limits.so
1f. Configure public and private network.
Using the information below, make the necessary changes to network interface devices eth0 (public) and eth1 (private).
[root@salmon1]# redhat-config-network

Host Name IP Address Type
salmon1.dbsconsult.com 192.168.0.184 Public (eth0)
salmon2.dbsconsult.com 192.168.0.185 Public (eth0)
salmon1.dbsconsult.com 10.10.10.84 Private (eth1)
salmon2.dbsconsult.com 10.10.10.85 Private (eth1)
salmon1-vip.dbsconsult.com 192.168.0.186 Virtual
salmon2-vip.dbsconsult.com 192.168.0.187 Virtual
1g. Edit the /etc/hosts file.
127.0.0.1 localhost.localdomain localhost
10.10.10.84 sallocal1.dbsconsult.com sallocal1
10.10.10.85 sallocal2.dbsconsult.com sallocal2
192.168.0.184 salmon1.dbsconsult.com salmon1
192.168.0.185 salmon2.dbsconsult.com salmon2
192.168.0.186 salmon1-vip.dbsconsult.com salmon1-vip
192.168.0.187 salmon2-vip.dbsconsult.com salmon2-vip
Verify the hostname and the configured network interface devices.
[root@salmon1]# hostname
salmon1.dbsconsult.com
[root@salmon1]# /sbin/ifconfig
1h. Establish user equivalence with SSH.
During the Cluster Ready Services (CRS) and RAC installation, the Oracle Universal Installer (OUI) has to be able to copy the software as oracle to all RAC nodes without being prompted for a password. In Oracle 10g, this can be accomplished using ssh instead of rsh.
To establish user equivalence, generate the user's public and private keys as the oracle user on both nodes.
[oracle@salmon1]$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/oracle/.ssh/id_dsa):
Created directory '/home/oracle/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/oracle/.ssh/id_dsa.
Your public key has been saved in /home/oracle/.ssh/id_dsa.pub.
The key fingerprint is:
5d:8c:42:97:eb:42:ae:52:52:e9:59:20:2a:d3:6f:59 oracle@salmon1.dbsconsult.com
Test the connection on every node. Verify that you are not prompted for password when you run the following the second time.
ssh salmon1 date
ssh salmon2 date
ssh sallocal1 date
ssh sallocal2 date
ssh salmon1.dbsconsult.com date
ssh salmon2.dbsconsult.com date
ssh sallocal1.dbsconsult.com date
ssh sallocal2.dbsconsult.com date
1i. Configure hangcheck timer kernel module.
The hangcheck timer kernel module monitors the system's health and restarts a failing RAC node. It uses two parameters, hangcheck_tick (defines the system checks frequency) and hangcheck_margin (defines the maximum hang delay before a RAC node is reset), to determine if a node is failing.
Add the following line in /etc/rc.d/rc.local to load the hangcheck module automatically.
[root@salmon1]# grep insmod /etc/rc.d/rc.local
insmod hangcheck-timer hangcheck_tick=30 hangcheck_margin=180
1j. Recreate database control file.
Make sure that the entries below are sized appropriately in the control file before converting to RAC. If required, recreate the database control file with the right settings.
MAXLOGFILES
MAXLOGMEMBERS
MAXDATAFILES
MAXINSTANCES
MAXLOGHISTORY
1k. Resize the database buffer cache.
When transitioning from a single instance database to RAC, additional memory is required for the database buffer cache. In RAC, space is allocated for the Global Cache Service (GCS) in every block buffer cache. The amount of additional memory requires depends on how the application accesses the data—that is, if the same block is cached in more than one instance.
I observed an increase of about 8% buffer cache usage during an experimental demonstration. Use the buffer cache advisory to determine an optimal buffer cache size or let Oracle take control by switching to Oracle Automatic Shared Memory Management (ASMM).


Step 2: Migrate Your Database to ASM
The Oracle Cluster Synchronization Services (CSS) daemon is required for synchronization between the ASM instance and the database instances. The CSS daemon must be up before the ASM instance can be started. When you installed or upgraded to Oracle Database 10g, the CSS should have already been configured for the single-node version and it should start automatically when the system reboots.

[root@salmon1]# ps –ef | grep ocssd.bin | wc –l
17
2a. Download ASM RPMs.
Download the following ASM RPMs from OTN:
oracleasm-support-2.0.0-1.i386.rpm
oracleasm-2.4.21-27.EL-1.0.4-2.i686.rpm (driver for UP kernel) or oracleasm-2.4.21-27.ELsmp-1.0.4-1.i686.rpm (driver for SMP kernel)
oracleasmlib-2.0.0-1.i386.rpm
2b. Install the ASM RPMs.
Install the ASM RPMs as the root user.
[root@salmon1]# rpm -i oracleasm-support-2.0.0-1.i386.rpm
[root@salmon1]# rpm -i oracleasm-2.4.21-27.EL-1.0.4-2.i686.rpm
Linking module oracleasm.o into the module path [ OK ]
[root@salmon1]# rpm -i oracleasmlib-2.0.0-1.i386.rpm
[root@salmon1]# rpm -qa | grep oracleasm
oracleasm-2.4.21-27.EL-1.0.4-2
oracleasm-support-2.0.0-1
oracleasmlib-2.0.0-1
2c. Configure ASM.
Configure the ASMLib as the root user.
[root@salmon1]# /etc/init.d/oracleasm configure
Configuring the Oracle ASM library driver.

This will configure the on-boot properties of the Oracle ASM library
driver. The following questions will determine whether the driver is
loaded on boot and what permissions it will have. The current values
will be shown in brackets ('[]'). Hitting without typing an
answer will keep that current value. Ctrl-C will abort.

Default user to own the driver interface []: oracle
Default group to own the driver interface []: dba
Start Oracle ASM library driver on boot (y/n) [n]: y
Fix permissions of Oracle ASM disks on boot (y/n) [y]: y
Writing Oracle ASM library driver configuration [ OK ]
Creating /dev/oracleasm mount point [ OK ]
Loading module "oracleasm" [ OK ]
Mounting ASMlib driver filesystem [ OK ]
Scanning system for ASM disks
2d. Create ASM disks.
Create the ASM disks on any one node as the root user.
[root@salmon1]# /etc/init.d/oracleasm createdisk VOL1 /dev/sdg5
Marking disk "/dev/sdg5" as an ASM disk [ OK ]
[root@salmon1]# /etc/init.d/oracleasm createdisk VOL2 /dev/sdg6
Marking disk "/dev/sdg6" as an ASM disk [ OK ]
[root@salmon1]# /etc/init.d/oracleasm createdisk VOL3 /dev/sdg7
Marking disk "/dev/sdg7" as an ASM disk [ OK ]
[root@salmon1]# /etc/init.d/oracleasm createdisk VOL4 /dev/sdg8
Marking disk "/dev/sdg8" as an ASM disk [ OK ]
[root@salmon1]# /etc/init.d/oracleasm createdisk VOL5/dev/sdg9
Marking disk "/dev/sdg9" as an ASM disk [ OK ]
[root@salmon1]# /etc/init.d/oracleasm createdisk VOL6 /dev/sdg10
Marking disk "/dev/sdg10" as an ASM disk [ OK ]
[root@salmon1]# /etc/init.d/oracleasm createdisk VOL7 /dev/sdg11
Marking disk "/dev/sdg11" as an ASM disk [ OK ]
[root@salmon1]# /etc/init.d/oracleasm createdisk VOL8 /dev/sdg12
Marking disk "/dev/sdg12" as an ASM disk [ OK ]
[root@salmon1]# /etc/init.d/oracleasm createdisk VOL9 /dev/sdg13
Marking disk "/dev/sdg13" as an ASM disk [ OK ]
[root@salmon1]# /etc/init.d/oracleasm createdisk VOL10 /dev/sdg14
Marking disk "/dev/sdg14" as an ASM disk [ OK ]
Verify that the ASM disks are visible from every node.
[root@salmon1]# /etc/init.d/oracleasm listdisks
VOL1
VOL2
VOL3
VOL4
VOL5
VOL6
VOL7
VOL8
VOL9
VOL10
[root@salmon1]# /etc/init.d/oracleasm scandisks
Scanning system for ASM disks [ OK ]
2e. Create ASM init.ora.
On the first node, create an init+ASM1A.ora file in $ORACLE_HOME/dbs with the following parameters:
#asm_diskgroups='DG1', ‘DG2’, ‘RECOVERYDEST’
asm_diskstring='ORCL:VOL*'
background_dump_dest=/u01/app/oracle/admin/+ASM/bdump
core_dump_dest=/u01/app/oracle/admin/+ASM/cdump
user_dump_dest=/u01/app/oracle/admin/+ASM/udump
instance_type=asm
large_pool_size=16M
remote_login_passwordfile=exclusive
+ASM1A.instance_number=1
+ASM1B.instance_number=2
2f. Create ASM password file.
Using the orapwd utility, create an orapw+ASM1A file in $ORACLE_HOME/dbs on the first node.
[oracle@salmon1]$ cd $ORACLE_HOME/dbs
[oracle@salmon1]$ orapwd file=orapw+ASM1A password=sys entries=5
2g. Create the first ASM instance.
Create the first ASM instance on the first node. The second ASM instance will be created in Post Installation after the CRS software is installed on the second node.
[oracle@salmon1]$ export ORACLE_SID=+ASM1A
[oracle@salmon1]$ sqlplus / as sysdba

SQL*Plus: Release 10.1.0.3.0 - Production on Thu May 26 05:51:07 2005

Copyright (c) 1982, 2004, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 104857600 bytes
Fixed Size 777616 bytes
Variable Size 104079964 bytes
Database Buffers 0 bytes
Redo Buffers 0 bytes
ORA-15110: no diskgroups mounted
2h. Create the ASM instance spfile.
Create a spfile immediately after the ASM instance starts. With spfile, any newly created disk groups are automatically added to the spfile.
SQL> create spfile from pfile;

File created.
2i. Create disk groups.
Create three disk groups: DG1, DG2, and RECOVERYDEST. DG1 and DG2 will be used to store Oracle data files and redo logs. RECOVERYDEST will be used as the flash recovery area.
SQL> create diskgroup dg1 normal redundancy
2 failgroup fg1a disk
3 'ORCL:VOL1','ORCL:VOL2'
4 failgroup fg1b disk
5 'ORCL:VOL3','ORCL:VOL4';

Diskgroup created.

SQL> create diskgroup dg2 normal redundancy
2 failgroup fg2a disk
3 'ORCL:VOL5','ORCL:VOL6'
4 failgroup fg2b disk
5 'ORCL:VOL7','ORCL:VOL8';

Diskgroup created.

SQL> create diskgroup recoverydest normal redundancy
2 failgroup fgrd1 disk
3 'ORCL:VOL9'
4 failgroup fgrd2 disk
5 'ORCL:VOL10';

Diskgroup created.

SQL> show parameter diskgroup

NAME TYPE VALUE
------------------------------ ----------- -----------------------------
asm_diskgroups string DG1, DG2, RECOVERYDEST

SQL> select name,total_mb from v$asm_diskgroup;

NAME TOTAL_MB
------------------------- -------------------
DG1 36864
DG2 36864
RECOVERYDEST 73728

3 rows selected.

SQL> select name,path,failgroup from v$asm_disk;

NAME PATH FAILGROUP
-------- --------------- ----------------
VOL1 ORCL:VOL1 FG1A
VOL10 ORCL:VOL10 FGRD2
VOL2 ORCL:VOL2 FG1A
VOL3 ORCL:VOL3 FG1B
VOL4 ORCL:VOL4 FG1B
VOL5 ORCL:VOL5 FG2A
VOL6 ORCL:VOL6 FG2A
VOL7 ORCL:VOL7 FG2B
VOL8 ORCL:VOL8 FG2B
VOL9 ORCL:VOL9 FGRD1

10 rows selected.
2j. Configure flash recovery area.

SQL> connect sys/sys@prod1 as sysdba
Connected.
SQL> alter database disable block change tracking;

Database altered.

SQL> alter system set db_recovery_file_dest_size=72G;

System altered.

SQL> alter system set db_recovery_file_dest=’+RECOVERYDEST’;

System altered.
2k. Migrate data files to ASM.
You must use RMAN to migrate the data files to ASM disk groups. All data files will be migrated to the newly created disk group, DG1. The redo logs and control files are created in DG1 and DG2. In a production environment, you should store redo logs on different set of disks and disk controllers from the rest of the Oracle data files.
SQL> connect sys/sys@prod1 as sysdba
Connected.
SQL> alter system set db_create_file_dest=’+DG1’;

System altered.

SQL> alter system set control_files='+DG1/cf1.dbf' scope=spfile;

System altered.

SQL> shutdown immediate;

[oracle@salmon1]$ rman target /

RMAN> startup nomount;

Oracle instance started

Total System Global Area 419430400 bytes
Fixed Size 779416 bytes
Variable Size 128981864 bytes
Database Buffers 289406976 bytes
Redo Buffers 262144 bytes

RMAN> restore controlfile from '/u02/oradata/prod1/control01.ctl';

Starting restore at 26-MAY-05
using target database controlfile instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=160 devtype=DISK

channel ORA_DISK_1: copied controlfile copy
output filename=+DG1/cf1.dbf
Finished restore at 26-MAY-05

RMAN> alter database mount;

database mounted
released channel: ORA_DISK_1

RMAN> backup as copy database format '+DG1';

Starting backup at 26-MAY-05
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=160 devtype=DISK
channel ORA_DISK_1: starting datafile copy
input datafile fno=00001 name=/u02/oradata/prod1/system01.dbf
output filename=+DG1/prod1/datafile/system.257.1 tag=TAG20050526T073442 recid=1 stamp=559294642
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:02:49
channel ORA_DISK_1: starting datafile copy
input datafile fno=00003 name=/u02/oradata/prod1/sysaux01.dbf
output filename=+DG1/prod1/datafile/sysaux.258.1 tag=TAG20050526T073442 recid=2 stamp=559294735
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:01:26
channel ORA_DISK_1: starting datafile copy
input datafile fno=00002 name=/u02/oradata/prod1/undotbs01.dbf
output filename=+DG1/prod1/datafile/undotbs1.259.1 tag=TAG20050526T073442 recid=3 stamp=559294750
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:15
channel ORA_DISK_1: starting datafile copy
input datafile fno=00004 name=/u02/oradata/prod1/users01.dbf
output filename=+DG1/prod1/datafile/users.260.1 tag=TAG20050526T073442 recid=4 stamp=559294758
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:07
channel ORA_DISK_1: starting datafile copy
copying current controlfile
output filename=+DG1/prod1/controlfile/backup.261.1 tag=TAG20050526T073442 recid=5 stamp=559294767
channel ORA_DISK_1: datafile copy complete, elapsed time: 00:00:08
Finished backup at 26-MAY-05

RMAN> switch database to copy;

datafile 1 switched to datafile copy "+DG1/prod1/datafile/system.257.1"
datafile 2 switched to datafile copy "+DG1/prod1/datafile/undotbs1.259.1"
datafile 3 switched to datafile copy "+DG1/prod1/datafile/sysaux.258.1"
datafile 4 switched to datafile copy "+DG1/prod1/datafile/users.260.1"

RMAN> alter database open;

database opened

RMAN> exit

SQL> connect sys/sys@prod1 as sysdba
Connected.
SQL> select tablespace_name, file_name from dba_data_files;

TABLESPACE FILE_NAME
--------------------- -----------------------------------------
USERS +DG1/prod1/datafile/users.260.1
SYSAUX +DG1/prod1/datafile/sysaux.258.1
UNDOTBS1 +DG1/prod1/datafile/undotbs1.259.1
SYSTEM +DG1/prod1/datafile/system.257.1
2l. Migrate temp tablespace to ASM.
SQL> alter tablespace temp add tempfile size 100M;

Tablespace altered.

SQL> select file_name from dba_temp_files;

FILE_NAME
-------------------------------------
+DG1/prod1/tempfile/temp.264.3
2m. Migrate redo logs to ASM.
Drop existing redo logs and recreate them in ASM disk groups, DG1 and DG2.
SQL> alter system set db_create_online_log_dest_1='+DG1';

System altered.

SQL> alter system set db_create_online_log_dest_2='+DG2';

System altered.

SQL> select group#, member from v$logfile;

GROUP# MEMBER
--------------- ----------------------------------
1 /u03/oradata/prod1/redo01.log
2 /u03/oradata/prod1/redo02.log

SQL> alter database add logfile group 3 size 10M;

Database altered.

SQL> alter system switch logfile;

System altered.

SQL> alter database drop logfile group 1;

Database altered.

SQL> alter database add logfile group 1 size 100M;

Database altered.

SQL> alter database drop logfile group 2;

Database altered.

SQL> alter database add logfile group 2 size 100M;

Database altered.

SQL> alter system switch logfile;

System altered.

SQL> alter database drop logfile group 3;

Database altered.

SQL> select group#, member from v$logfile;

GROUP# MEMBER
--------------- ----------------------------------------
1 +DG1/prod1/onlinelog/group_1.265.3
1 +DG2/prod1/onlinelog/group_1.257.1
2 +DG1/prod1/onlinelog/group_2.266.3
2 +DG2/prod1/onlinelog/group_2.258.1
2n. Create pfile from spfile.
Create and retain a copy of the database pfile. You'll add more RAC specific parameters to the pfile later, in the Post Installation.
SQL> connect sys/sys@prod1 as sysdba
Connected.
SQL> create pfile='/tmp/tmppfile.ora' from spfile;

File created.
2o. Add additional control file.
If an additional control file is required for redundancy, you can create it in ASM as you would on any other filesystem.
SQL> connect sys/sys@prod1 as sysdba
Connected to an idle instance.
SQL> startup mount
ORACLE instance started.

Total System Global Area 419430400 bytes
Fixed Size 779416 bytes
Variable Size 128981864 bytes
Database Buffers 289406976 bytes
Redo Buffers 262144 bytes
Database mounted.

SQL> alter database backup controlfile to '+DG2/cf2.dbf';

Database altered.

SQL> alter system set control_files='+DG1/cf1.dbf ','+DG2/cf2.dbf' scope=spfile;

System altered.

SQL> shutdown immediate;
ORA-01109: database not open

Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 419430400 bytes
Fixed Size 779416 bytes
Variable Size 128981864 bytes
Database Buffers 289406976 bytes
Redo Buffers 262144 bytes
Database mounted.
Database opened.
SQL> select name from v$controlfile;

NAME
---------------------------------------
+DG1/cf1.dbf
+DG2/cf2.dbf
After successfully migrating all the data files over to ASM, the old data files are no longer needed and can be removed. Your single-instance database is now running on ASM!

Step 3: Install Cluster Ready Services (CRS) Software
CRS requires two files—the Oracle Cluster Registry (OCR) and the Voting Disk—on shared raw devices or Oracle Cluster File System (OCFS). These files must be accessible to all nodes in the cluster. Raw devices are used here to house both files.
3a. Create OCR and Voting Disk.
The storage size for the OCR should be at least 100MB and the storage size for the voting disk should be at least 20MB.

File Raw Device Disk Partition Filename Size (MB)
OCR /dev/raw/raw11 /dev/sde1 /u02/oradata/prod1/ocr 100
Voting Disk /dev/raw/raw12 /dev/sde2 /u02/oradata/prod1/vdisk 20
[root@salmon1]# more /etc/sysconfig/rawdevices
/dev/raw/raw11 /dev/sde1
/dev/raw/raw12 /dev/sde2
[root@salmon1]# chown oracle:dba /dev/raw/raw11
[root@salmon1]# chown oracle:dba /dev/raw/raw12
[root@salmon1]# /sbin/service rawdevices restart
Assigning devices:
/dev/raw/raw11 --> /dev/sde1
/dev/raw/raw11: bound to major 8, minor 65
/dev/raw/raw12 --> /dev/sde2
/dev/raw/raw12: bound to major 8, minor 66
done
[root@salmon1]# su - oracle
[oracle@salmon1]$ ln -s /dev/raw/raw11 /u02/oradata/prod1/ocr
[oracle@salmon1]$ ln -s /dev/raw/raw12 /u02/oradata/prod1/vdisk
3b. Install CRS software.
Before installing the CRS software, shut down the listener, database, and ASM instance. Mount the CRS CD or download the software from OTN. The OUI should be launched on only the first node. During installation, the installer automatically copies the software to the second node.
[oracle@salmon1]$ export ORACLE_BASE=/u01/app/oracle
[oracle@salmon1]$ /mnt/cdrom/runInstaller
Welcome - Click on "Next."
Specify File Locations:
Name: OraCr10g_home1
Path: /u01/app/oracle/product/10.1.0/crs_1
Language Selection - English
Cluster Configuration:
Cluster Name: crs
Public Node Name: salmon1, Private Node Name: sallocal1
Public Node Name: salmon2, Private Node Name: sallocal2
Private Interconnect Enforcement:
Interface Name: eth0, Subnet: 192.168.0.0, Interface Type: Public
Interface Name: eth1, Subnet: 10.10.10.0, Interface Type: Private
Oracle Cluster Registry:
Specify OCR Location: /u02/oradata/prod1/ocr
Voting Disk:
Enter Voting disk filename: /u02/oradata/prod1/vdisk
Execute /u01/app/oracle/oraInventory/orainstRoot.sh script from another window as the root user on the first node.
Execute /u01/app/oracle/oraInventory/orainstRoot.sh script from another window as the root user on the second node.
Click on "Continue" after the orainstRoot.sh has been run on both nodes.
Summary - Click on "Install"
Execute /u01/app/oracle/product/10.1.0/crs_1/root.sh script from another window as the root user on the first node. Do not run the next step until this step completes.
Execute /u01/app/oracle/product/10.1.0/crs_1/root.sh script from another window as the root user on the second node.
Click on "OK" after the root.sh has been run on both nodes.
End of Installation - Click on "Exit."
[oracle@salmon1]$ /u01/app/oracle/product/10.1.0/crs_1/bin/olsnodes -n
salmon1 1
salmon2 2
[oracle@salmon1]$ ps -ef | egrep "css|crs|evm"

Step 4: Install Oracle RAC Software
4a. Edit the oracle user environment file.
On node 1, set ORACLE_SID=prod1a.
On node 2, set ORACLE_SID=prod1b.
[oracle@salmon1]$ more .bash_profile
# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi

export PATH=$PATH:$HOME/bin
export ORACLE_SID=prod1a
export ORACLE_BASE=/u01/app/oracle
export ORACLE_HOME=$ORACLE_BASE/product/10.1.0/db_1
export ORA_CRS_HOME=$ORACLE_BASE/product/10.1.0/crs_1
export LD_LIBRARY_PATH=$ORACLE_HOME/lib
export PATH=$ORACLE_HOME/bin:$PATH
umask 022
4b. Install RAC software.
Mount the Oracle Database 10g Enterprise Edition CD or download the software from OTN. Launch the OUI on only the first node. During installation, the installer automatically copies the software to the second node.
[oracle@salmon1]$ . ~/.bash_profile
[oracle@salmon1]$ /mnt/cdrom/runInstaller
Welcome - Click on "Next."
Specify File Locations:
Name: OraDB10g_home1
Path: /u01/app/oracle/product/10.1.0/db_1
Specify Hardware Cluster Installation Mode:
Select "Cluster Installation"
Click on "Select All"
Select Installation Type:
Select "Enterprise Edition"
Product-specific Prerequisite Checks:
Verify that all checks are successful before proceeding. Ignore the openmotif-2.1.30-11 warning.
Select Database Configuration:
Select "Do not create a starter database"
Summary - Click on "Install"
Execute /u01/app/oracle/product/10.1.0/db_1/root.sh script from another window as the root user on the first node. Do not run root.sh on the second node until the VIP configuration completes.
VIPCA Welcome - Click on "Next"
VIP Configuration Assistance, 1 of 2: Network Interfaces:
Select "eth0"
VIP Configuration Assistance, 2 of 2: Virtual IPs for cluster nodes:
Node Name: salmon1
IP Alias Name: salmon1-vip
IP address: 192.168.0.186
Subnet Mask: 255.255.255.0
Node Name: salmon2
IP Alias Name: salmon2-vip
IP address: 192.168.0.187
Subnet Mask: 255.255.255.0
Summary- Click on "Finish"
Configuration Assistant Progress Dialog - Click on "OK"
Configuration Results - Click on "Exit"
Execute /u01/app/oracle/product/10.1.0/db_1/root.sh script from another window as the root user on the second node.
Click on "OK" after the root.sh has run on both nodes.
End of Installation - Click on "Exit"
4c. Configure Oracle Listener.
The Network Configuration Assistant (NETCA) should only be launched and configured on one node. At the end of the configuration process, the NETCA starts up the Oracle listener on both nodes.
[oracle@salmon1]$ netca
Oracle Net Configuration Assistant: Real Application Clusters, Configuration:
Select "Cluster configuration"
Oracle Net Configuration Assistant: TOPSNodes:
Click "Select all nodes"
Oracle Net Configuration Assistant: Welcome:
Select "Listener configuration"
Oracle Net Configuration Assistant: Listener Configuration, Listener:
Select "Add"
Oracle Net Configuration Assistant: Listener Configuration, Listener Name:
Listener Name: LISTENER
Oracle Net Configuration Assistant: Listener Configuration, Select Protocols
Selected Protocols: TCP
Oracle Net Configuration Assistant: Listener Configuration, TCP/IP Protocol:
Select "Use the standard port number of 1521"
Oracle Net Configuration Assistant: Listener Configuration, More Listeners?
Select "No"
Oracle Net Configuration Assistant: Listener Configuration Done:
Click on "Next"
Oracle Net Configuration Assistant: Welcome
Select "Naming Methods configuration"
Click on "Next"
Oracle Net Configuration Assistant: Naming Methods Configuration, Select Naming:
Select "Local Naming"
Oracle Net Configuration Assistant: Naming Methods Configuration Done:
Click on "Next"
Oracle Net Configuration Assistant: Welcome
Click on "Finish"
4d. Verify status of services.
On node 1:
[oracle@salmon1]$ srvctl status nodeapps -n salmon1
VIP is running on node: salmon1
GSD is running on node: salmon1
Listener is running on node: salmon1
ONS daemon is running on node: salmon1
On node 2:
[oracle@salmon2]$ srvctl status nodeapps -n salmon2
VIP is running on node: salmon2
GSD is running on node: salmon2
Listener is running on node: salmon2
ONS daemon is running on node: salmon2
4e. listener.ora file
On node 1:
[oracle@salmon1]$ more $ORACLE_HOME/network/admin/listener.ora

LISTENER_SALMON1 =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = salmon1-vip)(PORT = 1521))
)
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.184)(PORT = 1521))
)
)
)

SID_LIST_LISTENER_SALMON1 =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /u01/app/oracle/product/10.1.0/db_1)
(PROGRAM = extproc)
)
)
On node 2:
[oracle@salmon2]$ more $ORACLE_HOME/network/admin/listener.ora

LISTENER_SALMON2 =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = salmon2-vip)(PORT = 1521))
)
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = 192.168.0.185)(PORT = 1521))
)
)
)

SID_LIST_LISTENER_SALMON2 =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /u01/app/oracle/product/10.1.0/db_1)
(PROGRAM = extproc)
)
)
4f. tnsnames.ora file
On both nodes:
[oracle@salmon1]$ more $ORACLE_HOME/network/admin/tnsnames.ora
LISTENERS_PROD1 =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = TCP)(HOST = salmon1-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = salmon2-vip)(PORT = 1521))
)

PROD1 =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = salmon1-vip)(PORT = 1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = salmon2-vip)(PORT = 1521))
(LOAD_BALANCE = yes)
(CONNECT_DATA =
(SERVICE_NAME = PROD1)
(FAILOVER_MODE =
(TYPE = SELECT)
(METHOD = BASIC)
(RETRIES = 200)
(DELAY = 5)
)
)
)
PROD1A =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = salmon1-vip)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = PROD1)
(INSTANCE_NAME = PROD1A)
)
)

PROD1B =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = salmon2-vip)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = PROD1)
(INSTANCE_NAME = PROD1B)
)
)

Step 5: Post Installation
5a. Relocate the ASM instance files on the first node.
Copy the ASM instance password file and spfile+ASM1A.ora to the new Oracle Home.
[oracle@salmon1]$ cp /u01/app/oracle/product/10.1.0/dbs/orapw+ASM1A
/u01/app/oracle/product/10.1.0/db_1/dbs
[oracle@salmon1]$ cp /u01/app/oracle/product/10.1.0/dbs/spfile+ASM1A.ora
/u01/app/oracle/product/10.1.0/db_1/dbs
5b. Create init.ora for the second ASM instance.
On the second node, create an init+ASM1B.ora file in $ORACLE_HOME/dbs.
asm_diskgroups='DG1', ‘DG2’, ‘RECOVERYDEST’
asm_diskstring='ORCL:VOL*'
background_dump_dest=/u01/app/oracle/admin/+ASM/bdump
core_dump_dest=/u01/app/oracle/admin/+ASM/cdump
user_dump_dest=/u01/app/oracle/admin/+ASM/udump
instance_type=asm
large_pool_size=16M
remote_login_passwordfile=exclusive
+ASM1A.instance_number=1
+ASM1B.instance_number=2
5c. Create password file for the second ASM instance.
Using the orapwd utility, create an orapw+ASM1B file in $ORACLE_HOME/dbs on the second node.
5d. Create the second ASM instance.
Create the second ASM instance on the second node.
[oracle@salmon2]$ export ORACLE_SID=+ASM1B
[oracle@salmon2]$ sqlplus / as sysdba

SQL*Plus: Release 10.1.0.3.0 - Production on Thu May 26 18:43:14 2005

Copyright (c) 1982, 2004, Oracle. All rights reserved.

Connected to an idle instance.

SQL> startup
ORACLE instance started.

Total System Global Area 104857600 bytes
Fixed Size 777616 bytes
Variable Size 104079964 bytes
Database Buffers 0 bytes
Redo Buffers 0 bytes
ASM diskgroups mounted
SQL> create spfile from pfile;

File created.

SQL> shutdown immediate;
5e. Register the ASM instances with CRS.
For higher availability, register the ASM instances under the CRS framework. When registered, the CRS should detect any failed instances and automatically attempt to start up the instances. The CRS should also automatically start up the instances when the servers are rebooted.
On node 1:

[oracle@salmon1]$ srvctl add asm -n salmon1 -i +ASM1A -o /u01/app/oracle/product/10.1.0/db_1
[oracle@salmon1]$ srvctl start asm -n salmon1
[oracle@salmon1]$ srvctl status asm -n salmon1
ASM instance +ASM1A is running on node salmon1.
On node 2:
[oracle@salmon2]$ srvctl add asm -n salmon2 -i +ASM1B -o /u01/app/oracle/product/10.1.0/db_1
[oracle@salmon2]$ srvctl start asm -n salmon2
[oracle@salmon2]$ srvctl status asm -n salmon2
ASM instance +ASM1B is running on node salmon2.

[oracle@salmon1]$ ps -ef | grep dbw | grep -v grep
oracle 8918 1 0 May 26 ? 00:00:01 asm_dbw0_+ASM1A
[oracle@salmon1]$ kill -9 8918
[oracle@salmon1]$ srvctl status asm –n salmon1
ASM instance +ASM1A is not running on node salmon1.
[oracle@salmon1]$ ps -ef | grep dbw | grep -v grep
[oracle@salmon1]$ srvctl status asm –n salmon1
ASM instance +ASM1A is running on node salmon1.
[oracle@salmon1]$ ps -ef | grep dbw | grep -v grep
oracle 8920 1 0 May 26 ? 00:00:01 asm_dbw0_+ASM1A
5f. Create password files for both RAC instances.
Using the orapwd utility, create an orapwprod1a file in $ORACLE_HOME/dbs on the first node and an orapwprod1b file in $ORACLE_HOME/dbs on the second node.
5g. Add RAC specific parameters in pfile.
Modify the pfile saved in Step 2n. Add and modify the following parameters:
*.cluster_database_instances=2
*.cluster_database=true
*.remote_listener='LISTENERS_PROD1’
prod1a.thread=1
prod1a.instance_number=1
prod1a.undo_tablespace='UNDOTBS1'
prod1b.thread=2
prod1b.instance_number=2
prod1b.undo_tablespace='UNDOTBS2'
5h. Modify init.ora of both RAC instances.
On node 1:
[oracle@salmon1]$ more $ORACLE_HOME/dbs/initprod1a.ora
spfile='+DG1/spfileprod1.ora'
On node 2:
[oracle@salmon2]$ more $ORACLE_HOME/dbs/initprod1b.ora
spfile='+DG1/spfileprod1.ora'
5i. Migrate database spfile to ASM.
SQL> !echo $ORACLE_SID
prod1a

SQL> connect / as sysdba
Connected to an idle instance.
SQL> startup pfile=/tmp/tmppfile.ora
ORACLE instance started.

Total System Global Area 419430400 bytes
Fixed Size 779416 bytes
Variable Size 128981864 bytes
Database Buffers 289406976 bytes
Redo Buffers 262144 bytes
Database mounted.
Database opened.

SQL> create spfile='+DG1/spfileprod1.ora' from pfile='/tmp/tmppfile.ora';

File created.

SQL> shutdown immediate;
Database closed.
Database dismounted.
ORACLE instance shut down.
SQL> startup
ORACLE instance started.

Total System Global Area 419430400 bytes
Fixed Size 779416 bytes
Variable Size 128981864 bytes
Database Buffers 289406976 bytes
Redo Buffers 262144 bytes
Database mounted.
Database opened.
SQL> show parameter spfile

NAME TYPE VALUE
---------------------- ------------ ---------------------------
spfile string +DG1/spfileprod1.ora
5j. Create RAC Data Dictionary Views.
Create the RAC data dictionary views on the first RAC instance.
SQL> !echo $ORACLE_SID
prod1a

SQL> spool /tmp/catclust.log
SQL> @$ORACLE_HOME/rdbms/admin/catclust
...
...
...
SQL> spool off
SQL> shutdown immediate;
5k. Register the RAC instances with CRS.
On node 1:
[oracle@salmon1]$ srvctl add database -d prod1 -o $ORACLE_HOME
[oracle@salmon1]$ srvctl add instance -d prod1 -i prod1a -n salmon1
[oracle@salmon1]$ srvctl add instance -d prod1 -i prod1b -n salmon2
[oracle@salmon1]$ srvctl start instance -d prod1 -i prod1a
5l. Create redo logs for the second RAC instance.
SQL> connect sys/sys@prod1a as sysdba
Connected.
SQL> alter database add logfile thread 2 group 3 size 100M;
SQL> alter database add logfile thread 2 group 4 size 100M;
SQL> select group#, member from v$logfile;

GROUP# MEMBER
--------------- ----------------------------------------
1 +DG1/prod1/onlinelog/group_1.265.3
1 +DG2/prod1/onlinelog/group_1.257.1
2 +DG1/prod1/onlinelog/group_2.266.3
2 +DG2/prod1/onlinelog/group_2.258.1
3 +DG1/prod1/onlinelog/group_3.268.1
3 +DG2/prod1/onlinelog/group_3.259.1
4 +DG1/prod1/onlinelog/group_4.269.1
4 +DG2/prod1/onlinelog/group_4.260.1

8 rows selected.

SQL> alter database enable thread 2;

Database altered.
5m. Create undo tablespace for the second RAC instance.
SQL> create undo tablespace UNDOTBS2 datafile size 200M;

SQL> select tablespace_name, file_name from dba_data_files
2 where tablespace_name=’UNDOTBS2’;

TABLESPACE FILE_NAME
--------------------- --------------------------------------
UNDOTBS2 +DG1/prod1/datafile/undotbs2.270.1
5n. Start up the second RAC instance.
[oracle@salmon1]$ srvctl start instance -d prod1 -i prod1b
[oracle@salmon1]$ crs_stat -t
Name Type Target State Host
-----------------------------------------------------------------------
ora....1a.inst application ONLINE ONLINE salmon1
ora....1b.inst application ONLINE ONLINE salmon2
ora.prod1.db application ONLINE ONLINE salmon1
ora....M1A.asm application ONLINE ONLINE salmon1
ora....M1B.asm application ONLINE ONLINE salmon2
ora....N1.lsnr application ONLINE ONLINE salmon1
ora....on1.gsd application ONLINE ONLINE salmon1
ora....on1.ons application ONLINE ONLINE salmon1
ora....on1.vip application ONLINE ONLINE salmon1
ora....N2.lsnr application ONLINE ONLINE salmon2
ora....on2.gsd application ONLINE ONLINE salmon2
ora....on2.ons application ONLINE ONLINE salmon2
ora....on2.vip application ONLINE ONLINE salmon2
[oracle@salmon1]$ srvctl status database -d prod1
Instance prod1a is running on node salmon1
Instance prod1b is running on node salmon2
[oracle@salmon1]$ srvctl stop database -d prod1
[oracle@salmon1]$ srvctl start database -d prod1
[oracle@salmon1]$ sqlplus system/system@prod1

SQL*Plus: Release 10.1.0.3.0 - Production on Fri May 27 05:53:21 2005

Copyright (c) 1982, 2004, Oracle. All rights reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.1.0.3.0 - Production
With the Partitioning, Real Application Clusters, OLAP and Data Mining options

SQL> select instance_number instance#, instance_name, host_name, status
2 from gv$instance;

INSTANCE# INSTANCE_NAME HOST_NAME STATUS
----------------- ----------------- ------------------------- -----------
1 prod1a salmon1.dbsconsult.com OPEN
2 prod1b salmon2.dbsconsult.com OPEN
Congratulations, you have converted your single-instance database to RAC!


Step 6: Test Transparent Application Failover (TAF)
The failover mechanism in Oracle TAF enables any failed database connections to reconnect to another node within the cluster. The failover is transparent to the user. Oracle re-executes the query on the failed over instance and continues to display the remaining results to the user.
To demonstrate the failover mechanism of the TAF option, connect to two different database sessions and execute these steps:
6a. Connect the first session using the PROD1 service.
If the returned output of failover_type and failover_mode is 'NONE', verify that the PROD1 service is configured correctly in tnsnames.ora.
SQL> connect system/system@prod1 as sysdba
Connected.

SQL> select instance_number instance#, instance_name, host_name, status
2 from v$instance;

INSTANCE# INSTANCE_NAME HOST_NAME STATUS
----------------- -------------- ------------------------- ----------------
1 prod1a salmon1.dbsconsult.com OPEN

SQL> select failover_type, failover_method, failed_over
2 from v$session where username='SYSTEM';

FAILOVER_TYPE FAILOVER_MODE FAILED_OVER
------------------------- -------------------- ----------------------
SELECT BASIC NO
6b. Shut down the instance from another session.
Connect as the sys user on prod1a instance and shut down the instance.
SQL> connect sys/sys@prod1a as sysdba
Connected.

SQL> select instance_number instance#, instance_name, host_name, status
2 from v$instance;

INSTANCE# INSTANCE_NAME HOST_NAME STATUS
----------------- ---------------- ------------------------ ----------------
1 prod1a salmon1.dbsconsult.com OPEN

SQL> shutdown abort;
ORACLE instance shut down.
6c. Verify that the session has failed over.
From the same session in Step 6a, execute the queries below to verify that the session has failed over to another instance.
SQL> select instance_number instance#, instance_name, host_name, status
2 from v$instance;

INSTANCE# INSTANCE_NAME HOST_NAME STATUS
----------------- ---------------- ------------------------ ----------------
2 prod1b salmon2.dbsconsult.com OPEN

SQL> select failover_type, failover_method, failed_over
2 from v$session where username='SYSTEM';

FAILOVER_TYPE FAILOVER_MODE FAILED_OVER
----------------- -------------- ----------------------
SELECT BASIC YES

Conclusion
With proper planning and understanding of the RAC architecture, the transition from a single-instance database to a RAC configuration is not necessarily complex. ASM and RAC compliment each other to provide higher levels of availability, scalability, and business continuity. Hopefully, this guide has provided a clear and concise method of performing the conversion.

Friday, January 8, 2010

BDUMP, UDUMP, ALERT LOG FILES IN ORACLE 11G

The 11g New Features Guide notes important OFA changes, namely the removal of $ORACLE_HOME as an anchor for diagnostic and alert files:

"The database installation process has been redesigned to be based on the ORACLE_BASE environment variable. Until now, setting this variable has been optional and the only required variable has been ORACLE_HOME.

With this feature, ORACLE_BASE is the only required input, and the ORACLE_HOME setting will be derived from ORACLE_BASE."

Let's take a look at changes to the Oracle11g OFA standard.

Enter new admin subdirectories

New in Oracle 11g we see the new ADR (Automatic Diagnostic Repository) and Incident Packaging System, all designed to allow quick access to alert and diagnostic information.

The new $ADR_HOME directory is located by default at $ORACLE_BASE/diag, with the directories for each instance at $ORACLE_HOME/diag/$ORACLE_SID, at the same level as the traditional bdump, udump and cdump directories and the initialization parameters background_dump_dest and user_dump_dest are deprecated in 11g.

You can use the new initialization parameter diagnostic_dest to specify an alternative location for the diag directory contents.

In 11g, each $ORACLE_HOME/diag/$ORACLE_SID directory may contain these new directories:

*

alert - A new alert directory for the plain text and XML versions of the alert log.

*

incident - A new directory for the incident packaging software.

*

incpkg - A directory for packaging an incident into a bundle.

*

trace - A replacement for the ancient background dump (bdump) and user dump (udump) destinations.

*

cdump - The old core dump directory retains it's 10g name.

Let's see how the 11g alert log has changed.

Alert log changes in 11g

Oracle now writes two alert logs, the traditional alert log in plain text plus a new XML formatted alert.log which is named as log.xml.

"Prior to Oracle 11g, the alert log resided in $ORACLE_HOME/admin/$ORACLE_SID/bdump directory, but it now resides in the $ORACLE_BASE/diag/$ORACLE_SID directory".

Fortunately, you can re-set it to the 10g and previous location by specifying the BDUMP location for the diagnostic_dest parameter.

But best of all, you no longer require server access to see your alert log since it is now accessible via standard SQL using the new v$diag_info view:

select name, value from v$diag_info;

For complete details, see MetaLink Note:438148.1 - "Finding alert.log file in 11g".