Oracle Performance and Backup Blog: 2012

Thursday, November 8, 2012

How to rename NIC in OEL 6.X

This is a quick note about host cloning and network interface (ethX) renaming.
As I wrote in my last post, I now have an ESX lab. I created a template of OEL 6.3 with all my settings, and I would like to use it for adding new VMs (including RAC nodes). It is easy, but there is one small issue: every time you clone a VM, the network devices are renamed. The MAC addresses are unique, so it doesn’t surprise me, but I would like to keep interface names like eth0, eth1 and eth2, and I got eth0, eth3 and eth4 instead. Since OEL 5 (Red Hat 5) all device names are generated by the udev mechanism, so I started my research there.
It was easier than I thought: there is a file in the "/etc/udev/rules.d" directory called "70-persistent-net.rules".
And here is the file content:

# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.

# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0c:29:ce:65:67", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"
# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:50:56:9c:22:95", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"
# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:50:56:9c:22:94", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"
# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:50:56:9c:22:98", ATTR{type}=="1", KERNEL=="eth*", NAME="eth3"
# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:50:56:9c:22:99", ATTR{type}=="1", KERNEL=="eth*", NAME="eth4"

That was it: all I needed to do was run "ifconfig -a" and confirm the MAC addresses of the existing interfaces. I removed the old entries and changed the interface names. My final configuration file looks like this:

[root@localhost rules.d]# cat 70-persistent-net.rules
# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.

# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:0c:29:ce:65:67", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:50:56:9c:22:95", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

# PCI device 0x8086:0x100f (e1000)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:50:56:9c:22:94", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"
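
The manual edit above can also be scripted. The following is only a minimal sketch and not what I actually ran: it assumes you already know the MAC addresses of the cloned VM in the order you want them named, and the function names make_rule and rules_for_macs are made up for this illustration. It just prints rule lines in the same format as the generated file; writing them into 70-persistent-net.rules is left to you.

```shell
#!/bin/bash
# Print one persistent-net rule for a given MAC address and interface name.
make_rule() {
    local mac="$1" name="$2"
    printf 'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="%s", ATTR{type}=="1", KERNEL=="eth*", NAME="%s"\n' "$mac" "$name"
}

# Print rules for the given MAC addresses, naming them eth0, eth1, ... in order.
# MAC addresses are lower-cased because the generated file stores them that way.
rules_for_macs() {
    local i=0 mac
    for mac in "$@"; do
        make_rule "$(printf '%s' "$mac" | tr 'A-F' 'a-f')" "eth$i"
        i=$((i + 1))
    done
}

# Example with the three MAC addresses kept in the final file above.
rules_for_macs 00:0C:29:CE:65:67 00:50:56:9C:22:95 00:50:56:9C:22:94
```

Keeping the rule text in one printf format string makes it easy to compare against the header comment of the generated file, which asks that only the NAME= value be changed.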

After the changes I reran the udev mechanism:

[root@localhost rules.d]# start_udev
Starting udev:

And finally I got the correct configuration. Another lesson learned.
Regards,
Marcin

Sunday, November 4, 2012

New lab setup

After a couple of months playing with Oracle VM I decided to install and configure VMware. Thanks to the VMware Guru Program I’m able to test ESX 5i with a one-year license. This is an excellent opportunity for me to get more familiar with the newest VMware solutions. I have been using VMware tools for years, and I remember the times when I deployed my first Oracle RAC 9i on Workstation 3.x and had to "hack" the configuration file to change the disk.locking parameter to false.

When I started the installation I ran into a couple of issues, as I decided to keep my current disk setup and I had only one free disk. The new ESX uses a GPT partition table and I couldn’t use it together with Grub to start Linux or VMware. This is when USB sticks came into play: I installed ESX 5i on one and forced my PC to boot from USB only. With Grub on another USB stick, the boot selection on my headless PC depends on which USB key is connected at boot time. First problem solved.
The next issue appeared when I created my first VM and wanted to clone it: there was no such option in the Windows vSphere Client. I ~~read a manual~~ (frankly I just googled) and found out that I need to install vSphere 5 vCenter Server to be able to clone my VM automatically. I don’t have another host for it, but I found out that I can download a VM image and deploy it on the same ESX host. Here is a link to a very nice set of instructions - vSphere vCenter virtual appliance quick start guide - and you can find the documentation here - vSphere documentation.
After I downloaded and installed vCenter and used vSphere Client to connect to it, I was able to use all the features I was looking for. Actually it is strange to me that I need an additional management server to clone a VM. But I can’t complain: Oracle VM needs a management server even to start a VM, so I would say both are quite similar in terms of management overhead for home usage. I can give the VMware stack one more point, as you can stop vCenter and keep it down if you don’t need all the features implemented in it.

I'm now using my new lab as an Oracle server in various configurations and also as a development box and additional workstation. I spent some time trying to set up remote desktop with sound from my laptop to the workstation, but without luck. I tried FreeNX and the NoMachine solution, but setting up audio transport was too much for me. After a couple of days, thanks to a Twitter discussion with @simon_haslam which people from ThinLinc joined, I was told that their remote desktop solution can be used for free for up to 10 connections. And that is what I was looking for - remote desktop with video and audio support. Good work ThinLinc.

Now time for my TODO list with my little virtual server:
- install free version of Cisco Nexus 1000V switch and play with RAC NIC failovers
- Learn Hadoop and Oracle DB Hadoop integration
- Install Cloudera software (including newest Impala)
- extend day to 48h to have time to do all above ;)

regards,
Marcin

Thursday, August 23, 2012

DataGuard and Oracle Restart - how to make it work

If you are going to implement Oracle Data Guard together with Oracle Restart you should be aware that there is a configuration problem in version 11.2.0.3 (and probably in other 11.2 releases as well). Oracle Restart does not check the current database role and it will start the standby instance in OPEN mode. This can end up as a license issue if your Data Guard Broker starts the apply process on the standby database. If you don't have a license for Active Data Guard, you have just broken your license agreement.

There are a couple of possible solutions:

- manually set "mount" state as the startup mode for the standby database and change it after every switchover or failover
- disable MRP functionality on the standby database and keep it open in read-only mode
- add your own script to open the primary database only and keep the standby in "mount" state

I want to describe the last solution and share the script which I have created for it.

First of all you have to change your Oracle Restart configuration to start both databases (standby and primary) in "mount" state using the following command.
```
[oracle@testdb1 ~]$ srvctl modify database -d database_name -s MOUNT
```

In the next step a user script has to be created - you can use this one as a template. I have based it on the CRS demo script. The script has been saved in the /tmp directory on both servers under the name "opendb".

#!/bin/bash
# bash is needed here: the script uses the "function" keyword and "==" inside [ ].
# These messages go into the CRSD agent log file.
echo " *******   `date` ********** "
echo "Action script '$_CRS_ACTION_SCRIPT' for resource[$_CRS_NAME] called for action $1"
#env
#

#setup database home
export ORACLE_HOME=/u01/app/oracle/product/11.2.0/db3
#setup database SID
export ORACLE_SID=testa
DBROLE=''


# check database role and return following values
# OPENED if DB is open and it is primary
# PRIMARY if DB is mounted and DB role is primary
# PHYSICAL if DB is mounted and DB role is standby

function getrole() {

$ORACLE_HOME/bin/sqlplus -s / as sysdba << EOF
     spool /tmp/getrole.tmp
     set echo off feedback off head off
     select case when OPEN_MODE = 'READ WRITE' then 'OPENED' when OPEN_MODE='MOUNTED' and DATABASE_ROLE='PRIMARY' then 'PRIMARY' when OPEN_MODE='MOUNTED' and DATABASE_ROLE='PHYSICAL STANDBY' then 'PHYSICAL' end from v\$database;
     exit
EOF
DBROLE=`cat /tmp/getrole.tmp | sed 's/[ \t]*$//' | sed 's/^[ \t]*//' | tail -1`

}


case "$1" in
  'start')
     echo "START entry point has been called.."
     getrole
     # check role and do following actions
     case $DBROLE in
       'PHYSICAL')   echo "This is standby - do nothing" ;;
       'PRIMARY')
     $ORACLE_HOME/bin/sqlplus -s / as sysdba << EOF
     alter database open;
     exit
EOF
     echo ;;
     esac
     exit 0
     ;;

  'stop')
     echo "STOP entry point has been called.."
     exit 0
     ;;

  'check')
    echo "CHECK entry point has been called.."
    getrole
    if [ "$DBROLE" == 'OPENED' ] || [ "$DBROLE" == 'PHYSICAL' ]; then
        echo "Check -- SUCCESS"
        exit 0
    else
        echo "Check -- FAILED"
        exit 1
    fi
    ;;

  'clean')
     echo "CLEAN entry point has been called.."
     exit 0
     ;;

esac

Register the user script in Oracle Restart and set up dependencies using the following command. ora.testa.db is my database resource name in CRS - please change it to your database resource name. The new resource is called "ora.opendb" and it has to be registered and started on both servers. There is a hard dependency between my resource and the database resource, so my script will be started only when the database has been started as well.
```
[oracle@testdb1 ~]$ /u01/app/oracle/product/11.2.0/grid3/bin/crsctl add resource ora.opendb -type cluster_resource \
> -attr "ACTION_SCRIPT=/tmp/opendb,CHECK_INTERVAL=30,RESTART_ATTEMPTS=2 \
> ,START_DEPENDENCIES=hard(intermediate:ora.testa.db)"

[oracle@testdb1 ~]$ /u01/app/oracle/product/11.2.0/grid3/bin/crsctl start resource ora.opendb
```

Check status of resources.

[oracle@testdb1 ~]$ /u01/app/oracle/product/11.2.0/grid3/bin/crsctl status res -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.LISTENER.lsnr
               ONLINE  ONLINE       testdb1
ora.ons
               OFFLINE OFFLINE      testdb1
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.cssd
      1        OFFLINE OFFLINE
ora.diskmon
      1        OFFLINE OFFLINE
ora.evmd
      1        ONLINE  ONLINE       testdb1
ora.opendb
      1        ONLINE  ONLINE       testdb1
ora.testa.db
      1        ONLINE  ONLINE       testdb1                  Open

When everything is deployed, it runs in the following way: Oracle Restart starts both databases in mount state and then the next resource (the user script) is started. The script checks the current database role; it opens the primary database and takes no action on the standby database.
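
The decision logic of the script can be summarised as a small table. The sketch below restates it as two plain shell functions so it can be tried without a database. The names start_action and check_status are illustrative and do not appear in the script above; the role values are the ones returned by getrole.

```shell
#!/bin/bash
# Action taken by the 'start' entry point for each value returned by getrole.
start_action() {
    case "$1" in
        PRIMARY) echo "open" ;;   # mounted primary: alter database open
        *)       echo "none" ;;   # mounted standby, already open, or empty: nothing is run
    esac
}

# Result reported by the 'check' entry point: 0 = check succeeded, 1 = check failed.
check_status() {
    case "$1" in
        OPENED|PHYSICAL) return 0 ;;
        *)               return 1 ;;
    esac
}

# Print the table for the three roles getrole can return.
for role in OPENED PRIMARY PHYSICAL; do
    if check_status "$role"; then state="SUCCESS"; else state="FAILED"; fi
    echo "$role: start -> $(start_action "$role"), check -> $state"
done
```

Note that a primary which is mounted but not yet open is the only one of these three states that fails the check.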

Disclaimer:
Please check the script in your development and test environments before you deploy it and change anything in a production environment. This script has been created for my personal tests and there is no guarantee that it is bug free.

regards,
Marcin

Monday, August 6, 2012

Oracle Grid Cloud Control 12c BP1 on Oracle Enterprise Linux 6.1

I have installed it on my test box and I hit a problem with auto-discovery due to a missing libssl.so library for nmap. The issue is very easy to fix using a common Linux troubleshooting approach. First, the versions I'm using:

[root@oem-server ~]# cat /etc/oracle-release
Oracle Linux Server release 6.1
[root@oem-server ~]# openssl version
OpenSSL 1.0.0-fips 29 Mar 2010
[root@oem-server ~]# yum list | grep -e "^openssl"
openssl.x86_64                           1.0.0-20.el6_2.5            @ol6_latest
openssl.i686                             1.0.0-20.el6_2.5            ol6_latest
openssl-devel.i686                       1.0.0-20.el6_2.5            ol6_latest
openssl-devel.x86_64                     1.0.0-20.el6_2.5            ol6_latest
openssl-perl.x86_64                      1.0.0-20.el6_2.5            ol6_latest
openssl-static.x86_64                    1.0.0-20.el6_2.5            ol6_latest
openssl098e.i686                         0.9.8e-17.0.1.el6_2.2       ol6_latest
openssl098e.x86_64                       0.9.8e-17.0.1.el6_2.2       ol6_latest

Now, the issue I had:

[root@oem-server ~]# /u01/Middleware/agent/agent_inst/discovery/nmap/bin/nmap
/u01/Middleware/agent/agent_inst/discovery/nmap/bin/nmap: error while loading shared libraries: libssl.so.4: cannot open shared object file: No such file or directory

How I fixed it

[oracle@oem-server ~]$  ldd /u01/Middleware/agent/agent_inst/discovery/nmap/bin/nmap
        linux-vdso.so.1 =>  (0x00007fff447ff000)
        libssl.so.4 => not found
        libcrypto.so.4 => not found
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003d40800000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003d43c00000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003d41000000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003d43400000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003d40400000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003d40000000)

[root@oem-server ~]# ln -s /usr/lib64/libssl.so.10 /usr/lib64/libssl.so.4
[root@oem-server ~]# ln -s /usr/lib64/libcrypto.so.10 /usr/lib64/libcrypto.so.4

[oracle@oem-server ~]$  ldd /u01/Middleware/agent/agent_inst/discovery/nmap/bin/nmap
        linux-vdso.so.1 =>  (0x00007fffb0686000)
        libssl.so.4 => /usr/lib64/libssl.so.4 (0x00007fd6b1c0a000)
        libcrypto.so.4 => /usr/lib64/libcrypto.so.4 (0x00007fd6b1870000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003d40800000)
        libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003d43c00000)
        libm.so.6 => /lib64/libm.so.6 (0x0000003d41000000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003d43400000)
        libc.so.6 => /lib64/libc.so.6 (0x0000003d40400000)
        libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x0000003d48c00000)
        libkrb5.so.3 => /lib64/libkrb5.so.3 (0x0000003d46c00000)
        libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003d44000000)
        libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x0000003d47800000)
        libz.so.1 => /lib64/libz.so.1 (0x0000003d41400000)
        /lib64/ld-linux-x86-64.so.2 (0x0000003d40000000)
        libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x0000003d46800000)
        libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003d47000000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003d42400000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003d40c00000)
        libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003d41c00000)

Now it is working

[oracle@oem-server ~]$ /u01/Middleware/agent/agent_inst/discovery/nmap/bin/nmap -v

Starting Nmap 5.51.3 ( http://nmap.org ) at 2012-07-08 19:19 IST
Unable to find nmap-services!  Resorting to /etc/services
Read data files from: /etc
WARNING: No targets were specified, so 0 hosts scanned.
Nmap done: 0 IP addresses (0 hosts up) scanned in 0.03 seconds
[oracle@oem-server ~]$

After that I checked MOS: Grid Control 12c is certified with OEL 6, so this error was not expected. Then I realized that I hadn't read the known issues in the Oracle® Enterprise Manager Cloud Control Support Notes for Linux x86 and x86-64, where I found it under "Host Discovery Job Displays Error While Loading Shared Libraries". The official solution is a little more complicated and requires downloading the openssl sources and compiling them. Anyway, I found the article above using Google and couldn't find it from the documentation entry page.
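
The ldd check used above can be turned into a small filter that lists only the unresolved libraries. This is a minimal sketch; the function name missing_libs is my own invention, and the sample lines are copied from the first ldd run in this post so the block can be tried on any machine.

```shell
#!/bin/bash
# Read ldd output on stdin and print the names of libraries reported as "not found".
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}

# Sample input taken from the ldd output above; on a real host use:
#   ldd /path/to/binary | missing_libs
printf '%s\n' \
    '        libssl.so.4 => not found' \
    '        libcrypto.so.4 => not found' \
    '        libdl.so.2 => /lib64/libdl.so.2 (0x0000003d40800000)' | missing_libs
```

An empty result means every shared library of the binary was resolved.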

regards,
Marcin

RMAN Transportable tablespace bug

It had been a long time since I last used Transportable Tablespaces, so I decided to refresh my memory and set up a test environment for Streams using the TTS feature. I was going to use the example script from the Oracle Streams documentation, which creates the necessary data files for a specified point in time (no need to switch the source tablespace into read-only mode) and also runs Data Pump to create the metadata file. I made the necessary changes and ran it.

[oracle@testdb1 oracle]$ rsp / as sysdba
SQL*Plus: Release 11.2.0.3.0 Production on Mon Aug 6 12:50:04 2012
Copyright (c) 1982, 2011, Oracle.  All rights reserved.

Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

Session altered.

SET SERVEROUTPUT ON SIZE 1000000
DECLARE
  until_scn NUMBER;
BEGIN
  until_scn:= DBMS_FLASHBACK.GET_SYSTEM_CHANGE_NUMBER;
      DBMS_OUTPUT.PUT_LINE('Until SCN: ' || until_scn);
SYS@testa AS SYSDBA >   2    3    4    5    6  END;
  7  /
Until SCN: 6300731

PL/SQL procedure successfully completed.
SYS@testa AS SYSDBA > exit

Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options

[oracle@testdb1 oracle]$ rlwrap rman target /
Recovery Manager: Release 11.2.0.3.0 - Production on Mon Aug 6 12:50:33 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
connected to target database: TESTA (DBID=243207980)

RMAN>  RUN
2>     {
        TRANSPORT TABLESPACE 'SOE_TS'
        UNTIL SCN 6300731
        DATAPUMP DIRECTORY SOURCE_DIR
        AUXILIARY DESTINATION '/u01/app/oracle/aux_files'
        DUMP FILE 'soe_ts.dmp'
        EXPORT LOG 'soe.log'
        IMPORT SCRIPT 'soe_ts.sql'
        TABLESPACE DESTINATION '/u01/app/oracle/dest';
       }

using target database control file instead of recovery catalog
RMAN-05026: WARNING: presuming following set of tablespaces applies to specified point-in-time

List of tablespaces expected to have UNDO segments
Tablespace SYSTEM
Tablespace UNDOTBS2
...

Unfortunately there is an RMAN bug related to the export log file. Because of it, the first attempt ended with the following error:

contents of Memory Script:
{
# make read only the tablespace that will be exported
sql clone 'alter tablespace  "SOE" read only';
}
executing Memory Script

sql statement: alter tablespace  "SOE" read only

Performing export of metadata...

Removing automatic instance
shutting down automatic instance
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of transport tablespace command at 08/05/2012 23:18:47
RMAN-06136: ORACLE error from auxiliary database: ORA-01097: cannot shutdown while in a transaction - commit or rollback first
RMAN-06136: ORACLE error from auxiliary database: ORA-01460: unimplemented or unreasonable conversion requested

RMAN>

At first sight the error message has nothing to do with the log file, but I know that RMAN error messaging is, well... different is a good word here. I searched for a solution and found out that the workaround is quite simple - the EXPORT LOG clause has to be removed from the script. So let's run it again.

[oracle@testdb1 oracle]$ rlwrap rman target /
Recovery Manager: Release 11.2.0.3.0 - Production on Mon Aug 6 12:50:33 2012
Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.
connected to target database: TESTA (DBID=243207980)

RMAN>  RUN
2>       {
        TRANSPORT TABLESPACE 'SOE_TS'
        UNTIL SCN 6300731
        DATAPUMP DIRECTORY SOURCE_DIR
        AUXILIARY DESTINATION '/u01/app/oracle/aux_files'
        DUMP FILE 'soe_ts.dmp'
        #EXPORT LOG 'soe.log' - bug
        IMPORT SCRIPT 'soe_ts.sql'
        TABLESPACE DESTINATION '/u01/app/oracle/dest';
      }

using target database control file instead of recovery catalog
RMAN-05026: WARNING: presuming following set of tablespaces applies to specified point-in-time

List of tablespaces expected to have UNDO segments
Tablespace SYSTEM
Tablespace UNDOTBS2
...
contents of Memory Script:
{
# make read only the tablespace that will be exported
sql clone 'alter tablespace  "SOE_TS" read only';
}
executing Memory Script

sql statement: alter tablespace  "SOE_TS" read only

Performing export of metadata...
   EXPDP> FLASHBACK automatically enabled to preserve database integrity.
   EXPDP> Starting "SYS"."TSPITR_EXP_khar":
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/PLUGTS_BLK
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/TABLE
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/INDEX/INDEX
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/INDEX/FUNCTIONAL_INDEX/INDEX
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/CONSTRAINT/CONSTRAINT
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/INDEX_STATISTICS
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/INDEX/STATISTICS/FUNCTIONAL_INDEX/INDEX_STATISTICS
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/CONSTRAINT/REF_CONSTRAINT
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/TABLE_STATISTICS
   EXPDP> Processing object type TRANSPORTABLE_EXPORT/POST_INSTANCE/PLUGTS_BLK
   EXPDP> Master table "SYS"."TSPITR_EXP_khar" successfully loaded/unloaded
   EXPDP> ******************************************************************************
   EXPDP> Dump file set for SYS.TSPITR_EXP_khar is:
   EXPDP>   /u01/app/oracle/dest/soe_ts.dmp
   EXPDP> ******************************************************************************
   EXPDP> Datafiles required for transportable tablespace SOE_TS:
   EXPDP>   /u01/app/oracle/dest/soe_ts01.dbf
   EXPDP> Job "SYS"."TSPITR_EXP_khar" successfully completed at 12:55:35
Export completed

Now it looks much better - the tablespace has been exported and the import script has been generated as well. I'm ready for the next steps and can continue my work with Streams.

regards,
Marcin

Sunday, July 1, 2012

Average Active Session in SQL*plus with refresh

Recently, when I hit performance issues and figured out that the OEM agent was misconfigured for that host, I wished I had a script to display live Average Active Sessions in SQL*Plus. Of course there are plenty of other great tools, like Tanel Poder’s Snapper or Tanel and Adrian Billington’s MOATS.
MOATS could be an answer for my needs, but it requires some objects to be created in the database. Snapper, on the other hand, uses dynamic objects only, but it does not display history, so I can’t see at a glance whether system performance has improved or not.

I decided to answer my own needs and created a SQL*Plus script displaying AAS history using only dynamic objects, like Snapper. I would like to thank Tanel and Adrian for the inspiration on how to build SQL*Plus active output scripts.

Here is a sample screenshot.

Average Active Sessions is calculated from v$session sampling, and the output is divided into three event categories: CPU, DISK I/O and OTHER. Technically it is possible to add more classes, but the output becomes trickier to read on screen, so I think these three are a good balance between knowing what is going on and visibility. For deeper investigation you can use Snapper.
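
To make the arithmetic concrete: AAS for a category is the number of active-session rows seen in that category divided by the number of samples taken. The block below is only a sketch of that general idea in shell and awk, not the code of topaas.sql. The input format (one line per active session per sample), the function name aas_by_category and the category label DISK_IO are assumptions made for this example.

```shell
#!/bin/bash
# aas_by_category SAMPLES
# stdin : one line per active session seen in a sample: "<sample_no> <category>"
# stdout: average active sessions per category = rows in that category / SAMPLES
aas_by_category() {
    LC_ALL=C awk -v n="$1" '
        { count[$2]++ }
        END {
            ncat = split("CPU DISK_IO OTHER", cats, " ")
            for (i = 1; i <= ncat; i++)
                printf "%s %.2f\n", cats[i], count[cats[i]] / n
        }'
}

# Four samples containing 5 CPU rows, 2 disk I/O rows and 1 other row in total.
printf '%s\n' '1 CPU' '1 CPU' '2 CPU' '2 DISK_IO' '3 OTHER' '4 CPU' '4 CPU' '4 DISK_IO' |
    aas_by_category 4
```

With this sample data the function prints 1.25 for CPU, 0.50 for DISK_IO and 0.25 for OTHER.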

This tool uses two scripts (both have to be in one directory):

runtopaas.sql - the main script; it parses the run attributes and sets up the run environment for the topaas.sql script. It calls topaas.sql 100 times.

topaas.sql - samples v$session every 1 s for the time specified in the refresh rate parameter and keeps the samples in a PL/SQL collection. At the end AAS (divided into 3 sections: CPU, Disk I/O and other) is calculated and displayed on screen. In addition, the AAS results are added to bind variables together with the sample time. When topaas.sql is called the next time, it reads the data from the bind variables, which allows it to keep a history of past AAS and display it on screen. The default configuration allows 100 data points to be displayed.

How to use it:

Change the SQL*Plus window / terminal screen to 45 rows high and 150 characters wide.
Run in the SQL*Plus window:

SQL> @runtopaas.sql aas:refresh rate - specifies the refresh rate (e.g. 15 s); with 100 samples this keeps 25 min of AAS in the SQL*Plus window. If the script is started again in the same session, after 100 cycles or after a user break, it will still be able to display the historical data
SQL> @runtopaas.sql aas:refresh rate:reset - like above but the historical data is cleared
SQL> @runtopaas.sql aas:refresh rate:max aas - like above but the maximum value of AAS (y axis) is set by the user
SQL> @runtopaas.sql aas:refresh rate:max aas:reset - like above but the historical data is cleared
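
The 25 minutes mentioned above is simply the refresh rate multiplied by the number of data points kept on screen. Here is a tiny sketch of that calculation for other refresh rates; the helper name history_window is made up for this example, and the default of 100 points is the value described above.

```shell
#!/bin/bash
# history_window REFRESH_SECONDS [POINTS]
# Print how much history fits on screen for a given refresh rate.
history_window() {
    local refresh="$1" points="${2:-100}"
    local total=$((refresh * points))
    printf '%d s (%d min %d s)\n' "$total" $((total / 60)) $((total % 60))
}

history_window 15   # 15 s refresh with the default 100 data points
```

A shorter refresh rate gives finer detail but a shorter history, so pick it based on how long the problem you are watching usually lasts.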

Examples:

SQL> @runtopaas aas:15

Yes, there was a problem - AAS around 600 is not normal. When the issue had been fixed I wanted to reset the historical data and run the script again

SQL> @runtopaas aas:15:reset

after some time

You can download both scripts here - Github repo.

Let me know if this tool is useful for you or if you find any problems with it.
regards,
Marcin

Saturday, June 16, 2012

Oracle VM upgrade story

As I described in my last post, I started my work with Oracle VM 3.1, but after all the problems with NFS local storage I installed Oracle VM 3.0. Once the storage issues had been resolved, it was a good time to run the upgrade from version 3.0.3 to 3.1 and share information about the process. The process itself is split into two phases - Oracle VM upgrade and Oracle VM Manager upgrade.

Oracle VM upgrade
The Oracle VM upgrade is a straightforward procedure. I booted my server from the new 3.1 CD and chose the upgrade option. After a couple of minutes and one reboot the new version was in place. Unfortunately the repository based on OCFS was not mounted and none of my virtual machines started. But before digging into it I decided to upgrade Oracle VM Manager.

Oracle VM Manager upgrade
You can download the upgrade ISO image from e-delivery (V32481-01); it is only 123 MB in size. In the next step the ISO image has to be mounted using a loop device and the upgrade script should be started.

[root@OVMiddleEarth ~]# mount -o loop /root/V32481-01.iso /mnt
[root@OVMiddleEarth ~]# cd /mnt/
[root@OVMiddleEarth mnt]# ls
components EULA LICENSE runUpgrader.sh TRANS.TBL upgrade
[root@OVMiddleEarth mnt]# ./runUpgrader.sh
Stating OVM Manager upgrade on Thu Jun 14 13:40:16 IST 2012

Oracle VM Manager 3.1.1.305 upgrade utility
Upgrade logfile : /tmp/upgrade-2012-06-14-40.log

It is highly recommended to do a full database repository backup prior to upgrading Oracle VM Manager ...

Press any key to continue ...

Aborting upgrade on Thu Jun 14 13:43:14 IST 2012 due to error
Attempt to rollback to before starting this upgrade

.... redeploy weblogic and 3.0.1 OVM Manager
Redeploying back to the 3.0.1 Oracle VM Manager core container ...
Redeploying back to the 3.0.1 Oracle VM Manager console ...
Redeploying back to the 3.0.1 Oracle VM Manager help ...

I hit a problem here - the zip program could not be found. This is strange, as I don't recall this issue from the Oracle VM 3.1 / Oracle VM 3.1 Manager installation. It means that either the server installation process is different from the server upgrade, or the Oracle VM Manager upgrade has different requirements. Whatever the root cause, let's solve it quickly. As I mentioned in my last post, I have added the OEL 5 repository to my Oracle VM yum configuration, so I was able to run

[root@OVMiddleEarth mnt]# yum install zip
el5_latest                                                                                                                                | 1.1 kB     00:00
el5_latest/primary                                                                                                                        | 9.1 MB     00:23
el5_latest                                                                                                                                             9031/9031
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package zip.x86_64 0:2.31-2.el5 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

=================================================================================================================================================================
 Package                           Arch                                 Version                                   Repository                                Size
=================================================================================================================================================================
Installing:
 zip                               x86_64                               2.31-2.el5                                el5_latest                               136 k

Transaction Summary
=================================================================================================================================================================
Install       1 Package(s)
Upgrade       0 Package(s)

Total download size: 136 k
Is this ok [y/N]: y
Downloading Packages:
zip-2.31-2.el5.x86_64.rpm                                                                                                                 | 136 kB     00:00
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing     : zip                                                                                                                                       1/1

Installed:
  zip.x86_64 0:2.31-2.el5

Complete!
[root@OVMiddleEarth mnt]#
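
A failed upgrade and rollback like this one can be avoided by checking for required commands before starting the upgrader. The following is a minimal sketch; the function name missing_cmds is my own, and zip is the only requirement I actually hit, so any other command you add to the list is your own assumption.

```shell
#!/bin/bash
# Print every command from the argument list that cannot be found in PATH.
missing_cmds() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# zip is the only requirement confirmed by the failed run above.
missing="$(missing_cmds zip)"
if [ -n "$missing" ]; then
    echo "Install before running runUpgrader.sh: $missing"
else
    echo "All required commands found"
fi
```

Running this first costs a second, while the failed run above took about three minutes and a redeploy back to the old version.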

Now I was ready to restart the upgrade process

[root@OVMiddleEarth mnt]# ./runUpgrader.sh
Stating OVM Manager upgrade on Thu Jun 14 13:50:22 IST 2012

Oracle VM Manager 3.1.1.305 upgrade utility
Upgrade logfile : /tmp/upgrade-2012-06-14-50.log

It is highly recommended to do a full database repository backup prior to upgrading Oracle VM Manager ...

Press any key to continue ...

Oracle VM Manager is running ...
Verifying installation status ...
Read Oracle VM Manager config file ...
Found Oracle VM Manager install files ...
Found Oracle VM Manager upgrader ...
Found Oracle WebLogic Server ...
Found Java ...
Using the following information :
Database Host : localhost
Database SID : XE
Database LSNR : 1521
Oracle VM Schema : ovs
Oracle VM Manager UUID : 0004fb00000100000a19593edeada0d8
Current Build ID : 3.0.3.126
Upgrade from version : 3.0.3
Upgrade to version : 3.1.1
Using /tmp/workdir.RLGCBY8025 for backup and export location.
Using /tmp/patchdir.aEpoE8026 for patching.
Enter password for user ovs :
Undeploying previous version of Oracle VM Manager application ...
Undeploying Oracle VM Manager help ...
Undeploying Oracle VM Manager console ...
Undeploying Oracle VM Manager core ...
Waiting for Oracle VM Manager core to fully undeploy...
Waiting...
Finished undeploying previous version ...
Exporting Oracle VM Manager repository ...
Please wait as this can take a long time ...
Oracle VM Manager repository export completed ...
Creating backup file ...
Oracle VM Manager repository backup in /tmp/ovm-manager-3-backup-2012-06-14.zip
Upgrading Oracle VM Manager ...
Backing up old files to /tmp/ovm-manager-3-backup-2012-06-14-135340...
Removing old files ...
Unpacking Oracle VM Manager 3.1.1.305
`transform_003001001000_010.xsl' -> `/tmp/patchdir.aEpoE8026/transform_003001001000_010.xsl'
`transform_003001001000_020.xsl' -> `/tmp/patchdir.aEpoE8026/transform_003001001000_020.xsl'
`deletedClasses.xml' -> `/tmp/patchdir.aEpoE8026/deletedClasses.xml'
Filtering full repository export to the selective export subset at /tmp/workdir_sel.GbPcqS8595 ...
cp: omitting directory `/tmp/workdir.RLGCBY8025/jrnl'
cp: omitting directory `/tmp/workdir.RLGCBY8025/objs'
adding: objs/19/193.cl.xml (deflated 70%)
adding: objs/51/519.cl.xml (deflated 72%)
adding: objs/51/511.cl.xml (deflated 81%)
adding: objs/10/84/10848.cl.xml (deflated 75%)
adding: objs/10/79/10791.cl.xml (deflated 73%)
adding: objs/42/426.cl.xml (deflated 70%)
adding: objs/41/412.cl.xml (deflated 85%)
adding: objs/17/07/17071.cl.xml (deflated 75%)
adding: objs/47/472.cl.xml (deflated 74%)
adding: objs/16/161.cl.xml (deflated 67%)
adding: objs/9.cl.xml (deflated 93%)

Selective export is at /tmp/workdir_sel.GbPcqS8595
11 objects selected (out of 12735) to be upgraded
Transform XSL files used:
-rw-r--r-- 1 root root 56678 Jun 14 13:53 /tmp/patchdir.aEpoE8026/transform_003001001000_010.xsl
-rw-r--r-- 1 root root 10079 Jun 14 13:53 /tmp/patchdir.aEpoE8026/transform_003001001000_020.xsl
Changed classes encountered in selective export set:

com.oracle.ovm.mgr.api.manager.BusinessManagerDbImpl
com.oracle.ovm.mgr.api.manager.ModelManagerDbImpl
com.oracle.ovm.mgr.api.manager.RasManagerDbImpl
com.oracle.ovm.mgr.api.physical.network.BondPortDbImpl
com.oracle.ovm.mgr.api.physical.network.EthernetPortDbImpl
com.oracle.ovm.mgr.api.physical.network.InternalPortDbImpl
com.oracle.ovm.mgr.api.physical.ServerDbImpl
com.oracle.ovm.mgr.api.virtual.VirtualMachineDbImpl
com.oracle.ovm.mgr.api.virtual.VirtualMachineTemplateDbImpl
com.oracle.ovm.mgr.api.virtual.XenHypervisorDbImpl

Upgrading Oracle VM Manager repository ...
Please wait as this can take a long time ...
Oracle VM Manager repository upgrade completed ...
Validating Oracle VM Manager repository ...
Oracle VM Manager repository validation completed ...
Refresh system-jazn-data.xml file ...
Redeploying Oracle VM Manager core container ...
Redeploying Oracle VM Manager console ...
Redeploying Oracle VM Manager help ...
Install ADF Patch ...
Completed upgrade to 3.1.1.305 ...
Writing updated config in /u01/app/oracle/ovm-manager-3/.config
Restart WebLogic ...
Stopping Oracle VM Manager [ OK ]
Starting Oracle VM Manager [ OK ]

OVM Manager upgrade finished on Thu Jun 14 13:57:31 IST 2012
[root@OVMiddleEarth mnt]#

This time it finished successfully and I was able to log in to Oracle VM Manager.

Post upgrade changes

My first impression after logging in to the upgraded system was that neither of the two OCFS file systems had been mounted. I checked the system logs, looked around the system, and found that only one iSCSI target had been presented. A new version of Oracle VM means a new kernel and a new configuration for the multipath daemon. In my case the second HDD (/dev/sdb), which I use as a block device for my local iSCSI server, was now configured with multipath access, so I had to change my iSCSI server configuration: instead of using the direct path to /dev/sdb2 I needed to use the path presented through device mapper.

[root@OVMiddleEarth tgt]#  ls -l /dev/mapper/
total 0
brw-rw---- 1 root disk 252,   4 Jun 14 14:04 1IET_00010001
crw------- 1 root root  10, 236 Jun 14 14:04 control
brw-rw---- 1 root disk 252,   0 Jun 14 14:04 SATA_ST31000524AS_9VPCK40X
brw-rw---- 1 root disk 252,   1 Jun 14 14:04 SATA_ST31000524AS_9VPCK40Xp1
brw-rw---- 1 root disk 252,   2 Jun 14 14:04 SATA_ST31000524AS_9VPCK40Xp2
brw-rw---- 1 root disk 252,   3 Jun 14 14:04 SATA_ST31000524AS_9VPCK40Xp3
[root@OVMiddleEarth tgt]#

Once I found the mapper name for my device, I changed the tgtd configuration and rebooted the server.

[root@OVMiddleEarth ~]# vi /etc/tgt/targets.conf
...

<target iqn.2008-09.com.example:server1.trial>
    backing-store /dev/mapper/SATA_ST31000524AS_9VPCK40Xp2
    backing-store /etc/tgt/small_disk
    write-cache off
</target>
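
The same edit can be scripted instead of done by hand in vi. This is only a sketch under assumptions: `switch_backing_store` is a hypothetical helper name, and it assumes each `backing-store` directive sits on its own line with the device path as the second word. It keeps a `.bak` copy of the file before rewriting it.

```shell
# switch_backing_store CONF OLD_DEV NEW_DEV  (hypothetical helper, not part of tgt)
# Rewrites "backing-store OLD_DEV" lines in CONF to use NEW_DEV, leaving
# indentation, trailing comments and all other lines untouched.
switch_backing_store() {
    conf=$1 old=$2 new=$3
    cp -p "$conf" "$conf.bak"            # keep the previous configuration
    awk -v old="$old" -v new="$new" '
        $1 == "backing-store" && $2 == old {
            i = index($0, old)           # plain string search, no regex escaping needed
            $0 = substr($0, 1, i - 1) new substr($0, i + length(old))
        }
        { print }
    ' "$conf.bak" > "$conf"
}

# On the host from this post the call would be (as root):
#   switch_backing_store /etc/tgt/targets.conf /dev/sdb2 /dev/mapper/SATA_ST31000524AS_9VPCK40Xp2
```

The new path only takes effect once tgtd re-reads its configuration; in this post that was done with a reboot.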

It helped, and now at least the cluster heartbeat file system was mounted, but there was still no repository. The solution was simple, but it took me some time to find: I needed to rescan all disks in the Storage section of Oracle VM Manager and acknowledge all events in the Repository section.

Lesson Learned
Viewing events and acknowledging all previous errors helps at some stages and gives a clean view of the current state of Oracle VM.

regards,
Marcin

Sunday, June 10, 2012

Oracle VM at home

Thanks to Yury Velikanov's posts about Oracle VM Server, I started my journey with that tool. First of all, installation and configuration of Oracle VM 3.1 and Oracle VM Manager on one box went well and I was able to connect to it via a browser (see Yury's posts for details). I started configuring the environment, and there was the first glitch. Oracle VM can create local storage on a whole disk only (correct me if I'm wrong), but I had installed it on my test PC, which was already running other Linux distributions. I had one free partition (not a whole disk) and I was unable to add it in a simple way.

Adding file systems to repository using NFS on local loop interface

Oracle VM supports NFS as well as iSCSI/FC disks, so I decided to use NFS to present the free partition as a repository. OVM is based on the OEL distribution and it already had an NFS server installed. So here is my configuration:

[root@OVMiddleEarth ~]# cat /etc/fstab
…
/dev/sdb2  /nfs_pool  ext3    defaults       0 0

[root@OVMiddleEarth ~]# cat /etc/exports
/nfs_pool *(rw,insecure,no_root_squash,sync)

[root@OVMiddleEarth ~]# chkconfig --level 2345 nfs on
[root@OVMiddleEarth ~]# chkconfig --list nfs
nfs             0:off   1:off   2:on    3:on    4:on    5:on    6:off
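
For repeatable setups the export line can be added idempotently rather than edited by hand. A small sketch; `add_export` is a hypothetical helper name, and the target file is passed as a parameter so it can be tried on a scratch file first.

```shell
# add_export FILE LINE  (hypothetical helper)
# Appends LINE to FILE unless an identical line is already there.
add_export() {
    file=$1 line=$2
    # -x: whole-line match, -F: fixed string, so '*' and '()' are not treated as regex
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# On the real host (as root) this would be followed by a reload and a check:
#   add_export /etc/exports '/nfs_pool *(rw,insecure,no_root_squash,sync)'
#   exportfs -ra              # re-read /etc/exports
#   showmount -e localhost    # list what the server currently exports
```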

So far so good: I was able to add the local NFS server as a repository for Oracle VM, but within the next 5 minutes I hit another issue. You can import Assemblies (pre-configured machines) via http(s)/ftp only.

Adding local Apache (httpd) server

OK, let's add Apache to Oracle VM. I had already added the yum repository from OEL 5.8 (thanks Yury!), so adding the httpd package was simple.

[root@OVMiddleEarth ~]# ls /etc/yum.repos.d/
public-yum-el5.repo
[root@OVMiddleEarth ~]# yum install httpd

Then a simple configuration change to disable the welcome screen:

[root@OVMiddleEarth ~]# cat /etc/httpd/conf.d/welcome.conf
#
# This configuration file enables the default "Welcome"
# page if there is no default index page present for
# the root URL.  To disable the Welcome page, comment
# out all the lines below.
#
#<LocationMatch "^/+$">
#    Options -Indexes
#    ErrorDocument 403 /error/noindex.html
#</LocationMatch>

Then I moved my assemblies into /var/www/html:

[root@OVMiddleEarth ~]# ls -l /var/www/html/
total 571644
-rw-r--r-- 1 root root 584785920 Jan 20 22:51 OVM_OL6U1_x86_64_PVM.ova

That was simple as well, and I was ready to create a virtual machine. I started importing the assembly using the local http server (http://localhost/OVM_OL6U1_x86_64_PVM.ova) but it hung after a minute or so. I waited a while but nothing happened, so I started digging. First of all, there was no disk activity at all. Hmmm, I know that quite well: D-state.

ps aux | grep D

showed processes waiting in DN state, so it looked like a problem with the NFS server. I checked /var/log/messages and this is what I found:
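
A plain `grep D` also matches every line that contains a capital D anywhere, including the header and command names. A narrower filter looks only at the STAT column. This is a sketch, assuming the input comes from `ps -eo pid,stat,wchan:32,comm` (header on the first line, STAT in the second column); `dstate_only` is a hypothetical name.

```shell
# dstate_only  (hypothetical helper)
# Reads ps output on stdin and keeps the header plus every process whose
# STAT column starts with "D" (uninterruptible sleep), e.g. "D", "DN", "D<".
dstate_only() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Live use:
#   ps -eo pid,stat,wchan:32,comm | dstate_only
```

The WCHAN column then shows which kernel function each blocked process is sleeping in, which is a quicker first hint than waiting for the hung-task message in the log.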

Jun  4 13:24:21 OVMiddleearth kernel: INFO: task nfsd:3639 blocked for more than 120 seconds.
Jun  4 13:24:21 OVMiddleearth kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  4 13:24:21 OVMiddleearth kernel: nfsd            D 0000000000000000     0  3639      2 0x00000000
Jun  4 13:24:21 OVMiddleearth kernel:  ffff8800a7e49be0 0000000000000246 00080094a7e49b60 00000000000121c0
Jun  4 13:24:21 OVMiddleearth kernel:  ffff8800a7e46500 ffff8800b0cfc0c0 ffff8800af48ac80 ffff8800af7fca80
Jun  4 13:24:21 OVMiddleearth kernel:  0000000000000000 ffff8800aa54c540 ffffffff81009d5d ffff8800a7e49bc0
Jun  4 13:24:21 OVMiddleearth kernel: Call Trace:
Jun  4 13:24:21 OVMiddleearth kernel:  [] ? xen_force_evtchn_callback+0xd/0x10
Jun  4 13:24:21 OVMiddleearth kernel:  [] ? check_events+0x12/0x20
Jun  4 13:24:21 OVMiddleearth kernel:  [] ? ext3_mark_dquot_dirty+0x60/0x60 [ext3]
Jun  4 13:24:21 OVMiddleearth kernel:  [] ? xen_restore_fl_direct_reloc+0x4/0x4
Jun  4 13:24:21 OVMiddleearth kernel:  [] ? kmem_cache_alloc+0xab/0x190
Jun  4 13:24:21 OVMiddleearth kernel:  [] schedule+0x45/0x60
Jun  4 13:24:24 OVMiddleearth kernel:  [] __mutex_lock_slowpath+0xd6/0x150
Jun  4 13:24:26 OVMiddleearth kernel:  [] ? dquot_file_open+0x4a/0x50
Jun  4 13:24:30 OVMiddleearth kernel:  [] mutex_lock+0x2b/0x50
Jun  4 13:24:32 OVMiddleearth kernel:  [] ima_rdwr_violation_check+0x67/0x100
Jun  4 13:24:33 OVMiddleearth kernel:  [] ima_file_check+0x20/0x50
Jun  4 13:24:40 OVMiddleearth kernel:  [] nfsd_open+0x121/0x170 [nfsd]
Jun  4 13:24:44 OVMiddleearth kernel:  [] nfsd_write+0xb3/0x100 [nfsd]
Jun  4 13:24:46 OVMiddleearth kernel:  [] nfsd3_proc_write+0x103/0x140 [nfsd]
Jun  4 13:24:50 OVMiddleearth kernel:  [] nfsd_dispatch+0xbb/0x220 [nfsd]
Jun  4 13:24:51 OVMiddleearth kernel:  [] svc_process_common+0x324/0x650 [sunrpc]
Jun  4 13:25:05 OVMiddleearth kernel:  [] ? nfsd_set_nrthreads+0x190/0x190 [nfsd]

Oops, it looks like a problem with the kernel / Xen stack. My first idea was to google the error, but only a few pages were found. Oracle VM 3.1 is the latest version and it uses the Oracle kernel as well, so I decided to reinstall everything using Oracle VM 3.0.3 and test again. After an hour I had Oracle VM 3.0.3 up and running and was ready for tests. This time I got one step further: I was able to import assemblies into Oracle VM, but it hung when I started the create template process.

Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219126] INFO: task nfsd:6446 blocked for more than 120 seconds.
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219127] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219128] nfsd          D ffff880062372d2c     0  6446      2 0x00000000
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219129]  ffff8800efb85960 0000000000000246 ffffffff8002c2f0 0000000000000400
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219131]  ffffffff80618bc0 ffff8800efb825c0 0000000000009480 ffff8800efb829a0
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219132]  ffff8800efb82680 ffff8800efb825c0 ffff8800f54da740 ffff8800efb829a0
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219133] Call Trace:
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219135]  [] ? target_load+0x30/0x70
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219137]  [] ? tcp_transmit_skb+0x3d3/0x730
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219138]  [] ? _spin_lock_bh+0x13/0x120
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219140]  [] __mutex_lock_slowpath+0xd9/0x1a0
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219141]  [] mutex_lock+0x1e/0x40
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219143]  [] generic_file_aio_write+0x44/0xb0
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219145]  [] ? generic_file_aio_write+0x0/0xb0
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219146]  [] do_sync_readv_writev+0xed/0x130
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219148]  [] ? iput+0x2b/0x70
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219150]  [] ? autoremove_wake_function+0x0/0x40
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219152]  [] ? find_acceptable_alias+0x23/0x140 [exportfs]
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219155]  [] ? __kmalloc+0x80/0x160
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219156]  [] ? security_file_permission+0x11/0x20
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219158]  [] do_readv_writev+0xcb/0x1e0
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219161]  [] ? nfsd_setuser+0x113/0x2d0 [nfsd]
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219164]  [] ? nfsd_setuser_and_check_port+0x5c/0x60 [nfsd]
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219165]  [] vfs_writev+0x39/0x60
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219168]  [] nfsd_vfs_write+0x106/0x430 [nfsd]
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219170]  [] ? dentry_open+0x4d/0xb0
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219173]  [] ? nfsd_open+0x15c/0x1e0 [nfsd]
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219176]  [] nfsd_write+0xe5/0x100 [nfsd]
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219179]  [] nfsd3_proc_write+0xfe/0x140 [nfsd]
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219182]  [] nfsd_dispatch+0xb5/0x230 [nfsd]
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219187]  [] svc_process+0x477/0x780 [sunrpc]
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219188]  [] ? wake_up_process+0x10/0x20
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219191]  [] ? nfsd+0x0/0x150 [nfsd]
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219193]  [] nfsd+0xbd/0x150 [nfsd]
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219195]  [] kthread+0x8e/0xa0
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219197]  [] child_rip+0xa/0x20
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219199]  [] ? kthread+0x0/0xa0
Jun  4 15:23:21 OVMiddleEarth kernel: [  361.219200]  [] ? child_rip+0x0/0x20

There were similar errors in /var/log/messages, so the same issue appears with two different kernels and the kernel version is probably not the problem. This time the call trace pointed directly at the network, so I thought for a while and decided to check the network stack. That seemed to be it: the kernel network parameters were set to defaults, so I set a number of them.

net.core.wmem_max=12582912
net.core.rmem_max=12582912
net.ipv4.tcp_rmem= 10240 87380 12582912
net.ipv4.tcp_wmem= 10240 87380 12582912
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_timestamps = 1
net.ipv4.tcp_sack = 1
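
The list above can be turned into runtime commands without retyping it. This is a dry-run sketch: `sysctl_cmds` is a hypothetical helper that only prints `sysctl -w` commands, so the result can be reviewed before anything changes. It assumes one `key = value` setting per line.

```shell
# sysctl_cmds  (hypothetical helper)
# Reads "key = value" lines on stdin, drops comments and blank lines,
# normalises the spacing around "=" and prints one `sysctl -w` command per line.
sysctl_cmds() {
    sed -e '/^[[:space:]]*#/d' \
        -e '/^[[:space:]]*$/d' \
        -e 's/[[:space:]]*=[[:space:]]*/=/' \
        -e "s/.*/sysctl -w '&'/"
}

# Review first, then apply as root:
#   sysctl_cmds < net-settings.txt
#   sysctl_cmds < net-settings.txt | sh
```

Settings applied with `sysctl -w` are lost at reboot; to keep them, the same lines go into /etc/sysctl.conf and are loaded with `sysctl -p`.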

After that change the issue was solved for a short time, but then it happened again. I ended up installing tshark to investigate the NFS packets, and there were a lot of lost ACK segments on the loopback interface. I stopped Oracle VM Manager and used the following commands to replicate unpacking an assembly; after that I started Oracle VM Manager again.

[root@OVMiddleEarth ~]# cat /OVS/Repositories/0004fb0000030000c7347e844b6d10ac/Assemblies/0004fb0011c5ece/unpacked/System.img | gzip -dc | dd of=/OVS/Repositories/0004fb0000030000c7347e844b6d10ac/VirtualDisks/marcin.img bs=1M

and in another window

[root@OVMiddleEarth ~]# tshark -i lo -w lo.trc

When the D-state appeared again, I had a trace file to investigate:

[root@OVMiddleEarth ~]# tshark -r lo.trc | grep -i NFS
...
5238   5.870485 192.168.1.30 -> 192.168.1.30 TCP nfs > 725 [ACK] Seq=46726805 Ack=73719217 Win=194 Len=0 TSV=79041 TSER=79041
5239   5.870490 192.168.1.30 -> 192.168.1.30 TCP [TCP ACKed lost segment] nfs > 725 [ACK] Seq=46726805 Ack=73751985 Win=194 Len=0 TSV=79041 TSER=79041
5240   5.870493 192.168.1.30 -> 192.168.1.30 TCP [TCP ACKed lost segment] nfs > 725 [ACK] Seq=46726805 Ack=73784753 Win=194 Len=0 TSV=79041 TSER=79041
5241   5.870497 192.168.1.30 -> 192.168.1.30 TCP [TCP ACKed lost segment] nfs > 725 [ACK] Seq=46726805 Ack=73817521 Win=194 Len=0 TSV=79041 TSER=79041
5243   5.870500 192.168.1.30 -> 192.168.1.30 TCP nfs > 725 [ACK] Seq=46726805 Ack=73850289 Win=194 Len=0 TSV=79041 TSER=79041
5244   5.870502 192.168.1.30 -> 192.168.1.30 TCP [TCP ACKed lost segment] nfs > 725 [ACK] Seq=46726805 Ack=73883057 Win=194 Len=0 TSV=79041 TSER=79041
5246   5.912057 192.168.1.30 -> 192.168.1.30 TCP [TCP ACKed lost segment] nfs > 725 [ACK] Seq=46726805 Ack=73932209 Win=90 Len=0 TSV=79051 TSER=79041
5248   5.952204 192.168.1.30 -> 192.168.1.30 TCP nfs > 725 [ACK] Seq=46726805 Ack=73948593 Win=26 Len=0 TSV=79061 TSER=79051
5268   6.163832 192.168.1.30 -> 192.168.1.30 TCP [TCP ZeroWindow] nfs > 725 [ACK] Seq=46726805 Ack=73955249 Win=0 Len=0 TSV=79114 TSER=79114
5270   6.375827 192.168.1.30 -> 192.168.1.30 TCP [TCP Keep-Alive] 725 > nfs [ACK] Seq=73955248 Ack=46726805 Win=8197 Len=0 TSV=79167 TSER=79114
5271   6.375849 192.168.1.30 -> 192.168.1.30 TCP [TCP ZeroWindow] nfs > 725 [ACK] Seq=46726805 Ack=73955249 Win=0 Len=0 TSV=79167 TSER=79114
5272   6.799804 192.168.1.30 -> 192.168.1.30 TCP [TCP Keep-Alive] 725 > nfs [ACK] Seq=73955248 Ack=46726805 Win=8197 Len=0 TSV=79273 TSER=79167
5273   6.799820 192.168.1.30 -> 192.168.1.30 TCP [TCP ZeroWindow] nfs > 725 [ACK] Seq=46726805 Ack=73955249 Win=0 Len=0 TSV=79273 TSER=79114
5309   7.647833 192.168.1.30 -> 192.168.1.30 TCP [TCP Keep-Alive] 725 > nfs [ACK] Seq=73955248 Ack=46726805 Win=8197 Len=0 TSV=79485 TSER=79273
5310   7.647857 192.168.1.30 -> 192.168.1.30 TCP [TCP ZeroWindow] nfs > 725 [ACK] Seq=46726805 Ack=73955249 Win=0 Len=0 TSV=79485 TSER=79114
5361   9.343832 192.168.1.30 -> 192.168.1.30 TCP [TCP Keep-Alive] 725 > nfs [ACK] Seq=73955248 Ack=46726805 Win=8197 Len=0 TSV=79909 TSER=79485
5362   9.343864 192.168.1.30 -> 192.168.1.30 TCP [TCP ZeroWindow] nfs > 725 [ACK] Seq=46726805 Ack=73955249 Win=0 Len=0 TSV=79909 TSER=79114
5792  12.735830 192.168.1.30 -> 192.168.1.30 TCP [TCP Keep-Alive] 725 > nfs [ACK] Seq=73955248 Ack=46726805 Win=8197 Len=0 TSV=80757 TSER=79909
5793  12.735852 192.168.1.30 -> 192.168.1.30 TCP [TCP ZeroWindow] nfs > 725 [ACK] Seq=46726805 Ack=73955249 Win=0 Len=0 TSV=80757 TSER=79114
5966  19.519835 192.168.1.30 -> 192.168.1.30 TCP [TCP Keep-Alive] 725 > nfs [ACK] Seq=73955248 Ack=46726805 Win=8197 Len=0 TSV=82453 TSER=80757
5967  19.519866 192.168.1.30 -> 192.168.1.30 TCP [TCP ZeroWindow] nfs > 725 [ACK] Seq=46726805 Ack=73955249 Win=0 Len=0 TSV=82453 TSER=79114
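
With a long listing like this it can help to count the analysis annotations that tshark puts in square brackets before reading packet by packet. A minimal sketch, assuming the one-line-per-packet text output shown above; `count_tcp_flags` is a hypothetical name.

```shell
# count_tcp_flags  (hypothetical helper)
# Reads tshark summary lines on stdin and prints how often each "[TCP ...]"
# annotation occurs, most frequent first. Plain flag lists such as "[ACK]"
# are ignored because they do not start with "TCP ".
count_tcp_flags() {
    grep -o '\[TCP [^]]*\]' | sort | uniq -c | sort -rn
}

# Live use:
#   tshark -r lo.trc | grep -i nfs | count_tcp_flags
```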

So there were problems, and the NFS connection stopped making progress around packets 5268 - 5270, where the server starts advertising a zero window. So let's see what happened:

[root@OVMiddleEarth ~]# tshark -r lo.trc | grep -e "^52[456]."
Running as user "root" and group "root". This could be dangerous.
524   3.914358 192.168.1.30 -> 192.168.1.30 TCP [TCP ACKed lost segment] 725 > nfs [ACK] Seq=7745 Ack=6477389 Win=6148 Len=0 TSV=78551 TSER=78551
525   3.914369 192.168.1.30 -> 192.168.1.30 TCP [TCP ACKed lost segment] 725 > nfs [ACK] Seq=7745 Ack=6510157 Win=6148 Len=0 TSV=78551 TSER=78551
526   3.914379 192.168.1.30 -> 192.168.1.30 TCP [TCP ACKed lost segment] 725 > nfs [ACK] Seq=7745 Ack=6542925 Win=6148 Len=0 TSV=78551 TSER=78551
5240   5.870493 192.168.1.30 -> 192.168.1.30 TCP [TCP ACKed lost segment] nfs > 725 [ACK] Seq=46726805 Ack=73784753 Win=194 Len=0 TSV=79041 TSER=79041
5241   5.870497 192.168.1.30 -> 192.168.1.30 TCP [TCP ACKed lost segment] nfs > 725 [ACK] Seq=46726805 Ack=73817521 Win=194 Len=0 TSV=79041 TSER=79041
5242   5.870499 192.168.1.30 -> 192.168.1.30 RPC [TCP Previous segment lost] Continuation
5243   5.870500 192.168.1.30 -> 192.168.1.30 TCP nfs > 725 [ACK] Seq=46726805 Ack=73850289 Win=194 Len=0 TSV=79041 TSER=79041
5244   5.870502 192.168.1.30 -> 192.168.1.30 TCP [TCP ACKed lost segment] nfs > 725 [ACK] Seq=46726805 Ack=73883057 Win=194 Len=0 TSV=79041 TSER=79041
5245   5.870513 192.168.1.30 -> 192.168.1.30 RPC Continuation
5246   5.912057 192.168.1.30 -> 192.168.1.30 TCP [TCP ACKed lost segment] nfs > 725 [ACK] Seq=46726805 Ack=73932209 Win=90 Len=0 TSV=79051 TSER=79041
5247   5.912074 192.168.1.30 -> 192.168.1.30 RPC Continuation
5248   5.952204 192.168.1.30 -> 192.168.1.30 TCP nfs > 725 [ACK] Seq=46726805 Ack=73948593 Win=26 Len=0 TSV=79061 TSER=79051
5249   6.023955 192.168.1.30 -> 192.168.1.30 TCP 34311 > 54321 [PSH, ACK] Seq=16637 Ack=3042 Win=48 [TCP CHECKSUM INCORRECT] Len=831 TSV=79079 TSER=78940
5250   6.024731 192.168.1.30 -> 192.168.1.30 TCP 54321 > 34311 [PSH, ACK] Seq=3042 Ack=17468 Win=48 [TCP CHECKSUM INCORRECT] Len=38 TSV=79079 TSER=79079
5251   6.024773 192.168.1.30 -> 192.168.1.30 TCP 34311 > 54321 [ACK] Seq=17468 Ack=3080 Win=48 Len=0 TSV=79079 TSER=79079
5252   6.024835 192.168.1.30 -> 192.168.1.30 TCP 34311 > 54321 [PSH, ACK] Seq=17468 Ack=3080 Win=48 [TCP CHECKSUM INCORRECT] Len=220 TSV=79079 TSER=79079
5253   6.024954 192.168.1.30 -> 192.168.1.30 TCP 54321 > 34311 [PSH, ACK] Seq=3080 Ack=17688 Win=48 [TCP CHECKSUM INCORRECT] Len=51 TSV=79079 TSER=79079
5254   6.063807 192.168.1.30 -> 192.168.1.30 TCP 34311 > 54321 [ACK] Seq=17688 Ack=3131 Win=48 Len=0 TSV=79089 TSER=79079
5255   6.071880 192.168.1.30 -> 192.168.1.30 TCP 34311 > 54321 [PSH, ACK] Seq=17688 Ack=3131 Win=48 [TCP CHECKSUM INCORRECT] Len=212 TSV=79091 TSER=79079
5256   6.072005 192.168.1.30 -> 192.168.1.30 TCP 54321 > 34311 [PSH, ACK] Seq=3131 Ack=17900 Win=48 [TCP CHECKSUM INCORRECT] Len=50 TSV=79091 TSER=79091
5257   6.072064 192.168.1.30 -> 192.168.1.30 TCP 34311 > 54321 [ACK] Seq=17900 Ack=3181 Win=48 Len=0 TSV=79091 TSER=79091
5258   6.079861 192.168.1.30 -> 192.168.1.30 TCP 57168 > 0 [SYN] Seq=0 Win=32792 Len=0 MSS=16396 TSV=79093 TSER=0 WS=8
5259   6.079876 192.168.1.30 -> 192.168.1.30 TCP 0 > 57168 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0
5260   6.159933 192.168.1.30 -> 192.168.1.30 TCP 34311 > 54321 [PSH, ACK] Seq=17900 Ack=3181 Win=48 [TCP CHECKSUM INCORRECT] Len=251 TSV=79113 TSER=79091
5261   6.160069 192.168.1.30 -> 192.168.1.30 TCP 54321 > 34311 [PSH, ACK] Seq=3181 Ack=18151 Win=48 [TCP CHECKSUM INCORRECT] Len=50 TSV=79113 TSER=79113
5262   6.160126 192.168.1.30 -> 192.168.1.30 TCP 34311 > 54321 [ACK] Seq=18151 Ack=3231 Win=48 Len=0 TSV=79113 TSER=79113
5263   6.160183 192.168.1.30 -> 192.168.1.30 TCP 34311 > 54321 [PSH, ACK] Seq=18151 Ack=3231 Win=48 [TCP CHECKSUM INCORRECT] Len=236 TSV=79113 TSER=79113
5264   6.160303 192.168.1.30 -> 192.168.1.30 TCP 54321 > 34311 [PSH, ACK] Seq=3231 Ack=18387 Win=48 [TCP CHECKSUM INCORRECT] Len=26 TSV=79113 TSER=79113
5265   6.160410 192.168.1.30 -> 192.168.1.30 TCP 34311 > 54321 [PSH, ACK] Seq=18387 Ack=3257 Win=48 [TCP CHECKSUM INCORRECT] Len=249 TSV=79113 TSER=79113
5266   6.160533 192.168.1.30 -> 192.168.1.30 TCP 54321 > 34311 [PSH, ACK] Seq=3257 Ack=18636 Win=48 [TCP CHECKSUM INCORRECT] Len=50 TSV=79113 TSER=79113
5267   6.163807 192.168.1.30 -> 192.168.1.30 RPC Continuation
5268   6.163832 192.168.1.30 -> 192.168.1.30 TCP [TCP ZeroWindow] nfs > 725 [ACK] Seq=46726805 Ack=73955249 Win=0 Len=0 TSV=79114 TSER=79114
5269   6.199807 192.168.1.30 -> 192.168.1.30 TCP 34311 > 54321 [ACK] Seq=18636 Ack=3307 Win=48 Len=0 TSV=79123 TSER=79113
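
The pattern "^52[456]." also matches the three-digit frames 524 to 526 followed by a space, which is why they show up at the top of the output. Selecting by numeric range avoids that. A sketch, assuming the frame number is the first column; `frames_between` is a hypothetical name.

```shell
# frames_between FIRST LAST  (hypothetical helper)
# Reads tshark summary lines on stdin and prints those whose frame number
# (first column) is between FIRST and LAST inclusive. Non-numeric first
# columns evaluate to 0 and are dropped.
frames_between() {
    awk -v first="$1" -v last="$2" '$1 + 0 >= first + 0 && $1 + 0 <= last + 0'
}

# Live use:
#   tshark -r lo.trc | frames_between 5240 5269
```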

So it looks like the NFS connection stalls when other packets, from Oracle VM Manager or the local Oracle XE database, appear on the loopback interface. Probably (I can't prove it so far) the missing ACKs are part of the problem, but why is [nfsd] hanging on writing to disk?
Anyway, I still wanted to test Oracle VM, so I decided to use iSCSI on loopback instead of NFS.

Adding local iSCSI server

I found documentation on how to set up an iSCSI server here. So let's start again:

# yum install scsi-target-utils

====================================================================================================================================================================
 Package                                       Arch                             Version                                  Repository                            Size
====================================================================================================================================================================
Installing:
 scsi-target-utils                             x86_64                           1.0.14-2.el5                             el5_latest                           172 k
Installing for dependencies:
 libibverbs                                    x86_64                           1.1.3-2.el5                              el5_latest                            45 k
 libnes                                        x86_64                           0.9.0-2.el5                              el5_latest                            13 k
 librdmacm                                     x86_64                           1.0.10-1.el5                             el5_latest                            22 k
 openib                                        noarch                           1.4.1-6.el5                              el5_latest                            20 k
 perl                                          x86_64                           4:5.8.8-38.el5                           el5_latest                            12 M
 perl-Config-General                           noarch                           2.40-1.el5                               el5_latest                            68 k

Now it is time to add some block devices to share. We need at least two: one will be used as the voting disk for OCFS2 and the other for keeping data. TGT (the iSCSI server) is quite flexible, so we can use a file on a file system presented as a block device.

[root@OVMiddleEarth ~]# dd if=/dev/zero of=/etc/tgt/small_disk bs=1M count=1000
[root@OVMiddleEarth ~]# vi /etc/tgt/targets.conf

<target iqn.2008-09.com.example:server1.trial>
    backing-store /dev/sdb2 # my free partition
    backing-store /etc/tgt/small_disk # small file for the OCFS vote disk, at least 1 GB
    write-cache off # disabling the write cache is very important: TGT is killed at reboot and the cache would not be synced
</target>

Let's start TGT:

[root@OVMiddleEarth ~]# service tgtd start
[root@OVMiddleEarth ~]# chkconfig tgtd on

A little hack to start TGTD just after the network service and before iSCSI:

[root@OVMiddleEarth ~]#  cd /etc
[root@OVMiddleEarth etc]# mv rc2.d/S39tgtd rc2.d/S11tgtd
[root@OVMiddleEarth etc]# mv rc3.d/S39tgtd rc3.d/S11tgtd
[root@OVMiddleEarth etc]# mv rc4.d/S39tgtd rc4.d/S11tgtd
[root@OVMiddleEarth etc]# mv rc5.d/S39tgtd rc5.d/S11tgtd
[root@OVMiddleEarth etc]# ls -lR rc?.d/*tgt*
lrwxrwxrwx 1 root root 14 Jun  7 17:33 rc0.d/K35tgtd -> ../init.d/tgtd
lrwxrwxrwx 1 root root 14 Jun  7 17:33 rc1.d/K35tgtd -> ../init.d/tgtd
lrwxrwxrwx 1 root root 14 Jun  7 17:33 rc2.d/S11tgtd -> ../init.d/tgtd
lrwxrwxrwx 1 root root 14 Jun  7 17:33 rc3.d/S11tgtd -> ../init.d/tgtd
lrwxrwxrwx 1 root root 14 Jun  7 17:33 rc4.d/S11tgtd -> ../init.d/tgtd
lrwxrwxrwx 1 root root 14 Jun  7 17:33 rc5.d/S11tgtd -> ../init.d/tgtd
lrwxrwxrwx 1 root root 14 Jun  7 17:33 rc6.d/K35tgtd -> ../init.d/tgtd
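
The four mv commands can be written as one loop. This is only a sketch; `move_start_link` is a hypothetical helper name, and the rc directory root is a parameter so it can be tried on a copy first.

```shell
# move_start_link ETC SERVICE OLD NEW  (hypothetical helper)
# Renames S<OLD><SERVICE> to S<NEW><SERVICE> in ETC/rc2.d .. ETC/rc5.d,
# leaving the K* (kill) links and other runlevels alone.
move_start_link() {
    etc=$1 svc=$2 old=$3 new=$4
    for rl in 2 3 4 5; do
        src="$etc/rc$rl.d/S$old$svc"
        # -h is true for a symlink even if its target is missing;
        # runlevels without the link are skipped
        [ -h "$src" ] && mv "$src" "$etc/rc$rl.d/S$new$svc"
    done
    return 0
}

# Equivalent of the commands above (as root):
#   move_start_link /etc tgtd 39 11
```

One caveat, stated as an assumption rather than something tested here: the start priority also lives in the chkconfig header of the init script, so a later `chkconfig tgtd on` or `chkconfig tgtd reset` may recreate the links with the original number.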

And now load the new configuration:

   
[root@OVMiddleEarth ~]# tgt-admin --execute

List Active Targets

[root@OVMiddleEarth ~]# tgtadm --lld iscsi --mode target --op show
Target 1: iqn.2008-09.com.example:server1.trial
    System information:
        Driver: iscsi
        State: ready
    I_T nexus information:
    LUN information:
        LUN: 0
            Type: controller
            SCSI ID: IET     00010000
            SCSI SN: beaf10
            Size: 0 MB, Block size: 1
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: null
            Backing store path: None
            Backing store flags:
        LUN: 1
            Type: disk
            SCSI ID: IET     00010001
            SCSI SN: beaf11
            Size: 200006 MB, Block size: 512
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: rdwr
            Backing store path: /dev/sdb2
            Backing store flags:
        LUN: 2
            Type: disk
            SCSI ID: IET     00010002
            SCSI SN: beaf12
            Size: 1049 MB, Block size: 512
            Online: Yes
            Removable media: No
            Readonly: No
            Backing store type: rdwr
            Backing store path: /etc/tgt/small_disk
            Backing store flags:
    Account information:
    ACL information:
        ALL
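
To check quickly that every LUN points at the intended backing store, the listing can be reduced to one line per LUN. A sketch that relies only on the field labels visible in the output above; `lun_paths` is a hypothetical name.

```shell
# lun_paths  (hypothetical helper)
# Reads `tgtadm --lld iscsi --mode target --op show` output on stdin and
# prints "LUN <number> <backing store path>" for each LUN.
lun_paths() {
    awk '
        $1 == "LUN:" { lun = $2 }        # "LUN information:" has no colon after LUN and is skipped
        $1 == "Backing" && $2 == "store" && $3 == "path:" { print "LUN", lun, $4 }
    '
}

# Live use:
#   tgtadm --lld iscsi --mode target --op show | lun_paths
```

LUN 0 is the controller LUN and reports "None" as its path, as in the listing above.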

So now we have an iSCSI server and can add it to Oracle VM. I added it as a new Storage Array using an iSCSI Storage Server, and I added new iSCSI initiators to the Access Group - here is my configuration:

When both LUNs had been presented to Oracle VM, I created a server pool (it has to be a clustered one even for a single server; I'm still not sure why, but I was unable to create an OCFS repository for a non-clustered server pool).

Then I created a repository and was able to import Assemblies and create a template without any issues. Creating my first VM from the template worked as well, and at the end I had my first Oracle VM machine.

So what I like in Oracle VM:

Assemblies from Oracle with preconfigured tools

What I dislike in Oracle VM (this may change when I know the tool better):

Tricky installation process in a non-production environment
Local storage (repository) requires a whole empty disk unless you set up NFS / iSCSI on the local host
Assemblies can be imported via http(s)/ftp only. Why is there no SCP and register functionality (or maybe I just don't know how to do it)?
Oracle VM Manager is quite big. After some tuning it can run in 2 GB, but that is still a lot for management only
No command line tools, which makes it tricky to manage if you have an ssh connection only

One more hack: Oracle VM Manager should be started after all Oracle VM Server processes.

[root@OVMiddleEarth etc]# mv rc2.d/S99ovmm rc2.d/S99xovmm
[root@OVMiddleEarth etc]# ls -l rc?.d/S*ovmm
lrwxrwxrwx 1 root root 14 Jun  4 14:46 rc2.d/S99xovmm -> ../init.d/ovmm
lrwxrwxrwx 1 root root 14 Jun  4 14:46 rc3.d/S99xovmm -> ../init.d/ovmm
lrwxrwxrwx 1 root root 14 Jun  4 14:46 rc4.d/S99xovmm -> ../init.d/ovmm
lrwxrwxrwx 1 root root 14 Jun  4 14:46 rc5.d/S99xovmm -> ../init.d/ovmm

regards,

Marcin

Friday, April 27, 2012

HCC on non-Exadata - How Oracle is detecting storage type

This is the next part of the HCC compression series on non-Exadata. When 11.2.0.3 was released, Oracle announced that HCC compression would be possible on ZFS Appliance and Pillar Axiom storage, and patch 13041324 was released as well. I blogged about it and was able to run HCC on the ZFS Appliance simulator and on default Linux NFS as well. After some time Oracle raised "Bug 13362079 HCC compression should not be allowed on dNFS with non ZFS or Pillar", and it was fixed in PSU 11.2.0.3.1 (patch 13343438). After applying that PSU I was no longer able to create an HCC compressed table.

I was wondering how Oracle checks the storage type, since as far as I know NFS doesn't have that functionality. I compared wireshark network dumps from the old and new Oracle versions and there was no difference. At that point I thought that maybe a new firmware upgrade for the ZFS Appliance was required, but unfortunately I couldn't download it and apply it to my simulator. After a discussion on Twitter with @GregRahn @alexgorbachev @AlexFatkulin @kevinclosson I was told that Oracle can use SNMP to check the storage type. That was a new idea, and I recalled that I had seen an SNMP-related error in trace files but had ignored it, as it appeared both for 11.2.0.3 with the patch and for 11.2.0.3.1.

This time I decided to dig into it. This is the first error I saw in the DBWR trace:

test_dbw0_11626.trc: [1332587314469093] skgnfs_setup_snmp:250: dlopen errno = 0, errstr = libnetsnmp.so: cannot open shared object file: No such file or directory

So either the shared library is not on the system (the server is running OEL 5.6) or a symlink is missing. Let's try with a new symlink:

ln -s /usr/lib64/libnetsnmp.so.10 /usr/lib64/libnetsnmp.so
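
The trace shows that the unversioned name libnetsnmp.so is what gets opened, so the link has to exist under exactly that name. A slightly more defensive version of the ln command is sketched below; `ensure_unversioned_link` is a hypothetical helper name, and the directory is a parameter so the behaviour can be checked in a scratch directory.

```shell
# ensure_unversioned_link DIR BASENAME  (hypothetical helper)
# If DIR/BASENAME does not exist, link it to the first DIR/BASENAME.* found,
# for example libnetsnmp.so -> libnetsnmp.so.10. Returns 1 if nothing matches.
ensure_unversioned_link() {
    dir=$1 base=$2
    [ -e "$dir/$base" ] && return 0            # already resolvable, nothing to do
    for cand in "$dir/$base".*; do
        [ -e "$cand" ] || continue             # the glob matched nothing
        ln -s "$(basename "$cand")" "$dir/$base"   # relative link inside DIR
        return 0
    done
    echo "no $base.* found in $dir" >&2
    return 1
}

# On the server from this post (as root):
#   ensure_unversioned_link /usr/lib64 libnetsnmp.so
```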

Let's check the DBWR trace now. It looks better: the library has been found, but the target host is not responding.

test_dbw0_3575.trc: [1334661892496086] skgnfs_query_snmp:1831: Timeout error 2 for server 10.10.10.60

I connected to my ZFS Appliance simulator and enabled SNMP using network=0.0.0.0/0 as the network filter. I restarted Oracle and there were no more SNMP errors in the DBWR trace. I enabled wireshark again and this is what was captured:

12:44:14.276823 IP 10.10.10.51.20671 > 10.10.10.60.snmp:  GetRequest(33)  E:sun.2.225.1.4.2.0
12:44:14.289691 IP 10.10.10.60.snmp > 10.10.10.51.20671:  GetResponse(59)  E:sun.2.225.1.4.2.0=[|snmp]

Now we can use snmpwalk to check how the ZFS Appliance simulator responds to SNMP requests.

[root@dg1 mibs]# snmpwalk -O n -v 1 -c public 10.10.10.60 .1 | grep 225
.1.3.6.1.4.1.42.2.225.1.4.1.0 = STRING: "sunstore"
.1.3.6.1.4.1.42.2.225.1.4.2.0 = STRING: "Sun ZFS Storage VirtualBox"
.1.3.6.1.4.1.42.2.225.1.4.3.0 = STRING: "2011.04.24.1.0,1-1.8"
.1.3.6.1.4.1.42.2.225.1.4.4.0 = Timeticks: (938601700) 108 days, 15:13:37.00
.1.3.6.1.4.1.42.2.225.1.4.5.0 = Timeticks: (938601700) 108 days, 15:13:37.00
.1.3.6.1.4.1.42.2.225.1.4.6.0 = Timeticks: (329800) 0:54:58.00
.1.3.6.1.4.1.42.2.225.1.4.7.0 = STRING: "f2513e14-f8c2-6d7e-fc29-bbd8078aad24"
.1.3.6.1.4.1.42.2.225.1.4.8.0 = STRING: "unknown"
.1.3.6.1.4.1.42.2.225.1.4.9.0 = STRING: "Oracle 000-0000"
.1.3.6.1.4.1.42.2.225.1.5.1.0 = STRING: "AKCS_UNCONFIGURED"
.1.3.6.1.4.1.42.2.225.1.6.1.2.1 = STRING: "zfspool/default/zfstest"
.1.3.6.1.4.1.42.2.225.1.6.1.3.1 = STRING: "zfspool"
.1.3.6.1.4.1.42.2.225.1.6.1.4.1 = STRING: "default"
.1.3.6.1.4.1.42.2.225.1.6.1.5.1 = STRING: "zfstest"
.1.3.6.1.4.1.42.2.225.1.6.1.6.1 = STRING: "/export/zfstest"
.1.3.6.1.4.1.42.2.225.1.6.1.7.1 = Counter32: 7
.1.3.6.1.4.1.42.2.225.1.6.1.8.1 = Counter32: 0
.1.3.6.1.4.1.42.2.225.1.6.1.9.1 = Counter32: 7

Oracle is looking for sun.2.225.1.4.2.0, and the OID .1.3.6.1.4.1.42.2.225.1.4.2.0 matches it number for number. It returns the value "Sun ZFS Storage VirtualBox", which is the type of the NFS server. I think the word "VirtualBox" in the name is the part that does not match. To confirm that I googled for screenshots and found this link. On page 11 I found the information I was looking for from a physical ZFS Appliance, where the name looks like "Sun ZFS Storage 7xxx".
I hope Oracle will accept the simulator name as a valid name for HCC. In my opinion it could be used for HCC evaluation, like the dbms_compression package but on a different scale. From a business perspective no one will use the simulator for a production workload.
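
To pull out just the value that Oracle asks for, the snmpwalk output can be filtered by that OID. The sketch below assumes the "-O n" output format shown above; product_name and PRODUCT_OID are names invented for this example, and nothing here says which strings Oracle actually accepts.

```shell
# OID seen in the capture above (sun = .1.3.6.1.4.1.42)
PRODUCT_OID='.1.3.6.1.4.1.42.2.225.1.4.2.0'

# Read "snmpwalk -O n" output on stdin and print the quoted string
# returned for PRODUCT_OID (hypothetical helper for illustration).
product_name() {
    awk -v oid="$PRODUCT_OID" '
        $1 == oid {
            sub(/^[^"]*"/, "")   # drop everything up to the opening quote
            sub(/"[^"]*$/, "")   # drop the closing quote
            print
            exit
        }'
}
```

It can be fed directly from the command used above, for example: snmpwalk -O n -v 1 -c public 10.10.10.60 .1 | product_name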

But is there any way to test it now? The ZFS Appliance type is not configurable during the simulator installation process, but there are two options:
- libnetsnmp.so is an open source library, so it can be amended to return the proper value, but this is the hard way
- The other possibility is to realize that the Sun ZFS Appliance simulator is just a Solaris box. There is a chance that, like many other parameters in UNIX, the type name is saved in a text file, and if you are able to access this file you can change it.
After a few tries I was able to change it. I did it for educational purposes only and I'm not sure if I can share all the steps I took.
In the end SNMP returns a different name:

[root@dg1 ~]# snmpwalk -O n -v 1 -c public 10.10.10.60 .1 | grep 225
.1.3.6.1.4.1.42.2.225.1.4.1.0 = STRING: "sunstore"
.1.3.6.1.4.1.42.2.225.1.4.2.0 = STRING: "Sun ZFS Storage 7420"

Now it is time to try HCC in version 11.2.0.3.1 on the newly configured ZFS simulator, and YES, it is working again.

Disclaimer:
According to the Oracle license, Hybrid Columnar Compression can be run on ZFS Appliance and Pillar Axiom storage only. This post is for educational purposes only, to understand how the DB detects the storage type and how to enable it if you have the proper hardware.

regards,
Marcin

Thursday, April 19, 2012

Oracle and HugePages

I have had a very bad experience running Oracle with a quite large SGA (60 GB) on RedHat 5.6 without HugePages. The host was completely blocked and I was wondering what the root cause was.

I used the following test configuration:
Host: 96 GB RAM, 2 sockets, 12 cores, 24 threads

Oracle: 11.2.0.2
SGA_TARGET = 60 GB
I also set "pre_page_sga" to be sure that all memory is allocated during instance startup.

I started the instance without HugePages and this is the output from meminfo:

testbox1$ cat /proc/meminfo
MemTotal:     98999832 kB
MemFree:      21527276 kB
Buffers:        933668 kB
Cached:       69720980 kB
SwapCached:          0 kB
Active:       64168548 kB
Inactive:      6802180 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     98999832 kB
LowFree:      21527276 kB
SwapTotal:     2096472 kB
SwapFree:      2096472 kB
Dirty:            1548 kB
Writeback:           0 kB
AnonPages:      316056 kB
Mapped:       62708540 kB
Slab:           599964 kB
PageTables:    3679796 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  50572388 kB
Committed_AS: 64086964 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    264440 kB
VmallocChunk: 34359473527 kB
HugePages_Total:  1000
HugePages_Free:   1000
HugePages_Rsvd:      0
Hugepagesize:     2048 kB

As you can see, the Linux kernel used around 3.5 GB just for the internal structures that map default-size pages (the PageTables line). Now let's check the same with HugePages:

testbox1$ cat /proc/meminfo
MemTotal:     98999832 kB
MemFree:      14360908 kB
Buffers:        916216 kB
Cached:        9453272 kB
SwapCached:          0 kB
Active:        2191376 kB
Inactive:      8286108 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:     98999832 kB
LowFree:      14360908 kB
SwapTotal:     2096472 kB
SwapFree:      2096472 kB
Dirty:            3696 kB
Writeback:           0 kB
AnonPages:      230380 kB
Mapped:          76648 kB
Slab:           594156 kB
PageTables:      14368 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:  14841956 kB
Committed_AS:  1102392 kB
VmallocTotal: 34359738367 kB
VmallocUsed:    264404 kB
VmallocChunk: 34359473527 kB
HugePages_Total: 35893
HugePages_Free:   5245
HugePages_Rsvd:     73
Hugepagesize:     2048 kB
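
A rough calculation helps to read the PageTables line in both outputs. The sketch below assumes 8-byte page-table entries (x86_64), counts only the lowest page-table level, and assumes that every process mapping the SGA needs its own entries; meminfo_kb and pte_kb_per_process are names invented for this example.

```shell
# Print the value (in kB) of one /proc/meminfo field read from stdin.
# Usage: meminfo_kb PageTables < /proc/meminfo
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2; exit }'
}

# Estimate the page-table size in kB that ONE process needs to map an SGA.
# Assumes 8 bytes per entry and ignores the upper page-table levels.
# Args: SGA size in GB, page size in kB (4 or 2048)
pte_kb_per_process() {
    local sga_gb=$1 page_kb=$2
    local pages=$(( sga_gb * 1024 * 1024 / page_kb ))
    echo $(( pages * 8 / 1024 ))
}

pte_kb_per_process 60 4      # 4 kB pages
pte_kb_per_process 60 2048   # 2 MB HugePages
```

Under these assumptions a 60 GB SGA costs about 120 MB of page tables per process with 4 kB pages and about 240 kB with 2 MB pages, so a few dozen processes touching the whole SGA would be enough to reach the 3.5 GB seen without HugePages. This is only an order-of-magnitude reading, not a measurement.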

Now it is much better: PageTables is around 14 MB. Compared with 3.5 GB from the previous output, this is proof that a non-HugePages environment wastes a lot of memory. But wasted memory is not the biggest issue here. Let’s try to connect to the database and run a simple query:

testbox1$ time sqlplus -s / as sysdba << EOF
select * from dual;
> exit; 
> EOF

D
-
X


real    0m9.645s
user    0m0.006s
sys     0m0.007s

It took 9.645 sec to connect and run the select, so where was all that time spent?
Using strace to find the answer was not 100% successful: there is a matching gap between the chdir and mmap calls, so it looks like the time is spent on CPU.

15:35:57.494002 stat("/opt/app/oracle/product/11.2.0.2/db1/lib", {st_mode=S_IFDIR|0755, st_size=12288, ...}) = 0 <0.000008>
15:35:57.494044 chdir("/opt/app/oracle/product/11.2.0.2/db1/dbs") = 0 <0.000009>
15:36:07.105693 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b5320e95000 <0.000014>
15:36:07.105758 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b5320eb8000 <0.000007>
15:36:07.105831 mmap(NULL, 143360, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b5320edb000 <0.000008>
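
In a long trace such a gap is easy to miss, so it can be located automatically. The sketch below assumes the trace lines start with a wall-clock timestamp as above (the format produced by strace -tt) and that the trace does not cross midnight; largest_gap is a name invented for this example.

```shell
# Read strace output with leading HH:MM:SS.usec timestamps on stdin and
# print the largest gap in seconds between two consecutive lines,
# followed by the name of the call logged just before the gap.
# Assumes all timestamps fall on the same day (no midnight wrap).
largest_gap() {
    awk '
        {
            split($1, t, ":")
            now = t[1] * 3600 + t[2] * 60 + t[3]
            if (NR > 1 && now - prev > max) { max = now - prev; call = prevcall }
            prev = now
            prevcall = $2
            sub(/\(.*/, "", prevcall)   # keep only the syscall name
        }
        END { if (NR > 1) printf "%.6f %s\n", max, call }'
}
```

For the five lines above it reports a gap of 9.611649 seconds after chdir, which accounts for almost all of the 9.645 sec measured by time.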

Maybe this is a problem for local connections only. Let’s try with the listener:

testbox1$ time sqlplus -s user1/user1@log2perf << EOF
select * from dual;
> exit;
> EOF

D
-
X

real    0m14.340s
user    0m2.300s
sys     0m0.023s

This time it was even worse.

Same tests with HugePages:

testbox1$ time sqlplus -s / as sysdba << EOF
select * from dual;
> exit;
> EOF

D
-
X

real    0m0.437s
user    0m0.014s
sys     0m0.001s

testbox1$ time sqlplus -s user1/user1@log2perf << EOF
select * from dual;
> exit;
> EOF

D
-
X

real    0m2.547s
user    0m2.247s
sys     0m0.016s

With HugePages the connection time is much shorter (about 5.6 to 22 times faster in the runs above). So the first pitfall of a non-HugePages configuration with a big SGA is connection time, which is much longer than with HugePages.
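
The ratios can be recomputed from the "real" lines printed by time. This is a small sketch; to_seconds and speedup are names invented for this example, and it assumes the NmS.SSSs format shown in the outputs above.

```shell
# Convert a bash "time" value such as 0m9.645s to seconds
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ printf "%.3f\n", $1 * 60 + $2 }'
}

# Print how many times faster the second timing is than the first
speedup() {
    awk -v a="$(to_seconds "$1")" -v b="$(to_seconds "$2")" \
        'BEGIN { printf "%.1f\n", a / b }'
}

speedup 0m9.645s 0m0.437s    # local connection, without vs with HugePages
speedup 0m14.340s 0m2.547s   # listener connection, without vs with HugePages
```

With the timings from this post the two calls print 22.1 and 5.6.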

For more information about connection time see update below.

I decided to go further and check DB performance after the session has been established. To do so I used Kevin Closson's SLOB kit to test the number of logical reads per second, but I had to make some modifications to this great tool.
The first modification was related to the non-HugePages environment. SLOB takes AWR snapshots before and after running the workload and finishes by generating an AWR report. As connection time is an issue in the non-HugePages environment, I had to be sure that both AWR snapshots are taken from one session. I added one more script, which is released by the semaphore like the other workers and takes two AWR snapshots with a 10 second gap between them. To extend the workload I increased the number of SQL executions in the reader loop from 5000 to 25000. This increased the average run time from 4 to 18-20 seconds and allowed my session to take both AWR snapshots.

New stats.sql script

set serveroutput off

HOST ./mywait
exec dbms_lock.sleep(2);
exec dbms_workload_repository.create_snapshot
exec dbms_lock.sleep(10);
exec dbms_workload_repository.create_snapshot
exit

runit.sh

#!/bin/bash

if [ -z "$2" ]
then

        echo "${0}: Usage : ${0} <number of writers> <number of readers>"
        exit
else

        WU=$1
        RU=$2
fi

./create_sem > /dev/null 2>&1

cnt=1
until [ $cnt -gt $RU ]
do
        ( sqlplus -s user${cnt}/user${cnt} @reader > /dev/null 2>&1 ) &
        (( cnt = $cnt + 1 ))
done

until [ $cnt -gt $(( WU + RU )) ]
do
        ( sqlplus -s user${cnt}/user${cnt} @writer > /dev/null 2>&1 ) &
        (( cnt = $cnt + 1 ))
done

        ( sqlplus -L / as sysdba @stats.sql > /dev/null 2>&1 ) &
echo "start sleeping"
# sleep longer to allow all sessions to connect
sleep 120 
#sleep 20
# comment old awr_snap
#sqlplus -L '/as sysdba'  @awr/awr_snap > /dev/null
echo "running slob"

B=$SECONDS
./trigger > /dev/null 2>&1

wait

(( TM =  $SECONDS - $B ))

echo "Tm $TM"

# comment old awr_snap
#sqlplus -L '/as sysdba'  @awr/awr_snap > /dev/null
echo "running report"
sqlplus -L '/as sysdba'  @awr/create_awr_report > /dev/null

After the tests with the default SLOB tables I decided to increase the table size and in effect force Oracle to use more memory for caching data blocks. I made two changes: in setup.sh the loop in the cr_seed() function was increased from 10000 to 200000, and in reader.sql I extended the random range from 10000 to 200000.

setup.sh

function cr_seed () {

sqlplus -s user1/user1 <<EOF
set echo on

CREATE TABLE seed
(
custid number(8),
c2 varchar2(128),
c3 varchar2(128),
c4 varchar2(128),
c5 varchar2(128),
c6 varchar2(128),
c7 varchar2(128),
c8 varchar2(128),
c9 varchar2(128),
c10 varchar2(128),
c11 varchar2(128),
c12 varchar2(128),
c13 varchar2(128),
c14 varchar2(128),
c15 varchar2(128),
c16 varchar2(128),
c17 varchar2(128),
c18 varchar2(128),
c19 varchar2(128),
c20 varchar2(128)
) PARALLEL PCTFREE 0 tablespace $TABLESPACE;

DECLARE
x      NUMBER :=1;
fluff  varchar2(128) := 'XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX';

BEGIN
FOR i IN 1..200000 LOOP
        insert into seed values (x,fluff, NULL, NULL, NULL, NULL,
        NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, fluff);
        x := x + 1;

END LOOP;
COMMIT;
END;
/
exit;

EOF
}

reader.sql

set serveroutput off

HOST ./mywait

DECLARE
x NUMBER := 0;
v_r PLS_INTEGER;

BEGIN
dbms_random.initialize(UID * 7777);

FOR i IN 1..25000 LOOP
        v_r := dbms_random.value(257, 10000) ;
        SELECT COUNT(c2) into x FROM cf1 where custid >  v_r - 256 AND  custid < v_r;
END LOOP;

END;
/
exit

After all this work I ran my tests. To get better results I ran every test 40 times. You can see the results in the table and graphs below.

Configuration	Avg (LIO/s)	Median (LIO/s)	Std var
NonHuge Page small table	8,865,085.94	8,892,305.40	185,926.15
NonHuge Page big table	8,276,726.18	8,298,340.70	175,300.08
Huge Page small table	9,398,380.94	9,377,956.80	263,674.90
Huge Page big table	8,575,646.65	8,597,997.20	180,219.52

LIO / s for 40 tests runs - linear

LIO / s for 40 tests runs - radar view

For small tables Oracle performs 6 % more logical I/O per second in the HugePages configuration than in the non-HugePages configuration. For big tables the gain drops to 3.6 %, but in both cases the HugePages configuration is better.
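
These percentages come straight from the Avg column of the table. A small sketch of the calculation, with pct_gain as a name invented for this example:

```shell
# Print by how many percent the second value exceeds the first.
# Thousands separators (commas) are stripped before the calculation.
pct_gain() {
    awk -v base="$1" -v new="$2" 'BEGIN {
        gsub(/,/, "", base)
        gsub(/,/, "", new)
        printf "%.1f\n", (new - base) / base * 100
    }'
}

pct_gain 8,865,085.94 9,398,380.94   # small table: non-HugePages vs HugePages
pct_gain 8,276,726.18 8,575,646.65   # big table: non-HugePages vs HugePages
```

The two calls print 6.0 and 3.6. Both gains are only a few times larger than the spread reported in the last column, so they should be read as indicative rather than exact.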

Finally, I have to point out that I didn't measure session private memory utilization (heap and PGA) during this test, but it has to be taken into consideration if you are going to implement HugePages on your system.

regards,
Marcin

Update:

After a morning Twitter conversation with Yury Velikanov @yvelikanov I learned that he didn't have connection problems with a non-HugePages configuration. I used pstack during connection time and here is the output:

#0  0x000000000498a2cd in ksmprepage_granule ()
#1  0x0000000004990aca in ksmgapply_validgranules ()
#2  0x000000000498a01c in ksmprepage ()
#3  0x0000000000aff997 in ksmlsge_phasetwo ()
#4  0x0000000000aff1cf in ksmlsge ()
#5  0x0000000000aff1ab in ksmlsg ()
#6  0x00000000017b547b in opiino ()
#7  0x0000000009006cba in opiodr ()
#8  0x00000000017ac9ec in opidrv ()
#9  0x0000000001e61c93 in sou2o ()
#10 0x0000000000a07a65 in opimai_real ()
#11 0x0000000001e6713c in ssthrdmain ()
#12 0x0000000000a079d1 in main ()

KSMPREPAGE gave me the idea that the long connection time is related to pre_page_sga and how Oracle deals with it. Connection time is proportional to the number of pages to check, and of course with a non-HugePages configuration the number of pages to check in the loop is bigger. Why in a loop? You can find this in one of the notes on MOS.

When I removed the pre_page_sga parameter, connection time for the non-HugePages and HugePages configurations was similar. Now it is time to run the tests and compare the number of LIO/s with and without pre_page_sga. I will update this post soon.

Update 2 - 25 Apr 2012
I have tested the number of LIO/s with pre_page_sga set to true and false for the non-HugePages and HugePages environments.

Configuration	Avg (LIO/s)	Median (LIO/s)	Std var
NonHuge Page Pre_page = false	8,072,479.21	8,096,358.30	180,844.09
NonHuge Page Pre_page = true	8,190,746.98	8,212,969.55	141,426.52
Huge Page Pre_page = false	8,105,724.40	8,105,408.50	165,420.39
Huge Page Pre_page = true	8,735,329.93	8,744,138.20	187,104.66

There is no big difference between pre_page_sga set to true or false in the non-HugePages environment: around 1.5 %. For HugePages the difference is bigger, around 7.7 % in my test. So after all checks I think that HugePages plus pre_page_sga is the winner, but remember to double check that you don't have a problem with connection time when the SGA is pre-allocated.