For quite a time I have been a happy user of SPM(SQL Plan Management) to resolve SQL performance issues by using a lazy man’s approach and Oracle capabilities:
in the latest versions Oracle will generate many SQL plans in it’s search of a better SQL plan. All you have to do is to select the one which makes sense and load it into SPM.
Until one day a valid and enabled SPM would stop to enforce the chosen plan. dbms_xplan.display_cursor would show “Sql Plan Directive used for this statement”
Deleting the Sql Plan Directive on the objects concerned has solved the issue.
Apparently, SQL Directives take precedence over SPM.
Here are some answers I’ve got from Oracle:
Still, IT guy has always to ask one simple question: what is the precedence.
In Oracle CBO (Cots Based Optimizer) we have many factors:
Statistics (don’t forget System Statistics)
For quite a time the precedence of factors influencing CBO was rather self-evident and had no need to be documented explicitly. Not any longer.
Time to throw a few notes about Oracle database on Amazon RDS.
- It works
- Issues getting access to data files: file system access is generally forbidden on RDS for reasons understandable. However, imagine you ran into RDS limits. For example, you need a one-off patch (which RDS would not allow you to install). You need to escape RDS quickly into a normal Oracle database. You can exp/edpdp/imp/impdp/GoldenGate/Amazon Migration Service/SQL interface. But SQL interface is sloooow. Here you are – caught by RDS for good.
- Access to Oracle redo log files. Take a look here: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_LogAccess.Concepts.Oracle.html
Amazon document writers do not even get that Oracle redo logs are light yeas away from Oracle trace and audit logs.
- Issues setting hidden parameters (I failed)
- Can’t change some parameters like service name. I failed to add service name. Want to rename database – it’s copy&paste (i.e. back database and restore as a different thing – do this on a few terabytes to get taste of it). Want to add service name for application purposes (resource management, end-to-end user tracing) – no no.
- Tablespace growth control. So far I recommend using bigfile tablespaces. Then you can execute “alter tablespace … autoextend on/off maxsize …” If you use small file tablespace then “alter database” raises privileges error immediately. Btw, Amazon default Oracle RDS is keen to let you grow (and pay) – it has the maximum limit on bigfile tablespace, SYSTEM tablespace including. Let it be with payments but letting database grow out of control will end badly for your database (and you personally).
- One-off patches are no no (as was mentioned already). I do not know a good RDS escape strategy. RDS is a trap. I personally would like to hear if Amazon allowed a single RDS customer to get hands on RDS datafiles and migrate for good whererver he or she wished – that would be nice of Amazon.
- Database Flashback – I failed to get it running. I assume it can’t.
- Some performance questions linger regarding High Availability and Disaster Recovery implementation – the Multi Availability Zones option. It’s an open secret what technology Amazon is using (not Data Guard, which is stated openly). Then it’s File/Storage block replication vs redo log replication. I do assume that redo log [vectors] replication does require way less bandwidth and has a better latency.
- MultiAZ RDS failover happens within minutes, like 5 minutes.
- RMAN backups. What RMAN backups ? RMAN restore of a particular failed block, data file, tablespace, ctonrol file, log file – no no. Amazon RDS restore granularity is the whole database or nothing.
- Amazon RDS point in time restore is of 5 minutes granularity, mine and other people tests seem to confirm this number
- Can you freeze RDS database backup so that it does not expire at the end of backup retention period ? Seems like not: you can freeze database snapshot, true. But I did not find how to freeze backups of database archive logs. Here is what Amazon says: http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_PIT.html
“You can restore to any point in time during your backup retention period”
If this is truly so then it is a serious limitation for some testing scenarious.
- Take a look into v$datafile. I get it that Amazon attaches a single EBS volume for your datafiles. A single EBS volume has it’s limits of IOPS. On Amazon you combine many EBS volumes to get IOPS you need. Throughput (bandwidth in Amazon terms as EBS is attached to network interfaces) is another concern – it’s limited to Amazon instance network card, naturally. Forget about gigabytes per second. About 700MB/sec EBS throughput is maximum what I was told regarding normal Oracle installation on one of high end Amazon instances. RDS maximum is even less.
- Upgrades. I have upgraded 188.8.131.52 SE1 into 184.108.40.206 SE2. It took a staggering 4 and a half hours. Note, that was a skeleton test database. Amazon did a few backups underway – even incremental backups do take time. Then it went to install “one off pacthes and performance improvements” – it kept improving the performance for exactly 3.5 hours. The instance was a rather small one, db.m3.large. Perhaps bigger instance would do better but that remains to be tested.
- Now, the paranoid part. If anything happens with the database then you have no access to the RDS host operating system. You are at Amazon mercy.
ok, enough for the Bad and Uglu. What about the Good. Provisioning is reasonably fast. Not light fast but still fast. Backups and restores do work. Failover does work. Upgrade – the one I have done (see above) – was successful. Some basic monitoring is included. Overall, Oracle RDS is a very reasonable offer for application that can get away with what was said above. 24×7 aspect is the most troublesome thing for me: as I has already said, if anything non-trivial happens with your database then RDS is a black box.
select sys.dbms_metadata.get_ddl('USER', 'USER1') from dual;
to get the new fancier “identified by values” syntax – remember, Oracle has changed and made stronger database user passwords. The old “identified by values” won’t work anymore.
if lgwr latency is more important than throughput then as of 220.127.116.11(and 3) consider the following:
– apparently log_buffer size governs group commit batch size. A bigger log buffer means bigger write batch size which means longer session waits on commit. See lgwr warnings of over 500ms log writes under heavier throughput: the bigger log buffer size – the bigger warning size is reported
– 11g lgwr polling might increase log file sync wait. Use parameter _use_adaptive_log_file_sync to disable polling
– Consider binding processes lgwr and nssN (this is a data guard process) to dedicated cpu’s
Regarding unaccounted time in v$sql, when elapsed_time is much bigger than cpu_time + user_io_wait_time + concurrency_wait_time + etc.
One reason: Log writer waits are not included into user_io_wait_time – it is truly a USER io time.
Test it: Stop log writer: kill -s SIGSTOP <lgwr_pid>. See how elapsed_time increases but other times stay unchanged.
Another credible reason: OS is losing time on paging (swapping)
x$kcvfh is a good one:
How to quickly check that Database is consistent after incomplete recovery (Point in Time Recovery) before OPEN RESETLOGS (Doc ID 1354256.1)
select hxfil file#, substr(hxfnm, 1, 50) name, fhscn checkpoint_change#, fhafs Absolute_Fuzzy_SCN, max(fhafs) over () Min_PIT_SCN from x$kcvfh where fhafs!=0 ;
Disable Diag and Tuning pack (oracle 10g and 11g)
– Baselines must not be used:
for m in (select * from dba_hist_baseline where baseline_name != ‘SYSTEM_MOVING_WINDOW’) loop
– Manual Snapshots must not be used:
for m in (select * from dba_hist_snapshot where snap_flag0) loop
– awr disable
begin sys.dbms_awr.disable_awr(); end;
– Disable Advisors
– 11g: tuning schedulers disabled. That should be optional as far as licenses are concerned but there are many reasons to get rid of them.
– OEM (Cloud Control, formerly Grid Control, formerly OEM): disable Packs in Grid Control interface, menu on the top-right corner.
Next: mapping between licensed features and dba_feature_usage_statistics is known – see metalink id: Database Options/Management Packs Usage Reporting for Oracle Database 11g Release 2 (Doc ID 1317265.1)
Then one can do this:
1) map features to dba_feature_usage_statistics and sys.wri$_dbu_feature_metadata
2) sys.wri$_dbu_feature_metadata describes how oracle checks for feature usage
3) the feature is disabled based on analysis of sys.wri$_dbu_feature_metadata
4) Mos article “MMON Suspended Due to ORA-12751 “cpu time or run time policy violation” (Doc ID 761298.1)” is used to execute oracle FUS immediately:
alter session set events ‘immediate trace name mmon_test level 6’;
5) repeat from step 2 untill dba_feature_usage_statistics is satisfactory