The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.
<HTML>
<HEAD>
<TITLE>Karma - Services Help</TITLE>
<LINK REV="made" HREF="mailto:root@porky.devel.redhat.com">
</HEAD>

<BODY>

<!-- INDEX BEGIN -->

<UL>

	<LI><A HREF="#NAME">NAME</A>
	<LI><A HREF="#DESCRIPTION">DESCRIPTION</A>
	<LI><A HREF="#Alert_Log_Errors">Alert Log Errors</A>
	<LI><A HREF="#Initialization_Parameters">Initialization Parameters</A>
	<LI><A HREF="#Extents">Extents</A>
	<LI><A HREF="#Fragmentation">Fragmentation</A>
	<LI><A HREF="#Hit_Ratios">Hit Ratios</A>
	<LI><A HREF="#Latch_Contention">Latch Contention</A>
	<LI><A HREF="#MTS_Multi_Threaded_Server">MTS - Multi-Threaded Server</A>
	<LI><A HREF="#OS_Statistics">OS Statistics</A>
	<LI><A HREF="#Redologs">Redologs</A>
	<LI><A HREF="#Deferred_Transaction_Error_Queue">Deferred Transaction Error Queue</A>
	<LI><A HREF="#Deferred_Transaction_Queue">Deferred Transaction Queue</A>
	<LI><A HREF="#Rollback_Segment_Contention">Rollback Segment Contention</A>
	<LI><A HREF="#Slow_SQL">Slow SQL</A>
	<LI><A HREF="#Tablespace_Quotas">Tablespace Quotas</A>
	<LI><A HREF="#UP_Status">UP Status</A>
</UL>
<!-- INDEX END -->

<HR>
<P>
<H1><A NAME="NAME">NAME</A></H1>
<P>
Karma - Services Help

<P>
<HR>
<H1><A NAME="DESCRIPTION">DESCRIPTION</A></H1>
<P>
This document details each of the services monitored by karma, and how to
configure monitoring for them.

<body bgcolor="#3399cc">

<P>
<HR>
<H1><A NAME="Alert_Log_Errors">Alert Log Errors</A></H1>
<P>
The Oracle alert.log facility is similar to that of the unix syslog
facility. It is used to report various system messages like startup and
shutdown, as well as checkpoints, redo-log switches, and most importantly
ORA-xxxx errors.

<P>
Monitoring ORA-xxxx errors is an important part of the DBA's
responsibility, and Karma aims to ease that burden by watching the
alert.log for you.

<P>
Karma monitors databases remotely, and as such cannot directly access an OS
level file such as the alert.log. The solution (if you're interested in
monitoring the alert.log of remote databases) is the run an additional
script which comes with Karma on any machine whose alert.log you wish to
monitor. Essentially it watches the file for changes, and writes any
ORA-xxxx errors to a table. Karma then just watches this table for new
entries. Checkout the karmagentd for more information on configuring that
end of things.

<P>
Unlike other services, the alertlog facility actually ignores the warning
and alert thresholds right now. The way it works currently is that if there
are any ORA-7445, or ORA-600's an ALERT will be flagged, and for any other
errors a WARNING is flagged. Only errors logged in the last day (by
karmagentd which may differ from the time the db logged the error) are
considered by this report. In addition, if you have your config file set to
do alert.log monitoring, but the karmagentd daemon isn't running, an ALERT
is also flagged. If you haven't created the karma_alertlog_errors table, it
is set to NO_STATUS.

<P>
<HR>
<H1><A NAME="Initialization_Parameters">Initialization Parameters</A></H1>
<P>
This section displays the Oracle initialization parameters from the
v$parameter data dictionary view. This section is not configurable, and is
always displayed.

<P>
<CODE>SELECT   name, value
  FROM     v$parameter
  ORDER BY name</CODE>



<P>
<HR>
<H1><A NAME="Extents">Extents</A></H1>
<P>
Karma monitors extents of non-sys objects in your database. Basically if
any object is within your warning or alert threshold of maxextents for that
object, a WARNING or ALERT is flagged. If you have maxextents set to
unlimited, you won't encounter a warning or alert status. If there are many
objects in your database which are nearing their maxextents, karma displays
the first 100 objects only.

<P>
<HR>
<H1><A NAME="Fragmentation">Fragmentation</A></H1>
<P>
Fragmentation occurs at the table (heap) or index (b-tree) level.
Essentially when you create objects in a tablespace, if you set them all
with different storage parameters, or a pctincrease which is non-zero
you'll likely cause tablespace fragmentation of the objects contained
therein.

<P>
Fragmentation can be resolved by rearranging objects in other tablespaces,
rebuilding with different storage parameters, or export/import. Ideally
though, it would be best to avoid fragmentation altogether. How can we
accomplish this? Oracle recommends in their latest whitepaper on the
subject ``How To Stop Defragmenting and Start Living'' to avoid
fragmentation altogether by creating tablespaces with with uniform extent
sizes, and leaving objects to assume the default storage params when
they're created. For more information, check: 

<a href="http://support.oracle.com/cgi-bin/cr/getfile_cr.cgi?239049">How To Stop Defragmentating and Start Living.</a>

<P>
If you're running Oracle 8i, should seriously consider using locally
managed tablespaces. These new types of tablespaces use an allocation
bitmap to keep track of equally sized extents which you set at tablespace
creation time. For more information, see the following:

<a href="http://pillango.sirma.bg/oracle/N053_spaceman.pdf">
Getting the Best Out of Oracle8i Space Management,by Dave Ensor</a>

<P>
<HR>
<H1><A NAME="Hit_Ratios">Hit Ratios</A></H1>
<P>
Hitratios are a simple way to get a big picture of how well your database
is performing. Essentially a hitratio gives you a ratio with which to
quickly judge how many I/O requests are being satisfied via memory vs I/O
requests which actually require disk I/O.

<P>
We monitor data block buffer hitratio only, for now. The query looks like
the following:

<P>
<CODE>SELECT name, value</CODE>



<P>
<CODE>FROM   v$sysstat</CODE>



<P>
<CODE>WHERE  name IN ('consistent gets', 'db block gets', 'physical reads')</CODE>



<P>
Then calculate like this:

<P>
<CODE>hitratio = logical reads - physical reads / logical reads</CODE>



<P>
<CODE>logical reads = consistent gets + db block gets</CODE>



<P>
There are a number of things to look at when tuning your buffer cache
hitratio. Generally you want to shoot for upwards of 95% in an OLTP
database. Increasing the size of your db_block_buffers parameter. In
addition, tuning slow sql queries can often improve your buffer cache
hitratio quite dramatically.

<P>
<HR>
<H1><A NAME="Latch_Contention">Latch Contention</A></H1>
<P>
Latches are special types of locks that Oracle uses to protect areas of
shared memory. There are latches protecting the redo log buffer, shared
pool, buffer cache, and sort segments. Locks when used in the strict Oracle
sense are used to protect transactions.

<P>
You can find out what's happening with the latches in your database by
issuing the following query:

<P>
<CODE>SELECT   name, gets, misses</CODE>



<P>
<CODE>FROM     v$latch</CODE>



<P>
<CODE>ORDER BY name</CODE>



<P>
Then calculate various ratios with this formula:

<P>
<CODE>ratio = (gets - misses) * 100 / gets</CODE>



<P>
Tuning latch contention can be a complex issue, so I won't go into all the
details here. Suffice to say that application tuning, or tuning queries
that have been identified as doing a lot of disk I/O can often produce
substantial benefit with respect to latch contention.  

<P>
<HR>
<H1><A NAME="MTS_Multi_Threaded_Server">MTS - Multi-Threaded Server</A></H1>
<P>
Multi-threaded server is a facility Oracle provides for installations which
require a very large number of user sessions, typically 500-1000.
Multi-threaded server reduces the memory requirements, and OS load, and is
often appropriate for database-backed websites.

<P>
As with every facility, in order for it to run properly, it needs to be
monitored to ensure no contention for it's resources. With MTS the database
preallocates shared server and dispatcher processes. If you don't have
enough shared servers or dispatcher processes started, Oracle can
dynamically spawn more up to a given threshold controlled by
mts_max_servers and mts_max_dispatchers. Generally, however, if you're
dynamically spawning these processes, and then they are dying, you'll have
performance problems. You want enough of them prespawned to handle all but
peak loads. If you're experiencing contention, you'll want to prespawn more
processes with mts_dispatchers, or mts_servers. See your Oracle
documentation for configuration details.

<P>
<HR>
<H1><A NAME="OS_Statistics">OS Statistics</A></H1>
<P>
Karma provides limited ability to monitor operating system level statistics
similar to the way it allows monitoring of the alert.log. The karmaagentd
script also checks via ``uptime'' the load average and percent idle. As
with the alert.log info, this data populates a table which karmad then
monitors for changes. Checkout the karmagentd for more information on
configuring that end of things.

<P>
For now only load averages are monitored, however additional OS level
statistics may be added.

<P>
<HR>
<H1><A NAME="Redologs">Redologs</A></H1>
<P>
Redologs are where Oracle writes all transactions to, in addition to
writing to a block of memory, which eventually makes its way to datafiles
on disk. Redologs capture INSERT, UPDATE, and DELETE activity, and provide
security in case the database or machine which it runs on crashes. The are
crucial to point in time recovery. Generally we don't want to be switching
redo-logs too quickly lest we degrade the databases performance. 

<P>
You can view how often the redolog files are switching with the following
query:

<P>
<CODE>SELECT   TO_CHAR(first_time, 'J'), TO_CHAR (first_time, 'SSSSS'),</CODE>



<P>
<CODE>group#, sequence#, TO_CHAR (first_time, 'DD/MM/YYYY HH24:MI')</CODE>



<P>
<CODE>FROM     v$log</CODE>



<P>
<CODE>ORDER BY 1,2</CODE>



<P>
Generally it is recommended that these files not switch more than once
every half hour during peak database activity. If the redolog files are
switching too often, it's probably a good idea to make the files bigger.
For large databases, 100M+ redolog files are not uncommon.

<P>
You can change the size of the redolog files while the database is running.
Suppose you currently have four logfile groups (1-4) which are 25M and you
want to make them 50M. You would do something like this:

<P>
<CODE>1. ALTER DATABASE ADD LOGFILE GROUP 5 ('log5a', 'log5b') SIZE 50M</CODE>



<P>
<CODE>ALTER DATABASE ADD LOGFILE GROUP 6 ('log6a', 'log6b') SIZE 50M</CODE>



<P>
<CODE>ALTER DATABASE ADD LOGFILE GROUP 7 ('log7a', 'log7b') SIZE 50M</CODE>



<P>
<CODE>ALTER DATABASE ADD LOGFILE GROUP 8 ('log8a', 'log8b') SIZE 50M</CODE>



<P>
<CODE>2. ALTER SYSTEM SWITCH LOGFILE</CODE>



<P>
<CODE>(do this step until you are using the new logfiles)</CODE>



<P>
<CODE>3. ALTER DATABASE DROP LOGFILE GROUP 1</CODE>



<P>
<CODE>ALTER DATABASE DROP LOGFILE GROUP 2</CODE>



<P>
<CODE>ALTER DATABASE DROP LOGFILE GROUP 3</CODE>



<P>
<CODE>ALTER DATABASE DROP LOGFILE GROUP 4</CODE>



<P>
During step 2, you'll have to verify that your system is now using the new
logfiles. You'll be able to do that with the following query:

<P>
<CODE>SELECT group#, status FROM v$log</CODE>



<P>
When the redologs in groups 5-8 are in use or 'CURRENT' you'll be allowed
to drop the old logfiles. Oracle won't allow you to drop a logfile that is
in use.

<P>
<HR>
<H1><A NAME="Deferred_Transaction_Error_Queue">Deferred Transaction Error Queue</A></H1>
<P>
The Oracle deferror queue contains transactions that have failed to
replicate for various reasons.

<P>
Monitoring the deferror queue is crucial to maintaining the health of a
replicated environment. Karma monitors the number of transactions which
have failed with errors. If it gets too high a warning or alert is flagged.

<P>
<CODE>SELECT   deferred_tran_id, origin_tran_db, destination,</CODE>



<P>
<CODE>to_char(start_time, 'HH24:MI:SS') , error_number</CODE>



<P>
<CODE>FROM     deferror</CODE>



<P>
<CODE>WHERE    start_time &amp;gt; sysdate - 1</CODE>



<P>
<CODE>ORDER BY start_time</CODE>



<P>
<HR>
<H1><A NAME="Deferred_Transaction_Queue">Deferred Transaction Queue</A></H1>
<P>
The Oracle deftran queue contains transactions bound for remote databases.  

<P>
Monitoring the deftran queue is crucial to maintaining the health of a
replicated environment. Karma monitors the number of transactions pending
in this queue. If it gets too high a warning or alert is flagged.

<P>
For further information, take a look at the deftran queue with this query:

<P>
<CODE>SELECT   t.deferred_tran_id, t.delivery_order, </CODE>



<P>
<CODE>to_char(t.start_time, 'DD/MM/YYYY HH24:MI:SS') </CODE>



<P>
<CODE>FROM     deftrandest d, deftran t</CODE>



<P>
<CODE>WHERE    d.deferred_tran_id = t.deferred_tran_id</CODE>



<P>
<CODE>AND      d.delivery_order = t.delivery_order</CODE>



<P>
<CODE>ORDER BY t.start_time</CODE>



<P>
<HR>
<H1><A NAME="Rollback_Segment_Contention">Rollback Segment Contention</A></H1>
<P>
Rollback segment activity is an important facility to monitor in your
database to maintain reliable performance. Whenever a transaction modifies
a block of data in your database, rollback segments provide a
read-consistent view to the other sessions in the database, giving the
picture of the data before any changes began. Additionally, as with
redologs, rollback segments are important for database recovery.

<P>
<CODE>SELECT   a.name, b.status, b.gets, b.waits</CODE>



<P>
<CODE>FROM     v$rollname a, v$rollstat b</CODE>



<P>
<CODE>WHERE    a.usn = b.usn</CODE>



<P>
<HR>
<H1><A NAME="Slow_SQL">Slow SQL</A></H1>
<P>
Slow SQL queries can be one of the most frustrating and performance
degrading aspects of database administration. What makes it particularly
frustrating is if you have developers on your production box. :-)  

<P>
Bad queries manage to find their way into every database. Karma provides a
method to be a little more proactive about monitoring this activity, and
letting you know hopefully before they become a problem. Karma, though, can
only help identify those queries that are problems, it can't optimize them.  

<P>
Optimizing queries can mean anything from analyzing related tables and
indexes in a schema, providing hints to suggest a better execution plan,
creating indexes to provide Oracle with a faster way to the data, or
actually rearranging the query so that perhaps it enables an index that it
previously disabled. Application tuning, by redefining the problem in a way
that makes it simpler to solve, can also often be a very viable and
attractive way to improve some particularly bad queries. For more
information on all aspects of SQL query tuning see:

<A HREF="http://www.amazon.com/exec/obidos/ASIN/0136142311/o/qid=935998933/sr=8-1/002-5520935-5714251">Guy Harrison's book "Oracle SQL High Performance Tuning" - ISBN 0136142311</A>

<P>
Please test the slow sql funcationality before running it on your
production database and limit how often you run it, as it can itself be a
``slow sql'' query.

<P>
The following query should help you get started:

<P>
<CODE>SELECT   disk_reads, executions, sql_text</CODE>



<P>
<CODE>FROM     v$sqlarea</CODE>



<P>
If diskreads are excessively high relative to a low number of executions,
the query will likely need tuning. Executions represent the number of times
the same query was reused.

<P>
<HR>
<H1><A NAME="Tablespace_Quotas">Tablespace Quotas</A></H1>
<P>
karma allows tablespaces to be monitored like you monitor disk capacity
with ``df'' in Unix. This is above and beyond the extents and fragmentation
which you can monitor separately.

<P>
Tablespaces with any datafiles set to AUTOEXTEND are ignored in this
analysis. Use this query to see where your tablespaces are currently at:

<P>
<CODE>SELECT  a.tablespace_name, a.total, b.used</CODE>



<P>
<CODE>FROM    (SELECT  tablespace_name, SUM (bytes) total</CODE>



<P>
<CODE>FROM    dba_data_files</CODE>



<P>
<CODE>GROUP BY tablespace_name) a,</CODE>



<P>
<CODE>(SELECT  tablespace_name, SUM (bytes) used</CODE>

<CODE>FROM    dba_segments</CODE>



<P>
<CODE>GROUP BY tablespace_name) b</CODE>



<P>
<CODE>WHERE   a.tablespace_name = b.tablespace_name (+)</CODE>



<P>
<HR>
<H1><A NAME="UP_Status">UP Status</A></H1>
<P>
This section merely monitors that the database is up and reachable. In
addition you can view performance statistics from v$sysstat, and other
miscellaneous database information. This section is always enabled, and
cannot be disabled.

</BODY>

</HTML>