
OBIEE Administration Tool – Import Metadata shows no schemas


Importing Metadata with the Administration Tool

The client-only install of the OBIEE 11g Administration Tool ships with a set of OCI libraries. This means it can support basic Oracle Database interaction without the need for a full Oracle Client installation on the machine. For example, you can update row counts in the Physical layer of the RPD for tables held in an Oracle database.

Unfortunately, the supplied OCI libraries are not complete, which leads to a rather tricky problem to diagnose. When you use the Import Metadata operation (either from the File menu, or context menu on an existing Connection Pool), the step (“Select Metadata objects”) which ought to show the schemas just shows a stub, and no schemas.

No schemas shown in Select Metadata Objects / Import Metadata

No error is shown by the Administration Tool, giving the erroneous impression that there just aren’t any schemas in the database.

Missing OCI library

The Administration Tool writes a log, which by default is in the following rather long path: C:\Program Files\Oracle Business Intelligence Enterprise Edition Plus Client\oraclebi\orainst\diagnostics\logs\OracleBIServerComponent\coreapplication\Administrator_NQSAdminTool.log

If you examine the log, you’ll see this error:

[2011-12-16T15:10:12.000+00:00] [OracleBIServerComponent] [ERROR:1] [] [] [ecid: ] [tid: 8b4]  [nQSError: 93001] Can not load library, oracore11.dll, due to, The specified module could not be found. [[
.
The specified module could not be found.

]]

The key bit in this is “Can not load library, oracore11.dll“. This library is missing, and the Administration Tool depends on it. The library isn’t provided by InstantClient, so you must install the full Oracle Client.

 

Installing the Oracle Client

Download the Oracle Client from the Oracle website (the exact download location may change over time). In this instance I downloaded “Oracle Database 11g Release 2 Client (11.2.0.1.0) for Microsoft Windows (32-bit)” (all 600MB+ of it).

 

Unzip the installer and run setup.exe. If you want as minimal an installation as possible, then select the custom installation, and choose just the “Oracle Call Interface (OCI)” option.

Oracle Client installer - OCI libraries

Once you’ve installed the Full Client, restart the AdminTool and the Import Metadata function will now work.

 

Footnote – tnsnames.ora

Don’t forget that if you don’t install the full Oracle Client and use the OCI functionality provided by the OBIEE installation alone, you will need to configure your tnsnames.ora file in C:\Program Files\Oracle Business Intelligence Enterprise Edition Plus Client\oraclebi\orahome\network\admin\tnsnames.ora. The exception is if you are using Easy Connect DSNs (dbserver:port/sid) in your RPD rather than TNS entries (orcl etc.).
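For reference, a minimal tnsnames.ora entry has the following shape – the alias, hostname and service name here are placeholders, so substitute your own values:

ORCL =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = dbserver.company.com)(PORT = 1521))
    (CONNECT_DATA =
      (SERVER = DEDICATED)
      (SERVICE_NAME = orcl)
    )
  )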

Footnote – troubleshooting library issues

Microsoft’s SysInternals suite includes the program ProcMon. You can point this at a running process and see what it’s up to in terms of file access, DLLs, and networking. It is great for detecting things like:

  • Which files a process writes to (eg where is a user preference stored)
  • Check which PATHs are being searched for an executable / library
  • Which tnsnames.ora is being picked up
  • What network connections are being made, or failing
  • Registry key access

When you run ProcMon you’ll realise how much is going on in the background of your Windows machine – there’ll be screenful upon screenful of output. To focus on the target of your analysis, use the Include Process option to just show AdminTool.exe:

Include AdminTool in procmon traces

You can then see things like it searching for the oracore11.dll which it is missing:

oracore11.dll missing

The next entry shows the log file being updated, giving you the path if you didn’t know it already:

AdminTool log file


Downloading OBIEE patches from Oracle with wget


Last week saw the release of OBI 11.1.1.6.2 BP1, and with it, some eight patches to download (seven OBIEE, plus one JDeveloper). A very useful option when downloading the patches, particularly if you are working on Linux servers with no GUI, is to download the patches with wget. This can also be good if your company has download policies that necessitate a third party downloading any external files.

The option for downloading via wget from Oracle Support is not immediately obvious, so here is how to do it, using the example of the patchset for OBI 11.1.1.6.2 BP1:

First, you need to get a list of all the patches you want to download. Log into My Oracle Support (Flash version or HTML version), and click on Patches & Updates


Click on Product or Family (Advanced) and enter the following criteria to find the patches for OBIEE 11.1.1.6.2 BP1: 

  • Product: Oracle Business Intelligence
  • Release: OBI EE 11.1.1.6.2BP1
  • Platform: Linux x86-64

Also make sure you tick Include all products in a family.


Click on Search, and you should get a list of results including seven for the patchset we are interested in: 


Click on the first patch in the list, then press Shift and click on the last patch that we want. This will select all seven patches, and display a Download button. Click this button.

In the bottom left of the File Download window, you should see a link for WGET Options


Click on WGET Options and then on the Download .sh button in the following dialogue


Now open the wget.sh file that was downloaded, and locate the lines for SSO_USERNAME and SSO_PASSWORD. Edit these with your My Oracle Support username and password – the username will default to the user that created the wget.sh file.

 
#!/bin/sh

#
# Generated 7/2/12 7:46 AM
# Start of user configurable variables
#
LANG=C
export LANG

# SSO username and password
SSO_USERNAME=larry@oracle.com
SSO_PASSWORD=password

Set the script as executable: 

chmod u+x wget.sh

and then execute it:

./wget.sh

The script will write a corresponding wget.log with the output of the session, and if all is well you should see the patch files written to the current directory. If the files don’t download, check the .log file for the cause.
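As a quick sanity check after the run, something like the following will list the downloaded files and highlight any failures recorded in the log (the exact wording of errors in wget.log may vary):

ls -lh *.zip
grep -iE "error|fail" wget.log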

Footnote

Whilst My Oracle Support only lists seven patches for 11.1.1.6.2 BP1 when you search using the method above, per Mark’s article, you also need the patch for JDeveloper, 13952743.

Applying patches to OBIEE : 11.1.1.6.2 BP1


The recent release of OBIEE 11.1.1.6.2 BP1 weighs in with a hefty eight individual patches to apply. They use the standard Oracle opatch mechanism. Here we see how to apply the patches, and note any “gotchas”. 

Starting point

To install 11.1.1.6.2 BP1, you must be on 11.1.1.6.0 first.

This article is based on an installation on Linux (OL6), but the process should be the same across OSs. It is also based on a single-node OBIEE install. For clustered deployments, you will need to do this on each node – see the patch README for more information.

Obtain the patches

The first step is to download the patches from Oracle Support. See my previous post on how to locate them and optionally download them from the command-line using wget. Don’t forget the JDeveloper patch, 13952743, which doesn’t show up in the standard search for 11.1.1.6.2 BP1.

Unzip the patches

You can unzip them using the commandline statement

unzip \*.zip

This will give you a list of folders, each containing a set of patch files. You should have the following:

drwxrwxr-x  4 oracle oinstall 4096 Jul 10 17:10 13867143
drwxr-xr-x  4 oracle oinstall 4096 Apr 12 07:10 13952743
drwxrwxr-x  4 oracle oinstall 4096 Jun  2 03:25 13960955
drwxr-xr-x  4 oracle oinstall 4096 Jun 14 00:53 14142868
drwxr-xr-x  4 oracle oinstall 4096 Jul 10 17:20 14223977
drwxrwxr-x  4 oracle oinstall 4096 Jun 22 11:06 14226980
drwxrwxr-x  4 oracle oinstall 4096 Mar 29 03:09 14226993
drwxrwxr-x  4 oracle oinstall 4096 Apr 19 07:22 14228505

Read the README

The patches come with a master README file, in 14223977/README.txt. Make sure you read through this and understand it. What I detail below are my notes based on it, but always check the README.

Determine FMW_HOME

All OBI 11g installations follow the same directory structure, but the root of this structure varies depending on where it was installed. The root is known as FMW_HOME (Fusion MiddleWare Home), and is commonly found in locations such as

  • /u01/app/oracle/product/fmw
  • /home/oracle/obiee
  • c:\oracle\middleware

Before continuing, make sure you know where your FMW_HOME is. When you’ve found it, you should see the following structure within it:

-rw-rw----  1 oracle oinstall   225 Jul  6 13:49 domain-registry.xml
drwxr-x---  3 oracle oinstall  4096 Jul  6 13:44 instances
drwxr-x---  2 oracle oinstall  4096 Jul  6 16:41 logs
drwxr-x---  7 oracle oinstall 36864 Jul  6 13:26 modules
-rw-r-----  1 oracle oinstall   623 Jul  6 13:26 ocm.rsp
drwxr-x--- 65 oracle oinstall  4096 Jul 10 20:25 Oracle_BI1
drwxr-x--- 33 oracle oinstall  4096 Jul 10 21:38 oracle_common
-rw-r-----  1 oracle oinstall 86921 Jul  6 13:26 registry.dat
-rw-r-----  1 oracle oinstall  1750 Jul  6 13:26 registry.xml
drwxr-x---  4 oracle oinstall  4096 Jul  6 13:37 user_projects
drwxr-x---  8 oracle oinstall  4096 Jul  6 13:26 utils
drwxr-x---  9 oracle oinstall  4096 Jul  6 13:26 wlserver_10.3

For the rest of this article, I’m assuming you’ve set the environment variable FMW_HOME, for example using export FMW_HOME=/u01/app/oracle/product/fmw (on a bash shell – other shells & OSs will differ)

Put patch files in place

Move your unzipped patch files into the $FMW_HOME/Oracle_BI1 folder. It should look like this:

[oracle@rnm-ol6 fmw]$ ls -l /u01/app/oracle/product/fmw/Oracle_BI1/
total 272
drwxrwxr-x  4 oracle oinstall 4096 Jul 10 17:10 13867143
drwxr-xr-x  4 oracle oinstall 4096 Apr 12 07:10 13952743
drwxrwxr-x  4 oracle oinstall 4096 Jun  2 03:25 13960955
drwxr-xr-x  4 oracle oinstall 4096 Jun 14 00:53 14142868
drwxr-xr-x  4 oracle oinstall 4096 Jul 10 17:20 14223977
drwxrwxr-x  4 oracle oinstall 4096 Jun 22 11:06 14226980
drwxrwxr-x  4 oracle oinstall 4096 Mar 29 03:09 14226993
drwxrwxr-x  4 oracle oinstall 4096 Apr 19 07:22 14228505
drwxr-x---  3 oracle oinstall 4096 Jul  6 12:53 admin
drwxr-x---  3 oracle oinstall 4096 Jul  6 12:53 asoneofftool
drwxr-x---  3 oracle oinstall 4096 Jul  6 13:08 assistants
drwxr-x---  3 oracle oinstall 4096 Jul  6 12:53 atgpf
drwxr-xr-x  3 oracle oinstall 4096 Jul 10 20:25 bicomposer
drwxr-x--- 16 oracle oinstall 4096 Jul  6 13:15 bifoundation
[…]

Check disk space

The patches copy files and take backups of the files they replace, so you will need free disk space – from experience, around 10GB can be necessary.
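A quick way to check the free space on the filesystem holding your installation (substituting your own FMW_HOME):

df -h $FMW_HOME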

Backup, backup, backup

You already take backups of your OBIEE installation, right? Well, now is definitely the time to start if you don’t. Patching is a complex process, and if something goes wrong then you are at the mercy of oPatch and oraInventory.

And don’t forget, untested backups are as good as no backups.

See the Oracle® Fusion Middleware Administrator’s Guide – Introducing Backup and Recovery for more information, and specifically Backup and Recovery Recommendations for Oracle Business Intelligence

Shutdown the OBIEE processes, including WebLogic

Shutdown OPMN, the Weblogic managed server (bi_server1), and the Weblogic Admin Server.
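As a rough guide, on a default single-node Linux installation the shutdown might look like the following – the instance name, domain name and paths are the defaults and may well differ on your system:

$FMW_HOME/instances/instance1/bin/opmnctl stopall
$FMW_HOME/user_projects/domains/bifoundation_domain/bin/stopManagedWebLogic.sh bi_server1
$FMW_HOME/user_projects/domains/bifoundation_domain/bin/stopWebLogic.sh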

Delete catalog manager cache directories

These may or may not exist – if they do, delete them

rm -rv $FMW_HOME/Oracle_BI1/bifoundation/web/catalogmanager/configuration/org.eclipse.osgi
rm -rv $FMW_HOME/Oracle_BI1/bifoundation/web/catalogmanager/configuration/org.eclipse.equinox.app

Set the environment variables

The patch README explains how to do this on Windows and in a *nix C shell; here is how to do it in a Bash shell:

cd $FMW_HOME/Oracle_BI1
export ORACLE_HOME=$PWD
export PATH=$ORACLE_HOME/bin:$PATH
export JAVA_HOME=$ORACLE_HOME/jdk
export PATH=$JAVA_HOME/bin:$PATH
export PATH=$ORACLE_HOME/OPatch:$PATH

Apply the OBI patches

Making sure that you’ve set the environment variables as shown above, apply the seven OBI patches as follows:
14223977 (1 of 7) Oracle Business Intelligence Installer

cd $FMW_HOME/Oracle_BI1/14223977
opatch apply

14226980 (2 of 7) Oracle Real Time Decisions

cd $FMW_HOME/Oracle_BI1/14226980
opatch apply

13960955 (3 of 7) Oracle Business Intelligence Publisher

cd $FMW_HOME/Oracle_BI1/13960955
opatch apply

14226993 (4 of 7) Oracle Business Intelligence ADF Components

cd $FMW_HOME/Oracle_BI1/14226993
opatch apply

14228505 (5 of 7) Enterprise Performance Management Components Installed from BI Installer 11.1.1.6.x

cd $FMW_HOME/Oracle_BI1/14228505
opatch apply

13867143 (6 of 7) Oracle Business Intelligence

cd $FMW_HOME/Oracle_BI1/13867143
opatch apply

14142868 (7 of 7) Oracle Business Intelligence Platform Client Installers and MapViewer

cd $FMW_HOME/Oracle_BI1/14142868
opatch apply

Check the OBI patches have been applied

cd $FMW_HOME/Oracle_BI1
opatch lsinventory|grep applied

You should see the seven patches just applied, plus a bunch of others from installing 11.1.1.6.0

Patch  14142868     : applied on Tue Jul 10 21:31:11 BST 2012
Patch  13867143     : applied on Tue Jul 10 20:58:29 BST 2012
Patch  14228505     : applied on Tue Jul 10 20:33:59 BST 2012
Patch  14226993     : applied on Tue Jul 10 20:25:58 BST 2012
Patch  13960955     : applied on Tue Jul 10 20:19:46 BST 2012
Patch  14226980     : applied on Tue Jul 10 20:02:40 BST 2012
Patch  14223977     : applied on Tue Jul 10 20:00:35 BST 2012
Patch  6845838      : applied on Fri Jul 06 13:33:14 BST 2012
Patch  7393921      : applied on Fri Jul 06 13:32:58 BST 2012
Patch  7707476      : applied on Fri Jul 06 13:32:46 BST 2012
Patch  7663342      : applied on Fri Jul 06 13:32:21 BST 2012
Patch  6599470      : applied on Fri Jul 06 13:32:10 BST 2012
Patch  6750400      : applied on Fri Jul 06 13:31:58 BST 2012
Patch  7427144      : applied on Fri Jul 06 13:31:23 BST 2012

Copy the client tool installers

cd $FMW_HOME/Oracle_BI1/clients/bipublisher/repository/Tools
cp BIPublisherDesktop*.exe $FMW_HOME/user_projects/domains/bifoundation_domain/config/bipublisher/repository/Tools/

Apply the JDeveloper patch

This requires a different ORACLE_HOME to be set, so make sure you don’t skip setting the environment variables here:

cd $FMW_HOME/oracle_common
export ORACLE_HOME=$PWD
export PATH=$ORACLE_HOME/bin:$PATH
export JAVA_HOME=$ORACLE_HOME/jdk
export PATH=$JAVA_HOME/bin:$PATH
export PATH=$ORACLE_HOME/OPatch:$PATH
cd $FMW_HOME/Oracle_BI1/13952743
opatch apply

Check the JDeveloper patch has been applied

cd $FMW_HOME/Oracle_BI1
opatch lsinventory|grep applied

You should see the patch just applied listed:

Patch  13952743     : applied on Tue Jul 10 21:44:19 BST 2012

Startup the OBIEE processes, including WebLogic

Start up the Weblogic Admin Server, Managed Server (bi_server1), and then OPMN.
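Again as a rough guide for a default single-node Linux installation (adjust names and paths to suit your environment; the Admin Server and Managed Server scripts run in the foreground, so background them or use separate sessions as appropriate):

$FMW_HOME/user_projects/domains/bifoundation_domain/bin/startWebLogic.sh
$FMW_HOME/user_projects/domains/bifoundation_domain/bin/startManagedWebLogic.sh bi_server1
$FMW_HOME/instances/instance1/bin/opmnctl startall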

Validate the patching has been successful

If all has gone well, you should now be able to login to OBIEE and from the Administration link see the new version listed:
11.1.1.6.2 (Build 120605.2000 BP1 64-bit)


Client tools

Don’t forget to update your Client Tools (Administration Tool, Catalog Manager, Job Manager). To do this, login to OBIEE and from the Home page go to Get Started… -> Download BI Desktop Tools -> Oracle BI Client Installer. This will download the client installer, which you should then run to install the tools.

The Admin Tool lists its version in C:\Program Files\Oracle Business Intelligence Enterprise Edition Plus Client\oraclebi\orahome\bifoundation\version.txt

Build: 11.1.1.6.0.BIFNDN_11.1.1.6.2BP1_NT_120604.0813
Release Version: Oracle Business Intelligence 11.1.1.6.0
Package: 120604.0136.000

Tidy up patch files

If you want to save disk space, you can delete the patch folders:

cd $FMW_HOME/Oracle_BI1/
rm -rf 13867143
rm -rf 13952743
rm -rf 13960955
rm -rf 14142868
rm -rf 14223977
rm -rf 14226980
rm -rf 14226993
rm -rf 14228505

Browser Cache

In the README for the patches, there is the following note that is worth being aware of:


NB: When the patchset installation is complete and the BI System is running again, end-users might experience unexpected behavior due to pre-existing browser sessions caching javascript from the earlier Oracle BI release. To avoid unnecessary support requests, ask all end-users to clear their browser cache.

This unexpected behaviour can include the display not rendering correctly – for example, when creating a new Analysis, only the main toolbar is shown. The resolution is simple: purge the browser cache and log out/in to OBIEE.

Another problem observed has been the dashboard tabs not displaying.

For an enterprise deployment, I would imagine this might cause a few problems, as not all users will necessarily be familiar with clearing browser caches, or pay attention to instructions telling them to do so.

Exalytics – TimesTen and OBIEE connectivity


Two of the key components in Exalytics are OBIEE and the TimesTen in-memory database. Configuring them to work together, particularly in non-standard configurations, can be fiddly, so here is a guide on how to make sure you get it right. 

The aim

To recap, the aim of this configuration is for the BI Server to be able to access the correct TimesTen database. This is necessary for both users running reports which use data held in TimesTen, and also when you are running the Summary Advisor/Aggregate Persistence Wizard script to create aggregates.

The configuration files

There are four configuration files involved:

  • The OBI Repository (RPD)
  • odbc.ini
  • sys.odbc.ini
  • sys.ttconnect.ini

In addition, you need to be aware of opmn.xml, but more on that later.

Scenario 1 – Single Exalytics node, single OBIEE/TimesTen installation

If you have a single Exalytics server and have not deviated from the supported configuration then this is what you will have. By default Exalytics will come configured with part or all of this done.

sys.odbc.ini

In TimesTen, the database is defined in a file called sys.odbc.ini. You’ll find this file in $TIMESTEN_HOME/info, for example, /u01/app/oracle/product/TimesTen/tt1122/info/.

There are two important things about this file: it is where the database name is set, and it is where the actual database definition is held.

[ODBC Data Sources]
TT_AGGR_STORE1=TimesTen 11.2.2 Driver

[TT_AGGR_STORE1]
Driver=/u01/app/oracle/product/TimesTen/tt1122/lib/libtten.so
DataStore=/u01/data/tt/tt_aggr_store1/data
LogDir=/u01/data/tt/tt_aggr_store1/logs
[… ]

The [TT_AGGR_STORE1] section header sets the database name, also referred to as the DSN (Data Source Name). This name must also appear in the [ODBC Data Sources] list at the top of the file.

The entries within the section (Driver, DataStore, LogDir, and so on) are the definition of the database – where the data is stored, how large the database is, etc.

odbc.ini

This is the configuration file used by OBIEE for holding any ODBC data source definitions. This includes the OBIEE datasource itself, AnalyticsWeb (which is connected to by ODBC). You may have entries here for SQL Server and other ODBC-connected datasources too. The file is located in $FMW_HOME/instances/instance1/bifoundation/OracleBIApplication/coreapplication/setup.

For Exalytics TimesTen, the relevant entry here will look like this:

[…]
[ODBC Data Sources]
TT_AGGR_STORE1 = TimesTen 11.2.2 Driver
[…]
[TT_AGGR_STORE1]
Driver = /u01/app/oracle/product/TimesTen/tt1122/lib/libttclient.so
TTC_SERVER_DSN = TT_AGGR_STORE1
TTC_SERVER = localhost
TTC_TIMEOUT = 0

The [TT_AGGR_STORE1] section header defines the DSN name, which must also be listed under [ODBC Data Sources].

The key thing here is that the name of the DSN defined in odbc.ini doesn’t have to match that of sys.odbc.ini. From the DSN defined here, the DSN defined in TimesTen is referenced as TTC_SERVER_DSN.
The TTC_SERVER in this context is the physical hostname (eg localhost) of the Exalytics server.

The OBI Repository (RPD)

In your Exalytics RPD, the connection pool for your TimesTen database definition will have a Data source name entry. You set this to the DSN defined (i.e. the name between the square brackets) in odbc.ini.

Scenario 2 – Single Exalytics node, multiple OBIEE/TimesTen installations

If there are multiple TimesTen installations on the one Exalytics server (for example, to support dev/test/pre-prod, and/or for standalone patching and versioning) then there will be multiple TimesTen daemon/server processes on distinct ports.

In the above example “Scenario 1”, the OBIEE odbc.ini configuration relies on the TimesTen server on which the intended database exists listening on the default server port. This is because TTC_SERVER can refer to either the physical hostname of the TimesTen server, or to a TimesTen logical server definition.

If the TimesTen server process that we want to connect to is on a different port, it is necessary to use a Logical Server reference.


A logical server is configured in the sys.ttconnect.ini file:

[ttServerB]
Description=TimesTen Server
Network_Address=localhost
TCP_PORT=54397

The section header ([ttServerB] in this example) defines the logical server name. The Network_Address is the physical hostname, and TCP_PORT the port on which the server process is listening. Be aware of some special values that Network_Address can be set to; see the documentation for more details.

To reference the logical server, the name (in the above example, ttServerB) is referenced in the odbc.ini DSN definition:

odbc.ini

[ODBC Data Sources]
TT_AGGR_STORE1 = TimesTen 11.2.2 Driver
[…]
[TT_AGGR_STORE1]
Driver = /u01/app/oracle/product/TimesTen/tt1122/lib/libttclient.so
TTC_SERVER_DSN = TT_AGGR_STORE1
TTC_SERVER = ttServerB
TTC_TIMEOUT = 0

The OBI Repository (RPD)

In your Exalytics RPD, the connection pool for your TimesTen database definition will have a Data source name entry. You set this to the DSN defined (i.e. the name between the square brackets) in odbc.ini.

Troubleshooting

The nqserver.log will show if there are problems, even if you are not querying TimesTen. This is because OBI recognises TimesTen connection pools and automatically checks them every sixty seconds (you can configure this interval with HA_DB_PING_PERIOD_MILLISECS in NQSConfig.ini).
A couple of common errors and their causes:

  • [nQSError: 16001] ODBC error state: IM002 code: 0 message: [TimesTen][TimesTen 11.2.2.2.1 ODBC Driver]Data source name not found and no default driver specified.
    • This means that you have specified a TTC_SERVER_DSN in the OBIEE odbc.ini for which there is no corresponding DSN in TimesTen’s sys.odbc.ini
  • [nQSError: 16001] ODBC error state: IM002 code: 0 message: [TimesTen][TimesTen 11.2.2.3.0 CLIENT]Cannot find the requested DSN (TT_AGGR_STORE) in ODBCINI [...]
    • This means that your RPD is specifying a DSN which you haven’t defined in OBIEE’s odbc.ini
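A quick way to check for these errors in the BI Server log from the command line (the path assumes a default single-instance installation, so adjust to suit):

grep nQSError $FMW_HOME/instances/instance1/diagnostics/logs/OracleBIServerComponent/coreapplication_obis1/nqserver.log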

opmn.xml

If you are using multiple TimesTen binaries then make sure you set the correct LD_LIBRARY_PATH and TIMESTEN_DLL in the opmn.xml configuration file, which you’ll find in $FMW_HOME/instances/instance1/config/OPMN/opmn/
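For illustration only – check your own file, as the exact structure varies between installations – the relevant variables sit in the environment section of the BI Server ias-component and look something like this (the TimesTen paths are placeholders):

<environment>
  <variable id="LD_LIBRARY_PATH" value="/u01/app/oracle/product/TimesTen/tt1122/lib" append="true"/>
  <variable id="TIMESTEN_DLL" value="/u01/app/oracle/product/TimesTen/tt1122/lib/libttclient.so"/>
</environment>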

Documentation

OBIEE / AD integration – [OBI-SEC-00022] Identity found … but could not be authenticated


A quick blog post to record for future Googlers a problem I encountered today. I was configuring OBIEE 11.1.1.6 to use Microsoft Active Directory (MSAD) as an Authentication Provider, following the instructions in Mark’s blog post.

After completing the setup, I could see my AD users in Web Logic Console under Users and Groups but logins to analytics with an AD user failed. In the bi_server1-diagnostic.log was the entry

[OBI-SEC-00022] Identity found jbloggs but could not be authenticated

The problem was that my Principal user (let’s call it ADBusInt) was outside of the AD region which I’d identified with Base User DN. This meant that OBIEE could find the user’s AD account (jbloggs) successfully (in the specified Base User DN), but not the ADBusInt account which is required to complete authentication. 

The solution was to broaden Base User DN to include the area of the AD which hosted my Principal user too. 

Automated Monitoring of OBIEE in the Enterprise – an overview


A lot of time is given to the planning, development and testing of OBIEE solutions. Well, hopefully it is. Yet sometimes, the resulting deployment is marked Job Done and chucked over the wall to the Operations team, with little thought given to how it is looked after once it is running in Production.

Of course, at the launch and deployment into Production, everyone is paying very close attention to the system. The slightest cough or sneeze will make everyone jump and come running to check that the shiny new project hasn’t embarrassed itself. But what about weeks 2, 3, 4…six months later…performance has slowed, the users are unhappy, and once in a while someone thinks to load up a log file to check for errors.

This post is the first of a mini-series on monitoring, and will examine some of the areas to consider in deploying OBIEE as a Production service. Two further posts will look at some of this theory in practice.

Monitoring software

The key to happy users is to know there’s a problem before they do, and even better, fix it before they realise. How do you do this? You either sit and watch your system 24 hours a day, or you set up some automated monitoring. There’s lots of companies willing to take lots of money off your hands for very complex and fancy pieces of software that will do this, and there are lots of open-source solutions (some of them also very complex, and some of them very fancy) that will do the same. They all fall under the umbrella title of Systems Management.

Which you choose may be dictated to you by corporate policy or your own personal choice, but ultimately all the good ones will do pretty much the same and require configuring in roughly the same kind of way. Some examples of the software include:

  • HP OpenView
  • Nagios
  • Zabbix
  • Tivoli
  • Zenoss
  • Oracle Enterprise Manager
  • list on Wikipedia

Some of these tools can take the output of a bash script and use it as the basis for logging an alert or not. This means that pretty much anything you can think of to check can be checked, so long as you can script it.

I’m not aware of any which come with out-of-the-box templates for monitoring OBIEE 11g – if there are, please let me know. If a vendor says they have, make sure they’re not mistaking “is capable of” for “is actually implemented”.

What to monitor

It’s important to try and look at the entire map of your OBIEE deployment and understand where things could go wrong. Start thinking of OBIEE as the end-to-end stack, or service, and not simply the components that you installed. Once you’ve done that, you can start to plan how to monitor those things, or at least be aware of the fault potential. For example, it’s obvious to check that the OBIEE server is running, but what about the AD server which you use for authentication? Or the SAN your webcat resides on? Or the corporate load balancer you’re using?

Here are some of the common elements to all OBIEE deployments that you should be considering:

OBIEE services

An easy one to start with, and almost so easy it could be overlooked. You need to have in place something which is going to check that the OBIEE processes are currently running. Don’t forget to include the Web Logic Server process(es) in this too.

The simplest way to build this into a monitoring tool is to have it check with the OS (be it Linux/Windows/whatever) that a process with the relevant name is running, and raise an alert if it’s not. For example, to check that the Presentation Services process is running, you could do this on Linux:

ps -ef|grep [s]awserver
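To make a check like this consumable by a monitoring tool, you can wrap it in a script that returns a Nagios-style exit code (0 = OK, 2 = critical). A minimal sketch, assuming the process name is passed as an argument:

#!/bin/bash
# Sketch only: check that a named OBIEE process is running and return
# a Nagios-style exit code. Example usage: ./check_obi_proc.sh sawserver
PROC=${1:-sawserver}
# The [x]yz grep trick stops grep matching its own process entry
COUNT=$(ps -ef | grep "[${PROC:0:1}]${PROC:1}" | wc -l)
if [ "$COUNT" -ge 1 ]; then
    echo "OK - $COUNT ${PROC} process(es) running"
    exit 0
else
    echo "CRITICAL - no ${PROC} process found"
    exit 2
fi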

You could use opmnctl to query the state of the processes, but be aware that OPMN is going to report how it sees the processes. If there’s something funny with opmn, then it may not pick up a service failure. Of course, if there’s something funny with opmn then you may have big trouble anyway.

A final point on process monitoring: note that OPMN manages the OBIEE system components and by default will restart them if they crash. This is different behaviour from OBIEE 10g, where when a process died it stayed dead. In 11g, processes come back to life, and it can be most confusing if an alert fires saying that a process is down but when you check it appears to be running.

Network ports

This is a belts and braces counterpart to checking that processes are running. It makes sense to also check that the network ports that OBIEE uses to communicate both externally with users and internally with other OBIEE processes are listening for traffic. Why do this? Two reasons spring to mind. The first is that you misconfigure your process-check alert, or it fails, or it gets accidentally disabled. The second, less likely, is that an OBIEE process is running (so doesn’t trigger the process-not-running alert) but has hung in some way and isn’t accepting TCP traffic.

The ports that your particular OBIEE deployment uses will vary, particularly if you’ve got multiple deployments on one host. To see which ports are used by the BI System Components, look at the file $FMW_HOME/instances/instance1/config/OPMN/opmn/ports.prop. The ports used by Web Logic will be in $FMW_HOME/user_projects/domains/bifoundation_domain/config/config.xml

A simple check that Presentation Services was listening on its default port would be:

netstat -ln | grep tcp | grep 9710 | wc -l

If a zero is returned that means there are no ports listening, i.e. there’s a problem.

Application Deployments

Web Logic Server hosts various JEE Application Deployments, some of which are crucial to the well-being of OBIEE. An example of one of these is analytics (which handles the traffic between the web browser and the Presentation Services). Just because Web Logic is running, you cannot assume that the application deployment is. You can check automatically using WLST:

connect('weblogic','welcome1','t3://localhost:7001')
nav=getMBean('domainRuntime:/AppRuntimeStateRuntime/AppRuntimeStateRuntime')
state=nav.getCurrentState('analytics#11.1.1','bi_server1')
print "\033[1;32m " + state + "\033[1;m"

You would invoke the above script (assuming you'd saved it as /tmp/check_app.py) using:

$FMW_HOME/oracle_common/common/bin/wlst.sh /tmp/check_app.py

Checking application deployment health using WLST
Because WLST is verbose when you invoke it, you might want to pipe the command through tail so that you just get the output

$FMW_HOME/oracle_common/common/bin/wlst.sh /tmp/check_app.py | tail -n 1
 STATE_ACTIVE

If you want to explore more detail around this functionality a good starting point is the MBeans involved, which you can find in Enterprise Manager under Runtime MBeans > com.bea > Domain: bifoundation_domain

Log files

The log files from OBIEE are crucial for spotting problems which have happened, and indicators of problems which may be about to happen. You'll find the OBIEE logs in:

  • $FMW_HOME/instances/instance1/diagnostics

and the Web Logic Server related logs primarily in

  • $FMW_HOME/user_projects/domains/bifoundation_domain/servers/AdminServer/logs
  • $FMW_HOME/user_projects/domains/bifoundation_domain/servers/bi_server1/logs

There are others dotted around but these are the main places to start. For a more complete list, look in Enterprise Manager under coreapplication > Diagnostics > Log Viewer > Selected Targets.

Once you've located your logs, there's no prescribed list of what to monitor for - it's down to your deployment and the kind of errors you [expect to] see. Life is made easier because FMW already categorises log messages by severity, so you could start with simply watching WLS logs for <Error> and OBIEE logs for [ERROR (yes, no closing bracket).
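As a starting point for a manual sweep (paths assume a default single-instance installation), something along these lines will pull out recent occurrences of those patterns, and the same patterns can be fed into your monitoring tool:

grep -r "\[ERROR" $FMW_HOME/instances/instance1/diagnostics/logs | tail -20
grep "<Error>" $FMW_HOME/user_projects/domains/bifoundation_domain/servers/bi_server1/logs/bi_server1.log | tail -20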

If you find there are errors regularly causing alerts which you don't want then set up exceptions in your monitoring software to ignore them or downgrade their alert severity. Of course, if there are regular errors occurring then the correct long-term action is to resolve the root cause so that they don't happen in the first place!

I would also watch the server logs for an indication of the processes shutting down, and any database errors thrown. You can monitor the Presentation Services log (sawlog0.log) for errors which are being passed back to the user - always good to get a headstart on a user raising a support call if you're already investigating the error that they're about to phone up and report.

Monitoring log files should be the bread and butter of any decent systems management software, and so each will probably have its own way of doing so. You'll need to ensure that it copes with rotating logs - if you have configured them - otherwise it will read a log to position 94, the log rotates and the latest entry is now 42, but the monitoring tool will still be looking at 94.

Server OS stats

In an Enterprise environment you may find that your Ops team will monitor all server OS stats generically, since CPU is CPU, whether it's on an OBI server or SMTP server. If they don't, then you need to make sure that you do. You may find that whatever Systems Management tool you pick supports OS stats monitoring.

As well as CPU, make sure you're monitoring memory usage, disk IO, file system usage, and network IO.

Even if another team does this for your server already, it is a good idea to find out what alert thresholds have been set, and get access to the metrics themselves. Different teams have different aims in collecting metrics, and it may be the Ops team will only look at a server which hits 90% CPU. If you know your OBIEE server runs typically at 30% CPU then you should be getting involved and investigating as soon as CPU hits, say, 40%. Certainly, by the time it hits 90% then there may already be serious problems.

OBI Performance Metrics

Just as you should monitor the host OS for important metrics, you can monitor OBIEE too. Using the Dynamic Monitoring Service (DMS), you can examine metrics such as:

  • Logged in user count
  • Active users
  • Active connections to each database
  • Running queries
  • Cache hits

This is just a handful - there are literally hundreds of metrics available.

You can see the metrics in Enterprise Manager (Fusion Middleware Control), but there is no history retained and no integrated alerting, making it of little use as a hands-off monitoring tool.

At Rittman Mead we have developed a solution which records the OBIEE performance data and makes it available for realtime monitoring and alerting for OBIEE:
Realtime monitoring of OBIEE metrics

The kind of alerting you might want on these metrics could include:

  • High number of failed logins
  • High number of query errors
  • Excessive number of database connections
  • Low cache hit ratio

Usage Tracking

I take this as such a given that I almost left it off this list. If you haven't got Usage Tracking in place, then you really should have. It's easy to configure, and once it's in place you can forget about it if you want to. The important thing is that you're building up an accurate picture of your system usage, which is impossible to do easily any other way. Some good reasons for having Usage Tracking in place:

  • How many people logged into OBIEE this morning?
  • What was the busiest time period of the day?
  • Which reports are used the most?
  • Which users are running reports which take longer than x seconds to complete? (Can we help optimise the query?)
  • Which users are running reports which bring back more than x rows of data? (Can we help them get the data more efficiently?)

In addition to these analytical reasons, going back to the monitoring aspect of this post, Usage Tracking can be used as a data source to trigger alerts for long running reports, large row fetches, and so on. An example query which would list reports from the last day that took longer than five minutes to run, returned more than 60000 rows, or used more than four queries against the database, would be:

SELECT user_name, 
       TO_CHAR(start_ts, 'YYYY-MM-DD HH24:MI:SS'), 
       row_count, 
       total_time_sec, 
       num_db_query, 
       saw_dashboard, 
       saw_dashboard_pg, 
       saw_src_path 
FROM   dev_biplatform.s_nq_acct 
WHERE  start_ts > SYSDATE - 1 
       AND ( row_count > 60000 
              OR total_time_sec > 300 
              OR num_db_query > 4 ) 

This kind of monitoring would normally be used to trigger an informational alert, rather than sirens-blazing code red type alert. It's important to be aware of potentially bad queries on the system, but it can wait until after a cup of tea.

Some tools will support database queries natively, others you may have to fashion together a sql*plus call yourself.
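As a rough sketch of the latter (the connection string, credentials and thresholds are placeholders, and error handling is omitted), a wrapper script that raises a non-OK exit code when offending reports are found might look like this:

#!/bin/bash
# Sketch: count potentially problematic reports in Usage Tracking over the
# last day and return a Nagios-style exit code. Credentials are placeholders.
ROWS=$(sqlplus -s dev_biplatform/password@orcl <<EOF | tr -d '[:space:]'
set heading off feedback off pagesize 0
SELECT COUNT(*) FROM s_nq_acct
WHERE start_ts > SYSDATE - 1
AND (row_count > 60000 OR total_time_sec > 300 OR num_db_query > 4);
EOF
)
if [ "${ROWS:-0}" -gt 0 ]; then
    echo "WARNING - $ROWS potentially problematic reports in the last day"
    exit 1
else
    echo "OK - no problem reports found"
    exit 0
fi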

Databases - both reporting sources and repository schemas (BIPLATFORM)

Without the database, OBIEE is not a great deal of use. It needs the database in place to provide the data for reports, and it also needs the repository schemas that are created by the RCU (MDS and BIPLATFORM).

As with the OS monitoring, it may be your databases are monitored by a DBA team. But as with OS monitoring, it is a good idea to get involved and understand exactly what is being monitored and what isn't. A DBA may have generic alerts in place, maybe for disk usage and deadlocks. It might be useful to monitor the DW also for long running queries or high session counts. Long running queries aren't going to necessarily bring the database down, but they might be a general indicator of some performance problems that you should be investigating sooner rather than later.

ETL

Getting further away from the core point of monitoring OBIEE, don't forget the ancillary components to your deployment. For your reports to have data the database needs to be functioning (see previous point) but there also needs to be data loaded into it.

OBIEE is the front-end of service you are providing to users, so even if a problem lies further down the line in a failed ETL batch, the users may perceive that as a fault in OBIEE.

So make sure that alerting is in place on your ETL batch too and there's a way that problems can be efficiently communicated to users of the system.

Active Monitoring

The above areas are crucial for "passive" monitoring of OBIEE. That is, when something happens which could be symptomatic of a problem, raise an alert. For real confidence in the OBIEE deployment, consider what I term active monitoring. Instead of looking for symptoms that everything is working (or not), actually run tests to confirm that it is. Otherwise you end up only putting in place alerts for things which have failed in the past and for which you have determined the failure symptom. Consider it the OBIEE equivalent of a doctor reading someone's vital signs chart versus interacting with the person and directly ascertaining their health.

OBIEE stack components involved in a successful report request

This diagram shows the key components involved in a successful report request in OBIEE, and illustrated on it are the three options for actively testing it described below. Use this as a guide to understand what you are and are not confirming by running one of these tests.

sawping

This is a command line utility provided by Oracle, and it runs a "ping" of the Presentation Services server. Not complicated, and not overly useful if you're already monitoring for the sawserver process and network port. But, easy to setup so maybe worth including anyway. Note that this doesn't check the BI server, database, or Web Logic.

[oracle@rnm ~]$ sawping -s myremoteobiserver.company.com -p 9710 -v
Server alive and well


[oracle@rnm ~]$ sawping -s myremoteobiserver.company.com -p 9710 -v
Unable to connect to server. The server may be down or may be too busy to accept additional connections.
An error occurred during connect to "myremoteobiserver.company.com:9710". Connection refused [Socket:6]
Error Codes: YM52INCK

nqcmd

nqcmd is a command line utility provided with OBIEE which acts as an ODBC client to the BI Server. It can be used to run Logical SQL (the query that Presentation Services generates to fetch data for a report) against the BI Server. Using nqcmd you can validate that the BI Cluster Controller, BI Server and Database are functioning correctly.

You could use nqcmd in several ways here:

  • Simple yes/no test that this part of the stack is functioning
  • nqcmd returns the results of a query, so you could test that the data being returned by the database is correct (compare it to what you know it should be)
  • Measure how long it takes nqcmd to run the query, and trigger an alert if the query is slow-running

This example runs a query extracted from nqquery.log and saved as query01.lsql. It uses grep and awk to parse the output to show just the row count retrieved, and the total time it took to run nqcmd. It uses the \ character to split lines for readability. If you want to understand more about how it works, run the nqcmd bit before the pipe | symbol and then add each of the pipe-separated statements back in one by one.

. $FMW_HOME/instances/instance1/bifoundation/OracleBIApplication/coreapplication/setup/bi-init.sh

time nqcmd -d AnalyticsWeb -u Prodney -p Admin123 -s ~/query01.lsql -q -T \
2>/dev/null | grep Row | awk '{print $13}'

NB don’t forget the bi-init step, which sets up the environment variables for OBIEE. On Linux it’s “dot-sourced” – with a dot space as the first two characters of the line.

Web user automation

Pretty much the equivalent of logging on to OBIEE in person and checking it is working, this method uses generic web application testing tools to simulate a user running a report and raise an alert if the report doesn’t work. As with the nqcmd option previously, you could stay simple with this option and just confirm that a report runs, or you could start analyzing run times for performance trending and alerting.

To implement this option you need a tool which lets you record a user’s OBIEE session and can replay it simulating the browser activity. Then script the tool to replay the session periodically and raise an alert if it fails. Two tools I’m aware of that could be used for this are JMeter and HP’s BAC/LoadRunner.

A final point on this method – if possible run it remotely from the OBIEE server. If there are network problems, you want to pick those up too rather than just hitting the local loopback interface.
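If a full record-and-replay tool is more than you need, a lighter-weight alternative (not a substitute for true browser simulation) is to request a known report via OBIEE’s Go URL and check the HTTP response code. The host, port, catalog path and credentials below are placeholders – and be aware of the security implications of embedding credentials in a URL:

curl -s -o /dev/null -w "%{http_code}" \
  "http://obiserver.company.com:9704/analytics/saw.dll?Go&Path=/shared/Monitoring/HealthCheck&NQUser=monitor&NQPassword=password"

A 200 response suggests the whole stack served the report; anything else warrants investigation.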

If you think that this all sounds like overkill, then consider this real-life case, where all the OBIEE processes were up, the database was up, the network ports were open, the OS stats were fine — but the users still couldn’t run their reports. Only by simulating the end-to-end user process can you get proper confidence that your monitoring will alert you when there is a problem.

Enterprise Manager

This article is focussed on the automated monitoring of OBIEE. Enterprise Manager (Fusion Middleware Control) as it currently stands is very good for diagnosing and monitoring live systems, but doesn’t have the kind of automated monitoring seen in EM for Oracle DB.

There has always been the BI Management Pack available as an extra for EM Grid Control, but it’s not currently available for OBI 11g. Updated: It looks like there is now, or soon will be, an OBI 11g management pack for EM 12c, see here and here (h/t Srinivas Malyala)

Capacity Planning

Part of planning a monitoring strategy is building up a profile of your systems’ “normal” behaviour so that “abnormal” behaviour can be spotted and alerted. In building up this profile you should find you easily start to capture valuable information which feeds naturally into capacity planning.

Or put it another way, a pleasant side-effect of decent monitoring is a head-start on capacity planning and understanding your system’s usage versus the available resource.

Diagnostics

This post is focused on automated monitoring; in the middle of the night when all is quiet except the roar of your data centre’s AC, something is keeping an eye on OBIEE and will raise alarms if it’s going wrong. But what about if it is going wrong, or if it went wrong and you need to pick up the pieces?

This is where diagnostics and “interactive monitoring” come in to play. Enterprise Manager (Fusion Middleware Control) is the main tool here, along with Web Logic Console. You may also delve into the Web Logic Diagnostic Framework (WLDF) and the Web Logic Dashboard.

For further reading on this see Adam Bloom’s presentation from this year’s BI Forum: Oracle BI 11g Diagnostics

What next

Over the next few days I will be posting further articles in this series, looking at how we can put some of this monitoring theory into practice:

An introduction to monitoring OBIEE with Nagios


Introduction

This is the second post in a mini-series on monitoring OBIEE. The previous post, Automated Monitoring of OBIEE in the Enterprise – an overview, looked at the theory of why and what we should be monitoring. In this post I am going to walk through implementing a set of automated checks on OBIEE using the Systems Management tool Nagios.

Nagios

There are at least three different flavours of Nagios, and only one of them – Nagios Core – is free (open source). The others are Nagios XI and Nagios Fusion.

Brace yourself

One of the formal pre-requisites of open source software is either no documentation, or a vast swath of densely written documentation with no overview or map. OK, I’m kidding. But, be aware that with open source you have to be a bit more self-sufficient and prepared to roll up your sleeves than is normally the case with commercially produced software. I’m not trolling here, and there are exceptions on either side – but if you want to get Nagios working with OBIEE, be aware that it’s not simply click-click-done. :)

Nagios has a thriving community of plugins, addons, and companion applications such as alternative frontends. This is both a blessing and a curse. It’s great, because whatever you want to do with it, you probably can. It can be troublesome though, because it means there’s no single point of reference to look up how something is done — it could be done in many different ways. Some plugins will be excellent, others may be a bit ropey – you may find yourself navigating this with just your google-fu to guide you.

Right tool for the right job

As with any bit of software, make sure you’re not trying to hit the proverbial nail with a pick axe. Plugins and so on are great for extending a product, but always keep an eye on the product’s core purpose and whether you’re straying too far from it to be sensible. Something which works now might not in future product upgrades. Also sense-check whether two complementary tools might be better suited than trying to do everything within one.

Getting started

I’m working with two servers, both Oracle Linux 6.3.

  • The first server has OBIEE 11.1.1.6.2 BP1 installed in a standard single-node cluster with two WebLogic servers (AdminServer/Managed Server).
  • The second server is going to be my Nagios monitoring server

In theory you could install Nagios on the OBIEE server, but that’s not a great idea for Production usage, as you’d be subject to all of the bad things which could happen to the OBIEE server and wouldn’t be able to alert on them if the monitoring is running from the same server.

Installing Nagios

There is documentation provided on how to install Nagios from source which looks comprehensive and easy to follow.

Alternatively, using the EPEL repository, install nagios and the default set of nagios plugins using the package manager yum:

 yum install nagios nagios-plugins-all 

If you use the yum method, you might want to follow this step from the installation documentation, which will set Nagios to start up automatically at boot:

 chkconfig --level 35 nagios on 

Testing the installation

If the installation has worked, you should be able to go to the address http://[server]/nagios and login using the credentials you created or the default nagiosadmin/nagiosadmin.

If you don’t get this, check the following:

  • Is nagios running?
    $ ps -ef|grep [n]agios
    nagios 7959 1 0 14:16 ? 00:00:00 /usr/sbin/nagios -d /etc/nagios/nagios.cfg

    If it’s not, use

    service nagios start
  • Is Apache web server running?
    $ ps -ef|grep [h]ttpd 
    root 8016 1 0 14:19 ? 00:00:00 /usr/sbin/httpd
    apache 8018 8016 0 14:19 ? 00:00:00 /usr/sbin/httpd
    […] 

    If it’s not, use

    service httpd start
  • If the firewall’s enabled, is port 80 open?

Nagios configuration

Nagios is configured, by default, through a series of files held on the server. There are GUI front ends for these files, but in order to properly understand what’s going on under the covers I am working with the files themselves here.

The documentation refers to Nagios config being in /usr/local/nagios, but on my install it put it in /etc/nagios/

Object types

To successfully work with Nagios it is necessary to understand some of the terminology and object types used. For a complete list with proper definitions, see the documentation.

  • A host is a physical server
  • A host has services defined against it
  • Each service defines a command to use
  • A command specifies a plugin to execute

For a detailed explanation of Nagios’ plugin architecture, see here

Examining the existing configuration

From your Nagios installation home page, click on Hosts and you should see localhost listed. Click on Services and you’ll see eight pre-configured checks (‘services’) for localhost.

Let’s dissect this existing configuration to start with. First off, the nagios.cfg file (probably in /etc/nagios or /usr/local/nagios) includes the line:

cfg_file=/etc/nagios/objects/localhost.cfg

The localhost.cfg file defines the host and services for localhost.

Open up localhost.cfg and you’ll see the line define host which is the definition for the machine, including an alias, its physical address, and the name by which it is referred to in later Nagios configuration.

Scrolling down, there is a set of define service statements. Taking the first one:

define service{
use local-service ; Name of service template to use 
host_name localhost 
service_description PING 
check_command check_ping!100.0,20%!500.0,60% 
}

We can see the following:

  1. It’s based on a local-service template
  2. The hostname to use in it is localhost, defined previously
  3. The (arbitrary) name of the service is PING
  4. The command to be run for this service (to determine the service’s state) is in the check_command. The syntax here is the command (check_ping) followed by arguments separated by the ! symbol (pling/bang/exclamation mark)

The command that a service runs (and the arguments that it accepts) is defined by default in the commands.cfg file. Open this up, and search for ‘check_ping’ (the command we saw in the PING service definition above). We’re now getting closer to the actual execution, but not quite there yet. The define command gives us the command name (eg. check_ping), and then the command line that is executed for it. In this case, the command line is also called check_ping, and is an executable that is installed with nagios-plugins (nagios-plugins-all if you’re using a yum installation).

In folder /usr/lib64/nagios/plugins you will find all of the plugins that were installed by default, including check_ping. You can execute any of them from the command line, which is a good way to both test them and understand how they work with arguments passed to them. Many will support a -h help flag, including check_ping:

 $ cd /usr/lib64/nagios/plugins/ 
$ ./check_ping -h
check_ping v1.4.15 (nagios-plugins 1.4.15)
Copyright (c) 1999 Ethan Galstad <nagios@nagios.org>
Copyright (c) 2000-2007 Nagios Plugin Development Team
	<nagiosplug-devel@lists.sourceforge.net>

Use ping to check connection statistics for a remote host.

Usage:
check_ping -H <host_address> -w <wrta>,<wpl>% -c <crta>,<cpl>%
 [-p packets] [-t timeout] [-4|-6]
 […] 

Note the -w and -c parameters – this is where Warning and Critical thresholds are passed to the plugin, for it to then return the necessary status code back to Nagios.

Working back through the config, we can see the plugin is going to be executed with

command_line $USER1$/check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5

(from the command definition) and the arguments passed to it are

 check_command check_ping!100.0,20%!500.0,60%

(from the service definition). Remember the arguments are separated by the ! symbol, so the first argument ($ARG1$) is 100.0,20% and the second argument ($ARG2$) is 500.0,60%. $HOSTADDRESS$ comes from the address of the host referenced by the host_name entry in the service definition.

So, we can now execute the plugin ourselves to see how it works and to validate what we think Nagios should be picking up:

./check_ping -H localhost -w 100.0,20% -c 500,60% -p 5 
PING OK - Packet loss = 0%, RTA = 0.05 ms|rta=0.052000ms;100.000000;500.000000;0.000000 pl=0%;20;60;0

A picture may be worth a thousand words

To visualise how the configuration elements relate and in which files they are located by default, see the following diagram. NB this is not a fully comprehensive illustration, but a simplified one of the default configuration.

tl;dr?

If you’re skimming through this looking for nuggets, you’d be well advised to try to digest the above section, or at least the diagram. It will save you time in the long run, as all of Nagios is based around the same design principle.

Adding a new host

Let us start our OBIEE configuration of Nagios by adding in the OBIEE server. Currently Nagios has a single host defined, localhost, which is the Nagios server itself.

The first step is to specify where our new configuration will reside. We can either

  1. bolt it on to one of the existing default config files
  2. Create a new config file, and reference it in nagios.cfg with a new cfg_file entry
  3. Create a new config file directory, and add a line to nagios.cfg for cfg_dir

Option 1 is quick ‘n dirty. Option 2 is fine for small modifications. Option 3 makes the most sense, as any new configuration files we create after this one we just add to the directory and they will get picked up automagically. We’ll also see that keeping certain configuration elements in their own file makes it easier to deploy to additional machines later on.

First, create the configuration folder

mkdir -p /etc/nagios/config

Then add the following line to nagios.cfg

cfg_dir = /etc/nagios/config

Now, in the tradition of all good technology learning, we will copy the existing configuration and modify it for the new host.

Copy objects/localhost.cfg to config/bi1.cfg, and then modify it so it resembles this:

 define host{ 
use linux-server
host_name bi1 
alias DEV OBIEE server 1 
address 192.168.56.101 
}

define service{ 
use local-service 
host_name bi1 
service_description PING 
check_command check_ping!100.0,20%!500.0,60% 
}

Substitute your server’s IP address as required. host_name is just a label, it doesn’t have to match the server’s hostname (although it is sensible to do so).

So we have a very simple configuration – our host, and a single service, PING.

Before the configuration change is activated, we need to validate the configuration, by getting Nagios to parse it and check for errors

nagios -v /etc/nagios/nagios.cfg

(Remember, nagios.cfg is the main configuration file which points to all the others).

Once the configuration has been validated, we restart nagios to pick up the new configuration:

service nagios restart

Returning to the Nagios web front end (http://[server]/nagios) you should now see the second host listed.

Running Nagios checks on a remote machine

Nagios checks are all based on a command line executable run locally on the Nagios server. This works fine for things like ping, but when it comes to checking the CPU load or for a given process, we need a way of finding this information out from the remote machine. There are several ways of doing this, including check_by_ssh, NRPE and NSCA. We’re going to use NRPE here. There is a good diagram here of how it fits in the Nagios architecture, and documentation for NRPE here.

NRPE works as follows:

  1. Nagios server calls a check_nrpe plugin locally
  2. check_nrpe communicates with NRPE daemon on the remote server
  3. NRPE daemon on the remote server executes the required nagios plugin locally, and passes the results back to the Nagios server

You can see from points 2 and 3 that there is installation required on the remote server, of both the NRPE daemon and the Nagios plugins that you want to be available for the remote server.

Setting up NRPE

On the remote server, install the Nagios plugins and the NRPE daemon:

$ sudo yum install nagios-plugins-all nagios-plugins-nrpe nrpe

If you’re running a firewall, make sure you open the port for NRPE (by default, 5666).
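For example, on a RHEL/CentOS-style remote server using iptables (a sketch only – adjust for whatever firewall you are actually running), the port could be opened with:

# Allow inbound NRPE traffic on TCP port 5666, then persist the rule
sudo iptables -I INPUT -p tcp --dport 5666 -j ACCEPT
sudo service iptables save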

Amend the NRPE configuration (/etc/nagios/nrpe.cfg) to add the IP of your Nagios server (in this example, 192.168.56.102) to the allowed_hosts line

allowed_hosts=127.0.0.1,192.168.56.102

(You might need to use sudo to edit the file)

Now set nrpe to start at boot, and restart the nrpe service to pick up the configuration changes made

$ sudo chkconfig --level 35 nrpe on
$ sudo service nrpe restart

Normally Nagios will be running check_nrpe from the Nagios server, but before we do that, we can use the plugin locally on the remote server to check that NRPE is functioning, before we get the network involved:

$ cd /usr/lib64/nagios/plugins 
$ ./check_nrpe -H localhost 
NRPE v2.12

If that works, then move on to testing the connection between the Nagios server and the remote server. On the Nagios server, install the check_nrpe plugin:

$ sudo yum install nagios-plugins-nrpe

And then run it manually:

$ cd /usr/lib64/nagios/plugins 
$ ./check_nrpe -H 192.168.56.101 
NRPE v2.12

(in this example, my remote server’s IP is 192.168.56.101)

NRPE, commands and plugins

In a local Nagios service check, the service specifies a command which in turn calls a plugin. When we do a remote service check using NRPE the same chain exists, except that the service always calls the NRPE command and plugin, passing it the name of a command to execute on the remote NRPE server.

So there are actually two commands to be aware of:

  • The command defined on the Nagios server, which is specified from the service
    These commands are defined as objects using the define command syntax
  • The command on the remote server in the NRPE configuration, which specifies the actual plugin executable that is executed
    The command is defined in the nrpe.cfg file, with the syntax
    command[<command name>]=<command line execution statement>

An example NRPE service configuration

One of the default service checks that comes with Nagios is Check Load. It uses the check_load plugin. We’ll see how the same plugin can be used on the remote server through NRPE.

  1. Determine the commandline call for the plugin on the remote server. In the plugins folder execute the plugin manually to determine its syntax
    $ cd /usr/lib64/nagios/plugins/
    $ ./check_load -h 
    […]
    Usage: check_load [-r] -w WLOAD1,WLOAD5,WLOAD15 -c CLOAD1,CLOAD5,CLOAD15
    

    So for example:

    ./check_load -w 15,10,5 -c 30,25,20 
    OK - load average: 0.02, 0.04, 0.05|load1=0.020;15.000;30.000;0; load5=0.040;10.000;25.000;0; load15=0.050;5.000;20.000;0;
    
  2. Specify the NRPE command in nrpe.cfg file with the command line determined in the previous step:
    command[check_load]=/usr/lib64/nagios/plugins/check_load -w 15,10,5 -c 30,25,20

    You’ll see this in the default nrpe.cfg file. Note that “check_load” is entirely arbitrary, and “command” is a literal.

  3. On the Nagios server, configure the generic check_nrpe command. This should be added to an existing .cfg file, or a new one in the cfg_dir folder that we configured earlier
    define command{
    command_name check_nrpe 
    command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ 
    }

    Note here the -c argument, which passes $ARG1$ as the command to execute on the NRPE daemon.

  4. Define a service which will call the plugin on the NRPE server. I’ve added this into the configuration file for the new host created above (config/bi1.cfg)
    define service{ 
    use local-service
    host_name bi1 
    service_description Check Load 
    check_command check_nrpe!check_load 
    }

    Note that check_nrpe is the name of the command that we defined in step 3. check_load is the arbitrary command name that we’ve configured on the remote server in nrpe.cfg

As before, validate the configuration:

nagios -v /etc/nagios/nagios.cfg

and then restart the Nagios service:

sudo service nagios restart

Login to your Nagios console and you should see the NRPE-based service working:Nagios05

Nagios and OBIEE

Did someone say something about OBIEE? As I warned at the beginning of this article, Nagios is fairly complex to configure and has a steep learning curve. What I've written so far is hopefully sufficient to guide you through the essentials and give you a head-start in using it.

The rest of this article looks at the kinds of alerts we can build into Nagios for OBIEE.

Process checks

To check for the processes in the OBIEE stack we can use the check_procs plugin. This is a flexible plugin with a variety of invocation approaches, but we are going to use it to raise a critical alert if there is not a process running which matches an argument or command that we specify.

As with all of these checks, it is best to develop it from the ground up, so start with the plugin on the command line and work out the correct syntax. Once the syntax is determined it is simple to incorporate it into the Nagios configuration.

The syntax for the plugin is obtained by running it with the -h flag:

 ./check_procs -h |more 
check_procs v1.4.15 (nagios-plugins 1.4.15)
Copyright (c) 1999 Ethan Galstad <nagios@nagios.org>
Copyright (c) 2000-2008 Nagios Plugin Development Team
	<nagiosplug-devel@lists.sourceforge.net>

Checks all processes and generates WARNING or CRITICAL states if the specified
metric is outside the required threshold ranges. The metric defaults to number
of processes.  Search filters can be applied to limit the processes to check.


Usage:
check_procs -w <range> -c <range> [-m metric] [-s state] [-p ppid]
 [-u user] [-r rss] [-z vsz] [-P %cpu] [-a argument-array]
 [-C command] [-t timeout] [-v][…]

So to check for Presentation Services, which runs as sawserver, we would use the -C parameter to specify the process command to match. In addition, we need to specify the warning and critical thresholds. For the OBI processes these thresholds are pretty simple – if there are zero processes then sound the alarm, and if there's one process then all is OK.

./check_procs -C sawserver -w 1: -c 1: 
 PROCS OK: 1 process with command name 'sawserver'

And if we bring down Presentation Services and run the same command:

./check_procs -C sawserver -w 1: -c 1: 
 PROCS CRITICAL: 0 processes with command name 'sawserver'

To add this into Nagios, do the following:

  1. On the remote server, add the command into NRPE.
    I’ve created a new file called custom.cfg in /etc/nrpe.d (the contents of which are read by NRPE for configuration as well as nrpe.cfg itself)
    The command I’ve defined is called check_obips:
    command[check_obips]=/usr/lib64/nagios/plugins/check_procs -w 1: -c 1: -C sawserver
  2. Because we’ve added a new command into NRPE, the NRPE service needs restarting:
    service nrpe restart
  3. On the Nagios server define a new service for the BI server which will use the check_obips command, via NRPE:
    define service{
    use local-service
    host_name bi1
    service_description Process: Presentation Services
    check_command check_nrpe!check_obips
    }
  4. As before, validate the nagios configuration and if it passes, restart the service
    nagios -v /etc/nagios/nagios.cfg 
    service nagios restart

Looking in the Nagios frontend, the new Presentation Services alert should be present: Nagios06 In this screenshot the alert is status Critical because there are no Presentation Services (sawserver) processes running. If I restart it the alert will change: Nagios07

Network ports

To double-check that OBIEE is working, monitoring the state of the network ports is a good idea.

If you are using a firewall then you will need to run this check on the OBI server itself, through NRPE. If you’re not firewalled, then you could run it from the Nagios server. If you are firewalled but only want to check for the public-facing ports of OBIEE (for example, 9704) then you could run it locally on Nagios too.

Whichever way you run the alert, it is easily done using the check_tcp plugin

./check_tcp -p 9704 
TCP OK - 0.001 second response time on port 9704|time=0.001384s;;;0.000000;10.000000

The only parameter that we need to specify is the port, -p. As with the check_proc plugin, there are different ways to use it and check_tcp can raise warnings/alerts if there’s a specified delay connecting to the port, and it can also match a send/expect string. For our purpose, it will return OK if the port we specify is connected to, and fail if not.

The NRPE configuration:

command[check_obis_port]=/usr/lib64/nagios/plugins/check_tcp -H localhost -p 9703

The Nagios service configuration:

define service{
use local-service
host_name bi1
service_description Port: BI Server
check_command check_nrpe!check_obis_port
}

Log files

check_logwarn is not provided by the default set of Nagios plugins, and must be downloaded and installed separately. Once installed, it can be used thus:

NRPE command:

 command[check_log_nqserver]=/usr/lib64/nagios/plugins/check_logwarn -p -d /tmp /u01/app/oracle/product/fmw/instances/instance1/diagnostics/logs/OracleBIServerComponent/coreapplication_obis1/nqserver.log ERROR 

Service definition:

 define service{ 
use local-service 
host_name bi1 
service_description Logs: BI Server nqserver.log 
max_check_attempts 1 
check_command check_nrpe!check_log_nqserver 
} 

Nagios08

Be aware that this method is only really useful for alerting you that there is something to look at in the logs – it doesn't give you the log to browse through. For that you would need to go to the log file on disk, or the log viewer in EM. Tips:

  • Set max_check_attempts in the service definition to 1, so that an alert is raised straight away.
    Unlike monitoring something like a network port where a glitch might mean a service should check it more than once before alerting, if an error is found in a log file it is still going to be there if you check again.
  • For this service, the action_url option could be used to include a link through to the EM log viewer (see the sketch after this list)
  • Make sure that the NRPE user has permissions on the OBI log files.
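As a sketch of the action_url idea mentioned above, the service definition could carry an action_url alongside its other settings (the EM URL here is hypothetical and would need to point at your own environment):

define service{
use local-service
host_name bi1
service_description Logs: BI Server nqserver.log
max_check_attempts 1
check_command check_nrpe!check_log_nqserver
action_url http://bi1.example.com:7001/em
}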

Database

The check_oracle plugin can check that a database is running locally, or using a TNS entry remotely. Since the OBIEE server that I'm using here is a sandpit environment, the database is also running on it, so the check can be run locally via NRPE

NRPE configuration:

command[check_db]=/usr/lib64/nagios/plugins/check_oracle --db ORCL

Service definition:

define service{
use local-service
host_name bi1
service_description Database
check_command check_nrpe!check_db
}

Final Nagios configuration

Service Groups

Having covered the basic setup for monitoring an OBIEE server, we will now look at a couple of Nagios configuration options to improve the monitoring setup that’s been built. The first is Service Groups. These are a way of grouping services together (how did you guess). For example, all the checks for OBIEE network ports. In the Nagios frontend Service Groups can be examined individually and drilled into. Nagios09 The syntax is self-explanatory, except the members clause, which is a comma-separated list of host,service pairings:

 define servicegroup{ 
servicegroup_name obiports 
alias OBIEE network ports 
members bi1, Port: OPMN remote,bi1, Port: BI Server,bi1, Port: Javahost ,bi1, Port: OPMN local port,bi1, Port: BI Server - monitor,bi1, Port: Cluster Controller,bi1, Port: Cluster Controller - monitor,bi1, Port: BI Scheduler - monitor,bi1, Port: BI Scheduler - Script RPC,bi1, Port: Presentation Services,bi1, Port: BI Scheduler,bi1, Port: Weblogic Managed Server - bi_server1,bi1, Port: Weblogic Admin Server 
}

NB The object definition for the servicegroups is best placed in its own configuration file, or at least not in the same one as the host/service configurations. If it's in the same file as the host/service config then it's less easy to duplicate that file for new hosts.

A note about templates

All of the objects that we have configured have included a use clause. This is a template object definition that specifies generic settings so that you don't have to configure them each time you create an object of that type. It also means if you want to change a setting, you can do so in one place instead of dozens.

For example, services have a check_interval setting, which is how often Nagios will check the service. There's also a retry_interval, which is how frequently Nagios will re-check the service after an initial error, before raising an alert (the number of re-checks is governed by max_check_attempts).

All the templates by default are defined in objects/templates.cfg, but note that templates in themselves are not an object type, they are just an object (eg service) which can be inherited. Templates can inherit other templates too. Examine the generic-service and local-service default templates to see more.
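As a minimal sketch of what a template looks like (the name and values here are hypothetical – the real defaults are in objects/templates.cfg):

define service{
        name            obi-base-service   ; template name, referenced by other objects via "use"
        check_interval  5                  ; check every 5 minutes while the service is OK
        retry_interval  1                  ; re-check every minute while in a non-OK soft state
        register        0                  ; 0 = template only, never registered as a real service
        }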

To see the final object definitions with all their inherited values, go to the Nagios web front end and choose the System > Configuration option from the left menu.

Email alerts

A silent alerting system is not much use if we want a hands-off approach to monitoring OBIEE. Getting Nagios to send out emails is pleasantly easy. In essence, you just need to configure a contact object. However, I'm going to show how to set it up a bit more neatly, and illustrate the use of templates in the process.

  1. First step is to test that your Nagios server can send outbound email. In an enterprise this shouldn’t be too difficult, but if you’re trying this at home then some ISPs do block it.
    To test it, run:
    echo 'Email works from the Nagios server' | mailx -s 'Test message from Nagios' foo@bar.com

    Substitute your email address, and if you receive the email then you know the host can send emails. Note you're not testing the Nagios email functionality, just the ability of the Nagios host server to send email.
    If the email doesn’t come through then check /var/log/maillog for errors

  2. In your Nagios configuration, create a contact and contactgroup object. For ease of manageability, I’ve created mine as config/contacts.cfg but anywhere that Nagios will pick up your object definition is fine.
    define contact { 
    use generic-contact 
    contact_name rnm 
    alias Robin Moffatt 
    email foo@bar.com 
    }
    
    define contactgroup { 
    contactgroup_name obiadmins 
    alias OBI Administrators 
    members rnm 
    }

    A contact group is pretty self-explanatory – it is made up of one or more contacts.

  3. To associate a contact group with a service, so that it receives notifications when the service goes into error, use the contact_groups clause in the service definition.
    Instead of adding this into each service that we've defined (currently about 30), I am going to add it into the service template. At the moment the services use the local-service template, one of the defaults with Nagios. I've created a new template, called obi-service, which inherits the existing local-service definition but also includes the contact_groups clause:
    define service{ 
    name obi-service 
    use local-service 
    contact_groups obiadmins 
    }

    Now a simple search & replace in my configuration file for the OBIEE server (I called it config/bi1.cfg) changes all use local-service to use obi-service:

    […]
    define service{ 
    use obi-service 
    host_name bi1 
    service_description Process: BI Server 
    check_command check_nrpe!check_obis 
    } 
    […]
  4. Validate the configuration and then restart Nagios.

All going well, you should now receive alerts when services go into error: Nagios10

You can see what alerts have been sent by looking in the Nagios web front end under Reports > Notifications on the left-hand menu: Nagios11

Deployment on other OBIEE servers

To deploy the same setup as above, for a new OBIEE server, do the following:

  1. Install nagios plugins and nrpe daemon on the new server
    sudo yum install nagios-plugins-all nagios-plugins-nrpe nrpe
  2. Add Nagios server IP to allowed_hosts in /etc/nagios/nrpe.cfg
  3. Start NRPE service
    service nrpe start
  4. Test nrpe locally on the new OBIEE server:
    $ /usr/lib64/nagios/plugins/check_nrpe -H localhost
    NRPE v2.12
  5. Test nrpe from Nagios server:
    $ /usr/lib64/nagios/plugins/check_nrpe -H bi2
    NRPE v2.12
  6. From the first OBIEE server, copy /etc/nrpe.d/custom.cfg to the same path on the new OBIEE server.
    Restart NRPE again
  7. On the Nagios server, define a new host and set of services associated with it. The quick way to do this is copy the existing bi1.cfg file (which has the host and service definitions for the original OBIEE server) to bi2.cfg and do a search and replace. Amend the host definition for the new server IP.
  8. Update the service group definition to include the list of bi2 services too.
  9. Validate the configuration and restart Nagios

The new host should now appear in the Nagios front end: Nagios12

Nagios13

Summary

Nagios is a powerful but complex beast to configure. Once you get into the swing of it, it does make sense though.

At a high-level, the way that you monitor OBIEE with Nagios is:

  • Define OBIEE server as a host on Nagios
  • Install and configure NRPE on the OBIEE server
  • Configure the checks (process, network port, etc) on NRPE on the OBIEE server
  • Create a corresponding set of service definitions on the Nagios server to call the NRPE commands

The final part of this series looks at how plugins can be created to do more advanced monitoring with Nagios, including simulating user requests and alerting if they fail : Advanced monitoring of OBIEE with Nagios

Documentation

Nagios Core documentation

Advanced monitoring of OBIEE with Nagios


Introduction

In the previous articles in this series, I described an overview of monitoring OBIEE, and then a hands-on tutorial for setting up Nagios to monitor OBIEE. Nagios is an Enterprise Systems Management tool that can monitor multiple systems and servers, send out alerts for pre-defined criteria, and so on.

In this article I'm going to demonstrate creating custom plugins for Nagios to extend its capability to monitor additional elements of the OBIEE stack. The intention is not to document an exhaustive list of plugins and comprehensive configurations, but to show how the plugins can be created and get you started if you wanted to implement this yourself.

Most of these plugins will run local to the OBIEE server and the assumption is that you are using the NRPE mechanism for communication with the Nagios server, described in the previous article. For each plugin, I’ve included:

  • The plugin code, to be located in Nagios plugins folder (default is /usr/lib64/nagios/plugins)
  • If required, an entry for the NRPE configuration file on the BI Server
  • An entry for the service definition, on the Nagios server

Whenever you change the configuration of NRPE or Nagios, don’t forget to restart the appropriate service:

sudo service nrpe restart

or

sudo service nagios restart

A very brief introduction to writing Nagios plugins

There's plenty on Google, but a Nagios plugin boils down to the following (see the minimal sketch after this list):

  • Something executable from the command line as the nagios or nrpe user
  • One or more lines of output to stdout. You can include performance data relevant to the check after a pipe | symbol too, but this is optional.
  • The exit code reflects the check state – 0,1,2 for OK, Warning or Critical respectively
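As a minimal (and entirely hypothetical) sketch, a plugin that just checks whether a given file exists could be as simple as:

#!/bin/bash
# check_file_exists.sh - hypothetical example plugin
# Returns OK (exit 0) if the file passed as the first argument exists,
# otherwise Critical (exit 2)
if [ -f "$1" ]
then
        echo "OK - $1 exists"
        exit 0
else
        echo "CRITICAL - $1 is missing"
        exit 2
fi

Drop something like this into the plugins folder, make it executable, and it can then be referenced from an NRPE command or a Nagios command definition just like the standard plugins.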

Application Deployments

Here is a plugin for Nagios that will report on the state of a given WebLogic application deployment. Without several of the JEE applications that are hosted within WebLogic, OBIEE will not work properly, so it is important to monitor them.

Because of how WLST is invoked, and because Nagios uses a script's exit code to determine the service status, two scripts are required: one is the WLST Python code, the other is a shell wrapper which parses the output and sets the exit code accordingly.

Note that this plugin invokes WLST each time, so running it for every Application Deployment at very regular intervals concurrently may not be a great idea, since each invocation will spin up its own Java instance on your BI Server. Using the Nagios service option parallelize_check=0 ought to prevent this I think, but it didn't seem to when I tested it. Another possibility would be to run WLST remotely from the Nagios server, but this is not a 'light touch' option.

N4 N5

check_wls_app_deployment.sh:   (put this in the Nagios plugins folder on the BI Server)

# check_wls_app_deployment.sh
# Put this in your Nagios plugins folder on the BI Server
#
# Check the status of an Application Deployment
# Takes five arguments - connection details, plus application name, and server
#
# This is a wrapper for check_wls_app_deployment necessary to make sure a proper exit code
# is passed back to Nagios. Because the py is called as a parameter to wlst, it cannot set the exit
# code itself (since it is the wlst.sh which exits).
#
# RNM 2012-09-03
#
# Set this to your FMW home path:
FMW_HOME=/home/oracle/obiee
#
# No user serviceable parts below this line
# -----------------------------------------------------------------------------------------------
if [ $# -ne 5 ]
then
        echo
        echo "ERROR : not enough parameters"
        echo "USAGE: check_wls_app_deployment.sh WLS_USER WLS_PASSWORD WLS_URL app_name target_server"
        exit 255
fi

output=$($FMW_HOME/oracle_common/common/bin/wlst.sh /usr/lib64/nagios/plugins/check_wls_app_deployment.py $1 $2 $3 $4 $5 | tail -n1)

echo $output

test=$(echo $output|awk '{print $1}'|grep OK)
ok=$?

if [ $ok -eq 0 ]
then
        exit 0
else
        exit 2
fi

check_wls_app_deployment.py:    (put this in the Nagios plugins folder on the BI Server)

# check_wls_app_deployment.py
# Put this in your Nagios plugins folder on the BI Server
#
# Check the status of an Application Deployment
# Takes five arguments - connection details, plus application name, and server
# RNM 2012-09-03
#
# You shouldn't need to change anything in this script
#
import sys
import os
# Check the arguments to this script are as expected.
# argv[0] is script name.
argLen = len(sys.argv)
if argLen -1 < 5:
        print "ERROR: got ", argLen -1, " args."
        print "USAGE: wlst.sh check_wls_app_deployment.py WLS_USER WLS_PASSWORD WLS_URL app_name target_server"
        sys.exit(255)
WLS_USER = sys.argv[1]
WLS_PW = sys.argv[2]
WLS_URL = sys.argv[3]
appname = sys.argv[4]
appserver = sys.argv[5]

# Connect to WLS
connect(WLS_USER, WLS_PW, WLS_URL);

# Set Application run time object
nav=getMBean('domainRuntime:/AppRuntimeStateRuntime/AppRuntimeStateRuntime')
state=nav.getCurrentState(appname,appserver)
if state == 'STATE_ACTIVE':
        print 'OK : %s - %s on %s' % (state,appname,appserver)
else:
        print 'CRITICAL : State is "%s" for %s on %s' %  (state,appname,appserver)

NRPE configuration:

command[check_wls_analytics]=/usr/lib64/nagios/plugins/check_wls_app_deployment.sh weblogic welcome1 t3://localhost:7001 analytics#11.1.1 bi_server1

Service configuration:

define service{
        use                             obi-service
        host_name                       bi1
        service_description             WLS Application Deployment : analytics
        check_command                   check_nrpe_long!check_wls_analytics
        }

By default, NRPE waits 10 seconds for a command to execute before returning a timeout error to Nagios. WLST can sometimes take a while to crank up, so I created a new command, check_nrpe_long which increases the timeout:

define command{
        command_name    check_nrpe_long
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$ -t 30
        }

nqcmd OBIEE plugin for Nagios

Using the OBIEE command line utility nqcmd it is simple to create a plugin for Nagios which will run a Logical SQL statement (as Presentation Services would pass to the BI Server). This plugin will validate that the Cluster Controller, BI Server and source database are all functioning. With a bit more coding, we can include a check for the response time on the query, raising an alert if it breaches defined thresholds.

This script can be used to just run a Logical SQL and return pass/fail, or if you include the additional command line parameters, check for the response time. To use the plugin, you need to create a file holding the logical SQL that you want to run. You can extract it from usage tracking, nqquery.log, or from the Advanced tab of a report in Answers. In the example given below, the logical SQL was copied into a file called q1.lsql located in /usr/lib64/nagios/plugins/.
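As an illustration, the contents of q1.lsql might look something like this (hypothetical Logical SQL – use whatever you have copied from the Advanced tab or nqquery.log for your own subject area):

SELECT "Dim Times"."Month YYYYMM" saw_0, "Fact Sales"."Sale Amount" saw_1 FROM "Sales" ORDER BY saw_0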

check_obi_nqcmd.sh:    (put this in the Nagios plugins folder on the BI Server)

# check_obi_nqcmd.sh
# Put this in your Nagios plugins folder on the BI Server
# 
# Nagios plugin to check OBI status using nqcmd.
# Assumes that your DSN is AnalyticsWeb - modify the nqcmd call in this script if it is different
# 
# RNM September 2012
#
#
# Set this to your FMW home path:
FMW_HOME=/home/oracle/obiee
#
# No user serviceable parts below this line
# -----------------------------------------------------------------------------------------------
case $# in
	3)
		lsql_file=$1
		username=$2
		password=$3
		checktime=0
	;;
	5) 
		lsql_file=$1
		username=$2
		password=$3
		checktime=1
		warn_msec=$4
		crit_msec=$5
	;;
	*)
		echo " "
		echo "Usage: check_obi_nqcmd.sh <lsql-filename> <username> <password> [warn msec] [crit msec]"
		echo " "
		echo "eg: check_obi_nqcmd.sh /home/oracle/nagios/q1.lsql weblogic welcome1"
		echo "eg: check_obi_nqcmd.sh /home/oracle/nagios/q1.lsql weblogic welcome1 1000 5000"
		echo " "
		echo " "
		exit 255
esac
# Initialise BI environment
. $FMW_HOME/instances/instance1/bifoundation/OracleBIApplication/coreapplication/setup/bi-init.sh

outfile=$(mktemp)
errfile=$(mktemp)
grpfile=$(mktemp)

nqcmd -d AnalyticsWeb -u $username -p $password -s $lsql_file -q -T 1>$outfile 2>$errfile
grep Cumulative $outfile > /dev/null
nqrc=`echo $?`
if [ $nqrc -eq 0 ]
then
	responsetime=$(grep Cumulative $outfile |awk '{print $8 * 1000}')
	if [ $checktime -eq 1 ]
	then
		if [ $responsetime -lt $warn_msec ]
		then
			echo "OK - response time (msec) is  "$responsetime" |"$responsetime
			exitcode=0
		elif [ $responsetime -lt $crit_msec ]
		then
			echo "WARNING - response time is at or over warning threshold ("$warn_msec" msec). Response time is  "$responsetime" |"$responsetime
			exitcode=1
		else
			echo "CRITICAL - response time is at or over critical threshold ("$crit_msec" msec). Response time is  "$responsetime" |"$responsetime
			exitcode=2
		fi
	else
		echo "OK - response time (msec) is  "$responsetime" |" $responsetime
		exitcode=0
	fi
else
	grep -v "Connection open" $errfile > $grpfile
	grep failed $outfile >> $grpfile
	echo "CRITICAL - " $(tail -n1 $grpfile)
	exitcode=2
fi
rm $outfile $errfile $grpfile
exit $exitcode

NRPE configuration:

# Check nqcmd
command[check_nqcmd_q1]=/usr/lib64/nagios/plugins/check_obi_nqcmd.sh /usr/lib64/nagios/plugins/q1.lsql weblogic welcome1
command[check_nqcmd_q1_with_time_check]=/usr/lib64/nagios/plugins/check_obi_nqcmd.sh /usr/lib64/nagios/plugins/q1.lsql weblogic welcome1 500 1000

The first of the above commands runs the logical SQL file q1.lsql and will just do a pass/fail check. The second one checks how long it takes and raises a warning if it’s above half a second, or a critical alert if it’s over a second.

Nagios service configuration (use either or both, if you want the time checking):

define service{
        use                             obi-service
        host_name                       bi1
        service_description             NQCmd - Q1
        check_command                   check_nrpe!check_nqcmd_q1
        }

define service{
        use                             obi-service
        host_name                       bi1
        service_description             NQCmd - Q1 with time check
        check_command                   check_nrpe!check_nqcmd_q1_with_time_check
        }

N1

The plugin also supports the performance data output format, returning the time it took to run the logical SQL: N2

Test a real user with JMeter

All of the checks and monitors described so far only consider a particular aspect of the stack. The above check with NQCmd is fairly comprehensive in that it tests both the BI Server and the database. What it doesn't test is the front-end into OBIEE – the web server and Presentation Services. For full confidence that OBIEE is working as it should be, we need a full end-to-end test, and to do that we simulate an actual user logging into the system and running a report.

N6

To do this, I am using JMeter plus some shell scripting. JMeter executes the actual web requests that a user's web browser would make when using OBIEE. The shell script looks at the result and sets the exit status, and also records how long the test takes to run.

This check, like the NQCmd one above, could be set up as a simple pass/fail, or also consider how long it takes to run and raise a warning if it is above a threshold. N7

An important thing to note here is that this plugin is going to run local to the Nagios server, rather than on the BI Server like the two plugins above. This is deliberate, so that the network connectivity to the OBIEE server external to the host is also checked.

To set this up, you need:

  • JMeter (download the Binary from here). Unarchive it into a folder, for example /u01/app/apache-jmeter-2.7/. Set the files in the binary folder to executable
    chmod -R ugo+rx /u01/app/apache-jmeter-2.7/bin

    Make sure also that the nagios user (under which this check will run) has read/execute access to the folders above where jmeter is kept

  • A JMeter jmx script with the user actions that you want to test. The one I’m using does two simple things:
    • Login
    • Run dashboard

    I’m using assertions to check that each step runs correctly.

  • The actual plugin script which Nagios will use. Put this in the plugins folder (eg /usr/lib64/nagios/plugins)

    check_obi_user.sh:    (put this in the Nagios plugins folder on the Nagios server)

    # check_obi_user.sh
    # Put this in your Nagios plugins folder on the Nagios server
    #
    # RNM September 2012
    #
    # This script will invoke JMeter using the JMX script passed as an argument                            
    # It parses the output and sets the script exit code to 0 for a successful test                        
    # and to 2 for a failed test. 
    # 
    # Tested with jmeter 2.7 r1342410
    #
    # Set JMETER_PATH to the folder holding your jmeter files
    JMETER_PATH=/u01/app/apache-jmeter-2.7
    #
    # No user serviceable parts below this line
    # -----------------------------------------------------------------------------------------------
    # You shouldn't need to change anything below this line
    JMETER_SCRIPT=$1
    output_file=$(mktemp)
    
    /usr/bin/time -p $JMETER_PATH/bin/jmeter -n -t $JMETER_SCRIPT -l /dev/stdout 1>$output_file 2>&1
    status_of_run=$?
    realtime=$(tail -n3 $output_file|grep real|awk '{print $2}')
    if [ $status_of_run -eq 0 ]
    then
            result=$(grep "<failure>true" $output_file)
            status=$?
            if [ $status -eq 1 ]
            then
                    echo "OK user test run successfully |"$realtime
                    rc=0
            else
                    failstep=$(grep --text "<httpSample" $output_file|tail -n1|awk -F="\"" '{print $6}'|awk -F="\"" '{sub(/\" rc/,"");print $1}')
                    echo "CRITICAL user test failed in step: "$failstep
                    rc=2
            fi
    else
            echo "CRITICAL user test failed"
            rc=2
    fi
    #echo "Temp file exists : "$output_file
    rm $output_file
    exit $rc
  • Because we want to test OBIEE as if a user were using it, we run this test from the Nagios server. If we used NRPE to run it locally on the OBIEE server we wouldn’t be checking any of the network gremlins that can cause problems. On the Nagios server define a command to call the plugin, as well as the service definition as usual:
    define command{
            command_name    check_obi_user
            command_line    $USER1$/check_obi_user.sh $ARG1$
            }
    
    define service{
            use                             obi-service
            host_name                       bi1
            service_description             OBI user : Sample Sales - Product Details
            check_command                   check_obi_user!/u01/app/jmeter_scripts/bi1.jmx
            }
    

The proof is in the pudding

After the configuration we’ve done, we now have the following checks in place for the OBIEE deployment:
N9

Now I’m going to test what happens when we start breaking things, and see if the monitoring does as it ought to. To test it, I’m using a script pull_the_trigger.sh which will randomly break things on an OBIEE system. It is useful for putting a support team through its paces, and for validating a monitoring setup.

Strike 1

First I run the script: N11 then I check Nagios: N10 Two critical errors are being reported; a network port error, and a process error — sounds like a BI process has been killed maybe. Drilling into the Service Group shows this: N12 and a manual check on the command line and in EM confirms it: N13

$ ./opmnctl status

Processes in Instance: instance1
---------------------------------+--------------------+---------+---------
ias-component                    | process-type       |     pid | status
---------------------------------+--------------------+---------+---------
coreapplication_obiccs1          | OracleBIClusterCo~ |    5549 | Alive
coreapplication_obisch1          | OracleBIScheduler~ |    5546 | Alive
coreapplication_obijh1           | OracleBIJavaHostC~ |     N/A | Down
coreapplication_obips1           | OracleBIPresentat~ |    5543 | Alive
coreapplication_obis1            | OracleBIServerCom~ |    5548 | Alive

So, 1/1, 100% so far …

Strike 2

After restarting Javahost, I run the test script again. This time Nagios shows an alert for the user simulation: N15 Drilling into it shows the step the failure occurs in: N17 And verifying this manually confirms there’s a problem: N16

The Nagios plugins poll at configurable intervals, so by now some of the other checks have also raised errors: N18 We can see that a process has clearly failed, and since user logon and the NQCmd tests are failing it is probably the BI Server process itself that is down: N19 I was almost right — it was the Cluster Controller which was down.

Strike 3

I’ve manufactured a high CPU load on the BI server, to see how/if it manifests itself in the alerts. Here’s how it starts to look: N20

The load average is causing a warning: N21

All the Application Deployment checks are failing with a timeout, presumably because WLST takes too long to start up because of the high CPU load: N22

And the NQCmd check is raising a warning because the BI Server is taking longer to return the test query result than it should do.

Strike 4

The last test I do is to make sure that any problem with what the end user actually sees gets picked up by my alerts. Monitoring processes and ports is fine, but it's the "unknown unknowns" that will get you eventually. In this example, I've locked the database account that the report data comes from. Obviously, we could write an alert which checks each database account status and raises an error if it's locked, but the point here is that we don't need to think of all these possible errors in advance.

When the user ends up seeing this:    (which is bad, m’kay?)

Our monitoring will pick up the problem:

Both the logical SQL check with nqcmd, and the end-user simulation with JMeter, are picking up the problem.

Summary

There are quite a few things that can go wrong with OBIEE, and the monitoring that we’ve built up in Nagios is doing a good job of picking up when things do go wrong.


Incremental refresh of Exalytics aggregates using native BI Server capabilities


One of the key design features of the Exalytics In-Memory Machine is the use of aggregates (pre-calculated summary data), held in the TimesTen In-Memory database. Out of the box (“OotB”) these aggregates are built through the OBIEE tool, and when the underlying data changes they must be rebuilt from scratch.

For OBIEE (Exalytics or not) to make use of aggregate tables in a manner invisible to the user, they must be mapped into the RPD as additional Logical Table Sources for the respective Logical Table in the Business Model and Mapping (BMM) layer. OBIEE will then choose the Logical Table Source that it thinks will give the fastest response time for a query, based on the dimension level at which the query is written.

OBIEE’s capability to load aggregates is provided by the Aggregate Persistence function, scripts for which are generated by the Exalytics Summary Advisor, or the standard tool’s Aggregate Persistence Wizard. The scripts can also be written by hand.
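For illustration, a hand-written Aggregate Persistence script for a month-level sales aggregate might look something like the sketch below (the dimension hierarchy and level names here are hypothetical – in practice the Summary Advisor or Wizard generates the real thing for you):

create aggregates "ag_sales_month"
for "Sales"."Fact Sales" at levels ("Sales"."Times"."Month")
using connection pool "TimesTen aggregates"."TT_CP"
in "TimesTen aggregates".."EXALYTICS";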

Aggregate Persistence has two great benefits:

  1. It uses the existing metadata model of the RPD to understand where to get the source data for the aggregate from, and how to aggregate it. Because it uses standard RPD metadata, it also means that any data source that is valid for reporting against in OBIEE can be used as a source for the aggregates, and OBIEE will generate the extract SQL automagically. The aggregate creation process becomes source-agnostic. OBIEE will also handle any federation required in creating the aggregates. For example, if there are two source systems (such as Sales, and Stock) but one target aggregate, OBIEE will manage the federation of the aggregated data, just as it would in any query through the front-end.
  2. All of the required RPD work for mapping the aggregate as a new Logical Table Source is done automagically. There is no work on the RPD required by the developer.

However, there are two particular limitations to ‘vanilla’ Aggregate Persistence:

  1. It cannot do incremental refresh of aggregates. Whenever the underlying data changes, the aggregate must be dropped and rebuilt in entirety. This can be extremely inefficient if only a small proportion of the source data has changed, and can ultimately lead to scalability and batch SLA issues.
  2. Each time that the aggregate is updated, the RPD is modified online. This can mean that batch times take longer than they need to, and is also undesirable in a Production environment.

I have written about alternatives and variations to the OotB approach for refreshing Exalytics aggregates previously here and here, namely:

  1. Loading TimesTen aggregates through bespoke ETL, in tools such as GoldenGate and ODI. TimesTen supports a variety of interfaces – including ODBC and JDBC – and therefore can be loaded by any standard ETL tool. A tool such as GoldenGate can be a good way of implementing a light-touch CDC solution against a source database.
  2. Loading TimesTen aggregates directly using TimesTen’s Load from Oracle functionality, taking advantage of Aggregate Persistence to do the aggregate mapping work in the RPD

In both of these cases, there are downsides to the method. Using bespoke ETL is ultimately very powerful and flexible, but has the overhead of writing the ETL, along with requiring manual mapping of the aggregates into the RPD. This mapping work is done for you in the TimesTen Load from Oracle method, but that method can only be used against an Oracle source database, and only where a single physical SQL is required to load the aggregate.

Refreshing aggregates using native OBIEE functionality alone

Here I present another alternative method for refreshing Exalytics aggregates, but using OBIEE functionality alone and remaining close to the OotB method. It is based on Aggregate Persistence but varies in two significant ways:

  1. Incremental refresh of the aggregate is possible
  2. No changes are made to the RPD when the aggregate is refreshed

The method still uses the fundamentals of Aggregate Persistence since, as I mentioned above, it has some very significant benefits:

  • BI Server uses (dare I say, leverages) your existing metadata modelling work, which is necessary – regardless of your aggregates – for users to report from the unaggregated data.
  • BI Server generates your aggregate refresh ETL code
  • If your source systems change, your aggregate refresh code doesn’t need to – just as reports are decoupled from the source system through the RPD metadata layers, so are your target aggregates

For us to understand the new method, a bit of background and explanation of the technology is required.

Background, part 1 : Aggregate Persistence – under the covers

When Aggregate Persistence runs, it does several things:

  1. Remove aggregates from physical database and RPD mappings
  2. Create the physical aggregate tables and indexes on the target database, for the fact aggregate and supporting dimensions
  3. Update the RPD Physical and Logical (BMM) layers to include the newly built aggregates
  4. Populate the aggregate tables, from source via the BI Server to the aggregate target (TimesTen)

What we are going to do here is pick apart Aggregate Persistence and invoke just part of it. We don’t need to rebuild the physical tables each time we refresh the data, and we don’t need to touch the RPD. We can actually just tell the BI Server to load the aggregate table, using the results of a Logical SQL query. That is, pretty much the same SQL that would be executed if we ran the aggregate query from an analysis in the OBIEE front end.

The command to tell the BI Server to do this is the populate command, which can be found from close inspection of the nqquery.log during execution of normal Aggregate Persistence:

populate "ag_sales_month" mode ( append table connection pool "TimesTen aggregates"."TT_CP") as
select_business_model "Sales"."Fact Sales"."Sale Amount" as "Sale_Amoun000000AD","Sales"."Dim Times"."Month YYYYMM" as "Month_YYYY000000D0" 
from "Sales";

This populate <table> command can be sent by us directly to the BI Server (exactly in the way that a standard create aggregate Aggregate Persistence script would be – with nqcmd etc) and causes it to load the specified table (using the specified connection pool) using the logical SQL given. The re-creation of the aggregate tables, and the RPD mapping, doesn't get run.
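For example, if the populate statement were saved to a file (the path here is hypothetical), it could be sent to the BI Server with nqcmd in just the same way as a create aggregates script, using the DSN and credentials seen earlier:

nqcmd -d AnalyticsWeb -u weblogic -p welcome1 -s /u01/app/scripts/populate_ag_sales_month.lsql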

The syntax of the populate command is undocumented, but from observing the nqquery.log file it follows this pattern:
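Based on the statements seen in nqquery.log, it appears to follow this general shape (a reconstruction rather than official syntax):

populate "<target table>" mode ( append table connection pool "<physical database>"."<connection pool>" ) as
select_business_model <logical columns> from "<business model>" [ where <predicate> ];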

The populate statement captured from nqquery.log above is a very simple example of this: an aggregate with a measure summarised by month.

SELECT_BUSINESS_MODEL was written about by Venkat here, and is BI Server syntax allowing a query directly against the BMM, rather than the Presentation Layer which Logical SQL usually specifies. You can build and test the SELECT_BUSINESS_MODEL clause in OBIEE directly (from Administration -> Issue SQL), in nqcmd, or just by extracting it from the nqquery.log.

Background, part 2 : Secret Sauce – INACTIVE_SCHEMAS

So, we have seen how we can take advantage of Aggregate Persistence to tell the BI Server to load an aggregate, from any source we’ve modelled in the RPD, without requiring it to delete the aggregate to start with or modify the RPD in any way.

Now, we need a bit of secret sauce to complete the picture and make this method a viable one.

In side-stepping the full Aggregate Persistence sequence, we have one problem. The Logical SQL that we use in the populate statement is going to be parsed by the BI Server to generate the select statement(s) against the source database. However, the BI Server uses its standard query parsing on it, using the metadata defined. Because the aggregates we are loading are already mapped into the RPD, by default the BI Server will probably try to use the aggregate to satisfy the aggregate populate request (because it will judge it the most efficient LTS) – thus loading data straight from the table that we are trying to populate!

The answer is the magical INACTIVE_SCHEMAS variable. What this does is tell OBIEE to ignore one or more Physical schemas in the RPD and, importantly, any associated Logical Table Sources. INACTIVE_SCHEMAS is documented as part of the Double Buffering functionality. It can be used in any logical SQL statement, so is easily demonstrated in an analysis (using Advanced SQL Clauses -> Prefix):


Forcing an OBIEE query to avoid an LTS, using INACTIVE_SCHEMAS.
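
As a sketch, using the TimesTen physical database and schema names from this example, the Prefix field would contain:

SET VARIABLE INACTIVE_SCHEMAS='"TimesTen Aggregates".."EXALYTICS"';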


So when we specify the populate command to update the aggregate, we just include the necessary INACTIVE_SCHEMAS prefix:

SET VARIABLE INACTIVE_SCHEMAS='"TimesTen Aggregates".."EXALYTICS"': 
populate "ag_sales_month" mode ( append table connection pool 
"TimesTen aggregates"."TT_CP") as  
select_business_model "Sales"."Fact Sales"."Sale Amount" as "Sale_Amoun000000AD","Sales"."Dim Times"."Month YYYYMM" as "Month_YYYY000000D0" 
from "Sales";

Why, you could reasonably ask, is this not necessary in a normal OotB aggregate refresh? For the simple reason that in "vanilla" Aggregate Persistence usage the whole aggregate gets deleted from the RPD before it is rebuilt, and therefore when the aggregate query is executed only the base LTS is enabled in the RPD at that point in time.

The final part of the puzzle – Incremental refresh

So, we have a way of telling BI Server to populate a target aggregate without rebuilding it, and we have the workaround necessary to stop it trying to populate the aggregate from itself. The last bit is making sure that we only load the data we want to. If we execute the populate statement as it stands, straight from the nqquery.log of the initial Aggregate Persistence run, then we will end up with duplicate data in the target aggregate. So we need to do one of the following:

  1. Truncate the table contents before the populate
  2. Use a predicate in the populate Logical SQL so that only selected data gets loaded

To issue a truncate command, you can use the logical SQL command execute physical to get the BI Server to run a command against the target database, for example:

execute physical connection pool "TimesTen Aggregates"."TT_CP" truncate table ag_sales_month

This truncate/load method is appropriate for refreshing dimension aggregate tables, since there won’t usually be an update key as such. However, when refreshing a fact aggregate it is better for performance to use an incremental update and only load data that has changed. This assumes that you can identify the data and have an update key for it. In this example, I have an aggregate table at Month level, and each time I refresh the aggregate I want to load just data for the current month. In my repository I have a dynamic repository variable called THIS_MONTH. To implement the incremental refresh, I just add the appropriate predicate to the SELECT_BUSINESS_MODEL clause of the populate statement:

select_business_model "Sales"."Fact Sales"."Sale Amount" as "Sale_Amoun000000AD","Sales"."Dim Times"."Month YYYYMM" as "Month_YYYY000000D0" 
from "Sales" 
where "Dim Times"."Month YYYYMM" =  VALUEOF("THIS_MONTH")

Making the completed aggregate refresh command to send to the BI Server:

SET VARIABLE DISABLE_CACHE_HIT=1, DISABLE_CACHE_SEED=1, DISABLE_SUMMARY_STATS_LOGGING=1, 
INACTIVE_SCHEMAS='"TimesTen Aggregates".."EXALYTICS"'; 
populate "ag_sales_month" mode ( append table connection pool 
"TimesTen aggregates"."TT_CP") as  
select_business_model "Sales"."Fact Sales"."Sale Amount" as "Sale_Amoun000000AD","Sales"."Dim Times"."Month YYYYMM" as "Month_YYYY000000D0" 
from "Sales" 
where "Dim Times"."Month YYYYMM" =  VALUEOF("THIS_MONTH");

Since there will be data in the table for the current month, I delete this out first, using execute physical:

execute physical connection pool "TimesTen Aggregates"."TT_CP" delete from ag_sales_month where Month_YYYY000000D0 = VALUEOF(THIS_MONTH);

Step-by-step

The method I have described above is implemented in two parts:

  1. Initial build – only needs doing once
    1. Create Aggregate Persistence scripts as normal (for example, with Summary Advisor)
    2. Execute the Aggregate Persistence script to :
      1. Build the aggregate tables in TimesTen
      2. Map the aggregates in the RPD
    3. Create custom populate scripts:
      1. From nqquery.log, extract the full populate statement for each aggregate (fact and associated dimensions)
      2. Amend the INACTIVE_SCHEMAS setting into the populate script, specifying the target TimesTen database and schema.
      3. For incremental refresh, add a WHERE clause to the populate logical SQL so that it only fetches the data that will have changed. Repository variables are useful here for holding date values such as current date, week, etc.
      4. If necessary, build an execute physical script to clear down all or part of the aggregate table. This is run prior to the populate script to ensure you do not load duplicate data
  2. Aggregate refresh – run whenever the base data changes
    1. Optionally, execute the execute physical script to prepare the aggregate table (by deleting whatever data is about to be loaded)
    2. Execute the custom populate script from above.
      Because the aggregates are being built directly from the base data (as enforced by INACTIVE_SCHEMAS) the refresh scripts for multiple aggregates could potentially be run in parallel (eg using xargs). A corollary of this is that this method could put additional load on the source database, because it will be hitting it for every aggregate, whereas vanilla Aggregate Persistence will build aggregates from existing lower-level aggregates if it can.

Summary

This method is completely valid for use outside of Exalytics too, since only the Summary Advisor is licensed separately; Aggregate Persistence itself is standard OBIEE functionality. For an Exalytics deployment in an environment where aggregate definitions and requirements change rapidly, this method would be less appropriate, because of the additional work required to modify the scripts. However, for an Exalytics deployment where aggregates change less frequently, it could be very useful.

The approach is not without drawbacks. Maintaining a set of custom populate commands has an overhead (although arguably no more so than a set of Aggregate Persistence scripts), and the flexibility comes at the cost of putting the onus of data validity on the developer. If an aggregate table is omitted from the refresh (for example, a supporting aggregate dimension table) then reports will show erroneous data.

The benefit of this approach is that aggregates can be rapidly built and maintained in a sensible manner. The RPD is modified only in the first step, the initial build. It is then left entirely untouched. This makes refreshes faster, and safer; if it fails there is just the data to tidy up, not the RPD too.

Testing aggregate navigation on OBIEE and Exalytics


One of OBIEE’s many great strengths is aggregate navigation; the ability to choose from a list of possible tables the one which will probably give the optimal performance for a given user query. Users are blissfully unaware of which particular table their query is being satisfied from, since aggregate navigation happens on the BI Server once the user’s request comes through from an Analysis or Dashboard.

This seamless nature of aggregate navigation means that testing that specific aggregates are working can be fiddly. We want to ensure that the aggregates we've built are (i) being used when appropriate and (ii) showing the correct data. This is particularly the case in Exalytics, when aggregates are put into the in-memory database (TimesTen) by the Summary Advisor and we need to validate them.

Whilst the log file nqquery.log (or Usage Tracking table S_NQ_DB_ACCT) tells us pretty easily which table a query used, it is nice to be able to switch a query easily between possible aggregate sources to be able to compare the data. This blog demonstrates how we can use the INACTIVE_SCHEMAS variable (as described in my previous blog on loading Exalytics incrementally) to do this.

INACTIVE_SCHEMAS is a Logical SQL variable that tells the BI Server to exclude the specified physical schema(s) from consideration for resolving an inbound query. Normally, the BI Server will parse each incoming query through the RPD, and where a Logical Table has multiple Logical Table Sources it will evaluate each one to determine if it (a) can satisfy the query and (b) whether it will be the most efficient one to use. By using INACTIVE_SCHEMAS we can force the BI Server to ignore certain Logical Table Sources (those associated with the physical schema specified), ensuring that it just queries the source(s) we want it to.

In the following example, the data exists in both an Oracle database and TimesTen (in-memory). Whilst the example here is based on an Exalytics architecture, the principle should be exactly the same regardless of where the aggregates reside. This is how the RPD is set up for the Fact table in my example:

The GCBC_SALES schema on Oracle holds the unaggregated sales data, whilst the EXALYTICS schema on TimesTen has an aggregate of this data in it. The very simple report pictured here shows sales by month, and additionally uses a Logical SQL view to show the contents of the query being sent to the BI Server:

Looking at nqquery.log we can see the query by default hits the TimesTen source:

[...]
------------- Sending query to database named TimesTen aggregates
WITH
SAWITH0 AS (select distinct T1528.Sale_Amoun000000AD as c1,
     T1514.Month_YYYY000000D0 as c2
from
     SA_Month0000011E T1514,
     ag_sales_month T1528
[...]

Now, for thoroughness, let’s compare this to what’s in the TimesTen database, using a Direct Database Request:

OK, all looks good. But does what we've aggregated into TimesTen match what we've got in the source data on Oracle? Here we can use INACTIVE_SCHEMAS to force the BI Server to ignore TimesTen entirely. We can see from the nqquery.log that OBI has now gone back to the Oracle source of the data:

[...]
------------- Sending query to database named orcl
WITH
SAWITH0 AS (select sum(T117.FCAST_SAL_AMT) as c1,
     T127.MONTH_YYYYMM as c2
from
     GCBC_SALES.TIMES T127 /* Dim_TIMES */ ,
     GCBC_SALES.SALES T117 /* Fact_SALES */
[...]

and the report shows that actually we have a problem in our data, since what’s on the source doesn’t match the aggregate:

A Direct Database Request against Oracle confirms the data we’re seeing – we have a mismatch between our source and our aggregate:

This is the kind of testing that it is crucial to perform. Without proper testing, problems may only come to light in specific reports or scenarios, because aggregate navigation, by its very nature, works silently and hidden from the user.

So this is the feature we can use to perform the testing, but below I demonstrate a much more flexible way than having to build multiple reports.

Implementing INACTIVE_SCHEMAS

Using INACTIVE_SCHEMAS is very simple, and doesn't require modification to your reports. Simply use a Variable Prompt to populate INACTIVE_SCHEMAS as a Request Variable. Disable the Apply button for instantaneous switching when the value is changed.

A Request Variable will be prepended to any logical SQL sent to the BI Server. Save this prompt in your web catalog, and add it to any dashboard on which you want to test the aggregate:

Even better, if you set the security on the dashboard prompt such that only your admins have access to it, then you could put it on all of your dashboards as a diagnostic tool and only those users with the correct privilege will even see it:

Displaying the aggregate source name in the report

So far this is all negative , in that we are specifying the data source not to use. We can examine nqquery.log etc to confirm which source was used, but it’s hardly convenient to wade through log files each time we execute the report. Ripped off from Inspired by SampleApp is this trick:

  1. Add a logical column to the fact table
  2. Hard code the expression for the column in each Logical Table Source
  3. Bring the column through to the relevant subject area
  4. Incorporate it in reports as required, for example using a Narrative View.
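As a minimal sketch of steps 1 and 2 (the column name and string literals below are purely illustrative), the new logical column is simply mapped to a different hard-coded string literal in each Logical Table Source, so whichever source the BI Server picks is reported back in the result set:

-- "Data Source" column expression in the TimesTen Logical Table Source:
'TimesTen aggregate'

-- "Data Source" column expression in the Oracle Logical Table Source:
'Oracle GCBC_SALES'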

Bringing it all together gives us this type of diagnostic view of our reports:

Summary

There’s a variety of ways to write bespoke test reports in OBI, but what I’ve demonstrated here is a very minimal way of overlaying a test capability on top of all existing dashboards. Simply create the Request Variable dashboard prompt, set the security so only admins etc can see it, and then add it in to each dashboard page as required.

In addition, the use of a ‘data source’ logical column in a fact table tied to each LTS can help indicate further where the data seen is coming from.

MDS XML versus MUDE Part 2: What is MUDE?


Mark Rittman once joked during a presentation on OBIEE and SCM that “MUDE” (multiuser development environment) was the closest thing to a dirty word in the OBIEE documentation. I know many people who feel that way… some justified, some perhaps not, as MUDE has certainly gotten better in the later releases of OBIEE. As I mentioned in the introduction, I’d like to introduce MUDE in this post to set the stage for, at the very least, how good a competing multiuser development methodology would need to be. I’m not a genuine fan of MUDE… but I promise not to present a straw man to rip apart in my later posts. If you feel I don’t give it a fair shake, please comment and let me know. If you are interested in some other thorough treatments around MUDE, Venkat wrote about it on his old blog when the feature was first introduced in 10g, as did Mark here on this blog.

Let me start by putting forth what I think are the SDLC (software development life-cycle) imperatives that any solution to metadata development (or any other kind of development) should tackle without question:

  1. Multiple users should be able to develop concurrently… but we need to be clear exactly what we mean by this. Having all the developers login to a shared online repository is one way to do multiuser development… but this equates to multiuser development using serialized access to objects. Whether explicit or not in the requirements… what most organizations want is non-serialized access to objects… meaning that multiple users can be making changes to the same objects at the same time, and we will be able to merge our changes together, and handle conflicts.
  2. Our VCS should give us the ability to rollback to previous versions of our code, either for comparative purposes… or because we want to rewind certain streams of development because of “wrong turns”.
  3. Our VCS and supporting methodology should provide us the ability to tag, secure and package up “releases”… to be migrated to our various environments eventually finding their way to production.

So let’s see if MUDE is up to the challenge. To enable MUDE, I start by defining one or more Projects in my binary RPD. So, I open the RPD in the Admin Tool, choose the Manage option from the toolbar, and select Projects… from the drop-down list:

MUDE Manage Projects

Next… I choose all the objects I want to include in the project. Our choice for which logical tables to include from the Business Model (BMM) is driven by which logical fact tables we choose, using either individual Presentation Subject Area measure folders, or explicit BMM logical fact tables. We also select any other objects we want to include in the project: explicit Presentation layer folders, init blocks, variables, etc.:

MUDE Define Project

Once we have configured one or more projects in the repository, we then take the binary RPD file and place it on a network share accessible to all developers. From here on out, the RPD file is referred to as the Master Repository. Now, whenever we want to check out a project from the Master Repository, we register the shared network drive location in the Admin Tool. We get to the configuration screen by choosing Tools and then Options… and then choosing the Multiuser tab:

MUDE master location

Once a Master Repository exists in the development directory specified in the Admin Tool, we are then able to check out a project from the Master Repository. To check out a project, we choose the File menu, and from the drop-down menu, we select Multiuser and then Checkout…. Then we choose the project we want to check out:

MUDE Multiuser Checkout

MUDE Select Project

The Admin Tool now prompts us to create a new “subset repository”. This is a fully-functioning binary RPD file containing only the subset of objects that were defined in the project. We can call this RPD anything we like–in my example I called it gcbc_project1.rpd–and it will exist in the “repository” directory buried in the bifoundation directory underneath our instance directory, for instance: [middleware_home]\instances\instance1\bifoundation\OracleBIServerComponent\coreapplication_obis1\repository:

MUDE Subset Repository

Once we have checked out our project, it’s interesting to see the files that the Admin Tool generates behind the scenes to manage MUDE. If we look in the Master Repository directory (this was the C:\master directory we configured in the screenshot above), we should have the following files:

  • gcbc.rpd: The original Master Repository binary RPD file
  • gcbc.000: A backup of the Master Repository RPD file. We will constantly see new versions of the backup file as development continues on the Master Repository, and we’ll see the numeric extension increment over time.
  • gcbc.mhl: The history file… this tracks all the history of changes, when they were performed, and by whom. This file can be output to an XML file using the mhlconverter utility so that it’s readable without the Admin Tool.

Master Repository Directory Contents

Now, if we look in our local repository directory, we have the following files:

  • gcbc_project1.rpd: The subset binary RPD file we created when we checked out our project above.
  • originalgcbc_project1.rpd: Initially… this is an exact copy of the subset binary RPD at the time of checkout. It will not be affected by changes to the actual subset RPD, as its purpose is to serve as the “original” copy of the RPD during the 3-way merge. Also… it facilitates the Discard Local Changes option in the Multiuser menu.

MUDE Repository Directory Inventory

In looking at this infrastructure… there’s nothing magic happening here: MUDE simply uses the basic functionality of the Merge… and the Compare… that are built into the Admin Tool. I’m not saying this disparagingly. There’s no reason to maintain the code path for two different versions of a 3-way merge. MUDE is really nothing more than a framework for making the Compare… and Merge… functionality easier and more predictable. So, it’s not shocking when we choose Compare with Original… from the Multiuser menu and see a screen identical to the basic Compare… option without MUDE:

MUDE Compare with Original

MUDE Compare Differences

Similarly, when we choose the multiuser option Publish to Network… we see a window identical to the standard Merge… option in the Admin Tool:

MUDE Publish Conflicts

So this was a high-level look at MUDE… hopefully I’ve done it justice. Now I’d like to discuss where I think MUDE comes up short as a competent VCS, or SCM… or whatever solution we think it is. On our list of the three imperative boxes that a metadata development solution should tick, I think number (1) is a slam-dunk. Although there have been issues with MUDE… almost every BI or ETL tool I’ve worked with over the years has issues in this area. I think there is also a fair argument to be made that MUDE also ticks the boxes for (2) and (3). But is it complete enough in these areas?

Although MUDE provides us the ability to interact with previous versions of our code, it does this with a “siloed”, metadata-only approach. If I were building an entire BI solution, I would want to associate my metadata repository with my web catalog, and also presumably with my database DDL scripts and ETL routines, regardless of which tool I used for that. MUDE only handles the RPD. If I could somehow figure out how to use a standard VCS such as Git or Subversion to check in all my code into one place, then I could see how everything looked at a particular point in time. The same goes with tagging and packaging releases. MUDE makes it easy to prepare a binary RPD file for release, but it provides no benefit when it comes to packaging a release for the entire BI system. I want a more pervasive solution.

I also think MUDE misses the mark with the conflict resolution workflow, which is depicted in the screenshot directly above. The Define Merge Strategy dialog occurs for individual developers when they try to publish changes back to the Master Repository. I would argue that the handling of conflicts should not be the developer’s job. Suppose I add the logical column % of Discount to a logical fact table as depicted above. If my change conflicts with a change from another developer a continent away, am I really in a position to be able to determine the appropriate conflict resolution at that point in time? I may not even know the other developer… or understand why we are both making changes to the same logical column. So regardless of whether conflicts arise, developers should be able to “publish” their changes to be resolved downstream by the source master. This source master role may be a part-time or full-time role… but this is the person whose job it is to resolve conflicts. So our SDLC solution needs to support the decoupling of multiuser conflict resolution from the development process.

In the next post, I’m going to take a look at the combination of MDS XML and Git. I’ll talk a little bit about Git, and why it’s superior to Subversion for our purposes. I’ll see if this combination can tick all the right boxes I described above.

Embedding a D3 Visualisation in OBIEE


In this blog entry, my first since joining the company as a consultant a little over a month ago, I will be taking you through the process of embedding a D3 visualisation into OBIEE. In my first few weeks in this new role I’ve had the pleasure of working with the forward-thinking people at Nominet, and one of the technologies they’ve exploited in their BI solution is D3. They are using it to display daily domain name registrations by postcode on a map of the UK within OBIEE – very cool stuff indeed, and after seeing it I was sufficiently inspired to have a go myself. We’ll start with an overview of D3 itself and then move on to a worked example where we will look at why you may want to use D3 in OBIEE and the best way to go about it. This blog entry is not intended to be a D3 tutorial – there are plenty of very good tutorials out there already – it is more targeted at OBIEE developers who may have heard of D3 but haven’t yet had a “play” (yes, it’s great fun!) with it in the context of OBIEE. So without further delay let’s begin…

What is D3 ?

To answer this question I will first tell you what D3 isn’t – D3 is not a charting library; you will find no predefined bar charts, no line graphs, no scatter plots, not even a single pie chart. “What, no pie charts?!” I hear you say. Don’t despair though, you can create all these types of visuals in D3 and a whole lot more. D3, which is short for “Data-Driven Documents”, is a visualisation framework written entirely in JavaScript by Mike Bostock. The term framework is the key word here: D3 is generic in nature, which allows it to tackle almost limitless visualisation challenges. The downside though is that you have to tell D3 exactly what you want it to do with the data you throw at it, and that means rolling up your sleeves and writing some code. Fear not though, D3 has a great API that will enable you to get a fairly basic visual up and running in a short space of time. You’ll need a basic understanding of HTML, CSS and SVG and of course some JavaScript, but with the help of the online tutorials and the D3 API reference you’ll be up and running in no time.

How does it work ?

D3’s API allows you to bind data to placeholders in the browser’s DOM. You can then create SVG or HTML elements in these placeholders and manipulate their attributes using the dataset you pass to it. Below is the code to display a very simple D3 visualisation, to give you a basic idea of how it works.


var dataset = [ 10, 15, 20 ];                    

var svg = d3.select("body")               
               .append("svg")                                    
               .attr("width", 200) 
               .attr("height", 200); 

svg.selectAll("circle") 
               .data(dataset) 
               .enter() 
               .append("circle") 
               .attr("fill","none") 
               .attr("stroke", "green") 
               .attr("r",function(d){  
                    return d ; 
               }) 
               .attr("cy",100)
               .attr("cx",function(d){ 
                    return d * 8; 
               });

This code produces the following visual. Admittedly it’s not very impressive to look at, but for the purpose of this example it will do just fine.

Let’s break this down and see what’s going on.

var dataset = [ 10, 15, 20 ]; 

Above is our data in a JavaScript array, no D3 here. We’ll look at how to generate this from OBIEE a bit later on.

var svg = d3.select("body") 
               .append("svg")                                    
               .attr("width", 200) 
               .attr("height", 200); 

Here we are seeing D3 for the first time. This is a single line of code split over several lines so it’s easier to read. Anyone familiar with jQuery will notice the chaining of methods with the “.” notation. The first line selects the html body tag and then appends an SVG element to it. The two “.attr()” methods then set the width and the height of the SVG.

 svg.selectAll("circle") 
               .data(dataset) 
               .enter() 
               .append("circle") 
               .attr("fill","none") 
               .attr("stroke", "green") 
               .attr("r",function(d){  
                    return d ; 
               }) 
               .attr("cy",100)
               .attr("cx",function(d){ 
                    return d * 8; 
               }); 

Here’s where it gets interesting. The first line selects all circles within the SVG – but wait, we haven’t created any circles yet! This is where the placeholders I mentioned earlier come into play. These placeholders exist in memory only and are waiting to have data bound to them and then finally an actual circle element. The next line binds the data to the placeholders: 10 to the first one, 15 to the next and 20 to the last. The magic .enter() method will then execute all the remaining statements once for each of our data elements, in this case 3 times. The .append("circle") will therefore be called 3 times and create 3 circle elements within the SVG.

The remaining statements change attributes associated with the circles, and this is where you can use the data to “drive” the document. Notice the attr("r"… and attr("cx"… method calls. The "r" attribute defines the circle radius and the "cx" attribute sets the centre "x" coordinate of the circle. Both these methods have functions passed as arguments; the "d" variable in each function represents the current data element (or datum) in the array – 10 the first time, 15 the next and 20 on the last. These functions then return a value to the attr() methods. In this case the radius of each circle is being set to 10, 15 and 20 respectively, and the x coordinate of each is being set to 80, 120 and 160 respectively.

Right, that’s a basic overview of D3 and how it works, let’s now have a look at an example where you might want to use D3 in OBIEE.

A Worked Example

Let’s say we have some customers and each month, with a bit of luck (and with the help of a Rittman Mead-implemented BI system!) we sell to those customers. Based on the sales total for each customer each month we then place these customers into a scoring group: Group 1 for customers for which revenue exceeded £10,000, Group 2 for revenue greater than £5,000, Group 3 for revenue greater than £2,000 and Group 4 for revenue greater than £0. At the end of each month we want to compare each customer’s current scoring group with the scoring group they occupied the previous month. It’s the movement of customers between groups each month that we are interested in analysing.

Here’s the criteria we have in OBIEE Answers.

And here are the results.


As you can see, the Movement column is calculated as (Group Last Month – Group This Month).

Now at this point you might say “Job done, nothing more to do” – we can clearly see this month’s and last month’s scoring group and the movement between the two, and we have successfully conveyed the information to the user with minimal fuss – and you’d be right. We could even develop the results further by conditionally formatting the colour of each row to reflect the current group, and maybe we could add an additional column containing an up arrow, down arrow or equals sign image to indicate the direction of movement. This is where it gets subjective: some users love visuals, some prefer the raw data, and some the combination of the two. For this example we are going to try and visualise the data, and you can make up your mind which you prefer at the end.

Let’s see what we can produce using some standard OBIEE visuals.

First up, the Vertical Bar Chart:

That’s not bad: we can see by how many groups a customer has shifted since last month, with some customers clearly staying put. What we can’t see is which groups they moved to and from, and if we were to add the Group Last Month and Group This Month measures to the chart it would look a complete mess.

Let’s try something else – a line graph, only this time we’ll include all three measures to make sure we’ve got all the information the user requires.

Mmmm, again not bad, but to my mind it’s not particularly intuitive: you have to study the graph to get the information you’re after, by which time most users will have switched off completely and may be better off with a simple table view.

Let’s try again, this time with a combined line and bar chart.

A slight improvement on the previous attempt, but still not great. An issue I have with the last two examples is that to my mind customers in Group 1 (the best group) should be at the top of the chart while customers in Group 4 should be at the bottom of the chart – the cream rises to the top, right?! We have no way of inverting the Y axis values, so we are stuck with this rather counterintuitive view of the data. We could ask the users if they would kindly change their grouping system, but it’s unlikely they’ll agree, and anyway that would be cheating!

Here’s one last attempt, just for fun!

Nope, we’re not making much progress here – this is just making me feel nauseous, and believe me when I tell you I came up with several more ridiculous solutions than this. Pie chart, anyone? So far the table view has been the clear winner over the visualisations in terms of ease of interpretation, and this isn’t a dig at OBIEE’s visuals: OBIEE’s visuals are great at doing what they are designed to do, and they do it quickly. A D3 solution will take time and planning (and debugging), but you will have total control over the output and you’ll be able to express your data in a way that no OBIEE visualisation will ever be able to do. So with that in mind let’s have a look at one possible solution written in D3.

Developing a D3 Solution

In order to embed a D3 visualisation in OBIEE you’ll need to use a Narrative view. The Narrative view will enable you to gain access to the data that we need to drive our visualisation using the @n substitution variables where n equals the column position in the criteria. We’ll use this technique to generate some JavaScript from our analysis results that we can then use in our D3 script. Let’s look at doing that now.

In the Prefix section at the top we are declaring a JavaScript array variable called “data” that will contain the data from the analysis. The Narrative section contains the following code:

data.push({customer:"@1",prev_group:@2,cur_group:@3,delta:@4});

Because this line of code is in the Narrative section it will be output once for each row in the result set, each time substituting @1, @2, @3 and @4 for Customer, Group Last Month, Group This Month and Movement respectively, and will dynamically generate the JavaScript to populate our array. Notice that within the parentheses we have this format:

{ key:value, key:value, key:value } 

This will in fact create a JavaScript object in each array element, so we can then reference our data in our D3 script using the customer, prev_group and delta properties of each element to get at the values. As you can see, below the postfix section we now have a load of JavaScript code ready to be used in our D3 script for testing and development purposes.
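For illustration, the generated output (assuming the Prefix declares data as an empty array) looks something like this – the customer names and values here are made up:

var data = [];
data.push({customer:"Customer A",prev_group:2,cur_group:1,delta:1});
data.push({customer:"Customer B",prev_group:3,cur_group:3,delta:0});
data.push({customer:"Customer C",prev_group:1,cur_group:4,delta:-3});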

At this point I would strongly recommend leaving OBIEE and opening up your favourite development IDE, you will soon get frustrated writing JavaScript code directly into OBIEE. Personally I like NetBeans but there are several other free alternatives out there.

To get started in your favourite IDE you can either download a copy of the D3 library from http://d3js.org/ or reference the online version from within your script. I’d recommend uploading a copy to the OBIEE server once you’ve finished developing and are ready to go into production. If you want to reference the online version you will need to include the following snippet between the <head></head> tags in your html file.

<script src="http://d3js.org/d3.v3.min.js" charset="utf-8"></script>

So to start developing your D3 solution create a new HTML project in your IDE. Add a reference to the D3 library and then copy and paste the generated JavaScript code from the OBIEE narrative between some <script> tags. You should end up with something like this:-
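As a rough sketch (the div id and test values are hypothetical), the test harness is just a plain HTML page: the D3 library reference, the hard-coded data copied from the Narrative view output, and your D3 code underneath:

<!DOCTYPE html>
<html>
<head>
<script src="http://d3js.org/d3.v3.min.js" charset="utf-8"></script>
</head>
<body>
<div id="viz"></div>
<script>
// Test data copied from the OBIEE Narrative view output
var data = [];
data.push({customer:"Customer A",prev_group:2,cur_group:1,delta:1});
data.push({customer:"Customer B",prev_group:3,cur_group:3,delta:0});

// Custom D3 code goes here, e.g. appending an SVG to the div above
var svg = d3.select("#viz").append("svg").attr("width", 400).attr("height", 200);
</script>
</body>
</html>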

You are now ready to make a start, simply save the project and open up the resulting HTML file in your favourite browser to test any changes you make along the way.

When you are ready to embed the finished code into OBIEE you’ll first need to decide from where you wish to reference the D3 Library and the D3 Code you have just written. With regards to the D3 library as I mentioned earlier you can either reference the online version of the D3 library or copy the library to the OBIEE server and reference it from there. With the custom D3 code you’ve written you can again either upload the code to a file on the OBIEE server or you can just copy and paste your code directly into the narrative. I’d recommend uploading it to the OBIEE server so you can reference it in other analyses at a later date but for now let’s just paste it in.

Let’s have a look at the completed Narrative View.

The first thing to note is that the “Contains HTML Markup” checkbox is ticked; this is required as we have now entered some <script> tags and a <div> tag, and without it ticked OBIEE will not interpret them correctly. The first <script> tag in the prefix section references the online D3 library. The second <script> tag in the prefix section is closed at the start of the postfix section and wraps around the “data” variable and the JavaScript used to populate it. Below the closing </script> tag in the postfix section we create an HTML DIV element that will contain the SVG created by the D3 script. Finally we either enter a reference to our custom D3 script on the OBIEE server or just paste it in between script tags. One important thing to note is that when we transfer the code from our IDE to OBIEE we only want to bring across the D3 code; we want to leave behind the “data” variable and all the HTML tags, as OBIEE will be generating these for us.

So let’s take a look at the solution written in D3.

As you can see this contains all the information we require. We can clearly see the current group that the customer occupies and the movement, if any, from the group they occupied the previous month. This could easily be enhanced so that when you hover your mouse over a customer the raw data is presented to the user in some way – you’d simply need to change the narrative section to include a new column from the criteria and then write the required code in D3 to display it. You really are only limited by your imagination. It may take more time and effort to produce a D3 solution than using a standard OBIEE visual, and yes, as with all customisations, you are exposed from a risk point of view when it comes to upgrades and when the person who developed the solution decides to go travel the world, but for those edge cases where you just can’t get the right result using a standard OBIEE visual, D3 may just save your bacon.

Below is a live demo of the visualisation outside of OBIEE, it is seeded from random data each time you press the reload button.

It’s been tested in Firefox, Chrome, Safari and IE9. IE versions 8 and below do not natively support SVG although there are various workarounds, none of which have been implemented in this example.

And here’s a link to the code on jsfiddle.net so you can have a play with it yourself. D3 Demo

So in closing here are some tips around developing D3 in OBIEE.

  • Try to sketch out on paper what you want your D3 solution to look like and then try to replicate it with code (the fun bit!).
  • Head over to d3js.org for some inspiration, there are some truly mind blowing examples on there.
  • Use the Narrative view to generate some hard coded data, you can then use it for development and testing purposes.
  • Don’t develop directly in OBIEE, use a purpose-built development IDE
  • Once you’re ready to go into production with your solution upload the D3 library and your custom code to the OBIEE server and reference it from there.

Have fun….

MDS XML versus MUDE Part 3: Configuring MDS XML with Git


In the previous installment of this series, I gave an overview of MUDE (multi-user development environment), and described its strengths and weaknesses, hopefully presenting a fair assessment on both sides. For a recap, here are three properties that I think any metadata layer development (and really… any development period) should adhere to:

  1. Multiple users should be able to develop concurrently… but we need to be clear exactly what we mean by this. Having all the developers login to a shared online repository is one way to do multiuser development… but this equates to multiuser development using serialized access to objects. Whether explicit or not in the requirements… what most organizations want is non-serialized access to objects… meaning that multiple users can be making changes to the same objects at the same time, and we will be able to merge our changes together, and handle conflicts.
  2. Our VCS should give us the ability to rollback to previous versions of our code, either for comparative purposes… or because we want to rewind certain streams of development because of “wrong turns”.
  3. Our VCS and supporting methodology should provide us the ability to tag, secure and package up “releases”… to be migrated to our various environments eventually finding their way to production.

Although I gave MUDE a passing grade, no one is making the honor roll. I wasn’t pleased that MUDE is siloed: it is a VCS applicable only to the metadata repository. I want an all-in approach… a standard SCM methodology that could be applied to the RPD, the web catalog, ETL artifacts, etc. Additionally, I want to decouple the development process from the conflict resolution process: having our own changes to publish to the master repository does not qualify us to determine whether our changes should override those of other developers. That’s a job for the role I defined as the source master. So now, I want to see if an approach using a VCS (specifically Git) combined with MDS XML could match or surpass the grade I gave MUDE.

So I’m going to introduce MDS XML first of all. I won’t spend too much time on this, as Mark presented the case for it very well in a previous post, although he used Subversion as his VCS. Also… Christian Screen in his OTN article describes how to install Git, and get it configured with the Admin Tool. As I described in the introduction to this series, MDS XML is an XML-based standard, not unlike XUDML (but certainly not the same either), using Oracle Metadata Services as a standard instead. Flavius Burca provides a very good non-OBIEE primer to MDS, where he describes the MDS schema’s role in Fusion Middleware proper. So Oracle already heavily utilized the MDS standard, and being that OBIEE is technically Fusion Middleware, the product management team didn’t have to look far for a replacement for the binary structure in the RPD file, or the UDML/XUDML API that accompanies it.

We can use MDS XML by creating a brand new metadata repository specifying XML from the beginning, or converting an existing binary RPD to XML using Copy As|Save As in the Admin Tool:

XML New RPD

Copy As MDS XML

Notice that when we create an MDS XML metadata repository, we specify a directory location (what I defined as the “core directory” in my introduction) instead of a file location. That’s because we are defining the repository as a series of strongly related XML files organized tightly within a directory structure underneath our core directory. We can see this in the screenshot I first shared in that same introduction:

MDS XML Directory Contents

After we have created an XML repository, we can open it normally, just as we would with a binary RPD. So instead of navigating to an absolute file location, we navigate to the core directory location, as demonstrated below:

Open MDS XML Core Directory

The first time we open an MDS XML repository, we are prompted with a dialogue that defines our source control strategy. So the Admin Tool is interested in how we want to develop with MDS XML… whether we want to register this metadata repository with a VCS or not. This amounts to providing the Admin Tool the capability of issuing particular commands for our selected VCS to make it a little easier to work with. We also provide the configuration file–or SCM template–that provides the Admin Tool the logic necessary to work with a specific VCS.

Git SCM Template

The scm-conf-git.template.xml file specified above is a custom configuration file, as OBIEE only ships with support for Subversion and something called ADE (which I’ve been told is the version control system that Oracle uses internally). So we don’t get Git support (pardon the pun) right out of the box?

There are two solutions to this problem. First and foremost… we don’t need the Admin Tool configured with a template to work with a VCS at all. When the product of our metadata development process is XML, we can be our own proxy to Git, using the command-line tools, or any number of supported Git clients. But secondly… the OBIEE development team was smart to provide an API for allowing us to connect with different VCS providers. Creation of the scm-conf-git.template.xml specified above is demonstrated in Christian Screen’s OTN article that I mentioned above, and you can see the specific mapping logic pictured below.

Screen Shot 2013-07-02 at 2.36.20 PM

Whenever the Admin Tool creates, modifies, or deletes individual XML files as part of the repository development process, the Admin Tool keeps Git tuned-in to these changes with the commands configured in the SCM template. This is nice… but the Admin Tool will only do so much. It doesn’t handle the initial registration of the MDS XML repository with our VCS. So we can do that either with a client, or the command line, which is demonstrated below. The git status commands I issue are only for informational purposes and not required:

MDS XML Git Add
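A minimal sketch of that initial registration from the command line (the core directory path here is just an example; the git status calls are informational only):

cd /home/oracle/repositories/gcbc
git init
git status
git add .
git status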

Also, the Admin Tool is not configured to do an actual “commit” either, which is noticeable when viewing the SCM template listed above. I assume this was a design decision by the development team at Oracle… though I have to admit: it’s not one I agree with. It’s the safe choice I guess… keeping their hands (and consciences) clean by not getting involved with VCS commits. But in the end… it’s a silly decision, as the whole point of a VCS is to be able to rollback to particular commits, or even undo them completely. Regardless… no matter what VCS we choose, we have to use an external process of some kind to issue our commits:

MDS XML Git Commit
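Again as a sketch – the commit itself is a single command, with whatever message suits your workflow:

git commit -m "Initial import of MDS XML repository"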

So there we have it. I’ve gotten through all the precursors. In the next post, I can finally start demonstrating how we can do multi-user development using MDS XML and Git.

Patch OBIEE the quicker way – with OPatch napply


Since 2012, Oracle’s patching strategy for OBIEE has been Bundle Patches released approximately once a month. These bundle patches are usually cumulative ones, applied on top of the .0 version of the product. Patching is done through Oracle’s standard OPatch tool, which manages the application of patches along with an inventory of them and rollback if necessary.

I’ve previously written about the overall patching process here. OPatch is part and parcel of an OBIEE sysadmin’s life, so I wanted to share this short article to show the quicker way to apply the PSUs. It uses a more direct way than the patch documentation describes, taking advantage of the napply option of OPatch (documented here). By using this option OPatch will apply all listed patches in one go, rather than one at a time. As well as this, we can use the silent flag to stop OPatch from prompting to apply each patch in turn.

  1. Download the necessary patches – for 11.1.1.7.1 this is 16569379 and 16556157. In a server environment you can use wget to download the patches as detailed here.
  2. Validate the checksums for the downloaded files, to make sure they didn’t get corrupted during download. Use the Digest link when downloading to view the checksums. For example, the Linux x86-64 checksums are :
    p16556157_111170_Linux-x86-64.zip   2673750617 bytes 
    MD5 D3DDDEC4CB189A53B2681BA6522A0035
    
    p16569379_111170_Linux-x86-64.zip   93617 bytes 
    MD5 2BC0E8B903A10311C5CBE6F0D4871E31
    
  3. Unzip the patches. Within the main patch (16556157) there are a series of further zip archives – unzip these too
  4. Put all the patch folders in a single folder, so it looks something like this:

    Patches on Linux

    Patches on Windows

  5. Take backups, as described in the patch documentation.
  6. Set your environment variables, setting PATCH_FOLDER to the folder you unzipped the patches to in step 4 above, and ORACLE_HOME to your FMW_HOME/Oracle_BI1 folder
    1. Windows:
      set PATCH_FOLDER=Y:\installers\OBI\11.1.1.7\win-x86-64_11.1.1.7.1
      set ORACLE_HOME=c:\oracle\middleware\Oracle_BI1
      
      set JAVA_HOME=%ORACLE_HOME%\jdk
      set PATH=%ORACLE_HOME%\OPatch;%JAVA_HOME%\bin;%ORACLE_HOME%\bin;%PATH%
      
    2. Linux:
      export PATCH_FOLDER=/mnt/hgfs/installers/OBI/11.1.1.7/linux-x86-64_11.1.1.7.1
      export ORACLE_HOME=/home/oracle/obiee/Oracle_BI1
      
      export JAVA_HOME=$ORACLE_HOME/jdk
      export PATH=$ORACLE_HOME/OPatch:$JAVA_HOME/bin:$ORACLE_HOME/bin:$PATH
      
  7. Shut down OPMN, the Managed Server and the Admin Server
  8. Apply all the patches in one go, with no prompting:
    1. Windows:
      opatch napply -silent %PATCH_FOLDER% -id 16453010,16842070,16849017,16850553,16869578,16916026,16569379
      
    2. Linux:
      opatch napply -silent $PATCH_FOLDER -id 16453010,16842070,16849017,16850553,16869578,16916026,16569379
      
  9. Validate that they’ve been applied – the following should list all seven patches plus the bugs they fix:
    opatch lsinventory
    
  10. Per the instructions in the README.html for patch 16453010 for post-patch actions:
    1. Windows:
      del %ORACLE_HOME%\bifoundation\web\catalogmanager\configuration\org.eclipse.osgi
      del %ORACLE_HOME%\bifoundation\web\catalogmanager\configuration\org.eclipse.equinox.app
      
      copy %ORACLE_HOME%\clients\bipublisher\repository\Tools\BIPublisherDesktop*.exe %ORACLE_HOME%\..\user_projects\domains\bifoundation_domain\config\bipublisher\repository\Tools
      
      copy %ORACLE_HOME%\clients\bipublisher\repository\Admin\DataSource\msmdacc64.dll %ORACLE_HOME%\..\user_projects\domains\bifoundation_domain\config\bipublisher\repository\Admin\DataSource
      
      for /d /r %ORACLE_HOME%\..\user_projects\domains\bifoundation_domain\servers\bi_server1\tmp\_WL_user\bipublisher_11.1.1 %d in (jsp_servlet) do @if exist "%d" rd /s/q "%d"
      
    2. Linux:
      rm -rv $ORACLE_HOME/bifoundation/web/catalogmanager/configuration/org.eclipse.osgi
      rm -rv $ORACLE_HOME/bifoundation/web/catalogmanager/configuration/org.eclipse.equinox.app
      
      cp $ORACLE_HOME/clients/bipublisher/repository/Tools/BIPublisherDesktop*.exe $ORACLE_HOME/../user_projects/domains/bifoundation_domain/config/bipublisher/repository/Tools/
      
      cp $ORACLE_HOME/clients/bipublisher/repository/Admin/DataSource/msmdacc64.dll $ORACLE_HOME/../user_projects/domains/bifoundation_domain/config/bipublisher/repository/Admin/DataSource
      
      rm -rfv $ORACLE_HOME/../user_projects/domains/bifoundation_domain/servers/bi_server1/tmp/_WL_user/bipublisher_11.1.1/*/jsp_servlet
      

    (NB the msmdacc64.dll didn’t exist on either of my installations that I’ve tried this on)

  11. Start up Admin Server, Managed Server, and OPMN. Login to OBIEE and check the new version:
  12. Don’t forget to check the README.html for patch 16453010 for full instructions on updating the client, customised skins, mapviewer config, etc.

Learn OBIEE 11.1.1.7 from the experts!


Here at RittmanMead we write and deliver our own courses on Oracle’s products including OBIEE, ODI, OWB and EPM. Our trainers also consult and deliver projects, meaning that our training is based on real-world usage of the products. We offer public courses in the UK, US, India and Australia, as well as private courses for clients the world over. Private courses can be customised to meet a client’s precise requirements.

We are proud to announce the latest version of our OBIEE training, which has been updated for the latest version of OBIEE, 11.1.1.7. The first public delivery of this course will be in our Brighton offices on July 22nd. RittmanMead co-founder and Oracle ACE Director Mark Rittman will be delivering part of it, as well as Principal Trainer Robin Moffatt.

Sign up online now to book your place, and benefit from a 10% discount using code 3202JUL13:

You can also join us for two shorter sections of the course:

For more information about our public courses, see the schedule listing here, or to enquire about private courses or any other training enquiry, email us at training@rittmanmead.com


The Rittman Mead scripts github repository


Here at RittmanMead we’re big fans of working smarter, not harder. One of the ways we do this is by not re-inventing the wheel. There are only so many ways that products such as OBI and ODI can be implemented, and we’ve probably seen most of them. The corollary of this is that we have a large number of useful and reusable scripts that we use on a regular basis.

Another part of working smarter is the use of a good VCS (Version Control System). Stewart has already been preaching the good word over at his git and OBIEE series, and needless to say we use git for looking after our scripts too.

Today we’re opening up our scripts git repository, making several of our scripts available to the OBIEE and ODI community. You can find it on github here, or clone it directly using the URL in your git client, or from the commandline:

git clone git@github.com:RittmanMead/scripts.git

Obviously, we’re holding a few gems back, after all, we’ve all got to eat right? But we’d like to see community participation in the repository, so go ahead and use it, fork it and make improvements, add your own scripts, and submit a pull request to merge them back in to share with the community again. To keep track of what’s going on and when new scripts are added, you can watch it on github.

Boring disclaimer

Hopefully it goes without saying, all the content in the repository is provided without warranty, etc etc, don’t blame us if your OBIEE server spontaneously combusts or starts dancing on one leg. The scripts will have been tested and known to work (usually on Oracle Linux), but may need amending to work on your particular OS. If you can fix scripts to run on your OS, please go ahead and submit a pull request!

Also – the material is available to all, that’s why we’re making it public – but it would be good form, if you go on to use the scripts, to keep a reference back to their origin and not rip them off as your own. Not that anyone would do that.

So what’s in the goodie bag?

A gold-plated OBIEE init.d script

It’s not quite gold-plated, but it certainly has had some polish compared to the usual scripts that I’ve found for this purpose to date.

The script will automagically run OBIEE at startup, shut it down when the server is shut down, and gives you access to a status command that probes the system to determine whether it’s up or not. It supports multiple installations of OBIEE on a server, and requires minimal configuration.

The difference between this script and others is that instead of just looking for a running service, or polling a log file for a particular string, it looks at the processes, the network ports, each of the opmn-managed processes, and even whether /analytics is accessible.

The script also handles timeouts for the components, which is important when using it as part of a server shutdown routine. If a process takes too long (quite how long is customisable) shutting down, it will be killed so that the server shutdown doesn’t hang.



The configuration required is minimal; just the FMW_HOME (where OBIEE is installed), and optionally where you want log files written and under which user to run OBIEE.

You’ll find the script in obi/init.d/obiee with full details on how to set it up and use it in the associated README.md
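For illustration, assuming you register the script as /etc/init.d/obiee per the README, usage follows the standard service pattern, with optional registration for automatic start at boot on RHEL/OEL-style systems:

sudo service obiee start
sudo service obiee status
sudo service obiee stop

# Register for automatic start/stop at boot (RHEL/OEL)
sudo chkconfig --add obiee
sudo chkconfig obiee on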

pull_the_trigger.sh – An OBIEE sysadmin testing script

Put your OBIEE sysadmin skills to the test with this nifty little script. Set it up with your FMW_HOME details (no peeking at the rest of the script!) and then run it. It will take a random action that breaks something on your OBIEE system, and it’s your job to figure out what.

Hopefully it’s not necessary to say, don’t run this on your Production OBIEE server!

Bonus Tip: Use the service obiee status command enabled by the script above to make your life easier in diagnosing OBIEE problems!

Various WLST scripts

A hotchpotch of WLST scripts to make the OBIEE sysadmin’s life easier, including:

  • Enable/disable BI Server caching
  • Add users to the WLS LDAP directory
  • Enable Usage Tracking
  • Deploy an RPD
  • Check the state of Application Deployments

Response files

Response files to use with silent installs and upgrades of OBIEE.

manage_rpd.pl

Last but by no means least in this list is a script by Stewart Bryson. Find out more by following the forthcoming posts in his Git and OBIEE blog series, but I will say this: it looks very useful if you’re doing anything with MDS XML….

TimesTen and OBIEE port conflicts on Exalytics


Introduction

Whilst helping a customer set up their Exalytics server the other day I hit an interesting problem. Not interesting as in hey-this-will-cure-cancer, or oooh-big-data-buzzword interesting, or even interesting as in someone-has-had-a-baby, but nonetheless, interesting if you like learning about the internals of the Exalytics stack.

The installation we were doing was multiple non-production environments on bare metal, a design that we developed on our own Rittman Mead Exalytics box early last year, and one that Oracle describe in their Exalytics white paper which was published recently. Part of a multiple environment install is careful planning of the port allocations. OBIEE binds to many ports per instance, and there is also TimesTen to consider. I’d been through this meticulously, specifying ports through the staticports.ini file when building the OBIEE domain, as well as in the bim-setup.properties for TimesTen.

So, having given such careful thought to ports, imagine my surprise at seeing this error when we started up one of the OBIEE instances:

[OracleBIServerComponent] [ERROR:1] [] [] [ecid: ] [tid: e1dbd700]  [nQSError: 12002] 
Socket communication error at call=bind: (Number=98) Address already in use

which caused a corresponding OPMN error:

ias-component/process-type/process-set:
  coreapplication_obis1/OracleBIServerComponent/coreapplication_obis1/

Error
--> Process (index=1,uid=328348864,pid=17875)
  time out while waiting for a managed process to start

Address already in use? But… all my ports were hand-picked so that they explicitly wouldn’t clash…

Random ports

So it turns out that as well as using the two ports that are explicitly configured (daemon and server, usually 53396 and 53397 respectively), the TimesTen server process also binds to a port chosen at random each time it starts, for the purpose of internal communication. This is similar to what the Oracle Database listener does, and as my colleague Pete Scott pointed out, it’s been known for port clashes to occur between ODI and the Oracle Database listener.

To see this in action, use the netstat command, with the flags tlnp:

  • t : tcp only
  • l : LISTEN status only
  • n : numeric addresses/ports only
  • p : show associated processes

We pipe the output of the netstat command to grep to filter for just the process we’re looking for, giving us:

[oracle@obieesample info]$ netstat -tlnp|grep ttcserver
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 127.0.0.1:58476             0.0.0.0:*                   LISTEN      4811/ttcserver
tcp        0      0 0.0.0.0:53397               0.0.0.0:*                   LISTEN      4811/ttcserver

Here we can see on the last line the TimesTen server process (ttcserver) listening on the explicitly configured port 53397 for traffic on any address. We also see it listening on port 58476 for traffic only on the local loopback address 127.0.0.1.

What happens if we restart the TimesTen server and look at the ports again?

[oracle@obieesample info]$ ttdaemonadmin -restartserver
TimesTen Server stopped.
TimesTen Server started.

[oracle@obieesample info]$ netstat -tlnp|grep ttcserver
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 127.0.0.1:17073             0.0.0.0:*                   LISTEN      6878/ttcserver
tcp        0      0 0.0.0.0:53397               0.0.0.0:*                   LISTEN      6878/ttcserver

We can see the same listen port 53397 as before, listening for any client connections either locally or remotely, but now the port bound to the local loopback address 127.0.0.1 has changed – 17073.

Russian Roulette

So TimesTen randomly grabs a port each time it starts, and this may or may not be one of the ones that OBIEE is configured to use. If OBIEE is started first, then the problem does not arise because OBIEE has already taken the ports it needs, leaving TimesTen to randomly choose from the remaining unused ports.

If there are multiple instances of TimesTen and multiple instances of OBIEE then the chances of a port collision increase. What I wanted to know was how to isolate TimesTen from the ports I’d chosen for OBIEE. Constraining the application startup order (so that OBIEE gets all its ports first, and then TimesTen can use whatever is left) is a lame solution since it artificially couples two components that don’t need to be, adding to the complexity and support overhead.

TimesTen’s behaviour with these ports cannot itself be configured – Doc ID 1295539.1 states:

[...] All other TCP/IP port assignments to TimesTen processes are completely random and based on the availability of ports at the time when the process is spawned.

To understand the port range that TimesTen was using (so that I could then configure OBIEE to avoid it) I knocked up this little script. It restarts the TimesTen server, and then appends to a file the random port that it has bound to:

$ cat ./tt_port_scrape.sh
ttdaemonadmin -restartserver
netstat -tlnp|grep ttcserver|grep 127.0.0.1|awk -F ":" '{print $2}'|cut -d " " -f 1 1>>tt_ports.txt

Using my new favourite linux command, watch, I can run the above repeatedly (by default, every two seconds) until I get bored^H^H^H^H^H^H^H^H^H have collected sufficient data points:

watch ./tt_port_scrape.sh

Finally, parse the output from the script to look at the port ranges:

echo "Lowest port: " $(sort -n tt_ports.txt | head -n1)
echo "Highest port: " $(sort -n tt_ports.txt | tail -n1)
echo "Number of tests: " $(wc -l tt_ports.txt )

Using this, I observed that TimesTen server would bind to ports ranging from as low as around 9000, up to 65000 or so.

Solution

Raising this issue with the good folks at Oracle Support yielded a nice easy solution. In the kernel settings, there is a configuration option net.ipv4.ip_local_port_range which specifies the local port range available for use by applications. By default this is 9000 to 65500, which matches the range that I observed in my testing above:

[root@rnm-exa-01 ~]# sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 9000     65500

I changed this range with sysctl -w:

[root@rnm-exa-01 ~]# sysctl -w net.ipv4.ip_local_port_range="56000 65000"
[root@rnm-exa-01 ~]# sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 56000    65000

and then reran my testing above, which sure enough showed that TimesTen was now keeping its hands to itself and away from my configured OBIEE port ranges:

Lowest port:  56002
Highest port:  64990

(if I ran the test for longer, I’m sure I’d hit the literal extremes of the range)

To make the changes permanent, I added the entry to /etc/sysctl.conf:

net.ipv4.ip_local_port_range = 56000 65535

Lessons learned

1) Diagnosing application interactions and dependencies is fun ;-)
2) watch is a really useful little command on linux
3) When choosing OBIEE port ranges in multi-environment Exalytics installations, bear in mind that you want to partition off a port range for TimesTen, so keep the port ranges allocated to OBIEE ‘lean’.

Tips and Tricks for the OBIEE linux sysadmin


As well as industry-leading solution architecture, consultancy and training on Oracle BI, here at Rittman Mead we also provide expert services in implementation and support of such systems. In this blog I want to share some of the things I find useful when working with OBIEE on a Linux system.

OBIEE Linux start up script / service

Ticking both the OBIEE and Linux boxes, this script that I wrote is probably top of my list of recommendations (he says modestly…). It enables you to start and stop OBIEE using the standard Linux service command, integrates into system startup/shutdown (through init.d), and also supports an advanced status command which does its very best to determine the health of your OBIEE system.

Start up

Shutdown

Status

You can get the script from the Rittman Mead public GitHub repository, or directly download the script here (but don’t forget to check out the README).

screen

GNU screen is one of the most useful programs that I use on Linux. I wrote extensively about it in my blog post screen and OBIEE. It enables you to do things such as:

  • Run multiple commands simultaneously in one SSH session
  • Disconnect and reconnect (deliberately, or from dropped connections, e.g. on unreliable wi-fi or 3G) and pick up exactly where you left off, with all processes still running
  • Share your SSH view with someone else, for remote training or a second pair of eyes when troubleshooting
  • Search through screen scroll back history, cut and paste
  • …a lot more!

There are other screen multiplexers such as tmux, but I’ve found that screen is the most widely available by default. Since they all have quite steep learning curves and esoteric key shortcuts to operate them, I tend to stick with screen.

SSH keys

Like screen, nothing to do with OBIEE per se, but an important part of Linux server security to understand, IMNSHO (In My Not-So Humble Opinion!).

Maybe I’m overly simple but I like pretty pictures when I’m trying to grasp concepts, so here goes:


  • You create a pair of keys using ssh-keygen (see the example after this list). These are plain text and can be cut and pasted, copied, as required. One is private (e.g. id_rsa), and you need to protect this as you would any other security artifact such as server passwords; you can optionally secure it with a pass phrase. The other is public (e.g. id_rsa.pub), and you can share it with anyone.

  • Your public key is placed on any server you need access to, by the server’s administrator. It needs to go in the .ssh folder in the user’s home folder, in a file called authorized_keys. As many public keys as need access can be placed in this file. Don’t forget the leading dot on .ssh.
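For illustration, generating a key pair and getting the public key onto a server looks something like this – the user and host names are made up, and ssh-copy-id is a convenience tool that may not be present on every system:

# Generate a key pair; accept the default location and optionally set a passphrase
ssh-keygen -t rsa

# Copy the public key into ~/.ssh/authorized_keys on the target server
ssh-copy-id oracle@bi-server.example.com

# Or append it manually if ssh-copy-id isn't available
cat ~/.ssh/id_rsa.pub | ssh oracle@bi-server.example.com 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'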

Why are SSH keys good?

  • You don’t need a password to login to a server, which is a big time saver and productivity booster.
  • Authentication becomes about “this is WHO may access something” rather than “here is the code to access it, we have no idea who knows it though”.
  • It removes the need to share server passwords
    • Better security practice
    • Easier auditing of exactly who used a server
  • It enables the ability to grant temporary access to servers, and precisely control when it is revoked and from whom.
  • Private keys can be protected with a passphrase, without which they can’t be used.
  • Using SSH keys to control server access is a lot more secure since you can disable server password login entirely, thus kiboshing any chance of brute force attacks
  • SSH keys can be used to support automatic connections between servers for backups, starting jobs, etc, without the need to store a password in plain text

Tips

  • SSH keys are just plain text, making them dead easy to backup in a Password Manager such as LastPass, KeePass, or 1Password.
  • SSH keys work just fine from Windows. Tools such as PuTTY and WinSCP support them, although you need to initially change the format of the private key to ppk using PuTTYGen, an ancillary PuTTY tool.
  • Whilst SSH keys reside by default in your user home .ssh folder, you can store them on a cloud service such as Dropbox and then use them from any machine you want.
    • To make an ssh connection using a key not in the default location, use the -i flag, for example
      ssh -i ~/Dropbox/ssh-keys/mykey foo@bar.com
      
  • To see more information about setting up SSH keys, type:
    man ssh
    
  • The authorized_keys file is space separated, and the last entry on each line can be a comment. This normally defaults to the user and host name where the key was generated, but can be freeform text to help identify the key more clearly if needed. See man sshd for the full spec of the file; a sample line is shown below.
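For example, a single (truncated, made-up) entry consists of the key type, the key material, and then the free-text comment:

ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQ...rest-of-key-material... robin@laptop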

Determining your server’s public IP / validating Internet connectivity

Dead simple this one – if you’re working on a server, or maybe a development VM, and need to check it has an internet connection, or want to know what its public IP address is:

curl -s http://icanhazip.com/

This command returns just the IP and nothing else. Of course if you don’t have curl installed then it won’t work, so you’re left with the ever-reliable

ping google.com

Learn vi

…or emacs, or whatever your poison is. My point is that if you are going to be spending any serious time as an admin you need to be able to view and edit files locally on the Linux server. Transferring them to your Windows desktop with WinSCP to view in Notepad is what my granny does, and even then she’s embarrassed about it.

Elitism and disdain aside, the point remains. The learning curve of these console-based editors repays itself many-fold in time and thus efficiency savings in the long run. It’s not only faster to work with files locally, it reduces context-switching and the associated productivity loss.

Compare:

  1. I need to view this log file
  2. vi log.txt
  3. Done

with

  1. I need to check this log file for an error
  2. Close terminal window
  3. Start menu … now where was that program … hey fred, what’s the program … yeh yeh WinSCP that’s right
  4. Scroll though list of servers, or find IP to connect to
  5. Try to remember connection credentials
  6. Hey I wonder if devopsreactions has anything cool on it today
  7. Back to the job in hand … transfer file , which file?
  8. Hmmm, what was that folder called … something something logs, right?
  9. Dammit, back to the terminal … pwd , right, gottcha
  10. Navigate to the folder in WinSCP, find the file
  11. Download the file
  12. That dbareactions is pretty funny too, might just have a quick look at that
  13. Open Notepad (or at least Notepad++, please)
  14. Open file … where did Windows put it? My Documents? Desktop? Downloads? Libraries, whatever the heck they are ? gnaaargh
  15. Wonder if those cool guys at Rittman Mead have posted anything on their blog, let’s go have a look
  16. Back to Notepad, got the log file, now …… what was I looking for?
  17. Soddit
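
For the record, the vi route in the first list needs only a handful of keystrokes, all standard vi commands:

  • vi log.txt – open the file
  • /ORA- – search forward for “ORA-”
  • n – jump to the next match
  • Shift-G – jump to the end of the file
  • :q! – quit without saving changes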

Silent Installs

This has to be my #1 tip in the Work Smarter, Not Harder category of OBIEE administration, and is as applicable to OBIEE on Windows as it is to OBIEE on Linux. Silent installs are where you run the installer “hands off”. You create a file in advance that describes all the configuration options to use, and then crank the handle and off it goes. You can use silent installs for

  • OBIEE Enterprise install
  • OBIEE Software Only install
  • OBIEE domain configuration
  • WebLogic Server (WLS) install
  • Repository Creation Utility (RCU), both Drop and Create

The advantages of silent installs are many:

  • Guaranteed identical configurations across installations
  • No need to waste time getting an X Server working on non-Windows installs just to run the graphical install client
  • Entire configuration of a server can be pre-canned and scripted
  • Running the graphical installer is TEDIOUS the first time, the second, third, tenth, twentieth … kill me. Silent installs make the angels sing and new born lambs frolic in the virtual glades of OBIEE grass meadows heady with the scent of automatically built RCU schemas

To find out more about silent installations, check out the Oracle documentation for your OBIEE version.

We’ve shared some example response files on the Rittman Mead public GitHub repository, or you can generate your own by running the installer once in GUI mode and selecting the Save option on the Summary screen. You don’t have to actually proceed with the installation if all you want to do is generate the response file.
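
As a rough sketch of what a silent invocation looks like (the staging paths and response file name are placeholders, and the exact flags can vary between installer versions):

cd /u01/stage/bishiphome/Disk1
# Run the OBIEE installer hands-off, driven entirely by the response file
./runInstaller -silent -responseFile /u01/stage/obiee_softwareonly.rsp -invPtrLoc /etc/oraInst.loc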

opatch napply

I wrote about this handy little option for opatch in a blog post here. Where you have more than one patch to apply (as happens frequently with OBIEE patch bundles), this can be quite a time saver.
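
As a sketch, assuming the patches from a bundle have been unzipped side by side into one staging folder and that ORACLE_HOME points at the OBIEE Oracle home:

cd /u01/stage/obiee_patches
unzip -o 'p*.zip'
# Apply every patch found under the current directory in a single pass
$ORACLE_HOME/OPatch/opatch napply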

Bash

Bash is the standard command line that you will encounter on Linux. Here are a few tricks I find useful:

Ctrl-R – command history

This is one of those shortcuts that you’ll wonder how you did without. It’s like going through your command history by pressing the up/down arrows (you knew about that one, right?), but on speed.

What Ctrl-R does is let you search through your command history and re-use a command just by hitting enter.

How it works is this:

1) Press Ctrl-R. The bash prompt changes to

    (reverse-i-search)`':

2) Start entering part of the command line entry that you want to repeat. For example, I want to switch back to my FMW config folder. All I type is “co” to match the “co” in config, and bash shows me the match:

    (reverse-i-search)`co': cd /u01/dit/fmw/instances/instance1/config/

3) If I want to amend the command, I can press left/right arrows to move along the line, or just hit enter and it gets re-issued straight off

4) If there are multiple matches, either keep typing to narrow the search down, or press Ctrl-R again to cycle back through the earlier matches

Another example: to repeat my sqlplus command, I just press Ctrl-R, start typing sql, and it’s matched:

(reverse-i-search)`sq': sqlplus / as sysdba

Finally, to repeat the restart of Presentation Services, I just enter ps:

(reverse-i-search)`ps': ./opmnctl restartproc ias-component=coreapplication_obips1

time

If you prefix any command with time, then once it completes you get a nice breakdown of how long it took to run and where the time was spent. Very handy for quick bits of performance testing, or just curiosity :-)

$ time ./opmnctl restartproc ias-component=coreapplication_obis1
opmnctl restartproc: restarting opmn managed processes...

real    0m14.387s
user    0m0.016s
sys     0m0.031s

watch

This is a fantastic little utility that will take the command you pass it and repeatedly issue it, by default every two seconds.

You can use it to watch disk space, directory contents, and so on.

watch df -h

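Another OBIEE-flavoured use is keeping an eye on the system components while they come back up (assuming you are in the instance bin folder, e.g. instances/instance1/bin):

watch ./opmnctl status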

sudo !!

Not the exclamation “sudo!”, but sudo !!, meaning, repeat the last command but with sudo.

$ tail /var/log/messages
tail: cannot open `/var/log/messages' for reading: Permission denied

$ sudo !!
sudo tail /var/log/messages
Sep 26 18:18:16 rnm-exa-01-prod kernel: e1000: eth1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
Sep 26 18:18:16 rnm-exa-01-prod avahi-daemon[4965]: Invalid query packet.


What is sudo? Well I’m glad you asked:

[xkcd “sudo” cartoon] (credit: XKCD)

Over to you!

Which commands or techniques are you flabbergasted aren’t on this list? What functionality or concept should all budding OBI sysadmin padawans learn? Let us know in the comments section below.

OBIEE “Act As” vs “Impersonate”


There will come a point in the lifecycle of an OBIEE deployment when one user will need to access another user’s account. This may be to provide cover whilst a colleague is on leave, or for support staff to reproduce a reported error.

Password sharing aside (it’s zero-config! but a really really bad idea), OBIEE supports two methods for one user to access the system as if they were another: Impersonation and Act As.

This blog article is not an explanation of how to set these up (there are plenty of blogs and rip-off blogs detailing this already), but of the difference between the two options.

First, a quick look at what they actually are.

Impersonation

Impersonation is where a “superuser” (one with the oracle.bi.server.impersonateUser application policy grant) can log in to OBIEE as another user, without needing their password. It is achieved in the front end by constructing a URL, specifying:

  • The superuser’s login name and password (NQUser and NQPassword)
  • The login ID of the user to impersonate (Impersonate)

For example:

http://server:port/analytics/saw.dll?Logon&NQUser=weblogic&NQPassword=Password01&Impersonate=FSmith_FundY

The server will return a blank page to this request, but you can then submit another URL to OBIEE (e.g. the OBIEE catalog page or home page) and you will already be authenticated as the Impersonate user – without having specified their password.
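
As a rough sketch of that two-step flow from the command line (the host, credentials and target user are the same placeholders as above, and the cookie handling is just standard curl usage rather than anything OBIEE-specific):

# Log on as the superuser, impersonating FSmith_FundY, keeping the session cookie
curl -s -c /tmp/obiee_cookies.txt "http://server:port/analytics/saw.dll?Logon&NQUser=weblogic&NQPassword=Password01&Impersonate=FSmith_FundY" > /dev/null

# Subsequent requests with the same cookie jar run as the impersonated user
curl -s -b /tmp/obiee_cookies.txt "http://server:port/analytics/saw.dll?Dashboard"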

From here you can view the system as they would, and carry out whatever support or troubleshooting tasks are required.

Caution: Impersonation is disabled by default, even for the weblogic Administrator user, and it is a good idea to leave it that way. If you do decide to enable it, make sure that the user to whom you grant it has a secure password that is not shared with or known by anyone other than the account owner. Also, you will see from the example URL above that the password is submitted in plain text, which is not good from a security point of view. It could be “sniffed” along the way or, more easily, extracted from the browser history.

Act As

Whilst Act As is a very similar concept to Impersonation (allow one user to access OBIEE as if they were another), Act As is much more controlled in how it grants the rights. Act As requires you to specify a list of users who may use the functionality (“Proxy users”), and for each of the proxy users, a list of users (“Target users”) who they may access OBIEE as.

Act As functionality is accessed from the user dropdown menu:

From here, a list of the users that the logged-in user (the “proxy user”) has been configured to be able to access is shown:

Selecting a user switches straight to it:

In addition to this fine-grained specification of user:user relationships, you can specify the level of access a Proxy user gets – full, or read-only. Target users (those whom others can Act As) can see from their account page exactly who has access to their account, and at what level.

So what’s the difference?

Here’s a comparison I’ve drawn up

Here are a couple of examples to illustrate the point:


Based on this, my guidelines for use would be :

  • As an OBIEE sysadmin, you may want to use Impersonate to be able to test and troubleshoot issues. However, it is functionality much more intended for systems integration than for front-end user consumption. It doesn’t offer anything that Act As doesn’t, except fewer configuration steps. It is less secure than Act As, and could even be seen as a “backdoor” option. Particularly at companies where audit/traceability is important, it should be left disabled.
  • Act As is generally the better choice in all scenarios of an OBIEE user needing the ability to access another’s account, whether between colleagues, L1/L2 support staff, or administrators.
    Compared to Impersonation, it is more secure, more flexible, and more granular in whose accounts can be accessed by whom. It is also fully integrated into the user interface as standard functionality of the tool.

Reference

Thanks to Christian Berg and Gianni Ceresa for reading drafts of this article and providing valuable feedback.

Collecting Usage Tracking Data with Metric Extensions in EM12c


In my previous post I demonstrated how OBIEE’s Usage Tracking data could be monitored by EM12c through a Service Test. It was pointed out to me that an alternative for collecting the same data would be the use of EM12c’s Metric Extensions.

A Metric Extension is a metric definition associated with a type of target, which can optionally be deployed to any agent that collects data from that type of target. The point is that, unlike the Service Test we defined, a Metric Extension is define-once-use-many, and is more “lightweight” as it doesn’t require the definition of a Service. The value of the metric can be obtained from sources including shell scripts, JMX, and SQL queries.

The first step in using a Metric Extension is to create it. Once it has been created, it can be deployed and utilised.

Creating a Metric Extension

Let’s see now how to create a Metric Extension. First, access the screen under Enterprise -> Monitoring -> Metric Extensions.

To create a new Metric Extension, click on Create…. From the Target Type list choose Database Instance. We need to use this target type because it enables us to use the SQL Adapter to retrieve the metric data. Give the metric a name, and choose the SQL Adapter.

Leave the other options as default, and click on Next.

In a Metric Extension, the values of the columns (one or more) of data returned are mapped to individual metrics. In this simple example I am going to return a count of the number of failed analyses in the last 15 minutes (which matches the collection interval).
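
As a sketch of the kind of SQL the adapter could use (tested here from sqlplus on the database server, and assuming the DEV_BIPLATFORM RCU prefix used later in this article and that a non-zero SUCCESS_FLG in S_NQ_ACCT marks a failed request):

sqlplus -s / as sysdba <<'EOF'
-- Count requests flagged as failed in the last 15 minutes
SELECT COUNT(*) AS failed_analyses
FROM   dev_biplatform.s_nq_acct
WHERE  success_flg <> 0
AND    start_ts    >  SYSDATE - (15/1440);
EOF

In the Metric Extension itself you would paste just the SELECT statement into the SQL adapter.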

On the next page you define the metric columns, matching those specified in the adapter. Here, we just have a single column defined:

Click Next and you will be prompted to define the Database Credentials, which for now you can leave set to the default.

Now, importantly, you can test the metric adapter to make sure that it is going to work. Click on Add to create a Test Target, select the Database Instance target on which your RCU resides, and click Run Test.

What you’ll almost certainly see now is an error:

Failed to get test Metric Extension metric result.: ORA-00942: table or view does not exist


The reason? The SQL is being executed by the “Default Monitoring Credential” on the Database Instance, which is usually DBSNMP. In our SQL we didn’t specify the owner of the Usage Tracking table S_NQ_ACCT, nor is DBSNMP going to have permission on the table. We could create a new set of monitoring credentials that connect as the RCU table owner, or we could enable DBSNMP to access the table. Depending on your organisation’s policies and the scale of your EM12c deployment, you may choose one over the other (manageability vs simplicity). For the sake of ease I am going to take the shortest (not best) option, running the following as SYS on my RCU database to give DBSNMP access to the table and create a synonym in the DBSNMP schema.

GRANT SELECT ON DEV_BIPLATFORM.S_NQ_ACCT TO DBSNMP;  
CREATE SYNONYM DBSNMP.S_NQ_ACCT FOR DEV_BIPLATFORM.S_NQ_ACCT;  

Now retest the Metric Extension and all should be good:

Click Next and review the new Metric Extension

When you click on Finish you return to the main Metric Extension page, where your new Metric Extension will be listed.

A note about performance

When building Metric Extensions bear in mind the impact that your data extraction is going to have on the target. If you are running a beast of a SQL query that is horrendously inefficient on a collection schedule of every minute, you can expect to cause problems. The metrics that are shipped with EM12c by default have been designed by Oracle to be as lightweight in collection as possible, so in adding your own Metric Extensions you are responsible for testing and ensuring yours are too.

Deploying a Metric Extension for testing

Once you have built a Metric Extension as shown above, it will be listed in the Metric Extension page of EM12c. Select the Metric Extension and from the Actions menu select Save As Deployable Draft.


You will notice that the Status is now Deployable, and that on the Actions menu the Edit option has been greyed out. Now, click on the Actions menu again, choose Deploy To Targets…, and specify your RCU Database Instance as the target.

Return to the main Metric Extension page and click refresh, and you should see that the Deployed Targets number is now showing 1. You can click on this to confirm to which target(s) the Metric Extension is deployed.

Viewing Metric Extension data

Metric Extensions are defined against target types, and we have created the example against the Database Instance target type in order to have the SQL Adapter available to us. Having deployed it to the target, we can now go and look at the new data being collected. From the target itself, click on All Metrics and scroll down to the Metric Extension itself, which will be in amongst the predefined metrics for the target:


After deployment, thresholds for Metric Extension data can be set in the same way they are for existing metrics:



Thresholds can also be predefined as part of a Metric Extension so that they are already defined when it is deployed to a target.

Amending a Metric Extension

Once a Metric Extension has been deployed, it cannot be edited in its current state. You first create a new version using the Create Next Version… option, which creates a new version of the Metric Extension based on the previous one, and with a Status of Editable. Make the changes required, and then go through the same Save As Deployable Draft and Deploy to Target route as before, except you will want to Undeploy the original version.

Publishing a Metric Extension

The final stage of producing a Metric Extension is publishing it, which moves it on beyond the test/draft “Deployable” phase and marks it as ready for use in anger. Select Publish Metric Extension from the Actions menu to do this.

A published Metric Extension can be included in a Monitoring Template, and also supports the nice functionality of managed upgrades of deployed Metric Extension versions. In this example I have three versions of the Metric Extension: version 2 is Published and deployed to a target, whilst version 3 is new and has just been published:


Clicking on Deployed Targets brings up the Manage Target Deployments page, and from here I can select my target on which v2 is deployed, and click on Upgrade


After the confirmation message (“Metric Extension ME$USAGE_TRACKING upgrade operation successfully submitted.”), return to the Metric Extension page and you should see that v3 is now deployed to a target and v2 is not.

Finally, you can export Metric Extensions from one EM12c deployment for import and use on another EM12c deployment:

Conclusion

So that wraps up this brief interlude in my planned two-part set of blogs about EM12c. Next I plan to return to the promised JMeter/EM12c integration … unless something else shiny catches my eye in between …
