OracleBIBlog

Wednesday, June 2, 2010

BI Apps Performance Optimization and Consideration - Part 3

In this final part, we will look at other options for optimizing OBIEE performance.

Aggregation Strategy

  • Multi-level (pyramid) aggregation strategy: level 1 to level 2, level 2 built on level 1

  • Incremental aggregation for level 1 aggregates to reduce ETL time

  • Level 1 aggregates can be at the lowest leaf of each dimension to avoid bulk re-aggregation when dimension hierarchies change (e.g. sales force re-organization, product hierarchy changes)

  • Bulk aggregations for higher levels for simplicity

  • Data compression drives aggregation (at least 10x, preferably closer to 100x). This could be implemented for the Service Request tables

  • Try aggregation based on dimensions instead of facts, targeting an aggregation ratio of about 50:1
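As a rough sketch of a level-1 aggregate with incremental (fast) refresh, an Oracle materialized view could be used, assuming a materialized view log already exists on the fact table. The table and column names below (w_srvreq_f, day_wid, and so on) are illustrative, not taken from the actual BI Apps schema:

```sql
-- Illustrative level-1 aggregate: Service Request facts rolled up to
-- day / product grain. REFRESH FAST gives the incremental behavior
-- described above (requires a materialized view log on w_srvreq_f).
CREATE MATERIALIZED VIEW w_srvreq_day_a
  BUILD IMMEDIATE
  REFRESH FAST ON DEMAND
  ENABLE QUERY REWRITE
AS
SELECT sr.day_wid,
       sr.product_wid,
       COUNT(*)                     AS sr_count,
       COUNT(sr.resolution_hours)   AS resolution_cnt,  -- needed for fast refresh of the SUM
       SUM(sr.resolution_hours)     AS total_resolution_hours
FROM   w_srvreq_f sr
GROUP  BY sr.day_wid, sr.product_wid;
```

With ENABLE QUERY REWRITE, the optimizer can transparently redirect qualifying queries against the fact table to the aggregate.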

Caching Strategy

  • A proper caching strategy is among the best optimization practices. Caching should be implemented per user group

  • Set the physical table cache property on the master table, not on its aliases

  • Set the cache persistence time based on the refresh schedule

  • Ideally, do not select "Cache never expires" in the physical table properties

  • Monitor the cache logs for user sessions and analyze queries across users to optimize cache usage

  • Use the Analytics Cache Purge ODBC extension functions to purge the cache. This has the following advantages:

  • ETL routines can proactively call the ODBC extension functions to manage the cache. This ensures that obsolete cache entries are purged as soon as possible

  • Purging Cache by Logical Query is a new capability. This will be particularly helpful in purging cache associated with logical queries that are used in Cache Seeding.

  • Purging the entire cache set in one command is a new capability that can be used in various scenarios including development and testing environments.

  • SAPurgeCacheByQuery purges a cache entry that exactly matches the logical query plan derived from a specified query. The function takes one, and only one, parameter representing the query text. A related setting, USE_ADVANCED_HIT_DETECTION, defaults to NO, which preserves the pre-existing cache hit detection behavior. Setting it to YES invokes a two-pass algorithm that examines a greater search space and is potentially more expensive than the default, so use YES when you wish to maximize the number of cache hits
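As a sketch, the purge functions can be issued through nqcmd against the BI Server DSN (the DSN name, credentials, and query text below are illustrative):

```sql
-- purge_cache.sql: run against the BI Server, for example via
--   nqcmd -d AnalyticsWeb -u <user> -p <password> -s purge_cache.sql

-- Purge the entry matching one logical query's plan:
Call SAPurgeCacheByQuery('SELECT "Sales"."Revenue" FROM "Sales"');

-- Or purge the entire cache set in one command:
Call SAPurgeAllCache();
```

An ETL post-load step can invoke the same script so stale cache is removed as soon as new data lands.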

    Check the following caches for scalability:

  • Browser cache: scripts (JavaScript, HTML), high-resolution images

  • Data access: query plan cache, multithreaded database pools

  • BI Server cache: result sets

Database Optimization

  1. Database archive strategy

  • The data in the database needs to be archived from time to time in order to manage the data size. The idea is to keep only a certain amount of data in the warehouse and archive the rest. For example, keep only the last 12 months of data in the warehouse and archive anything older.

  • This archive strategy helps limit the number of rows in very large tables
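A minimal sketch of the 12-month archive step might look like the following; the table and column names are hypothetical:

```sql
-- Illustrative archive step: move Service Request facts older than
-- 12 months into an archive table, then remove them from the warehouse.
INSERT INTO w_srvreq_f_archive
SELECT * FROM w_srvreq_f
WHERE  created_dt < ADD_MONTHS(TRUNC(SYSDATE), -12);

DELETE FROM w_srvreq_f
WHERE  created_dt < ADD_MONTHS(TRUNC(SYSDATE), -12);

COMMIT;
```

In practice this would run as a scheduled job, and on a partitioned table the DELETE could be replaced by a much cheaper partition exchange or drop.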

2. Database Partition strategy

  • A good database partition strategy helps query performance, since a query only needs to scan the relevant partition to obtain its result. For example, the Organization table can be partitioned on its Active Flag, and the Service Request table can be partitioned on X_CALC_FLG = 'Y'
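The Service Request example above could be implemented with Oracle list partitioning along these lines (table and column definitions are illustrative, not the full BI Apps DDL):

```sql
-- Illustrative list partitioning on the calculation flag, so queries
-- filtering on X_CALC_FLG = 'Y' touch only one partition.
CREATE TABLE w_srvreq_f (
  row_wid     NUMBER,
  x_calc_flg  CHAR(1),
  created_dt  DATE
)
PARTITION BY LIST (x_calc_flg) (
  PARTITION p_calc    VALUES ('Y'),
  PARTITION p_nocalc  VALUES ('N'),
  PARTITION p_other   VALUES (DEFAULT)
);
```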

3. Database Indexes

  • Database indexes should be added for the most frequently queried columns, and rebuilt from time to time to remain effective. The ALTER INDEX ... MONITORING USAGE command collects usage statistics over a period of time.
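Index-usage monitoring can be sketched as follows (the index name is hypothetical):

```sql
-- Start recording whether this index is used:
ALTER INDEX w_srvreq_f_idx1 MONITORING USAGE;

-- After a representative workload period, check the result:
SELECT index_name, used, monitoring
FROM   v$object_usage
WHERE  index_name = 'W_SRVREQ_F_IDX1';

-- Stop monitoring once the decision is made:
ALTER INDEX w_srvreq_f_idx1 NOMONITORING USAGE;
```

Indexes that show USED = 'NO' over a full business cycle are candidates for dropping, which also speeds up ETL loads.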

  • Increase Parallel Query by configuring the database server for an optimal degree of parallel processing.

  • Given sufficient resources, this can greatly improve query performance

  • Make sure that cost-based optimization is enabled on databases

  • Size the warehouse tablespace to at least the size of the transactional database, and make sure the temporary tablespace has adequate space

  • Create histogram statistics on the key indexed columns so the optimizer can build better plans for critical queries

  • Spread data across multiple disks; a RAID (Redundant Array of Independent Disks) configuration is recommended if possible

  • Partitioning large fact tables such as Asset and Service Request is recommended

  • Set the number of log file groups to 4 on the warehouse schema

  • Set the size of each log file to 10 MB on the warehouse database

  • Set sga_max_size to 700 MB or more

  • On the client side, edit the tnsnames.ora file and modify the TNS alias by adding SDU= and TDU= settings

  • On the Oracle client, set the NLS_SORT parameter to BINARY
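The SDU/TDU change mentioned above might look like the following tnsnames.ora fragment; the alias name, host, and sizes are illustrative, and the same SDU value should also be set on the listener side for it to take effect:

```
OBIDW =
  (DESCRIPTION =
    (SDU = 16384) (TDU = 16384)
    (ADDRESS = (PROTOCOL = TCP)(HOST = dwhost)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = obidw))
  )
```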

  • The ETL implementation team should use analyze table commands specific to their database environment, executed after the respective session has completed. This can be achieved using the post-session command property of each session

  • Try to break up ETL processing so data is extracted and staged remotely, then compressed for further local processing:

  • Remote ETL servers extract and stage to local flat file, compress and send the flat file to central ETL server

  • Central ETL Server performs the ‘stage to data warehouse’ process
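The analyze step mentioned above could be a post-session command along these lines on Oracle; the schema and table names are hypothetical:

```sql
-- Gather optimizer statistics after a load session completes.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => 'OBAW',
    tabname => 'W_SRVREQ_F',
    cascade => TRUE);   -- also refresh statistics on the table's indexes
END;
/
```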

  • Manage I/O traffic: manage input and output accesses to disk storage by striping it. The best and simplest option is to install disk storage arrays (RAID); the second best is to stripe volumes using a Logical Volume Manager.

  • Defragment the tables into which the data is extracted

  • See how much memory is available on the server and set SORT_AREA_SIZE and SORT_AREA_RETAINED_SIZE fairly high; 20, 30, or 40 MB per session is not uncommon in a data warehouse situation. Also set HASH_AREA_SIZE fairly large
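The memory settings above can be sketched as session-level parameters, assuming manual work-area sizing is in effect (the sizes shown are examples only and should be tuned to the memory actually available):

```sql
-- These *_AREA_SIZE parameters only apply when work areas are sized manually:
ALTER SESSION SET workarea_size_policy = MANUAL;

-- Sizes are in bytes:
ALTER SESSION SET sort_area_size          = 33554432;  -- 32 MB
ALTER SESSION SET sort_area_retained_size = 33554432;
ALTER SESSION SET hash_area_size          = 67108864;  -- 64 MB
```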

Infrastructure Options

  • Consider deployment options on separate hardware for various OBI components like BI Publisher, OBI Server, OBI Presentation Services, Oracle Delivers Server

  • Hardware should be provisioned based on Oracle's recommendations, and the database hardware must also be sized based on the size of the database