OracleBIBlog Search

Monday, December 29, 2008

CACHE MANAGEMENT

Databases are periodically refreshed with updated data due to ETL (Extract-Transform-Load) processes being executed. This will cause the information that is stored in the query cache to become outdated, or ‘stale’. OBIEE Administrators must enable a method of purging and refreshing the cache on a continuous basis.


IT TAKES MONEY TO MAKE CACHE

There is a cost associated with utilizing cache. This cost comes in the form of disk space for storage, purge transactions on the server, and administration. However, all the costs itemized above are easily outweighed by the decreased response times and increased performance gained by the application.


SOME CACHE IS WORTH MORE THAN OTHER CACHE

Not all data sets are treated the same. Although caching is set as a default within OBIEE, some queries may not be a suitable use of the cache.

Situations in which queries return an extra-large amount of data may not be ideal because the extra-large data set will also cause the cache file to be extra-large as well. The larger the cache, the less performance enhancement can be derived from the cache. A general rule-of-thumb is that a direct database hit should be sought with queries that will generate more than a 1GB cache entry.

Situations in which particular data elements must be refreshed frequently, or on a near real-time basis, may not be good candidates for a caching solution. Depending on your requirements, there may be a threshold for which the performance advantage of cache can be trumped by the frequency at which the cached data will need to be purged and refreshed.


OPEN YOUR CACHE PLAN

1. Make a decision to either (a) start with cache enabled for the application... or (b) use cache only as a peformance tuning tool following initial development.

2. Then.. develop a cache updating method

3. On an ongoing basis, monitor query requests to identify opportunities for improvement


CACHE METHODS

The important hurdle to overcome when implementing a cache solution is to eliminate the opportunity for data latency, or ‘stale’ data to exist. ‘Stale’ data will exist when the cached data is not purged after the ETL process has updated the data warehouse.

1. No Cache Method
2. Manually Administered Cache Method
3. Table Level Caching Method
4. Polling Table Reference Method


No Cache Method –

A new SQL query will call the database every time a request for data is generated from the users. This will greatly affect system performance and user productivity because of the increased network traffic and demands on the server.

The system will only be as good (or fast) as its weakest link. If the network connection is slow or the database is slow returning results… the users will feel this pain with slow response times.


Manually Administered Cache Method --

When connected in to a repository within online mode, a cache manager utility is available from Manage>Cache. Note that this option may be unavailable if you have disabled caching in the NQSConfig.ini file.

Manual cache management is best served as a useful utility during testing phases. It is generally not a dependable option for daily operations as it requires the ad-hoc purging of the cache by a user.


Polling Table Reference Method –

The system can identify when to refresh cached data with the use of a polling table, better known as an event table. The event table contains timing information about specific events that occur. When an ETL process is executed, an event table in the database is updated with an entry recording the details of a data table when that data table is updated by the ETL process. The BI Server can poll the event table and purge the data from the cache if a data table has been updated.

The frequency with which the BI Server checks the polling table can be set to coincide with that of the ETL so data from the more frequently updated tables is purged from the cache more often to avoid ‘stale’, or out-of-date’, data.

The polling table method can serve to be most useful where incremental ETL processes run during the day. An example of this would be to update transaction or sales data.

The polling table method is not as beneficial when the incremental ETL is run once a day or overnight.

Frequency settings for which the BI Server polls the event table is set in the OBIEE Administration tool, Tools>Utilities>Oracle BI Event Tables. Note that the parameters for the event table contain table names only and cannot contain an alias. This can lead to misleading results and an alternative purging strategy must be found for the alias.


Table Level Method –

The simplest description of this cache method is to manage caching on a table-by-table basis. All tables have caching enabled by default. The default amount of time data is left in cache (called persistence time) is infinite.

The best combination of performance improvement and storage space conservation can be achieved by deselecting tables from this process that are rarely queried. The modification of persistence time to coincide with the incremental ETL processes (similar to the Polling Table Reference Method) can also further increase performance.

0 comments: