Essbase® Analytic Services Database Administrator's Guide | | Update Contents | Previous | Next | Print | ? | |
Information Map | |
An Analytic Services partitioned application can span multiple servers, processors, or computers. A partition is the piece of a database that is shared with another database. Partitioning applications can provide the following benefits:
Note: The information in this chapter is designed for block storage databases. Some of the information is not relevant to aggregate storage databases. For detailed information on the differences between aggregate and block storage, see Comparison of Aggregate and Block Storage. For information on creating aggregate storage applications, see Aggregate Storage Applications, Databases, and Outlines.
This chapter contains the following sections:
Caution: Design partitions carefully. Hyperion strongly recommends that you read this chapter before creating partitions.
Here is the suggested process for designing a partitioned database.
Analytic Services Partitioning is a collection of features that makes it easy to design and administer databases that span Analytic Services applications or servers. Partitioning is licensed separately from Analytic Services. The Partitioning option must be licensed for every server that contains a database partition.
Partitioning can provide the following benefits:
Based on user requirements, select one of the following partitioning strategies:
This section contains the following sections:
A partition is a piece of a database that is shared with another database. Partitions contain the following parts, as illustrated in Figure 68.
Partitioned databases contain at least one data source, the primary site of the data, and at least one data target, the secondary site of the data. A single database can serve as both the data source for one partition and the data target for another. When you define a partition, you map cells in the data source to their counterparts in the data target:
Figure 69: Data Source and Data Target
An Analytic Services database can contain many different partitions as well as data that is not shared with any other Analytic Services database. You can define partitions between the following databases:
You can only define one partition of each type between the same two databases. For example, you can only create one replicated partition between the Sampeast East database and the Samppart Company database. The East or Company databases can, however, contain many replicated partitions that connect to other databases.
A single database can serve as the data source or data target for multiple partitions. To share data among many databases, create multiple partitions, each with the same data source and a different data target:
Figure 70: Data Shared at Multiple Targets
An overlapping partition occurs when similar data from two or more databases serve as the data source for a single data target in a partition. For example, IDESC
East,
Sales
from database 1 and Boston,
Sales
from database 2 are mapped to IDESC
East,
Sales
and Boston,
Sales
in database 3. Because Boston
is a member of the dimension East
, the data for Boston
mapped to database 3 from database 1 and database 2, overlap. This data overlap results in an overlapping partition:
Figure 71: Overlapping Partitions
An overlapping partition is allowed in linked partitions, but is invalid in replicated and transparent partitions and generates an error message during validation.
You can use attribute functions for partitioning on attribute values. But you cannot partition an attribute dimension. Use attribute values to partition a database when you want to access members of a dimension according to their characteristics.
For example, in the Sample Basic database, you cannot partition the Pkg Type attribute dimension. But you can create a partition that contains all the members of the Product dimension that are associated with either or both members (Bottle and Can) of the Pkg Type dimension. If you create a partition that contains members associated with Can, you can access data only on Product members that are packaged in cans; namely, 100-10, 100-20, and 300-30.
You can use the @ATTRIBUTE command and the @WITHATTR command to define partitions.
For example, to extract data on all members of the Product dimension that are associated with the Caffeinated attribute dimension, you can create a partition such as @ATTRIBUTE (Caffeinated). But you cannot partition the Caffeinated attribute dimension.
Based on the previous example, this partition is correct:
Figure 72: Correct Partitioning
Source Target @ATTRIBUTE(Caffeinated) @ATTRIBUTE(Caffeinated)
Based on the previous example, this partition is incorrect:
Figure 73: Incorrect Partitioning
Source Target Caffeinated Caffeinated
For more information about these commands, refer to the section on calculation commands in the Technical Reference.
For more information on attribute dimensions, see Working with Attributes.
Partitioning a database is not always the correct option. The following sections provide questions you can use to determine if partitioning the database is the best solution for you.
Review the following list of questions. If you answer yes to many of them, or answer yes to some that are very important to you, you may wish to partition databases.
Sometimes, it does not make sense to partition a centralized database. Partitioning a database can require additional disk space, network bandwidth, and administrative overhead. Review the following list of questions. If you answer yes to many of them, or answer yes to some that are very important to you, you may not want to partition a database.
When designing a partitioned database, find out the following information about the data in the database:
The answers to these questions determine which data to include in each partition. For examples, see Case Studies for Designing Partitioned Databases.
Note: You cannot partition attribute dimensions. See Attributes in Partitions.
Analytic Services supports the following types of partitions:
A replicated partition is a copy of a portion of the data source that is stored in the data target. Some users can then access the data in the data source while others access it in the data target.
In the Samppart and Sampeast applications shipped with Analytic Services, for example, the database administrator at The Beverage Company (TBC) created a replicated partition between the East database and the Company database containing Actual, Budget, Variance, and Variance%. Users in the eastern region now store their budget data locally. Because they do not have to retrieve this data live from the corporate headquarters, their response times are faster and they have more control over the down times and administration of the local data. For a more complete description of the sample partitioned databases provided with Analytic Services, see Case Study 1: Partitioning an Existing Database.
Changes to the data in a replicated partition flow from the data source to the data target. Changes made to replicated data in the data target do not flow back to the data source. If users change the data at the data target, Analytic Services overwrites their changes when the database administrator updates the replicated partition.
The database administrator can prevent the data in the replicated portion of the data target from being updated. This setting takes precedence over access provided by security filters and is also honored by batch operations such as dataload and calculation. By default, replicated partitions are not updateable. For directions on how to set a partition as updateable, see the Essbase Administration Services Online Help.
Use a replicated partition when you want to achieve any of the following goals:
These sections help you assess the value of replicated partitions:
Replicated partitions must follow these rules:
The data source and data target outlines for the non-shared areas do not have to be mappable.
The cells in a data target can be the data source for a different replicated partition. For example, if the Samppart Company database contains a replicated partition from the Sampeast East database, you can replicate the cells in the Sampeast East database into a third database, such as the Sampwest West database.
For a discussion of using Dynamic Time Series members in replicated partitions, see Using Dynamic Time Series Members in Partitions.
Replicated partitions can solve many database problems, but replicated partitions are not always the ideal partition type. This section describes the advantages and disadvantages of using a replicated partition.
Following are the advantages of using a replicated partition.
Following are the disadvantages of using a replicated partition.
To improve the performance of replicated partitions, consider the following when replicating data.
For example, to replicate the database along the Market dimension:
Sometimes you cannot calculate derived data at the data target. In that case, you must replicate it from the data source. For example, you cannot calculate derived data at the data source if the data meets any of the following criteria:
One port is used for every unique user and computer combination. If a user defines several replicated partitions on one server using the same user name, then only one port is occupied.
In a replicated partition, when a user (user1) drills into an area in the target that accesses source data, user1 is using the user name declared in the partition definition (partition user) to access the data from the source database. This access causes the use of an additional port because different users (user1 and partition user) are connecting to the application.
If a second user (user2) connects to the target database and drills down to access source data, user2 also uses the user name declared in the partition definition (partition user) to access the source database. Because the partition user is already connected to the source database, an additional port is not needed for the partition user, as long as user2 is accessing the same source database.
Note: Because of the short-term nature of replication, replicated partitions and ports are rarely a problem.
A transparent partition allows users to manipulate data that is stored remotely as if it were part of the local database. The remote data is retrieved from the data source each time that users at the data target request it. Users do not need to know where the data is stored, because they see it as part of their local database.
Figure 75: Transparent Partitions
Because the data is retrieved directly from the data source, users see the latest version of the data. When they update the data, their updates are written back to the data source. This process means that other users at both the data source and the data target have immediate access to those updates.
With a transparent partition, users at the data source may notice slower performance as more users access the source data and users at the data target may notice slower performance as more users access the source data.
For example, the database administrator at TBC can use a transparent partition to calculate each member of the Scenario dimension on a separate CPU. This process reduces the elapsed time for the calculation, while still providing users with the same view of the data. For a more complete description of a transparent partition based on the Sample Basic database, see Case Study 1: Partitioning an Existing Database.
Use a transparent partition when you want to achieve the following goals:
These sections help you assess the value of transparent partitions:
Transparent partitions must follow these rules:
For a discussion on using Dynamic Time Series members in transparent partitions, see Using Dynamic Time Series Members in Partitions.
Transparent partitions can solve many database problems, but transparent partitions are not always the ideal partition type. This section describes the advantages and disadvantages of using a transparent partition.
Following are the advantages of using a transparent partition:
Following are the disadvantages of using a transparent partition:
Analytic Services does not recalculate the values of transparent dependents because the outlines for the data source and the data target may be so different that such a calculation is accurate.
For example, suppose that the data target outline contained a Market dimension with East, West, South, and Central members and the data source outline contained an East dimension with New York and New Jersey members. If you tried to calculate the data target outline, you would assume that East was a level 0 member. In the data source, however, East is derived by adding New York and New Jersey. Any calculations at the data target, however, would not know this information and could not reflect any changes made to New York and New Jersey in the data source. To perform an accurate calculation, therefore, you must first calculate East in the data source and then calculate the data target.
For a tip on improving performance of transparent calculations, see Calculating Transparent Partitions.
If these disadvantages are too serious, consider using replicated or linked partitions instead.
To improve the performance of transparent partitions, consider the following facts when creating the partition:
When you perform a calculation on a transparent partition, Analytic Services performs the calculation using the current values of the local data and transparent dependents. When calculating local data that depends on remote data, Analytic Services performs a bottom-up calculation. The bottom-up calculation can be done only if the calculator cache on the target database is used properly. For complete information on bottom-up calculations, see Using Bottom-Up Calculation. For information on the calculator cache, see Sizing the Calculator Cache.
Increasing the amount of memory assigned to the calculator cache greatly improves calculation performance with transparent partitions. When a calculation is started, a message in the application log indicates whether or not the calculator cache is enabled or disabled on the target database. Using the calculator cache on the target database reduces the number of blocks that are requested from the data source during calculation. Reducing the blocks requested, in turn, reduces the amount of network traffic that is generated by transferring blocks across the network. For information on estimating the size of the calculator cache, see Sizing the Calculator Cache.
Calculating data on the data target can greatly slow performance when the data target must retrieve each dependent data block across the network, and then perform the calculation.
Performance with transparent calculations may also slow if Analytic Services must perform a top-down calculation on any portion of the data target that contains top-down member formulas. When the data target contains no top-down member formulas, Analytic Services can perform a bottom-up calculation on the data target, which is much faster.
When Analytic Services performs the calculation on the data source, it can always perform a bottom-up calculation. For a comparison of top-down and bottom-up calculations, see Using Bottom-Up Calculation.
Consider using these calculation alternatives:
Other performance strategies include the following:
If the data target and data source outlines are identical except for different member formulas, make sure that the partition definition produces the desired calculation results.
For example, suppose that the data source and data target outlines both contain a Market dimension with North and South members, and children of North and South. On the data target, Market is calculated from the data for the North and South members (and their children) on the data source. If any of these members on the data source contain member formulas, these formulas are calculated, thus affecting the calculated value of Market on the data target. These results may be different from how the Market member are calculated from the North and South members on the data target, where these formulas may not exist.
Make sure that any formulas you assign to members in the data source and data target produce the desired results.
One port is used for every unique user and machine combination. If a user defines several transparent partitions on one server, using the same user name, then only one port is occupied.
In a transparent partition, when a user (user1) drills into an area in the target that accesses source data, user1 is using the user name declared in the partition definition (partition user) to access the data from the source database. This process causes the use of an additional port because different users (user1 and partition user) are connecting to the application.
If a second user (user2) connects to the target database and drills down to access source data, user2 also uses the user name declared in the partition definition (partition user) to access the source database. Because the partition user is already connected to the source database, an additional port is not needed for the partition user, as long as user2 is accessing the same source database.
A linked partition connects two different databases with a data cell. When the end user clicks the linked cell in the data target, you drill across to a second database, the data source, and view the data there. If you are using Spreadsheet Add-in, for example, a new sheet opens displaying the dimensions in the second database. You can then drill down into these dimensions.
Unlike replicated or transparent partitions, linked partitions do not restrict you to viewing data in the same dimensionality as the target database. The database that you link to can contain very different dimensions than the database from which you connected. With linked partitions, data is not physically transferred from the source to the target. Instead, a data cell or range of cells on the target provides a link point to a cell or range of cells on the source.
To prevent users from seeing privileged data, establish security filters on both the data source and the data target. For directions on establishing security filters, see Planning for Security for Partitioned Databases.
There are no performance considerations for linked partitions, beyond optimizing the performance of each linked database.
For example, if TBC grew into a large company, they might have several business units. Some data, such as profit and sales, exists in each business unit. TBC can store profit and sales in a centralized database so that the profit and sales for the entire company are available at a glance. The database administrator can link individual business unit databases to the corporate database. For an example of creating a linked partition, see Case Study 3: Linking Two Databases.
A user in such a scenario can perform these tasks:
Figure 79: Source and Target for Linked Partition
For linked partitions, the spreadsheet that the user first views is connected to the data target, and the spreadsheet that opens when the user drills across is connected to the data source. This setup is the opposite of replicated and transparent databases, where users move from the data target to the data source.
Use a linked partition when you want to connect databases with different dimensionality.
These sections help you assess the value of linked partitions:
Linked partitions allow users to navigate to databases that contain different dimensions, but linked partitions are not always the ideal partition type. This section describes the advantages and disadvantages of using a linked partition.
Following are the advantages of linked partitions:
Following are the disadvantages of linked partitions:
When a user clicks on a linked cell in a linked partition, a spreadsheet opens and displays the linked database. This process is called drill across. To facilitate drill across access you can use the following strategies:
When a user drills across on data to a data target, Analytic Services logs the user into the data target using the following steps:
When accessing a linked partition, Analytic Services tries to use the end user's (user1) login information to connect to the source database. If user1 does not have access to the source database, Analytic Services looks for the linked partition default user name and password. If these defaults are not specified, user1 is requested to enter login information to access the source database. Port usage varies depending on the number of different user names being used to access the various source and target databases (and whether those databases are contained within the same or different servers).
The following table should help you choose which type of partition to use.
Feature |
Replicated |
Transparent |
Linked |
---|---|---|---|
Ability to use front-end tools that are not Distributed OLAP-aware |
|||
Users accessing replicated, transparent, or linked partitions may need to view data stored in two or more databases. The following sections describe how to set up security so that users do not view or change inappropriate data.
Create the required end users with the correct filters.
The administrative account performs all read and write operations requested by the data target for the data source. For example, when end users request data at the data target, the administrative account retrieves the data. When end users update data at the data target, the administrative account logs into the data source and updates the data there.
You can create filters on the administrative account in addition to filters on the end users. Filters on the administrative account can ensure that no one at the data target can view or update inappropriate data. For example, the administrator at the corporate database can restrict write access on certain cells to avoid relying on administrators in the various regions to set up security correctly for each end user.
Create the required administrative users with the correct filters.
See Setting the User Name and Password.
Analytic Services uses this account to log onto the data source to retrieve data and to perform outline synchronization operations.
See Managing Security for Users and Applications.
The following sections describe examples of partitioning a database:
Assume that TBC, the fictional soft drink company upon which the Sample Basic database is based, started out with a centralized database. As the eastern region grew, however, this solution was no longer feasible. The networks to the eastern region could not handle the large flow of data. Users were constantly waiting for data that they needed to make decisions. One day, the network went down and users at the eastern region could not access the data at all.
Everyone agreed that the eastern region needed to access its own data directly, without going through the company database. In addition, TBC decided to change where budgeting information was stored. The corporate budget stays at company headquarters, but the eastern region budget moves to the eastern region's database.
So, assume that TBC decided to ask you to partition their large centralized database into two smaller databases-Company and East.
This example is based on the Samppart application, which contains the Company database, and the Sampeast application, which contains the East database. Both are shipped with Analytic Services.
This illustration shows a subset of the partitioned databases. The arrows indicate flow from the data source to the data target. The Company database is the data source for the Corp_Budget member and the data target for the East and the East Actual members. The East database is the data source for its East and Actual members and the data target for the Corp_Budget member.
Use this procedure to create a partition based on this example:
The Sample Basic database contains five standard dimensions-Year, Measures, Product, Market, and Scenario.
Now that the corporate database is partitioned, users and database administrators see the following benefits:
Assume that TBC has several databases, such as Inventory, Payroll, Marketing, and Sales. Users viewing the Sample Basic database want to share data with and navigate to those other databases and you, the database administrator, want to synchronize related data. It is impractical to combine all of the databases into one database, for the following reasons:
By connecting the databases instead, you can reap the following benefits:
So you decide to connect multiple databases.
Note: This example is not shipped with Analytic Services.
Use a replicated partition for the Opening_Inventory and Ending_Inventory members because the network connection is slow.
Now that the Sample Basic database is partitioned, users and database administrators see the following benefits:
Assume that TBC, the fictional soft drink company upon which the Sample Basic database is based, has two main databases-the Sample Basic database and TBC Demo. Both databases have similar outlines, but TBC Demo has two additional dimensions, Channel, which describes where a product is sold, and Package, which describes how the product is packaged.
The database administrator for the Sample Basic database notices that more and more users are requesting that she add channel information to the Sample Basic database. But, since she does not own the data for channel information, she is reluctant to do so. She decides instead to allow her users to link to the TBC Demo database which already contains this information.
Note: This example is not shipped with Analytic Services.
The database administrator decides to link the Product dimension of the Sample Basic database to the Product dimension of TBC Demo. Users can then drill across to TBC Demo and view the Channel and Package information.
Note: This setup is the opposite of replicated and transparent databases, where users move from the data target to the data source.
Use a linked partition because the databases have different dimensionality.
Now that the databases are linked, users and database administrators see the following benefits:
![]() |