Olap cubes excel. OLAP = Multidimensional View = Cube. Creating an Internet Sales Cube

OLAP (Online Analytical Processing) data cubes allow you to efficiently extract and analyze multidimensional data. Unlike other types of databases, OLAP databases are designed specifically for analytical processing and rapid retrieval of all kinds of data sets from them. There are actually several key differences between standard relational databases, such as Access or SQL Server, and OLAP databases.

Rice. 1. To connect an OLAP cube to an Excel workbook, use the command From Analytics Services

Download the note in or

In relational databases, information is represented as records that are added, deleted, and updated sequentially. OLAP databases store only a snapshot of data. In an OLAP database, information is archived as a single block of data and is intended for on-demand output only. Although new information can be added to an OLAP database, existing data is rarely edited, much less deleted.

Relational databases and OLAP databases are structurally different. Relational databases typically consist of a set of tables that are related to each other. In some cases, a relational database contains so many tables that it is very difficult to determine how they are connected. In OLAP databases, the relationship between individual blocks of data is determined in advance and stored in a structure known as OLAP cubes. Data cubes store complete information about the hierarchical structure and relationships of the database, which greatly simplifies navigation through it. In addition, it is much easier to create reports if you know in advance where the data you are extracting is located and what other data is associated with it.

The main difference between relational databases and OLAP databases is the way information is stored. Data in an OLAP cube is rarely presented in a general way. OLAP data cubes typically contain information presented in a pre-designed format. Thus, the operations of grouping, filtering, sorting and merging data in cubes are performed before filling them with information. This makes retrieving and displaying the requested data as simplified as possible. Unlike relational databases, there is no need to organize the information properly before displaying it on the screen.

OLAP databases are typically created and maintained by IT administrators. If your organization does not have a structure that is responsible for managing OLAP databases, then you can contact the relational database administrator with a request to implement corporate network at least some OLAP solutions.

Connecting to an OLAP data cube

To access an OLAP database, you first need to establish a connection to the OLAP cube. Start by going to the Ribbon tab Data. Click the button From other sources and select the command from the drop-down menu From Analytics Services(Fig. 1).

When you select the specified command of the Data Connection Wizard (Fig. 2). Its main task is to help you establish a connection to the server, which will be used by Excel when managing data.

1. First you need to provide Excel with registration information. Enter the server name, login name and data access password in the fields of the dialog box, as shown in Fig. 2. Click the button Further. If you connect using account Windows, then set the switch Use Windows Authentication.

2. Select the database you will work with from the drop-down list (Fig. 3). The current example uses the Analysis Services Tutorial database. Once you select this database, the list below prompts you to import all the OLAP cubes available in it. Select the required data cube and click on the button Further.

Rice. 3. Select the working database and OLAP cube that you plan to use for data analysis

3. In the next wizard dialog box, shown in Fig. 4, you are required to enter descriptive information about the connection you are creating. All fields of the dialog box shown in Fig. 4, are not required to be filled out. You can always ignore the current dialog box without filling it out, and this will not affect your connection in any way.

Rice. 4. Change the connection descriptive information

4. Click the button Ready to complete the connection creation. A dialog box will appear on the screen Import data(Fig. 5). Set the switch PivotTable Report and click OK to start creating the pivot table.

OLAP cube structure

As you create a pivot table from an OLAP database, you will notice that the task pane window Pivot table fields will be different from that for a regular pivot table. The reason lies in the arrangement of the PivotTable so as to closely reflect the structure of the OLAP cube attached to it. To navigate through an OLAP cube as quickly as possible, you need to become thoroughly familiar with its components and how they interact. In Fig. Figure 6 shows the basic structure of a typical OLAP cube.

As you can see, the main components of an OLAP cube are dimensions, hierarchies, levels, members and measures:

  • Dimensions. The main characteristics of the analyzed data elements. Common examples of dimensions include Products, Customer, and Employee. In Fig. Figure 6 shows the structure of the Products dimension.
  • Hierarchies. A predefined aggregation of levels in a specified dimension. Hierarchy allows you to create summary data and analyze it at different levels of the structure, without delving into the relationships that exist between these levels. In the example shown in Fig. 6, the Products dimension has three levels, which are aggregated into a single hierarchy of Product Categories.
  • Levels. Levels are categories that are aggregated into a common hierarchy. Think of layers as data fields that can be queried and analyzed separately from each other. In Fig. 6 there are only three levels: Category, SubCategory and Product Name.
  • Members. A single data element within a dimension. Members are typically accessed through an OLAP structure of dimensions, hierarchies, and levels. In the example in Fig. 6 members are defined for the Product Name level. Other levels have their own members, which are not shown in the structure.
  • Measures- this is real data in OLAP cubes. Measures are stored in their own dimensions, which are called measure dimensions. You can query measures using any combination of dimensions, hierarchies, levels, and members. This procedure is called “slicing” measures.

Now that you're familiar with the structure of OLAP cubes, let's take a fresh look at the PivotTable field list. The organization of available fields becomes clear and does not cause any complaints. In Fig. Figure 7 shows how the field list represents the elements of an OLAP pivot table.

In the PivotTable Field List OLAP measures are displayed first and are indicated by the summation icon (sigma). These are the only data elements that can be in the VALUE region. After them in the list are indicated the dimensions, indicated by an icon with a table image. Our example uses the Customer dimension. This dimension contains a number of hierarchies. Once the hierarchy is expanded, you can view the individual data levels. To view the data structure of an OLAP cube, simply navigate through the list of fields in the pivot table.

Limitations on OLAP Pivot Tables

When working with OLAP PivotTables, remember that you interact with the PivotTable data source in the Analysis Services OLAP environment. This means that every behavioral aspect of the data cube, from the dimensions to the measures that are included in the cube, is also controlled by OLAP analytics services. In turn, this leads to restrictions on the operations that can be performed on OLAP pivot tables:

  • fields other than measures cannot be placed in the VALUES area of ​​a pivot table;
  • it is impossible to change the function used for summing up;
  • You cannot create a calculated field or calculated item;
  • any changes to field names are canceled immediately after the field is removed from the pivot table;
  • Changing page field parameters is not allowed;
  • command not available Showpages;
  • option disabled Showsignatureselements if there are no fields in the value area;
  • option disabled Subtotals by page elements selected by the filter;
  • parameter not available Backgroundrequest;
  • after double-clicking in the VALUES field, only the first 1000 records from the pivot table cache are returned;
  • checkbox disabled Optimizememory.

Creating Autonomous Data Cubes

In a standard pivot table, the source data is stored on your local hard drive. Thus, you can always manage them, as well as change the structure, even without access to the network. But this in no way applies to OLAP pivot tables. In OLAP pivot tables, the cache is not located on the local hard drive. Therefore, immediately after disconnecting from local network your OLAP pivot table will no longer work. You will not be able to move a single field in such a table.

If you still need to analyze OLAP data when you are not connected to a network, create an offline data cube. This is a separate file that represents the pivot table cache. This file stores OLAP data that is viewed after disconnecting from the local network. To create a standalone data cube, first create an OLAP pivot table. Place the cursor in the pivot table and click on the button OLAP tools contextual tab Analysis, included in the set of contextual tabs Working with Pivot Tables. Select a team Offline OLAP mode(Fig. 8).

A dialog box will appear on the screen Settings battery life OLAP(Fig. 9). Click the button Create offline data file. The first window of the Data Cube File Creation Wizard will appear on the screen. Click the button Further to continue the procedure.

In the second step (Fig. 10), indicate the dimensions and levels that will be included in the data cube. In the dialog box, you must select the data to import from the OLAP database. It is necessary to select only those dimensions that will be needed after disconnecting the computer from the local network. The more dimensions you specify, the larger the autonomous data cube will be.

Click the button Further to proceed to the third step (Fig. 11). In this window you need to select members or data elements that will not be included in the cube. If the checkbox is not selected, the specified item will not be imported and will take up unnecessary space on your local hard drive.

Specify the location and name of the data cube (Figure 12). Data cube files have the extension .cub.

After some time, Excel will save the offline data cube in the specified folder. To test it, double-click on the file, which will automatically generate an Excel workbook that contains a pivot table associated with the selected data cube. Once created, you can distribute the offline data cube to all interested users who are working in offline LAN mode.

Once connected to the local network, you can open the offline data cube file and update it and the corresponding data table. Please note that although the offline data cube is used when there is no network access, it is required to be updated when network connectivity is restored. Attempting to update an offline data cube after the network connection has been lost will result in a failure.

Using data cube functions in pivot tables

Data cube functions that are used in OLAP databases can also be run from a pivot table. In legacy versions of Excel, you only had access to data cube functionality after installing the Analysis Pack add-in. In Excel 2013, these functions are built into the program and are therefore available for use. To fully understand their capabilities, let's look at a specific example.

One of the most simple ways Learning the functions of a data cube involves converting an OLAP pivot table into data cube formulas. This procedure is very simple and allows you to quickly obtain data cube formulas without creating them from scratch. The key principle is to replace all the cells in the pivot table with formulas that are linked to the OLAP database. In Fig. Figure 13 shows a pivot table associated with an OLAP database.

Place the cursor anywhere in the pivot table, click the button OLAP tools contextual ribbon tab Analysis and select a team Convert to formulas(Fig. 14).

If your pivot table contains a report filter field, the dialog box shown in Figure 1 will appear on your screen. 15. In this window, you can specify whether you want to convert the data filter drop-down lists into formulas. If the answer is yes, the drop-down lists will be removed and static formulas will be displayed instead. If you plan to use drop-down lists in the future to change the contents of the pivot table, then clear the only checkbox in the dialog box. If you work on a PivotTable in Compatibility Mode, data filters will be converted to formulas automatically, without prior warning.

After a few seconds, instead of a pivot table, formulas will be displayed that are executed in data cubes and provide the output of the necessary information in the Excel window. Please note that this removes previously applied styles (Fig. 16).

Rice. 16. Take a look at the formula bar: the cells contain the data cube formulas

Given that the values ​​you're viewing are no longer part of the PivotTable object, you can add columns, rows, and calculated members, and combine them with other external sources, and also change the report with the most different ways, including dragging and dropping formulas.

Adding calculations to OLAP pivot tables

IN previous versions Excel OLAP pivot tables did not allow custom calculations. This means that it was not possible to add an additional level of analysis to OLAP PivotTables in the same way that it is possible to add calculated fields and members to regular PivotTables (for more information, please make sure you are familiar with this material before continuing reading ).

Excel 2013 introduces new OLAP tools - calculated measures and calculated MDX members. You are no longer limited to using measures and members in your OLAP cube provided by your DBA. You gain additional analysis capabilities by creating custom calculations.

Introduction to MDX. When you use a PivotTable with an OLAP cube, you issue MDX (Multidimensional Expressions) queries to the database. MDX is a query language used to retrieve data from multidimensional sources (such as OLAP cubes). When an OLAP PivotTable is changed or updated, the corresponding MDX queries are sent to the OLAP database. The results of the query are returned back to Excel and displayed in the PivotTable area. This makes it possible to work with OLAP data without a local copy of the pivot table cache.

When you create calculated measures and MDX members, you use MDX language syntax. Using this syntax, a pivot table allows calculations to interact with the OLAP database backend. The examples discussed in the book are based on basic MDX designs that demonstrate new Excel functions 2013. If you need to create complex calculated measures and MDX members, you'll need to spend time learning more about MDX.

Create calculated measures. A calculated measure is the OLAP version of a calculated field. The idea is to create a new data field based on some mathematical operations performed on existing OLAP fields. In the example shown in Fig. 17, an OLAP summary table is used, which includes the list and quantity of goods, as well as the income from the sale of each of them. We need to add a new measure that will calculate the average price per unit of an item.

Analysis Working with Pivot Tables. In the drop down menu OLAP tools select item (Fig. 18).

Rice. 18. Select a menu item MDX Computed Measure

A dialog box will appear on the screen Create a calculated measure(Fig. 19).

Follow these steps:

2. Select the measure group in which the new calculated measure will be located. If you don't do this, Excel will automatically place the new measure in the first available measure group.

3. In the field MDX expression(MDX) Enter the code that specifies the new measure. To speed up the entry process, use the list on the left to select existing measures to use in the calculations. Double-click the desired measure to add it to the MDX field. The following MDX is used to calculate the average unit selling price:

4. Click OK.

Pay attention to the button Check MDX, which is located in the lower right part of the window. Click this button to check that the MDX syntax is correct. If the syntax contains errors, a message appears.

Once you've finished creating your new calculated measure, go to the list Pivot table fields and select it (Fig. 20).

The scope of a calculated measure applies only to the current workbook. In other words, calculated measures are not created directly in the OLAP server cube. This means that no one will be able to access the calculated measure unless you open general access to the workbook or you will not publish it on the Internet.

Create calculated MDX members. An MDX calculated member is the OLAP version of a regular calculated member. The idea is to create a new data element based on some mathematical operations performed on existing OLAP elements. In the example shown in Fig. 22, an OLAP pivot table is used that includes sales information for 2005–2008 (with a quarterly breakdown). Let's say you want to aggregate data for the first and second quarters by creating a new element, First Half of Year. We will also combine data related to the third and fourth quarters, forming a new element Second of Year.

Rice. 22. We are going to add new MDX calculated members, First Half of Year and Second Half of Year

Place the cursor anywhere in the PivotTable and select the contextual tab Analysis from a set of contextual tabs Working with Pivot Tables. In the drop down menu OLAP tools select item MDX Computed Member(Fig. 23).

A dialog box will appear on the screen (Fig. 24).

Rice. 24. Window Creating a Calculated Item

Follow these steps:

1. Give the calculated measure a name.

2. Select the parent hierarchy for which you are creating new calculated members. At a construction site Parent element assign a value All. This setting allows Excel to access all members of the parent hierarchy when evaluating an expression.

3. In the window MDX expression Enter MDX syntax. To save some time, use the list on the left to select existing members to use in the MDX. Double-click the selected item and Excel will add it to the window MDX expression. In the example shown in Fig. 24, the sum of the first and second quarters is calculated:

..&& +

.. && +

.. && + …

4. Click OK. Excel displays the newly created MDX calculated member in the PivotTable. As shown in Fig. 25, the new calculated item is displayed along with the other calculated items in the PivotTable.

In Fig. Figure 26 illustrates a similar process used to create the Second Half of Year calculated item.

Notice that Excel doesn't even try to remove the original MDX members (Figure 27). The PivotTable continues to show records corresponding to the years 2005–2008, broken down by quarter. In this case, this is not a big deal, but in most scenarios, you should hide “extra” elements to avoid conflicts.

Rice. 27. Excel displays the created MDX calculated member as the original members. But it is still better to delete the original elements to avoid conflicts

Remember: Calculated members are only found in the current workbook. In other words, calculated measures are not created directly in the OLAP server cube. This means that no one will be able to access the calculated measure or calculated member unless you share the workbook or publish it online.

Note that if the parent hierarchy or parent element in an OLAP cube changes, the MDX calculated element no longer functions. You will need to recreate this element.

Managing OLAP calculations. Excel provides an interface that allows you to manage calculated measures and MDX members in OLAP pivot tables. Place the cursor anywhere in the PivotTable and select the contextual tab Analysis from a set of contextual tabs Working with Pivot Tables. In the drop down menu OLAP tools select item Compute Management. In the window Compute Management Three buttons are available (Fig. 28):

  • Create. Create a new calculated measure or calculated MDX member.
  • Change. Change the selected calculation.
  • Delete. Delete the selected calculation.

Rice. 28. Dialog box Compute Management

Perform what-if analysis on OLAP data. In Excel 2013, you can perform what-if analysis on data in OLAP pivot tables. Thanks to this new opportunity You can change values ​​in a PivotTable and recalculate measures and members based on your changes. You can also propagate changes back to the OLAP cube. To take advantage of what-if analysis capabilities, create an OLAP PivotTable and select the contextual tab Analysis Working with Pivot Tables. In the drop down menu OLAP tools select team What-if analysis –> Enable what-if analysis(Fig. 29).

From this point on, you can change the values ​​of the pivot table. To change the selected value in the PivotTable, right-click on it and select the item from the context menu (Fig. 30). Excel will re-run all calculations in the PivotTable with the changes you made, including calculated measures and calculated MDX members.

Rice. 30. Select an item Take the change into account when calculating the pivot table to make changes to the pivot table

By default, edits made to a PivotTable in What-If analysis mode are local. If you want to propagate changes to the OLAP server, select the command to publish changes. Select contextual tab Analysis located in a set of contextual tabs Working with Pivot Tables. In the drop down menu OLAP tools select items What-if analysis – > Publish changes(Fig. 31). As a result of executing this command, " write back» on the OLAP server, which means changes can be propagated to the source OLAP cube. (To propagate changes to the OLAP server, you must have the appropriate permissions to access the server. Contact your database administrator to help you obtain write access permissions to the OLAP database.)

The note was written based on the book by Jelen, Alexander. . Chapter 9.

/ In a cubist manner. Application of OLAP cubes in management practice of large companies


In contact with

Classmates

Konstantin Tokmachev, system architect

In a cubist style.
Application of OLAP cubes in management practice of large companies

Perhaps the time has passed when a corporation's computing resources were spent only on recording information and accounting reports. At the same time, management decisions were made “by eye” in offices, at meetings and meetings. Perhaps in Russia it’s time to return corporate computing systems to their main resource - solving management problems based on data registered in the computer

About the benefits of business analytics

In the corporate management loop, between the “raw” data and the “levers” of influencing the managed object, there are “performance indicators” - KPIs. They form a kind of “dashboard”, reflecting the state of various subsystems of the controlled object. Equipping a company with informative performance indicators and monitoring their calculation and obtained values ​​is the work of a business analyst. Automated analysis services, such as the MS SQL Server Analysis Services (SSAS) utility and its main tool, the OLAP cube, can provide significant assistance in organizing the corporation’s analytical work.

One more point needs to be made right here. Let's say, in the American tradition, a specialty focused on working with OLAP cubes is called BI (Business Intelligence). There should be no illusions that the American BI corresponds to the Russian “business analyst”. No offense, but often our business analyst is an “under-accountant” and “under-programmer”, a specialist with vague knowledge and a small salary, who really does not have any of his own tools and methodology.

A BI specialist is, in fact, an applied mathematician, a highly qualified specialist who uses modern mathematical methods for the company’s arsenal (what was called Operations Research). BI is more consistent with the specialty “system analyst” that was once in the USSR, graduated from the Faculty of Computational Mathematics and Mathematics of Moscow State University. M.V. Lomonosov. The OLAP cube and analysis services can become a promising basis for the workplace of a Russian business analyst, perhaps after some advanced training in the direction of American BI.

Recently, another harmful trend has emerged. Thanks to specialization, mutual understanding between different categories of corporation employees has been lost. An accountant, manager and programmer, like “a swan, a crayfish and a pike” in I.A.’s fable. Krylov, are pulling the corporation in different directions.

The accountant is busy with reporting; his amounts, both in meaning and in dynamics, are not directly related to the business process of the company.

The manager is busy with his part of the business process, but is not able to evaluate globally, at the level of the company as a whole, the results and prospects of his actions.

Finally, the programmer, who was once (thanks to his education) a conductor of advanced technical ideas from the sphere of science to the sphere of business, has turned into a passive executor of the fantasies of the accountant and manager, so it is no longer uncommon for the IT departments of corporations to be driven by accountants and, in general, everyone to whom not lazy. A lack of initiative, illiterate, but relatively highly paid 1C programmer is a real scourge of Russian corporations. (Almost like a domestic football player.) I’m not even talking about the so-called “economists and lawyers”; everything has been said about them a long time ago.

So, the position of a business analyst, equipped with a knowledge-intensive SSAS apparatus, proficient in the basics of programming and accounting, is capable of consolidating the work of the company in relation to the analysis and forecast of the business process.

Advantages of OLAP cubes

OLAP cube is modern remedy analysis of the corporate computer system database, which makes it possible to provide employees at all levels of the hierarchy with the required set of indicators that characterize manufacturing process companies. The point is not only that the convenient interface and flexible query language for the MDX cube (MultiDimensional eXpressions) allow you to formulate and calculate the necessary analytical indicators, but the remarkable speed and ease with which the OLAP cube does this. Moreover, this speed and ease, within certain limits, do not depend on the complexity of calculations and the size of the database.

Some introduction to OLAP-
cube can be given by a “pivot table” of MS Excel. These objects have similar logic and similar interfaces. But, as will be seen from the article, OLAP functionality is incomparably richer, and performance is incomparably higher, so the “pivot table” remains a local desktop product, while OLAP is an enterprise-level product.

Why the OLAP cube is so good for solving analytical tasks? The OLAP cube is designed in such a way that all indicators in all possible sections are pre-calculated (in whole or in part), and the user can only “pull out” the required indicators (measures) and dimensions (dimensions) with the mouse, and the program can redraw the tables.

All possible analytics in all sections form one huge field, or rather, not a field, but just a multidimensional OLAP cube. Whatever request the user (manager, business analyst, executive) turns to the analytics service, the speed of response is explained by two things: firstly, the required analytics can be easily formulated (either selected from a list by name, or specified by a formula in the MDX language ), secondly, as a rule, it has already been calculated.

The formulation of analytics is possible in three options: it is either a database field (or rather, a warehouse field), or a calculation field defined at the cube design level, or an MDX language expression when working interactively with the cube.

This means several attractive features of OLAP cubes. Essentially, the barrier between the user and the data disappears. The barrier is in the form of an application programmer, who, firstly, needs to explain the problem (set a task). Secondly, you will have to wait for the application programmer to create an algorithm, write and debug the program, and then possibly modify it. If there are many employees and their requirements are varied and changeable, then a whole team of application programmers is needed. In this sense, an OLAP cube (and a qualified business analyst) replaces an entire team of application programmers in terms of analytical work, just as a powerful excavator with an excavator operator replaces an entire team of migrant workers with shovels when digging a ditch!

At the same time, another very important quality of the obtained analytical data is achieved. Since there is only one OLAP cube for the entire company, i.e. This is the same field with analysts for everyone, which eliminates annoying discrepancies in the data. When a manager has to ask the same task to several independent employees in order to eliminate the factor of subjectivity, but they still bring different answers, which everyone undertakes to explain somehow, etc. The OLAP cube ensures uniformity of analytical data at different levels of the corporate hierarchy, i.e. if a manager wants to detail a certain indicator of interest to him, then he will certainly come to the lower-level data with which his subordinate works, and this will be precisely the data on the basis of which the higher-level indicator was calculated, and not some other data, received in some other way, at some other time, etc. That is, the entire company sees the same analytics, but at different levels of aggregation.

Let's give an example. Let's say a manager controls accounts receivable. As long as the KPI for overdue receivables is green, it means everything is normal and no management actions are required. If the color has changed to yellow or red, something is wrong: we cut the KPIs by sales departments and immediately see the departments “in red”. The next section by managers - and the seller whose clients are behind on payments is identified. (Further, the overdue amount can be divided by customers, by terms, etc.) The head of the corporation can directly contact the violators at any level. But in general, the same KPI (at their hierarchy levels) is seen by both department heads and sales managers. Therefore, in order to correct the situation, they don’t even need to wait for a “call on the carpet”... Of course, the KPI itself does not necessarily have to be the amount of overdue payments - it can be the weighted average period of overdue payments or, in general, the rate of turnover of receivables.

Let us note that the complexity and flexibility of the MDX language, together with the fast (sometimes instantaneous) results, allows us to solve (taking into account the stages of development and debugging) complex control problems that otherwise might not have been posed at all due to the complexity for application programmers and initial uncertainty in the formulation. (Lengthy deadlines for application programmers to solve analytical problems due to poorly understood formulations and long modifications of programs when conditions change are often encountered in practice.)

Let us also pay attention to the fact that each employee of the company can collect from the general field an OLAP analyst exactly the harvest that he needs for his work, and not be content with the “strip” that is cut out for him in communal “standard reports”.

The multi-user interface for working with an OLAP cube in client-server mode allows each employee, independently of others, to have their own (even self-made with some skill) analytics blocks (reports), which, once defined, are automatically updated - in other words, they are always up to date condition.

That is, the OLAP cube allows you to make analytical work (which is actually carried out not only by reception analysts, but, in fact, by almost all employees of the company, even logisticians and managers who control balances and shipments) more selective, “not in general terms” , which creates conditions for improving work and increasing productivity.

To summarize our introduction, we note that the use of OLAP cubes can raise the management of a company to a higher level. The uniformity of analytical data at all levels of the hierarchy, their reliability, complexity, ease of creating and modifying indicators, individual settings, high speed of data processing, and finally, saving money and time spent on supporting alternative analytical paths (application programmers, employee’s independent calculations) open up prospects for the use of OLAP cubes in the practice of large Russian companies.

OLTP + OLAP: outline feedback in the company management chain

Now let's look at the general idea of ​​OLAP cubes and their point of application in the corporate management chain. The term OLAP (OnLine Analytical Processing) was introduced by the British mathematician Edgar Codd in addition to his previously introduced term OLTP (OnLine Transactions Processing). This will be discussed later, but E. Codd, of course, proposed not only the terms, but also the mathematical theories of OLTP and OLAP. Without going into details, in the modern interpretation, OLTP is a relational database, considered as a mechanism for recording, storing and retrieving information.

Solution methodology

ERP systems (Enterprice Resource Planning), such as 1C7, 1C8, MS Dynamics AX, have user-oriented software interfaces (entering and editing documents, etc.) and a relational database (DB) for storing and retrieving information , represented today by software products such as MS SQL Server (SS).

Note that the information registered in the ERP system database is indeed a very valuable resource. The point is not only that the registered information ensures the current document flow of the corporation (extracting documents, adjusting them, the ability to print and reconcile, etc.) and not only the ability to calculate financial statements (taxes, audit, etc.). From a management point of view, it is much more important that the OLTP system (relational database) is, in fact, an actual life-size digital model of the corporation’s activities.

But to manage the process, it is not enough to register information about it. The process should be presented in the form of a system of numerical indicators (KPIs) characterizing its progress. In addition, acceptable ranges of values ​​must be defined for indicators. And only if the value of the indicator falls outside the permissible interval, a control action should follow.

Regarding this logic (or mythology) of control (“control by deviation”), both the ancient Greek philosopher Plato, who created the image of the helmsman (cybernose), who leans on the oar when the boat deviates from the course, and the American mathematician Norbert Wiener, who created the science of cybernetics on the eve of computer era.

In addition to the usual system for recording information using the OLTP method, another system is needed - a system for analyzing the collected information. This add-on, which in the control loop plays the role of feedback between management and the control object, is an OLAP system or, in short, an OLAP cube.

As a software implementation of OLAP, we will consider the MS Analysis Services utility, which is part of the standard delivery of MS SQL Server, abbreviated SSAS. Note that, according to E. Codd’s plan, the OLAP cube in analytics should give the same comprehensive freedom of action that the OLTP system and the relational database (SQL Server) provide in storing and retrieving information.

OLAP Logistics

Now let's look at the specific configuration external devices, application programs and technological operations on which the automated operation of the OLAP cube is based.

We will assume that the corporation uses an ERP system, for example, 1C7 or 1C8, within which information is recorded as usual. The database of this ERP system is located on a certain server and is supported by MS SQL Server.

We will also assume that another server has software installed, including MS SQL Server with the MS Analysis Services (SSAS) utility, as well as MS SQL Server Management Studio, MS C#, MS Excel and MS Visual Studio. These programs together form the required context: the tools and necessary interfaces for the developer of OLAP cubes.

The SSAS server has a freely distributed program called blat, called (with parameters) from command line and providing postal service.

On employee workstations within the local network, among other things, MS Excel programs (versions no less than 2003) are installed, as well as, possibly, a special driver to ensure MS Excel works with MS Analysis Services (unless the corresponding driver is already included in MS Excel ).

For definiteness, we will assume that an operating system is installed at employee workstations. Windows system XP, and on servers - Windows Server 2008. In addition, let MS SQL Server 2005 be used as the SQL Server, with Enterprise Edition (EE) or Developer Edition (DE) installed on the server with the OLAP cube. In these editions it is possible to use the so-called. “semi-additive measures”, i.e. additional aggregate functions (statistics) other than ordinary sums (for example, extremum or average).

OLAP cube design (OLAP cubism)

Let's say a few words about the design of the OLAP cube itself. In the language of statistics, an OLAP cube is a set of performance indicators calculated in all necessary sections, for example, the shipment indicator in sections by customers, by goods, by dates, etc. Due to direct translation from English in Russian literature on OLAP cubes, indicators are called “measures”, and sections are called “dimensions”. This is a mathematically correct, but syntactically and semantically not very successful translation. The Russian words “measure”, “dimension”, “dimension” are almost the same in meaning and spelling, while the English “measure” and “dimension” are different in both spelling and meaning. Therefore, we give preference to the traditional Russian statistical terms “indicator” and “cut”, which are similar in meaning.

There are several options for software implementation of an OLAP cube in relation to the OLTP system where data is recorded. We will consider only one scheme, the simplest, most reliable and fastest.

In this design, OLAP and OLTP do not share tables, and OLAP analytics are calculated in as much detail as possible during the cube update (Process) stage, which precedes the usage stage. This scheme is called MOLAP (Multidimensional OLAP). Its disadvantages are asynchrony with ERP and high memory costs.

Although formally an OLAP cube can be built using all (thousands) of ERP system relational database tables as a data source and all (hundreds) of their fields as indicators or sections, in reality this should not be done. Vice versa. To load into a cube, it is more correct to prepare a separate database, called a “showcase” or “warehouse”.

Several reasons force us to do this.

  • Firstly, Linking an OLAP cube to tables in a real database will certainly create technical problems. Changing data in a table can trigger a refresh of the cube, and refreshing a cube is not necessarily a fast process, so the cube will be in a state of constant rebuilding; At the same time, the cube update procedure can block (when reading) the data of the database tables, slowing down the work of users in registering data in the ERP system.
  • Secondly, Having too many indicators and cuts will dramatically increase the storage area of ​​the cube on the server. Let’s not forget that the OLAP cube stores not only the source data, as in the OLTP system, but also all the indicators summed up over all possible sections (and even all combinations of all sections). In addition, the speed of updating the cube and, ultimately, the speed of building and updating analytics and user reports based on them will slow down accordingly.
  • Third, too many fields (indicators and sections) will create problems in the OLAP developer interface, because the lists of elements will become immense.
  • Fourthly, The OLAP cube is very sensitive to data integrity violations. The cube cannot be built if the key data is not located at the link specified in the structure of the cube field connections. Temporary or permanent integrity violations, empty fields are common in an ERP system database, but this is absolutely not suitable for OLAP.

You can also add that the ERP system and the OLAP cube should be located on different servers to share the load. But then, if there are common tables for OLAP and OLTP, the problem of network traffic also arises. Practically insoluble problems arise in this case when it is necessary to consolidate several disparate ERP systems (1C7, 1C8, MS Dynamics AX) into one OLAP cube.

Probably, we can continue to pile up technical problems. But most importantly, remember that, unlike OLTP, OLAP is not a means of recording and storing data, but an analytics tool. This means that there is no need to upload and download “dirty” data from ERP to OLAP “just in case.” On the contrary, you must first develop a concept for managing the company, at least at the level of the KPI system, and then design an application data warehouse (warehouse), located on the same server as the OLAP cube, and containing a small, refined amount of data from ERP necessary for management .

Without promoting bad habits, the OLAP cube in relation to OLTP can be likened to the well-known “still”, through which a “pure product” is extracted from the “fermented mass” of real registration.

So, we got that the data source for OLAP is a special database (warehouse), located on the same server as OLAP. Generally this means two things. First, there must be special procedures that will create a warehouse from ERP databases. Secondly, the OLAP cube is asynchronous with its ERP systems.

Taking into account the above, we propose the following version of the computing process architecture.

Solution architecture

Suppose there are many ERP systems of a certain corporation (holding) located on different servers, the analytical data for which we would like to see consolidated within one OLAP cube. We emphasize that in the technology described, we combine data from ERP systems at the warehouse level, leaving the design of the OLAP cube unchanged.

On the OLAP server we create images (blank copies) of the databases of all these ERP systems. We periodically (nightly) perform partial replication of the corresponding active ERP databases onto these empty copies.

Next, SP (stored procedure) is launched, which, on the same OLAP server without network traffic, based on partial replicas of ERP system databases, creates (or replenishes) a warehouse (warehouse) - the data source of the OLAP cube.

Then the standard procedure for updating/building a cube based on warehouse data is launched (Process operation in the SSAS interface).

Let us comment on certain aspects of the technology. What kind of work do SPs do?

As a result of partial replication, current data appears in the image of some ERP system on the OLAP server. By the way, partial replication can be performed in two ways.

Firstly, from all the tables in the ERP system database, during partial replication, only those that are needed to build a warehouse are copied. This is controlled by a fixed list of table names.

Secondly, partial replication may also mean that not all fields of the table are copied, but only those that are involved in building the warehouse. The list of fields to copy is either specified or dynamically created in SP in the image of the copy (if not all fields are initially present in the copy of the table).

Of course, it is possible not to copy entire table rows, but only to add new records. However, this creates serious inconveniences when accounting for ERP revisions “retroactively,” which is often the case in real-life systems. So it’s easier, without further ado, to copy all records (or update the “tail” starting from a certain date).

Next, the main task of SP is to convert ERP system data to warehouse format. If there is only one ERP system, then the task of conversion mainly comes down to copying and possibly reformatting the necessary data. But if it is necessary to consolidate several ERP systems of different structures in the same OLAP cube, then the transformations become more complicated.

The task of consolidating several different ERP systems in a cube is especially difficult if the sets of their objects (directories of goods, contractors, warehouses, etc.) partially overlap, the objects have the same meaning, but are naturally described differently in directories of different systems (in sense of codes, identifiers, names, etc.).

In reality, such a picture arises in a large holding company, when several of its constituent autonomous companies of the same type carry out approximately the same types of activities in approximately the same territory, but use their own and non-agreed registration systems. In this case, when consolidating data at the warehouse level, you cannot do without auxiliary mapping tables.

Let's pay some attention to the warehouse storage architecture. Typically, an OLAP cube schema is represented in the form of a “star”, i.e. as a data table surrounded by “rays” of directories - tables of secondary key values. A table is a block of “indicators”; reference books are their sections. In this case, the directory, in turn, can be an arbitrary unbalanced tree or a balanced hierarchy, for example, a multi-level classification of goods or contractors. In an OLAP cube, the numeric fields of a data table from a warehouse automatically become “indicators” (or measures), and sections (or dimensions) can be defined using secondary key tables.

This is a visual “pedagogical” description. In fact, the architecture of an OLAP cube can be much more complex.

Firstly, a warehouse can consist of several “stars”, possibly connected through common directories. In this case, the OLAP cube will be a union of several cubes (several data blocks).

Secondly, the “ray” of an asterisk can be not just one directory, but an entire (hierarchical) file system.

Thirdly, on the basis of existing dimension sections, new hierarchical sections can be defined using the OLAP developer interface tools (say, with fewer levels, with a different order of levels, etc.)

Fourthly, based on existing indicators and sections, using MDX language expressions, new indicators (calculations) can be defined. It is important to note that new cubes, new indicators, new sections are automatically fully integrated with the original elements. It should also be noted that poorly formulated calculations and hierarchical sections can significantly slow down the operation of an OLAP cube.

MS Excel as an interface with OLAP

Of particular interest is the user interface with OLAP cubes. Naturally, the most complete interface is provided by the SSAS utility itself. This includes an OLAP cube developer toolkit, an interactive report designer, and a window interactive work with an OLAP cube using MDX queries.

In addition to SSAS itself, there are many programs that provide an interface to OLAP, covering their functionality to a greater or lesser extent. But among them there is one, which, in our opinion, has undeniable advantages. This is MS Excel.

The interface with MS Excel is provided by a special driver, downloadable separately or included in the Excel distribution. It does not cover all OLAP functionality, but as MS Excel version numbers grow, this coverage becomes wider (for example, in MS Excel 2007 it appears graphic image KPI, which was not in MS Excel 2003, etc.).

Of course, in addition to its fairly complete functionality, the main advantage of MS Excel is the widespread distribution of this program and the close familiarity with it of the overwhelming number of office users. In this sense, unlike other interface programs, the company does not need to purchase anything additional and does not need to train anyone additionally.

The great advantage of MS Excel as an interface with OLAP is the ability to further independently process the data obtained in the OLAP report (i.e., continue to study data obtained from OLAP on other sheets of the same Excel, no longer using OLAP tools, but using regular Excel tools).

Facubi nightly treatment cycle

Now we will describe the daily (nightly) computational cycle of OLAP operation. The calculation is carried out under the control of the facubi program, written in C# 2005 and launched via Task Scheduler on a server with warehouse and SSAS. At the beginning, facubi goes to the Internet and reads current exchange rates (used to represent a number of indicators in a currency). Next, perform the following steps.

First, facubi launches SPs that perform partial replication of the databases of various ERP systems (holding elements) available on the local network. Replication is performed, as we said, to pre-prepared “backgrounds” - images of remote ERP systems located on the SSAS server.

Secondly, through SP, a mapping is performed from ERP replicas to the warehouse storage - a special DB, which is the source of OLAP cube data and located on the SSAS server. In this case, three main tasks are solved:

  • ERP data adjusted to the required cube formats; we're talking about both about tables and table fields. (Sometimes the required table needs to be “fashioned,” say, from several MS Excel sheets.) Similar data may have different formats in different ERPs, for example, key ID fields in 1C7 directories have a 36-digit character code of length 8, and _idrref fields in directories 1С8 – hexadecimal numbers of length 32;
  • during processing logical data control is carried out (including writing defaults in place of missing data, where possible) and integrity control, i.e. checking the presence of primary and secondary keys in the corresponding classifiers;
  • code consolidation objects that have the same meaning in different ERPs. For example, the corresponding elements of directories of different ERPs may have the same meaning, say, they are the same counterparty. The problem of consolidating codes is solved by constructing mapping tables, where various codes the same objects are brought to unity.

Thirdly, facubi launches standard procedure updating Process cube data (from the SSAS utility procedures).

Based on the checklists, facubi sends emails about the progress of processing steps.

After executing facubi, Task Scheduler launches several excel files, in which reports are pre-created based on OLAP cube indicators. As we said, MS Excel has a special software interface(separately downloadable or built-in driver) for working with OLAP cubes (with SSAS). When you start MS Excel, MS VBA programs (such as macros) are activated, which ensure that data in reports is updated; reports are modified if necessary and sent by mail (blat program) to users according to checklists.

Local network users with access to the SSAS server will receive “live” reports configured for the OLAP cube. (In principle, they themselves, without any mail, can update OLAP reports in MS Excel that are on their local computers.) Users outside the local network will either receive original reports, but with limited functionality, or for them (after updating OLAP reports in MS Excel) special “dead” reports will be calculated that do not access the SSAS server.

Evaluation of results

We talked above about the asynchrony of OLTP and OLAP. In the technology variant under consideration, the OLAP cube update cycle is performed at night (say, it starts at 1 am). This means that in the current working day, users are working with yesterday's data. Since OLAP is not a recording tool (look at the latest revision of the document), but a management tool (understand the trend of the process), such a lag is usually not critical. However, if necessary, even in the described version of the cube architecture (MOLAP), updates can be carried out several times a day.

The execution time of update procedures depends on the design features of the OLAP cube (more or less complexity, more or less successful definitions of indicators and sections) and on the volume of databases of external OLTP systems. According to experience, the warehouse construction procedure takes from several minutes to two hours, the cube update procedure (Process) takes from 1 to 20 minutes. We are talking about complex OLAP cubes that unite dozens of star-type structures, dozens of common “rays” (reference sections) for them, and hundreds of indicators. Estimating the volume of databases of external ERP systems based on shipping documents, we are talking about hundreds of thousands of documents and, accordingly, millions of product lines per year. The historical processing depth of interest to the user was three to five years.

The described technology is used in a number of large corporations: since 2008 in the Russian Fish Company (RRK) and the Russian Sea company (RM), since 2012 in the Santa Bremor company (SB). Some corporations are primarily trading and purchasing firms (PPCs), others are production companies (fish and seafood processing plants in the Republic of Moldova and the Republic of Belarus). All corporations are large holdings, uniting several companies with independent and various computer accounting systems - ranging from standard ERP systems such as 1C7 and 1C8 to “relic” accounting systems based on DBF and Excel. I will add that the described technology for operating OLAP cubes (without taking into account the development stage) either does not require special employees at all, or is the responsibility of one full-time business analyst. The problem has been swirling around for years automatic mode, providing various categories of corporate employees with up-to-date reporting on a daily basis.

Pros and cons of the solution

Experience shows that the proposed solution is quite reliable and easy to use. It is easily modified (connection/disconnection of new ERPs, creation of new indicators and sections, creation and modification of Excel reports and their mailing lists) with invariance control program facubi.

MS Excel as an interface with OLAP provides sufficient expressiveness and allows different categories of office employees to quickly become familiar with OLAP technology. The user receives daily “standard” OLAP reports; using the MS Excel interface with OLAP, can independently create OLAP reports in MS Excel. In addition, the user can independently continue to study the information of OLAP reports using the usual capabilities of his MS Excel.

The “refined” warehouse database, in which several heterogeneous ERP systems are consolidated (during the construction of the cube), even without any OLAP allows you to solve (on the SSAS server, using the query method in the language Transact SQL or the SP method, etc.) many applied control problems. Let us recall that the warehouse database structure is unified and much simpler (in terms of the number of tables and the number of table fields) than the database structures of the original ERP.

We especially note that in our proposed solution there is the possibility of consolidating various ERP systems in one OLAP cube. This allows you to obtain analytics for the entire holding and maintain long-term continuity in analytics when a corporation moves to another accounting ERP system, say, when moving from 1C7 to 1C8.

We used the MOLAP cube model. The advantages of this model are reliability in operation and high speed of processing user requests. Disadvantages: OLAP and OLTP are asynchronous, as well as large amounts of memory for storing OLAP.

In conclusion, here is another argument in favor of OLAP that might have been more appropriate in the Middle Ages. Because its evidentiary power rests on authority. A modest, clearly underrated British mathematician E. Codd developed the theory of relational databases in the late 60s. The power of this theory was such that now, after 50 years, it is already difficult to find a non-relational database and a database query language other than SQL.

OLTP technology, based on the theory of relational databases, was the first idea of ​​E. Codd. In fact, the concept of OLAP cubes is his second idea, expressed by him in the early 90s. Even without being a mathematician, you can quite expect that the second idea will be as effective as the first. That is, in terms of computer analytics, OLAP ideas will soon take over the world and displace all others. Simply because the topic of analytics finds its comprehensive mathematical solution in OLAP, and this solution is “adequate” (B. Spinoza’s term) to the practical problem of analytics. “Adequately” means in Spinoza that God himself could not have thought of anything better...

  1. Larson B. Development of business analytics in Microsoft SQL Server 2005. – St. Petersburg: “Peter”, 2008.
  2. Codd E. Relational Completeness of Data Base Sublanguages, Data Base Systems, Courant Computer Science Sumposia Series 1972, v. 6, Englwood cliffs, N.Y., Prentice – Hall.

In contact with

Data is usually sparse and long-term stored. It can be implemented on the basis of universal relational DBMS or specialized software (see also OLAP). IN software products SAP company uses the term “infocube”.

Array indices correspond to dimensions (dimensions) or axes of the cube, and values ​​of array elements correspond to measures (measures) of the cube.

w : (x,y,z) → w xyz,

Where x, y, z- measurements, w- measure.

Unlike a regular array in a programming language, access to the elements of an OLAP cube can be carried out either by the full set of index dimensions or by their subset, and then the result will be not one element, but many of them.

W : (x,y) → W = ( w z1, w z2, …, w zn}

Also known description OLAP cube using relational algebra terminology as a projection of relations.

see also


Wikimedia Foundation. 2010.

  • Star diagram
  • Our home is Russia (faction)

See what an “OLAP cube” is in other dictionaries:

    OLAP cube- ... Wikipedia

    OLAP- (eng. online analytical processing, analytical processing in real time) data processing technology, which consists in the preparation of summary (aggregated) information based on large amounts of data, structured by ... ... Wikipedia

    Cube (disambiguation)- Cube is a multi-valued term: In mathematics In stereometry, a cube is a hexagonal regular polyhedron In algebra, the third power of a number Film Series of science fiction films: “Cube” “Cube 2: Hypercube” “Cube Zero” Slang and jargon medical... ... Wikipedia

    Cube- This term has other meanings, see Cube (meanings). Cube Type Regular polyhedron Face square ... Wikipedia

    Mondrian- OLAP Server Type OLAP server Developer Pentaho operating system cross-platform software Latest version 3.4.1 (2012 05 07) License free software ... Wikipedia - Information analytical system automated system allowing experts to quickly analyze large volumes of data, as a rule, is one of the elements of situation centers. Also, sometimes the IAS includes a collection system... ... Wikipedia

As part of this work, the following issues will be considered:

  • What are OLAP cubes?
  • What are measures, dimensions, hierarchies?
  • What types of operations can be performed on OLAP cubes?
The concept of an OLAP cube

The main postulate of OLAP is multidimensionality in data presentation. In OLAP terminology, the concept of a cube, or hypercube, is used to describe a multidimensional discrete data space.

Cube is a multidimensional data structure from which the user-analyst can query information. Cubes are created from facts and dimensions.

Data- this is data about objects and events in the company that will be subject to analysis. Facts of the same type form measures. A measure is the type of value in a cube cell.

Measurements- these are the data elements by which the facts are analyzed. A collection of such elements forms a dimension attribute (for example, days of the week can form a time dimension attribute). In business analysis tasks for commercial enterprises, the dimensions often include categories such as “time”, “sales”, “products”, “customers”, “employees”, “geographic location”. Dimensions are most often hierarchical structures, representing logical categories by which the user can analyze actual data. Each hierarchy can have one or more levels. Thus, the hierarchy of the “geographic location” dimension may include the levels: “country - region - city”. In the time hierarchy, we can distinguish, for example, the following sequence of levels: A dimension can have several hierarchies (each hierarchy of one dimension must have the same key attribute of the dimension table).

A cube can contain actual data from one or more fact tables and most often contains multiple dimensions. Any given cube usually has a specific focus for analysis.

Figure 1 shows an example of a cube designed to analyze sales of petroleum products by a certain company by region. This cube has three dimensions (time, product and region) and one measure (sales volume expressed in monetary terms). Measure values ​​are stored in the corresponding cells of the cube. Each cell is uniquely identified by a set of members of each dimension, called a tuple. For example, the cell located in the lower left corner of the cube (contains the value $98399) is specified by the tuple [July 2005, Far East, Diesel]. Here the value of $98,399 shows the sales volume (in monetary terms) of diesel in the Far East for July 2005.

It is also worth noting that some cells do not contain any values: these cells are empty because the fact table does not contain data for them.

Rice. 1. Cube with information on sales of petroleum products in various regions

The ultimate goal of creating such cubes is to minimize the processing time of queries that extract the required information from the actual data. To accomplish this task, cubes typically contain precomputed totals called aggregations(aggregations). Those. the cube covers a data space larger than the actual one - there are logical, calculated points in it. Aggregation functions allow you to calculate the values ​​of points in logical space based on actual values. The simplest aggregation functions are SUM, MAX, MIN, COUNT. So, for example, using MAX function, for the cube given in the example, it is possible to identify when the peak in diesel sales occurred in the Far East, etc.

Another specific feature of multidimensional cubes is the difficulty of determining the origin. For example, how do you set point 0 for the Product or Regions dimension? The solution to this problem is to introduce a special attribute that combines all the elements of the dimension. This attribute (created automatically) contains only one element - All. For simple aggregation functions such as sum, the All element is equivalent to the sum of the values ​​of all elements in the actual space of a given dimension.

An important concept in a multidimensional data model is the subspace, or sub cube. A subcube is a part of the full space of a cube in the form of some multidimensional figure inside the cube. Since the multidimensional space of a cube is discrete and limited, the subcube is also discrete and limited.

Operations on OLAP cubes

The following operations can be performed on an OLAP cube:

  • slice;
  • rotation;
  • consolidation;
  • detailing.
Slice(Figure 2) is a special case of a subcube. This is a procedure for forming a subset of a multidimensional data array corresponding to a single value of one or more dimension elements not included in this subset. For example, to find out how sales of petroleum products progressed over time only in a certain region, namely in the Urals, you need to fix the “Products” dimension on the “Ural” element and extract the corresponding subset (subcube) from the cube.
  • Rice. 2. OLAP cube slice

    Rotation(Figure 3) - the operation of changing the location of measurements presented in a report or on the displayed page. For example, a rotation operation may involve rearranging the rows and columns of a table. Additionally, rotating a data cube moves out-of-tabular dimensions into place with dimensions present on the displayed page, and vice versa.

    OLAP (On-Line Analytical Processing) is a method of electronic analytical data processing that represents the organization of data into hierarchical categories using pre-calculated totals. OLAP data is organized hierarchically and is stored in cubes rather than tables. OLAP cubes are a multidimensional data set with axes containing parameters and cells containing parameter-dependent aggregate data. Cubes are designed for complex multidimensional analysis of large volumes of data because they provide only summary results for reporting, instead of a large number of individual records.

    The concept of OLAP was described in 1993 by the famous database researcher and author of the relational data model E. F. Codd. Currently, OLAP support is implemented in many DBMSs and other tools.

    An OLAP cube contains two types of data:

    · total values, values ​​for which you want to summarize, representing calculated data fields;

    · descriptive information representing measurements or dimensions. Descriptive information is typically organized into levels of detail. For example: “Year”, “Quarter”, “Month” and “Day” in the “Time” dimension. Organizing fields into levels of detail allows reporting users to choose the level of detail they want to view, starting with high-level summary data and then drilling down to a more detailed view, and vice versa.

    Microsoft Query tools also allow you to create OLAP cubes from a query that loads relational database data, such as Microsoft Access, in this case the linear table is transformed into a structural hierarchy (cube).

    The Create OLAP Cube Wizard is a built-in Microsoft Query tool. To create an OLAP cube based on a relational database, you must complete the following steps before running the wizard.

    1. Determine the data source (see Figure 6.1).

    2. Using Microsoft Query, create a query, including only those fields that will be either data fields or dimension fields of an OLAP cube; if a field in a cube is used more than once, then it must be included in the query the required number of times.

    3. At the last step of the query creation wizard, set the switch on the item Creating an OLAP cube from of this request (see Fig. 6.2) or after the request is created using the Query menu directly File select a team Create OLAP Cube, after which the Create OLAP Cube Wizard will be launched.

    The Create OLAP Cube Wizard consists of three steps.

    At the first step of the wizard (see Fig. 6.6) the data fields– calculated fields for which total values ​​must be determined.



    Rice. 6.6. Defining Data Fields

    The wizard places expected calculated fields (usually numeric fields) at the top of the list, checks them, and determines the resulting function of these fields, usually - Sum. When selecting data fields, at least one field must be selected as a calculated field and at least one field must be left unchecked to determine the dimension.

    When creating an OLAP cube, you can use four summary functions − Sum, Number(number of values), Minimum, Maximum for numeric fields and one function Number for all other fields. If you want to use several different summary functions of the same field, that field must be included in the query the required number of times.

    The name of a calculated field can be changed in a column Data field name.

    At the second step of the wizard, descriptive data and their dimensions are determined (see Fig. 6.7). To select a measurement field, you must from the list Source fields drag the desired dimension field top level to the list Measurements to the area marked as Drag fields here to create dimensions. To create an OLAP cube, you must define at least one dimension. At the same step of the wizard, using context menu You can change the name of a dimension or level field.

    Rice. 6.7. Defining Dimension Fields

    Fields that contain isolated or discrete data and do not belong to a hierarchy can be defined as single-level dimensions. However, the cube will be more efficient if some of the fields are organized into levels. To create a level as part of a dimension, drag a field from the list Source fields on a field that is a dimension or level. Fields containing more detailed information should be placed at lower levels. For example, in Figure 6.7 the field Job title is the field level Department Name.

    To move a field to a lower or higher level, you need to drag it to a lower or higher field within the dimension. To display or hide levels, use the or buttons, respectively.

    If you use date or time fields as the top-level dimension, the OLAP Cube Wizard automatically creates levels for those dimensions. The user can then select which levels should appear in the reports. For example, you can select weeks, quarters and years, or months (see Figure 6.7).

    Keep in mind that the wizard automatically creates levels for date and time fields only when you create a top-level dimension; When adding these fields as sublevels of a dimension, automatic levels are not created.

    At the third step of the wizard, the type of cube created by the wizard is determined, with three options possible (see Fig. 6.8).

    Rice. 6.8. Selecting the type of cube to be created at the third step of the wizard

    · The first two options involve creating a cube each time you open a report (if the cube is viewed from Excel, then we are talking about a pivot table). In this case, the request file and the file cube definitions *.oqy, which contains instructions for creating a cube. The *.oqy file can be opened in Excel program to create reports based on the cube, and if you need to make changes to the cube, you can open Query to run the Create Cube Wizard again.

    By default, cube definition files, like query files, are stored in the user profile folder in Application Data\Microsoft\Que-ries. When saving a *.oqy file in the standard folder, the name of the cube definition file is displayed on the tab OLAP cubes when opening a new query in Microsoft Query or when selecting a command Create a request(menu Data, submenu Importing external data) in Microsoft Excel.

    · In case of choosing the third option of cube type Saving a cube file containing all the data for the cube, all data for the cube is retrieved and a cube file with the extension * is created in a user-specified location .cub, in which this data is stored. Creation of this file does not happen immediately when the button is pressed Ready; the file is created either when you save the cube definition to a file or when you create a report based on the cube.

    The choice of cube type is determined by several factors: the amount of data the cube contains; the type and complexity of reports that will be created based on the cube; system resources (memory and disk space), etc.

    A separate *.cub cube file should be created in the following cases:

    1) for frequently changed interactive reports if there is sufficient disk space;

    2) when you need to save the cube on a network server to provide access to it for other users when creating reports. A cube file can provide specific data from the source database while omitting sensitive or sensitive data that you want to prevent other users from accessing.



  • 
    Top