Multidimensional cubes, OLAP and MDX. Calculated fields and calculated items


An OLAP cube is a multidimensional array of data, usually sparse and stored long-term. It can be implemented on top of a general-purpose relational DBMS or with specialized software. SAP software products use the term “InfoCube”.

The indices of the array correspond to the dimensions (axes) of the cube, and the values of the array elements correspond to the measures of the cube.

$w : (x, y, z) \to w_{xyz}$,

where $x$, $y$, $z$ are dimensions and $w$ is the measure.

Unlike a regular array in a programming language, the elements of an OLAP cube can be accessed either by the full set of dimension indices or by a subset of them; in the latter case the result is not a single element but a set of elements.

$W : (x, y) \to W = \{w_{z_1}, w_{z_2}, \ldots, w_{z_n}\}$
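The two access modes can be sketched in Python. This is only an illustration: it assumes the cube is held as a dictionary keyed by index tuples, and the dimension names and figures are made up.

```python
# Hypothetical cube: keys are (product, branch, year) index tuples, values are the measure.
cube = {
    ("ink", "north", 2009): 120.0,
    ("ink", "south", 2009): 80.0,
    ("ink", "north", 2010): 150.0,
    ("paper", "north", 2009): 40.0,
}

def lookup(cube, x, y, z):
    """Full index set: returns a single cell, or None if the cell is empty."""
    return cube.get((x, y, z))

def lookup_partial(cube, x, y):
    """Subset of indices: returns every filled cell along the free z axis."""
    return {z: w for (cx, cy, z), w in cube.items() if (cx, cy) == (x, y)}

print(lookup(cube, "ink", "north", 2009))
print(lookup_partial(cube, "ink", "north"))
```

Storing only the filled cells also reflects the sparseness mentioned above: an empty cell costs nothing and simply yields no value.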

An OLAP cube can also be described in the terminology of relational algebra, as a projection of relations.



Wikimedia Foundation. 2010.


For some, using OLAP (On-Line Analytical Processing) technology to build reports may seem somewhat exotic, so an OLAP-CUBE is far from the most important requirement for them when automating budgeting and management accounting.

In fact, a multidimensional CUBE is very convenient for working with management reporting. When developing budget formats, you may run into the problem of multivariate forms (you can read more about this in Book 8, “Technology for setting up budgeting in a company,” and in the book “Setting up and automating management accounting”).

This is because effective management of a company requires increasingly detailed management reporting. That is, the system uses more and more analytical sections (in information systems, analytical sections are defined by a set of directories).

Naturally, managers then want to receive reporting in every analytical section that interests them. This means the reports must somehow be made to “breathe”. In other words, one and the same report should, within its meaning, provide information in different analytical sections. Static reports therefore no longer suit many modern managers. They need the dynamics that a multidimensional CUBE can provide.

Thus, OLAP technology has already become a mandatory element of modern, advanced information systems. So when choosing a software product, check whether it uses OLAP technology.

Moreover, you need to be able to distinguish real CUBES from imitations. One such imitation is the pivot table in MS Excel. Yes, this tool looks like a CUBE, but in fact it is not one, since pivot tables are static rather than dynamic. In addition, their ability to build reports from elements of hierarchical directories is implemented much worse.

To confirm the relevance of using a CUBE for management reporting, consider the simplest example: a sales budget. In this example, the following analytical sections are relevant for the company: products, branches and sales channels. If these three sections are important for the company, the sales budget (or report) can be displayed in several versions.

It should be noted that creating budget lines based on three analytical sections (as in this example) allows you to build quite complex budget models and detailed reports using a CUBE.

For example, a sales budget can be compiled using only one analytical section (directory). An example of a sales budget built on the single analytical section “Products” is shown in Figure 1.

Fig. 1. An example of a sales budget built on the basis of one analytical section, “Products”, in the OLAP-CUBE

The same sales budget can be compiled using two analytical sections (directories). An example of a sales budget built on the two analytical sections “Products” and “Branches” is shown in Figure 2.

Fig. 2. An example of a sales budget built on the basis of two analytical sections, “Products” and “Branches”, in the OLAP-CUBE of the INTEGRAL software package

If more detailed reports are needed, the same sales budget can be compiled using three analytical sections (directories). An example of a sales budget built on the three analytical sections “Products”, “Branches” and “Sales Channels” is shown in Figure 3.

Fig. 3. An example of a sales budget built on the basis of three analytical sections, “Products”, “Branches” and “Sales Channels”, in the OLAP-CUBE of the INTEGRAL software package

Recall that the CUBE used to generate reports allows you to display data in different sequences. In Figure 3 the sales budget is first “expanded” by product, then by branch, and then by sales channel.

The same data can be presented in a different sequence. In Figure 4 the same sales budget is “expanded” first by product, then by sales channel, and then by branch.

Fig. 4. An example of a sales budget built on the basis of three analytical sections, “Products”, “Sales Channels” and “Branches”, in the OLAP-CUBE of the INTEGRAL software package

In Figure 5 the same sales budget is “expanded” first by branch, then by product, and then by sales channel.

Fig. 5. An example of a sales budget built on the basis of three analytical sections, “Branches”, “Products” and “Sales Channels”, in the OLAP-CUBE of the INTEGRAL software package

In fact, these are not all the possible ways to display the sales budget.

In addition, note that the CUBE allows you to work with the hierarchical structure of directories. In the examples presented, the hierarchical directories are “Products” and “Sales Channels”.

From the user's point of view, this example yields several management reports (see Fig. 1-5), but from the point of view of the settings in the software product it is a single report. The CUBE simply lets you view it in several ways.

Naturally, in practice a very large number of output options is possible for various management reports if their lines are based on one or more analytical sections. The set of analytical sections itself depends on how much detail users need. True, we should not forget that, on the one hand, the more analytical sections there are, the more detailed the reports that can be built. On the other hand, this means that the financial budgeting model becomes more complex. In any case, with a CUBE the company can view the necessary reporting in various versions, according to the analytical sections of interest.

It is necessary to mention several more features of the OLAP-CUBE.

A multidimensional hierarchical OLAP-CUBE has several dimensions: row type, date, rows, directory 1, directory 2 and directory 3 (see Fig. 6). Naturally, the report displays as many directory buttons as there are directories in the budget line that contains the most of them. If no budget line contains a directory, the report will have no directory buttons at all.

Initially, the OLAP-CUBE is built along all dimensions. By default, when the report is first built, the dimensions are located in exactly the areas shown in Figure 6. That is, the “Date” dimension is in the area of vertical dimensions (the column area), the “Rows”, “Directory 1”, “Directory 2” and “Directory 3” dimensions are in the area of horizontal dimensions (the row area), and the “Row Type” dimension is in the area of “unexpanded” dimensions (the page area). If a dimension is in the last area, the data in the report is not “expanded” along that dimension.

Each of these dimensions can be placed in any of the three areas. Once dimensions are moved, the report is instantly rebuilt to match the new dimension configuration. For example, you can swap the date with the rows and directories. Or you can move one of the directories to the vertical dimension area (see Fig. 7). In other words, you can “twist” the report in the OLAP-CUBE and choose the output option that is most convenient for the user.

Fig. 7. An example of rebuilding a report after changing the dimension configuration in the INTEGRAL software package
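Rebuilding a report after dimensions are moved between areas can be illustrated with a short Python sketch. It is not how the INTEGRAL package is implemented; the function `build_report`, the field names and the figures are all hypothetical.

```python
from collections import defaultdict

# Hypothetical budget facts: one record per (product, branch, month) with an amount.
facts = [
    {"product": "A", "branch": "Moscow", "month": "Jan", "amount": 10},
    {"product": "A", "branch": "Kazan", "month": "Jan", "amount": 5},
    {"product": "B", "branch": "Moscow", "month": "Feb", "amount": 7},
    {"product": "A", "branch": "Moscow", "month": "Feb", "amount": 3},
]

def build_report(facts, rows, columns):
    """Sum amounts into cells keyed by (row key, column key).

    Dimensions listed in neither `rows` nor `columns` stay in the page
    area: the report is not expanded along them, so they are summed over.
    """
    report = defaultdict(float)
    for f in facts:
        row_key = tuple(f[d] for d in rows)
        col_key = tuple(f[d] for d in columns)
        report[(row_key, col_key)] += f["amount"]
    return dict(report)

# One set of facts, several layouts: only the placement of dimensions changes.
by_product = build_report(facts, rows=["product"], columns=["month"])
by_branch_product = build_report(facts, rows=["branch", "product"], columns=["month"])
swapped = build_report(facts, rows=["month"], columns=["product"])
```

The order of names in `rows` plays the role of the expansion sequence: `["branch", "product"]` expands first by branch and then by product.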

The dimension configuration can be changed either in the main CUBE form or in the dimension map editor (see Fig. 8). In this editor, you can also drag and drop dimensions from one area to another with the mouse. In addition, you can swap dimensions within one area.

In the same form you can also configure some dimension parameters. For each dimension, you can set the location of totals, the sorting order of elements, and the names of elements (see Fig. 8). You can also specify which element name to display in the report: abbreviated (Name) or full (FullName).

Fig. 8. Dimension map editor of the INTEGRAL software package

You can also edit dimension parameters directly in each dimension (see Fig. 9). To do this, click the icon on the button next to the dimension name.

Fig. 9. Example of editing directory 1 Products and services in

Using this editor, you can select the elements that you want to show in the report. By default, all elements are displayed, but if necessary some elements or folders can be omitted. For example, if you need to display only one product group in the report, uncheck all the others in the dimension editor. After that, the report will contain only one product group (see Fig. 10).

You can also sort elements in this editor. In addition, elements can be regrouped in various ways. After such a regrouping, the report is instantly rebuilt.

Fig. 10. Example of a report showing only one product group (folder) in the INTEGRAL software package

In the dimension editor, you can quickly create your own groups, drag and drop directory elements into them, and so on. By default, only the Other group is created automatically, but other groups can be created. Thus, using the dimension editor, you can configure which directory elements are displayed in the report and in what order.


It should be noted that none of these rearrangements are saved. That is, after the report is closed or recalculated, all directories are again displayed in the report according to the configured methodology.

In fact, all such changes could have been made initially when setting up the lines.

For example, using restrictions you can also specify which elements or groups of directories should be displayed in the report and which should not.

Note: the topic of this article is discussed in more detail at the workshops “Budget management of an enterprise” and “Organization and automation of management accounting”, conducted by the author of this article, Alexander Karpov.

If the user regularly needs to display only certain elements or directory folders in the report, it is best to make such settings in advance, when creating the report lines. If, on the other hand, various combinations of directory elements in reports matter to the user, there is no need to set any restrictions when setting up the methodology. All such restrictions can be configured quickly using the dimension editor.

In the previous article of this series (see No. 2’2005) we discussed the main innovations in SQL Server 2005 Analysis Services. Today we take a closer look at the tools for creating OLAP solutions included in this product.

Briefly about the basics of OLAP

Before we start talking about tools for creating OLAP solutions, let us recall that OLAP (On-Line Analytical Processing) is a technology for complex multidimensional data analysis, the concept of which was described in 1993 by E. F. Codd, the famous author of the relational data model. Currently, OLAP support is implemented in many DBMSs and other tools.

OLAP cubes

What is OLAP data? To answer this question, consider a simple example. Let's assume that the corporate database of a certain enterprise contains a set of tables with information about sales of goods or services, and that an Invoices view has been created on top of them with the fields Country, City, CustomerName (name of the client company), Salesperson (sales manager), OrderDate (date the order was placed), CategoryName (product category), ProductName (product name), ShipperName (carrier company) and ExtendedPrice (payment for the goods). The last of these fields is, in fact, the object of analysis.

Selecting data from such a view can be done using the following query:

SELECT Country, City, CustomerName, Salesperson,
       OrderDate, CategoryName, ProductName, ShipperName, ExtendedPrice
FROM Invoices

Suppose we are interested in the total value of orders placed by customers from different countries. To answer this question, run the following query:

SELECT Country, SUM(ExtendedPrice)
FROM Invoices
GROUP BY Country

The result of this query will be a one-dimensional set of aggregate data (in this case, sums):

Country SUM (ExtendedPrice)
Argentina 7327.3
Austria 110788.4
Belgium 28491.65
Brazil 97407.74
Canada 46190.1
Denmark 28392.32
Finland 15296.35
France 69185.48
209373.6
...

If we want to know the total value of orders placed by customers from different countries and delivered by different delivery services, we must run a query with two parameters in the GROUP BY clause:

SELECT Country, ShipperName, SUM(ExtendedPrice)
FROM Invoices
GROUP BY Country, ShipperName

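Based on the results of this query, you can build a two-dimensional table with countries along one axis and delivery services along the other.

This set of data is called a pivot table.

If we add a third grouping parameter, for example the salesperson, the query becomes:

SELECT Country, ShipperName, Salesperson, SUM(ExtendedPrice)
FROM Invoices
GROUP BY Country, ShipperName, Salesperson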

Based on the results of this query, a three-dimensional cube can be constructed (Fig. 1).

By adding further parameters for analysis, you can create a cube with theoretically any number of dimensions, and along with sums, the cells of the OLAP cube can contain the results of other aggregate functions (for example, the average, maximum or minimum value, or the number of records of the original view that correspond to the given set of parameters). The fields from which results are calculated are called cube measures.
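As a rough sketch of how grouped query results map onto cube cells, the following Python example runs such a query against an in-memory SQLite table. The table is a cut-down, made-up stand-in for the Invoices view, and choosing Salesperson as the third grouping parameter is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Invoices "
    "(Country TEXT, ShipperName TEXT, Salesperson TEXT, ExtendedPrice REAL)"
)
# Made-up rows; the real Invoices view has more columns and many more rows.
conn.executemany(
    "INSERT INTO Invoices VALUES (?, ?, ?, ?)",
    [
        ("Austria", "Speedy Express", "Davolio", 100.0),
        ("Austria", "Speedy Express", "Davolio", 50.0),
        ("Austria", "United Package", "Fuller", 30.0),
        ("Brazil", "Speedy Express", "Fuller", 70.0),
    ],
)

# Each result row is one cell of a three-dimensional cube holding several measures.
rows = conn.execute(
    """
    SELECT Country, ShipperName, Salesperson,
           SUM(ExtendedPrice), AVG(ExtendedPrice), MAX(ExtendedPrice), COUNT(*)
    FROM Invoices
    GROUP BY Country, ShipperName, Salesperson
    """
).fetchall()

cube = {
    (country, shipper, person): {"sum": s, "avg": a, "max": m, "count": n}
    for country, shipper, person, s, a, m, n in rows
}
print(cube[("Austria", "Speedy Express", "Davolio")])
```

Combinations that never occur in the data, such as Brazil with United Package, produce no row and therefore no cell.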

Hierarchies in dimensions

Suppose we are interested not only in the total value of orders placed by customers in different countries, but also in the total value of orders placed by customers in different cities of the same country. In this case, you can take advantage of the fact that the values plotted on the axes have different levels of detail. This is described by the concept of a dimension hierarchy. Let's say that countries are at the first level of the hierarchy and cities are at the second. Note that starting with SQL Server 2000, Analysis Services supports so-called unbalanced hierarchies, which contain, for example, members whose “children” are not at adjacent levels of the hierarchy or are missing for some members of the dimension. A typical example of such a hierarchy reflects the fact that different countries may or may not have administrative units such as a state or region, located in the geographical hierarchy between countries and cities (Fig. 2).

Note that recently it has become common to single out typical hierarchies, such as those containing geographic or time data, and to support multiple hierarchies in one dimension (in particular, for the calendar and fiscal year).
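The idea of an unbalanced hierarchy can be sketched in Python as a nested structure in which some branches have an intermediate level and others do not. This is only an illustration with made-up figures; it does not show how Analysis Services stores hierarchies.

```python
# Hypothetical geography: some countries have a state/region level, others do not.
hierarchy = {
    "USA": {"Washington": {"Seattle": 300.0, "Redmond": 120.0}},
    "Austria": {"Graz": 90.0, "Salzburg": 60.0},  # cities directly under the country
}

def rollup(node):
    """Total of a member: a leaf holds its own value, any other member sums its children."""
    if isinstance(node, dict):
        return sum(rollup(child) for child in node.values())
    return node

totals = {country: rollup(children) for country, children in hierarchy.items()}
print(totals)
```

Because `rollup` recurses until it reaches a leaf, it does not need to know how many levels a particular branch has.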

Creating OLAP cubes in SQL Server 2005

SQL Server 2005 cubes are created using SQL Server Business Intelligence Development Studio. This tool is a special version of Visual Studio 2005 designed for this class of tasks (if the development environment is already installed, its list of project templates is extended with projects for building solutions based on SQL Server and its Analysis Services). In particular, the Analysis Services Project template is intended for creating solutions based on Analysis Services (Fig. 3).

To create an OLAP cube, you first need to decide which data it will be built from. Most often, OLAP cubes are built on relational data warehouses with star or snowflake schemas (we discussed them in the previous part of the article). SQL Server ships with an example of such a warehouse, the AdventureWorksDW database. To use it as a source, find the Data Sources folder in Solution Explorer, select New Data Source from the context menu and answer the questions of the corresponding wizard in turn (Fig. 4).

It is then recommended to create a Data Source View on which the cube will be based. To do this, select the corresponding item in the context menu of the Data Source Views folder and answer the wizard's questions in turn. The result is a data schema from which the data source view is built; in the resulting schema you can specify “friendly” table names in place of the original ones (Fig. 5).

The cube described in this way can be deployed to the Analysis Services server by selecting the Deploy option from the project context menu, after which its data can be viewed (Fig. 7).

Creating cubes now relies on many features of the new version of SQL Server, such as the data source view. The source data for a cube, as well as the structure of the cube, is now described using Visual Studio, a tool familiar to many developers. This is a significant advantage of the new version of the product, since developers of analytical solutions have very little new tooling to learn.

Note that in the created cube you can change the set of measures, delete and add dimension attributes, and add calculated attributes of dimension members based on existing attributes (Fig. 8).

Fig. 8. Adding a calculated attribute

In addition, SQL Server 2005 cubes can automatically group or sort dimension members by attribute value, define relationships between attributes, implement many-to-many relationships, define key business metrics, and much more (how to perform all of these actions is described in the SQL Server Analysis Services Tutorial section of the product's help system).

In subsequent parts of this publication we will continue our acquaintance with SQL Server 2005 Analysis Services and find out what is new in its Data Mining support.

I have been a resident of Habr for quite some time, but I have never come across articles here on multidimensional cubes, OLAP and MDX, although the topic is very interesting and becomes more relevant every day.
It is no secret that in the short time that databases, electronic accounting and online systems have existed, a great deal of data has accumulated. Now a full analysis of these archives, and perhaps an attempt to predict situations for similar models in the future, is also of interest.
On the other hand, large companies can accumulate, in a few years, months or even weeks, such volumes of data that even basic analysis requires extraordinary approaches and stringent hardware requirements. Examples are systems that process banking transactions, stock trading, telephone operators' data and so on.
I think everyone is familiar with the two different approaches to database design: OLTP and OLAP. The first approach (Online Transaction Processing, real-time transaction processing) is designed for efficient data collection in real time, while the second (Online Analytical Processing, real-time analytical processing) is aimed specifically at selecting and processing data in the most efficient way.

Let's look at the main capabilities of modern OLAP cubes and what problems they solve (Analysis Services 2005/2008 are taken as a basis):

  • fast access to data
  • preaggregation
  • hierarchy
  • working with time
  • multidimensional data access language
  • KPI (Key Performance Indicators)
  • data mining
  • multi-level caching
  • multilingual support
So, let's look at the capabilities of OLAP cubes in a little more detail.

A little more about the possibilities

Quick access to data
Actually, fast access to data, regardless of the size of the array, is the basis of OLAP systems. Since this is the main focus, a data warehouse is usually built on principles different from those of relational databases.
Here, the time to fetch simple data is measured in fractions of a second, and a query exceeding a few seconds most likely requires optimization.

Preaggregation
In addition to quickly retrieving existing data, OLAP systems can also preaggregate the values that are “most likely to be used”. For example, if we have daily records of sales of a certain product, the system can also preaggregate monthly and quarterly sales amounts for us, which means that if we request monthly or quarterly data, the system returns the result instantly. Why does preaggregation not always occur? Because the number of theoretically possible combinations of products, time periods and so on can be huge, which means you need clear rules for which elements aggregations are built and for which they are not. In general, the topic of these rules and the actual design of aggregations is quite extensive and deserves a separate article.
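A minimal Python sketch of the idea, assuming made-up daily sales and a hypothetical `preaggregate` function; real OLAP servers use their own aggregation design rules and storage formats.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales of one product: (day, amount).
daily_sales = [
    (date(2009, 1, 5), 10.0),
    (date(2009, 1, 20), 15.0),
    (date(2009, 2, 3), 7.0),
    (date(2009, 4, 1), 8.0),
]

def preaggregate(daily_sales, levels):
    """Build one table of totals for each requested level, once, up front."""
    key_of = {
        "month": lambda d: (d.year, d.month),
        "quarter": lambda d: (d.year, (d.month - 1) // 3 + 1),
    }
    tables = {level: defaultdict(float) for level in levels}
    for day, amount in daily_sales:
        for level in levels:
            tables[level][key_of[level](day)] += amount
    return {level: dict(table) for level, table in tables.items()}

# The "rule" in this sketch: aggregate by month and quarter only, nothing else.
aggregates = preaggregate(daily_sales, levels=["month", "quarter"])
print(aggregates["quarter"][(2009, 1)])  # a lookup, with no scan of the daily rows
```

The `levels` argument stands in for the aggregation rules: every extra level or dimension combination multiplies the number of stored totals, which is why not everything is preaggregated.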

Hierarchies
It is natural that when analyzing data and building final reports you need to take into account that months consist of days and themselves form quarters, and that cities belong to areas, which in turn are part of regions or countries. The good news is that OLAP cubes view data in terms of hierarchies and relationships with other parameters of the same entity from the start, so building and using hierarchies in cubes is very simple.

Working with time
Since data is mostly analyzed over time periods, time has a special significance in OLAP systems. This means that once you have simply told the system which dimension is time, you can easily use functions such as Year To Date and Month To Date (the period from the beginning of the year/month to the current date), Parallel Period (the same day or month, but in the previous year), and so on.
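What these time functions compute can be sketched in plain Python. The functions below are hypothetical helpers written for this illustration, not the MDX functions themselves, and the revenue figures are made up.

```python
from datetime import date

# Hypothetical daily revenue keyed by date.
revenue = {
    date(2008, 6, 15): 40.0,
    date(2009, 1, 10): 10.0,
    date(2009, 6, 1): 20.0,
    date(2009, 6, 15): 30.0,
}

def year_to_date(revenue, today):
    """Total from January 1 of today's year up to and including today."""
    start = date(today.year, 1, 1)
    return sum(v for d, v in revenue.items() if start <= d <= today)

def month_to_date(revenue, today):
    """Total from the first day of today's month up to and including today."""
    start = today.replace(day=1)
    return sum(v for d, v in revenue.items() if start <= d <= today)

def parallel_period(revenue, today):
    """Value for the same day one year earlier, or None if there is no record.

    February 29 is not handled in this sketch.
    """
    return revenue.get(today.replace(year=today.year - 1))

today = date(2009, 6, 15)
print(year_to_date(revenue, today), month_to_date(revenue, today), parallel_period(revenue, today))
```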

Multidimensional Data Access Language
MDX (Multidimensional Expressions) is a query language for simple and efficient access to multidimensional data structures. And that says it all - there will be a few examples below.

Key Performance Indicators (KPI)
Key performance indicators are a system of financial and non-financial measures that helps an organization determine whether it is achieving its strategic goals. Key performance indicators are quite simple to define in OLAP systems and to use in reports.

Data mining
Data mining is, in essence, the identification of hidden patterns or relationships between variables in large data sets.
The English term “Data Mining” has no unambiguous translation into Russian (the proposed variants amount to data extraction, data excavation or information extraction), so in most cases it is used in the original. The most successful indirect translation is the term “intelligent data analysis”. However, this is a separate, no less interesting topic for consideration.

Multi-level caching
Actually, to ensure the highest possible speed of data access, OLAP systems support multi-level caching in addition to clever data structures and preaggregation. Besides simple queries, the system also caches parts of the data read from storage, aggregated values, and calculated values. Thus, the longer you work with an OLAP cube, the faster it actually becomes. There is also the concept of “warming up the cache”: an operation that prepares the OLAP system for working with specific reports, queries, or everything at once.

Multilingual support
Yes, yes, yes. At a minimum, Analysis Services 2005/2008 (Enterprise Edition only, though) natively supports multiple languages. It is enough to provide a translation of the string parameters of your data, and a client that has specified its language will receive localized data.

Multidimensional cubes

So what exactly are these multidimensional cubes?
Let's imagine a 3-dimensional space whose axes are Time, Products and Customers.
A point in such a space will indicate the fact that one of the buyers bought a specific product in a certain month.

In fact, the set of all such points is the cube, and Time, Products and Customers are, accordingly, its dimensions.
A cube with four or more dimensions is a little harder to imagine (and to draw), but the essence does not change, and most importantly, for OLAP systems it does not matter at all how many dimensions you work in (within reasonable limits, of course).

A little bit of MDX

So, what is the beauty of MDX? Most likely it is that we describe not how we want to select the data, but what exactly we want.
For example,
-- The bracketed identifiers were lost from the source listing; the names below are illustrative.
SELECT
{ [Measures].[Quantity] } ON COLUMNS,
{ [Time].[June], [Time].[July] } ON ROWS
FROM [Sales]
WHERE ( [Product].[iPhone], [Country].[Mozambique] )

Which means: I want the number of iPhones sold in June and July in Mozambique.
In doing so I describe which data I want and how I want to see it in the report.
Beautiful, isn't it?

Here's a little more complicated:

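-- The bracketed identifiers were lost from the source listing; the names below are illustrative.
WITH MEMBER [Measures].[AverageSpend] AS
[Measures].[Amount] / [Measures].[Visits]
SELECT
{ [Measures].[AverageSpend] } ON COLUMNS,
{ [Customer].[Gender].[Male], [Customer].[Gender].[Female] } ON ROWS
FROM [Sales]
WHERE ( [Store].[Apple Store] )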


In fact, we first define the formula for calculating the “average purchase size” and then try to compare who (which gender) spends more money in one visit to the Apple store.
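For readers more familiar with general-purpose languages, the calculated measure can be mirrored in Python. This is only an analogy on made-up visit records with hypothetical names, not a translation of how an MDX engine evaluates the query.

```python
# Hypothetical fact rows: one record per store visit.
visits = [
    {"gender": "F", "amount": 300.0},
    {"gender": "F", "amount": 100.0},
    {"gender": "M", "amount": 250.0},
]

def average_spend(visits, gender):
    """Calculated measure: total amount divided by the number of visits (None if no visits)."""
    amounts = [v["amount"] for v in visits if v["gender"] == gender]
    return sum(amounts) / len(amounts) if amounts else None

# One row per gender, one column holding the calculated measure.
report = {gender: average_spend(visits, gender) for gender in ("F", "M")}
print(report)
```

The contrast with MDX is the point: here the filtering and division are spelled out step by step, whereas the MDX query only states which members and which formula are wanted.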

The language itself is extremely interesting both to study and to use, and perhaps deserves a lot of discussion.

Conclusion

In fact, this article covers very little, even of the basic concepts. I would call it an “appetizer”: an opportunity to interest the Habr community in this topic and to develop it further. As for development, there is a huge unplowed field here, and I will be happy to answer all your questions.

P.S. This is my first post about OLAP and my first publication on Habr - I would be very grateful for constructive feedback.
Update: I transferred it to SQL, I will transfer it to OLAP as soon as they allow me to create new blogs.


OLAP is not a separate software product, not a programming language, and not even a specific technology. If we try to cover OLAP in all its manifestations, it is a set of concepts, principles and requirements underlying software products that make it easier for analysts to access data. Let's find out why analysts need access to data to be made easier in some special way.

The fact is that analysts are special consumers of corporate information. The analyst's task is to find patterns in large amounts of data. Therefore, the analyst will not pay attention to the single fact that on Thursday the fourth a batch of black ink was sold to the counterparty Chernov: he needs information about hundreds and thousands of similar events. Single facts in the database may be of interest to, for example, an accountant or the head of the sales department who is responsible for the transaction. For an analyst, one record is not enough: he may need, for example, all transactions of a given branch or representative office for a month or a year. At the same time, the analyst discards unnecessary details such as the buyer's TIN, exact address and telephone number, the contract index and the like. And the data an analyst requires for his work necessarily contains numerical values, which follows from the very essence of his activity.

So, the analyst needs a lot of data, and this data is selective and has the form “set of attributes - number”. The latter means that the analyst works with tables of the following type:

Here “Country”, “Product” and “Year” are attributes, or dimensions, and “Sales volume” is the numerical value, or measure. The analyst's task, we repeat, is to identify stable relationships between the attributes and the numerical parameters. Looking at the table, you will notice that it can easily be converted into three dimensions: we put countries on one axis, products on another, and years on the third. The values in this three-dimensional array are the corresponding sales volumes.

Three-dimensional representation of the table. The gray segment shows that there are no data for Argentina in 1988

It is precisely this three-dimensional array that is called a cube in OLAP terms. In fact, from the point of view of strict mathematics, such an array will not always be a cube: a real cube must have the same number of elements in all dimensions, but OLAP cubes do not have such a limitation. However, despite these details, the term “OLAP cubes”, due to its brevity and figurativeness, has become generally accepted. An OLAP cube does not have to be three-dimensional. It can be both two- and multidimensional, depending on the problem being solved. Particularly seasoned analysts may need about 20 dimensions - and serious OLAP products are designed for exactly this amount. Simpler desktop applications support around 6 dimensions.

The dimensions of OLAP cubes consist of so-called labels, or members. For example, the Country dimension consists of the labels Argentina, Brazil, Venezuela, and so on.

Not all elements of the cube must be filled in: if there is no information on sales of rubber products in Argentina in 1988, the value in the corresponding cell is simply undefined. Nor is it necessary for an OLAP application to store its data in a multidimensional structure: the main thing is that the data looks that way to the user. By the way, thanks to special methods of compact storage of multidimensional data, the “vacuum” (unfilled elements) in cubes does not lead to wasted memory.

However, the cube itself is not suitable for analysis. While a three-dimensional cube can still be adequately imagined or depicted, with a six- or nineteen-dimensional cube the situation is much worse. That is why, before use, ordinary two-dimensional tables are extracted from the multidimensional cube. This operation is called “cutting” (slicing) the cube. The term, again, is figurative. The analyst, as it were, takes the cube and “cuts” its dimensions at the labels of interest to him. In this way, the analyst obtains a two-dimensional slice of the cube and works with it. In much the same way, lumberjacks count the annual rings on a cut tree.

Accordingly, as a rule, only two dimensions remain “uncut”, matching the number of dimensions of a table. Sometimes only one dimension remains “uncut”: if the cube contains several types of numeric values, they can be laid out along one of the table's dimensions.
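Cutting a cube down to a two-dimensional table can be sketched in Python as follows, again assuming a cube stored as a dictionary of filled cells; the function `slice_cube` and the figures are hypothetical.

```python
# Hypothetical cube: (country, product, year) -> sales volume. Only filled cells are stored.
cube = {
    ("Argentina", "rubber", 1989): 12.0,
    ("Argentina", "ink", 1988): 5.0,
    ("Brazil", "rubber", 1988): 20.0,
    ("Brazil", "ink", 1988): 8.0,
}

def slice_cube(cube, year):
    """Cut the cube at one year; the result is a two-dimensional country x product table."""
    return {(country, product): v for (country, product, y), v in cube.items() if y == year}

table_1988 = slice_cube(cube, 1988)
# The empty cell (rubber in Argentina in 1988) is simply absent from the slice.
print(table_1988.get(("Argentina", "rubber")))
```

The Year dimension is “cut” at the label 1988, while Country and Product stay “uncut” and become the rows and columns of the resulting table.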

If you look even more closely at the table we showed first, you will notice that the data in it is most likely not primary but obtained by summation over smaller elements. For example, a year is divided into quarters, quarters into months, months into weeks, and weeks into days. A country is made up of regions, and regions are made up of settlements. Finally, within the cities themselves, districts and specific retail outlets can be identified. Products can be combined into product groups, and so on. In OLAP terms, such multi-level groupings are quite logically called hierarchies. OLAP tools make it possible to move to the desired hierarchy level at any time. Moreover, as a rule, several kinds of hierarchy are supported for the same elements: for example, day-week-month or day-decade-quarter. Source data is taken from the lower levels of the hierarchies and then summed to obtain values at the higher levels. To speed up the transition between levels, the summed values for the different levels are stored in the cube. Thus, what looks like one cube to the user consists, roughly speaking, of many more primitive cubes.

Hierarchy example

This is one of the essential points that led to the emergence of OLAP: performance and efficiency. Let's imagine what happens when an analyst needs to obtain information, but there are no OLAP tools in the enterprise. The analyst writes the appropriate SQL query independently (which is unlikely) or with the help of a programmer, and receives the data of interest as a report or exports it to a spreadsheet. A great many problems arise in this case. Firstly, the analyst is forced either to do something other than his job (SQL programming) or to wait for programmers to complete the task for him. All this has a negative impact on labor productivity and increases the amount of rush work, the rates of heart attack and stroke, and so on. Secondly, a single report or table, as a rule, is not enough for the giants of thought and fathers of Russian analysis, so the whole procedure has to be repeated again and again. Thirdly, as we have already found out, analysts do not ask about trifles: they need everything at once. This means (although technology is advancing by leaps and bounds) that the corporate relational DBMS server queried by the analyst may think long and hard, blocking other transactions.

The concept of OLAP appeared precisely to solve such problems. OLAP cubes are essentially meta-reports. By cutting these meta-reports (that is, cubes) along dimensions, the analyst obtains the “ordinary” two-dimensional reports of interest (these are not necessarily reports in the usual sense of the term; we are talking about data structures with the same functions). The advantages of cubes are obvious: data needs to be requested from the relational DBMS only once, when the cube is built. Since analysts, as a rule, do not work with information that is supplemented and changed on the fly, the generated cube stays relevant for quite a long time. Thanks to this, not only are interruptions in the operation of the relational DBMS server eliminated (there are no queries returning thousands or millions of rows), but the speed of data access for the analyst also increases sharply. In addition, as already noted, performance is improved by calculating the subtotals of hierarchies and other aggregated values at the time the cube is built. That is, if our data initially contained daily revenue for a specific product in a single store, then when forming the cube the OLAP application calculates the totals for the different hierarchy levels (weeks and months, cities and countries).
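The idea of computing totals once, at build time, can be illustrated as follows. This is a simplified sketch, not how any particular OLAP server works: the fact rows are invented, and the ALL marker and build_cube function are hypothetical names.

```python
from collections import defaultdict
from itertools import product

ALL = "*"  # hypothetical marker meaning "total over this dimension"

# Hypothetical fact rows, as they might come back from the single
# relational query: (week, city, revenue). Figures are invented.
facts = [
    ("W1", "Moscow", 10),
    ("W1", "Kazan", 4),
    ("W2", "Moscow", 6),
]


def build_cube(rows):
    """Precompute every combination of detailed value and total, once."""
    cube = defaultdict(int)
    for week, city, revenue in rows:
        # Each fact contributes to its own cell and to every total containing it.
        for key in product((week, ALL), (city, ALL)):
            cube[key] += revenue
    return dict(cube)


cube = build_cube(facts)
moscow_total = cube[(ALL, "Moscow")]  # plain lookup, no re-aggregation
week1_total = cube[("W1", ALL)]
grand_total = cube[(ALL, ALL)]
```

After the build step, every question about a total is answered by a dictionary lookup and the relational source is not touched again.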

Of course, this gain in performance comes at a price. It is sometimes said that the data structure simply “explodes”: an OLAP cube can take up tens or even hundreds of times more space than the original data.
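A rough way to see where this growth comes from is to count the cells that must be stored when every combination of detailed value and total is kept. The sketch below assumes a deliberately sparse, invented data set and only one total level per dimension; the ALL marker and count_cells function are hypothetical.

```python
from itertools import product

ALL = "*"  # hypothetical marker for "total over this dimension"


def count_cells(facts):
    """Count distinct cells when every detail/total combination is stored."""
    cells = set()
    for coords in facts:
        cells.update(product(*[(c, ALL) for c in coords]))
    return len(cells)


# Sparse extreme: 100 facts, 4 dimensions, no two facts share a mark.
facts = [(i, i, i, i) for i in range(100)]
stored = count_cells(facts)
ratio = stored / len(facts)
```

In this example each fact produces 2**4 = 16 cells, of which only the grand total is shared, so about fifteen cells are stored per original fact. Multi-level hierarchies add further total levels per dimension and increase the factor.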

Answer the questions:

    What is an OLAP cube?

    What are the marks of a specific dimension? Give examples.

    Can measures in an OLAP cube contain non-numeric values?






