For making complex data-driven decisions, you need a complex data visualization tool. Happily, OLAP (short for On-Line Analytical Processing) is fast becoming a ubiquitous method for the advanced analysis of multi-dimensional information. In Argos, our OLAP engine is fast and incredibly intuitive; the most difficult part is the interpretation of the information it provides. This post will introduce you to our OLAP environment and explain some of the best practices we employ when developing OLAP cubes for our clients. First, some important terminology.
OLAP Terms Defined
- On-Line Analytical Processing (OLAP):The technological infrastructure used to provide ad-hoc user query capability and multi-dimensional analysis.
- Measure (or Fact):The numerical values being analyzed: headcount, revenue, GPA, SAT scores, etc.
- Dimension: The “buckets” into which the measures are sorted for analysis; term code, year, month, quarter, bio-demographics, geographic region, etc.
- Level of Granularity:The deepest level of detail that’s included in the OLAP’s dataset.
NOTE: For example, let’s say you have an OLAP cube with ‘student’ as the level of granularity. That would mean you would be able to analyze information for individual students, such as GPA, headcount, and test scores, but you would NOT see student coursework because a single student can register for many courses. To analyze coursework information, you would need to set the OLAP’s level of granularity all the way down to the course level.
What is OLAP, really?
OLAP is an answer engine whose goal is to put the power of finding answers into the hands of non-technical individuals. It does this by grouping measures into logical groups and allowing the user to slice the data into different views through the use of dimensions. It creates a mechanism for easy longitudinal reporting (analysis over time) as well as trend analysis. The questions OLAP focuses on typically start with something like “How many,” “How much,” or “What is the average?”
Design Process & Best Practices
The first step when designing an OLAP cube is to identify the level of granularity, the cube’s measures, and the cube’s dimensions. This is important because trying to incorporate dimensions or measures in a cube that are not consistent with the granularity will cause inaccurate calculations and thus incorrect answers. For example, let’s assume you were hoping to count students by term but you set your level of granularity at the course level. For each student taking more than one course in a given term, they would be counted more than once! Obviously, this leads to confusion and inaccuracy, and is something that must be carefully watched during OLAP development.
Next, we must determine what measures and dimensions will be needed. The best way to do this is to ask your target user groups the following questions:
- What are the three most frequent questions you are asked that this OLAP cube should answer?
- What, if any, are the federal or state reporting requirements related to this data?
- What complex question can you ask about this data that you’ve always wished you could answer but never have?
The responses to these questions typically look something like this:
- What is the unduplicated headcount by term, gender, & major?
- What was last year’s retention rate?
- How many Psychology 101 sections should I offer next fall?
So, using the responses above, you can create a measure and dimension breakdown something like this:
- Measures:FTE; Dimensions: Year, Term, Gender, Age, Department, Major, & Area Code
- Measures: Retention Rate; Dimension: Year
- Measures: Headcount, FTE, & Credits; Dimensions:Year, Term, Subject, & Course Number
By picking apart the questions that your audience needs to answer, you can inform the design of the dataset used to populate the OLAP you’re designing. This is, by far, the hardest part of designing an OLAP cube. And note: we have not yet written a single line of code.
Alas, it couldn’t last forever. Now it’s time to write the query required to pull the measures and dimensions as identified above. With OLAP this is usually quite simple, as it does not typically require any aggregation or grouping (the OLAP engine handles that for you). The example query below would get the measures and dimensions required for the first question above.
NOTE: This query is written for a standard Banner installation that includes the student module.
As you can see, this query does zero aggregation. It pulls all registered students that exist in your Banner database and feeds that into the cube. The cube uses that raw data to build its multi-dimensional view, which then enables the user to pose questions and analyze the dataset.
With our simple-to-use calculated measure manager you can create custom calculations that fit your needs. From here you can create calculated measures to show your data in new ways, such as percentage of the total or by comparing values row by row. Just one more way OLAP excels at answering those complex questions!
While this is just one example of the kind of questions an OLAP cube can help you answer, you can see how this feature serves as a powerful tool for data analysis. If you’ve struggled to get clear answers to complex questions, it’s definitely worth taking the time to learn how to harness the power of OLAP.