Concept: Pivot Tables vs Power Pivot
Many people are curious about the differences between Power Pivot and traditional pivot tables. In this section, I will clarify these differences so that you have a clear understanding in no time.
A pivot table is a powerful summarization and visualization tool that connects to a data source and generates on-the-fly totals and subtotals, which helps you make sense of the data. The larger and more granular the data, the more useful a pivot table becomes. Because pivot tables are built into Excel, they also benefit from everything else Excel offers.
Data Sources for Pivot Tables
Historically, there have been two main types of data sources that you can connect to with a pivot table: flat tables and data cubes.
Connecting to a Single Flat Table
To connect to a single flat table inside Excel, click in the table, select Insert, Pivot Table, and off you go. There are some limitations with this approach, however:
• It is very common to have to do a lot of VLOOKUP()s (or similar operations) to be able to join data from different data sources into a single flat table.
• Excel has a limit of 1,048,576 rows (roughly 1 million) per worksheet; however, if you use a lot of VLOOKUP() functions within a single flat table, you may run into performance issues long before reaching this limit.
These two issues have historically prevented Excel from being a scalable BI tool. But Power Pivot changes that, as you'll see in a few moments.
A powerful yet less common use of pivot tables is to connect directly from Excel to reporting cubes, such as SQL Server Analysis Services multidimensional cubes. Many large enterprises use these multidimensional data cubes for reporting, which makes it easy and convenient for Excel users to build reports with pivot tables. However, connecting directly to cubes remains relatively rare compared with the more typical single-table use of pivot tables in Excel.
Power Pivot changes the way pivot tables connect to source data without altering how the pivot tables themselves work. What it adds is a new and better way of connecting to data sources.
Power Pivot is a data modeling tool that lets you organize and enhance source data for analysis with Excel pivot tables and other tools. Data modeling has traditionally been the domain of IT and business intelligence professionals, but Power Pivot makes it accessible to all Excel users.
Data modeling involves gathering data from multiple sources, organizing and structuring it logically, and enhancing it to prepare for effective use.
The data modeling process includes:
• Determining the optimal structure and shape of the source data to analyze (e.g., whether to bring in all the detailed data or only summary data).
• Loading the data from the source into the data model (Power Pivot in this case).
• Defining the logical relationships between the various tables (which is similar to what you do with VLOOKUP() inside Excel).
• Defining data types (e.g., specifying whether a column of data is numeric or a column of currency values or a column of text fields).
By enhancing the data model, you can analyze concepts that are not present in the source data but can be derived from it. For instance, if a transactional data table contains cost and sell prices, you can enhance the model with calculations for margin and margin percentage, even though these metrics are not explicitly available in the source data. Once these new metrics are part of the data model, everyone who uses your workbook can reuse them consistently.
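As a rough sketch of the margin example, the enhancement could be written as DAX calculated fields like the ones below. The table name Transactions and the columns SellPrice and CostPrice are hypothetical names chosen for illustration; they are not part of the sample database used in this book. Each line defines one calculated field, with the name in square brackets on the left and the formula on the right.

```dax
-- Hypothetical table: Transactions, with columns SellPrice and CostPrice
[Total Sell Value] = SUM(Transactions[SellPrice])
[Total Cost Value] = SUM(Transactions[CostPrice])

-- Margin is not stored in the source data; it is derived in the data model
[Margin $] = [Total Sell Value] - [Total Cost Value]

-- DIVIDE() returns blank instead of an error when the denominator is zero
[Margin %] = DIVIDE([Margin $], [Total Sell Value])
```

Each calculated field is entered separately. Once defined, [Margin $] and [Margin %] are available to anyone who builds a pivot table on the data model.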
Learning Power Pivot is essentially about learning data modeling, a concept that may seem intimidating at first. However, there's no need for concern. By the end of this book, you'll be well on your way to becoming a skilled data modeler with Power Pivot. Simply apply the techniques discussed, and remember that you are developing your data modeling skills as you go.
Concept: Loading Data
To begin writing DAX and using Power Pivot, you first need to load your data, because Power Pivot imports a full copy of the source data into the Data Model. After this initial load, you can share your workbooks without requiring others to have direct access to the original source data.
When loading data, you need to decide which tables and columns to import and how the data should be structured. In the upcoming section, you will load pre-prepared data, so this decision has already been made for you, but keep in mind that selecting the right data to import is a vital part of the data modeling process.
Here's How: Loading Data from a New Source
Download the sample AdventureWorks database from http://xbi.com.au/learndax, unzip it, and store it in an easily accessible location. You will begin by loading tables from the AdventureWorks Access database.
Then you will prepare these tables for use in Power Pivot.
Follow these steps to load data into a workbook for use in Power Pivot:
1 Open a new blank Excel workbook. You should see the PowerPivot tab at the top of the sheet.
2 If you can't see the PowerPivot tab, navigate to File, select Options, and then click Add-Ins. At the bottom of the window, choose COM Add-ins from the Manage list and click Go.
3 In the COM Add-ins dialog that appears, check the Microsoft Office PowerPivot check box and then click OK.
What if I Can't Find the Power Pivot Add-in?
If you are using Excel 2013 or later and cannot find the Power Pivot add-in, your version of Excel does not include it. To access Power Pivot, you will need to move to a version of Excel that includes this feature.
If you're using Excel 2010, you can still enhance its functionality by downloading the Power Pivot plug-in online, a feature not accessible to users of Excel 2013 and later versions.
For more information about Power Pivot versions, go to http://xbi.com.au/versions.
4 Download the sample database from http://xbi.com.au/learndax. After downloading, unzip the file and store the contents in a location that is easy to find later.
5 On the PowerPivot tab, click Manage.
The Windows taskbar displays two distinct Excel windows when you hover over the Excel icon: the traditional Excel window and the Power Pivot window.
Power Pivot is an integrated application within Excel that you use to manage data models. Throughout this book, you'll be instructed to switch between the Excel and Power Pivot windows. If the Power Pivot window is not visible, open it by selecting the PowerPivot tab in Excel and clicking Manage.
6 In the Power Pivot window, select Home, From Database (see #1 below), From Access (#2).
7 Browse to the location of the sample database you downloaded and unzipped in step 4 and then click Next.
8 Accept the default option in the Table Import Wizard dialog (as shown below) and then click Next.
9 Select the five views at the bottom of the list by checking the box next to each one. Note the different icons used for queries/views and tables.
To customize the data retrieved from the database, use the Preview & Filter button to select specific rows and columns for each table. You can also rename the views in the Friendly Name column.
In database design, tables often have prefixes such as "dim" for dimension and "fct" for fact, as in dimProduct and fctSales. For Excel users, a dimension table is a lookup table, and a fact table is a data (or transactions) table. Understanding the distinction between these two types of tables (lookup tables and data tables) is crucial in Power Pivot, and this book explores it further in later chapters.
Excel users should remove the "dim" and "fct" prefixes from table names before importing. These prefixes tell IT professionals what type of table they are looking at, but they can confuse nontechnical users who see them in the PivotTable Fields list.
To enhance user experience in Excel, assign friendly names to tables, such as changing "dimProducts" to "Products." This makes it easier for business users to navigate and understand pivot tables effectively.
10 Click Finish, and the wizard imports your data.
After you close the Table Import Wizard, you will see five new tabs in the Power Pivot window, one for each table imported from the Access database. These tables are complete copies of the original data, and you won't need the source file again until it's time to refresh the data, typically when the source changes. A simple refresh then brings your workbook up to date.
Concept: Calculated Fields
Calculated fields have been part of standard pivot tables for many years. However, these traditional calculated fields differ significantly from Power Pivot calculated fields, a name that first appeared in Excel 2013.
There are four important things related to calculated fields that you need to know about:
Original calculated fields for standard pivot tables are limited in functionality and can be challenging to use. With the introduction of Power Pivot, there is now little need to rely on them. If you can create calculated fields the traditional way, you will find the transition to Power Pivot calculated fields straightforward, and they offer far more capability.
So you can forget about these original calculated fields; you don't need them anymore.
In Excel 2010, Microsoft introduced measures in Power Pivot, which greatly extend what calculated fields in standard pivot tables can do. Measures are at least 100 times more powerful: they let you enrich your Power Pivot data model and build pivot tables that deliver the calculations you need. This book covers writing measures in detail.
In Excel 2013, Microsoft renamed Power Pivot measures to calculated fields, even though standard pivot tables already had their own, different calculated fields. Power Pivot calculated fields in Excel 2013 do exactly what measures do in Excel 2010, but the renaming has caused confusion, particularly for users moving from Power Pivot for Excel 2010.
In Excel 2016, Microsoft reversed its decision to rename measures to calculated fields, thanks to advocacy from Rob Collie and others. Measures are once again called measures, which makes Excel 2013 the only version to use the term calculated field for Power Pivot. Microsoft deserves credit for acknowledging the mistake and correcting it.
In this book, the term "calculated fields" refers to the functionality originally known as "measures," which was introduced in Power Pivot for Excel 2010 and renamed in Excel 2013. This terminology matches what Excel 2013 users see on the ribbon. The traditional calculated fields found in standard pivot tables are now redundant and are not mentioned again in this book. Even so, I recommend that you keep using the term "measures," because it is the preferred name and the one Microsoft uses from Excel 2016 onward.
Techniques for Writing DAX Calculated Fields
There are three places you can write DAX calculated fields:
• In the Power Pivot window, you can write a calculated field in the formula bar by typing the field name, followed by := and then the formula.
• You can write and edit calculated fields in any empty cell in the calculation area at the bottom of the Power Pivot window, as shown below.
• In Excel, you can write calculated fields in the Calculated Field dialog, which you open by clicking the Calculated Fields button on the Power Pivot tab.
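For the first two options, the name and the formula are typed together, separated by :=. A minimal sketch, using the calculated field name and column that appear later in this chapter:

```dax
Total Sales Value:=SUM(Sales[ExtendedAmount])
```

In the Calculated Field dialog, by contrast, the name goes in its own text box, and the Formula box holds only the part that starts at the equals sign, =SUM(Sales[ExtendedAmount]).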
Excel users should write DAX formulas in the Calculated Field dialog box, ideally after first creating a pivot table for context. That way, the calculated field shows up in the pivot table as soon as you click OK, which gives you instant feedback on whether the formula works.
Here's How: Writing Calculated Fields
For Excel users, writing calculated fields can seem challenging at first, but it gets easier with practice. I recommend following this specific process whenever you write a new calculated field.
1 Create a new blank pivot table connected to your data model (or use an existing one if you already have something appropriate).
2 Add some relevant data to the rows in your pivot table. For the sample database used in this book, I suggest that you go to the Products table and place [Category] on Rows in the pivot table (see #1 below).
3 Click inside the pivot table, navigate to the Power Pivot tab, click the Calculated Fields drop-down arrow (#2), and then select New Calculated Field (#3). The Calculated Field dialog appears.
Use the Calculated Field dialog as a process guide. If you skip steps, you will end up reworking your formulas, which wastes time. Get into the habit of working through the dialog in the same order every time and using it as a reminder for each step.
4 Select the table where the calculated field will be stored from the Table Name drop-down list. Always place a calculated field in the same table as its source data; in this case the data comes from the Sales[ExtendedAmount] column, so choose "Sales" from the Table Name drop-down list.
5 In the Calculated Field Name text box (#2), give the new calculated field a meaningful, unique name.
When writing DAX from this book, use the exact names provided. In your own projects, choose descriptive and meaningful names, including spaces, instead of abbreviations. For example, use "Total Sales Value" rather than shortening it to "TSV," so that neither you nor anyone else is confused later on.
6 In the Formula box (#3), write the DAX formula.
7 Click Check Formula (#4) to check whether the formula you wrote is syntactically correct. Fix any errors if you need to.
8 Select an appropriate formatting option from the Category list (#5), including a suitable Symbol and Decimal places in the area to the right of the Category list.
The Description box is available if you want it, although I typically do not fill it out. It serves only as a reference and has no effect on how the formulas work.
After you click OK, you get immediate feedback about whether everything is working as you expect (as shown below).
Following this procedure will save you time by avoiding rework later. If you practice this method from the beginning, it will become a habit that pays off in the future.
You can create a calculated field by dragging a column from a table and dropping it in the Values section of the PivotTable Fields list.
DAX Topic: SUM(), COUNT(), COUNTROWS(), MIN(), MAX(), COUNTBLANK(), and DIVIDE()
This chapter introduces some basic DAX formulas to get you started. Most of the DAX functions discussed here take a column as their only parameter, in the form =FORMULA(ColumnName). The only exception is =COUNTROWS(Table), which takes a table (not a column) as the parameter.
All the functions in this chapter are aggregation functions, or aggregators. That is, they take inputs from a column or table and aggregate the contents in some way (differently for each function).
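To make the two parameter styles concrete, here is a small sketch using the Sales table from the sample database. The comments describe what each function does in general and are not results from the sample data.

```dax
=MAX(Sales[ExtendedAmount])   -- column as the only parameter: largest value in the column
=COUNTROWS(Sales)             -- table as the only parameter: number of rows in the table
```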
The Sales[ExtendedAmount] column contains more than 60,000 rows of data, and you can't put the whole column into a pivot table, because a single pivot table cell can't hold an entire column of values.
If you use a "naked" column in a calculated field, with no aggregation function around it, the formula does not work and Power Pivot shows an error message.
Instead, you must tell Power Pivot how to aggregate the column so that it returns a single value for each cell in the pivot table. The correct approach is to wrap the column in an aggregation function and, as a best practice, to refer to the column by both its table name and its column name.
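A short sketch of the difference, using the column discussed above:

```dax
-- Does not work as a calculated field: a "naked" column with no aggregation
=Sales[ExtendedAmount]

-- Works: SUM() collapses the column to a single value for each pivot table cell
=SUM(Sales[ExtendedAmount])
```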
When writing DAX, always include both the table name and the column name, because referring to a column without its table can lead to confusion. Currently, you must type both names yourself, though future versions of Power Pivot are expected to automate this.
A key feature of DAX is that you can reuse calculated fields when writing new ones. For instance, once you have created a calculated field named [Total Sales Value] in the Power Pivot data model, you can reference it in other calculations. You could, for example, write a new calculated field for a 10% tax on sales by using [Total Sales Value] in the formula:
[Total Tax] = [Total Sales Value] * 0.1
Note that the new calculated field [Total Tax] is a calculation based on the original calculated field [Total Sales Value] multiplied by 0.1.
It is good practice to reuse calculated fields inside other calculated fields.
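To see why reuse is worthwhile, compare two ways of writing the same tax calculation. This sketch assumes [Total Sales Value] is defined as the sum of Sales[ExtendedAmount]; the name [Total Tax No Reuse] is made up for the comparison.

```dax
-- Reuses the existing calculated field
[Total Tax] = [Total Sales Value] * 0.1

-- Repeats the underlying formula instead of reusing it
[Total Tax No Reuse] = SUM(Sales[ExtendedAmount]) * 0.1
```

Both return the same result today, but if the definition of [Total Sales Value] ever changes, only the first version picks up the change automatically.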
Now is the perfect time to begin writing your own DAX formulas. Open the workbook with the data you loaded in Chapter 1 and start creating DAX expressions. If you've never written these functions before, it's essential that you actually type them out rather than just imagine doing so. Practice as you read through this section for a hands-on learning experience.
To begin, make sure you have loaded the test data as described in Chapter 1. Once the data is ready, you can create the new calculated fields described in the following sections. The first calculated field to write is the one demonstrated in "Here's How: Using IntelliSense" in Chapter 3.
This book includes practice exercises to reinforce your learning, and you should complete them as you encounter them. You can find the answers to all these exercises in "Appendix A: Answers to Practice Exercises" on page 175.
Challenge yourself to write the DAX formulas on your own without looking back at previous examples. If you get stuck, consult Chapter 3 for guidance and then try again. These exercises are for your practice, and the answers are in Appendix A.
Write DAX formulas for the following columns using SUM().
Create a new calculated field that sums the sales from the ExtendedAmount column in the Sales table. This calculated field uses the same formula discussed earlier in the chapter, but it has a new name for this practice exercise, in case you already created the earlier version while studying Chapter 3.
Create a calculated field that sums one of the cost columns in the Sales table. Follow the same structure as the previous calculated field, but add up the product cost instead of the sales amount. You can use any of the product cost columns in the Sales table, as they are identical in this sample database.
Create a new calculated field for the total margin by subtracting total cost from total sales. Make sure you reuse the two calculated fields you created above.
Create a new calculated field that takes Total Margin $ from above and expresses it as a percentage of total sales. Once again, reuse the calculated fields you created above.
Create another calculated field for total sales tax paid. Look for a tax column in the Sales table and add up the total for that column.
Note that the total sales amount above excludes tax, so you will need to add two calculated fields together to get this total.
This is just the same as the others, but this time you add up the quantities purchased. Look for the correct column in the Sales table.
Did you do the following?
Create a pivot table first and place Products[Category] in the Rows section. This gives you immediate feedback on each calculated field as soon as you write it.
When writing calculated fields in DAX, reference every column in the format TableName[ColumnName]. Always put the table name before the column name, and never reference a column without its table name.