Picture the following scenario:
The boss calls at 3:30 P.M. wanting a report on specific data and charts for a presentation first thing the next morning. Quickly, you run an SQL statement to dump the data into a spreadsheet, filter it down to exactly what is needed, create some killer charts, and deliver it within the hour. Then you promptly forget about it… until three months later when it’s requested again. Now you have to wrack your brain, trying to remember what you did between the SQL and the spreadsheet, hoping you can get it exactly right the second time!
For data analysts, this situation is all too familiar. We are constantly given last-minute requests, and often don’t have time to document everything because we’re always in such a hurry.
But if we take time to pause after performing our amazing analytical feats, we can help make everyone’s life a little easier going forward.
Sure, it feels great to be the person who can fulfill an urgent request every time, but it also gets tiring being in a constant rush.
Lighten your load, and maybe even become someone else’s hero, by creating a flexible dashboard with reporting capabilities that anyone who has access to the data can run. Here are the seven steps to get there.
Step 1 – Examine the Current Request
Put together a list of the data elements that were requested this time. What did you have to do to this data to hone it down to exactly what was being asked for? List your filters, the way you have organized the data, and any groupings you might have put it into. Don’t lose your SQL statement that you already ran – you’ll be able to use that to create the new report – but we’ll need to make some adjustments to it in the next step.
Step 2 – Find Similar Past Requests
Look back, if possible, to see what other requests have been made. Can you add any of those data elements into the first list to make the dataset more robust? (Here is where things might get a little more complicated, so be careful with what is added!)
If the list of data provides one row per subject matter, will adding a new data element cause that perspective to change? If so, maybe it doesn’t belong in this dataset.
Say, for example, you have a list of people, and you currently expect one row per person. If you add each person’s e-mail addresses, it could turn into a multiple-row situation, which you don’t want! When this happens, the first thing the data wants to know is: Do you need every single e-mail address, just the active e-mail addresses, or perhaps just one specific preferred e-mail address?
If the answer requires multiple values, you’ll need to do some creative SQL to transpose the data across rather than down. (A few well-planned scalar sub-queries will allow you to stretch multiple rows across columns; you just have to specify the number of columns to add.)
Back to our original dataset: As with the first step, start by listing all the filters you may have applied to the data. Do those filters still work in a larger dataset, or do they cause things to get skewed?
If the filters are too specific to the old dataset, it may not make sense to combine them together. Keep working through past requests until you have a robust list of data elements.
Start building out the original SQL statement to add all the new data elements you’ve identified. Remember, we’re going to take OUT the filters and groupings that may have been in the original statements, because they’re going to be expanded in the next step…
Step 3 – Remove Filters and Create Parameters
Now comes the fun part – creating a parameter form that will make this dataset shine! By taking as many of the hard-coded filters out and making parameters out of them, you’ll have a much more flexible dataset. Providing parameters on the form allows you and any end-users the chance to take a much larger dataset and really drive down into the specific details you need to see for each report run. Instead of having a hard-coded selection, you are going to create a variable that will allow you or the end-user to select one (or many) of whichever item is selected at the time the report is run, rather than hard-coding the value within the SQL statement.
If you’ve done the work to create a great big dataset, your parameter form might need to grow above and beyond the filters you’ve already identified.
Now you can dig in to your dataset and think a little outside the box. What else can you come up with that some end-users might want to select for filtering? There’s a bit more art than science to it. Just make sure to provide as much flexibility as possible. You don’t want it to be too full of parameters.
As you create each parameter, think about how to present the options: Is this variable best represented by a check box, radial button, a single-choice drop-down box, a multi-select list box, or maybe even an open-ended edit box? With so many choices, it can get confusing!
Choose the parameter type that makes the most sense for future reporting. Your original request only needed a single option, but is it possible that future requests may require two? If so, it’s better to assume that the variable will be multi-select, just to be safe. It will make things easier to maintain in the future, since changing from a single to a multiple selection later on can cause problems.
Keep in mind that all your users will not need every parameter for the reports they may be running, so give them the ability to skip a filter. Generally, this is done by providing the option to ‘Select All’ for most list-type parameters, which effectively makes the parameter optional.
Step 4 – Add Parameters to Your SQL Statement and Test, Test, Test
You have to be very slow and systematic in building out your SQL statement. Never add every parameter at the same time! This can cause errors in the resulting dataset.
Instead, run your SQL statement through a few times to make sure that the resulting giant list of data is exactly what you expected to see. This ensures that all the tables in the statement are joined correctly, and any overarching necessary filters are applied correctly.
Once your dataset is correct, add the parameters to the WHERE clause, one at a time. Test them out to make sure the dataset does not get skewed along the way. This is a very common mistake that we analysts can make. We get used to being hurried, so we try to add everything at once, and get stuck with a big mess when the resulting set is incorrect. Do your testing early and often—before the results are presented to end-users—and you’ll gain respect and confidence in the data.
Step 5 – Design Reports
Time to get a little creative again. Since you’ve already provided some reports to your boss and others, go ahead and recreate them with your report-design tool. Dress them up and add groupings that you know your end-users want. Add subtotals and totals as appropriate, so people won’t need to transfer it all into a spreadsheet.
Add as many reports as you can think of ahead of time and document each one so that end-users will know what to expect when they run the report. Also, start preparing any training documentation for end-users. This is your chance to give end-users control of their own reports, so you won’t have to do it for them!
Once you know what data you need, how to filter it down, and have tested it thoroughly, you can start building charts out for on-screen viewing. If you’ve done your homework, you know what types of charts have been requested in the past, so start with those.
Make sure that the SQL statements that run the charts match the SQL statement that runs the report; this way there will be no confusion in the resulting datasets.
Also, take advantage of filters when appropriate. Perhaps you can provide a group of unfiltered charts (that give results with little to no parameter selections), then another group of fully filtered results that hones down to specific requests. This provides end-users with a comparative view on-screen without having to run two different datasets.
Consider OLAP Capability
End-users may not have requested this capability in previous reports. But if you’ve provided a spreadsheet full of data in the past, it’s likely that advanced spreadsheet users will have created some sort of pivot table of results that allows them to really dig into datasets. If some of your users are doing this already, you’ll want to consider creating an OLAP for them.
One important caveat to creating OLAP capabilities: You’ll probably want to use the giant dataset without parameters that you created in Step 2, rather than the fully parameterized dataset from Step 3. This way, end-users can filter and sort to their hearts’ content, and dig down to the exact data elements they need to see at any given moment. The flexibility is outstanding, but more end-user training will be needed to explain all the capabilities this tool provides.
Step 6 – Document Your Work
Hopefully, you’ve been documenting along the way. In case you haven’t, now is the time to document all the work you’ve done. If a documentation standard already exists, then it should be easy enough to follow. If not, you’ll want to include the following in your documentation:
- Data Elements, with definitions (technical and functional)
- Tables used, with joins
- Filters, both hard-coded and parameterized
- Parameters, with code (if used)
- Screen Shots
- Chart Definitions, with code
- OLAP Description
- Report Descriptions, with data elements and additional code (if used)
- End-User Training Materials
Step 7 – Educate
Now it’s time to put on your salesperson hat and start selling this effort to your end-users! They’ll never know this cool dataset is available unless you tell them about it!
It’s important to understand who your audience is here — not every training style will work. For some situations, it can work to fit as many end-users as possible into a room to teach all of them the capabilities in an on-screen demonstration. Or you may want to conduct several smaller group sessions in a computer classroom where they can learn about this dataset in a hands-on setting. For certain users such as your boss (or the bosses’ boss), it might be best to train them individually on their own computers.
Educating others gives you a chance to not only prove how great the tool is, but also allows you to learn what each person’s most commonly used filters might be. If possible, save those filtered sets for their parameters. Then show them how to access those saved filters later on. Do the same with OLAP, if they’ll be using that tool.
The idea is that by providing these saved sets and teaching people in a hands-on setting, they won’t have to call you the next time they need last-minute access to the data and charts for their big meeting!
All this may sound like a lot of work. But in the long run, considering all the time and frustration that results from last-minute requests, you’ll find that this effort really will make your job easier.
You’ll be providing a service to end-users that will open new possibilities to them through this incredibly flexible tool. By documenting your work thoroughly, it will be much easier to figure out how you came up with your results when it’s time to make updates to the dataset .
Then later, after you’ve received a big promotion for being so awesome, the next person who has to fill your considerable shoes and do your job won’t have to dig through file after file of crazy SQL to produce the same information.