Exactly what is a Slowly Changing Dimension (SCD)?

Just the sound of the phrase “slowly changing dimension“ can sound quite intimidating when uttered in a technical meeting, but you might be shocked at how simple this concept really is. In fact, it has nothing to do with shifts or alterations in the universe at all. J

If you are involved in data warehousing you should have a firm grasp of SCDs as they will be necessary to structure your data in a meaningful way that takes advantage of the capabilities of a data warehouse; specifically when you wish to store historical data rather than simply overwriting a record. So, let’s break it down:

Slowly Changing Dimensions: Dimensions in your DW that include attributes that will gradually change over a period of time.

There are three types:

Type I – Overwrite old values: The old record is completely over written by the new record. Type I is used when you have no desire to store the historical attributes of the record.

Type II – Create an additional record: The old record will be kept and a new record will be inserted. Type II is used when you wish to store the historical attributes of the record.

Type III – Create new fields: This type creates new fields in the record to maintain the history and the current values of the specific attributes in the record.

Below are examples of how to use each type. In this example the price of a product is changing.

Type I: The old values are over written

Product price in 2011

Product price in 2012

Type II: A new record is created.

Product price in 2011

Product price in 2012

As you can see in the above example there are a few problems with the structure of the table. When using a type II SCD you will need to add a couple of basic fields to effectively track the historical records. There are several possibilities, but the idea is to be able to differentiate between the two records. Also you may notice that there are two identical IDs for this record. That is OK. This is still the same product so a common identifier is necessary. We will be able to narrow down our selection criteria by choosing the active record. A more realistic type II record would look like the below.

Product price in 2012

Type III: New fields are inserted

Product price in 2011

Product price in 2012

There are many scenarios in business where each of these SCD types will be useful. The most common are types I and II because type III can only store a limited amount of historical data.

I hope that this explanation has helped to simplify your understanding of how to use slowly changing dimensions in data warehousing.

Advertisements

Looking for BI Talent? Questions to ask a potential BI Analyst

A Business Intelligence Analyst position requires a wide range of technical knowledge and problem solving skills as well as a personality that can fit well in both IT and business analytical environments. Many IT centered people prefer to keep to themselves and focus on programming. This type of personality, while valued, will likely not fit well in a BI Analyst position. On the flip side, someone with loads of personal skills who is lacking in technical comprehension will simply not fit the bill.

What you are looking for is the perfect balance between IT and business, which really can’t be judged by reading an applicant’s resume. Resumes can often be misleading. They are full of recommendations, certifications, and include a long list of skills that just so happen to match the job description exactly.

Below I have listed a few questions that might help you gauge the competency of your BI Analyst applicant, starting with basic comprehension of a BI solution.

What is a BI solution used for?

This is a good starting question that a qualified applicant should be able to easily tackle. They will have the opportunity to present their perspectives on what a BI solution actually is and their answer will reveal whether or not they see the same potential value in BI as you do. Ultimately you are listening for the key words “better decision making.”

Walk me through the flow of data in a BI solution starting with the source system.

This question will test to see if the applicant understands the data flow from the technical perspective all the way through to the presentation level. An acceptable answer can be as simple as: Source -> ETL -> Data Staging -> ETL -> Data Warehouse -> Dashboards.

How would you approach analyzing a report with bad data?

This will provide some discovery into the applicant’s thought processes and problem solving skills. You should expect to hear that the applicant would question the integrity of the report as well as the integrity of the data in the data warehouse and the source system. From there the applicant should be able to walk through his or her plan for correcting the issue.

What steps would you take when building a data warehouse?

This will be the beginning of evaluating the applicant’s technical knowledge. The applicant should understand the importance of communicating with users and gathering business requirements. The applicant should also display an understanding of how to design the data warehouse schema and plan for the structure of the upcoming fact and dimension tables.

From here you will want to move into more technical questions. The specific questions will greatly depend on the specific software that you use. You will want to ensure that there is an understanding of the different components of an ETL package. It is also important to test the applicant’s understanding of how security should best be organized and utilized in reporting services and dashboards.

It is a good idea to push the applicant to the edges of his or her knowledge. At some point you will reach a question that the applicant doesn’t know. In this case “I don’t know” is an acceptable answer. What is less appealing is when the applicant obviously doesn’t know the answer but attempts to bull shit and ramble while avoiding a direct answer. A confident and intelligent person is comfortable to admit when he doesn’t know something, so you should view this as a positive in communication skills.

I hope that this has been a valuable resource for you and that it might jump start some ideas when interviewing your BI Analyst applicant. Happy hunting!

 

Mobile BI

Source: Glenn Fleishman

From paper to desktops to laptops and now tablets; business intelligence is evolving. Business decisions can now be made from a plane, a bus, or while you’re lounging on the beach sipping a Mai Tai. We are no longer limited to the confines of our office when an imminent decision must be made.

Mobile BI offers us all of the advantages of our desktop applications with the added option of pulling location specific data. Are you traveling for business and you only want to view the BI data that is relevant to your location? Tablets are equipped with GPS devices that will allow your BI application to determine your coordination and pull data that is geographically relevant.

Mobile BI is not just for data extraction, but data collection too! A great example lies on the gaming floors of casinos where hosts have been armed with iPads. Gamers are able to avoid long lines by meeting directly with a host who can use the tablet to gather the customer’s information and book a hotel room, sign up for a player’s club card, or even make reservations at the casino’s sushi restaurant. This concept makes marketers ecstatic while in turn attending to customers’ needs more efficiently.

Privacy and security concerns lurk in the background as companies edge into the mobile BI sector. Is the company tracking too much information about the actions of the employee while using the tablet? What policies should be put into place about the rights of employers and employees? And the most popular concern is the idea of company data floating around outside of the company’s firewalls. These are all considerations, but certainly not road blocks when deciding to empower your company with mobile BI capabilities.

Overall, easier access to data means we can accomplish more from nearly anywhere in the world. There is less waiting and more doing as decisions are easily made on-the-go. The future will bring many more mobile solutions with features that will allow us to do much more than view data, but manipulate it as well.

 

Where does SQL belong in your BI solution?

Well, in my opinion, it definitely belongs somewhere. SQL is a powerful querying language that empowers tech-savvy users to create large data sets with just a few simple commands. Regardless of how robust your BI solution is, SQL querying will never go away. Many BI vendors’ major selling point is the simplicity of report creation through drag and drop interfaces. Users will never have to see the code being generated because of the wonderful and visually appealing user interface.

So, what if your analysts are already accustomed to accessing your data warehouse through SQL? It just isn’t feasible to expect them to drop this tool of empowerment and move straight into creating ad-hoc reports using only your new business intelligence software. It is obvious why, who doesn’t want to run a “select * “ statement occasionally? J

Herein lays the fear: What if the analysts never migrate to the new BI tools? What if interactive dashboards and web driven reports are never created? What happens if reports continue to be emailed as attachments? While these are all valid concerns, you will need to step back and gain some perspective (and patience).

As the BI Administrator you have likely had a large role in the development of the data warehouse. You are familiar with the tables, the joins, and the meaning of the data. By the time your BI solution goes live you are likely excited to see the true potential of the tools revealed, so a few days later you might ask: Where are all those pretty dashboards? What you have to realize is that you are way ahead of the pack on this one. (It’s lonely huh?)

Other analysts will need time to adjust to and experience the new organization of your company’s data. If they are used to querying through SQL then it is a great idea to set them up with access to the new data warehouse using a familiar tool. There will be some period of training where the analysts will learn the capabilities of the new BI tools and they will begin to make connections between the data and the possible displays that can be created.

Continue to be patient and you will likely see user adoption increasing, after all, the analysts are probably responsible for helping to realize that great ROI that was calculated before investing in a BI solution so they want to see it succeed  as much as you do!

BI & Collaborative Decision Making (CDM) Software: A Happy Couple

Business intelligence provides businesses the opportunity to analyze massive amounts of data in relatively simple formats. The obvious goal when companies deploy a business intelligence solution is to achieve a high ROI by streamlining business processes through the analysis of the presented data. Unfortunately many organizations may be missing out on their ROI by failing to follow through with the decision making processes that should follow the analysis of the data.

Collaborative decision making software may help bridge the gap by keeping the data analytics and the decision making process linked. With CDM software multiple people in the organization will have the ability to collaborate on the displayed data and converse about the interpretation of it. CDM will avoid the scenario where a colleague calls a meeting and presents his findings to a group of decision makers. Rather than isolated analysis of the data, group interaction will occur which has the potential to generate more ideas and creative ways to take action based on the displayed results of the data. In addition, this type of interactive collaboration makes it possible to pull in other parties, who might add valuable insight, nearly instantly.

CDM software is also self-documenting. There will be no need to take notes and write documents (this hardly ever happens anyway) that record the results of a meeting or discussion. All parties involved in the decision making process will be held accountable and this will avoid the scenario that may occur after a decision is made…”Why did we decide to do that again?”

CDM software is rapidly growing in popularity and is now often included as a standard feature in a business intelligence solution package. When choosing your BI software be sure to consider if CDM platform will add value to your organization. If you already have your BI solution in place it may be a good idea to ask if there is an add-on solution for BI collaboration.

 

Correcting Data Mistakes in Your Data Warehouse

Your analysts found some data that doesn’t quite look right in your data warehouse and ask you to investigate. You look into the data in more detail and find that the data is indeed incorrect. This is quite alarming since analysts and executives have been making business decisions based on this flawed data. In addition, as the data warehouse administrator you probably take a lot of pride in providing users with quality and reliable data. It is undesirable to run the risk of losing user confidence in the data. Now it is time to dissect the ETL process and determine what went wrong, but where to begin?

  1. Address the issue – It is important that you inform the people accessing this data that it is incorrect. Any analysis or decision making processes that use the bad data should be discontinued until the data is corrected.
  2. Reverse engineer – This will be much easier if you were the architect who designed the data warehouse. It will also be a simple process if detailed documentation exists for your data warehouse. Unfortunately this might not be the scenario you are facing.You will first need to trace the data back to your data staging area by reviewing the ETL package that moves data from data staging to the data warehouse.

    From there you will need to review the ETL process that moves data from your source systems into your data staging area.

    The next step will require you to review your source system data.

  3. Now ask questions – The next step is to analyze what you have found in your reverse engineering process.Is the source system data wrong? If this is the case you may need to investigate the possibility of data entry mistakes by users, or bugs in the source system. Have any updates or enhancements been made to the source system that could cause this error?

    If the source system appears to be correct then you can conclude that the issue lies somewhere in the data warehouse build. Is your data staging area correct? In most cases your data staging area will be straight data pulls from the source system without any major data transformations, but this possibility should still be investigated.

    Finally the last step, and the most complicated, is to begin sifting through the intricate ETL package and data quality process that populates your data warehouse. Hopefully this is where you will find your ETL error. If this investigation is unsuccessful however you may want to consider other external processes that alter your data warehouse. For example, is there a scheduled job that runs separately from the initial data warehouse build?

  4. How far back does the bad data exist? – If the bad data only started appearing in recent months or days then you should investigate what changed that may have caused this unexpected change. If the data has always been incorrect you can assume that the data warehouse architect simply made a mistake.
  5. Correct or move forward? – Depending on the type of data and the extent of the damage you may opt to correct the historical data or it may be more efficient to correct the data for the future and chalk the old data up to a lost cause. If you choose to only correct the data moving forward then I suggest truncating the field so that future analysis will not include the bad data.

If you are facing the task of debugging your data warehouse I hope that this has been helpful. It can be an overwhelming task (especially when others are not exactly happy) but take a deep breath and just begin the process while focusing on one step at a time.

What is SharePoint? A simple answer to a complex question.

In this video I will simplify the components of the SharePoint platform. Many people view SharePoint as a single product, but as I will demonstrate, it is actually a platform that supports multiple products and functionalities.

The capabilities of SharePoint go far beyond what I have outlined in the video. My goal is to help you to compartmentalize the complexity of the tool so that as you learn more about its’ capabilities you will be able to separate functionality into the six categories of products that SharePoint offers.

I hope this video is beneficial and helpful in your business intelligence research.

BI Software as a Service (SaaS)

The majority of business software applications are still installed locally on desktops. Administrators will have tools for ETL and data quality processes while developers will have tools for managing reports and building dashboards. The software required for each of these processes is typically very expensive and also designed for big data companies. These characteristics often make business intelligence less accessible for small or medium sized companies.

A realistic solution for smaller companies to consider is BI software as a service, also known as on-demand BI or cloud BI. SaaS is offered by vendors on a pay as you go plan rather than purchasing an annual license. Companies like Panorama  and GoodData  are now offering cloud hosted BI software as a cost effective solution for smaller businesses. Your company may already be using other cloud based applications. If this is the case then you already know that this alternative requires far less investment with minimal up-front costs. An additional cost saving factor is that SaaS does not require an on-site administrator to maintain software, which will save your company in labor costs.

There are several implications to consider when choosing SaaS BI for your business. A concern for many managers is that hosting data on the cloud means that secure data will be sent outside the company firewall. BI SaaS is sometimes a simplified version of traditional BI software. This means that functionality may be limited or features may be difficult to use. SaaS is often less scalable and BI SaaS may not be able to grow with your company if your data needs change drastically.

It is common for companies to offer free online demos of their BI SaaS. I recommend taking a test drive with the software before making a decision. It would be most beneficial for you to upload some of your actual data and conduct a trial data analysis using the software. You may find that BI SaaS meets all of your data needs, or you may wish to pursue a more traditional BI approach.

Scope Creep

Changing business intelligence requirements is a normal part of doing business in a fast paced environment. It isn’t uncommon for business processes to evolve during the implementation of your business intelligence solution. The question isn’t how to handle the project if the requirements change, it’s how to handle the project when the requirements change.

The best approach in these situations really depends on the extent of the change that needs to be implemented and the completion percentage of your data warehouse. During the beginning stages of your project it will be fairly reasonable to pull in additional fields from your source systems, however, if you are nearing the completion of your business intelligence solution requests for additional data should be postponed until the system is fully functional.

Data quality processes will generally be one of the final stages of your business intelligence solution. During this time period it is common to experience some definition changes as to how your data should be cleaned up and organized. Small enhancements such as changing the hierarchy of cleansed records will be a nuisance but feasible at this stage.

Obviously changing any requirements midway through the project is a very undesirable situation, but unfortunately this scenario is sometimes unavoidable. Managing scope creep will be a very involved process from the beginning stages of your project until completion. You won’t be able to “rule with an iron fist” because some changes will need to be made, but you will be able to actively manage the users’ expectations by keeping them informed on the limitations of changes and by working with your vendors to determine which types of changes are acceptable and which ones are not.

Vacations: Good for you, Great for your Organization

We all know them, they are at the office in the wee hours of the morning and the last ones to leave in the evening. They are usually extremely productive and are viewed as extremely dedicated to their work. They sacrifice their vacation time because there is simply too much to get done. They’ve sometimes earned the title of “workaholic.” What many of us often overlook is that putting in these extremely long hours without taking the time to rejuvenate may actually be harming our team more than it is helping.

Working in the IT industry, like many other professions, requires long hours of continual focus and concentration. We are challenged with solving complex technical and non-technical issues on a daily basis. We manage multiple projects and deal with dozens of team members, vendors, and executives simultaneously. This fast-paced and highly demanding work environment will eventually take its toll on you if you do not schedule time to separate yourself from your work and rejuvenate.

Here are a few of the many benefits of taking regular vacations:

Get a new perspective: You may have heard analogies of being “too close to the project” to see the solution clearly. Sometimes when we take a break from our projects we will return with a fresh perspective and keener problem solving skills.

Reduce stress: Most of us know that when we are stressed out we make poor decisions. Taking a breather will allow us to unwind and relax. With a relaxed mind and body we will be able to think more clearly and handle difficult situations with a level head.

Improve health: If you find yourself taking more sick days than usual this may simply be your body telling you that it needs a break. A vacation will allot you the opportunity to get more sleep, exercise, and pay more attention to your diet. A rushed lifestyle may lead us to become sleep deprived, eat fast food, and to be sluggish as we sit at our desk all day.

These are just a few of the many benefits of taking some time away from work. Turn off the cell phone and avoid checking your email for a few days and you will be rewarding yourself and your coworkers in the long run.