打开APP
userphoto
未登录

开通VIP,畅享免费电子书等14项超值服

开通VIP
Business Dimensional Modeling:The Key to Fact Table Success

As I have discussed in the past, it is critical that you design and build single grain fact tables. This requires that there is one and only one level of detail represented in the fact table for each dimension. This also means that there must be a value in every row for every dimension key - no nulls are allowed! Also, the logical primary key of a fact table is the collection of foreign keys from your dimensions. A physical primary key is the collection of columns that will uniquely identify a row in a table. Most of the time, the logical and physical primary keys will be the same. However, they do not need to be.

Consider a basic purchase order (P.O.) scenario. The purchase order is described by customer, order date, requested ship date and sales representative. These are the dimensions. To uniquely track the individual order, the purchase order number is also a dimension. There would be several fact columns as well. See Figure 1 for an example of the purchase order fact table.


Figure 1

The purchase order number is considered a degenerate dimension because all of the interesting attributes about an order have already been placed into another dimension. This leaves only the purchase order number itself. This should be handled in the business dimensional model as a full-fledged dimension, but does not require the creation of a separate dimension table. You may still create a physical degenerate dimension table if your data access tool does not recognize it.

There are additional facts associated with a purchase order, but they exist at a different grain or level of detail. There are facts at the line item detail, which describe the products that are being purchased. Looking simply at the relational tables, we can uniquely identify a line item with the purchase order number and product (see Figure 2).


Figure 2

This would allow unique identification of a row in the purchase order line item table, but it would not easily support the business users for analysis. If you wanted facts for a specific product that was ordered in January, you could get the answer, but not directly. You would need to access both fact tables and join the results on the purchase order number. Most analyses would require using both fact tables. This is not an optimal situation.

You can eliminate that layer of complexity by carrying all of the purchase order dimensions down to the line item fact table. This is illustrated in Figure 3.


Figure 3

Usually, adding dimension keys to a fact table changes the grain of the table and increases the number of rows. In this instance, it does not change the number of rows or the grain. Suppose that you have 5 million rows in the line item fact table in Figure 2. After we add the dimensions in Figure 3, we still only have 5 million line item rows. Each row is fully described with these added dimensions. The size of the fact table does increase. We have added four additional columns. At four bytes each, we have an increase of 16 bytes for each row in the table.

Figure 3 illustrates the difference of the physical primary key of a fact table from the dimensional logical primary key. The physical primary key is comprised of purchase order number and product key. The logical primary key (how we view and use the table dimensionally) is comprised of the order date key, requested ship date key, customer key, sales rep key, P.O. number and product key.

We should also consider the need to retain the purchase order fact table itself. Review each of the facts to determine:

  • Does this fact represent an aggregate that can be created by summing facts from the purchase order line item table? If so, simply eliminate the fact from the purchase order fact table.
  • Can this fact be allocated or distributed across each of the individual line items? If so, then create the new allocated fact during the transformation process. The allocated fact is placed on the line item fact table and eliminated from the purchase order table itself.

If all of the facts can be taken down to the line item level of detail, then the purchase order fact table is no longer needed!

There are many legitimate instances where the purchase order level facts cannot be allocated down. Consider the case where a hardware store offers a customer a coupon for $10.00 off a total purchase of $100.00 or more. There is a big sale on drills, and the retailer is selling them with a very low profit margin. The customer purchases:

Drill $74.00
Garden Gloves $9.00
Gallon of Paint $18.00
Flower Seeds $1.50
Vegetable Seeds $1.50

Total Goods $104.00
Coupon -$10.00
Total Due $94.00

Therefore, the customer pays $94.00. To allocate the coupon across the products, a simple average would result in $2 off the vegetable seeds, which is more than the item cost. A weighted average seems the next logical step but can impact the profitability measurements of the retailer‘s merchandising staff. If the drills were on sale close to the retailer‘s purchase price, the additional $7.12 allocated to the drill could drive the price below what the retailer paid for the item. Total transaction coupons are often accounted for and analyzed separately. If this is the business case, then you must retain the purchase order fact table. How would you know if you can allocate these facts? Ask the business!

As you have seen, there can be a difference between the physical primary key and the logical dimensional primary key of a fact table. Including additional dimensions that further describe the facts can expand the types of analyses that are possible.

本站仅提供存储服务,所有内容均由用户发布,如发现有害或侵权内容,请点击举报
打开APP,阅读全文并永久保存 查看更多类似文章
猜你喜欢
类似文章
? MySQL ORDER BY 的实现分析
6.1.6 Sorted and ordered collections (1)- sorted collections, SortedMap,SortedSet
MySQL Order By实现原理分析和Filesort优化
Oracle(PLSQL)入门学习五
MySQL索引操作命令(创建索引、重建索引、查询索引、删除索引)总结
MySQL5.7常用命令
更多类似文章 >>
生活服务
热点新闻
分享 收藏 导长图 关注 下载文章
绑定账号成功
后续可登录账号畅享VIP特权!
如果VIP功能使用有故障,
可点击这里联系客服!

联系客服