Posts

Showing posts from October, 2019

User Story

User Story basics A user story is a description of a requirement from a user's point of view. Although not a scrum artifact strictly speaking, in industry each product backlog requirement is usually maintained in the form of a user story. User stories are usually written on three by five-inch cards. Small enough to carry and pass around the team, but large enough to write a description and acceptance criteria on. User Stories – The Three Rs ------------------------------------  |  As a <role>  |  I want <requirement>  |  So that <reason/ROI>  ------------------------------------ ------------------------------------  |  As an F1 game player                       |  I want to see a circuit  map       |  So that I know where I am on the circuit             -----------------------...

Epic and Theme

Image
Epic and Theme What is an Epic? An epic is a large story, an epic is usually a story that cannot be completed within one iteration or within one sprint in the scrum. An epic needs to be broken down into separate stories before working on it. So what makes this an epic? Well, the things that usually make a story into an epic are firstly we're early in a project and we don't have all of the information that we will end up having around this particular user story. What is a Theme? A theme is a collection of stories with a shared attribute. Here's an example of a theme. SPORTS PAGE: As a sports fan I want a group of sports pages so that I can see the top 10 news stories for each sport. The difference between an epic and a theme is that whereas an epic is simply a large story that may be broken down into further stories later, a theme actually refers to a collection of stories . Here, in this case, we know that we're going to n...

Data Warehouse Physical Design

Data Warehouse Physical Design Data files guidelines Create files with an initial size  Based on the eventual size of the objects that will be stored on them  This pre-allocates sequential disk blocks and helps avoid fragmentation.  Disable autogrowth  If you begin to run out of space in a data file, it is more efficient to explicitly increase the file size by a large amount rather than rely on incremental autogrowth.  Filegroups guidelines Create at least one filegroup in addition to the primary one, and then set it as the default filegroup so you can separate data tables from system tables. Create dedicated filegroups for extremely large fact tables and using them to place those fact tables on their own logical disks. If some tables in the data warehouse are loaded on a different schedule from others, consider using filegroups to separate the tables into groups that can be backed up independently. If you intend to partition a large fa...

Types Of Measure And Types Of Fact Tables

Image
Transaction Fact Tables: Every record represents one transaction; such as the sales of an item in a store. Periodic Snapshot Fact Tables :  Every record represents a measure for a period of time. Accumulating Snapshot Fact Tables: Each fact represents a process, business process, with a clear beginning and end.

Junk Dimensions And Degenerate Dimension

Image
Junk Dimensions  Combine low-cardinality attributes that don’t belong in existing dimensions into a junk dimension  Avoids creating many small dimension tables; for the purpose of simplifying the model. Degenerate Dimension A key without its related dimension Provides grouping and business meaning (i.e: the college-department code in a university; this code is usually well known by the decision-makers or the business owners)

Slowly Changing Dimensions – Type 3

Image
Slowly Changing Dimensions – Type 3  Rarely used The previous value (or complete history of previous values) is maintained in the dimension table row. This requires modifying the dimension table schema to accommodate new values for each tracked attribute, and can result in a complex dimension table that is difficult to manage.  References: https://www.slideshare.net/antoniosch/building-data-warehouse-in-sql-server?from_action=save And others

Slowly Changing Dimensions – Type 2

Image
Slowly Changing Dimensions – Type 2 Updates the dimension table and keeps track of the old values through using "EffectiveDate" and "ExpirationDate" fields, while each date has its own record in the dimension table and a foreign key in the fact table. These changes involve the creation of a fresh version of the dimension entity in the form of a new row. Typically, a bit column in the dimension table is used as a flag to indicate which version of the dimension row is the current one. Additionally, DateTime columns are often used to indicate the start and end of the period for which a version of the row was (or is) current. Maintaining start and end dates makes it easier to assign the appropriate foreign key value to fact rows as they are loaded so they are related to the version of the dimension entity that was current at the time the fact occurred. Note:  Each row in a dimension table represents an instance of a business entity by which the mea...

Using ETL Staging Tables

Using ETL Staging Tables When the volume or granularity of the transformation process causes ETL processes to perform poorly, consider using a staging table on the destination database as a vehicle for processing interim data results. Staging tables are normally considered volatile tables, meaning that they are emptied and reloaded each time without persisting the results from one execution to the next. Staging tables should be used only for interim results and not for permanent storage. When using a load design with staging tables, the ETL flow looks something more like this: Delete existing data in the staging table(s) Extract the data from the source Load this source data into the staging table(s) Perform relational updates (typically using T-SQL, PL/SQL, or another language that is specific to your RDBMS) to cleanse or apply business rules to the data, repeating this transformation stage as necessary Load the transformed data from the staging table(s) into the final des...

Data Warehouse Vs Data Mart

Image
What is Data Warehouse? A Data Warehouse collects and manages data from varied sources to provide meaningful business insights. It is a collection of data that is separate from the operational systems and supports the decision making of the company. In Data Warehouse, data is stored from a historical perspective. The data in the warehouse is extracted from multiple functional units. It is checked, cleansed and then integrated with the Data warehouse system. Data warehouse used a very fast computer system having large storage capacity. This tool can answer any complex queries relating to data. What is Data Mart? A data mart is a simple form of a Data Warehouse. It is focused on a single subject. Data Mart draws data from only a few sources. These sources may be central Data warehouse, internal operational systems, or external data sources. A Data Mart is an index and extraction system. It is an important subset of a data warehouse. It is subject-oriented, and it is desig...