/
Data Warehouse Data Modeling Standards

Data Warehouse Data Modeling Standards

 

*Disclaimer - we do not always have control over standards for adopted systems such as SIS or UCPath

 

Tool: ErWin

Data Lineage Documentation (source to target): ErWin

ETL: Informatica

 

SQL Server (lunar/solar)

These standards are up for debate - one shared DB w/multiple schemas or different DBs? Need to discuss with the DB team to make a final decision.

Database: dss

Schema ODS: ods_{subject area}, for UCPath the standards is hcm_ods

Schema for Data Marts: dss_{subject area}

Schema for EDW (conformed dimensions, person, time): edw

Database for staging: staging

Schema for staging: stage_{subject area}

 

 

Table Naming Standards:

Dimensions: {subject area}_D_{table name} (KFS), {table_name}_D (UCPath), D_{table name} (EDW)

Facts: {subject area}_F_{table name} (KFS), {table_name}_F (UCPath), F_{table name} (EDW)

ODS Tables: If it is a copy of OLTP system table, leave name the same as the source system. If it is a new table, follow the naming standards of the source system.

ODS Views: {table_name}_V, and also should follow the naming standards of the source system.

Access control tables for row-level and column-level security should be in the dss_* schemas: {source system}_KSAMS_{table name}, for example KBM_KSAMS_ROLES

 

Column Naming Standards:

Try to maintain the column names from the source system when possible.

All tables should have a "CREATED_EW_DTTM" and "UPDATED_EW_DTTM" columns to show when the record was created and the data was last updated. Upon initial creation, these two values will be the same until the first update load.

ETL Columns:

SID columns should be named {name}_SID (e.g. ACCOUNT_SID), datatype BigInt Identity, primary key.

All fact tables should have a primary key, which should be a combination of all the dimension SIDs plus any date/time columns (fiscal period, snapshot date, etc.)

All dimensions will include type 2 fields, even if it is a type 1 dim:

*List out Type 2 Fields

 

 

 

 

Add to ETL Standards:

Non-matching Dimension value SID number should be set to -1