Published May 05, 2017 by

Surrogate Key in data warehouse


In most of the table, the  primary key will be loaded from source schema, but some source table might not have a primary key in such has by using sequence generator the primary key will be created, such keys are called Surrogate key.

Surrogate key vs primary key:

In terms of usage, there is no difference between these two types of keys. Both differ in the way of loading primary key loaded from the source table, whereas surrogate key loaded by the sequence generator.

When does it require?
  1. Moving data from multiple locations
    A retail store is running in multiple locations and maintains the dedicated database with the same application for every location. In that case, all locations would have same primary key values. The data cannot be integrated without any intervention, in this case, surrogate key created for smooth data load.
Surrogate key example
  1. No primary key in source table
    Sometimes the source table does not have any primary key it may not be required for a single database or application, but when integrating from different sources the same table data might be loaded from the different one. In that case, surrogate key created to avoid the duplication.
Surrogate key example when no source primary key
  1. New table creation
    A table is derived as part of data warehouse development, for example, month table which is not available in the source where the date is captured in date column in a table itself. Month table would have month and year, a surrogate key is required to reference this in the fact table.
Surrogate key for new table
    email this