SDET- QA Automation Techie

Full Stack QA Automation Testing Blog

  • Home
  • Training
    • Online
    • Self-Paced
  • Video Tutorials
  • Interview Skills
    • HR Interview Questions Videos
    • Domain Knowledge
  • Career Guidance
  • Home
  • Software Testing
    • Manual Testing Tutorials
    • Manual Testing Project
    • Manaul Testing FAQS
    • ISTQB
    • AGILE
  • Web Automation Testing
    • Java Programmng
    • Python Programmng
    • Selenium with Java
    • Selenium with Python
    • Robot Framework(Selenium with Python)
    • selenium with Cucumber
    • TestNG+IntelliJ
    • Mobile App Testing(Appium)
    • JMeter
  • API Automation Testing
    • Rest Assured API Testing (BDD)
    • Rest Assured API Testing (Java+ TestNG)
    • Robot Framework(Rest API Testing with Python)
    • Postman
    • SoapUI
    • API Testing(FAQ's)
  • SDET|DevOps
    • Continuos Integration
    • SDET Essentials
    • AWS For Testers
    • Docker
  • SQL
    • Oracle(SQL)
    • MySQL for Testers
    • NoSQL
  • Unix/Linux
    • UNIX TUTORIALS
    • Linux Shell Scripting
  • ETL Testing
    • ETL Data warehouse Tutorial
    • ETL Concepts Tools and Templates
    • ETL Testing FAQ's
    • ETL Testing Videos
  • Big Data Hadoop
  • Video Tutorials
  • ApachePOI Video Tutorials
  • Downloads
    • E-Books for Professionals
    • Resumes
  • Automation Essencials
    • Cloud Technologies
      • Docker For Testers
      • AWS For Testers
      • Sub Child Category 3
    • Java Collections
    • Selenium Locators
    • Frequently Asked Java Programs
    • Frequently Asked Python Programs
    • Protractor
    • Cypress Web Automation

ETL Basics

 ETL Testing, ORACLE, SQL, SQL SERVER   

What is ETL?
  • ETL stands for extract, transform and load are a three-stage process in database usage and data warehousing.
  • It enables integration and analysis of the data stored in different databases and various formats.
  • After it is collected from multiple sources the data is reformatted and cleaned for operational needs. Finally, it is loaded into a target database.
  • ETL Tools used to move data from one operational system to another 
  • Comprehensive testing of a data warehouse at every point throughout the ETL process is becoming increasingly important as more data is being collected and used for strategic decision-making. 
  • ETL testing is often initiated as a result of mergers and acquisitions, compliance, regulations and data consolidation.
  • ETL usually determines the success or failure of a Data warehouse because a lot of importance on data and the quality of data that is being analyzed
E - Extraction
  • The process of pulling out data that is required for the Data Warehouse from the source system
  • The purpose of the extraction process is to reach to the source systems and collect the data needed for the transformation.
  • Usually data is consolidated from different source systems that may use a different data organization or format so the extraction must convert the data into a format suitable for transformation processing.
  • The complexity of the extraction process may vary and it depends on the type of source data.
  • The extraction process also includes selection of the data as the source usually contains redundant data.
  • Extraction can be automated since it becomes repetitive once established
T - Transformation
  • Transformation is the process of making data into reformatted and cleaned for operational needs.
  • Transformation usually done after Extraction with the purpose of ensuring ‘clean’ and ‘consistent’ data
  • The transformation stage involves applying a series of rules or functions to the extracted data. 
  • Transforming the data in accordance with the business rules and standards that have been established
  • Example include:  format changes, de-duplication, splitting up fields, replacement of codes, aggregates
  • Transformation includes validation of records and their rejection if they are not acceptable in integration part. 
  • Transformation is required to convert and summarize operational data into a consistent, business oriented format
  • The amount of manipulation needed for transformation process depends on the data.
  • The most common processes involved in transformation are conversion, clearing the duplicates, standardizing, filtering, sorting, translating and looking up or verifying if the data sources are inconsistent.
Reasons for “Dirty” data

 1) Dummy Values
 2) Absence of Data
 3) Duplicate Data
 4) Inappropriate Use of Address Lines
 5) Violation of Business Rules
 6) Non-Unique Identifiers

Clean data is vital for the success of the warehouse.

Cleansing
=> The process of removing errors and inconsistencies from data being imported to a data warehouse
=> Cleansing could involve in multiple stages


L - Loading
  • Loading is the process of transfer data to the warehouse after extracting, Transforming, cleaning and validating 
  • The loading is the last stage of ETL process
  • It loads extracted and transformed data into a target repository. 
  • One of the ways to load ETL data is physically insert each record as a new row into the table of the target warehouse involving SQL insert statement 
  • The loading part is usually a bottleneck of the whole process. 
  • Issues while loading
  • Huge volumes of data to be loaded
  • Small time window available when warehouse can be taken off line (usually nights)
  • Allow system administrators to monitor, cancel, resume and change load rates
  • After failure, While recovery needs restart from where you were and without loss of data integrity
The Need for ETL
  • ETL facilitates Integration of data from various data sources for building a Data Warehouse 
  • Businesses have data in multiple databases with different formats
  • Makes data available in a queriable format
Data Integrity Problem Scenarios
  • Same person, different spellings
Eg. Agarwal,  Agrawal,  Aggarwal etc.
  • Multiple ways to denote company name
Eg. Persistent Systems, PSPL, Persistent Pvt. LTD.
  • Use of different names
Eg. Mumbai, Bombay
  • Different account numbers generated by different applications for the same customer
  • Required fields left blank
  • Invalid product codes collected at point of sale
  • Manual entry leads to mistakes
Characteristics of ETL Tools
  • Provides facility to specify a large number of transformation rules with a GUI
  • Generate programs to transform data
  • Handle multiple data sources
  • Handle data redundancy
  • Generate metadata as output
  • Support data extraction, cleansing, aggregation, reorganization, transformation and load operations
  • Generate and maintain centralized metadata
  • Closely integrated with various RDBMS
  • Filter data, convert codes, calculate derived values, map many source data fields to one target data field
  • Automatic generation of data extract programs
  • High speed loading of target data warehouses
ETL Tool - General Selection criteria
  • Business Vision/Considerations
  • Overall IT strategy/Architecture
  • Over all cost of Ownership
  • Vendor Positioning in the Market
  • Performance
  • In-house Expertise available
  • User friendliness
  • Training requirements to existing users
  • References from other customers







  • Share This:  
  •  Facebook
  •  Twitter
  •  Google+
  •  Stumble
  •  Digg
Email ThisBlogThis!Share to TwitterShare to Facebook
Newer Post Older Post Home
popup

Popular Posts

  • How To Explain Project In Interview Freshers and Experienced
    “ Describe an important project you’ve worked on ” is one of the most common questions you can expect in an interview. The purpose of a...
  • API/Webservices Testing using RestAssured (Part 1)
    Rest Assured : Is an API designed for automating REST services/Rest API's Pre-Requisites Java Free videos: https://www.you...
  • MANUAL TESTING REAL TIME INTERVIEW QUESTIONS & ANSWERS
    1. How will you receive the project requirements? A. The finalized SRS will be placed in a project repository; we will access it fr...

Facebook Page

Pages

  • Home
  • Resumes
  • Job Websites India/UK/US
  • ISTQB
  • Selenium with Java
  • E-Books for Professionals
  • Manual Testing Tutorials
  • Agile Methodology
  • Manual Testing Projects

Live Traffic

YouTube


Blog Visitors

Copyright © SDET- QA Automation Techie | Powered by Blogger
Design by SDET | Blogger Theme by | Distributed By Gooyaabi Templates