QA City

Join Hands to Build a Smarter India: Sign in | Join now
Most Popular

All About Testing Data Warehousing Applications

By SiliconIndia   |   Wednesday, 27 June 2012, 03:45 Hrs
Print Email


Bangalore: Businesses today are concentrating on assembling and analyzing data to make informed decisions. Take a look at a few points to keep in mind when testing ETL (Extract, Transform and Load) applications. These points are applicable to any organization that is attempting to improve an existing data warehouse.



Data must be completely loaded:



This basically means all the data must be fully loaded into the data warehouse. This includes validating all records, all fields and the full contents of each field are loaded. Strategies to consider include:



1. Record counts between source data, data loaded to the warehouse and rejected records could be compared.
2. Unique values of key fields between source data and data loaded to the warehouse could also be compared. This is a valuable technique that points out a variety of possible data errors without doing a full validation on all fields.



Data Transformation:



Ensuring that data has transformed correctly based on business rules can be the most challenging part of testing an ETL application and it involves a lot of transformation logic. The straightforward technique is to pick some sample records and "stare and compare" to validate data transformations manually. But this requires manual testing steps and testers who comprehend the ETL logic. A combination of automated data profiling and automated data movement validations is a better long-term strategy.



Quality of Data:



We can define Data Quality as how the ETL system handles data rejection, substitution, correction and notification without modifying data. Usually, data quality rules are defined during design. These may include rejecting the record if a certain decimal field has nonnumeric data or substituting null if a certain decimal field has nonnumeric data.



Depending on these rules, scenarios to test might include null key values, duplicate records in source data and invalid data types in fields. Go through the detailed test scenarios with business users and technical designers to ensure that all are kept in the loop.



Performance:



As the volume of data in a data warehouse grows, ETL load times can be expected to increase and performance of queries might degrade. This could be navigated by having a solid technical architecture and good ETL design. The aim of the performance testing is to point out any potential weaknesses in the ETL design, such as reading a file multiple times or creating unnecessary intermediate files.



User-Acceptance Testing



The main reason behind deploying a data warehouse application is to make data available to business users as they know the data best. So their participation in the testing effort is vital for the success of a data warehouse implementation. User-acceptance testing (UAT) typically focuses on data loaded to the data warehouse and any views that have been created on top of the tables, not the intricacies of how the ETL application functions. You could consider the following strategies:



1. Make use of data that is either from production or as close to production data as possible. Users might find issues once they see the "real" data which usually leads to design changes.
2. It is also important that users fully understand how the views have been created. It will be necessary to test the database views.
3.  System test team must be able to support users during UAT. The users would most probably have queries about how the data is populated and need to understand details of how the ETL works.



Regression Testing



Regression testing is basically the revalidation of existing functionality with each fresh release of code. When building test cases, keep in mind that they will likely be executed multiple times as new releases are created several times due to defect fixes, enhancements or upstream systems changes. An efficient strategy to retest basic functionality is storing source data sets and results from successful runs of the code and contrasting new test results with previous runs. When doing a regression test, it is more effective to compare results to a previous execution than to do a whole data validation from scratch.



Taking these considerations into account during the design and testing portions of building a data warehouse will ensure that a quality product is produced and prevent costly mistakes.


Sign Up for QA Digest and Read the Day's Highlights
Don't Miss
Experts on QA
Praveen  Mohan
Principal Engineer
Yahoo
Mangesh Shrikant Gokhale
SeniorTestEngineer
Sigma Software
Vimal Raj Selvam
SoftwareTestEngineer
BNP Paribas
Harsha  Nutalapati
Sr.Vice President
Indium
Punit  Thakkar
CEO
Shivaami Corporation
Vaidyanath  Ramalingam
Director-Engineering
Huawei
Arobinda  Tripathy
Senior Manager
Mindfire Solutions
Soumik  Ghosh
Product Management
Robert Bosch
Write your comment now
 
Whitepapers
This report is the result of the largest public-private sector rese...
For those not familiar with the Coverity Scan™ service, i...
SiliconIndia About Us   |   Contact Us   |   Help   |   Community rules   |   Advertise with us   |   Sitemap   |  
News:       Technology   |   Enterprise   |   Gadgets   |   Startups   |   Finance   |   Business   |   Career   |   Magazine  |   Newsletter   |   News archive  
Cities:        CEO   |     Startup   |   Mobile   |   CIO   |   Women   |   BI   |   HR   |   SME   |   Cloud   |   Marketing   |   QA   |   Java   |   Web Developer  
Community:      Members   |   Blogs   |   Indian Entrepreneurs   |   Gyan   |   Advice   |   Community   |   Find   |   CXO Insights  
Job Board:      Jobs   |   Freshers   |   Companies   |   HR Speak   |   Forum  
Online Courses:   Web Developer   |   Java Developer   |   CCNA Training   |   SEO   |   SAS   |   SQL Server 2005   |   J2EE
Education:   MBA   |   MCA   |   Engineering   |   Training Institute
Life:          Real Estate   |   Travel   |   Finance   |   Gadgets   |   Movie Reviews   |    Jokes  
Send your feedback and help us continue to improve SiliconIndia
© 2014 InfoConnect Web Technologies India Pvt Ltd. all rights reserved