How to Run Test Cases using Selenium Grid

Selenium Grid is a tool that distributes tests across multiple physical or virtual machines so that scripts can run in parallel (simultaneously). Running tests concurrently across browsers and platforms shortens the overall test run and gives faster feedback.
Selenium Grid lets us run multiple instances of WebDriver or Selenium Remote Control tests in parallel from the same code base, so the test code need NOT be present on the machines where the tests execute. The selenium-server-standalone package includes the Hub, WebDriver, and Selenium RC needed to run scripts on the grid.
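
As a rough illustration of the idea above, here is a minimal Python sketch that runs the same small test against several browsers at once through a grid hub. The hub address, the browser names, and the helper names (open_remote_session, fetch_title, run_in_parallel) are assumptions made for this example, and it presumes the Selenium 4 Python client; treat it as a sketch rather than a definitive setup.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumption: a hub is listening at this address; adjust for your setup.
GRID_URL = "http://localhost:4444/wd/hub"


def open_remote_session(browser):
    """Open a browser session on the grid (needs the `selenium` package)."""
    from selenium import webdriver  # imported lazily so the helpers below stay testable

    options_by_browser = {
        "chrome": webdriver.ChromeOptions,
        "firefox": webdriver.FirefoxOptions,
    }
    options = options_by_browser[browser]()
    # The hub picks a registered node that can provide this browser.
    return webdriver.Remote(command_executor=GRID_URL, options=options)


def fetch_title(browser, url, open_session=open_remote_session):
    """One small 'test': load a page and return (browser, page title)."""
    driver = open_session(browser)
    try:
        driver.get(url)
        return browser, driver.title
    finally:
        driver.quit()  # always release the session back to the grid


def run_in_parallel(browsers, url, open_session=open_remote_session):
    """Run the same test once per browser, one worker thread per browser."""
    with ThreadPoolExecutor(max_workers=len(browsers)) as pool:
        futures = [pool.submit(fetch_title, b, url, open_session) for b in browsers]
        return dict(f.result() for f in futures)
```

With a hub and at least one Chrome and one Firefox node running, calling run_in_parallel(["chrome", "firefox"], "https://example.com") would return a dictionary mapping each browser name to the page title it saw. The test code lives only on the machine that calls the hub; the nodes just need the browsers and drivers.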

Apache Pig Notes

What is Pig?

Pig was originally developed at Yahoo.
Pig is Hadoop ecosystem software from the Apache Software Foundation, used for analysing data.
Pig scripts are written in the Pig Latin language.

Apache Sqoop Notes

Overview of Sqoop

Sqoop is open-source software from Apache used to transfer data between an RDBMS (Oracle, SQL Server, MySQL...) and HDFS.
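
To make the RDBMS-to-HDFS transfer concrete, the following Python sketch assembles a typical `sqoop import` command line. The JDBC URL, user name, table name, and target directory are hypothetical placeholders, and the helper name sqoop_import_command is made up for this example; only the Sqoop flags themselves are standard.

```python
import shlex


def sqoop_import_command(jdbc_url, username, table, target_dir, mappers=1):
    """Build the argument list for importing one RDBMS table into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,            # JDBC URL of the source database
        "--username", username,
        "-P",                             # prompt for the password at run time
        "--table", table,                 # source table in the RDBMS
        "--target-dir", target_dir,       # destination directory in HDFS
        "--num-mappers", str(mappers),    # number of parallel map tasks
    ]


# Hypothetical values, for illustration only.
command = sqoop_import_command(
    "jdbc:mysql://localhost/retail_db",
    "cloudera",
    "customers",
    "/user/cloudera/customers",
)
print(shlex.join(command))
```

On a machine where Sqoop is installed, the list could be passed to subprocess.run(command, check=True); here it is only printed so the sketch runs anywhere.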

MySQL Database

Connecting to MySQL Database:
root user: root/cloudera
other user: cloudera/cloudera

Apache Hive Notes

Apache Hive
  • Hive is a data warehouse infrastructure tool for processing structured data in Hadoop.
  • Hive was initially developed by Facebook; the Apache Software Foundation later took it up and developed it further as open source under the name Apache Hive.
  • It stores the schema in a database (the metastore) and the processed data in HDFS.
  • It is designed for OLAP, not for OLTP.
  • It provides an SQL-like query language called HiveQL (HQL).
  • Hive is not an RDBMS.
  • Hive is a system for managing and querying data by imposing a structured format on it. It uses MapReduce for execution.
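
As a small illustration of the points above, this Python sketch holds two HiveQL statements and wraps one in a `hive -e` command line. The employees table, its columns, and the helper name hive_command are hypothetical and exist only for this example.

```python
# HiveQL: define a table over comma-delimited text files.
# The table definition goes to the metastore; the rows live as files in HDFS.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS employees (
    id INT,
    name STRING,
    salary DOUBLE
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
"""

# HiveQL: an analytical (OLAP-style) query over that table.
QUERY = "SELECT name, salary FROM employees WHERE salary > 50000"


def hive_command(statement):
    """Build the argument list that runs one HiveQL statement via the Hive CLI."""
    return ["hive", "-e", statement]


print(hive_command(QUERY))
```

On a machine with Hive installed, the list could be handed to subprocess.run; the query reads like SQL even though Hive is not an RDBMS.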