- Creating a UDF
- Packaging a UDF (creating a JAR file)
- Adding the JAR file to Hive
- Testing the UDF
Steps to create and test UDFs
1) Implement the code for the UDF in Java
2) Package the Java class into a JAR file and copy it to some location
3) Add the JAR file to the Hive CLI
4) Create a temporary function in Hive
5) Use the Hive UDF in a query
Prerequisite: The table should contain some data.
Problem statement-1
Find the maximum marks obtained by a student out of four subjects.
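Step 1 for this problem (implementing the UDF in Java) is not shown in these notes. The following is a minimal sketch of what the evaluation logic could look like, not the original class. The class name GetMaxMarks and package udfs are taken from the create temporary function command; the parameter types (four int values) are an assumption. In a real Hive project the class would extend org.apache.hadoop.hive.ql.exec.UDF; that is left out here so the sketch compiles without Hive on the classpath.

```java
// Sketch of the evaluation logic for a "max of four marks" UDF.
// Assumption: in the Hive project this class would be declared in
// package `udfs` and would extend org.apache.hadoop.hive.ql.exec.UDF.
// Both are omitted so the sketch compiles without the Hive jars.
public class GetMaxMarks {

    // Hive's classic UDF API looks up a method named `evaluate`
    // and calls it once for every input row.
    // Assumption: the four marks arrive as plain int values.
    public int evaluate(int m1, int m2, int m3, int m4) {
        return Math.max(Math.max(m1, m2), Math.max(m3, m4));
    }

    // Local sanity check, mirroring the Hive query with literal values.
    public static void main(String[] args) {
        GetMaxMarks udf = new GetMaxMarks();
        System.out.println(udf.evaluate(10, 20, 30, 40)); // prints 40
    }
}
```

Primitive int parameters keep the sketch short but cannot represent SQL NULL; if the marks columns can be NULL, boxed types such as Integer with explicit null checks would be a safer choice.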
Package the Java class into a JAR file and copy it to some location.
In Eclipse: select the class --> right-click --> Export --> Java --> JAR file --> browse to the location --> provide a file name with the .jar extension.
Add the JAR file to the Hive CLI
hive> add jar /home/cloudera/training/HiveUDFS/getMaxMarks.jar;
Create a temporary function in Hive
hive> create temporary function getmaxmarks as 'udfs.GetMaxMarks';
Use the Hive UDF in a query (sanity test with literal values)
hive> select getmaxmarks(10,20,30,40) from dummy;
Two types of UDF are covered here:
1) Regular UDF (UDF) --> applied to each row of a table; returns one value per input row.
2) User-defined aggregate function (UDAF) --> applied to a group of rows; returns one value per group.
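The difference between the two types can be illustrated outside Hive with plain Java. This is only an analogy, assuming a hypothetical in-memory table in which each row holds four marks: the row-wise function returns as many results as there are rows, while the aggregate returns a single result for the whole group. The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java analogy for UDF versus UDAF (no Hive classes involved).
public class UdfVersusUdaf {

    // UDF-like: takes one row, returns one value for that row.
    static int maxOfRow(int[] row) {
        return Arrays.stream(row).max().getAsInt();
    }

    // Applying the row-wise function yields one output per input row.
    static List<Integer> maxPerRow(List<int[]> rows) {
        return rows.stream().map(UdfVersusUdaf::maxOfRow).collect(Collectors.toList());
    }

    // UDAF-like: takes many rows, returns one value for the group.
    static double meanOfColumn(List<int[]> rows, int column) {
        return rows.stream().mapToInt(r -> r[column]).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        // Hypothetical sample data: three students, four subjects each.
        List<int[]> rows = Arrays.asList(
                new int[] {10, 20, 30, 40},
                new int[] {55, 45, 65, 35},
                new int[] {70, 90, 80, 60});

        System.out.println(maxPerRow(rows));       // prints [40, 65, 90]
        System.out.println(meanOfColumn(rows, 0)); // prints 45.0
    }
}
```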
Problem statement-2: Find the mean of the marks obtained in maths by all the students.
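The Java code for this UDAF is not shown in these notes. The sketch below illustrates the evaluator lifecycle of Hive's classic UDAF API (init, iterate, terminatePartial, merge, terminate) for a mean, under stated assumptions. The class name GetMeanMarks and package udaf come from the create temporary function command; the nested class names, the Double input type and the sum/count state are hypothetical. In Hive the outer class would extend org.apache.hadoop.hive.ql.exec.UDAF and the evaluator would implement org.apache.hadoop.hive.ql.exec.UDAFEvaluator; both are omitted so the sketch compiles without the Hive jars.

```java
// Sketch of the evaluator lifecycle behind a "mean of marks" UDAF.
// Assumption: in Hive, GetMeanMarks would live in package `udaf` and
// extend org.apache.hadoop.hive.ql.exec.UDAF, and MeanEvaluator would
// implement org.apache.hadoop.hive.ql.exec.UDAFEvaluator.
public class GetMeanMarks {

    // Partial aggregation state: running sum and number of rows seen.
    public static class PartialResult {
        double sum;
        long count;
    }

    public static class MeanEvaluator {
        private PartialResult state;

        public MeanEvaluator() {
            init();
        }

        // Reset the state before a new aggregation starts.
        public void init() {
            state = new PartialResult();
        }

        // Called once per input row; NULL marks are skipped.
        public boolean iterate(Double mark) {
            if (mark != null) {
                state.sum += mark;
                state.count++;
            }
            return true;
        }

        // Hand over the partial state (for example at the end of a map task).
        public PartialResult terminatePartial() {
            return state.count == 0 ? null : state;
        }

        // Fold another evaluator's partial state into this one.
        public boolean merge(PartialResult other) {
            if (other != null) {
                state.sum += other.sum;
                state.count += other.count;
            }
            return true;
        }

        // Final result; null when no rows were aggregated.
        public Double terminate() {
            return state.count == 0 ? null : state.sum / state.count;
        }
    }

    public static void main(String[] args) {
        // Two evaluators stand in for two map tasks, a third for the reducer.
        MeanEvaluator first = new MeanEvaluator();
        first.iterate(60.0);
        first.iterate(80.0);
        MeanEvaluator second = new MeanEvaluator();
        second.iterate(100.0);

        MeanEvaluator reducer = new MeanEvaluator();
        reducer.merge(first.terminatePartial());
        reducer.merge(second.terminatePartial());
        System.out.println(reducer.terminate()); // prints 80.0
    }
}
```

Keeping sum and count as the partial state, rather than a partial mean, is what allows results from different tasks to be merged correctly.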
Package the Java class into a JAR file and copy it to some location
In Eclipse: right-click on the package --> Export --> Java --> JAR file --> provide the JAR file name.
Add the JAR file to the Hive CLI
hive> add jar /home/cloudera/training/HiveUDFS/getMeanMarks.jar;
Create a temporary function in Hive
hive> create temporary function getmeanmarks as 'udaf.GetMeanMarks';
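Use the function in a query (this example aggregates the social column; to answer problem statement-2, pass the column that holds the maths marks instead)
hive> select getmeanmarks(social) from t_student_record;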