Apache Hive Complex Data Types(Collections)


Complex Data Types
  1. arrays: ARRAY
  2. maps: MAP
  3. structs: STRUCT

 Array

The first complex type is an array. It is nothing but a collection of items of similar data type. i.e, an array can contain one or more values of the same data type.

cloudera@quickstart hivedata]$ cat ​arrrayfile.txt
1,abc,40000,​a$b$c​,​hyd
2,def,3000,​d$f​,​bang
Create Hive Table

Create Table

hive> create table arrayTable(id int,name string,sal bigint,sub array,city string)
row format delimited fields terminated by ',' collection items terminated by '$';

Load the data in Hive Table

hive> load data local inpath'/home/cloudera/training/hivedata/arrrayfile.txt' into table arrayTable;

Select data from Hive table

hive> select * from arrayTable;
1 abc 40000 ["a","b","c"] hyd
2 def 3000 ["d","f"] bang

hive> select sub[2] from arrayTable;
OK
13
c
NULL

hive> select sub[1] from arrayTable;
OK
b
f

Map

Map is a collection of key-value pairs where fields are accessed using array notation of keys
create table maptable(id int,name string,sal bigint,sub array,pfund map,city string) row format delimited fields terminated by ',' collection items terminated by '$' map keys terminated by '#';

cloudera@quickstart hivedata]$ cat ​mapfile.txt
1,abc,40000,a$b$c,​pf#500$epf#200​,​hyd
2,def,3000,d$f,​pf#500​,bang

Create Hive Table

hive> create table maptable(id int,name string,sal bigint,sub array,pfund
map,city string) row format delimited fields terminated by ',' collection items
terminated by '$' map keys terminated by '#';

Load the data in Hive Table

hive> load data local inpath'/home/cloudera/training/hivedata/mapfile.txt' into table
maptable;

Select data from Hive table

hive> select * From maptable;
hive> select pfund["pf"],["epf"] from maptable;

 Struct

Struct is a record type which encapsulates a set of named fields that can be any primitive data type. An element in STRUCT type can be accessed using the DOT (.) notation.

[cloudera@quickstart hivedata]$ cat ​structfile.txt
1,abc,40000,a$b$c,pf#500$epf#200,​hyd$ap$500001
2,def,3000,d$f,pf#500,​bang$kar$600038

Create Hive Table

hive> create table structtable(id int,name string,sal bigint,sub array,dud
map, address struct) row format delimited
fields terminated by ',' collection items terminated by '$' map keys terminated by '#';

Load the data in Hive Table

hive> load data local inpath '/home/cloudera/training/hivedata/structfile.txt' into table
structtable;

Select data from Hive table

hive> select * From structtable;
OK
1 abc 40000 ["a","b","c"] {"pf":500,"epf":200}
{"city":"hyd","state":"ap","pin":500001}
2 def 3000 ["d","f"] {"pf":500} {"city":"bang","state":"kar","pin":600038}

hive> select address.city from structtable;
OK
hyd
bang

hive> select address.pin from structtable;
OK
500001
600038

Followers