Technology and Trends

Types of Table in Apache Hive

Apache Hive has two main types of tables: managed (also called internal) and external.

Managed Table: 

When Hive creates a managed (default) table, it follows the “schema on read” principle: it moves the complete file, as is, into the Hive data warehouse directory without parsing or modifying it. The schema information is saved in the Hive metastore for later use. When we drop an internal (managed) table, both the data files in the data warehouse and the schema in the metastore are dropped.

Internal (managed) tables are used when the data is temporary and we want Hive to manage the complete lifecycle of the table and its data.

CREATE TABLE test_table(firstName String, lastName String);
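
The lifecycle described above can be sketched roughly as follows. The table name managed_demo, the comma delimiter, and the HDFS path /tmp/names.csv are hypothetical and chosen only for illustration. The comments assume default Hive settings.

```sql
-- Hypothetical table, delimiter, and path, for illustration only.
CREATE TABLE managed_demo (firstName STRING, lastName STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Without LOCAL, the file is moved as is from its HDFS path
-- into the table's directory under the Hive warehouse.
LOAD DATA INPATH '/tmp/names.csv' INTO TABLE managed_demo;

-- The "Table Type" row of the output should read MANAGED_TABLE.
DESCRIBE FORMATTED managed_demo;

-- Drops the metastore entry and the data files in the table's directory.
DROP TABLE managed_demo;
```

Because the load is a move rather than a copy, the file no longer exists at /tmp/names.csv afterwards, so dropping the table removes the only copy unless HDFS trash is enabled.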

External Table: 

When we create an external table, Hive does not load the source files into its data warehouse directory; it only adds the schema information to the metastore. When an external table is dropped, Hive leaves the data files in place and drops only the schema from the metastore. We use an external table when the data needs to remain in the underlying location even after a DROP TABLE.

CREATE EXTERNAL TABLE test_table(firstName String, lastName String);
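
In practice an external table is usually pointed at an existing directory with a LOCATION clause. The sketch below assumes a hypothetical HDFS directory /data/landing/names holding comma-delimited files. The table name external_demo is also made up for illustration, and the comments assume default Hive settings.

```sql
-- Hypothetical table name and directory, for illustration only.
CREATE EXTERNAL TABLE external_demo (firstName STRING, lastName STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/landing/names';

-- The "Table Type" row of the output should read EXTERNAL_TABLE.
DESCRIBE FORMATTED external_demo;

-- Only the metastore entry is dropped; the files under
-- /data/landing/names are left where they are.
DROP TABLE external_demo;
```

Running the same CREATE EXTERNAL TABLE statement again after the drop would make the existing files queryable once more, since the data itself was never touched.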

When to use external and internal tables in Hive

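Use external tables when the data needs to remain in its underlying location even after a DROP TABLE.

Use internal tables when the data is temporary and you want Hive to manage the complete lifecycle of the table and its data.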
