What are HDFS Data Blocks?
Hadoop Block Size Configuration and Components: A block is defined as the smallest location on the hard drive that is available to read and write data. Data in HDFS (Hadoop Distributed File…
Apache Hive has two main types of tables: managed and external. Managed Table: When Hive creates managed (default) tables, it follows the "schema on read" principle and loads the…
What is a rack? Before looking into rack awareness in Hadoop HDFS, let us understand the rack itself. A rack is a storage area where all the data nodes are…
According to a report by the Global Harvest Initiative, the human population will exceed 9 billion by 2050, which will lead to a food crisis if the production rate remains the same. That is why there has been wide research into the use of big data techniques in agricultural methods, in both academia and commercial products.
Apache HBase is an open-source, non-relational, distributed database modeled after Google's BigTable. It is developed as part of the Apache Software Foundation and is written in Java. It sits on…
We can run Apache Pig Latin code and Pig statements using various modes. We will go through all of the Apache Pig execution modes in detail in this blog post.
Apache Hadoop/HDFS and HBase are both parts of the big data framework. Both are used to store massive amounts of data. In spite of this similarity, they have…
Apache Hadoop 3 incorporated a number of enhancements over Hadoop 2.x. We will talk about the important enhancements that were implemented as part of Hadoop 3 over Hadoop 2 in…
Big data refers to datasets whose size, volume and structure are beyond the ability of traditional software tools and database systems to store, process and analyze within reasonable timeframes. Big data security is a term for the tools and techniques used to protect data and back-end processes from outside attacks and theft.
Metadata is information that describes other data; in other words, it is data about data. It is the descriptive, administrative and structural data that defines a firm's data assets.