Finding Right hardware for Hadoop Cluster

Many things need to be considered before choosing the right hardware for a Hadoop cluster. Hadoop workloads tend to vary a lot between different jobs. It takes experience to correctly anticipate the amounts of storage, processing power, and inter-node communication that different kinds of jobs will require.


What are Big Data File Storage Formats?

One of the most important aspects of architecting a Big Data solution is choosing the proper data storage options in Hadoop/Spark. Hadoop does not have a standard data storage format, but, like a standard file system, it allows data to be stored in any format, whether it's text, binary, image, or something else.


Data Compression Techniques in Hadoop Framework

When working with the Hadoop framework to process massive amounts of data, we face many challenges and difficulties, including input/output (I/O) and network-related bottlenecks. To overcome these problems, one needs to understand the different data compression techniques available in the Hadoop framework.
