What are Hadoop Execution Modes?

Post author:nitendratech
Post category:Hadoop
Post comments:2 Comments
Post published:July 25, 2017

Apache Hadoop can be run in multiple modes to accomplish different sets of tasks. There are three modes in which a Hadoop MapReduce application can be executed.

Introduction to Hadoop Distributed File System (HDFS)

Post author:nitendratech
Post category:Hadoop
Post comments:41 Comments
Post published:July 15, 2017

HDFS is a distributed file system designed for storing very large files with streaming data access patterns, running on clusters of commodity hardware.

Understanding the World of Linux Operating Systems

Post author:nitendratech
Post category:Linux
Post comments:2 Comments
Post published:July 5, 2017

Linux is open-source and one of the most popular operating systems. It is one of the most important technological advancements of the last century. It has made a huge impact…

What are the Sources of Big Data and How Does it Get Generated?

Post author:nitendratech
Post category:Big Data
Post comments:1 Comment
Post published:July 5, 2017

Digital data is now everywhere: in every sector, in every economy, in every organization, and with every user of digital technology. While this topic might once have concerned only a few data geeks,…

What is Apache Hadoop? An In-depth Look at This Big Data Tool

Post author:nitendratech
Post category:Hadoop
Post comments:18 Comments
Post published:July 1, 2017

What exactly is Apache Hadoop? Apache Hadoop is an open-source distributed processing framework that is used to store and process large datasets whose size ranges from gigabytes to petabytes of…

Important Git Command Cheat Sheet

Post author:nitendratech
Post category:Programming
Post comments:0 Comments
Post published:June 25, 2017

Git is a version control system for tracking changes in computer files and coordinating work on those files among multiple people. This post is a cheat sheet of important Git commands that I use on a day-to-day basis.

Installing Apache Spark on Linux

Post author:nitendratech
Post category:Spark
Post comments:0 Comments
Post published:June 18, 2017

Apache Spark is an open-source cluster-computing framework. This post explains the steps for installing a prebuilt version of Apache Spark 2.1.1 as a standalone cluster on a Linux system. I have used Ubuntu, a Debian-based OS, for this post.

What is Big Data and Why is it Important to Understand? Introduction and Properties

Post author:nitendratech
Post category:Big Data
Post comments:17 Comments
Post published:June 8, 2017

The amount of data in our world has been exploding. Companies capture trillions of bytes of information about their customers, suppliers, and operations, and millions of networked sensors are being embedded in the physical world in devices such as mobile phones and automobiles, sensing, creating, and communicating data.

Introduction to Microservices

Post author:nitendratech
Post category:Programming
Post comments:1 Comment
Post published:May 22, 2017

Microservice architecture, or simply microservices, is a distinctive method of developing software applications as a suite of small, modular, independently deployable services, in which each service runs in its own process and communicates through a well-defined, lightweight mechanism to serve a business goal.

What is Apache Spark? The Unified engine for large-scale data analytics.

Post author:nitendratech
Post category:Spark
Post comments:28 Comments
Post published:May 10, 2017

Apache Spark is a distributed processing system, optimized for both in-memory and disk-based execution, that performs real-time analytics using Resilient Distributed Datasets (RDDs). Spark includes a streaming library and a rich set of programming interfaces that make data processing and transformation easier.