The world is one big data problem! - Andrew McAfee
This article is part of a Big Data series in which I'll be posting content related to the Big Data tech stack. Most of the articles will be short and to the point.
Hadoop
Apache Hadoop is a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation. It provides a software framework for distributed storage and processing of big data using the MapReduce programming model. (Wikipedia)
In simple words:
Like any other platform, it's a platform for data storage and processing, but with immense powers (same as Thor). Since its inception it has been a game changer in how we store and process data.
What is Hadoop?
- It's an open-source project from the Apache Software Foundation.
- It consists of a software framework for distributing and running applications on a cluster of servers (I told you, same power as Thor ;)).
- Hadoop is written in Java.
- It is inspired by Google's GFS (Google File System).
- Hadoop is capable of processing large volumes of data (Big Data) on a cluster of inexpensive hardware, also called commodity hardware.
- Hadoop is highly fault tolerant, i.e. if any failure occurs it is automatically taken care of, typically by keeping redundant copies of the data across nodes.
So, like Thor, what powers does Hadoop have?
- Fault tolerance
- Reliability
- High availability
- Scalability (vertical & horizontal)
- Highly economical
- Data locality (moving computation close to the data rather than moving data close to the computation)
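To make data locality concrete, here's a toy scheduler sketch in plain Python (the names and structure are my own invention for illustration, not real Hadoop APIs): given which nodes hold a replica of each data block, it assigns work to a node that already stores the block, so the computation travels to the data instead of the other way around.

```python
# Toy illustration of data locality -- not actual Hadoop code.
# block_locations maps each data block to the nodes holding a replica of it.
block_locations = {
    "block-1": ["node-A", "node-B", "node-C"],
    "block-2": ["node-B", "node-C", "node-D"],
}

def schedule(block, busy_nodes=()):
    """Prefer a node that already stores the block (data locality);
    fall back to a remote node only if every replica holder is busy."""
    for node in block_locations[block]:
        if node not in busy_nodes:
            return node  # computation moves to where the data lives
    return "node-remote"  # fallback: data must be shipped over the network

print(schedule("block-1"))                         # node-A
print(schedule("block-2", busy_nodes={"node-B"}))  # node-C
```

Shipping a few kilobytes of code to a node is far cheaper than shipping gigabytes of data to the code, which is why this is one of Hadoop's superpowers.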
Just as the human body has two most important parts, the heart and the brain, Hadoop has two most important parts.
The Heart and Brain of Hadoop:
1. HDFS: Hadoop Distributed File System
HDFS, aka the heart of Hadoop, is responsible for data storage and data protection.
HDFS is the storage layer of the Hadoop ecosystem.
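A rough sketch of the two ideas behind HDFS, splitting files into fixed-size blocks and replicating each block on several nodes, simulated in plain Python (a toy model, not real HDFS: the block size here is 8 bytes for readability, whereas real HDFS defaults to 128 MB blocks and a replication factor of 3):

```python
BLOCK_SIZE = 8          # toy value; real HDFS defaults to 128 MB blocks
REPLICATION = 3         # HDFS default replication factor
NODES = ["node-A", "node-B", "node-C", "node-D"]

def split_into_blocks(data, size=BLOCK_SIZE):
    """Split a file's bytes into fixed-size blocks (the last may be smaller)."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def place_replicas(blocks, nodes=NODES, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes, round-robin style."""
    placement = {}
    for i, _ in enumerate(blocks):
        placement[i] = [nodes[(i + r) % len(nodes)] for r in range(replication)]
    return placement

blocks = split_into_blocks(b"hello hadoop distributed fs")
print(len(blocks))                # 4 blocks
print(place_replicas(blocks)[0])  # block 0 lives on 3 different nodes
```

Because every block exists on multiple nodes, losing a node loses no data, which is exactly the fault tolerance and data protection mentioned above.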
2. MapReduce:
MapReduce, aka the brain of Hadoop, is responsible for processing data in parallel.
MapReduce is the computation layer of the Hadoop ecosystem.
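To give a flavour of the model before the next post, here's a miniature word count, the "hello world" of MapReduce, simulated in plain Python. This only mimics the map → shuffle → reduce flow on one machine; it is not the actual Hadoop Java API, where each phase would run in parallel across the cluster.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in line.lower().split()]

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["hadoop stores big data", "hadoop processes big data"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(shuffle(pairs))
print(counts["hadoop"])  # 2
print(counts["data"])    # 2
```

The point of the model: map and reduce are tiny, independent functions, so the framework can run thousands of copies of them on different chunks of data at once.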
We'll decipher these two organs in detail in the next blog post ;)