Hadoop Architecture – Types of Hadoop Nodes in Cluster – Part 2

Continuing from the previous post (Hadoop Architecture-Hadoop Distributed File System), a Hadoop cluster is made up of the following main nodes:
1. NameNode
2. DataNode
3. JobTracker
4. TaskTracker

The list above describes the logical architecture of the Hadoop nodes. Physically, however, a DataNode and a TaskTracker can be placed on a single machine, as shown in the diagram below.
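One common reason for placing a DataNode and a TaskTracker on the same machine is data locality: a task can then read its input block from local disk instead of over the network. The sketch below illustrates that idea only. The function name `pick_task_tracker` and the host names are hypothetical, and this is not how the JobTracker is actually implemented.

```python
def pick_task_tracker(block_hosts, free_trackers):
    """Pick a free TaskTracker host for a task that reads one block.

    block_hosts   -- set of hosts whose DataNode stores the block
    free_trackers -- list of hosts whose TaskTracker has a free slot
    Returns (host, is_local); host is None when nothing is free.
    """
    # Prefer a host that also stores the block, so the read stays local.
    for host in free_trackers:
        if host in block_hosts:
            return host, True
    # Otherwise fall back to any free host; the block is read remotely.
    if free_trackers:
        return free_trackers[0], False
    return None, False


local_choice = pick_task_tracker({"node2", "node3"}, ["node1", "node2"])
remote_choice = pick_task_tracker({"node3"}, ["node1", "node2"])
```

Here `local_choice` selects `node2` because that host both stores the block and has a free slot, while `remote_choice` falls back to `node1`.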

There are also a few secondary node types: the secondary NameNode, the backup node and the checkpoint node. The diagram above shows some of the communication paths between the different types of nodes in a Hadoop cluster. A client communicates with the JobTracker, with the NameNode and with any DataNode.

There is only one NameNode in the cluster. A redundant NameNode can be planned for, but switching over to it is a manual step. While a file's data is stored in blocks on the DataNodes, the file's metadata is stored on the NameNode. If there is one node in the cluster that justifies spending money on the best enterprise hardware for maximum reliability, it is the NameNode. The NameNode should also have as much RAM as possible because it keeps the entire filesystem metadata in memory, whereas the DataNodes can run on commodity hardware.
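The split between block storage on the DataNodes and metadata on the NameNode can be sketched with a toy in-memory model. Everything here is an illustrative assumption: the class names, the tiny `BLOCK_SIZE`, the round-robin placement and the file path are hypothetical, and block replication is left out entirely. It is not the HDFS API.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 64  # toy block size in bytes; real HDFS blocks are far larger


@dataclass
class DataNode:
    """Stores raw block contents, keyed by block ID."""
    name: str
    blocks: dict = field(default_factory=dict)


@dataclass
class NameNode:
    """Stores only metadata, in memory: path -> ordered [(block ID, DataNode name)]."""
    block_map: dict = field(default_factory=dict)

    def write_file(self, path, data, datanodes):
        locations = []
        for offset in range(0, len(data), BLOCK_SIZE):
            index = offset // BLOCK_SIZE
            block_id = f"{path}#blk{index}"
            node = datanodes[index % len(datanodes)]  # simple round-robin placement
            node.blocks[block_id] = data[offset:offset + BLOCK_SIZE]
            locations.append((block_id, node.name))
        self.block_map[path] = locations  # the NameNode never holds the data itself
        return locations

    def read_file(self, path, datanodes):
        by_name = {node.name: node for node in datanodes}
        # Look up block locations in the metadata, then fetch each block in order.
        return b"".join(
            by_name[name].blocks[block_id] for block_id, name in self.block_map[path]
        )


nodes = [DataNode("dn1"), DataNode("dn2"), DataNode("dn3")]
namenode = NameNode()
payload = bytes(range(200))
placement = namenode.write_file("/logs/a.txt", payload, nodes)
restored = namenode.read_file("/logs/a.txt", nodes)
```

In this model the NameNode holds one small entry per block while the DataNodes hold the bytes. That is why the NameNode's memory use grows with the number of files and blocks rather than with the raw data volume.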

