Node Manager core

A Node Manager runs on every worker node, so each worker node has exactly one Node Manager. This means that if we have five worker nodes in a cluster, we will have five Node Managers running, one per node. The job of the Node Manager is to launch and manage containers on its worker node, send heartbeats and node information to the Resource Manager at regular intervals, monitor the resource utilization of running containers, and so on.
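As a rough illustration of the one-per-node layout and the heartbeat described above, the following Python sketch models Node Managers reporting their status to a Resource Manager. All names here (`NodeStatus`, `ToyNodeManager`, `ToyResourceManager`) are hypothetical and chosen for illustration; they are not YARN classes, and real YARN exchanges this information over Java RPC.

```python
from dataclasses import dataclass, field


@dataclass
class NodeStatus:
    """Hypothetical heartbeat payload; not the real YARN record."""
    node_id: str
    healthy: bool
    running_containers: list


@dataclass
class ToyResourceManager:
    """Keeps the latest status received from each node."""
    nodes: dict = field(default_factory=dict)

    def heartbeat(self, status: NodeStatus) -> None:
        self.nodes[status.node_id] = status


@dataclass
class ToyNodeManager:
    """One instance per worker node."""
    node_id: str
    containers: list = field(default_factory=list)
    healthy: bool = True

    def send_heartbeat(self, rm: ToyResourceManager) -> None:
        # Copy the container list so later changes do not alter the report.
        rm.heartbeat(NodeStatus(self.node_id, self.healthy, list(self.containers)))


# Five worker nodes means five Node Managers, one per node.
rm = ToyResourceManager()
node_managers = [ToyNodeManager(f"worker-{i}") for i in range(1, 6)]
node_managers[0].containers.append("container_01")

# In a real cluster this would repeat at a fixed interval; here it runs once.
for nm in node_managers:
    nm.send_heartbeat(rm)

print(len(rm.nodes))
print(rm.nodes["worker-1"].running_containers)
```

The Resource Manager in this sketch only stores the most recent report per node, which is enough to show why a missed heartbeat leaves it with stale information about that node.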

Let us talk about a few important components of the Node Manager:

  • Resource manager component: The Node Manager runs on a worker node and works closely with the Resource Manager. The NodeStatusUpdater is responsible for sending updated information about node resources to the Resource Manager, first when the node starts and then at regular intervals. The NodeHealthCheckerService works closely with the NodeStatusUpdater: any change in node health is reported by the NodeHealthCheckerService to the NodeStatusUpdater.
  • Container component: The primary work of the Node Manager is to manage the life cycle of containers. The ContainerManager is responsible for starting containers, stopping them, and getting the status of containers that are already running. The ContainerManagerImpl contains the implementation of these operations. The container component has the following parts:
    • Application master request: The application master requests resources from the Resource Manager and then sends a request to the Node Manager to launch a new container. It can also send a request to stop an already running container. The RPC server running on the Node Manager is responsible for receiving requests from application masters to launch new containers or stop running ones.
    • ContainerLauncher: The requests received from application masters or the Resource Manager go to the ContainerLauncher. Once a request is received, the ContainerLauncher launches the container. It can also clean up container resources on demand.
    • ContainersMonitor: The Resource Manager allocates resources to the application masters to launch containers, and containers are launched with the provided configuration. The ContainersMonitor is responsible for monitoring container health and resource utilization, and for sending the signal to clean up a container when it exceeds the resources assigned to it. This information can be very helpful in debugging application performance and memory utilization, which can further help in performance tuning.
    • LogHandler: Each container generates logs during its life cycle, and the LogHandler allows us to specify where they are stored, either on the same disk or in some other external storage location. These logs can be used to debug the applications.
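The container life cycle and the monitoring behaviour described above can be sketched as follows. This is a minimal Python model under stated assumptions, not YARN code: `ToyContainer`, `ToyContainerManager`, `monitor`, the state names, and the memory figures are all hypothetical, and the only resource checked is memory.

```python
from dataclasses import dataclass


@dataclass
class ToyContainer:
    """Hypothetical container record; field names are illustrative."""
    container_id: str
    memory_limit_mb: int
    memory_used_mb: int = 0
    state: str = "NEW"


class ToyContainerManager:
    """Starts and stops containers and reports their state."""

    def __init__(self):
        self.containers = {}

    def start_container(self, container_id, memory_limit_mb):
        container = ToyContainer(container_id, memory_limit_mb, state="RUNNING")
        self.containers[container_id] = container
        return container

    def stop_container(self, container_id, reason="STOPPED"):
        self.containers[container_id].state = reason

    def get_status(self, container_id):
        return self.containers[container_id].state


def monitor(manager):
    """Stop every running container that uses more memory than its limit."""
    killed = []
    for container in manager.containers.values():
        over_limit = container.memory_used_mb > container.memory_limit_mb
        if container.state == "RUNNING" and over_limit:
            manager.stop_container(
                container.container_id, reason="KILLED_EXCEEDED_MEMORY"
            )
            killed.append(container.container_id)
    return killed


manager = ToyContainerManager()
manager.start_container("container_01", memory_limit_mb=1024)
manager.start_container("container_02", memory_limit_mb=1024)
manager.containers["container_02"].memory_used_mb = 2048  # simulated usage

killed = monitor(manager)
print(killed)
print(manager.get_status("container_01"))
```

Keeping the monitor separate from the manager mirrors the division of responsibilities in the list above: one part handles start and stop requests, while the other only observes usage and asks for a cleanup when a limit is exceeded.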