Spark on Yarn | Spark, from Beginner to Proficient

Welcome to the Meitu data technology team's "Spark, from Beginner to Proficient" series. This series introduces Spark step by step, from getting started with the framework to the implementation of its underlying architecture, so there should be something here for everyone. We welcome your continued interest :)

Previous articles in this series: Hello Spark!

/ Why do you need Yarn? /

Yarn's full name is Yet Another Resource Negotiator. It is a component of Hadoop, officially defined as a framework for job scheduling and cluster resource management.

Yarn first appeared in the Hadoop 0.23 branch. 0.23 was an experimental branch that went through several iterations; its last release, 0.23.11, was published in June 2014. In December 2011, shortly after the release of 0.23.0, the Hadoop 0.20 branch evolved into Hadoop 1.0. Yarn did not appear in any 1.x release, up to and including the last one, 1.2.1-stable. In the first Hadoop 2.0 release, 2.0.0-alpha, Yarn was added as a full component. By 2.0.2-alpha it already supported clusters of 2k machines, and 2.0.3-alpha supported clusters of 30k machines. 2.0.3-alpha also added support for multiple resource types, such as CPU and memory scheduling, as well as ResourceManager restart.

Figure 1, via

As shown in Figure 1, the Hadoop 1.0 operational process is as follows:

1. The client submits a Job to the cluster.

2. JobTracker receives the Job request.

3. Based on the input parameters of the Job, JobTracker requests from the NameNode a list of the DataNode nodes that contain the data blocks of the input files.

4. JobTracker determines the execution plan of the Job: it confirms the number of Map and Reduce tasks and assigns each task to the node closest to its data block for execution.

Initially, Hadoop 1.0 supported big data computing well, but as the scale of computing grew and computational models diversified, the demands gradually outstripped its capabilities. Normally you can simply add machines when a cluster lacks capacity, but even when multiple JobTrackers are deployed only one is active, so the load limit of that active JobTracker caps the number of machines the whole cluster can hold; some data puts the management limit at about 4k machines. In addition, application-related and resource-management-related logic all lives in the JobTracker, which becomes a bottleneck as the cluster scales up. On top of that, the Map-Reduce computational model is too tightly coupled to the JobTracker for other computational models to run on Hadoop 1.0.

Yarn is Hadoop's answer to these problems. The next section analyzes how Yarn solves them by looking at its components, architecture, and operating mechanisms.

/ What is Yarn? /

Yarn components & basic structure

As shown in Figure 2, Yarn adopts a Master/Slave structure with an overall two-tier scheduling architecture. The first tier of scheduling involves the ResourceManager and the NodeManager. The ResourceManager is the Master node, equivalent to the JobTracker; it contains two components, the Scheduler and the App Manager, which are responsible for resource scheduling and application management respectively. The NodeManager is the Slave node; it can be deployed on a separate machine and manages the resources on that machine. Each NodeManager reports the amount of resources it has and their usage to the ResourceManager, and accepts the ResourceManager's resource scheduling.

*The ResourceManager, like the JobTracker, can be deployed on multiple machines with only one instance active. However, scheduling management and application management have been split apart inside the ResourceManager, so each of the two components is more focused.

Figure 2

The second tier of scheduling involves the NodeManager and the Container. The NodeManager abstracts resources such as CPU and memory into Containers and manages their life cycle.

With the two-tier scheduling structure, the resources managed by the Scheduler change from fine-grained CPU and memory to coarse-grained Containers, which reduces its load. The App Manager component likewise only needs to manage App Masters rather than the complete scheduling and execution information of every task, which also reduces load. Reducing the load on the ResourceManager in turn improves the scalability of the cluster.
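
The two tiers can be illustrated with a small Python sketch. It is a toy model only, not Yarn's API: the class `NodeManager`, the function `allocate`, and the field names `vcores` and `memory_mb` are all assumptions made for this example. The first tier picks a node based on what it reports, and the second tier has the node carve a coarse-grained Container out of its CPU and memory.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Toy stand-in for a Yarn NodeManager (all names are illustrative)."""
    name: str
    vcores: int
    memory_mb: int
    containers: list = field(default_factory=list)

    def report(self):
        # What the node would report to the ResourceManager: its free resources.
        return {"node": self.name, "vcores": self.vcores, "memory_mb": self.memory_mb}

    def launch(self, vcores, memory_mb):
        # Second tier: carve a coarse-grained Container out of CPU and memory.
        self.vcores -= vcores
        self.memory_mb -= memory_mb
        container = {"node": self.name, "vcores": vcores, "memory_mb": memory_mb}
        self.containers.append(container)
        return container


def allocate(nodes, vcores, memory_mb):
    """First tier: choose the first node whose report shows enough free resources."""
    for node in nodes:
        free = node.report()
        if free["vcores"] >= vcores and free["memory_mb"] >= memory_mb:
            return node.launch(vcores, memory_mb)
    return None  # no node can currently fit the request


nodes = [NodeManager("nm1", vcores=2, memory_mb=4096),
         NodeManager("nm2", vcores=8, memory_mb=16384)]
first = allocate(nodes, vcores=2, memory_mb=2048)    # fits on nm1
second = allocate(nodes, vcores=2, memory_mb=2048)   # nm1 is out of vcores, goes to nm2
third = allocate(nodes, vcores=16, memory_mb=1024)   # no node is large enough
```

In this sketch the scheduler only ever hands out whole Containers, which is the coarse-grained view described above.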

Yarn Operations Process

Figure 3, via

As shown in Figure 3, the Yarn operational process is as follows:

1. The client submits the application to the App Manager of the ResourceManager and requests an instance of the App Master.

2. The ResourceManager finds a NodeManager that can run a Container and starts an instance of the App Master in that Container.

3. The App Master registers with the ResourceManager, after which the client can query the ResourceManager to get its App Master details and interact directly with the App Master.

4. The App Master then requests resources, i.e. Containers, from the ResourceManager.

5. After obtaining the Containers, the App Master starts them and executes the Tasks.

6. During execution, each Container sends information such as run progress and status to the App Master.

7. The client actively communicates with the App Master about the running status of the application, progress updates, etc.

8. When all work is complete, the App Master unregisters from the ResourceManager and shuts down, and all Containers are returned to the system.
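
The steps above can be traced with a minimal Python simulation. This is a sketch under simplifying assumptions, not Yarn code: the classes `ResourceManager` and `AppMaster`, the counter `free_containers`, and the log messages are invented for illustration, and each Container is assumed to run one task at a time.

```python
class ResourceManager:
    """Toy ResourceManager that only counts free Containers."""

    def __init__(self, free_containers):
        self.free_containers = free_containers
        self.app_masters = {}

    def submit(self, app_id, num_tasks, log):
        # Steps 1-2: accept the application, start its App Master in a Container.
        self.free_containers -= 1
        log.append(f"RM: started AppMaster for {app_id}")
        return AppMaster(app_id, num_tasks, self, log)

    def register(self, am):
        # Step 3: the App Master registers itself.
        self.app_masters[am.app_id] = am

    def request_containers(self, wanted):
        # Step 4: grant as many Containers as are currently free.
        granted = min(wanted, self.free_containers)
        self.free_containers -= granted
        return granted

    def unregister(self, am, held):
        # Step 8: task Containers and the App Master's own Container are returned.
        del self.app_masters[am.app_id]
        self.free_containers += held + 1


class AppMaster:
    """Toy App Master that drives the tasks of one application."""

    def __init__(self, app_id, num_tasks, rm, log):
        self.app_id, self.pending, self.rm, self.log = app_id, num_tasks, rm, log
        self.done = 0

    def run(self):
        self.rm.register(self)
        self.log.append(f"AM: registered {self.app_id}")
        held = self.rm.request_containers(self.pending)
        if held == 0:
            raise RuntimeError("no Containers available")
        self.log.append(f"AM: got {held} containers")
        while self.pending:
            batch = min(held, self.pending)  # steps 5-6: run tasks, collect status
            self.pending -= batch
            self.done += batch
            self.log.append(f"AM: progress {self.done} tasks done")
        self.rm.unregister(self, held)
        self.log.append(f"AM: unregistered {self.app_id}")


log = []
rm = ResourceManager(free_containers=4)
am = rm.submit("app-1", num_tasks=5, log=log)
am.run()
```

Step 7, the client polling the App Master for status, is left out; in this sketch the client would simply read `am.done`.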

The processing of this Job shows that the App Master acts as the driver of the Job, which drives the scheduling and execution of the Job tasks. In this operational flow, the App Manager only needs to manage the lifecycle of the App Master and keep its internal state, and the abstraction of the App Master role allows each type of application to customize its own App Master so that other computational models can run on a Yarn cluster with relative ease.

Yarn HA (disaster recovery and backup)

Next we look at how Yarn is designed for fault tolerance and backup to keep the cluster highly available. Based on the Yarn architecture shown in Figure 3: if a Container fails, the ResourceManager can assign another Container to continue execution. If the Container running the App Master fails, a new Container is likewise assigned, and the App Master can recover its information from the App Manager. When a NodeManager fails, the system can first remove that node and then restart and resume its tasks on other NodeManagers.

Figure 4, via

What about when the ResourceManager fails? As mentioned above, a Yarn cluster can run multiple ResourceManager instances, only one of which is active while the others are on standby. The active ResourceManager writes its state to the ZooKeeper cluster as it runs. When it fails, the remaining RMs first elect a new leader, which becomes active and then loads the ResourceManager state from the ZooKeeper cluster. No new Jobs are accepted during the transfer; they are accepted again only once the transfer is complete.
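
The failover idea can be sketched in Python as follows. Everything here is a simplification for illustration: `StateStore` is an in-memory dict standing in for the ZooKeeper cluster, `ResourceManagerNode` and `fail_over` are hypothetical names, and the "election" just picks the first live standby, which is far simpler than a real leader election.

```python
class StateStore:
    """Stands in for the ZooKeeper cluster; here just an in-memory dict."""

    def __init__(self):
        self.data = {}


class ResourceManagerNode:
    def __init__(self, name, store):
        self.name, self.store = name, store
        self.active = False
        self.alive = True
        self.apps = {}

    def accept_job(self, app_id, status):
        if not self.active:
            raise RuntimeError(f"{self.name} is standby and does not accept jobs")
        self.apps[app_id] = status
        self.store.data[app_id] = status  # persist state on every change

    def become_active(self):
        self.apps = dict(self.store.data)  # reload state written by the old leader
        self.active = True


def fail_over(nodes):
    """Promote the first live standby (a real election is more involved)."""
    for node in nodes:
        if node.alive and not node.active:
            node.become_active()
            return node
    raise RuntimeError("no standby ResourceManager available")


store = StateStore()
rm1 = ResourceManagerNode("rm1", store)
rm2 = ResourceManagerNode("rm2", store)
rm1.become_active()
rm1.accept_job("app-1", "RUNNING")

# Simulate a crash of the active ResourceManager.
rm1.alive = False
rm1.active = False
new_leader = fail_over([rm1, rm2])
```

Because the state lives in the shared store rather than in the failed process, the new leader can pick up the running applications after it takes over.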

/ Spark on Yarn /

First, we introduce Spark's resource management architecture. In terms of resource management, Spark was not designed as a closed system: the architecture abstracts resource management into its own layer, which makes it possible to plug in different resource management modules.

Figure 5, via

Figure 5 shows Spark's resource management architecture. The Master is Spark's master control node; in a real production environment there are multiple Masters, only one of which is active. The Worker is Spark's worker node; it reports its own resources and changes in Executor execution state to the Master, and accepts commands from the Master to start an Executor or a Driver. The Driver is the application's driver: each application consists of many small tasks, and the Driver is responsible for driving their orderly execution. The Executor is Spark's worker process; it is managed by the Worker and is responsible for executing specific tasks.

The above is an abstraction of Spark's architecture for resource management, which is very similar to Yarn's architecture, so Spark is easy to build on top of Yarn. In the comparison of roles on both sides of Spark and Yarn: Master corresponds to ResourceManager, Worker corresponds to NodeManager, Driver corresponds to App Master, and Executor corresponds to Container.

Depending on the Spark deployment mode, the resource management architecture takes different shapes. Spark has roughly four deployment modes.

  • Local mode: everything runs in a single process, with only the Driver role. After accepting a task, it creates a Driver that schedules and executes the application; no Master or Worker is involved.
  • Local-Cluster mode: also runs in a single process, but with Master and Worker roles that exist as separate threads within the process.
  • Standalone mode: Spark's true cluster mode, in which the Master and Worker are independent processes.
  • Third-party deployment mode: built on top of Yarn or Mesos, which provide the resource management.

Next, take a look at how Spark on Yarn processes a Job. After the client submits a task to the Yarn ResourceManager, the App Manager accepts the task and finds a Container in which to create the App Master; the Spark Driver runs inside this App Master. The App Master then requests Containers and starts them, the Spark Driver starts Spark Executors in those Containers and schedules Spark Tasks to run on the Executors, and when all the tasks have been executed it unregisters from the App Manager and releases the resources.

Figure 6, via

You can see that this execution process is almost identical to the processing of a task by Yarn, except that the App Master and Container are handed over to the corresponding roles in Spark on Yarn during the Job processing.

Figure 7, via

There is another mode of operation for Spark on Yarn: Spark on Yarn-Client. Unlike the Spark on Yarn-Cluster mode described above, in Spark on Yarn-Client the client does not hand the Spark Driver over to Yarn after submitting a task but runs it on the client side. Once the App Master has requested the Containers, the Spark Driver likewise starts the Spark Executors and executes the tasks.
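
In practice the two modes are usually selected with the `--master yarn` and `--deploy-mode` options of `spark-submit`. The Python sketch below only builds the command line and notes where the Driver would run; the helper names `spark_submit_args` and `driver_location` and the application file `my_job.py` are hypothetical.

```python
def spark_submit_args(app, deploy_mode):
    """Build a spark-submit command line for Yarn in the given deploy mode."""
    if deploy_mode not in ("cluster", "client"):
        raise ValueError("deploy_mode must be 'cluster' or 'client'")
    return ["spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode, app]


def driver_location(deploy_mode):
    """Where the Spark Driver runs in each mode, as described in the text."""
    if deploy_mode == "cluster":
        return "inside the App Master on Yarn"
    return "on the client machine"


cluster_cmd = spark_submit_args("my_job.py", "cluster")
client_cmd = spark_submit_args("my_job.py", "client")
```

The resulting lists could be passed to `subprocess.run` on a machine where Spark and the Hadoop client configuration are installed; that step is omitted here because it depends on the environment.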

So why use Yarn as a resource manager for Spark? Let's compare Spark's own cluster mode, Standalone, with Spark on Yarn in terms of resource scheduling capabilities. Standalone mode supports only a FIFO scheduler, with applications running serially as for a single user, and by default all resources of all nodes are available to an application. Yarn not only supports FIFO resource scheduling but also provides elastic and fair resource allocation methods.

Yarn allocates resources by assigning them to queues. Each queue can set its resource allocation method, and the next section describes the three ways Yarn allocates resources.

FIFO Scheduler

If no policy is configured, all tasks are submitted to a default queue and executed in order of submission. If resources are sufficient, a task runs right away; if not, it waits until earlier tasks finish and release their resources. This is the FIFO Scheduler's first-in, first-out allocation method.

Figure 8, via

As shown in Figure 8, Job1 consumes all resources when it is submitted. Job2 is submitted soon after, but by then the system has no resources left to allocate to it. If Job1 is a large task, Job2 has to wait a long time before it gets the resources to run. So one problem with first-in, first-out allocation is that large tasks take up many resources and leave the small tasks behind them waiting too long, effectively starving them. For this reason the default configuration is generally not used.
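
The waiting effect can be reproduced with a small first-in, first-out simulation in Python. It is a sketch with made-up numbers, not the real FIFO Scheduler: the function `fifo_schedule`, the job tuples, and the capacity of 10 resource units are assumptions for this example.

```python
import heapq


def fifo_schedule(jobs, capacity):
    """jobs: list of (name, submit_time, demand, duration) in submission order.

    Returns {name: start_time}. Jobs start strictly in submission order, and a
    job starts only once enough resources are free.
    """
    running = []  # min-heap of (finish_time, demand)
    free = capacity
    now = 0
    starts = {}
    for name, submit, demand, duration in jobs:
        if demand > capacity:
            raise ValueError(f"{name} can never fit in the cluster")
        now = max(now, submit)
        # Release resources of jobs that have already finished by `now`.
        while running and running[0][0] <= now:
            free += heapq.heappop(running)[1]
        # Not enough resources: wait for earlier jobs to finish.
        while free < demand:
            finish, released = heapq.heappop(running)
            now = max(now, finish)
            free += released
        free -= demand
        starts[name] = now
        heapq.heappush(running, (now + duration, demand))
    return starts


starts = fifo_schedule(
    [("Job1", 0, 10, 100),  # large job that takes the whole cluster
     ("Job2", 1, 2, 5)],    # small job submitted right after
    capacity=10,
)
```

With these numbers Job2 needs only 2 units for 5 time steps, yet it cannot start until Job1 finishes at time 100.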

Capacity Scheduler

Capacity Scheduler is a multi-tenant, elastic allocation method. Each tenant gets its own queue, and each queue can be configured with a lower and an upper limit on the resources it may use. The upper limit caps the queue (e.g. at 50%, beyond which it cannot use more resources even if they are idle), while the lower limit guarantees that at least that amount of resources is available to the queue.

Figure 9, via

In Figure 9, queue A and queue B are each allocated separate resources. Job1 is submitted to queue A and can only use queue A's resources. Job2 is then submitted to queue B and does not have to wait for Job1 to release resources. This way large and small tasks can be assigned to two different queues whose resources are independent of each other, so small tasks will not starve.
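
A minimal Python sketch of the per-queue upper limit is shown below. It models only the cap, not the full Capacity Scheduler: the class `CapacityScheduler`, its `submit` method, and the 50/50 split over 100 resource units are assumptions for illustration. When the caps add up to 100%, each queue is also effectively guaranteed its own share.

```python
class CapacityScheduler:
    """Toy scheduler: each queue may use at most a fixed percentage of the cluster."""

    def __init__(self, total, max_percent):
        self.total = total
        self.max_percent = dict(max_percent)  # queue name -> upper limit in percent
        self.used = {queue: 0 for queue in max_percent}

    def submit(self, queue, demand):
        limit = self.total * self.max_percent[queue] // 100
        cluster_free = self.total - sum(self.used.values())
        if self.used[queue] + demand > limit or demand > cluster_free:
            return False  # the job has to wait in its queue
        self.used[queue] += demand
        return True


sched = CapacityScheduler(total=100, max_percent={"A": 50, "B": 50})
job1_ok = sched.submit("A", 50)     # Job1 fills queue A up to its cap
job1_more = sched.submit("A", 10)   # refused, even though queue B is idle
job2_ok = sched.submit("B", 10)     # Job2 does not wait for Job1
```

The refused request shows the trade-off: the cap protects queue B's small jobs, at the cost of leaving B's idle resources unused by queue A.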

Fair Scheduler

Fair Scheduler is a fair allocation method. Fair here means that the cluster allocates resources to the queues as closely as possible in proportion to their configured shares.

Figure 10, via

In Figure 10, Job1 is submitted to queue A and takes up all of the cluster's resources. Job2 is then submitted to queue B, at which point Job1 has to release half of its resources so that Job2 in queue B can use them. Job3 is then also submitted to queue B; if Job2 has not finished by then, it must in turn release half of its resources to Job3. This is fair allocation: the resources available to the tasks within a queue are divided equally among them.
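
The shares in this example can be worked out with a short Python sketch. It assumes that all queues have equal weight and that only queues with at least one job receive a share; the function name `fair_shares` is hypothetical, and real Fair Scheduler configurations support weights and minimum shares that are not modeled here.

```python
from fractions import Fraction


def fair_shares(queues):
    """queues: {queue_name: [job names]}. Returns {job: fraction of the cluster}."""
    active = {name: jobs for name, jobs in queues.items() if jobs}
    shares = {}
    for jobs in active.values():
        queue_share = Fraction(1, len(active))      # equal split between busy queues
        for job in jobs:
            shares[job] = queue_share / len(jobs)   # equal split inside the queue
    return shares


step1 = fair_shares({"A": ["Job1"], "B": []})
step2 = fair_shares({"A": ["Job1"], "B": ["Job2"]})
step3 = fair_shares({"A": ["Job1"], "B": ["Job2", "Job3"]})
```

Here `step1` gives Job1 the whole cluster, `step2` gives Job1 and Job2 one half each, and `step3` leaves Job1 at one half while Job2 and Job3 get one quarter each.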

/ The Future of Spark /

Mesos' resource scheduling is similar to Yarn's, but it offers both coarse-grained and fine-grained modes. The difference between the two is whether the resources an Executor needs are requested up front before execution or on demand during execution. When cluster resources are tight, an Executor may be holding resources it requested earlier that are idle at that moment; in coarse-grained mode those resources are wasted. In fine-grained mode, the resources an Executor needs are allocated according to its actual demand, so no resources sit idle.

Yarn does not currently support a fine-grained scheduling model because Mesos's Executor is dynamically tunable and Yarn's Container is not, but Yarn has plans to support fine-grained resource management.

In addition, as of Hadoop 3.1.0 Yarn supports GPU resources, though currently only NVIDIA GPUs are supported. There is much more of Spark to explore; in the next post we will cover RDD specifically, so stay tuned.
