Apache Mesos and Mesosphere DC/OS – Introduction to containerization at hyper scale

A concept that sits at the heart of Apache Mesos, containerization is a virtualization method for deploying and running distributed applications without bringing VMs into the equation. Now, where does Apache Mesos fit into all this?

Suppose you’ve deployed several containerized workloads in your data center, such as analytics, web applications and software networking. If you now want to deploy a web app alongside them, the first thing you’ll need to do is select a subset of nodes for your application’s runtime environment. There are other details to take care of as well, such as the virtual or physical locations for deployment.

You can automate these steps by scripting them. To do so, you’ll need the details of all the resources you’re using, including the machines, their ports, DNS addresses and so on. The end product of all these operations is a statically partitioned sub-cluster. Now, suppose the need for another deployment arises. The only way to handle it under this legacy topology is to repeat all the steps above, which brings redundancy and inefficiency into the equation.
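To make the cost of static partitioning concrete, here is a minimal sketch in Python. The partition names, sizes and demand figures are hypothetical and chosen only for illustration; the point is that capacity sitting idle in one partition cannot help another partition that is short of nodes, whereas a shared pool could absorb the same demand.

```python
# Hypothetical figures: each app owns a fixed partition of nodes.
# Values are (nodes_in_partition, nodes_currently_needed).
partitions = {
    "analytics": (10, 4),
    "web": (5, 9),
    "networking": (5, 3),
}

def idle_nodes_static(parts):
    """Nodes left idle when each app may only use its own partition."""
    return sum(max(size - demand, 0) for size, demand in parts.values())

def unmet_demand_static(parts):
    """Nodes an app needs but cannot get from its own partition."""
    return sum(max(demand - size, 0) for size, demand in parts.values())

def unmet_demand_pooled(parts):
    """Shortfall if all nodes were shared as one pool instead."""
    total_size = sum(size for size, _ in parts.values())
    total_demand = sum(demand for _, demand in parts.values())
    return max(total_demand - total_size, 0)

print("idle nodes (static):  ", idle_nodes_static(partitions))
print("unmet demand (static):", unmet_demand_static(partitions))
print("unmet demand (pooled):", unmet_demand_pooled(partitions))
```

With these made-up numbers the web partition is short of nodes while the other two sit partly idle, even though the cluster as a whole has enough capacity.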

It gets worse if your web app becomes really popular. To meet the increased demand, you’ll have to shut down the existing system, disrupt your users and spend effort on jobs that could simply have been rescaled.

With a better solution, development times could be cut, wasteful spending avoided, disruption reduced and, most importantly, resources distributed more efficiently. That solution comes in the form of Apache Mesos.

What is Apache Mesos?

Running Docker containers in a data center isn’t as easy as it seems when it comes to huge-scale deployments where proper distribution of resources is a priority. An excellent way to do so is to treat the whole cluster like a single personal computer, with containers scheduled onto it much as processes are scheduled onto CPU cores. Enter Apache Mesos!

Apache Mesos is a fault-tolerant cluster manager that uses a centralized approach to allocating and managing resources. Mesos pools the physical resources of the cluster and presents them as one big unit that can then be scheduled across the various distributed applications running on it, similar to how the Linux kernel shares a single machine among processes.

Originally developed at the University of California, Berkeley, and now a project of the Apache Software Foundation, the software is cross-platform. At the time of writing, its latest stable release is dated November 3rd, 2016.

Apache Mesos is built primarily for hyper-scalability. Its ability to scale to tens of thousands of nodes has helped make it a top-level open source project and drives its popularity at companies like Microsoft, Twitter and eBay for managing their data centers.

Also, Mesos is language independent: applications for it can be developed in several languages, such as C++, Java and Python.

Mesosphere DC/OS

Built on the Apache Mesos distributed systems kernel is an operating system, DC/OS. It enables the visualization and management of several machines as one unit, automating tasks such as process placement, resource management and inter-process communication. DC/OS has a web interface as well as a CLI for remote administration tasks.

Notably, DC/OS is the only open source project that brings all these features under one roof.

Docker and Mesos go hand in hand: together they make pushing a container into production much easier for developers.
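As a rough illustration of what "pushing a container into production" can look like, the sketch below builds an app definition for Marathon (a DC/OS service covered under User Space further down). The field names (`id`, `cpus`, `mem`, `instances`, `container.docker.image`) follow the Marathon app definition format as commonly documented, but treat them as an assumption and check them against the Marathon version you run. The helper `docker_app`, the app id `/web` and the image `nginx:latest` are hypothetical.

```python
import json

def docker_app(app_id, image, cpus=0.5, mem_mb=128, instances=1):
    """Build a Marathon-style app definition for a Docker image.

    Hypothetical helper; the field names are assumed to match
    Marathon's JSON app format.
    """
    return {
        "id": app_id,
        "cpus": cpus,            # CPU shares per instance
        "mem": mem_mb,           # memory per instance, in MB
        "instances": instances,  # how many copies to run
        "container": {
            "type": "DOCKER",
            "docker": {"image": image},
        },
    }

app = docker_app("/web", "nginx:latest", instances=2)
print(json.dumps(app, indent=2))
```

The printed JSON could then be saved to a file and submitted to Marathon, for example through the DC/OS CLI or web interface. Note that the definition says how much CPU and memory each instance needs, not which machine it should run on.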

DC/OS provides a level of abstraction between the scheduler and the machines where tasks are executed. This essentially means that it is up to the OS to distribute resources, so a scheduler no longer needs to deal with individual machines directly. Thus, static partitioning is eliminated.

How is a distributed system designed?

Two different sets of machines are involved:

  • Coordinator machine: assigns tasks to workers.
  • Worker machines: execute the assigned tasks.
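The coordinator/worker split can be sketched in a few lines of Python. This is a toy model, not Mesos code: the function names are hypothetical, and round-robin assignment is an assumption made purely to keep the example short.

```python
from collections import defaultdict
from itertools import cycle

def assign_tasks(tasks, workers):
    """Coordinator: hand each task to the next worker in round-robin order."""
    assignments = defaultdict(list)
    for task, worker in zip(tasks, cycle(workers)):
        assignments[worker].append(task)
    return dict(assignments)

def run_worker(name, tasks):
    """Worker: 'execute' each assigned task and report a result line."""
    return [f"{name} finished {task}" for task in tasks]

plan = assign_tasks(["t1", "t2", "t3", "t4", "t5"], ["worker-a", "worker-b"])
for worker, tasks in plan.items():
    for line in run_worker(worker, tasks):
        print(line)
```

Here the coordinator has to know every worker by name, which is exactly the coupling that an intermediate layer between scheduler and machines removes.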

Mesos provides a level of abstraction between the scheduler and the machines, so in effect it sits in between them. The immediate benefit is that multiple distributed systems can run on the same cluster of machines without hogging resources or stealing another system’s share.

DC/OS Features

Apache Mesos is made up of a set of masters and a set of workers, working in conjunction with frameworks written against the Mesos API, e.g. Hadoop. Whenever a framework wants to run a task on the Mesos cluster, it connects to the masters, which triggers a distribution of resources.

To sum it up, DC/OS packs the following features:

  • High resource utilization
  • Mixed workload colocation
  • Container orchestration
  • Extensible resource isolation
  • Stateful storage support
  • Zero downtime upgrades
  • High availability & elastic stability
  • Web & Command Line Interface
  • Real-time interaction
  • Integration-tested components
  • Service discovery and distributed load balancing
  • And much more…

It all starts with the Mesos master, which keeps a list of all the slaves and their specifications in one place. For instance, there may be 10 slaves with 4 CPUs and 4 GB of RAM each. These resources are offered to a framework scheduler, and whenever a task needs to be executed, the scheduler launches it by handing it over to the Mesos master.

The master hands the task to a slave with suitable resources, where the executor takes over. Meanwhile, the status of operations is sent back up, through the master to the scheduler. On the basis of this information, a new task may be started, or the current one may be killed or halted.
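The offer cycle described above can be simulated with a small Python sketch using the example of 10 slaves (called agents in newer Mesos releases and later in this article) with 4 CPUs and 4 GB of RAM each. Everything here is a simplified model under stated assumptions: the class and function names are hypothetical, the task size of 3 CPUs and 2 GB is invented, and the "first offer that fits" policy is just one possible scheduler strategy.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    name: str
    cpus: float
    mem_gb: float

@dataclass
class Task:
    name: str
    cpus: float
    mem_gb: float

def make_offers(agents):
    """Master side: offer every agent that still has free resources."""
    return [a for a in agents if a.cpus > 0 and a.mem_gb > 0]

def schedule(task, offers):
    """Scheduler side: accept the first offer that fits, else decline all."""
    for agent in offers:
        if agent.cpus >= task.cpus and agent.mem_gb >= task.mem_gb:
            return agent
    return None

def launch(task, agent):
    """Master side: charge the task's resources against the chosen agent."""
    agent.cpus -= task.cpus
    agent.mem_gb -= task.mem_gb

agents = [Agent(f"agent-{i}", cpus=4, mem_gb=4) for i in range(10)]
placements = {}
for i in range(12):
    task = Task(f"task-{i}", cpus=3, mem_gb=2)
    chosen = schedule(task, make_offers(agents))
    if chosen is not None:
        launch(task, chosen)
        placements[task.name] = chosen.name

print(len(placements), "of 12 tasks placed")
print("free CPUs left:", sum(a.cpus for a in agents))
```

In this model each agent can fit only one 3-CPU task, so the last two tasks find no matching offer and stay unplaced until resources are freed.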

DC/OS Architecture

As mentioned before, DC/OS is a distributed operating system that sits on top of the machines’ resources and provides services to apps. These services include service discovery, package management and running processes across several nodes.

The architecture can essentially be split into 3 parts:

  • User Space
  • Kernel Space
  • Hardware

The User Space consists of components like the distributed DNS proxy and Mesos-DNS, as well as services like Spark and Marathon. In addition, it spans systemd services and user services such as Chronos or Kafka.

Consider the DC/OS kernel a magnified and glorified version of the Linux kernel. The kernel comprises:

  • Mesos Master: this process orchestrates tasks that are later run by Mesos agents. It receives resource reports from the agents and allocates resources to each DC/OS service that needs them.
  • Mesos Agents: these run discrete Mesos tasks on behalf of a framework. Private agent nodes run apps and services, while public agent nodes run DC/OS apps in a publicly accessible network. The agent process (mesos-slave) can also invoke an executor to launch tasks via containerizers.

The Kernel Space in DC/OS is responsible for managing resource allocation and for performing two-level scheduling across the cluster.

Finally, the Hardware may be Amazon Web Services, OpenStack, or any other physical or virtual hardware.

A general way to look at all these processes is as follows:

  1. Client/scheduler initializes itself.
  2. Mesos master sends resource offers to the scheduler.
  3. The scheduler declines resource offers as long as no processes have been initiated from the client side.
  4. Client proceeds with the launch.
  5. Mesos master sends a resource offer; if it matches, the scheduler accepts it and sends it back to the master along with a “launchTask” request.
  6. The master directs the Mesos agents to launch the task.
  7. The executor reports the status of the tasks to the agents, which report it to the master, from where it is sent to the scheduler.
  8. The scheduler informs the client.
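The eight steps can be traced with a short Python sketch that records each message as a log line. It is only a model of the sequence, not the Mesos API: `run_flow` and its `offers_before_launch` parameter are hypothetical, and `TASK_RUNNING` is used as a representative task status.

```python
def run_flow(offers_before_launch=2):
    """Trace the offer/launch/status sequence as a list of log lines."""
    log = ["client/scheduler initialized"]                       # step 1
    client_ready = False
    for n in range(offers_before_launch + 1):
        if n == offers_before_launch:
            client_ready = True
            log.append("client requests launch")                 # step 4
        log.append("master -> scheduler: resource offer")        # steps 2, 5
        if not client_ready:
            log.append("scheduler -> master: decline")           # step 3
            continue
        log.append("scheduler -> master: accept + launchTask")   # step 5
        log.append("master -> agent: launch task")               # step 6
        # Step 7: the status travels back up one hop at a time.
        for hop in ("executor -> agent", "agent -> master", "master -> scheduler"):
            log.append(f"{hop}: status TASK_RUNNING")
        log.append("scheduler -> client: task is running")       # step 8
    return log

for line in run_flow():
    print(line)
```

Reading the output top to bottom shows the two levels at work: the master decides which resources to offer, and the scheduler decides whether and when to use them.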

If you’d like to learn more about this popular platform, check out our course on Docker, Apache Mesos and DC/OS here: http://www.tetratutorials.com/p/docker-apache-mesos-dcos-run-your-own-iaas-cloud
