System Requirements

Hardware and software requirements for DC/OS deployments

A DC/OS cluster consists of two types of nodes: master nodes and agent nodes. The agent nodes can be either public agent nodes or private agent nodes. Public agent nodes provide north-south (external to internal) access to services in the cluster through load balancers. Private agents host the containers and services that are deployed on the cluster. In addition to the master and agent cluster nodes, each DC/OS installation includes a separate bootstrap node for DC/OS installation and upgrade files. Some of the hardware and software requirements apply to all nodes. Other requirements are specific to the type of node being deployed.

Hardware prerequisites

The hardware prerequisites are a single bootstrap node, Mesos master nodes, and Mesos agent nodes.

Bootstrap node

  • DC/OS installation is run from a single bootstrap node with two cores, 16 GB RAM, and 60 GB HDD.
  • The bootstrap node is only used during the installation and upgrade process, so there are no specific recommendations for high performance storage or separated mount points.

NOTE: The bootstrap node must be separate from your cluster nodes.

All master and agent nodes in the cluster

The DC/OS cluster nodes are designated Mesos masters and agents during installation. The supported operating systems and environments are listed on the version policy page.

When you install DC/OS on the cluster nodes, the required files are installed in the /opt/mesosphere directory. You can create the /opt/mesosphere directory prior to installing DC/OS, but it must be either an empty directory or a link to an empty directory. DC/OS can be installed on a separate volume mount by creating an empty directory on the mounted volume, creating a link at /opt/mesosphere that targets the empty directory, and then installing DC/OS.
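The separate-volume setup described above can be sketched as a small shell function. This is only an illustration: the function name prepare_install_dir and the example volume path are hypothetical, and the sketch deliberately refuses to touch a link path that already exists rather than trying to verify that it is empty.

```shell
# prepare_install_dir VOLUME_DIR LINK_PATH
#   VOLUME_DIR: empty directory to create on the mounted volume (hypothetical path)
#   LINK_PATH:  the link to create, normally /opt/mesosphere
prepare_install_dir() {
    volume_dir=$1
    link_path=$2

    # Stop if the link path already exists as a file, directory, or link.
    if [ -e "$link_path" ] || [ -L "$link_path" ]; then
        echo "error: $link_path already exists" >&2
        return 1
    fi

    mkdir -p "$volume_dir" || return 1

    # The link target must be empty before DC/OS is installed.
    if [ -n "$(ls -A "$volume_dir")" ]; then
        echo "error: $volume_dir is not empty" >&2
        return 1
    fi

    mkdir -p "$(dirname "$link_path")" || return 1
    ln -s "$volume_dir" "$link_path"
}
```

On a real node you would run this with root privileges, for example prepare_install_dir /mnt/dcos-volume/mesosphere /opt/mesosphere, where /mnt/dcos-volume stands in for your own mount point.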

You should verify the following requirements for all master and agent nodes in the cluster:

  • Every node must have network access to a public Docker repository or to an internal Docker registry.
  • If the node operating system is RHEL 7 or CentOS 7, the firewalld daemon must be stopped and disabled. For more information, see Disabling the firewall daemon on Red Hat or CentOS.
  • The DNSmasq process must be stopped and disabled so that DC/OS has access to port 53. For more information, see Stopping the DNSmasq process.
  • You are not using noexec to mount the /tmp directory on any system where you intend to use the DC/OS CLI.
  • You have sufficient disk space to store persistent information for the cluster in the /var/lib/mesos directory.
  • You should not remotely mount the /var/lib/mesos or Docker storage /var/lib/docker directory.
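The noexec requirement in the list above can be checked before installation. The following is a minimal sketch, assuming you obtain the mount options yourself (for example with findmnt from util-linux); the function name has_noexec is hypothetical.

```shell
# has_noexec OPTIONS
# Succeeds when the comma-separated mount option string contains "noexec".
has_noexec() {
    case ",$1," in
        *,noexec,*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Example use on a node (not run here):
#   opts=$(findmnt -n -o OPTIONS -T /tmp)
#   if has_noexec "$opts"; then echo "/tmp is mounted noexec"; fi
```

Wrapping the option string in commas keeps the match exact, so an option such as a hypothetical "noexecfoo" would not be mistaken for noexec.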

Disabling the firewall daemon on Red Hat or CentOS

There is a known issue in which the firewalld process interacts poorly with Docker. For more information about this issue, see the Docker CentOS firewalld documentation.

To stop and disable firewalld, run the following command:

sudo systemctl stop firewalld && sudo systemctl disable firewalld

Stopping the DNSmasq process

The DC/OS cluster requires access to port 53. To prevent port conflicts, you should stop and disable the dnsmasq process by running the following command:

sudo systemctl stop dnsmasq && sudo systemctl disable dnsmasq.service

Master node requirements

The following table lists the master node hardware requirements:

Resource    Minimum      Recommended
Nodes       1*           3 or 5
Processor   4 cores      4 cores
Memory      32 GB RAM    32 GB RAM
Hard disk   120 GB       120 GB

* For business critical deployments, three master nodes are required rather than one master node.

The masters run many mixed workloads. Workloads that are expected to be continuously available or are considered business-critical should only be run on a DC/OS cluster with at least three masters. For more information about high availability requirements, see the High Availability documentation.

Examples of mixed workloads on the masters are Mesos replicated logs and ZooKeeper. In some cases, mixed workloads require synchronizing with fsync periodically, which can generate a lot of expensive random I/O. We recommend the following:

  • Solid-state drive (SSD) or non-volatile memory express (NVMe) devices for fast, locally attached storage. To reduce the likelihood of I/O latency issues, these devices should be locally attached to the physical machine, if possible. In particular, use SSD or NVMe devices for the file systems that host the master node replicated logs.

    In planning your storage requirements, keep in mind that you should avoid using a single storage area network (SAN) device and NFS to connect to the nodes in the cluster. This type of architecture introduces a higher possibility of latency than using local storage and introduces a single point of failure in what should otherwise be a distributed system. Network latency and bandwidth issues can cause client sessions to time out and adversely affect DC/OS cluster performance and reliability.

  • RAID controllers with a battery backup unit (BBU).

  • RAID controller cache configured in writeback mode.

  • If separation of storage mount points is possible, the following storage mount points are recommended on the master node. These recommendations will optimize the performance of a busy DC/OS cluster by isolating the I/O of various services.

    Directory Path Description
    /var/lib/dcos A majority of the I/O on the master nodes will occur within this directory structure. If you are planning a cluster with hundreds of nodes or intend to have a high rate of deploying and deleting workloads, isolating this directory to dedicated SSD storage on a separate device is recommended.
  • Further breaking down this directory structure into individual mount points for specific services is recommended for a cluster which will grow to thousands of nodes.

    Directory Path Description
    /var/lib/dcos/mesos/master logging directories
    /var/lib/dcos/cockroach CockroachDB Enterprise
    /var/lib/dcos/navstar Mnesia database
    /var/lib/dcos/secrets secrets vault Enterprise
    /var/lib/dcos/exec Temporary files required by various DC/OS services. The /var/lib/dcos/exec directory must not be on a volume which is mounted with the noexec option.
    /var/lib/dcos/exhibitor ZooKeeper database
    /var/lib/dcos/exhibitor/zookeeper/transactions The ZooKeeper transaction logs are very sensitive to delays in disk writes. If you can only provide limited SSD space, this is the directory to place there. A minimum of 2 GB must be available for these logs.

Agent node requirements

The table below shows the agent node hardware requirements.

Resource    Minimum      Recommended
Nodes       1            6 or more
Processor   2 cores      2 cores
Memory      16 GB RAM    16 GB RAM
Hard disk   60 GB        60 GB
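The minimums in the master and agent tables above can be encoded in a small check. This is a sketch only: the function name meets_minimum is hypothetical, and you supply the measured values yourself. Keep in mind that the operating system usually reports slightly less memory than the nominal installed size, so a strict comparison against values read from /proc/meminfo may need some tolerance.

```shell
# meets_minimum ROLE CORES MEM_GB DISK_GB
# ROLE is "master" or "agent"; the other arguments are whole numbers.
# Returns 0 if all minimums are met, 1 if not, 2 for an unknown role.
meets_minimum() {
    case "$1" in
        master) min_cores=4; min_mem=32; min_disk=120 ;;
        agent)  min_cores=2; min_mem=16; min_disk=60 ;;
        *) echo "unknown role: $1" >&2; return 2 ;;
    esac
    [ "$2" -ge "$min_cores" ] && [ "$3" -ge "$min_mem" ] && [ "$4" -ge "$min_disk" ]
}
```

For example, meets_minimum agent "$(nproc)" 16 60 checks the core count of the current machine against the agent minimum, with the memory and disk values entered by hand.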

In planning memory requirements for agent nodes, you should ensure that agents are configured to minimize the use of swap space. To optimize cluster performance and reduce potential resource consumption issues, the recommended best practice is to disable memory swapping for all agents in the cluster, if possible.
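One way to see whether an agent has swap configured is to read the SwapTotal value from /proc/meminfo. The helper below is a minimal sketch with a hypothetical name, swap_total_kb; it reads meminfo-style text on standard input so that it can be tried on sample data.

```shell
# swap_total_kb
# Reads /proc/meminfo-style text on stdin and prints SwapTotal in kB.
# Prints 0 if no SwapTotal line is present.
swap_total_kb() {
    awk '$1 == "SwapTotal:" { print $2; found = 1 }
         END { if (!found) print 0 }'
}

# Example use on a node (not run here):
#   if [ "$(swap_total_kb < /proc/meminfo)" -gt 0 ]; then
#       echo "swap is configured on this agent"
#   fi
```

If swap is configured, sudo swapoff -a turns it off until the next reboot; making the change permanent also requires removing the swap entries from /etc/fstab.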

In addition to the requirements described in All master and agent nodes in the cluster, the agent nodes must have:

  • A /var directory with 20 GB or more of free space. This directory is used by the sandbox for both Docker and DC/OS Universal container runtime.

  • Do not use noexec to mount the /tmp directory on any system where you intend to use the DC/OS CLI unless a TMPDIR environment variable is set to something other than /tmp/. Mounting the /tmp directory using the noexec option could break CLI functionality.

  • If you are planning a cluster with hundreds of agent nodes or intend to have a high rate of deploying and deleting services, isolating the following directory to dedicated SSD storage is recommended.

    Directory Path Description
    /var/lib/mesos/ Most of the I/O from the agent nodes will be directed at this directory. Also, the disk space that Apache Mesos advertises in its UI is the sum of the space advertised by the file systems underpinning /var/lib/mesos.
  • Further breaking down this directory structure into individual mount points for specific services is recommended for a cluster which will grow to thousands of nodes.

    Directory path Description
    /var/lib/mesos/slave/slaves Sandbox directories for tasks
    /var/lib/mesos/slave/volumes Used by frameworks that consume ROOT persistent volumes
    /var/lib/mesos/docker/store Stores Docker image layers that are used to provision UCR containers
    /var/lib/docker Stores Docker image layers that are used to provision Docker containers
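The free-space requirement for /var listed above can be checked with df. This is a minimal sketch with hypothetical function names; it reports whole GiB rounded down, based on the POSIX output format of df.

```shell
# free_gb PATH
# Prints the free space, in whole GiB rounded down, on the file system holding PATH.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# has_free_gb PATH MIN_GB
# Succeeds when the file system holding PATH has at least MIN_GB GiB free.
has_free_gb() {
    [ "$(free_gb "$1")" -ge "$2" ]
}
```

On an agent node, has_free_gb /var 20 || echo "less than 20 GB free in /var" would flag a node that does not meet the requirement.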

Port and protocol configuration

  • Secure shell (SSH) must be enabled on all nodes.
  • Internet Control Message Protocol (ICMP) must be enabled on all nodes.
  • All hostnames (FQDN and short hostnames) must be resolvable in DNS; both forward and reverse lookups must succeed. Enterprise
  • All DC/OS node host names should resolve to locally bindable IP addresses. Most applications require host names to resolve by binding to a local IP address to function correctly. Applications that cannot resolve the host name of a node by binding to a local IP address might fail to function or behave in unexpected ways. Enterprise
  • Each node is network accessible from the bootstrap node.
  • Each node has unfettered IP-to-IP connectivity from itself to all nodes in the DC/OS cluster.
  • All ports should be open for communication from the master nodes to the agent nodes and vice versa. Enterprise
  • UDP must be open for ingress to port 53 on the masters. To attach to a cluster, the Mesos agent node service (dcos-mesos-slave) uses this port to find leader.mesos.

Requirements for intermediaries (e.g., reverse proxies performing SSL termination) between DC/OS users and the master nodes:

  • Intermediaries must not buffer the entire response before sending any data to the client.
  • Upon detecting that its client has gone away, the intermediary should also close the corresponding upstream TCP connection (i.e., the intermediary should not reuse upstream HTTP connections).

High-speed internet access

High-speed internet access is recommended for DC/OS installations. A minimum of 10 Mbit per second is required for DC/OS services. The installation of some DC/OS services will fail if the artifact download time exceeds the value of MESOS_EXECUTOR_REGISTRATION_TIMEOUT within the file /opt/mesosphere/etc/mesos-slave-common. The default value for MESOS_EXECUTOR_REGISTRATION_TIMEOUT is 10 minutes.
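To get a feel for what these numbers imply, the following sketch estimates how much data can be downloaded before the timeout expires. It assumes a sustained transfer rate, decimal units (1 MB = 8 Mbit), and no protocol overhead, so treat the result as an upper bound. The function name max_download_mb is hypothetical.

```shell
# max_download_mb MBIT_PER_SECOND TIMEOUT_SECONDS
# Prints the largest download, in MB, that fits within the timeout.
max_download_mb() {
    echo $(( $1 * $2 / 8 ))
}

# 10 Mbit/s for the default 10 minutes (600 seconds):
max_download_mb 10 600   # prints 750
```

Under these assumptions, artifacts totaling more than roughly 750 MB would not finish downloading within the default timeout on a 10 Mbit per second link.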

Software prerequisites

  • Refer to the install_prereqs.sh script for an example of how to install the software requirements for DC/OS masters and agents on a CentOS 7 host. Enterprise

  • When using OverlayFS over XFS, the XFS volume should be created with the -n ftype=1 flag. Please see the Red Hat and Mesos documentation for more details.
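You can check whether an existing XFS volume was created with ftype=1 by inspecting the output of xfs_info. The helper below is a minimal sketch with a hypothetical name, xfs_has_ftype; it reads xfs_info output on standard input.

```shell
# xfs_has_ftype
# Reads xfs_info output on stdin and succeeds when it reports ftype=1.
xfs_has_ftype() {
    grep -q 'ftype=1'
}

# Example use on a node (not run here):
#   xfs_info /var/lib/docker | xfs_has_ftype || echo "XFS volume lacks ftype=1"
```

The ftype setting is fixed when the file system is created, so a volume that reports ftype=0 has to be recreated with the -n ftype=1 flag rather than changed in place.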

Docker requirements

Docker must be installed on all bootstrap and cluster nodes. The supported Docker versions are listed on the version policy page.

Recommendations

  • Do not use Docker devicemapper storage driver in loop-lvm mode. For more information, see Docker and the Device Mapper storage driver.

  • Prefer OverlayFS or devicemapper in direct-lvm mode when choosing a production storage driver. For more information, see Docker’s Select a Storage Driver.

  • Manage Docker on CentOS with systemd. systemd starts Docker and restarts it if it crashes.

  • Run Docker commands as the root user (with sudo) or as a user in the docker user group.
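The storage driver recommendations above can be checked against the text output of docker info. The following is a sketch under stated assumptions: it expects a "Storage Driver:" line, and it treats the presence of a "Data loop file:" line as the sign of devicemapper running in loop-lvm mode, which is an assumption about the output format of your Docker version. The function name storage_driver_status is hypothetical.

```shell
# storage_driver_status
# Reads `docker info` text on stdin and prints one of:
#   ok          OverlayFS driver (overlay or overlay2)
#   direct-lvm  devicemapper without a loop file
#   loop-lvm    devicemapper backed by a loop file (not recommended)
#   other       any other or unknown driver
storage_driver_status() {
    awk '
        /^ *Storage Driver:/ { driver = $NF }
        /Data loop file:/    { loop = 1 }
        END {
            if (driver == "overlay" || driver == "overlay2") print "ok"
            else if (driver == "devicemapper" && loop)       print "loop-lvm"
            else if (driver == "devicemapper")               print "direct-lvm"
            else                                             print "other"
        }'
}

# Example use on a node (not run here):
#   sudo docker info 2>/dev/null | storage_driver_status
```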

Distribution-specific installation

Each Linux distribution requires Docker to be installed in a specific way.

For more information, see Docker’s distribution-specific installation instructions.

Disable sudo password prompts

To disable the sudo password prompt, you must add the following line to your /etc/sudoers file.

%wheel ALL=(ALL) NOPASSWD: ALL

Alternatively, you can SSH as the root user.

Synchronize time for all nodes in the cluster

You must enable Network Time Protocol (NTP) on all nodes in the cluster for clock synchronization. By default, during DC/OS startup you will receive an error if this is not enabled. You can check if NTP is enabled by running one of these commands, depending on your OS and configuration:

ntptime
adjtimex -p
timedatectl
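If you want to script this check, one option is to parse the output of timedatectl. The sketch below accepts both the older "NTP synchronized" label and the newer "System clock synchronized" label; the function name clock_synchronized is hypothetical.

```shell
# clock_synchronized
# Reads timedatectl output on stdin and succeeds when it reports
# that the clock is synchronized.
clock_synchronized() {
    grep -Eq '^[[:space:]]*(NTP|System clock) synchronized:[[:space:]]*yes[[:space:]]*$'
}

# Example use on a node (not run here):
#   timedatectl | clock_synchronized || echo "clock is not synchronized"
```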

Bootstrap node

Before installing DC/OS, you must ensure that your bootstrap node has the following prerequisites.

IMPORTANT: If you specify `exhibitor_storage_backend: zookeeper`, the bootstrap node is a permanent part of your cluster. With `exhibitor_storage_backend: zookeeper`, the leader state and leader election of your Mesos masters is maintained in Exhibitor ZooKeeper on the bootstrap node. For more information, see the configuration parameter documentation.

  • The bootstrap node must be separate from your cluster nodes.

DC/OS configuration file

  • Download and save the dcos_generate_config file to your bootstrap node. This file is used to create your customized DC/OS build file. Contact your sales representative or sales@mesosphere.com for access to this file. Enterprise

  • Download and save the dcos_generate_config file to your bootstrap node. This file is used to create your customized DC/OS build file. Open Source

Docker NGINX (production installation)

For production installations only, install the Docker NGINX image with this command:

sudo docker pull nginx

Cluster nodes

For production installations only, your cluster nodes must have the following prerequisites. The cluster nodes are designated as Mesos masters and agents during installation.

Data compression (production installation)

You must have the UnZip, GNU tar, and XZ Utils data compression utilities installed on your cluster nodes.

To install these utilities on CentOS 7 and RHEL 7:

sudo yum install -y tar xz unzip curl ipset
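After installation, you can confirm that the utilities are on the PATH. This is a minimal sketch; the function name missing_tools is hypothetical.

```shell
# missing_tools TOOL...
# Prints the name of each tool that is not found on the PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Example use on a node (not run here):
#   missing_tools tar xz unzip curl ipset
```

An empty result means that every listed utility was found.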

Cluster permissions (production installation)

On each of your cluster nodes, follow these instructions:

  • Make sure that SELinux is in one of the supported modes.

    To review the current SELinux status and configuration run the following command:

    sudo sestatus
    

    DC/OS supports the following SELinux configurations:

    • Current mode: disabled
    • Current mode: permissive
    • Current mode: enforcing, given that Loaded policy name is targeted

    To change the mode from enforcing to permissive run the following command:

    sudo sed -i 's/SELINUX=enforcing/SELINUX=permissive/g' /etc/selinux/config
    

    Or, if sestatus shows a “Current mode” which is enforcing with a Loaded policy name which is not targeted, run the following command to change the Loaded policy name to targeted:

    sudo sed -i 's/SELINUXTYPE=.*/SELINUXTYPE=targeted/g' /etc/selinux/config
    

    NOTE: Ensure that all services running on every node can be run in the chosen SELinux configuration.

  • Add nogroup and docker groups:

    sudo groupadd nogroup &&
    sudo groupadd docker
    
  • Reboot your cluster for the changes to take effect.

    sudo reboot
    

    NOTE: It may take a few minutes for your node to come back online after reboot.
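The supported SELinux configurations listed above can be checked by parsing the output of sestatus. The following is a sketch with a hypothetical function name, selinux_supported; it relies on the "SELinux status", "Current mode", and "Loaded policy name" lines and treats missing or unrecognized output as unsupported.

```shell
# selinux_supported
# Reads sestatus output on stdin and succeeds for: disabled, permissive,
# or enforcing with the targeted policy.
selinux_supported() {
    awk -F': *' '
        $1 == "SELinux status"     { status = $2 }
        $1 == "Current mode"       { mode = $2 }
        $1 == "Loaded policy name" { policy = $2 }
        END {
            if (status == "disabled" || mode == "permissive") exit 0
            if (mode == "enforcing" && policy == "targeted")  exit 0
            exit 1
        }'
}

# Example use on a node (not run here):
#   sudo sestatus | selinux_supported || echo "unsupported SELinux configuration"
```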

Locale requirements

You must set the LC_ALL and LANG environment variables to en_US.utf-8.

localectl set-locale LANG=en_US.utf8
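Locale names are spelled in several ways (for example en_US.utf8 and en_US.UTF-8), so a check of the current settings is easier after normalizing the value. The helper below is a minimal sketch with a hypothetical name, is_en_us_utf8; it ignores case and hyphens when comparing.

```shell
# is_en_us_utf8 VALUE
# Succeeds when VALUE names the en_US UTF-8 locale, ignoring case and hyphens.
is_en_us_utf8() {
    normalized=$(printf '%s' "$1" | tr 'A-Z' 'a-z' | tr -d '-')
    [ "$normalized" = "en_us.utf8" ]
}

# Example use on a node (not run here):
#   is_en_us_utf8 "$LANG"   || echo "LANG is not en_US.utf8"
#   is_en_us_utf8 "$LC_ALL" || echo "LC_ALL is not en_US.utf8"
```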

Next steps