Configuration

Configuration options for Elastic

The default DC/OS Elastic installation provides reasonable defaults for trying out the service, but may not be sufficient for production use. You may require a different configuration depending on the context of the deployment.

Installing with Custom Configuration

The following are some examples of how to customize the installation of your Elastic instance.

In each case, you would create a new Elastic instance using the custom configuration as follows:

dcos package install elastic --options=sample-elastic.json

We recommend that you store your custom configuration in source control.

Installing multiple instances

By default, the Elastic service is installed with a service name of elastic. You may specify a different name using a custom service configuration as follows:

{
  "service": {
    "name": "elastic-other"
  }
}

When the above JSON configuration is passed to the dcos package install elastic command via the --options argument, the new service will use the name specified in that JSON configuration:

dcos package install elastic --options=elastic-other.json

Multiple instances of Elastic may be installed into your DC/OS cluster by customizing the name of each instance. For example, you might have one instance of Elastic named elastic-staging and another named elastic-prod, each with its own custom configuration.

After specifying a custom name for your instance, it can be reached using dcos elastic CLI commands or directly over HTTP as described below.

WARNING: The service name cannot be changed after initial install. Changing the service name would require installing a new instance of the service against the new name, then copying over any data as necessary to the new instance.

Installing into folders

In DC/OS 1.10 and later, services may be installed into folders by specifying a slash-delimited service name. For example:

{
  "service": {
    "name": "/foldered/path/to/elastic"
  }
}

The above example will install the service under a path of foldered => path => to => elastic. It can then be reached using dcos elastic CLI commands or directly over HTTP as described below.

WARNING: The service folder location cannot be changed after initial install. Changing the folder location would require installing a new instance of the service against the new name, then copying over any data as necessary to the new instance.

Addressing named instances

After you’ve installed the service under a custom name or under a folder, it may be accessed from all dcos elastic CLI commands using the --name argument. The --name value defaults to the name of the package: elastic.

For example, if you had an instance named elastic-dev, the following command would invoke a pod list command against it:

dcos elastic --name=elastic-dev pod list

The same query would be over HTTP as follows:

curl -H "Authorization:token=$auth_token" <dcos_url>/service/elastic-dev/v1/pod

Likewise, if you had an instance in a folder like /foldered/path/to/elastic, the following command would invoke a pod list command against it:

dcos elastic --name=/foldered/path/to/elastic pod list

Similarly, it could be queried directly over HTTP as follows:

curl -H "Authorization:token=$auth_token" <dcos_url>/service/foldered/path/to/elastic-dev/v1/pod

You may add a -v (verbose) argument to any dcos elastic command to see the underlying HTTP queries that are being made. This can be a useful tool to see where the CLI is getting its information. In practice, dcos elastic commands are a thin wrapper around an HTTP interface provided by the DC/OS Elastic Service itself.
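
For example, the following shows the underlying HTTP requests made for a pod list command against an instance named elastic-dev:

dcos elastic --name=elastic-dev -v pod list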

Integration with DC/OS access controls

In Enterprise DC/OS, DC/OS access controls can be used to restrict access to your service. To give a non-superuser complete access to a service, grant them the following list of permissions:

dcos:adminrouter:service:marathon full
dcos:service:marathon:marathon:<service-name> full
dcos:service:adminrouter:<service-name> full
dcos:adminrouter:ops:mesos full
dcos:adminrouter:ops:slave full

Where <service-name> is your full service name, including the folder if it is installed in one.
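
These grants can also be scripted with the DC/OS Enterprise CLI. The following is a sketch that assumes the dcos-enterprise-cli package is installed, a user named alice, and a service named elastic:

dcos security org users grant alice dcos:adminrouter:service:marathon full
dcos security org users grant alice dcos:service:marathon:marathon:elastic full
dcos security org users grant alice dcos:service:adminrouter:elastic full
dcos security org users grant alice dcos:adminrouter:ops:mesos full
dcos security org users grant alice dcos:adminrouter:ops:slave full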

Service Settings

Placement Constraints

Placement constraints allow you to customize where a service is deployed in the DC/OS cluster. Placement constraints use the Marathon operators syntax. For example, [["hostname", "UNIQUE"]] ensures that at most one pod instance is deployed per agent.

A common task is to specify a list of whitelisted systems to deploy to. To achieve this, use the following syntax for the placement constraint:

[["hostname", "LIKE", "10.0.0.159|10.0.1.202|10.0.3.3"]]

IMPORTANT: Be sure to include excess capacity in such a scenario so that if one of the whitelisted systems goes down, there is still enough capacity to repair your service.
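
Placement constraints are passed through the service's options JSON as an escaped string under the relevant node type. The following is only a sketch: it assumes the option is exposed as data_nodes.placement, so check the package's config schema for the exact key in your package version:

# NOTE: the "placement" key below is an assumption; verify it against your
# package version's config schema before using.
cat > elastic-placement.json <<EOF
{
  "data_nodes": {
    "placement": "[[\"hostname\", \"LIKE\", \"10.0.0.159|10.0.1.202|10.0.3.3\"]]"
  }
}
EOF
dcos package install elastic --options=elastic-placement.json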

Updating Placement Constraints

Clusters change, and so will your placement constraints. However, already running service pods will not be affected by changes in placement constraints: altering a placement constraint might invalidate the current placement of a running pod, and the pod will not be relocated automatically, since doing so is a destructive action. We recommend the following procedure to update the placement constraints of a pod:

  • Update the placement constraint definition in the service.
  • For each affected pod, one at a time, perform a pod replace (see the example below). This will (destructively) move the pod into compliance with the new placement constraints.
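
For example, assuming a service named elastic whose updated options are stored in elastic-options.json, the update and the replacement of the first data node would look like this:

dcos elastic --name=elastic update start --options=elastic-options.json
dcos elastic --name=elastic pod replace data-0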

Zones Enterprise

Requires: DC/OS 1.11 Enterprise or later.

Placement constraints can be applied to DC/OS zones by referring to the @zone key. For example, one could spread pods across a minimum of three different zones by including this constraint:

[["@zone", "GROUP_BY", "3"]]

For the @zone constraint to be applied correctly, DC/OS must have Fault Domain Awareness enabled and configured.

WARNING: A service installed without a zone constraint cannot be updated to have one, and a service installed with a zone constraint may not have it removed.

Virtual networks

DC/OS Elastic supports deployment on virtual networks on DC/OS (including the dcos overlay network), allowing each container (task) to have its own IP address and not use port resources on the agent machines. This can be specified by passing the following configuration during installation:

{
  "service": {
    "virtual_network_enabled": true
  }
}

NOTE: Once the service is deployed on a virtual network, it cannot be updated to use the host network.

User

By default, all pods’ containers will be started as system user “nobody”. If your system is configured to use another system user (for instance, you may have externally mounted persistent volumes with root’s permissions), you can define that user via a custom value for the service’s “user” property, for example:

{
  "service": {
    "properties": {
      "user": "root"
    }
  }
}

Regions

The service parameter region can be used to deploy the service in an alternate region. By default, the service is deployed in the “local” region, which is the region the DC/OS masters are running in. To install a service in a specific region, include the following in its options:

{
  "service": {
    "region": "<region>"
  }
}

WARNING: A service may not be moved between regions.

Configuration Guidelines

  • Service name: This should be unique for each instance of the service that is running. It is also used as your cluster name.
  • Service user: This should be a non-root user who already exists on each agent. The default user is nobody.
  • X-Pack is installed by default and comes with a 30-day trial license.
  • Health check credentials: If you have X-Pack Security enabled, the health check will use the credentials specified in the configuration for authorization. We recommend that you create a specific Elastic user/password for this with minimal capabilities rather than using the default superuser elastic.
  • Plugins: You can specify other plugins via a comma-separated list of plugin names (such as “analysis-icu”) or plugin URIs (see the example after this list).
  • CPU/RAM/Disk/Heap: These will be specific to your DC/OS cluster and your Elasticsearch use cases. Please refer to Elastic’s guidelines for configuration.
  • Node counts: At least one data node is required for the cluster to operate at all. You do not need to use a coordinator node. See Elastic’s documentation to learn about Elasticsearch node types. There is no maximum for node counts.
  • Master transport port: You can pick whichever port works for your DC/OS cluster. The default is 9300. If you want multiple master nodes from different clusters on the same host, specify different master HTTP and transport ports for each cluster. If you want to ensure a particular distribution of nodes of one task type (such as master nodes spread across multiple racks, data nodes on one class of machines), specify this via the Marathon placement constraint.
  • Serial vs Parallel deployment. By default, the DC/OS Elastic Service tells DC/OS to install everything in parallel. You can change this to serial in order to have each node installed one at a time.
  • Serial vs Parallel update. By default, the DC/OS Elastic Service tells DC/OS to update everything serially. You can change this to parallel in order to have each node updated at the same time. This is required, for instance, when you turn X-Pack Security on or off.
  • A custom YAML file can be appended to elasticsearch.yml on each node.
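
For example, a plugin list can be provided through the options JSON. The following is a sketch that assumes the option is exposed as elasticsearch.plugins; check the package's config schema (dcos package describe elastic) for the exact key:

{
  "elasticsearch": {
    "plugins": "analysis-icu,analysis-phonetic"
  }
}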

Immutable Settings

The following settings, whether set at cluster creation time via the Elastic package UI or via a JSON options file (via the CLI), cannot be changed after installation:

  • Service name (aka cluster name). Can be hyphenated, but not underscored.
  • Master transport port
  • Disk sizes/types

Modifiable Settings

  • Plugins
  • CPU
  • Memory
  • JVM Heap (do not exceed one-half available node RAM)
  • Node count (up, not down)
  • Health check credentials
  • X-Pack Security enabled/disabled
  • Deployment/Upgrade strategy (serial/parallel). Note that serial deployment does not yet wait for the cluster to reach green before proceeding to the next node. This is a known limitation.
  • Custom elasticsearch.yml

Any other modifiable settings are covered by the various Elasticsearch APIs (cluster settings, index settings, templates, aliases, scripts). It is possible that some of the more common cluster settings will get exposed in future versions of the DC/OS Elastic Service.

X-Pack Security

X-Pack is an Elastic Stack extension that provides security, alerting, monitoring, reporting, machine learning, and many other capabilities. By default, when you install Elasticsearch, X-Pack is installed.

You must set the update strategy to parallel when you toggle X-Pack Security in order to force a full cluster restart. Afterwards, you should set the update strategy back to serial for future updates.
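
As a sketch, toggling X-Pack Security on with a parallel update might look like the following. This assumes the update strategy is exposed as service.update_strategy, which you should verify against the package's config schema; the elasticsearch.xpack_security_enabled option is the one referenced in the Kibana guidelines below:

# NOTE: service.update_strategy is an assumed option name; verify it before use.
cat > toggle-xpack.json <<EOF
{
  "service": {
    "update_strategy": "parallel"
  },
  "elasticsearch": {
    "xpack_security_enabled": true
  }
}
EOF
dcos elastic --name=elastic update start --options=toggle-xpack.json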

You can toggle this setting at any time. This gives you the option of launching an Elastic cluster without X-Pack Security and then later enabling it. Or, you can run a cluster with X-Pack Security enabled and, if at the end of the 30-day trial period you do not wish to purchase a license, you can disable it without losing access to your data.

License Expiration

If you let your license expire, remember these two important points:

  • Your data is still there.
  • All data operations (read and write) continue to work.

Graph, Machine Learning, Alerting and Notification, Monitoring, and Security features all operate with reduced functionality when the license expires.

See Elastic’s documentation to learn more about how X-Pack license expiration is handled.

Topology

Each task in the cluster performs one and only one of the following roles: master, data, ingest, coordinator.

The default placement strategy specifies that no two nodes of any type are distributed to the same agent. You can specify further Marathon placement constraints for each node type. For example, you can specify that ingest nodes are deployed on a rack with high-CPU servers.

Figure 1 - Private nodes displayed by agent

Figure 2 - Private nodes displayed by VIP

No matter how big or small the cluster is, there will always be exactly 3 master-only nodes with minimum_master_nodes = 2.

Default Topology

The default topology requires a minimum of 3 agents and consists of:

  • 3 master-only nodes
  • 2 data-only nodes
  • 1 coordinator-only node
  • 0 ingest-only nodes

The master/data/ingest/coordinator nodes are set up to only perform their one role. That is, master nodes do not store data, and ingest nodes do not store cluster states.

Minimal Topology

You can set up a minimal development/staging cluster without ingest or coordinator nodes. You will still get 3 master nodes placed on 3 separate hosts. If you do not care about replication, you can even use just 1 data node.
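
As a sketch, such a minimal cluster could be requested with options like the following, assuming each node type exposes a count option alongside the node-type options shown elsewhere on this page (verify against the package's config schema):

{
  "data_nodes": {
    "count": 1
  },
  "ingest_nodes": {
    "count": 0
  },
  "coordinator_nodes": {
    "count": 0
  }
}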

Note that the default monitoring behavior is to try to write to an ingest node every few seconds. Without an ingest node, you will see frequent warnings in your master node error logs. While they can be ignored, you can turn them off by disabling X-Pack monitoring in your cluster, like this:

curl -XPUT -u elastic:changeme -H 'Content-Type: application/json' master.<service-dns>.l4lb.thisdcos.directory:9200/_cluster/settings -d '{
    "persistent" : {
        "xpack.monitoring.collection.interval" : -1
    }
}'

Zone/Rack-Aware Placement and Replication

Elastic’s “rack”-based fault domain support is automatically enabled when specifying a placement constraint that uses the @zone key. For example, you could spread Elastic nodes across a minimum of three different zones/racks by specifying the constraint [["@zone", "GROUP_BY", "3"]]. When a placement constraint specifying @zone is used, Elastic nodes will be automatically configured with racks that match the names of the zones. If no placement constraint referencing @zone is configured, all nodes will be configured with a default rack of rack1.

In addition to placing the tasks on different zones/racks, the zone/rack information will be included in each Elastic node’s node.attr.zone attribute, and cluster.routing.allocation.awareness.attributes is set to “zone”. This enables Elastic to replicate data between zones/racks rather than to two nodes in the same zone/rack.

Custom Elasticsearch YAML

Many Elasticsearch options are exposed via the package configuration in config.json, but there may be times when you need to add something custom to the elasticsearch.yml file. For instance, if you have written a custom plugin that requires special configuration, you must specify this block of YAML for the Elastic service to use.

Add your custom YAML when installing or updating the Elastic service. In the DC/OS UI, click Configure. In the left navigation bar, click elasticsearch and find the field for specifying a custom Elasticsearch YAML. You must base64 encode your block of YAML and enter this string into the field.

You can do this base64 encoding as part of your automated workflow, or you can do it manually with an online converter.
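
For example, on Linux you could encode a small block of YAML as follows (the setting shown is purely illustrative; on macOS, omit the -w 0 flag):

# Write an illustrative elasticsearch.yml snippet, then base64 encode it
# on a single line (GNU coreutils; use plain `base64` on macOS).
cat > custom-elasticsearch.yml <<EOF
script.painless.regex.enabled: true
EOF
base64 -w 0 custom-elasticsearch.yml

Paste the resulting string into the custom YAML field.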

NOTE: You must only specify configuration options that are not already exposed in config.json.

Using Volume Profiles

Volume profiles are used to classify volumes. For example, users can group SSDs into a “fast” profile and group HDDs into a “slow” profile.

NOTE: Volume profiles are immutable and therefore cannot contain references to specific devices, nodes, or other ephemeral identifiers.

DC/OS Storage Service (DSS) is a service that manages volumes, volume profiles, volume providers, and storage devices in a DC/OS cluster.

Once the DC/OS cluster is running and volume profiles are created, you can deploy Elasticsearch with the following configuration:

cat > elastic-options.json <<EOF
{
	"master_nodes": {
		"volume_profile": "elastic",
		"disk_type": "MOUNT"
	},
	"data_nodes": {
		"volume_profile": "elastic",
		"disk_type": "MOUNT"
	},
	"ingest_nodes": {
		"volume_profile": "elastic",
		"disk_type": "MOUNT"
	},
	"coordinator_nodes": {
		"volume_profile": "elastic",
		"disk_type": "MOUNT"
	}
}
EOF
dcos package install elastic --options=elastic-options.json

NOTE: Elasticsearch will be configured to look for MOUNT volumes with the elastic profile.

Once the Elasticsearch service finishes deploying, its tasks will be running with the specified volume profiles.

dcos elastic update status
deploy (serial strategy) (COMPLETE)
├─ master-update (serial strategy) (COMPLETE)
│  ├─ master-0:[node] (COMPLETE)
│  ├─ master-1:[node] (COMPLETE)
│  └─ master-2:[node] (COMPLETE)
├─ data-update (serial strategy) (COMPLETE)
│  ├─ data-0:[node] (COMPLETE)
│  └─ data-1:[node] (COMPLETE)
├─ ingest-update (serial strategy) (COMPLETE)
└─ coordinator-update (serial strategy) (COMPLETE)
   └─ coordinator-0:[node] (COMPLETE)

Using external volumes

You can specify external volume configuration in your Elastic definition.

By default, the name of an external volume is constructed as:

  • <service-name>_<pod-type>-<pod-index> - for the Portworx volume provider
  • <service-name>_<pod-type>_<pod-index> - for other external volume providers

You can specify a custom volume name for certain node types. The volume name will then be constructed as:

  • <volume-name>-<pod-index> - for the Portworx volume provider
  • <volume-name>_<pod-index> - for other external volume providers

Here, <service-name> is the sanitized service path, with slashes (/) replaced by double underscores (__); <pod-type> is the pod type; and <pod-index> is the pod index.

Example Elastic deployment with external volume configuration:

cat > elastic-options.json <<EOF
{
    "service": {
        "name": "/foo/bar/elastic"
    },
    "master_nodes": {
        "external_volume": {
            "enabled": true,
            "driver_options": "size=50",
            "volume_name": "MasterNodeVolume",
            "driver_name": "pxd"
        }
    },
    "data_nodes": {
        "external_volume": {
            "enabled": true,
            "driver_options": "size=50",
            "volume_name": "",
            "driver_name": "pxd"
        }
    },
    "coordinator_nodes": {
        "external_volume": {
            "enabled": true,
            "driver_options": "size=50",
            "volume_name": "",
            "driver_name": "pxd"
        }
    }
}
EOF
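
Then install the service using the options file:

dcos package install elastic --options=elastic-options.json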

For this example, external volumes will be created with the following names:

  • foo__bar__elastic-coordinator-0
  • foo__bar__elastic-data-0
  • foo__bar__elastic-data-1
  • MasterNodeVolume-0
  • MasterNodeVolume-1
  • MasterNodeVolume-2

Elasticsearch Metrics

The Elasticsearch Prometheus Exporter collects metrics for Prometheus. Preconfigured alert rules and Grafana dashboards are stored in the dcos/prometheus-alert-rules and dcos/grafana-dashboards repositories, respectively. Alternatively, you can use your own dashboard/alert files and point to them in the dcos-monitoring configuration.

Grafana Dashboards

dcos-monitoring configuration:

{
  "grafana": {
    "dashboard_config_repository": {
      "url": "https://github.com/dcos/grafana-dashboards",
      "path": "dashboards/elasticsearch",
      "reference_name": "",
      "credentials": {
        "username_secret": "",
        "password_secret": "",
        "deploy_key_secret": ""
      }
    }
  }
}

dashboard_config_repository.path contains the following file:

  • elasticsearch.json

Prometheus Alerts

dcos-monitoring configuration:

{
  "prometheus": {
    "alert_rules_repository": {
      "url": "https://github.com/dcos/prometheus-alert-rules",
      "path": "rules/elasticsearch",
      "reference_name": "",
      "credentials": {
        "username_secret": "",
        "password_secret": "",
        "deploy_key_secret": ""
      }
    }
  }
}

alert_rules_repository.path contains the files:

  • elasticsearch.rules
  • elasticsearch.yml

Kibana

Kibana is an open source data visualization plugin for Elasticsearch that lets you visualize your Elasticsearch data and navigate the Elastic Stack. You can install Kibana like any other DC/OS package via the Catalog tab of the DC/OS UI or via the DC/OS CLI with:

dcos package install kibana

This will install Kibana using the default name “kibana”. The service name can be configured via the service.name option; check the configuration guidelines below for more details. To install with a custom configuration, pass a JSON options file:

dcos package install kibana --options=kibana.json

Accessing the Kibana UI

Make sure that Kibana is up and running

Services usually take a moment to finish installing and become ready to use. You can check whether your Kibana service is ready with the following command:

dcos marathon app show kibana | jq -r '.tasksHealthy'

If it outputs 1, Kibana is up and running. A 0 means that it is probably still being installed.

Another good indication that Kibana is ready is when the following line appears in the stdout log for the Kibana task:

{"type":"log","@timestamp":"2016-12-08T22:37:46Z","tags":["listening","info"],"pid":12263,"message":"Server running at http://0.0.0.0:5601"}

Kibana without X-Pack Security enabled

If Kibana was installed without X-Pack Security enabled, you should be able to access it through the default DC/OS UI Service link (https://<cluster-url>/service/<kibana-service-name>).

Kibana with X-Pack Security enabled

Due to a known limitation, if you installed Kibana with X-Pack Security enabled, you will not be able to access it through the default DC/OS 1.11 UI Service link. In this case, you must expose Kibana using Edge-LB.

Configuration Guidelines

  • Service name (service.name): This must be unique for each instance of the service that is running. The default is kibana.
  • Service user (service.user): This must be a non-root user that already exists on each agent. The default user is nobody.
  • If you have X-Pack Security enabled in Elastic (elasticsearch.xpack_security_enabled: true), you must also have it enabled in Kibana (kibana.elasticsearch_xpack_security_enabled: true).
  • Elasticsearch credentials (kibana.user and kibana.password): If you have X-Pack Security enabled, Kibana will use these credentials for authorized requests to Elasticsearch. The default user is kibana, and the password must be configured through the service options.
  • Elasticsearch URL: This is a required configuration parameter. The default value http://coordinator.<elastic-service-name>.l4lb.thisdcos.directory:9200 corresponds to the named VIP that exists when the Elastic package is launched with its own default configuration.

Configuring Kibana

You can customize the Kibana installation in a variety of ways by specifying a JSON options file. For example, here is a sample JSON options file that:

  • Sets the service name to another-kibana
  • Sets the password for Kibana requests to an Elasticsearch cluster configured with authentication
  • Configures Kibana to communicate with Elasticsearch via TLS
  • Turns on X-Pack Security, so that Kibana works against an Elasticsearch similarly configured

another_kibana.json

{
    "service": {
        "name": "another-kibana"
    },
    "kibana": {
        "password": "0cb46ab2d7790f30ceb32bd3d43fff35",
        "elasticsearch_tls": true,
        "elasticsearch_url": "https://coordinator.elastic.l4lb.thisdcos.directory:9200",
        "elasticsearch_xpack_security_enabled": true
    }
}

The following command installs Kibana using a JSON options file:

dcos package install kibana --options=another_kibana.json

To see a list of all possible options, run the following command to show the configuration schema:

dcos package describe kibana | jq -r '.package.config'

Custom Kibana YAML

Many Kibana options are exposed via the package configuration in config.json, but there may be times when you need to add something custom to the kibana.yml file. For instance, if you have written a custom plugin that requires special configuration, you must specify this block of YAML for the Kibana service to use.

Add your custom YAML when installing or updating the Kibana service. In the DC/OS UI, click Configure. In the left navigation bar, click kibana and find the field for specifying custom Kibana YAML. You must base64 encode your block of YAML and enter this string into the field.

You can do this base64 encoding as part of your automated workflow, or you can do it manually with an online converter.
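
The encoding works the same way as for Elasticsearch above. For example, on Linux (the setting shown is purely illustrative; on macOS, omit the -w 0 flag):

cat > custom-kibana.yml <<EOF
server.maxPayloadBytes: 4194304
EOF
base64 -w 0 custom-kibana.yml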

NOTE: You must only specify configuration options that are not already exposed in config.json.