This page describes the Dovecot config necessary to implement the Palomar Architecture.
Palomar requires a layer of load balancers to balance traffic ingress to the Dovecot Proxy.
Requirement
The load balancer MUST either be transparent (keep the original client IP visible) or it MUST support the HAProxy PROXY V2 protocol.
# haproxy.conf
# Sample configuration for a Palomar load balancing setup using software
# (HAProxy) load balancing
#
# Sample assumptions:
# - Traffic will be balanced to 2 proxies, located at 192.168.1.10 and
#   192.168.1.11
defaults
  mode tcp
  log global
  option dontlognull
  retries 3
  timeout http-request 10s
  timeout queue 1m
  timeout connect 10s
  timeout client 1m
  timeout server 1m
  timeout http-keep-alive 10s
  timeout check 10s
  maxconn 128000

listen lmtp
  bind 0.0.0.0:24
  mode tcp
  balance roundrobin
  server hac-dc1-proxy2 192.168.1.11:24 check send-proxy-v2
  server hac-dc1-proxy1 192.168.1.10:24 check send-proxy-v2

listen pop3
  bind 0.0.0.0:110
  mode tcp
  balance roundrobin
  server hac-dc1-proxy2 192.168.1.11:110 check send-proxy-v2
  server hac-dc1-proxy1 192.168.1.10:110 check send-proxy-v2

listen imap4
  bind 0.0.0.0:143
  mode tcp
  balance roundrobin
  server hac-dc1-proxy2 192.168.1.11:143 check send-proxy-v2
  server hac-dc1-proxy1 192.168.1.10:143 check send-proxy-v2

listen sieve
  bind 0.0.0.0:4190
  mode tcp
  balance roundrobin
  server hac-dc1-proxy2 192.168.1.11:4190 check send-proxy-v2
  server hac-dc1-proxy1 192.168.1.10:4190 check send-proxy-v2

listen submission
  bind 0.0.0.0:587
  mode tcp
  balance roundrobin
  server hac-dc1-proxy2 192.168.1.11:587 check send-proxy-v2
  server hac-dc1-proxy1 192.168.1.10:587 check send-proxy-v2
Dovecot Proxies authenticate the user, convert to internal user ID (if needed), and redirect to the Dovecot Backend.
A Dovecot Proxy requires the dovecot-pro-cluster package.
Documentation on cluster settings can be found below.
# Name of the proxy's local site
cluster_local_site = DC1
login_socket_path = cluster
auth_socket_path = cluster-userdb
lmtp_proxy = yes
# Upgrading note: Only the cluster process now connects to the auth-userdb
# socket, so the socket's user must be default_internal_user (dovecot). This
# is already the default, but old installations may have set it to "vmail" or
# something else, which then needs to be commented out.
service auth {
  unix_listener auth-userdb {
    #user = $SET:default_internal_user
  }
}
service cluster {
  # This needs to be 1 to ensure the cluster process is running even when
  # clients are not connected.
  process_min_avail = 1
  # Enable the listeners for login_socket_path and auth_socket_path:
  unix_listener cluster-userdb {
    mode = 0666
  }
  unix_listener login/cluster {
    mode = 0666
  }
}
These metrics must be configured exactly as below for the cluster to work correctly:
metric lmtp_rcpt_finished_failure {
  filter = event=smtp_server_transaction_rcpt_finished AND category=lmtp AND NOT error="" AND NOT enhanced_code=5.*
  group_by = dest_host
}
metric lmtp_rcpt_finished_success {
  filter = event=smtp_server_transaction_rcpt_finished AND category=lmtp AND error=""
  group_by = dest_host
}
metric proxy_session_failure {
  filter = event=proxy_session_finished AND error_code=* AND NOT error_code=proxy_dest_auth_temp_failed AND NOT error_code=proxy_dest_redirected
  group_by = dest_host
}
metric proxy_session_success {
  filter = event=proxy_session_established
  group_by = dest_host
}
cluster_geodb = proxy:dict-async:cluster-geodb
cluster_localdb = proxy:dict-async:cluster-localdb
dict {
  cluster-localdb = sqlite:/etc/dovecot/cluster-localdb.conf.ext
  cluster-geodb = cassandra:/etc/dovecot/cluster-cql.conf.ext
}
If HAProxy is used as a load balancer, its IPs must be trusted:
haproxy_trusted_networks = 192.168.1.0/24
Dovecot Proxy IPs/networks must also be trusted with login_trusted_networks. This includes the proxy's own listener address in service imap-login { inet_listener }, which defaults to the listen address; typically this is 127.0.0.1 or ::1.
login_trusted_networks = 127.0.0.1 ::1 192.168.1.0/24
The doveadm TCP service must be configured for all Proxies and Backends using the same doveadm_port and doveadm_password.
doveadm_port = 2300
doveadm_password = # shared password
service doveadm {
  inet_listener tcp {
    port = 2300 # same as doveadm_port
  }
}
Settings needed to monitor Backends/users. It is recommended that the test username include %{backend_host}, which expands to the hostname of the tested backend. This way, if there is a problem with the test user, it affects only that one backend.
The proxy needs a passdb listing the test users. They could be in the same passdb as the real users, or they can have a separate passdb. In the latter case, it should be placed as the first passdb to catch the test users.
cluster_backend_test_username = probe-%{backend_host}
cluster_backend_test_password = # shared password
# Passdb configuration for monitoring users first:
passdb test-accounts {
  args = /etc/dovecot/cluster-test-accounts.passwd
  override_fields = proxy=y password= # shared password
  driver = passwd-file
}
# Passdb for the real users:
#passdb ldap {
#  ...
#}
Dovecot Backends perform all the work related to protocol commands and interface with the storage.
A Dovecot Backend requires the dovecot-pro-cluster package.
Documentation on cluster settings can be found below.
# Name of the backend (same as in geodb)
cluster_backend_name = hac-dc1-be3
# Name of the backend's local site
cluster_local_site = DC1
login_socket_path = cluster
auth_socket_path = cluster-userdb
lmtp_proxy = yes
service cluster {
  # This needs to be 1 to ensure the cluster process is running even when
  # clients are not connected.
  process_min_avail = 1
  # Enable the listeners for login_socket_path and auth_socket_path:
  unix_listener cluster-userdb {
    mode = 0666
  }
  unix_listener login/cluster {
    mode = 0666
  }
}
cluster_geodb = proxy:dict-async:cluster-geodb
cluster_localdb = proxy:dict-async:cluster-localdb
dict {
  cluster-localdb = sqlite:/etc/dovecot/cluster-localdb.conf.ext
  cluster-geodb = cassandra:/etc/dovecot/cluster-cql.conf.ext
}
The doveadm TCP service must be configured for all Proxies and Backends using the same doveadm_port and doveadm_password.
doveadm_port = 2300
doveadm_password = # shared password
service doveadm {
  inet_listener tcp {
    port = 2300 # same as doveadm_port
  }
}
Each backend must receive test user logins from proxies. The users can be in the same passdb as the real users, or they can have a separate passdb. In the latter case, it should be placed as the first passdb to catch the test users.
# Passdb configuration for monitoring users first:
passdb {
  args = /etc/dovecot/cluster-test-accounts.passwd
  driver = passwd-file
}
# Passdb for the real users:
#passdb ldap {
#  ...
#}
# Backends use this in "doveadm cluster group access" command:
cluster_backend_test_username = probe-%{backend_host}
These settings are necessary to allow user listing and metacache pulling.
The userdb on backends must provide the user_cluster_group and metacache_last_host_dict settings. To do that, return user_cluster_group=%{passdb:forward_cl_user_group} and metacache_last_host_dict=%{passdb:forward_meta_last_host_dict} from the userdb.
Alternatively, these settings could be returned by the passdb as long as they are prefixed with userdb_, i.e. userdb_user_cluster_group and userdb_metacache_last_host_dict. In any case, the passdb/userdb must succeed for all valid logins and other user accesses so the fields are always added. This configuration applies only to login-based protocols (imap, pop3, submission, managesieve). For lmtp and doveadm the fields are hardcoded.
The cluster process sets forward_cl_user_group and forward_meta_last_host_dict when applicable so backends can use them. The following example uses a static userdb (adapt it to the actually used userdb or passdb):
userdb {
  driver = static
  args = user_cluster_group=%{passdb:forward_cl_user_group} metacache_last_host_dict=%{passdb:forward_meta_last_host_dict}
}
Additionally, backends need to have these settings in the plugin block (write the $variables as they are - they are expanded automatically):
plugin {
  cluster_backend_name = $cluster_backend_name
  doveadm_ssl = $doveadm_ssl
  doveadm_port = $doveadm_port
  doveadm_password = $doveadm_password
  doveadm_username = $doveadm_username
}
Suggested metric configuration:
metric metacache_pull_finished {
  filter = event=metacache_pull_finished AND error=""
  group_by = type
}
metric metacache_pull_finished_failure {
  filter = event=metacache_pull_finished AND NOT error=""
  group_by = type
}
You need to set up a Cassandra keyspace for Palomar.
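The tables below assume the keyspace already exists. A minimal sketch of creating it (the keyspace name matches the controller's CASSANDRA_KEYSPACE default; the replication class and factor are illustrative and must match your deployment):
create keyspace if not exists d8s_cluster
with replication = { 'class': 'NetworkTopologyStrategy', 'DC1': 3 };
use d8s_cluster;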
create table if not exists tags (
  id uuid,
  tag text,
  primary key ((id))
);
create table if not exists sites (
  id uuid,
  name text,
  tag uuid,
  load_balancer text,
  status text,
  primary key ((id))
);
create table if not exists site_reachability (
  src_site_id uuid,
  dest_site_id uuid,
  reachable int,
  primary key ((src_site_id), dest_site_id)
);
create table if not exists backends (
  id uuid,
  site_id uuid,
  load_factor int,
  host text,
  status text,
  status_reason text,
  last_moved_from text,
  last_moved_to text,
  primary key ((site_id), id)
);
create table if not exists backend_stats (
  site_id uuid,
  id uuid,
  key text,
  value double,
  type text,
  primary key ((site_id), id, key)
);
create table if not exists proxy_dest_stats (
  proxy_site_id uuid,
  proxy_host text,
  dest_host text,
  key text,
  value double,
  type text,
  primary key ((proxy_site_id), dest_host, proxy_host, key)
);
create table if not exists user_groups (
  name text,
  site_id uuid,
  backend_id uuid,
  alt_backend_id uuid,
  sticky_users int,
  moving text,
  refresh_after timestamp,
  sticky_backend int,
  primary key ((site_id), name)
);
create table if not exists group_sites (
  group_name text,
  site_id uuid,
  status text,
  failover_site_id uuid,
  last_update timestamp,
  primary key ((group_name), site_id)
);
create table if not exists users (
  username text,
  preferred_site_id uuid,
  group_name text,
  failover_site_group_name text,
  metacache_last_host text,
  moving_preferred_site_id uuid,
  moving_group_name text,
  moving_failover_site_group_name text,
  last_update timestamp,
  primary key ((username))
);
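Since the users table is partitioned by username, a user's current placement can be looked up with a single-partition query; a minimal illustrative example (the username is a placeholder):
select group_name, preferred_site_id, metacache_last_host
from users where username = 'user@example.com';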
create table if not exists cluster_settings (
  section text,
  site_id uuid,
  backend_id uuid,
  setting_name text,
  setting_value text,
  primary key ((section), site_id, backend_id, setting_name)
);
CASSANDRA_KEYSPACE
Default: "d8s_cluster"
Value: Python String
Cassandra keyspace to use.
CASSANDRA_PORT
Default: 9042
Value: Python Integer
The port Cassandra servers listen on.
CASSANDRA_PROTOCOL_VERSION
Default: 4
Value: Python Integer
Version of the Cassandra protocol to be used.
CASSANDRA_SERVERS
Default: "localhost"
Value: Python String
Comma-separated list of Cassandra server endpoints.
CASSANDRA_TLS_ENABLED
Default: False
Value: Python Boolean
Whether to use TLS in Cassandra connections.
CELERY_BROKER_TRANSPORT_OPTIONS
Default: {}
Value: Python Dictionary
Needed when using Redis Sentinel, where "master_name" must be provided as a key in the dictionary.
CELERY_BROKER_URL
Default: "redis://localhost:6379/0"
Value: Python String
A valid URI to the Redis server or Redis Sentinel used for controller task brokering.
CELERY_RESULT_BACKEND
Default: "redis://localhost:6379/0"
Value: Python String
A valid URI to the Redis server or Redis Sentinel used for holding task results.
CELERY_RESULT_BACKEND_TRANSPORT_OPTIONS
Default: {}
Value: Python Dictionary
Needed when using Redis Sentinel for Celery results, where "master_name" must be provided as a key in the dictionary.
CELERY_RESULT_EXPIRES_SECS
Default: 60
Value: Python Integer
Time to live of task results in Redis (in seconds).
CLUSTER_SITE
Default: "dc1a"
Value: Python String
Palomar site this controller is responsible for. Must be the same site name configured in Dovecot proxies with cluster_local_site.
CONTROLLER_API_URL
Default: "http://localhost:8000"
Value: Python String
URL the controller API runs on. The server binds to this address, and the admin UI uses it to send HTTP requests.
DRY_RUN
Default: False
Value: Python Boolean
Whether to enable DRY_RUN mode, which logs but does not perform controller worker actions such as set_host_offline, set_host_online, and move group.
GROUP_BALANCE_ENABLED
Default: False
Value: Python Boolean
Whether to enable the group balancing feature, which slowly redistributes users between groups on the same backend so the groups are roughly the same size.
GROUP_BALANCE_GROUP_SIZE_SLACK_PERCENT
Default: 10
Value: Python Integer
If group size differences are larger than the given percentage, users will be redistributed.
GROUP_BALANCE_MAX_USER_MOVES_PER_PASS
Default: 200
Value: Python Integer
Maximum total number of users that can be moved (across all groups) in each iteration of group balancing. Should be higher than GROUP_BALANCE_MAX_USER_MOVE_BETWEEN_GROUPS, since that setting limits the number of users moved between a single group pair, while GROUP_BALANCE_MAX_USER_MOVES_PER_PASS limits the total across all group pairs. For more information, see Automatic Group Rebalancing.
GROUP_BALANCE_MAX_USER_MOVE_BETWEEN_GROUPS
Default: 100
Value: Python Integer
Maximum number of users that can be moved between individual groups in each iteration of balancing. Used to prevent big swings in group sizes and too much load on Cassandra. For more information, see Automatic Group Rebalancing.
HOST_FAILURE_COOL_TIME_SECS
Default: 3600
Value: Python Integer
Minimum time in seconds between moving groups from hosts with failing logins to other hosts. If a host has a high failure rate and the cool-off time since its last group move has passed, the controller tries to find another host that has also passed its cool-off period to move a group to.
HOST_FAILURE_MIN_LOGINS
Default: 10
Value: Python Integer
Minimum number of logins in the past 5 minutes needed to start processing a host's health (i.e. change the backend status if necessary, or move groups away from it if the failure rate is high).
HOST_FAILURE_RATIO
Default: 0.1
Value: Python Float
Ratio of failed logins or mail deliveries that triggers group moves. Must be in the (0, 1) range, exclusive.
HOST_LOAD_BALANCE_MIN_COOL_TIME_SECS
Default: 3600
Value: Python Integer
Minimum time in seconds during which a group will not be moved to a new backend. If a group needs to be moved for load balancing, this period is honored, and groups that have been moved recently will not be moved again. Must be larger than 0.
HOST_LOAD_BALANCE_MIN_SAMPLES
Default: 3000
Value: Python Integer
Number of samples over the last 24 hours needed for all the Z-scores of a host to do load balancing.
HOST_LOAD_BALANCE_SCORE_DELTA_THRESHOLD_RATIO
Default: 0.5
Value: Python Float
Minimum load score difference between backends to initiate a group move. See Cluster controller load balancing.
PROMETHEUS_JOB_NAME
Default: ""
Value: Python String
If set, used as the job name when scraping backend and proxy statistics. Required when the Prometheus instance is managed outside of the Kubernetes installation (e.g. a central instance or mixed environments). If the Prometheus instance scrapes each site's hosts with a different job name, this setting must contain that job name to prevent unwanted data in query results. Can be empty when using the Prometheus bundled with the controller.
PROMETHEUS_URL
Default: "http://localhost:9090/"
Value: Python String
URL of the Prometheus instance used to gather backend and proxy statistics.
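The controller settings above take plain Python values. A minimal sketch of a single-site configuration (the hostnames and site name are illustrative assumptions of this example, and where the controller reads these settings from depends on the deployment):
CASSANDRA_SERVERS = "cas1.example.com,cas2.example.com"
CASSANDRA_KEYSPACE = "d8s_cluster"
CLUSTER_SITE = "DC1"
CELERY_BROKER_URL = "redis://redis1.example.com:6379/0"
CELERY_RESULT_BACKEND = "redis://redis1.example.com:6379/0"
PROMETHEUS_URL = "http://prometheus1.example.com:9090/"
DRY_RUN = True  # log actions without performing them while testing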
cluster_backend_name
Default: [None]
Value: string
Host name of the backend. This must be the same name as used by doveadm cluster-backend commands.
Cluster: Applies only to Backend.
cluster_backend_test_password
Default: [None]
Value: string
Password used for logging into backends to see whether they are up or down.
Cluster: Applies only to Proxy.
cluster_backend_test_username
Default: [None]
Value: string
This setting is used for two purposes:
- The username used for logging into backends to see whether they are up or down. The %{backend_host} variable expands to the hostname of the backend.
- The username used by the doveadm cluster group access command. This command just needs any existing user to work - it doesn't matter that the user isn't actually in the accessed group.
Cluster: Applies to both Proxy and Backend.
cluster_default_group_count
Default: [None]
Value: unsigned integer
Number of user groups that may be automatically created. This is used for creating a group for a user that doesn't yet have one. The group will be named default-N, where N is between 1 and cluster_default_group_count.
Cluster: Applies only to Proxy.
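For example, to let proxies automatically create up to 100 groups for users that don't yet have one (the count is an illustrative value):
cluster_default_group_count = 100
# Users without a group are assigned to one of default-1 .. default-100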
cluster_geodb
Default: [None]
Value: string
Dictionary URI used for the globally shared GeoDB. This typically points to Cassandra.
Cluster: Applies to both Proxy and Backend.
cluster_local_site
Default: [None]
Value: string
Name of the local site. This must be the same name as used by doveadm cluster-site commands.
Cluster: Applies to both Proxy and Backend.
cluster_localdb
Default: [None]
Value: string
Dictionary URI used for the server-specific local database. This typically points to SQLite under /dev/shm.
Cluster: Applies to both Proxy and Backend.
cluster_proxy_check_backends
Default: yes
Value: boolean
If enabled, this proxy runs checks to see whether backends are up or down. If disabled, this proxy never sets any backends offline.
Cluster: Applies only to Proxy.
cluster_user_move_timeout
Default: [None]
Value: time
If user moving hasn't finished by this timeout, just assume it finished and continue to the next user.
Cluster: Applies only to Proxy.