Mail index files consist of:
Main Index File (dovecot.index)
Transaction Log (dovecot.index.log and dovecot.index.log.2)
Cache File (dovecot.index.cache)
See also Mail Index API.
The mail index files are used in a few different places:
Mailbox index (dovecot.index*)
mailbox_list_index (dovecot.list.index*)
mdbox map index (dovecot.map.index*)
The mailbox index is optional for some mailbox formats (maildir, mbox), but required for all high performance mailbox formats (sdbox, mdbox).
The index files were implemented to optimize Dovecot, so the file formats attempt to be efficient. The index files are often mmap()ed into memory and accessed directly via structs. This means that the data is stored using the CPU endianness, and all structs that end up in index files have to be careful with data alignment to avoid crashes with CPUs that require the alignment.
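As a rough illustration of why alignment matters when records are read directly through structs, the sketch below defines a hypothetical fixed-size record and checks at compile time that records laid out back to back stay aligned. The names `example_record` and `record_at` and the field layout are assumptions made for this example; they are not Dovecot's actual on-disk structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size record; NOT Dovecot's real layout. */
struct example_record {
	uint32_t uid;   /* stored in the CPU's native byte order */
	uint8_t flags;
	uint8_t pad[3]; /* explicit padding keeps the size a multiple of 4 */
};

/* Records follow each other without gaps, so the record size must be a
   multiple of the strictest member alignment. Otherwise the second
   record's uid field would land on a misaligned address. */
_Static_assert(sizeof(struct example_record) == 8,
	       "unexpected record size");
_Static_assert(sizeof(struct example_record) % _Alignof(uint32_t) == 0,
	       "records would become misaligned");

/* Return a pointer to record number idx inside a mapped area.
   base is assumed to be suitably aligned (mmap() returns page-aligned
   addresses), and hdr_size must preserve that alignment. */
static const struct example_record *
record_at(const void *base, size_t hdr_size, size_t idx)
{
	assert(hdr_size % _Alignof(struct example_record) == 0);
	return (const struct example_record *)
		((const unsigned char *)base + hdr_size) + idx;
}
```

Because the fields are read exactly as they sit in memory, a file written this way is only meaningful on a CPU with the same endianness as the one that wrote it.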
At times there have been thoughts about changing index handling so it wouldn't care about CPU endianness or alignment, but this would be a huge change and the end result would almost certainly be worse performance. This is mostly a theoretical problem anyway: It's very unlikely that index files are moving between little and big endian CPUs, and if that is actually wanted the mails can be migrated with dsync.
The main index contains fixed size records, which contain at least:
There are also optional extensions, which increase the record size:
The index file's header also contains some summary information, such as how many messages exist, how many of them are unseen and how many are marked with the \Deleted flag. This allows efficiently answering the IMAP STATUS commands.
The dovecot.index file is lazily updated by recreating it once in a while. An existing dovecot.index file is never written to. The transaction log file contains updates that need to be applied on top of the main index file to get to the latest state of the mailbox.
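The idea of applying logged updates on top of a snapshot can be sketched as follows. This is a simplified in-memory model under stated assumptions: the types `rec`, `log_op` and `view`, the three operation kinds and the `FLAG_SEEN` bit are all hypothetical and do not mirror Dovecot's real record or transaction formats.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RECS 16
#define FLAG_SEEN 0x01u /* hypothetical flag bit */

struct rec { uint32_t uid; uint8_t flags; };

enum op_type { OP_APPEND, OP_SET_FLAGS, OP_EXPUNGE };
struct log_op { enum op_type type; uint32_t uid; uint8_t flags; };

/* In-memory view: starts as a copy of the snapshot's records. */
struct view {
	struct rec recs[MAX_RECS];
	size_t count;
};

/* Return the position of uid, or v->count if it isn't present. */
static size_t view_find(const struct view *v, uint32_t uid)
{
	size_t i;

	for (i = 0; i < v->count; i++) {
		if (v->recs[i].uid == uid)
			break;
	}
	return i;
}

/* Apply n logged operations, in order, on top of the view. */
static void view_apply(struct view *v, const struct log_op *ops, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		size_t pos = view_find(v, ops[i].uid);

		switch (ops[i].type) {
		case OP_APPEND:
			assert(v->count < MAX_RECS);
			v->recs[v->count].uid = ops[i].uid;
			v->recs[v->count].flags = ops[i].flags;
			v->count++;
			break;
		case OP_SET_FLAGS:
			if (pos < v->count)
				v->recs[pos].flags = ops[i].flags;
			break;
		case OP_EXPUNGE:
			if (pos < v->count) {
				/* Close the gap left by the removed record. */
				for (size_t j = pos + 1; j < v->count; j++)
					v->recs[j - 1] = v->recs[j];
				v->count--;
			}
			break;
		}
	}
}

/* Summary counter of the kind a header could store for STATUS replies. */
static size_t view_count_unseen(const struct view *v)
{
	size_t unseen = 0;

	for (size_t i = 0; i < v->count; i++) {
		if ((v->recs[i].flags & FLAG_SEEN) == 0)
			unseen++;
	}
	return unseen;
}
```

In this model the snapshot itself is never modified. A fresh snapshot would be produced by writing out the current view, after which the already-applied operations no longer need to be replayed.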
See Main Index for more details.
The transaction log contains all the changes going to the main index (but no dovecot.index.cache contents). It is the only file that is always required to exist for a folder. (A newly created folder's indexes don't contain dovecot.index immediately.)
New transactions are usually appended to the log file. Once the log becomes large enough, it's rotated into dovecot.index.log.2 and a new empty log file is created. The .log.2 file is deleted on the next log rotation, or earlier if it becomes old enough.
There are several advantages to having a transaction log:
It provides atomic transactions: a transaction either succeeds completely or has no effect. For example, if a transaction sets a flag on one message and removes it from another, it's guaranteed that either both changes happen or neither does.
It allows another process to quickly see what changes have been made to the mailbox by other processes. For example, the IMAP protocol needs to send the IMAP client a list of all mailbox changes after each IMAP command.
They're also used for quickly getting changes (flag changes and expunges especially) since a specific point in time:
See Transaction Log for more details.
The cache file can have all kinds of cached email data, such as cached email headers. The cached data can't be changed. To prevent abuse, excessively large cache records aren't added to the cache file.
Each mailbox can have its own caching decisions. New cache fields are dynamically added as they become used. For example, a user may start using a new IMAP client, which fetches some new message headers that old clients didn't want. This triggers Dovecot to start caching the newly requested header for any new mail deliveries. Similarly, if some cache field isn't accessed for a while, it's dropped entirely.
Fields can be cached either permanently or temporarily. The temporary fields may be dropped for mails that were saved more than 7 days ago. The idea for temporary fields is that some IMAP clients cache all the data locally, so they benefit from Dovecot's caching only once. 7 days should be long enough that the user has accessed the mailbox with all their locally caching clients. After this the cache fields are just wasting disk space unnecessarily.
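The permanent versus temporary distinction can be expressed as a small predicate. The sketch below is only an illustration: the enum `field_caching`, the function `field_may_be_dropped` and the constant name are hypothetical, and the 7 day limit is taken from the text above rather than from Dovecot's source.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical names; only the 7 day window comes from the text. */
enum field_caching { FIELD_TEMPORARY, FIELD_PERMANENT };

#define TEMP_FIELD_MAX_AGE_SECS (7 * 24 * 60 * 60)

/* Return true if a cached field may be dropped for a mail that was
   saved at time `saved`, when evaluated at time `now`. */
static bool field_may_be_dropped(enum field_caching caching,
				 time_t saved, time_t now)
{
	switch (caching) {
	case FIELD_PERMANENT:
		return false;
	case FIELD_TEMPORARY:
		/* Only mails saved more than 7 days ago qualify. */
		return difftime(now, saved) > TEMP_FIELD_MAX_AGE_SECS;
	}
	assert(0 && "unknown field_caching value");
	return false;
}
```

Note that the age is measured from when the mail was saved, not from when the field was last read.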
See Cache File for more details.
dovecot.list.index* files are used for mailbox list indexes. They use the same mail index format, although some fields are slightly abused to make it work.
\NoSelect).
dovecot.list.index doesn't change, because the UID refers to the mailbox name, not to the mailbox itself.
Dovecot uses several different techniques to allow reading files without locking them. One of them uses fields in a "lockless integer" format. Initially these fields have an "unset" value. They can be set to a wanted value in range
The lockless integers work by allocating one bit from each byte of the value to a "this value is set" flag. The reader then verifies that the flag is set in all of the value's bytes. If it isn't set in every byte, the value is still "unset". Dovecot uses the highest bit of each byte for this flag. So for example:
0x00000000: The value is unset
0xFFFF7FFF: The value is unset, because one of the bytes didn't have the highest bit set
0xFFFFFFFF: The value is
0x80808080: The value is 0
0x80808180: The value is 0x80
Dovecot contains mail_index_uint32_to_offset() and mail_index_offset_to_uint32() functions to translate values between integers and lockless integers. The "unset" value is returned as 0, so it's not possible to differentiate between "unset" and "set" 0 values.
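The encoding described above can be sketched in a few lines of C. This is a minimal model of the scheme for a 32-bit field, written to match the listed examples. The function names `lockless_encode` and `lockless_decode` are hypothetical, and the sketch is not a copy of Dovecot's mail_index_uint32_to_offset() and mail_index_offset_to_uint32(), which may differ in details such as on-disk byte order.

```c
#include <assert.h>
#include <stdint.h>

/* 4 bytes with 7 payload bits each leave 28 bits for the value. */
#define LOCKLESS_MAX 0x0FFFFFFFu
#define LOCKLESS_SET_BITS 0x80808080u

/* Spread a 28-bit value over four bytes, 7 bits per byte, and set the
   highest bit of every byte as the "this value is set" flag. */
static uint32_t lockless_encode(uint32_t value)
{
	assert(value <= LOCKLESS_MAX);
	return LOCKLESS_SET_BITS |
		(value & 0x7Fu) |
		(((value >> 7) & 0x7Fu) << 8) |
		(((value >> 14) & 0x7Fu) << 16) |
		(((value >> 21) & 0x7Fu) << 24);
}

/* Return the stored value, or 0 if any byte lacks the flag bit
   (the value is unset or only partially written). */
static uint32_t lockless_decode(uint32_t stored)
{
	if ((stored & LOCKLESS_SET_BITS) != LOCKLESS_SET_BITS)
		return 0;
	return (stored & 0x7Fu) |
		(((stored >> 8) & 0x7Fu) << 7) |
		(((stored >> 16) & 0x7Fu) << 14) |
		(((stored >> 24) & 0x7Fu) << 21);
}
```

In this sketch `lockless_decode(0x80808180)` returns 0x80 and `lockless_decode(0xFFFFFFFF)` returns 0x0FFFFFFF. A reader that observes a field while it is being written byte by byte sees at least one byte without the flag bit and therefore treats the field as unset, which is what makes the format readable without a lock.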