Dovecot Pro can use Apache Tika to index text in various mail attachments to improve the user's search experience.
Additionally, Tika supports Optical Character Recognition (OCR), which enables certain types of image data (e.g. scanned PDFs, JPEGs) to be indexed as well.
Dovecot Pro provides a pre-built Apache Tika OCI container that contains the indexing service, including the components needed to perform OCR.
The container image is hosted in the Open-Xchange Container Registry.
Authenticate to registry.open-xchange.com:
docker login registry.open-xchange.com
Pull the image:
docker pull registry.open-xchange.com/dovecot-pro/apache-tika:latest
Example command to run the container:
docker run -d -p 9998:9998 registry.open-xchange.com/dovecot-pro/apache-tika:latest
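Once the container is running, it can be useful to confirm that the service answers before pointing Dovecot at it. The following is a minimal sketch, assuming the container exposes the stock Tika Server REST interface, whose `/version` resource returns a plain-text version string. The function names `version_url` and `tika_version` are hypothetical helpers introduced for illustration.

```python
import urllib.request


def version_url(base_url: str) -> str:
    """Build the URL of Tika Server's /version resource from a base URL."""
    return base_url.rstrip("/") + "/version"


def tika_version(base_url: str, timeout: float = 5.0) -> str:
    """Fetch the version string reported by a running Tika Server.

    Assumes the container serves the standard Tika Server REST API.
    Raises urllib.error.URLError if the service is unreachable.
    """
    with urllib.request.urlopen(version_url(base_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace").strip()
```

With the port mapping from the example command above, `tika_version("http://localhost:9998")` should return the version string if the container started correctly.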
See Custom Config for instructions on how to configure the Tika container.
Set fts_decoder_driver and fts_decoder_tika_url in the Dovecot configuration:
fts_decoder_driver = tika
fts_decoder_tika_url = http://example.com:9998/tika/
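To illustrate what the configured endpoint does, the sketch below builds the kind of request that Tika Server's REST interface accepts for text extraction: an HTTP PUT of the raw document with an `Accept: text/plain` header. Whether Dovecot issues exactly this request is an assumption; the helper names `build_extract_request` and `extract_text` are hypothetical and not part of Dovecot or Tika.

```python
import urllib.request


def build_extract_request(
    tika_url: str,
    document: bytes,
    content_type: str = "application/octet-stream",
) -> urllib.request.Request:
    """Build a PUT request asking Tika Server for plain-text extraction."""
    return urllib.request.Request(
        tika_url,
        data=document,
        method="PUT",
        headers={"Accept": "text/plain", "Content-Type": content_type},
    )


def extract_text(
    tika_url: str,
    document: bytes,
    content_type: str = "application/octet-stream",
    timeout: float = 30.0,
) -> str:
    """Send the document to Tika and return the extracted text."""
    req = build_extract_request(tika_url, document, content_type)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `extract_text` with the same URL as `fts_decoder_tika_url` and the bytes of a sample attachment is a quick way to check that the endpoint returns usable text.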
Tika acts as a microservice, and each request from Dovecot is independent. Therefore, Tika can be scaled horizontally by adding more nodes.
The customer will need to provide a service (for example, a load balancer) that distributes requests between the Tika nodes. Dovecot should then be configured to point to this distribution endpoint.
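Because each request is independent, the distribution logic can be very simple. The sketch below shows round-robin selection over a list of Tika nodes. It is only an illustration of the idea: in a real deployment this role would normally be filled by a dedicated load balancer, and the node URLs and the `round_robin` helper are hypothetical.

```python
import itertools
from typing import Iterable, Iterator


def round_robin(nodes: Iterable[str]) -> Iterator[str]:
    """Return an endless iterator that cycles through the given node URLs.

    No session affinity is needed, since Tika keeps no state
    between requests.
    """
    node_list = list(nodes)
    if not node_list:
        raise ValueError("at least one Tika node is required")
    return itertools.cycle(node_list)


# Hypothetical node addresses, for illustration only.
picker = round_robin([
    "http://tika-1.internal:9998/tika/",
    "http://tika-2.internal:9998/tika/",
])
first_target = next(picker)   # tika-1
second_target = next(picker)  # tika-2
```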
By default, the Tika container exposes its service via an unsecured HTTP port.
The customer will need to provide TLS protection if the service needs to be secured. fts_decoder_tika_url supports https URLs.
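A deployment check can flag a Tika URL that would send attachment data in clear text. The following is a minimal sketch that inspects only the URL scheme; it does not verify certificates or contact the server, and the `tika_url_uses_tls` helper is hypothetical.

```python
from urllib.parse import urlsplit


def tika_url_uses_tls(url: str) -> bool:
    """Return True if the URL uses https, False if it uses plain http.

    Raises ValueError for any other scheme or a URL without a host.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not a valid HTTP(S) URL: {url!r}")
    return parts.scheme == "https"
```

For example, the URL from the configuration above, `http://example.com:9998/tika/`, yields `False`, while an `https://` URL served by a TLS-terminating proxy yields `True`.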
The Tika container is provided under the terms of the Apache License v2.0.