We're updating the issue view to help you get more done. 

Unable to start if mailserver is listening but not responding

Description

Original report: https://github.com/mattermost/mattermost-server/issues/11777

Summary
Mattermost is unable to start if mailserver is listening but not really responding.

Steps to reproduce
Version information:
Version: 5.13.0
Build Number: 5.13.2
Build Date: Wed Jul 24 17:25:26 UTC 2019
Build Hash: 9a8b0906fdbf2c2a309db2400f4bcc5902800e2c

Configuration (config.json):

[...]
"SMTPUsername": "",
"SMTPPassword": "",
"SMTPServer": "localhost",
"SMTPPort": "25",
[...]

Reproduction
You can break the mailserver used, or instead install netcat-openbsd and bind the port 25/tcp:

  1. nc -l -p 25 &

Then start mattermost.

Expected behavior
Mattermost could display a console DEBUG message (i can propose the Pull Request) and abort the connection within a given timeout.

Observed behavior
Mattermost start but is stuck:

Aug 1 22:02:19 chat mattermost[19149]: {"level":"info","ts":1564689739.1237967,"caller":"utils/i18n.go:83","msg":"Loaded system translations for 'fr' from '/srv/mattermost/i18n/fr.json'"}
Aug 1 22:02:19 chat mattermost[19149]: {"level":"info","ts":1564689739.1251771,"caller":"app/server_app_adapters.go:58","msg":"Server is initializing..."}
Aug 1 22:02:19 chat mattermost[19149]: {"level":"debug","ts":1564689739.1273336,"caller":"utils/html.go:31","msg":"Parsing server templates at /srv/mattermost/templates"}
Aug 1 22:02:19 chat mattermost[19149]: {"level":"info","ts":1564689739.1383843,"caller":"sqlstore/supplier.go:224","msg":"Pinging SQL master database"}
Aug 1 22:02:19 chat mattermost[19149]: {"level":"debug","ts":1564689739.2219186,"caller":"sqlstore/preference_store.go:41","msg":"Deleting any unused pre-release features"}
Strace shows that one process tries to connect to 25/tcp:

[...]
connect(14, {sa_family=AF_INET, sin_port=htons(25), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
epoll_ctl(4, EPOLL_CTL_ADD, 14, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2569088496, u64=139765099867632}}) = 0
epoll_pwait(4, [{EPOLLOUT, {u32=2569088496, u64=139765099867632}}], 128, 0, NULL, 1) = 1
getsockopt(14, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
getpeername(14, {sa_family=AF_INET, sin_port=htons(25), sin_addr=inet_addr("127.0.0.1")}, [112->16]) = 0
[...]

Possible fixes
1. As mentioned in expected behavior, a debug message in ConnectToSMTPServerAdvanced can be a good start.
2. The code logic that handle the connection use net.Dial and tls.Dial which does not use a timeout. It seems that it is impossible to use a timeout with tls.Dial (see https://groups.google.com/forum/#!topic/golang-nuts/7p61fSVtJ5k)

QA Test Steps

None

Status

Mana

None

Assignee

Unassigned

Reporter

Amy Blais

Epic Link

None

Fix versions

Mattermost Team

Platform

Sprint

None

Labels

GitHub Issue

None