Unable to start if mailserver is listening but not responding

Description

Original report: https://github.com/mattermost/mattermost-server/issues/11777

Summary
Mattermost is unable to start if mailserver is listening but not really responding.

Steps to reproduce
Version information:
Version: 5.13.0
Build Number: 5.13.2
Build Date: Wed Jul 24 17:25:26 UTC 2019
Build Hash: 9a8b0906fdbf2c2a309db2400f4bcc5902800e2c

Configuration (config.json):

[...]
"SMTPUsername": "",
"SMTPPassword": "",
"SMTPServer": "localhost",
"SMTPPort": "25",
[...]

Reproduction
You can break the mailserver used, or instead install netcat-openbsd and bind the port 25/tcp:

  1. nc -l -p 25 &

Then start mattermost.

Expected behavior
Mattermost could display a console DEBUG message (i can propose the Pull Request) and abort the connection within a given timeout.

Observed behavior
Mattermost start but is stuck:

Aug 1 22:02:19 chat mattermost[19149]: {"level":"info","ts":1564689739.1237967,"caller":"utils/i18n.go:83","msg":"Loaded system translations for 'fr' from '/srv/mattermost/i18n/fr.json'"}
Aug 1 22:02:19 chat mattermost[19149]: {"level":"info","ts":1564689739.1251771,"caller":"app/server_app_adapters.go:58","msg":"Server is initializing..."}
Aug 1 22:02:19 chat mattermost[19149]: {"level":"debug","ts":1564689739.1273336,"caller":"utils/html.go:31","msg":"Parsing server templates at /srv/mattermost/templates"}
Aug 1 22:02:19 chat mattermost[19149]: {"level":"info","ts":1564689739.1383843,"caller":"sqlstore/supplier.go:224","msg":"Pinging SQL master database"}
Aug 1 22:02:19 chat mattermost[19149]: {"level":"debug","ts":1564689739.2219186,"caller":"sqlstore/preference_store.go:41","msg":"Deleting any unused pre-release features"}
Strace shows that one process tries to connect to 25/tcp:

[...]
connect(14, {sa_family=AF_INET, sin_port=htons(25), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
epoll_ctl(4, EPOLL_CTL_ADD, 14, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=2569088496, u64=139765099867632}}) = 0
epoll_pwait(4, [{EPOLLOUT, {u32=2569088496, u64=139765099867632}}], 128, 0, NULL, 1) = 1
getsockopt(14, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
getpeername(14, {sa_family=AF_INET, sin_port=htons(25), sin_addr=inet_addr("127.0.0.1")}, [112->16]) = 0
[...]

Possible fixes
1. As mentioned in expected behavior, a debug message in ConnectToSMTPServerAdvanced can be a good start.
2. The code logic that handle the connection use net.Dial and tls.Dial which does not use a timeout. It seems that it is impossible to use a timeout with tls.Dial (see https://groups.google.com/forum/#!topic/golang-nuts/7p61fSVtJ5k)

QA Test Steps

None

Status

Mana

None

Assignee

Unassigned

QA Assignee

None

Reporter

Amy Blais

Epic Link

None

Fix versions

Mattermost Team

Platform

Sprint

None

Labels

QA Testing Areas

None

GitHub Issue

None

Components

None
Configure