Server-side telemetry has been broken since 5.14

Description

All server-side telemetry sent through segment.com has been broken since the release of 5.14. Once this is fixed, access to segment.com is required to verify that data is being emitted correctly again; the diagnostic id from community can be used for that check.

Initial Notes

Initially noticed that the newest config settings aren't reporting telemetry anymore.

After further investigation, noticed that no server-side telemetry has been sent back (e.g. no telemetry for config settings) since 5.14, implying that something in our telemetry broke.

Note that https://mattermost.atlassian.net/browse/MM-15653, reported in May, describes client-side data such as message posts no longer sending telemetry either.

QA Test Steps

See description.

Activity

Jesse Hallam
September 20, 2019, 12:59 AM

I had a theory that perhaps our segment key wasn’t being included in the build, but I can confirm the key is present in, for example, the v5.15 build, so it’s not an issue there as far as I can see.

Jason Blais
September 20, 2019, 1:20 AM

I searched through some of my history and remembered this comment from Carlos: https://mattermost.atlassian.net/browse/MM-15653?focusedCommentId=67491&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-67491

— the issue was in the release pipeline we were downloading the built package and then we cannot change the segment key. ... now the build pipeline is building webapp and setting the segment key

The issue doesn't seem to have been resolved for v5.15 server-side metrics, but sharing in case it's relevant to the investigation here.

Jesse Hallam
September 20, 2019, 2:06 AM
Edited

I’ve identified the regression: in the upgrade to v3, the semantics of the segmentio package changed to expect just struct values, and not pointers to struct values. Unfortunately, a pointer to those struct values satisfied the interface required, but then wasn’t actually configured correctly within the package.
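To illustrate the failure mode described above, here is a minimal, self-contained sketch. It does not use the segmentio package: `Message`, `Track`, `enqueue`, and the `"generated-id"` placeholder are hypothetical names chosen for illustration, and the type switch is an assumption about how such a package might populate defaults. The point is only that, in Go, a method with a value receiver is in the method set of both the struct and a pointer to it, so both satisfy the interface, while a type switch on the concrete struct type does not match the pointer.

```go
package main

import "fmt"

// Message is a hypothetical stand-in for the interface a telemetry client accepts.
type Message interface {
	Validate() error
}

// Track is a hypothetical event type. Validate has a value receiver, so both
// Track and *Track satisfy Message.
type Track struct {
	Type      string
	MessageID string
	Event     string
}

// Validate reports an error when the event name is missing.
func (t Track) Validate() error {
	if t.Event == "" {
		return fmt.Errorf("event name is required")
	}
	return nil
}

// enqueue fills in defaults only for the concrete types it recognizes.
// A *Track matches no case and is returned unmodified.
func enqueue(msg Message) Message {
	switch m := msg.(type) {
	case Track:
		m.Type = "track"
		if m.MessageID == "" {
			m.MessageID = "generated-id" // placeholder for a generated UUID
		}
		return m
	}
	return msg
}

func main() {
	byValue := enqueue(Track{Event: "config_change"})
	byPointer := enqueue(&Track{Event: "config_change"})

	fmt.Printf("%+v\n", byValue)   // Type and MessageID are populated
	fmt.Printf("%+v\n", byPointer) // Type and MessageID are left empty
}
```

Both calls compile and pass validation, which is why a mistake of this kind can go unnoticed until the emitted payload is inspected.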

With verbose mode on, emitting a struct pointer yielded the following (note that no segment key is present here: the 7jy9fakagfrkf8zhntzc6ydxoc value is just a local diagnostic id):

By contrast, emitting just the struct triggered the generation of a UUID and timestamp and yielded:

I can fix this in the server source code. We will require a backport to v5.14/v5.15 to fix those releases.

I’ll also suggest a change upstream to make this issue more apparent to consumers of the package.

Jesse Hallam
September 27, 2019, 2:54 AM

v5.14 and v5.15 cherry-pick completed. v5.16 cherry-pick pending merge of

Linda Mitchell
October 9, 2019, 10:20 PM

Jason tested this and verified:

I'm seeing at least "some" server-side telemetry coming from 5.14/5.15/5.16 RCs now, so for the purpose of the dot releases, we can consider it resolved.

Closing. Tests not added for now, but can add general tests after telemetry training.

Done

Mana: None

Assignee: Jesse Hallam

QA Assignee: Jason Blais

Reporter: Jason Blais

Epic Link: None

Mattermost Team: Sustained Engineering

Sprint: None

Labels: None

QA Testing Areas: Other (write in QA test steps)

GitHub Issue: None

Components: None

Severity: None