Tech Spec - LDAP Profile Picture Updates

Status: 99%

OVERVIEW

This document describes the main design and implementation decisions behind the “LDAP Profile Picture Updates” feature. This feature allows a System Administrator to set an LDAP attribute name so that the Profile Picture is kept in sync with LDAP.

GOALS

  • describe the changes and additions to the backend architecture (API and Storage)

  • describe the research and performance evaluations that guided the technical decisions

  • describe the UX changes required

SCOPE

In:

  • ability to define LDAP user attribute containing the photo

  • ability to restrict users from updating photos

  • ability to sync the photos with LDAP

  • update sync routine to specify returned attributes

Out:

  • fixing paging in the go-ldap library or current LDAP Sync

Requirements:

  • Sync on login

  • Profile image locked if setting is enabled

  • Existing LDAP users will have to log out and log back in to trigger the LDAP photo check once
    the setting is enabled in the LDAP config

Out:

  • CLI command/job to manually sync all users

BACKGROUND READING

  1. Spike document

  2. High fidelity designs

TERMINOLOGY

Picture/Image/Photo - these terms are used interchangeably in this document.

SPECIFICATIONS

High-level Architecture

At a high level, the System Administrator will define the “Profile Picture Attribute” in the LDAP Settings section of the System Console. This will define which attribute from the LDAP system will be used to retrieve the picture. This will require a new configuration item under LDAP Settings.

Once the attribute has been defined, the Mattermost image for all LDAP users will need to be updated. After the initial update, the image will need to be periodically retrieved and updated.

Configuration

Add PictureAttribute to the LDAP Settings section. Default to empty string.
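As a rough illustration of the new configuration item, the sketch below shows one way the empty-string default could be applied. The LdapSettings struct here is a trimmed-down, hypothetical stand-in, and the SetDefaults and PictureSyncEnabled names are assumptions made for this example rather than the actual server code.

```go
package main

// LdapSettings is a hypothetical, trimmed-down stand-in for the real settings
// struct; only the field relevant to this spec is shown.
type LdapSettings struct {
	// PictureAttribute names the LDAP attribute that holds the photo.
	// A nil pointer means the value was not present in the config file.
	PictureAttribute *string
}

// SetDefaults fills in PictureAttribute with an empty string when it is unset,
// which keeps the feature off until an admin configures an attribute.
func (s *LdapSettings) SetDefaults() {
	if s.PictureAttribute == nil {
		empty := ""
		s.PictureAttribute = &empty
	}
}

// PictureSyncEnabled reports whether a picture attribute has been configured.
func (s *LdapSettings) PictureSyncEnabled() bool {
	return s.PictureAttribute != nil && *s.PictureAttribute != ""
}
```

With this shape, an unset or empty PictureAttribute leaves picture sync disabled, and any non-empty value (for example thumbnailPhoto) turns it on.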

UX Design

The UX will need to be modified in two ways -

  1. System Console, to allow the PictureAttribute to be set

  2. User Account Settings, to disallow the user from setting their profile picture.

System Console

Add a new text-type setting to admin_definition.jsx so that an Admin can set this attribute. The setting is disabled when Enable Sync is false. The specific text for the user interface can be found in the following Figma document.

Account Settings

Normally, the user can go to Account Settings and upload a profile picture. When the PictureAttribute is set, the picture is updated from LDAP, so the user will no longer be able to change it. Instead, when the user selects the Profile Picture section of Account Settings, the following message will be displayed.

'This field is handled through your login provider. If you want to change it, you need to do so through your login provider.'

This is the same message displayed in the other Account Sections when they are updated via LDAP Sync. The same pattern currently used in the sections of user_settings_general.jsx can be used here as well.

Tickets

Configuration and System Console Setting -
https://mattermost.atlassian.net/browse/MM-24082

User Account Settings -
https://mattermost.atlassian.net/browse/MM-24086

Sync Design

When the attribute is turned on for the first time, all LDAP users in Mattermost will need to have their image updated. After that, LDAP needs to be checked periodically so that when an image is modified in LDAP, the modified image is also updated in Mattermost.

WhenChanged (modifyTimestamp)

Most LDAP systems contain an attribute called WhenChanged or modifyTimestamp (or both). However, there are a few reasons this attribute should not be used. First, there is no guarantee it will be present in ALL LDAP systems. It is present in ADFS and OpenLDAP, but the LDAP spec does not require it. Second, this attribute is not replicated. This means that each Domain Controller in ADFS can hold a different value, and each one only catches up depending on how often LDAP replicates. Third, ADFS updates this value whenever the user logs in. Since most people log into ADFS daily, this would potentially result in thousands of false positives.

LDAP Sync job

Another potential way to sync the profile image is via the LDAP Sync job (or a separate job). This would require returning the image for each user in order to compare to the existing image.

PictureHash

In order to make the job more performant, a new database field would be added for the PictureHash. Before a profile image is saved in Mattermost, it is modified: orientation, height, and width may be adjusted. Therefore, in order to compare picture byte arrays -

  • The LDAP image would need to be converted

  • The current Mattermost image would need to be read (from disk or S3)

  • The two byte arrays would need to be compared

Rather than doing this, the image from LDAP will be hashed and the hash encoded into a string. That string can then be saved and used for the comparison. This avoids reading the existing image from disk, which should be a large performance gain.
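The hash comparison described above could look roughly like the following. This is a minimal sketch: the spec does not name a hash algorithm, so SHA-256 with hex encoding is an assumption, and hashPicture and pictureChanged are hypothetical helper names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// hashPicture returns a hex-encoded SHA-256 digest of the raw image bytes as
// they come back from LDAP (before any resizing or re-orientation).
// SHA-256 is an assumption; the spec only asks for "a hash encoded as a string".
func hashPicture(raw []byte) string {
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:])
}

// pictureChanged compares the hash of the freshly retrieved LDAP image with
// the hash stored in the (hypothetical) PictureHash database field.
// An empty stored hash always counts as changed.
func pictureChanged(storedHash string, ldapImage []byte) bool {
	return hashPicture(ldapImage) != storedHash
}
```

Because the hash is taken over the unmodified LDAP bytes, neither the conversion step nor a read of the existing file is needed to decide whether an update is required.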

Paging

Currently, the LDAP Sync job returns ALL users from LDAP. When each user record is only about 200 bytes, that isn’t an issue. However, when a picture of up to 100k is returned for each user, memory could become an issue. The obvious answer is to use paging, but our current Go LDAP library (go-ldap) does not support paging for that usage. It supports paging between the LDAP server and itself, so it can respect the return limits of the LDAP server. However, it does this paging in a loop and returns the entire dataset to the caller.

Another paging strategy would need to be implemented. Modifying the go-ldap library to support paging between itself and its caller would be one solution. Another potential solution would be to perform our own paging by issuing queries that each return X users. This could be as simple as requesting 1000 users by id, or doing some type of range query.
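To make the "request X users by id" idea more concrete, here is a small sketch that turns a list of user ids into one OR-filter per batch. It assumes the ids were collected beforehand by a lightweight query that returns only the id attribute, and that they are already escaped for use in an LDAP filter; batchFilters is a hypothetical helper, not part of go-ldap.

```go
package main

import "strings"

// batchFilters splits ids into groups of at most batchSize and returns one
// LDAP OR-filter per group, for example (|(uid=a)(uid=b)).
// Assumption: every id has already been escaped for use in an LDAP filter.
func batchFilters(idAttribute string, ids []string, batchSize int) []string {
	if batchSize <= 0 {
		return nil
	}
	var filters []string
	for start := 0; start < len(ids); start += batchSize {
		end := start + batchSize
		if end > len(ids) {
			end = len(ids)
		}
		var b strings.Builder
		b.WriteString("(|")
		for _, id := range ids[start:end] {
			b.WriteString("(" + idAttribute + "=" + id + ")")
		}
		b.WriteString(")")
		filters = append(filters, b.String())
	}
	return filters
}
```

Each returned filter would drive one search, so only one batch of pictures is held in memory at a time. Whether a server accepts an OR-filter with 1000 terms would need to be checked against the target LDAP systems.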

Performance

In order to evaluate these potential solutions, 100k users were created in ADFS with images of approximately 42 KB each. It should be noted that there is only 1 LDAP server and it is not optimized.

Case 1 - Admin first turns on LDAP Picture Sync and all 100K users required updating.
Total process time - 100 minutes
Users per minute - 1000
LDAP Retrieval - 20 seconds
Mattermost Image Check and Update - 40 seconds

Case 2 - Normal Sync, Sync 100K users, 0 require updating.
Total process time - 30 minutes
Users per minute - 3300
LDAP Retrieval - 20 seconds
Mattermost Image Check - 0 seconds

Observations

  • Batch size doesn’t matter (500 vs 1000)

  • Mattermost Image Update does 2 database writes, so this could be improved

This means our LDAP Sync would take 30 minutes per 100k users even when nothing has changed. We have installations with as many as 350k users, so this synchronization process probably won’t work.

Possible solutions -

  • Separate Picture Sync Job from normal LDAP Sync

  • Update a percentage of the LDAP users each sync so that each user gets updated daily. Photos would update every 24 hours. (If sync is every hour, update 1/24 of the users: 360000/24 = 15000 per sync.)

  • Pick a timeout, process sync for that time then quit, process as many as you can each sync. (Unknown when the sync will occur.)
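The second option above (updating a fraction of the users on each sync) could be sketched as follows. The slot assignment by FNV-1a hash of the user id is an assumption chosen for illustration, and syncSlot, dueThisRun, and usersPerRun are hypothetical names.

```go
package main

import "hash/fnv"

// syncSlot maps a user id to one of `slots` buckets using an FNV-1a hash, so
// the assignment is stable across runs without storing any extra state.
func syncSlot(userID string, slots int) int {
	if slots <= 1 {
		return 0
	}
	h := fnv.New32a()
	h.Write([]byte(userID)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(slots))
}

// dueThisRun reports whether the user's picture should be refreshed in sync
// run number `run` (assumed to be non-negative and counted from 0).
func dueThisRun(userID string, run, slots int) bool {
	if slots <= 1 {
		return true // a single slot means every user is processed every run
	}
	return syncSlot(userID, slots) == run%slots
}

// usersPerRun returns the average number of users handled per run, rounded up.
func usersPerRun(total, slots int) int {
	if slots <= 0 {
		return total
	}
	return (total + slots - 1) / slots
}
```

With hourly syncs and 24 slots, usersPerRun(360000, 24) gives the 15000 users per sync mentioned above, and every user is visited exactly once in any 24 consecutive runs. The hash only balances the slots approximately, so individual runs may handle somewhat more or fewer users.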

None of these solutions are very good. The worst part is that all of them still require a large amount of processing while accomplishing very little. How often do people update their LDAP image? When they join the company. Some may change it more often, but most likely once per year at most.

Attribute Retrieval

Currently, when querying LDAP, a filter is provided but no attributes are requested. This results in all user-defined attributes being returned, including the thumbnailPhoto or jpegPhoto attributes. There may be customers whose queries are already returning photos even though they are not being used. It is a little surprising there has not been a complaint about this. Another aspect of this project will be to request only the specific attributes we need. This will help the performance of LDAP Sync regardless of the solution chosen for synchronizing images.
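As an illustration of requesting only the needed attributes, the sketch below builds the attribute list from the configured attribute names and skips any that are empty. The AttributeSettings struct is a hypothetical subset of the LDAP settings and requestedAttributes is an assumed helper name; the resulting slice is what would be handed to the search request as its attribute list.

```go
package main

import "strings"

// AttributeSettings is a hypothetical subset of the LDAP settings, holding the
// names of the LDAP attributes that are mapped onto user fields.
type AttributeSettings struct {
	IdAttribute        string
	UsernameAttribute  string
	EmailAttribute     string
	FirstNameAttribute string
	LastNameAttribute  string
	PictureAttribute   string
}

// requestedAttributes returns the attribute names to ask the LDAP server for.
// Empty names are skipped and duplicates are removed; LDAP attribute names are
// case-insensitive, so the duplicate check ignores case.
func requestedAttributes(s AttributeSettings) []string {
	candidates := []string{
		s.IdAttribute,
		s.UsernameAttribute,
		s.EmailAttribute,
		s.FirstNameAttribute,
		s.LastNameAttribute,
		s.PictureAttribute,
	}
	seen := make(map[string]bool)
	attrs := []string{}
	for _, name := range candidates {
		key := strings.ToLower(name)
		if name == "" || seen[key] {
			continue
		}
		seen[key] = true
		attrs = append(attrs, name)
	}
	return attrs
}
```

When PictureAttribute is left empty, the photo attribute is simply not requested, so installations that do not use the feature stop paying for image transfer during sync.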

Tickets

LDAP Sync - Retrieve only required attributes
https://mattermost.atlassian.net/browse/MM-24458

LDAP Login

Updating user Profile Pictures could also be handled during the user login process. In fact, this is the recommendation of this document. Other LDAP attributes are updated at login as well. Adding the image update will not add much time to the total login process (see below), and it avoids the need to run a memory-heavy bulk synchronization process. Instead, the time is spread across thousands of logins.

The only real disadvantage is that forcing users to log out and back in could be seen as a manual process. Enterprise customers may want zero user action to be required and may require a deterministic process.

There are a couple different choices when it comes to comparing the image.

First, since the image has already been retrieved from LDAP, the login flow could just write it to disk with no comparison. However, this has the side effect of the LastPhotoUpdate field being updated with each login. Obviously, this code could be changed, but without comparing photos, the true last update wouldn’t be known.

A new database field for a hashcode could be created and a comparison of hashes as described above could be used.

Finally, the comparison could be accomplished between the byte arrays. In order for this to work, the new LDAP image will have to be converted and the existing file read from disk before the comparison is made.

In order to validate these methods, a quick prototype was made to get an idea of the processing time for each, using the same picture for each update.

  • Don’t check, just update it

    • No update - N/A

    • Update - 15 ms

  • Check via hash in database (doesn’t read existing image)

    • No update - 0 ms

    • Update - 17 ms

  • Read current, convert new, compare

    • No update - 1 ms

    • Update - 18 ms

Based on this information, it is recommended to read the current file, convert the new file, and compare the two at login. In a normal login, where nothing has changed, this adds about 1 millisecond to the login flow. A new database field then does not need to be created and kept updated.
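The recommended login-time flow (read current, convert new, compare) might be sketched as below. Everything named here is hypothetical: pictureStore abstracts the disk or S3 backend, convertFunc stands in for the existing image normalization step, and memStore exists only to exercise the sketch. The approach also assumes that converting the same LDAP image twice yields identical bytes.

```go
package main

import (
	"bytes"
	"errors"
)

// pictureStore is a hypothetical abstraction over the file backend (disk or S3).
type pictureStore interface {
	ReadPicture(userID string) ([]byte, error)
	WritePicture(userID string, data []byte) error
}

// convertFunc stands in for the normalization applied before a profile image
// is saved (orientation, height, and width adjustments).
type convertFunc func(raw []byte) ([]byte, error)

// syncPictureAtLogin converts the LDAP image, compares it with the stored one,
// and writes it only when the bytes differ. It returns true when an update
// was written.
func syncPictureAtLogin(store pictureStore, convert convertFunc, userID string, ldapImage []byte) (bool, error) {
	if len(ldapImage) == 0 {
		return false, nil // no picture in LDAP, nothing to sync
	}
	converted, err := convert(ldapImage)
	if err != nil {
		return false, err
	}
	current, err := store.ReadPicture(userID)
	if err == nil && bytes.Equal(current, converted) {
		return false, nil // unchanged, so LastPhotoUpdate would stay untouched
	}
	// Assumption: a read error is treated the same as "no current picture".
	if err := store.WritePicture(userID, converted); err != nil {
		return false, err
	}
	return true, nil
}

// memStore is an in-memory pictureStore used only to exercise the sketch.
type memStore map[string][]byte

func (m memStore) ReadPicture(userID string) ([]byte, error) {
	data, ok := m[userID]
	if !ok {
		return nil, errors.New("picture not found")
	}
	return data, nil
}

func (m memStore) WritePicture(userID string, data []byte) error {
	m[userID] = data
	return nil
}
```

Since the write only happens when the bytes differ, the last-update timestamp would reflect real changes without needing an additional database field.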

Tickets

Update LDAP login to sync picture

Manual Update Job

If necessary, a manual update job could be created, with a CLI command to start it. This would allow an administrator to kick it off during off hours. Some admins may want a manual update process after first turning on the PictureAttribute. There are some things to consider with this process. It is going to be long running, so some type of status should be reported on the progress of the job. In addition, there should be a way for an administrator to update a partial batch of users. If a manual batch job is necessary, this section will need to be investigated further to determine how to handle the issues stated above.

CREDITS

Thanks to Martin for great suggestions and brainstorming.