ILM fixes and displayName explained in detail

Over the past 4 months, I’ve been working on improving the ILM implementation I described in an earlier post:

A technical intro to ILM

In that post, I talked about some of the “gotchas” in our existing implementation.

I’m now back to report that most of those have been fixed. 🙂

I spoke of the high number of disconnectors, which resulted in a 3 hour cycle in our environment. That problem has been solved. The solution was to switch which management agent (MA) projects objects to the metaverse (MV). Instead of having the NETID Active Directory MA project, now the PDS custom MA projects to the MV. This means that instead of re-evaluating about half a million disconnectors (from the PDS MA) every time it runs, it only re-evaluates a small handful (from the NETID MA). The ride getting to this fixed state was a bit bumpy on the back-end, and I learned quite a bit along the way.

For example, once a given MV object has been projected, removing the projection rule which resulted in its projection does not remove the relationship with the original MA object. This can result in lots of awful behavior, especially when it happens in large numbers. One fix is to delete the MA space, reimport everything for that MA, and re-run a sync. Deleting the underlying MA object removes the relationship. But it also incurs an unexpected penalty as the MV space needs to re-evaluate everything. And that particular MV object may get deleted as a result, then get re-projected by the other MA.

But the results are rather dramatic: our entire sync cycle now takes about 10 seconds to run. I’ve said elsewhere that it’s about a 1200% improvement, because we went from a 3 hour scheduled cycle to a 15 minute scheduled cycle. But in reality it’s more like a 600-fold improvement, because the actual run time went from 100 minutes to 10 seconds. Regardless, it’s very good.
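To make the arithmetic behind those two comparisons explicit, here is a small sketch that expresses each one as a ratio of old duration to new duration. The function name `speedup` is just an illustrative helper; the durations are the ones quoted above.

```python
def speedup(before_seconds: float, after_seconds: float) -> float:
    """Return how many times shorter the new duration is than the old one."""
    return before_seconds / after_seconds


# Scheduled cycle: 3 hours down to 15 minutes.
scheduled = speedup(3 * 60 * 60, 15 * 60)

# Actual run time: 100 minutes down to 10 seconds.
actual = speedup(100 * 60, 10)

print(f"scheduled cycle: {scheduled:.0f}x, actual run time: {actual:.0f}x")
# prints "scheduled cycle: 12x, actual run time: 600x"
```

Put as ratios, the scheduled cycle is 12 times shorter, while the actual run time is 600 times shorter.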

Another gotcha I’ve fixed is including more than just the uwPerson objectclass from PDS. There was really no reason to limit what classes from PDS contributed info, and so we did away with that limitation. Along the way, we also added uid synchronization to our ILM feed to UWWI. Combined with the non-uwPerson objectclass fix, this means that all accounts in UWWI which have been provisioned with a UW uid will have it present in UWWI on their uidNumber attribute.

However, I do need to report that one of the more visible gotchas is still outstanding. The name situation remains in the state I reported previously. There have been a couple new problems reported, and awareness of this issue seems to be spreading, but it hasn’t reached enough critical mass yet to justify prioritization over existing projects. I’m hopeful that will happen within the next 6 months however.

My previous coverage of that problem was mostly just an overview, so it’s probably worth taking the time now to cover the problem in greater detail.

Here’s the skinny:

Upon provisioning, our account creation agent, fuzzy_kiwi, does some complicated name parsing logic similar to what I’ll describe below. For UW NetIDs where it doesn’t find a PDS entry, or where it isn’t allowed to publish the name, it stamps the uwnetid value on the name attributes. And for some UW NetIDs, that initial value is where things end.

As you know, ILM connects NETID users with PDS objects, and keeps the name attributes in synchronization according to some complicated name parsing logic.

The recent fix to include non-uwPerson objects from PDS doesn’t really change the name story at all, because the name attributes in PDS that ILM uses as seed information are not present on non-uwPerson objects.

This is an important point to understand. non-uwPerson objects don’t have any name info that ILM wants, so ILM does nothing with them.

In contrast, uwPerson objects do have the naming attributes ILM cares about. However, for those uwPerson objects, only two classes of users, UW employees and students, have any ability to change that name information. Employees can use ESS to modify that name info, and students must talk to the Registrar to modify it (yes, a phone call or an in-person visit, nothing online). To generalize the logic for these accounts to an understandable form, there are two important pieces of info. First, a flag which indicates whether your info is publishable. Second, the name you’ve given ESS or the Registrar. If you’ve agreed to publish, then parsing happens, and your displayName comes in the form “Brian D. Arkills” or “Brian Arkills” depending on how many substrings there are in what you gave ESS/Registrar. If you didn’t agree to publish, then parsing happens and your displayName comes out in a format like “B. Arkills”. In some odd cases, your displayName can come out as just “Arkills” or “Brian”. There isn’t much flexibility here.
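As a rough illustration of that generalized logic, here is a minimal sketch. It is not the actual ILM rules extension code. The function name `format_display_name` and its parameters are hypothetical, and it assumes the submitted name is a whitespace-separated “First [Middle ...] Last” string with no titles, suffixes, or comma ordering.

```python
def format_display_name(raw_name: str, publishable: bool) -> str:
    """Sketch of the two displayName formats described above (assumed rules)."""
    parts = raw_name.split()
    if not parts:
        return ""
    if len(parts) == 1:
        # Odd case: only one substring, so that is all we can show.
        return parts[0]

    first, last = parts[0], parts[-1]
    if not publishable:
        # Not publishable: first initial plus last name, e.g. "B. Arkills".
        return f"{first[0]}. {last}"
    if len(parts) == 2:
        # Publishable, two substrings: "Brian Arkills".
        return f"{first} {last}"
    # Publishable, three or more substrings: use the second one as a
    # middle initial, e.g. "Brian D. Arkills".
    return f"{first} {parts[1][0]}. {last}"


print(format_display_name("Brian David Arkills", publishable=True))   # Brian D. Arkills
print(format_display_name("Brian Arkills", publishable=True))         # Brian Arkills
print(format_display_name("Brian David Arkills", publishable=False))  # B. Arkills
```

The real parsing handles more cases than this, but the sketch shows why the output is so constrained: whatever you submit gets squeezed into one of a couple of fixed shapes.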

People who aren’t UW employees or students have no ability to create the editable naming attributes ILM cares about. In other words, affiliates and sponsored UW NetIDs, shared NetIDs, temporary NetIDs, etc. all have no ability to change what their UWWI displayName is. So what displayName do they end up with? All of these accounts end up with a displayName in the “B. Arkills” format. The flexibility here is non-existent; there is no back-end solution here where someone can edit something to fix this for a given user. The entire system needs to be overhauled to fix this.

Yes, this is an awful state of affairs, and yes, I agree this should get fixed sooner rather than later. And if you agree, you should talk with your UW Exchange representative and ask them about raising the priority of this feature.

So, how or why did things get designed this way?

Well … it turns out that back when we rolled out Exchange, the MSCA initiative was limited to *employees*. No students, no non-employees. And all the engineering inquiries into this design constraint came back that it was a firm limit. Then, rather quickly, that limitation fell by the wayside, but engineering wasn’t given the time to revisit and refactor. So the combination of poor design constraints and then changing the scope after launch without revisiting the solution was a pretty big contributing factor.

Another factor was that our primary engineer was convinced that there was a significant value to having consistent displayName formatting within the Exchange GAL. So he wanted “Brian D. Arkills” or “B. Arkills” only. This is why almost every displayName ends up in one of those two formats.

Another factor was that the name source information situation isn’t pretty today. There’s the official name info, which is case insensitive. Folks with mid-name capitalization lose out in that, and it’s rather hard to edit this piece of info. Then there’s the editable name info coming from HEPPS (the HR source system) and SDB (the Registrar source system). But the info coming from those sources has no input validation, and is not guaranteed to follow any format. So while the user has lots of control, it’s a nightmare to figure out whether a name will come in as “Brian Arkills”, “Arkills, Brian”, “Brian David Arkills”, “Brian David Joe-Bob Arkills”, “Mr. Brian Arkills”, “Mr. Brian Arkills Sr.”, etc. Obviously the number of permutations is endless, and it’s impossible to predict what format the data will be in.
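To show why unvalidated input is such a headache, here is a small sketch that runs a deliberately naive “first substring is the first name, last substring is the last name” split over the example formats above. The helper `naive_first_last` is hypothetical and is not what ILM actually does; it only illustrates how quickly a simple rule breaks.

```python
SAMPLES = [
    "Brian Arkills",
    "Arkills, Brian",
    "Brian David Arkills",
    "Brian David Joe-Bob Arkills",
    "Mr. Brian Arkills",
    "Mr. Brian Arkills Sr.",
]


def naive_first_last(raw_name: str) -> tuple[str, str]:
    """Treat the first and last whitespace-separated substrings as the name."""
    parts = raw_name.split()
    return parts[0], parts[-1]


for sample in SAMPLES:
    first, last = naive_first_last(sample)
    print(f"{sample!r:32} -> first={first!r}, last={last!r}")
```

Half of these six inputs come out wrong: the comma form swaps the names and keeps the comma, and the title and suffix forms yield “Mr.” as a first name and “Sr.” as a last name. Every extra rule you add to cope with one format risks mangling another.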

And that’s it.

But in a very real sense, there is a bigger picture problem here. The problem is that there are many source systems, and each of them does things differently. Getting all those source systems to implement naming information, input validation, publishing flags, etc. is an uphill battle. And when someone is in more than one of those source systems, then you have to choose which source wins.

The solution we imagine bringing to this situation is to implement a UW NetID level solution. From the UW NetID Manage page you assert your name. This eliminates the problems that come from multiple source systems, and the person vs. non-person issues. If you have a UW NetID, you’d be able to set the name. Period. It also allows us to implement input validation in a single place, and restrict the formatting to something reasonable and predictable. Obviously, UWWI would be one of the first to use such a mechanism, and hopefully other source systems will begin to see the value in having a single name across all systems and also leverage it instead. We imagine a state of affairs where people incrementally populate this new bit of name information, and UWWI continues to use the existing logic, unless this new info is present.
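The incremental rollout described above boils down to a simple precedence rule, sketched here. This is only an illustration of the idea, since none of it exists yet. The names `choose_display_name`, `asserted_name`, and `legacy_name` are hypothetical.

```python
from typing import Callable, Optional


def choose_display_name(
    asserted_name: Optional[str],
    legacy_name: Callable[[], str],
) -> str:
    """Prefer the name asserted at the UW NetID level; otherwise fall back.

    asserted_name: the (hypothetical) validated name a person set for their
        UW NetID, or None if they have not set one yet.
    legacy_name: callable that produces the displayName using the existing
        parsing logic.
    """
    if asserted_name and asserted_name.strip():
        return asserted_name.strip()
    return legacy_name()


# Someone who has asserted a name gets exactly that name.
print(choose_display_name("Brian Arkills", lambda: "B. Arkills"))  # Brian Arkills

# Someone who has not asserted one keeps today's behavior.
print(choose_display_name(None, lambda: "B. Arkills"))  # B. Arkills
```

Because the fallback is the existing logic, nobody’s displayName changes until they actually populate the new info.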

So in summary, things in the UWWI directory synchronization space are much better, but we’ve still got this name blot on our balance sheet. And hopefully we’ll get fixing that prioritized soon.
