Posts Tagged ‘directory synchronization’

Identity 101 — build a consolidated, clean view?

Thursday, July 30th, 2009

I’m at the Burton Group’s Catalyst conference this week in San Diego,
where interesting conversations about identity management, entitlement
management, role management and more are everywhere.

One conversation got me thinking about the (what seems like) old strategy
of building a “master system of record” as a pre-requisite to deploying
an enterprise identity management system. We used to work with our
customers to do just that, back when our company was called M-Tech.
I take it that most of our software vendor peers and our collective
systems integrator partners did the same thing.

A couple of years ago, when we were designing the then-next/now-current
generation of our identity management suite, we examined this idea more
closely, and came away with the conclusion that it wasn’t really necessary.

Yes, you read that right. It’s not necessary to consolidate all your
systems of records and construct a “gold standard” directory of users,
from which your identity management processes will flow.

Our conclusion is based on a few ideas:

First, it’s expensive and time consuming to build this database … so
if we want rapid deployment of an enterprise IDM system, it’s preferable
to skip the huge data cleansing project.

Second, there’s the observation that the user population is a moving
target and it’s very difficult to get a perfect data set about all
those users, since they keep getting hired, fired, moved, name-changed,
responsibility-changed and so on.

So if we don’t put together a “gold standard” data set, what will happen?
To answer that, we should first think about how that data set is intended
to be used. Mostly, I think it’s useful for automated provisioning,
de-provisioning and identity synchronization between systems. If you have
a pure and clean HR feed, for example, you might synchronize it with
your corporate AD, RACF system, SAP user base and so on — and then those
would be perfect too.

But really – you don’t want to look at every user on every system on every
pass through the automation process. That’s expensive – and slow. Instead,
it makes more sense to look at changes to a system of record — who got hired
today? Who got fired today? Who changed jobs? Whose surname got changed?
Those changes are a much smaller data set, and each of them presumably
represents a state change — either to correct an error in the data or to
update data that used to be correct to its new state.

So if we capture changes and propagate them out into the enterprise,
we’ll gradually improve the quality of data on all systems, but without
having to clean everything up in one fell swoop.

And for that matter, what’s so special about systems of record? Why
don’t we monitor *every* system for changes that were initiated outside
of the IDM system. Any change could merit a response — it could
be a new e-mail address set in the Exchange server, which we should
replicate to HR. It might be a change in the user’s cubicle number
in a white pages application, which might be handy to copy to AD.
It might even be an unauthorized addition of a user to the Administrators
group in AD, which could trigger automatic removal and a security alert.

It’s not just HR we should be watching…

This approach is auto-provisioning based on propagating change events,
rather than directory synchronization based on comparing the full state
of two systems. It’s also the difference between our old ID-Compare
automation engine and our new ID-Track engine.

This strategy means that you can deploy a system that makes data cleaner at
every step, without having to make data perfect before you go live.

Another way to think about it is that “perfect” is the enemy of “good,”
and I think what we really want is “good,” without the pre-requisite of
“perfect.”