Subject Re: [MACE-Dir] eduPersonSubjectIDGUID
From Nate Klingenstein <ndk@xxxxxxxxxxxxx>
Date Fri, 23 Oct 2015 02:51:05 +0000


> Pretty confident I have not slept on it enough, but I have read it.

I blame myself for being unable to help people understand the context in which it was meant.  I’m asking people to rethink so many fundamental assumptions that we have built our world upon that I didn’t spend enough time setting that up.

> The way I think of it (logically, not technically) is that if “privacy protection” is desired, but we also want PPIs (Pairwise Pseudonymous Identifiers) used as much as is feasible, then what VOs and others really need is a way to dynamically “affiliate” SPs.

Roughly, yes.

> That is, today’s process for affiliating SPs – which as I understand it is asserted statically in metadata – will run into concerns similar to some that were raised in the “academia” categories discussion, namely that for some of these problems, making policy statements about users at the IdP level is too coarse-grained.

Yes, but static takes on a new meaning with my suggestion of distributed metadata that starts its resolution at the provider with which you’re corresponding, not a singular authority on which you agree ahead of time.  Once you’ve got a list of authorities/trust models that the other party uses, you look for a common authority at some point in the tree.  If you don’t find one, then you ditch it.
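A minimal sketch of that lookup, assuming each party publishes a list of the authorities it relies on and that authorities can point at further authorities.  The entity names, the TRUST_REFS table, and the depth limit are all hypothetical, for illustration only:

```python
from collections import deque

# Hypothetical trust references: each party lists the authorities it relies
# on, and each authority may in turn point at further authorities.
TRUST_REFS = {
    "idp.example.edu": ["federation-a.example", "campus-ca.example"],
    "sp.example.org": ["federation-b.example"],
    "federation-a.example": ["interfed.example"],
    "federation-b.example": ["interfed.example"],
    "campus-ca.example": [],
    "interfed.example": [],
}


def reachable_authorities(start, refs, max_depth=3):
    """Breadth-first walk of trust references; returns {authority: depth}."""
    seen = {}
    queue = deque((authority, 1) for authority in refs.get(start, []))
    while queue:
        authority, depth = queue.popleft()
        if authority in seen or depth > max_depth:
            continue
        seen[authority] = depth
        for parent in refs.get(authority, []):
            queue.append((parent, depth + 1))
    return seen


def common_authority(me, peer, refs, max_depth=3):
    """Return the closest authority both parties reach, or None to give up."""
    mine = reachable_authorities(me, refs, max_depth)
    theirs = reachable_authorities(peer, refs, max_depth)
    shared = set(mine) & set(theirs)
    if not shared:
        return None
    # Prefer the authority with the shortest combined path; ties break by name.
    return min(shared, key=lambda a: (mine[a] + theirs[a], a))
```

With the sample data, both parties reach interfed.example two hops up, so that is what comes back.  With max_depth=1 nothing is shared and the function returns None, which is the "ditch it" case.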

That’s part of the fear for everyone.  Distributed change control is very hard to do well.  I think we no longer have the luxury of avoiding it, though, because the alternative is centralized change control, which means we’re immediately trying to be an Okta or a Windows Azure AD Connect.

> What would be ideal for privacy is PPIs where the user can dynamically control the affiliation scope of their PPIs, i.e., I tell or have some consent interface to approve the IdP correlating me across a set of SPs.

That’s a fascinating concept that I haven’t thought about, but it’s something that I explicitly allowed for.  I deliberately wanted each PPI, as you call them, to have no fixed scope *in terms of relying parties*: a single key could be used for as many or as few SPs as you’d like.

You could even issue a fresh key per transaction if you wanted to stay quasi-transientId and you were willing to incur the burden of managing that large a data set.
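As a rough illustration of keys without a fixed relying-party scope, here is a toy IdP-side store.  The KeyIssuer class, its method names, and the group labels are hypothetical, and nothing here is meant as the actual proposal's data model:

```python
import secrets


class KeyIssuer:
    """Toy IdP-side store mapping (user, SP group) to an opaque key."""

    def __init__(self):
        self._group_of = {}      # SP entity ID -> group label
        self._keys = {}          # (user, group label) -> opaque key
        self.transactions = []   # every per-transaction key ever issued

    def assign(self, sp, group):
        """Place an SP in a group; all SPs in a group see the same key."""
        self._group_of[sp] = group

    def key_for(self, user, sp):
        # An SP with no group gets a group of its own, i.e. a pairwise key.
        group = self._group_of.get(sp, "pairwise:" + sp)
        slot = (user, group)
        if slot not in self._keys:
            self._keys[slot] = secrets.token_urlsafe(16)
        return self._keys[slot]

    def transaction_key(self, user, sp):
        # A fresh key for every transaction; the growing list is the
        # data-management burden mentioned above.
        key = secrets.token_urlsafe(16)
        self.transactions.append((user, sp, key))
        return key
```

Two SPs assigned to the same group receive the same key for a given user, an unassigned SP receives its own, and transaction_key never repeats a value.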

The twist I didn’t think about was giving the user the ability to select those groups.  I don’t know whether that’s good or bad, but it’s definitely something I hadn’t considered.  I thought it would fall to the IdP operator and the various federations for validation procedures, small and large.  Other, new trust mechanisms could be experimented with by the community.

My general rule in the dogma about all of this is expressed in principle #2, and everything I wrote above I believe to be in concordance with the original document.  When I think "more informative" about consent, I think about giving users plaintext explanations of why the service needs particular attributes and what happens if optional attributes are omitted:

“The content of metadata needs to be replaced or amended to be both more informative (provisioning, attribute, and consent stuff) and less informative (deprecate lightly used data fields). It also needs the right primary index (probably domain, because that’s the more common bootstrap for discovery) and secondary indices.”

This would be avoiding grouping providers to the extent possible, but instead documenting their behavior and needs on a more atomic level, allowing for much better disclosure to the user about the implications of a consent decision.

That said, there’s no reason why a provider couldn't be in a group, and that would be verified by chasing another trust reference that the corresponding provider would provide, and this could be a natural role for InCommon to play.

I’m not looking to enforce baseline attribute release.  We’re building a distributed identity data model that gives us lots of ways to define attributes, and that supplies identity provider operators and users with as much information as possible about the data that an application needs, why it’s needed, and what happens if it’s not provided.
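One way to picture that per-attribute information is a small record per requested attribute, rendered as plain text for a consent screen.  The AttributeRequest fields and the sample service below are hypothetical and do not correspond to any existing metadata schema:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeRequest:
    name: str          # attribute the application asks for
    required: bool     # whether the service works at all without it
    reason: str        # plain-text explanation of why it is needed
    if_omitted: str    # what the user loses if it is withheld


def consent_lines(service, requests):
    """Render the requests as plain-text lines for a consent screen."""
    lines = [f"{service} is asking for:"]
    for r in requests:
        kind = "required" if r.required else "optional"
        lines.append(f"- {r.name} ({kind}): {r.reason} If withheld: {r.if_omitted}")
    return lines


# Hypothetical sample data.
WIKI_REQUESTS = [
    AttributeRequest("mail", False,
                     "Used to send page-change notifications.",
                     "No notifications are sent."),
    AttributeRequest("persistent key", True,
                     "Used to recognize your account on return visits.",
                     "You cannot sign in."),
]
```

Calling consent_lines("Project wiki", WIKI_REQUESTS) yields one heading line plus one line per attribute, which an IdP could show to the user or to its operator.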

Providers that are heavily attribute retentive would see the world much as they do today, but with more data with which to be retentive.

> I think that thinking is vaguely in line with Nate’s ideas, though it sounds to me like Nate’s suggestions are more around dynamic management of persistent IDs across multiple IdPs, whereas my statement above is more about management of persistent IDs within one IdP (though exposed to many SPs). And I’m not sure if the “persistent” in Nate’s example is “PPI” vs. just “non-reassigned”.

Take out the “pairwise” and you’re there.  I would envision many of them to be pairwise, but there’s no reason why every one needs to be.

> The challenges I see in Nate’s suggestions (as I understand them) are less about technology and more about the trust frameworks. Under what circumstances would I trust your IdP to assert one of your users as the same person as one of mine? If the answer is too restrictive, then that puts a cap on the value of the service.

That’s a keen observation and I don’t have a great answer to it, but my presumption had been that we would recycle the concept of scopes, and if you expressed a key from another IdP, the SP would then chase that key down for additional information by issuing a query to the IdP named in the key.  Trust rules could be applied then.
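A sketch of how an SP might chase such a key, assuming keys are written as value@scope in the manner of existing scoped attributes.  The key format, the FAKE_IDP_RECORDS table, and the query_idp stand-in for a network query are all assumptions made for illustration:

```python
# Hypothetical directory standing in for a network query to each IdP.
FAKE_IDP_RECORDS = {
    "other-idp.example": {"k-7f3a": {"affiliation": "member"}},
}


def query_idp(scope, value):
    """Stand-in for asking the IdP named by the scope about a key."""
    return FAKE_IDP_RECORDS.get(scope, {}).get(value)


def split_key(key):
    """Split 'value@scope' into its two parts, rejecting malformed keys."""
    value, sep, scope = key.rpartition("@")
    if not sep or not value or not scope:
        raise ValueError(f"not a scoped key: {key!r}")
    return value, scope


def chase_key(key, trusted_scopes, lookup=query_idp):
    """Apply the SP's trust rule, then query the issuing IdP for more data."""
    value, scope = split_key(key)
    if scope not in trusted_scopes:
        return None  # trust rule: ignore keys from IdPs the SP does not accept
    return lookup(scope, value)
```

Here the trust rule is a simple allow-list of scopes; anything richer, such as the common-authority search discussed earlier in the thread, could take its place.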

I also proposed a mechanism for doing this in a one-off fashion by taking pairwise identifiers and expressing them in encrypted form to other providers.  That provides a level of privacy protection, but it doesn’t limit the extent to which account linking could be done, and you’ve probably identified a serious weakness in my suggestion of this mechanism.

This could be addressed by including some reference to or cryptographic proof of the original provider that furnished the key in subsequent requests to providers that had issued other credentials, allowing the recipient of the secondary queries to make informed decisions about whether they consider the linkage to be permissible, or if they don’t want to participate.
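A very rough sketch of that check.  It uses an HMAC over a pre-shared secret as a stand-in for a real public-key signature, which is an assumption made only to keep the example within the Python standard library; the request fields and function names are likewise hypothetical:

```python
import hashlib
import hmac


def make_proof(secret, origin, key):
    """Proof attached by the provider that originally furnished the key."""
    message = f"{origin}\n{key}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def allow_linkage(request, secrets_by_origin, permitted_origins):
    """Recipient's decision: verify the proof, then apply its own policy."""
    origin = request["origin"]
    secret = secrets_by_origin.get(origin)
    if secret is None:
        return False  # unknown provider, so the proof cannot be checked
    expected = make_proof(secret, origin, request["key"])
    if not hmac.compare_digest(expected, request["proof"]):
        return False  # proof does not match the claimed origin and key
    return origin in permitted_origins  # policy: is this linkage acceptable?
```

The final line is where the recipient decides whether it wants to participate at all: a valid proof only establishes where the key came from, not that linking is permissible.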

Ultimately, we’re just talking about an IdP issuing data to an SP; it’s just that authentication is no longer a fixed requirement for doing so.

And, again, I apologize for my challenges in explaining all of this.  I’m having a hard time setting the context for people to evaluate and understand the proposal because I’m asking people to rethink so many basic assumptions and not expressing that well.

Thank you so much for taking the time to read and consider my ideas,