Bloodhound – AD Attack Resilience Methodology

Last month I was introduced to BloodHound and the Active Directory Adversary Resilience Methodology via a special workshop put on by SpecterOps.

While a lot of the time and nit-picky technical detail centers on the Cypher query language, the overall technology and approach is so awesome that I found myself not really caring that it took a while to figure out how to express what I wanted.

Here’s the punch-line: as a defender, with this approach you have a really excellent tool to figure out how attackers might compromise the high-value targets protected by your Active Directory. This includes a visual map of their potential path, and a way to model how possible mitigations might change what paths are left. The tool itself has an excellent command of the possible exposures in an AD environment–which I can almost guarantee will exceed your awareness and ability to track within your environment.

With a tool and approach like this, you can:

  • identify weak points in your environment which need extra attention
  • have a quantitative way to evaluate possible mitigations or changes proposed
  • quantitatively compare your security posture between two points in time

The tool is not perfect, but for something that is an open-source labor of love that has been released in the last year–it’s pretty impressive–especially when you note the scientific methodology behind the tool.

Here’s how it generally works:

  1. You set up a neo4j database (and web interface) – https://www.youtube.com/watch?v=o22EMEUbrNk (walkthrough)
  2. You set up bloodhound to use that neo4j db – https://github.com/BloodHoundAD/BloodHound/releases/tag/2.0.3 (check for newer versions)
  3. You run bloodhound’s data collector in your environment to populate bloodhound’s db (look for sharphound in the above repo)
  4. You use a combination of the bloodhound UI and the neo4j web interface to explore your environment and the possible attack paths

Neo4j is a graph database, with nodes and edges (relationships between nodes). This allows the modeling needed to happen in an efficient way. Bloodhound defines a great set of AD related nodes and edges in its schema, and the data collector goes about discovering that data in your environment.

Once you’ve got a database with data from your environment, you can use the bloodhound UI or the neo4j web interface (using the cypher query language) to identify attack paths to high-value targets in your environment. An obvious thing to look for is the ways to go from domain users to domain admins. Finding all such paths in a single query isn’t really practical. Instead, you might find all paths which are the shortest in terms of the number of hops from node to node. For example, a domain user might be able to log into a computer where a domain admin has a session (i.e. has logged in), and that’s a short path to escalate to domain admin.
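To make the "shortest path in hops" idea concrete, here is a minimal sketch in plain Python rather than Cypher. It runs a breadth-first search over a tiny made-up graph. The node names are invented, and the edge labels (MemberOf, CanRDP, AdminTo, HasSession) are only modeled on the kind of relationships bloodhound records. It is not how bloodhound or neo4j implement the search.

```python
from collections import deque

# Toy graph as (source, edge type, target) triples. All names are hypothetical.
EDGES = [
    ("alice", "MemberOf", "DOMAIN USERS"),
    ("DOMAIN USERS", "CanRDP", "WKSTN01"),
    ("WKSTN01", "HasSession", "da-bob"),
    ("da-bob", "MemberOf", "DOMAIN ADMINS"),
    # A longer route to the same target.
    ("alice", "MemberOf", "HELPDESK"),
    ("HELPDESK", "AdminTo", "SRV02"),
    ("SRV02", "HasSession", "svc-sql"),
    ("svc-sql", "AdminTo", "SRV03"),
    ("SRV03", "HasSession", "da-bob"),
]


def shortest_path(edges, start, goal):
    """Return one shortest path as [node, edge, node, ...], or None."""
    adjacency = {}
    for src, rel, dst in edges:
        adjacency.setdefault(src, []).append((rel, dst))

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]  # paths always end on a node, never on an edge label
        if node == goal:
            return path
        for rel, nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [rel, nxt])
    return None


if __name__ == "__main__":
    print(" -> ".join(shortest_path(EDGES, "alice", "DOMAIN ADMINS")))
```

Breadth-first search visits nodes in order of hop count, so the first time it reaches the target it has found a path with the fewest hops. In this toy graph that is the route through the workstation where the admin has a session.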

You have the power to manually manipulate the nodes and edges in the database. You can add nodes or edges. You can remove nodes or edges. You might do this to simulate what the environment would look like if you applied a given mitigation in your real environment.

By iteratively applying manual manipulations and re-running queries for the shortest attack paths you can identify as many weak points in your environment as you care to find. This gives you a laboratory-like environment where you can explore and test a hypothesis using a scientific methodology.

If you layer an analysis tool like PowerBI on top of this, you can put together a dashboard which gives you an objective sense of the overall security stance of your environment in a potential configuration. You’d apply manual manipulations to your bloodhound database and check the PowerBI dashboard to see how much improvement resulted. Likewise, you might use this kind of approach to provide an independent analysis of the security risk profile of any proposed change in your environment before it is actually implemented.
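As a rough illustration of the kind of number such a dashboard could show, the sketch below computes the fraction of users with any path to a target group, before and after removing one edge to simulate a mitigation. The graph, the names, and the choice of metric are all assumptions made up for this example, and the edges are simplified to untyped (source, target) pairs.

```python
def can_reach(edges, start, goal):
    """Depth-first search: is there any directed path from start to goal?"""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
    stack, seen = [start], {start}
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False


def exposure(edges, users, goal):
    """Fraction of the given users that have some path to the goal node."""
    exposed = sum(1 for user in users if can_reach(edges, user, goal))
    return exposed / len(users)


# Hypothetical environment.
USERS = ["alice", "carol", "dave", "erin"]
BASELINE = [
    ("alice", "DOMAIN USERS"), ("carol", "DOMAIN USERS"),
    ("dave", "DOMAIN USERS"), ("erin", "DOMAIN USERS"),
    ("DOMAIN USERS", "WKSTN01"),  # every domain user can log on to WKSTN01
    ("WKSTN01", "da-bob"),        # a domain admin has a session there
    ("da-bob", "DOMAIN ADMINS"),
    ("erin", "SRV02"),            # erin is a local admin on SRV02
    ("SRV02", "da-bob"),          # the same admin has a session on SRV02
]

# Simulated mitigation: domain users can no longer log on to WKSTN01.
MITIGATED = [edge for edge in BASELINE if edge != ("DOMAIN USERS", "WKSTN01")]

if __name__ == "__main__":
    print("before:", exposure(BASELINE, USERS, "DOMAIN ADMINS"))
    print("after: ", exposure(MITIGATED, USERS, "DOMAIN ADMINS"))
```

In this toy data the mitigation takes exposure from all four users down to one. The remaining path through SRV02 is the next weak point to look at.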

There are some approaches you’ll need to figure out if you use bloodhound as a risk evaluation tool.

Database management

If you are making manual changes, you are likely to want some way to copy your database and roll it back to a given state. Alternatively, you can save your sharphound data collection and re-import it into a fresh database. Do you need multiple copies of the database to support more than one purpose?

Data collection

How often do you do a fresh run of sharphound? And import it?
Which account runs sharphound? From which host(s) is sharphound run?
Do you notify others in your environment that they may see connections from the sharphound host(s)?

Basic reading

https://posts.specterops.io/introducing-the-adversary-resilience-methodology-part-one-e38e06ffd604
https://posts.specterops.io/introducing-the-adversary-resilience-methodology-part-two-279a1ed7863d
https://posts.specterops.io/bloodhound-2-0-bc5117c45a99

Intermediate reading

Sharphound rewrite: https://blog.cptjesus.com/posts/newbloodhoundingestor
Low-level technical details on sharphound: https://blog.cptjesus.com/posts/sharphoundtechnical
Intro to Cypher: https://blog.cptjesus.com/posts/introtocypher
Cypher cheat sheet: https://neo4j.com/docs/cypher-refcard/current/

Engaging

Slack channel for Bloodhound: https://bloodhoundgang.herokuapp.com/

Following

Andy Robbins Twitter: @_wald0
Rohan Vazarkar Twitter: @CptJesus
@SpecterOps

Azure AD tokens and Windows token binding

This blog post is an attempt to capture and share a variety of information that is not well-documented by Microsoft, spanning the two topics in the subject line.

I feel these topics are pretty critical to understanding the fundamentals of modern Azure AD and Windows security, and invaluable for troubleshooting. I do not consider myself an expert on these topics, and certainly not on the protocols via which one might get a token. But I’ve run into these topics enough to recognize there is a big gap in understanding and documentation, so this is my attempt to fill that gap.

First, here’s a table I created for a presentation in April 2018 for the Microsoft Technology community of practice at the UW.

1 Revocation is a complex topic; don’t rely on this too much w/o a deeper understanding.

These are the different types of Azure AD tokens that Microsoft has described in a variety of sources, along with a brief description of what they are good for, and the restrictions associated with each. Some references for this table are:

  • https://docs.microsoft.com/en-us/azure/active-directory/active-directory-configurable-token-lifetimes
  • https://jairocadena.com/2016/11/08/how-sso-works-in-windows-10-devices/
  • https://blogs.technet.microsoft.com/educloud/2017/06/14/how-to-kill-an-active-user-session-in-office-365/

Some key things to understand:

  1. Most users are now familiar with the “Keep Me Signed In” (KMSI) setting. The KMSI dialog only governs the browser cookie, in other words, when you choose no, that means the cookie is browser session bound. And when you choose yes, that cookie persists across browser sessions. I’m saying browser cookie, but as you see in the table that is synonymous with SSO token. And yes, Microsoft sprinkles use of both when talking about its tokens. So KMSI only governs the SSO token or browser cookie.
  2. Many of the AAD tokens have long lifetimes. You can read more about that at the first URL reference above.
  3. Browsers are not the only software managing your Azure AD tokens, e.g.
    1. if on iOS, the app you are using might manage the token, unless you’ve installed MS Authenticator, in which case, it manages AAD tokens
    2. if on Windows, it depends on the OS & Office version. And yes, this is one of the places where Microsoft has done a really poor job documenting or explaining. I might be able to share a few more details here, but it is complicated and I have incomplete details, so I fear that won’t be useful.
    3. if your AAD tokens are federated, then you’ve got upstream tokens. For example, at the UW, we federate our AAD to ADFS, which in turn is federated to Shibboleth. So there are multiple upstream tokens, and each of those tokens may be managed by something other than the browser, and may affect my AAD tokens.
  4. Getting rid of a cached AAD token is a problem. You need to know the specific client details & the recipe for that specific scenario. This is deeply entwined with #3.
    1. At the UW, in early 2018, we moved from ADFS 2 to ADFS 4. During that, we had a broad “Outlook” incident, where there were quite a few instances of corrupt cached AAD tokens that had to be manually deleted to enable users to use Outlook to connect to Exchange Online. In most cases, getting rid of those cached tokens was accomplished via the Windows credential manager.
  5. Apps have to actually enforce token lifetime. And many do not. So don’t necessarily trust the lifetimes to be solid.
  6. How token binding works is complicated. I don’t know how it works on non-Windows platforms. But this Twitter thread has some good details on the Windows platform token binding. Some key takeaways from that thread:
    1. The obvious: with token binding, you can’t do lateral movement–the token is only good on the computer it is on.
    2. You can further protect the token with Windows 10’s Key Guard, a hypervisor key isolation service
    3. Edge, IE, and the HTTP stack on Windows 10 all support token binding
    4. There are downsides to token binding: No 0-RTT, you can’t share tokens :), and proxies might break/strip your access.
    5. In the near-future, you can add FIDO as an additional layer of protection, which gives you a portable hardware token you can bind your AAD token to, in addition to the client computer binding.
  7. AAD token revocation is complicated.
    1. The ability to revoke is limited to specific AAD roles and you must use one of two PowerShell cmdlets to do it.
    2. Disabling the AAD user is usually a good idea in combination with revocation. If you don’t want the user to remain disabled, then have some process where the user must call into your service desk to get out of that state. You’ll need to design your approach based on the common use cases behind why you need to revoke.
    3. Revocation will be ineffective in some scenarios–in particular when a PRT is in play–and a PRT can only be in play if you have Azure AD domain joined devices. To make a PRT unusable, you have to disable or delete the AAD device. No amount of revocations will affect it. Of course, you could just disable the AAD account and then that PRT can’t be used to get new tokens.
    4. Conditional access is another way to prevent an issued token from being useful. It might be used in situations where you want to block access to some/most apps, but still need to allow access to a few.
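Point 5 above says apps have to enforce token lifetime themselves. As a minimal sketch of what that check looks like, the code below assumes the token is a JWT carrying the standard `exp` claim (seconds since the epoch). That is an assumption on my part: it is not true of every token type in the table, and refresh tokens in particular are generally opaque to the client. The sketch only reads the claim. It does not validate the signature, which a real app must also do, and the demo token is fabricated locally.

```python
import base64
import json
import time


def _b64url_decode(segment):
    # JWT segments are base64url without padding; restore it before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def jwt_claims(token):
    """Return the (unverified) claims from a three-part JWT."""
    _header, payload, _signature = token.split(".")
    return json.loads(_b64url_decode(payload))


def is_expired(token, now=None, leeway_seconds=0):
    """True if the current time is at or past exp plus an optional leeway."""
    now = time.time() if now is None else now
    return now >= jwt_claims(token)["exp"] + leeway_seconds


def make_demo_token(exp):
    """Build an unsigned, made-up token purely for demonstration."""
    def encode(obj):
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return ".".join([encode({"alg": "none"}),
                     encode({"sub": "demo", "exp": exp}),
                     "sig"])


if __name__ == "__main__":
    token = make_demo_token(exp=int(time.time()) - 60)
    print("expired:", is_expired(token))
```

An app that caches a token and never runs a check like this will keep using it until the resource rejects it, which is why the configured lifetimes are only as solid as the apps that honor them.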

User on Mac can’t login to AD, but can login from Windows

For about a month, we’ve been struggling with a pesky problem which boils down to the above description. As the above description suggests, the password is known to the user and the account is functional.

Other details:

  • Many others could login from all the macs tried—just the one user was having problems
  • We ensured that different groups set up the macs to rule out some odd configuration decision
  • We tried a couple semi-obvious things, like changing the password and ensuring the length and use of special characters shouldn’t cause problems on a Mac
  • We also looked closely at the AD account itself to see if there was anything broken or odd about it
  • We also tried inspecting the Mac-side logs, but that didn’t lead anywhere—but admittedly, our Mac expertise may not have been deep enough.

None of these produced much of anything and we were starting to create very far-fetched theories, and approaching what we thought was an option of last resort—deleting and recreating the AD user (which we now know probably wouldn’t have worked).

Finally, via a fortunate circumstance, one of our team members discovered there was another account with a displayName value which matched the user’s samAccountName value. We manually changed the displayName value on that other account, and logons for the person from Macs now worked!

The connection between a displayName and what the Mac client calls the username may seem arbitrary, but there is a fairly straightforward explanation:

  • Apple’s Mac AD client takes the username field value supplied and does an AD query using what is called ‘ambiguous name resolution’ (or ANR). This is a query against a pre-defined set of naming attributes. It’s what Outlook uses when you do an Address Book query. The ANR set includes displayName. You can find the other AD attributes it includes via an internet search of ‘ambiguous name resolution’. That’s a slightly odd design decision, but given the broad set of circumstances Apple might find in various Active Directories, it’s pretty understandable.

We believe the following to be true but there is a little speculation on our part:

  • Apple’s Mac AD client then takes all results from that query and silently chooses one and tries to authenticate using that one. If it fails, it doesn’t try any others and raises an error. Assuming our speculation is correct, this is a really bad design decision on Apple’s part.

The takeaway is there is a known problem where if your samAccountName matches the displayName or sn of another UW NetID, you may have problems logging in from a Mac.
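If you want to look for this kind of collision ahead of time, the idea can be sketched as below. This is a simplified model, not real ANR. It treats ANR as a case-insensitive prefix match over a small subset of attributes, and the directory entries are invented. Against a real directory you would export the accounts (or query with an LDAP filter of the form `(anr=<term>)`) and feed the results into a check like `find_collisions`.

```python
# Subset of attributes used for this sketch; the real ANR set is larger.
ANR_ATTRIBUTES = ("sAMAccountName", "displayName", "sn", "givenName")


def anr_matches(directory, term):
    """Entries where any modeled ANR attribute starts with the term."""
    term = term.lower()
    return [
        entry for entry in directory
        if any(entry.get(attr, "").lower().startswith(term)
               for attr in ANR_ATTRIBUTES)
    ]


def find_collisions(directory):
    """Pairs (login, other login) where the first account's sAMAccountName
    equals the displayName or sn of a different account."""
    collisions = []
    for account in directory:
        login = account["sAMAccountName"].lower()
        for other in directory:
            if other is account:
                continue
            names = (other.get("displayName", "").lower(),
                     other.get("sn", "").lower())
            if login in names:
                collisions.append((account["sAMAccountName"],
                                   other["sAMAccountName"]))
    return collisions


# Hypothetical directory export.
DIRECTORY = [
    {"sAMAccountName": "frontdesk", "displayName": "Robin Front",
     "sn": "Front", "givenName": "Robin"},
    {"sAMAccountName": "fdshared", "displayName": "frontdesk",
     "sn": "", "givenName": ""},
    {"sAMAccountName": "jsmith", "displayName": "Jo Smith",
     "sn": "Smith", "givenName": "Jo"},
]

if __name__ == "__main__":
    hits = anr_matches(DIRECTORY, "frontdesk")
    print([entry["sAMAccountName"] for entry in hits])
    print(find_collisions(DIRECTORY))
```

In the toy data, a lookup for "frontdesk" returns two entries, which is exactly the situation where a client that silently picks one result can end up authenticating against the wrong account.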

I have never heard anyone else share anything remotely like this, nor have I found anything on the internet which suggests someone else has previously traced this problem to this cause. So I thought I’d share.

WS2016 AD upgrade problem: TLS ciphers

We’ve been replacing all our WS2012R2 DCs with WS2016 DCs. A non-Windows platform application which performs LDAPS queries against AD had a breaking issue. That application leverages a JDK library to establish the TLS connection (for LDAPS) with a domain controller.

See https://docs.microsoft.com/en-us/windows-server/security/tls/tls-schannel-ssp-changes-in-windows-10-and-windows-server for Microsoft details on the newly added crypto ciphers. An interesting detail is that the preferred crypto cipher for WS2016 is still in draft status! Presumably it was chosen because it is significantly faster than the other new/strong ciphers.

The exact details of the known problem are still a bit sketchy, so I can’t share that level of detail definitively. The facts are that both WS2016 server and JDK client have ciphers in common, but a mutually supported cipher is not negotiated (whereas a WS2012R2 server with the same JDK client does negotiate successfully). The JDK team has an issue tracking this which they have resolved based on wire-level examination showing that the WS2016 server doesn’t properly do TLS cipher negotiation: https://bugs.openjdk.java.net/browse/JDK-8178429.

So we looked to find a solution. We are aware that the ‘bouncy castle’ provider (instead of JDK) has been used in similar situations. Our working hypothesis was that WS2016 is just overzealous in negotiation when the newer ciphers are enabled or suggested first.

Our testing showed that switching the order so that the cipher still in draft status is not first was enough to allow the JDK based client to negotiate, which seems to support that hypothesis.
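The reordering itself is simple to picture. The sketch below uses placeholder suite names, since I am deliberately not naming the actual suites, and an idealized model in which the server picks the first suite in its own order that the client also supports. Moving the draft suite to the end is just one way of making it "not first".

```python
def demote(order, suite):
    """Return a copy of the order with the given suite moved to the end."""
    kept = [s for s in order if s != suite]
    moved = [s for s in order if s == suite]
    return kept + moved


def server_pick(server_order, client_suites):
    """Idealized server-preference negotiation: first mutual suite wins."""
    for suite in server_order:
        if suite in client_suites:
            return suite
    return None


# Placeholder names, not real cipher suite identifiers.
SERVER_ORDER = ["DRAFT_SUITE", "ESTABLISHED_SUITE_A", "ESTABLISHED_SUITE_B"]
CLIENT_SUITES = {"ESTABLISHED_SUITE_A", "ESTABLISHED_SUITE_B"}

if __name__ == "__main__":
    print(server_pick(SERVER_ORDER, CLIENT_SUITES))
    reordered = demote(SERVER_ORDER, "DRAFT_SUITE")
    print(reordered)
    print(server_pick(reordered, CLIENT_SUITES))
```

In this idealized model a mutual suite is found under either order. That is what made the observed failure point at the negotiation behavior rather than at a missing suite, and why a change of order alone was a plausible workaround to test.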

We rolled out the order switch, and plan to keep it in place for 6 months, at which point the dependent application will be retired.

Some related links: