We are facing a continuous stream of ransomware, wipers, and related attacks. I had a client ask recently, somewhat in exasperation after being hit with one, about why all their investment in security wasn’t enough to keep them safe, and what they could do to deal with the next one more effectively. It’s a complex problem and paradoxically requires both more granular and more pre-planned response capabilities than we generally have available today.
A bit of brief background. These attacks are focused on denying access to systems or information – that is, an attack on availability, with a secondary risk to confidentiality, and (so far) they have not impacted integrity, though I expect that to happen. The code involved is often autonomous – rather than an attacker remotely accessing and controlling systems, the malware spreads via vulnerabilities in applications or operating systems, with some spread by phishing and email attachments. This hands-off approach and quick impact mean that often the first warning we have of a campaign is when machines are discovered to be encrypted or wiped. These are popular because malware kits are available for purchase, there’s a clear line from infection to monetization (or nation-state goal achievement), and the more advanced forms spread on their own.
We know these attacks are coming, and we know we’ll get hit. Yet we continue to lurch from one to the next, relying on yesterday’s blanket policies and procedures to protect us from a dynamic and ever-changing threat. There are three major places to start work, and most of it is on the IT and policy side of the house.
Backup
Above all else, we need robust and granular backup and recovery solutions in place for all of our systems, including end-user workstations. Running without backups is like sailing the North Atlantic in spring without pumps – if you hit an iceberg, you’re going down.
If we have good backups, using a cloud backup solution like CrashPlan or an enterprise-grade agent-based backup solution like Tivoli Storage Manager, it immediately limits the potential impact to whatever time it takes to restore the system. We can’t use local disks like Time Machine, or NAS backups via mounted drives, as malware is now specifically targeting those. While we can leverage our disaster recovery plans for this, they will have to be updated and made more agile. Specifically, we need to understand system dependencies, have solid – and tested – restore plans, and may need to change the backup timing to reduce the rollback period required.
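To make "understand system dependencies" a little more concrete, here is a minimal sketch of how a restore order could be derived from a dependency map. The system names and the map itself are made up for illustration; a real plan would come from your own architecture documentation and would still need to be tested by actually restoring.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each system lists the systems that must be
# restored before it. These names are illustrative, not a recommendation.
DEPENDS_ON = {
    "active-directory": [],
    "database": ["active-directory"],
    "file-server": ["active-directory"],
    "erp-app": ["database", "file-server"],
    "workstations": ["active-directory", "file-server"],
}


def restore_order(depends_on):
    """Return systems ordered so that every dependency is restored first.

    Raises graphlib.CycleError if the map contains a circular dependency,
    which is itself worth knowing before an incident.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for step, system in enumerate(restore_order(DEPENDS_ON), start=1):
        print(step, system)
```

The point is less the code than the exercise: if you can't write down the dependency map, you don't yet have a restore plan.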
Dynamic Organizational Threat Posture
Ok, I think I just won buzzword bingo, but let me explain. Right now, 99% of organizations have two operational postures – running with static rules for network and system access, or hitting the big red button and dropping systems offline. We need options in between. First, this involves properly segmenting networks – not just based on the type of system or classification of data, but also by type of machine. Next, it involves defining at least four levels of access control through those segments: Standard, Heightened, Permissive, and Locked. Standard is essentially what we have today – a set of restrictions that allow all components of the enterprise to function, with other services blocked or restricted. Heightened is a set of restrictions that only permit access necessary for tier-1 critical business functions. Permissive involves selectively relaxing controls for catastrophic situations like natural disasters (e.g. suspending 2-factor requirements for regular users). Locked is the equivalent of the big red button on the data center wall, and involves isolating systems to prevent infection.
To continue the analogy, Standard is our normal cruising posture – we’ve done lifeboat drills, inspected the equipment, but still allow people to smoke on deck, and maintain our current course. Heightened might require uncovering the lifeboats, banning smoking, closing certain of the interior bulkhead doors, and making moderate course changes even if we’re going to miss a port date. Permissive would be suspending lifeboat drills during a hurricane, and Locked means that we slow the ship, post watches looking for bergs ahead, change course, and close all the watertight doors.
This isn’t easy to do. Enterprise architectures are highly complex, intertwined messes of undocumented connections between systems. And of course, it doesn’t help if our network gear is the point of infection. Still, it can provide us with a more granular response – for example, if we know there’s a wiper that’s traveling via a self-replicating worm against Windows Server systems, we have the option to quickly restrict or lock down the ‘windows server’ segment on our network.
There are numerous other examples of this approach: if we know there’s a virulent campaign being spread by email attachments, let’s have a plan in place to temporarily block all attachments until we get signatures in place (and potentially all encrypted attachments completely). The key thing is to build the capability to change the threat posture for different logical components of our enterprise architecture independently, and establish an incident response team that includes security, IT and business decision makers who jointly authorize the posture changes.
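As a rough sketch of what "change the posture of one segment independently, with joint sign-off" could look like as a data model: the segment names, the class name, and the rule that all three groups must approve are my assumptions for illustration. In practice the posture change would drive firewall, NAC, or mail gateway rules rather than a dictionary.

```python
from enum import Enum


class Posture(Enum):
    STANDARD = "standard"
    HEIGHTENED = "heightened"
    PERMISSIVE = "permissive"
    LOCKED = "locked"


# Assumption: every posture change needs sign-off from all three groups.
REQUIRED_APPROVERS = frozenset({"security", "it", "business"})


class PostureBoard:
    """Tracks the current posture of each network segment separately."""

    def __init__(self, segments):
        # Every segment starts in the normal cruising posture.
        self.postures = {name: Posture.STANDARD for name in segments}

    def set_posture(self, segment, posture, approved_by):
        """Change one segment's posture if the change is jointly authorized."""
        if segment not in self.postures:
            raise KeyError(f"unknown segment: {segment}")
        missing = REQUIRED_APPROVERS - set(approved_by)
        if missing:
            raise PermissionError(f"missing approval from: {sorted(missing)}")
        self.postures[segment] = posture


if __name__ == "__main__":
    # Hypothetical segments; a wiper worm is hitting Windows servers.
    board = PostureBoard(["windows-server", "workstations", "byod", "email"])
    board.set_posture("windows-server", Posture.LOCKED,
                      approved_by={"security", "it", "business"})
    for segment, posture in board.postures.items():
        print(segment, posture.value)
```

Only the affected segment moves to Locked; everything else keeps running at Standard, which is the whole argument for doing the segmentation work up front.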
End Users
This is one of the hardest things to address politically and culturally in our organizations, but it’s time to have that conversation. We need to start doing more granular segmentation of user devices than simply ‘Regular and Privileged’. There are a number of slices, some of which may intersect.
All the usual practices still apply. Patching of applications and OSes is the obvious one, and yes, that means we have an Android problem. I’ll write more on that later this week.
First, on BYOD. My personal preference is for the ‘treat them the same’ policy – allow BYOD, but require that any device used for business be fully managed by the organization with the same policies as enterprise-owned equipment. BYOD is a privilege, not a right. For organizations that don’t have that policy (and particularly for the ones that are missing or ignore a policy), all BYOD devices should be relegated to their own network segment that can be quickly and completely isolated from all core infrastructure. We wouldn’t allow passengers to steer the ship in iceberg-infested waters after all. More on BYOD policies another time.
Related, let’s look at local access rights. This is the one place where BYOD policies may diverge, but we should ask how many users really need to be able to install and run software on their local machines? I guarantee it’s fewer than are currently allowed. Another way to reduce attack surface is to implement whitelisting of applications. That would still allow local admin/installation rights, but within a controlled sandbox. At the very least, requiring that software be signed by Apple/Microsoft/etc. is a policy that could be put in place with little impact.
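For anyone who hasn’t seen whitelisting up close, the core idea fits in a few lines: compute a fingerprint of the binary and only allow it if that fingerprint is on an approved list. This is only a sketch of the concept using SHA-256 hashes; the function names are mine, and a real deployment would rely on the operating system’s own application control and code-signing checks rather than a script like this.

```python
import hashlib


def sha256_of_bytes(data):
    """Return the hex SHA-256 fingerprint of a blob of bytes."""
    return hashlib.sha256(data).hexdigest()


def sha256_of_file(path, chunk_size=65536):
    """Fingerprint a file on disk, reading in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_allowed(fingerprint, allowlist):
    """An application may run only if its fingerprint is on the approved list."""
    return fingerprint in allowlist


if __name__ == "__main__":
    # Hypothetical approved binary, represented here as raw bytes.
    approved = {sha256_of_bytes(b"approved application binary")}
    print(is_allowed(sha256_of_bytes(b"approved application binary"), approved))
    print(is_allowed(sha256_of_bytes(b"unknown download"), approved))
```

The tradeoff with hash-based lists is maintenance: every patch changes the fingerprint, which is one reason signature-based policies are the easier first step.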
Speaking of Apple, let’s have a conversation about Macs versus Windows. Right now, Mac has a lower threat posture than Windows – doesn’t matter if it’s because it’s more secure inherently or just that it’s a less common target – the fact is that we see far less malware on macOS than on Windows. The downside of Mac is that there is simply no good live anti-malware software available today. I’ve tried all the packages – they either destabilize the system, fail on OS upgrades, or get in the way far too much. We rely on Apple’s XProtect, and Apple can be tardy on posting new signatures. More on Mac anti-malware another time (I think that’s three more blog entries coming up). But for users of critical data, particularly if the organization is unwilling to lock down workstations, macOS is an option to reduce the attack surface. In other words, if you’re headed into Greenland in the spring, let’s bring an icebreaker, not a dugout canoe.
And the rest
At this point we’re back into the security realm. Host and network intrusion detection/prevention and anti-malware solutions, particularly if they are behavior rather than signature based, are a big part of stopping an attack in its tracks. Security intelligence, including both open source and private threat feed data, shortens the time to discovery. Advanced analytics, including cognitive analytics, can help discover the full scope and context of an infection and provide clearer guidance on how to change the posture of our asset classes. Workflow-based incident response plans, including pre-positioned conference bridges, roles and responsibilities, and decision maker identification, facilitate ‘making the call’ to change postures in a timely manner.
How we stop the next black swan is a vexing question. I believe that we have to move beyond tools and technologies, and focus on building processes that allow our people to respond more effectively. That means admitting we’ll be hit and being prepared to recover backups, segmenting our enterprise so we can better isolate and prevent the spread of an infection while minimizing business impact, and addressing the most common root cause – layer 8 problems with users doing things that compromise our systems. Will that fix it? Nope. But it’s a start.