The nail in the coffin for VPNs
I think about enterprise IT architecture and the VPNs within them a lot — probably far too much. VPN ecosystems cost a bunch of money, take significant operational effort to run, often lead to a terrible IT user experience, and are a breeding ground for vulnerabilities. I previously wrote about the general internet user not needing a VPN.
In this post, I’ll ponder the use of client dial-up VPNs in corporate IT systems — the VPNs used by corporate laptops and smartphones. I’m using ‘VPNs’ as a catch-all for ‘centralised networking systems’ — traditionally that is a VPN, but more recently it could mean TLS tunnels that achieve the same goal of funnelling some or all network traffic to/via a central network e.g. Zscaler Secure Internet Access (ZIA) or Palo Alto Prisma Access.
Transparency
The company that provides technology and cybersecurity consultancy maintains relationships with a number of the vendors mentioned in this post. This is not a sponsored post and they were not involved in its publication.
If you’re interested in talking about the technology/security musings and suggestions in this post, find me on Twitter or send an email to hello@slash32.co.uk.
Why I have it out for VPNs
A centralised network = centralised chaos
If the central network is down, mis-performing or compromised — your entire organisation probably grinds to a halt.
If the VPN system or the infrastructure behind it is compromised, the centralised network now acts as a single point of traffic manipulation and/or DNS poisoning.
They don’t actually provide full network activity visibility
There is a common assumption that the VPN will protect all network traffic all of the time. Sadly, that’s usually not true — whether it’s at ‘boot time’, while the user logs in, or when a captive portal is in the way, there are lots of times where the VPN isn’t connected.
Mature VPN clients for Windows will have OK functions for stopping network traffic if the VPN is not connected (mostly). This is much much harder and usually neglected for macOS.
To labour the point: there are lots of times when network traffic flows entirely unrestricted, unprotected and unmonitored contrary to the popular security design assumptions for corporate VPNs.
They are rarely configured properly
In an ideal world, the authentication to the VPN service includes a hardware-backed unexportable certificate, while the VPN service is checking CRL/OCSP for certificate revocations.
Teams often purport that if all network traffic is funnelled to a centralised network, that centralised network can have central monitoring… however, a lot of networks don’t collect the VPN/firewall logs and/or eventing data. If they do, the SIEM ingest costs go up. Then someone has to write useful detection use-cases, and actually process/respond to any alerts.
In reality, VPN certificates are readily exportable, VPN systems accept expired certificates, allow multiple concurrent sessions from the same certificate, and no one is looking.
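The gaps described above (expired certificates being accepted, and several sessions sharing one certificate) are the kind of thing that could be checked from session logs, if anyone were looking. The sketch below is illustrative only: the VpnSession record and its field names are assumptions rather than any vendor’s log format, and it presumes those fields can be parsed out of whatever the VPN service emits.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class VpnSession:
    """Hypothetical session record, as might be parsed from VPN logs."""
    cert_fingerprint: str
    cert_not_after: datetime  # certificate expiry
    started: datetime
    ended: datetime


def expired_cert_sessions(sessions):
    """Sessions that started after their certificate had already expired."""
    return [s for s in sessions if s.started > s.cert_not_after]


def concurrent_cert_sessions(sessions):
    """Flag each session that starts while an earlier session using the
    same certificate is still open. Returns (earlier, later) pairs."""
    by_cert = defaultdict(list)
    for s in sessions:
        by_cert[s.cert_fingerprint].append(s)

    overlaps = []
    for group in by_cert.values():
        group.sort(key=lambda s: s.started)
        latest = group[0]  # earlier session with the latest end time so far
        for s in group[1:]:
            if s.started < latest.ended:
                overlaps.append((latest, s))
            if s.ended > latest.ended:
                latest = s
    return overlaps
```

Either list being non-empty would be worth an alert in the SIEM, which is precisely the detection work that often never gets done.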
They often provide a false sense of security
Far too many networks assume total trust with an authenticated VPN session — broad access to internal systems (which often don’t have the same protections and monitoring as services on the internet), visibility of Active Directory and corporate knowledge repositories full of sensitive/personal information.
As Lukas has said, once that outer veil of the VPN is eventually compromised the networks inside are usually ‘flat’ — giving the attacker full trusted access.
Centralised networks are a major IT cost
VPN technologies (or SaaS subscriptions to VPN providers and other centralised networking providers such as Zscaler, etc) cost money.
The VPN solution will vary:
- a cloud-based ‘as a service’ solution with a ‘per user, per month’ fee. There will be add-ons for web filtering, DNS filtering, data leak/loss prevention and more
- virtualised VPN concentrators in a cloud hosting environment
- ‘on-premise’ physical VPN concentrators with finite capacity — whether a datacentre or in the comms room of an office building
The costs range from the hardware investment and ongoing support/feature subscriptions through to the operational personnel to keep it all purring. Feeding, watering, monitoring, securing and licensing the VPN ecosystem (let alone the typical firewall systems behind them) is not cheap.
Bandwidth can also be a significant cost factor — if a knowledge worker spends their time on Microsoft 365 and video conferences, routing that traffic centrally and then out to the internet will incur bandwidth costs in both directions and offer no security benefit.
They really bug certain types of users
In general VPNs add latency, annoy users when they disconnect and often interrupt the user for authentication if not solely relying on certificate-based authentication.
Even further, they can be quite disruptive for developers and other more ‘technical’ roles that use a wide variety of software, run virtual machines or containers and so on.
What does it take to get rid of VPNs?
To get rid of the VPN, we have to find ways to solve a few problems:
- #1: encrypt/encapsulate network traffic — to protect it from any prying eyes on the local network (coffee shop, hotel, airport Wi-Fi etc) through to what used to be the VPN concentrator (but now the service endpoint: Microsoft 365’s HTTPS load balancers, Slack’s HTTPS load balancers, etc)
- #2: maintain device network traffic visibility — audit and assure what is going on, and detect things like malware C2 or otherwise unexpected connections
- #3: protect DNS traffic from interception/poisoning, provide DNS filtering, and understand if DNS queries are going to the right DNS services or not
- #4: support authentication and authorisation decisions in onward systems
- #5: enable applications that use source IP access control lists and/or live inside internal networks — without a VPN, the user/device’s source IP will be the external IP address of whatever network the device is on, whether that’s a coffee shop or home Wi-Fi
The problems have answers
- #1: Application-level encryption (particularly TLS for HTTPS) has come on in leaps and bounds
The network traffic for the important corporate software/systems (MS Teams, Outlook, Google Workspace, Slack, etc) is already resistant to downgrade attacks and often uses certificate pinning. VPNs on top are double encryption, which has highly questionable value.
For unencrypted connections (HTTP, etc) this probably won’t be for anything valuable or important. A lot of content delivery is still over HTTP which isn’t ideal, but that shouldn’t contain any personal or remotely sensitive information.
- #2: host-based firewalls and an endpoint agent that conducts 1:1 network packet deep inspection — this is where solutions like SenseOn uniquely come in, as an endpoint-based combined endpoint detection & response (EDR) and network detection & response (NDR) solution
Localised firewalls to generically stop traffic you don’t want (inbound connections, outbound connections to the local network etc).
- #3: Cisco Umbrella, NextDNS and plenty of other encrypted DNS solutions exist. Apple natively supports encrypted DNS configuration payloads, and Windows has this in the works (not enterprise production grade yet).
It’s crucial that the DNS service conducts malware and other ‘known bad’ domain filtering. This is a key compensation for the DNS filtering the centralised network would have provided.
- #4: hardware-backed certificates can be directly used by services, and services that support it can also take attestation/health data from mobile device management systems as well
- #5: re-configuration to certificate-based authentication, ‘real’ authentication (applications with SAML, etc), AWS WorkSpaces or AppStream, Azure App Proxy or Zscaler Private Access are all modern solutions to access internal/on-premise applications
Alternatively, #2 doesn’t have to be solved: organisations could take the view that if TLS is in place for the important things and encrypted DNS (with filtering) is working most of the time, then that’s good enough for them. This may be too daring for many enterprises, but is likely an acceptable situation for micro/small businesses. This is a great model for personal devices.
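To make the encrypted DNS in #3 a little more concrete, here is a minimal sketch of how a DNS over HTTPS (DoH) lookup is formed under RFC 8484: an ordinary DNS query in wire format, base64url-encoded without padding and carried in the `dns` parameter of an HTTPS GET. The resolver URL is a placeholder and the function names are mine; in practice the operating system or the DNS provider’s agent does this, not your own code.

```python
import base64
import struct


def build_dns_query(hostname: str, qtype: int = 1) -> bytes:
    """Build a minimal DNS query in wire format (qtype 1 = A record)."""
    # Header: ID 0 (RFC 8484 recommends 0 for cache friendliness),
    # flags 0x0100 (recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    labels = hostname.rstrip(".").encode("ascii").split(b".")
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    # QTYPE as requested, QCLASS 1 (IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def doh_get_url(resolver_url: str, hostname: str) -> str:
    """Return the URL for a DoH GET request (base64url, padding stripped)."""
    query = build_dns_query(hostname)
    encoded = base64.urlsafe_b64encode(query).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={encoded}"


# Placeholder resolver URL, for illustration only.
print(doh_get_url("https://doh.example/dns-query", "www.example.com"))
```

Because the lookup travels inside TLS to the resolver, someone on the coffee shop Wi-Fi sees an HTTPS connection to the DNS provider rather than the names being looked up.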
What do others think?
The UK National Cyber Security Centre (NCSC, a part of GCHQ) have also posted about zero trust and VPNs. I agree that — as per #1 above — encryption in transit is key, but I don’t believe forwarding traffic through an internal protective monitoring service as they suggest is the best thing to do. It recreates IT operational cost and resiliency risks, and I believe this can be offset through TLS configurations resistant to downgrade and detective monitoring to provide comprehensive assurance.
SenseOn
SenseOn is a cybersecurity vendor headquartered in the UK. They have — in my view — uniquely cracked the network visibility problem. Providing endpoint detection & response (EDR) and network detection & response (NDR) entirely client-side (Windows, macOS and Linux) is very exciting.
In early 2023, I oversaw a pilot to battle test SenseOn’s NDR claims (it tested the automated data analysis side of SenseOn as well, but the NDR was the purpose of the pilot) and it passed with flying colours.
Zeek is interesting for a number of reasons, but I personally feel this falls short of what SenseOn have built.
Without SenseOn, the accepted model is to ingest logs from the VPN service, mobile device management and applications (Microsoft 365, etc) into a SIEM, and then do the correlation to form the extended detection & response (XDR). This is a fine solution but, as discussed above, since the VPN doesn’t actually provide full network activity visibility you will still be missing a large amount of data.
A new solution finally appears
Stepping back and painting the full picture, my ideal deployment would look something like:
- enterprise-owned and issued laptops — I’ll post about BYOD another time
- remote assistance solution (ideally tied to the IT Service Management/ticketing system, for full audit etc)
- modern SaaS mobile device management in ‘device owner’ supervised modes — Microsoft Intune, Jamf, VMware Workspace ONE etc
- encrypted DNS with security filters (known malware, phishing, Safe Browsing etc) — Cisco Umbrella, NextDNS etc
- endpoint detection & response (EDR) — Microsoft Defender for Endpoint if it is already available through productivity licences, otherwise SenseOn
- network detection & response (NDR) — SenseOn
SenseOn would be used to:
- understand IP connections to unexpected and unusual destinations
- detect IP connections to ‘known bad’ — IoCs and the like
- understand any unencrypted traffic
- understand where DNS traffic is being processed by unexpected DNS services or via unencrypted protocols
- support the detection of malicious DNS lookups — IoCs and the like
- understand inbound connections to the endpoint from the local network
- conduct EDR
SenseOn’s Reflex feature also offers device isolation on par with all other EDR solutions — useful if the EDR or NDR detections indicate something is very very wrong.
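As a rough illustration of the DNS checks in that list, the sketch below classifies an observed DNS connection by where it is going and whether it looks encrypted. This is not how SenseOn works internally (I don’t know its implementation); the DnsFlow record, the approved resolver address (taken from a documentation IP range) and the port-based heuristic are all assumptions made for the example.

```python
import ipaddress
from dataclasses import dataclass

# Assumed: the organisation's approved encrypted DNS resolvers.
# 192.0.2.53 is a documentation-range address, not a real service.
APPROVED_RESOLVERS = {ipaddress.ip_address("192.0.2.53")}

# Simplifying assumption: 443 means DNS over HTTPS, 853 means DNS over TLS.
# Anything else (typically port 53) is treated as unencrypted.
ENCRYPTED_DNS_PORTS = {443, 853}


@dataclass(frozen=True)
class DnsFlow:
    """Hypothetical record of one DNS connection seen on the endpoint."""
    dest_ip: str
    dest_port: int


def classify_dns_flow(flow: DnsFlow) -> list:
    """Return a list of findings; an empty list means nothing unusual."""
    findings = []
    if ipaddress.ip_address(flow.dest_ip) not in APPROVED_RESOLVERS:
        findings.append("unexpected resolver")
    if flow.dest_port not in ENCRYPTED_DNS_PORTS:
        findings.append("unencrypted protocol")
    return findings
```

A lookup sent to the coffee shop’s DHCP-provided resolver on port 53 would come back with both findings, which is the visibility a VPN log would never have given you.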
In this design, the IT service would be highly resilient and cost effective:
- if the device management solution is unavailable, users won’t be able to install new apps — but they can still work
- if the encrypted DNS solution is unavailable, the device can fail-open (fall back to unencrypted DNS from the local network as provided by DHCP, or a public DNS provider) — with SenseOn maintaining visibility of all DNS lookups
- none of the centralised network service, bandwidth or operational costs
- if SenseOn and/or the EDR is unavailable, the device works as normal
In this design, even if any of the IT infrastructure is down, the user/device still functions and can do all their work.
Different threat profiles
There will be organisations (large multi-nationals, governments, etc) where the solution above isn’t viewed as providing enough protection — they want to be able to block IP connections in real time, rather than rely on quick detection and response.
If we assume their current systems do do that effectively (👀), then they are right. The conversation would have to move to the balance of protective controls, detective capabilities, speed of response, IT operation effort, IT resiliency, IT infrastructure costs and so on, to find the most acceptable and affordable balance.
In these situations, a far more dynamic configuration would be better, perhaps:
- a VPN is deployed but will not capture all network routes (0.0.0.0/0) unless the user/device is in a specific group (perhaps the member of a high risk travel group in the IT identity system)
- a VPN is deployed but location/context aware, and will automatically process all network routes if it deems the device is outside of the user/device’s usual country
- a VPN is deployed but all known systems are allowed to bypass it (Microsoft 365, Google Workspace, Slack, Atlassian, mobile device management and so on), so only more generic internet browsing goes via the VPN — the VPN would be processing a lot less traffic and, if it went down, the device would still mostly work and all of the user’s core work tools would keep going
In this more dynamic design, a VPN may exist and may be running, but it won’t be processing traffic unnecessarily unless the specific contextual security requirements say so.
This still has the benefit of most users not needing the VPN most of the time, so the costs related to the VPN model are still dramatically reduced even if not completely offset.
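The three options above are alternatives, but they can also be combined. Purely as a sketch of what such a policy decision might look like, the example below sends everything through the VPN for high-risk users or devices outside their usual country, and otherwise lets known work tools bypass it. The group name, the country codes, the DeviceContext fields and the bypass domains are all illustrative assumptions, not a recommended or complete configuration.

```python
from dataclasses import dataclass

# Assumed values, for illustration only.
HIGH_RISK_GROUP = "high-risk-travel"
BYPASS_SUFFIXES = ("office.com", "slack.com", "atlassian.net")


@dataclass(frozen=True)
class DeviceContext:
    """Hypothetical context supplied by the identity and MDM systems."""
    groups: frozenset
    current_country: str
    usual_country: str


def full_tunnel_required(ctx: DeviceContext) -> bool:
    """Capture all routes (0.0.0.0/0) for high-risk users or unusual locations."""
    return HIGH_RISK_GROUP in ctx.groups or ctx.current_country != ctx.usual_country


def route_via_vpn(ctx: DeviceContext, hostname: str) -> bool:
    """Decide whether traffic to hostname should go through the VPN."""
    if full_tunnel_required(ctx):
        return True
    host = hostname.lower().rstrip(".")
    # Known work tools bypass the VPN; everything else goes through it.
    is_known = any(host == s or host.endswith("." + s) for s in BYPASS_SUFFIXES)
    return not is_known
```

For a user at home in their usual country, `route_via_vpn(ctx, "app.slack.com")` returns False while general browsing still goes through the VPN; once the same user is abroad or in the high-risk group, everything does.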
Smartphones are a little different
Smartphone networking behaviours can be all over the place. While mobile device management systems / profiles can be used to specify a VPN, and specify that all routes (0.0.0.0/0) should pass through the VPN, it doesn’t always work that way.
As with laptops in boot sequences, smartphones will reach back to various services very quickly (push notification systems, etc) before a VPN can dial, while a VPN is disconnected, and in some cases ‘above’ the VPN in terms of network traffic priority.
The interactions between VPN configurations and other smartphone security configurations (for example, DNS over HTTPS / DNS over TLS) can also vary, and sometimes clash with a VPN.
Smartphone networking needs a bit more thinking, because the ability to audit/observe networking isn’t the same as what is possible under Windows/macOS/Linux.
On a day to day basis, organisations could be quite happy with the main productivity apps (MS Teams, Outlook, Slack, etc) and other popular apps (WhatsApp, Signal, etc) using encryption in transit and then relying on the fact the encrypted DNS service is probably working.
This wouldn’t be high enough assurance for high-risk roles or high-risk travel, but I intend to talk about high-risk smartphones in a separate mini-series ‘soon’.
You might find even more exciting posts in my Medium profile. I am on Twitter as @JoelGSamuel.