#1263216: Project Svalbard: The Future of Have I Been Pwned
Back in 2013, I was beginning to get the sense that data breaches were becoming a big thing. The prevalence of them seemed to be really ramping up as was the impact they were having on those of us that found ourselves in them, myself included. Increasingly, I was writing about what I thought was a pretty fascinating segment of the infosec industry; password reuse across Gawker and Twitter resulting in a breach of the former sending Acai berry spam via the latter. Sony Pictures passwords being, well, precisely the kind of terrible passwords we expect people to use but hey, actually seeing them for yourself is still shocking. And while I'm on Sony, the prevalence with which their users applied the same password to their Yahoo! accounts (59% of common email addresses had exactly the same password).
Around this time the Adobe data breach happened and that got me really interested in this segment of the industry, not least because I was in there. Twice. Most significantly though, it contained 153M other people which was a massive incident, even by today’s standards. All of these things combined – the prevalence of breaches, the analysis I was doing and the scale of Adobe – got me thinking: I wonder how many people know? Do they realise they were breached? Do they realise how many times they were breached? And perhaps most importantly, have they changed their password (yes, almost always singular) across the other services they use? And so Have I Been Pwned was born.
I’ll save the history lesson for the years between then and today because there are presently 106 blog posts with the HIBP tag you can go and read if you’re interested, let me just talk briefly about where the service is at today. It has almost 8B breached records, there are nearly 3M people subscribed to notifications, I’ve emailed those folks about a breach 7M times, there are 120k people monitoring domains they’ve done 230k searches for and I’ve emailed them another 1.1M times. There are 150k unique visitors to the site on a normal day, 10M on an abnormal day, another couple of million API hits to the breach API and then 10M a day to Pwned Passwords. Except even that number is getting smashed these days:
Oh – and as I’ve written before, commercial subscribers that depend on HIBP to do everything from alert members of identity theft programs to enable infosec companies to provide services to their customers to protecting large online assets from credential stuffing attacks to preventing fraudulent financial transactions and on and on. And there are the governments around the world using it to protect their departments, the law enforcement agencies leveraging it for their investigations and all sorts of other use cases I never, ever saw coming (my legitimisation of HIBP post from last year has a heap of other examples). And to date, every line of code, every configuration and every breached record has been handled by me alone. There is no “HIBP team”, there’s one guy keeping the whole thing afloat.
When I wanted an infographic to explain the architecture, I sat there and built the whole thing myself by hand. I manually sourced every single logo of a pwned company, cropping it, resizing it and optimising it. Each and every disclosure to an organisation that didn't even know their data was out there fell to me (and trust me, that's massively time-consuming and has proven to be the single biggest bottleneck to loading new data). Every media interview, every support request and frankly, pretty much every single thing you could possibly conceive of was done by just one person in their spare time. This isn't just a workload issues either; I was becoming increasingly conscious of the fact that I was the single point of failure. And that needs to change.
It's Time to Grow Up
That was a long intro but I wanted to set the scene before I got to the point of this blog post: it’s time for HIBP to grow up. It’s time to go from that one guy doing what he can in his available time to a better-resourced and better-funded structure that's able to do way more than what I ever could on my own. To better understand why I’m writing this now, let me share an image from Google Analytics:
That graph is the 12 months to Jan 18 this year and the spike corresponds with the loading of the Collection #1 credential stuffing list. It also corresponds with the day I headed off to Europe for a couple of weeks of “business as usual” conferences, preceded by several days of hanging out with my 9-year old son and good friends in a log cabin in the Norwegian snow. I was being simultaneously bombarded by an unprecedented level of emails, tweets, phone calls and every other imaginable channel due to the huge attention HIBP was getting around the world, and also turning things off, sitting by a little fireplace in the snow and enjoying good drinks and good conversation. At that moment, I realised I was getting very close to burn-out. I was pretty confident I wasn’t actually burned out yet, but I also became aware I could see that point in the not too distant future if I didn’t make some important changes in my life. (I’d love to talk more about that in the future as there are some pretty significant lessons in there, but for now, I just want to set the context as to the timing and talk about what happens next.) All of this was going on at the same time as me travelling the world, speaking at events, running workshops and doing a gazillion other things just to keep life ticking along.
To be completely honest, it's been an enormously stressful year dealing with it all. The extra attention HIBP started getting in Jan never returned to 2018 levels, it just kept growing and growing. I made various changes to adjust to the workload, perhaps one of the most publicly obvious being a massive decline in engagement over social media, especially Twitter:
Up until (and including) December last year in that graph, I was tweeting an average of 1,141 times per month (for some reason, Twitter's export feature didn't include May and June 2017 and only half of July so I've dropped those months from the graph). From Feb to May this year, that number has dropped to 315 so I've backed off social to the tune of 72% since January. That may seem like a frivolous fact to focus on, but it's a quantifiable number that's directly attributable to the impact the growth of HIBP was having on my life. Same again if you look at my blog post cadence; I've religiously maintained my weekly update videos but have had to cut way back on all the other technical posts I've otherwise so loved writing over the last decade.
Read rest in the link
|Date added||June 12, 2019, 10:47 a.m.|