Data security risks in the modern age

“Yeah. I’ve got a couple of companies.”

Tom Dalglish comes across like an absolute boss. Vibe like a club bouncer, but with a heavyweight brain.

“I’m a data guy.” We gloss over his career – all the big names in finance (Bear Stearns, JP Morgan, Merrill...). Now he sits on various councils and boards. You should probably connect with him.

He tells me about applied innovation labs in various banks, namely how traditional enterprises are moving their data towards “the cool stuff like analytics, financial crime, automation, machine learning, intelligent cognitive technology automation... all that nonsense!”

Companies are sitting on so much data that they’re now obsessed with finding ways of monetising it, often as a means of keeping up with their competition. To my mind, this race doesn’t end well for consumers, but it also plants legal landmines for businesses (and professionals?) at every step. It’s not just businesses that might struggle with the legal consequences; the legal system itself is struggling. For the law to be trustworthy it must be deliberate, not rushed, yet it’s not moving fast enough: innovation is accelerating and the backlog of grey-area lawsuits will only continue to pile up.

At the time of speaking, Tom was freelancing to help a company evolve from being a throughput service (i.e. where data comes in, data is added and repackaged as a new data product) into a generative service - where it owns its own data and delivers microservices as the output.

His view is that these companies “need people to actually do the work” and that, as a result, “technical ability is the money making end of data management”.

Massive companies are moving fast to own as much data as possible, the new oil to sit on, yet as a security expert he’s also tuned in to the risks this speedy scrambling could present.

“We’ve gone from one big machine, to many little machines doing lots of different things – the security landscape is MUCH harder.” He tells me that many firms have so much going on, so many data sources and confused data pipelines, that they often don’t know where their data originated or what legal liabilities are attached to it, because they don’t know how to use or store it sensibly.

“Go to a supermarket,” he chuckles, “they can tell you exactly which chickens went into your sandwich,” (sometimes you even get a name) “...but we can’t do that with data?!”

It’s madness. I’ve worked for a few firms and data management has, at best, been chaotic. It needs everyone in the business to follow the same rule book religiously, to use correct naming conventions, to store data correctly - even simple things like saving a file in the correct place. I’ve personally seen a flurry of machine learning solutions that provide an internal search engine because company files are so hard to navigate. It’s a band-aid economy! Staff have their own ways of doing things, they follow their own logic and, of course, staff leave. Like with our medical system, companies are just patching up symptoms.

His exasperation is clear. Talk to me, Tom.

  1. “The amount of data there is in the market is insane. Scary numbers… Did you know the average car generates a terabyte of data per day – who the hell owns that? What happens when you sell your car - does the data transfer to the new owner? Can they access it? Think about how much information your car has about you, where you live, where you work, where you shop... Do we really need to measure everything?! The old VW Beetles had four wires. Now your car is spying on you. What happens if old driver habits affect your insurance?

  2. “Python, Scala, TensorFlow, all that sh*t they’re giving away for free [open source]; there’s a lot of risk and I won’t be too popular for saying that… but it’s true. It presents a colossal security risk. Just look at what’s happening with Russia, the global political scene is so unstable and we’re practically showing what’s under the hood of our information architecture. As open source gains further traction, it will begin to creep into high-level security - can you imagine a bank using open source technology?

  3. “The absolute insanity of data lakes. The inappropriateness of ingesting absolutely everything - when most of it isn’t even used - presents a huge security risk and often for little value.

  4. “The inability to select from the options in the market: there are just way too many to choose from. There are six products for every solution.

  5. “Most fatal, it’s really difficult to find people with the skill to do all this stuff, and businesses can’t afford them. This is twisting the model and leads to offshoring, which creates economic problems.

  6. “We don’t have a tech problem, we have a management problem.”

"Sifting through the morass of ‘what should I use’ – this should be talked about more."

A place to compare and coordinate perspectives?

Welcome to Machine Commons, Mr Dalglish.

He hammers in a great point on governance.

“Contrasting traditional EDM style data management with the new age of data analytics, they’re just different things. Last year we hit peak EDM and we’ll see more companies move towards real-time data management initiatives. Like Splunk, Elastic.”

“It’s all well and good but no governance, no analytics. You can’t get good analytics without the governance.”

So there’s actually a financial incentive for strong data governance. I suppose that’s a good thing, but has the realisation of this incentive come too late? Wait, have management teams realised this?!

As Tom rightly points out, we’ve already enjoyed “five star crashes of massive companies making colossal errors”, yet I think by far the worst is yet to come. We’ve seen hacks and negligent data management but, as I discuss with Patrick Hall, what happens when computation takes over more and more decision making - decisions usually the purview of human beings - and starts condemning people’s futures?

What happens when people can’t get credit because ‘computer said no’ and the entity responsible, having failed in its data governance, can’t provide a sensible reason why? The legal system lags so far behind this technology that people could suffer decades of machine-made decisions with no consequence to the offending companies. By the time there’s a consequence, the system will have evolved - the profitability structure of firms will have changed to take advantage of the new ML decision-making efficiencies - and there will be no going back.

As Patrick said, “the toothpaste can’t go back in the tube!”

Tom’s like a steamroller. “We know banks are trying to determine what the likelihood of your future buying behaviour is: hey take this, here’s a coupon, buy through our supplier.”

Here’s what I find troubling. Think about fast food chains: they build their restaurants where they know people will be travelling, along main roads. The physical roads are put there most