“Yeah. I’ve got a couple of companies.”
Tom Dalglish comes across like an absolute boss. Vibe like a club bouncer, but with a heavyweight brain.
“I’m a data guy.” We gloss over his career – all the big names in finance (Bear Stearns, JP Morgan, Merrill...). Now he exists on various councils and boards and so on. You should probably connect with him.
He tells me about applied innovation labs in various banks, namely how traditional enterprises are moving their data towards “the cool stuff like analytics, financial crime, automation, machine learning, intelligent cognitive technology automation... all that nonsense!”
Companies are sitting on so much data that they’re now obsessed with finding ways of monetising it, often as a means of keeping up with their competition. To my mind, this race doesn’t end well for consumers, but it also plants legal landmines for businesses (and professionals?) at every step. It’s not just businesses that might struggle with the legal consequences; the legal system itself is struggling. For the law to be trustworthy it must be deliberate, not rushed; yet it isn’t moving fast enough to keep pace with innovation, and the backlog of grey-area lawsuits will only continue to pile up.
At the time of speaking, Tom was freelancing to help a company evolve from being a throughput service (i.e. where data comes in, data is added and repackaged as a new data product) into a generative service - where it owns its own data and delivers microservices as the output.
His view is that these companies “need people to actually do the work” and that, as a result, “technical ability is the money making end of data management”.
Massive companies are moving fast to own as much data as possible, the new oil to sit on, yet as a security expert he’s also tuned in to the risks this speedy scrambling could present.
“We’ve gone from one big machine, to many little machines doing lots of different things – the security landscape is MUCH harder.” He tells me that many firms have so much going on, so many data sources and confused data pipelines, that these companies often don’t know where their data originated or what legal liabilities are attached to it, because they don’t know how to use or store it sensibly.
“Go to a supermarket,” he chuckles, “they can tell you exactly which chickens went into your sandwich,” sometimes you even get a name, “...but we can’t do that with data?!”
It’s madness. I’ve worked for a few firms and data management has, at best, been chaotic. It needs everyone in the business to follow the same rule book religiously, to use correct naming conventions, to store data correctly - even simple things like saving a file in the correct place. I’ve personally seen a flurry of machine learning solutions that provide an internal search engine because company files are so hard to navigate. It’s a band-aid economy! Staff have their own ways of doing things, they follow their own logic and, of course, staff leave. Like with our medical system, companies are just patching up symptoms.
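To make the chicken-sandwich comparison concrete, here is a minimal Python sketch of dataset lineage: every derived dataset keeps a pointer to the inputs it was built from, so its raw sources (and the legal terms attached to them) can be traced back. The `Dataset` class, the `trace_origins` helper and the example names and licence labels are all hypothetical, made up for illustration; real lineage tooling is far more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    """A dataset plus just enough provenance to answer 'where did this come from?'."""
    name: str
    licence: str                          # hypothetical label for the legal terms attached
    parents: tuple["Dataset", ...] = ()   # datasets this one was derived from


def trace_origins(dataset: Dataset) -> list[Dataset]:
    """Walk back through the parents and return the raw sources (datasets with no parents)."""
    if not dataset.parents:
        return [dataset]
    origins: list[Dataset] = []
    for parent in dataset.parents:
        for origin in trace_origins(parent):
            if origin not in origins:     # keep each source once, in first-seen order
                origins.append(origin)
    return origins


# Illustrative data only: two raw sources feeding a joined table, then a report.
crm = Dataset("crm_export", licence="customer consent")
vendor = Dataset("vendor_feed", licence="third-party contract")
joined = Dataset("joined_customers", licence="mixed", parents=(crm, vendor))
report = Dataset("churn_report", licence="mixed", parents=(joined,))

for source in trace_origins(report):
    print(source.name, "-", source.licence)
```

The point of the sketch is the discipline rather than the code: the trace only works if every step records its parents, which is exactly the rule-book problem described above.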
His exasperation is clear. Talk to me, Tom.
“The amount of data there is in the market is insane. Scary numbers… Did you know the average car generates a terabyte of data per day – who the hell owns that? What happens when you sell your car - does the data transfer to the new owner? Can they access it? Think about how much information your car has about you, where you live, where you work, where you shop... Do we really need to measure everything?! The old VW Beetles had four wires. Now your car is spying on you. What happens if old driver habits affect your insurance?
“Python, Scala, TensorFlow, all that sh*t they’re giving away for free [open source]; there’s a lot of risk and I won’t be too popular for saying that… but it’s true. It presents a colossal security risk. Just look at what’s happening with Russia, the global political scene is so unstable and we’re practically showing under the hoods of our information architecture. As open source gains further traction, it will begin to creep into high level security - can you imagine a bank using open source technology?
“The absolute insanity of data lakes. The inappropriateness of injecting absolutely everything - when most of it isn’t even used - presents a huge security risk and often for little value.
“The inability to select from options in the market, there are just way too many to choose from. There are six products for every solution.
“Most fatal, it’s really difficult to find people with the skill to do all this stuff, and businesses can’t afford them. This is twisting the model and leads to offshoring, which creates economic problems.
“We don’t have a tech problem, we have a management problem.
"Sifting through the morass of ‘what should I use’ – this should be talked about more."
A place to compare and coordinate perspectives?
Welcome to Machine Commons, Mr Dalglish.
He hammers in a great point on governance.
“Contrasting traditional EDM style data management with the new age of data analytics, they’re just different things. Last year we hit peak EDM and we’ll see more companies move towards real-time data management initiatives. Like Splunk, Elastic.”
“It’s all well and good but no governance, no analytics. You can’t get good analytics without the governance.”
So there’s actually a financial incentive for strong data governance. I suppose that’s a good thing, but has the realisation of this incentive come too late? Wait, have management teams realised this?!
As Tom rightly points out, we’ve already enjoyed “five star crashes of massive companies making colossal errors” yet I think by far the worst is yet to come. We’ve seen hacks and negligent data management but, as I discuss with Patrick Hall, what happens when computation takes over more and more decision making - decisions usually the purview of human beings - and starts condemning people’s futures?
What happens when people can’t get credit because ‘computer said no’ and the entity responsible has failed in their data governance, failing to provide a sensible reason why? The legal system lags so far behind this technology that people could suffer decades of machine made decisions with no consequence to the offending companies. By the time there’s a consequence, the system will have evolved - the profitability structure of firms will have changed to take advantage of the new ML decision making efficiencies - and there will be no going back.
As Patrick said, “the toothpaste can’t go back in the tube!”
Tom’s like a steamroller. “We know banks are trying to determine what the likelihood of your future buying behaviour is: hey take this, here’s a coupon, buy through our supplier.”
Here’s what I find troubling. Think about fast food chains: they build their restaurants where they know people will be travelling, along main roads. The physical roads are put there mostly for non-profit reasons, for societal convenience. Now our journeys are moving online and there are no ‘roads’ as such to be laid down with taxpayer dollars, but there are still virtual pathways we all take and these virtual pathways can be taken advantage of by private companies in a very similar way to a McDonald’s on the side of the road. The problem is that these virtual pathways are being built by private companies.
In other words, the future of [virtual] infrastructure is private.
Most people are unfamiliar with the digital media landscape but you should know this: even when you’re not on (e.g.) Facebook you’re still likely being tracked by them, and even if you don’t have a Facebook account you’re still being tracked as a ‘shadow profile’. When you click links, you feed the last URL you were at to the new site, among many other data points, and ‘tracker sites’ stitch all this information together retroactively to map - and sell - your digital road trips, your online commutes.
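As a rough illustration of that stitching, the Python sketch below takes a handful of page requests, each carrying the URL it was reached from (what browsers send in the HTTP `Referer` header), and chains them into one journey. The log entries, the `stitch_journey` function and the example URLs are invented for illustration, and the sketch assumes a single visitor following a single path; real trackers combine many more signals, such as cookies and device fingerprints.

```python
def stitch_journey(requests: list[dict]) -> list[str]:
    """Chain page requests into one path by following referrer -> url links."""
    # Map each referring page to the page that was reached from it.
    next_page = {r["referrer"]: r["url"] for r in requests}
    destinations = set(next_page.values())

    # The journey starts at a referrer that nobody arrived at within this log.
    starts = [ref for ref in next_page if ref not in destinations]
    if not starts:
        return []

    journey = [starts[0]]
    while journey[-1] in next_page:
        following = next_page[journey[-1]]
        if following in journey:      # guard against loops in the log
            break
        journey.append(following)
    return journey


# Illustrative log, deliberately out of order, as a tracker might receive it.
log = [
    {"url": "https://shop.example/shoes", "referrer": "https://news.example/article"},
    {"url": "https://news.example/article", "referrer": "https://search.example/?q=trainers"},
    {"url": "https://shop.example/checkout", "referrer": "https://shop.example/shoes"},
]

for step in stitch_journey(log):
    print(step)
```

No single site in that log sees the whole trip; it only appears once someone collects the pieces and joins them up.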
“There will be a flurry of activity. Some data services will be cool. Some will fail. Combine this with the move to the cloud and many companies will overspend, burning out due to lack of governance.”
I suppose the good thing is that the market will weed out companies failing to manage data correctly, but the road there will be rocky. I see a future where companies apologise for decades of data breaches they weren’t even aware of, or, worse, were aware of but turned a blind eye to.
Our conversation has naturally turned to Orwell’s 1984 theme of total governance. I ask if he thinks this is where we’re going.
“We’re already there brother!”
“They may not be linking it all together yet, but think about CCTV - they know where you’re walking, where you drive, when you take public transport, which coffee you get. All that information is out there.”
Is Orwell’s future truly only a matter of time?
I ask him if there would be a tipping point or if it would be a slow and steady, insidious thing.
“Oooooorrrrhhhh, good question. It comes down to what is the tipping point? When there’s a guaranteed ROI? Maybe profitability will be the deciding factor.”
He tells me about a BBC show on AI he watched, “Digital Twin”, which featured a new factory entirely occupied by robots: 30K different electric motors being built through CAD/CAM - “robotics build it!” That side of automation is crazy. Consider 3D printing in general - “we’re printing organs now right?!”
“What does it do to society?!”
Referencing algorithms and the grooves that we all sink into, such as echo chambers online or the confirmation bias that will keep ‘untrustworthy people’ (i.e. without credit ratings) untrustworthy, he posits “at what point do we lose our own free will?”
“It’s random chance the first time, but when does random become predictive? When does reciting become dictating?”
I couldn’t agree more. If you haven’t already, you should read Yuval Noah Harari’s Sapiens and Homo Deus. He argues that all life (yes, including humans) is practically a set of biological algorithms. If we make our world, our economy, algorithmic and we have throughput between us and our systems, then at what point are we just part of the greater algorithm? Where is the free will? #SamHarris
It’s a scary thought. I ask Tom what he would do to change the dangerous course of all this.
“Wouldn’t let them have access to my data! If you’re not paying for the product, you are the product. I know when they’re stealing my data but my kids don’t know.”
Kids don’t know when they’re the product.
It’s like Gmail. Google indexes everything in your inbox (yes also the content of the emails) in order to cluster you into a demographic or attach a buying signal to you for advertisers to buy against. What gives them the right to scan your email?
“I didn’t give you permission to do that....oh, no, wait, oh yeah I did.”
“Like the 50 page standard legal terms and conditions that everyone agrees to. Yeah of course I read that, tick that box, but no one read it! And they say things like oh btw every photo you upload we can do whatever we want with it.”
“I think it’s our job to make it easy to access entitled data (as a data purveyor) however, for example, if you leave your data payload (i.e. your ‘keys’ to it) on a plane then no one else should be able to see that.”
“Digital identity will get big. Such as the plan by the UK government, on behalf of a collective of organisations for employer data, which they pulled out of. Then Mastercard stepped in. It’s a fingerprint on a computer. Big questions like where is the data stored, on the drive or on the network? How long if it’s on the drive? Do you have to log in to the network every time you boot up? If not, for convenience, then if you get fired then how long before that percolates to the hard drive?”
“The event horizon is what we do when some agent hacks a global network and nothing works anymore. Russia is doing that exercise, disconnecting Russia from the global internet. It’s a military, aggressive thing. Cables are laid. They’ve probably already planted the [explosive] charges on our cables.”
“They cleared a charge in the UK channel. It’s well known this happens!”
“Bad things will happen. That’s for sure. Or that tragedy with the Boeing 737 MAX. Shitty hardware and a software bug. Who wrote that software?!”
“It’s never been a more exciting time to be in data.”