Q: We’re reading more and more about AI, IVA, neural networks, machine learning and, most recently, deep learning. What is deep learning when it comes to video surveillance systems? Is it any different from these other terms, or are they all part of the same thing?
A: This is a good question and a large one, and in considerable part it remains unanswered. Labelling in the broader area of AI tends to be ambiguous and opaque. In short, artificial intelligence is any intelligence other than natural intelligence. Machine intelligence springs to mind, though I can never help feeling AI should include the biochemical functioning of organic cells.
When it comes to machine learning, we’re talking about using algorithms to assign data to categories and then using those categories to decide how likely it is that a pre-programmed assumption about the material world is true. Machine learning is code heavy and depends on complex algorithms, as well as on the data gathered, to make decisions.
The concept of neural networks is old. Input data is broken up into very small components and passed through layers, where it is weighed against expected thresholds to establish a probability vector: a set of scores describing how likely the input is to be one thing or another. In the security industry, that thing might be a face or a license plate. However, for a long time, there simply wasn’t enough computer power available to deliver on the promise of neural networks. It took the advent of GPUs to give us that.
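For readers who like to see the idea in code, here is a minimal sketch of a two-layer network turning a handful of input numbers into a probability vector. Everything in it is an assumption made for illustration: the three labels, the four input features and all of the weights are invented by hand, whereas a real system would learn millions of weights from data.

```python
import math

# Hypothetical categories a surveillance system might score an image against.
LABELS = ["face", "license plate", "background"]

# Hand-picked, made-up weights; a real network learns these during training.
HIDDEN_W = [[1.0, -1.0, 0.5, 0.0], [-0.5, 1.0, 0.0, 1.0]]
HIDDEN_B = [0.0, 0.1]
OUTPUT_W = [[2.0, -1.0], [-1.0, 2.0], [0.0, 0.0]]
OUTPUT_B = [0.0, 0.0, 0.5]


def dense(inputs, weights, biases):
    """One layer: each row of weights yields one weighted sum plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Threshold step: anything below zero is switched off."""
    return [max(0.0, v) for v in values]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the peak keeps exp() numerically stable
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features):
    """Pass the input through both layers and return the probability vector."""
    hidden = relu(dense(features, HIDDEN_W, HIDDEN_B))
    return softmax(dense(hidden, OUTPUT_W, OUTPUT_B))


# Four invented feature values standing in for a small patch of an image.
probabilities = classify([0.9, 0.1, 0.4, 0.2])
best = LABELS[probabilities.index(max(probabilities))]

for label, p in zip(LABELS, probabilities):
    print(f"{label}: {p:.2f}")
print("Most likely:", best)
```

The output is one probability per label, and the system reports the label with the highest score. Deep networks follow the same pattern with many more layers and far larger inputs.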
Deep learning is a label for the latest systems that deliver this sort of neural capability by parsing monstrous amounts of data to tune the network’s internal weights until they are expert at identifying patterns across many combinations of data inputs. The more data they process, the more expert they become. It’s worth reading about Andrew Ng’s work on the Google Brain project if you’re keen to know more about deep learning. The project began in 2011, and in 2012 Ng and his team reported running deep learning algorithms across 16,000 CPU cores and having them plough through 10 million images taken from YouTube videos. The result was a system with the self-taught ability to recognise cats.
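The tuning process can be sketched on a very small scale. The example below is not a deep network and has nothing to do with the Google Brain code: it trains a single artificial neuron by gradient descent on six invented samples, purely to show the error shrinking as the weights are adjusted over repeated passes through the data. The sample values, the learning rate and the function names are all assumptions made for illustration.

```python
import math

# Hypothetical training set: two feature values per sample, label 1 or 0.
SAMPLES = [
    ((0.9, 0.8), 1), ((0.8, 0.6), 1), ((0.7, 0.9), 1),
    ((0.2, 0.1), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0),
]


def sigmoid(x):
    """Squash any number into the range 0..1 so it reads as a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, features):
    """Probability that the sample belongs to category 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def average_loss(weights, bias):
    """Cross-entropy loss: lower means predictions are closer to the labels."""
    total = 0.0
    for features, label in SAMPLES:
        p = predict(weights, bias, features)
        total -= label * math.log(p) + (1 - label) * math.log(1.0 - p)
    return total / len(SAMPLES)


def train(passes, rate=0.5):
    """Start from zero weights and nudge them after every sample."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(passes):
        for features, label in SAMPLES:
            error = predict(weights, bias, features) - label
            weights = [w - rate * error * x for w, x in zip(weights, features)]
            bias -= rate * error
    return weights, bias


for passes in (0, 10, 200):
    weights, bias = train(passes)
    print(f"after {passes} passes, average loss = {average_loss(weights, bias):.3f}")
```

Each pass nudges the weights in the direction that reduces the error on the samples it has just seen. Deep learning systems apply the same principle to vastly more weights and vastly more data.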
More recently, AlphaGo reached expert level and beyond in large part by playing Go against itself, over and over and over. Although we are in very early days, it’s hard not to foresee deep learning-based electronic security solutions (and smart city solutions) that become better and better at offering situational awareness the longer they are in operation. Further, such intelligence could be shared between systems, and passed from one system to the next during system expansions and upgrades.
Regardless of how fully current security solutions deliver on the still nebulous notion of deep learning, it seems certain this technology will lead to the ongoing development of systems that are consistently better than humans at recognising things in the material world. Imagine a surveillance solution that could teach itself to recognise faces, gender, gait and mood, and to recognise events that fall outside its vast, collective experience: many people running, gunshots, chemical signatures exceeding background thresholds, vehicles where they should not be, outbreaks of fire, groups of people in conflict, traffic accidents, medical emergencies, or any variable deliverable by any conceivable sensor input. The power of such a solution is best encapsulated by that big little word, ‘proactive’.