The Future of Crime in South Africa
Is big data the answer or will it take us into 1984?
An interview with Tristan Bergh on the future of crime at a local level raises questions on the realities of using big data and what changes to expect.
The story of John Snow
There is a famous story about a London physician called John Snow. This John Snow actually did know something. In 1854 there was a massive outbreak of cholera, with people dying in large numbers every day – there seemed to be no end in sight.
John Snow mapped out where the cases were occurring and, in a stroke of genius, thought to add water source points to the map. The map pointed to an unexpected answer: he traced the source of the outbreak to a particular water pump. By having the pump taken out of use, he helped bring the outbreak to an end. His insight led to advancements in waste and water management in London and marked the beginning of modern epidemiology (many believe that this was also the beginning of data visualization).
John Snow happened to ask the right question at the right time. Data revealed a link between two seemingly unrelated things, spurring a major leap forward in data analysis. Today, John Snow's tactics of causal discovery are used every day in big data and analytics, especially in one area that affects everyone: crime.
Safety in Numbers
For crime intelligence, it’s necessary to predict the unlawful activity before it happens. While scenes from Minority Report may come to mind, in reality there is less Tom Cruise and a lot more graphs and data sets.
A few years ago, the Los Angeles Police Department looked into adapting an algorithm that was originally designed to predict the aftereffects of earthquakes. After an earthquake, data sets could be run through the algorithm to predict where the aftereffects would have an impact, so that people could prepare accordingly.
The LAPD tweaked it a little and ran crime statistics through it with the results “contributing to a reduction of 33% in burglaries, 21% in violent crime, and 12% in property crime.” Since then there has been a spread of interest surrounding what big data can do to make everyone safer.
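The article does not say which algorithm the LAPD adapted. As a hedged illustration only, the idea that one event temporarily raises the chance of further events nearby (as an earthquake does with its aftereffects) can be sketched as a toy "self-exciting" rate model. The function name, parameter values and event times below are all assumptions made up for this sketch, not figures from the LAPD or any vendor.

```python
import math

def intensity(t, past_events, mu=0.2, alpha=0.5, beta=1.0):
    """Toy expected event rate at time t (in days).

    mu    : assumed background rate (events per day)
    alpha : assumed extra events triggered by each past event
    beta  : assumed decay rate of that extra risk (per day)
    """
    # Each earlier event adds a boost that fades exponentially with time.
    boost = sum(
        alpha * beta * math.exp(-beta * (t - s))
        for s in past_events
        if s < t
    )
    return mu + boost

# Hypothetical burglary times (days since the start of observation).
burglaries = [1.0, 1.5, 4.0]

print(intensity(2.0, burglaries))   # shortly after two events: elevated rate
print(intensity(10.0, burglaries))  # long after the last event: near background
```

In this sketch the rate is highest just after a cluster of events and decays back towards the background level, which is the property that would make such a model useful for deciding where to patrol next.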
All sorts of algorithms are now being developed to predict and analyse crime using big data, and from different angles too. For example, software developed for U.S. parole boards uses 24 variables to judge the likelihood of recidivism; after the software was applied, the recidivism rate reportedly dropped by 15%.
Local use of crime analytics
This is all very exciting, and it’s great to know what is happening elsewhere but what is happening on a local level? What is the existing infrastructure for data collection in the South African Police Service?
Tristan Bergh, director of Ixio Analytics, explains that the problem is multidimensional. He provides insight into different aspects:
“Our biggest challenge isn’t being able to predict, it’s being able to merge collaborating data sets.”
(1) Equipment to Record Data
“Crime wise in our country there is a lot of data, but it’s not being captured in a digital form. SAPS’s sophisticated officer-based mechanisms allow for great tactical responses and ability to deploy resources. There is a lot of local knowledge, but very little is automated or digitised. This is a massive challenge because getting an officer outfitted with kit to record data is almost a military spec requirement – the costs are extremely high and deployment can take a very long time. I am not aware of existing on-the-ground tactical criminal reporting electronics in South Africa or what the sophistication levels are. In terms of running a campaign, in order to predict, the model accuracy improves the better the data you get. We need digitised logs at police stations on all activities.”
(2) Type of Data Recorded
“Is the house North facing? Electric fencing? Dogs? High grass? Proximity to well-lit major thoroughfare? etc. That type of stuff starts getting into the GIS (Geographic Information Systems) location-based arena. Then you look at subtleties like access to good escape routes and we can digitise that as a flag. Each adds variable after variable after variable; the more of those we get the better our models are able to extract meaningful correlations. The lowest thresholds are hundreds of cases with 5 or 6 variables per case. Then you might be able to start looking at things, a few thousand data points are ideal. Also if we get number of crimes in a month we’ll only be able to predict within a one month window and that’s not particularly useful. If we get crime event statistics down to the minute in a particular area we can start getting some pretty sophisticated event predictions – which will be valuable for patrolling and presence management.”
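To make the point about variables and time granularity concrete, here is a minimal sketch. The record layout, field names and timestamps are hypothetical, invented for illustration; they are not a SAPS or Ixio Analytics schema. It shows one case stored with a handful of yes/no flags, and how the same events look when counted per month versus per hour.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime

@dataclass
class CaseRecord:
    # Hypothetical fields, loosely following the variables named in the interview.
    occurred_at: datetime
    north_facing: bool
    electric_fence: bool
    dogs: bool
    high_grass: bool
    near_lit_thoroughfare: bool
    good_escape_route: bool

def count_by(records, fmt):
    """Count cases per time bucket; fmt is a strftime pattern defining the bucket."""
    return Counter(r.occurred_at.strftime(fmt) for r in records)

# Made-up example data, not real crime statistics.
records = [
    CaseRecord(datetime(2014, 11, 3, 2, 15), True, False, False, True, False, True),
    CaseRecord(datetime(2014, 11, 3, 2, 40), False, False, True, True, False, True),
    CaseRecord(datetime(2014, 11, 17, 21, 5), True, True, False, False, True, False),
]

monthly = count_by(records, "%Y-%m")        # coarse: one number for the month
hourly = count_by(records, "%Y-%m-%d %H")   # fine: shows a cluster around 02:00

print(monthly)
print(hourly)
```

The monthly view collapses everything into a single count, while the hourly view keeps the timing pattern that a patrol schedule would need.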
(3) Restrictions / risks of releasing data and who can analyse it
“There are difficulties getting hold of the crime stats and there are many reasons for that: public panic, homeowners will see property prices plummet. In terms of safety and security of their officers, they would not want to release operational procedures. It’s quite a sensitive area. They don’t want people to know where the data stores are or what’s in those data stores, as there’s a very real threat of hacking and so on. Only if it’s treated with the correct sensitivity, and long disclosure agreements, is there a possibility of gaining access to data within the SAPS firewall. If this were to happen, I believe there is a strong chance for a positive impact. A strong reason not to give access is vigilantism, but on the other hand if we work closely with SAPS on pilots and so on we could get some very good data without compromising their operations.”
For the public, releasing this data could have detrimental effects on safety and property values, although others may welcome the security benefits. It may also help to improve the perception of policing. This tension between the benefits and disadvantages of releasing data is only part of the problem, which becomes clearer when one looks at the future of data collection through Tristan’s lens.
The future of policing with real-time data collection according to Tristan Bergh
“A Google Glass-equipped police officer is alerted that there is a likelihood of an event. As they move in, software architecture would shift from a strategic to a more tactical overlay on the Google Glass: algorithms find high stress levels based on body posture and movement using real-time image capture. The officers would look through the crowd and, with people highlighted on the glass, would go in with a team of officers depending on the predicted threat level. They might apprehend the suspects …
This is a very viable solution. Of course after reading 1984 this type of thing scares me, however, having worked on the inside I can predict that it’s a likelihood. Possibly within the next 3 to 5 years, somewhere in the world.”
The crooks are always ahead of the curve
Another aspect which makes the implementation of big data tricky in predictive policing is that criminals tend to be innovative types of people. A natural side-effect of effective business analysis in this space, rather than the desired moral improvement, is that savvy crooks will adapt their strategies in response to policing tactics.
“If mechanisms are working, your predictions will change and you have to keep adapting them, because you’re causing the models to change.” Tristan and a team worked on a biometrics solution for HANIS (Home Affairs National Identification System) and “the day after release, all crime to do with manipulation of the ID card form stopped. Went to zero. All criminal efforts now went to bribery and corruption of the legal process. When a particularly powerful mechanism is implemented it tends to shift problems to different vulnerabilities.”
The move toward big brother
The reality is that for prediction to be truly effective, it will require data collection on the scale of George Orwell’s 1984. For some there is no issue with living in a sensor-filled world. All of our information will become correlations for other research, and increasingly our personal information will be up for sale. There are of course benefits: the drop in crime in areas that have implemented software such as Datameer and PredPol is undoubtedly inspiring, and many are up for the challenge.
Those interested in the raw figures of criminal cases in South Africa can view them on the SAPS website.
About Tristan Bergh:
Tristan was recently a speaker at Mammoth BI’s pilot conference on November 17th.
He studied aeronautical engineering and was responsible for data processing and for temperature measurements on the XDM version of the Rooivalk attack helicopter. He was also the chief architect on the South African HANIS, securing ID documents and biometrics. He was involved in the implementation of the world’s first combined civil and criminal AFIS for Namibia as well as proposing evidence on firearm management systems for the SA Police Services. Today, he and his wife Megan Bergh run Ixio Analytics to deliver actionable insights that have unlocked millions of rands for their clients.