NZ Police publish every recorded victimisation in the country — over a million records — and most people have no idea.
I stumbled across policedata.nz a while back and was genuinely surprised by how much is there. Every reported theft, assault, burglary, robbery — broken down by location, time of day, day of week, and month. All the way down to meshblock level, which is roughly a city block. Updated monthly. Creative Commons licensed. You can just... use it.
So naturally I started wondering — what happens if you point deep learning at this?
A million rows of crime
The dataset I pulled covers February 2022 through January 2026. Four years, 1,154,102 records across the whole country. The breakdown is roughly what you'd expect: theft dominates at 66%, followed by burglary at 21% and assault at 10%. The remaining sliver covers robbery, sexual offences, and harm/endangerment.
What makes it interesting for modelling is the spatial granularity. Each record maps to one of 42,778 meshblocks — tiny geographic units defined by Stats NZ. That's detailed enough to see patterns at a neighbourhood level, not just "Auckland has more crime than Tauranga" (which, yeah, obviously).
Auckland alone accounts for about 36% of all recorded crime. Then there's a long tail — Wellington, Christchurch, Hamilton, and then it drops off fast. NZ's urban geography is weird like that. One mega-city and a bunch of mid-size towns.
The idea
The core question is pretty simple: given the crime patterns of the last few months, can we predict what the next month looks like?
This isn't Minority Report. Nobody's getting arrested for crimes they haven't committed. It's pattern recognition on publicly available statistics — the same kind of modelling people do with weather data or traffic flows.
The neat trick is how you frame it. If you overlay a grid on a city — say 500m by 500m cells — and count crimes per cell per month, you get something that looks a lot like a video. Each month is a frame. Each cell is a pixel. The brightness is the crime count.
Predicting next month's crime becomes a video prediction problem. And there are some really cool deep learning architectures built exactly for that.
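To make the video framing concrete, here's a minimal sketch of the counting step using toy data and hypothetical column names (the real schema arrives in Part 1): bin each record into a grid cell, then stack one count matrix per month into a 3D array.

```python
import numpy as np
import pandas as pd

# Toy stand-in for real records: easting/northing in metres plus a month index.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.uniform(0, 2000, 500),     # easting within the study area
    "y": rng.uniform(0, 1500, 500),     # northing within the study area
    "month": rng.integers(0, 6, 500),   # six monthly "frames"
})

CELL = 500  # grid cell size in metres
n_cols, n_rows = 2000 // CELL, 1500 // CELL

# Map each record to a grid cell.
df["col"] = (df["x"] // CELL).astype(int).clip(0, n_cols - 1)
df["row"] = (df["y"] // CELL).astype(int).clip(0, n_rows - 1)

# The "video": one frame per month, one pixel per cell, brightness = count.
frames = np.zeros((6, n_rows, n_cols))
for (m, r, c), n in df.groupby(["month", "row", "col"]).size().items():
    frames[m, r, c] = n

assert frames.sum() == len(df)  # every record lands in exactly one pixel
```

The Auckland version is the same idea with real coordinates and a roughly 60 x 80 grid; everything downstream consumes this tensor.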
ConvLSTM and ST-ResNet
The two models I'm building are ConvLSTM and ST-ResNet. Don't worry if those sound like gibberish — the short version is they're neural networks designed to learn patterns that are both spatial (where things cluster) and temporal (how those clusters change over time).
ConvLSTM is the primary model. A standard LSTM network is great at learning sequences — it's the architecture behind a lot of language and time-series models. ConvLSTM swaps out the matrix multiplications for convolutions, which means it can process grid-structured data. Feed it the last six months of crime grids and it learns both the shape of hotspots and how they evolve. Recent research has shown these work well for crime forecasting across multiple US cities.
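To show what "swaps out the matrix multiplications for convolutions" actually means, here's a single ConvLSTM step written in plain NumPy. This is a teaching sketch (one channel, no biases, no training loop), not the model I'll train, but the gate structure is the real thing: a standard LSTM cell where every weight multiplication becomes a convolution over the grid.

```python
import numpy as np

def conv2d(x, k):
    # "Same"-padded 2D cross-correlation via sliding windows.
    pad = k.shape[0] // 2
    xp = np.pad(x, pad)
    windows = np.lib.stride_tricks.sliding_window_view(xp, k.shape)
    return np.einsum("ijkl,kl->ij", windows, k)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, W):
    """One ConvLSTM time step on a single-channel grid."""
    i = sigmoid(conv2d(x, W["xi"]) + conv2d(h, W["hi"]))  # input gate
    f = sigmoid(conv2d(x, W["xf"]) + conv2d(h, W["hf"]))  # forget gate
    o = sigmoid(conv2d(x, W["xo"]) + conv2d(h, W["ho"]))  # output gate
    g = np.tanh(conv2d(x, W["xg"]) + conv2d(h, W["hg"]))  # candidate update
    c = f * c + i * g        # cell state: a running spatial memory
    h = o * np.tanh(c)       # hidden state: one value per grid cell
    return h, c

# Feed six monthly 8x8 "frames" through the cell with random 3x3 kernels.
rng = np.random.default_rng(1)
W = {k: rng.normal(scale=0.1, size=(3, 3))
     for k in ["xi", "hi", "xf", "hf", "xo", "ho", "xg", "hg"]}
h = c = np.zeros((8, 8))
for frame in rng.poisson(2.0, size=(6, 8, 8)).astype(float):
    h, c = convlstm_step(frame, h, c, W)
```

Because the gates are convolutions, each cell's state depends on its neighbours' recent history, which is exactly what you want for hotspots that drift or spread.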
ST-ResNet takes a different angle. Instead of one sequential view, it captures three temporal perspectives — what happened recently, what happened at the same time last year, and what's the long-term trend. Each gets its own branch of residual convolutional networks, and a learned fusion layer combines them. The original paper was for crowd flow prediction in Beijing, but the architecture translates well to crime data.
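The fusion step is the distinctive part, so here's a sketch of it in NumPy. The three arrays stand in for the outputs of the three residual branches (in the real model these come from stacks of conv layers); the per-cell weight maps are what gets learned, so the model can decide, cell by cell, how much the recent past matters versus the yearly pattern versus the trend.

```python
import numpy as np

rng = np.random.default_rng(2)
H, Wd = 8, 8

# Stand-ins for the three branch outputs:
X_close = rng.normal(size=(H, Wd))   # closeness: recent months
X_period = rng.normal(size=(H, Wd))  # periodicity: same month in past years
X_trend = rng.normal(size=(H, Wd))   # long-term trend

# Learned per-cell fusion weights, one map per branch.
W_c = rng.uniform(size=(H, Wd))
W_p = rng.uniform(size=(H, Wd))
W_q = rng.uniform(size=(H, Wd))

# Parametric fusion: elementwise-weight each branch, sum, squash to (-1, 1).
X_hat = np.tanh(W_c * X_close + W_p * X_period + W_q * X_trend)
```

The tanh keeps the fused prediction in a normalised range, which pairs with scaling the crime counts to (-1, 1) before training.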
Why NZ?
Here's the thing — almost all published crime prediction research uses US data. Chicago, Los Angeles, New York. A systematic review of spatial crime forecasting makes this pretty clear. The models are well-studied, but they're trained on American cities with American urban patterns.
New Zealand doesn't look like that. Our cities are smaller, more spread out, and the distribution is completely different. Auckland is genuinely dominant in a way that no single US city is relative to the rest of the country. The spatial patterns here are their own thing, and I couldn't find anyone who'd applied these deep learning approaches to NZ data.
That's what got me keen. Not because I think I'll beat the published benchmarks — those researchers have GPUs and PhD students and I have a Ryzen 5 desktop with no graphics card. But because applying known techniques to new geography is genuinely useful, and nobody else seems to have done it.
No GPU, no problem (mostly)
All of this runs on my desktop — an AMD Ryzen 5 5600GT with 12 threads and 30GB of RAM. No GPU at all. That sounds limiting, but the Auckland 500m grid works out to about 60 by 80 cells. The ConvLSTM model ends up around 5 million parameters, which trains in under an hour on CPU. You don't always need a beefy rig.
It does mean being smart about model sizing and not going crazy with hyperparameter searches. But for a hobby project, it's more than enough.
What's coming
This is the first post in a ten-part series covering the whole project end to end.
Part 1: Data Acquisition and Exploration — We start with a 503MB CSV file from NZ Police that's UTF-16 encoded (because of course it is), has trailing periods on area names, and 32% of records with unknown hour-of-day. We'll wrangle it into a clean, typed Parquet file and get our first look at what's actually in there.
Part 2: Geographic Data Pipeline — Crime records come with meshblock IDs, but no coordinates. We'll join them to Stats NZ geographic boundary files using geopandas, giving every record a place on the map.
Part 3: Spatiotemporal Grid Construction — This is where it gets fun. We overlay a 500m by 500m grid on Auckland, count crimes per cell per month, and build the 4D tensors that feed the neural networks. Crime prediction becomes video prediction.
Part 4: Exploratory Data Analysis — Before throwing deep learning at anything, we need to understand what patterns actually exist. When does crime peak? Where does it cluster? How do different crime types behave differently?
Part 5: Baseline Models — Simple benchmarks — historical averages, naive persistence — so we know whether the deep learning is actually adding value or just being fancy for the sake of it.
Part 6: ConvLSTM Architecture — Building and training the primary model. Three ConvLSTM layers, six-month lookback window, learning spatial hotspots and temporal dynamics simultaneously.
Part 7: ST-ResNet Architecture — The three-branch alternative that captures closeness, periodicity, and long-term trend separately, then fuses them with learned weights.
Part 8: Model Evaluation and Comparison — Which model wins? By how much? And more importantly — where do they fail?
Part 9: Building the Dashboard — A 3D interactive map built with deck.gl where you can watch crime patterns evolve over time. Dark theme, extruded columns, time-lapse playback.
Part 10: Deployment and Reflections — Shipping to Vercel, what worked, what didn't, and what I'd do differently next time.
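As a small taste of Part 1, the encoding fix boils down to something like this. The column names and values here are invented for illustration; the real file's schema is covered in the post itself.

```python
import pandas as pd
from io import BytesIO

# Simulate the source file's quirks: UTF-16 encoding and trailing
# periods on area names (column names here are hypothetical).
raw = "Area Unit,Offence Type\nPonsonby East.,Theft\n".encode("utf-16")

df = pd.read_csv(BytesIO(raw), encoding="utf-16")
df["Area Unit"] = df["Area Unit"].str.rstrip(".")  # strip trailing periods

print(df.loc[0, "Area Unit"])  # Ponsonby East
# From here: fix dtypes, then write Parquet (df.to_parquet) for fast reloads.
```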
Every post will include code and real results. The whole codebase will be open source. And I'll be upfront about the stuff that didn't work — and trust me, there's plenty of it.
This is a hobby project. It's not a policing tool, it's not a product, and it's definitely not claiming to solve crime. It's just me being curious about what's sitting in a publicly available dataset and seeing how far you can push it with some Python and a bit of patience.