Our world is photographed zillions of times a day. Machine learning can look through those images, and possibly find what you’re looking for.
Whether gathered by satellites or stringers, the world’s images are too many to look at. As much as I’d like the ability to say, “find me pictures that are memes about George Soros,” I don’t know if that’s possible yet.
But “here are some pictures of guns; find me more” might be – as long as you’re comfortable with a bunch of false positives (pictures of things the computer thinks are guns, but aren’t) and false negatives (pictures of guns that get missed).
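To make those two kinds of error concrete, here is a minimal sketch that counts them for a handful of images. The labels are invented for illustration, and `count_errors` is a hypothetical helper, not part of any tool mentioned here.

```python
def count_errors(predicted, actual):
    """Tally how a detector's guesses compare with what is really in each image."""
    counts = {"true_positive": 0, "false_positive": 0,
              "false_negative": 0, "true_negative": 0}
    for guess, truth in zip(predicted, actual):
        if guess and truth:
            counts["true_positive"] += 1    # a gun, correctly flagged
        elif guess and not truth:
            counts["false_positive"] += 1   # flagged, but not a gun
        elif not guess and truth:
            counts["false_negative"] += 1   # a gun that got missed
        else:
            counts["true_negative"] += 1    # not a gun, correctly ignored
    return counts


# Invented example: six images, three of which really show a gun.
actual = [True, True, True, False, False, False]
predicted = [True, True, False, True, False, False]

counts = count_errors(predicted, actual)

# Precision: of everything flagged, how much was right?
precision = counts["true_positive"] / (
    counts["true_positive"] + counts["false_positive"])
# Recall: of all the real guns, how many were found?
recall = counts["true_positive"] / (
    counts["true_positive"] + counts["false_negative"])

print(counts, round(precision, 2), round(recall, 2))
```

Precision and recall are the usual shorthand for this trade-off: a detector tuned to miss fewer guns will typically flag more things that aren't guns, and vice versa.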
While we work to build tools and guides for journalists to use machine learning for images, here are some existing projects that demonstrate what’s possible.
Spotting illegal amber mines from the sky
Ukrainian investigative newsroom Texty detected illegal amber mines across Ukraine, using machine learning for two separate pieces of the task.
First, an algorithm divided sections of Bing satellite images into visually uniform subsections. So if an image was half green forest and half dirt field, it would split the image into those two subsections.
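As a rough illustration of what "visually uniform subsections" means, here is a minimal sketch that groups neighboring pixels of similar color into regions. This is simple region growing on a tiny made-up image, not Texty's actual algorithm, which isn't described here; the function names, colors, and `tolerance` value are all assumptions.

```python
from collections import deque
import math


def color_distance(a, b):
    """Euclidean distance between two RGB colors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def segment_uniform_regions(image, tolerance=30):
    """Label connected groups of pixels whose color stays close to a seed pixel.

    `image` is a list of rows, each row a list of (r, g, b) tuples.
    Returns a grid of region labels and the number of regions found.
    """
    height, width = len(image), len(image[0])
    labels = [[-1] * width for _ in range(height)]
    region_count = 0
    for row in range(height):
        for col in range(width):
            if labels[row][col] != -1:
                continue  # already assigned to a region
            seed = image[row][col]
            labels[row][col] = region_count
            queue = deque([(row, col)])
            while queue:
                r, c = queue.popleft()
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    inside = 0 <= nr < height and 0 <= nc < width
                    if inside and labels[nr][nc] == -1:
                        if color_distance(image[nr][nc], seed) <= tolerance:
                            labels[nr][nc] = region_count
                            queue.append((nr, nc))
            region_count += 1
    return labels, region_count


# Made-up 4x6 "satellite image": left half forest green, right half dirt brown.
FOREST = (34, 110, 40)
DIRT = (150, 110, 70)
image = [[FOREST] * 3 + [DIRT] * 3 for _ in range(4)]

labels, region_count = segment_uniform_regions(image)
for label_row in labels:
    print(label_row)
```

Real satellite imagery is far noisier than this toy grid, so a production system would need something more robust, but the idea is the same: each region can then be judged as a unit.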
Another algorithm found which subsections most resembled the existing examples of amber mining, which have a distinctive pockmark-like pattern of holes in the ground. Finally, the journalists reviewed the algorithm’s matches by hand to filter out false positives – things that the algorithm thought looked like amber mining but were actually something else, like deforestation.
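Here is a minimal sketch of the "find the ones that most resemble known examples" idea, followed by a shortlist for human review. It assumes each subsection has already been boiled down to a few numbers; the three features, their values, and the tile names are invented for illustration, and cosine similarity stands in for whatever method Texty actually used.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two feature vectors: near 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_candidates(known_examples, candidates, top_n=2):
    """Score each candidate against the average known example, best first."""
    dims = len(known_examples[0])
    centroid = [sum(vec[i] for vec in known_examples) / len(known_examples)
                for i in range(dims)]
    scored = [(cosine_similarity(vec, centroid), name)
              for name, vec in candidates.items()]
    scored.sort(reverse=True)
    return scored[:top_n]


# Hypothetical features per subsection: [hole density, bare soil, greenness].
known_mines = [[0.9, 0.8, 0.1], [0.8, 0.9, 0.2]]
candidates = {
    "tile_a": [0.85, 0.8, 0.15],  # pockmarked bare ground
    "tile_b": [0.1, 0.2, 0.9],    # intact forest
    "tile_c": [0.2, 0.9, 0.3],    # bare ground without holes, e.g. a clearing
}

shortlist = rank_candidates(known_mines, candidates)
for score, name in shortlist:
    print(f"{name}: {score:.2f} -> send to a human for review")
```

In this toy data the cleared-but-unmined tile still makes the shortlist, which is exactly the kind of false positive the hand check is there to catch.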
The resulting story included an online map in which a viewer can zoom into pictures of amber mines across the country.
A hot-or-not detector for ducks
The model, which my colleague John Keefe trained with about 40 images, successfully distinguishes between a mandarin duck and a mallard (and also from trucks, boats, and anything else). He’s hoping to wire it up to his hot duck monitoring website, which currently watches for mentions in a Central Park Twitter feed.
More to come on images
My colleague John Keefe is writing up the steps for building a “hot duck” detector, which could be used for other sorting projects. And we have two more image-detection projects underway that we’ll describe in detail when they become public.
If you’d like an alert when we post those guides, drop your email address into the box below. And if you have a project you think might benefit from this kind of machine learning, let us know at firstname.lastname@example.org.