In this project, our assignment was to develop a machine learning algorithm that segments satellite images into distinct areas based on vegetation. Because we didn’t have any labeled training data, we started with unsupervised learning, where the algorithm learns to cluster areas of an image according to a similarity measure.
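To make the idea of clustering by similarity concrete, here is a minimal sketch that groups pixels with plain k-means on their channel values. It is not the method used in the project, only the simplest instance of unsupervised segmentation. The function name `kmeans_segment`, the number of clusters `k`, and the farthest-first initialisation are all assumptions chosen for illustration.

```python
import numpy as np

def kmeans_segment(image, k=3, iters=10):
    """Cluster the pixels of an (H, W, C) image into k groups by channel similarity."""
    h, w, c = image.shape
    pixels = image.reshape(-1, c).astype(np.float64)

    # Farthest-first initialisation (an illustrative choice): start from the
    # first pixel, then repeatedly add the pixel farthest from all centroids so far.
    centroids = [pixels[0]]
    for _ in range(k - 1):
        d = np.min([((pixels - ctr) ** 2).sum(axis=1) for ctr in centroids], axis=0)
        centroids.append(pixels[d.argmax()])
    centroids = np.array(centroids)

    for _ in range(iters):
        # Squared Euclidean distance from every pixel to every centroid.
        dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)

    return labels.reshape(h, w)
```

Here the similarity measure is simply the distance between pixel values, so the result ignores texture and spatial context. A learned feature extractor such as a CNN can replace the raw pixel values with richer features.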
Our customer is a software company that builds a flight simulator. The game currently renders vegetation based on what the developers believe should be there. A machine learning algorithm that decides based on the satellite image of each area gives the game more authenticity. It also makes the current process of classifying vegetation more efficient and allows the company to include larger areas in the simulator.
The satellite images for the project had a resolution of 10 meters per pixel. We also had information about the topography and slope of the terrain. A resolution of 10 meters per pixel doesn’t allow us to see many details, but it’s enough to see the primary vegetation for any given area.
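As a quick illustration of what 10 meters per pixel means on the ground, the arithmetic can be sketched as follows. The 256 x 256 tile size is a hypothetical value picked for the example, not a figure from the project.

```python
METERS_PER_PIXEL = 10  # resolution stated in the text

def pixel_area_m2(meters_per_pixel=METERS_PER_PIXEL):
    """Ground area covered by one pixel, in square meters."""
    return meters_per_pixel ** 2

def tile_footprint_km(width_px, height_px, meters_per_pixel=METERS_PER_PIXEL):
    """Ground width and height of an image tile, in kilometers."""
    return (width_px * meters_per_pixel / 1000,
            height_px * meters_per_pixel / 1000)

print(pixel_area_m2())              # 100
print(tile_footprint_km(256, 256))  # (2.56, 2.56)
```

Each pixel averages over 100 square meters, so individual trees are not resolved, while the dominant cover of an area (forest, field, water) still shows up as a consistent color.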
We developed a CNN based on the "Image Segmentation Based on Differentiable Feature Clustering" paper. The method in that paper optimizes the network separately for each image, which wasn't an option for our use case. So instead, we adjusted the authors' solution and trained the network on thousands of images before running it on previously unseen areas.
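The paper's training signal combines a feature similarity term (cross-entropy between the network's per-pixel responses and their own argmax labels) with a spatial continuity term (differences between neighboring responses). The following is a forward-only NumPy sketch of that loss, averaged over a batch of tiles to mirror training on many images. It is neither the authors' implementation nor our project code. The names `segmentation_loss` and `batch_loss` and the weight `mu` are assumptions for illustration, and the sketch omits the normalization step the paper applies to the responses.

```python
import numpy as np

def segmentation_loss(response, mu=1.0):
    """Forward-only loss for one (H, W, Q) response map, with H, W >= 2.

    Q is the maximum number of clusters; `mu` is a hypothetical weight
    for the continuity term.
    """
    # Pseudo-labels: each pixel takes the cluster with the largest response.
    labels = response.argmax(axis=2)

    # Numerically stable log-softmax over the cluster axis.
    shifted = response - response.max(axis=2, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=2, keepdims=True))

    # Feature similarity: cross-entropy of the responses against their own labels.
    picked = np.take_along_axis(log_probs, labels[..., None], axis=2)
    similarity = -picked.mean()

    # Spatial continuity: mean absolute difference between vertical
    # and horizontal neighbors in the response map.
    continuity = (np.abs(response[1:] - response[:-1]).mean()
                  + np.abs(response[:, 1:] - response[:, :-1]).mean())

    return similarity + mu * continuity, labels

def batch_loss(responses, mu=1.0):
    """Average the per-image loss over a batch of response maps."""
    return float(np.mean([segmentation_loss(r, mu)[0] for r in responses]))
```

In actual training, the response maps would come from the CNN and the loss would be minimized with an automatic differentiation framework. Averaging over batches drawn from thousands of tiles, instead of fitting one image at a time, is what lets the trained network be applied to unseen areas in a single forward pass.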
The algorithm performed according to the expectations we set before the project. There’s room for improvement, but it’s already a significant boost to the customer’s productivity. Unsupervised segmentation is not our long-term plan, but it’s a great start.