I've always had an interest in spatial data. Many years ago, I came across an airport dataset from OpenFlights.org. It listed the location of 10,000+ airport's around the world.
As I was unfamiliar with GIS software at the time, I wrote a simple Python script to create a map of the world. It would then render a circle at each airport's location.
Once it had rendered the map, you could see the general shape of most countries due to the density of the airports. The coastlines containing the majority of them.
A separate dataset on the same page listed airline routes with a start and ending airport. I extended my Python script to draw a straight orange line between them. The result was interesting to look at but inaccurate due to the curvature of the earth. I should have instead calculated the lines as great-circle arcs.
What is ADS-B?
These days, many planes are always broadcasting positional data while in flight. "Automatic Dependent Surveillance-Broadcast" (ADS-B for short) is the name of this system.
Ground stations or other planes receive these signals. Contained within are a unique ID and positional information. Positional data comprised of current latitude, longitude, altitude, speed, and climb rate. Anyone interested can receive the broadcasts by using an inexpensive USB receiver.
The Opensky Network is a network of ADS-B receivers around the world. They pool their data together and make it available through a web platform. They are very generous and give out a 24-hour window of data each month.
Processing 75M Records
In July 2018, I downloaded a complete day's worth of data, 24 hourly CSV files in total. Uncompressed it amounted to 7GB of data and contained 75 million records.
I merged the files into a single CSV and loaded the data into QGIS. After it had loaded I realized how slow it was going to be to work with. There was a faster option though. Instead of reading it as a text file, I loaded it into a spatial database I had used before, PostgreSQL / PostGIS.
Studying the data in QGIS revealed some extra processing steps I could take.
Resample from 10-second intervals to 1-minute intervals.
Interpolate the plane's position when they go out of coverage range.
The next step was to find gaps where a plane had left receiver range and then reappeared at a later time. For each gap, I used GeographicLib to create a great circle arc between the last and the next position. I then sampled this curve to find the position for each missing minute interval. Besides the position, I also very roughly interpolated altitude and speed.
The final issue that I needed to solve was crossing the 180th meridian. Usually, I would use the "Points to Path" tool to create my lines. But, when a flight crossed the antimeridian, a straight line would appear from one side of the map to the other.
To solve this problem, I added a check in my processing script. I calculate the distance between the previous longitude and the next longitude. If the distance was greater than 340 then it must have crossed. In QGIS I was grouping the flights by their ICAO code. To split the flight I added "_1" to the ICAO of any record after I detected the jump.
Once I had the processed data loaded back into QGIS, the "Points to Path" tool now worked. As mentioned above, I grouped the flight by its ICAO ID and set the order to be the timestamp field.
I gave each line a small amount of thickness and had the feature blend mode set to additive. This resulted in the areas with the most flights being the brightest.
After I was happy with the way the map looked, I rendered it out to "XYZ Web Tiles". To reduce bandwidth, I only rendered out the lines which I set to be transparent pngs. I could add the country map back in later with a GeoJSON file.
Unfortunately, QGIS will still save tiles even if they are empty. I didn't particularly feel like paying Amazon to upload several thousand empty images. To fix this, I used Imagemagick in a shell script to delete any image that had an empty alpha channel.
I used Leaflet.js to make an interactive web map. Adding the CSS property "filter:hue-rotate(ndeg)" would enable the viewer to color the lines.
QGIS can handle temporal data. This means if your data has a time or date element to it you can visualize it with a timeline. When you are happy with the result you can render the frames out and composite them into a video.
As an excuse to practice writing shaders, I attempted to make a 3D globe version. A naïve approach would be to create a 3D object for each plane and then animate it. Yet, performance would suffer after a few hundred objects.
Modern graphics cards are lightning-fast at rendering large amounts of vertices. They can handle millions without breaking a sweat. Using a particle system material on these vertices can give us a square that we can apply an image texture to.
So the plan is to pre-calculate the following:
- Determine how many planes I want to show.
- For each plane determine it's position at each minute of the day
- Set the Latitude and Longitude a consistent 2 digits of precision
- Map the -180.00 to 180.00 range to an integer range of 0 - 36000
- Write the data to a binary file for better compression
And then in the browser:
- Convert its latitude, longitude from spherical coordinates to XYZ cartesian
- If the plane doesn't have a position set it to 0,0,0
- Store the data for each minute of the day as a flat vertex array
- Set each of these arrays to be the particle systems vertex array at the correct animation time.
This approach worked well and could handle 48,000 planes at a smooth framerate. The main limitation was the download file size.
To attempt to fix this, I limited the number of flights to 5,000. I chose the longest flights as it's more interesting to look at. After that cull, I still wasn't happy with the resulting file size. Instead of calculating 1 array for each minute of the day (1440 total), I reduced it to every 5th minute (288 total).
I interpolated the missing minutes on the CPU otherwise the animation would stutter. But, this type of interpolation is inaccurate over long distances. Luckily for me, the distance a flight will cover in 5 minutes was so short that it prevented it from being an issue. You can't extend this approach to greater missing intervals such as hourly.
To make the earth look a little nicer, I wrote a custom fragment shader that blended between a day and night image. The angle of the sun with the camera decided how the textures should blend.