The Vera Rubin Observatory (VRO) is something special among telescopes. It’s not built for better angular resolution and increased resolving power like the European Extremely Large Telescope or the Giant Magellan Telescope. It’s built around a massive digital camera and will repeatedly capture broad, deep views of the entire sky rather than focus on any individual objects.
By repeatedly surveying the sky, the VRO will spot any changes or astronomical transients. Astronomers call this type of observation Time Domain Astronomy.
When the VRO spots something transient in the night sky, it’ll automatically send alerts out to other observatories that will observe the transient object in detail. It could be a distant supernova explosion, a hazardous asteroid here in the inner Solar System, or anything that registers a change in the sky. The VRO’s job is to spot it and then pass the baton to other observatories.
But issuing alerts to other telescopes is just one of the things the VRO will do. The VRO’s primary observing program is called the Legacy Survey of Space and Time (LSST). The LSST will catalogue the entire available night sky by imaging it every night for ten years with its massive 3.2-gigapixel camera. The telescope needs only about five seconds to point to a different part of the sky, where the camera captures a 15-second exposure.
This decade-long effort will generate an enormous amount of data. It’ll take 200,000 images per year, amounting to 1.28 petabytes of data. There’ll be so much data that the VRO project includes a new data pipeline travelling from its site in northern Chile back to the US. There’s no way that people can process all the data, so machine learning will play a big role in handling it and finding what’s hidden.
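To get a feel for those numbers, here is a quick back-of-the-envelope calculation. It assumes that both figures refer to the same one-year period and that a petabyte means 10^15 bytes; neither assumption is spelled out above.

```python
# Figures quoted in the article.
IMAGES_PER_YEAR = 200_000
DATA_PER_YEAR_PB = 1.28

# Assumption: decimal units, so 1 PB = 10**15 bytes and 1 GB = 10**9 bytes.
BYTES_PER_PB = 10**15
BYTES_PER_GB = 10**9

bytes_per_year = round(DATA_PER_YEAR_PB * BYTES_PER_PB)
bytes_per_image = bytes_per_year // IMAGES_PER_YEAR
gb_per_image = bytes_per_image / BYTES_PER_GB

print(f"Average data per image: {gb_per_image} GB")
```

Under those assumptions, each image works out to about 6.4 gigabytes on average.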
The authors of a new research paper developed a novel way for the observatory to detect anomalies in the immense amount of data it generates. The paper is “The Weird and the Wonderful in our Solar System: Searching for Serendipity in the Legacy Survey of Space and Time.” It’s been accepted for publication in The Astronomical Journal, and the lead author is Brian Rogers from the Department of Physics at the University of Oxford.
The list of objects and events the VRO will spot contains all the things we’d expect to see. Along with supernovae and asteroids, the VRO might spot the elusive Planet 9 that may be lurking in the far reaches of our Solar System. It’ll also see kilonovae, gamma-ray bursts, variable quasars, AGN, and even interstellar objects (ISOs) like Oumuamua and Borisov.
But finding those objects in all that data requires machine learning, and the authors have developed a type of neural network to process the data. A neural network is a type of AI loosely modelled on the human brain: it passes data through layers of interconnected nodes, or neurons.
The authors have developed a specific type of neural network called an autoencoder. Autoencoders can perform a very useful function. They take data, encode or compress it, then reconstitute the data back into a version of itself. By doing that, an autoencoder can ‘learn’ which aspects of data are relevant and which are noise. The noise can then be discarded.
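The encode-then-reconstruct idea can be sketched in a few lines of Python. This is not the authors’ deep autoencoder: it swaps in the simplest possible stand-in, a linear autoencoder fitted with a singular value decomposition (equivalent to principal component analysis). The data are made up for illustration, and the function names are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X, latent_dim):
    """Fit the best linear encoder/decoder pair via SVD (equivalent to PCA)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the directions of greatest variance in the centred data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    W = Vt[:latent_dim].T  # shape: (n_features, latent_dim)
    return mean, W

def encode(X, mean, W):
    """Compress each object into a short latent vector."""
    return (X - mean) @ W

def decode(Z, mean, W):
    """Reconstruct an approximation of the original features."""
    return Z @ W.T + mean

rng = np.random.default_rng(0)
# Made-up data: 500 objects with 5 features driven by 2 hidden factors,
# plus a small amount of noise.
factors = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 5))
X = factors @ mixing + 0.01 * rng.normal(size=(500, 5))

mean, W = fit_linear_autoencoder(X, latent_dim=2)
Z = encode(X, mean, W)          # compressed representation, shape (500, 2)
X_hat = decode(Z, mean, W)      # reconstruction, shape (500, 5)
mean_error = np.mean((X - X_hat) ** 2)
```

Because the made-up data really do have only two underlying factors, squeezing them through a two-number bottleneck keeps the structure and throws away mostly noise. A deep autoencoder does the same job with non-linear layers, which lets it capture more complicated structure.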
In their paper, the researchers write, “We present a novel method for anomaly detection in Solar System object data, in preparation for the Legacy Survey of Space and Time. We train a deep autoencoder for anomaly detection and use the learned latent space to search for other interesting objects.”
The authors’ autoencoder is designed to find anomalies like interstellar objects (ISOs). If the autoencoder can identify them, it means that the massive amount of LSST data becomes more manageable. “We demonstrate the efficacy of the autoencoder approach by finding interesting examples, such as interstellar objects, and show that using the autoencoder, further examples of interesting classes can be found,” they explain.
They tested their autoencoder on a simulation of the 10 years of data the LSST will collect. As real data from the LSST arrives, they intend to keep testing their autoencoder and strengthening it. “In the meantime, this work does not attempt to quantify the likely yield of unusual objects but merely demonstrates that we can find them in a large survey of the type which will be produced by LSST,” they write.
What the authors call ‘reconstruction loss’ plays a large role in the work, as do anomalies.
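When working with known, simulated data, the researchers could measure the autoencoder’s accuracy by comparing its output against its input. Reconstruction loss quantifies the difference: the lower the loss, the more faithfully the autoencoder reproduced an object. An object with a high reconstruction loss is one the autoencoder struggles to reproduce, which marks it as unusual.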
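As a rough illustration, reconstruction loss can be computed per object as the mean squared difference between input and output. Mean squared error is a common choice but an assumption here; the exact loss the authors use is not described above, and the function name is hypothetical.

```python
import numpy as np

def reconstruction_loss(X, X_hat):
    """Mean squared error between each object's features and its reconstruction."""
    X = np.asarray(X, dtype=float)
    X_hat = np.asarray(X_hat, dtype=float)
    return np.mean((X - X_hat) ** 2, axis=1)

# Two made-up objects with two features each.
X = [[1.0, 2.0], [3.0, 4.0]]
X_hat = [[1.0, 2.0], [3.0, 6.0]]   # the second object is reconstructed poorly

losses = reconstruction_loss(X, X_hat)
print(losses)  # [0. 2.]
```

The first object is reproduced perfectly and scores zero. The second is off by 2 in one of its two features, giving a loss of (0² + 2²) / 2 = 2, so it would rank as the more anomalous of the pair.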
Anomalies are unusual objects that stand out, just as an ISO would. From the figure above, the authors identified the top ten anomalies ranked by reconstruction loss. For each of those ten, they identified their twenty nearest neighbours. These are not neighbours in the Solar System; they’re neighbours in the latent space.
The neighbourhoods of objects are related by aspects of data. They’re data neighbourhoods. For example, one of the neighbourhoods is based on measured magnitudes. Another is based on orbital eccentricity, and another is based on outlier objects in Jupiter’s vicinity.
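The two steps described above, ranking by reconstruction loss and then looking up neighbours in the latent space, might look something like this. The sketch assumes Euclidean distance between latent vectors, which the article does not specify, and uses hypothetical function names and made-up numbers.

```python
import numpy as np

def top_anomalies(losses, k=10):
    """Indices of the k objects with the largest reconstruction loss, worst first."""
    return np.argsort(losses)[::-1][:k]

def nearest_neighbours(Z, index, n=20):
    """Indices of the n objects closest to Z[index] in latent space."""
    # Assumption: Euclidean distance between latent vectors.
    distances = np.linalg.norm(Z - Z[index], axis=1)
    order = np.argsort(distances)
    return order[order != index][:n]  # drop the object itself

# Five made-up objects: reconstruction losses and 2-D latent positions.
losses = np.array([0.1, 5.0, 0.2, 3.0, 0.3])
Z = np.array([[0.0, 0.0],
              [10.0, 10.0],
              [0.5, 0.0],
              [10.0, 11.0],
              [9.5, 10.0]])

worst = top_anomalies(losses, k=2)
neighbours = nearest_neighbours(Z, index=worst[0], n=2)
print(worst)       # [1 3]
print(neighbours)  # [4 3]
```

Object 1 has the highest loss, and its closest companions in the latent space are objects 4 and 3. In the paper’s setting the same lookup would return twenty neighbours for each of the top ten anomalies.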
Astronomy is changing. Our observatories and telescopes are becoming so powerful and automated that they create a massive universe of data. It’s beyond the capability of the astronomical community to deal with the data without automated help. An autoencoder trained to detect anomalies can sift through the LSST data and flag the unusual objects.
The authors are quick to point out that the autoencoder is not completely automatic. It still needs human help.
“After evaluating the deficiencies of standalone unsupervised methods, we demonstrated the power of human feedback in detecting anomalies using a supervised approach,” they write. “Using human feedback can increase the relevance, accuracy and precision of the anomaly detection system.”
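The article does not say how the authors fold human feedback into their system, so the following is only a hypothetical illustration of the general idea, not their method. A person labels a handful of flagged objects as interesting (1) or not (0), and every other object is scored by the average label of its nearest labelled neighbours in the latent space. All names and numbers are made up.

```python
import numpy as np

def relevance_from_feedback(Z, labelled_idx, labels, k=3):
    """Score every object by the mean human label of its k nearest labelled objects."""
    Z = np.asarray(Z, dtype=float)
    labels = np.asarray(labels, dtype=float)
    Z_labelled = Z[labelled_idx]
    # Distance from every object to every labelled object (Euclidean, an assumption).
    d = np.linalg.norm(Z[:, None, :] - Z_labelled[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return labels[nearest].mean(axis=1)

# Six made-up objects in a 2-D latent space, forming two clusters.
Z = np.array([[0.0, 0.0], [0.0, 1.0],
              [10.0, 10.0], [10.0, 11.0],
              [0.2, 0.5], [10.2, 10.5]])

# A human has looked at the first four objects:
# the first cluster is interesting, the second is not.
labelled_idx = [0, 1, 2, 3]
labels = [1, 1, 0, 0]

scores = relevance_from_feedback(Z, labelled_idx, labels, k=2)
print(scores[4], scores[5])  # 1.0 0.0
```

The unlabelled object that sits in the “interesting” cluster inherits a high score, while the one in the other cluster does not. That is the sense in which a few human judgements can steer an otherwise unsupervised search.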
It’s not hype to say that the Vera Rubin Observatory will change our understanding of our Solar System and things well beyond it. Its first light is scheduled for January 2025. It’ll take a while to test and commission all of the equipment, but sometime after that, the data will start to flow.
Once it does, there’ll be no stopping it, and astronomers will need tools like autoencoders to help them find anomalies.
“By putting the right anomalies in the right hands, we can multiply the value of the data collected by LSST and precipitate potential follow-up studies for the most interesting objects found in the survey,” the researchers write in their work. “We have demonstrated that deep autoencoders can fulfil this role as an unsupervised detection model by performing on the scale of LSST and that they can enable efficient anomaly discovery for the most interesting Solar System objects.”