The Google Pixel 3 and 3 XL have some of the best cameras of the year, and on top of their generally impressive performance, they pack a handful of pretty clever software features. One of them, called Top Shot, makes those great pictures even better by saving you when you make a mistake and snap a photo just a little too early or late. And while we knew a little about how Top Shot worked, there was quite a big gap in our knowledge. Fortunately for the curious, Google has just published a more technical explanation of the technology behind it.
The full details are on Google's AI blog for your long-form technical reading pleasure – though the official explanation jumps around a bit, doubling back on itself in ways that can make it harder to follow than other Google AI blog posts. If you'd prefer the simpler version, read on:
Google Clips Heritage
Top Shot's functionality is based on the same tools Google created for Google Clips. While Clips doesn't appear to have been a great success as a product in itself, its constraints posed a genuinely difficult problem: how do you build an autonomous camera that independently recognizes and saves the best video moments it sees?
For the full story of how Google pulled off that magic, read the detailed explanation it published earlier this year, but the (exceedingly) short version is that they used machine learning to do it:
Photographers and video editors made relative judgments between pairs of Google Clips training data.
A model was trained on thousands of pre-selected source videos, with professional photographers and video editors manually choosing between pairs of clips to teach the model what the best clips look like. In fact, over fifty million binary comparisons were made to collect data for the model. Combining that with Google Photos' existing recognition technology, the developers behind Google Clips could create a model that predicts "interesting" content, evaluated via what's called a Moment Score, built to recognize things in line with the qualities of a good clip. But that model could only run on power-hungry server hardware. The real genius came next: they trained a simpler model in parallel to mimic the performance of its server-based sibling (i.e., using one model to train another).
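To give a flavor of how binary comparisons can train a scoring model, here is a minimal sketch of a Bradley-Terry-style pairwise ranking setup – a standard technique for this kind of data, not Google's actual implementation. Human raters only say which of two clips is better; the model learns a per-item score such that higher-scored items tend to win comparisons.

```python
import math

def train_scorer(items, comparisons, epochs=200, lr=0.1):
    """Learn a per-item 'moment score' from binary comparisons.

    comparisons: list of (winner, loser) pairs chosen by human raters,
    analogous to the editor judgments collected for Google Clips.
    Bradley-Terry-style model: P(a beats b) = sigmoid(score[a] - score[b]).
    Scores are fit by simple gradient ascent on the log-likelihood.
    """
    scores = {item: 0.0 for item in items}
    for _ in range(epochs):
        for winner, loser in comparisons:
            diff = scores[winner] - scores[loser]
            p = 1.0 / (1.0 + math.exp(-diff))  # predicted win probability
            grad = 1.0 - p                      # push winner up, loser down
            scores[winner] += lr * grad
            scores[loser] -= lr * grad
    return scores

# Three hypothetical clips: raters preferred a over b, b over c, a over c.
scores = train_scorer(["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
```

The appeal of this setup is that raters never have to assign absolute quality numbers – relative judgments are far easier to make consistently, and with enough of them (fifty million, in Google's case) a stable ranking emerges.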
There's much more to it, but with all that information combined (plus some continuous training that recognizes more "familiar" faces and pets over time), Google Clips can compute a relative score for what it sees and decide when and what to capture.
With the tiny Google Clips able to tell good content from not-so-good, it was a relatively small mental hop, skip, and jump to consider adapting the overall concept to the Pixel 3 – albeit in a slightly different way.

Brought to Pixel 3
Even before you press the metaphorical shutter on the Pixel 3, Top Shot is already working in the background via Motion Photos – in case you forgot, that's the feature on Google's Pixels that records a short video just before, during, and after a picture is taken. It may seem like a simple step to move from capturing before-and-after video to before-and-after images, but there's much more to it.
Google claims that Top Shot captures up to 90 images from 1.5 seconds before and after the shutter is pressed for comparison. When we spoke to a Google representative about Top Shot at the Pixel 3 launch event, we were told most of the alternate images are still frames pulled from the Motion Photo, but a selection of Top Shot candidates are also saved before the video compression process.
Abstract diagram of the Top Shot capture process.
But before it can save any of those pictures, Top Shot has to quickly determine what's worth saving, building on the Google Clips work mentioned earlier. Top Shot's custom, energy-efficient on-device model was trained on sample photos in sets of up to 90 to sort through all the frames captured and save only the best. It rejects those that may be blurry, incorrectly exposed, or where the subject's eyes are closed, while trying to recognize things like smiles or other visible emotional expressions. It also takes into account other data, such as information from the phone's gyroscope (already captured for other uses), to further narrow down where it might find a good alternate image.
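To make the selection step concrete, here is a hypothetical sketch of how per-frame cues like those described above could be combined into a score, with gyroscope motion used as a cheap early filter. All the field names, thresholds, and weights are invented for illustration – the real model's inputs and scoring are not public.

```python
def score_frame(frame):
    """Score one candidate frame from illustrative quality cues.

    frame: dict with assumed keys 'sharpness', 'exposure_error',
    'eyes_open', 'smile', and 'gyro' (angular velocity magnitude).
    These are stand-ins, not the real model's features.
    """
    if frame["gyro"] > 2.0:   # phone moving fast: likely blurred, skip cheaply
        return 0.0
    score = frame["sharpness"] - frame["exposure_error"]
    if frame["eyes_open"]:    # penalize closed eyes by rewarding open ones
        score += 0.5
    if frame["smile"]:        # reward visible expressions
        score += 0.5
    return max(score, 0.0)

def best_alternates(frames, k=2):
    """Return up to k highest-scoring frames (Top Shot keeps two)."""
    return sorted(frames, key=score_frame, reverse=True)[:k]
```

The gyroscope check illustrates why that "other data" matters: it lets the phone discard hopeless frames before spending any effort on the heavier image analysis.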
Once it has found up to two photos it thinks might be better than the one you intended to capture, they're saved in HDR+ quality and set aside in the same file as the Motion Photo. Later, when you go to review those pictures, the option to switch to one of the other intelligently chosen alternates will present itself. If you like, you can even manually select one of the lower-resolution, non-HDR frames if you think it's better still.
Two quick and simple Top Shot recommendations. Who knew such simplicity was built on such complexity?
Like all the hot new features on phones these days, Top Shot's photo magic comes courtesy of a heap of advanced machine learning. For how easy and practical it is to use, there's a lot of complex machinery working behind the scenes – and now it's a bit less mysterious to you.