Watching the watchers: The high-tech tools behind Hollywood test screenings
Ever since the business of movie-making arose in the early 20th century, big studios have incorporated some kind of test screening routine into the production process. As budgets skyrocketed, the test screening process became more and more influential, with studios occasionally reworking entire movies based on these small audience responses.
Filmmakers have been notoriously divided on the worth of test screenings. For every film that has been reportedly "saved" through extensive test screenings, you could find another to show that they're a complete waste of time.
David Fincher's classic 1990s film Seven is a perfect example of the latter. After several mediocre test screenings, surveyed audiences reported a significant dislike for the now-infamous "what's in the box" conclusion. Fincher and the production team subsequently had to battle the studio to keep the fittingly bleak ending, and the final result spoke for itself. The film was a hit, both critically and with the public, ultimately delivering a worldwide box office of US$327 million.
Test screenings are obviously not an exact science. They generally involve audiences filling out targeted surveys, or being informally interviewed after watching early cuts of films. It's easy to imagine how this data could lead movie studios to think their films aren't working properly if the wrong type of audience were to view a film. Survey error also haunts the test screening process, from response bias in the framing of questions to the effects an interviewer has on a respondent's answers.
But what if you could track an audience's response to a film using raw scientific data?
In the early 2000s, a psychology professor from Princeton began to study what actually goes on in a viewer's brain while they watch a movie. Uri Hasson and his team screened large chunks of movies to subjects while their brain activity was recorded with fMRI. The results were published in a paper that coined the term "Neurocinematics".
The results fascinatingly showed that some films tend to "sync up" brain regions across all viewers. When a film was really working, everyone's brain tended to fire in the same places. The best results were seen in films that activated audience attention through action or suspense.
For example, Sergio Leone's classic spaghetti western The Good, The Bad and the Ugly was shown to demonstrate similar brain activity across all sampled viewers. Work from master of suspense Alfred Hitchcock was also found to be exceptionally effective in activating similar brain activity across a wide variety of test subjects.
Hasson's later studies examined brain activity across other forms of screen media and found that the more unstructured the media, the less synchronous brain activity was observed. So a single static shot of people standing in a park resulted in less synchronous brain activity compared to a tightly edited suspense scene.
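One way to picture this kind of "synchrony" is as the average correlation between viewers' response signals over time. The sketch below is only an illustration of that idea, not the pipeline used in Hasson's studies: the `synchrony` function, the scene names and every number in it are made up for the example, and real fMRI analysis involves far more preprocessing.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat signal carries no shared variation
    return cov / sqrt(var_x * var_y)


def synchrony(signals):
    """Mean pairwise correlation across viewers (a hypothetical score)."""
    pairs = list(combinations(signals, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


# Invented response traces for three viewers, for illustration only.
suspense_scene = [
    [1, 2, 4, 8, 6, 3],
    [2, 3, 5, 9, 7, 4],
    [1, 3, 4, 7, 6, 2],
]
static_park_shot = [
    [1, 2, 1, 2, 1, 2],
    [2, 1, 2, 1, 2, 1],
    [1, 1, 2, 2, 1, 1],
]

print("suspense scene:", round(synchrony(suspense_scene), 2))
print("static shot:   ", round(synchrony(static_park_shot), 2))
```

In this toy data the suspense traces rise and fall together, so their score comes out close to 1, while the static-shot traces wander independently and score much lower.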
Interestingly, a clip from Larry David's comedy series Curb Your Enthusiasm also showed little synchronous brain activity across all subjects. Perhaps scientific proof that comedy truly is subjective?
The fundamental idea behind Hasson's work was that there may be a way to objectively measure how engaging a film is. A way that doesn't rely on an audience's subjective likes or dislikes. A way that bypasses the problems of surveying.
Measuring "heart-pounding moments"
The recent influx of wearable technology is now allowing Hollywood to get this kind of audience engagement data without resorting to impractical fMRI machines. In late 2015, tech company Lightwave joined forces with 20th Century Fox to try to find a way to quantify an audience's physical engagement with the film The Revenant.
Across a series of pre-release screenings audience members wore specialized wearable devices that measured heart rate, pulse, skin temperature, electrodermal activity and motion. The statistics gathered allowed the filmmakers to look at the effect of their film on the audience from an entirely new perspective.
The technology calculated the film contained 15 fight-or-flight responses, 14 heart-pounding moments, 4,716 seconds where viewers were transfixed and rendered motionless and nine moments where the audience was startled. This biometric data could be tied to specific moments of the film and the filmmakers could accurately identify how physically effective the entire cinematic experience actually was.
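Lightwave's actual method is not described here, but the general idea of tying a biometric trace to moments in a film can be sketched in a few lines. The example below is an assumption-laden illustration only: the `flag_moments` function, the 20 percent threshold, the baseline and the heart-rate values are all hypothetical.

```python
def flag_moments(heart_rates, baseline, threshold=1.2):
    """Return (start, end) second ranges where heart rate exceeds
    baseline * threshold. One reading per second of film is assumed."""
    moments = []
    start = None
    for second, bpm in enumerate(heart_rates):
        elevated = bpm > baseline * threshold
        if elevated and start is None:
            start = second  # an elevated stretch begins
        elif not elevated and start is not None:
            moments.append((start, second - 1))  # the stretch just ended
            start = None
    if start is not None:
        # The trace ended while still elevated.
        moments.append((start, len(heart_rates) - 1))
    return moments


# Invented per-second heart-rate readings (beats per minute).
trace = [70, 72, 71, 90, 95, 88, 73, 70, 86, 72]
moments = flag_moments(trace, baseline=70)
print(moments)
```

Because each flagged range is expressed in seconds from the start of the screening, it can be lined up directly against the film's timeline.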
"Through biometric data, we no longer need to rely solely on subjective and biased measurements to determine the impact that the content is having on the audience," said Lightwave CEO, Rana June.
Dolby has also been experimenting with similar biometric data recently. Using EEG caps, thermal imaging cameras and other sensors, the company has been amassing a trove of data to examine how certain video and sounds can best create a physical effect in viewers. Dolby isn't a content producer, but it is interested in creating exhibition technology that can most effectively arouse audiences. Here the objective physical data can demonstrate how its technology is viscerally affecting audiences. The end point is that the company can sell its technology to exhibitors and film producers.
The limits of data?
A few years ago, neurocinematics pioneer Uri Hasson presented his work to Hollywood insiders and the results were unsurprisingly mixed. Filmmaker Darren Aronofsky was shown the results of an fMRI study done using a clip from his thriller Black Swan. Hasson demonstrated that nearly 70 percent of the cortex in his subjects was firing in sync while watching the clip.
Another study on Black Swan showed that the brain activity in viewers watching a certain clip resembled patterns seen in patients diagnosed with schizophrenia. Aronofsky was naturally thrilled at the results and half-joked, "Soon they'll do test screenings with people in MRIs."
Blockbuster director Jon Favreau, also present at the event, noted the limitations of this kind of data, "How do you use all this to smuggle in something that's a little more transformative? Ideally, you want to present something a little more elusive than what the statistics at this point can identify."
Favreau had a point. Sure, this biometric data could measure excitement or engagement akin to rollercoaster thrills, but what about the wealth of other things a good film can do?
How do you measure a comedy, or maybe a tearjerker?
A team at Caltech has been working with Disney to try and develop a way of tracking audience facial reactions in real time, while they are watching a movie. Less intrusive than any kind of EEG or MRI, this method is underpinned by a new deep-learning neural network that was produced by the researchers.
The team created a new algorithm known as factorized variational autoencoders (FVAEs) and the system breaks a person's face down into 68 different "landmarks". Using multiple infrared cameras trained on the audience in a cinema, each face can be captured at two frames per second and an audience of 400 can be analyzed simultaneously. These "landmark" features signal different degrees of smiling, laughing or other facial markers associated with engagement.
"It's more data than a human is going to look through," says Disney research scientist Peter Carr. "That's where computers come in – to summarize the data without losing important details."
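A rough back-of-the-envelope calculation shows the scale Carr is describing. The audience size, landmark count and frame rate come from the figures above; the two-hour running time is an assumption added purely for illustration.

```python
# Figures reported for the system described above.
viewers = 400
landmarks_per_face = 68
frames_per_second = 2

# Assumption for illustration: a two-hour film.
film_seconds = 2 * 60 * 60

landmarks_per_second = viewers * landmarks_per_face * frames_per_second
landmarks_per_film = landmarks_per_second * film_seconds

print(f"{landmarks_per_second:,} landmark positions per second")
print(f"{landmarks_per_film:,} landmark positions per screening")
```

That works out to 54,400 landmark positions every second, or roughly 390 million over a single two-hour screening.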
The study offers a fascinating insight into ways audiences can be studied as a collective group. These results are still broad – and perhaps not applicable to more stony-faced viewers – but the most frightening part of the study lay not in how the system interpreted audience engagement, but in its ability to begin to predict an audience's future engagement.
The algorithm swiftly became able to accurately predict an audience member's facial response to an entire movie after just observing that face for the first 10 minutes. It learned quickly and could ultimately tell if someone was going to enjoy the film sooner than the person in question could.
Four percent more adrenaline engagement
The rise of this kind of scientifically based, data-driven production work will inevitably lead to some boardroom conversations ripe for satire. When a film's efficacy can be tracked in such minute detail, it's not hard to imagine future studio notes to filmmakers asking for an extra heart-rate-raising set piece or suggesting certain scenes be cut because test audiences' attention can be seen to drift across particular stretches.
It is easy to approach this kind of data-managed production process with a cynical eye, but it is worth remembering that Hollywood has been doing this for almost 100 years. In many ways it is preferable that some of these decisions are being made based on real data instead of the subjective opinions of a small test screening audience.
Of course, a reasonable fear is that this kind of analysis will lead to a homogenization of cinema where all movies are made for the lowest common denominator and any harsh edges are shaved off before a product gets a commercial release. Any scenes that confuse or offend are removed. Any jokes that less than half the audience laugh at are dropped.
Either way, Darren Aronofsky's comment regarding the future of test screenings being held inside an MRI could be less an offhand joke and more a prophecy. After all, if $200 million is being spent producing something that needs to return a profit, then it's almost silly not to do all you can to maximize the product's chances. Film may be an art form, but Hollywood is a business – and in business, data is king.