The Elusive Story and the Paradox of Predictive Analytics



Hollywood and its investors are mystified by the current run of megablockbuster failures and small-budget wonders. Hundreds of millions of dollars went to create what were essentially public works projects for film crew unions. The Lone Ranger, R.I.P.D., White House Down, and After Earth (all sound like downers when you think about it) cost almost a billion dollars combined, and all made little more than a quiet plunk as they sank into an ocean of indifference.

Perhaps it is not a coincidence that just a few months ago, Spielberg’s oracle pronounced in action movie metaphor:

“There’s eventually going to be an implosion – or a big meltdown…where three or four or maybe even a half-dozen megabudget movies are going to go crashing into the ground, and that’s going to change the paradigm.”

Implosion, check. Paradigm shift, not really.

Instead, Hollywood has invoked the muse of big data in the form of “Predictive Analytics.” Here is an introduction to how film studios are mining large data sets for story ideas, courtesy of SAP.

“Predictive analytics identifies patterns in past data. For example, if a proposed script is a raucous comedy about a wedding aboard a cruise ship, the data process can take into account information on how well recent comedies have done, while adding in box office receipts for previous wedding films.”

A remark by E.B. White comes to mind: “Humor can be dissected, as a frog can, but the thing dies in the process and the innards are discouraging to any but the pure scientific mind.” But adding the human element doesn’t appear to improve things if those humans can only derive their intuitions from the amassed data.

“Script evaluators can also suggest changes to a script, such as that a cruise ship comedy should not include a scene set in a bowling alley — movies with bowling alley scenes tend not to do well, script evaluator Vinny Bruzzese told the New York Times.”

Though the bowling alley is merely an example, take a moment to shudder at what a script evaluator would do to The Big Lebowski or There Will Be Blood.

What we have in predictive analytics is a perfect setup for Simpson's paradox, aka When Big Data Sets Go Bad. Simply put, the paradox is that a trend visible in each of several groups of data can vanish, or even reverse, when the groups are combined into one larger set, so perfectly good data can produce bad conclusions. A related hazard is that in most data mining expeditions you will find exactly whatever you are looking for, no matter how ridiculous.
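To see how pooling data can flip a conclusion, here is a minimal sketch in Python. The counts are invented purely for illustration (they are not real box office figures), and the names `counts`, `hit_rate`, and `pooled_rate` are hypothetical. Within each genre the films with a bowling alley scene have the higher hit rate, yet once the genres are lumped together those same films look worse.

```python
# Hypothetical counts, invented for illustration only: (hits, films released).
counts = {
    "comedy": {"bowling": (81, 87), "no_bowling": (234, 270)},
    "drama": {"bowling": (192, 263), "no_bowling": (55, 80)},
}


def hit_rate(hits, films):
    """Fraction of films that were hits."""
    return hits / films


def pooled_rate(feature):
    """Hit rate for one feature after lumping every genre together."""
    hits = sum(by_feature[feature][0] for by_feature in counts.values())
    films = sum(by_feature[feature][1] for by_feature in counts.values())
    return hits / films


# Genre by genre, compare films with and without a bowling alley scene.
for genre, by_feature in counts.items():
    with_scene = hit_rate(*by_feature["bowling"])
    without_scene = hit_rate(*by_feature["no_bowling"])
    print(f"{genre}: bowling {with_scene:.0%}, no bowling {without_scene:.0%}")

# The same comparison on the combined data set.
print(
    f"pooled: bowling {pooled_rate('bowling'):.0%}, "
    f"no bowling {pooled_rate('no_bowling'):.0%}"
)
```

Run as written, the per-genre lines favor the bowling alley films while the pooled line favors the films without one. The reversal comes from the uneven mix: in these made-up numbers most bowling alley films are dramas, which have a lower hit rate overall, so the genre drags the pooled figure down, not the bowling alley.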

Predictive analytics is really about trying to figure out what motivates people by looking at their past choices. It is the way an alien would analyze what kinds of stories motivate humans, and its conclusions, without a grounding in literature and storytelling, can only be bizarre.

To tell a compelling story, you need an artist. You need someone steeped in storytelling, with an obsessive dedication to working, reworking, and shaping the story with the patience of a glass blower. Working backwards, constructing a tree out of broken furniture, in the words of Anne Sexton, simply does not work.
