Machine Learning Meets Interactive Stories
On Friday, December 28th Netflix aired a special episode of their Black Mirror series called Bandersnatch. What made Bandersnatch interesting is that it was an interactive story: watchers could use a remote control to choose options at various points throughout the story’s progression to influence character choice and plot progression.
(This post does not contain Bandersnatch spoilers.)
Bandersnatch is a choose your own adventure style interactive narrative. The mode of interaction — choosing from prescribed options at critical junctions — is the same mode of interaction used in Choose Your Own Adventure books from the 1980s. The same mode of interaction was briefly popular in laserdisc adventure games, and can even be found in modern digital games such as The Walking Dead games. Whether you liked Bandersnatch or hated it, it is probably the greatest single exposure of interactive storytelling to a mass audience.
Because Bandersnatch is a streaming online TV episode, it is possible that Netflix has data consisting of every branch selection that every user (account) makes. That is a lot of data. It is not necessarily useful data as each data point would basically only tell us “in the context of the audience’s experience so far, this user prefers option A when faced with two choices A and B”. The value of the data is quite limited, although in conjunction with other data about what the user views, it is possible for Bandersnatch data to augment TV and movie recommendation. One form of recommender system looks at correlations between user behavior (in this case both TV and movie viewing selections and interactive choice point selection) to determine user similarity. This can be done without context for the observed behavior itself.
It is unclear whether Netflix keeps Bandersnatch interaction data, nor whether they intend to use it to augment their TV and movie recommendation systems.
Beyond this, is data about the choices users make in streaming online interactive stories useful? The answer is yet: It can be used to help increase the probability that a user enjoys their interactive experience.
Background: Branching Stories
In choose your own adventure books, as well as in Bandersnatch, the plot can be visualized as a directed acyclic graph (DAG). Each page (or movie sequence) can be represented as a node in a graph. Directed arcs between nodes represent choices. Here is one for the book, The Mystery of Chimney Rock:
Each path through the DAG from root to terminal leaf is a complete story. In Choose Your Own Adventure books and in Bandersnatch, some paths are more satisfying than others. There are “losing” endings and “winning” endings. Even if all endings are winning endings, the path taken to reach an ending might be better than other paths that reach the same, or different endings. What makes a path satisfying may be very individualistic.
Branching Stories as an Optimization Problem
We have now set up an optimization problem: which path is the best one for a given individual? If there were one path through the DAG that was universally preferred, we could just send everyone down that path. In that case, one would have a non-branching story. We assume that there are a number of possible paths and that some individuals will prefer some paths over others.
This is still a problem: if we solve the optimization problem and send each individual down the path they will prefer the most, then we have also eliminated the need for branches altogether — we might as well just compose N different non-branching stories and then recommend the right one to each person.
Thus we must tackle two problems:
- Which path is the best one for an individual?
- How do we improve the user’s experience given that they are still allowed to make choices?
We must do all of this in a data-driven way because we do not have hand-crafted models of our users, who we assume to be highly individualistic and that we might not know the features that distinguish them ahead of time.
Let’s take the problems on in sequence.
Sequential Recommendation
Netflix uses a technology called a recommendation system (or recommender) to predict which movies you might be interested in watching. Recommenders work by either predicting the similarity between users based on their behaviors (viewing behaviors) or based on similarity between products (movies) or both. From this, the algorithm can guess at which movies you will be most likely to select.
For example, the table below shows items A-D and users 1–3. Each cell is the rating that user gave to that item. Some ratings are missing (denoted by *) because that user has never rated that item. A recommender system predicts the value that should go in the missing spots. (The details on how one might do this are beyond the scope of this post.)
Can we use a recommendation system to guess which plot point — a node in the branching story DAG — a user will prefer? The answer is “sort of”. The problem is that recommenders assume recommendations are episodic, meaning that each prediction can be made in isolation of all others. This is a reasonable assumption for movie recommendation. However, in branching stories, the plot point that an individual considers to be best may be affected by the plot points already experienced; if there is more than one path to the same branching point, then the answer of which one is best may be different depending on how the user got there. As a simple example, consider the following branching story DAG. What is the best next plot point from the yellow plot point? It may depend on whether one has fought zombies or vampires. Keep in mind that the user doesn’t get to see what comes next until they make a choice.
Predicting the optimal path through a branching story DAG is thus a sequential recommendation problem. Fortunately, one can convert a sequential recommendation problem into a sequential recommendation problem. A branching story — a directed acyclic graph — can also be represented as a tree where each node is a sequence of plot points and each node’s child extends the sequence by one plot point.
Now, all you have to do is collect a bunch of data on people trying different paths and rating different sequences. It looks like this:
The only difference is that instead of items/products/movies, each row is a node from the tree. We can use our favorite recommender algorithm. The number of nodes in the tree will be much bigger than the number of nodes in the original DAG, but most recommender algorithms scale very well.
Data Collection
Let’s take a moment to think about our data. To make sequential recommendation work, we must assume that we have some users who have played through some branches of an interactive story. We further have to have those users rate each plot point. This is an acknowledged limitation, though there are ways to reduce the number of ratings that users have to give by predicting the similarity between the nodes themselves.
We now want to predict the ratings a new user U will give to different sequences. If we can do that we can pick the branch with the highest predicted rating. But that requires us to have some data on new user U. To solve this problem we need to have multiple branching stories (multiple DAGs/trees) and have some users having played through some stories that other users have also played through. Preferably, all users, including U, have played through multiple branching stories multiple times. Now the recommendation problem (e.g., the table above) has sequences from multiple stories. Same problem as far as the recommender is concerned.
The cold-start problem is common to all recommenders. One needs some data on a user before any predictions can be made. In our case, we can have some test stories that new users play and rate plot points before the system can start working.
Making Predictions
With what we have so far, we can predict the rating of possible sequences in a branching story. If we know the path the user has taken, thus locating them in the tree, we can identify the successor plot points and choose the one with the highest predicted rating.
Maximizing Expected Enjoyment
So what? If we force the user to the highest rated plot point, we have removed all choice. But if we all the user to have choice, they may make choices that diminish their enjoyment (recall the user is unaware of future plot points and thus cannot themselves choose to optimize their own enjoyment). What we need is a way of subtly nudging the user toward certain paths while still giving them agency to choose any possible path.
Suppose we have a plot point in a DAG that has two children. Let’s further suppose that 75% of users choose the option leading to B and 25% choose the option leading to C. Finally, suppose that we have used the above sequential recommender and determined that this particular user will enjoy the path through C better than the path through B. What can we do?
Unfortunately, there isn’t much we can do… unless we make a small structural change to the DAG. Let’s further suppose that instead of two options there were four distinct options provided to the user… but two of them resulted in a path to B and two of the options resulted in a path to C. If we could predict that the user’s ranked preference for each option is as given in the diagram, then we can see that the user will in fact most likely pick an option leading to plot point B and will not follow the path predicted to provide the greatest enjoyment.
Finally, suppose that we know all of this, and only show two options to the user instead of four such that the most preferred option is the one that leads to option C. This maximizes the probability that the user will indeed select the option that maximizes their enjoyment (as predicted by our sequential recommender).
To make this work, we need rating or preference data for options in addition to rating data for plot point sequences. We will have to collect that data. We can use our favorite episodic machine learning algorithm to predict option selection preferences. Finally, we use the option hiding method described above.
Experiments show that all of this together can increase overall user enjoyment. A positive feature is that no path is ever denied the user and the user makes all choices. The choices are just not being made on a level playing field anymore because we have used data to analyze the graph and tilt the odds toward certain paths.
Conclusions
We know that companies like Netflix that produce streaming interactive stories like Bandersnatch can collect data on people’s choices. With a bit of effort, that data can be used to improve the probability that users enjoy the particular path they take through a branching story. The sequential recommendation approach described in this post is one way to do that, although it does require ranking and preference data that probably isn’t currently collected by companies like Netflix; bootstrapping the system will be the hardest part of using machine learning to increase user enjoyment.
Further Reading
For further reading on the sequential recommendation technique described above, including algorithm details and results of human subject evaluations, has been published in the IEEE Transactions on Computational Intelligence and Artificial Intelligence in Games, vol. 6(2), 2014. A pre-print is available here.
Choose your own adventure style branching narratives are just one type of interactive narrative possible. Most work on interactive narratives has looked at computer games, which provide the illusion of much more user agency. Artificial intelligence agents called Drama Managers or Experience Managers are often used in game-based interactive narratives to construct story experiences for users. An overview of AI approaches to interactive narrative can be found in AI Magazine, vol. 34(1), 2013. A pre-print is available here. While this article is not comprehensive, it provides a framework for understanding the different general ways researchers have gone about building AI Experience Managers.