A new Dartmouth study on how we use reward information to make choices shows that humans and monkeys adapt their decision-making strategies depending on how uncertain the available information is. The results show that for a simple gamble to obtain a reward, when the magnitude of the reward is known but its probability is unknown and must be learned, both species switch from combining reward information in a multiplicative way (in which functions of reward probability and magnitude are multiplied to obtain the so-called subjective value) to comparing the attributes in an additive way. The findings, published in Nature Human Behaviour, challenge one of the most fundamental assumptions in economics, neuroeconomics and choice theory: that decision-makers typically evaluate risky options in a multiplicative way. In fact, this applies only in the limited case in which both the magnitude and the probability of the reward are clearly known.
“This is the first cross-species study using similar experimental design to show that both humans and monkeys change their strategy when they go from choice under risk (when reward probabilities are known) to choice under uncertainty (when reward probabilities are unknown and must be learned), from combining information in a multiplicative way to comparing information in an additive way,” said senior author Alireza Soltani, an assistant professor of psychological and brain sciences at Dartmouth. “Comparing reward attributes may seem like comparing apples to oranges; however, when you compare different pieces of reward information rather than combine them, you become a more flexible decision-maker,” he added.
The team of researchers from three universities found that when the probability of the reward must be learned (but the magnitude of the reward is provided), both humans and monkeys more often opted for larger but riskier options as the environment became more uncertain, putting less weight on the probability and more weight on the magnitude of the reward. The team also examined neural activity in the monkeys’ brains during the task and found a correlation between this adjustment in behavior and how prefrontal neurons represent reward information. Specifically, consistent with the behavior, neurons in the dorsolateral prefrontal cortex represented magnitude more strongly in a more uncertain environment, when more weight was put on magnitude.
To understand the findings, consider the following hypothetical scenario (not part of the actual methods used in the research). Imagine it’s your lucky day and you could win money in a free sweepstakes. All you need to do is pick a ticket from one of two bowls: Bowl 1 contains 99 winning tickets each valued at $100 and 1 ticket with $0 value. Bowl 2 contains 50 winning tickets each valued at $250 and 50 tickets with $0 value. Which bowl do you choose from? Most people will pick Bowl 1 because humans are risk averse. Bowl 1 offers a better combination of properties, even though Bowl 2 has the higher average payout ($125 per ticket versus $99). To decide which option to go with, you probably came up with a subjective value for each of the two bowls by multiplying the probability of winning by the subjective utility, or desirability, of the winning tickets.
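The multiplicative strategy in the bowl example can be sketched in a few lines of Python. This is an illustration only, not the model fitted in the study: the power-law utility function and its exponent of 0.5 are assumptions chosen to represent a risk-averse decision-maker, and the function names are hypothetical.

```python
def expected_value(probability, magnitude):
    """Objective average payout: probability times dollar amount."""
    return probability * magnitude


def subjective_value(probability, magnitude, risk_exponent=0.5):
    """Multiplicative subjective value: probability times utility of the amount.

    The power-law utility and risk_exponent=0.5 are assumptions made for
    illustration; an exponent below 1 makes the chooser risk averse.
    """
    utility = magnitude ** risk_exponent
    return probability * utility


# (probability of winning, dollar value of a winning ticket) for each bowl
bowl_1 = (0.99, 100)
bowl_2 = (0.50, 250)

for name, (p, m) in (("Bowl 1", bowl_1), ("Bowl 2", bowl_2)):
    print(name,
          "expected value:", round(expected_value(p, m), 2),
          "subjective value:", round(subjective_value(p, m), 2))
```

Under these assumptions, Bowl 2 has the higher expected payout while Bowl 1 has the higher subjective value, which matches the risk-averse choice most people make.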
Consider another scenario where you only know the dollar amount of the winning tickets in each bowl but don’t know the probability of picking a winning ticket. However, you have been observing people who chose tickets from the two bowls before you and have learned that Bowl 1 almost always gives $100 winning tickets but Bowl 2 gives $250 winning tickets only half the time. In this uncertain scenario, you would probably choose the bowl that you think is better by comparing how often each bowl has been awarding winning tickets and, separately, how much those winning tickets are worth. Here, as the decision-maker, you would be using an additive strategy because you compare each piece of reward information across the two options rather than trying to combine the pieces into a single value.
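The additive strategy in this second scenario can be sketched as follows. This is a minimal Python illustration, not the model used in the study: the observed histories, the rescaling of dollar amounts, the two weights and all function names are hypothetical choices made for the example.

```python
def learned_probability(outcomes):
    """Estimate the chance of winning as the fraction of observed wins."""
    return sum(outcomes) / len(outcomes)


def additive_preference(option_a, option_b, weight_probability, weight_magnitude):
    """Weighted sum of attribute differences; a positive result favors option_a.

    Each option is (estimated probability, dollar amount). Amounts are
    rescaled by the larger amount so both attributes lie between 0 and 1.
    The weights are hypothetical, not values reported by the study.
    """
    p_a, m_a = option_a
    p_b, m_b = option_b
    scale = max(m_a, m_b)
    return (weight_probability * (p_a - p_b)
            + weight_magnitude * (m_a - m_b) / scale)


# Hypothetical observations of earlier players: 1 = win, 0 = no win
bowl_1_history = [1] * 19 + [0]   # wins almost every time
bowl_2_history = [1, 0] * 10      # wins half the time

bowl_1 = (learned_probability(bowl_1_history), 100)
bowl_2 = (learned_probability(bowl_2_history), 250)

mostly_probability = additive_preference(bowl_1, bowl_2, 0.7, 0.3)
mostly_magnitude = additive_preference(bowl_1, bowl_2, 0.3, 0.7)

print("Weighting probability more:", "Bowl 1" if mostly_probability > 0 else "Bowl 2")
print("Weighting magnitude more:", "Bowl 1" if mostly_magnitude > 0 else "Bowl 2")
```

With these made-up numbers, a chooser who weights probability more heavily picks Bowl 1, while one who weights magnitude more heavily picks the larger but riskier Bowl 2. That is the direction of the shift the researchers report as the environment becomes more uncertain.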
For the actual study, a series of gambling tasks was administered on a computer, in which monkeys and human participants had to choose between two options. Humans (Dartmouth undergraduate students) were awarded points that were converted to a combination of money and extra credit for a course, and monkeys (studied at Yale School of Medicine and University of Minnesota) were rewarded with drops of juice, according to their choices and the outcomes of the gambles.
“Speaking more broadly, our results show that in an uncertain reward environment, which is the case most of the time, we may not construct the so-called subjective value as prescribed by normative models of choice, and that flexibility is more important than being rational or optimal,” added Soltani.