Prediction Markets Failed Their Midterm Exams

James Surowiecki’s 2005 best-seller, The Wisdom of Crowds, produced a Malcolm-Gladwell-like bubble in enthusiasm for the strange power of the hive-mind to outguess any mere individual, expert or otherwise. Surowiecki’s premise is simple:

  • “Groups are remarkably intelligent, and are often smarter than the smartest people in them.”

He developed this claim out of a charming anecdote. In 1906, the famous British statistician Francis Galton had attended an agricultural fair where he observed a contest being held to guess the weight of an ox – and as the story goes:

  • “A fat ox having been selected, competitors bought stamped and numbered cards, for 6d. each, on which to inscribe their respective names, addresses, and estimates of what the ox would weigh after it had been slaughtered and ‘dressed.’ Those who guessed most successfully received prizes.”

In Surowiecki’s retelling, the message is elevated to a Great Truth.

  • “Eight hundred people tried their luck. They were a diverse lot. Many of them were butchers and farmers, but there were also quite a few who had no insider knowledge of cattle. ‘Many non-experts competed,’ Galton wrote later…The crowd guessed 1,197 pounds; after it had been slaughtered and dressed the ox weighed 1,198 pounds. In other words, the crowd’s judgment was essentially perfect…”

Galton himself saw a political moral in the story.

  • “The average competitor was probably as well fitted for making a just estimate of the dressed weight of the ox, as an average voter is of judging the merits of most political issues on which he votes.”

Others, especially those who apply this to the financial markets, turned Surowiecki’s and Galton’s thesis into a general argument for the superiority of collective over individual judgment.

  • “Large crowds are collectively smarter than individuals. Collective knowledge and opinions of a group are better at decision-making, problem-solving, and innovating than an individual.” – Investopedia

There is even a kind of engineering-flavored explanation, drawn from signal processing techniques in which (assumed random) deviations, or bias, in individual measurements, fluctuating around a central (correct) value, can be canceled out.

  • “The viewpoint of an individual can inherently be biased, whereas taking the average knowledge of a crowd can result in eliminating the bias or noise to produce a clearer and more coherent result.”

In fact, this is the same general principle underlying the standard arguments in favor of portfolio diversification, claims for the accuracy of prices in “efficient” markets, and many other aspects of orthodox finance theory: fluctuations and biases can be controlled (and ignored) by assuming they are random in nature (uncorrelated).
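The noise-cancellation argument, and its dependence on the errors being uncorrelated, can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of Galton's actual contest: the 75-pound spread of individual guesses and the 50-pound shared bias are invented for illustration, and `crowd_estimate` is a hypothetical helper.

```python
import random
import statistics

def crowd_estimate(true_value, n, noise_sd, shared_bias=0.0, seed=0):
    """Mean of n guesses, each equal to truth + shared bias + independent noise."""
    rng = random.Random(seed)
    guesses = [true_value + shared_bias + rng.gauss(0, noise_sd) for _ in range(n)]
    return statistics.mean(guesses)

TRUE_WEIGHT = 1198  # pounds, the dressed weight quoted in Surowiecki's retelling

# Independent errors (assumed spread of 75 lb): averaging cancels most of the noise.
independent = crowd_estimate(TRUE_WEIGHT, n=800, noise_sd=75)

# Correlated errors (assumed 50 lb bias shared by everyone): averaging cannot remove it.
correlated = crowd_estimate(TRUE_WEIGHT, n=800, noise_sd=75, shared_bias=50)

print(f"independent errors: crowd mean off by {independent - TRUE_WEIGHT:+.1f} lb")
print(f"shared bias:        crowd mean off by {correlated - TRUE_WEIGHT:+.1f} lb")
```

With 800 independent guesses the average typically lands within a few pounds of the true weight, while the shared bias survives averaging almost intact. The comfort offered by the averaging argument therefore rests entirely on the assumption that individual errors are uncorrelated.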

Financial markets are seen by some as an embodiment of the Wisdom of Crowds (WOC) principle. Millions of investors, all jostling their diverse expectations, understandings and forecasts together, are said to be able to arrive collectively at the most-nearly correct value for any widely traded asset (e.g., shares of public company equity), integrating all past, present, and future information into a preternaturally accurate share price. Thus, WOC is the philosophical mechanism underpinning Efficient Market Theory. In addition, the stock market is said to be able, through the same mechanism, to predict future macro-economic events such as recessions or recoveries (as described in an earlier column).

The media generally endorses the WOC premise, and reviewers have gushed praise for Surowiecki’s book:

  • “He musters ample proof that the payoff from heeding collective intelligence is greater than many of us imagine.” – BusinessWeek
  • “The impressive power of collective thinking provided here is fascinating, and oddly comforting.” – Detroit Free Press

Michael Lewis (Moneyball etc.) loved the book, too. Even Malcolm Gladwell weighed in:

  • “The most brilliant book on business that I’ve read in years.”

From the start of the WOC bubble, however, there have been skeptics. The Financial Times was guarded in its original review:

  • “[The book] is packed with amusing ideas that leave the reader feeling better-educated.”

Before long, contrary assessments began to appear. Collective judgment didn’t always seem to work its magic. The WOC thesis became a foil for revisionists: e.g., “When the Crowd Isn’t Wise” (NY Times headline) and “When Crowds Aren’t Wise” (Harvard Business Review headline).

Nevertheless, pro or con, WOC is now established as a journalistic meme. It has also helped to promote an interesting new cottage industry: prediction markets.

Prediction Markets

The idea behind a prediction market is to automate Galton’s ox-guessing contest, and apply it to estimating or predicting various things, especially the outcomes of political contests. Trading on such a platform is akin to betting. Some of the earliest examples were hosted on sports-betting sites in the UK. The Defense Department famously proposed to create a forum to allow traders to bet on political events in the Middle East, including coups, wars, and terrorist incidents, as a way to marshal WOC in service of the War on Terror. (This resulted in rather ugly optics – “The idea of a federal betting parlor on atrocities and terrorism is ridiculous and grotesque” [said one U.S. Senator] – and the idea was shot down.)

Apparent early “successes” prompted academics to study the phenomenon. Prediction markets began to focus on political contests – e.g., for predicting the outcome of U.S. presidential elections. The Iowa Electronic Market (IEM) became famous for its election predictions, said to outperform traditional polling. By positioning the alternatives – “Democratic Win” vs “Republican Win” – as though they were stocks to be bought and sold, the Iowa exchange produced surprisingly accurate predictions. Starting in 1988, the IEM beat the pollsters decisively over the next several Presidential elections.

PredictIt is a project developed by Victoria University in Wellington, New Zealand, which calls itself “a unique and exciting real money site that tests your knowledge of political events by letting you trade shares on everything from the outcome of an election to a Supreme Court decision to major world events.” They claim to serve 80,000 traders, and provide anonymized data used by more than 200 academic researchers and university educators. PredictIt has operated their election markets for the past 8 years under a No Action letter granted by the Commodity Futures Trading Commission (CFTC).

PredictIt covered the American midterm elections intensely, allowing traders to bet on almost every race and scenario. Their user interface far outdoes Iowa’s in gloss and granularity. (Who knew the Kiwis were so interested in the Georgia Senate contest, e.g.?) However, the CFTC has recently moved to restrict and perhaps prohibit PredictIt from operating in the U.S. after February. PredictIt has sued the CFTC to block this action.

Kalshi is a start-up company, in beta today, which operates a series of markets keyed to various events, such as certain legislative actions, Supreme Court decisions, tax changes, and the like. The company’s banner portrays the range of its projected offerings.

Kalshi has also applied to the CFTC to be permitted to trade election contracts, which would look something like options or futures based on traditional commodities (hence the CFTC’s assumption of regulatory jurisdiction), described as “cash-settled, binary contracts based on the question such as: ‘Will <party> be in control of the <chamber of Congress>?’”

The CFTC has seemed interested but hesitant. The Commission issued a formal Request for Public Comment on this matter in August, with a goal to issue a ruling in late October. Yet according to Bloomberg, based on objections of its staff, the Commission is apparently “poised to deny” approval of the election contracts – at least for now.

Polymarket – apparently a start-up based in New York – describes itself as “an information markets platform that harnesses the power of free markets to demystify real world events.…” Polymarket portrays itself as more technically au courant than its competitors: its exchange is based on blockchain technology, and it allows users to transact in crypto-tokens – USD Stablecoins – rather than “real money.”

  • “On Polymarket, you build a portfolio based on your forecasts and earn a return if you are right. When you decide to buy shares in a market, you are weighing in with your own knowledge, research, and view of the future. Market prices reflect what traders think are the odds of future events, turning trading activity into actionable insights that help people make better decisions. As a result, Polymarket is a leading source of unbiased and real-time data about future events.”
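To make the link between a contract's price and the "odds" concrete, here is a minimal sketch of a cash-settled binary contract. It assumes each share pays $1 if the event occurs and nothing otherwise, and it ignores fees. The function names and the 60-cent price are illustrative and are not drawn from any platform's documentation.

```python
def implied_probability(price, payout=1.0):
    """Read a binary contract's price as the market's implied probability."""
    if not 0 <= price <= payout:
        raise ValueError("price must lie between 0 and the payout")
    return price / payout

def profit_per_share(price, event_occurred, payout=1.0):
    """Profit (or loss) on one share held to settlement, ignoring fees."""
    settlement = payout if event_occurred else 0.0
    return settlement - price

price = 0.60  # hypothetical: a share trading at 60 cents

print(f"implied probability: {implied_probability(price):.0%}")
print(f"profit if the event occurs: ${profit_per_share(price, True):.2f}")
print(f"profit if it does not:      ${profit_per_share(price, False):.2f}")
```

Under these assumptions, a trader who pays 60 cents is accepting a 40-cent gain against a 60-cent loss, which is a fair bet only if the event really has a 60% chance of occurring. That is the sense in which a market price can be read as a forecast.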

Gnosis is another Fintech-flavored start-up which claims to be building a “permissionless prediction market” (whatever that is). There is not much information available on their efforts so far.

Something of a mini-bubble in technologized WOC is developing, despite regulatory headwinds. Prediction markets have acquired a reputation for uncanny accuracy which has boosted the moral standing of the WOC thesis. As a WSJ commentator wrote, indignantly, just last week, arguing against the CFTC’s move to clamp down on PredictIt –

  • “It’s a blow to the public at large, because political futures have proven to have better predictive power than polls….That would be unfortunate for liberty. If investors can express their opinions on the future prices of corn and pork bellies, surely the First Amendment also protects their ability to do the same on elections and other political matters.”

So – do they really work? Do these platforms reliably produce better forecasts than other methods?

Prediction Markets Failed Their Midterm Test, Big-Time

Unfortunately, the 2022 midterm elections did not turn out well for any of the operational prediction markets. They got the key calls all wrong.

Control of the Senate

Right up to election day, all the prediction markets forecast a clear Republican win in the contest for control of the Senate. Then reality struck. IEM’s results are shown here.

PredictIt was even more decisively wrong.

Polymarket trading was rock solid for a Republican Senate win right up to the close of the polls at 7:30 Eastern Time on election day, when the odds were 77% in favor of the Republicans. Seven hours later, trading favored Democratic control by 86%.

State-Level Markets

PredictIt got the Senate races in the key states wrong as well. (Iowa does not run state-level markets.)

Pennsylvania

PredictIt trading ran steady in favor of Republican Win, until it flipped as the actual election results came in.

Polymarket’s collapse of the market for a Republican Pennsylvania Senate Win was dramatic. From 61% at 7 pm, the odds fell to 0% just six hours later. (Of course by then you didn’t need a prediction market since the major TV networks had already called the race.)

Georgia

Again, PredictIt and Polymarket both failed.

At 8:00 pm on election day (after voting had closed), Polymarket’s blockchain prediction engine was forecasting a Republican Win at roughly a 2:1 advantage (65%) – as it had been for a week. Hours later this outcome was priced at worse than a 3:1 disadvantage (23%).
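The conversion between a quoted percentage and betting-style odds can be checked with a few lines of arithmetic. The helper names below are hypothetical, and the sketch simply applies the standard definition of odds as p / (1 - p).

```python
def odds_in_favor(p):
    """Odds in favor, expressed as x:1, for a probability p."""
    if not 0 < p < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1 - p)

def odds_against(p):
    """Odds against, expressed as x:1, for a probability p."""
    if not 0 < p < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    return (1 - p) / p

print(f"65% -> {odds_in_favor(0.65):.2f}:1 in favor")  # 1.86:1
print(f"23% -> {odds_against(0.23):.2f}:1 against")    # 3.35:1
```

A 65% price is therefore a little under 2:1 in favor, and a 23% price is a bit more than 3:1 against.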

Arizona

PredictIt’s Arizona forecast was similar.
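One way to put a number on how badly these calls missed is the Brier score, a standard measure of forecast accuracy that this column does not itself use: the mean squared difference between the forecast probability and the outcome (1 if the event happened, 0 if not). The sketch below applies it to the three Polymarket prices for a Republican win quoted above, treating each as a Republican loss. That coding of the outcomes follows this column's account of the results and is an assumption of the example.

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("need equally sized, non-empty forecasts and outcomes")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Polymarket prices for a Republican win, as quoted in this column.
republican_win_price = {
    "Senate control": 0.77,
    "Pennsylvania": 0.61,
    "Georgia": 0.65,
}
# Assumption: the Republican side lost each contest (outcome = 0).
outcomes = [0, 0, 0]

market = brier_score(list(republican_win_price.values()), outcomes)
coin_flip = brier_score([0.5, 0.5, 0.5], outcomes)

print(f"market Brier score:    {market:.4f}")
print(f"coin-flip Brier score: {coin_flip:.4f}")
```

Lower is better, and a forecaster who said 50% every time would score 0.25. On these three contracts, under the stated assumptions, the market prices score about 0.46, worse than the coin flip.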

The Unwisdom of Crowds

There are two problems here.

The first problem: prediction market mechanics are still poorly understood. Forecasting accuracy seems to depend on “hidden variables” that skew the nature of the “crowd” in ways which may either improve or diminish the accuracy of the predictions. For example, diversity of the “crowd” has been cited by some as a crucial variable, but one that is difficult to measure – markets by their nature, as well as by law, do not scrutinize the race, gender, ethnicity, intelligence, or income of traders.

The motives and mindset of the participants also matter. If traders see the market as more like a casino – which is to say, a game to “play” with small stakes, mostly for entertainment rather than enrichment, accepting that a loss is the price one pays to be so entertained – they may behave differently than if they treat it like the stock market, where the price they pay for a position is an investment that they do not wish or expect to lose.

The nature of the contract is also important. “Easy calls” are apparently less interesting to traders than the “hard calls.” The volume of trading on PredictIt for the Pennsylvania race (a hard call) was 28 times greater than the volume for the Illinois race (an easy call). Did this drive the loss of accuracy? Prediction markets get the easy calls right – but so does everyone. The prediction markets miss badly on the hard calls. (The hard calls are of course the cases where accurate prediction has the greatest cash value.) The higher the value, and the higher the volume, the greater the inaccuracy.

Predictable Folly: Market Sentiment as a Contrarian Signal

The second problem is more fundamental. WOC may not be true.

Are groups really “smarter than the smartest people in them”? Much evidence – drawn from the study of financial market sentiment – suggests otherwise.

Sentiment is a technical term in finance. It refers to the views held and opinions expressed by pretty much anyone regarding the present or future of the market, the economy, a particular company, or life in general – and the extent to which these views and opinions can be treated as signals of the market’s direction. Sentiment metrics can be based on a wide variety of underlying data sources, from surveys of individual investors (such as the weekly poll of investor sentiment taken by the American Association of Individual Investors – AAII) or the consensus of professional equity analysts, to metrics based on word-counts in text sources such as the financial media, or Twitter.

One principle that holds true across almost all sentiment metrics is… that they are usually wrong. The explicit forecast that most sentiment measures express is often incorrect.

In fact, sentiment guidance is often so precisely wrong, so predictably wrong, that it becomes a sound market indicator – if the “sign” is reversed. That is, if investors are bullish, it is a sign that the market is going to decline. A surfeit of negative sentiment among investors is closely watched as a sign of an impending rally. Market experts and many retail investors know this, and adopt a contrarian interpretation of sentiment metrics. In grim times, contrarians look for “despair” and “capitulation” as the buy-signal – “blood in the streets” as the legendary maxim sometimes attributed to the Rothschilds would have it.

The reliability of sentiment as a contrarian signal has been verified countless times. Here are the AAII weekly surveys over the period 1987-2014, grouped by levels of optimism/pessimism, and the average change in the S&P 500 index over the following six months.
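For readers who want to run this kind of test themselves, the grouping is simple to express in code. The sketch below is only an outline of the method: the bucket cut-offs are hypothetical, and the six data points are invented to exercise the code. They say nothing about actual AAII survey results.

```python
from collections import defaultdict
from statistics import mean

def average_forward_return_by_bucket(readings, bucket_of):
    """Group (sentiment, forward_return) pairs by bucket and average the returns."""
    grouped = defaultdict(list)
    for sentiment, forward_return in readings:
        grouped[bucket_of(sentiment)].append(forward_return)
    return {bucket: mean(returns) for bucket, returns in grouped.items()}

def bullish_bucket(bullish_pct):
    """Hypothetical cut-offs, chosen only for illustration."""
    if bullish_pct < 30:
        return "pessimistic"
    if bullish_pct <= 50:
        return "neutral"
    return "optimistic"

# Made-up toy data: (percent bullish, index change over the next six months in %).
toy_readings = [(22, 9.0), (27, 7.0), (38, 5.0), (45, 4.0), (58, 1.0), (63, -1.0)]

table = average_forward_return_by_bucket(toy_readings, bullish_bucket)
for bucket, avg in table.items():
    print(f"{bucket:12s} {avg:+.1f}%")
```

A contrarian pattern would show up as higher average forward returns in the pessimistic bucket than in the optimistic one. With real survey data in place of the toy list, the same few lines would show whether that pattern holds.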

The same pattern holds for many other sources of sentiment —

  • consumer surveys – the Michigan Consumer Sentiment survey is a known lagging indicator, peaking just before recessions, and strongly contrarian (i.e., usably incorrect)
  • put/call ratios (a precise and objective measure that is decisively contrarian – see my previous column)
  • analysts’ opinions, almost always biased towards optimism: “Profits forecasts made more than a few months ahead have a dismal record of inaccuracy…Forecasts for American firms’ total annual earnings per share made in the first half of the year had to be revised down in 34 of the past 40 years.” – The Economist
  • short interest spikes and troughs – a rise in the short-interest is often followed, at a short lag, by a market rally

All generally contrarian.

A prediction market can be seen as another type of sentiment metric. Might it also prove, as a rule, to be contrarian? Which is to say — wrong, much of the time?

The Retail Options Game

  • “If investors can express their opinions on the future prices of corn and pork bellies…” (from WSJ Commentary cited above)

The financial markets provide an even more relevant and powerful analogy: the retail segment of the options market.

Traders buy and sell options on companies’ shares based on whether they expect the shares to rise (call options) or fall (put options). The retail options market is a prediction market, de facto. It is vastly larger than the prediction markets described above – according to one recent paper, “retail investors accounted for more than $250 billion of total single-name option volume in 2020 alone.”

Options resemble the contracts created on prediction markets. Just as prediction market contracts are settled when the election results are tabulated, equity options contracts are “bets” rather than “investments” – they have an expiration date. They trade actively and their prices fluctuate in response to news and shifts in sentiment.

Options price trends constitute a strong contrarian signal, as noted above. But the meta-signal presented by the aggregate outcome of the entire market is even more powerful: 85% to 95% of all options purchased by retail players lose money.

One recent academic study found that the “wealth-depleting behaviors” of retail options traders cause losses that amount to, on average, 5-15% of their investment (de Silva et al).

Another study quantified the losses by retail traders during the pandemic period at around $5 billion.

Overall, the predictions of retail options traders are spectacularly wrong. No WOC on offer here.

Conclusions: The Wisdom Of Crowds May Be An Illusion

It may be that guessing the weight of an ox, and forecasting the winner of a close political race, present different challenges. It is likely that “the wisdom of crowds” is not a universal principle of human decision-making. It is also likely that the error of group judgments increases with the value-at-risk and the volume of trading, contrary to what standard statistical reasoning would suggest. In fact, the prediction markets were so consistently wrong in the recent elections that they begin to resemble more traditional sentiment metrics, useful perhaps as contrarian indicators, but dangerous if taken at face value.
