800,000 Trades Exposed: How Kalshi Weather Markets Converge on Truth

Last Updated: April 1, 2026

Prediction markets claim to aggregate information into prices. But how does that actually work in practice? We analyzed 804,248 hypothetical trade-out scenarios across 1,506 Kalshi weather events to map the mechanics of price convergence — how contracts move from uncertain initial prices to near-certain terminal values. Our companion weather model article tested whether a forecasting model could beat these markets. This article goes deeper on the convergence itself: how different brackets converge at different speeds, where the biases are, and what happens in the final hour before settlement.

What Does the Dataset Look Like?

Our dataset spans 1,506 NYC high temperature events on Kalshi, each with 6 temperature brackets and complete price histories from listing through settlement. We constructed 804,248 hypothetical trade-out scenarios: for each contract at each observed price point, what would have happened if you bought at that price and held to settlement or sold later? The result is a dense map of potential P&L across entry times, entry prices, and outcomes.
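The scenario expansion itself is mechanical enough to sketch. A minimal version, assuming each contract's history is a list of (hours-to-settlement, price-in-cents) observations — the structure and names here are illustrative, not our production schema:

```python
from dataclasses import dataclass

@dataclass
class TradeOut:
    """One hypothetical scenario: buy at an observed price, exit later or hold."""
    entry_hours: float  # hours before settlement (e.g. 6.0 = T-6h)
    entry_price: int    # cents
    exit_price: int     # cents: a later observed price, or 100/0 at settlement

    @property
    def pnl_cents(self) -> int:
        return self.exit_price - self.entry_price

def build_scenarios(price_history, settled_yes: bool):
    """Expand one contract's price history into every buy-and-exit scenario.

    price_history: list of (hours_to_settlement, price_cents), newest last.
    """
    terminal = 100 if settled_yes else 0
    scenarios = []
    for i, (t_entry, p_entry) in enumerate(price_history):
        # Exit at every later observed price...
        for _t_exit, p_exit in price_history[i + 1:]:
            scenarios.append(TradeOut(t_entry, p_entry, p_exit))
        # ...or hold to settlement.
        scenarios.append(TradeOut(t_entry, p_entry, terminal))
    return scenarios

# A winning contract observed at T-6h (50c) and T-4h (70c) yields
# three scenarios: 50->70, 50->100, 70->100.
scens = build_scenarios([(6.0, 50), (4.0, 70)], settled_yes=True)
```

Run over 1,506 events with dense tick histories, an expansion like this produces the hundreds of thousands of entry-time x entry-price x outcome cells mapped above.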

The events span all four seasons, with a roughly even distribution:

| Season | Events | Top-1 Hit Rate | Top-2 Hit Rate |
| --- | --- | --- | --- |
| Spring (MAM) | 338 | 34.0% | 56.8% |
| Summer (JJA) | 366 | 33.1% | 60.9% |
| Fall (SON) | 364 | 33.5% | 68.4% |
| Winter (DJF) | 438 | 31.7% | 60.1% |

Fall shows the highest top-2 accuracy (68.4%), reflecting more predictable temperature patterns. Winter has the lowest top-1 accuracy (31.7%), consistent with higher forecast uncertainty during cold-weather systems. These seasonal patterns matter for anyone attempting to trade weather systematically — your model’s edge varies with the calendar.

How Do Winners and Losers Diverge?

Contracts that ultimately settle YES and contracts that settle NO follow dramatically different price paths, and the divergence begins hours before settlement. As we showed in our weather model article, winners converge from a median entry of 43.8 cents to a peak of 77 cents, and losers decline from 15.2 cents to 4 cents. Here we go deeper on the mechanics of that convergence — specifically, how bracket rank, season, and the cascade effect among losing brackets shape the path to settlement.

Convergence by Bracket Rank

Not all brackets converge at the same speed. The rank-1 bracket (highest pre-settlement probability) converges roughly 2x faster than the rank-3 bracket. Market attention concentrates on the most likely outcome, and price discovery accelerates there first.

At T-6h (6 hours before settlement), the rank-1 bracket sits at approximately 50 cents across all events. By T-4h, rank-1 brackets have already climbed to approximately 70 cents — capturing 20 cents of convergence in just 2 hours. Rank-2 brackets move more slowly, reaching approximately 35 cents at T-4h. Rank-3 brackets barely budge above 20 cents until T-2h, when the final forecast updates arrive and the market rapidly sorts the remaining uncertainty.
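Curves like these come from bucketing every tick by bracket rank and hours to settlement, then taking the median price per bucket. A minimal sketch — the tick tuples below are illustrative values shaped like the medians quoted above, not raw data:

```python
from collections import defaultdict
from statistics import median

def convergence_curves(ticks):
    """Median price per (bracket_rank, hours_to_settlement) bucket.

    ticks: iterable of (bracket_rank, hours_to_settlement, price_cents).
    Hours are bucketed to the nearest whole hour (T-6h, T-5h, ...).
    """
    buckets = defaultdict(list)
    for rank, hours, price in ticks:
        buckets[(rank, round(hours))].append(price)
    return {key: median(prices) for key, prices in buckets.items()}

# Illustrative ticks: rank-1 near 50c at T-6h and 70c at T-4h,
# rank-3 barely moving over the same window.
ticks = [
    (1, 6.1, 49), (1, 5.9, 51), (1, 4.0, 70),
    (3, 6.0, 18), (3, 4.0, 20),
]
curves = convergence_curves(ticks)
```

Plotting the resulting medians per rank against hours-to-settlement gives the convergence-speed comparison directly.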

The mechanism is straightforward. Market makers and informed traders focus their capital on the bracket with the strongest signal. NWS forecasts narrow the range progressively, and with each update, the rank-1 bracket absorbs the largest share of buying pressure. Rank-3 brackets don’t attract serious buying until the rank-1 outcome is largely priced in, which frees up capital and attention for secondary positioning.

This 2x convergence gap has a practical implication for model-based traders: if your model identifies the rank-1 bracket at T-12h, you capture approximately 27 cents of convergence (from 50 cents to 77 cents, most of it in the final 6 hours). If your model only identifies the rank-3 bracket — or if the rank-1 bracket shifts — you capture less than 12 cents of convergence over the same window. Edge accrues to accuracy in rank ordering, not just bracket identification.

Convergence Speed by Season

Seasonal forecast accuracy directly affects convergence speed. Fall (SON) produces the fastest convergence: rank-1 brackets reach approximately 65 cents by T-6h, reflecting the fact that fall temperature patterns in NYC are the most predictable. The atmosphere is transitioning smoothly, frontal passages are well-modeled, and NWS day-0 MAE drops below 1.8 degrees F during October and November.

Winter (DJF) shows the slowest convergence. Rank-1 brackets sit at only approximately 48 cents at T-6h — 17 cents lower than fall at the same horizon. Winter storms inject genuine uncertainty: a 4-degree forecast miss in January isn’t unusual, and a single bracket shift can flip the outcome. The market prices this uncertainty honestly, withholding conviction until later forecasts confirm the trend.

Summer (JJA) and spring (MAM) fall between these extremes. Summer benefits from predictable heat patterns (60.9% of summer events see the top-2 brackets hit), but afternoon thunderstorm convection occasionally scrambles the daily high. Spring is the most volatile transition season, producing the widest spread between fast-converging and slow-converging events.

The Cascade Effect Among Losers

When the rank-1 bracket converges upward, the 5 losing brackets don’t all decline uniformly. The decline follows a cascade pattern determined by each bracket’s role in the probability distribution.

Brackets ranked 4th through 6th — the deep tails — collapse earliest. These represent temperature outcomes 10-15 degrees away from the forecast center. By T-6h, rank-5 and rank-6 brackets have already fallen below 3 cents. The market rules out these outcomes as soon as the day-0 forecast confirms they require an extreme deviation.

The rank-2 bracket — the “insurance” bracket — declines last. This bracket represents the adjacent temperature range, typically 5 degrees from the most likely outcome. A rank-2 bracket priced at 25 cents at T-6h may still sit at 18 cents at T-3h, because a 2-3 degree forecast miss would flip the outcome into this bracket. Traders hold rank-2 positions as hedges against late forecast revisions, and market makers keep rank-2 liquidity available because it’s the primary alternative outcome.

The rank-3 bracket occupies the middle of the cascade. It declines faster than rank-2 but slower than the tails, typically halving from its T-6h price by T-3h. The net effect: capital released from collapsing tail brackets flows upward into rank-1 and, to a lesser extent, rank-2 positions, accelerating the divergence between winners and losers in the final 4 hours.

How Fast Does New Information Get Priced?

NWS releases forecast updates multiple times daily. Each update contains new model runs (GFS, NAM, HRRR) that refine the temperature prediction. Our weather model research documented that 73% of weather contracts show more than 1 cent of price movement within 1 hour of an NWS forecast update, with the primary reaction occurring 10-30 minutes after release. Here we focus on what that article didn't cover: the asymmetry of reactions by forecast magnitude and the execution challenges that make the lag unexploitable.

Reaction Asymmetry by Forecast Magnitude

Not all NWS updates produce equal market reactions. A 3-degree forecast revision (e.g., the GFS run shifts from 68 degrees F to 71 degrees F) produces 2-3x larger price movements than a 0.5-degree revision. The market is more attentive to surprises — a 3-degree shift at T-6h may move the rank-1 bracket by 5-8 cents and simultaneously shift rank-2 by 3-5 cents, because a revision that large can change which bracket is most likely.

Time to settlement amplifies the asymmetry. A 2-degree revision at T-12h produces approximately 1-2 cents of movement — the market discounts distant forecasts because additional model runs will follow. The same 2-degree revision at T-3h produces 4-6 cents of movement, because fewer corrections remain before the temperature is recorded. The interaction between forecast magnitude and time horizon creates a nonlinear reaction surface that makes systematic exploitation difficult even for automated systems.
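One way to picture the reaction surface is as a lookup keyed on revision size and time to settlement. The table below simply encodes the ranges quoted in this section; the function, its thresholds, and its ordering are our illustration, not a fitted model:

```python
# (min_revision_deg_f, min_hours, max_hours, (low, high) reaction in cents)
# Rows are ordered from strongest to weakest signal; first match wins.
REACTION_TABLE = [
    (3.0, 0, 6,  (5, 8)),   # large revision near settlement: biggest moves
    (2.0, 0, 3,  (4, 6)),   # moderate revision, few corrections remain
    (2.0, 9, 24, (1, 2)),   # same revision far out: heavily discounted
    (0.5, 0, 24, (0, 2)),   # small revisions: modest moves at any horizon
]

def reaction_range(revision_deg: float, hours_to_settle: float):
    """Rough (low, high) price-reaction range for the most affected bracket."""
    for min_rev, lo_h, hi_h, rng in REACTION_TABLE:
        if revision_deg >= min_rev and lo_h <= hours_to_settle <= hi_h:
            return rng
    return (0, 1)  # tiny revisions, or gaps in the table: noise

# The same 2-degree revision, near versus far from settlement:
near = reaction_range(2.0, 3)   # T-3h
far = reaction_range(2.0, 12)   # T-12h
```

The point of the sketch is the interaction: neither revision size nor horizon alone determines the move, which is what makes the surface hard to exploit systematically.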

Why the Lag Isn’t Exploitable

The 10-30 minute lag sounds like a trading opportunity. Buy when the forecast shifts, wait for the market to catch up, sell at the higher price. The math kills it:

The margin is 1-3 cents. A typical NWS update shifts the most affected bracket’s price by 1-3 cents. Round-trip taker fees on a mid-priced contract are 3-3.5 cents (1.5-1.75 cents per side). The fee exceeds the convergence opportunity.

Execution is uncertain. By the time you parse the NWS update, run your model, and submit an order, 5-10 minutes have elapsed. Other participants (including automated systems) are reacting simultaneously. You’re competing for the same 1-3 cent move.

Adverse selection is real. If you’re buying after a forecast update, you’re buying from someone who might already know something you don’t. The resting NO-side liquidity adjusts quickly to new information. You’re lifting a quote that’s being repriced in real time.

At maker fees (0.4-0.5 cents per side), the math improves marginally. But maker fills are uncertain, and the competition for maker rebates on weather contracts is intense. The same structural conclusion holds here as in our crypto backtest: an edge of 1-2 cents per signal cannot survive execution friction.
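The fee arithmetic is easy to verify. Kalshi's published taker schedule is proportional to p(1 − p); treating it un-rounded, 7% of price × (1 − price) per contract reproduces the per-side figures quoted above (1.75 cents at a 50-cent price). A sketch, ignoring the exchange's rounding rules:

```python
def taker_fee_cents(price_cents: float) -> float:
    """Kalshi-style taker fee: 7% of p*(1-p) per contract, in cents
    (un-rounded sketch of the published schedule)."""
    p = price_cents / 100.0
    return 7.0 * p * (1.0 - p)

def round_trip_net_cents(entry: float, exit: float) -> float:
    """Gross convergence minus taker fees on both legs."""
    return (exit - entry) - taker_fee_cents(entry) - taker_fee_cents(exit)

# Capturing a 3-cent post-forecast move at mid prices:
net = round_trip_net_cents(50, 53)  # gross +3c, fees ~3.5c -> negative
```

A 3-cent move, the top of the typical post-update range, nets out below zero at taker fees, which is the entire argument in one subtraction.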

What Patterns Appear in Volume and Depth?

Trading activity on weather contracts follows predictable patterns that reveal how market participants approach these instruments. Weekday events generate 2-3x more volume than weekend events, reflecting the participation of professional traders operating on business-day schedules.

Temporal Patterns

  • Peak activity: 2-4 hours before settlement. This is when forecast confidence is high enough to trade on but the market hasn’t fully converged.
  • Morning lull: Limited activity in early trading (12+ hours before settlement). Prices are set but volume is thin.
  • Settlement rush: A spike in the final 30-60 minutes as remaining positions are closed and last-minute speculators enter.

Depth Distribution

Liquidity is not evenly distributed across brackets:

  • Top 2 brackets (highest probability): Carry 60-70% of total market depth
  • Middle brackets: Moderate depth, wider spreads
  • Tail brackets (rank 5-6): Thin liquidity, wide spreads, and occasional gaps in the order book

Market makers focus their capital where the flow is. The top 2 brackets attract the most speculative interest, so market makers provide the most depth there. Tail brackets see primarily one-way flow (sellers) with limited buyers.

Who Trades in the Final Hour?

The settlement rush in the final 30-60 minutes is disproportionately large: approximately 30% of a contract’s total daily volume concentrates in this window. Two distinct groups drive this late-stage activity.

Market makers unwinding hedges. Professional liquidity providers who have been quoting both sides of the book all day need to flatten their positions before settlement. A market maker holding 200 YES contracts on a bracket that’s now at 85 cents doesn’t want to bear the settlement risk. They sell into the final-hour demand, often accepting 1-2 cents below the current mid to guarantee execution.

Last-second speculators. Retail traders and algorithmic players who monitor real-time temperature readings make bets based on observed temperatures rather than forecasts. If the current temperature at 2:00 PM is 72 degrees F and the high-so-far is 74 degrees F, they can estimate whether the remaining afternoon hours will produce a higher reading. These bets are directional, concentrated in the rank-1 bracket, and often placed at market prices.

Where Does the Volume Go?

Final-hour volume concentrates disproportionately in the rank-1 bracket. Across our 1,506 events, the rank-1 bracket captured approximately 45% of final-hour volume versus approximately 30% of pre-final-hour volume. The shift reflects confidence: with 1 hour remaining, the probability distribution has narrowed enough that most traders are competing over the same bracket.

Net flow direction is consistent. In the final hour, the winning bracket sees net buying pressure — prices drift upward from approximately 77 cents toward 85-90 cents as the remaining uncertainty resolves. Losing brackets see net selling, with prices dropping from approximately 4 cents toward 1-2 cents. The 4-cent “floor” we observe at T-1h reflects the last holdouts: traders who either aren’t paying attention or are making speculative bets on a last-minute temperature shift.
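The volume-share split is a straightforward aggregation over the trade tape. A sketch, assuming trades arrive as (bracket rank, hours to settlement, contracts) tuples — the toy tape below mirrors the 45% vs 30% split described above:

```python
from collections import defaultdict

def volume_shares(trades, final_hour: float = 1.0):
    """Per-rank share of volume, inside vs outside the final hour.

    trades: iterable of (bracket_rank, hours_to_settlement, contracts).
    Returns {rank: (final_hour_share, earlier_share)}.
    """
    per_rank = defaultdict(lambda: [0, 0])  # rank -> [final, earlier]
    totals = [0, 0]
    for rank, hours, qty in trades:
        idx = 0 if hours <= final_hour else 1
        per_rank[rank][idx] += qty
        totals[idx] += qty
    return {r: (f / totals[0], e / totals[1]) for r, (f, e) in per_rank.items()}

# Toy tape: rank-1 takes 45% of final-hour flow but only 30% earlier.
trades = [(1, 0.5, 45), (2, 0.5, 55), (1, 3.0, 30), (2, 3.0, 70)]
shares = volume_shares(trades)
```

The same aggregation, run per event and averaged, produces the concentration figures quoted in this section.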

Is Late-Stage Trading Profitable?

For a trader who already holds the correct bracket from an earlier entry, the final hour adds 8-13 cents of convergence ($0.08-0.13 per contract). That’s meaningful on a percentage basis if your entry was at 40 cents.

But entering at T-1h means buying at approximately 77 cents for a $1.00 payoff — a maximum profit of 23 cents with genuine uncertainty remaining. The risk-reward calculus requires accuracy above 77% to break even (you risk 77 cents to gain 23 cents). Our model’s accuracy at T-1h is approximately 75%, based on the seasonal hit rates from the weather model analysis. After taker fees of 1.5-1.75 cents per side, the expected value is slightly negative. Late-stage entry is not a viable strategy.
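The break-even threshold generalizes to any entry price: for a buy-and-hold position, the required win probability is the all-in cost as a fraction of the $1.00 payout. A sketch, assuming fees are charged on the entry leg only:

```python
def breakeven_accuracy(entry_cents: float, fee_per_side_cents: float = 0.0) -> float:
    """Minimum P(win) for a buy-and-hold entry to break even.

    You risk (entry + fee) per contract and gain (100 - entry - fee) on a win,
    so break-even is simply all-in cost / 100.
    """
    return (entry_cents + fee_per_side_cents) / 100.0

# Entering the favorite at T-1h at 77c, with a 1.75c taker fee:
p_needed = breakeven_accuracy(77, 1.75)  # ~0.79, above the ~0.75 model accuracy
```

With fees included, the required accuracy rises from 77% to roughly 79%, widening the gap against the model's approximately 75% hit rate at T-1h.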

How Large Is the Favorite-Longshot Bias?

Weather markets show a mild favorite-longshot bias (FLB), consistent with patterns observed across prediction markets. Low-probability brackets settle YES slightly more often than prices imply, while high-probability brackets settle YES slightly less often. But the magnitude is far smaller than in other domains.

Quantitative Comparison with Crypto Markets

Our crypto market analysis documented a 2-4 percentage point FLB on out-of-the-money contracts across 877,606 settled contracts. Weather FLB is roughly one-tenth that size:

| Price Bracket | Weather FLB | Crypto FLB |
| --- | --- | --- |
| Sub-10% implied | +0.4 pp (settles YES 5.4% vs 5.0% implied) | +2-4 pp |
| 40%+ implied | -0.6 pp (settles YES ~44% vs 45% implied) | -1-2 pp |

The 0.4 pp overpricing of longshots in weather markets means a bracket priced at 5 cents settles YES approximately 5.4% of the time — barely outside the margin of sampling noise across 1,506 events.

Seasonal Variation in FLB

The bias isn’t constant across seasons. Summer (JJA) produces the smallest FLB: temperature distributions are tightly clustered around the forecast (afternoon highs in July are highly predictable), so even the tail brackets are accurately priced. The sub-10% FLB in summer is approximately 0.2 pp.

Winter (DJF) produces the largest FLB: approximately 0.7 pp on sub-10% contracts. Winter storms create fat tails in the temperature distribution — an Alberta Clipper or nor’easter can produce a 10-degree miss that the tails capture. The market slightly underprices these tail risks, creating the FLB.

City Limitations

Our dataset covers only NYC high temperatures. Whether the FLB varies by city — coastal cities with marine moderation versus continental cities with higher volatility — remains an open question. We would expect cities with higher forecast uncertainty (Denver, Chicago) to show larger FLBs, but we don’t have the data to confirm this across 1,506+ events per city.

Is the FLB Tradeable?

No. A 0.4 pp edge on a 5-cent contract translates to roughly 0.4 cents of expected profit per contract (5.4% x $1.00 payout minus the 5-cent price). Kalshi's minimum taker fee on a 5-cent contract is approximately 0.34 cents, which consumes nearly all of that edge before accounting for the wide spreads and thin liquidity on tail brackets. Even at maker fees (approximately 0.09 cents), the residual edge is smaller than typical tail-bracket spreads. The same conclusion holds as in our crypto backtest: small biases exist but sit below the all-in cost floor.

What Does This Mean for Market Design?

Weather prediction markets provide an unusually clean laboratory for studying how prediction markets aggregate information. The convergence mechanics we’ve documented have implications for every prediction market domain. Three structural features drive weather market efficiency.

Public Expert Forecasts Accelerate Efficiency

The NWS publishes detailed, accurate forecasts multiple times daily, giving every participant the same high-quality baseline. The market’s job is to aggregate this baseline with private knowledge (local weather observations, alternative model outputs) into a consensus price. The 10-30 minute lag after forecast updates represents the time it takes for the crowd to process and trade on public information.

Fast Resolution Cycles Improve Calibration

Daily weather contracts provide 365 feedback loops per year. A mispriced contract settles within 24 hours, and the result is unambiguous (the temperature was 73 degrees F, not 72 degrees F). This tight loop punishes persistent biases: a trader who consistently overweights the rank-1 bracket loses money within weeks, not years.

Compare this to political prediction markets, where contracts on “Will X be president in 2028?” may remain open for years. The long resolution cycle allows narrative-driven pricing, momentum effects, and persistent biases that daily weather markets self-correct within hours. Our dataset of 1,506 events provides the equivalent of 4+ years of daily feedback — enough for any systematic bias to be identified and arbitraged away by attentive participants.

Low Emotional Stakes Reduce Noise

Nobody bets on weather because their identity is tied to the outcome. The absence of partisan, tribal, or fandom-driven biases produces a cleaner price signal. The FLB exists but is muted — 0.4 pp versus 2-4 pp in crypto and likely larger in political markets.

Which Domains Would Produce Similarly Efficient Markets?

The combination of public expert data, daily resolution, and minimal emotional attachment isn’t unique to weather. Several domains share all three characteristics and would likely produce efficient prediction markets:

  • Air quality indices from the EPA: daily readings, public expert models, no partisan attachment. A contract on “Will NYC AQI exceed 100 tomorrow?” has the same structural properties as temperature brackets.
  • River levels from USGS: hourly gauge readings, NWS river forecasts, relevant to flood insurance and logistics. 4,500+ gauges provide geographic breadth.
  • Initial jobless claims from the Department of Labor: weekly resolution, consensus forecasts from 50+ economists, and a long history of market-based pricing (bond markets already trade these outcomes).
  • Commodity settlement prices: daily NYMEX closes are public, forecast by hundreds of analysts, and resolve unambiguously.

Conversely, domains with high emotional stakes will carry larger biases regardless of data quality. Elections attract partisan bettors who systematically overpay for their preferred candidate — studies of political prediction markets show 3-7 pp of bias on heavily partisan contracts. Celebrity outcomes and sports rivalries introduce fandom-driven distortions where the rooting interest overpowers probability assessment. A Yankees fan betting on a Yankees-Red Sox game and a weather trader betting on tomorrow’s high temperature are solving the same mathematical problem, but the fan’s probability estimate is contaminated by 15-20% of identity-driven noise.

Our crypto market structure analysis showed that even non-emotional markets (crypto price contracts) achieve near-efficiency when data is public and resolution is fast — the 72.1 million trades in that dataset produced an FLB of only 2-4 pp, small by sports-betting standards. Weather markets push even lower, at 0.4 pp. The pattern across our 3 research datasets (weather, crypto structure, crypto backtest) is consistent: the emotional component is the primary driver of prediction market bias, not information asymmetry.

For real-time monitoring of convergence patterns across all prediction market platforms, check our live dashboard. The SIGNAL index tracks market certainty across categories, giving a macro view of where prediction markets are most and least confident.

Frequently Asked Questions

How do prediction market prices converge before settlement?

Winning contracts drift from a median of 43.8 cents to 77 cents in the final 6 hours before settlement, while losing contracts decline from 15.2 cents to 4 cents. The convergence path is smooth and predictable — there are no sudden jumps or discontinuities in the median trajectory. The rank-1 bracket converges approximately 2x faster than rank-3 brackets, with most price discovery concentrated in the top 2 brackets. Across 1,506 events, the market effectively identifies the winning bracket 6+ hours before settlement.

How quickly do Kalshi weather markets react to NWS forecasts?

73% of weather contracts show more than 1 cent of movement within 1 hour of an NWS forecast update, with the primary reaction window at 10-30 minutes. But the typical 1-3 cent movement is smaller than round-trip taker fees of 3-3.5 cents. Even at maker fees (0.8-1.0 cents round-trip), execution uncertainty and adverse selection erode the edge. The structural conclusion matches our crypto research: sub-3-cent signals cannot survive trading friction.

Is there a favorite-longshot bias in weather prediction markets?

Yes, but it’s small. Sub-10% contracts settle YES approximately 0.4 pp more often than prices imply (5.4% realized vs 5.0% implied). This is roughly one-tenth the 2-4 pp FLB observed in crypto markets. The bias is largest in winter (0.7 pp, driven by storm-related fat tails) and smallest in summer (0.2 pp). The 0.4 pp edge translates to 0.02 cents of expected profit per 5-cent contract, which is 17x smaller than the minimum taker fee.

What time of day are weather prediction markets most active?

Peak trading occurs 2-4 hours before settlement, capturing the window where forecast certainty is high but prices haven’t fully converged. The final 30-60 minutes see a secondary spike accounting for approximately 30% of daily volume, driven by market makers unwinding hedges and speculators betting on real-time temperature observations. Weekday events generate 2-3x more volume than weekend events, consistent with professional participation patterns.

Key Takeaways

  • Rank-1 brackets converge approximately 2x faster than rank-3 brackets — reaching 70 cents by T-4h while rank-3 brackets remain below 20 cents until T-2h
  • Fall events show fastest convergence (rank-1 at 65 cents by T-6h) versus winter’s slowest (rank-1 at 48 cents at T-6h), reflecting seasonal forecast accuracy differences
  • Losing brackets don’t decline uniformly — tail brackets (rank 4-6) collapse earliest while the rank-2 “insurance” bracket holds value longest, creating a cascade effect
  • Final-hour trading captures 30% of daily volume but is not profitable for new entries: buying at 77 cents requires >77% accuracy, while our model achieves approximately 75% at T-1h
  • Weather FLB is 0.4 pp on sub-10% contracts — one-tenth the size of crypto FLB and too small to trade profitably once fees and tail-bracket spreads are paid

For the full academic treatment of these findings, see our research paper on weather prediction market efficiency.
