Extended Kelly Criterion

Handling multiple bets and payoff uncertainties

November 24, 2025 · John Peach



Introduction

Niko Tosa and a couple of his friends walked into the Ritz Club in London one night, sat down at the roulette table, and broke the bank. By watching the relative positions of the ball and wheel, they accurately predicted the winning pocket, and they did it without any of the “cheating” methods used by the Eudaemons, who built computers into their shoes to track the ball and wheel. Tosa won by shifting the expected value of his bets in his favor. He was not only adept at predicting the outcome; he must also have studied his probability theory very carefully.

Security personnel at the Ritz reviewed the tapes of Tosa’s play, but never found evidence of any predictive software, so perhaps he was just an exceptional athlete who could sense where the ball was going. This article won’t show you how to gain an edge at roulette or any other casino game, but we’ll show you how to optimally manage your money.

Figure 1. Niko Tosa and his roulette strategy.

Besides predicting the outcome of each spin, Tosa likely kept careful track of how much he was betting each game. Bet too little, and he wouldn’t grow his stake at the optimum rate; bet too much, and he risked losing it all.

We’re not going to give you a method to pick stock market winners or encourage you to gamble, but we’ll show you how to invest wisely by managing your money. The mathematics in this article is dense, so if you’re more interested in experimenting with the outcome, go to the Github repository and download the Julia code. See the section Code for this article below for the link.

Quick Start Guide

For Mathematical Readers: Work through each section sequentially—the complexity builds deliberately.

For Experimenters: Jump to the “Experiments to Try” section, download the code, and start playing. Return to the theory when you want to understand why something works.

For Practitioners: Focus on “Discrete Case: Multiple Simultaneous Bets” and “Transaction Fees and Costs”—these have immediate practical applications.

A Review of the Kelly Criterion

In The Kelly Criterion, we showed how to optimize betting by wagering exactly the right amount of money based on the probability of winning, the expected returns, and the size of your stake at the time of the bet. Recall that if the probability of winning is $p$, the fraction of your initial capital $S_0$ that you bet is $0 \leq b \leq 1$, and you win $w$ units per unit bet, then your wealth after $n$ bets will be

$$S_n = S_0 (1 - b + bw)^{pn} (1 - b)^{(1-p)n}$$

where the first term represents wins and the second losses. Letting $R_n = \frac{S_n}{S_0}$ and normalizing to a single bet (taking the $n$-th root), the expected rate of return per bet is

$$R = (1 - b + bw)^p (1 - b)^{1-p}$$

and if we calculate the derivative with respect to $b$ and set $\frac{dR}{db} = 0$, we find that the optimal bet fraction is

$$b = \frac{pw - 1}{w - 1}.$$
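
As a concrete illustration, here is a minimal Julia sketch of this formula. The helper name kelly_fraction is illustrative, not part of the article's code; it assumes $w > 1$ and clamps a negative edge to zero (don't bet).

# Hypothetical helper (not from the article's repository): classic single-bet Kelly fraction.
# p is the probability of winning, w the payout per unit bet; assumes w > 1.
function kelly_fraction(p, w)
    b = (p * w - 1) / (w - 1)   # from setting dR/db = 0 above
    return clamp(b, 0.0, 1.0)   # never bet when the edge p*w - 1 is negative
end

kelly_fraction(0.3, 4.0)        # ≈ 0.0667, matching the example later in the article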

Suppose we want to make two bets at the same time, with expected win probabilities $p_1$ and $p_2$, and returns $w_1$ and $w_2$ on bet fractions $b_1$ and $b_2$. How should the bets be allocated to optimize the return?

Since $p_1$ and $p_2$ are probabilities, $0 \leq p_1, p_2 \leq 1$, and similarly $0 \leq b_1, b_2 \leq 1$ because they represent fractions of the total current cash on hand available to bet. A further constraint is that $b_1 + b_2 < 1$, since we can’t bet more than the total on hand.

A final constraint is that the expectation must be greater than one for each bet, so $p_1 w_1 > 1$ and $p_2 w_2 > 1$.

Let $B = 1 - b_1 - b_2$, which is the fraction of the capital on hand after making the two bets, and $R_1 = b_1 w_1$, $R_2 = b_2 w_2$, which are the amounts returned on each bet for winning outcomes. Next, let

$$\begin{aligned} S_{00}(b_1,b_2) &= (B)^{(1-p_1)(1-p_2)} \qquad \qquad \text{both lose} \\ S_{01}(b_1,b_2) &= (B+R_2)^{(1-p_1)p_2} \qquad \text{$b_1$ loses, $b_2$ wins}\\ S_{10}(b_1,b_2) &= (B+R_1)^{p_1(1-p_2)} \qquad \text{$b_1$ wins, $b_2$ loses} \\ S_{11}(b_1,b_2) &= (B+R_1+R_2)^{p_1 p_2} \qquad \text{both win} \end{aligned}$$

Letting $S(b_1,b_2) = S_{00}(b_1,b_2) \cdot S_{01}(b_1,b_2) \cdot S_{10}(b_1,b_2) \cdot S_{11}(b_1,b_2)$, we need to find $b_1$ such that

$$\frac{\partial S(b_1,b_2)}{\partial b_1} = 0.$$

Since

$$\frac{\partial S}{\partial b_1} = S \frac{\partial}{\partial b_1} \log(S)$$

the value of $b_1$ that optimizes $S$ reduces to a function of a sum of terms that is cubic in $b_1$, but the coefficients of the polynomial are too complicated for Mathematica to find a closed-form solution.

Figure 2. Coefficients of the third-degree polynomial in $b_1$.

Adding a third bet would generate a quartic polynomial, and beyond that, Abel showed that polynomials of degree five or higher have no general closed-form solutions (see The Sum of the Sum of Some Numbers). Still, the maximum value of $S$ exists, as seen in the surface plot:

Figure 3. Surface plot of $S$ for $(p_1, w_1) = (0.4, 3)$, $(p_2, w_2) = (0.5, 5)$.
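
Since no closed form exists, a coarse grid search over $\log S$ already locates the maximum shown in the surface plot. The sketch below uses only base Julia and is not taken from the article's repository; the parameter values are the ones quoted in the figure caption.

# Minimal sketch: evaluate log S(b1, b2) on a grid and locate its maximum numerically.
function logS(b1, b2, p1, w1, p2, w2)
    B  = 1 - b1 - b2                 # cash held back
    R1 = b1 * w1                     # returned if bet 1 wins
    R2 = b2 * w2                     # returned if bet 2 wins
    (1 - p1) * (1 - p2) * log(B) + (1 - p1) * p2 * log(B + R2) +
        p1 * (1 - p2) * log(B + R1) + p1 * p2 * log(B + R1 + R2)
end

function grid_search(p1, w1, p2, w2; grid = 0.0:0.001:0.5)
    best, arg = -Inf, (0.0, 0.0)
    for b1 in grid, b2 in grid
        b1 + b2 < 1 || continue      # can't bet more than you have
        v = logS(b1, b2, p1, w1, p2, w2)
        v > best && ((best, arg) = (v, (b1, b2)))
    end
    return arg, best
end

grid_search(0.4, 3.0, 0.5, 5.0)      # parameters used in the surface plot above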

We also need to consider cases where the payoff $w$ follows a continuous probability distribution. This would be the case if you invest in the stock market: the price of a stock could, in principle, rise indefinitely, or fall to zero if the company goes bankrupt. Conversely, if you short a stock, there is no effective ceiling on the losses you could incur.

For continuous probability distributions, the normal distribution is often used, but others are possible and may better represent the circumstances. We also need to consider other costs such as brokerage or exchange fees, bid-ask spreads, taxes on short-term gains, slippage, and even travel costs to casinos. These extra fees reduce the optimal betting amount and require a more conservative strategy. Many have advocated for a fractional Kelly approach where the amount bet is reduced to a percentage of the calculated optimum.

In this article, we’ll extend the Kelly Criterion in four steps:

  1. Discrete case: multiple simultaneous bets
  2. Discrete case: uncertain probabilities
  3. Continuous case: uncertain returns
  4. Continuous case: uncertain probabilities and returns

Along the way, we’ll also account for transaction fees and costs.

Discrete Case: Multiple Simultaneous Bets

Let’s first consider the case where the payoffs for two bets are known, and we can estimate the probabilities of winning each independently. Even though there may not be closed-form solutions for this problem, we can still find a very close numerical approximation to the optimal values of the bet fractions $b_1$ and $b_2$. In the surface plot of $S(b_1,b_2)$, there is a maximum, and excellent numerical methods have been developed to find it. In fact, in many cases, the amount bet needs to be rounded to the nearest integer multiple of some minimum allowable amount. For example, you couldn’t bet a fraction of a chip at a roulette table, or buy fractions of a stock.

Let’s generalize the equations to allow for an arbitrary number $n$ of simultaneous bets. Then

$$B = 1 - \sum_{k=1}^n b_k$$

is the fraction of the capital remaining after the bets, and $R_k = b_k w_k$ is the return amount if bet $k$ pays. Each bet is binary: it either wins with probability $p_k$ or loses with probability $1-p_k$, so there are $2^n$ possible combinations of win/loss outcomes for the $n$ bets. Let $p$ be the vector of win probabilities, $w$ the vector of payoffs per unit bet, and $b$ the vector of bet fractions. Now, if we define $R = [b_1 w_1, b_2 w_2, \ldots, b_n w_n] = b \odot w$ and the index vector $I$ to be the binary representation of an integer between $0$ and $2^n - 1$, then

$$S_I = (B + R I^T)^{\Pi\left(p \odot I + (1-p) \odot \neg I\right)}$$

where $\odot$ is the Hadamard, or element-wise, product, which multiplies corresponding elements of two vectors: for example, $[1,2,3] \odot [4,5,6] = [4,10,18]$. Note that this is different from matrix multiplication or dot products. The vector $\neg I$ denotes the element-wise complement of $I$, replacing each $1$ with $0$ and vice versa.

Suppose $n = 3$, so the index vector $I$ takes values from $0$ to $2^3 - 1 = 7$, expressed in base $2$ as $I = \{000, 001, 010, 011, 100, 101, 110, 111\}$. Using this representation, we can construct each of the terms of $S$. For example, suppose we want to generate $S_{I_5} = S_{101}$. In this case

$$R I_5^T = \begin{bmatrix} b_1 w_1 & b_2 w_2 & b_3 w_3 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = b_1 w_1 + b_3 w_3.$$

The exponent becomes

$$\begin{aligned} & \Pi \left( \begin{bmatrix} p_1 & p_2 & p_3 \end{bmatrix} \odot \begin{bmatrix} 1 & 0 & 1 \end{bmatrix} + \begin{bmatrix} (1-p_1) & (1-p_2) & (1-p_3) \end{bmatrix} \odot \begin{bmatrix} 0 & 1 & 0 \end{bmatrix} \right) \\ &= \Pi \left( \begin{bmatrix} p_1 & 0 & p_3 \end{bmatrix} + \begin{bmatrix} 0 & (1-p_2) & 0 \end{bmatrix} \right) \\ &= \Pi \begin{bmatrix} p_1 & (1-p_2) & p_3 \end{bmatrix} \\ &= p_1(1-p_2)p_3. \end{aligned}$$

Thus,

$$S_{I_5} = (B + b_1 w_1 + b_3 w_3)^{p_1(1-p_2)p_3}$$

representing the case when the first and third bets paid, but the second lost. By constructing the function $S$ this way, we can store copies for various values of $n$, and then optimize using the particular estimates of $p$ and $w$ at the moment we want to place a bet.
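
The sketch below shows the idea in a few lines of base Julia (it is not the repository's kelly_disc.jl): sum the log-wealth of each of the $2^n$ outcomes weighted by its probability.

# Minimal sketch: expected log-wealth for n simultaneous binary bets, summing over all
# 2^n win/lose outcomes exactly as described above.
function expected_log_wealth(b, p, w)
    n, B = length(b), 1 - sum(b)                 # B is the cash fraction held back
    total = 0.0
    for idx in 0:(2^n - 1)
        I = digits(idx, base = 2, pad = n)       # binary index vector for this outcome
        wealth = B + sum(b .* w .* I)            # B + R Iᵀ with R = b ⊙ w
        prob   = prod(p .* I .+ (1 .- p) .* (1 .- I))
        total += prob * log(wealth)
    end
    return total
end

expected_log_wealth([0.0645, 0.0992], [0.3, 0.25], [4.0, 6.0])   # ≈ 0.0285, cf. the output below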

The Julia module kelly_disc.jl contains functions for optimizing the discrete case. Suppose you have two opportunities where $p_1 = 0.3$, $w_1 = 4$, and $p_2 = 0.25$, $w_2 = 6$. If they were played independently, the optimal bet fractions would be

$$\begin{aligned} b_1 &= \frac{p_1 w_1 - 1}{w_1 - 1} = \frac{0.3 \cdot 4 - 1}{4 - 1} \approx 0.0667 \\ b_2 &= \frac{p_2 w_2 - 1}{w_2 - 1} = \frac{0.25 \cdot 6 - 1}{6 - 1} = 0.1. \end{aligned}$$

When playing both bets simultaneously, the betting fractions change slightly:


============================================================
DISCRETE KELLY OPTIMIZATION RESULTS
============================================================

Problem Setup:
  Number of bets: 2
  Transaction fee: 0.00%

Bet Parameters:
  Bet 1: p=0.3000, w=4.0000, E[p·w]=1.2000
  Bet 2: p=0.2500, w=6.0000, E[p·w]=1.5000

Optimal Bet Fractions:
  b_1 = 0.064500 (6.45%)
  b_2 = 0.099154 (9.92%)

Total bet: 0.163655 (16.37%)
Cash held: 0.836345 (83.63%)

Max E[log(wealth)]: 0.028532
Converged: true  Iterations: 3  Method: lbfgs
============================================================

The long-term growth rate is $2.85\%$ per bet, so if you play $100$ similar bets, your initial stake will grow by a factor of about $(1 + 0.0285)^{100} \approx 16.6$.

The third input parameter to DiscreteKelly is the transaction cost, which we set to zero in the example above, but should be considered in an actual betting situation. Suppose you’re playing roulette (see Roulette Physics) and you can reliably predict where the ball will fall within an octant. It might become obvious that you’re playing several numbers that are adjacent on the wheel, so to cover this, you could throw a few chips randomly onto other numbers, which would be a transaction cost. You might also include travel expenses or taxes as a transaction cost, but this would be amortized over your stay at the casino.

The DiscreteKelly function should be restricted to a handful of simultaneous bets because the number of terms in the objective doubles with each new bet. If you have $n$ discrete bets, the number of terms will be $2^n$, since every combination of win/lose outcomes must be included. For example, with $10$ bets the equation has $2^{10} = 1024$ terms.

Discrete Case: Uncertain Probabilities

Unlike the previous case, where we assumed a fixed probability of winning and a known payoff, Niko Tosa didn’t know the probability exactly; he knew only a distribution. The game is still binomial: you win with probability $p$ and lose with probability $1-p$, but your knowledge of $p$ itself is uncertain. If you use a fixed value of $p$, you are assuming more information about the outcome than you actually have.

If you assume that the distribution is a beta distribution (a flexible probability distribution defined on the interval $[0,1]$, controlled by two shape parameters $\alpha$ and $\beta$; when both equal 1 it reduces to the uniform distribution), then you can begin to estimate the parameters by collecting your win/loss results. The beta distribution is defined on the interval $[0,1]$ by

$$\frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}$$

where

$$B(\alpha,\beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)}$$

and Γ\Gamma is the Gamma function.

Figure 4. Beta distribution PDF.

Start with an initial guess for the probability, $p \sim \text{Beta}(\alpha_0, \beta_0)$, called the prior. With no knowledge of how well you will be able to predict the outcome, set $\alpha_0 = \frac{1}{38}$, $\beta_0 = \frac{37}{38}$, which gives equal likelihood to every pocket on an American wheel (use 37 pockets for a European wheel). Now begin collecting data, counting the number of wins $k$ and the total number of attempts $n$, which will improve your estimate of the true distribution

$$p \,|\, \text{data} \sim \text{Beta}(\alpha_0 + k, \beta_0 + n - k)$$

where $p \,|\, \text{data}$ means “the probability given the collected data”. The empirically derived probability $p_{\text{eff}}$ is the mean of the distribution,

$$p_{\text{eff}} = \mathbb{E}[p \,|\, \text{data}] = \frac{\alpha_0 + k}{\alpha_0 + \beta_0 + n}.$$
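
A minimal sketch of this update using Distributions.jl; the win/attempt counts are hypothetical, not measured data:

using Distributions

α0, β0 = 1/38, 37/38               # prior matching a uniform guess over 38 pockets
k, n   = 9, 200                    # hypothetical tally: 9 correct predictions in 200 spins
posterior = Beta(α0 + k, β0 + n - k)
p_eff = mean(posterior)            # (α0 + k)/(α0 + β0 + n) ≈ 0.045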

An even better way to estimate the probability distribution would be to record how far off your guess is from the winning pocket. Build a vector of distances between the estimated target pocket and the true pocket, $q = [q_{-3}, q_{-2}, q_{-1}, q_0, q_1, q_2, q_3]$, where $q_{-i}$ represents $i$ pockets early, $q_0$ is exactly right, and $q_i$ is $i$ pockets late. Now you can use the Dirichlet distribution, a multivariate generalization of the Beta distribution that models probabilities over multiple categories summing to 1 (here, the likelihood of the ball landing in different groups of pockets), to develop a more complete model of the error pattern. As in the Beta case, start with an estimated prior $q \sim \text{Dirichlet}(\alpha_0)$ and collect counts $c_k$ of each offset value to estimate the posterior

$$\begin{aligned} q \,|\, \text{data} &\sim \text{Dirichlet}(\alpha_{0,k} + c_k) \\ q_0 \,|\, \text{data} &\sim \text{Beta}\left(\alpha_{0,0} + c_0,\; \sum_{k \neq 0}(\alpha_{0,k} + c_k) \right) \end{aligned}$$

where the second line is the marginal distribution for the center (exactly right) category.

Using this method provides a better estimate of the probabilities over a range of adjacent pockets. For the Dirichlet prior, the best initial estimate is $\alpha_{0,k} = 1/38$ for all $k$.
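
The corresponding Dirichlet update is just as short. A minimal sketch with Distributions.jl; the offset counts are invented for illustration:

using Distributions

α0 = fill(1/38, 7)                 # weak prior over the seven offsets q₋₃ … q₃
c  = [2, 5, 11, 18, 12, 6, 3]      # hypothetical counts of each observed offset
posterior = Dirichlet(α0 .+ c)
q_eff = mean(posterior)            # posterior mean probability of each offset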

We can compare how uncertain probabilities affect the betting fractions relative to the earlier case, where we knew the probabilities of winning exactly. Let $\alpha_i = s p_i$ and $\beta_i = s(1-p_i)$, $i = 1, 2$, for some small value of $s$, such as $s = 10$, so the resulting probabilities are very uncertain. Keep the payoffs the same as before,

using Distributions    # provides the Beta distribution

p1, p2 = 0.3, 0.25     # win probabilities from the earlier example
s = 10.0               # weak prior, much more uncertainty about p
α1, β1 = s*p1, s*(1-p1)
α2, β2 = s*p2, s*(1-p2)
p_dists = [Beta(α1, β1), Beta(α2, β2)]

and then run the multi-Bayes example,


[0.0645002348517128, 0.09915443802025861]  total bet = 0.1636546728719714
Multi-bet Bayesian Kelly result:
  b_1 = 0.064500 (6.45%)
  b_2 = 0.099154 (9.92%)
  Cash held = 0.836345 (83.63%)
  Total bet = 0.163655 (16.37%)
  Max E[log(wealth)] = 0.028532
  Converged: true  Iterations: 8  Method: lbfgs_bayes_multi

which gives the same result as the discrete case. For a single bet with probability $p$ and log-utility (using the logarithm of wealth captures the diminishing marginal value of money: doubling your wealth from \$1M to \$2M doesn’t feel as significant as doubling from \$10K to \$20K), the conditional expected log-wealth is

$$g(p;b) = p \log W_{\text{win}}(b) + (1-p) \log W_{\text{lose}}(b).$$

If $P$ is random with some distribution (e.g., Beta), the Bayesian expected log-wealth is

$$\mathbb{E}_P[g(P;b)] = \mathbb{E}_P[P \log W_{\text{win}} + (1-P) \log W_{\text{lose}}] = \mathbb{E}[P] \log W_{\text{win}} + (1-\mathbb{E}[P]) \log W_{\text{lose}},$$

so the dependence on the distribution of $P$ vanishes except through its mean, $p_{\text{eff}} = \mathbb{E}[P]$. If you wanted to include a variance term, you could define the objective function as

$$\max_{b} \left\{ \mathbb{E}_P[g(P;b)] - \lambda \, \text{Var}_P[g(P;b)] \right\}$$

where $\lambda$ is a measure of your risk tolerance. Alternatively, you could run Monte Carlo simulations (computational techniques that use repeated random sampling to obtain numerical results, named after the famous casino and useful when analytical solutions are hard to find) with data collected in situ to optimize $\lambda$, or use the method described in Distributional Robust Kelly Gambling by Qingyun Sun and Stephen Boyd.
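
A minimal Monte Carlo sketch of the variance-penalized objective for a single bet, using Distributions.jl and a grid search; the prior strength, the value λ = 0.5, and the function names are illustrative, not taken from the repository:

using Distributions, Statistics

function penalized_objective(b, ps, w, λ)
    g(p) = p * log(1 - b + b * w) + (1 - p) * log(1 - b)   # conditional log-growth g(p; b)
    gs = g.(ps)
    return mean(gs) - λ * var(gs)
end

ps = rand(Beta(10 * 0.3, 10 * 0.7), 10_000)   # samples from a weak prior centered on p = 0.3
bs = 0.0:0.001:0.2
b_star = bs[argmax([penalized_objective(b, ps, 4.0, 0.5) for b in bs])]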

Continuous Case: Uncertain Returns

What if we don’t know the exact return, but only its probability distribution? Instead of a fixed return $w$, the payoff follows a probability density function $p(w)$. In the example of Figure 5, the density peaks at $w = 1.75$ with a value of about $0.8$, takes a value of about $0.25$ at $w = 1$, and falls to zero for $w < 0$ or $w > 3.5$.

Figure 5. Payoff returns as a continuous distribution.

For a single bet with a continuous probability distribution, the wealth after betting is

$$R = 1 - b + bw.$$

The objective function is,

$$g(b) = \mathbb{E}[\log R] = \int_{-\infty}^{\infty} p(w) \log(1 - b + bw) \, dw$$

and we want to maximize $g(b)$. This is similar to Claude Shannon’s information entropy discussed in The Kelly Criterion,

$$H(X) = -\sum_{i=1}^n P(x_i) \log_2 P(x_i)$$

where $P(x_i)$ is the probability of receiving message $x_i$, and $X$ is the vector of messages, $X = [x_1, x_2, \ldots, x_n]$.

For multiple simultaneous bets with bet fractions $b_1, b_2, \ldots, b_n$ and payoffs $w_1, w_2, \ldots, w_n$, the wealth $W$ after one session is

$$W = 1 - \sum_{i=1}^n b_i + \sum_{i=1}^n b_i w_i.$$

The objective function for multiple bets becomes

$$G(b_1, b_2, \ldots, b_n) = \mathbb{E}[\log W] = \mathbb{E}\left[\log\left(1 - \sum_{i=1}^n b_i + \sum_{i=1}^n b_i w_i\right)\right]$$

with the constraints

$$b_i \geq 0, \quad \sum_{i=1}^n b_i \leq 1.$$

Multiple Continuous Kelly

Just as in the discrete case, we can extend the continuous case to include multiple simultaneous bets. If the bets are known to be independent, as might be the case if day-trading in unrelated industries, then we could invest in multiple trades simultaneously to improve the overall probability of success.

In the example above with two bets, $(p_1, w_1) = (0.3, 4.0)$ and $(p_2, w_2) = (0.25, 6.0)$, we found that the optimal fractions were $(b_1, b_2) = (0.0645, 0.0992)$. When the payoffs are continuous, the input parameters depend on the expected payoff values and their probability distribution. For the discrete case, the expected value and variance are

$$\begin{aligned} \mathbb{E}[R] &= pw \\ \text{Var}(R) &= p(1-p)w^2 \end{aligned}$$

where $R$ is the gross return.

For the first bet,

$$\begin{aligned} \mu_{R_1} &= \mathbb{E}[R_1] = p_1 w_1 = 0.3 \cdot 4 = 1.2 \\ \sigma_{R_1}^2 &= \text{Var}(R_1) = p_1(1-p_1)w_1^2 = 0.3 (1 - 0.3) \cdot 4^2 = 3.36 \end{aligned}$$

and for the second bet $\mu_{R_2} = 1.5$ and $\text{Var}(R_2) = 6.75$.

For a continuous distribution, the log-normal works well because it is bounded below by zero, so the payoff is never negative.

Figure 6. Log-normal distributions.

The expected value of $R$ for a log-normal distribution is $\mathbb{E}[R] = e^{\mu + \frac{1}{2}\sigma^2}$, and the variance is $\text{Var}(R) = \left(e^{\sigma^2} - 1\right)e^{2\mu + \sigma^2}$. Then

$$\phi = 1 + \frac{\sigma_R^2}{\mu_R^2} = e^{\sigma^2} \;\Rightarrow\; \sigma^2 = \ln\left(1 + \frac{\sigma_R^2}{\mu_R^2}\right)$$

and

$$\mu = \ln(\mu_R) - \frac{1}{2}\sigma^2.$$

(See the Appendix for details on the variable $\phi$.) For the first bet, $\mu_{R_1} = 1.2$ and $\sigma_{R_1}^2 = 3.36$, so

$$\begin{aligned} \phi_1 &= 1 + \frac{3.36}{1.2^2} \approx 3.333 \\ \sigma_1^2 &= \ln(\phi_1) \approx 1.20397 \\ \mu_1 &= \ln(1.2) - \tfrac{1}{2}\sigma_1^2 \approx 0.18232 - 0.60199 \approx -0.41966 \end{aligned}$$

and for the second bet $\mu_{R_2} = 1.5$ and $\sigma_{R_2}^2 = 6.75$, so $\sigma_2^2 \approx 1.38629$ and $\mu_2 \approx -0.28768$.
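
A minimal sketch of this moment matching; the helper name is illustrative, not from the repository:

# Match a log-normal to the binary bet's gross-return mean p·w and variance p(1-p)·w².
function lognormal_params(p, w)
    μR, σR2 = p * w, p * (1 - p) * w^2
    σ2 = log(1 + σR2 / μR^2)          # from φ = 1 + σ_R²/μ_R² = e^{σ²}
    μ  = log(μR) - σ2 / 2
    return μ, σ2
end

lognormal_params(0.3, 4.0)    # ≈ (-0.41966, 1.20397), the first bet above
lognormal_params(0.25, 6.0)   # ≈ (-0.28768, 1.38629), the second bet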


===============================================================
Number of bets: 2
Transaction fee: 0.00%
Correlation: Independent

Bet parameters:
  Bet 1: LogNormal{Float64}, μ=1.2000, σ=1.8330
  Bet 2: LogNormal{Float64}, μ=1.5000, σ=2.5981

Optimal bet fractions:
  b_1 = 0.029763 (2.98%)
  b_2 = 0.037038 (3.70%)

Total bet: 0.066801 (6.68%)
Cash held: 0.933199 (93.32%)

Max E[log(wealth)]: 0.019828
Converged: true  Iterations: 0  Method: mc_fixed
===============================================================

Notice that the bet fractions for the continuous distribution are much smaller (2.98% and 3.70%) than for the discrete case (6.45% and 9.92%). In the discrete case, the payoff is certain and only the probability of winning is uncertain, while in the continuous case, the payoff might be any value greater than zero.

The log-normal distribution puts much of its probability mass near zero, with a long, thin tail extending toward infinity, so for a given bet size the expected log-growth is lower even though the mean payoff is matched. The continuous Kelly reduces the bet size to reflect this additional payoff risk.
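
You can check this with a quick Monte Carlo grid search under the assumptions of this section (independent log-normal payoffs with the parameters derived above). This sketch is not the repository's method, so expect small differences from sampling noise:

using Distributions, Statistics, Random

Random.seed!(1)
W = hcat(rand(LogNormal(-0.41966, sqrt(1.20397)), 20_000),    # payoff samples for bet 1
         rand(LogNormal(-0.28768, sqrt(1.38629)), 20_000))    # payoff samples for bet 2

G(b) = mean(log.(1 .- sum(b) .+ W * b))       # Monte Carlo estimate of E[log wealth]

grid = 0.0:0.005:0.15
best = argmax([G([b1, b2]) for b1 in grid, b2 in grid])
(grid[best[1]], grid[best[2]])                # close to the (2.98%, 3.70%) reported above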

Continuous Case: Uncertain Probabilities and Returns

In this final extension of the Kelly Criterion, we consider the case where both the probability of winning and the payoff are drawn from probability distributions. The probability of winning is modeled with a Beta distribution and the payoff with a univariate return distribution. This is implemented in the module kelly_general_bayes.jl, which extends the concepts developed in the previous models. This version lets you place multiple simultaneous bets, allocated optimally to grow your total capital at the maximum rate, and, as in the previous models, lets you include fees or betting costs.

Traditional Kelly models assume you know the probability of winning and the payoff exactly, but we can now turn both into distributions that better reflect how you would approach an investment or betting opportunity. By modeling distributions, we have a rigorous basis for adjusting the betting fractions. For example, if we use the probabilities and payoffs from the first example above,

# Fixed means and payoffs:
p1_mean, p2_mean = 0.30, 0.25
gross_w1, gross_w2 = 4.0, 6.0

and compare a strong prior, $s_A = 1000$, to a weak prior, $s_B = 2$, in two scenarios, we see that the weak prior reduces the optimal total bet by over 15%:

Scenario A (strong prior): b = [0.06565483237670092, 0.0990407051985005]  total bet = 0.16469553757520142
Scenario B (weak prior):  b = [0.044006266789879726, 0.09526125397014709]  total bet = 0.13926752076002683
Fraction reduction in total bet = 15.44%

In most cases, the reduction would likely be even greater, since the expected values are high in this example: $\mathbb{E}[p_1 w_1] = 0.3 \cdot 4.0 = 1.2$ and $\mathbb{E}[p_2 w_2] = 0.25 \cdot 6.0 = 1.5$. With strong conviction (a tight probability prior) and near-known payoff distributions, the model recovers something very close to the classic Kelly fractions, but with weak conviction (diffuse probability priors) or large payoff variance, the optimal fractions naturally shrink, reflecting risk from estimation error or payoff tail risk.

With multiple simultaneous bets, you can allocate capital not just by edge but also by uncertainty, return variability, and covariance if enabled. Here’s how to use this model:

  1. Define your model beliefs

    • Choose for each bet $i$: a prior distribution for $p_i$ (e.g., Beta(α, β)) and a win and loss return distribution (e.g., LogNormal(μ, σ)).

    • Optionally, a correlation matrix if bet returns are dependent.

  2. Instantiate the model

k = GeneralBayesianKelly(p_dists      = [ … ],
                         return_dists = [ … ],
                         loss_dists   = [ … ],
                         fee          = f,
                         correlation  = Corr)
  3. Optimize for stake fractions
res = optimize_gen_bayes(k; method=:lbfgs, n_samples=…, rng=…)

The result res.b gives the vector of optimal bet fractions, res.cash the un-bet cash fraction, and res.objective the achieved expected log-wealth.

  4. Inspect and plot. Use plot_objective_gen_bayes to explore how expected log-wealth varies with each $b_i$, or compare different prior strengths and payoff variances to study their effect on allocation.

The kelly_general_bayes.jl module lets you model real-world conditions much more accurately than the original single-bet, fixed-probability, fixed-payoff model.

You can also run Monte Carlo simulations,

include("kelly_general_bayes.jl")
using .KellyGeneralBayes
using Random

# Base model
(k, res_kelly) = run_examples_general()

# Make some alternative strategies to compare:
# e.g., half-Kelly and a "conservative" tweak
res_half = GenBayesKellyResult(0.5 .* res_kelly.b,
                               1.0 - k.fee - sum(0.5 .* res_kelly.b),
                               res_kelly.objective, true, 0, :manual_half)

res_zero = GenBayesKellyResult([0.0, 0.0], 1.0 - k.fee, 0.0, true, 0, :cash_only)

fig = compare_strategies_monte_carlo(
    k,
    ["Full Kelly" => res_kelly,
     "Half Kelly" => res_half];
    n_iters=3000,
    n_horizon=200,
    bands=((0.10,0.90),(0.25,0.75)),
    bins=40,
    rng=MersenneTwister(123)
)

display(fig)

which shows the difference between a conservative half-Kelly and the full model:

Figure 7. Wealth trajectories.

Transaction Fees and Costs

You need to consider various costs that might reduce your overall winnings. While playing roulette, you might spread a few “dummy” bets around the table to hide the fact that your system consistently selects a few winning pockets. For day trading, there are brokerage fees and taxes to consider, among others. In the code, we simply subtract the fee from the available capital, $B = 1 - \sum b_i - \text{fee}$, to account for these losses. Other costs to consider include brokerage or exchange fees, bid-ask spreads, taxes on short-term gains, slippage, and travel expenses.

You should also consider the possibility of naturally occurring long strings of losses. Imagine playing a game involving a coin toss in which heads is a win and tails is a loss. The law of large numbers says that over many tosses, the number of heads will approximately equal the number of tails, but there will always be long runs of tails that could happen at any time. If the run is long enough, no matter how large your initial capital, you will go bust. This is known as gambler’s ruin, and we need to account for the possibility of such an occurrence.
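
A quick base-Julia sketch makes the point: even with a favorable coin, long losing runs are routine, which is why many practitioners scale back to fractional Kelly.

using Random

function longest_losing_streak(p, n; rng = MersenneTwister(42))
    streak = longest = 0
    for _ in 1:n
        if rand(rng) < p
            streak = 0                        # a win resets the run of losses
        else
            streak += 1
            longest = max(longest, streak)
        end
    end
    return longest
end

longest_losing_streak(0.55, 10_000)           # typically a run of 10 or more straight losses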

⚠️ Common Pitfalls When Using Kelly

  1. Overestimating your edge - The #1 error. Kelly assumes you know probabilities accurately.
  2. Ignoring correlation - Betting on multiple correlated outcomes isn’t really diversification.
  3. Forgetting transaction costs - Even small fees dramatically reduce optimal bet sizes.
  4. Not rebalancing - Kelly fractions need updating as your wealth changes, preferably after each transaction.
  5. Using arithmetic means - Kelly requires geometric thinking about compound growth (see the sketch below).
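
A two-line illustration of the last pitfall: a bet that returns +50% or -40% with equal probability looks profitable by its arithmetic mean but shrinks wealth when compounded.

factors = [1.5, 0.6]                      # wealth multipliers for a win and a loss
arithmetic_mean = sum(factors) / 2        # 1.05  -> looks like +5% per bet
geometric_mean  = prod(factors)^(1/2)     # ≈ 0.9487 -> about -5% per bet when compounded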

Possible Applications

Several investment or betting opportunities could benefit from the extended Kelly Criterion framework:

Quantitative trading strategies / algorithmic strategies

Many algorithmic trading or systematic strategies involve repeated opportunities with an estimated edge, uncertain payoffs, and transaction costs, which is exactly the setting this extended framework handles.

Sports betting or wagering markets

In sports betting, you may estimate true win probabilities of teams or players (based on models/data) and compare them with bookmaker odds. There is inherent uncertainty in your probability estimates, and payoffs are known (or nearly known) when you win, but you can treat them as distributions if you include, e.g., parlay possibilities.

Options or derivatives trades with modelled edge

In options trading, you may have a view that an option is mispriced (so you estimate a probability of a favorable outcome), and you have an estimate of payoff distribution (the option payoff is a known structure, but the underlying volatility or tail risk is uncertain).

Summary

These extensions to the original Kelly Criterion bring you much closer to realistic betting/investing opportunities by modeling uncertain probabilities of winning and uncertain payoff amounts, optimizing over multiple simultaneous bets, and including transaction fees. The models check that your total bet fraction is never greater than one, and each bet has an expected value greater than one.

These methods only optimize the betting fractions, so you still need to accurately estimate the chances of winning and how much you expect to make on each investment. Many investments could benefit from applying the extended Kelly Criterion, but be sure to thoroughly understand each and be careful when developing the distributions used.

Where to Go From Here

The mathematics we’ve covered transforms the simple Kelly Criterion into a practical tool for real-world decision-making. But understanding the theory is just the beginning—the real value comes from experimentation and application.

Start conservative, then explore: Begin with the discrete case using small, hypothetical portfolios. Watch how the optimal fractions change as you adjust probabilities and payoffs. Notice how uncertainty in your estimates naturally reduces the recommended bet sizes—this is the model protecting you from overconfidence.

Build intuition through simulation: Run Monte Carlo simulations comparing full Kelly to half-Kelly strategies. You’ll see that while full Kelly maximizes long-term growth, it also experiences dramatic drawdowns. This visceral understanding of the volatility-growth tradeoff is something equations alone can’t teach.

Test with real data: Apply these models to historical data—stock prices, sports betting odds, or even board game probabilities. The gap between theoretical optimal betting and practical constraints will become immediately apparent. Transaction costs matter. Estimation error matters. The frequency of rebalancing matters.

Remember the limits: These models optimize how much to bet, not what to bet on. No amount of mathematical sophistication can turn a bad edge into a good one. The Kelly Criterion’s most important lesson might be knowing when not to bet at all.

The Julia code accompanying this article gives you a laboratory for exploring these ideas. Whether you’re managing an investment portfolio, analyzing strategic decisions, or just curious about optimal resource allocation, the extended Kelly framework offers a rigorous foundation for thinking about risk, uncertainty, and growth.

Start experimenting. Start small. And most importantly, start learning from what the models tell you about your own assumptions.

Experiments to Try

Ready to dig deeper? Here are hands-on experiments to build your intuition:

Beginner Experiments

  1. The Overconfidence Test

    • Set up two bets with p₁=0.3, w₁=4 and p₂=0.25, w₂=6
    • Compare optimal fractions using strong prior (s=1000) vs weak prior (s=10)
    • Question: How much do your bet sizes shrink when you’re less certain?
  2. The Correlation Explorer

    • Create two bets with identical expected values
    • Run optimizations with correlation = 0, 0.5, and 0.9
    • Question: How does correlation between bets affect diversification benefits?
  3. The Transaction Cost Reality Check

    • Take any optimal betting strategy
    • Gradually increase the fee parameter from 0% to 5%
    • Question: At what fee level does the strategy become unprofitable?

Intermediate Experiments

  1. Discrete vs Continuous Comparison

    • Set up the same scenario using both discrete and continuous models
    • Compare the optimal fractions (you’ll see continuous gives smaller bets)
    • Question: Why does uncertainty in payoff reduce bet size more than uncertainty in probability?
  2. The Gambler’s Ruin Simulator

    • Start with $1,000 and a favorable bet (p=0.55, w=2)
    • Use full Kelly fractions for 1,000 rounds
    • Run 100 simulations and plot the distribution of outcomes
    • Question: How many simulations went bust despite positive expected value?
  3. Half-Kelly vs Full Kelly Battle

    • Run 10,000 iterations comparing both strategies
    • Track: median wealth, maximum drawdown, probability of 50% loss
    • Question: Is the extra volatility of full Kelly worth the higher growth rate?

Advanced Experiments

  1. Multi-Asset Portfolio Optimization

    • Create 5 independent betting opportunities with varying risk/reward
    • Optimize simultaneously vs optimizing each independently
    • Question: How much does simultaneous optimization improve expected growth?
  2. Bayesian Learning Simulation

    • Start with a weak prior (Beta(1,1)) for win probability
    • Simulate 100 bets, updating your belief distribution after each
    • Watch optimal fractions evolve as your estimates improve
    • Question: How many observations until your bets stabilize?
  3. Stress Testing with Fat Tails

    • Replace log-normal return distribution with Student’s t-distribution (df=3)
    • Compare optimal fractions to the log-normal case
    • Question: How should you adjust betting when extreme outcomes are more likely?
  4. The Rebalancing Frequency Study

    • Set up a two-bet continuous strategy
    • Compare: rebalance every bet vs every 10 bets vs never rebalance
    • Question: How much growth do you sacrifice by not rebalancing?

Real-World Application Challenge

  1. Your Own Portfolio Analysis
    • Take 3-5 investments you’re considering (or currently hold)
    • Estimate probability distributions for each using historical data
    • Run the general Bayesian Kelly optimization
    • Compare recommended fractions to standard “equal weight” or “market cap weight”
    • Question: How different are Kelly-optimal weights from conventional wisdom?

Debugging Your Intuition

  1. The Impossible Bet Exercise
    • Try to optimize a bet with p=0.4, w=2 (expected value < 1)
    • Watch the optimizer recommend b=0
    • Try p=0.5, w=2.1 (barely favorable)
    • Question: How close to break-even must you be before Kelly says “don’t bet”?

Documentation and Sharing

For each experiment, record the setup, your assumptions, and the results so you can compare runs and share what you learn.

The best way to learn Kelly optimization isn’t through equations—it’s through breaking the models, stress-testing assumptions, and discovering edge cases. Start experimenting today.

Glossary

Code for this article

The complete Julia implementation is available at Extended Kelly Criterion — Julia Implementation. Each section of the article has a corresponding module in the repository.

This repository provides a full suite of four standalone Kelly-optimization models, progressing from classical fixed-probability betting to general Bayesian optimization with uncertain probabilities and uncertain return distributions.

Each module is self-contained: optimization routines, Monte Carlo tools, plotting utilities, and examples are implemented within each .jl file—no external utility files are required.

Software

References

Image credits


Appendix

What is $\phi$?

In the equation

$$\phi = 1 + \frac{\sigma_R^2}{\mu_R^2} = e^{\sigma^2},$$

the symbol $\phi$ is just a temporary variable used to simplify the algebra. It represents the multiplicative ratio

$$\phi = \frac{\mathbb{E}[R^2]}{\mathbb{E}[R]^2}.$$

For a log-normal distribution, this ratio always equals $e^{\sigma^2}$.

Derivation from log-normal moments

Let $R$ be log-normal:

$$R = e^X, \qquad X \sim N(\mu, \sigma^2).$$

1. Compute the first and second moments

$$\begin{aligned} \mathbb{E}[R] &= e^{\mu + \frac{1}{2}\sigma^2} \\ \mathbb{E}[R^2] &= e^{2\mu + 2\sigma^2} \end{aligned}$$

2. Form the ratio

$$\frac{\mathbb{E}[R^2]}{\mathbb{E}[R]^2} = \frac{e^{2\mu + 2\sigma^2}}{e^{2(\mu + \sigma^2/2)}}.$$

Simplify the exponent by taking the difference:

$$(2\mu + 2\sigma^2) - (2\mu + \sigma^2) = \sigma^2.$$

Thus

$$\frac{\mathbb{E}[R^2]}{\mathbb{E}[R]^2} = e^{\sigma^2}.$$

So we define this ratio:

$$\phi := \frac{\mathbb{E}[R^2]}{\mathbb{E}[R]^2} = e^{\sigma^2}.$$

Connect $\phi$ to mean and variance

Variance identity:

$$\mathrm{Var}(R) = \mathbb{E}[R^2] - \mathbb{E}[R]^2.$$

Solve for $\mathbb{E}[R^2]$:

$$\mathbb{E}[R^2] = \mathrm{Var}(R) + \mathbb{E}[R]^2.$$

Divide by $\mathbb{E}[R]^2$:

$$\frac{\mathbb{E}[R^2]}{\mathbb{E}[R]^2} = \frac{\mathrm{Var}(R)}{\mathbb{E}[R]^2} + 1.$$

But the left side is $\phi$. Therefore:

$$\phi = 1 + \frac{\sigma_R^2}{\mu_R^2}.$$

So we have both expressions:

$$\phi = 1 + \frac{\sigma_R^2}{\mu_R^2} = \frac{\mathbb{E}[R^2]}{\mathbb{E}[R]^2} = e^{\sigma^2}.$$
