Scaling Option Pricing with Ray

Published: October 1, 2023

Introduction

Ray is a framework that makes it simple to scale Python applications. The binomial model is a basic technique for option pricing. This post shows how Ray lets you take existing Python code that runs sequentially and transform it into a distributed application with minimal code changes. While the experiments here were all performed on the same machine, Ray also makes it easy to scale your Python code to clusters on every major cloud provider. Here, I demonstrate taking binomial pricing functions and distributing them across multiple CPU cores using Ray.

Basics of Binomial Option Pricing

The Binomial model calculates option prices using a tree-like structure. It works on a few assumptions:

  • At each step, the underlying price can move to only two values: an up-price and a down-price.
  • No dividends are paid during the option's lifetime.
  • A constant risk-free rate.
  • No transaction costs.
  • Investors are risk-neutral.

Four parameters (u, d, p, q) build the tree:

  • u and d: Up and down price factors.
  • p and q: Probabilities of price going up or down, with p+q=1.

Input Variables

The model requires the following inputs:

  • S: Underlying price
  • K: Strike price
  • T: Time to maturity
  • r: Risk-free interest rate
  • sigma: Volatility
  • N: Number of binomial steps
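Before looking at the full implementation, here is a minimal sketch of how the CRR tree parameters follow from these inputs (the sample values are illustrative; the formulas match the implementation below):

```python
import math

# Sample inputs (the defaults used later in this post)
S, K, T, r, sigma, N = 100, 110, 2.221918, 0.05, 0.30, 4

# Cox-Ross-Rubinstein parameterization: u = 1/d, and the risk-neutral
# up-probability pu makes the discounted expected price a martingale.
dt = T / N
u = math.exp(sigma * math.sqrt(dt))
d = math.exp(-sigma * math.sqrt(dt))
pu = (math.exp(r * dt) - d) / (u - d)
pd = 1 - pu

print(f"u={u:.4f} d={d:.4f} pu={pu:.4f} pd={pd:.4f}")
```

Note that u*d = 1, so the tree recombines: an up-move followed by a down-move returns the price to where it started.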

Python Implementation

The code uses Python libraries such as NumPy and Matplotlib. It also employs Ray for distributed computing. Two functions define the Cox-Ross-Rubinstein and Jarrow-Rudd methods:

# Standard Python libraries
import math
import os
import time
from typing import Literal

import numpy as np
import matplotlib.pyplot as plt
import ray
 
def Cox_Ross_Rubinstein_Tree(S, K, T, r, sigma, N, Option_type):

    # Underlying price (per share): S
    # Strike price of the option (per share): K
    # Time to maturity (years): T
    # Continuously compounding risk-free interest rate: r
    # Volatility: sigma
    # Number of binomial steps: N

    # u: factor by which the price rises (assuming it rises)
    # d: factor by which the price falls (assuming it falls)
    # pu: probability of a price rise
    # pd: probability of a price fall
    # disc: one-step discount factor

    u = math.exp(sigma * math.sqrt(T / N))
    d = math.exp(-sigma * math.sqrt(T / N))
    pu = (math.exp(r * T / N) - d) / (u - d)
    pd = 1 - pu
    disc = math.exp(-r * T / N)

    St = [0] * (N + 1)
    C = [0] * (N + 1)

    # Terminal stock prices, from the lowest node upward
    St[0] = S * d ** N
    for j in range(1, N + 1):
        St[j] = St[j - 1] * u / d

    # Option payoff at every terminal node, including the lowest (j = 0)
    for j in range(N + 1):
        if Option_type == 'P':
            C[j] = max(K - St[j], 0)
        elif Option_type == 'C':
            C[j] = max(St[j] - K, 0)

    # Backward induction through the tree
    for i in range(N, 0, -1):
        for j in range(i):
            C[j] = disc * (pu * C[j + 1] + pd * C[j])

    return C[0]
 
def Jarrow_Rudd_Tree(S, K, T, r, sigma, N, Option_type):

    # Underlying price (per share): S
    # Strike price of the option (per share): K
    # Time to maturity (years): T
    # Continuously compounding risk-free interest rate: r
    # Volatility: sigma
    # Number of binomial steps: N

    # u: factor by which the price rises (assuming it rises)
    # d: factor by which the price falls (assuming it falls)
    # pu: probability of a price rise
    # pd: probability of a price fall
    # disc: one-step discount factor

    u = math.exp((r - (sigma ** 2 / 2)) * T / N + sigma * math.sqrt(T / N))
    d = math.exp((r - (sigma ** 2 / 2)) * T / N - sigma * math.sqrt(T / N))
    pu = 0.5
    pd = 1 - pu
    disc = math.exp(-r * T / N)

    St = [0] * (N + 1)
    C = [0] * (N + 1)

    # Terminal stock prices, from the lowest node upward
    St[0] = S * d ** N
    for j in range(1, N + 1):
        St[j] = St[j - 1] * u / d

    # Option payoff at every terminal node, including the lowest (j = 0)
    for j in range(N + 1):
        if Option_type == 'P':
            C[j] = max(K - St[j], 0)
        elif Option_type == 'C':
            C[j] = max(St[j] - K, 0)

    # Backward induction through the tree
    for i in range(N, 0, -1):
        for j in range(i):
            C[j] = disc * (pu * C[j + 1] + pd * C[j])

    return C[0]
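As a quick sanity check, here is a condensed, self-contained version of the same CRR recursion (a sketch with illustrative inputs, not the exact functions above); the price settles down as N grows:

```python
import math

def crr_price(S, K, T, r, sigma, N, option_type='C'):
    """Compact CRR pricer: terminal payoffs, then backward induction."""
    dt = T / N
    u = math.exp(sigma * math.sqrt(dt))
    d = 1 / u
    pu = (math.exp(r * dt) - d) / (u - d)
    disc = math.exp(-r * dt)
    # Payoffs at the N+1 terminal nodes
    C = []
    for j in range(N + 1):
        ST = S * u ** j * d ** (N - j)
        C.append(max(ST - K, 0) if option_type == 'C' else max(K - ST, 0))
    # Step back through the tree one level at a time
    for i in range(N, 0, -1):
        C = [disc * (pu * C[j + 1] + (1 - pu) * C[j]) for j in range(i)]
    return C[0]

for N in (10, 100, 1000):
    print(N, round(crr_price(100, 110, 2.0, 0.05, 0.30, N, 'C'), 4))
```

Because calls and puts are priced on the same tree under the same risk-neutral measure, the results also satisfy put-call parity, which is a useful correctness check.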

Local vs. Distributed Performance

The Ray library allows you to parallelize the code with minimal changes. It provides features like:

  • Running on multiple machines.
  • Microservices and actors.
  • Handling machine failures.
  • Efficient data handling.

You can compare local and remote performance with Ray. The local version runs on a single core, while ray.init() with no arguments sizes its worker pool from the machine's logical CPU count, as reported by os.cpu_count().
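As a quick sketch, you can inspect the CPU count Ray will use by default before starting it (exact resource accounting is up to Ray):

```python
import os

# ray.init() with no arguments sizes its worker pool from the
# machine's logical CPU count, which you can inspect directly:
print(os.cpu_count())
```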

Local

def generate_options_price_steps(
    S=100,
    K=110,
    T=2.221918,
    r=5,
    sigma=30,
    PC: Literal["C", "P"] = 'C'
):
    '''
    Generate data to plot call/put option prices over a range of step counts.
    Takes in:
        Underlying price (per share): S;
        Strike price of the option (per share): K;
        Time to maturity (years): T;
        Continuously compounding risk-free interest rate (%): r;
        Volatility (%): sigma;
        Put or Call: PC
    The number of binomial steps N is varied from 50 to 4950 in increments of 50.
    '''
    r_float = r / 100
    sigma_float = sigma / 100

    ## call/put option with different steps
    runs = list(range(50, 5000, 50))
    CRR = [Cox_Ross_Rubinstein_Tree(S, K, T, r_float, sigma_float, i, PC) for i in runs]
    JR = [Jarrow_Rudd_Tree(S, K, T, r_float, sigma_float, i, PC) for i in runs]

    return runs, CRR, JR
 
def run_local(PC: Literal["C", "P"] = 'C'):
    ## call/put option with different steps
    start_time = time.time()
    runs, CRR, JR = generate_options_price_steps(PC=PC)
    plt.plot(runs, CRR, label='Cox_Ross_Rubinstein')
    plt.plot(runs, JR, label='Jarrow_Rudd')
    plt.title(f'{PC} option with different steps')
    plt.legend(loc='upper right')
    plt.show()
    duration = time.time() - start_time
    print(f'Local execution time: \033[96m{duration}\033[0m')
run_local("P")
# [figure: Sequential Put]
# Local execution time: 131.54762029647827
run_local("C")
# [figure: Sequential Call]
# Local execution time: 128.2617244720459

Parallelize and distribute with Ray

Using Ray, you can take Python code that runs sequentially and transform it into a distributed application with minimal code changes. Parallel and distributed computing are a staple of modern applications. The problem is that parallelizing or distributing existing Python code can mean rewriting it, sometimes from scratch. Additionally, modern applications have requirements that existing modules like multiprocessing lack. These requirements include:

  • Running the same code on more than one machine
  • Building microservices and actors that have state and can communicate
  • Graceful handling of machine failures and preemption
  • Efficient handling of large objects and numerical data

The Ray library satisfies these requirements and allows you to scale your applications without rewriting them. In order to make parallel & distributed computing simple, Ray takes functions and classes and translates them to the distributed setting as tasks and actors. Compare how long it takes to generate the results both locally and in parallel here:

# Ray task
@ray.remote
def Cox_Ross_Rubinstein_Tree_distributed (S,K,T,r,sigma,N, Option_type):
    # Same Code as before
    return C[0]
 
# Ray task
@ray.remote
def Jarrow_Rudd_Tree_distributed (S,K,T,r,sigma,N, Option_type):
    # Same Code as before
    return C[0]
 
def generate_options_price_steps_remote(
    S=100,
    K=110,
    T=2.221918,
    r=5,
    sigma=30,
    PC: Literal["C", "P"] = 'C'
):
    '''
    Generate data to plot call/put option prices over a range of step counts.
    Takes in:
        Underlying price (per share): S;
        Strike price of the option (per share): K;
        Time to maturity (years): T;
        Continuously compounding risk-free interest rate (%): r;
        Volatility (%): sigma;
        Put or Call: PC
    The number of binomial steps N is varied from 50 to 4950 in increments of 50.
    '''
    r_float = r / 100
    sigma_float = sigma / 100

    ## call/put option with different steps
    runs = list(range(50, 5000, 50))
    CRR = ray.get([Cox_Ross_Rubinstein_Tree_distributed.remote(S, K, T, r_float, sigma_float, i, PC) for i in runs])
    JR = ray.get([Jarrow_Rudd_Tree_distributed.remote(S, K, T, r_float, sigma_float, i, PC) for i in runs])

    return runs, CRR, JR
 
def run_remote(PC: Literal["C", "P"] = 'C'):
    ## call/put option with different steps
    start_time = time.time()
    # To explicitly stop or restart Ray, use the shutdown API
    ray.shutdown()
    # Starting Ray
    ray.init()
    runs, CRR, JR = generate_options_price_steps_remote(PC=PC)
    plt.plot(runs, CRR, label='Cox_Ross_Rubinstein')
    plt.plot(runs, JR, label='Jarrow_Rudd')
    plt.title(f'{PC} option with different steps')
    plt.legend(loc='upper right')
    plt.show()
    duration = time.time() - start_time
    print(f'Remote execution time: \033[96m{duration}\033[0m')
run_remote("P")
# [figure: Ray Put]
# Remote execution time: 31.533876657485962
run_remote("C")
# [figure: Ray Call]
# Remote execution time: 33.92897319793701

You can see the distributed version is roughly 4x faster (about 4.2x for the put and 3.8x for the call)! Ray makes it easy to turn your sequential Python code into a distributed application. To find out more, you can read my notes on Ray on my wiki.
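Plugging in the measured wall-clock times above gives the speedup directly:

```python
# Measured wall-clock times from the runs above (seconds)
local_put, remote_put = 131.5476, 31.5339
local_call, remote_call = 128.2617, 33.9290

print(round(local_put / remote_put, 2))   # put speedup -> 4.17
print(round(local_call / remote_call, 2)) # call speedup -> 3.78
```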
