Dynamically Pricing Hotel Rooms for Maximum Revenue

 

Introduction

dynamic-hotel-pricing

For a hotel chain, the strategy being used to price rooms can make or break the business. When compared to static pricing, pricing rooms dynamically — adjusting the price based on availability, demand, and other factors — can result in a lot more profit. But how do you identify the optimal price — and ensure that its implemented accordingly? 

In this post, I'll be using methodologies from "Dynamic room pricing model for hotel revenue management systems," a paper published by researchers at Cairo University. In their paper, the authors describe a hotel revenue management model based on dynamic pricing that can be easily incorporated by any business. I wasn't able to get the dataset used in the paper, so what you see below uses a mock version of the original dataset.

In addition, I'll be explaining how to put this dynamic pricing model into production using several features of the DataScience.com Platform, including: 

  • The ability to deploy a model as a REST API. In this case, the API is called from a web application as part of a web reservation/booking system.
  • A job scheduler that allows you to update occupancy forecast at a pre-determined frequency (e.g., daily)
  • Model monitoring capabilities that track the performance of the pricing model via a report shared with business stakeholders

Now let's get started. 

The Problem

Typically, hotels control the quantity of rooms offered at different rates. In revenue management terms, this is an example of a quantity-based revenue management decision — rooms are typically offered initially at the lowest rate, but as the number of bookings surpasses the number allocated to the cheapest "bucket," the rate changes to the value assigned to the second bucket.

Below is an example of what quantity-based decision looks like for a particular room segment (e.g., deluxe king rooms with city view). In this chart, the prices are fixed ahead of time, but the number of rooms assigned to different rates could change depending on market conditions, time, etc. Once the second bucket is fully booked, the rate is bumped up to the third bucket, and so forth. This process continues until all the rooms have been booked. A similar approach is also adopted by airline companies to sell seats.

For a hotel, finding the rates that will maximize revenue constitutes a large optimization problem where the number of states to explore can be quite large. For example, a hotel can have multiple room segments and different lengths of stay (one night, two nights, one week, etc.) for which optimal prices need to be found over a long forecast horizon (e.g., 365 days). Such a scenario would result in a very large number of states to compute. If rate updates are required multiple times per day, this optimization problem becomes computationally challenging. 

To overcome this possible computational problem, hotels have used various simplifications. In "Dynamic room pricing model for hotel revenue management systems," the authors used a price-based decision approach, which I have also used for the purposes of this post. In other words, I pre-assigned a number of rooms to each bucket, but bucket prices will vary over time.

Mathematical Formulation

At its core, a revenue maximization exercise is an example of a constrained optimization problem. Let's lay out the mathematical framework first. I want to maximize the following objective function: 

$$ \sum_{i=1}^{90} p_i \times o_i  $$

where i is the night index, pi is the price of the room segment of interest for night i, and oi is the projected occupancy (demand) for night i at price pi. Here, the sum is over a quarter (90 nights). We assume that the occupancy (demand) function is subject to the following relationship with the price:

$$ o = o_{\rm{nominal}} \times \bigg( \frac{p}{p_{\rm{nominal}}} \bigg)^{e} $$

where onominal corresponds to the forecasted occupancy for a given night based on rate pnominal. In this example, the base rate is $120. The value of e (elasticity) has taken a value of e = − 2 (based on Aziz's paper). In other words, p increases by 10%, and demand/occupancy decreases by roughly 20%.

Let's say the hotel has a total of 80 rooms in the room segment of interest. The objective function listed above is subject to the following constraints:

$$o_i \leq 80~~\forall~i$$

and

$$p_i \geq 0~~\forall_i \; .$$

In other words, the occupancy can't exceed the number of rooms available (80) and the price is strictly positive.

In principle, an optimal price per night can be set ahead of time given the forecasted occupancy/demand for that night. However, this wouldn't be fully dynamic pricing: To get an update on the price, you would have to re-run the forecast and update the values of oi based on new data. This is typically done on a daily/weekly basis. 

To make this truly dynamic, I manipulated the capacity of the hotel and partitioned it into four overlapping segments. For each level of capacity, I solved the above objective function to get a set of four prices. In other words, each price level is set based on the current occupancy level in the hotel. The price will be low and it will increase as long as the number of available rooms in the hotel decreases.

To find the optimal price, you simply have to find the number of rooms still available in the reservation database and the corresponding price segment based on the calculation above. As the number of reservations increase for a given night, the number of rooms available decrease and prices rise.

A Simple Contextual Example

Imagine this scenario: The Plaza Hotel has a very simplistic rate schedule for their 80 junior suite with city view rooms. After a few months on the job, an analyst on the revenue management team notices that the fixed pricing scheme ($120/night during the high season) is a suboptimal scenario for maximizing revenue. These are the trends the newly hired analyst highlights:

  • The hotel experiences relatively low occupancy in the middle of the week that could potentially be fixed by lowering the price, which could increase revenue overall.
  • On weekends (Friday and Saturday nights) the situation is reversed. The hotel declines reservations because there are no vacancies. Increasing the price beyond $120 may increase revenue while still maximizing occupancy.

The situation described above is a perfect application of a revenue/manage dynamic pricing framework. Many such frameworks (xotels, IDeaS, etc.) are used by hotels, although in some of these tools, the price levels are set manually by analysts. To circumvent the process of setting somewhat arbitrary price levels, we'll adopt the methodology described above. 

Limitations of Our Approach

What you'll find below is a simple model for dynamically pricing hotel rooms. Consequently, there are several limitations to this simple approach. Here are the most important ones:

  • I consider only a single length of stay (a single night).
  • I do not account for group reservations or cancellations.
  • I assume a monopolistic situation where competitors do not matter.
  • I assume that the pricing does not vary by distribution channel (e.g., Orbitz, Hotels.com, trivago, Hotwire, etc.).
  • I consider a single class of rooms (junior suites with city views). All of these rooms are deemed to be equivalent.
  • I assume that the user has a forecast of demand for the next 90 days, based on historical data. In principle, this forecast should be updated on a daily basis with new booking information. 
  • I also assume that price elasticity is known. I adopted a value of -2, which is reasonable for the industry. 

In the cell below, I import a series of libraries that I need for my analysis.

import pandas as pd 
import numpy as np 
import scipy 

import math as mt 

import pickle as pkl 
import os 
import psycopg2
import matplotlib 
import matplotlib.pyplot as plt 
from numpy.random import normal
import calendar 
from scipy.optimize import curve_fit

%matplotlib inline 
plt.rcParams['figure.figsize'] = (16,8)
import warnings
warnings.filterwarnings('ignore')

#import plotly 
import plotly.plotly as py 
import plotly.graph_objs as go 
import plotly.figure_factory as ff
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
init_notebook_mode(connected=True)

Occupancy Forecasting for the Next 90 Days

In the cells below, I will generate a forecast of the junior suites with city views based on the numbers provided for the Plaza Hotel. There are 80 junior suites. The forecast shows a weekly period with a mild trend upward indicating the start of the high touristic season, as well as peaks on the weekends. The green line shows the capacity of that particular room class.

Note that the forecast below is based on the assumption that each room is priced $120 per night, which is typically what the Plaza Hotel charges for that room class. The forecast is based on historical data and should be updated frequently based on new reservation and cancellation data.

Based on this baseline forecast at a price of $120 per night, my goal is to find optimal prices to maximize revenue.

# Let's define a function that will simulate an occupancy level (demand) 
# for the next "nb_days" days. 

def simulate_demand(nb_days, period=7.0, dc_level=35.0, noise_level=5.0, 
                    amplitude=3.0, phase=2.0*mt.pi/7.0, trend=0.06):
    """Simulate a demand curve over a forecast period. 
    
    Parameters 
    ----------
    
    nb_days (integer):
       Length of the forecast period (in days)

    period (float): 
       Length of the seasonal component. Typically 7 days for weekly 
       fluctuations. 
       
    dc_level (float): 
       Baseline occupancy level 
       
    noise_level (float): 
       Noise level (additive) of the occupancy numbers. This number 
       corresponds to the standard deviation of the normal distribution. 
       
    amplitude (float): 
       Amplitude of the periodic signal 
       
    phase (float): 
       phase shift of the signal (in radians) 
    
    trend (float): 
       slope of the linear trend term. In units of rooms/day. Set to 0 if you 
       do not want to include a linear trend. 
       
    Returns 
    -------
    A numpy.ndarray (1D) containing the forecast numbers.
    """
    
    days = np.linspace(1,nb_days,nb_days)
    
    # noise level: 
    noise = noise_level * normal(size=len(days))
    
    # Demand signal: 
    demand = dc_level + trend * days + amplitude * \
             np.cos(2.0 * mt.pi * days / period + phase) + noise 
    
    return demand
# Let's simulate the occupancy for a period of 90 days: 

date_start = '2017-07-31'
date_end = '2017-10-29'
nb_days = (pd.to_datetime(date_end) - pd.to_datetime(date_start)).days

forecasted_demand = simulate_demand(nb_days, dc_level=68, 
                                    noise_level=1.0, amplitude=5)

forecasted_demand = map(round, forecasted_demand)

forecasted_demand = pd.DataFrame(forecasted_demand, 
                                 index=pd.date_range(start=date_start,
                                                     end=date_end,
                                                     closed='left'), 
                                 columns=['occupancy'])

Below, I show the outcome of a simple occupancy forecast simulation with plotly. 

# Let's plot the occupancy forecast using plotly: 

occupancy = [go.Scatter(x=forecasted_demand.index, y=forecasted_demand['occupancy'], 
                        name='Forecasted Junior Suites with City View Occupancy for the next 90 days')]

layout_occ = go.Layout(title='Forecasted Junior Suites with City View Occupancy -- Each room is priced $120/night', 
                       xaxis={'title':'Day'}, 
                       yaxis={'title':'Number of Rooms Occupied'}, 
                       shapes=[{'type':'line', 
                                'x0':'2017-07-31', 
                                'x1':'2017-10-31',
                                'y0':80.0, 'y1':80.0,
                                'line': {
                                    'color': 'rgb(50, 171, 96)',
                                    'width': 4, 'dash':'dashdot'}
                               }]
                      )

fig = go.Figure(data=occupancy, layout=layout_occ)
iplot(fig, filename='occ_ts')

The next step is to assume some relationship between price and demand. I have adopted the relationship described in the Mathematical Formulation of this piece and set the elasticity between price and demand to e = − 2.

# Let's assume a demand price elasticity function:  

def demand_price_elasticity(price, nominal_demand, elasticity=-2.0, nominal_price=120.0): 
    """Returns demand given a value for the elasticity, nominal demand and nominal price.
    
    Parameters
    ----------
    
    price (numpy.ndarray):
        one-dimensional price array. The length of that array should correspond to the 
        length of the forecast period. 
    
    nominal_demand (numpy.ndarray): 
        one-dimensional forecasted occupancy array. The length of that array should 
        correspond to the length of the forecast period. 
    
    elasticity (float): 
        value of the elasticity between price and demand. A value of e=-2 is reasonable. 
    
    nominal_price (float): 
        room rate for which the forecast was computed.
    
    Returns
    -------
    
    A numpy.ndarray of expected demand. 
    """
    
    return nominal_demand * ( price / nominal_price ) ** (elasticity)

Finally, l'll bring the forecasted demand and demand elasticity together to define the constrained optimization problem. This will allow me to identify the optimal set of prices for different capacity levels. To minimize the objective function, I will use the scipy.optimize optimization library. Other libraries that can perform optimization tasks in Python include pyOpt and CVXOPT, to name a few.

import scipy.optimize as optimize
# definition of the objective function: 

def objective(p_t, nominal_demand=np.array([50,40,30,20]), 
              elasticity=-2.0, nominal_price=120.0): 
    """ 
    Definition of the objective function. This is the function that want to minimize. 
    (minus sign in front)
    
    Parameters 
    ----------

    p_t (numpy.ndarray):
        one-dimensional price array. The length of that array should correspond to the 
        length of the forecast period.
        
    nominal_demand (numpy.ndarray): 
        one-dimensional forecasted occupancy array. The length of that array should 
        correspond to the length of the forecast period. 
        
    elasticity (float): 
        value of the elasticity between price and demand. A value of e=-2 is 
        reasonable. 

    nominal price (float): 
        room rate for which the forecast was computed.

    Returns
    -------

    Value of the objective function (float). 
    
    Note: here we're trying to minimize the objective function. That's where the 
    minus sign comes_in.    
    
    """
    
    return (-1.0 * np.sum( p_t * demand_price_elasticity(p_t, nominal_demand=nominal_demand,
                                                        elasticity=elasticity, 
                                                        nominal_price=nominal_price) )) / 100

In the cell below, I explicitly list all the constraints the objective function is subject to. In this case, two constraints are defined:

  • Prices have strictly positive values
  • Demand can't exceed capacity
# Constraints: 

def constraint_1(p_t):
    """ This constraint ensures that the prices are positive. 
    """
    return p_t


def constraint_2(p_t, capacity=20, forecasted_demand=35.0, 
                 elasticity=-2.0, nominal_price=120.0):
    """ This constraint ensures that the demand does not exceed 
    capacity. 
    
    Parameters 
    ----------
    
    p_t (float): 
        Room price 
    
    capacity (integer): 
        Capacity of the hotel (in rooms). 
        
    forecasted_demand (float): 
        Forecasted demand (in rooms) for that night 
    
    elasticity (float): 
        slope of the 
        
    nominal_price (float): 
        The price for which the forecasted_demand was computed. 
          
    Returns
    -------
    Returns an array of excess capacity. 
    
    """
    return capacity - demand_price_elasticity(p_t, nominal_demand=forecasted_demand,
                                                        elasticity=elasticity, 
                                                        nominal_price=nominal_price)

I have defined the objective function and the constraints. Now, I'll perform the minimization over four overlapping capacity segments. To do this, I'll use the Sequential Least Squares Programming (SLSQP) minimization technique.

# Let's run the optimization algorithm over four overlapping segments 
# of 20, 40, 60, 80 room capacity. 

# We look at four capacity segments: 20, 40, 60, and 80 (full capacity)  
# rooms available. 
capacities = [20.0, 40.0, 60.0, 80.0]  

optimization_results = {}
for capacity in capacities: 

    # Nominal price associated with forecasted demand:  
    nominal_price = 120.0                 
    # Forecasted demand: 
    nominal_demand = forecasted_demand['occupancy'].values    
    # Assumed price elasticity: 
    elasticity = -2.0

    # Starting values: 
    p_start = 125.0 * np.ones(len(nominal_demand))

    # bounds on the prices. Let's stick with reasonable values. 
    # One could be more sophisticated here and apply constraints  
    # that limit the prices to be in range of what competitors 
    # are charging, for example. 
    bounds = tuple((10.0, 400.0) for p in p_start)

    # Constraints: 
    constraints = ({'type': 'ineq', 'fun':  lambda x:  constraint_1(x)},
               {'type': 'ineq', 'fun':  lambda x, capacity=capacity, 
                                           forecasted_demand=nominal_demand, 
                                           elasticity=elasticity,
                                           nominal_price=nominal_price: constraint_2(x,capacity=capacity,
                                                                                     forecasted_demand=nominal_demand,
                                                                                     elasticity=elasticity,
                                                                                     nominal_price=nominal_price)})
    
    opt_results = optimize.minimize(objective, p_start, args=(nominal_demand, 
                                                              elasticity, 
                                                              nominal_price),
                                    method='SLSQP', bounds=bounds,  
                                    constraints=constraints)
    
    optimization_results[capacity] = opt_results
# Plotting the resulting rates vs dates. 

time_array = np.linspace(1,len(nominal_demand),len(nominal_demand))
rate_df = pd.DataFrame(index=time_array)

for capacity in optimization_results.keys(): 
    rate_df = pd.concat([rate_df, 
                         pd.DataFrame(optimization_results[capacity]['x'],
                                      columns=['{}'.format(capacity)], 
                                      index=time_array)],
                        axis=1)

rate_df.index.name = 'Day'
datelist = pd.date_range(start=date_start, end=date_end, closed='left').tolist()
rate_df.index = [ x.date() for x in datelist]

In the table below, I show an example of the rates as a function of the number of rooms still available for booking. For example, column Capacity left: 80.0 corresponds to the case where there are 80 rooms left to book (no reservations that night yet) while the column Capacity left: 20.0 represents the case where there are only 20 rooms left to book. As you can see, the prices go up as capacity goes down — which is what we intuitively expect.

The table below shows the prices (in dollars) per night and per capacity levels. I also show the results in a chart highlighting fluctuations in price as a function of remaining capacity and day of week.

# Generate a pretty table for display purposes. 

rate_df_to_show = rate_df.copy()

# Renaming the columns:
rate_df_to_show = rate_df_to_show[np.sort(np.asarray(rate_df_to_show.columns))]
rate_df_to_show.columns = ['Capacity left : {}'.format(x) for x in rate_df_to_show.columns]

# Rounding the numbers: 
for col in rate_df_to_show.columns:
    rate_df_to_show[col] = rate_df_to_show[col].apply(lambda x: round(x,2))

dow_map = { 6:'Sun', 0:'Mon', 1:'Tue', 2:'Wed', 3:'Thu', 4:'Fri', 5:'Sat'}
rate_df_to_show['date'] = rate_df_to_show.index 
rate_df_to_show['dow'] = rate_df_to_show['date'].apply(lambda x: dow_map[x.weekday()])
rate_df_to_show['date'] = rate_df_to_show.apply(lambda row: row['dow']+" "+str(row['date']), 
                                                axis=1)
rate_df_to_show.index = rate_df_to_show['date'].values 
rate_df_to_show.drop(['date','dow'],axis=1,inplace=True)
rate_df_to_show.head(10)
  Capacity left: 20.0 Capacity left: 40.0 Capacity left: 60.0 Capacity left: 80.0
Mon 2017-07-31 216.33 152.97 124.90 108.17
Tue 2017-08-01 214.66 151.79 123.94 107.33
Wed 2017-08-02 214.66 151.79 123.94 107.33
Thu 2017-08-03 222.89 157.61 128.69 111.45
Fri 2017-08-04 230.82 163.22 133.27 115.41
Sat 2017-08-05 229.26 162.11 132.36 114.63
Sun 2017-08-06 227.68 161.00 131.45 113.84
Mon 2017-08-07 216.33 152.97 124.90 108.17
Tue 2017-08-08 216.33 152.97 124.90 108.17
Wed 2017-08-09 214.66 151.79 123.94 107.33
# Plotting the room rate time series. 
# Let's focus on a single week cycle. 

price_levels = [go.Scatter(x=rate_df_to_show.head(7).index, 
                           y=rate_df_to_show.head(7)['Capacity left : 20.0'],
                           name='Capacity Remaining : 20 rooms'),
                go.Scatter(x=rate_df_to_show.head(7).index, 
                           y=rate_df_to_show.head(7)['Capacity left : 40.0'],
                           name='Capacity Remaining : 40 rooms'),
                go.Scatter(x=rate_df_to_show.head(7).index, 
                           y=rate_df_to_show.head(7)['Capacity left : 60.0'],
                           name='Capacity Remaining : 60 rooms'),
                go.Scatter(x=rate_df_to_show.head(7).index, 
                           y=rate_df_to_show.head(7)['Capacity left : 80.0'],
                           name='Capacity Remaining : 80 rooms')]

layout_prices = go.Layout(title='Rate vs Reservation Date and Current Capacity Levels', 
                       xaxis={'title':'Day'}, yaxis={'title':'Rate ($)'})

fig = go.Figure(data=price_levels, layout=layout_prices)
iplot(fig, filename='price_levels_ts')
# Save rate dataframe to local folder: 

rate_df.to_csv('data/rates.csv')

In order for this dynamic pricing system to work, you have to check the number of reservations made for a particular night and assign the correct price based on the four price buckets I outlined above.

This is the purpose of the rate_query() function that I will deploy as a REST API on the DataScience.com Platform. What this function does is query the number of rooms booked in the central booking database of the hotel and assign the prices accordingly.

# deploy a model. Given a date and the capacity for that date, 
# returns the rate. 
# Reads current capacity from the heroku reservations db 
# Read current rates from a csv file stored locally. 

def rate_query(arrival_date, departure_date): 
    """Given an arrival and a departure dates, this function 
    will look up the current number of reservations for those dates 
    and return the optimal price according to the rate_df dataframe. 
    
    Parameters
    ----------
    
    arrival_date (string): 
        format YYYY-MM-DD 
    
    departure_date (string): 
        format YYYY-MM-DD 
    
    Returns
    -------
    
    A list containing the room rates for each night. 
    """
    
    # connection to the reservations database. 
    # In this case, we used a postgres DB hosted on heroku. 
    # The DataScience.com Platform allows you to easily store 
    # your access credentials as environment variables. You never 
    # have to copy and paste credentials directly in notebook! 
    conn = psycopg2.connect(database='my_db', 
                       port=os.environ['HOTEL_BOOKINGS_DB_PORT'],
                       password=os.environ['HOTEL_BOOKINGS_DB_PASS'],
                       user=os.environ['HOTEL_BOOKINGS_DB_USER'],
                       host=os.environ['HOTEL_BOOKINGS_DB_HOST'])
    
    current_bookings = pd.read_sql("SELECT * from bookings where date>=\'{}\' and date <\'{}\' ".format(arrival_date, 
                                                                                                        departure_date),conn)
    current_bookings.index = pd.to_datetime(current_bookings['date'])
    current_bookings.drop(['date'], axis=1, inplace=True)

    # Read the rate dataframe: 
    rate_df = pd.read_csv('data/rates.csv',header=0, index_col=0)
    
    # Check for no availability on any of those nights: 
    no_avail = current_bookings[current_bookings['rooms_available'] < 1.0 ]
    if len(no_avail) > 0 :
        raise ValueError("No Room available on {}".format(no_avail.index))
    
    capacity_values = rate_df.columns 
    capacity_values = np.sort(capacity_values)[::-1]    
    
    rates = []
    # Look over each date in current_bookings: 
    for book_date in current_bookings.index:
        id = 0 
        current_capacity = current_bookings.loc[book_date, 'rooms_available']
        while id <= len(capacity_values)-1 and current_capacity <= float(capacity_values[id]):
            tmp = rate_df.loc[str(book_date.date()), capacity_values[id]]
            id+=1
        rates.append(tmp)

    return [ round(rate,2) for rate in rates ]

Finally, a web reservation/booking system can make HTTP requests to the dynamic pricing API and fetch updated prices. You can deploy multiple versions of the algorithm, which allows for A/B testing or multi-arm bandit tests of your pricing algorithm. Below is a snapshot of the web app, along with an example request made to the API.

In addition, we can schedule a job in the platform that updates the hotel's occupancy forecast on a daily basis based on new reservation and cancellation data.

Conclusion

While this is a very simplified approach to dynamically pricing hotel rooms, I hope it provides you with a clear introduction to how powerful data science can be for a business like a hotel. 

Of course, dynamic pricing is only powerful if pricing information gets to the right people. The DataScience.com Platform can be used to implement, deploy, and update a dynamic pricing model, so that analysts don't have to manually update prices in different tools.

Want to see the platform in action? Request a demo.