row | cust_id | scen_id | age | income | is_business | seat_browsed | price_f1 | stops_f1 | price_f2 | stops_f2 | choice |
---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | 25-34 | high | 0 | econ | 177.0 | 1 | 233.0 | 2 | 1 |
1 | 0 | 1 | 25-34 | high | 0 | econ | 191.0 | 2 | 198.0 | 1 | 0 |
2 | 0 | 2 | 25-34 | high | 0 | econ | 194.0 | 2 | 189.0 | 2 | 0 |
3 | 1 | 0 | 25-34 | low | 1 | econ | 169.0 | 2 | 255.0 | 2 | 0 |
4 | 1 | 1 | 25-34 | low | 1 | econ | 201.0 | 0 | 228.0 | 1 | 1 |
Abstract
Dynamic pricing has been the dominant approach in airline revenue management systems for decades. This method has continually improved with advances in new algorithms and better computing power. However, even with these improvements, the general pricing system remained rooted in price discrimination principles. Recently, Large Language Models (LLMs) have opened new avenues for many technologies. Their underlying structures and capabilities provide an opportunity for a new pricing paradigm to enter the airline industry: fully personalized pricing. In this post, we review the history and current approaches to airline pricing systems, from economic theory to the algorithms generally utilized. We then present information on Large Market Models (LMMs), the new approach companies are leveraging to provide completely personalized prices to airline consumers. We conclude with a basic prototype of personalized pricing using a dynamic hierarchical Bayesian mixed logit model.
Introduction
In the early history of commercial aviation, airlines were governed by the Civil Aeronautics Board (CAB). The CAB not only set prices and established routes but also oversaw various other operational aspects for these airlines. However, this changed dramatically in 1978 with airline deregulation [1]. From that point onward, airlines gained the autonomy to determine their own pricing strategies, establish their own routes, and offer distinct services to differentiate themselves within the industry.
This new freedom, however, brought the challenge of achieving profitability at each airline. The airline business operates within a high fixed cost, low marginal cost industry. A significant portion of airline expenses is incurred through the acquisition and maintenance of aircraft, personnel, fuel, and other operational overhead. In contrast, the marginal cost of accommodating an additional passenger on a given flight is relatively low. Thus, achieving full or near-full flights became the primary driver of airline profitability.
Like all businesses, airlines sought to move beyond simply selling available seats. They wanted to maximize total revenue (and thus profit) from each flight. To achieve this, airlines recognized the critical need to accurately estimate a consumer’s willingness to pay (WTP) [2]. Consequently, a sole focus on cost-based pricing was insufficient. Instead, a transition to market-based pricing [3] became imperative.
First Methods: Yield Management
This market-based pricing approach became known as yield management (a term coined by Robert Crandall at American Airlines). Yield management is a systematic approach focused on identifying distinct customer segments, setting different prices for these segments, and updating these prices regularly. Essentially, yield management aims to estimate a consumer’s WTP and set prices that capture as much of that value as possible.
Airlines recognized that each customer had a unique WTP, but given the computational limitations of the time, calculating individual WTP was impractical. Instead, general market approaches such as price discrimination were employed [4]. Rather than offering every seat at a uniform price, airlines could differentiate seats with varying perks and sell them at distinct price points, thereby segmenting consumers into different WTP ranges.
Beyond differentiating prices based on seating sections, airlines also began to strategically set prices according to supply and demand dynamics. With the increasing prevalence of computer systems, airlines gained the capacity to collect data on passenger booking patterns. For instance, they observed that leisure travelers typically booked tickets well in advance, while business travelers often purchased tickets with short notice. This behavioral insight was leveraged to develop tactics such as fare classes and fences [5].
The fare class system operates as follows: While seat prices are initially differentiated by section, fare classes further segment inventory by stipulating the order in which seats are sold and at what price points. For example, within the economy section, different fare classes (e.g., A, B, C, D) are established. Seats assigned to class A are priced the lowest, with progressively higher prices for classes B, C, and D. This hierarchical pricing structure continues until a given class is fully booked. This approach is designed to mimic supply-demand principles: once a lower-priced class (e.g., A) is sold out, subsequent bookings are directed to higher-priced classes (e.g., B), reflecting a decrease in the remaining supply for that particular section. Thus, fare classes allowed airlines to “dynamically” adjust prices based on the quantity of seats sold within each designated class.
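The bucket logic described above can be sketched in a few lines. This is a minimal illustration only: the class names follow the example in the text, while the prices and seat allocations are made-up assumptions rather than values from any airline.

```python
# Hypothetical fare ladder: (class name, price, seats allocated).
# All numbers are illustrative assumptions.
FARE_CLASSES = [("A", 120.0, 2), ("B", 160.0, 2), ("C", 210.0, 1), ("D", 280.0, 1)]


def quote_fare(seats_sold, fare_classes=FARE_CLASSES):
    """Return (class, price) for the next seat, or None once every class is full."""
    remaining = seats_sold
    for name, price, capacity in fare_classes:
        if remaining < capacity:
            return name, price
        remaining -= capacity  # this class is sold out, move up the ladder
    return None


# Quote the next seat after 0, 1, ..., 6 seats have been sold.
quotes = [quote_fare(n) for n in range(7)]
```

The quoted price only changes when a bucket sells out, which is the source of the discrete price steps between fare classes.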
Fences extend the fare class system by further refining price discrimination. These rules aim to retain higher-WTP customers within their designated fare classes while encouraging lower-WTP customers to “jump” fences and access more price-sensitive fares. For instance, airlines typically offer lower prices for early bookings to incentivize leisure travelers to commit in advance, whereas business travelers often pay a premium for bookings made closer to the travel date. Another common example is the minimum stay rule. Airlines frequently implement a Saturday-night stay requirement for round-trip tickets to differentiate business travelers from leisure travelers.
Modern Methods: Big Data
While the overall approach to pricing airfare remains deeply rooted in economic theory and operations research, airline revenue management systems have become significantly more sophisticated due to advances in data collection and predictive analytics.
A key development in this area, for example, is the application of multi-armed bandit problems to dynamic pricing. Papers such as A Modern Bayesian Look at the Multi-armed Bandit [6] leverage the power of multi-armed bandits to contextualize consumer behavior and optimize prices to encourage purchase behavior.
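To make the bandit idea concrete, here is a minimal sketch of Thompson sampling, one common Bayesian bandit heuristic, applied to a small grid of candidate prices. The price grid, the purchase probabilities, and the function name `thompson_pricing` are all assumptions for illustration; a production system would condition on far more context.

```python
import random


def thompson_pricing(prices, true_buy_prob, n_customers=5000, seed=0):
    """Beta-Bernoulli Thompson sampling over a fixed grid of candidate prices."""
    rng = random.Random(seed)
    wins = [1] * len(prices)    # Beta(1, 1) prior: 1 + purchases at each price
    losses = [1] * len(prices)  # 1 + non-purchases at each price
    offers = [0] * len(prices)
    for _ in range(n_customers):
        # Sample a plausible purchase rate for each price, then offer the
        # price with the highest sampled expected revenue.
        sampled = [p * rng.betavariate(w, l) for p, w, l in zip(prices, wins, losses)]
        arm = max(range(len(prices)), key=sampled.__getitem__)
        offers[arm] += 1
        # Simulated customer response (the true rates are hidden from the learner).
        if rng.random() < true_buy_prob[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return offers


prices = [150.0, 200.0, 250.0]
true_buy_prob = [0.50, 0.50, 0.15]  # assumed, for simulation only
offers = thompson_pricing(prices, true_buy_prob)
best_arm = max(range(len(prices)), key=offers.__getitem__)
```

Over time the sampler concentrates its offers on the price with the highest expected revenue while still occasionally testing the alternatives.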
Modern systems have also embraced other machine learning techniques to process a broader array of data points. Instead of relying solely on booking history and fare class availability, algorithms now incorporate factors such as web traffic patterns, historical search data for specific routes, time of day, and even weather forecasts. These models are designed to uncover complex, non-linear relationships between these variables and consumer demand, allowing for more nuanced and responsive pricing adjustments. The goal is to move beyond simple segmentation based on booking time and toward a more data-driven, continuous assessment of market conditions.
From Market Personalization to Consumer Personalization
Despite significant advances in machine learning and big data, airline revenue management systems remained limited to personalizing prices based on market segments and prevailing market conditions rather than on individual WTP. Consequently, airlines were still unable to capture the complete consumer surplus from each passenger. However, a new advancement in technology began to emerge in the late 2010s, one that would fundamentally transform personalization across all industries.
Large Language Models
In 2017, Google released the paper “Attention Is All You Need” by Vaswani et al. [7]. This paper introduced the revolutionary architecture of transformers to process sequences of data, which was a significant breakthrough in natural language processing (NLP). The transformer’s ability to handle large amounts of data in parallel was a key enabler for the development of much larger models.
From this foundational paper, research and development in the field accelerated. Companies such as OpenAI began developing their own Large Language Models (LLMs). The transformer architecture fundamentally changed the landscape of AI, and today many companies are looking to incorporate these powerful models into their services for personalization and advanced data processing.
Large Market Models
From a simplified point of view, the core objective of an LLM is to predict the next token in a sequence of input tokens. This fundamental principle of next-token prediction is now being generalized across various industries to address different problems. A prominent challenge being tackled is next market dynamic prediction: using sequential observations of a market’s behavior to forecast its subsequent state.
This new class of models aiming to predict market dynamics is called Large Market Models (LMMs) [8]. Just like LLMs, LMMs can process large amounts of data to ultimately predict optimal actions within a given market. In the context of airline revenue management, this optimal action would be the optimal price. However, this pricing approach fundamentally differs from current dynamic pricing strategies. Instead of the “hard jumps” between fare classes (as previously discussed), LMMs can generate a continuous function for pricing at the individual consumer level.
One company at the forefront of this research is Fetcherr. Founded in 2019 by Roy Cohen, Uri Yerushalmi, and Robby Nissan, the company believed that traditional business decision-making was not leveraging the full potential of modern AI and machine learning tools. Consequently, they developed a new revenue management system centered on the use of LMMs to revolutionize the airline industry. As their technology is proprietary, the specific implementation details cannot be discussed here. However, for those interested in a deeper understanding, a presentation on their approach is available in the video below.
Personalized Pricing Prototype
Due to hardware constraints and data availability, a prototype LMM cannot be presented in this post. However, to illustrate the core principles of personalized pricing, we instead present a hierarchical mixed logit model.
To achieve this, we take the perspective of an airline aiming to optimize pricing to maximize revenue per customer through personalized offers. For the purpose of simulation, we created a data-generating function that attempts to mimic the dynamics of different consumer segments purchasing airline tickets. Each customer is presented with two options: Flight 1 and Flight 2. Each flight has its own attributes (e.g., number of layovers, price), and each customer has their own attributes (e.g., age group, income level, business traveler status). The dataset includes three scenarios for each customer, who can choose between Flight 1, Flight 2, or no flight. A sample of this dataset is provided below.
In total, our dataset consists of one hundred unique customers each with three distinct scenarios. Our analysis continues with some simple EDA to better understand the synthetic data, starting with Figure 4.1.
Our data appears to have a higher proportion of middle-income customers, with an approximately even split between high- and low-income customers. There also seem to be more customers in the 25-34 age range than in the other categories.
Figure 4.2 shows the count of choices for flight 1 (0), flight 2 (1), or no flight (2). The graph shows that the general consumer selected flight 1 slightly more often than flight 2.
Figure 4.3 shows the average ticket price for flight 1 and flight 2, split by seat class. In both classes, flight 2 had a higher average ticket price than flight 1.
Figure 4.4 shows that economy and premium seats were purchased at about the same rate, roughly 60%.
Utility Theory
The theory for our modeling approach stems from microeconomics, namely utility theory [9]. As previously discussed, the primary objective for airlines is to capture as much consumer surplus as possible by accurately estimating a customer’s WTP. WTP itself is derived from the principle that individuals seek to maximize their utility from the consumption of a good or service. Therefore, if an airline’s service provides a consumer with utility, the goal is to set a price that is optimally aligned with the value of that utility, thereby maximizing the revenue captured.
In our simulation, given the discrete consumer choice between alternatives, we use the random utility model [10], as shown in Equation 4.1.
\[ U_{ijt} = V_{ijt} + \epsilon_{ijt} \] \[ \epsilon_{ijt} \overset{i.i.d.}{\sim} \text{Gumbel}(0,1) \tag{4.1}\]
\(U_{ijt}\) is the utility derived by the \(i\)th customer choosing the \(j\)th alternative in the \(t\)th scenario. This utility is composed of a deterministic portion \(V_{ijt}\) (our “observed” utility from purchased ticket data) and a stochastic term \(\epsilon_{ijt}\) that we assume follows a Gumbel distribution.
While we don’t observe an exact “utility” estimate in this data (or in any data for that matter), we postulate that utility is derived as a linear combination of observed features. This is illustrated in Equation 4.2.
\[ V_{ijt} = \text{ASC}_{flight1,i,t} I(\text{Flight 1}) + \text{ASC}_{flight2,i,t} I(\text{Flight 2}) + \beta_{i,price,t} Price_{jt} + \beta_{i,stops,t} Stops_{jt} + \beta_{i,seat,t} Seat_{jt} \tag{4.2}\]
Equation 4.2 illustrates that the observed utility, \(V_{ijt}\), is a function of several key attributes of both the alternatives and the customer. Specifically, the utility depends on the flight’s price, the number of stops, and the type of seat purchased.
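As a worked instance of Equation 4.2, the sketch below evaluates the observed utility for one customer in one scenario. The coefficient values are invented for illustration, the seat variable is assumed to be coded 0 for economy and 1 for premium, and the no-flight option is assumed to have all attributes set to zero (so its utility is zero). The prices and stops are taken from the first row of the sample data.

```python
def observed_utility(beta, alt):
    """Equation 4.2 for one customer, one scenario, and one alternative."""
    return (
        beta["asc_flight1"] * alt["is_flight1"]
        + beta["asc_flight2"] * alt["is_flight2"]
        + beta["price"] * alt["price"]
        + beta["stops"] * alt["stops"]
        + beta["seat"] * alt["seat"]
    )


# Assumed individual-level coefficients (not estimates from the post).
beta = {"asc_flight1": 2.0, "asc_flight2": 2.0, "price": -0.01, "stops": -0.3, "seat": 0.5}

# Alternatives from the first sample row: economy seat, 177.0 with 1 stop vs 233.0 with 2 stops.
flight1 = {"is_flight1": 1, "is_flight2": 0, "price": 177.0, "stops": 1, "seat": 0}
flight2 = {"is_flight1": 0, "is_flight2": 1, "price": 233.0, "stops": 2, "seat": 0}
no_flight = {"is_flight1": 0, "is_flight2": 0, "price": 0.0, "stops": 0, "seat": 0}

v_flight1 = observed_utility(beta, flight1)
v_flight2 = observed_utility(beta, flight2)
v_none = observed_utility(beta, no_flight)
```

Under these assumed coefficients, Flight 1 has the higher utility of the two flights because it is both cheaper and has fewer stops.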
The utility we derive in Equation 4.2 is then used to calculate the probability of choosing a given alternative in a given scenario. This is illustrated in Equation 4.3.
\[ P(Y_{ijt} = 1) = \frac{\exp(V_{ijt})}{\sum_{z=1}^{J}\exp(V_{izt})} \tag{4.3}\]
Essentially, we assume that the probability of a consumer’s observed choice is a function of the utility of a given alternative (\(V_{ijt}\)) relative to the utilities of all other available alternatives.
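The link between the Gumbel noise in Equation 4.1 and the logit probabilities in Equation 4.3 can be checked numerically. The sketch below computes the probabilities directly and compares them with the share of simulated draws in which each alternative has the highest total utility. The utility values are assumptions chosen for illustration.

```python
import math
import random


def choice_probabilities(v):
    """Equation 4.3, shifted by max(v) for numerical stability."""
    m = max(v)
    weights = [math.exp(x - m) for x in v]
    total = sum(weights)
    return [w / total for w in weights]


def gumbel(rng):
    """Draw from Gumbel(0, 1) by inverting its CDF exp(-exp(-x))."""
    u = rng.random()
    while u == 0.0:  # random() can return 0.0, and log(0) is undefined
        u = rng.random()
    return -math.log(-math.log(u))


def simulated_shares(v, n_draws=20000, seed=0):
    """Share of draws in which each alternative maximizes V + epsilon (Equation 4.1)."""
    rng = random.Random(seed)
    counts = [0] * len(v)
    for _ in range(n_draws):
        u = [v_j + gumbel(rng) for v_j in v]
        counts[u.index(max(u))] += 1
    return [c / n_draws for c in counts]


v = [1.0, 0.5, 0.0]  # assumed utilities: Flight 1, Flight 2, no flight
probs = choice_probabilities(v)
shares = simulated_shares(v)
```

With enough draws, `shares` should sit close to `probs`, which is why the Gumbel assumption yields the closed-form expression in Equation 4.3.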
Hierarchical Bayesian
As mentioned at the outset of this section, we are modeling this scenario hierarchically. This approach is based on the belief that individuals derive their preferences from a “global” distribution, which is then adjusted based on personal differences. To implement this, we model each \(\beta\) value as a draw from a global population distribution, as shown in Equation 4.4.
\[ \beta_{it} = \begin{pmatrix} \text{ASC}_{flight1,i,t} \\ \text{ASC}_{flight2,i,t} \\ \beta_{i,price,t} \\ \beta_{i,stops,t} \\ \beta_{i,seat,t} \end{pmatrix} \sim N(\mu_{t}, \Sigma) \tag{4.4}\]
Each customer \(i\) in scenario \(t\) has a vector \(\beta\) that is drawn from a multivariate normal distribution with mean \(\mu_{t}\) and covariance \(\Sigma\). \(\mu_{t}\) is our vector of population parameter estimates for the average preference for each attribute of a flight at time \(t\). \(\Sigma\) represents the unobserved heterogeneity around \(\mu_{t}\) (Note: we hold this value constant throughout to simplify modeling).
We model \(\mu_{t}\) as an AR(1) process, as shown in Equation 4.5.
\[ \mu_{kt} = \mu_{k,base} + \phi_{k}(\mu_{kt-1} - \mu_{k,base}) + \eta_{kt} \tag{4.5}\]
\(\mu_{kt}\) is the average preference for attribute \(k\) at time \(t\). It is a linear combination of three components: the base preference (\(\mu_{k,base}\)); the gap between the previous average preference and the base preference, scaled by the autoregressive coefficient \(\phi_{k}\) (how much the past mean influences the current mean); and the random shock \(\eta_{kt}\), which accounts for unobserved fluctuations in the average preference. (Note: we assume \(\eta_{kt}\) is drawn from \(N(0,q_{k})\). Interested readers can see the code for the prior on \(q_{k}\).)
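The mean-reverting behavior of Equation 4.5 is easiest to see by simulating it. In this sketch the parameter values are assumptions, and \(q_{k}\) is treated as a variance (the text does not say whether it is a variance or a standard deviation). Setting the shock variance to zero isolates the deterministic pull back toward the base preference.

```python
import random


def simulate_mu(mu_base, phi, q, mu_start, n_steps, seed=0):
    """Simulate Equation 4.5 for a single attribute k."""
    rng = random.Random(seed)
    path = [mu_start]
    for _ in range(n_steps):
        # eta ~ N(0, q), with q assumed to be a variance
        shock = rng.gauss(0.0, q ** 0.5) if q > 0 else 0.0
        path.append(mu_base + phi * (path[-1] - mu_base) + shock)
    return path


# Assumed values for a price coefficient that starts above its base level.
no_shock = simulate_mu(mu_base=-0.010, phi=0.5, q=0.0, mu_start=-0.002, n_steps=3)
noisy = simulate_mu(mu_base=-0.010, phi=0.5, q=1e-6, mu_start=-0.002, n_steps=3)
```

With \(\phi_{k} = 0.5\) and no shocks, the gap to the base preference halves each period, so values of \(\phi_{k}\) closer to 1 mean slower reversion.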
The \(\mu_{k,base}\) parameter is estimated from the baseline population hyperparameters, as shown in Equation 4.6.
\[ \mu_{k,base} = \alpha_{k} + \delta_{kc} \tag{4.6}\]
\(\alpha_{k}\) is the intercept for the baseline population mean for each attribute \(k\). This is then adjusted by \(\delta_{kc}\) according to customer demographics \(c\). Each \(\alpha_{k}\) and \(\delta_{kc}\) has a normal prior (interested readers can view the numpyro code for the specific values).
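The likelihood function based on all these parameters is modeled in Equation 4.7, where \(y_{ijt}\) equals 1 if customer \(i\) chose alternative \(j\) in scenario \(t\) and 0 otherwise, so that only the probability of the observed choice contributes to each term.
\[ L(Y|\beta, \mu, \Sigma, \phi, Q) = \prod_{i=1}^{N} \prod_{j=1}^{J} \prod_{t=1}^{T_{i}} P(Y_{ijt} = 1)^{y_{ijt}} \tag{4.7}\]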
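In practice the likelihood in Equation 4.7 is evaluated on the log scale, counting only the alternative that was actually chosen in each customer-scenario observation. The following is a minimal sketch of that calculation, not the numpyro implementation used in the post. The utilities and choices are invented, and `log_likelihood` is a hypothetical helper name.

```python
import math


def log_likelihood(utilities, choices):
    """Sum of log choice probabilities of the observed alternatives.

    utilities[n] lists V for every alternative in observation n (one customer
    in one scenario); choices[n] is the index of the alternative chosen.
    """
    total = 0.0
    for v, chosen in zip(utilities, choices):
        m = max(v)
        # log of the softmax denominator, computed stably
        log_denom = m + math.log(sum(math.exp(x - m) for x in v))
        total += v[chosen] - log_denom
    return total


# Two assumed observations with three alternatives each.
utilities = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
choices = [1, 0]
ll = log_likelihood(utilities, choices)
```

Working with sums of logs avoids the numerical underflow that a long product of probabilities would cause.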
Once these parameters are estimated, we can calculate individual WTP using Equation 4.8.
\[ \text{WTP}_{ikt} = -\frac{\beta_{ikt}}{\beta_{i,price,t}} \tag{4.8}\]
WTP is formally defined as the marginal rate of substitution between a non-monetary attribute and price. Within our logit framework, this is calculated as the ratio of an attribute’s coefficient to the absolute value of the price coefficient, which matches Equation 4.8 whenever the price coefficient is negative. A positive WTP indicates the amount a customer is willing to pay to gain an attribute, while a negative WTP represents the cost a customer is willing to incur to avoid an undesirable attribute.
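With posterior samples in hand, Equation 4.8 can be applied draw by draw and then summarized. The sketch below assumes paired draws of an attribute coefficient and the price coefficient for one customer. The draw values are invented, and `summarize` uses a simple order-statistic interval rather than any particular library routine.

```python
def wtp_draws(beta_attr, beta_price):
    """Equation 4.8 applied to paired posterior draws."""
    return [-b / p for b, p in zip(beta_attr, beta_price)]


def summarize(draws, lower=0.025, upper=0.975):
    """Mean and a simple equal-tailed interval based on order statistics."""
    ordered = sorted(draws)
    n = len(ordered)
    lo = ordered[int(lower * (n - 1))]
    hi = ordered[int(upper * (n - 1))]
    return sum(ordered) / n, (lo, hi)


# Hypothetical posterior draws for one customer (stops and price coefficients).
beta_stops = [-0.10, -0.20, -0.30, -0.15]
beta_price = [-0.010, -0.020, -0.020, -0.010]

wtp = wtp_draws(beta_stops, beta_price)
mean_wtp, (ci_low, ci_high) = summarize(wtp)
```

Because WTP is a ratio, draws in which the price coefficient is close to zero can produce very large values, so the tails of the resulting distribution deserve a careful look.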
To fit this model, we used numpyro [11] with a NUTS kernel [12] and collected 1000 samples.
Results
In the context of airline pricing, we’d like to understand the WTP for customers avoiding layovers (stops) on their flights and the WTP for customers seeking a premium class seat. If we can identify different WTP for each customer, we can better personalize pricing based on these attributes to capture more revenue per customer.
To illustrate the results of our model, we present the average WTP estimate for stops along with a 95% credible interval for customer 5. These results are shown below for each scenario \(t\).
Average WTP stop for customer 5: -11.361211776733398
95% CI for customer 5: [-60. -0.18702286]
Average WTP stop for customer 5: -10.813233375549316
95% CI for customer 5: [-60. -0.1928775]
Average WTP stop for customer 5: -10.883264541625977
95% CI for customer 5: [-60. -0.17119325]
row | cust_id | scen_id | age | income | is_business | seat_browsed | price_f1 | stops_f1 | price_f2 | stops_f2 | choice |
---|---|---|---|---|---|---|---|---|---|---|---|
15 | 5 | 0 | 18-24 | med | 0 | econ | 207.0 | 1 | 237.0 | 1 | 1 |
16 | 5 | 1 | 18-24 | med | 0 | prem | 323.0 | 0 | 328.0 | 0 | 0 |
17 | 5 | 2 | 18-24 | med | 0 | econ | 180.0 | 2 | 228.0 | 0 | 1 |
Customer 5, an individual aged 18-24 with a middle income who is not a business traveler, shows a WTP of approximately $11 to avoid one additional stop. This is the marginal value that customer 5 would be willing to add to the ticket, on average, to avoid an additional stop on their flight. If we were to set prices at the customer level, we could use this information to set a price below this ceiling amount to better capture the additional consumer surplus.
We can also view customer 5’s WTP for a seat upgrade (moving from economy to premium). These results are shown below.
Average WTP seat for customer 5: 26.012453079223633
95% CI for customer 5: [ 0.8786044 60. ]
Average WTP seat for customer 5: 26.450408935546875
95% CI for customer 5: [ 0.83505126 60. ]
Average WTP seat for customer 5: 26.11977767944336
95% CI for customer 5: [ 1.03312797 60. ]
The posterior distributions for seat upgrades are more spread out, indicating a higher degree of uncertainty in our estimates compared to those for stops. Similar to the distributions in Figure 4.5, these posterior distributions do not vary significantly across each scenario. It is worth noting that the mean WTP for seat upgrades is higher for Customer 5, reflecting a greater potential value, though this estimate comes with higher uncertainty due to the wider distribution. This uncertainty can be incorporated directly into our pricing strategy to create a more robust offering system.
Price Offer Strategy
Our goal is to maximize the expected profit from a given price point for a particular customer. While we could set the price as the base fare plus the mean of the posterior distribution, a better approach is to find the price point \(p\) that maximizes Equation 4.9.
\[ E[p-c] = (p-c)P(WTP_{ikt} > p) \tag{4.9}\]
\(p\) is the price and \(c\) is the cost associated with the flight. The probabilities from Equation 4.9 can be pulled from our posterior distributions.
Assuming a cost \(c\) of $200 and a base fare of $200, we determine the optimal price for a flight based on the WTP for avoiding a stop. To do this, we test integer markups ranging from $5 to $30 above the base fare as candidate prices \(p\). Using the posterior distribution derived from scenario 2 for customer 5, the resulting optimal price is shown below.
Optimal price to maximize profit: $218
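The grid search behind this kind of result can be sketched as follows. This is not the post's code: the posterior draws are invented, `best_price` is a hypothetical helper, and the acceptance probability is assumed to come from comparing each WTP draw (taken as a positive dollar amount) against the markup over the base fare.

```python
def best_price(wtp_samples, base_fare, cost, markups):
    """Grid search for the price maximizing (p - c) * P(WTP > markup)."""
    n = len(wtp_samples)
    best = (None, float("-inf"))
    for m in markups:
        price = base_fare + m
        # Share of posterior draws in which the customer values the
        # attribute at more than the markup being tested.
        accept_prob = sum(w > m for w in wtp_samples) / n
        expected_profit = (price - cost) * accept_prob
        if expected_profit > best[1]:
            best = (price, expected_profit)
    return best


# Hypothetical posterior draws of the dollar value placed on avoiding a stop.
wtp_samples = [5.0, 8.0, 10.0, 12.0, 15.0, 18.0, 20.0, 22.0, 25.0, 30.0]
price, profit = best_price(wtp_samples, base_fare=200.0, cost=200.0, markups=range(5, 31))
```

Raising the markup increases the profit per sale but lowers the probability of acceptance, so the maximum typically lands in the interior of the grid rather than at either end.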
Conclusion
In this post, we reviewed the history of airline revenue management, from the 1978 deregulation act to today’s more sophisticated approaches. We discussed how these systems have attempted to maximize revenue by constantly adjusting prices based on market dynamics and leveraging techniques like price discrimination. We then introduced a new technological disruption with the advent of LLMs and LMMs. Following this, we presented a simpler approach using a dynamic hierarchical mixed logit model to estimate individual customer WTP. We concluded by demonstrating how these estimates can be used within a basic profit optimization framework to offer prices that maximize profitability for each customer.
Overall, we hope this post has provided a clear overview of the complexities of pricing within the airline industry. The practices currently in use are the result of a considerable amount of research and development. Now, with the revolution of AI powered by LLMs, the airline industry appears to be primed for yet another pricing revolution.