"Introduction to normalization of demand data

Practice Article

Introduction to normalization of demand
data – The first step in isolating the effects
of price on demand
Received (in revised form): 3rd September 2009

John D. Quillinan
is Executive Director of Analytics for Westgate Resorts, where his responsibilities include delivering analytics
to improve resort profitability, and implementing a revenue management system. John has over 25 years of
information technology and operational research experience. Revenue management, retail price optimization
and data analytics have been John’s focus for the past 5 years. John’s most recently implemented price test
analytics were for Tractor Supply Company. Leveraging his revenue management background and strong
project management skills, John ushered in a price optimization system for the merchandise line of business
at Walt Disney Parks and Resorts. John also laid the groundwork for bringing in a placement optimization
system. In addition to creating the vision and plans for retail revenue management at Disney’s theme parks
and resorts, he assembled a strong team to lead the company through the development, implementation and
ongoing maintenance of these optimization systems. Before working for Disney, John was a Manager of
Operations Research at Norfolk Southern Corporation, and led a team of programmer analysts that
developed Intranet-enabled decision support applications. At Norfolk Southern, John and his team developed
and implemented a corporate forecasting system that is still in use today to assist Marketing with forecasting
volume and revenues. John also enjoyed a successful career in the airline industry, where he worked in the
Operations Research organizations at United Airlines, Delta Air Lines and Trans World Airlines over the
course of 10 years. John holds a Master of Science degree in Industrial and Systems Engineering from the
University of Southern California Los Angeles and a Bachelor of Science in Industrial Engineering from the
University of Tennessee Knoxville. He is a member of INFORMS, current Chair-Elect of the INFORMS Pricing
and Revenue Management Section and a Past Chair of the INFORMS Aviation Applications Section.
Correspondence: John D. Quillinan, 10046 Fox Meadow Trail, Winter Garden, Florida, USA

ABSTRACT This article is based on the presentation (Quillinan, 2008) delivered in Montreal at the
INFORMS Pricing and Revenue Management Section Conference in June 2008. We will introduce the concept
of data normalization in pricing. The fundamental first step in identifying optimal pricing is to determine the
price–demand relationship. The demand of a product is influenced by many explanatory variables such as
consumer traffic or number of visiting customers, demographics of consumers, weather, and product
availability. In order to extract the price–demand relationship, one must isolate the effect of price on demand
from the effects of other explanatory variables. We employ the normalization process to isolate the impact of the
significant explanatory variables on demand. In our study, normalization occurred on a weekly level, and within
demand zones. We aggregated daily demand and all explanatory variables (mostly visitation statistics and
demographics for a theme park and resort) to the weekly level. We grouped store locations to create demand zones for the
purpose of using only applicable explanatory variables. Using the techniques of normalization, model fit, as
measured by adjusted R2, improved dramatically when using all significant explanatory variables versus
consumer traffic only.
Journal of Revenue and Pricing Management (2010) 9, 4–22. doi:10.1057/rpm.2009.37;
published online 30 October 2009
Keywords: pricing; price elasticity; normalization; explanatory variables; demand drivers; retail

© 2010 Macmillan Publishers Ltd. 1476-6930 Journal of Revenue and Pricing Management Vol. 9, 1/2, 4–22
www.palgrave-journals.com/rpm/


OPPORTUNITY

‘Ceteris paribus’ is fundamental to price elasticity theory. Ceteris paribus is Latin for ‘with other things [being] the same’ or ‘all other things being equal.’ For example, an increase in the price of beef will result, ceteris paribus, in less beef being sold to consumers. Putting aside the possibility that the prices of chicken, pork, fish and lamb have simultaneously increased by even larger percentages, or that consumer incomes have also jumped sharply, or that CBS News has just announced that beef prevents AIDS, and so on – an increase in the price of beef will result in less beef being sold to consumers.

Unfortunately, there are rarely instances in which everything else remains the same. Normalizing the demand data helps us satisfy the ‘ceteris paribus’ requirement. We can accomplish normalization through techniques such as ordinary least squares, whereby the demand is regressed against the demand drivers. Price can be included in the model, or a two-stage approach can be taken whereby the normalized means from a primary model without price as a covariate are regressed against price. By incorporating the drivers of demand for products into the modeling process, we can more accurately model the price–demand relationship for products.

FACTORS INFLUENCING DEMAND

Will a change in the price of a good or service change the quantity demanded? Understanding the price–demand relationship is the central goal to price a product or service correctly. At any given time, demand can be influenced by several factors beyond price. Traffic and customer demographics are just a couple of factors that influence demand. The impact of price on product demand is isolated when we identify significant explanatory variables (other than price) and remove their effect.

How sensitive is the quantity demanded to a change in the price of the good or service? We often refer to demand as being either elastic or inelastic. Elastic demand means that the quantity demanded is sensitive to the price. Inelastic demand means that the quantity demanded is not very sensitive to the price.

Demand also responds to exogenous and endogenous factors. Parameters or variables are said to be endogenous when they are predicted by other variables in the model; variables are exogenous if they are entirely outside the model. For example, a primary factor in any business is the ‘traffic’, that is, the number of guests/customers visiting a location or store. If more people visit a location, then generally the location sells more products. For the same price, demand is higher with higher traffic counts, and thus traffic can be an exogenous variable. For many businesses, the traffic itself is also a function of the price: lower advertised prices drive more traffic into the store, and thus traffic is also an endogenous variable. If this is true for a business, then advertising versus traffic alone may need to be employed as an independent variable to isolate the price effect, as advertised prices are having an impact on traffic. If independent variables, such as traffic and advertised prices, are related, there are a variety of statistical procedures that you might use to isolate the effects; Bascle (2008) offers a methodology to control for endogeneity through the use of instrumental variables.

The different types of consumers visiting a location may also influence a product’s demand; for instance, if more high-income consumers visit a location, we may find a higher proportion of high-end products sold. Consumer demographics capture the intrinsic variability that influences price elasticity, and can vary over time, affecting the quantity demanded for a product. In addition to the demographics of consumers varying over time, consumer demographics typically vary by location.

For any given traffic level or demographic mix, we can make a difference to the bottom line by adjusting price to maximize profits. Understanding the factors that influence actual



consumer behavior and applying the information to maximize profits is the fundamental value of normalization. There are numerous studies on drivers that affect the demand of products. We will discuss some of the factors in the literature review; however, most of the factors were not applicable to specific implementation.

A LITERATURE REVIEW OF FACTORS INFLUENCING DEMAND

The literature was reviewed for studies of factors that influence product demand. We will touch on some of the noteworthy or more frequent factors found in the literature.

• Product utility, that is, the usefulness of product features, can influence product demand. Incremental utility can translate into incremental demand.
  There are two measurable types of utility gained from the use of any product or service: utilitarian, which is the utility directly provided to the user, and conspicuous, which is the utility provided to the user as a result of being seen consuming the product or service (Basmann et al, 1988). The second type of utility is based on the concept of ‘conspicuous consumption’, which was coined by Thorstein Veblen (1912). Demand for some kinds of luxury goods, like luxury automobiles, actually goes up as prices rise. When prices increase, demand increases instead of dropping or staying the same; this is referred to as the Veblen effect. found evidence that offers much support for the presence of status motives in the behavior of women who buy cosmetics. Materialism, reference group and even education have been associated with conspicuous consumption (Schor, 1998).
  Conspicuous consumption is also linked to the bandwagon effect, that is, purchasing a good because others are purchasing that good (Leibenstein, 1950). The effect is often called herding instinct. People tend to follow the crowd without examining the merits of a particular object.
• Prices of related goods: When a fall in the price of one good reduces the demand for another good, the two goods are called substitutes (for example, pork and beef). When a fall in the price of one good increases the demand for another good, the two goods are called complements (for example, cars and gasoline).
• Availability of (number of and closeness of) substitutes: The greater the number of substitute products, the greater the elasticity. The closer the substitutes for a good or service, the more elastic demand will be in response to a change in price.
• Newness of the product: When first introduced, the demand for a product can be highly inelastic and gradually gain elasticity at maturity. This reflects a compound of primary demand of the product and impact of cross price elasticity. Normalization of the price demand function must take the product’s lifecycle into account.
• Product presentation quantity and shelf space allocation (placement): Altering the product presentation quantity (within the same location and across multiple locations) will alter demand. Dreze et al (1994) found that product location had a large impact on sales, whereas changes in the number of facings allocated to a brand had much less impact as long as a minimum threshold (to avoid out-of-stocks) was maintained.
• Degree of necessity or luxury: Luxury products tend to have greater elasticity than necessities. The demand for necessities tends to be inelastic to price, whereas luxury goods and services tend to be more elastic to price. For example, the demand for opera tickets is more elastic than the demand for urban rail travel. The demand for vacation air travel is more elastic than the demand for business air travel.
  After examining 101 different studies on gasoline, Espey (1996) found that while



  gasoline demand appears to be inelastic over the short term, over the long term the effect is not the same. Goodwin et al (2004) could not predict with absolute certainty what effect a rise in the price of gas would have on quantity demanded. Goodwin et al observed that the demand for fuel decreased by a greater percentage than the volume of traffic. This is probably because all else was not equal (a violation of our ceteris paribus requirement). The reason why fuel consumed decreases by more than the volume of traffic is probably that price increases trigger more efficient use of fuel (by a combination of technical improvements to vehicles, more fuel-conserving driving styles and driving in easier traffic conditions).
• Proportion of income required by the item: Products requiring a larger portion of the consumer’s income tend to have greater elasticity.
  As incomes grow, increases in status consumption will be pursued. Chao and Schor (1998) found that income and occupational status were positively associated with the propensity to engage in status purchasing.
• Habit-forming goods: Goods such as cigarettes and drugs tend to be inelastic in demand. Preferences are such that habitual consumers of certain products become de-sensitized to price changes. Cigarettes and alcohol have been the topics of numerous studies, including their joint demand. Goel and Morey (1995) concluded that cigarettes and liquor are substitutes in consumption; in other words, an increase in the price of cigarettes leads to an increase in the consumption of liquor and conversely.
  Lyon and Simon (1968) reviewed results from prior cigarette elasticity studies, which vary widely, ranging from −0.10 to −1.48, and suggested that temporal changes may explain some of the variation. Becker et al (1994) found that the empirical results tend to support the implication of addictive behavior that cross-price effects are negative, and that long-run price elasticity was twice as large as the short-term price elasticity.
  Some studies suggest that the demand for alcohol is price inelastic; others suggest it is price elastic. This is a signal that perhaps the demand for alcohol is being impacted by factors other than price that have not been incorporated in demand models. Fogarty (2004) concluded that the year of the study, the length of study, the per capita level of alcohol consumption and the relative ethanol share of a beverage are important factors when explaining variations in estimates of the price elasticity of demand for alcohol. Again, the price elasticity of alcohol is being impacted by other factors, which when not taken into account lead to poor estimates of elasticity.
• Time period consumed or time elapsed since a price change: Demand tends to be more elastic in the long run than in the short term. The more time consumers have to adjust to a price change, the more elastic the demand for that good. Becker et al (1994) and Espey (1996) found this to be the case for cigarettes and gasoline, respectively.
  Fibich et al (2005) found that price elasticity is very sensitive to the time that has elapsed since the price change. The effect of reference price is most noticeable immediately after a price change. A better estimation of elasticity can be derived when the time dependencies are accounted for. Fibich et al derive an expression for the price elasticity of demand in the presence of reference price effects that includes a component resulting from the presence of gains and losses in consumer evaluations.
• Permanent or temporary price changes: A 1-day sale will result in a different response than a permanent price decrease of the same magnitude. Blattberg et al (1995) found that temporary retail price promotions cause a significant short-term sales spike.
• Price points: Schindler (2006) showed the 99¢ price ending is a signal of low price appeal.



  Decreasing the price from US$2.00 to $1.99 may result in greater increase in quantity demanded than decreasing it from $1.99 to $1.98.
  Thomas and Morwitz (2005) conducted experiments with graduate students and showed that the greatest impact of price ending comes when it changes the leftmost digit in the price – $19.99 versus $20, for instance, as opposed to $3.49 versus $3.50. ‘Generally, it can be said that this happens because we read from left to right,’ Dr Thomas said, and we place extra importance on the first number we see. When he asked students to compare the prices of $99.99 with $150 and then compare $100 with $150, they rated the gap between $99.99 and $150 as being significantly larger, even though there was only a penny’s difference.
  Blattberg and Wisniewski (1989) showed that when the price of margarine was lowered from 89 to 71 cents, sales volume increased 65 per cent, but when it was lowered from 89 to 69 cents, sales volume increased by 222 per cent. Eighty-nine to seventy-one cents played on consumers’ left digit effect. Eighty-nine to sixty-nine cents played on consumers’ right digit signal as well as left digit effect to result in a phenomenal increase of 222 per cent.
• External factors: Weather (temperature and precipitation) can influence consumer purchase behavior. Bottled water and flashlights are typical pre-hurricane items. In 2004, Wal-Mart (Hays, 2004) learned from Hurricane Charley demand data that sales for strawberry Pop-Tarts increase seven times their normal sales rate ahead of a hurricane, and that beer was the top-selling pre-hurricane item.
• Advertising: Becker and Murphy (1988) have argued that advertising works by raising marginal consumers’ willingness to pay for a brand. Advertising actually raises each individual consumer’s willingness to pay for a brand (Erdem et al, 2008), and shifts the whole distribution of willingness to pay in the population (see Figure 1).

Figure 1: Any change that alters the quantity demanded at every price results in shifts in the demand curve.



Table 1: Theme park visitation statistics

Attendance – Visitor traffic at a theme park
Origin – Visitor’s permanent residence
Group type – Composition of the group visiting (family versus non-family)
Accommodations area – Where the visitor is staying overnight
Ticket type – Type of theme park ticket a visitor is using
Ethnicity – Ethnic origin of visitor
Household income – Household income of visitor
Last trip/repeat visitation behavior – Last time visitor was at theme park
Household segment – Segment into which household of a visitor falls

Table 2: Resort demographics

Resort population – Number of guests staying at a resort
Average length of stay by resort – Number of nights on average that a guest stays overnight
Average room rate by resort – Rate that a guest pays for room on average per night
Percent of children by resort – Based on the reported number of children staying at a resort

Although any of the above-described factors could be examined to help isolate the effect of price on demand, for this study, in a theme park and resort context, we focused on two exogenous explanatory factors, visitation and demography. In our study, the normalization process was successfully applied to the merchandise line of business, dramatically improving the fit of the model relating demand to price. The normalization process can be applied to other lines of business, such as restaurants or service industries, where demand is price-able. Though the demand drivers may change, the approach remains the same.

THE STUDY: IDENTIFYING APPLICABLE BUSINESS METRICS FOR DEMAND DRIVERS

The application of normalization to the sales of merchandise at theme parks began with investigation of what business metrics were collected. Many of these business metrics were already accepted in terms of describing the cycle and trends of the theme park and resort businesses.

We started out with a larger list of business metrics, and narrowed it down to the lists tabulated in Tables 1 and 2. The demand drivers were identified through interviews with merchants as well as insights from financial analysts supporting merchandise, and ultimately statistically validated through multivariate regression analysis. The demand drivers are often referred to as explanatory variables or covariates in the normalization.

We selected different demand drivers for sales of merchandise in the theme parks and resorts. We will discuss each separately to give the reader a background on these drivers. The reason behind the separation of demand drivers is that the theme park visitation statistics (see Table 1) cannot be reasonably applied to the sales of merchandise in resorts, because these statistics are collected through attendance-based theme park research surveys. We carry out a different set of surveys on guests staying at resorts, and they therefore have their own set of descriptive statistics. Because not all resort guests visit the theme parks, it is inappropriate to infer that theme park visitation statistics affect sales of merchandise at resort locations.



Visitation statistics

We intercepted guests visiting the theme parks, and asked a series of questions. Attendance is the most important visitation metric. This is the total number of ‘unique’ daily entries into a park; it does not include re-entries, or cross-overs between parks. Because the other visitation statistics are reported at the individual park level, attendance is taken as the multiple-theme park attendance. In this example, it is worth noting that attendance is truly exogenous – it is determined by factors unrelated to the prices of merchandise and foods; traffic to the theme parks as a whole is typically driven by academic calendars (of school-age children), entrance ticket prices and theme park/resort promotions. As discussed earlier, this assumption may not be true for other retail businesses.

Other visitation statistics include origin, group type, accommodation area, ticket type, ethnicity, household income, repeat visitation behavior and household segment.

Origin

Origin is the guest’s permanent residence, and is derived from the guest’s zip code. The highest-level classifications for origin are Domestic Tourists, International Tourists and Local Residents. Domestic Tourists are from any of the 49 United States (not local). International Tourists live outside of the United States. Local Residents live locally.

Origin data are used to classify data such as attendance. By being able to cut the data by origin or even finer, we can see patterns emerge. In the normalization process, we do not use all of the highest-level classifications, but rather employ the classification that is most significant and has the highest influence on demand. These classifications are determined through statistical selection methods, such as backwards-stepwise linear regression.

Group type

Group type represents the composition of the group visiting. The primary categories are family and non-family. Family subgroups are based on the age of the oldest child. Non-family subgroups are based on the age of the respondent.

Accommodations area

The Accommodations Area is where the visitor is staying overnight. Broadly, we classify visitors as staying either ‘On’ or ‘Off’ of the theme park-managed property/hotel.

Ticket types

Ticket types are classified according to the number of days over which a ticket can be used.

Ethnicity

Ethnic origin is classified into Caucasian, African-American, Hispanic and All Other; ethnic origin does not include international tourists.

Household income

Household income is based on the household income reported.

Last trip/repeat visitation behavior

Repeat visitation behavior is when the guest was on their last trip to the theme park; for instance, was the guest here less than a year ago, between 1 and 2 years ago, 2 and 3 years ago, 3 and 5 years ago, or more than 5 years ago. Last trip excludes first-timers.

Household segment

Household segment is a classification of a guest’s household based on life stage, and visitation behavior. Similar to group type, life stage classification is based on presence of children in the household (family versus pre-family and post-family) and ages of the children or adults. There are six life stage segments: Young Family, Tween Family, Teen Family, Pre-Family, Post-Family and Seniors.

The visitation statistics discussed thus far are categories selected by the theme park’s research



department to measure and understand a theme park business. As a first pass, they are being leveraged to normalize merchandise demand. We did not create the categories and classifications for the normalization of merchandise demand. We did not use all classifications of the categories in the normalization process. We selected the following classifications of the categories because of their statistical significance:

• local residents;
• families with children 0–13;
• guests on 1-day park tickets;
• Caucasian;
• household income > $60 000; and
• guests whose last trip was less than 5 years ago.

Resort demographics

Resort demographics, listed in Table 2, were obtained from the resort occupancy statistics. While surveys of resort guests are conducted, these descriptive statistics were not employed in the normalization owing to a more than 1-month lag in receiving this information, and to concerns that the statistics are not valid on a weekly level. The resort occupancy statistics came directly from the hotel/resort operational systems for checking guests in and out.

Resort population

Resort population is the number of guests staying at a resort.

Average length of stay

Average length of stay is the number of nights on average that a guest stays overnight.

Average room rate

Average room rate is the rate that a guest pays for a room on average per night.

Percentage of children

Percentage of children is based on the reported number of children staying at a resort; this information is collected during the reservation process, and, typically, each guest receives a room key with their name on it.

Weather statistics

We introduced weather statistics, such as maximum temperature and total rainfall, which are self-explanatory explanatory variables.

METHODOLOGY: AGGREGATION OF DEMAND DATA

Normalization can occur on whatever level of time that the demand data and drivers co-exist. In the case of our study of merchandise demand, normalization occurred on a weekly level. This is primarily because the theme park demographics were deemed only significant at the weekly level. We aggregated all data to the weekly level, even though in some cases such as demand for merchandise products and park attendance, the data were available on a daily basis. We selected weekly theme park demographics. We produced weather variables, such as the minimum, maximum or average, to represent the entire week.

Normalization occurred within a demand zone, or grouping of locations. The purpose was to use only applicable explanatory variables. We used theme park visitation statistics to normalize merchandise demand observed within the theme park. We used resort drivers to normalize merchandise demand observed at resort locations. We later aggregated the results from the demand zones to the price zone level for price differentiation.

Stage-wise regression (Alley, 1987) is one of the statistical approaches that are available for use. Analysis of Covariance in conjunction with meta-analysis techniques (Thompson and Sharp, 1999) is just one approach of the many that are available. In the merchandise example, we implemented a two-stage linear regression with backward selection. The first-stage regression normalized the demand using all available explanatory variables in the following form:

$$\hat{d} = \alpha + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_n x_n + \varepsilon$$



where $\hat{d}$ = total demand owing to explanatory variables, excluding price effect; $\alpha$ = base demand; $\beta_i$ = rate at which demand changes with explanatory variable $i$; $x_i$ = explanatory variable $i$; $n$ = number of explanatory variables; and $\varepsilon$ = error term.

We then subtracted the fitted/predicted demand calculated from the first step, $\hat{d}_t$, from actual observed demand, and added this residual to the average demand over the life of the product.

$$d'_t = \frac{\sum_{t=i}^{n} d_t}{n} + (d_t - \hat{d}_t) = \bar{d} + (d_t - \hat{d}_t)$$

where $d'_t$ = secondary stage demand at time $t$; $d_t$ = actual observed demand at time $t$; $\hat{d}_t$ = demand predicted by first-stage regression for time $t$, which is estimated using $\alpha + \beta_1 x_{1t} + \beta_2 x_{2t} + \cdots + \beta_n x_{nt}$; $\bar{d}$ = average demand over life of the product; $x_{jt}$ = explanatory variable $j$ at time $t$; $i$ = first time period; $n$ = number of historical weeks available; and $t$ = time period.

In the second-stage regression, the secondary stage demand, which we refer to as the normalized demand, was regressed against price to calculate the price coefficient, $\delta$.

$$\hat{d}' = \varphi + \delta P + \varepsilon$$

where $\hat{d}'$ = normalized demand predicted with respect to price effect; $\varphi$ = base demand; $\delta$ = rate at which demand changes with respect to price; $P$ = price; and $\varepsilon$ = error term.

The price coefficient is the amount by which demand changes with regard to price. It is not the price elasticity estimate of demand, $\rho_e$, that is, the rate by which the percentage of quantity demanded changes with respect to the percentage change in price. The above equation estimated from the second-stage regression, which represents the price-demand function, was then used to calculate the price elasticity of demand. This is a form of function that is ultimately passed to the price optimization system to find the optimal price.

$$\max Z = (P_0 - C)\,\hat{d}' = (P_0 - C)\,\varphi + \delta P_0 (P_0 - C)$$

where $Z$ = profit function; $P_0$ = optimal price; $C$ = cost (landed cost or procurement cost); $\hat{d}'$ = normalized demand predicted with respect to price effect; $\varphi$ = secondary demand intercept; and $\delta$ = rate at which demand changes with respect to price.

NORMALIZATION USING ALL EXPLANATORY VARIABLES

Normalization is a better alternative to how the merchandise demand and price changes were measured in the past. Historically, merchants in the theme park business measured their success based on per capita spending. Per capita is simply the demand divided by the theme park attendance. ‘Per cap’ is the ‘average’ amount demanded per guest per day. All guests count, even those who do not spend any money at all. Per capita is analogous to normalization using attendance alone.

$$\text{Per cap} = \frac{d}{A}$$

where $d$ = demand; and $A$ = attendance.

Attendance alone does not explain the variability in merchandise demand as accurately as an estimate based on normalization. We find that guests of different origins spend different amounts, and therefore per caps actually vary by origin. This is fairly intuitive if you think about it – let us use food and beverages as an example. When you are a tourist, you are likely to eat in the theme park, as you probably do not have anywhere else to go (unless you have brought in food or are in a timeshare where you can cook). Conversely, when you are a local resident, you have the ability to eat at home, and thus you might spend very little on food and beverages in the parks.

To illustrate the need to normalize using more than attendance in the merchandise business, we choose a commonly purchased product for obtaining character signatures: a vacation autograph book. We compare (1) normalization using attendance only (per cap); and (2) normalization using all explanatory variables including price. The measure of fit is the adjusted R².
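The two-stage procedure from the methodology section can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the weekly data are synthetic and noise-free, the drivers (attendance in thousands, share of local guests) and the unit cost are hypothetical, and price is deliberately uncorrelated with the drivers so that the second stage recovers the price coefficient exactly. With real data, where price and drivers are correlated, the two-stage estimate can differ from a joint regression. The closed-form optimum is derived here by setting the derivative of $(P - C)(\varphi + \delta P)$ to zero and is valid only when $\delta < 0$.

```python
from itertools import product


def ols(X, y):
    """Least squares via the normal equations; an intercept column is added."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


# Hypothetical weeks: (attendance in thousands, share of local guests, price)
weeks = list(product([40.0, 60.0], [0.10, 0.30], [9.99, 11.99]))
demand = [50 + 2.0 * a - 100 * loc - 8.0 * p for a, loc, p in weeks]

# Stage 1: regress demand on the non-price drivers only
drivers = [(a, loc) for a, loc, _ in weeks]
b = ols(drivers, demand)
fitted = [b[0] + b[1] * a + b[2] * loc for a, loc in drivers]

# Normalized demand: average demand plus the first-stage residual
d_bar = sum(demand) / len(demand)
normalized = [d_bar + (d - f) for d, f in zip(demand, fitted)]

# Stage 2: regress normalized demand on price
phi, delta = ols([(p,) for _, _, p in weeks], normalized)

# Profit-maximizing price for the linear demand form (requires delta < 0)
cost = 5.0  # assumed unit cost
p_star = (delta * cost - phi) / (2 * delta)
print(round(phi, 2), round(delta, 2), round(p_star, 3))
```

On this synthetic data the second stage returns an intercept of about 130 and a price coefficient of about −8, which gives an optimal price of about 10.625 for the assumed cost of 5.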



Figure 2: Normalization using Attendance Only.
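Because the comparison that follows rests on adjusted R² rather than raw R², it may help to see how the adjustment penalizes additional explanatory variables. The sketch below uses the standard textbook formula; the sample size of 52 weeks and the raw R² of 0.86 are hypothetical values chosen for illustration and are not reported in the study.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Same hypothetical raw R^2, explained with 1 versus 5 predictors
one_driver = adjusted_r2(0.86, 52, 1)
five_drivers = adjusted_r2(0.86, 52, 5)
print(round(one_driver, 4), round(five_drivers, 4))  # 0.8572 0.8448
```

For the same raw fit, the five-variable model scores lower, so an improvement in adjusted R² from adding drivers reflects explanatory power beyond what the extra parameters alone would buy.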

When we fit a statistical model of demand using only attendance as the explanatory variable to our example product, the adjusted R² is 0.64. In short, attendance helps to explain approximately 64 per cent of the variation in demand (Figure 2).

$$\hat{d} = \alpha + \beta A + \varepsilon$$

where $\hat{d}$ = total demand predicted; $\alpha$ = base demand; $\beta$ = rate at which demand changes with attendance; $A$ = attendance; and $\varepsilon$ = error term.

$$\hat{d} = \alpha + \beta_1 A + \beta_2\,\text{Cauc} + \beta_3\,\text{FirstTrip} + \beta_4\,\text{Family 0–13} + \beta_5\,\text{Location} + \varepsilon$$

where $\hat{d}$ = total demand predicted; $\alpha$ = base demand; $\beta_1$ = rate at which demand changes with attendance; $A$ = attendance; $\beta_2$ = rate at which demand changes with respect to percentage Caucasian visitors; Cauc = percent Caucasian visitors; $\beta_3$ = rate at which demand changes with respect to percentage visitors on First Trip; FirstTrip = percentage visitors on First Trip; $\beta_4$ = rate at which demand changes with respect to percentage visitors that are families with the oldest child between ages 0 and 13; Family 0–13 = percentage visitors that are families with the oldest child between ages 0 and 13; $\beta_5$ = rate at which demand changes with respect to location exposure; Location = percentage exposure that product is given across locations; and $\varepsilon$ = error term.

When the demand was normalized using all of the significant explanatory variables including price, five variables were significant: theme park attendance, Caucasian, First Trip, Family 0–13 and Location. This approach resulted in an adjusted R² of 0.85; 85 per cent of the demand variation is accounted for by the explanatory variables. Thus, the equation using additional variables improved the explanatory power of the model from 64 per cent to 85 per cent, a dramatic change (Figure 3).

The 21 percentage point improvement in goodness of fit for the autograph book is an irrefutable reason for including all significant explanatory variables versus attendance only.

Another example of where the goodness of fit measure improved by incorporating all significant explanatory variables was a nighttime entertainment or ‘glow’ product. Considering attendance only to explain the variability in the demand, we end up with the


Quillinan

Figure 3: Normalization using all significant explanatory variables including price.

following model: Location ¼ percentage exposure that product
is given across locations.
^
d ¼ 133:7688 þ 0:00008518 A þ e The first-stage-regression adjusted R2 for
this product went from 0.9 per cent for
^
where d ¼ total demand predicted; A ¼ attendance only to 54.7 per cent for a full
attendance level; and e ¼ error term. normalization model using all significant ex-
When we added the significant explanatory planatory variables.
variables to the model, we ended up with the In the second-stage regression for the sample
following ‘full’ model: product in Figure 4, the normalized demand
is computed from the first-stage regression
^
d ¼ À88:317348 þ 0:000102ÂA residuals and the means of the series. Figure 5
illustrates how the normalized demand for the
þ 3:646500838ÂCauc superior model (that is, ‘full’ model) is then
þ 5:715628ÂTourist Ticket regressed against the price to arrive at the
þ 4:696201ÂLocal Resident predicted normalized demand with respect to
price.
À 6:176941ÂMax Temp
þ 5:757854ÂLocation ^
d0 ¼ 372:0399 À 12:6762ÂP þ e
^
where d 0 ¼ normalized demand predicted with
^
where d ¼ total demand predicted; A ¼ respect to price effect; and P ¼ price.
attendance; Cauc ¼ percentage Caucasian visi- Overall, there is a remarkable improvement
tors; Tourist_Ticket ¼ percentage of visitors in model fit across all items available for
on Tourist ticket; Local_Resident ¼ percen- normalization. The mean adjusted R2 goes
tage of visitors that are local residents; from 6 per cent to approximately 50 per cent
Max_Temp ¼ maximum temperature; and across the proportion of the products that are



Figure 4: Normalization of Glow Product using Attendance Only and Full Model.

Figure 5: Second-stage regression results using Full Model.

available for normalization. We applied normalization to analyze every item that had at least 52 weeks of demand history and at least one price change. Figure 6 illustrates the improvement in adjusted R² for a sample of the products impacted. Normalized demand was calculated for the demand zone exhibiting the most sales per item; alternatively, normalized demand can be weighted by the proportion of demand in a demand zone and summed up across all groupings of locations to produce an estimate of normalized demand at the price zone level, which represents the higher level of grouping of locations (above demand zone) where price differentiation occurs.

Figure 6: Distribution of adjusted R² from Attendance Only to Full Model.

Normalization is the first step in isolating the effects of price on demand. After normalizing demand, we can go back and evaluate whether or not a price change was successful. In Figures 2 and 3, the reader can see that a price change occurred at about week 71. A standard t-test was conducted on the means of the fitted (‘normalized’) demand before and after the price change. If the means were statistically different, then a simple comparison of the pre- and post-price-change normalized profit was carried out. If the normalized profit after the price change (that is, the new price less cost of goods sold multiplied by the post-price-change normalized demand) was greater than the old price less cost of goods sold multiplied by the pre-price-change normalized demand, then the price change is concluded to be a success. The caveat to this test is that product relationships should be taken into consideration when evaluating a price change.

NORMALIZATION REQUIRES CONTINUAL PROCESS IMPROVEMENT

Normalization models must be updated regularly. This requires continual process improvement. An organization that identifies explanatory variables upon implementation and never goes back to update its models could be facing a disaster. In our example, annual audits to refresh models should be planned. In addition to annual audits, there should be ongoing efforts to identify other explanatory variables.

• In the case of the glow product, park operating hours and total hours of darkness during park hours were not available
initially. However, later on, we evaluated these data and found them to have merit in the normalization process.
• The number and type of locations at which a product is available change often, and therefore are likely to influence the demand for the product. We introduced an additional explanatory variable, which represents the availability of a product at different locations, to account for changes in product demand owing to changes in a product’s availability at different locations. The location variable is the weighted number of locations at which a product is available each week. We calculate the location variable for each item in each demand zone. The weights for each location are the location’s percentage of contribution to total sales during the previous year. The weights are calculated at the sub-class level and applied to all items in the sub-class. We update the weights every few months, to adjust for any location changes. Without the location variable in the normalization models, the average adjusted R² was 0.35. Only approximately 25 per cent of the models had an R² higher than 0.5. After we added the location variable, the model performance improved significantly (see Figure 7). The average adjusted R² increased to 0.52. More than 50 per cent of the models had an adjusted R² higher than 0.5.
• Big-box locations utilize ShopperTrak hardware, which effectively counts the number of guests entering and exiting the location. We leveraged the ShopperTrak guest counts collected on the hour to allocate labor during heavy traffic periods. These guest counts could be used in lieu of park attendance counts for demand at the big-box store locations. ShopperTrak is commonplace at retail locations such as nationwide chains at indoor shopping malls, and would most likely be an ideal explanatory variable for a retailer to use in normalizing its demand data. Improving upon the estimate of consumer traffic should have value for any retailer.

Figure 7: Distribution of adjusted R² Without and With Location Variable.

NEXT STEPS

Related products
The models presented in this article are simplistic. Models that are more complex could incorporate the prices of related goods, both complements and substitutes, and the number of available substitutes. Market basket analysis is one technique that can help understand demand dynamics.
An example of the price of related products is how the price of the complements of a princess costume (the tiara, wand and novelty shoes) impacts the demand for the princess costume. The prices of other characters’ costumes may also influence the demand for other costumes. Costume ensembles may cannibalize each other.

Special event indicators
In spite of our best efforts to identify operating business metrics that explain merchandise demand, and incorporate product relationships, we will sometimes fail to come up with models that result in a high adjusted R². In some of these cases, further work with indicator variables may be of some promise. Some products are seasonal or driven specifically by recurring events. Some events that were identified in the theme park example are:

• New Year’s Eve
• Presidents Day
• Valentine’s Day
• Sports Weekend
• Spring Break
• Easter
• Home and garden show
• Earth Day
• Mother’s Day
• Memorial Day
• Grad Night
• Father’s Day
• Sci-Fi Weekend
• Summer Vacation
• Independence Day
• Labor Day
• Food and Wine
• October Fest
• Fright Fest
• Halloween
• Thanksgiving
• Holiday Season
• Christmas

Diagnostic tests
Finally, we have spent a lot of time explaining how adding all of the significant explanatory variables will result in better model fits. We do not want to lose sight of the importance of running diagnostics on our models. Therefore, our last next step is to incorporate diagnostics in the regression processes. Some recommended diagnostics are:

• Significance of explanatory variables: Are all of the explanatory variables significant?
• Collinearity: Predictors that are highly collinear, that is, linearly related, can cause problems in estimating the regression coefficients. When more than two variables are involved, this is often called multi-collinearity. The Variance Inflation Factor (VIF) is a good metric to use to evaluate multi-collinearity. As a rule of thumb, a variable whose VIF value is greater than 10 may merit further investigation.
• Normality: The errors should be normally distributed. Technically, normality is necessary only for the t-tests to be valid; estimation of the coefficients only requires that the errors be identically and independently distributed. The Kolmogorov-Smirnov test compares the cumulative distribution of the data with the expected cumulative Gaussian distribution, and bases its P-value on the largest discrepancy.
• Homogeneity of variance (homoscedasticity): One of the main assumptions for the ordinary least squares regression is the homogeneity of variance of the residuals. If the model is well fitted, there should be no pattern to the residuals plotted against the fitted values. If the variance of the residuals is non-constant, then the residual variance is said to be ‘heteroscedastic.’ There are graphical and non-graphical methods for detecting heteroscedasticity. The White Test (White, 1980) tests the null hypothesis that the variance of the residuals is homogeneous. Therefore, if the P-value is very small, we would have to reject the hypothesis and accept the alternative hypothesis that the variance is not homogeneous.

Don’t lose sight of the big picture
The pricing analyst should also incorporate impacts on the category when evaluating the impact of price changes on individual products. The objective is to improve overall economics, not the profitability of individual product lines. An example: When a retailer increased the price on a leading brand of detergent, he saw a drop in the total gross margin; normally, the pricing analyst would have concluded that the price increase was a negative price test and that the company should take price in a different direction. However, when the analyst looked at the category sales, he noticed that another detergent whose price remained the same was generating higher gross margin owing to increased sales. He concluded that the sales from the leading detergent had diverted to the lower-quality detergent, on which the retailer happened to have a higher profit margin. At one point, the retailer was considering dropping the leading brand detergent, but the pro forma became more complicated. The manufacturer of the leading brand detergent gave the retailer off-invoice promotions to keep their business; consequently, this allowed the retailer some options with product pricing.

CONCLUSION

Greater confidence in price optimization
The two-stage regression methodology results in a substantially improved model fit and thus in much stronger explanatory power. With this greater explanatory power, pricing analysts have greater confidence in taking price recommendations from the price optimization. Without this confidence, the pricing analyst could take the price in the wrong direction, or perhaps in the right direction but then conclude that the price change had a negative effect.

Extensibility of normalization
The concept of normalization can be applied to any line of business where the demand is price-able. Though the demand drivers may change, the approach remains the same. The end-user needs only to identify and apply those explanatory variables that are unique to their business: the general demand drivers that might influence demand, particularly retail demand. In Table 3, I have attempted to enumerate and classify possible explanatory variables, and identify which industries might benefit from them.

Table 3: Covariates across industries

Seasonality covariates (most/all industries)
• Fourier Filter-based Covariates (multiple yearly cycles)
• Other Data-derived Covariates (such as monthly contribution ratios)
• Day-of-Week Covariates
• Holiday/Special Event 0/1 Indicator Covariates
• Other User-defined Covariates

Customer-based covariates (retail, hotel/cruise, distribution, services, casino, theme park, manufacturing)
• Customer Demographics (total number of customers, Age/Financial Distribution, etc)
• Customer Survey Information
• Information on Prior Customer Relationships

Competition-based covariates (airline, petroleum, financial services, apartment leasing, distribution, any industry with access to competitive data)
• Market Strength Indices
• Price or Volume of a Competitor’s key products
• Your company’s Price Rank among competitors

Financial covariates (primarily financial services, petroleum, manufacturing, transportation, communications, others where appropriate)
• Gross Domestic Product (GDP)
• Index of Leading Indicators
• Gross National Happiness, a new concept relating happiness to economic growth
• Population
• Labor Force: Employment, Unemployment rate, Average Weekly earnings, Job security
• Public Expenditure, Revenues, Budget Surplus and Deficit, National Debt
• Personal Income, Expenditure, Savings
• Broadband Internet Penetration
• International: Balance of Payments (Current Account Balance of Trade)
• Productivity Survey
• Manufacturing output, Capacity Utilization, Inventories
• Money supply, Interest Rates (Fed Funds, Prime, Major Competitors), Yield on various financial Instruments and Yield Curves
• Stock Market Indices (Dow-Jones Industrial Average, S&P 500, etc)
• Inflation, Consumer Price Index, Producer Price Index
• New Home Sales
• Retail Sales, Auto Sales
• Lagging indicator, a historical indicator following an event that reacts slowly to economic changes
• Genuine Progress Indicator, a concept in ecological economics and welfare economics that has been suggested as a replacement metric for GDP
• Spot market prices
• Energy prices, supply and consumption (Source: US Department of Energy, Energy Information Administration, http://eia.doe.gov)

Climate covariates (retail, healthcare, pipeline, any industry where relevant)
• Daily Temperature (Low, High, Average, #consecutive cooling/heating days)
• Rainfall

Final diagnostic
While much of the process, for example, the selection of explanatory variables and diagnostic tests, can be automated, this does not relax the need for a pricing analyst to ensure that the inputs, that is, the explanatory variables, and the price optimization results, make sense. A pricing analyst provided three examples in which price elasticities were out of kilter, and knowledge of consumer behavior or the assortment provided additional insights for pricing decisions.
One example was a price recommendation for a princess lightchaser (a glow product). The lightchaser was originally priced at $14, changed to $15, later to $10, and then raised back to $15. Using the price-demand history, the system derived a price elasticity of 0.0000175, and arrived at an ‘optimal’ price of $400 000. Needless to say, the pricing analyst declined the price recommendation, even though it had an expected revenue lift of $8.3 billion.
Recent ‘flawed’ recommendations were produced for ponchos. After increasing the retail price of ponchos by $1 (adults to $8 and youth to $7), feedback from operations was received that guests were balking at the price increases, and purchasing a youth poncho instead. Ponchos in general were up in revenue. The systems generated pricing recommendations for ponchos based on an elasticity of −2.6 for adults and a positive elasticity for youth. Although these elasticities supported operational feedback, intervention by a pricing analyst was still needed to understand the poncho price change as a whole. Pricing decided to consolidate the price points and tried lowering the price of both poncho sizes; unfortunately, the price test results were not encouraging. After normalizing for attendance, weather, demographics and so on, there was no increase in demand. This supports the pricing analyst’s original hypothesis that ponchos are inelastic and therefore price decreases do not drive incremental revenue.
The third and final example was related to theme park novelty headwear, where a core product had been increased from $9.95 to $11.95. At the same time, the novelty product line was expanded and a create-your-own hat was introduced. Both events led to cannibalization of the core product’s demand. As a result, the price optimization system recommended a price decrease, because the core product demand was down about 10 per cent. What the system failed to take into account was that the total hat business was up; consequently, no decreases in retail prices of the core product are planned, as the quantity loss is believed to stem from cannibalization and not the price increase.
In conclusion, the human mind caught something about which the data told the pricing analyst otherwise, and prevented potentially disastrous price changes. In the end, the price-demand function from the normalization process must pass the final diagnostic: the pricing analyst sanity check.

ACKNOWLEDGEMENTS
The model and graphs presented in this article are based on fictitious demand data. We did not use actual products and demand data in the creation of the graphs. I thank the following individuals: Robert Shumsky for his encouragement, and valuable feedback throughout the referee process; Utku Yildirim for his feedback on an early draft; the Merchandise Revenue Management and Pricing teams, in particular Ashley Bynoe and Tam Phan for their valuable input, and Jacqueline Kappes for the pricing analyst examples; and to my wife Kim Quillinan and April Walsh for their review, suggestions and proofing.

REFERENCES
Alley, W.M. (1987) A note on stagewise regression. The American Statistician 41(2): 132–134.
Bascle, G. (2008) Controlling for endogeneity with instrumental variables in strategic management research. Strategic Organization 6(3): 285–327.
Basmann, R., Molina, D. and Slottje, D. (1988) A note on measuring Veblen’s theory of conspicuous consumption. Review of Economics and Statistics 70(3): 531–535.
Becker, G.S. and Murphy, K.M. (1988) A rational theory of addiction. Journal of Political Economy 96(4): 675–700.
Becker, G., Grossman, M. and Murphy, K. (1994) An empirical analysis of cigarette addiction. American Economic Review 84: 396–418.
Blattberg, R.C., Briesch, R. and Fox, E.J. (1995) How promotions work. Marketing Science 14(3(2)): 122–132.
Blattberg, R.C. and Wisniewski, K.J. (1989) Price-induced patterns of competition. Marketing Science 8(4): 291–309.
Chao, A. and Schor, J.B. (1998) Empirical tests of status consumption: Evidence from women’s cosmetics. Journal of Economic Psychology 19(1): 107–131.
Dreze, X., Hoch, S.J. and Purk, M.E. (1994) Shelf management and space elasticity. Journal of Retailing 70(4): 301–326.
Erdem, T., Keane, M. and Sun, B. (2008) The impact of advertising on consumer price sensitivity in experience goods markets. Quantitative Marketing and Economics 6(2): 139–176.
Espey, M. (1996) Explaining the variation in elasticity estimates of gasoline demand in the United States: A meta-analysis. Energy Journal 17(3): 49–60.
Fibich, G., Gavious, A. and Lowengard, O. (2005) The dynamics of price elasticity of demand in the presence of reference price effects. Journal of the Academy of Marketing Science 33(1): 66–78.
Fogarty, J. (2004) The Own-price Elasticity of Alcohol: A Meta-analysis. The University of Western Australia. Department of Economics Working Paper, 04-01.



Goel, R.K. and Morey, M.J. (1995) The interdependence of cigarette and liquor demand. Southern Economic Journal 62(2): 451–459.
Goodwin, P., Dargay, J. and Hanly, M. (2004) Elasticities of road traffic and fuel consumption with respect to price and income: A review. Transport Reviews 24(3): 275–292.
Hays, C.L. (2004) What Wal-Mart knows about customers. The New York Times 14 November, http://www.nytimes.com/2004/11/14/business/yourmoney/14wal.html.
Leibenstein, H. (1950) Bandwagon, Snob, and Veblen effects in the theory of consumers’ demand. Quarterly Journal of Economics 64(2): 183–207.
Lyon, H. and Simon, J.L. (1968) Price elasticity of the demand for cigarettes in the United States. American Journal of Agricultural Economics 50(4): 888–895.
Quillinan, J.D. (2008) Normalization of demand data: An introduction to normalization. INFORMS Revenue Management and Pricing Conference; 19 June, Montreal, Quebec, Canada.
Schor, J.B. (1998) The Overspent American: Upscaling, Downshifting, and the New Consumer. New York: Basic Books.
Schindler, R.M. (2006) The 99 price ending as a signal of a low-price appeal. Journal of Retailing 82(01): 71–77.
Thomas, M. and Morwitz, V.G. (2005) Penny wise and pound foolish: The left digit effect in price cognition. Journal of Consumer Research 32(01): 54–64.
Thompson, S.G. and Sharp, S.J. (1999) Explaining heterogeneity in meta-analysis: A comparison of methods. Statistics in Medicine 18: 2693–2708.
Veblen, T. (1912) The Theory of the Leisure Class: An Economic Study of Institutions. New York: Macmillan.
White, H. (1980) A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48(4): 817–838.