Choice Morsel

 

The purpose of this page is to offer visitors some insight into consumer choice behavior, choice modeling, marketing and related topics. It is updated periodically, so check back every once in a while!

Last updated on 04.21.2010.

 

Modeling Choice Set Formation Within the GEV Family of Models

INTRODUCTION

In standard choice models it is assumed that the alternatives among which choice is exercised can be exogenously specified by the analyst. Thus, in the most commonly used discrete choice models (e.g. Multinomial Logit, or MNL; Nested MNL; Multinomial Probit, or MNP) it is assumed that some set Cn ⊆ M, where M is the master set of alternatives, is the true set from which the choice of person n is observed. (The most common strategy of choice set specification makes all choice sets equal to the master set, i.e. Cn = M, ∀n.) Strictly speaking, however, set Cn is a latent construct to the analyst, since generally nothing is observed about it except the most preferred alternative.

Choice set imputation, or generation, is clearly relevant from the perspective of specifying models of choice processes. Swait and Ben-Akiva (1986) examined theoretically the impacts of choice set mis-specification when captivity (i.e. being unable to choose anything except the single alternative in the choice set) is present among a fraction of the population but ignored by the analyst, who erroneously specifies Cn = M for the entire population. They report that (1) alternative-specific constants are downward biased for the alternative exhibiting captivity, and (2) attribute slope effects become attenuated by the presence of the unrecognized captivity. One may safely surmise from their analysis that choice set mis-specification is deleterious to the analyst’s efforts to determine unbiased taste parameter estimates in any choice model.

Manski (1977) formulated a two-stage characterization of the choice process as the basis for model development:

$$P(i) = \sum_{C \in \Delta(M)} P(i \mid C)\, Q(C), \qquad (1)$$

where C is a choice set in Δ(M), the set of subsets of M, Q(C) is the probability that C is the true choice set, and P(i|C) is the conditional probability of choice given set C (zero if i ∉ C). The usual inspiration for specifying Q(C) has been the notion of random constraints (e.g. travel time limits, reservation prices, restrictions imposed by other agents) acting upon the formation of the set of actively considered alternatives.
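The two-stage sum can be sketched numerically. The following is a minimal illustration only: it assumes an MNL form for P(i|C), and the utilities, the set probabilities Q(C) and the function names `mnl` and `two_stage` are hypothetical values and names chosen for the example, not taken from Manski (1977).

```python
import math

def mnl(V, C, mu=1.0):
    """Conditional MNL probabilities P(i|C) over the alternatives in set C."""
    denom = sum(math.exp(mu * V[j]) for j in C)
    return {i: math.exp(mu * V[i]) / denom for i in C}

def two_stage(V, Q):
    """P(i) = sum over C of P(i|C) * Q(C); P(i|C) is zero when i is not in C."""
    P = {i: 0.0 for i in V}
    for C, q in Q.items():
        for i, p in mnl(V, C).items():
            P[i] += p * q
    return P

# Hypothetical utilities and choice set probabilities:
# 30% of the population is captive to "car", 70% considers the full set.
V = {"car": 0.5, "bus": 0.0, "rail": 0.2}
Q = {("car",): 0.3, ("car", "bus", "rail"): 0.7}

P = two_stage(V, Q)
print(P)
```

In this example the captive fraction raises the overall share of "car" above what an MNL on the full set alone would give, which is the kind of effect an analyst misses when specifying Cn = M for everyone.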

Swait (2001) proposes a new model of choice set generation belonging to the GEV (Generalized Extreme Value) family of discrete choice models; to the best of our knowledge, this is the first such model. An interesting feature of the model is that the choice set probabilities need make no use of exogenous information, but are instead taste-driven. (Covariates may be added to the model, of course, to aid identification of the choice set generation probabilities. This possibility is addressed subsequently in the discussion of model extensions.) This differentiates it from Manski’s two-stage framework, described above.

Swait’s (2001) contributions are two-fold: (1) the class of GEV discrete choice models is expanded by a new member, denoted the Choice Set Generation Logit, or GenL, model; and (2) choice set generation models of a certain specific structure are shown to be consistent with the GEV class and, by implication, are shown to be consistent with utility maximizing behavior under certain conditions.

THE GenL MODEL

McFadden (1978) presented a theorem characterizing the Generalized Extreme Value family of probabilistic choice models. The theorem relates a generating function G(), which must have certain characteristics, to a corresponding choice probability. For example, consider the generating function

$$G(y_1, \ldots, y_J) = \sum_{j=1}^{J} y_j^{\mu}, \quad \mu \ge 0, \qquad (2)$$

where the y’s are non-negative arguments, one for each of J alternatives, and μ is a scale factor. After substituting yi = exp(Vi), where Vi is a latent variable (utility) without sign restrictions, which guarantees the non-negativity of the arguments of G(), the choice probability is given by the familiar MNL expression below:

$$P_i = \frac{\exp(\mu V_i)}{\sum_{j=1}^{J} \exp(\mu V_j)}. \qquad (3)$$
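As a worked check, the step from the generating function to the choice probability can be written out. This sketch assumes the MNL generating function G(y) = Σ y_j^μ, which is homogeneous of degree μ, and uses the GEV theorem in its degree-μ form, P_i = y_i G_i / (μ G), where G_i is the partial derivative of G with respect to y_i.

```latex
\begin{aligned}
G(y_1,\ldots,y_J) &= \sum_{j=1}^{J} y_j^{\mu},
\qquad
G_i = \frac{\partial G}{\partial y_i} = \mu\, y_i^{\mu-1}, \\[4pt]
P_i &= \frac{y_i\, G_i}{\mu\, G}
     = \frac{y_i \cdot \mu\, y_i^{\mu-1}}{\mu \sum_{j=1}^{J} y_j^{\mu}}
     = \frac{y_i^{\mu}}{\sum_{j=1}^{J} y_j^{\mu}}, \\[4pt]
y_i = e^{V_i} \;\Rightarrow\;
P_i &= \frac{\exp(\mu V_i)}{\sum_{j=1}^{J} \exp(\mu V_j)} .
\end{aligned}
```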

Before defining the proposed model, Swait (2001) first addresses the composition of the set Δ(M), the set of possible subsets of M, the master set of alternatives. In any model of choice set generation, Δ(M) is part of the specification of the choice model, just as are the attributes included in the utility function. Thus, in one model Δ(M) may include all possible non-empty subsets of M (of which there are 2^J − 1); in another, Δ(M) may be restricted to sets of size one and the full choice set (this might be called the "captivity" model); in yet another, it may be restricted to sets of size L or smaller, where L ∈ {1,…,J}. With this flexibility in mind, define K to be the number of sets included in Δ(M), such that 1 ≤ K ≤ 2^J − 1.
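The three choice set spaces just described can be enumerated directly. The sketch below is illustrative only; the function names `all_subsets`, `captivity_sets` and `max_size_sets` are hypothetical and are not notation from Swait (2001).

```python
from itertools import combinations

def all_subsets(M):
    """Every non-empty subset of M: 2^J - 1 sets."""
    return [frozenset(c) for r in range(1, len(M) + 1) for c in combinations(M, r)]

def captivity_sets(M):
    """Singletons plus the full set (assumes J >= 2, so the full set is not a singleton)."""
    return [frozenset([i]) for i in M] + [frozenset(M)]

def max_size_sets(M, L):
    """Every non-empty subset with at most L alternatives."""
    return [frozenset(c) for r in range(1, L + 1) for c in combinations(M, r)]

M = [1, 2, 3, 4]                    # master set with J = 4
print(len(all_subsets(M)))          # K for the unrestricted space
print(len(captivity_sets(M)))       # K for the captivity space
print(len(max_size_sets(M, 2)))     # K when sets have at most 2 alternatives
```

Whichever space is chosen, K is simply the length of the resulting list, and it always lies between 1 and 2^J − 1.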

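The GEV generating function for the GenL model is given next:

$$G(y_1, \ldots, y_J) = \sum_{k=1}^{K} \Big( \sum_{i \in C_k} y_i^{\mu_k} \Big)^{\mu/\mu_k}, \qquad (4)$$

which is a valid GEV generating function that satisfies the conditions of the GEV Theorem if μ/μk ≤ 1, ∀k. The GenL choice probability for i ∈ {1,…,J} is given by

$$P_i = \sum_{k \in K_i} P(i \mid C_k)\, Q(C_k), \qquad (5a)$$

where Ki = {k | 1 ≤ k ≤ K, i ∈ Ck},

$$P(i \mid C_k) = \frac{\exp(\mu_k V_i)}{\sum_{j \in C_k} \exp(\mu_k V_j)}, \qquad (5b)$$

$$Q(C_k) = \frac{\exp(\mu I_k)}{\sum_{r=1}^{K} \exp(\mu I_r)}, \qquad (5c)$$

$$I_k = \frac{1}{\mu_k} \ln \sum_{j \in C_k} \exp(\mu_k V_j). \qquad (5d)$$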

Swait (2001) presents proofs for the above expressions.
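To make the structure of the GenL probabilities concrete, here is a minimal numerical sketch. It rests on an assumption that should be stated loudly: the conditional probabilities P(i|Ck) are taken to be MNL with scale μk, and Q(Ck) is taken to be proportional to exp(μ·Ik), with Ik = (1/μk) ln Σ exp(μk·Vj) over j in Ck. This is the form implied by a nested-logit-style generating function with the condition μ/μk ≤ 1; consult Swait (2001) for the authoritative expressions. The function name `genl` and the utilities are hypothetical.

```python
import math
from itertools import combinations

def genl(V, sets, mu_k, mu=1.0):
    """Return (P, Q): choice probabilities P[i] and choice set probabilities Q[k].

    ASSUMED form (see lead-in): P(i|C_k) is MNL with scale mu_k, and
    Q(C_k) is proportional to exp(mu * I_k), I_k being the inclusive value of C_k.
    """
    if any(mu / m > 1.0 for m in mu_k):
        raise ValueError("GEV condition mu/mu_k <= 1 violated")
    # Inclusive value (expected maximum utility, up to a constant) of each set.
    I = [math.log(sum(math.exp(m * V[j]) for j in C)) / m
         for C, m in zip(sets, mu_k)]
    denom = sum(math.exp(mu * x) for x in I)
    Q = [math.exp(mu * x) / denom for x in I]
    P = {i: 0.0 for i in V}
    for C, m, q in zip(sets, mu_k, Q):
        s = sum(math.exp(m * V[j]) for j in C)
        for i in C:
            P[i] += q * math.exp(m * V[i]) / s   # P(i|C_k) * Q(C_k)
    return P, Q

V = {1: 0.4, 2: 0.0, 3: -0.3}                             # hypothetical utilities
sets = [c for r in (1, 2, 3) for c in combinations(V, r)] # all 7 non-empty subsets
P, Q = genl(V, sets, mu_k=[1.0] * len(sets), mu=1.0)
print(P)
```

Under these assumptions, setting every μk equal to μ over the full set of subsets makes the model collapse to the MNL, because each alternative then appears in the same number of sets; letting the μk differ is what allows the choice set probabilities to matter.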

The GenL model defines the probability that some set Ck is the true choice set as a function of the expected maximum utility derived from the alternatives in the set. Hence, as alternative j becomes more attractive, all sets including j will have increased probability of being the true choice set; this increase in Vj will not impact all sets equally, however, since the number of and utility levels of other alternatives will influence how much impact this increase will have on any given subset of alternatives. This perspective permits an interesting behavioral interpretation of GenL, namely that decision makers make choices by eliminating from consideration all alternatives that do not meet some minimum utility threshold, then pick the highest utility alternative. Then, ceteris paribus, higher utility means higher chance of being considered, and therefore higher probability of being chosen. Thus, tastes play the fundamental role in choice set formation, as opposed, for example, to the role of constraints in eliminating alternatives. Constraint-based theories have generally been the assumed form of implementing Manski’s (1977) two-stage formulation, but the experimental and modeling work of Klein and Bither (1987), for example, shows that minimal utility thresholds are a viable basis for choice set formation.

An interesting behavioral corollary of the endogeneity of choice set probabilities is that the Q(Ck)’s change at every point in the attribute space. Thus, policies affecting the attributes of alternatives generate their impact in two stages, first by impacting choice set formation, then by impacting competition among alternatives within subsets of alternatives.
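The dependence of the choice set probabilities on the attributes can be illustrated with a small before-and-after comparison. As a sketch under assumptions, Q(Ck) is taken here to be proportional to exp(μ·Ik), where Ik is the inclusive value of set Ck with scale μk; the function name `set_probabilities`, the two-mode setting and the utility values are hypothetical.

```python
import math

def set_probabilities(V, sets, mu_k, mu=1.0):
    """ASSUMED form: Q(C_k) proportional to exp(mu * I_k),
    with I_k = (1/mu_k) * ln(sum over j in C_k of exp(mu_k * V_j))."""
    I = [math.log(sum(math.exp(m * V[j]) for j in C)) / m
         for C, m in zip(sets, mu_k)]
    denom = sum(math.exp(mu * x) for x in I)
    return [math.exp(mu * x) / denom for x in I]

sets = [("car",), ("bus",), ("car", "bus")]
mu_k = [1.5, 1.5, 1.5]

base = {"car": 0.0, "bus": 0.0}
improved = {"car": 0.0, "bus": 1.0}   # hypothetical policy that improves bus service

Q0 = set_probabilities(base, sets, mu_k)
Q1 = set_probabilities(improved, sets, mu_k)
for C, q0, q1 in zip(sets, Q0, Q1):
    print(C, round(q0, 3), round(q1, 3))
```

In this example both sets containing "bus" gain probability, though by different amounts, while the car-only set loses probability. The policy therefore acts first on which sets are likely to be considered, and only then on the competition within each set.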

Swait (2001) goes on to examine many properties of the GenL model, as well as to estimate its parameters for an intercity mode choice situation.

SUMMARY AND CONCLUSION

After a hiatus of some 20 years, it is encouraging to see a revival of interest in the GEV family of choice models, which has produced two of the most widely used discrete choice models, MNL and NMNL. Recent progress in simulation estimation techniques has allowed use of certain complex specifications for discrete choice modeling (e.g. MNP and random coefficients versions of GEV models, especially the MNL), some of which are more promising than others from a practical perspective. Much of the attraction of these more complex models has been circumventing certain properties of GEV models (e.g. IIA in the MNL) or capturing more complex cross-substitution behavior than allowed by others (e.g. NMNL). The literature has not always been completely frank, however, about the difficulties inherent to estimating these more complex models, so it is heartening to see the recent burst of renewed energy in more flexible forms of the GEV family (see Swait 2001 for discussion on GenL and references for other GEV developments), which has undeniable computational advantages compared to other model forms.

One feature of the GenL model that is particularly interesting is its flexibility with respect to testing alternative choice set space representations. In general, a master set M = {1,…,J} of alternatives has 2^J − 1 non-empty subsets; the model requires one scale parameter be estimated for each choice set. This is not likely to be practical for values of J much greater than 5 or 6. Hence, empirical applications are more likely to proceed with restricted representations of the choice set space. GenL is easily adapted for estimating any non-empty choice set generation process, simply by determining which subsets of M to exclude. In effect, it becomes possible to selectively model the choice set space itself using GenL.
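The growth in the number of scale parameters is easy to tabulate. The sketch below counts one scale parameter per choice set, as stated above, for three representations of the choice set space; the function names are hypothetical, and the captivity count assumes J ≥ 2.

```python
from math import comb

def n_full(J):
    """All non-empty subsets of a master set with J alternatives."""
    return 2**J - 1

def n_captivity(J):
    """J singletons plus the full set (assumes J >= 2)."""
    return J + 1

def n_max_size(J, L):
    """Non-empty subsets containing at most L alternatives."""
    return sum(comb(J, r) for r in range(1, L + 1))

print("J  full  captivity  size<=2")
for J in (3, 5, 6, 10):
    print(J, n_full(J), n_captivity(J), n_max_size(J, 2))
```

The unrestricted space grows exponentially in J, while the restricted representations grow linearly or polynomially, which is why they are the more likely candidates for empirical work.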

REFERENCES

Klein, N. and Bither, S. (1987) An Investigation of Utility-Directed Cutoff Selection, Journal of Consumer Research, 14, 240-256.

Manski, C. (1977) The Structure of Random Utility Models, Theory and Decision, 8, 229-254.

McFadden, D. (1978) Modeling the Choice of Residential Location, in Spatial Interaction Theory and Residential Location, A. Karlquist et al., editors, Amsterdam: North Holland, 75-96.

Swait, J. (2001) Choice Set Generation Within the Generalized Extreme Value Family of Discrete Choice Models, Transportation Research B, 35(7), 643-666.

Swait, J. and Ben-Akiva, M. (1986) An Analysis of the Effects of Captivity on Travel Time and Cost Elasticities, Annals of the 1985 International Conference on Travel Behavior, April 16-19, 1985, Noordwijk, Holland, 113-128.
