Synthetic population generation by combining a hierarchical, simulation-based approach with reweighting by generalized raking
Transportation Research Record, 2015. journals.sagepub.com
A recent simulation-based approach for generating populations of synthetic individuals is extended to produce households of grouped individuals. The contingency tables of the generated populations match external controls at both the individual and household levels while exhibiting far greater variety in composition than existing approaches can offer. The method has two steps. The first is a procedure based on Gibbs sampling, which has only recently been applied to population generation in transportation modeling and is generically called Markov chain Monte Carlo (MCMC). For this work, the model was generalized and extended into hierarchical MCMC, which can generate a hierarchical structure. The second, a postprocessing step, uses generalized raking (GR) to reweight the output of hierarchical MCMC so that it exactly satisfies known marginal control totals at the individual and household levels. The application input data, a demographic sample and some known marginals from Singapore, added complexities that had not yet been explored in the literature. Despite these data challenges, applying the two methods consecutively produced realistic synthetic populations. Results confirm the goodness of fit of the populations and of their generated hierarchical structures.
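The two-step idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: it is single-level (individuals only, no households), it uses two made-up attributes with an invented joint distribution and invented control totals, and it replaces generalized raking with classic raking (iterative proportional fitting), which is one special case of it. The names `gibbs_sample`, `rake`, `joint`, `target_a`, and `target_b` are all hypothetical.

```python
import numpy as np


def gibbs_sample(cond_a_given_b, cond_b_given_a, n_draws, burn_in=200, seed=0):
    """Two-variable Gibbs sampler: alternately draw each attribute from its
    conditional distribution given the current value of the other."""
    rng = np.random.default_rng(seed)
    n_a, n_b = cond_a_given_b.shape[0], cond_b_given_a.shape[0]
    a, b = 0, 0  # arbitrary starting state
    draws = []
    for step in range(burn_in + n_draws):
        a = rng.choice(n_a, p=cond_a_given_b[:, b])  # a ~ P(a | b)
        b = rng.choice(n_b, p=cond_b_given_a[:, a])  # b ~ P(b | a)
        if step >= burn_in:  # discard the burn-in draws
            draws.append((a, b))
    return np.array(draws)


def rake(draws, targets, n_iter=50):
    """Classic raking: rescale one weight per synthetic individual so that
    the weighted marginal of each attribute matches its control total."""
    weights = np.ones(len(draws))
    for _ in range(n_iter):
        for col, target in enumerate(targets):
            current = np.bincount(draws[:, col], weights=weights,
                                  minlength=len(target))
            # Leave the factor at 1 for categories with no draws.
            factor = np.divide(target, current,
                               out=np.ones(len(target)), where=current > 0)
            weights *= factor[draws[:, col]]
    return weights


# Invented joint distribution over attribute a (2 levels) and b (3 levels),
# standing in for what a demographic sample would provide.
joint = np.array([[0.20, 0.25, 0.15],
                  [0.10, 0.25, 0.05]])
cond_a_given_b = joint / joint.sum(axis=0, keepdims=True)      # columns: P(a | b)
cond_b_given_a = (joint / joint.sum(axis=1, keepdims=True)).T  # columns: P(b | a)

# Step 1: simulate synthetic individuals.
draws = gibbs_sample(cond_a_given_b, cond_b_given_a, n_draws=2000)

# Step 2: reweight them to invented marginal control totals (both sum to 1000).
target_a = np.array([600.0, 400.0])
target_b = np.array([300.0, 500.0, 200.0])
weights = rake(draws, [target_a, target_b])
```

After raking, the weighted counts of each attribute reproduce the control totals up to numerical tolerance, while the simulated individuals supply the variety in attribute combinations. Extending the sketch to households would require a hierarchical sampler and controls on both levels, which are not attempted here.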
Sage Journals