This article discusses the connection between matrix models and algebraic geometry. In particular, it considers three specific applications of matrix models to algebraic geometry: the Kontsevich matrix model, which describes intersection indices on moduli spaces of curves with marked points; the Hermitian matrix model free energy at the leading expansion order, as the prepotential of the Seiberg-Witten-Whitham-Krichever hierarchy; and the higher orders of the free energy and resolvent expansions, as symplectic invariants and possibly amplitudes of open/closed strings. The article first describes the moduli space of algebraic curves and its parameterization via the Jenkins-Strebel differentials before analysing the relation between the so-called formal matrix models (solutions of the loop equation) and algebraic hierarchies of Dijkgraaf-Witten-Whitham-Krichever type. It also presents the WDVV (Witten-Dijkgraaf-Verlinde-Verlinde) equations, along with higher expansion terms and symplectic invariants.
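In one common normalization (conventions for signs and factors vary across references, so the formulas below should be checked against the text), Kontsevich's theorem states that a cubic matrix integral in an external matrix Λ generates the intersection numbers:

```latex
Z(\Lambda)
  = \frac{\displaystyle\int dH\,
      \exp\!\Big(\tfrac{i}{6}\operatorname{Tr}H^{3}
                 - \tfrac{1}{2}\operatorname{Tr}\Lambda H^{2}\Big)}
         {\displaystyle\int dH\,
      \exp\!\Big(-\tfrac{1}{2}\operatorname{Tr}\Lambda H^{2}\Big)},
\qquad
\log Z \;=\; \sum_{n\ge 1}\frac{1}{n!}
  \sum_{d_1,\dots,d_n\ge 0}
  \langle \tau_{d_1}\cdots\tau_{d_n}\rangle\,
  \prod_{i=1}^{n} t_{d_i},
```

where each time $t_k$ is proportional to $(2k-1)!!\,\operatorname{Tr}\Lambda^{-(2k+1)}$, and the intersection numbers $\langle \tau_{d_1}\cdots\tau_{d_n}\rangle_g = \int_{\overline{\mathcal M}_{g,n}} \psi_1^{d_1}\cdots\psi_n^{d_n}$ are nonzero only when $\sum_i d_i = 3g-3+n$; for example $\langle\tau_0^3\rangle_0 = 1$ and $\langle\tau_1\rangle_1 = 1/24$.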
Hedibert Lopes and Nicholas Polson
This article discusses the use of Bayesian multiscale spatio-temporal models for the analysis of economic data. It demonstrates the utility of a general modelling approach for multiscale analysis of spatio-temporal processes with areal data observations in an economic study of agricultural production in the Brazilian state of Espírito Santo during the period 1990–2005. The article first describes multiscale factorizations for spatial processes before presenting an exploratory multiscale data analysis and explaining the motivation for multiscale spatio-temporal models. It then examines the temporal evolution of the underlying latent multiscale coefficients and goes on to introduce a Bayesian analysis based on the multiscale decomposition of the likelihood function along with Markov chain Monte Carlo (MCMC) methods. The results from agricultural production analysis show that the spatio-temporal framework can effectively analyse massive economic data sets.
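One simple way to see what a multiscale factorization of areal data does is to split each coarse-level total into within-region shares. The toy sketch below (invented numbers, and deliberately much simpler than the chapter's factorization) shows that the coarse totals and the multiscale coefficients jointly reconstruct the fine-level data exactly:

```python
import numpy as np

# Toy areal data: 8 fine-scale areas nested in 2 coarse regions (4 each).
y_fine = np.array([3.0, 5.0, 2.0, 6.0, 10.0, 4.0, 7.0, 1.0])
groups = np.repeat([0, 1], 4)

# Coarse level: totals over each region.
y_coarse = np.array([y_fine[groups == g].sum() for g in range(2)])

# Multiscale coefficients: within-region shares (they sum to 1 per region).
shares = y_fine / y_coarse[groups]

# The factorization is exact: totals times shares recover the fine data.
y_rec = y_coarse[groups] * shares
```

In the full model, the coarse totals and the shares each receive their own (temporal) evolution, which is what makes the decomposition of the likelihood useful.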
James S. Clark, Dave Bell, Michael Dietze, Michelle Hersh, Ines Ibanez, Shannon LaDeau, Sean McMahon, Jessica Metcalf, Emily Moran, Luke Pangle, and Mike Wolosin
This article focuses on the use of Bayesian methods in assessing the probability of rare climate events, and more specifically the potential collapse of the meridional overturning circulation (MOC) in the Atlantic Ocean. It first provides an overview of climate models and their use to perform climate simulations, drawing attention to uncertainty in climate simulators and the role of data in climate prediction, before describing an experiment that simulates the evolution of the MOC through the twenty-first century. MOC collapse is predicted by the GENIE-1 (Grid Enabled Integrated Earth system model) for some values of the model inputs, and Bayesian emulation is used for collapse probability analysis. Data comprising a sparse time series of five measurements of the MOC from 1957 to 2004 are analysed. The results demonstrate the utility of Bayesian analysis in dealing with uncertainty in complex models, and in particular in quantifying the risk of extreme outcomes.
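The core computational idea, emulation, can be sketched with a made-up one-input stand-in for the simulator (GENIE-1 takes many inputs and is not reproduced here; the `simulator` function, kernel constants, and threshold below are all illustrative assumptions). A Gaussian process is fitted to a handful of runs, and the posterior mean and variance give a cheap surrogate for a collapse-probability calculation:

```python
import numpy as np
from math import erf, sqrt

# Hypothetical stand-in for an expensive climate simulator: MOC "strength"
# as a function of a single scaled input, with a kink mimicking collapse.
def simulator(x):
    return 20.0 - 5.0 * x - 30.0 * np.clip(x - 0.6, 0.0, None)

X = np.linspace(0.0, 1.0, 8)          # small design: few affordable runs
y = simulator(X)

SIG2, ELL = 100.0, 0.15               # assumed GP variance and length-scale

def k(a, b):
    return SIG2 * np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ELL ** 2)

K = k(X, X) + 1e-8 * np.eye(len(X))   # jitter for numerical stability
alpha = np.linalg.solve(K, y)

xs = np.linspace(0.0, 1.0, 101)
Ks = k(xs, X)
mean = Ks @ alpha                                  # GP posterior mean
v = np.linalg.solve(K, Ks.T)
var = np.clip(SIG2 - np.sum(Ks * v.T, axis=1), 1e-12, None)

# Probability that the emulated output falls below a threshold: a cheap
# surrogate for the collapse-probability analysis in the chapter.
thresh = 10.0
p_low = np.array([0.5 * (1 + erf((thresh - m) / sqrt(2 * s)))
                  for m, s in zip(mean, var)])
```

The key point the chapter exploits is that the emulator's uncertainty (here `var`) enters the probability calculation, so input values far from any simulator run are honestly treated as less certain.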
Antonia Tulino and Sergio Verdu
This article examines asymptotic singular value distributions in information theory, with particular emphasis on some of the main applications of random matrices to the capacity of communication channels. Results on the spectrum of random matrices have been adopted in information theory. Furthermore, information theorists, motivated by certain channel models, have obtained a number of new results in random matrix theory (RMT). Most of those results are related to the asymptotic distribution of the (squared) singular values of certain random matrices that model data communication channels. The article first provides an overview of three transforms that are useful in expressing the asymptotic spectrum results — Stieltjes transform, η-transform, and Shannon transform — before discussing the main results on the limit of the empirical distributions of the eigenvalues of various random matrices of interest in information theory.
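The first of these transforms is easy to illustrate numerically: for a Wigner matrix, the empirical Stieltjes transform converges to the closed-form expression for the semicircle law (the matrix size and evaluation point below are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
# Wigner (GOE-like) matrix, normalized so the eigenvalue distribution
# converges to the semicircle law on [-2, 2].
A = rng.standard_normal((n, n))
H = (A + A.T) / np.sqrt(2 * n)
eigs = np.linalg.eigvalsh(H)

# Empirical Stieltjes transform S(z) = (1/n) * sum_i 1 / (lambda_i - z).
z = 1.0 + 1.0j
S_emp = np.mean(1.0 / (eigs - z))

# Semicircle-law formula: S(z) = (-z + sqrt(z^2 - 4)) / 2, with the branch
# chosen so that S(z) ~ -1/z as |z| -> infinity.
S_th = (-z + np.sqrt(z * z - 4)) / 2
```

For channel matrices (rectangular, with the Marchenko–Pastur limit) the same transform machinery applies, with the η- and Shannon transforms obtained from it.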
Antonio Pievatolo and Fabrizio Ruggeri
This article discusses the results of a Bayes linear uncertainty analysis for oil reservoirs based on multiscale computer experiments. Using the Gullfaks oil and gas reservoir located in the North Sea as a case study, the article demonstrates the applicability of Bayes linear methods to address highly complex problems for which the full Bayesian analysis may be computationally intractable. A reservoir simulation model, run at two different levels of complexity, is used, and a simulator of a hydrocarbon reservoir represents properties of the reservoir on a three-dimensional grid. The article also describes a general formulation for the approach to uncertainty analysis for complex physical systems given a computer model for that system. Finally, it presents the results of simulations and forecasting for the Gullfaks reservoir.
Jonathan A. Cumming and Michael Goldstein
This article discusses the results of a study in Bayesian analysis and decision making in the maintenance and reliability of nuclear power plants. It demonstrates the use of Bayesian parametric and semiparametric methodology to analyse the failure times of components that belong to an auxiliary feedwater system in a nuclear power plant at the South Texas Project (STP) Electric Generation Station. The parametric models produce estimates of the hazard functions that are compared to the output from a mixture of Polya trees model. The statistical output is used as the most critical input in a stochastic optimization model which finds the optimal replacement time for a system that randomly fails over a finite horizon. The article first introduces the model for maintenance and reliability analysis before presenting the optimization results. It also examines the nuclear power plant data to be used in the Bayesian models.
Dani Gamerman, Tufi M. Soares, and Flávio Gonçalves
This article discusses the use of a Bayesian model that incorporates differential item functioning (DIF) in analysing whether cultural differences may affect the performance of students from different countries in the various test items which make up the OECD’s Programme for International Student Assessment (PISA) test of mathematics ability. The PISA tests in mathematics and other subjects are used to compare the educational attainment of fifteen-year-old students in different countries. The article first provides a background on PISA, DIF and item response theory (IRT) before describing a hierarchical three-parameter logistic model for the probability of a correct response on an individual item to determine the extent of DIF remaining in the mathematics test of 2003. The results of Bayesian analysis illustrate the importance of appropriately accounting for all sources of heterogeneity present in educational testing and highlight the advantages of the Bayesian paradigm when applied to large-scale educational assessment.
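The three-parameter logistic (3PL) response function at the heart of such models is compact enough to state directly. In the sketch below, the `dif` argument is a simplified, hypothetical way of shifting item difficulty by country; the chapter's hierarchical DIF model is considerably richer:

```python
import numpy as np

def p_correct(theta, a, b, c, dif=0.0):
    """Three-parameter logistic (3PL) item response function.

    theta   : student ability
    a, b, c : item discrimination, difficulty, and guessing parameters
    dif     : hypothetical country-specific difficulty shift (one simple
              way DIF is often parameterized; the chapter's model is richer)
    """
    return c + (1.0 - c) / (1.0 + np.exp(-a * (theta - (b + dif))))
```

At `theta == b` (with no DIF) the probability is exactly `c + (1 - c) / 2`, and a positive `dif` makes the item harder for the affected group, which is the kind of residual effect the analysis quantifies.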
Bayesian approaches to aspects of the Vioxx trials: Non-ignorable dropout and sequential meta-analysis
Jerry Cheng and David Madigan
This article discusses Bayesian approaches to aspects of the Vioxx trials, with a focus on non-ignorable dropout and sequential meta-analysis. It first provides a background on Vioxx, a COX-2 selective, non-steroidal anti-inflammatory drug (NSAID) approved by the FDA in May 1999 for the relief of the signs and symptoms of osteoarthritis, the management of acute pain in adults, and the treatment of menstrual symptoms. However, Vioxx was found to cause an array of cardiovascular side-effects such as myocardial infarction, stroke, and unstable angina. As a result, Vioxx was withdrawn from the market. The article describes an approach to sequential meta-analysis in the context of Vioxx before considering dropouts in the key APPROVe study. It also presents a Bayesian approach to handling dropout and showcases the utility of Bayesian analysis in addressing multiple, challenging statistical issues and questions arising from clinical trials.
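The Bayesian mechanics of sequential meta-analysis can be sketched with a normal approximation: a prior on the common effect is updated trial by trial, and the running posterior can be monitored as each study reports. The estimates and standard errors below are invented numbers, not the Vioxx data, and a fixed-effect normal model is only the simplest version of what the chapter develops:

```python
# Hypothetical per-trial estimates of a log relative risk with standard
# errors (made-up numbers, NOT the Vioxx data).
estimates = [0.10, 0.35, 0.22, 0.40]
std_errs  = [0.30, 0.25, 0.28, 0.20]

# Normal prior on the common effect; update sequentially, trial by trial.
mu, tau2 = 0.0, 1.0  # prior mean and variance
history = []
for y, se in zip(estimates, std_errs):
    prec = 1.0 / tau2 + 1.0 / se ** 2
    mu = (mu / tau2 + y / se ** 2) / prec
    tau2 = 1.0 / prec
    history.append((mu, tau2))
```

A useful property: the sequential posterior after all trials coincides exactly with the one-shot precision-weighted combination, so "looking" at interim results does not distort the Bayesian answer in the way it complicates frequentist sequential testing.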
Bayesian causal inference: Approaches to estimating the effect of treating hospital type on cancer survival in Sweden using principal stratification
Donald Rubin, Xiaoqin Wang, Li Yin, and Elizabeth Zell
This article discusses the use of Bayesian causal inference, and more specifically the posterior predictive approach of Rubin’s causal model (RCM) and methods of principal stratification, in estimating the effects of ‘treating hospital type’ on cancer survival. Using the Karolinska Institute in Stockholm, Sweden, as a case study, the article investigates which type of hospital (large patient volume vs. small volume) is superior for treating certain serious conditions. The study examines which factors may reasonably be considered ignorable in the context of covariates available, as well as non-compliance complications due to transfers between hospital types for treatment. The article first provides an overview of the general Bayesian approach to causal inference, primarily with ignorable treatment assignment, before introducing the proposed approach and motivating it using simple method-of-moments summary statistics. Finally, the results of simulation using Markov chain Monte Carlo (MCMC) methods are presented.
Peter Green, Kanti Mardia, Vysaul Nyirongo, and Yann Ruffieux
This article describes Bayesian modelling for matching and alignment of biomolecules. One particular task where statistical modelling and inference can be useful in scientific understanding of protein structure is that of matching and alignment of two or more proteins. In this regard, statistical shape analysis potentially has something to offer in solving biomolecule matching and alignment problems. The article discusses the use of Bayesian methods for shape analysis to assist with understanding the three-dimensional structure of protein molecules, with a focus on the problem of matching instances of the same structure in the CoMFA (Comparative Molecular Field Analysis) database of steroid molecules. It introduces a Bayesian hierarchical model for pairwise matching and for alignment of multiple configurations before concluding with an overview of some advantages of the Bayesian approach to problems in protein bioinformatics, along with modelling and computation issues, alternative approaches, and directions for future research.
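As a non-Bayesian point of comparison for the alignment step, the classical least-squares (Procrustes) rotation between two centred point configurations has a closed form via the SVD. A minimal sketch, with a synthetic configuration rather than real molecular coordinates:

```python
import numpy as np

def procrustes_rotation(X, Y):
    """Rotation R (right multiplication) such that Y @ R best matches X
    in least squares; both k x 3 configurations are assumed centred."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    return Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

rng = np.random.default_rng(1)
X = rng.standard_normal((10, 3))
X -= X.mean(axis=0)

# Rotate X by a known rotation about the z-axis, then recover it.
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
Y = X @ R_true.T
R_hat = procrustes_rotation(X, Y)
```

The Bayesian hierarchical model in the chapter goes well beyond this: it treats the correspondence (which atom matches which) as unknown and delivers uncertainty about both the matching and the transformation, which the closed-form solution cannot.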
Marco Ferreira, Adelmo Bertoldey, and Scott Holan
This article discusses the results of a study in Bayesian reliability analysis concerning train door failures in a European underground system over a period of nine years. It examines failure data from forty underground trains, which were delivered to a European transportation company between November 1989 and March 1991. All of the trains were put into service from 20 March 1990 to 20 July 1992. Failure monitoring ended on 31 December 1998. The goal of the study was to find models able to assess the failure history and to predict the number of failures in future time intervals in order to help the company determine the reliability level of the train doors before warranty expiration. The article describes the development and application of a novel bivariate Poisson process as a natural way to extend the usual Poisson models for analysing the occurrence of failures in repairable systems.
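The "usual Poisson model" that the chapter's bivariate process extends is worth seeing in its conjugate Bayesian form. A single-process sketch with invented numbers (the actual data and the bivariate construction are not reproduced here):

```python
# Conjugate Gamma(a, b) prior on the failure rate lam of a homogeneous
# Poisson process; after observing n_fail failures in t_obs years the
# posterior is Gamma(a + n_fail, b + t_obs).
a, b = 1.0, 1.0          # hypothetical prior hyperparameters
n_fail, t_obs = 12, 8.0  # made-up failure history for one door system

a_post = a + n_fail
b_post = b + t_obs

# Posterior mean failure rate, and expected failures in the next t_new years.
rate_mean = a_post / b_post
t_new = 2.0
expected_future = rate_mean * t_new

# The predictive distribution of the future count is negative binomial
# with r = a_post successes and p = b_post / (b_post + t_new).
p = b_post / (b_post + t_new)
```

Prediction of the number of failures in a future window, as used for the warranty question, then comes from this negative binomial predictive rather than from plugging in a point estimate of the rate.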
A. Taylan Cemgil, Simon Godsill, Paul Peeling, and Nick Whiteley
This article focuses on the use of Bayesian statistical methods in audio and music processing in the context of an application to multipitch audio and determining a musical ‘score’ representation that includes pitch and time duration summary for a musical extract (the so-called ‘piano-roll’ representation of music). It first provides an overview of mainstream applications of audio signal processing, the properties of musical audio, superposition and how to address it using the Bayesian approach, and the principal challenges facing audio processing. It then considers the fundamental audio processing tasks before discussing a range of Bayesian hierarchical models involving both time and frequency domain dynamic models. It shows that Bayesian analysis is applicable in audio signal processing in real environments where acoustical conditions and sound sources are highly variable, yet audio signals possess strong statistical structure.
This article deals with beta ensembles. Classical random matrix ensembles contain a parameter β, taking on the values 1, 2, and 4. This parameter, which relates to the underlying symmetry, appears as an eigenvalue repulsion proportional to s^β between neighbouring eigenvalues at small spacing s. β may be regarded as a continuous positive parameter on the basis of different viewpoints of the eigenvalue probability density function for the classical random matrix ensembles: as the Boltzmann factor for a log-gas, or as the squared ground-state wave function of a quantum many-body system. The article first considers log-gas systems before discussing the Fokker-Planck equation and the Calogero-Sutherland system. It then describes the random matrix realization of the β-generalization of the circular ensemble and concludes with an analysis of stochastic differential equations resulting from the case of the bulk scaling limit of the β-generalization of the Gaussian ensemble.
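For general β > 0 the Gaussian β-ensemble admits an explicit tridiagonal random matrix realization (Dumitriu–Edelman): normal diagonal entries and χ-distributed off-diagonal entries with linearly decreasing degrees of freedom. A sketch, in the normalization where the joint eigenvalue density is proportional to ∏|λ_i − λ_j|^β · exp(−Σλ_i²/2):

```python
import numpy as np

def beta_hermite_eigs(n, beta, rng):
    """Eigenvalues of the tridiagonal beta-Hermite ensemble
    (Dumitriu-Edelman construction), valid for any beta > 0."""
    diag = rng.standard_normal(n) * np.sqrt(2)
    # Off-diagonals: chi_{beta*(n-1)}, chi_{beta*(n-2)}, ..., chi_{beta}.
    off = np.sqrt(rng.chisquare(beta * np.arange(n - 1, 0, -1)))
    T = (np.diag(diag) + np.diag(off, 1) + np.diag(off, -1)) / np.sqrt(2)
    return np.linalg.eigvalsh(T)

rng = np.random.default_rng(2)
eigs = beta_hermite_eigs(400, 2.0, rng)
```

In this normalization the mean of Σλ_i² is n + βn(n−1)/2, so the average of λ² is close to βn/2 for large n, a quick sanity check on the sampler.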
This article considers the so-called loop equations satisfied by integrals over random matrices coupled in a chain as well as their recursive solution in the perturbative case when the matrices are Hermitian. Random matrices are used in fields such as the study of multi-orthogonal polynomials or the enumeration of discrete surfaces, both of which are based on the analysis of a matrix integral. However, this term can be confusing since the definition of a matrix integral in these two applications is not the same. The article discusses these two definitions, perturbative and non-perturbative, along with their relationship. It first provides an overview of a matrix integral before comparing convergent and formal matrix integrals. It then describes the loop equations and their solution in the one-matrix model. It also examines matrices coupled in a chain plus external field and concludes with a generalization of the topological recursion.
Édouard Brézin and Shinobu Hikami
This article considers characteristic polynomials and reviews a few useful results obtained in simple Gaussian models of random Hermitian matrices in the presence of an external matrix source. It first considers the products and ratio of characteristic polynomials before discussing the duality theorems for two different characteristic polynomials of Gaussian weights with external sources. It then describes the m-point correlation functions of the eigenvalues in the Gaussian unitary ensemble and how they are deduced from their Fourier transforms U(s1, … , sm). It also analyses the relation of the correlation function of the characteristic polynomials to the standard n-point correlation function using the replica and supersymmetric methods. Finally, it shows how the topological invariants of Riemann surfaces, such as the intersection numbers of the moduli space of curves, may be derived from averaged characteristic polynomials.
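A small numerical illustration of the kind of identity the chapter builds on: for the GUE without an external source, normalized so the matrix density is proportional to exp(−Tr H²/2), the averaged characteristic polynomial E[det(xI − H)] equals the monic (probabilists') Hermite polynomial He_n(x). The matrix-source generalizations in the chapter are far richer; this is only the base case:

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, x = 4, 50000, 1.0

# GUE sampling: real N(0,1) diagonal, complex off-diagonal entries with
# variance 1, i.e. density proportional to exp(-Tr H^2 / 2).
A = rng.standard_normal((m, n, n)) + 1j * rng.standard_normal((m, n, n))
H = (A + np.conj(np.transpose(A, (0, 2, 1)))) / 2

# Monte Carlo average of the characteristic polynomial at x.
avg = np.linalg.det(x * np.eye(n) - H).real.mean()

# He_4(x) = x**4 - 6*x**2 + 3, so the exact value at x = 1 is -2.
```

That averaging a highly fluctuating determinant yields an orthogonal polynomial is exactly what makes characteristic polynomials such a convenient probe of eigenvalue correlations.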
Carlos Carvalho and Jill Rickershauser
This article focuses on the use of Bayesian hierarchical models for integration and comparison of predictions from multiple models and groups, and more specifically for characterizing the uncertainty of climate change projections. It begins with a discussion of the current state and future scenarios concerning climate change and human influences, as well as various models used in climate simulations and the goals and challenges of analysing ensembles of opportunity. It then introduces a suite of statistical models that incorporate output from an ensemble of climate models, referred to as general circulation models (GCMs), with the aim of reconciling different future projections of climate change while characterizing their uncertainty in a rigorous fashion. Posterior distributions of future temperature and/or precipitation changes at regional scales are obtained, accounting for many peculiar data characteristics. The article confirms the reasonableness of the Bayesian modelling assumptions for uncertainty analysis of climate change projections.
Carlo W. J. Beenakker
This article focuses on applications of random matrix theory (RMT) to both classical optics and quantum optics, with emphasis on optical systems such as disordered wave guides and chaotic resonators. The discussion centres on topics that do not have an immediate analogue in electronics, either because they cannot readily be measured in the solid state or because they involve aspects (such as absorption, amplification, or bosonic statistics) that do not apply to electrons. The article first considers applications of RMT to classical optics, including optical speckle and coherent backscattering, reflection from an absorbing random medium, long-range wave function correlations in an open resonator, and direct detection of open transmission channels. It then discusses applications to quantum optics, namely: the statistics of grey-body radiation, lasing in a chaotic cavity, and the effect of absorption on the reflection eigenvalue statistics in a multimode wave guide.
Amparo Baillo, Antonio Cuevas, and Ricardo Fraiman
This article reviews the literature concerning supervised and unsupervised classification of functional data. It first explains the meaning of unsupervised classification vs. supervised classification before discussing the supervised classification problem in the infinite-dimensional case, showing that its formal statement generally coincides with that of discriminant analysis in the classical multivariate case. It then considers the optimal classifier and plug-in rules, empirical risk and empirical minimization rules, linear discrimination rules, the k-nearest neighbour (k-NN) method, and kernel rules. It also describes classification based on partial least squares, classification based on reproducing kernels, and depth-based classification. Finally, it examines unsupervised classification methods, focusing on K-means for functional data, K-means for data in a Hilbert space, and impartial trimmed K-means for functional data. Some practical issues, in particular real-data examples and simulations, are reviewed and some selected proofs are given.
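The k-NN rule for functional data reduces, in practice, to a nearest-neighbour vote under a discretized L² distance between curves. A self-contained toy sketch (synthetic phase-shifted sinusoids standing in for real functional samples):

```python
import numpy as np

rng = np.random.default_rng(4)
t = np.linspace(0.0, 1.0, 50)   # common observation grid
dt = t[1] - t[0]

# Two hypothetical classes of noisy curves: phase-shifted sinusoids.
def make_curves(m, label):
    base = np.sin(2 * np.pi * t + (0.0 if label == 0 else 0.8))
    return base + 0.3 * rng.standard_normal((m, t.size))

X_train = np.vstack([make_curves(30, 0), make_curves(30, 1)])
y_train = np.array([0] * 30 + [1] * 30)
X_test = np.vstack([make_curves(20, 0), make_curves(20, 1)])
y_test = np.array([0] * 20 + [1] * 20)

def knn_predict(x, k=5):
    # Discretized L2 distance between curves on the common grid.
    d2 = ((X_train - x) ** 2).sum(axis=1) * dt
    nearest = np.argsort(d2)[:k]
    return int(y_train[nearest].mean() > 0.5)

acc = np.mean([knn_predict(x) == y for x, y in zip(X_test, y_test)])
```

The infinite-dimensional subtleties surveyed in the article (choice of metric or semi-metric, consistency of k-NN in function spaces) all hide inside the innocuous-looking distance computation.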
Samantha Low Choy, Justine Murray, Allan James, and Kerrie Mengersen
This article discusses an approach that combines monitoring data and computer model outputs for environmental exposure assessment. It describes the application of Bayesian data fusion methods using spatial Gaussian process models in studies of weekly wet deposition data for 2001 from 120 sites monitored by the US National Atmospheric Deposition Program (NADP) in the eastern United States. The article first provides an overview of environmental computer models, with a focus on the CMAQ (Community Multiscale Air Quality) Eta forecast model, before considering some algorithmic and pseudo-statistical approaches in weather prediction. It then reviews current state-of-the-art fusion methods for environmental data analysis and introduces a non-dynamic downscaling approach. The static version of the dynamic spatial model is used to analyse the NADP weekly wet deposition data.
Dave Higdon, Katrin Heitmann, Charles Nakhleh, and Salman Habib
This article focuses on the use of a Bayesian approach that combines simulations and physical observations to estimate cosmological parameters. It begins with an overview of the Λ-cold dark matter (ΛCDM) model, the simplest cosmological model in agreement with the cosmic microwave background (CMB) and large-scale structure analysis. The ΛCDM model is determined by a small number of parameters which control the composition, expansion and fluctuations of the universe. The present study aims to learn about the values of these parameters using measurements from the Sloan Digital Sky Survey (SDSS). Computationally intensive simulation results are combined with measurements from the SDSS to make inferences about a subset of the parameters that control the ΛCDM model. The article also describes a statistical framework used to determine a posterior distribution for these cosmological parameters and concludes by showing how it can be extended to include data from diverse data sources.