Algebraic Complexity in Statistics using Combinatorial and Tensor Methods
Within the framework of algebraic statistics, this work explores several statistical models, such as toric models, phylogenetic models, and variance components models, and focuses on the algebraic complexity problems that underlie them. We begin by studying toric ideals of hypergraphs, algebraic objects that are used for goodness-of-fit testing for log-linear models. In this study, we use the combinatorics of hypergraphs to give degree bounds on the generators of the ideals, give sufficient conditions for a binomial in the ideal to be indispensable, show that the ideal of the first tangential variety of n copies of the projective line is generated by quadratics and cubics in cumulant coordinates, and recover a well-known complexity theorem in algebraic statistics due to De Loera and Onn. Second, we explore phylogenetic models by viewing the models as sets of tensors with bounded rank. We show that the variety of 4×4×4 complex-valued tensors with border rank at most 4 is defined by polynomials of degree 5, 6, and 9. This variety corresponds to the 4-state general Markov model on the three-leaf claw tree, and its defining polynomials can be used in model selection. This result also gives further evidence that the phylogenetic ideal of this model can be generated by polynomials of degree 9 or less. Finally, we look at the algebraic complexity of maximum likelihood estimation for variance components models, where we give explicit formulas for the maximum likelihood and restricted maximum likelihood degrees of the random effects model for the one-way layout and give examples of multimodal likelihood surfaces.
maximum likelihood degree