References
Kolmogorov, A. N. 1938. “On the Analytic Methods of Probability
Theory.” Rossiiskaya Akademiya Nauk, no. 5:
5–41.
Acemoglu, Daron, and Pascual Restrepo. 2018. “Artificial
Intelligence, Automation and Work.” National Bureau of Economic
Research.
Actor, Jonas. 2018. “Computation for the Kolmogorov
Superposition Theorem.” MS thesis, Rice University.
Albert, Jim. 1993. “A Statistical Analysis of
Hitting Streaks in Baseball:
Comment.” Journal of the American Statistical
Association 88 (424): 1184–88. https://www.jstor.org/stable/2291255.
Altić, Mirela Slukan. 2013. “Exploring Along the Rome Meridian:
Roger Boscovich and the First Modern Map of the Papal
States.” In History of Cartography:
International Symposium of the ICA, 2012,
71–89. Springer.
Amazon. 2021. “The History of Amazon’s Forecasting
Algorithm.” Amazon Science.
https://www.amazon.science/latest-news/the-history-of-amazons-forecasting-algorithm.
Amit, Yali, Gilles Blanchard, and Kenneth Wilder. 2000. “Multiple
Randomized Classifiers: MRCL.”
Andrews, D. F., and C. L. Mallows. 1974. “Scale
Mixtures of Normal Distributions.”
Journal of the Royal Statistical Society. Series B
(Methodological) 36 (1): 99–102. https://www.jstor.org/stable/2984774.
Arnol’d, Vladimir I. 2006. “Forgotten and Neglected Theories of
Poincaré.” Russian Mathematical
Surveys 61 (1): 1.
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. 2014. “Neural
Machine Translation by Jointly Learning to
Align and Translate.” arXiv. https://arxiv.org/abs/1409.0473.
Baum, Leonard E., Ted Petrie, George Soules, and Norman Weiss. 1970.
“A Maximization Technique Occurring in the
Statistical Analysis of Probabilistic
Functions of Markov Chains.” The Annals
of Mathematical Statistics 41 (1): 164–71. https://www.jstor.org/stable/2239727.
Baylor, Denis, Eric Breck, Heng-Tze Cheng, Noah Fiedel, Chuan Yu Foo,
Zakaria Haque, Salem Haykal, et al. 2017. “TFX: A
TensorFlow-Based Production-Scale Machine Learning Platform.” In
Proceedings of the 23rd ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining, 1387–95. ACM.
Behnia, Farnaz, Dominik Karbowski, and Vadim Sokolov. 2021. “Deep
Generative Models for Vehicle Speed Trajectories.” arXiv
Preprint arXiv:2112.08361. https://arxiv.org/abs/2112.08361.
Benoit, Dries F., and Dirk Van den Poel. 2012. “Binary Quantile
Regression: A Bayesian Approach Based on the Asymmetric
Laplace Distribution.” Journal of Applied
Econometrics 27 (7): 1174–88.
Berge, Travis, Nitish Sinha, and Michael Smolyansky. 2016. “Which
Market Indicators Best Forecast Recessions?”
FEDS Notes, August.
Bertsimas, Dimitris, Angela King, and Rahul Mazumder. 2016. “Best
Subset Selection via a Modern Optimization Lens.” The Annals
of Statistics 44 (2): 813–52.
Bhadra, Anindya, Jyotishka Datta, Nick Polson, Vadim Sokolov, and
Jianeng Xu. 2021. “Merging Two Cultures: Deep and Statistical
Learning.” arXiv Preprint arXiv:2110.11561. https://arxiv.org/abs/2110.11561.
Bojarski, Mariusz, Davide Del Testa, Daniel Dworakowski, Bernhard
Firner, Beat Flepp, Prasoon Goyal, Lawrence D Jackel, et al. 2016.
“End to End Learning for Self-Driving Cars.” arXiv
Preprint arXiv:1604.07316. https://arxiv.org/abs/1604.07316.
Bonfiglio, Rita, Annarita Granaglia, Raffaella Giocondo, Manuel Scimeca,
and Elena Bonanno. 2021. “Molecular Aspects and Prognostic
Significance of Microcalcifications in Human Pathology: A
Narrative Review.” International Journal of Molecular
Sciences 22 (120).
Bottou, Léon, Frank E Curtis, and Jorge Nocedal. 2018.
“Optimization Methods for Large-Scale Machine Learning.”
SIAM Review 60 (2): 223–311.
Box, George E. P., and George C. Tiao. 1992. Bayesian
Inference in Statistical Analysis. New
York: Wiley-Interscience.
Brillinger, David R. 2012. “A Generalized Linear Model
With ‘Gaussian’ Regressor
Variables.” In Selected Works of
David Brillinger, edited by Peter Guttorp and David
Brillinger, 589–606. Selected Works in
Probability and Statistics. New York, NY:
Springer.
Bryson, Arthur E. 1961. “A Gradient Method for Optimizing
Multi-Stage Allocation Processes.” In Proc. Harvard
Univ. Symposium on Digital Computers and Their
Applications. Vol. 72.
Campagnoli, Patrizia, Sonia Petrone, and Giovanni Petris. 2009.
Dynamic Linear Models with R. New
York, NY: Springer.
Cannon, Alex J. 2018. “Non-Crossing Nonlinear Regression Quantiles
by Monotone Composite Quantile Regression Neural Network, with
Application to Rainfall Extremes.” Stochastic Environmental
Research and Risk Assessment 32 (11): 3207–25.
Carreira-Perpinán, Miguel A, and Weiran Wang. 2014. “Distributed
Optimization of Deeply Nested Systems.” In
AISTATS, 10–19.
Carvalho, Carlos M, Hedibert F Lopes, Nicholas G Polson, and Matt A
Taddy. 2010. “Particle Learning for General Mixtures.”
Bayesian Analysis 5 (4): 709–40.
Carvalho, Carlos M., Nicholas G. Polson, and James G. Scott. 2010.
“The Horseshoe Estimator for Sparse Signals.”
Biometrika 97 (2): 465–80.
Chernozhukov, Victor, Iván Fernández-Val, and Alfred Galichon. 2010.
“Quantile and Probability Curves Without
Crossing.” Econometrica 78 (3): 1093–1125. https://www.jstor.org/stable/40664520.
Chib, Siddhartha. 1998. “Estimation and Comparison of Multiple
Change-Point Models.” Journal of Econometrics 86 (2):
221–41.
Cook, R. Dennis. 2007. “Fisher Lecture: Dimension
Reduction in Regression.” Statistical Science, 1–26. https://www.jstor.org/stable/27645799.
Cootner, Paul H. 1967. The Random Character of Stock Market
Prices. MIT Press.
Coppejans, Mark. 2004. “On Kolmogorov’s
Representation of Functions of Several Variables by Functions of One
Variable.” Journal of Econometrics 123 (1): 1–31.
Dabney, Will, Georg Ostrovski, David Silver, and Rémi Munos. 2018.
“Implicit Quantile Networks for Distributional
Reinforcement Learning.” arXiv. https://arxiv.org/abs/1806.06923.
Dabney, Will, Mark Rowland, Marc G. Bellemare, and Rémi Munos. 2017.
“Distributional Reinforcement Learning with
Quantile Regression.” arXiv. https://arxiv.org/abs/1710.10044.
Davison, Anthony Christopher. 2003. Statistical Models. Vol.
11. Cambridge University Press.
Dean, Jeffrey, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark
Mao, Andrew Senior, et al. 2012. “Large Scale Distributed Deep
Networks.” In Advances in Neural Information Processing
Systems, 1223–31.
Demb, Robert, and David Sprecher. 2021. “A Note on Computing with
Kolmogorov Superpositions Without Iterations.”
Neural Networks 144 (December): 438–42.
Devroye, Luc. 1986. Non-Uniform Random Variate Generation.
Springer Science & Business Media.
Diaconis, Persi, and David Freedman. 1987. “A Dozen de Finetti-style Results in Search of a
Theory.” In Annales de l’IHP
Probabilités et Statistiques, 23:397–423.
Diaconis, Persi, and Mehrdad Shahshahani. 1981. “Generating a
Random Permutation with Random Transpositions.” Probability
Theory and Related Fields 57 (2): 159–79.
———. 1984. “On Nonlinear Functions of Linear Combinations.”
SIAM Journal on Scientific and Statistical Computing 5 (1):
175–91.
Diaconis, P., and D. Ylvisaker. 1983. “Quantifying Prior
Opinion.”
Dixon, Mark J., and Stuart G. Coles. 1997. “Modelling
Association Football Scores and Inefficiencies
in the Football Betting Market.” Journal of the
Royal Statistical Society Series C: Applied Statistics 46 (2):
265–80.
Dixon, Matthew F, Nicholas G Polson, and Vadim O Sokolov. 2019.
“Deep Learning for Spatio-Temporal Modeling: Dynamic Traffic Flows
and High Frequency Trading.” Applied Stochastic Models in
Business and Industry 35 (3): 788–807.
Dreyfus, Stuart. 1962. “The Numerical Solution of Variational
Problems.” Journal of Mathematical Analysis and
Applications 5 (1): 30–45.
———. 1973. “The Computational Solution of Optimal Control Problems
with Time Lag.” IEEE Transactions on Automatic Control
18 (4): 383–85.
Efron, Bradley, and Carl Morris. 1977. “Stein’s Paradox in
Statistics.” Scientific American 236 (5): 119–27.
Enikolopov, Ruben, Vasily Korovkin, Maria Petrova, Konstantin Sonin, and
Alexei Zakharov. 2013. “Field Experiment Estimate of Electoral
Fraud in Russian Parliamentary Elections.”
Proceedings of the National Academy of Sciences 110 (2):
448–52.
Tassone, Eric, and Farzan Rohani. 2017. “Our Quest for Robust Time
Series Forecasting at Scale.”
Feller, William. 1971. An Introduction to Probability Theory and Its
Applications. Wiley.
Feynman, Richard. n.d. “Rules of Chess.”
Frank, Ildiko E., and Jerome H. Friedman. 1993. “A
Statistical View of Some Chemometrics Regression
Tools.” Technometrics 35 (2): 109–35. https://www.jstor.org/stable/1269656.
Fredholm, Ivar. 1903. “Sur une classe d’équations
fonctionnelles.” Acta Mathematica 27: 365–90.
Friedman, Jerome H., and Werner Stuetzle. 1981. “Projection
Pursuit Regression.” Journal of the American
Statistical Association 76 (376): 817–23.
Frühwirth-Schnatter, Sylvia, and Rudolf Frühwirth. 2007.
“Auxiliary Mixture Sampling with Applications to Logistic
Models.” Computational Statistics & Data Analysis 51
(April): 3509–28.
———. 2010. “Data Augmentation and MCMC
for Binary and Multinomial Logit
Models.” In Statistical Modelling and
Regression Structures: Festschrift in
Honour of Ludwig Fahrmeir, 111–32.
Frühwirth-Schnatter, Sylvia, Rudolf Frühwirth, Leonhard Held, and Håvard
Rue. 2008. “Improved Auxiliary Mixture Sampling for Hierarchical
Models of Non-Gaussian Data.” Statistics and
Computing 19 (4): 479.
Gan, Link, and Alan Fritzler. 2016. “How to Become an
Executive.”
García-Arenzana, Nicolás, Eva María Navarrete-Muñoz, Virginia Lope,
Pilar Moreo, Carmen Vidal, Soledad Laso-Pablos, Nieves Ascunce, et al.
2014. “Calorie
Intake, Olive Oil Consumption and Mammographic Density Among
Spanish Women.” International Journal of
Cancer 134 (8): 1916–25.
George, Edward I., and Robert E. McCulloch. 1993. “Variable
Selection via Gibbs Sampling.”
Journal of the American Statistical Association 88 (423):
881–89.
Gramacy, Robert B., and Nicholas G. Polson. 2012.
“Simulation-Based Regularized Logistic
Regression.” arXiv. https://arxiv.org/abs/1005.3430.
Griewank, Andreas, Kshitij Kulshreshtha, and Andrea Walther. 2012.
“On the Numerical Stability of Algorithmic
Differentiation.” Computing. Archives for Scientific
Computing 94 (2-4): 125–49.
Hahn, P. Richard, Jared S. Murray, and Carlos M. Carvalho. 2020.
“Bayesian Regression Tree Models for Causal
Inference: Regularization, Confounding,
and Heterogeneous Effects (with
Discussion).” Bayesian Analysis 15 (3):
965–1056.
Halevy, Alon, Peter Norvig, and Fernando Pereira. 2009. “The
Unreasonable Effectiveness of Data.” IEEE Intelligent
Systems 24 (2): 8–12.
Hardt, Moritz, Ben Recht, and Yoram Singer. 2016. “Train Faster,
Generalize Better: Stability of Stochastic Gradient
Descent.” In International Conference on Machine
Learning, 1225–34. PMLR.
Held, Leonhard, and Chris C. Holmes. 2006. “Bayesian Auxiliary
Variable Models for Binary and Multinomial Regression.”
Bayesian Analysis 1 (1): 145–68.
Hermann, Jeremy, and Mike Del Balso. 2017. “Meet Michelangelo:
Uber’s Machine Learning Platform.”
Huang, Jian, Joel L. Horowitz, and Shuangge Ma. 2008. “Asymptotic
Properties of Bridge Estimators in Sparse High-Dimensional Regression
Models.” The Annals of Statistics 36 (2): 587–613.
Hyndman, Rob J., and George Athanasopoulos. 2021. Forecasting:
Principles and Practice. 3rd ed.
Melbourne, Australia: OTexts.
Igelnik, B., and N. Parikh. 2003. “Kolmogorov’s Spline
Network.” IEEE Transactions on Neural Networks 14 (4):
725–33.
Indeed. 2018. “Jobs of the Future: Emerging Trends in
Artificial Intelligence.”
Irwin, Neil. 2016. “How to Become a
C.E.O.? The Quickest Path
Is a Winding One.” The New York
Times, September.
Iwata, Shigeru. 2001. “Recentered and Rescaled Instrumental
Variable Estimation of Tobit and Probit
Models with Errors in
Variables.” Econometric Reviews 20 (3):
319–35.
Januschowski, Tim, Yuyang Wang, Kari Torkkola, Timo Erkkilä, Hilaf
Hasson, and Jan Gasthaus. 2022. “Forecasting with Trees.”
International Journal of Forecasting, Special
Issue: M5 competition, 38 (4): 1473–81.
Jeffreys, Harold. 1998. Theory of Probability.
3rd ed. Oxford Classic Texts in the
Physical Sciences. Oxford, New York: Oxford University
Press.
Kaggle. 2020. “M5 Forecasting -
Accuracy.”
https://kaggle.com/competitions/m5-forecasting-accuracy.
Kallenberg, Olav. 1997. Foundations of Modern
Probability. 2nd ed. Springer.
Keskar, Nitish Shirish, Dheevatsa Mudigere, Jorge Nocedal, Mikhail
Smelyanskiy, and Ping Tak Peter Tang. 2016. “On Large-Batch
Training for Deep Learning: Generalization Gap and Sharp
Minima.” arXiv Preprint arXiv:1609.04836. https://arxiv.org/abs/1609.04836.
Keynes, John Maynard. 1921. A Treatise on Probability.
Macmillan.
Kingma, Diederik, and Jimmy Ba. 2014. “Adam: A Method
for Stochastic Optimization.” arXiv Preprint
arXiv:1412.6980. https://arxiv.org/abs/1412.6980.
Klartag, Bo’az. 2007. “A Central Limit Theorem for Convex
Sets.” Inventiones Mathematicae 168 (1): 91–131.
Kolmogoroff, Andrei. 1931. “Über die analytischen
Methoden in der
Wahrscheinlichkeitsrechnung.” Mathematische
Annalen 104 (1): 415–58.
Kolmogorov, A. N. 1942. “Definition of Center of Dispersion and
Measure of Accuracy from a Finite Number of Observations (in
Russian).” Izv. Akad. Nauk SSSR Ser. Mat.
6: 3–32.
———. 1956. “On the Representation of Continuous Functions of
Several Variables as Superpositions of Functions of Smaller Number of
Variables.” In Soviet. Math.
Dokl, 108:179–82.
Kreps, David. 1988. Notes on the Theory of Choice.
Boulder: Westview Press.
Levina, Elizaveta, and Peter Bickel. 2001. “The Earth Mover’s
Distance Is the Mallows Distance: Some Insights from
Statistics.” In Proceedings Eighth IEEE
International Conference on Computer Vision. ICCV
2001, 2:251–56. IEEE.
Lin, Zhouhan, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing
Xiang, Bowen Zhou, and Yoshua Bengio. 2017. “A Structured Self-attentive Sentence
Embedding.” arXiv. https://arxiv.org/abs/1703.03130.
Lindgren, Georg. 1978. “Markov Regime Models for
Mixed Distributions and Switching
Regressions.” Scandinavian Journal of Statistics
5 (2): 81–91. https://www.jstor.org/stable/4615692.
Linnainmaa, Seppo. 1970. “The Representation of the Cumulative
Rounding Error of an Algorithm as a Taylor Expansion of the
Local Rounding Errors.” Master’s Thesis (in Finnish), Univ.
Helsinki, 6–7.
Logan, John A. 1983. “A Multivariate Model for Mobility
Tables.” American Journal of Sociology 89 (2): 324–49.
Logunov, A. A. 2004. “Henri Poincare and Relativity
Theory.” https://arxiv.org/abs/physics/0408077.
Lorentz, George G. 1976. “The 13th Problem of
Hilbert.” In Proceedings of
Symposia in Pure Mathematics, 28:419–30.
American Mathematical Society.
Maharaj, Shiva, Nick Polson, and Vadim Sokolov. 2023. “Kramnik vs
Nakamura or Bayes vs p-Value.” SSRN
Scholarly Paper. Rochester, NY.
Malthouse, Edward, Richard Mah, and Ajit Tamhane. 1997. “Nonlinear
Partial Least Squares.” Computers & Chemical
Engineering 12 (April): 875–90.
Mazumder, Rahul, Jerome H. Friedman, and Trevor Hastie. 2011. “SparseNet:
Coordinate Descent with Nonconvex Penalties.”
Journal of the American Statistical Association 106 (495):
1125–38.
Mehrasa, Nazanin, Yatao Zhong, Frederick Tung, Luke Bornn, and Greg
Mori. 2017. “Learning Person Trajectory Representations for Team
Activity Analysis.” arXiv Preprint arXiv:1706.00893. https://arxiv.org/abs/1706.00893.
Milman, Vitali D, and Gideon Schechtman. 2009. Asymptotic Theory of
Finite Dimensional Normed Spaces: Isoperimetric
Inequalities in Riemannian Manifolds. Vol. 1200. Springer.
Mitchell, T. J., and J. J. Beauchamp. 1988. “Bayesian
Variable Selection in Linear
Regression.” Journal of the American Statistical
Association 83 (404): 1023–32. https://www.jstor.org/stable/2290129.
Nadaraya, E. A. 1964. “On Estimating
Regression.” Theory of Probability & Its
Applications 9 (1): 141–42.
Naik, Prasad, and Chih-Ling Tsai. 2000. “Partial Least
Squares Estimator for Single-Index Models.”
Journal of the Royal Statistical Society. Series B (Statistical
Methodology) 62 (4): 763–71. https://www.jstor.org/stable/2680619.
Nareklishvili, Maria, Nicholas Polson, and Vadim Sokolov. 2022.
“Deep Partial Least Squares for IV Regression.” arXiv
Preprint arXiv:2207.02612. https://arxiv.org/abs/2207.02612.
———. 2023a. “Generative Causal Inference,”
June. https://arxiv.org/abs/2306.16096.
———. 2023b. “Feature Selection for Personalized
Policy Analysis,” July. https://arxiv.org/abs/2301.00251.
Nesterov, Yurii. 1983. “A Method of Solving a Convex Programming
Problem with Convergence Rate O(1/k²).” In
Soviet Mathematics Doklady, 27:372–76.
———. 2013. Introductory Lectures on Convex Optimization:
A Basic Course. Vol. 87. Springer Science &
Business Media.
Nicosia, Luca, Giulia Gnocchi, Ilaria Gorini, Massimo Venturini,
Federico Fontana, Filippo Pesapane, Ida Abiuso, et al. 2023.
“History of Mammography: Analysis of Breast Imaging
Diagnostic Achievements over the Last Century.”
Healthcare 11 (1596).
Ostrovskii, G. M., Yu. M. Volin, and W. W. Borisov. 1971. “Über die
Berechnung von Ableitungen.” Wissenschaftliche Zeitschrift
der Technischen Hochschule für Chemie, Leuna-Merseburg 13 (4):
382–84.
Parzen, Emanuel. 2004. “Quantile Probability and
Statistical Data Modeling.” Statistical
Science 19 (4): 652–62. https://www.jstor.org/stable/4144436.
Petris, Giovanni. 2010. “An R Package for
Dynamic Linear Models.” Journal of Statistical
Software 36 (October): 1–16.
Poincaré, Henri. 1898. “La Mesure du Temps.” Revue de
Métaphysique et de Morale 6 (1): 1–13.
Polson, Nicholas G., and James G. Scott. 2011. “Shrink
Globally, Act Locally: Sparse Bayesian
Regularization and Prediction.” In
Bayesian Statistics 9, edited by José M. Bernardo,
M. J. Bayarri, James O. Berger, A. P. Dawid, David Heckerman, Adrian F.
M. Smith, and Mike West. Oxford University Press.
Polson, Nicholas G., James G. Scott, and Jesse Windle. 2013.
“Bayesian Inference for Logistic
Models Using Pólya–Gamma Latent
Variables.” Journal of the American Statistical
Association 108 (504): 1339–49.
Polson, Nicholas G, and James Scott. 2018. AIQ: How
People and Machines Are Smarter Together. St. Martin’s Press.
Polson, Nicholas G., and Vadim Sokolov. 2023. “Generative
AI for Bayesian Computation,” June. https://arxiv.org/abs/2305.14972.
Polson, Nicholas G, Vadim Sokolov, et al. 2017. “Deep
Learning: A Bayesian Perspective.”
Bayesian Analysis 12 (4): 1275–1304.
Polson, Nicholas, and Steven Scott. 2011. “Data
Augmentation for Support Vector
Machines.” Bayesian Analysis 6 (March).
Polson, Nicholas, and Vadim Sokolov. 2020. “Deep Learning:
Computational Aspects.” Wiley Interdisciplinary
Reviews: Computational Statistics 12 (5): e1500.
Polson, Nicholas, Vadim Sokolov, and Jianeng Xu. 2021. “Deep
Learning Partial Least Squares.” arXiv Preprint
arXiv:2106.14085. https://arxiv.org/abs/2106.14085.
Polson, Nick, Fabrizio Ruggeri, and Vadim Sokolov. 2024.
“Generative Bayesian Computation for Maximum
Expected Utility.” Entropy 26 (12): 1076.
Poplin, Ryan, Avinash V Varadarajan, Katy Blumer, Yun Liu, Michael V
McConnell, Greg S Corrado, Lily Peng, and Dale R Webster. 2018.
“Prediction of Cardiovascular Risk Factors from Retinal Fundus
Photographs via Deep Learning.” Nature Biomedical
Engineering 2 (3): 158.
Robbins, Herbert, and Sutton Monro. 1951. “A Stochastic
Approximation Method.” The Annals of Mathematical
Statistics 22 (3): 400–407.
Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki
Vehtari, and Donald B. Rubin. 2015. Bayesian Data
Analysis. 3rd ed. New York: Chapman and
Hall/CRC.
Rumelhart, David E, Geoffrey E Hinton, and Ronald J Williams. 1986.
“Learning Representations by Back-Propagating Errors.”
Nature 323 (6088): 533.
Schmidhuber, Jürgen. 2015. “Deep Learning in Neural Networks:
An Overview.” Neural Networks 61: 85–117.
Schmidt-Hieber, Johannes. 2021. “The
Kolmogorov–Arnold Representation Theorem
Revisited.” Neural Networks 137 (May): 119–26.
Schwertman, Neil C, AJ Gilks, and J Cameron. 1990. “A Simple
Noncalculus Proof That the Median Minimizes the Sum of the Absolute
Deviations.” The American Statistician 44 (1): 38–39.
Scott, Steven L. 2015. “Multi-Armed Bandit Experiments in the
Online Service Economy.” Applied Stochastic Models in
Business and Industry 31 (1): 37–45.
Scott, Steven L. 2022. “BoomSpikeSlab:
MCMC for Spike and Slab
Regression.”
Scott, Steven L., and Hal R. Varian. 2015. “Bayesian
Variable Selection for Nowcasting Economic Time
Series.” In Economic Analysis of the
Digital Economy, 119–35. University of Chicago Press.
Scott, Steven, and Hal Varian. 2014. “Predicting the
Present with Bayesian Structural Time
Series.” International Journal of Mathematical Modelling and
Numerical Optimisation 5 (January): 4–23.
Taylor, Sean J., and Ben Letham. 2017. “Prophet: Forecasting at
Scale.” Meta Research.
https://research.facebook.com/blog/2017/2/prophet-forecasting-at-scale/.
Shen, Changyu, Enrico G Ferro, Huiping Xu, Daniel B Kramer, Rushad
Patell, and Dhruv S Kazi. 2021. “Underperformance of Contemporary
Phase III Oncology Trials and Strategies for
Improvement.” Journal of the National Comprehensive Cancer
Network 19 (9): 1072–78.
Shiryayev, A. N. 1992. “On Analytical Methods in Probability
Theory.” In Selected Works of A. N.
Kolmogorov: Volume II Probability Theory and
Mathematical Statistics, edited by A. N. Shiryayev, 62–108.
Dordrecht: Springer Netherlands.
Silver, David, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou,
Matthew Lai, Arthur Guez, Marc Lanctot, et al. 2017. “Mastering
Chess and Shogi by Self-Play with
a General Reinforcement Learning Algorithm.” arXiv.
https://arxiv.org/abs/1712.01815.
Simpson, Edward. 2010. “Edward Simpson:
Bayes at Bletchley Park.”
Significance 7 (2): 76–80.
Smith, A. F. M. 1975. “A Bayesian Approach to
Inference about a Change-Point in a
Sequence of Random Variables.”
Biometrika 62 (2): 407–16. https://www.jstor.org/stable/2335381.
Sokolov, Vadim. 2017. “Discussion of ‘Deep
Learning for Finance: Deep Portfolios’.” Applied
Stochastic Models in Business and Industry 33 (1): 16–18.
Spiegelhalter, David, and Yin-Lam Ng. 2009. “One Match to
Go!” Significance 6 (4): 151–53.
Stein, Charles. 1964. “Inadmissibility of the Usual Estimator for
the Variance of a Normal Distribution with Unknown Mean.”
Annals of the Institute of Statistical Mathematics 16 (1):
155–60.
Stern, H, Adam Sugano, J Albert, and R Koning. 2007. “Inference
about Batter-Pitcher Matchups in Baseball from Small Samples.”
Statistical Thinking in Sports, 153–65.
Stigler, Stephen M. 1981. “Gauss and the Invention of Least
Squares.” The Annals of Statistics, 465–74.
Sun, Duxin, Wei Gao, Hongxiang Hu, and Simon Zhou. 2022. “Why 90%
of Clinical Drug Development Fails and How to Improve It?”
Acta Pharmaceutica Sinica B 12 (7): 3049–62.
Sutskever, Ilya, James Martens, George Dahl, and Geoffrey Hinton. 2013.
“On the Importance of Initialization and Momentum in Deep
Learning.” In International Conference on Machine
Learning, 1139–47.
Taleb, Nassim Nicholas. 2007. The Black Swan: The
Impact of the Highly Improbable. Annotated
ed. New York, NY: Random House.
Tarone, Robert E. 1982. “The Use of Historical Control Information
in Testing for a Trend in Proportions.” Biometrics,
215–20.
Tesauro, Gerald. 1995. “Temporal Difference Learning and
TD-Gammon.” Communications of the ACM 38
(3): 58–68.
Tiao, Louis. 2019. “Pólya-Gamma Bayesian
Logistic Regression.” Blog post.
Tsai, Yao-Hung Hubert, Shaojie Bai, Makoto Yamada, Louis-Philippe
Morency, and Ruslan Salakhutdinov. 2019. “Transformer
Dissection: A Unified Understanding of
Transformer’s Attention via the
Lens of Kernel.” arXiv. https://arxiv.org/abs/1908.11775.
Varian, Hal R. 2010. “Computer Mediated
Transactions.” American Economic Review 100 (2):
1–10.
Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion
Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2023.
“Attention Is All You Need.” arXiv. https://arxiv.org/abs/1706.03762.
Vecer, Jan, Frantisek Kopriva, and Tomoyuki Ichiba. 2009.
“Estimating the Effect of the Red Card
in Soccer: When to Commit an
Offense in Exchange for
Preventing a Goal Opportunity.”
Journal of Quantitative Analysis in Sports 5 (1).
Viterbi, A. 1967. “Error Bounds for Convolutional Codes and an
Asymptotically Optimum Decoding Algorithm.” IEEE Transactions
on Information Theory 13 (2): 260–69.
Watson, Geoffrey S. 1964. “Smooth Regression
Analysis.” Sankhyā: The Indian Journal of
Statistics, Series A (1961-2002) 26 (4): 359–72. https://www.jstor.org/stable/25049340.
Werbos, Paul. 1974. “Beyond Regression: New Tools for Prediction
and Analysis in the Behavioral Sciences.” PhD
dissertation, Harvard University.
Werbos, Paul J. 1982. “Applications of Advances in Nonlinear
Sensitivity Analysis.” In System Modeling and
Optimization, 762–70. Springer.
Windle, Jesse. 2023. “BayesLogit:
Bayesian Logistic Regression.” R package version
2.1.
Windle, Jesse, Nicholas G. Polson, and James G. Scott. 2014.
“Sampling Polya-Gamma Random Variates: Alternate and
Approximate Techniques.” arXiv. https://arxiv.org/abs/1405.0506.
Wojna, Zbigniew, Alex Gorban, Dar-Shyang Lee, Kevin Murphy, Qian Yu,
Yeqing Li, and Julian Ibarz. 2017. “Attention-Based Extraction of
Structured Information from Street View Imagery.” arXiv
Preprint arXiv:1704.03549. https://arxiv.org/abs/1704.03549.
Wold, Herman. 1975. “Soft Modelling by
Latent Variables: The Non-Linear Iterative Partial
Least Squares (NIPALS)
Approach.” Journal of Applied Probability
12 (S1): 117–42.
Yaari, Menahem E. 1987. “The Dual Theory of
Choice Under Risk.”
Econometrica 55 (1): 95–115. https://www.jstor.org/stable/1911158.
Zeiler, Matthew D. 2012. “ADADELTA: An Adaptive
Learning Rate Method.” arXiv Preprint arXiv:1212.5701.
https://arxiv.org/abs/1212.5701.
Zhang, Yichi, Anirban Datta, and Sudipto Banerjee. 2018. “Scalable
Gaussian Process Classification with Pólya-Gamma Data
Augmentation.” arXiv Preprint arXiv:1802.06383. https://arxiv.org/abs/1802.06383.