Abramowitz, M., Stegun, I.A., editors. (1972). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover Publications.


Aggarwal, C.C. (2015). Data Mining: The Textbook. Springer.


Arnold, T.B., Emerson, J.W. (2011). Nonparametric goodness-of-fit tests for discrete null distributions. The R Journal, 3(2):34–39. DOI: 10.32614/RJ-2011-016.


Bartoszyński, R., Niewiadomska-Bugaj, M. (2007). Probability and Statistical Inference. Wiley.


Beirlant, J., Goegebeur, Y., Teugels, J., Segers, J. (2004). Statistics of Extremes: Theory and Applications. Wiley. DOI: 10.1002/0470012382.


Bezdek, J.C., Ehrlich, R., Full, W. (1984). FCM: The fuzzy c-means clustering algorithm. Computers & Geosciences, 10(2–3):191–203. DOI: 10.1016/0098-3004(84)90020-7.


Billingsley, P. (1995). Probability and Measure. John Wiley & Sons.


Bishop, C. (2006). Pattern Recognition and Machine Learning. Springer-Verlag.


Blum, A., Hopcroft, J., Kannan, R. (2020). Foundations of Data Science. Cambridge University Press.


Box, G.E.P., Cox, D.R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 26(2):211–252.


Bullen, P.S. (2003). Handbook of Means and Their Inequalities. Springer Science+Business Media, Dordrecht.


Campello, R.J.G.B., Moulavi, D., Zimek, A., Sander, J. (2015). Hierarchical density estimates for data clustering, visualization, and outlier detection. ACM Transactions on Knowledge Discovery from Data, 10(1):5:1–5:51. DOI: 10.1145/2733381.


Chambers, J.M., Hastie, T. (1991). Statistical Models in S. Wadsworth & Brooks/Cole.


Clauset, A., Shalizi, C.R., Newman, M.E.J. (2009). Power-law distributions in empirical data. SIAM Review, 51(4):661–703. DOI: 10.1137/070710111.


Connolly, T., Begg, C. (2015). Database Systems: A Practical Approach to Design, Implementation, and Management. Pearson.


Conover, W.J. (1972). A Kolmogorov goodness-of-fit test for discontinuous distributions. Journal of the American Statistical Association, 67(339):591–596. DOI: 10.1080/01621459.1972.10481254.


Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press.


Dasu, T., Johnson, T. (2003). Exploratory Data Mining and Data Cleaning. John Wiley & Sons.


Date, C.J. (2003). An Introduction to Database Systems. Pearson.


Deisenroth, M.P., Faisal, A.A., Ong, C.S. (2020). Mathematics for Machine Learning. Cambridge University Press.


Dekking, F.M., Kraaikamp, C., Lopuhaä, H.P., Meester, L.E. (2005). A Modern Introduction to Probability and Statistics: Understanding Why and How. Springer.


Devroye, L., Györfi, L., Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Springer. DOI: 10.1007/978-1-4612-0711-5.


Deza, M.M., Deza, E. (2014). Encyclopedia of Distances. Springer.


Efron, B., Hastie, T. (2016). Computer Age Statistical Inference: Algorithms, Evidence, and Data Science. Cambridge University Press.


Ester, M., Kriegel, H.P., Sander, J., Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proc. KDD'96, pp. 226–231.


Feller, W. (1950). An Introduction to Probability Theory and Its Applications: Volume I. Wiley.


Forbes, C., Evans, M., Hastings, N., Peacock, B. (2010). Statistical Distributions. Wiley.


Freedman, D., Diaconis, P. (1981). On the histogram as a density estimator: L₂ theory. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 57:453–476.


Friedl, J.E.F. (2006). Mastering Regular Expressions. O'Reilly.


Gagolewski, M. (2015). Data Fusion: Theory, Methods, and Applications. Institute of Computer Science, Polish Academy of Sciences. DOI: 10.5281/zenodo.6960306.


Gagolewski, M. (2015). Spread measures and their relation to aggregation functions. European Journal of Operational Research, 241(2):469–477. DOI: 10.1016/j.ejor.2014.08.034.


Gagolewski, M. (2021). genieclust: Fast and robust hierarchical clustering. SoftwareX, 15:100722. DOI: 10.1016/j.softx.2021.100722.


Gagolewski, M. (2022). stringi: Fast and portable character string processing in R. Journal of Statistical Software, 103(2):1–59. DOI: 10.18637/jss.v103.i02.


Gagolewski, M. (2023). Deep R Programming. Zenodo, Melbourne. Early draft. DOI: 10.5281/zenodo.7490464.


Gagolewski, M., Bartoszuk, M., Cena, A. (2016). Przetwarzanie i analiza danych w języku Python (Data Processing and Analysis in Python). PWN. In Polish.


Gagolewski, M., Bartoszuk, M., Cena, A. (2021). Are cluster validity measures (in)valid? Information Sciences, 581:620–636. DOI: 10.1016/j.ins.2021.10.004.


Gentle, J.E. (2003). Random Number Generation and Monte Carlo Methods. Springer-Verlag.


Gentle, J.E. (2009). Computational Statistics. Springer-Verlag.


Gentle, J.E. (2017). Matrix Algebra: Theory, Computations and Applications in Statistics. Springer.


Gentle, J.E. (2020). Theory of Statistics. Book draft.


Goldberg, D. (1991). What every computer scientist should know about floating-point arithmetic. ACM Computing Surveys, 23(1):5–48.


Goodfellow, I., Bengio, Y., Courville, A. (2016). Deep Learning. MIT Press.


Grabisch, M., Marichal, J.-L., Mesiar, R., Pap, E. (2009). Aggregation Functions. Cambridge University Press.


Gumbel, E.J. (1939). La probabilité des hypothèses. Comptes Rendus de l'Académie des Sciences Paris, 209:645–647.


Harris, C.R., et al. (2020). Array programming with NumPy. Nature, 585(7825):357–362. DOI: 10.1038/s41586-020-2649-2.


Hart, E.M., et al. (2016). Ten simple rules for digital data storage. PLOS Computational Biology, 12(10):1–12. DOI: 10.1371/journal.pcbi.1005097.


Hastie, T., Tibshirani, R., Friedman, J. (2017). The Elements of Statistical Learning. Springer-Verlag.


Higham, N.J. (2002). Accuracy and Stability of Numerical Algorithms. SIAM, Philadelphia, PA. DOI: 10.1137/1.9780898718027.


Hopcroft, J.E., Ullman, J.D. (1979). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley.


Huber, P.J., Ronchetti, E.M. (2009). Robust Statistics. Wiley.


Hunter, J.D. (2007). Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9(3):90–95.


Hyndman, R.J., Athanasopoulos, G. (2021). Forecasting: Principles and Practice. OTexts.


Hyndman, R.J., Fan, Y. (1996). Sample quantiles in statistical packages. The American Statistician, 50(4):361–365. DOI: 10.2307/2684934.


Kleene, S.C. (1951). Representation of events in nerve nets and finite automata. Technical Report RM-704, The RAND Corporation, Santa Monica, CA.


Knuth, D.E. (1992). Literate Programming. CSLI.


Knuth, D.E. (1997). The Art of Computer Programming II: Seminumerical Algorithms. Addison-Wesley.


Kuchling, A.M. (2023). Regular Expression HOWTO.


Lee, J. (2011). A First Course in Combinatorial Optimisation. Cambridge University Press.


Ling, R.F. (1973). A probability theory of cluster analysis. Journal of the American Statistical Association, 68(341):159–164. DOI: 10.1080/01621459.1973.10481356.


Little, R.J.A., Rubin, D.B. (2002). Statistical Analysis with Missing Data. John Wiley & Sons.


Lloyd, S.P. (1957 (1982)). Least squares quantization in PCM. IEEE Transactions on Information Theory, 28:128–137. Originally a 1957 Bell Telephone Laboratories Research Report; republished in 1982. DOI: 10.1109/TIT.1982.1056489.


McKinney, W. (2022). Python for Data Analysis. O'Reilly.


Modarres, M., Kaminskiy, M.P., Krivtsov, V. (2016). Reliability Engineering and Risk Analysis: A Practical Guide. CRC Press.


Monahan, J.F. (2011). Numerical Methods of Statistics. Cambridge University Press.


Müllner, D. (2011). Modern hierarchical, agglomerative clustering algorithms. arXiv:1109.2378 [stat.ML].


Nelsen, R.B. (1999). An Introduction to Copulas. Springer-Verlag.


Newman, M.E.J. (2005). Power laws, Pareto distributions and Zipf's law. Contemporary Physics, 46(5):323–351. DOI: 10.1080/00107510500052444.


Oetiker, T., et al. (2021). The Not So Short Introduction to LaTeX 2ε.


Olver, F.W.J., et al. (2023). NIST Digital Library of Mathematical Functions.


Ord, J.K., Fildes, R., Kourentzes, N. (2017). Principles of Business Forecasting. Wessex Press.


Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830.


Poore, G.M. (2019). Codebraid: Live code in pandoc Markdown. In: Proc. 18th Python in Science Conf., pp. 54–61.


Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P. (2007). Numerical Recipes: The Art of Scientific Computing. Cambridge University Press.


Pérez-Fernández, R., De Baets, B., Gagolewski, M. (2019). A taxonomy of monotonicity properties for the aggregation of multidimensional data. Information Fusion, 52:322–334. DOI: 10.1016/j.inffus.2019.05.006.


Rabin, M., Scott, D. (1959). Finite automata and their decision problems. IBM Journal of Research and Development, 3:114–125.


Ritchie, D.M., Thompson, K.L. (1970). QED text editor. Technical Report 70107-002, Bell Telephone Laboratories, Inc.


Robert, C.P., Casella, G. (2004). Monte Carlo Statistical Methods. Springer-Verlag.


Ross, S.M. (2020). Introduction to Probability and Statistics for Engineers and Scientists. Academic Press.


Rousseeuw, P.J., Ruts, I., Tukey, J.W. (1999). The bagplot: A bivariate boxplot. The American Statistician, 53(4):382–387. DOI: 10.2307/2686061.


Rubin, D.B. (1976). Inference and missing data. Biometrika, 63(3):581–590.


Sandve, G.K., Nekrutenko, A., Taylor, J., Hovig, E. (2013). Ten simple rules for reproducible computational research. PLOS Computational Biology, 9(10):1–4. DOI: 10.1371/journal.pcbi.1003285.


Smith, S.W. (2002). The Scientist and Engineer's Guide to Digital Signal Processing. Newnes.


Steiglitz, K. (1996). A Digital Signal Processing Primer: With Applications to Digital Audio and Computer Music. Pearson.


Tijms, H.C. (2003). A First Course in Stochastic Models. Wiley.


Tufte, E.R. (2001). The Visual Display of Quantitative Information. Graphics Press.


Tukey, J.W. (1962). The future of data analysis. Annals of Mathematical Statistics, 33(1):1–67. DOI: 10.1214/aoms/1177704711.


Tukey, J.W. (1977). Exploratory Data Analysis. Addison-Wesley.


van Buuren, S. (2018). Flexible Imputation of Missing Data. CRC Press.


van der Loo, M., de Jonge, E. (2018). Statistical Data Cleaning with Applications in R. John Wiley & Sons.


Virtanen, P., et al. (2020). SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nature Methods, 17:261–272. DOI: 10.1038/s41592-019-0686-2.


Waskom, M.L. (2021). seaborn: Statistical data visualization. Journal of Open Source Software, 6(60):3021. DOI: 10.21105/joss.03021.


Wickham, H. (2011). The split-apply-combine strategy for data analysis. Journal of Statistical Software, 40(1):1–29. DOI: 10.18637/jss.v040.i01.


Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10):1–23. DOI: 10.18637/jss.v059.i10.


Wierzchoń, S.T., Kłopotek, M.A. (2018). Modern Algorithms for Cluster Analysis. Springer. DOI: 10.1007/978-3-319-69308-8.


Wilson, G., et al. (2014). Best practices for scientific computing. PLOS Biology, 12(1):1–7. DOI: 10.1371/journal.pbio.1001745.


Wilson, G., et al. (2017). Good enough practices in scientific computing. PLOS Computational Biology, 13(6):1–20. DOI: 10.1371/journal.pcbi.1005510.


Xie, Y. (2015). Dynamic Documents with R and knitr. Chapman and Hall/CRC.