Abramowitz, M. and Stegun, I.A., editors. (1972). Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover Publications.


Aggarwal, C.C. (2015). Data Mining: The Textbook. Springer.


Arnold, T.B. and Emerson, J.W. (2011). Nonparametric goodness-of-fit tests for discrete null distributions. The R Journal, 3(2):34–39. DOI: 10.32614/RJ-2011-016.


Bartoszyński, R. and Niewiadomska-Bugaj, M. (2007). Probability and Statistical Inference. Wiley.


Beirlant, J., Goegebeur, Y., Teugels, J., and Segers, J. (2004). Statistics of Extremes: Theory and Applications. Wiley. DOI: 10.1002/0470012382.


Bezdek, J.C., Ehrlich, R., and Full, W. (1984). FCM: The fuzzy c-means clustering algorithm. Computers & Geosciences, 10(2–3):191–203. DOI: 10.1016/0098-3004(84)90020-7.


Billingsley, P. (1995). Probability and Measure. John Wiley & Sons.


Bishop, C. (2006). Pattern Recognition and Machine Learning. Springer-Verlag.


Blum, A., Hopcroft, J., and Kannan, R. (2020). Foundations of Data Science. Cambridge University Press.


Box, G.E.P. and Cox, D.R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 26(2):211–252.


Bullen, P.S. (2003). Handbook of Means and Their Inequalities. Springer Science+Business Media.


Campello, R.J.G.B., Moulavi, D., Zimek, A., and Sander, J. (2015). Hierarchical density estimates for data clustering, visualization, and outlier detection. ACM Transactions on Knowledge Discovery from Data, 10(1):5:1–5:51. DOI: 10.1145/2733381.


Chambers, J.M. and Hastie, T. (1991). Statistical Models in S. Wadsworth & Brooks/Cole.


Clauset, A., Shalizi, C.R., and Newman, M.E.J. (2009). Power-law distributions in empirical data. SIAM Review, 51(4):661–703. DOI: 10.1137/070710111.


Connolly, T. and Begg, C. (2015). Database Systems: A Practical Approach to Design, Implementation, and Management. Pearson.


Conover, W.J. (1972). A Kolmogorov goodness-of-fit test for discontinuous distributions. Journal of the American Statistical Association, 67(339):591–596. DOI: 10.1080/01621459.1972.10481254.


Cramér, H. (1946). Mathematical Methods of Statistics. Princeton University Press.


Dasu, T. and Johnson, T. (2003). Exploratory Data Mining and Data Cleaning. John Wiley & Sons.


Date, C.J. (2003). An Introduction to Database Systems. Pearson.


Deisenroth, M.P., Faisal, A.A., and Ong, C.S. (2020). Mathematics for Machine Learning. Cambridge University Press.


Dekking, F.M., Kraaikamp, C., Lopuhaä, H.P., and Meester, L.E. (2005). A Modern Introduction to Probability and Statistics: Understanding Why and How. Springer.


Devroye, L., Györfi, L., and Lugosi, G. (1996). A Probabilistic Theory of Pattern Recognition. Springer. DOI: 10.1007/978-1-4612-0711-5.


Deza, M.M. and Deza, E. (2014). Encyclopedia of Distances. Springer.


Efron, B. and Hastie, T. (2016). Computer Age Statistical Inference: Algorithms, Evidence, and Data Science. Cambridge University Press.


Ester, M., Kriegel, H.P., Sander, J., and Xu, X. (1996). A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proc. KDD'96, pp. 226–231.


Feller, W. (1950). An Introduction to Probability Theory and Its Applications: Volume I. Wiley.


Forbes, C., Evans, M., Hastings, N., and Peacock, B. (2010). Statistical Distributions. Wiley.


Freedman, D. and Diaconis, P. (1981). On the histogram as a density estimator: L₂ theory. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 57:453–476.


Friedl, J.E.F. (2006). Mastering Regular Expressions. O'Reilly.


Gagolewski, M. (2015). Data Fusion: Theory, Methods, and Applications. Institute of Computer Science, Polish Academy of Sciences. DOI: 10.5281/zenodo.6960306.


Gagolewski, M. (2015). Spread measures and their relation to aggregation functions. European Journal of Operational Research, 241(2):469–477. DOI: 10.1016/j.ejor.2014.08.034.


Gagolewski, M. (2021). genieclust: Fast and robust hierarchical clustering. SoftwareX, 15:100722. DOI: 10.1016/j.softx.2021.100722.


Gagolewski, M. (2022). stringi: Fast and portable character string processing in R. Journal of Statistical Software, 103(2):1–59. DOI: 10.18637/jss.v103.i02.


Gagolewski, M. (2023). Deep R Programming. Zenodo. DOI: 10.5281/zenodo.7490464.


Gagolewski, M., Bartoszuk, M., and Cena, A. (2016). Przetwarzanie i analiza danych w języku Python (Data Processing and Analysis in Python). PWN. In Polish.


Gagolewski, M., Bartoszuk, M., and Cena, A. (2021). Are cluster validity measures (in)valid? Information Sciences, 581:620–636. DOI: 10.1016/j.ins.2021.10.004.


Gentle, J.E. (2003). Random Number Generation and Monte Carlo Methods. Springer.


Gentle, J.E. (2009). Computational Statistics. Springer-Verlag.


Gentle, J.E. (2017). Matrix Algebra: Theory, Computations and Applications in Statistics. Springer.


Gentle, J.E. (2020). Theory of Statistics. Book draft.


Goldberg, D. (1991). What every computer scientist should know about floating-point arithmetic. ACM Computing Surveys, 23(1):5–48.


Goodfellow, I., Bengio, Y., and Courville, A. (2016). Deep Learning. MIT Press.


Grabisch, M., Marichal, J.-L., Mesiar, R., and Pap, E. (2009). Aggregation Functions. Cambridge University Press.


Gumbel, E.J. (1939). La probabilité des hypothèses. Comptes Rendus de l'Académie des Sciences Paris, 209:645–647.


Harris, C.R. and others. (2020). Array programming with NumPy. Nature, 585(7825):357–362. DOI: 10.1038/s41586-020-2649-2.


Hart, E.M. and others. (2016). Ten simple rules for digital data storage. PLOS Computational Biology, 12(10):1–12. DOI: 10.1371/journal.pcbi.1005097.


Hastie, T., Tibshirani, R., and Friedman, J. (2017). The Elements of Statistical Learning. Springer-Verlag.


Higham, N.J. (2002). Accuracy and Stability of Numerical Algorithms. SIAM. DOI: 10.1137/1.9780898718027.


Hopcroft, J.E. and Ullman, J.D. (1979). Introduction to Automata Theory, Languages, and Computation. Addison-Wesley.


Huber, P.J. and Ronchetti, E.M. (2009). Robust Statistics. Wiley.


Hunter, J.D. (2007). Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9(3):90–95.


Hyndman, R.J. and Athanasopoulos, G. (2021). Forecasting: Principles and Practice. OTexts.


Hyndman, R.J. and Fan, Y. (1996). Sample quantiles in statistical packages. The American Statistician, 50(4):361–365. DOI: 10.2307/2684934.


Kleene, S.C. (1951). Representation of events in nerve nets and finite automata. Technical Report RM-704, The RAND Corporation, Santa Monica, CA.


Knuth, D.E. (1992). Literate Programming. CSLI.


Knuth, D.E. (1997). The Art of Computer Programming II: Seminumerical Algorithms. Addison-Wesley.


Kuchling, A.M. (2023). Regular Expression HOWTO.


Lee, J. (2011). A First Course in Combinatorial Optimisation. Cambridge University Press.


Ling, R.F. (1973). A probability theory of cluster analysis. Journal of the American Statistical Association, 68(341):159–164. DOI: 10.1080/01621459.1973.10481356.


Little, R.J.A. and Rubin, D.B. (2002). Statistical Analysis with Missing Data. John Wiley & Sons.


Lloyd, S.P. (1982). Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137. Originally a 1957 Bell Telephone Laboratories Research Report; republished in 1982. DOI: 10.1109/TIT.1982.1056489.


Matloff, N.S. (2011). The Art of R Programming: A Tour of Statistical Software Design. No Starch Press.


McKinney, W. (2022). Python for Data Analysis. O'Reilly.


Modarres, M., Kaminskiy, M.P., and Krivtsov, V. (2016). Reliability Engineering and Risk Analysis: A Practical Guide. CRC Press.


Monahan, J.F. (2011). Numerical Methods of Statistics. Cambridge University Press.


Müllner, D. (2011). Modern hierarchical, agglomerative clustering algorithms. arXiv:1109.2378 [stat.ML].


Nelsen, R.B. (1999). An Introduction to Copulas. Springer-Verlag.


Newman, M.E.J. (2005). Power laws, Pareto distributions and Zipf's law. Contemporary Physics, 46(5):323–351. DOI: 10.1080/00107510500052444.


Oetiker, T. and others. (2021). The Not So Short Introduction to LaTeX 2ε.


Olver, F.W.J. and others. (2023). NIST Digital Library of Mathematical Functions.


Ord, J.K., Fildes, R., and Kourentzes, N. (2017). Principles of Business Forecasting. Wessex Press.


Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., and Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830.


Poore, G.M. (2019). Codebraid: Live code in pandoc Markdown. In: Proc. 18th Python in Science Conf., pp. 54–61.


Press, W.H., Teukolsky, S.A., Vetterling, W.T., and Flannery, B.P. (2007). Numerical Recipes. The Art of Scientific Computing. Cambridge University Press.


Pérez-Fernández, R., De Baets, B., and Gagolewski, M. (2019). A taxonomy of monotonicity properties for the aggregation of multidimensional data. Information Fusion, 52:322–334. DOI: 10.1016/j.inffus.2019.05.006.


Rabin, M. and Scott, D. (1959). Finite automata and their decision problems. IBM Journal of Research and Development, 3:114–125.


Ritchie, D.M. and Thompson, K.L. (1970). QED text editor. Technical Report 70107-002, Bell Telephone Laboratories, Inc.


Robert, C.P. and Casella, G. (2004). Monte Carlo Statistical Methods. Springer-Verlag.


Ross, S.M. (2020). Introduction to Probability and Statistics for Engineers and Scientists. Academic Press.


Rousseeuw, P.J., Ruts, I., and Tukey, J.W. (1999). The bagplot: A bivariate boxplot. The American Statistician, 53(4):382–387. DOI: 10.2307/2686061.


Rubin, D.B. (1976). Inference and missing data. Biometrika, 63(3):581–590.


Sandve, G.K., Nekrutenko, A., Taylor, J., and Hovig, E. (2013). Ten simple rules for reproducible computational research. PLOS Computational Biology, 9(10):1–4. DOI: 10.1371/journal.pcbi.1003285.


Smith, S.W. (2002). The Scientist and Engineer's Guide to Digital Signal Processing. Newnes.


Steiglitz, K. (1996). A Digital Signal Processing Primer: With Applications to Digital Audio and Computer Music. Pearson.


Tijms, H.C. (2003). A First Course in Stochastic Models. Wiley.


Tufte, E.R. (2001). The Visual Display of Quantitative Information. Graphics Press.


Tukey, J.W. (1962). The future of data analysis. Annals of Mathematical Statistics, 33(1):1–67. DOI: 10.1214/aoms/1177704711.


Tukey, J.W. (1977). Exploratory Data Analysis. Addison-Wesley.


van Buuren, S. (2018). Flexible Imputation of Missing Data. CRC Press.


van der Loo, M. and de Jonge, E. (2018). Statistical Data Cleaning with Applications in R. John Wiley & Sons.


Venables, W.N., Smith, D.M., and R Core Team. (2023). An Introduction to R.


Virtanen, P. and others. (2020). SciPy 1.0: Fundamental algorithms for scientific computing in Python. Nature Methods, 17:261–272. DOI: 10.1038/s41592-019-0686-2.


Waskom, M.L. (2021). seaborn: Statistical data visualization. Journal of Open Source Software, 6(60):3021. DOI: 10.21105/joss.03021.


Wickham, H. (2011). The split-apply-combine strategy for data analysis. Journal of Statistical Software, 40(1):1–29. DOI: 10.18637/jss.v040.i01.


Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10):1–23. DOI: 10.18637/jss.v059.i10.


Wickham, H. and Grolemund, G. (2017). R for Data Science. O'Reilly.


Wierzchoń, S.T. and Kłopotek, M.A. (2018). Modern Algorithms for Cluster Analysis. Springer. DOI: 10.1007/978-3-319-69308-8.


Wilson, G. and others. (2014). Best practices for scientific computing. PLOS Biology, 12(1):1–7. DOI: 10.1371/journal.pbio.1001745.


Wilson, G. and others. (2017). Good enough practices in scientific computing. PLOS Computational Biology, 13(6):1–20. DOI: 10.1371/journal.pcbi.1005510.


Xie, Y. (2015). Dynamic Documents with R and knitr. Chapman and Hall/CRC.