# Changelog

**Important:** Reports of bugs and typos, as well as fixes, are appreciated.

Note that the most up-to-date version of this book can be found at https://datawranglingpy.gagolewski.com/.

Below is the list of the most noteworthy changes.

**under development (v1.0.3.9xxx)**:

New HTML theme (featuring light and dark mode).

Not using **seaborn** where it can easily be replaced by 1–3 calls to the lower-level **matplotlib**, especially in the **numpy** chapters.

Use **numpy.genfromtxt** more eagerly.

A few more examples of using f-strings for pretty-printing of results.

(…) to do (…) work in progress (…)

**2023-02-06 (v1.0.3)**:

Numeric reference style; updated bibliography.

Reduced the file size of the screen-optimised PDF at the cost of a slight decrease in the quality of some figures.

The print-optimised PDF now uses selective rasterisation of parts of figures, not whole pages containing them. This should result in a much better quality of the printed version.

Bug fixes.

Minor extensions, including **pandas.Series.dt.strftime**, more details on how to avoid pitfalls in data frame indexing, etc.

**2022-08-24 (v1.0.2)**:

First printed (paperback) version can be ordered from Amazon.

Fixed page margin and header sizes.

Minor typesetting and other fixes.

**2022-08-12 (v1.0.1)**:

Cover.

ISBN 978-0-6455719-1-2 assigned.

**2022-07-16 (v1.0.0)**:

Preface complete.

Handling tied observations.

Plots look better when printed in black and white.

Exception handling.

File connections.

Other minor extensions and material reordering: more aggregation functions, **pandas.unique**, **pandas.factorize**, probability vectors representing binary categorical variables, etc.

Final proof-reading.

**2022-06-13 (v0.5.1)**:

The Kolmogorov–Smirnov Test (one and two sample).

The Pearson Chi-Squared Test (one and two sample and for independence).

Dealing with round-off and measurement errors.

Adding white noise (jitter).

Lambda expressions.

Matrices are iterable.

**2022-05-31 (v0.4.1)**:

The Rules.

Matrix multiplication, dot products.

Euclidean distance, few-nearest-neighbour and fixed-radius search.

Aggregation of multidimensional data.

Regression with *k*-nearest neighbours.

Least squares fitting of linear regression models.

Geometric transforms; orthonormal matrices.

SVD and dimensionality reduction/PCA.

Classification with *k*-nearest neighbours.

Clustering with *k*-means.

Text Processing and Regular Expression chapters merged.

Unidimensional Data Aggregation and Transformation chapters merged.

`pandas.GroupBy` objects are iterable.

Semitransparent histograms.

Contour plots.

Argument unpacking and variadic arguments (`*args`, `**kwargs`).

**2022-05-23 (v0.3.1)**:

More lightweight mathematical notation.

Some equalities related to the mathematical functions we rely on (the natural logarithm, cosine, etc.).

A way to compute the most correlated pair of variables.

A note on modifying elements in an array and on adding new rows and columns.

An example seasonal plot in the time series chapter.

Solutions to the SQL exercises added; to ignore small round-off errors, use **pandas.testing.assert_frame_equal** instead of **pandas.DataFrame.equals**.

More details on file paths.

**2022-04-12 (v0.2.1)**:

Many chapters merged or relocated.

Added captions to all figures.

Improved formatting of elements (information boxes such as *note*, *important*, *exercise*, *example*).

**2022-03-27 (v0.1.1)**:

First public release – most chapters are drafted, more or less.

Using **Sphinx** for building.

**2022-01-05 (v0.0.0)**:

Project started.