Raise your machine learning game: deal with imbalanced data using libraries such as imbalanced-learn, PyTorch, scikit-learn, pandas, and NumPy, and squeeze better performance from your machine learning models with this essential guide.
As machine learning practitioners, we often encounter imbalanced datasets in which one class has considerably fewer instances than the other. Many machine learning algorithms assume a rough balance between the majority and minority classes, which leads to suboptimal performance on imbalanced data. Addressing class imbalance is therefore crucial for improving model performance.
Machine Learning for Imbalanced Data begins by introducing the challenges posed by imbalanced datasets and the importance of addressing these issues. It then guides you through techniques that enhance performance on imbalanced data when using classical machine learning models, including various sampling and cost-sensitive learning methods.
As you progress, the book delves into similar and more advanced techniques for deep learning models, employing PyTorch as the primary framework. Throughout the book, hands-on examples provide working, reproducible code that demonstrates the practical implementation of each technique.
By the end of this book, you will be adept at identifying and addressing class imbalance, and confident in applying techniques such as sampling, cost-sensitive learning, and threshold adjustment with traditional machine learning or deep learning models.
This book is for machine learning practitioners who want to effectively address the challenges of imbalanced datasets in their projects. Machine learning engineers/scientists, research scientists/engineers, and data scientists/engineers will find this book helpful. Though complete beginners are welcome to read this book, some familiarity with core ML concepts will help readers get the most out of it.