This book represents a first step towards embedded machine learning. It presents techniques for optimizing and compressing deep learning models, making it easier to deploy high-performance, lightweight models on resource-constrained devices such as smartphones and microcontrollers. The book also explores a popular knowledge transfer technique, knowledge distillation, which improves the performance of a lightweight deep learning model by transferring to it the knowledge of a larger, higher-performance model. All of these techniques are detailed in this book and illustrated with practical Python implementations, generally based on the PyTorch and TensorFlow libraries.
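To give a flavor of the knowledge transfer idea mentioned above, here is a minimal sketch of a distillation loss in plain Python. It follows the common formulation in which the student is trained to match the teacher's temperature-softened output distribution via KL divergence; the function names and the choice of temperature are illustrative, not taken from the book's own code.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: a higher temperature yields a
    # softer (more uniform) probability distribution.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence between the teacher's soft targets and the
    # student's predictions, scaled by T^2 so gradient magnitudes
    # stay comparable across temperatures.
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return (temperature ** 2) * kl
```

In practice this distillation term is combined with the usual cross-entropy loss on the true labels; the loss is zero when the student's logits match the teacher's, and grows as their distributions diverge.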