The rise of big data has greatly increased the importance of data science, catalyzing extensive research across mathematics, statistics, computer science, and artificial intelligence. Data science uses modeling, computation, and learning to transform data into information, information into knowledge, and knowledge into actionable decisions. However, big data poses numerous challenges, including missing data, high- and ultra-high-dimensional data, response dependencies, time series data, and distributed storage. Under these conditions, existing theories, methods, and algorithms encounter significant hurdles, especially for fundamental statistical tasks such as estimation, hypothesis testing, confidence interval construction, and variable selection, in both frequentist and Bayesian frameworks. This reprint offers a range of data science tools aimed at tackling these challenges. Its topics include handling measurement error or missing data, cognitive diagnosis modeling, constructing credit risk scorecards with logistic regression models, geographically weighted regression modeling, privacy protection in data mining, clustering methods, model selection for high-dimensional datasets, and predicting sensitive features under indirect questioning. Together, these contributions provide tools and examples for the practical application of data science.