Analytical Skills for AI and Data Science
This book strives to bridge the gap between business and data science. It is about understanding the business value of analytics and what is important to work on.
As you can see from the above image, confident learning is about estimating the likelihood that the data is labeled correctly, based on the confidence of the model. Each class j has a confidence threshold (the t_j parameter: t_dog, t_fox, t_cow). If the model's confidence in a predicted class is above that class's threshold but the given label is a different class, then we predict that the label is wrong.
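The thresholding idea above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not a full confident learning implementation: the threshold for each class is assumed to be the mean predicted probability of that class over the examples carrying that label (one common choice), and the function names `class_thresholds` and `flag_suspect_labels` as well as the toy probabilities are hypothetical.

```python
from collections import defaultdict


def class_thresholds(pred_probs, labels):
    """Assumed threshold t_j: mean predicted probability of class j
    over the examples whose given label is j."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for probs, label in zip(pred_probs, labels):
        sums[label] += probs[label]
        counts[label] += 1
    return {j: sums[j] / counts[j] for j in counts}


def flag_suspect_labels(pred_probs, labels):
    """Return (index, given_label, suggested_label) for every example where
    the model is confident about a class other than the given label."""
    thresholds = class_thresholds(pred_probs, labels)
    suspects = []
    for i, (probs, label) in enumerate(zip(pred_probs, labels)):
        # Classes whose predicted probability reaches that class's threshold.
        confident = [j for j, t in thresholds.items() if probs[j] >= t]
        if not confident:
            continue  # the model is not confident about any class
        best = max(confident, key=lambda j: probs[j])
        if best != label:
            suspects.append((i, label, best))
    return suspects


# Toy data (made up for illustration): the last example is labeled "cow"
# although the model is very confident it is a "dog".
pred_probs = [
    {"dog": 0.90, "fox": 0.05, "cow": 0.05},
    {"dog": 0.80, "fox": 0.10, "cow": 0.10},
    {"dog": 0.10, "fox": 0.80, "cow": 0.10},
    {"dog": 0.05, "fox": 0.90, "cow": 0.05},
    {"dog": 0.10, "fox": 0.10, "cow": 0.80},
    {"dog": 0.95, "fox": 0.03, "cow": 0.02},
]
labels = ["dog", "dog", "fox", "fox", "cow", "cow"]

suspects = flag_suspect_labels(pred_probs, labels)
print(suspects)
```

Examples where no class reaches its threshold are skipped rather than flagged, so the sketch only reports labels that the model confidently disagrees with.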
Data cleaning is the process of turning the data you have into data that is usable. It is, for lack of a better term, the fight against entropy in the data domain.
Data drift refers to the phenomenon where the statistical properties of a dataset used for machine learning or analysis change over time. This alteration can be due to various factors, such as shifts in data collection processes, changes in the underlying distribution of the data, or modifications in the environment from which the data originates. Detecting and addressing data drift is crucial to maintaining the performance and reliability of machine learning models and analytical systems.
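One simple way to detect such a change for a single numeric feature is to compare a reference sample with a recent sample. The sketch below computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical distribution functions) in plain Python. The function names and the 0.3 alert threshold are hypothetical choices for illustration; a real setup would more likely rely on a statistical test with a p-value and monitor many features.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest absolute gap between the empirical CDFs of the two samples."""
    ref = sorted(reference)
    cur = sorted(current)
    gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in ref + cur:
        cdf_ref = bisect_right(ref, x) / len(ref)
        cdf_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def has_drifted(reference, current, threshold=0.3):
    """Flag drift when the KS statistic exceeds an (assumed) threshold."""
    return ks_statistic(reference, current) > threshold


# Toy data (made up for illustration): the recent sample is shifted by 5.
reference = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
shifted = [x + 5 for x in reference]

print(ks_statistic(reference, reference))
print(ks_statistic(reference, shifted))
print(has_drifted(reference, shifted))
```

A statistic near 0 means the two samples look alike, while a value near 1 means they barely overlap.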
What is data science? It is a collection of different jobs bundled together and given the title of AI to make a company sound innovative.
The start-up phase of a data science project is, for me, one of the most exciting parts of the project, but it is also one of the most unclear phases. To have a fighting chance of making it to production, there are several factors that are extremely important and need to be addressed.
This book dives into the difficult aspects of data science: the business value proposition, communication, and measuring impact. It discusses these topics and presents methods for handling them the right way.
This book covers the fundamentals of designing machine learning systems. It goes through the entire lifecycle of a machine learning system and then discusses the ecosystem and the challenges and cases that need to be considered.
Evaluation is one of the most important aspects of machine learning development. It is the craft of understanding the model and how it works.
A book about advanced Python concepts. A nice way to learn the more advanced parts of the language.
Dummy vs One-hot
This book is about how to make features for machine learning models and implement them into models. The book goes into natural language text, tabular data, and image data. It contains discussions about how to implement good engineering practices in feature engineering.
This book is a more advanced book on Python and dives more into the nitty-gritty of the language. It is about a lot of the core functionalities of Python and how they work. A lot of internal things, such as iterators, data objects and methods and functions, are discussed and analysed in detail.
This book covers the fundamentals of data engineering and how to solve data engineering problems, without going too much into the details of the programming. It introduces concepts such as data warehouses, Kafka, data pipelines, and ETL.
This book is about methods for understanding AI and data modeling, and how to use the different ways of interpreting machine learning models. It gives the basis and then dives into the different types of models and methods you can use. It separates the methods into those specific to a model or model family and those that are model-agnostic.
This book is about machine learning design patterns and the discussions around them. It is therefore a good reminder of the concepts and core tenets of machine learning.
Decision Trees
The ML Design Sprint is a modification of the design sprint workshop, where the goal is to give the project relevant context on the problem, the data, and the resources available. The goal of the ML Design Sprint is to decide on the goal of the model, the input features of the model, and how the model should be evaluated. The design sprint brings together Subject Matter Experts, Users, Product Owners, Data Scientists, and ML Engineers to quickly understand the problem, the potential solutions, and the risks. The ML Design Sprint should shorten the duration of the scoping and exploratory analysis phases by bringing the analysts and experts together.
This book is mostly about different ways of thinking about models, in the context of making models of real-world phenomena. It dives into the drawbacks and benefits of each mindset and tries to explain why knowing each of them is an advantage.
This book is a no-nonsense book about practical MLOps and how you should approach it to solve business problems. The book takes an even more hardline approach to automation and focuses on the concept of Kaizen ML: continuous improvement, striving to make the feedback loop ever shorter and the process more and more seamless.
Scripting
There are two types of recommendation systems
Random Cut Forest