SkillSoft Explore Course

Raw data is rarely ready for developing effective machine learning (ML) models. It usually needs to be processed with feature engineering techniques before it can support accurate, optimized models. Take this course to learn techniques that make data compatible with machine learning models and improve their performance. Investigate techniques used to improve data usability, such as one-hot encoding, binning, transformations, scaling, and shuffling. You will also learn about the importance and usage of text feature engineering and the major workflows in the AWS environment. After completing this course, you'll be able to implement feature engineering techniques using AWS workflows, further preparing you for the AWS Certified Machine Learning – Specialty certification exam.
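
For a rough sense of what these techniques do, the following is a minimal sketch in plain Python. It is not taken from the course materials: the helper names (one_hot, bin_values, min_max_scale, shuffle_rows) and the sample values are hypothetical, and min-max scaling is used here as just one common form of scaling.

```python
import random


def one_hot(values):
    """Turn each category into a 0/1 indicator vector (columns in sorted order)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


def bin_values(values, edges, labels):
    """Binning: replace each number with the label of the first bin whose
    upper edge it does not exceed; values above every edge get the last label.
    Expects len(labels) == len(edges) + 1."""
    binned = []
    for v in values:
        for edge, label in zip(edges, labels):
            if v <= edge:
                binned.append(label)
                break
        else:
            binned.append(labels[-1])
    return binned


def min_max_scale(values):
    """Rescale numbers linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def shuffle_rows(rows, seed=0):
    """Return a shuffled copy so that the original row order cannot bias training."""
    rng = random.Random(seed)  # fixed seed keeps the shuffle reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    return shuffled


# Hypothetical sample data, for illustration only.
colors = ["red", "green", "red", "blue"]
ages = [20, 35, 50, 80]

encoded = one_hot(colors)                                    # columns: blue, green, red
age_groups = bin_values(ages, [30, 60], ["low", "mid", "high"])
scaled_ages = min_max_scale(ages)
shuffled = shuffle_rows(list(zip(colors, ages)))
```

In practice these steps are usually carried out with a data processing library or a managed service rather than hand-written helpers; the sketch only shows the transformation each technique performs.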



Objectives

AWS Certified Machine Learning: Feature Engineering Techniques

  • discover the key concepts covered in this course
  • describe how to perform one-hot encoding and its main purpose
  • define binning and discretization as the process of transforming numerical variables into categorical counterparts
  • outline how data transformation can be used to make data more useful for data analysis
  • define data scaling and normalization and describe why it is important to standardize independent variables
  • outline data shuffling and define its role in removing biases and building more robust training models
  • work with commonly used feature engineering techniques on real data
  • recognize the basic principles behind text feature engineering
  • describe the process of term frequency-inverse document frequency (TF-IDF) and its uses in text mining
  • describe the bag-of-words model and compare it to TF-IDF
  • describe the concept of n-grams and why they are used in machine learning
  • use Spark and EMR workflows to prepare data for a TF-IDF problem
  • summarize the key concepts covered in this course
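
The text feature engineering objectives above can be illustrated with a short sketch in plain Python. It is an assumption-laden example, not the course's Spark and EMR workflow: the function names and the three sample sentences are hypothetical, and the TF-IDF weighting shown (term frequency times the unsmoothed logarithm of inverse document frequency) is only one of several common variants.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined into one string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_words(doc, n=1):
    """Bag-of-words: count each n-gram in a document, ignoring word order."""
    return Counter(ngrams(doc.lower().split(), n))


def tf_idf(docs, n=1):
    """Weight each term by its frequency in the document, discounted by the
    number of documents that contain it (idf = log(N / document frequency))."""
    bags = [bag_of_words(d, n) for d in docs]
    doc_freq = Counter(term for bag in bags for term in bag)
    scores = []
    for bag in bags:
        total = sum(bag.values())
        scores.append({
            term: (count / total) * math.log(len(docs) / doc_freq[term])
            for term, count in bag.items()
        })
    return scores


# Hypothetical sample corpus, for illustration only.
docs = ["the cat sat", "the dog sat", "the cat ran"]

counts = bag_of_words(docs[0])                 # raw counts: every word weighs the same
bigrams = ngrams(docs[0].split(), 2)           # word pairs keep some local word order
scores = tf_idf(docs)                          # "the" scores 0: it appears in every document
```

The comparison is the point of the sketch: bag-of-words gives "the" the same weight as "cat", while TF-IDF pushes terms that occur in every document toward zero and favors terms that are distinctive to a single document.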