Big Data & Data Science training

Duration: 3 days

Introduction: Big Data & Data Science

Refresher on linear algebra (vectors and matrices) and statistics

Machine Learning

  • Definition and history
  • Examples of machine learning applications
  • Modeling a problem in Machine Learning
  • Types of learning (Supervised / Unsupervised)

Steps in learning

  • Choice of model
  • Training: estimating the model parameters
  • Overfitting
  • Validation, cross-validation, and testing
  • Model comparison criteria
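
As an illustration of the cross-validation step listed above, here is a minimal sketch of how a dataset might be split into K folds. The function name `k_fold_indices` and the sample counts are assumptions made for this example, not part of the course material; in practice a library such as scikit-learn would typically handle this.

```python
def k_fold_indices(n_samples, k):
    """Return k (train_indices, validation_indices) pairs covering range(n_samples)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if i < extra else 0)
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, validation))
        start += size
    return folds


# Hypothetical example: 10 samples split into 3 folds.
folds = k_fold_indices(10, 3)
for train_idx, val_idx in folds:
    print("train:", train_idx, "validation:", val_idx)
```

Each sample appears in exactly one validation fold, so every observation is used once to evaluate a model it was not trained on.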

Getting started with Python

  • The Jupyter Notebook
  • Introduction to Python Programming
  • Basic structures and operations in Python
  • LAB 1: Getting started with Python
  • Data retrieval
  • Exploration and preprocessing of data (use of the Pandas and NumPy libraries)
  • Visualization of data (use of the Matplotlib library)
  • LAB 2: Data exploration and pre-processing

Learning algorithms

Regression

  • Use case: Prediction of selling prices of houses
  • Regression metrics
  • Linear regression
  • Principle and operation
  • Cost / loss function
  • Optimization (gradient descent algorithm)
  • LAB 3: Linear regression
  • Multiple, Ridge, and Lasso regression
  • LAB 4: Multiple regression, Ridge & Lasso regression
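
To make the cost function and gradient descent items above concrete, here is a minimal sketch of simple linear regression fitted by gradient descent on the mean squared error. The function names, the learning rate, the iteration count, and the toy data are all assumptions chosen for illustration; they are not taken from the course labs.

```python
def mean_squared_error(xs, ys, w, b):
    """Cost function: average squared difference between predictions and targets."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_linear_regression(xs, ys, learning_rate=0.05, n_iterations=2000):
    """Fit y ~ w * x + b by batch gradient descent on the mean squared error."""
    n = len(xs)
    w, b = 0.0, 0.0
    for _ in range(n_iterations):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical toy data generated from y = 2x + 1 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear_regression(xs, ys)
print("slope:", round(w, 4), "intercept:", round(b, 4))
print("final cost:", mean_squared_error(xs, ys, w, b))
```

The learning rate here was picked to suit this small dataset; a rate that is too large can make the updates diverge, which is one reason features are often scaled before training.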

Classification

  • Use case: Detection of spam emails
  • Classification metrics
  • Logistic Regression
  • SVM (Support Vector Machine)
  • LAB 5: Logistic Regression & SVM
  • Decision trees
  • Random forests
  • LAB 6: Decision Trees and Random Forests
  • K-NN (K-nearest neighbors)
  • LAB 7: K-NN
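
As one possible illustration of the K-NN item above, the sketch below classifies a point by majority vote among its K nearest training points, using Euclidean distance. The function name `knn_predict`, the two-feature points, and the "ham"/"spam" labels are hypothetical and chosen only to echo the spam-detection use case.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    # Pair each training point's distance to the query with its label, nearest first.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training data for illustration.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
train_labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

print(knn_predict(train_points, train_labels, (1.1, 1.0)))
print(knn_predict(train_points, train_labels, (5.0, 5.1)))
```

K-NN has no training phase beyond storing the data, so the choice of K and of the distance measure largely determines its behavior.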

Segmentation & clustering

  • Use case: Segmentation of articles
  • Distance measures and K-means
  • LAB 8: K-means
  • Spectral clustering
  • Hierarchical clustering
  • LAB 9: Spectral & Hierarchical clustering
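
To illustrate the K-means item above, here is a minimal sketch of the usual assign-then-update loop (Lloyd's algorithm) with Euclidean distance. The function names, the fixed initial centroids, and the toy points are assumptions for this example; real implementations normally choose the initial centroids randomly or with a seeding scheme.

```python
import math


def nearest_index(point, centroids):
    """Index of the centroid closest to `point` (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def k_means(points, initial_centroids, n_iterations=10):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(n_iterations):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest_index(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        new_centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else c
            for c, members in zip(centroids, clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    labels = [nearest_index(p, centroids) for p in points]
    return centroids, labels


# Hypothetical toy data: two visibly separated groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = k_means(points, initial_centroids=[(0.0, 0.0), (10.0, 10.0)])
print("centroids:", centroids)
print("labels:", labels)
```

The result depends on the starting centroids, so in practice the algorithm is often run several times and the run with the lowest within-cluster distance is kept.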

Recommendation Systems

  • Use case: Recommendation system for an e-commerce site
  • Content-based filtering
  • LAB 10: Content-based filtering
  • Collaborative filtering
  • LAB 11: Collaborative filtering

Dimensionality reduction

  • Use case: the Iris dataset
  • PCA (Principal Component Analysis)

Deep learning

  • Use case: Image classification
  • Multilayer perceptron neural networks
  • Convolutional neural networks
  • Recurrent neural networks
  • Autoencoder networks

Challenges and outlook