Data Mining Methods - Practical Application & Team Work in a Case Study

Gain practical experience in building and assessing reliable data mining models by applying the CRISP-DM model in detail. Get hands-on experience by becoming familiar with many different methods to improve the quality of the data and to identify the best method.

Course Factsheet

Degree: University Certificate in Advanced Computational Data Analytics
Pace: Supportive lectures and tutorials in online class; additional self-paced learning
Certificate: University certificate ≙ 10 ECTS
Duration: April – July
Certification: Online exam
Time Commitment: 6-8 hours per week
Price: 2.950 Euro plus tax
Funding: Educational leave possible

Course Description

After this course you will be able to build and assess a good data mining model. Following the well-known modeling path of CRISP-DM, you will become familiar with all the necessary steps. Starting with assessing and improving the quality of the variables, the course then discusses various techniques such as imputation of missing values, transformation and balancing. Besides classification methods such as trees, you will dive into factor analysis and cluster analysis.

Understanding the details and applying the methods does not mean you need to tackle all the mathematical parameter adjustments. The course is a good mixture of practically needed knowledge and its application. Working together in international teams will help you gain a deep understanding while cycling through the different steps of the data mining modelling procedure. A concrete use case will make the learning experience effective. In addition to gaining systems knowledge and knowledge of applications, you will be able to precisely and professionally enumerate the pros and cons of various methods.

You will learn in real-time lectures with professors via video conference with direct & personal feedback. Additionally, the material in the eLearning system will support your learning process.

The ECTS system is employed to ensure international recognition of students' academic achievements. The credits awarded in this course can be recognized by other universities and can substitute for modules in study programmes.

The learning outcomes are:

  • distinguish supervised and unsupervised methods
  • components of the Cross-Industry Standard Process for Data Mining (CRISP-DM) model
  • details of each step to build a reliable model in data mining
  • identify and clean up possible problems in the data
  • becoming familiar with missing value imputation methods
  • feature selection vs. feature reduction
  • scaling, transforming features and applying power transformations such as Box-Cox and Yeo-Johnson
  • assessing the quality of data models using confusion matrices and quality measures
  • the concept of splitting datasets into training, test and validation sets, as well as more complex resampling techniques such as bootstrap and k-fold cross-validation
  • algorithms such as trees, support vector machines, factor analysis and cluster analysis
  • dealing with a complex case study to apply the acquired knowledge
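
To give a flavour of two of the outcomes above, here is a minimal sketch in plain Python of a binary confusion matrix with a few common quality measures, and of how k-fold cross-validation partitions a dataset. It is an illustration only, not course material: the function names (confusion_counts, quality_measures, k_fold_indices) and the sample labels are assumptions made for this example, and the folds are built without shuffling to keep the logic easy to follow.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification problem."""
    counts = Counter()
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["tp" if actual == positive else "fp"] += 1
        else:
            counts["fn" if actual == positive else "tn"] += 1
    return counts["tp"], counts["fp"], counts["fn"], counts["tn"]


def quality_measures(tp, fp, fn, tn):
    """Derive accuracy, precision and recall from the confusion counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Every sample lands in exactly one test fold. Real projects usually
    shuffle the data first; that step is left out here (assumption).
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


if __name__ == "__main__":
    # Hypothetical labels, invented purely for this illustration.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
    print(quality_measures(*confusion_counts(y_true, y_pred)))
    for train, test in k_fold_indices(n_samples=5, k=2):
        print("train:", train, "test:", test)
```

In a full workflow, a model would be fitted on each training fold and the quality measures averaged over the test folds; the sketch stops at the bookkeeping so that it does not depend on any particular modelling library.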