Budapest University of Technology and Economics, Faculty of Electrical Engineering and Informatics


    Intelligent Data Analysis and Decision Support

    A tantárgy neve magyarul / Name of the subject in Hungarian: Intelligens adatelemzés és döntéstámogatás

    Last updated: 1 March 2024

    Budapest University of Technology and Economics
    Faculty of Electrical Engineering and Informatics
    MSc in Computer Engineering
    Data Science and Artificial Intelligence specialization
    Course ID: VIMIMB09    Assessment: 2/1/0/v    Credit: 5
    3. Course coordinator and department: Dr. Péter Antal, MIT
    4. Instructors

    Dr. Péter Antal, associate professor, MIT

    Dr. Gábor Szűcs, associate professor, TMIT
    5. Required knowledge: data science, machine learning, foundations of artificial intelligence.
    6. Pre-requisites
    Recommended:
    Statistics
    7. Objectives, learning outcomes and obtained knowledge

    Intelligent Data Analysis and Decision Support presents advanced approaches at the forefront of machine learning and deep learning research, helping to solve a wider range of real-world engineering problems. We first review Bayesian statistical and decision-theoretic frameworks, which provide a unified basis for using background knowledge, dealing with incomplete and uncertain data, applying complex models and intelligent forms of inference, and collecting data adaptively.
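
    To make the decision-theoretic starting point concrete, the short Python sketch below computes the expected utility of two actions under a posterior and the expected value of perfect information; the action names, probabilities, and utilities are invented solely for illustration.

    # Illustrative sketch: expected-utility decisions and the value of perfect
    # information for a binary state; all names and numbers are assumptions.
    p_state = {"disease": 0.2, "healthy": 0.8}              # posterior over the state
    utility = {                                             # utility(action, state)
        ("treat", "disease"): 90, ("treat", "healthy"): -10,
        ("wait", "disease"): -50, ("wait", "healthy"): 100,
    }

    def expected_utility(action):
        return sum(p_state[s] * utility[(action, s)] for s in p_state)

    # Optimal decision without further information
    best_action = max(["treat", "wait"], key=expected_utility)
    eu_prior = expected_utility(best_action)

    # Expected value of perfect information: decide after observing the state
    eu_perfect = sum(p_state[s] * max(utility[(a, s)] for a in ["treat", "wait"])
                     for s in p_state)
    print(best_action, eu_prior, eu_perfect - eu_prior)     # optimal action, its EU (~70), EVPI (~28)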

    Among intelligent data analysis methods, we present techniques that, applied as preprocessing steps, can improve the efficiency and quality of the analysis. Dimension reduction and representation learning methods improve efficiency, the latter providing a more abstract solution, and clustering is an important part of the data analysis process. The performance of machine learning methods for data analysis can be improved by ensemble machine learning methods, and more robust performance on real test sets can be achieved by regularization. We describe in detail data-driven decision support with these machine learning methods and the process of evaluating the resulting decisions, and we demonstrate their use in practice on different types of data (simple, hierarchical, time-series, unstructured).
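
    As a minimal illustration of the regularization techniques mentioned above (ridge, lasso, elastic net), the sketch below fits the three models to synthetic data; it assumes scikit-learn and NumPy are available, and the data and penalty settings are arbitrary.

    # Ridge, lasso and elastic net on synthetic sparse-coefficient data
    # (illustrative only; penalty strengths are not tuned).
    import numpy as np
    from sklearn.linear_model import Ridge, Lasso, ElasticNet
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import r2_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 20))
    true_coef = np.zeros(20)
    true_coef[:4] = [3.0, -2.0, 1.5, 4.0]                   # sparse ground truth
    y = X @ true_coef + rng.normal(scale=0.5, size=200)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    for name, model in [("ridge", Ridge(alpha=1.0)),
                        ("lasso", Lasso(alpha=0.1)),
                        ("elastic net", ElasticNet(alpha=0.1, l1_ratio=0.5))]:
        model.fit(X_tr, y_tr)
        print(name,
              "R2 =", round(r2_score(y_te, model.predict(X_te)), 3),
              "nonzero coefficients =", int(np.sum(model.coef_ != 0)))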

    We present probabilistic graphical models and the associated decision networks and causal networks, as well as probabilistic, causal, and counterfactual inference methods for handling interventional data and supporting intelligent data mining. We describe approximate computational methods for Bayesian inference, particularly Markov chain Monte Carlo methods. We present modern machine learning methods for causal models, the role of background knowledge in learning, and data and knowledge fusion. Within the framework of adaptive data mining, we present active learning, reinforcement learning, and multi-armed bandits, together with their applications in recommender systems and discovery systems.
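
    The contrast between observational and interventional (do) inference described above can be illustrated with a small hand-coded structural causal model; the sketch below assumes only NumPy and uses made-up coefficients, showing that the observational regression slope is confounded while intervening on X recovers the assumed causal effect of 1.0.

    # A toy linear structural causal model with a confounder Z (illustration only).
    import numpy as np

    rng = np.random.default_rng(1)
    n = 100_000

    def simulate(do_x=None):
        z = rng.normal(size=n)                              # confounder
        x = 2.0 * z + rng.normal(size=n) if do_x is None else np.full(n, do_x)
        y = 1.0 * x + 3.0 * z + rng.normal(size=n)          # true causal effect of X is 1.0
        return x, y

    # Observational slope of Y on X is biased by the confounder (about 2.2) ...
    x_obs, y_obs = simulate()
    slope_obs = np.cov(x_obs, y_obs)[0, 1] / np.var(x_obs)

    # ... while contrasting two interventions do(X=0) and do(X=1) recovers ~1.0.
    _, y_do0 = simulate(do_x=0.0)
    _, y_do1 = simulate(do_x=1.0)
    print(round(slope_obs, 2), round(y_do1.mean() - y_do0.mean(), 2))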

    8. Synopsis

    Detailed topics of the presentations:

    1. Estimation and decision theory, optimal decision and properties of human decisions, types of utility functions. Intelligent inference types: probabilistic, causal and counterfactual inference. Value of information and optimal information gathering strategies.
    2. Intelligent data analysis methods, data analysis on different types of data (tabular, time series, unstructured).
    3. Regression type decision problems. Regularized regression methods: ridge, lasso, elastic net.
    4. Non-linear dimensionality reduction methods (autoencoders, manifold learning). Applications of dimensionality reduction.
    5. Clustering for clustering tasks and as preprocessing for classification problems. Biclustering and spectral clustering methods.
    6. Improving the performance (accuracy) of ML methods. Ensemble (ECOC) machine learning methods.
    7. Types of recommender systems and data analysis methods. Matrix factorization and collaborative filtering in recommender systems.
    8. Data-driven decision support with machine learning models. Decision evaluation process.
    9. Definitions, parametric and structural semantics of probabilistic graphical models, use of sparse representations, inference algorithms, notable classes of models (naive Bayes nets, Hidden Markov Models). Extensions to first-order probabilistic logics and stochastic grammars.
    10. Derivation of causal models, notion of observational equivalence. Modelling interventions using do(.) semantics and graph truncation. The notion of adjustment in estimating causal effects. Counterfactual inference.
    11. Conjugacy and sufficient statistics in exact Bayesian inference. Approximation methods for Bayesian inference: Monte Carlo methods, rejection sampling and importance sampling. Markov chain Monte Carlo (MCMC) methods: convergence and confidence diagnostics, multi-chain methods, Metropolis-coupled MCMC, hybrid MCMC (see the sampler sketch after this list).
    12. Learning causal models from observation and intervention data.  Learning with background knowledge, data and knowledge fusion in learning system models. Bayesian learning of model properties.
    13. Active learning, learning with cost. k-armed bandits, Monte Carlo tree search. Reinforcement learning, deep reinforcement learning. 
    14. Recommender systems, handling noise and informative missingness. Discovery systems, early discovery performance measures, expected utility of experiments, adaptive experiment design.
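
    For topic 11, the following minimal random-walk Metropolis sampler (standard library only) illustrates the basic MCMC mechanism; the standard-normal target and the step size are assumptions chosen for the example.

    # Random-walk Metropolis sampling from an (unnormalized) standard normal target.
    import math
    import random

    def log_target(x):
        return -0.5 * x * x                                 # log-density of N(0, 1), up to a constant

    def metropolis(n_samples=10_000, step=1.0, x0=0.0, seed=42):
        rng = random.Random(seed)
        samples, x, logp = [], x0, log_target(x0)
        for _ in range(n_samples):
            proposal = x + rng.gauss(0.0, step)             # symmetric random-walk proposal
            logp_prop = log_target(proposal)
            if math.log(rng.random()) < logp_prop - logp:   # Metropolis acceptance rule
                x, logp = proposal, logp_prop
            samples.append(x)
        return samples

    draws = metropolis()[1000:]                             # discard a crude burn-in
    print(sum(draws) / len(draws))                          # should be close to 0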

    Detailed topics for exercises:

    1. Decision model construction. Optimal decision and value of information.
    2. Advanced regression exercise in Python
    3. Spectral clustering on images (Python)
    4. Ensemble machine learning methods (computational exercise)
    5. Constructing a causal model. Probabilistic, causal and counterfactual inference testing.
    6. Examination of Markov Chain Monte Carlo methods: Gibbs and hybrid MCMC sampling.
    7. Hyperparameter optimization with k-armed bandits and Monte Carlo tree search with deep learning (see the sketch after this list).
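
    As a rough sketch of exercise 7 (for illustration only, not the official exercise material), the code below uses a UCB1 bandit to allocate noisy evaluations among a few hypothetical learning-rate candidates; the evaluate function is a stand-in for a real validation score.

    # UCB1 multi-armed bandit over hypothetical hyperparameter candidates.
    import math
    import random

    rng = random.Random(0)
    arms = [0.001, 0.01, 0.1]                               # hypothetical learning rates
    assumed_mean = {0.001: 0.70, 0.01: 0.85, 0.1: 0.60}     # made-up mean validation scores

    def evaluate(lr):
        return assumed_mean[lr] + rng.gauss(0.0, 0.05)      # noisy stand-in for a real evaluation

    counts = {lr: 0 for lr in arms}
    sums = {lr: 0.0 for lr in arms}

    for t in range(1, 201):
        untried = [lr for lr in arms if counts[lr] == 0]
        if untried:
            lr = untried[0]                                 # pull every arm once first
        else:                                               # UCB1: mean reward + exploration bonus
            lr = max(arms, key=lambda a: sums[a] / counts[a]
                     + math.sqrt(2 * math.log(t) / counts[a]))
        counts[lr] += 1
        sums[lr] += evaluate(lr)

    print(max(arms, key=lambda a: counts[a]))               # most-pulled arm, likely 0.01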
    9. Method of instruction

    2 hours of lectures per week, 1 hour of practice (computational exercise and computer laboratory exercise).

    10. Assessment

    During teaching period:

    Six bi-weekly homework assignments must be completed for grading. An optional major assignment is also available; its topic and data set require prior approval by the instructor. High-quality submissions of the major assignment may receive a proposed mark from the instructors, which can exempt the student from the exam.

    During the exam period:

    Written exam covering the theoretical and practical material addressed in the course, related to the homework assignments. The passing threshold for the exam is 40%.

    11. Make-up opportunities: Two small homework assignments and the major homework assignment can be made up by the end of the make-up week.
    12. Consultations

    Consultation is available by prior arrangement.

    13. References, textbooks and resources
    • Russell, Stuart J., Peter Norvig: Artificial Intelligence: A Modern Approach. Pearson Education, Inc., 2010.
    • Antal Péter - Antos András - Hajós Gergely - Hullám Gábor - Millinghoffer András, Antal Péter (szerk.), Valószínűségi döntéstámogató rendszerek, ISBN: 978-963-2791-84-5, 2014.
    • Antal Péter - Antos András - Horváth Gábor - Hullám Gábor - Kocsis Imre - Marx Péter - Millinghoffer András - Pataricza András - Salánki Ágnes, Antal Péter (szerk.), Intelligens adatelemzés, ISBN: 978-963-2791-71-5, 2014.
    • Thomas A. Runkler: Data Analytics - Models and Algorithms for Intelligent Data Analysis, 2nd Edition, Springer, 2016.
    14. Required learning hours and assignment
    Contact hours (lectures): 42
    Study during the semester: 28
    Preparation for midterm test:
    Preparation of homework: 40
    Study of written material:
    Preparation for exams: 40
    Total: 150
    15. Syllabus prepared by Dr. Péter Antal, associate professor, MIT
    Dr. Gábor Szűcs, associate professor, TMIT