Trustworthy AI and Data Analysis 

Name of the subject in Hungarian: Megbízható mesterséges intelligencia és adatelemzés

Last updated: March 1, 2024

Budapest University of Technology and Economics
Faculty of Electrical Engineering and Informatics
MSc in Computer Engineering
Data Science and Artificial Intelligence specialization
Course ID   Semester   Assessment   Credit   Subject semester
VIMIMB10               2/1/0/v      5
3. Course coordinator and department Dr. László Gönczy,
4. Instructors

Dr. László Gönczy associate professor, MIT

Dr. Péter Antal  associate professor, MIT
5. Required knowledge Basic statistical knowledge, data structures, algorithms, fundamental concepts of artificial intelligence.
6. Pre-requisites
Mandatory:
NEM
(TárgyEredmény( "BMEVITMMA05", "jegy" , _ ) >= 2
VAGY
TárgyEredmény("BMEVITMMA05", "FELVETEL", AktualisFelev()) > 0)

The form above is specific to the Neptun system; for technical reasons it has been left unchanged.

The mandatory prerequisite requirements can be found on the website and in the curriculum of the given program.

Recommended:
Mathematical statistics.
7. Objectives, learning outcomes and obtained knowledge

The results of artificial intelligence, machine learning, and data analytics are increasingly used in real-life applications as services embedded in complex IT systems. The operational safety of these systems, however, is often not addressed: their correct functioning is typically not guaranteed, there are no standardized development and testing methods, their robustness is not ensured, and they are not protected against accidental or malicious input errors. At the same time, a wide range of research and regulatory activity aims to improve reliability, and it has led to new ethical, legal, technological, and theoretical approaches to managing societal-level risks.

The objective of this course is to introduce the approaches, concepts, and engineering best practices of trustworthy data analysis, machine learning, and artificial intelligence. The course will also review issues related to the integration of intelligent algorithms into IT systems, methods for data-driven solutions to technical problems, and integration of these into development/operations processes. 

The course will introduce the human-centered approach to data analytics and artificial intelligence at a societal level, its ethical background, legal regulation, its representation in standards, and its implementation in engineering practice. For both data analytics and AI, it will present the potential and limitations of interpretability, explainability, testability, and sensitivity analysis. It will also describe the comprehensive formalization of the data analysis workflow and of the lifecycle of creating an AI service or product, in particular validated documentation, the potential use of blockchain tools, and the auditing of results.
8. Synopsis Detailed topics of the lectures:
  1. Fundamental concepts of trustworthiness in data analysis and Artificial Intelligence. Approaches to reliable data analysis and artificial intelligence, human-centered artificial intelligence. Ethical background of analysis and AI, legal regulations, standardization, and integration in engineering best practices.
  2. Data quality and veracity. Validation of input datasets: goals and applications of exploratory data analysis. Measurement of data quality, data processing, tidy data, ETL / ELT frameworks, automated data processing and visualization. Use of engineering assumptions in data analysis: considering causal, temporal, and topological relationships.
  3. Understanding and explainability of data through data visualization: comparison, trend analysis, outlier detection, determining relationships, clustering. Use cases of visualization and their supporting technologies: monitoring/dashboard, business reporting, evaluation of alternatives/hypotheses, reproducible research.
  4. Evaluation, testing, and assurance of data analysis and machine learning models: defining performance metrics, evaluating alternatives, visual support for evaluating results and parameterization. Sensitivity analysis, examination of variable importance.
  5. Data analysis lifecycle. Cloud-based systems. Application of blockchain in the data sharing process.
  6. Use of qualitative models to describe the construction and changes of reliable systems. Validation of qualitative models/model details based on measured data.
  7. Data-driven model building: methods and applications of process mining: model building, conformance checking, log analysis, fraud detection. Parameterization of business rule systems based on data, rule mining.
  8. Use of intelligent learning methods in critical systems. Application of fault-tolerant patterns. Test generation for AI services.
  9. Reliable and explainable artificial intelligence: black and white box approaches. Probabilistic and causal models.
  10. Reliable probabilistic, causal, decision-theoretic, and counterfactual reasoning.
  11. Interpretable AI models in the formalization of AI: explainability, utility, fairness.
  12. Lifecycle of white box models, auditing, evaluation, and risk analysis of models: ALTAI approach, process of model acceptance/adoption, analytical/hybrid methods, model testing, explanation generation.
  13. Explainability of black box models, model derivation.
  14. Reliable human-machine hybrid systems, "human in the loop" approach, reliable multi-agent systems.
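
As an illustration of the data quality measurement named in lecture topic 2, the following is a minimal sketch and not course material. It assumes a single column in which None marks a missing value; the function name profile_column, the chosen indicators, and the sample sensor readings are all hypothetical.

```python
def profile_column(values, valid_range=None):
    """Return simple data quality indicators for one column.

    None marks a missing value; valid_range is an optional (low, high) pair.
    """
    present = [v for v in values if v is not None]
    report = {
        # Share of non-missing entries in the column.
        "completeness": len(present) / len(values) if values else 0.0,
        # Number of distinct non-missing values.
        "distinct": len(set(present)),
    }
    if valid_range is not None:
        low, high = valid_range
        # Count values that violate the assumed engineering range.
        report["out_of_range"] = sum(1 for v in present if not low <= v <= high)
    return report


# Hypothetical temperature readings in degrees Celsius; None is a missing sample.
temperatures = [21.5, 22.0, None, 21.5, 480.0, 20.9]
report = profile_column(temperatures, valid_range=(-40.0, 60.0))
print(report)
```

The range check is one way an engineering assumption (here, a plausible operating range) can be turned into a validation rule for input data.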
Detailed topics of the exercises:
  1. Data quality evaluation, transformation and validation of input data, data profiling.
  2. Visual Exploratory Data Analysis, automated visualization derivation.
  3. Application of process mining in model building and validation.
  4. Test generation for black box testing of AI models.
  5. Sensitivity analysis of models, examination of variable importance: ceteris paribus (CP) profiles, partial dependence plots (PDP), Shapley values, DALEX.
  6. Derivation of interpretable models, representation of dependencies and causal relationships.
  7. Methods of explanation generation, generating logical, probabilistic, and causal explanations.
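
The sensitivity analysis and variable importance techniques listed in exercise 5 can be sketched in a few lines. The example below is an assumption-laden illustration in plain Python, not the DALEX API and not the course's own material: the toy model, the synthetic data, and the function names partial_dependence and permutation_importance are hypothetical.

```python
import random
from statistics import mean


def model(row):
    """Hypothetical black-box model: strong effect of x0, weak effect of x1, none of x2."""
    x0, x1, x2 = row
    return 3.0 * x0 + 0.5 * x1


def partial_dependence(predict, data, feature, grid):
    """Average prediction when one feature is forced to each grid value (a PDP curve)."""
    curve = []
    for value in grid:
        preds = []
        for row in data:
            modified = list(row)
            modified[feature] = value
            preds.append(predict(modified))
        curve.append(mean(preds))
    return curve


def permutation_importance(predict, data, targets, feature, rng):
    """Increase in mean squared error after shuffling one feature column."""
    def mse(rows):
        return mean((predict(r) - t) ** 2 for r, t in zip(rows, targets))

    baseline = mse(data)
    column = [row[feature] for row in data]
    rng.shuffle(column)
    shuffled = [row[:feature] + [v] + row[feature + 1:] for row, v in zip(data, column)]
    return mse(shuffled) - baseline


rng = random.Random(0)  # fixed seed so the sketch is reproducible
data = [[rng.random(), rng.random(), rng.random()] for _ in range(200)]
targets = [model(row) for row in data]

pdp_x0 = partial_dependence(model, data, feature=0, grid=[0.0, 0.5, 1.0])
importances = [permutation_importance(model, data, targets, f, rng) for f in range(3)]
print(pdp_x0, importances)
```

Both functions treat the model purely as a callable, so the same sketch applies to any black-box predictor. Shapley-based attributions need more machinery and are left out here.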
9. Method of instruction 2 hours of lectures per week, 1 hour of practice
10. Assessment

During the semester:

Homework assignment, where students are required to complete an individual task on a topic and dataset agreed upon with the instructors.

During the exam period:

For courses with fewer than 30 students, oral exams are held; otherwise, the exam is written. Exams cover both theoretical concepts and their practical application.

11. Recaps Homework can be re-submitted during the repeat period.
12. Consultations Consultation is available by prior arrangement.
13. References, textbooks and resources
  • Presentations, notes, interactive Jupyter Notebooks.
  • Russell, Stuart J., and Peter Norvig. Artificial Intelligence: A Modern Approach. Pearson Education, Inc., 2010.
  • Theus, Martin, and Simon Urbanek. Interactive graphics for data analysis: principles and examples. CRC Press, 2008.
  • Wickham, Hadley; Grolemund, Garrett (2017). R for Data Science : Import, Tidy, Transform, Visualize, and Model Data. Sebastopol, CA: O'Reilly Media. ISBN 978-1491910399. (online)
  • Antal, Péter (ed.). Intelligens adatelemzés. Typotex Kiadó, 2014. (online)
  • Biecek, Przemyslaw, and Tomasz Burzykowski. Explanatory model analysis: Explore, explain and examine predictive models. Chapman and Hall/CRC, 2021. (online)
  • Molnar, Christoph. Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. Second edition, 2022. (online)
14. Required learning hours and assignment
Contact hours (lectures)        42
Study during the semester       21
Preparation for midterm test
Preparation of homework         32
Study of written material       15
Preparation for exams           40
Total                           150
15. Syllabus prepared by

Dr. László Gönczy associate professor, MIT

Dr. Péter Antal  associate professor, MIT