Media Informatics Systems

Name of the subject in Hungarian: Médiainformatikai rendszerek

Last updated: 26 August 2018

Budapest University of Technology and Economics
Faculty of Electrical Engineering and Informatics
Course ID   Semester   Assessment   Credit   Course semester
VITMMA08    2          2/1/0/v      4
3. Course coordinator and department Dr. Mihajlik Péter,
4. Instructors

 Name                      Position               University, Department
 Dr. Gábor MAGYAR PhD      Associate Professor    BME-TMIT
 Dr. Péter MIHAJLIK PhD    Lecturer               BME-TMIT
 Dr. Gábor SZŰCS PhD       Associate Professor    BME-TMIT

5. Required knowledge Media information technologies and tools
7. Objectives, learning outcomes and obtained knowledge
The aim of the course is to present the fundamentals of digital multimedia content management and to introduce applied pattern recognition and analytic techniques. Students learn about recognition and categorization problems and about the standards for descriptive attributes of multimedia content. By the end of the semester, students will be able to understand and carry out engineering tasks related to media informatics systems using the required technologies and tools.
8. Synopsis

 

Week 1:
Basic definitions. Media content management. Main topics of multimedia (image, voice, video) processing.

 

Week 2:
Processing audio contents. Short Time Fourier Spectrum, windowing, spectrogram computation. The fundamentals of signal detection.
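
As an illustration of the windowing and spectrogram computation named above, the following is a minimal sketch and not course material. It uses a direct DFT from the Python standard library for clarity rather than speed; the function names, the frame length and the hop size are assumptions chosen for the example.

```python
import cmath
import math


def hann_window(length):
    """Return a symmetric Hann window of the given length."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]


def spectrogram(signal, frame_len=64, hop=32):
    """Magnitude spectrogram: one list of |DFT| bins per windowed frame."""
    window = hann_window(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        # Cut out one frame and taper it with the window.
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        # Direct DFT over the non-negative frequency bins (0 .. frame_len / 2).
        bins = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            bins.append(abs(acc))
        frames.append(bins)
    return frames


# Toy input (hypothetical): a 128 Hz sinusoid sampled at 1024 Hz.
# With 64-sample frames the bin spacing is 16 Hz, so the tone sits on bin 8.
sample_rate = 1024
tone = [math.sin(2 * math.pi * 128 * t / sample_rate) for t in range(256)]
spec = spectrogram(tone)
peak_bins = [max(range(len(f)), key=f.__getitem__) for f in spec]
```

Each row of `spec` is the short-time spectrum of one frame; in practice an FFT routine would replace the inner loop.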

 

Week 3:
Music recognition. Challenges in real-time pattern matching: additive noise, linear and non-linear distortions. Audio fingerprinting. Case study: Shazam.

 

Week 4:
Recognition of variable sound and image signals. Statistics based general classification. Probability density function, likelihood, training and test. Bayes' theorem.
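
The link between likelihood, training and Bayes' theorem can be sketched with a one-dimensional Gaussian classifier. This is a minimal illustration under stated assumptions, not the course's own material: the class labels, feature values and function names are hypothetical.

```python
import math


def fit_gaussian(samples):
    """Maximum-likelihood mean and variance of 1-D training samples."""
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / len(samples)
    return mean, var


def gaussian_pdf(x, mean, var):
    """Probability density of a 1-D Gaussian at x."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def posteriors(x, models, priors):
    """Bayes' theorem: P(class | x) = p(x | class) * P(class) / p(x)."""
    joint = {c: gaussian_pdf(x, *models[c]) * priors[c] for c in models}
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}


# Hypothetical 1-D feature values for two classes (training set).
train = {
    "speech": [1.0, 1.2, 0.8, 1.1, 0.9],
    "music": [3.0, 3.3, 2.7, 3.1, 2.9],
}
models = {c: fit_gaussian(xs) for c, xs in train.items()}
total = sum(len(xs) for xs in train.values())
priors = {c: len(xs) / total for c, xs in train.items()}

# Classify an unseen test value by the largest posterior probability.
post = posteriors(1.3, models, priors)
predicted = max(post, key=post.get)
```

Here the test value 1.3 lies close to the "speech" training samples, so that class receives the larger posterior.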

 

Week 5:
Multimedia classification tasks based on multivariate Gaussian Mixture Models.

 

Week 6:
Maximum Likelihood vs. Discriminative Training – theoretical and practical issues regarding audio and image data. The application of Multilayer Perceptrons to media informatics tasks.

 

Week 7:
Processing time-varying signals: Dynamic Time Warping, Hidden Markov Models and their practical implementations. Fundamentals of speech recognition.
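
Dynamic Time Warping can be illustrated with the classic dynamic-programming recursion. The sketch below is a minimal example, assuming one-dimensional sequences and an absolute-difference local cost; the function name and the toy sequences are hypothetical.

```python
def dtw_distance(a, b):
    """Accumulated DTW cost between two 1-D sequences (absolute-difference local cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


template = [0, 1, 2, 3, 2, 1, 0]
slow = [0, 0, 1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 0, 0]  # same shape at half speed
other = [3, 2, 1, 0, 1, 2, 3]                      # different shape

d_slow = dtw_distance(template, slow)
d_other = dtw_distance(template, other)
```

The time-stretched copy aligns with the template at zero cost, while the differently shaped sequence does not, which is the property that makes DTW useful for signals of varying speed.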

 

Week 8:
Contemporary speech recognition technologies. Acoustic, pronunciation and language models. Subtitling methods and standards. Case studies: BBC and the Hungarian National Television subtitling approaches.

 

Week 9:
Deep learning: deep feed-forward neural nets, convolutional nets and recurrent networks, and their applications to the automatic annotation of media content (text, sound, image and video).

 

Week 10:
Complex tasks based on multimedia technologies: shape detection in images, object tracking in videos. Face detection and recognition solutions. Video processing methods in practice: hand gesture recognition in videos.

 

Week 11:
Metadata: semantic and descriptive metadata. Multimedia metadata standards. EBU/SMPTE metadata, Dublin Core, Material Exchange Format (MXF).

 

Week 12:
Multimedia databases. Multimedia information retrieval. Search modes, types, algorithms. Quality measurement of an information retrieval system.
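
Quality measurement of a retrieval system is commonly expressed with precision, recall and average precision. The following is a minimal sketch of these standard measures; the function names and the clip identifiers are hypothetical, and the syllabus does not state which measures the course actually covers.

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against the relevant set."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant item appears."""
    relevant_set = set(relevant)
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant_set:
            hits += 1
            total += hits / rank
    return total / len(relevant_set) if relevant_set else 0.0


# Hypothetical ranked result list and ground-truth relevant items.
ranked = ["clip7", "clip2", "clip9", "clip4", "clip1"]
relevant = ["clip7", "clip9", "clip3"]

p, r = precision_recall(ranked, relevant)
ap = average_precision(ranked, relevant)
```

Two of the five returned clips are relevant, so precision is 2/5 and recall is 2/3; average precision additionally rewards placing the relevant clips near the top of the ranking.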

 

Week 13:
Digital Media Management Systems (DMMS) / Multimedia Asset Management. Structure of the systems: gathering, storing and displaying subsystems. Lifecycle attributes. Integration tools. Architecture of media content management systems and their types: DAM, DM, KM, Web CMS, ECM. Overall model of the systems.

 

Week 14:
Digital archiving: tasks, approaches, techniques. Archiving strategies: on-line, near-line, off-line and off-site accessibility levels. Record management.

 

The practice sessions will cover, among others and not necessarily in full, the following topics: audio feature extraction, audio fingerprinting, GMM, automatic speech recognition, language modeling, applied deep neural nets in Keras, image annotation tasks, multimedia retrieval, visualization tools, video processing.

9. Method of instruction 2×45 min lecture and 1×45 min seminar per week (90 minutes biweekly).
10. Assessment
Requirements:
- Mid-term (written) test
Exam period:
- Exam
11. Recaps The mid-term test can be retaken once during the teaching period, and one final retake is available in the official recap period.
12. Consultations In person, by appointment arranged via e-mail
13. References, textbooks and resources
David Austerberry: Digital Asset Management, Focal Press, 2006.
Serkan Kiranyaz, Moncef Gabbouj: Content-Based Management of Multimedia Databases: Advanced Techniques for Multimedia Analysis and Retrieval, LAP LAMBERT Academic Publishing, 2012.
Altrichter Márta, Horváth Gábor, Pataki Béla, Strausz György, Takács Gábor, Valyon József: Neurális hálózatok (in Hungarian), Panem Könyvkiadó Kft., Budapest, 2006.
Michael Nielsen: Neural Networks and Deep Learning, 2016. Online: http://neuralnetworksanddeeplearning.com/
L. Rabiner, B.-H. Juang: Fundamentals of Speech Recognition, Prentice Hall, New Jersey, 1993.

14. Required learning hours and assignment
Lessons                      42
Preparing for the lessons    18
Preparing for the test       25
Homework
Preparing for the exam       35
Total                        120
15. Syllabus prepared by

 Name                          Position               University, Department
 Dr. Magyar Gábor PhD          Associate Professor    BME-TMIT
 Dr. Szűcs Gábor PhD           Associate Professor    BME-TMIT
 Dr. Mihajlik Péter PhD        Lecturer               BME-TMIT
 Dr. Gyires-Tóth Bálint PhD    Lecturer               BME-TMIT