Data Analysis and Machine Learning

Code:	MEB15
Acronym:	ADDA
Level:

Scientific Areas
Classification	Scientific Area
OFFICIAL	Informatics

Occurrence: 2023/2024 - 1st Semester

Active?	Yes
Responsible Unit:	Departamento de Sistemas e Informática
Responsible Course/Study Cycle:

Study Cycles/Courses

Acronym	No. of Students	Study Plan	Curricular Years	UCN Credits	ECTS Credits	Contact Hours	Total Hours
MEB	11	Official Plan for academic year 2021	2	-	6	60	162

Teaching - Responsibilities

Teacher	Responsibility
Miguel Angel Guevara López

Teaching - Hours

Theoretical and Practical:	1.00
Practical and Laboratory:	1.50
Seminar:	0.50
Tutorial Guidance:	1.00

Type	Teacher	Classes	Hours
Theoretical and Practical	Total	1	1.00
Theoretical and Practical	Miguel Angel Guevara López		1.00
Practical and Laboratory	Total	1	1.50
Practical and Laboratory	Miguel Angel Guevara López		1.50
Seminar	Total	1	0.50
Seminar	Miguel Angel Guevara López		0.50
Tutorial Guidance	Total	1	1.00
Tutorial Guidance	Miguel Angel Guevara López		1.00

Working language

Portuguese - Suitable for English-speaking students

Objectives

The students will obtain knowledge, skills, and proficiency to:

Recognize the specific challenges and needs of data analysis.

Collect (capture), explore, clean, munge, and manipulate datasets (Big-data).

Know (study) the fundamentals and possibilities of the application of machine learning techniques.

Implement machine learning models (e.g., regression, classification, decision trees, artificial neural networks) using the Python programming language and available open source tools/libraries.

Develop projects that represent solutions to real practical problems, mainly applications in the area of health data analysis, through the analysis and exploration of public and/or private datasets made available.

Learning outcomes and competences

The course unit "Data Analysis and Machine Learning" aims to give students the knowledge to recognize the specific challenges and needs of data processing and analysis and, in particular, to pave the way for developing modeling capabilities in the Big-data context. Students will learn the criteria for differentiating and selecting classes of algorithms and methods, as well as the assumptions behind their use with emerging machine learning and artificial intelligence (AI) techniques. They will develop theoretical and practical skills in dataset exploration based on the Python programming language, using algorithms and methods implemented in common open source tools and libraries that have been tested and made available by the data science and AI research and development community. Among other topics, the concepts of Supervised Learning, Unsupervised Learning, and Reinforcement Learning will be introduced. In particular, students are expected to be able to apply the data analysis and machine learning techniques they learn to patient data and health records (clinical data, medical image analysis, etc.).

Mode of study

In person

Prerequisites (prior knowledge) and co-requisites (concurrent knowledge)

Basics of Python programming and linear algebra

Syllabus

PART 1: INTRODUCTION
1. Introduction to Data Analysis and Machine Learning (ML)
1.1. What is Data Science (DS)? Why is DS important?
1.2. Analytics Building Blocks
1.3. Data Analysis Examples
1.4. What is ML? Why use ML?
1.5. Machine Learning Framework
1.6. Performance Evaluation
2. Introduction to Python Programming
2.1. Installing Python, Tools for Python
2.2. Control Flow (Conditional Logic, Loops, and Functions)
2.3. Python Collections
2.4. Introduction to NumPy and Pandas
3. Getting and Working with Data
3.1. Capturing Data
3.2. Feature Extraction and Transformation
3.3. Dimension reduction
3.4. Clustering

PART 2: ALGORITHMS AND METHODS
4. Machine Learning Techniques
4.1. Understanding ML (Problems, Goals, Challenges)
4.2. Python Tools/Libraries
4.3. Types of Learning
4.4. Supervised Learning
4.4.1. Classification
4.4.2. Training Models (Regression and Logistic Regression Approaches)
4.4.3. Support Vector Machines
4.4.4. Decision Trees
4.4.5. Random Forest
4.4.6. Ensemble Learning
4.4.7. Dimensionality Reduction
4.5. Unsupervised Learning
4.5.1. Clustering
4.5.2. Gaussian Mixtures
5. Neural Networks and Deep Learning
5.1. Introduction to Artificial Neural Networks
5.2. Training Deep Neural Networks
5.3. Custom Models
5.4. Loading and Preprocessing Data
5.5. Convolutional Networks (Computer Vision)
5.6. Processing Sequences (RNNs and CNNs)
5.7. Natural Language Processing
5.8. Representation and Generative Learning (Autoencoders and Generative Adversarial Networks)

PART 3: APPLICATIONS
6. Examples of Applications
6.1. Classification
6.2. Regression
6.3. Clustering

Mandatory Bibliography

Ethem Alpaydın; Introduction to Machine Learning, Second Edition, MIT Press, 2010
Stuart Russell, Peter Norvig; Artificial Intelligence: A Modern Approach, 4th Edition, 2021. ISBN: 978-0134610993
Andriy Burkov; The Hundred-Page Machine Learning Book, 2019. ISBN: 978-1999579500
Sebastian Raschka, Vahid Mirjalili; Python Machine Learning, Third Edition, Packt Publishing, 2019
M. Mohri, A. Rostamizadeh, A. Talwalkar; Foundations of Machine Learning, Second Edition, MIT Press, 2018

Complementary Bibliography

Ian Goodfellow, Yoshua Bengio, Aaron Courville; Deep Learning, 2016. ISBN: 978-0262035613
Max Kuhn, Kjell Johnson; Applied Predictive Modeling, Springer, 2016. ISBN: 978-1-4614-6849-3 (eBook)
Trevor Hastie, Robert Tibshirani, Jerome Friedman; The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2009. ISBN: 978-0387848570

Teaching methods and learning activities

Teaching will have 3 major components:

Theoretical-practical classes - partially expository, with intensive use of supervised exercise solving, analysis of case studies, and two seminars on specific topics, which will take place entirely remotely.

Laboratory classes - for supervised execution and individual assessment of practical work in a computing environment for personal computers, internet and mobile devices.

Tutorial guidance - for personalized follow-up of seminar preparation and of projects carried out remotely.

Files with the laboratory exercise material will be made available for students to work through autonomously (asynchronous regime). Monitoring takes place by videoconference at the scheduled time, and synchronous classes (also by videoconference) are used to answer questions and follow up with students individually.

All types of classes (OT, TP and PL), as well as the seminars, may be held remotely, since the materials, resources, tools, and teacher training in this computer science discipline readily allow it. Laboratory and project work may be done individually or in groups of 2 or 3 students, upon registration and approval by the teacher.

Software

Anaconda Distribution
Machine learning libraries (modules) for Python

Type of assessment

Distributed evaluation without final exam

Assessment Components

Designation	Weight (%)
Presentation/discussion of a scientific paper	20.00
Test	30.00
Laboratory work	50.00
Total:	100.00

Workload Components

Designation	Time (Hours)
Presentation/discussion of a scientific paper	20.00
Project development	40.00
Written work	30.00
Laboratory work	10.00
Research work	40.00
Total:	140.00

Eligibility for exams

The evaluation will include all components, namely:

Two seminars, for which students prepare their presentations autonomously. Theoretical knowledge and the ability to apply it to specific cases will be evaluated.

A test or exam.

The 4 best lab assignments (supervised), which evaluate practical implementation skills.

An individual project, which evaluates the ability to work independently.

The laboratory work and project can be performed individually or in groups of 2 or 3 students, upon registration and approval from the professor.

The evaluation is distributed, with the final mark calculated by the formula:

TP (50%) + PL (50%)

TP = 30% test + 20% (2 Seminars - 10% each)
PL = 10% best 4 lab works + 40% final project.

Calculation formula of final grade

The evaluation is distributed, with the final mark calculated by the formula:
TP (50%) + PL (50%)

Final mark = 30% tests or exams + 20% seminars (2 seminars, 10% each) + 10% best 4 lab works + 40% final project.
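
As an illustration only, the weighting above can be sketched in Python. The function name `final_mark`, the 0-20 marking scale, and the choice to average the 4 best lab marks before applying the 10% weight are assumptions of this sketch; the page states only the weights.

```python
def final_mark(test, seminars, labs, project, best_n=4):
    """Combine component marks into a final mark using the stated weights.

    Assumptions (not stated on the course page): all marks share one scale
    (e.g. 0-20), and the best `best_n` lab marks are averaged before the
    10% weight is applied.
    """
    if len(seminars) != 2:
        raise ValueError("expected exactly two seminar marks")
    if len(labs) < best_n:
        raise ValueError(f"expected at least {best_n} lab marks")

    # Keep only the highest-scoring lab assignments.
    best_labs = sorted(labs, reverse=True)[:best_n]
    lab_mark = sum(best_labs) / best_n

    # TP component: 30% test + 10% for each of the two seminars.
    tp = 0.30 * test + 0.10 * seminars[0] + 0.10 * seminars[1]
    # PL component: 10% best lab works + 40% final project.
    pl = 0.10 * lab_mark + 0.40 * project
    return tp + pl


if __name__ == "__main__":
    # Hypothetical marks, for illustration only.
    mark = final_mark(test=10, seminars=[12, 14],
                      labs=[8, 16, 18, 12, 14, 10], project=15)
    print(round(mark, 2))
```

In this sketch, submitting fewer than four lab assignments raises an error rather than counting the missing ones as zero, since the page does not say how that case is handled.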
