
Data Analysis and Machine Learning

Code: MEB15     Acronym: ADDA     Level: 1

Scientific Areas
Classification    Scientific Area
OFFICIAL          Informatics

Occurrence: 2023/2024 - 1st Semester

Active? Yes
Responsible Unit: Departamento de Sistemas e Informática
Responsible Course/Study Cycle:

Study Cycles/Courses

Acronym: MEB
No. of Students: 11
Study Plan: Plano Oficial do ano letivo 2021 (official plan for the 2021 academic year)
Curricular Years: 2
UCN Credits: -
ECTS Credits: 6
Contact Hours: 60
Total Hours: 162

Teaching - Responsibilities

Teacher    Responsibility
Miguel Angel Guevara López

Teaching - Hours

Theoretical and Practical: 1.00
Practical and Laboratory: 1.50
Seminar: 0.50
Tutorial Guidance: 1.00

Type                         Teacher                       Classes   Hours
Theoretical and Practical    Totals                        1         1.00
                             Miguel Angel Guevara López              1.00
Practical and Laboratory     Totals                        1         1.50
                             Miguel Angel Guevara López              1.50
Seminar                      Totals                        1         0.50
                             Miguel Angel Guevara López              0.50
Tutorial Guidance            Totals                        1         1.00
                             Miguel Angel Guevara López              1.00

Working language

Portuguese - Suitable for English-speaking students

Objectives

The students will obtain knowledge, skills, and proficiency to:


  1. Recognize the specific challenges and needs of data analysis.

  2. Collect (capture), explore, clean, wrangle, and manipulate datasets, including big data.

  3. Understand the fundamentals of machine learning techniques and the possibilities for applying them.

  4. Implement machine learning models (e.g., regression, classification, decision trees, artificial neural networks), using Python programming language and available (open source) tools/libraries.

  5. Develop projects that represent solutions to real practical problems, mainly applications in the area of health data analysis, through the analysis and exploration of public and/or private datasets made available.

Learning outcomes and competences

The course unit "Data Analysis and Machine Learning" aims to give students the knowledge to recognize the specific challenges and needs of data processing and analysis and, in particular, to pave the way for developing modeling capabilities in the big-data context. Students should understand the criteria for differentiating and selecting classes of algorithms and methods, as well as the assumptions behind their use with emerging machine learning and artificial intelligence (AI) techniques. They will develop theoretical and practical dataset-exploration skills based on the Python programming language and on algorithms and methods established in common open-source tools and libraries, tested and made available by the data science and AI research and development community. Among other topics, the concepts of Supervised Learning, Unsupervised Learning, and Reinforcement Learning will be introduced. In particular, students are expected to be able to apply the data analysis and machine learning techniques covered to patient data and health records (clinical data, medical image analysis, etc.).

Mode of study

In person

Prerequisites (prior knowledge) and co-requisites (concurrent knowledge)

Basics of Python programming and linear algebra

Syllabus

PART 1: INTRODUCTION
1. Introduction to Data Analysis and Machine Learning (ML)
1.1. What is Data Science (DS)? Why is DS important?
1.2. Analytics Building Blocks
1.3. Data Analysis Examples
1.4. What is ML? Why use ML?
1.5. Machine Learning Framework
1.6. Performance Evaluation
2. Introduction to Python Programming
2.1. Installing Python, Tools for Python
2.2. Control Flow (Conditional Logic, Loops, and Functions)
2.3. Python Collections
2.4. Introduction to NumPy and Pandas
3. Getting and Working with Data
3.1. Capturing Data
3.2. Feature Extraction and Transformation
3.3. Dimension Reduction
3.4. Clustering

PART 2: ALGORITHMS AND METHODS
4. Machine Learning Techniques
4.1. Understanding ML (Problems, Goals, Challenges)
4.2. Python Tools/Libraries
4.3. Types of Learning
4.4. Supervised Learning
4.4.1. Classification
4.4.2. Training Models (Regression and Logistic Regression Approaches)
4.4.3. Support Vector Machines
4.4.4. Decision Trees
4.4.5. Random Forest
4.4.6. Ensemble Learning
4.4.7. Dimensionality Reduction
4.5. Unsupervised Learning
4.5.1. Clustering
4.5.2. Gaussian Mixtures
5. Neural Networks and Deep Learning
5.1. Introduction to Artificial Neural Networks
5.2. Training Deep Neural Networks
5.3. Custom Models
5.4. Loading and Preprocessing Data
5.5. Convolutional Networks (Computer Vision)
5.6. Processing Sequences (RNNs and CNNs)
5.7. Natural Language Processing
5.8. Representation and Generative Learning (Autoencoders and Generative Adversarial Networks)

PART 3: APPLICATIONS
6. Examples of Applications
6.1. Classification
6.2. Regression
6.3. Clustering
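
As a small illustration of topics 1.6 (Performance Evaluation) and 6.1 (Classification), the sketch below computes accuracy, precision, recall, and F1 for a binary classifier from lists of true and predicted labels. It is not taken from the course materials: the function names `confusion_counts` and `evaluate` are hypothetical, and plain Python is used only to keep the example dependency-free, whereas the course itself works with Python libraries such as NumPy and Pandas.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 as a dictionary.

    Ratios with an empty denominator are reported as 0.0 (a convention
    chosen for this sketch, not a universal rule).
    """
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative labels (made up for this example, not real data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Reporting precision and recall alongside accuracy matters when classes are imbalanced, which is common in the health datasets this course targets.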

Mandatory Bibliography

Ethem Alpaydın; Introduction to Machine Learning, Second Edition, MIT Press, 2010
Stuart Russell, Peter Norvig; Artificial Intelligence: A Modern Approach, 4th Edition, 2021. ISBN: 978-0134610993
Andriy Burkov; The Hundred-Page Machine Learning Book, 2019. ISBN: 978-1999579500
Sebastian Raschka, Vahid Mirjalili; Python Machine Learning, Third Edition, Packt Publishing, 2019
M. Mohri, A. Rostamizadeh, A. Talwalkar; Foundations of Machine Learning, Second Edition, MIT Press, 2018

Complementary Bibliography

Ian Goodfellow, Yoshua Bengio, Aaron Courville; Deep Learning, 2016. ISBN: 978-0262035613
Max Kuhn, Kjell Johnson; Applied Predictive Modeling, Springer, 2016. ISBN: 978-1-4614-6849-3 (eBook)
Trevor Hastie, Robert Tibshirani, Jerome Friedman; The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer, 2009. ISBN: 978-0387848570

Teaching methods and learning activities

Teaching will have 3 major components:


  • Theoretical-practical classes - partially expository, with intensive use of supervised exercise solving, analysis of case studies, and two seminars on specific topics, which will take place entirely remotely.

  • Laboratory classes - for supervised execution and individual assessment of practical work in a computing environment for personal computers, internet and mobile devices.

  • Tutorial guidance - for personalized follow-up of seminar preparation and of the execution of remote projects.


Files with the laboratory exercise material will be made available for students to work through autonomously (asynchronous regime), with monitoring by videoconference at the scheduled time and synchronous classes (also by videoconference) for answering questions and individual follow-up.

All class types (tutorial guidance, OT; theoretical-practical, TP; practical-laboratory, PL), as well as the seminars, may be held remotely, since the materials, resources, tools, and teacher training naturally allow this in a computer science discipline. Laboratory and project work may be done individually or in groups of 2 or 3 students, upon registration and approval by the teacher.

Software

Anaconda Distribution
Machine Learning libraries (modules) for Python

Type of assessment

Distributed evaluation without final exam

Assessment Components

Designation                                        Weight (%)
Presentation/discussion of a scientific paper      20.00
Test                                               30.00
Laboratory work                                    50.00
Total:                                             100.00

Workload Components

Designation                                        Time (hours)
Presentation/discussion of a scientific paper      20.00
Project preparation                                40.00
Written work                                       30.00
Laboratory work                                    10.00
Research work                                      40.00
Total:                                             140.00

Obtaining course attendance credit

The evaluation will include all components, namely:


  • Through two seminars for which students will have to prepare their presentations autonomously. Theoretical knowledge and the ability to apply it to specific cases will be evaluated.

  • Through a test or exam.

  • Through a selection of the 4 best lab assignments, the practical implementation skills will be evaluated (supervised).

  • Through the execution of an individual project, the ability to work independently will be evaluated.


The laboratory work and project can be performed individually or in groups of 2 or 3 students, upon registration and approval by the professor.

The evaluation is distributed, with the final mark calculated by the formula:

Final mark = TP (50%) + PL (50%)

TP = 30% test + 20% seminars (2 seminars, 10% each)
PL = 10% best 4 lab works + 40% final project

Final mark calculation formula

The evaluation is distributed, with the final mark calculated by the formula:
      TP (50%) + PL (50%)

30% test or exam + 20% seminars (2 seminars, 10% each) + 10% best 4 lab works + 40% final project.
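
The weighting above can be checked with a short calculation. The following is a minimal sketch, not an official grading tool: the function name `final_mark` and the sample marks are hypothetical, all marks are assumed to be on the same scale, and the "best 4 lab works" are assumed to be combined by a plain average, which the page does not specify.

```python
def final_mark(test, seminars, labs, project):
    """Weighted final mark: TP (50%) + PL (50%).

    test     -- mark of the test or exam (30%)
    seminars -- marks of the two seminars (10% each)
    labs     -- marks of all lab works; only the best 4 count (10%)
    project  -- mark of the final project (40%)
    """
    if len(seminars) != 2:
        raise ValueError("expected exactly two seminar marks")
    if len(labs) < 4:
        raise ValueError("expected at least four lab marks")

    # Assumption: the 4 best lab marks are averaged into one lab mark.
    best_labs = sorted(labs, reverse=True)[:4]
    lab_mark = sum(best_labs) / 4

    tp = 0.30 * test + 0.10 * seminars[0] + 0.10 * seminars[1]  # TP share
    pl = 0.10 * lab_mark + 0.40 * project                       # PL share
    return tp + pl


# Hypothetical marks, for illustration only.
mark = final_mark(test=14, seminars=[16, 12],
                  labs=[10, 18, 15, 17, 12, 16], project=15)
print(round(mark, 2))
```

The weights sum to 100%, so a student with the same mark in every component receives that mark as the final result.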
Copyright 1996-2024 © Instituto Politécnico de Setúbal - Escola Superior de Tecnologia de Setúbal
Page generated on: 2024-11-23 at 11:11:59