Automatic Information Extraction
Scientific Areas
Classification | Scientific Area
Official | Informatics
Occurrence: 2021/2022 - 2S
Study Cycles/Courses
Acronym | No. of Students | Study Plan | Curricular Years | UCN Credits | ECTS Credits | Contact Hours | Total Hours
MES | 14 | Study Plan 2017-2018 | 2 | - | 7.5 | - | 202.5
Teaching - Responsibilities
Working language
Portuguese
Objectives
- Acquire the knowledge needed to formulate a practical problem using pattern recognition techniques.
- Ensure that students can implement and use a range of supervised and unsupervised classification algorithms.
- Provide the theoretical and practical skills needed for intelligent analysis of unstructured data, particularly text, using examples common on the internet (web pages, asynchronous messages, and e-mail) and the pattern recognition techniques acquired in this course unit.
Learning outcomes and competences
- Contact with a set of intelligent systems applications, including systems for analyzing social systems, civil engineering (construction) systems, and cyber analytics systems.
- Stimulate students' scientific curiosity and their ability to investigate innovative developments in the area of pattern recognition.
Working mode
In person
Prerequisites (prior knowledge) and co-requisites (concurrent knowledge)
Probabilities and Statistics
Artificial Intelligence
Programme
Topics taught mainly in theoretical-practical classes:
1. Introduction to Pattern Recognition
1.1. Nature of the Problem
1.2. Application Areas
1.3. Supervised and Unsupervised Classification
2. Decision Trees
2.1. Inductive Learning
2.2. ID3 Algorithm
2.3. Logical Rules
3. Classifier Design
3.1. Feature Vector Space
3.2. Discriminant Functions
3.3. Confusion Matrix
3.4. Evaluation Criteria
4. Dimensionality Reduction: Feature Selection vs. Feature Extraction
5. Decision Theory
5.1. Parametric Supervised Classification: Observation Model
5.1.1. Maximum a Posteriori (MAP) Classifier
5.1.2. Bayes Classifier
5.1.3. Parametric Estimation of the Observation Model
5.2. Non-Parametric Supervised Classification
5.2.1. k-NN Classifier
6. Unsupervised Classification
6.1. K-Means Algorithm
6.2. Hierarchical Clustering
Topics taught mainly in practical-laboratory classes:
7. Text Mining
7.1. Unstructured Information
7.2. Regular Expressions
7.3. Elementary Operations
7.4. Pre-processing Techniques
7.5. Feature Selection and Extraction
7.6. Text Categorization
7.7. Performance Evaluation
8. Knowledge Discovery
8.1. Similarity Measures
8.1.1. Term Co-occurrence
8.1.2. Cosine
8.1.3. Semantic Similarity
8.2. Document Search on the Web
8.3. Information Extraction
8.3.1. Entity Extraction
8.3.2. Relation Extraction
9. Applications
Mandatory Bibliography
Jorge Salvador Marques; Reconhecimento de Padrões, IST-Press, Lisboa
Complementary Bibliography
Ronen Feldman and James Sanger; The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data, Cambridge University Press. ISBN: 978-0-521-83657-9
Teaching methods and learning activities
Theoretical-practical classes: lectures supported by slides, followed by consolidation exercises.
Laboratory classes: programming exercises solved on the computer, preceded by the relevant theoretical background.
Students will also be encouraged to learn actively, exploring on their own the aspects of the course unit programme related to machine learning that they wish to study in more depth.
The use of distance-learning tools arising from the current situation is documented in the annex.
Type of assessment
Distributed assessment with final exam
Assessment Components
Designation | Weight (%)
Presentation/discussion of a scientific work | 15.00
Exam | 30.00
Written assignment | 55.00
Total: | 100.00
Occupation Components
Designation | Time (hours)
Project development | 35.00
Independent study | 10.00
Class attendance | 25.00
Research work | 10.00
Written assignment | 20.00
Total: | 100.00
Eligibility for exams
Continuous assessment mode: presentation and article on a research topic (15%) + 2 tests (15% + 15%) + project (55%).
Passing requires a grade of at least 7 in each of these elements.
Final grade calculation formula
Assessment by exam: exam (45%) + project (55%).
Passing requires a grade of at least 7 in each of these elements.
Fraud mitigation mechanisms and their consequences are set out in Order No. 40/President/2021.
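As an illustration only, the weightings stated for the two assessment modes can be sketched in Python. The weights and the minimum of 7 per element come from the text above; the function name `final_grade`, the component keys, the sample grades, and the choice to return `None` when an element falls below the minimum are assumptions made for this sketch, not an official formula.

```python
from typing import Mapping, Optional

# Stated in the syllabus: each element needs a grade of at least 7.
MIN_COMPONENT_GRADE = 7.0

# Weights as stated for each assessment mode (component names are hypothetical).
CONTINUOUS_WEIGHTS = {
    "presentation_and_article": 0.15,
    "test_1": 0.15,
    "test_2": 0.15,
    "project": 0.55,
}
EXAM_WEIGHTS = {"exam": 0.45, "project": 0.55}


def final_grade(
    grades: Mapping[str, float], weights: Mapping[str, float]
) -> Optional[float]:
    """Return the weighted grade, or None if any element is below the minimum."""
    if set(grades) != set(weights):
        raise ValueError("grades must cover exactly the weighted components")
    # Assumption: failing the per-element minimum means no final grade is awarded.
    if any(grade < MIN_COMPONENT_GRADE for grade in grades.values()):
        return None
    return sum(weights[name] * grades[name] for name in weights)


if __name__ == "__main__":
    # Made-up sample grades, for illustration only.
    sample = {
        "presentation_and_article": 12,
        "test_1": 10,
        "test_2": 14,
        "project": 16,
    }
    print(round(final_grade(sample, CONTINUOUS_WEIGHTS), 2))  # 14.2
    print(final_grade({"exam": 6.5, "project": 18}, EXAM_WEIGHTS))  # None
```

The same function serves both modes because only the weight table changes; the per-element minimum is checked before any weighting is applied.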
Grade improvement
Repetition of the exam and the project.