Big Data
Scientific Areas
Classification | Scientific Area
Official | Informatics
Occurrence: 2021/2022 - 1st Semester
Study Cycles/Courses
Acronym | No. of Students | Study Plan | Curricular Years | UCN Credits | ECTS Credits | Contact Hours | Total Hours
BINF | 24 | Study Plan | 3 | - | 5 | 52.5 | 135
Teaching - Responsibilities
Working language
Portuguese
Objectives
Students who complete this course successfully should be able to:
- Know a set of non-conventional techniques that enable scalable data management, together with parallel algorithms and statistical modeling, with and without the use of the cloud;
- Be proficient in an ecosystem of tools and platforms so that, when faced with a concrete problem, they can determine the solution to apply and the tools to use for storing, exploring and analyzing large volumes of data.
Learning outcomes and competences
Not applicable
Working method
In person
Program
1. Introduction
History and context. Overview of Big Data technology. Data science. Search, indexing and memory.
2. Large-scale data handling
Large-scale storage systems. MapReduce and Hadoop. Relation to current databases, streams, algorithms, extensions and languages. Parallel query processing and computational analysis of statistics. Key-value storage. Comparing SQL and NoSQL databases.
3. Communication of results
Visualization of computational results. Sources of data, privacy, ethics and governance.
4. Special topics
Analysis of graphs: structure, traversals, computational analysis, PageRank, recursive queries, semantic web, advertising and recommendation systems on the internet.
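As an illustration of the PageRank topic listed under Special topics, the following is a minimal sketch of the usual power-iteration formulation in plain Python. It is not course material: the function name `pagerank`, the damping factor of 0.85, the fixed iteration count and the example graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` maps each node to the list of nodes it links to.
    The damping factor and iteration count are illustrative defaults.
    """
    # Collect every node, including ones that only appear as link targets.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)

    # Start from a uniform distribution over all nodes.
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n
            for node in nodes
        }
        # Each node passes an equal share of its rank to every target.
        for node, targets in links.items():
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page graph used only to exercise the function.
example_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(example_links)
```

In this example graph, page `c` receives links from three pages and ends up with the largest score, while `d`, which nothing links to, keeps only the baseline share. The scores always sum to 1, so they can be read as a probability distribution over pages.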
Mandatory Bibliography
Sadalage, P. J. and Fowler, M.; NoSQL Distilled: A Brief Guide to the Emerging World of Polyglot Persistence, Pearson Education, 2012
O'Neil, C. and Schutt, R.; Doing Data Science: Straight Talk from the Frontline, 2013
Leskovec, J., Rajaraman, A. and Ullman, J. D.; Mining of Massive Datasets, Cambridge University Press, 2nd Ed., 2014
White, T.; Hadoop: The Definitive Guide, O'Reilly, 2015
Wilke, C. O.; Fundamentals of Data Visualization, O'Reilly, 2019
Complementary Bibliography
Knaflic, C. N.; Storytelling with Data, Wiley, 2015
Teaching methods and learning activities
The following teaching approaches will be adopted:
1. Oral presentation of the basic concepts and tools
2. Preparation of laboratory work
Continuous assessment will be based on two projects and a written test.
Final assessment will be based on two projects and a written exam.
Software
Python
MongoDB
Type of assessment
Distributed evaluation with final exam
Assessment Components
Designation | Weight (%)
Test | 30.00
Written work | 70.00
Total: | 100.00
Workload Components
Designation | Time (Hours)
Independent study | 82.50
Class attendance | 52.50
Total: | 135.00
Eligibility for exams
Not applicable
Final grade calculation formula
Continuous assessment
- 30% * project1 + 40% * project2 + 30% * test
Final assessment
- 30% * project1 + 40% * project2 + 30% * exam
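The two formulas above differ only in whether the written component is the test or the exam, so they can be sketched as a single weighted sum. The function name `final_grade` and the marks used below are hypothetical. The page does not state the grading scale, so the example simply assumes all three marks are on the same scale.

```python
def final_grade(project1, project2, written):
    """Weighted final grade: 30% project 1, 40% project 2, 30% written part.

    `written` is the test mark under continuous assessment,
    or the exam mark under final assessment.
    """
    return 0.30 * project1 + 0.40 * project2 + 0.30 * written


# Hypothetical marks, all assumed to be on the same scale.
example = final_grade(project1=14, project2=16, written=12)
# 0.30 * 14 + 0.40 * 16 + 0.30 * 12 = 4.2 + 6.4 + 3.6, roughly 14.2
```

Because the two projects together carry 70% of the weight, they match the "Written work" share in the assessment components, with the remaining 30% coming from the test or exam.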