# Great Ideas in ICT (2016)

Once every few years, the information and communication technology (ICT) community is shaken by results that fundamentally impact several core topics. These results often have strong consequences for real systems and thus, ultimately, impact our everyday life as well. The goal of this course is to introduce attendees to several such breakthroughs, representative of different areas, showing the practical impact they have had on ICT as it is today. Lectures will be delivered by professors and researchers from the Department of Computer Science (DI), the Department of Computer, Control, and Management Engineering Antonio Ruberti (DIAG), and the Department of Information, Communication and Electronic Engineering (DIET).

**Index**:

- Notes
- Schedule
- Lectures
- Rules for the Ph.D. in Engineering in Computer Science
- Rules for the Ph.D. in Automatica, Bioengineering and Operations Research
- Rules for the Ph.D. in Computer Science
- Rules for the Ph.D. in Information and Communications Technologies

__Notes:__

__Schedule:__

__Lectures:__

**Bitcoin**

**Lecturer**: Alberto Marchetti Spaccamela

**Schedule**: 8-9/6/2016, 9:30-11:00 - Via Ariosto 25, Aula 5, ground floor.

**Abstract**: There is a lot of excitement about Bitcoin and cryptocurrencies. Optimists claim that Bitcoin will fundamentally alter payments, economics, and even politics around the world. Pessimists claim Bitcoin is inherently broken and will suffer an inevitable and spectacular collapse. Bitcoin truly is a new technology; the goal of these lectures is to provide an introduction that gets to the core of what makes Bitcoin unique by explaining how it works at a technical level. The first 30 minutes of the first lecture will be devoted to revising/introducing cryptography preliminaries (public-key cryptography, cryptographic hash functions, digital signatures).

Slides: pdf 1, pdf 2
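As a taste of those preliminaries, the sketch below (illustrative only, not taken from the lecture material) shows how cryptographic hash functions can chain blocks of transactions so that tampering with any earlier block is detectable — the basic integrity mechanism underlying Bitcoin's ledger:

```python
import hashlib
import json

def block_hash(block):
    """SHA-256 digest of the block's canonical JSON encoding."""
    return hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()

def make_block(transactions, prev_hash):
    """A toy block committing to its predecessor via a hash pointer."""
    return {"transactions": transactions, "prev_hash": prev_hash}

def chain_is_valid(chain):
    """Recompute every hash pointer: altering any block breaks a link."""
    return all(cur["prev_hash"] == block_hash(prev)
               for prev, cur in zip(chain, chain[1:]))

genesis = make_block(["Alice pays Bob 1 BTC"], prev_hash="0" * 64)
second = make_block(["Bob pays Carol 0.5 BTC"], prev_hash=block_hash(genesis))
chain = [genesis, second]

print(chain_is_valid(chain))   # True: the untampered chain verifies
genesis["transactions"][0] = "Alice pays Mallory 1 BTC"
print(chain_is_valid(chain))   # False: tampering is detected
```

Real Bitcoin blocks additionally carry proof-of-work and Merkle trees of signed transactions; this sketch only illustrates the hash-pointer idea.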

**Parallel data processing in big data systems**

**Lecturer**: Irene Finocchi

**Schedule**: 16/6/2016 9:30-12:30 - Via Salaria 113, Aula Seminari, 3rd floor.

**Abstract**: The ability to perform scalable and timely processing of massive datasets is a crucial endeavor of our era. Since the introduction of MapReduce in 2004, there has been a proliferation of programming models and software frameworks for large-scale data analysis. A large part of the power of these frameworks comes from their simplicity: programmers need to implement only a few functions, with all other aspects of the execution handled transparently by the runtime system. However, algorithms must be expressed in terms of a small number of rigid components that must fit together in very specific ways, and the lack of explicit control over system resources can easily yield hotspots and performance bugs that are very difficult to diagnose. The purpose of the course is to survey algorithm design patterns and challenges that arise when programming big data systems, highlighting common pitfalls and good programming practices.

Slides: pdf
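To make concrete the point that programmers implement only a few functions, here is a minimal sequential sketch of a word-count job in the MapReduce style; the group-by-key step stands in for the shuffle phase that a real framework such as Hadoop performs transparently across machines:

```python
from collections import defaultdict
from itertools import chain

def map_fn(document):
    """Emit a (word, 1) pair for every word in the document."""
    for word in document.lower().split():
        yield word, 1

def reduce_fn(word, counts):
    """Sum the partial counts collected for one word."""
    return word, sum(counts)

def run_mapreduce(documents):
    """Sequential stand-in for a distributed runtime: map, shuffle, reduce."""
    groups = defaultdict(list)
    for key, value in chain.from_iterable(map(map_fn, documents)):
        groups[key].append(value)          # group-by-key ("shuffle")
    return dict(reduce_fn(k, v) for k, v in groups.items())

print(run_mapreduce(["big data big systems", "data systems"]))
# {'big': 2, 'data': 2, 'systems': 2}
```

The programmer's entire job is `map_fn` and `reduce_fn`; everything inside `run_mapreduce` is what the framework owns — which is exactly where the hidden performance hazards mentioned above live.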

**3D Indoor positioning and navigation: theory, implementation and applications**

**Lecturer**: Luca De Nardis

**Schedule**: 27/6/2016 9:30-12:30 - Via Eudossiana 18, Building B, Room 206 "Sala di Lettura".

**Abstract**: Indoor navigation and wayfinding for people and objects is of tremendous interest today: the difficulty of extending GPS to indoor environments limits service continuity and accuracy, and calls for a cross-disciplinary approach encompassing technologies, data representation, visual design, and semiotics. This seminar will present the main solutions to the problem of indoor positioning, focusing on the case of 3D positioning, which poses the hardest challenges in heterogeneous indoor environments. Recent advances in both positioning and tracking will be presented, together with a practical demonstration of an indoor positioning system based on Wi-Fi fingerprinting.

Slides: pdf
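The core idea of Wi-Fi fingerprinting can be sketched in a few lines: an offline survey records received signal strengths (RSSI) at known positions, and a new measurement is located at the surveyed position whose fingerprint is nearest in signal space. The database, access points, and values below are invented for illustration:

```python
import math

# Hypothetical fingerprint database: RSSI (dBm) from three access points,
# measured offline at known positions (x, y) in metres.
fingerprints = {
    (0.0, 0.0): [-40, -70, -80],
    (5.0, 0.0): [-70, -45, -75],
    (0.0, 5.0): [-75, -72, -42],
}

def locate(rssi):
    """Return the surveyed position whose fingerprint is closest
    (Euclidean distance) to the measured RSSI vector."""
    return min(fingerprints,
               key=lambda pos: math.dist(fingerprints[pos], rssi))

print(locate([-42, -68, -79]))  # → (0.0, 0.0)
```

Production systems refine this nearest-neighbour scheme in many ways (k-NN averaging, probabilistic models, tracking filters), and extending it to 3D across floors is precisely where the challenges discussed in the seminar arise.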

**New models for sequencing and scheduling**

**Lecturer**: Paolo Dell'Olmo

**Schedule**: 4/7/2016 9:30-12:30 - Via Ariosto 25, Aula 5, ground floor.

**Abstract**: The lecture presents models and algorithms for sequencing and scheduling problems. Beyond a necessary introduction, and keeping technicalities to a minimum, the objective of the talk is to relate these problems and techniques to general optimization procedures and complexity issues that are common to a variety of fields.

**Hyperscanning: a new approach to the study of the physiological basis of human social interaction**

**Lecturer**: Laura Astolfi

**Schedule**: 5/7/2016 9:30-12:30 - Via Ariosto 25, Aula 6, ground floor.

**Abstract**: Simultaneous multi-subject recording (hyperscanning) is a recent approach adopted in social neuroscience to move beyond modeling the social brain as an isolated system, towards the analysis of a group of interacting subjects as a complex system. This implies taking into account not only the internal organization of each subject's brain activation but also the relations between the brain activities of different subjects. Hyperscanning has opened a completely new field in neuroscience, raising methodological issues and opening new application perspectives. In this lecture, we will see some examples of controlled experiments conducted in the well-developed framework of Game Theory, as well as during ecological tasks performed inside and outside a laboratory (such as the control of a flight simulator by professional pilots). Finally, the impact and possible future applications of this research will be discussed.

**Boolean techniques in Data Mining, with a focus on Classification**

**Lecturer**: Renato Bruni

**Schedule**: 7/7/2016 14:30-17:30 - Via Ariosto 25, Aula 5, ground floor.

**Abstract**: Data mining is the extraction, from large data sets, of information that is not obvious, not already available in that form, and potentially useful (rules, regularities, patterns, etc., i.e., knowledge), using automatic or semi-automatic methods. Similar operations are required in a large variety of application fields in order to gain insight into many complex phenomena. Depending on the application, different activities may be needed. Typical data mining tasks are the following: (1) Classification, that is, learning a function or a criterion to map objects onto a pre-defined set of classes; (2) Regression, that is, learning a function or a criterion to associate real values with objects; (3) Clustering, that is, partitioning the set of objects so as to group together similar objects; (4) Learning of Dependencies and Associations, that is, identification of significant relationships among data attributes; (5) Rule Learning or Summarization, that is, identification of a compact description of a set or a subset of the data.

Machine learning techniques allow a machine to simulate the learning process automatically. Due to the exponential increase in the amount of data produced and stored (the so-called information explosion, or data flood), learning algorithms must continuously evolve. Many steps of the learning procedure turn out to be difficult optimization problems. Finding good solutions to them efficiently makes it possible to perform the learning operations more successfully and to produce knowledge of better quality. Therefore, considerable advantages can be obtained from the use of state-of-the-art optimization techniques, in particular from the field of discrete and Boolean optimization.

Classification is among the most requested tasks in many practical applications, and several solution approaches have been proposed in the literature. One methodology that provides good accuracy is the Statistical and Logical Analysis of Data (SLAD), recently proposed in [2] as an evolution of the classical Logical Analysis of Data (LAD) [1]. This methodology is based on Boolean techniques and is closely related to Decision Trees and Nearest Neighbor methods, actually constituting an extension of the latter two, as shown in [3].

LAD techniques are inspired by the mental processes that a human being applies when learning to classify from examples. Considering data organized into records, each record being a set of values for a set of fields, the procedure can be roughly described as follows. Data are initially encoded into binary form by means of a discretization process called binarization. This is obtained by using the training set to compute specific values that convert each field into a set of binary attributes.

In the case of a qualitative field, all its values are simply encoded by means of a suitable number of binary attributes. In the case of a numerical field, the specific values mentioned above are called cut-points. They should be set at values representing some kind of watershed for the analyzed phenomenon. For a given numerical field, this can be done by taking, for each pair of adjacent values belonging to records of opposite classes, the middle value. Cut-points are then used to binarize the values of that numerical field into a number of binary attributes, each representing whether the value is above or below the corresponding cut-point. Since, in practice, the number of binary attributes obtainable with the above procedure is very large, a selection step is needed. The SLAD procedure aims at selecting only the binary attributes with the best separating power, in order to compute an effective binarization of reasonable size. The separating power of each value is computed by evaluating how it divides the underlying distribution of one class from those of the other classes. The selection step is modeled as a Boolean optimization problem, either of set covering or of knapsack type.
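The cut-point construction just described can be sketched in a few lines of code; the field name and values below are invented for illustration and do not come from the lecture or from [1, 2]:

```python
def cut_points(values_with_classes):
    """Candidate cut-points for one numerical field: the midpoint between
    adjacent sorted values belonging to records of opposite classes.
    values_with_classes: list of (value, class_label) pairs."""
    ordered = sorted(values_with_classes)
    return [(a + b) / 2
            for (a, ca), (b, cb) in zip(ordered, ordered[1:])
            if ca != cb]

def binarize(value, cuts):
    """One binary attribute per cut-point: is the value above the cut?"""
    return [value > c for c in cuts]

# A made-up "blood pressure" field with positive/negative class labels:
field = [(110, "neg"), (125, "neg"), (130, "pos"), (150, "pos"), (160, "neg")]
print(cut_points(field))                  # [127.5, 155.0]
print(binarize(140, cut_points(field)))   # [True, False]
```

On real data, every class boundary in every numerical field contributes a candidate cut-point, which is why the subsequent set-covering/knapsack selection step is essential.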

The selected set of binary attributes is then used to build patterns. A pattern is a conjunction of binary attributes characterizing a class. Each binary attribute can be seen as a condition. A conjunction of binary attributes is a positive pattern if it evaluates to True on (i.e., it covers) at least a certain number of positive records and does not cover more than a certain number of non-positive records. A negative pattern is defined symmetrically. In the space defined by the original fields of the records, each record is a point and each pattern corresponds to a polyhedron (the intersection of a finite number of equations and inequalities). A new record located in a region of the space covered only by positive patterns is classified as positive, and vice versa. However, most regions of the space are actually covered by patterns of mixed classes. In this case, each pattern must be given a weight, i.e., a measure of its importance, and a weighted sum then determines the class of the record under classification. More details can be found in [2].
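A minimal sketch of this weighted-pattern classification follows; the patterns and weights here are invented for illustration, whereas in the actual methodology they are learned from the training set as described in [2]:

```python
def covers(pattern, record):
    """A pattern (conjunction of conditions) covers a record if every
    required binary attribute holds. pattern: list of (index, value)."""
    return all(record[i] == v for i, v in pattern)

# Each entry: (conditions, class, weight) -- all values made up.
patterns = [
    ([(0, True), (1, False)], "pos", 0.8),
    ([(1, True)],             "neg", 0.6),
    ([(0, True)],             "pos", 0.3),
]

def classify(record):
    """Weighted sum over the covering patterns of each class."""
    score = {"pos": 0.0, "neg": 0.0}
    for conditions, cls, weight in patterns:
        if covers(conditions, record):
            score[cls] += weight
    return max(score, key=score.get)

print(classify([True, False]))  # → pos (patterns 1 and 3 cover: 0.8 + 0.3)
print(classify([False, True]))  # → neg (only pattern 2 covers: 0.6)
```

When a record falls in a region covered by patterns of one class only, the weighted vote is unanimous; the weights matter exactly in the mixed regions discussed above.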

An additional key feature of LAD methodologies is that patterns can be seen as an interpretation of the analyzed phenomenon. Therefore, the described procedure can also be used to perform rule learning tasks. We will also see some examples of application of the described procedure.

References

[1] E. Boros, P.L. Hammer, T. Ibaraki, A. Kogan, E. Mayoraz, I. Muchnik. An Implementation of Logical Analysis of Data. IEEE Transactions on Knowledge and Data Engineering, Vol. 12(2), 292-306, 2000.

[2] R. Bruni, G. Bianchi. Effective Classification using Binarization and Statistical Analysis. IEEE Transactions on Knowledge and Data Engineering, Vol. 27(9), 2349-2361, 2015.

[3] Y. Crama and P.L. Hammer. Boolean Functions: Theory, Algorithms, and Applications. Cambridge University Press, New York, 2011. ISBN: 9780521847513.

**Deep Learning Neural Networks: Challenges and Perspective for Big-Data Processing**

**Lecturer**: Aurelio Uncini and Michele Scarpiniti

**Schedule**: 13/7/2016 9:30-12:30 - Via Eudossiana 18, Building B, Room 206 "Sala di Lettura".

**Abstract**: In the digital world, the most significant emerging phenomenon, generically referred to as Big Data (BD), is the exponential growth of information available in various forms and through various access methods. The presence of non-traditional hidden information in the data requires the development of advanced tools and technologies, and of interdisciplinary teams that work closely together. Today, 'intelligent' methodologies, together with advances in the available computing power, play a central role in BD analysis and knowledge discovery. BD is most often noisy, inconsistent, incomplete, available in real time in huge amounts, and with non-stationary statistics. This motivates great interest in the development of data-driven procedures that are able to extract knowledge and significance. In fact, in socio-economic terms, BD represents a great opportunity: if BD is the 'oil of digital media', methods for extracting hidden information are the 'refinery', and their combination and aggregation offers enormous value and opportunities for new business models. Although BD has the potential to revolutionize all aspects of our society, extracting valuable and useful knowledge from it is not a simple and ordinary task. The lecture presents the basic principles of deep learning methods, used to determine unknown and complex relationships among data. It then presents deep neural network (DNN) models with unsupervised, supervised, and hybrid learning algorithms, and introduces the problems arising in the realization of DNNs on parallel and distributed machines.
Finally, some deep learning solutions are presented that represent the current state of the art in areas of strategic interest such as text, language modeling, and natural language processing; information retrieval; visual object recognition and computer vision; speech recognition and audio processing; and multimodal and multi-task learning (text-image, speech-image).

**Ontology-Based Data Access: Definitions, Algorithms and Methodologies**

**Lecturer**: Domenico Lembo

**Schedule**: 19/7/2016 9:30-12:30 - Via Ariosto 25, Aula 5, ground floor.

**Abstract**: This lecture introduces Ontology-Based Data Access (OBDA), a paradigm for data integration that has received ever-increasing attention in recent years in the knowledge representation and database communities. OBDA aims at coupling conceptual views of information, commonly expressed as Description Logic (DL) ontologies, with actual and possibly pre-existing data stores. The lecture gives the basics of OBDA, provides an overview of the main reasoning tasks and of algorithms for solving them, and illustrates methodologies for developing OBDA applications. In more detail, we gently introduce Description Logics through the use of a graphical model for the quick development of OWL 2 ontologies, i.e., ontologies expressed in the W3C standard language. We then survey typical mechanisms to link ontologies with data, and discuss some special reusable patterns for modeling recurrent representation needs.

**Computer Vision in Sports**

**Lecturer**: Domenico Bloisi

**Schedule**: 21/7/2016 9:30-12:30 - Via Ariosto 25, Aula 5, ground floor.

**Abstract**: Automatic analysis of games is becoming a key factor in multiple sports, for example in football, where Computer Vision techniques are used to extract statistics about players and teams. In this talk, we present possible solutions for handling the diverse aspects of the problem. In particular, we discuss the calibration, foreground extraction, player tracking, and event detection modules that compose the pipeline of a complete automatic sport analysis system.

__Rules for the Ph.D. in Engineering in Computer Science:__

This course can be considered as a B-type course (2.5 CFUs or 3 CFUs for students belonging to cycle 30 or later) as long as both the following requirements are satisfied:

- the student attends at least five lectures (six lectures for cycle 30 or later) among the ones listed above; students must download this attendance sheet and fill it in to have their attendance recognized;
- the student completes an assignment for one of the listed lectures. The assignment must be agreed with the corresponding lecturer.

Assignments will be discussed through seminar-like presentations to be scheduled with lecturers. Students can opt to gather double the CFUs (i.e. 5 CFUs or 6 CFUs for cycle 30 or later) by doubling their work (i.e. attending at least 10 lectures, 12 for cycle 30 or later, and completing two distinct assignments).

__Rules for the Ph.D. in Automatica, Bioengineering and Operations Research:__

This course can be considered as a B-type course (2.5 CFUs) as long as both the following requirements are satisfied:

- the student attends at least five lectures among the ones listed above; students must download this attendance sheet and fill it in to have their attendance recognized;
- the student completes an assignment for one of the listed lectures. The assignment must be agreed with the corresponding lecturer.

Assignments will be discussed through seminar-like presentations to be scheduled with lecturers. Students can opt to gather double the CFUs (i.e. 5 CFUs) by doubling their work (i.e. attending at least 10 lectures and completing two distinct assignments).

__Rules for the Ph.D. in Computer Science:__

All students are required to attend 3 of the lectures listed above, and are invited to choose among the lectures that are not strictly related to their own topic of research. Students must download this attendance sheet and fill it in to have their attendance recognized.

__Rules for the Ph.D. in Information and Communications Technologies:__

Students will be granted 1.5 CFUs by attending at least 3 lectures. The CFUs can be doubled (for a total of 3) by completing the assignments for at least three lectures. Students must download this attendance sheet and fill it in to have their attendance recognized.