
About Us
The Knowledge Discovery and Data Mining research unit (KDD LAB) is a joint research initiative of ISTI and the Computer Science Department of the University of Pisa.
The objective of the research unit is the development of theory, techniques, and systems for extracting and delivering useful knowledge out of large masses of data.
Today, knowledge discovery and data mining is both a technology that blends data analysis methods with sophisticated algorithms for processing large data sets, and an active research field that aims at developing new data analysis methods for novel forms of data. On the one hand, classification, clustering and pattern discovery tools are now part of mature data analysis and Business Intelligence systems and have been successfully applied to problems in various commercial and scientific domains. On the other hand, the increasing heterogeneity and complexity of new forms of data, such as those arriving from medicine, biology, the Web, Earth observation systems, and the mobility data arriving from wireless networks, call for new forms of patterns and models, together with new algorithms to discover such patterns and models efficiently.
In this context, the mission of the KDD laboratory is to pursue fundamental research, strategic applications and higher education in the areas of:
- Mobility data analysis: a new discipline at the convergence of data mining, mobility, geography and privacy, aimed at discovering the movement behaviour of people and vehicles in a territory, especially in urban settings.
- Mining novel forms of data: new data mining methods for spatial, temporal and spatio-temporal data, web log data and graph data from social networking, graph data from proteomics and business processes.
- Privacy-preserving data mining: methods for anonymizing, randomizing and sanitizing both data and patterns, for the purpose of protecting the privacy and anonymity of the data subjects, i.e., the individuals whose personal data are under analysis.
- Data mining query languages: support environments for the knowledge discovery process, for expressing complex analytical queries and reasoning on data mining results with respect to domain-dependent knowledge.
- Knowledge discovery and ontologies: combination of natural language processing, knowledge management, text mining and data mining for ontology-driven knowledge discovery and automated ontology discovery.
- Advanced data mining applications: intelligent adaptive solutions for forecasting complex phenomena in CRM (e.g., churn analysis) and in fraud detection (e.g., fiscal evasion).
Relevant Projects
- GeoPKDD - Geographic Privacy Preserving Knowledge Discovery, funded by EC-FP6-FET. The objective of the GeoPKDD project is to discover useful knowledge about human movement behavior from mobility data, while preserving the privacy of the people under observation.
- MOTUS - Mobility and Tourism. MOTUS aims to improve the management, sustainability and eco-compatibility of urban mobility, focussing on the citizen as user and provider of the traffic service. MOTUS provides a service platform capable of detecting, aggregating, and interpreting urban mobility in real time, using information from heterogeneous infrastructures and data from mobile devices.
- Mercato della Mobilità. MdM includes the development of models for the analysis of data collected from mobile sensors and wireless networks, and the testing of specific services to promote the use of public transport and optimize the use of private vehicles on home-to-work routes.
- MOVE - COST Action and MODAP - Coordination Action.
- ANONIMO. This project aims to develop a formal framework for measuring privacy and anonymity and, from the legal viewpoint, to study and define the very concepts of privacy and anonymity.
- MiningForLife and FSE - Fascicolo Sanitario Elettronico. The objective of these projects is to apply novel data mining methods to medical data on oncological patients, to discover correlations between clinical values and disease progression.
- BI-COOP. An industrial R&D project funded by Unicoop, one of the largest Italian retail companies. KDDLab created a very large sales data warehouse, equipped with advanced BI solutions for strategic marketing and customer care based on data mining predictive analytics.
- DIVA. The DIVA project created a predictive system aimed at fighting tax evasion in the VAT domain.
Latest Publications
- Wireless Network Data Sources: Tracking and Synthesizing Trajectories
- Characterising the Next Generation of Mobile Applications Through a Privacy-Aware Geographic Knowledge Discovery Process
- An Ontology-Based Approach for the Semantic Modelling and Reasoning on Trajectories
- Towards Semantic Interpretation of Movement Behavior
Events
- 15 June 2009 - 10:30pm
- 27 February 2009 - 10:00am
- 28 November 2008 - 9:15am
- 12 September 2008 - 9:15am
