FD-04: Mining Earth Observation Big Data: methods and applications

Mihai Datcu

Abstract:

The goal of the tutorial is to present leading-edge concepts, methods and algorithms for exploring and extracting information content from Big Data provided by Earth Observation (EO) sensors and other related sources.

The Earth is facing unprecedented climatic, geomorphologic, environmental and human-made changes, which require global-scale observation and monitoring and have resulted in a multitude of new orbital and suborbital Earth Observation (EO) sensors. The collected EO data volumes are increasing immensely, at a rate of several Terabytes a day. With current EO technologies these figures will soon grow further; the horizon lies beyond Zettabytes. At the same time, the need for timely delivery of focused information for decision making is increasing. EO data, which are not usable in their raw format, therefore require chained transformations before they become the needed information products, easily understandable and ready to use without further manipulation. The challenge is increasingly going to be how to extend the usability of the millions of EO images stored in archives to a larger and larger group of end-user applications.

Big Data Mining and Big Data Analytics are new fields of study that have arisen to seek solutions for automating the extraction of information from EO data and other related sources, which can lead to Knowledge Discovery and the creation of actionable intelligence. Knowledge Discovery is among the most interesting areas of intelligent computing; however, the real challenge, and a primary objective in the field, is to combine machine intelligence with the power and potential of human intelligence. The goal is to go beyond today's methods of information retrieval and develop new concepts and methods that support end users of EO data in interactively analyzing the information content, extracting relevant parameters, associating various sources of information, learning and/or applying knowledge, and visualizing the pertinent information without getting overwhelmed.

The tutorial covers the basics of Big Data Mining methods, with emphasis on the particularities of EO data and applications, and reveals the challenges raised by the data deluge, in both volume and diversity.

The tutorial begins with a presentation of the EO sensor image formation models and the evaluation and characterization of their quantitative and qualitative information content. The most important categories of optical and Synthetic Aperture Radar (SAR) sensors are treated. The basic EO products and the typical metadata are reviewed in relation to the typical applications. An overview of the area of Big Data, mainly the fields of Data Mining, Knowledge and Data Discovery, and image search engines, positions the Data Mining field for EO applications. The basic EO image primitive feature extraction algorithms for optical and SAR data are presented in relation to the specifics of the methodology for geostatistics and spatial statistics, i.e. geometrical and topological information representation. EO image time series and spatio-temporal information descriptors are introduced as a basis for understanding long-term Earth processes and for fully exploiting historical EO data archives. Particular focus will be placed on the use and processing of EO product metadata and other external information, such as GIS or map information. The core problem is the discovery of information, so similarity measures and grouping methods are presented and analyzed. Methods for active learning and techniques for Visual Analytics and Data Mining will be introduced. The presentation will include techniques for the generation of semantic catalogues for large EO archives, and methods for KDD.

The tutorial concludes with a focus on various EO applications and on the ubiquitous use of EO data, mainly through simple, easy-to-understand information summaries, geospatial visual analytics, and visual data mining, which reveal the most relevant information in a distilled and accessible representation that users outside the area of geoinformation can understand directly.

The methods introduced will be demonstrated on selected real scenarios using TerraSAR-X, TanDEM-X, optical images such as Landsat or WorldView, and Image Time Series. The prospects for using Big Data Mining with Sentinel-1 and Sentinel-2 will be discussed.

Biography:

Mihai Datcu received the M.S. and Ph.D. degrees in electronics and telecommunications from the University "Politehnica" of Bucharest (UPB), Bucharest, Romania, in 1978 and 1986, respectively, and the title Habilitation à diriger des recherches in computer science from University Louis Pasteur, Strasbourg, France, in 1999. Since 1993, he has been a Scientist with the German Aerospace Center (DLR), Oberpfaffenhofen, Germany, where he is currently a Senior Scientist and an Image Analysis Research Group Leader in the Remote Sensing Technology Institute (IMF). He is developing algorithms for analyzing very high resolution synthetic aperture radar (SAR) and interferometric SAR data. He is engaged in research related to information theoretical aspects and semantic representations in advanced communication systems. Since 2011, he has also been leading the Immersive Visual Information Mining Research Laboratory, Munich Aerospace Faculty, Munich, Germany. He has held Visiting Professor appointments with the University of Oviedo, Oviedo, Spain; University Louis Pasteur; the International Space University, Strasbourg; the University of Siegen, Germany; the University of Camerino, Italy; and the Swiss Center for Scientific Computing (CSCS), Manno. From 1992 to 2002, he had a longer Invited Professor assignment with the Swiss Federal Institute of Technology, ETH Zurich. Since 2001, he has initiated and led the Competence Centre on Information Extraction and Image Understanding for Earth Observation at Telecom ParisTech, Paris, a collaboration of DLR with the French Space Agency (CNES). He has been Professor and holder of the DLR-CNES Chair at Telecom ParisTech. Since 1981, he has been a Professor with the Faculty of Electronics, Telecommunications and Information Technology (ETTI), UPB, working on signal/image processing and electronic speckle interferometry, where he has been the Director of the Research Center for Spatial Information since 2011.
He and his team have developed and are currently developing the operational image information mining (IIM) processor in the payload ground segment systems for the German missions TerraSAR-X and TanDEM-X, and for the ESA missions Sentinel-1 and Sentinel-2. He is the author of more than 200 scientific publications, among them about 50 journal papers, and a book on number theory. His interests are in information and complexity theory, stochastic processes, Bayesian inference, and IIM. Dr. Datcu is a member of the European Image Information Mining Coordination Group and the Data Archiving and Distribution Technical Committee of the IEEE Geoscience and Remote Sensing Society. He is an IEEE Fellow.