Technological advances are enabling scientists to collect vast amounts of data in fields such as medicine, remote sensing, astronomy, and high-energy physics. These data arise not only from experiments and observations, but also from computer simulations of complex phenomena. They are often complex, with both spatial and temporal components. As a result, it has become impractical to manually explore, analyze, and understand the data. Scientific Data Mining: A Practical Perspective describes how techniques from the multi-disciplinary field of data mining can be used to address the modern problem of data overload in science and engineering domains.
Starting with a survey of analysis problems in different applications, this book identifies the common themes across these domains and uses them to define an end-to-end process of scientific data mining. This multi-step process includes tasks such as processing the raw image or mesh data to identify objects of interest; extracting relevant features describing the objects; detecting patterns among the objects; and displaying the patterns for validation by the scientists.
Most of the book describes how techniques from disciplines such as image and video processing, statistics, machine learning, pattern recognition, and mathematical optimization can be used for the tasks in each step. It also includes a description of software systems developed for scientific data mining; general guidelines for getting started on the analysis of massive, complex data sets; and an extensive bibliography.