Summarization-guided greedy optimization of machine learning model

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review


Unstructured data accounts for up to 90% of all human-generated data, yet attempts to extract significant value from it with Machine Learning (ML) and Big Data (BD) technologies have yielded limited success. We propose a generic approach to deep data summarization and subsequent automated ML design optimization to extract maximum predictive value from big data. Knowledge summarization is a central component of the proposed methodology and we argue that coupled with strictly linear modeling complexity, hierarchical decomposition and optimized model design may define the backbone of a new platform for automated and scalable construction of robust ML models. We consider the ML build process as a set of data journeys through the layers of modeling that consistently follow the same patterns of data summarization and transformation at subsequent layers of abstraction. In such a framework we argue that robust construction of the ML model can be achieved through hierarchical greedy optimization of the links between connected ML model components. We demonstrate several case studies of deep data summarization and automated ML model design on text, numerical time series and image data. We point out that application awareness allows data summarization to be deepened while maintaining or improving its predictive value.
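The abstract does not spell out the optimization procedure, so the following is only a minimal illustrative sketch of a generic greedy backward-forward search over a set of components (for example, candidate features). It is not the authors' method. The function name `greedy_forward_backward`, the toy `coverage` data and the `score` function are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def greedy_forward_backward(
    candidates: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Generic greedy search: alternate one add step and one drop step.

    Each accepted step strictly increases the score, so the loop terminates.
    """
    pool = frozenset(candidates)
    selected: FrozenSet[str] = frozenset()
    best = score(selected)
    improved = True
    while improved:
        improved = False
        # Forward step: add the single component that raises the score most.
        adds = [(score(selected | {c}), c) for c in sorted(pool - selected)]
        if adds:
            top, comp = max(adds, key=lambda t: t[0])
            if top > best:
                selected, best, improved = selected | {comp}, top, True
        # Backward step: drop the single component whose removal helps most.
        drops = [(score(selected - {c}), c) for c in sorted(selected)]
        if drops:
            top, comp = max(drops, key=lambda t: t[0])
            if top > best:
                selected, best, improved = selected - {comp}, top, True
    return selected, best


# Hypothetical toy problem: each component "covers" some items, and the
# score rewards covered items while charging 0.5 per selected component.
coverage = {"x": {1, 2, 3, 4}, "y": {1, 2, 5}, "z": {3, 4, 6}}


def score(subset: FrozenSet[str]) -> float:
    covered = set().union(*(coverage[c] for c in subset)) if subset else set()
    return len(covered) - 0.5 * len(subset)


selected, best = greedy_forward_backward(coverage, score)
print(sorted(selected), best)
```

In this toy problem the forward steps add `x` first because it is the best single component, and a later backward step drops it once `y` and `z` together make it redundant. That is the kind of correction a purely forward greedy search cannot make.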

Original language: British English
Title of host publication: Machine Learning and Data Mining in Pattern Recognition - 13th International Conference, MLDM 2017, Proceedings
Editors: Petra Perner
Publisher: Springer Verlag
Number of pages: 16
ISBN (Print): 9783319624150
State: Published - 2017
Event: 13th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2017 - New York, United States
Duration: 15 Jul 2017 to 20 Jul 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10358 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 13th International Conference on Machine Learning and Data Mining in Pattern Recognition, MLDM 2017
Country/Territory: United States
City: New York

Keywords


  • Backward-forward search
  • Big data
  • Data summarization
  • Feature selection
  • Machine learning
  • Meta-learning

