Fast summarization and anonymization of multivariate big time series

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

Because of their sequential, predominantly temporal nature, the vast amounts of big data released every day from many different sources could potentially be linked, aligned along time, and deliver new evidence for next-generation predictive systems or knowledge discovery engines. However, big data owners are reluctant to share their data due to legally binding privacy and identity protection concerns, which poses a major hurdle to shared exploitation of big data on a massive scale. Data anonymization is expected to solve this problem, yet current approaches are limited predominantly to univariate time series, generalized by aggregation or clustering to eliminate the identifiable uniqueness of individual data points or patterns. For multivariate time series, the uniqueness of combinations of values or patterns across multiple dimensions is much harder to eliminate due to the exponentially growing number of unique configurations of point values across those dimensions. Our method implements linearly scalable asynchronous summarization of multivariate time series, applied independently to every dimension. As a result, the series retain only a small subset of defining points at different times along multiple dimensions, effectively breaking up the multivariate time series into a collection of summarized univariate time series that are perturbed from the original series in terms of actual points and pattern shapes. The current implementation of the anonymizing summarization involves shape-preserving greedy elimination and aggregation, and supports parallel cluster processing for big data implementation.
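The abstract does not spell out the elimination criterion, so the following is only a minimal sketch of per-dimension greedy elimination, not the authors' implementation. It assumes (hypothetically) that the "least defining" point is the interior point whose removal causes the smallest linear-interpolation error, and that timestamps are strictly increasing. The names `interpolation_error`, `summarize`, and `summarize_multivariate` are illustrative. This naive version rescans all points on every step and so is quadratic; it does not reproduce the linear scalability, aggregation, or parallel processing claimed in the abstract.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # (time, value)


def interpolation_error(left: Point, mid: Point, right: Point) -> float:
    """Vertical distance from `mid` to the straight line joining `left` and `right`."""
    (t0, v0), (t1, v1), (t2, v2) = left, mid, right
    interpolated = v0 + (v2 - v0) * (t1 - t0) / (t2 - t0)
    return abs(v1 - interpolated)


def summarize(series: Sequence[Point], keep: int) -> List[Point]:
    """Greedily drop the interior point whose removal distorts the shape least.

    Assumed criterion (not taken from the paper): linear-interpolation error.
    The first and last points are always retained.
    """
    points = list(series)
    keep = max(keep, 2)
    while len(points) > keep:
        errors = [
            interpolation_error(points[i - 1], points[i], points[i + 1])
            for i in range(1, len(points) - 1)
        ]
        drop = 1 + errors.index(min(errors))  # index of the least defining point
        del points[drop]
    return points


def summarize_multivariate(
    times: Sequence[float], columns: Dict[str, Sequence[float]], keep: int
) -> Dict[str, List[Point]]:
    """Summarize every dimension independently, so each keeps its own timestamps."""
    return {
        name: summarize(list(zip(times, values)), keep)
        for name, values in columns.items()
    }


times = [0, 1, 2, 3, 4]
columns = {"a": [0, 1, 2, 3, 10], "b": [5, 0, 5, 5, 5]}
summary = summarize_multivariate(times, columns, keep=3)
```

In this toy input, dimension `a` retains the points at times 0, 3 and 4, while dimension `b` retains those at times 0, 1 and 4. Because the surviving timestamps differ per dimension, the original joint rows can no longer be read off directly, which is the effect the abstract describes as breaking the series into a collection of summarized univariate series.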

Original language: British English
Title of host publication: Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015
Editors: Feng Luo, Kemafor Ogan, Mohammed J. Zaki, Laura Haas, Beng Chin Ooi, Vipin Kumar, Sudarsan Rachuri, Saumyadipta Pyne, Howard Ho, Xiaohua Hu, Shipeng Yu, Morris Hui-I Hsiao, Jian Li
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1901-1904
Number of pages: 4
ISBN (Electronic): 9781479999255
DOIs
State: Published - 22 Dec 2015
Event: 3rd IEEE International Conference on Big Data, IEEE Big Data 2015 - Santa Clara, United States
Duration: 29 Oct 2015 to 1 Nov 2015

Publication series

Name: Proceedings - 2015 IEEE International Conference on Big Data, IEEE Big Data 2015

Conference

Conference: 3rd IEEE International Conference on Big Data, IEEE Big Data 2015
Country/Territory: United States
City: Santa Clara
Period: 29/10/15 to 1/11/15

Keywords

  • anonymization
  • big data
  • multi-variate time series
  • parallel processing
  • summarization
