Context based Event Detection using Time Series Classification

Student thesis: Doctoral Thesis

Abstract

Intelligent systems that support automatic event detection and appropriate decision making are essential assets for smart cities. Their use spans a wide range of domains such as healthcare, security, and education. However, their use in the automatic detection of traffic incidents is still rather limited. In fact, while Automatic Incident Detection has been under research for decades, many transportation centers do not use the available automated methods due to their low accuracy and high false alarm rate. In this thesis, we propose a novel framework for traffic event detection using shapelets, a recent Time Series Classification technique. The shapelets technique extracts subsequences of the time series that represent distinctive patterns. Each pattern is called a shapelet and is selected so that it maximally differentiates between the classes of a time series set. In traffic events, shapelets can represent patterns of incidents and congestion as well as normal traffic situations, which the framework uses to detect the occurrence of events. The framework adapts the properties of the classical Shapelet Transform to tailor it for traffic event detection. Our framework relies on three major components: a new shapelet selection mechanism, namely Complementary Shapelets (C-Shapelets), that aims at improving the current selection of shapelets in Shapelet Transform; a new shapelet feature, namely Shapelet Profile, which provides a context for the shapelets in the form of lists of properties; and the fusion of different sources/channels of data, namely social media data and sensor reading data, to achieve a more reliable decision-making process and to provide contextual information for the detected events. The C-Shapelets selection mechanism selects a set of shapelets that complement each other rather than the top-K shapelets, as originally proposed.
In the original formulation, the selection process takes the top-K candidates, so each shapelet is selected independently of the other selected shapelets, under the assumption that taking the top-K shapelets will yield better results. The top-K selection approach neglects the fact that a higher discriminatory power could be achieved by choosing a combination of shapelets that complement each other to increase the classification accuracy. In the C-Shapelets selection mechanism, by contrast, shapelets are selected in an iterative process so that the selection of a given shapelet depends on the shapelets that are already selected. The validation of the new selection mechanism using the UCR datasets shows that C-Shapelets selection improves the accuracy of the results compared to top-K shapelet selection in 80% of the datasets. According to the Friedman test, our proposed selection ranks better than the classical top-K selection for any value of K. The framework extends Shapelet Transform with a novel set of modifications and additions to adapt it for event detection. The modifications made to refine the framework are an outcome of the analysis of the Shapelet Transform technique with regard to its various properties, including the shapelet selection technique, shapelet pool selection, quality measures, distance measures, number of selected shapelets, choice of classifier, size of the time series, and the type of the time series (spatial or temporal). Moreover, a novel addition, namely Shapelet Profiles, provides a context for the shapelets in the form of lists of properties. For each shapelet, the corresponding list of properties is extracted from the time series sequences in the training dataset that are close to it according to a suitable distance.
The Shapelet Profile's use is two-fold: first, it semantically compensates for the lack of insight offered by machine learning models, because the profile of a shapelet can provide the context of the decision; second, it improves classification accuracy when the machine learning model used has low confidence. Finally, we investigated the possibilities and the limitations of traffic event detection using the social media site Twitter. In comparing the detection of traffic events using sensor data and social media, we also examined how to improve the performance of the proposed framework by exploring the different ways in which traffic sensor data can be fused with social media messages for the detection of traffic events and congestion. Mapping between the tweets of a given event and its sensor readings could not only improve the classification but also provide a context for the event, such as the type of the incident and the status of the road. Experiments using the social media messages (tweets) sent from the London circular road M25 for the same period show an improvement of the classification in terms of accuracy, detection rate, false alarm rate and mean time to detect.
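The contrast between top-K and complementary selection described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names are hypothetical, leave-one-out 1-NN accuracy is an assumed stand-in for the quality measure, and greedy forward selection is only one plausible reading of "selected in an iterative process".

```python
import numpy as np

def shapelet_distance(series, shapelet):
    """Minimum Euclidean distance between the shapelet and any
    equal-length window of the series."""
    series = np.asarray(series, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(series, len(shapelet))
    return float(np.sqrt(((windows - shapelet) ** 2).sum(axis=1)).min())

def shapelet_transform(X, shapelets):
    """Map each series to its vector of distances to the shapelets."""
    return np.array([[shapelet_distance(x, s) for s in shapelets] for x in X])

def nn_loo_accuracy(F, y):
    """Leave-one-out 1-NN accuracy on feature matrix F.
    Assumed quality measure, chosen only for illustration."""
    y = np.asarray(y)
    d = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample may not be its own neighbour
    return float((y[d.argmin(axis=1)] == y).mean())

def select_top_k(D, y, k):
    """Top-K: score every shapelet on its own and keep the K best."""
    scores = np.array([nn_loo_accuracy(D[:, [j]], y) for j in range(D.shape[1])])
    return [int(j) for j in np.argsort(-scores, kind="stable")[:k]]

def select_complementary(D, y, k):
    """Greedy forward selection: each new shapelet is the one that most
    improves the accuracy of the set already chosen."""
    chosen = []
    for _ in range(k):
        remaining = [j for j in range(D.shape[1]) if j not in chosen]
        best = max(remaining, key=lambda j: nn_loo_accuracy(D[:, chosen + [j]], y))
        chosen.append(best)
    return chosen
```

Here `D` is the output of `shapelet_transform` for a pool of candidate shapelets. Both selectors return column indices into `D`, so the two selections can be compared by calling `nn_loo_accuracy(D[:, indices], y)` on each.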
Date of Award: Dec 2019
Original language: American English

Keywords

  • Automatic Incident Detection
  • Time Series Classification
  • Shapelets
  • Social Media
  • Event Detection
