An Overview of Machine Learning-Based Predictions Techniques Using Dynamic Graphs

4 minute read

Published:

This blogpost showcases the introduction section of a term paper I wrote for a graduate level big data analytics course. The full article can be found here

Graphs and graph representation learning have become an increasingly popular field of research in the past few years. The number of publications on these topics, as can be seen in Figure 1, has seen a tremendous surge in the past few years - a 1522.54% increase since the year 2000[1-3]. The widespread interest in this field can be attributed to the utility of graphs and their ability to accurately model real-life scenarios. Not only are graphs useful tools in the field of data analysis where they help visualize and analyze the nature of interactions between represented actors but they have also increasingly become integral to the field of machine learning and predictive analysis[4,5]. Traditionally, machine learning models have been trained on datasets that constitute instances inherently assumed to be independent of each other. However, as the complexity of applications increases, so does that of the subjects of predictive analysis, and thus more complex representations are warranted.

To substantiate the above point, the example of a simple content-based recommender system can be considered. In such a system the training examples typically consist of the past behavior of the customer and each training example or customer data in the training set is assumed to be independent of another. However, if user data is modeled as a graph and the underlying assumption that each data point is independent of the other is ignored, then the interactions and similarities that exist between each user may be captured more effectively. The consequent increase in information available leads to more personalized recommendations and enhanced user experience. In fact, state-of-the-art recommendation systems such as those employed by big corporations like Netflix and Amazon make use of network graph models to personalize user experience[6,7]. Thus, it can be concluded that graphical data as an input to machine learning algorithms helps make more accurate predictions for complex real-life situations simply because more information is captured.


A majority of the current machine learning techniques for graphs work under the assumption that the underlying structure used to make predictions is static in nature but this is rarely true of real-life interactions [8,9]. For instance, in medical applications of machine learning, most of the captured patient data is time-variant and in fact, it is these timed variations that most often carry characterizing information. This is where dynamic graphs come into the picture. In addition to capturing the spatial information about the scenario they model, dynamic graphs also have a time component that helps capture temporal fluctuations of information. In terms of applications of dynamic graphs in predictive analysis, one way to handle this is to apply static graph machine learning techniques to time-variant graphs, however, the results obtained are more often than not sub-optimal. This warrants the need to develop dedicated frameworks capable of working with dynamic graphs. The field of the application of machine learning algorithms to dynamic graphs, however, is fairly new and most of the work focuses on extending static techniques by treating dynamic graphs as discrete time-stamped snapshots - a practice that is restrictive in nature. Only very recently techniques have started to emerge that treat dynamic graphs as continuous time entities that constantly evolve[10].


The objective of this term paper is to study some of the most recent machine-learning techniques that have been applied to dynamic graphs to obtain useful predictions. The paper is divided into five main sections. The present section serves as an introduction to the problem statement, this is followed by section two where a discussion of the importance of graphs to big data and analytics is done. In section three, an overview of dynamic graphs, major modeling techniques, and their areas of application has been provided. In section four, we present the latest machine learning frameworks that have been proposed for dynamic graphs and carry out a detailed discussion about graph representation learning algorithms and graph embedding strategies. Section number five serves to present the major conclusions drawn from the study undertaken and also discusses the future line of work that can be carried out in this domain.