Pattern Matching in Link Streams: Timed-Automata with Finite Memory

Clément Bertrand, Frédéric Peschanski, Hanna Klaudel and Matthieu Latapy

Scientific Annals of Computer Science, 2018

Link streams model the dynamics of interactions in complex distributed systems as sequences of links (interactions) occurring at a given time. Detecting patterns in such sequences is crucial for many applications but it raises several challenges. In particular, there is no generic approach for the specification and detection of link stream patterns in a way similar to regular expressions and automata for text patterns. To address this, we propose a novel automata framework integrating both timed constraints and finite memory together with a recognition algorithm. The algorithm uses structures similar to tokens in high-level Petri nets and includes non-determinism and concurrency. We illustrate the use of our framework in real-world cases and evaluate its practical performances.

Download

Degree-based Outliers Detection within IP Traffic Modelled as a Link Stream

Audrey Wilmet, Tiphaine Viard, Matthieu Latapy and Robin Lamarche-Perrin

TMA Conference 2018, Vienna

Precise detection and identification of anomalous events in IP traffic are crucial in many applications. This paper intends to address this task by adopting the link stream formalism which properly captures temporal and structural features of the data. Within this framework we focus on finding anomalous behaviours with the degree of IP addresses over time. Due to diversity in IP profiles, this feature is typically distributed heterogeneously, preventing us to find anomalies. To deal with this challenge, we design a method to detect outliers as well as precisely identify their cause in a sequence of similar heterogeneous distributions. We apply it to a MAWI capture of IP traffic and we show that it succeeds at detecting relevant patterns in terms of anomalous network activity.Degree-based Outliers Detection within IP Traffic Modelled as a Link Stream

Download

OLCPM: An Online Framework for Detecting Overlapping Communities in Dynamic Social Networks

Souâad Boudebza, Rémy Cazabet, Faiçal Azouaou and Omar Noual

Computer Communications, Elsevier, In press, 2018.

Community structure is one of the most prominent features of complex networks. Community structure detection is of great importance to provide insights into the network structure and functionalities. Most proposals focus on static networks. However, finding communities in a dynamic network is even more challenging, especially when communities overlap with each other. In this article , we present an online algorithm, called OLCPM, based on clique percolation and label propagation methods. OLCPM can detect overlapping communities and works on temporal networks with a fine granularity. By locally updating the community structure, OLCPM delivers significant improvement in running time compared with previous clique percolation techniques. The experimental results on both synthetic and real-world networks illustrate the effectiveness of the method.

Download

Enumerating maximal cliques in link streams with durations

Tiphaine Viard, Clémence Magnien, and Matthieu Latapy

Information Processing Letters 133 (2018), p. 44-48

Link streams model interactions over time, and a clique in a link stream is defined as a set of nodes and a time interval such that all pairs of nodes in this set interact permanently during this time interval. This notion was introduced recently in the case where interactions are instantaneous. We generalize it to the case of interactions with durations and show that the instantaneous case actually is a particular case of the case with durations. We propose an algorithm to detect maximal cliques that improves our previous one for instantaneous link streams, and performs better than the state of the art algorithms in several cases of interest.

Download

Discovering Patterns of Interest in IP Traffic Using Cliques in Bipartite Link Streams

Tiphaine Viard, Raphaël Fournier-S’niehotta, Clémence Magnien and Matthieu Latapy

International Conference on Complex Networks (CompleNet), 2018.

Studying IP traffic is crucial for many applications. We focus here on the detection of (structurally and temporally) dense sequences of interactions, that may indicate botnets or coordinated network scans. More precisely, we model a MAWI capture of IP traffic as a link streams, i.e. a sequence of interactions (t1,t2,u,v) meaning that devices u and v exchanged packets from time t1 to time t2 . This traffic is captured on a single router and so has a bipartite structure: links occur only between nodes in two disjoint sets. We design a method for finding interesting bipartite cliques in such link streams, i.e. two sets of nodes and a time interval such that all nodes in the first set are linked to all nodes in the second set throughout the time interval. We then explore the bipartite cliques present in the considered trace. Comparison with the MAWILab classification of anomalous IP addresses shows that the found cliques succeed in detecting anomalous network activity.

Stream Graphs and Link Streams for the Modeling of Interactions over Time

Matthieu Latapy, Tiphaine Viard and Clémence Magnien

Social Networks Analysis and Mining, 8: 61, 2018

Graph theory provides a language for studying the structure of relations, and it is often used to study interactions over time too. However, it poorly captures the both temporal and structural nature of interactions, that calls for a dedicated formalism. In this paper, we generalize graph concepts in order to cope with both aspects in a consistent way. We start with elementary concepts like density, clusters, or paths, and derive from them more advanced concepts like cliques, degrees, clustering coefficients, or connected components. We obtain a language to directly deal with interactions over time, similar to the language provided by graphs to deal with relations. This formalism is self-consistent: usual relations between different concepts are preserved. It is also consistent with graph theory: graph concepts are special cases of the ones we introduce. This makes it easy to generalize higher-level objects such as quotient graphs, line graphs, k-cores, and centralities. This paper also considers discrete versus continuous time assumptions, instantaneous links, and extensions to more complex cases.

Comparaison de mesures de centralité basées sur les plus courts chemins dans les réseaux dynamiques

Marwan Ghanem , Clémence Magnien, Fabien Tarissan

Extraction et Gestion de Connaissances (EGC 2018)

Définir l’importance des noeuds dans les réseaux statiques est une question de recherche très étudiée depuis de nombreuses années. Dernièrement, des adaptations des métriques classiques ont été proposées pour les réseaux dynamiques. Ces méthodes reposent sur des approches très différentes dans leur façon d’évaluer l’importance des noeuds à un instant donné. Il est donc nécessaire de pouvoir les évaluer et les comparer. Dans cet article, nous comparons trois approches existes pour mieux comprendre ce qui les différencie. Nous montrons que la nature des jeux de données influe grandement sur le comportement des méthodes, et que pour certains d’entre eux, la notion d’importance n’est pas toujours  pertinente.

Culture et (in)dépendance – Les enjeux de lindépendance dans les industries culturelles

Olivier Alexandre, Sophie Noël and Aurélie Pinto

ICCA – Industries culturelles, création, numérique, 2017

« Indépendant », « alternatif », « indie », « underground », « avant-garde », « de création »… Depuis les années 1970, la revendication d’indépendance a pris une importance grandissante dans les univers de production culturelle. Qu’elle se rapporte à des contenus, des méthodes de travail ou des dispositifs de médiation, cette revendication propose une alternative à la domination des groupes et des productions mainstream. Son succès conduit cependant à s’interroger sur la cohérence même d’une notion progressivement transformée en label de qualité. À travers douze contributions traitant de l’édition, du cinéma, de la musique, des médias et de la vulgarisation scientifique, cet ouvrage montre en quoi l’indépendance relève d’une construction sociale tributaire de son environnement institutionnel et marchand. Des ondes aux écrans, de l’Europe aux États-Unis, des managers aux artistes, il met en évidence le balancement entre artisanat de création et recherche d’une structuration économique pérenne. En mettant à distance la dénonciation ritualisée de l’hégémonie des majors et autres « grands groupes » et en s’appuyant sur des terrains ancrés dans différents contextes nationaux, ce livre fait le pari d’une approche transversale pour mieux saisir la manière dont l’indépendance irrigue et structure des filières trop souvent envisagées de manière cloisonnée. Il éclaire ainsi une catégorie de référence des industries culturelles paradoxalement peu étudiée par les sciences sociales, et permet de saisir l’évolution des rapports de force dans des secteurs confrontés à une rationalisation économique et à des mutations technologiques de grande ampleur.

Tracking bitcoin users activity using community detection on a network of weak signals.

Rémy Cazabet, Rym Baccour and Matthieu Latapy

The 6th International Conference on Complex Networks and Their Applications, Nov 2017, Lyon, France.

Abstract Bitcoin is a cryptocurrency attracting a lot of interest both from the gen- eral public and researchers. There is an ongoing debate on the question of users? anonymity: while the Bitcoin protocol has been designed to ensure that the activ- ity of individual users could not be tracked, some methods have been proposed to partially bypass this limitation. In this article, we show how the Bitcoin transac- tion network can be studied using complex networks analysis techniques, and in particular how community detection can be efficiently used to re-identify multiple addresses belonging to a same user.

Download

Dynamic Community Detection

Cazabet, Rémy, Giulio Rossetti, and Fredéric Amblard. 

Encyclopedia of Social Network Analysis and Mining, 2017

Impact of temporal features of cattle exchanges on the size and speed of epidemic outbreaks

Aurore Payen, Lionel Tabourier and Matthieu Latapy

ICCSA 2017, Part II, LNCS 10405 proceedings

Databases recording cattle exchanges offer unique opportunities for a better understanding and fighting of disease spreading. Most studies model contacts with (sequences of) networks, but this approach neglects important dynamical features of exchanges, that are known to play a key role in spreading. We use here a fully dynamic modeling of contacts and empirically compare the spreading outbreaks obtained with it to the ones obtained with network approaches. We show that neglecting time information leads to significant over-estimates of actual sizes of spreading cascades, and that these sizes are much more heterogeneous than generally assumed. Our approach also makes it possible to study the speed of spreading, and we show that the observed speeds vary greatly, even for a same cascade size.

Download

Combining structural and dynamic information to predict activity in link streams

Thibaud Arnoux, Lionel Tabourier and Matthieu Latapy

In proceedings of the International Symposium on Foundations and Applications of Big Data Analytics (FAB), in conjunction with ASONAM, 2017.

A link stream is a sequence of triplets (t, u, v) meaning that nodes u and v have interacted at time t. Capturing both the structural and temporal aspects of interactions is crucial for many real world datasets like contact between individuals. We tackle the issue of activity prediction in link streams, that is to say predicting the number of links occurring during a given period of time and we present a protocol that takes advantage of the temporal and structural information contained in the link stream. We introduce a way to represent the information captured using different features and combine them in a prediction function which is used to evaluate the future activity of links.

Download

Ego-betweenness centrality in link streams

Marwan Ghanem, Florent Coriat and Lionel Tabourier

6th Workshop on Social Network Analysis in Applications (SNAA 2017), in conjunction with ASONAM, 2017.

The ability of a node to relay information in a network is often measured using betweenness centrality. In order to take into account the fact that the role of the nodes vary through time, several adaptations of this concept have been proposed to time- evolving networks. However, these definitions are demanding in terms of computational cost, as they call for the computation of time-ordered paths. We propose a definition of centrality in link streams which is node-centric, in the sense that we only take into account the direct neighbors of a node to compute its centrality. This restriction allows to carry out the computation in a shorter time compared to a case where any couple of nodes in the network should be considered. Tests on empirical data show that this measure is relatively highly correlated to the number of times a node would relay information in a flooding process. We suggest that this is a good indication that this measurement can be of use in practical contexts where a node has a limited knowledge of its environment, such as routing protocols in delay tolerant networks.

Download

Tracking the evolution of temporal patterns of usage in bicycle-Sharing systems using nonnegative matrix factorization on multiple sliding windows

Remy Cazabet, Pablo Jensen, Pierre Borgnat

International Journal of Urban Sciences, 2017, p. 1-15

Bicycle-Sharing Systems (BSS) are growing quickly in popularity all over the world. In this article, we propose a method based on Nonnegative Matrix Factorization to study the typical temporal patterns of usage of the BSS of Lyon, France, by studying logs of rentals. First, we show how this approach allows us to understand the spatial and temporal usage of the system. Second, we show how we can track the evolution of these temporal patterns over several years, and how this information can be used to better understand the BSS, but also changes in the city itself, by considering the stations as social sensors.

Download

Time Weight Content-Based Extensions of Temporal Graphs for Personalized Recommendation

Armel Jacques Nzekon Nzeko’o, Maurice Tchuente, Matthieu Latapy

In WEBIST 2017

Recommender systems are an answer to information overload on the web. They filter and present to customer, a small subset of items that he is most likely to be interested in. Since user’s interests may change over time, accurately capturing these dynamics is important, though challenging. The Session-based Temporal Graph (STG) has been proposed by Xiang et al. to provide temporal recommendations by combining long- and short-term preferences. Later, Yu et al. have introduced an extension called Topic-STG, which takes into account topics extracted from tweets’ textual information. Recently, we pushed the idea further and proposed Content-based STG. However, in all these frameworks, the importance of links does not depend on their arrival time, which is a strong limitation: at any given time, purchases made last week should have a greater influence than purchases made a year ago. In this paper, we address this problem by proposing Time Weight Content-based STG, in which we assign a time-decreasing weight to edges. Using Time-Averaged Hit Ratio, we show that this approach outperforms all previous ones in real-world situations.

Download

Using Degree Constrained Gravity Null-Models to understand the structure of journeys networks in Bicycle Sharing Systems

Remy Cazabet, Pierre Borgnat, Pablo jensen

In ESANN 2017

Bicycle Sharing Systems are now ubiquitous in large cities around the world. In most of these systems, journeys’ data can be ex- tracted, providing rich information to better understand it. Recent works have used network based-machine learning, and in particular space-corrected node clustering, to analyse such datasets. In this paper, we show that spatial-null models used in previous methods have a systematic bias, and we propose a degree-contrained null-model to improve the results. We finally apply the proposed method on the BSS of a city.

Download

Enhancing Space-Aware Community Detection Using Degree Constrained Spatial Null Model

Remy Cazabet, Pierre Borgnat, Pablo Jensen

In Complenet 2017

Null models have many applications on networks, from testing the significance of observations to the conception of algorithms such as community detection. They usually preserve some network properties , such as degree distribution. Recently, some null-models have been proposed for spatial networks, and applied to the community detection problem. In this article, we propose a new null-model adapted to spatial networks, that, unlike previous ones, preserves both the spatial structure and the degrees of nodes. We show the efficacy of this null-model in the community detection case both on synthetic and collected networks.

Download

Characterising inter and intra-community interactions in link streams using temporal motifs

Jean Creusefond, Remy Cazabet

In Complenet 2017

The analysis of dynamic networks has received a lot of attention in recent years, thanks to the greater availability of suitable datasets. One way to analyze such dataset is to study temporal motifs in link streams , i.e. sequences of links for which we can assume causality. In this article, we study the relationship between temporal motifs and communities , another important topic of complex networks. Through experiments on several real-world networks, with synthetic and ground truth community partitions, we identify motifs that are overrepresented at the frontier –or inside of– communities.

Download

Rigorous Measurement of the Internet Degree Distribution

Matthieu Latapy, Elie Rotenberg, Christophe Crespelle, Fabien Tarissan

In Complex Systems, volume 26, number 1 (2017)

The degree distribution of the internet, i.e. the fraction of routers with k links for any k, is its most studied property. It has a crucial influence on network robustness, spreading phenomena, and protocol design. In practice, however, this distribution is observed on partial, biased and erroneous maps. This raises serious concerns about the true knowledge we actually have of this key property. Here, we design and run a drastically new measurement approach for the reliable estimation of the degree distribution of the internet, without resorting to any map. It consists in sampling random core routers and precisely estimating their degree with probes sent from many monitors scattered over the internet. Our measurement shows that the true degree distribution significantly differs from classical assumptions: it is heterogeneous but it decreases sharply, in a way incompatible with a heavy-tailed power law.

Download