OLCPM: An Online Framework for Detecting Overlapping Communities in Dynamic Social Networks

Souâad Boudebza, Rémy Cazabet, Faiçal Azouaou and Omar Noual

Computer Communications, Elsevier, In press, 2018.

Community structure is one of the most prominent features of complex networks. Community structure detection is of great importance to provide insights into the network structure and functionalities. Most proposals focus on static networks. However, finding communities in a dynamic network is even more challenging, especially when communities overlap with each other. In this article , we present an online algorithm, called OLCPM, based on clique percolation and label propagation methods. OLCPM can detect overlapping communities and works on temporal networks with a fine granularity. By locally updating the community structure, OLCPM delivers significant improvement in running time compared with previous clique percolation techniques. The experimental results on both synthetic and real-world networks illustrate the effectiveness of the method.

Download

Enumerating maximal cliques in link streams with durations

Tiphaine Viard, Clémence Magnien, and Matthieu Latapy

Information Processing Letters 133 (2018), p. 44-48

Link streams model interactions over time, and a clique in a link stream is defined as a set of nodes and a time interval such that all pairs of nodes in this set interact permanently during this time interval. This notion was introduced recently in the case where interactions are instantaneous. We generalize it to the case of interactions with durations and show that the instantaneous case actually is a particular case of the case with durations. We propose an algorithm to detect maximal cliques that improves our previous one for instantaneous link streams, and performs better than the state of the art algorithms in several cases of interest.

Download

Discovering Patterns of Interest in IP Traffic Using Cliques in Bipartite Link Streams

Tiphaine Viard, Raphaël Fournier-S’niehotta, Clémence Magnien and Matthieu Latapy

International Conference on Complex Networks (CompleNet), 2018.

Studying IP traffic is crucial for many applications. We focus here on the detection of (structurally and temporally) dense sequences of interactions, that may indicate botnets or coordinated network scans. More precisely, we model a MAWI capture of IP traffic as a link streams, i.e. a sequence of interactions (t1,t2,u,v) meaning that devices u and v exchanged packets from time t1 to time t2 . This traffic is captured on a single router and so has a bipartite structure: links occur only between nodes in two disjoint sets. We design a method for finding interesting bipartite cliques in such link streams, i.e. two sets of nodes and a time interval such that all nodes in the first set are linked to all nodes in the second set throughout the time interval. We then explore the bipartite cliques present in the considered trace. Comparison with the MAWILab classification of anomalous IP addresses shows that the found cliques succeed in detecting anomalous network activity.

Stream Graphs and Link Streams for the Modeling of Interactions over Time

Matthieu Latapy, Tiphaine Viard and Clémence Magnien

Social Networks Analysis and Mining, 8: 61, 2018

Graph theory provides a language for studying the structure of relations, and it is often used to study interactions over time too. However, it poorly captures the both temporal and structural nature of interactions, that calls for a dedicated formalism. In this paper, we generalize graph concepts in order to cope with both aspects in a consistent way. We start with elementary concepts like density, clusters, or paths, and derive from them more advanced concepts like cliques, degrees, clustering coefficients, or connected components. We obtain a language to directly deal with interactions over time, similar to the language provided by graphs to deal with relations. This formalism is self-consistent: usual relations between different concepts are preserved. It is also consistent with graph theory: graph concepts are special cases of the ones we introduce. This makes it easy to generalize higher-level objects such as quotient graphs, line graphs, k-cores, and centralities. This paper also considers discrete versus continuous time assumptions, instantaneous links, and extensions to more complex cases.

Comparaison de mesures de centralité basées sur les plus courts chemins dans les réseaux dynamiques

Marwan Ghanem , Clémence Magnien, Fabien Tarissan

Extraction et Gestion de Connaissances (EGC 2018)

Définir l’importance des noeuds dans les réseaux statiques est une question de recherche très étudiée depuis de nombreuses années. Dernièrement, des adaptations des métriques classiques ont été proposées pour les réseaux dynamiques. Ces méthodes reposent sur des approches très différentes dans leur façon d’évaluer l’importance des noeuds à un instant donné. Il est donc nécessaire de pouvoir les évaluer et les comparer. Dans cet article, nous comparons trois approches existes pour mieux comprendre ce qui les différencie. Nous montrons que la nature des jeux de données influe grandement sur le comportement des méthodes, et que pour certains d’entre eux, la notion d’importance n’est pas toujours  pertinente.

Culture et (in)dépendance – Les enjeux de lindépendance dans les industries culturelles

Olivier Alexandre, Sophie Noël and Aurélie Pinto

ICCA – Industries culturelles, création, numérique, 2017

« Indépendant », « alternatif », « indie », « underground », « avant-garde », « de création »… Depuis les années 1970, la revendication d’indépendance a pris une importance grandissante dans les univers de production culturelle. Qu’elle se rapporte à des contenus, des méthodes de travail ou des dispositifs de médiation, cette revendication propose une alternative à la domination des groupes et des productions mainstream. Son succès conduit cependant à s’interroger sur la cohérence même d’une notion progressivement transformée en label de qualité. À travers douze contributions traitant de l’édition, du cinéma, de la musique, des médias et de la vulgarisation scientifique, cet ouvrage montre en quoi l’indépendance relève d’une construction sociale tributaire de son environnement institutionnel et marchand. Des ondes aux écrans, de l’Europe aux États-Unis, des managers aux artistes, il met en évidence le balancement entre artisanat de création et recherche d’une structuration économique pérenne. En mettant à distance la dénonciation ritualisée de l’hégémonie des majors et autres « grands groupes » et en s’appuyant sur des terrains ancrés dans différents contextes nationaux, ce livre fait le pari d’une approche transversale pour mieux saisir la manière dont l’indépendance irrigue et structure des filières trop souvent envisagées de manière cloisonnée. Il éclaire ainsi une catégorie de référence des industries culturelles paradoxalement peu étudiée par les sciences sociales, et permet de saisir l’évolution des rapports de force dans des secteurs confrontés à une rationalisation économique et à des mutations technologiques de grande ampleur.

Tracking bitcoin users activity using community detection on a network of weak signals.

Rémy Cazabet, Rym Baccour and Matthieu Latapy

The 6th International Conference on Complex Networks and Their Applications, Nov 2017, Lyon, France.

Abstract Bitcoin is a cryptocurrency attracting a lot of interest both from the gen- eral public and researchers. There is an ongoing debate on the question of users? anonymity: while the Bitcoin protocol has been designed to ensure that the activ- ity of individual users could not be tracked, some methods have been proposed to partially bypass this limitation. In this article, we show how the Bitcoin transac- tion network can be studied using complex networks analysis techniques, and in particular how community detection can be efficiently used to re-identify multiple addresses belonging to a same user.

Download

Dynamic Community Detection

Cazabet, Rémy, Giulio Rossetti, and Fredéric Amblard. 

Encyclopedia of Social Network Analysis and Mining, 2017

Impact of temporal features of cattle exchanges on the size and speed of epidemic outbreaks

Aurore Payen, Lionel Tabourier and Matthieu Latapy

ICCSA 2017, Part II, LNCS 10405 proceedings

Databases recording cattle exchanges offer unique opportunities for a better understanding and fighting of disease spreading. Most studies model contacts with (sequences of) networks, but this approach neglects important dynamical features of exchanges, that are known to play a key role in spreading. We use here a fully dynamic modeling of contacts and empirically compare the spreading outbreaks obtained with it to the ones obtained with network approaches. We show that neglecting time information leads to significant over-estimates of actual sizes of spreading cascades, and that these sizes are much more heterogeneous than generally assumed. Our approach also makes it possible to study the speed of spreading, and we show that the observed speeds vary greatly, even for a same cascade size.

Download

Combining structural and dynamic information to predict activity in link streams

Thibaud Arnoux, Lionel Tabourier and Matthieu Latapy

In proceedings of the International Symposium on Foundations and Applications of Big Data Analytics (FAB), in conjunction with ASONAM, 2017.

A link stream is a sequence of triplets (t, u, v) meaning that nodes u and v have interacted at time t. Capturing both the structural and temporal aspects of interactions is crucial for many real world datasets like contact between individuals. We tackle the issue of activity prediction in link streams, that is to say predicting the number of links occurring during a given period of time and we present a protocol that takes advantage of the temporal and structural information contained in the link stream. We introduce a way to represent the information captured using different features and combine them in a prediction function which is used to evaluate the future activity of links.

Download

Ego-betweenness centrality in link streams

Marwan Ghanem, Florent Coriat and Lionel Tabourier

6th Workshop on Social Network Analysis in Applications (SNAA 2017), in conjunction with ASONAM, 2017.

The ability of a node to relay information in a network is often measured using betweenness centrality. In order to take into account the fact that the role of the nodes vary through time, several adaptations of this concept have been proposed to time- evolving networks. However, these definitions are demanding in terms of computational cost, as they call for the computation of time-ordered paths. We propose a definition of centrality in link streams which is node-centric, in the sense that we only take into account the direct neighbors of a node to compute its centrality. This restriction allows to carry out the computation in a shorter time compared to a case where any couple of nodes in the network should be considered. Tests on empirical data show that this measure is relatively highly correlated to the number of times a node would relay information in a flooding process. We suggest that this is a good indication that this measurement can be of use in practical contexts where a node has a limited knowledge of its environment, such as routing protocols in delay tolerant networks.

Download

Tracking the evolution of temporal patterns of usage in bicycle-Sharing systems using nonnegative matrix factorization on multiple sliding windows

Remy Cazabet, Pablo Jensen, Pierre Borgnat

International Journal of Urban Sciences, 2017, p. 1-15

Bicycle-Sharing Systems (BSS) are growing quickly in popularity all over the world. In this article, we propose a method based on Nonnegative Matrix Factorization to study the typical temporal patterns of usage of the BSS of Lyon, France, by studying logs of rentals. First, we show how this approach allows us to understand the spatial and temporal usage of the system. Second, we show how we can track the evolution of these temporal patterns over several years, and how this information can be used to better understand the BSS, but also changes in the city itself, by considering the stations as social sensors.

Download

Time Weight Content-Based Extensions of Temporal Graphs for Personalized Recommendation

Armel Jacques Nzekon Nzeko’o, Maurice Tchuente, Matthieu Latapy

In WEBIST 2017

Recommender systems are an answer to information overload on the web. They filter and present to customer, a small subset of items that he is most likely to be interested in. Since user’s interests may change over time, accurately capturing these dynamics is important, though challenging. The Session-based Temporal Graph (STG) has been proposed by Xiang et al. to provide temporal recommendations by combining long- and short-term preferences. Later, Yu et al. have introduced an extension called Topic-STG, which takes into account topics extracted from tweets’ textual information. Recently, we pushed the idea further and proposed Content-based STG. However, in all these frameworks, the importance of links does not depend on their arrival time, which is a strong limitation: at any given time, purchases made last week should have a greater influence than purchases made a year ago. In this paper, we address this problem by proposing Time Weight Content-based STG, in which we assign a time-decreasing weight to edges. Using Time-Averaged Hit Ratio, we show that this approach outperforms all previous ones in real-world situations.

Download

Using Degree Constrained Gravity Null-Models to understand the structure of journeys networks in Bicycle Sharing Systems

Remy Cazabet, Pierre Borgnat, Pablo jensen

In ESANN 2017

Bicycle Sharing Systems are now ubiquitous in large cities around the world. In most of these systems, journeys’ data can be ex- tracted, providing rich information to better understand it. Recent works have used network based-machine learning, and in particular space-corrected node clustering, to analyse such datasets. In this paper, we show that spatial-null models used in previous methods have a systematic bias, and we propose a degree-contrained null-model to improve the results. We finally apply the proposed method on the BSS of a city.

Download

Enhancing Space-Aware Community Detection Using Degree Constrained Spatial Null Model

Remy Cazabet, Pierre Borgnat, Pablo Jensen

In Complenet 2017

Null models have many applications on networks, from testing the significance of observations to the conception of algorithms such as community detection. They usually preserve some network properties , such as degree distribution. Recently, some null-models have been proposed for spatial networks, and applied to the community detection problem. In this article, we propose a new null-model adapted to spatial networks, that, unlike previous ones, preserves both the spatial structure and the degrees of nodes. We show the efficacy of this null-model in the community detection case both on synthetic and collected networks.

Download

Characterising inter and intra-community interactions in link streams using temporal motifs

Jean Creusefond, Remy Cazabet

In Complenet 2017

The analysis of dynamic networks has received a lot of attention in recent years, thanks to the greater availability of suitable datasets. One way to analyze such dataset is to study temporal motifs in link streams , i.e. sequences of links for which we can assume causality. In this article, we study the relationship between temporal motifs and communities , another important topic of complex networks. Through experiments on several real-world networks, with synthetic and ground truth community partitions, we identify motifs that are overrepresented at the frontier –or inside of– communities.

Download

Rigorous Measurement of the Internet Degree Distribution

Matthieu Latapy, Elie Rotenberg, Christophe Crespelle, Fabien Tarissan

In Complex Systems, volume 26, number 1 (2017)

The degree distribution of the internet, i.e. the fraction of routers with k links for any k, is its most studied property. It has a crucial influence on network robustness, spreading phenomena, and protocol design. In practice, however, this distribution is observed on partial, biased and erroneous maps. This raises serious concerns about the true knowledge we actually have of this key property. Here, we design and run a drastically new measurement approach for the reliable estimation of the degree distribution of the internet, without resorting to any map. It consists in sampling random core routers and precisely estimating their degree with probes sent from many monitors scattered over the internet. Our measurement shows that the true degree distribution significantly differs from classical assumptions: it is heterogeneous but it decreases sharply, in a way incompatible with a heavy-tailed power law.

Download

Finding remarkably dense sequences of contacts in link streams

Noé Gaumont, Clémence Magnien and Matthieu Latapy

In Social Network Analysis and Mining (2016) 6: 87. doi:10.1007/s13278-016-0396-z

A link stream is a set of quadruplets (b, e, u, v) meaning that a link exists between u and v from time b to time e. Link streams model many real-world situations like contacts between individuals, connections between devices, and others. Much work is currently devoted to the generalization of classical graph and network concepts to link streams. We argue that the density is a valuable notion for understanding and characterizing links streams. We propose a method to capture specific groups of links that are structurally and temporally densely connected and show that they are meaningful for the description of link streams. To find such groups, we use classical graph community detection algorithms, and we assess obtained groups. We apply our method to several real-world contact traces (captured by sensors) and demonstrate the relevance of the obtained structures.

Download

P2PTV Multi-channel Peers Analysis

Marwan Ghanem, Olivier Fourmaux, Fabien Tarissan and Takumi Miyoshi

In The 18th Asia-Pacific Network Operations and Management Symposium (APNOMS’16), Kanazawa, Japan, 2016.

After being the support of the data and voice convergence, the Internet has become one of the main video providers such as TV-stream. As an  alternative to limited or expensive technologies, P2PTV has turned out to be a promising support for such applications. This infrastructure strongly relies on the overlay composed by the peers that consume and diffuse video contents at the same time. Understanding the dynamical properties of this overlay, and in particular how the users switch from one overlay to another, appears to be a key aspect if one wants to improve the quality of P2PTV. In this paper, we investigate the question of relying on non-invasive measurement techniques to track the presence of users on several channels of P2PTV. Using two datasets obtained by using network measurement on P2PTV infrastructure, we show that such an approach contains sufficient information to track the presence of users on several channels. Besides, exploiting the view provided by sliding time windows, we are able to refine the analysis and track users that switch from one channel to another, leading to the detection of super-peers and providing explanations of the different roles they can play in the infrastructure. In addition, by comparing the results obtained on the two datasets, we show how such analyses can shed some light on the evolution of the infrastructure policy.

Download