Finding remarkably dense sequences of contacts in link streams

Noé Gaumont, Clémence Magnien and Matthieu Latapy

In Social Network Analysis and Mining (2016) 6: 87. doi:10.1007/s13278-016-0396-z

A link stream is a set of quadruplets (b, e, u, v) meaning that a link exists between u and v from time b to time e. Link streams model many real-world situations like contacts between individuals, connections between devices, and others. Much work is currently devoted to the generalization of classical graph and network concepts to link streams. We argue that density is a valuable notion for understanding and characterizing link streams. We propose a method to capture specific groups of links that are structurally and temporally densely connected and show that they are meaningful for the description of link streams. To find such groups, we use classical graph community detection algorithms, and we assess the groups obtained. We apply our method to several real-world contact traces (captured by sensors) and demonstrate the relevance of the obtained structures.
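To make the notion of density concrete, here is a minimal sketch for a stream of (b, e, u, v) quadruplets. The abstract does not spell out the formula, so the definition used here is an assumption: total link duration divided by the largest possible total duration (number of node pairs times the length of the observation window). The function name `stream_density` and its parameters are hypothetical.

```python
def stream_density(links, nodes=None, span=None):
    """Density of a link stream given as (b, e, u, v) quadruplets.

    Assumed definition (not taken from the abstract): sum of link
    durations divided by (number of unordered node pairs) x (window
    length). Links between the same pair are assumed not to overlap.
    """
    links = list(links)
    if not links:
        return 0.0
    if nodes is None:
        # Default node set: every node that appears in some link.
        nodes = {x for _, _, u, v in links for x in (u, v)}
    if span is None:
        # Default window: from the first beginning to the last end.
        span = (min(b for b, _, _, _ in links),
                max(e for _, e, _, _ in links))
    n = len(nodes)
    window = span[1] - span[0]
    if n < 2 or window <= 0:
        return 0.0
    total_duration = sum(e - b for b, e, _, _ in links)
    return total_duration / (n * (n - 1) / 2 * window)


stream = [(0, 4, "a", "b"), (2, 6, "b", "c"), (5, 10, "a", "c")]
print(stream_density(stream))  # 13 time units of links out of 30 possible
```

Restricting `links`, `nodes` and `span` to a group of links gives a density for that group, which is one way to compare candidate groups.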

Download

Mining Time-stamped Network Data: Spectral Methods and Centrality Measures

Ingo Scholtes

Wednesday, October 19, 2016 at 11:00, room 24-25/405

Recent research has highlighted limitations of studying complex systems with time-varying topologies from the perspective of static, time-aggregated networks. Non-Markovian characteristics resulting from the specific ordering of interactions in temporal networks were identified as one important mechanism that alters causality and affects dynamical processes. So far, an analytical explanation for this phenomenon and for the significant variations observed across different systems is missing. Summarizing our recent research in this area, in this talk I will introduce a framework that makes it possible to analyze temporal networks with non-Markovian characteristics. The framework is based on higher-order aggregate networks, a simple generalization of the commonly used static representation of temporal network data. I will show that spectral properties of such higher-order aggregate networks can explain the slow-down of diffusion processes compared to aggregate networks, which has been observed in a number of empirical data sets. I will further show that we can derive an exact analytical prediction for the magnitude of this change compared to the weighted, time-aggregated network. Finally, I will present recent results on the analysis of node centralities in non-Markovian temporal networks, concluding that this approach provides interesting perspectives for (i) temporal community detection by spectral clustering, (ii) refined measures of centrality for time-evolving networks, and (iii) analytical studies of dynamical processes in complex systems with time-evolving interaction topologies.
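As a rough illustration of what a higher-order aggregate network can look like, the sketch below builds a second-order network from time-stamped directed contacts. The construction is an assumption, not taken from the abstract: second-order nodes are first-order links, and two of them are connected when they form a time-respecting path a → b → c whose two contacts are at most `max_gap` time units apart. The names `second_order_edges` and `max_gap` are hypothetical.

```python
from collections import Counter


def second_order_edges(contacts, max_gap=1):
    """Weighted second-order links from (t, source, target) contacts.

    A second-order link ((a, b), (b, c)) is counted once for every pair
    of contacts a->b at time t1 and b->c at time t2 with
    0 < t2 - t1 <= max_gap (assumed definition of a causal path).
    """
    contacts = list(contacts)
    # Index contacts by their source node to find continuations quickly.
    by_source = {}
    for t, u, v in contacts:
        by_source.setdefault(u, []).append((t, v))
    weights = Counter()
    for t1, a, b in contacts:
        for t2, c in by_source.get(b, []):
            if 0 < t2 - t1 <= max_gap:
                weights[((a, b), (b, c))] += 1
    return weights


contacts = [(1, "a", "b"), (2, "b", "c"), (5, "b", "d")]
print(second_order_edges(contacts, max_gap=1))
```

With `max_gap=1` only the path a → b → c is kept, whereas the static aggregate network would also suggest a path a → b → d that the ordering and timing of contacts do not support.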

On the notion of hyperbolicity in graphs

David Coudert

Room 24-25/405

The Gromov hyperbolicity is an important parameter for analyzing complex networks since it measures the tree-likeness of a graph from a metric perspective. This notion has recently been applied in different contexts, such as the design of routing schemes, network security, computational biology, the analysis of graph algorithms, and the classification of complex networks. For instance, it gives bounds on the best possible stretch of some greedy-routing schemes in Internet-like graphs. The best known algorithm for computing hyperbolicity has running time O(n^{3.69}), which is clearly prohibitive for big graphs. In this talk, we will present recent advances on the computation of this parameter. We will present an algorithm that performs well in practice. Although its time complexity is in O(n^4), it can compute the hyperbolicity of graphs with 100,000 nodes in a short time. We will discuss the limitations of this algorithm and related open problems. We will also provide some results on the time complexity lower bound. In addition, we will show how to use some simple properties of the graph to get tight bounds on its hyperbolicity, e.g., being bipartite, a line graph, vertex transitive, etc. These properties are used for establishing bounds on the hyperbolicity of various interconnection network topologies proposed for large data centers.
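For reference, the sketch below is the brute-force baseline, not the algorithm presented in the talk. It applies the four-point definition directly: for every quadruple of vertices, the three pairwise distance sums are computed, and the hyperbolicity is the largest value of (largest sum minus second largest sum) / 2. It assumes a connected, unweighted graph given as an adjacency dictionary and is only practical for small graphs.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def hyperbolicity(adj):
    """Four-point hyperbolicity by exhaustive search over quadruples."""
    d = {u: bfs_distances(adj, u) for u in adj}
    best = 0.0
    for a, b, c, e in combinations(adj, 4):
        sums = sorted((d[a][b] + d[c][e],
                       d[a][c] + d[b][e],
                       d[a][e] + d[b][c]))
        best = max(best, (sums[2] - sums[1]) / 2)
    return best


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(hyperbolicity(path), hyperbolicity(cycle))
```

The path, being a tree, has hyperbolicity 0, while the 4-cycle has hyperbolicity 1 under this definition.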

Classification of online discussions: is tree structure sufficient enough?

Mattias Mano

Friday, September 30 at 10:30, room 26-00/332

Slides

Over the past years, online communities have grown considerably. In particular, a new kind of forum has emerged: the « Questions and Answers » (Q&A) forums. They cover many different fields, from opinion questions (such as Yahoo! Answers, Reddit, Quora, …) to highly technical questions in Computer Science (Stack Overflow, Bugzilla, …) or in Mathematics (Math Overflow). We study the opinion forum « Reddit – Change My View », where participants debate societal topics. Using graph-theoretic modeling, I ask whether structural information about the discussion (graph) is sufficient to characterize what happens in the discussion.

P2PTV Multi-channel Peers Analysis

Marwan Ghanem, Olivier Fourmaux, Fabien Tarissan and Takumi Miyoshi

In The 18th Asia-Pacific Network Operations and Management Symposium (APNOMS’16), Kanazawa, Japan, 2016.

After supporting the convergence of data and voice, the Internet has become one of the main providers of video content such as TV streams. As an alternative to limited or expensive technologies, P2PTV has turned out to be a promising support for such applications. This infrastructure strongly relies on the overlay composed by the peers that consume and diffuse video contents at the same time. Understanding the dynamical properties of this overlay, and in particular how the users switch from one overlay to another, appears to be a key aspect if one wants to improve the quality of P2PTV. In this paper, we investigate the question of relying on non-invasive measurement techniques to track the presence of users on several channels of P2PTV. Using two datasets obtained through network measurement on P2PTV infrastructure, we show that such an approach contains sufficient information to track the presence of users on several channels. Besides, exploiting the view provided by sliding time windows, we are able to refine the analysis and track users that switch from one channel to another, leading to the detection of super-peers and providing explanations of the different roles they can play in the infrastructure. In addition, by comparing the results obtained on the two datasets, we show how such analyses can shed some light on the evolution of the infrastructure policy.

Download

Predicting links in ego-networks using temporal information

Lionel Tabourier, Anne-Sophie Libert and Renaud Lambiotte

In EPJ Data Science (2016) 5: 1

Link prediction appears as a central problem of network science, as it calls for unfolding the mechanisms that govern the micro-dynamics of the network. In this work, we are interested in ego-networks, that is the mere information of interactions of a node to its neighbors, in the context of social relationships. As the structural information is very poor, we rely on another source of information to predict links among egos’ neighbors: the timing of interactions. We define several features to capture different kinds of temporal information and apply machine learning methods to combine these various features and improve the quality of the prediction. We demonstrate the efficiency of this temporal approach on a cellphone interaction dataset, pointing out features which prove themselves to perform well in this context, in particular the temporal profile of interactions and elapsed time between contacts.

Download

Characterizing and predicting mobile application usage

Keun-Woo Lim, Stefano Secci, Lionel Tabourier and Badis Tebbani

In Computer Communications, 2016, vol. 95, p. 82-94

In this paper, we propose data clustering techniques to predict temporal characteristics of data consumption behavior of different mobile applications via wireless communications. While most of the research on mobile data analytics focuses on the analysis of call data records and mobility traces, our analysis concentrates on mobile application usages, to characterize them and predict their behavior. We exploit mobile application usage logs provided by a Wi-Fi local area network service provider to characterize temporal behavior of mobile applications. More specifically, we generate daily profiles of “what” types of mobile applications users access and “when” users access them. From these profiles, we create usage classes of mobile applications via aggregation of similar profiles depending on data consumption rate, using three clustering techniques that we compare. Furthermore, we show that we can utilize these classes to analyze and predict future usages of each mobile application through progressive comparison using distance and similarity comparison techniques. Finally, we also detect and exploit outlying behavior in application usage profiles and discuss methods to efficiently predict them.

Download

Parameterized complexity: from graph minor theory to efficient algorithms

Christophe Paul

Friday, July 8, 2016 at 11:00, room 24-25/405

Slides

Parameterized complexity suggests a multi-parameter analysis of the computational complexity of hard problems. The idea is to understand the influence of parameters, distinct from the input size, on the resolution of a problem. Such parameters could be the solution size or structural parameters such as width parameters. After an introduction to parameterized complexity, we will present some of the algorithmic consequences of graph minor theory. From the work of Robertson and Seymour, it is known that every graph family closed under taking minors can be recognized in cubic time. However, for most such graph families, this result is existential only. Since then, constructive meta-algorithmic theorems have been proposed (including Courcelle's theorem) within the framework of parameterized complexity. We will discuss recent developments in this line of research that have led to efficient algorithms for large families of problems.

Graphs: is there something between theory and practice?

Gilles Tredan

Friday, June 24, 2016 at 11:00, room 24-25/405

My first part will focus on the problem of charting graphs: can we believe the maps we build for complex systems? My second part will present efforts to capture AFK social interaction networks (the Souk project), and figure out what to do with the obtained dynamic graphs. My third part is yet to be defined. My whole presentation will try to find a balance between algorithmic perspectives and data analysis.

Detection and classification of network traffic anomalies

Johan Mazel

Wednesday, June 8, 2016 at 11:00, room 24-25/405

Internet plays a central role in our lives. However, it is an extremely diverse and complex system. Ranging from non-malicious unexpected events such as flash-crowds and failures, to network attacks such as denials-of-service and network scans, network traffic anomalies can have serious detrimental effects on the performance and integrity of the network. Anomaly detection is thus paramount in order to guarantee users’ access to Internet resources. In this talk, we will address recent advances in network traffic anomaly detection and classification, that leverage graph analysis techniques, machine learning techniques and big data.

Suppressing diffusion processes on arbitrary networks using treatment resources of limited efficiency

Argyris Kalogeratos

Monday, June 13, 2016 at 11:00, room 26-00/332

Slides

In many real-life situations, it is critical to dynamically suppress or remove an undesired diffusion process (viruses, information, behaviors, etc.). The talk will present a framework for Dynamic Resource Allocation (DRA) assuming a continuous-time SIS epidemic model, and that a budget of treatment resources of limited efficiency is at the disposal of authorities. Special emphasis will be placed on the macro- and microscopic (or local) properties of the network structure for the problem, and two strategies will be presented that fall within this framework: a) a simple yet effective greedy approach, and b) a more sophisticated one that uses a precomputed priority plan of how the healing strategy should proceed on a specific network. Additionally, extensions to competitive scenarios will be discussed.

Kempe equivalence of colourings

Marthe Bonamy

Friday, June 3, 2016 at 11:00, room 24-25/405

Slides

Given a colouring of a graph, a Kempe change is the operation of picking a maximal bichromatic subgraph and switching the two colours in it. Two colourings are Kempe equivalent if they can be obtained from each other through a series of Kempe changes. Kempe changes were first introduced in a failed attempt to prove the Four Colour Theorem, but they proved to be a powerful tool for other colouring problems. They are also relevant for more applied questions, most notably in theoretical physics. Consider a graph with no vertex of degree more than some integer D. In 2007, Mohar conjectured that all its D-colourings are Kempe-equivalent. Feghali, Johnson and Paulusma proved in 2015 that this is true for D=3, with the exception of one single graph which disproves the conjecture in its generality. We settle the remaining cases by proving the conjecture holds for every integer D at least 4. This is a joint work with Nicolas Bousquet (LIRIS, Ecole Centrale Lyon, France), Carl Feghali (Durham University, UK) and Matthew Johnson (Durham University, UK).
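A Kempe change as defined above can be sketched in a few lines. The code starts from a vertex `v` and a second colour `other` (both hypothetical parameter names), collects the connected component containing `v` in the subgraph induced by the two colours, and swaps the colours inside it. The graph is assumed to be an adjacency dictionary and the colouring a dictionary from vertices to colours.

```python
def kempe_change(adj, colouring, v, other):
    """Return a new colouring obtained by a Kempe change at v.

    The Kempe chain is the connected component of v in the subgraph
    induced by vertices coloured colouring[v] or `other`.
    """
    a, b = colouring[v], other
    if a == b:
        return dict(colouring)
    chain = {v}
    stack = [v]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y not in chain and colouring[y] in (a, b):
                chain.add(y)
                stack.append(y)
    new_colouring = dict(colouring)
    for x in chain:
        # Swap the two colours inside the chain only.
        new_colouring[x] = b if colouring[x] == a else a
    return new_colouring


def is_proper(adj, colouring):
    """Check that no edge joins two vertices of the same colour."""
    return all(colouring[u] != colouring[w] for u in adj for w in adj[u])


adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
colouring = {0: 1, 1: 2, 2: 3, 3: 1}
changed = kempe_change(adj, colouring, v=0, other=2)
print(changed, is_proper(adj, changed))
```

Because the chain is a maximal bichromatic component, swapping its two colours cannot create a monochromatic edge, so a proper colouring stays proper.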

Temporal density of complex networks and ego-community dynamics

Sergey Kirgizov

Monday, July 4, 2016 at 11:00, room 24-25/405

Slides

First, we define an ego-community structure as a probability measure on the set of network nodes. Any subset of nodes may give rise to its own ego-community structure around it. Many community detection algorithms can be modified to yield a result of this type, for instance personalized PageRank. We also recall that community detection algorithms (including personalized PageRank) can be viewed from different perspectives: random walks, convergence of Markov chains, spectral clustering, optimization, min-cut(s), discrete Cheeger inequality(ies), etc. Next, we present a continuous version of Viard-Latapy-Magnien link streams, which we call « temporal density ». Classical kernel density estimation is used to move from discrete link streams towards their continuous counterparts. Using matrix perturbation theory, we can prove that the ego-community structure changes smoothly when the network evolves smoothly. This is very important, for example, for visualization purposes. Combining the temporal density and personalized PageRank methods, we are able to visualize and study the evolution of the ego-community structures of complex networks with a large number of temporal links in order to extract interesting information. For example, we can detect events, trace the evolution of the (ego-)community structure, etc. We illustrate and validate our approach using the « Primary school temporal network data » provided by sociopatterns.org, and we show how the temporal density can be applied to the study of very large datasets, such as a collection of tweets written by European Parliament candidates during the 2014 European Parliament election.
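As an illustration of how personalized PageRank yields a probability measure on the nodes around a chosen subset, here is a minimal power-iteration sketch. It is not the speaker's implementation: the damping value, the fixed number of iterations and the names `personalized_pagerank` and `seeds` are assumptions, and the graph is assumed undirected with no isolated node.

```python
def personalized_pagerank(adj, seeds, damping=0.85, iterations=100):
    """Personalized PageRank by power iteration.

    The random walk restarts uniformly on `seeds` with probability
    1 - damping; otherwise it moves to a uniformly chosen neighbour.
    Every node is assumed to have at least one neighbour.
    """
    seeds = set(seeds)
    nodes = list(adj)
    restart = {u: (1.0 / len(seeds) if u in seeds else 0.0) for u in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {u: (1 - damping) * restart[u] for u in nodes}
        for u in nodes:
            share = damping * rank[u] / len(adj[u])
            for w in adj[u]:
                nxt[w] += share
        rank = nxt
    return rank


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
measure = personalized_pagerank(path, seeds={0})
print(measure)
```

The values sum to 1 and decay away from the seed set, so the output can be read as an ego-community structure centred on `seeds`.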

Community structure: evaluation and analysis of patterns in link streams

Jean Creusefond

Friday, May 13, 2016 at 11:00, room 24-25/405

Slides

This talk has two relatively independent parts: the evaluation of community structures by means of ground truths, and the analysis of the community membership of patterns in link streams. Evaluating community structures theoretically is very delicate: many structural properties are considered important, so deeming one structure better than another involves arbitrary choices about these preferences, embodied in the choice of a quality function or of benchmarks. To avoid these problems, many researchers now evaluate their results by comparison with community structures extracted together with the datasets, arguing that closeness between their results and the ground truth is significant evidence of relevance. In this part, I will discuss a methodology that reconciles the two approaches and identifies which ground truths favor which quality functions. In particular, I will highlight the choice of the partition comparison function, often regarded as innocuous but in fact radically changing the results. For reference, the software developed (which includes a large number of community detection algorithms and quality functions) is fully available at the following address: https://codacom.greyc.fr/ In the second part, I will discuss ongoing work on the analysis of link streams: graphs in which each arc is labeled with a time and in which multi-arcs are possible. The link streams of interest here represent communication networks, that is, each arc represents a directed interaction between two users. Frequently, the community detection algorithms that attempt to analyze them aggregate the communication network over time windows, on which traditional (or adapted) methods can be applied. In this case, one piece of information is lost: the causality between links. For example, if a set of people systematically exhibits the same communication structure (e.g., whenever “A” interacts with “B”, the latter then systematically interacts with “C” and “D”), can the associated community structure be deduced? To assess the impact of this information, I studied patterns: chains of communication whose causality seems likely (the first interaction probably triggered the next one, and so on). The link between these patterns and the community structure therefore remains to be analyzed, and I will present the tools developed for this purpose along with some preliminary results.

Degeneracy-based mining of social and information networks: dynamics and applications

Fragkiskos Malliaros

Friday, April 1, 2016 at 11:00, room 24-25/405

Slides

Networks have become ubiquitous as data from diverse disciplines can naturally be mapped to graph structures. The problem of extracting meaningful information from large-scale graph data in an efficient and effective way has become crucial and challenging, with several important applications; to this end, graph mining and analysis methods constitute prominent tools. In this talk, I will present part of my work that builds upon computationally efficient graph mining methods in order to: (i) design models for analyzing the structure and dynamics of real-world networks towards unraveling properties that can further be used in practical applications; (ii) develop algorithmic tools for large-scale analytics on data with inherent (e.g., social networks) or without inherent (e.g., text) graph structure. Our approaches rely on the concepts of graph degeneracy and core decomposition in graphs. In particular, for the former point I will show how to model the engagement dynamics of large social networks and how to assess their vulnerability with respect to user departures from the network. In both cases, by unraveling the dynamics of real social networks, regularities and patterns about their structure and formation can be identified; such knowledge can further be used in various applications including churn prediction and anomaly detection. For the latter, I will present a core decomposition-based approach for locating influential nodes in complex networks, with direct applications to epidemic control and viral marketing.
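The core decomposition on which these approaches rely can be computed by repeatedly removing a vertex of minimum remaining degree. The sketch below is a simple quadratic-time version of this standard peeling procedure, written for readability rather than speed; it is not the speaker's code, and the graph is assumed to be an undirected adjacency dictionary.

```python
def core_numbers(adj):
    """Core number of every vertex, by minimum-degree peeling.

    A vertex has core number k if it belongs to the k-core (maximal
    subgraph with all degrees >= k) but not to the (k+1)-core.
    """
    degree = {u: len(adj[u]) for u in adj}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel a vertex of minimum remaining degree.
        u = min(remaining, key=degree.__getitem__)
        k = max(k, degree[u])
        core[u] = k
        remaining.remove(u)
        for w in adj[u]:
            if w in remaining:
                degree[w] -= 1
    return core


# A triangle a-b-c with a pendant vertex d attached to a.
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
print(core_numbers(adj))
```

The largest core number is the degeneracy of the graph; here the triangle forms the 2-core and the pendant vertex has core number 1.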

Modelling influence and opinion evolution in online collective behaviour

Samuel Martin

Friday, March 18, 2016 at 11:00, room 26-00/332

Slides

Opinion evolution and judgment revision are mediated through social influence. In this talk, I will present a study based on a crowdsourced in vitro experiment. The study shows how a consensus model can be used to predict opinion evolution in online collective behaviour. The model is parametrized by the influenceability of each individual, a factor representing to what extent individuals incorporate external judgments. Judgment revision includes unpredictable variations which limit the potential for prediction. The study also serves to measure this level of unpredictability via a specific control experiment. More than two thirds of the prediction errors are found to occur due to unpredictability of the human judgment revision process rather than to model imperfection.

Uncovering the spatial structure of mobility networks in cities – Methods and applications

Thomas Louail

Friday, February 19, 2016 at 11:00, room 24-25/405

Slides

The increasing availability of individual geographic footprints has generated a broad scientific interest in human mobility networks, at various scales and in different geographical contexts. In this talk I will present some recent results related to urban mobility. I will first present a method we developed to extract an expressive and coarse-grained signature from a large, weighted and directed network. I will then discuss the results we obtained when we applied this method to mobility networks extracted from mobile phone data in 31 Spanish cities, in order to compare the structure of journey-to-work commuting in these cities. The method distinguishes different types of links/flows, and clearly highlights a relation between city size and the importance of these types of flows. In the second part of my talk, I will focus on the shopping trip networks extracted from credit card transactions performed by hundreds of thousands of anonymized individuals in the two largest Spanish cities. Starting from the bipartite networks that link individuals and businesses of the city, I will show that it is possible to evenly distribute business income among neighbourhoods, and thereby mitigate spatial inequality, by reassigning only a very small fraction of each individual’s shopping trips. This spatial and bottom-up approach to wealth redistribution could easily be implemented in mobile apps that would assist individuals in slightly reshaping their mobility routines. Our results hence illustrate the social benefits individuals are entitled to expect from the analysis of the data they produce daily.

A method for the Approximation of the Maximal Consensus Local Community detection problem in Complex Networks

Patricia Conde Céspedes

Friday, January 22, 2016 at 11:00, room 26-00/332

Slides

Although the notion of community does not have a unanimously accepted definition, it is often related to a set of strongly interconnected nodes. Indeed, the density of links plays an important role because it measures the strength of the relationships in the community. The need for these well-connected and dense communities has led to the notion of consensus community. A consensus community is a group of nodes where each member is connected to more than a given proportion of the other nodes. A consensus community is maximal if and only if adding a new node to the set breaks the rule. Consequently, a consensus community has a density greater than this proportion. Existing methods for mining consensus communities generally assume that the network is entirely known and try to detect all such consensus communities. Detecting the local community of specific nodes is very important for applications dealing with huge networks, when iterating through all nodes would be impractical. In this paper, we propose an efficient algorithm, called RANK-NUM-NEIGHS (RNN), based on local optimizations to approximate the maximal consensus local community of a given node. The proposed method is evaluated experimentally on real and artificial complex networks in terms of quality, execution time and stability. We also provide an upper bound on the optimal solution. The experiments show that it provides better results than the existing methods.
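The definition above translates directly into a membership test. The sketch below only checks the definition and its maximality condition; it is not the RNN algorithm of the paper. The proportion is passed as a parameter named `threshold`, a hypothetical name, and the graph is assumed to be an adjacency dictionary of sets.

```python
def is_consensus_community(adj, group, threshold):
    """True if every member of group is linked to more than
    `threshold` (a proportion in [0, 1)) of the other members."""
    group = set(group)
    others = len(group) - 1
    if others < 1:
        return False
    return all(
        len(set(adj[u]) & (group - {u})) / others > threshold
        for u in group
    )


def is_maximal(adj, group, threshold):
    """True if group satisfies the rule and no single outside node
    can be added without breaking it."""
    group = set(group)
    if not is_consensus_community(adj, group, threshold):
        return False
    return not any(
        is_consensus_community(adj, group | {w}, threshold)
        for w in adj if w not in group
    )


# A triangle a-b-c with a pendant vertex d attached to a.
adj = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
print(is_consensus_community(adj, {"a", "b", "c"}, 0.5))
print(is_maximal(adj, {"a", "b"}, 0.5))
```

With a threshold of 0.5 the triangle qualifies and is maximal, while the pair {a, b} qualifies but is not maximal because c can still be added.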

Analysis of the temporal and structural features of threads in a mailing-list

Noé Gaumont, Tiphaine Viard, Raphaël Fournier-S’niehotta, Qinna Wang and Matthieu Latapy

In Complex Networks VII: Proceedings of the 2016 Workshop on Complex Networks.

A link stream is a collection of triplets (t,u,v) indicating that an interaction occurred between u and v at time t. Link streams model many real-world situations like email exchanges between individuals, connections between devices, and others. Much work is currently devoted to the generalization of classical graph and network concepts to link streams. In this paper, we generalize the existing notions of intra-community density and inter-community density. We focus on email exchanges in the Debian mailing list, and show that threads of emails, like communities in graphs, are dense subsets loosely connected from a link stream perspective.

Computing maximal cliques in link streams

Tiphaine Viard, Matthieu Latapy and Clémence Magnien

Theoretical Computer Science (TCS), Volume 609, Part 1, 4 January 2016, Pages 245–252.

A link stream is a collection of triplets (t, u, v) indicating that an interaction occurred between u and v at time t. We generalize the classical notion of cliques in graphs to such link streams: for a given Δ, a Δ-clique is a set of nodes and a time interval such that all pairs of nodes in this set interact at least once during each sub-interval of duration Δ. We propose an algorithm to enumerate all maximal (in terms of nodes or time interval) cliques of a link stream, and illustrate its practical relevance on a real-world contact trace.
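The clique condition can be checked for one candidate (node set, time interval) as sketched below; this is a verification helper, not the enumeration algorithm of the paper. The duration parameter of the definition is called `delta` in the code. The sketch assumes undirected links and closed intervals, so the condition holds exactly when, for every pair, no gap between consecutive interaction times (including the interval bounds) exceeds `delta`.

```python
from itertools import combinations


def is_delta_clique(stream, nodes, start, end, delta):
    """Check the clique condition on [start, end] for (t, u, v) triplets.

    Every pair of nodes must interact at least once in each closed
    sub-interval of duration delta contained in [start, end].
    """
    # Collect, for each unordered pair, its interaction times in range.
    times = {}
    for t, u, v in stream:
        if start <= t <= end:
            times.setdefault(frozenset((u, v)), []).append(t)
    for u, v in combinations(list(nodes), 2):
        pair_times = sorted(times.get(frozenset((u, v)), []))
        if not pair_times:
            return False
        points = [start] + pair_times + [end]
        # A gap longer than delta leaves a sub-interval with no link.
        if any(b - a > delta for a, b in zip(points, points[1:])):
            return False
    return True


stream = [(1, "a", "b"), (3, "a", "b"), (2, "a", "c"),
          (4, "b", "c"), (2, "b", "c")]
print(is_delta_clique(stream, {"a", "b", "c"}, 1, 4, delta=2))
```

On this toy stream the three nodes satisfy the condition on [1, 4] for `delta=2`, but not for `delta=1`, and not on the longer interval [0, 6].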