Parameterized complexity: from graph minor theory to efficient algorithms

Christophe Paul

Friday 8 July 2016 at 11am, room 24-25/405

Slides

Parameterized complexity suggests a multi-parameter analysis of the computational complexity of hard problems. The idea is to understand the influence of parameters, distinct from the input size, on the resolution of a problem. Such parameters may be the solution size or structural parameters such as width parameters. After an introduction to parameterized complexity, we will present some of the algorithmic consequences of graph minor theory. From the work of Robertson and Seymour, it is known that every graph family closed under minors can be recognized in cubic time. However, for most such graph families, this result is only existential. Since then, constructive meta-algorithmic theorems have been proposed (including Courcelle's theorem) within the framework of parameterized complexity. We will discuss recent developments in this line of research that have led to efficient algorithms for large families of problems.
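
As a concrete illustration of the parameterized viewpoint (our addition, not part of the talk), here is the textbook bounded search tree for k-Vertex Cover: branching on the two endpoints of an uncovered edge gives a search tree with at most 2^k leaves, so the exponential part of the running time depends only on the parameter k, not on the size of the graph. The function name is ours.

```python
def vertex_cover_at_most_k(edges, k):
    """Return True iff the graph given by `edges` has a vertex cover of size <= k."""
    if not edges:              # every edge is covered
        return True
    if k == 0:                 # edges remain but no budget left
        return False
    u, v = edges[0]
    # Any vertex cover must contain u or v: branch on both choices.
    without_u = [(a, b) for (a, b) in edges if u not in (a, b)]
    without_v = [(a, b) for (a, b) in edges if v not in (a, b)]
    return (vertex_cover_at_most_k(without_u, k - 1)
            or vertex_cover_at_most_k(without_v, k - 1))

# A path on 4 vertices has a vertex cover of size 2 but not of size 1.
path = [(1, 2), (2, 3), (3, 4)]
assert vertex_cover_at_most_k(path, 2)
assert not vertex_cover_at_most_k(path, 1)
```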

Graphs: is there something between theory and practice?

Gilles Tredan

Friday 24 June 2016 at 11am, room 24-25/405

My first part will focus on the problem of charting graphs: can we believe the maps we build of complex systems? My second part will present efforts to capture AFK social interaction networks (the Souk project) and to figure out what to do with the resulting dynamic graphs. My third part is yet to be defined. My whole presentation will try to strike a balance between algorithmic perspectives and data analysis.

Detection and classification of network traffic anomalies

Johan Mazel

Wednesday 8 June 2016 at 11am, room 24-25/405

Internet plays a central role in our lives. However, it is an extremely diverse and complex system. Ranging from non-malicious unexpected events such as flash crowds and failures to network attacks such as denials of service and network scans, network traffic anomalies can have serious detrimental effects on the performance and integrity of the network. Anomaly detection is thus paramount in order to guarantee users' access to Internet resources. In this talk, we will address recent advances in network traffic anomaly detection and classification that leverage graph analysis techniques, machine learning and big data.

Suppressing diffusion processes on arbitrary networks using treatment resources of limited efficiency

Argyris Kalogeratos

Monday 13 June 2016 at 11am, room 26-00/332

Slides

In many real-life situations, it is critical to dynamically suppress or remove an undesired diffusion process (viruses, information, behaviors, etc.). The talk will present a framework for Dynamic Resource Allocation (DRA), assuming a continuous-time SIS epidemic model and a budget of treatment resources of limited efficiency at the disposal of the authorities. Special emphasis will be given to the macro- and microscopic (or local) properties of the network structure for this problem, and two strategies that fall within this framework will be presented: a) a simple yet effective greedy approach, and b) a more sophisticated one that uses a precomputed priority plan describing how the healing strategy should proceed on a specific network. Extensions to competitive scenarios will also be discussed.
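
To make the greedy flavour of such strategies concrete, here is a toy sketch of ours, not the speaker's DRA framework: a discrete-time approximation of an SIS epidemic in which, at every step, a budget of b treatments of limited efficiency is given to the infected nodes with the most susceptible neighbours. All parameter names and values (beta, delta, rho, b) are illustrative assumptions.

```python
import random
import networkx as nx

def greedy_suppression(G, infected, beta=0.1, delta=0.05, rho=0.3, b=5, steps=50):
    """beta: infection rate, delta: natural recovery, rho: extra recovery if treated."""
    infected = set(infected)
    for _ in range(steps):
        # Greedy allocation: treat the infected nodes threatening the most susceptible neighbours.
        ranked = sorted(infected,
                        key=lambda u: sum(1 for v in G[u] if v not in infected),
                        reverse=True)
        treated = set(ranked[:b])
        new_infected = set(infected)
        for u in infected:
            # Recovery (boosted for treated nodes), then infection attempts.
            if random.random() < delta + (rho if u in treated else 0.0):
                new_infected.discard(u)
            for v in G[u]:
                if v not in infected and random.random() < beta:
                    new_infected.add(v)
        infected = new_infected
    return infected

G = nx.erdos_renyi_graph(200, 0.05, seed=1)
remaining = greedy_suppression(G, infected=random.sample(list(G), 20))
print(len(remaining), "nodes still infected")
```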

Kempe equivalence of colourings

Marthe Bonamy

Friday 3 June 2016 at 11am, room 24-25/405

Slides

Given a colouring of a graph, a Kempe change is the operation of picking a maximal bichromatic subgraph and switching the two colours in it. Two colourings are Kempe equivalent if they can be obtained from each other through a series of Kempe changes. Kempe changes were first introduced in a failed attempt to prove the Four Colour Theorem, but they have proved to be a powerful tool for other colouring problems. They are also relevant to more applied questions, most notably in theoretical physics. Consider a graph with no vertex of degree more than some integer D. In 2007, Mohar conjectured that all its D-colourings are Kempe equivalent. Feghali, Johnson and Paulusma proved in 2015 that this is true for D=3, with the exception of a single graph which disproves the conjecture in its full generality. We settle the remaining cases by proving that the conjecture holds for every integer D at least 4. This is joint work with Nicolas Bousquet (LIRIS, Ecole Centrale Lyon, France), Carl Feghali (Durham University, UK) and Matthew Johnson (Durham University, UK).
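
For concreteness, here is a minimal sketch (our illustration, not taken from the talk) of a single Kempe change: take the connected component, containing a chosen vertex, of the subgraph induced by two colours, and swap the two colours inside it.

```python
import networkx as nx

def kempe_change(G, colouring, v, a, b):
    """Return a new proper colouring obtained by a Kempe (a,b)-swap at vertex v."""
    assert colouring[v] in (a, b)
    bichromatic = G.subgraph([u for u in G if colouring[u] in (a, b)])
    component = nx.node_connected_component(bichromatic, v)
    new_colouring = dict(colouring)
    for u in component:
        new_colouring[u] = a if colouring[u] == b else b
    return new_colouring

# Example: a 3-colouring of the 5-cycle, swapped on the colour classes 0 and 1.
C5 = nx.cycle_graph(5)
col = {0: 0, 1: 1, 2: 0, 3: 1, 4: 2}
print(kempe_change(C5, col, 0, 0, 1))
```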

Temporal density of complex networks and ego-community dynamics

Sergey Kirgizov

Monday 4 July 2016 at 11am, room 24-25/405

Slides

First, we define an ego-community structure as a probability measure on the set of network nodes. Any subset of nodes may engender its own ego-community structure around it. Many community detection algorithms can be modified to yield a result of this type, for instance personalized PageRank. We also recall that community detection algorithms (including personalized PageRank) can be viewed from different perspectives: random walks, convergence of Markov chains, spectral clustering, optimization, mincuts, discrete Cheeger inequalities, etc. Next, we present a continuous version of Viard-Latapy-Magnien link streams, which we call "temporal density". Classical kernel density estimation is used to move from discrete link streams to their continuous counterparts. Using matrix perturbation theory, we can prove that the ego-community structure changes smoothly when the network evolves smoothly. This is very important, for example, for visualization purposes. Combining temporal density with personalized PageRank, we are able to visualize and study the evolution of the ego-community structures of complex networks with a large number of temporal links in order to extract interesting information. For example, we can detect events, trace the evolution of (ego-)community structure, etc. We illustrate and validate our approach using the "Primary school temporal network data" provided by sociopatterns.org, and we show how temporal density can be applied to the study of very large datasets, such as a collection of tweets written by candidates during the 2014 European Parliament election.
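
As a rough illustration of how kernel density estimation turns a discrete link stream into a continuous object, here is a toy sketch of ours: one Gaussian bump per interaction time, per node pair. This is not the authors' exact construction, and the bandwidth value is an arbitrary assumption.

```python
import numpy as np
from collections import defaultdict

def temporal_density(link_stream, bandwidth=5.0):
    """link_stream: iterable of (t, u, v). Returns {pair: callable density of t}."""
    times_per_pair = defaultdict(list)
    for t, u, v in link_stream:
        times_per_pair[frozenset((u, v))].append(t)

    def make_density(times):
        times = np.asarray(times, dtype=float)
        def density(t):
            # Sum of Gaussian bumps centred on the interaction times, normalised to integrate to 1.
            return np.exp(-((t - times) ** 2) / (2 * bandwidth ** 2)).sum() \
                   / (len(times) * bandwidth * np.sqrt(2 * np.pi))
        return density

    return {pair: make_density(ts) for pair, ts in times_per_pair.items()}

stream = [(1, "a", "b"), (2, "a", "b"), (10, "b", "c")]
dens = temporal_density(stream)
print(dens[frozenset(("a", "b"))](1.5))  # high: two nearby interactions
print(dens[frozenset(("a", "b"))](50))   # close to zero
```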

Community structure: evaluation and analysis of motifs in link streams

Jean Creusefond

Friday 13 May 2016 at 11am, room 24-25/405

Slides

This talk will have two relatively independent parts: the evaluation of community structures against ground truths, and the analysis of the community membership of motifs in link streams. Evaluating community structures theoretically is very delicate: many structural properties are considered important, so declaring one structure better than another implies arbitrary choices about these preferences, materialized by the choice of a quality function or of benchmarks. To avoid these problems, many researchers now evaluate their results by comparison with community structures released together with the datasets, arguing that the proximity between their results and the ground truth is significant evidence of relevance. In this first part, I will discuss a methodology that reconciles the two approaches and identifies which ground truths favour which quality functions. I will in particular highlight the choice of the partition comparison function, often considered insignificant, but which in fact radically changes the results. For reference, the program developed (including a large number of community detection algorithms and quality functions) is fully available at the following address: https://codacom.greyc.fr/ In the second part, I will discuss ongoing work on the analysis of link streams: graphs in which each arc is labelled with a time and multi-arcs are allowed. The link streams of interest here represent communication networks, that is, each arc represents a directed interaction between two users. Frequently, the community detection algorithms that attempt to analyse them aggregate the communication network over time windows, on which traditional (or adapted) methods can be applied. In that case, one piece of information is lost: the causality between links. For instance, if a set of people systematically exhibits the same communication structure (e.g., when "A" interacts with "B", the latter then systematically interacts with "C" and "D"), can we deduce the associated community structure? To evaluate the impact of this information, I focused on motifs: chains of communication whose causality seems likely (the first interaction probably triggered the next one, and so on). The link between these motifs and the community structure remains to be analysed, and I will present the tools developed for this purpose as well as some preliminary results.
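
As a quick illustration of why the choice of partition comparison function matters (our example, not taken from the talk), the same detected partition scored against the same ground truth by two standard measures gives noticeably different values.

```python
from sklearn.metrics import normalized_mutual_info_score, adjusted_rand_score

ground_truth = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
detected     = [0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2]

print("NMI:", normalized_mutual_info_score(ground_truth, detected))
print("ARI:", adjusted_rand_score(ground_truth, detected))
```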

Degeneracy-based mining of social and information networks: dynamics and applications

Fragkiskos Malliaros

Friday 1 April 2016 at 11am, room 24-25/405

Slides

Networks have become ubiquitous as data from diverse disciplines can naturally be mapped to graph structures. The problem of extracting meaningful information from large-scale graph data in an efficient and effective way has become crucial and challenging, with several important applications, and graph mining and analysis methods constitute prominent tools towards this end. In this talk, I will present part of my work that builds upon computationally efficient graph mining methods in order to: (i) design models for analyzing the structure and dynamics of real-world networks, towards unraveling properties that can further be used in practical applications; (ii) develop algorithmic tools for large-scale analytics on data with inherent (e.g., social networks) or without inherent (e.g., text) graph structure. Our approaches rely on the concepts of graph degeneracy and core decomposition. In particular, for the former point I will show how to model the engagement dynamics of large social networks and how to assess their vulnerability with respect to user departures. In both cases, by unraveling the dynamics of real social networks, regularities and patterns about their structure and formation can be identified; such knowledge can further be used in various applications, including churn prediction and anomaly detection. For the latter, I will present a core decomposition-based approach for locating influential nodes in complex networks, with direct applications to epidemic control and viral marketing.
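
As a pointer to the underlying notion (our example, not the speaker's code), the core decomposition on which such methods rely is directly available in networkx; nodes of the innermost core are the usual candidates for influential spreaders.

```python
import networkx as nx

G = nx.karate_club_graph()
core = nx.core_number(G)               # core index of every node
k_max = max(core.values())             # the degeneracy of the graph
top_core = [v for v, c in core.items() if c == k_max]
print("degeneracy:", k_max, "nodes of the innermost core:", top_core)
```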

Modelling influence and opinion evolution in online collective behaviour

Samuel Martin

Friday 18 March 2016 at 11am, room 26-00/332

Slides

Opinion evolution and judgment revision are mediated through social influence. In this talk, I will present a study based on a crowdsourced in vitro experiment. The study shows how a consensus model can be used to predict opinion evolution in online collective behaviour. The model is parametrized by the influenceability of each individual, a factor representing to what extent individuals incorporate external judgments. Judgment revision includes unpredictable variations which limit the potential for prediction. The study also serves to measure this level of unpredictability via a specific control experiment. More than two thirds of the prediction errors are found to be due to the unpredictability of the human judgment revision process rather than to model imperfection.
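
Here is a hedged sketch of a consensus-style revision rule of the kind described: each individual moves a fraction alpha_i (its "influenceability") towards the average of the judgments it is shown. This is our toy reading, not the paper's model; the update rule and parameter values are assumptions.

```python
import numpy as np

def revise(opinions, alpha, rounds=10):
    """opinions: initial judgments; alpha: influenceability of each individual."""
    x = np.array(opinions, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    for _ in range(rounds):
        shown = x.mean()                      # judgment displayed to everyone
        x = (1 - alpha) * x + alpha * shown   # partial incorporation of external judgments
    return x

print(revise([10.0, 50.0, 90.0], alpha=[0.1, 0.5, 0.9]))
```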

Uncovering the spatial structure of mobility networks in cities – Methods and applications

Thomas Louail

Friday 19 February 2016 at 11am, room 24-25/405

Slides

The increasing availability of individual geographic footprints has generated broad scientific interest in human mobility networks, at various scales and in different geographical contexts. In this talk I will present some recent results related to urban mobility. I will first present a method we developed to extract an expressive and coarse-grained signature from a large, weighted and directed network. I will then discuss the results we obtained when applying this method to mobility networks extracted from mobile phone data in 31 Spanish cities, in order to compare the structure of journey-to-work commuting in these cities. The method distinguishes different types of links/flows and highlights a clear relation between city size and the relative importance of these types of flows. In the second part of my talk, I will focus on shopping trip networks extracted from credit card transactions performed by hundreds of thousands of anonymized individuals in the two largest Spanish cities. Starting from the bipartite networks that link individuals and businesses of the city, I will show that it is possible to evenly distribute business income among neighbourhoods, and thereby mitigate spatial inequality, by reassigning only a very small fraction of each individual's shopping trips. This spatial and bottom-up approach to wealth redistribution could easily be implemented in mobile apps that would assist individuals in slightly reshaping their mobility routines. Our results thus illustrate the social benefits individuals are entitled to expect from the analysis of the data they produce daily.
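
To fix ideas, here is a heavily simplified toy sketch of ours (not the authors' procedure): greedily redirect single shopping trips from businesses in the richest neighbourhood to businesses in the poorest one until incomes are roughly balanced. All names and the balancing rule are assumptions for illustration.

```python
from collections import Counter

def rebalance(trips, neighbourhood_of, max_moves):
    """trips: list of (individual, business, amount). Returns the modified list."""
    trips = list(trips)
    for _ in range(max_moves):
        income = Counter()
        for _, business, amount in trips:
            income[neighbourhood_of[business]] += amount
        richest = max(income, key=income.get)
        poorest = min(income, key=income.get)
        if income[richest] - income[poorest] < 1e-9:
            break
        # Redirect one trip from the richest area to any business in the poorest one.
        i = next(k for k, (_, b, _) in enumerate(trips) if neighbourhood_of[b] == richest)
        target = next(b for b in neighbourhood_of if neighbourhood_of[b] == poorest)
        person, _, amount = trips[i]
        trips[i] = (person, target, amount)
    return trips

shops = {"s1": "north", "s2": "south"}
trips = [("p1", "s1", 5), ("p2", "s1", 10), ("p3", "s2", 5)]
print(rebalance(trips, shops, max_moves=3))
```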

A method for the Approximation of the Maximal Consensus Local Community detection problem in Complex Networks

Patricia Conde Céspedes

Friday 22 January 2016 at 11am, room 26-00/332

Slides

Although the notion of community does not have a unanimously accepted definition, it is often related to a set of strongly interconnected nodes. Indeed, the density of links plays an important role because it measures the strength of the relationships in the community. The need for such well-connected and dense communities has led to the notion of consensus community. A consensus community is a group of nodes in which each member is connected to more than a given proportion of the other nodes. A consensus community is maximal if and only if adding any new node to the set breaks this rule. Consequently, a consensus community has a density greater than this proportion. Existing methods for mining consensus communities generally assume that the network is entirely known and try to detect all such consensus communities. Detecting the local community of specific nodes is very important for applications dealing with huge networks, where iterating through all nodes would be impractical. In this paper, we propose an efficient algorithm, called RANK-NUM-NEIGHS (RNN), based on local optimizations, to approximate the maximal consensus local community of a given node. The proposed method is evaluated experimentally on real and artificial complex networks in terms of quality, execution time and stability. We also provide an upper bound on the optimal solution. The experiments show that it provides better results than existing methods.
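
A direct check of the membership rule quoted above, as a small sketch of ours (we write alpha for the threshold proportion; the RNN approximation algorithm itself is not shown): every node of the group must be linked to more than a proportion alpha of the others.

```python
import networkx as nx

def is_consensus_community(G, nodes, alpha):
    """True iff every node in `nodes` is linked to more than alpha * (|nodes|-1) of the others."""
    nodes = set(nodes)
    return all(
        sum(1 for v in G[u] if v in nodes) > alpha * (len(nodes) - 1)
        for u in nodes
    )

G = nx.karate_club_graph()
print(is_consensus_community(G, [0, 1, 2, 3, 7, 13], alpha=0.5))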

Analysis of the temporal and structural features of threads in a mailing-list

Noé Gaumont, Tiphaine Viard, Raphaël Fournier-S’niehotta, Qinna Wang and Matthieu Latapy

In Complex Networks VII: Proceedings of the 2016 Workshop on Complex Networks.

A link stream is a collection of triplets (t,u,v) indicating that an interaction occurred between u and v at time t. Link streams model many real-world situations, like email exchanges between individuals, connections between devices, and others. Much work is currently devoted to the generalization of classical graph and network concepts to link streams. In this paper, we generalize the existing notions of intra-community density and inter-community density. We focus on email exchanges in the Debian mailing list and show that threads of emails, like communities in graphs, are dense subsets that are loosely connected to each other from a link stream perspective.

Computing maximal cliques in link streams

Tiphaine Viard, Matthieu Latapy and Clémence Magnien

Theoretical Computer Science (TCS), Volume 609, Part 1, 4 January 2016, Pages 245–252.

A link stream is a collection of triplets (t, u, v) indicating that an interaction occurred between u and v at time t. We generalize the classical notion of cliques in graphs to such link streams: for a given Δ, a Δ-clique is a set of nodes and a time interval such that all pairs of nodes in this set interact at least once during each sub-interval of duration Δ. We propose an algorithm to enumerate all maximal (in terms of nodes or time interval) cliques of a link stream, and illustrate its practical relevance on a real-world contact trace.
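
Here is a direct, brute-force check of this definition (our sketch, not the enumeration algorithm of the paper): a node set X and interval [b, e] form a Δ-clique iff, for every pair of nodes in X, all gaps between consecutive interaction times (with b and e added as sentinels) are at most Δ, so that no sub-interval of duration Δ is free of interactions.

```python
from itertools import combinations

def is_delta_clique(link_stream, X, b, e, delta):
    """link_stream: iterable of (t, u, v); X: candidate node set; [b, e]: time interval."""
    for u, v in combinations(X, 2):
        times = sorted(t for t, x, y in link_stream
                       if {x, y} == {u, v} and b <= t <= e)
        checkpoints = [b] + times + [e]
        if any(t2 - t1 > delta for t1, t2 in zip(checkpoints, checkpoints[1:])):
            return False
    return True

stream = [(1, "a", "b"), (3, "a", "b"), (2, "a", "c"), (3, "b", "c")]
print(is_delta_clique(stream, {"a", "b", "c"}, 1, 3, delta=2))  # True
```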

Efficient and simple generation of random simple connected graphs with prescribed degree sequence

Fabien Viger and Matthieu Latapy

Journal of Complex Networks (2015) 4 (1): 15-37

We address here the problem of generating random graphs uniformly from the set of simple connected graphs having a prescribed degree sequence. Our goal is to provide an algorithm designed for practical use both because of its ability to generate very large graphs (efficiency) and because it is easy to implement (simplicity). We focus on a family of heuristics for which we introduce optimality conditions, and show how this optimality can be reached in practice. We then propose a different approach, specifically designed for real-world degree distributions, which outperforms the first one. Based on a conjecture which we argue rigorously and which was confirmed by strong empirical evidence, we finally reduce the best asymptotic complexity bound known so far.
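
As a sketch of the standard building block behind such generators (ours, not the paper's optimized procedure): realise the degree sequence deterministically, then shuffle it with degree-preserving double edge swaps that keep the graph connected. The degree sequence below is chosen so that any simple realisation is connected.

```python
import networkx as nx

degree_sequence = [3, 3, 3, 3, 3, 3]           # any simple 3-regular graph on 6 nodes is connected
G = nx.havel_hakimi_graph(degree_sequence)     # deterministic realisation of the sequence
nx.connected_double_edge_swap(G, nswap=50)     # randomise while preserving degrees and connectivity
assert sorted(d for _, d in G.degree()) == degree_sequence
print(nx.is_connected(G), list(G.edges()))
```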

RankMerging: Supervised learning of rankings for link prediction in large social networks

Lionel Tabourier, Anne-Sophie Libert and Renaud Lambiotte

EGC 2015, 15ème conférence internationale sur l’extraction et la gestion des connaissances

Finding missing links in a large social network is a difficult task, as these networks are sparse and links may correspond to varied structural environments. In this article, we describe RankMerging, a simple supervised learning method to combine the information obtained from different ranking methods. To illustrate its interest, we apply it to a network of mobile phone users and show how an operator can detect links between the customers of its competitors. We show that RankMerging outperforms the available methods for predicting a variable number of links in a large sparse graph.
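
For illustration only, here is a weighted Borda-style aggregation of several rankings of candidate links. This is not the actual RankMerging procedure; it merely shows the kind of combination a supervised step could tune, with weights that would be fitted on a labelled training graph.

```python
def combine_rankings(rankings, weights):
    """rankings: list of lists of candidate links, best first; weights: one weight per ranking."""
    scores = {}
    for ranking, w in zip(rankings, weights):
        for position, link in enumerate(ranking):
            scores[link] = scores.get(link, 0.0) + w * (len(ranking) - position)
    return sorted(scores, key=scores.get, reverse=True)

ranking_cn = [("a", "b"), ("a", "c"), ("b", "d")]   # e.g. ranked by common neighbours
ranking_aa = [("b", "d"), ("a", "b"), ("c", "d")]   # e.g. ranked by Adamic-Adar
print(combine_rankings([ranking_cn, ranking_aa], weights=[0.7, 0.3]))
```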

Revealing intricate properties of communities in the bipartite structure of online social networks

Raphaël Tackx, Jean-Loup Guillaume and Fabien Tarissan

In IEEE Ninth International Conference on Research Challenges in Information Science (RCIS’15), Athens, Greece, 2015

Many real-world networks based on human activities exhibit a bipartite structure. Although bipartite graphs seem appropriate for analysing and modelling their properties, it has been shown that standard metrics fail to reproduce intricate patterns observed in real networks. In particular, the overlapping of the neighbourhoods of communities is difficult to capture precisely. In this work, we tackle this issue by analysing the structure of 4 real-world networks coming from online social activities. We first analyse their structure using standard metrics. Surprisingly, the clustering coefficient turns out to be less relevant than the redundancy coefficient for accounting for overlapping patterns. We then propose new metrics, namely the dispersion and monopoly coefficients, and show that they help refine the study of bipartite overlaps. Finally, we compare the results obtained on real networks with those obtained on random bipartite models. This shows that the patterns captured by the redundancy and dispersion coefficients are strongly related to the real nature of the observed overlaps.
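
The redundancy coefficient discussed above is available in networkx; here is a minimal example of ours on a toy user-group bipartite graph (the dispersion and monopoly coefficients are the authors' new metrics and are not implemented here).

```python
import networkx as nx
from networkx.algorithms import bipartite

B = nx.Graph()
B.add_edges_from([("u1", "g1"), ("u1", "g2"), ("u2", "g1"), ("u2", "g2"),
                  ("u3", "g2"), ("u3", "g3")])
# Fraction of pairs of a node's neighbours that are also linked through another node.
print(bipartite.node_redundancy(B, ["u1", "u2"]))
```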

Download

Temporal properties of legal decision networks: a case study from the International Criminal Court

Fabien Tarissan and Raphaëlle Nollez-Goldbach

In 28th International Conference on Legal Knowledge and Information Systems (JURIX’15), Braga, Portugal, 2015.

Many studies have proposed to apply artificial intelligence techniques to legal networks, whether for highlighting legal reasoning, resolving conflicts or extracting information from legal databases. In this context, a new line of research has recently emerged which consists in considering legal decisions as elements of complex networks and conducting a structural analysis of the relations between decisions. It has proved to be efficient for detecting important decisions in legal rulings. In this paper, we follow this approach and propose to extend structural analyses with temporal properties. We define in particular the notions of relative in-degree, temporal distance and average longevity, and use those metrics to rank the legal decisions of the first two trials of the International Criminal Court. The results presented in this paper highlight non-trivial temporal properties of those legal networks, such as the presence of decisions with an unexpectedly high longevity, and show the relevance of the proposed relative in-degree property for detecting landmark decisions. We validate the outcomes by comparing the results to those obtained with the standard in-degree property, and provide juridical explanations of the decisions identified as important by our approach.
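
To fix ideas, here is one plausible reading, stated as our assumption rather than the authors' definition, of a relative in-degree for dated decisions: citations received divided by the number of decisions issued later, i.e. by the number of potential citers.

```python
def relative_in_degree(citations, dates):
    """citations: list of (citing, cited); dates: {decision: date}."""
    in_deg = {d: 0 for d in dates}
    for citing, cited in citations:
        in_deg[cited] += 1
    return {
        d: in_deg[d] / max(1, sum(1 for other in dates if dates[other] > dates[d]))
        for d in dates
    }

dates = {"D1": 2006, "D2": 2008, "D3": 2012, "D4": 2014}
citations = [("D2", "D1"), ("D3", "D1"), ("D4", "D3")]
print(relative_in_degree(citations, dates))
```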

Download

Increasing retweets on Twitter: how to take advantage of mentions?

Soumajit Pramanik, Qinna Wang, Maximilien Danisch, Mohit Sharma, Sumanth Bandi, Jean-Loup Guillaume, Stéphane Raux and Bivas Mitra

6ème conférence sur les Modèles et l’Analyse des Réseaux : Approches Mathématiques et Informatique (MARAMI), Paris, 2015

While Twitter has become unavoidable, the propagation of tweets and hashtags is still largely misunderstood. Information propagation on Twitter is mainly due to retweets and mentions; however, while retweets only reach an individual's followers, mentions make it possible to reach anyone directly. Numerous studies have shown that mentions are widely used on Twitter and, above all, that they are fundamental to increasing the popularity of tweets and hashtags. Automatic methods for choosing the right users to mention could therefore increase the visibility of tweets. In this article we propose a real-time mention recommendation system to increase the popularity of a tweet. This system is based on a tweet propagation model in a multiplex graph built from a study of real data. It clearly distinguishes propagation due to mentions from propagation due to follower links. Simulations of the model give results similar to the empirical observations and are also supported by analytical results. Using these different results, we propose an effective recommendation strategy and an associated Twitter application.
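
Here is a toy two-layer cascade of ours, not the paper's multiplex model: a tweet reaches the author's followers with one probability and the explicitly mentioned users with another; all probabilities and names are illustrative assumptions.

```python
import random

def simulate_cascade(followers, seed_author, mentioned, p_follow=0.05, p_mention=0.4):
    """followers: {user: set of users who follow them}."""
    reached, frontier = {seed_author}, [seed_author]
    while frontier:
        user = frontier.pop()
        targets = [(v, p_follow) for v in followers.get(user, set())]
        if user == seed_author:
            # Mentions bypass the follower layer and target users directly.
            targets += [(v, p_mention) for v in mentioned]
        for v, p in targets:
            if v not in reached and random.random() < p:
                reached.add(v)
                frontier.append(v)
    return reached

followers = {"alice": {"bob", "carol"}, "bob": {"dave"}, "carol": set(), "dave": set()}
print(simulate_cascade(followers, "alice", mentioned={"dave"}))
```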

Download

Unfolding the community structure of a network by measuring proximity to community representatives

Maximilien Danisch, Jean-Loup Guillaume and Bénédicte Le Grand

6ème conférence sur les Modèles et l’Analyse des Réseaux : Approches Mathématiques et Informatique (MARAMI), Paris, 2015

We propose an algorithm to unfold the community structure of large real-world graphs. The algorithm is based on detecting the community of each community representative: a node contained in a single community and important within it. This detection relies on a recently developed proximity-measure-based approach. Through comparison with other state-of-the-art methods, we show that our algorithm has equivalent or better performance and is able to handle the largest real-world graphs.
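
A toy sketch of ours, not the authors' algorithm, of the general idea: measure each node's proximity to a few representatives (here with personalised PageRank, one plausible proximity measure) and assign it to the closest one.

```python
import networkx as nx

def communities_from_representatives(G, representatives):
    """Assign every node to the representative to which it is closest."""
    proximities = {
        r: nx.pagerank(G, personalization={n: 1.0 if n == r else 0.0 for n in G})
        for r in representatives
    }
    return {n: max(representatives, key=lambda r: proximities[r][n]) for n in G}

G = nx.karate_club_graph()
print(communities_from_representatives(G, representatives=[0, 33]))
```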

Download

Community detection in link streams by modularity optimization

Emmanuel Orsini

6ème conférence sur les Modèles et l’Analyse des Réseaux : Approches Mathématiques et Informatique (MARAMI), Paris, 2015.

This article proposes to give meaning to modularity in link streams, and thus to benefit from some of its properties and from the heuristics that optimize it. After a few simplifications, this notion of modularity leads to an algorithm able to compute a partition on a dataset of 400,000 emails. To do so, we build a new model in which time is completely continuous, and on which modularity can be defined naturally and in a relevant way. This model offers a new interpretation of dynamic networks, intended to be general enough to adapt to different types of data.

Download