Spatial organisation of large networks

Marc Barthelemy

08 avril 2010

Most transportation networks are embedded in space and consequently, their topology is not independent from spatial constraints. In particular, these constraints can induce a hierarchical organization with hubs controlling specific regions of space, non-trivial correlations between the weights, the connectivity pattern and the actual spatial distances of vertices, and the emergence of anomalous fluctuations in the betweenness-degree correlation function. In this talk I will illustrate these effects both from an empirical and modeling point of view, with various examples ranging from the large scale air travel network to smaller scales of inter- and intra- cities transportation networks.

Gephi 0.7 – Modular Architecture for a Large Network Visualization Software

Mathieu Bastian

25 mars 2010

In this practical talk, I will present 0.7 version of Gephi, open-source graph visualization & manipulation software. Gephi possesses a complete set of network visualization and analysis features, proposes major innovations in dynamic and hierarchical graphs and above all is a flexible and extensible architecture for plugins. I will firstly sum up embedded features: Layout, Metrics, Ranking, Filters, Partition, Preview, Clustering, DataLab. Then, I will present and discuss dynamic and hierarchical graphs features and roadmaps while showing live demo. Afterwards, I will outline our approach by presenting the Gephi project, some insights about performance and the various possibilities of using and extending Gephi in the way scientists and engineers can rely on. To conclude I will speak about plugin development and show how to create a plugin in five minutes.

Extracting Intra-Domain Topology from mrinfo Probing

Jean-Jacques Pansiot

11 mars 2010

Slides

Active and passive measurements for topology discovery have known an impressive growth during the last decade. If a lot of work has been done regarding inter-domain topology discovery and modeling, only a few papers raise the question of how to extract intra-domain topologies from measurements results. In this paper, based on a large dataset collected with mrinfo, a multicast tool that silently discovers all interfaces of a router, we provide a mechanism for retrieving intra-domain topologies. The main challenge is to assign an AS number to a border router whose IP addresses are not mapped to the same AS. Our algorithm is based on probabilistic and empirical IP allocation rules. The goal of our pool of rules is to converge to a consistent router to AS mapping. We show that our router-to-AS algorithm results in a mapping in more than 99% of the cases. Furthermore, with mrinfo, point-to-point links between routers can be distinguished from multiple links attached to a switch, providing an accurate view of the collected topologies. Finally, we provide a set of large intra-domain topologies in various formats.

Unsupervised host behavior classification from connection patterns

Pierre Borgnat

04 mars 2010

Slides

Dans un but de classification du trafic émis par des ordinateurs via internet mais aussi de détection des anomalies de trafic, une méthode de caractérisation des communications d’ordinateurs est étudiée. Un espace de représentation des motifs de connexions de chaque ordinateur est étudié en ce qu’il représente le trafic échangé, la dispersion de ce trafic, la connectivité de l’ordinateur. On montrera que cette représentation permet par exemple de quantifier plus finement les graphlets empiriques utilisés par la méthode BLINC (Karagiannis et al. 2005). Cette représentation du trafic échangé permet alors de mettre en place une classification non supervisée du trafic internet par une approches de classification par MST (Minimum Spanning Tree), sur des liens backbones et sans disposer du contenu des paquets ni du trafic bidirectionnel (ici validé sur le trafic backbone d’un lien transpacifique opéré par WIDE, Japon).

Diffusions et cascades dans les graphes aléatoires

Marc Lelarge

25 février 2010

Slides

Nous introduisons un modèle de diffusion qui généralise à la fois le processus de contact et la percolation ‘bootstrap’. Nous analysons ce processus sur des graphes aléatoires dilués. Ceci nous permet de retrouver des résultats connus (taille de la composante géante, seuil de percolation) et nouveaux (condition de cascade, impact de différentes vaccinations). Les preuves reposent sur des idées de couplages développées récemment par Janson et Luczak.

Réseaux sociaux égocentriquesActive learning on networks : you must label all nodes. You can ask the true labels of 42 nodes. Choose wisely.

Jean-Baptiste Rouquier

28 janvier 2010

Slides

In many networks, vertices have hidden attributes that are correlated with the networks topology. For instance, in social networks, people are more likely to be friends if they are demographically similar. In food webs, predators typically eat prey of lower body mass. We explore a setting in which the networks topology is known, but these attributes are not. If each vertex can be queried, learning the value of its hidden attributes but only at some cost then we need an algorithm which chooses which vertex to query next, in order to learn as much as possible about the attributes of the remaining vertices. We assume that the network is generated by a probabilistic model, but we make no assumptions about the assortativity or disassortativity of the network. We then query the vertex with the largest mutual information between its type and that of the others (a well-known approach in active learning) or with the largest average agreement between two independent samples of the Gibbs distribution which agree on its type. We test these approaches on two networks with known attributes, the Karate Club network and a food web of species in the Weddell Sea. In several cases, we found that the average agreement algorithm performs better than mutual information, and both perform better than simpler heuristics. The algorithms appear to explore the network intelligently, first querying vertices at the centers of communities, and then vertices along the boundaries between communities.

CLOAK: a virtual networking architecture

Damien Magoni

02 février 2011 de 14h à 15h : salle 26-00/228

Slides

In order to untie entities, applications and devices, CLOAK introduces a P2P overlay architecture and redefines the notion of a session. A session is a communication descriptor that contains everything needed for linking entities, applications and devices together in a flexible way. A session represents the identity and the management information of a given communication. Thus the lifetime of a communication between several entities is equal to the lifetime of its corresponding session. A device can move or be changed for another without terminating the session. Similarly, an application can be changed for another if deemed appropriate or even moved (i.e. mobile code) also without terminating the session. Finally entities can move or change (i.e. be transferred to another entity) without terminating the session if this is appropriate for a given communication. We can see that in our new architecture, entities, applications and devices are loosely bound together during a communication and all the movements and changes of devices, applications and entities are supported. To be able to implement our notion of session, we need to introduce several new layers of identifiers in order to decouple lower layers (devices) from middle layers (entities) and from higher layers (applications). Hence the name CLOAK given to this project. We add naming layers above the existing ones in order to give abstract names to objects such as entities and devices. Thus we lay a cloak on the current Internet architecture.

Looking into the surround of contacts in intermittently-connected wireless networks

Marcelo Dias De Amorim

14/01/2010, de 11H00 à 12H00, en salle 847

Slides

In this talk, we will discuss some preliminary work on the spatial characterization of intermittently-connected wireless networks. The motivation is that the literature has focused almost exclusively on the temporal aspect of contacts and inter-contacts between nodes. Furthermore, it is frequently assumed that contacts have infinite capacity, which is a too strong consideration especially in wireless networks. We propose the surround indicator as a metric to exhibit the spatial dimension of contacts in opportunistic networks. This metric can be used for example as a measure of potential interference that a contact might undergo or to understand the validity of forwarding strategies like betweenness centrality. We evaluate the surround indicator on two existing datasets. Our preliminary results reveal that contacts have too heterogeneous surrounds to be considered only in terms of duration. This work is part of Nadjet Belblidia’s thesis and is done in collaboration with Jérémie Leguay (Thalès), Vania Conan (Thalès), Jon Crowcroft (Cambridge University), and Serge Fdida (UPMC).

Réseaux sociaux égocentriques

Christophe Prieur

19/11/2009, de 11H00 à 12H00, en salle 555

Slides

En dix ans, la vision Google des réseaux sociaux a cédé la place à la vision Facebook. Dans le milieu des années 90, avec l’apparition du web et son intégration progressive dans la société, s’est répandue l’image d’un monde connecté, village global traversé par les autoroutes de l’information permettant de relier n’importe qui à n’importe qui d’autre en six degrés de séparation. En 1998, Google apportait à cette civilisation-là un algorithme permettant de traiter le gigantesque bazar que constituait déjà le graphe des pages web, permettant d’en extraire les pages les plus pertinentes pour le genre humain. Aujourd’hui, les résultats de Google sont un acquis, c’est la wikipédie de tout un chacun, presque aussi poussiéreuse que le Larousse de la grand-mère. Je sais, quand j’en ai besoin, chercher des informations dans le réseau des réseaux, reste donc à trouver des informations quand je ne les cherche pas (Google ne me dira ce qu’est Paf le chien que si j’ai l’idée de le lui demander), et pour cela, écouter ce qui se passe dans… mon réseau. Parce que si je suis à O(6) degrés de séparation de n’importe qui, je préfère réduire la portée de mon réseau pour limiter la cacophonie, et le pagerank ne m’aidera pas pour ça. Nous allons donc profiter de notre expérience de vieux googliens pour aborder les réseaux personnels avec des outils que les anciens analystes des réseaux sociaux n’avaient pas, quitte à ce qu’ils nous reprochent de leur piquer leur jouet.

Vidéo à la demande en Peer-to-peer intégral

Fabien de Montgolfier

12/11/2009, de 11H00 à 12H00, salle 847

Slides

La vidéo à la demande (VoD) est un service en pleine expansion. Avec le développement des modems ( box ) d’accès à l’ADSL, offrant maintenant des capacités de stockage, on peut imaginer un service totalement décentralisé. Les films sont pré-chargés sur certains modems. Les utilisateurs se connectent aux modems qui les ont pré-chargés et visionnent, sans intervention d’un serveur (d’où un service robuste et à coût faible). Nous montrons que le nombre de films disponibles peut croitre de façon linéaire (ou quasi-linéaire sous d’autres hypothèses) en fonction du nombre de modems et que toutes requêtes d’utilisateurs peuvent être satisfaites AFP. Travail réalisé avec Y. Boufkhad, F. Mathieu, D. Perino et L. Viennot.

Optimisation locale multi-niveaux de la modularité

Thomas Aynaud, Vincent Blondel, Jean-Loup Guillaume and Renaud Lambiotte

in Partitionnement de graphe : optimisation et applications, Traité IC2, Hermes-Lavoisier 2011

Dans ce chapitre, nous présentons une méthode gloutonne pour optimiser la modularité d’un graphe. Cette méthode de partionnement permet de traiter avec une excellente précision des systèmes de taille inégalée, allant jusqu’à plusieurs milliards de liens. Notre algorithme a de surcroît l’avantage de ne pas être limité à l’optimisation de la modularité puisqu’il peut être généralisé à d’autres fonctions de qualité, et de découvrir des communautés à différentes échelles. Les performances de l’algorithme sont évaluées sur des graphes artificiels pour lesquels la structure communautaire est connue, ainsi que sur des graphes de terrain réels.

Long range community detection

Thomas Aynaud and Jean-Loup Guillaume

Latin-American Workshop on Dynamic Networks (LAWDN), Buenos Aires, 2010

Complex networks can usually be divided in dense subnetworks called communities. In evolving networks, the usual way to detect communities is to find several partitions independently, one for each time step. However, this generally causes troubles when trying to track communities from one time step to the next. We propose here a new method to detect only one decomposition in communities that is good for (almost) every time step. We show that this unique partition can be computed with a modification of the Louvain method and that the loss of quality at each time step is generally low despite the constraint of global maximization. We also show that some specific modifications of the networks topology can be identified using this unique partition in the case of the Internet topology.

Download

Détection de communautés à long terme dans les graphes dynamiques

Thomas Aynaud and Jean-Loup Guillaume

Journée thématique Fouille de grands graphes, en conjonction avec la première conférence sur les Modèles et l’Analyse des Réseaux : Approches Mathématiques et Informatique (MARAMI), Toulouse, France, 2010

La plupart des graphes de terrain peuvent être décomposés en sous graphes denses appelés communautés. Habituellement, dans des graphes dynamiques, les communautés sont détectées pour chaque instant indépendamment ce qui pose de nombreux problèmes tels que la stabilité ou le suivi de des communautés entre deux décompositions successives. Nous proposons ici une méthode pour trouver une partition unique, de qualité, couvrant une longue période. Cette décomposition peut être trouvée efficacement via une adaptation de la méthode de Louvain et la perte de qualité à chaque instant due à la contrainte de détecter des communautés globales s’avère assez faible.

Download

Termination of Multipartite Graph Series Arising from Complex Network Modelling

Matthieu Latapy, Thi Ha Duong Phan, Christophe Crespelle, Thanh Qui Nguyen

COCOA 2010

An intense activity is nowadays devoted to the definition of models capturing the properties of complex networks. Among the most promising approaches, it has been proposed to model these graphs via their clique incidence bipartite graphs. However, this approach has, until now, severe limitations resulting from its incapacity to reproduce a key property of this object: the overlapping nature of cliques in complex networks. In order to get rid of these limitations we propose to encode the structure of clique overlaps in a network thanks to a process consisting in iteratively factorising the maximal bicliques between the upper level and the other levels of a multipartite graph. We show that the most natural definition of this factorising process leads to infinite series for some instances. Our main result is to design a restriction of this process that terminates for any arbitrary graph. Moreover, we show that the resulting multipartite graph has remarkable combinatorial properties and is closely related to another fundamental combinatorial object. Finally, we show that, in practice, this multipartite graph is computationally tractable and has a size that makes it suitable for complex network modelling.

Download

Evaluation of a new method for measuring the internet degree distribution: Simulation results

Christophe Crespelle and Fabien Tarissan

in Complex Networks, special issue of Computer Communications, 34-5, pp. 635-648 (2011), DOI: 10.1016/j.comcom.2010.06.006

Many contributions rely on the degree distribution of the Internet topology. However, current knowledge of this property is based on biased and erroneous measurements and is subject to much debate. Recently, a new approach, referred to as the Neighborhood Flooding method, was proposed to avoid issues raised by classical measurements. It aims at measuring the neighborhood of Internet core routers by sending traceroute probes from many monitors distributed in the Internet towards a given target router. In this paper, we investigate the accuracy of this method with simulations. Our results show that Neighborhood Flooding is free from the bias highlighted in the classical approach and is able to observe properly the exact degree of a vast majority of nodes in the core of the network. We show how the quality of the estimation depends on the number of monitors used and we carefully examine the influence of parameters of the simulations on our results. We also point out some limitations of the Neighborhood Flooding method and discuss their impact on the observed distribution.

Download

Static community detection algorithms for evolving networks

Thomas Aynaud and Jean-Loup Guillaume

Proceedings of International Workshop on Dynamic Networks (WDN), in conjunction with WiOpt 2010, pages 508-514

Complex networks can often be divided in dense sub-networks called communities. We study, using a partition edit distance, how three community detection algorithms transform their outputs if the input network is sligthly modified. The instabilities appear to be important and we propose a modification of one algorithm to stabilize it and to allow the tracking of the communities in a dynamic network. This modification has one parameter which is a tradeoff between stability and quality. The resulting algorithm appears to be very effective. We finally use it on a dynamic network of blogs.

Download

Some Insight on Dynamics of Posts and Citations in Different Blog Communities

Abdelhamid Salah Brahim, Bénédicte Le Grand and Matthieu Latapy

IEEE ICC 2010 workshop « SocNets », Cape Town, South Africa, May 2010

This paper explores new approaches and methods to characterize post and citation dynamics in different blog communities. In particular, evolution of post popularity over time is studied, as well as information spreading cascades. This methodology goes beyond traditional approaches by defining classes of dynamic behaviors based on topological features of the post network, and by investigating the impact of topical communities on post popularity dynamics and on information spreading cascades. This methodology has been applied to a corpus of active French blogs monitored during 4 months.

Download

Impact of Sources and Destinations on the Observed Properties of the Internet Topology

Frédéric Ouédraogo, Clémence Magnien

Computer Communications, 34, 670-679, 2011

Maps of the internet topology are generally obtained by measuring the routes from a given set of sources to a given set of destinations (with tools such as traceroute). It has been shown that this approach misses some links and nodes. Worse, in some cases it can induce a bias in the obtained data, i.e. the properties of the obtained maps are significantly different from those of the real topology. In order to reduce this bias, the general approach consists in increasing the number of sources. Some works have studied the relevance of this approach. Most of them have used theoretical results, or simulations on network models. Some papers have used real data obtained from actual measurement procedures to evaluate the importance of the number of sources and destinations, but no work to our knowledge has studied extensively the importance of the choice of sources or destinations. Here, we use real data from internet topology measurements to study this question: by comparing partial measurements to our complete data, we can evaluate the impact of adding sources or destinations on the observed properties. We show that the number of sources and destinations used plays a role in the observed properties, but that their choice, and not only their number, also has a strong influence on the observations. We then study common statistics used to describe the internet topology, and show that they behave differently: some can be trusted once the number of sources and destinations are not too small, while others are difficult to evaluate.

Download

Age, Gender and Communication Networks

Alina Stoica, Zbigniew Smoreda, Christophe Prieur and Jean-Loup Guillaume

Proceedings of the Workshop on the Analysis of Mobile Phone Networks, satellite workshop to NetSci 2010

In this paper, we address some sociological andtopological issues associated with mobile phone communication.Based on a dataset of a few million users, we use customers’ ageand gender information to study relation between theseparameters and the average behavior of users in terms of numberof calls, number of SMS and calls duration. We also study thedataset from a networking point of view: we define differentprofiles based on the topological properties of the personalnetwork of each individual and study the relations between theseprofiles and the age of customers

Download