Pointers to various network resources follow, including network datasets, analysis and visualization software, and other Network Science courses offering valuable material and readings. These could be helpful for your project and, hopefully, for your research on networks down the road. I will be continuously updating the lists, so please let me know if you come across something worth adding.
Prominent researchers and institutions have been compiling valuable network datasets for use by the research community.
- Stanford Large Network Dataset Collection compiled by Jure Leskovec. You can also take a look at their collection of Web and Blog datasets, and their own list of additional network dataset resources.
- Network Repository is a scientific network data repository with interactive visual analytics tools.
- The unified New York City taxi and Uber data, with information on more than a billion trips. A separate directory contains data on over 4.5 million Uber pickups in New York City from April to September 2014, and 14.3 million more Uber pickups from January to June 2015.
- Graph and Social Data compiled by the Yahoo Webscope Program.
- The Social Computing Data Repository hosts datasets from a collection of many different social media sites.
- A list of the network datasets used in Eric Kolaczyk's book Statistical Analysis of Network Data.
- The Yelp Dataset Challenge offers a social review dataset that includes a social graph with nearly 1 million edges.
- Network datasets compiled over the years by Mark Newman.
- The UCI Network Data Repository is an effort to facilitate the scientific study of networks; see also their own list of additional network dataset resources.
- The Network database offers access to the network topologies studied by László Barabási's group.
- KONECT (the Koblenz Network Collection) is a project to collect large network datasets of all types in order to perform research in network science and related fields, collected by the Institute of Web Science and Technologies at the University of Koblenz-Landau.
- A collection of complex networks compiled by Uri Alon.
- Netzschleuder is a network catalogue, repository and centrifuge from the creators of graph-tool.
- The GroupLens research lab at the University of Minnesota hosts the MovieLens dataset, a benchmark for recommendation systems.
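Several of these collections distribute graphs as plain-text edge lists. As a minimal sketch, assuming a file with one whitespace-separated edge per line and `#` comment lines (the function name `read_edge_list` and the sample data are hypothetical, and individual repositories may use other formats), such a list can be loaded into an adjacency dictionary using only the Python standard library:

```python
from collections import defaultdict
from io import StringIO


def read_edge_list(lines, directed=False):
    """Parse 'u v' edge lines into a dict of neighbor sets, skipping '#' comments."""
    adjacency = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        u, v = line.split()[:2]  # ignore any extra columns (e.g., weights)
        adjacency[u].add(v)
        if directed:
            adjacency.setdefault(v, set())  # keep sink nodes as keys
        else:
            adjacency[v].add(u)
    return dict(adjacency)


# Hypothetical sample standing in for a downloaded file.
sample = StringIO("# FromNodeId\tToNodeId\n0\t1\n0\t2\n1\t2\n2\t3\n")
graph = read_edge_list(sample)
num_nodes = len(graph)
# Each undirected edge is stored twice (assumes no self-loops).
num_edges = sum(len(nbrs) for nbrs in graph.values()) // 2
print(num_nodes, num_edges)
```

Node labels are kept as strings here; for real files, pass an open file handle instead of the `StringIO` sample.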
The sheer variety of software available for the study of large-scale graphs mirrors the number of communities involved in such work. While MATLAB, for example, could be used to implement many of the network analysis algorithms studied in this class, efficiently dealing with large-scale networks calls for custom-made graph analytics tools. Some useful network analysis and visualization software packages follow.
- Stanford Network Analysis Platform (SNAP) is a general purpose, high performance system for analysis and manipulation of large networks. Snap.py is a Python interface for SNAP. To get going with SNAP, you can check the following tutorial.
- NetworkX is probably the most popular Python software package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. Other libraries offering similar functionality, but with likely faster and more scalable implementations, are graph-tool, igraph (also available in C and R), and NetworKit.
- PyTorch Geometric is a popular library for deep learning on graphs and other irregular structures, also known as geometric deep learning.
- The Deep Graph Library is a recent package for graph neural networks and machine learning on graphs. It has the advantage of being framework-agnostic (build models in PyTorch, TensorFlow or MXNet).
- PyGSP: Graph Signal Processing in Python is a recent package for signal processing on graphs (e.g., implementing the graph Fourier transform, graph filters, graph learning, signal interpolation and denoising). Leveraging these graph signal processing fundamentals and intuitions, the Graph Neural Networks library offers PyTorch implementations. Spektral is a TensorFlow variant.
- Brain Connectivity Toolbox is a MATLAB toolbox for complex-network analysis of structural and functional brain-connectivity data sets.
- Pegasus is a peta-scale graph mining system, fully written in Java. It runs in a parallel, distributed manner on top of Hadoop.
- JUNG -- the Java Universal Network/Graph Framework -- is a software library that provides a common and extendible language for the modeling, analysis, and visualization of data that can be represented as a graph or network.
- The mfinder software package is a network motif finding tool; see also their pointers to other related software.
- yEd Graph Editor is a powerful desktop application that can be used to quickly and effectively generate high-quality diagrams.
- Gephi is an open-source interactive visualization and exploration platform for all kinds of networks and complex systems, dynamic and hierarchical graphs.
- Pajek is a freely available Windows-based package for the visualization of large networks.
- Graphviz is open-source graph visualization software.
- LaNet-vi is a visualization tool for large networks based on the k-core decomposition of a graph.
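The k-core decomposition mentioned for LaNet-vi assigns each node the largest k such that the node belongs to a subgraph in which every node has degree at least k. A minimal pure-Python sketch of the usual peeling procedure follows; the function name `core_numbers` and the toy graph are illustrative only, and this simple version favors readability over speed. For real work, NetworkX provides `core_number`, which computes the same quantity.

```python
def core_numbers(adjacency):
    """Return {node: core number} by repeatedly peeling a minimum-degree node."""
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Pick a node of minimum remaining degree (linear scan, for clarity).
        node = min(remaining, key=degree.__getitem__)
        # The core level never decreases as peeling proceeds.
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        # Removing the node lowers the degree of its remaining neighbors.
        for nbr in adjacency[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


# Toy graph: a triangle (a, b, c) with a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
core = core_numbers(toy)
print(core)
```

In the toy graph the triangle forms the 2-core, while the pendant node belongs only to the 1-core.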
Network Science courses
Below are links to several other great Network Science-related classes. Note that the focus of each class may vary with the background of the instructor.
- Networks by David Easley and Jon Kleinberg (who also offers The Structure of Information Networks).
- Network Science by Constantine Dovrolis.
- Social and Information Network Analysis by Jure Leskovec; also check out his talks and tutorials.
- Network Science by László Barabási.
- Video webcast of a short course on Statistical Analysis of Network Data by Eric Kolaczyk.
- The tensor analysis and graph mining sections of Multimedia Databases and Data Mining by Christos Faloutsos; check out some of his tutorials on graphs.