Project information

About the class project: A research project where you can investigate and apply state-of-the-art network analysis tools and algorithms to an application of your preference is a required portion of this course. You should select a specific topic related to network science and perform a relatively in-depth survey of the topic. This requires finding good literature sources (research papers, textbooks, presentation slides, etc.), possibly performing some analysis and/or numerical simulations to experiment on interesting network datasets, and providing a detailed summary of the main ideas. Your project is an in-depth study of a specific topic; it is not just a summary of a few research papers. If you wish to be more ambitious, you can also do some original, open-ended research on your selected topic in addition to the topic survey (possibly leading to a conference publication), but this is not required.

I am quite flexible with the type of study you may choose to carry out. Typically, these could be of either of the following types or, as with the homework assignments, a mix of both:

  • An experimental evaluation of algorithms and models on interesting network data, implementing your own code and/or investigating existing software for network analysis.
  • An analytical project that considers a model, an algorithm or a network property and derives a rigorous theoretical result about it.

Conceivably, other ideas could involve scalable/more efficient implementations of algorithms for large-scale network data processing, or you could think of collecting your own data from the web, social media, etc. for the subsequent analysis.

You are strongly encouraged to team up with another student to work on your project (i.e., work in pairs). Still, you can work alone if you wish.

What: Your first task is to pick a topic you like for the project. I suggest you first speak to your supervisor (if you have one) about a possible topic that relates to networks. If you don't have a research supervisor or just want additional feedback, you are welcome to talk to me during office hours and I will be happy to make suggestions and brainstorm with you, help you refine your initial ideas, or point you towards useful datasets, code, papers, and other resources. You are encouraged to pick an application area that is related to your current or future research.

A few exciting areas (this list is continuously updated) from which you could draw project ideas are:

  • Signal processing on graphs, an emerging field with the goal of extending high-dimensional data analysis to networks and other irregular domains. A few good tutorial references to get started can be found here, here, and here. A recent overview on how graph spectral methods can be used in image processing can be accessed here. A comprehensive resource on foundations and emergent directions in graph signal processing is this cluster of tutorial articles.
  • Graph anomaly detection. A comprehensive tutorial paper on the state-of-the-art can be found here; see also this related tutorial presentation.
  • Network-based analysis of the brain. An interesting tutorial paper on machine learning with brain graphs can be found here; see also this paper for a network-based analysis of epilepsy, or this other paper for a system identification approach to brain topology inference. A recent tutorial on leveraging graph signal processing tools for functional brain imaging can be accessed here; see also a video of D. van de Ville's lecture here.
  • Deep learning for non-Euclidean (e.g., graph-valued) data. An accessible tutorial paper on early work in this timely field can be found here. See also this website for comprehensive resources about geometric deep learning.
  • Tensor (multi-way array) analysis and decompositions for large-scale, dynamic graph mining. Tutorial presentations on tensors can be found here and here.
  • Community detection in graphs, a challenging problem that remains an active area of research. A comprehensive tutorial paper can be found here.
  • Diffusion and cascading processes over networks, opinion formation and influence maximization. A seminal paper on influence maximization can be found here.
  • Leveraging compressed sensing, sparsity and low-rank structures for network analytics. A few related pointers include a tutorial paper on dynamic communication network health monitoring, applications to smart grid load curve cleansing and cyber-attack detection, as well as estimating diffusion network structures and information cascades.
  • Graph representation learning, where the challenge is how to embed or represent a given graph in an effective way that facilitates solving downstream machine learning tasks such as node or graph classification. Recent tutorials using an encompassing encoder-decoder framework can be found here and here. Another very good entry point to this vibrant field is W. Hamilton's book.
  • Using graph neural networks (e.g., convolutional graph neural networks) for solving inverse problems on graphs. Problems of interest include network community detection, link prediction, rating prediction and recommendation systems, semantic segmentation, just to name a few. The following presentation slides offer a nice overview on current trends; see also the video of M. Bronstein's lecture here.
  • Models for graph generation. Recent advances in deep generative models seek to learn the generative mechanism from a set of training graphs. Three broad classes of methods rely on the paradigms of variational autoencoders (VAEs), generative adversarial networks (GANs) and autoregressive models. For a friendly introduction to the topic, see Chapter 9 of W. Hamilton's book. An application of particular interest in this domain has been molecule generation; a couple of representative papers can be found here and here.
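
To give a flavor of the first area in the list above (signal processing on graphs), here is a minimal sketch that is not taken from any of the linked tutorials. It measures how smooth a signal is over a graph through the Laplacian quadratic form, i.e., the sum of squared differences across edges. The function name, the toy 4-node path graph, and the two signals are hypothetical and chosen only for illustration.

```python
def laplacian_quadratic_form(edges, signal):
    """Return x^T L x for an unweighted, undirected graph.

    For the combinatorial Laplacian L = D - A this equals the sum of
    (x_i - x_j)^2 over all edges (i, j), so small values mean the
    signal varies little between connected nodes.
    """
    return sum((signal[i] - signal[j]) ** 2 for i, j in edges)


# Hypothetical toy example: path graph 0 - 1 - 2 - 3.
edges = [(0, 1), (1, 2), (2, 3)]
smooth = [1.0, 1.0, 1.0, 1.0]    # constant signal
rough = [1.0, -1.0, 1.0, -1.0]   # sign flips across every edge

print(laplacian_quadratic_form(edges, smooth))  # 0.0
print(laplacian_quadratic_form(edges, rough))   # 12.0
```

A project in this area would typically go well beyond this, for example by working with weighted graphs and real network datasets.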

In deciding what to work on, I suggest you explore the datasets, software, and additional resources I have posted under links and resources.

When: There will be three deliverables as indicated in the class schedule. These are:

  1. Project proposal due Friday 03/03, via Gradescope.
  2. Progress report due Monday 04/03, via Gradescope.
  3. In-class final presentations on Thursday 04/27, in CSB 523.

Moreover, the final report should be submitted via Gradescope. The final report submission deadline is 10 pm ET, Friday 05/05, 2023. Late submissions will not be accepted.

As indicated in the detailed instructions below, the proposal will be a short writeup describing what you plan to do and how you plan to do it. The progress report will be a more extensive writeup, describing the work performed up to then, and the revised plans for the whole project. It mainly serves as a "checkpoint", to detect and prevent dead-ends and other problems early on. The final report will be a more detailed description of what you did, what results you obtained, and what you have learned and/or can conclude from your work.

What's at stake: The research project amounts to 70% of your final class grade. The grade distribution across the deliverables is as follows: proposal 10%, progress report 10%, final report and in-class presentation 50%. As indicated in the detailed instructions below, those using LaTeX to prepare their reports and slides will get 5% extra credit.

Projects will be evaluated based on:

  • Technical quality. Is the project technically sound? Are the modeling assumptions made and the algorithms tried reasonable? Do the conclusions suggest in-depth critical thinking about the chosen topic, possibly conveying novel insights about the problem and/or chosen algorithms?
  • Significance. Is this an interesting and timely problem to work on? Is this work useful and the underlying research area likely to have impact?
  • Clarity of presentation. How effectively are the research findings conveyed orally (during the in-class presentation) and in writing (in the final report)?

Instructions: I encourage you to prepare the project reports and presentation slides using LaTeX. This is not a mandatory requirement, but if you are not familiar with LaTeX this is a great chance to learn, and you will find this skill extremely useful down the road. As an extra incentive, those using LaTeX will get 5% extra credit. If you choose LaTeX to prepare your documents, you should adopt the following templates:

  • Report template files based on the NeurIPS conference paper kit. Here is a pdf generated from the template.
  • Presentation slides template files that I use to prepare lecture slides. Here is a pdf generated from the template.

The project proposal should summarize what you plan to do for your project. The writeup should not exceed 5 pages, and you should try to include:

  1. A clear description of the problem that you will be addressing;
  2. Preliminary ideas on how you plan to address it (models/algorithms/techniques);
  3. Basic literature references you will be consulting;
  4. If applicable, what software tools you will need for your work (or, if you plan to write your own code, what language you will use);
  5. Network dataset(s) you will be working with;
  6. What you expect to produce as a result of your work and how you will judge success of the project; and
  7. Anything else that you think I should know to evaluate your plans.

The progress report should look like a first (incomplete) draft of your final report, but naturally shorter and most likely without your major results. The writeup should not exceed 8 pages, and you should try to include:

  1. An introduction, literature review of relevant prior work (with corresponding list of references), and clear problem statement in finalized form;
  2. If you collected your own data to construct a network graph, describe that process;
  3. For all those applicable, provide mathematical derivations, detailed model descriptions, and algorithms you have used, adapted or developed;
  4. Summary of preliminary results obtained so far and datasets you have analyzed;
  5. Any unforeseen complications that arose, and your workarounds; and
  6. Outline of the work-to-do, including any portions you now consider infeasible and why.

The final report should naturally build on your progress report, providing a clear and detailed description of what you did, what results you obtained, and what you have learned and concluded from your work. The writeup should not exceed 12 pages, and you should try to include:

  1. A motivating introduction, literature review of relevant prior work, and clear problem statement in finalized form;
  2. If you collected your own data to construct a network graph, describe that process;
  3. For all those applicable, provide mathematical derivations, detailed model descriptions, and algorithms you have used, adapted or developed;
  4. Description of your experiments, showcasing the obtained results and a relevant discussion based on your observations;
  5. Conclusions indicating the accomplished goals and what you learned, as well as possible extensions or future directions; and
  6. A list of relevant references.

In-class presentations are scheduled for Thursday April 27, 12:30 pm - 3 pm in CSB 523. Each group will be allocated a 15-minute slot; presentations should be around 14 minutes long, thus allowing for 1 minute of Q&A. Be cognizant that it is difficult to present more than 15 slides in that amount of time. If you wish, you can use these LaTeX template files to prepare your slides. Ideally, the presentation should touch upon all of these:

  1. Motivation and succinct description of the problem being addressed;
  2. Place your work in context of existing literature;
  3. Briefly describe the methods, algorithms, and data utilized;
  4. Showcase your results; and
  5. Conclude, possibly indicating potential future directions.

The detailed presentation schedule for April 27 follows.

Names | Project title | Time
Boonin | Influence Maximization Methods to Optimize Telecom Referral | 12:30pm
Brehm & Swar | Identifying Bottlenecks in Specialty Referral Networks | 12:45pm
Burns & Siddiqui | Network Community Detection Using Natural Computing | 1pm
Chen & Du | Complex Network Analysis Using Chinese Stock Market Data | 1:15pm
Charles & Collins | An Analysis of a Diffusion Model for Molecular Generation | 1:30pm
Quirk | Network Model of Solute Motion in the Glymphatic System | 1:45pm
Szymula | Identifying Functional Differences in Network Representations of Brain Imaging Data from Individuals with Autism Spectrum Disorder | 2pm
Uzun | Effects of the Cell Density on the Network Organization of Human Induced Pluripotent Stem Cell Cultures | 2:15pm
Wang | A Model of Public Opinion Reversal on the Internet | 2:30pm