GML Newsletter: Do we do it right?

"Just because someone else does the wrong thing we are not exempt from doing what’s right.”

Welcome to the Graph ML newsletter!

First of all, Happy New Year! 🧨 With all the knowledge we accumulated over 2020, may this year be more productive and predictable for you. Second, I moved the newsletter from Revue to Substack. There are several reasons for this: (a) it's free for me to send these emails, which was not the case before and was becoming a noticeable financial burden; (b) it has more features (for example, I can record podcasts or engage with readers via threads); and (c) you can now support me with monthly payments if you wish.

As you probably know, writing about graph ML (through blog posts, Telegram, and this newsletter) has been a much-loved hobby that I have been doing pro bono, and now there is a way to support it via Substack. In return, I offer additional perks such as promotion of your work or personal chats (more on this here), but as this format is new to me, I would be happy to hear feedback on what you would like to see in the offer. It would also motivate me to write more frequently and to promote graph ML more broadly in the AI community. All in all, I hope that if anything changes for the reader, it will only be for the better.

Now, about this issue. As we enter 2021, there is a lot of hype around GNNs. Leading figures in ML now say openly that it's a very promising direction.

These are exciting times for us graph researchers, as we are getting the green light from our managers and colleagues to try these methods in the real world. It also means greater responsibility: so far there have been comparatively fewer applications of GML than of, say, CV or NLP, and in the end the impact of our work will be measured by how much value it brings to society. To start, I outlined the top applications of GNNs in 2021, which I will elaborate on in this email. We will also discuss the predictions of many researchers in the field about what will come in the next years. And, as always, there are some special links that could be relevant to you.


Interesting links

Blog post: Geometric ML becomes real in fundamental sciences Michael Bronstein nicely summarizes three papers that use graph ML for drug design and bioinformatics.

Course: Machine Learning for Graphs and Sequential Data (MLGS) Stephan Günnemann covers in-depth generative models, robustness, sequential data, clustering, label propagation, GNNs, and more.

Podcasts: TWIML with Taco Cohen and Michael Bronstein One-hour discussions with two researchers on equivariance and the state of the graph ML field.

Book: The Atlas for the Aspiring Network Scientist More a network science book than an ML one, by Michele Coscia, covering the hitting time matrix, the Kronecker graph model, network measurement error, graph embedding techniques, and more.

Book: Probabilistic Machine Learning: An Introduction A general ML book by Kevin Patrick Murphy that has a chapter on graph embeddings co-authored with Bryan Perozzi.

Video: How to Predict Which Candidate COVID-19 mRNA Vaccines Are Stable with AI Kaggle grandmasters discuss how they used GNNs to win the OpenVaccine Kaggle competition.

Video: Graph Neural Nets series Aleksa Gordić digests architectures of popular GNN models (GCN, GAT, GraphSage, etc.)


What does 2021 hold for Graph ML?

In a new format, Michael Bronstein ran mini-interviews with diverse researchers in the field about their predictions of where we are going. I was fortunate to participate and to express my feeling that GNNs will find new ways into production infrastructure to make a real-world impact (more on this below).

One of the common themes, raised by Will Hamilton, is going beyond the message-passing neural nets (MPNNs) that have dominated the field. Right now, most works assume static graphs that are given to us or, as Thomas Kipf put it, "the nodes and edges of the dataset are taken as the gold standard for the computation structure".
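For readers less familiar with the paradigm being questioned here, a single message-passing step can be sketched in a few lines of NumPy. This is a minimal illustration, not any particular library's implementation; the function name `mpnn_layer` and the toy graph are my own, and real MPNNs add learned message functions, edge features, and more elaborate aggregation.

```python
import numpy as np

def mpnn_layer(A, H, W):
    """One illustrative message-passing step: each node sums its
    neighbors' feature vectors (via the adjacency matrix), then a
    shared linear transform and ReLU produce the new node features."""
    messages = A @ H                     # aggregate neighbor features
    return np.maximum(0.0, messages @ W) # transform + nonlinearity

# Toy graph: a path 0-1-2, adjacency with self-loops so a node
# also keeps its own features in the aggregate.
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]], dtype=float)
H = np.eye(3)            # one-hot initial node features
W = np.ones((3, 2))      # toy weight matrix (normally learned)
print(mpnn_layer(A, H, W))
```

Stacking k such layers lets information flow k hops across the graph, which is exactly the locality assumption the researchers above want to move beyond.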

There are indeed many assumptions we make when we claim SOTA results for GNNs. As Emanuele Rossi notes, dynamic graphs, where nodes and edges are added and removed over time, have been largely understudied. Graphs with homophily are very well suited to label propagation techniques, sometimes more so than to GNNs, and we don't have a good understanding of why (mentioned by Petar Veličković and Matthias Fey). And at the other extreme, when we don't have graphs per se, how do we account for relational structure (Thomas Kipf)?
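To make the homophily point concrete, here is a minimal sketch of classic label propagation: diffuse known labels along edges and clamp the labeled nodes each round. The function name `label_propagation`, the toy graph, and the iteration count are illustrative assumptions, not a reference implementation.

```python
import numpy as np

def label_propagation(A, Y, mask, num_iters=50):
    """Illustrative label propagation on a homophilous graph:
    repeatedly average neighbors' label distributions, clamping
    the labeled nodes (mask) back to their known labels."""
    P = A / A.sum(axis=1, keepdims=True)  # row-normalized transitions
    F = Y.copy()
    for _ in range(num_iters):
        F = P @ F        # diffuse label mass along edges
        F[mask] = Y[mask]  # clamp known labels
    return F.argmax(axis=1)

# Toy homophilous graph: two triangles (0-1-2 and 3-4-5)
# joined by the single edge 2-3.
A = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
], dtype=float)
Y = np.zeros((6, 2))
Y[0, 0] = 1.0   # node 0 labeled class 0
Y[5, 1] = 1.0   # node 5 labeled class 1
mask = np.array([True, False, False, False, False, True])
print(label_propagation(A, Y, mask))  # each triangle adopts its seed's class
```

On graphs like this, where neighbors overwhelmingly share labels, this parameter-free diffusion is already a strong baseline, which is what makes the GNN comparisons mentioned above so puzzling.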

Another frequent point is using graph ML to make a difference in the world. Sure, numbers on node classification benchmarks are important, but the underlying task is what we really care about. Novel approaches to protein modeling, new particle discoveries in physics, and medical imaging are among the great current candidates for such important tasks, but discovering even more problems that can be efficiently learned with the help of graphs (as in AlphaFold 2) is what we hope for in the near future.


Top Applications of Graph Neural Networks 2021

GNNs are a very popular topic right now. There are roughly 300 new graph papers on arXiv each month, of which about a third are related to GNNs. ICML/NeurIPS/ICLR have about 7% of their papers related to graphs, and data mining conferences such as KDD have ~30% graph papers. These are big numbers in academia, but are they reflected in industrial applications? In this post, I gathered the most notable recent GNN applications at small and big companies.

The first is the use of GNNs to embed users and products for recommendations and personalized search. I must add that large-scale recommender systems combine signals from different sources, including text, images, and sequential data, so GNNs alone are probably not enough. But if there is a strong relational signal, GNNs are in the best position to capture it.

Another increasingly popular application is combinatorial optimization, whose problems are ubiquitous in manufacturing and logistics. This has been explored from two orthogonal perspectives: integrating GNNs within existing solvers (e.g. Gasse et al., 2019, Nair et al., 2020, and the library Ecole) or doing end-to-end optimization from scratch (e.g. multiple surveys, Google's presentation on chip manufacturing, and a blog post by Chaitanya Joshi).

Next, if I were to bet on the most promising application of GNNs, I would say it's going to be drug development and bioinformatics in general. There are already startups hiring graph ML researchers and developing GNN platforms, and this domain arguably fits perfectly into the world of graphs.

There is more in the post, but in the end I hope my list is not the final one and that in a year we will see many more valuable applications of GNNs; at the very least, we should strive to influence the world with our research. After all, the world is connected, and there is always a place for a graph.


That's all for today, folks; I hope you enjoyed it. As always, if you have feedback or topics to share, feel free to reply to this email. Follow me on Twitter or Telegram. And, more than anything else, take care.

Peace!

Sergey