GML Newsletter - Issue #4: NeurIPS 2020

Graph Machine Learning News

Welcome to the 4th issue of the GML newsletter!

I decided to devote this issue to the upcoming NeurIPS 2020 (6-12 December), covering which authors and organizations publish the most at the conference and some of the popular topics among graph papers.


Part 1. Who publishes at NeurIPS?

Before diving into graph papers, let’s first take a look at the NeurIPS 2020 analysis I made of authors, affiliations, countries, and collaborations.

Number of papers. This year the exponential growth in the number of submissions and accepted papers continued for the 6th year in a row: reviewers got almost 9.5K papers (a 40% increase over last year). If this trend continues, there will be 50K submissions in 2025 🤪 Sounds absurd, but hey, I bet you would not have believed the 10K number 5 years ago either.

One complaint people discussed is that the acceptance rate remains suspiciously constant, as if the quality of papers hadn’t changed over time. While I agree it would be great to judge papers primarily on their quality rather than against a predefined number, I’m not sure how to cope with the increasing load of papers a single conference has to handle.

Top affiliations. Nothing surprising here: Google tops the ranking with a significant lead over Stanford (2nd) and MIT (3rd). This trio kept its place from ICML 2020, showing consistent research efforts at top venues. What’s more interesting is that, for the first time, a Chinese affiliation made it into the top-10 list, with Tsinghua University in 7th place. We have heard many times that China is stepping on the heels of the UK, and this is just one illustration of it (more below).

Top authors. For some reason, NeurIPS has many more authors with a large number of publications than ICML or other venues: there are 28 authors with 7+ papers at NeurIPS (vs. 9 at ICML). Traditionally, Sergey Levine (UC Berkeley) leads with 12 papers.

I don’t have a good answer as to why people write more papers for NeurIPS than for ICML, but this year the output of top authors is incredible: the first 10 authors published 83 papers in total. There are discussions that we should not encourage scientists and the whole research ecosystem to publish that many papers, favoring quality over quantity. While I agree with the latter, judging by the number of submissions I believe the average quality of papers has increased over time, and it’s rather a problem of finding the gems in a large space of publications.

Top countries. The USA is far ahead of the other countries, no questions about it. China overtakes the UK for 2nd place for the first time, with several industrial companies (Tencent, Alibaba, Huawei) contributing significantly to the country’s performance. Some small countries such as Israel, South Korea, and Singapore published more than behemoths like Russia, India, and Australia. A single university, KAUST, published all 10 of Saudi Arabia’s papers.

Collaboration. Some companies collaborate more than others (and as far as I can tell, there are internal policies encouraging people to collaborate within their affiliation rather than with external researchers). For example, Google tends not to publish papers with other industrial companies (except DeepMind), while MIT collaborates with both industry and academia across the world.

Part 1 conclusion. NeurIPS is not slowing down in terms of the number of papers, and I’m looking forward to seeing the length of the queue at the registration desk when people get back to real conferences in a year or two (hopefully). Top authors now publish even more: whether it’s a rich-get-richer phenomenon or they just accumulated their rejected papers from previous years, I don’t know – but again, I would not be surprised to see someone publish 25 papers at a single conference 5 years from now (which currently equals all of India’s publications).

Part 2. Graph topics at NeurIPS 2020.

About 7% of all papers at NeurIPS 2020 use graph machine learning in one part or another. The proceedings with all papers are available here, and the graph papers are available here. Scientists use graphs for many purposes, so the papers cover a broad spectrum of topics, from theoretical machine learning to practical improvements of particular GNN models. What follows is just a glimpse of some topics.

Theory. Several works study the limitations of existing GNNs in terms of their ability to distinguish non-isomorphic graphs. Chen et al. show that many GNNs are not capable of counting induced subgraphs such as triangles. Andreas Loukas introduces a measure of communication capacity and shows that it needs to grow quadratically with the number of nodes in order to distinguish non-isomorphic connected graphs. Morris et al. propose a scalable version of the WL algorithm and show that it’s strictly more powerful than the original WL test.
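To make the WL machinery concrete, here is a minimal sketch of 1-WL color refinement in plain Python (my own toy illustration, not code from any of these papers). Two graphs with different final color multisets are certainly non-isomorphic, but equal multisets prove nothing – which is exactly where GNNs with WL-bounded power hit their limit.

```python
from collections import Counter

def wl_colors(adj, num_iters=3):
    """1-dimensional Weisfeiler-Leman (WL) color refinement.

    adj: adjacency list, e.g. {0: [1, 2], 1: [0], 2: [0]}.
    Returns the multiset of final node colors; different multisets
    imply non-isomorphic graphs (the converse does not hold).
    """
    colors = {v: 0 for v in adj}  # start with a uniform coloring
    for _ in range(num_iters):
        # new color = signature of (own color, sorted multiset of neighbor colors)
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                for v in adj}
        # relabel signatures to compact integer colors
        palette = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {v: palette[sigs[v]] for v in adj}
    return Counter(colors.values())

# A triangle and a 3-node path are distinguished by WL:
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
print(wl_colors(triangle) != wl_colors(path))  # True
```

WL (and hence standard message-passing GNNs) fails on regular graphs: a hexagon and two disjoint triangles get identical color multisets, even though one contains triangles and the other doesn’t – the same blind spot Chen et al. formalize.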

Oversmoothing. For those who don’t know, oversmoothing is the problem that GNNs with many layers tend to produce similar embeddings across all nodes, which leads to poor performance; and since we work in the deep learning paradigm, we want to tackle this problem by all means. Zhou et al. propose to use community structure, and Min et al. scattering transforms, when designing aggregation schemes for GNNs. Oono and Suzuki further analyze the problem through the lens of gradient boosting.
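To see the effect in isolation, here is a toy sketch (my own illustration, not from the papers above): a bare-bones “GNN layer” that just averages a node’s feature with its neighbors’, applied many times, collapses all node features toward a single value.

```python
import numpy as np

# Toy illustration of oversmoothing: repeated neighbor averaging
# (the simplest possible GNN layer, no weights or nonlinearity)
# drives node features toward a common value.

# adjacency of a 4-node path graph, with self-loops
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], dtype=float)
P = A / A.sum(axis=1, keepdims=True)  # row-normalized propagation matrix

x = np.array([1.0, 0.0, 0.0, -1.0])   # initial scalar node features
spread = []
for _ in range(30):                    # 30 "layers" of mean aggregation
    x = P @ x
    spread.append(x.max() - x.min())

print(spread[0], spread[-1])  # the gap between nodes shrinks toward 0
```

After one layer the nodes still differ, but after 30 layers they are nearly indistinguishable – which is why deep GNNs need the kind of aggregation tweaks these papers propose.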

Adversarial attacks. Similar to adversarial settings in computer vision, this topic covers mechanisms for attacking and defending node prediction models. Ma et al. describe a new type of node attack in which the attacker only has access to a small subgraph, while Zhang and Zitnik propose GNNGuard, a model-agnostic defense that counters adversaries by increasing the weights of similar nodes and decreasing those of unrelated nodes during aggregation.
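As a rough sketch of the general defense idea – down-weighting edges to dissimilar neighbors during aggregation – here is a toy similarity-weighted aggregator (my illustration of the principle, not GNNGuard’s actual implementation):

```python
import numpy as np

def guarded_aggregate(x, adj, eps=1e-8):
    """Similarity-weighted neighbor aggregation (toy sketch of the
    idea behind defenses like GNNGuard, not the paper's code).

    x:   (n, d) node feature matrix
    adj: adjacency list {node: [neighbors]}
    Edges to dissimilar neighbors get small weights, so adversarially
    inserted edges between unrelated nodes contribute little.
    """
    out = np.zeros_like(x)
    for v, nbrs in adj.items():
        if not nbrs:
            out[v] = x[v]
            continue
        # cosine similarity between v and each of its neighbors
        sims = np.array([
            x[v] @ x[u] / (np.linalg.norm(x[v]) * np.linalg.norm(x[u]) + eps)
            for u in nbrs
        ])
        w = np.clip(sims, 0.0, None)   # prune negatively-similar neighbors
        w = w / (w.sum() + eps)        # normalize the edge weights
        out[v] = (w[:, None] * x[nbrs]).sum(axis=0)
    return out
```

For example, if node 0 is connected to one similar node and one adversarially attached, dissimilar node, the aggregated output for node 0 is dominated by the similar neighbor while the dissimilar edge is effectively pruned.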

Faster GNN. This year improved a bit in terms of the sizes of graph datasets we have access to, which reflects the scale of industrial applications, and several methods were proposed to improve the efficiency of standard GNN models. The GNN model of Chen et al. has sublinear time complexity for preprocessing and training and can scale to billions of edges. Ramezani et al. propose a generic augmentation of sampling techniques in which sampling is performed rarely and the samples are reused in subsequent iterations.
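The reuse idea can be sketched in a few lines (my toy illustration, not the paper’s algorithm): sampling is only performed once every few iterations, and the cached sample serves the iterations in between.

```python
import random

def lazy_sampling_schedule(nodes, adj, num_iters=100, refresh_every=10, k=2):
    """Toy sketch of lazy sampling: instead of drawing a fresh
    neighbor sample at every training iteration (costly), resample
    only every `refresh_every` iterations and reuse the cached
    sample in between. Returns how many sampling passes were run.
    """
    cached, num_samplings = None, 0
    for it in range(num_iters):
        if it % refresh_every == 0:
            # sampling happens rarely: draw up to k neighbors per node
            cached = {v: random.sample(adj[v], min(k, len(adj[v])))
                      for v in nodes}
            num_samplings += 1
        subgraph = cached  # reused for this iteration's gradient step
        # ... here one would run a GNN training step on `subgraph` ...
    return num_samplings

# 100 training iterations, but only 10 (expensive) sampling passes
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(lazy_sampling_schedule(list(star), star))  # 10
```

The trade-off, of course, is stale samples between refreshes; the paper’s contribution is making this reuse work without hurting convergence.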

Explainability. Explaining the predictions of neural network models on graphs should include explanation at the feature level (which features are important) and at the topology level (which links are important). Unlike previous approaches that usually offer a single explanation of predictions for each node, Luo et al. train a NN to provide multi-instance explanations in an inductive manner. Orthogonally, Vu and Thai use a graphical model to show the dependencies of explained features in the form of conditional probabilities.

Computer Vision. Using graphs to represent objects in images, videos, or 3D point clouds is one of the popular applications of GML. Bear et al. propose to build a hierarchical graph of objects in an image and show that it segments an image into a scene better than past approaches. Zhou et al. leverage the graph structure of a low-resolution image to recover detailed textures and produce a super-resolution image.

Novel applications. Predicting the properties of molecules has been one of the most promising real-world applications of GNNs. Rong et al. pretrain a GNN model with 100M parameters on a huge unlabeled molecular dataset, achieving significant improvement over past approaches. Several works apply GNNs to software programs: Zhou et al. use a GNN as the encoder of a program for a deep RL algorithm that optimizes the program’s computational graph. In another work, Huang et al. exploit the inductive biases of GNNs for assembling furniture parts by predicting a translation and rotation for each input part.

Physics. More and more works now use GNN models to predict how elementary particles interact with each other. Cranmer et al. demonstrate exciting results in discovering unknown equations that govern the concentration of dark matter: the GNN’s edge messages and node outputs are fed to an additional genetic algorithm that iteratively searches for an underlying equation that explains the data well. Schoenholz and Cubuk present JAX MD, a package for performing molecular dynamics purely in Python on top of JAX, a NumPy-like library with autograd. The models are accelerated on GPU, support end-to-end training, and are suitable for graphs with up to hundreds of thousands of particles.

Part 2 conclusion. I have just scratched the surface; there are many other interesting and impactful papers, and you should check out the whole list if you want to dive deep into graph research. There is still a month before the conference takes place 😊

That’s all for today. Thanks for reading!

Feedback 💬 As always, if you have something to say, feel free to reply to this email. Likewise, contribute 💪 to future newsletters by sending me relevant content. Subscribe to my Telegram channel about graph machine learning, my Medium, and my Twitter. And spread the word among your friends by forwarding this letter or tweeting about it 🐦