A Graph Neural Network, also known as a Graph Convolutional Network (GCN), is a neural network architecture that operates directly on graph-structured data. In this article, we'll introduce the concepts of graphs, convolutional neural networks, and Graph Neural Networks.

Read on to discover an example of a simple GCN in action, and see applications of graph convolutional networks including generating predictions in physical systems and image classification. Learn how to scale GCNs across large numbers of machines using the MissingLink deep learning platform.

Let’s start with basic definitions to get an orientation of the subject.

**A graph** in computer science is a data structure consisting of vertices (also called nodes) and edges (also called connections). Graphs are extremely useful and expressive mathematical structures, which can be used to model real-world phenomena like social networks, molecular structures, semantic structures, geographical or physical models, and more.

There are two common ways to represent a graph: as an equation G = (V, E), with a set of vertices V and a set of edges E; or as a diagram of nodes and the connections between them. The example below shows a **directed graph**, but edges can also be **undirected**.
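As a small sketch, both representations can be written in a few lines of Python (the node names here are made up for illustration):

```python
# A directed graph G = (V, E): a set of vertices and a set of edges.
V = {"A", "B", "C", "D"}
E = {("A", "B"), ("B", "C"), ("A", "D")}  # (source, target) pairs

# The same graph as an adjacency list: each node maps to its successors.
adjacency = {v: sorted(t for (s, t) in E if s == v) for v in V}

print(adjacency["A"])  # ['B', 'D']
```

The adjacency-list form is what most graph libraries build on internally, since it makes iterating over a node's neighbors cheap.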

**A Convolutional Neural Network** (CNN) is a neural network structure that breaks down an input, typically an image, into smaller pieces and performs feature extraction: it derives the important parts of the input, which can then be used to make a decision, typically a classification decision.

The CNN alternates between convolution and pooling layers. The convolution layers pass a filter over the source image and extract the important information from each piece. The pooling layers downsample the extracted information, retaining only the most essential parts. Once the essential data has been extracted, it is passed through a fully connected layer to arrive at the final classification decision.
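As an illustrative sketch (not a full CNN), a single convolution pass followed by 2×2 max pooling can be written in plain NumPy; the kernel here is a made-up vertical-edge filter:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide a kernel over the image (valid padding, stride 1)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Downsample by keeping the max of each size x size block."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size  # crop to a multiple of the block size
    fm = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return fm.max(axis=(1, 3))

image = np.arange(36, dtype=float).reshape(6, 6)
edge_kernel = np.array([[1.0, -1.0], [1.0, -1.0]])
features = convolve2d(image, edge_kernel)  # shape (5, 5)
pooled = max_pool(features)                # shape (2, 2)
```

Real CNN layers stack many such filters and learn the kernel values during training; this loop version only shows the sliding-window mechanics.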

A Graph Neural Network, also known as a Graph Convolutional Network (GCN), performs a convolution on a graph, instead of on an image composed of pixels.

Just like a CNN aims to extract the most important information from the image to classify the image, a GCN passes a filter over the graph, looking for essential vertices and edges that can help classify nodes within the graph.

Source: Zonghan Wu et. al., 2019

Following is a simplified formula that shows how a graph can be normalized and "packaged" into a regular neural network function that takes parameters and weights and returns the output to the next neural network layer. The function also has a non-linear element, enabling backpropagation.
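The figure for this formula is not reproduced here, but the widely used propagation rule from Kipf and Welling has the same shape: normalize the adjacency matrix, multiply by the node features and a weight matrix, and apply a non-linearity. A minimal NumPy sketch, with random weights standing in for learned parameters:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(0, A_norm @ H @ W)       # non-linearity enables backprop

# Toy graph: 4 nodes, undirected edges as a symmetric adjacency matrix.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
H = np.eye(4)                                       # one-hot node features
W = np.random.default_rng(0).normal(size=(4, 2))    # illustrative weights
out = gcn_layer(A, H, W)                            # shape (4, 2)
```

The symmetric normalization keeps the scale of node features stable as information is averaged over neighbors, which is what lets layers be stacked.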

A few important evolutions of GCNs provide additional capabilities:

- **Attention mechanisms:** attention-based Graph Convolutional Networks maintain state information to capture the neighborhood properties of the nodes. This makes it possible to "remember" important nodes and give them higher weights throughout the learning process.
- **Graph Spatial-Temporal Networks:** these are GCNs that support graphs with a fixed structure but inputs that change over time. For example, a traffic system with a fixed set of roads but variable traffic arriving over time. Below is one approach to Spatial-Temporal Networks, in which a 1-dimensional CNN layer moves over the X axis, representing time, while the GCN processes the spatial information at each time step.

Source: Zonghan Wu et. al., 2019

- **Graph Generative Networks:** these can generate new, realistic structures from data, similar to a Generative Adversarial Network. For example, MolGAN is a graph generative architecture in which a generator tries to propose a fake molecular graph, while the discriminator aims to distinguish fake graphs from real molecular graphs taken from empirical data. An external evaluation and reward system helps the network generate progressively more realistic graphs.

Below we can see a simple GCN in action. It goes through the following stages:

- **Normalizing the graph structure** and creating an input to the neural network, with node properties and weights.
- **Graph convolution layer** that passes a filter over the nodes and extracts essential features.
- **Leaky ReLU and dropout layers** that perform a sort of pooling/downsampling over the first convolution.

**Note:** There are new approaches for __pooling over a graph representation__, which are more elegant and could enable multiple convolutions for GNNs.

- **Second graph convolution** performed on the downsampled graph information.
- **Softmax layer** to perform the final classification decision.

**Note:** In a large graph, Softmax can become very computationally intensive, so GNN frameworks like __DeepWalk__ use hierarchical Softmax to save time, computing Softmax for only some of the nodes.
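The stages above can be sketched end to end in NumPy. This is a forward pass only, with random weights standing in for trained parameters, so the predictions are meaningless; it shows how the pieces fit together:

```python
import numpy as np

rng = np.random.default_rng(42)

def normalize(A):
    """Stage 1: symmetrically normalize the adjacency matrix (with self-loops)."""
    A_hat = A + np.eye(A.shape[0])
    d = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d[:, None] * d[None, :]

def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)

def dropout(x, rate=0.5):
    """Randomly zero activations (training-time regularization)."""
    mask = rng.random(x.shape) > rate
    return x * mask / (1.0 - rate)

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Toy inputs: 5 nodes, 3 input features, 2 output classes.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
X = rng.normal(size=(5, 3))
W1, W2 = rng.normal(size=(3, 4)), rng.normal(size=(4, 2))

A_norm = normalize(A)                   # normalize the graph structure
H = leaky_relu(A_norm @ X @ W1)         # first graph convolution + Leaky ReLU
H = dropout(H)                          # dropout over the first convolution
Z = softmax(A_norm @ H @ W2)            # second convolution + Softmax
```

Each row of `Z` is a probability distribution over the two classes for one node, which is exactly what a node-classification loss would be computed against.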

GCNs can be used to model real-world entities as graphs, and predict their interaction and future behavior. The network can treat real-world objects as vertices, and their interactions as edges. A GCN can make accurate inferences about properties of a physical system, such as collision dynamics, trajectories of objects, and even the effect of objects not seen in the image on other objects.

Image classification was traditionally performed by regular CNNs. However, practitioners are now applying GCN architectures to image classification problems, with encouraging results. A particular focus is on Zero Shot Learning (ZSL), in which a model needs to recognize and label an image belonging to a class it has never seen labeled examples of, by inferring which known classes it may be similar to. GCNs can use knowledge graphs to categorize a "zero shot" image, based on similarities between the images, objects extracted from the images, or semantic information in the category labels.

GCNs are used to solve community prediction problems like Zachary's Karate Club: a small social network where there is a conflict between the administrator and instructor in a karate club, and we need to predict which side each member of the club will choose. A GCN can address similar problems using semi-supervised classification via spectral graph convolutions.

__Tobias Jepsen__ showed that, using just two labeled nodes, a GCN can achieve a high degree of separation and generate accurate predictions for the two communities in the Karate Club problem. Below is the correct Karate Club classification and one of the results obtained by his GCN model.

Correct Classification of Karate Club Problem

Classification Achieved by Jepsen’s GCN Model

Image source: How to do Deep Learning on Graphs with Graph Convolutional Networks
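The Karate Club dataset ships with NetworkX, so it is easy to load the graph and the ground-truth split for experiments like the one above (this is not Jepsen's exact setup, just the raw data):

```python
import networkx as nx
import numpy as np

# Zachary's Karate Club: 34 members; each node carries a "club" attribute
# recording which side that member actually joined after the split.
G = nx.karate_club_graph()
A = nx.to_numpy_array(G)                        # 34 x 34 adjacency matrix
labels = [G.nodes[n]["club"] for n in G.nodes]

# In the semi-supervised setting described above, only two nodes would be
# labeled for training: the instructor (node 0, "Mr. Hi") and the
# administrator's side (node 33, "Officer").
print(A.shape, labels[0], labels[33])
```

A GCN trained on this graph with only those two labels propagates them through the edge structure to predict the allegiance of the remaining 32 members.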

Molecular structures are also graph structures. GCNs can learn about existing molecular structures and help researchers discover new ones. Taking a fixed-length molecular fingerprint as their input, they train on known molecules and generate predictions regarding unknown molecular structures. Generative Graph Networks such as MolGAN can be trained to create new molecular structures that have certain desired properties.

The field of operations research has a deep interest in combinatorial optimization problems, in which the objective is to find an optimal solution from a set of objects representing possible strategies. For example, researchers have applied GCNs to the classic __travelling salesman problem__, combining a GCN with reinforcement learning to iteratively learn a solution, starting from an input graph. GCNs have also outperformed traditional heuristics on problems such as the Quadratic Assignment Problem.

In this article, we learned the basics of GCNs, a new and powerful tool in the deep learning arsenal. Training GCNs is very computationally intensive, and you may have to experiment with several architectures or variants to get things right. Don’t wait for hours for GCNs to train. Use the MissingLink deep learning framework to:

- **Scale out GCNs** automatically across multiple machines or GPUs, either on-premise or in the cloud.
- **Define a cluster of machines** and automatically run GCN training jobs, ensuring each machine is utilized to the max.
- **Avoid idle time** by scheduling jobs and running experiments in sequence, without having to "babysit" machines and see when each experiment ends or fails.

MissingLink can also help you manage large numbers of experiments, track and share results, and manage large datasets and sync them easily to training machines.

__Learn more__ about the MissingLink deep learning platform.
