Linear Layer#

Note

Linear layers are sometimes called dense layers, as in the Keras library.

What do linear layers do?#

A linear layer transforms a vector into another vector, possibly of a different dimension. For example, a linear layer can transform a 3-dimensional vector such as [1, 2, 3] into a 4-dimensional vector. (The exact output values depend on the layer's parameters, as we will see below.)
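To make the shape change concrete, here is a minimal preview sketch using PyTorch (the library used in the code section below). The output values come from randomly initialized parameters; only the output dimension is guaranteed:

import torch
from torch.nn import Linear

layer = Linear(3, 4)            # maps 3-dimensional vectors to 4-dimensional vectors
x = torch.tensor([1., 2., 3.])
print(layer(x).shape)           # torch.Size([4])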

When to use linear layers?#

Use a linear layer when you need to map a vector to another vector, typically because the target vector's dimension differs from that of the vector at hand.

Note

Linear layers are often called linear transformations or linear mappings. (Strictly speaking, because of the bias term, the operation is an affine transformation.)

How does a linear layer work?#

There are two components in a linear layer: a weight matrix \( W \) and a bias vector \( B \). If the input to the linear layer is a vector \( X \), then the output is \( W X + B \).

If the linear layer transforms a vector of dimension \( N \) into a vector of dimension \( M \), then \( W \) is an \( M \times N \) matrix, \( X \) is of dimension \( N \), and \( B \) is of dimension \( M \).
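A small worked example with \( N = 3 \) and \( M = 2 \), using hand-picked numbers for \( W \) and \( B \):

\( W = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix}, \quad X = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}, \quad B = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad W X + B = \begin{pmatrix} 1 \\ 5 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 6 \end{pmatrix} \)

The 3-dimensional input \( X \) becomes a 2-dimensional output.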

Linear layers in code?#

import torch
from torch.nn import Linear

# a linear layer that maps 3-dimensional vectors to 4-dimensional vectors
linear = Linear(3, 4)
print(linear.weight.detach())  # the weight matrix W, shape (4, 3)
print(linear.bias.detach())    # the bias vector B, shape (4,)
tensor([[-0.5237,  0.3947,  0.3233],
        [-0.5529, -0.0427,  0.4963],
        [ 0.2408, -0.0646, -0.4288],
        [ 0.3744, -0.1231,  0.0030]])
tensor([ 0.4035, -0.1470, -0.4819,  0.4523])

As you can see, a linear layer is just two tensors: a weight matrix and a bias vector.
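You can confirm the \( M \times N \) shape rule from the previous section on this layer (here \( N = 3 \), \( M = 4 \)):

print(linear.weight.shape)  # torch.Size([4, 3]), i.e. M x N
print(linear.bias.shape)    # torch.Size([4]),    i.e. M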

x = torch.tensor([1., 2., 3.])
y1 = linear(x)                        # forward pass through the layer
y2 = linear.weight @ x + linear.bias  # the same computation by hand: W X + B
print(y1)
print(y2)
print(y1 == y2)                       # element-wise equality check
tensor([ 1.6389,  0.7036, -1.6568,  0.5896], grad_fn=<ViewBackward0>)
tensor([ 1.6389,  0.7036, -1.6568,  0.5896], grad_fn=<AddBackward0>)
tensor([True, True, True, True])

All a linear layer does is multiply the input vector by its weight matrix and then add the bias vector. In linear algebra notation, that is \( W X + B \), with \( W \) the weight matrix and \( B \) the bias vector.
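To see the arithmetic with no linear algebra machinery at all, here is a sketch that recomputes the same output with plain Python loops, using the linear layer and x from above:

# row i of the weight matrix dotted with x, plus bias entry i, gives output entry i
W = linear.weight.detach().tolist()
B = linear.bias.detach().tolist()
y_manual = [sum(w * xj for w, xj in zip(row, x.tolist())) + b
            for row, b in zip(W, B)]
print(y_manual)  # same numbers as y1 and y2, up to floating-point rounding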