Data Manipulation

Dive into Deep Learning · §1.1

Storing & transforming data with tensors
The n-dimensional arrays that every model in this book is built on.

The tensor: our basic data structure

Motivation

An n-dimensional array of numbers that generalizes the NumPy ndarray.
Runs on GPUs and other accelerators.
Records operations for automatic differentiation.

Rank = number of axes; shape = size per axis.

Getting Started

creating & inspecting tensors

Create a vector, then inspect it

tf.range(n) builds a 1-D tensor of evenly spaced values:

x = tf.range(12, dtype=tf.float32)
x

<tf.Tensor: shape=(12,), dtype=float32, numpy=
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10., 11.],
      dtype=float32)>

x.shape

TensorShape([12])

tf.size(x) → total elements. x.shape → size along each axis. We ask for float32 because nearly all neural-net math is in floating point.
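The inspection calls above can be sketched together as follows. This is a minimal example, not the deck's own code; the variable names n_elements, shape and rank are illustrative.

```python
import tensorflow as tf

x = tf.range(12, dtype=tf.float32)

n_elements = int(tf.size(x))   # total number of elements, as a Python int
shape = tuple(x.shape)         # size along each axis
rank = x.shape.rank            # number of axes

print(n_elements, shape, rank, x.dtype.name)
```

tf.size returns a scalar tensor, so it is wrapped in int() here to get a plain Python number.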

Random draws break symmetry; lists pin exact values

For weight init, tf.random.normal draws from the standard normal distribution \mathcal{N}(0, 1):

tf.random.normal(shape=[3, 4])

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[ 1.3810029 , -0.22911465, -0.5846182 ,  0.43986928],
       [-0.56185037,  0.5317954 ,  2.0249772 ,  0.27406558],
       [-0.27821013, -0.01750856, -0.08361343, -0.95873785]],
      dtype=float32)>

Or type exact values as a list:

tf.constant([[2, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

<tf.Tensor: shape=(3, 4), dtype=int32, numpy=
array([[2, 1, 4, 3],
       [1, 2, 3, 4],
       [4, 3, 2, 1]], dtype=int32)>

Also tf.zeros, tf.ones, tf.fill(shape, value), tf.eye(n). Random values break symmetry when initializing network weights; lists let you type a tensor by hand.
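A short sketch of the other constructors named above, assuming the default float32 dtype is acceptable; the variable names are illustrative.

```python
import tensorflow as tf

zeros = tf.zeros((2, 3))        # every entry 0.0
ones = tf.ones((2, 3))          # every entry 1.0
sevens = tf.fill((2, 3), 7.0)   # every entry set to the given value
identity = tf.eye(3)            # 3x3 identity matrix

print(zeros.shape, ones.shape, sevens.shape, identity.shape)
```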

Reshape: same data, new layout

Same elements in a new shape; the total element count is preserved:

X = tf.reshape(x, (3, 4))
X

<tf.Tensor: shape=(3, 4), dtype=float32, numpy=
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5.,  6.,  7.],
       [ 8.,  9., 10., 11.]], dtype=float32)>

Usually no copy: only the shape metadata changes. Use -1 to infer an axis: tf.reshape(x, (3, -1)).
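A minimal sketch of axis inference with -1. At most one axis can be left for the framework to work out from the element count.

```python
import tensorflow as tf

x = tf.range(12, dtype=tf.float32)

a = tf.reshape(x, (3, -1))    # 12 / 3 = 4 columns inferred
b = tf.reshape(x, (-1, 6))    # 12 / 6 = 2 rows inferred
flat = tf.reshape(a, (-1,))   # back to a 1-D tensor of 12 elements

print(a.shape, b.shape, flat.shape)
```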

Indexing & Slicing

reading & writing elements, rows, ranges

Reading: elements, rows, ranges

X[-1] is the last row;
X[1:3] is rows 1–2:

X[-1], X[1:3]

(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 8.,  9., 10., 11.], dtype=float32)>,
 <tf.Tensor: shape=(2, 4), dtype=float32, numpy=
 array([[ 4.,  5.,  6.,  7.],
        [ 8.,  9., 10., 11.]], dtype=float32)>)

0-based; negatives count from the end; a range a:b is half-open (b excluded).
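The same rules extend to columns and strides. The sketch below rebuilds the 3×4 tensor so it runs on its own; the names elem, col, every_other and last_two_cols are illustrative.

```python
import tensorflow as tf

X = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))

elem = X[1, 2]             # single element: row 1, column 2
col = X[:, 1]              # column 1 across all rows
every_other = X[::2]       # rows 0 and 2 (step of 2)
last_two_cols = X[:, -2:]  # the last two columns of every row

print(elem.numpy(), col.numpy(), every_other.shape, last_two_cols.shape)
```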

Writing: assign through a Variable

A Tensor is immutable; wrap it in a tf.Variable, then assign one element:

X_var = tf.Variable(X)
X_var[1, 2].assign(17)
X_var

<tf.Variable 'Variable:0' shape=(3, 4) dtype=float32, numpy=
array([[ 0.,  1.,  2.,  3.],
       [ 4.,  5., 17.,  7.],
       [ 8.,  9., 10., 11.]], dtype=float32)>
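When a Variable is not wanted, one alternative is to build a new tensor with a single entry replaced. The sketch below uses tf.tensor_scatter_nd_update, which the slides do not use; X_new is an illustrative name. The original tensor is left untouched.

```python
import tensorflow as tf

X = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))

# Each row of `indices` addresses one element; `updates` holds its new value.
X_new = tf.tensor_scatter_nd_update(X, indices=[[1, 2]], updates=[17.0])

print(X[1, 2].numpy(), X_new[1, 2].numpy())
```

This trades the in-place write for a fresh allocation, so it suits occasional edits rather than repeated updates of a large tensor.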

Writing: a whole region

A slice on the left assigns to a whole region at once:

X_var = tf.Variable(X)
X_var[:2, :].assign(tf.ones(X_var[:2,:].shape, dtype=tf.float32) * 12)
X_var

<tf.Variable 'Variable:0' shape=(3, 4) dtype=float32, numpy=
array([[12., 12., 12., 12.],
       [12., 12., 12., 12.],
       [ 8.,  9., 10., 11.]], dtype=float32)>

Operations

elementwise math, joins, comparisons, broadcasting

Elementwise ops: matching shapes, entry by entry

The operators + - * / ** act elementwise on matching shapes:

x = tf.constant([1.0, 2, 4, 8])
y = tf.constant([2.0, 2, 2, 2])
x + y, x - y, x * y, x / y, x ** y

(<tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 3.,  4.,  6., 10.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([-1.,  0.,  2.,  6.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 2.,  4.,  8., 16.], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([0.5, 1. , 2. , 4. ], dtype=float32)>,
 <tf.Tensor: shape=(4,), dtype=float32, numpy=array([ 1.,  4., 16., 64.], dtype=float32)>)

Unary functions like exp map each element (here, of the values 0 to 11):

tf.exp(tf.range(12, dtype=tf.float32))

<tf.Tensor: shape=(12,), dtype=float32, numpy=
array([1.0000000e+00, 2.7182817e+00, 7.3890562e+00, 2.0085537e+01,
       5.4598148e+01, 1.4841316e+02, 4.0342880e+02, 1.0966332e+03,
       2.9809580e+03, 8.1030840e+03, 2.2026467e+04, 5.9874145e+04],
      dtype=float32)>

Any scalar→scalar map (exp, sin, log) extends to a whole tensor.
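A brief sketch of a few more elementwise maps on a small vector. The output always has the input's shape; the variable names are illustrative.

```python
import tensorflow as tf

x = tf.constant([1.0, 2, 4, 8])

roots = tf.sqrt(x)                          # square root of each entry
log2 = tf.math.log(x) / tf.math.log(2.0)    # base-2 logarithm via natural log
sines = tf.sin(x)                           # sine of each entry

print(roots.numpy(), log2.numpy(), sines.shape)
```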

Concatenate along an axis

tf.concat joins along an existing axis; axis=0 adds rows, axis=1 adds columns:

X = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))
Y = tf.constant([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])
tf.concat([X, Y], axis=0), tf.concat([X, Y], axis=1)

Every other axis must already match.
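A quick way to check a concatenation is to look at the resulting shapes. A minimal sketch using the same X and Y as above; rows and cols are illustrative names.

```python
import tensorflow as tf

X = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))
Y = tf.constant([[2.0, 1, 4, 3], [1, 2, 3, 4], [4, 3, 2, 1]])

rows = tf.concat([X, Y], axis=0)   # stack vertically: 3 + 3 rows
cols = tf.concat([X, Y], axis=1)   # stack horizontally: 4 + 4 columns

print(rows.shape, cols.shape)      # (6, 4) (3, 8)
```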

Comparisons build masks; reductions collapse

Comparisons return a boolean tensor, a ready-made mask:

X == Y

<tf.Tensor: shape=(3, 4), dtype=bool, numpy=
array([[False,  True, False,  True],
       [False, False, False, False],
       [False, False, False, False]])>

Reductions collapse axes; with no axis= argument the result is a scalar:

tf.reduce_sum(X)

<tf.Tensor: shape=(), dtype=float32, numpy=66.0>

==, <, > build masks; tf.reduce_sum, tf.reduce_mean, tf.reduce_max collapse axes; add axis= to reduce just one.
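Reducing along one axis, and using a mask to select or count entries, might look like the sketch below. tf.boolean_mask and keepdims are not used in the slides and are shown here only as one possible approach.

```python
import tensorflow as tf

X = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))

col_sums = tf.reduce_sum(X, axis=0)                      # collapse rows -> shape (4,)
row_sums = tf.reduce_sum(X, axis=1)                      # collapse columns -> shape (3,)
row_means = tf.reduce_mean(X, axis=1, keepdims=True)     # keep the axis -> shape (3, 1)

mask = X > 5                                             # boolean tensor, same shape as X
picked = tf.boolean_mask(X, mask)                        # 1-D tensor of entries where mask is True
count = int(tf.reduce_sum(tf.cast(mask, tf.int32)))      # number of True entries

print(col_sums.numpy(), row_sums.numpy(), row_means.shape, count)
```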

Broadcasting stretches size-1 axes for free

Operations · the exception

Size-1 axes are virtually stretched, so a (3, 1) tensor plus a (1, 2) tensor gives a (3, 2) result:

a = tf.reshape(tf.range(3), (3, 1))
b = tf.reshape(tf.range(2), (1, 2))
a, b

a + b

<tf.Tensor: shape=(3, 2), dtype=int32, numpy=
array([[0, 1],
       [1, 2],
       [2, 3]], dtype=int32)>

Any axis of size 1 stretches to match the other tensor, without a copy.

Compatible only if each axis is equal or 1.

…or it refuses: no size-1 axis, no guess

Line up (3, 2) and (2, 3) from the right, pairing 2 with 3 and 3 with 2: no pair matches, neither member is 1, so the framework raises rather than guessing:

try:
    tf.ones((3, 2)) + tf.ones((2, 3))
except Exception as e:
    print(e)

{{function_node __wrapped__AddV2_device_/job:localhost/replica:0/task:0/device:GPU:0}} required broadcastable shapes [Op:AddV2] name:

Broadcasting aligns shapes from the right; each axis pair must be equal or 1.
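The rule can be written out in a few lines of plain Python. The helper below, broadcast_shape, is hypothetical and exists only to illustrate the right-aligned check; it is not part of TensorFlow.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: result shape under right-aligned broadcasting."""
    result = []
    # Walk both shapes from the last axis; a missing axis counts as size 1.
    for m, n in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if m == n or n == 1:
            result.append(m)
        elif m == 1:
            result.append(n)
        else:
            raise ValueError(f"incompatible axes: {m} vs {n}")
    return tuple(reversed(result))


print(broadcast_shape((3, 1), (1, 2)))   # (3, 2)
print(broadcast_shape((3, 4), (4,)))     # (3, 4)
try:
    broadcast_shape((3, 2), (2, 3))
except ValueError as e:
    print("refused:", e)
```

Pairing from the right is why a (4,) vector combines with a (3, 4) matrix, while (3, 2) and (2, 3) are refused.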

Memory & Interop

in-place updates and leaving the tensor world

The hidden cost of `Y = Y + X`

Performance

Every arithmetic expression allocates a new tensor, which is costly when Y is gigabytes and updated many times per second:

before = id(Y)
Y = Y + X
id(Y) == before

False

id(Y) changed: Y is now bound to a new tensor object.

Saving memory through a Variable

A Variable.assign writes in place; its id is unchanged:

Z = tf.Variable(tf.zeros_like(Y))
print('id(Z):', id(Z))
Z.assign(X + Y)
print('id(Z):', id(Z))

id(Z): 128629978526544
id(Z): 128629978526544
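For an accumulating update such as Z += X, a Variable also offers assign_add. A minimal sketch with small stand-in tensors; as in the slide above, the unchanged id only shows that Z still names the same Python object.

```python
import tensorflow as tf

X = tf.ones((3, 4))
Z = tf.Variable(tf.zeros((3, 4)))

before = id(Z)
Z.assign_add(X)    # Z <- Z + X, written into the existing variable
Z.assign_add(X)    # a second update reuses the same variable
same_object = id(Z) == before

print(same_object, Z.numpy()[0])
```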

Converting to other Python objects

Interop

Convert to / from a NumPy ndarray:

A = X.numpy()
B = tf.constant(A)
type(A), type(B)

(numpy.ndarray, tensorflow.python.framework.ops.EagerTensor)

The result is a copy; host/device arrays don’t share storage here.

A size-1 tensor unwraps to a Python scalar: convert it to NumPy first, then call .item():

a = tf.constant([3.5]).numpy()
a, a.item(), float(a.item()), int(a.item())

(array([3.5], dtype=float32), 3.5, 3.5, 3)
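The copy semantics and scalar conversion can be checked with a short sketch. It assumes eager execution (the TensorFlow 2 default); calling float() and int() directly on a scalar tensor is an alternative to the .item() route shown above.

```python
import tensorflow as tf

X = tf.reshape(tf.range(12, dtype=tf.float32), (3, 4))

A = X.numpy()          # NumPy copy of the tensor's contents
A[0, 0] = 100.0        # mutate the NumPy array only
x00 = float(X[0, 0])   # the tensor still holds its original value

scalar = tf.constant(3.5)            # a rank-0 (scalar) tensor
as_float, as_int = float(scalar), int(scalar)

print(x00, as_float, as_int)
```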

Summary

Wrap-up

Tensor = n-d array; the core data structure (GPU + autodiff).
Create: tf.range, tf.zeros, tf.ones, tf.random.normal, tf.constant([…]).
Inspect / restructure: .shape, tf.size, tf.reshape.
Index / slice to read and write: negatives, ranges, regions.

Elementwise math, comparisons (masks), reductions, tf.concat.
Broadcasting stretches size-1 axes and refuses anything else.
Save memory by updating a tf.Variable in place (assign, assign_add) instead of rebinding a name with Y = Y + X.
Interop: tensor ↔︎ NumPy, .item() for scalars.
