Tensors

TensorFlow, as the name indicates, is a framework for defining and running computations that involve tensors. A tensor is a generalization of vectors and matrices to potentially higher dimensions. Internally, TensorFlow represents tensors as n-dimensional arrays of some underlying data type. A Tensor has an associated DataType (e.g., FLOAT32, which corresponds to 32-bit floating-point numbers) and an associated Shape (that is, the number of dimensions it has and the size of each dimension; e.g., Shape(10, 2) corresponds to a matrix with 10 rows and 2 columns). All elements of a Tensor have the same data type. For example, the following code creates an integer tensor filled with zeros with shape [2, 5] (i.e., a two-dimensional array of integer values whose first dimension has size 2 and whose second has size 5):

val tensor = Tensor.zeros[Int](Shape(2, 5))

You can print the contents of a tensor as follows:

tensor.summarize()
// Prints the following:
//   Tensor[Int, [2, 5]]
//   [[0, 0, 0, 0, 0],
//    [0, 0, 0, 0, 0]]

Tensor Creation

Tensors can be created using various constructors defined in the Tensor companion object. For example:

val a = Tensor[Int](1, 2)                  // Creates a Tensor[Int] with shape [2]
val b = Tensor[Long](1L, 2)                // Creates a Tensor[Long] with shape [2]
val c = Tensor[Float](3.0f)                // Creates a Tensor[Float] with shape [1]
val d = Tensor[Double](-4.0)               // Creates a Tensor[Double] with shape [1]
val e = Tensor.empty[Int]                  // Creates an empty Tensor[Int] with shape [0]
val z = Tensor.zeros[Float](Shape(5, 2))   // Creates a zeros Tensor[Float] with shape [5, 2]
val r = Tensor.randn(Double, Shape(10, 3)) // Creates a Tensor[Double] with shape [10, 3] and
                                           // elements drawn from the standard Normal distribution.

Data Types

As already mentioned, every tensor has a data type. Various numeric data types are supported, as well as strings. A single Tensor cannot hold elements of more than one data type. It is possible, however, to serialize arbitrary data structures as strings and store those in tensors.
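The serialization idea can be sketched in plain Scala, independently of the tensor API. The Point record and the encode/decode helpers below are hypothetical names introduced only for illustration, and the comma-separated format is an arbitrary choice; any reversible string encoding would serve. The resulting strings are the kind of values that could then be stored in a tensor of strings.

```scala
import scala.util.Try

// Hypothetical record type, used only for this illustration.
final case class Point(x: Int, y: Int)

// Encode a Point as a plain string such as "1,2".
def encode(p: Point): String = s"${p.x},${p.y}"

// Decode the string form back into a Point.
// Returns None unless the input is exactly two comma-separated integers.
def decode(s: String): Option[Point] = s.split(",", -1) match {
  case Array(x, y) => Try(Point(x.trim.toInt, y.trim.toInt)).toOption
  case _           => None
}

val points  = Seq(Point(1, 2), Point(-3, 4))
val encoded = points.map(encode)      // Seq("1,2", "-3,4")
val decoded = encoded.flatMap(decode) // Recovers the original points
```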

The list of all supported data types is:

STRING     // String
BOOLEAN    // Boolean
FLOAT16    // 16-bit half-precision floating-point
FLOAT32    // 32-bit single-precision floating-point
FLOAT64    // 64-bit double-precision floating-point
BFLOAT16   // 16-bit truncated floating-point
COMPLEX64  // 64-bit single-precision complex
COMPLEX128 // 128-bit double-precision complex
INT8       // 8-bit signed integer
INT16      // 16-bit signed integer
INT32      // 32-bit signed integer
INT64      // 64-bit signed integer
UINT8      // 8-bit unsigned integer
UINT16     // 16-bit unsigned integer
QINT8      // Quantized 8-bit signed integer
QINT16     // Quantized 16-bit signed integer
QINT32     // Quantized 32-bit signed integer
QUINT8     // Quantized 8-bit unsigned integer
QUINT16    // Quantized 16-bit unsigned integer
RESOURCE   // Handle to a mutable resource
VARIANT    // Variant

TensorFlow Scala also provides value classes for the types that are not natively supported by Scala (e.g., UByte corresponds to UINT8).

It is also possible to cast tensors from one data type to another using either the toXXX methods (e.g., toInt) or the castTo[XXX] method:

val floatTensor = Tensor[Float](1, 2, 3) // Floating point vector containing the elements: 1.0f, 2.0f, and 3.0f.
floatTensor.toInt                        // Integer vector containing the elements: 1, 2, and 3.
floatTensor.castTo[Int]                  // Integer vector containing the elements: 1, 2, and 3.

A tensor’s data type can be inspected using:

floatTensor.dataType // Returns FLOAT32

Performing Operations on Tensors

In general, all operations supported on tensors can be accessed either as methods/operators of the Tensor object, or as static methods defined in the tfi package. The name tfi stands for TensorFlow Imperative, reflecting the imperative nature of this API.

Shape

The shape of a tensor is the number of elements it contains in each dimension. The TensorFlow documentation uses two notational conventions to describe tensor dimensionality: rank and shape. The following table shows how these relate to one another:

Rank   Shape               Example
0      []                  A 0-D tensor. A scalar.
1      [D0]                A 1-D tensor with shape [5].
2      [D0, D1]            A 2-D tensor with shape [3, 4].
3      [D0, D1, D2]        A 3-D tensor with shape [1, 4, 3].
n      [D0, D1, … Dn-1]    A tensor with shape [D0, D1, … Dn-1].

Shapes can be automatically converted to integer tensors, if necessary.

Note

Shapes are automatically converted to Tensor[Int] and not Tensor[Long] in order to improve performance when working with GPUs. The reason is that TensorFlow handles 32-bit integer tensors specially when they are placed on GPUs, on the assumption that they represent shapes.

For example, shapes are used when creating tensors:

val t0 = Tensor.ones[Int](Shape())     // Creates a scalar equal to the value 1
val t1 = Tensor.ones[Int](Shape(10))   // Creates a vector with 10 elements, all of which are equal to 1
val t2 = Tensor.ones[Int](Shape(5, 2)) // Creates a matrix with 5 rows and 2 columns, all of whose elements are equal to 1

// You can also create tensors in the following way:
val t3 = Tensor(2.0, 5.6)                                 // Creates a vector that contains the numbers 2.0 and 5.6
val t4 = Tensor(Tensor(1.2f, -8.4f), Tensor(-2.3f, 0.4f)) // Creates a matrix with 2 rows and 2 columns

The shape of a tensor can be inspected using:

t4.shape // Returns the value Shape(2, 2)

Rank

The rank of a tensor is its number of dimensions. Synonyms for rank include order, degree, and n-dimension. Note that rank in TensorFlow is not the same as matrix rank in mathematics. As the following table shows, each rank in TensorFlow corresponds to a different mathematical entity:

Rank   Math Entity
0      Scalar (magnitude only)
1      Vector (magnitude and direction)
2      Matrix (table of numbers)
3      3-Tensor (cube of numbers)
n      n-Tensor (you get the idea)

The rank of a tensor can be inspected using:

t4.rank // Returns the value 2
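The relationship between shape and rank can be sketched in plain Scala by modelling a shape as a sequence of dimension sizes. This is only an illustrative model, not the library's Shape class; the helper names rankOf and numElementsOf are assumptions made for this example. It also shows why a scalar, whose shape is empty, still holds exactly one element.

```scala
// A shape modelled as a plain sequence of dimension sizes (illustrative only).

// The rank is the number of dimensions, i.e., the length of the shape.
def rankOf(shape: Seq[Int]): Int = shape.length

// The number of elements is the product of all dimension sizes.
// The empty product is 1, so a scalar (shape []) holds one element.
def numElementsOf(shape: Seq[Int]): Long = shape.foldLeft(1L)(_ * _)

val scalarShape = Seq.empty[Int] // []
val cubeShape   = Seq(1, 4, 3)   // [1, 4, 3]

rankOf(scalarShape)        // 0
numElementsOf(scalarShape) // 1
rankOf(cubeShape)          // 3
numElementsOf(cubeShape)   // 12
```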

Indexing / Slicing

Similar to NumPy, tensors can be indexed/sliced in various ways. An indexer can be one of:

  • Ellipsis: Full slice over multiple dimensions of a tensor. Ellipses are used to represent zero or more dimensions of a full-dimension indexer sequence.
  • NewAxis: Addition of a new dimension.
  • Slice: Slice over a single dimension of a tensor.

Examples of constructing and using indexers are provided in the Ellipsis and the Slice documentation. Here we provide examples of indexing over tensors using indexers:

val t = Tensor.zeros[Float](Shape(4, 2, 3, 8))
t(::, ::, 1, ::)            // Tensor with shape [4, 2, 1, 8]
t(1 :: -2, ---, 2)          // Tensor with shape [1, 2, 3, 1]
t(---)                      // Tensor with shape [4, 2, 3, 8]
t(1 :: -2, ---, NewAxis, 2) // Tensor with shape [1, 2, 3, 1, 1]
t(1 ::, ---, NewAxis, 2)    // Tensor with shape [3, 2, 3, 1, 1]

where --- corresponds to an ellipsis.

Note that each indexing sequence may contain at most one ellipsis. Furthermore, if no ellipsis is provided, one is implicitly appended at the end of the indexing sequence. For example, foo(2 :: 4) is equivalent to foo(2 :: 4, ---).
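To make the resulting dimension sizes in the indexing examples easier to follow, here is a minimal plain-Scala sketch of how a slice with an optional, possibly negative end could resolve to a length. The helper sliceLength is hypothetical and not part of the TensorFlow Scala API. It assumes NumPy-like semantics: the end is exclusive, the step is 1, negative indices count from the end of the dimension, and out-of-range bounds are clamped.

```scala
// Hypothetical helper (not a library function): length of the slice
// `begin :: end` over a dimension of size `dimSize`, assuming an exclusive
// end and a step of 1. `end = None` models an open-ended slice like `1 ::`.
def sliceLength(begin: Int, end: Option[Int], dimSize: Int): Int = {
  // Negative indices count backwards from the end of the dimension.
  def resolve(i: Int): Int = if (i < 0) i + dimSize else i
  // Clamp both bounds into the valid range [0, dimSize].
  def clamp(i: Int): Int = math.min(math.max(i, 0), dimSize)
  val b = clamp(resolve(begin))
  val e = clamp(end.map(resolve).getOrElse(dimSize))
  math.max(e - b, 0)
}

sliceLength(1, Some(-2), 4) // 1, as in the first dimension of t(1 :: -2, ---, 2)
sliceLength(1, None, 4)     // 3, as in the first dimension of t(1 ::, ---, NewAxis, 2)
```

Under these assumptions, 1 :: -2 over a dimension of size 4 covers only index 1, while 1 :: covers indices 1 through 3, which agrees with the first dimension of the shapes listed in the examples.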