TensorFlow, as the name indicates, is a framework used to define and run computations involving tensors. A tensor is a generalization of vectors and matrices to potentially higher dimensions. Internally, TensorFlow represents tensors as
n-dimensional arrays of some underlying data type. A Tensor has a DataType (e.g.,
FLOAT32, which corresponds to 32-bit floating point numbers) and a Shape (that is, the number of dimensions it has and the size of each dimension – e.g.,
Shape(10, 2) which corresponds to a matrix with 10 rows and 2 columns) associated with it. Each element in the Tensor has the same data type. For example, the following code creates an integer tensor filled with zeros with shape
[2, 5] (i.e., a two-dimensional array holding integer values, where the first dimension size is 2 and the second is 5):
val tensor = Tensor.zeros[Int](Shape(2, 5))
You can print the contents of a tensor as follows:
tensor.summarize()
// Prints the following:
// Tensor[Int, [2, 5]]
// [[0, 0, 0, 0, 0],
//  [0, 0, 0, 0, 0]]
Tensors can be created using various constructors defined in the Tensor companion object. For example:
val a = Tensor[Int](1, 2)                  // Creates a Tensor[Int] with shape [2]
val b = Tensor[Long](1L, 2)                // Creates a Tensor[Long] with shape [2]
val c = Tensor[Float](3.0f)                // Creates a Tensor[Float] with shape [1]
val d = Tensor[Double](-4.0)               // Creates a Tensor[Double] with shape [1]
val e = Tensor.empty[Int]                  // Creates an empty Tensor[Int] with shape [0]
val z = Tensor.zeros[Float](Shape(5, 2))   // Creates a zeros Tensor[Float] with shape [5, 2]
val r = Tensor.randn(Double, Shape(10, 3)) // Creates a Tensor[Double] with shape [10, 3] and
                                           // elements drawn from the standard Normal distribution.
As already mentioned, every tensor has a single data type. Various numeric data types are supported, as well as strings. A tensor cannot hold elements of more than one data type. It is possible, however, to serialize arbitrary data structures as strings and store those in tensors.
The list of all supported data types is:
STRING     // String
BOOLEAN    // Boolean
FLOAT16    // 16-bit half-precision floating-point
FLOAT32    // 32-bit single-precision floating-point
FLOAT64    // 64-bit double-precision floating-point
BFLOAT16   // 16-bit truncated floating-point
COMPLEX64  // 64-bit single-precision complex
COMPLEX128 // 128-bit double-precision complex
INT8       // 8-bit signed integer
INT16      // 16-bit signed integer
INT32      // 32-bit signed integer
INT64      // 64-bit signed integer
UINT8      // 8-bit unsigned integer
UINT16     // 16-bit unsigned integer
QINT8      // Quantized 8-bit signed integer
QINT16     // Quantized 16-bit signed integer
QINT32     // Quantized 32-bit signed integer
QUINT8     // Quantized 8-bit unsigned integer
QUINT16    // Quantized 16-bit unsigned integer
RESOURCE   // Handle to a mutable resource
VARIANT    // Variant
TensorFlow Scala also provides value classes for the types that are not natively supported by Scala (e.g., UByte corresponds to UINT8).
It is also possible to cast tensors from one data type to another using a
toXXX operator (e.g., toInt), or the castTo operator:
val floatTensor = Tensor[Float](1, 2, 3) // Floating point vector containing the elements: 1.0f, 2.0f, and 3.0f.
floatTensor.toInt                        // Integer vector containing the elements: 1, 2, and 3.
floatTensor.castTo[Int]                  // Integer vector containing the elements: 1, 2, and 3.
A tensor’s data type can be inspected using:
floatTensor.dataType // Returns FLOAT32
In general, all tensor-supported operations can be accessed as direct methods/operators of the Tensor object, or as static methods defined in the tfi package, which stands for TensorFlow Imperative (given the imperative nature of this API).
The shape of a tensor is the number of elements it contains in each dimension. The TensorFlow documentation uses two notational conventions to describe tensor dimensionality: rank, and shape. The following table shows how these relate to one another:
| Rank | Shape | Example |
|------|-------|---------|
| 0 | [] | A 0-D tensor. A scalar. |
| 1 | [D0] | A 1-D tensor with shape [D0]. |
| 2 | [D0, D1] | A 2-D tensor with shape [3, 4]. |
| 3 | [D0, D1, D2] | A 3-D tensor with shape [1, 4, 3]. |
| n | [D0, D1, … Dn-1] | A tensor with shape [D0, D1, … Dn-1]. |
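The relationship between rank and shape can be sketched in plain Scala, without the library. The `ShapeSketch` object and its `SimpleShape` alias below are hypothetical names used only for illustration and are not part of the TensorFlow Scala API. The sketch assumes a shape is simply the list of dimension sizes.

```scala
object ShapeSketch {
  // Hypothetical model (not the library's Shape class): a shape is the list of dimension sizes.
  type SimpleShape = Seq[Int]

  /** Rank: the number of dimensions, i.e., the length of the shape. */
  def rank(shape: SimpleShape): Int = shape.length

  /** Total number of elements: the product of the dimension sizes (1 for a scalar). */
  def numElements(shape: SimpleShape): Long = shape.map(_.toLong).product
}
```

Under this model, the empty shape has rank 0 and holds exactly one element (a scalar), while the shapes [3, 4] and [1, 4, 3] have ranks 2 and 3 respectively and both hold 12 elements.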
Shapes can be automatically converted to integer tensors, if necessary.
Shapes are automatically converted to
Tensor[Int] and not
Tensor[Long] in order to improve performance when working with GPUs. The reason is that TensorFlow treats integer tensors in a special manner, if they are placed on GPUs, assuming that they represent shapes.
val t0 = Tensor.ones[Int](Shape())     // Creates a scalar equal to the value 1
val t1 = Tensor.ones[Int](Shape(10))   // Creates a vector with 10 elements, all of which are equal to 1
val t2 = Tensor.ones[Int](Shape(5, 2)) // Creates a matrix with 5 rows and 2 columns

// You can also create tensors in the following way:
val t3 = Tensor(2.0, 5.6)                                 // Creates a vector that contains the numbers 2.0 and 5.6
val t4 = Tensor(Tensor(1.2f, -8.4f), Tensor(-2.3f, 0.4f)) // Creates a matrix with 2 rows and 2 columns
The shape of a tensor can be inspected using:
t4.shape // Returns the value Shape(2, 2)
The rank of a tensor is its number of dimensions. Synonyms for rank include order, degree, and n-dimension. Note that rank in TensorFlow is not the same as matrix rank in mathematics. As the following table shows, each rank in TensorFlow corresponds to a different mathematical entity:
| Rank | Mathematical entity |
|------|---------------------|
| 0 | Scalar (magnitude only) |
| 1 | Vector (magnitude and direction) |
| 2 | Matrix (table of numbers) |
| 3 | 3-Tensor (cube of numbers) |
| n | n-Tensor (you get the idea) |
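To make the difference between tensor rank and matrix rank concrete, here is a small plain-Scala sketch that does not use the library. The names `RankVersusMatrixRank`, `tensorRank`, and `matrixRank` are hypothetical, and the matrix rank is computed with ordinary Gaussian elimination as an illustration only.

```scala
object RankVersusMatrixRank {
  /** Tensor rank in the TensorFlow sense: the number of dimensions of the shape. */
  def tensorRank(shape: Seq[Int]): Int = shape.length

  /** Matrix rank in the linear-algebra sense: the number of linearly independent rows,
    * computed here with Gaussian elimination (illustrative, not tuned for numerical robustness). */
  def matrixRank(matrix: Vector[Vector[Double]], eps: Double = 1e-9): Int = {
    val rows = matrix.map(_.toArray).toArray // Mutable working copy.
    val numCols = if (rows.isEmpty) 0 else rows.head.length
    var rank = 0
    for (col <- 0 until numCols) {
      // Look for a row at or below position `rank` with a non-zero entry in this column.
      val pivot = (rank until rows.length).find(r => math.abs(rows(r)(col)) > eps)
      pivot.foreach { p =>
        // Move the pivot row into position.
        val tmp = rows(rank); rows(rank) = rows(p); rows(p) = tmp
        // Eliminate this column from all rows below the pivot row.
        for (r <- rank + 1 until rows.length) {
          val factor = rows(r)(col) / rows(rank)(col)
          for (c <- col until numCols) rows(r)(c) -= factor * rows(rank)(c)
        }
        rank += 1
      }
    }
    rank
  }
}
```

For the 2-by-2 matrix with rows (1, 2) and (2, 4), `tensorRank(Seq(2, 2))` is 2 because two indices are needed to address an element, whereas `matrixRank` returns 1 because the second row is a multiple of the first.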
The rank of a tensor can be inspected using:
t4.rank // Returns the value 2
Similar to NumPy, tensors can be indexed/sliced in various ways. An indexer can be one of:
Ellipsis: Full slice over multiple dimensions of a tensor. Ellipses are used to represent zero or more dimensions of a full-dimension indexer sequence.
NewAxis: Addition of a new dimension.
Slice: Slice over a single dimension of a tensor.
val t = Tensor.zeros[Float](Shape(4, 2, 3, 8))
t(::, ::, 1, ::)            // Tensor with shape [4, 2, 1, 8]
t(1 :: -2, ---, 2)          // Tensor with shape [1, 2, 3, 1]
t(---)                      // Tensor with shape [4, 2, 3, 8]
t(1 :: -2, ---, NewAxis, 2) // Tensor with shape [1, 2, 3, 1, 1]
t(1 ::, ---, NewAxis, 2)    // Tensor with shape [3, 2, 3, 1, 1]
In the examples above, --- corresponds to an ellipsis.
Note that each indexing sequence is only allowed to contain at most one ellipsis. Furthermore, if an ellipsis is not provided, then one is implicitly appended at the end of the indexing sequence. For example,
foo(2 :: 4) is equivalent to
foo(2 :: 4, ---).
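The ellipsis rules can be sketched as a small shape calculator in plain Scala. Everything in this block (`IndexingShapeSketch`, its `Indexer` types, and `resultShape`) is a hypothetical model written for illustration and is not the library's implementation. It assumes the behavior shown in the indexing example above: a single integer index keeps its dimension with size 1, slice ends are exclusive, and negative bounds count from the end of the dimension.

```scala
object IndexingShapeSketch {
  // Hypothetical stand-ins for the indexers described above (NOT the library's types).
  sealed trait Indexer
  case object Ellipsis extends Indexer                                 // Models `---`.
  case object NewAxis extends Indexer                                  // Models `NewAxis`.
  final case class Index(i: Int) extends Indexer                       // Models a single index such as `2`.
  final case class Slice(start: Int, end: Option[Int]) extends Indexer // Models `1 :: -2` or `1 ::`.
  val All: Indexer = Slice(0, None)                                    // Models `::`.

  /** Shape obtained by applying `indexers` to a tensor with shape `shape`. */
  def resultShape(shape: Seq[Int], indexers: Seq[Indexer]): Seq[Int] = {
    require(indexers.count(_ == Ellipsis) <= 1, "at most one ellipsis is allowed")
    // If no ellipsis is provided, one is implicitly appended at the end.
    val withEllipsis = if (indexers.contains(Ellipsis)) indexers else indexers :+ Ellipsis
    // Every indexer other than Ellipsis and NewAxis consumes one input dimension.
    val consumed = withEllipsis.count(i => i != Ellipsis && i != NewAxis)
    require(consumed <= shape.length, "too many indexers for this shape")
    // The ellipsis expands to full slices over all dimensions not consumed otherwise.
    val expanded = withEllipsis.flatMap {
      case Ellipsis => Seq.fill(shape.length - consumed)(All)
      case other    => Seq(other)
    }
    var dim = 0 // Next input dimension to consume.
    expanded.map {
      case NewAxis  => 1
      case Index(_) => dim += 1; 1
      case Slice(start, end) =>
        val size = shape(dim)
        dim += 1
        def wrap(i: Int): Int = if (i < 0) i + size else i
        math.max(0, wrap(end.getOrElse(size)) - wrap(start))
      case Ellipsis => sys.error("unreachable: the ellipsis was expanded above")
    }
  }
}
```

For a tensor with shape [4, 2, 3, 8], `resultShape(Seq(4, 2, 3, 8), Seq(Slice(1, Some(-2)), Ellipsis, Index(2)))` evaluates to `Seq(1, 2, 3, 1)`, and passing `Seq(Slice(2, Some(4)))` gives the same result as `Seq(Slice(2, Some(4)), Ellipsis)`, mirroring the equivalence of foo(2 :: 4) and foo(2 :: 4, ---).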