Deep Learning Tutorial Series (VI): Introduction to Using the tf.data API

This week, the "Playing around with TensorFlow and deep learning models" series of text tutorials brings you an introduction to using the tf.data API.

While learning and practicing, you can ask questions in the Academy WeChat exchange group, where tutors, teaching assistants, and experienced practitioners will help solve your problems and answer your questions. (How to join the group is described at the end of the article.)

The sixth tutorial focuses on importing data into TensorFlow (an introduction to using the tf.data API).

Introduction to tf.data

Earlier ways of importing data into a TensorFlow model fall into two main groups: one uses feed_dict, the other uses the queues in TensorFlow. The former is more flexible and lets you process all kinds of input data in Python, but its disadvantage is equally clear: the program runs less efficiently. The latter is more efficient, but it is more complex to use and less flexible.

As a new API, tf.data is faster than both of these methods and far easier to use. tf.data contains two interfaces for TensorFlow programs: Dataset and Iterator.

In TensorFlow version 1.4 the Dataset API was migrated from tf.contrib.data to tf.data, and support for Python generators was added. The TensorFlow team strongly recommends using the Dataset API to create input pipelines for TensorFlow models, for the following reasons.

The Dataset API provides more functionality than the old APIs (feed_dict or the queue-based pipeline).

The Dataset API has higher performance.

The Dataset API is cleaner and easier to use.

In the future, the TensorFlow team will focus development on the Dataset API rather than on the old APIs.


Dataset

A Dataset represents a collection of elements and can be thought of as a lazy list in functional programming, where each element is a tuple of tensors. A dataset can be created in two ways:

Source

Apply transformation


Source

Here, source refers to creating a dataset from existing data, such as tf.Tensor objects or files on disk. The common methods are as follows.

Their roles are, respectively: tf.data.Dataset.from_tensors creates a single-element dataset from a tuple of tensors; tf.data.Dataset.from_tensor_slices creates a dataset with multiple elements from a tuple of tensors; tf.data.TextLineDataset reads a list of filenames and builds a dataset in which every line of every file is one element; and tf.data.TFRecordDataset reads TFRecord format files from disk to construct the dataset.
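The four sources described above can be sketched as follows. This is only an illustration under some assumptions: the features, labels, and the two temporary files are made up for the example, and the import goes through tensorflow.compat.v1 so the TensorFlow 1.x style calls also run on newer releases (on TensorFlow 1.4 itself a plain "import tensorflow as tf" would be used). The read_all helper is hypothetical and relies on the one-shot iterator that is introduced later in this tutorial.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # keep the graph-and-session style used in this tutorial

# Made-up data for illustration only.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
labels = [0, 1, 0]

# One element that holds the whole (features, labels) tuple.
single = tf.data.Dataset.from_tensors((features, labels))
# Three elements, sliced along the first dimension of each tensor.
sliced = tf.data.Dataset.from_tensor_slices((features, labels))

# A throwaway text file so that TextLineDataset has something to read.
tmp_dir = tempfile.mkdtemp()
text_path = os.path.join(tmp_dir, "lines.txt")
with open(text_path, "w") as f:
    f.write("first line\nsecond line\n")
lines = tf.data.TextLineDataset([text_path])

# A throwaway TFRecord file containing two raw byte records.
record_path = os.path.join(tmp_dir, "data.tfrecord")
with tf.io.TFRecordWriter(record_path) as writer:
    writer.write(b"record-0")
    writer.write(b"record-1")
records = tf.data.TFRecordDataset([record_path])


def read_all(dataset):
    """Hypothetical helper: collect every element of a dataset into a list."""
    next_element = dataset.make_one_shot_iterator().get_next()
    values = []
    with tf.Session() as sess:
        while True:
            try:
                values.append(sess.run(next_element))
            except tf.errors.OutOfRangeError:
                break
    return values


single_elements = read_all(single)
sliced_elements = read_all(sliced)
line_elements = read_all(lines)
record_elements = read_all(records)
print(len(single_elements), len(sliced_elements), line_elements, record_elements)
```

Note the difference between the first two calls: from_tensors keeps the input as one element, while from_tensor_slices splits it along the first dimension.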

Apply transformation

The second method is to obtain a new dataset by transforming an existing one. TensorFlow supports many kinds of transformations; some common ones are described here.

The three common transformations map, repeat, and batch do the following, respectively: map applies a function to each element of the dataset (for example, decoding image data); repeat repeats the dataset a given number of times so that it can be used for training over multiple epochs; and batch stacks a given number of consecutive elements of the original dataset to produce a mini-batch.
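A minimal sketch of these three transformations follows. It assumes a small integer dataset instead of image data, so the map step doubles each value where a real pipeline would decode an image; NUM_EPOCHS and BATCH_SIZE are illustrative names, and the tensorflow.compat.v1 import is an assumption that lets the 1.x style code run on newer releases.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # keep the graph-and-session style used in this tutorial

NUM_EPOCHS = 2   # illustrative value
BATCH_SIZE = 3   # illustrative value

dataset = tf.data.Dataset.range(6)        # elements 0, 1, 2, 3, 4, 5
dataset = dataset.map(lambda x: x * 2)    # per-element function (stand-in for decoding)
dataset = dataset.repeat(NUM_EPOCHS)      # go through the data twice
dataset = dataset.batch(BATCH_SIZE)       # stack 3 consecutive elements per batch

next_batch = dataset.make_one_shot_iterator().get_next()
batches = []
with tf.Session() as sess:
    while True:
        try:
            batches.append(sess.run(next_batch).tolist())
        except tf.errors.OutOfRangeError:
            break  # raised once all epochs are consumed

print(batches)
```

Each transformation returns a new dataset, so the calls can also be chained in a single expression.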

TensorFlow version 1.4 also allows users to construct datasets from Python generators through tf.data.Dataset.from_generator.
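As a small sketch of a generator-backed dataset: the generator count_up_to below is a made-up example, not taken from the original article, and the tensorflow.compat.v1 import is an assumption so the code also runs on newer releases.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # keep the graph-and-session style used in this tutorial


def count_up_to(limit):
    """Hypothetical generator yielding (number, square) pairs."""
    for i in range(limit):
        yield i, i * i


# from_generator expects a callable that returns a fresh generator,
# plus the types (and optionally shapes) of the yielded values.
dataset = tf.data.Dataset.from_generator(
    lambda: count_up_to(4),
    output_types=(tf.int32, tf.int32),
    output_shapes=(tf.TensorShape([]), tf.TensorShape([])),
)

next_pair = dataset.make_one_shot_iterator().get_next()
pairs = []
with tf.Session() as sess:
    while True:
        try:
            number, square = sess.run(next_pair)
            pairs.append((int(number), int(square)))
        except tf.errors.OutOfRangeError:
            break

print(pairs)
```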

Combining the above gives a commonly used pattern: create a dataset from a source, then chain transformations on it.
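One possible shape of such a combined pipeline is sketched below. The exact snippet from the original article is not available, so the choice of source (an integer range), the shuffle step, and the sizes are all assumptions made for illustration; the tensorflow.compat.v1 import is likewise an assumption so the code runs on newer releases.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # keep the graph-and-session style used in this tutorial

dataset = tf.data.Dataset.range(10)             # source (a real pipeline might read files)
dataset = dataset.map(lambda x: x + 1)          # per-element preprocessing
dataset = dataset.shuffle(buffer_size=10)       # randomize order within each epoch
dataset = dataset.batch(5)                      # mini-batches of 5 elements
dataset = dataset.repeat(2)                     # two epochs

next_batch = dataset.make_one_shot_iterator().get_next()
batches = []
with tf.Session() as sess:
    while True:
        try:
            batches.append(sess.run(next_batch).tolist())
        except tf.errors.OutOfRangeError:
            break

print(batches)
```

Because repeat comes after shuffle and batch here, every epoch is shuffled separately and each epoch still contains every element exactly once.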


Iterator

After defining a dataset, you can access the tensor tuples in it through the Iterator interface. An iterator keeps track of the current position in the dataset and provides a way to access the data in it.

An iterator can be constructed by calling one of the dataset's make-iterator methods, such as make_one_shot_iterator() or make_initializable_iterator().

The API supports the following four kinds of iterator, in increasing order of complexity.

one-shot

initializable

reinitializable

feedable

The one-shot iterator is the simplest kind of iterator: it supports only a single pass over the dataset and needs no explicit initialization. It does not support parameterization. A typical example uses a dataset generated by tf.data.Dataset.range, which works similarly to range in Python.
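A minimal sketch of a one-shot iterator over such a range dataset is given here; the dataset size is arbitrary, and the tensorflow.compat.v1 import is an assumption so the 1.x style code also runs on newer releases.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # keep the graph-and-session style used in this tutorial

dataset = tf.data.Dataset.range(5)           # elements 0 to 4, like range(5) in Python
iterator = dataset.make_one_shot_iterator()  # no initializer needs to be run
next_element = iterator.get_next()

values = []
with tf.Session() as sess:
    for _ in range(5):
        values.append(int(sess.run(next_element)))

print(values)
```

Running next_element a sixth time would raise tf.errors.OutOfRangeError, because the iterator can pass over the data only once.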


An initializable iterator must be explicitly initialized by running its iterator.initializer operation before it can be used. In exchange, the dataset definition can be parameterized with values that are fed in through tf.placeholder when the iterator is initialized.
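The following sketch shows this parameterization, with a placeholder named max_value (an illustrative name) controlling the size of the dataset. The tensorflow.compat.v1 import is an assumption so the code also runs on newer releases.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # keep the graph-and-session style used in this tutorial

max_value = tf.placeholder(tf.int64, shape=[])   # fed at initialization time
dataset = tf.data.Dataset.range(max_value)
iterator = dataset.make_initializable_iterator()
next_element = iterator.get_next()

with tf.Session() as sess:
    # Initialize over a dataset of 3 elements.
    sess.run(iterator.initializer, feed_dict={max_value: 3})
    first_run = [int(sess.run(next_element)) for _ in range(3)]

    # Re-initialize the same iterator over a dataset of 5 elements.
    sess.run(iterator.initializer, feed_dict={max_value: 5})
    second_run = [int(sess.run(next_element)) for _ in range(5)]

print(first_run, second_run)
```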


A reinitializable iterator can be initialized from several different dataset objects. For example, the training set may be shuffled while the validation set is left unprocessed; in this case, two dataset objects with the same structure are typically used.
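A sketch of this training and validation setup follows. The two small range datasets are stand-ins for real data, and the tensorflow.compat.v1 import is an assumption so the code also runs on newer releases.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # keep the graph-and-session style used in this tutorial

# Two datasets with the same structure (scalar int64 elements).
training_dataset = tf.data.Dataset.range(10).shuffle(buffer_size=10)
validation_dataset = tf.data.Dataset.range(5)

# The iterator is defined by structure only, not by a particular dataset.
iterator = tf.data.Iterator.from_structure(
    training_dataset.output_types, training_dataset.output_shapes)
next_element = iterator.get_next()

training_init_op = iterator.make_initializer(training_dataset)
validation_init_op = iterator.make_initializer(validation_dataset)

with tf.Session() as sess:
    sess.run(training_init_op)        # point the iterator at the training set
    training_values = [int(sess.run(next_element)) for _ in range(10)]

    sess.run(validation_init_op)      # switch to the validation set
    validation_values = [int(sess.run(next_element)) for _ in range(5)]

print(training_values, validation_values)
```

Each initializer restarts the iterator at the beginning of the chosen dataset.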


A feedable iterator can be combined with tf.placeholder and the feed_dict mechanism to choose which iterator to use in each call to tf.Session.run. It provides functionality similar to the reinitializable iterator, but switching between datasets does not require initializing the iterator from the start of the dataset. The training and validation example above can be rewritten with a feedable iterator to switch between datasets.
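A sketch of such a feedable iterator is shown below, using two small, unshuffled range datasets as stand-ins for the training and validation sets so that the positions are easy to follow. The tensorflow.compat.v1 import is an assumption so the code also runs on newer releases.

```python
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # keep the graph-and-session style used in this tutorial

training_dataset = tf.data.Dataset.range(6)
validation_dataset = tf.data.Dataset.range(3)

# The handle placeholder selects which underlying iterator feeds next_element.
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(
    handle, training_dataset.output_types, training_dataset.output_shapes)
next_element = iterator.get_next()

training_iterator = training_dataset.make_one_shot_iterator()
validation_iterator = validation_dataset.make_initializable_iterator()

with tf.Session() as sess:
    training_handle = sess.run(training_iterator.string_handle())
    validation_handle = sess.run(validation_iterator.string_handle())

    # Read two training elements.
    train_first = [int(sess.run(next_element, feed_dict={handle: training_handle}))
                   for _ in range(2)]

    # Switch to validation and read it completely.
    sess.run(validation_iterator.initializer)
    validation_values = [int(sess.run(next_element, feed_dict={handle: validation_handle}))
                         for _ in range(3)]

    # Switch back: the training iterator resumes where it stopped.
    train_second = [int(sess.run(next_element, feed_dict={handle: training_handle}))
                    for _ in range(2)]

print(train_first, validation_values, train_second)
```

The last read illustrates the difference from the reinitializable iterator: switching back to training does not restart the training data.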

Code Example

Here is an example of reading, decoding, and resizing an image.
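A sketch of such a pipeline is given below. So that it can run anywhere, it first writes two small synthetic JPEG files to a temporary directory; those files, the labels, the 28x28 target size, and the name _parse_function are all assumptions for illustration, as is the tensorflow.compat.v1 import that lets the 1.x style code run on newer releases.

```python
import os
import tempfile

import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # keep the graph-and-session style used in this tutorial

# Write two synthetic 40x60 black JPEG images (stand-ins for real image files).
tmp_dir = tempfile.mkdtemp()
encoded = tf.image.encode_jpeg(tf.zeros([40, 60, 3], dtype=tf.uint8))
with tf.Session() as sess:
    jpeg_bytes = sess.run(encoded)
filenames = []
for i in range(2):
    path = os.path.join(tmp_dir, "image_%d.jpg" % i)
    with open(path, "wb") as f:
        f.write(jpeg_bytes)
    filenames.append(path)
labels = [0, 1]  # made-up labels


def _parse_function(filename, label):
    """Read one file, decode it as JPEG, and resize it to 28x28."""
    image_string = tf.read_file(filename)
    image_decoded = tf.image.decode_jpeg(image_string, channels=3)
    image_resized = tf.image.resize_images(image_decoded, [28, 28])
    return image_resized, label


dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
dataset = dataset.map(_parse_function)

next_image, next_label = dataset.make_one_shot_iterator().get_next()
with tf.Session() as sess:
    image, label = sess.run([next_image, next_label])

print(image.shape, image.dtype, label)
```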

See the references for more code and detailed instructions.

References

Official TensorFlow documentation

Google Developer Chinese Blog - Announcing TensorFlow r1.4

Same time next Monday: the "Playing around with TensorFlow and deep learning models" tutorial series (VII), TensorBoard and model saving.

Please stay tuned!

TensorFlow and Deep Learning Models Tutorial Series

Join the Community Artificial Intelligence Academy

Cultivate real-world AI talent that meets corporate needs
