TensorFlow Tutorial

    There are two ways to load data; they are as follows:

    1. Load Data using NumPy Array
    2. Load Data using TensorFlow Data Pipeline


    Load Data using NumPy Array

    We can hard-code data into a NumPy array, or we can load data from an Excel (xls or xlsx) or CSV file into a Pandas DataFrame, which is later converted into a NumPy array. You can use this method if your dataset is not too big, say less than 10 gigabytes, so that the data fits into memory.

    ## Python list to Pandas DataFrame
    import numpy as np
    import pandas as pd

    h = [[1, 2], [3, 4]]
    df_h = pd.DataFrame(h)
    print('Data Frame:', df_h)

    ## Pandas DataFrame to NumPy array
    df_h_n = np.array(df_h)
    print('Numpy array:', df_h_n)


    The output of the above code will be:

    Data Frame:    0  1
    0  1  2
    1  3  4
    Numpy array: [[1 2]
     [3 4]]
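
    In practice, the data often comes from a file rather than being hard-coded. Below is a minimal sketch of the same conversion starting from a CSV file; the file name data.csv and its contents are hypothetical, not part of this lesson:

    ## CSV (or Excel) to Pandas to NumPy -- 'data.csv' is a placeholder name
    import numpy as np
    import pandas as pd

    df = pd.read_csv('data.csv')        # for Excel files, use pd.read_excel('data.xlsx')
    data = np.array(df)                 # convert the DataFrame into a NumPy array
    print('Numpy array shape:', data.shape)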


    Load Data using TensorFlow Data Pipeline

    TensorFlow has a built-in API, tf.data, that helps you load the data, perform operations on it, and feed the machine learning algorithm easily. This method works best when you have a large dataset. For instance, image datasets are known to be huge and often do not fit into memory: if you have a dataset of 50 gigabytes and your computer has only 16 gigabytes of memory, loading it all at once will crash the machine. The data pipeline manages the memory by itself.

    In these circumstances, you need to build a TensorFlow pipeline. The pipeline loads the data in batches, or small chunks. Each batch is pushed through the pipeline and made ready for training. Building a pipeline is an excellent solution because it permits you to use parallel computing: TensorFlow can prepare the next batches on multiple CPU cores while the model trains. This speeds up computation and permits the training of powerful neural networks.
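
    For example, the following minimal sketch batches a small array and transforms each row in parallel. The array shape, the row * 2 transform, and the batch size are illustrative choices only, written against the same TensorFlow 1.x API as the rest of this lesson:

    ## Batched pipeline sketch (TensorFlow 1.x)
    import numpy as np
    import tensorflow as tf

    features = np.random.sample((8, 2))            # 8 rows of 2 random values
    dataset = tf.data.Dataset.from_tensor_slices(features)
    dataset = dataset.map(lambda row: row * 2,     # transform each row...
                          num_parallel_calls=2)    # ...on 2 CPU threads
    dataset = dataset.batch(4)                     # group rows into chunks of 4

    iterator = dataset.make_one_shot_iterator()
    next_batch = iterator.get_next()

    with tf.Session() as sess:
        print(sess.run(next_batch))                # first batch of 4 rows
        print(sess.run(next_batch))                # second batch of 4 rows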

    Methods to create TensorFlow Data Pipeline:

    1. Create the Data:
      import numpy as np  
      import tensorflow as tf  
      x_input = np.random.sample((1,2))  
      print(x_input)

      In the above code, we generate a 1 x 2 array of random numbers (floats between 0 and 1) using NumPy's random number generator.

    2. Create the Placeholder
      x = tf.placeholder(tf.float32, shape=[1,2], name = 'X')

      We create a placeholder named X with tf.placeholder(), giving it the same shape as the NumPy array.

    3. Define the Dataset Method
      dataset = tf.data.Dataset.from_tensor_slices(x)

      We create the dataset with tf.data.Dataset.from_tensor_slices(), which slices the placeholder along its first dimension.

    4. Create the Pipeline
      iterator = dataset.make_initializable_iterator()   
      get_next = iterator.get_next()

      In the above code, we initialize the pipeline where the data will flow. We create an iterator with make_initializable_iterator and name it iterator. Then we call iterator.get_next() to supply the next batch of data and name this step get_next. Note that in this example, there is only one batch of data, with only two values.

    5. Execute the Operation
      with tf.Session() as sess:  
          # feed the placeholder with data  
          sess.run(iterator.initializer, feed_dict={ x: x_input })   
          print(sess.run(get_next)) # output [ 0.52374458  0.71968478]

      In the above code, we start a session and run iterator.initializer, feeding feed_dict with the values generated by NumPy. These two values populate the placeholder x. Then we run get_next to print the result.
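
    The example above contains only a single batch. When the pipeline holds several batches, the usual TensorFlow 1.x pattern is to run get_next in a loop until the iterator raises tf.errors.OutOfRangeError. A minimal sketch (the array shape and batch size are arbitrary):

    ## Consuming a multi-batch pipeline (TensorFlow 1.x)
    import numpy as np
    import tensorflow as tf

    data = np.random.sample((6, 2))                # 6 rows of 2 random values
    dataset = tf.data.Dataset.from_tensor_slices(data).batch(2)
    iterator = dataset.make_initializable_iterator()
    get_next = iterator.get_next()

    with tf.Session() as sess:
        sess.run(iterator.initializer)
        while True:
            try:
                print(sess.run(get_next))          # one batch of 2 rows per step
            except tf.errors.OutOfRangeError:      # raised when the data is exhausted
                break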


    Source Code

    import numpy as np  
    import tensorflow as tf  
    x_input = np.random.sample((1,2))  
    print(x_input)  
    # using a placeholder  
    x = tf.placeholder(tf.float32, shape=[1,2], name = 'X')  
    dataset = tf.data.Dataset.from_tensor_slices(x)  
    iterator = dataset.make_initializable_iterator()   
    get_next = iterator.get_next()  
    with tf.Session() as sess:  
        # feed the placeholder with data  
        sess.run(iterator.initializer, feed_dict={ x: x_input })   
        print(sess.run(get_next))


    The output of the above code will look similar to the following; the values are random and change on every run, and the second line is slightly less precise because the placeholder casts the data to float32:

    [[0.87908525 0.80727791]]
    [0.87908524 0.8072779 ]
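
    Note that the code in this lesson uses the TensorFlow 1.x API; tf.placeholder, tf.Session, and make_initializable_iterator are only available under tf.compat.v1 in TensorFlow 2.x. A rough TF 2.x equivalent is sketched below, assuming TF 2.x is installed; eager execution removes the need for placeholders and sessions, so the dataset is iterated directly:

    ## TensorFlow 2.x sketch of the same pipeline
    import numpy as np
    import tensorflow as tf

    x_input = np.random.sample((1, 2))
    dataset = tf.data.Dataset.from_tensor_slices(x_input)

    for element in dataset:          # each element is one row of x_input
        print(element.numpy())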