How to Load CSV data in TensorFlow | Python
Hey there, everyone! Today we will learn how to load data from a CSV file using TensorFlow in Python. For this tutorial, we are going to use TensorFlow 2.1. We will load a ‘.csv’ file that contains land areas and their corresponding prices.
So, let’s get started.
Python code for loading CSV data in TensorFlow
Let’s first import TensorFlow and check its version.
import tensorflow as tf
tf.__version__
OUTPUT:
'2.1.0'
Importing other required libraries.
import numpy as np
import pandas as pd
The contents of our ‘.csv’ file.
!head {'file.csv'}
OUTPUT:
area,prices
1000,316404.1095890411
1500,384297.9452054794
2300,492928.0821917808
3540,661304.794520548
4120,740061.6438356165
4560,799808.2191780822
5490,926090.7534246575
3460,650441.7808219178
4750,825607.8767123288
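To follow along without the original file, a file with the same rows can be recreated. The sketch below is only one possible way to do that, using Python's built-in csv module. It writes just the nine data rows shown above, and it assumes it is acceptable to create (or overwrite) file.csv in the current working directory.

```python
import csv

# The nine data rows shown in the head output above, kept as strings
# so the digits are written exactly as they appear there.
rows = [
    ("1000", "316404.1095890411"),
    ("1500", "384297.9452054794"),
    ("2300", "492928.0821917808"),
    ("3540", "661304.794520548"),
    ("4120", "740061.6438356165"),
    ("4560", "799808.2191780822"),
    ("5490", "926090.7534246575"),
    ("3460", "650441.7808219178"),
    ("4750", "825607.8767123288"),
]

# Assumption: writing to 'file.csv' in the current directory is fine.
with open("file.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["area", "prices"])  # header row
    writer.writerows(rows)

# Read the file back to confirm what was written.
with open("file.csv", newline="") as f:
    lines = f.read().splitlines()

print(lines[0])    # header line
print(len(lines))  # header plus nine data rows
```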
Now, let’s have a look at different ways to load CSV data.
Example 1:
Using Dataset.from_tensor_slices: this method works on dictionaries, so we can read the file into a pandas DataFrame, convert it to a dictionary of columns, and slice it into a dataset.
# dataframe
df = pd.read_csv('file.csv', index_col=None)
df.head(10)
df_slices = tf.data.Dataset.from_tensor_slices(dict(df))
for features in df_slices.take(10):
    for df_key, df_value in features.items():
        # .numpy() prints the plain value instead of the tf.Tensor wrapper
        print(f"{df_key} : {df_value.numpy()}")
OUTPUT:
area : 1000
prices : 316404.1095890411
area : 1500
prices : 384297.9452054794
area : 2300
prices : 492928.0821917808
area : 3540
prices : 661304.794520548
area : 4120
prices : 740061.6438356165
area : 4560
prices : 799808.2191780822
area : 5490
prices : 926090.7534246576
area : 3460
prices : 650441.7808219178
area : 4750
prices : 825607.8767123288
area : 2300
prices : 492928.0821917808
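For training, a model usually expects (features, label) pairs rather than one dictionary per row. The following is a minimal sketch of that idea, not part of the original walkthrough. It assumes we treat prices as the label, and it uses a small in-memory DataFrame holding the first three rows in place of file.csv so that it runs on its own.

```python
import pandas as pd
import tensorflow as tf

# Stand-in for pd.read_csv('file.csv'): the first three rows of the file.
df = pd.DataFrame({
    "area": [1000, 1500, 2300],
    "prices": [316404.1095890411, 384297.9452054794, 492928.0821917808],
})

# Assumption: 'prices' is the value we want to predict.
labels = df.pop("prices")

# Convert each remaining column to a NumPy array before slicing.
features = {name: column.to_numpy() for name, column in df.items()}

# Each element is now a (features_dict, label) pair; batch them in twos.
ds = tf.data.Dataset.from_tensor_slices((features, labels.to_numpy()))
ds = ds.batch(2)

batches = []
for batch_features, batch_labels in ds:
    batches.append(
        (batch_features["area"].numpy().tolist(), batch_labels.numpy().tolist())
    )

print(batches)
```

With three rows and a batch size of 2, the dataset yields one full batch and one smaller final batch.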
Example 2:
Another way of loading our CSV data is by using experimental.make_csv_dataset. This function is a high-level interface for reading sets of CSV files. It supports column type inference, as well as features such as batching and shuffling that make it simpler to use.
data = tf.data.experimental.make_csv_dataset('file.csv', batch_size=4, label_name="area")
for features, labels in data.take(1):
    # .numpy() prints the plain arrays instead of the tf.Tensor wrappers
    print("'area': {}".format(labels.numpy()))
    for data_key, data_value in features.items():
        print(f"{data_key} : {data_value.numpy()}")
OUTPUT:
'area': [3460 2300 2300 3540]
prices : [650441.75 492928.1 492928.1 661304.8 ]
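The rows in the batch above are not in file order because make_csv_dataset shuffles by default (shuffle=True). It also repeats the data indefinitely by default (num_epochs=None), which is why take(1) is used to stop the loop. The sketch below shows one way to get a single ordered pass instead. It is only an illustration: the temporary file, its three rows, and the choice of prices as the label are assumptions made so the example runs on its own.

```python
import os
import tempfile

import tensorflow as tf

# Write a small stand-in CSV (first three rows of the file) to a temp directory.
csv_text = (
    "area,prices\n"
    "1000,316404.1095890411\n"
    "1500,384297.9452054794\n"
    "2300,492928.0821917808\n"
)
csv_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(csv_path, "w") as f:
    f.write(csv_text)

data = tf.data.experimental.make_csv_dataset(
    csv_path,
    batch_size=2,
    label_name="prices",  # assumption: prices is the label
    num_epochs=1,         # one pass over the file instead of repeating forever
    shuffle=False,        # keep the rows in file order
)

batches = []
for features, labels in data:
    batches.append((features["area"].numpy().tolist(), labels.numpy().tolist()))

print(batches)
```

Because num_epochs=1 makes the dataset finite, the loop ends on its own and take() is no longer needed.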
Example 3:
There is also a lower-level class, experimental.CsvDataset, which provides finer-grained control. However, it does not support column type inference, so we have to specify the type of each column ourselves.
col_types = [tf.int32, tf.float32]
dataset = tf.data.experimental.CsvDataset('file.csv', col_types, header=True)
dataset
OUTPUT:
<CsvDatasetV2 shapes: ((), ()), types: (tf.int32, tf.float32)>
for x in dataset.take(10):
    print([y.numpy() for y in x])
OUTPUT:
[1000, 316404.12]
[1500, 384297.94]
[2300, 492928.1]
[3540, 661304.8]
[4120, 740061.6]
[4560, 799808.25]
[5490, 926090.75]
[3460, 650441.75]
[4750, 825607.9]
[2300, 492928.1]
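As one illustration of the finer-grained control CsvDataset offers, its select_cols argument lets us read only some of the columns. The sketch below reads just the prices column (column index 1). It is a minimal example under stated assumptions: the temporary file and its three rows are stand-ins so that it runs on its own, and the list of column types must have one entry per selected column.

```python
import os
import tempfile

import tensorflow as tf

# Write a small stand-in CSV (first three rows of the file) to a temp directory.
csv_text = (
    "area,prices\n"
    "1000,316404.1095890411\n"
    "1500,384297.9452054794\n"
    "2300,492928.0821917808\n"
)
csv_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(csv_path, "w") as f:
    f.write(csv_text)

# Only column 1 (prices) is parsed, so only one column type is given.
prices_only = tf.data.experimental.CsvDataset(
    csv_path,
    [tf.float32],
    header=True,      # skip the 'area,prices' header line
    select_cols=[1],  # zero-based index of the column to keep
)

# Each element is a one-item tuple holding a scalar tensor.
prices = [float(price.numpy()) for (price,) in prices_only]
print(prices)
```

The values are parsed as float32, so they are rounded compared with the text in the file, just as in the output above.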
So, these were the different ways of loading our CSV data using TensorFlow.