Issue
For example:
raw_train_ds = tf.keras.preprocessing.text_dataset_from_directory(
    '../ml-test-data/aclImdb/train',
    batch_size=batch_size,
    validation_split=0.2,
    subset='training',
    seed=seed)
train_ds = raw_train_ds.cache().prefetch(buffer_size=AUTOTUNE)
raw_train_ds is a BatchDataset and train_ds is a PrefetchDataset of 625 batches. How can I get a subset of either the BatchDataset or the PrefetchDataset, for example only the first 10 batches, or the first 320 examples?
Converting them into a list does not work for me, because later code expects the PrefetchDataset type.
Solution
You can take a subset with the take() method. It returns a new tf.data.Dataset rather than a list, so you can iterate over it like this:
ds_subset = raw_train_ds.take(10)  # first 10 batches, if the dataset is batched
for data_batch in ds_subset:
    pass  # do whatever you want with each batch
Or, if you want individual examples rather than batches:
ds_subset = raw_train_ds.unbatch().take(320)  # first 320 examples
for text, label in ds_subset:
    pass  # do whatever you want with each sample (text, label)
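Since the question mentions that later code expects a prefetched, batched dataset, here is a minimal sketch of how the subset could be re-batched and prefetched again. It uses a small in-memory toy dataset (an assumption, standing in for the IMDB directory) and illustrative names such as raw_ds, first_batches and first_examples that are not part of the original code. It also assumes a TensorFlow version that provides tf.data.AUTOTUNE.

```python
import tensorflow as tf

# Toy stand-in for the real data (assumption): 100 (text, label) pairs, batched by 32.
texts = tf.constant([f"review {i}" for i in range(100)])
labels = tf.constant([i % 2 for i in range(100)])
raw_ds = tf.data.Dataset.from_tensor_slices((texts, labels)).batch(32)

# First 2 batches. The result is still a tf.data.Dataset, not a list.
first_batches = raw_ds.take(2)

# First 40 examples, re-batched and prefetched so that downstream code
# receives the same kind of pipeline as before.
first_examples = (
    raw_ds.unbatch()
    .take(40)
    .batch(32)
    .cache()
    .prefetch(tf.data.AUTOTUNE)
)

num_batches = sum(1 for _ in first_batches)
batch_sizes = [int(text_batch.shape[0]) for text_batch, _ in first_examples]
is_dataset = isinstance(first_examples, tf.data.Dataset)

print(num_batches)   # number of batches in the take(2) subset
print(batch_sizes)   # sizes of the re-batched subset
print(is_dataset)
```

Because take(), unbatch(), batch(), cache() and prefetch() each return a tf.data.Dataset, the subset can be passed to code that iterates over the original train_ds without converting anything to a list.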
Answered By - Kaveh