This presentation summarizes the talks about TensorFlow.Data and TensorFlow.Hub from TensorFlow Dev Summit 2018. It was presented at TensorFlow Dev Summit Extended Seoul '18, held on April 14, 2018 in Seoul.
30. tf.enable_eager_execution()
# Also implements best practices for high performance!
# (See optional args for details.)
dataset = tf.contrib.data.make_batched_features_dataset(
    file_pattern, BATCH_SIZE, features, num_epochs=NUM_EPOCHS)
for batch in dataset:
    train_model(batch)
31. tf.enable_eager_execution()
# In a terminal, run the following commands, e.g.:
# $ pip install kaggle
# $ kaggle datasets download -d therohk/million-headlines -p .
dataset = tf.contrib.data.make_csv_dataset(
    "*.csv", BATCH_SIZE, num_epochs=NUM_EPOCHS)
for batch in dataset:
    train_model(batch["publish_date"], batch["headline_text"])
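As the loop above suggests, make_csv_dataset yields dict-shaped batches keyed by CSV column name. A minimal plain-Python sketch of that batch shape (the helper csv_batches and the sample CSV_TEXT are illustrative, not part of tf.data):

```python
import csv
import io

# Tiny stand-in for the headline CSV files on the slide.
CSV_TEXT = (
    "publish_date,headline_text\n"
    "20180414,tf dev summit\n"
    "20180415,extended seoul\n"
    "20180416,tf data talk\n"
)

def csv_batches(text, batch_size):
    """Yield dict-of-columns batches, mirroring make_csv_dataset's output shape."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for i in range(0, len(rows), batch_size):
        chunk = rows[i:i + batch_size]
        # Transpose list-of-row-dicts into one dict of column lists.
        yield {key: [row[key] for row in chunk] for key in chunk[0]}

batches = list(csv_batches(CSV_TEXT, batch_size=2))
```

Each batch here is a dict, so `batch["publish_date"]` and `batch["headline_text"]` index whole columns at once, just as in the eager loop above.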
34. # Wrap the dataset in an input function, and return it directly.
def input_fn():
    dataset = tf.contrib.data.make_csv_dataset(
        "*.csv", BATCH_SIZE, num_epochs=NUM_EPOCHS)
    return dataset
# Train an Estimator on the dataset.
tf.estimator.Estimator(model_fn=train_model).train(input_fn=input_fn)
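The key point of this slide is the contract: input_fn takes no arguments and builds a fresh dataset each time the trainer calls it. A minimal plain-Python analogue of that contract (make_input_fn and train are illustrative names, not the tf.estimator API):

```python
def make_input_fn(data, batch_size):
    # Returns a zero-argument input_fn, mirroring the Estimator contract:
    # the trainer calls it whenever it needs a fresh pass over the data.
    def input_fn():
        return [data[i:i + batch_size] for i in range(0, len(data), batch_size)]
    return input_fn

def train(state, input_fn):
    # Stand-in for Estimator.train: consume one epoch of batches.
    for batch in input_fn():
        state["steps"] += 1
    return state

state = train({"steps": 0}, make_input_fn(list(range(10)), batch_size=4))
```

Because the dataset is rebuilt inside input_fn rather than captured from outside, the trainer controls when and how often the input pipeline is constructed.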
42. saeta@saeta:~$ capture_tpu_profile --tpu_name=saeta --logdir=myprofile/ --duration_ms=10000
Welcome to the Cloud TPU Profiler v1.5.1
Starting to profile TPU traces for 10000 ms. Remaining attempt(s): 3
Limiting the number of trace events to 1000000
2018-03-21 01:13:12.350004: I tensorflow/contrib/tpu/profiler/dump_tpu_profile.cc:155] Converting trace events to TraceViewer JSON.
2018-03-21 01:13:12.392162: I tensorflow/contrib/tpu/profiler/dump_tpu_profile.cc:69] Dumped raw-proto trace data to profiles/5/plugins/profile/2018-03-21_01:13:12/trace
Trace contains 998114 events.
Dumped JSON trace data to myprofile/plugins/profile/2018-03-21_01:13:12/trace.json.gz
Dumped json op profile data to myprofile/plugins/profile/2018-03-21_01:13:12/op_profile.json
Dumped tool data for input_pipeline.json to myprofile/plugins/profile/2018-03-21_01:13:12/input_pipeline.json
Dumped tool data for overview_page.json to myprofile/plugins/profile/2018-03-21_01:13:12/overview_page.json
NOTE: using the trace duration 10000ms.
Set an appropriate duration (with --duration_ms) if you don't see a full step in your trace or the captured trace is too large.
saeta@saeta:~$ tensorboard --logdir=myprofile/
TensorBoard 1.6.0 at <redacted> (Press CTRL+C to quit)
51. def input_fn(batch_size):
    files = tf.data.Dataset.list_files(FLAGS.data_dir)
    dataset = tf.data.TFRecordDataset(files, num_parallel_reads=32)
    dataset = dataset.shuffle(10000)
    dataset = dataset.repeat(NUM_EPOCHS)
    dataset = dataset.map(parser_fn, num_parallel_calls=64)
    dataset = dataset.batch(batch_size)
    dataset = dataset.prefetch(2)
    return dataset
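Each transformation in the pipeline above has a simple streaming analogue. A plain-Python sketch of the shuffle/repeat/batch semantics (generator names are illustrative, not the tf.data API; map and prefetch are omitted for brevity):

```python
import itertools
import random

def shuffle(items, buffer_size, seed=0):
    # Bounded-buffer shuffle, like Dataset.shuffle(buffer_size):
    # keep up to buffer_size elements and emit one at random.
    rng = random.Random(seed)
    buf = []
    for item in items:
        buf.append(item)
        if len(buf) > buffer_size:
            yield buf.pop(rng.randrange(len(buf)))
    while buf:
        yield buf.pop(rng.randrange(len(buf)))

def repeat(make_items, num_epochs):
    # Like Dataset.repeat(num_epochs): re-create the source each epoch.
    for _ in range(num_epochs):
        yield from make_items()

def batch(items, batch_size):
    # Like Dataset.batch(batch_size): group consecutive elements.
    it = iter(items)
    while chunk := list(itertools.islice(it, batch_size)):
        yield chunk

records = lambda: range(10)  # stand-in for parsed TFRecord examples
batches = list(batch(shuffle(repeat(records, 2), buffer_size=5), 4))
```

Note that shuffle only mixes within its buffer, which is why the slide uses a large buffer (10000) relative to the batch size: a small buffer gives only local, approximate shuffling.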