INFO:gluonnlp:01:55:59 Namespace(accumulate=None, batch_size=12, bert_dataset='book_corpus_wiki_en_uncased', bert_model='bert_12_768_12', doc_stride=128, epochs=2, gpu=True, log_interval=50, lr=3e-05, max_answer_length=30, max_query_length=64, max_seq_length=384, model_parameters=None, n_best_size=20, null_score_diff_threshold=0.0, only_predict=False, optimizer='adam', output_dir='./output_dir', test_batch_size=24, uncased=True, version_2=False, warmup_ratio=0.1) INFO:gluonnlp:01:58:04 Loader Train data... INFO:gluonnlp:01:58:09 Number of records in Train data:87599 INFO:gluonnlp:02:00:01 The number of examples after preprocessing:88641 INFO:gluonnlp:02:00:01 Start Training INFO:gluonnlp:02:00:20 Epoch: 0, Batch: 49/7387, Loss=5.9267, lr=0.0000010 Time cost=18.8 Thoughput=31.84 samples/s INFO:gluonnlp:02:00:38 Epoch: 0, Batch: 99/7387, Loss=5.7081, lr=0.0000020 Time cost=18.4 Thoughput=32.63 samples/s INFO:gluonnlp:02:00:57 Epoch: 0, Batch: 149/7387, Loss=5.3687, lr=0.0000030 Time cost=18.6 Thoughput=32.29 samples/s INFO:gluonnlp:02:01:15 Epoch: 0, Batch: 199/7387, Loss=4.9597, lr=0.0000041 Time cost=18.6 Thoughput=32.33 samples/s INFO:gluonnlp:02:01:34 Epoch: 0, Batch: 249/7387, Loss=4.5552, lr=0.0000051 Time cost=18.5 Thoughput=32.51 samples/s INFO:gluonnlp:02:01:53 Epoch: 0, Batch: 299/7387, Loss=4.2620, lr=0.0000061 Time cost=18.6 Thoughput=32.24 samples/s INFO:gluonnlp:02:02:11 Epoch: 0, Batch: 349/7387, Loss=3.9116, lr=0.0000071 Time cost=18.5 Thoughput=32.41 samples/s INFO:gluonnlp:02:02:30 Epoch: 0, Batch: 399/7387, Loss=3.4229, lr=0.0000081 Time cost=18.6 Thoughput=32.21 samples/s INFO:gluonnlp:02:02:48 Epoch: 0, Batch: 449/7387, Loss=3.2318, lr=0.0000091 Time cost=18.7 Thoughput=32.10 samples/s INFO:gluonnlp:02:03:07 Epoch: 0, Batch: 499/7387, Loss=2.9334, lr=0.0000102 Time cost=18.7 Thoughput=32.14 samples/s INFO:gluonnlp:02:03:25 Epoch: 0, Batch: 549/7387, Loss=2.8114, lr=0.0000112 Time cost=18.4 Thoughput=32.54 samples/s INFO:gluonnlp:02:03:44 Epoch: 
0, Batch: 599/7387, Loss=2.7050, lr=0.0000122 Time cost=18.5 Thoughput=32.48 samples/s INFO:gluonnlp:02:04:02 Epoch: 0, Batch: 649/7387, Loss=2.4761, lr=0.0000132 Time cost=18.5 Thoughput=32.43 samples/s INFO:gluonnlp:02:04:21 Epoch: 0, Batch: 699/7387, Loss=2.3212, lr=0.0000142 Time cost=18.5 Thoughput=32.36 samples/s INFO:gluonnlp:02:04:39 Epoch: 0, Batch: 749/7387, Loss=2.2082, lr=0.0000152 Time cost=18.4 Thoughput=32.59 samples/s INFO:gluonnlp:02:04:58 Epoch: 0, Batch: 799/7387, Loss=2.1345, lr=0.0000162 Time cost=18.3 Thoughput=32.70 samples/s INFO:gluonnlp:02:05:16 Epoch: 0, Batch: 849/7387, Loss=2.1810, lr=0.0000173 Time cost=18.3 Thoughput=32.75 samples/s INFO:gluonnlp:02:05:35 Epoch: 0, Batch: 899/7387, Loss=2.0395, lr=0.0000183 Time cost=18.4 Thoughput=32.52 samples/s INFO:gluonnlp:02:05:53 Epoch: 0, Batch: 949/7387, Loss=1.9430, lr=0.0000193 Time cost=18.4 Thoughput=32.66 samples/s INFO:gluonnlp:02:06:11 Epoch: 0, Batch: 999/7387, Loss=1.8072, lr=0.0000203 Time cost=18.3 Thoughput=32.74 samples/s INFO:gluonnlp:02:06:30 Epoch: 0, Batch: 1049/7387, Loss=1.9380, lr=0.0000213 Time cost=18.4 Thoughput=32.56 samples/s INFO:gluonnlp:02:06:48 Epoch: 0, Batch: 1099/7387, Loss=1.7500, lr=0.0000223 Time cost=18.5 Thoughput=32.45 samples/s INFO:gluonnlp:02:07:07 Epoch: 0, Batch: 1149/7387, Loss=1.7590, lr=0.0000234 Time cost=18.5 Thoughput=32.35 samples/s INFO:gluonnlp:02:07:25 Epoch: 0, Batch: 1199/7387, Loss=1.8002, lr=0.0000244 Time cost=18.5 Thoughput=32.42 samples/s INFO:gluonnlp:02:07:44 Epoch: 0, Batch: 1249/7387, Loss=1.7206, lr=0.0000254 Time cost=18.6 Thoughput=32.33 samples/s INFO:gluonnlp:02:08:02 Epoch: 0, Batch: 1299/7387, Loss=1.5665, lr=0.0000264 Time cost=18.5 Thoughput=32.42 samples/s INFO:gluonnlp:02:08:21 Epoch: 0, Batch: 1349/7387, Loss=1.6741, lr=0.0000274 Time cost=18.4 Thoughput=32.54 samples/s INFO:gluonnlp:02:08:39 Epoch: 0, Batch: 1399/7387, Loss=1.6386, lr=0.0000284 Time cost=18.4 Thoughput=32.61 samples/s INFO:gluonnlp:02:08:58 Epoch: 0, 
Batch: 1449/7387, Loss=1.5948, lr=0.0000295 Time cost=18.5 Thoughput=32.48 samples/s INFO:gluonnlp:02:09:16 Epoch: 0, Batch: 1499/7387, Loss=1.5577, lr=0.0000299 Time cost=18.6 Thoughput=32.28 samples/s INFO:gluonnlp:02:09:35 Epoch: 0, Batch: 1549/7387, Loss=1.6180, lr=0.0000298 Time cost=18.4 Thoughput=32.69 samples/s INFO:gluonnlp:02:09:53 Epoch: 0, Batch: 1599/7387, Loss=1.4979, lr=0.0000297 Time cost=18.3 Thoughput=32.79 samples/s INFO:gluonnlp:02:10:11 Epoch: 0, Batch: 1649/7387, Loss=1.5158, lr=0.0000296 Time cost=18.5 Thoughput=32.51 samples/s INFO:gluonnlp:02:10:30 Epoch: 0, Batch: 1699/7387, Loss=1.4924, lr=0.0000295 Time cost=18.4 Thoughput=32.58 samples/s INFO:gluonnlp:02:10:48 Epoch: 0, Batch: 1749/7387, Loss=1.3948, lr=0.0000294 Time cost=18.6 Thoughput=32.31 samples/s INFO:gluonnlp:02:11:07 Epoch: 0, Batch: 1799/7387, Loss=1.4371, lr=0.0000293 Time cost=18.3 Thoughput=32.73 samples/s INFO:gluonnlp:02:11:25 Epoch: 0, Batch: 1849/7387, Loss=1.4158, lr=0.0000292 Time cost=18.5 Thoughput=32.45 samples/s INFO:gluonnlp:02:11:44 Epoch: 0, Batch: 1899/7387, Loss=1.4211, lr=0.0000290 Time cost=18.5 Thoughput=32.43 samples/s INFO:gluonnlp:02:12:02 Epoch: 0, Batch: 1949/7387, Loss=1.5020, lr=0.0000289 Time cost=18.4 Thoughput=32.59 samples/s INFO:gluonnlp:02:12:20 Epoch: 0, Batch: 1999/7387, Loss=1.3847, lr=0.0000288 Time cost=18.4 Thoughput=32.62 samples/s INFO:gluonnlp:02:12:39 Epoch: 0, Batch: 2049/7387, Loss=1.3961, lr=0.0000287 Time cost=18.5 Thoughput=32.50 samples/s INFO:gluonnlp:02:12:57 Epoch: 0, Batch: 2099/7387, Loss=1.4810, lr=0.0000286 Time cost=18.4 Thoughput=32.54 samples/s INFO:gluonnlp:02:13:16 Epoch: 0, Batch: 2149/7387, Loss=1.4467, lr=0.0000285 Time cost=18.7 Thoughput=32.13 samples/s INFO:gluonnlp:02:13:35 Epoch: 0, Batch: 2199/7387, Loss=1.4014, lr=0.0000284 Time cost=18.5 Thoughput=32.41 samples/s INFO:gluonnlp:02:13:53 Epoch: 0, Batch: 2249/7387, Loss=1.3014, lr=0.0000283 Time cost=18.6 Thoughput=32.28 samples/s INFO:gluonnlp:02:14:12 
Epoch: 0, Batch: 2299/7387, Loss=1.3815, lr=0.0000281 Time cost=18.5 Thoughput=32.37 samples/s INFO:gluonnlp:02:14:30 Epoch: 0, Batch: 2349/7387, Loss=1.3831, lr=0.0000280 Time cost=18.6 Thoughput=32.20 samples/s INFO:gluonnlp:02:14:49 Epoch: 0, Batch: 2399/7387, Loss=1.4332, lr=0.0000279 Time cost=18.6 Thoughput=32.25 samples/s INFO:gluonnlp:02:15:07 Epoch: 0, Batch: 2449/7387, Loss=1.3940, lr=0.0000278 Time cost=18.6 Thoughput=32.22 samples/s INFO:gluonnlp:02:15:26 Epoch: 0, Batch: 2499/7387, Loss=1.2575, lr=0.0000277 Time cost=18.5 Thoughput=32.51 samples/s INFO:gluonnlp:02:15:44 Epoch: 0, Batch: 2549/7387, Loss=1.3892, lr=0.0000276 Time cost=18.5 Thoughput=32.48 samples/s INFO:gluonnlp:02:16:03 Epoch: 0, Batch: 2599/7387, Loss=1.2649, lr=0.0000275 Time cost=18.5 Thoughput=32.39 samples/s INFO:gluonnlp:02:16:21 Epoch: 0, Batch: 2649/7387, Loss=1.2419, lr=0.0000274 Time cost=18.5 Thoughput=32.45 samples/s INFO:gluonnlp:02:16:41 Epoch: 0, Batch: 2699/7387, Loss=1.3383, lr=0.0000272 Time cost=19.2 Thoughput=31.20 samples/s INFO:gluonnlp:02:16:59 Epoch: 0, Batch: 2749/7387, Loss=1.3171, lr=0.0000271 Time cost=18.8 Thoughput=31.88 samples/s INFO:gluonnlp:02:17:18 Epoch: 0, Batch: 2799/7387, Loss=1.2957, lr=0.0000270 Time cost=18.7 Thoughput=32.13 samples/s INFO:gluonnlp:02:17:37 Epoch: 0, Batch: 2849/7387, Loss=1.2774, lr=0.0000269 Time cost=18.5 Thoughput=32.46 samples/s INFO:gluonnlp:02:17:55 Epoch: 0, Batch: 2899/7387, Loss=1.3789, lr=0.0000268 Time cost=18.5 Thoughput=32.41 samples/s INFO:gluonnlp:02:18:14 Epoch: 0, Batch: 2949/7387, Loss=1.2600, lr=0.0000267 Time cost=18.5 Thoughput=32.50 samples/s INFO:gluonnlp:02:18:32 Epoch: 0, Batch: 2999/7387, Loss=1.2830, lr=0.0000266 Time cost=18.4 Thoughput=32.60 samples/s INFO:gluonnlp:02:18:51 Epoch: 0, Batch: 3049/7387, Loss=1.2399, lr=0.0000265 Time cost=18.5 Thoughput=32.42 samples/s INFO:gluonnlp:02:19:09 Epoch: 0, Batch: 3099/7387, Loss=1.3109, lr=0.0000263 Time cost=18.4 Thoughput=32.52 samples/s 
INFO:gluonnlp:02:19:28 Epoch: 0, Batch: 3149/7387, Loss=1.2812, lr=0.0000262 Time cost=18.6 Thoughput=32.26 samples/s INFO:gluonnlp:02:19:46 Epoch: 0, Batch: 3199/7387, Loss=1.2078, lr=0.0000261 Time cost=18.5 Thoughput=32.44 samples/s INFO:gluonnlp:02:20:05 Epoch: 0, Batch: 3249/7387, Loss=1.2879, lr=0.0000260 Time cost=18.6 Thoughput=32.29 samples/s INFO:gluonnlp:02:20:23 Epoch: 0, Batch: 3299/7387, Loss=1.2161, lr=0.0000259 Time cost=18.5 Thoughput=32.41 samples/s INFO:gluonnlp:02:20:42 Epoch: 0, Batch: 3349/7387, Loss=1.1780, lr=0.0000258 Time cost=18.4 Thoughput=32.61 samples/s INFO:gluonnlp:02:21:00 Epoch: 0, Batch: 3399/7387, Loss=1.2423, lr=0.0000257 Time cost=18.4 Thoughput=32.54 samples/s INFO:gluonnlp:02:21:18 Epoch: 0, Batch: 3449/7387, Loss=1.2576, lr=0.0000255 Time cost=18.5 Thoughput=32.50 samples/s INFO:gluonnlp:02:21:37 Epoch: 0, Batch: 3499/7387, Loss=1.2332, lr=0.0000254 Time cost=18.5 Thoughput=32.42 samples/s INFO:gluonnlp:02:21:56 Epoch: 0, Batch: 3549/7387, Loss=1.1840, lr=0.0000253 Time cost=18.6 Thoughput=32.30 samples/s INFO:gluonnlp:02:22:14 Epoch: 0, Batch: 3599/7387, Loss=1.2255, lr=0.0000252 Time cost=18.5 Thoughput=32.45 samples/s INFO:gluonnlp:02:22:32 Epoch: 0, Batch: 3649/7387, Loss=1.3250, lr=0.0000251 Time cost=18.4 Thoughput=32.63 samples/s INFO:gluonnlp:02:22:51 Epoch: 0, Batch: 3699/7387, Loss=1.2106, lr=0.0000250 Time cost=18.4 Thoughput=32.64 samples/s INFO:gluonnlp:02:23:09 Epoch: 0, Batch: 3749/7387, Loss=1.2569, lr=0.0000249 Time cost=18.6 Thoughput=32.24 samples/s INFO:gluonnlp:02:23:28 Epoch: 0, Batch: 3799/7387, Loss=1.1660, lr=0.0000248 Time cost=18.6 Thoughput=32.31 samples/s INFO:gluonnlp:02:23:47 Epoch: 0, Batch: 3849/7387, Loss=1.2804, lr=0.0000246 Time cost=18.6 Thoughput=32.29 samples/s INFO:gluonnlp:02:24:05 Epoch: 0, Batch: 3899/7387, Loss=1.2254, lr=0.0000245 Time cost=18.5 Thoughput=32.35 samples/s INFO:gluonnlp:02:24:24 Epoch: 0, Batch: 3949/7387, Loss=1.3485, lr=0.0000244 Time cost=18.7 Thoughput=32.17 
samples/s INFO:gluonnlp:02:24:42 Epoch: 0, Batch: 3999/7387, Loss=1.1967, lr=0.0000243 Time cost=18.7 Thoughput=32.10 samples/s INFO:gluonnlp:02:25:01 Epoch: 0, Batch: 4049/7387, Loss=1.1829, lr=0.0000242 Time cost=18.6 Thoughput=32.27 samples/s INFO:gluonnlp:02:25:20 Epoch: 0, Batch: 4099/7387, Loss=1.2182, lr=0.0000241 Time cost=18.5 Thoughput=32.43 samples/s INFO:gluonnlp:02:25:38 Epoch: 0, Batch: 4149/7387, Loss=1.1955, lr=0.0000240 Time cost=18.5 Thoughput=32.40 samples/s INFO:gluonnlp:02:25:57 Epoch: 0, Batch: 4199/7387, Loss=1.1923, lr=0.0000239 Time cost=18.5 Thoughput=32.50 samples/s INFO:gluonnlp:02:26:15 Epoch: 0, Batch: 4249/7387, Loss=1.3155, lr=0.0000237 Time cost=18.3 Thoughput=32.72 samples/s INFO:gluonnlp:02:26:33 Epoch: 0, Batch: 4299/7387, Loss=1.2399, lr=0.0000236 Time cost=18.4 Thoughput=32.59 samples/s INFO:gluonnlp:02:26:52 Epoch: 0, Batch: 4349/7387, Loss=1.3427, lr=0.0000235 Time cost=18.5 Thoughput=32.44 samples/s INFO:gluonnlp:02:27:10 Epoch: 0, Batch: 4399/7387, Loss=1.1924, lr=0.0000234 Time cost=18.5 Thoughput=32.50 samples/s INFO:gluonnlp:02:27:29 Epoch: 0, Batch: 4449/7387, Loss=1.2093, lr=0.0000233 Time cost=18.4 Thoughput=32.56 samples/s INFO:gluonnlp:02:27:47 Epoch: 0, Batch: 4499/7387, Loss=1.0482, lr=0.0000232 Time cost=18.5 Thoughput=32.40 samples/s INFO:gluonnlp:02:28:06 Epoch: 0, Batch: 4549/7387, Loss=1.1968, lr=0.0000231 Time cost=18.5 Thoughput=32.44 samples/s INFO:gluonnlp:02:28:24 Epoch: 0, Batch: 4599/7387, Loss=1.1472, lr=0.0000230 Time cost=18.4 Thoughput=32.66 samples/s INFO:gluonnlp:02:28:42 Epoch: 0, Batch: 4649/7387, Loss=1.2098, lr=0.0000228 Time cost=18.3 Thoughput=32.74 samples/s INFO:gluonnlp:02:29:01 Epoch: 0, Batch: 4699/7387, Loss=1.0668, lr=0.0000227 Time cost=18.3 Thoughput=32.74 samples/s INFO:gluonnlp:02:29:19 Epoch: 0, Batch: 4749/7387, Loss=1.1808, lr=0.0000226 Time cost=18.6 Thoughput=32.22 samples/s INFO:gluonnlp:02:29:38 Epoch: 0, Batch: 4799/7387, Loss=1.1441, lr=0.0000225 Time cost=18.5 
Thoughput=32.35 samples/s INFO:gluonnlp:02:29:56 Epoch: 0, Batch: 4849/7387, Loss=1.2670, lr=0.0000224 Time cost=18.5 Thoughput=32.45 samples/s INFO:gluonnlp:02:30:15 Epoch: 0, Batch: 4899/7387, Loss=1.1665, lr=0.0000223 Time cost=18.6 Thoughput=32.31 samples/s INFO:gluonnlp:02:30:33 Epoch: 0, Batch: 4949/7387, Loss=1.1898, lr=0.0000222 Time cost=18.5 Thoughput=32.44 samples/s INFO:gluonnlp:02:30:52 Epoch: 0, Batch: 4999/7387, Loss=1.1833, lr=0.0000221 Time cost=18.4 Thoughput=32.59 samples/s INFO:gluonnlp:02:31:10 Epoch: 0, Batch: 5049/7387, Loss=1.1157, lr=0.0000219 Time cost=18.3 Thoughput=32.75 samples/s INFO:gluonnlp:02:31:29 Epoch: 0, Batch: 5099/7387, Loss=1.1533, lr=0.0000218 Time cost=18.5 Thoughput=32.43 samples/s INFO:gluonnlp:02:31:47 Epoch: 0, Batch: 5149/7387, Loss=1.1311, lr=0.0000217 Time cost=18.3 Thoughput=32.70 samples/s INFO:gluonnlp:02:32:05 Epoch: 0, Batch: 5199/7387, Loss=1.1378, lr=0.0000216 Time cost=18.4 Thoughput=32.65 samples/s INFO:gluonnlp:02:32:24 Epoch: 0, Batch: 5249/7387, Loss=1.1070, lr=0.0000215 Time cost=18.5 Thoughput=32.48 samples/s INFO:gluonnlp:02:32:42 Epoch: 0, Batch: 5299/7387, Loss=1.1372, lr=0.0000214 Time cost=18.4 Thoughput=32.52 samples/s INFO:gluonnlp:02:33:01 Epoch: 0, Batch: 5349/7387, Loss=1.1307, lr=0.0000213 Time cost=18.4 Thoughput=32.54 samples/s INFO:gluonnlp:02:33:19 Epoch: 0, Batch: 5399/7387, Loss=1.0868, lr=0.0000211 Time cost=18.4 Thoughput=32.54 samples/s INFO:gluonnlp:02:33:38 Epoch: 0, Batch: 5449/7387, Loss=1.1191, lr=0.0000210 Time cost=18.5 Thoughput=32.35 samples/s INFO:gluonnlp:02:33:56 Epoch: 0, Batch: 5499/7387, Loss=1.1121, lr=0.0000209 Time cost=18.3 Thoughput=32.75 samples/s INFO:gluonnlp:02:34:14 Epoch: 0, Batch: 5549/7387, Loss=1.1454, lr=0.0000208 Time cost=18.4 Thoughput=32.69 samples/s INFO:gluonnlp:02:34:33 Epoch: 0, Batch: 5599/7387, Loss=1.2427, lr=0.0000207 Time cost=18.3 Thoughput=32.72 samples/s INFO:gluonnlp:02:34:51 Epoch: 0, Batch: 5649/7387, Loss=1.0484, lr=0.0000206 Time 
cost=18.3 Thoughput=32.72 samples/s INFO:gluonnlp:02:35:09 Epoch: 0, Batch: 5699/7387, Loss=1.1345, lr=0.0000205 Time cost=18.3 Thoughput=32.71 samples/s INFO:gluonnlp:02:35:28 Epoch: 0, Batch: 5749/7387, Loss=1.0112, lr=0.0000204 Time cost=18.4 Thoughput=32.69 samples/s INFO:gluonnlp:02:35:46 Epoch: 0, Batch: 5799/7387, Loss=1.0097, lr=0.0000202 Time cost=18.5 Thoughput=32.48 samples/s INFO:gluonnlp:02:36:05 Epoch: 0, Batch: 5849/7387, Loss=1.1942, lr=0.0000201 Time cost=18.5 Thoughput=32.35 samples/s INFO:gluonnlp:02:36:23 Epoch: 0, Batch: 5899/7387, Loss=1.1880, lr=0.0000200 Time cost=18.6 Thoughput=32.21 samples/s INFO:gluonnlp:02:36:42 Epoch: 0, Batch: 5949/7387, Loss=1.0441, lr=0.0000199 Time cost=18.5 Thoughput=32.45 samples/s INFO:gluonnlp:02:37:00 Epoch: 0, Batch: 5999/7387, Loss=1.0812, lr=0.0000198 Time cost=18.3 Thoughput=32.72 samples/s INFO:gluonnlp:02:37:19 Epoch: 0, Batch: 6049/7387, Loss=1.0289, lr=0.0000197 Time cost=18.3 Thoughput=32.73 samples/s INFO:gluonnlp:02:37:37 Epoch: 0, Batch: 6099/7387, Loss=1.0441, lr=0.0000196 Time cost=18.7 Thoughput=32.16 samples/s INFO:gluonnlp:02:37:56 Epoch: 0, Batch: 6149/7387, Loss=1.0431, lr=0.0000195 Time cost=18.4 Thoughput=32.64 samples/s INFO:gluonnlp:02:38:14 Epoch: 0, Batch: 6199/7387, Loss=1.1018, lr=0.0000193 Time cost=18.6 Thoughput=32.23 samples/s INFO:gluonnlp:02:38:33 Epoch: 0, Batch: 6249/7387, Loss=1.0309, lr=0.0000192 Time cost=18.7 Thoughput=32.15 samples/s INFO:gluonnlp:02:38:52 Epoch: 0, Batch: 6299/7387, Loss=1.0654, lr=0.0000191 Time cost=18.6 Thoughput=32.19 samples/s INFO:gluonnlp:02:39:10 Epoch: 0, Batch: 6349/7387, Loss=1.0709, lr=0.0000190 Time cost=18.6 Thoughput=32.32 samples/s INFO:gluonnlp:02:39:29 Epoch: 0, Batch: 6399/7387, Loss=1.0877, lr=0.0000189 Time cost=18.6 Thoughput=32.28 samples/s INFO:gluonnlp:02:39:47 Epoch: 0, Batch: 6449/7387, Loss=1.0749, lr=0.0000188 Time cost=18.4 Thoughput=32.61 samples/s INFO:gluonnlp:02:40:06 Epoch: 0, Batch: 6499/7387, Loss=1.0494, 
lr=0.0000187 Time cost=18.6 Thoughput=32.34 samples/s INFO:gluonnlp:02:40:24 Epoch: 0, Batch: 6549/7387, Loss=1.1032, lr=0.0000186 Time cost=18.3 Thoughput=32.71 samples/s INFO:gluonnlp:02:40:42 Epoch: 0, Batch: 6599/7387, Loss=1.1148, lr=0.0000184 Time cost=18.4 Thoughput=32.63 samples/s INFO:gluonnlp:02:41:01 Epoch: 0, Batch: 6649/7387, Loss=1.1324, lr=0.0000183 Time cost=18.3 Thoughput=32.72 samples/s INFO:gluonnlp:02:41:19 Epoch: 0, Batch: 6699/7387, Loss=1.1002, lr=0.0000182 Time cost=18.6 Thoughput=32.26 samples/s INFO:gluonnlp:02:41:38 Epoch: 0, Batch: 6749/7387, Loss=1.0455, lr=0.0000181 Time cost=18.7 Thoughput=32.15 samples/s INFO:gluonnlp:02:41:56 Epoch: 0, Batch: 6799/7387, Loss=1.2576, lr=0.0000180 Time cost=18.4 Thoughput=32.61 samples/s INFO:gluonnlp:02:42:15 Epoch: 0, Batch: 6849/7387, Loss=1.1040, lr=0.0000179 Time cost=18.6 Thoughput=32.27 samples/s INFO:gluonnlp:02:42:33 Epoch: 0, Batch: 6899/7387, Loss=1.0021, lr=0.0000178 Time cost=18.4 Thoughput=32.53 samples/s INFO:gluonnlp:02:42:52 Epoch: 0, Batch: 6949/7387, Loss=1.0743, lr=0.0000177 Time cost=18.4 Thoughput=32.56 samples/s INFO:gluonnlp:02:43:10 Epoch: 0, Batch: 6999/7387, Loss=1.1161, lr=0.0000175 Time cost=18.4 Thoughput=32.61 samples/s INFO:gluonnlp:02:43:29 Epoch: 0, Batch: 7049/7387, Loss=1.1027, lr=0.0000174 Time cost=18.4 Thoughput=32.63 samples/s INFO:gluonnlp:02:43:47 Epoch: 0, Batch: 7099/7387, Loss=1.0024, lr=0.0000173 Time cost=18.5 Thoughput=32.49 samples/s INFO:gluonnlp:02:44:06 Epoch: 0, Batch: 7149/7387, Loss=0.9139, lr=0.0000172 Time cost=18.5 Thoughput=32.48 samples/s INFO:gluonnlp:02:44:24 Epoch: 0, Batch: 7199/7387, Loss=0.9686, lr=0.0000171 Time cost=18.5 Thoughput=32.39 samples/s INFO:gluonnlp:02:44:42 Epoch: 0, Batch: 7249/7387, Loss=1.0661, lr=0.0000170 Time cost=18.4 Thoughput=32.67 samples/s INFO:gluonnlp:02:45:01 Epoch: 0, Batch: 7299/7387, Loss=1.0903, lr=0.0000169 Time cost=18.4 Thoughput=32.67 samples/s INFO:gluonnlp:02:45:19 Epoch: 0, Batch: 7349/7387, 
Loss=1.0820, lr=0.0000167 Time cost=18.5 Thoughput=32.40 samples/s INFO:gluonnlp:02:45:33 Time cost=2731.88 s, Thoughput=32.45 samples/s INFO:gluonnlp:02:45:51 Epoch: 1, Batch: 49/7387, Loss=0.7666, lr=0.0000166 Time cost=18.4 Thoughput=56.48 samples/s INFO:gluonnlp:02:46:10 Epoch: 1, Batch: 99/7387, Loss=0.8699, lr=0.0000164 Time cost=18.4 Thoughput=32.54 samples/s INFO:gluonnlp:02:46:28 Epoch: 1, Batch: 149/7387, Loss=0.8291, lr=0.0000163 Time cost=18.4 Thoughput=32.56 samples/s INFO:gluonnlp:02:46:47 Epoch: 1, Batch: 199/7387, Loss=0.8319, lr=0.0000162 Time cost=18.5 Thoughput=32.51 samples/s INFO:gluonnlp:02:47:05 Epoch: 1, Batch: 249/7387, Loss=0.7373, lr=0.0000161 Time cost=18.4 Thoughput=32.53 samples/s INFO:gluonnlp:02:47:24 Epoch: 1, Batch: 299/7387, Loss=0.8154, lr=0.0000160 Time cost=18.6 Thoughput=32.29 samples/s INFO:gluonnlp:02:47:42 Epoch: 1, Batch: 349/7387, Loss=0.7781, lr=0.0000159 Time cost=18.5 Thoughput=32.37 samples/s INFO:gluonnlp:02:48:01 Epoch: 1, Batch: 399/7387, Loss=0.8521, lr=0.0000158 Time cost=18.7 Thoughput=32.11 samples/s INFO:gluonnlp:02:48:19 Epoch: 1, Batch: 449/7387, Loss=0.7923, lr=0.0000156 Time cost=18.4 Thoughput=32.55 samples/s INFO:gluonnlp:02:48:38 Epoch: 1, Batch: 499/7387, Loss=0.8271, lr=0.0000155 Time cost=18.4 Thoughput=32.53 samples/s INFO:gluonnlp:02:48:56 Epoch: 1, Batch: 549/7387, Loss=0.9286, lr=0.0000154 Time cost=18.5 Thoughput=32.39 samples/s INFO:gluonnlp:02:49:15 Epoch: 1, Batch: 599/7387, Loss=0.7953, lr=0.0000153 Time cost=18.7 Thoughput=32.16 samples/s INFO:gluonnlp:02:49:34 Epoch: 1, Batch: 649/7387, Loss=0.8132, lr=0.0000152 Time cost=18.5 Thoughput=32.50 samples/s INFO:gluonnlp:02:49:52 Epoch: 1, Batch: 699/7387, Loss=0.8052, lr=0.0000151 Time cost=18.5 Thoughput=32.41 samples/s INFO:gluonnlp:02:50:11 Epoch: 1, Batch: 749/7387, Loss=0.7799, lr=0.0000150 Time cost=18.6 Thoughput=32.27 samples/s INFO:gluonnlp:02:50:29 Epoch: 1, Batch: 799/7387, Loss=0.8509, lr=0.0000149 Time cost=18.7 Thoughput=32.14 
samples/s INFO:gluonnlp:02:50:48 Epoch: 1, Batch: 849/7387, Loss=0.8172, lr=0.0000147 Time cost=18.5 Thoughput=32.48 samples/s INFO:gluonnlp:02:51:06 Epoch: 1, Batch: 899/7387, Loss=0.8462, lr=0.0000146 Time cost=18.4 Thoughput=32.62 samples/s INFO:gluonnlp:02:51:25 Epoch: 1, Batch: 949/7387, Loss=0.7648, lr=0.0000145 Time cost=18.5 Thoughput=32.36 samples/s INFO:gluonnlp:02:51:43 Epoch: 1, Batch: 999/7387, Loss=0.8557, lr=0.0000144 Time cost=18.6 Thoughput=32.33 samples/s INFO:gluonnlp:02:52:02 Epoch: 1, Batch: 1049/7387, Loss=0.7668, lr=0.0000143 Time cost=18.4 Thoughput=32.59 samples/s INFO:gluonnlp:02:52:20 Epoch: 1, Batch: 1099/7387, Loss=0.8001, lr=0.0000142 Time cost=18.3 Thoughput=32.73 samples/s INFO:gluonnlp:02:52:38 Epoch: 1, Batch: 1149/7387, Loss=0.7947, lr=0.0000141 Time cost=18.3 Thoughput=32.73 samples/s INFO:gluonnlp:02:52:57 Epoch: 1, Batch: 1199/7387, Loss=0.7580, lr=0.0000140 Time cost=18.4 Thoughput=32.65 samples/s INFO:gluonnlp:02:53:15 Epoch: 1, Batch: 1249/7387, Loss=0.7580, lr=0.0000138 Time cost=18.4 Thoughput=32.58 samples/s INFO:gluonnlp:02:53:34 Epoch: 1, Batch: 1299/7387, Loss=0.7663, lr=0.0000137 Time cost=18.7 Thoughput=32.13 samples/s INFO:gluonnlp:02:53:52 Epoch: 1, Batch: 1349/7387, Loss=0.8296, lr=0.0000136 Time cost=18.4 Thoughput=32.55 samples/s INFO:gluonnlp:02:54:11 Epoch: 1, Batch: 1399/7387, Loss=0.7939, lr=0.0000135 Time cost=18.4 Thoughput=32.61 samples/s INFO:gluonnlp:02:54:29 Epoch: 1, Batch: 1449/7387, Loss=0.8980, lr=0.0000134 Time cost=18.4 Thoughput=32.56 samples/s INFO:gluonnlp:02:54:47 Epoch: 1, Batch: 1499/7387, Loss=0.9092, lr=0.0000133 Time cost=18.4 Thoughput=32.69 samples/s INFO:gluonnlp:02:55:06 Epoch: 1, Batch: 1549/7387, Loss=0.8219, lr=0.0000132 Time cost=19.0 Thoughput=31.63 samples/s INFO:gluonnlp:02:55:25 Epoch: 1, Batch: 1599/7387, Loss=0.8043, lr=0.0000131 Time cost=18.7 Thoughput=32.09 samples/s INFO:gluonnlp:02:55:44 Epoch: 1, Batch: 1649/7387, Loss=0.7480, lr=0.0000129 Time cost=18.6 
Thoughput=32.30 samples/s INFO:gluonnlp:02:56:02 Epoch: 1, Batch: 1699/7387, Loss=0.7592, lr=0.0000128 Time cost=18.3 Thoughput=32.72 samples/s INFO:gluonnlp:02:56:20 Epoch: 1, Batch: 1749/7387, Loss=0.9395, lr=0.0000127 Time cost=18.3 Thoughput=32.78 samples/s INFO:gluonnlp:02:56:39 Epoch: 1, Batch: 1799/7387, Loss=0.7673, lr=0.0000126 Time cost=18.3 Thoughput=32.70 samples/s INFO:gluonnlp:02:56:57 Epoch: 1, Batch: 1849/7387, Loss=0.8829, lr=0.0000125 Time cost=18.4 Thoughput=32.57 samples/s INFO:gluonnlp:02:57:16 Epoch: 1, Batch: 1899/7387, Loss=0.8374, lr=0.0000124 Time cost=18.6 Thoughput=32.18 samples/s INFO:gluonnlp:02:57:34 Epoch: 1, Batch: 1949/7387, Loss=0.8901, lr=0.0000123 Time cost=18.4 Thoughput=32.64 samples/s INFO:gluonnlp:02:57:52 Epoch: 1, Batch: 1999/7387, Loss=0.7383, lr=0.0000122 Time cost=18.4 Thoughput=32.68 samples/s INFO:gluonnlp:02:58:11 Epoch: 1, Batch: 2049/7387, Loss=0.7925, lr=0.0000120 Time cost=18.4 Thoughput=32.68 samples/s INFO:gluonnlp:02:58:29 Epoch: 1, Batch: 2099/7387, Loss=0.7891, lr=0.0000119 Time cost=18.3 Thoughput=32.74 samples/s INFO:gluonnlp:02:58:48 Epoch: 1, Batch: 2149/7387, Loss=0.8226, lr=0.0000118 Time cost=18.4 Thoughput=32.68 samples/s INFO:gluonnlp:02:59:06 Epoch: 1, Batch: 2199/7387, Loss=0.7963, lr=0.0000117 Time cost=18.3 Thoughput=32.75 samples/s INFO:gluonnlp:02:59:24 Epoch: 1, Batch: 2249/7387, Loss=0.7571, lr=0.0000116 Time cost=18.4 Thoughput=32.67 samples/s INFO:gluonnlp:02:59:43 Epoch: 1, Batch: 2299/7387, Loss=0.7849, lr=0.0000115 Time cost=18.4 Thoughput=32.60 samples/s INFO:gluonnlp:03:00:01 Epoch: 1, Batch: 2349/7387, Loss=0.7798, lr=0.0000114 Time cost=18.3 Thoughput=32.70 samples/s INFO:gluonnlp:03:00:19 Epoch: 1, Batch: 2399/7387, Loss=0.7741, lr=0.0000113 Time cost=18.4 Thoughput=32.55 samples/s INFO:gluonnlp:03:00:38 Epoch: 1, Batch: 2449/7387, Loss=0.8524, lr=0.0000111 Time cost=18.6 Thoughput=32.22 samples/s INFO:gluonnlp:03:00:56 Epoch: 1, Batch: 2499/7387, Loss=0.8651, lr=0.0000110 Time 
cost=18.4 Thoughput=32.63 samples/s INFO:gluonnlp:03:01:15 Epoch: 1, Batch: 2549/7387, Loss=0.8263, lr=0.0000109 Time cost=18.6 Thoughput=32.33 samples/s INFO:gluonnlp:03:01:34 Epoch: 1, Batch: 2599/7387, Loss=0.8359, lr=0.0000108 Time cost=18.8 Thoughput=32.00 samples/s INFO:gluonnlp:03:01:52 Epoch: 1, Batch: 2649/7387, Loss=0.8721, lr=0.0000107 Time cost=18.8 Thoughput=31.95 samples/s INFO:gluonnlp:03:02:11 Epoch: 1, Batch: 2699/7387, Loss=0.8180, lr=0.0000106 Time cost=18.6 Thoughput=32.21 samples/s INFO:gluonnlp:03:02:29 Epoch: 1, Batch: 2749/7387, Loss=0.8130, lr=0.0000105 Time cost=18.3 Thoughput=32.74 samples/s INFO:gluonnlp:03:02:48 Epoch: 1, Batch: 2799/7387, Loss=0.8010, lr=0.0000103 Time cost=18.3 Thoughput=32.76 samples/s INFO:gluonnlp:03:03:06 Epoch: 1, Batch: 2849/7387, Loss=0.7338, lr=0.0000102 Time cost=18.5 Thoughput=32.48 samples/s INFO:gluonnlp:03:03:25 Epoch: 1, Batch: 2899/7387, Loss=0.8094, lr=0.0000101 Time cost=18.4 Thoughput=32.61 samples/s INFO:gluonnlp:03:03:43 Epoch: 1, Batch: 2949/7387, Loss=0.8930, lr=0.0000100 Time cost=18.4 Thoughput=32.67 samples/s INFO:gluonnlp:03:04:01 Epoch: 1, Batch: 2999/7387, Loss=0.7581, lr=0.0000099 Time cost=18.4 Thoughput=32.59 samples/s INFO:gluonnlp:03:04:20 Epoch: 1, Batch: 3049/7387, Loss=0.8129, lr=0.0000098 Time cost=18.4 Thoughput=32.52 samples/s INFO:gluonnlp:03:04:38 Epoch: 1, Batch: 3099/7387, Loss=0.8741, lr=0.0000097 Time cost=18.6 Thoughput=32.18 samples/s INFO:gluonnlp:03:04:57 Epoch: 1, Batch: 3149/7387, Loss=0.8252, lr=0.0000096 Time cost=18.6 Thoughput=32.32 samples/s INFO:gluonnlp:03:05:16 Epoch: 1, Batch: 3199/7387, Loss=0.7757, lr=0.0000094 Time cost=18.6 Thoughput=32.27 samples/s INFO:gluonnlp:03:05:34 Epoch: 1, Batch: 3249/7387, Loss=0.7472, lr=0.0000093 Time cost=18.6 Thoughput=32.27 samples/s INFO:gluonnlp:03:05:53 Epoch: 1, Batch: 3299/7387, Loss=0.8070, lr=0.0000092 Time cost=18.5 Thoughput=32.40 samples/s INFO:gluonnlp:03:06:11 Epoch: 1, Batch: 3349/7387, Loss=0.8725, 
lr=0.0000091 Time cost=18.6 Thoughput=32.23 samples/s INFO:gluonnlp:03:06:30 Epoch: 1, Batch: 3399/7387, Loss=0.8053, lr=0.0000090 Time cost=18.7 Thoughput=32.14 samples/s INFO:gluonnlp:03:06:49 Epoch: 1, Batch: 3449/7387, Loss=0.8932, lr=0.0000089 Time cost=18.5 Thoughput=32.41 samples/s INFO:gluonnlp:03:07:07 Epoch: 1, Batch: 3499/7387, Loss=0.8389, lr=0.0000088 Time cost=18.7 Thoughput=32.09 samples/s INFO:gluonnlp:03:07:26 Epoch: 1, Batch: 3549/7387, Loss=0.8583, lr=0.0000087 Time cost=18.6 Thoughput=32.20 samples/s INFO:gluonnlp:03:07:44 Epoch: 1, Batch: 3599/7387, Loss=0.7959, lr=0.0000085 Time cost=18.5 Thoughput=32.49 samples/s INFO:gluonnlp:03:08:03 Epoch: 1, Batch: 3649/7387, Loss=0.8609, lr=0.0000084 Time cost=18.4 Thoughput=32.66 samples/s INFO:gluonnlp:03:08:21 Epoch: 1, Batch: 3699/7387, Loss=0.7918, lr=0.0000083 Time cost=18.5 Thoughput=32.38 samples/s INFO:gluonnlp:03:08:40 Epoch: 1, Batch: 3749/7387, Loss=0.8026, lr=0.0000082 Time cost=18.4 Thoughput=32.69 samples/s INFO:gluonnlp:03:08:58 Epoch: 1, Batch: 3799/7387, Loss=0.8322, lr=0.0000081 Time cost=18.6 Thoughput=32.32 samples/s INFO:gluonnlp:03:09:17 Epoch: 1, Batch: 3849/7387, Loss=0.7709, lr=0.0000080 Time cost=18.6 Thoughput=32.29 samples/s INFO:gluonnlp:03:09:35 Epoch: 1, Batch: 3899/7387, Loss=0.7491, lr=0.0000079 Time cost=18.4 Thoughput=32.59 samples/s INFO:gluonnlp:03:09:54 Epoch: 1, Batch: 3949/7387, Loss=0.7895, lr=0.0000078 Time cost=18.4 Thoughput=32.63 samples/s INFO:gluonnlp:03:10:12 Epoch: 1, Batch: 3999/7387, Loss=0.7182, lr=0.0000076 Time cost=18.7 Thoughput=32.12 samples/s INFO:gluonnlp:03:10:31 Epoch: 1, Batch: 4049/7387, Loss=0.7126, lr=0.0000075 Time cost=18.6 Thoughput=32.24 samples/s INFO:gluonnlp:03:10:49 Epoch: 1, Batch: 4099/7387, Loss=0.8015, lr=0.0000074 Time cost=18.5 Thoughput=32.50 samples/s INFO:gluonnlp:03:11:08 Epoch: 1, Batch: 4149/7387, Loss=0.7470, lr=0.0000073 Time cost=18.4 Thoughput=32.65 samples/s INFO:gluonnlp:03:11:26 Epoch: 1, Batch: 4199/7387, 
Loss=0.8290, lr=0.0000072 Time cost=18.5 Thoughput=32.37 samples/s INFO:gluonnlp:03:11:45 Epoch: 1, Batch: 4249/7387, Loss=0.7753, lr=0.0000071 Time cost=18.8 Thoughput=31.85 samples/s INFO:gluonnlp:03:12:04 Epoch: 1, Batch: 4299/7387, Loss=0.8274, lr=0.0000070 Time cost=18.8 Thoughput=31.90 samples/s INFO:gluonnlp:03:12:23 Epoch: 1, Batch: 4349/7387, Loss=0.8349, lr=0.0000069 Time cost=18.8 Thoughput=31.89 samples/s INFO:gluonnlp:03:12:41 Epoch: 1, Batch: 4399/7387, Loss=0.7817, lr=0.0000067 Time cost=18.6 Thoughput=32.29 samples/s INFO:gluonnlp:03:13:00 Epoch: 1, Batch: 4449/7387, Loss=0.7510, lr=0.0000066 Time cost=18.5 Thoughput=32.50 samples/s INFO:gluonnlp:03:13:18 Epoch: 1, Batch: 4499/7387, Loss=0.8059, lr=0.0000065 Time cost=18.4 Thoughput=32.57 samples/s INFO:gluonnlp:03:13:37 Epoch: 1, Batch: 4549/7387, Loss=0.7245, lr=0.0000064 Time cost=18.4 Thoughput=32.54 samples/s INFO:gluonnlp:03:13:55 Epoch: 1, Batch: 4599/7387, Loss=0.7389, lr=0.0000063 Time cost=18.4 Thoughput=32.69 samples/s INFO:gluonnlp:03:14:13 Epoch: 1, Batch: 4649/7387, Loss=0.8121, lr=0.0000062 Time cost=18.4 Thoughput=32.62 samples/s INFO:gluonnlp:03:14:32 Epoch: 1, Batch: 4699/7387, Loss=0.8031, lr=0.0000061 Time cost=18.4 Thoughput=32.63 samples/s INFO:gluonnlp:03:14:50 Epoch: 1, Batch: 4749/7387, Loss=0.7670, lr=0.0000059 Time cost=18.4 Thoughput=32.53 samples/s INFO:gluonnlp:03:15:09 Epoch: 1, Batch: 4799/7387, Loss=0.7730, lr=0.0000058 Time cost=18.7 Thoughput=32.14 samples/s INFO:gluonnlp:03:15:27 Epoch: 1, Batch: 4849/7387, Loss=0.8202, lr=0.0000057 Time cost=18.5 Thoughput=32.39 samples/s INFO:gluonnlp:03:15:46 Epoch: 1, Batch: 4899/7387, Loss=0.7554, lr=0.0000056 Time cost=18.5 Thoughput=32.38 samples/s INFO:gluonnlp:03:16:04 Epoch: 1, Batch: 4949/7387, Loss=0.7982, lr=0.0000055 Time cost=18.4 Thoughput=32.60 samples/s INFO:gluonnlp:03:16:23 Epoch: 1, Batch: 4999/7387, Loss=0.7445, lr=0.0000054 Time cost=18.4 Thoughput=32.69 samples/s INFO:gluonnlp:03:16:41 Epoch: 1, Batch: 
5049/7387, Loss=0.9091, lr=0.0000053 Time cost=18.3 Thoughput=32.73 samples/s INFO:gluonnlp:03:16:59 Epoch: 1, Batch: 5099/7387, Loss=0.7319, lr=0.0000052 Time cost=18.4 Thoughput=32.69 samples/s INFO:gluonnlp:03:17:18 Epoch: 1, Batch: 5149/7387, Loss=0.7024, lr=0.0000050 Time cost=18.4 Thoughput=32.57 samples/s INFO:gluonnlp:03:17:36 Epoch: 1, Batch: 5199/7387, Loss=0.6475, lr=0.0000049 Time cost=18.4 Thoughput=32.56 samples/s INFO:gluonnlp:03:17:55 Epoch: 1, Batch: 5249/7387, Loss=0.7990, lr=0.0000048 Time cost=18.6 Thoughput=32.34 samples/s INFO:gluonnlp:03:18:13 Epoch: 1, Batch: 5299/7387, Loss=0.7473, lr=0.0000047 Time cost=18.6 Thoughput=32.29 samples/s INFO:gluonnlp:03:18:32 Epoch: 1, Batch: 5349/7387, Loss=0.6899, lr=0.0000046 Time cost=18.4 Thoughput=32.66 samples/s INFO:gluonnlp:03:18:50 Epoch: 1, Batch: 5399/7387, Loss=0.6940, lr=0.0000045 Time cost=18.5 Thoughput=32.50 samples/s INFO:gluonnlp:03:19:09 Epoch: 1, Batch: 5449/7387, Loss=0.9089, lr=0.0000044 Time cost=18.7 Thoughput=32.16 samples/s INFO:gluonnlp:03:19:27 Epoch: 1, Batch: 5499/7387, Loss=0.7963, lr=0.0000043 Time cost=18.4 Thoughput=32.64 samples/s INFO:gluonnlp:03:19:46 Epoch: 1, Batch: 5549/7387, Loss=0.7090, lr=0.0000041 Time cost=18.4 Thoughput=32.62 samples/s INFO:gluonnlp:03:20:04 Epoch: 1, Batch: 5599/7387, Loss=0.8100, lr=0.0000040 Time cost=18.3 Thoughput=32.78 samples/s INFO:gluonnlp:03:20:22 Epoch: 1, Batch: 5649/7387, Loss=0.6813, lr=0.0000039 Time cost=18.3 Thoughput=32.71 samples/s INFO:gluonnlp:03:20:41 Epoch: 1, Batch: 5699/7387, Loss=0.8174, lr=0.0000038 Time cost=18.4 Thoughput=32.69 samples/s INFO:gluonnlp:03:20:59 Epoch: 1, Batch: 5749/7387, Loss=0.7121, lr=0.0000037 Time cost=18.5 Thoughput=32.51 samples/s INFO:gluonnlp:03:21:17 Epoch: 1, Batch: 5799/7387, Loss=0.7549, lr=0.0000036 Time cost=18.4 Thoughput=32.65 samples/s INFO:gluonnlp:03:21:36 Epoch: 1, Batch: 5849/7387, Loss=0.7068, lr=0.0000035 Time cost=18.5 Thoughput=32.42 samples/s INFO:gluonnlp:03:21:54 Epoch: 1, 
Batch: 5899/7387, Loss=0.7714, lr=0.0000034 Time cost=18.4 Thoughput=32.60 samples/s INFO:gluonnlp:03:22:13 Epoch: 1, Batch: 5949/7387, Loss=0.7138, lr=0.0000032 Time cost=18.4 Thoughput=32.59 samples/s INFO:gluonnlp:03:22:31 Epoch: 1, Batch: 5999/7387, Loss=0.7848, lr=0.0000031 Time cost=18.3 Thoughput=32.74 samples/s INFO:gluonnlp:03:22:49 Epoch: 1, Batch: 6049/7387, Loss=0.7772, lr=0.0000030 Time cost=18.3 Thoughput=32.73 samples/s INFO:gluonnlp:03:23:08 Epoch: 1, Batch: 6099/7387, Loss=0.8273, lr=0.0000029 Time cost=18.3 Thoughput=32.72 samples/s INFO:gluonnlp:03:23:26 Epoch: 1, Batch: 6149/7387, Loss=0.7723, lr=0.0000028 Time cost=18.4 Thoughput=32.54 samples/s INFO:gluonnlp:03:23:45 Epoch: 1, Batch: 6199/7387, Loss=0.8291, lr=0.0000027 Time cost=18.5 Thoughput=32.44 samples/s INFO:gluonnlp:03:24:03 Epoch: 1, Batch: 6249/7387, Loss=0.7536, lr=0.0000026 Time cost=18.4 Thoughput=32.66 samples/s INFO:gluonnlp:03:24:21 Epoch: 1, Batch: 6299/7387, Loss=0.7496, lr=0.0000025 Time cost=18.3 Thoughput=32.80 samples/s INFO:gluonnlp:03:24:40 Epoch: 1, Batch: 6349/7387, Loss=0.7672, lr=0.0000023 Time cost=18.4 Thoughput=32.69 samples/s INFO:gluonnlp:03:24:58 Epoch: 1, Batch: 6399/7387, Loss=0.7233, lr=0.0000022 Time cost=18.3 Thoughput=32.79 samples/s INFO:gluonnlp:03:25:16 Epoch: 1, Batch: 6449/7387, Loss=0.7626, lr=0.0000021 Time cost=18.4 Thoughput=32.64 samples/s INFO:gluonnlp:03:25:35 Epoch: 1, Batch: 6499/7387, Loss=0.7407, lr=0.0000020 Time cost=18.5 Thoughput=32.48 samples/s INFO:gluonnlp:03:25:53 Epoch: 1, Batch: 6549/7387, Loss=0.7720, lr=0.0000019 Time cost=18.5 Thoughput=32.49 samples/s INFO:gluonnlp:03:26:12 Epoch: 1, Batch: 6599/7387, Loss=0.7060, lr=0.0000018 Time cost=18.4 Thoughput=32.56 samples/s INFO:gluonnlp:03:26:30 Epoch: 1, Batch: 6649/7387, Loss=0.8115, lr=0.0000017 Time cost=18.3 Thoughput=32.70 samples/s INFO:gluonnlp:03:26:49 Epoch: 1, Batch: 6699/7387, Loss=0.8684, lr=0.0000015 Time cost=18.5 Thoughput=32.51 samples/s INFO:gluonnlp:03:27:07 
Epoch: 1, Batch: 6749/7387, Loss=0.8071, lr=0.0000014 Time cost=18.6 Thoughput=32.26 samples/s INFO:gluonnlp:03:27:26 Epoch: 1, Batch: 6799/7387, Loss=0.7380, lr=0.0000013 Time cost=18.4 Thoughput=32.57 samples/s INFO:gluonnlp:03:27:44 Epoch: 1, Batch: 6849/7387, Loss=0.8137, lr=0.0000012 Time cost=18.5 Thoughput=32.36 samples/s INFO:gluonnlp:03:28:03 Epoch: 1, Batch: 6899/7387, Loss=0.7353, lr=0.0000011 Time cost=18.4 Thoughput=32.59 samples/s INFO:gluonnlp:03:28:21 Epoch: 1, Batch: 6949/7387, Loss=0.7733, lr=0.0000010 Time cost=18.6 Thoughput=32.33 samples/s INFO:gluonnlp:03:28:40 Epoch: 1, Batch: 6999/7387, Loss=0.7408, lr=0.0000009 Time cost=18.5 Thoughput=32.46 samples/s INFO:gluonnlp:03:28:58 Epoch: 1, Batch: 7049/7387, Loss=0.8095, lr=0.0000008 Time cost=18.4 Thoughput=32.61 samples/s INFO:gluonnlp:03:29:16 Epoch: 1, Batch: 7099/7387, Loss=0.7353, lr=0.0000006 Time cost=18.4 Thoughput=32.58 samples/s INFO:gluonnlp:03:29:35 Epoch: 1, Batch: 7149/7387, Loss=0.7550, lr=0.0000005 Time cost=18.4 Thoughput=32.65 samples/s INFO:gluonnlp:03:29:53 Epoch: 1, Batch: 7199/7387, Loss=0.7870, lr=0.0000004 Time cost=18.4 Thoughput=32.61 samples/s INFO:gluonnlp:03:30:12 Epoch: 1, Batch: 7249/7387, Loss=0.7283, lr=0.0000003 Time cost=18.4 Thoughput=32.55 samples/s INFO:gluonnlp:03:30:30 Epoch: 1, Batch: 7299/7387, Loss=0.7837, lr=0.0000002 Time cost=18.4 Thoughput=32.66 samples/s INFO:gluonnlp:03:30:48 Epoch: 1, Batch: 7349/7387, Loss=0.7475, lr=0.0000001 Time cost=18.4 Thoughput=32.61 samples/s INFO:gluonnlp:03:31:02 Time cost=5460.75 s, Thoughput=32.46 samples/s INFO:gluonnlp:03:31:03 Loader dev data... INFO:gluonnlp:03:31:04 Number of records in Train data:10570 INFO:gluonnlp:03:31:17 The number of examples after preprocessing:10833 INFO:gluonnlp:03:31:17 Start predict INFO:gluonnlp:03:32:12 Time cost=54.59 s, Thoughput=198.45 samples/s INFO:gluonnlp:03:32:12 Get prediction results... INFO:gluonnlp:03:33:56 {'exact_match': 80.98391674550615, 'f1': 88.52929328904605}
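
A minimal sketch for pulling the per-interval numbers out of a log like the one above, assuming the entry layout stays exactly as shown (including the "Thoughput" spelling the script emits). The names `STEP_RE`, `parse_steps`, and `mean_loss_by_epoch` are hypothetical helpers introduced here for illustration; they are not part of GluonNLP. Because the text may be wrapped in the middle of an entry, whitespace is collapsed before matching.

```python
import re
from collections import defaultdict

# Matches the per-interval training entries, e.g.
# "Epoch: 0, Batch: 49/7387, Loss=5.9267, lr=0.0000010 Time cost=18.8 Thoughput=31.84 samples/s"
# "Thoughput" is spelled as it appears in the log.
STEP_RE = re.compile(
    r"Epoch: (?P<epoch>\d+), Batch: (?P<batch>\d+)/(?P<total>\d+), "
    r"Loss=(?P<loss>[\d.]+), lr=(?P<lr>[\d.]+) "
    r"Time cost=(?P<time>[\d.]+) Thoughput=(?P<tput>[\d.]+) samples/s"
)


def parse_steps(text):
    """Return one dict per training entry found in the log text."""
    # Collapse newlines and repeated spaces so entries split across
    # wrapped lines are matched as a single run of text.
    flat = " ".join(text.split())
    steps = []
    for m in STEP_RE.finditer(flat):
        steps.append({
            "epoch": int(m.group("epoch")),
            "batch": int(m.group("batch")),
            "total_batches": int(m.group("total")),
            "loss": float(m.group("loss")),
            "lr": float(m.group("lr")),
            "time_s": float(m.group("time")),
            "samples_per_s": float(m.group("tput")),
        })
    return steps


def mean_loss_by_epoch(steps):
    """Average the logged interval losses for each epoch."""
    buckets = defaultdict(list)
    for s in steps:
        buckets[s["epoch"]].append(s["loss"])
    return {epoch: sum(v) / len(v) for epoch, v in buckets.items()}


# Two entries copied from the log, with a line wrap inside the second one.
SAMPLE = (
    "INFO:gluonnlp:02:00:20 Epoch: 0, Batch: 49/7387, Loss=5.9267, "
    "lr=0.0000010 Time cost=18.8 Thoughput=31.84 samples/s "
    "INFO:gluonnlp:02:00:38 Epoch:\n0, Batch: 99/7387, Loss=5.7081, "
    "lr=0.0000020 Time cost=18.4 Thoughput=32.63 samples/s"
)

if __name__ == "__main__":
    parsed = parse_steps(SAMPLE)
    print(len(parsed), mean_loss_by_epoch(parsed))
```

Run over the full log, `parse_steps` yields one record per logged interval, which is enough to plot the loss curve or the learning-rate schedule. The epoch summary entries ("Time cost=... s, Thoughput=...") and the final evaluation dictionary use a different layout and are deliberately not matched by this pattern.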