Training: 2021-03-14 21:39:25,688-rank_id: 0
Training: 2021-03-14 21:39:34,934-softmax weight init successfully!
Training: 2021-03-14 21:39:34,934-softmax weight mom init successfully!
Training: 2021-03-14 21:39:34,937-Total Step is: 124550
Training: 2021-03-14 21:40:14,842-Reducer buckets have been rebuilt in this iteration.
Training: 2021-03-14 21:40:39,151-Speed 4475.51 samples/sec Loss 55.8521 Epoch: 0 Global Step: 100 Fp16 Grad Scale: 256 Required: 9 hours
Training: 2021-03-14 21:40:50,925-Speed 4348.80 samples/sec Loss 53.2779 Epoch: 0 Global Step: 150 Fp16 Grad Scale: 256 Required: 9 hours
Training: 2021-03-14 21:41:02,445-Speed 4444.92 samples/sec Loss 51.1987 Epoch: 0 Global Step: 200 Fp16 Grad Scale: 512 Required: 9 hours
Training: 2021-03-14 21:41:13,980-Speed 4438.56 samples/sec Loss 49.5734 Epoch: 0 Global Step: 250 Fp16 Grad Scale: 512 Required: 9 hours
Training: 2021-03-14 21:41:25,498-Speed 4445.78 samples/sec Loss 48.2149 Epoch: 0 Global Step: 300 Fp16 Grad Scale: 1024 Required: 8 hours
Training: 2021-03-14 21:41:37,137-Speed 4399.07 samples/sec Loss 46.8667 Epoch: 0 Global Step: 350 Fp16 Grad Scale: 1024 Required: 8 hours
Training: 2021-03-14 21:41:50,126-Speed 3941.82 samples/sec Loss 45.9123 Epoch: 0 Global Step: 400 Fp16 Grad Scale: 2048 Required: 8 hours
Training: 2021-03-14 21:42:01,628-Speed 4451.90 samples/sec Loss 45.1643 Epoch: 0 Global Step: 450 Fp16 Grad Scale: 2048 Required: 8 hours
Training: 2021-03-14 21:42:13,125-Speed 4453.46 samples/sec Loss 44.5562 Epoch: 0 Global Step: 500 Fp16 Grad Scale: 4096 Required: 8 hours
Training: 2021-03-14 21:42:24,420-Speed 4533.17 samples/sec Loss 44.0767 Epoch: 0 Global Step: 550 Fp16 Grad Scale: 4096 Required: 8 hours
Training: 2021-03-14 21:42:35,736-Speed 4524.75 samples/sec Loss 43.5758 Epoch: 0 Global Step: 600 Fp16 Grad Scale: 8192 Required: 8 hours
Training: 2021-03-14 21:42:47,082-Speed 4512.97 samples/sec Loss 43.1387 Epoch: 0 Global Step: 650 Fp16 Grad Scale: 8192 Required: 8 hours
Training: 2021-03-14 21:42:58,630-Speed 4433.58 samples/sec Loss 42.7071 Epoch: 0 Global Step: 700 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:43:09,935-Speed 4529.18 samples/sec Loss 42.1376 Epoch: 0 Global Step: 750 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:43:21,322-Speed 4496.64 samples/sec Loss 41.6190 Epoch: 0 Global Step: 800 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:43:32,755-Speed 4478.70 samples/sec Loss 41.0818 Epoch: 0 Global Step: 850 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:43:44,128-Speed 4501.87 samples/sec Loss 40.4158 Epoch: 0 Global Step: 900 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:43:55,377-Speed 4551.71 samples/sec Loss 39.8163 Epoch: 0 Global Step: 950 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:44:06,783-Speed 4489.05 samples/sec Loss 39.1616 Epoch: 0 Global Step: 1000 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:44:17,945-Speed 4587.20 samples/sec Loss 38.4532 Epoch: 0 Global Step: 1050 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:44:29,471-Speed 4442.27 samples/sec Loss 37.6957 Epoch: 0 Global Step: 1100 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:44:40,675-Speed 4570.31 samples/sec Loss 36.9186 Epoch: 0 Global Step: 1150 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:44:52,108-Speed 4478.50 samples/sec Loss 36.1454 Epoch: 0 Global Step: 1200 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:45:03,384-Speed 4540.80 samples/sec Loss 35.0905 Epoch: 0 Global Step: 1250 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:45:15,253-Speed 4313.85 samples/sec Loss 34.3267 Epoch: 0 Global Step: 1300 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:45:26,762-Speed 4449.09 samples/sec Loss 33.4499 Epoch: 0 Global Step: 1350 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:45:39,038-Speed 4170.94 samples/sec Loss 32.5158 Epoch: 0 Global Step: 1400 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:45:50,302-Speed 4545.63 samples/sec Loss 31.5561 Epoch: 0 Global Step: 1450 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:46:01,639-Speed 4516.21 samples/sec Loss 30.6976 Epoch: 0 Global Step: 1500 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:46:12,988-Speed 4511.71 samples/sec Loss 29.7484 Epoch: 0 Global Step: 1550 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:46:24,146-Speed 4588.72 samples/sec Loss 28.8447 Epoch: 0 Global Step: 1600 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:46:35,522-Speed 4501.07 samples/sec Loss 28.0661 Epoch: 0 Global Step: 1650 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:46:46,818-Speed 4532.81 samples/sec Loss 27.3511 Epoch: 0 Global Step: 1700 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:46:58,279-Speed 4467.40 samples/sec Loss 26.6512 Epoch: 0 Global Step: 1750 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:47:09,502-Speed 4562.18 samples/sec Loss 25.9673 Epoch: 0 Global Step: 1800 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:47:20,954-Speed 4470.98 samples/sec Loss 25.2759 Epoch: 0 Global Step: 1850 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:47:32,181-Speed 4560.96 samples/sec Loss 24.7682 Epoch: 0 Global Step: 1900 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:47:43,750-Speed 4425.63 samples/sec Loss 24.1834 Epoch: 0 Global Step: 1950 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:47:54,985-Speed 4557.37 samples/sec Loss 23.6697 Epoch: 0 Global Step: 2000 Fp16 Grad Scale: 16384 Required: 8 hours
Training: 2021-03-14 21:48:28,128-[lfw][2000]XNorm: 21.772935
Training: 2021-03-14 21:48:28,128-[lfw][2000]Accuracy-Flip: 0.97933+-0.00720
Training: 2021-03-14 21:48:28,128-[lfw][2000]Accuracy-Highest: 0.97933
Training: 2021-03-14 21:49:04,888-[cfp_fp][2000]XNorm: 18.421573
Training: 2021-03-14 21:49:04,888-[cfp_fp][2000]Accuracy-Flip: 0.86143+-0.01553
Training: 2021-03-14 21:49:04,888-[cfp_fp][2000]Accuracy-Highest: 0.86143
Training: 2021-03-14 21:49:37,403-[agedb_30][2000]XNorm: 21.133333
Training: 2021-03-14 21:49:37,403-[agedb_30][2000]Accuracy-Flip: 0.87367+-0.02516
Training: 2021-03-14 21:49:37,403-[agedb_30][2000]Accuracy-Highest: 0.87367
Training: 2021-03-14 21:49:48,801-Speed 449.85 samples/sec Loss 23.2106 Epoch: 0 Global Step: 2050 Fp16 Grad Scale: 16384 Required: 10 hours
Training: 2021-03-14 21:50:00,070-Speed 4543.81 samples/sec Loss 22.6493 Epoch: 0 Global Step: 2100 Fp16 Grad Scale: 16384 Required: 10 hours
Training: 2021-03-14 21:50:11,388-Speed 4524.01 samples/sec Loss 22.4492 Epoch: 0 Global Step: 2150 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:50:22,706-Speed 4524.11 samples/sec Loss 21.9853 Epoch: 0 Global Step: 2200 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:50:34,056-Speed 4511.01 samples/sec Loss 21.6679 Epoch: 0 Global Step: 2250 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:50:45,202-Speed 4593.92 samples/sec Loss 21.3549 Epoch: 0 Global Step: 2300 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:50:56,547-Speed 4513.34 samples/sec Loss 21.0168 Epoch: 0 Global Step: 2350 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:51:07,771-Speed 4561.57 samples/sec Loss 20.7479 Epoch: 0 Global Step: 2400 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:51:19,146-Speed 4501.51 samples/sec Loss 20.4745 Epoch: 0 Global Step: 2450 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:51:30,692-Speed 4434.53 samples/sec Loss 20.3408 Epoch: 0 Global Step: 2500 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:51:42,187-Speed 4454.18 samples/sec Loss 20.1140 Epoch: 0 Global Step: 2550 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:51:53,730-Speed 4435.87 samples/sec Loss 19.7399 Epoch: 0 Global Step: 2600 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:52:05,454-Speed 4367.21 samples/sec Loss 19.5484 Epoch: 0 Global Step: 2650 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:52:17,184-Speed 4365.09 samples/sec Loss 19.4670 Epoch: 0 Global Step: 2700 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:52:28,663-Speed 4460.66 samples/sec Loss 19.1967 Epoch: 0 Global Step: 2750 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:52:40,733-Speed 4242.05 samples/sec Loss 19.1854 Epoch: 0 Global Step: 2800 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:52:52,292-Speed 4429.70 samples/sec Loss 19.0601 Epoch: 0 Global Step: 2850 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:53:03,644-Speed 4510.22 samples/sec Loss 18.8012 Epoch: 0 Global Step: 2900 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:53:14,828-Speed 4578.29 samples/sec Loss 18.6420 Epoch: 0 Global Step: 2950 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:53:26,111-Speed 4537.98 samples/sec Loss 18.5917 Epoch: 0 Global Step: 3000 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:53:37,445-Speed 4517.63 samples/sec Loss 18.5324 Epoch: 0 Global Step: 3050 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:53:48,816-Speed 4502.77 samples/sec Loss 18.3959 Epoch: 0 Global Step: 3100 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:54:00,210-Speed 4493.73 samples/sec Loss 18.1309 Epoch: 0 Global Step: 3150 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:54:11,577-Speed 4504.47 samples/sec Loss 18.1493 Epoch: 0 Global Step: 3200 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:54:22,929-Speed 4510.58 samples/sec Loss 18.0127 Epoch: 0 Global Step: 3250 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:54:34,480-Speed 4432.51 samples/sec Loss 17.8993 Epoch: 0 Global Step: 3300 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:54:45,668-Speed 4576.46 samples/sec Loss 17.8649 Epoch: 0 Global Step: 3350 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:54:57,096-Speed 4480.68 samples/sec Loss 17.6893 Epoch: 0 Global Step: 3400 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:55:08,412-Speed 4524.70 samples/sec Loss 17.6489 Epoch: 0 Global Step: 3450 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:55:19,908-Speed 4454.04 samples/sec Loss 17.5809 Epoch: 0 Global Step: 3500 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:55:31,247-Speed 4515.42 samples/sec Loss 17.4990 Epoch: 0 Global Step: 3550 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:55:42,660-Speed 4486.12 samples/sec Loss 17.4120 Epoch: 0 Global Step: 3600 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:55:54,022-Speed 4506.60 samples/sec Loss 17.3049 Epoch: 0 Global Step: 3650 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:56:05,694-Speed 4386.87 samples/sec Loss 17.2304 Epoch: 0 Global Step: 3700 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:56:16,944-Speed 4550.99 samples/sec Loss 17.2827 Epoch: 0 Global Step: 3750 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:56:28,362-Speed 4484.46 samples/sec Loss 17.0764 Epoch: 0 Global Step: 3800 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:56:39,685-Speed 4521.99 samples/sec Loss 17.0244 Epoch: 0 Global Step: 3850 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:56:51,047-Speed 4506.56 samples/sec Loss 16.9304 Epoch: 0 Global Step: 3900 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:57:02,340-Speed 4534.14 samples/sec Loss 16.8477 Epoch: 0 Global Step: 3950 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:57:13,811-Speed 4463.42 samples/sec Loss 16.7940 Epoch: 0 Global Step: 4000 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:57:45,065-[lfw][4000]XNorm: 21.248177
Training: 2021-03-14 21:57:45,065-[lfw][4000]Accuracy-Flip: 0.99267+-0.00429
Training: 2021-03-14 21:57:45,065-[lfw][4000]Accuracy-Highest: 0.99267
Training: 2021-03-14 21:58:21,080-[cfp_fp][4000]XNorm: 18.229645
Training: 2021-03-14 21:58:21,081-[cfp_fp][4000]Accuracy-Flip: 0.91943+-0.01404
Training: 2021-03-14 21:58:21,081-[cfp_fp][4000]Accuracy-Highest: 0.91943
Training: 2021-03-14 21:58:52,760-[agedb_30][4000]XNorm: 20.799000
Training: 2021-03-14 21:58:52,760-[agedb_30][4000]Accuracy-Flip: 0.92850+-0.01527
Training: 2021-03-14 21:58:52,761-[agedb_30][4000]Accuracy-Highest: 0.92850
Training: 2021-03-14 21:59:03,999-Speed 464.66 samples/sec Loss 16.8180 Epoch: 0 Global Step: 4050 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:59:15,576-Speed 4422.92 samples/sec Loss 16.8008 Epoch: 0 Global Step: 4100 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:59:26,963-Speed 4496.58 samples/sec Loss 16.8067 Epoch: 0 Global Step: 4150 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:59:38,304-Speed 4514.59 samples/sec Loss 16.6681 Epoch: 0 Global Step: 4200 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 21:59:49,840-Speed 4438.55 samples/sec Loss 16.8047 Epoch: 0 Global Step: 4250 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:00:01,497-Speed 4392.57 samples/sec Loss 16.6429 Epoch: 0 Global Step: 4300 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:00:12,783-Speed 4536.71 samples/sec Loss 16.5423 Epoch: 0 Global Step: 4350 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:00:24,648-Speed 4315.21 samples/sec Loss 16.5444 Epoch: 0 Global Step: 4400 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:00:36,436-Speed 4343.77 samples/sec Loss 16.4978 Epoch: 0 Global Step: 4450 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:00:47,627-Speed 4575.13 samples/sec Loss 16.4804 Epoch: 0 Global Step: 4500 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:00:58,885-Speed 4548.30 samples/sec Loss 16.3749 Epoch: 0 Global Step: 4550 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:01:10,171-Speed 4536.51 samples/sec Loss 16.3069 Epoch: 0 Global Step: 4600 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:01:21,445-Speed 4541.57 samples/sec Loss 16.2470 Epoch: 0 Global Step: 4650 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:01:32,696-Speed 4551.22 samples/sec Loss 16.2902 Epoch: 0 Global Step: 4700 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:01:44,177-Speed 4459.56 samples/sec Loss 16.2382 Epoch: 0 Global Step: 4750 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:01:55,489-Speed 4526.32 samples/sec Loss 16.2135 Epoch: 0 Global Step: 4800 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:02:07,058-Speed 4425.84 samples/sec Loss 16.2327 Epoch: 0 Global Step: 4850 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:02:18,308-Speed 4551.50 samples/sec Loss 16.1618 Epoch: 0 Global Step: 4900 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:02:29,747-Speed 4475.92 samples/sec Loss 16.1123 Epoch: 0 Global Step: 4950 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:02:42,820-Speed 3916.79 samples/sec Loss 15.8160 Epoch: 1 Global Step: 5000 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:02:54,971-Speed 4213.88 samples/sec Loss 15.3632 Epoch: 1 Global Step: 5050 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:03:06,553-Speed 4420.66 samples/sec Loss 15.4204 Epoch: 1 Global Step: 5100 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:03:18,056-Speed 4451.48 samples/sec Loss 15.4542 Epoch: 1 Global Step: 5150 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:03:29,252-Speed 4573.23 samples/sec Loss 15.4057 Epoch: 1 Global Step: 5200 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:03:40,504-Speed 4550.34 samples/sec Loss 15.5942 Epoch: 1 Global Step: 5250 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:03:51,793-Speed 4535.55 samples/sec Loss 15.5590 Epoch: 1 Global Step: 5300 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:04:03,177-Speed 4498.02 samples/sec Loss 15.6327 Epoch: 1 Global Step: 5350 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:04:14,446-Speed 4543.43 samples/sec Loss 15.7070 Epoch: 1 Global Step: 5400 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:04:25,844-Speed 4492.54 samples/sec Loss 15.6890 Epoch: 1 Global Step: 5450 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:04:37,517-Speed 4386.23 samples/sec Loss 15.6559 Epoch: 1 Global Step: 5500 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:04:48,755-Speed 4556.20 samples/sec Loss 15.6682 Epoch: 1 Global Step: 5550 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:05:00,284-Speed 4441.19 samples/sec Loss 15.6706 Epoch: 1 Global Step: 5600 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:05:11,456-Speed 4582.98 samples/sec Loss 15.5852 Epoch: 1 Global Step: 5650 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:05:22,830-Speed 4501.77 samples/sec Loss 15.5864 Epoch: 1 Global Step: 5700 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:05:34,089-Speed 4547.70 samples/sec Loss 15.6468 Epoch: 1 Global Step: 5750 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:05:45,560-Speed 4463.65 samples/sec Loss 15.5574 Epoch: 1 Global Step: 5800 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:05:56,853-Speed 4533.91 samples/sec Loss 15.4870 Epoch: 1 Global Step: 5850 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:06:08,247-Speed 4493.70 samples/sec Loss 15.5631 Epoch: 1 Global Step: 5900 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:06:19,722-Speed 4462.26 samples/sec Loss 15.4278 Epoch: 1 Global Step: 5950 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:06:31,372-Speed 4394.81 samples/sec Loss 15.5845 Epoch: 1 Global Step: 6000 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:07:02,591-[lfw][6000]XNorm: 22.442214
Training: 2021-03-14 22:07:02,591-[lfw][6000]Accuracy-Flip: 0.99383+-0.00269
Training: 2021-03-14 22:07:02,591-[lfw][6000]Accuracy-Highest: 0.99383
Training: 2021-03-14 22:07:38,362-[cfp_fp][6000]XNorm: 19.089140
Training: 2021-03-14 22:07:38,362-[cfp_fp][6000]Accuracy-Flip: 0.94114+-0.01468
Training: 2021-03-14 22:07:38,362-[cfp_fp][6000]Accuracy-Highest: 0.94114
Training: 2021-03-14 22:08:09,205-[agedb_30][6000]XNorm: 21.675709
Training: 2021-03-14 22:08:09,205-[agedb_30][6000]Accuracy-Flip: 0.93900+-0.01465
Training: 2021-03-14 22:08:09,205-[agedb_30][6000]Accuracy-Highest: 0.93900
Training: 2021-03-14 22:08:20,513-Speed 469.12 samples/sec Loss 15.4833 Epoch: 1 Global Step: 6050 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:08:32,769-Speed 4177.48 samples/sec Loss 15.4587 Epoch: 1 Global Step: 6100 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:08:44,231-Speed 4467.19 samples/sec Loss 15.4595 Epoch: 1 Global Step: 6150 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:08:55,567-Speed 4516.88 samples/sec Loss 15.4554 Epoch: 1 Global Step: 6200 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:09:06,973-Speed 4488.95 samples/sec Loss 15.3544 Epoch: 1 Global Step: 6250 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:09:18,711-Speed 4362.05 samples/sec Loss 15.3548 Epoch: 1 Global Step: 6300 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:09:30,031-Speed 4523.32 samples/sec Loss 15.3467 Epoch: 1 Global Step: 6350 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:09:41,529-Speed 4453.24 samples/sec Loss 15.2982 Epoch: 1 Global Step: 6400 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:09:52,716-Speed 4576.68 samples/sec Loss 15.2779 Epoch: 1 Global Step: 6450 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:10:04,177-Speed 4467.82 samples/sec Loss 15.3268 Epoch: 1 Global Step: 6500 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:10:15,314-Speed 4597.14 samples/sec Loss 15.3345 Epoch: 1 Global Step: 6550 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:10:26,394-Speed 4621.58 samples/sec Loss 15.2487 Epoch: 1 Global Step: 6600 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:10:37,581-Speed 4576.71 samples/sec Loss 15.2882 Epoch: 1 Global Step: 6650 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:10:48,996-Speed 4485.75 samples/sec Loss 15.2249 Epoch: 1 Global Step: 6700 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:11:00,418-Speed 4482.62 samples/sec Loss 15.2593 Epoch: 1 Global Step: 6750 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:11:11,619-Speed 4571.25 samples/sec Loss 15.2190 Epoch: 1 Global Step: 6800 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:11:22,964-Speed 4513.09 samples/sec Loss 15.0769 Epoch: 1 Global Step: 6850 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:11:34,139-Speed 4582.06 samples/sec Loss 15.0862 Epoch: 1 Global Step: 6900 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:11:45,336-Speed 4572.57 samples/sec Loss 15.1081 Epoch: 1 Global Step: 6950 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:11:56,449-Speed 4607.47 samples/sec Loss 15.0740 Epoch: 1 Global Step: 7000 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:12:07,871-Speed 4482.72 samples/sec Loss 15.0906 Epoch: 1 Global Step: 7050 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:12:19,403-Speed 4440.05 samples/sec Loss 15.1104 Epoch: 1 Global Step: 7100 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:12:31,257-Speed 4319.66 samples/sec Loss 14.9855 Epoch: 1 Global Step: 7150 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:12:42,413-Speed 4589.47 samples/sec Loss 15.1202 Epoch: 1 Global Step: 7200 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:12:53,614-Speed 4571.16 samples/sec Loss 15.0065 Epoch: 1 Global Step: 7250 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:13:04,782-Speed 4585.00 samples/sec Loss 14.9254 Epoch: 1 Global Step: 7300 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:13:16,141-Speed 4507.37 samples/sec Loss 15.0296 Epoch: 1 Global Step: 7350 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:13:27,520-Speed 4499.73 samples/sec Loss 14.9337 Epoch: 1 Global Step: 7400 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:13:39,053-Speed 4439.87 samples/sec Loss 14.8663 Epoch: 1 Global Step: 7450 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:13:50,373-Speed 4523.20 samples/sec Loss 14.9025 Epoch: 1 Global Step: 7500 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:14:01,796-Speed 4482.07 samples/sec Loss 14.9747 Epoch: 1 Global Step: 7550 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:14:13,090-Speed 4533.65 samples/sec Loss 14.8099 Epoch: 1 Global Step: 7600 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:14:24,317-Speed 4560.98 samples/sec Loss 14.8035 Epoch: 1 Global Step: 7650 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:14:35,506-Speed 4575.78 samples/sec Loss 14.8963 Epoch: 1 Global Step: 7700 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:14:47,654-Speed 4215.11 samples/sec Loss 14.8433 Epoch: 1 Global Step: 7750 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:14:59,104-Speed 4471.65 samples/sec Loss 14.8770 Epoch: 1 Global Step: 7800 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:15:10,532-Speed 4480.44 samples/sec Loss 14.8513 Epoch: 1 Global Step: 7850 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:15:22,437-Speed 4300.80 samples/sec Loss 14.9244 Epoch: 1 Global Step: 7900 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:15:33,869-Speed 4479.04 samples/sec Loss 14.8788 Epoch: 1 Global Step: 7950 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:15:45,319-Speed 4471.78 samples/sec Loss 14.8866 Epoch: 1 Global Step: 8000 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:16:16,047-[lfw][8000]XNorm: 23.360111
Training: 2021-03-14 22:16:16,048-[lfw][8000]Accuracy-Flip: 0.99217+-0.00448
Training: 2021-03-14 22:16:16,048-[lfw][8000]Accuracy-Highest: 0.99383
Training: 2021-03-14 22:16:51,791-[cfp_fp][8000]XNorm: 19.679413
Training: 2021-03-14 22:16:51,792-[cfp_fp][8000]Accuracy-Flip: 0.93086+-0.01303
Training: 2021-03-14 22:16:51,792-[cfp_fp][8000]Accuracy-Highest: 0.94114
Training: 2021-03-14 22:17:22,617-[agedb_30][8000]XNorm: 22.382423
Training: 2021-03-14 22:17:22,618-[agedb_30][8000]Accuracy-Flip: 0.94283+-0.01209
Training: 2021-03-14 22:17:22,618-[agedb_30][8000]Accuracy-Highest: 0.94283
Training: 2021-03-14 22:17:33,769-Speed 472.11 samples/sec Loss 14.7507 Epoch: 1 Global Step: 8050 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:17:45,129-Speed 4507.21 samples/sec Loss 14.8146 Epoch: 1 Global Step: 8100 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:17:56,335-Speed 4569.14 samples/sec Loss 14.7279 Epoch: 1 Global Step: 8150 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:18:07,833-Speed 4453.33 samples/sec Loss 14.7791 Epoch: 1 Global Step: 8200 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:18:19,055-Speed 4562.57 samples/sec Loss 14.7574 Epoch: 1 Global Step: 8250 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:18:30,496-Speed 4475.30 samples/sec Loss 14.7312 Epoch: 1 Global Step: 8300 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:18:41,822-Speed 4520.87 samples/sec Loss 14.7360 Epoch: 1 Global Step: 8350 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:18:53,325-Speed 4451.12 samples/sec Loss 14.5957 Epoch: 1 Global Step: 8400 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:19:04,520-Speed 4573.65 samples/sec Loss 14.7364 Epoch: 1 Global Step: 8450 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:19:15,947-Speed 4480.75 samples/sec Loss 14.7062 Epoch: 1 Global Step: 8500 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:19:27,415-Speed 4464.98 samples/sec Loss 14.6640 Epoch: 1 Global Step: 8550 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:19:38,708-Speed 4533.76 samples/sec Loss 14.6765 Epoch: 1 Global Step: 8600 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:19:50,088-Speed 4499.44 samples/sec Loss 14.7359 Epoch: 1 Global Step: 8650 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:20:01,615-Speed 4442.13 samples/sec Loss 14.6457 Epoch: 1 Global Step: 8700 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:20:13,257-Speed 4397.94 samples/sec Loss 14.6147 Epoch: 1 Global Step: 8750 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:20:24,788-Speed 4440.66 samples/sec Loss 14.5694 Epoch: 1 Global Step: 8800 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:20:36,152-Speed 4505.44 samples/sec Loss 14.6180 Epoch: 1 Global Step: 8850 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:20:47,441-Speed 4535.49 samples/sec Loss 14.6516 Epoch: 1 Global Step: 8900 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:20:58,938-Speed 4453.52 samples/sec Loss 14.6414 Epoch: 1 Global Step: 8950 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:21:10,277-Speed 4515.75 samples/sec Loss 14.5783 Epoch: 1 Global Step: 9000 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:21:21,637-Speed 4507.44 samples/sec Loss 14.5705 Epoch: 1 Global Step: 9050 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:21:32,875-Speed 4556.08 samples/sec Loss 14.5549 Epoch: 1 Global Step: 9100 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:21:44,324-Speed 4472.16 samples/sec Loss 14.6013 Epoch: 1 Global Step: 9150 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:21:55,656-Speed 4518.35 samples/sec Loss 14.6045 Epoch: 1 Global Step: 9200 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:22:06,950-Speed 4533.53 samples/sec Loss 14.5071 Epoch: 1 Global Step: 9250 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:22:18,199-Speed 4551.93 samples/sec Loss 14.4240 Epoch: 1 Global Step: 9300 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:22:29,640-Speed 4475.04 samples/sec Loss 14.4802 Epoch: 1 Global Step: 9350 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:22:41,001-Speed 4507.07 samples/sec Loss 14.5016 Epoch: 1 Global Step: 9400 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:22:52,611-Speed 4410.12 samples/sec Loss 14.4549 Epoch: 1 Global Step: 9450 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:23:04,193-Speed 4420.93 samples/sec Loss 14.4499 Epoch: 1 Global Step: 9500 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:23:15,881-Speed 4380.69 samples/sec Loss 14.5105 Epoch: 1 Global Step: 9550 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:23:27,436-Speed 4431.14 samples/sec Loss 14.4165 Epoch: 1 Global Step: 9600 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:23:39,307-Speed 4313.22 samples/sec Loss 14.3639 Epoch: 1 Global Step: 9650 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:23:50,622-Speed 4525.24 samples/sec Loss 14.4888 Epoch: 1 Global Step: 9700 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:24:01,982-Speed 4507.34 samples/sec Loss 14.3548 Epoch: 1 Global Step: 9750 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:24:13,422-Speed 4475.70 samples/sec Loss 14.3642 Epoch: 1 Global Step: 9800 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:24:24,607-Speed 4577.67 samples/sec Loss 14.4219 Epoch: 1 Global Step: 9850 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:24:36,082-Speed 4462.08 samples/sec Loss 14.4776 Epoch: 1 Global Step: 9900 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:24:47,493-Speed 4487.18 samples/sec Loss 14.4164 Epoch: 1 Global Step: 9950 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:25:02,642-Speed 3379.78 samples/sec Loss 13.7947 Epoch: 2 Global Step: 10000 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:25:33,771-[lfw][10000]XNorm: 23.671964
Training: 2021-03-14 22:25:33,771-[lfw][10000]Accuracy-Flip: 0.99383+-0.00388
Training: 2021-03-14 22:25:33,771-[lfw][10000]Accuracy-Highest: 0.99383
Training: 2021-03-14 22:26:09,316-[cfp_fp][10000]XNorm: 20.188665
Training: 2021-03-14 22:26:09,316-[cfp_fp][10000]Accuracy-Flip: 0.94414+-0.01140
Training: 2021-03-14 22:26:09,316-[cfp_fp][10000]Accuracy-Highest: 0.94414
Training: 2021-03-14 22:26:39,993-[agedb_30][10000]XNorm: 22.711823
Training: 2021-03-14 22:26:39,993-[agedb_30][10000]Accuracy-Flip: 0.95167+-0.01014
Training: 2021-03-14 22:26:39,994-[agedb_30][10000]Accuracy-Highest: 0.95167
Training: 2021-03-14 22:26:51,630-Speed 469.78 samples/sec Loss 13.6785 Epoch: 2 Global Step: 10050 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:27:03,076-Speed 4473.28 samples/sec Loss 13.8045 Epoch: 2 Global Step: 10100 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:27:14,560-Speed 4458.93 samples/sec Loss 13.8611 Epoch: 2 Global Step: 10150 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:27:25,880-Speed 4523.24 samples/sec Loss 13.9817 Epoch: 2 Global Step: 10200 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:27:37,254-Speed 4501.73 samples/sec Loss 14.0867 Epoch: 2 Global Step: 10250 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:27:48,645-Speed 4494.94 samples/sec Loss 14.1539 Epoch: 2 Global Step: 10300 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:28:00,236-Speed 4417.40 samples/sec Loss 14.2404 Epoch: 2 Global Step: 10350 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:28:12,039-Speed 4337.90 samples/sec Loss 14.2261 Epoch: 2 Global Step: 10400 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:28:23,172-Speed 4599.39 samples/sec Loss 14.2359 Epoch: 2 Global Step: 10450 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:28:34,548-Speed 4500.90 samples/sec Loss 14.1536 Epoch: 2 Global Step: 10500 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:28:45,941-Speed 4494.03 samples/sec Loss 14.2720 Epoch: 2 Global Step: 10550 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:28:57,546-Speed 4412.07 samples/sec Loss 14.2321 Epoch: 2 Global Step: 10600 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:29:08,905-Speed 4507.62 samples/sec Loss 14.2084 Epoch: 2 Global Step: 10650 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:29:20,426-Speed 4444.50 samples/sec Loss 14.2343 Epoch: 2 Global Step: 10700 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:29:31,624-Speed 4572.20 samples/sec Loss 14.2422 Epoch: 2 Global Step: 10750 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:29:42,852-Speed 4560.13 samples/sec Loss 14.2027 Epoch: 2 Global Step: 10800 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:29:54,186-Speed 4517.87 samples/sec Loss 14.2936 Epoch: 2 Global Step: 10850 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:30:05,696-Speed 4448.19 samples/sec Loss 14.2358 Epoch: 2 Global Step: 10900 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:30:17,007-Speed 4526.98 samples/sec Loss 14.2046 Epoch: 2 Global Step: 10950 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:30:28,455-Speed 4472.53 samples/sec Loss 14.1451 Epoch: 2 Global Step: 11000 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:30:39,743-Speed 4536.05 samples/sec Loss 14.1713 Epoch: 2 Global Step: 11050 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:30:51,121-Speed 4499.94 samples/sec Loss 14.2027 Epoch: 2 Global Step: 11100 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:31:02,621-Speed 4452.52 samples/sec Loss 14.2212 Epoch: 2 Global Step: 11150 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:31:14,109-Speed 4456.84 samples/sec Loss 14.2120 Epoch: 2 Global Step: 11200 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:31:25,491-Speed 4498.62 samples/sec Loss 14.1010 Epoch: 2 Global Step: 11250 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:31:37,163-Speed 4386.63 samples/sec Loss 14.1050 Epoch: 2 Global Step: 11300 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:31:48,976-Speed 4334.51 samples/sec Loss 14.2313 Epoch: 2 Global Step: 11350 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:32:00,655-Speed 4384.15 samples/sec Loss 14.2096 Epoch: 2 Global Step: 11400 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:32:12,242-Speed 4418.79 samples/sec Loss 14.1985 Epoch: 2 Global Step: 11450 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:32:23,930-Speed 4380.70 samples/sec Loss 14.2992 Epoch: 2 Global Step: 11500 Fp16 Grad Scale: 16384 Required: 9 hours
Training: 2021-03-14 22:32:35,379-Speed 4472.31 samples/sec Loss 14.1573 Epoch: 2 Global Step: 11550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:32:46,488-Speed 4609.28 samples/sec Loss 14.2429 Epoch: 2 Global Step: 11600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:32:57,818-Speed 4518.84 samples/sec Loss 14.1784 Epoch: 2 Global Step: 11650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:33:09,144-Speed 4520.99 samples/sec Loss 14.1504 Epoch: 2 Global Step: 11700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:33:20,593-Speed 4472.28 samples/sec Loss 14.1956 Epoch: 2 Global Step: 11750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:33:31,947-Speed 4509.54 samples/sec Loss 14.0733 Epoch: 2 Global Step: 11800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:33:43,423-Speed 4461.54 samples/sec Loss 14.0804 Epoch: 2 Global Step: 11850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:33:54,950-Speed 4442.14 samples/sec Loss 14.0581 Epoch: 2 Global Step: 11900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:34:06,515-Speed 4427.40 samples/sec Loss 14.0982 Epoch: 2 Global Step: 11950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:34:17,613-Speed 4613.32 samples/sec Loss 14.0867 Epoch: 2 Global Step: 12000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:34:48,416-[lfw][12000]XNorm: 22.331438 Training: 2021-03-14 22:34:48,417-[lfw][12000]Accuracy-Flip: 0.99383+-0.00366 Training: 2021-03-14 22:34:48,417-[lfw][12000]Accuracy-Highest: 0.99383 Training: 2021-03-14 22:35:24,644-[cfp_fp][12000]XNorm: 19.292588 Training: 2021-03-14 22:35:24,644-[cfp_fp][12000]Accuracy-Flip: 0.94557+-0.01126 Training: 2021-03-14 22:35:24,644-[cfp_fp][12000]Accuracy-Highest: 0.94557 Training: 2021-03-14 22:35:55,562-[agedb_30][12000]XNorm: 21.691359 Training: 2021-03-14 22:35:55,562-[agedb_30][12000]Accuracy-Flip: 
0.95150+-0.00845 Training: 2021-03-14 22:35:55,562-[agedb_30][12000]Accuracy-Highest: 0.95167 Training: 2021-03-14 22:36:07,192-Speed 467.24 samples/sec Loss 14.1503 Epoch: 2 Global Step: 12050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:36:18,797-Speed 4412.34 samples/sec Loss 14.1534 Epoch: 2 Global Step: 12100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:36:30,248-Speed 4471.14 samples/sec Loss 14.0075 Epoch: 2 Global Step: 12150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:36:41,479-Speed 4559.22 samples/sec Loss 14.0452 Epoch: 2 Global Step: 12200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:36:53,023-Speed 4435.48 samples/sec Loss 14.0574 Epoch: 2 Global Step: 12250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:37:04,271-Speed 4552.04 samples/sec Loss 14.0070 Epoch: 2 Global Step: 12300 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:37:15,549-Speed 4539.81 samples/sec Loss 14.0320 Epoch: 2 Global Step: 12350 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:37:26,825-Speed 4541.08 samples/sec Loss 13.9957 Epoch: 2 Global Step: 12400 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:37:38,458-Speed 4401.24 samples/sec Loss 14.0418 Epoch: 2 Global Step: 12450 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:37:49,738-Speed 4539.16 samples/sec Loss 14.0921 Epoch: 2 Global Step: 12500 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:38:01,320-Speed 4420.97 samples/sec Loss 14.0033 Epoch: 2 Global Step: 12550 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:38:12,768-Speed 4472.68 samples/sec Loss 13.9569 Epoch: 2 Global Step: 12600 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:38:24,089-Speed 4522.55 samples/sec Loss 14.0753 Epoch: 2 Global Step: 12650 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:38:35,566-Speed 4461.38 samples/sec 
Loss 14.0217 Epoch: 2 Global Step: 12700 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:38:46,808-Speed 4554.46 samples/sec Loss 14.1091 Epoch: 2 Global Step: 12750 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:38:58,452-Speed 4397.34 samples/sec Loss 13.9549 Epoch: 2 Global Step: 12800 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:39:09,741-Speed 4535.65 samples/sec Loss 13.9200 Epoch: 2 Global Step: 12850 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:39:21,191-Speed 4471.91 samples/sec Loss 14.0056 Epoch: 2 Global Step: 12900 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:39:32,411-Speed 4563.32 samples/sec Loss 14.0288 Epoch: 2 Global Step: 12950 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:39:43,902-Speed 4455.97 samples/sec Loss 14.0037 Epoch: 2 Global Step: 13000 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:39:55,468-Speed 4427.01 samples/sec Loss 13.9449 Epoch: 2 Global Step: 13050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:40:06,886-Speed 4484.11 samples/sec Loss 14.0486 Epoch: 2 Global Step: 13100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:40:18,399-Speed 4447.54 samples/sec Loss 13.9952 Epoch: 2 Global Step: 13150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:40:30,191-Speed 4341.92 samples/sec Loss 14.0025 Epoch: 2 Global Step: 13200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:40:41,425-Speed 4557.85 samples/sec Loss 13.9380 Epoch: 2 Global Step: 13250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:40:53,142-Speed 4369.80 samples/sec Loss 13.9225 Epoch: 2 Global Step: 13300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:41:04,633-Speed 4456.14 samples/sec Loss 13.9442 Epoch: 2 Global Step: 13350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:41:15,901-Speed 4543.84 samples/sec Loss 
14.0167 Epoch: 2 Global Step: 13400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:41:27,135-Speed 4557.90 samples/sec Loss 13.9963 Epoch: 2 Global Step: 13450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:41:38,481-Speed 4512.63 samples/sec Loss 14.0103 Epoch: 2 Global Step: 13500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:41:49,814-Speed 4517.99 samples/sec Loss 13.9057 Epoch: 2 Global Step: 13550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:42:01,158-Speed 4513.93 samples/sec Loss 13.9230 Epoch: 2 Global Step: 13600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:42:12,626-Speed 4464.82 samples/sec Loss 13.9179 Epoch: 2 Global Step: 13650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:42:23,811-Speed 4577.36 samples/sec Loss 13.9801 Epoch: 2 Global Step: 13700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:42:35,542-Speed 4364.99 samples/sec Loss 13.9931 Epoch: 2 Global Step: 13750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:42:46,756-Speed 4565.98 samples/sec Loss 13.8612 Epoch: 2 Global Step: 13800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:42:58,393-Speed 4399.63 samples/sec Loss 13.9556 Epoch: 2 Global Step: 13850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:43:09,767-Speed 4502.03 samples/sec Loss 13.9370 Epoch: 2 Global Step: 13900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:43:21,182-Speed 4485.45 samples/sec Loss 13.9771 Epoch: 2 Global Step: 13950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:43:32,619-Speed 4476.72 samples/sec Loss 13.7918 Epoch: 2 Global Step: 14000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:44:03,307-[lfw][14000]XNorm: 22.001071 Training: 2021-03-14 22:44:03,307-[lfw][14000]Accuracy-Flip: 0.99267+-0.00382 Training: 2021-03-14 22:44:03,307-[lfw][14000]Accuracy-Highest: 0.99383 
Training: 2021-03-14 22:44:38,905-[cfp_fp][14000]XNorm: 18.869263 Training: 2021-03-14 22:44:38,905-[cfp_fp][14000]Accuracy-Flip: 0.94471+-0.01262 Training: 2021-03-14 22:44:38,905-[cfp_fp][14000]Accuracy-Highest: 0.94557 Training: 2021-03-14 22:45:09,672-[agedb_30][14000]XNorm: 21.527927 Training: 2021-03-14 22:45:09,672-[agedb_30][14000]Accuracy-Flip: 0.95417+-0.00949 Training: 2021-03-14 22:45:09,673-[agedb_30][14000]Accuracy-Highest: 0.95417 Training: 2021-03-14 22:45:20,923-Speed 472.74 samples/sec Loss 13.9035 Epoch: 2 Global Step: 14050 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:45:32,377-Speed 4470.48 samples/sec Loss 13.9690 Epoch: 2 Global Step: 14100 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:45:43,806-Speed 4479.96 samples/sec Loss 13.8820 Epoch: 2 Global Step: 14150 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:45:55,308-Speed 4451.66 samples/sec Loss 14.0041 Epoch: 2 Global Step: 14200 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:46:06,636-Speed 4519.65 samples/sec Loss 13.8826 Epoch: 2 Global Step: 14250 Fp16 Grad Scale: 16384 Required: 9 hours Training: 2021-03-14 22:46:18,077-Speed 4475.34 samples/sec Loss 13.8451 Epoch: 2 Global Step: 14300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:46:29,560-Speed 4458.95 samples/sec Loss 13.8821 Epoch: 2 Global Step: 14350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:46:40,942-Speed 4498.78 samples/sec Loss 13.8244 Epoch: 2 Global Step: 14400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:46:52,363-Speed 4483.01 samples/sec Loss 13.7670 Epoch: 2 Global Step: 14450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:47:03,699-Speed 4516.63 samples/sec Loss 13.8445 Epoch: 2 Global Step: 14500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:47:15,267-Speed 4426.25 samples/sec Loss 14.0385 Epoch: 2 Global Step: 14550 Fp16 Grad Scale: 16384 
Required: 8 hours Training: 2021-03-14 22:47:26,631-Speed 4505.94 samples/sec Loss 13.8959 Epoch: 2 Global Step: 14600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:47:37,811-Speed 4579.42 samples/sec Loss 13.9358 Epoch: 2 Global Step: 14650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:47:49,139-Speed 4520.35 samples/sec Loss 13.7790 Epoch: 2 Global Step: 14700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:48:00,547-Speed 4488.22 samples/sec Loss 13.8011 Epoch: 2 Global Step: 14750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:48:11,967-Speed 4483.51 samples/sec Loss 13.8158 Epoch: 2 Global Step: 14800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:48:23,171-Speed 4569.98 samples/sec Loss 13.8222 Epoch: 2 Global Step: 14850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:48:34,558-Speed 4496.58 samples/sec Loss 13.8453 Epoch: 2 Global Step: 14900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:48:48,705-Speed 3619.34 samples/sec Loss 13.8031 Epoch: 3 Global Step: 14950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:49:01,359-Speed 4046.31 samples/sec Loss 13.1055 Epoch: 3 Global Step: 15000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:49:13,117-Speed 4354.66 samples/sec Loss 13.1936 Epoch: 3 Global Step: 15050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:49:24,935-Speed 4332.81 samples/sec Loss 13.2769 Epoch: 3 Global Step: 15100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:49:36,231-Speed 4532.89 samples/sec Loss 13.4009 Epoch: 3 Global Step: 15150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:49:47,902-Speed 4387.08 samples/sec Loss 13.4404 Epoch: 3 Global Step: 15200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:49:59,571-Speed 4387.90 samples/sec Loss 13.4903 Epoch: 3 Global Step: 15250 Fp16 Grad Scale: 16384 Required: 8 
hours Training: 2021-03-14 22:50:11,053-Speed 4459.19 samples/sec Loss 13.5198 Epoch: 3 Global Step: 15300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:50:22,360-Speed 4528.64 samples/sec Loss 13.5403 Epoch: 3 Global Step: 15350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:50:33,619-Speed 4547.40 samples/sec Loss 13.5796 Epoch: 3 Global Step: 15400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:50:44,902-Speed 4538.24 samples/sec Loss 13.7090 Epoch: 3 Global Step: 15450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:50:56,892-Speed 4270.28 samples/sec Loss 13.7197 Epoch: 3 Global Step: 15500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:51:08,199-Speed 4528.48 samples/sec Loss 13.7664 Epoch: 3 Global Step: 15550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:51:19,810-Speed 4409.59 samples/sec Loss 13.7554 Epoch: 3 Global Step: 15600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:51:30,994-Speed 4578.30 samples/sec Loss 13.6705 Epoch: 3 Global Step: 15650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:51:42,434-Speed 4475.59 samples/sec Loss 13.7426 Epoch: 3 Global Step: 15700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:51:53,742-Speed 4528.06 samples/sec Loss 13.7655 Epoch: 3 Global Step: 15750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:52:05,216-Speed 4462.44 samples/sec Loss 13.6968 Epoch: 3 Global Step: 15800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:52:16,438-Speed 4562.87 samples/sec Loss 13.7768 Epoch: 3 Global Step: 15850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:52:27,908-Speed 4463.77 samples/sec Loss 13.6517 Epoch: 3 Global Step: 15900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:52:39,343-Speed 4477.72 samples/sec Loss 13.6594 Epoch: 3 Global Step: 15950 Fp16 Grad Scale: 16384 Required: 8 hours 
Training: 2021-03-14 22:52:50,562-Speed 4564.04 samples/sec Loss 13.7187 Epoch: 3 Global Step: 16000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:53:21,330-[lfw][16000]XNorm: 20.869780 Training: 2021-03-14 22:53:21,330-[lfw][16000]Accuracy-Flip: 0.99500+-0.00357 Training: 2021-03-14 22:53:21,330-[lfw][16000]Accuracy-Highest: 0.99500 Training: 2021-03-14 22:53:57,059-[cfp_fp][16000]XNorm: 17.713747 Training: 2021-03-14 22:53:57,059-[cfp_fp][16000]Accuracy-Flip: 0.93943+-0.01511 Training: 2021-03-14 22:53:57,059-[cfp_fp][16000]Accuracy-Highest: 0.94557 Training: 2021-03-14 22:54:27,782-[agedb_30][16000]XNorm: 20.180858 Training: 2021-03-14 22:54:27,783-[agedb_30][16000]Accuracy-Flip: 0.95250+-0.00898 Training: 2021-03-14 22:54:27,783-[agedb_30][16000]Accuracy-Highest: 0.95417 Training: 2021-03-14 22:54:39,047-Speed 471.96 samples/sec Loss 13.7911 Epoch: 3 Global Step: 16050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:54:50,538-Speed 4455.74 samples/sec Loss 13.7165 Epoch: 3 Global Step: 16100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:55:01,779-Speed 4555.16 samples/sec Loss 13.6841 Epoch: 3 Global Step: 16150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:55:13,018-Speed 4555.57 samples/sec Loss 13.6674 Epoch: 3 Global Step: 16200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:55:24,546-Speed 4441.66 samples/sec Loss 13.7654 Epoch: 3 Global Step: 16250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:55:36,112-Speed 4426.74 samples/sec Loss 13.6695 Epoch: 3 Global Step: 16300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:55:47,410-Speed 4532.15 samples/sec Loss 13.7761 Epoch: 3 Global Step: 16350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:55:58,631-Speed 4563.10 samples/sec Loss 13.7186 Epoch: 3 Global Step: 16400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:56:10,036-Speed 4489.54 samples/sec 
Loss 13.7023 Epoch: 3 Global Step: 16450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:56:21,024-Speed 4659.59 samples/sec Loss 13.7477 Epoch: 3 Global Step: 16500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:56:32,482-Speed 4468.84 samples/sec Loss 13.6975 Epoch: 3 Global Step: 16550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:56:43,669-Speed 4577.07 samples/sec Loss 13.6762 Epoch: 3 Global Step: 16600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:56:55,141-Speed 4462.99 samples/sec Loss 13.7131 Epoch: 3 Global Step: 16650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:57:06,568-Speed 4480.70 samples/sec Loss 13.6407 Epoch: 3 Global Step: 16700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:57:18,234-Speed 4389.07 samples/sec Loss 13.7334 Epoch: 3 Global Step: 16750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:57:29,998-Speed 4352.51 samples/sec Loss 13.7264 Epoch: 3 Global Step: 16800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:57:41,783-Speed 4344.87 samples/sec Loss 13.6405 Epoch: 3 Global Step: 16850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:57:53,272-Speed 4456.51 samples/sec Loss 13.7017 Epoch: 3 Global Step: 16900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:58:05,075-Speed 4338.16 samples/sec Loss 13.6738 Epoch: 3 Global Step: 16950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:58:16,274-Speed 4571.99 samples/sec Loss 13.6900 Epoch: 3 Global Step: 17000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:58:27,774-Speed 4452.37 samples/sec Loss 13.7050 Epoch: 3 Global Step: 17050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:58:39,392-Speed 4407.06 samples/sec Loss 13.6635 Epoch: 3 Global Step: 17100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:58:50,596-Speed 4570.24 samples/sec Loss 
13.7048 Epoch: 3 Global Step: 17150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:59:02,216-Speed 4406.11 samples/sec Loss 13.6261 Epoch: 3 Global Step: 17200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:59:13,787-Speed 4425.09 samples/sec Loss 13.5875 Epoch: 3 Global Step: 17250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:59:25,164-Speed 4500.64 samples/sec Loss 13.5875 Epoch: 3 Global Step: 17300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:59:36,614-Speed 4471.73 samples/sec Loss 13.6442 Epoch: 3 Global Step: 17350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:59:47,940-Speed 4520.64 samples/sec Loss 13.6666 Epoch: 3 Global Step: 17400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 22:59:59,171-Speed 4558.96 samples/sec Loss 13.5868 Epoch: 3 Global Step: 17450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:00:10,549-Speed 4500.31 samples/sec Loss 13.6344 Epoch: 3 Global Step: 17500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:00:21,858-Speed 4527.45 samples/sec Loss 13.6136 Epoch: 3 Global Step: 17550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:00:33,237-Speed 4499.93 samples/sec Loss 13.6837 Epoch: 3 Global Step: 17600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:00:44,490-Speed 4549.92 samples/sec Loss 13.7086 Epoch: 3 Global Step: 17650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:00:56,037-Speed 4434.44 samples/sec Loss 13.7151 Epoch: 3 Global Step: 17700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:01:07,237-Speed 4571.37 samples/sec Loss 13.6568 Epoch: 3 Global Step: 17750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:01:18,608-Speed 4502.83 samples/sec Loss 13.7556 Epoch: 3 Global Step: 17800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:01:29,927-Speed 4523.74 samples/sec Loss 13.5717 
Epoch: 3 Global Step: 17850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:01:41,318-Speed 4494.76 samples/sec Loss 13.5827 Epoch: 3 Global Step: 17900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:01:52,633-Speed 4525.37 samples/sec Loss 13.7211 Epoch: 3 Global Step: 17950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:02:04,396-Speed 4352.64 samples/sec Loss 13.6889 Epoch: 3 Global Step: 18000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:02:35,151-[lfw][18000]XNorm: 21.680356 Training: 2021-03-14 23:02:35,151-[lfw][18000]Accuracy-Flip: 0.99350+-0.00411 Training: 2021-03-14 23:02:35,151-[lfw][18000]Accuracy-Highest: 0.99500 Training: 2021-03-14 23:03:10,962-[cfp_fp][18000]XNorm: 18.264520 Training: 2021-03-14 23:03:10,962-[cfp_fp][18000]Accuracy-Flip: 0.94871+-0.01142 Training: 2021-03-14 23:03:10,962-[cfp_fp][18000]Accuracy-Highest: 0.94871 Training: 2021-03-14 23:03:41,795-[agedb_30][18000]XNorm: 21.166075 Training: 2021-03-14 23:03:41,795-[agedb_30][18000]Accuracy-Flip: 0.95650+-0.00717 Training: 2021-03-14 23:03:41,795-[agedb_30][18000]Accuracy-Highest: 0.95650 Training: 2021-03-14 23:03:53,073-Speed 471.12 samples/sec Loss 13.6310 Epoch: 3 Global Step: 18050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:04:04,467-Speed 4493.79 samples/sec Loss 13.5933 Epoch: 3 Global Step: 18100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:04:15,695-Speed 4560.27 samples/sec Loss 13.5431 Epoch: 3 Global Step: 18150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:04:27,156-Speed 4467.69 samples/sec Loss 13.7114 Epoch: 3 Global Step: 18200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:04:38,591-Speed 4477.62 samples/sec Loss 13.5933 Epoch: 3 Global Step: 18250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:04:49,992-Speed 4491.11 samples/sec Loss 13.5771 Epoch: 3 Global Step: 18300 Fp16 Grad Scale: 16384 
Required: 8 hours Training: 2021-03-14 23:05:01,168-Speed 4581.21 samples/sec Loss 13.7137 Epoch: 3 Global Step: 18350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:05:12,432-Speed 4545.81 samples/sec Loss 13.6075 Epoch: 3 Global Step: 18400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:05:23,765-Speed 4517.94 samples/sec Loss 13.6663 Epoch: 3 Global Step: 18450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:05:34,995-Speed 4559.46 samples/sec Loss 13.5419 Epoch: 3 Global Step: 18500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:05:46,336-Speed 4514.52 samples/sec Loss 13.6123 Epoch: 3 Global Step: 18550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:05:57,599-Speed 4546.18 samples/sec Loss 13.6106 Epoch: 3 Global Step: 18600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:06:08,998-Speed 4491.92 samples/sec Loss 13.6376 Epoch: 3 Global Step: 18650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:06:21,112-Speed 4226.75 samples/sec Loss 13.5363 Epoch: 3 Global Step: 18700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:06:32,394-Speed 4538.37 samples/sec Loss 13.5201 Epoch: 3 Global Step: 18750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:06:43,923-Speed 4441.19 samples/sec Loss 13.6462 Epoch: 3 Global Step: 18800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:06:55,503-Speed 4421.53 samples/sec Loss 13.5618 Epoch: 3 Global Step: 18850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:07:07,105-Speed 4413.40 samples/sec Loss 13.6214 Epoch: 3 Global Step: 18900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:07:18,803-Speed 4376.81 samples/sec Loss 13.6325 Epoch: 3 Global Step: 18950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:07:30,335-Speed 4440.19 samples/sec Loss 13.6397 Epoch: 3 Global Step: 19000 Fp16 Grad Scale: 16384 Required: 8 
hours Training: 2021-03-14 23:07:41,861-Speed 4442.19 samples/sec Loss 13.5597 Epoch: 3 Global Step: 19050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:07:53,251-Speed 4495.25 samples/sec Loss 13.4842 Epoch: 3 Global Step: 19100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:08:04,660-Speed 4488.14 samples/sec Loss 13.5343 Epoch: 3 Global Step: 19150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:08:15,941-Speed 4538.71 samples/sec Loss 13.6003 Epoch: 3 Global Step: 19200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:08:27,325-Speed 4497.96 samples/sec Loss 13.4786 Epoch: 3 Global Step: 19250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:08:38,758-Speed 4478.07 samples/sec Loss 13.5870 Epoch: 3 Global Step: 19300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:08:50,024-Speed 4544.86 samples/sec Loss 13.6178 Epoch: 3 Global Step: 19350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:09:01,247-Speed 4562.64 samples/sec Loss 13.5250 Epoch: 3 Global Step: 19400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:09:12,591-Speed 4513.40 samples/sec Loss 13.5058 Epoch: 3 Global Step: 19450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:09:23,841-Speed 4551.22 samples/sec Loss 13.5343 Epoch: 3 Global Step: 19500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:09:35,258-Speed 4484.95 samples/sec Loss 13.6363 Epoch: 3 Global Step: 19550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:09:46,742-Speed 4458.26 samples/sec Loss 13.5448 Epoch: 3 Global Step: 19600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:09:58,091-Speed 4512.00 samples/sec Loss 13.6033 Epoch: 3 Global Step: 19650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:10:09,811-Speed 4368.49 samples/sec Loss 13.5579 Epoch: 3 Global Step: 19700 Fp16 Grad Scale: 16384 Required: 8 hours 
Training: 2021-03-14 23:10:21,078-Speed 4544.36 samples/sec Loss 13.4800 Epoch: 3 Global Step: 19750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:10:32,485-Speed 4488.88 samples/sec Loss 13.4951 Epoch: 3 Global Step: 19800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:10:43,793-Speed 4527.75 samples/sec Loss 13.4648 Epoch: 3 Global Step: 19850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:10:55,166-Speed 4502.15 samples/sec Loss 13.5276 Epoch: 3 Global Step: 19900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:11:10,356-Speed 3370.94 samples/sec Loss 13.1759 Epoch: 4 Global Step: 19950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:11:22,290-Speed 4290.27 samples/sec Loss 12.8748 Epoch: 4 Global Step: 20000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:11:53,004-[lfw][20000]XNorm: 22.829273 Training: 2021-03-14 23:11:53,004-[lfw][20000]Accuracy-Flip: 0.99483+-0.00329 Training: 2021-03-14 23:11:53,004-[lfw][20000]Accuracy-Highest: 0.99500 Training: 2021-03-14 23:12:28,742-[cfp_fp][20000]XNorm: 19.612702 Training: 2021-03-14 23:12:28,742-[cfp_fp][20000]Accuracy-Flip: 0.94586+-0.00911 Training: 2021-03-14 23:12:28,742-[cfp_fp][20000]Accuracy-Highest: 0.94871 Training: 2021-03-14 23:12:59,578-[agedb_30][20000]XNorm: 22.251450 Training: 2021-03-14 23:12:59,578-[agedb_30][20000]Accuracy-Flip: 0.95717+-0.01030 Training: 2021-03-14 23:12:59,578-[agedb_30][20000]Accuracy-Highest: 0.95717 Training: 2021-03-14 23:13:11,044-Speed 470.79 samples/sec Loss 12.8930 Epoch: 4 Global Step: 20050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:13:22,464-Speed 4483.29 samples/sec Loss 12.9993 Epoch: 4 Global Step: 20100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:13:33,914-Speed 4471.82 samples/sec Loss 13.1833 Epoch: 4 Global Step: 20150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:13:45,262-Speed 4512.40 samples/sec 
Loss 13.2459 Epoch: 4 Global Step: 20200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:13:56,663-Speed 4490.89 samples/sec Loss 13.2502 Epoch: 4 Global Step: 20250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:14:08,051-Speed 4496.29 samples/sec Loss 13.2611 Epoch: 4 Global Step: 20300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:14:19,449-Speed 4492.31 samples/sec Loss 13.2961 Epoch: 4 Global Step: 20350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:14:30,946-Speed 4453.46 samples/sec Loss 13.4032 Epoch: 4 Global Step: 20400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:14:42,220-Speed 4541.49 samples/sec Loss 13.3911 Epoch: 4 Global Step: 20450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:14:53,627-Speed 4488.82 samples/sec Loss 13.3841 Epoch: 4 Global Step: 20500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:15:05,328-Speed 4375.96 samples/sec Loss 13.4558 Epoch: 4 Global Step: 20550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:15:17,291-Speed 4279.88 samples/sec Loss 13.5102 Epoch: 4 Global Step: 20600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:15:29,136-Speed 4322.79 samples/sec Loss 13.5503 Epoch: 4 Global Step: 20650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:15:40,756-Speed 4406.25 samples/sec Loss 13.4196 Epoch: 4 Global Step: 20700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:15:52,074-Speed 4524.01 samples/sec Loss 13.4114 Epoch: 4 Global Step: 20750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:16:03,603-Speed 4441.12 samples/sec Loss 13.3998 Epoch: 4 Global Step: 20800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:16:15,256-Speed 4393.83 samples/sec Loss 13.4826 Epoch: 4 Global Step: 20850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:16:26,704-Speed 4472.65 samples/sec Loss 
13.3772 Epoch: 4 Global Step: 20900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:16:38,010-Speed 4528.74 samples/sec Loss 13.3743 Epoch: 4 Global Step: 20950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:16:49,323-Speed 4525.96 samples/sec Loss 13.4643 Epoch: 4 Global Step: 21000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:17:00,542-Speed 4564.09 samples/sec Loss 13.4284 Epoch: 4 Global Step: 21050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:17:12,027-Speed 4458.17 samples/sec Loss 13.4816 Epoch: 4 Global Step: 21100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:17:23,496-Speed 4464.23 samples/sec Loss 13.5139 Epoch: 4 Global Step: 21150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:17:34,702-Speed 4569.26 samples/sec Loss 13.5423 Epoch: 4 Global Step: 21200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:17:46,058-Speed 4508.85 samples/sec Loss 13.4395 Epoch: 4 Global Step: 21250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:17:57,354-Speed 4533.00 samples/sec Loss 13.3464 Epoch: 4 Global Step: 21300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:18:08,747-Speed 4493.83 samples/sec Loss 13.4693 Epoch: 4 Global Step: 21350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:18:19,999-Speed 4550.55 samples/sec Loss 13.4392 Epoch: 4 Global Step: 21400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:18:31,694-Speed 4378.10 samples/sec Loss 13.4768 Epoch: 4 Global Step: 21450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:18:42,994-Speed 4531.45 samples/sec Loss 13.3722 Epoch: 4 Global Step: 21500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:18:54,550-Speed 4430.70 samples/sec Loss 13.4790 Epoch: 4 Global Step: 21550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:19:05,859-Speed 4527.39 samples/sec Loss 13.5161 
Epoch: 4 Global Step: 21600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:19:17,329-Speed 4464.16 samples/sec Loss 13.4403 Epoch: 4 Global Step: 21650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:19:28,607-Speed 4539.82 samples/sec Loss 13.4921 Epoch: 4 Global Step: 21700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:19:40,018-Speed 4487.24 samples/sec Loss 13.4368 Epoch: 4 Global Step: 21750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:19:51,245-Speed 4560.57 samples/sec Loss 13.3792 Epoch: 4 Global Step: 21800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:20:02,562-Speed 4524.43 samples/sec Loss 13.3598 Epoch: 4 Global Step: 21850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:20:13,859-Speed 4532.40 samples/sec Loss 13.3793 Epoch: 4 Global Step: 21900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:20:25,210-Speed 4510.86 samples/sec Loss 13.4723 Epoch: 4 Global Step: 21950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:20:36,748-Speed 4437.81 samples/sec Loss 13.4849 Epoch: 4 Global Step: 22000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:21:07,487-[lfw][22000]XNorm: 22.083531 Training: 2021-03-14 23:21:07,487-[lfw][22000]Accuracy-Flip: 0.99450+-0.00308 Training: 2021-03-14 23:21:07,487-[lfw][22000]Accuracy-Highest: 0.99500 Training: 2021-03-14 23:21:42,923-[cfp_fp][22000]XNorm: 18.843301 Training: 2021-03-14 23:21:42,923-[cfp_fp][22000]Accuracy-Flip: 0.95071+-0.01114 Training: 2021-03-14 23:21:42,923-[cfp_fp][22000]Accuracy-Highest: 0.95071 Training: 2021-03-14 23:22:13,866-[agedb_30][22000]XNorm: 21.215162 Training: 2021-03-14 23:22:13,866-[agedb_30][22000]Accuracy-Flip: 0.95117+-0.01041 Training: 2021-03-14 23:22:13,866-[agedb_30][22000]Accuracy-Highest: 0.95717 Training: 2021-03-14 23:22:25,325-Speed 471.56 samples/sec Loss 13.4278 Epoch: 4 Global Step: 22050 Fp16 Grad Scale: 16384 
Required: 8 hours Training: 2021-03-14 23:22:36,686-Speed 4506.54 samples/sec Loss 13.3911 Epoch: 4 Global Step: 22100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:22:48,074-Speed 4496.18 samples/sec Loss 13.3521 Epoch: 4 Global Step: 22150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:22:59,589-Speed 4446.52 samples/sec Loss 13.3471 Epoch: 4 Global Step: 22200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:23:10,838-Speed 4551.87 samples/sec Loss 13.3490 Epoch: 4 Global Step: 22250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:23:22,036-Speed 4572.29 samples/sec Loss 13.3699 Epoch: 4 Global Step: 22300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:23:33,599-Speed 4428.30 samples/sec Loss 13.3687 Epoch: 4 Global Step: 22350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:23:45,228-Speed 4403.03 samples/sec Loss 13.4113 Epoch: 4 Global Step: 22400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:23:56,734-Speed 4449.79 samples/sec Loss 13.4412 Epoch: 4 Global Step: 22450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:24:08,737-Speed 4265.97 samples/sec Loss 13.3918 Epoch: 4 Global Step: 22500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:24:20,190-Speed 4470.56 samples/sec Loss 13.4414 Epoch: 4 Global Step: 22550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:24:31,804-Speed 4408.65 samples/sec Loss 13.3670 Epoch: 4 Global Step: 22600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:24:43,157-Speed 4509.96 samples/sec Loss 13.4363 Epoch: 4 Global Step: 22650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:24:54,624-Speed 4465.14 samples/sec Loss 13.4792 Epoch: 4 Global Step: 22700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:25:06,053-Speed 4480.15 samples/sec Loss 13.4776 Epoch: 4 Global Step: 22750 Fp16 Grad Scale: 16384 Required: 8 
hours Training: 2021-03-14 23:25:17,326-Speed 4542.01 samples/sec Loss 13.3849 Epoch: 4 Global Step: 22800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:25:28,671-Speed 4513.12 samples/sec Loss 13.3644 Epoch: 4 Global Step: 22850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:25:40,162-Speed 4456.03 samples/sec Loss 13.3176 Epoch: 4 Global Step: 22900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:25:51,574-Speed 4486.83 samples/sec Loss 13.3766 Epoch: 4 Global Step: 22950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:26:02,878-Speed 4529.39 samples/sec Loss 13.4822 Epoch: 4 Global Step: 23000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:26:14,135-Speed 4548.66 samples/sec Loss 13.4173 Epoch: 4 Global Step: 23050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:26:25,522-Speed 4496.48 samples/sec Loss 13.3387 Epoch: 4 Global Step: 23100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:26:36,984-Speed 4467.02 samples/sec Loss 13.4699 Epoch: 4 Global Step: 23150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:26:48,290-Speed 4528.76 samples/sec Loss 13.3400 Epoch: 4 Global Step: 23200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:26:59,944-Speed 4393.67 samples/sec Loss 13.2975 Epoch: 4 Global Step: 23250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:27:11,210-Speed 4544.74 samples/sec Loss 13.5051 Epoch: 4 Global Step: 23300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:27:22,571-Speed 4506.75 samples/sec Loss 13.4149 Epoch: 4 Global Step: 23350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:27:33,852-Speed 4538.68 samples/sec Loss 13.3635 Epoch: 4 Global Step: 23400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:27:45,165-Speed 4526.25 samples/sec Loss 13.4120 Epoch: 4 Global Step: 23450 Fp16 Grad Scale: 16384 Required: 8 hours 
Training: 2021-03-14 23:27:56,474-Speed 4527.26 samples/sec Loss 13.3263 Epoch: 4 Global Step: 23500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:28:07,909-Speed 4477.87 samples/sec Loss 13.3298 Epoch: 4 Global Step: 23550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:28:19,177-Speed 4543.82 samples/sec Loss 13.3704 Epoch: 4 Global Step: 23600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:28:30,567-Speed 4495.71 samples/sec Loss 13.4106 Epoch: 4 Global Step: 23650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:28:41,676-Speed 4609.05 samples/sec Loss 13.4415 Epoch: 4 Global Step: 23700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:28:52,931-Speed 4549.34 samples/sec Loss 13.2592 Epoch: 4 Global Step: 23750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:29:04,266-Speed 4516.98 samples/sec Loss 13.3706 Epoch: 4 Global Step: 23800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:29:15,620-Speed 4509.84 samples/sec Loss 13.3478 Epoch: 4 Global Step: 23850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:29:26,901-Speed 4538.74 samples/sec Loss 13.3412 Epoch: 4 Global Step: 23900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:29:38,156-Speed 4549.31 samples/sec Loss 13.4246 Epoch: 4 Global Step: 23950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:29:49,371-Speed 4565.44 samples/sec Loss 13.3490 Epoch: 4 Global Step: 24000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:30:20,016-[lfw][24000]XNorm: 20.616754 Training: 2021-03-14 23:30:20,016-[lfw][24000]Accuracy-Flip: 0.99550+-0.00317 Training: 2021-03-14 23:30:20,016-[lfw][24000]Accuracy-Highest: 0.99550 Training: 2021-03-14 23:30:55,569-[cfp_fp][24000]XNorm: 17.293343 Training: 2021-03-14 23:30:55,569-[cfp_fp][24000]Accuracy-Flip: 0.94971+-0.00947 Training: 2021-03-14 23:30:55,569-[cfp_fp][24000]Accuracy-Highest: 0.95071 
Training: 2021-03-14 23:31:26,321-[agedb_30][24000]XNorm: 19.885662 Training: 2021-03-14 23:31:26,321-[agedb_30][24000]Accuracy-Flip: 0.95700+-0.01035 Training: 2021-03-14 23:31:26,321-[agedb_30][24000]Accuracy-Highest: 0.95717 Training: 2021-03-14 23:31:37,676-Speed 472.74 samples/sec Loss 13.3217 Epoch: 4 Global Step: 24050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:31:49,516-Speed 4324.47 samples/sec Loss 13.3955 Epoch: 4 Global Step: 24100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:32:00,701-Speed 4577.71 samples/sec Loss 13.3074 Epoch: 4 Global Step: 24150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:32:11,982-Speed 4538.95 samples/sec Loss 13.3765 Epoch: 4 Global Step: 24200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:32:23,194-Speed 4566.85 samples/sec Loss 13.3869 Epoch: 4 Global Step: 24250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:32:34,629-Speed 4477.38 samples/sec Loss 13.3724 Epoch: 4 Global Step: 24300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:32:46,176-Speed 4434.44 samples/sec Loss 13.3306 Epoch: 4 Global Step: 24350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:32:58,249-Speed 4241.17 samples/sec Loss 13.3750 Epoch: 4 Global Step: 24400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:33:09,846-Speed 4415.06 samples/sec Loss 13.2681 Epoch: 4 Global Step: 24450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:33:21,263-Speed 4484.70 samples/sec Loss 13.2479 Epoch: 4 Global Step: 24500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:33:32,429-Speed 4585.59 samples/sec Loss 13.3061 Epoch: 4 Global Step: 24550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:33:43,927-Speed 4453.10 samples/sec Loss 13.3155 Epoch: 4 Global Step: 24600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:33:55,251-Speed 4521.44 samples/sec Loss 
13.3285 Epoch: 4 Global Step: 24650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:34:06,632-Speed 4499.08 samples/sec Loss 13.3452 Epoch: 4 Global Step: 24700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:34:18,148-Speed 4446.20 samples/sec Loss 13.3923 Epoch: 4 Global Step: 24750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:34:29,491-Speed 4514.08 samples/sec Loss 13.3488 Epoch: 4 Global Step: 24800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:34:40,789-Speed 4531.98 samples/sec Loss 13.3286 Epoch: 4 Global Step: 24850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:34:52,104-Speed 4525.08 samples/sec Loss 13.3637 Epoch: 4 Global Step: 24900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:35:07,562-Speed 3312.24 samples/sec Loss 12.7056 Epoch: 5 Global Step: 24950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:35:19,199-Speed 4399.98 samples/sec Loss 12.6161 Epoch: 5 Global Step: 25000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:35:30,603-Speed 4490.02 samples/sec Loss 12.7677 Epoch: 5 Global Step: 25050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:35:42,194-Speed 4417.28 samples/sec Loss 12.8512 Epoch: 5 Global Step: 25100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:35:53,391-Speed 4573.04 samples/sec Loss 12.9929 Epoch: 5 Global Step: 25150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:36:04,907-Speed 4446.21 samples/sec Loss 13.1001 Epoch: 5 Global Step: 25200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:36:16,241-Speed 4517.62 samples/sec Loss 13.0318 Epoch: 5 Global Step: 25250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:36:27,738-Speed 4453.52 samples/sec Loss 13.2072 Epoch: 5 Global Step: 25300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:36:39,170-Speed 4478.74 samples/sec Loss 13.1976 
Epoch: 5 Global Step: 25350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:36:50,485-Speed 4525.35 samples/sec Loss 13.2697 Epoch: 5 Global Step: 25400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:37:01,898-Speed 4486.43 samples/sec Loss 13.2648 Epoch: 5 Global Step: 25450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:37:13,242-Speed 4513.66 samples/sec Loss 13.2604 Epoch: 5 Global Step: 25500 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:37:24,551-Speed 4527.57 samples/sec Loss 13.3292 Epoch: 5 Global Step: 25550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:37:35,844-Speed 4533.84 samples/sec Loss 13.2336 Epoch: 5 Global Step: 25600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:37:47,361-Speed 4445.69 samples/sec Loss 13.1840 Epoch: 5 Global Step: 25650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:37:58,572-Speed 4567.51 samples/sec Loss 13.2785 Epoch: 5 Global Step: 25700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:38:09,971-Speed 4491.66 samples/sec Loss 13.3159 Epoch: 5 Global Step: 25750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:38:21,428-Speed 4469.12 samples/sec Loss 13.3239 Epoch: 5 Global Step: 25800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:38:33,066-Speed 4399.60 samples/sec Loss 13.3293 Epoch: 5 Global Step: 25850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:38:44,228-Speed 4587.14 samples/sec Loss 13.2235 Epoch: 5 Global Step: 25900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:38:55,554-Speed 4520.82 samples/sec Loss 13.3434 Epoch: 5 Global Step: 25950 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:39:06,793-Speed 4555.79 samples/sec Loss 13.3085 Epoch: 5 Global Step: 26000 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:39:37,521-[lfw][26000]XNorm: 20.932833 Training: 2021-03-14 
23:39:37,521-[lfw][26000]Accuracy-Flip: 0.99567+-0.00291 Training: 2021-03-14 23:39:37,521-[lfw][26000]Accuracy-Highest: 0.99567 Training: 2021-03-14 23:40:13,068-[cfp_fp][26000]XNorm: 17.977917 Training: 2021-03-14 23:40:13,068-[cfp_fp][26000]Accuracy-Flip: 0.94714+-0.01247 Training: 2021-03-14 23:40:13,068-[cfp_fp][26000]Accuracy-Highest: 0.95071 Training: 2021-03-14 23:40:43,716-[agedb_30][26000]XNorm: 20.571849 Training: 2021-03-14 23:40:43,717-[agedb_30][26000]Accuracy-Flip: 0.95917+-0.00892 Training: 2021-03-14 23:40:43,717-[agedb_30][26000]Accuracy-Highest: 0.95917 Training: 2021-03-14 23:40:55,126-Speed 472.62 samples/sec Loss 13.2928 Epoch: 5 Global Step: 26050 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:41:06,311-Speed 4577.78 samples/sec Loss 13.3012 Epoch: 5 Global Step: 26100 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:41:17,748-Speed 4476.61 samples/sec Loss 13.2387 Epoch: 5 Global Step: 26150 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:41:29,217-Speed 4464.68 samples/sec Loss 13.2625 Epoch: 5 Global Step: 26200 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:41:40,640-Speed 4482.16 samples/sec Loss 13.3144 Epoch: 5 Global Step: 26250 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:41:52,823-Speed 4202.93 samples/sec Loss 13.2641 Epoch: 5 Global Step: 26300 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:42:04,276-Speed 4470.47 samples/sec Loss 13.2906 Epoch: 5 Global Step: 26350 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:42:15,585-Speed 4527.77 samples/sec Loss 13.2830 Epoch: 5 Global Step: 26400 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:42:26,987-Speed 4490.58 samples/sec Loss 13.2224 Epoch: 5 Global Step: 26450 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:42:38,239-Speed 4550.24 samples/sec Loss 13.2425 Epoch: 5 Global Step: 26500 Fp16 Grad Scale: 16384 Required: 8 hours 
Training: 2021-03-14 23:42:49,877-Speed 4399.85 samples/sec Loss 13.2665 Epoch: 5 Global Step: 26550 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:43:01,225-Speed 4511.74 samples/sec Loss 13.2812 Epoch: 5 Global Step: 26600 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:43:12,536-Speed 4526.93 samples/sec Loss 13.3360 Epoch: 5 Global Step: 26650 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:43:24,133-Speed 4414.94 samples/sec Loss 13.3510 Epoch: 5 Global Step: 26700 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:43:35,478-Speed 4513.35 samples/sec Loss 13.2596 Epoch: 5 Global Step: 26750 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:43:46,973-Speed 4454.35 samples/sec Loss 13.2998 Epoch: 5 Global Step: 26800 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:43:58,293-Speed 4523.24 samples/sec Loss 13.2117 Epoch: 5 Global Step: 26850 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:44:09,741-Speed 4472.43 samples/sec Loss 13.2551 Epoch: 5 Global Step: 26900 Fp16 Grad Scale: 16384 Required: 8 hours Training: 2021-03-14 23:44:21,011-Speed 4543.47 samples/sec Loss 13.1987 Epoch: 5 Global Step: 26950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:44:32,326-Speed 4524.96 samples/sec Loss 13.2796 Epoch: 5 Global Step: 27000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:44:43,685-Speed 4507.57 samples/sec Loss 13.2871 Epoch: 5 Global Step: 27050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:44:54,855-Speed 4584.06 samples/sec Loss 13.2351 Epoch: 5 Global Step: 27100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:45:06,232-Speed 4500.51 samples/sec Loss 13.3412 Epoch: 5 Global Step: 27150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:45:17,522-Speed 4535.30 samples/sec Loss 13.3139 Epoch: 5 Global Step: 27200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 
2021-03-14 23:45:28,692-Speed 4583.73 samples/sec Loss 13.2896 Epoch: 5 Global Step: 27250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:45:40,198-Speed 4450.03 samples/sec Loss 13.2540 Epoch: 5 Global Step: 27300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:45:51,507-Speed 4527.48 samples/sec Loss 13.2639 Epoch: 5 Global Step: 27350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:46:02,814-Speed 4528.59 samples/sec Loss 13.1889 Epoch: 5 Global Step: 27400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:46:14,183-Speed 4503.66 samples/sec Loss 13.3042 Epoch: 5 Global Step: 27450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:46:25,541-Speed 4507.84 samples/sec Loss 13.2306 Epoch: 5 Global Step: 27500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:46:37,193-Speed 4394.55 samples/sec Loss 13.2924 Epoch: 5 Global Step: 27550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:46:48,695-Speed 4451.32 samples/sec Loss 13.2612 Epoch: 5 Global Step: 27600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:47:00,260-Speed 4427.49 samples/sec Loss 13.1974 Epoch: 5 Global Step: 27650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:47:11,691-Speed 4479.46 samples/sec Loss 13.3076 Epoch: 5 Global Step: 27700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:47:23,227-Speed 4438.33 samples/sec Loss 13.3744 Epoch: 5 Global Step: 27750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:47:34,510-Speed 4537.93 samples/sec Loss 13.2819 Epoch: 5 Global Step: 27800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:47:45,986-Speed 4461.76 samples/sec Loss 13.2732 Epoch: 5 Global Step: 27850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:47:57,307-Speed 4522.82 samples/sec Loss 13.1617 Epoch: 5 Global Step: 27900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 
23:48:08,821-Speed 4446.91 samples/sec Loss 13.2118 Epoch: 5 Global Step: 27950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:48:19,892-Speed 4624.65 samples/sec Loss 13.2060 Epoch: 5 Global Step: 28000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:48:50,739-[lfw][28000]XNorm: 22.668872 Training: 2021-03-14 23:48:50,740-[lfw][28000]Accuracy-Flip: 0.99383+-0.00373 Training: 2021-03-14 23:48:50,740-[lfw][28000]Accuracy-Highest: 0.99567 Training: 2021-03-14 23:49:26,270-[cfp_fp][28000]XNorm: 18.988123 Training: 2021-03-14 23:49:26,270-[cfp_fp][28000]Accuracy-Flip: 0.94300+-0.01193 Training: 2021-03-14 23:49:26,270-[cfp_fp][28000]Accuracy-Highest: 0.95071 Training: 2021-03-14 23:49:56,920-[agedb_30][28000]XNorm: 22.063123 Training: 2021-03-14 23:49:56,921-[agedb_30][28000]Accuracy-Flip: 0.95767+-0.00886 Training: 2021-03-14 23:49:56,921-[agedb_30][28000]Accuracy-Highest: 0.95917 Training: 2021-03-14 23:50:08,355-Speed 472.05 samples/sec Loss 13.2311 Epoch: 5 Global Step: 28050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:50:19,967-Speed 4409.57 samples/sec Loss 13.2664 Epoch: 5 Global Step: 28100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:50:31,499-Speed 4440.03 samples/sec Loss 13.2641 Epoch: 5 Global Step: 28150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:50:42,910-Speed 4487.03 samples/sec Loss 13.2964 Epoch: 5 Global Step: 28200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:50:54,676-Speed 4351.69 samples/sec Loss 13.2336 Epoch: 5 Global Step: 28250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:51:05,846-Speed 4583.90 samples/sec Loss 13.2432 Epoch: 5 Global Step: 28300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:51:17,103-Speed 4548.64 samples/sec Loss 13.1584 Epoch: 5 Global Step: 28350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:51:28,471-Speed 4504.12 samples/sec Loss 13.2666 Epoch: 
5 Global Step: 28400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:51:39,847-Speed 4500.90 samples/sec Loss 13.2779 Epoch: 5 Global Step: 28450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:51:51,807-Speed 4281.14 samples/sec Loss 13.1234 Epoch: 5 Global Step: 28500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:52:03,355-Speed 4433.70 samples/sec Loss 13.3573 Epoch: 5 Global Step: 28550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:52:14,726-Speed 4503.12 samples/sec Loss 13.1192 Epoch: 5 Global Step: 28600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:52:25,941-Speed 4565.27 samples/sec Loss 13.1465 Epoch: 5 Global Step: 28650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:52:37,159-Speed 4564.66 samples/sec Loss 13.2546 Epoch: 5 Global Step: 28700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:52:48,400-Speed 4554.82 samples/sec Loss 13.2781 Epoch: 5 Global Step: 28750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:52:59,673-Speed 4541.89 samples/sec Loss 13.2170 Epoch: 5 Global Step: 28800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:53:10,823-Speed 4592.43 samples/sec Loss 13.2499 Epoch: 5 Global Step: 28850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:53:22,329-Speed 4449.93 samples/sec Loss 13.2221 Epoch: 5 Global Step: 28900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:53:33,773-Speed 4474.08 samples/sec Loss 13.1311 Epoch: 5 Global Step: 28950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:53:45,340-Speed 4426.77 samples/sec Loss 13.3042 Epoch: 5 Global Step: 29000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:53:56,584-Speed 4553.42 samples/sec Loss 13.1957 Epoch: 5 Global Step: 29050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:54:07,965-Speed 4498.92 samples/sec Loss 13.1641 Epoch: 5 Global 
Step: 29100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:54:19,152-Speed 4576.98 samples/sec Loss 13.1609 Epoch: 5 Global Step: 29150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:54:30,683-Speed 4440.64 samples/sec Loss 13.1669 Epoch: 5 Global Step: 29200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:54:41,897-Speed 4565.87 samples/sec Loss 13.1977 Epoch: 5 Global Step: 29250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:54:53,303-Speed 4489.09 samples/sec Loss 13.1431 Epoch: 5 Global Step: 29300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:55:05,024-Speed 4368.46 samples/sec Loss 13.2484 Epoch: 5 Global Step: 29350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:55:16,471-Speed 4473.02 samples/sec Loss 13.2546 Epoch: 5 Global Step: 29400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:55:27,790-Speed 4523.48 samples/sec Loss 13.2800 Epoch: 5 Global Step: 29450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:55:39,181-Speed 4494.73 samples/sec Loss 13.2179 Epoch: 5 Global Step: 29500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:55:50,411-Speed 4559.83 samples/sec Loss 13.1937 Epoch: 5 Global Step: 29550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:56:01,730-Speed 4523.19 samples/sec Loss 13.1289 Epoch: 5 Global Step: 29600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:56:12,985-Speed 4549.31 samples/sec Loss 13.2333 Epoch: 5 Global Step: 29650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:56:24,510-Speed 4442.85 samples/sec Loss 13.1349 Epoch: 5 Global Step: 29700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:56:35,943-Speed 4478.57 samples/sec Loss 13.2542 Epoch: 5 Global Step: 29750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:56:47,230-Speed 4536.41 samples/sec Loss 13.2517 Epoch: 5 Global Step: 29800 
Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:56:58,468-Speed 4555.89 samples/sec Loss 13.1687 Epoch: 5 Global Step: 29850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:57:12,900-Speed 3547.90 samples/sec Loss 13.0882 Epoch: 6 Global Step: 29900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:57:25,692-Speed 4002.71 samples/sec Loss 12.4431 Epoch: 6 Global Step: 29950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:57:37,323-Speed 4402.16 samples/sec Loss 12.5278 Epoch: 6 Global Step: 30000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:58:08,141-[lfw][30000]XNorm: 21.577463 Training: 2021-03-14 23:58:08,141-[lfw][30000]Accuracy-Flip: 0.99517+-0.00311 Training: 2021-03-14 23:58:08,141-[lfw][30000]Accuracy-Highest: 0.99567 Training: 2021-03-14 23:58:43,682-[cfp_fp][30000]XNorm: 18.109447 Training: 2021-03-14 23:58:43,682-[cfp_fp][30000]Accuracy-Flip: 0.94271+-0.01154 Training: 2021-03-14 23:58:43,682-[cfp_fp][30000]Accuracy-Highest: 0.95071 Training: 2021-03-14 23:59:14,623-[agedb_30][30000]XNorm: 20.550435 Training: 2021-03-14 23:59:14,624-[agedb_30][30000]Accuracy-Flip: 0.95417+-0.00786 Training: 2021-03-14 23:59:14,624-[agedb_30][30000]Accuracy-Highest: 0.95917 Training: 2021-03-14 23:59:26,025-Speed 471.02 samples/sec Loss 12.6792 Epoch: 6 Global Step: 30050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:59:37,570-Speed 4434.67 samples/sec Loss 12.8247 Epoch: 6 Global Step: 30100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-14 23:59:49,505-Speed 4290.11 samples/sec Loss 12.9038 Epoch: 6 Global Step: 30150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:00:01,604-Speed 4232.24 samples/sec Loss 12.9227 Epoch: 6 Global Step: 30200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:00:13,052-Speed 4472.54 samples/sec Loss 13.0381 Epoch: 6 Global Step: 30250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 
2021-03-15 00:00:24,677-Speed 4404.23 samples/sec Loss 13.0130 Epoch: 6 Global Step: 30300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:00:35,963-Speed 4536.86 samples/sec Loss 13.0461 Epoch: 6 Global Step: 30350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:00:47,198-Speed 4557.49 samples/sec Loss 13.1478 Epoch: 6 Global Step: 30400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:00:59,016-Speed 4332.50 samples/sec Loss 13.0966 Epoch: 6 Global Step: 30450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:01:10,518-Speed 4451.62 samples/sec Loss 13.1773 Epoch: 6 Global Step: 30500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:01:21,927-Speed 4488.04 samples/sec Loss 13.1683 Epoch: 6 Global Step: 30550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:01:33,203-Speed 4540.67 samples/sec Loss 13.1020 Epoch: 6 Global Step: 30600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:01:44,499-Speed 4532.88 samples/sec Loss 13.1670 Epoch: 6 Global Step: 30650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:01:55,819-Speed 4523.04 samples/sec Loss 13.1622 Epoch: 6 Global Step: 30700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:02:07,374-Speed 4431.37 samples/sec Loss 13.1960 Epoch: 6 Global Step: 30750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:02:18,712-Speed 4515.74 samples/sec Loss 13.1730 Epoch: 6 Global Step: 30800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:02:30,236-Speed 4443.30 samples/sec Loss 13.1920 Epoch: 6 Global Step: 30850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:02:41,431-Speed 4573.68 samples/sec Loss 13.1616 Epoch: 6 Global Step: 30900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:02:52,940-Speed 4448.71 samples/sec Loss 13.1940 Epoch: 6 Global Step: 30950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 
00:03:04,329-Speed 4495.88 samples/sec Loss 13.1168 Epoch: 6 Global Step: 31000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:03:15,792-Speed 4466.88 samples/sec Loss 13.1615 Epoch: 6 Global Step: 31050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:03:27,159-Speed 4504.31 samples/sec Loss 13.1707 Epoch: 6 Global Step: 31100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:03:38,890-Speed 4364.68 samples/sec Loss 13.1186 Epoch: 6 Global Step: 31150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:03:50,083-Speed 4574.39 samples/sec Loss 13.0939 Epoch: 6 Global Step: 31200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:04:01,331-Speed 4552.17 samples/sec Loss 13.2132 Epoch: 6 Global Step: 31250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:04:12,851-Speed 4444.87 samples/sec Loss 13.1689 Epoch: 6 Global Step: 31300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:04:24,231-Speed 4499.05 samples/sec Loss 13.1840 Epoch: 6 Global Step: 31350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:04:35,645-Speed 4486.17 samples/sec Loss 13.1599 Epoch: 6 Global Step: 31400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:04:46,723-Speed 4621.67 samples/sec Loss 13.1075 Epoch: 6 Global Step: 31450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:04:58,293-Speed 4425.53 samples/sec Loss 13.1047 Epoch: 6 Global Step: 31500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:05:09,571-Speed 4539.90 samples/sec Loss 13.2277 Epoch: 6 Global Step: 31550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:05:21,048-Speed 4461.55 samples/sec Loss 13.1550 Epoch: 6 Global Step: 31600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:05:32,125-Speed 4622.27 samples/sec Loss 13.1380 Epoch: 6 Global Step: 31650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 
00:05:43,372-Speed 4552.60 samples/sec Loss 13.0682 Epoch: 6 Global Step: 31700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:05:54,763-Speed 4495.01 samples/sec Loss 13.0380 Epoch: 6 Global Step: 31750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:06:06,263-Speed 4452.23 samples/sec Loss 13.2155 Epoch: 6 Global Step: 31800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:06:17,754-Speed 4456.01 samples/sec Loss 13.1088 Epoch: 6 Global Step: 31850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:06:29,159-Speed 4489.27 samples/sec Loss 13.1048 Epoch: 6 Global Step: 31900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:06:40,571-Speed 4486.59 samples/sec Loss 13.1363 Epoch: 6 Global Step: 31950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:06:52,019-Speed 4472.87 samples/sec Loss 13.1815 Epoch: 6 Global Step: 32000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:07:22,817-[lfw][32000]XNorm: 23.122709 Training: 2021-03-15 00:07:22,817-[lfw][32000]Accuracy-Flip: 0.99433+-0.00300 Training: 2021-03-15 00:07:22,817-[lfw][32000]Accuracy-Highest: 0.99567 Training: 2021-03-15 00:07:58,501-[cfp_fp][32000]XNorm: 19.849104 Training: 2021-03-15 00:07:58,501-[cfp_fp][32000]Accuracy-Flip: 0.93771+-0.01280 Training: 2021-03-15 00:07:58,501-[cfp_fp][32000]Accuracy-Highest: 0.95071 Training: 2021-03-15 00:08:29,284-[agedb_30][32000]XNorm: 22.668103 Training: 2021-03-15 00:08:29,284-[agedb_30][32000]Accuracy-Flip: 0.96000+-0.00986 Training: 2021-03-15 00:08:29,284-[agedb_30][32000]Accuracy-Highest: 0.96000 Training: 2021-03-15 00:08:40,683-Speed 471.18 samples/sec Loss 13.2141 Epoch: 6 Global Step: 32050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:08:53,065-Speed 4134.99 samples/sec Loss 13.1498 Epoch: 6 Global Step: 32100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:09:04,624-Speed 4429.78 samples/sec Loss 13.2036 Epoch: 
6 Global Step: 32150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:09:16,027-Speed 4490.12 samples/sec Loss 13.1327 Epoch: 6 Global Step: 32200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:09:27,422-Speed 4493.55 samples/sec Loss 13.1693 Epoch: 6 Global Step: 32250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:09:38,771-Speed 4511.53 samples/sec Loss 13.1370 Epoch: 6 Global Step: 32300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:09:50,423-Speed 4394.26 samples/sec Loss 13.2549 Epoch: 6 Global Step: 32350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:10:01,804-Speed 4498.80 samples/sec Loss 13.1466 Epoch: 6 Global Step: 32400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:10:13,205-Speed 4491.36 samples/sec Loss 13.2211 Epoch: 6 Global Step: 32450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:10:24,357-Speed 4591.21 samples/sec Loss 13.1621 Epoch: 6 Global Step: 32500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:10:35,764-Speed 4488.65 samples/sec Loss 13.2005 Epoch: 6 Global Step: 32550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:10:46,956-Speed 4574.72 samples/sec Loss 13.0521 Epoch: 6 Global Step: 32600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:10:58,296-Speed 4515.30 samples/sec Loss 13.0605 Epoch: 6 Global Step: 32650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:11:09,517-Speed 4563.05 samples/sec Loss 13.1531 Epoch: 6 Global Step: 32700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:11:21,045-Speed 4441.58 samples/sec Loss 13.0805 Epoch: 6 Global Step: 32750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:11:32,255-Speed 4567.68 samples/sec Loss 13.2175 Epoch: 6 Global Step: 32800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:11:43,895-Speed 4398.73 samples/sec Loss 13.1915 Epoch: 6 Global 
Step: 32850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:11:55,133-Speed 4556.23 samples/sec Loss 13.2749 Epoch: 6 Global Step: 32900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:12:06,976-Speed 4323.59 samples/sec Loss 13.1743 Epoch: 6 Global Step: 32950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:12:18,361-Speed 4497.19 samples/sec Loss 13.0542 Epoch: 6 Global Step: 33000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:12:29,733-Speed 4502.35 samples/sec Loss 13.1443 Epoch: 6 Global Step: 33050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:12:41,031-Speed 4532.17 samples/sec Loss 13.1376 Epoch: 6 Global Step: 33100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:12:52,546-Speed 4446.46 samples/sec Loss 13.0702 Epoch: 6 Global Step: 33150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:13:03,992-Speed 4473.50 samples/sec Loss 13.0948 Epoch: 6 Global Step: 33200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:13:15,279-Speed 4536.34 samples/sec Loss 13.1662 Epoch: 6 Global Step: 33250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:13:26,475-Speed 4573.37 samples/sec Loss 13.1017 Epoch: 6 Global Step: 33300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:13:37,895-Speed 4483.49 samples/sec Loss 13.2006 Epoch: 6 Global Step: 33350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:13:49,222-Speed 4520.10 samples/sec Loss 13.0324 Epoch: 6 Global Step: 33400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:14:00,634-Speed 4486.98 samples/sec Loss 13.1263 Epoch: 6 Global Step: 33450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:14:11,930-Speed 4532.45 samples/sec Loss 13.1505 Epoch: 6 Global Step: 33500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:14:23,248-Speed 4524.00 samples/sec Loss 13.1003 Epoch: 6 Global Step: 33550 
Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:14:34,558-Speed 4527.46 samples/sec Loss 13.0922 Epoch: 6 Global Step: 33600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:14:45,808-Speed 4551.28 samples/sec Loss 13.1096 Epoch: 6 Global Step: 33650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:14:57,207-Speed 4491.84 samples/sec Loss 13.2402 Epoch: 6 Global Step: 33700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:15:08,860-Speed 4393.95 samples/sec Loss 13.0399 Epoch: 6 Global Step: 33750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:15:20,281-Speed 4483.17 samples/sec Loss 13.1209 Epoch: 6 Global Step: 33800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:15:31,537-Speed 4548.84 samples/sec Loss 13.1101 Epoch: 6 Global Step: 33850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:15:43,041-Speed 4450.58 samples/sec Loss 13.0965 Epoch: 6 Global Step: 33900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:15:54,770-Speed 4365.45 samples/sec Loss 13.1210 Epoch: 6 Global Step: 33950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:16:06,328-Speed 4430.22 samples/sec Loss 13.1359 Epoch: 6 Global Step: 34000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:16:37,169-[lfw][34000]XNorm: 21.280032 Training: 2021-03-15 00:16:37,169-[lfw][34000]Accuracy-Flip: 0.99533+-0.00194 Training: 2021-03-15 00:16:37,169-[lfw][34000]Accuracy-Highest: 0.99567 Training: 2021-03-15 00:17:12,731-[cfp_fp][34000]XNorm: 17.974097 Training: 2021-03-15 00:17:12,731-[cfp_fp][34000]Accuracy-Flip: 0.95043+-0.01132 Training: 2021-03-15 00:17:12,731-[cfp_fp][34000]Accuracy-Highest: 0.95071 Training: 2021-03-15 00:17:43,503-[agedb_30][34000]XNorm: 20.803966 Training: 2021-03-15 00:17:43,503-[agedb_30][34000]Accuracy-Flip: 0.96017+-0.00794 Training: 2021-03-15 00:17:43,503-[agedb_30][34000]Accuracy-Highest: 0.96017 Training: 
2021-03-15 00:17:55,003-Speed 471.13 samples/sec Loss 13.1043 Epoch: 6 Global Step: 34050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:18:06,433-Speed 4479.49 samples/sec Loss 13.1021 Epoch: 6 Global Step: 34100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:18:17,592-Speed 4588.34 samples/sec Loss 13.2036 Epoch: 6 Global Step: 34150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:18:28,921-Speed 4519.90 samples/sec Loss 13.1613 Epoch: 6 Global Step: 34200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:18:40,248-Speed 4520.30 samples/sec Loss 13.1007 Epoch: 6 Global Step: 34250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:18:51,892-Speed 4397.40 samples/sec Loss 13.1483 Epoch: 6 Global Step: 34300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:19:03,178-Speed 4536.80 samples/sec Loss 13.1049 Epoch: 6 Global Step: 34350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:19:14,611-Speed 4478.23 samples/sec Loss 13.0461 Epoch: 6 Global Step: 34400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:19:25,946-Speed 4517.23 samples/sec Loss 13.0505 Epoch: 6 Global Step: 34450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:19:37,460-Speed 4447.02 samples/sec Loss 13.0299 Epoch: 6 Global Step: 34500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:19:48,790-Speed 4519.10 samples/sec Loss 13.0978 Epoch: 6 Global Step: 34550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:20:00,106-Speed 4524.61 samples/sec Loss 13.1215 Epoch: 6 Global Step: 34600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:20:11,556-Speed 4472.12 samples/sec Loss 13.0861 Epoch: 6 Global Step: 34650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:20:23,160-Speed 4412.37 samples/sec Loss 13.0784 Epoch: 6 Global Step: 34700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 
00:20:34,681-Speed 4444.27 samples/sec Loss 13.1018 Epoch: 6 Global Step: 34750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:20:46,357-Speed 4385.26 samples/sec Loss 13.1093 Epoch: 6 Global Step: 34800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:20:57,759-Speed 4490.66 samples/sec Loss 13.1267 Epoch: 6 Global Step: 34850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:21:12,217-Speed 3541.34 samples/sec Loss 12.6871 Epoch: 7 Global Step: 34900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:21:24,128-Speed 4298.86 samples/sec Loss 12.4920 Epoch: 7 Global Step: 34950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:21:35,705-Speed 4422.81 samples/sec Loss 12.5356 Epoch: 7 Global Step: 35000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:21:47,240-Speed 4439.08 samples/sec Loss 12.6376 Epoch: 7 Global Step: 35050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:21:58,572-Speed 4518.34 samples/sec Loss 12.6909 Epoch: 7 Global Step: 35100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:22:10,061-Speed 4456.78 samples/sec Loss 12.8033 Epoch: 7 Global Step: 35150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:22:21,444-Speed 4497.98 samples/sec Loss 12.8934 Epoch: 7 Global Step: 35200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:22:32,998-Speed 4431.56 samples/sec Loss 12.9738 Epoch: 7 Global Step: 35250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:22:44,383-Speed 4497.51 samples/sec Loss 12.9697 Epoch: 7 Global Step: 35300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:22:55,833-Speed 4471.87 samples/sec Loss 12.9466 Epoch: 7 Global Step: 35350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:23:07,133-Speed 4530.86 samples/sec Loss 12.9638 Epoch: 7 Global Step: 35400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 
00:23:18,572-Speed 4476.29 samples/sec Loss 12.9913 Epoch: 7 Global Step: 35450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:23:29,773-Speed 4571.32 samples/sec Loss 12.9738 Epoch: 7 Global Step: 35500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:23:41,611-Speed 4325.21 samples/sec Loss 13.1075 Epoch: 7 Global Step: 35550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:23:52,875-Speed 4545.42 samples/sec Loss 13.0143 Epoch: 7 Global Step: 35600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:24:04,243-Speed 4504.07 samples/sec Loss 12.9331 Epoch: 7 Global Step: 35650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:24:15,900-Speed 4392.45 samples/sec Loss 13.0947 Epoch: 7 Global Step: 35700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:24:27,393-Speed 4455.14 samples/sec Loss 13.1183 Epoch: 7 Global Step: 35750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:24:38,803-Speed 4487.64 samples/sec Loss 13.0648 Epoch: 7 Global Step: 35800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:24:50,276-Speed 4462.71 samples/sec Loss 13.1440 Epoch: 7 Global Step: 35850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:25:02,445-Speed 4207.50 samples/sec Loss 13.1379 Epoch: 7 Global Step: 35900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:25:14,126-Speed 4383.41 samples/sec Loss 13.1676 Epoch: 7 Global Step: 35950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:25:25,662-Speed 4438.44 samples/sec Loss 13.1543 Epoch: 7 Global Step: 36000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:25:56,465-[lfw][36000]XNorm: 22.059684 Training: 2021-03-15 00:25:56,466-[lfw][36000]Accuracy-Flip: 0.99500+-0.00408 Training: 2021-03-15 00:25:56,466-[lfw][36000]Accuracy-Highest: 0.99567 Training: 2021-03-15 00:26:32,026-[cfp_fp][36000]XNorm: 18.507055 Training: 2021-03-15 
00:26:32,027-[cfp_fp][36000]Accuracy-Flip: 0.94100+-0.01305 Training: 2021-03-15 00:26:32,027-[cfp_fp][36000]Accuracy-Highest: 0.95071 Training: 2021-03-15 00:27:02,854-[agedb_30][36000]XNorm: 21.136847 Training: 2021-03-15 00:27:02,855-[agedb_30][36000]Accuracy-Flip: 0.95533+-0.01080 Training: 2021-03-15 00:27:02,855-[agedb_30][36000]Accuracy-Highest: 0.96017 Training: 2021-03-15 00:27:14,427-Speed 470.74 samples/sec Loss 13.1145 Epoch: 7 Global Step: 36050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:27:25,862-Speed 4477.63 samples/sec Loss 13.2046 Epoch: 7 Global Step: 36100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:27:37,636-Speed 4348.89 samples/sec Loss 13.0677 Epoch: 7 Global Step: 36150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:27:49,204-Speed 4426.35 samples/sec Loss 12.9338 Epoch: 7 Global Step: 36200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:28:00,389-Speed 4577.77 samples/sec Loss 13.0557 Epoch: 7 Global Step: 36250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:28:11,790-Speed 4490.81 samples/sec Loss 13.0370 Epoch: 7 Global Step: 36300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:28:23,077-Speed 4536.57 samples/sec Loss 13.1232 Epoch: 7 Global Step: 36350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:28:34,615-Speed 4437.69 samples/sec Loss 13.1016 Epoch: 7 Global Step: 36400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:28:46,329-Speed 4371.06 samples/sec Loss 13.0459 Epoch: 7 Global Step: 36450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:28:57,830-Speed 4451.84 samples/sec Loss 13.1192 Epoch: 7 Global Step: 36500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:29:09,065-Speed 4557.57 samples/sec Loss 13.0659 Epoch: 7 Global Step: 36550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:29:20,425-Speed 4507.19 samples/sec Loss 13.1150 
Epoch: 7 Global Step: 36600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:29:31,818-Speed 4493.99 samples/sec Loss 13.0498 Epoch: 7 Global Step: 36650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:29:43,466-Speed 4395.68 samples/sec Loss 13.1540 Epoch: 7 Global Step: 36700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:29:54,731-Speed 4545.33 samples/sec Loss 13.0256 Epoch: 7 Global Step: 36750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:30:06,180-Speed 4472.17 samples/sec Loss 13.0526 Epoch: 7 Global Step: 36800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:30:17,548-Speed 4504.09 samples/sec Loss 13.0611 Epoch: 7 Global Step: 36850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:30:29,074-Speed 4442.40 samples/sec Loss 13.0677 Epoch: 7 Global Step: 36900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:30:40,433-Speed 4507.74 samples/sec Loss 12.9465 Epoch: 7 Global Step: 36950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:30:51,938-Speed 4450.41 samples/sec Loss 13.0622 Epoch: 7 Global Step: 37000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:31:03,328-Speed 4495.49 samples/sec Loss 13.0110 Epoch: 7 Global Step: 37050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:31:14,784-Speed 4469.13 samples/sec Loss 13.0334 Epoch: 7 Global Step: 37100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:31:26,093-Speed 4527.57 samples/sec Loss 13.0979 Epoch: 7 Global Step: 37150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:31:37,341-Speed 4552.10 samples/sec Loss 13.1146 Epoch: 7 Global Step: 37200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:31:48,643-Speed 4530.58 samples/sec Loss 13.0843 Epoch: 7 Global Step: 37250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:32:00,031-Speed 4496.18 samples/sec Loss 13.0193 Epoch: 7 
Global Step: 37300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:32:11,515-Speed 4458.52 samples/sec Loss 13.0170 Epoch: 7 Global Step: 37350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:32:22,889-Speed 4501.88 samples/sec Loss 13.0009 Epoch: 7 Global Step: 37400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:32:34,621-Speed 4364.31 samples/sec Loss 13.0318 Epoch: 7 Global Step: 37450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:32:45,741-Speed 4604.20 samples/sec Loss 13.0242 Epoch: 7 Global Step: 37500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:32:57,066-Speed 4521.46 samples/sec Loss 13.0448 Epoch: 7 Global Step: 37550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:33:08,316-Speed 4551.06 samples/sec Loss 13.0369 Epoch: 7 Global Step: 37600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:33:19,677-Speed 4507.10 samples/sec Loss 13.0464 Epoch: 7 Global Step: 37650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:33:30,953-Speed 4540.87 samples/sec Loss 13.0331 Epoch: 7 Global Step: 37700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:33:42,766-Speed 4334.32 samples/sec Loss 12.9925 Epoch: 7 Global Step: 37750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:33:54,262-Speed 4453.74 samples/sec Loss 13.0548 Epoch: 7 Global Step: 37800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:34:05,699-Speed 4477.13 samples/sec Loss 13.0970 Epoch: 7 Global Step: 37850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:34:17,228-Speed 4441.18 samples/sec Loss 13.1105 Epoch: 7 Global Step: 37900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:34:28,569-Speed 4514.52 samples/sec Loss 12.9836 Epoch: 7 Global Step: 37950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:34:39,775-Speed 4569.11 samples/sec Loss 13.0438 Epoch: 7 Global 
Step: 38000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:35:10,641-[lfw][38000]XNorm: 22.556278 Training: 2021-03-15 00:35:10,642-[lfw][38000]Accuracy-Flip: 0.99500+-0.00333 Training: 2021-03-15 00:35:10,642-[lfw][38000]Accuracy-Highest: 0.99567 Training: 2021-03-15 00:35:46,515-[cfp_fp][38000]XNorm: 19.134774 Training: 2021-03-15 00:35:46,515-[cfp_fp][38000]Accuracy-Flip: 0.95200+-0.00888 Training: 2021-03-15 00:35:46,515-[cfp_fp][38000]Accuracy-Highest: 0.95200 Training: 2021-03-15 00:36:17,434-[agedb_30][38000]XNorm: 21.708122 Training: 2021-03-15 00:36:17,434-[agedb_30][38000]Accuracy-Flip: 0.95933+-0.00821 Training: 2021-03-15 00:36:17,434-[agedb_30][38000]Accuracy-Highest: 0.96017 Training: 2021-03-15 00:36:28,823-Speed 469.52 samples/sec Loss 13.1926 Epoch: 7 Global Step: 38050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:36:40,405-Speed 4420.88 samples/sec Loss 13.1113 Epoch: 7 Global Step: 38100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:36:51,819-Speed 4485.62 samples/sec Loss 13.0549 Epoch: 7 Global Step: 38150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:37:03,037-Speed 4564.59 samples/sec Loss 12.9513 Epoch: 7 Global Step: 38200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:37:14,582-Speed 4434.95 samples/sec Loss 13.0659 Epoch: 7 Global Step: 38250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:37:26,102-Speed 4444.53 samples/sec Loss 13.0213 Epoch: 7 Global Step: 38300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:37:37,533-Speed 4479.40 samples/sec Loss 13.0849 Epoch: 7 Global Step: 38350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:37:48,895-Speed 4506.42 samples/sec Loss 13.0564 Epoch: 7 Global Step: 38400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:38:00,351-Speed 4469.38 samples/sec Loss 13.0444 Epoch: 7 Global Step: 38450 Fp16 Grad Scale: 16384 Required: 7 hours 
Training: 2021-03-15 00:38:11,799-Speed 4472.64 samples/sec Loss 12.9084 Epoch: 7 Global Step: 38500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:38:23,151-Speed 4510.29 samples/sec Loss 13.0665 Epoch: 7 Global Step: 38550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:38:34,549-Speed 4492.16 samples/sec Loss 13.0461 Epoch: 7 Global Step: 38600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:38:45,716-Speed 4585.20 samples/sec Loss 13.0538 Epoch: 7 Global Step: 38650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:38:57,014-Speed 4531.89 samples/sec Loss 13.0140 Epoch: 7 Global Step: 38700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:39:08,310-Speed 4532.75 samples/sec Loss 13.1031 Epoch: 7 Global Step: 38750 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:39:19,587-Speed 4540.60 samples/sec Loss 13.0538 Epoch: 7 Global Step: 38800 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:39:30,885-Speed 4531.93 samples/sec Loss 13.0998 Epoch: 7 Global Step: 38850 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:39:42,481-Speed 4415.56 samples/sec Loss 13.0678 Epoch: 7 Global Step: 38900 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:39:53,709-Speed 4560.01 samples/sec Loss 13.0352 Epoch: 7 Global Step: 38950 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:40:05,287-Speed 4422.37 samples/sec Loss 12.9508 Epoch: 7 Global Step: 39000 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:40:16,578-Speed 4534.71 samples/sec Loss 13.0183 Epoch: 7 Global Step: 39050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:40:28,025-Speed 4473.04 samples/sec Loss 13.0184 Epoch: 7 Global Step: 39100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:40:39,236-Speed 4567.19 samples/sec Loss 12.9240 Epoch: 7 Global Step: 39150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 
2021-03-15 00:40:50,839-Speed 4412.87 samples/sec Loss 13.0834 Epoch: 7 Global Step: 39200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:41:02,462-Speed 4405.22 samples/sec Loss 13.0689 Epoch: 7 Global Step: 39250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:41:13,848-Speed 4497.19 samples/sec Loss 13.0869 Epoch: 7 Global Step: 39300 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:41:25,382-Speed 4438.93 samples/sec Loss 12.9291 Epoch: 7 Global Step: 39350 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:41:36,675-Speed 4534.35 samples/sec Loss 13.0496 Epoch: 7 Global Step: 39400 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:41:47,880-Speed 4569.27 samples/sec Loss 13.0786 Epoch: 7 Global Step: 39450 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:41:59,345-Speed 4466.09 samples/sec Loss 13.0014 Epoch: 7 Global Step: 39500 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:42:10,685-Speed 4515.29 samples/sec Loss 13.0130 Epoch: 7 Global Step: 39550 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:42:21,888-Speed 4570.37 samples/sec Loss 13.0782 Epoch: 7 Global Step: 39600 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:42:33,267-Speed 4499.65 samples/sec Loss 13.0631 Epoch: 7 Global Step: 39650 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:42:44,719-Speed 4471.10 samples/sec Loss 13.0405 Epoch: 7 Global Step: 39700 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:42:56,243-Speed 4443.16 samples/sec Loss 12.9228 Epoch: 7 Global Step: 39750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:43:07,707-Speed 4466.20 samples/sec Loss 13.0092 Epoch: 7 Global Step: 39800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:43:19,266-Speed 4429.61 samples/sec Loss 13.0698 Epoch: 7 Global Step: 39850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 
00:43:34,359-Speed 3392.36 samples/sec Loss 12.4180 Epoch: 8 Global Step: 39900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:43:46,050-Speed 4379.80 samples/sec Loss 12.3766 Epoch: 8 Global Step: 39950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:43:57,353-Speed 4530.18 samples/sec Loss 12.5612 Epoch: 8 Global Step: 40000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:44:28,029-[lfw][40000]XNorm: 22.939969 Training: 2021-03-15 00:44:28,030-[lfw][40000]Accuracy-Flip: 0.99550+-0.00350 Training: 2021-03-15 00:44:28,030-[lfw][40000]Accuracy-Highest: 0.99567 Training: 2021-03-15 00:45:03,516-[cfp_fp][40000]XNorm: 19.491800 Training: 2021-03-15 00:45:03,516-[cfp_fp][40000]Accuracy-Flip: 0.94343+-0.01123 Training: 2021-03-15 00:45:03,516-[cfp_fp][40000]Accuracy-Highest: 0.95200 Training: 2021-03-15 00:45:34,146-[agedb_30][40000]XNorm: 22.356961 Training: 2021-03-15 00:45:34,146-[agedb_30][40000]Accuracy-Flip: 0.95583+-0.01039 Training: 2021-03-15 00:45:34,146-[agedb_30][40000]Accuracy-Highest: 0.96017 Training: 2021-03-15 00:45:45,944-Speed 471.49 samples/sec Loss 12.5990 Epoch: 8 Global Step: 40050 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:45:57,867-Speed 4294.47 samples/sec Loss 12.7360 Epoch: 8 Global Step: 40100 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:46:09,456-Speed 4418.00 samples/sec Loss 12.7564 Epoch: 8 Global Step: 40150 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:46:20,734-Speed 4540.14 samples/sec Loss 12.8359 Epoch: 8 Global Step: 40200 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:46:32,156-Speed 4482.95 samples/sec Loss 12.9374 Epoch: 8 Global Step: 40250 Fp16 Grad Scale: 16384 Required: 7 hours Training: 2021-03-15 00:46:43,365-Speed 4567.63 samples/sec Loss 12.8067 Epoch: 8 Global Step: 40300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:46:54,831-Speed 4465.65 samples/sec Loss 12.8154 Epoch: 
8 Global Step: 40350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:47:06,097-Speed 4544.81 samples/sec Loss 12.9664 Epoch: 8 Global Step: 40400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:47:17,532-Speed 4477.81 samples/sec Loss 12.9277 Epoch: 8 Global Step: 40450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:47:28,940-Speed 4488.14 samples/sec Loss 13.0219 Epoch: 8 Global Step: 40500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:47:40,590-Speed 4395.34 samples/sec Loss 13.0138 Epoch: 8 Global Step: 40550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:47:51,807-Speed 4564.53 samples/sec Loss 12.9604 Epoch: 8 Global Step: 40600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:48:03,210-Speed 4490.41 samples/sec Loss 13.0134 Epoch: 8 Global Step: 40650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:48:14,569-Speed 4507.36 samples/sec Loss 13.0094 Epoch: 8 Global Step: 40700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:48:25,866-Speed 4532.42 samples/sec Loss 12.9836 Epoch: 8 Global Step: 40750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:48:37,220-Speed 4509.90 samples/sec Loss 13.0816 Epoch: 8 Global Step: 40800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:48:48,892-Speed 4386.70 samples/sec Loss 13.0896 Epoch: 8 Global Step: 40850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:49:00,168-Speed 4540.99 samples/sec Loss 13.0029 Epoch: 8 Global Step: 40900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:49:11,527-Speed 4507.29 samples/sec Loss 13.0025 Epoch: 8 Global Step: 40950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:49:23,077-Speed 4433.28 samples/sec Loss 12.9004 Epoch: 8 Global Step: 41000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:49:34,484-Speed 4488.59 samples/sec Loss 13.0273 Epoch: 8 Global 
Step: 41050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:49:45,904-Speed 4483.72 samples/sec Loss 12.9988 Epoch: 8 Global Step: 41100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:49:57,080-Speed 4581.55 samples/sec Loss 12.9397 Epoch: 8 Global Step: 41150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:50:08,469-Speed 4495.50 samples/sec Loss 12.9904 Epoch: 8 Global Step: 41200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:50:19,715-Speed 4552.89 samples/sec Loss 13.0538 Epoch: 8 Global Step: 41250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:50:31,574-Speed 4317.52 samples/sec Loss 13.0632 Epoch: 8 Global Step: 41300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:50:42,819-Speed 4553.45 samples/sec Loss 13.0599 Epoch: 8 Global Step: 41350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:50:54,201-Speed 4498.47 samples/sec Loss 13.0393 Epoch: 8 Global Step: 41400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:51:05,475-Speed 4541.69 samples/sec Loss 13.0833 Epoch: 8 Global Step: 41450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:51:16,903-Speed 4480.63 samples/sec Loss 12.9958 Epoch: 8 Global Step: 41500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:51:28,404-Speed 4451.93 samples/sec Loss 13.1174 Epoch: 8 Global Step: 41550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:51:39,902-Speed 4452.99 samples/sec Loss 13.0076 Epoch: 8 Global Step: 41600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:51:51,538-Speed 4400.45 samples/sec Loss 12.9565 Epoch: 8 Global Step: 41650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:52:03,507-Speed 4277.94 samples/sec Loss 12.9857 Epoch: 8 Global Step: 41700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:52:14,638-Speed 4599.95 samples/sec Loss 12.9963 Epoch: 8 Global Step: 41750 
Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:52:26,063-Speed 4481.56 samples/sec Loss 12.9460 Epoch: 8 Global Step: 41800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:52:37,488-Speed 4481.70 samples/sec Loss 12.9933 Epoch: 8 Global Step: 41850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:52:49,379-Speed 4305.71 samples/sec Loss 12.9591 Epoch: 8 Global Step: 41900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:53:00,821-Speed 4475.09 samples/sec Loss 12.9861 Epoch: 8 Global Step: 41950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:53:12,270-Speed 4472.27 samples/sec Loss 13.0133 Epoch: 8 Global Step: 42000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:53:43,098-[lfw][42000]XNorm: 21.253726 Training: 2021-03-15 00:53:43,099-[lfw][42000]Accuracy-Flip: 0.99550+-0.00299 Training: 2021-03-15 00:53:43,099-[lfw][42000]Accuracy-Highest: 0.99567 Training: 2021-03-15 00:54:18,933-[cfp_fp][42000]XNorm: 18.198212 Training: 2021-03-15 00:54:18,934-[cfp_fp][42000]Accuracy-Flip: 0.94329+-0.01525 Training: 2021-03-15 00:54:18,934-[cfp_fp][42000]Accuracy-Highest: 0.95200 Training: 2021-03-15 00:54:49,640-[agedb_30][42000]XNorm: 20.608816 Training: 2021-03-15 00:54:49,641-[agedb_30][42000]Accuracy-Flip: 0.96250+-0.00720 Training: 2021-03-15 00:54:49,641-[agedb_30][42000]Accuracy-Highest: 0.96250 Training: 2021-03-15 00:55:00,857-Speed 471.51 samples/sec Loss 13.0216 Epoch: 8 Global Step: 42050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:55:12,155-Speed 4532.24 samples/sec Loss 13.0378 Epoch: 8 Global Step: 42100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:55:23,564-Speed 4487.77 samples/sec Loss 13.0275 Epoch: 8 Global Step: 42150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:55:34,928-Speed 4505.52 samples/sec Loss 13.0490 Epoch: 8 Global Step: 42200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-15 00:55:46,364-Speed 4477.46 samples/sec Loss 12.9445 Epoch: 8 Global Step: 42250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:55:57,708-Speed 4513.57 samples/sec Loss 13.0229 Epoch: 8 Global Step: 42300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:56:09,382-Speed 4385.94 samples/sec Loss 13.0094 Epoch: 8 Global Step: 42350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:56:20,679-Speed 4532.40 samples/sec Loss 13.0753 Epoch: 8 Global Step: 42400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:56:32,042-Speed 4506.10 samples/sec Loss 12.9929 Epoch: 8 Global Step: 42450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:56:43,314-Speed 4542.39 samples/sec Loss 13.0128 Epoch: 8 Global Step: 42500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:56:54,850-Speed 4438.36 samples/sec Loss 12.9525 Epoch: 8 Global Step: 42550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:57:06,212-Speed 4506.47 samples/sec Loss 12.9818 Epoch: 8 Global Step: 42600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:57:17,481-Speed 4543.82 samples/sec Loss 12.8954 Epoch: 8 Global Step: 42650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:57:28,746-Speed 4545.20 samples/sec Loss 13.0176 Epoch: 8 Global Step: 42700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:57:40,265-Speed 4445.16 samples/sec Loss 13.1197 Epoch: 8 Global Step: 42750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:57:51,841-Speed 4422.95 samples/sec Loss 12.9483 Epoch: 8 Global Step: 42800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:58:03,145-Speed 4529.72 samples/sec Loss 13.0214 Epoch: 8 Global Step: 42850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:58:14,530-Speed 4497.37 samples/sec Loss 13.1063 Epoch: 8 Global Step: 42900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 
00:58:26,003-Speed 4462.92 samples/sec Loss 12.9085 Epoch: 8 Global Step: 42950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:58:37,331-Speed 4519.94 samples/sec Loss 12.8674 Epoch: 8 Global Step: 43000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:58:48,654-Speed 4521.91 samples/sec Loss 12.9882 Epoch: 8 Global Step: 43050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:59:00,009-Speed 4509.23 samples/sec Loss 12.9099 Epoch: 8 Global Step: 43100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:59:11,282-Speed 4541.95 samples/sec Loss 12.9493 Epoch: 8 Global Step: 43150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:59:22,607-Speed 4521.08 samples/sec Loss 12.9542 Epoch: 8 Global Step: 43200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:59:34,223-Speed 4408.13 samples/sec Loss 12.9897 Epoch: 8 Global Step: 43250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:59:45,716-Speed 4454.94 samples/sec Loss 12.9547 Epoch: 8 Global Step: 43300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 00:59:56,927-Speed 4567.11 samples/sec Loss 12.9219 Epoch: 8 Global Step: 43350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:00:08,282-Speed 4509.14 samples/sec Loss 12.9352 Epoch: 8 Global Step: 43400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:00:19,350-Speed 4626.34 samples/sec Loss 12.9966 Epoch: 8 Global Step: 43450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:00:30,812-Speed 4467.11 samples/sec Loss 12.8586 Epoch: 8 Global Step: 43500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:00:42,221-Speed 4487.76 samples/sec Loss 12.9440 Epoch: 8 Global Step: 43550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:00:54,130-Speed 4299.49 samples/sec Loss 12.9350 Epoch: 8 Global Step: 43600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 
01:01:05,993-Speed 4316.10 samples/sec Loss 12.9709 Epoch: 8 Global Step: 43650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:01:17,890-Speed 4303.91 samples/sec Loss 12.9741 Epoch: 8 Global Step: 43700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:01:29,028-Speed 4596.88 samples/sec Loss 12.9820 Epoch: 8 Global Step: 43750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:01:40,426-Speed 4492.46 samples/sec Loss 13.0091 Epoch: 8 Global Step: 43800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:01:51,555-Speed 4600.84 samples/sec Loss 12.8953 Epoch: 8 Global Step: 43850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:02:03,003-Speed 4472.38 samples/sec Loss 13.0040 Epoch: 8 Global Step: 43900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:02:14,590-Speed 4418.94 samples/sec Loss 12.9454 Epoch: 8 Global Step: 43950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:02:25,910-Speed 4523.07 samples/sec Loss 12.9539 Epoch: 8 Global Step: 44000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:02:56,509-[lfw][44000]XNorm: 23.549339 Training: 2021-03-15 01:02:56,509-[lfw][44000]Accuracy-Flip: 0.99500+-0.00342 Training: 2021-03-15 01:02:56,509-[lfw][44000]Accuracy-Highest: 0.99567 Training: 2021-03-15 01:03:31,940-[cfp_fp][44000]XNorm: 20.113040 Training: 2021-03-15 01:03:31,940-[cfp_fp][44000]Accuracy-Flip: 0.94829+-0.01269 Training: 2021-03-15 01:03:31,940-[cfp_fp][44000]Accuracy-Highest: 0.95200 Training: 2021-03-15 01:04:02,530-[agedb_30][44000]XNorm: 22.796351 Training: 2021-03-15 01:04:02,530-[agedb_30][44000]Accuracy-Flip: 0.95933+-0.01088 Training: 2021-03-15 01:04:02,530-[agedb_30][44000]Accuracy-Highest: 0.96250 Training: 2021-03-15 01:04:13,928-Speed 474.00 samples/sec Loss 12.9478 Epoch: 8 Global Step: 44050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:04:25,291-Speed 4506.31 samples/sec Loss 12.9533 Epoch: 
8 Global Step: 44100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:04:36,684-Speed 4493.88 samples/sec Loss 13.0384 Epoch: 8 Global Step: 44150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:04:47,896-Speed 4567.07 samples/sec Loss 12.9558 Epoch: 8 Global Step: 44200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:04:59,331-Speed 4477.45 samples/sec Loss 12.9844 Epoch: 8 Global Step: 44250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:05:10,680-Speed 4511.75 samples/sec Loss 12.9086 Epoch: 8 Global Step: 44300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:05:22,173-Speed 4455.23 samples/sec Loss 12.8520 Epoch: 8 Global Step: 44350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:05:33,354-Speed 4579.23 samples/sec Loss 12.9516 Epoch: 8 Global Step: 44400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:05:44,672-Speed 4524.06 samples/sec Loss 12.9629 Epoch: 8 Global Step: 44450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:05:56,214-Speed 4436.21 samples/sec Loss 12.9610 Epoch: 8 Global Step: 44500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:06:07,785-Speed 4424.94 samples/sec Loss 13.0322 Epoch: 8 Global Step: 44550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:06:19,114-Speed 4519.40 samples/sec Loss 12.9569 Epoch: 8 Global Step: 44600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:06:30,415-Speed 4530.68 samples/sec Loss 13.0060 Epoch: 8 Global Step: 44650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:06:41,641-Speed 4561.13 samples/sec Loss 12.9935 Epoch: 8 Global Step: 44700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:06:53,108-Speed 4465.22 samples/sec Loss 12.9736 Epoch: 8 Global Step: 44750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:07:04,415-Speed 4528.47 samples/sec Loss 12.9320 Epoch: 8 Global 
Step: 44800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:07:19,009-Speed 3508.53 samples/sec Loss 12.8389 Epoch: 9 Global Step: 44850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:07:31,080-Speed 4241.79 samples/sec Loss 12.2108 Epoch: 9 Global Step: 44900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:07:42,765-Speed 4382.04 samples/sec Loss 12.3629 Epoch: 9 Global Step: 44950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:07:54,060-Speed 4533.07 samples/sec Loss 12.4679 Epoch: 9 Global Step: 45000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:08:05,492-Speed 4479.04 samples/sec Loss 12.5907 Epoch: 9 Global Step: 45050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:08:17,268-Speed 4347.86 samples/sec Loss 12.6296 Epoch: 9 Global Step: 45100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:08:28,858-Speed 4417.93 samples/sec Loss 12.7133 Epoch: 9 Global Step: 45150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:08:40,223-Speed 4505.44 samples/sec Loss 12.8577 Epoch: 9 Global Step: 45200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:08:51,673-Speed 4471.66 samples/sec Loss 12.8440 Epoch: 9 Global Step: 45250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:09:03,004-Speed 4518.88 samples/sec Loss 12.8122 Epoch: 9 Global Step: 45300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:09:14,753-Speed 4358.06 samples/sec Loss 12.8702 Epoch: 9 Global Step: 45350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:09:26,439-Speed 4381.47 samples/sec Loss 12.9906 Epoch: 9 Global Step: 45400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:09:37,737-Speed 4531.82 samples/sec Loss 12.9151 Epoch: 9 Global Step: 45450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:09:49,261-Speed 4443.27 samples/sec Loss 12.8781 Epoch: 9 Global Step: 45500 
Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:10:01,314-Speed 4247.97 samples/sec Loss 12.9198 Epoch: 9 Global Step: 45550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:10:12,635-Speed 4522.68 samples/sec Loss 12.9477 Epoch: 9 Global Step: 45600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:10:24,581-Speed 4286.06 samples/sec Loss 12.9041 Epoch: 9 Global Step: 45650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:10:36,106-Speed 4442.95 samples/sec Loss 12.9576 Epoch: 9 Global Step: 45700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:10:47,615-Speed 4448.68 samples/sec Loss 12.8980 Epoch: 9 Global Step: 45750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:10:58,945-Speed 4519.21 samples/sec Loss 12.9559 Epoch: 9 Global Step: 45800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:11:10,248-Speed 4530.07 samples/sec Loss 12.8957 Epoch: 9 Global Step: 45850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:11:22,073-Speed 4330.03 samples/sec Loss 12.9189 Epoch: 9 Global Step: 45900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:11:33,431-Speed 4508.05 samples/sec Loss 12.8993 Epoch: 9 Global Step: 45950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:11:44,936-Speed 4450.32 samples/sec Loss 12.9046 Epoch: 9 Global Step: 46000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:12:15,907-[lfw][46000]XNorm: 22.392061 Training: 2021-03-15 01:12:15,908-[lfw][46000]Accuracy-Flip: 0.99517+-0.00283 Training: 2021-03-15 01:12:15,908-[lfw][46000]Accuracy-Highest: 0.99567 Training: 2021-03-15 01:12:51,864-[cfp_fp][46000]XNorm: 19.074740 Training: 2021-03-15 01:12:51,864-[cfp_fp][46000]Accuracy-Flip: 0.94600+-0.01150 Training: 2021-03-15 01:12:51,864-[cfp_fp][46000]Accuracy-Highest: 0.95200 Training: 2021-03-15 01:13:22,813-[agedb_30][46000]XNorm: 21.607151 Training: 2021-03-15 
01:13:22,814-[agedb_30][46000]Accuracy-Flip: 0.95683+-0.00693 Training: 2021-03-15 01:13:22,814-[agedb_30][46000]Accuracy-Highest: 0.96250 Training: 2021-03-15 01:13:34,151-Speed 468.80 samples/sec Loss 12.8165 Epoch: 9 Global Step: 46050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:13:45,739-Speed 4418.70 samples/sec Loss 12.9629 Epoch: 9 Global Step: 46100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:13:56,901-Speed 4587.31 samples/sec Loss 12.8411 Epoch: 9 Global Step: 46150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:14:08,549-Speed 4395.68 samples/sec Loss 12.9404 Epoch: 9 Global Step: 46200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:14:19,626-Speed 4622.51 samples/sec Loss 12.9253 Epoch: 9 Global Step: 46250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:14:31,304-Speed 4384.52 samples/sec Loss 12.9367 Epoch: 9 Global Step: 46300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:14:42,618-Speed 4525.45 samples/sec Loss 12.9390 Epoch: 9 Global Step: 46350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:14:54,041-Speed 4482.50 samples/sec Loss 12.9871 Epoch: 9 Global Step: 46400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:15:05,236-Speed 4573.53 samples/sec Loss 12.9495 Epoch: 9 Global Step: 46450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:15:16,687-Speed 4471.48 samples/sec Loss 12.9961 Epoch: 9 Global Step: 46500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:15:28,089-Speed 4490.66 samples/sec Loss 12.9749 Epoch: 9 Global Step: 46550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:15:39,533-Speed 4474.08 samples/sec Loss 12.9878 Epoch: 9 Global Step: 46600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:15:50,798-Speed 4545.15 samples/sec Loss 13.0090 Epoch: 9 Global Step: 46650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-15 01:16:02,210-Speed 4486.80 samples/sec Loss 12.9415 Epoch: 9 Global Step: 46700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:16:13,587-Speed 4500.42 samples/sec Loss 13.0215 Epoch: 9 Global Step: 46750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:16:24,887-Speed 4531.09 samples/sec Loss 13.0012 Epoch: 9 Global Step: 46800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:16:36,339-Speed 4471.17 samples/sec Loss 12.9812 Epoch: 9 Global Step: 46850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:16:47,813-Speed 4462.59 samples/sec Loss 12.8739 Epoch: 9 Global Step: 46900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:16:59,361-Speed 4433.98 samples/sec Loss 12.9333 Epoch: 9 Global Step: 46950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:17:10,586-Speed 4561.13 samples/sec Loss 12.8650 Epoch: 9 Global Step: 47000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:17:22,144-Speed 4430.04 samples/sec Loss 13.0218 Epoch: 9 Global Step: 47050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:17:33,657-Speed 4447.42 samples/sec Loss 12.9944 Epoch: 9 Global Step: 47100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:17:45,301-Speed 4397.38 samples/sec Loss 12.9382 Epoch: 9 Global Step: 47150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:17:56,566-Speed 4545.05 samples/sec Loss 12.9024 Epoch: 9 Global Step: 47200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:18:08,197-Speed 4402.59 samples/sec Loss 12.8882 Epoch: 9 Global Step: 47250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:18:19,490-Speed 4533.99 samples/sec Loss 12.9839 Epoch: 9 Global Step: 47300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:18:30,741-Speed 4550.81 samples/sec Loss 12.8177 Epoch: 9 Global Step: 47350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 
01:18:41,982-Speed 4555.02 samples/sec Loss 12.9406 Epoch: 9 Global Step: 47400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:18:53,653-Speed 4387.09 samples/sec Loss 12.9181 Epoch: 9 Global Step: 47450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:19:04,875-Speed 4562.55 samples/sec Loss 12.9179 Epoch: 9 Global Step: 47500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:19:16,879-Speed 4265.26 samples/sec Loss 12.9799 Epoch: 9 Global Step: 47550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:19:28,041-Speed 4587.50 samples/sec Loss 12.9102 Epoch: 9 Global Step: 47600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:19:39,666-Speed 4404.18 samples/sec Loss 12.9023 Epoch: 9 Global Step: 47650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:19:50,907-Speed 4555.06 samples/sec Loss 13.0003 Epoch: 9 Global Step: 47700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:20:02,337-Speed 4479.66 samples/sec Loss 13.0075 Epoch: 9 Global Step: 47750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:20:13,794-Speed 4469.25 samples/sec Loss 12.7952 Epoch: 9 Global Step: 47800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:20:25,488-Speed 4378.57 samples/sec Loss 12.9747 Epoch: 9 Global Step: 47850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:20:36,990-Speed 4451.39 samples/sec Loss 12.9257 Epoch: 9 Global Step: 47900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:20:48,376-Speed 4496.89 samples/sec Loss 12.8513 Epoch: 9 Global Step: 47950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:20:59,857-Speed 4459.70 samples/sec Loss 12.9899 Epoch: 9 Global Step: 48000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:21:30,411-[lfw][48000]XNorm: 24.400071 Training: 2021-03-15 01:21:30,411-[lfw][48000]Accuracy-Flip: 0.99500+-0.00350 Training: 2021-03-15 
01:21:30,411-[lfw][48000]Accuracy-Highest: 0.99567 Training: 2021-03-15 01:22:05,879-[cfp_fp][48000]XNorm: 20.538087 Training: 2021-03-15 01:22:05,880-[cfp_fp][48000]Accuracy-Flip: 0.93986+-0.01485 Training: 2021-03-15 01:22:05,880-[cfp_fp][48000]Accuracy-Highest: 0.95200 Training: 2021-03-15 01:22:36,471-[agedb_30][48000]XNorm: 23.632328 Training: 2021-03-15 01:22:36,472-[agedb_30][48000]Accuracy-Flip: 0.95817+-0.00724 Training: 2021-03-15 01:22:36,472-[agedb_30][48000]Accuracy-Highest: 0.96250 Training: 2021-03-15 01:22:47,866-Speed 474.04 samples/sec Loss 12.8818 Epoch: 9 Global Step: 48050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:22:59,466-Speed 4414.06 samples/sec Loss 13.0516 Epoch: 9 Global Step: 48100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:23:10,959-Speed 4455.24 samples/sec Loss 12.8480 Epoch: 9 Global Step: 48150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:23:22,343-Speed 4497.75 samples/sec Loss 12.9348 Epoch: 9 Global Step: 48200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:23:33,598-Speed 4549.28 samples/sec Loss 12.9645 Epoch: 9 Global Step: 48250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:23:44,853-Speed 4549.16 samples/sec Loss 13.0222 Epoch: 9 Global Step: 48300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:23:56,250-Speed 4492.61 samples/sec Loss 12.9001 Epoch: 9 Global Step: 48350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:24:07,827-Speed 4422.67 samples/sec Loss 12.9018 Epoch: 9 Global Step: 48400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:24:18,997-Speed 4583.92 samples/sec Loss 12.9730 Epoch: 9 Global Step: 48450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:24:30,234-Speed 4556.79 samples/sec Loss 12.9631 Epoch: 9 Global Step: 48500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:24:41,486-Speed 4550.23 samples/sec Loss 12.9379 Epoch: 
9 Global Step: 48550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:24:52,886-Speed 4491.51 samples/sec Loss 12.9075 Epoch: 9 Global Step: 48600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:25:04,428-Speed 4436.40 samples/sec Loss 13.0047 Epoch: 9 Global Step: 48650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:25:15,779-Speed 4510.77 samples/sec Loss 12.9116 Epoch: 9 Global Step: 48700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:25:27,068-Speed 4535.28 samples/sec Loss 13.0594 Epoch: 9 Global Step: 48750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:25:38,323-Speed 4549.61 samples/sec Loss 12.8680 Epoch: 9 Global Step: 48800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:25:49,773-Speed 4471.72 samples/sec Loss 12.8908 Epoch: 9 Global Step: 48850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:26:01,285-Speed 4447.56 samples/sec Loss 12.9604 Epoch: 9 Global Step: 48900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:26:12,582-Speed 4532.34 samples/sec Loss 12.7942 Epoch: 9 Global Step: 48950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:26:24,496-Speed 4297.77 samples/sec Loss 12.9471 Epoch: 9 Global Step: 49000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:26:35,895-Speed 4492.01 samples/sec Loss 12.9672 Epoch: 9 Global Step: 49050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:26:47,265-Speed 4503.26 samples/sec Loss 12.8925 Epoch: 9 Global Step: 49100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:26:58,706-Speed 4475.38 samples/sec Loss 13.0212 Epoch: 9 Global Step: 49150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:27:09,877-Speed 4583.42 samples/sec Loss 12.9506 Epoch: 9 Global Step: 49200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:27:21,385-Speed 4449.39 samples/sec Loss 12.8870 Epoch: 9 Global 
Step: 49250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:27:32,685-Speed 4530.92 samples/sec Loss 12.8829 Epoch: 9 Global Step: 49300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:27:44,201-Speed 4446.16 samples/sec Loss 12.9097 Epoch: 9 Global Step: 49350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:27:55,632-Speed 4479.55 samples/sec Loss 12.8934 Epoch: 9 Global Step: 49400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:28:06,797-Speed 4585.83 samples/sec Loss 12.8415 Epoch: 9 Global Step: 49450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:28:18,726-Speed 4292.07 samples/sec Loss 12.9052 Epoch: 9 Global Step: 49500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:28:30,144-Speed 4484.47 samples/sec Loss 12.9712 Epoch: 9 Global Step: 49550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:28:41,356-Speed 4566.91 samples/sec Loss 12.9358 Epoch: 9 Global Step: 49600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:28:52,920-Speed 4427.61 samples/sec Loss 13.0030 Epoch: 9 Global Step: 49650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:29:04,308-Speed 4495.92 samples/sec Loss 12.9674 Epoch: 9 Global Step: 49700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:29:15,749-Speed 4475.32 samples/sec Loss 12.8003 Epoch: 9 Global Step: 49750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:29:27,017-Speed 4544.11 samples/sec Loss 12.9630 Epoch: 9 Global Step: 49800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:29:42,470-Speed 3313.49 samples/sec Loss 11.8942 Epoch: 10 Global Step: 49850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:29:54,486-Speed 4261.12 samples/sec Loss 10.3490 Epoch: 10 Global Step: 49900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:30:06,105-Speed 4406.70 samples/sec Loss 9.9042 Epoch: 10 Global Step: 
49950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:30:17,414-Speed 4527.58 samples/sec Loss 9.5989 Epoch: 10 Global Step: 50000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:30:48,186-[lfw][50000]XNorm: 23.033529 Training: 2021-03-15 01:30:48,186-[lfw][50000]Accuracy-Flip: 0.99633+-0.00277 Training: 2021-03-15 01:30:48,186-[lfw][50000]Accuracy-Highest: 0.99633 Training: 2021-03-15 01:31:23,948-[cfp_fp][50000]XNorm: 19.686778 Training: 2021-03-15 01:31:23,948-[cfp_fp][50000]Accuracy-Flip: 0.96943+-0.00858 Training: 2021-03-15 01:31:23,948-[cfp_fp][50000]Accuracy-Highest: 0.96943 Training: 2021-03-15 01:31:54,801-[agedb_30][50000]XNorm: 22.352149 Training: 2021-03-15 01:31:54,801-[agedb_30][50000]Accuracy-Flip: 0.97150+-0.00652 Training: 2021-03-15 01:31:54,801-[agedb_30][50000]Accuracy-Highest: 0.97150 Training: 2021-03-15 01:32:06,022-Speed 471.42 samples/sec Loss 9.3024 Epoch: 10 Global Step: 50050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:32:17,371-Speed 4511.77 samples/sec Loss 9.0112 Epoch: 10 Global Step: 50100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:32:28,677-Speed 4528.49 samples/sec Loss 8.7813 Epoch: 10 Global Step: 50150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:32:39,913-Speed 4557.02 samples/sec Loss 8.5364 Epoch: 10 Global Step: 50200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:32:51,248-Speed 4517.27 samples/sec Loss 8.3278 Epoch: 10 Global Step: 50250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:33:02,633-Speed 4497.54 samples/sec Loss 8.1290 Epoch: 10 Global Step: 50300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:33:14,252-Speed 4406.54 samples/sec Loss 8.0751 Epoch: 10 Global Step: 50350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:33:25,663-Speed 4487.22 samples/sec Loss 7.8826 Epoch: 10 Global Step: 50400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 
2021-03-15 01:33:36,917-Speed 4549.61 samples/sec Loss 7.7734 Epoch: 10 Global Step: 50450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:33:48,268-Speed 4511.11 samples/sec Loss 7.6117 Epoch: 10 Global Step: 50500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:33:59,696-Speed 4480.15 samples/sec Loss 7.6015 Epoch: 10 Global Step: 50550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:34:11,066-Speed 4503.49 samples/sec Loss 7.4329 Epoch: 10 Global Step: 50600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:34:22,361-Speed 4533.14 samples/sec Loss 7.3446 Epoch: 10 Global Step: 50650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:34:33,957-Speed 4415.59 samples/sec Loss 7.1224 Epoch: 10 Global Step: 50700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:34:45,269-Speed 4526.16 samples/sec Loss 7.0087 Epoch: 10 Global Step: 50750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:34:56,668-Speed 4491.87 samples/sec Loss 6.9359 Epoch: 10 Global Step: 50800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:35:08,217-Speed 4433.48 samples/sec Loss 6.9038 Epoch: 10 Global Step: 50850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:35:19,682-Speed 4466.01 samples/sec Loss 6.7834 Epoch: 10 Global Step: 50900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:35:31,494-Speed 4334.82 samples/sec Loss 6.6610 Epoch: 10 Global Step: 50950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:35:42,981-Speed 4457.40 samples/sec Loss 6.6110 Epoch: 10 Global Step: 51000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:35:54,123-Speed 4595.43 samples/sec Loss 6.5701 Epoch: 10 Global Step: 51050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:36:05,604-Speed 4459.71 samples/sec Loss 6.4930 Epoch: 10 Global Step: 51100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 
01:36:17,019-Speed 4485.55 samples/sec Loss 6.3479 Epoch: 10 Global Step: 51150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:36:28,392-Speed 4501.96 samples/sec Loss 6.3633 Epoch: 10 Global Step: 51200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:36:39,725-Speed 4517.96 samples/sec Loss 6.2040 Epoch: 10 Global Step: 51250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:36:51,033-Speed 4527.84 samples/sec Loss 6.2720 Epoch: 10 Global Step: 51300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:37:02,368-Speed 4517.27 samples/sec Loss 6.0927 Epoch: 10 Global Step: 51350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:37:14,564-Speed 4198.49 samples/sec Loss 6.0538 Epoch: 10 Global Step: 51400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:37:25,837-Speed 4542.05 samples/sec Loss 6.0511 Epoch: 10 Global Step: 51450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:37:37,266-Speed 4479.93 samples/sec Loss 5.9544 Epoch: 10 Global Step: 51500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:37:48,701-Speed 4477.64 samples/sec Loss 5.9252 Epoch: 10 Global Step: 51550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:38:00,015-Speed 4525.44 samples/sec Loss 5.8712 Epoch: 10 Global Step: 51600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:38:11,364-Speed 4511.86 samples/sec Loss 5.7730 Epoch: 10 Global Step: 51650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:38:22,927-Speed 4427.95 samples/sec Loss 5.7570 Epoch: 10 Global Step: 51700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:38:34,517-Speed 4417.60 samples/sec Loss 5.7554 Epoch: 10 Global Step: 51750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:38:45,744-Speed 4560.81 samples/sec Loss 5.6467 Epoch: 10 Global Step: 51800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 
01:38:57,180-Speed 4477.20 samples/sec Loss 5.6317 Epoch: 10 Global Step: 51850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:39:08,366-Speed 4577.41 samples/sec Loss 5.5605 Epoch: 10 Global Step: 51900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:39:20,059-Speed 4378.98 samples/sec Loss 5.5812 Epoch: 10 Global Step: 51950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:39:31,153-Speed 4615.42 samples/sec Loss 5.5345 Epoch: 10 Global Step: 52000 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:40:01,947-[lfw][52000]XNorm: 22.891168 Training: 2021-03-15 01:40:01,948-[lfw][52000]Accuracy-Flip: 0.99750+-0.00281 Training: 2021-03-15 01:40:01,948-[lfw][52000]Accuracy-Highest: 0.99750 Training: 2021-03-15 01:40:37,641-[cfp_fp][52000]XNorm: 20.132457 Training: 2021-03-15 01:40:37,641-[cfp_fp][52000]Accuracy-Flip: 0.97757+-0.00535 Training: 2021-03-15 01:40:37,641-[cfp_fp][52000]Accuracy-Highest: 0.97757 Training: 2021-03-15 01:41:08,218-[agedb_30][52000]XNorm: 22.426077 Training: 2021-03-15 01:41:08,218-[agedb_30][52000]Accuracy-Flip: 0.97233+-0.00616 Training: 2021-03-15 01:41:08,218-[agedb_30][52000]Accuracy-Highest: 0.97233 Training: 2021-03-15 01:41:19,685-Speed 471.75 samples/sec Loss 5.5205 Epoch: 10 Global Step: 52050 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:41:31,114-Speed 4479.69 samples/sec Loss 5.4583 Epoch: 10 Global Step: 52100 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:41:42,647-Speed 4439.76 samples/sec Loss 5.3871 Epoch: 10 Global Step: 52150 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:41:53,786-Speed 4596.76 samples/sec Loss 5.3428 Epoch: 10 Global Step: 52200 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:42:05,469-Speed 4382.60 samples/sec Loss 5.3887 Epoch: 10 Global Step: 52250 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:42:16,851-Speed 4498.40 samples/sec Loss 5.4183 Epoch: 
10 Global Step: 52300 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:42:28,287-Speed 4477.29 samples/sec Loss 5.3082 Epoch: 10 Global Step: 52350 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:42:39,519-Speed 4558.83 samples/sec Loss 5.2771 Epoch: 10 Global Step: 52400 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:42:51,062-Speed 4435.66 samples/sec Loss 5.3179 Epoch: 10 Global Step: 52450 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:43:02,631-Speed 4425.99 samples/sec Loss 5.2034 Epoch: 10 Global Step: 52500 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:43:14,037-Speed 4489.05 samples/sec Loss 5.2261 Epoch: 10 Global Step: 52550 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:43:25,417-Speed 4499.25 samples/sec Loss 5.2001 Epoch: 10 Global Step: 52600 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:43:36,869-Speed 4471.16 samples/sec Loss 5.1164 Epoch: 10 Global Step: 52650 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:43:48,597-Speed 4365.80 samples/sec Loss 5.1445 Epoch: 10 Global Step: 52700 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:43:59,801-Speed 4570.03 samples/sec Loss 5.1831 Epoch: 10 Global Step: 52750 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:44:11,105-Speed 4529.38 samples/sec Loss 5.1047 Epoch: 10 Global Step: 52800 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:44:22,618-Speed 4447.33 samples/sec Loss 5.0923 Epoch: 10 Global Step: 52850 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:44:33,860-Speed 4554.57 samples/sec Loss 5.1149 Epoch: 10 Global Step: 52900 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:44:45,132-Speed 4542.30 samples/sec Loss 5.0507 Epoch: 10 Global Step: 52950 Fp16 Grad Scale: 16384 Required: 6 hours Training: 2021-03-15 01:44:56,710-Speed 4422.63 samples/sec Loss 5.0629 Epoch: 10 Global 
Step: 53000 Fp16 Grad Scale: 16384 Required: 6 hours
Training: 2021-03-15 01:45:07,942-Speed 4558.21 samples/sec Loss 5.0341 Epoch: 10 Global Step: 53050 Fp16 Grad Scale: 16384 Required: 6 hours
Training: 2021-03-15 01:45:19,506-Speed 4427.77 samples/sec Loss 5.0405 Epoch: 10 Global Step: 53100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:45:30,737-Speed 4559.06 samples/sec Loss 5.0372 Epoch: 10 Global Step: 53150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:45:42,318-Speed 4421.38 samples/sec Loss 4.9820 Epoch: 10 Global Step: 53200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:45:53,797-Speed 4460.58 samples/sec Loss 4.9541 Epoch: 10 Global Step: 53250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:46:05,518-Speed 4368.22 samples/sec Loss 4.9735 Epoch: 10 Global Step: 53300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:46:17,095-Speed 4422.68 samples/sec Loss 4.9284 Epoch: 10 Global Step: 53350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:46:28,515-Speed 4483.55 samples/sec Loss 4.9882 Epoch: 10 Global Step: 53400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:46:39,762-Speed 4552.82 samples/sec Loss 4.9486 Epoch: 10 Global Step: 53450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:46:51,397-Speed 4400.58 samples/sec Loss 4.9393 Epoch: 10 Global Step: 53500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:47:02,759-Speed 4506.40 samples/sec Loss 4.9523 Epoch: 10 Global Step: 53550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:47:14,189-Speed 4479.46 samples/sec Loss 4.9100 Epoch: 10 Global Step: 53600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:47:25,495-Speed 4529.04 samples/sec Loss 4.9514 Epoch: 10 Global Step: 53650 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:47:36,797-Speed 4530.09 samples/sec Loss 4.9024 Epoch: 10 Global Step: 53700 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:47:48,342-Speed 4435.18 samples/sec Loss 4.8727 Epoch: 10 Global Step: 53750 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:47:59,805-Speed 4466.90 samples/sec Loss 4.8863 Epoch: 10 Global Step: 53800 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:48:11,248-Speed 4474.29 samples/sec Loss 4.8909 Epoch: 10 Global Step: 53850 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:48:22,641-Speed 4494.24 samples/sec Loss 4.9228 Epoch: 10 Global Step: 53900 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:48:34,148-Speed 4449.88 samples/sec Loss 4.9329 Epoch: 10 Global Step: 53950 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:48:45,360-Speed 4566.67 samples/sec Loss 4.8874 Epoch: 10 Global Step: 54000 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:49:16,109-[lfw][54000]XNorm: 22.428432
Training: 2021-03-15 01:49:16,109-[lfw][54000]Accuracy-Flip: 0.99717+-0.00248
Training: 2021-03-15 01:49:16,109-[lfw][54000]Accuracy-Highest: 0.99750
Training: 2021-03-15 01:49:51,633-[cfp_fp][54000]XNorm: 19.920852
Training: 2021-03-15 01:49:51,634-[cfp_fp][54000]Accuracy-Flip: 0.98086+-0.00547
Training: 2021-03-15 01:49:51,634-[cfp_fp][54000]Accuracy-Highest: 0.98086
Training: 2021-03-15 01:50:22,294-[agedb_30][54000]XNorm: 22.125941
Training: 2021-03-15 01:50:22,294-[agedb_30][54000]Accuracy-Flip: 0.97467+-0.00488
Training: 2021-03-15 01:50:22,294-[agedb_30][54000]Accuracy-Highest: 0.97467
Training: 2021-03-15 01:50:33,599-Speed 473.03 samples/sec Loss 4.9067 Epoch: 10 Global Step: 54050 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:50:44,858-Speed 4547.56 samples/sec Loss 4.8690 Epoch: 10 Global Step: 54100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:50:56,285-Speed 4480.84 samples/sec Loss 4.8595 Epoch: 10 Global Step: 54150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:51:07,489-Speed 4570.10 samples/sec Loss 4.8680 Epoch: 10 Global Step: 54200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:51:19,130-Speed 4398.43 samples/sec Loss 4.8312 Epoch: 10 Global Step: 54250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:51:30,369-Speed 4555.72 samples/sec Loss 4.8303 Epoch: 10 Global Step: 54300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:51:42,364-Speed 4268.80 samples/sec Loss 4.8251 Epoch: 10 Global Step: 54350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:51:53,586-Speed 4562.62 samples/sec Loss 4.9075 Epoch: 10 Global Step: 54400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:52:04,786-Speed 4571.57 samples/sec Loss 4.8083 Epoch: 10 Global Step: 54450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:52:16,003-Speed 4564.63 samples/sec Loss 4.8469 Epoch: 10 Global Step: 54500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:52:27,756-Speed 4356.43 samples/sec Loss 4.8609 Epoch: 10 Global Step: 54550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:52:39,188-Speed 4479.01 samples/sec Loss 4.8258 Epoch: 10 Global Step: 54600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:52:50,547-Speed 4507.59 samples/sec Loss 4.8500 Epoch: 10 Global Step: 54650 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:53:01,950-Speed 4490.16 samples/sec Loss 4.8560 Epoch: 10 Global Step: 54700 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:53:13,451-Speed 4452.08 samples/sec Loss 4.7819 Epoch: 10 Global Step: 54750 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:53:24,561-Speed 4608.85 samples/sec Loss 4.8405 Epoch: 10 Global Step: 54800 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:53:39,887-Speed 3340.83 samples/sec Loss 4.0325 Epoch: 11 Global Step: 54850 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:53:51,514-Speed 4403.53 samples/sec Loss 4.0464 Epoch: 11 Global Step: 54900 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:54:02,731-Speed 4564.89 samples/sec Loss 4.0663 Epoch: 11 Global Step: 54950 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:54:14,094-Speed 4506.15 samples/sec Loss 4.0770 Epoch: 11 Global Step: 55000 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:54:25,457-Speed 4505.96 samples/sec Loss 4.1156 Epoch: 11 Global Step: 55050 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:54:36,985-Speed 4441.54 samples/sec Loss 4.1210 Epoch: 11 Global Step: 55100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:54:48,625-Speed 4398.84 samples/sec Loss 4.1560 Epoch: 11 Global Step: 55150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:55:00,072-Speed 4472.89 samples/sec Loss 4.2371 Epoch: 11 Global Step: 55200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:55:11,468-Speed 4493.10 samples/sec Loss 4.2591 Epoch: 11 Global Step: 55250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:55:23,828-Speed 4142.54 samples/sec Loss 4.2957 Epoch: 11 Global Step: 55300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:55:35,207-Speed 4499.65 samples/sec Loss 4.2940 Epoch: 11 Global Step: 55350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:55:46,687-Speed 4460.07 samples/sec Loss 4.3358 Epoch: 11 Global Step: 55400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:55:57,991-Speed 4529.89 samples/sec Loss 4.3598 Epoch: 11 Global Step: 55450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:56:09,232-Speed 4554.99 samples/sec Loss 4.3527 Epoch: 11 Global Step: 55500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:56:20,657-Speed 4481.63 samples/sec Loss 4.3223 Epoch: 11 Global Step: 55550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:56:32,175-Speed 4445.10 samples/sec Loss 4.4111 Epoch: 11 Global Step: 55600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:56:43,875-Speed 4376.40 samples/sec Loss 4.4114 Epoch: 11 Global Step: 55650 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:56:55,586-Speed 4372.07 samples/sec Loss 4.4104 Epoch: 11 Global Step: 55700 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:57:06,932-Speed 4512.74 samples/sec Loss 4.4785 Epoch: 11 Global Step: 55750 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:57:18,756-Speed 4330.53 samples/sec Loss 4.4359 Epoch: 11 Global Step: 55800 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:57:30,298-Speed 4436.07 samples/sec Loss 4.4922 Epoch: 11 Global Step: 55850 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:57:42,109-Speed 4335.18 samples/sec Loss 4.5089 Epoch: 11 Global Step: 55900 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:57:53,580-Speed 4463.76 samples/sec Loss 4.5599 Epoch: 11 Global Step: 55950 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:58:05,337-Speed 4354.72 samples/sec Loss 4.4941 Epoch: 11 Global Step: 56000 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 01:58:36,091-[lfw][56000]XNorm: 22.858359
Training: 2021-03-15 01:58:36,092-[lfw][56000]Accuracy-Flip: 0.99717+-0.00236
Training: 2021-03-15 01:58:36,092-[lfw][56000]Accuracy-Highest: 0.99750
Training: 2021-03-15 01:59:11,851-[cfp_fp][56000]XNorm: 20.509701
Training: 2021-03-15 01:59:11,851-[cfp_fp][56000]Accuracy-Flip: 0.98286+-0.00394
Training: 2021-03-15 01:59:11,851-[cfp_fp][56000]Accuracy-Highest: 0.98286
Training: 2021-03-15 01:59:42,651-[agedb_30][56000]XNorm: 22.700510
Training: 2021-03-15 01:59:42,651-[agedb_30][56000]Accuracy-Flip: 0.97817+-0.00669
Training: 2021-03-15 01:59:42,651-[agedb_30][56000]Accuracy-Highest: 0.97817
Training: 2021-03-15 01:59:54,135-Speed 470.60 samples/sec Loss 4.5983 Epoch: 11 Global Step: 56050 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:00:05,726-Speed 4417.32 samples/sec Loss 4.5147 Epoch: 11 Global Step: 56100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:00:17,645-Speed 4295.83 samples/sec Loss 4.5646 Epoch: 11 Global Step: 56150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:00:29,510-Speed 4315.45 samples/sec Loss 4.6109 Epoch: 11 Global Step: 56200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:00:41,054-Speed 4435.47 samples/sec Loss 4.6414 Epoch: 11 Global Step: 56250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:00:52,773-Speed 4368.97 samples/sec Loss 4.5849 Epoch: 11 Global Step: 56300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:01:04,271-Speed 4453.30 samples/sec Loss 4.6292 Epoch: 11 Global Step: 56350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:01:16,084-Speed 4334.12 samples/sec Loss 4.6892 Epoch: 11 Global Step: 56400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:01:27,675-Speed 4417.45 samples/sec Loss 4.7048 Epoch: 11 Global Step: 56450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:01:39,516-Speed 4324.18 samples/sec Loss 4.6952 Epoch: 11 Global Step: 56500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:01:51,121-Speed 4412.17 samples/sec Loss 4.6701 Epoch: 11 Global Step: 56550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:02:02,879-Speed 4354.61 samples/sec Loss 4.6712 Epoch: 11 Global Step: 56600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:02:14,580-Speed 4375.93 samples/sec Loss 4.6892 Epoch: 11 Global Step: 56650 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:02:26,415-Speed 4326.29 samples/sec Loss 4.7073 Epoch: 11 Global Step: 56700 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:02:37,925-Speed 4448.29 samples/sec Loss 4.6765 Epoch: 11 Global Step: 56750 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:02:49,522-Speed 4415.20 samples/sec Loss 4.7544 Epoch: 11 Global Step: 56800 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:03:01,165-Speed 4397.95 samples/sec Loss 4.7241 Epoch: 11 Global Step: 56850 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:03:12,632-Speed 4464.84 samples/sec Loss 4.6851 Epoch: 11 Global Step: 56900 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:03:24,037-Speed 4489.69 samples/sec Loss 4.7856 Epoch: 11 Global Step: 56950 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:03:35,684-Speed 4396.07 samples/sec Loss 4.7814 Epoch: 11 Global Step: 57000 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:03:47,550-Speed 4315.11 samples/sec Loss 4.7789 Epoch: 11 Global Step: 57050 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:03:59,202-Speed 4394.32 samples/sec Loss 4.8034 Epoch: 11 Global Step: 57100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:04:10,806-Speed 4412.51 samples/sec Loss 4.7419 Epoch: 11 Global Step: 57150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:04:22,303-Speed 4453.41 samples/sec Loss 4.8206 Epoch: 11 Global Step: 57200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:04:34,201-Speed 4303.34 samples/sec Loss 4.8468 Epoch: 11 Global Step: 57250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:04:45,959-Speed 4354.59 samples/sec Loss 4.7729 Epoch: 11 Global Step: 57300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:04:57,507-Speed 4434.03 samples/sec Loss 4.8124 Epoch: 11 Global Step: 57350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:05:09,105-Speed 4414.52 samples/sec Loss 4.7948 Epoch: 11 Global Step: 57400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:05:20,856-Speed 4357.55 samples/sec Loss 4.8791 Epoch: 11 Global Step: 57450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:05:32,385-Speed 4441.20 samples/sec Loss 4.8297 Epoch: 11 Global Step: 57500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:05:44,292-Speed 4299.95 samples/sec Loss 4.8370 Epoch: 11 Global Step: 57550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:05:55,814-Speed 4443.97 samples/sec Loss 4.8630 Epoch: 11 Global Step: 57600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:06:07,675-Speed 4316.61 samples/sec Loss 4.9240 Epoch: 11 Global Step: 57650 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:06:19,255-Speed 4421.66 samples/sec Loss 4.9026 Epoch: 11 Global Step: 57700 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:06:31,206-Speed 4284.64 samples/sec Loss 4.8476 Epoch: 11 Global Step: 57750 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:06:42,820-Speed 4408.35 samples/sec Loss 4.8870 Epoch: 11 Global Step: 57800 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:06:54,594-Speed 4348.69 samples/sec Loss 4.8922 Epoch: 11 Global Step: 57850 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:07:06,343-Speed 4358.21 samples/sec Loss 4.8561 Epoch: 11 Global Step: 57900 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:07:18,290-Speed 4285.61 samples/sec Loss 4.8967 Epoch: 11 Global Step: 57950 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:07:29,758-Speed 4464.74 samples/sec Loss 4.9206 Epoch: 11 Global Step: 58000 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:08:00,527-[lfw][58000]XNorm: 23.670324
Training: 2021-03-15 02:08:00,528-[lfw][58000]Accuracy-Flip: 0.99767+-0.00238
Training: 2021-03-15 02:08:00,528-[lfw][58000]Accuracy-Highest: 0.99767
Training: 2021-03-15 02:08:35,991-[cfp_fp][58000]XNorm: 21.001148
Training: 2021-03-15 02:08:35,991-[cfp_fp][58000]Accuracy-Flip: 0.98157+-0.00303
Training: 2021-03-15 02:08:35,991-[cfp_fp][58000]Accuracy-Highest: 0.98286
Training: 2021-03-15 02:09:06,539-[agedb_30][58000]XNorm: 23.276858
Training: 2021-03-15 02:09:06,540-[agedb_30][58000]Accuracy-Flip: 0.97833+-0.00606
Training: 2021-03-15 02:09:06,540-[agedb_30][58000]Accuracy-Highest: 0.97833
Training: 2021-03-15 02:09:18,255-Speed 471.91 samples/sec Loss 4.9438 Epoch: 11 Global Step: 58050 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:09:30,006-Speed 4357.27 samples/sec Loss 4.9393 Epoch: 11 Global Step: 58100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:09:41,608-Speed 4413.13 samples/sec Loss 4.9195 Epoch: 11 Global Step: 58150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:09:53,125-Speed 4445.69 samples/sec Loss 4.9316 Epoch: 11 Global Step: 58200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:10:04,876-Speed 4357.32 samples/sec Loss 4.9710 Epoch: 11 Global Step: 58250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:10:16,423-Speed 4434.20 samples/sec Loss 4.9270 Epoch: 11 Global Step: 58300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:10:28,107-Speed 4382.29 samples/sec Loss 4.9292 Epoch: 11 Global Step: 58350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:10:39,621-Speed 4447.01 samples/sec Loss 4.9587 Epoch: 11 Global Step: 58400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:10:51,362-Speed 4361.02 samples/sec Loss 4.9298 Epoch: 11 Global Step: 58450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:11:03,114-Speed 4356.84 samples/sec Loss 5.0267 Epoch: 11 Global Step: 58500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:11:14,769-Speed 4393.18 samples/sec Loss 4.9819 Epoch: 11 Global Step: 58550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:11:26,456-Speed 4381.15 samples/sec Loss 4.9551 Epoch: 11 Global Step: 58600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:11:38,411-Speed 4282.86 samples/sec Loss 4.9562 Epoch: 11 Global Step: 58650 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:11:50,081-Speed 4387.42 samples/sec Loss 4.9841 Epoch: 11 Global Step: 58700 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:12:01,722-Speed 4398.53 samples/sec Loss 5.0216 Epoch: 11 Global Step: 58750 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:12:13,503-Speed 4346.15 samples/sec Loss 4.9595 Epoch: 11 Global Step: 58800 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:12:25,299-Speed 4340.59 samples/sec Loss 5.0053 Epoch: 11 Global Step: 58850 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:12:36,981-Speed 4383.20 samples/sec Loss 4.9502 Epoch: 11 Global Step: 58900 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:12:48,644-Speed 4390.07 samples/sec Loss 5.0190 Epoch: 11 Global Step: 58950 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:13:00,241-Speed 4415.04 samples/sec Loss 5.0644 Epoch: 11 Global Step: 59000 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:13:11,768-Speed 4441.91 samples/sec Loss 5.0380 Epoch: 11 Global Step: 59050 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:13:23,498-Speed 4364.93 samples/sec Loss 4.9581 Epoch: 11 Global Step: 59100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:13:35,608-Speed 4228.18 samples/sec Loss 5.0175 Epoch: 11 Global Step: 59150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:13:47,181-Speed 4424.41 samples/sec Loss 5.1274 Epoch: 11 Global Step: 59200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:13:58,937-Speed 4355.50 samples/sec Loss 5.0263 Epoch: 11 Global Step: 59250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:14:10,614-Speed 4384.77 samples/sec Loss 5.0850 Epoch: 11 Global Step: 59300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:14:22,233-Speed 4406.74 samples/sec Loss 5.0513 Epoch: 11 Global Step: 59350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:14:34,044-Speed 4335.16 samples/sec Loss 5.0366 Epoch: 11 Global Step: 59400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:14:45,538-Speed 4454.49 samples/sec Loss 5.0526 Epoch: 11 Global Step: 59450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:14:57,179-Speed 4398.41 samples/sec Loss 5.0505 Epoch: 11 Global Step: 59500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:15:08,698-Speed 4444.98 samples/sec Loss 5.0366 Epoch: 11 Global Step: 59550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:15:21,093-Speed 4130.98 samples/sec Loss 5.0688 Epoch: 11 Global Step: 59600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:15:32,640-Speed 4434.43 samples/sec Loss 5.0688 Epoch: 11 Global Step: 59650 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:15:44,436-Speed 4340.55 samples/sec Loss 5.0963 Epoch: 11 Global Step: 59700 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:15:56,125-Speed 4380.37 samples/sec Loss 5.0881 Epoch: 11 Global Step: 59750 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:16:11,479-Speed 3334.63 samples/sec Loss 4.7802 Epoch: 12 Global Step: 59800 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:16:23,866-Speed 4133.75 samples/sec Loss 4.2316 Epoch: 12 Global Step: 59850 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:16:35,752-Speed 4307.76 samples/sec Loss 4.3137 Epoch: 12 Global Step: 59900 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:16:47,373-Speed 4406.25 samples/sec Loss 4.3327 Epoch: 12 Global Step: 59950 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:16:59,034-Speed 4390.83 samples/sec Loss 4.3581 Epoch: 12 Global Step: 60000 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:17:29,757-[lfw][60000]XNorm: 23.478598
Training: 2021-03-15 02:17:29,757-[lfw][60000]Accuracy-Flip: 0.99733+-0.00186
Training: 2021-03-15 02:17:29,757-[lfw][60000]Accuracy-Highest: 0.99767
Training: 2021-03-15 02:18:05,313-[cfp_fp][60000]XNorm: 20.782304
Training: 2021-03-15 02:18:05,313-[cfp_fp][60000]Accuracy-Flip: 0.98214+-0.00425
Training: 2021-03-15 02:18:05,315-[cfp_fp][60000]Accuracy-Highest: 0.98286
Training: 2021-03-15 02:18:36,220-[agedb_30][60000]XNorm: 23.011419
Training: 2021-03-15 02:18:36,220-[agedb_30][60000]Accuracy-Flip: 0.97867+-0.00670
Training: 2021-03-15 02:18:36,221-[agedb_30][60000]Accuracy-Highest: 0.97867
Training: 2021-03-15 02:18:47,878-Speed 470.40 samples/sec Loss 4.4503 Epoch: 12 Global Step: 60050 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:18:59,491-Speed 4408.82 samples/sec Loss 4.4361 Epoch: 12 Global Step: 60100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:19:10,952-Speed 4467.63 samples/sec Loss 4.4770 Epoch: 12 Global Step: 60150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:19:22,642-Speed 4380.17 samples/sec Loss 4.5472 Epoch: 12 Global Step: 60200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:19:34,306-Speed 4389.46 samples/sec Loss 4.5564 Epoch: 12 Global Step: 60250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:19:46,093-Speed 4344.15 samples/sec Loss 4.5120 Epoch: 12 Global Step: 60300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:19:57,895-Speed 4338.43 samples/sec Loss 4.6428 Epoch: 12 Global Step: 60350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:20:09,585-Speed 4380.13 samples/sec Loss 4.5911 Epoch: 12 Global Step: 60400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:20:21,128-Speed 4435.61 samples/sec Loss 4.6118 Epoch: 12 Global Step: 60450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:20:32,846-Speed 4369.45 samples/sec Loss 4.6280 Epoch: 12 Global Step: 60500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:20:44,480-Speed 4401.21 samples/sec Loss 4.6911 Epoch: 12 Global Step: 60550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:20:56,606-Speed 4222.50 samples/sec Loss 4.7513 Epoch: 12 Global Step: 60600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:21:08,283-Speed 4385.03 samples/sec Loss 4.7234 Epoch: 12 Global Step: 60650 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:21:19,833-Speed 4432.84 samples/sec Loss 4.7199 Epoch: 12 Global Step: 60700 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:21:31,370-Speed 4438.18 samples/sec Loss 4.7725 Epoch: 12 Global Step: 60750 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:21:43,186-Speed 4333.34 samples/sec Loss 4.7932 Epoch: 12 Global Step: 60800 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:21:54,929-Speed 4359.92 samples/sec Loss 4.7823 Epoch: 12 Global Step: 60850 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:22:06,765-Speed 4326.26 samples/sec Loss 4.8460 Epoch: 12 Global Step: 60900 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:22:18,351-Speed 4419.33 samples/sec Loss 4.8015 Epoch: 12 Global Step: 60950 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:22:30,056-Speed 4374.19 samples/sec Loss 4.8911 Epoch: 12 Global Step: 61000 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:22:41,615-Speed 4429.60 samples/sec Loss 4.8581 Epoch: 12 Global Step: 61050 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:22:53,460-Speed 4322.59 samples/sec Loss 4.8810 Epoch: 12 Global Step: 61100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:23:05,496-Speed 4254.38 samples/sec Loss 4.8610 Epoch: 12 Global Step: 61150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:23:17,214-Speed 4369.46 samples/sec Loss 4.8776 Epoch: 12 Global Step: 61200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:23:29,055-Speed 4324.22 samples/sec Loss 4.9409 Epoch: 12 Global Step: 61250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:23:40,915-Speed 4316.95 samples/sec Loss 4.9563 Epoch: 12 Global Step: 61300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:23:52,271-Speed 4508.83 samples/sec Loss 4.9249 Epoch: 12 Global Step: 61350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:24:03,987-Speed 4370.44 samples/sec Loss 4.9259 Epoch: 12 Global Step: 61400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:24:15,656-Speed 4387.77 samples/sec Loss 4.9834 Epoch: 12 Global Step: 61450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:24:27,504-Speed 4321.45 samples/sec Loss 4.9807 Epoch: 12 Global Step: 61500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:24:39,048-Speed 4435.64 samples/sec Loss 4.9948 Epoch: 12 Global Step: 61550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:24:51,468-Speed 4122.36 samples/sec Loss 4.9306 Epoch: 12 Global Step: 61600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:25:03,083-Speed 4408.25 samples/sec Loss 4.9848 Epoch: 12 Global Step: 61650 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:25:14,739-Speed 4393.04 samples/sec Loss 4.9416 Epoch: 12 Global Step: 61700 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:25:26,343-Speed 4412.51 samples/sec Loss 4.9879 Epoch: 12 Global Step: 61750 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:25:38,093-Speed 4357.40 samples/sec Loss 4.9999 Epoch: 12 Global Step: 61800 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:25:49,530-Speed 4476.86 samples/sec Loss 4.9446 Epoch: 12 Global Step: 61850 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:26:01,187-Speed 4392.48 samples/sec Loss 5.0011 Epoch: 12 Global Step: 61900 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:26:12,921-Speed 4363.41 samples/sec Loss 5.0146 Epoch: 12 Global Step: 61950 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:26:24,608-Speed 4381.49 samples/sec Loss 5.0114 Epoch: 12 Global Step: 62000 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:26:55,450-[lfw][62000]XNorm: 22.217303
Training: 2021-03-15 02:26:55,451-[lfw][62000]Accuracy-Flip: 0.99750+-0.00227
Training: 2021-03-15 02:26:55,451-[lfw][62000]Accuracy-Highest: 0.99767
Training: 2021-03-15 02:27:31,283-[cfp_fp][62000]XNorm: 19.727174
Training: 2021-03-15 02:27:31,283-[cfp_fp][62000]Accuracy-Flip: 0.98343+-0.00339
Training: 2021-03-15 02:27:31,283-[cfp_fp][62000]Accuracy-Highest: 0.98343
Training: 2021-03-15 02:28:02,112-[agedb_30][62000]XNorm: 21.983036
Training: 2021-03-15 02:28:02,112-[agedb_30][62000]Accuracy-Flip: 0.97750+-0.00676
Training: 2021-03-15 02:28:02,112-[agedb_30][62000]Accuracy-Highest: 0.97867
Training: 2021-03-15 02:28:13,708-Speed 469.30 samples/sec Loss 5.0502 Epoch: 12 Global Step: 62050 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:28:25,280-Speed 4424.75 samples/sec Loss 5.0420 Epoch: 12 Global Step: 62100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:28:36,955-Speed 4385.28 samples/sec Loss 5.0756 Epoch: 12 Global Step: 62150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:28:48,737-Speed 4346.00 samples/sec Loss 5.0701 Epoch: 12 Global Step: 62200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:29:00,382-Speed 4396.74 samples/sec Loss 5.1242 Epoch: 12 Global Step: 62250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:29:12,043-Speed 4391.12 samples/sec Loss 5.1022 Epoch: 12 Global Step: 62300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:29:24,193-Speed 4213.96 samples/sec Loss 5.1053 Epoch: 12 Global Step: 62350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:29:35,771-Speed 4422.24 samples/sec Loss 5.1169 Epoch: 12 Global Step: 62400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:29:47,493-Speed 4368.15 samples/sec Loss 5.0577 Epoch: 12 Global Step: 62450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:29:59,566-Speed 4241.13 samples/sec Loss 5.0210 Epoch: 12 Global Step: 62500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:30:11,289-Speed 4367.57 samples/sec Loss 5.1000 Epoch: 12 Global Step: 62550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:30:22,836-Speed 4434.12 samples/sec Loss 5.1123 Epoch: 12 Global Step: 62600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:30:34,535-Speed 4376.80 samples/sec Loss 5.0995 Epoch: 12 Global Step: 62650 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:30:46,117-Speed 4420.93 samples/sec Loss 5.0659 Epoch: 12 Global Step: 62700 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:30:57,842-Speed 4366.79 samples/sec Loss 5.1060 Epoch: 12 Global Step: 62750 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:31:09,371-Speed 4441.46 samples/sec Loss 5.1562 Epoch: 12 Global Step: 62800 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:31:21,283-Speed 4298.04 samples/sec Loss 5.1722 Epoch: 12 Global Step: 62850 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:31:32,848-Speed 4427.54 samples/sec Loss 5.0968 Epoch: 12 Global Step: 62900 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:31:44,432-Speed 4419.93 samples/sec Loss 5.1433 Epoch: 12 Global Step: 62950 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:31:55,970-Speed 4437.75 samples/sec Loss 5.0800 Epoch: 12 Global Step: 63000 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:32:07,493-Speed 4443.44 samples/sec Loss 5.1175 Epoch: 12 Global Step: 63050 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:32:19,324-Speed 4327.71 samples/sec Loss 5.1520 Epoch: 12 Global Step: 63100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:32:31,029-Speed 4374.63 samples/sec Loss 5.0864 Epoch: 12 Global Step: 63150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:32:42,890-Speed 4316.62 samples/sec Loss 5.0971 Epoch: 12 Global Step: 63200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:32:54,515-Speed 4404.71 samples/sec Loss 5.0651 Epoch: 12 Global Step: 63250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:33:06,079-Speed 4427.59 samples/sec Loss 5.1021 Epoch: 12 Global Step: 63300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:33:18,122-Speed 4251.64 samples/sec Loss 5.1731 Epoch: 12 Global Step: 63350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:33:29,949-Speed 4329.06 samples/sec Loss 5.1294 Epoch: 12 Global Step: 63400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:33:41,600-Speed 4394.79 samples/sec Loss 5.1535 Epoch: 12 Global Step: 63450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:33:53,086-Speed 4457.97 samples/sec Loss 5.1175 Epoch: 12 Global Step: 63500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:34:05,043-Speed 4282.01 samples/sec Loss 5.1431 Epoch: 12 Global Step: 63550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:34:16,569-Speed 4442.45 samples/sec Loss 5.1043 Epoch: 12 Global Step: 63600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:34:28,197-Speed 4403.30 samples/sec Loss 5.1808 Epoch: 12 Global Step: 63650 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:34:39,819-Speed 4405.81 samples/sec Loss 5.2303 Epoch: 12 Global Step: 63700 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:34:51,601-Speed 4345.57 samples/sec Loss 5.1373 Epoch: 12 Global Step: 63750 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:35:03,229-Speed 4403.30 samples/sec Loss 5.1852 Epoch: 12 Global Step: 63800 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:35:14,820-Speed 4417.51 samples/sec Loss 5.1310 Epoch: 12 Global Step: 63850 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:35:26,516-Speed 4377.78 samples/sec Loss 5.1727 Epoch: 12 Global Step: 63900 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:35:37,980-Speed 4466.31 samples/sec Loss 5.1067 Epoch: 12 Global Step: 63950 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:35:49,679-Speed 4376.72 samples/sec Loss 5.1530 Epoch: 12 Global Step: 64000 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:36:20,478-[lfw][64000]XNorm: 23.549239
Training: 2021-03-15 02:36:20,478-[lfw][64000]Accuracy-Flip: 0.99767+-0.00226
Training: 2021-03-15 02:36:20,478-[lfw][64000]Accuracy-Highest: 0.99767
Training: 2021-03-15 02:36:55,986-[cfp_fp][64000]XNorm: 20.909362
Training: 2021-03-15 02:36:55,987-[cfp_fp][64000]Accuracy-Flip: 0.98100+-0.00399
Training: 2021-03-15 02:36:55,987-[cfp_fp][64000]Accuracy-Highest: 0.98343
Training: 2021-03-15 02:37:26,665-[agedb_30][64000]XNorm: 22.754223
Training: 2021-03-15 02:37:26,665-[agedb_30][64000]Accuracy-Flip: 0.97500+-0.00601
Training: 2021-03-15 02:37:26,665-[agedb_30][64000]Accuracy-Highest: 0.97867
Training: 2021-03-15 02:37:38,451-Speed 470.71 samples/sec Loss 5.1233 Epoch: 12 Global Step: 64050 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:37:50,143-Speed 4379.09 samples/sec Loss 5.1314 Epoch: 12 Global Step: 64100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:38:02,057-Speed 4297.65 samples/sec Loss 5.1342 Epoch: 12 Global Step: 64150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:38:13,579-Speed 4443.68 samples/sec Loss 5.1881 Epoch: 12 Global Step: 64200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:38:25,290-Speed 4372.17 samples/sec Loss 5.1775 Epoch: 12 Global Step: 64250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:38:37,122-Speed 4327.30 samples/sec Loss 5.1292 Epoch: 12 Global Step: 64300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:38:48,654-Speed 4440.04 samples/sec Loss 5.1542 Epoch: 12 Global Step: 64350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:39:00,407-Speed 4356.79 samples/sec Loss 5.1790 Epoch: 12 Global Step: 64400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:39:12,094-Speed 4380.92 samples/sec Loss 5.2049 Epoch: 12 Global Step: 64450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:39:24,234-Speed 4217.59 samples/sec Loss 5.1727 Epoch: 12 Global Step: 64500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:39:35,655-Speed 4483.21 samples/sec Loss 5.1802 Epoch: 12 Global Step: 64550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:39:47,296-Speed 4398.44 samples/sec Loss 5.1698 Epoch: 12 Global Step: 64600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:39:58,741-Speed 4473.65 samples/sec Loss 5.1665 Epoch: 12 Global Step: 64650 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:40:10,528-Speed 4344.11 samples/sec Loss 5.1816 Epoch: 12 Global Step: 64700 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:40:22,045-Speed 4445.58 samples/sec Loss 5.1731 Epoch: 12 Global Step: 64750 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:40:38,604-Speed 3092.09 samples/sec Loss 4.6588 Epoch: 13 Global Step: 64800 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:40:50,353-Speed 4358.40 samples/sec Loss 4.3885 Epoch: 13 Global Step: 64850 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:41:02,070-Speed 4369.87 samples/sec Loss 4.4335 Epoch: 13 Global Step: 64900 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:41:13,634-Speed 4427.66 samples/sec Loss 4.4509 Epoch: 13 Global Step: 64950 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:41:25,681-Speed 4250.07 samples/sec Loss 4.4814 Epoch: 13 Global Step: 65000 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:41:37,866-Speed 4202.10 samples/sec Loss 4.5152 Epoch: 13 Global Step: 65050 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:41:49,803-Speed 4289.50 samples/sec Loss 4.5555 Epoch: 13 Global Step: 65100 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:42:01,903-Speed 4231.59 samples/sec Loss 4.6084 Epoch: 13 Global Step: 65150 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:42:14,050-Speed 4215.32 samples/sec Loss 4.6286 Epoch: 13 Global Step: 65200 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:42:25,602-Speed 4432.13 samples/sec Loss 4.6823 Epoch: 13 Global Step: 65250 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:42:37,209-Speed 4411.32 samples/sec Loss 4.6892 Epoch: 13 Global Step: 65300 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:42:48,837-Speed 4403.29 samples/sec Loss 4.6970 Epoch: 13 Global Step: 65350 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:43:00,530-Speed 4378.90 samples/sec Loss 4.7710 Epoch: 13 Global Step: 65400 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:43:12,127-Speed 4415.40 samples/sec Loss 4.7604 Epoch: 13 Global Step: 65450 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:43:24,044-Speed 4296.40 samples/sec Loss 4.8071 Epoch: 13 Global Step: 65500 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:43:35,548-Speed 4450.96 samples/sec Loss 4.8203 Epoch: 13 Global Step: 65550 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:43:47,214-Speed 4388.77 samples/sec Loss 4.8446 Epoch: 13 Global Step: 65600 Fp16 Grad Scale: 16384 Required: 5 hours
Training: 2021-03-15 02:43:58,766-Speed 4432.42 samples/sec Loss 4.8227 Epoch: 13 Global Step: 65650
Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:44:10,531-Speed 4351.94 samples/sec Loss 4.8486 Epoch: 13 Global Step: 65700 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:44:22,114-Speed 4420.56 samples/sec Loss 4.8653 Epoch: 13 Global Step: 65750 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:44:33,903-Speed 4343.23 samples/sec Loss 4.9156 Epoch: 13 Global Step: 65800 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:44:45,818-Speed 4297.36 samples/sec Loss 4.8929 Epoch: 13 Global Step: 65850 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:44:57,593-Speed 4348.12 samples/sec Loss 4.8621 Epoch: 13 Global Step: 65900 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:45:09,187-Speed 4416.23 samples/sec Loss 4.9509 Epoch: 13 Global Step: 65950 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:45:20,854-Speed 4388.83 samples/sec Loss 4.9279 Epoch: 13 Global Step: 66000 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:45:51,667-[lfw][66000]XNorm: 23.082890 Training: 2021-03-15 02:45:51,668-[lfw][66000]Accuracy-Flip: 0.99783+-0.00269 Training: 2021-03-15 02:45:51,668-[lfw][66000]Accuracy-Highest: 0.99783 Training: 2021-03-15 02:46:27,459-[cfp_fp][66000]XNorm: 20.264311 Training: 2021-03-15 02:46:27,460-[cfp_fp][66000]Accuracy-Flip: 0.98071+-0.00434 Training: 2021-03-15 02:46:27,460-[cfp_fp][66000]Accuracy-Highest: 0.98343 Training: 2021-03-15 02:46:58,347-[agedb_30][66000]XNorm: 22.447137 Training: 2021-03-15 02:46:58,347-[agedb_30][66000]Accuracy-Flip: 0.97667+-0.00619 Training: 2021-03-15 02:46:58,347-[agedb_30][66000]Accuracy-Highest: 0.97867 Training: 2021-03-15 02:47:09,889-Speed 469.58 samples/sec Loss 4.9390 Epoch: 13 Global Step: 66050 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:47:21,688-Speed 4339.41 samples/sec Loss 4.9533 Epoch: 13 Global Step: 66100 Fp16 Grad Scale: 16384 Required: 5 hours Training: 
2021-03-15 02:47:33,227-Speed 4437.48 samples/sec Loss 4.9849 Epoch: 13 Global Step: 66150 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:47:44,956-Speed 4365.42 samples/sec Loss 4.9712 Epoch: 13 Global Step: 66200 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:47:56,485-Speed 4440.99 samples/sec Loss 4.9972 Epoch: 13 Global Step: 66250 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:48:08,132-Speed 4396.22 samples/sec Loss 4.9482 Epoch: 13 Global Step: 66300 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:48:19,680-Speed 4433.73 samples/sec Loss 4.9808 Epoch: 13 Global Step: 66350 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:48:31,397-Speed 4369.94 samples/sec Loss 5.0354 Epoch: 13 Global Step: 66400 Fp16 Grad Scale: 16384 Required: 5 hours Training: 2021-03-15 02:48:43,182-Speed 4344.85 samples/sec Loss 5.0180 Epoch: 13 Global Step: 66450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:48:54,895-Speed 4371.15 samples/sec Loss 5.0320 Epoch: 13 Global Step: 66500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:49:06,299-Speed 4489.96 samples/sec Loss 4.9900 Epoch: 13 Global Step: 66550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:49:18,187-Speed 4307.27 samples/sec Loss 5.0591 Epoch: 13 Global Step: 66600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:49:29,811-Speed 4404.86 samples/sec Loss 5.0756 Epoch: 13 Global Step: 66650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:49:41,758-Speed 4285.66 samples/sec Loss 5.0223 Epoch: 13 Global Step: 66700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:49:53,226-Speed 4464.87 samples/sec Loss 5.0321 Epoch: 13 Global Step: 66750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:50:05,203-Speed 4274.84 samples/sec Loss 5.0239 Epoch: 13 Global Step: 66800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 
02:50:16,821-Speed 4407.25 samples/sec Loss 4.9919 Epoch: 13 Global Step: 66850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:50:29,013-Speed 4199.83 samples/sec Loss 4.9975 Epoch: 13 Global Step: 66900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:50:40,754-Speed 4360.62 samples/sec Loss 5.0223 Epoch: 13 Global Step: 66950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:50:52,625-Speed 4313.51 samples/sec Loss 5.0581 Epoch: 13 Global Step: 67000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:51:04,174-Speed 4433.17 samples/sec Loss 5.0973 Epoch: 13 Global Step: 67050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:51:16,634-Speed 4109.64 samples/sec Loss 5.0702 Epoch: 13 Global Step: 67100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:51:28,109-Speed 4461.77 samples/sec Loss 5.0387 Epoch: 13 Global Step: 67150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:51:39,854-Speed 4359.49 samples/sec Loss 5.0778 Epoch: 13 Global Step: 67200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:51:51,511-Speed 4392.68 samples/sec Loss 5.1195 Epoch: 13 Global Step: 67250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:52:03,257-Speed 4358.77 samples/sec Loss 5.0896 Epoch: 13 Global Step: 67300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:52:14,817-Speed 4429.34 samples/sec Loss 5.0870 Epoch: 13 Global Step: 67350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:52:26,526-Speed 4373.09 samples/sec Loss 5.1183 Epoch: 13 Global Step: 67400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:52:38,394-Speed 4314.21 samples/sec Loss 5.1008 Epoch: 13 Global Step: 67450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:52:50,098-Speed 4374.78 samples/sec Loss 5.1086 Epoch: 13 Global Step: 67500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 
02:53:01,669-Speed 4425.17 samples/sec Loss 5.0750 Epoch: 13 Global Step: 67550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:53:13,185-Speed 4445.99 samples/sec Loss 5.1019 Epoch: 13 Global Step: 67600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:53:24,849-Speed 4389.63 samples/sec Loss 5.1632 Epoch: 13 Global Step: 67650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:53:36,628-Speed 4347.23 samples/sec Loss 5.1132 Epoch: 13 Global Step: 67700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:53:48,422-Speed 4341.36 samples/sec Loss 5.1615 Epoch: 13 Global Step: 67750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:54:00,043-Speed 4405.77 samples/sec Loss 5.0970 Epoch: 13 Global Step: 67800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:54:11,674-Speed 4402.23 samples/sec Loss 5.1307 Epoch: 13 Global Step: 67850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:54:23,253-Speed 4422.17 samples/sec Loss 5.0963 Epoch: 13 Global Step: 67900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:54:35,041-Speed 4343.45 samples/sec Loss 5.1332 Epoch: 13 Global Step: 67950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:54:46,629-Speed 4418.44 samples/sec Loss 5.1718 Epoch: 13 Global Step: 68000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:55:17,379-[lfw][68000]XNorm: 22.068312 Training: 2021-03-15 02:55:17,379-[lfw][68000]Accuracy-Flip: 0.99783+-0.00224 Training: 2021-03-15 02:55:17,379-[lfw][68000]Accuracy-Highest: 0.99783 Training: 2021-03-15 02:55:53,124-[cfp_fp][68000]XNorm: 19.698687 Training: 2021-03-15 02:55:53,125-[cfp_fp][68000]Accuracy-Flip: 0.98214+-0.00280 Training: 2021-03-15 02:55:53,125-[cfp_fp][68000]Accuracy-Highest: 0.98343 Training: 2021-03-15 02:56:23,932-[agedb_30][68000]XNorm: 21.664952 Training: 2021-03-15 02:56:23,933-[agedb_30][68000]Accuracy-Flip: 0.97833+-0.00654 Training: 
2021-03-15 02:56:23,933-[agedb_30][68000]Accuracy-Highest: 0.97867 Training: 2021-03-15 02:56:35,693-Speed 469.45 samples/sec Loss 5.1381 Epoch: 13 Global Step: 68050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:56:47,081-Speed 4496.15 samples/sec Loss 5.0916 Epoch: 13 Global Step: 68100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:56:58,776-Speed 4378.04 samples/sec Loss 5.0993 Epoch: 13 Global Step: 68150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:57:10,447-Speed 4387.20 samples/sec Loss 5.1697 Epoch: 13 Global Step: 68200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:57:22,096-Speed 4395.16 samples/sec Loss 5.1153 Epoch: 13 Global Step: 68250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:57:33,690-Speed 4416.29 samples/sec Loss 5.1097 Epoch: 13 Global Step: 68300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:57:45,280-Speed 4417.90 samples/sec Loss 5.1425 Epoch: 13 Global Step: 68350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:57:57,161-Speed 4309.45 samples/sec Loss 5.1211 Epoch: 13 Global Step: 68400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:58:08,958-Speed 4340.48 samples/sec Loss 5.1807 Epoch: 13 Global Step: 68450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:58:20,478-Speed 4444.46 samples/sec Loss 5.1327 Epoch: 13 Global Step: 68500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:58:31,903-Speed 4481.51 samples/sec Loss 5.1304 Epoch: 13 Global Step: 68550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:58:43,234-Speed 4519.01 samples/sec Loss 5.1869 Epoch: 13 Global Step: 68600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:58:55,061-Speed 4329.11 samples/sec Loss 5.1083 Epoch: 13 Global Step: 68650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:59:06,695-Speed 4401.02 samples/sec Loss 5.1283 Epoch: 13 
Global Step: 68700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:59:18,394-Speed 4376.64 samples/sec Loss 5.1204 Epoch: 13 Global Step: 68750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:59:29,915-Speed 4444.45 samples/sec Loss 5.1571 Epoch: 13 Global Step: 68800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:59:41,628-Speed 4371.48 samples/sec Loss 5.1528 Epoch: 13 Global Step: 68850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 02:59:53,113-Speed 4457.90 samples/sec Loss 5.1322 Epoch: 13 Global Step: 68900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:00:05,002-Speed 4306.93 samples/sec Loss 5.1648 Epoch: 13 Global Step: 68950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:00:16,470-Speed 4464.66 samples/sec Loss 5.1643 Epoch: 13 Global Step: 69000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:00:28,511-Speed 4252.12 samples/sec Loss 5.0890 Epoch: 13 Global Step: 69050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:00:40,272-Speed 4353.82 samples/sec Loss 5.1464 Epoch: 13 Global Step: 69100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:00:51,991-Speed 4369.01 samples/sec Loss 5.1862 Epoch: 13 Global Step: 69150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:01:03,512-Speed 4444.12 samples/sec Loss 5.1513 Epoch: 13 Global Step: 69200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:01:15,123-Speed 4410.06 samples/sec Loss 5.2001 Epoch: 13 Global Step: 69250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:01:26,714-Speed 4417.27 samples/sec Loss 5.1191 Epoch: 13 Global Step: 69300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:01:38,682-Speed 4278.25 samples/sec Loss 5.1933 Epoch: 13 Global Step: 69350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:01:50,224-Speed 4436.31 samples/sec Loss 5.1782 Epoch: 13 Global 
Step: 69400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:02:02,173-Speed 4285.06 samples/sec Loss 5.1261 Epoch: 13 Global Step: 69450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:02:14,018-Speed 4322.53 samples/sec Loss 5.1307 Epoch: 13 Global Step: 69500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:02:25,620-Speed 4413.31 samples/sec Loss 5.2221 Epoch: 13 Global Step: 69550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:02:37,221-Speed 4413.28 samples/sec Loss 5.1330 Epoch: 13 Global Step: 69600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:02:48,679-Speed 4468.97 samples/sec Loss 5.1770 Epoch: 13 Global Step: 69650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:03:00,224-Speed 4434.76 samples/sec Loss 5.1746 Epoch: 13 Global Step: 69700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:03:14,678-Speed 3542.63 samples/sec Loss 5.1497 Epoch: 14 Global Step: 69750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:03:27,681-Speed 3937.80 samples/sec Loss 4.3806 Epoch: 14 Global Step: 69800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:03:39,206-Speed 4442.74 samples/sec Loss 4.4112 Epoch: 14 Global Step: 69850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:03:50,997-Speed 4342.46 samples/sec Loss 4.4108 Epoch: 14 Global Step: 69900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:04:02,771-Speed 4348.63 samples/sec Loss 4.4581 Epoch: 14 Global Step: 69950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:04:14,402-Speed 4402.45 samples/sec Loss 4.4699 Epoch: 14 Global Step: 70000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:04:45,237-[lfw][70000]XNorm: 22.769371 Training: 2021-03-15 03:04:45,237-[lfw][70000]Accuracy-Flip: 0.99750+-0.00250 Training: 2021-03-15 03:04:45,237-[lfw][70000]Accuracy-Highest: 0.99783 Training: 2021-03-15 
03:05:20,981-[cfp_fp][70000]XNorm: 20.167005 Training: 2021-03-15 03:05:20,982-[cfp_fp][70000]Accuracy-Flip: 0.98214+-0.00363 Training: 2021-03-15 03:05:20,982-[cfp_fp][70000]Accuracy-Highest: 0.98343 Training: 2021-03-15 03:05:51,855-[agedb_30][70000]XNorm: 22.341581 Training: 2021-03-15 03:05:51,856-[agedb_30][70000]Accuracy-Flip: 0.97767+-0.00633 Training: 2021-03-15 03:05:51,856-[agedb_30][70000]Accuracy-Highest: 0.97867 Training: 2021-03-15 03:06:03,390-Speed 469.78 samples/sec Loss 4.6028 Epoch: 14 Global Step: 70050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:06:14,898-Speed 4449.23 samples/sec Loss 4.5178 Epoch: 14 Global Step: 70100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:06:26,634-Speed 4362.76 samples/sec Loss 4.5960 Epoch: 14 Global Step: 70150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:06:38,346-Speed 4372.03 samples/sec Loss 4.6202 Epoch: 14 Global Step: 70200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:06:50,044-Speed 4376.72 samples/sec Loss 4.6595 Epoch: 14 Global Step: 70250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:07:01,525-Speed 4459.80 samples/sec Loss 4.7342 Epoch: 14 Global Step: 70300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:07:13,139-Speed 4408.68 samples/sec Loss 4.6889 Epoch: 14 Global Step: 70350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:07:25,197-Speed 4246.48 samples/sec Loss 4.7745 Epoch: 14 Global Step: 70400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:07:36,698-Speed 4451.95 samples/sec Loss 4.6687 Epoch: 14 Global Step: 70450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:07:48,331-Speed 4401.44 samples/sec Loss 4.7667 Epoch: 14 Global Step: 70500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:07:59,972-Speed 4398.47 samples/sec Loss 4.7586 Epoch: 14 Global Step: 70550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 
2021-03-15 03:08:12,100-Speed 4221.72 samples/sec Loss 4.7877 Epoch: 14 Global Step: 70600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:08:23,675-Speed 4423.49 samples/sec Loss 4.7951 Epoch: 14 Global Step: 70650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:08:35,288-Speed 4408.98 samples/sec Loss 4.8388 Epoch: 14 Global Step: 70700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:08:46,832-Speed 4435.34 samples/sec Loss 4.7896 Epoch: 14 Global Step: 70750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:08:58,610-Speed 4347.51 samples/sec Loss 4.8403 Epoch: 14 Global Step: 70800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:09:10,286-Speed 4385.10 samples/sec Loss 4.8858 Epoch: 14 Global Step: 70850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:09:21,984-Speed 4376.97 samples/sec Loss 4.8692 Epoch: 14 Global Step: 70900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:09:33,792-Speed 4336.36 samples/sec Loss 4.9046 Epoch: 14 Global Step: 70950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:09:45,832-Speed 4252.76 samples/sec Loss 4.9107 Epoch: 14 Global Step: 71000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:09:57,315-Speed 4458.67 samples/sec Loss 4.9644 Epoch: 14 Global Step: 71050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:10:09,177-Speed 4316.51 samples/sec Loss 4.9178 Epoch: 14 Global Step: 71100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:10:20,822-Speed 4397.06 samples/sec Loss 4.9272 Epoch: 14 Global Step: 71150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:10:32,651-Speed 4328.40 samples/sec Loss 4.9485 Epoch: 14 Global Step: 71200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:10:44,278-Speed 4403.74 samples/sec Loss 4.9510 Epoch: 14 Global Step: 71250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 
03:10:55,753-Speed 4462.02 samples/sec Loss 4.9630 Epoch: 14 Global Step: 71300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:11:07,393-Speed 4398.82 samples/sec Loss 4.9962 Epoch: 14 Global Step: 71350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:11:18,986-Speed 4416.79 samples/sec Loss 4.9945 Epoch: 14 Global Step: 71400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:11:30,769-Speed 4345.46 samples/sec Loss 4.9701 Epoch: 14 Global Step: 71450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:11:42,257-Speed 4456.78 samples/sec Loss 4.9859 Epoch: 14 Global Step: 71500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:11:53,699-Speed 4475.00 samples/sec Loss 4.9387 Epoch: 14 Global Step: 71550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:12:05,390-Speed 4379.69 samples/sec Loss 4.9767 Epoch: 14 Global Step: 71600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:12:16,940-Speed 4433.06 samples/sec Loss 4.9419 Epoch: 14 Global Step: 71650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:12:28,488-Speed 4433.96 samples/sec Loss 4.9971 Epoch: 14 Global Step: 71700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:12:40,108-Speed 4406.45 samples/sec Loss 5.0694 Epoch: 14 Global Step: 71750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:12:51,729-Speed 4405.88 samples/sec Loss 4.9716 Epoch: 14 Global Step: 71800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:13:03,246-Speed 4445.82 samples/sec Loss 4.9861 Epoch: 14 Global Step: 71850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:13:14,603-Speed 4508.32 samples/sec Loss 5.0493 Epoch: 14 Global Step: 71900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:13:26,242-Speed 4399.36 samples/sec Loss 5.0568 Epoch: 14 Global Step: 71950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 
03:13:38,072-Speed 4328.35 samples/sec Loss 4.9654 Epoch: 14 Global Step: 72000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:14:08,635-[lfw][72000]XNorm: 21.525573 Training: 2021-03-15 03:14:08,635-[lfw][72000]Accuracy-Flip: 0.99733+-0.00238 Training: 2021-03-15 03:14:08,635-[lfw][72000]Accuracy-Highest: 0.99783 Training: 2021-03-15 03:14:44,159-[cfp_fp][72000]XNorm: 19.145940 Training: 2021-03-15 03:14:44,160-[cfp_fp][72000]Accuracy-Flip: 0.98243+-0.00419 Training: 2021-03-15 03:14:44,160-[cfp_fp][72000]Accuracy-Highest: 0.98343 Training: 2021-03-15 03:15:14,732-[agedb_30][72000]XNorm: 21.117880 Training: 2021-03-15 03:15:14,732-[agedb_30][72000]Accuracy-Flip: 0.97783+-0.00654 Training: 2021-03-15 03:15:14,732-[agedb_30][72000]Accuracy-Highest: 0.97867 Training: 2021-03-15 03:15:26,276-Speed 473.18 samples/sec Loss 5.0482 Epoch: 14 Global Step: 72050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:15:38,170-Speed 4304.77 samples/sec Loss 5.0446 Epoch: 14 Global Step: 72100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:15:49,705-Speed 4438.91 samples/sec Loss 5.0153 Epoch: 14 Global Step: 72150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:16:01,221-Speed 4446.29 samples/sec Loss 5.0565 Epoch: 14 Global Step: 72200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:16:12,701-Speed 4459.93 samples/sec Loss 5.0766 Epoch: 14 Global Step: 72250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:16:24,369-Speed 4388.51 samples/sec Loss 5.0809 Epoch: 14 Global Step: 72300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:16:36,151-Speed 4345.62 samples/sec Loss 5.0436 Epoch: 14 Global Step: 72350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:16:47,622-Speed 4463.73 samples/sec Loss 5.0645 Epoch: 14 Global Step: 72400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:16:59,318-Speed 4377.65 samples/sec Loss 5.0413 Epoch: 
14 Global Step: 72450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:17:11,184-Speed 4314.97 samples/sec Loss 5.1153 Epoch: 14 Global Step: 72500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:17:22,713-Speed 4441.11 samples/sec Loss 5.0290 Epoch: 14 Global Step: 72550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:17:34,376-Speed 4390.23 samples/sec Loss 5.0961 Epoch: 14 Global Step: 72600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:17:46,134-Speed 4354.62 samples/sec Loss 5.0996 Epoch: 14 Global Step: 72650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:17:57,591-Speed 4469.10 samples/sec Loss 5.0899 Epoch: 14 Global Step: 72700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:18:09,108-Speed 4445.94 samples/sec Loss 5.0390 Epoch: 14 Global Step: 72750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:18:20,631-Speed 4443.44 samples/sec Loss 5.0990 Epoch: 14 Global Step: 72800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:18:32,379-Speed 4358.33 samples/sec Loss 5.1088 Epoch: 14 Global Step: 72850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:18:44,279-Speed 4302.60 samples/sec Loss 5.0472 Epoch: 14 Global Step: 72900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:18:56,051-Speed 4349.48 samples/sec Loss 5.0854 Epoch: 14 Global Step: 72950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:19:07,810-Speed 4354.35 samples/sec Loss 5.1373 Epoch: 14 Global Step: 73000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:19:19,882-Speed 4241.38 samples/sec Loss 5.0800 Epoch: 14 Global Step: 73050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:19:31,399-Speed 4446.01 samples/sec Loss 5.1114 Epoch: 14 Global Step: 73100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:19:43,332-Speed 4290.59 samples/sec Loss 5.0704 Epoch: 14 Global 
Step: 73150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:19:54,751-Speed 4484.03 samples/sec Loss 5.0726 Epoch: 14 Global Step: 73200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:20:06,381-Speed 4402.50 samples/sec Loss 5.1007 Epoch: 14 Global Step: 73250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:20:17,826-Speed 4473.78 samples/sec Loss 5.1117 Epoch: 14 Global Step: 73300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:20:29,487-Speed 4391.03 samples/sec Loss 5.0777 Epoch: 14 Global Step: 73350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:20:41,155-Speed 4388.29 samples/sec Loss 5.1375 Epoch: 14 Global Step: 73400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:20:53,043-Speed 4306.96 samples/sec Loss 5.0841 Epoch: 14 Global Step: 73450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:21:04,605-Speed 4428.73 samples/sec Loss 5.0960 Epoch: 14 Global Step: 73500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:21:16,329-Speed 4366.99 samples/sec Loss 5.1338 Epoch: 14 Global Step: 73550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:21:27,951-Speed 4405.88 samples/sec Loss 5.0992 Epoch: 14 Global Step: 73600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:21:39,802-Speed 4320.49 samples/sec Loss 5.0882 Epoch: 14 Global Step: 73650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:21:51,520-Speed 4369.51 samples/sec Loss 5.1040 Epoch: 14 Global Step: 73700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:22:03,203-Speed 4382.54 samples/sec Loss 5.0994 Epoch: 14 Global Step: 73750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:22:14,991-Speed 4343.45 samples/sec Loss 5.1206 Epoch: 14 Global Step: 73800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:22:26,984-Speed 4269.40 samples/sec Loss 5.0911 Epoch: 14 Global Step: 73850 
Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:22:38,830-Speed 4322.52 samples/sec Loss 5.1375 Epoch: 14 Global Step: 73900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:22:50,698-Speed 4314.31 samples/sec Loss 5.0762 Epoch: 14 Global Step: 73950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:23:02,129-Speed 4479.20 samples/sec Loss 5.1231 Epoch: 14 Global Step: 74000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:23:32,978-[lfw][74000]XNorm: 22.081088 Training: 2021-03-15 03:23:32,978-[lfw][74000]Accuracy-Flip: 0.99750+-0.00271 Training: 2021-03-15 03:23:32,978-[lfw][74000]Accuracy-Highest: 0.99783 Training: 2021-03-15 03:24:08,782-[cfp_fp][74000]XNorm: 19.710361 Training: 2021-03-15 03:24:08,782-[cfp_fp][74000]Accuracy-Flip: 0.98186+-0.00542 Training: 2021-03-15 03:24:08,783-[cfp_fp][74000]Accuracy-Highest: 0.98343 Training: 2021-03-15 03:24:39,547-[agedb_30][74000]XNorm: 21.802135 Training: 2021-03-15 03:24:39,547-[agedb_30][74000]Accuracy-Flip: 0.97717+-0.00592 Training: 2021-03-15 03:24:39,547-[agedb_30][74000]Accuracy-Highest: 0.97867 Training: 2021-03-15 03:24:51,173-Speed 469.54 samples/sec Loss 5.1087 Epoch: 14 Global Step: 74050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:25:02,743-Speed 4425.37 samples/sec Loss 5.0807 Epoch: 14 Global Step: 74100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:25:14,393-Speed 4395.19 samples/sec Loss 5.1273 Epoch: 14 Global Step: 74150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:25:26,031-Speed 4399.44 samples/sec Loss 5.1058 Epoch: 14 Global Step: 74200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:25:38,028-Speed 4268.02 samples/sec Loss 5.1279 Epoch: 14 Global Step: 74250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:25:49,919-Speed 4305.99 samples/sec Loss 5.0609 Epoch: 14 Global Step: 74300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 
2021-03-15 03:26:01,611-Speed 4379.09 samples/sec Loss 5.1676 Epoch: 14 Global Step: 74350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:26:13,244-Speed 4401.51 samples/sec Loss 5.0841 Epoch: 14 Global Step: 74400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:26:24,675-Speed 4479.18 samples/sec Loss 5.1622 Epoch: 14 Global Step: 74450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:26:36,199-Speed 4443.32 samples/sec Loss 5.0551 Epoch: 14 Global Step: 74500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:26:47,781-Speed 4420.71 samples/sec Loss 5.0544 Epoch: 14 Global Step: 74550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:26:59,338-Speed 4430.24 samples/sec Loss 5.0499 Epoch: 14 Global Step: 74600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:27:11,046-Speed 4373.34 samples/sec Loss 5.1219 Epoch: 14 Global Step: 74650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:27:22,680-Speed 4401.09 samples/sec Loss 5.0819 Epoch: 14 Global Step: 74700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:27:37,869-Speed 3371.05 samples/sec Loss 4.8360 Epoch: 15 Global Step: 74750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:27:50,087-Speed 4190.78 samples/sec Loss 4.3197 Epoch: 15 Global Step: 74800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:28:01,746-Speed 4391.86 samples/sec Loss 4.3474 Epoch: 15 Global Step: 74850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:28:14,127-Speed 4135.49 samples/sec Loss 4.4149 Epoch: 15 Global Step: 74900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:28:25,836-Speed 4372.98 samples/sec Loss 4.4589 Epoch: 15 Global Step: 74950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:28:37,372-Speed 4438.46 samples/sec Loss 4.4671 Epoch: 15 Global Step: 75000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 
03:28:48,957-Speed 4419.63 samples/sec Loss 4.4772 Epoch: 15 Global Step: 75050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:29:00,785-Speed 4328.99 samples/sec Loss 4.5163 Epoch: 15 Global Step: 75100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:29:12,445-Speed 4391.10 samples/sec Loss 4.5836 Epoch: 15 Global Step: 75150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:29:23,907-Speed 4467.20 samples/sec Loss 4.5871 Epoch: 15 Global Step: 75200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:29:35,506-Speed 4414.34 samples/sec Loss 4.5752 Epoch: 15 Global Step: 75250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:29:47,059-Speed 4431.97 samples/sec Loss 4.6588 Epoch: 15 Global Step: 75300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:29:58,480-Speed 4483.02 samples/sec Loss 4.6703 Epoch: 15 Global Step: 75350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:30:10,059-Speed 4422.19 samples/sec Loss 4.6696 Epoch: 15 Global Step: 75400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:30:21,831-Speed 4349.43 samples/sec Loss 4.6555 Epoch: 15 Global Step: 75450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:30:33,782-Speed 4284.12 samples/sec Loss 4.7170 Epoch: 15 Global Step: 75500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:30:45,351-Speed 4425.99 samples/sec Loss 4.7314 Epoch: 15 Global Step: 75550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:30:57,148-Speed 4340.18 samples/sec Loss 4.7455 Epoch: 15 Global Step: 75600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:31:08,704-Speed 4431.03 samples/sec Loss 4.8018 Epoch: 15 Global Step: 75650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:31:20,611-Speed 4299.96 samples/sec Loss 4.8249 Epoch: 15 Global Step: 75700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 
03:31:32,119-Speed 4449.26 samples/sec Loss 4.7885 Epoch: 15 Global Step: 75750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:31:43,644-Speed 4442.70 samples/sec Loss 4.8140 Epoch: 15 Global Step: 75800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:31:55,317-Speed 4386.49 samples/sec Loss 4.8266 Epoch: 15 Global Step: 75850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:32:07,083-Speed 4351.72 samples/sec Loss 4.8293 Epoch: 15 Global Step: 75900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:32:18,566-Speed 4458.95 samples/sec Loss 4.8594 Epoch: 15 Global Step: 75950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:32:30,204-Speed 4399.46 samples/sec Loss 4.8458 Epoch: 15 Global Step: 76000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:33:01,130-[lfw][76000]XNorm: 22.692948 Training: 2021-03-15 03:33:01,130-[lfw][76000]Accuracy-Flip: 0.99783+-0.00248 Training: 2021-03-15 03:33:01,130-[lfw][76000]Accuracy-Highest: 0.99783 Training: 2021-03-15 03:33:36,996-[cfp_fp][76000]XNorm: 20.231139 Training: 2021-03-15 03:33:36,996-[cfp_fp][76000]Accuracy-Flip: 0.98057+-0.00415 Training: 2021-03-15 03:33:36,996-[cfp_fp][76000]Accuracy-Highest: 0.98343 Training: 2021-03-15 03:34:07,837-[agedb_30][76000]XNorm: 22.375511 Training: 2021-03-15 03:34:07,837-[agedb_30][76000]Accuracy-Flip: 0.97667+-0.00641 Training: 2021-03-15 03:34:07,837-[agedb_30][76000]Accuracy-Highest: 0.97867 Training: 2021-03-15 03:34:19,338-Speed 469.15 samples/sec Loss 4.8407 Epoch: 15 Global Step: 76050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:34:30,877-Speed 4437.64 samples/sec Loss 4.8847 Epoch: 15 Global Step: 76100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:34:42,483-Speed 4411.67 samples/sec Loss 4.9240 Epoch: 15 Global Step: 76150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:34:55,076-Speed 4065.89 samples/sec Loss 4.8487 Epoch: 
15 Global Step: 76200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:35:06,747-Speed 4387.00 samples/sec Loss 4.9224 Epoch: 15 Global Step: 76250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:35:18,155-Speed 4488.22 samples/sec Loss 4.9409 Epoch: 15 Global Step: 76300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:35:29,744-Speed 4418.27 samples/sec Loss 4.8910 Epoch: 15 Global Step: 76350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:35:41,313-Speed 4425.69 samples/sec Loss 4.8851 Epoch: 15 Global Step: 76400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:35:52,901-Speed 4418.50 samples/sec Loss 4.9452 Epoch: 15 Global Step: 76450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:36:04,729-Speed 4329.16 samples/sec Loss 4.9314 Epoch: 15 Global Step: 76500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:36:16,201-Speed 4463.06 samples/sec Loss 4.9255 Epoch: 15 Global Step: 76550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:36:28,145-Speed 4286.78 samples/sec Loss 4.9683 Epoch: 15 Global Step: 76600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:36:39,748-Speed 4412.90 samples/sec Loss 4.9172 Epoch: 15 Global Step: 76650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:36:51,155-Speed 4488.66 samples/sec Loss 4.9750 Epoch: 15 Global Step: 76700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:37:02,738-Speed 4420.61 samples/sec Loss 4.9874 Epoch: 15 Global Step: 76750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:37:14,534-Speed 4340.50 samples/sec Loss 5.0155 Epoch: 15 Global Step: 76800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:37:26,006-Speed 4463.34 samples/sec Loss 4.9395 Epoch: 15 Global Step: 76850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:37:37,598-Speed 4417.03 samples/sec Loss 5.0024 Epoch: 15 Global 
Step: 76900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:37:49,436-Speed 4325.29 samples/sec Loss 4.9893 Epoch: 15 Global Step: 76950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:38:00,946-Speed 4448.60 samples/sec Loss 4.9618 Epoch: 15 Global Step: 77000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:38:12,627-Speed 4383.36 samples/sec Loss 5.0113 Epoch: 15 Global Step: 77050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:38:24,536-Speed 4299.41 samples/sec Loss 4.9981 Epoch: 15 Global Step: 77100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:38:36,366-Speed 4327.96 samples/sec Loss 5.0152 Epoch: 15 Global Step: 77150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:38:47,814-Speed 4472.85 samples/sec Loss 4.9440 Epoch: 15 Global Step: 77200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:38:59,518-Speed 4374.52 samples/sec Loss 4.9758 Epoch: 15 Global Step: 77250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:39:11,101-Speed 4420.75 samples/sec Loss 4.9894 Epoch: 15 Global Step: 77300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:39:22,965-Speed 4315.53 samples/sec Loss 5.0063 Epoch: 15 Global Step: 77350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:39:34,493-Speed 4441.62 samples/sec Loss 5.0388 Epoch: 15 Global Step: 77400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:39:46,266-Speed 4349.04 samples/sec Loss 4.9915 Epoch: 15 Global Step: 77450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:39:58,444-Speed 4204.69 samples/sec Loss 5.0350 Epoch: 15 Global Step: 77500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:40:10,098-Speed 4393.25 samples/sec Loss 5.0337 Epoch: 15 Global Step: 77550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:40:21,537-Speed 4476.19 samples/sec Loss 5.0112 Epoch: 15 Global Step: 77600 
Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:40:33,068-Speed 4440.40 samples/sec Loss 5.0152 Epoch: 15 Global Step: 77650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:40:44,479-Speed 4487.39 samples/sec Loss 4.9975 Epoch: 15 Global Step: 77700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:40:56,488-Speed 4263.65 samples/sec Loss 5.0531 Epoch: 15 Global Step: 77750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:41:08,054-Speed 4426.94 samples/sec Loss 5.0757 Epoch: 15 Global Step: 77800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:41:19,769-Speed 4370.44 samples/sec Loss 5.0242 Epoch: 15 Global Step: 77850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:41:31,324-Speed 4431.31 samples/sec Loss 5.0326 Epoch: 15 Global Step: 77900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:41:42,821-Speed 4453.34 samples/sec Loss 5.0512 Epoch: 15 Global Step: 77950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:41:54,720-Speed 4303.26 samples/sec Loss 5.0603 Epoch: 15 Global Step: 78000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:42:25,486-[lfw][78000]XNorm: 21.726639 Training: 2021-03-15 03:42:25,487-[lfw][78000]Accuracy-Flip: 0.99733+-0.00200 Training: 2021-03-15 03:42:25,487-[lfw][78000]Accuracy-Highest: 0.99783 Training: 2021-03-15 03:43:01,238-[cfp_fp][78000]XNorm: 19.206605 Training: 2021-03-15 03:43:01,238-[cfp_fp][78000]Accuracy-Flip: 0.97971+-0.00312 Training: 2021-03-15 03:43:01,238-[cfp_fp][78000]Accuracy-Highest: 0.98343 Training: 2021-03-15 03:43:32,038-[agedb_30][78000]XNorm: 21.214429 Training: 2021-03-15 03:43:32,038-[agedb_30][78000]Accuracy-Flip: 0.97517+-0.00584 Training: 2021-03-15 03:43:32,038-[agedb_30][78000]Accuracy-Highest: 0.97867 Training: 2021-03-15 03:43:43,590-Speed 470.29 samples/sec Loss 5.0328 Epoch: 15 Global Step: 78050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 
2021-03-15 03:43:55,263-Speed 4386.17 samples/sec Loss 5.0817 Epoch: 15 Global Step: 78100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:44:06,981-Speed 4369.73 samples/sec Loss 5.0494 Epoch: 15 Global Step: 78150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:44:18,940-Speed 4281.46 samples/sec Loss 5.0980 Epoch: 15 Global Step: 78200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:44:30,644-Speed 4374.77 samples/sec Loss 5.0472 Epoch: 15 Global Step: 78250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:44:42,046-Speed 4490.43 samples/sec Loss 5.0607 Epoch: 15 Global Step: 78300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:44:53,578-Speed 4440.16 samples/sec Loss 5.0478 Epoch: 15 Global Step: 78350 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:45:05,238-Speed 4391.35 samples/sec Loss 5.0496 Epoch: 15 Global Step: 78400 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:45:17,255-Speed 4260.74 samples/sec Loss 5.0481 Epoch: 15 Global Step: 78450 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:45:28,808-Speed 4431.90 samples/sec Loss 5.0594 Epoch: 15 Global Step: 78500 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:45:40,387-Speed 4421.89 samples/sec Loss 5.0590 Epoch: 15 Global Step: 78550 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:45:51,953-Speed 4426.83 samples/sec Loss 5.0707 Epoch: 15 Global Step: 78600 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:46:03,507-Speed 4431.67 samples/sec Loss 5.0701 Epoch: 15 Global Step: 78650 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:46:15,097-Speed 4417.88 samples/sec Loss 5.0671 Epoch: 15 Global Step: 78700 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:46:27,298-Speed 4196.33 samples/sec Loss 5.0512 Epoch: 15 Global Step: 78750 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 
03:46:38,963-Speed 4389.58 samples/sec Loss 5.0160 Epoch: 15 Global Step: 78800 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:46:50,859-Speed 4304.01 samples/sec Loss 5.0716 Epoch: 15 Global Step: 78850 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:47:02,735-Speed 4311.50 samples/sec Loss 5.0897 Epoch: 15 Global Step: 78900 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:47:14,394-Speed 4391.57 samples/sec Loss 5.0397 Epoch: 15 Global Step: 78950 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:47:26,024-Speed 4402.75 samples/sec Loss 5.0686 Epoch: 15 Global Step: 79000 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:47:37,622-Speed 4414.65 samples/sec Loss 5.0692 Epoch: 15 Global Step: 79050 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:47:49,443-Speed 4331.41 samples/sec Loss 5.0341 Epoch: 15 Global Step: 79100 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:48:01,144-Speed 4376.02 samples/sec Loss 5.0601 Epoch: 15 Global Step: 79150 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:48:12,767-Speed 4405.35 samples/sec Loss 5.0322 Epoch: 15 Global Step: 79200 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:48:24,361-Speed 4416.38 samples/sec Loss 5.0637 Epoch: 15 Global Step: 79250 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:48:36,362-Speed 4266.51 samples/sec Loss 5.0513 Epoch: 15 Global Step: 79300 Fp16 Grad Scale: 16384 Required: 4 hours Training: 2021-03-15 03:48:48,162-Speed 4338.89 samples/sec Loss 5.1077 Epoch: 15 Global Step: 79350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:48:59,788-Speed 4404.11 samples/sec Loss 5.0730 Epoch: 15 Global Step: 79400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:49:11,218-Speed 4479.53 samples/sec Loss 5.0777 Epoch: 15 Global Step: 79450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 
03:49:23,133-Speed 4297.52 samples/sec Loss 5.0810 Epoch: 15 Global Step: 79500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:49:34,786-Speed 4393.77 samples/sec Loss 5.0844 Epoch: 15 Global Step: 79550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:49:46,593-Speed 4336.66 samples/sec Loss 5.0831 Epoch: 15 Global Step: 79600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:49:58,158-Speed 4427.38 samples/sec Loss 5.0566 Epoch: 15 Global Step: 79650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:50:09,933-Speed 4348.46 samples/sec Loss 5.0110 Epoch: 15 Global Step: 79700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:50:26,060-Speed 3174.84 samples/sec Loss 4.2868 Epoch: 16 Global Step: 79750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:50:37,664-Speed 4412.66 samples/sec Loss 3.7307 Epoch: 16 Global Step: 79800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:50:49,334-Speed 4387.48 samples/sec Loss 3.6937 Epoch: 16 Global Step: 79850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:51:00,811-Speed 4461.32 samples/sec Loss 3.5259 Epoch: 16 Global Step: 79900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:51:12,445-Speed 4401.17 samples/sec Loss 3.5156 Epoch: 16 Global Step: 79950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:51:24,282-Speed 4325.41 samples/sec Loss 3.4750 Epoch: 16 Global Step: 80000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:51:55,096-[lfw][80000]XNorm: 22.616439 Training: 2021-03-15 03:51:55,097-[lfw][80000]Accuracy-Flip: 0.99783+-0.00183 Training: 2021-03-15 03:51:55,097-[lfw][80000]Accuracy-Highest: 0.99783 Training: 2021-03-15 03:52:30,903-[cfp_fp][80000]XNorm: 20.404796 Training: 2021-03-15 03:52:30,903-[cfp_fp][80000]Accuracy-Flip: 0.98586+-0.00445 Training: 2021-03-15 03:52:30,903-[cfp_fp][80000]Accuracy-Highest: 0.98586 Training: 2021-03-15 
03:53:01,877-[agedb_30][80000]XNorm: 22.178103 Training: 2021-03-15 03:53:01,877-[agedb_30][80000]Accuracy-Flip: 0.98067+-0.00646 Training: 2021-03-15 03:53:01,877-[agedb_30][80000]Accuracy-Highest: 0.98067 Training: 2021-03-15 03:53:13,401-Speed 469.22 samples/sec Loss 3.4395 Epoch: 16 Global Step: 80050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:53:25,188-Speed 4343.76 samples/sec Loss 3.4394 Epoch: 16 Global Step: 80100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:53:37,389-Speed 4196.49 samples/sec Loss 3.4209 Epoch: 16 Global Step: 80150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:53:49,012-Speed 4405.33 samples/sec Loss 3.3899 Epoch: 16 Global Step: 80200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:54:00,627-Speed 4408.28 samples/sec Loss 3.3186 Epoch: 16 Global Step: 80250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:54:12,280-Speed 4394.01 samples/sec Loss 3.3398 Epoch: 16 Global Step: 80300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:54:24,028-Speed 4358.34 samples/sec Loss 3.3316 Epoch: 16 Global Step: 80350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:54:35,915-Speed 4307.38 samples/sec Loss 3.2470 Epoch: 16 Global Step: 80400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:54:47,584-Speed 4387.92 samples/sec Loss 3.2912 Epoch: 16 Global Step: 80450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:54:59,276-Speed 4379.08 samples/sec Loss 3.2402 Epoch: 16 Global Step: 80500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:55:10,824-Speed 4434.08 samples/sec Loss 3.2707 Epoch: 16 Global Step: 80550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:55:22,522-Speed 4376.72 samples/sec Loss 3.1961 Epoch: 16 Global Step: 80600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:55:34,147-Speed 4404.50 samples/sec Loss 3.2417 Epoch: 16 Global 
Step: 80650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:55:46,066-Speed 4296.07 samples/sec Loss 3.2081 Epoch: 16 Global Step: 80700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:55:57,663-Speed 4414.95 samples/sec Loss 3.1317 Epoch: 16 Global Step: 80750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:56:09,465-Speed 4338.55 samples/sec Loss 3.1344 Epoch: 16 Global Step: 80800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:56:20,943-Speed 4460.75 samples/sec Loss 3.1781 Epoch: 16 Global Step: 80850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:56:32,924-Speed 4273.79 samples/sec Loss 3.1172 Epoch: 16 Global Step: 80900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:56:44,535-Speed 4409.88 samples/sec Loss 3.1151 Epoch: 16 Global Step: 80950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:56:56,411-Speed 4311.31 samples/sec Loss 3.0751 Epoch: 16 Global Step: 81000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:57:08,120-Speed 4372.96 samples/sec Loss 3.0769 Epoch: 16 Global Step: 81050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:57:19,867-Speed 4358.48 samples/sec Loss 3.1119 Epoch: 16 Global Step: 81100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:57:31,798-Speed 4291.79 samples/sec Loss 3.0809 Epoch: 16 Global Step: 81150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:57:43,793-Speed 4268.55 samples/sec Loss 3.0608 Epoch: 16 Global Step: 81200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:57:55,224-Speed 4479.09 samples/sec Loss 3.1095 Epoch: 16 Global Step: 81250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:58:06,820-Speed 4415.72 samples/sec Loss 3.0344 Epoch: 16 Global Step: 81300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:58:18,610-Speed 4342.66 samples/sec Loss 2.9958 Epoch: 16 Global Step: 81350 
Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:58:30,428-Speed 4332.74 samples/sec Loss 3.0319 Epoch: 16 Global Step: 81400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:58:42,137-Speed 4372.70 samples/sec Loss 3.0495 Epoch: 16 Global Step: 81450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:58:54,172-Speed 4254.67 samples/sec Loss 3.0004 Epoch: 16 Global Step: 81500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:59:05,837-Speed 4389.30 samples/sec Loss 2.9970 Epoch: 16 Global Step: 81550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:59:17,739-Speed 4301.74 samples/sec Loss 3.0370 Epoch: 16 Global Step: 81600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:59:29,322-Speed 4420.55 samples/sec Loss 3.0047 Epoch: 16 Global Step: 81650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:59:41,003-Speed 4383.55 samples/sec Loss 2.9773 Epoch: 16 Global Step: 81700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 03:59:52,596-Speed 4416.47 samples/sec Loss 3.0073 Epoch: 16 Global Step: 81750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:00:04,196-Speed 4413.96 samples/sec Loss 2.9967 Epoch: 16 Global Step: 81800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:00:15,659-Speed 4466.94 samples/sec Loss 2.9984 Epoch: 16 Global Step: 81850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:00:27,279-Speed 4406.35 samples/sec Loss 2.9742 Epoch: 16 Global Step: 81900 Fp16 Grad Scale: 8192 Required: 3 hours Training: 2021-03-15 04:00:38,939-Speed 4391.34 samples/sec Loss 2.9347 Epoch: 16 Global Step: 81950 Fp16 Grad Scale: 8192 Required: 3 hours Training: 2021-03-15 04:00:50,545-Speed 4411.54 samples/sec Loss 2.9400 Epoch: 16 Global Step: 82000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:01:21,329-[lfw][82000]XNorm: 22.419023 Training: 2021-03-15 
04:01:21,330-[lfw][82000]Accuracy-Flip: 0.99783+-0.00183 Training: 2021-03-15 04:01:21,330-[lfw][82000]Accuracy-Highest: 0.99783 Training: 2021-03-15 04:01:56,997-[cfp_fp][82000]XNorm: 20.453627 Training: 2021-03-15 04:01:56,997-[cfp_fp][82000]Accuracy-Flip: 0.98714+-0.00263 Training: 2021-03-15 04:01:56,997-[cfp_fp][82000]Accuracy-Highest: 0.98714 Training: 2021-03-15 04:02:27,586-[agedb_30][82000]XNorm: 22.185378 Training: 2021-03-15 04:02:27,586-[agedb_30][82000]Accuracy-Flip: 0.98150+-0.00713 Training: 2021-03-15 04:02:27,586-[agedb_30][82000]Accuracy-Highest: 0.98150 Training: 2021-03-15 04:02:39,118-Speed 471.57 samples/sec Loss 2.9547 Epoch: 16 Global Step: 82050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:02:50,892-Speed 4348.75 samples/sec Loss 2.9734 Epoch: 16 Global Step: 82100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:03:02,654-Speed 4353.07 samples/sec Loss 2.9110 Epoch: 16 Global Step: 82150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:03:14,434-Speed 4346.80 samples/sec Loss 2.9119 Epoch: 16 Global Step: 82200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:03:26,034-Speed 4413.99 samples/sec Loss 2.9503 Epoch: 16 Global Step: 82250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:03:37,511-Speed 4461.16 samples/sec Loss 2.8959 Epoch: 16 Global Step: 82300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:03:48,975-Speed 4466.19 samples/sec Loss 2.9324 Epoch: 16 Global Step: 82350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:04:00,487-Speed 4447.89 samples/sec Loss 2.9100 Epoch: 16 Global Step: 82400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:04:11,985-Speed 4453.09 samples/sec Loss 2.9233 Epoch: 16 Global Step: 82450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:04:23,748-Speed 4352.75 samples/sec Loss 2.8727 Epoch: 16 Global Step: 82500 Fp16 Grad Scale: 16384 Required: 3 hours 
Training: 2021-03-15 04:04:35,490-Speed 4360.72 samples/sec Loss 2.9012 Epoch: 16 Global Step: 82550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:04:46,936-Speed 4473.32 samples/sec Loss 2.8830 Epoch: 16 Global Step: 82600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:04:58,984-Speed 4249.67 samples/sec Loss 2.8439 Epoch: 16 Global Step: 82650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:05:10,659-Speed 4385.83 samples/sec Loss 2.8749 Epoch: 16 Global Step: 82700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:05:22,275-Speed 4407.88 samples/sec Loss 2.8493 Epoch: 16 Global Step: 82750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:05:33,768-Speed 4454.87 samples/sec Loss 2.8533 Epoch: 16 Global Step: 82800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:05:45,404-Speed 4400.70 samples/sec Loss 2.8855 Epoch: 16 Global Step: 82850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:05:57,185-Speed 4345.85 samples/sec Loss 2.8472 Epoch: 16 Global Step: 82900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:06:08,869-Speed 4382.56 samples/sec Loss 2.8169 Epoch: 16 Global Step: 82950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:06:20,353-Speed 4458.38 samples/sec Loss 2.7891 Epoch: 16 Global Step: 83000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:06:32,250-Speed 4303.98 samples/sec Loss 2.8738 Epoch: 16 Global Step: 83050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:06:43,732-Speed 4459.06 samples/sec Loss 2.8372 Epoch: 16 Global Step: 83100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:06:55,253-Speed 4444.48 samples/sec Loss 2.7831 Epoch: 16 Global Step: 83150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:07:06,892-Speed 4398.94 samples/sec Loss 2.7757 Epoch: 16 Global Step: 83200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 
2021-03-15 04:07:18,699-Speed 4336.76 samples/sec Loss 2.8615 Epoch: 16 Global Step: 83250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:07:30,286-Speed 4418.77 samples/sec Loss 2.8038 Epoch: 16 Global Step: 83300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:07:42,000-Speed 4371.03 samples/sec Loss 2.8018 Epoch: 16 Global Step: 83350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:07:53,779-Speed 4346.81 samples/sec Loss 2.8089 Epoch: 16 Global Step: 83400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:08:05,458-Speed 4384.13 samples/sec Loss 2.7832 Epoch: 16 Global Step: 83450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:08:17,342-Speed 4308.66 samples/sec Loss 2.7288 Epoch: 16 Global Step: 83500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:08:28,931-Speed 4418.16 samples/sec Loss 2.7307 Epoch: 16 Global Step: 83550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:08:40,599-Speed 4388.30 samples/sec Loss 2.7694 Epoch: 16 Global Step: 83600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:08:52,345-Speed 4359.20 samples/sec Loss 2.7760 Epoch: 16 Global Step: 83650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:09:03,908-Speed 4427.91 samples/sec Loss 2.7623 Epoch: 16 Global Step: 83700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:09:15,657-Speed 4357.91 samples/sec Loss 2.7404 Epoch: 16 Global Step: 83750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:09:27,206-Speed 4433.49 samples/sec Loss 2.7569 Epoch: 16 Global Step: 83800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:09:39,098-Speed 4305.72 samples/sec Loss 2.7417 Epoch: 16 Global Step: 83850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:09:50,653-Speed 4431.28 samples/sec Loss 2.7645 Epoch: 16 Global Step: 83900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 
04:10:02,251-Speed 4414.76 samples/sec Loss 2.8021 Epoch: 16 Global Step: 83950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:10:14,033-Speed 4345.79 samples/sec Loss 2.7313 Epoch: 16 Global Step: 84000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:10:44,805-[lfw][84000]XNorm: 22.465896 Training: 2021-03-15 04:10:44,805-[lfw][84000]Accuracy-Flip: 0.99800+-0.00194 Training: 2021-03-15 04:10:44,805-[lfw][84000]Accuracy-Highest: 0.99800 Training: 2021-03-15 04:11:20,587-[cfp_fp][84000]XNorm: 20.558157 Training: 2021-03-15 04:11:20,587-[cfp_fp][84000]Accuracy-Flip: 0.98743+-0.00298 Training: 2021-03-15 04:11:20,587-[cfp_fp][84000]Accuracy-Highest: 0.98743 Training: 2021-03-15 04:11:51,436-[agedb_30][84000]XNorm: 22.327744 Training: 2021-03-15 04:11:51,436-[agedb_30][84000]Accuracy-Flip: 0.98100+-0.00807 Training: 2021-03-15 04:11:51,436-[agedb_30][84000]Accuracy-Highest: 0.98150 Training: 2021-03-15 04:12:03,030-Speed 469.74 samples/sec Loss 2.7378 Epoch: 16 Global Step: 84050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:12:14,812-Speed 4345.76 samples/sec Loss 2.7358 Epoch: 16 Global Step: 84100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:12:26,273-Speed 4467.66 samples/sec Loss 2.7760 Epoch: 16 Global Step: 84150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:12:38,190-Speed 4296.53 samples/sec Loss 2.7140 Epoch: 16 Global Step: 84200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:12:49,949-Speed 4354.51 samples/sec Loss 2.7246 Epoch: 16 Global Step: 84250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:13:01,519-Speed 4425.27 samples/sec Loss 2.7540 Epoch: 16 Global Step: 84300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:13:13,070-Speed 4432.80 samples/sec Loss 2.6895 Epoch: 16 Global Step: 84350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:13:24,686-Speed 4407.99 samples/sec Loss 2.7187 Epoch: 
16 Global Step: 84400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:13:36,238-Speed 4432.01 samples/sec Loss 2.7090 Epoch: 16 Global Step: 84450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:13:47,750-Speed 4447.82 samples/sec Loss 2.7195 Epoch: 16 Global Step: 84500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:13:59,248-Speed 4453.38 samples/sec Loss 2.7125 Epoch: 16 Global Step: 84550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:14:10,812-Speed 4427.40 samples/sec Loss 2.6854 Epoch: 16 Global Step: 84600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:14:22,546-Speed 4363.58 samples/sec Loss 2.7252 Epoch: 16 Global Step: 84650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:14:37,155-Speed 3504.90 samples/sec Loss 2.6334 Epoch: 17 Global Step: 84700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:14:49,892-Speed 4019.95 samples/sec Loss 2.3428 Epoch: 17 Global Step: 84750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:15:01,487-Speed 4415.92 samples/sec Loss 2.2830 Epoch: 17 Global Step: 84800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:15:13,569-Speed 4237.84 samples/sec Loss 2.3089 Epoch: 17 Global Step: 84850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:15:25,175-Speed 4411.92 samples/sec Loss 2.2923 Epoch: 17 Global Step: 84900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:15:36,850-Speed 4385.56 samples/sec Loss 2.3032 Epoch: 17 Global Step: 84950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:15:48,702-Speed 4320.26 samples/sec Loss 2.3230 Epoch: 17 Global Step: 85000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:16:00,625-Speed 4294.39 samples/sec Loss 2.3113 Epoch: 17 Global Step: 85050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:16:12,267-Speed 4397.86 samples/sec Loss 2.3318 Epoch: 17 Global 
Step: 85100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:16:23,983-Speed 4370.16 samples/sec Loss 2.3228 Epoch: 17 Global Step: 85150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:16:35,777-Speed 4341.51 samples/sec Loss 2.3410 Epoch: 17 Global Step: 85200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:16:47,494-Speed 4369.90 samples/sec Loss 2.3162 Epoch: 17 Global Step: 85250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:16:59,232-Speed 4361.99 samples/sec Loss 2.2869 Epoch: 17 Global Step: 85300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:17:10,678-Speed 4473.42 samples/sec Loss 2.3045 Epoch: 17 Global Step: 85350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:17:22,234-Speed 4431.10 samples/sec Loss 2.3258 Epoch: 17 Global Step: 85400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:17:33,746-Speed 4447.77 samples/sec Loss 2.3227 Epoch: 17 Global Step: 85450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:17:45,804-Speed 4246.18 samples/sec Loss 2.3169 Epoch: 17 Global Step: 85500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:17:57,328-Speed 4443.28 samples/sec Loss 2.3054 Epoch: 17 Global Step: 85550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:18:08,771-Speed 4474.25 samples/sec Loss 2.2988 Epoch: 17 Global Step: 85600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:18:20,290-Speed 4444.94 samples/sec Loss 2.3233 Epoch: 17 Global Step: 85650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:18:32,047-Speed 4355.29 samples/sec Loss 2.3165 Epoch: 17 Global Step: 85700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:18:43,595-Speed 4433.55 samples/sec Loss 2.3529 Epoch: 17 Global Step: 85750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:18:55,131-Speed 4438.54 samples/sec Loss 2.3580 Epoch: 17 Global Step: 85800 
Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:19:06,663-Speed 4439.96 samples/sec Loss 2.2947 Epoch: 17 Global Step: 85850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:19:18,427-Speed 4352.57 samples/sec Loss 2.3085 Epoch: 17 Global Step: 85900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:19:30,027-Speed 4413.87 samples/sec Loss 2.3416 Epoch: 17 Global Step: 85950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:19:41,940-Speed 4298.32 samples/sec Loss 2.3306 Epoch: 17 Global Step: 86000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:20:12,619-[lfw][86000]XNorm: 22.552899 Training: 2021-03-15 04:20:12,619-[lfw][86000]Accuracy-Flip: 0.99800+-0.00208 Training: 2021-03-15 04:20:12,619-[lfw][86000]Accuracy-Highest: 0.99800 Training: 2021-03-15 04:20:48,094-[cfp_fp][86000]XNorm: 20.704837 Training: 2021-03-15 04:20:48,095-[cfp_fp][86000]Accuracy-Flip: 0.98800+-0.00280 Training: 2021-03-15 04:20:48,095-[cfp_fp][86000]Accuracy-Highest: 0.98800 Training: 2021-03-15 04:21:18,781-[agedb_30][86000]XNorm: 22.453203 Training: 2021-03-15 04:21:18,781-[agedb_30][86000]Accuracy-Flip: 0.98283+-0.00683 Training: 2021-03-15 04:21:18,782-[agedb_30][86000]Accuracy-Highest: 0.98283 Training: 2021-03-15 04:21:30,600-Speed 471.19 samples/sec Loss 2.3244 Epoch: 17 Global Step: 86050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:21:42,095-Speed 4454.46 samples/sec Loss 2.3186 Epoch: 17 Global Step: 86100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:21:54,142-Speed 4250.03 samples/sec Loss 2.3185 Epoch: 17 Global Step: 86150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:22:05,776-Speed 4401.29 samples/sec Loss 2.2752 Epoch: 17 Global Step: 86200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:22:17,300-Speed 4443.18 samples/sec Loss 2.3274 Epoch: 17 Global Step: 86250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 
2021-03-15 04:22:29,108-Speed 4336.25 samples/sec Loss 2.3544 Epoch: 17 Global Step: 86300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:22:40,657-Speed 4433.41 samples/sec Loss 2.3354 Epoch: 17 Global Step: 86350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:22:52,491-Speed 4326.53 samples/sec Loss 2.3498 Epoch: 17 Global Step: 86400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:23:04,095-Speed 4412.57 samples/sec Loss 2.3518 Epoch: 17 Global Step: 86450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:23:16,055-Speed 4280.91 samples/sec Loss 2.2930 Epoch: 17 Global Step: 86500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:23:27,915-Speed 4317.49 samples/sec Loss 2.3278 Epoch: 17 Global Step: 86550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:23:39,527-Speed 4409.40 samples/sec Loss 2.3211 Epoch: 17 Global Step: 86600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:23:50,992-Speed 4465.71 samples/sec Loss 2.3634 Epoch: 17 Global Step: 86650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:24:02,684-Speed 4379.30 samples/sec Loss 2.3194 Epoch: 17 Global Step: 86700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:24:14,183-Speed 4452.93 samples/sec Loss 2.3347 Epoch: 17 Global Step: 86750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:24:25,766-Speed 4420.28 samples/sec Loss 2.3226 Epoch: 17 Global Step: 86800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:24:37,360-Speed 4416.48 samples/sec Loss 2.3060 Epoch: 17 Global Step: 86850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:24:49,407-Speed 4250.31 samples/sec Loss 2.3271 Epoch: 17 Global Step: 86900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:25:00,991-Speed 4419.98 samples/sec Loss 2.3503 Epoch: 17 Global Step: 86950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 
04:25:12,890-Speed 4302.89 samples/sec Loss 2.3294 Epoch: 17 Global Step: 87000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:25:24,564-Speed 4386.25 samples/sec Loss 2.3283 Epoch: 17 Global Step: 87050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:25:36,146-Speed 4420.65 samples/sec Loss 2.3284 Epoch: 17 Global Step: 87100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:25:47,746-Speed 4413.95 samples/sec Loss 2.3186 Epoch: 17 Global Step: 87150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:25:59,277-Speed 4440.47 samples/sec Loss 2.3180 Epoch: 17 Global Step: 87200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:26:11,038-Speed 4353.38 samples/sec Loss 2.3185 Epoch: 17 Global Step: 87250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:26:22,610-Speed 4424.73 samples/sec Loss 2.3105 Epoch: 17 Global Step: 87300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:26:34,252-Speed 4398.22 samples/sec Loss 2.3256 Epoch: 17 Global Step: 87350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:26:45,826-Speed 4423.73 samples/sec Loss 2.3222 Epoch: 17 Global Step: 87400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:26:57,664-Speed 4325.30 samples/sec Loss 2.3277 Epoch: 17 Global Step: 87450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:27:09,336-Speed 4386.71 samples/sec Loss 2.3255 Epoch: 17 Global Step: 87500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:27:21,311-Speed 4275.76 samples/sec Loss 2.3327 Epoch: 17 Global Step: 87550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:27:32,976-Speed 4389.35 samples/sec Loss 2.3256 Epoch: 17 Global Step: 87600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:27:44,794-Speed 4332.51 samples/sec Loss 2.3365 Epoch: 17 Global Step: 87650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 
04:27:56,332-Speed 4437.70 samples/sec Loss 2.3249 Epoch: 17 Global Step: 87700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:28:07,817-Speed 4458.43 samples/sec Loss 2.3346 Epoch: 17 Global Step: 87750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:28:19,367-Speed 4433.08 samples/sec Loss 2.3098 Epoch: 17 Global Step: 87800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:28:30,944-Speed 4422.68 samples/sec Loss 2.3372 Epoch: 17 Global Step: 87850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:28:42,568-Speed 4404.77 samples/sec Loss 2.3479 Epoch: 17 Global Step: 87900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:28:54,526-Speed 4282.03 samples/sec Loss 2.3112 Epoch: 17 Global Step: 87950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:29:06,314-Speed 4343.32 samples/sec Loss 2.3267 Epoch: 17 Global Step: 88000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:29:36,906-[lfw][88000]XNorm: 22.696223 Training: 2021-03-15 04:29:36,906-[lfw][88000]Accuracy-Flip: 0.99783+-0.00183 Training: 2021-03-15 04:29:36,906-[lfw][88000]Accuracy-Highest: 0.99800 Training: 2021-03-15 04:30:12,419-[cfp_fp][88000]XNorm: 20.898905 Training: 2021-03-15 04:30:12,420-[cfp_fp][88000]Accuracy-Flip: 0.98871+-0.00375 Training: 2021-03-15 04:30:12,420-[cfp_fp][88000]Accuracy-Highest: 0.98871 Training: 2021-03-15 04:30:43,037-[agedb_30][88000]XNorm: 22.608277 Training: 2021-03-15 04:30:43,038-[agedb_30][88000]Accuracy-Flip: 0.98200+-0.00823 Training: 2021-03-15 04:30:43,038-[agedb_30][88000]Accuracy-Highest: 0.98283 Training: 2021-03-15 04:30:54,785-Speed 472.02 samples/sec Loss 2.2555 Epoch: 17 Global Step: 88050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:31:06,474-Speed 4380.17 samples/sec Loss 2.3126 Epoch: 17 Global Step: 88100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:31:18,058-Speed 4420.08 samples/sec Loss 2.3350 Epoch: 
17 Global Step: 88150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:31:29,722-Speed 4389.73 samples/sec Loss 2.3346 Epoch: 17 Global Step: 88200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:31:41,866-Speed 4216.31 samples/sec Loss 2.3213 Epoch: 17 Global Step: 88250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:31:53,346-Speed 4460.27 samples/sec Loss 2.3160 Epoch: 17 Global Step: 88300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:32:05,036-Speed 4380.01 samples/sec Loss 2.3342 Epoch: 17 Global Step: 88350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:32:16,640-Speed 4412.24 samples/sec Loss 2.3292 Epoch: 17 Global Step: 88400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:32:28,250-Speed 4410.17 samples/sec Loss 2.3300 Epoch: 17 Global Step: 88450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:32:40,049-Speed 4339.48 samples/sec Loss 2.2956 Epoch: 17 Global Step: 88500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:32:51,908-Speed 4317.80 samples/sec Loss 2.3497 Epoch: 17 Global Step: 88550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:33:03,464-Speed 4430.54 samples/sec Loss 2.3034 Epoch: 17 Global Step: 88600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:33:15,244-Speed 4346.61 samples/sec Loss 2.3130 Epoch: 17 Global Step: 88650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:33:27,046-Speed 4338.41 samples/sec Loss 2.2924 Epoch: 17 Global Step: 88700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:33:38,530-Speed 4458.68 samples/sec Loss 2.3196 Epoch: 17 Global Step: 88750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:33:50,132-Speed 4413.14 samples/sec Loss 2.2746 Epoch: 17 Global Step: 88800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:34:01,736-Speed 4412.38 samples/sec Loss 2.3605 Epoch: 17 Global 
Step: 88850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:34:13,610-Speed 4312.41 samples/sec Loss 2.3251 Epoch: 17 Global Step: 88900 Fp16 Grad Scale: 8192 Required: 3 hours Training: 2021-03-15 04:34:25,231-Speed 4405.68 samples/sec Loss 2.3182 Epoch: 17 Global Step: 88950 Fp16 Grad Scale: 8192 Required: 3 hours Training: 2021-03-15 04:34:36,918-Speed 4381.43 samples/sec Loss 2.3349 Epoch: 17 Global Step: 89000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:34:48,553-Speed 4400.65 samples/sec Loss 2.3134 Epoch: 17 Global Step: 89050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:35:00,215-Speed 4390.42 samples/sec Loss 2.3460 Epoch: 17 Global Step: 89100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:35:11,637-Speed 4482.76 samples/sec Loss 2.2650 Epoch: 17 Global Step: 89150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:35:23,218-Speed 4421.04 samples/sec Loss 2.2978 Epoch: 17 Global Step: 89200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:35:34,734-Speed 4446.33 samples/sec Loss 2.2861 Epoch: 17 Global Step: 89250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:35:46,438-Speed 4374.93 samples/sec Loss 2.2901 Epoch: 17 Global Step: 89300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:35:57,980-Speed 4435.92 samples/sec Loss 2.3190 Epoch: 17 Global Step: 89350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:36:09,759-Speed 4346.86 samples/sec Loss 2.2896 Epoch: 17 Global Step: 89400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:36:21,542-Speed 4345.37 samples/sec Loss 2.3038 Epoch: 17 Global Step: 89450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:36:33,194-Speed 4394.61 samples/sec Loss 2.2906 Epoch: 17 Global Step: 89500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:36:44,592-Speed 4492.10 samples/sec Loss 2.3227 Epoch: 17 Global Step: 89550 
Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:36:56,151-Speed 4429.62 samples/sec Loss 2.2947 Epoch: 17 Global Step: 89600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:37:07,587-Speed 4477.30 samples/sec Loss 2.3051 Epoch: 17 Global Step: 89650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:37:22,919-Speed 3339.53 samples/sec Loss 2.1417 Epoch: 18 Global Step: 89700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:37:35,034-Speed 4226.45 samples/sec Loss 1.9099 Epoch: 18 Global Step: 89750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:37:46,570-Speed 4438.34 samples/sec Loss 1.9494 Epoch: 18 Global Step: 89800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:37:58,258-Speed 4380.74 samples/sec Loss 1.9516 Epoch: 18 Global Step: 89850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:38:09,804-Speed 4434.71 samples/sec Loss 1.9606 Epoch: 18 Global Step: 89900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:38:21,417-Speed 4409.17 samples/sec Loss 1.9785 Epoch: 18 Global Step: 89950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:38:33,681-Speed 4174.98 samples/sec Loss 1.9526 Epoch: 18 Global Step: 90000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:39:04,519-[lfw][90000]XNorm: 23.361170 Training: 2021-03-15 04:39:04,520-[lfw][90000]Accuracy-Flip: 0.99800+-0.00194 Training: 2021-03-15 04:39:04,520-[lfw][90000]Accuracy-Highest: 0.99800 Training: 2021-03-15 04:39:40,196-[cfp_fp][90000]XNorm: 21.408432 Training: 2021-03-15 04:39:40,196-[cfp_fp][90000]Accuracy-Flip: 0.98871+-0.00282 Training: 2021-03-15 04:39:40,196-[cfp_fp][90000]Accuracy-Highest: 0.98871 Training: 2021-03-15 04:40:10,944-[agedb_30][90000]XNorm: 23.264217 Training: 2021-03-15 04:40:10,945-[agedb_30][90000]Accuracy-Flip: 0.98117+-0.00775 Training: 2021-03-15 04:40:10,945-[agedb_30][90000]Accuracy-Highest: 0.98283 Training: 
2021-03-15 04:40:22,763-Speed 469.37 samples/sec Loss 1.9911 Epoch: 18 Global Step: 90050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:40:34,460-Speed 4377.49 samples/sec Loss 1.9628 Epoch: 18 Global Step: 90100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:40:46,119-Speed 4391.77 samples/sec Loss 1.9575 Epoch: 18 Global Step: 90150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:40:57,608-Speed 4456.38 samples/sec Loss 1.9624 Epoch: 18 Global Step: 90200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:41:09,152-Speed 4435.58 samples/sec Loss 1.9773 Epoch: 18 Global Step: 90250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:41:20,631-Speed 4460.54 samples/sec Loss 1.9939 Epoch: 18 Global Step: 90300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:41:32,215-Speed 4419.95 samples/sec Loss 1.9895 Epoch: 18 Global Step: 90350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:41:43,961-Speed 4359.26 samples/sec Loss 1.9575 Epoch: 18 Global Step: 90400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:41:55,511-Speed 4432.91 samples/sec Loss 2.0547 Epoch: 18 Global Step: 90450 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:42:07,533-Speed 4259.16 samples/sec Loss 2.0029 Epoch: 18 Global Step: 90500 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:42:19,223-Speed 4379.92 samples/sec Loss 1.9998 Epoch: 18 Global Step: 90550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:42:30,789-Speed 4427.01 samples/sec Loss 1.9885 Epoch: 18 Global Step: 90600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:42:42,351-Speed 4428.52 samples/sec Loss 1.9972 Epoch: 18 Global Step: 90650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:42:54,124-Speed 4349.29 samples/sec Loss 1.9782 Epoch: 18 Global Step: 90700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 
04:43:05,774-Speed 4394.78 samples/sec Loss 1.9938 Epoch: 18 Global Step: 90750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:43:17,485-Speed 4372.35 samples/sec Loss 2.0097 Epoch: 18 Global Step: 90800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:43:29,104-Speed 4406.69 samples/sec Loss 2.0080 Epoch: 18 Global Step: 90850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:43:40,821-Speed 4369.94 samples/sec Loss 2.0240 Epoch: 18 Global Step: 90900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:43:52,376-Speed 4431.19 samples/sec Loss 2.0310 Epoch: 18 Global Step: 90950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:44:04,200-Speed 4330.08 samples/sec Loss 2.0178 Epoch: 18 Global Step: 91000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:44:16,235-Speed 4254.65 samples/sec Loss 2.0254 Epoch: 18 Global Step: 91050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:44:27,962-Speed 4366.08 samples/sec Loss 2.0431 Epoch: 18 Global Step: 91100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:44:39,621-Speed 4391.71 samples/sec Loss 2.0159 Epoch: 18 Global Step: 91150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:44:51,093-Speed 4463.18 samples/sec Loss 2.0544 Epoch: 18 Global Step: 91200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:45:02,558-Speed 4465.89 samples/sec Loss 2.0266 Epoch: 18 Global Step: 91250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:45:14,382-Speed 4330.37 samples/sec Loss 2.0494 Epoch: 18 Global Step: 91300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:45:26,098-Speed 4370.14 samples/sec Loss 2.0710 Epoch: 18 Global Step: 91350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:45:37,941-Speed 4323.60 samples/sec Loss 2.0041 Epoch: 18 Global Step: 91400 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 
04:45:49,890-Speed 4285.14 samples/sec Loss 1.9922 Epoch: 18 Global Step: 91450 Fp16 Grad Scale: 8192 Required: 3 hours Training: 2021-03-15 04:46:01,492-Speed 4412.98 samples/sec Loss 2.0269 Epoch: 18 Global Step: 91500 Fp16 Grad Scale: 8192 Required: 3 hours Training: 2021-03-15 04:46:13,126-Speed 4401.29 samples/sec Loss 2.0390 Epoch: 18 Global Step: 91550 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:46:24,590-Speed 4466.19 samples/sec Loss 2.0532 Epoch: 18 Global Step: 91600 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:46:36,289-Speed 4376.53 samples/sec Loss 2.0064 Epoch: 18 Global Step: 91650 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:46:47,813-Speed 4443.14 samples/sec Loss 2.0454 Epoch: 18 Global Step: 91700 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:46:59,213-Speed 4491.54 samples/sec Loss 2.0387 Epoch: 18 Global Step: 91750 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:47:10,787-Speed 4423.91 samples/sec Loss 2.0264 Epoch: 18 Global Step: 91800 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:47:22,511-Speed 4367.04 samples/sec Loss 2.0481 Epoch: 18 Global Step: 91850 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:47:34,066-Speed 4431.14 samples/sec Loss 2.0758 Epoch: 18 Global Step: 91900 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:47:45,879-Speed 4334.64 samples/sec Loss 2.0675 Epoch: 18 Global Step: 91950 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:47:57,670-Speed 4342.23 samples/sec Loss 2.0229 Epoch: 18 Global Step: 92000 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:48:28,752-[lfw][92000]XNorm: 22.524783 Training: 2021-03-15 04:48:28,753-[lfw][92000]Accuracy-Flip: 0.99783+-0.00183 Training: 2021-03-15 04:48:28,753-[lfw][92000]Accuracy-Highest: 0.99800 Training: 2021-03-15 04:49:04,544-[cfp_fp][92000]XNorm: 20.842414 Training: 2021-03-15 
04:49:04,544-[cfp_fp][92000]Accuracy-Flip: 0.98843+-0.00296 Training: 2021-03-15 04:49:04,544-[cfp_fp][92000]Accuracy-Highest: 0.98871 Training: 2021-03-15 04:49:35,438-[agedb_30][92000]XNorm: 22.425502 Training: 2021-03-15 04:49:35,439-[agedb_30][92000]Accuracy-Flip: 0.98167+-0.00738 Training: 2021-03-15 04:49:35,439-[agedb_30][92000]Accuracy-Highest: 0.98283 Training: 2021-03-15 04:49:46,936-Speed 468.59 samples/sec Loss 2.0360 Epoch: 18 Global Step: 92050 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:49:58,600-Speed 4389.54 samples/sec Loss 2.0588 Epoch: 18 Global Step: 92100 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:50:09,997-Speed 4492.90 samples/sec Loss 2.0287 Epoch: 18 Global Step: 92150 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:50:21,559-Speed 4428.28 samples/sec Loss 2.0290 Epoch: 18 Global Step: 92200 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:50:33,405-Speed 4322.15 samples/sec Loss 2.0666 Epoch: 18 Global Step: 92250 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:50:45,136-Speed 4364.89 samples/sec Loss 2.0683 Epoch: 18 Global Step: 92300 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:50:57,130-Speed 4269.09 samples/sec Loss 2.0485 Epoch: 18 Global Step: 92350 Fp16 Grad Scale: 16384 Required: 3 hours Training: 2021-03-15 04:51:08,776-Speed 4396.35 samples/sec Loss 2.0647 Epoch: 18 Global Step: 92400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:51:20,543-Speed 4351.21 samples/sec Loss 2.0975 Epoch: 18 Global Step: 92450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:51:32,435-Speed 4305.85 samples/sec Loss 2.0864 Epoch: 18 Global Step: 92500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:51:44,056-Speed 4405.70 samples/sec Loss 2.0391 Epoch: 18 Global Step: 92550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:51:55,599-Speed 4435.89 samples/sec Loss 2.0697 
Epoch: 18 Global Step: 92600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:52:07,415-Speed 4333.17 samples/sec Loss 2.0789 Epoch: 18 Global Step: 92650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:52:19,094-Speed 4384.38 samples/sec Loss 2.0694 Epoch: 18 Global Step: 92700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:52:31,154-Speed 4245.42 samples/sec Loss 2.1048 Epoch: 18 Global Step: 92750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:52:42,641-Speed 4457.47 samples/sec Loss 2.0952 Epoch: 18 Global Step: 92800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:52:54,745-Speed 4230.16 samples/sec Loss 2.0706 Epoch: 18 Global Step: 92850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:53:06,229-Speed 4458.80 samples/sec Loss 2.0731 Epoch: 18 Global Step: 92900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:53:17,838-Speed 4410.49 samples/sec Loss 2.0482 Epoch: 18 Global Step: 92950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:53:29,280-Speed 4475.07 samples/sec Loss 2.0677 Epoch: 18 Global Step: 93000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:53:41,099-Speed 4332.24 samples/sec Loss 2.0646 Epoch: 18 Global Step: 93050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:53:52,680-Speed 4421.17 samples/sec Loss 2.1200 Epoch: 18 Global Step: 93100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:54:04,096-Speed 4484.97 samples/sec Loss 2.0640 Epoch: 18 Global Step: 93150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:54:15,618-Speed 4444.15 samples/sec Loss 2.1128 Epoch: 18 Global Step: 93200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:54:27,110-Speed 4455.16 samples/sec Loss 2.1024 Epoch: 18 Global Step: 93250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:54:38,693-Speed 4420.59 samples/sec Loss 2.0893 Epoch: 18 
Global Step: 93300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:54:50,254-Speed 4428.94 samples/sec Loss 2.0852 Epoch: 18 Global Step: 93350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:55:02,054-Speed 4339.10 samples/sec Loss 2.0988 Epoch: 18 Global Step: 93400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:55:13,788-Speed 4363.52 samples/sec Loss 2.0855 Epoch: 18 Global Step: 93450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:55:25,714-Speed 4293.38 samples/sec Loss 2.0988 Epoch: 18 Global Step: 93500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:55:37,381-Speed 4388.66 samples/sec Loss 2.0867 Epoch: 18 Global Step: 93550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:55:49,004-Speed 4405.29 samples/sec Loss 2.0754 Epoch: 18 Global Step: 93600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:56:00,756-Speed 4356.76 samples/sec Loss 2.1079 Epoch: 18 Global Step: 93650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:56:12,263-Speed 4449.87 samples/sec Loss 2.1216 Epoch: 18 Global Step: 93700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:56:23,841-Speed 4422.09 samples/sec Loss 2.0948 Epoch: 18 Global Step: 93750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:56:35,552-Speed 4372.39 samples/sec Loss 2.0802 Epoch: 18 Global Step: 93800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:56:47,111-Speed 4429.61 samples/sec Loss 2.1470 Epoch: 18 Global Step: 93850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:56:58,598-Speed 4457.16 samples/sec Loss 2.1208 Epoch: 18 Global Step: 93900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:57:10,117-Speed 4445.20 samples/sec Loss 2.0848 Epoch: 18 Global Step: 93950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:57:21,885-Speed 4350.94 samples/sec Loss 2.1190 Epoch: 18 Global 
Step: 94000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:57:52,491-[lfw][94000]XNorm: 22.682313 Training: 2021-03-15 04:57:52,492-[lfw][94000]Accuracy-Flip: 0.99800+-0.00194 Training: 2021-03-15 04:57:52,492-[lfw][94000]Accuracy-Highest: 0.99800 Training: 2021-03-15 04:58:28,193-[cfp_fp][94000]XNorm: 20.962662 Training: 2021-03-15 04:58:28,193-[cfp_fp][94000]Accuracy-Flip: 0.98914+-0.00308 Training: 2021-03-15 04:58:28,193-[cfp_fp][94000]Accuracy-Highest: 0.98914 Training: 2021-03-15 04:58:58,956-[agedb_30][94000]XNorm: 22.682149 Training: 2021-03-15 04:58:58,956-[agedb_30][94000]Accuracy-Flip: 0.98133+-0.00781 Training: 2021-03-15 04:58:58,956-[agedb_30][94000]Accuracy-Highest: 0.98283 Training: 2021-03-15 04:59:10,821-Speed 470.00 samples/sec Loss 2.1096 Epoch: 18 Global Step: 94050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:59:22,500-Speed 4384.16 samples/sec Loss 2.0825 Epoch: 18 Global Step: 94100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:59:34,460-Speed 4280.94 samples/sec Loss 2.1485 Epoch: 18 Global Step: 94150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:59:46,277-Speed 4332.94 samples/sec Loss 2.0770 Epoch: 18 Global Step: 94200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 04:59:57,942-Speed 4389.39 samples/sec Loss 2.1122 Epoch: 18 Global Step: 94250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:00:09,722-Speed 4346.66 samples/sec Loss 2.1286 Epoch: 18 Global Step: 94300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:00:21,209-Speed 4457.29 samples/sec Loss 2.0847 Epoch: 18 Global Step: 94350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:00:32,836-Speed 4403.58 samples/sec Loss 2.0538 Epoch: 18 Global Step: 94400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:00:44,462-Speed 4404.20 samples/sec Loss 2.1243 Epoch: 18 Global Step: 94450 Fp16 Grad Scale: 16384 Required: 2 hours 
Training: 2021-03-15 05:00:56,506-Speed 4251.28 samples/sec Loss 2.1319 Epoch: 18 Global Step: 94500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:01:08,360-Speed 4319.44 samples/sec Loss 2.1165 Epoch: 18 Global Step: 94550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:01:19,774-Speed 4486.07 samples/sec Loss 2.1038 Epoch: 18 Global Step: 94600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:01:31,335-Speed 4428.60 samples/sec Loss 2.1359 Epoch: 18 Global Step: 94650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:01:47,094-Speed 3249.15 samples/sec Loss 1.7884 Epoch: 19 Global Step: 94700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:01:58,958-Speed 4315.77 samples/sec Loss 1.8184 Epoch: 19 Global Step: 94750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:02:11,041-Speed 4237.44 samples/sec Loss 1.7539 Epoch: 19 Global Step: 94800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:02:22,663-Speed 4405.52 samples/sec Loss 1.7647 Epoch: 19 Global Step: 94850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:02:34,322-Speed 4391.77 samples/sec Loss 1.7915 Epoch: 19 Global Step: 94900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:02:45,894-Speed 4424.90 samples/sec Loss 1.7831 Epoch: 19 Global Step: 94950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:02:57,775-Speed 4309.52 samples/sec Loss 1.7567 Epoch: 19 Global Step: 95000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:03:09,390-Speed 4408.17 samples/sec Loss 1.7969 Epoch: 19 Global Step: 95050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:03:20,997-Speed 4411.44 samples/sec Loss 1.7670 Epoch: 19 Global Step: 95100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:03:32,560-Speed 4428.19 samples/sec Loss 1.7720 Epoch: 19 Global Step: 95150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 
2021-03-15 05:03:44,244-Speed 4382.05 samples/sec Loss 1.7755 Epoch: 19 Global Step: 95200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:03:55,887-Speed 4397.95 samples/sec Loss 1.8282 Epoch: 19 Global Step: 95250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:04:07,493-Speed 4411.50 samples/sec Loss 1.8080 Epoch: 19 Global Step: 95300 Fp16 Grad Scale: 8192 Required: 2 hours Training: 2021-03-15 05:04:18,886-Speed 4494.16 samples/sec Loss 1.7893 Epoch: 19 Global Step: 95350 Fp16 Grad Scale: 8192 Required: 2 hours Training: 2021-03-15 05:04:30,740-Speed 4319.33 samples/sec Loss 1.8003 Epoch: 19 Global Step: 95400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:04:42,603-Speed 4316.08 samples/sec Loss 1.8228 Epoch: 19 Global Step: 95450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:04:54,732-Speed 4221.55 samples/sec Loss 1.8356 Epoch: 19 Global Step: 95500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:05:06,278-Speed 4434.57 samples/sec Loss 1.8000 Epoch: 19 Global Step: 95550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:05:18,181-Speed 4301.72 samples/sec Loss 1.8042 Epoch: 19 Global Step: 95600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:05:29,682-Speed 4451.71 samples/sec Loss 1.8272 Epoch: 19 Global Step: 95650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:05:41,255-Speed 4424.38 samples/sec Loss 1.8041 Epoch: 19 Global Step: 95700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:05:52,826-Speed 4425.18 samples/sec Loss 1.8738 Epoch: 19 Global Step: 95750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:06:04,712-Speed 4307.60 samples/sec Loss 1.7771 Epoch: 19 Global Step: 95800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:06:16,229-Speed 4445.99 samples/sec Loss 1.8759 Epoch: 19 Global Step: 95850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 
05:06:28,383-Speed 4212.77 samples/sec Loss 1.8189 Epoch: 19 Global Step: 95900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:06:39,974-Speed 4417.40 samples/sec Loss 1.8671 Epoch: 19 Global Step: 95950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:06:51,476-Speed 4451.40 samples/sec Loss 1.8235 Epoch: 19 Global Step: 96000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:07:22,346-[lfw][96000]XNorm: 22.163733 Training: 2021-03-15 05:07:22,346-[lfw][96000]Accuracy-Flip: 0.99783+-0.00183 Training: 2021-03-15 05:07:22,346-[lfw][96000]Accuracy-Highest: 0.99800 Training: 2021-03-15 05:07:57,975-[cfp_fp][96000]XNorm: 20.606782 Training: 2021-03-15 05:07:57,975-[cfp_fp][96000]Accuracy-Flip: 0.98729+-0.00353 Training: 2021-03-15 05:07:57,975-[cfp_fp][96000]Accuracy-Highest: 0.98914 Training: 2021-03-15 05:08:28,642-[agedb_30][96000]XNorm: 22.291953 Training: 2021-03-15 05:08:28,643-[agedb_30][96000]Accuracy-Flip: 0.98150+-0.00721 Training: 2021-03-15 05:08:28,643-[agedb_30][96000]Accuracy-Highest: 0.98283 Training: 2021-03-15 05:08:40,492-Speed 469.66 samples/sec Loss 1.8456 Epoch: 19 Global Step: 96050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:08:51,949-Speed 4469.08 samples/sec Loss 1.8579 Epoch: 19 Global Step: 96100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:09:03,593-Speed 4396.99 samples/sec Loss 1.8952 Epoch: 19 Global Step: 96150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:09:15,462-Speed 4314.00 samples/sec Loss 1.8631 Epoch: 19 Global Step: 96200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:09:27,154-Speed 4379.15 samples/sec Loss 1.8301 Epoch: 19 Global Step: 96250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:09:38,782-Speed 4403.62 samples/sec Loss 1.8621 Epoch: 19 Global Step: 96300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:09:50,523-Speed 4360.88 samples/sec Loss 1.8994 Epoch: 
19 Global Step: 96350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:10:02,057-Speed 4439.29 samples/sec Loss 1.8275 Epoch: 19 Global Step: 96400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:10:13,522-Speed 4465.78 samples/sec Loss 1.8697 Epoch: 19 Global Step: 96450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:10:25,414-Speed 4305.79 samples/sec Loss 1.8902 Epoch: 19 Global Step: 96500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:10:37,051-Speed 4399.77 samples/sec Loss 1.8572 Epoch: 19 Global Step: 96550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:10:48,733-Speed 4382.85 samples/sec Loss 1.8686 Epoch: 19 Global Step: 96600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:11:00,455-Speed 4368.28 samples/sec Loss 1.8419 Epoch: 19 Global Step: 96650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:11:12,130-Speed 4385.49 samples/sec Loss 1.8690 Epoch: 19 Global Step: 96700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:11:23,663-Speed 4439.50 samples/sec Loss 1.9155 Epoch: 19 Global Step: 96750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:11:35,165-Speed 4451.92 samples/sec Loss 1.8843 Epoch: 19 Global Step: 96800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:11:46,948-Speed 4345.38 samples/sec Loss 1.8559 Epoch: 19 Global Step: 96850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:11:58,647-Speed 4376.38 samples/sec Loss 1.9011 Epoch: 19 Global Step: 96900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:12:10,323-Speed 4385.39 samples/sec Loss 1.8985 Epoch: 19 Global Step: 96950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:12:22,076-Speed 4356.50 samples/sec Loss 1.8900 Epoch: 19 Global Step: 97000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:12:33,875-Speed 4339.51 samples/sec Loss 1.8946 Epoch: 19 Global 
Step: 97050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:12:45,469-Speed 4416.32 samples/sec Loss 1.8985 Epoch: 19 Global Step: 97100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:12:57,169-Speed 4376.01 samples/sec Loss 1.9087 Epoch: 19 Global Step: 97150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:13:08,968-Speed 4339.80 samples/sec Loss 1.9027 Epoch: 19 Global Step: 97200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:13:20,545-Speed 4422.56 samples/sec Loss 1.9026 Epoch: 19 Global Step: 97250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:13:32,024-Speed 4460.71 samples/sec Loss 1.8987 Epoch: 19 Global Step: 97300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:13:43,781-Speed 4355.05 samples/sec Loss 1.9125 Epoch: 19 Global Step: 97350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:13:55,272-Speed 4455.54 samples/sec Loss 1.8946 Epoch: 19 Global Step: 97400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:14:07,187-Speed 4297.57 samples/sec Loss 1.9151 Epoch: 19 Global Step: 97450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:14:18,952-Speed 4352.01 samples/sec Loss 1.9556 Epoch: 19 Global Step: 97500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:14:30,753-Speed 4338.78 samples/sec Loss 1.9094 Epoch: 19 Global Step: 97550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:14:42,256-Speed 4451.03 samples/sec Loss 1.9075 Epoch: 19 Global Step: 97600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:14:54,079-Speed 4330.85 samples/sec Loss 1.9233 Epoch: 19 Global Step: 97650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:15:05,837-Speed 4354.62 samples/sec Loss 1.9585 Epoch: 19 Global Step: 97700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:15:17,580-Speed 4360.17 samples/sec Loss 1.9252 Epoch: 19 Global Step: 97750 
Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:15:29,233-Speed 4393.96 samples/sec Loss 1.9259 Epoch: 19 Global Step: 97800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:15:40,873-Speed 4398.92 samples/sec Loss 1.9341 Epoch: 19 Global Step: 97850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:15:52,322-Speed 4472.06 samples/sec Loss 1.9175 Epoch: 19 Global Step: 97900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:16:04,035-Speed 4371.34 samples/sec Loss 1.9654 Epoch: 19 Global Step: 97950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:16:15,656-Speed 4406.21 samples/sec Loss 1.9393 Epoch: 19 Global Step: 98000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:16:46,533-[lfw][98000]XNorm: 22.264935 Training: 2021-03-15 05:16:46,533-[lfw][98000]Accuracy-Flip: 0.99833+-0.00183 Training: 2021-03-15 05:16:46,533-[lfw][98000]Accuracy-Highest: 0.99833 Training: 2021-03-15 05:17:22,097-[cfp_fp][98000]XNorm: 20.815576 Training: 2021-03-15 05:17:22,097-[cfp_fp][98000]Accuracy-Flip: 0.98800+-0.00287 Training: 2021-03-15 05:17:22,097-[cfp_fp][98000]Accuracy-Highest: 0.98914 Training: 2021-03-15 05:17:52,832-[agedb_30][98000]XNorm: 22.326680 Training: 2021-03-15 05:17:52,832-[agedb_30][98000]Accuracy-Flip: 0.98350+-0.00621 Training: 2021-03-15 05:17:52,832-[agedb_30][98000]Accuracy-Highest: 0.98350 Training: 2021-03-15 05:18:04,892-Speed 468.71 samples/sec Loss 1.9733 Epoch: 19 Global Step: 98050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:18:16,567-Speed 4385.31 samples/sec Loss 1.9531 Epoch: 19 Global Step: 98100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:18:28,095-Speed 4441.53 samples/sec Loss 1.9547 Epoch: 19 Global Step: 98150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:18:39,938-Speed 4323.58 samples/sec Loss 1.9349 Epoch: 19 Global Step: 98200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 
2021-03-15 05:18:51,609-Speed 4387.03 samples/sec Loss 1.9422 Epoch: 19 Global Step: 98250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:19:03,224-Speed 4408.45 samples/sec Loss 1.9412 Epoch: 19 Global Step: 98300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:19:14,909-Speed 4381.94 samples/sec Loss 1.9527 Epoch: 19 Global Step: 98350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:19:26,387-Speed 4460.85 samples/sec Loss 1.9540 Epoch: 19 Global Step: 98400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:19:38,031-Speed 4397.40 samples/sec Loss 1.9657 Epoch: 19 Global Step: 98450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:19:50,070-Speed 4253.10 samples/sec Loss 1.9359 Epoch: 19 Global Step: 98500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:20:01,581-Speed 4447.85 samples/sec Loss 1.9618 Epoch: 19 Global Step: 98550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:20:13,113-Speed 4440.07 samples/sec Loss 1.9586 Epoch: 19 Global Step: 98600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:20:24,934-Speed 4331.35 samples/sec Loss 1.9649 Epoch: 19 Global Step: 98650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:20:36,498-Speed 4427.77 samples/sec Loss 1.9745 Epoch: 19 Global Step: 98700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:20:48,202-Speed 4374.70 samples/sec Loss 1.9783 Epoch: 19 Global Step: 98750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:20:59,788-Speed 4419.51 samples/sec Loss 1.9793 Epoch: 19 Global Step: 98800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:21:11,213-Speed 4481.49 samples/sec Loss 1.9717 Epoch: 19 Global Step: 98850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:21:22,758-Speed 4435.19 samples/sec Loss 1.9725 Epoch: 19 Global Step: 98900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 
05:21:34,326-Speed 4426.25 samples/sec Loss 1.9734 Epoch: 19 Global Step: 98950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:21:46,368-Speed 4251.85 samples/sec Loss 1.9932 Epoch: 19 Global Step: 99000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:21:58,000-Speed 4402.03 samples/sec Loss 1.9741 Epoch: 19 Global Step: 99050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:22:09,734-Speed 4363.30 samples/sec Loss 1.9905 Epoch: 19 Global Step: 99100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:22:21,425-Speed 4379.76 samples/sec Loss 1.9797 Epoch: 19 Global Step: 99150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:22:33,136-Speed 4372.22 samples/sec Loss 1.9789 Epoch: 19 Global Step: 99200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:22:44,763-Speed 4403.77 samples/sec Loss 1.9969 Epoch: 19 Global Step: 99250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:22:56,708-Speed 4286.33 samples/sec Loss 1.9715 Epoch: 19 Global Step: 99300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:23:08,263-Speed 4431.47 samples/sec Loss 2.0347 Epoch: 19 Global Step: 99350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:23:20,041-Speed 4346.96 samples/sec Loss 2.0142 Epoch: 19 Global Step: 99400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:23:31,581-Speed 4436.94 samples/sec Loss 1.9844 Epoch: 19 Global Step: 99450 Fp16 Grad Scale: 8192 Required: 2 hours Training: 2021-03-15 05:23:43,772-Speed 4200.27 samples/sec Loss 1.9760 Epoch: 19 Global Step: 99500 Fp16 Grad Scale: 8192 Required: 2 hours Training: 2021-03-15 05:23:55,438-Speed 4388.84 samples/sec Loss 2.0013 Epoch: 19 Global Step: 99550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:24:07,070-Speed 4401.84 samples/sec Loss 1.9665 Epoch: 19 Global Step: 99600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 
05:24:22,230-Speed 3377.55 samples/sec Loss 1.8753 Epoch: 20 Global Step: 99650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:24:34,662-Speed 4118.48 samples/sec Loss 1.6044 Epoch: 20 Global Step: 99700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:24:46,277-Speed 4408.43 samples/sec Loss 1.6287 Epoch: 20 Global Step: 99750 Fp16 Grad Scale: 8192 Required: 2 hours Training: 2021-03-15 05:24:57,943-Speed 4389.01 samples/sec Loss 1.6177 Epoch: 20 Global Step: 99800 Fp16 Grad Scale: 8192 Required: 2 hours Training: 2021-03-15 05:25:09,486-Speed 4435.90 samples/sec Loss 1.6477 Epoch: 20 Global Step: 99850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:25:21,185-Speed 4376.36 samples/sec Loss 1.6298 Epoch: 20 Global Step: 99900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:25:33,179-Speed 4269.16 samples/sec Loss 1.6773 Epoch: 20 Global Step: 99950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:25:45,006-Speed 4329.02 samples/sec Loss 1.6704 Epoch: 20 Global Step: 100000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:26:15,713-[lfw][100000]XNorm: 22.746553 Training: 2021-03-15 05:26:15,713-[lfw][100000]Accuracy-Flip: 0.99800+-0.00194 Training: 2021-03-15 05:26:15,713-[lfw][100000]Accuracy-Highest: 0.99833 Training: 2021-03-15 05:26:51,929-[cfp_fp][100000]XNorm: 20.988646 Training: 2021-03-15 05:26:51,930-[cfp_fp][100000]Accuracy-Flip: 0.98814+-0.00307 Training: 2021-03-15 05:26:51,930-[cfp_fp][100000]Accuracy-Highest: 0.98914 Training: 2021-03-15 05:27:22,857-[agedb_30][100000]XNorm: 22.640059 Training: 2021-03-15 05:27:22,857-[agedb_30][100000]Accuracy-Flip: 0.98167+-0.00745 Training: 2021-03-15 05:27:22,857-[agedb_30][100000]Accuracy-Highest: 0.98350 Training: 2021-03-15 05:27:34,434-Speed 467.89 samples/sec Loss 1.6396 Epoch: 20 Global Step: 100050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:27:46,040-Speed 4411.52 samples/sec Loss 1.6618 
Epoch: 20 Global Step: 100100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:27:57,492-Speed 4471.13 samples/sec Loss 1.6524 Epoch: 20 Global Step: 100150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:28:09,247-Speed 4355.59 samples/sec Loss 1.6825 Epoch: 20 Global Step: 100200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:28:20,800-Speed 4432.13 samples/sec Loss 1.6664 Epoch: 20 Global Step: 100250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:28:32,816-Speed 4261.32 samples/sec Loss 1.7026 Epoch: 20 Global Step: 100300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:28:44,280-Speed 4466.23 samples/sec Loss 1.6951 Epoch: 20 Global Step: 100350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:28:55,866-Speed 4419.32 samples/sec Loss 1.6533 Epoch: 20 Global Step: 100400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:29:07,781-Speed 4297.00 samples/sec Loss 1.6759 Epoch: 20 Global Step: 100450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:29:19,207-Speed 4481.58 samples/sec Loss 1.7106 Epoch: 20 Global Step: 100500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:29:30,842-Speed 4400.39 samples/sec Loss 1.6890 Epoch: 20 Global Step: 100550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:29:42,593-Speed 4357.29 samples/sec Loss 1.7094 Epoch: 20 Global Step: 100600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:29:54,368-Speed 4348.61 samples/sec Loss 1.7239 Epoch: 20 Global Step: 100650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:30:06,170-Speed 4338.39 samples/sec Loss 1.6695 Epoch: 20 Global Step: 100700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:30:17,774-Speed 4412.50 samples/sec Loss 1.6810 Epoch: 20 Global Step: 100750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:30:29,475-Speed 4375.93 samples/sec Loss 
1.7286 Epoch: 20 Global Step: 100800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:30:41,079-Speed 4412.33 samples/sec Loss 1.7491 Epoch: 20 Global Step: 100850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:30:52,911-Speed 4327.49 samples/sec Loss 1.7141 Epoch: 20 Global Step: 100900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:31:04,868-Speed 4282.04 samples/sec Loss 1.7437 Epoch: 20 Global Step: 100950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:31:16,845-Speed 4275.00 samples/sec Loss 1.6903 Epoch: 20 Global Step: 101000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:31:28,289-Speed 4474.19 samples/sec Loss 1.7349 Epoch: 20 Global Step: 101050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:31:39,951-Speed 4390.59 samples/sec Loss 1.7264 Epoch: 20 Global Step: 101100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:31:51,526-Speed 4423.40 samples/sec Loss 1.7422 Epoch: 20 Global Step: 101150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:32:03,060-Speed 4439.37 samples/sec Loss 1.7832 Epoch: 20 Global Step: 101200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:32:14,503-Speed 4474.37 samples/sec Loss 1.7599 Epoch: 20 Global Step: 101250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:32:26,262-Speed 4354.52 samples/sec Loss 1.7702 Epoch: 20 Global Step: 101300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:32:38,159-Speed 4303.57 samples/sec Loss 1.7394 Epoch: 20 Global Step: 101350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:32:49,762-Speed 4413.05 samples/sec Loss 1.7568 Epoch: 20 Global Step: 101400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:33:01,482-Speed 4368.67 samples/sec Loss 1.7445 Epoch: 20 Global Step: 101450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:33:13,654-Speed 4206.53 samples/sec 
Loss 1.7500 Epoch: 20 Global Step: 101500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:33:25,172-Speed 4445.21 samples/sec Loss 1.7722 Epoch: 20 Global Step: 101550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:33:36,877-Speed 4374.49 samples/sec Loss 1.8150 Epoch: 20 Global Step: 101600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:33:48,414-Speed 4438.26 samples/sec Loss 1.7952 Epoch: 20 Global Step: 101650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:34:00,019-Speed 4411.95 samples/sec Loss 1.7709 Epoch: 20 Global Step: 101700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:34:11,764-Speed 4359.56 samples/sec Loss 1.7807 Epoch: 20 Global Step: 101750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:34:23,208-Speed 4474.22 samples/sec Loss 1.7979 Epoch: 20 Global Step: 101800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:34:34,737-Speed 4441.13 samples/sec Loss 1.7868 Epoch: 20 Global Step: 101850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:34:46,254-Speed 4445.93 samples/sec Loss 1.7803 Epoch: 20 Global Step: 101900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:34:57,730-Speed 4461.37 samples/sec Loss 1.8277 Epoch: 20 Global Step: 101950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:35:09,652-Speed 4294.79 samples/sec Loss 1.8155 Epoch: 20 Global Step: 102000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:35:40,296-[lfw][102000]XNorm: 22.494587 Training: 2021-03-15 05:35:40,296-[lfw][102000]Accuracy-Flip: 0.99800+-0.00194 Training: 2021-03-15 05:35:40,296-[lfw][102000]Accuracy-Highest: 0.99833 Training: 2021-03-15 05:36:15,848-[cfp_fp][102000]XNorm: 20.923324 Training: 2021-03-15 05:36:15,849-[cfp_fp][102000]Accuracy-Flip: 0.98757+-0.00356 Training: 2021-03-15 05:36:15,849-[cfp_fp][102000]Accuracy-Highest: 0.98914 Training: 2021-03-15 
05:36:46,522-[agedb_30][102000]XNorm: 22.409867 Training: 2021-03-15 05:36:46,522-[agedb_30][102000]Accuracy-Flip: 0.98067+-0.00716 Training: 2021-03-15 05:36:46,523-[agedb_30][102000]Accuracy-Highest: 0.98350 Training: 2021-03-15 05:36:58,301-Speed 471.24 samples/sec Loss 1.7864 Epoch: 20 Global Step: 102050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:37:09,845-Speed 4435.52 samples/sec Loss 1.8126 Epoch: 20 Global Step: 102100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:37:21,233-Speed 4495.91 samples/sec Loss 1.8258 Epoch: 20 Global Step: 102150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:37:32,768-Speed 4439.01 samples/sec Loss 1.8091 Epoch: 20 Global Step: 102200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:37:44,206-Speed 4476.29 samples/sec Loss 1.8156 Epoch: 20 Global Step: 102250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:37:55,655-Speed 4472.35 samples/sec Loss 1.8244 Epoch: 20 Global Step: 102300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:38:07,410-Speed 4355.95 samples/sec Loss 1.7955 Epoch: 20 Global Step: 102350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:38:18,963-Speed 4431.79 samples/sec Loss 1.8262 Epoch: 20 Global Step: 102400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:38:30,651-Speed 4380.91 samples/sec Loss 1.8470 Epoch: 20 Global Step: 102450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:38:42,532-Speed 4309.59 samples/sec Loss 1.8119 Epoch: 20 Global Step: 102500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:38:54,273-Speed 4360.75 samples/sec Loss 1.8403 Epoch: 20 Global Step: 102550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:39:06,017-Speed 4360.00 samples/sec Loss 1.8108 Epoch: 20 Global Step: 102600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:39:17,932-Speed 4297.35 samples/sec Loss 1.8115 
Epoch: 20 Global Step: 102650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:39:29,429-Speed 4453.30 samples/sec Loss 1.8266 Epoch: 20 Global Step: 102700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:39:41,418-Speed 4270.76 samples/sec Loss 1.8078 Epoch: 20 Global Step: 102750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:39:52,887-Speed 4464.28 samples/sec Loss 1.8357 Epoch: 20 Global Step: 102800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:40:04,428-Speed 4436.77 samples/sec Loss 1.8600 Epoch: 20 Global Step: 102850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:40:15,972-Speed 4435.53 samples/sec Loss 1.8596 Epoch: 20 Global Step: 102900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:40:27,800-Speed 4328.67 samples/sec Loss 1.8428 Epoch: 20 Global Step: 102950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:40:39,633-Speed 4327.13 samples/sec Loss 1.8290 Epoch: 20 Global Step: 103000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:40:51,439-Speed 4336.82 samples/sec Loss 1.8503 Epoch: 20 Global Step: 103050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:41:03,016-Speed 4422.80 samples/sec Loss 1.8650 Epoch: 20 Global Step: 103100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:41:15,023-Speed 4264.38 samples/sec Loss 1.8419 Epoch: 20 Global Step: 103150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:41:26,621-Speed 4414.62 samples/sec Loss 1.8234 Epoch: 20 Global Step: 103200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:41:38,346-Speed 4367.12 samples/sec Loss 1.8654 Epoch: 20 Global Step: 103250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:41:50,031-Speed 4381.86 samples/sec Loss 1.8800 Epoch: 20 Global Step: 103300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:42:01,732-Speed 4375.86 samples/sec Loss 
1.8949 Epoch: 20 Global Step: 103350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:42:13,303-Speed 4425.16 samples/sec Loss 1.8543 Epoch: 20 Global Step: 103400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:42:24,918-Speed 4408.11 samples/sec Loss 1.8654 Epoch: 20 Global Step: 103450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:42:36,834-Speed 4296.86 samples/sec Loss 1.8675 Epoch: 20 Global Step: 103500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:42:48,693-Speed 4317.85 samples/sec Loss 1.8918 Epoch: 20 Global Step: 103550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:43:00,402-Speed 4372.75 samples/sec Loss 1.8814 Epoch: 20 Global Step: 103600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:43:11,883-Speed 4459.73 samples/sec Loss 1.8531 Epoch: 20 Global Step: 103650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:43:23,409-Speed 4442.16 samples/sec Loss 1.9004 Epoch: 20 Global Step: 103700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:43:35,082-Speed 4386.56 samples/sec Loss 1.8800 Epoch: 20 Global Step: 103750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:43:46,891-Speed 4335.78 samples/sec Loss 1.9099 Epoch: 20 Global Step: 103800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:43:58,556-Speed 4389.36 samples/sec Loss 1.8920 Epoch: 20 Global Step: 103850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:44:10,226-Speed 4387.61 samples/sec Loss 1.8989 Epoch: 20 Global Step: 103900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:44:21,816-Speed 4417.87 samples/sec Loss 1.8727 Epoch: 20 Global Step: 103950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:44:33,505-Speed 4380.29 samples/sec Loss 1.8911 Epoch: 20 Global Step: 104000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:45:04,368-[lfw][104000]XNorm: 
22.385771 Training: 2021-03-15 05:45:04,369-[lfw][104000]Accuracy-Flip: 0.99783+-0.00183 Training: 2021-03-15 05:45:04,369-[lfw][104000]Accuracy-Highest: 0.99833 Training: 2021-03-15 05:45:40,114-[cfp_fp][104000]XNorm: 20.935414 Training: 2021-03-15 05:45:40,114-[cfp_fp][104000]Accuracy-Flip: 0.98843+-0.00303 Training: 2021-03-15 05:45:40,114-[cfp_fp][104000]Accuracy-Highest: 0.98914 Training: 2021-03-15 05:46:10,920-[agedb_30][104000]XNorm: 22.355291 Training: 2021-03-15 05:46:10,921-[agedb_30][104000]Accuracy-Flip: 0.98183+-0.00783 Training: 2021-03-15 05:46:10,921-[agedb_30][104000]Accuracy-Highest: 0.98350 Training: 2021-03-15 05:46:22,452-Speed 469.95 samples/sec Loss 1.8932 Epoch: 20 Global Step: 104050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:46:34,146-Speed 4378.64 samples/sec Loss 1.9100 Epoch: 20 Global Step: 104100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:46:45,727-Speed 4421.20 samples/sec Loss 1.8724 Epoch: 20 Global Step: 104150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:46:57,540-Speed 4334.51 samples/sec Loss 1.8764 Epoch: 20 Global Step: 104200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:47:09,301-Speed 4353.45 samples/sec Loss 1.8881 Epoch: 20 Global Step: 104250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:47:21,068-Speed 4351.19 samples/sec Loss 1.8851 Epoch: 20 Global Step: 104300 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:47:32,695-Speed 4403.93 samples/sec Loss 1.9254 Epoch: 20 Global Step: 104350 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:47:44,366-Speed 4387.18 samples/sec Loss 1.8962 Epoch: 20 Global Step: 104400 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:47:56,158-Speed 4342.09 samples/sec Loss 1.9406 Epoch: 20 Global Step: 104450 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:48:07,990-Speed 4327.45 samples/sec Loss 1.9468 Epoch: 20 Global Step: 
104500 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:48:19,668-Speed 4384.53 samples/sec Loss 1.9001 Epoch: 20 Global Step: 104550 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:48:31,624-Speed 4282.39 samples/sec Loss 1.8997 Epoch: 20 Global Step: 104600 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:48:47,605-Speed 3203.90 samples/sec Loss 1.7092 Epoch: 21 Global Step: 104650 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:48:59,740-Speed 4219.33 samples/sec Loss 1.5027 Epoch: 21 Global Step: 104700 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:49:11,302-Speed 4428.67 samples/sec Loss 1.4791 Epoch: 21 Global Step: 104750 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:49:23,051-Speed 4357.99 samples/sec Loss 1.4706 Epoch: 21 Global Step: 104800 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:49:34,774-Speed 4367.68 samples/sec Loss 1.4752 Epoch: 21 Global Step: 104850 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:49:46,853-Speed 4238.94 samples/sec Loss 1.4738 Epoch: 21 Global Step: 104900 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:49:58,708-Speed 4319.09 samples/sec Loss 1.4884 Epoch: 21 Global Step: 104950 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:50:10,027-Speed 4523.37 samples/sec Loss 1.4814 Epoch: 21 Global Step: 105000 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:50:21,810-Speed 4345.78 samples/sec Loss 1.4650 Epoch: 21 Global Step: 105050 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:50:33,466-Speed 4392.65 samples/sec Loss 1.4605 Epoch: 21 Global Step: 105100 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:50:44,985-Speed 4444.76 samples/sec Loss 1.4758 Epoch: 21 Global Step: 105150 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:50:56,506-Speed 4444.30 samples/sec Loss 1.4457 Epoch: 21 Global 
Step: 105200 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:51:07,890-Speed 4498.05 samples/sec Loss 1.4511 Epoch: 21 Global Step: 105250 Fp16 Grad Scale: 16384 Required: 2 hours Training: 2021-03-15 05:51:19,587-Speed 4377.03 samples/sec Loss 1.4764 Epoch: 21 Global Step: 105300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:51:31,552-Speed 4279.51 samples/sec Loss 1.4455 Epoch: 21 Global Step: 105350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:51:43,023-Speed 4463.65 samples/sec Loss 1.4578 Epoch: 21 Global Step: 105400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:51:54,522-Speed 4452.71 samples/sec Loss 1.4514 Epoch: 21 Global Step: 105450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:52:06,507-Speed 4272.03 samples/sec Loss 1.4661 Epoch: 21 Global Step: 105500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:52:17,964-Speed 4469.25 samples/sec Loss 1.4517 Epoch: 21 Global Step: 105550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:52:29,623-Speed 4391.71 samples/sec Loss 1.4612 Epoch: 21 Global Step: 105600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:52:41,210-Speed 4418.78 samples/sec Loss 1.4650 Epoch: 21 Global Step: 105650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:52:52,738-Speed 4441.56 samples/sec Loss 1.4300 Epoch: 21 Global Step: 105700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:53:04,261-Speed 4443.52 samples/sec Loss 1.4504 Epoch: 21 Global Step: 105750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:53:15,723-Speed 4467.01 samples/sec Loss 1.4644 Epoch: 21 Global Step: 105800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:53:27,309-Speed 4419.39 samples/sec Loss 1.4275 Epoch: 21 Global Step: 105850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:53:38,707-Speed 4492.39 samples/sec Loss 1.4315 Epoch: 21 
Global Step: 105900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:53:50,224-Speed 4445.47 samples/sec Loss 1.4796 Epoch: 21 Global Step: 105950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:54:01,783-Speed 4429.70 samples/sec Loss 1.4411 Epoch: 21 Global Step: 106000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:54:32,470-[lfw][106000]XNorm: 22.507445 Training: 2021-03-15 05:54:32,470-[lfw][106000]Accuracy-Flip: 0.99800+-0.00194 Training: 2021-03-15 05:54:32,470-[lfw][106000]Accuracy-Highest: 0.99833 Training: 2021-03-15 05:55:08,142-[cfp_fp][106000]XNorm: 21.070935 Training: 2021-03-15 05:55:08,142-[cfp_fp][106000]Accuracy-Flip: 0.98943+-0.00345 Training: 2021-03-15 05:55:08,142-[cfp_fp][106000]Accuracy-Highest: 0.98943 Training: 2021-03-15 05:55:38,924-[agedb_30][106000]XNorm: 22.435100 Training: 2021-03-15 05:55:38,924-[agedb_30][106000]Accuracy-Flip: 0.98200+-0.00733 Training: 2021-03-15 05:55:38,924-[agedb_30][106000]Accuracy-Highest: 0.98350 Training: 2021-03-15 05:55:50,774-Speed 469.77 samples/sec Loss 1.4851 Epoch: 21 Global Step: 106050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:56:02,540-Speed 4351.55 samples/sec Loss 1.4922 Epoch: 21 Global Step: 106100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:56:14,062-Speed 4443.88 samples/sec Loss 1.4307 Epoch: 21 Global Step: 106150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:56:25,630-Speed 4426.43 samples/sec Loss 1.4466 Epoch: 21 Global Step: 106200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:56:37,230-Speed 4414.03 samples/sec Loss 1.4454 Epoch: 21 Global Step: 106250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:56:48,797-Speed 4426.35 samples/sec Loss 1.4524 Epoch: 21 Global Step: 106300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:57:00,452-Speed 4393.08 samples/sec Loss 1.4462 Epoch: 21 Global Step: 106350 Fp16 Grad Scale: 
16384 Required: 1 hours Training: 2021-03-15 05:57:12,343-Speed 4305.91 samples/sec Loss 1.4568 Epoch: 21 Global Step: 106400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:57:23,828-Speed 4458.32 samples/sec Loss 1.4399 Epoch: 21 Global Step: 106450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:57:35,295-Speed 4464.99 samples/sec Loss 1.4542 Epoch: 21 Global Step: 106500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:57:47,219-Speed 4294.18 samples/sec Loss 1.4363 Epoch: 21 Global Step: 106550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:57:58,706-Speed 4457.42 samples/sec Loss 1.4599 Epoch: 21 Global Step: 106600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:58:10,787-Speed 4238.08 samples/sec Loss 1.4621 Epoch: 21 Global Step: 106650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:58:22,311-Speed 4443.44 samples/sec Loss 1.4525 Epoch: 21 Global Step: 106700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:58:34,083-Speed 4349.24 samples/sec Loss 1.4596 Epoch: 21 Global Step: 106750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:58:45,552-Speed 4464.32 samples/sec Loss 1.4437 Epoch: 21 Global Step: 106800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:58:57,222-Speed 4387.80 samples/sec Loss 1.4715 Epoch: 21 Global Step: 106850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:59:09,001-Speed 4346.97 samples/sec Loss 1.4566 Epoch: 21 Global Step: 106900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:59:20,806-Speed 4337.01 samples/sec Loss 1.4677 Epoch: 21 Global Step: 106950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:59:32,317-Speed 4448.17 samples/sec Loss 1.4752 Epoch: 21 Global Step: 107000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 05:59:44,062-Speed 4359.69 samples/sec Loss 1.4276 Epoch: 21 Global Step: 107050 Fp16 Grad 
Scale: 16384 Required: 1 hours Training: 2021-03-15 05:59:55,785-Speed 4367.49 samples/sec Loss 1.4316 Epoch: 21 Global Step: 107100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:00:07,535-Speed 4357.85 samples/sec Loss 1.4510 Epoch: 21 Global Step: 107150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:00:18,993-Speed 4468.60 samples/sec Loss 1.4699 Epoch: 21 Global Step: 107200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:00:30,940-Speed 4285.61 samples/sec Loss 1.4461 Epoch: 21 Global Step: 107250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:00:42,517-Speed 4422.69 samples/sec Loss 1.4579 Epoch: 21 Global Step: 107300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:00:54,201-Speed 4382.20 samples/sec Loss 1.4363 Epoch: 21 Global Step: 107350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:01:05,588-Speed 4496.80 samples/sec Loss 1.4339 Epoch: 21 Global Step: 107400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:01:17,268-Speed 4383.51 samples/sec Loss 1.4297 Epoch: 21 Global Step: 107450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:01:28,837-Speed 4425.87 samples/sec Loss 1.4466 Epoch: 21 Global Step: 107500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:01:40,653-Speed 4333.58 samples/sec Loss 1.4446 Epoch: 21 Global Step: 107550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:01:52,468-Speed 4333.42 samples/sec Loss 1.4460 Epoch: 21 Global Step: 107600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:02:04,476-Speed 4263.97 samples/sec Loss 1.4636 Epoch: 21 Global Step: 107650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:02:15,939-Speed 4466.77 samples/sec Loss 1.4500 Epoch: 21 Global Step: 107700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:02:27,355-Speed 4485.34 samples/sec Loss 1.4784 Epoch: 21 Global Step: 107750 Fp16 
Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:02:38,942-Speed 4418.94 samples/sec Loss 1.4446 Epoch: 21 Global Step: 107800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:02:50,684-Speed 4360.48 samples/sec Loss 1.4430 Epoch: 21 Global Step: 107850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:03:02,230-Speed 4434.45 samples/sec Loss 1.4472 Epoch: 21 Global Step: 107900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:03:13,984-Speed 4356.37 samples/sec Loss 1.4335 Epoch: 21 Global Step: 107950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:03:25,572-Speed 4418.61 samples/sec Loss 1.4178 Epoch: 21 Global Step: 108000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:03:56,394-[lfw][108000]XNorm: 22.576304 Training: 2021-03-15 06:03:56,394-[lfw][108000]Accuracy-Flip: 0.99783+-0.00183 Training: 2021-03-15 06:03:56,396-[lfw][108000]Accuracy-Highest: 0.99833 Training: 2021-03-15 06:04:32,188-[cfp_fp][108000]XNorm: 21.114068 Training: 2021-03-15 06:04:32,189-[cfp_fp][108000]Accuracy-Flip: 0.98871+-0.00303 Training: 2021-03-15 06:04:32,189-[cfp_fp][108000]Accuracy-Highest: 0.98943 Training: 2021-03-15 06:05:02,868-[agedb_30][108000]XNorm: 22.518431 Training: 2021-03-15 06:05:02,868-[agedb_30][108000]Accuracy-Flip: 0.98133+-0.00759 Training: 2021-03-15 06:05:02,868-[agedb_30][108000]Accuracy-Highest: 0.98350 Training: 2021-03-15 06:05:14,679-Speed 469.27 samples/sec Loss 1.4194 Epoch: 21 Global Step: 108050 Fp16 Grad Scale: 8192 Required: 1 hours Training: 2021-03-15 06:05:26,418-Speed 4361.83 samples/sec Loss 1.4324 Epoch: 21 Global Step: 108100 Fp16 Grad Scale: 8192 Required: 1 hours Training: 2021-03-15 06:05:37,986-Speed 4425.98 samples/sec Loss 1.4688 Epoch: 21 Global Step: 108150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:05:49,491-Speed 4450.28 samples/sec Loss 1.4409 Epoch: 21 Global Step: 108200 Fp16 Grad Scale: 16384 Required: 1 hours 
Training: 2021-03-15 06:06:01,113-Speed 4405.63 samples/sec Loss 1.4505 Epoch: 21 Global Step: 108250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:06:12,751-Speed 4399.66 samples/sec Loss 1.4489 Epoch: 21 Global Step: 108300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:06:24,468-Speed 4370.01 samples/sec Loss 1.4609 Epoch: 21 Global Step: 108350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:06:36,500-Speed 4255.62 samples/sec Loss 1.4371 Epoch: 21 Global Step: 108400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:06:48,090-Speed 4417.49 samples/sec Loss 1.4606 Epoch: 21 Global Step: 108450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:06:59,673-Speed 4420.66 samples/sec Loss 1.4885 Epoch: 21 Global Step: 108500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:07:11,522-Speed 4321.18 samples/sec Loss 1.4252 Epoch: 21 Global Step: 108550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:07:23,267-Speed 4359.39 samples/sec Loss 1.4282 Epoch: 21 Global Step: 108600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:07:34,979-Speed 4371.79 samples/sec Loss 1.4371 Epoch: 21 Global Step: 108650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:07:46,904-Speed 4293.55 samples/sec Loss 1.4665 Epoch: 21 Global Step: 108700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:07:58,386-Speed 4459.37 samples/sec Loss 1.4685 Epoch: 21 Global Step: 108750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:08:09,820-Speed 4477.97 samples/sec Loss 1.4604 Epoch: 21 Global Step: 108800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:08:21,756-Speed 4289.96 samples/sec Loss 1.4384 Epoch: 21 Global Step: 108850 Fp16 Grad Scale: 8192 Required: 1 hours Training: 2021-03-15 06:08:33,261-Speed 4450.46 samples/sec Loss 1.4497 Epoch: 21 Global Step: 108900 Fp16 Grad Scale: 8192 Required: 1 hours 
Training: 2021-03-15 06:08:44,821-Speed 4429.26 samples/sec Loss 1.4379 Epoch: 21 Global Step: 108950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:08:56,410-Speed 4418.03 samples/sec Loss 1.4530 Epoch: 21 Global Step: 109000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:09:08,100-Speed 4379.96 samples/sec Loss 1.4668 Epoch: 21 Global Step: 109050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:09:19,805-Speed 4374.36 samples/sec Loss 1.4729 Epoch: 21 Global Step: 109100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:09:31,251-Speed 4473.55 samples/sec Loss 1.4533 Epoch: 21 Global Step: 109150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:09:42,800-Speed 4433.42 samples/sec Loss 1.4769 Epoch: 21 Global Step: 109200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:09:54,217-Speed 4484.76 samples/sec Loss 1.4712 Epoch: 21 Global Step: 109250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:10:05,541-Speed 4521.48 samples/sec Loss 1.4411 Epoch: 21 Global Step: 109300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:10:17,138-Speed 4415.02 samples/sec Loss 1.4654 Epoch: 21 Global Step: 109350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:10:28,994-Speed 4318.50 samples/sec Loss 1.4388 Epoch: 21 Global Step: 109400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:10:40,585-Speed 4417.60 samples/sec Loss 1.4598 Epoch: 21 Global Step: 109450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:10:52,076-Speed 4455.67 samples/sec Loss 1.4585 Epoch: 21 Global Step: 109500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:11:03,747-Speed 4387.33 samples/sec Loss 1.4546 Epoch: 21 Global Step: 109550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:11:15,694-Speed 4285.78 samples/sec Loss 1.4661 Epoch: 21 Global Step: 109600 Fp16 Grad Scale: 16384 Required: 1 
hours Training: 2021-03-15 06:11:31,510-Speed 3237.43 samples/sec Loss 1.3751 Epoch: 22 Global Step: 109650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:11:43,087-Speed 4422.60 samples/sec Loss 1.3916 Epoch: 22 Global Step: 109700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:11:54,702-Speed 4408.18 samples/sec Loss 1.3746 Epoch: 22 Global Step: 109750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:12:06,271-Speed 4426.09 samples/sec Loss 1.3765 Epoch: 22 Global Step: 109800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:12:17,956-Speed 4381.98 samples/sec Loss 1.4161 Epoch: 22 Global Step: 109850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:12:29,797-Speed 4324.05 samples/sec Loss 1.3992 Epoch: 22 Global Step: 109900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:12:41,413-Speed 4407.90 samples/sec Loss 1.4114 Epoch: 22 Global Step: 109950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:12:52,994-Speed 4421.31 samples/sec Loss 1.3856 Epoch: 22 Global Step: 110000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:13:23,802-[lfw][110000]XNorm: 22.548608 Training: 2021-03-15 06:13:23,803-[lfw][110000]Accuracy-Flip: 0.99783+-0.00183 Training: 2021-03-15 06:13:23,803-[lfw][110000]Accuracy-Highest: 0.99833 Training: 2021-03-15 06:13:59,491-[cfp_fp][110000]XNorm: 21.146048 Training: 2021-03-15 06:13:59,491-[cfp_fp][110000]Accuracy-Flip: 0.98843+-0.00353 Training: 2021-03-15 06:13:59,491-[cfp_fp][110000]Accuracy-Highest: 0.98943 Training: 2021-03-15 06:14:30,277-[agedb_30][110000]XNorm: 22.516984 Training: 2021-03-15 06:14:30,278-[agedb_30][110000]Accuracy-Flip: 0.98183+-0.00743 Training: 2021-03-15 06:14:30,278-[agedb_30][110000]Accuracy-Highest: 0.98350 Training: 2021-03-15 06:14:42,304-Speed 468.39 samples/sec Loss 1.3898 Epoch: 22 Global Step: 110050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 
06:14:53,966-Speed 4390.34 samples/sec Loss 1.4031 Epoch: 22 Global Step: 110100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:15:05,582-Speed 4408.07 samples/sec Loss 1.4082 Epoch: 22 Global Step: 110150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:15:17,400-Speed 4332.67 samples/sec Loss 1.4140 Epoch: 22 Global Step: 110200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:15:28,901-Speed 4451.77 samples/sec Loss 1.4216 Epoch: 22 Global Step: 110250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:15:40,708-Speed 4336.62 samples/sec Loss 1.3907 Epoch: 22 Global Step: 110300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:15:52,345-Speed 4400.16 samples/sec Loss 1.3856 Epoch: 22 Global Step: 110350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:16:04,991-Speed 4048.79 samples/sec Loss 1.4119 Epoch: 22 Global Step: 110400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:16:16,425-Speed 4478.09 samples/sec Loss 1.4074 Epoch: 22 Global Step: 110450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:16:28,115-Speed 4379.90 samples/sec Loss 1.4165 Epoch: 22 Global Step: 110500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:16:39,512-Speed 4492.59 samples/sec Loss 1.3942 Epoch: 22 Global Step: 110550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:16:51,188-Speed 4385.06 samples/sec Loss 1.4156 Epoch: 22 Global Step: 110600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:17:02,856-Speed 4388.50 samples/sec Loss 1.4273 Epoch: 22 Global Step: 110650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:17:14,905-Speed 4249.52 samples/sec Loss 1.3736 Epoch: 22 Global Step: 110700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:17:26,584-Speed 4383.89 samples/sec Loss 1.3765 Epoch: 22 Global Step: 110750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 
2021-03-15 06:17:38,448-Speed 4315.80 samples/sec Loss 1.3777 Epoch: 22 Global Step: 110800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:17:50,301-Speed 4319.78 samples/sec Loss 1.4060 Epoch: 22 Global Step: 110850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:18:02,308-Speed 4264.35 samples/sec Loss 1.3791 Epoch: 22 Global Step: 110900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:18:13,836-Speed 4441.55 samples/sec Loss 1.3932 Epoch: 22 Global Step: 110950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:18:25,528-Speed 4379.24 samples/sec Loss 1.4013 Epoch: 22 Global Step: 111000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:18:36,950-Speed 4482.92 samples/sec Loss 1.3945 Epoch: 22 Global Step: 111050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:18:48,517-Speed 4426.40 samples/sec Loss 1.3984 Epoch: 22 Global Step: 111100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:19:00,151-Speed 4401.18 samples/sec Loss 1.4055 Epoch: 22 Global Step: 111150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:19:11,726-Speed 4423.71 samples/sec Loss 1.3934 Epoch: 22 Global Step: 111200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:19:23,218-Speed 4455.13 samples/sec Loss 1.3997 Epoch: 22 Global Step: 111250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:19:34,672-Speed 4470.46 samples/sec Loss 1.4200 Epoch: 22 Global Step: 111300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:19:46,370-Speed 4377.17 samples/sec Loss 1.3647 Epoch: 22 Global Step: 111350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:19:57,722-Speed 4510.08 samples/sec Loss 1.4029 Epoch: 22 Global Step: 111400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:20:09,532-Speed 4335.48 samples/sec Loss 1.4010 Epoch: 22 Global Step: 111450 Fp16 Grad Scale: 16384 Required: 1 hours 
Training: 2021-03-15 06:20:21,068-Speed 4438.69 samples/sec Loss 1.3959 Epoch: 22 Global Step: 111500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:20:32,955-Speed 4307.52 samples/sec Loss 1.4029 Epoch: 22 Global Step: 111550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:20:44,804-Speed 4320.95 samples/sec Loss 1.4180 Epoch: 22 Global Step: 111600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:20:56,744-Speed 4288.23 samples/sec Loss 1.4179 Epoch: 22 Global Step: 111650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:21:08,283-Speed 4437.40 samples/sec Loss 1.4128 Epoch: 22 Global Step: 111700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:21:19,965-Speed 4383.07 samples/sec Loss 1.4072 Epoch: 22 Global Step: 111750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:21:31,459-Speed 4454.71 samples/sec Loss 1.4193 Epoch: 22 Global Step: 111800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:21:43,087-Speed 4403.20 samples/sec Loss 1.4470 Epoch: 22 Global Step: 111850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:21:54,695-Speed 4410.92 samples/sec Loss 1.3865 Epoch: 22 Global Step: 111900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:22:06,496-Speed 4339.06 samples/sec Loss 1.3894 Epoch: 22 Global Step: 111950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:22:18,301-Speed 4337.26 samples/sec Loss 1.4023 Epoch: 22 Global Step: 112000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:22:49,313-[lfw][112000]XNorm: 22.540763 Training: 2021-03-15 06:22:49,314-[lfw][112000]Accuracy-Flip: 0.99800+-0.00194 Training: 2021-03-15 06:22:49,314-[lfw][112000]Accuracy-Highest: 0.99833 Training: 2021-03-15 06:23:25,261-[cfp_fp][112000]XNorm: 21.137539 Training: 2021-03-15 06:23:25,261-[cfp_fp][112000]Accuracy-Flip: 0.98857+-0.00319 Training: 2021-03-15 
06:23:25,261-[cfp_fp][112000]Accuracy-Highest: 0.98943 Training: 2021-03-15 06:23:55,892-[agedb_30][112000]XNorm: 22.533321 Training: 2021-03-15 06:23:55,892-[agedb_30][112000]Accuracy-Flip: 0.98283+-0.00742 Training: 2021-03-15 06:23:55,892-[agedb_30][112000]Accuracy-Highest: 0.98350 Training: 2021-03-15 06:24:07,669-Speed 468.15 samples/sec Loss 1.4359 Epoch: 22 Global Step: 112050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:24:19,296-Speed 4403.42 samples/sec Loss 1.3928 Epoch: 22 Global Step: 112100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:24:31,066-Speed 4350.45 samples/sec Loss 1.4228 Epoch: 22 Global Step: 112150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:24:42,645-Speed 4422.04 samples/sec Loss 1.4199 Epoch: 22 Global Step: 112200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:24:54,616-Speed 4276.97 samples/sec Loss 1.4259 Epoch: 22 Global Step: 112250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:25:06,188-Speed 4424.62 samples/sec Loss 1.3772 Epoch: 22 Global Step: 112300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:25:18,185-Speed 4268.13 samples/sec Loss 1.4190 Epoch: 22 Global Step: 112350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:25:29,772-Speed 4418.70 samples/sec Loss 1.3941 Epoch: 22 Global Step: 112400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:25:41,250-Speed 4460.81 samples/sec Loss 1.4188 Epoch: 22 Global Step: 112450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:25:52,760-Speed 4448.64 samples/sec Loss 1.4069 Epoch: 22 Global Step: 112500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:26:04,527-Speed 4351.17 samples/sec Loss 1.4246 Epoch: 22 Global Step: 112550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:26:15,846-Speed 4523.71 samples/sec Loss 1.4012 Epoch: 22 Global Step: 112600 Fp16 Grad Scale: 16384 Required: 1 
hours Training: 2021-03-15 06:26:27,673-Speed 4329.05 samples/sec Loss 1.4077 Epoch: 22 Global Step: 112650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:26:39,824-Speed 4213.96 samples/sec Loss 1.4534 Epoch: 22 Global Step: 112700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:26:51,525-Speed 4375.86 samples/sec Loss 1.4164 Epoch: 22 Global Step: 112750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:27:03,180-Speed 4393.36 samples/sec Loss 1.4320 Epoch: 22 Global Step: 112800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:27:15,003-Speed 4330.58 samples/sec Loss 1.4316 Epoch: 22 Global Step: 112850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:27:26,605-Speed 4413.34 samples/sec Loss 1.4410 Epoch: 22 Global Step: 112900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:27:38,387-Speed 4345.58 samples/sec Loss 1.4348 Epoch: 22 Global Step: 112950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:27:50,167-Speed 4346.69 samples/sec Loss 1.4139 Epoch: 22 Global Step: 113000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:28:01,752-Speed 4419.66 samples/sec Loss 1.4036 Epoch: 22 Global Step: 113050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:28:13,302-Speed 4432.98 samples/sec Loss 1.4434 Epoch: 22 Global Step: 113100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:28:24,727-Speed 4481.76 samples/sec Loss 1.4118 Epoch: 22 Global Step: 113150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:28:36,279-Speed 4432.11 samples/sec Loss 1.4060 Epoch: 22 Global Step: 113200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:28:48,037-Speed 4354.72 samples/sec Loss 1.4267 Epoch: 22 Global Step: 113250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:29:00,183-Speed 4215.55 samples/sec Loss 1.4238 Epoch: 22 Global Step: 113300 Fp16 Grad Scale: 16384 Required: 
1 hours Training: 2021-03-15 06:29:11,782-Speed 4414.45 samples/sec Loss 1.4110 Epoch: 22 Global Step: 113350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:29:23,497-Speed 4370.50 samples/sec Loss 1.4027 Epoch: 22 Global Step: 113400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:29:35,099-Speed 4413.51 samples/sec Loss 1.4170 Epoch: 22 Global Step: 113450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:29:46,677-Speed 4422.15 samples/sec Loss 1.4283 Epoch: 22 Global Step: 113500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:29:58,196-Speed 4444.89 samples/sec Loss 1.4030 Epoch: 22 Global Step: 113550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:30:09,971-Speed 4348.66 samples/sec Loss 1.3832 Epoch: 22 Global Step: 113600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:30:21,475-Speed 4450.69 samples/sec Loss 1.4156 Epoch: 22 Global Step: 113650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:30:33,249-Speed 4348.67 samples/sec Loss 1.4256 Epoch: 22 Global Step: 113700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:30:44,893-Speed 4397.53 samples/sec Loss 1.4257 Epoch: 22 Global Step: 113750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:30:56,731-Speed 4325.02 samples/sec Loss 1.4164 Epoch: 22 Global Step: 113800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:31:08,151-Speed 4483.81 samples/sec Loss 1.4120 Epoch: 22 Global Step: 113850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:31:19,641-Speed 4456.06 samples/sec Loss 1.4183 Epoch: 22 Global Step: 113900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:31:31,187-Speed 4434.76 samples/sec Loss 1.4323 Epoch: 22 Global Step: 113950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:31:42,652-Speed 4466.03 samples/sec Loss 1.3914 Epoch: 22 Global Step: 114000 Fp16 Grad Scale: 16384 
Required: 1 hours Training: 2021-03-15 06:32:13,232-[lfw][114000]XNorm: 22.526834 Training: 2021-03-15 06:32:13,232-[lfw][114000]Accuracy-Flip: 0.99800+-0.00194 Training: 2021-03-15 06:32:13,232-[lfw][114000]Accuracy-Highest: 0.99833 Training: 2021-03-15 06:32:48,670-[cfp_fp][114000]XNorm: 21.150343 Training: 2021-03-15 06:32:48,671-[cfp_fp][114000]Accuracy-Flip: 0.98857+-0.00293 Training: 2021-03-15 06:32:48,671-[cfp_fp][114000]Accuracy-Highest: 0.98943 Training: 2021-03-15 06:33:19,243-[agedb_30][114000]XNorm: 22.527114 Training: 2021-03-15 06:33:19,243-[agedb_30][114000]Accuracy-Flip: 0.98217+-0.00687 Training: 2021-03-15 06:33:19,243-[agedb_30][114000]Accuracy-Highest: 0.98350 Training: 2021-03-15 06:33:31,277-Speed 471.35 samples/sec Loss 1.4312 Epoch: 22 Global Step: 114050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:33:43,157-Speed 4309.91 samples/sec Loss 1.4195 Epoch: 22 Global Step: 114100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:33:54,924-Speed 4351.29 samples/sec Loss 1.4165 Epoch: 22 Global Step: 114150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:34:06,677-Speed 4356.57 samples/sec Loss 1.4239 Epoch: 22 Global Step: 114200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:34:18,218-Speed 4436.56 samples/sec Loss 1.4306 Epoch: 22 Global Step: 114250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:34:30,063-Speed 4322.64 samples/sec Loss 1.4001 Epoch: 22 Global Step: 114300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:34:41,683-Speed 4406.54 samples/sec Loss 1.4217 Epoch: 22 Global Step: 114350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:34:53,143-Speed 4467.78 samples/sec Loss 1.3857 Epoch: 22 Global Step: 114400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:35:04,736-Speed 4416.70 samples/sec Loss 1.4538 Epoch: 22 Global Step: 114450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 
06:35:16,525-Speed 4343.05 samples/sec Loss 1.4635 Epoch: 22 Global Step: 114500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:35:28,051-Speed 4442.56 samples/sec Loss 1.4185 Epoch: 22 Global Step: 114550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:35:43,503-Speed 3313.50 samples/sec Loss 1.4235 Epoch: 23 Global Step: 114600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:35:55,992-Speed 4100.24 samples/sec Loss 1.3699 Epoch: 23 Global Step: 114650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:36:08,091-Speed 4231.94 samples/sec Loss 1.3847 Epoch: 23 Global Step: 114700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:36:19,742-Speed 4394.66 samples/sec Loss 1.3702 Epoch: 23 Global Step: 114750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:36:31,547-Speed 4337.20 samples/sec Loss 1.3942 Epoch: 23 Global Step: 114800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:36:43,583-Speed 4254.07 samples/sec Loss 1.3679 Epoch: 23 Global Step: 114850 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:36:55,377-Speed 4341.30 samples/sec Loss 1.3525 Epoch: 23 Global Step: 114900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:37:06,905-Speed 4441.61 samples/sec Loss 1.3861 Epoch: 23 Global Step: 114950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:37:18,511-Speed 4411.78 samples/sec Loss 1.3762 Epoch: 23 Global Step: 115000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:37:30,305-Speed 4341.44 samples/sec Loss 1.3637 Epoch: 23 Global Step: 115050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:37:41,859-Speed 4431.43 samples/sec Loss 1.3754 Epoch: 23 Global Step: 115100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:37:53,304-Speed 4473.96 samples/sec Loss 1.3868 Epoch: 23 Global Step: 115150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 
2021-03-15 06:38:05,106-Speed 4338.32 samples/sec Loss 1.3685 Epoch: 23 Global Step: 115200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:38:16,792-Speed 4381.33 samples/sec Loss 1.3615 Epoch: 23 Global Step: 115250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:38:28,476-Speed 4382.57 samples/sec Loss 1.3736 Epoch: 23 Global Step: 115300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:38:40,177-Speed 4375.78 samples/sec Loss 1.3683 Epoch: 23 Global Step: 115350 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:38:51,690-Speed 4447.26 samples/sec Loss 1.3792 Epoch: 23 Global Step: 115400 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:39:03,257-Speed 4426.45 samples/sec Loss 1.3724 Epoch: 23 Global Step: 115450 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:39:14,957-Speed 4376.21 samples/sec Loss 1.3542 Epoch: 23 Global Step: 115500 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:39:26,436-Speed 4460.56 samples/sec Loss 1.3846 Epoch: 23 Global Step: 115550 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:39:38,046-Speed 4410.12 samples/sec Loss 1.4019 Epoch: 23 Global Step: 115600 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:39:49,763-Speed 4370.09 samples/sec Loss 1.3788 Epoch: 23 Global Step: 115650 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:40:01,720-Speed 4282.27 samples/sec Loss 1.3868 Epoch: 23 Global Step: 115700 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:40:13,454-Speed 4363.55 samples/sec Loss 1.3503 Epoch: 23 Global Step: 115750 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:40:25,328-Speed 4312.21 samples/sec Loss 1.3708 Epoch: 23 Global Step: 115800 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:40:37,045-Speed 4369.80 samples/sec Loss 1.3698 Epoch: 23 Global Step: 115850 Fp16 Grad Scale: 16384 Required: 1 hours 
Training: 2021-03-15 06:40:48,708-Speed 4389.87 samples/sec Loss 1.3531 Epoch: 23 Global Step: 115900 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:41:00,359-Speed 4394.83 samples/sec Loss 1.3910 Epoch: 23 Global Step: 115950 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:41:11,780-Speed 4483.30 samples/sec Loss 1.3949 Epoch: 23 Global Step: 116000 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:41:42,531-[lfw][116000]XNorm: 22.643189 Training: 2021-03-15 06:41:42,531-[lfw][116000]Accuracy-Flip: 0.99783+-0.00183 Training: 2021-03-15 06:41:42,531-[lfw][116000]Accuracy-Highest: 0.99833 Training: 2021-03-15 06:42:18,341-[cfp_fp][116000]XNorm: 21.260813 Training: 2021-03-15 06:42:18,342-[cfp_fp][116000]Accuracy-Flip: 0.98957+-0.00256 Training: 2021-03-15 06:42:18,342-[cfp_fp][116000]Accuracy-Highest: 0.98957 Training: 2021-03-15 06:42:49,168-[agedb_30][116000]XNorm: 22.650531 Training: 2021-03-15 06:42:49,168-[agedb_30][116000]Accuracy-Flip: 0.98217+-0.00753 Training: 2021-03-15 06:42:49,168-[agedb_30][116000]Accuracy-Highest: 0.98350 Training: 2021-03-15 06:43:01,144-Speed 468.16 samples/sec Loss 1.3974 Epoch: 23 Global Step: 116050 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:43:12,703-Speed 4429.89 samples/sec Loss 1.3469 Epoch: 23 Global Step: 116100 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:43:24,121-Speed 4484.46 samples/sec Loss 1.3859 Epoch: 23 Global Step: 116150 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:43:35,610-Speed 4456.57 samples/sec Loss 1.3701 Epoch: 23 Global Step: 116200 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:43:47,490-Speed 4309.62 samples/sec Loss 1.3720 Epoch: 23 Global Step: 116250 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:43:59,136-Speed 4396.70 samples/sec Loss 1.3796 Epoch: 23 Global Step: 116300 Fp16 Grad Scale: 16384 Required: 1 hours Training: 2021-03-15 06:44:11,034-Speed 
4303.47 samples/sec Loss 1.3656 Epoch: 23 Global Step: 116350 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:44:22,451-Speed 4484.72 samples/sec Loss 1.3690 Epoch: 23 Global Step: 116400 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:44:33,916-Speed 4466.03 samples/sec Loss 1.3603 Epoch: 23 Global Step: 116450 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:44:45,437-Speed 4444.17 samples/sec Loss 1.3450 Epoch: 23 Global Step: 116500 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:44:56,850-Speed 4486.28 samples/sec Loss 1.3923 Epoch: 23 Global Step: 116550 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:45:08,322-Speed 4463.25 samples/sec Loss 1.4031 Epoch: 23 Global Step: 116600 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:45:19,817-Speed 4454.12 samples/sec Loss 1.3923 Epoch: 23 Global Step: 116650 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:45:31,372-Speed 4431.48 samples/sec Loss 1.3850 Epoch: 23 Global Step: 116700 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:45:43,269-Speed 4303.54 samples/sec Loss 1.3906 Epoch: 23 Global Step: 116750 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:45:55,020-Speed 4357.17 samples/sec Loss 1.3855 Epoch: 23 Global Step: 116800 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:46:07,040-Speed 4259.95 samples/sec Loss 1.3656 Epoch: 23 Global Step: 116850 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:46:18,575-Speed 4438.93 samples/sec Loss 1.3945 Epoch: 23 Global Step: 116900 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:46:30,516-Speed 4287.64 samples/sec Loss 1.3672 Epoch: 23 Global Step: 116950 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:46:42,018-Speed 4451.75 samples/sec Loss 1.3804 Epoch: 23 Global Step: 117000 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:46:53,577-Speed 4429.51 samples/sec Loss 1.3846 Epoch: 23 Global Step: 117050 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:47:05,222-Speed 4397.21 samples/sec Loss 1.3877 Epoch: 23 Global Step: 117100 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:47:16,778-Speed 4430.45 samples/sec Loss 1.3855 Epoch: 23 Global Step: 117150 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:47:28,330-Speed 4432.47 samples/sec Loss 1.3752 Epoch: 23 Global Step: 117200 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:47:40,172-Speed 4323.81 samples/sec Loss 1.3974 Epoch: 23 Global Step: 117250 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:47:51,758-Speed 4419.19 samples/sec Loss 1.3696 Epoch: 23 Global Step: 117300 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:48:03,289-Speed 4440.48 samples/sec Loss 1.3828 Epoch: 23 Global Step: 117350 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:48:14,903-Speed 4408.62 samples/sec Loss 1.3883 Epoch: 23 Global Step: 117400 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:48:26,636-Speed 4363.98 samples/sec Loss 1.3989 Epoch: 23 Global Step: 117450 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:48:38,399-Speed 4352.66 samples/sec Loss 1.4295 Epoch: 23 Global Step: 117500 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:48:50,251-Speed 4320.07 samples/sec Loss 1.3884 Epoch: 23 Global Step: 117550 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:49:02,089-Speed 4325.43 samples/sec Loss 1.3827 Epoch: 23 Global Step: 117600 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:49:13,879-Speed 4342.94 samples/sec Loss 1.3942 Epoch: 23 Global Step: 117650 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:49:25,483-Speed 4412.25 samples/sec Loss 1.4027 Epoch: 23 Global Step: 117700 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:49:37,249-Speed 4351.91 samples/sec Loss 1.3870 Epoch: 23 Global Step: 117750 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:49:48,802-Speed 4431.98 samples/sec Loss 1.4062 Epoch: 23 Global Step: 117800 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:50:00,454-Speed 4394.16 samples/sec Loss 1.3698 Epoch: 23 Global Step: 117850 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:50:11,922-Speed 4464.77 samples/sec Loss 1.3810 Epoch: 23 Global Step: 117900 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:50:23,464-Speed 4436.24 samples/sec Loss 1.3939 Epoch: 23 Global Step: 117950 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:50:35,268-Speed 4337.51 samples/sec Loss 1.3710 Epoch: 23 Global Step: 118000 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:51:05,911-[lfw][118000]XNorm: 22.475227
Training: 2021-03-15 06:51:05,911-[lfw][118000]Accuracy-Flip: 0.99783+-0.00183
Training: 2021-03-15 06:51:05,911-[lfw][118000]Accuracy-Highest: 0.99833
Training: 2021-03-15 06:51:41,499-[cfp_fp][118000]XNorm: 21.118027
Training: 2021-03-15 06:51:41,500-[cfp_fp][118000]Accuracy-Flip: 0.98857+-0.00350
Training: 2021-03-15 06:51:41,500-[cfp_fp][118000]Accuracy-Highest: 0.98957
Training: 2021-03-15 06:52:12,157-[agedb_30][118000]XNorm: 22.473627
Training: 2021-03-15 06:52:12,157-[agedb_30][118000]Accuracy-Flip: 0.98233+-0.00663
Training: 2021-03-15 06:52:12,157-[agedb_30][118000]Accuracy-Highest: 0.98350
Training: 2021-03-15 06:52:24,009-Speed 470.85 samples/sec Loss 1.3969 Epoch: 23 Global Step: 118050 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:52:35,646-Speed 4399.91 samples/sec Loss 1.3942 Epoch: 23 Global Step: 118100 Fp16 Grad Scale: 16384 Required: 1 hours
Training: 2021-03-15 06:52:47,311-Speed 4389.43 samples/sec Loss 1.3824 Epoch: 23 Global Step: 118150 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:52:58,814-Speed 4451.11 samples/sec Loss 1.3992 Epoch: 23 Global Step: 118200 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:53:10,493-Speed 4384.23 samples/sec Loss 1.3947 Epoch: 23 Global Step: 118250 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:53:22,087-Speed 4416.31 samples/sec Loss 1.3707 Epoch: 23 Global Step: 118300 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:53:33,591-Speed 4450.52 samples/sec Loss 1.3862 Epoch: 23 Global Step: 118350 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:53:45,110-Speed 4445.11 samples/sec Loss 1.3801 Epoch: 23 Global Step: 118400 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:53:56,503-Speed 4494.39 samples/sec Loss 1.3867 Epoch: 23 Global Step: 118450 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:54:08,005-Speed 4451.30 samples/sec Loss 1.3801 Epoch: 23 Global Step: 118500 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:54:19,535-Speed 4440.74 samples/sec Loss 1.3842 Epoch: 23 Global Step: 118550 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:54:31,562-Speed 4257.34 samples/sec Loss 1.3876 Epoch: 23 Global Step: 118600 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:54:42,989-Speed 4481.08 samples/sec Loss 1.4009 Epoch: 23 Global Step: 118650 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:54:54,650-Speed 4390.86 samples/sec Loss 1.3909 Epoch: 23 Global Step: 118700 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:55:06,505-Speed 4318.97 samples/sec Loss 1.3844 Epoch: 23 Global Step: 118750 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:55:18,263-Speed 4354.68 samples/sec Loss 1.4200 Epoch: 23 Global Step: 118800 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:55:30,039-Speed 4347.83 samples/sec Loss 1.3887 Epoch: 23 Global Step: 118850 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:55:41,793-Speed 4356.13 samples/sec Loss 1.3934 Epoch: 23 Global Step: 118900 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:55:53,372-Speed 4422.07 samples/sec Loss 1.3983 Epoch: 23 Global Step: 118950 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:56:04,757-Speed 4497.36 samples/sec Loss 1.4010 Epoch: 23 Global Step: 119000 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:56:16,277-Speed 4444.89 samples/sec Loss 1.3916 Epoch: 23 Global Step: 119050 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:56:27,956-Speed 4383.78 samples/sec Loss 1.3921 Epoch: 23 Global Step: 119100 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:56:39,486-Speed 4440.99 samples/sec Loss 1.3812 Epoch: 23 Global Step: 119150 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:56:51,048-Speed 4428.62 samples/sec Loss 1.3996 Epoch: 23 Global Step: 119200 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:57:02,669-Speed 4405.86 samples/sec Loss 1.3958 Epoch: 23 Global Step: 119250 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:57:14,340-Speed 4387.09 samples/sec Loss 1.4036 Epoch: 23 Global Step: 119300 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:57:25,888-Speed 4433.80 samples/sec Loss 1.3817 Epoch: 23 Global Step: 119350 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:57:37,248-Speed 4507.49 samples/sec Loss 1.3642 Epoch: 23 Global Step: 119400 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:57:48,974-Speed 4366.53 samples/sec Loss 1.4100 Epoch: 23 Global Step: 119450 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:58:00,469-Speed 4454.19 samples/sec Loss 1.3744 Epoch: 23 Global Step: 119500 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:58:12,075-Speed 4411.63 samples/sec Loss 1.4069 Epoch: 23 Global Step: 119550 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:58:28,045-Speed 3206.11 samples/sec Loss 1.3582 Epoch: 24 Global Step: 119600 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:58:39,834-Speed 4343.46 samples/sec Loss 1.3442 Epoch: 24 Global Step: 119650 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:58:51,884-Speed 4249.19 samples/sec Loss 1.3494 Epoch: 24 Global Step: 119700 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:59:03,448-Speed 4427.73 samples/sec Loss 1.3474 Epoch: 24 Global Step: 119750 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:59:15,381-Speed 4290.78 samples/sec Loss 1.3527 Epoch: 24 Global Step: 119800 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:59:27,087-Speed 4373.81 samples/sec Loss 1.3481 Epoch: 24 Global Step: 119850 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:59:38,670-Speed 4420.34 samples/sec Loss 1.3230 Epoch: 24 Global Step: 119900 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 06:59:50,726-Speed 4247.05 samples/sec Loss 1.3744 Epoch: 24 Global Step: 119950 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:00:02,462-Speed 4363.15 samples/sec Loss 1.3809 Epoch: 24 Global Step: 120000 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:00:33,295-[lfw][120000]XNorm: 22.506448
Training: 2021-03-15 07:00:33,295-[lfw][120000]Accuracy-Flip: 0.99800+-0.00194
Training: 2021-03-15 07:00:33,295-[lfw][120000]Accuracy-Highest: 0.99833
Training: 2021-03-15 07:01:09,102-[cfp_fp][120000]XNorm: 21.135033
Training: 2021-03-15 07:01:09,102-[cfp_fp][120000]Accuracy-Flip: 0.98900+-0.00320
Training: 2021-03-15 07:01:09,102-[cfp_fp][120000]Accuracy-Highest: 0.98957
Training: 2021-03-15 07:01:40,047-[agedb_30][120000]XNorm: 22.516342
Training: 2021-03-15 07:01:40,047-[agedb_30][120000]Accuracy-Flip: 0.98217+-0.00734
Training: 2021-03-15 07:01:40,047-[agedb_30][120000]Accuracy-Highest: 0.98350
Training: 2021-03-15 07:01:51,516-Speed 469.49 samples/sec Loss 1.3599 Epoch: 24 Global Step: 120050 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:02:03,245-Speed 4365.37 samples/sec Loss 1.3238 Epoch: 24 Global Step: 120100 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:02:14,705-Speed 4467.93 samples/sec Loss 1.3408 Epoch: 24 Global Step: 120150 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:02:26,310-Speed 4411.97 samples/sec Loss 1.3577 Epoch: 24 Global Step: 120200 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:02:38,265-Speed 4283.15 samples/sec Loss 1.3147 Epoch: 24 Global Step: 120250 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:02:49,833-Speed 4425.94 samples/sec Loss 1.3488 Epoch: 24 Global Step: 120300 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:03:01,591-Speed 4354.72 samples/sec Loss 1.3289 Epoch: 24 Global Step: 120350 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:03:13,207-Speed 4407.92 samples/sec Loss 1.3551 Epoch: 24 Global Step: 120400 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:03:24,822-Speed 4408.21 samples/sec Loss 1.3210 Epoch: 24 Global Step: 120450 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:03:36,249-Speed 4481.05 samples/sec Loss 1.3605 Epoch: 24 Global Step: 120500 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:03:47,797-Speed 4434.05 samples/sec Loss 1.3692 Epoch: 24 Global Step: 120550 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:03:59,318-Speed 4444.12 samples/sec Loss 1.3566 Epoch: 24 Global Step: 120600 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:04:10,922-Speed 4412.54 samples/sec Loss 1.3412 Epoch: 24 Global Step: 120650 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:04:22,794-Speed 4312.88 samples/sec Loss 1.3685 Epoch: 24 Global Step: 120700 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:04:34,600-Speed 4336.89 samples/sec Loss 1.3289 Epoch: 24 Global Step: 120750 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:04:46,232-Speed 4401.98 samples/sec Loss 1.3572 Epoch: 24 Global Step: 120800 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:04:58,175-Speed 4287.32 samples/sec Loss 1.3793 Epoch: 24 Global Step: 120850 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:05:09,627-Speed 4470.73 samples/sec Loss 1.3459 Epoch: 24 Global Step: 120900 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:05:21,430-Speed 4337.96 samples/sec Loss 1.3601 Epoch: 24 Global Step: 120950 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:05:33,028-Speed 4414.75 samples/sec Loss 1.3340 Epoch: 24 Global Step: 121000 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:05:44,844-Speed 4333.28 samples/sec Loss 1.3602 Epoch: 24 Global Step: 121050 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:05:56,402-Speed 4430.29 samples/sec Loss 1.3652 Epoch: 24 Global Step: 121100 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:06:07,882-Speed 4460.12 samples/sec Loss 1.3624 Epoch: 24 Global Step: 121150 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:06:19,333-Speed 4471.24 samples/sec Loss 1.3617 Epoch: 24 Global Step: 121200 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:06:31,281-Speed 4285.53 samples/sec Loss 1.3511 Epoch: 24 Global Step: 121250 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:06:42,869-Speed 4418.66 samples/sec Loss 1.3487 Epoch: 24 Global Step: 121300 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:06:54,557-Speed 4380.68 samples/sec Loss 1.3741 Epoch: 24 Global Step: 121350 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:07:06,053-Speed 4453.79 samples/sec Loss 1.3843 Epoch: 24 Global Step: 121400 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:07:17,979-Speed 4293.19 samples/sec Loss 1.3491 Epoch: 24 Global Step: 121450 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:07:29,502-Speed 4443.56 samples/sec Loss 1.3569 Epoch: 24 Global Step: 121500 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:07:40,976-Speed 4462.43 samples/sec Loss 1.3420 Epoch: 24 Global Step: 121550 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:07:52,803-Speed 4329.22 samples/sec Loss 1.3656 Epoch: 24 Global Step: 121600 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:08:04,770-Speed 4278.68 samples/sec Loss 1.3618 Epoch: 24 Global Step: 121650 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:08:16,559-Speed 4343.39 samples/sec Loss 1.3563 Epoch: 24 Global Step: 121700 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:08:28,008-Speed 4471.97 samples/sec Loss 1.3894 Epoch: 24 Global Step: 121750 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:08:39,567-Speed 4429.69 samples/sec Loss 1.3505 Epoch: 24 Global Step: 121800 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:08:51,395-Speed 4328.98 samples/sec Loss 1.3401 Epoch: 24 Global Step: 121850 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:09:02,962-Speed 4426.73 samples/sec Loss 1.3652 Epoch: 24 Global Step: 121900 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:09:15,184-Speed 4189.24 samples/sec Loss 1.3858 Epoch: 24 Global Step: 121950 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:09:26,861-Speed 4384.89 samples/sec Loss 1.3522 Epoch: 24 Global Step: 122000 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:09:57,706-[lfw][122000]XNorm: 22.548049
Training: 2021-03-15 07:09:57,706-[lfw][122000]Accuracy-Flip: 0.99800+-0.00194
Training: 2021-03-15 07:09:57,707-[lfw][122000]Accuracy-Highest: 0.99833
Training: 2021-03-15 07:10:33,377-[cfp_fp][122000]XNorm: 21.193385
Training: 2021-03-15 07:10:33,378-[cfp_fp][122000]Accuracy-Flip: 0.98943+-0.00357
Training: 2021-03-15 07:10:33,378-[cfp_fp][122000]Accuracy-Highest: 0.98957
Training: 2021-03-15 07:11:04,255-[agedb_30][122000]XNorm: 22.556008
Training: 2021-03-15 07:11:04,255-[agedb_30][122000]Accuracy-Flip: 0.98167+-0.00723
Training: 2021-03-15 07:11:04,256-[agedb_30][122000]Accuracy-Highest: 0.98350
Training: 2021-03-15 07:11:15,905-Speed 469.54 samples/sec Loss 1.3608 Epoch: 24 Global Step: 122050 Fp16 Grad Scale: 8192 Required: 0 hours
Training: 2021-03-15 07:11:27,433-Speed 4441.79 samples/sec Loss 1.3830 Epoch: 24 Global Step: 122100 Fp16 Grad Scale: 8192 Required: 0 hours
Training: 2021-03-15 07:11:39,347-Speed 4297.59 samples/sec Loss 1.3734 Epoch: 24 Global Step: 122150 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:11:50,961-Speed 4408.73 samples/sec Loss 1.3628 Epoch: 24 Global Step: 122200 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:12:02,808-Speed 4321.74 samples/sec Loss 1.3723 Epoch: 24 Global Step: 122250 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:12:14,346-Speed 4437.70 samples/sec Loss 1.3594 Epoch: 24 Global Step: 122300 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:12:25,815-Speed 4464.69 samples/sec Loss 1.3708 Epoch: 24 Global Step: 122350 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:12:37,394-Speed 4421.77 samples/sec Loss 1.3620 Epoch: 24 Global Step: 122400 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:12:49,077-Speed 4382.80 samples/sec Loss 1.3525 Epoch: 24 Global Step: 122450 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:13:00,916-Speed 4324.80 samples/sec Loss 1.3954 Epoch: 24 Global Step: 122500 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:13:12,462-Speed 4434.63 samples/sec Loss 1.3762 Epoch: 24 Global Step: 122550 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:13:24,090-Speed 4403.33 samples/sec Loss 1.3345 Epoch: 24 Global Step: 122600 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:13:35,493-Speed 4490.34 samples/sec Loss 1.3592 Epoch: 24 Global Step: 122650 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:13:47,396-Speed 4301.44 samples/sec Loss 1.3585 Epoch: 24 Global Step: 122700 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:13:59,004-Speed 4410.97 samples/sec Loss 1.3747 Epoch: 24 Global Step: 122750 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:14:10,711-Speed 4373.69 samples/sec Loss 1.3546 Epoch: 24 Global Step: 122800 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:14:22,280-Speed 4425.63 samples/sec Loss 1.3601 Epoch: 24 Global Step: 122850 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:14:34,498-Speed 4191.03 samples/sec Loss 1.3354 Epoch: 24 Global Step: 122900 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:14:46,118-Speed 4406.16 samples/sec Loss 1.3545 Epoch: 24 Global Step: 122950 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:14:57,854-Speed 4362.70 samples/sec Loss 1.4062 Epoch: 24 Global Step: 123000 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:15:09,477-Speed 4405.40 samples/sec Loss 1.3678 Epoch: 24 Global Step: 123050 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:15:21,266-Speed 4343.28 samples/sec Loss 1.3505 Epoch: 24 Global Step: 123100 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:15:32,805-Speed 4437.40 samples/sec Loss 1.3729 Epoch: 24 Global Step: 123150 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:15:44,332-Speed 4441.96 samples/sec Loss 1.3302 Epoch: 24 Global Step: 123200 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:15:55,914-Speed 4420.79 samples/sec Loss 1.3567 Epoch: 24 Global Step: 123250 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:16:07,825-Speed 4298.60 samples/sec Loss 1.3849 Epoch: 24 Global Step: 123300 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:16:19,331-Speed 4449.94 samples/sec Loss 1.3845 Epoch: 24 Global Step: 123350 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:16:31,127-Speed 4340.80 samples/sec Loss 1.3728 Epoch: 24 Global Step: 123400 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:16:42,640-Speed 4447.33 samples/sec Loss 1.3425 Epoch: 24 Global Step: 123450 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:16:54,319-Speed 4383.90 samples/sec Loss 1.3696 Epoch: 24 Global Step: 123500 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:17:05,777-Speed 4469.03 samples/sec Loss 1.3837 Epoch: 24 Global Step: 123550 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:17:17,268-Speed 4455.86 samples/sec Loss 1.3669 Epoch: 24 Global Step: 123600 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:17:28,754-Speed 4457.67 samples/sec Loss 1.3494 Epoch: 24 Global Step: 123650 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:17:40,937-Speed 4202.69 samples/sec Loss 1.3585 Epoch: 24 Global Step: 123700 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:17:52,557-Speed 4406.36 samples/sec Loss 1.3882 Epoch: 24 Global Step: 123750 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:18:04,099-Speed 4436.25 samples/sec Loss 1.3552 Epoch: 24 Global Step: 123800 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:18:15,649-Speed 4433.06 samples/sec Loss 1.3795 Epoch: 24 Global Step: 123850 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:18:27,282-Speed 4401.28 samples/sec Loss 1.3829 Epoch: 24 Global Step: 123900 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:18:39,269-Speed 4271.57 samples/sec Loss 1.3672 Epoch: 24 Global Step: 123950 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:18:50,835-Speed 4427.15 samples/sec Loss 1.3761 Epoch: 24 Global Step: 124000 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:19:21,613-[lfw][124000]XNorm: 22.605091
Training: 2021-03-15 07:19:21,613-[lfw][124000]Accuracy-Flip: 0.99767+-0.00170
Training: 2021-03-15 07:19:21,613-[lfw][124000]Accuracy-Highest: 0.99833
Training: 2021-03-15 07:19:57,294-[cfp_fp][124000]XNorm: 21.270394
Training: 2021-03-15 07:19:57,295-[cfp_fp][124000]Accuracy-Flip: 0.98943+-0.00321
Training: 2021-03-15 07:19:57,295-[cfp_fp][124000]Accuracy-Highest: 0.98957
Training: 2021-03-15 07:20:28,096-[agedb_30][124000]XNorm: 22.613549
Training: 2021-03-15 07:20:28,096-[agedb_30][124000]Accuracy-Flip: 0.98217+-0.00695
Training: 2021-03-15 07:20:28,096-[agedb_30][124000]Accuracy-Highest: 0.98350
Training: 2021-03-15 07:20:39,441-Speed 471.43 samples/sec Loss 1.3954 Epoch: 24 Global Step: 124050 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:20:51,282-Speed 4324.24 samples/sec Loss 1.3954 Epoch: 24 Global Step: 124100 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:21:02,865-Speed 4420.10 samples/sec Loss 1.3823 Epoch: 24 Global Step: 124150 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:21:14,714-Speed 4321.59 samples/sec Loss 1.4052 Epoch: 24 Global Step: 124200 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:21:26,202-Speed 4456.83 samples/sec Loss 1.3703 Epoch: 24 Global Step: 124250 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:21:37,808-Speed 4411.77 samples/sec Loss 1.3962 Epoch: 24 Global Step: 124300 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:21:49,451-Speed 4397.68 samples/sec Loss 1.3556 Epoch: 24 Global Step: 124350 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:22:00,936-Speed 4458.18 samples/sec Loss 1.3802 Epoch: 24 Global Step: 124400 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:22:12,486-Speed 4433.12 samples/sec Loss 1.3849 Epoch: 24 Global Step: 124450 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:22:24,126-Speed 4398.54 samples/sec Loss 1.3706 Epoch: 24 Global Step: 124500 Fp16 Grad Scale: 16384 Required: 0 hours
Training: 2021-03-15 07:22:35,537-Speed 4487.35 samples/sec Loss 1.3908 Epoch: 24 Global Step: 124550 Fp16 Grad Scale: 16384 Required: -0 hours